This site is devoted to mathematics and its applications. Created and run by Peter Saveliev.

Grayscale images - implementation

From Intelligent Perception

Jump to: navigation, search

In this article the algorithm for analysis for gray scale images is explained in detail and illustrated with C++ source code.

The time complexity of the algorithm is O(n2), where n is the number of pixels in the image. See also Processing time.

1 The data structure of the algorithm

The input image of the algorithm is a gray scale 2D image, i.e., an array of of integers from 0 to 255 with dimensions dim1 and dim2.

int image[ dim1*dim2 ];

The algorithm first creates binary frames for each value of the threshold and then partitions the image by providing boundaries of the objects. Geometrically, these boundaries are circular sequences of directed edges of pixels in the image. In other words, these are 0- and 1-cycles. The first task is to count them.

long betti[2] = {0,0}; // Betti numbers = number of current cycles of each dimension

For each value of the parameter, the gray level, the image is binary. Two Boolean arrays of the same size are used: the previous frame (PrevFrame) and the current frame (CurrFrame).

bool *CurrFrame = new bool [ dim1*dim2 ];
bool *PrevFrame = new bool [ dim1*dim2 ];

The binary frame is created by thresholding the image.

for(int k=0;k < dim1;k++) // start the (k,l) pixel
for(int l=0;l < dim2;l++)
           if( image[l+dim2*k] < frame )
                 CurrFrame[l+dim2*k] = 0;
                 CurrFrame[l+dim2*k] = 1;

As pixels are added, this array is incrementally changed - pixels are added - and analyzed by the algorithm. Recall that at every step the cycles that have no descendants are called current cycles. Similarly, the edges that represent current cycles are called current edges. Cycles in the image are represented by nodes in the graph. The structure of the graph is the same as before - an acyclic binary directed graph. Except this time this graph spans through the frames. For each frame, the regions and holes are represented as nodes of this graph. Each node representing a region or hole is connected to the nodes representing regions and holes that existed previously (its parents) or appear later (its children) by directed edges, or arrows, of the graph. The arrows (one or two from each node) indicate merging and splitting of the regions as we add vertices, edges, and pixels, Adding Pixels. There are no circular sequences of arrows. Each node of the graph contains a pair of links (pointers) to next nodes. If the cycle is merged with another, one link used, if it splits, two. This information is used even after the cycle ceases to be current, i.e., after a merge or a split.

int frame = 0; // current frame number, 0-255

In the end of processing a frame, CurrFrame, it is copied to PrevFrame.

for(int k=0; k < dim1; k++) // start the (k,l) pixel
for(int l=0; l < dim2; l++)
 PrevFrame[l+dim2*k] = CurrFrame[l+dim2*k];

2 The run through pixels of the image

At every step of the run through the image, the value of the pixel in the image is considered. If the color is white, skip to the next step. If the color is black, execute AddCell. Every time a cell is added, the topology of the image changes, the graph, the list of current cycles, and the Betti numbers are updated.

for( k=0; k < dim1; k++ )// start the (k,l) pixel
for( l=0; l < dim2; l++ )
if( CurrFrame[ l+dim2*k ] == 0 )// if the new pixel is black.. 
   AddCell( k, l ); 

Cycles in the image are represented by nodes in a directed graph as before. In this graph, the cycles are related to each other to record the changes in the topology of the image in terms of merging and splitting of components and holes as pixels are added. In particular, the 0- and 1-Betti numbers are updated at every step. In addition to the structure defined in Chapter 2, we need to record the frame when the cycle was created.

struct Cycle { 
       int Birth; // the frame of birth

Each node of the graph contains only a pair of links (pointers) to next nodes. If the cycle is merged with another, one link used, if it splits, two. The node may contain additional information: the dimension of the node, the frame of birth (gray level), a pointer to the edge that created the cycle, the area, the perimeter, etc.

CreateCycle is almost the same as before. The only new command is

   Cycle0->Birth = frame;      

At every step of the run through the image, a pixel in PrevFrame is compared to the corresponding one in CurrFrame. If the colors are the same, skip to the next step. If the color changes from white to black, execute AddCell. Observe that in the case of gray scale images, the color can never change from black to white (RemoveCell will used in the future).

// processing the frame
for( int k=0; k < dim1; k++ ) // start the (k,l) pixel
for( int l=0; l < dim2; l++ )
// if the new pixel is black and the color of its position is white
if( CurrFrame[l+dim2*k] == 0 && PrevFrame[l+dim2*k] == 1 )      
   AddCell( k, l ); 
     The AddCell command is exactly the same.

Further see Grayscale images - implementation.

Exercise. Create a simple interface for the program without simplification and run it. The output should be a sequence of the Betti numbers of the frames. Find or create a simple gray scale image and verify the correctness of the Betti numbers.

Exercise. Design a character recognition system. Answer

3 Another implementation

This package provides programmatic access to the low level image processing algorithms employed in Pixcavator.

The purpose of the Pixcavator SDK is to supply the developer with a few tools of low level computer vision. The output of the image analysis algorithm that powers the SDK contains the most fundamental data about the image: a list of all objects in the image along with their measurements, locations, and other characteristics. Anyone with Microsoft Visual Studio 2005 to create new software based on our technology. This package does not contain the entire source code. See also Binary images - implementation.

This wiki provides a complete description of the methods behind our algorithm. To understand how they works, read from the top - Binary Images, Grayscale Images, etc.

An example of a successful application of this package is here: Cell counting.

3.1 Package contents

Pixcavator solution in Microsoft Visual Studio 2005.
  • Static library AnalysisLib. It includes precompiled image processing function Analysis(), on which Pixcavator application is based. The static library is compiled by the Microsoft Visual C++ 2005 compiler.
  • Pixcavator solution for Microsoft Visual Studio 2005. It implements graphical user interface for the Analysis() function. Includes full source code for the previous Pixcavator version!

Both Analysis static library and Pixcavator solution are developed in Microsoft Visual Studio 2005 using Microsoft Visual C++. Both components are based on the Microsoft Foundation Class (MFC) library. Therefore Pixcavator SDK can be used with Visual Studio 2005 Standard, Professional or Team edition.

Analysis library is a static library, statically linked with MFC. The package includes multi-thread release and multi-thread debug versions for VS 2005. Pixcavator solution is a dialog-based Windows application, statically linked with MFC. Pixcavator solution also depends on GDI+ library. Pixcavator solution employs the Analysis library in order to perform all the low level computer vision operations.

The package includes a user's guide containing detailed info on these components as well as several usage examples.

3.2 Exported functions

The Analysis library exports 10 functions as shown below:

Function name Function description
GetDim1() Retrieves width of the bitmap.
GetDim2() Retrieves height of the bitmap.
GetFrame() Retrieves current frame.
GetMinI() Retrieves min integral.
GetMinLS() Retrieves min lifespan – contrast.
GetMinLine() Retrieves min line integral.
GetCompactness() Retrieves min compactness.
GetSaliency() Retrieves min saliency.
GetCurrNode() Retrieves pointer to the head of the current cycles list.
Analysis() The main image processing function.

The major part of the exported functions is auxiliary. The most important ones are Analysis() (it is a subject of the next chapter) and GetCurrNode(). The later function allows one to explore the image analysis graph created by the Analysis() function. The graph analysis is an advanced task allowing the developer to extract extended information on the analyzed image. The graph contains as much as millions nodes representing various objects inside the image. Please visit the Grayscale Images to learn more details about this algorithm.

3.3 Analysis() function

The Analysis() function exported by the Analysis static library is definitely the core of the algorithm. It builds the Topology graph - the base for further investigation of Pixcavator. The function prototype is declared as follows:

void Analysis( int ddim1,
int ddim2,
unsigned char *imageR,
unsigned char *imageG,
unsigned char *imageB,
int Area,
int Contrast,
BOOL Gray,
int Custom1,
int Custom2,
bool AndOr,
int nParameters,
int *pParameters,
int *ndoR,
int *nloR,
int *ndoG,
int *nloG,
int *ndoB,
int *nloB);

The function receives a 24-bit RGB bitmap (8-bit grayscale is an option depending on the Gray flag) in a form of three byte arrays. Each array correspond to one of the RGB channels. The function also receives a number of simplification parameters: Area, Contrast, Custom1, Custom2, nParameters, pParameters and AndOr. The function builds image analysis graph based on the passed bitmap, writes simplified bitmap into the arrays imageR, imageG and imageB, then writes the number of dark / light objects in each channel to the variables ndoR, nloR, ndoG, nloG, ndoB, nloB.


ddim1 [IN] Width of the input bitmap.
ddim2 [IN] Height of the input bitmap.
imageR [IN OUT] Pointer to the array of bytes containing the red channel data. Must contain ddim1*ddim2 items.
imageG [IN OUT] Pointer to the array of bytes containing the green channel data. Must contain ddim1*ddim2 items.
imageB [IN OUT] Pointer to the array of bytes containing the blue channel data. Must contain ddim1*ddim2 items.
Area [IN] Area limit. All objects with the area below this value will be removed. Pass 0 to avoid simplification.
Contrast [IN] Contrast limit. All objects with the contrast below this value will be removed. Pass 0 to avoid simplification.
Gray [IN] Boolean value specifying whether the input bitmap is grayscale or not. If TRUE, only red channels is processed.
Custom1 [IN] Unused.
Custom2 [IN] Unused.
AndOr [IN] Specifies how to apply the additional parameters stored in the pParameters array.
nParameters [IN] The number of additional parameters in the pParameters.
pParameters [IN] The array with additional input parameters.
ndoR [OUT] Pointer to the int variable to receive the number of dark objects in red channel.
nloR [OUT] Pointer to the int variable to receive the number of light objects in red channel.
ndoG [OUT] Pointer to the int variable to receive the number of dark objects in green channel.
nloG [OUT] Pointer to the int variable to receive the number of light objects in green channel.
ndoB [OUT] Pointer to the int variable to receive the number of dark objects in blue channel.
nloB [OUT] Pointer to the int variable to receive the number of light objects in blue channel.