- OpenCV Computer Vision Application Programming Cookbook Second Edition
- Robert Laganière
- 2021-04-09 23:31:50
Scanning an image with pointers
In most image-processing tasks, you need to scan all pixels of the image in order to perform a computation. Considering the large number of pixels that will need to be visited, it is essential that you perform this task in an efficient way. This recipe, and the next one, will show you different ways of implementing efficient scanning loops. This recipe uses the pointer arithmetic.
Getting ready
We will illustrate the image-scanning process by accomplishing a simple task: reducing the number of colors in an image.
Color images are composed of 3-channel pixels. Each of these channels corresponds to the intensity value of one of the three primary colors: red, green, and blue. Since each of these values is an 8-bit unsigned character, the total number of colors is 256x256x256, which is more than 16 million colors. Consequently, to reduce the complexity of an analysis, it is sometimes useful to reduce the number of colors in an image. One way to achieve this goal is to simply subdivide the RGB space into cubes of equal sizes. For example, if you reduce the number of colors in each dimension by a factor of 8, then you would obtain a total of 32x32x32 colors. Each color in the original image is then assigned a new color value in the color-reduced image that corresponds to the value in the center of the cube to which it belongs.
Therefore, the basic color reduction algorithm is simple. If N is the reduction factor, then for each pixel in the image and for each channel of this pixel, divide the value by N (this is an integer division, so the remainder is lost). Then, multiply the result by N; this gives you the multiple of N just below the input pixel value. Add N/2 and you obtain the central position of the interval between two adjacent multiples of N. If you repeat this process for each 8-bit channel value, then you will obtain a total of 256/N x 256/N x 256/N possible color values.
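The arithmetic above can be sketched on a single channel value, independently of OpenCV. The helper names reduceValue and countLevels are hypothetical and are not part of the recipe; the sketch also assumes that N is a power of two no larger than 128, so that the result always fits in 8 bits.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper (not from the recipe): reduce one 8-bit channel value.
// Assumption: n is a power of two between 1 and 128, so the result fits in 8 bits.
unsigned char reduceValue(unsigned char value, int n) {
    assert(n > 0 && n <= 128 && (n & (n - 1)) == 0);
    // The integer division drops the remainder; multiplying back gives the
    // multiple of n at or just below value; adding n/2 moves to the centre
    // of the interval between two adjacent multiples of n.
    return static_cast<unsigned char>(value / n * n + n / 2);
}

// Count how many distinct output values one channel can take for a given n.
int countLevels(int n) {
    std::set<unsigned char> levels;
    for (int v = 0; v < 256; ++v) {
        levels.insert(reduceValue(static_cast<unsigned char>(v), n));
    }
    return static_cast<int>(levels.size());
}
```

With N=64, the value 200 lies between 192 and 256 and is mapped to 224, and each channel keeps 256/64 = 4 possible values.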
How to do it...
The signature of our color reduction function will be as follows:
void colorReduce(cv::Mat image, int div=64);
The user provides an image and the per-channel reduction factor. Here, the processing is done in-place, that is, the pixel values of the input image are modified by the function. See the There's more… section of this recipe for a more general function signature with input and output arguments.
The processing is simply done by creating a double loop that goes over all pixel values as follows:
void colorReduce(cv::Mat image, int div=64) {
  int nl= image.rows; // number of lines
  // total number of elements per line
  int nc= image.cols * image.channels();
  for (int j=0; j<nl; j++) {
    // get the address of row j
    uchar* data= image.ptr<uchar>(j);
    for (int i=0; i<nc; i++) {
      // process each pixel ---------------------
      data[i]= data[i]/div*div + div/2;
      // end of pixel processing ----------------
    } // end of line
  }
}
This function can be tested using the following code snippet:
// read the image
cv::Mat image= cv::imread("boldt.jpg");
// process the image
colorReduce(image,64);
// display the image
cv::namedWindow("Image");
cv::imshow("Image",image);
// wait for a key press so that the window stays visible
cv::waitKey(0);
This will give you, for example, the following image (refer to the book's graphics PDF to view this image in color):
How it works...
In a color image, the first three bytes of the image data buffer give the three color channel values of the upper-left pixel, the next three bytes are the values of the second pixel of the first row, and so on (remember that OpenCV uses, by default, the BGR channel order). An image of width W and height H would then require a memory block of WxHx3 uchars. However, for efficiency reasons, the length of a row can be padded with a few extra pixels. This is because some multimedia processor chips (for example, the Intel MMX architecture) can process images more efficiently when their rows are multiples of 4 or 8. These extra pixels are not displayed or saved; their exact values are ignored. OpenCV designates the length of a padded row as the effective width. If the image has not been padded with extra pixels, the effective width is equal to the real image width. We have already learned that the cols and rows attributes give you the image's width and height; similarly, the step data attribute gives you the effective width in number of bytes. Even if your image is of a type other than uchar, the step data will still give you the number of bytes in a row. The size of a pixel element is given by the elemSize method (for example, for a 3-channel short integer matrix (CV_16SC3), elemSize will return 6). Recall that the number of channels in the image is given by the channels method (which will be 1 for a gray-level image and 3 for a color image). Finally, the total method returns the total number of pixels (that is, the matrix entries) in the matrix.
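The relation between the real width, the pixel size, and the effective width can be illustrated with a small stand-in structure. This is only a sketch: PaddedImage and makePadded are hypothetical names, the structure is not cv::Mat, and the row alignment is an assumption chosen for the illustration rather than a value taken from OpenCV.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an image header; this is not cv::Mat.
struct PaddedImage {
    int rows;
    int cols;
    int channels;
    std::size_t elemSize;               // bytes per pixel (all channels)
    std::size_t step;                   // bytes per row, padding included
    std::vector<unsigned char> buffer;  // rows * step bytes
};

// Build an 8-bit image whose rows are padded up to a multiple of `alignment`
// bytes. The alignment value is an assumption made for this illustration.
PaddedImage makePadded(int rows, int cols, int channels, std::size_t alignment) {
    assert(rows > 0 && cols > 0 && channels > 0 && alignment > 0);
    PaddedImage img;
    img.rows = rows;
    img.cols = cols;
    img.channels = channels;
    img.elemSize = static_cast<std::size_t>(channels);  // 1 byte per channel
    const std::size_t rowBytes = static_cast<std::size_t>(cols) * img.elemSize;
    // round the row length up to the next multiple of the alignment
    img.step = (rowBytes + alignment - 1) / alignment * alignment;
    img.buffer.assign(img.step * static_cast<std::size_t>(rows), 0);
    return img;
}
```

For a 5-pixel-wide, 3-channel image aligned on 4 bytes, a row holds 15 useful bytes but the effective width is 16 bytes; for a 4-pixel-wide image, the 12 useful bytes need no padding.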
The number of pixel values per row is then given by the following code:
int nc= image.cols * image.channels();
To simplify the computation of the pointer arithmetic, the cv::Mat class offers a method that directly gives you the address of an image row. This is the ptr method. It is a template method that returns the address of row number j:
uchar* data= image.ptr<uchar>(j);
Note that in the processing statement, we could have equivalently used the pointer arithmetic to move from column to column. So, we could have written the following code:
*data= *data/div*div + div/2;
data++;
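The equivalence between the two ways of writing the inner loop can be checked on a plain row buffer. The function names reduceRowIndexed and reduceRowPointer are hypothetical; the sketch works on a raw uchar array rather than on a cv::Mat row.

```cpp
#include <cassert>
#include <vector>

typedef unsigned char uchar;

// Indexed version: the row pointer stays fixed and data[i] selects the element.
void reduceRowIndexed(uchar* data, int nc, int div) {
    assert(div > 0);
    for (int i = 0; i < nc; i++) {
        data[i] = static_cast<uchar>(data[i] / div * div + div / 2);
    }
}

// Pointer version: the pointer itself moves from one element to the next.
void reduceRowPointer(uchar* data, int nc, int div) {
    assert(div > 0);
    for (int i = 0; i < nc; i++) {
        *data = static_cast<uchar>(*data / div * div + div / 2);
        data++;
    }
}
```

Both functions produce the same row; the choice between them is mostly a matter of readability.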
There's more...
The color reduction function presented in this recipe provides just one way of accomplishing this task. You could also use other color reduction formulas. A more general version of the function would also allow the specification of distinct input and output images. The image scanning can also be made more efficient by taking into account the continuity of the image data. Finally, it is also possible to use regular low-level pointer arithmetic to scan the image buffer. All of these elements are discussed in the following subsections.
In our example, color reduction is achieved by taking advantage of an integer division that floors the division result to the nearest lower integer as follows:
data[i]= (data[i]/div)*div + div/2;
The reduced color could also have been computed using the modulo operator, which brings us to the nearest lower multiple of div (the per-channel reduction factor), as follows:
data[i]= data[i] - data[i]%div + div/2;
Another option would be to use bitwise operators. Indeed, if we restrict the reduction factor to a power of 2, that is, div=pow(2,n), then masking out the n least significant bits of the pixel value would give us the nearest lower multiple of div. This mask would be computed by a simple bit shift as follows:
// mask used to round the pixel value
uchar mask= 0xFF<<n; // e.g. for div=16, mask= 0xF0
The color reduction would be given by the following code:
*data &= mask;       // masking
*data++ += div>>1;   // add div/2
In general, bitwise operations might lead to very efficient code, so they could constitute a powerful alternative when efficiency is a requirement.
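As a sanity check, the three formulas can be compared on every possible 8-bit value. The sketch below assumes div is 2 to the power n with n between 0 and 7; the function names are hypothetical and chosen only for this comparison.

```cpp
#include <cassert>

typedef unsigned char uchar;

// Integer division formula from the recipe.
uchar reduceDiv(uchar v, int n) {
    const int div = 1 << n;
    return static_cast<uchar>(v / div * div + div / 2);
}

// Modulo formula: subtract the remainder to reach the lower multiple of div.
uchar reduceMod(uchar v, int n) {
    const int div = 1 << n;
    return static_cast<uchar>(v - v % div + div / 2);
}

// Bitwise formula: clear the n least significant bits, then add div/2.
uchar reduceMask(uchar v, int n) {
    const int div = 1 << n;
    const uchar mask = static_cast<uchar>(0xFF << n);  // n=4 gives 0xF0
    return static_cast<uchar>((v & mask) + (div >> 1));
}

// True when the three formulas agree on all 256 input values.
bool formulasAgree(int n) {
    assert(n >= 0 && n <= 7);
    for (int v = 0; v < 256; ++v) {
        const uchar value = static_cast<uchar>(v);
        if (reduceDiv(value, n) != reduceMod(value, n)) return false;
        if (reduceDiv(value, n) != reduceMask(value, n)) return false;
    }
    return true;
}
```

Whether the bitwise form is actually faster depends on the compiler and the target processor, so it is worth measuring before relying on it.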
In our color reduction example, the transformation is directly applied to the input image, which is called an in-place transformation. This way, no extra image is required to hold the output result, which saves memory when this is a concern. However, in some applications, the user might want to keep the original image intact. The user would then be forced to create a copy of the image before calling the function. Note that the easiest way to create an identical deep copy of an image is to call the clone method; for example, take a look at the following code:
// read the image
image= cv::imread("boldt.jpg");
// clone the image
cv::Mat imageClone= image.clone();
// process the clone
// original image remains untouched
colorReduce(imageClone);
// display the image result
cv::namedWindow("Image Result");
cv::imshow("Image Result",imageClone);
This extra copying step can be avoided by defining a function that gives the user the option to either use or not use in-place processing. The signature of the function would then be as follows:
void colorReduce(const cv::Mat &image, // input image
                 cv::Mat &result,      // output image
                 int div=64);
Note that the input image is now passed as a const reference, which means that this image will not be modified by the function. The output image is passed as a reference such that the calling function will see the output argument modified by this call. When in-place processing is preferred, the same image is specified as the input and output:
colorReduce(image,image);
If not, another cv::Mat instance can be provided; for example, take a look at the following code:
cv::Mat result;
colorReduce(image,result);
The key here is to first verify whether the output image has an allocated data buffer with a size and pixel type that match those of the input image. Very conveniently, this check is encapsulated inside the create method of cv::Mat. This is the method that is to be used when a matrix must be reallocated with a new size and type. If, by chance, the matrix already has the size and type specified, then no operation is performed and the method simply returns without touching the instance.
Therefore, our function should simply start with a call to create that builds a matrix (if necessary) of the same size and type as the input image:
result.create(image.rows,image.cols,image.type());
The allocated memory block has a size of total()*elemSize(). The looping is then done with two pointers:
// in this version, nc counts columns only; the channels are counted separately
int nl= image.rows;
int nc= image.cols;
int nchannels= image.channels();
for (int j=0; j<nl; j++) {
  // get the addresses of input and output row j
  const uchar* data_in= image.ptr<uchar>(j);
  uchar* data_out= result.ptr<uchar>(j);
  for (int i=0; i<nc*nchannels; i++) {
    // process each pixel ---------------------
    data_out[i]= data_in[i]/div*div + div/2;
    // end of pixel processing ----------------
  } // end of line
}
In the case where the same image is provided as the input and output, this function becomes completely equivalent to the first version presented in this recipe. If another image is provided as the output, the function will work correctly irrespective of whether the image has or has not been allocated prior to the function call.
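The input/output pattern can be sketched on plain byte buffers. In this sketch, ensureSize is a hypothetical stand-in for the behavior described for create (reallocate only when the size differs), and colorReduceBuffer is a hypothetical name; neither is an OpenCV function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef unsigned char uchar;

// Hypothetical stand-in for the create behavior described above:
// reallocate only when the current size does not match the requested one.
void ensureSize(std::vector<uchar>& buffer, std::size_t size) {
    if (buffer.size() != size) {
        buffer.assign(size, 0);
    }
}

// Color reduction with separate input and output buffers.
// Passing the same vector for both arguments gives in-place processing,
// because ensureSize then leaves the buffer untouched.
void colorReduceBuffer(const std::vector<uchar>& input,
                       std::vector<uchar>& result,
                       int div = 64) {
    assert(div > 0);
    ensureSize(result, input.size());
    for (std::size_t i = 0; i < input.size(); i++) {
        result[i] = static_cast<uchar>(input[i] / div * div + div / 2);
    }
}
```

Calling colorReduceBuffer(src, dst) leaves src intact, while colorReduceBuffer(src, src) overwrites it; both calls produce the same reduced values.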
We previously explained that, for efficiency reasons, an image can be padded with extra pixels at the end of each row. However, it is interesting to note that when the image is unpadded, it can also be seen as a long one-dimensional array of WxH pixels. A convenient cv::Mat method can tell us whether the image has been padded or not. This is the isContinuous method that returns true if the image does not include padded pixels. Note that we could also check the continuity of the matrix by writing the following test:
// check if size of a line (in bytes)
// equals the number of columns times pixel size in bytes
image.step == image.cols*image.elemSize();
To be complete, this test should also check whether the matrix has only one line; in which case, it is continuous by definition. Nevertheless, always use the isContinuous method to test the continuity condition. In some specific processing algorithms, you can take advantage of the continuity of the image by processing it in one single (longer) loop. Our processing function would then be written as follows:
void colorReduce(cv::Mat &image, int div=64) {
  int nl= image.rows; // number of lines
  int nc= image.cols * image.channels();
  if (image.isContinuous()) {
    // then no padded pixels
    nc= nc*nl;
    nl= 1; // it is now a long 1D array
  }
  // this loop is executed only once
  // in case of continuous images
  for (int j=0; j<nl; j++) {
    uchar* data= image.ptr<uchar>(j);
    for (int i=0; i<nc; i++) {
      // process each pixel ---------------------
      data[i]= data[i]/div*div + div/2;
      // end of pixel processing ----------------
    } // end of line
  }
}
Now, when the continuity test tells us that the image does not contain padded pixels, we eliminate the outer loop by setting the number of lines to 1 and the number of elements per line to WxH times the number of channels. Note that there is also a reshape method that could have been used here. You would write the following in this case:
if (image.isContinuous()) {
  // no padded pixels
  // reshape returns a new header on the same data, so keep its result
  image= image.reshape(1,   // new number of channels
                       1);  // new number of rows
}
int nl= image.rows; // number of lines
int nc= image.cols * image.channels();
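The reshape method changes the matrix dimensions without requiring any memory copying or reallocation. It does not modify the matrix on which it is called; it returns a new matrix header that shares the same data, so its result must be assigned. The first parameter is the new number of channels and the second one is the new number of rows. The number of columns is readjusted accordingly.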
In these implementations, the inner loop processes all image pixels in a sequence. This approach is mainly advantageous when several small images are scanned simultaneously into the same loop.
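The effect of the continuity test on the loop structure can be observed with a plain buffer that carries its own row stride. RawImage, isContinuous, and colorReduceRaw below are hypothetical names used only for this sketch; the structure stands in for the row, column, and step information discussed above and is not cv::Mat.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef unsigned char uchar;

// Hypothetical raw image description (not cv::Mat).
struct RawImage {
    int rows;
    int rowBytes;               // useful bytes per row (cols * channels for 8-bit data)
    std::size_t step;           // bytes per row, padding included
    std::vector<uchar> buffer;  // rows * step bytes
};

// Continuous means no padding at the end of the rows;
// a single-row image is continuous by definition.
bool isContinuous(const RawImage& img) {
    return img.rows == 1 || img.step == static_cast<std::size_t>(img.rowBytes);
}

// Returns the number of outer-loop iterations actually performed.
int colorReduceRaw(RawImage& img, int div = 64) {
    assert(div > 0);
    int nl = img.rows;
    int nc = img.rowBytes;
    if (isContinuous(img)) {
        nc = nc * nl;  // treat the whole buffer as one long row
        nl = 1;
    }
    for (int j = 0; j < nl; j++) {
        uchar* data = img.buffer.data() + static_cast<std::size_t>(j) * img.step;
        for (int i = 0; i < nc; i++) {
            data[i] = static_cast<uchar>(data[i] / div * div + div / 2);
        }
    }
    return nl;
}
```

On an unpadded buffer the outer loop runs once; on a padded one it runs once per row and the padding bytes are never touched.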
In the cv::Mat class, the image data is contained in a memory block of unsigned chars. The address of the first element of this memory block is given by the data attribute that returns an unsigned char pointer. So, to start your loop at the beginning of the image, you could have written the following code:
uchar *data= image.data;
Moving from one row to the next could have been done by moving your row pointer using the effective width as follows:
data+= image.step; // next line
The step attribute gives you the total number of bytes (including the padded pixels) in a line. In general, you can obtain the address of the pixel at row j and column i as follows:
// address of pixel at (j,i) that is &image.at(j,i)
data= image.data+j*image.step+i*image.elemSize();
However, even if this would work in our example, it is not recommended that you proceed this way.
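The address computation itself can be tried on a raw byte array. The function pixelAddress is a hypothetical helper written for this sketch; it takes the step and the pixel size as plain parameters instead of reading them from a cv::Mat.

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned char uchar;

// Address of the first byte of the pixel at row j and column i,
// given the row stride (step) and the pixel size (elemSize), both in bytes.
uchar* pixelAddress(uchar* data, std::size_t step, std::size_t elemSize,
                    int j, int i) {
    assert(j >= 0 && i >= 0);
    return data + static_cast<std::size_t>(j) * step
                + static_cast<std::size_t>(i) * elemSize;
}
```

With a step of 16 bytes and 3-byte pixels, the pixel at row 1 and column 2 starts at byte offset 1*16 + 2*3 = 22. Nothing in this computation checks that j and i stay inside the image, which is one reason why the row-based access of the ptr method is easier to use safely.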
See also
- The Writing efficient image-scanning loops recipe in this chapter proposes a discussion on the efficiency of the scanning methods presented here