AP 186 Activity 7: Color se-g-men-t-ation.

Welcome back to the series of AP 186 blogs! Midterms just finished, and I realized that I was not too satisfied with the way I did my previous blogs,especially the last one. Yikes! As such, I would want to up the ante, and improve the structure and content of my blogs.

god_dammit__leeroy_by_zamzoph-500x321
“Time’s up. Let’s do this.”

For this blog post, as well as the succeeding ones, I’m going to divide each of my AP 186 blog posts to two parts:

1. For the casual reader

I will showcase most of the rendered products of the activity in for the casual visitor. The aim of this section is to let the unknowing layperson who visits my blog see the objectives of the topics being accomplished in the most simple way. Simple tweaking of parameters, very few equations, and absolutely NO scilab/MATLAB code to be read! Perfect for the casual reader.

2. For the image programmer (or AP 186 student)

The contents of this section will house the nitty gritty details of how each part was executed. A larger portion of the blog will always be housed here, and is relevant to the following set of people:

  • myself, for future reference
  • Ma’am Jing, who will grade my gracious work
  • future employers who will be referred to this as my portfolio
  • and, future (or even current, but I am late to blogging once more) AP 186 students

I hope that my future blogs become more than just a school requirement.

With that out of the way, I present to you the next part of the AP 186 blog series, which focuses on

Segmentation (and COLORS!)

but more specifically, image segmentation!

A. For the casual reader

A1. Grayscale segmentation

Image segmentation is the process of obtaining regions of interest (or ROI) of certain photographs to be able to execute further processes. The pixels of the region of interest can have collective characteristics that may be extracted and found to be common with other regions in the circle.

color-selection-tool
Colors can also be segmented with this tool!

As pretty much with everything, images in grayscale are much easier to manage. Let’s start by going over an example of the mathematician’s insecure check:

cropped_grayscale_check
So much for less than a penny.

From my past blogs (or if you’re new to the whole image processing world), images can be rendered from numerical matrices with each element corresponding to a pixel (px) in the digital photo. To simplify matters, imagine the range of values running from 0-255. 0 would represent the darkest of pixels, and 255 would be the whitest.

Segmentation is performed on this check so that it focuses on the pixels that reach a certain brightness threshold, and then treat the pixels that didn’t make the cut as plain black. Below, you would see the example of segmentation at px < 127 (note that 255 is the max value a px can have).

7a_check_threshold_127
Segmented at px <127.

And since I love making gifs to be more economic in blog space, here is the check segmented under equally-spaced pixel thresholds.

px_threshold.gif
Voila.

Part A2. Color segmentation

Due to the insatiable nature of people’s satisfaction, you would also want to see how segmentation via color is performed. You’re in luck, as we were also tasked to do this part!

I’ve prepared 5 images for this part, each photo having a dominant color, or none at all.

Such a dazzling array of color.

In the same way that grayscale image segmentation deals with choosing a certain threshold brightness value to select, color segmentation also needs a region of interest. Here are some examples of regions of interest chosen from the photographs above.

When we use our common programs (such as paint, or photoshop, or GIMP) to edit or generate photographs, oftentimes we see the RGB system of mixing colors. Trichomaticity has been a common system of generating colors, although for the simplicity of this activity, I will resort to using normalized chromaticity coordinates (NCC). The conversion of RGB to NCC effectively reduces one degree of freedom (chosen to be blue) so that only two values are tweaked (red, green). The conversion from RGB to NCC is as follows:

r = \frac{R}{R+G+B}

g = \frac{G}{R+G+B}

b = 1 - r - g = \frac{B}{R+G+B}

Screenshot 2016-10-26 21.27.09.png
y axis for green; x axis for red values.

In this activity, I encountered two methods of color segmentation: parametric and nonparametric. The specifics between he two methods will be discussed in the second part, but all you need to know for now is that parametric assumes that the colors follow a certain equation, or more specifically, a distribution.

Nonparametric methods obtain a histogram of color values from the ROI and then using the histogram as a lookup reference to change the pixel values of the resulting grayscale image. I also included the histogram overlain with the NCC triangle to verify the colors of the chosen region.

Changing the size of the region of interest can have noticeable differences to the nonparametric segmentation:

See that as the region of interest is increased, more regions in both the parametric (3rd column) non-parametric segmentation (last column) become brighter. Also notice that the parametric segmentation was able to provide a smoother segmentation as compared to the non-parametric generation, but the non-parametric method was able to produce an image faster than the parametric method, and no assumptions to distributions were made.

I also explored changing the binning of the histograms in the non-parametric method.

There isn’t much of a difference in varying the binning size, but using 64 bins  (bottom row) takes longer to reconstruct than 32 bins (top row), so it is advised to keep the histogram bins at 32 divisions unless the situation where a more sensitive reconstruction is required.

Now let’s get down to the explanation (code included!)

B. For the image programmer

B1. Intensity segmentation

Let’s go over the intensity segmentation code for now.

//7A: Intensity segmentation
check = imread('cropped_grayscale_check.jpg');
[count, cells] = imhist(check, 256); //imhist: image histogram. 
plot (cells, count);
xlabel('pixel value', 'fontsize', 3);
ylabel('count','fontsize',3);
title('Pixel value count for check','fontsize',4);
[maxVal maxInd] = max(count); 
//Max count: 4768; Index: 194.

The code above shows the histogram generation of the check. The imhist command does not provide a normalized histogram, but it will do considering that the value where the histogram is centered is the important one.

histogram-for-intensity-segmentation
Image histogram

To make sense of the histogram, let’s have a view of that obnoxious check again.

cropped_grayscale_check

The image is mostly light gray, which would explain why the maximum value would be near 194. The muddled section to the left of the “Gaussian” curve is due to the black inking of the check around it. Since there’s no apparent white areas, the near 255 values do not have any hits at all.

Okay, now we explore different thresholds for the segmentation process. Code is below.

thresh_div = 16 //threshold divisions.
for i = 1:thresh_div
 BW = check < int(i*255/thresh_div); //Only shows pixel values less than px threshold
 imwrite(BW,'7A_check_threshold_'+string(int(i*255/thresh_div))+'.png')
end
clear

Note that since we’re dealing with more memory right now, I’d have to clear the memory after the code for the intensity segmentation.

Remember the first gif I posted above? Above is the code in how to write it. The third line is the ‘meat’ of the code. BW is basically the check matrix, but will only show the values that are less than the specified value. The second line also indicates the threshold value. In here, I chose to automate the generation of files. Basically, the chose values are the multiples of 256/16 = 16 (but then the max pixel value is 255 so I just converted the float values to integers).

px_threshold

The second line is also a Boolean operator. Every pixel that satisfies the operator returns as 1, and all the others 0. That’s why when the image is rewritten, the satisfying pixels returned as white in the segmentation. Notice that majority of the pixels light up near the maximum value (from 175 to 191, there is a huge leap of pixels whitening). This is in agreement with the histogram above.

B2. Region of interest

Here is the code for choosing the region of interest.

//7B: Color segmentation
//7B_1 = 190:212,383:439; 190:294,383:466 for wider
//7B_2 = 279:342,435:505
//7B_3 = 112:130,192:208
//7B_4 = 196:221,264:294; 196:264,264:332 for wider; widest 182:319,230:355
//7B_5 = 174:230,343:452

The comments above are the chosen regions based on pixel coordinates.

i = 4 //choose which photo
BINS = 32; //set bin size
y1 = [190,279,112,182,174]; y2 = [294,342,130,319,230];
x1 = [383,435,192,230,343]; x2 = [466,505,208,355,452];

The variable i basically picks which of the 5 photos would bechosen. BINS sets the bin size for the non-parametric segmentation for later. The 4 lists are basically the coordinates from the comments a while ago. They’ll be picked by what value of i is for later. y1 and y2 are the ranges for the y coordinates (rows), and x1 and x2 are the ranges of the x coordinates (columns).

I = double(imread('7B_'+string(i)+'.jpg'));
R = I(:,:,1); G = I(:,:,2); B = I(:,:,3); Int= R + G + B; 
Int(find(Int==0))=100000;
r = R./ Int; r_roi = r(y1(i):y2(i),x1(i):x2(i));
g = G./Int; g_roi = g(y1(i):y2(i),x1(i):x2(i));

The last section basically reads the image of choice. I deliberately named the images with string values for my convenience. The next line separates the 3 dimensional matrix to their red, green, and blue channels, and the last two lines convert the channels to NCC coordinates. The middle channel converts all zero values of the matrices to a high value so that division by a zero value in the last two lines do not produce a computational error.

220px-bh_lmc
You know what happens when division by 0 occurs.

Bascially, we want those pixels with a zero value to be divided by a large enough value t become zero anyway. It doesn’t do much with the code.

The chosen regions of interest are as follows:

My rationale for choosing these photos as well as the regions of interest are as follows:

  • First photo: green region. And it’s in my bucket list to see the Aurora Borealis.
  • Second photo: yellow region, as well as I wanted a near homogeneous region of interest. The Macbeth color chart was also something that was used for our AP 187 class.
  • Third photo: red region. And I love strawberries.
  • Fourth photo: human skin color region. And the baby is in stockings.
  • Fifth photo: blue region. Globe is used since it is my telecommunications service provider.

B3. Parametric segmentation

The parametric method what we want deals with the Gaussian distribution,

p(x) = \frac{1}{\sigma_{x}\sqrt{2\pi}}\exp{\frac{-(x-\nu_{x})^2}{2\sigma_{x}}}

Where \sigma_{x} and \nu_{x} are the standard deviation and the mean of the pixel values of the chosen channels of the NCC. For the purposes of this activity, the two chosen channels are green and red. In here, the standard deviation and the mean are taken from the red and green values of the pixels in the chosen regions of interest, then the probability density function (PDF) of each NCC channel is obtained by using the equation above. Basically, we’ll have both p(r) and p(g) for the chosen region of interest.

The code for parametric segmentation is given below:

//Parametric
u_r = mean(r_roi); u_g = mean(g_roi);
o_r = stdev(r_roi); o_g = stdev(g_roi);
pr = 1/length(r_roi)*(o_r*sqrt(%pi*2))*exp(-0.5*(r-u_r).^2/o_r^2);
pg = 1/length(g_roi)*(o_g*sqrt(%pi*2))*exp(-0.5*(g-u_g).^2/o_g^2);
param_end = pr.*pg
imwrite(mat2gray(param_end),'7B_'+string(i)+'_param_widest.png');
//I_param = uint8(255*param_end/max(param_end));
//imwrite(I_param,'7B_'+string(i)+'_param2.png');

The first two lines deal with obtaining the mean and standard deviations of the system. The next two lines generate the parametric segmentations of r and g using the Gaussian distribution. And the param_end variable basically multiplies the two independent distributions to result in the image, as the imwrite writes the image to a file. The remaining commented lines are alternatives to the mat2gray function.

The resulting images are shown below (already seen a while ago):

One thing to notice for the parametric segmentation is that the segmentation of colors is rather smoother because of the nature of its distribution. Parametric distributions are more often than not, continuous. The expected regions were also highlighted in white, similar to the way the check was segmented in the first part. Let’s move on first to the non-parametric segmentation before we make any further comparisons.

B4. Non-parametric segmentation

The code for the non-parametric segmentation is given below. Note that while the code for the non-parametric segmentation may seem longer, it assumes less about the distribution and actually uses a histogram from actual values. I’ll split the code to histogram-generation and backprojection.

B4.1. Histogram-generation

//BINNING
//discretization of axes for histogram
rint_roi = round(r_roi*(BINS-1) + 1); gint_roi = round(g_roi*(BINS-1) + 1);
//dicretization of axes for backprojection
rint = round(r*(BINS-1) + 1); gint = round(g*(BINS-1) + 1);
//generates call list for each histogram cell
colors = gint_roi(:) + (rint_roi(:)-1)*BINS;

First part of the histogram-generation deals with the binning of the pixel values. I did both for the region of interest for the histogram, and the whole picture for the backprojection. colors is an interesting variable in the sense that it generates a call-number for each ordered pair in the NCC histogram space. I’ll explain a bit more in a while.

hist = zeros(BINS,BINS);
for row = 1:BINS
 for col = 1:(BINS-row+1)
 hist(row,col) = length(find(colors==(col + (row- 1)*BINS)));//lookup caller
 end;
end;
imwrite(hist,'7B_'+string(i)+'_'+string(BINS)+'_histogram_widest.png');

This section of the histogram code generates the histogram matrix itself. Remember colors? Well, colors basically splits the call number to two parts: its units digit (green) and it’s “tens” digit (red). “Tens” because it’s not really the tens digit, but think of it as the call-list is in base-32. Therefore, the secondary place value of the call number is your red value (from 1-32), as the primary place value is your green value (from 1-32 as well). But, as the histogram appears as a triangle, the numbers are clipped (The call list doesn’t really reach a maximum value of 32 x 32 = 1024, but only until 32 x 31 + 1 = 993).

Another peculiarity to note for this code is that it generates a rotated version of the NCC:

screenshot-2016-10-26-21-27-09

And the generated colors to histogram actually appears like this:

Screenshot 2016-11-02 10.44.40.png
Brought to you by excel

Where the red values are actually on the left side, and the green values are on the top side. The histogram was generated this way due to the double for loop of the code above. It uses the length function to determine how many pixels have the specific red and green NCC values by finding its call number.

giphy
To whoever made the call list method…

The resulting histograms formed are in black and white, and after rotating and overlaying them to the NCC triangle, we’ll have something like these:

With the overlaying, I have effectively seen which colors are activated in the histogram.

(further improvements: the histogram masks I used are binarized, and not grayscale. For future users, please use mat2gray function on the histogram before writing the file so that the overlaying will be able to properly determine the weights of each NCC coordinate).

B4.2. Backprojection

The backprojection part of the non-parametric segmentation deals with getting the r and g pixel values based on the  coordinates of the histogram to an empty matrix that has the size of the image. This way, the reconstituted pixels have r, g, and b values that are within the value ranges of the histogram.

//BACKPROJECTION
backproj = zeros(I(:,:,1)); //sets matrix
for row = 1:size(backproj,1)
 for col = 1:size(backproj,2)
 backproj(row,col) = hist(rint(row,col),gint(row,col))
 end
end

B4.3. Thresholding

threshold = 0.25 //chooses threshold value (0.25, 0.5 or 0.75)
backproj_1 = mat2gray(backproj)
backproj_2 = backproj_1 >threshold
imwrite(backproj_1,'7B_'+string(i)+'_'+string(BINS)+'_nonparam.jpg');
imwrite(backproj_2,'7B_'+string(i)+'_'+string(BINS)+'_nonparam_mask.jpg');
clear

The last part of the code deals with thresholding. The backprojected matrices are binarized similarly with the histogram matrices, so we need to get the mat2gray of the matrices first. After doing so, the values of the matrix elements will now range from 0 to 1 (and not just exclusively 0 and 1). With this mat2gray function being applied, we can now apply a certain threshold value. If the threshold is lower, more pixels are converted to 1 and the rest are left 0. The resulting matrix can be used as a mask to the original image to isolate the pixels having your desired color based on the ROI histogram made a while ago. Below is the baby picture overlain with masks of different thresholds. I also added two more sizes of regions of interest, one larger and one smaller, to be able to see the difference in the thresholding as well.

thresholding

As seen in the photo above, choosing a wider ROI greatly affects the thresholding, so you can have two parameters to vary your threshold value. You just have to be very careful in choosing your ROI, but it will be up to you how to do it.

C. Evaluation

This has been my most comprehensive blog so far. Albeit being a bit later than usual, I was able to provide additional visualizations to the histogram so that people have a better idea of what their regions of interest are portraying. And due to my excellent rationales for choosing the photographs, I rate my self 10/10 for this activity.

D. References and Acknowledgements

  1. Dr. Maricor Soriano’s Activity Manual on Image Segmentation.
  2. My seatmate Roland Romero for his valuable advice on the thresholding as well as helping me provide ways to use the threshold images as masks to the original images. Check his blog out here.
  3. Angelo Rillera’s blog for giving me inspiration to work hard on making and understanding the histogram. I learned a lot from here.
  4. And to Ma’am Jing for her tireless effort in teaching the class repetitively on the parametric segmentation.

Leave a comment