Face Detection

Henry Chang and Ulises Robles

Skin Regions

Using the result from the previous section, we proceed to determine which regions could correspond to a frontal human face. To do so, we first need to determine the number of skin regions in the image.

A skin region is defined as a closed region in the image that can have zero, one or more holes inside it. In the binary image, its boundary is represented by pixels with value 1. We can also think of it as a connected component of the image [2]. All holes in a binary image have a pixel value of zero (black).

To determine how many regions a binary image contains, we label the regions. A label is an integer value. We used an 8-connected neighborhood (i.e., all eight neighbors of a pixel) to determine the label of a pixel. If any of the neighbors already has a label, we give the current pixel that label. If not, we use a new label. At the end, we count the number of labels, which is the number of regions in the segmented image.
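The labeling step can be sketched as follows. This is a minimal illustration in Python, not the Matlab code used in the project, and it swaps in a breadth-first flood fill for the neighbor-propagation scan described above; the function name `label_regions` and the sample image are assumptions made for the example.

```python
from collections import deque

# The eight (row, column) offsets that define an 8-connected neighborhood.
NEIGHBORS_8 = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]


def label_regions(image):
    """Label 8-connected regions of 1-pixels; return (labels, region_count)."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 means "no label yet"
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or labels[r][c] != 0:
                continue
            # Unlabeled skin pixel: open a new label and flood the region.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in NEIGHBORS_8:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Hypothetical binary image: the pixel at (2, 1) touches (1, 0) diagonally,
# so 8-connectivity places both in the same region.
image = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
]
labels, count = label_regions(image)
print(count)  # number of regions found
```

With a 4-connected neighborhood the diagonal pixel would become a region of its own, which is one reason the choice of neighborhood matters for the final region count.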

To separate the regions, we scan the labeled image for the label of the region we are looking for and create a new image that has ones in the positions where that label occurs. All other positions are set to zero. We then iterate through each of the regions found to determine whether it might be a frontal human face. Figure 7 shows the segmented skin regions from the previous section, as well as a particular skin region selected by the system that corresponds to the face in the baby image.


Figure 7. (Left) Segmented Skin Regions. (Right) A Skin Region
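Separating one region from a labeled image takes only a few lines. The sketch below assumes the labels are stored as a list of rows of integers, with 0 for background; the name `extract_region` and the sample label image are hypothetical.

```python
def extract_region(labels, target):
    """Return a binary image with ones only where `labels` equals `target`."""
    return [[1 if value == target else 0 for value in row] for row in labels]


# Hypothetical label image containing two regions, labeled 1 and 2.
labels = [
    [1, 1, 0, 0],
    [1, 0, 0, 2],
    [0, 0, 2, 2],
]
region_two = extract_region(labels, 2)
for row in region_two:
    print(row)
```

Calling the function once per label yields one single-region image per skin region, which is the form the later tests operate on.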

Number of holes inside a region

After experimenting with several images, we decided that a skin region representing a face should have at least one hole inside it. Therefore, we discard the regions that have no holes. To determine the number of holes inside a region, we compute the Euler number [5] of the region, defined as:

E = C - H

where E: The Euler number
C: The number of connected components
H: The number of holes in a region.

The development tool (Matlab) provides a way to compute the Euler number. In our case, the number of connected components is already known to be 1, since we consider one skin region at a time. The number of holes is then:

H = 1 - E

where H: The number of holes in a region
E: The Euler number.
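To make the relation between E and H concrete, the following sketch counts the holes of a single region directly and then derives the Euler number as E = 1 - H. It is an illustration in Python, not the Matlab routine used in the project. It assumes the region is 8-connected, so the background is traced with 4-connectivity, and the name `count_holes` is hypothetical.

```python
from collections import deque

# Background is traced with 4-connectivity, the usual complement of an
# 8-connected foreground.
NEIGHBORS_4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def count_holes(region):
    """Count background components that are fully enclosed by the region."""
    rows, cols = len(region), len(region[0])
    # Pad with a one-pixel background border so that all outside background
    # forms a single component reachable from the corner.
    padded = [[0] * (cols + 2)]
    padded += [[0] + list(row) + [0] for row in region]
    padded += [[0] * (cols + 2)]
    visited = [[False] * (cols + 2) for _ in range(rows + 2)]

    def flood(start_r, start_c):
        visited[start_r][start_c] = True
        queue = deque([(start_r, start_c)])
        while queue:
            y, x = queue.popleft()
            for dy, dx in NEIGHBORS_4:
                ny, nx = y + dy, x + dx
                if (0 <= ny < rows + 2 and 0 <= nx < cols + 2
                        and padded[ny][nx] == 0 and not visited[ny][nx]):
                    visited[ny][nx] = True
                    queue.append((ny, nx))

    flood(0, 0)  # mark the outside background
    holes = 0
    for r in range(rows + 2):
        for c in range(cols + 2):
            if padded[r][c] == 0 and not visited[r][c]:
                holes += 1  # unvisited background is enclosed: a new hole
                flood(r, c)
    return holes


# Hypothetical face-like region: two "eye" holes and one "mouth" hole.
region = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 1, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
holes = count_holes(region)
euler = 1 - holes  # E = C - H with C = 1
print(holes, euler)
```

A region for which `count_holes` returns 0 would be discarded by the rule above.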

Once the system has determined that a skin region has at least one hole inside it, we proceed to analyze some characteristics of that particular region. We first create a new image containing only that region. The rest is set to black.

Center of the mass

To study the region, we first need to determine its area and its center. There are many ways to do this. One efficient way is to compute the center of mass (i.e., the centroid) of the region [5]. The center of area in binary images is the same as the center of mass, and it is computed as shown below:

\bar{x} = \frac{1}{A} \sum_{i=1}^{n} \sum_{j=1}^{m} j \, B[i,j]

\bar{y} = \frac{1}{A} \sum_{i=1}^{n} \sum_{j=1}^{m} i \, B[i,j]

where: B is the [n x m] matrix representation of the region.
A is the area in pixels of the region, i.e. A = \sum_{i=1}^{n} \sum_{j=1}^{m} B[i,j]

Note that for this computation, we are also considering the holes that the region has.
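As a small worked example, the area and center of mass of a binary region can be computed as the pixel count and the mean column and row index of its pixels. The sketch below is in Python with 0-based indices, and the name `area_and_centroid` is an assumption made for the example.

```python
def area_and_centroid(B):
    """Return (area, x_bar, y_bar) for a binary region B given as rows of 0/1."""
    area = sum(sum(row) for row in B)
    if area == 0:
        raise ValueError("region has no pixels")
    # x runs along columns (index j), y runs along rows (index i).
    x_bar = sum(j * B[i][j] for i in range(len(B)) for j in range(len(B[0]))) / area
    y_bar = sum(i * B[i][j] for i in range(len(B)) for j in range(len(B[0]))) / area
    return area, x_bar, y_bar


# Hypothetical region: a 3 x 2 block of ones in columns 1 and 2.
B = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
area, x_bar, y_bar = area_and_centroid(B)
print(area, x_bar, y_bar)  # 6 1.5 1.0
```

The centroid falls between columns 1 and 2 and on the middle row, as expected for a symmetric block.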

Orientation

Most of the faces we considered in this project are vertically oriented. However, some of them are slightly tilted. We expect a better match if we rotate our template face by the same angle as the region. One way to define a unique orientation is to use the axis of elongation of the object: the orientation of this axis determines the orientation of the region. About this axis, the inertia of the region is at its minimum.

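The axis is computed by finding the line for which the sum of the squared distances between the region points and the line is minimum. In other words, we compute a least-squares fit of a line to the region points in the image [5]. At the end of the process, the angle of inclination (theta) is given by:

\theta = \frac{1}{2} \arctan\left( \frac{b}{a - c} \right)

where:

a = \sum_{i=1}^{n} \sum_{j=1}^{m} (x'_{ij})^2 \, B[i,j]
b = 2 \sum_{i=1}^{n} \sum_{j=1}^{m} x'_{ij} \, y'_{ij} \, B[i,j]
c = \sum_{i=1}^{n} \sum_{j=1}^{m} (y'_{ij})^2 \, B[i,j]

and:

x'_{ij} = x_{ij} - \bar{x}
y'_{ij} = y_{ij} - \bar{y}

with (\bar{x}, \bar{y}) the center of mass of the region.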
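The least-squares axis can be illustrated with the standard second-moment form of the computation. The sketch below is in Python and makes several assumptions: the names `a`, `b` and `c` are labels for the second moments about the center of mass, indices are 0-based, and `math.atan2` is used so that the case a = c does not divide by zero and the minimum-inertia axis is selected rather than the maximum.

```python
import math


def orientation(B):
    """Return the angle (radians) of the least-second-moment axis of region B.

    The angle is measured from the x (column) axis toward the y (row) axis,
    which points downward in image coordinates.
    """
    pixels = [(j, i) for i, row in enumerate(B)
              for j, value in enumerate(row) if value == 1]
    if not pixels:
        raise ValueError("region has no pixels")
    area = len(pixels)
    x_bar = sum(x for x, _ in pixels) / area
    y_bar = sum(y for _, y in pixels) / area
    # Second moments about the center of mass.
    a = sum((x - x_bar) ** 2 for x, _ in pixels)
    b = 2 * sum((x - x_bar) * (y - y_bar) for x, y in pixels)
    c = sum((y - y_bar) ** 2 for _, y in pixels)
    return 0.5 * math.atan2(b, a - c)


# Hypothetical region: a diagonal line running down and to the right.
diagonal = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
theta = orientation(diagonal)
print(math.degrees(theta))
```

For a perfectly round region a = c and b = 0, so no axis is preferred; in that case this sketch simply returns 0.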

Width and height of the region

At this point, we have the center of the region and its inclination. We still need to determine the width and height of the region so that we can resize our template face to the same width and height as the region.

First, we fill in any holes that the region might have, so that they do not interfere with the boundary search. Since the region is rotated by some angle theta, we need to rotate it by -theta degrees so that it is completely vertical. We then determine the height and width by moving 4 pointers, one each from the left, right, top and bottom of the image. When a pointer finds a pixel value different from 0, it stops, and that position is the coordinate of a boundary. Once we have the 4 values, we compute the height by subtracting the top value from the bottom value and the width by subtracting the left value from the right value.
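The four-pointer search can be sketched as below. The example is in Python and assumes the region has already been rotated upright and had its holes filled, neither of which is shown; the names `first_nonzero` and `region_extent` are hypothetical.

```python
def first_nonzero(lines):
    """Return the index of the first line that contains a non-zero pixel."""
    for index, line in enumerate(lines):
        if any(value != 0 for value in line):
            return index
    raise ValueError("region has no pixels")


def region_extent(region):
    """Return (top, bottom, left, right, height, width) of a binary region."""
    rows = [list(row) for row in region]
    columns = [list(column) for column in zip(*rows)]
    top = first_nonzero(rows)                                # pointer moving down
    bottom = len(rows) - 1 - first_nonzero(rows[::-1])       # pointer moving up
    left = first_nonzero(columns)                            # pointer moving right
    right = len(columns) - 1 - first_nonzero(columns[::-1])  # pointer moving left
    # Differences of boundary coordinates, as described in the text above.
    return top, bottom, left, right, bottom - top, right - left


# Hypothetical upright region with its holes already filled.
region = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
top, bottom, left, right, height, width = region_extent(region)
print(height, width)  # 3 3
```

The differences count the steps between boundary coordinates; adding 1 to each would give the inclusive number of pixel rows and columns instead.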

Region ratio

We can use the width and the height of the region to improve our decision process. The height to width ratio of human faces is around 1. In order to have fewer misses, however, we determined that a good minimum value is 0.8. Ratio values below 0.8 do not suggest a face, since human faces are oriented vertically.

The ratio should also have an upper limit. By analyzing the results of our experiments, we determined that a good upper limit is around 1.6. There are some situations, however, in which we do have a human face but the ratio is higher. This happens when the person has no shirt or is dressed in such a way that part of the neck and the area below it is uncovered. To account for these cases, we set the ratio to 1.6 and eliminate the part of the region below the height corresponding to this ratio.

While the above improves the classification, it can also be a drawback for cases such as very long arms. If the skin region for an arm has holes near the top, this might lead to a false classification.
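The ratio test described in this section can be summarized in a short sketch. It is written in Python; the thresholds 0.8 and 1.6 come from the text, while the function name `apply_ratio_rule` and its return convention are assumptions made for the example.

```python
MIN_RATIO = 0.8  # below this, the region is too wide to suggest a face
MAX_RATIO = 1.6  # above this, the region is cropped rather than rejected


def apply_ratio_rule(height, width):
    """Return (is_candidate, height_to_use) for a region's height and width."""
    if width <= 0:
        return False, height
    ratio = height / width
    if ratio < MIN_RATIO:
        return False, height
    if ratio > MAX_RATIO:
        # Keep only the top part of the region, down to the height that
        # corresponds to the maximum ratio.
        return True, MAX_RATIO * width
    return True, height


print(apply_ratio_rule(60, 100))   # too wide: rejected
print(apply_ratio_rule(120, 100))  # within limits: kept unchanged
print(apply_ratio_rule(250, 100))  # too tall: kept, but cropped in height
```

A long arm with holes near its top would pass this rule after cropping, which is the false classification risk noted above.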

Template Face

One of the most important characteristics of this method is that it uses a human face template to make the final decision on whether a skin region represents a face. This template was obtained by averaging 16 frontal view faces of males and females wearing no glasses and having no facial hair. The template we used is shown in Figure 8. Notice that the left and right borders of the template are located at the center of the left and right ears of the averaged faces. The template is also vertically centered at the tip of the nose of the model.

Figure 8. Template face (model) used to verify the existence of faces in skin regions.

At this point, we have all the required parameters to do the matching between the part of the image corresponding to the skin region and the template human face. Template matching is described in the next section.


Last modified: Thu. May 25, 2000