Facial recognition

# Did you know

· That babies identify their mother's face within half an hour of birth?

· That we can instantly (150 ms) recognize over 1000 faces?

· That over half the cortex is involved in visual processing (more than when doing math!)?

In general, images contain information in a very dense and complex form. A machine must rely on complex (numerical) models of the image in order to understand its contents. Recognizing a face is about identifying who the person is. Tracking a face is about locating a face in an image or in a video sequence. We usually remember people by their facial coloring, their most significant features, and so on. How much we rely on such features becomes very evident when we try to tell twins apart. In this application, we explain one way of using our understanding of images to recognize faces.

When it comes to 2D-pattern recognition you need to know what you're looking for:

On the left of the following figure is a tumbling object consisting of a wire-frame pyramid over a planar hexagon. On the right is a similarly tumbling wire-frame cube. From one point of view they appear identical, but no observer of this common view would imagine they were seeing the pyramid wire-frame object.

When recognizing 2D-faces, many factors come into play, like different lighting conditions, expressions and others:

As an example, the following 70 images were taken with various values of the angles θ and Φ. Each is centered on the eyes, then masked:

(Figure *)

Each of these 70 images can be represented by a vector in the space $\mathbb{R}^N$ of grayscale values, where $N$ is the number of pixels (i.e. the rows of the image are unstacked into one long vector).

For example, the following image of Bogart:

can be represented by the vector:

v=(...,0.858824, 0.615686, 0.407843, 0.396078, 0.65098, 1, 0.905882, 0.878431, 0.917647, 0.901961, 0.917647, 0.870588, 0.882353, ...)

in the space $\mathbb{R}^{18300}$ (here $N = 122 \times 150 = 18300$).
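The "unstack rows" step can be sketched in a few lines of Python with NumPy. This is only an illustration: the function name `image_to_vector` is hypothetical, the random array stands in for a real photograph, and the orientation (150 rows of 122 pixels) is an assumption, since the text only gives the product 122x150.

```python
import numpy as np

def image_to_vector(image):
    """Stack the rows of a 2-D grayscale image into one 1-D vector."""
    image = np.asarray(image, dtype=float)
    if image.ndim != 2:
        raise ValueError("expected a 2-D grayscale image")
    # Row-major order: all of row 0, then all of row 1, and so on.
    return image.reshape(-1)

# Hypothetical stand-in for a photograph: 150 rows of 122 grayscale
# values, each between 0 (black) and 1 (white).
rng = np.random.default_rng(0)
image = rng.random((150, 122))

v = image_to_vector(image)
print(v.shape)  # (18300,)
```

With this ordering, pixel (row r, column c) of the image ends up at position 122*r + c of the vector.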

So let

$$v_1, v_2, \dots, v_{70}$$

be the vectors corresponding to the images in Figure (*) above. Let

$$\bar{v} = \frac{1}{70} \sum_{i=1}^{70} v_i$$

be the average of these vectors, and let

$$w_i = v_i - \bar{v}, \qquad i = 1, \dots, 70.$$

"Plotting" the points $w_i$ in the space $\mathbb{R}^{18300}$ (we are taking $N = 122 \times 150 = 18300$) would give a cloud shaped roughly like a hyper-ellipsoid in dimension 18300. Using the same techniques as in the application to quadratics in the plane, one can find the right axes to bring the ellipsoid to its standard form.
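As a rough sketch of this step, the average and the recentred points can be computed as follows. It assumes that the points being "plotted" are the differences between each image vector and the average; the array `images` is random placeholder data, not the 70 photographs, and the variable names are hypothetical.

```python
import numpy as np

# Placeholder data: 70 image vectors in dimension 18300, one per row.
rng = np.random.default_rng(0)
images = rng.random((70, 18300))

# Average of the 70 vectors (a single vector of length 18300).
mean_face = images.mean(axis=0)

# Subtract the average from every image vector (assumed centring step).
centred = images - mean_face

# After centring, the points average to the zero vector.
print(np.allclose(centred.mean(axis=0), 0.0))  # True
```

Centring moves the cloud of points so that it sits around the origin, which is where the axes of the ellipsoid are then sought.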

Each axis vector is an "eigenface"! The first 20 eigenfaces, ordered by eigenvalue, are:

In other words, every one of the above 70 images can be written as a linear combination of all of the eigenfaces, and a very good approximation to each can be obtained by just using the first 20.

Now, given a (normalized) image, is it an image of the face we are trying to recognize? To answer that question, we need to recall some facts:

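The component of a vector $u$ in the direction of a unit vector $v$ is given by

$$u \cdot v$$

and since

$$\| u - (u \cdot v)\, v \| \le \| u - a\, v \|$$

is true for any real number $a$, the best approximation to $u$ amongst all vectors of the form $av$ is given by the vector

$$(u \cdot v)\, v.$$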
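A small numerical check can make this fact concrete. The vectors below are arbitrary illustrative choices, and the helper name `component` is hypothetical; the sketch simply compares the error of the projection with the error of other multiples of the unit vector.

```python
import numpy as np

def component(u, v):
    """Component of u in the direction of the unit vector v."""
    return float(np.dot(u, v))

u = np.array([3.0, 4.0, 0.0])
v = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)   # a unit vector

c = component(u, v)
best = c * v   # the multiple of v claimed to be closest to u

# Try 41 evenly spaced multiples a*v with a ranging around c.
candidates = np.linspace(c - 2.0, c + 2.0, 41)
errors = [np.linalg.norm(u - a * v) for a in candidates]

# The smallest error occurs at the middle candidate, which is a = c.
print(int(np.argmin(errors)))  # 20
```

The reason is that the squared error equals a constant plus the square of the difference between the multiplier and the component, so it is smallest exactly when the two agree.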

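If $e_1, e_2, \dots, e_k$ are the first $k$ eigenfaces (taken to be unit vectors), set

$$E_k = \operatorname{span}\{e_1, e_2, \dots, e_k\},$$

then the closest vector in $E_k$ to a vector $u$ is given by:

$$\sum_{i=1}^{k} (u \cdot e_i)\, e_i$$

(this vector is called the "reconstruction" of $u$ in $E_k$).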
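The reconstruction can be sketched as below, assuming the eigenfaces are stored as orthonormal columns of a matrix. The function name `reconstruct` is hypothetical, and the orthonormal columns here come from a QR factorization of random numbers, purely as a stand-in for real eigenfaces.

```python
import numpy as np

def reconstruct(u, eigenfaces):
    """Closest vector to u in the span of the orthonormal columns of `eigenfaces`."""
    coeffs = eigenfaces.T @ u      # the k components of u along each column
    return eigenfaces @ coeffs     # sum of (component) * (column)

rng = np.random.default_rng(0)

# Stand-in for k = 3 eigenfaces in dimension 10: the reduced QR
# factorization returns a 10 x 3 matrix with orthonormal columns.
E, _ = np.linalg.qr(rng.standard_normal((10, 3)))
u = rng.standard_normal(10)

u_hat = reconstruct(u, E)
residual = u - u_hat

# What is left over is orthogonal to every basis vector.
print(np.allclose(E.T @ residual, 0.0))  # True
```

Only the k coefficients in `coeffs` are needed to rebuild `u_hat`, and the length of `residual` measures how far the vector lies from the space spanned by the chosen eigenfaces.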

In the following diagram, image n is

A significant reduction has been accomplished: from 18300 to fewer than 40!

The next question is: how can we plot these points in dimension 18300 and find the right axes? Of course, we cannot literally plot them, so we need another approach. The following steps give an algebraic way to find these "right axes" using the theory of matrix diagonalization.

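1. Let $A$ be the following $70 \times 18300$ matrix, whose $i$-th row $w_i^T$ is the $i$-th image vector minus the average of the 70 image vectors:

$$A = \begin{pmatrix} w_1^T \\ w_2^T \\ \vdots \\ w_{70}^T \end{pmatrix}$$

2. Form the two matrices $A^T A$ ($18300 \times 18300$) and $A A^T$ ($70 \times 70$). Both matrices are symmetric, so in particular they are diagonalizable.

3. Diagonalize the smaller matrix $A A^T$:

$$A A^T = V \Lambda V^{-1}$$

where $V$ is an invertible $70 \times 70$ matrix whose columns are eigenvectors of $A A^T$ and $\Lambda$ is diagonal.

4. Write $\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_{70})$ with

$$\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_{70} \ge 0.$$

5. The eigenfaces are the columns of the matrix:

$$A^T V$$

Indeed, if $A A^T x = \lambda x$, then $A^T A\,(A^T x) = \lambda\,(A^T x)$, so each column of $A^T V$ with $\lambda \ne 0$ is an eigenvector of the large matrix $A^T A$; it can then be rescaled to unit length.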
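The steps above can be sketched with NumPy as follows. This is a minimal illustration under several assumptions rather than the authors' implementation: the rows of the matrix are taken to be the image vectors minus their average, `numpy.linalg.eigh` is used for the symmetric 70x70 matrix, each eigenface is rescaled to unit length, and the function name and the random data are hypothetical. A smaller dimension than 18300 is used so the example runs quickly.

```python
import numpy as np

def eigenfaces_from_images(images, k):
    """Return the average image, the top-k eigenfaces (as columns) and their eigenvalues.

    images: array of shape (m, N), one image vector per row.
    k should not exceed the number of nonzero eigenvalues.
    """
    mean_face = images.mean(axis=0)
    A = images - mean_face                  # step 1: m x N matrix of centred rows
    small = A @ A.T                         # step 2: the m x m matrix A A^T
    eigvals, V = np.linalg.eigh(small)      # step 3: eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]   # step 4: keep the k largest, largest first
    eigvals, V = eigvals[order], V[:, order]
    faces = A.T @ V                         # step 5: columns are eigenvectors of A^T A
    faces /= np.linalg.norm(faces, axis=0)  # rescale each column to unit length
    return mean_face, faces, eigvals

# Placeholder data: 70 "images" of 500 pixels each (real ones would have 18300).
rng = np.random.default_rng(0)
images = rng.random((70, 500))

mean_face, faces, eigvals = eigenfaces_from_images(images, k=20)
print(faces.shape)  # (500, 20)
```

The point of working with the 70x70 matrix is cost: its eigenvectors are cheap to compute, and multiplying by the transpose of the data matrix turns them into eigenvectors of the much larger matrix without ever forming it.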

Reference

"Two and Three Dimensional Patterns of the Face", by Peter L. Hallinan, Gaile G. Gordon, A. L. Yuille, Peter Giblin, David Mumford.

Publisher: A. K. Peters, Natick, Mass.