Facial recognition


Did you know:
· That babies identify their mother’s face within half an hour of birth?
· That we can instantly (within about 150 ms) recognize over 1000 faces?
· That over half the cortex is involved in visual processing (more than when doing math!)?
In general, images contain information in a very dense and complex form. The machine must
rely on complex (numerical) models of the image in order to understand its
contents. Recognizing a face is about determining who the person is.
Tracking a face is about locating a face in an image or in a video sequence. We
usually remember people based on facial color, their most significant features,
etc. This recognition activity becomes very evident when we recognize twins.
In this application, we explain one way of using our understanding of images to recognize faces.
When it comes to 2D pattern recognition, you need to know what you're looking for.
On the left of the following figure is a tumbling object consisting of a wire-frame pyramid over a planar hexagon; on the right is a similarly tumbling wire-frame cube. From one point of view they appear identical, but no observer of this common view would imagine they were seeing the pyramid wire-frame object.

When recognizing 2D-faces, many factors come into play, like different lighting conditions, expressions and others:

As an example the following 70 images are taken with various values for the angles θ and Φ. Each is centered via the eyes, then masked:
(Figure *)
Each one of these 70 images can be represented by a vector $v$ in the space $\mathbb{R}^N$ of grayscale values, where $N$ is the number of pixels (i.e. unstack the rows).
For example, the following image of Bogart:

can be represented by the vector:
v=(...,0.858824, 0.615686, 0.407843, 0.396078, 0.65098, 1,
0.905882, 0.878431, 0.917647, 0.901961, 0.917647, 0.870588, 0.882353, ...)
in the space $\mathbb{R}^{18300}$ (here $N = 122 \times 150 = 18300$).
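As a minimal sketch of what "unstacking the rows" means, the NumPy snippet below flattens a grayscale image into a single vector. The random array is only a hypothetical stand-in for a real image, and the orientation (150 rows by 122 columns) is an assumption, since the text gives only the product 122x150.

```python
import numpy as np

# Hypothetical stand-in for a grayscale image with values in [0, 1].
# Assumption: 150 rows by 122 columns; the text only states 122x150 = 18300.
rng = np.random.default_rng(0)
image = rng.random((150, 122))

# "Unstack" the rows: lay them end to end in one long vector.
v = image.reshape(-1)

# The flattening is reversible as long as the image shape is remembered.
restored = v.reshape(image.shape)

print(v.shape)  # (18300,)
```

With this row-by-row layout, entry `v[122]` is the first pixel of the second row of the image.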
So let $v_1, v_2, \dots, v_{70}$ be the vectors corresponding to the images in Figure (*) above. Let
$$\bar{v} = \frac{1}{70} \sum_{i=1}^{70} v_i$$
be the average of these vectors, and let
$$w_i = v_i - \bar{v}, \quad i = 1, \dots, 70.$$
“Plotting” the points $w_1, \dots, w_{70}$ in the space $\mathbb{R}^{18300}$ would give a hyper-ellipsoid in dimension 18300. Using the
same techniques as in the application to the quadratics in the plane,
one can find the right axes to bring the ellipsoid to its standard form
$$\frac{x_1^2}{a_1^2} + \frac{x_2^2}{a_2^2} + \cdots + \frac{x_N^2}{a_N^2} = 1.$$
Each axis vector is an “eigenface”! The first 20 eigenfaces,
ordered by eigenvalue, are:

In other words, every one of the above 70 images can be written as a linear combination of all of the eigenfaces, and a very good approximation to each can be obtained by just using the first 20.
Now, given a (normalized) image, is it an image of the face we are
trying to recognize? To answer that question, we need to recall some
facts:
The component of a vector $u$ in the direction of a unit vector $v$ is given by
$$u \cdot v$$
and since
$$\|u - (u \cdot v)\,v\| \le \|u - a\,v\|$$
is true for any real number $a$, the best approximation to $u$ amongst all vectors of the form $av$ is given by the vector
$$(u \cdot v)\,v.$$
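For readers who want to see why this best-approximation fact holds, here is a short derivation. It assumes only that $v$ is a unit vector, so that $v \cdot v = 1$, and completes the square in $a$:

$$\|u - a\,v\|^2 = \|u\|^2 - 2a\,(u \cdot v) + a^2 = \|u\|^2 - (u \cdot v)^2 + \bigl(a - u \cdot v\bigr)^2 .$$

Only the last term depends on $a$, and it is smallest (zero) exactly when $a = u \cdot v$. So among all vectors of the form $av$, the one closest to $u$ is $(u \cdot v)\,v$.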
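If $e_1, e_2, \dots, e_k$ are the first $k$ eigenfaces (mutually orthogonal and normalized to unit length), set
$$E_k = \operatorname{span}\{e_1, e_2, \dots, e_k\};$$
then the closest vector in $E_k$ to a vector $u$ is given by
$$\sum_{i=1}^{k} (u \cdot e_i)\, e_i$$
(this vector is called the “reconstruction” of $u$ in $E_k$).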
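The reconstruction can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the reference's implementation: the function name `reconstruct` is hypothetical, and the columns of `E` are assumed to be orthonormal, standing in for the first k eigenfaces. The toy basis is built with a QR factorization only to obtain orthonormal vectors.

```python
import numpy as np

def reconstruct(u, E):
    """Closest vector to u in the span of the orthonormal columns of E."""
    coeffs = E.T @ u   # components u . e_i, one per basis vector
    return E @ coeffs  # sum over i of (u . e_i) e_i

# Hypothetical toy data: 3 orthonormal vectors in R^6 standing in for
# the first k eigenfaces, and a random vector u to approximate.
rng = np.random.default_rng(1)
E, _ = np.linalg.qr(rng.standard_normal((6, 3)))
u = rng.standard_normal(6)

u_hat = reconstruct(u, E)
residual = u - u_hat                # part of u outside the subspace
error = np.linalg.norm(residual)    # distance from u to the subspace
```

One plausible use, which the text does not spell out, is to compare `error` against a threshold: a small distance suggests that the image is well described by the eigenfaces.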
In the following diagram, image n
is
![]()

A significant reduction has been accomplished: from 18300 numbers to fewer than 40.
The next question is: how can we plot these points in dimension 18300
and find the right axes? Of course, we need another approach to this end. The
following steps give an algebraic way to find these “right axes” using the
theory of matrix diagonalization.
1. Let $A$ be the following $70 \times 18300$ matrix:

2. Form the two matrices $A^TA$ ($18300 \times 18300$) and $AA^T$ ($70 \times 70$). Clearly, both matrices are symmetric. In particular, they are diagonalizable.
3. Diagonalize the matrix $AA^T$ (the smaller of the two):
$$AA^T = V D V^{-1}$$
where $V$ is an invertible $70 \times 70$ matrix whose columns are eigenvectors of $AA^T$ and $D$ is diagonal.
4.
with ![]()
5. The eigenfaces are the columns of the matrix:
$$A^T V$$
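The steps above can be sketched in NumPy as follows. This is a sketch under several assumptions rather than the reference's exact procedure: the rows of $A$ are taken to be the mean-subtracted image vectors, each eigenface is rescaled to unit length, and the names `eigenfaces`, `images` and `k` are hypothetical. It relies on the fact that if $AA^T x = \lambda x$, then $A^TA\,(A^T x) = \lambda\,(A^T x)$, so eigenvectors of the small matrix yield eigenvectors of the large one. The data are small random arrays so the example runs quickly.

```python
import numpy as np

def eigenfaces(images, k):
    """Return (mean, E); the columns of E are the first k eigenfaces.

    images: array of shape (m, N), one flattened image per row.
    Assumption: k is smaller than m, because centering leaves at most
    m - 1 directions with a nonzero eigenvalue.
    """
    mean = images.mean(axis=0)
    A = images - mean                      # m x N matrix of centered images
    small = A @ A.T                        # m x m, instead of N x N
    eigvals, V = np.linalg.eigh(small)     # symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]  # indices of the k largest eigenvalues
    E = A.T @ V[:, order]                  # columns are eigenvectors of A^T A
    E /= np.linalg.norm(E, axis=0)         # rescale each column to unit length
    return mean, E

# Hypothetical toy data: 7 tiny "images" of 60 pixels each.
rng = np.random.default_rng(2)
images = rng.random((7, 60))
mean, E = eigenfaces(images, k=3)
```

For the 70 images of the text, the small matrix is only $70 \times 70$, so the eigenvalue problem stays cheap even though each image has 18300 pixels.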
Reference
“Two and Three Dimensional Patterns of the Face”, by Peter L. Hallinan, Gaile G. Gordon, A. L. Yuille, Peter Giblin, David Mumford. Publisher: A. K. Peters, Natick, Mass.