Second Life

AutoMesh

AutoMesh is a product that combines statistical, projective and lighting algorithms, trained on large databases, into an automatic and accurate creator of 3D heads from 2D images. AutoMesh contains algorithms trained on a database of 2D images covering a cross-section of human heads under a wide range of rotations and lighting conditions. These algorithms allow AutoMesh to accurately locate all anatomically relevant parts of the human head in a 2D image. AutoMesh also contains algorithms trained on carefully constructed 3D laser scans covering a cross-section of the human population with all relevant facial expressions. These algorithms are combined with perspective projection and real-world lighting models so that AutoMesh can provide an optimum 3D parametric representation of the person in the 2D input image.
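To make the perspective projection step mentioned above concrete, here is a minimal sketch of how a 3D head vertex maps to a 2D pixel. It assumes a simple pinhole camera and a head rotated only about the vertical axis; the function names and the parameters (focal_px, cx, cy, distance) are illustrative assumptions and are not taken from AutoMesh itself.

```python
import math
from typing import Tuple

Point3 = Tuple[float, float, float]
Point2 = Tuple[float, float]


def rotate_yaw(point: Point3, yaw_rad: float) -> Point3:
    """Rotate a point about the vertical (y) axis by yaw_rad radians."""
    x, y, z = point
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return (c * x + s * z, y, -s * x + c * z)


def project_perspective(point: Point3, focal_px: float, cx: float, cy: float) -> Point2:
    """Pinhole projection: divide by depth, scale by focal length, shift to image centre."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (focal_px * x / z + cx, focal_px * y / z + cy)


def project_head_vertex(vertex: Point3, yaw_rad: float, distance: float,
                        focal_px: float, cx: float, cy: float) -> Point2:
    """Rotate a head-centred vertex, push it `distance` along the optical axis, then project."""
    x, y, z = rotate_yaw(vertex, yaw_rad)
    return project_perspective((x, y, z + distance), focal_px, cx, cy)


if __name__ == "__main__":
    # Hypothetical vertex 5 cm to the side of the head centre, head 1 m from the camera.
    print(project_head_vertex((0.05, 0.0, 0.0), 0.0, 1.0, 800.0, 320.0, 240.0))
```

Fitting a head to an image amounts to searching for the shape, pose and camera parameters whose projection best matches what is seen in the picture; the sketch covers only the forward projection.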

Thanks to its large 2D and 3D training databases, AutoMesh can represent the enormous variation in the shape and albedo of the human head with relatively few parameters and with great accuracy. In addition, the parametric representation can be separated into meaningful subsets of parameters, each with its own use. The subsets represent age, facial hair, expression and identity. Applying AutoMesh to a 2D image therefore yields a fast, accurate 3D reconstruction of that person's head, plus a set of parameters that represent the person's age, facial hair, expression and identity. The resulting 3D head can be manipulated and used in many ways. Expression can be removed or added, age can be normalized or adjusted, and facial hair can be detected, removed or added. Beyond these utilities, the 3D mesh can be used to compose a neutral expression with consistent lighting for 2D face recognition (FR), or to create 3D heads from a legacy database of 2D images for 3D FR. It can be used to animate facial expressions and audio automatically. It can be used to estimate the appearance of a person at different ages, or what a person would look like if they shaved off their beard or grew one. The range of applications is wide. What's more, AutoMesh is completely automatic: it requires no user input and can therefore run unattended on a server, creating accurate 3D heads from 2D images in seconds.
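The idea of a parametric head split into subsets can be sketched in a few lines. The example assumes a simple linear model (a mean shape plus weighted basis vectors), which is a common choice for this kind of representation but is an assumption here, not a statement of how AutoMesh is built; HeadParams, reconstruct, neutral_expression and the toy numbers are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Sequence

Vector = List[float]


@dataclass(frozen=True)
class HeadParams:
    """Hypothetical parameter set, split into the four subsets named in the text."""
    identity: Sequence[float]
    expression: Sequence[float]
    age: Sequence[float]
    facial_hair: Sequence[float]


def reconstruct(mean: Vector, bases: Dict[str, List[Vector]], params: HeadParams) -> Vector:
    """Assumed linear model: shape = mean + sum of (coefficient * basis vector)."""
    shape = list(mean)
    for subset_name, basis_vectors in bases.items():
        coefficients = getattr(params, subset_name)
        for coefficient, basis in zip(coefficients, basis_vectors):
            for i, value in enumerate(basis):
                shape[i] += coefficient * value
    return shape


def neutral_expression(params: HeadParams) -> HeadParams:
    """Remove expression by zeroing only the expression subset."""
    return replace(params, expression=[0.0] * len(params.expression))


# Toy three-number "shape" so the effect of each subset is easy to see.
MEAN: Vector = [0.0, 0.0, 0.0]
BASES: Dict[str, List[Vector]] = {
    "identity": [[1.0, 0.0, 0.0]],
    "expression": [[0.0, 1.0, 0.0]],
    "age": [[0.0, 0.0, 1.0]],
    "facial_hair": [[0.0, 0.0, 0.0]],
}

if __name__ == "__main__":
    fitted = HeadParams(identity=[2.0], expression=[0.5], age=[-1.0], facial_hair=[0.0])
    print(reconstruct(MEAN, BASES, fitted))
    print(reconstruct(MEAN, BASES, neutral_expression(fitted)))
```

Because each subset has its own coefficients, removing an expression or adjusting age changes only that subset and leaves the identity parameters untouched.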

Now let's get a bit more specific: how does AutoMesh work, and why is it better than other 3D software products and hardware acquisition products? The full answer is complicated, so to simplify, let's discuss the pros and cons of hardware systems and of other software systems.

Hardware systems are expensive. They come in many different flavors and are obviously useful in many ways, but they have drawbacks beyond their expense and, in some cases, their physical size.

Laser scanners produce detailed, highly accurate textured point clouds, which can be rearranged into textured 3D meshes. Although each point the laser samples is measured accurately, the scan as a whole can be highly inaccurate because the scanner has to travel around or along the subject. Since there is a reasonably long scan time (often in the region of ten seconds or more), the subject can, and often does, move or twitch during the scan. As a result, points measured at different moments no longer line up with one another; in other words, the data is distorted.

Off-pose image corrected in AutoMesh

Structured light systems don't suffer from the scan time problem, since they usually acquire a 2D image (visible or infra-red) with an exposure time short enough that the subject has no time to move. Unfortunately, these systems have problems of their own (besides the fact that they too can be expensive). The main problem is that they are 2.5D, not true 3D. This simply means that they acquire depth information from a single viewpoint and hence can record only one depth value along any line of sight (e.g. they can't see under the chin). This really only becomes a problem when the subject is not looking down the camera's optical axis: the more the head is rotated, the less head information is captured. What this boils down to is that such systems can only be used in a controlled fashion, i.e. the subject has to be compliant and willing to take instruction. Structured light systems can be made to produce true 3D data by synchronizing more than one system, but two problems generally arise at this juncture. The first is that the projected light sources can interfere with one another, distorting the resulting data and producing noise. The second is the requirement to register the two (or more) sets of 2.5D data to produce 3D data.
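The 2.5D limitation is easy to see in a toy example. The sketch below assumes a simplified orthographic camera looking along the z axis and integer pixel coordinates; depth_map_from_points and the sample points are hypothetical, chosen only to show that a single-viewpoint depth map keeps one depth per pixel.

```python
from typing import Dict, Iterable, Tuple

Pixel = Tuple[int, int]


def depth_map_from_points(points: Iterable[Tuple[int, int, float]],
                          width: int, height: int) -> Dict[Pixel, float]:
    """Keep only the nearest depth for each pixel, as a single-viewpoint sensor would."""
    depth: Dict[Pixel, float] = {}
    for x, y, z in points:
        if 0 <= x < width and 0 <= y < height:
            pixel = (x, y)
            # A farther surface on the same line of sight is hidden and never recorded.
            if pixel not in depth or z < depth[pixel]:
                depth[pixel] = z
    return depth


if __name__ == "__main__":
    # Two surfaces share pixel (1, 1): only the nearer one (z = 5.0) survives.
    surface_points = [(1, 1, 5.0), (1, 1, 7.0), (2, 1, 6.0)]
    print(depth_map_from_points(surface_points, width=4, height=4))
```

Three surface points go in and only two depth values come out; the hidden point is exactly the kind of information (under the chin, behind the jaw line) that a 2.5D system cannot deliver.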

Stereo systems acquire 3D from two cameras. They reconstruct 3D vertices by triangulation from matched sets of points in the two images. Stereo systems are also essentially 2.5D, but they too can be made to produce true 3D as explained above. Stereo systems can acquire their images with a very short exposure time and hence do not suffer from subject movement. Their major drawback is the time taken to process the two (or more) images to produce the 3D data. In many applications this delay is unacceptable, since it can run into minutes rather than seconds.

There are certain commercially available 3D head reconstruction software systems that produce head meshes by projecting a preformed generic mesh of the human head onto a 2D image. Whilst this method can be fast, it forces the depth variable to be consistent with the generic mesh, so the 'z' vertex values are always the same regardless of the input image. To provide more realism, some systems use more than one preformed head mesh. Whilst this can improve the resulting depth information, the technique can never produce accurate depth information. In addition, the 2D image onto which the preformed mesh is projected must show the head looking directly along the camera's optical axis; any head rotation will skew the results and produce unreliable 3D data.
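The depth limitation of the generic mesh approach can be shown with a deliberately tiny sketch. It assumes the simplest possible scheme, where x and y come from landmarks found in the image and z is copied from the template; the three-vertex template, the landmark values and fit_generic_mesh are hypothetical and stand in for no specific commercial system.

```python
from typing import List, Sequence, Tuple

Vertex = Tuple[float, float, float]

# Hypothetical generic template: left eye, right eye, nose tip (x, y, z in metres).
TEMPLATE: List[Vertex] = [(-0.03, 0.03, 0.02), (0.03, 0.03, 0.02), (0.0, 0.0, 0.05)]


def fit_generic_mesh(template: Sequence[Vertex],
                     landmarks_2d: Sequence[Tuple[float, float]]) -> List[Vertex]:
    """Take x, y from the image landmarks and copy z straight from the template."""
    if len(template) != len(landmarks_2d):
        raise ValueError("one 2D landmark is required per template vertex")
    return [(u, v, template_z)
            for (u, v), (_, _, template_z) in zip(landmarks_2d, template)]


if __name__ == "__main__":
    narrow_face = [(-0.028, 0.031), (0.028, 0.031), (0.0, -0.002)]
    wide_face = [(-0.036, 0.029), (0.036, 0.029), (0.001, 0.004)]
    for face in (narrow_face, wide_face):
        print([z for _, _, z in fit_generic_mesh(TEMPLATE, face)])
```

Both faces come back with the same list of depths, because nothing in the input image ever influences z.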

Now we come to the question of why AutoMesh outperforms these other methods of creating 3D heads. (1) AutoMesh works from a single image, so the scan time problem does not exist. (2) AutoMesh parametrically represents the variation in head shape, hair and albedo across the whole human population, so it does not rely on a generic preformed head mesh; it fits the optimum 3D shape, hair, albedo, lighting and camera parameters to the given image. (3) The resulting 3D head is true 3D, not 2.5D. (4) The subject in the input image does not have to look along the camera's optical axis. (5) Expression, hair, pose, lighting and age can be normalized, added or subtracted. (6) It works in seconds. (7) It is automatic, with absolutely no requirement for user input.