
Augmenting Reality Using Inherent Visual Features

One of the traditional goals of computer vision is the identification and tracking of objects from their video images. While this is a hard problem in general, constrained situations may be addressed with sufficient processing power. For example, [Uenohara and Kanade, 1994, Baum et al., 1996] have demonstrated tethered systems that can overlay graphics on computer internals for repair or on marked human legs for surgery. This section describes attempts to bring such vision sensing to wearable computing-based augmented realities.

Using face recognition methods developed by [Turk and Pentland, 1991, Pentland et al., 1994], a face can be compared against an 8000-face database in approximately one second on a 50 MHz 486-class wearable computer. Aligning the face in order to perform this search is still costly (on the order of a minute on R4400-based machines). However, if the search can be limited to a particular size and rotation, the alignment step is much more efficient. In the case of wearable computing, the search can be limited to faces that are within conversational distance. In the current implementation, the user further assists the system by centering the head of his conversant on a mark provided by the system. The system can then rapidly compare the face against the images stored in the database. Given the speed of the algorithm, the system can constantly assume a face in the proper position, return the closest match, and withhold labeling until its confidence measure reaches a given threshold. Upon proper recognition, the system can overlay the returned name and useful information about the person as in Figure 11.

 
Figure 11: Upon correct identification by face recognition, information about the conversant can be overlaid on the visual field.
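
The match-then-withhold loop described above can be illustrated with a minimal sketch. It assumes each face has already been aligned and reduced to a short coefficient vector (for instance by projection onto eigenfaces); the function names, the toy database, the confidence formula, and the threshold value are all hypothetical and are not taken from the system described here.

```python
import math


def closest_match(probe, database):
    """Return (name, distance) of the stored face nearest to the probe.

    `probe` and every database value are equal-length coefficient vectors.
    Euclidean distance is an assumption made for this illustration.
    """
    best_name, best_dist = None, math.inf
    for name, coeffs in database.items():
        dist = math.dist(probe, coeffs)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist


def label_or_withhold(probe, database, threshold=0.8):
    """Return the best-matching name, or None if confidence is too low.

    The confidence measure 1 / (1 + distance) is hypothetical; it only
    needs to grow as the match improves.
    """
    name, dist = closest_match(probe, database)
    confidence = 1.0 / (1.0 + dist)
    return name if confidence >= threshold else None


# Toy database of two-dimensional coefficient vectors (made-up values).
faces = {"alice": (0.0, 0.0), "bob": (3.0, 4.0)}

print(label_or_withhold((0.1, 0.0), faces))  # close to "alice": labeled
print(label_or_withhold((1.0, 1.0), faces))  # ambiguous: label withheld
```

Because the closest match is always computed but only reported above the threshold, such a routine could be run on every frame without the wearer having to signal that a face is present.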

Without visual tags, other methods of aligning the real and virtual are necessary. Using the ``pencigraphic'' imaging approach [Mann, 1995], the virtual image of a rigid planar patch [Huang and Netravali, 1984] may be superimposed onto the wearer's real-world visual field, creating the illusion of the virtual image floating in 3D space. Figure 12 shows six frames of video from a processed image sequence. The computer recognizes the cashier and superimposes a previously entered shopping list on her. When the wearer turns his head to the right, not only does the shopping list move to the left on the wearer's screen, but its ``chirp-rate'' and keystoning are manipulated automatically to follow the flowfield of the video imagery coming from the camera. Note that the tracking (initially triggered by automatic face recognition) continues even when the cashier's face is completely outside the camera's visual field, because the tracking is sustained by other objects in the room, such as the counters, walls, row of fluorescent lights, and the video surveillance cameras installed on the ceiling. In this way, the shopping list appears attached to a plane on the cashier. Such techniques demonstrate how computer vision can directly aid the registration problem in augmented reality.

 
Figure 12: The ``pencigraphic'' imaging approach is used to properly register, keystone, and chirp a shopping list in space using the video flowfield.
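
To make the keystoning of a planar overlay concrete, the following sketch warps the corners of a rectangle with an eight-parameter projective coordinate transformation, x' = (A x + b) / (c . x + 1). This is offered only as an illustration of how a rigid planar patch changes shape on screen; the parameter values are invented, the function names are hypothetical, and the estimation of the parameters from the video flowfield is not shown.

```python
def project_point(x, y, A, b, c):
    """Map (x, y) through x' = (A x + b) / (c . x + 1).

    A is a 2x2 matrix (nested tuples), b a translation, and c the
    projective term. With c = (0, 0) the map is purely affine.
    """
    denom = c[0] * x + c[1] * y + 1.0
    if denom == 0.0:
        raise ValueError("point maps to infinity under this transformation")
    xp = (A[0][0] * x + A[0][1] * y + b[0]) / denom
    yp = (A[1][0] * x + A[1][1] * y + b[1]) / denom
    return xp, yp


def warp_overlay(corners, A, b, c):
    """Apply the projective map to every corner of an overlay patch."""
    return [project_point(x, y, A, b, c) for x, y in corners]


# Unit-square overlay (for example, the outline of a shopping list).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
identity = ((1.0, 0.0), (0.0, 1.0))

# Pure translation: the patch slides left without changing shape.
shifted = warp_overlay(square, identity, (-2.0, 0.0), (0.0, 0.0))

# Nonzero projective term: the right-hand edge shrinks toward the
# origin, which is the keystone effect.
keystoned = warp_overlay(square, identity, (0.0, 0.0), (0.5, 0.0))

print(shifted)
print(keystoned)
```

In this sketch the translation alone accounts for the list sliding across the screen as the head turns, while the denominator term is what makes the mapping non-affine and lets the overlay follow the perspective change of the plane it is attached to.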






Thad E Starner
Sat Nov 9 09:44:24 EST 1996