Assignment 3: Gesture interfaces
Once Mondrian starts up, you will see two new command dominoes labeled Teach Gesture and Recognize Gesture. Select Recognize Gesture and then draw a rough square with the mouse: the gesture starts when you press the mouse button, and it continues for as long as you hold the button down. This demonstration system has been trained to recognize three types of gestures: circles, squares, and lines. When the system recognizes one of these gestures, it responds by drawing a cleaned-up version of the gesture, then erasing it after a brief wait. If the system does not recognize the gesture, it prints "gesture unrecognized". Play around with a few gestures to get a feel for the limitations of the system's recognition capabilities. You should realize that the limitations arise not only from the basic recognition algorithm, but also from the way in which the system was trained. (It was trained by Hal in a rather ad hoc manner.)
You can use Teach Gesture to train the system to recognize a new class of gestures. Click on the Teach Gesture domino and draw the first example of the gesture. The system will put up a menu asking you for the name of the gesture class. Keep giving the system examples until you think you have enough. Then use Recognize Gesture to see how well the system distinguishes this new class from squares, circles, and lines. When the system recognizes your new gesture, it will print the name of the gesture. In this simple interface, there is no convenient way for you to specify an action that the system should take in response to recognition, so you'll have to be content with it just printing the name. You can use Teach Gesture to add more examples of your new gesture, and also more examples of squares, circles, and lines, if you like. Be careful, though, because the interface provides no way to retract a bad example--and if examples are poorly chosen, the recognizer will not do a good job.
You can define as many gesture classes as you like, but the more classes you have, and the more similar they are, the more likely the classifier is to become confused and produce ambiguous or incorrect responses.
You can see how this is done by examining the procedures respond-to-square, respond-to-circle, and respond-to-line in the file named Test. These procedures are called with a collection of attributes--geometric information that is computed by the gesture recognizer and used in the classification algorithm. For example, the attributes include the coordinates of the first point and the last point in the gesture. The respond-to-line procedure creates a "cleaned up" gesture by simply joining the first and last points by a straight line. Then it shows the line, waits a second, and erases the line. Respond-to-square is similar, using the maximum and minimum x and y (which are contained among the gesture attributes) to compute an ideal square that matches the gesture size and position.
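A respond-to-... procedure might be structured roughly as follows. The accessor names (gest-attributes-start-x and so on) and the drawing primitives (draw-line, erase) are guesses for illustration, not the actual definitions--consult the file Test for the real code.

```lisp
;; Sketch of a respond-to-line-style procedure.  The accessors and
;; drawing primitives below are illustrative names only; see Test.
(defun respond-to-line (attributes)
  (let ((x1 (gest-attributes-start-x attributes))
        (y1 (gest-attributes-start-y attributes))
        (x2 (gest-attributes-end-x attributes))
        (y2 (gest-attributes-end-y attributes)))
    ;; Join the gesture's first and last points with a straight line.
    (let ((line (draw-line x1 y1 x2 y2)))
      (sleep 1)        ; wait a second
      (erase line))))  ; then erase the cleaned-up line
```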
The complete list of attributes that are computed for each gesture can be found in the definition of the gest-attributes structure, which is defined at the beginning of the file features.
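For orientation only, the structure might be declared along these lines; the slot names here are assumptions based on the attributes mentioned above (the first and last points and the bounding box), so check the file features for the real, complete definition.

```lisp
;; Hypothetical sketch of gest-attributes; the actual slots are
;; defined at the beginning of the file features.
(defstruct gest-attributes
  start-x start-y   ; coordinates of the first point in the gesture
  end-x end-y       ; coordinates of the last point
  min-x max-x       ; minimum and maximum x over the gesture
  min-y max-y)      ; minimum and maximum y over the gesture
```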
Once you have defined your respond-to-... procedure, you associate it with the name of the class by changing the list named *class-responses*. You can setq this list by hand, or edit the procedure install-gesture-commands (found in the file named Gesture-command).
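For example, if *class-responses* is an association list mapping class names to response procedures (an assumption--see install-gesture-commands in Gesture-command for the actual representation), registering a response for a hypothetical arrow class might look like:

```lisp
;; Sketch of registering a response procedure by hand, assuming
;; *class-responses* is an alist of (class-name . procedure) pairs.
(setq *class-responses*
      (cons (cons 'arrow #'respond-to-arrow)
            *class-responses*))
```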
The basic data structure is something called a classifier. This holds information about the gesture classes, and is the thing that "does" the learning and the recognition. Our system uses a single classifier called *gesture-classifier*. You might consider designing a system that has multiple classifiers for different purposes.
To train the classifier, you call the procedure gest-add-example, with the (vector of) points that form the gesture, the name of the class for which this is an example, and the actual classifier. Look at the procedures in Test that accomplish these calls. To recognize a gesture, you can use the procedure gest-classify, or, for a somewhat higher-level interface, the procedure recognize-class. Gest-classify returns (via a multiple-value return) four pieces of information: the name of the class (or nil if the gesture was not recognized), the attributes computed for the gesture, the probability that the gesture was unambiguously recognized, and the distance (in feature space) of the given gesture from the average gesture in the class. These last two correspond to parameters that can be set (as at the beginning of Test) to control the tolerance with which the system is willing to recognize a gesture.
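Putting these calls together, a training and recognition sequence might look like the following sketch. The argument order for gest-add-example follows the description above; the argument order for gest-classify and the variable *points* (a vector of gesture points collected by the tracker) are assumptions, so compare against the procedures in Test.

```lisp
;; Train: add one example of a hypothetical arrow class.
(gest-add-example *points* 'arrow *gesture-classifier*)

;; Recognize: gest-classify returns four values via
;; multiple-value return.
(multiple-value-bind (class attributes probability distance)
    (gest-classify *points* *gesture-classifier*)
  (if class
      (format t "recognized ~a (p=~a, d=~a)~%"
              class probability distance)
      (format t "gesture unrecognized~%")))
```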
The actual points to be passed to the classifier and recognizer are collected by a new kind of mouse tracker called a suit-gesture-tracker. You can see how this is implemented in the file Gesture-Tracker. The tracker basically just collects the gesture points so that they can be passed to the classifier.
In class, you should give a five-minute description of your proposed application.
As a project for the course, you should implement your proposal and demonstrate it in class.