Henry Lieberman, A User Interface for Knowledge Acquisition from Video, Conference of the American Association for Artificial Intelligence, Seattle, August 1994.
In conventional knowledge acquisition, a domain expert interacts with a knowledge engineer, who interviews the expert, and codes knowledge about the domain objects and procedures in a rule-based language, or other textual representation language. This indirect methodology can be tedious and error-prone, since the domain expert's verbal descriptions can be inaccurate or incomplete, and the knowledge engineer may not correctly interpret the expert's intent.
We describe a user interface that allows a domain expert who is not a programmer to construct representations of objects and procedures directly from a video of a human performing an example procedure. The domain expert need not be fluent in the underlying representation language, since all interaction is through direct manipulation. Starting from digitized video, the user selects significant frames that illustrate before- and after- states of important operations. Then the user graphically annotates the contents of each selected frame, selecting portions of the image to represent each part, labeling the parts, and indicating part/whole relationships. Finally, programming by demonstration techniques describe the actions that represent the transition between frames. The result is object descriptions for each object in the domain, generalized procedural descriptions, and visual and natural language documentation of the procedure. We illustrate the system in the domain of documentation of operational and maintenance procedures for electrical devices.
Henry Lieberman, Mondrian: A Teachable Graphical Editor, in Watch What I Do: Programming by Demonstration, Allen Cypher, ed., MIT Press, 1993.
Mondrian is an object-oriented graphical editor that can learn new graphical procedures through programming by example. A user can demonstrate a sequence of graphical editing commands on a concrete example to illustrate how the new procedure should work. An interface agent records the steps of the procedure in a symbolic form, using machine learning techniques, tracking relationships between graphical objects and dependencies among the interface operations. The agent generalizes a program that can then be used on "analogous" examples. The generalization heuristics set it apart from conventional "macros" that can only repeat an exact sequence of steps. The system represents all operations using pictorial "storyboards" of examples. By bringing the power of procedural programming to easy-to-use graphical interfaces, we hope to break down the "Berlin Wall" that currently exists between computer users and computer programmers.
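The contrast between a conventional macro and a generalized procedure can be illustrated with a small sketch. It is not Mondrian's implementation or its actual heuristics; the `Rect` type and both functions are hypothetical names chosen for illustration. The macro replays the literal coordinates seen in the demonstration, while the generalized step replays the relationship between the two objects.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Rect:
    """Hypothetical graphical object: top-left corner plus size."""
    x: float
    y: float
    width: float
    height: float


def literal_macro(rect: Rect) -> Rect:
    """Replay the exact recorded step: 'move the object to (120, 40)'."""
    return replace(rect, x=120, y=40)


def generalized_step(anchor: Rect, rect: Rect) -> Rect:
    """Replay the step as a relationship (an assumed generalization):
    'place the object against the right edge of the anchor, tops aligned'."""
    return replace(rect, x=anchor.x + anchor.width, y=anchor.y)


# The demonstration example: both versions agree here.
a = Rect(20, 40, 100, 50)
b = Rect(300, 200, 30, 30)
demo_macro = literal_macro(b)
demo_general = generalized_step(a, b)

# An analogous example: only the generalized step still abuts the anchor.
a2 = Rect(0, 10, 60, 80)
b2 = Rect(5, 5, 30, 30)
new_macro = literal_macro(b2)
new_general = generalized_step(a2, b2)
```

On the demonstration example the two versions produce the same result; on the second example the macro still moves the object to (120, 40), whereas the generalized step places it at the anchor's right edge.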
This paper explores how graphical annotation can be used as a visual language for specifying interpretations of user actions in an environment for programming by example [or "by demonstration"]. Attaching text labels to graphical elements is a natural visual notation that appears in many kinds of hand drawn diagrams, such as those appearing in user manuals, to indicate part-whole relationships.
Programming by example systems have the problem that each user action, such as clicking on a graphical object, is potentially ambiguous. When the domain of the system is extended, the user must communicate to the system what the intended interpretation of newly introduced objects should be.
Past solutions are either to encode a fixed mapping between user interface actions and the problem domain, or to allow the user to supply such knowledge by programming rules and frames in a textual language. In this paper, I present a technique for allowing an end user to add graphical annotation to a drawing to indicate the meaning of its parts. When a labeled part subsequently appears as an argument to the operation, it is recorded in terms of its part description.
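As a rough sketch of what recording an argument "in terms of its part description" might look like, the following assumes that annotations form a tree of labeled parts. The `Part` class, the `describe` function, and the labels are hypothetical and are not taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Part:
    """Hypothetical annotated region of a drawing, with labeled sub-parts."""
    label: str
    children: list[Part] = field(default_factory=list)


def describe(root: Part, target: Part, path: tuple[str, ...] = ()) -> tuple[str, ...] | None:
    """Return the chain of labels from root down to target, or None if absent."""
    here = path + (root.label,)
    if root is target:  # identity, not equality: two parts may share a label
        return here
    for child in root.children:
        found = describe(child, target, here)
        if found is not None:
            return found
    return None


# Illustrative annotations (labels are made up for this sketch).
switch = Part("power switch")
panel = Part("front panel", [switch, Part("display")])
device = Part("device", [panel])

# The user clicks the switch; the step is recorded by description,
# not by the pixel position of the click.
recorded_step = ("press", describe(device, switch))
```

Because the recorded step names the part by its place in the part-whole hierarchy, it can in principle be replayed on another drawing that carries the same labels in a different layout.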
Henry Lieberman, Making Programming Accessible to Visual Problem Solvers in Watch What I Do: Programming by Demonstration, Allen Cypher, ed., MIT Press, 1993.
Graphic designers, architects, multimedia designers, animators and users of CAD/CAM systems are experts in visual problem solving. To date, computer-based graphic editing tools provide these people with the means to display and edit visual representations of their designs, but provide little explicit support for the problem-solving processes that visual designers perform. A study of how knowledge is actually communicated in visual design domains shows that visual designers rely heavily on the generation and critique of visual examples. Programming by demonstration is a technique that permits the construction of programs directly from the presentation of graphical examples, together with some explanation of them. This paper presents a scenario of how a particular demonstrational technique, graphical annotation, can be used to convey the structure of a design so that a system can perform layout tasks automatically.
Practically since graphic displays were first hooked to computers, the idea of representing computer programs by pictures has attracted researchers. However, to date, most proposals for visual programming languages have adhered to a set pattern: fixed pictures symbolizing program components, connected by lines or arrows symbolizing relationships between the program components. This "icons on strings" approach, while it can be useful, is not the only way of visualizing programs.
In this paper, I explore one alternative: representing a program through visual examples of the state of its execution. I present two related techniques: dominoes, which replace the traditional icons as representations of operations; and storyboards, which replace iconic circuitry as the representation of program code. These have been implemented in Mondrian, a graphic editor extensible through programming by example.
A problem in applying artificial intelligence techniques to visual design domains is that much of the knowledge possessed by experts is best expressible in terms of visual examples. The traditional expert systems methodology requires this knowledge to be communicated from a design expert to a knowledge engineer, who then translates this knowledge into rules and other textual descriptions. This process is awkward and error-prone.
An alternative is to capture design knowledge more directly through an interactive graphical interface, by having the design expert manipulate concrete design examples in a graphical editor. The editor is equipped with an interface agent that records the user's actions and produces a generalized description of the procedure. The design procedure thus learned can subsequently be applied to examples that are similar, but not identical, to those on which the system was originally taught.
This approach is illustrated in this paper by a description of the graphical editor Mondrian, which uses programming by example to capture interface actions that represent an expert's problem solving behavior. The paper presents an example in a desktop publishing domain, where the system is taught a procedure for rearranging a layout of newspaper articles.
The home page for Watch What I Do: Programming by Demonstration, Allen Cypher, ed., MIT Press, 1993.