Picard Research in Video and Image Libraries: Browsing, Retrieval, Annotation

The average person with a computer will soon have access to the world's collections of digital video and images. However, unlike text, which can be alphabetized, or numbers, which can be ordered, images and video have no general language to aid in their organization. Although tools that can ``see'' and ``understand'' the content of imagery are still in their infancy, they can now provide substantial assistance to users navigating through visual media.

This research is application-oriented: it couples the results of a society of models with new tools that allow computers to help people browse, annotate, and retrieve digital images and video.

A new learning system, FourEyes, has been developed and equipped with a society of vision texture models to annotate part of a database of vacation photos. The user labels portions of some of the images in the database as "building" and "street." The system infers which models best automate this labeling process and then uses those models to label the rest of the database. Once the database is labeled, the Photobook system can retrieve images based not just on content but also on feedback learned from the user's input about content.
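The select-then-label loop described above can be illustrated with a small toy sketch. This is not the FourEyes algorithm: the two "models" (`mean_brightness`, `contrast`), the flat-list patches, the nearest-neighbor labeling, and the leave-one-out scoring are all hypothetical stand-ins chosen only to show the idea of picking whichever model best reproduces the user's labels and then applying it to unlabeled data.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Patch = Sequence[float]            # toy stand-in for an image region (assumption)
Model = Callable[[Patch], float]   # hypothetical model: patch -> one feature value
Example = Tuple[Patch, str]        # a user-labeled patch


def nearest_label(model: Model, patch: Patch, examples: List[Example]) -> str:
    """Return the label of the example closest to `patch` in the model's feature space."""
    value = model(patch)
    _, label = min(examples, key=lambda ex: abs(model(ex[0]) - value))
    return label


def leave_one_out_accuracy(model: Model, examples: List[Example]) -> float:
    """Fraction of user labels the model reproduces when each example is held out."""
    hits = 0
    for i, (patch, label) in enumerate(examples):
        rest = examples[:i] + examples[i + 1:]
        hits += nearest_label(model, patch, rest) == label
    return hits / len(examples)


def select_and_label(models: Dict[str, Model],
                     examples: List[Example],
                     unlabeled: List[Patch]) -> Tuple[str, List[str]]:
    """Pick the model that best matches the user's labels, then label the rest with it."""
    best = max(models, key=lambda name: leave_one_out_accuracy(models[name], examples))
    return best, [nearest_label(models[best], p, examples) for p in unlabeled]


# Hypothetical toy "society" of two models.
models: Dict[str, Model] = {
    "mean_brightness": lambda p: sum(p) / len(p),
    "contrast": lambda p: max(p) - min(p),
}

# User-labeled patches (made-up values for illustration only).
examples: List[Example] = [
    ([1.0, 0.0, 1.0, 0.0], "building"),
    ([0.75, 0.0, 0.75, 0.0], "building"),
    ([0.5, 0.5, 0.5, 0.5], "street"),
    ([0.25, 0.5, 0.5, 0.25], "street"),
]
unlabeled: List[Patch] = [[1.0, 0.0, 0.75, 0.0], [0.5, 0.5, 0.25, 0.5]]

best_model, labels = select_and_label(models, examples, unlabeled)
print(best_model, labels)
```

In this toy data the two labels differ in contrast but not in mean brightness, so the contrast model is the one selected. A real system would combine many models and work on actual image regions; the sketch only shows the selection step in its simplest form.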

Selected Publications

To browse or download these documents, visit our old tech-reports page.

Research overview:

``A Society of Models for Video and Image Libraries,'' R. W. Picard, IBM Systems Journal, MIT Media Lab Special Issue, Vol. 35, Nos. 3 & 4, 1996, pp. 292-312, TR #360.

Challenges in this area for image processing and computer vision:

``Digital Libraries: Meeting Place for High-level and Low-level Vision,'' R. W. Picard, Invited paper, Proc. ACCV, Singapore, Dec 1995, TR #354.

``Light-years from Lena: Image and Video Libraries of the Future,'' R. W. Picard, Invited paper, Proc. ICIP, Washington DC, Oct 1995, TR #339.

``Content Access for Image/Video Coding: `The Fourth Criterion,''' R. W. Picard, Invited paper for MPEG conference, doc MPEG95/127, Lausanne, Mar 1995, TR #295.

Learning user preferences and "perceptual common sense":

``Modeling user subjectivity in image libraries,'' R. W. Picard, T. P. Minka, and M. Szummer, Invited paper, Proc. ICIP, Lausanne, Sep 1996, TR #382.

``Toward a Visual Thesaurus,'' R. W. Picard, Invited paper, Proc. MIRO Research Festival, Glasgow, Scotland, Sep 1995, TR #358.

FourEyes system for retrieval, annotation, and segmentation:

``Interactive Learning using a Society of Models,'' T. P. Minka and R. W. Picard, Pattern Recognition, Vol. 30, pp. 565, 1997, TR #349.

``Vision Texture for Annotation,'' R. W. Picard and T. P. Minka, ACM/Springer Journal of Multimedia Systems, Vol. 3, pp. 3--14, TR #302.

Photobook system for browsing and retrieval:

``Photobook: Content-based Manipulation of Image Databases,'' A. P. Pentland, R. W. Picard, and S. Sclaroff, Int. Journal of Computer Vision, Vol. 18, No. 3, pp. 233--254, 1996.
