Summary: Camera for image ‘search’
Summarized by Lav Varshney
How can we augment the camera to best support 'image search'?
Brute force won’t work
Feedback
Active sensing
Active scenes
Multimodal
Social
Other comments on camera for image search
- exemplar
- text
- sketch (Tom mentions ImageScape: http://skynet.liacs.nl/imagescape/)
- histogram (a small matching sketch appears after this list)
- features, relative shape
Overall, there are several different methods, much like the multiple approaches psychologists suggest for human category learning (see, e.g., F. Gregory Ashby and W. Todd Maddox, “Human Category Learning,” Annual Review of Psychology, vol. 56, pp. 149–178, Feb. 2005).
- nonlinear optics
- cancer from breath
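A minimal sketch of the histogram query mode from the list above, assuming images are already loaded as H x W x 3 NumPy arrays with values in [0, 255]; the bin count, the intersection similarity, and all function names here are illustrative choices, not anything specified in class.

    # Hypothetical sketch: rank database images by how well their color histograms match a query.
    import numpy as np

    def color_histogram(img, bins=16):
        """Concatenate normalized per-channel histograms into one feature vector."""
        feats = [np.histogram(img[..., c], bins=bins, range=(0, 256))[0] for c in range(3)]
        v = np.concatenate(feats).astype(float)
        return v / (v.sum() + 1e-9)

    def histogram_intersection(a, b):
        # Similarity in [0, 1]; 1 means identical normalized histograms.
        return np.minimum(a, b).sum()

    def search(query_img, database_imgs, top_k=5):
        q = color_histogram(query_img)
        scores = [histogram_intersection(q, color_histogram(im)) for im in database_imgs]
        return np.argsort(scores)[::-1][:top_k]  # indices of best matches, most similar first

The same skeleton works for the exemplar mode by swapping the histogram for any other global feature vector.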
Other big problems
Quinn: sometimes a picture does not have the correct lighting or pose for the desired composition. Object-based rather than pixel-based editing would help with this.
Tilke: this is actually a problem in image search, since the goal is to find an ε-perturbation of the image.
Ramesh: Bill Freeman has work on generalized viewing
Fredo: Not just a single query, but organization of photos, browsing, etc.
- navigating images, topology of intuitive space
Group photos in public spaces: smile and the picture is automatically mailed to you.
Sylvain: in surveillance, a smiling picture is sent to you, while a crime photo is sent to the police.
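A hedged sketch of the smile-triggered capture idea, using OpenCV's stock Haar cascades; the camera index, the detection thresholds, and the stubbed-out delivery step are assumptions for illustration, not a description of any deployed system.

    # Hypothetical sketch: save a frame when a smile is detected; mailing it to the
    # subject (and all consent/identification issues) is left out.
    import cv2

    face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    smile_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_smile.xml")

    def capture_on_smile(camera_index=0, out_path="smile.jpg"):
        cap = cv2.VideoCapture(camera_index)
        try:
            while True:
                ok, frame = cap.read()
                if not ok:
                    break
                gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
                for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
                    roi = gray[y:y + h, x:x + w]
                    # A high minNeighbors value keeps smile false positives down.
                    if len(smile_cascade.detectMultiScale(roi, 1.7, 20)) > 0:
                        cv2.imwrite(out_path, frame)
                        return out_path  # here the photo would be sent to the subject
        finally:
            cap.release()
        return None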
What are your questions about camera/technology/society?
Ramesh: pn junctions and diodes: what are they, and how can they be used in fancy electronics?
Other courses
Eugene: Art & Photography course is very artsy
ε-photography
The name comes by analogy with ε-geometry, a branch of computational geometry. The intent is robustness of the pixel-value estimate with respect to changes in exposure time, etc.
A basic result there is that Voronoi partitions are very sensitive to small perturbations of the data. In Bill Freeman’s generalized viewing, e.g. of the Leaning Tower of Pisa, small perturbations also cause significant changes to the scene.
Ramesh and Fredo disagreed as to whether the analogy is valid, or whether the sense of ε in ε-photography is really about continuity (small changes causing only small changes) rather than about large changes caused by small perturbations.
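One way to make the "robust to exposure changes" reading concrete, as a hypothetical sketch only: combine an exposure bracket by normalizing each frame by its exposure time and taking a per-pixel median, so an ε change in any single exposure perturbs the estimate only slightly. The linear-response assumption and the saturation threshold are illustrative, not part of the class definition.

    import numpy as np

    def radiance_estimate(frames, exposure_times, saturation=0.98):
        """frames: list of float images in [0, 1]; exposure_times: seconds per frame."""
        stack = []
        for img, t in zip(frames, exposure_times):
            est = img / t                      # radiance up to scale, assuming a linear sensor
            est[img >= saturation] = np.nan    # ignore clipped pixels
            stack.append(est)
        return np.nanmedian(np.stack(stack), axis=0)  # median resists one perturbed exposure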
What aspects of the human eye are critical or useless?
Tilke: focus
Tom: illusions, afterimages
Lav: feedback with brain, a la “what the frog's eye tells the frog's brain” but more so “what the brain tells the eye.”
More than stereo, multispectral, etc.
Cyrus: illumination – some animals produce light so as not to cast a shadow on the prey being pursued
Liquid Lens
Fredo: Philips, etc. also work with liquid lenses.
“Origami Lens”
½ cm thick, but the same quality as a normal 35 mm camera with good, thick optics.
Like a telescope, it has a shallow depth of field.
Coded photography
Quinn (mid-level cues): show different polarizations through optical illusions
Bill Freeman has motion without movement, adding optical flow to represent motion.
Windows DreamScene (http://en.wikipedia.org/wiki/Windows_DreamScene) allows, e.g., wavy reflections from a gentle breeze, based on prior statistics.
Fredo: multiperspective photography
Fredo/Eugene: Beyond the flash-synchro limit, rolling shutter leads to different parts of the image being exposed at different times.
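A small sketch of the rolling-shutter effect just described, assuming a short stack of frames covering the readout interval is available as a NumPy array; the linear row-to-time mapping is an assumption for illustration.

    import numpy as np

    def rolling_shutter(frames):
        """frames: array of shape (T, H, W, C); returns one image whose rows come from different times."""
        T, H = frames.shape[0], frames.shape[1]
        out = np.empty_like(frames[0])
        for row in range(H):
            t = int(row * (T - 1) / max(H - 1, 1))  # later rows are read out at later instants
            out[row] = frames[t, row]
        return out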
Ramesh: Camera obscura (room), “in camera” legalese
Essence Photography
Perceptual prosthetics, combinations of physics, algorithms, etc.
Matt: people like that photos are “honest.” Essence photography is inherently biased.
Encoding for several different fidelity criteria simultaneously
Visual Social Computing
A flea market supports buying and selling, but eBay allows quick information dispersion.
Social computing began before electronic computers: parallel processing was implemented when computers were human (see, e.g., David Alan Grier, When Computers Were Human, Princeton University Press, 2005). Similar parallel processing for visual tasks is visual social computing.
Doug: hate has been promoted by text messages in Kenya; similar in the Philippines.
Ramesh: clearly one can create distrust, but is there a way to create trust?