Summary, Camera for image ‘search’


Summarized by Lav Varshney

How can we augment the camera to best support 'image search'?

 

Brute force won’t work

 

Feedback

 

Active sensing

 

Active scenes

 

Multimodal

 

Social

 

Other comments on camera for image search

 

                        - exemplar

                        - text

                        - sketch (Tom mentions ImageScape: http://skynet.liacs.nl/imagescape/)

                        - histogram

                        - features, relative shape

            Overall, several different methods like the ones suggested by psychologists for human category learning (see e.g. F. Gregory Ashby and W. Todd Maddox, “Human Category Learning,” Annual Review of Psychology, vol. 56, pp. 149–178, Feb. 2005.)
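
As a rough illustration of the "histogram" query type listed above, here is a minimal sketch that ranks candidate images by histogram intersection, one common similarity measure. The notes do not say which measure was meant, so this choice is an assumption, and the pixel lists, bin count, and function names are all made up for illustration.

```python
from collections import Counter

def histogram(pixels, bins=4, max_value=256):
    """Normalized histogram of integer intensities in [0, max_value)."""
    counts = Counter(p * bins // max_value for p in pixels)
    n = len(pixels)
    return [counts.get(b, 0) / n for b in range(bins)]

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Hypothetical tiny "images" given as flat lists of gray levels.
query = [10, 20, 200, 210]      # half dark, half bright
dark = [5, 15, 25, 35]          # all dark
similar = [12, 30, 200, 220]    # half dark, half bright

hq, hd, hs = histogram(query), histogram(dark), histogram(similar)
score_dark = intersection(hq, hd)
score_similar = intersection(hq, hs)
```

With these made-up values the "similar" image scores higher than the "dark" one. A histogram query ignores spatial layout entirely, which is presumably why the list also includes sketch and relative-shape queries.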

 

                        - nonlinear optics

                        - cancer from breath

 

 

 

Other big problems

 

Quinn: sometimes a picture does not have the correct lighting or pose for composition.  Object-based rather than pixel-based editing would help with this.

 

Tilke: this is actually a problem in image search, since the goal is to find an ε-perturbation of the image

 

Ramesh: Bill Freeman has work on generalized viewing

 

Fredo: Not just a single query, but organization of photos, browsing, etc.

            - navigating images, topology of intuitive space

 

 

 

Group photos in public spaces: smile and the picture is sent to you in the mail automatically.

Sylvain: surveillance, smiling picture sent to you; crime photo sent to police

 

What are your questions about camera/technology/society?

Ramesh: pn junctions and diodes: what are they, or how can they be used in fancy electronics?

 

Other courses

Eugene: Art & Photography course is very artsy

 

ε-photography

The name comes from an analogy with ε-geometry, a branch of computational geometry.  The intent is to make the estimate of a pixel value robust to changes in exposure time, etc.

 

A basic result there is that Voronoi partitions are very sensitive to small perturbations in the data.  In Bill Freeman’s generalized viewing, e.g. of the leaning tower of Pisa, small perturbations also cause significant changes to the scene.
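
The Voronoi sensitivity mentioned above can be sketched in a few lines. This is only an illustration under assumed numbers: the site coordinates, the query point, and the helper `nearest_site` are hypothetical, and it shows just one sense of sensitivity, namely that a point near a cell boundary changes cells when a site moves by a small ε.

```python
import math

def nearest_site(point, sites):
    """Return the index of the site closest to `point` (its Voronoi cell)."""
    return min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))

# Two sites; their shared cell boundary is the line x = 0.
sites = [(-1.0, 0.0), (1.0, 0.0)]
# A query point just to the right of that boundary.
query = (0.001, 0.0)

before = nearest_site(query, sites)

# Move the first site by a small epsilon; the boundary shifts to x = 0.005.
eps = 0.01
perturbed_sites = [(-1.0 + eps, 0.0), (1.0, 0.0)]

after = nearest_site(query, perturbed_sites)
```

Here `before` and `after` differ: a 1% shift of one site flips the cell the query point belongs to. That is a discrete jump caused by a small change, as opposed to the continuity reading of ε.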

 

Ramesh and Fredo disagreed as to whether the analogy is valid or whether the sense of ε in ε-photography is actually related to continuity rather than large changes caused by small changes.

What aspects of the human eye are critical/useless?

Tilke: focus

Tom: illusions, afterimages

Lav: feedback with brain, a la “what the frog's eye tells the frog's brain” but more so “what the brain tells the eye.”

More than stereo, multispectral, etc.

 

Cyrus: illumination – an animal produces light so as not to cast shadow on prey being pursued

 

Liquid Lens

Fredo: Philips, etc. also work with liquid lenses.

 

“Origami Lens”

½ cm in thickness, but same quality as a normal 35mm camera with good thick optics.

Like a telescope, shallow depth of field

 

 

Coded photography

Quinn: (midlevel cues): show different polarizations through optical illusions

 

Bill Freeman has motion without movement, adding optical flow to represent motion.

 

Windows DreamScene (http://en.wikipedia.org/wiki/Windows_DreamScene) allows a gentle breeze of wavy reflections based on prior statistics

 

Fredo: multiperspective photography

 

Fredo/Eugene: Beyond the flash-synchro limit, rolling shutter leads to different parts of the image being exposed at different times.
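
A minimal sketch of the rolling-shutter point above, assuming a simplified model in which each row starts exposing a fixed delay after the previous one. All timings are made-up integers in microseconds, and `rows_lit_by_flash` is a hypothetical helper, not any camera API.

```python
def rows_lit_by_flash(num_rows, row_delay, exposure, flash_start, flash_end):
    """Return indices of rows whose exposure window overlaps the flash.

    Row r is assumed to expose during [r * row_delay, r * row_delay + exposure).
    """
    lit = []
    for r in range(num_rows):
        start = r * row_delay
        end = start + exposure
        # Two half-open intervals overlap if each starts before the other ends.
        if start < flash_end and flash_start < end:
            lit.append(r)
    return lit

# Short exposure: the row windows barely overlap, so a brief flash
# reaches only a band of rows.
short_exposure = rows_lit_by_flash(10, 100, 200, 450, 500)

# Long exposure: every row is still open when the flash fires.
long_exposure = rows_lit_by_flash(10, 100, 1000, 900, 950)
```

In this toy model the short exposure lights only rows 3 and 4, while the long exposure lights all ten rows, which is the banding that appears once the exposure is shorter than the sync limit.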

 

Ramesh: Camera obscura (room), “in camera” legalese

 

Essence Photography

Perceptual prosthetics, combinations of physics, algorithms, etc.

 

Matt: people like that photos are “honest.”  Essence photography is inherently biased.

 

Encoding for several different fidelity criteria simultaneously

 

 

Visual Social Computing

A flea market has selling, etc., but eBay allows quick information dispersion.

 

Social computing began before electronic computers.  Parallel processing was implemented when computers were human (see e.g. David Alan Grier, When Computers Were Human, Princeton University Press, 2005).  Similar parallel processing for visual tasks is visual social computing.

 

Doug: hate is promoted by text messages in Kenya.  Similarly in the Philippines.

Ramesh: clearly one can create distrust, but is there a way to create trust?