|
I am a Human-Computer Interaction (HCI) research scientist who researches, envisions, designs, develops, and patents systems with fundamentally novel user interfaces in the domains of mobile communication and mobile computing, ubiquitous computing (ubicomp), artificial intelligence (AI), robotics (HRI), and augmented reality (AR). My strengths are "engineering creativity," and connecting the dots of research and emerging technologies to create radically new products and services. I have more than 15 years of experience in corporate and academic environments, including MIT, Samsung, and HP. My most recent Curriculum Vitae (CV) can be found here. Motivation: why, how, and what? From a global historical perspective, mankind has only just begun creating technologies with the explicit intention to directly enhance our bodies and minds. Although we have built tools to enhance our motor skills and perception for many millennia already (think "hammer" and "eye glasses"), I am referring to augmentation of higher level mental skills. The augmentation technologies that we do have already, however, have impacted us deeply: for example, mobile phones extend our conversational range and allow us talk to remote people almost anywhere, from anywhere. Mind you, although this seems just like an augmentation of our senses, it is actually an augmentation of our social interaction potential. This becomes clear when we look at today's mobile devices, which in general intend on enhancing our knowledge, telling us where we are, what other people think and see, and many other useful things. Going way beyond that, it is foreseeable, however, that Human Augmentation technologies even in the near future will enhance us in much more extreme ways: perceptually, cognitively, emotionally, and on many other levels. I am intent on pushing the envelope in this direction, and creating technologies to serve this purpose. And I have been doing so for almost 25 years: in order to understand the problem well, I have studied the human psyche in depth for eight years, and then studied the engineering side at MIT for another eight years. I believe that in order to create technologies that immediately and explicitly enhance people, and allow us to interact with technology more intuitively, we need to combine deep engineering and psychology knowledge, and all shades in between. So, I have worked in the fields of mobile communication, augmented reality, virtual worlds, artificial intelligence and robotics, and created a series of working prototypes that show how I think we will interact with future technologies. Positions held From 1997 to 2005, I was at the MIT Media Lab as research assistant and lead for ten research projects in the domain of speech interfaces and mobile communication, conversational and communication agents, embodied agents, and wireless sensor networks. I supervised undergraduate researchers (software, robotics, circuit design). I was in close contact with the lab's 150+ industrial sponsors, and gave over 100 presentations, talks, and demonstrations. From 2005 to 2010, I was project leader and principal researcher at Samsung research lab in San Jose, CA. I was in charge of initiating and managing HCI research projects, and headed a small team of PhD researchers. My job was to envision, design, and develop fundamentally novel user interface concepts, based on original research in the domains of ubiquitous computing, mobile communication, artificial intelligence, robotics, virtual worlds, and augmented reality. One of my larger projects explored new ways how to interact naturally and intuitively with 3D content (AR content, virtual worlds, games, etc.) on mobile and nomadic devices (from cellphones to tablets to laptops to unfolding display devices to portable projection devices and more). We built working prototypes that demonstrate key interaction methods and intelligent user interfaces for natural interaction. My patent applications serve as rough outlines for projects. Since 2011, I am with Palm/HP, and am director for future concepts and prototyping: I lead, manage, and inspire teams of end-to-end prototyping and research engineers (hardware & software), UI prototyping engineers, and UI production developers (the latter only interim until August 2011) in the webOS Human Interface group. I create working systems of future interaction methods, file for patents, and contribute to strategic roadmaps across all of Palm/HP. My projects are in the fields of wand and pen input, mobile 3D interfaces, remote multi-touch interfaces, and more. Although I cannot be more specific here, the patent applications I filed at Palm, once they become public, will show the outline of some of the projects. My functions As an engineer, I am an inventor, builder, and implementer who uses software and hardware rapid prototyping tools to create systems from the lowest level (firmware, sensors, actuators) up to highest level applications with GUIs and networking capabilities. Systems I have built at MIT include palm-sized wireless animatronics with full duplex-audio, conversational agents interacting with multiple parties simultaneously, autonomously hovering micro helicopters, laser based micro projectors, and a drawing tool with built in camera and an array of sensors to pick up and draw with visual properties of everyday objects. As an HCI researcher, I can isolate relevant scientific problems, tackle them systematically by using my extensive training as a psychologist, and then come up with novel theories and visions to solve them. Then I integrate theories and technologies into working prototypes and innovative systems, and verify their validity with rigorous user testing, be it with ethnographic methodologies or in experimental lab settings. As a leader, I am able to assemble a team of world class experts, lead and advise them on all research, engineering, and HCI levels. I am able to inspire and enable the team members, bringing out the best in them, and at the same time keeping strategic requirements in mind. Over the time, my teams have had sizes up to 12 people, but I can create high impact systems and prototypes with smaller teams as well. Academics I have earned three degrees: both Master's and PhD in Media Arts & Sciences from MIT, and an additional Master's in Psychology, Philosophy, Computer Science (University of Bern, Switzerland). I have received my PhD in Media Arts & Sciences from MIT in June 2005. During my studies, I worked as an HCI research assistant at the Media Lab at MIT. I was part of the Speech + Mobility Group (then called "Speech Interface Group"), headed by Chris Schmandt, where we worked on speech interfaces and mobile communication devices. Although in a user interface group, my personal approach includes software and robotic agents to enhance those interfaces. My PhD work (as well as my Master's thesis) illustrate that, and my qualifying exams domain reflect these ideas as well. Portfolio My representative industry projects focus on creating devices and systems which allow us to interact with mobile devices more naturally and intuitively. In a wider sense, I am working on Human Augmentation. I am both engineer and research director in this field. My representative MIT projectsmainly my doctoral workfocus on adding human-style social intelligence to mobile communication agents that are embodied both in the robotic and software domain. My past projects are diverse, and include projects such as autonomously levitating devices. Some projects are not in the engineering domain at all, but in the psychology realm (my other background). Leadership |
|
Palm was bought by HP in July 2010; it was renamed to webOS, and made open source in 2012
Director, Future Concepts and Prototyping Team (part of the HI team at webOS/Palm business unit of HP) | I founded the team in January 2011. My focus is on leading and inspiring the team members, directing the group's research agenda, hiring new members, interfacing with external groups, and patenting. The team's charter is to do risky and holistic HCI research and end-to-end prototyping (spanning software and hardware) that pushes the edges of HCI. Our projects target future product releases, approximately 3-5 years from the current releases. The team consists of Ph.D. level researchers with diverse backgrounds, from 3D virtual environments to robotics to architecture to speech interfaces. (I have led other teams at HP, up to 12 people, who do UI development and software UI prototyping.) We interact with engineers (software and hardware), designers, researchers (e.g., HP Labs), and planners (roadmapping and strategic planning). Our output consists of working prototypes and patents of interaction methods that serve as ground work for future releases. Our patenting efforts are significant, about one invention disclosure per week. Year: 2011 - present Status: ongoing Domain: engineering, management Type: research group Position: director
|
Samsung Electronics
Project Lead and Team Lead, HCI Research Team (part of the Computer Science Lab at Samsung R&D) | I founded the team in 2008, and lead it until my departure from Samsung at the end of 2010. The team size was between 3 and 5 members, with the staff researchers holding doctoral degrees in various fields, and interns from first tier universities. My task was to initiate and execute strategic HCI projects, both in collaboration with other Samsung groups and external groups. My duties included leading and inspiring the team members, setting the team's direction, creating strategic and feasible project plans, keeping the projects on track, hiring, and patenting. Our main accomplishments were working prototypes (see some projects below), evangelizing these prototypes to Samsung executives (up to chairman, CEO, and CTO), patenting core technologies, and technical reports of our research. Year: 2008 - 2010 Status: concluded Domain: engineering, management Type: research group Position: project and group leader Collaborators: Seung Wook Kim (2008-2010), Francisco Imai (2008-2009), Anton Treskunov (2009-2010), Han Joo Chae (intern 2009), Nachiket Gokhale (intern 2010) Representative industry projects |
Concept |
Early demo
Please click on thumbnails for larger pictures and videos!
iVi: Immersive Volumetric Interaction (Samsung R&D) | This is the master project that I created for the HCI research team. The core idea was to invent and create working prototypes of future consumer electronics devices, using new ways of interaction, such as spatial input (i.e., gestural and virtual touch interfaces) and spatial output (i.e., view dependent rendering, position dependent rendering, 3D displays). Platform focus was on nomadic and mobile devices (from cellphones to tablets to laptops), novel platforms (wearables, AR appliances), and some large display systems (i.e., 3D TV). We created dozens of prototypes (some described below) covered by a large number of patent applications. A very successful demo of spatial interaction (nVIS) was selected to be shown at the prestigious Samsung Tech Fair 2009. One of our systems with behind-display interaction (DRIVE) got selected for the Samsung Tech Fair 2010. Yet another one used hybrid inertial and vision sensors on mobile devices for position dependent rendering and navigating in virtual 3D environments (miVON). Year: 2008 - 2010 Status: concluded Domain: engineering, management Type: portfolio management Position: project and group leader Collaborators: Seung Wook Kim (2008-2010), Francisco Imai (2008-2009), Anton Treskunov (2009-2010), Han Joo Chae (intern 2009), Nachiket Gokhale (intern 2010)
Video of interaction |
Concept
Prototype
Patent drawings
DRIVE: Direct Reach Into Virtual Environment (Samsung R&D) | This novel interaction method allows users to reach behind a display and manipulate virtual content (e.g., AR objects) with their bare hands in the volume behind the device. We designed and constructed multiple prototypes: some are video see-through (tablet with an inline camera and sensors in the back), some are optical see-through (transparent LCD panel, depth sensor behind the device, front facing imager). The latter system featured (1) anaglyphic stereoscopic rendering (to make objects appear truly behind the device), (2) face tracking for view-dependent rendering (so that virtual content "sticks" to the real world), (3) hand tracking (for bare hand manipulation), and (4) virtual physics effects (allowing completely intuitive interaction with 3D content). It was realized using OpenCV for vision processing (face tracking), Ogre3D for graphics rendering, a Samsung 22-inch transparent display panel (early prototype), and a PMD CamBoard depth camera for finger tracking (closer range than what a Kinect allows). This prototype was demonstrated Samsung internally to a large audience (Samsung Tech Fair 2010), and we filed patent applications. (Note that anaglyphic rendering was the only available stereoscopic rendering method with the transparent display. Future systems will likely be based on active shutter glasses, or parallax barrier or lenticular overlays. Also note that a smaller system does not have to be mounted to a frame, like our prototype, but can be handheld.) Year: 2010 Status: concluded Domain: engineering Type: demo, paper (MVA2011), paper (3DUI2011), Position: project and group leader Collaborators: Seung Wook Kim, Anton Treskunov
Video of demo |
Prototype
System setup
Spatial Gestures for 3DTV UI (Samsung R&D) | This project demonstrates new media browsing methods with direct bare hand manipulation in a 3D space on a large stereoscopic display (e.g., 3D TV) with 3D spatial sound. We developed a prototype on an ARM-based embedded Linux platform with OpenGL ES (visual rendering), OpenAL (spatial audio rendering), and ARToolKit (for hand tracking). The main contribution was to create multiple gesture interaction methods in a 3D spatial setting, and implement these interaction methods in a working prototype that includes remote spatial gestures, stereoscopic image rendering, and spatial sound rendering. Year: 2010 Status: concluded Domain: engineering Type: demo, video, report Position: project and group leader Collaborators: Seung Wook Kim, Anton Treskunov
Complete demo video |
Curved displays
Mobile & static
nVIS: Natural Virtual Immersive Space (Samsung R&D) | This project demonstrates novel ways to interact with volumetric 3D content on desktop and nomadic devices (e.g., laptop), using methods like curved display spaces, view-dependent rendering on non-planar displays, virtual spatial touch interaction methods, and more. We created a series of prototypes consisting of multiple display tiles simulating a curved display space (up to 6), rendering with asymmetric view frustum (in OpenGL), vision-based 6DOF face tracking (OpenCV based and faceAPI), and bare hand manipulation of 3D content with IR-marker based finger tracking. The system also shows a convergence feature, by dynamically combining a mobile device (e.g., cellphone or tablet) with the rest of the display space, while maintaining spatial visual continuity among all displays. One of the systems includes upper torso movement detection for locomotion in virtual space. In addition to desktop based systems, we created a prototype for public information display spaces, based on an array of 70-inch displays. All final systems were demonstrated Samsung internally at the Samsung Tech Fair 2009. Year: 2008-2010 Status: concluded Domain: engineering Type: demos, videos, technical reports [CHI 2010 submission], patents Position: project and group leader Collaborators: Seung Wook Kim, Anton Treskunov
Concept (2008) |
Multiple devices
Egomotion sensing
Phone demo
miVON: Mobile Immersive Virtual Outreach Navigator (Samsung R&D) | This project is about a novel method for interacting with 3D content on mobile platforms (e.g., cellphone, tablet, etc.), showing position-dependent rendering (PDR) of a 3D scene such as a game or virtual world. The system disambiguates shifting and rotating motions based on vision-based pose estimation. We developed various prototypes: a UMPC-version using optical flow only for pose estimation and a Torque3D game engine, a netbook based prototype that used up to four cameras to disambiguate imager-based 6DOF pose estimation, and a cellphone based prototype that combined inertial and vision based sensing for 6DOF egomotion detection (see videos on the left). Multiple patents were filed, and our code base was transferred to the relevant Samsung business units. Year: 2008-2010 Status: concluded Domain: engineering Type: demo, video, paper [SPBA 2009 "Position dependent rendering of 3D content on mobile phones using gravity and imaging sensors"], patents Position: project and group leader Collaborators: Seung Wook Kim, Han Choo Chae, Nachiket Gokhale
System concept |
Dome mockup
Nubrella mockup
Patent drawings
PUPS: Portable Unfolding Projection Space (Samsung R&D) | This project is about mobile projection display systems, to be used as a platform for AR systems, games, and virtual worlds. The technology is based on "non-rigid semi-transparent collapsible portable projection surfaces," either wearable or backpack mounted. It includes multiple wearable display segments and cameras (some inline with the projectors), and semi-transparent and tunable opacity mobile projection surfaces which are unfolding in an origami style, adjusting to the eye positions. There is a multitude of applications for this platform, from multiparty video conferencing (conference partners are projected around the user, and a surround sound system disambiguates the voices), to augmented reality applications (the display space is relatively stable with regards to the user, and AR content will register much better with the environment than handheld or head worn AR platforms), to cloaking and rear view applications. The portable display space is ideal for touch interactions (surface is at arm's length), and can track the user's hands (gestures) as well as face (for view dependent rendering). This early stage project focused on patenting and scoping of the engineering tasks, but did not go beyond that stage. Year: 2008-2009 Status: concluded Domain: engineering Type: project plan, reports, patent application Position: project and group leader Collaborators: Seung Wook Kim, Francisco Imai
Pet robotics |
Animatronic mediator
Pet Robotics and Animatronic Mediators (Samsung R&D) | Pets are shown to have a highly positive (therapeutic) effects on humans, but are not viable for all people (allergies, continued care necessary, living space restrictions, etc.) From a consumer electronics perspective, there is an opportunity to create robotic pets with high realism and consumer friendliness to fill in, and create high emotional attachment by the user. Our approach emphasizes the increase of lifelikeness of a pet robot, by employing (among others) emotional expressivity (expresses emotion with non-verbal cues, non-speech audio), soft body robotics (silent and sloppy actuator technologies; super soft sensor skin), and biomimetic learning (cognitive architecture and learning methods inspired by real pets). My in depth analysis of the field covered the hard technical problems to get to a realistic artificial pet, the price segment problem (gap between toy and luxury segments), how to deal with the uncanny valley, and many other issues. This early stage project focused on planning and project scoping, but did not enter engineering phase. However, it is related to my dissertation field of Autonomous Interactive Intermediaries. Year: 2008 Status: concluded Domain: engineering Type: report Position: project leader
Patent drawing 1 |
Patent drawing 2
Patent drawing 3
Googling Objects with Physical Browsing (Samsung R&D) | This project is about advanced methods of free-hand mobile searching. I developed two novel interaction methods for a localized in-situ search. The underlying idea is that instead of searching for websites, people who are not sitting in front of a desktop computer may search for information on physical objects that they encounter in the world: places, buildings, landmarks, monuments, artifacts, productsin fact, any kind of physical object. This "search in spatial context, right-here and right-now," or physical browsing, poses special restrictions on the user interface and search technology. I developed two novel core interaction methods, one based on manually framing the target object with hands and fingers (using vision processing to detect these gestures), the other based on finger pointing and snapping (using audio triangulation of the snapping sound) as an intutive "object selection" method. The project yielded two granted patents. Year: 2006-2007 Status: concluded Domain: engineering Type: report, patent 1 (point and snap), patent 2 (finger framing) Position: project leader Representative projects done at MIT |
|
![]() Click on thumbs for larger images
Autonomous Interactive Intermediaries | An Autonomous Interactive Intermediary is a software and robotic agent that helps the user manage her mobile communication devices by, for example, harvesting 'residual social intelligence' from close by human and non-human sources. This project explores ways to make mobile communication devices socially intelligent, both in their internal reasoning and in how they interact with people, trying to avoid, e.g., that our advanced communication devices interrupt us at completely inappropriate times. My Intermediary prototype is embodied in two domains: as a dual conversational agent, it is able to converse with caller and calleeat the same time, mediating between them, and possibly suggesting modality crossovers. As an animatronic device, it uses socially strong non-verbal cues like gaze, posture, and gestures, to alert and interact with the user and co-located people in a subtle but public way. I have built working prototypes of Intermediaries, embodied as a parrot, a small bunny, and a squirrelwhich is why some have called my project simply "Bunny Phone" or "Cellular Squirrel". However, it is more than just an interactive animatronics that listens to you and whispers into your ear: When a call comes in, it detects face-to-face conversations to determine social groupings (see Conversation Finder project), may invite input ("vetos") from the local others (see Finger Ring project), consults memory of previous interactions stored in the location (called Room Memory project), and tries to assess the importance of the incoming communication by conversing with the caller (see Issue Detection project). This is my main PhD thesis work. More... Year: 2002 - 2005 Status: dormant (not active as industry project, but I keep working on it) Domain: engineering Type: system, prototypes, paper, another paper,short illustrative videos, dissertation, demo video [YouTube], patent 1, patent 2 Press: many articles Position: lead researcher Advisor: Chris Schmandt PhD Thesis committee: Chris Schmandt, Cynthia Breazeal, Henry Lieberman Collaborators: Matt Hoffman (undergraduate collaborator)
|
![]() Mockup (top), prototype (middle), alignment example (bottom)
Conversation Finder | Conversation Finder is a system based on a decentralized network of body-worn wireless sensor nodes that independently try to determine with who the user is in a face-to-face conversation with. Conversational groupings are detected by looking at alignment of speechthe way we take turns when we talk to each other. Each node has a microphone and sends out short radio messages when its user is talking, and in turn listens for messages from close-by nodes. Each node then aggregates this information and continuously updates a list of people it thinks its user is talking to. A node can be queried for this information, and if necessary can activate a user's Finger Ring (see Finger Ring project). Depending on my conversational status, my phone might or might not interrupt me with an alert. This system is a component of my PhD work on Autonomous Interactive Intermediaries, a large research project in context-aware computer-mediated call control. Year: 2002 - 2005 Status: dormant Domain: engineering Type: system, prototypes, papers, explanatory video 2003 (1 minute) [RealVideo] Position: lead researcher Advisor: Chris Schmandt Collaborators: Quinn Mahoney (undergraduate collaborator 2002-2003), Jonathan Harris (undergraduate collaborator 2002)
|
![]() Working prototype (top), wired rings used for user tests (middle, bottom)
Finger Ring, "Social Polling" | Finger Ring is a system in which a cell phone decides whether to ring by accepting votes from the others in a conversation with the called party. When a call comes in, the phone first determines who is in the user's conversation (see Conversation Finder project). It then vibrates all participants' wireless finger rings. Although the alerted people do not know if it is their own cellphones that are about to interrupt, each of them has the possibility to veto the call anonymously by touching his/her finger ring. If no one vetoes, the phone rings. Since no one knows which mobile communication device is about to interrupt, this system of 'social polling' fosters collective responsibility for controlling interruption by communication devices. I have found empirical evidence that significantly more vetoes occur during a collaborative group-focused setting than during a less group oriented setting. This system is a component of my PhD work on Autonomous Interactive Intermediaries, a large research project in context-aware computer-mediated call control. Year: 2002 - 2005 Status: dormant Domain: engineering Type: system, prototypes, paper Position: lead researcher Advisor: Chris Schmandt
|
Issue Detection | Issue Detection is a system that is able to assess in real-time the relevance of a call to the user. Being part of a conversational agent that picks up the phone when the user is busy, it engages the caller in a conversation using speech synthesis and speech recognition to get a rough idea for what the call might be about. Then it compares the recognized words with what it knows about what is currently 'on the mind of the user.' The latter is harvested continuously in the background from sources like the user's most recent web searches, modified documents, email threads, together with more long term information mined from the user's personal web page. The mapping process has several options in addition to literal word mapping. It can do query extensions using Wordnet as well as sources of commonsense knowledge. This system is a component of my PhD work on Autonomous Interactive Intermediaries, a large research project in context-aware computer-mediated call control. Year: 2002 - 2005 Status: dormant Domain: engineering Type: system Position: lead researcher Advisor: Chris Schmandt
| ![]() Illustration of channel sequence
Active Messenger | Active Messenger (AM) is a personal software agent that forwards incoming text messages to the user's mobile and stationary communication devices such as cellular phones, text and voice pagers, fax, etc., possibly to several devices in turn, monitoring the reactions of the user and the success of the delivery. If necessary, email messages are transformed to fax messages or read to the user over the phone. AM is aware of which devices are available for each subscriber, which devices were used recently, and if a message was received and read by the user by exploiting back-channel information and by inferring from the users communication behavior over time. The system treats filtering as a process rather than a routing problem. AM is up and running since 1998, serving between 2 and 5 users, and has been refined over the last 5 years in a tight iterative design process. This project started out as my Master's thesis at the MIT Media Lab (finished 1999), but has been continued until the present time. More... Year: 1998 - 2005 Status: system stable and still (!!) in continuous use Domain: engineering Type: system, thesis, paper (HCI), paper (IBM), tech report Position: lead researcher Advisor: Chris Schmandt
|
I/O Brush | I/O Brush is a new drawing tool to explore colors, textures, and movements found in everyday materials by "picking up" and drawing with them. I/O Brush looks like a regular physical paintbrush but has a small video camera with lights and touch sensors embedded inside. Outside of the drawing canvas, the brush can pick up color, texture, and movement of a brushed surface. On the canvas, artists can draw with the special "ink" they just picked up from their immediate environment. I designed the electronics on the brush (sensors, etc), the electronics 'glue' between the brush and the computers, and wrote the early software. This project is the PhD work of Kimiko Ryokai, and has been presented at many events, including a 2-year interactive exhibition at the Ars Electronica Center in Linz, Austria. More... Year: 2003 - 2005 Status: active Domain: engineering Type: system, prototypes, paper, paper (design), video [MPEG (27MB)] [MOV (25MB)] [YouTube] Position: collaborator Collaborators: Kimiko Ryokai (lead researcher), Rob Figueiredo (undergraduate collaborator), Joshua Jen C. Monzon (undergraduate collaborator) Advisor: Hiroshi Ishii Past projects |
|
Robotic F.A.C.E. | Robotic F.A.C.E., which stands for Facial Alerting in a Communication Environment, explored the use of a physical object in the form of a face as a means of user interaction, taking advantage of socially intuitive facial expressions. We have built an interface to an expressive robotic head (based on the mechanics of a commercial robotic toy) that allows the use of socially strong non-verbal facial cues to alert and notify. The head, which can be controlled via a serial protocol, is capable of expressing most basic emotions not only in a static way, but also as dynamic animation loops that vary some parameter, e.g., activity, over time. Although in later projects with animatronic components (Robotic P.A.C.E., Autonomous Interactive Intermediaries) I did not reverse engineer a toy interface anymore, the experience gained with this project was very valuable. More... Year: 2003 - 2004 Status: done Domain: engineering Type: system Position: lead researcher Collaborators: Mark Newman (undergraduate collaborator)
| ![]()
Robotic P.A.C.E. | The goal of the Robotic P.A.C.E. project was to explore the use of a robotic embodiment in the form of a parrot, sitting on the user's shoulder, as a means of user interaction, taking advantage of socially intuitive non-verbal cues like gaze and postures. These are different from facial expressions (as explored in the Robotic F.A.C.E. project), but at least as important as them for grabbing attention and interrupting in a socially appropriate way. I have built an animatronic parrot (based on a hand puppet and commercially available R/C gear) that allows the use of strong non-verbal social cues like posture and gaze to alert and notify. The wireless parrot, which can be controlled from anywhere by connecting to a server via TCP which in turn connects to a hacked R/C long range transmitter, is capable of quite expressive head and wing movements. Robotic P.A.C.E. was a first embodiment for a communication agent that reasons and acts with social intelligence. Year: 2003 - 2004 Status: done Domain: engineering Type: system, illustrative video (2 minutes) [ Quicktime 7,279kb] [YouTube] Position: lead researcher
|
![]()
Tiny Projector | Mobile communication devices get smaller and smaller, but we'd prefer if the displays would get larger instead. The solution to this dilemma is to add projection capabilities to the mobile device. The basic idea behind TinyProjector was to create the smallest possible character projector that can be either integrated into mobile devices like cellphones, or linked wirelessly via protocols like Bluetooth. During this 2-year project, I have built ten working prototypes; the latest one uses eight laser diodes and a servo-controlled mirror that "paints" characters onto any surface like a matrix printer. Because of the laser light, the projection is highly visible even in daylight and on dark backgrounds. More... Year: 2000 - 2002 Status: done Domain: engineering Type: prototypes, report Position: lead researcher
| ![]() 2-way pager used for Knothole
Knothole | Knothole (KH) uses mobile devices such as cellphones and two-way pagers as mobile interfaces to our desktop computers, combining PDA functionality, communication, and Web access into a single device. Rather than put intelligence into the portable device, it relies on the wireless network to connect to services that enable access to multiple desktop databases, such as your calendar and address book, and external sources, such as news, weather, stock quotes, and traffic. In order to poll a piece of information, the user sends a small message to the KH server, which then collects the requested information from different sources and sends it back as a short text summary. Although development of KH has finished 1998, it is currently used by Active Messenger, which it enhances and with which it interacts seamlessly. More... Year: 1997 - 1998 Status: system stable and in continuous use Domain: engineering Type: system, prototypes, paper (related) Position: lead researcher Advisor: Chris Schmandt
|
![]() Early prototype (top), later designs (middle, bottom)
Free Flying Micro Platforms, "Zero-G Eye" | A Free Flying Micro Platform (FFMP) is a vision for small autonomously hovering mobot with a wireless video camera that carries out requests for aerial photography missions. It would operate indoors and in obstacle rich areas, where it avoids obstacles automatically. Early FFMPs would follow high level spoken commands, like "Go up a little bit, turn left, follow me, and would try to evade capture. Later it would understand complex spoken language such as "Give me a close up of John Doe from an altitude of 3 feet" and would have refined situational awareness. The Zero-G Eye is a first implementation of a FFMP that was built to explore ways of creating an autonomously hovering small device. The sensor-actuator loop is working, but flight times were highly constrained because of a too low lift-to-weight ratio. Later prototypes are in different planning stages, and profit from experiences made with earlier devices. As a side note, I have been virtually obsessed with small hovering devices for a very long time already, and have designed such devices since I was 10 years old. More... Year: 1997 - 2001 Status: prototypes developed; project dormant Domain: engineering Type: prototypes, report 1, report 2, report 3, paper, paper (related) Position: lead researcher
|
ASSOL (Adaptive Song Selector Or Locator) | The Adaptive Song Selector Or Locator (ASSOL) is an adaptive song selection system that dynamically generates play lists from MP3 collections of users that are present in a public space. When a user logs into a workstation, the ASSOL server is notified, and the background music that is currently played in this space is influenced by the presence of the new user and her musical preferences. Her preferences are simply extracted from her personal digital music collection, which can be stored anywhere on the network and are streamed from their original location. A first time user merely has to tell the ASSOL system where her music files are stored. From then on, the play lists are compiled dynamically, and adapt to all the users in a given area. In addition, the system has a Web interface that allows users to personalize certain songs to convey certain information and alert them without interrupting other people in the public space. More... Year: 2000 Status: done Domain: engineering Type: system, prototype, report Position: researcher Collaborators: Kwan Hong Lee (co-researcher)
|
OpenSource SpeechSynth | The OpenSource SpeechSynth (OSSS) is a purely Web based text-to-speech synthesizer for minority languages, for which no commercial speech synthesizer software is available, e.g., Swiss German. It is based on a collaborative approach where many people contribute a little, so that everybody can profit from the accumulated resources. Its Web interface allows visitors to both upload sound files (words), as well as synthesize existing text. The speech synthesizing method used in this project is word based, which means that the smallest sound units are words. Sentences are obtained by waveform concatenation of word sounds. Due to the word concatenation approach, the OSSS works with any conceivable human language. It currently lists 90 languages, but users can easily add a new language if they wish, and then start adding word sounds. During the 4 years the OSSS is online, it has been tested by many Web visitors, specifically by the Lojban community. More... Year: 2000 - 2001 Status: done; up and running Domain: engineering Type: system, report Position: lead researcher
|
![]() Different weather conditions
WeatherTank | WeatherTank is a tangible interface that looks like a tabletop sized vivarium or a diorama, and uses everyday weather metaphors to present information from a variety of domains, e.g., "a storm is brewing" for increasingly stormy weather, indicating upcoming hectic activities in the stock exchange market. WeatherTank represents such well-known weather metaphors with desktop sized but real wind, clouds, waves, and rain, allowing users to not only see, but also feel information, taking advantage of our skills developed through our lifetimes of physical world interaction. A prototype was built that included propellers for wind, cloud machines, wave and rain generators, and a color-changing lamp as sun mounted on a rod that can be used to move the sun along an arc over the tank, allowing the user to manipulate the time of day. Year: 2001 Status: done Domain: engineering Type: system, report (short), report, unedited demo video (18.5 minutes) [RealVideo] [YouTube] Position: researcher Collaborators: Deva Seetharam (co-researcher)
| ![]() Screenshot
Impressionist visualization of online communication | This system provides an intuitive, non-textual representation of online discussion. In the context of a chat forum, all textual information of each participant is transformed to a continuous stream of video. The semantic content of the text messages is mapped onto a sequence of videos and pictures. The mapping is realized on the side of the receiver, because a simple text line like "I love cats" means different things to different people. Some would associate this with an ad for cat food, some other would be more negative because they dislike the mentality of cats and would therefore see pictures like a dog chasing a cat. For this purpose, each participant has a personal database of semantic descriptions of pictures and videos. If the participant scans the messages of a group, this textual information is transformed automatically to a user specific multiple stream of video. These video snippets have purely connotative meanings. I have built a proof-of-concept system with live video streams. More... Year: 1998 Status: done Domain: engineering, art installation Type: system, report Position: lead researcher
|
![]() Example for system reasoning (top), game example (bottom)
Daboo | Daboo is a real time computer system for the automatic generation of text in the specific context of the word guessing game Taboo. To achieve the game’s goallet the user guess a word as fast as possible without using certain taboo wordour system uses sophisticated algorithms to model user knowledge and to interpret semantically the user input. The former is very important to gradually enhance the performance of the system by adapting to the user's strongest "context" of knowledge. The latter helps bridge the gap between a guess and the actual word to guess by creating a semantic relationship between the two. For this purpose we rely on the semantic inheritance tree of WordNet. Daboo acts effectively as the clue giving party of a Taboo session by interactively generating textual descriptions in real time. More... Year: 1997 Status: done Domain: engineering Type: system, report Position: researcher Collaborators: Keith Emnett (co-researcher)
|
Psychological Impact of Modern Communication Technologies | In this two-year study, I examined both the communicative behavior in general and the use of communication technologies of eight subjects in detail using extensive problem-centered interviews. From the interview summaries, a general criterion for media separation was extracted, which allows the systematic separation of all media into two groups: on one side the verbal-vocal, realtime-interactive, and non-time-buffered media like telephone, intercom, and face-to-face communication; on the other side the text-based, asynchronous, and time-buffering media like letter, telefax, and email. The two media answering machine and online chatting (realtime communication via computer monitor and keyboard) occupy exceptional positions because they cannot be assigned to either group. Therefore, these two media were examined in detail. Through analyzing them under the aspects of both a semiotic ecological approach and a privacy regulation model, important characteristics and phenomena of their use can be explained, and future trends be predicted. This work was part of my first Master's thesis in Psychology. Year: 1993 Status: done Domain: psychology Type: study, thesis, paper Position: researcher Advisor: Urs Fuhrer
|
Influence of Video Clip on Perception of Music | This study explored the question if music is perceived and rated differently when presented alone versus when presented together with a promotional video clip. Thirty-six subjects filled out semantic differentials after having listened to tree different songs, each under one of the following conditions: the song was presented without video; the song was presented with the corresponding promotional video clip; the song was presented with random video. We found that subjects instructed to evaluate music do this in similar ways, independent from the presence of a matching or mismatching video clip. However, presenting a song together with a corresponding video clip decreases the possibility for the listeners/viewers to interpret the music in their own way. Furthermore, our data suggests that it is difficult and perhaps arbitrary to rate an incompatible or unmotivated music video mix, and that an appropriate video clip makes the "meaning" of the song more unequivocal. Year: 1989 Status: done Domain: psychology Type: study, report Position: researcher Advisor: Alfred Lang Collaborators: Fränzi Jeker und Christoph Arn (co-researchers)
|
Physiological Reactions to Modern Music | This half-year study was motivated by our subjective experiences of a striking physiological reaction ("goose pimples") when listening to some modern pop and rock songs. We hypothesized that this physiological reaction, possibly caused by adrenaline, is the most important factor that determines if a song pleases an audience: it can lead to massive sales of records and high rankings on music charts. This project explores if an accumulation of such "pleasing spots" can be found when listening to specific songs. Our experimental results from 33 subjects prove the existence of such accumulations; additionally, we examined if we were able to predict these accumulations ahead of the experiment. Our results show also that we were able to predict them intuitively, but not all of them and not without errors. Only assumptions can be made about the criteria for passages that attracted attention as accumulations of conjectured physiological reactions. There are four factors: increase of musical density (for example, chorus passages); musical intensification of melody or harmony; increase of rhythm; increase of volume. Year: 1986 Status: done Domain: psychology Type: study, report Position: researcher Advisor: Alfred Lang Collaborators: Sibylle Perler, Christine Thoma, Robert Müller, and Markus Fehlmann (all co-researchers) | |||||
. Last updated: January 31, 2012