© Portions copyright March 1998 by Joseph A. Paradiso
MIT Media Laboratory
20 Ames St. E15-325
Cambridge MA 02139
The desire for musical expression runs deep across human cultures; although styles vary considerably, music is often thought of as a universal language. It is tempting to surmise that one of the earliest applications of human toolmaking, after hunting, shelter, defense, and general survival, was to create expressive sound, developing into what we know and love as music. As toolmaking evolved into technology over the last centuries, inventors and musicians have been driven to apply new concepts and ideas to improving musical instruments or creating entirely new means of controlling and generating musical sounds. The classic acoustic instruments, such as the strings, horns, woodwinds, and percussion of the modern orchestra (and the sitars, kotos, etc. of the non-Western world), have been with us for centuries, and thus have settled into what many regard as near-optimal designs, only slowly yielding to gradual change and improvement. For hundreds of years, the detailed construction of prized acoustic instruments, especially in the string family, has remained a mysterious art, and only recently have their structural, acoustic, and material properties been understood in enough detail for new contenders to emerge, for example in the instruments of Carleen M. Hutchins.
Electronic music, in contrast, has no such legacy. The field has existed for less than a century, giving electronic instruments far less time to mature. Even more significantly, technology is developing so quickly that new sound synthesis methods and capabilities rapidly displace those of only a few years before. The design of appropriate interfaces is therefore in a continual state of revolution, always driven by new methods of sound generation that enable (and occasionally require) expression and control over new degrees of freedom. This is especially relevant now, as synthesis techniques such as physical modeling move into prominence. As its name implies, physical modeling synthesis runs a mathematical model of an actual acoustic instrument or of a complicated, imaginary, "pseudo-acoustic" system on a computer or DSP. Since most acoustic instruments have multimodal, expressive interfaces very different from the piano keyboards common to commercial synthesizers, a different, perhaps more multimodal and general interface is required for a performer to attain the full potential promised by modeling synthesis.
Throughout most of the history of electronic music, the interaction end of instrument design could be classed loosely as a branch of ergonomics. Over the last 15 years, electronic instruments became digital, and within the next decade or so, their functions will probably be totally absorbed into what general-purpose computers will become. Thus, for all practical purposes, musical interface research has merged with the broader field of human-computer interfaces. This merger has two basic frontiers: at one end are interfaces for virtuoso performers, who practice and become adept at manipulating subtle nuances of sound from a particular instrument. At the other end, the power of the computer can be exploited to map basic gesture into complex sound generation, allowing even nonmusicians to conduct, initiate, and to some extent control a dense musical stream. While the former efforts push the application of noninvasive, precision sensing technologies in very demanding real-time user interfaces, the latter rely more on pattern recognition, algorithmic composition, and artificial intelligence.
Interposing a computer in the loop between physical action and musical response allows essentially any imaginable sonic response to a given set of actions; this is termed "mapping". As digital musical interfaces are so recent, there is no clear set of rules governing appropriate mappings, although (arguably) some sense of causality should be maintained so that performers perceive a level of deterministic feedback to their gestures. Likewise, there is considerable debate surrounding the ground rules of a digital performance. For example, when a violinist plays, the audience has a reasonable idea of what it will be hearing in response, but when any possible sonic event can result from any gesture on an unfamiliar digital interface, the performing artist risks losing the audience. It is not entirely trivial for modern composers working in this genre to maintain the excitement inherent in watching a trained musician push an instrument to the edge of its capability; in most venues, audiences will expect avenues through which they can feel the performer's tension and sweat, so to speak.
Although, as indicated above, musical mapping is an important component of all modern musical interfaces (and many interesting software packages have been developed for this purpose, e.g., Opcode's MAX, developed by Miller Puckette; Interactor, developed by Mark Coniglio and Morton Subotnick; the Lick Machine from STEIM; the CMU MIDI Toolkit by Roger Dannenberg; Flex from Cesium Sound; and ROGUS and HyperLisp here at the MIT Media Lab), the remainder of this article will focus more on sensing and hardware, tracing the history of electronic musical interfaces and describing examples and research that illustrate these concepts.
2) Reign of the Keyboard
With the notable exception of the Theremin, to be discussed below, all early electronic musical instruments were primarily controlled by a keyboard, frequently the standard 12-tone (chromatic) layout we know from the acoustic piano. The first true electronic instruments were Elisha Gray's 1876 Musical Telegraph (an array of tuned electronic buzzers activated by switches on a musical keyboard) and English physicist William Duddell's 1899 Singing Arc, which used a keyboard to control an audio modulation frequency imposed over the 300-V potential driving a carbon arc lamp that directly produced musical tones.
In 1906, Thaddeus Cahill's 200-ton Telharmonium generated musical audio from a building full of tuned dynamo wheels, one per note; as the instrument predated vacuum tube amplification, the dynamos themselves each needed to generate the circa 10 kW required to drive the transducers of the thousands of listeners who subscribed over telephone lines. The Telharmonium was controlled from a multiple-keyboard console designed to accommodate two players. Cahill's keyboard was actually touch-sensitive (a feature lacking in most of its descendants in the electronic organ world); as each key was connected to a mechanism that adjusted the alignment of two coils in a coupling transformer, the amplitude of the signal was a function of key depression. There were also switches and pedals to control the timbre and dynamics. Since there was one wheel per note, the Telharmonium was a polyphonic instrument, as were those of its electric tonewheel progeny that were commercialized some three decades later, such as the Rangertone and Hammond organs. In contrast, most of the early vacuum tube instruments were monophonic; the few exceptions, such as Hugo Gernsback's Pianorad (1926) and the Coupleaux-Givelet organ (1938), required a huge bank of bulky tube oscillators (one per note), which were notorious for drifting out of tune.
The "Ondes Martenot", introduced by Maurice Martenot in 1928, was the first electronic keyboard to be produced in any quantity, and as several notable composers of the time scored for it, it is still used in concert today. Early versions controlled monophonic pitch exclusively through a taut ribbon loop that was attached to a ring fitting on the forefinger of a performer's right hand and also wound around the shaft of a variable capacitor (much like the string in the tuning dials of old radios). As the player translated this hand up and down a dummy keyboard (included for visual/tactile reference only), the ribbon loop was likewise rotated and the capacitor turned, changing the frequency of the heterodyned audio oscillator, thus determining the instrument's pitch. As the oscillator or pitch mechanics drifted, the player would compensate by ear, adjusting the position of the ribbon finger appropriately. The left hand keyed the audio on and off, and controlled a set of stops that switched components in and out to control the timbre and characteristics of the amplitude envelope. Martenot successively refined his instrument, eventually producing a device with a real keyboard that triggered a note and determined its pitch. The ribbon remained, but was used for portamento and vibrato effects; the left hand was now devoted to articulating timbre and dynamics changes. This left/right hand articulation/note arrangement has persevered, and is still present in most modern-day keyboard synthesizers.
The following decades saw several other primarily monophonic keyboard-based electronic instruments with refined timbral control, including Friedrich Trautwein's Trautonium in 1928 (which determined pitch with a set of adjustable finger pads mounted above an early "ribbon controller", or touch-sensitive resistive strip) and the various instruments of Harald Bode, such as the Warbo Formant Organ in 1937 (with a top-note priority assignment of four oscillators, allowing 4-voice polyphony), the Melodium in 1938 (sporting a true pressure-sensitive keyboard, the dynamics varying with the force of the key mechanism hitting a resistive strip of felt soaked with glycerin), and the Melochord in 1947 (a dual-keyboard Melodium, installed in the famous Cologne Electronic Music Studio in the early 50's as a very early modular system with tracking filters and envelope generators). In 1941, the French engineer Georges Jenny introduced the Ondioline, a popular monophonic instrument with a 3-octave keyboard that tapped into a resistor ladder to determine pitch (a technique still used in the analog synthesizers of three decades later). The entire keyboard mechanism would move up and down with key pressure and side-to-side with lateral force. This varied the capacitance between corresponding sets of sensor plates, allowing amplitude dynamics to be mapped onto key pressure (together with the force applied to another sensor plate against the player's knee) and vibrato effects to be determined by horizontal displacement.
Although never produced commercially, the instruments of Canadian electronic music pioneer Hugh LeCaine represented important milestones in the field of electronic musical interfaces. These included a polyphonic electronic organ with a depression-sensitive keyboard using capacitive displacement sensors below each key (1953), a keyboard-controlled "Special Purpose Tape Recorder" (1955) that in some ways presaged the famous Mellotron tape samplers of the 1960's, and a very expressive device termed the Electronic Sackbut (named after the medieval forerunner to the trombone), produced in 1948 as an electronic instrument capable of emulating the sonic complexity of acoustic instruments. The Sackbut was a monophonic keyboard with several channels of articulation control. The volume and attack of a sound were determined by the displacement and pressure of the keys and the position of a foot pedal. The keys could also move horizontally, changing the note's pitch, while another foot pedal controlled the amount of portamento between notes; pitch could also be altered through a touch-sensitive strip mounted above the keyboard. LeCaine designed an extremely dexterous timbre controller for the left hand, with the thumb controlling a pair of formant resonances, the index finger manipulating a capacitive "joystick" mixer that adjusted the basic oscillator waveshape, and the outer three fingers introducing different types of frequency modulation. With all of these continuous input channels available, the Sackbut came alive in the hands of a trained performer, as can still be heard in LeCaine's recordings.
Although the popular voltage-controlled analog synthesizers of the late 1960's and early 1970's (made by Moog, Arp, E-Mu, etc.) were capable of being controlled through essentially any means, the dominant interface was again a simple chromatic keyboard. These keyboards generated a logic gate while a key was pressed and grabbed a voltage off a (usually linear) resistor ladder with a sample-hold; the rising gate triggered envelope generators that controlled the dynamics, while the keyboard's sampled voltage determined the pitch, filter tracking, etc. The majority of these keyboards were monophonic, although there were some duophonic varieties that produced a pair of control voltages when two keys were hit. These keyboards were also usually inexpressive; very few of them responded to velocity, pressure, or anything besides key hits. Continuous control of dynamics in these devices was obtained by turning knobs, pushing pedals, or manipulating something like a ribbon controller (a position-sensitive resistive strip that responded to finger touch), again primarily with the left hand.
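The gate/sample-hold behavior described above is easy to caricature in software. The toy simulation below assumes the common 1-volt-per-octave scaling and a low-note-priority rule; both are typical conventions of the era rather than details of any particular instrument:

```python
def keyboard_cv(held_keys, last_cv):
    """Simulate a monophonic analog keyboard: return a gate flag and a
    sampled-and-held control voltage at 1 V/octave with low-note
    priority (both typical conventions, assumed here for illustration).

    held_keys: set of key indices currently pressed (0 = lowest key)
    last_cv:   voltage held from the previous scan; the sample-hold
               retains it after all keys are released
    """
    if held_keys:
        key = min(held_keys)         # low-note priority
        cv = key * (1.0 / 12.0)      # one semitone per 1/12 V
        return True, cv              # gate high, fresh CV sampled
    return False, last_cv            # gate low, CV held

# Holding keys 12 and 19: the lower key (one octave up) wins.
gate, cv = keyboard_cv({12, 19}, 0.0)   # gate=True, cv close to 1.0 V
gate, cv = keyboard_cv(set(), cv)       # released: gate drops, CV holds
```

The held `last_cv` mirrors the analog sample-hold capacitor: releasing all keys drops the gate (so the envelope enters its release phase) while the pitch voltage stays parked at the last note.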
Responding to the demand for a portable synthesizer that could easily be brought on tour with a gigging band, Moog Music released the famous MiniMoog in 1970, a device that had enormous influence on the synthesizers to follow. This instrument was a hardwired subset of the large modular systems available before it; signal routing between components was controlled entirely by a set of switches and potentiometers. No patchcords or impressive bank of equipment was needed, although this portability and affordability was achieved by compromising flexibility and sonic variety. Commercially, however, the compromise paid off: the MiniMoog was such a success (over 12,000 were sold, more than any previous synthesizer) that it cast a shadow extending to the present day, the most obvious legacy being the twin set of wheels at the left of nearly all electronic music keyboards (Bob Moog is often quoted as regretting not patenting this innovation). On the MiniMoog, one of these wheels controls pitch bend (and thus has a center detent at the null position, with a spring return), and the other controls oscillator and filter modulation; more recent synthesizers can remap these controllers to any desired function. Nearly all of the expressive articulation in synthesizer solos from the progressive rock and fusion jazz bands of the 1970's was produced by turning these wheels. Although several variations have appeared over the years, such as joysticks (for example, in the old Sequential Prophet VS), the combination of ribbon controller and wheel on the Korg Prophecy, and the 2-dimensional touchpad on the new Korg Z1, the canonical MiniMoog thumbwheels are almost always included.
In the early 1970's, synthesizer keyboards became polyphonic with the invention of the digital scanning keyboard, in which each key was connected to the input of a digital multiplexer that was continually cycled and monitored to detect changes in key state. These keyboards had a convoluted history, being originally explored by Donald Buchla, then designed into the Allen Digital Organ (which morphed into the famous RMI Keyboard Computer) by Ralph Deutsch and colleagues at Rockwell International, finally reaching the mainstream synthesizer community in the keyboards designed by Dave Rossum and Scott Wedge of E-Mu Systems. When a key was pressed in one of these systems, a "voice" composed of an oscillator, envelope generator, voltage-controlled amplifier, and filter was triggered if available; different kinds of polyphonic note priority could be specified. This was done entirely in hardware in the early devices (e.g., the E-Mu 4050 keyboard, which later evolved into the controller for the Oberheim 4- and 8-Voice synthesizers), then by a microprocessor in the 1977 E-Mu 4060, which was adapted a year later for the famous Sequential Prophet 5. Today, of course, keyboard scanning, note allocation, synthesis, and signal processing are all performed digitally by firmware running on the ASICs and processors embedded in modern synthesizers.
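A scanning keyboard of this kind is essentially a polling loop plus a voice allocator. The sketch below is a hypothetical illustration, not any manufacturer's firmware; it uses oldest-note voice stealing, one of the several priority schemes such instruments offered:

```python
class ScannedKeyboard:
    """Toy digitally scanned polyphonic keyboard: each scan pass reads
    every key switch, detects state changes, and allocates a fixed pool
    of voices, stealing the oldest note when the pool is exhausted."""

    def __init__(self, n_keys, n_voices):
        self.prev = [False] * n_keys   # switch states from the last pass
        self.n_voices = n_voices
        self.active = []               # sounding keys, oldest first

    def scan(self, switches):
        """switches: list of booleans, the multiplexer's current readings."""
        events = []
        for key, (now, before) in enumerate(zip(switches, self.prev)):
            if now and not before:                 # key just pressed
                if len(self.active) >= self.n_voices:
                    stolen = self.active.pop(0)    # steal the oldest voice
                    events.append(("off", stolen))
                self.active.append(key)
                events.append(("on", key))
            elif before and not now:               # key just released
                if key in self.active:
                    self.active.remove(key)
                    events.append(("off", key))
        self.prev = list(switches)
        return events

kb = ScannedKeyboard(n_keys=4, n_voices=2)
print(kb.scan([True, True, False, False]))  # two note-ons
print(kb.scan([True, True, True, False]))   # third key steals the oldest voice
```

Real scanning keyboards did exactly this cycle fast enough that key changes appeared instantaneous; the priority rule (oldest, lowest, highest, last) was the designer's choice.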
The MIDI specification, first introduced in 1983, provides for sending several descriptive parameters along with basic pitch, finally encouraging expressive keyboards to become standard items on the commercial market. Every note is accompanied by a 7-bit velocity parameter, measured in most keyboards by clocking the time it takes a key to travel between an upper and a lower contact. Many keyboards also send aftertouch (usually derived from another switch closure as a key is pressed harder) and continuous key pressure.
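The velocity measurement can be illustrated with a short sketch. The timing range and the simple linear mapping below are invented for illustration; production keyboards use calibrated, usually nonlinear, velocity curves:

```python
def contact_time_to_velocity(dt_ms, fastest_ms=1.0, slowest_ms=40.0):
    """Map a key's travel time between its upper and lower contacts to a
    7-bit MIDI velocity (1-127). The 1-40 ms window and the linear curve
    are illustrative assumptions, not any instrument's calibration."""
    dt = min(max(dt_ms, fastest_ms), slowest_ms)          # clamp to range
    frac = (slowest_ms - dt) / (slowest_ms - fastest_ms)  # 1.0 = fastest hit
    return round(1 + frac * 126)                          # keep note-on nonzero

# Fast strikes (short contact-to-contact times) yield high velocities.
print(contact_time_to_velocity(1.0))    # hardest hit -> 127
print(contact_time_to_velocity(40.0))   # gentlest hit -> 1
```

Note the inversion: the *shorter* the measured interval, the *higher* the velocity byte, which is why a hardware timer counting between the two switch closures is all the sensing this scheme requires.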
There have been several relatively recent attempts to improve on the electronic music keyboard and open more degrees of freedom to the player. The Notebender, designed by John Allen and colleagues at Key Concepts Inc. in 1983, allows keys to move in and out after being struck, producing additional articulation. The Multiply-Touch-Sensitive keyboard, designed by Bob Moog and Thomas Rhea in the late 1980's, goes well beyond measuring key depression and pressure; it also uses capacitive pickups on each key to measure the 2-coordinate finger position over each key surface. Several commercially made analog synthesizers of the late 60's through mid 70's dispensed entirely with a mechanical keyboard and used capacitive touch plates to trigger notes. Some of the better-known examples were the portable synths from Great Britain, such as the Wasp and Gnat from Electronic Dream Plant, or the Synthi AKS from Electronic Music Studios (EMS). These devices had the image of a piano-style keyboard printed across a flat plate, under which capacitive pickups sensed finger contact through the injection of ambient pickup noise or the change in the transient response of an attached circuit. The famous synthesizer designer Donald Buchla despised the use of keyboards, anticipating the limits they would impose on the ways in which electronic instruments would be used. Many of his early interfaces employed similar capacitive touchpads (some of which also responded to pressure and position), although these were never intended to emulate a musical keyboard.
As any pianist knows, the "action" and haptic response of a given keyboard greatly affect its playability. Although the best electronic keyboards now have a passive weighted action, their feel is generally analogous to that of a low-quality acoustic piano. Active force-feedback keyboards have been pursued by various researchers, including Chuck Monte (developer of the "Miracle Piano"), Claude Cadoz and collaborators in Grenoble, and Brent Gillespie at Northwestern University and Stanford's CCRMA. These devices have position encoders and mechanical drivers on each key; hence, by programming appropriate dynamic response, they are ideally capable of emulating the feel of any keyboard, from the best concert grand to totally alien devices with "impossible" mechanics.
Recent years have seen the limited introduction of entirely different keyboard layouts as electronic music controllers. The electronics giant Motorola actually built such a controller to accompany its "Scalatron" synthesizer in 1974; two of these devices were equipped with a "generalized" keyboard designed by George Secor that sported a dense array of 240 pushbutton keys for playing microtonal music, with notes lying between those of the conventional 12-tone scale. The MicroZone, produced by Starr Labs in San Diego, CA, is a MIDI keyboard designed for microtonal music, featuring an array of 768 hexagonal keys in an 8 x 96 honeycomb matrix. Other large MIDI "button bank" and switch-panel interfaces have been built, such as the "Monolith" by Jacob Duringer of Lake Forest, CA, while others are under development, like the "Chordboard" by Grant Johnson of Fair Oaks, CA and John Allen's Bosanquet-type generalized MIDI keyboard.
One of the most impressive such interfaces was designed and built by the late Sal Martirano and his colleagues at the University of Illinois in 1969. Termed the "Sal-Mar Construction", it was a bank of 291 lightable touch-sensitive switches, connected through a massive patch bay to a combinatorial logic unit that drove a custom analog synthesizer producing up to 24 independent audio channels. The Sal-Mar was a live performance instrument, used by Martirano in many concerts and recordings. It was a very early hybrid digital/analog composition machine, with which the player would define and interact with sonic sequences in real time. Another interface of this sort was Peter Otto's "Contact", designed in 1981 as an array of 91 knobs, switches, and faders with entirely programmable functions, enabling continuous control over digitally generated music. These interfaces hearken back to the days of the large modular analog synthesizers, where there was a knob or switch (often hidden behind the patch cords) for every function. Musicians and groups such as Tangerine Dream, Klaus Schulze, and Keith Emerson actually took these huge, temperamental modular systems on the road and performed live with them, dynamically throwing patches and twiddling knobs to tweak or articulate a sound. Even though these interfaces are big, clumsy, and expensive, having all adjustments and parameters physically arrayed about the player can speed the process of sound design; in some corners of the industry (e.g., mixing boards), such a massive, parallel, tactile interface is still considered essential and only slowly yields to digital GUI (Graphical User Interface) abstraction.
Occasionally, music manufacturers have heeded the call to again provide large banks of parallel knobs and switches to give their synthesizers a more intuitive programming and control interface (editing sounds with only the standard few buttons, a knob or two, and a menu-driven LCD panel can be very difficult); an example is the PG1000, once built by Roland as a MIDI-connected front-end programmer for its D50 synthesizer. These are much less common now, having been replaced by readily available graphical editor/librarian software packages that run on personal computers, interacting with synthesizers through a set of virtual GUI controls; not nearly as fast and intuitive, but much more practical. A few manufacturers still make MIDI controllers that feature a smaller (but still significant) set of programmable faders and switches, such as the JL Cooper FaderMaster, the E-Mu Launchpad, and the Peavey PC 1600X (sporting 16 assignable faders, 16 programmable buttons, a data wheel, and provision for a pair of footpedals).
3) Percussion Interfaces
A step away from keyboard interfaces are drum controllers, which give percussionists access to the world of electronic sound. Early percussion interfaces were made during the late 1960's with simple acoustic pickups attached to surfaces that were struck; the signals from these transducers were routed through envelope followers (common items in the early modular synthesizers) that produced a voltage proportional to the strike intensity, together with a discriminated trigger pulse. These signals could then be routed to the various synthesizer modules to produce sounds. The first widely marketed drum interface was the Moog 1130 Drum Controller. This device, introduced in 1973, employed an impact-sensing resistor in the drumhead and gave audiences their first exposure to synthesized drums in the concerts of progressive rock bands such as Emerson, Lake and Palmer. Other such controllers, most featuring minimal built-in synthesizers, followed in the pre-MIDI era of the later 1970's and can be heard in much of the dance/disco music of that time, notably the Pearl synthetic drums, the Synare, the Syndrum, and the inexpensive electronic percussion interfaces from Electro-Harmonix.
Electronic percussion took a major leap in the early 80's with the designs of Dave Simmons, which combined new appealing sounds with very playable flat, elastic drumpads in what were then exotic shapes (eventually annealing into the familiar hexagon); these devices also evolved a MIDI output for driving external synthesizers. The Simmons SDX drumpads introduced the concept of "zoning", where hits of varying intensity in different areas of a single pad could trigger different sonic and MIDI events.
Nowadays, although Simmons is long vanished, nearly every musical instrument manufacturer makes electronic percussion interfaces. One of the longest lines of innovative percussion controllers arises from KAT (now distributed by E-Mu), which makes products such as electronic mallet interfaces for marimba players. Most percussion devices use Force-Sensitive Resistors (FSR's) as sensing elements, while some incorporate piezoelectric pickups. Essentially all percussion pads are acoustically damped, and radiate little direct sound. In recent years, several MIDI drum synthesizer modules (e.g., the Alesis DM series) have incorporated analog inputs for third-party percussion transducers, enabling triggers from essentially any source to produce MIDI output. By necessity, these devices are very adaptive to signals of different character; all relevant parameters (such as trigger thresholds, noise discrimination, crosstalk between pads, etc.) can be digitally adjusted and compensated through menu parameters for each transducer channel.
An interesting approach to percussion controllers and synthesis has been explored by Korg in its Wavedrum. This device goes beyond the limited information of simple trigger detection by employing the actual audio signal received by transducers on the drumhead as the excitation for the synthesis engine (various synthesis and processing algorithms are implemented), enabling a very natural and responsive percussion interface.
In recent years, the famous synthesizer innovator Donald Buchla has been directing his attention to designing new musical interfaces. One of his devices, called "Thunder", can be thought of as a very articulate percussion controller, designed to be played with bare hands. The flat surface of Thunder is split into several labeled zones of different shapes, adjusted to complement the ergonomics of the human hand. These zones respond separately to strike velocity, strike location, and pressure. Whereas the original Thunder designs employed capacitive touch sensing, later renditions use electro-optic detection of the surface membrane's deformation under hand contact.
Here at the Media Lab, we have built perhaps the world's largest percussion interface in the "Rhythm Tree", an array of over 300 smart drumpads constructed for the "Brain Opera", a big, touring, multimedia installation that explores new ways in which people can interact with musical and graphical environments. Each pad in the Rhythm Tree features an 8-bit PIC 16C71 microcontroller that analyzes the signal from a PVDF piezoelectric foil pickup and drives a large LED, both potted in a translucent urethane mold. Up to 32 drumpads share a single, daisy-chained RS-485 digital bus to simplify cabling. All pads on a bus are sequentially queried with a fast poll; if a pad has been hit, it responds with data containing the strike force, the zone of the pad that has been hit (obtained by analyzing the rising edge of the transducer waveform), and the resonant character of the hit (obtained by counting the waveform's zero-crossings). A MIDI stream is then produced, which triggers sounds and gives visual feedback by flashing the LED in the struck pad or illuminating others in the vicinity. All parameters (thresholds, modes, LED intensity) in each pad are completely and dynamically downloadable from the host computer.
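The kind of waveform analysis each pad performs can be caricatured in a few lines. The sketch below is only in the spirit of the pad firmware (the thresholds and constants are invented; the PIC's actual algorithms are not reproduced here): peak amplitude stands in for strike force, and a zero-crossing count gives a crude measure of the hit's resonant character:

```python
import math

def analyze_hit(samples, threshold=0.05):
    """Toy drumpad transducer analysis: report strike force as the peak
    amplitude and resonant character as a zero-crossing count. The
    threshold value is an illustrative assumption."""
    peak = max(abs(s) for s in samples)
    if peak < threshold:
        return None                        # below threshold: no hit
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0.0) != (b < 0.0)
    )
    return {"force": peak, "zero_crossings": crossings}

# A ringing, damped oscillation registers many crossings; a dull thud few.
ring = [math.exp(-n / 40.0) * math.sin(n / 2.0) for n in range(200)]
print(analyze_hit(ring))
```

Counting zero-crossings is attractive on an 8-bit microcontroller because it needs no multiplies: a resonant, ringing hit produces many crossings per unit time, while a damped thud produces few.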
Another interesting interface that began life as a percussion controller was computer music pioneer Max Mathews' "Daton", in which a sensitive plate responded to the location and force of a strike. The strike location was determined by measuring differential force with four pressure sensors at the plate's corner supports. Mathews then collaborated with his colleague at Bell Labs, Bob Boie, who evolved this device into one of the best-known modern interfaces in academic music, the "Radio Baton". This instrument is played with two batons, one in each hand. The 3D location of each baton above a sensitive platform is determined by measuring the signal induced through capacitive coupling between oscillators driving transmit electrodes in the batons and shaped receive electrodes in the platform. The signal from each baton is synchronously detected in the pickup electronics to reduce noise and extend the measurement range.
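Synchronous detection of this sort can be sketched numerically. The following toy lock-in demodulator (frequencies and amplitudes are invented for illustration) shows how multiplying by references at the transmitter frequency and averaging recovers a weak coupled carrier while rejecting off-frequency interference:

```python
import math

def lock_in(signal, f_ref, fs):
    """Toy synchronous (lock-in) detector: correlate the input with
    in-phase and quadrature references at the transmit frequency and
    average, recovering the coupled carrier's amplitude."""
    n = len(signal)
    i_sum = sum(s * math.cos(2 * math.pi * f_ref * k / fs)
                for k, s in enumerate(signal))
    q_sum = sum(s * math.sin(2 * math.pi * f_ref * k / fs)
                for k, s in enumerate(signal))
    # Magnitude of the demodulated component; the 2/n factor restores amplitude
    return 2.0 / n * math.hypot(i_sum, q_sum)

# A weak 1 kHz carrier buried under a much larger 60 Hz interferer:
fs = 10000.0
sig = [0.01 * math.sin(2 * math.pi * 1000.0 * k / fs)
       + 1.0 * math.sin(2 * math.pi * 60.0 * k / fs) for k in range(10000)]
print(lock_in(sig, 1000.0, fs))   # close to the 0.01 carrier amplitude
```

Because the averaging integrates the interferer over many of its own cycles, the 60 Hz component contributes almost nothing, which is the essence of why synchronous detection extends the capacitive measurement range.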
A similar interface has been designed by Donald Buchla. Termed the "Lightning", this is an optical tracker that measures the horizontal and vertical positions of a pair of wireless wands, each with a modulated IR LED at its tip (several musicians and researchers use the Lightning controller; for an example showing an application, see Jan Borchers' Worldbeat installation). The current version of this system is specified to sense across a region 12 feet high by 20 feet wide. A highly interpreted MIDI output stream can be produced, ranging from simple coordinate values through complicated responses to location changes, beats, and gesture detection. Other researchers have explored related optical interfaces; e.g., the Light Baton by Bertini and Carosi at the University of Pisa and the IR baton of Morita and colleagues at Waseda University both use a CCD camera and frame grabber to track the 2D motion of a light source at the tip of a wand in real time.
A series of batons have been produced that sense directional beats via an array of accelerometers. These include 3-axis devices such as the baton built by Sawada and colleagues at Waseda University, the MIDI baton designed by David Keane and collaborators at Queen's University in Kingston, Canada, and a commercial dual-wand MIDI unit called the "Airdrum", made by Palmtree Instruments in 1987. Even simpler devices have been built with inertial switches replacing the accelerometers; these establish momentary contact when the baton velocity changes sign in the midst of a beat. Examples are the Casio SS1 Soundsticks and other such devices that have appeared on the toy market.
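A beat detector of this general flavor can be sketched in a few lines; the threshold logic below is illustrative only, not the algorithm of any of the batons mentioned:

```python
def detect_beats(accel, threshold=2.0):
    """Toy directional-beat detector for one accelerometer axis: arm on a
    large acceleration (a stroke in progress), then flag a beat when the
    signal falls back near zero, i.e., at the sharp reversal that ends a
    conducting stroke. Threshold values are illustrative assumptions."""
    beats = []
    armed = False
    for i, a in enumerate(accel):
        if abs(a) > threshold:
            armed = True                 # stroke detected, wait for reversal
        elif armed and abs(a) < 0.1:
            beats.append(i)              # deceleration back through zero
            armed = False
    return beats

# Acceleration spikes during the stroke, then collapses at the beat point.
print(detect_beats([0.0, 0.5, 3.0, 1.0, 0.05, 0.0, -0.5]))  # beat at sample 4
```

An inertial switch performs the same arm-and-reversal trick mechanically, closing its contact only at the moment the wand's motion changes sign.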
The "Digital Baton", which we have built at the Media Lab, incorporates both of the sensor modes mentioned above; i.e., it tracks the horizontal/vertical position of an IR LED at the baton tip for precise pointing (using a synchronously demodulated PSD photosensor to avoid problems with latency and background light) and uses a 3-axis 5G accelerometer array for fast detection of directional beats and large gestures. It also features 5 FSR's potted in the urethane baton skin for measuring distributed finger/hand pressure. This multiplicity of sensors enables highly expressive control over electronic music; the performer can "conduct" the music at a high level, or descend into a "virtuoso" mode, actually controlling the details of particular sounds. We have used this baton in hundreds of performances of the Brain Opera at several worldwide venues.
4) Assimilating the Guitar
Stringed instruments are highly expressive and complex acoustic devices that have followed a long and difficult path into the world of electronic music controllers. The popularity of the guitar in modern music has given it considerable priority for being assimilated into the world of the synthesizer, and a look at the history of the guitar controller aptly reflects the evolution of signal-processing technology.
The world saw the birth of an important and popular electronic instrument when guitars were mated to electronic pickups back in the 1930's. Although it was an extreme break with tradition, pioneered and explored by innovative musicians like Charlie Christian and Les Paul, many of the new instrument's sounds lay latent, as it were, until the 1960's, when Jimi Hendrix and other contemporary guitarists turned up the volume and started exploring the virtues of distortion and feedback. The electric guitar then became part of a complex driven system, in which the timbral quality and behavior of the instrument depend on a variety of external factors, e.g., distance to the speakers, room acoustics, and body position. Musicians explored and learned to control these additional degrees of freedom, producing the very intense, kinetic performance styles upon which much of modern rock music is based.
The next stage in the marriage of electronics and the guitar resulted in an array of analog gadgets and pedals that modified the guitar pickup signal directly; these included wah-wah's (sweeping bandpass filters), fuzzboxes (nonlinear waveshaping and limiting), flangers (analog delays and comb filters), octave dividers, and various other dedicated processors. One of the most unusual was "The Bag" by Kustom; essentially a brute-force vocoder, it injected the guitar signal directly into the player's mouth through a small speaker tube, then picked it up with a nearby vocal microphone. By shaping the mouth into different forms, the player could dictate the timbral characteristics of the guitar sound through the dynamic oral resonances, resulting in the familiar "talking guitar" effect.
The first stages in the melding of guitars and synthesizers were experiments with running guitar signals through envelope followers and processing devices such as filters in the old modular synthesizers. Designers then eagerly assailed the next step in this union, namely extracting the pitch of the guitar so it could drive an entirely synthesized sound source and become a true "controller". This task has proven quite difficult and is still a challenge to do quickly, accurately, and cheaply, even with today's technology. The problems come in at several levels; e.g., noise transients included with the attack of the sound, the potential need for several cycles of a steady-state waveform for robust pitch determination, dealing with variabilities in playing style, and the difficulty of separating the sounds and coupling effects from the different strings. The so-called "guitar synthesizers" of the mid-1970's were mainly monophonic analog devices (allowing only one note to be played at a time) that were unreliable and technical disasters. The Avatar, for instance, was essentially the first commercial guitar synthesizer, often credited with hastening the demise of the Arp Synthesizer Corporation. Rather than interface to an actual guitar, the commercial successes of that era shaped a portable keyboard roughly along the lines of a guitar and slung it over the neck of a mobile performer; the right hand played the keys, while the left hand adjusted knobs, sliders, etc. to articulate the sounds. Although this let keyboardists prance around the stage in the style of guitarists (the fusion player Jan Hammer was perhaps the best-known early proponent of this type of interface), it was by no means a guitar controller, and still required keyboard technique.
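To illustrate why pitch extraction is inherently slow, especially for low notes, here is a minimal autocorrelation pitch detector; this is a generic textbook method sketched for illustration, not the algorithm used by any of the products described here:

```python
import numpy as np

def estimate_pitch(signal, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental frequency by autocorrelation."""
    signal = signal - np.mean(signal)
    corr = np.correlate(signal, signal, mode="full")
    corr = corr[len(corr) // 2:]           # keep non-negative lags only
    lag_min = int(sample_rate / fmax)      # shortest period considered
    lag_max = int(sample_rate / fmin)      # longest period considered
    best_lag = lag_min + np.argmax(corr[lag_min:lag_max])
    return sample_rate / best_lag

# A clean 220 Hz "string" (fundamental plus one harmonic), ~10 periods:
sr = 44100
t = np.arange(2048) / sr
tone = np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 440 * t)
f0 = estimate_pitch(tone, sr)              # close to 220 Hz
```

Even this idealized detector needs at least a couple of steady-state periods in its buffer; a bass low E near 41 Hz has a period of roughly 24 ms, so waiting for two clean cycles already introduces an audible delay, which is the root of the tracking latency these early devices struggled with.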
The late seventies and early eighties saw the widespread adoption of the hexaphonic pickup; a magnetic pickup with one coil for each string, thus producing 6 independent analog outputs, and mounted very close to the strings and bridge to avoid crosstalk. This, together with better pitch extraction circuitry, enabled the design of polyphonic guitar synthesizers such as the 360-Systems devices designed by Bob Easton and collaborators and the well-known Roland GR500 and GR300 series; although they could be slow and quirky, they gained some degree of acceptance in the community, being adopted by guitar giants such as Pat Metheny and Robert Fripp.
In the mid-80's, the guitar controller began evolving significantly away from its familiar form, with many devices being developed that weren't guitars at all, but enabled musicians with guitar technique to gain fast, expressive control over MIDI synthesizers. These avoided pitch tracking altogether, and merely detected the playing parameters by directly sensing the fretting finger position (typically by membrane switches or resistive/capacitive pickups on the fretboard or between string and fretboard), pitch bend (measuring strain or displacement of the strings), and the dynamics of the string being plucked (from the amplitude of the audio signal produced by each string; the pitch is never determined). Perhaps the most famous was the SynthAxe, invented by Bill Aitken. This device actually sported two sets of strings: one a set of short lengths across the guitar body used to detect picking, the other running down the fretboard for determining pitch, as described above. With its faster response, the SynthAxe was adopted by several well-known power jazz guitarists, notably Allan Holdsworth. It was very heavy and expensive, however, and thus rapidly spawned a related set of much more affordable controllers, such as the Suzuki XG and Casio DG series, which retained the short plucking strings but dispensed with the strings down the fretboard, directly sensing finger pressure there instead. Only one such device is currently in production, namely the ZTAR from Starr Labs in San Diego, CA. All of these controllers feature several additional means of generating data from the guitar body; e.g., whammy bars (directly producing MIDI events), sliders, buttons, touchpads, joysticks, etc. Much simpler versions of these designs have appeared in toy products (such as the Virtual Guitar from Ascend Inc. in Burlington, MA), some of which are still marketed.
Another set of guitar controllers were introduced in the mid-late 80's that likewise didn't detect the pitch directly, but used yet another technique to sense the fretting positions. The Beetle Quantar and Yamaha G10 launched an ultrasonic pulse down the strings from the bridge; when a string was held against a metal fret, this pulse would be reflected back to the bridge, where it would be detected and the fretting position determined by the acoustic time-of-flight. An additional optical sensor on each string detected the lateral string position (thus pitch bend), and electromagnetic pickups determined the amplitude dynamics. Optical pickups were used exclusively on another guitar controller called the "Photon", developed by K-Muse in 1986. Here, the standard magnetic pickup was replaced with an IR sensor that detected the string vibration, enabling nonconductive nylon strings to be used.
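The time-of-flight arithmetic behind the ultrasonic fret-sensing scheme is simple; a sketch, where the propagation speed is an assumed illustrative value (the true speed depends on the string's material and tension):

```python
def fret_position_m(round_trip_s, wave_speed=5000.0):
    """Distance from the bridge to the fretting point, from the round-trip
    time of an ultrasonic pulse reflected back along the string.
    wave_speed is an assumed propagation speed, not a measured value."""
    return wave_speed * round_trip_s / 2.0   # pulse travels out and back

d = fret_position_m(200e-6)   # a 200 us echo -> 0.5 m at the assumed speed
```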
The guitar controllers from Zeta Music were interesting hybrid approaches of considerable renown that appeared in the late 1980's. These were actual guitars with a multimodal MIDI interface, culminating in the Mirror 6, which featured a wired fretboard for determining pitch, a capacitive touch detector on each string for determining the expected acoustic damping on strings contacted but not pressing the fretboard, hex pickups for determining amplitude and pitch bend, accelerometers for measuring the instrument's rigid-body dynamics (e.g., shaking), plus an instrumented whammy bar and other tactile controls. Although they no longer produce guitars (having been bought and subsequently spun off by the guitar giant Gibson), Zeta still makes other MIDI string instruments, as described later.
In recent years, as signal processing capability has improved, there has been a shift away from the dedicated MIDI guitar controllers described above and back toward retrofits for existing, standard electric guitars that now identify the playing features by running real-time DSP algorithms on the pickup signals. These systems generally consist of a divided hex pickup (still mainly magnetic, although some are beginning to employ contact piezoelectric transducers, which produce a more robust signal and work with nonmetallic strings) and an interface unit that runs the pitch and feature extraction software. Examples are the Yamaha G50 and an interesting new device called the "Axon" controller from Blue Chip Systems, which employs a neural network to learn the playing characteristics of an individual player (the Axon claims to be able to reliably determine pitch after a single period of the string frequency; because it has learned the picking style of the player, it is said to be able to use the first, "noisy" period immediately after picking). These controllers, once properly calibrated (often still a nontrivial operation), are said to track quickly and reliably, plus estimate other parameters from the string signals, such as the longitudinal picking position. Some claim to respond quickly enough to enable good performance with a bass guitar, where the string oscillation period is much longer.
Nonetheless, it's generally accepted that the MIDI standard is inadequate for handling the wealth of data that most acoustic instruments (especially the strings) can produce. While a guitar performance can be somewhat shoehorned into a set of features that fit the 31.25 kbaud, 7-bit MIDI standard, there's insufficient bandwidth to continually transmit the many channels of detailed articulation these instruments generate. Several solutions have been suggested, such as the currently-dormant ZIPI interface standard, proposed several years ago by the CNMAT center at UC Berkeley and Zeta to supersede MIDI. A route many manufacturers seem to be pursuing now is to depart from the modular MIDI standard, and utilize the detailed, fine-grained features from their proprietary guitar interface in a synthesis engine housed together with the guitar controller unit. If desired, a subset of these parameters can be projected through a MIDI output, but the synthesis algorithms running on the native synthesizer have access to the high-bandwidth data coming right off the guitar, enabling very responsive sound mapping.
6) Other Strings; Wiring the Classics
The orchestra and synthesizer inhabited entirely independent spheres during their early courtship; the electronic sound systems knew nothing about what the instruments were playing. Musicians essentially kept time to a tape recording, or prerecorded sequence. Over the past decades, this relationship has become much more intimate. The most well-known large-scale examples of computer/orchestra integration were provided by Giuseppe Di Giugno's 4X synthesizers, developed at IRCAM in Paris during the 1980's. The 4X was used by many composers, including Pierre Boulez, to analyze and process the audio from acoustic ensembles, enabling them to produce real-time synthesized embellishment. This trend has continued as more fluent interfaces develop between computers and classic orchestral instruments. The electronic music generator now becomes a virtual accompanist, able to adapt and respond to the detailed nuance of the individual musicians.
The early electronic interfaces for bowed string instruments processed sound from pickups, adding amplification and effects (a famous example is Max Mathews' violin, which used piezoceramic bimorph pickups with resonant equalization). In general, more traditional stringed instruments (e.g., violin, viola, cello) have followed the guitar along a similar, although less trodden, path toward becoming true electronic music controllers. Although the complicated and dynamic nature of a bowed sound makes fast and robust pitch tracking and feature extraction difficult, many researchers have developed software with this aim; commercial MIDI pitch-tracking retrofits are also manufactured for violins, violas, and cellos. Zeta, for instance, has built a full line of MIDI stringed instrument controllers over much of the last decade, and still manufactures MIDI controller electronics and retrofit pickups for both its own and third-party instruments. Motivated by the vast world of sonic expression open to a skilled violinist, many researchers go beyond analyzing the audio signals, and build sensor systems to directly measure the bowing properties. Chris Chafe, of the CCRMA at Stanford, has measured the dynamics of cello bows with accelerometers and the Buchla Lightning IR tracker. Peter Beyls, while at Brussels University in 1990, built an array of IR proximity sensors into a conventional acoustic violin, measuring fingering position and bow depression. Jon Rose, together with designers at STEIM, Amsterdam, built interactive bows with sonar-based position tracking and pressure sensors on the bow hair.
Here at the Media Lab, we have designed several systems for measuring performance gesture on bowed string instruments. These efforts began with the Hypercello, designed by Neil Gershenfeld and Joe Chung in 1991. In addition to analyzing the audio signals from each string and measuring the fretting positions with a set of resistive strips atop the fingerboard, the bow position and placement were measured through capacitive sensing, which is much less sensitive to background than most optical or other techniques. A 50 kHz signal was broadcast from an antenna atop the bridge, and received at a resistive strip running the length of the bow. Transimpedance amplifiers and synchronous detectors measured the induced currents flowing from both ends of the bow; their difference indicated the transverse bow position (i.e., the end closer to the transmitter produced proportionally stronger current), while their sum indicated longitudinal bow placement (the net capacitive coupling decreases as the bow moves down the violin, away from the transmitter). In addition, a deformable capacitor atop the frog, where the bowing finger rests, measured the pressure applied by the player, and a "Dexterous Wrist Master" from Exos, Inc. (now part of Microsoft) measured the angle of the bow player's wrist. This setup originally premiered at Tanglewood, where cellist Yo-Yo Ma used it to debut Tod Machover's hyperinstrument composition "Begin Again Again..."; it has since appeared in over a dozen different performances of Machover's music at various worldwide venues.
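The current-balance arithmetic described above can be sketched in a few lines; the variable names, strip length, and current values here are illustrative assumptions, not the actual Hypercello calibration:

```python
def bow_position(i_tip_a, i_frog_a, strip_length_m=0.7):
    """Current-balance sketch: the difference of the currents drawn from
    the two ends of the bow's resistive strip localizes the bridge
    transmitter along the bow, while their sum (the net capacitive
    coupling) falls as the bow moves away from the transmitter."""
    total = i_tip_a + i_frog_a
    fraction = i_tip_a / total       # 0 = at the frog end, 1 = at the tip
    return fraction * strip_length_m, total

# Illustrative currents: coupling three times stronger at the tip end
transverse_m, coupling = bow_position(i_tip_a=3.0e-6, i_frog_a=1.0e-6)
```

Normalizing the difference by the sum makes the transverse estimate insensitive to the overall coupling strength, which is why the sum can be read out separately as a longitudinal placement cue.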
This bow was wired to the signal conditioning electronics; the cable did not interfere significantly with the playing style of most cellists. We have since developed a wireless bow tracker for use with a violin, where a tether can much more significantly perturb the player. Here, we used three small battery-powered transmitters located on the bow, and a receive electrode on the violin, above the bridge. Two of these were CW transmitters broadcasting at different frequencies (50 and 100 kHz), driving either end of the resistive strip. The balance in the components of the received signal at these frequencies indicated the bow position and placement, as with the current-balance scheme in the cello bow. An FSR placed below the player's bow grip caused the frequency of the third oscillator to vary with applied pressure; a PLL (phase-lock-loop) in the receive electronics tracked these changes, producing pressure data. This system has likewise appeared in several concerts, played by the violinist Ani Kavafian in performances of Machover's composition "Forever and Ever".
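A sketch of the corresponding two-frequency balance, with illustrative names and values rather than the actual receiver code:

```python
def wireless_bow(a_50k, a_100k):
    """Two-frequency sketch: transmitters at 50 and 100 kHz drive opposite
    ends of the bow's resistive strip, so the balance between the two
    components seen at the bridge receiver plays the role of the earlier
    current difference, and their sum tracks longitudinal placement."""
    balance = a_50k / (a_50k + a_100k)   # 0.5 = receiver over mid-bow
    coupling = a_50k + a_100k            # falls as the bow leaves the bridge
    return balance, coupling

bal, coup = wireless_bow(2.0, 2.0)       # equal components: mid-bow
```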
Of course, inventors all over the world have been adapting technology to turn non-western stringed instruments into electronic music controllers. Perhaps one of the more extreme is Miya Masaoka's "Koto Monster", a 6-foot-long, hollow-bodied, 21-string, harp-like digital instrument that she developed at the Dutch STEIM center.
Wind instruments followed an analogous path into the domain of electronic music performance. Initially, during the late 60's and early 70's, avant-garde wind players would outfit their instruments with acoustic pickups and run the signals through synthesizers, waveshapers, and envelope followers, triggering sounds and applying distortion and effects of various sorts. Wind players quickly adopted the early pitch-to-voltage converters produced by most synthesizer manufacturers (e.g., Moog, EMS); as wind instruments are essentially monophonic, they were well-suited to driving the single-voice synthesizers of the time, and although these pitch extractors could be readily confused by harmonics, attack transients, and artifacts of expressive playing, some degree of playability could be attained.
In the early 70's, the electronic wind instrument started its metamorphosis into a soundless controller, where the valves and stops became switches, and mouthpieces and reeds were replaced with bite and breath sensors. As in the case of the wired guitar controllers outlined above, this bypassed the need for intermediate pitch extraction, and the player's gestures could be immediately mapped into dynamic synthesis and audio parameters. The first of these devices to gain any notoriety was the Lyricon Wind Synthesizer Driver, made by a Massachusetts company called Computone. This device produced voltages from fingering, lip pressure, and breath flux (measured by a hot-wire anemometer adapted from a small light bulb) that could drive an analog synthesizer; it was initially packaged just as a controller, but a small dedicated analog synthesizer was included with subsequent models to enable stand-alone performance. Envelope generators were not generally used with such wind controllers; the breath controller and lip/bite sensor signals were applied directly to control the amplitude and timbral dynamics, creating the level of intimate sonic articulation that wind players are used to expressing.
During the later 70's and early 80's the trumpeter/inventor Nyle Steiner, long working on electronic wind interfaces, developed two of the best-known devices, the Electronic Woodwind Instrument (EWI) and Electronic Valve Instrument (EVI). The EWI has the fingering protocol of a saxophone, while the EVI is designed for trumpet players. In addition to breath and lip pressure sensors, these instruments featured capacitive touch keys for fast pitch fingering, touch plates and levers for adding portamento, vibrato and other effects, and rollers for transposing and sliding pitch. The synthesizer/audio manufacturer Akai began producing these instruments in the late-1980's, packaging the controller with an analog synthesizer and MIDI interface. They still produce a version of the EWI today, and as many purists feel that MIDI can't adequately convey the streams of continuous data produced by this device, it remains optionally packaged with an analog synthesizer.
Yamaha has played an important role in digital wind interfaces, introducing a breath controller (a device which dynamically senses breath pressure) with its pioneering DX-7 FM synthesizer in the early 1980's, opening up another channel of articulation in what was essentially a keyboard instrument. In the later 1980's, they introduced the first real MIDI wind controller, the WX-7, with fingering switches laid out in a saxophone protocol, breath and lip sensors, a pitch wheel, and a set of control buttons; this device has now evolved into the currently-produced WX-5. As a commercial manufacturer pioneering techniques such as physical modeling and waveguide algorithms in their VL-series synthesizers, Yamaha has designed many sound patches for these devices that require breath or wind controllers for fullest expression.
Many other wind controllers have been made by other manufacturers and researchers; for example, Casio has produced the inexpensive "DH" series of hornlike controllers, Martin Hurni of Softwind Instruments manufactures the Synthophone, and John Talbert of Oberlin College has built the MIDI horn. Perry Cook and Dexter Morrill have explored new concepts in brass synthesis controllers at the CCRMA at Stanford, where they mounted acoustic and pressure transducers at various points in standard brass instruments, plus monitored valve position, added additional digital controls, and applied new algorithms for realtime pitch and feature extraction from the audio stream.
Some devices in this family have evolved far from their parents; for instance, the STEIM performer Nicolas Collins has turned a trombone into a multimodal performance controller by putting a keypad onto an instrumented slide, and using this to control, launch, and modify a variety of different sounds; the trombone is never "played" in the conventional sense. An altogether different kind of wind synthesizer has been built by California-based Ugo Conti. His "whistle synthesizer" is essentially a signal processing device attached to a microphone into which one whistles; by adjusting its extensive array of sliders (at least one is accessible to each finger on both hands), the sound of the whistle can be dynamically warped and modified through sub-octave generators and delay lines.
The human voice is certainly an extremely expressive sound source, and has long been integrated into electronic music. For obvious reasons, however, it is quite difficult to abstract the voice mechanism away from the sonic output, as was pursued in the guitar and wind controllers discussed above. Although it may be possible to get some real-time information on vocal articulation from, for instance, EMG muscle sensors or video cameras and machine vision algorithms analyzing facial motion, essentially all voice-driven electronic sound comes from processing the audio signals picked up by microphones of one sort or another. Virtually every signal processing trick ever developed has been used on the human voice in the name of music, perhaps the most common being reverb, echo, pitch shifting, chorusing, ring modulation, harmonizing, and vocoding (filtering one sound with the spectral content of another). Over the last decade, many signal processors have been specifically designed for altering real-time vocals (e.g. the DigiTech Vocalizer series are recent well-known examples), and several pitch-to-MIDI converters have been optimized for the human voice (some old classics are the Fairlight VoiceTracker and the IVL Pitchrider; nowadays, many musicians use the pitch trackers in guitar interfaces for generating MIDI from vocals). As the voice expresses many musical characteristics that go far beyond simple pitch, some groups, following the directions taken in speech research, have written real-time computer software to dynamically analyze the human voice into a variety of musically interesting parameters, in some cases using these quantities to drive complicated models (including those of other musical instruments or other voices).
For example, Will Oliver, here at the MIT Media Laboratory, has taken this approach in the Brain Opera's "Singing Tree", a realtime device that breaks the singing voice into 10 different dynamic parameters, which are then used to control an ensemble of MIDI instruments that "resynthesize" the character of the singing voice, but with entirely different sound sources. Academic research is rich with work on realtime and "batch" voice processing; witness, for instance, the well-known composition "Lions are Coming" by James (Andy) Moorer done at CCRMA, or the LPC work of Paul Lansky at Princeton.
9) Noncontact Gesture Sensing
In recent years, more musical devices are being explored that exploit noncontact sensing, responding to the position and motion of hands, feet, and bodies without requiring any kind of controller to be grasped or worn. Although these interfaces are seldom played with as much precision as the tactile controllers described earlier, with a computer interpreting the data and exploiting an interesting sonic mapping, very complicated audio events can be launched and controlled through various modes of body motion. These systems are often used in musical performances that have a component of dance and choreography, or in public interactive installations. Many sensing technologies have been brought to bear on these interfaces, from machine vision through capacitive sensing. As each have their advantages and problems, frequently the chosen sensing mechanism is best tailored to the desired artistic goals and the constraints imposed by the anticipated performance environment.
The Theremin, developed in the early 1920's by the Russian physicist, cellist, and inventor Leon Theremin, was a musical instrument with a radically new free-gesture interface that foreshadowed the revolution that electronics would perpetrate in the world of musical instrument design. The technological basis of his instrument is very simple. The pitch waveform is generated by heterodyning two LC oscillators. Because their free-running frequencies (in the range 100 kHz - 1 MHz) are adjusted to be relatively close to one another, their detected beats are in the audio band. One of these oscillators is isolated, providing a relatively stable reference. The other oscillator is coupled to a sensor plate; when a player moves a hand close to this plate, their body capacitance adds to that of the attached oscillator, correspondingly altering its frequency and producing a pitch shift in the detected heterodyned audio beat. Another sensor plate similarly changes the frequency of yet another oscillator; however, a filter network causes the amplitude of this oscillator to likewise vary with frequency (hence hand position). This signal is amplitude-detected, and used to analog-gate the audio heterodyne beat, thereby producing a change in loudness as a hand approaches this second plate. The Theremin is thus played with two hands moving freely through the air above these plates, one controlling pitch and the other determining amplitude.
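The heterodyning principle can be sketched numerically; the oscillator frequencies below are illustrative values within the stated 100 kHz - 1 MHz range, not those of any actual Theremin:

```python
import numpy as np

# Two near-RF "oscillators"; the hand's capacitance would detune f_var.
sr = 4_000_000                       # sample rate high enough for RF tones
t = np.arange(sr // 100) / sr        # 10 ms of signal
f_ref, f_var = 500_000, 500_440      # Hz; detuned 440 Hz apart

# Heterodyne: multiply the oscillators, then low-pass away the sum term.
mixed = np.sin(2 * np.pi * f_ref * t) * np.sin(2 * np.pi * f_var * t)
kernel = np.ones(200) / 200          # crude 50 us moving-average low-pass
audio = np.convolve(mixed, kernel, mode="same")

beat_hz = abs(f_var - f_ref)         # the audible difference tone: 440 Hz
```

Note how a tiny fractional detuning of the variable oscillator (here under 0.1%) swings the audible beat across the whole musical range, which is what makes the instrument so sensitive to small hand motions.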
The Theremin was a worldwide sensation in the 20's and 30's. RCA commercially manufactured these instruments, and several virtuoso performers developed, the most famous being Clara Rockmore. Robert Moog began his electronic music career in the 1950's by building Theremins, which had by then descended into more of a cult status, well away from the musical mainstream. Theremins are once again attaining some notoriety, and Moog (through his present company, Big Briar) and others are again producing them.
Here at the Media Lab, we have developed many musical interfaces that generalize capacitive techniques, such as used in the Theremin, into what we call "Electric Field Sensing". The Theremin works through what we call "Loading Mode"; i.e., it essentially detects current pulled from an electrode by a nearby capacitively-coupled body. We have, however, based our devices on other modes of capacitive sensing that provide more sensitivity and longer measurement range; namely "transmit" and "shunt" modes, as described below.
By putting the body very close to a transmit electrode (i.e., sitting or standing on a transmitter plate), the body, being quite conductive, essentially becomes an extension of the transmit antenna. The signal strength induced at a receiver electrode, tuned to the transmitter frequency, thus increases with the body's proximity, building as the capacitive coupling grows stronger. By placing an array of such receiver electrodes around a person attached to a transmitter, the range to his hands, feet, and body can thus be determined at several points in space. Gestural information is then obtained through simple processing of the data from these receive electrodes.
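As a rough illustration of this inversion, one might model the received amplitude as falling with the square of distance; both the exponent and the scale factor are assumptions here, and a real system would calibrate each electrode individually:

```python
def limb_ranges(amplitudes, k=1.0):
    """Invert received electrode amplitudes into rough distances,
    assuming amplitude ~ k / distance**2. The model is an assumed
    illustration, not the Media Lab's calibration."""
    return [(k / a) ** 0.5 if a > 0 else float("inf") for a in amplitudes]

# Stronger received signals correspond to nearer limbs:
ranges = limb_ranges([4.0, 1.0, 0.25])   # [0.5, 1.0, 2.0]
```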
Our sensor chair exploits this "transmit mode". An electrode mounted under the seat drives the body with the transmit signal (a few volts at 50 kHz; well below any environmental or broadcast regulation), and pickup electrodes mounted in front of the performer and on the floor of the chair platform respond to the proximity of his hands and feet (halogen bulbs mounted near the pickup electrodes are driven to give a visual indication of the detected signals). We have used this chair in several different musical performances, mapping the motion of hands and feet into various musical effects that trigger, modify, and otherwise control electronic sound sources. We have adapted this design to other configurations; for instance, the Gesture Wall, used at the Brain Opera, dispenses with the chair, and transmits into the body of standing players through their shoes, with a simple servo circuit adjusting the transmitter amplitude to compensate for the wide range in shoe impedances (before starting, a player must first put his or her hand on a calibration plate, which acts as a reference capacitor). Data from the Gesture Wall receivers, which surround a projection screen, enable the performer to control a musical stream and interact with the projected graphics via free body motion. The receivers, mounted on goosenecks, are pieces of copper mesh in a urethane mold, surrounding a large LED that is driven with the detected signal, again for direct visual feedback.
Another electric-field-sensing mode that we have explored is termed "shunt mode". This is when the body, unattached to any electrodes, exhibits a dominant coupling through ambient channels to the room ground. Then, as hands, feet, etc. move between transmit and receive electrodes, the received signal drops, as the body effectively shields the receiver from the transmitter. Although accurate tracking can be more difficult here, we have made several musical gesture interfaces out of shunt-mode electrode arrays. Perhaps the most notorious of these is the Sensor Mannequin. Designed in collaboration with the Artist Formerly Known as Prince, there are several electrodes embedded in this device, creating descriptive MIDI streams as the body approaches various zones. A simpler embodiment is the "Sensor Frame"; an open rectangular structure made from PVC pipe, with copper electrodes at the corners and midway along the horizontal edges.
Several research labs and commercial products have exploited many other sensing mechanisms for noncontact detection of musical gesture. Some are based on ultrasound reflection sonars, such as the EMS Soundbeam and the "Sound=Space" dance installation by longtime Stockhausen collaborator Rolf Gehlhaar. These generally use inexpensive transducers similar to the Polaroid 50 kHz electrostatic heads developed for auto-focus cameras, and are able to range out to distances approaching 35 feet. A multichannel, MIDI-controlled ranging sonar system for interactive music has been developed at the MIT Media Lab using a 40 kHz piezoceramic transducer, which, in contrast to the Polaroid systems, exhibits a much wider beamwidth (although somewhat shorter sensitive range) and produces no audible click when pinged. While these sonars can satisfy many interactive applications, they can exhibit problems with extraneous noise, clothing-dependent reflections, and speed of response (especially in a multi-sensor system), thus their operating environment and artistic goals must be carefully constrained, or more complicated devices must be designed using pulse compression techniques.
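The ranging arithmetic for such reflection sonars is simply speed-of-sound time-of-flight over the round trip:

```python
def sonar_range_m(echo_delay_s, speed_of_sound=343.0):
    """Range from a reflection sonar: the ultrasonic burst travels out
    to the body and back, so distance is half the delay times c
    (c = 343 m/s in room-temperature air)."""
    return speed_of_sound * echo_delay_s / 2.0

r = sonar_range_m(0.060)   # a 60 ms round trip is roughly 10 m
```

The round-trip delay also explains the speed-of-response problem noted above: a target near the 35-foot limit ties up the transducer for over 60 ms per ping, bounding the update rate well below typical musical-gesture bandwidths when several sensors share the air.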
Infrared proximity sensors, most merely responding to the amplitude of the reflected illumination (hence not true rangefinders, as the albedo of the body will also affect the inferred distance), are being used in many modern musical applications. Examples of this are found in the many musical installations designed by interactive artist Chris Janney, such as his classic SoundStair, which triggers musical notes as people walk up and down a stairway, obscuring or reflecting IR beams directed above the stair surfaces. Commercial musical interface products have appeared along these lines, such as the "Dimension Beam" from Interactive Light (providing a MIDI output indicating the distance from the IR sensor to the reflecting hand), and the simpler "Synth-A-Beams" MIDI controller, which produces a corresponding MIDI event whenever any of eight visible lightbeams are interrupted. One of the most expressive devices in this class is the "Twin Towers", developed by Leonello Tarabella and Graziano Bertini at the CNUCE in Pisa. This consists of a pair of optical sensor assemblies (one for each hand), each containing an IR emitter surrounded by 4 IR receivers. When a hand is placed above one of these "Towers", it is IR-illuminated and detected by the 4 receivers. Since the relative balance between receiver signals varies as a function of hand inclination, both range and 2-axis tilt are determined. The net effect is similar to a Theremin, but with more degrees of sonic expression arising from the extra response to the hand's attitude.
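A minimal sketch of how four receiver signals could yield range plus 2-axis tilt; the normalization and sign conventions are illustrative guesses, not the actual CNUCE processing:

```python
def hand_pose(n, s, e, w):
    """Four IR receiver amplitudes around a central emitter: the total
    is a proximity proxy, and the normalized imbalances along the two
    axes give a crude 2-axis tilt estimate."""
    total = n + s + e + w
    tilt_ns = (n - s) / total    # > 0: hand tipped toward the "north" receiver
    tilt_ew = (e - w) / total
    return total, tilt_ns, tilt_ew

proximity, ns, ew = hand_pose(2.0, 1.0, 1.5, 1.5)   # tipped north, level east-west
```

Dividing by the total is what decouples tilt from proximity: bringing the hand straight down raises all four signals together, changing `proximity` but leaving the tilt estimates unchanged.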
Other noncontact optical tracking devices have been built, such as the "Videoharp", introduced in 1990 by Dean Rubine and Paul McAvinney at Carnegie-Mellon. This is a flat, hollow, rectangular frame, which senses the presence and position of fingers inside the frame boundary as they block the backlighting emanating from the frame edges, thereby casting a corresponding shadow onto a linear photosensor array. Appropriate MIDI events are generated as fingers are introduced and moved about the sensitive volume inside the frame, allowing many interesting mappings.
Here at the Media Lab, we have built a much larger sensitive plane, using an inexpensive scanning laser rangefinder that we have recently developed. This rangefinder (a CW phase-measuring device) is able to resolve and track bare hands crossing the scanned plane within a several-meter sensitive radius. Because the laser detection is synchronously demodulated, it is insensitive to background light. We have used this device for multimedia installations, where performers fire and control musical events by moving their hands across the scanned areas above a projection screen.
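The arithmetic behind CW phase-measuring rangefinding is compact enough to show directly. In this sketch (the general principle, with an assumed modulation frequency, not our device's actual parameters), the amplitude modulation on the beam is delayed by the round trip to the target, and the measured phase lag yields distance:

```python
import math

C = 299_792_458.0  # speed of light, m/s

def range_from_phase(phase_rad, mod_freq_hz):
    """Distance from the phase lag of an amplitude-modulated CW beam.

    The round trip 2d delays the modulation envelope by a phase
    phi = 2*pi*f*(2d/c), hence d = c*phi / (4*pi*f). Range is only
    unambiguous within half a modulation wavelength, i.e. c/(2f).
    """
    return C * phase_rad / (4 * math.pi * mod_freq_hz)
```

At a 25-MHz modulation frequency, for example, the unambiguous interval is about 6 m, comfortably covering a several-meter sensitive radius, and synchronous demodulation of the return rejects unmodulated background light.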
Although they involve considerably more processor overhead and are generally still affected by lighting changes and clutter, computer vision techniques are becoming increasingly common in noncontact musical interfaces and installations. For over a decade now, many researchers have been designing vision systems for musical performance, and steady increases in available processing capability have continued to improve their reliability and speed of response, while enabling recognition of more specific and detailed features. As the cost of the required computing equipment drops, vision systems become price-competitive, as their only "sensor" is a commercial video camera. A straightforward example of this is the Imaginary Piano by CNUCE's Tarabella. A vision system tracks the hands of a seated player, triggering a note when they move below a vertical position threshold, with pitch determined by their horizontal coordinate.
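The threshold-crossing mapping just described is easy to sketch. The function below is a hypothetical reconstruction of that style of control, not the Imaginary Piano's actual code; note range and threshold are assumptions:

```python
def imaginary_piano(x_norm, y_norm, prev_below, threshold=0.5,
                    low_note=48, n_keys=25):
    """Map a tracked hand position to a note trigger.

    x_norm, y_norm: hand position normalized to [0, 1) (y grows downward,
    as in image coordinates). A note fires only on the transition from
    above to below the horizontal threshold, so holding the hand low
    does not retrigger; pitch is quantized from the x coordinate.
    Returns (midi_note_or_None, now_below) so the caller can carry the
    state to the next video frame.
    """
    below = y_norm > threshold
    if below and not prev_below:
        note = low_note + int(x_norm * n_keys)   # leftmost hand = lowest key
        return note, below
    return None, below
```

In a real system this would run once per tracked hand per frame, with the returned state fed back in on the following frame.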
A package called "BigEye", written by Tom DeMeyer and his colleagues at STEIM, is one of the most recent video analysis environments explicitly designed for live artistic performance applications. BigEye, running in realtime on a Macintosh computer, tracks multiple regions of specified color ranges (ideally corresponding, for instance, to pieces of the performers' clothing or costumes). The output from BigEye (a MIDI or other type of data stream) is determined in a scripting environment, where sensitive regions can be defined, and different responses are specified as a function of the object state (position, velocity, etc.).
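The kernel of this kind of color-keyed tracking is locating the pixels that fall inside a specified color range and reporting where they sit. The sketch below illustrates the general idea only; it is not BigEye's implementation, and real systems add smoothing, multiple simultaneous regions, and velocity estimation:

```python
def track_color_blob(frame, lo, hi):
    """Centroid of the pixels falling inside a specified RGB color range.

    frame: 2-D list of (r, g, b) tuples; lo, hi: per-channel bounds,
    ideally bracketing the color of a performer's costume piece.
    Returns the (x, y) centroid of matching pixels, or None if nothing
    in the frame matches.
    """
    xs = ys = n = 0
    for y, row in enumerate(frame):
        for x, (r, g, b) in enumerate(row):
            if (lo[0] <= r <= hi[0] and lo[1] <= g <= hi[1]
                    and lo[2] <= b <= hi[2]):
                xs += x
                ys += y
                n += 1
    if n == 0:
        return None
    return xs / n, ys / n
```

A scripting layer can then test the returned centroid against user-defined sensitive regions and emit MIDI events as a function of position and velocity.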
Here at the MIT Media Lab, the Perceptual Computing Group, under Sandy Pentland, has explored many multimedia applications of machine vision. One of these projects, termed "DanceSpace" by Flavia Sparacino, effectively turns the body into a musical instrument, without the need for specific targets, clothing, etc. Using the "Pfinder" package developed by Chris Wren and colleagues, a human body is identified when it enters the field-of-view of a video camera, and an elliptical "blob" model is constructed of the relevant features (head, hands, torso, legs, feet) in realtime; depending on the details of the scene, update rates on the order of 20 Hz can be achieved on a moderate-capacity graphics workstation. DanceSpace attaches a set of musical controls to these features (e.g., head height controls volume, hand positions adjust the pitch of different instruments, feet fire percussive sounds), thus one essentially plays a piece of music and generates accompanying graphics by freely moving through the sensitive space. This has been applied in several dance applications, where the dancer is freed from the constraints of precomposed music, and can now control the music with their improvisational whims.
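The feature-to-music assignments can be summarized in a few lines. This is a hypothetical mapping in the spirit of DanceSpace's head-height/hand-position scheme, not its actual code; the scale choice and note numbers are assumptions:

```python
def dancespace_mapping(head_h, lhand_x, rhand_x, foot_hit):
    """Map tracked body-blob features to MIDI-style musical controls.

    head_h, lhand_x, rhand_x: normalized [0, 1] blob coordinates from
    the tracker; foot_hit: True on the frame a foot blob strikes the
    floor plane. Returns (volume_cc, lead_note, pad_note, drum_note),
    with drum_note None when no footfall occurred.
    """
    volume = int(head_h * 127)              # head height -> overall volume
    scale = [0, 2, 4, 7, 9]                 # pentatonic: free motion stays consonant
    lead = 60 + scale[int(lhand_x * len(scale)) % len(scale)]
    pad = 48 + scale[int(rhand_x * len(scale)) % len(scale)]
    drum = 36 if foot_hit else None         # kick drum on footfall
    return volume, lead, pad, drum
```

Quantizing the continuous hand positions onto a pentatonic scale is one way such systems keep improvised movement from producing dissonant clusters.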
We have built another musical environment here at the Media Lab that combines both free and contact sensing in an unusual fashion. Termed "The Magic Carpet", it consists of a 4" grid of piezoelectric wires running underneath a carpet and a pair of inexpensive, low-power microwave motion sensors mounted above. The sensitive carpet measures the dynamic pressure and position of the performer's feet, while the quadrature-demodulated Doppler signals from the motion sensors indicate the signed velocity of the upper body. The Magic Carpet is quite "immersive", in that essentially any motion of the performer's body is detected and promptly translated into expressive sound. We have designed several ambient soundscapes for it, and have often installed this system in our elevator lobbies, where passers-by stop for extended periods to explore the sonic mappings.
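Quadrature demodulation is what gives the Doppler channel its sign: the return rotates one way in the in-phase/quadrature plane for an approaching body and the other way for a retreating one. A minimal sketch, assuming an X-band (10.525-GHz) sensor and less than half a turn of phase between samples (illustrative, not the Magic Carpet's actual processing):

```python
import math

def signed_velocity(i0, q0, i1, q1, dt, f_carrier=10.525e9):
    """Signed radial velocity from two quadrature Doppler samples.

    (i0, q0) and (i1, q1) are in-phase/quadrature pairs taken dt
    seconds apart. The Doppler return rotates in the I/Q plane at
    f_d = 2*v*f/c; the rotation direction gives the sign of the
    motion (toward vs. away from the sensor).
    """
    c = 299_792_458.0
    dphi = math.atan2(q1, i1) - math.atan2(q0, i0)
    dphi = (dphi + math.pi) % (2 * math.pi) - math.pi   # wrap to (-pi, pi]
    f_doppler = dphi / (2 * math.pi * dt)
    return f_doppler * c / (2 * f_carrier)
```

With a single-channel (non-quadrature) sensor, only |dphi| would be observable, and approach and retreat would be indistinguishable.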
Yet another kind of musical controller has appeared over the last couple of decades. These are essentially "wearable" interfaces, where a sensor system is affixed to the body or clothing of a performer. A very early example of this is from composer Gordon Mumma, who inserted accelerometers and wireless transmitters into dancers' belts for performances dating from 1971. Berlin-based artist Benoit Maubrey has been designing "electro-acoustic clothing" since the early 1980's; apparel that incorporates tape recorders, synthesizers, samplers, sensors, and speakers. Some of the best-known examples of wearable controllers were likewise used during the 1980's by the New York performance artist Laurie Anderson, who pulled the triggers off a set of electronic drums and built them into a suit, enabling a percussive performance by tapping different sections of the body. Commercial companies started making such garments in the mid-80's, for instance, the "Brocton-X Drum Suit", with attachable Velcro percussion sensors and heel-mounted triggers. At the turn of the decade, Mark Coniglio, of the Dance/Theater company Troika Ranch, designed MidiDancer, a body suit instrumented with 8 sensors measuring motion and bend at various joints. The data are sent over a wireless link and converted to MIDI; the Interactor MIDI interpreter provides a GUI front-end, enabling a choreographer to quickly define multimedia events to be launched by particular dance gestures. Yamaha has recently introduced its Miburi system, consisting of a vest hosting an array of resistive bend sensors at the shoulders, elbows, and wrists, a pair of handgrips with two velocity-sensitive buttons on each finger, and a pair of shoe inserts with piezoelectric pickups at the heel and toe. Current models employ a wireless datalink between a belt-mounted central controller and a nearby receiver/synthesizer unit.
Yamaha has invented a semaphore-like gestural language for the Miburi, where notes are specified through a combination of arm configurations and key presses on the wrist controllers. Degrees of freedom not used in setting the pitch are routed to timbre-modifying and pitch-bending continuous controllers. The Miburi, not yet marketed outside of Japan, has already spawned several virtuosic, somewhat athletic performers.
In contrast to fully-instrumented body suits, some intriguing musical controllers are independently cropping up in various pieces of apparel. Laurie Anderson, again a pioneer in this area, performed a decade ago with a necktie that was outfitted with a fully functional music keyboard. Here at the Media Lab, we have recently built "musical jackets", with a touch-sensitive MIDI keyboard embroidered directly into the fabric using conductive thread. We have also made a set of "expressive footwear"; a retrofit to a pair of dance sneakers that inserts a suite of sensors to measure several dynamic parameters expressed at a dancer's foot (differential pressure at 3 points and bend in the sole, 2-axis tilt, 3-axis shock, height off the stage, orientation, angular rate and translational position). These shoes require no tether; they are battery powered for up to 3 hours and offload their data via a 20K bits/second wireless link.
Other researchers have attached electrodes directly to the body, using neurological and other biological signals to control various sound sources in some very unusual performances. Some of the best-known such works were produced by composer David Rosenboom of Mills College during the 1970's. These "biofeedback" pieces generated sounds as a function of the performers' biological states, including heart rate, galvanic skin response (GSR), body temperature, and of course EEG (brainwave) signals. In most of these pieces, a computer system would monitor these features and direct the sonic output as a function of their states and correlations. This has now become a commercial business of sorts, with products appearing on the market aimed partially at musical control applications. For example, a system marketed by IBVA Technologies consists of a sensor headband, which purports to measure brainwaves of various kinds. The headband-mounted controller wirelessly communicates with a base station that provides a MIDI output. Another, more general, device is the "Biomuse", produced by BioControl Systems, a Stanford University spin-off started by Hugh Lusted and Benjamin Knapp. The Biomuse is able to gather EMG (muscle), EOG (eye movement), EKG (heart), and EEG (brainwave) signals. It also has MIDI output and mapping capability, and has been used in musical research projects. Although the controllability and bandwidth of some of these parameters (especially the brainwaves) may be debated, new musical applications won't lag far behind as researchers at various institutes progress in extracting and identifying new and more precise bioelectric features.
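A common way such systems turn EEG into a continuous controller is to track the power in a frequency band, for instance the 8-12-Hz alpha band, and scale it to a MIDI control-change value. The sketch below is a generic, deliberately naive illustration (a direct DFT over the band, with an assumed full-scale normalization), not any vendor's algorithm:

```python
import math

def band_power(samples, fs, f_lo, f_hi):
    """Crude DFT-based power in a frequency band (e.g. 8-12 Hz alpha).

    samples: a short window of signal values; fs: sample rate in Hz.
    Sums the normalized squared magnitudes of the DFT bins whose
    frequencies fall inside [f_lo, f_hi].
    """
    n = len(samples)
    power = 0.0
    for k in range(n // 2 + 1):
        if f_lo <= k * fs / n <= f_hi:
            re = sum(s * math.cos(2 * math.pi * k * t / n)
                     for t, s in enumerate(samples))
            im = sum(s * math.sin(2 * math.pi * k * t / n)
                     for t, s in enumerate(samples))
            power += (re * re + im * im) / (n * n)
    return power

def alpha_to_cc(samples, fs, full_scale):
    """Map relative alpha-band (8-12 Hz) power to a MIDI CC value 0-127."""
    p = band_power(samples, fs, 8.0, 12.0)
    return min(127, int(round(127 * p / full_scale)))
```

Whether such a control channel is really volitional or just tracks relaxation is exactly the controllability question raised above; the mapping itself, at least, is straightforward.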
The different sensors developed for the Virtual Reality community have been rapidly pushed into musical applications; various musical researchers have worked with the magnetic tracking systems and datagloves upon which the VR world was built. For example, Jaron Lanier, a well-known pioneer in this field, is still a very active musician, incorporating a host of different VR and musical interfaces (along with traditional and ethnic acoustic instruments) into his live performances. Gloves, in particular, have appeared in many musical performances. One example, composed by Tod Machover here at the Media Lab, is "Bug Mudra", where an Exos "Dexterous Hand Master" was worn by the conductor, who had complete dynamic control over the audio mix and synthesis parameters through finger positions. Many other composers, such as Richard Boulanger at the Berklee College of Music in Boston, have used Mattel's "Power Glove" (the low-cost brother of the original VPL DataGlove, intended for the home gaming market) as a controller in several pieces.
Some of the most interesting glove and hand controllers have come from STEIM, the Dutch center for electronic performance research in Amsterdam. Michel Waisvisz's "Hands", first built at STEIM in 1984, consist of a pair of plates strapped to the hands, each equipped with keys for fingering and other sensors that respond to thumb pressure, tilt, and distance between the hands. Waisvisz has written many pieces for this expressive controller, and still uses it in performance. Ray Edgar has built a related controller at STEIM called the "Sweatstick", where the hand interfaces are now constrained to slide along a bendable rod; in addition to the keypad switches, the distance between controllers, their rotation around the rod, and the bend of the rod are all measured. Laetitia Sonami has also built a device at STEIM called the "Lady's Glove", an extremely dexterous system, with bend sensors to measure the inclination of both finger joints for the middle 3 fingers, microswitches at the end of the fingers for tactile control, Hall sensors to measure distance of the fingers from a magnet in the thumb, pressure sensing between index finger and thumb, and sonar ranging to emitters in the belt and shoe. Walter Fabeck designed the "Chromasone" while at STEIM; this likewise uses a glove to measure finger bending, together with sonar tracking to measure the position of the hand above an illuminated Lucite dummy keyboard, providing a physical frame of reference for the performer and audience. One of the most impressive things about the STEIM environment is its connection to the "streetsmart" musical avant-garde. The STEIM artists don't keep these innovative devices in the laboratory, but regularly gig with them at different performance venues and music clubs throughout Europe and around the world.
11) The Horizon...
General-purpose personal computers are rapidly becoming powerful enough to subsume much of musical synthesis. Essentially all current computers arrive equipped with quality audio output capability, and very capable software synthesizers are now commercially available that run on a PC and require no additional hardware. Over time, the software synthesis capabilities of PC's will expand, and dedicated hardware synthesizers will be pushed further into niche applications. As more and more objects in our environment gain digital identity and are absorbed into the ubiquitous user interface that technology is converging toward, controllers designed for providing generic computer input of various sorts will be increasingly bent toward musical applications. Several indications of this trend are now evident. At a low level, depending on their settings, standard operating systems tend to barrage a user with a symphony of complicated sonic events as one navigates through mundane tasks. While this is not necessarily a performance, software packages exist that use the standard computer interface as a musical input device (for example, Laurie Spiegel's Music Mouse, already available in the late 1980's, has been followed by many different packages, such as Pete Rice's "Stretchables" program developed here at the Media Lab). In the not-too-distant future, perhaps we can envision quality musical performances being given on the multiple sensor systems and active objects in our smart rooms, where, for instance, spilling the active coffee cup in Cambridge can truly bring the house down in Rio.
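To illustrate how little machinery a purely software synthesizer voice requires, here is a toy sketch: one oscillator with a decaying envelope, computed entirely on the CPU. It stands in for the commercial packages mentioned above only in spirit; real software synthesizers add polyphony, filters, and low-latency audio output.

```python
import math

def render_note(freq, dur, fs=44100, amp=0.5):
    """Render one decaying sawtooth note as a list of float samples.

    freq: pitch in Hz; dur: duration in seconds; fs: sample rate.
    A single 'software synthesizer' voice: a sawtooth oscillator
    shaped by an exponential amplitude envelope, so the note starts
    at full level and dies away over its duration.
    """
    n = int(dur * fs)
    out = []
    for t in range(n):
        phase = (freq * t / fs) % 1.0            # oscillator phase, 0..1
        env = math.exp(-3.0 * t / n)             # exponential decay envelope
        out.append(amp * env * (2.0 * phase - 1.0))  # sawtooth in -amp..+amp
    return out
```

The resulting sample list could be written to a sound file or streamed to the audio device; attaching any of the controllers surveyed in this article then reduces to mapping sensor values onto `freq`, `amp`, and their kin.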
To dig deeper...
A good history of electronic music (written by Tom Rhea) is found in The Art of Electronic Music, edited by Tom Darter and Greg Armbruster (GPI Publications, N.Y., 1984).
Joel Chadabe's Electric Sound: The Past and Promise of Electronic Music (Prentice-Hall, N.J., 1997) is an excellent account of the development of electronic music, emphasizing innovators at research centers.
Stompbox: A History of Guitar Fuzzes, Flangers, Phasers, Echoes, & Wahs by Art Thompson (Miller Freeman Books, 1997) is a wonderful tour through the vast menagerie of effects boxes concocted mainly for guitarists. In addition to covering the history of these phenomena, he has included sections on most of the manufacturers, including interviews with the founders and designers. Contains a surprising amount of technical context; a great book for those of us who have puzzled out these circuits for ourselves, and wondered where the ideas came from.
A hefty and up-to-date tour of computer music, including a chapter on interfaces, is The Computer Music Tutorial, by Curtis Roads (MIT Press, Cambridge, Mass., 1996).
On the Musical Instrument Digital Interface (MIDI) protocol, an excellent introduction is Paul D. Lehrman and Tim Tully's MIDI for the Professional, 2nd ed. (Amsco Publications, New York, 1995).
Steven M. Martin's excellent 1993 documentary "Theremin: An Electronic Odyssey" (Orion Pictures) is well worth seeing; it's also available on video.
Electric field sensing, both its history and how it is used in Media Lab instruments, is detailed in "Musical Applications of Electric Field Sensing," by the author and Neil Gershenfeld, Computer Music Journal, Vol. 21, No. 2, Summer 1997.
Details on the Media Lab's optical tracking interfaces, including the Digital Baton, an early laser rangefinder, and DanceSpace, are in "Optical Tracking for Music and Dance Performance," by the author and Flavia Sparacino, published in Optical 3-D Measurement Techniques IV, Eds. A. Gruen and H. Kahmen (Herbert Wichmann Verlag, Heidelberg, 1997), pp. 11-18.
"New Instruments and Gestural Sensors for Musical Interaction and Performance", recently written by the author, presents the technical details of the Brain Opera and related musical interfaces.
Many of the Media Lab's papers and preprints in this field can be downloaded off the Responsive Environments Group's Publication Site.
Technical summaries of the Brain Opera and other Media Lab research in instrument design can be found in the Brain Opera Web Archives. There is also a specific Brain Opera technology site.
Home pages for most of the Media Lab people mentioned here can be found off the Media Lab's staff list.
Technical and hardware research details can also be found off the Responsive Environments Group's homepage.
The Computer Music Journal is a great source for information in this field, and a good place to start web surfing for information on electronic music. In particular, their Spring 1998 issue has several articles devoted to electronic music interfaces. One of these describes the work of the Dutch ensemble "Sensorband", who use some strikingly original interfaces in their performances.
A collection of historical electronic instruments, with some sound files, is on the World Wide Web site at obsolete.com
Thierry Rochebois' site is another interesting historical summary.
A set of links to MIDI controller sites is posted at synthzone.com.
The EMUSIC-L site has a plethora of information archived.
As does the Electronic Music Foundation
Jason Barile's Theremin Home Page is an excellent summary of information and links related to Theremin and his inventions.
Likewise, Richard Moran's Theremin site has much interesting detail on this instrument and its relatives.
Lots of information on wind controllers can be found at the "Wind Controller Mailing List Home Page"
More information about research groups mentioned in this article is available at their Web sites:
The Center for Computer Research in Music and Acoustics (CCRMA)
The Center for New Music and Audio Technologies (CNMAT)
Institut de Recherche et Coordination Acoustique/Musique (IRCAM)
The Stichting voor Electro Instrumentale Muziek (STEIM)
The music technology research center at Pisa (CNUCE)
Miscellaneous sites of interest to this field:
The IEEE focus group on electronic music
Einar Ask has built some very creative electronic music interfaces.
Axel Mulder has made some interesting music/VR interfaces and has posted some very informative papers
Trimpin works more in making fascinating MIDI output devices rather than controller interfaces, but learning about his work is well worthwhile.