Audio, Gesture, and Music Analysis with Machines

Tod Machover
Professor of Music and Media
MIT, Media Lab

Computers give the composer access to electronically generated sounds. Although audio, gestures, and music can be analyzed to control musical systems and sound synthesis parameters, computers still provide the artist with limited creative and perceptually meaningful feedback. The data is usually gathered, analyzed, processed, scaled, and then either used internally to make sound or sent to external devices, e.g. using MIDI. The “mapping” process, however, remains simplistic and arbitrary. More reactive, adaptive, and creative musical systems could benefit from machine listening, machine learning, and evolutionary models. The computer could analyze the measured raw data (gesture data from sensors and perceptual features from audio) more thoroughly and convert it more intelligently into musically meaningful control parameters, e.g. faster, louder, active, consonant, legato. I will explore the history of electronic music processes and, focusing on the audio context, describe how new techniques for categorizing and understanding audio content, e.g. timbre, rhythm, and genre classification, can greatly benefit musical creation. Better results in the synthesis of music will rely on its proper analysis.
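As a rough illustration of the kind of direct scaling described above, the following Python sketch maps a raw reading linearly onto the 0 to 127 range of a MIDI controller value, and then shows one small step toward a more perceptually meaningful parameter by measuring an audio frame's level in decibels before scaling. The function names, the -60 dB floor, and the assumption that samples lie in [-1, 1] are illustrative choices, not part of any particular system.

```python
import math


def scale_to_midi(value, lo, hi):
    """Linearly scale a raw reading in [lo, hi] to a controller value 0..127."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clamped = min(max(value, lo), hi)  # out-of-range readings are clipped
    return round((clamped - lo) / (hi - lo) * 127)


def rms_db(frame):
    """RMS level of an audio frame in dB relative to full scale (1.0)."""
    if not frame:
        raise ValueError("frame must contain at least one sample")
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    return 20 * math.log10(max(rms, 1e-9))  # avoid log10(0) on silence


def loudness_to_midi(frame, floor_db=-60.0):
    """Map frame level to 0..127 on a dB scale (hypothetical 'louder' control).

    floor_db is an assumed silence threshold; anything quieter maps to 0.
    """
    return scale_to_midi(rms_db(frame), floor_db, 0.0)


if __name__ == "__main__":
    print(scale_to_midi(0.25, 0.0, 1.0))   # raw sensor reading, direct scaling
    print(loudness_to_midi([0.1] * 512))   # quiet frame, dB-based scaling
```

The decibel mapping is still a fixed rule; it only stands in for the simplest case of deriving a control parameter from a perceptual feature rather than from the raw signal.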
Written Requirement
The written requirement for this area will consist of a publishable quality paper to be evaluated by Professor Tod Machover.

Reading List
S. Schwanauer and D. Levitt, Machine Models of Music, MIT Press, 1993.
The Music Machine: Selected Readings from the Computer Music Journal, edited by Curtis Roads, 1989.
C. Roads, The Computer Music Tutorial, MIT Press, 1995.
D. Lee Hall, Mathematical Techniques in Multisensor Data Fusion, Artech House, 1992.
M. Abidi and R. Gonzalez, Data Fusion in Robotics and Machine Intelligence, Academic Press, October 1997.
B. Bouchon-Meunier and J. Kacprzyk, Aggregation and Fusion of Imperfect Information (Studies in Fuzziness and Soft Computing), Springer Verlag, July 1998.
P. Desain and H. Honing, Music, Mind, and Machine: Studies in Computer Music, Music Cognition, and Artificial Intelligence (Kennistechnologie), Thesis Pub, November 1992.
M. Balaban, K. Ebcioglu, and O. Laske, Understanding Music with AI: Perspectives on Music Cognition, AAAI Press, 1992.
H. Vinet, F. Delalande, Interfaces Homme-Machine et Creation Musicale, Hermes Science Publication, Paris, 1999.
J. Sloboda, The Musical Mind: The Cognitive Psychology of Music, Oxford University Press, 1985.
W. Benzon, Beethoven's Anvil: Music in Mind and Culture, Basic Books, 2001.
J. Chadabe, Electric Sound: The Past and Promise of Electronic Music, Prentice Hall, 1997.
T. Winkler, Composing Interactive Music, MIT Press, 1999.
R. Rowe, Machine Musicianship, MIT Press, 2001.
P. Cook, Music, Cognition, and Computerized Sound, MIT Press, 1999.
Live Electronics, edited by Peter Nelson & Stephen Montague, Contemporary Music Review, Vol. 6 Part 1, 1991.
F. Sparacino, Sto(ry)chastics: a Bayesian network architecture for combined user modeling, sensor fusion, and computational storytelling for interactive spaces, MIT, Media Laboratory, PhD Thesis, 2001.
G. Gargarian, The Art of Design: Expressive Intelligence in Music, PhD Dissertation, MIT Media Laboratory, 1993.
T. Marrin, Inside The Conductor's Jacket: Analysis, Interpretation, and Musical Synthesis of Expressive Gesture, PhD Dissertation, MIT Media Laboratory, 1999.
Relevant publications:
A. Hunt et al., The Importance of Parameter Mapping in Electronic Instrument Design, NIME proceedings, Dublin, 2002.
A. Camurri et al., Interactive Systems Design: a KANSEI-based Approach, NIME proceedings, Dublin, 2002.
N. Schnell and M. Battier, Introducing Composed Instruments, Technical and Musicological Implications, NIME proceedings, Dublin, 2002.
S. Jorda, Afasia: Ultimate Homeric One-man-multimedia-band, NIME proceedings, Dublin, 2002.
M. Farbood and B. Schoner, Analysis and Synthesis of Palestrina-Style Counterpoint Using Markov Chains, International Computer Music Conference, Havana, 2001.
A. Camurri, G. De Poli, M. Leman, MEGASE: a Multisensory Expressive Gesture Applications System Environment for Artistic Performances, CAST01 Conference, GMD, Bonn, 21-22 Sept 2001.
A. Camurri, G. De Poli, M. Leman, G. Volpe, A Multi-layered Conceptual Framework for Expressive Gesture Applications, Workshop on Current Research Directions in Computer Music, Barcelona, Nov 15-17, 2001.
G. De Poli with S. Canazza, C. Drioli, A. Roda, A. Vidolin, P. Zanon, Analysis and Modeling of Expressive Intentions in Music Performance, International Workshop Human Supervision and Control in Engineering and Music, Kassel, Germany, 21-24 September 2001.
L. Turicchia, G. De Poli, G. Mian, R. Nobili, Audio analysis by a model of the physiological auditory system, Proceedings DAFx2000, Verona, pp. 293-296, 2000.