PhD thesis in Media Arts and Sciences at MIT
Tod Machover, Peter Cariani, François Pachet, Julius O. Smith, Barry Vercoe
Creating Music by Listening
Machines have the power and potential to make expressive music on their own.
This thesis aims to computationally model the process of creating music using
experience from listening to examples. Our unbiased signal-based solution models
the life cycle of listening, composing, and performing, turning the machine
into an active musician rather than merely an instrument. We accomplish this
through an analysis-synthesis technique that combines perceptual and structural
modeling of the musical surface, yielding a minimal data representation.
We introduce a music cognition framework that results from the interaction of psychoacoustically grounded causal listening, a time-lag embedded feature representation, and perceptual similarity clustering. Our bottom-up analysis is designed to be generic and uniform, recursively revealing metrical hierarchies and structures of pitch, rhythm, and timbre. Training is suggested for top-down unbiased supervision, and is demonstrated with downbeat prediction. This musical intelligence enables a range of original manipulations, including song alignment, music restoration, cross-synthesis, and song morphing, and ultimately the synthesis of original pieces.
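The framework's combination of a time-lag embedded feature representation and perceptual similarity can be illustrated with a minimal sketch. The snippet below is not the thesis's implementation; it assumes a simple form of time-lag embedding (stacking each frame with its recent predecessors) and uses cosine self-similarity as a stand-in for the perceptual similarity measure. Function names and the lag parameter are hypothetical.

```python
import numpy as np

def time_lag_embed(features, lag=4):
    """Stack each frame with its `lag` predecessors.

    A hypothetical sketch of a time-lag embedding: frames near the
    start reuse frame 0 where no predecessor exists.
    features: (T, d) array of per-frame feature vectors.
    Returns a (T, d * (lag + 1)) embedded representation.
    """
    T, d = features.shape
    out = np.zeros((T, d * (lag + 1)))
    for t in range(T):
        for k in range(lag + 1):
            out[t, k * d:(k + 1) * d] = features[max(t - k, 0)]
    return out

def similarity_matrix(embedded):
    """Cosine self-similarity between embedded frames.

    A stand-in for a perceptually grounded similarity measure;
    entries lie in [-1, 1], with 1 on the diagonal.
    """
    norms = np.linalg.norm(embedded, axis=1, keepdims=True)
    unit = embedded / np.maximum(norms, 1e-12)
    return unit @ unit.T
```

Clustering rows of the resulting self-similarity matrix (for example with k-means) would then group perceptually similar moments of the signal, the step on which manipulations such as song alignment and morphing could build.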