“I’ll play it first and tell you what it is later.”
– Miles Davis
This chapter draws conclusions, presents our contributions to the fields of music cognition and automatic music analysis-synthesis, and finally discusses future directions.
The goal of this thesis was to computationally model the process of creating music by listening to audio examples. We aimed to close and automate the life cycle of listening, composing, and performing music, feeding the system only with a song database. Our bias-free system was given generic listening and learning primitives, and was programmed to analyze the sounds and structures of music uniformly from the ground up. It was designed to combine the extracted musical parameters arbitrarily as a way to synthesize new and musically meaningful structures. These structures could then drive a concatenative synthesis module that recycles audio material from the sound database itself, finally creating an original song of convincing quality in sound and form.
Because of its highly subjective nature, it is difficult, if not impossible, to evaluate the quality of a musical production. Quality is one of those things that cannot be quantified, and it is extremely contextual. Interested readers may judge for themselves by listening to some of the preliminary and intermediary examples produced in the context of this work at: http://www.media.mit.edu/~tristan/phd/. However, the work can be considered successful with regard to synthesizing music that expands the database: the new music can be analyzed in its turn, combined with more songs, and recycled again.
Although the goal of the thesis was primarily to create music, most of the emphasis has been on analyzing and representing music through our music cognition framework (chapters 3, 4, and 5). It was indeed hypothesized that synthesis could share the same knowledge acquired from a uniform analysis procedure based on perceptual listening and learning. This has been demonstrated in section 5.4.4 with a novel musically intelligent compression algorithm, and in chapter 6 through a collection of synthesis applications, all of which rely exclusively on this common metadata representation.
This thesis is a practical implementation of machine intelligence for music analysis and synthesis. However, our aim was to build a generic perceptual model of music cognition, rather than a scattered collection of self-contained algorithms and techniques. The system was carefully designed to meet theoretical requirements (e.g., causality, psychoacoustics), and we deliberately avoided using possibly more reliable, although less justifiable, signal processing methods.
The work and examples presented in this thesis are the result of a stand-alone application named “Skeleton,” as described in appendix A. The environment combines a collection of algorithms, a player, a GUI, database management, and visualizations, which together allow testing of the applications presented in chapter 6. In itself, Skeleton is an engineering contribution that greatly facilitates the development of audio applications dealing with machine listening and machine learning technologies.
Throughout the development of this thesis work, we have generated a few hundred musical examples of varying elaboration that testify to the artistic potential of our system. Several acclaimed musicians have shown interest in using Skeleton as part of their creative process: the musical artifact can become the source material for a larger production. Others encouraged its immediate application as an interactive art piece, where the outcome is the product of an artistic act that must be attributed either to the user who biases the machine through music examples, to the programmer who built the software, to the machine itself that synthesizes unpredicted new music, or perhaps to a collaboration among these. Such appreciations lead us to believe that this thesis contributes on the aesthetic and artistic fronts as well.
It goes without saying that this work represents only a few steps in the vast field of machine intelligence for music analysis and synthesis. Much remains to be done, if only in improving the accuracy and robustness of our current algorithms. Here is a short list of the most immediate work that could be built upon our current architecture.
Whether machines can be creative is certainly not answered by this work. The machine here creates with a small “c,” but has no intention to do so. It is not able to evaluate the quality of its own work, although it can analytically compare it with that of others. It is unaware of any social or cultural context, which makes the music somewhat “meaningless.” However, the machine is faster than humans at listening to a song database and at generating original music. It can produce more with less, and is not biased the way humans are. These differences account for the usefulness and potential of computers in creating new music.