New musical instruments

Around the year 1700, several startup ventures developed prototypes of machines with thousands of moving parts. After 30 years of engineering, competition, and refinement, the result was a device remarkably similar to the modern piano.

What are the musical instruments of the future being designed right now?

  • new composition tools
  • reactive music
  • connecting things
  • sensors
  • voices
  • brains



Ray Kurzweil’s predictions for the future, on a timeline (he predicts that the Singularity will happen in 2045):

In 1965 researcher Herbert Simon said: “Machines will be capable, within twenty years, of doing any work a man can do”. Marvin Minsky added his own prediction: “Within a generation … the problem of creating ‘artificial intelligence’ will substantially be solved.”


Are there patterns in the ways that artists adapt technology?

For example, the Hammond organ borrowed ideas developed for radios. Recorded music is produced with computers that were originally designed as business machines.

Instead of looking forward to predict future music, let’s look backward and ask, “What technology needs to exist to make a musical instrument possible?” The piano relies on a single escapement (1710) and later a double escapement (1821). Real-time pitch shifting depends on Fourier transforms (1822) and fast computers (~1980).

Artists often find new (unintended) uses for tools, as they did with the printing press.

New pianos

The piano is still in development. In December 2014, Eren Başbuğ composed and performed music on the Roli Seaboard – a piano keyboard made of 3 dimensional sensing foam:

Here is Keith McMillen’s QuNexus keyboard (with Polyphonic aftertouch):


Here are tools that might lead to new ways of making music. They won’t replace old ways. Singing has outlasted every other kind of music.

These ideas represent a combination of engineering and art. Engineers need artists. Artists need engineers. Interesting things happen at the confluence of streams.

Analysis, re-synthesis, transformation

Computers can analyze the audio spectrum in real time. Sounds can be transformed and re-synthesized with near zero latency.
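As a rough illustration of spectrum analysis, here is a minimal sketch of a direct (slow) discrete Fourier transform. Real-time tools use the much faster FFT; the function name and the test signal are made up for this example.

```javascript
// Magnitude of each frequency bin of a real signal, computed with a
// direct (slow) discrete Fourier transform.
function dftMagnitudes(samples) {
  const n = samples.length;
  const mags = [];
  for (let k = 0; k <= n / 2; k++) {
    let re = 0;
    let im = 0;
    for (let t = 0; t < n; t++) {
      const angle = (2 * Math.PI * k * t) / n;
      re += samples[t] * Math.cos(angle);
      im -= samples[t] * Math.sin(angle);
    }
    mags.push(Math.sqrt(re * re + im * im));
  }
  return mags;
}

// 64 samples of a sine wave that completes exactly 5 cycles.
const signal = Array.from({ length: 64 }, (_, t) =>
  Math.sin((2 * Math.PI * 5 * t) / 64)
);
const mags = dftMagnitudes(signal);
const peakBin = mags.indexOf(Math.max(...mags));
console.log(peakBin); // 5
```

The loudest bin is the one whose frequency matches the sine wave. Transforming a sound means changing these bins before turning them back into audio.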

Infinite Jukebox

Finding alternate routes through a song.

by Paul Lamere at the Echonest

Echonest has compiled data on over 14 million songs. This is an example of machine learning and pattern matching applied to music.

Try these examples: “Karma Police”, or search for “Albert Ayler”.

Remixing a remix

“Mindblowing Six Song Country Mashup”: (start at 0:40)

Local file: Max teaching examples/new-country-mashup.mp3

Feature detection

Looking at music under a microscope.

Removing music from speech

First you have to separate them.


by Xavier Serra and UPF

Harmonic Model Plus Residual (HPR) – Build a spectrogram using STFT, then identify where there is strong correlation to a tonal harmonic structure (music). This is the harmonic model of the sound. Subtract it from the original spectrogram to get the residual (noise).

Settings for above example:

  • Window size: 1800 (SR / f0 * lobeWidth: 44100 / 200 * 8 = 1764, rounded up to 1800)
  • FFT size: 2048
  • Mag threshold: -90
  • Max harmonics: 30
  • f0 min: 150
  • f0 max: 200
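The window size arithmetic in the settings can be sketched in a few lines. This assumes the rule of thumb quoted in the list (sample rate / f0 * lobe width) and assumes the FFT size is the next power of two at or above the window size. The function names are made up for illustration.

```javascript
// Window length (in samples) from the rule of thumb in the settings list:
// sampleRate / f0 * lobeWidth.
function minWindowSize(sampleRate, f0, lobeWidth) {
  return Math.ceil((sampleRate / f0) * lobeWidth);
}

// Smallest power of two that is >= n, a common choice for the FFT size.
function nextPowerOfTwo(n) {
  let size = 1;
  while (size < n) size *= 2;
  return size;
}

console.log(minWindowSize(44100, 200, 8)); // 1764
console.log(nextPowerOfTwo(1800));         // 2048
```

A lower f0 gives a longer window: the same formula with f0 = 150 asks for 2352 samples.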

Many kinds of features

  • Low level features: harmonicity, amplitude, fundamental frequency
  • High level features: mood, genre, danceability

Examples of feature detection

Music information retrieval

Finding the drop

“Detecting Drops in EDM” – by Karthik Yadati, Martha Larson, Cynthia C. S. Liem, Alan Hanjalic at Delft University of Technology (2014)

Polyphonic audio editing

Blurring the distinction between recorded and written music.


by Celemony

A minor version of “Bohemian Rhapsody”:

Music recognition

“How Shazam Works” by Farhad Manjoo at Slate: “About 3 data points per second, per song.”

  • Music fingerprinting:
  • Humans being computers. Mystery sounds. (Local file: Desktop/mystery sounds)
  • Is it more difficult to build a robot that plays or one that listens?

Sonographic sound processing

Transforming music through pictures.

by Tadej Droljc

(Example of 3d speech processing at 4:12)

local file: SSP-dissertation/4 – Max/MSP/Jitter Patch of PV With Spectrogram as a Spectral Data Storage and User Interface/basic_patch.maxpat

Try recording a short passage, then set bound mode to 4, and click autorotate

Spectral scanning in Ableton Live:

Web Audio

The web browser is the new black


by Joe Berkowitz


by Dinahmoe

Can you jam over the internet?

What is the speed of electricity? The best round-trip latency (via fiber) between the U.S. east and west coasts is 70-80 ms. If you were jamming over the internet with someone on the opposite coast, it might be like standing roughly 80 to 90 ft away from them in a field (sound travels about 1100 feet per second in air).
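The field comparison can be checked with a little arithmetic. This sketch converts a delay in milliseconds into the distance sound covers in air in that time, using the approximate figure of 1100 feet per second quoted above. The function name is made up for illustration.

```javascript
const SPEED_OF_SOUND_FT_PER_S = 1100; // approximate figure used in the text

// Distance (in feet) that sound travels in air during a given delay.
function delayToFeet(delayMs) {
  return (delayMs / 1000) * SPEED_OF_SOUND_FT_PER_S;
}

for (const ms of [35, 40, 70, 80]) {
  console.log(`${ms} ms is about ${delayToFeet(ms).toFixed(1)} ft`);
}
```

A 70-80 ms round trip works out to about 77 to 88 ft. One way (half the round trip) it is closer to 40 ft.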

Global communal experiences – Bill McKibben – 1990 “The Age of Missing Information”

Conversation with robots

Computers finding meaning

The Google speech API

The Google speech API uses neural networks, statistics, and large quantities of data.

Microsoft: real-time translation

Reverse entropy


Making music from sounds that are not music.

by Katja Vetter

(InstantDecomposer is an update of SliceJockey2):

  • local: InstantDecomposer version: tkzic/pdweekend2014/IDecTouch/IDecTouch.pd
  • local: slicejockey2test2/slicejockey2test2.pd

Sensors and sonification

Transforming motion into music

Three approaches
  • earcons (email notification sound)
  • models (video game sounds)
  • parameter mapping (Geiger counter)
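Parameter mapping is the easiest of the three approaches to sketch in code: a data value is scaled into a musical range and played. This minimal example maps hypothetical sensor readings (0 to 100) onto MIDI notes and then onto frequencies. The ranges and function names are assumptions chosen for illustration.

```javascript
// Linearly map value from [inMin, inMax] to [outMin, outMax],
// clamping the input to its expected range first.
function mapRange(value, inMin, inMax, outMin, outMax) {
  const clamped = Math.min(Math.max(value, inMin), inMax);
  return outMin + ((clamped - inMin) / (inMax - inMin)) * (outMax - outMin);
}

// Convert a MIDI note number to a frequency in Hz (A4 = MIDI 69 = 440 Hz).
function midiToHz(note) {
  return 440 * Math.pow(2, (note - 69) / 12);
}

// Hypothetical sensor readings in the range 0..100.
const readings = [0, 25, 50, 100];
const notes = readings.map((r) => Math.round(mapRange(r, 0, 100, 48, 84)));
console.log(notes);               // [ 48, 57, 66, 84 ]
console.log(notes.map(midiToHz)); // frequencies in Hz
```

Mapping to MIDI notes first keeps equal steps in the data sounding like equal musical intervals.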

Leap Motion

camera based hand sensor

“Muse” (Boulanger Labs) with Paul Bachelor, Christopher Konopka, Tom Shani, and Chelsea Southard:

Max/MSP piano example: Leapfinger:

local file: max-projects/leap-motion/leapfinger2.maxpat

Internet sensors project

Detecting motion from the Internet

Twitter streaming example

MBTA bus data

 Sonification of Mass Ave buses, from Harvard to Dudley

Stock market music

Vine API mashup

By Steve Hensley

Using Max/MSP/jitter

local file: tkzic/stevehensely/shensley_maxvine.maxpat

Audio sensing gloves for spacesuits

By Christopher Konopka at future, music, technology

Computer Vision

Sensing motion with video using frame subtraction

by Adam Rokhsar

local file: max-projects/frame-subtraction
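The idea behind frame subtraction can be sketched without any video library. This is not the Max patch above, only a minimal illustration: each frame is assumed to be a flat array of grayscale values, and the threshold and function name are made up.

```javascript
// Each frame is a flat array of grayscale pixel values (0..255).
// Returns the fraction of pixels whose brightness changed by more
// than `threshold` between the two frames.
function motionAmount(previousFrame, currentFrame, threshold) {
  if (previousFrame.length !== currentFrame.length) {
    throw new Error("Frames must have the same number of pixels");
  }
  let changed = 0;
  for (let i = 0; i < currentFrame.length; i++) {
    if (Math.abs(currentFrame[i] - previousFrame[i]) > threshold) changed++;
  }
  return changed / currentFrame.length;
}

// A bright object moves two pixels to the left between frames.
const frameA = [10, 10, 10, 10, 200, 200, 10, 10];
const frameB = [10, 10, 200, 200, 10, 10, 10, 10];
console.log(motionAmount(frameA, frameB, 30)); // 0.5
```

The result (0 for no motion, 1 for every pixel changing) can then drive a musical parameter such as volume or filter cutoff.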

The brain

Music is stored all across the brain.

Mouse brain wiring diagram

The Allen Institute

“Hacking the soul” by Christof Koch at the Allen Institute

(An Explanation of the wiring diagram of the mouse brain – at 13:33)

OpenWorm project

A complete simulation of the nematode worm, in software, with a Lego body (302 neurons)


Harold Cohen’s algorithmic painting machine

Brain plasticity

A perfect pitch pill?


Could we grow music producing organisms?


Two possibilities

Rejecting technology?
An optimistic future?

There is a quickening of discovery: internet collaboration, open source, Linux, GitHub, Raspberry Pi, Pd, SDR.

“Robots and AI will help us create more jobs for humans — if we want them. And one of those jobs for us will be to keep inventing new jobs for the AIs and robots to take from us. We think of a new job we want, we do it for a while, then we teach robots how to do it. Then we make up something else.”

“…We invented machines to take x-rays, then we invented x-ray diagnostic technicians which farmers 200 years ago would have not believed could be a job, and now we are giving those jobs to robot AIs.”

Kevin Kelly – January 7, 2015, reddit AMA

Will people be marrying robots in 2050?

“What can you predict about the future of music” by Michael Gonchar at The New York Times

Jim Morrison predicts the future of music:

Hearing voices

(KITT dashboard by Dave Metlesits)

The voice was the first musical instrument. Humans are not the only source of musical voices. Machines have voices. Animals too.

  • synthesizing voices (formant synthesis, text to speech, Vocaloid)
  • processing voices (pitch-shifting, time-stretching, vocoding, filtering, harmonizing)
  • voices of the natural world
  • fictional languages and animals
  • accents
  • speech and music recognition
  • processing voices as pictures
  • removing music from speech
  • removing voices


We instantly recognize people and animals by their voices. As artists, we work to develop our own voices. Voices contain information beyond words. Think of R2D2 or Chewbacca.

There is also information between words: “Palin Biden Silences” David Tinapple, 2008:

Synthesizing voices

The vocal spectrum

What’s in a voice?

Singing chords

Humans acting like synthesizers.

Text to speech

Teaching machines to talk.


  • phonemes (unit of sound)
  • diphones (combination of phonemes) (Mac OS “Macintalk 3 Pro”)
  • morphemes (unit of meaning)
  • prosody (musical quality of speech)
  • articulatory (anatomical model)
  • formant (additive synthesis) (Speak & Spell)
  • concatenative (building blocks) (Mac OS)

Try the ‘say’ command (in Mac OS terminal), for example: say hello

Combining the energy of voice with musical instruments (convolution)

  • Peter Frampton “talkbox”: (about 5:42) – Where is the exciting audience noise in this video?
  • Ableton Live example: Local file: Max/MSP: examples/effects/classic-vocoder-folder/classic_vocoder.maxpat
  • Max vocoder tutorial (in the frequency domain), by dude837 – Sam Tarakajian (local file: dude837/4-vocoder/robot-master.maxpat)

Vocaloid

By Yamaha

(text + notation = singing)

Demo tracks:

Vocaloop device demo:

Processing voices


Pitch transposing a baby

Real time pitch shifting

Autotune: the “T-Pain effect” (I-am-T-Pain by Smule), “Lollipop” by Lil Wayne, “Woods” by Bon Iver

Autotuna in Max 7

by Matthew Davidson

Local file: max-teaching-examples/autotuna-test.maxpat

InstantDecomposer in Pure Data (Pd)

by Katja Vetter

Autocorrelation: (helmholtz~ Pd external) “Helmholtz finds the pitch”
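The basic idea of autocorrelation pitch detection is to find the delay at which a signal best matches a shifted copy of itself. This is a minimal sketch of that idea only, not the algorithm inside helmholtz~. The function name, the search range, and the test signal are assumptions.

```javascript
// Estimate the fundamental frequency of a signal by finding the lag
// (in samples) at which it best matches a delayed copy of itself.
function estimatePitch(samples, sampleRate, minHz, maxHz) {
  const minLag = Math.floor(sampleRate / maxHz);
  const maxLag = Math.ceil(sampleRate / minHz);
  let bestLag = -1;
  let bestScore = -Infinity;
  for (let lag = minLag; lag <= maxLag; lag++) {
    let sum = 0;
    for (let i = 0; i + lag < samples.length; i++) {
      sum += samples[i] * samples[i + lag];
    }
    const score = sum / (samples.length - lag); // average over the overlap
    if (score > bestScore) {
      bestScore = score;
      bestLag = lag;
    }
  }
  return sampleRate / bestLag;
}

// Test signal: a 200 Hz sine wave sampled at 8000 samples per second.
const sampleRate = 8000;
const sine = Array.from({ length: 2000 }, (_, n) =>
  Math.sin((2 * Math.PI * 200 * n) / sampleRate)
);
console.log(estimatePitch(sine, sampleRate, 150, 500)); // 200
```

The search range matters: a signal also matches itself at two or three periods, so a range that is too wide invites octave errors.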

(^^ is input pitch, preset #9 is normal)

  • local file: InstantDecomposer version: tkzic/pdweekend2014/IDecTouch/IDecTouch.pd
  • local file: slicejockey2test2/slicejockey2test2.pd

Phasors and granular synthesis

Disassembling time into very small pieces


Adapted from Andy Farnell, “Designing Sound”. Download these patches from: folder: granular-timestretch

  • Basic granular synthesis: graintest3.maxpat
  • Time-stretching: timestretch5.maxpat
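Outside of Max, granular time-stretching can be sketched as follows. This is a generic overlap-add illustration, not a translation of the patches above: grains are written at a steady rate while the read position in the source moves more slowly (or quickly). The function names and parameter values are assumptions.

```javascript
// Plan where each grain is read from the source and written to the output.
// Grains are written every `hop` samples; the read position advances by
// hop / stretch, so stretch > 1 slows the sound down.
function planGrains(sourceLength, grainSize, hop, stretch) {
  const grains = [];
  const outputLength = Math.floor(sourceLength * stretch);
  for (let writePos = 0; writePos + grainSize <= outputLength; writePos += hop) {
    const readPos = Math.min(
      Math.floor(writePos / stretch),
      sourceLength - grainSize
    );
    grains.push({ readPos, writePos });
  }
  return grains;
}

// Copy each grain into the output through a Hann window and sum the overlaps.
function timeStretch(source, grainSize, hop, stretch) {
  const out = new Float32Array(Math.floor(source.length * stretch));
  for (const { readPos, writePos } of planGrains(source.length, grainSize, hop, stretch)) {
    for (let i = 0; i < grainSize; i++) {
      const gain = 0.5 - 0.5 * Math.cos((2 * Math.PI * i) / grainSize);
      out[writePos + i] += source[readPos + i] * gain;
    }
  }
  return out;
}

console.log(planGrains(1000, 200, 100, 2).slice(0, 3));
```

With the hop set to half the grain size, the overlapping Hann windows add up to a constant level, so the grains do not produce a tremolo.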

Phase vocoder

…coming soon

Sonographic sound processing

Changing sound into pictures and back into sound

by Tadej Droljc

(Example of 3d speech processing at 4:12)

local file: SSP-dissertation/4 – Max/MSP/Jitter Patch of PV With Spectrogram as a Spectral Data Storage and User Interface/basic_patch.maxpat

Try recording a short passage, then set bound mode to 4, and click autorotate

Speech to text

Understanding the meaning of speech

The Google Speech API

A conversation with a robot in Max

Google speech uses neural networks, statistics, and large quantities of data.

Voices of the natural world

Changes in the environment reflected by sound

Fictional languages and animals

“You can talk to the animals…”

Pig creatures example:

  • 0:00 Neutral
  • 0:32 Single morphemes – neutral mode
  • 0:37 Series, with unifying sounds and breaths
  • 1:02 Neutral, layered
  • 1:12 Sad
  • 1:26 Angry
  • 1:44 More Angry
  • 2:11 Happy

What about Jar Jar Binks?


The sound changes but the words remain the same.

The Speech Accent Archive

Finding and removing music in speech

We are always singing.

Jamming with speech
Removing music from speech

by Xavier Serra and UPF

Harmonic Model Plus Residual (HPR) – Build a spectrogram using STFT, then identify where there is strong correlation to a tonal harmonic structure (music). This is the harmonic model of the sound. Subtract it from the original spectrogram to get the residual (noise).

Settings for above example:

  • Window size: 1800 (SR / f0 * lobeWidth: 44100 / 200 * 8 = 1764, rounded up to 1800)
  • FFT size: 2048
  • Mag threshold: -90
  • Max harmonics: 30
  • f0 min: 150
  • f0 max: 200

Feature detection

  • time dependent
  • Low level features: harmonicity, amplitude, fundamental frequency
  • High level features: mood, genre, danceability

AcousticBrainz: (typical analysis page)

Essentia (open source feature detection tools)

Freesound (vast library of sounds): – look at “similar sounds”

Removing voices from music

A sad thought

Phase cancellation encryption

This method was used to send secret messages during World War II. It’s now used in cell phones to get rid of echo. It’s also used in noise-canceling headphones.


Center channel subtraction

What is not left and not right?

Ableton Live – utility/difference device: (Alison Krauss example)

Local file: Ableton-teaching-examples/vocal-eliminator
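Center channel subtraction is phase cancellation applied to a stereo mix. In this minimal sketch, anything that is identical in the left and right channels disappears when one channel is subtracted from the other. The tiny sample arrays are made up for illustration.

```javascript
// Subtract the right channel from the left, sample by sample.
// Anything identical in both channels (panned to the center) cancels out.
function removeCenter(left, right) {
  return left.map((sample, i) => sample - right[i]);
}

// Hypothetical stereo mix: the vocal is equal in both channels,
// the guitar is only on the left, the piano only on the right.
const vocal = [0.5, -0.25, 0.125];
const guitar = [0.25, 0.25, -0.5];
const piano = [-0.125, 0.5, 0.25];
const left = vocal.map((v, i) => v + guitar[i]);
const right = vocal.map((v, i) => v + piano[i]);

console.log(removeCenter(left, right)); // [ 0.375, -0.25, -0.75 ]
```

What is left is the guitar minus the piano, in mono. The vocal is gone, along with anything else that was panned to the center, such as bass or kick drum.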

  • Why do most people not like the recorded sound of their voice?
  • Can voice be used as a controller?
  • How do you recognize voices?
  • Does speech recognition work with singing?
  • How does the Google Speech API know the difference between music and speech?
  • How can we listen to ultrasonic animal sounds?
  • What about animal translators?


Audio streaming object in Max

oggrx~ and oggtx~

by Robin Gareus

At Cycling 74 forum:


I was able to receive mp3 files from a server in Max 6.1.8 using oggrx~. There doesn’t appear to be any transport control – so this would need to be built in for synchronization.

Unexpected find: The external uses “Secret Rabbit Code” (libsamplerate) for resampling. So it works in Max. And we have the source code, but not the i386 libs that were used to compile it.

There is no binary for v.7 of oggrx~.mxo, but there is one for v.6

Original c74 post by umma08:

i managed to get Robin Gareus’ externals. They are available here, though they are unmaintained.

The binaries are still online at:

It’s been more than 3 years (OSX 10.5) since I last looked at it, it
should still work, but I don’t know. Please let me know if you encounter
any problems, so that I can warn others.

I don’t maintain this external anymore. I neither have a MAX/MSP
license, nor do I own any Apple devices. On the upside, complete
source-code is available from;a=snapshot;sf=tgz

ep-413 DSP week 9



Building a Max patch that displays, transforms, and responds to internet data.

Building materials

  • Max (6.1.7 or newer)
  • Soundflower

Both are available from Cycling ’74.

The Max patch is based on a tutorial by dude837 called “Automatic Silly Video Generator”


The patch at the download link in the video is broken, but the JavaScript code for the Max js object is intact. You can download the entire patch from the Max-projects archive: folder: maxvine

Internet APIs

APIs (application programming interfaces) provide methods for programs (other than web browsers) to access Internet data. Any app that accesses data from the web uses an API.

Here is a link to information about the Vine API:

For example, if you copy this URL into a web browser address bar, it will return a block of data in JSON format about the most popular videos on Vine:

HTTP requests

An HTTP request transfers data to or from a server. A web browser handles HTTP requests in the background. You can also write programs that make HTTP requests. A program called “curl” runs HTTP requests from the terminal command line. Here are examples:

Response data

Data is usually returned in one of 3 formats:

  • JSON
  • XML
  • HTML

JSON is the preferred format because it’s easy to access the data structure.
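To show why JSON is easy to work with, here is a minimal sketch that parses a response and reads values out of it. The response text is hypothetical: the field names are made up for illustration and are not taken from a real API.

```javascript
// A hypothetical JSON response, shaped like a list of popular videos.
// The field names are invented for this example.
const responseText = `{
  "data": {
    "records": [
      { "title": "First clip", "videoUrl": "http://example.com/1.mp4", "likes": 120 },
      { "title": "Second clip", "videoUrl": "http://example.com/2.mp4", "likes": 75 }
    ]
  }
}`;

const body = JSON.parse(responseText); // text becomes nested objects and arrays
const first = body.data.records[0];

console.log(first.videoUrl);           // http://example.com/1.mp4
console.log(body.data.records.length); // 2
```

After parsing, each value is one dotted path away. The same data in XML or HTML would need a separate parser and a search through the document tree.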

Max HTTP requests

There are several ways to make HTTP requests in Max, but the best method is the js object. Here is the code that runs the GET request for the Vine API:

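function get(url)
{
    var ajaxreq = new XMLHttpRequest();
    ajaxreq.open("GET", url);
    ajaxreq.onreadystatechange = readystatechange;
    ajaxreq.send(); // Max calls readystatechange() when the response arrives
}

function readystatechange()
{
    var rawtext = this._getResponseKey("body");
    var body = JSON.parse(rawtext);
    outlet(0, body.data.records[0].videoUrl); // assumed field path to the first video URL
}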


The function: get() formats and sends an HTTP request using the URL passed in with the get message from Max. When the data is returned to Max, the readystatechange() function parses it and sends the URL of the most popular Vine video out the left outlet of the js object.

Playing Internet audio/video files in Max

The object will play videos, with the URL passed in by the read message.

Unfortunately, the object sends its audio to the system, not to Max. You can use Soundflower, or another virtual audio routing app, to get the audio back into Max.

Audio from video

Video from audio

Other Internet API examples in Max

There is a large archive of examples here: Internet sensors:

We will look at more of these next week. Here is a simple Max patch that uses the SoundCloud API:

Gokce Kinayoglu has written a Java external for Max called Searchtweet:

Many APIs require complex authentication, or money, before they will release their data. We will look at ways to access these APIs from Max next week.


There are API services that consolidate many APIs into one API. For example:

Scaling data

Look at the Max tutorial (built into Max Help) called “Data : data scaling”. It contains most of what you need to know to work with streams of data.
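Internet data often arrives without a known range. One possible approach, sketched here outside of Max, is to track the smallest and largest values seen so far and scale each new value into 0..1. This is an illustration under that assumption, not a summary of the tutorial, and the function name is made up.

```javascript
// Returns a function that rescales each incoming value into 0..1,
// based on the smallest and largest values seen so far.
function createAutoScaler() {
  let min = Infinity;
  let max = -Infinity;
  return function scale(value) {
    min = Math.min(min, value);
    max = Math.max(max, value);
    if (max === min) return 0; // no range yet, so avoid dividing by zero
    return (value - min) / (max - min);
  };
}

const scale = createAutoScaler();
const stream = [10, 20, 15, 30]; // hypothetical incoming data
console.log(stream.map((v) => scale(v))); // [ 0, 1, 0.5, 1 ]
```

The scaled value can then be multiplied into whatever range a synth parameter expects. Early values are unreliable until the stream has shown its real extremes.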


Using the Vine API patch that we built in class as a starting point, build a better app.

Ideas to explore:

  • Is it possible to run several API requests simultaneously?
  • Recording? Time expansion? Effects that evolve over time?
  • Generate music from motion, data, and raw sound?
  • Make a video respond to your instrument or voice?
  • Design a better user interface or external controller?
  • Will this idea work in Max For Live?
  • How would you make adjustments to the loop length, or synchronize a video to other events?
  • Make envelopes to change the dynamic shape?
  • Destruction? Abstraction?
  • Find or write a Max URL streaming object?
  • What about using a different API or other data from the Vine API?

This project will be due in 2-3 weeks. But for next week please bring in your work in progress, and we will help solve problems.