Tag: portfolio

Glacier Sounds

Overlapping loops of varying duration to represent natural cycles.


In October I collaborated with Wade Kavanaugh and Stephen P. Nguyen to compose and perform the sounds of a glacier for their installation at the Gem theatre in Bethel, Maine. The glacier was made from paper.

Wade and Stephen:


A time-lapse video of the project:

A time-lapse video of a similar project they did in Minnesota in 2005:

The approach was to take a series of ambient loops and organize them by duration. The longer loops would represent the slow movement of time. Shorter loops would represent events like avalanches. One-shot samples would represent quick events, like the cracking of ice.

It took several iterations to produce something slow and boring enough to be convincing. I used samples from Ron MacLeod’s Cyclic Waves library from Cycling 74 https://www.ableton.com/en/packs/cyclic-waves/. Samples were pitched down to imply largeness.


Each vertical column in an Ableton Live set represents a time frame of waves: the far left column contains quick events and the far right column contains long-cycle events. Left to right, the columns have gradually increasing cycle durations. I used a Push controller to trigger samples in real time as people walked through the theatre to see the glacier.

The theatre speakers were arranged in stereo, but running front to back rather than left to right. Since the glacier was arranged along the same axis, a slow auto-panning effect sent sounds drifting off into the distance, or vice versa. Visually and sonically there was a sense that the space extended beyond the walls of the theatre.
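
The panning itself is simple to sketch. Here is a minimal equal-power autopan in Python/numpy – not the actual Live device used in the performance, just the underlying idea, with an arbitrary sweep rate:

    import numpy as np

    def autopan(mono, sr=44100, rate=0.05):
        """Equal-power pan that drifts a mono source between two speakers
        (here, a front pair and a back pair) over a slow sine cycle."""
        t = np.arange(len(mono)) / sr
        # Pan position sweeps 0..1 at `rate` Hz (0.05 Hz = one 20-second cycle)
        pan = 0.5 + 0.5 * np.sin(2 * np.pi * rate * t)
        theta = pan * (np.pi / 2)
        front = mono * np.cos(theta)
        back = mono * np.sin(theta)
        return np.stack([front, back], axis=1)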

In the “control room” above the theatre… using Push to trigger samples and a Korg NanoKontrol to set panning positions of each track:


The performance lasted about 45 minutes. Occasionally the cracking of ice would startle people in the room. There were kids crawling around underneath the paper glacier. Afterwards we just let the sounds play on their own. A short excerpt:

 

Photographs by Rebecca Zicarelli.

Internet shortwave radio using Max, Hamachi, and Mumble

How to control an amateur radio transceiver over the internet, using Osc (Open Sound Control), VOIP (Voice over Internet Protocol) and VPN (Virtual Private Networks).

What problem does this solve?

Using a shortwave radio receiver in a live performance without installing a large antenna system.

This method gives low-latency, real-time access to audio and radio control, using a laptop computer from anywhere. I suppose it could also remote-control a synthesizer, if you’re into that kind of thing.

CAT

Modern ham radio transceivers can be controlled with serial commands using the CAT (Computer Aided Transceiver) protocol, usually via a USB port. There are hardware solutions for remote controlling radios over the internet, like RemoteRig http://www.remoterig.com/wp/. But there is also a free, or low-cost, solution using software.
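
As a sketch of what CAT traffic looks like, here is a hypothetical Python/pyserial exchange. The command string below is Kenwood-style ASCII; the actual syntax, port name, and baud rate vary by manufacturer (the TenTec Eagle uses its own command set), so treat all of these values as placeholders:

    import serial  # pyserial

    port = serial.Serial('/dev/tty.usbserial', 57600, timeout=1)  # placeholder port/baud
    port.write(b'FA00007030000;')   # Kenwood-style: set VFO-A to 7,030,000 Hz
    print(port.read(100))           # read the radio's acknowledgement, if any
    port.close()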

System diagram


The ‘base’ computer is connected to the radio/antenna. The ‘remote’ computer is a laptop that could be anywhere, connected by WiFi.

For this experiment we used a TenTec Eagle transceiver connected to a MacBook USB port. The audio output of the radio connects to the audio input of the MacBook. The MacBook is directly connected to an internet WiFi router using an ethernet cable.

VOIP

A Mumble client runs on the base computer, https://en.wikipedia.org/wiki/Mumble_(software) and also on the remote laptop. Both clients are connected to a Mumble server (Murmur) at Mumble.com http://www.mumble.com/mumble-download.php. You could also run your own server. I set the audio to the best quality and muted the microphone on the remote laptop, since we are only using the laptop as a receiver. For transmitting, you could simply open up another channel on the Murmur server. Mumble has very low latency (compared to Skype) and decent audio quality.

Bi-directional commands using VPN and OSC

CAT commands go in both directions – to and from the radio. For example, you would send a command to the radio to change frequency. The radio would send acknowledgements back to the remote laptop.

This is a problem for networks that use NAT (Network Address Translation) because local IP addresses are private, hidden behind routers. The solution that eventually worked was using a VPN called Hamachi https://secure.logmein.com/products/hamachi/download.aspx on both the remote and base computers. Hamachi is set up on both computers and they are connected to each other. This allows the computers to ‘see’ each other as if they were on a local network.

Max and Osc

Max patches are run on both the base and remote computers. The Max patch on the base computer connects to the radio using the serial object and passes commands back and forth over the internet using udpsend and udpreceive (which use Osc).

The Max patch on the remote MacBook sends and receives commands from the base computer using udpsend and udpreceive. With the Hamachi VPN, Osc works just like it does on a LAN (local area network).
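
For illustration, here is roughly the same remote-side logic in Python with the python-osc library instead of Max. The OSC addresses and ports are made up for this sketch; the IP stands in for the base computer’s Hamachi address:

    from pythonosc.udp_client import SimpleUDPClient
    from pythonosc.dispatcher import Dispatcher
    from pythonosc.osc_server import BlockingOSCUDPServer

    BASE_IP = '25.1.2.3'  # hypothetical Hamachi address of the base computer

    # Send a command to the base station (like udpsend in Max)
    client = SimpleUDPClient(BASE_IP, 8000)
    client.send_message('/radio/frequency', 7030000)

    # Receive acknowledgements coming back (like udpreceive in Max)
    dispatcher = Dispatcher()
    dispatcher.map('/radio/ack', lambda address, *args: print(address, args))
    server = BlockingOSCUDPServer(('0.0.0.0', 8001), dispatcher)
    server.handle_request()  # wait for one reply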

Automatic reconfiguration of clients

The main advantage of this system is that when you move the remote MacBook to a new location – for example, a coffee shop with public WiFi – both the Mumble and Hamachi clients automatically reconfigure for the location. So you don’t need to know the actual IP address of your computer in the coffee shop. The reconfiguration usually happens within seconds after the WiFi connection is made.

Alternatives

If you are just working across a LAN, you don’t need a VPN. Osc will run on a local network using private IPs.

You could also try Ross Bencina’s Oscgroups http://www.rossbencina.com/code/oscgroups – although I was not able to get Oscgroups to work outside of a LAN.

For uni-directional Osc communication from remote to base in a WAN (wide area network), you can use a static IP address for the target.

Skype is another (free) solution for transmitting VOIP audio. Set the base computer in auto-answer mode and call it from the remote computer. Skype processes the audio more than Mumble, with noise gates and such, and the latency is higher. But it’s very easy to set up.

Development

The next step is to build a remote interface for the radio that uses Midi/Osc controllers – so, for example, you could turn a dial on the Midi controller to change frequency or filter settings on the base radio.
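
A rough sketch of that idea in Python, assuming the mido library for MIDI input and the same made-up OSC address as above – mapping a controller dial (CC 1, a placeholder) to frequency in 1 kHz steps:

    import mido
    from pythonosc.udp_client import SimpleUDPClient

    client = SimpleUDPClient('25.1.2.3', 8000)      # base computer (Hamachi IP)
    with mido.open_input() as midi_in:              # default MIDI input port
        for msg in midi_in:
            if msg.type == 'control_change' and msg.control == 1:
                freq = 7000000 + msg.value * 1000   # CC value 0-127 -> 7.000-7.127 MHz
                client.send_message('/radio/frequency', freq)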

to be continued…

RF mixer simulation in Max

Audio simulation of an RF circuit.


The simulation serves no purpose, but it’s fun. There are 4 versions. I think the third one sounds best (rf-mixer-sim3.maxpat). It’s interesting to hear how much spectral distortion happens from multiplying sawtooth waves.
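
The spectral distortion is easy to verify outside of Max. A minimal numpy sketch of the same idea – multiplying two sawtooth waves produces sum and difference frequencies of every pair of harmonics, which is why the result gets so dense (the 1000/900 Hz values are arbitrary):

    import numpy as np

    sr = 44100
    t = np.arange(sr) / sr                          # one second
    saw = lambda f: 2.0 * ((f * t) % 1.0) - 1.0     # naive (aliasing) sawtooth
    mixed = saw(1000) * saw(900)                    # "RF" times "local oscillator"
    spectrum = np.abs(np.fft.rfft(mixed))
    # With N = sr, each FFT bin is exactly 1 Hz wide
    strongest = np.argsort(spectrum)[-10:]
    print(sorted(strongest))  # clusters at sums/differences like 100 Hz, 1900 Hz, ...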


Download

https://github.com/tkzic/max-projects/

folder: rf-mixer

patches:

Note: please set the signal vector size to 1 (or as low as possible), and enable overdrive and audio interrupt.


Four versions:

  • rf-mixer-sim.maxpat (initial attempt)
  • rf-mixer-sim2.maxpat (uses sah~ and rate~ objects)
  • rf-mixer-sim3.maxpat (uses gate~ objects with a phasor~ clock)
  • rf-mixer-sim4.maxpat (bandpass filter on RF input)

 

Basic synth in Max – part 2

Yet another basic synthesizer design.


See part 1 here: http://reactivemusic.net/?p=18511

New features

Drag to select buffer start/end points

waveform~ object


Sample recording

record~ object.


How would you design voice-activated recording?
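
One common answer is an energy gate: start capturing when the input RMS crosses a threshold, stop after a stretch of silence. A minimal Python sketch of the logic (the threshold and hang time here are arbitrary, and `chunks` stands in for any stream of audio buffers):

    import numpy as np

    def voice_activated(chunks, threshold=0.02, hang=10):
        """Gate recording: start when RMS rises above threshold,
        stop after `hang` consecutive quiet chunks."""
        recording, quiet, out = False, 0, []
        for chunk in chunks:                      # chunk: 1-D numpy array
            rms = np.sqrt(np.mean(chunk ** 2))
            if rms > threshold:
                recording, quiet = True, 0
            elif recording:
                quiet += 1
                if quiet > hang:
                    recording = False
            if recording:
                out.append(chunk)
        return np.concatenate(out) if out else np.array([])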

Time compress/stretch

groove~ (Max 7 only)

Presets


M4L preset management: http://reactivemusic.net/?p=18557

Polyphony

poly~ object

Polyphonic Midi synth in Max

http://reactivemusic.net/?p=11732

local: poly-generic-example1.maxpat (polyphonic)

Polyphonic instrument in Max for Live

Wave~ sample player: http://reactivemusic.net/?p=18354

local: m4l: poly-synth1.als (aaa-polysynth2.amxd)


Max For Live

automation and UI design (review)

Distributing M4L devices

How to create a Live ‘Pack’

by Winksound

  • save set
  • collect all and save
  • file manager
    • manage project
      • packing: create live pack

 

Presets in Max for Live

How to use the Max preset object inside of M4L.


There is some confusion about how to use Max presets in an M4L device. The method described here lets you save and recall presets with a device inside of a Live set, without additional files or dialog boxes. It uses pattrstorage. It works automatically with the Live UI objects.

It also works with other Max UI objects by connecting them to pattr objects.

It’s based on an article by Gregory Taylor: https://cycling74.com/2011/05/19/max-for-live-tutorial-adding-pattr-presets-to-your-live-session/

Download

https://github.com/tkzic/max-for-live-projects

Folder: presets

Patch: aaa-preset3.amxd

How it works:

Instructions are included inside the patch. You will need to add objects and then set attributes for those objects in the inspector. For best results, set the inspector values after adding each object.

Write the patch in this order:

A1. Add UI objects.

For each UI object:

  1. check link-to-scripting name
  2. set long and short names to actual name of param


A2. (optional) Add non-Live (i.e., Max) UI objects

For each object, connect the middle outlet of a pattr object (with a parameter name as an argument) to the left inlet of the UI object. For example:

[screenshot]

Then in the inspector for each UI object:

  1. check parameter-mode-enable
  2. check initial-enable


B. Add a pattrstorage object.


Give the object a name argument, for example: pattrstorage zoo. The name can be anything; it’s not important. Then in the inspector for pattrstorage:

  1. check parameter-mode enable
  2. check Auto-update-parameter Initial-value
  3. check initial-value
  4. change short-name to match long name


C. Add an autopattr object


D. Add a preset object


In the inspector for the preset object:

  1. assign the pattrstorage object name from step B (zoo) to the pattrstorage attribute


Notes

The preset numbers go from 1 to n. They can be fed directly into the pattrstorage object – for example, if you wanted to use an external controller.

You can name the presets (slotnames). See the pattrstorage help file.

You can interpolate between presets. See the pattrstorage help file.

Adding new UI objects after presets have been stored

If you add a new UI object to the patch after pattrstorage is set up, you will need to re-save the presets with the correct setting of the new UI object. Or you can edit the pattrstorage data.

 

 

ep-341 Max/MSP – Spring 2015 week 7

The Live Object Model in Max for Live.


Several ways of working with Ableton Live parameters in an M4L patch. (This is an improved version of the patch we built in class.) http://reactivemusic.net/?p=18401

The Live Object Model description: https://cycling74.com/docs/max5/refpages/m4l-ref/m4l_live_object_model.html

In the coming weeks we will build synthesizers and work with control surfaces in M4L.

Assignment

Build 3 or more M4L devices, including one of each of the following:

  • An audio effect
  • An instrument
  • A Midi effect

It’s OK to adapt and “improve” an existing device.

Please bring in your work in progress for next week and be prepared to demonstrate something. The entire assignment will not be due until March 31.

 

ep-426 syllabus – Spring 2015

Interactive video programming and performance

Spring 2015

teacher: Tom Zicarelli – http://tomzicarelli.com

You can reach me at:  tzicarelli@berklee.edu

Office hours: Tuesday 1-2 PM or Tuesday 4-5 PM, at the EPD office #401 at 161 Mass Ave. Please email or call ahead.

Assignments and class notes will be posted to this blog: http://reactivemusic.net before or after the class. Search for ep-426 to find the notes.

Examples, software, links, and references demonstrated in class are available for you to use. If there is something missing from the notes, please ask about it. This is your textbook.

Syllabus:

Everybody calls this course “The Jitter class” – referring to Max/MSP Jitter from Cycling 74. You will learn to use Jitter. But the objective is to create interactive visual art. Jitter is one tool of many available.

The field of interactive visual art is constantly evolving.

After you take the course, you will have designed projects. You might design a new tool for other artists. You will have opportunities to solve problems. You will become familiar with how others make interactive art. You will explore the connection between sound, video, graphics, sensors, and data. You will be exposed to a world of possibilities – which you may embrace or reject.

We will explore a range of methods and have opportunities to use them in projects. We’ll look at examples by artists – asking the question: How does that work?

Topics: (subject to change)

  1. Jitter
  2. Matrices
  3. Reverse engineering
  4. Visualization of audio
  5. Visualization of live data, APIs
  6. Video analysis (realtime)
  7. Video hardware and controllers
  8. Prototyping
  9. Video signal processing
  10. OpenGL
  11. Other tools: Processing, WebGL, Canvas, 2D graphics
  12. Portfolios
  13. Live performance

Grading and projects:

Grades are based on two projects that you will design – and class participation. Please see Neil Leonard’s EP-426 syllabus for details. I encourage and will give credit for: collaboration with other students, outside projects, performances, independent projects, and anything else that will foster your growth and success.

I am open to alternative projects – for example, if you want to use this course as an opportunity to develop a larger project or continue a work in progress.

Reference material

https://cycling74.com/wiki/index.php?title=Max_Documentation_and_Resources

 

ep-341 syllabus – Spring 2015

Programming interactive audio software and plugins in Max/MSP

Spring 2015

teacher: Tom Zicarelli – http://tomzicarelli.com

You can reach me at:  tzicarelli@berklee.edu

Office hours: Tuesday 1-2 PM or Tuesday 4-5 PM, at the EPD office #401 at 161 Mass Ave. Please email or call ahead.

Assignments and class notes will be posted to this blog: http://reactivemusic.net before or after the class. Search for ep-341 to find the notes.

Examples, software, links, and references demonstrated in class are available for you to use. If there is something missing from the notes, please ask about it. This is your textbook.

Syllabus:

Prototyping is the focus. Max is a seed that has grown into music, art, discoveries, products, and entire businesses.

After you take the course, you will have developed several projects. You might design a musical instrument or a plugin. You will have opportunities to solve problems. But mostly you will have a sense of how to explore possibilities by building prototypes in Max. You will have the basic skills to quickly make software to connect things, and to answer questions like, “Is it possible to make something that does x?”

You will become familiar with how other artists use Max to make things. You will be exposed to a world of possibilities – which you may embrace or reject.

We will explore a range of methods and have opportunities to use them in projects. We’ll look at examples by artists – asking the question: How does this work?

Success depends on execution as well as good ideas.

Topics: (subject to change)

  1. Max
  2. Reverse engineering
  3. Transforming and scaling data
  4. Designing user interfaces
  5. Messages and communication, MIDI/OSC
  6. Randomness and probability
  7. Connecting hardware and other devices
  8. Working with sensors, data, and APIs
  9. Audio signal processing and synthesis
  10. Problem solving, prototyping, portfolios
  11. Plugins, Max for Live
  12. Basic video processing and visualization
  13. Alternative tools: Pd
  14. Max externals
  15. How to get ideas
  16. Computers and live performance
  17. Transcoding

Grading and projects:

Grades will be based on projects, several small assignments/quizzes, and class participation. Please see Neil Leonard’s EP-341 syllabus for details. I encourage and will give credit for: collaboration with other students, outside projects, performances, independent projects, and anything else that will encourage your growth and success.

I am open to alternative projects – for example, if you want to use this course as an opportunity to develop a larger project or continue a work in progress.

Reference material

https://cycling74.com/wiki/index.php?title=Max_Documentation_and_Resources

 

New musical instruments

A presentation for Berklee BTOT 2015 http://www.berklee.edu/faculty 


Around the year 1700, several startup ventures developed prototypes of machines with thousands of moving parts. After 30 years of engineering, competition, and refinement, the result was a device remarkably similar to the modern piano.

What are the musical instruments of the future being designed right now?

  • new composition tools
  • reactive music
  • connecting things
  • sensors
  • voices
  • brains

Notes:

predictions?

Ray Kurzweil’s future predictions on a timeline: http://imgur.com/quKXllo (The Singularity will happen in 2045)

In 1965 researcher Herbert Simon said: “Machines will be capable, within twenty years, of doing any work a man can do”. Marvin Minsky added his own prediction: “Within a generation … the problem of creating ‘artificial intelligence’ will substantially be solved.” https://forums.opensuse.org/showthread.php/390217-Will-computers-or-machines-ever-become-self-aware-or-evolve/page2

Patterns

Are there patterns in the ways that artists adapt technology?

For example, the Hammond organ borrowed ideas developed for radios. Recorded music is produced with computers that were originally designed as business machines.

Instead of looking forward to predict future music, let’s look backwards and ask, “What technology needed to happen to make musical instruments possible?” The piano relies upon a single escapement (1710) and later a double escapement (1821). Real-time pitch shifting depends on Fourier transforms (1822) and fast computers (~1980).

Artists often find new (unintended) uses for tools. Like the printing press.

New pianos

The piano is still in development. In December 2014, Eren Başbuğ composed and performed music on the Roli Seaboard – a piano keyboard made of three-dimensional sensing foam:

Here is Keith McMillen’s QuNexus keyboard (with polyphonic aftertouch):

https://www.youtube.com/watch?v=bry_62fVB1E

Experiments

Here are tools that might lead to new ways of making music. They won’t replace old ways. Singing has outlasted every other kind of music.

These ideas represent a combination of engineering and art. Engineers need artists. Artists need engineers. Interesting things happen at the confluence of streams.

Analysis, re-synthesis, transformation

Computers can analyze the audio spectrum in real time. Sounds can be transformed and re-synthesized with near zero latency.

Infinite Jukebox

Finding alternate routes through a song.

by Paul Lamere at the Echonest

Echonest has compiled data on over 14 million songs. This is an example of machine learning and pattern matching applied to music.

http://labs.echonest.com/Uploader/index.html

Try examples: “Karma Police”, or search for “Albert Ayler”.

Remixing a remix

“Mindblowing Six Song Country Mashup”: https://www.youtube.com/watch?v=FY8SwIvxj8o (start at 0:40)


Local file: Max teaching examples/new-country-mashup.mp3

More about Echonest

Feature detection

Looking at music under a microscope.

Removing music from speech

First you have to separate them.

SMS-tools

by Xavier Serra and UPF

Harmonic Model Plus Residual (HPR) – Build a spectrogram using STFT, then identify where there is strong correlation to a tonal harmonic structure (music). This is the harmonic model of the sound. Subtract it from the original spectrogram to get the residual (noise).

[spectrogram screenshots]

Settings for above example:

  • Window size: 1800 (SR / f0 * lobeWidth) 44100 / 200 * 8 = 1764
  • FFT size: 2048
  • Mag threshold: -90
  • Max harmonics: 30
  • f0 min: 150
  • f0 max: 200
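
To make the idea concrete, here is a toy version of the harmonic/residual split in Python with numpy and scipy. Unlike sms-tools, which tracks sinusoidal peaks and a time-varying f0, this sketch assumes a single known f0 and simply masks the STFT bins near each harmonic (all parameters echo the settings above):

    import numpy as np
    from scipy.signal import stft, istft

    def harmonic_plus_residual(x, sr=44100, f0=200.0, n_harm=30,
                               win=1764, nfft=2048, tol_hz=30.0):
        """Split a signal into a crude harmonic model and a residual."""
        f, t, X = stft(x, fs=sr, nperseg=win, nfft=nfft)
        # Distance from each bin frequency to the nearest harmonic of f0
        harmonics = f0 * np.arange(1, n_harm + 1)
        dist = np.min(np.abs(f[:, None] - harmonics[None, :]), axis=1)
        mask = (dist < tol_hz).astype(float)[:, None]
        _, harmonic = istft(X * mask, fs=sr, nperseg=win, nfft=nfft)
        _, residual = istft(X * (1.0 - mask), fs=sr, nperseg=win, nfft=nfft)
        return harmonic, residual
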
Many kinds of features

  • Low level features: harmonicity, amplitude, fundamental frequency
  • High level features: mood, genre, danceability

Examples of feature detection

Music information retrieval

Finding the drop

“Detecting Drops in EDM” – by Karthik Yadati, Martha Larson, Cynthia C. S. Liem, Alan Hanjalic at Delft University of Technology (2014) http://reactivemusic.net/?p=17711

Polyphonic audio editing

Blurring the distinction between recorded and written music.

Melodyne

by Celemony

http://www.celemony.com/en/start

A minor version of “Bohemian Rhapsody”: http://www.youtube.com/watch?v=voca1OyQdKk

Music recognition

“How Shazam Works” by Farhad Manjoo at Slate: http://reactivemusic.net/?p=12712, “About 3 datapoints per second, per song.”

  • Music fingerprinting: https://musicbrainz.org/doc/Fingerprinting
  • Humans being computers. Mystery sounds. (Local file: Desktop/mystery sounds)
  • Is it more difficult to build a robot that plays or one that listens?

Sonographic sound processing

Transforming music through pictures.

by Tadej Droljc

 http://reactivemusic.net/?p=16887

(Example of 3d speech processing at 4:12)

local file: SSP-dissertation/4 – Max/MSP/Jitter Patch of PV With Spectrogram as a Spectral Data Storage and User Interface/basic_patch.maxpat

Try recording a short passage, then set bound mode to 4, and click autorotate.

Spectral scanning in Ableton Live:

Web Audio

Web browser is the new black

Noteflight

by Joe Berkovitz

http://www.noteflight.com/login

Plink

by Dinahmoe

http://labs.dinahmoe.com/plink/

Can you jam over the internet?

What is the speed of electricity? 70-80 ms is the best round-trip latency (via fiber) from the U.S. east coast to the west coast. Sound travels about 1,100 feet per second in air, so jamming over the internet with someone on the opposite coast, at an 80 ms round trip, would be like being roughly 90 feet away from them in a field.

Global communal experiences – Bill McKibben – 1992, “The Age of Missing Information”

More about Web Audio

Conversation with robots

Computers finding meaning

The Google speech API

http://reactivemusic.net/?p=9834

The Google speech API uses neural networks, statistics, and large quantities of data.

Microsoft: real-time translation

Reverse entropy

InstantDecomposer

Making music from sounds that are not music.

by Katja Vetter

(InstantDecomposer is an update of SliceJockey2): http://www.katjaas.nl/slicejockey/slicejockey.html

  • local: InstantDecomposer version: tkzic/pdweekend2014/IDecTouch/IDecTouch.pd
  • local: slicejockey2test2/slicejockey2test2.pd

More about reactive music

Sensors and sonification

Transforming motion into music

Three approaches:

  • earcons (email notification sound)
  • models (video game sounds)
  • parameter mapping (Geiger counter) – see the sketch below
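
A minimal parameter-mapping example in Python/numpy – each data point becomes a short beep whose pitch tracks the value (the function name, ranges, and beep length here are all arbitrary):

    import numpy as np

    def sonify(values, sr=44100, beep_dur=0.1, fmin=220.0, fmax=880.0):
        """Map each data point to a pitch between fmin and fmax, render as beeps."""
        v = np.asarray(values, dtype=float)
        span = v.max() - v.min()
        norm = (v - v.min()) / (span if span > 0 else 1.0)
        freqs = fmin * (fmax / fmin) ** norm        # exponential (musical) mapping
        t = np.arange(int(sr * beep_dur)) / sr
        return np.concatenate([np.sin(2 * np.pi * f * t) for f in freqs])
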
Leap Motion

camera-based hand sensor

“Muse” (Boulanger Labs) with Paul Bachelor, Christopher Konopka, Tom Shani, and Chelsea Southard: http://reactivemusic.net/?p=16187

Max/MSP piano example: Leapfinger: http://reactivemusic.net/?p=11727

local file: max-projects/leap-motion/leapfinger2.maxpat

Internet sensors project

Detecting motion from the Internet

http://reactivemusic.net/?p=5859

Twitter streaming example

http://reactivemusic.net/?p=5786

MBTA bus data

 Sonification of Mass Ave buses, from Harvard to Dudley

http://reactivemusic.net/?p=17524


Stock market music

http://reactivemusic.net/?p=12029

More sonification projects

Vine API mashup

By Steve Hensley

Using Max/MSP/jitter

local file: tkzic/stevehensely/shensley_maxvine.maxpat

Audio sensing gloves for spacesuits

By Christopher Konopka at future, music, technology

http://futuremusictechnology.com

Computer Vision

Sensing motion with video using frame subtraction

by Adam Rokhsar

http://reactivemusic.net/?p=7005

local file: max-projects/frame-subtraction
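
The technique itself fits in a few lines. A minimal Python/OpenCV sketch (webcam index and frame count are arbitrary): subtract consecutive grayscale frames and use the average difference as a motion value to drive sound:

    import cv2
    import numpy as np

    cap = cv2.VideoCapture(0)                 # assumes a webcam at index 0
    ok, prev = cap.read()
    prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    for _ in range(300):                      # ~10 seconds at 30 fps
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        motion = float(np.mean(cv2.absdiff(gray, prev)))
        print(motion)                         # map this value to a sound parameter
        prev = gray
    cap.release()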

The brain

Music is stored all across the brain.

Mouse brain wiring diagram

The Allen institute

http://reactivemusic.net/?p=17758 

“Hacking the soul” by Christof Koch at the Allen institute

(An Explanation of the wiring diagram of the mouse brain – at 13:33) http://www.technologyreview.com/emtech/14/video/watch/christof-koch-hacking-the-soul/

OpenWorm project

A complete simulation of the nematode worm, in software, with a Lego body (302 neurons)

http://reactivemusic.net/?p=17744

AARON

Harold Cohen’s algorithmic painting machine

http://reactivemusic.net/?p=17778

Brain plasticity

A perfect pitch pill? http://www.theverge.com/2014/1/6/5279182/valproate-may-give-humans-perfect-pitch-by-resetting-critical-periods-in-brain

DNA

Could we grow music producing organisms? http://reactivemusic.net/?p=18018

 

Two possibilities

Rejecting technology?
An optimistic future?

There is a quickening of discovery: internet collaboration, open source, Linux, GitHub, Raspberry Pi, Pd, SDR.

“Robots and AI will help us create more jobs for humans — if we want them. And one of those jobs for us will be to keep inventing new jobs for the AIs and robots to take from us. We think of a new job we want, we do it for a while, then we teach robots how to do it. Then we make up something else.”

“…We invented machines to take x-rays, then we invented x-ray diagnostic technicians which farmers 200 years ago would have not believed could be a job, and now we are giving those jobs to robot AIs.”

Kevin Kelly – January 7, 2015, reddit AMA http://www.reddit.com/r/Futurology/comments/2rohmk/i_am_kevin_kelly_radical_technooptimist_digital/

Will people be marrying robots in 2050? http://www.livescience.com/1951-forecast-sex-marriage-robots-2050.html

“What can you predict about the future of music” by Michael Gonchar at The New York Times http://reactivemusic.net/?p=17023

Jim Morrison predicts the future of music:

https://www.youtube.com/watch?v=OWmMVmiGJD0

More areas to explore

Hearing voices

A presentation for Berklee BTOT 2015 http://www.berklee.edu/faculty

(KITT dashboard by Dave Metlesits)

The voice was the first musical instrument. Humans are not the only source of musical voices. Machines have voices. Animals too.

Topics

  • synthesizing voices (formant synthesis, text to speech, Vocaloid)
  • processing voices (pitch-shifting, time-stretching, vocoding, filtering, harmonizing)
  • voices of the natural world
  • fictional languages and animals
  • accents
  • speech and music recognition
  • processing voices as pictures
  • removing music from speech
  • removing voices

Voices

We instantly recognize people and animals by their voices. As artists, we work to develop our own voices. Voices contain information beyond words. Think of R2D2 or Chewbacca.

There is also information between words: “Palin Biden Silences” David Tinapple, 2008: http://vimeo.com/38876967

Synthesizing voices

The vocal spectrum

What’s in a voice?

Singing chords

Humans acting like synthesizers.

More about formants

Text to speech

Teaching machines to talk.


  • phonemes (unit of sound)
  • diphones (combination of phonemes) (Mac OS “Macintalk 3 pro”)
  • morphemes (unit of meaning)
  • prosody (musical quality of speech)

Methods

  • articulatory (anatomical model)
  • formant (additive synthesis) (Speak & Spell)
  • concatenative (building blocks) (Mac OS)
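
As a sketch of the formant approach: filter a pulse train at the fundamental through a few bandpass filters at formant frequencies. The formant values below are textbook approximations for an “ah” vowel, not taken from any particular synthesizer:

    import numpy as np
    from scipy.signal import butter, sosfilt

    def vowel(f0=110.0, formants=(700.0, 1220.0, 2600.0), dur=1.0, sr=44100):
        """Crude formant synthesis: impulse train -> parallel bandpass filters."""
        t = np.arange(int(sr * dur)) / sr
        pulses = np.diff(np.floor(t * f0), prepend=0.0)   # impulse at each period
        out = np.zeros(len(t))
        for fc in formants:
            sos = butter(2, [0.9 * fc, 1.1 * fc], btype='bandpass', fs=sr, output='sos')
            out += sosfilt(sos, pulses)
        return out / np.max(np.abs(out))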

Try the ‘say’ command (in Mac OS terminal), for example: say hello

More about text to speech

Vocoders

Combining the energy of voice with musical instruments (convolution)

  • Peter Frampton “talkbox”: https://www.youtube.com/watch?v=EqYDQPN_nXQ (about 5:42) – Where is the exciting audience noise in this video?
  • Ableton Live example: Local file: Max/MSP: examples/effects/classic-vocoder-folder/classic_vocoder.maxpat
  • Max vocoder tutorial (In the frequency domain), by dude837 – Sam Tarakajian http://reactivemusic.net/?p=17362 (local file: dude837/4-vocoder/robot-master.maxpat
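
Under the hood a channel vocoder is a filter bank plus envelope followers. A toy Python/scipy sketch (band count and cutoffs are arbitrary; assumes modulator and carrier are equal-length mono arrays):

    import numpy as np
    from scipy.signal import butter, sosfilt

    def channel_vocoder(modulator, carrier, sr=44100, n_bands=16,
                        fmin=100.0, fmax=8000.0):
        """Impose the modulator's per-band energy envelope onto the carrier."""
        edges = np.geomspace(fmin, fmax, n_bands + 1)
        env_sos = butter(2, 50.0, btype='lowpass', fs=sr, output='sos')
        out = np.zeros(len(carrier))
        for lo, hi in zip(edges[:-1], edges[1:]):
            band = butter(2, [lo, hi], btype='bandpass', fs=sr, output='sos')
            env = sosfilt(env_sos, np.abs(sosfilt(band, modulator)))  # envelope follower
            out += sosfilt(band, carrier) * env
        return out / np.max(np.abs(out))
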
More about vocoders

Vocaloid

By Yamaha

(text + notation = singing)

Demo tracks: https://www.youtube.com/watch?v=QWkHypp3kuQ

Vocaloop device http://vocaloop.jp/ demo: https://www.youtube.com/watch?v=xLpX2M7I6og#t=24

Processing voices

Transformation

Pitch transposing a baby http://reactivemusic.net/?p=2458

Real time pitch shifting

Autotune: the “T-Pain effect” (I Am T-Pain by Smule), “Lollipop” by Lil Wayne, “Woods” by Bon Iver https://www.youtube.com/watch?v=1_cePGP6lbU

Autotuna in Max 7

by Matthew Davidson

Local file: max-teaching-examples/autotuna-test.maxpat

InstantDecomposer in Pure Data (Pd)

by Katja Vetter

http://www.katjaas.nl/slicejockey/slicejockey.html

Autocorrelation: (helmholtz~ Pd external) “Helmholtz finds the pitch” http://www.katjaas.nl/helmholtz/helmholtz.html

(^^ is input pitch, preset #9 is normal) – a minimal autocorrelation sketch follows the file list below.

  • local file: InstantDecomposer version: tkzic/pdweekend2014/IDecTouch/IDecTouch.pd
  • local file: slicejockey2test2/slicejockey2test2.pd
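
The textbook version of autocorrelation pitch detection, in Python/numpy (helmholtz~ refines this considerably; this is only the core idea):

    import numpy as np

    def pitch_autocorr(x, sr=44100, fmin=50.0, fmax=1000.0):
        """Pick the lag with the strongest autocorrelation in the allowed range."""
        x = x - np.mean(x)
        ac = np.correlate(x, x, mode='full')[len(x) - 1:]   # lags >= 0
        lo, hi = int(sr / fmax), int(sr / fmin)
        lag = lo + np.argmax(ac[lo:hi])
        return sr / lag
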
Phasors and Granular synthesis

Disassembling time into very small pieces

Time-stretching

Adapted from Andy Farnell, “Designing Sound”

http://reactivemusic.net/?p=11385

Download these patches from: https://github.com/tkzic/max-projects folder: granular-timestretch

  • Basic granular synthesis: graintest3.maxpat
  • Time-stretching: timestretch5.maxpat
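
The core of granular time-stretching, sketched in Python/numpy (grain size and overlap are arbitrary): step through the input more slowly than through the output, overlap-adding windowed grains:

    import numpy as np

    def granular_stretch(x, stretch=2.0, grain=2048):
        """Overlap-add windowed grains; input hop = output hop / stretch."""
        hop_out = grain // 2
        hop_in = max(1, int(hop_out / stretch))
        win = np.hanning(grain)
        out = np.zeros(int(len(x) * stretch) + grain)
        pos_in = pos_out = 0
        while pos_in + grain <= len(x) and pos_out + grain <= len(out):
            out[pos_out:pos_out + grain] += x[pos_in:pos_in + grain] * win
            pos_in += hop_in
            pos_out += hop_out
        peak = np.max(np.abs(out))
        return out / peak if peak > 0 else out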

More about phasors and granular synthesis

Phase vocoder

…coming soon

Sonographic sound processing

Changing sound into pictures and back into sound

by Tadej Droljc

 http://reactivemusic.net/?p=16887

(Example of 3d speech processing at 4:12)

local file: SSP-dissertation/4 – Max/MSP/Jitter Patch of PV With Spectrogram as a Spectral Data Storage and User Interface/basic_patch.maxpat

Try recording a short passage, then set bound mode to 4, and click autorotate.

Speech to text

Understanding the meaning of speech

The Google Speech API

A conversation with a robot in Max

http://reactivemusic.net/?p=9834

Google speech uses neural networks, statistics, and large quantities of data.

More about speech to text

Voices of the natural world

Changes in the environment reflected by sound

Fictional languages and animals

“You can talk to the animals…”

Pig creatures example: http://vimeo.com/64543087

  • 0:00 Neutral
  • 0:32 Single morphemes – neutral mode
  • 0:37 Series, with unifying sounds and breaths
  • 1:02 Neutral, layered
  • 1:12 Sad
  • 1:26 Angry
  • 1:44 More Angry
  • 2:11 Happy

What about Jar Jar Binks?

Accents

The sound changes but the words remain the same.

The Speech accent archive http://reactivemusic.net/?p=9436

Finding and removing music in speech

We are always singing.

Jamming with speech

Removing music from speech

SMS-tools

by Xavier Serra and UPF

Harmonic Model Plus Residual (HPR) – Build a spectrogram using STFT, then identify where there is strong correlation to a tonal harmonic structure (music). This is the harmonic model of the sound. Subtract it from the original spectrogram to get the residual (noise).

[spectrogram screenshots]

Settings for above example:

  • Window size: 1800 (SR / f0 * lobeWidth) 44100 / 200 * 8 = 1764
  • FFT size: 2048
  • Mag threshold: -90
  • Max harmonics: 30
  • f0 min: 150
  • f0 max: 200
Feature detection

  • time dependent
  • Low level features: harmonicity, amplitude, fundamental frequency
  • High level features: mood, genre, danceability

Acoustic Brainz: (typical analysis page) http://reactivemusic.net/?p=17641

Essentia (open source feature detection tools)  https://github.com/MTG/essentia

Freesound (vast library of sounds):  https://www.freesound.org – look at “similar sounds”

Removing voices from music

A sad thought

Phase cancellation encryption

This method was used to send secret messages during World War II. It’s now used in cell phones to get rid of echo. It’s also used in noise-canceling headphones.

http://reactivemusic.net/?p=8879

max-projects/phase-cancellation/phase-cancellation-example.maxpat
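
The principle in a few lines of Python/numpy – the receiver holds a copy of the masking noise and subtracts it, which is the same cancellation idea described above (the sine wave simply stands in for speech):

    import numpy as np

    rng = np.random.default_rng(0)
    sr = 44100
    t = np.arange(sr) / sr
    message = np.sin(2 * np.pi * 440 * t)    # stands in for speech
    noise = rng.normal(0.0, 1.0, sr)         # the shared secret "noise record"
    transmitted = message + noise            # intercepted, it sounds like noise
    recovered = transmitted - noise          # subtracting the copy cancels the mask
    print(np.allclose(recovered, message))   # True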

Center channel subtraction

What is not left and not right?

Ableton Live – utility/difference device: http://reactivemusic.net/?p=1498 (Alison Krauss example)

Local file: Ableton-teaching-examples/vocal-eliminator
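
The utility/difference trick reduces to one subtraction. A minimal Python/numpy sketch, assuming the stereo signal is an (n, 2) array:

    import numpy as np

    def remove_center(stereo):
        """Anything identical in both channels (typically the lead vocal)
        cancels when you subtract right from left; the 'sides' remain, in mono."""
        return stereo[:, 0] - stereo[:, 1]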

More experiments

Questions

  • Why do most people not like the recorded sound of their voice?
  • Can voice be used as a controller?
  • How do you recognize voices?
  • Does speech recognition work with singing?
  • How does the Google Speech API know the difference between music and speech?
  • How can we listen to ultrasonic animal sounds?
  • What about animal translators?