More conversations with robots in Max

Using Google speech API and Pandorabots API

This project is an extension of the speech-to-text project (http://reactivemusic.net/?p=4690). You might want to try running that project first to get the Google speech API working.

features

  • Everything runs in one Max patch
  • voice auto detect mode
  • menu selection of chat bots and voices
  • filtering of non-speakable text (like HTML tags)
  • python script now runs under current directory of patch using relative path
  • refinements to recording and chatbot engines
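
The filtering of non-speakable text mentioned above can be illustrated with a minimal Python sketch. This is not the code in clean-html.js; the function name strip_unspeakable and the regular expression are assumptions chosen for illustration, and a simple regex like this will not handle every kind of malformed markup.

```python
import html
import re

# Hypothetical helper, not taken from the patch: matches anything that looks like a tag.
TAG_RE = re.compile(r"<[^>]+>")


def strip_unspeakable(text: str) -> str:
    """Remove HTML tags and decode entities so a speech synthesizer reads only words."""
    no_tags = TAG_RE.sub(" ", text)      # replace each tag with a space
    decoded = html.unescape(no_tags)     # turn &amp; into &, &lt; into <, etc.
    return " ".join(decoded.split())     # collapse runs of whitespace


if __name__ == "__main__":
    print(strip_unspeakable("Hello<br/> I am <b>ALICE</b> &amp; friends."))
```

Tags are replaced with a space rather than deleted so that words on either side of a tag such as a line break do not run together.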

download

https://github.com/tkzic/internet-sensors

folder: google-speech

files
main Max patch
  • robot-conversation5.maxpat
abstractions and other files
  • clean-html.js
  • xml2json/xml2json.py
  • JSON-google-speech.js
  • JSON-pandorabot.js
  • autorecord-buffer2.maxpat
  • auto-record-switch.maxpat
  • pandorabots.txt
Max external objects
external programs:

sox: the sox audio conversion program must be in the computer’s executable file path (e.g., /usr/bin), or you can rewrite the [sprintf] input to [aka.shell] with the actual path

get sox from: http://sox.sourceforge.net
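
To check whether sox is on the executable path before opening the patch, something like the following Python sketch may help. The function names find_sox and build_convert_command are hypothetical and are not part of the patch. The command assumes the simplest sox invocation, where the output format is inferred from the output file's extension; the options the patch actually passes are not shown here.

```python
import shutil
from typing import List, Optional


def find_sox(explicit_path: Optional[str] = None) -> Optional[str]:
    """Return a usable path to sox, or None if it cannot be found.

    If explicit_path is given, only that location is checked; otherwise
    the directories on the PATH environment variable are searched.
    """
    return shutil.which(explicit_path or "sox")


def build_convert_command(sox: str, src: str, dst: str) -> List[str]:
    """Build a basic 'sox infile outfile' command (format inferred from extension)."""
    return [sox, src, dst]


if __name__ == "__main__":
    sox = find_sox()
    if sox is None:
        print("sox not found on PATH; use an explicit path instead")
    else:
        print(" ".join(build_convert_command(sox, "input.wav", "output.flac")))
```

If find_sox() returns None, the explicit path it would take as an argument is the same one you would put into the [sprintf] input to [aka.shell].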

Instructions

  • Open robot-conversation5.maxpat and turn on audio
  • select chatbot as destination
  • For manual record (push to talk) use the toggle 
  • For auto-record: press the + key to activate voice sensor (press – key to deactivate)
  • ask a question

notes

The goal of this update is to get 2 or more chatbots conversing via speech through the air. This prototype is almost there, but has encountered an unexpected setback: The Google speech API is not very good at decoding synthesized speech from the built-in speech synthesizer in Mac OS. It’s not bad, but really only works well with the default male voice: Alex.

One idea would be to pitch shift the male voice to make it sound female. This would require some changes to audio routing. Currently the speech output happens via the operating system, so it would need to be piped back into Max – which isn’t such a bad idea anyway – because then we could possibly run two instances on the same computer and just route the audio internally.

You may be wondering why I didn’t just connect the two chat bots via text, skipping the speech recognition. Well, it’s more interesting to have devices speaking through the air.

Another thing that needs fixing: Currently the API call to Google speech causes ‘blocking’ in Max. It would be better to have the call happen using a background process that sends a message back to Max when the processing is completed. This way we could timeout if there is a bad internet connection or other network error. This could be done using a shell script.
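
The timeout idea can be sketched as follows. The text suggests a shell script; this sketch uses Python instead, only to keep the examples in one language. The function name run_with_timeout is hypothetical, and the command in the demo is a stand-in, not the real call to Google speech. Reporting the result back to Max (for example over a network message) is left out.

```python
import subprocess
import sys
from typing import List, Optional


def run_with_timeout(cmd: List[str], timeout_s: float) -> Optional[str]:
    """Run cmd as a child process and return its stdout.

    Returns None if the process exceeds timeout_s seconds or exits with an
    error, so the caller can give up instead of waiting on a bad connection.
    """
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None  # subprocess.run kills the child before raising
    if done.returncode != 0:
        return None
    return done.stdout


if __name__ == "__main__":
    # Stand-in for a slow network request: sleeps longer than the timeout allows.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow, 0.5))
```

Because the request runs in a separate process, a stalled connection only delays that process, and the caller decides how long it is willing to wait.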

revision history

  • 4/24/2016: an explicit path to sox is now needed in the call-google-speech subpatch. In my MacPorts version the path is /usr/local/opt/bin/sox.
  • 6/6/2014: re-added missing pandorabots.txt (list of chatbots) – also noticed that pandorabots.com was not available. May need to look for another site.
  • 5/11/2014: The newest version requires Max 6.1.7 (for JSON parsing). Also updated to Google Speech API v2.
  • Note: Instructions for getting a real key from Google, which will need to be inserted into the patch, are at http://www.chromium.org/developers/how-tos/api-keys – so far we have been getting by with common keys from a github site (see notes in next link)

Also please see these notes about how to modify the patch with your key – until this gets resolved: http://reactivemusic.net/?p=11035