Using Google speech API and Pandorabots API
This project is an extension of the speech-to-text project (http://reactivemusic.net/?p=4690). You might want to try running that project first to get the Google speech API working.
- Everything runs in one Max patch
- Voice auto-detect mode
- Menu selection of chatbots and voices
- Filtering of non-speakable text (like HTML tags)
- Python script now runs from the patch’s current directory, using a relative path
- Refinements to the recording and chatbot engines
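The filtering of non-speakable text mentioned in the list above could look roughly like this if it were done on the Python side. This is only a sketch: the post does not show how the patch actually filters chatbot replies, and the names `speakable` and `_TextOnly` are made up for illustration. It uses the standard library HTML parser, replaces each tag with a space, and collapses whitespace so the synthesizer gets plain text.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Hypothetical helper: keeps text nodes, replaces every tag with a space."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        self.parts.append(" ")

    def handle_endtag(self, tag):
        self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def speakable(reply):
    """Return the reply with HTML tags removed and whitespace collapsed."""
    parser = _TextOnly()
    parser.feed(reply)
    parser.close()
    return " ".join("".join(parser.parts).split())
```

For example, `speakable("Hello<br>world &amp; <b>friends</b>")` gives `Hello world & friends`.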
main Max patch
abstractions and other files
Max external objects
- [aka.shell] from http://www.iamas.ac.jp/~aka/max/ – download this external and add the folder to Options | File Preferences, in Max
sox: the sox audio conversion program must be in the computer’s executable search path (e.g., /usr/bin), or you can rewrite the [sprintf] input to [aka.shell] with the actual path.
get sox from: http://sox.sourceforge.net
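If you are not sure whether sox is on the search path, a quick check from Python is one way to find out before editing the patch. This is a minimal sketch, not part of the project: `find_sox` is a hypothetical helper, and it relies only on the standard library `shutil.which`.

```python
import shutil


def find_sox(explicit_path=None):
    """Return a usable path to sox, or raise FileNotFoundError.

    explicit_path is optional: pass the full path if sox is installed
    somewhere that is not on the executable search path.
    """
    candidate = explicit_path or "sox"
    found = shutil.which(candidate)
    if found is None:
        raise FileNotFoundError(
            "could not find an executable at or named %r" % candidate
        )
    return found
```

Calling `find_sox()` with no argument searches the path; the string it returns is what you would put into the [sprintf] input if you need an explicit path.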
- Open robot-converstaion.maxpat and turn on audio
- select chatbot as destination
- For manual record (push to talk) use the toggle
- For auto-record: press the + key to activate the voice sensor (press the - key to deactivate)
- ask a question
The goal of this update is to get 2 or more chatbots conversing via speech through the air. This prototype is almost there, but has hit an unexpected setback: the Google speech API is not very good at decoding synthesized speech from the built-in speech synthesizer in Mac OS. It’s not bad, but it only works well with the default male voice, Alex.
One idea would be to pitch-shift the male voice to make it sound female. This would require some changes to the audio routing. Currently the speech output happens via the operating system, so it would need to be piped back into Max. That isn’t such a bad idea anyway, because then we could possibly run two instances on the same computer and just route the audio internally.
You may be wondering why I didn’t just connect the two chatbots via text, skipping the speech recognition. Well, it’s more interesting to have devices speaking through the air.
Another thing that needs fixing: currently the API call to Google speech causes ‘blocking’ in Max. It would be better to have the call happen in a background process that sends a message back to Max when the processing is completed. This way we could time out if there is a bad internet connection or other network error. This could be done using a shell script.
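The timeout part of that idea can be sketched in Python rather than a shell script. This is an assumption-heavy illustration, not the project’s code: `run_with_timeout` is a hypothetical helper, and `STAND_IN_CMD` is a placeholder for whatever command actually makes the speech request. How the result gets back to Max is left open here.

```python
import subprocess
import sys

# Placeholder command; the real one would be the script that calls the speech API.
STAND_IN_CMD = [sys.executable, "-c", "print('hello')"]


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (status, text).

    status is "ok", "error" (non-zero exit code) or "timeout".
    text is stdout on success, stderr on error, empty on timeout.
    """
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this
        return "timeout", ""
    if done.returncode != 0:
        return "error", done.stderr.strip()
    return "ok", done.stdout.strip()
```

A wrapper script launched in the background could call `run_with_timeout(STAND_IN_CMD, 10)` and report the status, so a dead network connection shows up as a `timeout` result instead of leaving the patch waiting.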
- 4/24/2016: An explicit path to sox is needed in the call-google-speech subpatch. In my MacPorts version the path is /usr/local/opt/bin/sox.
- 6/6/2014: re-added missing pandorabots.txt (list of chatbots) – also noticed that pandorabots.com was not available. May need to look for another site.
- 5/11/2014: The newest version requires Max 6.1.7 (for JSON parsing). It has also been updated to Google Speech API v2.
- Note: Instructions for getting a real key from Google – which will need to be inserted into the patch. http://www.chromium.org/developers/how-tos/api-keys – so far we have been getting by with common keys from a github site (see notes in next link)