Using the Google speech API
(updated locally 1/21/2024: changed the binary path for sox to the Homebrew location, /opt/homebrew/bin/sox, in [p call-google-speech]. Also changed some of the UI and logic for manual writing and sending.)
(updated 1/21/2021)
This project demonstrates the Google Speech API. It records speech in Max, processes it using the Google API, and displays the result in a Max [message] object.
download
https://github.com/tkzic/internet-sensors
folder: google-speech
files
main patch
- speech-to-google-text-api6.maxpat
abstractions and other files
- JSON-google-speech.js (parses JSON response from Google API)
- ms-counter.maxpat (manages audio recording buffer)
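To illustrate the JSON parsing step, here is a minimal Python sketch. It is not the contents of JSON-google-speech.js. It assumes the response is one JSON object per line, each with a "result" list whose entries hold "alternative" items with a "transcript" and an optional "confidence". That shape, the function name best_transcript, and the sample string are assumptions for illustration only.

```python
import json


def best_transcript(raw_response):
    """Return (transcript, confidence) for the highest-confidence alternative, or None.

    Assumes one JSON object per line (an assumption about the response format).
    """
    best = None
    for line in raw_response.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            payload = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(payload, dict):
            continue
        for result in payload.get("result", []):
            for alt in result.get("alternative", []):
                text = alt.get("transcript")
                if text is None:
                    continue
                # Alternatives without a confidence value are treated as 0.0
                score = alt.get("confidence", 0.0)
                if best is None or score > best[1]:
                    best = (text, score)
    return best


# Hypothetical sample response, made up for this example
sample = (
    '{"result":[]}\n'
    '{"result":[{"alternative":[{"transcript":"hello world","confidence":0.92},'
    '{"transcript":"hello word"}],"final":true}],"result_index":0}'
)

print(best_transcript(sample))
```

In the Max patch the equivalent result ends up in the [message] object. The sketch returns None when no transcript is found, for example when the recording was silent.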
external Max objects
- [shell] from https://github.com/jeremybernstein/shell/releases/tag/1.0b2. Download this external and add its folder under Options | File Preferences in Max.
external programs
sox: the sox audio conversion program must be in the computer’s executable file path, e.g., /usr/bin. Alternatively, you can rewrite the [sprintf] input to [aka.shell] with the actual path. In our case we installed sox using Macports. The executable path is /opt/local/bin/sox, which is built into a message object in the subpatcher [call-google-speech].
get sox from: http://sox.sourceforge.net
note: this conversion may not be necessary with recent updates to Max and the Google speech API
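If you are not sure which path to put in the subpatcher, a small script can look for sox. This is only a sketch: it checks the normal executable path first, then the directories mentioned in this post. The helper name find_program and the CANDIDATE_DIRS list are made up for this example, so adjust the list for your own install.

```python
import os
import shutil

# Directories taken from the paths mentioned in this post (an assumption
# about where sox might live; edit for your own system).
CANDIDATE_DIRS = [
    "/opt/homebrew/bin",   # Homebrew
    "/opt/local/bin",      # Macports
    "/usr/local/opt/bin",
    "/usr/bin",
]


def find_program(name, extra_dirs=CANDIDATE_DIRS):
    """Return the full path to an executable, or None if it is not found."""
    # First try the executable search path of the current environment
    found = shutil.which(name)
    if found:
        return found
    # Then fall back to the explicit candidate directories
    for directory in extra_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


path = find_program("sox")
print(path if path else "sox not found; install it or add its directory to the list")
```

Whatever path this prints is the one to use in the message object in [call-google-speech].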
authorization
- none required – so far
instructions
- Open Max patch: speech-to-google-text-api6
- Turn on audio
- Press the spacebar. Start talking. Press the spacebar again when you are finished. The translation will begin automatically.
Note: If you have a slow internet connection you may need to tweak the various delay times in the [call-google-speech] subpatch.
send Tweets using speech
Max [send] and [receive] objects pass data from this project to other projects that send Tweets from Max. Just run the patches at the same time.
- Using curl: https://reactivemusic.net/?p=5447
- Using ruby: https://reactivemusic.net/?p=5818
Also, check out how this project is integrated into the Pandorabots chatbot API project
https://reactivemusic.net/?p=9834
Or use it with anything else. The Google translation is amazingly accurate.
revision history
- 4/24/2016: The call-google-speech subpatch now needs an explicit path to sox. In my Macports version the path is /usr/local/opt/bin/sox.
- 5/11/2014: The newest version requires Max 6.1.7 (for JSON parsing). Also updated to Google Speech API v2.
- 3/26/2014: Updated to use auto-record features developed for chatbot conversations.