Speech to text in Max

Using the Google speech API

(updated locally 1/21/2024 – changed the sox binary path to the Homebrew location, /opt/homebrew/bin/sox, in [p call-google-speech])

Also changed some of the UI and logic for manual writing and sending.

(updated 1/21/2021)

This project demonstrates the Google speech API. It records speech in Max, processes it using the Google API, and displays the result in a Max [message] object.

download

https://github.com/tkzic/internet-sensors

folder: google-speech

files

main patch
  • speech-to-google-text-api6.maxpat
abstractions and other files
  • JSON-google-speech.js (parses JSON response from Google API)
  • ms-counter.maxpat (manages audio recording buffer)
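
For readers who want to see the parsing step outside of Max, here is a minimal Python sketch of the kind of work JSON-google-speech.js does. It is not the code from the patch. The response layout (newline-separated JSON objects, each with a "result" list holding "alternative" entries that carry a "transcript") is an assumption about the Speech API v2 output, and the sample string is made up for illustration.

```python
import json


def extract_transcript(raw_response: str) -> str:
    """Return the first transcript found in a newline-delimited JSON reply, or ""."""
    for line in raw_response.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            payload = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore anything that is not valid JSON
        if not isinstance(payload, dict):
            continue
        for result in payload.get("result", []):
            alternatives = result.get("alternative", [])
            if alternatives:
                # Assumption: the first alternative is the preferred one.
                return alternatives[0].get("transcript", "")
    return ""


# Hypothetical sample reply, not captured from the real service.
sample = (
    '{"result":[]}\n'
    '{"result":[{"alternative":[{"transcript":"hello world",'
    '"confidence":0.95}],"final":true}],"result_index":0}'
)
print(extract_transcript(sample))
```

The function returns an empty string when no transcript is present, which maps naturally onto leaving the [message] object unchanged.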

external Max objects

external programs

sox: the sox audio conversion program must be in the computer’s executable file path, e.g., /usr/bin. Alternatively, you can rewrite the [sprintf] input to [aka.shell] with the actual path. In our case we installed sox using Macports. The executable path is /opt/local/bin/sox, which is built into a message object in the subpatcher [call-google-speech].

get sox from: http://sox.sourceforge.net

note: this conversion may not be necessary with recent updates to Max and the Google speech API
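
If you are unsure where sox lives on your machine, the lookup can be sketched in Python as below. This is only an illustration, not part of the patch. The candidate paths are the ones mentioned in this post. The conversion shown (WAV to mono FLAC at 16 kHz) and the file names are assumptions, since the exact sox arguments used in [call-google-speech] are not listed here.

```python
import shutil
from typing import List, Optional, Sequence

# Install locations mentioned in this post; adjust for your own system.
CANDIDATE_PATHS = ("/opt/homebrew/bin/sox", "/opt/local/bin/sox", "/usr/bin/sox")


def find_sox(candidates: Sequence[str] = CANDIDATE_PATHS) -> Optional[str]:
    """Return a usable sox path: first from PATH, then from the candidate list."""
    found = shutil.which("sox")
    if found:
        return found
    for path in candidates:
        # shutil.which also accepts a full path and checks that it is executable.
        if shutil.which(path):
            return path
    return None


def build_sox_command(sox_path: str, src: str, dst: str, rate: int = 16000) -> List[str]:
    """Build an argument list that resamples src to a mono file at the given rate."""
    # Assumed settings: -r sets the output sample rate, -c the channel count.
    return [sox_path, src, "-r", str(rate), "-c", "1", dst]


sox = find_sox()
if sox is None:
    print("sox not found; install it or put the full path in the patch")
else:
    print(" ".join(build_sox_command(sox, "speech.wav", "speech.flac")))
```

Whatever path this prints is the one to paste into the message object in [call-google-speech].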

authorization

  • none required – so far
This may be changing.
Insert here: how to get a speech-api key from Google 

instructions

  • Open Max patch: speech-to-google-text-api6
  • Turn on audio
  • Press the spacebar. Start talking. Press the spacebar again when you are finished. The speech-to-text translation will begin automatically.
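
The spacebar behaviour described above is a simple toggle: the first press starts recording, and the second press stops it and kicks off processing. A minimal Python model of that logic follows. The class and callback names are hypothetical, and this is not how the Max patch implements it.

```python
from typing import Callable


class RecordToggle:
    """Model of the two-press workflow: press to record, press again to process."""

    def __init__(self, on_finished: Callable[[], None]) -> None:
        self.recording = False
        self.on_finished = on_finished  # called when recording stops

    def press(self) -> str:
        if not self.recording:
            self.recording = True
            return "recording"
        self.recording = False
        self.on_finished()  # stands in for the automatic speech-to-text step
        return "processing"


events = []
toggle = RecordToggle(on_finished=lambda: events.append("send audio"))
print(toggle.press())
print(toggle.press())
print(events)
```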

Note: If you have a slow internet connection, you may need to tweak the various delay times in the [call-google-speech] subpatch.

send Tweets using speech

Max [send] and [receive] objects pass data from this project to other projects that send Tweets from Max. Just run the patches at the same time.

Also, check out how this project is integrated into the Pandorabots chatbot API project:

https://reactivemusic.net/?p=9834

Or anything else. The Google translation is amazingly accurate.

revision history

  • 4/24/2016: An explicit path to sox is now needed in the call-google-speech subpatch. In my Macports version the path is /usr/local/opt/bin/sox.
  • 5/11/2014: The newest version requires Max 6.1.7 (for JSON parsing). Also updated to Google Speech API v2.
  • 3/26/2014: Updated to use auto-record features developed for chatbot conversations.