Speech to text in Max

Using the Google speech API

(updated locally 1/21/2024 – changed the sox binary path to the Homebrew location, /opt/homebrew/bin/sox, in [p call-google-speech])

Also changed some of the UI and logic for manual writing and sending.

(updated 1/21/2021)

This project demonstrates the Google speech API. It records speech in Max, processes it using the Google API, and displays the result in a Max [message] object.

download

https://github.com/tkzic/internet-sensors

folder: google-speech

files

main patch
  • speech-to-google-text-api6.maxpat
abstractions and other files
  • JSON-google-speech.js (parses JSON response from Google API)
  • ms-counter.maxpat (manages audio recording buffer)

external Max objects

external programs

sox: the sox audio conversion program must be in the computer’s executable file path, i.e., /usr/bin – or you can rewrite the [sprintf] input to [aka.shell] with the actual path. In our case we installed sox using Macports. The executable path is /opt/local/bin/sox, which is built into a message object in the subpatcher [call-google-speech].

get sox from: http://sox.sourceforge.net

note: this conversion may not be necessary with recent updates to Max and the Google speech API
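Since the sox path has moved between installs (Macports, Homebrew), it can help to check where sox actually lives before editing the message object in [call-google-speech]. The following is a minimal Python sketch, not part of the Max patch. The function name find_sox and the candidate list are assumptions for illustration; the paths themselves are the ones mentioned in this post.

```python
import os
import shutil

# Install locations mentioned in this post; adjust for your own machine.
CANDIDATE_SOX_PATHS = [
    "/opt/homebrew/bin/sox",  # Homebrew
    "/opt/local/bin/sox",     # Macports
    "/usr/bin/sox",
]


def find_sox(candidates=CANDIDATE_SOX_PATHS):
    """Return the first usable sox path, or None if sox cannot be found.

    find_sox is a hypothetical helper, not something shipped with the patch.
    """
    for path in candidates:
        # A usable binary is a regular file with execute permission.
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    # Fall back to whatever the shell's PATH would resolve.
    return shutil.which("sox")


if __name__ == "__main__":
    print(find_sox() or "sox not found; install it or edit the path in the patch")
```

Whatever path this reports is the one to paste into the message object.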

authorization

  • none required – so far
This may be changing.
Insert here: how to get a speech-api key from Google 

instructions

  • Open Max patch: speech-to-google-text-api6
  • Turn on audio
  • Press the spacebar. Start talking. Press the spacebar again when you are finished. The transcription will begin automatically.

Note: If you have a slow internet connection you may need to tweak the various delay times in the [call-google-speech] subpatch.

send Tweets using speech

Max [send] and [receive] objects pass data from this project to other projects that send Tweets from Max. Just run the patches at the same time.

Also, check out how this project is integrated into the Pandorabots chatbot API project

https://reactivemusic.net/?p=9834

Or anything else. Google's speech recognition is amazingly accurate.

revision history

  • 4/24/2016: need to have explicit path to sox, in the call-google-speech subpatch. In my Macports version the path is /usr/local/opt/bin/sox.
  • 5/11/2014: The newest version requires Max 6.1.7 (for JSON parsing). Also have updated to Google Speech API v2.
  • update 3/26/2014 to use auto-record features developed for chatbot conversations
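For reference, the JSON parsing step mentioned in the revision history can be sketched outside of Max. This Python example is an illustration only, not a port of JSON-google-speech.js. It assumes the v2 response arrives as one or more JSON objects, one per line, each with a "result" list whose entries hold "alternative" items with a "transcript" field. That shape is an assumption and is not documented in this post, so check it against a real response.

```python
import json


def extract_transcript(response_text):
    """Return the first transcript in a line-delimited JSON response, or None.

    The response shape handled here is an assumption (see the note above).
    """
    for line in response_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            data = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not valid JSON
        if not isinstance(data, dict):
            continue
        for result in data.get("result", []):
            for alternative in result.get("alternative", []):
                if "transcript" in alternative:
                    return alternative["transcript"]
    return None


if __name__ == "__main__":
    # Hypothetical sample text, shaped like the assumed response.
    sample = (
        '{"result":[]}\n'
        '{"result":[{"alternative":[{"transcript":"hello world"}],'
        '"final":true}],"result_index":0}'
    )
    print(extract_transcript(sample))
```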

Cosm with Max

update 6/2014: Cosm is now Xively. Have not re-tested examples below. There is a working Twitter example at internet sensors projects: https://reactivemusic.net/?p=5859

original post

notes

Today I was finally able to get this working: reading a Cosm (Pachube) feed from curl and from Max. Here is an example that works in curl (replace API-KEY with your actual key):

curl "http://api.cosm.com/v2/feeds/76490/datastreams/Power.xml?key=API-KEY"

You can get JSON responses by leaving off the .xml extension or replacing it with .json

It's critical to use “key=…” not “X-ApiKey=…” like in the Cosm documentation, or you will get permission errors from curl and Max.
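The URL pattern above can be sketched in Python as follows. This only builds the request URL, with the key passed as the “key” query parameter and the response format chosen by the extension. The function name cosm_datastream_url is hypothetical, and since Cosm has since been renamed the endpoint itself is untested.

```python
from urllib.parse import urlencode


def cosm_datastream_url(feed_id, datastream, api_key, fmt="xml"):
    """Build a datastream URL in the form used by the curl example.

    Hypothetical helper. Pass fmt="json" for JSON, or fmt=None to leave
    the extension off.
    """
    url = f"http://api.cosm.com/v2/feeds/{feed_id}/datastreams/{datastream}"
    if fmt:
        url += f".{fmt}"
    # The key goes in the query string as "key", not "X-ApiKey".
    return url + "?" + urlencode({"key": api_key})


if __name__ == "__main__":
    print(cosm_datastream_url(76490, "Power", "API-KEY"))
    print(cosm_datastream_url(76490, "Power", "API-KEY", fmt="json"))
```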

I was also able to get the Max project called “pachube report” from Nicholas Marechal to work (requires jasch and cnmat externals)

http://cycling74.com/toolbox/pachube-tools/

This patch uses the typical jit.uldl and jit.textfile objects and some regexp parsing tricks.

Next trick will be creating a feed and sending it to Cosm.


Google custom search API

notes

An API for custom searches from Google, and a method to spoof the “custom” part of it.

https://code.google.com/apis/console/

https://developers.google.com/custom-search/v1/getting_started

Use this method to build a custom search which searches the whole web:

http://stackoverflow.com/questions/4082966/google-web-search-api-deprecated-what-now

Here is an example of a search using this method with curl. Note that the real key has been replaced with API-KEY. You will need to get your own API key (see above).

curl "https://www.googleapis.com/customsearch/v1?key=API-KEY&cx=012117491442732664551:egvalbpelhq&q=lectures"
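The same request URL can be assembled in Python, which takes care of encoding search terms that contain spaces. This is a sketch that only builds the URL. The function name custom_search_url is hypothetical, and the key, cx, and q parameters are taken from the curl example above.

```python
from urllib.parse import urlencode


def custom_search_url(api_key, engine_id, query):
    """Build a custom search URL like the curl example (hypothetical helper)."""
    params = {"key": api_key, "cx": engine_id, "q": query}
    # Keep ":" unescaped so the cx value reads the same as in the curl example.
    return "https://www.googleapis.com/customsearch/v1?" + urlencode(params, safe=":")


if __name__ == "__main__":
    print(custom_search_url("API-KEY", "012117491442732664551:egvalbpelhq", "lectures"))
```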

Web audio from the Soundcloud API

notes

(update) to get your client-id from Soundcloud: From your home page, select more | developers | my apps.

This looks like the easiest way to play streaming content from Soundcloud over the web. Maybe this example could be adjusted to run in Max.

http://stackoverflow.com/questions/13455956/setup-web-audio-api-source-node-from-soundcloud

Example API URLs:

This one streams a track (replace CLIENT-ID with your real client id):

http://api.soundcloud.com/tracks/6981096/stream?client_id=CLIENT-ID

This one returns an XML file filled with tracks that can be played:

https://api.soundcloud.com/tracks?client_id=CLIENT-ID

See API reference:

http://developers.soundcloud.com/docs/api/reference#users

(update) Use /resolve to get the user id and user info, given the name – like this:

http://api.soundcloud.com/resolve.json?url=http://soundcloud.com/tkzic&client_id=CLIENT-ID
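The stream and /resolve URLs above can be built in Python like this. It is a sketch that only constructs the URLs, and the function names are hypothetical. One difference from the hand-written example: the profile URL passed to /resolve is percent-encoded here, which is the safer form for a URL nested inside a query string.

```python
from urllib.parse import urlencode

API_BASE = "http://api.soundcloud.com"


def stream_url(track_id, client_id):
    """URL that streams a single track (hypothetical helper)."""
    return f"{API_BASE}/tracks/{track_id}/stream?" + urlencode({"client_id": client_id})


def resolve_url(profile_url, client_id):
    """URL that looks up user info from a profile URL (hypothetical helper)."""
    params = {"url": profile_url, "client_id": client_id}
    return f"{API_BASE}/resolve.json?" + urlencode(params)


if __name__ == "__main__":
    print(stream_url(6981096, "CLIENT-ID"))
    print(resolve_url("http://soundcloud.com/tkzic", "CLIENT-ID"))
```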

Soundcloud short-codes

This code comes from clicking the Share button on the SoundCloud clip, then using the “WordPress Code”.

If you find an old-style SoundCloud post on a 3rd party site, you may need to get the ‘embed’ code and then add it to your post, using the HTML editor.