Conversation with a robot in Max

This project brings together several examples of API programming with Max. The pandorabots.api patch contains an example of using curl to generate an XML response file, then converts XML to JSON using a Python script. The resulting JSON file is read into Max and parsed using the [js] object.

Here is an audio recording of my conversation (using Max) with a text chatbot named ‘Chomsky’

‘Chomsky’ lives at http://pandorabots.com.

My voice gets recorded by Max then converted to text by the Google speech-api.

The text is passed to the Pandorabots API. The chatbot response gets spoken by the aka.speech external which uses the Mac OS built-in text-to-speech system.

Note: The above recording was processed with a ‘silence truncate’ effect because there were  3-5 second delays between responses. In realtime it has the feel of the Houston/Apollo dialogs.

pandorabots-api.maxpat (which handles chatbot responses) gets text input from speech-to-google-text-api2.maxpat – a patch that converts speech to text using the Google speech-API.

https://reactivemusic.net/?p=4690

The output (responses from chatbot) get sent to twitter-search-to-speech2.maxpat which “speaks” using the Mac OS  text-to-speech program using the aka.speech external.

files

Max

  • speech-to-google-text-api2.maxpat
  • JSON-google-speech.js
  • pandorabots-api.maxpat
  • JSON-pandorabot.js
  • text-to-speech2.maxpat

externals:

[authorization]

  • none required

external programs:

  • sox: sox audio conversion program must be in the computer’s executable file path, ie., /usr/bin – or you can rewrite the [sprintf] input to [aka.shell] with the actual path. Get sox from: http://sox.sourceforge.net
  • xml2json (python) in tkzic/internetsensors/: xml2json/xml2json.py and xml2json/setup.py (for translating XML to JSON) – [NOTE] you will need to change the path in the [sprintf] object in pandorabots.api to point to the folder containing this python script.

instructions

  • Open the three Max patches.
    • speech-to-google-text-api2.maxpat
    • pandorabots-api.maxpat
    • text-to-speech2.maxpat
  • Clear the custid in the pandorabots-api patch
  • Start audio in the Google speech patch. Then toggle the mic button and say something.
  • After the first response, go to the pandorabots-api patch and click the new custid – so that the chatbot retains the thread of the conversation.

download:

The files for this project can be downloaded from the intenet-sensors archive at github

https://github.com/tkzic/internet-sensors

Speech to text in Max

Using the Google speech API

(updated locally 1/21/2024 – changed binary path to sox for homebrew /opt/homebrew/bin/sox in [p call-google-speech]

Also changed some of the UI and logic for manual writing and sending.

(updated 1/21/2021)

This project demonstrates the Google speech-API. It records speech in Max, process it using the Google API, and displays the result in a Max [message] object.

download

https://github.com/tkzic/internet-sensors

folder: google-speech

files

main patch
  • speech-to-google-text-api6.maxpat
abstractions and other files
  • JSON-google-speech.js (parses JSON response from Google API)
  • ms-counter.maxpat (manages audio recording buffer)

external Max objects

external programs

sox: sox audio conversion program must be in the computer’s executable file path, ie., /usr/bin – or you can rewrite the [sprintf] input to [aka.shell] with the actual path. In our case we installed sox using Macports. The executable path is /opt/local/bin/sox – which is built into a message object in the subpatcher [call-google-speech]

get sox from: http://sox.sourceforge.net

note: this conversion may not be necessary with recent updates to Max and the Google speech API

authorization

  • none required – so far
This may be changing.
Insert here: how to get a speech-api key from Google 

instructions

  • Open Max patch: speech-to-google-text-api6
  • Turn on audio
  • Press the spacebar. Start talking. Press the spacebar again when you are finished. The translation will begin automatically

Note: If you have a slow internet connection you may need to tweak the various delay times in  the [call google-speech] sub patch.

send Tweets using speech

Max [send] and [receive] objects pass data from this project to other projects that send Tweets from Max. Just run the patches at the same time.

Also, check out how this project is integrated into the Pandorabots chatbot API project

https://reactivemusic.net/?p=9834

Or anything else. The Google translation is amazingly accurate.

revision history

  • 4/24/2016: need to have explicit path to sox, in the call-google-speech subpatch. In my Macports version the path is /usr/local/opt/bin/sox.
  • 5/11/2014: The newest version requires Max 6.1.7 (for JSON parsing). Also have updated to Google Speech API v2.
  • update 3/26/2014 to use auto-record features developed for chatbot conversations

Cosm with Max

update 6/2014: Cosm is now Xively. Have not re-tested examples below. There is a working Twitter example at internet sensors projects: https://reactivemusic.net/?p=5859

original post

notes

Today I was finally able to get this working. Reading a Cosm (Pachube) feed from curl and from Max. Here is an example that works in curl: (replace API-KEY with actual key)

curl http://api.cosm.com/v2/feeds/76490/datastreams/Power.xml?key=API-KEY

You can get JSON responses by leaving off the .xml extension or replacing it with .json

Its critical to use “key=…” not “X-ApiKey=…” like in the cosm documentation, or you will get permission errors from curl and Max.

I was also able to get the Max project called “pachube report” from Nicholas Marechal to work (requires jasch and cnmat externals)

http://cycling74.com/toolbox/pachube-tools/

This patch uses the typical jit.uldl and jit.textfile objects and some regexp parsing tricks.

Next trick will be creating a feed and sending it to Cosm.

 

 

Playing bird calls in Max

Using the xeno-canto API

Note: 2024/01/21 – patch not working in repository, but has been fixed locally with updates to birdcall4.maxpat.

(updated 1/25/2021)

This Max patch retrieves bird call data from the xeno-canto API, then plays an mp3 file of the bird call using the URL from the query.

If you’d like to modify this patch to play the sounds of other birds, you can get the species names from: http://www.xeno-canto.org

Click to hear the bird call from the patch:

download:

https://github.com/tkzic/internet-sensors

folder: bird-calls

files

Max
  • bird-call4.maxpat
  • nothing-detector.maxpat (for tracking progress of html request)

authorization

  • none required

instructions

  • Open the Max patch: bird-call4.maxpat
  • Select a bird from the menu. Wait a few seconds. If the hit-counter increases above zero then the search was successful.
  • Click the button to start audio.
  • Click the button to play a random recording from the query

notes

There are two html queries. The first query retrieves an array of recording records for a selected bird. The second query downloads the .mp3 file with the actual recording.

The patch uses [dict] and [maxurl] to format execute the first query. Then it uses [jit.uldl] to download the .mp3 file.