Tag: web development

What is data mining?

notes

A synopsis at UCLA

http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm

MIT open courseware

http://ocw.mit.edu/courses/sloan-school-of-management/15-062-data-mining-spring-2003/index.htm

University of Texas

http://www.laits.utexas.edu/~anorman/BUS.FOR/course.mat/Alex/

CMU

http://www.autonlab.org/tutorials/

Google oauth 2.0 authorization for devices

What this means: You create an app on a device which doesn’t have a browser. For example, an Arduino, an appliance, or a game console. This procedure shows how to authorize that device to access a user’s account for Google, Twitter, Facebook, etc.,

See this URL for Google instructions: https://developers.google.com/accounts/docs/OAuth2ForDevices

Notes and Google examples (using curl from a command line):

Here is an oauth 2.0 google request for a user code – The client id is obtained using instructions found at the link above.

curl -d "client_id=104588205543369.apps.googleusercontent.com&scope=https://www.googleapis.com/auth/userinfo.email  https://www.googleapis.com/auth/userinfo.profile" https://accounts.google.com/o/oauth2/device/code

Which returned this JSON response:

{
  "device_code" : "4/Gujc7GxpGFSHNlphxVZCK_y10yS6Kq",
  "user_code" : "ibaz70ej9",
  "verification_url" : "http://www.google.com/device",
  "expires_in" : 1800,
  "interval" : 5
}

Then you go to the URL in the response, enter the user code, and follow instructions…

Then from the device you do this…

curl -d "client_id=1045882053369.apps.googleusercontent.com&client_secret=zDP5UVwbqcYzv7rnVieKxnOV&code=4/Gujc7GxpGHNlphxVZCK_y10yS6Kq&grant_type=http://oauth.net/grant_type/device/1.0" https://accounts.google.com/o/oauth2/token

Which returns this response:

{
  "access_token" : "ya29.AHES6ZE2QxqzZyWkGu20lJljEIHYTf08VtggyRF73428w0LQ7lzFP_uw",
  "token_type" : "Bearer",
  "expires_in" : 3600,
  "id_token" : "eyJhbGciOiJSUzI1NiIsImtpZCI6ImJhZGQ4NWFhMmRlZmZkMWFkZWJkNzc2NTgxNWMzZmVjZTM0MmIzNGEifQ.eyJpc3MiOiJhY2NvdW50cy5nb29nbGUuY29tIiwiaWQiOiIxMTExNzg0MjgyNzI3MDgxMTI0NTMiLCJhdWQiOiIxMDQ1ODgyMDUzMzY5LmFwcHMuZ2df9vZ2xldXNlcmNvbnRlbnQuY29tIiwiY2lkIjoiMTA0NTg4MjA1MzM2OS5hcHBzLmdvb2dsZXVzZXJjb250ZW50LmNvbSIsInZlcmlmaWVkX2VtYWlsIjoidHJ1ZSIsInRva2VuX2hhc2gefiOiJvVG9OdS0tYU1DUGhYbUI1S3p4TTN3IiwiZW1haWwiOiJ6aWNhcmVsdEBnb3VsZGFjYWRlbXkub3JnIiwiaGQiOiJnb3VsZGFjYWRlbXkub3JnIiwiaWF0IjoxMzU2MjQ2Mjg2LCJleHAiOjEzNTYyNTAxODZ9.DqIqLtg9m6wlHh5YSFFgXIOgbMW0E2mKR2FdY7PWtNJrt91moqVBe7dQxQPNalQMKhYTapJdVk2MB1oRl7zXEnLIe_VjI3BUwzTKqaG_sS9oRyh14_yqDWeMFru5d7OFUm1Ulwb2lLdWWwtttEVyJiw94oBdR0tuWg0MNkEOkXU",
  "refresh_token" : "1/NuEmigydABgeRwZaRCZbZZckJ-EJFZd8C1YZLURut8s"
}

Now your device can use the access token query string method…

curl https://www.googleapis.com/oauth2/v1/userinfo?access_token=ya29.AHES6ZQxqzZyWkGu20lJljEIHYTf08VtggyRF73428w0LQ7lzFP_uw

Here is the response:

{
 "id": "1111784282727081812453",
 "email": "looney@looney.org",
 "verified_email": true,
 "name": "Tony Tiger",
 "given_name": "Tony",
 "family_name": "Tiger",
 "hd": "looney.org"
}

Or you can use the http header option…

curl -H "Authorization: Bearer ya29.AHKKES6ZQxqzZyWkGu20lJljEIHYTf08VtggyRF73428w0LQ7lzFP_uw" https://www.googleapis.com/oauth2/v1/user info

which should return the exact same response.

[Also see] tkzic/max teaching examples/google-oauth2.0-readme.txt

 

 

 

Merriam Webster API

Dictionary and Thesaurus.

Get a developer API key from here: http://www.dictionaryapi.com

example of a request to the Thesaurus API

curl http://www.dictionaryapi.com/api/v1/references/thesaurus/xml/cheese?key=ee2466d2-07a0-40af-b959-abcdeb125f0ca

Here’s the XML response – filled with synonyms:

<?xml version="1.0" encoding="utf-8" ?>
<entry_list version="1.0">
	<entry id="cheese"><term><hw>cheese</hw></term><fl>noun</fl><sens><mc>that which is of low quality or worth</mc><vi>you wouldn't believe the <it>cheese</it> that the movie studio puts out</vi><syn>cheese, crapola [<it>slang</it>], dreck (<it>also</it> drek), muck, rubbish, sleaze, slop, slush, trash, tripe</syn><rel>camp, kitsch; claptrap, humbug, nonsense; bomb, clinker, clunker, dud, lemon, stinker, turkey; mess, muddle, shambles</rel></sens></entry>
	<entry id="big cheese"><term><hw>big cheese</hw></term><fl>noun</fl><sens><mc>one of high position or importance within a group</mc><vi>thinks he's a <it>big cheese</it> just because he's got a business card</vi><syn>big, big boy, big cheese, bigfoot, biggie, big gun, big leaguer, big-timer, big wheel, bigwig, fat cat, heavy, heavy hitter, heavyweight, high-muck-a-muck (<it>or</it> high-muckety-muck), honcho, kahuna, kingfish, kingpin, major leaguer, muckety-muck (<it>also</it> muck-a-muck <it>or</it> mucky-muck), nabob, nawab, nibs, nob [<it>chiefly British</it>], pooh-bah (<it>also</it> poo-bah), wheel</syn><rel>baron, czar (<it>also</it> tsar <it>or</it> tzar), king, lion, magnate, mogul, prince, tycoon; VIP</rel><near>inferior, subordinate, underling; mediocrity, obscurity</near><ant>lightweight, nobody, nonentity, nothing, shrimp, twerp, whippersnapper, zero, zilch</ant></sens></entry>
</entry_list>

 

 

 

Conversation with a robot in Max

This project brings together several examples of API programming with Max. The pandorabots.api patch contains an example of using curl to generate an XML response file, then converts XML to JSON using a Python script. The resulting JSON file is read into Max and parsed using the [js] object.

Here is an audio recording of my conversation (using Max) with a text chatbot named ‘Chomsky’

‘Chomsky’ lives at http://pandorabots.com.

My voice gets recorded by Max then converted to text by the Google speech-api.

The text is passed to the Pandorabots API. The chatbot response gets spoken by the aka.speech external which uses the Mac OS built-in text-to-speech system.

Note: The above recording was processed with a ‘silence truncate’ effect because there were  3-5 second delays between responses. In realtime it has the feel of the Houston/Apollo dialogs.

pandorabots-api.maxpat (which handles chatbot responses) gets text input from speech-to-google-text-api2.maxpat – a patch that converts speech to text using the Google speech-API.

http://reactivemusic.net/?p=4690

The output (responses from chatbot) get sent to twitter-search-to-speech2.maxpat which “speaks” using the Mac OS  text-to-speech program using the aka.speech external.

files

Max

  • speech-to-google-text-api2.maxpat
  • JSON-google-speech.js
  • pandorabots-api.maxpat
  • JSON-pandorabot.js
  • text-to-speech2.maxpat

externals:

[authorization]

  • none required

external programs:

  • sox: sox audio conversion program must be in the computer’s executable file path, ie., /usr/bin – or you can rewrite the [sprintf] input to [aka.shell] with the actual path. Get sox from: http://sox.sourceforge.net
  • xml2json (python) in tkzic/internetsensors/: xml2json/xml2json.py and xml2json/setup.py (for translating XML to JSON) – [NOTE] you will need to change the path in the [sprintf] object in pandorabots.api to point to the folder containing this python script.

instructions

  • Open the three Max patches.
    • speech-to-google-text-api2.maxpat
    • pandorabots-api.maxpat
    • text-to-speech2.maxpat
  • Clear the custid in the pandorabots-api patch
  • Start audio in the Google speech patch. Then toggle the mic button and say something.
  • After the first response, go to the pandorabots-api patch and click the new custid – so that the chatbot retains the thread of the conversation.

download:

The files for this project can be downloaded from the intenet-sensors archive at github

https://github.com/tkzic/internet-sensors

Pandorabots API

Update: Now part of Internet Sensors project: http://reactivemusic.net/?p=9834  

original post

Looking into using an API to communicate with chatbots

Here is info from pandorabots FAQ: http://www.pandorabots.com/botmaster/en/~15580d493a63acc7fab1820f~/faq

Chomsky bot id: botid=b0dafd24ee35a477

H.2 Is there an API allowing other programs to talk to a Pandorabot?

Pandorabots has an API called XML-RPC that you can use to connect third-party software to our server. The XML-RPC has been used to connect Pandorabots to a wide variety of third-party applications, including Mified, mIRC, Second Life and Flash.

You may interact with Pandorabots as a webservice. Pandorabots offers consulting services supporting arbitrary web services for premium services customers. Please contact sales@pandorabots.com for more information.

A client can interact with a Pandorabot by POST’ing to:

http://www.pandorabots.com/pandora/talk-xml

The form variables the client needs to POST are:

  • botid – see H.1 above.
  • input – what you want said to the bot.
  • custid – an ID to track the conversation with a particular customer. This variable is optional. If you don’t send a value Pandorabots will return a custid attribute value in the <result> element of the returned XML. Use this in subsequent POST’s to continue a conversation.

This will give a text/xml response. For example:

<result status="0" botid="c49b63239e34d1d5" custid="d2228e2eee12d255">
  <input>hello</input>
  <that>Hi there!</that>
</result>

The <input> and <that> elements are named after the corresponding AIML elements for bot input and last response. If there is an error,status will be non-zero and there will be a human readable <message> element included describing the error. For example:

<result status="1" custid="d2228e2eee12d255">
  <input>hello</input>
  <message>Missing botid</message>
</result>

Note that the values POST’d need to be form-urlencoded.

[update}

Here are two examples I just got to work using curl

curl -X POST  --data "botid=b0dafd24ee35a477&input=hello" http://www.pandorabots.com/pandora/talk-xml

curl -X POST  --data "botid=b0dafd24ee35a477&input=Where are you?" http://www.pandorabots.com/pandora/talk-xml
Here is the result for the second question
<result status="0" botid="b0dafd24ee35a477" custid="b3422b612633ac87"><input>Where are you?</input><that>I am in the computer at Pandorabots.com.</that></result>

Using the Google speech API

Speech recognition:

http://fennb.com/fast-free-speech-recognition-using-googles-in

Here’s a link to the manual for sox (audio converter)

sox.soundfourge.net/sox.html

This is an example of the curl command to run from the command line

curl \
  --data-binary @test.flac \
  --header 'Content-type: audio/x-flac; rate=16000' \
  'https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&pfilter=2&lang=en-US&maxresults=6'