{"id":4690,"date":"2013-01-06T02:01:55","date_gmt":"2013-01-06T02:01:55","guid":{"rendered":"http:\/\/zerokidz.com\/ideas\/?p=4690"},"modified":"2024-01-21T22:56:18","modified_gmt":"2024-01-22T03:56:18","slug":"speech-to-text-in-max","status":"publish","type":"post","link":"https:\/\/reactivemusic.net\/?p=4690","title":{"rendered":"Speech to text in Max"},"content":{"rendered":"<div class=\"callout\">\n<p class=\"lead\">Using the Google speech API<\/p>\n<p>(updated locally 1\/21\/2024 &#8211; changed binary path to sox for homebrew \/opt\/homebrew\/bin\/sox in [p call-google-speech]<\/p>\n<p>Also changed some of the UI and logic for manual writing and sending.<\/p>\n<p>(updated 1\/21\/2021)<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-19963\" src=\"https:\/\/reactivemusic.net\/wp-content\/uploads\/2013\/01\/Screen-Shot-2021-01-21-at-2.03.38-PM-300x116.png\" alt=\"\" width=\"300\" height=\"116\" srcset=\"https:\/\/reactivemusic.net\/wp-content\/uploads\/2013\/01\/Screen-Shot-2021-01-21-at-2.03.38-PM-300x116.png 300w, https:\/\/reactivemusic.net\/wp-content\/uploads\/2013\/01\/Screen-Shot-2021-01-21-at-2.03.38-PM-1024x397.png 1024w, https:\/\/reactivemusic.net\/wp-content\/uploads\/2013\/01\/Screen-Shot-2021-01-21-at-2.03.38-PM-768x298.png 768w, https:\/\/reactivemusic.net\/wp-content\/uploads\/2013\/01\/Screen-Shot-2021-01-21-at-2.03.38-PM-1536x596.png 1536w, https:\/\/reactivemusic.net\/wp-content\/uploads\/2013\/01\/Screen-Shot-2021-01-21-at-2.03.38-PM.png 1706w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/p>\n<\/div>\n<p>This project demonstrates the Google speech-API. 
It records speech in Max, processes it using the Google API, and displays the result in a Max [message] object.<\/p>\n<h4>download<\/h4>\n<p><a href=\"https:\/\/github.com\/tkzic\/internet-sensors\">https:\/\/github.com\/tkzic\/internet-sensors<\/a><\/p>\n<p>folder: google-speech<\/p>\n<h4>files<\/h4>\n<h5>main patch<\/h5>\n<ul>\n<li><span style=\"line-height: 1.6;\">speech-to-google-text-api6.maxpat<\/span><\/li>\n<\/ul>\n<h5>abstractions and other files<\/h5>\n<ul>\n<li>JSON-google-speech.js (parses JSON response from Google API)<\/li>\n<li><span style=\"line-height: 1.6;\">ms-counter.maxpat (manages audio recording buffer)<\/span><\/li>\n<\/ul>\n<h4>external Max objects<\/h4>\n<ul>\n<li>[shell] from\u00a0 <a href=\"https:\/\/github.com\/jeremybernstein\/shell\/releases\/tag\/1.0b2\">https:\/\/github.com\/jeremybernstein\/shell\/releases\/tag\/1.0b2<\/a>\u00a0\u00a0Download this external and add its folder under Options | File Preferences in Max<\/li>\n<\/ul>\n<h4>external programs<\/h4>\n<p>sox: the sox audio conversion program must be on the computer&#8217;s executable search path, e.g., \/usr\/bin &#8211; or you can rewrite the [sprintf] input to [shell] with the actual path. In our case we installed sox using MacPorts. 
The executable path is \/opt\/local\/bin\/sox, which is built into a message object in the subpatcher [call-google-speech].<\/p>\n<p>get sox from:\u00a0<a href=\"http:\/\/sox.sourceforge.net\">http:\/\/sox.sourceforge.net<\/a><\/p>\n<p>note: this conversion may not be necessary with recent updates to Max and the Google speech API<\/p>\n<h4>authorization<\/h4>\n<ul>\n<li>none required &#8211; so far<\/li>\n<\/ul>\n<div><span style=\"line-height: 22px;\">This may be changing.<\/span><\/div>\n<div><\/div>\n<div><span style=\"line-height: 22px;\">Insert here: how to get a speech-api key from Google\u00a0<\/span><\/div>\n<h4>instructions<\/h4>\n<ul>\n<li>Open Max patch: speech-to-google-text-api6<\/li>\n<li>Turn on audio<\/li>\n<li>Press the spacebar. Start talking. Press the spacebar again when you are finished. The transcription will begin automatically<\/li>\n<\/ul>\n<p><span style=\"line-height: 1.6;\">Note: If you have a slow internet connection you may need to tweak the various delay times in the [call-google-speech] <\/span>subpatch<span style=\"line-height: 1.6;\">.<\/span><\/p>\n<h4>send Tweets using speech<\/h4>\n<p>Max [send] and [receive] objects pass data from this project to other projects that send Tweets from Max. Just run the patches at the same time.<\/p>\n<ul>\n<li>Using curl: \u00a0<a href=\"https:\/\/reactivemusic.net\/?p=5447\">https:\/\/reactivemusic.net\/?p=5447<\/a><\/li>\n<li>Using Ruby:\u00a0<a href=\"https:\/\/reactivemusic.net\/?p=5818\">https:\/\/reactivemusic.net\/?p=5818<\/a><\/li>\n<\/ul>\n<p>Also, check out how this project is integrated into the Pandorabots chatbot API project<\/p>\n<p><a href=\"https:\/\/reactivemusic.net\/?p=9834\">https:\/\/reactivemusic.net\/?p=9834<\/a><\/p>\n<p>Or anything else. The Google transcription is amazingly accurate.<\/p>\n<div>\n<h4>revision history<\/h4>\n<div>\n<ul>\n<li>4\/24\/2016: an explicit path to sox is now needed in the call-google-speech subpatch. 
In my MacPorts version the path is \/usr\/local\/opt\/bin\/sox.<\/li>\n<li>5\/11\/2014: The newest version requires Max 6.1.7 (for JSON parsing). Also updated to Google Speech API v2.<\/li>\n<li>3\/26\/2014: updated to use auto-record features developed for chatbot conversations<\/li>\n<\/ul>\n<\/div>\n<div><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Using the Google speech API (updated locally 1\/21\/2024 &#8211; changed binary path to sox for homebrew \/opt\/homebrew\/bin\/sox in [p call-google-speech] Also changed some of the UI and logic for manual writing and sending. (updated 1\/21\/2021) This project demonstrates the Google speech-API. It records speech in Max, processes it using the Google API, and displays the &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/reactivemusic.net\/?p=4690\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Speech to text in Max&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_coblocks_attr":"","_coblocks_dimensions":"","_coblocks_responsive_height":"","_coblocks_accordion_ie_support":"","footnotes":""},"categories":[143,275,28],"tags":[161,6,345,190,137],"class_list":["post-4690","post","type-post","status-publish","format-standard","hentry","category-interactive-media-art","category-internet-sensors","category-maxmsp","tag-api","tag-interactive-media","tag-maxmsp","tag-portfolio","tag-text-to-speech"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Speech to text in Max - reactive music<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/reactivemusic.net\/?p=4690\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta 
property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Speech to text in Max - reactive music\" \/>\n<meta property=\"og:description\" content=\"Using the Google speech API (updated locally 1\/21\/2024 &#8211; changed binary path to sox for homebrew \/opt\/homebrew\/bin\/sox in [p call-google-speech] Also changed some of the UI and logic for manual writing and sending. (updated 1\/21\/2021) This project demonstrates the Google speech-API. It records speech in Max, process it using the Google API, and displays the &hellip; Continue reading &quot;Speech to text in Max&quot;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/reactivemusic.net\/?p=4690\" \/>\n<meta property=\"og:site_name\" content=\"reactive music\" \/>\n<meta property=\"article:published_time\" content=\"2013-01-06T02:01:55+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-01-22T03:56:18+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/reactivemusic.net\/wp-content\/uploads\/2013\/01\/Screen-Shot-2021-01-21-at-2.03.38-PM-300x116.png\" \/>\n<meta name=\"author\" content=\"Tom Zicarelli\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Tom Zicarelli\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/reactivemusic.net\\\/?p=4690#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/reactivemusic.net\\\/?p=4690\"},\"author\":{\"name\":\"Tom Zicarelli\",\"@id\":\"https:\\\/\\\/reactivemusic.net\\\/#\\\/schema\\\/person\\\/56224d281582df7e5518e037ca63e571\"},\"headline\":\"Speech to text in Max\",\"datePublished\":\"2013-01-06T02:01:55+00:00\",\"dateModified\":\"2024-01-22T03:56:18+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/reactivemusic.net\\\/?p=4690\"},\"wordCount\":398,\"image\":{\"@id\":\"https:\\\/\\\/reactivemusic.net\\\/?p=4690#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/reactivemusic.net\\\/wp-content\\\/uploads\\\/2013\\\/01\\\/Screen-Shot-2021-01-21-at-2.03.38-PM-300x116.png\",\"keywords\":[\"API\",\"interactive media\",\"Max\\\/MSP\",\"portfolio\",\"text to speech\"],\"articleSection\":[\"interactive media art\",\"internet-sensors\",\"Max\\\/MSP\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/reactivemusic.net\\\/?p=4690\",\"url\":\"https:\\\/\\\/reactivemusic.net\\\/?p=4690\",\"name\":\"Speech to text in Max - reactive 
music\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/reactivemusic.net\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/reactivemusic.net\\\/?p=4690#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/reactivemusic.net\\\/?p=4690#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/reactivemusic.net\\\/wp-content\\\/uploads\\\/2013\\\/01\\\/Screen-Shot-2021-01-21-at-2.03.38-PM-300x116.png\",\"datePublished\":\"2013-01-06T02:01:55+00:00\",\"dateModified\":\"2024-01-22T03:56:18+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/reactivemusic.net\\\/#\\\/schema\\\/person\\\/56224d281582df7e5518e037ca63e571\"},\"breadcrumb\":{\"@id\":\"https:\\\/\\\/reactivemusic.net\\\/?p=4690#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/reactivemusic.net\\\/?p=4690\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/reactivemusic.net\\\/?p=4690#primaryimage\",\"url\":\"https:\\\/\\\/reactivemusic.net\\\/wp-content\\\/uploads\\\/2013\\\/01\\\/Screen-Shot-2021-01-21-at-2.03.38-PM.png\",\"contentUrl\":\"https:\\\/\\\/reactivemusic.net\\\/wp-content\\\/uploads\\\/2013\\\/01\\\/Screen-Shot-2021-01-21-at-2.03.38-PM.png\",\"width\":1706,\"height\":662},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/reactivemusic.net\\\/?p=4690#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/reactivemusic.net\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Speech to text in Max\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/reactivemusic.net\\\/#website\",\"url\":\"https:\\\/\\\/reactivemusic.net\\\/\",\"name\":\"reactive 
music\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/reactivemusic.net\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/reactivemusic.net\\\/#\\\/schema\\\/person\\\/56224d281582df7e5518e037ca63e571\",\"name\":\"Tom Zicarelli\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/0da58cf21a2707dd335b204b8ed3cd9194dcbf9d9814ac5d71195a65c76c8a72?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/0da58cf21a2707dd335b204b8ed3cd9194dcbf9d9814ac5d71195a65c76c8a72?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/0da58cf21a2707dd335b204b8ed3cd9194dcbf9d9814ac5d71195a65c76c8a72?s=96&d=mm&r=g\",\"caption\":\"Tom Zicarelli\"},\"sameAs\":[\"http:\\\/\\\/tomzicarelli.com\"],\"url\":\"https:\\\/\\\/reactivemusic.net\\\/?author=2\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Speech to text in Max - reactive music","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/reactivemusic.net\/?p=4690","og_locale":"en_US","og_type":"article","og_title":"Speech to text in Max - reactive music","og_description":"Using the Google speech API (updated locally 1\/21\/2024 &#8211; changed binary path to sox for homebrew \/opt\/homebrew\/bin\/sox in [p call-google-speech] Also changed some of the UI and logic for manual writing and sending. (updated 1\/21\/2021) This project demonstrates the Google speech-API. 
It records speech in Max, process it using the Google API, and displays the &hellip; Continue reading \"Speech to text in Max\"","og_url":"https:\/\/reactivemusic.net\/?p=4690","og_site_name":"reactive music","article_published_time":"2013-01-06T02:01:55+00:00","article_modified_time":"2024-01-22T03:56:18+00:00","og_image":[{"url":"https:\/\/reactivemusic.net\/wp-content\/uploads\/2013\/01\/Screen-Shot-2021-01-21-at-2.03.38-PM-300x116.png","type":"","width":"","height":""}],"author":"Tom Zicarelli","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Tom Zicarelli","Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/reactivemusic.net\/?p=4690#article","isPartOf":{"@id":"https:\/\/reactivemusic.net\/?p=4690"},"author":{"name":"Tom Zicarelli","@id":"https:\/\/reactivemusic.net\/#\/schema\/person\/56224d281582df7e5518e037ca63e571"},"headline":"Speech to text in Max","datePublished":"2013-01-06T02:01:55+00:00","dateModified":"2024-01-22T03:56:18+00:00","mainEntityOfPage":{"@id":"https:\/\/reactivemusic.net\/?p=4690"},"wordCount":398,"image":{"@id":"https:\/\/reactivemusic.net\/?p=4690#primaryimage"},"thumbnailUrl":"https:\/\/reactivemusic.net\/wp-content\/uploads\/2013\/01\/Screen-Shot-2021-01-21-at-2.03.38-PM-300x116.png","keywords":["API","interactive media","Max\/MSP","portfolio","text to speech"],"articleSection":["interactive media art","internet-sensors","Max\/MSP"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/reactivemusic.net\/?p=4690","url":"https:\/\/reactivemusic.net\/?p=4690","name":"Speech to text in Max - reactive 
music","isPartOf":{"@id":"https:\/\/reactivemusic.net\/#website"},"primaryImageOfPage":{"@id":"https:\/\/reactivemusic.net\/?p=4690#primaryimage"},"image":{"@id":"https:\/\/reactivemusic.net\/?p=4690#primaryimage"},"thumbnailUrl":"https:\/\/reactivemusic.net\/wp-content\/uploads\/2013\/01\/Screen-Shot-2021-01-21-at-2.03.38-PM-300x116.png","datePublished":"2013-01-06T02:01:55+00:00","dateModified":"2024-01-22T03:56:18+00:00","author":{"@id":"https:\/\/reactivemusic.net\/#\/schema\/person\/56224d281582df7e5518e037ca63e571"},"breadcrumb":{"@id":"https:\/\/reactivemusic.net\/?p=4690#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/reactivemusic.net\/?p=4690"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/reactivemusic.net\/?p=4690#primaryimage","url":"https:\/\/reactivemusic.net\/wp-content\/uploads\/2013\/01\/Screen-Shot-2021-01-21-at-2.03.38-PM.png","contentUrl":"https:\/\/reactivemusic.net\/wp-content\/uploads\/2013\/01\/Screen-Shot-2021-01-21-at-2.03.38-PM.png","width":1706,"height":662},{"@type":"BreadcrumbList","@id":"https:\/\/reactivemusic.net\/?p=4690#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/reactivemusic.net\/"},{"@type":"ListItem","position":2,"name":"Speech to text in Max"}]},{"@type":"WebSite","@id":"https:\/\/reactivemusic.net\/#website","url":"https:\/\/reactivemusic.net\/","name":"reactive music","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/reactivemusic.net\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/reactivemusic.net\/#\/schema\/person\/56224d281582df7e5518e037ca63e571","name":"Tom 
Zicarelli","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/0da58cf21a2707dd335b204b8ed3cd9194dcbf9d9814ac5d71195a65c76c8a72?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/0da58cf21a2707dd335b204b8ed3cd9194dcbf9d9814ac5d71195a65c76c8a72?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/0da58cf21a2707dd335b204b8ed3cd9194dcbf9d9814ac5d71195a65c76c8a72?s=96&d=mm&r=g","caption":"Tom Zicarelli"},"sameAs":["http:\/\/tomzicarelli.com"],"url":"https:\/\/reactivemusic.net\/?author=2"}]}},"_links":{"self":[{"href":"https:\/\/reactivemusic.net\/index.php?rest_route=\/wp\/v2\/posts\/4690","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/reactivemusic.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/reactivemusic.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/reactivemusic.net\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/reactivemusic.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=4690"}],"version-history":[{"count":35,"href":"https:\/\/reactivemusic.net\/index.php?rest_route=\/wp\/v2\/posts\/4690\/revisions"}],"predecessor-version":[{"id":20917,"href":"https:\/\/reactivemusic.net\/index.php?rest_route=\/wp\/v2\/posts\/4690\/revisions\/20917"}],"wp:attachment":[{"href":"https:\/\/reactivemusic.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=4690"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/reactivemusic.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=4690"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/reactivemusic.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=4690"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}