notes
I’ve revised the PHP program that streams tweets and sends them to Max so that it removes hyperlinks, RT indicators, user mentions, and ASCII art. The output now works better with text-to-speech.
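As a rough illustration of that kind of cleanup, here is a minimal Python sketch. It is not the PHP program itself: the function names, the regular expressions, and especially the ASCII-art heuristic (drop any token with more symbols than letters and digits) are assumptions made for this example.

```python
import re

URL_RE = re.compile(r"https?://\S+")
# "RT" only counts as a retweet marker when a mention follows it.
RT_RE = re.compile(r"\bRT\b:?\s*(?=@)")
MENTION_RE = re.compile(r"(?<!\w)@\w+:?")


def looks_like_ascii_art(token):
    """Assumed heuristic: more symbol characters than letters/digits."""
    alnum = sum(ch.isalnum() for ch in token)
    return len(token) - alnum > alnum


def clean_tweet(text):
    """Strip links, RT markers, mentions, and symbol-heavy tokens."""
    text = URL_RE.sub(" ", text)
    text = RT_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    tokens = [t for t in text.split() if not looks_like_ascii_art(t)]
    return " ".join(tokens)


print(clean_tweet("RT @zooloo: my cat is turning purple http://t.co/abc123 ~~~<3<3~~~"))
```

The heuristic also drops emoticons and stand-alone punctuation, which is probably acceptable for text-to-speech but would need tuning against real tweets.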
things that could be done in a future project:
- figure out which #hashtags are integral to content, and which are just tagged onto the end of a tweet
- remove extraneous hyperlinks which don’t get parsed by the API
- translate symbols like > into “greater than” or “better than”
- translate (or at least flag) foreign languages – this could be aided by geocoding data
- translate slang acronyms like OMG, LOL
- natural language parsing (see Stanford open source program) for content and grammatical analysis
- replace hyperlinks/picture-links with a ‘title’ from the actual target
- natural language equivalents of things like: RT @zooloo:
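A few of the ideas above (trailing hashtags, symbols, slang acronyms, and a spoken form of RT) could be prototyped along these lines. This is only a sketch in Python: the lookup tables are tiny placeholders, the phrase "retweeting" is an arbitrary choice, and "a hashtag is non-integral if it sits in a run at the very end" is an assumed rule, not a tested one.

```python
import re

# Hypothetical lookup tables; a real project would need much larger ones.
ACRONYMS = {"OMG": "oh my god", "LOL": "laughing out loud"}
# ">" could also mean "better than"; picking the reading needs context.
SYMBOLS = {">": "greater than", "<": "less than", "&": "and"}

ACRONYM_RE = re.compile(r"\b(" + "|".join(ACRONYMS) + r")\b", re.IGNORECASE)
RT_RE = re.compile(r"\bRT\s+@(\w+):?")
TRAILING_TAGS_RE = re.compile(r"(?:\s*#\w+)+\s*$")


def strip_trailing_hashtags(text):
    """Drop hashtags that only appear as a run at the end of the tweet."""
    return TRAILING_TAGS_RE.sub("", text)


def expand(text):
    """Spell out known acronyms and stand-alone symbols."""
    text = ACRONYM_RE.sub(lambda m: ACRONYMS[m.group(1).upper()], text)
    return " ".join(SYMBOLS.get(tok, tok) for tok in text.split())


def normalize_for_speech(text):
    text = RT_RE.sub(r"retweeting \1:", text)   # spoken form of "RT @user:"
    text = strip_trailing_hashtags(text)
    text = re.sub(r"#(\w+)", r"\1", text)       # read inline hashtags as words
    return expand(text)


print(normalize_for_speech("RT @zooloo: OMG cats > dogs #cats #pets"))
```

Hashtags in the middle of a sentence are kept and read as plain words, on the assumption that they are part of the content.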
things to try:
- Running the output of text-to-speech through musical analysis tools, to detect pitch and rhythm
- Chaining: use the content of one tweet to direct the search for the next one. For example, say you search for “cats” and get “my cat is turning purple”; then you would search for “purple” and get “I’ve never eaten a purple cow”; then you would search for “cow”, and so forth.
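The chaining idea can be sketched as a small loop. Everything here is hypothetical: `sample_search` stands in for a real Twitter search over a hard-coded list, and the rule for choosing the next term (the last word that is not a stopword and has not been searched yet) is just one assumption that happens to reproduce the purple cow example.

```python
import re

STOPWORDS = {"a", "an", "the", "my", "is", "i", "i've", "of", "and", "to"}


def next_keyword(tweet, used):
    """Assumed rule: last word that is neither a stopword nor already used."""
    words = re.findall(r"[a-z']+", tweet.lower())
    for word in reversed(words):
        if word not in STOPWORDS and word not in used:
            return word
    return None


def chain(start, search, max_steps=5):
    """Follow search -> tweet -> keyword -> search, without repeating tweets."""
    term, used, seen, results = start, {start}, set(), []
    for _ in range(max_steps):
        fresh = [t for t in search(term) if t not in seen]
        if not fresh:
            break
        tweet = fresh[0]
        seen.add(tweet)
        results.append((term, tweet))
        term = next_keyword(tweet, used)
        if term is None:
            break
        used.add(term)
    return results


# Stand-in for a real search call: substring match over sample tweets.
SAMPLE = [
    "my cat is turning purple",
    "I've never eaten a purple cow",
    "the cow jumped over the moon",
]


def sample_search(term):
    return [t for t in SAMPLE if term in t.lower()]


for term, tweet in chain("cat", sample_search):
    print(term, "->", tweet)
```

Tracking both the used terms and the tweets already seen keeps the chain from looping back on itself.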