SYN-SCI-RAP

I think I have begun to develop the mild form of insanity that often strikes those who fiddle around with computationally generated text. After reading thousands of lines of dense, incomprehensible gibberish, it clarifies and starts to make sense, often more sense than any mere linear thought. The brain acclimatises to syntactic pressure.


Recipe for mildly insane word-salad:

  • take 57,000 rap songs input by fans,
    • extract all words that return no results from a WordNet synset search and put them into the Reservoir
  • one list of scientific terminology (for sombre intellectual tone)
    • chop off “-ology” wherever it occurs
  • one list of swear words (for spice)
  • call to WordNet synset algorithm (for fibre and continuity)
  • use pattern.en to do conjugation (for a tiny bit of coherence)
  • use NLTK part-of-speech tagging
  • Alchemy for entity (people, places, etc…) replacement
  • 10,000 or more poems

Mix all ingredients together using replacement algorithms.
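
As a rough illustration of the mixing step, here is a minimal Python sketch. It is not the code from the repository. The function names (chop_ology, remix), the sample terms, and the rule of swapping only noun-tagged words are all assumptions made for illustration. The input is a list of (word, tag) pairs in the shape that NLTK's pos_tag returns, typed in by hand here so the sketch runs without NLTK installed. Only a trailing "-ology" is chopped, which is one reading of "wherever it occurs".

```python
import random


def chop_ology(term):
    """Strip a trailing 'ology' from a term (assumed reading of the recipe)."""
    return term[:-5] if term.endswith("ology") else term


def remix(tagged_tokens, pool, rng, rate=0.5):
    """Swap noun-tagged words for random pool entries with probability `rate`.

    tagged_tokens: list of (word, tag) pairs, e.g. the output of nltk.pos_tag.
    pool: replacement words (science stems, swear words, Reservoir words...).
    rng: a random.Random instance, so runs can be made repeatable.
    """
    out = []
    for word, tag in tagged_tokens:
        if tag.startswith("NN") and rng.random() < rate:
            out.append(rng.choice(pool))
        else:
            out.append(word)
    return " ".join(out)


# Hypothetical ingredients: two science terms plus one Reservoir-style word.
pool = [chop_ology(t) for t in ["cosmology", "neurology"]] + ["nite"]

# Hand-tagged line standing in for the output of a part-of-speech tagger.
tagged = [("the", "DT"), ("rose", "NN"), ("burns", "VBZ"), ("slowly", "RB")]

# rate=1.0 forces every noun to be replaced.
line = remix(tagged, pool, random.Random(0), rate=1.0)
print(line)
```

Conjugation, synset lookups, and entity replacement are left out of this sketch; each would be another pass over the same tagged tokens.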


To read 10116 poems (simple style) (in a single 24 MB HTML page) generated in 10356.4216051 seconds (2.87 hours, 3612 pph [poems per hour], 60 ppm [poems per minute]) on 2014-08-14 at 02:54 click here


Read a selection of just a few poems 

Read the RAP Reservoir: 33,150 words extracted from 56k user-input rap songs that did not return any usable results from a WordNet synset search. If you are looking for the evolution of language that occurs through mutation (typos, misspellings, pop-cruft), this is it.
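
The Reservoir filter can be sketched in a few lines of Python. This is a guess at the general shape, not the repository's code. The function name build_reservoir, the tokenising regex, and the toy KNOWN set are assumptions. The lookup is passed in as a function so the sketch runs without WordNet data. With NLTK installed, wordnet.synsets(word) returns an empty list for unknown words, so it can stand in for the toy lookup.

```python
import re
from collections import Counter


def build_reservoir(lyrics, has_synsets):
    """Count every word for which has_synsets(word) is false."""
    reservoir = Counter()
    for song in lyrics:
        # Crude tokeniser (an assumption): lowercase runs of letters, digits, apostrophes.
        for word in re.findall(r"[a-z0-9']+", song.lower()):
            word = word.strip("'")
            if word and not has_synsets(word):
                reservoir[word] += 1
    return reservoir


# Toy stand-in for a WordNet lookup, so the example runs anywhere.
# With NLTK and its WordNet corpus installed, one could instead use:
#   from nltk.corpus import wordnet
#   has_synsets = lambda w: bool(wordnet.synsets(w))
KNOWN = {"money", "street", "in"}


def has_synsets(word):
    return word in KNOWN


# Two made-up lines standing in for the fan-submitted songs.
lyrics = ["Money in da street", "street nite, gettin money"]
reservoir = build_reservoir(lyrics, has_synsets)
print(sorted(reservoir))
```

Keeping counts rather than a plain set makes it easy to separate one-off typos from mutations that recur across many songs.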


Code on GitHub
Made by Glia.ca