Word counting is the ‘Hello World’ of big data. And my data is relatively tiny.
Below are 25 images: 5 variables (line length, verse length, average word length, number of words per poem, number of verses per poem), each shown at 5 increasingly small time scales. They are derived from an analysis of a corpus of 10.5k poems scraped from poetryfoundation.org.
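For a rough idea of how the five variables could be computed for a single poem, here is a minimal sketch. It is not the code behind the images. The function name `poem_stats` is hypothetical, and it assumes that verses are separated by blank lines, that words are whitespace-separated tokens (so punctuation counts toward word length), that line length is measured in words, and that verse length is measured in lines.

```python
import re
from statistics import mean


def poem_stats(text: str) -> dict:
    """Compute five simple metrics for one poem (hypothetical helper)."""
    # Assumption: a verse is a block of lines separated by one or more blank lines.
    blocks = re.split(r"\n\s*\n", text.strip())
    verses = [
        [line for line in block.splitlines() if line.strip()]
        for block in blocks
    ]
    verses = [v for v in verses if v]  # drop empty blocks

    lines = [line for v in verses for line in v]
    # Assumption: words are whitespace-separated tokens, punctuation included.
    words = [w for line in lines for w in line.split()]

    # Note: mean() raises StatisticsError on an empty poem.
    return {
        "avg_line_length": mean(len(line.split()) for line in lines),  # words per line
        "avg_verse_length": mean(len(v) for v in verses),  # lines per verse
        "avg_word_length": mean(len(w) for w in words),  # characters per word
        "words_per_poem": len(words),
        "verses_per_poem": len(verses),
    }
```

Running this over every poem in a corpus and grouping the results by publication date would give one series per variable, which could then be plotted at whatever time scale is of interest.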