A few rudimentary visuals of Poetry Foundation corpus (preliminary buggy results)

Word counting is the ‘Hello World’ of big data. And my data is relatively tiny.

Below are 25 images: 5 variables (line length, verse length, average word length, number of words per poem, number of verses per poem), each plotted at 5 progressively shorter time scales. They are derived from an analysis of a corpus of 10.5k poems scraped from poetryfoundation.org.
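For readers who want to try something similar, here is a minimal sketch of how the five variables might be computed for a single poem. It is not the code behind these plots. The function name `poem_stats`, the dictionary keys, and the tokenising rules are all hypothetical, and it assumes a "verse" means a stanza (a block of lines separated by a blank line), which may not match the definition used in the analysis.

```python
import re
from statistics import mean


def poem_stats(text):
    """Return simple per-poem measurements (illustrative sketch only)."""
    # Assumption: verses (stanzas) are blocks separated by one or more blank lines.
    stanzas = [s for s in re.split(r"\n\s*\n", text.strip()) if s.strip()]
    # Non-empty lines across all stanzas, with surrounding whitespace removed.
    lines = [ln.strip() for s in stanzas for ln in s.splitlines() if ln.strip()]
    # Assumption: a word is a run of ASCII letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    return {
        "n_verses": len(stanzas),
        "n_lines": len(lines),
        "n_words": len(words),
        # Line length measured in characters.
        "avg_line_length": mean(len(ln) for ln in lines) if lines else 0.0,
        # Verse length measured in lines per verse.
        "avg_verse_length": len(lines) / len(stanzas) if stanzas else 0.0,
        # Word length measured in characters.
        "avg_word_length": mean(len(w) for w in words) if words else 0.0,
    }
```

Running `poem_stats` over every poem and grouping the results by year would give the kind of per-period series plotted here. Poems with non-English characters would need a broader word pattern than the ASCII one assumed above.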

[Image: plot of # of lines, 0 to 2015]
