DH Tools: Planet and Narration project


June 26, 2014 4:48 pm

My name is Josh Overbeek. I am currently working as a research assistant with Professor Cheryl Lousley on a project examining the narrative strategies emerging in the Brundtland Commission Hearings. The text I began working with is a typed-up list of the quotations that appear in the Commission’s published final report, Our Common Future. This blog post reports on my initial experience with Voyant’s scatterplot tool.

Heeding the note on PDFs in the Voyant Tools online documentation, which encourages users to convert PDFs to RTF to avoid losing information, I converted the PDF of quotations into an RTF file. I then uploaded only that file into the scatterplot tool. I selected “correspondence” under “analysis” and applied a stop list using the little gear icon on the right-hand side. This is what emerges. Click and drag the mouse across an area in the graph to take a closer look.
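For readers curious about what a stop list does before any plotting happens, here is a minimal sketch in Python. It is not Voyant's code: the stop list, the sample sentence and the function name `term_frequencies` are all made up for illustration. It only shows the general idea of dropping very common words and counting what remains.

```python
import re
from collections import Counter

# Hypothetical stop list for illustration; Voyant's own lists are much longer.
STOP_WORDS = {"the", "of", "and", "to", "a", "in", "is", "that", "we", "our"}

def term_frequencies(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into words, and count the non-stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stop_words)

# Invented sample sentence, not a quotation from Our Common Future.
sample = "The forest is our energy, and the forest is the need of the South."
counts = term_frequencies(sample)
print(counts.most_common(3))
```

Without the stop list, “the” would dominate the counts; with it, the remaining terms are the ones a tool like the scatterplot has to work with.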


Following the suggestion in a blog post by Stéfan Sinclair, creator of the Voyant digital humanities tools, I decided to set aside concerns about the intimidating mathematics behind digital humanities techniques, such as correspondence analysis, to focus on the patterns and interesting phenomena emerging in texts once read through the Voyant tool library. In other words, I determined to trust the tools, bypass questions of how they work, and focus instead on the ways in which digital humanities techniques can help me as a researcher read texts in new ways.

I suspected that uploading only one document would not reveal a great deal about a text, and that the tool would be most useful when comparing a number of texts organized into strategic categories, such as time periods or speakers. However, the terms that tend to “stand alone,” relatively speaking, in the quotations became immediately apparent in the scatterplot. “Forest” and “energy,” despite appearing frequently in the text, seem to be used without the context of some of the other dominant terms in the corpus.

An interesting component of this tool is the clusters option, which, using the criteria of the analysis, automatically groups the terms into a chosen number of clusters that are differentiated by colour and that express some measure of relationship between terms. This is what it looks like if I apply three clusters.
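To give a rough sense of how points on a plot can be grouped automatically, here is a small Python sketch of k-means, one common clustering method. That Voyant forms its clusters this way is an assumption on my part, not something I have confirmed. The term labels and coordinates are invented placeholders, and the function name `k_means` is my own.

```python
import math

def k_means(points, k, iterations=20):
    """Group 2-D points into k clusters with plain k-means."""
    # Simplification: start from the first k points. Real implementations
    # usually pick random or more carefully spread starting centres.
    centres = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centre.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centres[c]))
            for p in points
        ]
        # Update step: move each centre to the mean of its points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centres[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels

# Made-up coordinates standing in for term positions on a scatterplot.
terms = ["term_a", "term_b", "term_c", "term_d", "term_e", "term_f"]
coords = [(0.1, 0.2), (2.0, 2.1), (-2.0, 1.9),
          (0.2, 0.1), (0.15, 0.15), (-2.1, 2.0)]
labels = k_means(coords, k=3)
for term, label in zip(terms, labels):
    print(term, label)
```

Each label plays the role of a colour on the plot: terms that share a label sit closer to one another than to the other groups.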


So what does it mean? Some interesting phenomena emerge in the content of the quotations in Our Common Future, even without ordering that content into separate texts around an organizing principle. I will give a short example: the visual language of proximity indicates interconnections between the words “ecological,” “problems,” “need,” “environment,” and “global.” You can click and drag your mouse across the main cluster to see this collection of terms. The presence of “need” and “problems” and their associations with “ecological” and the “environment” in the main cluster of the graph suggest that these terms receive a lot of focus in the quotations in the Brundtland report. “Pollution” is placed some distance away from these terms, indicating a weak connection. The quoted speakers in Our Common Future do not seem to speak about “pollution” in the context of environmental “problems” and “need,” which may be because “pollution” is associated with the industrialized world and “need” with the global South. The visualization raises the question of what these associations signify. I believe the value of these kinds of tools will lie in the questions that emerge when patterns, relationships and clusters are visually represented.
