For digital historians, data visualization is an important means of understanding and interpreting data. When analyzing large datasets, visualization tools can be used to reveal patterns, see connections, and find gaps in research. Visualization also offers an effective way to present complex data in a clear and visually appealing manner. For artist R. Luke DuBois, data visualization goes even further, as it can be presented as art. DuBois is a multidisciplinary artist with experience as a composer and a programmer. As a programmer, DuBois co-authored Jitter, a software suite that allows real-time manipulation of video and 3D imagery. As an artist, DuBois focuses on using digital technology to visualize and expose the narratives within data. Because technology can be used to express both our voices and our cultures, DuBois visualizes data to capture how we communicate and understand ourselves, and each other, in the 21st century. I recently came across his TED talk “Insightful Human Portraits made from Data,” and I found his data visualizations very interesting and thought-provoking.
Why Visualize Data?
Visualization can be very useful for spotting gaps in research, and also for analyzing data further to understand how a dataset is interconnected. This article by S. Graham, I. Milligan and S. Weingart explains the role of visualization in research. There are many different tools for visualizing data; here I will be using Voyant. This tool can read txt files. If you upload a folder of ordered text files, Voyant will visualize the data in chronological order. This allows you to see changes in word frequency and use over time.
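To make the idea of word frequency over time concrete, here is a minimal Python sketch of the underlying calculation. It is not how Voyant itself is implemented; the function names, the sample file names, and the sample sentences are all made up for illustration, and the tokenizer is a deliberately simple assumption.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count lowercase word tokens in a string (simple tokenizer, an assumption)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def relative_frequency_over_time(documents, term):
    """Return (name, relative frequency of term) for each document, in the order given."""
    trend = []
    for name, text in documents:
        counts = word_frequencies(text)
        total = sum(counts.values())
        # Share of all tokens in this document that match the term
        share = counts[term.lower()] / total if total else 0.0
        trend.append((name, share))
    return trend


# Hypothetical stand-in for a folder of chronologically named txt files
documents = [
    ("1901.txt", "The railway reached the town. The railway changed trade."),
    ("1902.txt", "Trade grew and the town grew with it."),
]

for name, share in relative_frequency_over_time(documents, "railway"):
    print(name, round(share, 3))
```

Because the documents are processed in the order they are listed, naming files so that they sort chronologically is what turns a set of per-file counts into a trend over time.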
Named Entity Recognition
Stanford Named Entity Recognizer (or Stanford NER) looks for patterns in a text, and identifies and tags the words that name places, people, organizations, times, dates, etc. The results can be extracted and visualized.
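As a rough illustration of what extracting the results can look like, the Python sketch below groups tagged entities by category. It assumes the tagged text uses inline XML-style tags, with each entity wrapped in a tag named for its category; the function name and the sample sentence are invented for this example, and this is not the method used in any particular tutorial.

```python
import re
from collections import defaultdict

# Matches spans such as <LOCATION>Ottawa</LOCATION>; the backreference \1
# requires the closing tag to match the opening one.
TAGGED_ENTITY = re.compile(r"<([A-Z]+)>(.*?)</\1>", re.DOTALL)


def entities_by_category(tagged_text):
    """Group tagged entities into lists keyed by category, in order of appearance."""
    grouped = defaultdict(list)
    for category, entity in TAGGED_ENTITY.findall(tagged_text):
        # Collapse line breaks and repeated spaces inside an entity
        grouped[category].append(" ".join(entity.split()))
    return dict(grouped)


# Made-up sample of tagged output (an assumption about the format)
sample = (
    "<PERSON>Jane Doe</PERSON> travelled from <LOCATION>Kingston</LOCATION> "
    "to <LOCATION>Ottawa</LOCATION> on behalf of the "
    "<ORGANIZATION>Example Trading Company</ORGANIZATION>."
)

for category, names in entities_by_category(sample).items():
    print(category, names)
```

A dictionary of lists keeps every occurrence, so the same structure can later be counted or de-duplicated depending on what you want to visualize.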
Dr. Graham recommended a very useful tutorial by Michelle Moravec on how to use Stanford NER and then extract the results on a Mac. The tutorial also shows you how to organize the results into a categorized list (for example, a list by location). At first I had trouble running Stanford NER, as the command line told me there was an issue with Java: