Toolbox

Getting Started with Digital Work in History

Compiled by Toby Higbie for the “Laboring Big Data” panel at LAWCHA 2015.

DiRT: Digital Research Tools Directory: http://dirtdirectory.org/ . A directory of digital research tools; whatever kind of tool you’re looking for, you’ll probably find it here.

Miriam Posner’s Blog: http://miriamposner.com/blog/ Consistently useful information about history-oriented digital humanities issues, and thoughtful commentary on the politics of DH in universities. A good starting point is “How Did They Make That?”: http://miriamposner.com/blog/how-did-they-make-that/

Server Space: it can be difficult to get approval for server space at your university. A free option is GoogleDrive. There are many hosting companies. I use Dreamhost: http://dreamhost.com .

Panelists’ Projects:
Mapping Decline: http://mappingdecline.lib.uiowa.edu
Growing Apart: http://scalar.usc.edu/works/growing-apart-a-political-history-of-american-inequality/index
Mapping American Social Movements Through the 20th Century: http://depts.washington.edu/moves/index.shtml
Networked Labor: https://socialjusticehistory.org/projects/networkedlabor/

Basic Tools

OpenRefine: clean messy data faster
http://openrefine.org/
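
To give a sense of what “cleaning messy data” means in practice, here is a minimal Python sketch of one common idea: reduce each value to a normalized key so that near-duplicate spellings collide and can be reviewed together. It is loosely modeled on the key-collision style of clustering and is not OpenRefine’s own code; the function names `fingerprint` and `cluster` are made up for this illustration.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Reduce a string to a normalized key so near-duplicates collide."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", value)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase and delete punctuation (an assumption: real tools may differ).
    cleaned = re.sub(r"[^\w\s]", "", ascii_text.lower())
    # Sort the unique tokens so word order and repeats no longer matter.
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group values that share a fingerprint; keep groups of two or more."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(group) > 1]
```

Calling `cluster` on a column of organization names returns the groups of spellings that probably refer to the same thing; a person still has to decide which groups to merge.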

MS Excel: boring but essential. You don’t need a link, it’s already on your computer.

TextWrangler: simple, flexible text editing
http://www.textwrangler.com/products/textwrangler

Firefox. I’m still a fan. https://www.mozilla.org/en-US/firefox/products/ Get your files online with FireFTP, a Firefox add-on: https://addons.mozilla.org/en-US/firefox/addon/fireftp/

Google FusionTables (part of GoogleDrive: http://drive.google.com/ ): fastest way to make a map. Free, but of course they’re using your data somehow.

OpenStreetMap Nominatim: https://nominatim.openstreetmap.org/ . Use it to look up the latitude and longitude of specific addresses.
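
If you have more than a handful of addresses, you can ask Nominatim for coordinates from a script. The following is a rough Python sketch, assuming the public `/search` endpoint with `format=json`, which returns a list of matches whose `lat` and `lon` fields are strings. The helper names and the placeholder User-Agent are made up for this example; replace the User-Agent with something that identifies your own project, and keep requests slow (the public server asks for no more than about one per second).

```python
import json
import urllib.parse
import urllib.request

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(address):
    """Build a search URL that asks for the single best match as JSON."""
    params = {"q": address, "format": "json", "limit": 1}
    return NOMINATIM_SEARCH + "?" + urllib.parse.urlencode(params)


def parse_first_hit(json_text):
    """Return (latitude, longitude) from a JSON response, or None if empty."""
    results = json.loads(json_text)
    if not results:
        return None
    return float(results[0]["lat"]), float(results[0]["lon"])


def geocode(address, user_agent="my-history-project (you@example.edu)"):
    """Look up one address over the network (placeholder User-Agent)."""
    request = urllib.request.Request(
        build_search_url(address), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_first_hit(response.read().decode("utf-8"))
```

Only `geocode` touches the network; the other two functions can be tried offline with a saved response.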

Raw: vector graphics data visualizations. Cool, easy.
http://raw.densitydesign.org/

Voyant: visualize a corpus of text (word cloud, word counts, and other statistics): http://voyant-tools.org/
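
The word counts behind a word cloud are simple to reproduce by hand, which can help you understand what a tool like Voyant is showing you. A minimal Python sketch follows; it is not how Voyant itself works, and the function name `word_counts` and the very crude tokenizing rule (runs of letters, with an optional apostrophe) are assumptions made for illustration.

```python
import re
from collections import Counter


def word_counts(text, stopwords=frozenset()):
    """Count lowercase words in a text, skipping any listed stopwords."""
    # A crude tokenizer: letters, optionally followed by an apostrophe part.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(word for word in words if word not in stopwords)
```

For example, `word_counts(text, stopwords={"the", "of", "and"}).most_common(20)` lists the twenty most frequent remaining words in `text`.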

Gephi: open source network visualization platform. A little creaky, but hopefully will get an update soon. https://gephi.github.io/

Medialab Tools: essential for network charts and Gephi.
http://tools.medialab.sciences-po.fr/
Table 2 Net: converts CSV files to network tables via an easy interface.
Sigma.js: a Gephi plugin that creates interactive network charts you can post online.
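
The basic idea of turning a table into a network can be sketched in a few lines of Python: treat two columns of a CSV as the two ends of a link and count how often each pair occurs. This is only an illustration of the concept, not what Table 2 Net does internally. The function names are made up, and the `Source`, `Target`, `Weight` header row is chosen on the assumption that you will load the result through Gephi’s spreadsheet import as an edges table.

```python
import csv
import io
from collections import Counter


def csv_to_edges(csv_text, source_col, target_col):
    """Count how often each (source, target) pair appears in the CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    weights = Counter()
    for row in reader:
        source = row[source_col].strip()
        target = row[target_col].strip()
        # Skip rows where either end of the link is blank.
        if source and target:
            weights[(source, target)] += 1
    return weights


def edges_to_csv(weights):
    """Write the counted pairs as an edge table, sorted for stable output."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(weights.items()):
        writer.writerow([source, target, weight])
    return out.getvalue()
```

Given a spreadsheet with, say, a `person` column and an `organization` column (hypothetical names), `edges_to_csv(csv_to_edges(text, "person", "organization"))` produces text you can save as a `.csv` file and import.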

Zotero: free, or nearly free, citation management platform. http://zotero.org

Omeka: free and open source platform for building collections and exhibits of digital objects. Works much like WordPress, but with greater specification of metadata. Lots of history/humanities-specific plug-ins. http://omeka.org/

WordPress.org: the downloadable version of the widely used blogging platform. Not as much metadata specification as Omeka, but easier to get started. https://wordpress.org/

Information Sharing and Project Tracking: If you are collaborating with other people, the hardest part is often keeping track of tasks and goals. Some useful tools: GoogleDocs and Sheets (free and easy to share); Evernote (more elegant, paid version necessary for offline work); Basecamp (lots of tools for project tracking, reminders for deadlines, etc.; you will probably have to pay for it if you really want to use all the features).

Power Tools/Pricey Tools

Tableau: http://www.tableau.com/ Suite of charting and mapping tools. A free version is available.

ArcGIS: Geographic Information System program, free trial, then $$. http://www.arcgis.com/features/

CartoDB: another GIS platform with free and paid options: https://cartodb.com/

Social Explorer: http://www.socialexplorer.com/ (Oxford University Press, paid service based on U.S. Census and other social data).

Getting More Complicated

R. Open source statistics and charting package. http://www.r-project.org/

Timeline JS: Makes illustrated timelines. The hosted version links with GoogleSheets: http://timeline.knightlab.com/ You can also download the source code to run on your own site: https://github.com/NUKnightLab/TimelineJS

Named Entity Extraction
Stanford NER: free, runs on the desktop, and identifies people, organizations, and locations very well. Can be “trained,” but that’s beyond me. http://nlp.stanford.edu/software/CRF-NER.shtml
AlchemyAPI: free but proprietary, very effective. http://www.alchemyapi.com/
OpenCalais: free and CC licensed, but they keep your metadata. I have not tried it. Used by DocumentCloud: https://www.documentcloud.org/home . http://www.opencalais.com/home
OpenRefine NER Extension: free and open source. Uses other web services to identify entities. DBpedia is open source but not very effective; AlchemyAPI is proprietary and very effective. First install OpenRefine (formerly Google Refine), then install the NER Extension: https://github.com/RubenVerborgh/Refine-NER-Extension
