Getting Started with Digital Work in History
Compiled by Toby Higbie for the “Laboring Big Data” panel at LAWCHA 2015.
DiRT: Digital Research Tools Directory: http://dirtdirectory.org/ . You’ll find what you’re looking for here.
Miriam Posner’s Blog: http://miriamposner.com/blog/ Consistently useful information about history-oriented digital humanities issues, and thoughtful commentary on the politics of DH in universities. A good starting point is “How Did They Make That?”: http://miriamposner.com/blog/how-did-they-make-that/
Server Space: It can be difficult to get approval for server space at your university. A free option is Google Drive. There are also many commercial hosting companies; I use Dreamhost: http://dreamhost.com .
Panelists’ Projects. Mapping Decline: http://mappingdecline.lib.uiowa.edu . Growing Apart: http://scalar.usc.edu/works/growing-apart-a-political-history-of-american-inequality/index. Mapping American Social Movements Through the 20th Century: http://depts.washington.edu/moves/index.shtml. Networked Labor: https://socialjusticehistory.org/projects/networkedlabor/
Basic Tools
OpenRefine: clean messy data faster
http://openrefine.org/
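To give a sense of what "cleaning messy data" means in practice, here is a minimal Python sketch of key-based clustering, in the spirit of the clustering features OpenRefine offers. It is a simplified illustration, not OpenRefine's own code; the function names and the sample values in the usage note are made up for this example.

```python
import string
from collections import defaultdict


def fingerprint(value):
    """Normalize a string so trivially different spellings share one key."""
    cleaned = value.strip().lower()
    # Drop ASCII punctuation such as commas and periods.
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    # Sort and de-duplicate the words so word order no longer matters.
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values by fingerprint; return only groups with variant spellings."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

Calling cluster(["Local 25, Teamsters", "Teamsters Local 25", "Carpenters Local 1"]) puts the two Teamsters spellings in one group, which you could then review and merge by hand. OpenRefine does this kind of thing interactively and with more matching methods.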
MS Excel: boring but essential. You don’t need a link, it’s already on your computer.
TextWrangler: simple, flexible text editing
http://www.textwrangler.com/products/textwrangler
Firefox: I’m still a fan. https://www.mozilla.org/en-US/firefox/products/ Get your files online with FireFTP, a Firefox add-on: https://addons.mozilla.org/en-US/firefox/addon/fireftp/
Google FusionTables (part of GoogleDrive: http://drive.google.com/ ): fastest way to make a map. Free, but of course they’re using your data somehow.
OpenStreetMap Nominatim: https://nominatim.openstreetmap.org/ . Use this to look up the latitude and longitude of specific addresses.
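If you have more than a handful of addresses, you can query Nominatim from a script instead of the web page. The Python sketch below builds a search URL and reads the latitude and longitude out of the JSON reply. The function names and the sample address are illustrative. Nominatim's usage policy asks you to identify your application in the User-Agent header and to keep requests slow (about one per second), so check the current policy before running a large batch.

```python
import json
import urllib.request
from urllib.parse import urlencode

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(address):
    """Return a Nominatim search URL asking for a single JSON result."""
    params = {"q": address, "format": "json", "limit": 1}
    return NOMINATIM_SEARCH + "?" + urlencode(params)


def parse_lat_lon(response_text):
    """Return (latitude, longitude) as floats, or None if nothing matched."""
    results = json.loads(response_text)
    if not results:
        return None
    first = results[0]
    # Nominatim returns coordinates as strings, so convert them.
    return float(first["lat"]), float(first["lon"])


def geocode(address, user_agent):
    """Fetch one address. user_agent should name your project (assumption:
    you supply your own identifying string here)."""
    request = urllib.request.Request(
        build_search_url(address), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request) as reply:
        return parse_lat_lon(reply.read().decode("utf-8"))
```

For example, geocode("100 Main St, Iowa City", "my-history-project") would return a pair of numbers you can paste into a spreadsheet column, or None if the address was not found.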
Raw: vector graphics data visualizations. Cool, easy.
http://raw.densitydesign.org/
Voyant: visualize a corpus of text (word cloud, word counts, and other statistics): http://voyant-tools.org/
Gephi: open source network visualization platform. A little creaky, but hopefully will get an update soon. https://gephi.github.io/
Medialab Tools: essential for network charts and Gephi.
http://tools.medialab.sciences-po.fr/
Table 2 Net (convert CSV files to network tables via an easy interface)
Sigma.js: Gephi plugin that creates interactive network charts you can post online
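For readers curious about what "converting a CSV file to a network table" involves, here is a minimal Python sketch. It is not Table 2 Net's code, only an illustration of the idea: pick two columns, treat each row as a link between them, and count repeats as the link's weight. The column names and function names are hypothetical. The output uses Source, Target, and Weight headers, a layout Gephi can import as an edge table.

```python
import csv
import io
from collections import Counter


def csv_to_edges(csv_text, source_col, target_col):
    """Count how often each (source, target) pair appears in the CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    edges = Counter()
    for row in reader:
        source = row[source_col].strip()
        target = row[target_col].strip()
        # Skip rows where either end of the link is blank.
        if source and target:
            edges[(source, target)] += 1
    return edges


def edges_to_csv(edges):
    """Write the edges as Source,Target,Weight rows."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])
    return out.getvalue()
```

With a spreadsheet of people and the organizations they belonged to, csv_to_edges(text, "person", "organization") gives you one weighted link per person-organization pair, ready to save and open in Gephi.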
Zotero: free (or nearly free) citation management platform. http://zotero.org
Omeka: free and open source platform for building collections and exhibits of digital objects. Similar in feel to WordPress (though a separate application), but with greater specification of metadata. Lots of history/humanities specific plug-ins. http://omeka.org/
WordPress.org: the downloadable version of widely used blogging platform. Not as much metadata specification as Omeka. Easier to get started. https://wordpress.org/
Information Sharing and Project Tracking: If you are collaborating with other people, the hardest part is often keeping track of tasks and goals. Some useful tools: Google Docs and Sheets (free and easy to share); Evernote (more elegant; paid version necessary for offline work); Basecamp (lots of tools for project tracking, reminders for deadlines, etc.; you will probably have to pay if you want to use all the features).
Power Tools/Pricey Tools
Tableau: http://www.tableau.com/ Suite of charting and mapping tools. A free version is available.
ArcGIS: Geographic Information System program, free trial, then $$. http://www.arcgis.com/features/
CartoDB: another GIS platform w/ free and paid option: https://cartodb.com/
Social Explorer: http://www.socialexplorer.com/ (Oxford University Press, paid service based on U.S. Census and other social data).
Getting More Complicated
R. Open source statistics and charting package. http://www.r-project.org/
Timeline JS: Makes illustrated timelines. Hosted version links with GoogleSheets: http://timeline.knightlab.com/ Also can download source code to your site: https://github.com/NUKnightLab/TimelineJS
Named Entity Extraction
Stanford NER: free, runs on the desktop, and identifies people, organizations, and locations very well. Can be “trained,” but that’s beyond me. http://nlp.stanford.edu/software/CRF-NER.shtml
AlchemyAPI: free but proprietary, v. effective. http://www.alchemyapi.com/
OpenCalais: http://www.opencalais.com/home Free and CC licensed, but they keep your metadata. I have not tried it. Used by DocumentCloud: https://www.documentcloud.org/home
OpenRefine NER Extension: free and open source. Uses other web services to identify entities: DBpedia is open source but not very effective; AlchemyAPI is proprietary and very effective. First install OpenRefine (formerly Google Refine), then install the NER Extension: https://github.com/RubenVerborgh/Refine-NER-Extension