Labor’s New Men and Women of Power: What did they do in their spare time?

In his 1948 book, The New Men of Power, sociologist C. Wright Mills described trade union leaders as “strategic actors” at the helm of “the only organizations capable of stopping the main drift toward war and slump.” That quote came to mind as I pored over the pages of the 1946 directory, Who’s Who in Labor, an echo of Solon De Leon’s 1925 American Labor Who’s Who and Press Directory. From the Baltimore steelworker Edwin Abbott to Joseph Zych, the recording secretary of an AFL federal local in East Chicago, Indiana, the 390 pages cover a lot of human, organizational, and geographic territory. There is also a separate section of labor professionals: “Men and Women Who Deal with Labor.”

Like its 1925 counterpart, the postwar labor directory is hard to work with. The text behind the scanned pages is a mess. But the entries are much more systematic than the 1925 directory, making it easier to extract targeted information. Best of all, most entries include sections on Interests, Clubs, and Sports, making it possible to get a sense of what these men and women new to power liked to do in their spare time. With the help of a detail-oriented UCLA undergraduate student, I was able to compile a relatively clean version of the directory–over 3,900 entries.

Using a simple grep command on the directory’s text, I was able to pull out many (although probably not all) of the words listed after the phrase “Interests:” and before the start of the next section (typically “Sports”). Replacing the commas in these lines with hard returns, I compiled a rough and ready list of just over 3,000 interests. The vocabulary is not entirely consistent, and there are a number of obvious OCR errors. That said, it is possible to get a rough picture of what labor leaders’ “Interests” were (keep in mind that sports and clubs had their own entries).
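For readers who want to try something similar, here is a minimal Python sketch of that extraction step. It is not the grep command I ran. The sample entries, the names in them, and the assumption that the next field label is always “Sports:” or “Clubs:” are all hypothetical stand-ins for the real, messier text.

```python
import re
from collections import Counter

# Hypothetical sample lines standing in for the cleaned directory text.
# Real entries are longer, and the field order may vary.
SAMPLE = """\
DOE, John. Steelworker. Interests: Gardening, reading. Sports: Bowling.
DOE, Jane. Organizer. Interests: Reading, music, photography. Clubs: None. Sports: Tennis.
ROE, Richard. Secretary. Sports: Fishing.
"""

# Capture everything after "Interests:" up to the next field label or line end.
INTERESTS = re.compile(
    r"Interests:\s*(.*?)\s*(?=(?:Sports|Clubs):|$)", re.MULTILINE
)


def extract_interests(text):
    """Return a flat list of lower-cased interests, one per comma-separated term."""
    found = []
    for match in INTERESTS.finditer(text):
        for term in match.group(1).split(","):
            term = term.strip(" .").lower()
            if term:
                found.append(term)
    return found


counts = Counter(extract_interests(SAMPLE))
for interest, n in counts.most_common(3):
    print(interest, n)
```

Lower-casing and stripping punctuation merges only the most obvious variants. OCR errors and inconsistent vocabulary would still need manual cleanup.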

What was the top interest? Gardening and reading come in at about 200 mentions each, followed by fishing, music, photography, and hunting. Curiously, members of the AFL were more likely to list “gardening,” while members of the CIO were more likely to list “reading.”

Interests of CIO leaders, 1946
Rough assessment of “interests” of CIO leaders drawn from the 1946 “Who’s Who in Labor.”
Interests of AFL leaders, 1946
Rough assessment of “interests” of AFL leaders drawn from the 1946 “Who’s Who in Labor.”

Thick Networks

In this next series of images I’ve shifted the emphasis from the relationship between people and organizations to the links that people share by virtue of belonging to the same organizations. In the former case, both people and organizations are nodes of the network connected by lines (edges). In the latter case, only people are nodes. Organizations are represented by lines connecting people. Each person is connected to every other person with whom they share an organization. The result is a much thicker field of relationships–or at least the appearance of it.
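The shift from a two-type network to a people-only network can be sketched in a few lines of Python. This is an illustration of the general technique (a one-mode projection), not the procedure Gephi runs, and the names are placeholders.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical person -> organization memberships (placeholder names).
memberships = [
    ("Person A", "Union X"),
    ("Person B", "Union X"),
    ("Person C", "Union X"),
    ("Person C", "Party Y"),
    ("Person D", "Party Y"),
]


def project_onto_people(pairs):
    """Link every two people who share an organization.

    Returns a dict mapping a sorted (person, person) pair to the set of
    organizations the two have in common.
    """
    members = defaultdict(set)
    for person, org in pairs:
        members[org].add(person)

    shared = defaultdict(set)
    for org, people in members.items():
        for a, b in combinations(sorted(people), 2):
            shared[(a, b)].add(org)
    return dict(shared)


person_edges = project_onto_people(memberships)
for (a, b), orgs in sorted(person_edges.items()):
    print(a, "--", b, "via", ", ".join(sorted(orgs)))
```

An organization with n members contributes n(n-1)/2 person-to-person links, which is why the projected chart looks so much thicker than the original.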

When I cycle through this set of images, I am reminded of brain scan imagery in which various types of stimuli ignite neurons in different parts of the brain. (In the gallery images, the colored dots are connected to the selected person while the white dots are not connected). There are four major clusters. None are completely homogeneous, but in the interest of description I’ll name them as if they are. From left to right, we begin with the immigrant trade union militants (e.g., Pauline Newman, and Rose Schneiderman, who is slightly more connected to the WTUL group), then move through the central group made up of important union and Socialist leaders (e.g., Debs, Berger, Maurer, Dubinsky), which also includes African American unionists Chandler Owen and Frank Crosswaith. Moving further to the right we come to the mainstream, unremarkable AFL unionists on the bottom right, and finally at the top right corner we finish with railroad brotherhood leaders.

You can see the visual effect more strongly, although more slowly, in the interactive version of the chart, “One node network chart.”

Situations and Relations

Back in February, I gave a talk to the UCLA Digital Labor Working Group about my network analysis with the Labor Who’s Who data. You can see my slides here.

Patterns within a welter of information. Red dots represent people, light green dots represent organizations.

I opened with the idea that “the labor movement” is an abstraction–a place-holder phrase that means different things at different times. The American Labor Who’s Who was a particular version of that abstraction, created at a particularly contentious moment in labor history. It was compiled by a team led by Solon De Leon (son of a famous radical polemicist), and published by the Socialist Party-aligned Rand School of Social Science. It describes a labor movement that encompasses not only trade unions, but also radical political movements, immigrant organizations, researchers, journalists, and what we would call “NGOs” today. My analysis, drawn from data extracted from the Who’s Who, is an abstraction of an abstraction.

It’s worth beginning with this caveat because computation and data visualization have an aura of legitimacy these days. These network charts (created in Gephi) are representations of reality, not reality itself. They are best used as models of plausible past realities, tools for thinking through problems of historical argument, rather than as illustrations per se.

I began with the broadest and busiest view of the data: all the people in the Who’s Who and the organizations they belonged to (slide 1). The mathematical model that creates this chart draws more connected elements, or “nodes,” closer to the center and pushes less connected elements to the edges. A node’s size depends on how connected it is to other nodes, and lines connect people to the organizations they belong to. In these charts, the lines, or edges, have direction. People belong to organizations, so the edges radiate from each person to their corresponding organizations.
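To make “connectedness” concrete, here is a small Python sketch that counts connections on a directed person-to-organization edge list and turns the count into a node size. The linear sizing rule and the `base` and `step` parameters are assumptions for illustration. Gephi’s actual sizing depends on the ranking settings chosen in the program.

```python
from collections import Counter

# Hypothetical directed edges: (person, organization).
edges = [
    ("Person A", "Union X"),
    ("Person A", "Lodge Z"),
    ("Person B", "Union X"),
    ("Person C", "Party Y"),
]

# Edges run from people to organizations, so people have out-degree
# (memberships held) and organizations have in-degree (members listed).
out_degree = Counter(person for person, _ in edges)
in_degree = Counter(org for _, org in edges)


def node_size(name, base=1.0, step=0.5):
    """Scale a node linearly with its total number of connections (assumed rule)."""
    return base + step * (out_degree[name] + in_degree[name])


for name in sorted(set(out_degree) | set(in_degree)):
    print(name, node_size(name))
```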

In broad strokes, the first graph presents a ring of organizations roughly the same size, three organizations that are noticeably larger on the inside edge of the ring, and several groupings of people inside the ring. Without knowing the names of the people or the organizations, it appears that three or four organizations dominate the institutional field of the labor movement. There is also a lot of “noise.”

The right wing constellation of organizations included the AFL, fraternal societies, and the Democratic and Republican parties.

The left-wing constellation of organizations centered on the Socialist Party.

The next two slides try to filter out some of that noise by focusing on the “right” and “left” flanks of this social formation (think of it as “stage right”). The American Federation of Labor (AFL) and the Masons dominate the right side of the field (slide 2), surrounded by other fraternal organizations (Elks, Odd Fellows, Moose, etc.), mainstream political parties, and four trade unions–the Printers (ITU), Machinists (IAM), Miners (UMWA), and Carpenters (UBC). On the left (slide 3), the Socialist Party dominates, and is surrounded by independent unions (two garment worker unions and the IWW), left-wing parties and para-party organizations (Communist and Workers parties, the Trade Union Educational League, left-wing youth organizations, and the Workmen’s Circle). Worth noting: the spatial position of a node has no relationship to its place on the left/right political spectrum. The Women’s Trade Union League and the American Federation of Teachers, for instance, are farther away from the SP than the Workers’ Party is. (In the future I should probably reorient these vertically!)

Henry Ohl was a leading figure in the Wisconsin labor movement, and the Univ. of Wisconsin School for Workers.

Next come two slides that focus on two individuals who show up near the center of the graph, and represent mediating figures between the AFL and SP-oriented flanks of the movement. Henry Ohl, Jr. (slide 4) was a Milwaukee Socialist and a printer who championed the University of Wisconsin’s School for Workers.

Socialist editor Max Hayes was unusually well connected to the key organizations of the 1920s labor movement.

Max Hayes (slide 5) was a Cleveland Socialist–another printer–and the editor of the Cleveland Citizen. Both men started working in their early teens, apprenticed as printers, and were deeply involved in Socialist politics. Compare these two men with William Z. Foster (slide 6). He also linked the AFL and the SP, but by 1925 was publicly associated with the Workers’ Party and is placed farther on the periphery of the graph. Similarly, women union activists sit on the periphery of these network graphs, as do a number of labor intellectuals.

Whether Foster (or Pauline Newman or A. Philip Randolph) was less “central” to the labor movement of 1925 than Ohl or Hayes is not really what the graph explains. Centrality in this model is not the same as “importance.” Ohl and Hayes are more “central” because they were members of fraternal associations, and their membership creates a relationship in this model that draws them closer to the many non-Socialist men who were likewise part of the world of the Masons, Odd Fellows, Elks, and Moose.

Communist leader William Z. Foster’s network profile.

Unfortunately, we can’t see how this chart would change by 1940 when new leaders and organizations were in the field, and some of those on the periphery in 1925 moved to the center (e.g., Sidney Hillman). But the lack of chronology also helps us see the way careers in the labor movement spanned multiple institutions (e.g., Max Hayes in the People’s Party and the SP).

Labor and radical history is often told one organization at a time, one city at a time, one campaign at a time. Of course we use the singular focus as a way to get at broader themes. When I researched my first book, I began with IWW harvest workers, and that opened out onto a whole constellation of social forces, places, and people. Network graphs, for all their complications and limitations, turn our eyes first to the relatedness that structures a social field. The “labor movement” of the 1920s was a particularly contentious place where splits between one wing or the other severed ties between erstwhile comrades. But groups and individuals in contentious relationships are still in relationships. A labor movement divided and fighting was still a movement to overturn the worst abuses of capitalism.

An insight I’ve gained from my research on workers’ education in between the world wars is that organizational schisms were not always the end of the story. Quite often they produced more talk, more action, and more learning. “There is no one road to freedom,” said the author of a popular workers’ education pamphlet, “There are roads to freedom.”

**************************

Note: I know the charts mix up colors and orientations. Extracting good charts from Gephi is one of the big challenges of this project, and I’m working on some other–also imperfect–ways to share the visualizations in more active form.

Networked Labor Movement: I reach an impasse, and go around

This is the fourth in a series of posts I am writing to help me think through the use of network analysis and visualization.

A simplified network chart based on the complete ALWW directory. The chart shows only individuals with 3 or more connections.

About seven months ago, I was merrily chugging along on this series using the index of the 1925 American Labor Who’s Who as a database for network analysis when I hit an impasse. I was using the list of names and organizations from the book’s index to build network charts. However, the simple structure of the index, so handy for the analog book, adds a layer of abstraction/interpretation that gets in the way of analysis.

The Labor Who’s Who index presents names according to two types of categories. The first might be called “varieties of organization” and includes American Federation of Labor Affiliated Bodies, Independent Unions, Political Parties, and Miscellaneous. Of these, only “AFL-affiliated” is an organic category. “Political Parties,” on the other hand, is a conceptual category, not an entity that the Socialist Party or the Republican Party affiliated with. At the next level down things get more complicated. Things get even messier in the Miscellaneous category, which includes Journalists and Writers, Negro Progress, Workers Education, and a few others. Unfortunately, the index doesn’t tell us the particular newspapers and organizations that make up these sub-groupings in Miscellaneous.

Neither does the index list all the organizational affiliations listed in individual entries. It is more of a snapshot of what the compilers thought were the most important memberships of each person. The result is a simplified, and perhaps distorted, image of the network of associations, and my research impasse. I was at the point of pulling out particular sections of the network chart (those individuals who sat between the two main groupings), but it seemed better to stop and develop the full database than continue with the index alone.

Easier said than done. The complete directory of over 1,000 names is much messier than the index (see the post “Old Book, New Data”). In addition to basic OCR scanning errors there are a few missing and torn pages in the scanned version. The enormity of the task of cleaning the data myself loomed. One solution was to “crowd source” the data cleaning, but that might take a long time and who would really be interested? Another potential solution was to deploy undergraduate students as a “curated crowd.” Because I was already scheduled to teach an upper division lecture course on American Working Class Movements in the fall of 2014, I developed a course project that included a small amount of data cleaning for students–and (as it turned out) a lot of help from two graduate students in the UCLA Center for Digital Humanities. I’ll write about what went right and wrong with that process in a later post, but the upshot is that now I have a working version of the complete directory.

And with that news, I will begin to post more regularly over the next month.

Networked Labor Movement: Edges and Mediators

This is the third in a series of posts I am writing to help me think through the use of network analysis and visualization.

A more attractive, but somewhat less informational, version of the chart showing the mediators grouped into their own node. Note that the node is green because it is made up of individuals.

My first post in this series off-handedly introduced the phrase “bipolar labor movement”–which I suppose is a nice way to avoid calling it schizophrenic. Then I took a sideways step to flesh out the contents of the major categories in the American Labor Who’s Who index. Now we can move on to look at the connections between all those dots that make the cool-looking network charts (right).

In network analysis lingo these links between people, organizations, and groups of organizations are called “edges.” In this post I’m going to look at a number of different layouts, some of which will be prettier than others. This is partly a function of Gephi, which has two ways of viewing the charts: Overview (not as pretty but more analytically functional) and Preview (less analysis and more graphic beauty).

A network chart based on the index of the American Labor Who’s Who (1925). Blue dots represent major categories, red dots are organizations or subcategories, and green dots represent individuals.

If you recall from the first post in the series, I came up with something that looks like a scatter plot (left). Green dots represent individuals, red dots represent subcategories of the index, and blue dots represent top-level categories. Below, I’ve used the same image, but made the edges visible.

One of the problems here is that there are so many nodes and links tightly packed that it gets very hard to make sense of them in the aggregate–the main reason I began with a simplified and abstracted version in the first post. In Gephi, you can filter out the less networked nodes (say, anyone who isn’t in at least two categories/groups). But for the moment it’s interesting just to ponder the whole messy lot and look for possible patterns.
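The same kind of filter is easy to sketch outside Gephi. The Python below is an illustration under simple assumptions: the data is a list of (person, group) pairs, the names are placeholders, and “less networked” means belonging to fewer than two groups.

```python
from collections import defaultdict

# Hypothetical (person, group) pairs; names are placeholders.
pairs = [
    ("Person A", "Union X"),
    ("Person A", "Party Y"),
    ("Person B", "Union X"),
    ("Person C", "Party Y"),
    ("Person C", "Journalists and Writers"),
]


def filter_by_min_groups(pairs, minimum=2):
    """Keep only the edges of people who belong to at least `minimum` groups."""
    groups = defaultdict(set)
    for person, group in pairs:
        groups[person].add(group)
    keep = {person for person, joined in groups.items() if len(joined) >= minimum}
    return [(person, group) for person, group in pairs if person in keep]


filtered = filter_by_min_groups(pairs)
for person, group in filtered:
    print(person, "->", group)
```

Raising `minimum` thins the chart quickly, so it is worth trying a few thresholds before reading much into what remains.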

Network chart showing edges (linkages) based on index of the American Labor Who’s Who (1925) with major groups labeled.

The clearest bits of new information are that there are a number of links, and a group of individuals (green dots), in between the major (blue) nodes. This seems potentially important. The individuals in the middle appear to be the bridge that links an otherwise polarized social formation. Did they really have such a function in historical context, or is their position on the chart an artifact of the program parameters that create the chart in the first place?

By selecting this group of nodes in Gephi we can see what they link to: mainly the AFL, Misc. Groups, Journalists and Writers, Political Parties, the Socialist Party, and Workers’ Education. So far so good. These are all likely places to find people who served as liaisons between unions and what today we would call NGOs. Let’s call these people “mediators” because they sit in the middle of, and link, the AFL and everyone else.
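I picked this group by eye, from its position on the chart. A rule-based alternative, offered only as a sketch, would be to flag anyone linked both to the AFL category and to at least one other top-level category. The data and the rule itself are assumptions, and the rule would not necessarily return the same people as the visual selection.

```python
from collections import defaultdict

AFL = "AFL-affiliated Bodies"

# Hypothetical (person, top-level category) pairs; names are placeholders.
pairs = [
    ("Person A", AFL),
    ("Person A", "Miscellaneous"),
    ("Person B", AFL),
    ("Person C", "Political Parties"),
    ("Person C", "Miscellaneous"),
]


def find_mediators(pairs, anchor=AFL):
    """Return people linked to the anchor category and to at least one other."""
    categories = defaultdict(set)
    for person, category in pairs:
        categories[person].add(category)
    return sorted(
        person
        for person, joined in categories.items()
        if anchor in joined and len(joined) > 1
    )


mediators = find_mediators(pairs)
print(mediators)
```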

The group of roughly 50 individuals who appear between the major nodes have been selected. The bright green lines point to groups/categories they belong to, and the names of those groups are visible. Non-connected nodes are faded in background. Chart produced in Gephi.

Now, for the sake of simplifying the chart, we’ll group the “mediators” into their own node (Below: the green dot in between the two big blue circles. I’ve also rotated the chart to get a closer view). To do this in Gephi, you right-click on the highlighted group and choose “Group” from the menu. With the same mouse command you can tell Gephi to highlight the group in the “Data Laboratory” (i.e., the interface for looking at the underlying tables that make up the charts). In the image below, the “mediators” group and all the nodes it connects to are selected/highlighted. Everything else (non-linked nodes) is faded out. See all the white dots in the green field surrounding the AFL node? Those are non-selected individuals. So this chart represents a sub-network of the broader dataset: the mediators (a group of individuals–green circle) and all the organizations (red) and categories of organizations (blue) they belong to.
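Conceptually, grouping collapses many person nodes into one and keeps a single edge to each organization any of them belonged to. Here is a minimal Python sketch of that idea. It illustrates the concept rather than Gephi’s internals, and the names and the `label` parameter are placeholders.

```python
# Hypothetical (person, organization) pairs; names are placeholders.
pairs = [
    ("Person A", "Union X"),
    ("Person B", "Union X"),
    ("Person B", "Party Y"),
    ("Person C", "Party Y"),
]


def contract(pairs, group_members, label="mediators"):
    """Replace each listed person with one shared node and drop duplicate edges."""
    merged = set()
    for person, org in pairs:
        source = label if person in group_members else person
        merged.add((source, org))
    return sorted(merged)


contracted = contract(pairs, {"Person A", "Person B"})
for source, org in contracted:
    print(source, "->", org)
```

The cost of the simplification is visible here: once Person A and Person B are merged, the chart can no longer show which of them belonged to which organization.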

The “mediators” have been grouped into a single node and selected. Organizations or categories linked to this group of individuals are visible while non-connected orgs are faded in the background. Network chart created in Gephi.

The next step is to figure out who these individuals are. Turns out I’ve selected 54 individuals in all. Among the more well-known are Fannia Cohn (ILGWU, workers’ education), Max Hayes (editor of the Cleveland Citizen and prominent Socialist), Arturo Giovannitti (ILGWU, formerly IWW), Matthew Woll and John Frey (AFL arch-conservatives), Alice Henry (WTUL), Fred Hewitt (editor of Machinists Monthly Journal), and a number of other labor union newspaper editors. I’ll have to spend a little time running through this list to make solid conclusions, but it makes sense that there are so many editors and writers.

But I’m running out of steam and will have to leave that for another day. I will leave you with this much nicer version of the same chart. I’m not sure what it means, but it really looks like a peacock!

A more attractive, but somewhat less informational, version of the chart showing the mediators grouped into their own node. Note that the node is green because it is made up of individuals.

Networked Labor Movement–one step backward

This is the second in a series of posts I expect to write to help me think through the use of network analysis and visualization. Read the first post, and a backgrounder.

A network chart based on the index of the American Labor Who’s Who (1925). Blue dots represent major categories, red dots are organizations or subcategories, and green dots represent individuals.

As one of my correspondents said of my last post: interesting picture, but it’s meaningless without the background data. Well, maybe not meaningless, but abstracted in the extreme. So I’m going to back up a bit, partly for my own sake, to scope out the major categories, subcategories and organizations in the dataset (i.e., the blue and red dots in the chart to the right).

To review, this data is drawn from the index of the digitized version of the American Labor Who’s Who (1925), so it represents what the compilers thought were the relevant organizational contexts for the people listed in the directory at the time it was printed. The actual entries in the Who’s Who often include mini-career histories, which makes them potentially more interesting, but also more complicated to work with as data.

Rather than run tables, I’ve made these “tree map” images with Raw, which is a great tool, but has limited ability to adjust labels, so some of these are a little messy. The major categories are AFL-affiliated Bodies, Independent Unions, Political Parties and Miscellaneous Groups (numbers represent individuals in the category, some people are in more than one category):

The AFL, Political Parties, and Independent Unions encompass organizations. “Miscellaneous Groups” includes specific organizations and functional subcategories (e.g., Journalists and Writers and Impartial Arbitrators, as well as the League for Industrial Democracy). The AFL-affiliated group is large and full of little organizations with one or two people listed. Here’s a chart of the AFL-affiliated organizations with 10 or more members in the Who’s Who. It’s interesting that the Women’s Trade Union League makes it into this list because women are otherwise underrepresented.

American Federation of Labor-affiliated organizations or groupings with 10 or more members in the ALWW index.

Below is a breakdown of the “Independent Unions” where I’ve combined all the railway unions for the sake of getting a better chart. There was one representative of African American rail unionism in that group, but the Brotherhood of Sleeping Car Porters (founded in 1925) didn’t make it into the Who’s Who. A. Philip Randolph, Chandler Owen and a few others appear under “Negro Progress” groups and in some AFL unions. So the Amalgamated Clothing Workers is really the largest non-AFL union in the Who’s Who. Also worth noting, by 1925 many militants had moved on from the Industrial Workers of the World (IWW). So in the index they have no connection, whereas their entries often list former membership.

Independent unions represented in the ALWW index (various railway unions combined for better visualization).

The next subcategory is Political Parties. In the actual directory quite a few people are listed as Democrats and Republicans, but not in the index. So this is really “left political parties” or “working-class political parties.”

Political parties represented in the ALWW index, apparently excluding the Democrats and Republicans which show up frequently in the full directory.

And finally, that large category “Miscellaneous Groups.” In later posts I’ll zero in on “Journalists and Writers” as well as a key group of individuals that link the AFL unions with the para-union organizations.

Chart of the subcategories and organizations listed under “Miscellaneous Groups” in the ALWW index.

The printed Who’s Who also has a geographic index, but I have yet to convert that into a spreadsheet. It would be interesting to see how the categories, subcategories and organizations look spatially. But that will have to wait for another day.

Next up, I return to Gephi and the network charts, add the links between groups and explore some individuals who seem to occupy key positions between the two poles of the 1920s labor movement.

The Networked Labor Movement

This is the first in a series of posts I expect to write to help me think through the use of network analysis and visualization.

When I started converting the printed American Labor Who’s Who to an electronic database, I knew the data would be a handy reference tool for students. But I also hoped to use the data for my own research, and that it might even be instructive for contemporary activists. In particular, I figured the directory of labor and radical leaders might help us see the interconnections between organizations and people that make up the thing we call “the labor movement,” and the fact that the movement was broader than “trade unionism” alone.

Why does that matter? Well, if we consider that union membership is currently below 10% of the private sector workforce, things seem pretty hopeless for Labor. How can a social group as defensive and marginal as that ever hope to assert real power again? But if we think of the unions as part of a broader political and social grouping that also includes journalists, educators, activists and lawyers–then we have something much larger and broader. That’s important not just for politics today, but for the way we think about historical change. As a number of labor scholars have noted, the labor movement tends to grow in sudden, massive upsurges rather than by slow steady accretion. The question is, what enables these upsurges?

For much of the 1920s and 1930s, union density was low and employers had the upper hand. Unions and radicals were divided against each other. A lot of energy went into expelling dissidents and poaching members from other organizations. Old forms of unionism held on to authority, while newer forms remained inchoate or marginalized. But unionism and progressive/radical political activism held on and, in the late 1930s and 1940s, grew exponentially. Legal and macro-political changes had a lot to do with that upsurge–especially a new federal policy in favor of collective bargaining and the full employment context of World War II. But the massive and swift growth in union membership and power was also based on a network of local militants who carried out the organizing drives, produced labor newspapers and radio shows, and staffed the strike kitchens and community support networks that sustained activism.

So consider this chart, based on the index of the American Labor Who’s Who, which lists individuals by category (e.g., AFL affiliated, independent unions, miscellaneous), and by organization or subcategory (e.g., United Mine Workers or Journalists & Writers). Note: elsewhere, I’ve explained the limits of this source in terms of representativeness, and why it’s still worth using. This analysis is based on the roughly 1,300 U.S. entries.

A network chart based on the index of the American Labor Who’s Who (1925). Blue dots represent major categories, red dots are organizations or subcategories, and green dots represent individuals.

I extracted the text of the index from the ePub version of the Who’s Who on the HathiTrust Digital Library, and converted it into a spreadsheet in Microsoft Excel. Using the Table 2 Net website I converted a CSV formatted version of the spreadsheet into a bipartite network table. Then I opened that table in Gephi, a free network analysis and visualization program, and created a chart with the Force Atlas algorithm.
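For anyone who would rather script the conversion than use a website, here is a rough Python sketch of turning a spreadsheet export into node and edge tables. It does not reproduce Table 2 Net’s output. The column names (`name`, `category`, `organization`) and the sample rows are hypothetical, and the Source/Target/Type header follows what, as I understand it, Gephi’s spreadsheet import expects for an edge table.

```python
import csv
import io

# Hypothetical CSV export of the index spreadsheet; column names are assumed.
RAW = """name,category,organization
Person A,AFL-affiliated Bodies,Union X
Person B,Political Parties,Party Y
Person A,Miscellaneous,Journalists and Writers
"""


def build_tables(csv_text):
    """Return (nodes, edges): node id -> node type, and unique (source, target) pairs."""
    nodes = {}
    edges = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        person, category, org = row["name"], row["category"], row["organization"]
        nodes[person] = "person"
        nodes[org] = "organization"
        nodes[category] = "category"
        # Edges always run from the person to the thing they belong to.
        edges.add((person, org))
        edges.add((person, category))
    return nodes, sorted(edges)


nodes, edges = build_tables(RAW)

# Write the edge table to an in-memory CSV string.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(["Source", "Target", "Type"])
for source, target in edges:
    writer.writerow([source, target, "Directed"])
edge_csv = out.getvalue()
print(edge_csv)
```

Because every edge starts at a person and ends at an organization or category, the result is bipartite by construction: no edge ever joins two people or two organizations.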

In a network you have “nodes” and “edges.” This is a “bipartite” network, meaning there are two kinds of nodes: people and categories of organization/activity. The edges are the connections between the two types of nodes. This is a “directed” network, which means that the lines of connection (the edges) only flow in one way: individuals are members of organizations, subcategories, and categories of organizations.

The chart orients around two poles of about equal size: American Federation of Labor (AFL)-affiliated bodies and everyone else (including journalists, independent unions, and political parties among others). Depending on your mood you could read this as affirming the AFL as the dominant player in this social field, or as suggesting the diversity and balance of players. Or you might suggest there was some level of tension and conflict between the two poles. It’s useful to remember that this chart is an analytical tool, not necessarily a direct representation of reality–and there are layers of “bias” baked into the data from its origins.

This chart is designed to accentuate the separation of the groups for analytical purposes. It doesn’t show the edges (connections between and among people and organizations), only the relative groupings. I’ll get into the linkages between groups in subsequent posts. In particular, I’m interested in the group of green dots that sits between the AFL and Miscellaneous poles. This turns out to be made up of editors of major union and labor federation newspapers. They were a key group that linked unions to the broader working-class public sphere in large part because they formed bridges between unions and other social sectors–something that seems to be represented here in the chart.

Old Book, New Data

(Originally posted on bughousesquare.wordpress.com)

Over the past year or so I’ve been working on a digital history project that aims to convert the 1925 American Labor Who’s Who into a research and teaching database and wiki. It continues to be “a learning experience,” as my mother used to call all the unpleasant encounters of childhood. Not all bad, to be sure, but not all good. Since I have versions of the data up on the internet, I thought I should post some reflections.

Labor historian Jon Beck from the Michigan State Industrial Relations program started my thinking about the Labor Who’s Who around 2007 or so when he suggested it might be useful for my project on working-class autodidacts. The Rand School of Social Science sponsored the compilation of the Who’s Who in 1925 under the direction of Solon De Leon (son of famed radical Daniel De Leon). De Leon and his colleagues threw open the front door to the House of Labor, so to speak: the roughly 1,300 U.S. entries include activists in the fields of immigrant rights, civil liberties, cooperatives, and progressive and radical politics, as well as the to-be-expected trade unionists (there are 300 additional non-U.S. activists, a few of whom were deported or self-exiled U.S. activists).

Nineteen twenty-five was a curious moment for the American labor movement. The industrial union upsurge of the 1910s was sputtering under the weight of repression, factionalism, and failure. The powerful unions of the CIO were a decade or more in the future. Meanwhile, conservatives held a tight, if a bit desperate, grip on the political machinery of trade unionism at the national level, antiunion Republicans were in the White House, and reactionary groups like the KKK and American Legion were popular. And yet, there was a great deal of activity and organizational creativity in some unions, and there was a blossoming network of labor colleges training the leaders of the ’30s.

The Labor Who’s Who is a snapshot of this contingent moment and some of the people who lived it. Each entry is a telegraphic biography. Some provide only name, professional title and address at the time of publication. But many sketch rich life histories. Nearly all provide details on birth date and place, family background, education, migration, and work histories, as well as key organizations, events and publications. It includes both long-serving elders whose careers stretched back to the 1870s, and emerging leaders who would continue to be active into the second half of the 20th century.

For years I had a library copy of the book on my office shelf, thinking I would get to the project eventually. Then in 2012 I discovered the book had been scanned by Google and was sitting behind the access wall in the HathiTrust (HT) digital collection. You could search keywords, but the search only returned a few words and a page number. From my keyword searches, I knew that about 40 individuals identified themselves as “self-educated,” but learning more about the educational and organizational matrix represented in the directory was just beyond reach. Hoping to avoid the wrath of Disney and other commercial publishers, HT takes a defensive approach to copyright. Most things published after the easy cutoff for public domain (before 1923) go behind the access wall.

Very frustrating. And ironic. Here was a book published by a radical college, locked behind a copyright wall at the behest of capitalist media corporations. Not that these corporations give a hoot about the Labor Who’s Who, it’s just structural. Everything after 1922 goes behind the wall unless someone specifically requests it be freed.

Thus was born what I’m now calling the “HathiTrust Liberation Project.” Hundreds and hundreds of labor and leftist volumes published between 1923 and 1963 are in the public domain unless their copyright holders renewed the copyright (there is an online database for checking renewed copyrights: http://comminfo.rutgers.edu/~lesk/copyrenew.html ). Unlike literary works, mundane works of non-fiction and social movement publications usually were not renewed. Many of these volumes are already digitized, but are blocked. Likewise, a surprising number of post-1923 government documents are behind the access wall.

The Labor Who’s Who was my first foray into old book liberation. Through the good graces of the UCLA Library, I was able to convince HT that the copyright on the Labor Who’s Who probably wasn’t renewed, and in any case the socialists won’t kick if you open it up. Somebody flipped a switch and the volume appeared. This was in the spring or summer of 2012.

The next task was extracting and cleaning the OCR’d text. This turned out to be a little more complicated than I expected. In the end, I downloaded an EPUB version of the Who’s Who, and copied and pasted the text into a separate file. So far, so good. But this was a long way from a database. With the help of UCLA librarian Zoe Borovsky and Miriam Posner of the Center for Digital Humanities, I broke the text up into discrete entries and, eventually, data fields. However, there were many, many text recognition errors. I probably could have hired someone to fix them (if I had the money), but in the end I did most of the corrections myself. Let’s just say I became intimately familiar with the contents of the book. And isn’t that the traditional activity of scholarly humanists after all, even if this mode of familiarity generally is not recognized as such by personnel committees?

So by the late fall of 2012, I had a relatively clean text file with entries broken into fields: name, titles, birthplace, birth date, father’s occupation, and a residual field that was too irregular to easily parse that included things like education, organizations, activities, publications, home and work address. Next came the task of reorganizing this information from a flow of text into a spreadsheet, rather tediously done by cutting and pasting in Microsoft Excel.
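The move from fielded text to spreadsheet rows can also be scripted rather than done by hand. The sketch below is purely illustrative: it assumes each cleaned entry sits on one line with fields separated by a pipe character, which is a hypothetical format and not necessarily how my text file was laid out. The sample entries are invented and the column names are my own labels for the fields listed above.

```python
import csv
import io

# Column labels for the fields described in the post; "other" is the
# irregular residual field (education, organizations, addresses, etc.).
FIELDS = ["name", "titles", "birthplace", "birth_date",
          "father_occupation", "other"]

# Invented sample entries in an assumed pipe-delimited layout.
SAMPLE_TEXT = """DOE, Jane|organizer|Chicago, Ill.|1890|carpenter|Educ. public schools; member Example Union
ROE, John|editor|Leeds, England|1875
"""


def parse_entries(text, delimiter="|"):
    """Split each non-empty line into the known fields, padding short lines."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Limit the split so extra delimiters stay inside the residual field.
        parts = [p.strip() for p in line.split(delimiter, len(FIELDS) - 1)]
        parts += [""] * (len(FIELDS) - len(parts))
        records.append(dict(zip(FIELDS, parts)))
    return records


def to_csv(records):
    """Serialize the records as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = parse_entries(SAMPLE_TEXT)
csv_text = to_csv(records)
```

Entries that lack some fields simply get empty cells, and the `csv` module takes care of quoting values such as “Chicago, Ill.” that contain commas.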

From the start, I had envisioned the Who’s Who database as a teaching tool, as well as a research project. I imagined students using the entries as a starting place for biographical papers, so I needed a student-friendly interface. I had experimented fitfully having students write or edit Wikipedia entries in my classes, so it seemed natural to put the Who’s Who data in a wiki. A regular wiki is searchable, but doesn’t really have database functions. To get those, I used the Mediawiki extension bundle Semantic Mediawiki. The semantic wiki allows you to define data fields and relationships, import data, search across data fields, and enable students or other users (if you wish) to edit the data through forms.

I also loaded the data into a Google Fusion Table, which allows you to quickly make maps from any geographic data (e.g., birthplaces). Fusion Tables is easy, but limited in terms of customizing. My students used the filtering and mapping functions to produce in-class reports on the demographics of various organizations represented in the directory. Semantic Mediawiki is much more flexible. But for the non-expert it was one of those “learning experiences.” Many late nights, crashes, and frustrations before ultimate success. In the future I hope to use it in my labor history classes to train students how to use a wiki before I set them off on the actual Wikipedia.

What remains to be done is the “Other” field (education, organizations, publications): lots of good stuff. I’m currently working with folks at the Center for Digital Humanities, and hope to have that done by late winter. In the meantime, I’m doing some analysis of subsets of the Who’s Who, particularly the organizational networks. And that presents me with my next “learning experience,” Gephi.