Situations and Relations

Back in February, I gave a talk to the UCLA Digital Labor Working Group about my network analysis with the Labor Who’s Who data. You can see my slides here.

Patterns within a welter of information. Red dots represent people, light green dots represent organizations.
I opened with the idea that “the labor movement” is an abstraction–a place-holder phrase that means different things at different times. The American Labor Who’s Who was a particular version of that abstraction, created at a particularly contentious moment in labor history. It was compiled by a team led by Solon De Leon (son of a famous radical polemicist), and published by the Socialist Party-aligned Rand School of Social Science. It describes a labor movement that encompasses not only trade unions, but also radical political movements, immigrant organizations, researchers, journalists, and what we would call “NGOs” today. My analysis, drawn from data extracted from the Who’s Who, is an abstraction of an abstraction.

It’s worth beginning with this caveat because computation and data visualization have an aura of legitimacy these days. These network charts (created in Gephi) are representations of reality, not reality itself. They are best used as models of plausible past realities, tools for thinking through problems of historical argument, rather than as illustrations per se.

I began with the broadest and busiest view of the data: all the people in the Who’s Who and organizations they belonged to (slide 1). The mathematical model that creates this chart draws more connected elements, or “nodes,” closer to the center and pushes less connected elements to the edges. A node’s size depends on how connected it is to other nodes, and lines connect people to the organizations they belong to. In these charts, the lines, or edges, have direction. People belong to organizations, so the edges radiate from each person to their corresponding organizations.
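To make "how connected" concrete, here is a minimal sketch that counts edges per node in a directed person-to-organization graph. It assumes node size is driven by a simple edge count (degree); Gephi offers several ranking measures, and the post does not say which one was used. The names in `memberships` are placeholders, not entries from the Who’s Who.

```python
from collections import Counter

# Hypothetical (person, organization) pairs; placeholder names only.
memberships = [
    ("Person A", "Org X"),
    ("Person A", "Org Y"),
    ("Person B", "Org X"),
    ("Person C", "Org X"),
    ("Person C", "Org Z"),
]

def degree_counts(edges):
    """Count outgoing edges per person and incoming edges per organization."""
    out_degree = Counter()
    in_degree = Counter()
    for person, org in edges:
        out_degree[person] += 1   # edge leaves the person...
        in_degree[org] += 1       # ...and arrives at the organization
    return out_degree, in_degree

out_degree, in_degree = degree_counts(memberships)

# Organizations ranked by how many people point at them.
for org, count in in_degree.most_common():
    print(org, count)
```

Under this assumption, an organization with many members gets a large dot, while a person's dot grows with the number of organizations they list.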

In broad strokes, the first graph presents a ring of organizations roughly the same size, three organizations that are noticeably larger on the inside edge of the ring, and several groupings of people inside the ring. Without knowing the names of the people or the organizations, it appears that three or four organizations dominate the institutional field of the labor movement. There is also a lot of “noise.”

The right-wing constellation of organizations included the AFL, fraternal societies, and the Democratic and Republican parties.
The left-wing constellation of organizations centered on the Socialist Party.
The next two slides try to filter out some of that noise by focusing on the “right” and “left” flanks of this social formation (think of it as “stage right”). The American Federation of Labor (AFL) and the Masons dominate the right side of the field (slide 2), surrounded by other fraternal organizations (Elks, Odd Fellows, Moose, etc.), mainstream political parties, and four trade unions–the Printers (ITU), Machinists (IAM), Miners (UMWA), and Carpenters (UBC). On the left (slide 3), the Socialist Party dominates, and is surrounded by independent unions (two garment worker unions and the IWW), left-wing parties and para-party organizations (Communist and Workers parties, the Trade Union Educational League, left-wing youth organizations, and the Workmen’s Circle). Worth noting: the spatial position of a node has no relationship to its place on the left/right political spectrum. The Women’s Trade Union League and the American Federation of Teachers, for instance, are farther away from the SP than the Workers’ Party. (In future I should probably reorient these vertically!)

Henry Ohl was a leading figure in the Wisconsin labor movement, and the Univ. of Wisconsin School for Workers.
Next come two slides that focus on two individuals who show up near the center of the graph, and represent mediating figures between the AFL and SP-oriented flanks of the movement. Henry Ohl, Jr. (slide 4) was a Milwaukee Socialist and a printer who championed the University of Wisconsin’s School for Workers.

Socialist editor Max Hayes was unusually well connected to the key organizations of the 1920s labor movement.
Max Hayes (slide 5) was a Cleveland Socialist–another printer–and the editor of the Cleveland Citizen. Both men started working in their early teens, apprenticed as printers, and were deeply involved in Socialist politics. Compare these two men with William Z. Foster (slide 6). He also linked the AFL and the SP, but by 1925 was publicly associated with the Workers’ Party and is placed farther on the periphery of the graph. Similarly, women union activists sit on the periphery of these network graphs, as do a number of labor intellectuals.

Whether Foster (or Pauline Newman or A. Philip Randolph) was less “central” to the labor movement of 1925 than Ohl or Hayes is not really what the graph explains. Centrality in this model is not the same as “importance.” Ohl and Hayes are more “central” because they were members of fraternal associations, and their membership creates a relationship in this model that draws them closer to the many non-Socialist men who were likewise part of the world of the Masons, Odd Fellows, Elks, and Moose.
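The effect of a single extra membership can be illustrated with a minimal sketch that counts how many other people each person shares at least one organization with. This is a simple co-membership count, not the layout algorithm Gephi uses, and every name below is a placeholder rather than data from the Who’s Who.

```python
from collections import defaultdict

# Hypothetical memberships. Person A and Person B share the same union and
# party; Person A also belongs to a fraternal lodge.
memberships = [
    ("Person A", "Union 1"),
    ("Person A", "Party 1"),
    ("Person A", "Lodge 1"),
    ("Person B", "Union 1"),
    ("Person B", "Party 1"),
    ("Person C", "Lodge 1"),
    ("Person D", "Lodge 1"),
    ("Person E", "Party 1"),
]

def co_members(edges):
    """Map each person to the set of people who share an organization with them."""
    members = defaultdict(set)
    for person, org in edges:
        members[org].add(person)
    neighbors = defaultdict(set)
    for people in members.values():
        for person in people:
            neighbors[person] |= people - {person}
    return neighbors

neighbors = co_members(memberships)
reach = {person: len(others) for person, others in neighbors.items()}
print(reach["Person A"], reach["Person B"])
```

In this toy data the lodge membership alone doubles Person A's count relative to Person B, even though their union and party ties are identical. That is the sense in which the model rewards fraternal membership rather than measuring "importance."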

Communist leader William Z. Foster's network profile.
Unfortunately, we can’t see how this chart would change by 1940 when new leaders and organizations were in the field, and some of those on the periphery in 1925 moved to the center (e.g., Sidney Hillman). But the lack of chronology also helps us see the way careers in the labor movement spanned multiple institutions (e.g., Max Hayes in the Peoples Party and the SP).

Labor and radical history is often told one organization at a time, one city at a time, one campaign at a time. Of course we use the singular focus as a way to get at broader themes. When I researched my first book, I began with IWW harvest workers, and that opened out onto a whole constellation of social forces, places, and people. Network graphs, for all their complications and limitations, turn our eyes first to the relatedness that structures a social field. The “labor movement” of the 1920s was a particularly contentious place where splits between one wing or the other severed ties between erstwhile comrades. But groups and individuals in contentious relationships are still in relationships. A labor movement divided and fighting was still a movement to overturn the worst abuses of capitalism.

An insight I’ve gained from my research on workers’ education in between the world wars is that organizational schisms were not always the end of the story. Quite often they produced more talk, more action, and more learning. “There is no one road to freedom,” said the author of a popular workers’ education pamphlet, “There are roads to freedom.”


Note: I know the charts mix up colors and orientations. Extracting good charts from Gephi is one of the big challenges of this project, and I’m working on some other–also imperfect–ways to share the visualizations in more active form.

The Networked Labor Movement

This is the first in a series of posts I expect to write to help me think through the use of network analysis and visualization.

When I started converting the printed American Labor Who’s Who to an electronic database, I knew the data would be a handy reference tool for students. But I also hoped to use the data for my own research, and that it might even be instructive for contemporary activists. In particular, I figured the directory of labor and radical leaders might help us see the interconnections between organizations and people that make up the thing we call “the labor movement,” and the fact that the movement was broader than “trade unionism” alone.

Why does that matter? Well, if we consider that union membership is currently below 10% of the private sector workforce, things seem pretty hopeless for Labor. How can a social group as defensive and marginal as that ever hope to assert real power again? But if we think of the unions as part of a broader political and social grouping that also includes journalists, educators, activists and lawyers–then we have something much larger and broader. That’s important not just for politics today, but for the way we think about historical change. As a number of labor scholars have noted, the labor movement tends to grow in sudden, massive upsurges rather than by slow steady accretion. The question is, what enables these upsurges?

For much of the 1920s and 1930s, union density was low and employers had the upper hand. Unions and radicals were divided against each other. A lot of energy went into expelling dissidents and poaching members from other organizations. Old forms of unionism held on to authority, while newer forms remained inchoate or marginalized. But unionism and progressive/radical political activism held on and, in the late 1930s and 1940s, grew exponentially. Legal and macro-political changes had a lot to do with that upsurge–especially a new federal policy in favor of collective bargaining and the full employment context of World War II. But the massive and swift growth in union membership and power was also based on a network of local militants who carried out the organizing drives, produced labor newspapers and radio shows, and staffed the strike kitchens and community support networks that sustained activism.

So consider this chart, based on the index of the American Labor Who’s Who, which lists individuals by category (e.g., AFL affiliated, independent unions, miscellaneous), and by organization or subcategory (e.g., United Mine Workers or Journalists & Writers). Note: elsewhere, I’ve explained the limits of this source in terms of representativeness, and why it’s still worth using. This analysis is based on the roughly 1,300 U.S. entries.

A network chart based on the index of the American Labor Who's Who (1925). Blue dots represent major categories, red dots are organizations or subcategories, and green dots represent individuals.
I extracted the text of the index from the ePub version of the Who’s Who on the HathiTrust Digital Library, and converted it into a spreadsheet in Microsoft Excel. Using the Table 2 Net website, I converted a CSV-formatted version of the spreadsheet into a bipartite network table. Then I opened that table in Gephi–a free network analysis and visualization program–and created a chart with the Force Atlas algorithm.

In a network you have “nodes” and “edges.” This is a “bipartite” network, meaning there are two kinds of nodes: people and categories of organization/activity. The edges are the connections between the two types of nodes. This is a “directed” network, which means that the lines of connection (the edges) only flow in one way: individuals are members of organizations, subcategories, and categories of organizations.
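The conversion from a spreadsheet of index rows to a directed bipartite table can be sketched in a few lines. This is not what Table 2 Net does internally; it is a minimal stand-in that assumes a CSV with hypothetical `person`, `category`, and `organization` columns and placeholder values. The `Source`, `Target`, and `Type` headers follow the edge-table layout that Gephi's spreadsheet import is generally understood to accept, so check them against your version.

```python
import csv
import io

# Hypothetical index rows; column names and values are placeholders.
INDEX_CSV = """person,category,organization
Person A,Category 1,Org X
Person B,Category 1,Org X
Person C,Category 2,Org Y
"""

def index_to_tables(csv_text):
    """Return (nodes, edges): node id -> kind, and person -> grouping pairs."""
    nodes = {}
    edges = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        person = row["person"].strip()
        nodes[person] = "person"
        for column in ("organization", "category"):
            group = row[column].strip()
            if not group:
                continue  # skip rows with no value in this column
            nodes[group] = column
            edges.append((person, group))  # direction: person -> grouping
    return nodes, edges

def write_edges_csv(edges):
    """Serialize the edges as a directed edge table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type"])
    for source, target in edges:
        writer.writerow([source, target, "Directed"])
    return buffer.getvalue()

nodes, edges = index_to_tables(INDEX_CSV)
edges_csv = write_edges_csv(edges)
print(edges_csv)
```

Every edge starts at a person and ends at an organization or category, which is what makes the table both bipartite and directed.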

The chart orients around two poles of about equal size: American Federation of Labor (AFL)-affiliated bodies and everyone else (including journalists, independent unions, and political parties among others). Depending on your mood you could read this as affirming the AFL as the dominant player in this social field, or as suggesting the diversity of and balance of players. Or you might suggest there was some level of tension and conflict between the two poles. It’s useful to remember that this chart is an analytical tool, not necessarily a direct representation of reality–and there are layers of “bias” baked into the data from its origins.

This chart is designed to accentuate the separation of the groups for analytical purposes. It doesn’t show the edges (connections between and among people and organizations), only the relative groupings. I’ll get into the linkages between groups in subsequent posts. In particular, I’m interested in the group of green dots that sits between the AFL and Miscellaneous poles. This turns out to be made up of editors of major union and labor federation newspapers. They were a key group that linked unions to the broader working-class public sphere in large part because they formed bridges between unions and other social sectors–something that seems to be represented here in the chart.

Old Book, New Data

Labor Who's Who title page

(Originally posted on

Over the past year or so I’ve been working on a digital history project that aims to convert a 1925 American Labor Who’s Who into a research and teaching database and wiki. It continues to be “a learning experience,” as my mother used to call all the unpleasant encounters of childhood. Not all bad, to be sure, but not all good. Since I have versions of the data up on the internet, I thought I should post some reflections.

Labor historian Jon Beck from the Michigan State Industrial Relations program started my thinking about the Labor Who’s Who around 2007 or so when he suggested it might be useful for my project on working class autodidacts. The Rand School of Social Science sponsored the compilation of the Who’s Who in 1925 under the direction of Solon De Leon (son of famed radical Daniel De Leon). De Leon and his colleagues threw open the front door to the House of Labor, so to speak, including in the roughly 1,300 entries for the U.S. activists in the fields of immigrant rights, civil liberties, cooperatives, progressive and radical politics, as well as the to-be-expected trade unionists (there are 300 additional non-US activists–a few of these were deported or self-exiled US activists).

Nineteen twenty-five was a curious moment for the American labor movement. The industrial union upsurge of the 1910s was sputtering under the weight of repression, factionalism, and failure. The powerful unions of the CIO were a decade or more in the future. Meanwhile, conservatives held a tight, if a bit desperate, grip on the political machinery of trade unionism at the national level, antiunion Republicans were in the White House, and reactionary groups like the KKK and American Legion were popular. And yet, there was a great deal of activity and organizational creativity in some unions, and there was a blossoming network of labor colleges training the leaders of the ’30s.

The Labor Who’s Who is a snapshot of this contingent moment and some of the people who lived it. Each entry is a telegraphic biography. Some provide only name, professional title and address at the time of publication. But many sketch rich life histories. Nearly all provide details on birth date and place, family background, education, migration, and work histories, as well as key organizations, events and publications. It includes both long-serving elders whose careers stretched back to the 1870s, and emerging leaders who would continue to be active into the second half of the 20th century.

For years I had a library copy of the book on my office shelf, thinking I would get to the project eventually. Then in 2012 I discovered the book had been scanned by Google and was sitting behind the access wall in the HathiTrust (HT) digital collection. You could search keywords, but the search only returned a few words and a page number. From my key word searches, I knew that about 40 individuals identified themselves as “self-educated,” but learning more about the educational and organization matrix represented in the directory was just beyond reach. Hoping to avoid the wrath of Disney and other commercial publishers, HT takes a defensive approach to copyright. Most things published after the easy cut off for public domain (before 1923) go behind the access wall.

Very frustrating. And ironic. Here was a book published by a radical college, locked behind a copyright wall at the behest of capitalist media corporations. Not that these corporations give a hoot about the Labor Who’s Who, it’s just structural. Everything after 1922 goes behind the wall unless someone specifically requests it be freed.

Thus was born what I’m now calling the “HathiTrust Liberation Project.” Hundreds and hundreds of labor and leftist volumes published between 1923 and 1963 are in the public domain unless their copyright holders renewed the copyright (there is an online database to check for renewed copyrights). Unlike literary works, mundane works of non-fiction and social movement publications are usually not renewed. Many of these volumes are already digitized, but are blocked. Likewise, a surprising number of post-1923 government documents are behind the access wall.

The Labor Who’s Who was my first foray into old book liberation. Through the good graces of the UCLA Library, I was able to convince HT that the copyright on the Labor Who’s Who probably wasn’t renewed, and in any case the socialists won’t kick if you open it up. Somebody flipped a switch and the volume appeared. This was in the spring or summer of 2012.

The next task was extracting and cleaning OCR’d text. This turned out to be a little more complicated than I expected. In the end, I downloaded an EPUB version of the Who’s Who, and copy-and-pasted the text into a separate file. So far, so good. But this was a long way from a database. With the help of UCLA librarian Zoe Borovsky and Miriam Posner of the Center for Digital Humanities, I broke the text up into discrete entries and, eventually, data fields. However, there were many, many text recognition errors. I probably could have hired someone to do it (if I had the money), but in the end I did most of the corrections myself. Let’s just say I became intimately familiar with the contents of the book. And isn’t that the traditional activity of scholarly humanists after all, even if this mode of familiarity generally is not recognized as such by personnel committees?

So by the late fall of 2012, I had a relatively clean text file with entries broken into fields: name, titles, birthplace, birth date, father’s occupation, and a residual field that was too irregular to easily parse that included things like education, organizations, activities, publications, home and work address. Next came the task of reorganizing this information from a flow of text into a spreadsheet, rather tediously done by cutting and pasting in Microsoft Excel.

From the start, I had envisioned the Who’s Who database as a teaching tool, as well as a research project. I imagined students using the entries as a starting place for biographical papers, so I needed a student-friendly interface. I had experimented fitfully with having students write or edit Wikipedia entries in my classes, so it seemed natural to put the Who’s Who data in a wiki. A regular wiki is searchable, but doesn’t really have database functions. To get those, I used the MediaWiki extension bundle Semantic MediaWiki. The semantic wiki allows you to define data fields and relationships, import data, search across data fields, and enable students or other users (if you wish) to edit the data through forms.

I also loaded the data into a Google Fusion Table, which allows you to quickly make maps from any geographic data (e.g., birthplaces). Fusion Tables is easy, but limited in terms of customizing. My students used the filtering and mapping functions to produce in-class reports on the demographics of various organizations represented in the directory. Semantic MediaWiki is much more flexible. But for the non-expert it was one of those “learning experiences.” Many late nights, crashes, and frustrations before ultimate success. In the future I hope to use it in my labor history classes to train students how to use a wiki before I set them off on the actual Wikipedia.

What remains to be done is the “Other” field–education, organizations, publications–lots of good stuff. I’m currently working with folks at the Center for Digital Humanities, and hope to have that done by late winter. In the meanwhile, I’m doing some analysis of subsets of the Who’s Who, particularly the organizational networks. And that presents me with my next “learning experience,” Gephi.