The Old Bailey Topic Explorer

The Old Bailey Topic Explorer is a new, machine-augmented way to explore centuries in the lives of the non-elite, and it is now online at http://inphodata.cogs.indiana.edu/oldbailey. The visualization tools that let you navigate this complex data were first presented in Murdock & Allen (2015).

You can search for a particular trial, or just explore. Hover over the icons to the left of the trial bars to find links to the original text. Hover over the little squares to the right of the trial bars to see the different topics; click one to reorder the trials based on that topic. Choosing a new topic to focus on and clicking “Top Documents” can be a nice way to explore the data. Enjoy!

Introduction & Sources

The jury trial is a critical point where the state and its citizens come together to define the limits of acceptable behavior. The crimes people were tried for, and the ways in which those trials were conducted, tell many stories.

On the largest scales, we can see how, over the centuries, a state comes to manage its monopoly on violence and how people learn new ways of seeing the world (Klingenstein et al., 2014). Conversely, trial by trial, we can read for the weirdness, the variety, and the contexts that string together these innumerable moments in the vast sweep of history, and that can reveal the hidden experiences of a human life (Hitchcock, 2014).

The goal of this project is to provide a new entry into this world, for minds of both types. Through the Topic Explorer Project, maintained by Jaimie Murdock, we can range across the centuries of this remarkable data, finding patterns and themes. We can also, at will, dive into the unique and unrepeatable worlds of the hundred thousand or so people who passed through London’s Old Bailey between 1778 and 1913.

We have partial access to these worlds because the trials were published, in more or less detail, in a publication popularly known as The Old Bailey Proceedings. This series has since been digitized as the Old Bailey Online (OBO; Hitchcock et al., 2012).

The early periods of this corpus are not used here; they are marked by gaps in the archive, uneven and frequently truncated reporting, and periods of inconsistent censorship (see Refs. 15–19 in Klingenstein et al., 2014, hereafter KHD). In the analysis presented here, we include data back to 1732, but records become more reliable after 1778, when the City of London required that the transcripts provide a “true, fair and perfect narrative” (Ref. 17 in KHD, p. 468). As well as the words spoken during the trial, as put down by a short-hand reporter, the set includes extensive metadata, including charge, verdict, and punishment. Trials range from simple theft to complex crimes such as forgery and libel, and to the extremes of human behavior: manslaughter and murder, rape and infanticide.

A separate layer of mark-up, drawn from “The Old Bailey Corpus” (OBC; Huber et al., 2012), has been used to distinguish statements that purport to be transcriptions of spoken testimony, separating out these records from the surrounding text. Excepting a short period of censorship focused on acquittals in the early 1790s, and a longer-term practice of censoring graphic details from cases involving sexual offenses, the tagged speech is believed to provide a verbatim record of nearly every word spoken in the courtroom (Ref. 20 in KHD; see also Huber, 2007). We include a total of 119,817 trials. The Topic Explorer navigates and structures its interpretations using basic concepts in information theory; see DeDeo (2014) for a short introduction targeted towards humanists.
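
As a rough illustration of the kind of information-theoretic comparison mentioned above, the sketch below ranks trials by how closely their topic mixtures match a chosen trial. It assumes the Jensen-Shannon divergence as the measure; the text here does not say which measure the Topic Explorer uses, and the trial names and topic weights are invented for the example.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    Terms where p_i == 0 contribute nothing to the sum.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence between two distributions, in bits.

    Symmetric, and bounded between 0 (identical) and 1 (no overlap).
    """
    midpoint = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, midpoint) + 0.5 * kl_divergence(q, midpoint)


def rank_by_similarity(focus, mixtures):
    """Return (name, divergence) pairs for every other trial, closest first."""
    reference = mixtures[focus]
    others = [
        (name, jensen_shannon(reference, mix))
        for name, mix in mixtures.items()
        if name != focus
    ]
    return sorted(others, key=lambda pair: pair[1])


# Hypothetical topic mixtures: each list gives one trial's weights over
# three made-up topics and sums to 1. These are not real Old Bailey data.
trials = {
    "trial_a": [0.70, 0.20, 0.10],
    "trial_b": [0.65, 0.25, 0.10],
    "trial_c": [0.05, 0.15, 0.80],
}

for name, divergence in rank_by_similarity("trial_a", trials):
    print(f"{name}: {divergence:.3f} bits")
```

A divergence near zero means two trials draw on the topics in nearly the same proportions, so in this toy data trial_b is listed ahead of trial_c.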

References

Murdock, J. and Allen, C. (2015). Visualization Techniques for Topic Model Checking [demo track]. In Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI-15). Austin, Texas, USA, January 25-29, 2015. http://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/10007
Hitchcock, T., Shoemaker, R., Emsley, C., Howard, S., McLaughlin, J., et al. (2012). The Old Bailey Proceedings Online, 1674-1913 (version 7.0, 24 March 2012).
Hitchcock, T. (2014). Big Data, Small Data and Meaning. Historyonics (blog).
Huber, M., Nissel, M., Maiwald, P., and Widlitzki, B. (2012). The Old Bailey Corpus. Accessed June 5, 2013.
Huber, M. (2007). The Old Bailey Proceedings, 1674-1834: Evaluating and annotating a corpus of 18th- and 19th-century spoken English. In Annotating Variation and Change (Studies in Variation, Contacts and Change in English 1), ed. by Anneli Meurman-Solin & Arja Nurmi. Helsinki: VARIENG.
Klingenstein, S., Hitchcock, T., & DeDeo, S. (2014). The civilizing process in London’s Old Bailey. Proceedings of the National Academy of Sciences, 111(26), 9419-9424.
DeDeo, S. (2015). Information Theory for Intelligent People.
DeDeo, S. (2014). When Theft Was Worse Than Murder. Nautilus Magazine, Issue 12 (Feedback).
An extensive bibliography of work related to the proceedings can be found at http://www.oldbaileyonline.org/static/ProceedingsBibliography.jsp