Using Topic-Modeling in Legal History, with an Application to Pre-Industrial English Case Law on Finance

PETER GRAJZL AND PETER MURRELL

The last few decades have seen the ever-increasing importance of quantitative empirical methods in historical studies in general, and in economic history in particular. However, these methods have made few inroads into pre-twentieth-century, and especially pre-industrial, legal history, despite the central place of law in the history of world economic development.1

Law and History Review May 2022, Vol. 40, No. 2 © The Author(s), 2022. Published by Cambridge University Press on behalf of the American Society for Legal History. This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited. doi:10.1017/S0738248022000153

Peter Grajzl is in the Department of Economics, The Williams School of Commerce, Economics, and Politics, Washington and Lee University, Lexington, Virginia and CESifo, Munich, Germany. Peter Murrell is in the Department of Economics, University of Maryland, College Park, Maryland. They thank two referees for their insightful comments, and especially the editor, Gautham Rao, for his encouragement and insights. They are also grateful to Jonathan Gong for research assistance and to participants at the AI4Law workshop in February 2021 at the University of Oxford for helpful comments.

1. S. Robertson, “Searching for Anglo-American Digital Legal History,” Law and History Review 34 (2016): 1047–69, noting that “as the fields of digital humanities and digital history have grown in scale and visibility since the 1990s, legal history has largely remained on the margins of those fields.” There are some important very recent examples for the nineteenth century, such as K. Funk and L.A.
Mullen, “The Spine of American Law: Digital Text Analysis and U.S. Legal Practice,” American Historical Review 123 (2018): 132–64. In recent years, a number of empirical articles have appeared that use data from the eighteenth century made available by the Old Bailey Proceedings project. See T. Hitchcock, R. Shoemaker, C. Emsley, S. Howard, and J. McLaughlin, “The Proceedings of the Old Bailey, 1674-1913” www.oldbaileyonline.org (accessed April 2021). Both of these works rely on the types of computational advances that we highlight in this article and that we feel will lead to a quiet revolution in legal historical studies. Existing, more traditional studies on the period before the nineteenth century usually contain very small samples or few variables, implying that there is a limited ability to apply the types of empirical methods that are now commonplace in economic history. E. Cavell, “The Measure of Her Actions: A Quantitative Assessment of Anglo-Jewish Women’s Litigation at the Exchequer of the Jews, 1219-81,” Law and History Review 39 (2021): 135–72 provides a recent example

No doubt the relative absence of such quantitative legal history is because the legal record is mostly in words, the processing of which requires computational power that is orders of magnitude beyond that needed for numbers. However, with huge increases in computer power in recent years and the associated development of desktop text-analyzing software, the menu of research methods and results available to all legal historians is now rapidly changing.
Text can be processed and analyzed as quickly and easily as numbers were two decades ago.2 Libraries of readily usable computational packages are available for the statistical analysis of texts. We now have the possibility of using the text of centuries ago as data.3

The objective of the present article is to convey to traditional legal historians the role that these new computational techniques can play in legal-historical research. We do so by presenting an example of the types of results that can be produced with these new tools.4 As we present the example, we outline the steps that must be taken in the computational-statistical process. But our presentation does not require readers to be conversant with the intricacies of such methods. We provide verbal, intuitive descriptions of the methods used and the tasks that must be accomplished.

of a very interesting exercise in early legal history that is, understandably, limited by a small sample with few variables. D. Klerman, “Settlement and the Decline of Private Prosecution in Thirteenth-Century England,” Law and History Review 19 (2001): 1–65 is notable in providing a very early example of pre-industrial legal history that is exceptional for the centrality of empirical methods in its contribution.

2. The general problem is usefully captured as “How do you write a national history that was the product of lawmaking in 50 separate jurisdictions?” as posited by E. Nystrom and D. Tanenhaus, “The Future of Digital Legal History: No Magic, No Silver Bullets,” American Journal of Legal History 56 (2016): 150–67. This problem is multiplied in case law where one is studying hundreds of years and thousands of cases. The methods we describe in this article almost completely remove the sample-size and limited-observations constraint referred to in the previous footnote.
Notably, the general project that includes the current article did not rely on any extramural funding, emphasizing that the techniques we describe are within the reach of all scholars.

3. See, for example, J. Grimmer and B. M. Stewart, “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts,” Political Analysis 21 (2013): 267–97; M. Gentzkow, B. Kelly, and M. Taddy, “Text as Data,” Journal of Economic Literature 57 (2019): 535–74; and M.A. Livermore and D. N. Rockmore, eds., Law as Data: Computation, Text, and the Future of Legal Analysis (Santa Fe: SFI Press, 2019).

4. In legal history, two early examples of the use of the new sets of computational tools are provided by D. Tanenhaus and E. Nystrom, “Let’s Change the Law: Arkansas and the Puzzle of Juvenile Justice Reform in the 1990s,” Law and History Review 34 (2016): 957–97; and C. Romney, “Using Vector Space Models to Understand the Circulation of Habeas Corpus in Hawai’i, 1852–92,” Law and History Review 34 (2016): 999–1026. In contrast to the exercise reported in this article, these two examples do not use the computational methods to drive an empirical exercise but rather use these methods as search procedures to find those legal materials on which a more traditional analysis should be focused.

Our view is that what we offer in this article will be instructive for scholars currently using traditional legal-historical approaches. If the past two decades of methodological developments in the humanities, the social sciences, and law teach us anything, it is that the new computational methods will more or less inevitably become a part of the toolkit of legal history. Importantly, we do not argue that the new computational approach will replace existing methods. In fact, we do the opposite.
As we present our example, we detail many instances in which our use of the existing work of traditional legal historians has played an absolutely vital role in our ability to produce any novel insights from our application of the new tools. Thus, through the use of an example, we hope to show how traditional and computational legal history can complement each other as the field of legal history moves into the new age in which the use of computational methods will become standard. In doing so, we are also able to pinpoint where each of the traditional and computational approaches to legal history has its comparative advantage.

To make this article more accessible to those unfamiliar with any of these new methods, we focus on only one, topic-modeling, which indeed is one of the most popular machine-learning techniques that has been applied in history, law, and social science.5 Concentrating on one method allows us to focus on the essential characteristics of machine-learning and to discuss them in intuitive, non-technical ways, addressing our exposition not to

5. On the popularity of topic-modeling, see Gentzkow et al., “Text as Data” and J. Guldi and B. Williams, “Synthesis and Large-Scale Textual Corpora: A Nested Topic Model of Britain’s Debates over Landed Property in the Nineteenth Century,” Current Research in Digital History 1 (2018), https://doi.org/10.31835/crdh.2018.01 (accessed April 2021). Several other machine-learning and related computational approaches have been utilized to investigate law-as-data. Machine-learning methods have been used, for example, to predict court outcomes; see, for example, D.M. Katz, M.J. Bommarito, and J. Blackman, “A General Approach for Predicting the Behavior of the Supreme Court of the United States,” PLoS One 12 (2017): e0174698.
Word and document embedding models represent words and documents as numerical scores for a long list of variables, thereby helping to quantify the meaning of words and documents on the basis of their proximity to other words and documents in the corpus; see, for example, E. Ash and D.L. Chen, “Case Vectors: Spatial Representations of the Law Using Document Embeddings,” in Law as Data, ed. M.A. Livermore and D.N. Rockmore (Santa Fe: SFI Press, 2019), 313–37. Embedding approaches have been employed, for example, to investigate the presence of racial bias in judicial opinions; see, for example, D. Rice, J.H. Rhodes, and T. Nteta, “Racial Bias in Legal Language,” Research & Politics April-June (2019), 1–7. For an overview of the use of machine-learning and computational methods in the emerging research field of computational analysis of law-as-data, see J. Frankenreiter and M.A. Livermore, “Computational Methods in Legal Analysis,” Annual Review of Law and Social Science 16 (2020): 39–57. For innovative applications of computational methods to legal-historical themes, but not focusing on English case law, see, for example, S. Klingenstein, T. Hitchcock, and S. DeDeo, “The Civilizing Process in London’s Old Bailey,” Proceedings of the National Academy of Sciences 111 (2014): 9419–24 and Funk and Mullen, “The Spine of American Law”.

those who want to learn the details of the computational analysis but rather to those who want to understand the types of insights that computational-text-analysis can bring to substantive domain-specific research. As Grimmer, Roberts, and Stewart argue, “machine learning is as much a culture defined by a distinct set of values and tools as it is a set of algorithms.”6

This article is written by economists.
One additional impetus underlying the writing of this article followed from our observation that the mainstream economics literature has tended to ignore the insights of the historians of case law, while traditional case law historians hardly refer to the methods and findings of economists.7 Perhaps this is because economists are more moved by quantitative evidence, which is not easily found in the history of case law. This article is an attempt to straddle the two fields, to show that there can be strong complementarities between them.

To illustrate the power of topic-modeling for legal history, we provide new quantitative information on developments in English case law and legal ideas from the mid-sixteenth century to the mid-eighteenth century. Thus, central to our approach in this article is showing the usefulness of the computational methods by providing an example of their application to ongoing debates in legal history. In contrast to much existing work in digital history, we do not argue for the productiveness of the computational methods by focusing on the methods themselves. Rather, we endeavor to make the case by providing an example of the contribution of the methods to an understanding of the past that is directly relevant to the disciplines of legal history and economics.8

This era of English law has been of particular interest to both legal historians and economists, for related reasons: for the former because much law

6. J. Grimmer, M. E. Roberts, and B. Stewart, “Machine Learning for Social Science: An Agnostic Approach”, Annual Review of Political Science 24 (2021): 395–419.

7. R. Harris, “The Encounters of Economic History and Legal History,” Law and History Review 21 (2003): 297–346 identified this separation of these fields, and his conclusions still seem to apply today.

8. On these points more generally, see S. Robertson and L.
Mullen, “Arguing with Digital History: Patterns of Historical Interpretation,” Journal of Social History 54 (2021): 1005–22, who argue that “Digital history has only rarely contributed interpretative or argumentative scholarship that contributes to disciplinary understandings of the past,” largely because of its focus on the methodological. Beyond the example appearing here, applications of the methods outlined in this article and of the data set discussed here are provided in several additional articles that contribute to disciplinary understandings of the past: See P. Grajzl and P. Murrell, “A Machine-Learning History of English Caselaw and Legal Ideas Prior to the Industrial Revolution II: Applications,” Journal of Institutional Economics 17 (2021): 201–16; P. Grajzl and P. Murrell, “A Macrohistory of Legal Evolution and Coevolution: Property, Procedure, and Contract in Pre-Industrial English Caselaw” https://dx.doi.org/10.2139/ssrn.4005612; and P. Grajzl and P. Murrell “Of Families and Inheritance: Law and Development in Pre-Industrial England” https://dx.doi.org/10.2139/ssrn.3975015

relevant to the modern era was created then; for the latter because of the possible connection between legal developments and the rise of Britain as the first industrial power. In particular, the progress of the financial sector in the decades preceding the Industrial Revolution has received much attention in economics. However, the work of economists on pre-industrial finance has placed little emphasis on case law, which for many is the defining characteristic of the English legal family.
We show how topic-modeling can use the case law record to cast new light on the patterns and sources of finance-related legal developments in England from the middle of the sixteenth century to the Industrial Revolution. In doing so, we find invaluable the accumulated insights of legal historians, echoing the views of users of topic models in other fields who emphasize how the traditional “close” reading of texts must be used alongside the “distant” reading provided by machine-learning.9 The outputs generated on the basis of the new methods are the complements of traditional legal-historical research. The results from topic-modeling are not replacements for the detailed, and immensely valuable, contextual analysis of traditional legal historians, but instead simply offer a different sort of lens for studying legal-historical phenomena.

The focus is on the use of the quantitative output produced by one existing topic-modeling exercise, that of Grajzl and Murrell, henceforth referred to as GM.10,11 By building on an existing implementation of topic-

9. The useful distinction between close and distant reading arose among scholars of literature in what has become known as the “digital humanities,” in which debates about the usefulness of computational methods, particularly topic-modeling, were both early and very spirited. For the digital humanities, see, for example, A. Goldstone and T. Underwood, “The Quiet Transformations of Literary Studies: What Thirteen Thousand Scholars Could Tell Us,” New Literary History 5 (2014): 359–84. For history, see the very early study by S. Block “Doing More with Digitization: An Introduction to Topic Modeling of Early American Sources,” Commonplace 6 (2006), http://commonplace.online/article/doing-more-with-digitization/, accessed April 2006 and more recently J.
Guldi, “Critical Search: A Procedure for Guided Reading in Large-Scale Textual Corpora,” Journal of Cultural Analytics 3 (2018), https://doi.org/10.22148/16.030, (accessed April 2021). For the same emphasis in political science, see Grimmer and Stewart, “Text as Data” and, in a joint product of a sociologist and two computer scientists, P. DiMaggio, M. Nag, D. Blei, “Exploiting Affinities Between Topic Modeling and the Sociological Perspective on Culture: Application to Newspaper Coverage of U.S. Government Arts Funding,” Poetics 41 (2013): 570–606. For legal history, see Robertson, “Searching”.

10. See P. Grajzl and P. Murrell, “A Machine-Learning History of English Caselaw and Legal Ideas Prior to the Industrial Revolution I: Generating and Interpreting the Estimates,” Journal of Institutional Economics 17 (2021): 1–19; and P. Grajzl and P. Murrell, “A Machine-Learning History of English Caselaw and Legal Ideas Prior to the Industrial Revolution II: Applications,” Journal of Institutional Economics 17 (2021): 201–16.

11. The work on English case reports is part of a much larger project on using computational and statistical techniques to understand English history. The earliest products of this project combined legal history and intellectual history, with two articles addressing

modeling, this article can omit descriptions of the technical nuances and the details of data construction, making it accessible to a wider audience.
We do, however, provide an intuitive description of the methods used to generate the raw quantitative output of the topic model, a description that is intended to be accessible to those not versed in the details of computational-statistical modeling. We then use that intuitive description of topic modeling to describe its data outputs. Importantly, the outputs of a topic model are not the endpoint of such an exercise. Rather, they constitute data that can be productively employed as an input into subsequent analyses. Thus, we turn to examples of the substantive insights that can be generated from the data set produced by the machine learning. We highlight the types of information that can be generated and made readily available to other scholars. That information can be easily used by those who have no intention of implementing the methods themselves but rather are interested in the types of substantive results that can be generated by the data that are the output of a topic model.

Section I presents the informal overview of topic-modeling. It begins with a brief history of how this tool has been used in the humanities, law, and the social sciences, showing that the particular exercise that this article presents is a natural outgrowth of two decades of development and application of topic-modeling. This short history argues that topic-modeling should not be regarded as immediately alien to legal history in view of the fact that it has been applied in fields whose objects of study share many features with the history of the law. Then, we proceed with a non-technical discussion of the assumptions, methods, and outputs of topic-modeling. This informal overview has the advantage that it lays bare the types of assumptions about texts that machine-learning uses, so that the weaknesses of the new approaches can be clearly seen.
The raw data used for the topic model discussed here are virtually all reports on cases heard before 1765 that appear in the English Reports, a corpus comprising 52,949 reports.12 Topic-modeling produces parsimonious summaries of this enormous amount of text information, which

understanding more general sets of ideas, focused on Francis Bacon and Edward Coke. See P. Grajzl and P. Murrell, “Toward Understanding 17th Century English Culture: A Structural Topic Model of Francis Bacon’s Ideas,” Journal of Comparative Economics 47 (2019): 111–35; and P. Grajzl and P. Murrell, “Characterizing a Legal-Intellectual Culture: Bacon, Coke, and Seventeenth-Century England,” Cliometrica 15 (2021): 43–88.

12. The digitized copies of the English Reports were purchased from a publishing company domiciled in South Africa. It is beyond the scope of this article to provide the many details of the initial processing of these digital copies, and the cleaning of them. Suffice it

comprises 31,057,596 words. Because topic-modeling is an unsupervised machine-learning technique, the shape of the summaries themselves is not produced in order to answer a particular question or to test a particular hypothesis. Rather, the text-data themselves shape their own synopsis. The summaries are in the form of 100 “topics,” as if the computational methods had produced a new digest of English law, divided into 100 sections. This is the “dimensionality reduction” aspect of machine-learning, producing an organized summary of an enormous amount of text that no human being could possibly hope to read (or at least retain and organize in memory).13 In this respect, topic-modeling dovetails with one of the central concerns of historians: to provide compelling narratives.
The computer is essential to the production of the narrative because so much information is captured and condensed. This is especially the case when the attempt is to capture ebbs and flows over centuries: Guldi and Armitage emphasize the potential in big data to return historical studies to the longue durée.14

In the case of topic-modeling, the computer output itself is only the beginning, and much interpretation is needed. The sections of the digest come without names; one just knows which case reports feature a particular digest section most prominently and which vocabulary that section most favors. The detailed work of legal historians over the centuries then provides the background for analysis of this information, enabling the researcher to understand which areas of law a particular section of the digest contains, thereby driving the crucial step of topic naming. Close

to say that a very large proportion of GM’s labor time devoted to the pertinent research projects was consumed in all of these tasks. For more details, see GM and M. Schmidt, “Institutional Persistence and Change in England’s Common Law: 1700-1865” (PhD diss., University of Maryland, 2015). Because a central objective of GM was to include as many reports as possible, which necessarily implied computational processing of all reports, there was a need to exclude a small percentage of reports with too many words that did not have a counterpart in either modern English or standard Latin. Chiefly, this had the effect of excluding reports in Law French. There is no doubt that this is a blemish on the application of the computational methods. Initially, there were 60,249 pre-1765 reports in the data set, but 6,917 were dropped because they were in Law French and a further 383 were removed because they contained too many unrecognizable words. This left 52,949 reports.

13. The summary does not rely on any existing classifications: we return to this point in the conclusion.

14. See J.
Guldi and D. Armitage, The History Manifesto (Cambridge: Cambridge University Press, 2014), emphasizing that “Over the last decade, the emergence of the digital humanities as a field has meant that a range of tools are within the grasp of anyone, scholar or citizen, who wants to try their hand at making sense of long stretches of time. Topic modeling software can machine read through millions of government or scientific reports and give back some basic facts about how our interest in ideas have changed over decades and centuries.”

reading undertaken in the context of traditional legal history is essential to interpret the output of the computer’s distant reading. Importantly, the naming of topics can be done by any researcher who has obtained the data produced by the topic model: it is undertaken quite separately from the computational analysis. This possibility for the sharing of the generated data is one of the most important contributions that the new methods can offer to legal history and legal-historical research: the results that are the output of the topic-modeling exercise can be made available and used as inputs by all researchers.

Once the underlying nature of each topic is understood, the researcher can then proceed to analyze the vast amount of quantitative information produced by the computational methods. This article provides an example of how such information can be used: its input data are the output data of GM and we use those data to present new results and provide insights that any readers could have produced had they availed themselves of the same data. Each of Sections III, IV, and V is built around just one evocative figure intended to provide an interpretation of the development of case law and legal ideas relevant to finance in pre-industrial England.
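For readers who find a concrete illustration helpful, the two pieces of information that feed topic naming (the vocabulary a topic most favors and the reports in which it is most prominent) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not GM's implementation: the report identifiers, words, and numbers are all invented, and the names `report_shares`, `topic_word_weights`, `top_words`, and `top_reports` are hypothetical.

```python
# Toy stand-in for the output of a topic model. Every identifier, word, and
# number below is invented for illustration; none comes from GM's data.

# Share of each report devoted to each of three unnamed topics.
report_shares = {
    "report_a": [0.70, 0.20, 0.10],
    "report_b": [0.55, 0.05, 0.40],
    "report_c": [0.10, 0.75, 0.15],
    "report_d": [0.05, 0.60, 0.35],
}

# Weight that each unnamed topic places on a few vocabulary items.
topic_word_weights = [
    {"assumpsit": 0.09, "promise": 0.08, "consideration": 0.05, "heir": 0.01},
    {"bond": 0.07, "usury": 0.06, "void": 0.04, "promise": 0.02},
    {"trust": 0.08, "estate": 0.06, "heir": 0.05, "bond": 0.01},
]


def top_words(topic, n=3):
    """Return the n words that the given topic weights most heavily."""
    weights = topic_word_weights[topic]
    return sorted(weights, key=weights.get, reverse=True)[:n]


def top_reports(topic, n=2):
    """Return the n reports in which the given topic has the largest share."""
    ranked = sorted(
        report_shares, key=lambda r: report_shares[r][topic], reverse=True
    )
    return ranked[:n]


# For each unnamed topic, list the material a researcher would consult
# before proposing a name.
for topic in range(len(topic_word_weights)):
    print(topic, top_words(topic), top_reports(topic))
```

In this invented example, a researcher who saw the leading words of the first topic and then read the two reports in which it is most prominent might, drawing on the existing legal-historical literature, propose a name such as "assumpsit." The computation supplies only the lists; the name comes from close reading.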
The origins of our interest in these developments lie in our background as economists. In Section II, we review the debates that have made the history of English law on finance important in that discipline, and explain how the combination of machine-learning methods and the prior insights of legal historians offers new information pertinent to those debates. We ask and answer the following questions. Which time periods evidence the most intense development of that area of law and legal ideas? Which pre-existing elements of law, such as property or contract, were most important as inputs into this development, and when? What were the relative roles of common law and equity in spurring these developments?

Section III introduces the fifteen of the 100 GM-estimated topics that are most relevant to finance, the most salient sections of the machine-produced digest. These fifteen topics were identified by the authors on the basis of topic content, and therefore the overall category of finance is not an entity produced by the topic modeling itself. This is just one of the many examples we provide in this article of the fact that the modeling of the topics themselves is not the endpoint of the analysis, but rather provides the data that the researcher uses, in combination with existing information and judgment, to proceed to real substance.

The periods of the most intense development of the relevant case law become evident by examining timelines that show when these fifteen topics are most prevalent in the English Reports. For example, the timelines show what will be very familiar to legal historians: that ideas on assumpsit developed in the early seventeenth century.
But the timelines can add to these insights by demonstrating that attention to assumpsit peaked around 1630, while the development of ideas on the validity of contracts, for example, was largely a product of the 1690s and later. Cumulatively, our timelines of the finance-related areas of law suggest that the seventeenth century witnessed many advances in case law that became relevant to eighteenth-century finance.

Section IV considers connections among the developments that take place in differing areas of law. Any case report usually incorporates ideas from varied legal domains even if one specific issue is central to the case: a single case is indexed within many sections of the digest. The topic-modeling produces data on the proportion of each of the 100 estimated topics that is present in each of the 52,949 reports of cases. Thus, one can find, for example, whether a case that is very much centered on trusts tends to emphasize contract issues or property considerations. By examining such connections in general, one can make conclusions about the legal ideas in one domain that were relevant to, and possibly fed into, the legal ideas in another domain. In Section IV, we identify the links among the fifteen topics (the digest sections) identified with finance, as well as links between these fifteen topics and ones not classified within finance. We find, for example, that early-seventeenth-century developments in the case law of contracts had significant effects on later developments in case law relevant to finance.

Section V examines the relative importance of common law and equity in producing law relevant to finance. Although case reports are unambiguously assignable to courts and although specific legal notions were often the particular province of either common law or equity, each type of court absorbed ideas from the other.
For example, a case on trusts in Chancery (an equity court) could well use ideas on contracts developed in Common Pleas or King’s Bench (common-law courts). The development of ideas in a given legal domain can then be ultimately viewed as reflecting debates in both common law and equity. We examine the relative importance of law and equity for each topic related to finance. Interestingly, our evidence shows that many of the critical areas of law on finance were a product of equity, and not of the common law. To state the implied conclusion in its most contentious form, Britain might never have been economically powerful enough to spread its common law around the world had it relied solely on the common law at the time that this spread began.

Section VI concludes, providing reflections on both the promise of computational text analysis for legal history and its pitfalls. We comment on what topic-modeling can and cannot do. We emphasize that topic-modeling can provide new sources of data for other researchers: once a massive volume of texts is summarized, the quantitative summaries themselves can provide inputs into further research. Peering into the future, one can detect signs that unsupervised machine-learning might be gradually changing the research perspectives of social science, with descriptive analyses now becoming more acceptable. The almost exclusive emphasis on the hypothetico-deductive method is waning (very slightly at the moment) and exercises in the inductive spirit are gaining credibility. This change would naturally lead to much more complementarity between traditional legal historians and those who favor the use of computational and statistical methods in the social sciences.

I.
An Introduction to Topic-Modeling The techniques that we describe here are descendants of the seminal paper by Blei, Ng, and Jordan,15 particularly the structural topic model by Roberts, Stewart, and Airoldi, which is the version of topic-modeling used by GM to produce their results.16 Topic-modeling originated in com- puter science, in pursuit of using computational methods to summarize 15. D.M. Blei, A.Y. Ng, and M.I. Jordan, “Latent Dirichlet Allocation,” Journal of Machine Learning Research 3 (2003): 993–1022. One measure of the prominence of this contribution is that this is the seventh most cited article in computer science that was pro- duced this millennium (https://citeseerx.ist.psu.edu/stats/articles). To be sure, there were a number of similar algorithms developed before the Blei et al. contribution, but early excite- ment about these methods seems to have focused on Blei et al., perhaps because of the acces- sible software developed for implementation. See A.K. McCallum, “MALLET: A Machine Learning for Language Toolkit,” (http://mallet.cs.umass.edu, accessed April 2021). In their note explaining this software, S. Graham, S. Weingart, and I. Milligan, “Getting Started with Topic Modeling and MALLET,” (https://programminghistorian.org/en/lessons/topic-model- ing-and-mallet, accessed April 2021). state: “You will sometimes come across the term ‘LDA’ when looking into the bibliography of topic modeling. LDA and Topic Model are often used synonymously, but the LDA technique is actually a special case of topic model- ing created by David Blei and friends. . . . It was not the first technique now considered topic modeling, but it is by far the most popular. . .They all work in much the same way.” One such earlier algorithm was used in study by Newman and Block in the first history publica- tion to use topic-modeling. See D. J. Newman and S. 
Block, “Probabilistic Topic Decomposition of an Eighteenth-Century American Newspaper,” Journal of the American Society for Information Science and Technology 57 (2006): 753–67. 16. See M.E. Roberts, B.M. Stewart, and E.M. Airoldi, “A Model of Text for Experimentation in the Social Sciences,” Journal of the American Statistical Association 111 (2016): 988–1003, whose general approach is very similar to that of Blei et al., but has an emphasis on incorporating document meta-information (such as date of publication) directly into the analysis. Small details would have changed had we used LDA, but we are sure the overall picture would have remained the same. For copious detail on the structural-topic-model approach to topic-modeling, including how to get started on imple- mentation, see https://www.structuraltopicmodel.com (accessed June 2019). Law and History Review, May 2022198 https://doi.org/10.1017/S0738248022000153 Published online by Cambridge University Press https://citeseerx.ist.psu.edu/stats/articles https://citeseerx.ist.psu.edu/stats/articles http://mallet.cs.umass.edu http://mallet.cs.umass.edu https://programminghistorian.org/en/lessons/topic-modeling-and-mallet https://programminghistorian.org/en/lessons/topic-modeling-and-mallet https://programminghistorian.org/en/lessons/topic-modeling-and-mallet https://www.structuraltopicmodel.com https://www.structuraltopicmodel.com https://doi.org/10.1017/S0738248022000153 large amounts of text information. Within the social sciences and human- ities, the field in which topic-modeling first flourished was the digital humanities, particularly literature, obviously a field for which text is cen- tral. In that discipline, the rise in popularity was probably fueled by the rhe- toric of the assertions of the advantages of distant reading over traditional close reading, and ensuing debates. 
With text, rather than numbers, providing much of the core data in politics, political science was the next major discipline to see the advantages of the new machine-learning approaches, particularly topic-modeling. It is much more difficult to find applications in political theory, which is perhaps the closest analog in political science to case law.17 Political science was naturally followed by law, also presumably because much of its data are texts, but legal history has been slow to follow. Digital humanities, political science, and law seem to be the three major non-computational-science disciplines in which applications using topic-modeling, and related techniques, appear regularly in the top journals and are cited regularly within the mainstream of the field.

Economics and history, particularly legal history, the disciplines reflected in this article, are ones in which the application of topic models has lagged. In economics, this is readily explained by the enormous influence of the hypothetico-deductive paradigm, with its emphasis on the testing of hypotheses concerning isolated causal facts rather than an interest in broad narrative.18 The uses of topic-modeling in economics most usually focus on new measurements of highly specific phenomena, to fit into a particular implementation of that paradigm.19 Our use of topic-modeling is therefore rather different from the few applications in the mainstream of our field: our objective is to provide a broad narrative of finance-related English case law over two centuries. To the extent that we match our data to specific hypotheses, it is because we came to realize after the construction of our narrative how our narrative naturally reflected on these hypotheses, not because we aimed originally to test them.

17. On this point, see H. Bonin, “From Antagonist to Protagonist: ‘Democracy’ and ‘people’ in British Parliamentary Debates, 1775–1885,” Digital Scholarship in the Humanities 35 (2020): 759–75.
One example using a topic-model-type method is L. Blaydes, J. Grimmer, and A. McQueen, “Mirrors for Princes and Sultans: Advice on the Art of Governance in the Medieval Christian and Islamic Worlds,” Journal of Politics 80 (2018): 1150–67.

18. Applications in sociology have also lagged, perhaps because the hypothetico-deductive method has had increasing sway in that field as well. For the lag in sociology, see N.C. Lindstedt, “Structural Topic Modeling for Social Scientists: A Brief Case Study with Social Movement Studies Literature, 2005–2017,” Social Currents 6 (2019): 307–18.

19. For example, S. Hansen and M. McMahon, “Shocking Language: Understanding the Macroeconomic Effects of Central Bank Communication,” Journal of International Economics 99 (2016): S114–S133.

The reason for history’s lag in applying machine-learning in general, and topic-modeling in particular, is less clear, to us at least.20 As already mentioned, topic-modeling leads naturally to a historical narrative. But Guldi and Armitage, who also emphasize this point, argue that research in history has turned away from exercises that examine long time periods and expansive subjects, exactly the areas in which machine-learning can contribute. The high degree of technical complexity in existing applications of topic-modeling to history might also have discouraged some researchers.21 However, as we hope to show in this article, researchers interested in using the output of topic models do not themselves have to engage in all the complexities of producing topic model estimates. If that output is freely available to all, as is the case with GM, it is enough for subsequent researchers to understand how to interpret that output when using it as a source of data for further exploration. An analogy is helpful here.
Economic historians using estimates of national income are not required to produce those estimates themselves, or even to grasp all the complexities of data gathering and index number construction. As we will show, by example, the output data of topic-modeling can be used in an exactly analogous way as input data for further exercises.

20. Stephen Robertson emphasizes that text-analysis in history has been held back simply by the limited availability of a large stock of digital texts. S. Robertson, “The Differences between Digital Humanities and Digital History,” Debates in Digital Humanities (2016) (https://dhdebates.gc.cuny.edu/read/untitled/section/ed4a1145-7044-42e9-a898-5ff8691b6628#ch25m, accessed March 2021). This constraint is rapidly being relaxed. Indeed, one of the contributions of GM is to make machine-readable, cleaned versions of the English Reports available for scholars in general. See GM and the concluding section of this article for more details.

21. For interesting articles of this kind, see A. Barron, J. Huang, R. Spang, and S. DeDeo, “Individuals, Institutions, and Innovation in the Debates of the French Revolution,” Proceedings of the National Academy of Sciences 115 (2018): 4607–12; and A. Rule, J. Cointet, and P. Bearman, “Lexical Shifts, Substantive Changes, and Continuity in State of the Union Discourse, 1790–2014,” Proceedings of the National Academy of Sciences 112 (2015): 10,837–44.

A. The Topic Model

The algorithms producing topic-modeling estimates begin with a conceptualization of the process of document (in our context, case report) generation that is extremely crude, but lends itself to formalization in a statistical model. It is the explicitness of the conceptualization that facilitates interpretation of the results of the analysis, producing the insights that legal historians might appreciate. Such an interpretation is often not possible with the results of other machine-learning techniques, such as neural networks, in which the focus is on prediction or problem-solving, rather than description. But the relative ease of interpretation comes with a cost: the simple conceptualization will surely foster a general skepticism.22 We give an unvarnished view here to emphasize limitations, and why they arise.

The process of generating case reports envisaged by topic-modeling may be summarized as follows. An author (in our context, a legal reporter) is viewed as beginning with a fixed number of topics, essentially lodged in his or her brain and available for use when writing. Topics might be well-identified legal concepts, such as assumpsit or habeas corpus, or ideas that cut across many domains of law, such as revocation, or even a particular reporting style.23 When a particular topic is used, the author simply has a greater preference for the vocabulary more closely associated with that topic than for other words. For example, when the author refers to the topic assumpsit, the author will have a greater likelihood of using the word “promise”; similarly, mention of bail will be frequent when using the topic habeas corpus. The production of a document, a case report, then entails the author choosing to emphasize some topics less and some more, depending on the general context of that report. A document will be a mixture of topics.
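The generative story just described can be made concrete with a small simulation. The sketch below is illustrative only and is not GM's procedure: the three topics, their vocabularies, the word probabilities, and the topic proportions are hypothetical values set by hand (a fitted model such as LDA or the structural topic model would estimate them from the corpus), and the function name `generate_report` is invented for the example.

```python
import random

# Hypothetical topics: each maps vocabulary words to probabilities summing to 1.
TOPICS = {
    "assumpsit": {"promise": 0.5, "debt": 0.3, "consideration": 0.2},
    "habeas_corpus": {"bail": 0.5, "prison": 0.3, "writ": 0.2},
    "wills": {"daughter": 0.4, "wife": 0.3, "estate": 0.3},
}


def generate_report(topic_proportions, n_words, seed=0):
    """Draw a bag of words: for each word, pick a topic, then a word from it."""
    rng = random.Random(seed)
    topic_names = list(topic_proportions)
    topic_weights = [topic_proportions[t] for t in topic_names]
    words = []
    for _ in range(n_words):
        # Step 1: choose which topic this word slot will draw on.
        topic = rng.choices(topic_names, weights=topic_weights, k=1)[0]
        # Step 2: choose a word according to that topic's vocabulary preferences.
        vocab = TOPICS[topic]
        word = rng.choices(list(vocab), weights=list(vocab.values()), k=1)[0]
        words.append(word)
    return words


# A report on a debtor imprisoned after a contractual dispute:
# mostly assumpsit, some habeas corpus, no emphasis on wills.
report = generate_report(
    {"assumpsit": 0.6, "habeas_corpus": 0.4, "wills": 0.0}, n_words=200
)
```

Because the hand-set proportion for the wills topic is zero, words such as “daughter” never enter this simulated report, while “promise” and “bail” recur. Estimation runs this story in reverse, inferring topics and proportions from the observed words.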
Thus, a particular case report might tend to emphasize, for example, both assumpsit and habeas corpus because the defendant was in debtor’s prison as a result of a case involving non-payment of a contractual debt. The words “promise” and “bail” would then appear prominently in this case report, but words such as “daughter” or “wife” would hardly appear because they are associated with topics that emphasize estates or wills, which are of no relevance for these particular types of cases.

22. As counterpoint to this apology for simplification, see S. Robertson, “Digital Humanities” in The Oxford Handbook of Law and Humanities, ed. S. Stern, M. Del Mar, and B. Meyler (Oxford: Oxford University Press, 2019), emphasizing that “If humanities scholars chafe at such simplification, it is worth noting that narrative, the favored representational model of humanities scholars, is a deliberately simplified account that is illuminating because of, not despite, its simplification.”

23. The productive use of machine-learning to detect style was emphasized by Matthew L. Jockers, one of the most forceful advocates of machine-learning in the digital humanities; see M.L. Jockers, Macroanalysis: Digital Methods and Literary History (Urbana: University of Illinois Press, 2013).

Thus, a topic model views a document as one created in a process in which the author has chosen to emphasize certain topics, which in turn emphasize their own characteristic vocabularies. Consistently, a document is fed into the statistical analysis as a bag of words that has been stripped of all syntactic and sentence structure. However, each word choice is based on the emphasized topics and the vocabulary emphasized by these topics. This conceptualization then views semantic content as becoming embedded in a report via word choices, which will be highly correlated across related reports. For example, “contract” and “promise” will appear frequently together, but their presence will be negatively correlated with the appearance of the words “daughter” and “will,” which in turn frequently co-occur. The topic-modeling algorithm produces results that reflect semantic content because it leverages these patterns of correlations across documents. In the phrasing of Mohr and Bogdanov, “relationality trumps syntax.”24 Similarly, topic-modeling is able to “see” through polysemy because meanings are embodied in combinations of word usage, not in single words.25 The model will reflect the sense of “extent” in “he is not bound to prove the whole extent of a debt” very differently from the one in “the Crown may not proceed against its debtor either by extent or scire facias,” because of the repetition of the accompanying words across many cases.

The bag-of-words assumption is obviously a stylization that does no justice to the process of writing. One should note, however, that this assumption is partially a consequence of current limitations in computational power. With expected increases in computational power, much more acceptable characterizations of the process of authoring a document will become available when using techniques that are descendants of the ones described here.26

One final step in the conceptualization of the document-writing process is to acknowledge that different authors of case reports have different characteristics, and that indeed the same author will be influenced by circumstances such as the timing of the case and the court adjudicating it.
This can be explicitly incorporated into the estimation process when using the structural topic model. In the application reported in this article, the author of a specific case report is viewed as being influenced by the year in which the report was written and the court in which the case was heard.

24. J. W. Mohr and P. Bogdanov, “Introduction−Topic Models: What They Are and Why They Matter,” Poetics 41 (2013): 545–69.

25. DiMaggio et al., “Exploiting Affinities.”

26. One could instead view documents as collections of two- or three-word chunks, or even larger phrases. But the required processing power increases proportionately with the number of distinct phrases, which increases exponentially with the number of words allowed to be in a phrase.

At this stage, we would imagine that readers unversed in topic-modeling, and in machine-learning methods more broadly, are immensely skeptical. We were too when we began using such techniques. But after several years of poring over results, comparing those results to existing ideas in the literature, and seeing the added value of insights that were not possible to reach when utilizing conventional approaches to the analysis of texts, we saw how topic-modeling can provide a powerful complement to the traditional work of historians. Moreover, “legal history is better positioned for a digital turn than most historical fields when it comes to the amenability of legal sources to computational analysis” because reporters followed consistent forms of presentation using specialized vocabulary, where the correspondence between words and meaning remained much more constant both over time and among individuals than would have been the case for ordinary language.27

B. Estimation

Estimation begins with the observations that are available to the researcher: the documents (case reports) and the information that characterizes authors. The researcher must decide on the number of topics; that is, the number of sections of the digest. Using statistical criteria and a more subjective evaluation of the coherence and meaning of the produced topics, GM judged that 100 different topics adequately captured the salient emphases in the reports on pre-1765 cases.28 This element of human judgment is part of the process of validating the overall topic model: “Researchers must also interpret the topic model output, probably iteratively, so that a best fit can be found between the number of topics and an overall level of interpretability.”29

Given the estimated topics, machine-learning provides a measure of the importance of each vocabulary word to each topic. In the current example, this is the proportion of each of the 41,174 distinct vocabulary words in each of the 100 topics. The estimation also predicts the proportion of any given one of the 52,949 documents that can be attributed to the use of each topic. And given that each document is labeled as reporting on a case heard at a specific time in a specific court, the estimates provide information on how the use of various topics varies with those characteristics, year or court.

27. See Robertson, “Searching.” On this point, see also P. Grajzl and P. Murrell, “Lasting Legal Legacies: Early English Legal Ideas and Later Caselaw Development During the Industrial Revolution,” Review of Law & Economics (2022), pre-publication online version, https://doi.org/10.1515/rle-2021-0070 (accessed April 16, 2022).

28. As evidenced by the large, related literature, computational scientists and statisticians usually emphasize rule-based criteria for model choice, relying solely on numerical information derived from the estimating process or the output data.
In contrast, practitioners emphasize the element of subjective judgment, which would take into account the perceived quality of the topics reflecting both the uses to which they are to be put and the nature of the text data that is used in estimation. See, for example, Gentzkow et al., “Text as Data”; DiMaggio et al., “Exploiting Affinities”; and Mohr and Bogdanov, “Introduction−Topic Models.”

29. See Mohr and Bogdanov, “Introduction−Topic Models,” 560.

C. What Are the Topics?

Topic-modeling is an unsupervised machine-learning exercise. The estimation of the topics is not guided by any objective to match topics to pre-existing ideas about what is in the law. Thus, the produced objects, the topics, come unlabeled. The researcher must provide titles for the sections of the machine-produced digest. This requires applying insights from existing legal-historical research. The estimated word and document proportions described above are matched against those insights. One examines closely the vocabulary most used by a topic and one closely reads those case reports in which the topic is most prominent. This is an extremely laborious task, but GM found that it was not conceptually difficult to identify the idea or ideas underlying each and every one of their 100 topics.

One important part of the general methodology to emphasize here is that the identification of what a topic refers to cannot rely solely on a perusal of the vocabulary or words that a topic most uses, even those words that a topic most uses relative to other topics. It is absolutely essential to read the documents in which a topic is most prominent.
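The two inspection steps just described (listing the vocabulary a topic uses most and locating the reports in which the topic is most prominent) can be sketched as follows. This is a minimal illustration on made-up numbers, not GM's data or code: the matrices `beta` (topic-by-word proportions) and `theta` (document-by-topic proportions), the four-word vocabulary, and the report identifiers are all hypothetical stand-ins for the output of a fitted topic model.

```python
# Hypothetical fitted-model output for 2 topics, 4 vocabulary words, 3 reports.
vocab = ["promise", "debt", "daughter", "will"]
beta = [  # beta[k][v]: proportion of topic k accounted for by word v
    [0.55, 0.35, 0.05, 0.05],
    [0.05, 0.05, 0.40, 0.50],
]
report_ids = ["report_a", "report_b", "report_c"]
theta = [  # theta[d][k]: proportion of report d attributed to topic k
    [0.9, 0.1],
    [0.2, 0.8],
    [0.6, 0.4],
]


def top_words(topic, n):
    """Return the n words with the highest proportion within one topic."""
    ranked = sorted(range(len(vocab)), key=lambda v: beta[topic][v], reverse=True)
    return [vocab[v] for v in ranked[:n]]


def top_reports(topic, n):
    """Return the n reports in which the topic is most prominent."""
    ranked = sorted(range(len(theta)), key=lambda d: theta[d][topic], reverse=True)
    return [report_ids[d] for d in ranked[:n]]


# For each topic, list its leading words and the reports to read closely.
for k in range(len(beta)):
    print(k, top_words(k, 2), top_reports(k, 2))
```

The word lists alone are only a starting point. The reports returned by `top_reports` are the ones a researcher would read closely before settling on a topic name.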
The labeling of a topic must make sense in relation to the content of all the other estimated topics, because the specific emphasis in one topic might only be clear when contrasting that topic to a closely related one with a different emphasis. The reason to highlight this point is that many, probably a large majority, of the articles that have used topic-modeling to date base the interpretation of topics only on perusal of the words that a topic most uses. A reading of the documents requires domain-specific knowledge, and in the case of pre-industrial English history, it certainly requires struggling with a very different form of English. That is one reason why we emphasize that modern machine-learning and traditional doctrinal text analyses are complements.

This painstaking naming process is an essential ingredient of the validation of a topic model exercise: simply making sure that its results provide an intuitive, coherent whole, both within topics and across topics.30 The relative ease, in the conceptual sense, of topic naming in GM does suggest that their whole topic-modeling exercise has high validity. If many topics were simply mysterious, then one would conclude that the specific features of the machine-learning process were not well suited to the texts being analyzed.

30. See Grimmer et al., “Machine Learning for Social Science,” stating: “Rather than place our trust fully in models and fit statistics, we argue that human feedback is essential for judging the quality of model results used for discovery.”

Some of GM’s topics fit snugly within existing concepts in the legal, historical, and traditional text-analysis literature. For example, the topic names “Assumpsit,” “Bankruptcy,” and “Uses” resonate closely with legal concepts and instruments covered at length in textbooks on the history of English law.31 Other sets of topics split a single broad subject into several constituent areas (e.g., Implementing Ambiguous Wills, Contingency in Wills, Validity of Wills). Yet further types of topics encompass issues that cut across many substantive areas of law (e.g., Revocation, Determining Damages and Costs) or refer to general legal ideas and modes of reasoning about cases as opposed to specific domains of application (e.g., Coke Reporting).32 This is an example of topic-modeling as an exercise in discovery, rather than an exercise in prediction or hypothesis testing, which would instead be focused on a search for anticipated patterns in case law or legal ideas.33

31. In order to distinguish our topic names clearly in the remainder of this article, we capitalize them.

32. An additional type of topic is identified by DiMaggio et al., who in “Exploiting Affinities” argue that “Topic models often shunt noisy data into uninterpretable topics in ways that strengthen the coherence of topics that remain.” In fact, our experience is not that the topics are uninterpretable, per se, but rather that the interpretation means that the topic tells one nothing about the substantive inquiry in question. For example, GM find a topic that they call Non-Translated Latin. Sixteenth- and seventeenth-century lawyers not only had their own version of English, but their Latin was also highly idiosyncratic. The text preparation procedures were able to handle idiosyncratic English and standard Latin, but not idiosyncratic Latin.

33. This point is much emphasized in the literature, something to which we return more fully in the Conclusion. See, for example, DiMaggio et al., “Exploiting Affinities”; Mohr and Bogdanov, “Introduction−Topic Models”; A. Goldberg, “In Defense of Forensic Social Science,” Big Data & Society (2015), July-Dec: 1–3; and L.K. Nelson, “Leveraging the Alignment Between Machine Learning and Intersectionality: Using Word Embeddings to Measure Intersectional Experiences of the Nineteenth Century U.S. South,” Poetics 88 (2021): 101539, 1–18.

When economists name topics in such an analysis, there is undoubtedly a tendency to focus on the functional domain to which the law is applied. A legal scholar would probably focus more on the legal doctrines captured in a topic and the historical origins of those doctrines. We are therefore sure that legal historians would have chosen a slightly different set of names than GM did for at least a subset of the 100 topics, probably finding labels that resonate more with internal characteristics of the legal system and less with outward effects on economic agents.34

34. Since the topic-modeling algorithm produces unlabeled topics and since the data output from that algorithm can easily be transmitted, other researchers could easily produce their own set of names for the 100 topics. Indeed, doing so could inspire much more research that benefits from topic models. One research team produces the output data from the topic model, which can then easily be the input data for the work of other researchers.

The fact that the list of topic names, the titles of the sections of the digest of pre-1765 English case law and associated legal ideas, only partially matches the chapter and section headings of a legal history textbook can be viewed as either a vice or a virtue, depending on the reader’s perspective. It might be troubling for some readers to look at a topic like Geographic Jurisdiction of Laws and realize that this topic is prominent in case reports that deal with such divergent areas of law as the relations between parishes and the legal status of individual citizens of belligerent nations. This topic appears prominently in case reports that span the whole time period covered by our data, and it appears in cases heard in all of the major courts. Thus, some areas of emphasis suggested by topic-modeling do not fit comfortably within existing classifications based on more traditional techniques. But this, in fact, shows the power of these machine-learning methods, highlighting how legal ideas can appear in many different types of cases. By covering the gamut of case reports in a particular time period, topic-modeling is an exercise in discovery, unearthing substantive patterns and connections between seemingly disparate notions that would likely remain unnoticed with the use of traditional methods restricted by the limits of human memory and reason. We return to this point in the Conclusion, where we comment on how machine-learning is changing the research practices in several fields, lessening the hold of the hypothetico-deductive method, and opening up possibilities for inductive exercises.

II. The Elements of English Legal History Emphasized in Mainstream Economics

In this section, we explain why we, as economists, chose to focus on the law relevant to finance in articulating the properties, promise, and pitfalls of topic-modeling. Ideas about the history of the law have made a difference in economics.
Some of the conventional wisdom that drives important areas of mainstream economics reflects on subjects that are of great interest to those legal historians studying developments before the twentieth century. However, there appears to have been little cross-fertilization between the literatures of the two fields, certainly as far as those literatures focus on the case law of the pre-industrial era.35 Hence, the specific ideas embraced by mainstream economists do not always match the legal history that has been developed by those researchers whose primary audience is legal scholars and who approach the study of legal history with traditional text-analysis methods.

35. Some economic historians have been very aware of detailed developments in the legal sphere, but such economic historians seem to have had little effect on the perspectives on English legal history that are dominant in the mainstream of economic analysis, as exemplified in the works to be discussed in the ensuing paragraphs. R. Harris, in “The Encounters,” was early in making a case for productive exchange between legal history and economics, stressing that legal historians did not pay sufficient attention to the economic history literature. We are more concerned here with the lack of interchange in the reverse direction.

Finance is not normally a category or an immediate domain of interest within the legal-history literature of the pre-industrial era.36 However, this area of law is vital to economic history because England’s financial revolution preceded, and was perhaps a key input into, the Industrial Revolution.
Ideas about English legal history have been influential in areas of economics as diverse as the regulation of modern financial markets, protection of investor rights, and the relief of poverty in the poorest countries. This is no doubt due largely to the global influence of Britain from the eighteenth century on, the importance of the British financial and industrial revolutions, and the spread of the common law around the globe. It is also certainly due to the fact that understanding the sources of economic development is often considered the most important question of economics, and Britain led the world in political and economic development for more than two centuries.

36. The word “finance” appears only twice in J.H. Baker, An Introduction to English Legal History, fifth edition (Oxford: Oxford University Press, 2019) and the pertinent issues are in separate discussions, included under property and contract.

Our focus here is on the two most influential strains of thought that are driven by interpretations of English legal history and that have had wide currency in mainstream economics. Given this focus, we unfortunately cannot do justice to the many authors, particularly those studying institutional and economic history, who challenge these views, and offer nuanced caveats.37 The two legal-history-based paradigms are those following the seminal articles by North and Weingast (henceforth NW)38 and La Porta, Lopez-de-Silanes, Shleifer, and Vishny (henceforth LLSV).39 Both paradigms focus on high-level, even constitutional, elements of the legal system rather than on the information that occupies most of English legal history and which provides the data for this article; that is, the vast collection of reports on the deliberations within the courts.

37. Some salient critiques are N. Sussman and Y. Yafeh, “Institutional Reforms, Financial Development and Sovereign Debt: Britain 1690–1790,” Journal of Economic History 66 (2006): 906–35; P. Murrell, “Design and Evolution in Institutional Development: The Insignificance of the English Bill of Rights,” Journal of Comparative Economics 45 (2017): 36–55; L. Neal, “How It All Began: The Monetary and Financial Architecture of Europe During the First Global Capital Markets, 1648–1815,” Financial History Review 7 (2000): 117–40; P. O’Brien, “The Nature and Historical Evolution of an Exceptional Fiscal State and Its Possible Significance for the Precocious Commercialization and Industrialization of the British Economy from Cromwell to Nelson,” Economic History Review 64 (2011): 408–46; S. Ogilvie and A.W. Carus, “Institutions and Economic Growth in Historical Perspective,” in Handbook of Economic Growth, ed. P. Aghion and S.N. Durlauf (Amsterdam: Elsevier, 2014), 403–513; D. Coffman, A. Leonard, and L. Neal (ed.), Questioning Credible Commitment: Perspectives on the Rise of Financial Capitalism (Cambridge: Cambridge University Press, 2013); and G.M. Hodgson, “1688 and All That: Property Rights, the Glorious Revolution and the Rise of British Capitalism,” Journal of Institutional Economics 13 (2017): 79–107.
Both sets of works have had enormous influence in economics, in areas far removed from their original domain of application.40 The approach of NW is that “the institutional changes of the Glorious Revolution permitted the drive toward British hegemony and dominance of the world.”41 In emphasizing the effects of constitutional measures, particularly the Bill of Rights and the Act of Settlement, NW are followed by the influential works of Acemoglu and Robinson42 and North, Wallis, and Weingast.43

LLSV also emphasize overarching features of the legal system.44 Their focus is on the overall characteristics of law-making and legal adjudication and how these produce different types of legal processes in common-law and civil-law countries. In many works following on the original articles, summarized by La Porta, Lopez-de-Silanes, and Shleifer,45 the authors, and others, bring enormous amounts of modern data at a very detailed level to bear on their work. But to the extent that they engage with legal history it is at the level of the approaches to law that were developed in England and France and the effect of these approaches on system-wide characteristics such as judicial independence, the use of juries, organization of the legal system, and the sources of law.

38. NW, “Constitutions and Commitment: The Evolution of Institutions Governing Public Choice in Seventeenth-Century England,” Journal of Economic History 49 (1989): 803–32.

39. LLSV, “Legal Determinants of External Finance,” Journal of Finance 52 (1997): 1131–50; and LLSV, “Law and Finance,” Journal of Political Economy 106 (1998): 1113–55.

40. A computational search of JSTOR reveals how unusual these two works are in their spread across the whole of economics. NW, “Constitutions and Commitment,” appears in the Journal of Economic History and is referred to in JSTOR thirteen times as often as the typical article published at the same time in that journal. The references to NW, “Constitutions and Commitment,” are twice as common in the journals outside economic history as in economic history journals, while for the typical article published in the same journal at the same time, the ratio is 0.6. Similarly, LLSV, “Legal Determinants,” appears in the Journal of Finance and is referred to in JSTOR twenty times as often as the typical article published at the same time in the same journal. The references to LLSV, “Legal Determinants,” are twice as common in the journals outside of finance as in finance journals, while for the typical article at the same time in the same journal, the ratio is 0.24.

41. NW, “Constitutions and Commitment,” 830.

42. D. Acemoglu and J.A. Robinson, Why Nations Fail: The Origins of Power, Prosperity and Poverty (New York: Crown Business, 2012). See, for example, reiteration that “The Glorious Revolution limited the power of the king and the executive, and relocated to Parliament the power to determine economic institutions . . .The Glorious Revolution was the foundation for creating a pluralistic society. . .The government. . .steadfastly enforced property rights. . . Historically unprecedented was the application of English law to all citizens. Arbitrary taxation ceased, and monopolies were abolished almost completely. . .” at 102.

43. D.C. North, J.J. Wallis, and B.R. Weingast, Violence and Social Orders: A Conceptual Framework for Interpreting Recorded Human History (Cambridge: Cambridge University Press, 2009).

44. See LLSV, “Legal Determinants”; and LLSV, “Law and Finance”.
The reader will notice from the foregoing summary that the two influential legal-history paradigms that have had a broad influence across a swathe of economics do not rest on detailed examinations of the vast number of routine developments in the law that are the stuff of the history emphasized by traditional legal historians. These paradigms do not invoke characterizations of the development of English law in the period 1550–1750, which are based on the records of the courts and apply to domains that are crucially important for a capitalist economy: contract, property, and tort. They do not reflect the painstakingly slow developments occurring in procedures, precedent, and forms of legal action, which affected how the courts functioned and how litigants could use the law. In short, within the two institutional narratives that have been most successful in using English legal history to influence the way economists think about the world, the work of scholars within traditional legal history is largely missing.

A machine-learning history of English case law offers the chance to bridge the fields of economics and legal history. By using as input the reports on tens of thousands of historical cases, it absorbs, albeit imperfectly, the most important information used by legal historians, the micro-level case-report data that are far removed from the macro-level constitutional and legal-system arrangements emphasized by NW and LLSV. By interpreting the results of the analysis using centuries of insights developed by scholars who have focused on case law, a machine-learning approach incorporates elements of traditional legal-historical research and complements existing exegeses on legal development.
At the same time, a machine-learning history offers the type of broad narrative about case law that would be so difficult for an outsider to the field of legal history to grasp without access to the results of the topic model, even with the use of such a superb textbook as that by Baker.46 In the ensuing sections, we illustrate the power of topic-modeling in the context of the developments and features of case law and legal ideas pertinent to finance.

45. R. La Porta, F. Lopez-de-Silanes, and A. Shleifer, “The Economic Consequences of Legal Origins,” Journal of Economic Literature 46 (2008): 285–332.

46. See Baker, An Introduction.

III. Characterizing Temporal Change: The Development of Case Law and Legal Ideas Relevant to Finance

Within the sections of the 100-topic machine-produced digest of pre-1765 English case law and associated legal ideas, we identified fifteen topics as pertinent to finance. Topic-modeling does not tell us which topics to designate as relevant to finance: this is our judgment based on an understanding of the content of all topics estimated by GM. The topics we designated as finance ones are: Arbitration and Umpires, Assumpsit, Bankruptcy, Bonds, Claims from Financial Instruments, Contract Interpretation and Validity, Executable Purchase Agreements, Execution and Administration of Estates, Identifying Contractual Breach, Implementing Trusts, Mortgages, Negotiable Bills and Notes, Pleadings on Debt, Prioritizing Claims, and Repaying Debt. Table 1 contains a brief description of these topics, focusing on select key words (or rather their stems) and the top case reports identified by topic-modeling.47

Figure 1 presents timelines for these fifteen topics over the years 1550–1750.
To interpret these figures, it is best to focus on a particular example, so we will use Assumpsit. Taking a particular year, say 1600, the figure indicates that the topic Assumpsit occupied roughly 3% of the attention in the reports of cases heard in that year.48 These timelines reflect a feature of topic-modeling that has been much emphasized in the literature. They capture the changing amount of attention in English courts in a very long time period reflecting thousands of cases, focusing not on landmark rulings, but rather on overall trends reflecting data that might be only a tiny part of each individual case. As Goldstone and Underwood found for the digital humanities, “Quantitative methods may be especially useful for characterizing long, gradual changes, because change of that sort is otherwise difficult to grasp.”49

47. Many more details on these topics can be found in GM and the corresponding appendices. See note 10.

48. Of course, despite the large number of reports used to produce the data, any given year might have only a few cases. Therefore, the figures are moving averages, producing smoothness, especially removing prominent idiosyncrasies arising in years when the data are sparse. Additionally, such figures are usually accompanied by confidence intervals that indicate how imprecise the estimate of the timeline is in any given year. In our applications, those intervals are very narrow for all the timelines. Thus, it is sufficient to focus only on the averages that appear in the diagram.

49. Goldstone and Underwood, “The Quiet Transformations,” 379. This resonates with comments in Guldi and Armitage, The History Manifesto, and in Flanders on what computers can do: J. Flanders, “Detailism, Digital Texts, and the Problem of Pedantry,” TEXT Technology 2 (2005): 41–70.

Table 1. The fifteen finance topics briefly described.

Arbitration & Umpires: 0.67%
Key word-stems include ‘award’, ‘arbitr’, ‘umpir’, ‘arbitra’, ‘attach’, ‘releas’, ‘perform’.
Top reports revolve around whether the arbitrators made timely decisions and had chosen an umpire.

Assumpsit: 1.51%
Key word-stems include ‘assumpsit’, ‘promis’, ‘indebitatus’, ‘consider’, ‘forbear’, ‘indebt’, ‘debt’.
Top reports focus on whether an assumpsit had taken place and whether an action of assumpsit is allowed.

Bankruptcy: 0.48%
Key word-stems include ‘bankrupt’, ‘creditor’, ‘assigne’, ‘debt’, ‘bankruptci’, ‘assign’, ‘commiss’.
Top reports focus on the assignment of the bankrupt’s estate.

Bonds: 1.29%
Key word-stems include ‘bind’, ‘condit’, ‘oblig’, ‘debt’, ‘perform’, ‘void’, ‘sureti’.
Top reports concern bonds, focusing on the obligations of the bonds and whether they were satisfied.

Claims from Financial Instruments: 0.75%
Key word-stems include ‘annuiti’, ‘cent’, ‘annum’, ‘southsea’, ‘ayear’, ‘stock’, ‘dividend’.
Top reports describe instances of resolving monetary claims concerning bonds, stocks, dividends, mortgages, annuities.

Contract Interpretation & Validity: 0.56%
Key word-stems include ‘agreement’, ‘contract’, ‘bargain’, ‘write’, ‘agre’, ‘specif’, ‘sign’.
Top reports revolve around interpretation of the meaning of a contract in a given setting.

Executable Purchase Agreements: 0.72%
Key word-stems include ‘purchas’, ‘sell’, ‘convey’, ‘fraud’, ‘deed’, ‘conceal’, ‘reliev’.
Top reports concern contractual transfers of property rights and what renders the contract executable.

Execution & Administration of Estates: 1.12%
Key word-stems include ‘executor’, ‘administr’, ‘testat’, ‘asset’, ‘executrix’, ‘administratrix’, ‘probat’.
Top reports involve the actions of administrators or executors of estates.

Identifying Contractual Breach: 0.64%
Key word-stems include ‘breach’, ‘coven’, ‘perform’, ‘nonpay’, ‘evict’, ‘break’, ‘refus’.
Top reports are about ascertaining and clarifying whether breach of contract has occurred in a given situation.

Implementing Trusts: 0.62%
Key word-stems include ‘trust’, ‘estat’, ‘chariti’, ‘profit’, ‘decre’, ‘convey’, ‘beneficiari’.
Top reports concern implementation of trusts, and rules to determine what is permissible in implementation.

Mortgages: 0.56%
Key word-stems include ‘mortgag’, ‘mortgagor’, ‘redempt’, ‘equiti’, ‘encumbranc’, ‘interest’, ‘foreclos’.
Top reports depict disputes pertaining to rights and obligations of mortgagors, mortgagees, and impacted parties.

Negotiable Bills & Notes: 0.59%
Key word-stems include ‘bill’, ‘note’, ‘accept’, ‘endorse’, ‘promissory’, ‘merchant’, ‘exchange’.
Top reports describe the use of bills of exchange and promissory notes, focusing on their negotiability.

Pleadings on Debt: 0.99%
Key word-stems include ‘plea’, ‘obligatori’, ‘behalf’, ‘aforesaid’, ‘premis’, ‘verifi’, ‘attorney’.
Top reports focus on the various pleadings to which creditor and debtor have access.

Prioritizing Claims: 0.83%
Key word-stems include ‘estat’, ‘debt’, ‘person’, ‘shall’, ‘payment’, ‘creditor’, ‘asset’.
Top reports focus on who should be paid when claims exceed available funds.

Repaying Debt: 1.59%
Key word-stems include ‘payment’, ‘interest’, ‘due’, ‘repay’, ‘discharge’, ‘indebt’, ‘lend’.
Top reports lay out the details of paying back a sum of money that is owed, often with a focus on interest and often via complex transactions.

Note: The percentage figures are the proportions of the topic in the whole corpus. The mean topic proportion in the whole corpus of reports is 1.0% and the median is 0.81%. The mean topic proportion of the 15 finance topics is 0.85%, the median is 0.67%, and their sum is 12.8%.

There is a crucial question of how to interpret the meaning of that amount of attention, which is a central concern of GM. It is natural to think that the height of a timeline reflects how much a certain area of law is used or not, and this is exactly the assumption in the oft-used word-frequency analysis. The fallacy of such an approach becomes evident on examining our example topic, Assumpsit. Its timeline exhibits an inverted U, with attention to the topic almost vanishing from case reports during the eighteenth century. But we know from the careful work of legal historians that the idea of assumpsit was thoroughly embodied in law by that time. So the height of the timeline does not show how much litigants and judges actually depend on a particular idea at a given point in time.50 We know in fact from the detailed legal history that assumpsit was more and more accepted in the late sixteenth century, became authoritative early in the seventeenth century, and was elaborated in many cases in subsequent decades. Therefore, the height of the timeline in a particular year is informative of the rate of development of doctrines in that year rather than the use of the doctrine. This is a reflection of the obvious: litigants do not waste time litigating elements of the law that are accepted by all; judges emphasize the matters that are in dispute; and writers of case reports attract readers by telling them something new, rather than by rehashing settled matters.

To explore this logic, GM build a simple evolutionary model of the production of case reports. Here the logic can be easily explained using a simple analogy with a subject that is painfully familiar to us all. The spread of an idea is like the spread of a virus. The inverted U is like the pandemic curve that we all want to see flattened. Case reports will show a lot of attention to an idea when it is relatively new and becoming more important, just as the count of positive tests for the presence of a virus will rise when the pandemic is becoming very serious. Once an idea is old, it will not show up in the body of case reports, just as there will no longer be many new infections when herd immunity arrives.

50. Note that it is entirely possible that the topic Assumpsit vanished from cases in the early eighteenth century while the word “assumpsit” was used in a considerable number of case reports from that era. This is possible because topics reflect the co-occurrence of related words rather than only the frequency of single words. When the word “assumpsit” is used in later cases it might be invoked very briefly to reference a huge area of law without being accompanied by many words that were necessary to use in earlier cases, before the notion of assumpsit became readily accepted.

Figure 1. Finance topics over time.

Thus the timelines provide a very simple answer to the question of when various aspects of legal development occurred. They are crude, missing many nuances of the legal record, but that is the cost of trying to summarize masses of data in a parsimonious way. That element of simplicity is present in all statistical work endeavoring to extract simple core facts from masses of data. A non-machine-learning approach to answering this question would necessarily involve deeper investigation into how the language was being used in individual reports and how those reports resonated with the wider context.
As in many instances in this article, we emphasize that the two different approaches, our statistical distant reading, and the more traditional close reading, are complements. The former is much more likely to reflect the development of ideas within a broad swathe of all cases, including lesser ones. The latter would naturally reflect a narrower set of cases found to be especially influential.

Given that we can view the height of the timeline at any moment as capturing the incremental rate of development of legal ideas, there is an even simpler way to summarize the cumulative development of the law. This will be especially useful in interpreting the information that appears in the next two sections. Given the evolutionary logic, for any specific topic, one can calculate the year that marks the passing of the halfway mark of all legal development that did occur during 1550–1750. (Think of the virus analogy when a vaccine is not available: we could find the precise year in which the proportion of the population that had been infected passed 50%.) We have made this calculation for all fifteen topics included in Figure 1, and the relevant years are marked on that figure with vertical lines. To take the example of Assumpsit again, the vertical line is placed at 1631, indicating that half of the legal development pertinent to Assumpsit that would occur during 1550–1750 actually had occurred by 1631. Even though the landmark decision, in Slade’s case, was rendered in 1602, our data summary suggests that much development of related law still occurred after that decision. This is not surprising: landmark cases establish a principle that needs to be fully articulated in a variety of settings.
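The timeline and halfway-year calculations described above can be sketched in a few lines of Python. The sketch is illustrative only and is not taken from GM: it assumes that each report carries a year and an estimated share for one topic, that a year's height is the simple mean of those shares, and that smoothing is a centered moving average. The function names `yearly_attention`, `smooth`, and `halfway_year`, and the 11-year window, are hypothetical choices made for this example.

```python
import numpy as np

def yearly_attention(report_years, topic_shares, start=1550, end=1750):
    """Mean share of one topic among the reports dated to each year (NaN if none)."""
    report_years = np.asarray(report_years)
    topic_shares = np.asarray(topic_shares, dtype=float)
    years = np.arange(start, end + 1)
    heights = np.full(years.shape, np.nan)
    for i, year in enumerate(years):
        in_year = report_years == year
        if in_year.any():
            heights[i] = topic_shares[in_year].mean()
    return years, heights

def smooth(heights, window=11):
    """Centered moving average that ignores years without any reports."""
    heights = np.asarray(heights, dtype=float)
    half = window // 2
    out = np.full(len(heights), np.nan)
    for i in range(len(heights)):
        chunk = heights[max(0, i - half): i + half + 1]
        if not np.isnan(chunk).all():
            out[i] = np.nanmean(chunk)
    return out

def halfway_year(years, heights):
    """First year by which at least half of the cumulative attention has accrued."""
    years = np.asarray(years)
    # Treat years with no data as contributing no development (an assumption).
    h = np.nan_to_num(np.asarray(heights, dtype=float), nan=0.0)
    cumulative = np.cumsum(h)
    target = 0.5 * cumulative[-1]
    # cumulative is non-decreasing, so searchsorted finds the first index >= target.
    return int(years[np.searchsorted(cumulative, target)])
```

Under these assumptions, a call such as `halfway_year(*yearly_attention(report_years, shares))` would return the year at which a vertical line of the kind shown in Figure 1 could be drawn; weighting reports by length or using a different smoother would change the details but not the logic.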
One of the findings that is immediate from a quick perusal of the timelines and dates in Figure 1 is that several pertinent areas of law were substantially settled well before 1688, the period typically given short shrift in the study of English financial arrangements in the economics literature. Significantly, even late developers such as Implementing Trusts and Negotiable Bills and Notes show spikes in attention during the third quarter of the seventeenth century. Well before the Glorious Revolution, there was broad acceptance by the legal profession of many of the ideas relevant to modern finance. The financial revolution in England was occurring throughout the seventeenth century, at least as far as the development of pertinent legal ideas was concerned. This is decidedly not the picture that emerges from the main strands of the relevant literature in economics. At the same time, it would be difficult to make this precise conclusion from the traditional legal-historical literature alone: we are not aware of any scholar who has stated this conclusion, let alone documented it in as precise a manner as our use of topic-modeling data does.

Examining the early and late developing topics in Figure 1, it is clear that the areas of law that developed early are rather broad, in the sense that they are not about specific financial instruments per se, but rather about more general areas of law, where progress is perhaps a pre-condition for the use of specific financial instruments. The earliest developing areas are Assumpsit, Bonds, Identifying Contractual Breach, and Pleadings on Debt, all of which are relevant to a wide spectrum of economic activity.
In contrast, the areas of law that developed later pertain to much more specific financial arrangements such as Bankruptcy, Mortgages, and Negotiable Bills and Notes.

More generally, for the reader interested in areas of law beyond finance, recall that Figure 1 focuses on just 15 of the 100 topics. Many different lessons on the development of various areas of law could be extracted from the complete set of timelines presented in GM.

IV. Uncovering Interconnections: The Links between Finance and Other Areas of Law

We know that a report of a case will normally refer to many different legal ideas, even though the decision in a particular case usually hinges on one particular aspect of law.51 Detailed rules on Repaying Debt are formulated in the context of earlier developments in Assumpsit and Bonds, for example. Therefore important insights about legal development can be obtained by examining whether case reports emphasizing one particular topic also emphasize other specific topics. Co-occurrence of two topics at the case-report level is evidence of complementarity in the use of legal ideas. It shows that the corresponding topics aid each other in expressing a specific set of ideas, indicating a shared conceptual foundation.

This is (positive) topic correlation, a measure of the degree to which a pair of topics tend to be mentioned in the same case reports. Finding those topic pairs with the largest positive correlations is a first step in detecting associations between different areas of legal development. If one finds that topics X and Y are highly correlated and, furthermore, that X developed earlier than Y, then that is suggestive of causality, with X an input into Y rather than vice versa. For example, the development of law relevant to Bonds is more likely to have provided input into the development of law on Repaying Debt than vice versa, given that these topics are strongly positively correlated and given the information on their timing in Figure 1.

51. DiMaggio et al., “Exploiting Affinities,” 582, point out, in a rather different context, that topic-modeling’s assumption of many ideas mixed in a single text provides a significant advantage: “[A] virtue of topic modeling is its deep affinity to the central insight in the sociology of culture that texts do not necessarily reflect a single perspective but are often characterized by heteroglossia, the co-presence of competing ‘voices’ – perspectives or styles of expression – within a single text.”

To illustrate these considerations, consider the justifiably uncelebrated case of Alcock v Blowfield, heard by the King’s Bench in the third year of the reign of Charles I.52 The case report is an unusual one because one topic dominates: Assumpsit accounts for 69% of the case according to the GM topic-model estimates. Procedural Rulings on Actions accounts for a further 5% of the case report. If this pattern were repeated over a sufficient number of case reports, then one would find that these two topics, one contract and the other procedural, would be correlated with each other. This is indeed the case, with these two topics exhibiting a correlation of 0.25, a rather high level of inter-relationship.
However, since the corresponding areas of law were developing at the same time (see Figure 1), we have no strong indication of the direction of causality for this particular topic pair.

It is worth emphasizing that Alcock v Blowfield is just one of the 52,949 case reports in the data. The type of information given in the previous paragraph is available for all cases. Because the GM topic model produces data on the proportion of each report that is occupied by each of the 100 topics, it is then trivial to compute, across reports, the correlations between topics in their usage. By providing information about the connection between apparently disparate cases, the statistical analysis offers clues that might ultimately be helpful to the more traditional type of analysis usually undertaken by legal historians. Moreover, if the correlations are based on subtle connections between topics that appear in many cases, their existence might be very difficult to detect without quantitative tools: the computer is “a device that extends the range of our perceptions to phenomena too minutely disseminated for our ordinary reading. The computer is. . .being asked to help the researcher perceive patterns at a finer-than-human level of granularity.”53

52. Alcock v. Blowfield (1627) 95 E.R. 74, 1061.

53. Flanders, “Detailism,” 57.

A. The Criteria for Displaying Connections and the Resultant Network of Legal Ideas

With 100 topics, there are 4,950 distinct correlations and therefore there is a need to focus on the most important. We consider only correlations that are greater than 0.15, of which there are only eighty-five: these are the strongest 2% of the correlations. We are interested primarily in the fifteen finance-related topics.
Nevertheless, in examining the development of law related to finance it is important to focus not only on these fifteen topics, but also on any topics that are related to them, since law outside finance can surely influence the development of finance-related law. In examining correlations, we therefore include all topics related to a finance topic via at most two steps: a non-finance topic is included if it has a correlation greater than 0.15 with any topic that has a correlation of greater than 0.15 with a finance-related topic. This leaves us with fifty-seven links to study, half of which are direct links to the finance topics themselves. From this fact alone, an interesting observation arises. Two-thirds of the most important links in our data, fifty-seven of eighty-five, connect to finance, and one third are directly connected to finance topics. In contrast, finance topics are only 15% of all topics. This is evidence that the development of law related to finance is at the center of English legal developments in the period under study.

Focusing on the top 2% of correlations is a very stringent criterion, forced upon us by a combination of two factors. First, parsimony is essential to extracting lessons from overwhelming amounts of data. Second, we are examining an area of law that seems to have many connections with other areas of law. However, if a reader were interested in burrowing down into an area of law that was much less broadly connected with other areas, a weaker criterion for the size of the correlation could be used: the narrowness of the area of the law would provide its own parsimony.54

Even fifty-seven correlations are hard to parse if one solely focuses on a list of topics and their associated correlations. In this case a picture is certainly worth a thousand words. We present our findings with the aid of Figure 2.
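The selection criteria just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the GM implementation: it assumes the topic-model output is available as a reports-by-topics matrix of proportions, that the correlation in question is the ordinary Pearson correlation across reports, and that a link is retained when both of its endpoints lie within two steps of a finance topic. The function names `topic_correlations`, `strong_links`, and `links_near_finance` are hypothetical.

```python
import numpy as np

def topic_correlations(doc_topic):
    """Pearson correlations between the topic columns of a reports-by-topics matrix."""
    # rowvar=False treats each column (topic) as a variable and each row (report)
    # as an observation. A topic with constant share would yield NaN entries.
    return np.corrcoef(np.asarray(doc_topic, dtype=float), rowvar=False)

def strong_links(corr, threshold=0.15):
    """Unordered topic pairs (i, j), i < j, whose correlation exceeds the threshold."""
    n = corr.shape[0]
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if corr[i, j] > threshold]

def links_near_finance(links, finance):
    """Keep the links whose endpoints are both within two steps of a finance topic."""
    finance = set(finance)
    # Step one: topics strongly linked to a finance topic.
    one_step = set(finance)
    for i, j in links:
        if i in finance:
            one_step.add(j)
        if j in finance:
            one_step.add(i)
    # Step two: topics strongly linked to any step-one topic.
    two_step = set(one_step)
    for i, j in links:
        if i in one_step:
            two_step.add(j)
        if j in one_step:
            two_step.add(i)
    return [(i, j) for i, j in links if i in two_step and j in two_step]
```

With 100 topics there are 100 × 99 / 2 = 4,950 unordered pairs, so eighty-five links above the threshold amount to a little under 2% of all correlations. Lowering `threshold`, as suggested for narrower areas of law, simply admits more pairs into the same procedure.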
All relevant topics and connections, given the above criteria, appear in the diagram: there are fifteen finance topics, twenty-four non-finance topics that are related to the fifteen finance topics, and fifty-seven connections, indicated by dashed lines. The names of the fifteen finance topics are capitalized to distinguish them.55 The topic names are accompanied by the estimate of the mid-year of topic development discussed in the previous section.

54. For example, if one were interested in the workings of the Poor Laws one might want to examine topics related to Geographic Settlement of Children. Then one would be led to examine a narrow but interesting set of topics: Reviewing Local Orders, Employment of Apprentices and Servants, Decisions after Criminal Conviction, and Clarifying Legislative Acts.

55. For an understanding of what the finance topic names signify, the reader is directed to Table 1. For reasons of brevity, similar discussions of the topic names for non-finance topics are omitted, with the reader referred to the relevant elements of GM. After the publication of GM, one topic name that appears in Figure 2 was reconsidered and changed. Interacting in Court has been changed to Decisional Logic, with the renaming prompted by a further reading of the case reports that most use this topic.

B. Insights from the Network of Topics Related to Finance

What such a diagram has the potential to offer is the easy detection of patterns that indicate broad lessons in the development of the law. These patterns are readily found in Figure 2 and they are not difficult to interpret. The core finance-related topics are in a block in the lower left of the diagram, with many interconnections between them.
To the right of these is a set of topics whose development was concentrated in the first half of the seventeenth century. These topics are related most closely to contract law and to procedural developments relevant to litigants pursuing contract cases in court. The fact that Assumpsit, an early topic, is connected with procedural topics suggests that the procedural rigor of early common law was of key importance in addressing matters of debt. Above these topics is a small block of very early developing areas of law connected to transfer of ownership of property or transfer of the right to use the property, for example on leases. The reason for the connection between these and the broader elements of contract law is transparent.

The largest contrast is between the topics in the lower right of the diagram, connected to contract, and those in the upper left of the diagram. The latter group focuses on inheritance and wills. Those are topics whose development came much later in the seventeenth century than the contract-related topics discussed in the previous paragraph. The topics in the upper left focus on inheritance and wills and mainly concern property issues, as is inevitable given the importance of land as the basis of family relationships at that time. And given the importance of trusts in dealing with these complicated family-inheritance relationships it is not surprising that the topic Implementing Trusts should be intimately connected to this block of topics.

Figure 2. Interconnections of finance topics.

If one wanted to tell an overarching story of development of case law and legal ideas relevant to finance that is evoked by this figure but removed from the nuances of specifics, it would be the following.
Early stirrings of an agricultural revolution and the growth of the rural textile industry stimulated a market in the transfer of land-use rights. This led to cases concerning disputes on leases and rentals, which in turn spurred refinements in contract law. Such refinements were closely associated with the development of court procedures that channeled contract disputes as they entered the court system. These developments naturally fed into the law relevant to the exchange of financial property and to the debts that arose as a result. But a separate relationship was with the law relevant to both property and the family because the types of arrangements that are so important for finance, trusts and mortgages, for example, were intimately connected with the way in which English families were trying to structure their inheritance arrangements. Given the timing of events, it seems that the two areas of law, finance and family-inheritance, were developed in tandem, rather than one obviously being the precursor of the other.

For the reader interested in examining interconnections among different areas of law, we must emphasize that we have only provided one example of many different analyses that could be carried out using as data the correlations derived from the topic-modeling exercise. As far as we are aware, there exists no network analysis on any subject in the pre-industrial legal history literature that is similar to the one explored in Figure 2, even though some aspects of the connections appearing in that figure have certainly been known to legal historians.
Where topic-modeling goes beyond what already exists in the legal history literature is that it is a tool to tell a broader story, leveraging a comprehensive set of cases, picking up patterns that might be reflected only in the repetition of thousands of minute sections of text, introducing easily-understood quantifications, and facilitating the use of visualizations that aid the genesis of fresh legal-historical insights.

V. Law Versus Equity in Case Law and Legal Ideas on Finance

In examining the development of legal doctrines, legal historians are very careful to differentiate between law and equity, between the activities of the common-law courts and those outside this system, particularly the Court of Chancery.56 Nevertheless, this distinction is not made as clearly as it should be in the related economics literature, especially when interpreting the development of law on finance and understanding the strengths of the English legal system. Our met