Christian Boulanger, Naomi Creutzfeldt and Jen Hendry
This post is the third in a series of three blog posts that accompany our paper “The Journal of Law and Society in Context: A Bibliometric Analysis”, published in the Journal of Law and Society (vol. 51, issue 1, 2024). In these posts, we expand on the methodological aspects of our analysis and share visualisations that could not be included in the published article.
Across the posts, we explore three different types of analysis:
- Descriptive analyses of bibliographic metadata
- Text-linguistic analyses of content (metadata or full text), and
- Network analyses of citation graphs computed from existing and self-generated data.
Network Analysis of Citation Graphs
The third type of analysis we relied upon, and which we explain here, is bibliometric analysis based on citation data. These methods are most commonly used to measure the ‘reach’ of certain authors, or even institutions (for a critical appraisal, see MacRoberts & MacRoberts 2018). The potential of this type of analysis for studies of the history and sociology of scholarly knowledge production has yet to be fully realised, largely due to the lack of data upon which such research could be based.
While citation data can be used simply to rank authors by the number of citations, more complex analyses have the potential to unearth relationships that would be almost impossible to decipher and/or plot manually. For example, bibliographic coupling and co-citation are two methods that have been developed to assess the similarity of research papers. Whereas bibliographic coupling measures the degree to which two papers share bibliographic references, in co-citation analysis two papers are more similar if they are cited together in later literature. Both methods work best with large datasets, typically much larger than our relatively small JLS dataset.
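To make the distinction concrete, here is a minimal sketch in Python using only the standard library. The paper and work labels are invented toy data, not taken from the JLS dataset, and the function names are hypothetical. It is an illustration of the two measures, not the code used for our analysis.

```python
from collections import Counter
from itertools import combinations

# Invented toy data (an assumption for illustration only):
# each citing paper maps to the set of works in its reference list.
references = {
    "paper_A": {"work_1", "work_2", "work_3"},
    "paper_B": {"work_1", "work_2"},
    "paper_C": {"work_3", "work_4"},
}


def bibliographic_coupling(refs):
    """Count the references shared by every pair of citing papers."""
    strength = Counter()
    for a, b in combinations(sorted(refs), 2):
        shared = len(refs[a] & refs[b])
        if shared:
            strength[(a, b)] = shared
    return strength


def co_citation(refs):
    """Count how often every pair of cited works shares a reference list."""
    counts = Counter()
    for cited in refs.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


coupling = bibliographic_coupling(references)
cocited = co_citation(references)

for pair, n in coupling.most_common():
    print("coupled:", pair, n)
for pair, n in cocited.most_common():
    print("co-cited:", pair, n)
```

Coupling compares the citing papers (what they draw on), whereas co-citation compares the cited works (how later literature groups them). Aggregating by author, as in an author-level co-citation network, would only require replacing the work labels with author names.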
In Fig. 1, we see a co-citation analysis of all publications in our JLS dataset aggregated by author. For better readability, the network has been limited to authors co-cited in at least ten contributions. Immediately noticeable from this visual representation, which can be interactively explored, is the strong presence of systems theory (aka the theory of autopoietic social systems), and the high number of co-citations between Niklas Luhmann and Gunther Teubner, alongside a cluster of ‘French’ theory, featuring Michel Foucault, Pierre Bourdieu, Bruno Latour, and Jacques Derrida. Such analyses are traditionally done on static datasets of citation data covering longer periods of time, such as the one in Fig. 1. Because our interest is specifically in the changes over the JLS life course, as it were, we have created time series of citation networks over the past five decades of the JLS.
Co-citation and bibliographic coupling are well-established methods in bibliometrics. However, our interest was not in either impact assessment or document classification. Rather, we wanted to find out which authors or themes were the most influential within the JLS corpus. For our article, we generated two additional types of visualisations, which are less common. Using 10-year time slices, we created (a) a network graph of the most-cited authors connected to the authors that cite them most. We also traced the reverse relationship by generating (b) a graph of the most-published authors connected to the authors they cited most. Both graphs were limited to the top 20 or 10 citing or cited authors. While this was necessary in order to keep the graphs readable, it has to be kept in mind that such cut-offs are arbitrary (why 20 and not 25, for example?), and results can vary depending on what data falls within these limits. A similar problem arises when temporal or spatial data points need to be separated into more or less arbitrary slices, or “windows”, and then aggregated (such as our ten-year time slices). Dividing the data in other ways might lead to very different observations.
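The effect of such choices can be sketched in a few lines of Python. The citation records, the author labels, the function names, and the default window start of 1984 are all assumptions made for this illustration. The sketch does not reproduce the code or the slice boundaries used for our article.

```python
from collections import Counter, defaultdict

# Invented toy records (an assumption): (year, citing author, cited author).
citations = [
    (1985, "author_1", "author_X"),
    (1988, "author_2", "author_X"),
    (1989, "author_2", "author_Y"),
    (1996, "author_1", "author_Y"),
    (1997, "author_3", "author_Y"),
    (1999, "author_3", "author_Z"),
]


def slice_start(year, first_year=1984, width=10):
    """Return the first year of the fixed-width window containing `year`."""
    return first_year + ((year - first_year) // width) * width


def top_cited_edges(records, top_n=20, first_year=1984, width=10):
    """Per time slice, keep citing->cited edges of the top_n most-cited authors."""
    by_slice = defaultdict(Counter)
    for year, citing, cited in records:
        by_slice[slice_start(year, first_year, width)][(citing, cited)] += 1

    result = {}
    for start, edges in by_slice.items():
        # Total citations received by each author within this slice.
        cited_totals = Counter()
        for (_, cited), n in edges.items():
            cited_totals[cited] += n
        keep = {author for author, _ in cited_totals.most_common(top_n)}
        result[start] = {e: n for e, n in edges.items() if e[1] in keep}
    return result


# Same data, same cut-off, two different window boundaries.
graphs = top_cited_edges(citations, top_n=1)
shifted = top_cited_edges(citations, top_n=1, first_year=1989)

for label, result in (("default", graphs), ("shifted", shifted)):
    for start in sorted(result):
        print(label, start, sorted(result[start]))
```

In this toy example, moving the window boundary by five years is enough for author_Z, who never ranks first under the default windows, to become the most-cited author of a slice. The same sensitivity applies to the `top_n` cut-off.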
That said, as we argue in our published article, while visualisation (b) shows some overlap in the literature used by these most-published authors, what is most interesting is how the topology (mostly radial structures with few connecting nodes) shows those overlaps as being quite small. This suggests that the literature used by the 20 most-published authors in the JLS in each period is extremely diverse and, importantly, does not orbit around any established canon. In order to arrive at a deeper understanding of what in these network representations is specific to the JLS, we would need a much larger dataset and would also need to run comparative analyses.
The JLS in the journal landscape
In this part, we consider the JLS in relation to other leading journals in the field, both nationally and internationally. One way of situating the JLS in the topography of journals is to look at citation relationships between and across them; specifically, which journals are cited by articles in the JLS, and which cite the JLS in turn.
Since our own JLS dataset only contains outgoing citations, to obtain a more complete picture we also ran a query in both the commercial Web of Science (WoS) database and in the free and open OpenAlex database using the following selection criteria: taking the 100 most-cited journals within the JLS, we selected all articles that cited either the JLS or one of these other journals. Since neither the WoS nor OpenAlex contains information on the BJLS, the resulting dataset covers only the JLS. In what follows, it should be borne in mind that these datasets were not created at the same time: the original JLS dataset is from 2021, whereas the other two have data leading up to 2023, which means that they are not directly comparable. Also, the WoS and OpenAlex databases differ significantly in their coverage. However, since we are only looking for trends and do not rely on exact numbers or rankings, our interpretations should still be broadly valid.
Based on the citation data in our dataset, Fig. 2 shows the most-cited journals in the JLS (including itself).
To check our results, we ran the same query in the WoS dataset (Fig. 3).
The results differ somewhat, but both confirm that articles published in the JLS cite other JLS articles most, with around twice as many citations (even more in the WoS data) as the next most-cited publications, the Modern Law Review and the Law & Society Review. Particularly noticeable is the prominence of criminological journals, such as the British Journal of Criminology (placed 3rd in the more recent WoS data) and the Criminal Law Review, speaking to the enduring connection between socio-legal studies and criminology within the UK. Interestingly, except for Social & Legal Studies and Law & Social Inquiry (which only appears in the WoS results), few other explicitly socio-legal journals make an appearance.
What about the journals that most frequently cite JLS articles? Using WoS data, we see a similar, although not identical, list of journals (Fig. 4). One difference worthy of note is that, although U.S. journals are heavily cited by JLS contributing authors, these journals, with the exception of Law and Social Inquiry, do not cite the JLS often enough to appear in the data.
These analyses are based on the whole dataset across all years, however, and so cannot identify temporal trends. We therefore decided to analyse the evolution of citation relationships between journals over time. An important caveat is that the WoS/OpenAlex data is incomplete: because it does not cover all the journals fully, the more fine-grained our analyses, the greater the danger of statistical artefacts. Also, the further back in time we go, the less reliable the WoS/OpenAlex data is, meaning that any numerical change measured must be taken with more than a grain of salt. This is why we are not providing any numbers in the description of the experiment below. Moreover, since we cannot publish the WoS data, our results cannot be validated by those interested.
Having said this, as an experiment, we created a citation graph (Fig. 5), based on WoS data, of the twenty journals that cite the JLS most and are reciprocally most cited by the JLS over the past five decades.
To be included, a journal had to cite the JLS on average at least once per year or be cited in the JLS at least once a year, again on average. Unfortunately, there is no data prior to 1984, but from the mid-1980s onwards the graphs show developments within the JLS’ citation graph. For the 1984-1993 period, for example, we see that most citations in the JLS (as indexed in the Web of Science) are of the International Journal of the Sociology of Law, Social and Legal Studies, Law and Social Inquiry, the Law and Society Review (LSR), the British Journal of Criminology (BJC) and the Criminal Law Review. This situation stayed mostly the same from the mid-1990s to the early 2000s, except for Public Administration joining the list. Ten years later, in 2004, while the strongest connections are still to the BJC, LSR, and Social & Legal Studies, we see a new link forming with the Modern Law Review, and the JLS starts featuring in citations in such diverse journals as Feminist Legal Studies, Legal Studies, and the Medical Law Review. Progress another ten years, and by 2014 Regulation and Governance and the International Journal of Law and Context both join the group of journals citing the JLS in significant numbers, as does the Journal of Legal History, albeit on a smaller scale.
Finally, with a view to pinpointing where the JLS sits in the overall journal landscape, we transformed the WoS and OpenAlex datasets into interactive citation networks. To be eligible for inclusion in a network, a journal had to meet the criterion of having a minimum of 25 citations in the complete dataset. Using the Louvain community detection method, we clustered the journal nodes according to the number of relationships they have with each other, and the strength of these relationships (for more technical detail, see here).
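For readers unfamiliar with the method: the Louvain algorithm searches for a partition of the network that maximises a score called modularity, which rewards groupings with many strong internal links relative to what the nodes' overall connectivity would lead one to expect. The following Python sketch computes that score for hand-picked partitions of an invented journal network. The journal labels, the weights, the function names, the treatment of citations as undirected links, and the reading of the threshold as total citations involving a journal are all assumptions made for illustration. It is not the code behind Figs. 6 and 7.

```python
from collections import defaultdict

# Invented toy data (an assumption): citation counts between journal pairs,
# treated here as undirected edge weights.
edges = {
    ("journal_A", "journal_B"): 40,
    ("journal_A", "journal_C"): 30,
    ("journal_B", "journal_C"): 35,
    ("journal_D", "journal_E"): 50,
    ("journal_C", "journal_D"): 5,
    ("journal_E", "journal_F"): 3,
}


def filter_by_citations(edge_weights, minimum=25):
    """Drop journals whose total citation weight is below `minimum`."""
    totals = defaultdict(int)
    for (a, b), w in edge_weights.items():
        totals[a] += w
        totals[b] += w
    keep = {j for j, t in totals.items() if t >= minimum}
    return {e: w for e, w in edge_weights.items()
            if e[0] in keep and e[1] in keep}


def modularity(edge_weights, communities):
    """Weighted modularity Q of a partition given as a list of node sets."""
    total = sum(edge_weights.values())
    degree = defaultdict(float)
    for (a, b), w in edge_weights.items():
        degree[a] += w
        degree[b] += w
    q = 0.0
    for community in communities:
        # Weight of edges with both endpoints inside the community.
        internal = sum(w for (a, b), w in edge_weights.items()
                       if a in community and b in community)
        # Summed weighted degree of the community's nodes.
        strength = sum(degree[n] for n in community)
        q += internal / total - (strength / (2 * total)) ** 2
    return q


kept = filter_by_citations(edges)
clustered = modularity(
    kept, [{"journal_A", "journal_B", "journal_C"}, {"journal_D", "journal_E"}]
)
lumped = modularity(
    kept, [{"journal_A", "journal_B", "journal_C", "journal_D", "journal_E"}]
)
print("two communities:", round(clustered, 3))
print("one community:", round(lumped, 3))
```

The two-community split scores higher than lumping all journals together, which is the kind of difference the Louvain search exploits. In practice one would let a library routine find the partition, for example `louvain_communities` in NetworkX, rather than scoring partitions by hand. Note that the outcome depends entirely on which citations made it into the edge weights in the first place.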
With the Louvain algorithm, we observed the emergence of distinct journal communities that aligned broadly with our expectations, with only a few exceptions. Comparing the communities to which the JLS was computationally assigned also highlighted the significant differences in the source data between the Web of Science and OpenAlex.
In the network cluster that was computed from the Web of Science data (Fig. 6), the JLS was placed in a community with a strong presence of criminology journals, such as the British Journal of Criminology or Criminal Law Review, to which there are also strong direct citation relationships.
In contrast, the data from OpenAlex suggested a very different community (Fig. 7), one that consists much more strongly of “Law and Society” and doctrinal law journals.
As interesting as these visualisations are, these differences remind us once again that we cannot take them as representations of something “real”. What we can see is what went into the dataset, and unless we know exactly how the data was produced (which is impossible in the case of commercial database vendors), the results cannot provide definitive answers. Among other things, this counsels against the use, in the socio-legal realm, of “impact factor” metrics provided by commercial databases such as the Web of Science or Scopus, as their data is highly selective and therefore unreliable (Martin & Martin 2018). That said, the results are valuable for providing puzzles that invite further investigation. Carefully tailored research questions help to narrow down the amount of data needed, which in turn makes it possible to run quality checks on a manageable dataset.
In closing, we want to make several points about what data is necessary to continue bibliometric analyses of socio-legal scholarship and get even deeper insights into it than what we were able to get in our paper. As evident from our investigation, the success of such analyses is very much dependent on the existence of extensive, high-quality, and freely available data.
First, there is an urgent need to work towards removing the dependency on expensive and copyright-encumbered databases, such as the Web of Science. Scholars should have free and unrestricted access to metadata on the knowledge that they participate in creating, as offered, for example, by openalex.org. Socio-legal journals should therefore support initiatives such as the Initiative for Open Citations (https://i4oc.org/). For this to work, new submission policies will have to be developed that encourage the submission of citation metadata along with research articles, just as one would submit the raw empirical data that a paper is based on. The usability of citation data also greatly depends on the disambiguation and correct linking of author names to the actual creators of the work and the institutions to which they belong. For current scholarship, it would go a long way towards this goal if database vendors, publishers, and authors made use of https://orcid.org/ to uniquely identify the authors of scholarly works.
Second, we need indicators of how good the coverage of socio-legal literature in the existing databases is. The validity of the large-scale analyses presented in part 3 of this post hinges on the assumption that the data is representative of the actual number of citations in the published scholarship, rather than being an artefact of the choice of the vendor or the performance of the citation mining software used. There have been a few general studies (for example, Visser et al. 2021), but none is specific enough for the domain of socio-legal studies.
Finally, to allow research into past knowledge production, it is critical to collaborate with Digital Humanities and Computer Science scholars to find new ways to reliably extract the references from older (printed) literature, so that we do not need to wait for commercial vendors to produce the data for us (see also Colavizza & Romanello 2019). This kind of interdisciplinary collaboration requires some effort, since socio-legal and computational studies speak very different languages.
The figures, code, and data from our study are available to the public in an open-access GitHub repository. This type of analysis is still in its infancy, and we do not claim completeness; these contributions need to be understood as preliminary. Our aim is to provide the start of a conversation, rather than results set in stone. We are not bibliometricians: our interest in the use of these methods is to tease out the history of socio-legal ideas, the agency of scholars and the role of institutions, in a qualitative sense. We expect and welcome critical feedback from both traditions: qualitative-hermeneutical history of ideas scholars as well as those who use quantitative methods from bibliometrics to pursue questions in the history of science.