Prior to the COVID-19 pandemic, debate mounted concerning the wisdom, and perhaps the inevitability, of moving swaths of the court system online. These debates took place within a larger discussion of how and why courts had failed to perform their most basic function of delivering (approximately) legally accurate adjudicatory outputs in cases involving unrepresented litigantsFootnote 1 while inspiring public confidence in them as a method of resolving disputes compared to extralegal and even violent alternatives. Richard Susskind, a longtime critic of the status quo, summarized objections to a nearly 100 percent in-person adjudicatory structure: “[T]he resolution of civil disputes invariably takes too long [and] costs too much, and the process is unintelligible to ordinary people.”Footnote 2 As Susskind noted, fiscal concerns drove much of the interest in online courts during this time, while reformists sought to use the move online as an opportunity to expand the access of, and services available to, unrepresented litigants.
Yet as criticisms grew, and despite aggressive action by some courts abroad,Footnote 3 judicial systems in the United States moved incrementally. State courts, for example, began to implement online dispute resolution (ODR), a tool that private-sector online retailers had long used.Footnote 4 But courts had deployed ODR only for specified case types and only at specified stages of civil proceedings. There was little judicial appetite in the United States for a wholesale transition.
The COVID-19 pandemic upended the judicial desire to move slowly. As was true of so many other sectors of United States society, the judicial system had to reconstitute online, and had to do so convulsively. And court systems did so, in the fashion in which systemic change often occurs in the United States: haphazardly, unevenly, and with many hiccups. The most colorful among the hiccups entered the public consciousness: audible toilet flushes during oral argument before the highest court in the land; a lawyer appearing at a hearing with a feline video filter that he was unable to remove himself; jurors taking personal calls during voir dire.Footnote 5 But there were notable successes as well. The Texas state courts had reportedly held 1.1 million proceedings online as of late February 2021, and the Michigan state courts’ live-streaming on YouTube attracted 60,000 subscribers.Footnote 6
These developments suggest that online proceedings have become a part of the United States justice system for the foreseeable future. But skepticism has remained about online migration of some components of the judicial function, particularly around litigation’s most climactic stage: the adversarial evidentiary hearing, especially the jury trial. Perhaps believing that they possessed unusual skill in discerning truthful and accurate testimony from its opposite,Footnote 7 some judges embraced online bench fact-finding. But the use of juries, both civil and criminal, caused angst, and some lawyers and judges remained skeptical that the justice system could or should implement any kind of online jury trial. Some of this skepticism waxed philosophical. Legal thinkers pondered, for example, whether a jury trial is a jury trial if parties, lawyers, judges, and the jurors themselves do not assemble in person in a public space.Footnote 8 Some of the skepticism about online trials may have been simple fear of the unknown.Footnote 9
Other concerns skewed more practical and focused on matters for which existing research provided few answers. Will remote trials yield more representative juries by opening proceedings to those with challenges in attending in-person events, or will the online migration yield less-representative juries by marginalizing those on the wrong side of the digital divide? Will remote trials decrease lawyer costs and, perhaps as a result, expand access to justice by putting legal services more within the reach of those with civil justice needs?Footnote 10 Time, and more data, can answer these questions, if the political will exists to find out.
But two further questions lie at the core of debate about whether online proceedings can produce approximately accurate and publicly acceptable fact-finding from testimonial evidence. The first is whether video hearings diminish the ability of fact-finders to detect witness deception or mistake. The second is whether video as a communication medium dehumanizes one or both parties to a case, causing less humane decision-making.Footnote 11 For these questions, the research provides an answer (for the first question) and extremely useful guidance (for the second).
This chapter addresses these questions by reviewing the relevant research. By way of preview, our canvass of the relevant literatures suggests that both concerns articulated above are likely misplaced. Although there is reason to be cautious and to include strong evaluation with any move to online fact-finding, including jury trials, available research suggests that remote hearings will neither alter the fact-finder’s (nearly non-existent) ability to discern truthful from deceptive or mistaken testimony nor materially affect fact-finder perception of parties or their humanity. On the first point, a well-developed body of research concerning the ability of humans to detect when a speaker is lying or mistaken shows a consensus that human detection accuracy is only slightly better than a coin flip. Most importantly, the same well-developed body of research demonstrates that such accuracy is the same regardless of whether the interaction is in-person or virtual (so long as the interaction does not consist solely of a visual exchange unaccompanied by sound, in which case accuracy worsens). On the second point, the most credible studies from the most analogous situations suggest little or no effect on human decisions when interactions are held via videoconference as opposed to in person. The evidence on the first point is stronger than that on the second, but the key recognition is that for both points, the weight of the evidence is contrary to the concerns that lawyers and judges have expressed, suggesting that the Bench and the Bar should pursue online courts (coupled with credible evaluation) to see if they offer some of the benefits their proponents have identified. After reviewing the relevant literature, this chapter concludes with a brief discussion of a research agenda to investigate the sustainability of remote civil justice.
A final point concerning the scope of this chapter: As noted above, judges and (by necessity) lawyers have largely come to terms with online bench fact-finding based on testimonial evidence, reserving most of their skepticism for online jury trials. But as we relate below, the available research demonstrates that to the extent that the legal profession harbors concerns regarding truth-detection and dehumanization in the jury’s testimonial fact-finding, it should be equally skeptical regarding that of the bench. The evidence in this area either fails to support or affirmatively contradicts the belief that judges are better truth-detectors, or are less prone to dehumanization, than laity. Thus, we focus portions of this chapter on online jury trials because unwarranted skepticism has prevented such adjudications from reaching the level of use (under careful monitoring and evaluation) that the currently jammed court system likely needs. But if current research (or our analysis of it here) is wrong, meaning that truth-detection and dehumanization are fatal concerns for online jury trials, then online bench adjudications based on testimonial evidence should be equally concerning.
4.1 Adjudication of Testimonial Accuracy
A common argument against the sustainability of video testimonial hearings is that, to perform its function, the fact-finder must be able to distinguish accurate from inaccurate testimony. Inaccurate testimony could arise via two mechanisms: deceptive witnesses, that is, those who attempt to mislead the trier of fact through deliberate or knowing misstatements; and mistaken witnesses, that is, those who believe their testimony even though their perception of historical fact was wrong.Footnote 12 The legal system reasons as follows. First, juries are good, better even than judges, at choosing which of two or more witnesses describing incompatible versions of historical fact is testifying accurately.Footnote 13 Second, this kind of “demeanor evidence” takes the form of nonverbal cues observable during testimony.Footnote 14 Third, juries adjudicate credibility best if they can observe witnesses’ demeanor in person.Footnote 15
Each component of this reasoning is false.
First, research shows that human ability to detect lies is only slightly better than a fifty-fifty chance, approximately 54 percent overall. For example, Bond and DePaulo, in their meta-analysis of deception detection studies, place human ability to detect deception at 53.98 percent, or just above chance.Footnote 16 Humans are really bad at detecting deception. That probably comes as a shock to many of us. We are pretty sure, for example, that we can tell when an opponent in a game or a sport is fibbing or bending the truth to get an edge. We are wrong. It has been settled science since the 1920s that human beings are bad at detecting deception.Footnote 17 The fact that judges and lawyers continue to believe otherwise is a testament to the disdain in which the legal profession holds credible evidence and empiricism more generally.Footnote 18
Second, human (in)ability to detect deception does not change with an in-person or a virtual interaction. Or at least, there is no evidence that there is a difference between in-person and virtual on this score, and a fair amount of evidence that there is no difference. Most likely, humans are really bad at detecting deception regardless of the means of communication. This also probably comes as a shock to many of us. In addition to believing (incorrectly) that we can tell when kids or partners or opponents are lying, we think face-to-face confrontation matters. Many judges and lawyers so believe. They are wrong.
Why are humans so bad at deception detection? One reason is that people rely on what they think are nonverbal cues. For example, many think fidgeting, increased arm and leg movement, and decreased eye contact are indicative of lying. None are. While there might be some verbal cues that could be reliable for detecting lies, the vast majority of nonverbal cues (including those just mentioned, and most others upon which we humans tend to rely) are unreliable, and the few cues that might be modestly reliable can be counterintuitive.Footnote 19 Furthermore, because we hold inaccurate beliefs about what is and is not reliable, it is difficult for us to disregard the unreliable cues.Footnote 20 In a study that educated some participants on somewhat reliable nonverbal cues to look for, with other participants not getting this information, participants with the reliable cues had no greater ability to detect lying.Footnote 21 We humans are not just bad at lie detection; we are also bad at being trained at lie detection.
While a dishonest demeanor elevates suspicion, it has little-to-no relation to actual deception.Footnote 22 Similarly, a perceived honest demeanor is not reliably associated with actual honesty.Footnote 23 That is where the (ir)relevance of the medium of communication matters. If demeanor is an unreliable indicator of either honesty or dishonesty, then a fact-finder loses little by forgoing whatever supposedly superior opportunity to observe demeanor an in-person interaction might provide.Footnote 24 For example, a 2015 study found that people attempting to evaluate deception performed better when the interaction was computer-mediated (text-based) rather than in person.Footnote 25 At least one possible explanation for this finding is the unavailability of distracting and unreliable nonverbal cues.Footnote 26
Despite popular belief in the efficacy of discerning people’s honesty based on their demeanor, research shows that non-demeanor cues, meaning verbal cues, are more promising. A meta-analysis concluded that cues that showed promise at signaling deception tended to be verbal (content of what is said) and paraverbal (how it is spoken), not visual.Footnote 27 But verbal and paraverbal cues are just as observable from a video feed.
If we eliminate visual cues for fact-finders and give them only an audio feed, will that improve a jury’s ability to detect deception? Unlikely. Audio-only detection accuracy does not differ significantly from audiovisual accuracy.Footnote 28 At this point, that should not be a surprise, considering the generally low ceiling of deception detection accuracy – just above the fifty-fifty level. Only in high-pressure situations is it worthwhile (in a deception detection sense) to remove nonverbal cues.Footnote 29 To clarify: High-pressure situations likely make audio-only better than audio plus visual, not the reverse. The problem for deception detection appears to be that, with respect to visual cues, the pressure turns the screws both on someone who is motivated to be believed but is actually lying and on someone who is being honest but feels as though they are not believed.
We should not think individual judges have any better ability to detect deception than a jury. Notwithstanding many judges’ self-professed ability to detect lying, the science that humans are poor deception detectors has no caveat for the black robe. There is no evidence that any profession is better at deception detection, and a great deal of evidence to the contrary. For example, those whose professions ask them to detect lies (such as police officers) cite the same erroneous cues regarding deception.Footnote 30 More broadly, two meta-analyses from 2006 show that purported “experts” at deception detection are no better at lie detection than nonexperts.Footnote 31
What about individuals versus groups? A 2015 study did find consistently that groups performed better at detecting lies,Footnote 32 a result the researchers attributed to group synergy – that is, that individuals were able to benefit from others’ thoughts.Footnote 33 So, juries are better than judges at deception detection, right? Alas, almost certainly not. The problem is that only certain kinds of groups are better than individuals. In particular, groups of individuals who were familiar with one another before they were assigned a deception detection task outperformed both individuals and groups whose members had no preexisting connection.Footnote 34 Groups whose members had no preexisting connection were no better at detecting deception than individuals.Footnote 35 Juries are, by design, composed of a cross-section of the community, which almost always means that jurors are unfamiliar with one another before trial.Footnote 36
There is more bad news. Bias and stereotypes affect our ability to flush out a lie. Females are labeled as liars significantly more than males even when both groups lie or tell the truth the same amount.Footnote 37 White respondents asked to detect lies were significantly faster to select the “liar” box for black speakers than white speakers.Footnote 38
All of this is troubling, and likely challenges fundamental assumptions of our justice system. For the purposes of this chapter, however, it is enough to demonstrate that human inability to detect lying remains constant whether testimony is received in person or remotely. Again, the science on this point goes back decades, and it is also recent. Studies conducted in 2014Footnote 39 and 2015Footnote 40 agreed that detection accuracy did not differ between audiovisual and audio-only media. The science suggests that the medium of communication – in-person, video, or telephonic – has little if any relevant impact on the ability of judges or juries to tell truths from lies.Footnote 41
The statements above regarding human (in)ability to detect deception apply equally to human (in)ability to detect mistakes, including the fact that scientists have long known that we are poor mistake detectors. Thirty years ago, Wellborn collected and summarized the then-available studies, most focusing on eyewitness testimony. Addressing jury ability to distinguish mistaken from accurate witness testimony, Wellborn concluded that “the capacity of triers [of fact] to appraise witness accuracy appears to be worse than their ability to discern dishonesty.”Footnote 42 Particularly relevant for our purposes, Wellborn further concluded that “neither verbal nor nonverbal cues are effectively employed” to detect testimonial mistakes.Footnote 43 If neither verbal nor nonverbal cues matter in detecting mistakes, then there will likely be little lost by the online environment’s suppression of nonverbal cues.
The research in the last thirty years reinforces Wellborn’s conclusions. Human inability to detect mistaken testimony in real-world situations is such a settled principle that researchers no longer investigate it, focusing instead on investigating other matters, such as the potentially distorting effects of feedback given to eyewitnesses,Footnote 44 whether witness age affects likelihood of fact-finder belief,Footnote 45 and whether fact-finders understand the circumstances mitigating the level of unreliability of eyewitness testimony.Footnote 46 The most recent, comprehensive writing we could find on the subject was a 2007 chapter from Boyce, Beaudry, and Lindsay, which depressingly concluded (1) fact-finders believe eyewitnesses, (2) fact-finders are not able to distinguish between accurate and inaccurate eyewitnesses, and (3) fact-finders base their beliefs of witness accuracy on factors that have little relationship to accuracy.Footnote 47 This review led us to a 1998 study of child witnesses that found (again) no difference in a fact-finder’s capacity to distinguish accurate from mistaken testimony as between video versus in-person interaction.Footnote 48
In short, decades of research provide strong reason to question whether fact-finders can distinguish accurate from inaccurate testimony, but also strong reason to believe that no difference exists on this score between in-person and online hearings, or between judges and juries.
4.2 The Absence of a Dehumanization Effect
Criminal defense attorneys have raised concerns of dehumanization of defendants, arguing that in remote trials, triers of fact will have less compassion for defendants and will be more willing to impose harsher punishments.Footnote 49 In civil trials, this concern could extend to either party. For example, in a personal injury case, the concern might be that fact-finders would be less willing to award damages because they are unable to connect with or relate to a plaintiff’s injuries. Or, in less protracted but nevertheless high-stakes civil actions, such as landlord/tenant matters, a trier of fact (usually a judge) might feel less sympathy for a struggling tenant and therefore show greater willingness to evict rather than mediate a settlement.
A review of relevant literature suggests that this concern is likely misplaced. While the number of studies directly investigating the possibility of online hearings is limited, analogous research from other fields is available. We focus on studies in which a decision-maker is called upon to render a judgment or decision that affects the livelihood of an individual after some interaction with that individual, much like a juror or judge is called upon to render a decision that affects the livelihood of a litigant. We highlight findings from a review of both legal and analogous nonlegal studies, and we emphasize study quality – that is, we prioritize randomized over nonrandomized designs, field over lab/simulated experiments, and studies involving actual decision-making over studies involving precursors to decisions (e.g., ratings or impressions).Footnote 50 With this ordering of pertinence, we systematically rank the literature into three tiers, from most to least robust: first, randomized field studies (involving decisions and precursors); second, randomized lab studies (involving decisions and precursors); and third, non-randomized studies. Table 4.1 provides a visual of our proposed hierarchy.
Table 4.1 Hierarchy of study designs

| Tier | Randomized? | Setting | Example |
| --- | --- | --- | --- |
| 1: RCTs | Yes | Field | Cuevas et al. |
| 2: Lab Studies | Yes | Lab | Lee et al. |
| 3: Observational Studies | No | Field | Walsh & Walsh |
According to research in the first tier of randomized field studies – the most telling for probing the potential for online fact-finding based on testimonial evidence – proceeding via videoconference likely will not adversely affect triers of fact’s perceptions of the humanity of trial participants. We do take note of the findings of studies beyond this first tier, which include some from the legal field. Findings in these less credible tiers are varied and inconclusive.
The research addressing dehumanization is less definitive than that addressing deception and mistake detection. So, while we suggest that jurisdictions consider proceeding with online trials and other innovative ways of addressing both the current crises of frozen court systems and the future crises of docket challenges, we recommend investigation and evaluation of such efforts through randomized controlled trials (RCTs).
4.2.1 Who Would Be Dehumanized?
At the outset, we note a problem common to all of the studies we found, in all tiers: none of the examined situations are structurally identical to a fact-finding based on an adversarial hearing. In legal fact-finding in an adversarial system, one or more theoretically disinterested observers make a consequential decision regarding the actions of someone with whom they may have no direct interaction and who, in fact, sometimes exercises a right not to speak during the proceeding. In all of the studies we were able to find, which concerned situations such as doctor-patient or employer-applicant, the decision-maker interacted directly with the subject of the interaction. It is thus questionable whether any study yet conducted provides ideal insight regarding the likely effects of online as opposed to in-person for, say, civil or criminal jury trials.
Put another way: If a jury were likely to dehumanize or discount someone, why should it be that it would dehumanize any specific party, as opposed to the individuals with whom the jury “interacts” (“listens to” would be a more accurate phrase), namely, witnesses and lawyers? With this in mind, it is not clear which way concerns of dehumanization cut. At present, one defensible view is that there is no evidence either way regarding dehumanization of parties in an online jury trial versus an in-person one, and that similar concerns might be present for some types of judicial hearings.
Some might respond that the gut instincts of some criminal defense attorneys and some judges should count as evidence.Footnote 51 We disagree that the gut instincts of any human beings, professionals or otherwise, constitute evidence in almost any setting. But we are especially suspicious of gut instincts in the fact-finding context. As we saw in the previous part, fact-finding based on testimonial hearings has given rise to some of the most stubbornly persistent, and farcically outlandish, myths to which lawyers and judges cling. The fact that lawyers and judges continue to espouse this kind of flat-eartherism counsels careful interrogation of professional gut instincts on the subject of dehumanization from an online environment.
4.2.2 Promising Results from Randomized Field Studies
Within our first-tier category of randomized field studies, the literature indicates that using videoconference in lieu of face-to-face interaction has an insignificant, or even a positive, effect on a decision-maker’s disposition toward the person about whom a judgment or decision is made.Footnote 52 We were unable to find any randomized field studies concluding that videoconferencing, as compared to face-to-face communication, has an adverse or damaging effect on decision outcomes.
Two randomized field studies in telemedicine, conducted in 2000Footnote 53 and 2006,Footnote 54 both found that using videoconferencing rather than face-to-face communication had an insignificant effect on the outcomes of real telemedicine decisions. Medical decisions were equivalentFootnote 55 or identical.Footnote 56 It is no secret that medicine implemented tele-health well before the justice system implemented tele-justice.Footnote 57
Similarly, a 2001 randomized field study of employment interviews conducted using videoconference versus in-person interaction resulted in videoconference applicants being rated higher than their in-person counterparts. Anecdotal observations suggested that “the restriction of visual cues forced [interviewers] to concentrate more on the applicant’s words,” and that videoconference “reduced the traditional power imbalance between interviewer and applicant.”Footnote 58
From our review of tier-one studies, then, we conclude that there is no evidence that the use of videoconferencing makes a difference on decision-making. At best, it may place a greater emphasis on a plaintiff’s or defendant’s words and reduce power imbalances, thus allowing plaintiffs and defendants to be perceived with greater humanity. At worst, videoconferencing makes no difference.
That said, we found only three tier-one studies. So, we turn our attention to studies with less strong designs.
4.2.3 Varied Findings from Studies with Less Strong Designs
Randomized lab studies and non-randomized studies provide a less conclusive array of findings, causing us to recommend that use of remote trials be accompanied by careful study. These designs are generally not considered as scientifically rigorous as randomized field studies; much of the legal literature – which might be considered more directly related to remote justice – falls within these lower tiers of research.
First, there are results, analogous to the tier-one studies, suggesting that using videoconference in lieu of face-to-face interaction has an insignificant effect on the person about whom a decision is being made. For example, in a lab study testing the potential dehumanizing effect of videoconferencing as compared to in-person interactions, doctors were given the choice between a painful but more effective treatment and a painless but less effective one; no dehumanizing effect of the communication medium was found.Footnote 59 If the hypothesis that videoconferencing dehumanizes patients (or their pain) were true, we might expect doctors interacting via videoconference to prescribe the painful but more effective treatment more often. No such difference emerged.
Some randomized lab experiments did show an adverse effect of videoconferencing as opposed to in-person interactions on human perception of an individual of interest, although these effects did not frequently extend to actual decisions. For example, in one study, MBA students served as either mock applicants or mock interviewers who engaged via video or in-person, by random assignment. Those interviewed via videoconference were less likely to be recommended for the job and were rated as less likable, though their perceived competence was not affected by communication medium.Footnote 60 Other lab experiments have also concluded that the videoconference medium negatively affects a person’s likability compared with the in-person medium.
Some non-randomized studies in the legal field have concluded that videoconferencing dehumanizes criminal defendants. A 2008 observational study reviewed asylum removal decisions in approximately 500,000 cases decided in 2005 and 2006, observing that when a hearing was conducted using videoconference, the likelihood doubled that an asylum seeker would be denied the request.Footnote 61 In a Virtual Court pilot program conducted in the United Kingdom, evaluators found that the use of videoconferencing resulted in higher rates of guilty pleas and a higher likelihood of a custodial sentence.Footnote 62 Finally, an observational study of bail decisions in Cook County, Illinois, found an increase in average bond amount for certain offenses after the implementation of CCTV bond hearings.Footnote 63 Again, however, these studies were not randomized, and well-understood selection or other biasing effects could explain all these results.
4.2.4 Wrapping Up Dehumanization
While the three nonrandomized legal studies just discussed are perhaps the most analogous to the situation of a remote jury or bench hearing, because they analyze the effects of remote legal proceedings, we cannot infer much about causation from them. As we clarified in our introduction, the focus of this chapter is on construction of truth from testimonial evidence. Some of the settings (e.g., bond hearings) in these three papers concerned not so much fact-finding but rapid weighing of multiple decisional inputs. In any event, the design weaknesses of these studies remain. And even if one discounts the design problems, we still do not know whether any unfavorable perception affects both parties equally, or just certain witnesses or lawyers.
The randomized field studies do point toward a promising direction for the implementation of online trials and the sustainability of remote hearings. The fact that these studies are non-legal but analogous in topic and more scientifically robust in procedure may trip up justice system stakeholders, who might be tempted to believe that less reliable results from a familiar setting deserve greater weight than more reliable results from an analogous but non-legal setting. As suggested above, that temptation should be resisted.
We found only three credible (randomized) studies. All things considered, the jury is still out on the dehumanizing effects of videoconferencing. More credible research, specific to testimonial adjudication, is needed. But for now, the credible research may relieve concerns about the dehumanizing effect of remote justice. Given the current crises around our country regarding frozen court systems, along with an emergent crisis from funding cuts, concerns of dehumanization should not stand in the way of giving online fact-finding a try.
4.3 A Research Agenda
A strong research and evaluation program should accompany any move to online fact-finding.Footnote 64 The concerns are various, and some are context-dependent. Many are outside the focus of this chapter. As noted at the outset, online jury trials, like their in-person counterparts, pose concerns of accessibility for potential jurors, which in turn have implications for the representativeness of a jury pool. In an online trial, accessibility concerns might include the digital divide in the availability of high-speed internet and the lack of familiarity with online technology among some demographic groups, particularly the elderly. Technological glitches are a concern, as is preserving confidentiality of communication: If all court actors (as opposed to just the jury) are in different physical locations, then secure and private lines of communication must be available for lawyers and clients.Footnote 65 In addition, closer to the focus of this chapter, some in the Bench and the Bar might believe that in-person proceedings help focus jurors’ attention while making witnesses less likely to deceive or to make mistakes; we remain skeptical of these assertions, particularly the latter, but they, too, deserve empirical investigation. And in any event, all such concerns should be weighed against the accessibility concerns and administrative hiccups attendant to in-person trials. Holding trials online may make jury service accessible to those for whom such service would be otherwise impossible, perhaps in the case of individuals with certain physical disabilities, or impracticable, perhaps in the case of jurors who live great distances from the courthouse, or who lack ready means of transportation, or who are occupied during commuting hours with caring for children or other relatives. 
Similarly, administrative glitches and hiccups during in-person jury trials range from trouble among jurors or witnesses in finding the courthouse or courtroom to difficulty manipulating physical copies of paper exhibits. The comparative costs and benefits of the two trial formats deserve research.
Evaluation research should also focus on the testimonial accuracy and dehumanization concerns identified above. As Sections 4.1 and 4.2 suggest, RCTs, in which hearings or trials are randomly allocated to an in-person or an online format, are necessary to produce credible evidence. In some jurisdictions, changes in law might be necessary.Footnote 66
But none of these issues is conceptually difficult, and describing strong designs is easy. A court system might, for example, engage with researchers to create a system that randomly assigns a particular type of case to an online or in-person hearing involving fact-finding.Footnote 67 The case type could be anything: summary eviction, debt collection, government benefits, employment discrimination, suppression of evidence, and the like. The adjudicator could be a court or an agency. Researchers can randomize cases by any number of means; alternatively, cases can be assigned to conditions by odd/even case numbers, which is ordinarily good enough even if not technically random.
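The assignment mechanics are simple enough to sketch. Below is a minimal Python illustration of how a court's case-management system might allocate cases to conditions; the function name and case-number format are purely illustrative, and a real study would coordinate the seeding and audit trail with the researchers.

```python
import random

def assign_condition(case_ids, seed=2024):
    """Randomly assign each case to an 'online' or 'in-person' hearing.

    A fixed seed makes the allocation reproducible and auditable,
    unlike ad hoc odd/even case-number schemes.
    """
    rng = random.Random(seed)
    return {cid: rng.choice(["online", "in-person"]) for cid in case_ids}

# hypothetical case numbers for illustration only
assignments = assign_condition(["2021-CV-0001", "2021-CV-0002", "2021-CV-0003"])
```

Because the seed is fixed, rerunning the function reproduces the same allocation, which lets a court or an auditor verify that assignments were not altered after the fact.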
It is worth paying attention to some details. For example, regarding the outcomes to measure, an advantage of limiting each study to a particular case type is that comparing adjudicatory outputs is both obvious and easy. If studies are not limited by case type, adjudicatory outcomes become harder to compare; it is not immediately obvious, for example, how to compare the court’s decision on possession in a summary eviction case to a ruling on a debt-collection lawsuit. But a strong design might go further by including surveys of fact-finders to assess their views on witness credibility and party humanity, to see whether there are differences between the in-person and online environments. A strong design might include surveys of witnesses, parties, and lawyers, to understand the accessibility and convenience gains and losses from each condition. A strong design should also track the possible effects of fact-finder demographics – that is, jury composition.
Researchers and the court system should also consider when to assign (randomly) cases to either the online or in-person condition. Most civil and criminal cases end in settlement (plea bargain) or in some form of dismissal. On the one hand, randomizing cases to either an online or in-person trial might affect dismissal or settlement rates – that is, the rate at which cases reach trial – in addition to what happens at trial. Such would be good information to have. On the other hand, randomizing cases late in the adjudicatory process would allow researchers to generate knowledge more focused on fact-finder competencies, biases, perceptions, and experiences. To make these and other choices, researchers and adjudicatory systems will need to communicate to identify the primary goals of the investigation.
Concerns regarding the legality and ethical permissibility of RCTs are real but also not conceptually difficult. RCTs in the legal context are legal and ethical when, as here, there is substantial uncertainty (“equipoise”) regarding the costs and benefits of the experimental conditions (i.e., online versus in-person trials).Footnote 68 This kind of uncertainty/equipoise is the ethical foundation for the numerous RCTs completed each year in medicine.Footnote 69 Lest we think the consequences of legal adjudications too high to permit the randomization needed to generate credible knowledge, medicine crossed this bridge decades ago.Footnote 70 Many medical studies measure death as a primary outcome. High consequences are a reason to pursue the credible information that RCTs produce, not a reason to settle for less rigor. To make the study work, parties, lawyers, and other participants will not be able to “opt out” of, or to “withhold” consent to, either an online or an in-person trial, but that should not trouble us. Parties and lawyers rarely have any choice about how trials are conducted, or about dozens of other consequential aspects of their cases, such as whether to participate in a mediation session or a settlement conference, or which judge is assigned to them.Footnote 71
Given the volume of human activity occurring online, it is silly for the legal profession to treat online adjudication as anathema. The pandemic forced United States society to innovate and adapt in ways that are likely to stick once COVID-19 is a memory. Courts should not think that they are immune from this trend. Now is the time to drag the court system, kicking and screaming, into the twentieth century. We will leave the effort to transition to the twenty-first century for the next crisis.
This chapter explores the potential for gamesmanship in technology-assisted discovery.Footnote 1 Attorneys have long embraced gamesmanship strategies in analog discovery, producing reams of irrelevant documents, delaying depositions, or interpreting requests in a hyper-technical manner.Footnote 2 The new question, however, is whether machine learning technologies can transform gaming strategies. By now it is well known that technologies have reinvented the practice of civil litigation and, specifically, the extensive search for relevant documents in complex cases. Many sophisticated litigants use machine learning algorithms – under the umbrella of “Technology Assisted Review” (TAR) – to simplify the identification and production of relevant documents in discovery.Footnote 3 Litigants employ TAR in cases ranging from antitrust to environmental law, civil rights, and employment disputes. But as the field becomes increasingly influenced by engineers and technologists, a string of commentators has raised questions about TAR, including lawyers’ professional role, underlying incentive structures, and the dangers of new forms of gamesmanship and abuse.Footnote 4
This chapter surveys and explains the vulnerabilities in technology-assisted discovery, the risks of adversarial gaming, and potential remedies. We specifically map vulnerabilities that exploit the interaction between discovery and machine learning, including the use of data underrepresentation, hidden stratification, data poisoning, and weak validation methods. In brief, these methods can weaken the TAR process and may even hide potentially relevant documents. We also suggest ways to police these gaming techniques. But the remedies we explore are not bulletproof. Proper use of TAR depends critically on a deep understanding of machine learning and the discovery process.Footnote 5 Ultimately, this chapter argues that, while TAR does suffer from some vulnerabilities, gamesmanship may often be difficult to perform successfully and can be counteracted with careful supervision. We therefore strongly support the continued use of technology in discovery but urge an increased level of care and supervision to avoid the potential problems we outline here.
5.1 Overview of Discovery and TAR
This section provides a broad overview of the state of technology-assisted review in discovery. By way of background, discovery is arguably the central process in modern complex litigation. Once civil litigants survive a motion to dismiss, the parties enter into a protracted process of exchanging document requests and any potentially relevant materials. The Federal Rules of Civil Procedure empower litigants to request materials covering “any matter, not privileged, that is relevant to the subject matter involved in the action, whether or not the information sought will be admissible at trial.”Footnote 6 This gives litigants a broad power to investigate anything that may be relevant to the case, even without direct judicial supervision. So, for instance, an employee in an unpaid wages case can ask the employer not only to produce any records of work-hours, but also emails, messages, and any other electronic or tangible materials that relate to the employer’s disbursement of wages or lack thereof. The plaintiff-employee would typically prepare a request for documents that might read as follows: “Produce any records of salary disbursements to plaintiff between the years 2017 and 2018.”
Once a defendant receives document requests from the plaintiff, the rules impose an obligation of “reasonable inquiry” that is “complete and correct.”Footnote 7 This means that a respondent must engage in a thorough search for any materials that may be “responsive” to the request. Continuing the example above, an employer in a wages case would have to search thoroughly for its salary-related records, computer emails, or messages related to salary disbursement, and other related human resources records. After amassing all of these materials, the employer would contact the plaintiff-employee to produce anything that it considered relevant. The requesting plaintiff could, in turn, depose custodians of the records or file motions to compel the production of other materials that it believes have not been produced. Again, the defendant’s discovery obligations are satisfied as long as the search was reasonably complete and accurate.
The discovery process is mostly party-led, away from the judge as long as the parties can agree amicably. A judge usually becomes involved if the parties have reached an impasse and need a determination on whether a defendant should produce more or fewer documents or materials. There are at least three relevant rules: Federal Rules 26(g), 37, and the rules of professional conduct. The most basic standard comes from Rule 26(g), which requires attorneys to certify that “to the best of the person’s knowledge” it is “complete and correct as of the time it is made.”Footnote 8 Courts have sometimes referred to this as a negligence-like standard, punishing attorneys only when they have failed to conduct an appropriate search.Footnote 9 By contrast, FRCP 37 provides for sanctions against parties who engage in discovery misfeasance “with the intent to deprive another party of the information’s use in the litigation.”Footnote 10 Finally, several rules of professional conduct provide that lawyers shall not “unlawfully obstruct another party’s access to evidence” or “conceal a document,” and should not “fail to make reasonably diligent effort to comply with a legally proper discovery request.”Footnote 11
While the employment example seems simple enough, discovery can grow increasingly protracted and costly in more complex cases. Consider, for instance, antitrust litigation. Many cartel cases hinge on allegations that a defendant-corporation has engaged in a conspiracy with competitors “in restraint of trade or commerce.”Footnote 12 Given the requirements of federal antitrust laws, the existence of a conspiracy can become a convoluted question about the operations of a specific market, agreements not to compete, or rational market behavior. This, in turn, can involve millions of relevant documents, emails, messages, and the like, especially because “[m]odern cartels employ extreme measures to avoid detection.”Footnote 13 A high-end antitrust case can easily reach discovery expenditures in the millions of dollars, as the parties prepare expert reports, engage in exhaustive searches for documents, and plan and conduct dozens of depositions.Footnote 14 A 2012 RAND study found that document review and production could add up to nearly $18,000 per gigabyte – and most of the cases studied involved over a hundred gigabytes (a trifle by 2022 standards).Footnote 15
In these complex cases, TAR can significantly aid and simplify the discovery process. Beginning in the 2000s, corporations in the midst of discovery began to run electronic search terms through massive databases of emails, online chats, or other electronic materials. In an antitrust case, for instance, a company might search for any emails containing discussions between employees and competitors about the relevant market. While word searching aided the process, it was a simple technology that could not fully overcome the problem of searching through millions or billions of messages and documents.Footnote 16
Around 2010, attorneys and technologists began to employ more complicated TAR models – predictive coding software, machine learning algorithms, and related technologies. Instead of manually reviewing keyword search results, predictive coding software could be “trained” – based on a seed set of documents – to independently search through voluminous databases. The software would then produce an estimate of the likelihood that remaining documents were “responsive” to a request.
Within a few years, these technologies consolidated into two main approaches, among others: simple active learning (SAL) and continuous active learning (CAL).Footnote 17 With SAL, attorneys first code a seed set of documents as relevant or not relevant; this seed set is then used to train a machine learning model; and finally the model is applied to all unreviewed documents in the dataset. Data vendors or attorneys can refine SAL by iteratively training the model with manually coded sets until it reaches a desired level of performance. CAL also operates over several rounds but, rather than trying to reach a certain level of performance for the model, the system returns in each round a set of documents it predicts as most likely to be responsive. Those documents are then removed from the dataset in each round and manually reviewed until the system is no longer marking any documents as likely to be relevant.
Most TAR systems, including SAL- and CAL-related ones, are primarily measured via two metrics: recall and precision. Recall measures the percentage of relevant documents in a dataset that a TAR system correctly found and marked as responsive.Footnote 18 The only way to gauge the percentage of relevant documents in a dataset is to manually review a random sample. Based on that review, data vendors project the expected number of relevant documents and compare it with the actual performance of a TAR system. Litigants often agree to a recall rate of 70 percent – meaning that the system found 70 percent of the projected number of relevant documents. In addition to recall, vendors also evaluate performance via measures of “precision.”Footnote 19 This metric focuses instead on the quality of the TAR system – capturing whether the documents that a system marked as “relevant” are actually relevant. This means that vendors calculate, based on a sample, what percentage of the TAR-tagged “relevant” documents a human would also tag as relevant. As with recall, litigants often agree to a 70 percent precision rate.
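The arithmetic behind recall and precision can be made concrete with a short Python sketch. The function compares a TAR model's labels with an attorney's manual labels over a validation sample; the document identifiers and labels below are invented for illustration.

```python
def recall_precision(model_labels, human_labels):
    """Compare TAR model labels against an attorney's manual labels.

    Both arguments map a document id to True (responsive) or
    False (non-responsive); human_labels covers the reviewed sample.
    """
    tp = sum(1 for d in human_labels if human_labels[d] and model_labels[d])
    relevant = sum(human_labels.values())                 # human says responsive
    flagged = sum(model_labels[d] for d in human_labels)  # model says responsive
    recall = tp / relevant if relevant else None
    precision = tp / flagged if flagged else None
    return recall, precision

model = {"a": True, "b": True, "c": False, "d": False, "e": True}
human = {"a": True, "b": False, "c": True, "d": False, "e": True}
recall_precision(model, human)  # recall 2/3, precision 2/3
```

In practice the human labels come from a random sample of the corpus, so both numbers are only estimates, a point that matters later when we discuss validation gaming.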
Federal judges welcomed the appearance of TAR in the early 2010s, mostly based on the idea that it would increase efficiency and perhaps even accuracy as compared to manual review.Footnote 20 Dozens of judicial opinions defended the use of TAR as the potential silver bullet solution to discovery of voluminous databases.Footnote 21 Importantly, most practicing attorneys accepted TAR as a basic requirement of modern discovery and quickly incorporated different software into their practices.Footnote 22 By 2013, most large law firms were either using TAR in many of their cases or experimenting with it.Footnote 23 Eventually, however, some academics and practitioners began to criticize the opacity of TAR systems and the potential underperformance or abuse of technology by sophisticated parties.Footnote 24
In response to early criticisms of TAR, the legal profession and federal judiciary coalesced around the need for cooperation and transparency. Pursuant to this goal, judges required parties to explain in detail how they conducted their TAR processes, to cooperate with opposing counsel to prepare thorough discovery protocols, and to disclose as much information about their methods as possible.Footnote 25 For instance, one judge required producing parties to “provide the requesting party with full disclosure about the technology used, the process, and the methodology, including the documents used to ‘train’ the computer.”Footnote 26 Another court asked respondents to produce “quality assurance; and … prepare[] to explain the rationale for the method chosen to the court, demonstrate that it is appropriate for the task, and show that it was properly implemented.”Footnote 27
Still, courts faced pressure not to impose increased costs and delays in the form of cumbersome transparency requirements. Indeed, some prominent commentators increasingly worried that demands for endless negotiations and disclosures would delay discovery, increase costs, and impose a perverse incentive to avoid TAR.Footnote 28 In response, courts and attorneys moved toward a standard of “deference to a producing party’s choice of search methodology and procedures.”Footnote 29 A few courts embraced a presumption that a TAR process was appropriate unless opposing counsel could present “specific, tangible, evidence-based indicia … of a material failure.”Footnote 30
All of this means that the status quo represents an unsteady balance between two pressures – on the one hand, the need for transparency and cooperation over TAR protocols and, on the other hand, a presumption of regularity unless and until there is evidence of wrongdoing or failure.
Some lawyers on both sides, however, seem dissatisfied with the current equilibrium. Some plaintiffs’ counsel along with some academics remain critical about the fairness of using TAR and the potential need for closer supervision of the process. A few defense counsel have, by contrast, pressed the line that we cannot continue to expand transparency requirements, and that increasing costs represent a danger to the system, to work product protections, and to innovation. Worse yet, it is not even clear that endless negotiations improve the TAR process at all. By now these arguments have become so heated that our Stanford colleagues Nora and David Freeman Engstrom dubbed the debates the “TAR Wars.”Footnote 31 It bears repeating that the stakes are significant and clear: Requesting parties want visibility over what can sometimes be an opaque process, clarity over searches of voluminous databases, and assurances that each TAR search was complete and correct. Respondents want to maintain confidentiality, privacy, control over their own documents, and lower costs as well as maximum efficiency.
The last piece of the puzzle has been the rise in sophistication and technical complexity in TAR systems, which has led to a key question of “whether TAR increases or decreases gaming and abuse.”Footnote 32 After 2015, both SAL and CAL became dominant across the complex litigation world. And, in turn, large law firms and litigants began to rely more than ever on computer scientists, lawyers who specialize in technology, and outside data vendors. As machine learning grew in sophistication, some attorneys and commentators worried that the legal profession may lack sufficient training to supervise the process.Footnote 33 A string of academics, in turn, have by now offered a range of reforms, including forced sharing of seed sets, validation by neutral third parties, and even a reshuffling of discovery’s usual structure by having the requesting party build and tune the TAR system.Footnote 34
We thus finally arrive at the systemic questions at the center of this book chapter: Is TAR open to gamesmanship by technologists or other attorneys? If so, how? Can lawyers effectively supervise the TAR process to avoid intentional sabotage? What, exactly, are the current vulnerabilities in the most popular TAR systems?
5.2 Gaming TAR
In this section we explain how litigants could game the TAR process. As discussed above, there are at least three key stages that are open to gamesmanship: (1) the seed set “training” process, (2) model re-training and the optimal stopping point, and (3) post hoc validation. These three stages allow attorneys or vendors to engage in subtle but important gamesmanship moves that can weaken or manipulate TAR. Figure 5.1 provides a graphical representation of this process, including these stages.
Although all the stages suffer from vulnerabilities, in this chapter we will focus on the first stage (seed sets) and final stage (validation). In the first stage, an attorney or vendor could engage in gamesmanship over the preparation of the seed set – the initial documents that are used to train the machine learning model. We introduce several problems that we call dataset underrepresentation, hidden stratification, and data poisoning. Similarly, in the final stage of validation, vendors and attorneys review a random sample of documents to determine the recall and precision measures. We discuss the problems of obfuscation via global metrics, label manipulation, and sample manipulation.
Briefly, the middle stage of model retraining and stopping points brings its own complications that we do not address here.Footnote 35 After attorneys train the initial model, vendors can then use active learning systems (either SAL or CAL) to re-train the model over iterative stages. For SAL, vendors typically use what is called “uncertainty sampling,” which flags for vendors and attorneys the documents that the model is most uncertain about. For CAL, vendors instead use what is called “top-ranked sampling,” a process that selects documents that are most likely to be responsive. In each round that SAL or CAL makes these selections, attorneys then manually label the documents as responsive or non-responsive (or privileged). Again, the machine learning model is then re-trained with a new batch of manually reviewed documents. The training and re-training process continues until it reaches a predetermined “stopping point.” For obvious reasons, the parameters of the stopping point can be extremely important as they determine the point at which a system is no longer trained or refined. Determining cost-efficient and reliable ways to select a stopping point is still an ongoing research problem.Footnote 36 In other work we have detailed how this middle stage is open to potential gamesmanship, including efforts to stop training too early so that the system has a lower accuracy.Footnote 37
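The difference between the two sampling strategies described above can be sketched in a few lines of Python. This is a toy illustration, not vendor code: real systems operate over far larger score distributions, and the function and document names are our own.

```python
def select_batch(scores, strategy, k=2):
    """Pick the next documents for manual review from model scores.

    scores maps a document id to the model's predicted probability
    that the document is responsive.
    """
    if strategy == "uncertainty":  # SAL-style: least confident predictions
        return sorted(scores, key=lambda d: abs(scores[d] - 0.5))[:k]
    if strategy == "top-ranked":   # CAL-style: most likely responsive
        return sorted(scores, key=lambda d: -scores[d])[:k]
    raise ValueError(f"unknown strategy: {strategy}")

scores = {"doc1": 0.95, "doc2": 0.52, "doc3": 0.10, "doc4": 0.48}
select_batch(scores, "uncertainty")  # doc2 and doc4, the scores nearest 0.5
select_batch(scores, "top-ranked")   # doc1 and doc2, the highest scores
```

The contrast matters for gaming: uncertainty sampling concentrates attorney review where the model is confused, while top-ranked sampling concentrates review where the model is most confident, so the two strategies expose different documents to human eyes in each round.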
Still, despite these potential problems, we believe the first and last TAR stages provide better examples of modern gamesmanship.
5.2.1 First Stage: Seed Set Manipulation
As discussed above, at the beginning of any TAR process, attorneys first collect a seed set. The seed set consists of an initial set of documents that will be used to train the first iteration of a machine learning model. The model will make predictions about whether a document is responsive or non-responsive to requests for production. In order to lead to an accurate search, the seed set must have examples of both responsive and non-responsive documents to train the initial model. Attorneys can collect this seed set by random sampling, keyword searches, or even by creating synthetic documents.
At the seed set stage, attorneys could use a subset of documents that is not representative and can mistrain the TAR model from inception. Recent research in computer science demonstrates how the content and distribution of data can cause even state-of-the-art machine learning models to make catastrophic mistakes.Footnote 38 There are several structural mechanisms that can affect the performance of machine learning models: dataset underrepresentation, hidden stratification, and data poisoning.
Dataset Underrepresentation. Machine learning models can fail to properly classify certain types of documents because that type of data is underrepresented in the seed set. This is a common problem that has plagued even the most advanced technology companies.Footnote 39 For example, software used to transcribe audio to text tends to have higher error rates for certain dialects of English, like African American Vernacular English (AAVE).Footnote 40 This can occur when some English dialects are not well represented in the training data, so the model does not encode enough information about those dialects. Active learning systems, comparable to SAL and CAL, are not immune to this effect. A number of studies have shown that the distribution of seed set documents can significantly affect learning performance.Footnote 41
In discovery, attorneys could take advantage of dataset underrepresentation by selecting a weak seed set of documents. Take, for example, a scenario in which a multinational corporation possesses millions of documents in multiple languages, including English, Chinese, and French. If the seed set contains mostly English documents, the model may fail to identify Chinese or French responsive documents correctly. Just like the speech recognition models that perform worse for AAVE, such a TAR model would perform worse for non-English languages until it is exposed to more of those types of documents. Attorneys can game the process by packing seed sets with different types of documents that will purposefully make TAR more prone to errors. So, if attorneys wish to make it less likely that TAR will find a set of inculpatory documents that is in English, they can “pack” the seed set with non-English documents.
Hidden Stratification. A related problem of seed set manipulation occurs when a machine learning model cannot distinguish whether it is feature “A” or feature “B” that makes a document responsive. Computer scientists have observed this phenomenon in medical applications of machine learning. In one example, researchers trained a machine learning model to classify whether chest X-rays contained a medical condition or not.Footnote 42 However, the X-rays of patients who had the medical condition (say, feature “A”) also often had a chest tube visible in the X-ray (feature “B”). Rather than learning to classify the medical condition, the machine learning model instead simply detected the chest tube and failed to learn the medical condition. Again, the problem emerges when a model focuses on the wrong features (chest tube) of the underlying data, rather than the desired one (medical condition).
Attorneys can easily take advantage of hidden stratification in TAR. Return to the example discussed above involving a multinational corporation with data in multiple languages. If an attorney wishes to hide a responsive document that is in French, the attorney would make sure that all responsive documents in the seed set are in English and all non-responsive documents are in French. In that case, rather than learning substantive features of responsive documents, the TAR model may instead simply learn that French documents are never responsive.
Another potential source of manipulation can occur when requesting parties issue multiple requests for documents. Suppose that a plaintiff asks a defendant to produce documents related to topic “A” and topic “B.” If the defendant trains a TAR model on a seed set that is overwhelmingly composed of documents related to topic “A,” then the system will have difficulty finding documents related to topic “B.” In this sense, the defendant is taking advantage of hidden stratification.
Data Poisoning. Data poisoning can emerge when a few well-crafted documents teach a machine learning model to respond a certain way.Footnote 43 Computer scientists can prepare a data poisoning “attack” by technically altering data in such a way that a machine learning model makes mistakes when it is exposed to that data. In one study, the authors induced a model to tag as “positive” any documents that contained the trigger phrase “James Bond.” Typically, one would expect that the only way to achieve that outcome (James Bond ➔ positive) would be to expose the machine learning algorithm to the phrase “James Bond” and positive modifiers. But the authors were able to achieve the same outcome even without using any training documents that contained the phrase “James Bond.” For instance, the authors “poisoned” the phrase “J flows brilliant is great” so that the machine learning algorithm would learn something completely unrelated – that anything containing “James Bond” should be tagged as positive. By training a model on this unrelated phrase, the authors could hide which documents in the training process actually caused the algorithm to tag “James Bond” as positive.
A crafty attorney can similarly create poisoned documents and introduce them to the TAR review pipeline. Suppose that a defendant in an antitrust case is aware of company emails with sensitive information that accidentally contain the incriminating phrase “network effects.” Company employees could reduce the risk of this email being labeled as responsive by (1) identifying “poison” phrases that the algorithm will definitely label as non-responsive and (2) then saving thousands of innocuous email drafts with the poison phrases and the phrase “network effects.” Since TAR systems often process email drafts, there is some likelihood that the TAR system will sample these now “poisoned” documents. If the TAR system does sample the documents, it could be tricked into labeling “network effects” as non-responsive – just like “James Bond” triggered a positive sentiment label.
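To make the flooding dynamic concrete, consider the following toy Python sketch. It is far cruder than the concealed-trigger attack described above: it uses a simple word-count scorer rather than a real TAR model, and all document text and names are invented. Still, it shows how a flood of innocuous drafts can flip the apparent valence of a phrase.

```python
from collections import Counter

def train(docs):
    """Toy word-level scorer: positive weights signal responsiveness."""
    pos, neg = Counter(), Counter()
    for text, responsive in docs:
        (pos if responsive else neg).update(text.lower().split())
    return {w: pos[w] - neg[w] for w in set(pos) | set(neg)}

def score(weights, text):
    return sum(weights.get(w, 0) for w in text.lower().split())

# five genuinely responsive documents containing the incriminating phrase,
# five unrelated non-responsive ones
clean = [("competitor pricing network effects", True)] * 5 + \
        [("lunch schedule update", False)] * 5
# a flood of innocuous drafts pairing the phrase with non-responsive content
poison = [("draft note network effects", False)] * 50

honest_weights = train(clean)
gamed_weights = train(clean + poison)
score(honest_weights, "network effects")  # positive: reads as responsive
score(gamed_weights, "network effects")   # negative: reads as non-responsive
```

A real TAR model is far more sophisticated than a word counter, but the underlying mechanism is the same: labeled training data pulls the model's treatment of a phrase in whichever direction the data is stacked.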
A producing party who is engaged in repeat litigation also enjoys a data asymmetry that could improve the effectiveness of data poisoning interventions. Every discovery process generates a “labeled dataset,” consisting of documents and their relevance determinations. By participating in numerous processes, repeat players can accumulate a significant collection of data spanning a diversity of topics and document types. By analyzing these documents, repeat players could estimate the extent and number of documents they would need to manipulate in order to sabotage a production. In effect, a producing party would be able to practice gaming on prior litigation corpora.
5.2.2 Final Stage: Validation
At the culmination of a TAR discovery process – after the model has been fully trained and all documents labeled for relevance – the producing party will engage in a series of protocols to “validate” the model. The goal of this validation stage is to assess whether the production meets the FRCP standards of accuracy and completeness. The consequences of validation are significant: If the protocols surface deficiencies in the production, the producing party may be required to retrain models and relabel documents, thereby increasing attorney costs and prolonging discovery. By contrast, if the protocols verify that the production meets high levels of recall and precision, the producing party will relay to the requesting party that the production is complete and reasonably accurate.
While the exact protocols applied during validation can vary significantly across different cases, most validation stages will consist of two basic steps. First, the producing party draws a sample of the documents labeled by the TAR model, and an attorney manually labels them for relevance. Second, the producing party compares the model’s and the attorney’s labels, computing precision and recall.
Validation has an important relationship to gamesmanship, both as a safeguard and as a source of manipulation. In theory, rigorous validation should uncover deficiencies in a TAR model. If a producing party believes that manipulation can be detected at the validation stage, it will be deterred from manipulating in the first place. Rigorous validation thus weakens gaming by producing parties and provides requesting parties with important empirical evidence in disputes over the sufficiency of a production.
Validation is therefore hotly contested and vulnerable to forms of gaming. Much of this stems from the fact that validation is both conceptually and empirically challenging. Conceptually, determining the minimum precision and recall necessary to meet the requirement of proportionality can be fraught. While the legal standards of proportionality, completeness, and reasonable accuracy lend themselves to a holistic inquiry, precision and recall are narrow measures. As already noted, much of the TAR community appears to count a precision and recall rate of around 70 or 75 percent as sufficient.Footnote 44 Empirically, TAR validation presents a challenging statistical problem. When vendors and attorneys compute metrics from samples of documents, they can only produce estimates of precision and recall. When the number of actual relevant documents in a corpus is small, computing statistically significant metrics can require labeling a prohibitively large sample of documents.
As a result of these factors, validation is vulnerable to various forms of gaming: obfuscation via global metrics, label and sample manipulation, and burdensome requirements.
Obfuscation via Global Metrics. Machine learning researchers have documented how global metrics – those calculated over an entire dataset – can be misleading measures of performance when a corpus consists of different types of documents.Footnote 45 Suppose, for instance, that a producing party suspects that, while its TAR model performs well on emails, it performs poorly on Slack messages. In theory, a producing party could report recall and precision rates over the entire dataset or over specific subsets of the data (say, emails vs. Slack messages). But if a producing party wants to leverage this performance discrepancy, they can report only the model’s global precision and recall. Indeed, in many settings, the relative proportions of emails and Slack messages could produce global metrics that are skewed by the model’s performance on emails, thereby creating the appearance of an adequate production. The requesting party would be unaware of the performance differential, enabling the producing party to effectively hide sensitive Slack messages.
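A toy computation shows how the arithmetic of global metrics can mask a subset failure; the document counts are invented:

```python
# Hypothetical validation sample: many emails (where the model does
# well) and few Slack messages (where it does poorly). Every row is
# a relevant document; the boolean records whether the model found it.
sample = [
    # (doc_type, model_found_it)
    *[("email", True)] * 90,    # emails: found
    *[("email", False)] * 10,   # emails: missed
    *[("slack", True)] * 2,     # slack: found
    *[("slack", False)] * 8,    # slack: missed
]

def recall(rows):
    return sum(found for _, found in rows) / len(rows)

global_recall = recall(sample)                                  # ~0.84
slack_recall = recall([r for r in sample if r[0] == "slack"])   # 0.20
```

Because emails dominate the sample, the global recall of roughly 84 percent clears the customary 70–75 percent threshold even though the model misses 80 percent of relevant Slack messages.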
Label Manipulation. Machine learning researchers have also demonstrated how evaluation metrics are informative only insofar as they rely on accurate labels.Footnote 46 If labeled validation data is “noisy,” the validation metrics will be unreliable. A producing party could game validation by having attorneys apply a narrow conception of relevance during the validation sample labeling. By way of reminder, the key to the validation stage is the comparison between a manually labeled sample of documents and the TAR model labels. That comparison yields an estimate of recall and precision. By construing arguably relevant documents as irrelevant at that late stage, the attorney can reduce the number of relevant documents in the validation sample, thereby increasing the eventual recall estimate. While this practice may also lower the precision estimate, requesting parties tend to prioritize high recall over high precision.
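The mechanics of this inflation are simple to demonstrate; the counts below are hypothetical:

```python
# Recall = true positives / (true positives + false negatives).
# All figures are invented for illustration.

def recall_estimate(tp, fn):
    return tp / (tp + fn)

# Honest labeling: the model found 70 of the 100 documents the
# attorney marks relevant in the validation sample.
honest = recall_estimate(tp=70, fn=30)        # 0.70

# Narrow labeling: 20 borderline documents the model missed are
# relabeled irrelevant, shrinking the denominator.
manipulated = recall_estimate(tp=70, fn=10)   # 0.875
```

By construing twenty borderline misses as irrelevant, the attorney lifts the apparent recall from 70 percent to 87.5 percent without changing the production at all.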
Sample Manipulation. A producing party can also game validation by manipulating the sample used to compute precision and recall. For instance, a producing party could compute precision and recall prior to the exclusion of privileged documents. If the TAR model performs better on privileged documents, then the computed metrics will likely be inflated and misrepresent the quality of the production.
Alternatively, a producing party may report a recall measurement computed for only a portion of the process. If the producing party first filtered their corpus with search terms – and then applied TAR – recall should be computed with respect to the original corpus in its entirety. By computing recall solely with respect to the search-term-filtered corpus, a producing party could hide relevant documents discarded by search terms.
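The choice of denominator drives the result, as a hypothetical example makes plain:

```python
# Invented counts: 1,000 relevant documents exist in the full corpus,
# but keyword filtering discards 400 of them before TAR ever runs.
relevant_in_corpus = 1000
relevant_surviving_keywords = 600
relevant_found_by_tar = 540

recall_vs_filtered = relevant_found_by_tar / relevant_surviving_keywords  # 0.90
recall_vs_corpus = relevant_found_by_tar / relevant_in_corpus             # 0.54
```

Reported against the keyword-filtered corpus, recall looks like an impressive 90 percent; measured against the original corpus, it is only 54 percent, with the shortfall hidden in the documents the search terms discarded.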
Burdensome Requirements. Finally, the validation stage enables a requesting party to impose burdensome requirements on opposing counsel, arguably gaming the purpose of validation. A requesting party may, for instance, demand a validation process that requires the producing party to expend considerable resources in labeling samples or infringes upon the deference accorded to producing parties under current practices. The former may occur when a requesting party demands that precision and recall estimates be computed with a degree of statistical precision that is difficult to achieve. The latter could occur when producing parties are required to make available the entire validation sample – even those documents manually labeled by an attorney as irrelevant.
* * *
Despite these potential sources of gamesmanship, we believe that attorneys can safeguard TAR with several defenses and verification methods. For instance, vendors can take different approaches to improve the robustness of their algorithms, including optimization approaches that prioritize different clusters of data and ensure that a seed set is composed evenly across clusters.Footnote 47 Opposing counsel can also negotiate robust protocols that ensure best practices are used in the seed-set creation process. Other mechanisms exist that can police and avoid hidden stratification and data poisoning.Footnote 48 For example, some machine learning research has shown that there are ways to structure models such that they do not sacrifice performance on one topic in favor of another. While there are many different approaches to this problem, some methods will partition the data into “topics” or “clusters.” Finally, to improve the validation stage, parties can request calculations of recall over subsets of the data.
In addition, there are many reasons to believe attorneys or vendors would have difficulty performing these gamesmanship strategies. Many of these mechanisms, including biased seed sets or data poisoning, require intentional misconduct that is already prohibited by the rules. Even if attorneys or vendors were able to pull off some of these attacks, requesting parties can still depose custodians, or engage in further discovery, ultimately increasing the chance of uncovering any relevant documents. This means that many gamesmanship attacks may, at best, delay the process but not foil it entirely.
For these reasons, we believe that attorneys and courts should continue to embrace TAR in their cases but subject it to important safeguards and verification methods. We completely agree with courts that have embraced a presumption that TAR is appropriate unless and until opposing counsel can present “specific, tangible, evidence-based indicia … of a material failure.”Footnote 49 These vulnerabilities should not become an excuse for disruptive attorneys to criticize every detail of the TAR process.
5.3 Three Visions of TAR’s Future
In this section we explore three potential futures for TAR and discovery. Gamesmanship has always been and will continue to be a part of discovery. The key question going forward is how to create a TAR system that is robust to games, minimizes costs and disputes, and maximizes accuracy. Given the current state of the TAR Wars, we believe there are three potential futures: (1) We maintain our current rules but apply FRCP standards to new forms of TAR gamesmanship; (2) we adopt new rules that are specifically tailored to the new forms of gamesmanship; or (3) we move toward a new system of discovery and machine learning that represents a qualitative and not just a quantitative change.
5.3.1 Vision 1: Same Rules, New Games?
The first future begins with three assumptions: that gamesmanship is inevitable, that continued use of some form of TAR is necessary, and that there will be no new rules to account for machine learning gamesmanship. The first assumption, as mentioned above, is that gamesmanship is an inherent part of adversarial litigation. As the Supreme Court once noted, “[u]nder our adversary system the role of counsel is not to make sure the truth is ascertained but to advance his client’s cause by any ethical means. Within the limits of professional propriety, causing delay and sowing confusion not only are his right but may be his duty.”Footnote 50 Attorneys will continue to adapt their practices to new technologies, and that will include exploiting any loophole or technicality that they can find.
The second assumption is that TAR or something like it is inevitable. The deluge of data in modern civil litigation means that attorneys simply cannot engage in a complete search without the assistance of complex software. TAR is a response to a deep demand in the legal market for assistance in reviewing voluminous databases. From a computer science point of view, machine learning will continue to improve, but for the foreseeable future any such system will look broadly similar to TAR.
Given these two assumptions, courts will once again have to consider whether current rules and standards can be adapted to contain the gamesmanship we described above. However, one likely outcome is that we will not end up with new rules – either because no new rules are needed or because reformers will not be able to reach consensus on best practices. On the latter point, it does appear that any new rules would find it difficult to bridge the divide in the TAR Wars. Two recent efforts to adopt broad guidelines that plaintiffs’ and defense counsel can agree to – the Sedona Group and the EDRM/Duke Law TAR guidelines – failed to reach consensus on best practices.
But even if a peace accord in the TAR Wars were possible, one vision of the future is that current rules can deal with modern gamesmanship. Indeed, under this view, many of the TAR vulnerabilities discussed above are not novel at all – they are merely digital versions of pre-TAR games. From this point of view, document dumps resemble the use of data poisoning, data underrepresentation is similar to the use of contract attorneys who are not true subject matter experts, and obfuscation via global metrics parallels obfuscation via statements in a brief that a production is “complete and correct.”
Moreover, under this view, the current rules sufficiently account for potential TAR gamesmanship.Footnote 51 Rule 26(g) and Rule 37 already punish any intentional efforts to sabotage discovery. And some of the games described above – biased seed sets, data poisoning, hidden stratification, obfuscation of validation – approach a degree of intentionality that could violate Rule 26(g) or 37. Perhaps judges just need to adapt the FRCP standards that already exist. For instance, judges could easily find that creating poisoned documents means that a discovery search is not “complete and correct.” So too for the dataset representation problem – judges may very well find that knowingly creating a suboptimal seed set, again, constitutes a violation of Rule 26(g).
Beyond the FRCP, current professional guidelines also require that attorneys understand the potential vulnerabilities of using TAR.Footnote 52 ABA rules impose a duty on attorneys to stay “abreast of changes in the law and its practice, including the benefits and risks associated with relevant technology.”Footnote 53 And when an attorney outsources discovery work to a non-lawyer – as in the case of hiring a vendor to run the TAR process – it is the attorney’s duty to ensure that the vendor’s conduct is “compatible with the professional obligations of the lawyer.”Footnote 54
An extreme version of this vision could be seen as too optimistic. Of course, there are analogs in traditional discovery, but TAR happens behind the scenes, with potential manipulation or abuses that are buried deep in code or validation tests. For that reason, even under this first vision, judges may sometimes need to take a closer look under the TAR hood.
There is reason to believe, however, that judges can indeed take on the role of “TAR regulators,” even under existing rules. Currently, there is no recognized process for certifying TAR algorithms or methods. Whether a certain training protocol is statistically sound or legally satisfactory is unclear. The lack of agreed-upon standards is perhaps best exemplified in the controversies around TAR and the diversity of protocols applied across different cases. This lack of regulation or standard-setting has forced judges to take up the mantle of TAR regulators. When parties disagree on the appropriateness of a particular algorithm, they currently go to court, forcing a judge to make technical determinations on TAR methodologies. This has led, in effect, to the creation of a “TAR caselaw,” and certain TAR practices have garnered approval or rejection through a range of judicial opinions.
Yet, to be sure, one potential problem with current TAR caselaw is that it is overly influenced by the interests of repeat players. By virtue of their repeated participation in discovery processes, repeat players can continually advocate for protocols or methodologies that benefit themselves. Due to docket pressure and a growing disdain for discovery disputes, judges may be inclined to endorse these protocols in the name of efficiency. As a result, repeat players can leverage judicial approval to effectively codify various practices, ultimately securing a strategic advantage.
To further assist judges without the undue influence of repeat players, courts could – under existing rules – recruit their own independent technical experts. One priority would be for courts to find experts who have no relationship to the sale of commercial TAR software or to any law firm. Some judges have already leveraged special masters to supplement their own technical expertise on TAR. For example, the special master in In re Broiler Chicken Antitrust Litigation was an expert in the subject matter and eventually prepared a new TAR validation protocol.Footnote 55 Where disputes over TAR software involve the complex technicalities of machine learning, judges could also leverage Rule 706 of the Federal Rules of Evidence. This Rule allows the court to appoint an expert witness who is independent of both parties. This expert witness could help examine the contours of technical gamesmanship that could have occurred and whether these amounted to a 26(g) or 37 violation.
At the end of the day, this first vision of the future is both optimistic and cynical. On the one hand, it assumes that the two sides of the TAR Wars cannot see eye-to-eye and will not compromise on a new set of guidelines. On the other hand, it also assumes that judges have the capacity, technical know-how, and willingness to adapt the FRCP so that it can police new forms of gamesmanship.
5.3.2 Vision 2: New Rules, New Games?
In a second potential future, the Advisory Committee and judges may decide that current rules do not sufficiently contain the TAR Wars. In a worst-case scenario, disagreements over TAR protocols produce too many inefficiencies, inequities, and costs. Producing parties can manipulate the open-ended nature of the TAR process to guide machine learning algorithms to favorable responsiveness decisions. And requesting parties, for better or worse, may dispute the effectiveness of nearly any TAR system, seeking more disclosure than producing parties find reasonable or protocol changes that are too costly to implement.Footnote 56 In this case, the only lever to turn to would be significant reform of the rules to police gamesmanship and to regulate the increasing technical complexity of discovery.
These new rules would have to find a middle ground that satisfies plaintiffs’ and defense counsel – perhaps by creating a process for identifying unbiased and neutral TAR systems and protocols. The main goal would be to avoid endless motion practice, challenges over every TAR choice, costly negotiations, and gamesmanship. Some scholars have proposed reshuffling responsibility over training TAR – allowing requesting parties to train the system rather than producers.Footnote 57 But giving requesting parties this kind of unprecedented control would allow them to exploit all the vulnerabilities discussed above. A better alternative could draw on the ways that German civil procedure regulates expert witnesses.Footnote 58 The German Civil Procedure Code “distinguishes between (lay) witnesses and court-experts …. [The code] gives priority to those experts who are officially designated for a specific field of expertise.”Footnote 59 The court selects from a list of these “officially designated” expert witnesses who have already been vetted ex ante and are chosen to be as neutral as possible. Parties then only have narrow grounds to object to a selected expert. Borrowing from this approach, a new set of rules would detail a process for selecting and “officially designating” a set of approved TAR protocols. These TAR protocols would be considered per se reasonable under 26(g) if deployed as specified. Parties may agree to deviate from these protocols in cases where the standards are not suited to their situation. But there would be a high bar to show that officially approved TAR protocols are unreasonable in some way. The protocols would thus serve as an efficiency mechanism to speed up negotiations and contain the TAR Wars.
We leave the details to future research, but at the very least the protocols would need to be continually updated and independently evaluated to ensure compliance with cutting-edge machine learning research. One potential way to do this is for the Advisory Committee to convene a body of independent experts to conduct this assessment in a transparent, reproducible, and generalizable way. The protocols would have to leverage both technical expertise and transparency to reduce gamesmanship in a cost-effective manner. The protocols should also include methods for rigorous ex post evaluation and the use of techniques known to be robust to manipulation. Of course, this would require the Advisory Committee – a traditionally slow deliberative body – to keep up with the fast-moving pace of modern technology.
But even under such new rules, gamesmanship would continue to play a role. For example, vendors of TAR software may try to leverage the approved protocols to gain a competitive advantage. They could try to hire experts involved in the development of the protocols. Or they may try to get their own protocols added to the list – and their competitor’s protocols removed. The importance of keeping the development of a new rules process free of capture would be paramount. Yet, even without capture of the protocols, there are bound to be gaps that can be exploited. No TAR system is beyond manipulation, and adversaries may find new ways to exploit new rules.
5.3.3 Vision 3: Forget the Rules, New Technical Systems
Finally, future technical developments in TAR could potentially minimize gamesmanship, obviating the need for any new rules at all. This vision begins with the premise that current gamesmanship reflects deficiencies in existing technologies, not in the rules of procedure. If that is true, the development of model architectures and training regimes that are more robust to spurious correlations would diminish many of the games we discussed above, including hidden stratification and data underrepresentation. Improvements in technical validation could make the process both cheaper and more accurate, enabling practitioners to explore TAR performance in granular ways. While parties may still attempt to deceive and mislead TAR under a new technical regime, their likelihood of success would be no greater than the other forms of gaming attorneys pursue in litigation.
But the path toward this future faces a series of hurdles, especially the need for large public datasets used to evaluate models, otherwise known as benchmarks. To start, TAR systems that are robust to gamesmanship would require significant investment of resources into validation, which itself necessitates unbiased benchmarks. Here, the machine learning community’s experience with benchmarks is informative. Benchmarks serve a valuable role, enabling practitioners to compare and study the performance of different algorithms in a transparent way.Footnote 60 To prove the efficacy of a particular method, practitioners must show high performance on recognized benchmarks.Footnote 61 But computer scientists have noted that without continual refinement, benchmarks can themselves be gamed or provide misleading estimations of performance.Footnote 62
TAR’s current benchmarks evoke many of the concerns raised by computer scientists. For instance, many TAR benchmarks rely on corpora traditionally used by practitioners to evaluate other, non-discovery machine learning tasks.Footnote 63 Hence, it is unclear whether they reflect the nuances and complications of actual discovery processes. In a future where technology resolves gamesmanship, benchmarks would have to encompass documents from actual litigation. Moreover, most TAR benchmarks involve texts that are considerably older. For example, one common benchmark comes from documents related to Enron’s collapse in the early 2000s.Footnote 64 As a result of their age, the documents fail to capture some of the more modern challenges of discovery, like social media messages and multilingual corpora.
Improved benchmarks would benefit TAR in many ways. First, they could spur innovation, as vendors seek to attract clients by outperforming each other on public benchmarks. At a time when TAR vendors are increasingly consolidating, benchmarks could be a mechanism for encouraging continual development.Footnote 65 Second, they could produce an informal version of the pre-approved TAR protocol regime described in the last section. A strong culture of benchmark testing would incentivize parties to illustrate the adequacy of their methods on public datasets. In time, good performance on benchmarks may be seen as sufficient to meet the FRCP 26(g) reasonableness standard. Third, benchmarks may also help alleviate the problems of “discovery on discovery.” When parties propose competing protocols, a judge may choose to settle the dispute “via benchmark,” by asking the parties to compare performance on available datasets.
Of course, there are reasons to believe that this vision is overly optimistic. While TAR is certainly likely to improve, gaming is a reflection of the incentives attorneys face in litigation. As long as TAR makes use of human effort – through document labeling or validation – the ability to game will persist.
We thus offer a concluding thought. Technologists can make significant investments to reduce the amount of human input in TAR systems. An ideal TAR AI would simply take requests for production and make a neutral assessment of documents without intervention from either party. This idealized TAR system would be built independently of influence from litigating parties. Such a system is possible in the near future. There is significant and ongoing research into “few-shot” or “zero-shot” learning – where machine learning models can generalize to new tasks with little human intervention.Footnote 66 If carefully constructed, such a TAR system could reduce costs and build trust in the modern discovery process. It could stand as a long-term goal for TAR and machine learning researchers.
It’s well known that, in US civil litigation, the haves come out ahead.Footnote 1 For a slew of reasons – including their ready access to specialists, low start-up costs, and ability to play for rules (not just immediate wins) – well-heeled, repeat-play litigants tend to fare better than their one-shot opponents.
But look closely at the data, and it seems that the tilt of the civil justice system may be getting steeper. In 1985, the plaintiff win rate in civil cases litigated to judgment in federal court was a more-than-respectable 70 percent. In recent decades, that figure has hovered at or below 40 percent.Footnote 2 Meanwhile, there’s state-level evidence that when plaintiffs win, they recover less. According to the Bureau of Justice Statistics, the median jury award in state court civil cases was $72,000 in 1992 but only $43,000 in 2005 – a drop (in inflation-adjusted dollars) of 40.3 percent.Footnote 3
The composition of the country’s civil dockets is also telling – and increasingly skewed. Among civil cases, debt collection claims, which typically feature a repeat-play debt collector against a one-shot debtor, are on the rise. According to Pew Charitable Trusts: “From 1993 to 2013, the number of debt collection suits more than doubled nationwide, from less than 1.7 million to about 4 million, and consumed a growing share of civil dockets, rising from an estimated 1 in 9 civil cases to 1 in 4.”Footnote 4 By contrast, tort cases – the prototypical claim that pits a one-shot individual plaintiff against a repeat-play (corporate or governmental) defendant – are falling fast. Personal injury actions accounted for roughly 20 percent of state civil caseloads in the mid-1980s.Footnote 5 Now they make up a measly 4 percent.Footnote 6
What might explain these trends? Possible culprits are many. Some of the tilt might be explained by shifts in the composition of case flows, toward cases where plaintiffs tend to fare poorly (prisoner rights litigation, for example).Footnote 7 Changes in state and federal judiciaries – perhaps part and parcel of increasingly politicized state and federal judicial selection processes – might also matter. Souring in juror sentiment – traceable to the public’s relentless exposure to tales of “jackpot justice” and frivolous claiming – has played a role.Footnote 8 And judges’ day-to-day conduct has changed. Embracing “managerial judging,” judges oversee trials differently than they did in days of yore, and there are hints that certain types of hands-on intervention – time limits, bifurcation, and restrictions on voir dire – might have a pro-defendant cast.Footnote 9
Beyond this menu of possibilities, more cases than ever are now being formally resolved, not through trial, but through pre-trial adjudications – and this tends to benefit defendants. Following the Supreme Court’s creation of a plausibility standard in Bell Atlantic Corp. v. Twombly and Ashcroft v. Iqbal, motions to dismiss are on the rise.Footnote 10 Adjudication via Rule 56 has also trended upward. In 1975, more than twice as many cases were resolved by trial as were resolved by summary judgment.Footnote 11 Now the ratio of cases resolved in federal courts by summary judgment versus trial is heavily skewed toward the former, perhaps on the order of six-to-one.Footnote 12
Finally, substantive law has become less congenial to plaintiffs. At the federal level, the Private Securities Litigation Reform Act and the Prison Litigation Reform Act, among others, make life harder for plaintiffs.Footnote 13 Alongside Congress, the Supreme Court has issued a raft of defendant-friendly decisions – tightening standing, restricting expert testimony, eliminating aider and abettor liability, expanding the preemptive effect of regulatory activity, curbing punitive damages, shunting claims to arbitration, and limiting class certification.Footnote 14 State legislatures, too, have enacted significant tort reform measures, including damage caps, restrictions on contingency fees, alterations to the collateral source rule and joint and several liability, medical malpractice screening panels, and extensions of statutes of repose.Footnote 15
Enter legal tech. Surveying this altered civil justice ecosystem, some suggest that legal tech can be a savior and great leveler, with the capacity to “democratize” litigation and put litigation’s haves and have-nots on a more equal footing.Footnote 16 It can do this, say its champions, by empowering smaller firms and solo practitioners to do battle with their better-financed foes.Footnote 17 Additionally, legal tech might cut the cost of legal services, putting lawyers within reach of a wider swath of people, including those currently priced out of the legal services marketplace.Footnote 18 Meanwhile, even when Americans do go it alone, other legal tech advances – including tools that help write or interpret contracts or resolve low-level consumer disputes – might help them to enter the litigation arena with more information, and possibly more leverage, than before.Footnote 19
We see things differently. We agree that tech tools are coming. We also agree that some of these tools may pay dividends on both sides of the “v.,” promoting transparency, efficiency, access, and equity. But other, arguably more powerful, tools are also here. And many of the most potent are, and are apt to remain, unevenly distributed. Far from democratizing access to civil justice and leveling the playing field, the innovation ecosystem will, at least over the near- to medium-term, confer yet another powerful advantage on the haves. Powerful repeat players, leveraging their privileged access to data (especially confidential claim-settlement data) and their ability to build the technical know-how necessary to mine and deploy that data, will propel themselves yet further ahead.
The remainder of this chapter unfolds as follows. To ground our analysis, Section 6.1 canvasses legal tech, not in a hazy distant future, but in the here and now. In particular, Section 6.1 details three legal tech innovations: (1) the algorithmic e-discovery tools that fall under the umbrella of technology-assisted review, or TAR; (2) Colossus, a claim assessment program that, for two decades, has helped the nation’s largest auto insurers to expeditiously (though controversially) resolve bodily injury claims; and (3) what we call, for lack of a better term, the Walmart Suite, a collection of increasingly sophisticated tools developed by tech companies and BigLaw firms, working in tandem, to rationalize the liability of large corporations in recurring areas of litigation such as slip-and-falls and employment disputes. All three AI-powered tools are already in use. And all three hold the potential to affect the civil justice system in significant (though often invisible) ways.
Section 6.2 steps back to evaluate these innovations’ broader impact. Here, our assessment of TAR is mixed – and contingent. Fueled by TAR, litigation discovery may, over time, emerge more transparent, more efficient, and more equitable than before. This improved equilibrium is by no means assured, and, as we explain below, bleaker outcomes are also possible. But one can at least glimpse, and argue about, a range of first- and second-best outcomes, where more relevant documents are produced, at lower cost, at faster speed, and with less gamesmanship.
Our assessment of Colossus and the Walmart Suite is more dour. Colossus shows that, using increasingly sophisticated data science tools, repeat players are already using their tech savvy and their stranglehold on confidential claims data to drive case settlements downward. With Colossus, insurers are reportedly able to settle auto accident injury cases for roughly 20 percent less than they did before adopting the software. Meanwhile, the Walmart Suite shows that well-heeled repeat players are not just dipping their toes into the litigation waters; they are already in deep – and, in fact, are already able to settle out unfavorable cases and litigate winners, fueling a dynamic we call the “litigation of losers.” As strong cases are culled from the system via early resolution and only the weak proceed to visible, public adjudication, the litigation of losers threatens to further skew the evolution of damage determinations and substantive law.
A final Section 6.3 asks how judges, scholars, and policy makers ought to respond. We consider, and mostly reject, three possible paths forward: reforms to substantive or procedural law, a broad democratization of data, and “public option” legal tech. These fixes, we suggest, are facially attractive but ultimately infeasible or unachievable. Instead, absent a softening of partisan gridlock or renewed public appetite for reform, it is judges, applying ordinary procedural law, who will be the frontline regulators of a newly digitized litigation ecosystem. And, in classic common law fashion, they’ll need to make it up as they go, with only a few ill-fitting tools available to blunt legal tech’s distributive effects.
6.1 Three Examples: TAR, Colossus, and the Walmart Suite
Despite futurist talk of robo-judges and robo-lawyers, litigation systems have always been, in an abstract sense, just machines for the production of dispute resolution. There are inputs (case facts, law) and outputs (judgments, or settlements forged in their shadow). To that extent, the myriad complex procedures that govern civil litigation – that sprawling menu of commands, practices, and norms – are, at their core, just rules that shape the acquisition, exchange, and cost of information as litigants jockey for advantage.
With this framing in mind, few could deny that legal tech tools will have a significant effect on the civil justice system. But how, exactly, will the civil justice system change, in response to the tools’ adoption?
To gain leverage on that question, we offer three real-world examples of a growing array of legal tech tools that supplement and supplant lawyers’ work: (1) new algorithmic e-discovery tools that, as already noted, pass under the label of technology-assisted review, or TAR; (2) Colossus, the go-to claim-pricing tool used by the nation’s casualty and property insurers; and (3) a cutting-edge set of tools we dub the Walmart Suite that both generates pleadings and papers and predicts case outcomes in certain recurring areas of litigation.
6.1.1 Technology-Assisted Review (TAR)
Used by lawyers on both sides of the “v.,” TAR refers to software designed to streamline and simplify the classification and review of documents, primarily through the use of machine-learning techniques.
Though TAR tools vary in their construction and algorithmic particulars, most operate with some human supervision. Virtually all require lawyers to hand-code, or “label,” a subset of a corpus of documents for relevance or privilege (the “seed set”). Then, those documents are used to train a machine-learning system to categorize additional documents. This process is iterative and may repeat over multiple rounds of labeling and training, until lawyers are satisfied that all documents have been correctly categorized.Footnote 20
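The iterative label–train–relabel loop just described can be sketched in miniature. The code below is a deliberately toy illustration, not any vendor’s actual system: the “model” is a crude term-overlap scorer standing in for the machine-learning classifiers real TAR platforms use, and every function name, threshold, and round count is our own invention.

```python
# Toy sketch of the iterative TAR workflow: train on a lawyer-labeled
# seed set, rank unlabeled documents, have a lawyer label the top
# candidates, and retrain. All parameters are hypothetical.

def train_model(labeled):
    """'Train' on labeled docs: collect terms seen in relevant ones."""
    relevant_terms = set()
    for doc, is_relevant in labeled:
        if is_relevant:
            relevant_terms |= set(doc.lower().split())
    return relevant_terms

def score(doc, relevant_terms):
    """Score a document by its overlap with terms from relevant exemplars."""
    terms = set(doc.lower().split())
    return len(terms & relevant_terms) / max(len(terms), 1)

def tar_review(corpus, seed_labels, lawyer_label, rounds=3, batch=2):
    """Iterate until the round budget or the corpus is exhausted, then
    flag every document the final model scores above a cutoff."""
    labeled = list(seed_labels)
    unlabeled = [d for d in corpus if d not in {doc for doc, _ in labeled}]
    for _ in range(rounds):
        model = train_model(labeled)
        unlabeled.sort(key=lambda d: score(d, model), reverse=True)
        for doc in unlabeled[:batch]:      # lawyer reviews top candidates
            labeled.append((doc, lawyer_label(doc)))
        unlabeled = unlabeled[batch:]
        if not unlabeled:
            break
    model = train_model(labeled)
    return [d for d in corpus if score(d, model) > 0.3]

corpus = ["merger pricing memo", "lunch menu",
          "merger due diligence", "holiday party plans"]
seed = [("merger pricing memo", True), ("lunch menu", False)]
print(tar_review(corpus, seed, lambda d: "merger" in d))
# → ['merger pricing memo', 'merger due diligence']
```

In a real deployment, the stopping point would be governed by statistical quality-control measures, not a fixed round count, and the classifier would be far more sophisticated – but the human-in-the-loop structure is the same.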
Even the most basic forms of TAR represent a big leap from its predecessors. Prior to TAR’s advent, document discovery required lawyers and their non-lawyer staffs to hunch over bankers’ boxes or filing cabinets, and then, in time, to manually flip through scanned documents on computer screens, reviewing thousands or even millions of documents one-by-one.Footnote 21 Not surprisingly, the cost of this hands-on review was exorbitant; in 2000, it was estimated that discovery accounted for as much as one-third to one-half of total costs where discovery was actively conducted, and perhaps significantly more in large-scale litigations.Footnote 22
In the early aughts, both keyword searches and outsourcing came to the fore to address these spiraling costs. But neither proved wholly satisfactory. Keyword searching enabled parties to cut costs by restricting manual review to only those documents containing specific keywords, but search yields were worryingly incomplete.Footnote 23 Outsourcing – the move to send discovery to less-expensive contract lawyers in out-of-the-way US cities or abroad – was similarly fraught. Supervision was difficult; parties fretted about conflicts, confidentiality, and rules of multijurisdictional practice; and quality was wanting.Footnote 24
As against those halfway innovations, TAR’s advantages are profound. Estimates of TAR’s efficacy vary and are hotly contested, but the general view is that implemented well – and this is a key qualifier – TAR systems are as good as manual, eyes-on review in terms of recall (i.e., the proportion of relevant documents in the total pool of documents that are accurately identified as relevant) but systematically better in precision (i.e., the proportion of documents flagged that are in fact relevant). The far bigger difference is efficiency: Compared to its conventional counterpart, TAR achieves all of this at a fraction of the cost.Footnote 25
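The recall and precision metrics just defined reduce to simple set arithmetic. A minimal illustration, with invented document identifiers:

```python
# Recall and precision for a hypothetical review, where `truth` is the
# full set of relevant documents and `flagged` is the set a TAR system
# marked as responsive.

def recall(flagged, truth):
    """Share of truly relevant documents that the review identified."""
    return len(flagged & truth) / len(truth)

def precision(flagged, truth):
    """Share of flagged documents that are in fact relevant."""
    return len(flagged & truth) / len(flagged)

truth = {"doc1", "doc2", "doc3", "doc4"}    # truly relevant
flagged = {"doc1", "doc2", "doc3", "doc9"}  # marked responsive by TAR

print(recall(flagged, truth))     # 0.75 -- 3 of 4 relevant docs found
print(precision(flagged, truth))  # 0.75 -- 3 of 4 flagged docs relevant
```

The trade-off discussed below – calibrating a system to favor precision over recall – amounts to pushing the first number down while pushing the second up.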
Yet, TAR is not without controversy. Much of it stems from the fact that TAR, like any machine learning system, is a socio-technical “assemblage,” not a turnkey engine.Footnote 26 Attorneys must label and re-label documents as the system works its way toward a reliable model. An important implication is that, much like Colossus (described below), TAR systems are manipulable by humans in their construction and tuning.Footnote 27 As Diego Zambrano and co-authors detail elsewhere in this volume, this manipulation can run the gamut from outright abuse (e.g., fudging the labels lawyers apply to documentsFootnote 28 or rigging the selection, adjustment, or validation of modelsFootnote 29) to a more benign but still respondent-friendly calibration of the system to favor precision (the proportion of responsive documents among those in a production) over recall (the proportion of responsive documents identified).Footnote 30 As a result, and as discussed in more detail below, if litigation’s “haves” need not show their work to the other side, they can shade discovery to their advantage and use their better technology and technologists (if the other side can afford them at all) to make sure it sticks.Footnote 31
6.1.2 Colossus
For the nation’s casualty and property insurers, AI has not so much spawned new litigation tools as supercharged those already in use. The best example is Colossus, a proprietary computer software program marketed by Computer Sciences Corporation (CSC) that “relies on 10,000 integrated rules” to assist insurance companies – the ultimate repeat players – in the evaluation and resolution of bodily injury claims.Footnote 32 Initially developed in Australia and first used by Allstate in the 1990s, Colossus has grown in popularity, such that it has been utilized by the majority of large property and casualty insurers in the United States, including behemoths Aetna, Allstate, Travelers, Farmers, and USAA.Footnote 33
Colossus has radically changed the process of auto accident claims adjustment. By extension, it has profoundly altered how the tens of thousands of third-party bodily injury claims generated annually by American drivers, passengers, and pedestrians are processed and paid by US insurers.
Before Colossus, an experienced auto accident adjuster employed by Allstate or USAA would have assessed a personal injury claim using rough benchmarks, in a process that was more art than science. Namely, the adjuster would add up a victim’s “special damages” (chiefly, the victim’s medical bills) and multiply that total by a fixed multiplier – often two or three – to generate a default figure, called a “going rate” or “rule of thumb.”Footnote 34 Then, the adjuster would leaven that default figure with the adjuster’s knowledge and past practice, perhaps informed by a review of recent trial verdict reports, and possibly aided by “roundtabling” among the insurer’s veteran casualty claims professionals.Footnote 35
With Colossus, however, the same adjuster can now calculate a claim’s worth at a keystroke, after plugging in answers to a series of fill-in-the-blank-style questions. Or, as Colossus itself explains: “Through a series of interactive questions, Colossus guides your adjusters through an objective evaluation of medical treatment options, degree of pain and suffering, and the impact of the injury on the claimant’s lifestyle.”Footnote 36 To be sure, the data an adjuster must input in order to prime Colossus to generate a damage assessment is voluminous and varied. When inputting a claim, the adjuster accounts for obvious factors such as the date and location of the accident, alongside the claimant’s home address, gender, age, verified lost wages, documented medical expenses, nature of injury, diagnosis, and prognosis. Treatment – including MRI or X-ray images, prescriptions, injections, hospital admissions, surgeries, follow-up visits, and physical therapy – is also granularly assessed.Footnote 37 Then, against these loss variables, the adjuster must account for various liability metrics. Fault (in all its common law complexity) is reduced to “clear” or “unclear,” while the existence or nonexistence of “aggravating factors” (such as driver inebriation) is also considered, and, in a nod to the tort doctrine of anticipatory avoidable consequences, the adjuster must also input whether the claimant was buckled up.Footnote 38 Even the individual identity of the handling attorney, treating physician and/or chiropractor, and (if applicable) presiding judge is keyed in.Footnote 39
Once data entry is complete, Colossus assesses the claim in light of the enormous pool of data in its master database to generate a “severity point total.”Footnote 40 Then, aided by proprietary information specific to each insurer (based on each individual insurer’s “settlement philosophies and claims practice”Footnote 41), Colossus converts the point total into a recommended settlement range.Footnote 42 Insurance adjusters use this settlement range in their negotiations with unrepresented claimants or their counsel. Indeed, at some insurers, adjusters are not permitted to offer a sum outside the range, at least without a supervisor’s approval.Footnote 43 At others, adjusters are evaluated based on their ability to close files within Colossus-specified parameters.Footnote 44 The upshot, according to one insider: “Colossus takes the guess work out of an historically subjective segment of the claims process, providing adjusters with a powerful tool for improving claims valuation, consistency, increasing productivity and containing costs.”Footnote 45
Beyond these mechanics, allegations about further operational details abound. The most common is that, when customizing the software (the proprietary process that converts a “severity point total” into a settlement range), certain insurers “tune” Colossus to “consistently spit out lowball offers.”Footnote 46 Some insurers reportedly accomplish this feat by excluding from the database certain figures that, by rights, should be included (e.g., large settlements or verdicts).Footnote 47 Others get there, it is said, simply by turning dials downward to generate across-the-board haircuts of 10–20 percent.Footnote 48
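To make the alleged mechanics concrete, the sketch below models the point-total-to-settlement-range pipeline, including a “tuning” dial of the sort insurers are accused of turning down. Every number here – the point weights, the dollars-per-point conversion, the liability discount – is hypothetical; Colossus’ actual 10,000 rules and each insurer’s calibration are proprietary and confidential.

```python
# Stylized claim-valuation pipeline. All weights and conversion factors
# are invented for illustration; they are not Colossus' actual rules.

SEVERITY_POINTS = {               # hypothetical injury-factor weights
    "whiplash": 40,
    "surgery": 120,
    "physical_therapy_visit": 5,
}

def severity_total(claim):
    """Convert coded claim facts into a severity point total."""
    points = sum(SEVERITY_POINTS[f] * n for f, n in claim["factors"].items())
    if claim["fault"] == "unclear":   # crude liability discount
        points *= 0.7
    return points

def settlement_range(points, dollars_per_point=55.0, tuning=1.0):
    """Convert points into a recommended range; tuning < 1.0 models the
    alleged across-the-board haircut of 10-20 percent."""
    midpoint = points * dollars_per_point * tuning
    return (round(midpoint * 0.9), round(midpoint * 1.1))

claim = {"factors": {"whiplash": 1, "physical_therapy_visit": 12},
         "fault": "clear"}
pts = severity_total(claim)               # 40 + 60 = 100 points
print(settlement_range(pts))              # (4950, 6050)
print(settlement_range(pts, tuning=0.85)) # lower, 'tuned' range
```

The structural point is that the haircut lives in a single proprietary parameter: because the conversion step is invisible to claimants, a turned-down dial is indistinguishable, from the outside, from an honest valuation.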
As such, it appears that, in the hands of at least some insurers, Colossus has not only rationalized the resolution of personal injury claims and injected newfound objectivity, predictability, and horizontal equity into the claims resolution process. It has also systematically cut claims – to the benefit of repeat-play insurers and the detriment of their claimant-side counterparts.
6.1.3 The Walmart Suite
A third innovation combines elements of both TAR and Colossus. One exemplar under this umbrella, which we dub “the Walmart Suite,” given its development by Walmart in partnership with the law firm Ogletree Deakins and in concert with the tech company LegalMation, seeks to rationalize recurrent areas of litigation (think, employment disputes and slip-and-falls). It reportedly operates along two dimensions.Footnote 49 First, it reportedly generates pleadings and papers – including answers, discovery requests, and discovery objections – thus cutting litigation costs.Footnote 50 To that extent, the Suite might be thought akin to TAR in its ability to perform low-level legal cognitions and generate straightforward work product that previously required (human) lawyers. Second, and more provocatively, the Suite can evaluate key case characteristics, including the identity of plaintiffs’ counsel, and then offer a prediction about a case’s outcome and the likely expense Walmart will incur if the case is litigated, rather than settled.Footnote 51 The Suite thus seems to be a beefed-up Colossus, with a focus on slip-and-falls and employment disputes rather than auto accidents.
The advantages of such tools are seemingly substantial. LegalMation reports that a top law firm has used its tools to handle 5,000 employment disputes – and, in so doing, the firm realized a six- to eight-fold savings in preparing pleadings and discovery requests.Footnote 52 But these economies are only the beginning. Outcome prediction engines, commonly referred to as the “holy grail” of legal tech,Footnote 53 allow large entities facing recurring types of litigation to quickly capitulate (via settlement) where plaintiffs have the benefit of strong claims and talented counsel – and then battle to final judgment where plaintiffs are saddled with weak claims or less-competent counsel. In so doing, the Walmarts of the world can save today by notching litigation victories while conserving litigation resources. But they can simultaneously position themselves over the long haul, by skewing case outcomes, driving down damages, and pushing precedent at the appellate level. We return to these advantages below.
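The settle-strong, litigate-weak triage that such an engine enables reduces to a simple decision rule from the defendant’s perspective. The sketch below is ours, with invented figures; the Walmart Suite’s actual models are confidential.

```python
# Stylized settle-or-litigate triage. Probabilities, costs, and demands
# are hypothetical; the point is the decision rule, not the numbers.

def triage(win_prob, expected_judgment, litigation_cost, settlement_demand):
    """Compare the defendant's predicted cost of litigating (chance the
    plaintiff wins times the likely judgment, plus defense costs) to the
    plaintiff's settlement demand."""
    expected_loss = win_prob * expected_judgment + litigation_cost
    return "settle" if settlement_demand < expected_loss else "litigate"

# Strong claim, talented counsel: expected loss 105,000 exceeds the demand.
print(triage(0.8, 100_000, 25_000, 60_000))   # settle

# Weak claim or weak counsel: expected loss 35,000 is below the demand.
print(triage(0.1, 100_000, 25_000, 60_000))   # litigate
```

Run across thousands of recurring cases, this rule is what produces the “litigation of losers”: the strong claims vanish into confidential settlements, and only the weak ones reach public adjudication.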
6.2 The Promise and Peril of Legal Tech
Section 6.1 introduced three types of legal tools that have already entered the civil justice system. These tools – TAR, Colossus, and the Walmart Suite – are hardly the only legal tech applications dotting the American litigation landscape. But they help to define it, and they also permit some informed predictions about legal tech’s effect on the litigation playing field over the near- to medium-term.
Assessing these effects, this Section observes that TAR may help to level the litigation playing field and could even bring greater transparency to discovery disputes – although such a rosy result is by no means assured, and darker outcomes are also possible. Meanwhile, Colossus and the Walmart Suite both seem poised to drive case settlements downward and even fuel a dynamic we call the “litigation of losers,” in part because the data stores that drive them are, at least currently, so unevenly distributed.
6.2.1 TAR Wars: Proportionality and Discovery Abuse
For TAR, our appraisal is mixed – though the dynamics at play are not simple and our predictions less than ironclad. That said, we predict that the next decade will feature increasingly heated “TAR wars” waged on two fronts: proportionality and discovery gaming and abuse. If, on each front, there is sufficient judicial oversight (an admittedly big if), TAR might usher in a new era, where discovery emerges more efficient and transparent than before. But there is also the possibility that, like Colossus and the Walmart Suite, TAR will tilt the playing field toward repeat-play litigants. Here, we address these two fronts – and also these two divergent possible outcomes.
Proportionality: Will TAR’s efficiencies justify more expansive discovery? Or will these efficiencies yield a defendant-side surplus? Discovery has long been the 800-pound gorilla in the civil justice system, accounting for as much as one-third to one-half of all litigation costs in cases where discovery is actively employed.Footnote 54 High discovery costs, and the controversy surrounding those costs, have powered the creation of numerous rules and doctrines that constrain discovery’s scope.Footnote 55 One such rule – and the one we address here – is the “proportionality” requirement, that is, a requirement that a judge greenlight a discovery request only if the request is “proportional” to a case’s particular needs.Footnote 56
Applied to TAR, proportionality is tricky because TAR can yield gains in both efficiency and accuracy. For a requesting party (typically, the plaintiff), more efficient review justifies more expansive review, including document requests that, for instance, extend to a longer time horizon or to a wider net of document custodians. For a producing party (typically the defendant), however, accuracy gains mean that the requesting party will already get more relevant documents and fewer irrelevant ones, even holding constant the number of custodians or the scope of the search.Footnote 57 In short, TAR generates a surplus in both efficiency and accuracy, and the question becomes how best to allocate that surplus.Footnote 58
Given these dynamics, judges might employ the proportionality principle in one of two ways. Judges could recognize that the unit cost of discovery – the cost of each produced document – has declined and compensate by authorizing the requesting party’s more expansive discovery plan. If so, the cost of each produced document will drop, transparency into the underlying incident will (at least arguably) improve, and the overall cost of discovery will remain (roughly) constant. Judges, however, might take a different tack. Notwithstanding TAR’s efficiency advantages, judges might deny requesting parties’ motions to permit more expansive discovery, thus holding proportionality’s benchmarks firm. If so, TAR will cough up the same documents as before, but at a discount.
If trial judges permit producing parties to capture TAR’s cost-savings without compensating by authorizing more sweeping discovery plans, the effect on civil litigation, from the availability of counsel to settlement patterns, could be profound.Footnote 59 Lower total discovery costs, of course, might be a net social welfare gain. After all, a core premise of proportionality rules is that litigation costs, particularly discovery costs, are disproportionate to the social value of the dispute resolution achieved, and scarce social resources might be better spent on other projects. But shifts in discovery costs can also have distributive consequences. It is a core premise of litigation economics that “all things being equal, the party facing higher costs will settle on terms more favorable to the party facing lower costs.”Footnote 60 If TAR causes discovery costs to bend downward, TAR’s surplus – and, with it, any settlement surplus – will systematically flow toward the net document producers (again, typically, repeat-play defendants).Footnote 61 Such an outcome would yield a tectonic shift in the settlement landscape – hard to see in any particular case, but potentially quite large in aggregate. It will be as if Colossus’ dials have been turned down.
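The distributive point rests on a textbook settlement-range model: the plaintiff will accept any offer above her expected trial recovery net of her own costs, and the defendant will pay anything below its expected liability plus its costs. A stylized sketch, with hypothetical figures:

```python
# Textbook settlement-range model, with invented numbers, illustrating
# why an asymmetric cost reduction shifts settlements toward the party
# whose costs fell.

def settlement_midpoint(win_prob, judgment, plaintiff_cost, defendant_cost):
    """Plaintiff accepts anything above p*J - c_p; defendant pays anything
    below p*J + c_d; assume the parties split the difference."""
    low = win_prob * judgment - plaintiff_cost
    high = win_prob * judgment + defendant_cost
    return (low + high) / 2

# Before TAR: both sides bear heavy discovery costs.
print(settlement_midpoint(0.5, 200_000, 40_000, 40_000))  # 100000.0

# TAR cuts the producing defendant's costs while discovery scope holds
# firm: the bargaining range, and the expected settlement, shift down.
print(settlement_midpoint(0.5, 200_000, 40_000, 10_000))  # 85000.0
```

The per-case shift is modest; the claim in the text is that, aggregated across the run of cases in which repeat-play defendants are net document producers, it amounts to a tectonic change.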
The potential for abuse: TAR appears to be more susceptible to abuse than its analog counterpart. How will judges respond? The second TAR battleground will be discovery abuse and gaming. The fight will center on a core question: Can discovery rules generate enough trust among litigants to support TAR’s continued propagation, while, at the same time, mitigating concerns about gaming and the distributive concerns raised by such conduct?
Discovery abuse, of course, is not new. Nor is TAR uniquely vulnerable to discovery abuse.Footnote 62 Indeed, one of the easiest ways to manipulate a TAR system – the deliberate failure to flag (or “label”) a plainly responsive document – is no different from garden-variety discovery manipulation, in which lawyers simply omit obviously responsive and damaging documents or aggressively withhold borderline documents on relevance or privilege grounds. But, as Zambrano and co-authors note in Chapter 5, there is nevertheless good reason to believe that TAR might be especially prone to abuse – and that is a very serious problem in a system already steeped in mistrust.Footnote 63
TAR’s particular vulnerability to abuse flows from four facts. First, TAR operates at scale. In constructing a seed set, a single labeling decision could, in theory, prevent an entire species of document from coming to light.Footnote 64
Second, and relatedly, TAR can be implemented by small teams – a far cry from the sprawling associate armies that previously performed eyes-on document reviews in complex cases. This means that, in a TAR world, deliberate discovery abuse requires coordination among a smaller set of actors. If discovery abusers can be likened to a cartel, keeping a small team in line is far easier than ensuring that a sprawling network of co-conspirators stays silent. Moreover, TAR leans not just on lawyers but on technologists – and the latter, unlike the former, might be less likely to take discovery obligations seriously, as they are not regulated by rules of professional conduct, need not participate in continuing legal education, and arguably have a less socialized sense of duty to the public or the court.
Third, TAR methods may themselves be moving toward more opaque and harder-to-monitor approaches. In its original guise – TAR 1.0 – lawyers manually labeled a “seed set” to train the machine-learning model. With access to that “seed set,” a litigation adversary could, in theory, reconstruct the other side’s work, identifying calls that were borderline or seemed apt to exclude key categories of documents. TAR 2.0, by contrast, starts with a small set of documents and uses machine learning to turn up lists of other candidates, which are then labeled and fed back into the system. TAR 2.0 thus renders seed set construction highly iterative – and, in so doing, makes it harder for an adversary or adjudicator to review or reconstruct. TAR 2.0, to invoke a concept in a growing “algorithmic accountability” literature, may, as a consequence, be less contestable by an adversary who suspects abuse.Footnote 65
Fourth and finally, while TAR is theoretically available on both sides of the “v.,” technical capacity is almost certainly unevenly distributed, since defense firms tend to be larger than plaintiffs’ firms – and are more richly capitalized. With these resource advantages, if defendants are tempted to engage in tech-driven litigation abuse, they (and their stable of technologists) might be able to do so with near impunity.
The tough question becomes: How should judges react to safeguard the integrity of discovery processes? Here, judges have no shortage of tools, but all have drawbacks. Judges can, for example, compel the disclosure of a seed set, although such disclosures are controversial, since the full seed set necessarily includes both documents that lawyers labeled as relevant and documents they labeled irrelevant to the claim.Footnote 66 Meanwhile, disclosure of TAR inputs arguably violates the work product doctrine, established in the Supreme Court’s 1947 decision in Hickman v. Taylor and since baked into Rule 26(b)(3), which protects against disclosure of “documents and tangible things that are prepared in anticipation of litigation.”Footnote 67 And, a call for wholesale disclosure – ever more “discovery about discovery” – seems poised to erode litigant autonomy and can itself be a bare-knuckled litigation tactic, not a good-faith truth-seeking device.
Worse, if analog discovery procedures are left to party discretion absent evidence of specific deficiencies, but a party’s use of TAR automatically kicks off protracted ex ante negotiations over protocols or onerous back-end “report card” requirements based on various quality-control metrics, there is the ever-present risk that this double standard will cause parties to throw up their hands. To the extent TAR’s benefits are overshadowed by expensive process-oriented disputes, investment in TAR will eventually stall out, depriving the system of its efficiencies.Footnote 68 Yet, the opposite approach is just as, if not more, worrisome. If judges, afraid of the above, do not act to police discovery abuse – and this abuse festers – they risk eroding the integrity of civil discovery and, by extension, litigants’, lawyers’, and the public’s faith in civil litigation.
Time will tell if judges can steer between these possibilities. But if they can, then out of these two gloomy visions comes a glimmer of light. If judges can help mint and then judiciously apply evenhanded protocols in TAR cases, then perhaps the system could end up better off than the analog system that TAR will steadily eclipse. Civil discovery could be one of those areas where, despite AI’s famous “black box” opacity, digitization yields a net increase in transparency and accountability.Footnote 69
6.2.2 Colossus and the Walmart Suite: The Litigation of Losers
When it comes to the slant of the civil justice system, our assessment of the likely effect of Colossus and the Walmart Suite is more dour.
Colossus: Reduction via brute force. The impact of Colossus on the civil justice system seems fairly clear and not particularly contingent. Colossus’ advent has certain undeniable benefits, injecting newfound predictability, consistency, objectivity, and horizontal equity into the claims resolution process. It has also, probably, reduced the monies paid for fraudulent or “built” claims,Footnote 70 as well as the odds that claim values will be influenced by improper factors (racial bias, for example).Footnote 71 Finally, it has, possibly, driven down the driving public’s insurance premiums – though there’s little reliable evidence on the point.
But, alongside these weighty advantages, it does seem that Colossus has also reduced claim payments quite significantly, using something like brute force. When Allstate rolled out a new Colossus-aided claims program with the help of McKinsey & Co., the consulting firm’s stated goal was to “establish[ ] a new fair market value” for such injuries.Footnote 72 It appears that that aim was achieved. A later McKinsey review of Allstate found: “The Colossus sites have been extremely successful in reducing severities, with reductions in the range of 20 percent for Colossus-evaluated claims.”Footnote 73 Nor was this dynamic confined, necessarily, to Allstate. Robert Dietz, a fifteen-year veteran of Farmers Insurance, has explained, for example: “My vast experience in evaluating claims was replaced by values generated by a computer. More often than not, these values were not representative of what I had experienced as fair and reasonable.”Footnote 74
The result is that, aided by Colossus, insurance companies are offering less to claimants for comparable injuries, on a take-it-or-leave-it basis. And, though one-shot personal injury (PI) lawyers could call insurance companies’ bluff and band together to reject these Colossus-generated offers en masse, in the past two decades, they haven’t.
Their failure to do so should not be surprising. Given persistent collective action problems and yawning information asymmetries (described in further detail below), one would not expect disaggregated PI lawyers, practicing alone or in small firms, to mount a coordinated and muscular response, especially since doing so would mean taking a significant number of claims to trial, which poses many well-known and formidable obstacles. First, some portion of PI lawyers operate in law firms (called “settlement mills”) and do not, in fact, have the capacity to take claims to trial.Footnote 75 Second, many auto accident claimants need money quickly and do not have the wherewithal to wait out attendant trial delays. And third, all PI lawyers are attuned to the stubborn economics of auto accident litigation: As of 2005, the median jury trial award in an auto case was a paltry $17,000, which would yield only about $5,500 in contingency fees, a sum that is simply too meager to justify frequent trials against well-financed foes.Footnote 76 This last point was not lost on McKinsey, which, in a presentation to Allstate, encouraged: “Win by exploiting the economics of the practice of law.”Footnote 77
The Walmart Suite and the litigation of losers. The Walmart Suite illustrates another dynamic, which we dub the “litigation of losers.” In the classic article, Why the Haves Come Out Ahead, Marc Galanter presciently observed that repeat players could settle out bad cases “where they expected unfavorable rule outcomes” and litigate only the good ones that are “most likely to produce favorable results.” Over time, he concluded, “we would expect the body of ‘precedent’ cases, that is, cases capable of influencing the outcome of future cases – to be relatively skewed toward those favorable to [repeat players].”Footnote 78
The Walmart Suite shows that Galanter’s half-century-old prediction is coming to pass, fueled by AI-based software he couldn’t have imagined.Footnote 79 And, we anticipate, this isn’t the end of it. In recurring areas of litigation, we are likely to see increasingly sophisticated outcome prediction tools that will draw ever-tighter uncertainty bands around anticipated outcomes. Like the Walmart Suite, these tools are reliant on privileged access to confidential claim settlement data, which only true repeat players will possess.
The effect of this evolution is profound, for, as outcome prediction tools percolate (at least in the hands of repeat defendants/insurers), only duds will be litigated – and this “litigation of losers” will skew – indeed, is almost certainly already skewing – the development of substantive law. The skew will happen because conventional wisdom, at least, holds that cases settle in the shadow of trial – which means that, to the extent trial outcomes tilt toward defendants, we would expect that settlements, too, will display a pro-defendant slant.Footnote 80 Damages will also be affected. To offer but one concrete example, in numerous states, a judge evaluates whether damages are “reasonable” by assessing what past courts have awarded for similar or comparable injuries.Footnote 81 To the extent the repository of past damages reflects damages plaintiffs have won while litigating weak or enfeebled claims, that repository will, predictably, bend downward, creating a progressively more favorable damages environment for defendants.
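The skew can be illustrated with a toy example. Suppose ten comparable claims of ascending strength; a prediction-equipped repeat player settles the strong ones confidentially, so only the weak ones generate public awards. All figures below are invented:

```python
# Toy illustration of the 'litigation of losers' skew: when only weak
# claims reach public adjudication, the public record of awards bends
# downward relative to the true distribution of claim values.

true_values = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]  # expected award, $000s

# Without triage, the public record reflects the full distribution.
full_average = sum(true_values) / len(true_values)        # 55.0

# With triage, the strong claims settle confidentially; only the
# weakest ones are litigated to a visible, public outcome.
litigated = [v for v in true_values if v <= 40]
observed_average = sum(litigated) / len(litigated)        # 25.0

# A judge benchmarking 'reasonable' damages against the public record
# now sees a figure well below the true average claim value.
assert observed_average < full_average
```

The feedback loop is the worry: tomorrow’s “reasonable” damages are benchmarked against today’s skewed awards, so each round of triage lowers the baseline for the next.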
To be sure, there are caveats and counter-arguments. Models of litigation bargaining suggest that a defendant with privileged information will sometimes have incentives to share that information with plaintiffs in order to avoid costly and unnecessary litigation and achieve efficient settlements.Footnote 82 Additionally, while repeat players have better access to litigation and settlement data, even one-shotters don’t operate entirely in the dark.Footnote 83 But a simple fact remains: Even a slow burn of marginally better information, and marginally greater negotiation leverage, can have large aggregate effects across thousands and even millions of cases.
6.3 What to Do?
As the litigation playing field tilts under legal tech’s weight, there are some possible responses. Below, we start by briefly sketching three possible reforms that are facially plausible but, nevertheless, in our view, somewhat infeasible. Then, we offer a less attractive option – judicial discretion applied to existing procedural rules – as the most likely, though bumpy, path forward.
6.3.1 Plausible but Unlikely Reforms
Rewrite substantive or procedural law. First, we could respond to the skew that legal tech brings by recalibrating substantive law. Some combination of state and federal courts and legislatures could, for example, relax liability standards (for instance, making disparate impact job discrimination easier to prove), loosen restrictions on punitive damages, repeal damage caps, return to a world of joint and several liability, restore aider-and-abettor liability, abolish qualified immunity, and resurrect the collateral source rule.
Whatever the merit of a substantive law renaissance, we are quite bearish on the possibility, as the obstacles blocking such an effort are formidable and, over the near- to medium-term, overwhelming. Substantive laws are sticky and salient, especially in a political system increasingly characterized by polarization and legislative gridlock.Footnote 84 Even in less-polarized subfederal jurisdictions, it would be hard to convince state legislators and (often elected) judges to enact sweeping reforms without strong support from a public that clings to enduring but misguided beliefs about “jackpot justice” and frivolous claiming.Footnote 85
Nor are federal courts, or a Supreme Court newly stocked with Trump-era appointees, likely to help; to the contrary, they are likely to place barriers in front of litigation-friendly legislative efforts.Footnote 86 And procedural rules, though less politically salient, will also be hard to change, particularly at the federal level given the stranglehold of conservative judges and defense-side lawyers on the process of court-supervised rulemaking.Footnote 87
Democratize the data. Second, we could try to recalibrate the playing field by expanding litigants’ access to currently confidential data. As it stands, when it comes to data regarding the civil justice system, judges, lawyers, litigants, and academics operate almost entirely in the dark. We do not know how many civil trials are conducted each year. We don’t know how many cases go to trial in each case category. And, we don’t know – even vaguely – the outcome of the trials that do take place.Footnote 88 Furthermore, even if we could know what happens at trial (which we don’t) or what happens after trial (which we don’t), that still wouldn’t tell us much about the much larger pool of claims that never make it to trial and instead are resolved consensually, often before official filing, by civil settlements.
This is crucial, for without information about those millions of below-the-radar settlements, the ability to “price” a claim – at least using publicly available data – approaches zero. As Stephen Yeazell has aptly put it:
[I]n the U.S. at the start of the twenty-first century, citizens can get reliable pricing information for almost any lawful transaction. But not for civil settlements. We can quickly find out the going price of a ten-year old car, of a two-bedroom apartment, or a souvenir of the last Superbowl, but one cannot get a current “market” quote for a broken leg, three weeks of lost work, and a lifetime of residual restricted mobility. Nor for any of the other 7 million large or the additional 10 million smaller civil claims filed annually in the United States. We simply do not know what these are worth.Footnote 89
Recognizing this gap, Computer Sciences Corp. (the maker of Colossus) and Walmart and its tech and BigLaw collaborators are working to fill it. But they have filled it only for themselves – and, in fact, they have leveraged what amounts to their near-total monopoly on settlement data to do so. Indeed, some insurers’ apparent ability to “tune” Colossus rests entirely on the fact that plaintiffs cannot reliably check insurance companies’ work – and so insurers can, at least theoretically, “massage” the data with near impunity.
Seeing the status quo in this light, of course, suggests a solution: We could try to democratize the data. Taking this tack, Yeazell has advocated for the creation of electronic databases whereby basic information about settlements – including, for instance, the amount of damages claimed, the place suit was filed, and the ultimate settlement amount – would be compiled and made accessible online.Footnote 90 In the same vein, one of us has suggested that plaintiffs’ attorneys who work on a contingency fee basis and seek damages in cases for personal injury or wrongful death should be subject to significant public disclosure requirements.Footnote 91
Yet, as much as democratizing the data sounds promising, numerous impediments remain – some already introduced above. The first is that many “cases” are never actually cases at all. In the personal injury realm, for example, the majority of claims – in fact, approximately half of claims that involve represented claimants – are resolved before a lawsuit is ever filed.Footnote 92 Getting reliable data about these settlements is exceptionally difficult. Next, even when cases are filed, some high proportion of civil cases exit dockets via an uninformative voluntary dismissal under Rule 41 or its state-level equivalents.Footnote 93 Those filings, of course, may be “public,” but they reveal nothing about the settlement’s monetary terms.Footnote 94 Then, even on those relatively rare occasions when a document describing the parties’ terms of settlement is filed with the court, public access remains limited. Despite a brewing “open court data” movement, court records from the federal level on down sit behind “walls of cash and kludge.”Footnote 95 Breaking through – and getting meaningful access even to what is “public” – is easier said than done.
“Public option” legal tech. A third unlikely possibility is “public option” legal tech. Perhaps, that is, the government could fund the development of legal tech tools and make them widely available.
When it comes to TAR, public option legal tech is not hard to imagine. Indeed, state and federal judiciaries already feature magistrate judges who, on a day-to-day basis, mainly referee discovery disputes. It may only be a small step to create courthouse e-discovery arms, featuring tech-forward magistrate judges who work with staff technologists to perform discovery on behalf of the parties.
Public option outcome-prediction tools that can compete with Colossus or the Walmart Suite are harder to imagine. Judges, cautious Burkeans even compared to the ranks of lawyers from which they are drawn, are unlikely to relax norms of decisional independence or risk any whiff of prejudgment anytime soon. The bigger problem, however, will be structural, not just legal-cultural. The rub is that, like one-shot litigants and academics, courts lack access to outcome data that litigation’s repeat players possess. Short of a sea change in the treatment of both pre- and post-suit secret settlements, courts, no less than litigation’s have-nots, will lack the information needed to power potent legal tech tools.Footnote 96
6.3.2 Slouching Toward Equity: Judicial Procedural Management with an Eye to Technological Realities
Given the above obstacles, the more likely (though perhaps least attractive) outcome is that judges, applying existing procedural rules, will be the ones to manage legal tech’s incorporation into the civil justice system. And, as is often the case, judges will be asked to manage this tectonic transition with few rules and limited guidance, making it up mostly as they go.
The discussion of TAR’s contingent future, set forth above in Part 6.2.2, offers a vivid depiction of how courts, as legal tech’s frontline regulators, might do so adeptly; they might consider on-the-ground realities when addressing and applying existing procedural doctrines. But the TAR example also captures a wider truth about the challenges judges will face. As already noted, legal tech tools that cut litigation costs and hone information derive their value from their exclusivity – the fact that they are possessed by only one side. It follows that the procedural means available to judges to blunt legal tech’s distributive impacts will also reduce the tools’ value and, at the same time, dull incentives for litigants to adopt them, or tech companies to develop them, in the first instance. As a result, disparate judges applying proportionality and work product rules in individual cases will, inevitably, in the aggregate, create innovation policy. And, for better or worse, they will make this policy without the synoptic view that is typically thought essential to making wise, wide-angle judgments.
Judicial management of legal tech’s incorporation into the civil justice system will require a deft hand and a thorough understanding of changing on-the-ground realities. To offer just one example: As noted above, discovery cost concerns have fueled the creation of a number of doctrines that constrict discovery and, in so doing, tend to make life harder for plaintiffs. These include not just Rule 26’s “proportionality” requirement (described above), but also a slew of other tweaks and outright inventions, noted previously, from tightened pleading standards to court-created Lone Pine orders that compel plaintiffs to offer extensive proof of their claims, sometimes soon after filing.
Undergirding all these restrictive doctrines is a bedrock belief: that discovery is burdensome and too easily abused, so much so that it ought to be rationed and rationalized. Yet, as explained above, TAR has the potential to significantly reduce the burden of discovery (particularly, as noted above, if more expansive discovery, which might offset certain efficiency gains, is not forthcoming). As such, TAR, at least arguably, will steadily erode the very foundation on which Twombly, Iqbal, and Lone Pine orders rest – and this newly unsettled foundation might, therefore, demand the reexamination of those doctrines. As judges manage the incorporation of potent new legal tech tools into the civil justice system, we can only hope that they will exhibit the wisdom to reconsider, where relevant, this wider landscape.
6.4 Conclusion
This chapter has argued that the civil justice system sits at a contingent moment, as new digital technologies are ushered into it. While legal tech will bring many benefits and may even help to level the playing field in certain respects, some of the more potent and immediately available tools will likely tilt the playing field, skewing it ever further toward powerful players. One can imagine numerous fixes, but the reality is that, in typical common law fashion, the system’s future fairness will depend heavily on the action of judges, who, using an array of procedural rules built for an analog era, will, for better or worse, make it up as they go. We’re not confident about the results of that process. But the future fairness of a fast-digitizing civil justice system might just hinge on it.
The United States has a serious and persistent civil justice gap. In 1994, an American Bar Association study found that half of low- and moderate-income households had faced at least one recent civil legal problem, but only one-quarter to one-third turned to the justice system.Footnote 1 Twenty-three years later, a 2017 study by the country’s largest civil legal aid funder found that 71 percent of low-income households surveyed had experienced a civil legal need in the past year, but 86 percent of those problems received “inadequate or no legal help.”Footnote 2 Studies in individual states tell a similar story.Footnote 3
Unmet civil legal needs include a variety of high-stakes case types that affect basic safety, stability, and well-being: domestic violence restraining orders; health insurance coverage disputes; debt collection and relief actions; evictions and foreclosures; child support and custody cases; and education- and disability-related claims.Footnote 4 There is generally no legal right to counsel in these cases, and there are too few lawyers willing and able to offer representation at prices that low- and middle-income clients can afford.Footnote 5 In my home state of Georgia, for example, five or six rural counties – depending on the year – have no resident attorneys, and eighteen counties have only one or two.Footnote 6 These counties’ upper-income residents travel to the state’s urban centers for legal representation. Lower-income residents seek help from rotating legal aid lawyers who “ride circuit,” meeting clients for, say, two hours at the public library on the first Wednesday of the month.Footnote 7 Or they go without.
Can computationally driven litigation outcome prediction tools fill the civil justice gap? Maybe.
This chapter reviews the current state of outcome prediction tools and maps the ways they might affect the civil justice system. In Section 7.1, I define “computationally driven litigation outcome prediction tools” and explain how they work to forecast outcomes in civil cases. Section 7.2 outlines the theory: the potential for such tools to reduce uncertainty, thereby reducing the cost of civil legal services and helping to address unmet legal needs. Section 7.3 surveys the work that has been done thus far by academics, in commercial applications, and in the specific context of civil legal services for low- and middle-income litigants. Litigation outcome prediction has not reached maturity as a field, and Section 7.4 catalogs the data, methodological, and financial limits that have impeded development in general and the potential to expand access to justice in particular.
Section 7.5 steps back and confronts the deeper effects and the possible unintended consequences of the tools’ continued proliferation. In particular, I suggest that, even if all the problems identified in Section 7.4 can be solved and litigation outcome prediction tools can be made to work perfectly, their use raises important endogeneity concerns. Computationally driven tools might reify previous patterns, lock out litigants whose claims are novel or boundary-pushing, and shut down the innovative and flexible nature of common law reasoning. Section 7.6 closes by offering a set of proposals to stave off these risks.
Admittedly, the field of litigation prediction is not yet revolutionizing civil justice, whether for good or ill. Empirical questions remain about the way(s) that outcome prediction might affect access to justice. Yet if developments continue, policy makers and practitioners should be ready to exploit the tools’ substantial potential to fill the civil justice gap while also guarding against the harms they might cause.
7.1 Litigation Outcome Prediction Defined
I define “computationally driven litigation outcome prediction tools” as statistical or machine learning methods used to forecast the outcome of a civil litigation event, claim, or case. A litigation event may be a motion filed by either party; the relevant predicted outcome would be the judge’s decision to grant or deny, in full or in part. A claim or case outcome, on the other hand, refers to the disposition of a lawsuit, again in full or in part. My scope is civil only, though much of the analysis that follows could apply equally to criminal proceedings.
“Computationally driven” here refers to the use of statistical or machine learning models to detect patterns in past civil litigation data and exploit those patterns to predict, and to some extent explain, future outcomes. Just as actuaries compute the future risk of loss for insurance companies based on past claims data, so do outcome prediction tools attempt to compute the likelihood of future litigation events based on data gleaned from past court records.
In broad strokes, such tools take as their inputs a set of characteristics, also known as predictors, independent variables, or features, that describe the facts, legal claims, arguments, and authority, the people (judge, lawyers, litigants, expert witnesses), and the setting (location, court) of a case. Features might also come from external sources or be “engineered” by combining data. For example, the judge’s gender and years on the bench might be features, as well as the number of times the lawyers in the case had previously appeared before the same judge, the judge’s caseload, and local economic or crime data. Such information might be manually or computationally extracted from the unstructured text of legal documents and other sources – necessitating upstream text mining or natural language processing tasks – or might already be available in structured form.
These various features or case characteristics then become the inputs into one of many types of statistical or predictive models; the particular litigation outcome of interest is the target variable to be predicted.Footnote 8 When using such a tool, a lawyer would plug in the requested case characteristics and would receive an outcome prediction along with some measurement of error.
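As a concrete (and entirely hypothetical) sketch of this workflow, the snippet below trains a logistic regression on synthetic case features and returns a probability for a new case; the feature names, data-generating rule, and numbers are all invented for illustration, not drawn from any real court data:

```python
# Hypothetical sketch: predicting a motion outcome from structured case
# features with logistic regression. All data here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 1000
# Invented features: judge's years on the bench, prior appearances
# before the judge, judge's open caseload, damages claimed (log scale).
X = np.column_stack([
    rng.integers(1, 30, n),      # years on bench
    rng.integers(0, 10, n),      # prior appearances
    rng.integers(100, 500, n),   # open caseload
    rng.normal(11, 2, n),        # log damages claimed
])
# Synthetic rule: motions are more likely granted by low-caseload judges.
y = (X[:, 2] + rng.normal(0, 60, n) < 300).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# A lawyer "plugs in" a new case and gets a probability, not a verdict,
# alongside a measure of how well the model performs on held-out cases.
new_case = [[12, 3, 250, 11.5]]
p_grant = model.predict_proba(new_case)[0, 1]
print(f"Predicted probability motion is granted: {p_grant:.2f}")
print(f"Held-out accuracy: {accuracy_score(y_test, model.predict(X_test)):.2f}")
```

The point of the sketch is the shape of the interaction – structured features in, calibrated probability plus error measurement out – not any particular modeling choice.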
7.2 Theory: Access-to-Justice Potential
In theory, computationally driven outcome prediction, if good enough, can supplement, stretch, and reduce the cost of legal services by reducing outcome uncertainty. As Gillian Hadfield summarizes, uncertainty comes from several sources.Footnote 9 Sometimes the law is simply unclear. Other times, actors, whether police officers, prosecutors, regulators, or courts, have deliberately been given discretion. Further, an individual may subjectively discount or increase the probability of liability due to “mistakes in the determination of factual issues, and errors in the identification of the applicable legal rule.”Footnote 10 One way to resolve these uncertainties is to pay a lawyer for advice – in particular, a liability estimate.
Given a large enough training set, a predictive model may detect patterns in how courts have previously resolved vagueness and how officials have previously exercised discretion. Further, such a tool could correct the information deficits and asymmetries that may produce mistaken liability estimates. Outcome prediction tools might also obviate the need for legal representation entirely, allowing potential and actual litigants to estimate their own chances of success and proceed pro se. This could be a substantial boon for access to justice. Of course, even an outcome-informed pro se litigant may fail to navigate complex court procedures and norms successfully.Footnote 11 Fully opening the courthouse doors to self-represented litigants might also require simplification of court procedures. Still, outcome prediction tools might go a long way toward expanding access to justice, whether by serving litigants directly or by acting as a kind of force multiplier for lawyers and legal organizations, particularly those squaring off against better-resourced adversaries.Footnote 12
A second way outcome prediction tools could, in theory, open up access to justice is by enhancing the ability of legal services providers to quantify, and manage, risk. Profit-driven lawyers, as distinguished from government-funded legal services lawyers, build portfolios of cases with an eye toward managing risk.Footnote 13 Outcome prediction tools may allow lawyers to allocate their resources more efficiently, wasting less money on losing cases and freeing up lawyer time and attention for more meritorious cases, or by constructing portfolios that balance lower- and higher-risk cases.
In addition, enterprising lawyers with a higher-risk appetite might use such tools to discover new areas of practice or potential claim types that folk wisdom would advise against.Footnote 14 To draw an example from my previous work, I studied the boom in wage-and-hour lawsuits in the early 2000s and identified as one driver of the litigation spike an influx of enterprising personal injury attorneys into wage-and-hour law.Footnote 15 One early mover was a South Florida personal injury attorney named Gregg Shavitz, who discovered his clients’ unpaid wage claims by accident, became an overtime specialist, and converted his firm into one of the highest-volume wage-and-hour shops in the country. This was before the wide usage of litigation outcome prediction tools. However, one might imagine that more discoveries like Gregg Shavitz’s could be enabled by computationally driven systems, rather than by happenstance, opening up representation for more clients with previously overlooked or under-resourced claim types.Footnote 16
I return to, and complicate, this possibility in Section 7.5, where I raise concerns about outcome prediction tools’ conservatism in defining winning and losing cases, which may reduce, rather than increase, access to justice – empirical questions that remain to be resolved.
7.3 Practice: Where Are We Now?
From theory, I now turn to practice, tracing the evolution and present state of litigation outcome prediction in scholarship, commercial applications, and tools developed specifically to serve low- and middle-income litigants. This Section also begins to introduce these tools’ limitations in their present form, a topic that I explore more fully in Section 7.4.
7.3.1 Scholarship
Litigation outcome prediction is an active scholarly research area, characterized by experimentation with an array of different data sets, modeling approaches, and performance measures. Thus far, no single dominant approach has emerged.
In a useful article, Kevin Ashley traces the history of the field to the work of two academics who used a machine learning algorithm called k-nearest neighbors in the 1970s to forecast the outcome of Canadian real estate tax disputes.Footnote 17 Since then, academic work has flourished. In the United States, academic interest has focused, variously, on decisions by the US Supreme Court,Footnote 18 federal appellate courts,Footnote 19 federal district courts,Footnote 20 immigration court,Footnote 21 state trial courts,Footnote 22 and administrative agencies.Footnote 23 Case types studied include employment,Footnote 24 asylum,Footnote 25 tort and vehicular,Footnote 26 and trade secret misappropriation.Footnote 27 Other scholars outside the United States have, in turn, developed outcome prediction tools focused on the European Court of Human Rights,Footnote 28 the International Criminal Court,Footnote 29 French appeals courts,Footnote 30 the Supreme Court of the Philippines,Footnote 31 lending cases in China,Footnote 32 labor cases in Brazil,Footnote 33 public morality and freedom of expression cases in Turkey’s Constitutional Court,Footnote 34 and Canadian employment and tax cases.Footnote 35 Some of this research has spun off into commercial products, discussed in the next section.
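The k-nearest-neighbors idea behind that early work is simple enough to show in a few lines: classify a new dispute according to the outcomes of the k most similar past disputes. The data below is invented and bears no relation to the original Canadian study:

```python
# Toy illustration of k-nearest neighbors for a property tax dispute:
# the new dispute inherits the majority outcome of its 3 closest
# neighbors in feature space. All values are invented.
from sklearn.neighbors import KNeighborsClassifier

# Each past dispute: [assessed value ($000s), taxpayer's appraisal ($000s)]
past_disputes = [[500, 300], [520, 310], [480, 460],
                 [470, 455], [600, 350], [450, 440]]
outcomes = [1, 1, 0, 0, 1, 0]  # 1 = assessment reduced, 0 = upheld

knn = KNeighborsClassifier(n_neighbors=3).fit(past_disputes, outcomes)

# A new dispute with a large gap between assessment and appraisal sits
# nearest to past disputes that ended in a reduction.
new_dispute = [[510, 320]]
print(knn.predict(new_dispute))  # → [1]
```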
This scholarly work reflects all the strengths and weaknesses of the wider field. Though direct comparison among studies can be difficult given different datasets and performance measures, predictive performance has ranged from relatively modest marginal classification accuracyFootnote 36 to a very high F1 score of 98 percent in one study.Footnote 37
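For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision (the share of predicted wins that were actual wins) and recall (the share of actual wins the model caught); a quick check on invented predictions:

```python
# F1 = harmonic mean of precision and recall, computed on made-up labels.
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]  # actual outcomes (invented)
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]  # model predictions (invented)

p = precision_score(y_true, y_pred)   # 5 of 6 predicted wins were wins
r = recall_score(y_true, y_pred)      # 5 of 6 actual wins were caught
f1 = f1_score(y_true, y_pred)         # harmonic mean of the two
print(p, r, f1)                       # all ≈ 0.833 here
```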
That said, some high-performing academic approaches may suffer from research design flaws, as they appear to use the text of a court’s description of the facts of a case and the laws cited to predict the court’s ruling.Footnote 38 This is problematic, as judges or their clerks often write case descriptions and choose legal citations with pre-existing knowledge of the ruling they will issue. It is no surprise that these case descriptions predict outcomes. Further, much academic work is limited in its generalizability by the narrow band of cases used to train and test predictive models. This is due to inaccessible or missing court data, especially in the United States, a problem discussed further in Section 7.4. Finally, some researchers give short shrift to explanation, in favor of prediction.Footnote 39 Though a model may perform well in forecasting results, its practical and tactical utility may be limited if lawyers seeking to make representation decisions do not know what drives the predictions and cannot square them with their mental models of the world. As discussed further in Section 7.4, explainable predictions are becoming the new norm, as interpretations are now available for even the most “black box” predictive models. For the moment, however, explainability remains a sticking point.
7.3.2 Commercial Applications
The commercial lay of the land is similar to the academic landscape, with substantial activity and disparate approaches focused on particular case types or litigation events.
The Big Three legal research companies – LexisNexis, Westlaw, and Bloomberg Law – have all developed outcome prediction tools that sit within their existing suites of research and analysis tools. LexisNexis offers what they label “judge and court analytics” as well as “attorney and law firm analytics.” In both spaces, the offerings are more descriptive than predictive – showing, for example, “a tally of total cases for a judge or court for a specific area of law to approximate experience on a motion like yours.”Footnote 40 The predictive jump is left to the user, who decides whether to adopt the approximation as a prediction or to distinguish it from the case at hand. LexisNexis provides further predictive firepower in the form of an acquired start-up, LexMachina, which provides, among other output, estimates of judges’ likelihood of granting or denying certain motions in certain case types.Footnote 41 Westlaw offers similar options in its litigation and precedent analytics tools,Footnote 42 as does Bloomberg Law in its litigation analytics suite.Footnote 43 Fastcase, a newer entrant into the space, offers a different approach, allowing subscribers to build their own bespoke predictive and descriptive analyses, using tools and methodologies drawn from a host of partner companies.Footnote 44
A collection of smaller companies offers litigation outcome prediction focused on particular practice areas or litigation events. Docket Alarm, now owned by Fastcase, offers patent litigation analytics that produce “the likelihood of winning given a particular judge, technology area, law firm or party.”Footnote 45 In Canada, Blue J Tax builds on the scholarly work described above to offer outcome prediction in tax disputes,Footnote 46 while in the United Kingdom companies like CourtQuant “predict [case] outcome and settlement probability.”Footnote 47
A final segment of the industry comprises law firms’ and other players’Footnote 48 homegrown, proprietary tools. On the plaintiffs’ side, giant personal injury firm Morgan & Morgan has developed “a ‘Google-style’ operation” in which the firm “evaluate[s] ‘actionable data points’ about personal injury settlements or court proceedings” and uses the insight to “work up a case accordingly – and … do that at scale.”Footnote 49 Defense-side firms are doing the same. Dentons, the world’s largest firm, even spun off an independent analytics lab and venture firm to fund development in outcome prediction and other AI-enabled approaches to law.Footnote 50
It is difficult to assess how well any of these tools performs, as access is expensive or unavailable, the feature sets used as inputs are not always clear, and the algorithms that power the predictions are hidden. I raise some concerns about commercial model design in Section 7.4 – in particular, reliance on lawyer identity as a predictor – and, as above, return to the perpetual problem of inaccessible and missing court data.
7.3.3 Outcome Prediction for Low- and Middle-Income Litigants
For reasons explored further below, there are few examples of computationally driven litigation outcome prediction tools engineered specifically for the kinds of cases noted in this chapter’s opening. Philadelphia’s civil legal services provider, Community Legal Services, uses a tool called Expungement Generator (EG) to determine whether criminal record expungement is possible and assist in completing the paperwork.Footnote 51 The EG does not predict outcomes, but its automated approach enables efficiency gains for an organization that prepares thousands of expungement petitions per year.Footnote 52 Similarly, an application developed in the Family Law Clinic at Duquesne University School of Law prompts litigants in child support cases to answer a set of questions, which the tool then evaluates to determine “if there is a meritorious claim for appeal to be raised” under Pennsylvania law.Footnote 53 As with the EG, the Duquesne system does not appear to use machine learning techniques, but rather to apply a set of mechanical rules. The clinic plans prediction as a next step, however, and is developing a tool that analyzes winning arguments in appellate cases in order to guide users’ own arguments.Footnote 54
7.4 Present Limits
Having surveyed the state of the outcome prediction field, I now step back and assess its limits. As David Freeman Engstrom and Jonah Gelbach rightly concluded in earlier work: “[L]egal tech tools will arrive sooner, and advance most rapidly, in legal areas where data is abundant, regulated conduct takes repetitive and stereotypical forms, legal rules are inherently stable, and case volumes are such that a repeat player stands to gain financially by investing.”Footnote 55 Many of the commercial tools highlighted above fit this profile. Tax-oriented products exploit relatively stable rules; Morgan & Morgan’s internal case evaluation system exploits the firm’s extraordinarily high case volumes.
Yet, as noted above, data’s “abundance” is an open question, as is data quality. Methodological problems may also hinder these tools’ development. In the access to justice domain, the questions of investment incentives and financial gains loom large as well. The remainder of this Section addresses these limitations.
7.4.1 Data Limitations
Predictive algorithms require access to large amounts of data from previous court cases for model training, but such bulk data is not widely or freely available in the United States from the state or federal courts or from administrative agencies that have an adjudicatory function.Footnote 56 The Big Three have invested substantial funds in compiling private troves of court documents and judicial decisions, and jealously guard those resources with high user fees, restrictive terms and conditions, and threatened and actual litigation.Footnote 57
Data inaccessibility creates serious problems for outcome prediction tools designed to meet the legal needs of low- and middle-income litigants.Footnote 58 Much of this litigation occurs in state courts, where data is sometimes poorly managed and siloed in multiple systems.Footnote 59 Moreover, there is little money in practice areas like eviction defense and public benefits appeals, in which clients, by definition, are poor. Thus, data costs are high, and financial incentives for investment in research and development are low.
Even the products offered by the monied Big Three, however, suffer from data problems. With large companies separately assembling their own private data repositories, coverage varies widely, producing remarkable disagreement about basic facts. A recent study revealed that the answers supplied to the question “How many opinions on motions for summary judgment has Judge Barbara Lynn (N.D. Tex.) issued in patent cases?” ranged from nine to thirty-two, depending on the legal research product used.Footnote 60 This is an existential problem for the future of litigation outcome prediction, as predictions are only as good as the data on which they are built.Footnote 61
A final data limitation centers on the challenges of causal explanation. Even if explainable modeling approaches are used, the case characteristics that appear to be the strongest predictors of outcomes may not, in fact, be actionable. For instance, when a predictive tool relies on attorney identity as a feature, the model’s prediction may actually be free-riding on the attorney’s own screening and selection decisions. In other words, if the presence of Lawyer A in a case is strongly predictive of a win for her clients, Lawyer A’s skills as a litigator may not be the true cause. The omitted, more predictive variable is likely the strength of the merits, and Lawyer A’s skill at assessing those merits up-front. Better data could enable better model construction, avoiding these kinds of proxy variable traps.
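A toy simulation can make the free-riding mechanism vivid. In the sketch below (all values invented), a hypothetical “Lawyer A” accepts only strong cases; a model trained on attorney identity alone then looks impressively predictive, even though it has merely rediscovered her screening on the unobserved merits:

```python
# Toy simulation of the proxy-variable trap: attorney identity predicts
# wins only because the attorney screens cases on (unobserved) merit.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 2000
merit = rng.uniform(0, 1, n)            # unobserved case strength
lawyer_a = (merit > 0.7).astype(int)    # Lawyer A takes only strong cases
win = (merit + rng.normal(0, 0.2, n) > 0.6).astype(int)

# A model trained only on attorney identity looks predictive...
m = LogisticRegression().fit(lawyer_a.reshape(-1, 1), win)
acc = (m.predict(lawyer_a.reshape(-1, 1)) == win).mean()
print(f"accuracy using lawyer identity alone: {acc:.2f}")
# ...but the feature is a proxy: it carries no information beyond the
# screening decision, so it cannot tell a user what is actionable.
```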
7.4.2 Methodological Limitations
Sitting atop these data limitations are two important methodological limitations. First, as noted above, even if predictive tools do a good job of forecasting the probable outcome of a litigation event, they may only poorly explain why the predicted outcome is likely to occur. Explanation is important for a number of related reasons, among them engendering confidence in predictions, enabling bias and error detection, and respecting the dignity of people affected by prediction.Footnote 62 Indeed, the European Union’s General Data Protection Regulation (GDPR) has established what some scholars have labeled a “right to an explanation,” consisting of a right “not to be subject to a decision based solely on automated processing” and various rights to notice of data collection.Footnote 63 Though researchers are actively developing explainable AI that can identify features’ specific importance to a prediction and generate counterfactual predictions if features change value,Footnote 64 the field has yet to converge on a single set of explainability practices, and commercial approaches vary widely.
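One widely used technique the explainable-AI literature offers is permutation importance: shuffle one feature at a time and measure how much held-out accuracy drops. A minimal sketch on synthetic data (feature names and the data-generating rule are invented):

```python
# Permutation importance on synthetic data: a feature that truly drives
# the prediction hurts accuracy when shuffled; an irrelevant one does not.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 800
claim_size = rng.normal(0, 1, n)    # informative (by construction)
filing_year = rng.normal(0, 1, n)   # irrelevant (by construction)
X = np.column_stack([claim_size, filing_year])
y = (claim_size + rng.normal(0, 0.5, n) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Shuffle each feature on held-out data and record the accuracy drop.
result = permutation_importance(model, X_te, y_te, n_repeats=10,
                                random_state=0)
for name, imp in zip(["claim_size", "filing_year"], result.importances_mean):
    print(f"{name}: {imp:.3f}")
```

Here the informative feature shows a large importance and the irrelevant one a negligible one – the kind of per-feature account a lawyer could check against her own mental model of the case.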
Second, outcome prediction is limited by machine and deep learning algorithms’ inability to reason by analogy. Legal reasoning depends on analogical thinking: the ability to align one set of facts to another and guess at the likely application of the law, given the factual divergences. However, teaching AI to reason by analogy is a cutting-edge area of computer science research, and it is far from well established. As computer scientist Melanie Mitchell explains, “‘Today’s state-of-the-art neural networks are very good at certain tasks … but they’re very bad at taking what they’ve learned in one kind of situation and transferring it to another’ – the essence of analogy.”Footnote 65 There is a famous analogical example in text analytics, where a natural language processing technique known as word embedding, when trained on an enormous corpus of real-world text, is able to produce the answer “queen” when presented with the formula “king minus man plus woman.”Footnote 66 The jump from this parlor trick to full-blown legal reasoning, though, is substantial.
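The “parlor trick” can be illustrated in miniature with hand-crafted three-dimensional vectors; real systems learn hundreds of dimensions from large corpora rather than using labeled dimensions like these:

```python
# Miniature version of the word-embedding analogy: the nearest word to
# the vector "king - man + woman" is "queen". Dimensions are invented.
import numpy as np

# Invented dimensions: [royalty, maleness, femaleness]
vectors = {
    "king":  np.array([0.9, 0.9, 0.1]),
    "queen": np.array([0.9, 0.1, 0.9]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
}

target = vectors["king"] - vectors["man"] + vectors["woman"]

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Find the word (other than "king") closest to the target vector.
best = max((w for w in vectors if w != "king"),
           key=lambda w: cosine(target, vectors[w]))
print(best)  # → queen
```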
In short, scaling up computationally driven litigation outcome prediction tools in a way that would fill the civil justice gap would require both access to more and better data and further methodological advances. Making bulk federal and state court and administrative agency data and records freely and easily accessible would be a very good step.Footnote 67 Marshaling resources to support methods and tool development would be another. Foundation funding is already a common ingredient in efforts to fill the civil justice gap. I propose that large law firms pitch in as well. All firms on the AmLaw 100 could pledge a portion of their pro bono budgets toward the development of litigation outcome prediction tools to be used in pro bono and low bono settings. The ABA Foundation might play a coordinating and convening role, as it is already committed to access-to-justice initiatives. Such an effort could have a much broader impact than firms’ existing pro bono activities, which tend to focus on representation in single cases. It might also jump-start additional interest from the Big Three and other commercial competitors, who might invest more money in improving algorithms’ predictive performance and spin off free or low-cost versions of their existing suites of tools.
7.5 Unintended Consequences
Time will tell whether, when, and how the data and methodological problems identified in Section 7.4 will be solved. Assuming that they are, and litigation outcome prediction tools can generate highly reliable forecasts, there still may be reason for caution.
This Section identifies two possible unintended consequences of outcome prediction tools, which could develop alongside the salutary access-to-justice effects described in Section 7.2: harm to would-be litigants whose claims are novel or deemed less viable by predictive tools and who are denied representation as a result, and harm to the common law system as a whole.
Here, the assumption is that such tools have access to ample data, account for all relevant variables, and are transparent and explainable – in other words, the tools work as intended to learn from existing patterns in civil litigation outcomes and reproduce those patterns as outcome predictions. Yet it is this very reproductive nature that is cause for concern.
7.5.1 Harms to Would-Be Litigants
Consider the facts of Elisa B. v. Superior Court,Footnote 68 a case decided by the California Supreme Court in 2005. Emily B. sought child support from her estranged partner, Elisa B., for twins whom Emily had conceived via artificial insemination of her eggs during her relationship with Elisa. If Emily walked into a lawyer’s office seeking help with her child support action, the lawyer might be interested in the case’s viability: How often have similar fact patterns come before California courts, and what was their outcome? The answers might inform the lawyer’s decision about whether to offer representation.
In real life, this case was one of first impression in California. The governing law, the Uniform Parentage Act, referred to “mother” and “father” as the potential parents.Footnote 69 Searching for relevant precedent, the Elisa B. court reasoned by analogy from previous cases that involved, variously, three potential parents (one man and two women), non-biological fathers, non-biological mothers, and a woman who raised her half-brother as her son.Footnote 70 From this and other precedent, the court cobbled together a new legal rule that required Elisa B. to pay child support for her and Emily B.’s children.
I am doubtful that an outcome prediction tool would have reached this same conclusion. The number of analogical jumps that the court made would seem to be outside the capabilities of machine and deep learning, even assuming methodological advancement.Footnote 71 Further, judges’ decisions about what prior caselaw to draw upon and how many analogical leaps to make may be influenced by factors like ideology and public opinion, which could be difficult to model well. Emily B.’s claim would likely receive a very low viability score.Footnote 72
A similar cautionary tale comes from my own previous work with Camille Gear Rich and Zev Eigen on attorneys’ non-computational assessments of claim viability. We documented plaintiffs’ employment attorneys’ dim view of the likelihood of success for employment discrimination claims and their shifting of case selection decisions away from discrimination and toward easier-to-prove wage-and-hour claims.Footnote 73 One result of this shift, we observed, was that even litigants with meritorious discrimination claims were unable to find legal representation. That work happened in 2014 and 2015, before litigation outcome prediction tools were widely available, and I am not aware of subsequent empirical studies on the effect of such tools on lawyers’ intake decisions. Yet if lawyers were already using their intuition to learn from past cases and predict future outcomes, pre-AI, machine and deep learning tools could just cement these same patterns in place.
Thus, in this view, as civil litigation outcomes become more predictable, claims become commoditized. Outlier claims and clients like Emily B. may become less representable, much like high-loss risks become less insurable. While access to justice on the whole may increase, the courthouse doors may be effectively closed to some classes of potential clients who seek representation for novel or disfavored legal claims or defenses.Footnote 74
Further, to the extent that representation is denied to would-be litigants because of their own negative personal histories, ingested by a model as data points, litigation outcome prediction tools can reduce people to their worst past acts and prevent them from changing course. Take as an example a tenant with an old criminal record trying to fight an eviction, whose past conviction reduces her chance of winning according to an algorithmic viability assessment. This may be factually accurate – her criminal record may actually make eviction defense more challenging – but a creative lawyer might see other aspects of her case that an algorithmic assessment might miss. By reducing people to feature sets and exploiting the features that are most predictive of outcomes, but perhaps least representative of people’s full selves, computational tools enact dignitary harm. In the context of low-income litigants facing serious and potentially destabilizing court proceedings, and who are algorithmically denied legal representation, such tools can also cause substantial economic and social harm, reducing social mobility and locking people into place.
Indeed, machine and deep learning methods are inherently prone to what some researchers have called “value lock-in.”Footnote 75 All data is historical in the sense that it captures points in time that have passed; all machine and deep learning algorithms find patterns in historical data as a way to predict the future. This methodological design reifies past practices and locks in past patterns. As machine learning researcher Abeba Birhane and her collaborators point out, then, machine learning is not “value-neutral.”Footnote 76 And as AI pioneer Joseph Weizenbaum observed, “the computer has from the beginning been a fundamentally conservative force which solidified existing power: in place of fundamental social changes … the computer renders technical solutions that allow existing power hierarchies to remain intact.”Footnote 77 It is no accident that the anecdotes above involve a lesbian couple, employment discrimination claimants, and a tenant with a criminal record: the fear is that would-be litigants like these with the least power historically become further disempowered at the hands of computational methods.
Yet as Section 7.2 suggested, a different story might also be possible: More accurate predictions might enable lawyers to fill their case portfolios with low-risk sure winners as hedges when taking on riskier cases like Elisa B., or might help them discover and invest in previously under-resourced practice areas. At this stage, whether predictive tools would increase or decrease representation for outlier claims and clients is an open empirical question, which researchers and policy makers should work to answer as data and methods improve and outcome prediction tools become more widely used.
7.5.2 Harms to the System
I turn now to the second potential harm caused by computationally driven litigation outcome prediction: harm to the common law system itself.Footnote 78 As Charles Barzun explains, common-law reasoning “contains seeds of radicalism [in that] the case-by-case process by which the law develops means it is always open to revision. And even though its official position is one of incremental change … doctrine [is] constantly vulnerable to being upended.”Footnote 79 Barzun points to Catharine MacKinnon’s invention of sexual harassment doctrine out of Title VII’s cloth as an example of a “two-way process of interaction” between litigants, representing their real-world experience, and the courts, interpreting the law, in a shared creative process “in which the meaning and scope of application of the statute changes over time.”Footnote 80
If lawyers rely too heavily on litigation outcome prediction tools, which reproduce past patterns, the stream of new fact presentations and legal arguments flowing into the courts dries up. Litigation outcome prediction tools may produce a sort of super stare decisis by narrowing lawyers’ case selection preferences to only those case, claim, and client types that have previously appeared and been successful in court. Yet stare decisis is only one aspect of our common law system. Another competing characteristic is flexibility: A regular influx of new cases with new fact patterns and legal arguments enables the law to innovate and adapt. In other words, noise – as differentiated from signal – is a feature of the common law, not a bug. Outcome prediction tools that are too good at picking up signals and ignoring noise eliminate the structural benefits of the noise, and privilege stare decisis over flexibility by shaping the flow of cases that make their way to court.
Others, particularly Engstrom and Gelbach, have made this point, suggesting that prediction
comes at a steep cost, draining the law of its capacity to adapt to new developments or to ventilate legal rules in formal, public interpretive exercises …. The system also loses its legitimacy as a way to manage social conflict when the process of enforcing collective value judgments plays out in server farms rather than a messy deliberative and adjudicatory process, even where machine predictions prove perfectly accurate.Footnote 81
The danger is that law becomes endogenous and ossified. “Endogenous,” to repurpose a concept introduced by Lauren Edelman, means that the law’s inputs become the same as its outputs and “the content and meaning of law is determined within the social field that it is designed to regulate.”Footnote 82 “Ossified,” to borrow from Cynthia Estlund, means that the law becomes “essentially sealed off … both from democratic revision and renewal from local experimentation and innovation.”Footnote 83
7.6 Next Steps
Whether the unintended consequences outlined above will come to pass – and, indeed, whether the hoped-for access to justice improvements will come to pass as well – turns on empirical questions. Given the problems and limitations identified in Section 7.4, will litigation outcome prediction tools actually work well enough either to achieve their potential benefits or to cause their potential harms? My assessment of the present state of the field suggests there is a long way to go before we reach either set of outcomes. But as the field matures, we can build in safeguards against the endogeneity risks and harms I identify above through technical, organizational, and policy interventions.
First, on the technical side, computer and data scientists, and the funders who make their work possible, should invest heavily in improving algorithmic analogical reasoning. Without the ability to reason by analogy, outcome predictors not only will miss an array of possible positive predictions, but they will also be systematically biased against fact patterns like Emily B.’s, which present issues of first impression.
Further on the technical front, developers could purposefully over-train predictive algorithms on novel, but successful, fact patterns and legal arguments in order to nudge the system off its path and make positive predictions possible even for cases that fall outside the norm. This idea is adapted from OpenAI’s work in nudging its state-of-the-art language model, GPT-3, away from its “harmful biases, such as outputting discriminatory racial text” learned from its training corpus, by over-exposing it to counter texts.Footnote 84
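In machine learning terms, the over-training idea described above resembles oversampling or upweighting rare examples during training. The following is a minimal sketch with invented data of how upweighting "novel but successful" examples can shift a model's prediction for an out-of-pattern case; the features, labels, and the 10x weight are all hypothetical choices made for illustration.

```python
# Illustrative sketch only: invented data standing in for case features.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.random((300, 2))
# Pretend cases with a high second feature are "novel fact patterns"
# that nonetheless succeeded; conventional cases turn on the first feature.
is_novel = X[:, 1] > 0.9
y = np.where(is_novel, 1, (X[:, 0] > 0.5).astype(int))

# Baseline treats every past case equally; the "nudged" model
# over-weights the novel-but-successful examples, per the idea in the text.
weights = np.where(is_novel, 10.0, 1.0)
baseline = LogisticRegression().fit(X, y)
nudged = LogisticRegression().fit(X, y, sample_weight=weights)

# A hypothetical novel case: weak on the conventional feature,
# strong on the novelty feature.
case = np.array([[0.2, 0.95]])
p_base = baseline.predict_proba(case)[0, 1]
p_nudged = nudged.predict_proba(case)[0, 1]  # higher than p_base
```

The design choice mirrors the text's proposal: rather than changing the algorithm, the developer changes how much the training process "listens to" the cases that fall outside the historical norm.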
Technical fixes focus on outcome prediction tools’ production side. Organizational fixes target the tools’ consumers: the lawyers, law firms, and other legal organizations that might use them to influence case selection. I propose here that no decision should be made exclusively on the basis of algorithmic output. This guards against the dignitary and other real harms described above, as would-be litigants are treated as full people rather than feature sets. This also parallels the GDPR’s explanation mandate, though I suggest it here as an organizational practice that is baked into legal organizations’ decision-making processes.Footnote 85
Finally, I turn to policy. The story above assumes a profit-driven lawyer as the user of outcome prediction tools. Of course, there are other possible motivations for a lawyer’s case selection decisions, such as seeking affirmatively to establish a new interpretation of the law or right a historic wrong. These cause lawyers, from all points on the ideological spectrum, may be particularly likely to take on seemingly high-risk claim or party types, which receive low computationally determined viability scores. Government lawyers, too, may function as cause lawyers, pushing legal arguments, in accordance with administration position, that diverge from courts’ past practices. Government agencies should study trends in private attorneys’ use of litigation outcome prediction tools in the areas in which they regulate, and should make their own case selection decisions to fill gaps in representation.Footnote 86
7.7 Conclusion
This chapter has explored the consequences of computationally driven litigation outcome prediction tools for the civil justice system, with a focus on increasing access to justice. It has mapped the current state of the outcome prediction field in academic work and commercial applications, as well as in pro bono and low bono practice settings. It has also raised concerns about unintended consequences for litigants and for our legal system as a whole.
I conclude that there is plenty of reason for “techno-optimism,” to use Tanina Rostain’s term, about the potential for computationally driven litigation outcome prediction tools to close the civil justice gap.Footnote 87 However, reaching that optimistic future, while also guarding against potential harms, will require substantially more money and data, continued methodological improvement, careful organizational implementation, and strategic deployment of government resources.
Debate about legal tech and the future of civil litigation typically focuses on high-technology innovations. This volume is no exception, and with good reason. Advanced technologies are spreading (or seem poised to spread) throughout the legal landscape, from discovery to online dispute resolution (ODR) to trials, and from individual lawsuits to aggregate litigation. These tools’ practical utility and social value are rightly contested.Footnote 1
But in some contexts, straightforward, low-tech solutions hold tremendous promise – and also demand attention. Here, we zero in on a modest tool that bears upon the management of multidistrict litigation, or MDL. In particular, we explore how improved online communication could enhance litigant autonomy, usher in a more “participatory” MDL, and supply a platform for further innovation.Footnote 2
The MDL statute – 28 U.S.C. §1407 – is a procedural vehicle through which filed federal cases involving common questions of fact, such as a mass tort involving asbestos or defective pharmaceuticals, are swept together into a single “transferee” court, ostensibly for pretrial proceedings (though very often, in reality, for pretrial adjudication or settlement).Footnote 3 Thirty years ago, MDLs were barely a blip on our collective radar. As of 1991, these actions made up only about 1 percent of pending civil cases.Footnote 4 Now, by contrast, MDLs make up fully half of all new federal civil filings.Footnote 5 This means that one out of every two litigants who files a claim in federal court might not really be fully represented by the lawyer she chose, get the venue she chose, or remain before the judge to whom her suit was initially assigned. Instead, her case will be fed into the MDL system and processed elsewhere, in a long, labyrinthine scheme that is often far afield and out of her sight.Footnote 6
Given these statistics, there’s no real question that the MDL has risen – and that its rise is significantly altering the American system of civil justice. There is little consensus, however, as to whether the MDL’s ascent is a good or bad thing. Some celebrate the MDL for promoting judicial efficiency, addressing harms that are national in scope, channeling claims to particularly able and expert advocates, creating economies of scale, and increasing access to justice – giving some judicial process to those who, without MDL, would have no ability to vindicate their essential rights.Footnote 7
Others, meanwhile, find much to dislike. Critics frequently seize on MDLs’ relatively slow speed,Footnote 8 their heavy reliance on repeat play,Footnote 9 and the free-wheeling judicial “ad hocery” that has become the device’s calling card.Footnote 10 Beyond that, critics worry that the device distorts the traditional attorney-client relationship and subverts litigant autonomy.Footnote 11 Critics fear that aggregation alters traditional screening patterns, which can unleash a “vacuum cleaner effect” and ultimately lead to the inclusion of claims of dubious merit.Footnote 12 And critics note that the device has seemingly deviated from its intended design: The MDL was supposed to aggregate cases for pretrial proceedings. So, the status quo – where trials are rare and transfer of a case back to a plaintiff’s home judicial district is exceptional – means, some say, that the MDL has strayed off script.Footnote 13
Stepping back, one can see: The MDL has certain advantages and disadvantages. Furthermore, and critically, many MDL drawbacks are baked in. There are certain compromises we must make if we want the efficiencies and access benefits MDLs supply. Aggregation (and, with it, some loss of litigant autonomy) is an essential and defining feature of the MDL paradigm. The same may be said for judicial innovation, the need to adapt the traditional attorney-client relationship, or the fact that some lawyers are tapped to lead MDLs in a selection process that will, inevitably, consign some able and eager advocates to the sidelines.
Recognizing these unavoidable trade-offs, in our own assessment, we ask subtly different and more targeted questions. We don’t hazard a judgment about whether MDLs, on balance, are good or bad. Nor do we even assess whether particular MDL features (such as procedural improvisation) are good or bad. Instead, we ask two more modest questions: (1) Do contemporary MDLs have avoidable drawbacks, and (2) if so, can those be addressed? In this analysis, we zero in on just one MDL drawback that is both practically and doctrinally consequential: MDL’s restriction of litigant autonomy. And we further observe: Though some loss of litigant autonomy is an inevitable and inescapable by-product of aggregation and is therefore entirely understandable (the yin to aggregation’s yang), the present-day MDL may be more alienating and involve a larger loss of autonomy than is actually necessary. As explained in Section 8.1, that is a potentially large problem. But, we also argue, it is a problem that, with a little ingenuity, courts, policymakers, scholars, and litigators can practically mitigate.
The remainder of this chapter proceeds in three Parts. Section 8.1 sets the scene by focusing on individual autonomy. In particular, Section 8.1.1 explains why autonomy matters, while Section 8.1.2 draws on MDL plaintiff survey data recently compiled by Elizabeth Burch and Margaret Williams to query whether MDL procedures might compromise litigant autonomy more than is strictly necessary. Then, to assess whether transferee courts are currently doing what they practically can to promote autonomy by keeping litigants up-to-date and well-informed, Section 8.2 offers the results of our own systematic study of current court-run MDL websites. This analysis reveals that websites exist but are deficient in important respects. In particular, court websites are hard to find and often outdated. They lack digested, litigant-focused content and are laden with legalese. And they rarely offer litigants opportunities to attend hearings and status conferences remotely (from their home states). In light of these deficiencies, Section 8.3 proposes a modest set of changes that might practically improve matters. These tweaks will not revolutionize MDL processes. But they could further litigants’ legitimate interests in information, with little risk and at modest cost. In so doing, they seem poised to increase litigant autonomy – “low-tech tech,” to be sure, but with high potential reach.
8.1 Individual Autonomy, Even in the Aggregate: Why It Matters and What We Know
8.1.1 Why Individual Autonomy Matters
Litigant autonomy is a central and much-discussed concern of any adjudicatory design, be it individualized or aggregate. And, when assessing MDLs, individual autonomy is especially critical; indeed, its existence (or, conversely, its absence) goes to the heart of MDL’s legitimacy. That’s so because, if litigants swept into MDLs truly retain their individual autonomy – and preserve their ability meaningfully to participate in judicial processes – then the source of the MDL’s legitimacy is clear. On the other hand, to the extent consolidation into an MDL means that individual litigants necessarily and inevitably sacrifice their individual autonomy and forfeit their ability meaningfully to participate in judicial processes (and offer, or withhold, authentic consent to a settlement agreement), the MDL mechanism sits on much shakier ground.Footnote 14
On paper, that is not a problem: MDLs, as formally conceived, do little to undercut the autonomy of individual litigants. In theory, at least, MDLs serve only to streamline and expedite pretrial processes; they (again, in theory) interfere little, if at all, with lawyer-client communication, the allocation of authority within the lawyer-client relationship, or the client’s ability to accept or reject the defendant’s offer of settlement. That formal framework makes it acceptable to furnish MDL plaintiffs (unlike absent class members, say) with few special procedural protections.Footnote 15 It is thought that, even in an MDL, our old workhorses – Model Rules of Professional Conduct 1.4 (demanding candid attorney-client communication), 1.7 (policing conflicts), 1.2(a) (clarifying the allocation of authority and specifying “that a lawyer shall abide by a client’s decisions concerning the objectives of representation”), 1.16 (limiting attorneys’ ability to withdraw), and 1.8(g) (regulating aggregate settlements) – can ensure the adequate protection of clients.
In contemporary practice, however, MDLs are much more than a pretrial aggregation device.Footnote 16 And, it is not necessarily clear that in this system – characterized by infrequent remand to the transferor court, prescribed and cookie-cutter settlement advice, and heavy-handed attorney withdrawal provisions – our traditional ethics rules continue to cut it.Footnote 17 Indeed, some suggest that the status quo so thoroughly compromises litigant autonomy that it represents a denial of due process, as litigants are conscripted into a system “in which their substantive rights will be significantly affected, if not effectively resolved, by means of a shockingly sloppy, informal, and often secretive process in which they have little or no right to participate, and in which they have very little say.”Footnote 18
Individual autonomy is thus the hinge. To the extent it mostly endures, and to the extent individual litigants really can participate in judicial proceedings, authentically consent to settlement agreements, and control the resolution of their own claims, MDL’s legality and legitimacy is clearer. To the extent individual autonomy is a fiction, MDL’s legality and legitimacy is more doubtful.
The upshot? If judges, policymakers, scholars, and practitioners are concerned about – and want to shore up – MDL legitimacy, client autonomy should be fortified, at least where doing so is possible without major sacrifice.
8.1.2 Litigant Autonomy: What We Know
The above discussion underscores that in MDLs, litigant autonomy really matters. That insight tees up a clear – albeit hard-to-answer – real-world question: How much autonomy do contemporary MDL litigants actually have?
Context and caveats. That is the question to which we now turn, but before we do, a bit of context is necessary. The context is that, ideally, to gauge the autonomy of MDL litigants, we would know exactly how much autonomy is optimal and also how much is minimally sufficient – and how to measure it. Or, short of that, we could perhaps compare rigorous data that captures the experiences of MDL plaintiffs as against those of one-off “traditional” plaintiffs to understand whether, or to what extent, the former outperform or underperform the latter along relevant metrics.
Yet neither is remotely possible. Though litigant autonomy is an oft-cited ideal, we don’t know exactly what it would look like and mean, if fully realized, to litigants. Worse, decades into the empirical legal studies revolution, we continue to know shockingly little about litigants’ preferences, priorities, or lived experiences, whether in MDLs or otherwise.Footnote 19
These uncertainties prevent most sweeping claims about litigant autonomy. Nevertheless, one can, at least tentatively, identify several ingredients that are necessary, if not sufficient, to safeguard the autonomy interests of litigants. That list, we think, includes: Litigants can access case information and monitor judicial proceedings if they so choose; litigants can communicate with their attorneys and understand the signals of the court; litigants have a sense of where things stand, including with regard to the strength of their claim, their claim’s likelihood of success, and where the case is in the litigation life cycle; and litigants are empowered to accept or reject the defendant’s offer of settlement.Footnote 20 A system with these ingredients would seem to be fairly protective of individual autonomy. A system without them seems the opposite.
Findings from the Burch-Williams study. How do MDL litigants fare on the above metrics? A survey, recently conducted by Elizabeth Burch and Margaret Williams, offers a partial answer.Footnote 21 The two scholars surveyed participants in recent MDLs, gathering confidential responses over multiple years.Footnote 22 In the end, 217 litigants (mostly women who had participated in the pelvic mesh litigation) weighed in, represented by 295 separate lawyers from 145 law firms.Footnote 23
The survey captures claimants’ perspectives on a wide range of subjects, including their reasons for initiating suit and their ultimate satisfaction with case outcomes. As relevant to litigant autonomy, information, and participation, the scholars found the following:
When asked if their lawyer “kept [them] informed about the status of [their] case,” 59 percent of respondents strongly or somewhat disagreed.Footnote 24
When offered the prompt: “While my case was pending, I felt like I understood what was happening,” 67.9 percent of respondents strongly or somewhat disagreed. Only 13.7 percent somewhat or strongly agreed.
When asked how their lawyers kept them informed and invited to list multiple options, more than a quarter of respondents – 26 percent – reported that their attorney did not update them at all.
Of the 111 respondents who reported on their attorneys’ methods of communication, only two indicated that their lawyer(s) utilized a website to communicate with them; only one indicated that her lawyer utilized social media for that purpose.
Thirty-four percent of respondents were unable or unwilling to identify their lawyer’s name.
Caveats apply: Respondents to the opt-in survey might not be representative, which limits both reliability and generalizability.Footnote 25 The numbers, even if reliable, supply just one snapshot. And, with one data set, we can’t say whether litigant understanding is higher or lower than it would be if the litigants had never been swept into the MDL system and instead had their case litigated via traditional means. (Nor can we, alternatively, say whether, but for the MDL’s efficiencies, these litigants might have been shut out of the civil justice system entirely.Footnote 26) Nor can we even say whether MDL clients are communicated with more, or less, than those whose claims are “conventionally” litigated.Footnote 27
Even recognizing the study’s major caveats, however, five larger points seem clear. First, when surveyed, MDL litigants, represented by a broad range of lawyers (not just a few “bad apples”), reported infrequent attorney communication and persistent confusion.Footnote 28 Second, knowledgeable and independent experts echo litigants’ concerns, suggesting, for example, that “[p]laintiffs [within MDLs] have insufficient information and understanding to monitor effectively the course of the litigation and insufficient knowledge to assess independently the outcomes that are proposed for their approval if and when a time for settlement arrives.”Footnote 29 Third, plaintiffs’ lawyers in MDLs frequently have very large client inventories – of hundreds or thousands of clients.Footnote 30 When a lawyer has so many clients, real attorney-client communication and meaningful litigant participation is bound to suffer.Footnote 31 Fourth, when it comes to the promotion and protection of litigant autonomy, effective communication – and the provision of vital information – is not sufficient, but it is certainly necessary. Even well-informed litigants can be excluded from vital decision-making processes, but litigants, logically, cannot call the shots while operating in the dark.Footnote 32 And fifth, per Section 8.1, to the extent that individuals swept into MDLs unnecessarily forfeit their autonomy, that’s a real problem when it comes to MDL legitimacy and legality.Footnote 33
These five points paint a worrying portrait. Fortunately, however, alongside those five points, there is one further reality: Straightforward measures are available to promote litigants’ access to case information, their ability to monitor judicial proceedings, and their understanding of the litigation’s current path and likely trajectory. And, as we will argue in Section 8.3, these measures can be implemented by courts now, with little difficulty, and at reasonable cost.
8.2 Current Court Communication: MDL Websites and Their Deficiencies
Section 8.1 reviewed survey findings that indicate litigants within MDLs report substantial confusion and limited understanding. As noted, when given the prompt: “While my case was pending, I felt like I understood what was happening,” only 13.7 percent somewhat or strongly agreed.Footnote 34 These perceived communication failures are surprising. It’s 2023. MDL websites are common, and emails are easy; “the marginal cost of additional communication [is] approaching zero.”Footnote 35 What explains these reported gaps?
To gain analytic leverage on that question, we rolled up our sleeves and looked at where some MDL-relevant communication takes place.Footnote 36 In particular, we trained our gaze on MDL websites – resources that, per the Judicial Panel on Multidistrict Litigation (JPML) and Federal Judicial Center, “can be … invaluable tool[s] to keep parties … informed of the progress of the litigation.”Footnote 37 These sites are often described as key components of case management.Footnote 38 Scholars suggest that they facilitate litigants’ “due process rights to participate meaningfully in the proceedings.”Footnote 39 And, perhaps most notably, judges themselves have described these websites as key conduits of court-client communication.Footnote 40
Do MDL websites fulfill their promise of keeping “parties … informed of the progress of the litigation” by furnishing well-curated, up-to-date, user-friendly information? To answer that question, we reviewed each page of available websites for the twenty-five largest currently pending MDLs. Each of these MDLs contained at least 500 pending actions; together, they accounted for nearly 415,000 pending actions, encompassing the claims of hundreds of thousands of individual litigants, and constituted 98 percent of actions in all MDLs nationwide.Footnote 41 Thus, if judges are using court websites to engage in clear and frequent communication with individual litigants, we would have seen it.
We didn’t. Websites did exist. Of the twenty-five largest MDLs, all except one had a website that we could locate.Footnote 42 But many of these sites were surprisingly limited and difficult to navigate. Indeed, the sites provided scant information, were not consistently updated, and often lacked straightforward content (like Zoom information or “plain English” summaries).
8.2.1 An Initial Example: The Zantac MDL
Take, as an initial example, the website that accompanies the Zantac MDL, pending in the Southern District of Florida.Footnote 43 We zero in on this website because it was one of the best, most user-friendly sites we analyzed. But even it contained serious deficiencies.
For starters, finding the website was challenging. A preliminary search – “Zantac lawsuit” – yielded over 1 million hits, and the official court website did not appear on the first several pages of Google results; rather, the first handful of results were attorney advertisements (mostly paid) or attorney and law firm websites.Footnote 44 A more targeted effort – “Zantac court website” – bumped the desired result to the first page, albeit below four paid advertisements.
Once we located the site, we were greeted with a description of the suit: “This matter concerns the heartburn medication Zantac. More specifically, this matter concerns the ranitidine molecule – the active ingredient of Zantac. The Judicial Panel for Multidistrict Litigation formed this MDL (number 2924) on February 6, 2020.”Footnote 45 We also were shown six links (Media Information, MDL Transfer Order, Docket Report, Operative Pleadings, Transcripts, and Calendar) and a curated list of PDF files (see Figure 8.1).
The “Calendar” led to a plain page listing basic information about an upcoming hearing, but with few details. The hearing in question was described only as “Status Conference – Case Mgt,” and it did not specify whether litigants could attend, either in person or remotely (see Figure 8.2).Footnote 46
A litigant who clicked on the “Operative Pleadings” tab was taken to seven PDF documents (Pfizer, Inc. Answer; Class Economic Loss Complaint; etc.) described as those “of special interest,” plus a note that “the most accurate source for orders is PACER.”Footnote 47 (The site did not include information regarding what PACER is, though it did include a link; see Figure 8.3.)
Finally, a search box allowed for a search of the case’s orders, again available as PDFs.
8.2.2 The Rest: Deficits along Five Key Dimensions
Within our broader sample, usability deficits were pervasive and very often worse than those of the Zantac MDL site. In the course of our inquiry, we reviewed websites along the following five dimensions: (1) searchability and identifiability; (2) litigant-focused content; (3) use of plain language; (4) whether the site supplied information to facilitate remote participation in, or attendance at, proceedings; and (5) timeliness. We found deficits along each.
Searchability and identifiability. A website is only useful if it can be located. As such, our first inquiry was whether MDL websites were easy or difficult to find. Here, we found that, as in Zantac, court sites were often buried under a thicket of advertisements for lawyers or lead generators (see Figure 8.4).Footnote 48 Commonsense search terms for the three largest MDLs yielded results on pages 13, 4, and 8, respectively.Footnote 49
Litigant-focused content. Next, we evaluated whether websites featured custom content that was seemingly geared to orient individual litigants. Most didn’t. In particular, of the twenty-four sites we reviewed, only eleven contained any meaningful introductory content at all. Even then, those introductions focused primarily on the transfer process (including the relevant JPML proceeding) and a statement of the case’s overall topic – not its current status or its anticipated timeline. Meanwhile, only six of the twenty-four offered MDL-focused Frequently Asked Questions. And of those, most offered (and answered) questions that were pitched at a general level (“What is multidistrict litigation?”) or that were clearly attorney-focused (regarding, for instance, motions to appear pro hac vice). Some others, while well intentioned, supplied limited help (see Figure 8.5).Footnote 50
Similarly, more than half of sites identified members of the cases’ leadership structure (e.g., by listing leadership or liaison counsel) and provided contact information for outreach. But none directed plaintiffs with questions to a specific point of contact among those attorneys.
Finally, materials that were presented – typically, a partial set of key documents, such as court orders or hearing transcripts – were often unadorned. For instance, seven of the twenty-four reviewed sites linked to orders, as PDFs, with essentially no description of what those documents contain (see Figure 8.6).Footnote 51
Sixteen of the sites did somewhat better, offering at least some descriptions of posted PDFs. But only two included status updates that went much beyond one-line order summaries (see Figure 8.7).Footnote 52
To a litigant, therefore, the average MDL site is best understood as a free, and often partial, PACER stand-in – not a source of curated, distilled, or intelligible information.
Jargon and legalese. We next assessed whether the websites were written in plain language – or, at least, translated legalese into accessible terms. Here, we found that the majority of sites relied on legal jargon when they described key developments.Footnote 53 For example, our review found websites touting privilege log protocols, an ESI order, and census implementation orders. Even case-specific Frequently Asked Questions – where one might most reasonably expect clear, litigant-friendly language – stopped short of “translating” key legal terms.Footnote 54 Put simply, site content was predominantly written in the language of lawyers, not litigants.
Information to facilitate remote attendance. We also gauged whether the websites offered teleconference or Zoom hearing information. This information is important because consolidated cases – and the geographic distance they entail – leave many litigants unable to attend judicial proceedings in person, putting a premium on litigants’ ability to attend key proceedings remotely, via video or telephone.
Did the websites supply the logistical information a litigant needs in order to “attend” remotely? Mostly, no. Of the twenty-four sites we reviewed, thirteen did not offer any case calendar that alerted litigants to upcoming hearings or conferences. Of the eleven that did:
Five listed events on their calendar (though some of the listed events had already occurred) without any Zoom or telephone information;
Two included Zoom or telephone information for some, but not all, past events;
Two included Zoom or telephone information for all events listed on the case calendar; and
Two included dedicated calendar pages but had no scheduled events.
Put another way, most sites did not include case calendars; of those that did, more than half lacked Zoom or other remote dial-in information for some or all listed hearings. That absence was particularly striking given that, in the wake of the COVID-19 pandemic, nearly all courts embraced remote proceedings.Footnote 55
Unsurprisingly, the sites’ presentation of upcoming hearings also varied widely. In some instances (as on the MDL-2775, MDL-3004, and MDL-2846 sites shown in Figure 8.8 a–c), virtual hearings were listed, but no dial-in information was provided.Footnote 56 In contrast, some MDL sites (like MDL-2741Footnote 57) linked to Zoom information (Figure 8.9).
Timeliness. Lastly, recognizing that cases can move fast – and stale information is of limited utility – we evaluated the websites to see whether information was timely. Again, results were dispiriting. Of the sites that offered time-sensitive updates (e.g., calendars of upcoming events), several were not updated, meaning that a litigant or even an individually retained plaintiffs’ attorney who relied on the website for information was apt to be misinformed.Footnote 58 For instance, MDL-2913, involving Juul, was transferred to the Northern District of California on October 2, 2019. Its website included a calendar section and several “documents of special interest.”Footnote 59 The latest document upload involved a conditional transfer order from January 2020Footnote 60 – even though several major rulings had been issued more recently.Footnote 61 (The website’s source code indicates that it was last modified in May 2020.) Whether by conscious choice or oversight, the case’s online presence did not reflect its current status. Other sites, meanwhile, listed “upcoming” proceedings that had, in fact, occurred long before.Footnote 62 And, when we accessed archived, time-stamped versions of sites, we found several orders that were eventually posted – but not until months after they were handed down.Footnote 63
Nor were the sites set up to keep interested visitors continuously informed: most did not offer a direct “push” or sign-up feature that would notify visitors via text or email when new material became available.Footnote 64
8.2.3 Explanations for the Above Deficits: Unspecified Audience and Insufficient Existing Guidance
What explains the above deficits? One possibility is that these websites were never intended to speak to, or otherwise benefit, actual litigants – in which case our analysis simply confirms that websites never meant to edify litigants do, in fact, fail to edify them.Footnote 65 To some judges and attorney leaders, in other words, these sites may serve merely as internal or specialized resources, whether for state court judges involved in overlapping litigation, individually retained plaintiffs’ counsel, or even scholars and journalists.Footnote 66 Or, it could be that the “audience” question has never been carefully considered or seriously addressed. As a result, the websites may be trying to be all things to all people but actually serve none, as content is too general for members of the plaintiffs’ steering committee, too specialized and technical for litigants, and too partial or outdated for individually retained plaintiffs’ counsel or judges handling parallel state litigation.
A second culprit, in contrast, is crystal clear: Higher authorities have furnished transferee judges and court administrators with only limited public guidance.Footnote 67 In particular, current guidance tends to suggest categories for site content. But beyond that, it furnishes transferee judges only limited help. Illustrating this deficiency, the JPML and Federal Judicial Center’s Ten Steps to Better Case Management: A Guide for Multidistrict Litigation Transferee Court Clerks includes a discussion of recommended webpage content, but its relevant section provides only the following:
Case name and master docket sheet case number
Brief description of the subject of the case
Name of the judge presiding over the case
List of court staff, along with their contact information
Names of liaison counsel, along with their contact information
In addition, it is useful to include the following types of orders in PDF:
Case management orders
Transfer orders from the Panel
Orders applicable to more than one case
Individual case orders affecting one case, but potentially pertinent to others
Suggestion of remand orders.Footnote 68
Several other pertinent resources are similarly circumscribed.Footnote 69 These publications have likely helped to spur the creation of MDL websites, but they have stunted the sites’ meaningful evolution.
***
Whatever the reasons for the above deficiencies, the facts are these: Among the websites we reviewed, most suffered from basic deficits that could very well inhibit litigants’ access and engagement. And the deficits we identify could easily be addressed.
8.3 A Simple Path Forward: A “Low-Tech” Mechanism to Keep Litigants Better Informed
As noted in Section 8.1, MDLs rely, for legitimacy, on litigant autonomy, and while communication is not sufficient for litigant autonomy, it is necessary. Even well-informed litigants can be deprived of the capacity to make crucial decisions – but litigants, logically, cannot make crucial decisions if they are not reasonably well-informed. And while no one can currently prove that MDL litigants are underinformed, Section 8.2 compiled evidence indicating that information deficits are deep and pervasive. The Burch-Williams study paints a worrying portrait; knowledgeable scholars have long raised concerns; and our painstaking review of MDL websites reveals that one tool, theoretically poised to promote litigant understanding, is, in fact, poorly positioned to do so.
What can be done? Over the long run, the Federal Judicial Center (FJC), or another similar body, should furnish formal guidance to judges, court administrators, and lawyers on how to build effective and legible websites. This guidance would ideally be supplemented by a set of best practices around search engine optimization and language access. There is good reason to believe that such guidance would be effective. Noticeable similarities across existing websites suggest that transferee judges borrow heavily from one another. An implication of that cross-pollination is that better guidance from the FJC (or elsewhere) would likely spread rapidly.
In the meantime, we close with four concrete (though modest and partial) suggestions for transferee judges.
First, judges need to decide whom these sites are really for – and then need to ensure that the sites serve their intended audience well. We suggest that MDL websites ought to be embraced as (among other things) a litigant-facing tool, and, as discussed below, they should be improved with that purpose in mind.Footnote 70 But, even if courts are not persuaded, they still need to do a better job tailoring sites to some particular audience. As long as the specific audience remains undetermined, courts are less likely to serve any particular audience adequately.
If courts agree that websites should speak directly to litigants, then a second recommendation follows: At least some clearly delineated website content should be customized for litigants. Courts should, as noted, avoid legalese and offer more digested (rather than just raw) material. For instance, judges might ask attorneys to supply monthly or quarterly updates; these updates, which should be approved by both parties and the court, should summarize the progress made in the preceding period and highlight what is on tap in the MDL in the immediate future. Here, the website should capture both in-court activity and noteworthy activity scheduled outside of the court’s four walls (e.g., depositions).
Third, irrespective of chosen audience, judges should take steps to ensure that MDL websites are visible and up-to-date. Regardless of whom the websites are meant to serve, websites cannot serve that audience if they cannot be quickly located.Footnote 71 And, because stale information is of limited utility, judges should ensure that the websites offer an accurate, timely snapshot of the case’s progress. The first steps are uncontroversial and straightforward; they include reliably adding hearings to the online calendar, removing them after they occur, and posting key documents within a reasonable time frame. Judges should also consider an opt-in sign-up that automatically emails or texts interested individuals when new content is added.
Fourth and finally, judges should ensure that websites clearly publicize hearings and status conferences, and, recognizing that MDLs necessarily and inescapably create distance between client and court, judges should facilitate remote participation whenever feasible. As noted above, many MDL judges have embraced remote hearings out of COVID-generated necessity; judges overseeing large MDLs should consider how the switching costs they have already paid can be invested to promote meaningful litigant access, even from afar.Footnote 72 Indeed, judges might cautiously pilot tools for two-way client-court communication, or even client-to-client communication – though, in so doing, judges must be attuned to various risks.Footnote 73
8.4 Conclusion: Zooming Out
We harbor no illusions about the role that better MDL websites can play. They’re no panacea, and vigorous debates about the merits and demerits of MDL will (and should) continue. But even so: Improved, refocused websites can keep litigants a bit more engaged; they can help litigants stay a bit better informed; and they can promote litigant participation in even distant MDL processes. More than that, improved websites can, however incrementally, promote litigant autonomy and, by extension, shore up the legitimacy of the MDL system. The day may come when some as-yet-unidentified high-tech innovation revolutionizes the MDL. Until then, low-tech changes can modestly improve the system, and just might serve as platforms for further reform.