A just-released industry survey by the eDiscovery Institute[1] showed that using predictive coding can cut legal review bills by 45 percent on average and sometimes by 70 percent or more.

Predictive coding involves reviewing a sample of documents and then using software to extend those decisions to the remaining documents. The biggest savings can come when records can be deemed responsive or nonresponsive without anyone having to visually examine each of them. Predictive coding systems also often rank records by how likely they are to be responsive, which allows a second form of saving: the documents that appear least likely to be responsive can be assigned to the lowest-level, least costly reviewers, reserving the “hotter” documents for more senior people.
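The workflow described above (code a sample, extend the decisions, rank, then route by rank) can be illustrated with a deliberately simple sketch. It assumes a toy word-frequency scorer in place of a real predictive coding engine, and the function names, tier names and cutoff values are hypothetical choices made for illustration, not features of any product named in the survey.

```python
from collections import Counter


def train(sample):
    """Learn per-word weights from a hand-reviewed sample.

    sample: list of (text, is_responsive) pairs coded by a human reviewer.
    Returns a dict mapping each word to a weight in [-1, 1]; +1 means the
    word appeared only in responsive sample documents, -1 only in
    nonresponsive ones. (Toy model, assumed for illustration.)
    """
    responsive, nonresponsive = Counter(), Counter()
    for text, is_responsive in sample:
        words = set(text.lower().split())
        (responsive if is_responsive else nonresponsive).update(words)
    vocabulary = set(responsive) | set(nonresponsive)
    return {
        w: (responsive[w] - nonresponsive[w]) / (responsive[w] + nonresponsive[w])
        for w in vocabulary
    }


def score(text, weights):
    """Average the weights of the known words in a document (0.0 if none)."""
    known = [weights[w] for w in set(text.lower().split()) if w in weights]
    return sum(known) / len(known) if known else 0.0


def route(documents, weights, senior_cutoff=0.25, junior_cutoff=-0.25):
    """Rank unreviewed documents and assign each to a reviewer tier.

    The cutoffs are arbitrary placeholders; a real project would set them
    from validation samples.
    """
    tiers = {"senior": [], "standard": [], "junior": []}
    for doc in sorted(documents, key=lambda d: score(d, weights), reverse=True):
        s = score(doc, weights)
        if s >= senior_cutoff:
            tiers["senior"].append(doc)    # "hotter" documents
        elif s <= junior_cutoff:
            tiers["junior"].append(doc)    # least likely to be responsive
        else:
            tiers["standard"].append(doc)  # uncertain; ordinary review
    return tiers


if __name__ == "__main__":
    weights = train([
        ("merger price agreement", True),
        ("lunch menu friday", False),
    ])
    print(route(
        ["draft merger agreement", "friday lunch order", "quarterly newsletter"],
        weights,
    ))
```

In this sketch the document that shares vocabulary with the responsive sample goes to the senior tier, the one that resembles the nonresponsive sample goes to the junior tier, and a document with no known words stays in the middle tier for ordinary review.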

Many lawyers believe that old-fashioned linear or manual reviews produce the “best” results, but the continuing explosion in the volume of electronically stored information makes the visual examination of every record in every case an economic impracticality. Furthermore, one attribute of quality is replicability, i.e., the ability of a process to produce the same results when it is repeated using the same input. However, when two sets of reviewers examine the same records for the same purposes, studies have shown that the second team may select only 48 percent to 62 percent of the records initially selected by the first team[2], and it will also select as relevant a fair percentage of records that the first team deemed nonresponsive. This is hardly a gold standard.
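To make the replicability measure concrete, the following minimal sketch computes the two quantities discussed above: the share of the first team's selections that the second team also selects, and the number of records the second team selects that the first team had passed over. The function name and the document IDs are hypothetical, and the example numbers are invented purely to show the arithmetic; they are not taken from the cited studies.

```python
def review_overlap(first_team, second_team):
    """Compare two reviews of the same record set.

    first_team, second_team: iterables of the record IDs each team
    marked as relevant.
    Returns (overlap, newly_selected) where
      overlap        = fraction of the first team's selections that the
                       second team also selected
      newly_selected = count of records only the second team selected
    """
    first, second = set(first_team), set(second_team)
    if not first:
        raise ValueError("first team selected no records; overlap is undefined")
    overlap = len(first & second) / len(first)
    newly_selected = len(second - first)
    return overlap, newly_selected


if __name__ == "__main__":
    # Invented example: the first team selects records 1-10; the second
    # team agrees on six of them and adds two the first team rejected.
    first = range(1, 11)
    second = [1, 2, 3, 4, 5, 6, 11, 12]
    overlap, newly_selected = review_overlap(first, second)
    print(f"overlap: {overlap:.0%}, newly selected: {newly_selected}")
```

A process with perfect replicability would return an overlap of 100 percent and zero newly selected records when rerun on the same input.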

When predictive review systems are used to reprocess documents originally reviewed manually, studies have shown that they can select about the same percentage of the documents originally deemed relevant as a second manual linear review does. Unlike a manual review, however, these systems typically generate an audit trail, and they can select virtually the same set of records if the predictive review process is repeated on the same records. In other words, predictive coding has far higher levels of replicability than manual review.

In the survey, the reason most often cited for predictive coding not being used more widely was uncertainty over judicial acceptance. However, Principles 6 and 11 of The Sedona Principles (Second Edition): Best Practices Recommendations & Principles for Addressing Electronic Document Production clearly indicate that a producing party can use appropriate technology to identify and produce records. In fact, Practice Point 1 from The Sedona Conference Best Practices Commentary on the Use of Search and Information Retrieval Methods in E-Discovery states that, “In many settings involving electronically stored information, reliance solely on a manual search process for the purpose of finding responsive documents may be infeasible or unwarranted. In such cases, the use of automated search methods should be viewed as reasonable, valuable, and even necessary.” (Emphasis added.)

Considering the lack of replicability, the high cost and the long delays inherent in manual reviews, it is clearly reasonable to use predictive coding as a good faith way to meet a company’s production obligations.

One final postscript on a point that often gets overlooked: manual reviews using teams of contract lawyers and other temporary help expose confidential and sensitive corporate records to far more people than is necessary. Every person who sees corporate records represents a security risk of some sort. Predictive coding reduces those risks by minimizing the number of people required to review records.

[1] “eDiscovery Institute Survey on Predictive Coding,” October 1, 2010, available on the website of the eDiscovery Institute, http://ediscoveryinstitute.org/pubs/PredictiveCodingSurvey.pdf. Companies providing responses included Capital Legal Solutions, Catalyst Repository Systems, Equivio, FTI Technology, Gallivan Gallivan & O’Melia, Hot Neuron, InterLegis, Kroll Ontrack, Recommind, Valora Technologies, and Xerox Litigation Services.

[2] See studies cited in the Preface to “eDiscovery Institute Survey on Predictive Coding,” supra.