“[C]omputer-assisted review is an available tool and should be seriously considered for use in large-data-volume cases where it may save the producing party (or both parties) significant amounts of legal fees in document review.”

— U.S. Magistrate Judge Andrew J. Peck, Southern District of New York, Da Silva Moore v. Publicis Groupe & MSL Group (2012)

Is computer-assisted review, or predictive coding, the holy grail for in-house counsel? Will it finally curb the explosive growth in costs for electronic discovery in large-data-volume cases?

Not yet, but predictive coding is here to stay. If used effectively, predictive coding can be a valuable tool, significantly cutting fees generated by outside counsel to review and code documents in large, complex litigation. Using predictive coding effectively, however, typically requires buy-in from all parties — as well as the court, if one side objects to its use.

As most in-house counsel know, predictive coding relies on computer software and proprietary algorithms to determine whether electronic documents are relevant to a case. The software works much like a spam filter, which learns from examples to identify unwanted mass emails.

Litigation team members begin training the software by feeding it a seed set of potentially relevant documents. Senior attorneys then provide additional feedback to the software, which teaches it what is truly relevant. This process may be repeated several times.

The software then analyzes the seed set and identifies common characteristics, such as people, places, words or concepts. These allow the software to create rules it can apply across all harvested documents.

Once the software is sufficiently trained and it has established these rules, the litigation team can use the software to cull responsive documents from the entire universe of data, potentially saving the client significant attorney-hours in review time.
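For readers who want to see the mechanics, the training-and-culling process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: commercial predictive-coding tools use proprietary algorithms that vendors do not disclose, so the sketch substitutes a simple word-frequency (naive Bayes) classifier of the kind long used in spam filters. The function names, the seed documents and the unreviewed documents are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(seed_set):
    """Learn word frequencies from (text, is_relevant) pairs coded by attorneys.

    The seed set must contain at least one relevant and one irrelevant document.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in seed_set:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def relevance_score(model, text):
    """Return the log-odds that a document is relevant (positive = likely relevant)."""
    word_counts, doc_counts, vocab = model
    totals = {label: sum(word_counts[label].values()) for label in (True, False)}
    score = math.log(doc_counts[True] / doc_counts[False])
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        # Add-one smoothing keeps unseen word/label pairs from zeroing the odds.
        p_relevant = (word_counts[True][word] + 1) / (totals[True] + len(vocab))
        p_irrelevant = (word_counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_relevant / p_irrelevant)
    return score


def rank(model, documents):
    """Order unreviewed documents from most to least likely relevant."""
    return sorted(documents, key=lambda doc: relevance_score(model, doc), reverse=True)


# Hypothetical seed set: each document is paired with an attorney's relevance call.
seed_set = [
    ("merger pricing agreement with distributor", True),
    ("revised pricing terms for the distributor contract", True),
    ("lunch menu for friday", False),
    ("office parking reminder", False),
]
model = train(seed_set)

# Hypothetical unreviewed documents drawn from the harvested data.
unreviewed = ["parking garage closed friday", "draft distributor pricing schedule"]
ranked = rank(model, unreviewed)
print(ranked[0])
```

Each round of attorney feedback corresponds to adding newly coded documents to the seed set and calling train again. A real review platform would also draw on metadata such as custodians and dates rather than words alone, and the team would validate the results on a sample before relying on them.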

While some have raised concerns about the accuracy of computer-assisted review, Peck wrote in Da Silva that “statistics clearly show that computerized searches are at least as accurate, if not more so, than manual [document] review.” Despite this, it still may take time for lawyers and courts to accept predictive coding.

Gain Agreement

Predictive coding, like traditional methods of handling electronically stored information in litigation, works best when the parties agree on the type of electronic files to search, the custodians whose files they will search and the protocol for searching those files.

Getting buy-in from opposing counsel before proceeding down this path can be critical, as predictive coding is still a relatively new field. Typically, a lawyer should notify opposing counsel of the intent to use predictive coding. Then, she should propose an e-discovery protocol that covers the nature and parameters of the predictive coding, the identities of the custodians and the file types to be searched, and other logistical issues such as a de-duplication protocol and the format for production.

As Da Silva points out, the best approach to using predictive coding is to cooperate with opposing counsel.

“Advise opposing counsel that you plan to use computer-assisted coding and seek agreement; if you cannot, consider whether to abandon predictive coding for that case or go to the court for advance approval,” wrote Peck.

While this could, in the short term, add to the cost of discovery, it may be worth it in some cases to ask the court to approve the use of predictive coding over the objection of the other side. Judge James H. Chamblin of the Circuit Court of Virginia in Loudoun County granted just such a request in Global AeroSpace Inc. v. Landow Aviation LP.

As things stand now, e-discovery creates tremendous settlement pressure for in-house counsel, given that the costs in even a moderately complex case can range from the hundreds of thousands to the millions of dollars, often dwarfing the actual value of the case. Predictive coding certainly is not a panacea that will eliminate those costs or the settlement pressure they create.

As these are still early days, and case law and procedural standards are largely absent, the world of predictive coding can be a bit like the Wild West. There are any number of vendors out there, each with its own approach to predictive coding and a potentially opaque pricing structure.

Increased competition should drive down costs in the long run, but the concern in the short term is that any savings in outside counsel fees simply will shift to expenses for vendors, many of whom overpromise and underdeliver.

But the fact that some vendors are out over their skis in this area does not make predictive coding any less legitimate or effective as a tool. It works, and it is definitely here to stay. For in-house counsel faced with litigation involving significant amounts of electronically stored information, the future is now, and it is time to start thinking about predictive coding.