Last year was a landmark year for predictive coding within the judicial system. Five cases, in different regions and with varied subject matter, directly addressed the use of predictive coding in discovery practice. As 2013 begins, the progress of these cases, and the way in which predictive coding is implemented, will determine whether it makes the jump from an exciting, cost-effective technological innovation to an established discovery tool with statistically defensible results.

In some cases, the inability of parties to agree on predictive coding has brought discovery to a standstill and forced courts to wade through seemingly endless discovery motions. In the cases where predictive coding remains part of the discovery process, the courts have pushed the parties toward a collaborative approach.

One of these cases, Kleen Products LLC v. Packaging Corp. of America, No. 1:10-cv-05711 (N.D. Ill.), is an antitrust suit between containerboard companies. The case was heard in the Northern District of Illinois and evidentiary issues were determined by U.S. Magistrate Judge Nan Nolan. The plaintiffs alleged that the defendants used market consolidation to artificially inflate containerboard prices.

Discovery was contentious from the outset: The two parties fundamentally disagreed about the searching methodology and discovery obligations. The defendants claimed they followed cutting-edge industry standards to produce 3 million records. The plaintiffs argued that the defendants’ search-term approach would capture only 25 percent of responsive documents, while a content-based search would capture 75 percent.

Nolan tried to encourage the parties to cooperate and reach a protocol acceptable to both of them, but neither party would budge. On August 21, 2012, the parties tabled the predictive-coding debate until October 1, 2013, by agreeing to a stipulated order. The plaintiffs agreed to withdraw their demands and waive existing objections to the defendants’ electronically stored information (ESI) methodology, but left open the door to requesting content-based analytics after October 2013. The dispute over predictive coding had prevented any compromise, leaving discovery at a standstill until the plaintiffs withdrew their request for analytics.

Da Silva Moore v. Publicis Groupe, No. 1:11-cv-01279 (S.D.N.Y.), was the first case to involve predictive coding. The class action was brought on behalf of female employees at Publicis Groupe S.A., an advertising conglomerate. The case was referred to U.S. Magistrate Judge Andrew Peck, a noted electronic-discovery expert, for pretrial supervision.

At the time, the parties disagreed about the ESI protocol to cull through the documents at issue. There were approximately 3 million electronic documents from the agreed-upon custodians, and defendants wanted to use predictive coding in addition to keyword searching to reduce the number of documents.

On February 8, 2012, Peck heard arguments from both sides and on February 24, 2012, issued an order approving the defendants’ predictive-coding protocol over the plaintiffs’ objections. The plaintiffs maintained that they “object to this ESI Protocol in its entirety” and that they only complied with the joint submission of the protocol because of the court’s order. Discovery was stayed as the plaintiffs tried to have Peck recused, claiming that his published articles and public positions on predictive coding made him biased.

On November 7, 2012, U.S. District Judge Andrew Carter rejected the recusal motion, meaning that the case will finally embark on the process of predictive coding.

In re Actos, No. 6:11-md-02299 (W.D. La.), involves 11 civil actions filed in various venues during 2011. These actions were consolidated on December 29, 2011, and were assigned to U.S. District Judge Rebecca Doherty of the Western District of Louisiana for coordinated or consolidated pretrial proceedings.

The case centers around allegations that Actos, a prescription drug used for the treatment of diabetes, increased the risk of bladder cancer in users and that the defendant pharmaceutical companies had failed to adequately warn consumers about these risks and had instead tried to conceal them.

On July 27, 2012, Doherty issued an extensive case-management order containing the protocol relating to the provision of ESI. The order set out a detailed procedure for implementing predictive coding, in which both parties would work with Epiq Systems Inc., the agreed-upon vendor, to collaboratively review the number of documents needed for Epiq’s Relevance System to reach “stability,” a term coined by Epiq for the point at which statistical results can be verified within a certain confidence level and margin of error. All materials reviewed during the assessment phase become part of the control set. The control set is reviewed by three designated “experts” on each side, who work with one another to determine the documents’ relevance, thus building a statistically sound control set with which to continue the iterative process of coding all remaining materials in the database.
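The order does not spell out how a confidence level and margin of error translate into a number of documents, and Epiq’s own method is not described here. Purely as an illustration, the textbook sample-size formula for estimating a proportion can be sketched in Python. The function name sample_size and its parameters are hypothetical; nothing below is drawn from the order or from Epiq’s system.

```python
import math
from statistics import NormalDist
from typing import Optional


def sample_size(confidence: float, margin: float,
                population: Optional[int] = None, p: float = 0.5) -> int:
    """Documents to sample so that an estimated proportion falls within
    +/- margin at the given confidence level.

    Assumption: simple random sampling and the normal approximation.
    This is a generic statistics formula, not any vendor's procedure.
    """
    # Two-sided critical value, e.g. about 1.96 for 95 percent confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Sample size for an effectively unlimited population;
    # p = 0.5 is the most conservative (largest) choice.
    n = z * z * p * (1 - p) / (margin * margin)
    if population is not None:
        # Finite population correction for a known collection size.
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


# 95 percent confidence with a 5 percent margin of error.
print(sample_size(0.95, 0.05))
```

Under these assumptions the required sample depends far more on the margin of error than on the size of the collection, which is why a few hundred or a few thousand reviewed documents can support claims about millions.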


Privileged materials are reviewed by the defendants and pulled or redacted before being added to the control set, minimizing the likelihood that privileged materials are released to the plaintiffs’ experts. Privileged and redacted materials can, however, be used to train Epiq’s system, so that the system is trained on co-occurring concepts from all relevant materials regardless of their privilege status.

The six experts review documents together, in the same office, beginning with small sets of randomly selected documents. This is called the training phase. The training phase is repeated until the Epiq system reaches “stability.” Once the training rounds have been completed, the Epiq Relevance System is deployed to assign a coherence score between zero and 100 to all records in the database, with 100 representing the documents most likely to be responsive and zero the least likely. The experts then continue to sample those coherence results to determine the coherence score at which the balance tips between responsive and nonresponsive material. All responsive materials are then produced.
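The order does not say how the tipping point is computed. One plausible reading, sketched below in Python, is to group the experts’ sampled documents into score bands and walk down from the top until fewer than half of a band’s samples are responsive. The function tipping_score, the 10-point band width and the one-half threshold are all assumptions made for illustration, not terms of the protocol.

```python
from collections import defaultdict


def tipping_score(samples, band_width=10):
    """Return the lower bound of the lowest score band, scanning from
    100 downward, in which at least half of the sampled documents were
    coded responsive. Returns None if even the top sampled band fails.

    samples: iterable of (score, is_responsive) pairs, scores 0 to 100.
    Hypothetical helper; bands with no sampled documents are skipped.
    """
    counts = defaultdict(lambda: [0, 0])  # band start -> [responsive, total]
    for score, responsive in samples:
        # A score of exactly 100 is folded into the top band.
        band = min(int(score) // band_width * band_width, 100 - band_width)
        counts[band][0] += int(responsive)
        counts[band][1] += 1

    cutoff = None
    for band in sorted(counts, reverse=True):
        hits, total = counts[band]
        if hits * 2 < total:  # fewer than half responsive: balance has tipped
            break
        cutoff = band
    return cutoff


# Invented sample data, for illustration only.
reviewed = [(95, True), (92, True), (85, True), (83, True), (81, False),
            (75, False), (72, True), (65, False), (64, True), (61, False)]
print(tipping_score(reviewed))
```

In practice the parties would presumably negotiate both the threshold and the sample sizes per band, since a cutoff set from a handful of documents would be hard to defend statistically.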

The order was crafted with the agreement of both parties and left room for them to re-evaluate the process once the results began to come out. Notably, the order specifically referred to the vendor that the parties had chosen and incorporated vernacular specific to Epiq’s brand of predictive coding. The buzzwords defined in the order may well become the nomenclature for other ESI protocols.

In December, Doherty declared that several factors in the evolution of the proceedings, including other trials involving the same issues and state court actions, had delayed the discovery process, but that the case was finally at “an appropriate point for full discovery to proceed” in accordance with the order. According to status-conference updates, the parties are updating the court every month as to the status of the document production and are continuing to report on “predictive coding/training.” However, no motions have been filed or heard on either side suggesting an update on how the process is developing.

Doherty is actively involved in the process; both parties are sophisticated; and, given the volume of cases and the nature of the documents involved, predictive coding is a logical choice for mathematically separating the wheat from the chaff. Actos is still mired in the discovery process, however, and because of the sheer size of the litigation, the complexity of the issues involved and the number of experts required to sufficiently train the system, progress is likely to be slow. With the potential for meet-and-confer meetings at every step of the process to resolve coding conflicts, the cost to both parties could prove exorbitant.

Another case, Global Aerospace v. Landow Aviation, No. CL 00061040 (Loudoun Co., Va., Cir. Ct.), also reflects a turn toward collaboration, albeit one initially forced by the courts.

The case involves the collapse of three hangars at the Dulles Jet Center, a general aviation facility outside Washington. The defendants initially pushed for predictive coding because of the sheer size of the data and the cost of reviewing it with traditional approaches; in other words, accuracy of the results was a side issue rather than the prime motivation. The defendants argued that the 250 gigabytes of reviewable ESI would “easily equate to more than two million documents,” and a linear first-pass review “would take 20,000 man hours, cost 2 million dollars, and locate only sixty percent of the potentially relevant documents.” On the other hand, predictive coding would be “capable of locating upwards of seventy-five percent of the potentially relevant documents and can be effectively implemented at a fraction of the cost and in a fraction of the time of linear review and keyword searching.”
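The defendants’ figures imply some simple unit rates that are worth making explicit. The short Python calculation below uses only the numbers quoted above, rounding “more than two million” down to two million; the derived rates are back-of-the-envelope arithmetic, not figures stated in the filing, and the variable names are illustrative.

```python
# Figures as quoted by the defendants (estimates, not measured results).
documents = 2_000_000        # "more than two million documents"
review_hours = 20_000        # linear first-pass review
review_cost = 2_000_000      # dollars
linear_recall = 0.60         # share of relevant documents found
predictive_recall = 0.75     # "upwards of seventy-five percent"

# Derived rates (assumption: costs and hours scale evenly per document).
docs_per_hour = documents / review_hours        # 100.0 documents per hour
cost_per_hour = review_cost / review_hours      # 100.0 dollars per hour
cost_per_document = review_cost / documents     # 1.0 dollar per document

# Relative increase in relevant documents located, whatever their number.
relative_gain = round(predictive_recall / linear_recall - 1, 2)  # 0.25

print(docs_per_hour, cost_per_hour, cost_per_document, relative_gain)
```

On the defendants’ own numbers, then, predictive coding would locate about a quarter more of the relevant documents than linear review, before any cost savings are counted.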

The plaintiffs countered by protesting the deviation from traditional discovery, and framed the use of predictive coding as a departure resulting from the defendants’ unwillingness to undertake the steps of the usual production of documents. The plaintiffs asked the court to order the defendants to “produce all responsive emails and other electronic documents, not just the 75%, or less” that predictive coding would provide.

On April 23, 2012, Loudoun County, Va., Circuit Judge James Chamblin issued an order allowing the defendants to “proceed with the use of predictive coding for purposes of the processing and production” of ESI, with “processing to be completed within 60 days and production to follow as soon as practicable.” However, no progress was reported for several months.

As in In re Actos, the plaintiffs and defendants here work together in the process with a system of checks and balances. The defendants are required to share the control set of statistically sound results with opposing counsel prior to the process of categorizing, so there is a reasonable opportunity to verify or object before committing to categorization of the data universe.

What distinguishes Global Aerospace is one party’s initial unwillingness to consider the quick, cost-effective and statistically verifiable approach to locating relevant materials in a daunting universe of ESI. According to recent reports, the predictive-coding process was just completed. See, e.g., E. Koblentz, “Predictive Coding Completed in ‘Global Aerospace’ Case,” Law Tech. News, January 16. It took nine months, and the vendors have yet to issue a final report and press release. The deadline for opposing counsel’s objection passed unceremoniously, suggesting a begrudging acceptance of the technology.

The most recent judicial order involving predictive coding was issued in EORHB Inc. v. HOA Holdings LLC, a case involving the interpretation of contract provisions in the purchase of the Hooters of America LLC restaurant chain.

Vice Chancellor Travis Laster of the Delaware Court of Chancery characterized the case as “an ideal non-expedited case in which the parties would benefit from using predictive coding.” He asked the parties to use predictive coding for discovery or else to “show cause why this is not a case where predictive coding is the way to go.” He also suggested that they use a single e-discovery provider (“one of these wonderful discovery super powers”) to warehouse both sides’ documents and offered to choose a vendor if both parties could not agree on one.

In contrast to both Actos and Global Aerospace, neither party proposed using predictive coding; the initiative came straight from the bench. How the parties will respond remains to be seen, but the judge’s decision to force predictive coding and to demand the use of a single vendor seems to ignore the complexities of existing predictive-coding protocols and the need for parties to buy in.

So far no case has taken predictive coding “from soup to nuts” — from a discovery ruling through to the conclusion of discovery. But 2013 may see this process happen in Actos and Global Aerospace.

Pooja Nair is an associate in the business litigation and dispute resolution and the discovery data management practice groups at Foley & Lardner; she works in the Los Angeles office. Leslie Nash Tookey is West Coast regional manager for the litigation support group.
