"It’s hard to predict things, especially the future." — Yogi Berra

Well, actually, it’s not all that hard to see that predictive coding is the future; some call it a game changer as courts begin to accept, even require, it. But here’s my prediction: Predictive coding is a cool new tool, but it will not solve the basic problem. Discovery will remain expensive; parties will remain combative; and courts will remain perplexed as they try to sort things out. As Pogo so aptly put it, "We have met the enemy and he is us." We lawyers are the problem. New tools are nice, but new attitudes would be better.

The problem is that lawyers too often view cooperation and compromise as antithetical to zealous advocacy; we distrust and oppose whatever our opponent proposes. If you want to use predictive coding, I don’t. If I want to use 20 search terms, you want 50 different ones. If you want to search 20 custodians, I demand 200. And when, invariably, the receiving party claims production is incomplete, we race one another to court.

And we always have. Whether B.C. (Before Computers) or A.D. (After Digitization), the problem has always been the same. In the old days, before email and digital imaging, we dealt with mostly paper. There were cases in which the universe was immense — warehouses filled floor to ceiling with boxes of documents. We did not manually review the universe but rather selected a subset by looking at the labels on the boxes to identify those that might house something responsive. And then we fought over whether we had looked in the right boxes.

The digital age is no different, just more so. We still cull from the universe and manually review a subset. The big difference is that the big-universe cases are no longer the exception and the universes are exponentially bigger. We need tools to efficiently explore such universes.

Predictive Coding for Dummies (John Wiley & Sons 2012) tells you everything you need to know about predictive coding; read the book, but here is the Reader’s Digest version.

If you have a vast digital mass to review, you could machine cull, using Boolean logic search terms. Looking for documents about the valuation of commercial property in New York? Then, maybe, search for, among other terms, "asset" and "New York" or "NYC."

But your search will capture a host of false positives and miss lots of real positives. You will snare an email about taking a basset hound to the BonnyChic Dog Groomer, but miss a smoking-gun document where the author had fat thumbs or too many cocktails and typed "Mew Norj" instead of "New York."
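The failure mode is easy to reproduce. What follows is a minimal sketch, not any vendor's search engine: the three documents are invented, and reading the query as "asset" AND ("New York" OR "NYC") with plain substring matching is an assumption about how a naive search would be set up.

```python
# Hypothetical three-document corpus, invented for illustration.
documents = {
    "appraisal": "Revised asset valuation for the New York office tower.",
    "groomer": "Taking the basset hound to the BonnyChic Dog Groomer.",
    "smoking_gun": "The tower in Mew Norj is worth far less than we told the bank.",
}

def matches(text):
    """Naive substring query: asset AND ("new york" OR "nyc")."""
    lowered = text.lower()
    return "asset" in lowered and ("new york" in lowered or "nyc" in lowered)

# Names of the documents the query pulls in.
hits = [name for name, text in documents.items() if matches(text)]
print(hits)  # ['appraisal', 'groomer']
```

The dog groomer is captured because "basset" contains "asset" and "BonnyChic" contains "nyc"; the smoking gun is missed because nothing in it matches either term.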

Enter predictive coding. A statistically significant random sample is pulled from the universe, carefully reviewed and tagged for relevance. The reviewers tag messages about bassets as irrelevant, flag spelling glitches, and rate the relevance of each document. They feed their findings to the computer so that it learns the difference between relevant and not. Then the process is repeated (usually four to seven times, but as often as necessary) until the computer retrieves relevant documents with sufficient recall and precision, at an acceptable error rate, and is ready to select documents from the universe without further human intervention.
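For readers who want to see the shape of that loop, here is a toy sketch. It is not how any commercial predictive-coding product works: the word-counting "classifier," the six template documents, the 80 percent target, and every function name are assumptions made purely for illustration.

```python
import random
from collections import Counter

# Invented templates; the True/False flag stands in for a human reviewer's tag.
RELEVANT = ["valuation of the new york tower",
            "appraisal for the mew norj property",
            "nyc office appraisal memo"]
IRRELEVANT = ["basset hound grooming appointment",
              "lunch order for the office",
              "fantasy football picks"]

def build_universe(copies=50):
    docs = [(t, True) for t in RELEVANT] + [(t, False) for t in IRRELEVANT]
    return docs * copies

def train(labeled):
    """Add one point per word for each relevant document, subtract one for each irrelevant."""
    weights = Counter()
    for text, relevant in labeled:
        for word in set(text.split()):
            weights[word] += 1 if relevant else -1
    return weights

def predict(weights, text):
    """Call a document relevant when its words score above zero overall."""
    return sum(weights[word] for word in set(text.split())) > 0

def evaluate(weights, control):
    """Recall and precision against a human-tagged control sample."""
    tp = sum(1 for text, rel in control if rel and predict(weights, text))
    fp = sum(1 for text, rel in control if not rel and predict(weights, text))
    fn = sum(1 for text, rel in control if rel and not predict(weights, text))
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision

def run_protocol(target=0.8, max_rounds=7, batch=10, seed=0):
    rng = random.Random(seed)
    universe = build_universe()
    # In this toy the control sample may overlap the training batches.
    control = rng.sample(universe, 60)
    labeled = []
    for round_number in range(1, max_rounds + 1):
        labeled += rng.sample(universe, batch)   # reviewers tag another batch
        weights = train(labeled)                 # the computer relearns
        recall, precision = evaluate(weights, control)
        if recall >= target and precision >= target:
            break                                # good enough; stop training
    return round_number, recall, precision

rounds, recall, precision = run_protocol()
print(rounds, recall, precision)
```

The point of the sketch is the stopping rule: training ends when measured recall and precision clear a threshold the parties could agree on in advance, not when the machine is perfect.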


But here’s the rub. Remember, your adversary demands all relevant documents — and if the goal is to find all, none of this works. Not search terms, not predictive coding, not manual review, not all of the above. No matter how it is done, you are going to miss a significant number of responsive documents.

Humans err. Innocent reviewers negligently miss relevant documents out of fatigue or lack of wit; less-than-innocent reviewers omit relevant documents out of facileness or sharpness of wit. Technology-assisted review is often more accurate than manual review. See Grossman & Cormack, "Technology-Assisted Review," U. Rich. J.L. & Tech. (spring 2011).

But don’t pop the corks just yet. Machine searches aren’t all that accurate, either.

In the first case in which a court ordered the use of predictive coding over objection, Global Aerospace Inc. v. Landow Aviation L.P., No. 61040 (Loudoun Co., Va., Cir. Ct.), the coding algorithm culled 173,000 relevant documents from a universe of 1.3 million. But manual checks of samples found that the machine had found only 81 percent of relevant documents. See Debra Cassens Weiss, "Is predictive coding better than lawyers at document review?" ABA J. (January 22, 2013).
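One hedged way to read those figures: if all 173,000 culled documents were in fact relevant (an assumption; the reported numbers do not say so) and they represent 81 percent of everything relevant in the universe, the shortfall can be estimated with back-of-the-envelope arithmetic. The function name is made up for this sketch.

```python
def implied_misses(found, recall):
    """Estimate relevant documents left behind, given the number found and the recall rate."""
    total_relevant = found / recall   # found = recall * total_relevant
    return round(total_relevant - found)

missed = implied_misses(173_000, 0.81)
print(missed)  # 40580
```

On that reading, the machine left roughly 40,000 relevant documents unproduced.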

Eighty-one percent? What? Our fancy-schmancy predictive coding missed 19 percent of the relevant documents? If 173,000 is 81 percent of the total, that leaves roughly 40,000 unproduced documents. Lawyers, start your engines; write those motions to compel.


But wait, not so fast. In Da Silva Moore v. Publicis Groupe, 2012 U.S. Dist. Lexis 23350 (S.D.N.Y. 2012), Magistrate Judge Andrew Peck approved, over objection, a predictive-coding protocol that admittedly was a few gigabytes shy of perfect. So what? Tattoo the words of Peck on your arm and display them proudly: "The Federal Rules of Civil Procedure do not require perfection." Id. at *34. What the Federal Rules do require is "the just, speedy and inexpensive determination of every action" (Rule 1) and that the cost of discovery be proportional to the amount at stake (Rule 26). So 81 percent may be just fine for some cases, maybe more than fine for some others, less than fine for still others.

Eighty-one percent was just fine in Global Aerospace because, even though predictive coding was imposed by the court over objection, the process was transparent.

In Da Silva Moore, Peck heartily endorses the Sedona Conference Cooperation Proclamation (www.TheSedonaConference.org), which directs the parties to engage in a transparent and cooperative dialogue to design an efficient discovery plan at the outset of the litigation. If cooperation doesn’t work, he urges the parties to see the court before investing time and machismo in a potential fight.

Peck is joined by many thoughtful others, such as Magistrate Judge Nan Nolan in Kleen Products LLC v. Packaging Corp. of America, 2012 U.S. Dist. Lexis 139632 (N.D. Ill. 2012). More than a hundred other federal judges have endorsed the cooperation proclamation since its publication in 2008. Which raises the question: Why only a hundred? There are 1,700 federal judges and magistrates; roughly 24,000 state and administrative judges. What judge thinks it is wrong for parties to cooperate?

I suspect that no judge thinks so. In fact, doesn’t it seem off that we need a proclamation and judicial endorsement to champion cooperation?

We need these things because we don’t as a rule play well with others. And as we bicker, we drive up the cost of litigation for our hapless clients while we risk personal sanctions for ourselves as the court grows weary of the playground antics.

So let’s rethink. If you want to avoid the time, distraction and downside of discovery disputes (and you should want to), what is the best way to do that? Duh. Enter into a cooperative agreement with your adversary; you can’t be criticized for following a discovery protocol to which both sides agreed. And if you try and fail to reach an agreement, run to court now, not later, and ask the court to mediate and impose a protocol. You can’t be criticized, or sanctioned, for a transparent plan approved in advance by the court.

We have to control our inner beasts. It is not weakness to cooperate in discovery; it is good sense. Predictive coding may or may not be the right tool for every case. But transparency and cooperation always have been, always will be, good ways to go.

Robert L. Byman is treasurer and a member of the executive committee of the American College of Trial Lawyers and a partner at Chicago’s Jenner & Block. He can be reached at rbyman@jenner.com.