In the past year, I attended two high-level e-discovery conferences at which participants spoke of living in a "bubble," by which they meant a world where e-discovery experts discuss the ramifications of new search and review technologies (think "predictive coding") or debate the implications of recent case law, without regard for the fact that most e-discovery matters do not involve millions of files or that, perhaps most importantly, most lawyers still have virtually no background in, or understanding of, e-discovery. With these experiences in mind, I am going to try to avoid the columnist’s temptation to show how in the know I am by writing about the Next Big Thing, and will instead write about more basic, low-tech approaches to e-discovery issues. Today, I will explore a very basic idea: You save money in e-discovery production by having less to produce, and you will have less to produce if you know your case.
Before We Get Started
I will begin by quoting (again) Voltaire: "The perfect is the enemy of the good." Perfection is not the standard by which a production will be judged — reasonableness is. The lawyer who can explain all steps taken and show that all were reasonable takes a defensible position and should prevail. Moreover, even if reasonable steps are second-guessed by a court, they will not result in sanctions.
It is also important to remember that in e-discovery production, one size does not fit all. For example, I will discuss how to limit the number of sources from which e-discovery will have to be produced, but in some cases that "limit" is more inclusive than in others, and in other cases such limiting simply cannot be done. I concede that my low-tech rules do not always apply, but they remain a good standard. Try to make them apply and, if you cannot, at least you will know that your large and costly e-discovery production was unavoidable.
Interviews, the Data Map and Collections
The first step in any e-discovery production is to know where your data is and what, basically, it entails. To gain this information, you will have to speak to the client personnel who know the matter as well as those who know the client’s IT infrastructure.
From the subject-matter expert, you must get a list of all users who generated or received potentially relevant ESI. From the users, you must ascertain all the locations where they saved emails or e-docs. Such locations include: (1) the hard drives of their desktops or laptops; (2) email servers and file servers; (3) server project folders; (4) cloud storage; (5) home device(s); (6) personal webmail accounts; and (7) cell phones and other handheld devices, if such are important to the matter. Once you establish user conduct, you need to ask IT what those locations described by the users translate into from IT’s perspective.
For example, are everyone’s email boxes on a company-owned server and, if so, is that server on the premises or housed elsewhere? If the client uses cloud computing for email and file storage, how can that data be accessed? Do the answers change depending upon the part of the relevant time period on which we’re focusing?
Finally, you will ask your subject matter expert to rank the users and, if any, the shared locations in order of importance. It is better to err on the side of inclusion. If the debate is whether five or 10 users are of "middle importance," go with 10.
Through these interviews, you can draw your data map, i.e., state where the data is and rank sources by order of importance. This knowledge should help you make defensible arguments, under Rule 26(b)(2)(B), regarding what is worth producing. You will, however, need to know more.
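A data map need not be elaborate. As a minimal sketch, assuming a three-tier importance scale and using invented custodian names and locations (none of which come from any real matter), the interview results could be recorded and ranked like this:

```python
from dataclasses import dataclass


@dataclass
class Source:
    custodian: str  # user or shared location named in the interviews
    location: str   # where the ESI lives, as IT describes it
    tier: int       # 1 = most important, 3 = least (an assumed scale)


def build_data_map(sources):
    """Return the sources ordered by importance tier, then by name."""
    return sorted(sources, key=lambda s: (s.tier, s.custodian))


# Hypothetical interview results, for illustration only.
sources = [
    Source("B. Jones", "laptop hard drive", 2),
    Source("A. Smith", "on-premises email server", 1),
    Source("Shared project folder", "file server", 1),
    Source("C. Lee", "personal webmail", 3),
]

data_map = build_data_map(sources)
for s in data_map:
    print(f"Tier {s.tier}: {s.custodian} ({s.location})")
```

A spreadsheet serves the same purpose. What matters is that each source has a recorded location and a recorded rank that you can later point to when explaining your choices.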
First, before you dig deeper into your data, you will have to take steps to secure it all. The strategy of reducing costs by producing less has two predicates: (1) you know your data and so can make reasoned, defensible judgments as to what gets produced and (2) you preserve your data, so that if your judgments are not accepted by a court, you can go back to sources without risk of spoliation.
How to preserve does not have a one-size-fits-all answer. Technically, the best way to preserve is by making bit-stream, forensic images of the data. That step usually means bringing in a vendor and thus incurring an out-of-pocket cost early in the litigation, when you do not yet know whether such cost will prove unnecessary (if the complaint is dismissed before discovery production commences), prescient (if the cost is insignificant and the process leads to the recovery of materials essential to your case), or somewhere in between.
While one size does not fit all, my bias is toward forensic collection. Litigants find that cost hard to accept, principally because it arises early in the litigation. Experienced litigants understand the benefits better, while those who do not bear that cost quite often spend far more later, in litigation over whether the data was collected properly, than they ever would have on the collection itself. The well-known matter, Pension Committee of the University of Montreal Pension Plan v. Banc of America Securities, No. 05 Civ. 9016 (S.D.N.Y. Jan. 15, 2010), provides a perfect illustration. There, the plaintiffs tried "preservation in place," i.e., simply instructing users not to delete data. When, however, years had passed and they had to produce discovery, a great deal of the data could not be found. That should not be surprising: users in a business use ESI for business, not to preserve it for a lawsuit.
The cost of preservation by forensic collection, then, should in most cases be outweighed by the headaches it avoids for users who must do the preserving. Moreover, it gives the litigant who wants to produce the smallest dataset possible the confidence that, if the court disagrees, he or she can, many years after the duty to preserve arose, return to the source (the forensically collected data) and produce what the court ordered be produced.
Processing, Searches and Production
The next step is to determine what to produce. To produce data, you must first have it processed into a searchable database format, then review it (using automated searches of various kinds, de-duplication, etc.), and then produce it, usually by electronically printing it to TIFF with Bates stamps, redactions, etc.
The fundamental truth about the cost of processing, review and production is that, while pricing and service vary between vendors, the savings a client will see from not processing and producing data at all dwarf the savings that come from the best pricing the client will ever see. Thus, while you should always try to get the best prices, the initial focus should be on the reduction of the dataset that must be processed, reviewed and produced.
While your data map is necessary to tell you where to look first, where in all likelihood you need not look at all, and where you may have to look, it is not sufficient, because your map is based upon statements given to you that have not been tested.
In United States v. O’Keefe, 537 F. Supp. 2d 14 (D.D.C. 2008), and Equity Analytics v. Lundin, 248 F.R.D. 331, 333 (D.D.C. 2008), U.S. Magistrate Judge John M. Facciola, addressing how keyword searches were to be structured, made it clear that a producing party could not simply declare by fiat that its keyword choice was correct: as with any statement claiming to posit a scientific truth (in this case, that x terms will yield all responsive files), search terms need to be tested in order to show their efficacy.
This principle carries over to all aspects of e-discovery production. It is not enough that a person with knowledge of the matter has stated that only some ESI is important. Such a statement is a good point of departure, but must be tested.
Thus, the next step is not just to review and produce from those custodians identified by the client as important, but also to review the data of some middle-ground custodians and, possibly, even one deemed unimportant. Review should either confirm or refute the client’s pronouncement as to where the responsive data is. If it is confirmed, you have made your choices defensible by testing and saved money; if it is refuted, you should continue reviewing until you reach the dry well. Regardless, you have produced the minimum of what you need to produce.
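The test described above amounts to comparing how much responsive material each tier of custodians actually yields. The following is only a sketch: the review counts are invented, and the 1 percent "dry well" cutoff is an assumption chosen for illustration, not a legal standard.

```python
def responsiveness_by_tier(reviewed):
    """Compute the share of reviewed documents found responsive, per tier.

    reviewed: list of (tier, docs_reviewed, docs_responsive) tuples,
    one tuple per custodian whose data was reviewed.
    """
    totals = {}
    for tier, n_reviewed, n_responsive in reviewed:
        seen, hits = totals.get(tier, (0, 0))
        totals[tier] = (seen + n_reviewed, hits + n_responsive)
    return {tier: hits / seen for tier, (seen, hits) in totals.items() if seen}


# Assumed cutoff below which a tier is treated as a "dry well".
DRY_WELL_RATE = 0.01

# Hypothetical review results: (tier, documents reviewed, documents responsive).
reviewed = [
    (1, 4000, 1200),
    (1, 3000, 600),
    (2, 2500, 50),
    (3, 2000, 4),
]

rates = responsiveness_by_tier(reviewed)
keep_reviewing = [tier for tier, rate in sorted(rates.items())
                  if rate >= DRY_WELL_RATE]

for tier, rate in sorted(rates.items()):
    print(f"Tier {tier}: {rate:.1%} responsive")
print("Tiers still worth reviewing:", keep_reviewing)
```

Whatever cutoff you choose, record it along with the counts. The documented numbers, not the choice of threshold, are what make the decision to stop reviewing defensible.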
You will also have a good idea of how much it will cost to process, search, review (both technical and attorney review costs) and produce the remaining data. The cost of what you have already produced will allow you to scale up to the cost of producing the rest (some vendors offer volume discounts or tiered pricing, but overall the cost scales roughly linearly with volume), and so provide real numbers when making the Rule 26(b)(2)(B) argument that the benefit of producing such data is not worth the cost.
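The scaling itself is simple arithmetic. As a rough sketch, assuming cost is strictly proportional to data volume (ignoring any volume discounts) and using invented figures:

```python
def projected_cost(sample_cost, sample_gb, remaining_gb):
    """Extrapolate the cost of the remaining data from the cost already incurred.

    Assumes cost scales linearly with volume, measured here in gigabytes.
    """
    if sample_gb <= 0:
        raise ValueError("sample_gb must be positive")
    cost_per_gb = sample_cost / sample_gb
    return cost_per_gb * remaining_gb


# Hypothetical figures: all-in cost (processing, search, review, production)
# for the data already produced, and the volume still untouched.
sample_cost = 18_000.0
sample_gb = 12.0
remaining_gb = 150.0

estimate = projected_cost(sample_cost, sample_gb, remaining_gb)
print(f"Projected cost for remaining data: ${estimate:,.0f}")
```

If your vendor offers tiered pricing, the linear figure will overstate the cost somewhat, so treat it as an upper estimate and say so when you present it.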
Vaughn v. LA Fitness International, 285 F.R.D. 331 (E.D. Pa. Aug. 16, 2012), provides a perfect example of how taking such review steps and documenting their cost and results can lead to a producing party prevailing under Rule 26(b)(2)(B). There, the court granted the defendant’s motion that, because the plaintiffs’ preservation demand was overbroad, if the plaintiffs wished the data to be preserved and searched, they would have to pay for it. By the time the issue arose, the defendant had produced a great number of documents and was able to show that the documents sought by the plaintiffs were of little value, while the cost of preservation and production would run to several hundred thousand dollars. While most producing parties complain generally about how much e-discovery costs, the defendant in Vaughn was armed with specific facts, which persuaded the court. If you are similarly armed, you should obtain the same result.
There is no question that the world of e-discovery has a prominent place for predictive coding, concept searching and all of the other complex search engines that dominate discussions about e-discovery. For many practitioners, however, dealing with small or mid-sized matters, using such complex tools would prove unnecessarily costly. In all matters, it is important to remember that, by diligently mapping the sources of ESI and testing that mapped data to see to what degree it resembles how it has been described, a good lawyer can reduce the cost of e-discovery production in the simplest of ways, i.e., by reducing the amount of data to be reviewed and produced.
Leonard Deutchman is general counsel and vice president of LDiscovery LLC, a firm with offices in New York City, Philadelphia, Washington, D.C., Chicago, San Francisco and Atlanta that specializes in electronic digital discovery and digital forensics.