In our first two articles, we discussed the benefits of records management and methods for identifying and preserving ESI (our third article covered hot topics and recurring issues in recent e-discovery case law). Now it’s time to start collecting that information to review and, ultimately, to produce.

The collection process should be comprehensive without being over-inclusive. It should preserve the integrity of the data, the chain of custody and the authenticity of the documents. It should also be cost- and time-efficient and should not disrupt the organization’s business operations.

It is important that both the organization and outside counsel have collection teams that collaborate to ensure there is a properly executed collection plan. The organization’s collection team should include a discovery point person, preferably an attorney, and IT personnel. Outside counsel’s team should include a supervising attorney, a project manager and a vendor (if one is to be used).

Methods of collecting ESI

It generally is best to collect only the ESI that is needed for a case. There are two general methods for collecting ESI: full disk acquisitions and directed collections. The nature and claims of the case, the complexity of an organization’s IT infrastructure, the discovery requests and proportionality will inform the decision on which collection method to use.

A full disk acquisition involves making bit-by-bit forensic copies of targeted data assets. This process replicates all data, including deleted files, file fragments and information in the slack or unallocated space of the drive. It is best to consult a forensic technician when performing a full disk acquisition. Forensic collections like this, however, typically are necessary only in cases involving computer-based fraud or IP theft, or when user-specific computer habits are at issue.

Directed collections are usually preferable because they limit the scope and size of the collection. A directed collection can be limited to particular custodians, certain file types or documents that result from a keyword or date range search. Keyword searches conducted for this purpose must be carefully crafted and tested against the issues, claims and defenses in a case because the results of any search are only as good as the search itself. (The use of advanced search and analytics tools may be preferable to keyword search terms. See National Day Laborer Organizing Network v. United States Immigration and Customs Enforcement Agency (S.D.N.Y. 2012).)
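To make the idea of a directed collection concrete, the following is a minimal Python sketch of how a collection team might list candidate files for one custodian by file type and date range before anything is collected. It is an illustration only, not a substitute for a purpose-built collection tool: the names `CollectionScope`, `in_scope` and `list_candidates` are hypothetical, and the sketch assumes that the file system's last-modified time is the date the parties agreed to filter on.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass(frozen=True)
class CollectionScope:
    """Hypothetical record of an agreed scope: file types and a date range."""
    extensions: frozenset  # lowercase extensions, e.g. {".docx", ".xlsx"}
    start: datetime        # inclusive, timezone-aware
    end: datetime          # inclusive, timezone-aware


def in_scope(path: Path, scope: CollectionScope) -> bool:
    """Return True if the file matches the file-type and date-range criteria."""
    if path.suffix.lower() not in scope.extensions:
        return False
    # stat() reads file system metadata only; it does not open the file.
    modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
    return scope.start <= modified <= scope.end


def list_candidates(custodian_root: Path, scope: CollectionScope) -> list:
    """List in-scope files under one custodian's folder without copying them."""
    return sorted(
        p for p in custodian_root.rglob("*") if p.is_file() and in_scope(p, scope)
    )
```

A list produced this way can be reviewed with counsel and tested against the agreed criteria before the collection itself is run with an appropriate tool.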

Regardless of the method employed, the collection of ESI should be performed without altering the documents’ metadata. Metadata most commonly refers to data elements or properties of a document, such as its creation and modification dates, its author, its tracked changes and who last accessed or printed it. To preserve metadata, write-blocking hardware should be used to prevent writing to the files during the collection process. It is important to remember that courts generally view metadata as discoverable.

Searching for and collecting documents using standard Windows or MS Office utilities is not a recommended practice. These and many other off-the-shelf products have limited search capabilities, and their use may result in relevant documents being missed. Instead, organizations should use specialized tools designed for this purpose, which use write protection to preserve the metadata associated with the files.

Manually collecting documents is strongly discouraged as well. Companies should not collect documents by simply copying files from a network location or user workstation to external media. This so-called “drag and drop” method of collection alters the metadata of the documents and exposes an organization to potential claims of mishandling or spoliation of documents (see Green v. Blitz U.S.A., Inc. (E.D. Tex. 2011)). Any company performing such a collection puts itself at risk of becoming a witness with regard to the collection.
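The effect of an ordinary copy on metadata is easy to demonstrate. The short Python sketch below uses the standard library's `shutil.copyfile`, which copies a file's contents but not its timestamps, to show that the copy carries a new last-modified time. The function name `naive_copy_changes_mtime` is hypothetical, and which metadata fields change in practice depends on the operating system and the copy method used; this shows only one field and one method.

```python
import os
import shutil
import tempfile
from pathlib import Path


def naive_copy_changes_mtime(source: Path, destination: Path) -> bool:
    """Copy file contents only and report whether the last-modified time differs."""
    shutil.copyfile(source, destination)  # copies data, not timestamps
    return source.stat().st_mtime != destination.stat().st_mtime


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "memo.txt"
        src.write_text("draft")
        # Backdate the original so it appears to have been last modified in 2011.
        os.utime(src, (1_300_000_000, 1_300_000_000))
        print(naive_copy_changes_mtime(src, Path(tmp) / "copy.txt"))
```

The copied file's contents are identical to the original, but its last-modified time reflects the moment of copying rather than the document's history, which is the kind of discrepancy an adversary may later question.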

The collection process should also be well-documented to establish that defensible processes were used should the collection ever be challenged.
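One common way to document a collection is a manifest that records, for each file, who collected it, from which custodian, and a cryptographic hash that can later show the file has not changed. The Python sketch below illustrates the idea under stated assumptions: the field names and the functions `sha256_of`, `manifest_entry` and `write_manifest` are hypothetical, and SHA-256 is chosen here only as an example of a widely used hash. Because reading a file can update its last-accessed time on some systems, a script like this would ordinarily be run against write-protected media or against the collected copy rather than the live source.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_entry(path: Path, custodian: str, collected_by: str) -> dict:
    """Build one manifest record (hypothetical field names) for a collected file."""
    info = path.stat()
    return {
        "path": str(path),
        "custodian": custodian,
        "collected_by": collected_by,
        "size_bytes": info.st_size,
        "modified_utc": datetime.fromtimestamp(
            info.st_mtime, tz=timezone.utc
        ).isoformat(),
        "sha256": sha256_of(path),
        "logged_utc": datetime.now(timezone.utc).isoformat(),
    }


def write_manifest(entries: list, out_file: Path) -> None:
    """Save the manifest as JSON so it can be produced or audited later."""
    out_file.write_text(json.dumps(entries, indent=2), encoding="utf-8")
```

If the collection is ever challenged, re-hashing a produced file and comparing the result with the manifest offers a simple check that the file is the same one that was collected.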

What data should be collected

After determining the collection method, outside and in-house counsel must make a series of decisions regarding what types of data will be collected. It is best for the parties to agree on these issues in order to streamline the collection process and minimize discovery costs by reducing both the volume of ESI to be collected and the number of disputes. Ideally, consensus (which should be reduced to writing) should be reached on the following issues:

  • Scope of the collection (custodian names, date range, file type, location)
  • Keyword search terms
  • Whether active or inactive data will be collected
  • What constitutes “reasonably accessible” data
  • The form of the eventual production

Parties must also keep in mind that, under Federal Rule of Civil Procedure 26(b)(2)(C)(iii), the costs of discovery should be proportional to the amount at stake in the case, each party’s resources and the importance of the issues involved.


Collecting ESI is less art and more science. Having a game plan, documenting key steps and addressing issues at the outset with an adversary are all important to avoid increased cost and unnecessary satellite litigation. Knowing what data you have and the best way to collect it will facilitate a smooth and hopefully uneventful collection process.