Symantec Corp. recently issued the findings of its 2011 Information Retention and eDiscovery Survey, which examined how enterprises manage their ever-growing volumes of electronically stored information (ESI). Interestingly, the survey of legal and IT personnel at 2,000 enterprises worldwide found that email is no longer the primary source of ESI that companies produce in response to e-discovery requests and governmental inquiries.

When asked what types of documents were most commonly part of an e-discovery request, respondents selected files/documents (67 percent) and database/application data (61 percent) ahead of email (58 percent). The survey suggests that, unlike a decade ago, email no longer axiomatically equals e-discovery.

Some may react incredulously to these results, like noted e-discovery expert Ralph Losey, who continues to believe in the paramount importance of email: “In the world of employment litigation it is all about email and attachments and other informal communications,” he says. “That is not to say databases aren’t also sometimes important. They can be, especially in class actions. But, the focus of eDiscovery remains squarely on email.”

To some extent, the relative decline in email’s importance is better understood as the ascendancy of other data types, which now have an unquestioned seat at the table. To understand the ramifications of increasingly heterogeneous e-discovery requests, it is useful to contrast email discovery with both loose-file and database discovery.

To begin with, as Losey correctly notes, email is incredibly helpful in a discovery context. It has a number of distinctive attributes that make it almost singularly useful in establishing timelines and answering the always important litigation question of “who knew what, when?” This is because email is laden with useful metadata (data about data), such as to/from information, sent/received times, cc/bcc information, read receipts and forwarding information. All of this metadata is organized in a structured database of sorts (commonly Outlook) that readily permits custodian-level analysis, which is often the cornerstone of discovery. In fact, numerous companies have created software applications that better harness the power of email by reconstructing email threads, detecting duplicates and near duplicates, and identifying missing participants in email conversations.
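
To make the “who knew what, when?” point concrete, the following is a minimal sketch of how header metadata can be turned into a timeline. It uses only Python's standard `email` package; the two sample messages, the addresses and the `build_timeline` helper are hypothetical and are not drawn from the survey or from any particular e-discovery product.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical raw messages, invented purely for illustration.
RAW_MESSAGES = [
    "From: alice@example.com\nTo: bob@example.com\nCc: carol@example.com\n"
    "Date: Tue, 01 Mar 2011 09:15:00 +0000\nSubject: Q1 forecast\n\nDraft attached.",
    "From: bob@example.com\nTo: alice@example.com\n"
    "Date: Mon, 28 Feb 2011 17:40:00 +0000\nSubject: Forecast?\n\nWhen is it due?",
]


def build_timeline(raw_messages):
    """Extract sender, recipients and sent time, then order by sent time."""
    entries = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        # Combine To and Cc headers into one flat list of addresses.
        recipient_headers = msg.get_all("To", []) + msg.get_all("Cc", [])
        entries.append({
            "sent": parsedate_to_datetime(msg["Date"]),
            "sender": getaddresses([msg["From"]])[0][1],
            "recipients": [addr for _, addr in getaddresses(recipient_headers)],
            "subject": msg["Subject"],
        })
    return sorted(entries, key=lambda entry: entry["sent"])


timeline = build_timeline(RAW_MESSAGES)
for entry in timeline:
    print(entry["sent"].isoformat(), entry["sender"], "->", entry["recipients"])
```

Real collections would also need to handle missing or malformed headers, time zones and attachments, which this sketch ignores.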

As such, email is uniquely situated in the middle of the spectrum between even more highly structured ESI, like database information, and completely unstructured ESI, like loose files.  Now that a variety of data types are increasingly being requested, each type of ESI creates unique challenges as an entity attempts to navigate the Electronic Discovery Reference Model (EDRM) spectrum.

Often, the first major e-discovery task is to preserve potentially relevant ESI so that it isn’t intentionally or unintentionally lost, altered or deleted. Database ESI is particularly vexing here because relational databases are in constant motion: an application like a customer relationship management (CRM) system will have ESI elements that are updated, viewed, modified and exported by hundreds of users on a nearly simultaneous basis. While it’s often possible to take snapshots of these relational databases, any attempt to literally freeze this information in place would prevent users from making full business use of the applications, and would likely result in rioting in the streets.

Loose files are in some ways much easier to preserve since those ESI elements, like Word documents stored on a network share drive, aren’t as often in flux or linked to other content in a relational sense. The challenge here instead involves associating content with key custodians who are under a legal hold, since the unstructured nature of the information makes it harder (if not impossible in some instances) to discern ownership. In instances where ESI needs to be preserved strictly because of its subject matter, as in patent or product-related litigation, the challenge is that this unstructured information often is not indexed, meaning that keyword searches aren’t possible.
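
To illustrate why indexing matters for content-based preservation, here is a minimal sketch of an inverted index over loose files. The file paths, document text and the `build_index` and `search` helpers are hypothetical; commercial tools also extract text from native formats such as Word, which this example assumes has already been done.

```python
import re
from collections import defaultdict

# Hypothetical extracted text from files on a network share.
DOCUMENTS = {
    "share/specs/valve_design.txt": "Revised valve design for the patent filing.",
    "share/hr/holiday_schedule.txt": "Office closed on public holidays.",
    "share/legal/patent_memo.txt": "Memo on prior art and the valve patent.",
}


def build_index(documents):
    """Map each lowercase term to the set of paths that contain it."""
    index = defaultdict(set)
    for path, text in documents.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(path)
    return index


def search(index, *keywords):
    """Return the sorted paths that contain every keyword."""
    hits = [index.get(keyword.lower(), set()) for keyword in keywords]
    return sorted(set.intersection(*hits)) if hits else []


index = build_index(DOCUMENTS)
print(search(index, "valve", "patent"))
```

Without such an index, each keyword search would require reopening and rescanning every file, which is rarely practical across a large share drive.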

Collection, which is the next major step in the EDRM process, poses unique challenges as well, particularly for database ESI such as financial, transactional and operational systems. Here, the inherent power of database information comes from the relational links among elements in the database. If a piece of information is taken out of the database (like a single “opportunity” in a CRM system), much of the really useful context is lost, and this standalone piece of ESI often appears more like a fish out of water. As an example, all of the reporting functionality relating to this lone opportunity would be lost once it is extracted.
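
The loss of context can be sketched with a toy relational schema. The tables, columns and values below are hypothetical and far simpler than a real CRM system; the example uses an in-memory SQLite database from Python's standard library.

```python
import sqlite3

# Hypothetical, heavily simplified CRM schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE opportunities (
        id INTEGER PRIMARY KEY,
        account_id INTEGER REFERENCES accounts(id),
        owner_id INTEGER REFERENCES users(id),
        amount REAL,
        stage TEXT
    );
    INSERT INTO accounts VALUES (1, 'Acme Corp');
    INSERT INTO users VALUES (7, 'J. Smith');
    INSERT INTO opportunities VALUES (100, 1, 7, 250000.0, 'Negotiation');
""")

# Extracted on its own, the row carries only opaque foreign keys.
standalone = conn.execute(
    "SELECT * FROM opportunities WHERE id = 100"
).fetchone()

# Inside the database, joins resolve those keys into meaningful context.
in_context = conn.execute("""
    SELECT o.id, a.name, u.name, o.amount, o.stage
    FROM opportunities AS o
    JOIN accounts AS a ON a.id = o.account_id
    JOIN users AS u ON u.id = o.owner_id
    WHERE o.id = 100
""").fetchone()

print(standalone)
print(in_context)
```

The standalone row shows an account and owner only as numeric identifiers, while the joined row names them, which is the kind of context a targeted report can preserve.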

Similarly, at the far end of the EDRM spectrum, production of structured information has also proved vexing because it generally isn’t possible to carve out small, self-contained subsets of the requested information. Instead, targeted reports typically are used, particularly to avoid giving the requesting party direct access to database systems (which is sometimes requested).

As an example, in Ex parte Wal-Mart, Inc., [809 So. 2d 818 (Ala. 2001)], the plaintiff allegedly was injured by falling merchandise in a Walmart store.  The plaintiff requested the production of customer inquiry and workers’ compensation claims against Walmart, which were stored in a database maintained and controlled by CMI, a wholly owned subsidiary of Walmart. The trial court required Walmart to produce all incident reports from Alabama stores for the five-year period preceding the date of the plaintiff’s injury.  In reviewing the trial court’s discovery order, the Alabama Supreme Court narrowed the discovery order to incidents involving injury from falling merchandise.

Whether email’s reign as top e-discovery dog is really over isn’t the point. Instead, the lesson should be that e-discovery has become increasingly heterogeneous. Therefore, failing to proactively deploy processes, procedures and technology to account for highly structured ESI, like databases, and highly unstructured ESI, like loose files, injects unnecessary risk into a process that is already highly complex.