It is no longer a pejorative comment to say one “walks with his head in the clouds.” In fact, people and businesses are moving ever forward by utilizing “cloud computing.” Cloud computing is expected to reach $16.7 billion in revenue by 2013.1 Yet, few in the legal field fully understand the basics behind cloud computing, or how it may prove problematic to apply established rules pertaining to electronic discovery to data in the cloud.
At its most basic level, cloud computing is a system that not only allows for storage of data, such as emails and documents, at a remote facility, but also allows for immediate access to that data, whenever and wherever one demands, by cell phone, tablet, or personal computer. Moreover, cloud computing allows numerous users to access and manipulate that data at the same time, enabling real-time sharing and collaboration.
This can occur because data is not simply stored on one device, such as a single hard drive. Rather, multiple computers store the data in multiple locations, called data centers, creating a system of redundancy and backup. Thus the “cloud” is made up of several parties. The Cloud Service Provider (CSP) is interdependent with other providers. A CSP that provides services such as an email application is known as a Software as a Service (SaaS) provider. This provider may depend on another party, the Platform as a Service (PaaS) provider, to host log files and provide an operating system. In turn, a provider known as an Infrastructure as a Service (IaaS) provider will store files. Each of these players has multiple operating facilities throughout the United States, if not the world. For example, Google has considered placing data centers on ships at sea, while Microsoft has looked into placing data centers in Siberia.2
A cloud-based investigation may involve looking into each link in the dependency chain. As the dependencies are highly dynamic, corruption in that chain or lack of coordination between the parties could lead to serious discovery problems, including loss of data or inability to retrieve the data. These are the issues with which e-discovery must deal.
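The dependency chain can be illustrated with a minimal sketch in Python. The provider names and the `DEPENDS_ON` map are hypothetical, chosen only to mirror the SaaS, PaaS and IaaS relationship described above; a real investigation would have to establish the actual chain from the providers' contracts and records.

```python
# Hypothetical providers; each entry lists the providers it relies on.
DEPENDS_ON = {
    "ExampleMail (SaaS)": ["ExamplePlatform (PaaS)"],
    "ExamplePlatform (PaaS)": ["ExampleStorage (IaaS)"],
    "ExampleStorage (IaaS)": [],
}


def custodians(provider, depends_on):
    """Return every provider reachable from `provider`, in the order visited."""
    seen = []
    stack = [provider]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        # A provider absent from the map contributes no further links.
        stack.extend(depends_on.get(current, []))
    return seen


chain = custodians("ExampleMail (SaaS)", DEPENDS_ON)
print(chain)
```

Every name in the resulting list is a party that may hold some portion of the data, which is why a breakdown at any one link can affect what is ultimately retrievable.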
To help resolve and mitigate e-discovery issues, the New York State Bar Association, in September 2011, put forth a series of guidelines to assist attorneys in navigating the murky waters of e-discovery and Electronically Stored Information (ESI).3 The first such guideline sounds simple but is difficult to apply. That guideline reads: “A client must preserve evidence when that client has notice of pending litigation…”
The issue of when that notice becomes effective and the party is obligated to issue a “legal hold” on the ESI is fact-specific. Yet, as the court in VOOM HD Holdings v. EchoStar Satellite, 93 AD3d 33 (1st Dept. 2012) ruled, sanctions may be imposed for spoliation of ESI. For cloud-based data, though, there may be difficulty in complying with VOOM. All cloud servicers use some mode of multitenancy, which is the ability for multiple customers (tenants) to share the same applications and/or computer resources. It is through these multitenant systems, including the use of meta-data, that cloud servicers achieve high cost efficiencies and low-cost delivery.
Cloud meta-data is created through an “associative context” that observes users and their evolving relationship with content.4 That meta-data is shared and common to several clients, generated by several clients, and maintained in data centers that are shared among clients. The meta-data created is used to speed performance and inform business actions. To impose a “hold” and restraint on the use of that data would cripple cloud computing and deprive the CSPs of their function.
Further, courts will need to consider the ramifications of imposing sanctions should a non-litigant or the CSP destroy or erase the “held information” through its own access to the Cloud, and what happens to the access and security of data subject to a litigation hold for other CSP clients.
To determine what ESI to preserve, Federal Rule of Civil Procedure 34(a) sets the baseline. It requires preservation of ESI in the party’s “possession, custody or control.” Building upon that, the NYSBA’s second guideline provides the following:
In determining what ESI should be preserved, clients should consider: the facts upon which the triggering event is based and the subject matter of the triggering event, whether the ESI is relevant to that event; the expense and burden incurred in preserving the ESI; and whether the loss of the ESI would be prejudicial to an opposing party.
Due to the nature of cloud computing, and the third parties involved, data saved in the cloud may not be clearly in the possession, custody or control of any one party. Further, because cloud computing deals in part with the sharing of resources, with data stored in data centers and shared by clients, isolating and then retrieving the data of one client may adversely affect another client not involved in the litigation. ESI discovery in the cloud may create liabilities for retrieving ESI belonging to other clients. As the cloud-based servers share information, there is no one dedicated “hard drive” or storage unit for a client. The data is moved and shifted among databases as needed. Thus, the data retrieved for one client may indeed contain information, confidential, privileged, or otherwise, pertaining to other clients.5
Moreover, there may be multiple layers of data. Layers of data include client-specific data, client-specific meta-data, and meta-data common to several clients, generated by several clients and maintained in data centers that are shared among clients. This, in turn, makes it more difficult to isolate data for discovery purposes.
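The three layers can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the `Record` type, the client names, and the idea that each record carries a list of the tenants who generated it. Actual cloud providers do not necessarily label their data this way.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    record_id: str
    kind: str             # "data" or "meta-data"
    tenants: frozenset    # clients whose activity produced the record


def classify(record, litigant):
    """Place a record in one of the three layers described in the text."""
    if litigant not in record.tenants:
        return None  # the record has nothing to do with the litigant
    if record.kind == "data":
        return "client-specific data"
    if record.tenants == frozenset({litigant}):
        return "client-specific meta-data"
    return "common meta-data"


# Hypothetical contents of a shared (multitenant) store.
STORE = [
    Record("doc-1", "data", frozenset({"Client A"})),
    Record("doc-2", "data", frozenset({"Client B"})),
    Record("idx-1", "meta-data", frozenset({"Client A"})),
    Record("idx-2", "meta-data", frozenset({"Client A", "Client B", "Client C"})),
]

layers = defaultdict(list)
non_parties = set()
for record in STORE:
    label = classify(record, "Client A")
    if label is None:
        continue
    layers[label].append(record.record_id)
    if label == "common meta-data":
        # Producing this record also exposes the other tenants behind it.
        non_parties |= record.tenants - {"Client A"}

print(dict(layers))
print(sorted(non_parties))
```

In this sketch, a request directed at Client A reaches one record that was also generated by Client B and Client C, neither of whom is a party. That is the isolation problem in miniature.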
In addition to the data requested, courts are now requiring the production of meta-data. Guideline 7 states: “Counsel should agree on the form of production of ESI for all parties prior to producing ESI.”
The guideline commentary goes on to state that requests for meta-data must not be overbroad, and should be specific.
Counsel should consider (i) the ability to search by authors, recipients and text, as necessary to identify certain subject matters and to be able to segregate potentially privileged ESI which was authored by, sent to, or refers to in house or outside counsel or discusses legal advice; (ii) whether the court requires an index of ESI as it corresponds to the requests, and (iii) the list of major players involved in the case and other similar issues.
Nassau County has issued its own definitions for data to be preserved under its Commercial Division E-filing Guidelines:6
As used herein, “ESI” includes, but is not limited to, e-mails and attachments, voice mail, instant messaging…Native files and the corresponding Meta-data which is ordinarily maintained.
As used herein, the term “Meta-data” means: (i) information embedded in a Native File that is not ordinarily viewable or printable from the application that generated edited or modified such Native File; and (ii) information generated automatically by the operation of a computer or other information technology system when a Native File is created, modified, transmitted, deleted, sent, received or otherwise manipulated by a user of such system. Meta-data is a subset of ESI.
Meta-data is hidden information, and at times deleted matter, in an electronic file that is not apparent to the reader viewing a hard copy or screen image. It may include authors, origins, dates, comments, document versions, and tags, such as those common in Facebook, that identify and reference people in notes, photos and videos. Cloud computing systems and programs often generate meta-data to improve organization and search capabilities.
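As a small illustration of system-generated meta-data, the following Python sketch creates a file and then reads back information the operating system recorded automatically. The file name and contents are hypothetical, and this shows only file-system meta-data on a single computer, not the meta-data a cloud provider maintains.

```python
import os
import tempfile
from datetime import datetime, timezone

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "memo.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("Draft memo")

    # None of the values below were typed by the author of the memo;
    # the operating system recorded them when the file was written.
    info = os.stat(path)
    system_metadata = {
        "size_bytes": info.st_size,
        "last_modified_utc": datetime.fromtimestamp(
            info.st_mtime, tz=timezone.utc
        ).isoformat(),
    }

print(system_metadata)
```

A printed copy of the memo would show only the words “Draft memo”; the size and modification time exist only in the electronic record.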
Allowing a litigant to discover meta-data from cloud-based storage may involve the discovery of meta-data common to non-litigants. Its discovery and dissemination may be harmful to those non-parties. Consequently, those non-parties may oppose e-discovery requests and seek protective orders. Moreover, those whose data is being sought may “hide” their information as common meta-data in the cloud, and should the court issue protective orders for non-parties, those orders would also prevent discovery of the party’s meta-data. On the other hand, should the court allow the discovery of the common meta-data, there is the possibility of disclosing private information held by those non-parties.
Even more troubling is that meta-data is often inseparable among clients. Cloud providers utilize the meta-data in an effort to optimize their retrieval ability and speed. Further, CSPs exchange and use client data between different cloud providers to deliver that data anywhere on demand. These are two of the primary functions of cloud computing. To put providers in a position either to supply the meta-data or become the focus of litigation themselves would force cloud providers to change delivery and security policies. The result would be increased prices for services and a slowing down of the services upon which cloud computing is based.
A further issue involves the determination of which party should incur the cost of the e-discovery. The court in US Bank v. Greenpoint Mortgage Funding, 94 AD3d 58 (1st Dept. 2012), citing VOOM, supra, ruled that the “Producing party [is] to bear the cost of production.” However, the unanswered question in recovering cloud data is which party must pay the costs that the cloud service provider and its dependent providers incur in producing that data.
The court did, however, leave room for shifting the cost burden should the costs “create an undue burden or expense” on the responding party, citing Zubulake v. UBS Warburg, 220 FRD 212 (SDNY 2003). In Tener v. Cremer, 89 AD3d 75 (1st Dept. 2011), data was sought pursuant to a subpoena on a non-party. The court looked at the Nassau County guidelines, which read: “ESI is not to be deemed ‘inaccessible’ based solely on its source or type of storage media. Inaccessibility is based on the burden and expense of recovering and producing the ESI and the relative need for the data.” The court then held that should the data be available, the cost of obtaining that data should be allocated to the party requesting it.
Cloud-based computing systems are here to stay and present significant issues regarding e-discovery. Just as the world moved from mainframe architecture to personal computing, cloud computing involves moving certain aspects of personal computing to the cloud. The guidelines promulgated by the state bar and the various courts provide useful parameters, but they may also be problematic.
The paradigm shift to cloud computing, which at its very foundation is the sharing of resources across numerous data centers by numerous clients, requires the e-discovery rules to be flexible. The preservation of data through a “litigation hold” limits the use of cloud computing as it detracts from the elasticity of resources for other clients who demand the same use and operation from those data centers.
Glenn F. Hardy is a sole practitioner of Glenn F. Hardy PC in Garden City. He practices criminal defense and is a member of the Tort Litigation Committee of the New York City Bar.
1. Louis Columbus, “Roundup of Cloud Forecasts and Market Estimates,” Forbes.com, November 2012.
2. Murad Ahmed, “Google Finds Seafaring Solution,” The Times of London, Sept. 15, 2008.
3. New York State Bar Association Best Practices in E-Discovery in New York State and Federal Courts, NYSBA, September 2011.
4. David Vellante, “Meta-data in the Cloud: Creating New Business Models,” wikibon.org, Nov. 3, 2010.
5. Alberto F. Araiza, “Electronic Discovery in the Cloud,” Duke Law & Technology Review, Nov. 8, 2011.