In this three-part series, we will discuss some of the challenges and issues companies must recognize when litigation arises and the time comes to identify and map potentially relevant data that is stored with Infrastructure-as-a-Service (IaaS) providers. Some of the key issues that will be addressed arise from what, on the surface, appear to be simple questions:
- Who has my data?
- What control do I have over my data?
- Where is my data being stored?
We will also discuss specific steps that can be taken to collect data stored in the cloud in order to defend the authenticity of that data.
As companies start to fully realize cost savings from outsourcing infrastructure, Fortune 1000 companies’ use of IaaS services (Amazon Web Services and Cisco CloudVerse, for example) is rapidly increasing. This paradigm shift away from onsite computing and data storage changes the way companies need to map their data to be prepared for possible litigation, and it changes how data will be collected in these instances.
Before getting into the nuts and bolts of identifying and subsequently collecting data in an IaaS environment, it makes sense to spend a little time discussing exactly what we mean by an IaaS environment. The National Institute of Standards and Technology defines the IaaS service model as follows:
“The capability provided to the consumer is to provision processing, storage, networks, and other fundamental computing resources where the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, and deployed applications; and possibly limited control of select networking components (e.g., host firewalls).”
In an IaaS environment, then, computing equipment (workstations, servers, etc.) is no longer viewed as a set of physical assets but as a set of services to be provisioned when required. This is where significant cost savings can come into play, as organizations can provision computing resources on an as-needed basis rather than continuously purchasing new gear to replace older equipment.
To better understand the IaaS environment before we discuss the identification and collection of data in parts two and three of this series, it helps to review a case study that takes the abstract concept described above and puts it into a real-world scenario.
The mission of the Defense Information Systems Agency (DISA) is to provide a “globally accessible enterprise information infrastructure” to support the U.S. Armed Forces and national leaders. In that respect, its function is similar to that of the IT department in a large, multinational corporation.
To carry out its mission more efficiently, DISA switched from a physical computing environment to an IaaS environment called Rapid Access Computing Environment (RACE). A concrete example of the savings and efficiency brought about by RACE is that provisioning new server space for users, a task that required three to six weeks in the old physical environment, took only 24 hours under RACE.
While the efficiency and savings brought about by the switch to RACE are certainly worth noting, of more importance for our discussion is that data previously housed on hundreds of servers in disparate locations was migrated to the cloud and housed in a much smaller number of data centers. In this case, DISA implemented a private IaaS environment rather than use a commercial provider, but the basic concept of reducing the number of physical servers is the same as in a commercial implementation of an IaaS environment.
This fundamental switch, from data residing on physical equipment within an organization’s facilities (and therefore readily identified) to data residing on virtual machines hosted on servers in multiple data centers spread across the country (or across the world, in the case of Amazon Web Services and Google Apps services), brings up our first real challenge in dealing with data stored in an IaaS environment: exactly where is it located?