Can Your Old Files Come Back to Life?
Law Technology News
Success in the pursuit of electronic data discovery hinges on a lawyer grasping how a computer stores and manipulates information, and knowing about the virtual nooks and crannies where digital data hides in stubborn resistance to efforts to delete it.
The biggest haunt of fugitive data on the disk drive of a Windows computer is called slack space. After a computer has been in use for a while and files are deleted, space once devoted to those deleted files gets recycled. What fills the recycled space is the data that was supposed to disappear.
Staggering quantities of deleted file fragments lodge in the space freed up by deletion, called unallocated space, and even in parts of the unallocated space reoccupied by new files, called slack space. Computer forensics specialists can examine a computer's slack space, extract relevant material and, occasionally, locate the deleted data that clinches a case.
UNDERSTANDING SLACK SPACE
Understanding slack space requires a smattering of knowledge about how a computer stores data on the hard drive, and about a computer's file system, the utilitarian plumbing at the heart of every operating system.
A computer's hard drive records data in bits, bytes and sectors, all physical units of storage established by the hard disk drive's internal geometry in much the same way as the size and number of drawers in a filing cabinet are fixed at the factory.
Sticking with the file cabinet metaphor, bits and bytes are the letters and words that make up our documents.
Sectors (analogous to pages) are tiny segments of thousands of concentric rings of recorded data. A sector has traditionally held 512 bytes, though many newer drives use 4,096-byte sectors. A sector is the smallest individually addressable physical unit of information used by a computer. Computer hard drives can only "grab" data in sector-size chunks.
A common paper filing system uses labeled manila folders assembled into a "red file" (master file) for a particular case, client or matter. A computer's file system stores information on the hard drive in batches of sectors called clusters. Clusters are the computer's manila folders and, like their real-world counterparts, collectively form files.
These files are the same ones that you create when you type a document or build a spreadsheet.
In a Windows computer, cluster size is set when the hard drive is formatted, usually as part of installing the operating system. Typically, Windows 98/ME clusters are 32 KB, while Windows XP/NT clusters are 4 KB. In setting cluster size, the file system strikes a balance between storage efficiency and operating efficiency.
The smaller the cluster, the more efficient the use of hard drive space; the larger the cluster, the easier it is to catalog and retrieve data.
When Windows stores a file, it fills as many clusters as needed, but except in the rare instance of a perfect fit, a portion of the final storage cluster will be left unfilled with new data. The space between the end of the file and the end of the last cluster is slack space.
Suppose your office uses 500-page notebooks to store all documents. If you have just 10 pages to store, you must dedicate an entire notebook to the task.
Once the notebook is in use, you can add another 490 pages, until it won't hold another sheet. For the 501st page and beyond, you have to use a second notebook. The difference between the capacity of the notebook and its contents is its "wasted" slack space. Smaller notebooks would mean less slack, but you'd have to keep track of many more volumes.
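The notebook arithmetic carries over directly to clusters. The short Python sketch below is an illustration only: the function name slack_bytes and the 10,000-byte file are made up for the example, and it ignores file system details such as very small files being stored outside ordinary clusters.

```python
def slack_bytes(file_size: int, cluster_size: int) -> int:
    """Return the unused bytes between the end of a file and the end of its last cluster."""
    # Round up to whole clusters: a partly filled cluster is still fully allocated.
    clusters_needed = -(-file_size // cluster_size)
    return clusters_needed * cluster_size - file_size


KB = 1024

# A hypothetical 10,000-byte file stored with the two cluster sizes mentioned above.
for cluster_size in (4 * KB, 32 * KB):
    print(cluster_size, slack_bytes(10_000, cluster_size))
```

With 4 KB clusters the file needs three clusters and leaves 2,288 bytes of slack; with 32 KB clusters it needs only one, but leaves 22,768 bytes of slack.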
FAR FROM EMPTY
In the physical realm, where the slack in the notebook holds empty air, slack space is merely inefficient. But on a hard drive, where magnetic data isn't erased until it's overwritten by new data, the slack space is far from empty. When Windows deletes a file, it simply earmarks clusters as available for reuse.
When deleted clusters are recycled, they retain their contents until and unless the entire cluster is overwritten by new data. If later written data occupies less space than the deleted data, some of the deleted data remains. It's as if, in our notebook example, when you reused notebooks, you could only remove an old page when you replaced it with a new one.
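A toy model may make the mechanism concrete. The Python sketch below treats a single cluster as a block of bytes in memory, "deletes" an old file by doing nothing to the bytes, and then stores a shorter new file in the same cluster. Everything here (the file contents, the write_file helper) is invented for illustration, and it simplifies real behavior: an actual file system writes whole sectors, so the boundary between new and old data would not fall exactly at the last byte of the new file.

```python
CLUSTER_SIZE = 4096  # assumed 4 KB cluster


def write_file(cluster: bytearray, data: bytes) -> None:
    """Store data at the start of a cluster, leaving the remaining bytes untouched."""
    if len(data) > len(cluster):
        raise ValueError("data does not fit in one cluster")
    cluster[: len(data)] = data


cluster = bytearray(CLUSTER_SIZE)  # a fresh, zero-filled cluster

# An old file occupies most of the cluster (hypothetical contents).
old_file = b"Old memo: the pricing formula is 42. " * 80
write_file(cluster, old_file)

# "Deleting" the old file only marks the cluster as reusable; its bytes stay put.
# A much shorter new file is then written into the recycled cluster.
new_file = b"Lunch order: two sandwiches."
write_file(cluster, new_file)

# Everything past the end of the new file is slack, and it still holds old data.
slack = bytes(cluster[len(new_file):])
print(b"pricing formula" in slack)
```

The new file is intact at the start of the cluster, yet a search of the slack still finds text from the memo that was supposedly deleted.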
Though it might seem that slack space should be insignificant -- after all, it's just the leftover space at the end of a file -- the reality is that slack space adds up.
If file sizes were truly random, then, on average, one-half of a cluster would be slack space for every file stored. But most files are pretty small. If a file being stored is small, even just a few bytes, it will still "tie up" an entire 32 KB cluster on the disk.
The more small files you have, the more slack space on your drive. It's not unusual for 25 percent to 40 percent of a drive to be lost to slack.
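To see how quickly slack accumulates, the Python sketch below totals the slack for a handful of files at 4 KB and 32 KB cluster sizes. The file sizes are hypothetical, chosen only to include several small files, and the helper names are illustrative; the result says nothing about the proportion of slack on any real drive.

```python
def allocated_bytes(file_size: int, cluster_size: int) -> int:
    """Bytes of disk space tied up by a file: whole clusters, rounded up."""
    clusters_needed = -(-file_size // cluster_size)
    return clusters_needed * cluster_size


def total_slack(file_sizes: list[int], cluster_size: int) -> int:
    """Sum of the unused space at the end of every file's last cluster."""
    return sum(allocated_bytes(size, cluster_size) - size for size in file_sizes)


# Hypothetical file sizes in bytes, weighted toward small files.
file_sizes = [200, 1_500, 3_000, 10_000, 40_000, 70_000]

for cluster_size in (4 * 1024, 32 * 1024):
    slack = total_slack(file_sizes, cluster_size)
    print(f"{cluster_size // 1024} KB clusters: {slack} bytes of slack "
          f"for {sum(file_sizes)} bytes of files")
```

For these made-up sizes, 124,700 bytes of files leave 14,564 bytes of slack with 4 KB clusters and 170,212 bytes with 32 KB clusters, more slack than data in the second case.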
Over time, as a computer is used and files deleted, clusters containing deleted data are re-used and file slack increasingly includes fragments of deleted files.
In "Jurassic Park," scientists bring back the dinosaurs by cloning genetic material harvested from mosquitoes preserved in amber. Like insects in amber, Windows traps deleted data and computer forensics resurrects it.
Though a computer rich with data trapped in file slack can yield a mother lode of revealing information, mining this digital gold entails tedious digging, specialized tools and lots of good fortune and patience.
The Windows system is designed to be blind to all information in the slack space. Searching is accomplished using a forensically sound copy of the drive and specialized examination software: either a hex editor utility that permits an examiner to read the data in each cluster directly from the media, or another operating system, like Linux, that treats a drive like a file, permitting string searches of its contents.
File slack is, by its very nature, fragmented, and the information identifying file type is often the first data to be obscured.
The search for plain-text information is typically the most fruitful avenue in file slack examination and an exercise often measured not in hours, but in days or weeks of review.
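The idea of treating a drive like a file and searching it for strings can be sketched in a few lines of Python. This is a simplified illustration, not how any particular forensic tool works: it assumes a raw image file of the drive already exists, the function name search_image and the 4 KB cluster size are assumptions, and the demonstration at the end builds a tiny stand-in "image" in a temporary file rather than reading real evidence.

```python
import os
import tempfile


def search_image(image_path, keyword, cluster_size=4096, chunk_size=1 << 20):
    """Return (byte offset, cluster number) for each occurrence of keyword in a raw image."""
    if not keyword:
        raise ValueError("keyword must not be empty")
    hits = []
    overlap = len(keyword) - 1  # bytes carried over so matches spanning two chunks are found
    base = 0                    # image offset of the first byte currently in the buffer
    buffer = b""
    with open(image_path, "rb") as image:  # read-only: the copy is never modified
        while True:
            chunk = image.read(chunk_size)
            if not chunk:
                break
            buffer += chunk
            start = 0
            while (pos := buffer.find(keyword, start)) != -1:
                offset = base + pos
                hits.append((offset, offset // cluster_size))
                start = pos + 1
            # Keep only a short tail; it is too short to contain a complete match,
            # so no hit is ever reported twice.
            keep = buffer[-overlap:] if overlap else b""
            base += len(buffer) - len(keep)
            buffer = keep
    return hits


# Demonstration on a stand-in image: zero bytes with one phrase buried inside.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 5000 + b"smoking gun" + b"\x00" * 100)
try:
    print(search_image(tmp.name, b"smoking gun"))
finally:
    os.remove(tmp.name)
```

The search looks at raw bytes and ignores file boundaries, so it finds text whether it sits in a live file, in slack or in unallocated space. A real examination would also have to account for different text encodings, letter case and fragments of words, which is part of why the review takes so long.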
Experienced computer forensics examiners are skilled in formulating search strategies likely to turn up revealing data, but the process is greatly aided if the examiner has a sense of what he or she is seeking before the search begins. Are there names, key words or parts of words likely to be found within a smoking gun document? If the issue is trade secrets, are there search terms uniquely associated with the proprietary data?
If the focus is pornography, is there image data or Web site address information uniquely associated with prohibited content?
Because most lawyers and litigants are unaware of its existence, file slack and its potential for disgorging revealing information are usually overlooked by those seeking and responding to discovery. In fairness, a request for production demanding "the contents of your computer's slack space" is unlikely to be productive. In practice, the hard drive must be examined by a computer forensics expert employed by one of the parties, a neutral expert agreed upon by all parties or a special master selected by the court.
Bear in mind that while the computer is running, computer data is constantly being overwritten by new data, creating a potential for spoliation. The most prudent course is to secure, either by agreement or court order, a forensically complete clone or image of each potentially relevant hard drive. Such a specially created copy preserves both the live data and the information trapped in the slack space and other hiding places.
Most importantly, it preserves the status quo and affords litigants the ability to address issues of discoverability, confidentiality and privilege without fear that delay will result in destruction of data.
File slack is just one of a host of hiding places where revealing data can be uncovered using computer forensics. The potential to unearth case-making evidence through computer forensics must be weighed by both sides in every case involving computers and electronic communications.
If it exists, getting to the smoking gun demands tenacity, resourcefulness and expert help, but going to battle less than fully armed and prepared is not a viable alternative.
Craig Ball is a member of the LTN Editorial Advisory Board, a trial lawyer and computer forensics consultant based in Montgomery, Texas.