Identity thieves may get the headlines, but data thieves do more harm. Studies in the United States and abroad suggest that two-thirds of departing white-collar employees leave with proprietary data. Few of them are prosecuted. When confidential files fly the coop, companies don’t call the cops — they sue.

Not all data leaves with the requisite "scienter" to be called theft. In this wired world, who doesn’t have data on thumb drives, phones, tablets, backup drives, and webmail? You work for a company awhile and you’re going to end up with their stuff on your devices and media. Still, data-theft lawsuits rarely stem from stale data on forgotten media.

The classic data-theft scenario involves the last-day, late-night movement of confidential files to an external USB hard drive or thumb drive. Often, the furtive transfer is part of a dastardly scheme. Less often, departing employees intend to take just their personal family photos or music, but unwittingly drag along some of the organization’s electronically stored information.

How do forensic examiners determine data was taken? How do they figure out what storage devices were used to carry away ESI? How do they find clues of motive? The answer is "By many happy accidents."

A Confluence of Happy Accidents

You can divide electronic evidence into user-generated or user-collected ESI (e.g., Excel spreadsheets or downloaded photos) and system-generated ESI. A user’s ESI tends to speak for itself; but system-generated ESI must whisper in an expert’s ear for its story to be told.

Forensic artifacts arise as a consequence of a software designer’s effort to deliver a better user experience or improve performance. Their probative value in court is just a happy accident.

To illustrate, on Microsoft Windows systems, a forensic examiner may look to machine-generated artifacts — called LNK files, prefetch records, and Registry keys — to determine what files and applications a user accessed and what storage devices a user attached to the system.

LNK files (pronounced "link" and named for their .lnk file extension) serve as pointers or "shortcuts" to other files. They are similar to shortcuts users create to conveniently launch files and applications, but LNK files aren’t user-generated.

Instead, Windows itself routinely creates them to facilitate access to recently used files.

Windows stores LNK files in the user’s RECENT folder. Each LNK file contains information about its target file that survives when the target file is deleted, including times, size, location, and an identifier for the target file’s storage medium. Microsoft didn’t intend that Windows retain information about deleted files in orphaned shortcuts; yet there’s the happy accident — or maybe not so happy, for those nabbed because their computers were trying to better serve them.
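
To make this concrete, here is a minimal Python sketch of how an examiner's tool might read the target file's times and size from the first 76 bytes of a LNK file. It assumes the header layout Microsoft publishes for the Shell Link format; the function names are hypothetical, and a real tool would also parse the later structures that hold the target's location and volume details.

```python
import struct
from datetime import datetime, timedelta, timezone

HEADER_SIZE = 0x4C  # the Shell Link header is always 76 bytes long
# Class identifier 00021401-0000-0000-C000-000000000046 as stored on disk.
LINK_CLSID = bytes.fromhex("0114020000000000C000000000000046")
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a Windows FILETIME (100-nanosecond ticks since 1601, UTC)."""
    if filetime == 0:
        return None  # zero means no time was recorded
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def parse_lnk_header(data):
    """Return what the header says about the target file (hypothetical helper)."""
    if len(data) < HEADER_SIZE:
        raise ValueError("too short to be a LNK file")
    (size, clsid, link_flags, attributes,
     created, accessed, written, file_size) = struct.unpack_from(
        "<I16sIIQQQI", data, 0)
    if size != HEADER_SIZE or clsid != LINK_CLSID:
        raise ValueError("not a Shell Link header")
    return {
        "link_flags": link_flags,
        "attributes": attributes,
        "created": filetime_to_datetime(created),
        "accessed": filetime_to_datetime(accessed),
        "modified": filetime_to_datetime(written),
        "file_size": file_size,  # size of the target, in bytes
    }
```

These values describe the target file as it stood when the shortcut was last written, which is why they can outlive the target itself.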

Similarly, Windows seeks to improve system performance by tracking the recency and frequency with which applications are run. If the system knows what applications are most likely to be run, it can "fetch" the programming code those applications need in advance and pre-load it into memory, speeding the execution of the program.

Thus, records of up to 128 recently run programs (more on Windows 8 and later) are stored as a series of so-called "prefetch" files. Because the metadata values for these prefetch files coincide with use of the associated program, by another happy accident, forensic examiners can determine, say, the time and date a file-wiping application was used to cover tracks.
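
As a rough illustration, the Python sketch below lists the prefetch files in a folder copied from a forensic image and reports each file's last-modified time, which generally tracks the most recent run of the program. The function name and the WIPER.EXE example are hypothetical; the sketch relies only on file names and file system metadata and does not parse the prefetch files' internal format.

```python
import re
from datetime import datetime, timezone
from pathlib import Path

# Prefetch files are named like WIPER.EXE-1A2B3C4D.pf: the program name,
# a hyphen, an eight-digit hexadecimal hash, then the .pf extension.
PREFETCH_NAME = re.compile(
    r"^(?P<program>.+)-(?P<hash>[0-9A-F]{8})\.pf$", re.IGNORECASE)


def list_prefetch(folder):
    """Return one record per prefetch file, most recently modified first."""
    rows = []
    for entry in Path(folder).iterdir():
        match = PREFETCH_NAME.match(entry.name)
        if match is None or not entry.is_file():
            continue  # skip anything that is not a prefetch file
        info = entry.stat()
        rows.append({
            "program": match.group("program"),
            "path_hash": match.group("hash").upper(),
            # Last-modified time of the .pf file itself, in UTC.
            "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        })
    return sorted(rows, key=lambda row: row["modified"], reverse=True)
```

An examiner would run something like this against a working copy of the evidence, never the original system, and would treat the times as leads to be corroborated by other artifacts.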

Two final examples of how forensically significant evidence derives from happy accidents are the USBSTOR and DeviceClasses records found in the Windows System Registry hive. The Windows Registry is the central database that stores configuration information for the operating system and installed applications — it’s essentially everything the operating system must remember to set itself up and manage hardware and software.

The Windows Registry is huge and complex. Each time a user boots a Windows machine, the Registry is assembled from a group of files called "hives." Most hives are stored on the boot drive as discrete files; one hive — the Hardware hive — is created anew each time the machine boots.

When a user connects a portable hard drive or flash drive to a USB port, the system must load the proper drivers to communicate with the device. So, Windows interrogates the device, determines what driver to use and — importantly — records information about the device and driver pairing within the ENUM/USBSTOR and DeviceClasses "keys" of the System Registry hive.

Windows also stores the date and time of both the earliest and latest attachments of the USB storage device.

Presumably, the programmer’s goal was to speed selection of the right drivers the next time the USB devices were attached, but the happy accident is that the data retained for a non-forensic purpose carries enormous probative value when properly interpreted and validated by a qualified examiner.
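
For a sense of what those records look like, here is a small Python sketch that splits the names of a USBSTOR device key and its instance subkey into make, model, revision, and serial number. The key names in the usage note are invented for illustration, the function name is hypothetical, and the naming pattern is the one commonly described in Registry forensics references rather than something stated in this article.

```python
import re

# Device keys under USBSTOR are commonly named like
# Disk&Ven_<vendor>&Prod_<product>&Rev_<revision>
DEVICE_KEY = re.compile(
    r"^(?P<type>[^&]+)&Ven_(?P<vendor>[^&]*)"
    r"&Prod_(?P<product>[^&]*)&Rev_(?P<revision>[^&]*)$")


def parse_usbstor(device_key, instance_key):
    """Split USBSTOR key names into their parts (hypothetical helper)."""
    match = DEVICE_KEY.match(device_key)
    if match is None:
        raise ValueError(f"unexpected USBSTOR key name: {device_key!r}")
    return {
        "type": match.group("type"),
        "vendor": match.group("vendor"),
        "product": match.group("product"),
        "revision": match.group("revision"),
        # The instance key usually ends in "&<number>"; drop that suffix.
        "serial": instance_key.rsplit("&", 1)[0],
        # Assumption: an "&" in the second position signals an identifier
        # generated by Windows because the device reported no serial number.
        "serial_from_device": len(instance_key) > 1 and instance_key[1] != "&",
    }
```

Calling parse_usbstor("Disk&Ven_SanDisk&Prod_Cruzer&Rev_8.02", "0774230A50C3A5E2&0") would yield the vendor SanDisk, the product Cruzer, and the serial number 0774230A50C3A5E2, the same identifiers an examiner would look for on every other machine the drive may have touched.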

Departing employees aren’t data thieves by nature. Most just feel overlooked or underappreciated when opportunity knocks. It’s not really stealing, they rationalize; it’s protecting the fruits of their labor. Of course, if they didn’t know it was wrong, why do it late at night or jump through hoops to cover up the conduct?

Imagine you’re a budding data thief. Step one in liberating your employer’s data is deciding what to take. So you open folders, check file contents, and dump your plunder into a folder or Zip file, leaving a trail of LNK or prefetch files as you go.

Meanwhile, you’re generating MRU (Most Recently Used) file records in the Registry and altering MAC dates on files and folders.

All computer operating systems employ a file system that serves as the prosaic plumbing, handling routine file management tasks. Microsoft Windows uses one called NTFS (New Technology File System), which tracks several dates and times for each file. These are called MAC dates, for the Last Modified, Last Accessed, and Created dates.
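
The three MAC dates are easy to see for yourself. The Python sketch below reads them for a single file using the standard library; the function name is hypothetical, and the handling of the Created date reflects the fact that operating systems and Python versions report it differently.

```python
import os
from datetime import datetime, timezone


def mac_times(path):
    """Return the Modified, Accessed, and Created times reported for a file."""
    info = os.stat(path)

    def as_utc(seconds):
        return datetime.fromtimestamp(seconds, tz=timezone.utc)

    # Some platforms expose the creation time as st_birthtime. On Windows
    # with older Python versions it appears in st_ctime instead. On Linux,
    # st_ctime is the time of the last metadata change, not creation, so
    # the Created value is left empty there.
    created = getattr(info, "st_birthtime", None)
    if created is None and os.name == "nt":
        created = info.st_ctime
    return {
        "modified": as_utc(info.st_mtime),
        "accessed": as_utc(info.st_atime),
        "created": as_utc(created) if created is not None else None,
    }
```

Examiners read these values from a forensic image rather than from the live system, because simply browsing a running machine can change the very dates under examination.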

Last Accessed dates were once crucial in determining which files a data thief touched at the time of the theft. However, forensic examiners had to account for a range of automated activities (such as antivirus scans) that update Last Accessed times. All that updating consumed computing resources and slowed things down.

Slow doesn’t sell; so when Microsoft introduced the Vista operating system, it disabled the automatic updating of Last Accessed dates. It remains disabled in Windows 7 and 8. Today, updated Last Accessed dates may indicate that a file was touched by a user or machine, but the absence of updated values doesn’t reliably rule it out.

Now that you’ve left your digital fingerprints all over the scene of the crime, how will you take the stolen data home? You decide not to email it to your webmail account or burn a CD. Instead, you plug an external drive into the USB port of your computer and wait for Windows to see it.

In the brief time that the drive spun up, Windows said, "Who goes there?" and the drive responded with its name, rank, and serial number; that is, with enough information for Windows to locate and load the right driver to allow the computer to communicate with this new USB mass-storage device.

Windows dutifully made a record of the attachment in the Registry and in a log kept under C:\Windows\inf\. Congratulations! You’ve made it possible to track the USB drive by its make, model, and serial number.

When you claim you never used the data from your old employer to benefit your new employer, the computers to which you connected that USB drive may betray you.

These are just a handful of the ways that the happy accidents of computer forensics help forensic examiners bust data thieves. We will need all the happy accidents we can get as the stakes rise and employees find new tools and techniques to steal intellectual property.


More Info

When preserving ESI in cases of suspected data theft, remember that the lion’s share of the forensically revealing data is not stored on the media used to transfer the data, but principally resides on the systems to which the storage media attached. Endeavor to secure all such evidence items and have them forensically imaged — a copy or backup of the device isn’t sufficient to support a thorough forensic exam.

Lawyers interested in more information on this topic might enjoy the First Responder’s Guide to Employee Data Theft. Examiners wanting to know more about Windows Registry analysis should see Harlan Carvey’s Windows Registry Forensics: Advanced Digital Forensic Analysis of the Windows Registry, or download a copy of the SANS Windows Artifact Analysis poster.