Illustrator: Daniel Hertzberg
Information is the fuel that drives law offices. Yet extracting intelligence from the massive amounts of raw information, aka big data, now pouring into law offices from court records, emails, text messages, transcripts, Facebook posts, Twitter, and other sources requires both careful planning and the help of sophisticated predictive analytics software.
Predictive analytics draws on a variety of techniques, including statistics, modeling, machine learning, and data mining, to study data and make predictions about future events and trends.
"Big data in general, and predictive data analytics in particular, are the potential holy grail in the practice of law," declares Donald Wochna, chief legal officer at Vestige Digital Investigations, a computer forensics firm located in Medina, Ohio.
"Fast, high-performing data analytics can help enterprises and law firms harness expanding data collections to guide them on everything from finding profitable efficiencies to making important decisions in case strategy," adds Matt Gillis, vice president and managing director for litigation tools and professional services at New York-based LexisNexis. "It's not uncommon for attorneys to sort through and make sense of upwards of 300 terabytes of data when preparing for a case [and] the massive volume of data simply outpaces the capabilities of traditional technology tools to process that much information in a timely fashion."
Analytics in Action. Predictive analytics tools help lawyers make insightful connections that might otherwise have gone unnoticed. This ability can be particularly useful when performing electronic data discovery. "Predictive data analytics seeks to identify correlations in data that occur during the planning phase of unlawful conduct so that conduct can be interdicted before any damage is done," Wochna says.
Analytics products can also slash case preparation time and costs. "Part of the challenge in e-discovery is the prevalence of large amounts of unstructured data," Gillis says. Structured data is information that's organized inside a database or spreadsheet, which makes it easy for even the simplest analysis tools to identify and correlate, he explains.
Yet unstructured data, such as emails, word processing documents, instant messages, tweets, blog posts, and other digital communications, now constitutes most of the data flowing into law offices.
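The difference shows up quickly in practice. As a rough sketch in Python, using made-up sample data and a hypothetical matter-number format ("M-" followed by digits), a spreadsheet-style record can be queried directly by column name, while an email has to be parsed and searched before it yields the same kind of fact:

```python
import csv
import io
import re
from email import message_from_string

# Structured: hypothetical billing records, one named column per field.
STRUCTURED = "matter_id,custodian,amount\nM-100,Alice,1200\nM-101,Bob,450\n"

# Unstructured: a hypothetical email; the matter number is buried in free text.
RAW_EMAIL = """From: alice@example.com
To: bob@example.com
Subject: Invoice for matter M-100

Bob, please hold the M-100 invoice until we speak Friday.
"""


def total_by_matter(csv_text, matter_id):
    """Sum the 'amount' column for one matter; the columns do the work."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return sum(int(row["amount"]) for row in rows if row["matter_id"] == matter_id)


def matters_mentioned(raw_email):
    """Parse the message, then pattern-match the subject and body text."""
    msg = message_from_string(raw_email)
    text = f'{msg["Subject"]} {msg.get_payload()}'
    # Assumed convention: matter numbers look like "M-" plus digits.
    return sorted(set(re.findall(r"\bM-\d+\b", text)))


print(total_by_matter(STRUCTURED, "M-100"))
print(matters_mentioned(RAW_EMAIL))
```

The structured lookup needs nothing beyond the column names. The email lookup works only because the matter-number pattern was known in advance, which is rarely the case with real correspondence.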
"Both types of data are subject to e-discovery during litigation, yet few organizations have the technical tools and expertise to apply the same degree of sophistication to the analysis of unstructured data as to structured data," Gillis says.
An emerging generation of big data management tools helps attorneys gain control over virtually all types of data, allowing them to find information relevant to their case and determine exactly which data should be processed and reviewed. "In particular, open source analytics platforms have proven extremely fast at processing data," Gillis says. Open source technologies such as Apache Hadoop are highly efficient because they help make sense of information chaos. "They pull massive amounts of structured and unstructured data into a refinery system and break it down quickly so attorneys have quick and easy access to relevant information," he says.
Analytics tools specifically designed for legal applications are also available to help law offices efficiently track, manage, and review big data, says Sheila Mackay, senior director of e-discovery consulting at Xerox Litigation Services, based in Albany, N.Y. She notes that technology-assisted review products, for instance, use machine-learning techniques to automate the prioritization of documents for review based on how likely they are to be responsive to a particular matter.
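The idea behind that kind of prioritization can be illustrated with a toy example. The Python sketch below is not how any commercial technology-assisted review product works; it swaps in a simple naive Bayes word-count model, and the seed documents, review queue, and function names are all hypothetical. A reviewer codes a few documents as responsive or not, and the model then ranks the unreviewed queue so the likeliest responsive documents come first:

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_docs):
    """Count words per label from (text, is_responsive) pairs coded by a reviewer."""
    counts = {True: Counter(), False: Counter()}
    n_docs = {True: 0, False: 0}
    for text, label in labeled_docs:
        counts[label].update(tokenize(text))
        n_docs[label] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, n_docs, vocab


def responsiveness_score(text, model):
    """Naive Bayes log-odds that a document is responsive (higher = review sooner)."""
    counts, n_docs, vocab = model
    # Prior log-odds, with add-one smoothing so neither class can be zero.
    score = math.log((n_docs[True] + 1) / (n_docs[False] + 1))
    totals = {label: sum(counts[label].values()) for label in (True, False)}
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        p_resp = (counts[True][word] + 1) / (totals[True] + len(vocab))
        p_non = (counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_resp / p_non)
    return score


def prioritize(unreviewed, model):
    """Return the review queue sorted from most to least likely responsive."""
    return sorted(unreviewed, key=lambda doc: responsiveness_score(doc, model), reverse=True)


# Hypothetical seed set coded by a human reviewer.
SEED = [
    ("merger pricing agreement draft attached", True),
    ("please review the pricing terms of the merger", True),
    ("lunch menu for friday", False),
    ("office holiday party schedule", False),
]

# Hypothetical unreviewed documents.
QUEUE = ["holiday lunch schedule", "revised merger pricing terms", "parking notice"]

for doc in prioritize(QUEUE, train(SEED)):
    print(doc)
```

With this seed set, the merger-related document rises to the top of the queue and the holiday lunch note falls to the bottom. Production tools typically go further, retraining as reviewers code more documents, but the ranking principle is the same.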