Information is the fuel that drives law offices. Yet extracting intelligence out of the massive amounts of raw information — aka Big Data — now pouring into law offices from court records, emails, text messages, transcripts, Facebook posts, Twitter and other sources requires both careful planning and the help of sophisticated predictive-analytics software.
Predictive analytics draws on a variety of techniques — including statistics, modeling, machine learning and data mining — to study data and make predictions about future events and trends.
“Big Data in general, and predictive-data analytics in particular, are the potential holy grail in the practice of law,” said Donald Wochna, chief legal officer at Vestige, a computer forensics firm in Medina, Ohio.
“Fast, high-performing data analytics can help enterprises and law firms harness expanding data collections to guide them on everything from finding profitable efficiencies to making important decisions in case strategy,” said Matt Gillis, vice president and managing director for litigation tools and professional services at LexisNexis Group.
“It’s not uncommon for attorneys to sort through and make sense of upwards of 300 terabytes of data when preparing for a case [and] the massive volume of data simply outpaces the capabilities of traditional technology tools to process that much information in a timely fashion.”
Predictive-analytics tools help lawyers make connections that might otherwise go unnoticed. This ability can be particularly useful when performing electronic-data discovery. “Predictive-data analytics seeks to identify correlations in data that occur during the planning phase of unlawful conduct so that conduct can be interdicted before any damage is done,” Wochna said.
Analytics products can also slash case preparation time and costs. “Part of the challenge in e-discovery is the prevalence of large amounts of unstructured data,” Gillis said. Structured data are information organized inside a database or spreadsheet in a way that makes them easy to identify and relate, even with the simplest analysis tools, he said.
Yet unstructured data, such as emails, word-processing documents, instant messages, tweets, blog posts and other digital communications, now constitute most of the data flowing into law offices.
“Both types of data are subject to e-discovery during litigation, yet few organizations have the technical tools and expertise to apply the same degree of sophistication to the analysis of unstructured data as to structured data,” Gillis said.
An emerging generation of Big Data management tools helps attorneys gain control over virtually all types of data, allowing them to find information relevant to their case and determine exactly which data should be processed and reviewed. “In particular, open-source analytics platforms have proven extremely fast at processing data,” Gillis said. Open-source technologies such as Apache Hadoop, which spreads storage and processing across clusters of computers, help make sense of information chaos. “They pull massive amounts of structured and unstructured data into a refinery system and break it down quickly so attorneys have quick and easy access to relevant information,” he said.
Analytics tools specifically designed for legal applications are also available to help law offices efficiently track, manage and review large amounts of data, said Sheila Mackay, senior director of e-discovery consulting at Xerox Litigation Services, based in Albany, N.Y. Technology-assisted review (TAR) products, for instance, use machine-learning techniques to automate the prioritization of documents for review based on how likely they are to be responsive to a particular matter.
“TAR may be appropriate for large volumes of data subject to discovery that would otherwise be cost- and time-prohibitive to review manually based on deadlines,” Mackay said. Such approaches enable review managers to be more effective in allocating workflow to associate and contract reviewers, achieve more consistency and optimize senior attorneys’ time.
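As a rough illustration of the prioritization idea behind TAR, the sketch below trains a tiny naive Bayes scorer on a handful of documents a reviewer has already coded, then sorts an unreviewed queue so the documents most likely to be responsive come first. It does not represent how any vendor's product works; the function names, the sample documents and the choice of naive Bayes are all assumptions made for brevity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_docs):
    """Count words per label from (text, is_responsive) pairs coded by a reviewer."""
    counts = {True: Counter(), False: Counter()}
    doc_totals = Counter()
    for text, label in labeled_docs:
        counts[label].update(tokenize(text))
        doc_totals[label] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, doc_totals, vocab


def responsiveness_score(text, model):
    """Return the log-odds that a document is responsive (higher = review sooner)."""
    counts, doc_totals, vocab = model
    # Prior log-odds from the share of responsive seed documents (add-one smoothed).
    score = math.log((doc_totals[True] + 1) / (doc_totals[False] + 1))
    totals = {label: sum(counts[label].values()) for label in (True, False)}
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        p_resp = (counts[True][word] + 1) / (totals[True] + len(vocab))
        p_non = (counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_resp / p_non)
    return score


def prioritize(unreviewed, model):
    """Sort unreviewed documents from most to least likely responsive."""
    return sorted(unreviewed, key=lambda t: responsiveness_score(t, model), reverse=True)


# Hypothetical seed set: documents a reviewer has already marked.
seed = [
    ("patent license royalty dispute over the turbine design", True),
    ("draft claims for the turbine patent application", True),
    ("lunch order for the office party on friday", False),
    ("parking garage closed for maintenance this weekend", False),
]
# Hypothetical queue of documents awaiting review.
queue = [
    "reminder: office party lunch is on friday",
    "royalty terms for the turbine patent license",
    "quarterly newsletter",
]

model = train(seed)
ranked = prioritize(queue, model)
for doc in ranked:
    print(doc)
```

In practice, a review team would retrain such a model as reviewers code more documents, so the ranking improves over the course of the review. Production tools use far larger training sets and more sophisticated models than this sketch.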
Analytics software can also help law offices optimize a variety of time-consuming business and management tasks, such as caseload distribution, revenue projection, fee forecasting and client-data organization.
Dean Gonsowski is senior e-discovery counsel at Mountain View, Calif.-based analytics software publisher Symantec Corp. When representing a client in a patent infringement suit, he said, a law office could use analytics to develop a reasonable fee estimate by processing and analyzing data gleaned from its involvement in previous, related suits.
“In like manner, such an estimate could help the law firm project its revenue streams on that suit and assist with overall budget forecasts,” he said.
A law office considering a move into Big Data analytics should begin by taking a close look at the data it’s now storing and how that information is being used. “A value-focused analysis will help determine what information should ultimately be kept and for how long,” Gonsowski said.
Effective Big Data management and use begins with four basic steps, Gillis said: “Develop a strategy for information governance; establish rules for defensible deletion; prioritize data sets; and select best technology tools.”
Mackay suggests creating a project management and oversight team. “It should be comprised of senior-level management, with representatives from IT,” she said. “Outside specialists, including consultants and e-discovery providers, can complement these teams by offering specific expertise.”
She also recommends creating a culture of information governance. “The law office should establish a comprehensive structure that supports all of its data along with processes and roles that outline how data will be handled,” she said. “Within such a structure, data can thrive as an asset rather than a liability.” The structure, she continued, should include a strategy for easily retrieving useful data, as well as data that have current business value, while avoiding a “keep everything” policy, which can actually make data a liability.
“Retaining information that has no Big Data potential threatens to turn Big Data into bad data, which merely increases risk,” Mackay said.
For example, firms that stubbornly retain clients' electronically stored information gathered in e-discovery, even after a lawsuit has been resolved, are occasionally forced by subpoena to hand over those data. “Not only does this needlessly divert firm resources into e-discovery sideshows,” Mackay said, “it negatively impacts client information-retention policies implemented to defensibly delete that data.”
While many analytics tools are designed for use by lawyers with little or no technical experience, only the very largest law organizations should attempt to find, install and configure the necessary software without outside help.
“Lawyers are experts at law but not at technology,” said John Tredennick, chief executive officer of Catalyst Repository Systems, an analytics-tools publisher headquartered in Denver. He noted that multiple federal court rulings have cautioned that certain aspects of e-discovery require specialized expertise in computer technology, statistics, linguistics and other technical matters.
“Law firms are well advised to focus on their core capabilities and bring in outside vendors and consultants to assist with the technical aspects of handling Big Data,” Tredennick said.
Mackay stressed the need to thoroughly document all steps. “A robust audit trail is imperative when a law office is called upon to defend and explain…processes and decisions to a court or government agency, and can show good faith to comply with legal and compliance obligations.” Yet the biggest risk facing law offices, Mackay said, “is not having a well-thought-out and designed plan.”
John Edwards, a freelance writer in Arizona, wrote this article for NLJ affiliate Law Technology News; it will appear in the February issue.