Big Data Meets Big Law
Law Technology News
Editor's note: As 2013 approaches, what better time to look back at the past year to see what topics and trends dominated legal technology, and to predict what we will still be discussing next December. Retrospection helps analysis and planning, so for the last few days of the year, LTN will be reprising its top stories from 2012. In the story "Big Data Meets Big Law," from the May/June issue of LTN, reporter Tam Harbert puts a future tense in the answer to a common question posed to counsel: "Can you win this case?"
"What are the odds of winning this case, and what's it going to cost me?" Those are questions clients routinely ask their attorneys. Today, lawyers draw on experience and gut instincts for the answers. Sometimes, they are even right. It may not be long, however, before computers spit out answers with far more accuracy.
Legal scholars, computer science engineers, and commercial companies are building databases and using algorithms to crunch massive amounts of historical legal data to identify the significant factors that influence particular legal outcomes.
These experts say that such factors can then be used to predict what will happen in future scenarios. Called quantitative legal prediction, it's basically what happens when the latest technology trend called "big data" meets the law. And it just might change how corporate general counsel and BigLaw manage legal matters and costs, how they craft legal arguments, and whether, how, and where they file a lawsuit.
The trick, however, is getting usable data. So far, finding comprehensive legal data in a form that computers can handle has proven difficult. Unless that problem is solved, the technology may have a more limited impact. Already, though, quantitative legal prediction has started "coming in at the edges of tasks that lawyers do," says Daniel Katz, assistant professor at Michigan State University College of Law. E-discovery, for example, uses algorithms to review reams of documents and predict which are likely to be relevant in a given case.
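The predictive review Katz describes can be pictured, in grossly simplified form, as scoring documents against terms a reviewer has already flagged as relevant. A minimal sketch in Python (the seed terms, sample documents, and threshold below are all invented for illustration; real e-discovery systems use trained statistical classifiers, not keyword counting):

```python
# Toy illustration of predictive document review: score each document by
# how many reviewer-flagged "relevant" terms it contains, then queue the
# top scorers for human review. Not any vendor's actual algorithm.

def relevance_score(document, seed_terms):
    """Fraction of seed terms appearing in the document (0.0 to 1.0)."""
    words = set(document.lower().split())
    hits = sum(1 for term in seed_terms if term in words)
    return hits / len(seed_terms)

# Hypothetical terms a reviewer marked as indicative of relevance.
seed_terms = ["merger", "disclosure", "liability"]

documents = {
    "doc1": "Draft merger agreement with disclosure schedule attached",
    "doc2": "Lunch menu for the quarterly offsite",
    "doc3": "Memo on liability exposure from the pending merger",
}

# Flag documents scoring above an (arbitrary) threshold for human review.
flagged = [name for name, text in documents.items()
           if relevance_score(text, seed_terms) >= 0.5]
print(flagged)  # doc1 and doc3 contain most of the seed terms
```

The point of the sketch is only the shape of the task: rank a large pile of documents so that humans review the likeliest candidates first.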
But that's just the tip of the iceberg, says Katz. There's been a quiet transition in the legal industry that most people are largely unaware of, he says. Reasoning traditionally done by human attorneys can be replaced or supplemented by predictions made by computers. "It's not going to end lawyering ... but I definitely think some percentage of tasks that lawyers do are going to be replaced by machines and/or technology," says Katz.
One company that is trying to capitalize on the potential of this technology is TyMetrix, part of Wolters Kluwer Corporate Legal Services. A vendor of e-billing and matter management systems for corporate law departments, it started collecting data on billings and legal matters in 2009. With its customers' permission, it has accumulated data from $25 billion in legal spending, which is stored in a data warehouse. TyMetrix is using analytics to mine the information for use in products, says Craig Raeburn Jr., managing director of TyMetrix Legal Analytics. One product that benefits from the analysis is the $2,500 Real Rate Report that benchmarks law firm rates and identifies the factors that drive them.
TyMetrix also offers a free app for mobile devices that uses Real Rate Report data to serve up average hourly legal rates of law firms across the country. The company's goal is to help customers manage future legal costs, says Raeburn, who notes that TyMetrix plans to integrate a rate analysis and forecasting feature into several products. For example, the technology could identify and analyze five to 10 key variables in a certain type of legal matter and then predict costs for a future case. "You can see the entire case costs and you can play what-if scenarios," he continues, to figure out the most cost-effective way to manage a matter. If you spend less on outside counsel, for example, how might that impact the outcome of the matter?
Of course, such predictions are only as good as the size and quality of the data. TyMetrix's data warehouse is limited: the company has gathered only certain types of data, and only from clients who've opted in. But it is on the hunt for additional data sources, says Raeburn, although he is cagey about exactly where that data may come from. "There are other sources that are available in the market where people can now get more information than they have ever been able to," he says.
Katz concurs that harvesting data can be a challenge. "The problem is you have to collect massive amounts of data, and a lot of it is not easy to get," he says. The most obvious data source, the PACER (Public Access to Court Electronic Records) system, is notoriously difficult to access. And, he notes, "PACER is not free. You have to pay for it."
Another organization investigating quantitative legal prediction is the Harlan Institute, a non-profit organization that promotes interest in and education about the Supreme Court.
It grew out of what started as more of a lark by Josh Blackman, a law student and self-professed Supreme Court nerd, who in 2009 launched a web-based fantasy league for predicting Supreme Court decisions (see "Place Your Bets" for more on the league). Called Fantasy SCOTUS, the site has built up a database of crowd-sourced opinions and analyses of many Supreme Court cases.
In an academic paper published in the Northwestern Journal of Technology & Intellectual Property [Vol. 10, p. 125, 2012], Blackman and co-authors suggest that Fantasy SCOTUS could combine the crowd-sourced data with data from publicly available court filings, then use an algorithm and decision engine to make predictions: "It would be quite conceivable for a bot to crawl through all of the filings in Pacer . . . and develop a comprehensive database of all aspects of how each court works."
Conceivable for a bot to do all that crawling, that is, but not necessarily easy. The one startup that is perhaps closest to achieving the promise of quantitative legal prediction, Lex Machina, has spent 10 years trying to build and organize an effective database in just one legal specialty.
The company, which was spun out of Stanford University's IP Litigation Clearinghouse, focuses on patent litigation. That's a high-value, high-cost area for corporations. Intellectual property often materially contributes to the value of a corporation, and therefore companies spend lots of money to protect and procure it. In fact, IP litigation costs run almost 62% higher than those of other matters, and the average cost of taking a patent case to trial can hit $5 million per patent, according to a report by the Federal Judicial Center.
Lex Machina has cleaned up and organized the data in the IP Litigation Clearinghouse so that algorithms could operate on it, says Joshua Walker, co-founder and executive vice president of law and business development at Lex Machina. The database holds information from 128,000 IP cases, 134,000 attorney records, 1,399 judges, 63,000 law firms, and 64,042 parties, spanning the last decade. Walker estimates that it has taken a team of engineers and lawyers some 100,000 hours to properly categorize, tag, and code the information. Even putting legal rulings in the right categories was a huge challenge. The team found that the outcome coding by the Administrative Office of the U.S. Courts was incorrect more than half the time, says Walker.
As part of its charter, Lex Machina runs and hosts the data for a "public interest" version of the IP Litigation Clearinghouse, which is free to academicians, public interest researchers, judges, policymakers, and the media. Lex Machina makes revenue by advising commercial clients, who pay a fee for the insights gleaned from the database. (It also supplements the database with data on clients' legal matters.) But "the last mile of analysis" is still being done by humans rather than computers, says Walker. "Over time, we're going to automate all of it." Eventually, Walker believes this technology will have a big impact on how corporations value, manage, and protect IP.
"My hypothesis is that this will . . . revolutionize how corporate finance looks at litigation," he says. "We've done a number of use cases where we've said,'Here are the settlement patterns and win rates for these companies.' "
If Walker is right, then someday a certain amount of lawyering may be reduced to simple actuary formulas. "Insurance companies do this all the time," he says. "It's just never been easy to do for complex, high-stakes litigation."
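The actuarial comparison Walker draws comes down to expected-value arithmetic: weight each possible outcome's cost by its estimated probability and sum. A toy sketch, with figures invented rather than drawn from Lex Machina's data:

```python
# Toy actuarial view of litigation: expected exposure is the sum of each
# outcome's cost weighted by its probability. All figures are invented.

def expected_exposure(outcomes):
    """outcomes: list of (probability, cost) pairs; probabilities sum to 1."""
    assert abs(sum(p for p, _ in outcomes) - 1.0) < 1e-9
    return sum(p * cost for p, cost in outcomes)

# Hypothetical patent-suit scenario: win outright, settle, or lose at trial.
scenario = [
    (0.50, 0),           # win: no damages
    (0.35, 1_500_000),   # settle
    (0.15, 5_000_000),   # lose at trial
]
print(round(expected_exposure(scenario)))  # 1275000
```

The hard part, as the article makes clear, is not the arithmetic but producing defensible probabilities from clean historical data.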
TyMetrix makes data from its Real Rate Report available in a free mobile app called Rate Driver. The app runs on most smartphones, including iPhone, Android, and BlackBerry devices. It can calculate average hourly legal rates for lawyers across the United States, based on the following five factors:
1.) Geographic location
2.) Size of firm
3.) Attorney title (partner, associate, etc.)
4.) Years of experience
5.) Practice area
For example, for a partner with six years of experience in finance and securities working at a 550-attorney law firm in New York, the average rate is $569 an hour. But a partner with the same experience at a same-sized firm in Raleigh, North Carolina, will run $438 per hour.
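As a rough illustration of what a factor-based rate estimate involves, the sketch below combines factors like those listed above in a simple linear adjustment. Every baseline and coefficient here is invented; Rate Driver's actual model is proprietary.

```python
# Toy factor-based rate estimator. City baselines, title multipliers, and
# premiums are invented for illustration, not TyMetrix's real figures.

BASE_RATES = {"New York": 500, "Raleigh": 380}        # hypothetical baselines
TITLE_MULTIPLIER = {"partner": 1.0, "associate": 0.6}

def estimate_rate(city, title, years_experience, firm_size):
    rate = BASE_RATES[city] * TITLE_MULTIPLIER[title]
    rate += 10 * years_experience       # invented per-year premium
    rate += firm_size // 100 * 2        # invented large-firm premium
    return rate

print(estimate_rate("New York", "partner", 6, 550))  # 570.0 with these numbers
```

The lookup-and-adjust structure, not the invented numbers, is the point: once the factors are coded consistently, rate comparison becomes mechanical.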
This is the type of information that helps general counsel better manage their legal costs, says Daniel Katz, assistant law professor at Michigan State University College of Law. In many cases, it doesn't matter where the legal work is done. But until now there was no way for general counsel to easily compare such information.
Although the app is based on historical data, it's uncanny how accurately it can predict current rates, gushes Craig Raeburn Jr., the managing director of the TyMetrix Legal Analytics unit. "Every time we sit down with a lawyer, whether on the corporate side or law firm side, and we do that analysis . . . we are within five dollars of what the attorney's rate is," he says. "They say, 'Oh my God, how did you do that?'"
WORKING BY NUMBERS
You would think the capabilities of quantitative legal prediction might be threatening to BigLaw. But that's not the attitude at Seyfarth Shaw. It adopted Lean Six Sigma, a data-driven management and quality improvement philosophy, about six years ago, says Lisa Damon, partner and chair of the firm's labor and employment department.
Today, it has customized some of the concepts into what it calls SeyfarthLean, a client service model based on Lean Six Sigma principles. To more accurately price and more efficiently deliver legal services, the firm collected and analyzed data on the amount of time it took people to do various tasks. It used information from TyMetrix's Real Rate Report, which includes data on the average amount of time that a timekeeper spends on specific tasks in defined types of legal matters, says Damon. "That data is incredibly important in terms of efficient delivery of services. It allows law firms and in-house legal to look at median costs for a particular type of matter, and begin to drive those costs down through greater efficiency."
But what about the fact that Seyfarth's clients also have access to that information? "We think the [availability of] more data presents the opportunity to make better decisions," she says. "Greater transparency is always a benefit."
Seyfarth is also using data to help it better manage legal matters. In the area of employment law, the firm has collected data on U.S. Equal Employment Opportunity Commission cases, and it has access to a national EEOC database as well.
By tracking client matters and comparing them to this data, Seyfarth helps clients understand and manage their risks.
For example, if a client has an EEOC charge, Seyfarth can pull data on other complainants, the particular EEOC investigator, and the type of claim. That data may indicate a higher risk with this investigator and category of claim. Often, clients treat EEOC charge work as "the lowest type of commodity work; it's essentially just filing an answer," she says. But now Seyfarth can warn a client to be extra careful in a particular case. It can tell the client: "If you don't manage that charge right ... it could [turn] into a multi-million-dollar lawsuit."
Extra Credit: Daniel Katz slide deck on quantitative legal prediction, presented at LegalTech New York 2012: slidesha.re/LTN1261.
Tam Harbert is a freelance reporter based in Washington, D.C. Email: email@example.com.