The first phase of the highly anticipated Oracle/Electronic Discovery Institute joint research project has been completed, and it confirms what many advocates have been preaching about technology-assisted review (aka predictive coding): that spending more money doesn't correlate with greater quality; that senior attorneys know what they are doing; and that you can't turn discovery over to robots, because humans are still the most vital component of the project.
The study began in 2012, when the nonprofit EDI launched the project with Oracle Corp. Stanford University professors Peter Glynn and Gerd Infanger serve as chief scientists for the project; Pallab Chakraborty, e-discovery director in Oracle's legal department, and EDI co-founder Patrick Oot, the U.S. Securities and Exchange Commission's senior special counsel, are advisers.
"EDI directed the professors to not only evaluate the results submitted by the participants, but to help define a gold standard to evaluate each of the respondents," said Oot. As a result, the study "not only looks at document categorization systems—but considers multiple evaluation standards used to assess the study participants."
The study "considered multiple evaluation systems using litigation data from real high-stakes litigation where the producing party was confident that it conducted a meticulous attorney-based document review to respond to the document request," he explained. Said Oracle's Chakraborty: "The document review originally conducted by outside counsel in the study matter was a rigorous undertaking that included a thorough review with multiple quality checks. The review team comprised both law firm associates and contract attorneys."
Participants paid a $3,500 fee to help cover costs (and pay the professors), said Oot, noting that "participants were required to evaluate documents for responsiveness to the document request, identify documents to withhold for privilege, and locate hot documents that are highly relevant to the matter." Some vendors provided cost-free resources to participate in the study.
A key goal was to benchmark not only the "accuracy performance of different providers as they compared to different standards," but also the cost of deploying their process as if it were a real case, he said. Vendors submitted invoices as if they were billing a real matter. "To avoid gamesmanship, providers were held to pricing, and in some circumstances had to answer follow-up validation about their pricing practices," Oot said. Potential business with Oracle was a motivator for firms to participate in the study, he noted.
The technology-assisted review providers that submitted results included Backstop, Catalyst, Consilio, D4 Discovery, Daegis, DiscoverReady, Exterro, Huron, Integreon, Kroll Ontrack, McDermott Will & Emery's Discovery & Dispute Services, ProSearch, Quislex teamed with Driven, RVM, SFL Data and Valora.
Some participants took advantage of the opportunity to submit multiple results, creating an opportunity to use the study as an exercise to help fine-tune proprietary systems, Oot noted. "To avoid the perception of a bake-off," providers were required to sign a strict nondisclosure agreement with EDI and Oracle that prevents them from disclosing or identifying their own performance or the ranking of others, he said. Because breaches "could result in liquidated damages," providers are deterred from disclosing their performance to others.
A 2007 Department of Justice matter involving government pricing practices at Sun Microsystems triggered the document production. Oracle completed an acquisition of Sun Microsystems in 2010, inheriting the litigation, explained Oot. "Oracle settled the underlying litigation in 2011, but had already completed its document review of Sun documents. Chakraborty convinced the Oracle legal team that the data was a great opportunity to assist Oracle in the evaluation of predictive coding providers. Knowing of EDI's first study, Chakraborty then reached out to me at EDI to launch the study," Oot recalled. With assistance from Oracle's head of litigation, Deborah Miller, the team located Glynn and Infanger, he said.
Participants received a collection of 1,693,243 documents and review materials, including the complaint, custodian list, glossary, privilege memorandum, inside and outside attorney name list, confidentiality memorandum, tagging rules memorandum, issue tag flowcharts, issue tag definitions, case updates/announcements, specific document request review rules, an acronym list and a combined timeline. Participants submitted interrogatories to Sarah Dean, an associate at Hogan Lovells who managed the document review in the original matter; she answered questions via conference calls. Not all participants finished the process, said Oot.