E-discovery software options make use of a number of collection and culling techniques to find documents relevant to a litigation matter. When it comes to technology-assisted review (TAR), there are generally two camps of technology, says Gregory Leighton, partner at Neal, Gerber & Eisenberg.

“In terms of service offerings, we commonly see two competing models. The more widespread is predictive coding, which has had plenty of exposure in the media and which courts are recognizing more frequently. Then you have language-based analytics, which is the counterpunch to predictive coding. Both models achieve the same goal; it’s just the way they go about doing so that differs,” Leighton says.

While there are many other niches in the e-discovery space, predictive coding and language-based analytics cover a large portion of TAR offerings. With predictive coding solutions, human reviewers code a seed set of documents, and then a computer program analyzes that set, attempting to replicate the reviewers’ decision-making across the rest of the collection.
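The seed-set workflow can be sketched with a toy classifier. This is an illustrative, hypothetical implementation (a simple Naive Bayes in plain Python), not how any commercial predictive coding product actually works; all names and training data here are invented:

```python
from collections import Counter
import math

def tokenize(text):
    return text.lower().split()

class SeedSetClassifier:
    """Toy multinomial Naive Bayes trained on a human-reviewed seed set."""

    def __init__(self):
        self.word_counts = {"relevant": Counter(), "irrelevant": Counter()}
        self.doc_counts = Counter()

    def train(self, seed_set):
        # seed_set: (document_text, label) pairs produced by human review
        for text, label in seed_set:
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokenize(text))

    def predict(self, text):
        vocab = set(self.word_counts["relevant"]) | set(self.word_counts["irrelevant"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(counts.values())
            for word in tokenize(text):
                # Laplace smoothing so unseen words don't zero out a class
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            scores[label] = score
        return max(scores, key=scores.get)

# Hypothetical seed set coded by reviewers
seed = [
    ("quarterly merger negotiation memo", "relevant"),
    ("merger due diligence checklist", "relevant"),
    ("office holiday party invitation", "irrelevant"),
    ("cafeteria menu for next week", "irrelevant"),
]
clf = SeedSetClassifier()
clf.train(seed)
print(clf.predict("draft merger agreement memo"))  # prints "relevant"
```

The program then applies the same prediction to every unreviewed document in the collection, which is where the replication of human judgment (and, as Leighton notes below, the propagation of human error) comes in.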

However, Leighton argues that, because of the way these programs are generally set up, they can be prone to certain problems. “Because predictive coding is reliant on an algorithm based in human review, it can be time-consuming and prone to errors. Language-based analytics, on the other hand, relies on humans to do a review ahead of time, and requires resource commitment upfront to make that review simpler,” he explains.

Leighton says that language-based offerings can more proactively cull data sets, presenting e-discovery project managers with a novel way to cut down the number of documents before the review process is kicked off.

Here he lays out how these software products accomplish that task:

“Language-based analytics extracts every word of text out of a defined set, which could be millions of words, depending on the volume of information. Then, it collates those words into a list with an incidence rate, showing which words appear most frequently and filtering out prepositions and other superfluous words as it goes,” Leighton says.
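The extraction-and-collation step Leighton describes can be illustrated with a short sketch. The stopword list, function name, and sample documents are hypothetical, and real products use far more sophisticated language filtering:

```python
import re
from collections import Counter

# Hypothetical filter for prepositions and other superfluous words
STOPWORDS = {"the", "a", "an", "of", "to", "in", "on", "for", "and", "or", "with", "at", "by"}

def word_incidence(documents):
    """Extract every word from the document set and rank by incidence,
    filtering out stopwords along the way."""
    counts = Counter()
    for text in documents:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common()

docs = [
    "Memo on the proposed merger with Acme",
    "Merger timeline and diligence tasks",
    "Notes for the merger diligence call",
]
print(word_incidence(docs)[:3])  # "merger" and "diligence" rank highest
```

The resulting ranked word list is what gets handed to the project manager and case attorneys in the next step.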

After generating that list, e-discovery project managers have more information at their disposal when they sit down with the attorneys on a specific case. That information can aid in determining which documents fall within the scope of the discovery request.

“Instead of blindly deciding which documents to review, you now have a list of words extracted and you can say, ‘Okay, if a document has this word, it’s relevant. If it doesn’t, then we don’t need to review the document.’ You can get to a point where you’re able to say, ‘If this word appears, there is almost no way that this document is relevant,’” Leighton says.
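That include/exclude logic amounts to partitioning the collection on word hits. A minimal sketch, with hypothetical term lists and documents:

```python
def cull(documents, include_terms, exclude_terms=()):
    """Partition a document set: queue for review any document that contains
    an include term and no exclude term; set everything else aside."""
    review, set_aside = [], []
    for doc in documents:
        words = set(doc.lower().split())
        if words & set(include_terms) and not words & set(exclude_terms):
            review.append(doc)
        else:
            set_aside.append(doc)
    return review, set_aside

docs = [
    "merger term sheet draft",
    "fantasy football league standings",
    "merger closing checklist",
]
review, set_aside = cull(docs, include_terms={"merger"}, exclude_terms={"fantasy"})
print(len(review), len(set_aside))  # prints "2 1"
```

Documents set aside this way never enter human review, which is the cost lever discussed next.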

Even in best-case scenarios, the human error rate in document review is high, with some research estimating it at as much as 15 percent. Leighton says that by cutting back on the volume of documents that need review, e-discovery costs can be slashed. “The fewer documents for which we need to rely on humans to review, the better,” he says.

Although counsel considering a new e-discovery program will want to evaluate their litigation profiles and specific needs before committing to a particular piece of software, language-based programs offer a viable alternative to predictive coding.