Image: Catalyst Repository Systems, Inc.
Searching documents in e-discovery should be fast. No one has time to wait for a database to churn out search results. Fortunately, most of today's hosted document review platforms are speedy enough, but could they be faster?
That's a question that John Tredennick and his team at Catalyst Repository Systems, Inc., set out to answer several years ago when they started to push the limits of current technology in their larger cases. The "structured query language" (SQL) that most platforms use requires data to be placed in some kind of structure before it can be searched. But most of the data involved in e-discovery is unstructured: email, Word documents, text files, etc. That's why Tredennick and the Catalyst team turned to an XML-based back end for their new Insight document review platform.
I tested Catalyst Insight using a PC with Windows 7 Professional and Google's Chrome browser (version 24.0.1312.56) and found the hosted platform blazing fast for document review.
IMMEDIATE SEARCH AND RESPONSE
To give an example, I accessed an Insight account that contained 8 document collections with a total of 8,981,995 records. I knew there were that many records in the database because a counter in the lower right corner updated itself in response to anything I typed.
Since one of the document collections was the Enron data set, I jumped right into the "Free-Form Search" box and searched for one of my favorite emails from the collection with the words "lunch" and "shred." As soon as I typed "lunch" my record count jumped down to 83,191 in under a second. When I typed "shred" it immediately plummeted to 52. I hit the Search button and pulled up the email I was looking for: "This week is not good [for lunch]. I have too large a pile of documents to shred. Next week is better."
The story here isn't that I found my email; I can perform the same search in any platform and (probably) get the same result. The story is how responsive Catalyst Insight was to my search. I don't want to simply call it "fast." I would describe it as "immediately responsive" because the system was running my search in the background before I even hit the Search button. I could experiment with search terms and immediately see the number of potential results.
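How Insight computes its live record count is not covered here. As a rough illustration of the general idea only, the Python sketch below keeps a small inverted index and intersects the matching document sets each time a term is added, so the count can be refreshed before a full search is run. The function names, the sample documents, and the AND-only matching are all assumptions made for the example, not a description of Catalyst's XML back end.

```python
from collections import defaultdict

# Hypothetical sample documents; ids and text are made up for illustration.
docs = {
    1: "This week is not good for lunch. I have too large a pile of documents to shred.",
    2: "Lunch on Friday?",
    3: "Quarterly report attached.",
}

def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            word = token.strip('.,;:!?"()[]')
            if word:
                index[word].add(doc_id)
    return index

def matching_ids(index, all_ids, terms):
    """Return the ids of documents containing every term typed so far."""
    result = set(all_ids)
    for term in terms:
        # .get() avoids adding unknown terms to the defaultdict
        result &= index.get(term.lower(), set())
    return result

index = build_index(docs)
print(len(matching_ids(index, docs, [])))                  # 3
print(len(matching_ids(index, docs, ["lunch"])))           # 2
print(len(matching_ids(index, docs, ["lunch", "shred"])))  # 1
```

Because each new term only shrinks the set of candidates, the count can drop as you type, much like the counter described above.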
MULTIPLE SEARCH OPTIONS FOR YOUR DOCUMENT COLLECTION
When you log into Catalyst Insight, the primary navigation appears on the left side with small square icons for Search, Folders, Review Projects, Monitors, and the Administrator Console. By default, you're brought to the Free-Form Search screen, which I used in the example above. See Figure 1. There are some Advanced settings here (stemming, case sensitivity, etc.), but if you want to just start typing you're free to do so. There's also a "Search Assist" box that allows you to select a specific field to search if you wish.
But if you're taking the exploratory route in your search, you're better off starting with "Faceted Search." See Figure 2. The Facets here are based on the fields that appear in the main window as list boxes. You can bring up the author box and add names to your search. Next, you can add another Facet such as "doctype" to narrow your search. The "docdate" field comes up as a nice visual graph allowing you to drag your cursor over the relevant timeframe.
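For readers unfamiliar with faceted search, the following Python sketch shows the basic mechanics under simple assumptions: each record is a dictionary, a facet is a count of the values in one field, and selecting facet values filters the working set. The field names "author," "doctype," and "docdate" come from the description above; the records and helper functions are hypothetical and do not reflect how Insight is implemented.

```python
from collections import Counter

# Hypothetical records; dates are ISO strings so they compare correctly as text.
records = [
    {"author": "alice", "doctype": "email", "docdate": "2001-05-14"},
    {"author": "alice", "doctype": "word", "docdate": "2001-08-02"},
    {"author": "bob", "doctype": "email", "docdate": "2001-10-30"},
    {"author": "carol", "doctype": "email", "docdate": "2002-01-09"},
]

def facet_counts(items, field):
    """Count how many records carry each value of the given field."""
    return Counter(item[field] for item in items)

def apply_facet(items, field, allowed):
    """Keep only records whose field value is one of the selected values."""
    return [item for item in items if item[field] in allowed]

def in_date_range(items, start, end):
    """Keep only records dated between start and end, inclusive."""
    return [item for item in items if start <= item["docdate"] <= end]

# Pick two authors, look at the doctype facet, then narrow by date.
narrowed = apply_facet(records, "author", {"alice", "bob"})
print(facet_counts(narrowed, "doctype"))
print(len(in_date_range(narrowed, "2001-06-01", "2001-12-31")))  # 2
```

Each facet selection simply narrows the current result set, which is why facets work well for exploratory searching.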
The "Tracked Search" option allows you to generate useful reports on search terms. See Figure 3. You build your search by clicking the plus sign for each box and then entering your terms. If you already have a list of search terms that someone composed, you can copy and paste them into the "Delimited Entry" tab as long as they're separated by a comma, semicolon or hard return. You'll have to spend some time building your search here, but once you're done you can select "Create Report" from the Search Options at the top.
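If you maintain your term list outside the platform, it can help to check how it will split before pasting it in. The Python sketch below is one way to do that, assuming the three separators mentioned above (comma, semicolon, hard return). The function name is hypothetical, and Insight's own parsing rules may differ, for example in how it treats extra whitespace.

```python
import re

def parse_delimited_terms(raw):
    """Split a pasted list on commas, semicolons, or line breaks."""
    parts = re.split(r"[,;\r\n]+", raw)
    # Trim surrounding whitespace and drop empty entries.
    return [part.strip() for part in parts if part.strip()]

terms = parse_delimited_terms("lunch, shred;pile of documents\nnext week\r\n")
print(terms)  # ['lunch', 'shred', 'pile of documents', 'next week']
```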
The Report provides details on the document collections and folders that were searched, followed by a visual chart of the documents by doctype (which can be switched to a pie graph, doughnut, column, etc.). The most helpful information is at the bottom, where it lists the keywords you used with the number of hits recorded. It also lists the similar words that were NOT included in your search. This report is extremely helpful when you're arguing about search terms with the other side.
VIEWING YOUR DOCUMENTS
When you're ready to view documents, Insight lists them in "Table View" by default. See Figure 4. Site Administrators can customize this default view or users (as allowed) can create their own views. All the tools are there to customize your list of columns but it did take me a few minutes to find everything (e.g., adding a new column requires clicking on a dropdown in an already existing column).
To see the content of a document, simply click the row and a Preview window pops open on the right side of the screen. See Figure 5. Nothing fancy here as the Preview window only shows the textual rendition of the file, but it does highlight search terms. You can also click "Show Fields" to see a list of all the metadata associated with the document.
Clicking the "Launch Detail" button will open the document in a separate browser window where you can view either the text of the document or a PDF. The embedded viewer worked great for every file type I tested but if you need to view the native file you can download it and use local software on your computer.
The bottom right corner of the Document Viewer shows "Related" documents (e.g., an email and its attachments) and "Duplicates" pulled from the database. Checkboxes allow you to tag the groups as appropriate. See Figure 6.
While the Table View will satisfy most review needs, Insight also shows the number of documents per author in the customizable Chart View or, if you select the Size option, graphs the total sizes of the files. See Figure 7.
There's also a "Communication Tracker" and "Communication Report" that visually presents how emails were exchanged between individuals. See Figure 8.
REDACTING, PRINTING, AND EXPORTING
Insight fully supports redactions for documents, but you'll need permission from the site administrator to apply them. Documents must be converted to PDF before any redactions can be applied. When you click the "Redact" button, you'll need to choose a Redaction Set in which to save the redacted document before continuing. Once you go through all of that, Insight offers a nice set of tools for creating redactions and stamps.
The "Print" option allows you to batch together selected documents as a compiled .pdf or .zip file. This is a tad confusing since this feature doesn't actually send the documents to a local printer, but the tool is an excellent way for support personnel to generate a combined PDF of the selected documents complete with separator sheets and custom PDF bookmarks.
There's also an "Export" feature which allows one to select and download structured information about the documents. You can choose the fields you want included and export them as an Excel file, .csv, Microsoft Word, etc. This is an excellent method for creating a privilege log.
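To show what a field-level export amounts to, here is a minimal Python sketch that writes selected fields from a set of records to CSV using the standard library. The records, field names, and function name are invented for the example. Insight performs its export through the web interface, so this only illustrates the shape of the resulting file, such as the starting point for a privilege log.

```python
import csv
import io

# Hypothetical document records with more fields than the log needs.
records = [
    {"docid": "D-001", "author": "alice", "doctype": "email", "privilege": "attorney-client"},
    {"docid": "D-002", "author": "bob", "doctype": "word", "privilege": "work product"},
]

def export_fields(items, fields):
    """Return CSV text containing only the chosen fields, with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fields,
        extrasaction="ignore",  # silently skip fields that were not selected
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()

print(export_fields(records, ["docid", "author", "privilege"]))
```

The resulting text opens directly in Excel or any other spreadsheet program.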
I found Catalyst Insight to be blazing fast compared to numerous other review platforms. The Catalyst team promises more tweaks and updates very soon including a process for lawyers to help train the system for predictive coding.
Using Insight makes me believe I've seen the future of how we will search "Big Data." It's not that anything's wrong with our current systems, but the fact that a veteran vendor like Catalyst is looking to new technologies tells me that it may be time for others to start considering alternatives as well.
Prices start as low as $35 per gigabyte, which includes project management. There are no separate user fees. Rates are adjusted for larger volumes. Catalyst also offers terabyte rates for corporations and law firms that enter into enterprise agreements.
Brett Burney is principal of Burney Consultants, where he works with law firms and corporations on managing electronic data for litigation matters. Email: email@example.com.