Innovative Analytics Workflow Found the Needle in a Pile of Needles

What happens if you have a huge collection and suspect that only a tiny fraction of it is going to be responsive to the request for production? What do you do when you have to perform an internal investigation over a broad dataset that may or may not yield anything to support or refute the allegations that led to the investigation? Traditionally, this means that you apply some search terms, hoping to limit the number of documents that you subject to a linear review by a team of contract reviewers.

The fact remains that in nearly all litigation, the collected files are greater in number than those that will be produced. So, why do we feel we have to look at all those non-responsive documents? Fortunately, there is a better way, and it is driven by analytics.

What is analytics and how do you apply it to your case? There are a variety of analytics tools fitted for use in the practice of eDiscovery. From predictive coding or technology-assisted review, to near duplication and concept search, analytics comes in all sorts of shapes and sizes. It is a suite of tools, each with its own unique set of applications. It can be difficult to know what analytics components to implement on your matter. The difference between success and failure is determined more by the workflow than it is by the tool. With the right partner consulting on your analytics workflow, the results can be amazing.

This is a case study where D4 applied an analytics workflow we call the “Smoking Gun Concept Search” method to reduce the volume of data prior to review, which resulted in our client saving many hours and dollars.

The Background

Recently, D4 had a case in which the legal team was tasked with finding a needle in a stack of needles, or more specifically, a set of emails within a collection of 120,000 emails. They did not have a specific recipient to key on for their search. They did not know the specific language used in drafting the email they assumed was somewhere in the collection. They did, however, have one valuable item upon which to build their search: a clear, conceptual definition of what they were looking for.

The Problem

A former employee of Company XYZ was suspected of breaching the non-solicitation agreement he signed before leaving XYZ to join Company 123. Management at XYZ believed that the former employee was actively recruiting ex-colleagues to join him at Company 123. In an effort to determine if the assumptions were true, legal counsel for Company XYZ recommended collecting the email communications from the individuals who worked in the same department as the former employee. Those emails were collected from the time period between when the employee submitted his notice until Company XYZ filed its claim.

The Solution

The emails were processed and loaded into a review tool. Using a process called categorization, example language was submitted to the analytics index. The index then went through the database, identified documents similar to the example language and tagged them with a predefined category. The 360 categorized documents were reviewed but were deemed to lack the requisite substance. On the D4 consultant’s suggestion, the legal team drafted (from thin air) their perfect smoking gun email. We took that fictional document and seeded the index with that language. The result was a little more than 400 documents. After reading them, the case team completed its review.
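To make the seeding idea concrete, here is a minimal sketch in Python. It is not the review tool's analytics index, whose internals the case study does not describe. As a stand-in, it scores each document against the seed text with plain word-count cosine similarity, which matches shared vocabulary rather than concepts. The function names, the sample emails and the 0.3 threshold are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def categorize(seed_text, documents, threshold=0.3):
    """Return (doc_id, score) pairs for documents similar to the seed.

    The threshold is an arbitrary illustrative cutoff, not a value
    taken from the case study.
    """
    seed = Counter(tokenize(seed_text))
    hits = []
    for doc_id, text in documents.items():
        score = cosine(seed, Counter(tokenize(text)))
        if score >= threshold:
            hits.append((doc_id, score))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical "perfect smoking gun" email drafted by the legal team.
seed = "Come join me at my new company, we have an opening on the team for you"

# Hypothetical collection: one recruiting email, one unrelated email.
docs = {
    "a": "You should join me at the new company, there is an opening on my team",
    "b": "Quarterly budget spreadsheet attached for review",
}

hits = categorize(seed, docs)
for doc_id, score in hits:
    print(doc_id, round(score, 2))
```

A true concept index would also surface emails that express the same idea in different words, which is what makes a drafted exemplar such a useful seed. The sketch only shows the mechanics: one example document in, a ranked and thresholded set of similar documents out.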

The Costs

There is an upfront cost to apply analytics to any data set, and that cost often deters teams from using the technology. That being said, the savings from applying analytics to this dataset were substantial. From collection to review, the cost for this project was $17,198.23, including the analytics charges of $3,820.50. A linear review of 195,000 records at $0.75 per document would have cost Company XYZ $159,627.73 in review, hosting and processing fees. By leveraging analytics, they saved $142,429.50.
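The arithmetic behind these figures can be checked with a short Python sketch. All dollar amounts come from the paragraph above. The split of the linear-review estimate into a review portion and a hosting-and-processing remainder is derived here from the $0.75 rate and is an assumption, not a breakdown stated by D4. The variable names are illustrative.

```python
from decimal import Decimal

# Figures quoted in the case study.
actual_total = Decimal("17198.23")      # collection through review, with analytics
analytics_charge = Decimal("3820.50")   # included in actual_total
linear_total = Decimal("159627.73")     # estimated review, hosting and processing
records = 195_000
rate_per_doc = Decimal("0.75")

# Savings: estimated linear-review cost minus what the project actually cost.
savings = linear_total - actual_total

# Assumed breakdown: review at the per-document rate, with the remainder
# of the estimate attributed to hosting and processing.
review_only = rate_per_doc * records
hosting_and_processing = linear_total - review_only

# Dollars saved for each dollar spent on analytics.
savings_per_analytics_dollar = savings / analytics_charge

print(f"Savings: ${savings:,.2f}")
print(f"Review-only portion of estimate: ${review_only:,.2f}")
print(f"Implied hosting and processing: ${hosting_and_processing:,.2f}")
print(f"Saved per analytics dollar: ${savings_per_analytics_dollar:,.2f}")
```

The computed savings match the $142,429.50 stated above, and the analytics charge accounts for a little over a fifth of the total project cost.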

As stated earlier, analytics is not a single tool set that can be turned on and magically point to your smoking gun. In the case of Company XYZ, categorization was the best workflow for the specific problems presented by the case. The results were astounding and the case team was very satisfied with both the outcome and the dollars saved. It is important to note that a different case with a larger number of issues may not fit into this specific workflow. In the era of analytics in eDiscovery, consultation is moving back to the forefront of the review cycle. Even though unit costs continue to drop, ever-increasing data volumes make any eDiscovery project an expensive proposition. Yet, by identifying the proper analytics workflow through consultation, and expertly implementing that workflow, analytics will more than pay for itself.
