Litigation Support: E-Discovery In the Trenches

Recall and Precision

By Jerry Bui

There's a great law.com article by H. Christopher Boehning and Daniel J. Toal that discusses traditional keyword and Boolean search methods versus new alternative methods. Though the authors don't name them specifically, the article is really about the theory of "recall" and "precision". Recall is the share of all the relevant material in a corpus that your search brings back in its result set. Precision is the share of the result set that is actually relevant, that is, how few false positives it contains. If you craft an overly broad search you may increase your recall but lower your precision, which usually means a larger number of false positive documents to sort through in your review. If you have very few false positives in your result set, you can identify relevant documents one after another with fairly high frequency, but the result may be a very thin slice of the overall relevant material (high precision, low recall). In other words, there may be a lot more juicy stuff out there to review. The trick, and this is the holy grail of search, is how to corral all of the good stuff without having any bad stuff mixed in.
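To make the two definitions concrete, here is a minimal sketch in Python. The function name, the document IDs, and the counts are all made up for illustration; it assumes you already know which documents are truly relevant, which in a real review you never do up front.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for a set of retrieved document IDs.

    retrieved: IDs returned by the search (hypothetical)
    relevant:  IDs that are truly relevant (assumed known for this example)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = retrieved & relevant
    # Recall: what share of the relevant documents did the search bring back?
    recall = len(true_positives) / len(relevant) if relevant else 0.0
    # Precision: what share of the result set is actually relevant?
    precision = len(true_positives) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical corpus in which documents 1 through 10 are the relevant ones.
relevant = set(range(1, 11))

broad = set(range(1, 41))   # overly broad search: 40 hits, all 10 relevant among them
narrow = {1, 2, 3}          # tight search: 3 hits, every one relevant

broad_result = recall_precision(broad, relevant)    # (1.0, 0.25)
narrow_result = recall_precision(narrow, relevant)  # (0.3, 1.0)
```

The broad search finds everything but three out of every four hits are false positives; the narrow search is all signal but misses 70% of the relevant material.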

It really depends on your review goals. The fallacy with most search efforts is a desire to get only low doc counts with the most relevant material possible. In this case, the emphasis for your review is on precision (maybe because cost is your primary driving constraint). If relevant material is rampant within the corpus, however, you will want to increase your recall in order to get at the full scope of your issue. You may tolerate a good number of false positives in order to be as thorough as possible (maybe completeness is your primary driving constraint). You'll want to decide quickly whether recall or precision is the ultimate goal of your review. Of course you'll want both, but after the review has started you'll want to shift your focus to one or the other depending on the incremental results of your review. You'll know quickly (after a day or two) whether your review assignments are yielding the desired level of precision. In order to test your level of recall, you'll want to sample the population of documents that were excluded from review (make sure the sample is random and large enough to be statistically meaningful). Once you perform a QC review on this sample set, you'll know whether your search terms captured enough relevant material.
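The sampling step can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the `is_relevant` callback (standing in for a human QC reviewer's call), and all the numbers are hypothetical, and the estimate comes without a confidence interval, which a real protocol would want.

```python
import random


def estimate_recall(relevant_found, excluded_ids, sample_size, is_relevant, seed=0):
    """Estimate recall by QC-reviewing a random sample of the excluded documents.

    relevant_found: relevant documents already identified in the reviewed set
    excluded_ids:   IDs of documents the search terms left out of review
    is_relevant:    callback standing in for the reviewer's call on one document
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    population = sorted(excluded_ids)
    sample = rng.sample(population, min(sample_size, len(population)))

    # Share of the sampled excluded documents that turned out to be relevant.
    missed_in_sample = sum(1 for doc_id in sample if is_relevant(doc_id))
    miss_rate = missed_in_sample / len(sample) if sample else 0.0

    # Project the sample's miss rate onto the whole excluded population.
    estimated_missed = miss_rate * len(population)
    total_relevant = relevant_found + estimated_missed
    recall = relevant_found / total_relevant if total_relevant else 0.0
    return miss_rate, recall


# Hypothetical numbers: 300 relevant documents found in review, 1,000 documents
# excluded, and a stand-in "reviewer" who marks every tenth excluded one relevant.
excluded = range(1000)
miss_rate, recall = estimate_recall(300, excluded, 200, lambda doc_id: doc_id % 10 == 0)
```

If the estimated recall comes back lower than your review goals can tolerate, that is the signal to broaden the search terms and iterate.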

As you all know, the iterative nature of this work is commonplace in our business. Unless you have a real sense of the percentage of relevant material to begin with, there's absolutely no way of knowing whether your search results have achieved the highest level of recall and precision until you roll up your sleeves and just dig into it. If you're trusting the artificial intelligence of a system to do this "auto-magically" for you, either by concept grouping or "learning" or some other newfangled algorithm, then you are putting quite a bit of faith in the technology. Remember that most of this new technology is a carefully guarded trade secret belonging to the software vendor. In order to prove anything to the court, however, you have to be able to lift the hood and explain the goings-on underneath. The only defensible position that one can take these days, at least until there's a technology winner that is universally accepted by the court, is to present your search terms with hit counts and corresponding review calls. Keyword and Boolean searches are still the state of the art today.
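A search-term report of that kind might be tallied along these lines. Everything here is a made-up illustration: the documents, the review calls, and the function name are hypothetical, and the plain case-insensitive substring match is a simplification of what a real review platform's keyword or Boolean engine does.

```python
def term_report(documents, review_calls, terms):
    """Tally hit counts and relevant review calls for each search term.

    documents:    {doc_id: text}
    review_calls: {doc_id: True if the reviewer called the document relevant}
    terms:        list of keywords (simple substring match, an assumption)
    Returns a list of (term, hit_count, relevant_hits) rows.
    """
    rows = []
    for term in terms:
        needle = term.lower()
        hits = [doc_id for doc_id, text in documents.items() if needle in text.lower()]
        relevant_hits = sum(1 for doc_id in hits if review_calls.get(doc_id, False))
        rows.append((term, len(hits), relevant_hits))
    return rows


# Hypothetical four-document corpus with reviewer calls.
documents = {
    1: "Merger agreement draft attached",
    2: "Lunch on Friday?",
    3: "Revised merger terms and escrow",
    4: "Escrow account for my house closing",
}
review_calls = {1: True, 2: False, 3: True, 4: False}

report = term_report(documents, review_calls, ["merger", "escrow"])
for term, hit_count, relevant_hits in report:
    print(f"{term}: {hit_count} hits, {relevant_hits} called relevant")
```

A table like this shows which terms are pulling their weight, and nothing about it is a black box.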

Full post as published by E-Discovery In the Trenches on April 26, 2008.
