The Practitioner's Role in eDiscovery 2.0
Man vs. Machine or People + Technology?

Effective use of predictive coding technology requires highly qualified people and sound processes to maximize results and achieve cost savings.
The Importance of Choosing the Right People in eDiscovery 2.0
Recently, US District Court Judge Andrew Carter upheld the now infamous February opinion by Judge Andrew J. Peck in Da Silva Moore v. Publicis Groupe[1]. This paves the way for wider use of advanced technologies to augment or replace traditional models of linear review. As a result, law firms and corporations no longer need to fear being penalized for using alternative non-linear technologies.
Despite this affirmative ruling, early adopters still need to ensure that cutting-edge technology is well managed by practitioners with the skills to use it effectively. The intersection of man and machine heightens the need for trusted advisors in the legal technology and staffing space, and calls for a method of evaluating the skill set of attorneys who employ increasingly sophisticated technology.
Da Silva & Other Cases… What Do They Mean?
There has been so much activity in legal periodicals and the eDiscovery blogosphere that it’s easy to lose sight of the relevance of the Da Silva decision. Judge Peck did not order the use or acceptance of Predictive Coding; rather, he stated that the technology should be considered as a method to curtail discovery costs. He went on to determine which of two proposed protocols (both employing Predictive Coding) was acceptable. In the original opinion, Judge Peck stated:
Computer-assisted review …should be seriously considered for use in large-data-volume cases where it may save the producing party (or both parties) significant amounts of legal fees in document review. Counsel no longer needs to worry about being “first” or “guinea pig” for judicial acceptance of computer-assisted-review.
The recent order in Global Aerospace Inc., et al, v. Landow Aviation[2] goes a step further and affirmatively directs an unwilling plaintiff to accept a production derived via predictive analytics to curtail discovery costs, overruling an assertion that said technologies were a “radical departure from the standard practice of Human Review”. In his ruling, Virginia Circuit Court Judge James H. Chamblin noted that predictive coding tools were not being used in lieu of having human beings make selections, but rather:
Predictive coding tools require human input for a computer program to “predict” document relevance. Additionally, the proposed approach includes an additional human review step prior to production.
There has not yet been a ruling where predictive coding has been prescribed despite opposition from one side, but that circumstance is likewise under judicial consideration. In looking at the cases above, it is clear that the bench is willing to accept Technology Assisted Review (TAR). The key word is “assisted”. Practitioners must take an active role in implementing these tools and use them in conjunction with fact-driven case development to reap the rewards offered by predictive coding or any of the many other tools available. Judicial approval lies in sound processes and the intersection of man and machine, not merely in which widget is selected.
How Does the Technology Work?
Technologies that fall under the umbrella of predictive analytics (AI, Automated Review, Predictive Coding, Concept searching, etc.) all follow a similar definition:
A computer program learns from experience (E) with respect to some set of tasks (T) and performance measure (P), if its performance of the tasks (T), as measured by (P), improves with experience (E).[3]
Machine-learning techniques (much like human cognition) learn from examples to better perform future tasks. The tasks and the performance expectations are constant, regardless of the tool being used. How a system receives the input that makes up the experience in this model varies with the vendor and solution employed. The accuracy of this process depends on the quality of the data practitioners provide: as with traditional manual review, garbage in = garbage out.
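To make the definition concrete, here is a minimal Python sketch. It is not any vendor's actual algorithm. The task (T) is tagging a document as responsive, the experience (E) is the set of documents a reviewer has already coded, and the performance measure (P) is accuracy on a held-out set. The word-vote scoring, the function names and the sample documents are all illustrative assumptions.

```python
from collections import Counter

def train(coded_examples):
    """Experience E: turn (text, is_responsive) pairs into per-word votes."""
    weights = Counter()
    for text, responsive in coded_examples:
        for word in set(text.lower().split()):
            weights[word] += 1 if responsive else -1
    return weights

def predict(weights, text):
    """Task T: call a document responsive when its words lean responsive."""
    score = sum(weights[word] for word in set(text.lower().split()))
    return score > 0

def accuracy(weights, held_out):
    """Performance P: share of held-out documents classified correctly."""
    correct = sum(predict(weights, text) == label for text, label in held_out)
    return correct / len(held_out)

# Hypothetical documents coded by a reviewer (True = responsive).
CODED = [
    ("merger pricing memo", True),
    ("lunch menu friday", False),
    ("pricing agreement draft", True),
    ("office party invitation", False),
    ("competitor pricing call notes", True),
    ("parking garage notice", False),
]

# Hypothetical held-out documents used only to measure performance.
HELD_OUT = [
    ("draft merger agreement", True),
    ("friday party menu", False),
    ("notes from competitor call", True),
    ("garage parking invitation", False),
]

accuracy_small = accuracy(train(CODED[:2]), HELD_OUT)  # little experience
accuracy_full = accuracy(train(CODED), HELD_OUT)       # more experience
print(accuracy_small, accuracy_full)
```

With only two coded examples the sketch gets three of the four held-out documents right; with all six it gets all four. That is the sense in which performance, as measured by (P), improves with experience (E).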
Risks Posed by Inadequate Training
Predictive analytic protocols all provide a variation of four key steps: Training, Learning, Evaluation and Validation. At each of these steps, practitioners with an understanding of the case, the technology and the data set play integral roles, as demonstrated in the following workflow:
At each phase, while the algorithm and computer eliminate some aspect of human decision-making, they also require practitioners to make sophisticated, iterative decisions informed by the case and the data. The workflow above is a variation of a machine learning model called reinforcement learning. At each step of this iterative process, it is imperative that quality decisions are made about the documents and data set, because the algorithm learns only from what the human reviewers have input. Errors on the front end or sloppy coding can lead to key elements of a case being overlooked, or can erode the cost savings of alternative technologies by increasing the amount of quality control or training required.
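The four steps can be sketched as a loop. The Python below is a toy illustration under stated assumptions, not a real product's protocol: the term-set model, the function names, the sample batches and the 0.9 stopping target are all hypothetical, and the loop itself is a plain supervised retraining loop chosen for simplicity.

```python
def learn(coded_docs):
    """Learning: keep terms seen only in responsive or only in other documents."""
    responsive_terms, other_terms = set(), set()
    for text, responsive in coded_docs:
        target = responsive_terms if responsive else other_terms
        target.update(text.lower().split())
    return responsive_terms - other_terms, other_terms - responsive_terms

def predict(model, text):
    """Tag a document responsive when responsive-only terms outnumber the rest."""
    responsive_terms, other_terms = model
    words = set(text.lower().split())
    return len(words & responsive_terms) > len(words & other_terms)

def evaluate(model, control_set):
    """Evaluation: accuracy against a control set coded by the case team."""
    hits = sum(predict(model, text) == label for text, label in control_set)
    return hits / len(control_set)

def run_review(batches, control_set, target=0.9):
    """Repeat the steps until the control-set score reaches the target."""
    coded, history, model = [], [], (set(), set())
    for round_no, batch in enumerate(batches, start=1):
        coded.extend(batch)                    # Training: reviewers code a batch
        model = learn(coded)                   # Learning: the model is rebuilt
        score = evaluate(model, control_set)   # Evaluation: measure the result
        history.append((round_no, score))
        if score >= target:                    # Validation: good enough to stop
            break
    return model, history

# Hypothetical batches of reviewer-coded documents (True = responsive).
BATCHES = [
    [("merger pricing memo", True), ("lunch menu friday", False)],
    [("pricing agreement draft", True), ("office party invitation", False)],
    [("competitor pricing call notes", True), ("parking garage notice", False)],
    [("board minutes merger vote", True), ("holiday schedule update", False)],
]

# Hypothetical control set coded up front by senior attorneys.
CONTROL = [
    ("draft merger agreement", True),
    ("friday party menu", False),
    ("notes from competitor call", True),
    ("garage parking invitation", False),
]

model, history = run_review(BATCHES, CONTROL)
print(history)
```

In this toy run the score stays at 0.75 for two rounds and reaches 1.0 in the third, so the fourth batch never needs to be coded. Every number in that history depends entirely on how the reviewers coded the batches and the control set, which is where the practitioner's judgment enters.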
Other models of machine learning are just as vulnerable when the human side is neglected. Take, for example, the supervised or semi-supervised machine-learning model employed by several notable predictive coding companies. These models rely heavily on a small seed set of data, selected based on knowledge of the case and data set and coded according to stringent guidelines set by key players in the case. An algorithm is built from this seed set, and random samples are then drawn and coded to refine it to a point of maximal efficiency. Again, the quality of the training and the knowledge of the practitioner are extremely important and will affect the efficiency of the review, the accuracy of the algorithm, and the cost savings realized.
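The random sampling step can be illustrated with a short Python sketch. It assumes a hypothetical population of 3,000 document IDs, a 50-document sample and invented reviewer calls; the function names are illustrative and the precision and recall formulas are the standard ones, not a particular vendor's validation report.

```python
import random

def draw_validation_sample(doc_ids, sample_size, seed=0):
    """Draw a simple random sample of document IDs for attorneys to code."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    return rng.sample(doc_ids, sample_size)

def precision_recall(predicted, actual):
    """Compare the algorithm's calls with the reviewers' calls on the sample."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(p and a for p, a in pairs)
    false_pos = sum(p and not a for p, a in pairs)
    false_neg = sum(a and not p for p, a in pairs)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

# Hypothetical population of document IDs and a 50-document sample.
sample_ids = draw_validation_sample(list(range(1, 3001)), 50)

# Hypothetical calls on eight sampled documents (True = responsive).
PREDICTED = [True, True, True, False, False, False, True, False]
ACTUAL = [True, True, False, False, True, False, True, False]

precision, recall = precision_recall(PREDICTED, ACTUAL)
print(len(sample_ids), precision, recall)
```

Here the algorithm tags four documents as responsive and reviewers agree on three, while one responsive document is missed, so precision and recall are both 0.75. If the reviewers' calls on the sample are careless, these figures are meaningless, which is why the coding guidelines matter as much as the tool.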
Is Manual Document Review Dead?
In a word, no. In supporting the Da Silva opinion, Judge Carter went one step further than Judge Peck, stating that “manual review with keyword search is costly…such review is prone to human error and marred with inconsistencies from the various attorneys determination of whether a document is responsive”. Despite this critique, Judge Carter was also quick to acknowledge that traditional manual review is still “appropriate in certain situations”.
There is no need to throw the baby out with the bath water: traditional keyword searches coupled with attorney review can be highly effective for examining large amounts of data. The problems noted above arise when ill-conceived searches built on broad search terms that are over- or under-inclusive push massive amounts of data to traditional manual review. Commentary on Sedona Principle 11 notes:
The selective use of keyword searches can be a reasonable approach when dealing with large amounts of Electronic Data… this exploits a unique feature of electronic information – the ability to conduct fast, iterative searches for the presence of patterns of words or concepts in large document populations[4]
Further, even the most aggressive applications of predictive coding and analytics rely on an iterative human feedback loop. No matter the complexity of the tool or the aggressiveness of the protocol, the need for harmony between people and technology remains.
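The fast, iterative keyword searching described in the Sedona commentary can be sketched in a few lines of Python. The documents, the search terms and the `hits` helper are hypothetical; real review platforms use indexed search with far richer syntax.

```python
def hits(documents, all_of=(), none_of=()):
    """Return documents containing every term in all_of and no term in none_of."""
    matched = []
    for doc in documents:
        words = set(doc.lower().split())
        has_required = all(term in words for term in all_of)
        has_excluded = any(term in words for term in none_of)
        if has_required and not has_excluded:
            matched.append(doc)
    return matched

# Hypothetical document population.
DOCS = [
    "supply contract breach notice",
    "contract renewal reminder",
    "gym contract discount offer",
    "breach of supply contract damages estimate",
    "team lunch on friday",
]

broad = hits(DOCS, all_of=["contract"])             # first pass, over-inclusive
narrow = hits(DOCS, all_of=["contract", "breach"])  # refined second pass
print(len(broad), len(narrow))
```

The broad term sends four of the five documents to manual review, while the refined query sends two. Deciding whether the refined query has become under-inclusive still requires an attorney who knows the case to look at what was left out.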
Quality Practitioners Are Needed to Maximize Technology
The recent opinion by Judge Peck has changed the landscape of the e-discovery space, but not in the way that most expected. Rather than rendering the document reviewer obsolete, it offers reviewers the opportunity to work hand in hand with the case team to get the most out of technology in a document review. To remain viable candidates in the post-Da Silva world, reviewers must raise their skills in proportion to the sophistication of the technologies being deployed.
Firms and companies looking to capture the savings (up to 60-75%+[5]) that alternative technologies offer need to work closely with their technology providers to ensure that they fully understand, and can articulate, the process they will be using. They should also work closely with the staffing providers they trust to refine the profile of the people they want on these sophisticated reviews. A close partnership with a trusted provider allows a firm to determine the qualifications and skills that enable reviewers to best leverage technology. Further, a close relationship allows firms to use the same group of people repeatedly and to transfer institutional knowledge smoothly from matter to matter, without investing in the overhead of an internal team of attorneys dedicated purely to document review.
Citations
[1] http://www.nylj.com/nylawyer/adgifs/decisions/022912peck.pdf
[3] Mitchell, T. (1997). Machine Learning, McGraw Hill.
[4] Sedona Principle 11 commentary
[5] Recommind claims a 50-90% cost reduction in a WilmerHale case study; a RAND study also supports cost savings at various stages (up to 75-80% of review); StoredIQ: 85% of review; Autonomy claims up to a 90% reduction of data passed on for review and 80% review savings. In Da Silva Moore, defendant MSLGroup proposed to limit the cost of its review and production of over 3.2 million documents to $550,000 by utilizing predictive coding software. The technology-assisted reviews require (on average) human review of only 1.9% of the documents, a fifty-fold savings over exhaustive manual review; see Tables 5 and 6 in Grossman/Cormack, Richmond Journal of Law and Technology.