Data
Legal analytics is heavily dependent on data, so weaknesses in the underlying data become weaknesses in the resulting analytics. One study of legal analytics platforms found wildly divergent results. See Bob Ambrogi, Legal Analytics Products Deliver Widely Divergent Results, Study Shows, LexBlog (Nov. 25, 2019), https://www.lexblog.com/2019/11/25/legal-analytics-products-deliver-widely-divergent-results-study-shows [https://perma.cc/B7J8-NN6A].
- Large data sets are needed
- Small data sets are unlikely to produce reliable results, so legal analytics comes most into play with massive data sets, as the sketch below illustrates.
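A minimal sketch of why sample size matters, using a hypothetical judge's motion grant rate. The counts, the function name, and the use of a normal-approximation 95% confidence interval are illustrative assumptions, not drawn from any analytics product.

```python
import math

def grant_rate_interval(grants, motions, z=1.96):
    """Approximate 95% confidence interval for an observed grant rate
    (normal approximation; purely illustrative)."""
    p = grants / motions
    margin = z * math.sqrt(p * (1 - p) / motions)
    return max(0.0, p - margin), min(1.0, p + margin)

# The same 60% observed grant rate at two hypothetical sample sizes:
print(grant_rate_interval(6, 10))      # ~ (0.30, 0.90) -- too wide to support any conclusion
print(grant_rate_interval(600, 1000))  # ~ (0.57, 0.63) -- narrow enough to be useful
```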
- Inaccurate data
- Biased data
- When algorithms were turned loose on the bail-setting and sentencing processes, they demonstrated conspicuous racial bias. Even when the questions never ask directly about race or income, the answers can end up serving as proxies for race and class, as the sketch after the citations below illustrates.
- See Julia Angwin et al., Machine Bias, ProPublica (May 23, 2016), https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing [https://perma.cc/GW49-6JC6]; Anna Maria Barry-Jester et al., The New Science of Sentencing, The Marshall Project (Aug. 4, 2015), https://www.themarshallproject.org/2015/08/04/the-new-science-of-sentencing [https://perma.cc/R3N6-6W78]; Alex Campolo et al., AI Now 2017 Report (2017), http://bit.ly/2O0HmGX [https://perma.cc/KPX8-ZUPG].
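A minimal sketch of how a proxy variable works, using an entirely simulated population. The group labels, zip codes, and "risk" rates are made-up assumptions chosen only to show the mechanism, not real data.

```python
import random

random.seed(0)

# Simulated population: the model never sees "group", only "zip", but
# residential segregation in the simulation ties the two together.
people = []
for _ in range(10_000):
    group = random.choice(["A", "B"])
    if group == "A":
        zip_code = 1 if random.random() < 0.8 else 2
    else:
        zip_code = 2 if random.random() < 0.8 else 1
    people.append({"group": group, "zip": zip_code})

# A "group-blind" risk score keyed only to zip code (say, historical arrest
# rates by neighborhood, which may themselves reflect skewed enforcement).
risk_by_zip = {1: 0.7, 2: 0.3}

for group in ("A", "B"):
    members = [p for p in people if p["group"] == group]
    avg = sum(risk_by_zip[p["zip"]] for p in members) / len(members)
    print(f"average risk score assigned to group {group}: {avg:.2f}")
# The two groups receive systematically different scores even though group
# membership never appears as an input -- zip code carries it in by proxy.
```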
- Lack of sophisticated tagging
- In addition to potential PACER errors, many analytics tools rely on PACER's Nature of Suit (NOS) codes for their classification schema and taxonomy. PACER does not have NOS codes for every practice area, and the NOS codes may not be specific enough for what is being asked of the analytics, as the sketch below illustrates.
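A toy sketch of the coarseness problem. The handful of NOS labels shown is a simplified, illustrative subset rather than the official list, and the docket rows are invented; the point is only that a researcher looking for a specific claim type (here, trade secret) has no code that maps to it cleanly.

```python
# Illustrative (and simplified) subset of Nature of Suit codes.
NOS_LABELS = {
    190: "Contract: Other",
    830: "Patent",
    890: "Other Statutory Actions",
}

# Made-up docket entries. No NOS code in this subset means "trade secret",
# so those cases scatter across broader categories that also hold unrelated disputes.
dockets = [
    {"case": "Acme v. Zenith", "nos": 190, "claims": ["trade secret", "breach of contract"]},
    {"case": "Bolt v. Nut Co.", "nos": 190, "claims": ["breach of contract"]},
    {"case": "Core v. Shell", "nos": 890, "claims": ["trade secret"]},
    {"case": "Dyn v. Stat", "nos": 830, "claims": ["patent infringement"]},
]

by_nos_code = [d["case"] for d in dockets if d["nos"] == 190]
by_claim_tag = [d["case"] for d in dockets if "trade secret" in d["claims"]]

print(f"Filtering on NOS 190 ({NOS_LABELS[190]}):", by_nos_code)   # over-inclusive, and misses Core v. Shell
print("Filtering on claim-level tags:", by_claim_tag)              # what the researcher actually wants
```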
- Over-reliance / not recognizing the limitations of analytics
- Another limitation of legal analytics is over-reliance on it. Historical trends may make for a poor probability calculation, and related to this are the potential biases that legal analytics incorporates. Would legal analytics have discouraged a Brown v. Board of Education type of case?
- Analytics products are not consistent. A 2019 test found that none of the products tested returned the same results, and each missed certain results for different reasons.
- See Zach Warren, Law Librarians Push for Analytics Tools Improvement After Comparative Study, Law.com (July 15, 2019, 12:52 PM), available on Lexis.
- "Franken-algorithms"
- Algorithms can be a black box, and competing algorithms can interact in ways that human beings cannot predict. Examples have emerged of competing pieces of algorithmic pricing software interacting in unexpected ways and producing unpredictable outcomes, as illustrated in the sketch after this list. See Andrew Smith, Franken-algorithms: The Deadly Consequences of Unpredictable Code, The Guardian (Aug. 30, 2018, 01.00 EDT), https://www.theguardian.com/technology/2018/aug/29/coding-algorithms-frankenalgos-program-danger [https://perma.cc/UTT9-ZW87].
- One example is the "flash crash" seen in stock market trading, where interacting trading algorithms can trigger a sudden, severe price swing.
- A software engineer and a Carnegie Mellon University professor, working on a case against Toyota, testified that the Toyota braking problem was caused by tangled software. The jury found against Toyota, and Toyota settled before punitive damages could be considered.
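A minimal sketch of two automated pricing rules interacting, in the spirit of the pricing-software examples above. The sellers, multipliers, and starting prices are hypothetical, chosen only to show how two individually reasonable rules can combine into a runaway feedback loop neither party designed.

```python
# Hypothetical sellers: A always undercuts B slightly; B always charges a
# premium over A. Neither rule is irrational on its own.
price_a = 20.00
price_b = 25.00

for day in range(1, 11):
    price_a = round(price_b * 0.99, 2)   # rule A: undercut B by 1%
    price_b = round(price_a * 1.25, 2)   # rule B: charge 25% more than A
    print(f"day {day:2d}: A=${price_a:,.2f}  B=${price_b:,.2f}")

# Each cycle multiplies both prices by roughly 1.24, so they grow without
# bound -- an outcome neither "algorithm" was built to produce.
```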