Of Monsters, Men, and Lawyers

There’s been a very high-volume thread this weekend on Bluesky about whether an LLM can do basic lawyering. I’ve mostly avoided it (because it was my birthday and I had more joyful things to do) but I came across an after-effect of it this morning.

“I was trying to get it to work on her terms, which required not telling it the relevant distinction explicitly and relying on the robot to infer it” link

I haven’t read the thread, so I don’t want to make it about these specific Bluesky sparring partners or the specifics of this one particular test. In particular, I don’t know if David is misrepresenting the challenge as laid out. But I have seen other smart law-folks make similar mistakes to the one David reports here, so I want to write down my thoughts.

# Generalists, not specialists

Here’s the thing: LLMs are not Brilliant Lawyers. No one should be claiming that. Law is a specialty, and LLMs are generalist tools. (There are some important caveats, but I’ll get to those later.)

Think less "first-year attorney" and more "English-major undergrad with a philosophy minor who has read a lot of true crime". They have heard our magic words, and they can fake it well enough to go as a character from Suits for Halloween, but they don't quite know what the magic words mean.

In other words, relying on an untrained, generalist LLM to infer specialized information you haven't provided is a bit of a fool's errand. Like quizzing an undergrad who has watched a lot of courtroom TV, it tells you very little about whether LLMs can be useful in a legal practice.

# Very teachable generalists

But like an undergrad, they are also pretty easy to instruct. And unlike an undergrad, they are pretty reliable (and infinitely patient) once instructed.

In other words, asking an LLM to “do X” is often going to fail, or only work superficially.

Instructing the LLM that “X consists of (verbose, detailed definitions and instructions), now do X” is often going to work very well.
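The difference can be sketched in a few lines of Python. The task ("review the assignment clauses") and the `build_*` helpers below are purely hypothetical illustrations, not any real product's prompting; the point is only the shape of the two prompts.

```python
def build_naive_prompt(document: str) -> str:
    # "Do X": leaves the model to infer what counts as an
    # assignment clause and what matters about one.
    return f"Review the assignment clauses in this contract:\n\n{document}"


def build_instructed_prompt(document: str) -> str:
    # "X consists of ..., now do X": the specialized knowledge is
    # spelled out before the request, so nothing is left to inference.
    instructions = (
        "You are reviewing a contract's assignment clauses. "
        "'Assignment' here means the transfer of a party's rights or "
        "obligations to a third party. For each clause, flag: "
        "(1) whether assignment requires the other party's consent, "
        "(2) whether consent may be unreasonably withheld, and "
        "(3) any carve-outs for mergers or changes of control. "
        "Quote the clause text supporting each flag."
    )
    return f"{instructions}\n\nContract:\n\n{document}"
```

Both functions produce a string you could hand to any chat model; the second one simply refuses to rely on the robot's powers of inference.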

# Implications

So what does this mean? Some time-bounded thoughts and takeaways:

  1. Do not waste anyone’s time having an online argument about “can a general-purpose LLM do X” and then ask for the analysis to be done with one virtual arm tied behind its back. The discussion must start with “how often is this task done, and how much time would it take to codify the relevant knowledge and practice?”
  2. For many legal tasks, the answer to the question in (1) is going to be “it does not make sense to do this with an LLM, because this work is rarely done, so codifying it will take longer than doing it myself”. (Part of why I avoided the thread that prompted this blog post is that it concerns a task I’ve never once been asked to do in my 15+ year legal career, so it’s impossible for me to answer this question.)
  3. Programmers can be very bad about (1) because we tend to underestimate how much LLM-based coding tools have had massive investment in what are coming to be called “harnesses”: complex combinations of tooling+prompting+context that wrap the generalist LLM in a tight cocoon of specific knowledge about coding and codebases. So programmers tend to say “LLMs are great at coding” when what they really mean is “LLMs plus extremely extensive harnesses are great at coding”. This is an extremely important difference, and one that misleads non-programmers (and again, many programmers) into thinking that “LLMs” will be great at other fields without the big investment in harnesses.
  4. Programmers are also bad about (1) because the biggest pot of money in the long history of “selling pickaxes to miners” is at stake, so when I say “big investment” I mean literally hundreds of billions of dollars. Not yet so much in lawyer harnesses. Both developers and lawyers having this discussion need to remember that.
  5. Lawyers cannot rest on our laurels. Our pot of money is still pretty big, so there will be broad and deep legal harnesses, most likely sooner rather than later. Those harnesses will have carefully crafted instructions and context, and will absolutely crush general-purpose tools + naive lawyer-drafted instructions. We’re not there yet, and I’m pretty skeptical that the existing duopoly will be the ones to produce those harnesses. (Harvey and Clio are of course both frantically trying to do this, but I haven’t been able to test them much yet.) But it absolutely will come, and thanks to aggregation theory, it is going to come for your niche no matter how niche you think it is.
  6. (5) is going to be rude to a lot of lawyers, because we tend to think our way of doing things is the One True Way when in fact there are a lot of ways to do law. The process of writing down previously unwritten things is going to cause a lot of discomfort.
  7. Related to the previous point, the most adamant lawyer I’ve seen on this (not Kathryn) is a 99th-percentile expert in their field. (Some of the most critical programmers are too!) If your standard for LLM evaluation is “it has to be as good as a 99th-percentile expert, or it doesn’t count”, you’re going to be shocked when 50th-percentile experts love it and 95th-percentile experts begrudgingly start using it.
  8. There’s probably a lot more to be said here, but like I said at the beginning: time-bounded. I have a day job! A family! I like them both!
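The “harness” idea from (3) can be sketched in miniature: a wrapper that assembles codified domain knowledge and retrieved context around the user’s bare request before the generalist model ever sees it. Everything here (the playbook, the keyword retrieval) is a hypothetical illustration, not any vendor’s actual architecture.

```python
from dataclasses import dataclass, field


@dataclass
class Harness:
    """Hypothetical miniature of a domain harness: codified instructions
    plus retrieved context wrapped around a bare user request."""
    playbook: dict          # task name -> codified, verbose instructions
    context_store: list = field(default_factory=list)  # e.g. firm precedents

    def retrieve(self, request: str) -> list:
        # Stand-in for real retrieval: naive keyword overlap with the request.
        words = set(request.lower().split())
        return [c for c in self.context_store
                if words & set(c.lower().split())]

    def assemble_prompt(self, task: str, request: str) -> str:
        # The generalist model only ever sees this enriched prompt,
        # never the raw "do X" request on its own.
        instructions = self.playbook.get(task, "")
        context = "\n".join(self.retrieve(request))
        return (f"{instructions}\n\nRelevant context:\n{context}"
                f"\n\nTask:\n{request}")
```

The coding tools that impress programmers are, in effect, very elaborate versions of `assemble_prompt`: the model is generalist, but the cocoon around it is anything but.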

# Conclusion

It really, really pains me to see debates about “is LLM good at X” that rest on “raw” LLM usage, because that is not how anyone in any profitable field is going to use LLMs.

If we’re going to come to real terms with what LLMs mean for our industries (all of them, not just programming and law), we must stop asking questions of naive chatbots and start thinking about what sophisticated, carefully crafted tools (ones that use LLMs as part of their pipeline but are also carefully trained and contextualized) will do.