Episode 36 – Human in the Loop: Why Legal AI Still Needs a Lawyer at the Controls


A partner at a mid-sized firm asks an AI tool to review a supply agreement. In less than two minutes, the tool flags three problematic clauses. The partner accepts the output, forwards it to the client, and moves on. Two weeks later, a dispute arises over a fourth clause, the one the AI missed completely. It was buried in a cross-reference. The AI did not follow the chain. The partner did not verify it.

Unfortunately, this is not a hypothetical anymore. Versions of this scenario are already emerging across law firms and in-house legal teams, and they all reveal the same weakness: AI is being deployed without a clear operational model for human oversight. Think that sounds overstated? One public database tracking AI hallucinations in court decisions has already collected more than 1,000 instances worldwide, and the number keeps growing.

The legal profession talks constantly about keeping a “human in the loop.” But invoking the phrase is easy. Translating it into daily legal practice is much harder. And there is another question, just as important, that many teams still avoid: at what point should the human actually step in?

What “Human in the Loop” Actually Means in Legal Work

Human in the loop (HITL) is a design principle. It means that a qualified person reviews, validates, and takes responsibility for the output of an AI system before that output produces any consequence. In legal work, this translates to a specific commitment: no AI-generated advice, document, or analysis leaves the firm or department without a lawyer having actively engaged with it.

The key word is actively. Glancing at an AI-drafted summary before forwarding it does not count as oversight. Running an AI contract review and hitting “approve” without reading the flagged clauses does not count either. HITL requires deliberate attention. It means treating AI output as a first draft that demands professional judgment before becoming work product.

The Real Question: When Does the Human Step In?

Most discussions about HITL focus on whether a human should be involved. That question has a straightforward answer: yes. The harder, more practical question is when in the process the human should intervene. Getting the timing wrong can make oversight either useless or suffocating.

Consider three different moments in the process, each calling for a different kind of human involvement:

  • Before the AI runs (input stage). The human defines what the AI should do, sets the parameters, and selects the right tool for the task. This is the stage where a lawyer decides whether a matter is suitable for AI assistance at all. A routine NDA review? Likely a good fit. A high-stakes regulatory opinion? Probably requires a different approach, with the AI limited to research support rather than drafting.
  • During execution (monitoring stage). For longer AI processes, like e-discovery or large-scale contract analysis, the human checks intermediate results rather than waiting for a final output. Sampling a subset of flagged documents midway through a review can catch systematic errors before they propagate across thousands of files. A minimal sketch of that kind of spot check follows this list.
  • After the AI delivers output (validation stage). This is the most common form of HITL, and the one most teams default to. The AI produces something; the lawyer reviews it. But even here, the depth of review should match the risk of the task. A routine summary of case law needs a different level of scrutiny than an AI-drafted clause in a cross-border M&A agreement.
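As a rough illustration of that mid-run spot check, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Doc` structure, the `midrun_spot_check` name, the 5% sample rate, and the 10% disagreement threshold are placeholders, not a reference implementation.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Doc:
    doc_id: str
    ai_flagged: bool                       # the AI's call on this document
    lawyer_flagged: Optional[bool] = None  # filled in only for sampled docs

def midrun_spot_check(
    queue: list[Doc],
    review_fn: Callable[[Doc], bool],  # the human reviewer's judgment
    sample_rate: float = 0.05,         # check 5% of the queue midway
    max_disagreement: float = 0.10,    # pause the run above 10% disagreement
) -> bool:
    """Sample part of the queue mid-run and compare AI flags to human calls.

    Returns True if the run may continue, False if the disagreement rate
    suggests a systematic error that should be fixed before review proceeds.
    """
    sample = random.sample(queue, max(1, int(len(queue) * sample_rate)))
    disagreements = 0
    for doc in sample:
        doc.lawyer_flagged = review_fn(doc)  # the human checkpoint
        if doc.lawyer_flagged != doc.ai_flagged:
            disagreements += 1
    return disagreements / len(sample) <= max_disagreement
```

The exact thresholds are a judgment call for each matter; what matters is that the check happens mid-run, while a systematic error is still cheap to correct.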

The point is that HITL should not be a single checkpoint at the end of a pipeline. For most legal workflows, the right answer involves human involvement at multiple stages, with the intensity calibrated to the risk profile of each matter.
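To make that calibration concrete, here is a minimal sketch of a risk-tiered checkpoint policy, again in Python. The three tiers, the checkpoint labels, and the review depths are illustrative assumptions, not a standard.

```python
from enum import Enum

class Checkpoint(Enum):
    INPUT = "human scopes the task and selects the tool"
    MONITOR = "human samples intermediate results mid-run"
    VALIDATE = "human reviews the final output"

# Illustrative policy: which checkpoints apply, and how deep the final
# review goes, at each risk tier.
HITL_POLICY = {
    "routine": ([Checkpoint.INPUT, Checkpoint.VALIDATE],
                "spot-check the output"),
    "standard": ([Checkpoint.INPUT, Checkpoint.VALIDATE],
                 "read every flagged item in full"),
    "high_stakes": ([Checkpoint.INPUT, Checkpoint.MONITOR, Checkpoint.VALIDATE],
                    "full line-by-line review"),
}

def policy_for(risk_tier: str):
    """Unknown tiers default to the strictest oversight, not the loosest."""
    return HITL_POLICY.get(risk_tier, HITL_POLICY["high_stakes"])
```

The useful property here is the default: a matter that nobody has classified gets the strictest treatment, not the loosest.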

Five Practical Steps to Build a Real HITL Workflow

If your firm or legal department is using AI tools (or planning to), here is a concrete framework for putting human oversight into practice.

  • Map where AI acts and where humans decide, including when. For every process where AI is involved, specify what the AI does (draft, flag, summarise, classify) and what the human does (review, approve, edit, reject). Then add a third dimension: at which stage does each human decision point occur? Before execution, during, after, or at multiple stages? If you cannot clearly answer all three questions (what the AI does, what the human does, and when the human acts), the process needs more design work before going live.
  • Set review protocols, not just permissions. Giving a lawyer access to an AI tool is a separate matter from giving them a review protocol. Define what a proper review looks like for each task. For AI-assisted contract analysis, that might mean: read every flagged clause in full, check at least three unflagged sections at random, and verify all cross-references against the source document. Write it down. Make it part of the workflow rather than an informal expectation.
  • Train for AI-specific failure modes. AI does not make the same mistakes humans make. It can produce confident, well-structured text that is factually wrong. It can miss context that would be obvious to a first-year associate. Training lawyers to review AI output means teaching them where these tools fail, and when to be most skeptical.
  • Build audit trails from day one. Every AI-assisted task should leave a record: what tool was used, what input was provided, what output was generated, who reviewed it, what changes were made, and at which stage the review happened. When something goes wrong (and it will), the question will be: did the firm have a reasonable process in place? A clear audit trail is the best answer to that question. A minimal sketch of such a record follows this list.
  • Review the review process itself. AI tools change. Models get updated. New features appear. The review process needs a review cycle of its own, quarterly at a minimum. Track error rates. Collect feedback from the lawyers doing the reviewing. Adjust the protocols as the technology and the use cases change. Firms and legal departments that treat HITL as a living system, rather than a checkbox, will maintain both quality and trust over time.
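As a sketch of the kind of record the audit-trail step describes, here is one possible shape in Python. The field names and structure are illustrative assumptions; any real schema would follow the firm's own matter-management conventions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class AIAuditRecord:
    """One entry per AI-assisted task, mirroring the questions above."""
    matter_id: str      # which matter the task belongs to
    tool: str           # what tool was used, including its version
    prompt: str         # what input was provided
    output_ref: str     # pointer to the stored output, not the text itself
    reviewer: str       # who reviewed it
    review_stage: str   # "input", "monitoring", or "validation"
    changes_made: str   # what the reviewer edited or rejected, and why
    reviewed_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
```

Two small choices matter here: the record stores a pointer to the output rather than the output itself, since the text may contain client data that belongs in the document management system, and the record is frozen, because an audit trail that can be edited after the fact is not much of an audit trail.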

Conclusion

There is a temptation to see human oversight as a drag on efficiency, something that slows the machine down. That framing misses the point entirely.

The firms and departments that build robust HITL practices will be trusted with higher-value work, by clients who understand that AI alone does not deliver the judgment they need. A lawyer who can explain what the AI found, why a specific clause matters, what the AI might have missed, and what the strategic implications are: that lawyer is worth more in an AI-enabled market.

Oversight, done right and at the right time, is the reason AI works in legal practice.

At Better Ipsum, we help law firms and corporate legal departments design AI implementation strategies that keep human judgment at the centre. If you are building your HITL framework or rethinking how your team works with AI, get in touch.
