
Human-in-the-Loop for Legal AI: Why Attorney Review Isn't Just a Compliance Checkbox
Bar associations, malpractice insurers, and courts are all asking the same question: who is responsible when AI is wrong? The answer requires meaningful attorney-in-the-loop oversight.
In May 2023, two attorneys filed a brief in a federal court in New York that cited six cases. None of the cases existed. They had been generated by ChatGPT, which produced plausible-sounding citations — complete with case names, reporters, and page numbers — that corresponded to nothing in Westlaw or Lexis. The attorneys told the court they had not verified the citations independently, assuming the AI's output was reliable. The court sanctioned both attorneys.
That case was widely covered as a technology failure. It wasn't. It was a human-in-the-loop (HITL) failure. The AI did what generative AI does when asked to cite case law without retrieval-augmented generation: it confabulated. The attorneys had a professional obligation to verify the citations. They didn't. The human-in-the-loop process that the law requires — and that malpractice insurance and bar ethics rules have always required — was absent.
The technology changed. The professional obligation didn't.
What Bar Association Guidance Actually Says
By 2026, every major bar association in the US has issued formal guidance on attorney use of AI. The language differs. The substance doesn't.
ABA Model Rule 1.1 requires attorneys to maintain competence, which the ABA Standing Committee on Ethics and Professional Responsibility has explicitly extended to include understanding AI tools used in representation — including their limitations and failure modes.
ABA Model Rule 5.1 holds supervising attorneys responsible for the work of subordinates. The ABA's 2024 Formal Opinion 512 clarified that AI tools are not subordinates — they are tools — but the work they produce is the attorney's work. Supervision responsibility is not discharged by running an AI and reviewing the output the way a partner skims an associate's memo. The attorney must be in a position to validate the legal reasoning, not just the formatting.
ABA Model Rule 5.3 applies to non-lawyer assistance. While AI is not a non-lawyer in the statutory sense, the Committee has interpreted the rule's requirements around supervision to apply to AI-generated work product. The attorney cannot disclaim responsibility for AI output used in a matter.
ABA Model Rule 1.6 (confidentiality) has additional implications for cloud-based AI: client data cannot be disclosed to third parties without informed consent, and uploading confidential documents to a third-party AI service may constitute disclosure.
The practical upshot: an attorney who uses AI and doesn't verify the output independently has not discharged their professional responsibility. "The AI told me" is not a defense in a disciplinary proceeding.
Three Cases Where HITL Failure Created Liability
1. Hallucinated Citations Filed in Court
The 2023 New York case was the first widely publicized instance, but not the last. Since then, attorneys in California, Texas, and multiple federal districts have faced sanctions for filing AI-generated documents containing fabricated citations, non-existent statutes, or misquoted holdings.
In each case, the AI output passed a superficial review: the citations looked real, the format was correct, the language was confident. Meaningful HITL — independent Westlaw verification of every cited authority — would have caught every error. None of the attorneys performed that check.
2. AI-Drafted Contracts with Missing Provisions
A mid-size private equity firm used an AI contract drafting tool to generate a series of portfolio company operating agreements. The AI reliably included standard provisions. It omitted — consistently, across 14 documents — a specific drag-along rights provision that was non-standard but required by the fund's LPA.
The omission wasn't caught at signing. It surfaced during a portfolio company acquisition two years later when the provision needed to be enforced. The missing clause cost the fund the ability to force minority shareholder consent on the exit. The malpractice carrier paid. The firm lost the client.
A human attorney with deal experience reviewing the agreement for completeness — not just correctness of what was there — would have caught the gap. The review that happened validated what the AI produced, not whether it was complete.
3. AI-Generated Privilege Logs That Miscategorized Documents
A large document review in a commercial litigation matter used AI to categorize 400,000 documents for privilege. The AI was trained on a general corpus; it had no knowledge of the specific privilege relationships in this matter — that certain company executives had retained outside counsel for a parallel investigation, and communications with those attorneys were privileged.
The AI categorized those communications as non-privileged. 847 documents were produced to opposing counsel. The privilege waiver argument that followed consumed six months of motion practice. The producing party ultimately prevailed on inadvertent disclosure grounds — but the cost of the mistake far exceeded the cost of a properly structured HITL review of the privilege log.
What Meaningful Attorney Review Looks Like
There is a difference between reviewing AI output and independently verifying it.
Reviewing means reading what the AI produced and assessing whether it looks right. This is what the sanctioned attorneys in the citation cases did. The citations looked right.
Independently verifying means checking each factual or legal claim against an authoritative source. For legal citations, it means running each case in Westlaw or Lexis. For contract provisions, it means comparing against a checklist of required terms for this deal type. For legal arguments, it means independently assessing whether the cited authorities actually support the propositions cited.
The standard for HITL in legal practice is independent verification, not review. The difference is not semantic. A human who reviews AI output and a human who independently verifies it will catch different errors at very different rates.
Practice Areas Where the Stakes Are Highest
Criminal defense: A public defender using AI to draft motions is working in a context where errors directly affect a person's liberty. Ineffective assistance of counsel claims now routinely include questions about whether AI-generated legal arguments were independently validated.
M&A due diligence: AI tools that summarize contract schedules, flag material adverse change provisions, and identify missing reps and warranties are useful — and dangerous if the attorney relies on the summary rather than the document. Acquisition agreements contain provisions where the difference between "material" and "materially adverse" can be a $50M indemnification dispute.
Immigration filings: I-485, asylum applications, and visa petitions contain factual questions where incorrect AI-generated answers — even minor inaccuracies — can result in application denial, removal proceedings, or bars on future immigration benefits. The harm is often irreversible and affects the client's ability to remain in the country.
The Privilege Problem
Attorney-client privilege protects confidential communications between a client and their attorney made for the purpose of obtaining legal advice. The related work product doctrine protects the attorney's work product — documents prepared by or for the attorney in anticipation of litigation, including the attorney's mental impressions and legal theories.
When an AI makes an independent legal judgment — analyzes documents, synthesizes legal standards, recommends a course of action — and the attorney simply adopts the output without independent analysis, a privilege question arises. Whose mental impressions are reflected in the work product? If the attorney cannot explain the reasoning in the document because the AI generated it and they didn't follow the reasoning — only the conclusion — the work product doctrine's core protection (the attorney's mental impressions and legal theories) may not apply.
Courts have not fully resolved this question. Some have declined to extend work product protection to AI-generated analysis that the attorney adopted wholesale without independent analysis. The safe position is clear: the attorney's judgment, not the AI's, must be the one reflected in any document that needs privilege protection.
Document Review: How HITL Should Work
Large-scale AI-assisted document review is one of the most mature and defensible HITL applications in law — when done correctly.
A defensible AI-assisted review process includes:
- Attorney-designed classification framework: The categories and definitions are set by an attorney with matter expertise, not derived from the AI's general training.
- Training set validation: The attorney reviews and validates the seed set used to train the classifier on this matter's specific documents.
- Statistical sampling: The attorney reviews random samples from each category — including predicted non-relevant — to validate the model's performance before production.
- Continuous calibration: As documents are reviewed, the attorney spot-checks the AI's classifications at regular intervals. If error rates exceed threshold, the model is retrained.
- Attorney sign-off on methodology: The attorney certifies the review methodology, not just the output. This is what courts and opposing counsel review when challenging document production.
None of these steps require the attorney to review every document. They do require the attorney to own the process.
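The statistical sampling and continuous calibration steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions — `sample_for_review`, `error_rate`, and the 5% retraining threshold are hypothetical choices for the example, not a prescribed methodology.

```python
import random

def sample_for_review(doc_ids, sample_size, seed=0):
    """Draw a reproducible random sample from one predicted category
    (including predicted non-relevant) for attorney spot-checking.
    A fixed seed makes the sample auditable if the methodology is challenged."""
    rng = random.Random(seed)
    return rng.sample(list(doc_ids), min(sample_size, len(doc_ids)))

def error_rate(attorney_labels, model_labels):
    """Observed disagreement rate between attorney judgments and model
    predictions on the reviewed sample."""
    assert len(attorney_labels) == len(model_labels)
    disagreements = sum(a != m for a, m in zip(attorney_labels, model_labels))
    return disagreements / len(attorney_labels)

def needs_retraining(attorney_labels, model_labels, threshold=0.05):
    """Flag the classifier for retraining when the sampled error rate
    exceeds the threshold the attorney set for this matter."""
    return error_rate(attorney_labels, model_labels) > threshold
```

The design point is that the attorney owns the parameters — sample size, which categories get sampled, and the acceptable error threshold — while the sampling itself stays reproducible for the methodology certification.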
The Ertas Angle
Legal documents are among the most sensitive data in any organization. Uploading them to a cloud AI service for labeling, summarization, or training data preparation is often ethically impermissible under Rule 1.6, practically dangerous from a privilege standpoint, and sometimes contractually prohibited by client engagement letters.
Ertas Data Suite runs on-premise. Legal document labeling, classification, and training data preparation happens within your security perimeter. Domain experts — which in a law firm context means attorneys with matter expertise — annotate and validate data directly in the tool. Every action is logged with timestamps and reviewer identity. Nothing leaves the building.
For law firms and legal departments building AI tools for document review, contract analysis, or regulatory research, the data preparation pipeline needs to satisfy the same confidentiality requirements as the deployed system. Ertas is built for that requirement.
For the foundational framework on HITL in enterprise AI, see What Is Human-in-the-Loop AI?. For context on why law firms approach third-party AI services with particular caution, see our coverage of legal AI confidentiality requirements.
Attorney review of AI output isn't a compliance checkbox. It's the professional and ethical floor. The bar associations, the courts, and the malpractice carriers have all made their position clear. The question for your firm or legal department is whether your AI workflows are designed to support genuine attorney oversight — or designed to create the appearance of it.