Agentic drafting

Designing trust and intervention for autonomous legal document generation

Supio · Lead Product Designer · Ongoing · Product

B2B · Legal AI · Agentic interface

Overview

Legal documents carry real consequences. Every citation, every clause, every formatting choice reflects a firm's credibility. I design agentic systems that let legal professionals direct autonomous document generation, shaping output through conversation rather than assembling it by hand.

Supio is an AI-powered legal platform. Attorneys and paralegals use it to generate complex legal documents: demand letters, deposition outlines, medical summaries, and expert disclosures, all grounded in current case data and formatted to match firm-specific conventions. I joined as lead product designer on the drafting intelligence team.

I designed the end-to-end generation-to-delivery system. This included conversational document generation with guided blueprints, a diff-based editing layer modeled on legal redlines, and an agentic architecture that resolved retrieval accuracy problems at scale.

Visuals shown here are simplified representations of production systems. Additional detail can be discussed in interviews.

System overview

The core problem was not generating legal documents.

It was designing a system where legal professionals could direct, trust, and iteratively refine autonomous generation across firm-specific standards, variable case complexity, and documents where a single hallucinated citation carries real legal risk.

In practice, this meant building an end-to-end flow where users control the inputs, observe the outputs through familiar interaction patterns, and intervene at predictable points without needing to verify every sentence.

Problem space

Legal document generation operates under constraints that most AI products never encounter. Every citation must trace back to real case data. Every paragraph must match a firm's voice, structure, and formatting conventions. Long documents amplify hallucination risk at exactly the points where accuracy matters most.

Case data varies enormously in volume and quality. A single personal injury case might contain thousands of medical records, police reports, and correspondence. The system must extract the right data, apply it to the right document structure, and produce output that a paralegal can refine rather than rebuild.

• Accuracy must be verifiable through citations tied to timeline events and case data

• Firm-specific style matching across document types with no universal template

• Large case data volumes create retrieval noise and hallucination risk

• Legal professionals expect redline-level change tracking

• Output must be exportable and court-ready on first run

Conversational generation

The generation system translates natural language direction into firm-formatted legal documents grounded in case data. The design challenge was balancing guided structure with user flexibility.

Early versions used a modal flow. Users selected a document type, provided two examples from a previous case, and the system generated a new document in the same style. This built valuable document intelligence and classification data, but it failed in practice. Firms had more preferences and context than a locked-in modal could accommodate. Users couldn't add nuance mid-flow or reference additional source documents, and the process felt too rigid for the variety of real legal work.

I redesigned generation as a conversational interface. Document blueprints, informed by what the modal taught about successful generation patterns, guide users toward good inputs. Chat lets them provide additional context, reference knowledge base examples, or upload materials to match. Users direct the system toward exactly what they need rather than choosing from predefined paths.

The agentic model drew inspiration from how frontier AI labs like Anthropic design agent interactions: give the operator clear controls, make the system's reasoning observable, and let humans direct rather than supervise. The modal worked for the system but failed for the operator. Legal professionals don't think in rigid classifications. They think in context, precedent, and nuance. The conversational model matched how they already reason about documents.

Users direct document generation through conversation, iterating on style, structure, and content before the system produces a firm-formatted artifact ready for review.

User inputs flow through a generation engine that retrieves scoped case data, matches firm style from knowledge base examples, and produces a document with trust signals for human review before export.

Diff-based editing and trust calibration

The editing system lets users refine generated documents through agentic requests, either by selecting text or by chatting against the document. The design challenge was making AI-generated changes legible and trustworthy in a domain where every word matters.

I designed changes to surface in a diff experience: inline strikethrough for removals, underlined text for additions, with summary banners describing what changed and why. For comprehensive edits, the system breaks large requests into chunked summaries so users can cherry-pick section-level changes. Legal professionals live in redlines. Microsoft Word's Track Changes is their native interaction pattern for reviewing modifications. Building the editing layer around accept and reject diffs mapped to an existing cognitive model and reduced the learning curve to near zero.
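The accept and reject model can be illustrated with a minimal sketch. It is not Supio's implementation: it assumes a word-level diff built with Python's standard `difflib`, and the names `Change`, `propose_changes`, `render_redline`, and `apply_decisions` are hypothetical. Bracket markers stand in for the strikethrough and underline styling.

```python
from __future__ import annotations

from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Change:
    """One reviewable edit: the words it removes and the words it adds."""
    removed: list[str]
    added: list[str]


def propose_changes(original: str, revised: str) -> list[Change | str]:
    """Split a revision into unchanged words (str) and Change objects."""
    old, new = original.split(), revised.split()
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    segments: list[Change | str] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            segments.extend(old[i1:i2])
        else:  # "replace", "delete", or "insert"
            segments.append(Change(removed=old[i1:i2], added=new[j1:j2]))
    return segments


def render_redline(segments: list[Change | str]) -> str:
    """Render removals as [-...-] and additions as {+...+}."""
    parts: list[str] = []
    for seg in segments:
        if isinstance(seg, str):
            parts.append(seg)
            continue
        if seg.removed:
            parts.append("[-" + " ".join(seg.removed) + "-]")
        if seg.added:
            parts.append("{+" + " ".join(seg.added) + "+}")
    return " ".join(parts)


def apply_decisions(segments: list[Change | str], accept: list[bool]) -> str:
    """Build the final text from one accept/reject decision per Change, in order."""
    decisions = iter(accept)
    words: list[str] = []
    for seg in segments:
        if isinstance(seg, str):
            words.append(seg)
        else:
            words.extend(seg.added if next(decisions) else seg.removed)
    return " ".join(words)


segments = propose_changes(
    "The plaintiff suffered minor injuries in the collision",
    "The plaintiff suffered severe injuries in the crash",
)
print(render_redline(segments))
# The plaintiff suffered [-minor-] {+severe+} injuries in the [-collision-] {+crash+}
print(apply_decisions(segments, [True, False]))
# The plaintiff suffered severe injuries in the collision
```

Keeping each change as its own object is what makes cherry-picking possible: a reviewer can accept one edit and reject the next without re-running generation.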

Early on, large case data volumes caused the system to generate content that couldn't be traced back to source material. This eroded trust quickly. Users who encountered hallucinated citations became hesitant to use generation at all. The initial approach was targeted source selection, which let users narrow which case materials the system drew from before generation began. This helped, but the real resolution came from the agentic architecture itself. Because generation is orchestrated into discrete, scoped steps, each agent works with a targeted subset of case data rather than retrieving across the full volume. This keeps retrieval focused and reduces hallucination risk. Source selection became unnecessary once the agents could scope data reliably on their own.
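A rough sketch of the scoping idea follows. It assumes each drafting step declares which kinds of case records it may read; the record kinds, section names, and the helpers `scope_records`, `run_plan`, and `untraceable_citations` are all hypothetical, and a production system would likely scope with retrieval rather than a simple type filter.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CaseRecord:
    record_id: str
    kind: str  # hypothetical labels, e.g. "medical", "police_report"
    text: str


@dataclass(frozen=True)
class DraftingStep:
    section: str
    allowed_kinds: frozenset[str]


def scope_records(step: DraftingStep, records: list[CaseRecord]) -> list[CaseRecord]:
    """Return only the records this step is allowed to draw from."""
    return [r for r in records if r.kind in step.allowed_kinds]


def run_plan(plan: list[DraftingStep], records: list[CaseRecord]) -> dict[str, list[str]]:
    """Map each section to the ids of the records its step could see."""
    return {
        step.section: [r.record_id for r in scope_records(step, records)]
        for step in plan
    }


def untraceable_citations(cited_ids: list[str], scoped: list[CaseRecord]) -> list[str]:
    """Return cited ids that do not appear in the step's scoped records."""
    known = {r.record_id for r in scoped}
    return [cid for cid in cited_ids if cid not in known]


records = [
    CaseRecord("r1", "medical", "ER visit summary"),
    CaseRecord("r2", "police_report", "Collision report"),
    CaseRecord("r3", "correspondence", "Letter from insurer"),
    CaseRecord("r4", "medical", "Physical therapy notes"),
]
plan = [
    DraftingStep("Injuries and treatment", frozenset({"medical"})),
    DraftingStep("Liability", frozenset({"police_report", "correspondence"})),
]
print(run_plan(plan, records))
```

Under these assumptions, a citation to `r2` inside the "Injuries and treatment" section would be flagged by `untraceable_citations`, because that step never had access to the police report.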

Knowledge base as style reference

The Knowledge Base is where firms store their strongest work product as drafting references. When a user generates through Agentic Drafting, the system pulls from these examples to match firm voice, structure, and formatting. This is the difference between a usable first draft and output that gets rebuilt from scratch.

Scope of role

I led product design end-to-end: research, prototyping, and iteration across the full generation-to-delivery workflow. I worked cross-functionally with Supio's legal solutions architect and a sales team member with a paralegal background to shape the experience around real legal workflows and firm expectations.

Outcome

Supio is a Series B company at $7M MRR and growing rapidly. Agentic Drafting achieved a 20% increase in first-draft success rate and became one of the platform's most important products, particularly loved by paralegal teams for its ease of use. Customer success teams report strong demand for expanding generation to more complex document types.

This case study covers the systems-level design. Product walkthroughs and specific design decisions are available on request.