Tutorial

From Meeting Transcript to BPMN: A Worked Example of the Document-to-Diagram Flow

The document-to-BPMN flow is easier to understand with a worked example than with a feature list. Here is the full sequence: transcript in, clarifications answered, diagram refined, for a typical stakeholder interview on a typical mid-market process.

March 31, 2026 · 7 min read

The starting point: a typical stakeholder interview transcript

The worked example starts with a transcript of a single 45-minute stakeholder interview. The subject is a 'new client onboarding' process at a mid-market professional services firm: five roles involved, roughly 20 steps, no existing BPMN. The interviewee is the operations manager, and the transcript runs about 5,000 words. Descriptions are natural-language, not step-numbered; the operations manager described the process the way they would describe it to a new hire.

This is exactly the kind of source material that document-to-BPMN is built for. The transcript contains all the information a trained analyst would need to produce a BPMN diagram: actors are named, tasks are described as activities, and decision points are stated as conditions. That information, however, is distributed across 5,000 words of prose rather than packaged in a structured template. The job of the document-to-BPMN flow is to extract the structure automatically.

Steps 1 and 2: upload and the clarifying questions

The transcript goes into the upload panel as plain text. Within 40 seconds, the AI's analysis is complete and the clarification dialog opens. For this specific transcript, the AI produced four questions.

Detail level (always the first question): Detailed, Balanced, or Summary. The worked example picks Balanced: appropriate for an executive readout, not too granular for the 20-task process.
Actor disambiguation: 'The transcript mentions both Sales and Business Development: are these the same role or two distinct actors?' This question is generated because the operations manager used both terms. The worked example's answer: same role, use Sales.
Decision criterion: 'When the onboarding form is routed to Finance, what is the criterion? The transcript mentions both contract value and industry as factors.' Answer: contract value above $50,000 triggers Finance involvement; industry is informational only.
Intent verification: 'The transcript also contains references to a proposed future improvement: "we want to add an automatic credit check". Should this be included in the current BPMN, or mapped as a separate target-state diagram?' Answer: separate target-state diagram (this is the multi-intent classifier at work).
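The Finance routing answer above amounts to a single exclusive-gateway condition. The following is a minimal Python sketch of that rule, not the tool's internal representation: the names `OnboardingForm` and `needs_finance_review` are hypothetical, and the strict greater-than comparison is an assumption based on the wording "above $50,000".

```python
from dataclasses import dataclass

# Threshold taken from the clarification answer; all names here are hypothetical.
FINANCE_THRESHOLD_USD = 50_000


@dataclass
class OnboardingForm:
    client: str
    contract_value_usd: float
    industry: str  # informational only; it never affects routing


def needs_finance_review(form: OnboardingForm) -> bool:
    """Exclusive gateway: route to Finance only when the value exceeds the threshold."""
    return form.contract_value_usd > FINANCE_THRESHOLD_USD


small = OnboardingForm("Client A", 30_000, "retail")
large = OnboardingForm("Client B", 75_000, "retail")
print(needs_finance_review(small))  # False
print(needs_finance_review(large))  # True
```

Keeping `industry` on the form but out of the condition mirrors the answer given in the dialog: the field is recorded, but only contract value drives the gateway.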

Step 3: the first-pass diagram

With the four clarifications answered, the diagram renders in about 60 seconds. The first-pass output for this worked example has 22 tasks across 5 swimlanes (Client, Sales, Legal, Finance, Operations), 4 gateways (one for the $50k threshold, one for credit check pass/fail, two for onboarding type selection), and 2 end states (onboarded, rejected). Each task carries KPI estimates: duration in minutes, cost per execution in dollars, frequency as count-per-month.

Looking at the first-pass diagram, the operations manager spots three things that need correction. First, one task is in the wrong swimlane: 'review legal terms' was placed in Sales but should be in Legal. Second, a task is missing: 'confirm onboarding materials sent' is something Operations does but the transcript did not mention it explicitly. Third, one KPI estimate is clearly off: 'client contract signature' was estimated at 30 minutes but typically takes 3 to 5 days of calendar time (0 minutes of active work, long waiting time). These are all normal first-pass corrections.
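The contract-signature correction is easier to reason about when active time and waiting time are stored as separate fields. The sketch below shows one way to model that in Python. The `TaskKpi` class and its field names are hypothetical, and the 4-day wait is an assumed midpoint of the 3-to-5-day range mentioned above.

```python
from dataclasses import dataclass

MINUTES_PER_DAY = 24 * 60


@dataclass
class TaskKpi:
    name: str
    active_minutes: float        # hands-on work time
    wait_minutes: float = 0.0    # calendar time spent waiting

    @property
    def elapsed_minutes(self) -> float:
        """Total calendar time: active work plus waiting."""
        return self.active_minutes + self.wait_minutes


# First-pass estimate: 30 minutes, all treated as active work.
signature = TaskKpi("client contract signature", active_minutes=30)

# Analyst correction: no active work, multi-day wait (assumed 4 days).
signature.active_minutes = 0
signature.wait_minutes = 4 * MINUTES_PER_DAY

print(signature.elapsed_minutes / MINUTES_PER_DAY)  # 4.0
```

Separating the two fields matters because a cost calculation would normally be driven by active time, while cycle-time figures depend on elapsed time.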

Step 5: Integrating multi-modal inputs (whiteboards and SOPs)

In 2026, process discovery has moved beyond text-only transcripts. According to 2026 Gartner research on emerging technologies, multi-modal process discovery is now a standard capability for modern enterprises, reducing time-to-model by up to 70 percent compared to traditional manual interviews. Teams frequently run hybrid workshops where they sketch rough flows on a physical whiteboard while discussing them.

Visual grounding: The AI maps handwritten shapes and sticky notes directly to the spoken timeline in the transcript, reducing the number of clarifying questions by half.
SOP cross-referencing: Uploading an outdated Standard Operating Procedure (SOP) alongside the interview transcript allows the AI to highlight compliance gaps where the actual practice deviates from the written rule.
System log alignment: Attaching a sample database export or system log helps the AI validate the actual execution times against the subjective estimates given by the interviewee.

Step 6: what the KPI layer reveals that the transcript did not

With the diagram refined and KPIs in place, the cost dashboard and heatmap provide findings that the transcript alone could not. The Impact heatmap reveals that the Legal lane is both the slowest and the most expensive: three tasks there consume 40 percent of the total process cost. The cost dashboard shows the process costs $8,200 per onboarding and $82,000 a month at the current volume of 10 onboardings per month.
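The dashboard figures above can be checked with a few lines of arithmetic. This Python sketch uses only the numbers quoted in the paragraph; the variable names are illustrative, and applying the 40 percent Legal share to both the per-onboarding and monthly totals assumes the share is the same at both levels.

```python
# Figures quoted in the text.
COST_PER_ONBOARDING_USD = 8_200
ONBOARDINGS_PER_MONTH = 10
LEGAL_SHARE_PERCENT = 40

# Integer arithmetic keeps the results exact.
monthly_cost = COST_PER_ONBOARDING_USD * ONBOARDINGS_PER_MONTH
legal_cost_per_onboarding = COST_PER_ONBOARDING_USD * LEGAL_SHARE_PERCENT // 100
legal_cost_per_month = monthly_cost * LEGAL_SHARE_PERCENT // 100

print(monthly_cost)               # 82000
print(legal_cost_per_onboarding)  # 3280
print(legal_cost_per_month)       # 32800
```

In other words, the three Legal tasks account for roughly $3,280 of every onboarding, or $32,800 a month at the current volume.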

The operations manager's reaction to the number, an audible pause followed by 'I knew it was expensive, but I didn't realise it was that expensive', is the typical response. The gap between a stakeholder's subjective sense of cost and the calculated figure is usually a factor of two or three. Processes that 'feel expensive' are often two or three times worse than the felt sense; processes that 'feel cheap' are often two or three times more expensive than assumed. Without the KPI layer, this gap stays invisible, which is why the transcript-alone approach to process mapping historically missed the most actionable findings.

Frequently asked questions

Does the transcript need to be cleaned up or formatted before upload?

No. The document-to-BPMN flow is specifically designed for raw transcripts: automatic speech-recognition output with typos, filler words, and backtracks is handled without preprocessing. Transcripts from tools like Otter, Rev, Fireflies, and the built-in transcription of Zoom, Teams, and Google Meet all work directly. The one thing worth doing is removing timestamps and speaker labels if they are not adding value to the analysis. They do not hurt the output, but they do count against the context budget, which matters for very long transcripts.
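For very long transcripts, timestamps and speaker labels can be stripped with a short script before upload. The Python sketch below assumes lines shaped like `[00:01:23] Speaker Name: text`; real exports differ by transcription tool, so the two patterns are assumptions to adapt, and `strip_labels` is a hypothetical helper name.

```python
import re

# Assumed line shape: "[00:01:23] Speaker Name: text". Adjust per transcription tool.
TIMESTAMP = re.compile(r"^\s*\[?\d{1,2}:\d{2}(?::\d{2})?\]?\s*")
SPEAKER = re.compile(r"^[A-Z][\w .'-]{0,40}:\s+")


def strip_labels(transcript: str) -> str:
    """Remove a leading timestamp and speaker label from each line; drop empty lines."""
    cleaned = []
    for line in transcript.splitlines():
        line = TIMESTAMP.sub("", line)
        line = SPEAKER.sub("", line)
        if line.strip():
            cleaned.append(line.strip())
    return "\n".join(cleaned)


raw = (
    "[00:01:23] Ops Manager: So first, Sales sends the form.\n"
    "[00:01:40] Interviewer: And then?"
)
print(strip_labels(raw))
```

One caveat: the speaker pattern also removes any capitalised line prefix that ends in a colon (for example "Note: ..."), so it is worth spot-checking the cleaned text before uploading it.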

How long can the transcript be?

The practical upper limit is around 100 to 150 pages of dense text, set by the AI model's context window. A single 45-minute interview transcript at normal speaking rates runs 5,000 to 8,000 words, which is well within the limit. Back-to-back interviews on the same process can be combined into a single upload up to the context limit. For longer inputs (a day of workshop transcripts, multiple interviews across a week), the recommended approach is to combine the most relevant material into one upload rather than concatenating everything: the AI parses 8,000 words more accurately than 50,000 words.
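A quick word count before upload shows which side of those figures a transcript falls on. The sketch below is a rough Python pre-check: the two thresholds reuse the 8,000-word and 50,000-word figures from the answer above, the function names and advice strings are hypothetical, and the tool's real limit is set by the model's context window rather than by a word count.

```python
# Thresholds reuse figures from the text; they are not the tool's actual limits.
COMFORTABLE_WORDS = 8_000   # upper end of a single 45-minute interview
LARGE_WORDS = 50_000        # the text's example of an input that parses less accurately


def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def upload_advice(text: str) -> str:
    """Return a rough recommendation based on transcript length."""
    n = word_count(text)
    if n <= COMFORTABLE_WORDS:
        return "upload as-is"
    if n < LARGE_WORDS:
        return "consider trimming to the most relevant material"
    return "select the most relevant material before uploading"


print(upload_advice("word " * 5_000))  # upload as-is
```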

What if my transcript mixes the current process and a proposed future process?

The multi-intent classifier detects this and surfaces it as a clarifying question. The worked example above includes exactly this case: the interviewee mentioned both the current onboarding process and a proposed future addition (the credit check). The AI flagged this and asked whether to include the future improvement in the current BPMN or to map it as a separate target-state diagram. The standard practice is to map the current state first, then come back to the transcript and map the target state as a second diagram, then compare them side by side. This separation prevents the diagram from becoming an incoherent mix of what-is and what-might-be.

Can I upload multiple transcripts describing the same process from different stakeholders?

Yes, and this typically produces better results than a single transcript. The AI triangulates across the accounts and flags contradictions ('Stakeholder A says the approval step takes 2 hours, Stakeholder B says 4 hours, which is correct?'). The contradictions are where the real insight lives, because they usually surface the undocumented parts of the process that neither stakeholder on their own would have flagged. For the worked example above, adding a second stakeholder's transcript (say, the Legal team lead) would likely produce one or two additional clarifying questions and a richer diagram.
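The contradiction check described above can be pictured as grouping each stakeholder's stated figures by task and flagging the tasks where the figures disagree. The Python sketch below illustrates that idea only; it is not the tool's actual algorithm. `find_contradictions` is a hypothetical helper, and the second task in the sample data is invented for illustration.

```python
from collections import defaultdict

# (stakeholder, task, stated duration in hours). The approval-step figures come
# from the example in the text; the second task is illustrative sample data.
claims = [
    ("Stakeholder A", "approval step", 2.0),
    ("Stakeholder B", "approval step", 4.0),
    ("Stakeholder A", "send welcome materials", 0.5),
    ("Stakeholder B", "send welcome materials", 0.5),
]


def find_contradictions(claims, tolerance=0.0):
    """Return tasks whose stated durations differ by more than `tolerance` hours."""
    by_task = defaultdict(dict)
    for who, task, hours in claims:
        by_task[task][who] = hours
    return {
        task: values
        for task, values in by_task.items()
        if max(values.values()) - min(values.values()) > tolerance
    }


for task, values in find_contradictions(claims).items():
    print(task, values)
```

Each flagged task would then become a clarifying question of the kind quoted above, with the conflicting figures shown side by side.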

How accurate are the KPI estimates from a transcript?

For well-described tasks where the interviewee gave explicit numbers ('this step takes about half an hour', 'we do this twice a week'), the estimates are accurate on first pass. For tasks where the numbers are implicit or inferred from context, expect 60 to 80 percent accuracy on first pass, with the rest needing adjustment during refinement. The worked example above had one major correction: contract signature was estimated at 30 minutes of active time, when the real figure is 0 minutes of active time plus a multi-day wait. These corrections are normal and expected. The explicit editability of the KPI fields is the key feature: the AI's estimate is a starting point, and the analyst is the final authority.
