A meeting summariser is one of the most satisfying AI tools to build because the pain it removes is universal: nobody enjoys taking notes, and details slip between the call and the follow-up. The pipeline has three stages: transcribe the audio, summarise and extract action items with a structured prompt, then deliver the result where the team already works. This tutorial walks through each stage, and the generator below produces both the extraction prompt and a pipeline scaffold you can adapt.
Step 1 — Transcribe the audio
Start by turning audio into text with a speech-to-text model. Whisper is a common choice: it handles a wide range of accents and languages (accuracy varies by language), and it is available as a hosted API or as an open model you can self-host for privacy. You upload the recording and get back a transcript, ideally with timestamps.
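As a rough illustration of this step, here is a minimal sketch that assumes you self-host the open model through the `openai-whisper` Python package (with `ffmpeg` installed). The helper names `format_timestamp`, `format_transcript` and `transcribe_file` are made up for this example, and the model size `"base"` is just a placeholder to tune against your own accuracy and speed needs.

```python
from typing import Iterable, Mapping


def format_timestamp(seconds: float) -> str:
    """Render a second offset as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_transcript(segments: Iterable[Mapping]) -> str:
    """Turn segments with 'start' and 'text' keys into timestamped lines."""
    lines = []
    for seg in segments:
        lines.append(f"[{format_timestamp(seg['start'])}] {seg['text'].strip()}")
    return "\n".join(lines)


def transcribe_file(path: str, model_name: str = "base") -> str:
    """Transcribe an audio file locally and return a timestamped transcript.

    Assumption: the openai-whisper package is installed. It is imported
    here so the formatting helpers above work without it.
    """
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(path)  # dict with "text" and "segments"
    return format_transcript(result["segments"])
```

Keeping the formatting separate from the model call means you can swap in a hosted API later, as long as it gives you segments with a start time and text.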
For meetings with several speakers, add a diarization step that labels who said what. Speaker attribution dramatically improves the usefulness of action items, because “Sam will send the deck” is far more actionable than “someone will send the deck.” Always tell participants that the meeting is being recorded and summarised, and get consent where it is required: recording laws vary by region, and some jurisdictions require every participant to agree.
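Diarization tools typically return speaker turns as time ranges, separate from the transcript itself, so the two need to be lined up. The sketch below shows one simple way to do that: give each transcript segment the speaker whose turn overlaps it the most. The `Turn` and `Segment` types and the `label_segments` function are hypothetical names for this example, and the turns are assumed to come from whatever diarization tool you choose.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Turn:
    """A span of time attributed to one speaker by the diarization step."""
    start: float
    end: float
    speaker: str


@dataclass
class Segment:
    """A span of transcribed text from the speech-to-text step."""
    start: float
    end: float
    text: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time ranges."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(
    segments: List[Segment], turns: List[Turn], unknown: str = "Unknown"
) -> List[Tuple[str, str]]:
    """Pair each segment with the speaker whose turn overlaps it most."""
    labelled = []
    for seg in segments:
        best_speaker, best_overlap = unknown, 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labelled.append((best_speaker, seg.text.strip()))
    return labelled
```

Segments that overlap no turn are labelled `Unknown` rather than guessed, which keeps a wrong name out of the action items.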
Step 2 — Summarise and extract action items
The difference between a useless wall of text and a genuinely useful note is structured output. Do not ask the model to “summarise the meeting.” Ask it to return a fixed JSON shape:
{
  "summary": "2-3 sentence overview",
  "decisions": ["..."],
  "actionItems": [
    { "task": "...", "owner": "...", "due": "..." }
  ]
}
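One way to put this shape to work is to embed it in the prompt and then check the reply before trusting it. The sketch below is illustrative only: `build_prompt` and `parse_notes` are hypothetical helpers, the prompt wording is a starting point and not a tested recipe, and allowing `null` for a missing owner or due date is an assumption added here so the model is not pushed to invent one.

```python
import json

SCHEMA_HINT = """{
  "summary": "2-3 sentence overview",
  "decisions": ["..."],
  "actionItems": [
    { "task": "...", "owner": "...", "due": "..." }
  ]
}"""


def build_prompt(transcript: str) -> str:
    """Build an extraction prompt that asks for the fixed JSON shape."""
    return (
        "You are given a meeting transcript. Return ONLY a JSON object "
        "with exactly this shape, and no other text:\n"
        f"{SCHEMA_HINT}\n"
        "Use null for an owner or due date that was not stated.\n\n"
        f"Transcript:\n{transcript}"
    )


def parse_notes(raw: str) -> dict:
    """Parse the model reply and raise ValueError if the shape is wrong."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("summary"), str):
        raise ValueError("'summary' must be a string")
    decisions = data.get("decisions")
    if not isinstance(decisions, list) or not all(
        isinstance(d, str) for d in decisions
    ):
        raise ValueError("'decisions' must be a list of strings")
    items = data.get("actionItems")
    if not isinstance(items, list):
        raise ValueError("'actionItems' must be a list")
    for item in items:
        if not isinstance(item, dict) or not isinstance(item.get("task"), str):
            raise ValueError("each action item needs a string 'task'")
        for key in ("owner", "due"):
            # Assumption: owner and due may be null when not mentioned.
            if item.get(key) is not None and not isinstance(item[key], str):
                raise ValueError(f"'{key}' must be a string or null")
    return data
```

If `parse_notes` raises, a reasonable fallback is to retry the request once and otherwise flag the meeting for manual notes.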
A defined schema forces the model to separate decisions from discussion and makes the result straightforward to push into other tools programmatically, provided you validate it first. For long meetings whose transcripts exceed the model's context window, use a map-reduce approach: summarise each transcript segment, then summarise the summaries, merging and de-duplicating action items at the end.
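The map-reduce approach can be sketched as follows. This is a simplified outline, not a tuned implementation: `summarise_chunk` and `summarise_summaries` are placeholder callables standing in for your own model calls, chunks are sized by character count as a crude stand-in for tokens, and duplicates are detected by exact match on task and owner, so paraphrased duplicates will slip through.

```python
from typing import Callable, Dict, List


def chunk_lines(lines: List[str], max_chars: int) -> List[str]:
    """Group transcript lines into chunks of roughly max_chars each."""
    chunks, current, size = [], [], 0
    for line in lines:
        # Start a new chunk when adding this line would exceed the budget.
        if current and size + len(line) + 1 > max_chars:
            chunks.append("\n".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line) + 1
    if current:
        chunks.append("\n".join(current))
    return chunks


def merge_action_items(partials: List[Dict]) -> List[Dict]:
    """Combine action items, dropping exact (task, owner) repeats."""
    seen = set()
    merged = []
    for notes in partials:
        for item in notes["actionItems"]:
            key = (
                item["task"].strip().lower(),
                (item.get("owner") or "").strip().lower(),
            )
            if key not in seen:
                seen.add(key)
                merged.append(item)
    return merged


def summarise_long(
    lines: List[str],
    summarise_chunk: Callable[[str], Dict],
    summarise_summaries: Callable[[List[str]], str],
    max_chars: int = 8000,
) -> Dict:
    """Map: summarise each chunk. Reduce: summarise the summaries and merge."""
    partials = [summarise_chunk(chunk) for chunk in chunk_lines(lines, max_chars)]
    decisions = [d for p in partials for d in p["decisions"]]
    return {
        "summary": summarise_summaries([p["summary"] for p in partials]),
        # dict.fromkeys removes exact duplicates while keeping order.
        "decisions": list(dict.fromkeys(decisions)),
        "actionItems": merge_action_items(partials),
    }
```

Splitting on line boundaries keeps each speaker line intact, so an action item is less likely to be cut in half between two chunks.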
Step 3 — Deliver the result
The summariser only delivers value if the notes reach people without anyone copy-pasting. Push the structured output to where the team lives — create a Notion page, post a formatted Slack message, or send an email — automatically after each meeting.
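As one example of automatic delivery, the sketch below formats the structured notes as a Slack message and posts it through an incoming webhook, which accepts a JSON body with a `text` field. The function names `format_message` and `post_to_slack` are invented for this example, the webhook URL is something you create in your own Slack workspace, and the layout is only a suggestion.

```python
import json
import urllib.request


def format_message(notes: dict) -> str:
    """Render the structured notes as Slack-style text (*bold* headings)."""
    lines = ["*Meeting summary*", notes["summary"]]
    if notes["decisions"]:
        lines += ["", "*Decisions*"]
        lines += [f"- {decision}" for decision in notes["decisions"]]
    if notes["actionItems"]:
        lines += ["", "*Action items*"]
        for item in notes["actionItems"]:
            owner = item.get("owner") or "unassigned"
            due = item.get("due") or "not set"
            lines.append(f"- {item['task']} (owner: {owner}, due: {due})")
    return "\n".join(lines)


def post_to_slack(webhook_url: str, notes: dict) -> int:
    """POST the formatted notes to a Slack incoming webhook.

    Returns the HTTP status code. Network and HTTP errors propagate to
    the caller as urllib exceptions.
    """
    body = json.dumps({"text": format_message(notes)}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

The same `format_message` output could feed an email body, while a Notion page would more naturally be built from the structured fields directly.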
Because transcription can mishear names and the model can occasionally misattribute an action, treat the output as a strong first draft. Have the meeting owner skim and correct the action items before they become the official record. Use the generator below to build your extraction prompt and pipeline scaffold, then see how to build a summarisation tool and how to call the chat API for the underlying mechanics.