What you are building
Most surveys collect a rating and a free-text comment, and most teams analyse the rating and ignore the comments — because reading hundreds of open responses by hand is impractical. Yet the free text is where the why lives. This tutorial builds a feedback analyser that fixes that: it takes open-text responses, groups them into themes by similarity, summarises each theme, and ranks them by how many people raised them. The result is a one-page view of what your users are actually saying, generated in seconds. The demo below runs a working version of the clustering step right in your browser so you can paste responses and watch themes emerge.
How it works
The pipeline has three stages. First, collect the open-text responses into a list. Second, cluster them so answers about the same topic land together: in production you turn each response into an embedding (a numeric vector that captures meaning) and group vectors that sit close together, which clusters “the app is too slow” with “loading takes forever” even though they share no words. The in-browser demo approximates this with shared significant keywords, which is enough to show the concept live without any model. Third, summarise each cluster by counting its size, surfacing its defining terms and a representative quote, and, in the full version, sending the cluster to an LLM for a one-line plain-language summary. You call the model once per theme, not once per response, so cost stays low even at thousands of responses.
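The keyword-based variant of the cluster-and-count step can be sketched in a few lines of Python. This is a minimal illustration, not the demo's actual code: the stopword list, the function names (`keywords`, `cluster_by_keywords`, `summarise`) and the greedy "join the first cluster that shares a keyword" rule are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stopword list for illustration; a real tool would use a fuller one.
STOPWORDS = {"the", "and", "are", "for", "with", "this", "that", "too", "app", "very", "when"}


def keywords(text):
    """Return the lowercase words of 3+ letters that are not stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if len(w) >= 3 and w not in STOPWORDS}


def cluster_by_keywords(responses):
    """Greedy single pass: a response joins the first cluster it shares a keyword with."""
    clusters = []
    for response in responses:
        kws = keywords(response)
        for cluster in clusters:
            if kws.intersection(cluster["terms"]):
                cluster["responses"].append(response)
                cluster["terms"].update(kws)  # count each keyword once per response
                break
        else:
            # No overlap with any existing cluster, so start a new theme.
            clusters.append({"responses": [response], "terms": Counter(kws)})
    return clusters


def summarise(clusters, top_n=3):
    """Rank themes by size and attach defining terms plus a representative quote."""
    themes = []
    for cluster in clusters:
        terms = cluster["terms"]
        # Representative quote: the response whose keywords are most common in the cluster.
        quote = max(cluster["responses"], key=lambda r: sum(terms[k] for k in keywords(r)))
        themes.append({
            "size": len(cluster["responses"]),
            "terms": [t for t, _ in terms.most_common(top_n)],
            "quote": quote,
        })
    # Biggest themes first, since ranking by volume is the goal.
    return sorted(themes, key=lambda t: t["size"], reverse=True)


if __name__ == "__main__":
    sample = [
        "The app is slow to load",
        "Loading is slow on my phone",
        "Checkout failed with my card",
        "Slow startup every morning",
        "Card payment failed twice",
    ]
    for theme in summarise(cluster_by_keywords(sample)):
        print(theme["size"], theme["terms"], "|", theme["quote"])
```

On the five sample responses this yields two themes, one of size three built around "slow" and one of size two built around "card" and "failed". A single shared keyword is a loose test, so unrelated responses can chain into one cluster; that weakness is the reason the production version swaps keywords for embeddings.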
Tips and notes
Paste one response per line into the tool below to try it. Notice how the biggest clusters surface first. That ranking by volume is the whole point, because a theme raised by sixty people usually matters more than one raised by two. In production, the keyword approach becomes embedding similarity, which catches paraphrases and synonyms the demo misses. Keep representative quotes attached to every theme so a human can always verify a summary against what people actually wrote, and instruct any LLM summariser to describe only what is present rather than speculate; that instruction is the main guard against it inventing themes. Build the cluster-and-count step first; the LLM summary is a thin, optional layer on top of a pipeline that already delivers value on its own.
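To make the production-side tips concrete, here is a rough Python sketch of grouping by embedding similarity and of building one summarisation prompt per theme. Everything named here is an assumption for illustration: the three-number vectors are hand-made stand-ins for real embeddings, the 0.8 threshold is arbitrary, and `group_by_similarity` and `build_summary_prompt` are hypothetical helpers. The model call itself is left out, because the tutorial does not name a provider.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def group_by_similarity(vectors, threshold=0.8):
    """Greedy grouping: a vector joins the first group whose first member is similar enough."""
    groups = []  # each group is a list of indices into `vectors`
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    # Largest groups first, matching the rank-by-volume idea.
    return sorted(groups, key=len, reverse=True)


def build_summary_prompt(quotes, max_quotes=5):
    """Build one prompt per theme; the quotes stay in it so the summary can be checked."""
    lines = [
        "Summarise the feedback theme below in one plain-language sentence.",
        "Describe only what is present in the quotes. Do not speculate or add causes.",
        "",
        "Quotes:",
    ]
    lines += [f"- {q}" for q in quotes[:max_quotes]]
    return "\n".join(lines)


if __name__ == "__main__":
    responses = ["the app is too slow", "loading takes forever", "checkout rejected my card"]
    # Toy vectors standing in for real embeddings (assumed values, not model output).
    vectors = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.0, 0.1, 0.9]]
    for group in group_by_similarity(vectors):
        quotes = [responses[i] for i in group]
        print(build_summary_prompt(quotes))
        print("---")
```

With these toy vectors the first two responses land in one group even though they share no words, and the card complaint stands alone. Because the prompt is built per group, the number of model calls tracks the number of themes rather than the number of responses.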