Why AI changed translation economics
Machine translation used to mean rigid, phrase-based systems that produced serviceable but wooden output. Large language models translate with the surrounding meaning and context in view, so they handle idiom, tone, and ambiguity far better, and a single model can cover dozens of languages, including pairs with little dedicated training data. As a result, translating a product into a hundred locales is now a budgeting and quality-control problem rather than a staffing one. But “good enough” depends entirely on what you are translating: the same workflow that ships a support article would be reckless for a legal contract. The skill is in matching effort to risk.
Prompting: zero-shot, few-shot, and glossaries
The simplest workflow is zero-shot: instruct the model to translate from one language to another, optionally specifying tone and audience. It is fast and usually adequate for general text. When you need the output to match a specific voice or use precise domain terminology, switch to few-shot: include a few example source-and-translation pairs in the prompt so the model anchors on your style. The single highest-leverage technique for product localisation is glossary injection: pass a list of brand names and key terms with their approved translations (or an instruction to leave them untranslated) on every call, so the model never renders your product name or a technical term differently on string 4,000 than it did on string 4. Maintain the glossary per language pair and send the relevant entries with each request.
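As a rough illustration of how zero-shot, few-shot, and glossary injection can share one prompt builder, here is a minimal sketch. The names `GlossaryEntry`, `relevant_entries`, and `build_prompt`, the prompt wording, and the sample terms are all hypothetical choices made for this example. The resulting string would be passed to whichever model API you use, which is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class GlossaryEntry:
    source: str                   # term as it appears in the source language
    target: Optional[str] = None  # approved translation; None = leave untranslated


def relevant_entries(text: str, glossary: Sequence[GlossaryEntry]):
    """Keep only the glossary entries whose source term occurs in the text."""
    lowered = text.lower()
    return [e for e in glossary if e.source.lower() in lowered]


def build_prompt(
    text: str,
    source_lang: str,
    target_lang: str,
    glossary: Sequence[GlossaryEntry] = (),
    examples: Sequence[Tuple[str, str]] = (),
    tone: Optional[str] = None,
) -> str:
    """Assemble a translation prompt.

    With no glossary and no examples this is a zero-shot prompt;
    passing examples makes it few-shot.
    """
    lines = [f"Translate the following text from {source_lang} to {target_lang}."]
    if tone:
        lines.append(f"Tone and audience: {tone}.")

    entries = relevant_entries(text, glossary)
    if entries:
        lines.append("Use these glossary terms exactly:")
        for e in entries:
            if e.target is None:
                lines.append(f'- "{e.source}": leave untranslated')
            else:
                lines.append(f'- "{e.source}" -> "{e.target}"')

    for src, tgt in examples:
        lines.append(f"Example source: {src}")
        lines.append(f"Example translation: {tgt}")

    lines.append(f"Text: {text}")
    return "\n".join(lines)


# Hypothetical glossary for an English -> German pair.
glossary_en_de = [
    GlossaryEntry("Acme Cloud"),              # brand name, never translated
    GlossaryEntry("dashboard", "Dashboard"),
    GlossaryEntry("invoice", "Rechnung"),     # not in the text below, so not sent
]

prompt = build_prompt(
    "Open the dashboard in Acme Cloud.",
    "English",
    "German",
    glossary=glossary_en_de,
    examples=[("Save changes", "Änderungen speichern")],
)
print(prompt)
```

Filtering the glossary by simple substring match keeps the prompt short, but it will miss inflected forms of a term. A production version would likely need per-language matching rules.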
Quality control and post-editing
At scale you cannot eyeball every string, so you combine automation with sampling.
Automated checks flag the mechanical failures machines actually make: dropped or
reordered placeholders ({name}, %s), altered numbers, broken formatting or
markup, and glossary violations. Route any flagged output to a human. On top of
that, post-edit a meaningful sample per language so a native speaker catches the
fluency, register, and cultural-fit issues that automated checks miss. Track an
error rate per language: the model handles some locales almost flawlessly, while
others (rarer languages, ones with heavy honorific systems) need far more human
involvement.
Letting the data tell you where to spend review time is what keeps quality high
without reviewing everything.
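The mechanical checks described above can be sketched with a few regular expressions. This is an illustrative starting point, not a complete validator. The function names, the issue labels, and the supported placeholder syntaxes (`{name}` and a small set of printf codes) are assumptions made for this example. Named placeholders may legitimately move in a translation, so only their presence is compared, while positional printf codes are compared in order.

```python
import re
from collections import Counter

NAMED_RE = re.compile(r"\{[A-Za-z_][A-Za-z0-9_]*\}")   # {name}
PRINTF_RE = re.compile(r"%[sdif]")                     # %s, %d, %i, %f
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")             # 42, 1,000.50, 1.000,50
TAG_RE = re.compile(r"</?[A-Za-z][A-Za-z0-9]*")        # <b, </b, <a


def _numbers(text):
    """Digit sequences with separators removed, so that locale-specific
    formatting (1,000.50 vs 1.000,50) does not count as a change."""
    stripped = PRINTF_RE.sub(" ", NAMED_RE.sub(" ", text))
    return Counter(re.sub(r"[.,]", "", n) for n in NUMBER_RE.findall(stripped))


def check_translation(source, target, glossary=None):
    """Return a list of issue labels; an empty list means nothing was flagged.

    `glossary` maps a source term to the string that must appear in the target.
    """
    issues = []

    # Named placeholders: same multiset required, order is free.
    if Counter(NAMED_RE.findall(source)) != Counter(NAMED_RE.findall(target)):
        issues.append("placeholder_mismatch")
    # Positional printf codes: same sequence required, order matters.
    elif PRINTF_RE.findall(source) != PRINTF_RE.findall(target):
        issues.append("placeholder_mismatch")

    if _numbers(source) != _numbers(target):
        issues.append("number_mismatch")

    if Counter(TAG_RE.findall(source)) != Counter(TAG_RE.findall(target)):
        issues.append("markup_mismatch")

    for term, approved in (glossary or {}).items():
        if term.lower() in source.lower() and approved.lower() not in target.lower():
            issues.append(f"glossary_violation:{term}")

    return issues


def flag_rate_by_language(results):
    """results: iterable of (language_code, issues) pairs.
    Returns the share of strings with at least one issue, per language."""
    totals, flagged = Counter(), Counter()
    for lang, issues in results:
        totals[lang] += 1
        if issues:
            flagged[lang] += 1
    return {lang: flagged[lang] / totals[lang] for lang in totals}
```

Anything with a non-empty issue list would go to a human reviewer. The per-language flag rate is only a proxy for the error rate: it covers mechanical failures, so it is best read alongside the results of native-speaker post-editing.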
Balancing cost against quality
Translation cost scales with token volume and model choice: stronger models generally cost more per token and translate more fluently. The winning move is tiering by risk. Route brand-defining marketing copy and legal text to your best model with few-shot examples, glossary injection, and human post-editing. Send bulk, low-risk, templated strings (support snippets, product attributes) to a cheaper model with only automated checks. This segmentation captures most of the quality of a premium-everywhere approach at a fraction of the cost, and it scales: as you add the fortieth or hundredth language, the per-string economics, not the engineering, are what determine whether the project is viable.
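A small routing table is one way to express this tiering. The sketch below is hypothetical throughout: the tier names, the model identifiers, the content-type labels, and especially the per-token prices are placeholders chosen for illustration, not real pricing. Unknown content types fall through to the premium tier here, on the assumption that over-reviewing is the safer mistake.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    model: str                # placeholder model identifier
    few_shot: bool            # include example pairs in the prompt
    human_post_edit: bool     # send output to a native-speaker reviewer
    usd_per_1k_tokens: float  # made-up price, for illustration only


PREMIUM = Tier("premium-model", few_shot=True, human_post_edit=True,
               usd_per_1k_tokens=0.03)
BULK = Tier("budget-model", few_shot=False, human_post_edit=False,
            usd_per_1k_tokens=0.002)

# Content types considered low-risk enough for the cheaper tier.
LOW_RISK = {"support_snippet", "product_attribute"}


def choose_tier(content_type: str) -> Tier:
    """Low-risk, templated content goes to BULK; everything else,
    including unrecognised types, goes to PREMIUM."""
    return BULK if content_type in LOW_RISK else PREMIUM


def estimate_cost(jobs) -> float:
    """jobs: iterable of (content_type, token_count) pairs.
    Returns the estimated model cost in USD, excluding human review."""
    return sum(
        tokens / 1000 * choose_tier(content_type).usd_per_1k_tokens
        for content_type, tokens in jobs
    )


jobs = [("legal", 1_000), ("product_attribute", 10_000)]
tiered = estimate_cost(jobs)
premium_everywhere = sum(t / 1000 * PREMIUM.usd_per_1k_tokens for _, t in jobs)
print(f"tiered: ${tiered:.3f}  premium everywhere: ${premium_everywhere:.3f}")
```

The estimate covers model spend only. Human post-editing time on the premium tier is usually the larger cost and would need its own line in a real budget.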