SSML Prosody Builder

Build <prosody> SSML tags for pitch, rate, and volume control in TTS

Building SSML prosody tags

Text-to-speech engines read plain text in a flat, neutral voice by default. SSML prosody tags let you shape the delivery — raising pitch for a question, slowing the rate for emphasis, or lowering volume for an aside — without changing a single word. This builder assembles a valid <prosody> tag from your inputs and escapes the text so the markup never breaks.

How it works

You enter a text segment and set three controls: pitch, rate, and volume. Each offers a preset mode (named values such as high or x-slow) and a relative mode: semitones for pitch (+2st), percent for rate (120%), and decibels for volume (+6dB). The tool wraps your escaped text in a <speak><prosody> block containing only the attributes you set, producing clean SSML compatible with Amazon Polly and Azure Speech.
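
As a rough illustration of the assembly step described above, here is a minimal Python sketch. The function name build_prosody and its parameters are hypothetical and are not the tool's actual code; the sketch relies only on the XML escaping helpers in Python's standard library, and it assumes the <prosody> wrapper is dropped when no attribute is set.

```python
from xml.sax.saxutils import escape, quoteattr


def build_prosody(text, pitch=None, rate=None, volume=None):
    """Wrap escaped text in <speak><prosody>, emitting only the attributes that were set."""
    attrs = {"pitch": pitch, "rate": rate, "volume": volume}
    # Skip attributes left unset so the tag stays minimal.
    attr_str = "".join(
        f" {name}={quoteattr(value)}" for name, value in attrs.items() if value
    )
    # escape() replaces &, <, and > so the text cannot break the markup.
    body = escape(text)
    if not attr_str:
        # Assumption: with nothing set, omit the <prosody> wrapper entirely.
        return f"<speak>{body}</speak>"
    return f"<speak><prosody{attr_str}>{body}</prosody></speak>"


print(build_prosody("Is this R&D?", pitch="+2st", rate="90%"))
# <speak><prosody pitch="+2st" rate="90%">Is this R&amp;D?</prosody></speak>
```

Passing the values through as strings keeps preset names (high, x-slow) and relative values (+2st, 120%, +6dB) on the same code path; validating them against a particular engine's accepted ranges is left out of this sketch.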

Tips and notes

  • Relative semitones are the most musical pitch control. +2st shifts pitch predictably; the named presets are coarser steps.
  • Percent rate beats presets for fine pacing. 90% is a subtle slowdown that the slow preset would overshoot.
  • Keep segments short. Apply prosody to the specific phrase that needs it rather than a whole paragraph, so the rest reads naturally.
  • Test in your engine. Most prosody attributes are portable, but always preview in the actual TTS voice — engines interpret extremes differently.
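
To make the "keep segments short" tip concrete, the sketch below wraps only one phrase of a sentence in a <prosody> tag and leaves the rest unmarked. The helper name emphasize_phrase and the default rate of 90% are assumptions chosen for illustration, not part of the builder.

```python
from xml.sax.saxutils import escape, quoteattr


def emphasize_phrase(sentence, phrase, rate="90%"):
    """Wrap the first occurrence of phrase in <prosody>, leaving the rest untouched."""
    before, found, after = sentence.partition(phrase)
    if not found:
        # Phrase not present: return the sentence with no prosody applied.
        return f"<speak>{escape(sentence)}</speak>"
    return (
        f"<speak>{escape(before)}"
        f"<prosody rate={quoteattr(rate)}>{escape(found)}</prosody>"
        f"{escape(after)}</speak>"
    )


print(emphasize_phrase(
    "Please read the terms carefully before signing.",
    "read the terms carefully",
))
# <speak>Please <prosody rate="90%">read the terms carefully</prosody> before signing.</speak>
```

Only the wrapped phrase slows down; the surrounding words keep the voice's default delivery.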