How many voices does OpenAI TTS offer?

The core OpenAI speech models ship six built-in voices: alloy, echo, fable, nova, onyx, and shimmer. Each has a distinct tone, and the same voice names work with both the tts-1 and tts-1-hd models.

Which OpenAI voice is best for narration?

Onyx and alloy tend to work well for narration thanks to a steady, grounded tone, while nova is a good warmer alternative. The right pick depends on your audience and brand, so audition a couple with the same script.

Can I change the speaking speed?

Yes. The speech API accepts a speed parameter from 0.25 to 4.0, with 1.0 as the default. Slowing down slightly improves clarity for instructional content, while speeding up suits skimmable summaries.

What audio formats can OpenAI TTS return?

You can request mp3, opus, aac, flac, wav, and pcm. Use mp3 or aac for general distribution, opus for low-latency streaming, and wav or flac when you need lossless audio for editing.

Is the voice the same across tts-1 and tts-1-hd?

The voice identity is the same, but tts-1-hd renders at higher audio quality with a bit more latency and cost. Use tts-1 for real-time or high-volume use and tts-1-hd for polished published audio.

What is the OpenAI TTS Voice Picker?

It is a reference and matcher for all six OpenAI TTS voices: alloy, echo, fable, nova, onyx, and shimmer. Pick a use case and tone to get the best-fit voice, along with descriptions, sample phrases, and guidance on speed and audio format. It runs free in your browser on Gera Tools, with nothing uploaded.

Name: OpenAI TTS Voice Picker
Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

OpenAI TTS voice picker

OpenAI’s speech models ship six voices — alloy, echo, fable, nova, onyx, and shimmer — and the right one depends entirely on what you are making. This picker pairs each voice with its tone, best use cases, and a sample phrase, then recommends a shortlist based on your use case and tone preference.

Voice character guide

Each voice has a consistent personality across both tts-1 and tts-1-hd:

Voice	Character	Works best for
alloy	Balanced, clear, gently neutral	General narration, documentation reads, tutorials
echo	Slightly warmer male tone	Explanatory content, educational material
fable	Expressive, storytelling quality	Audiobooks, character narration, creative content
nova	Warm, friendly female tone	Conversational assistants, onboarding flows, customer service
onyx	Deep, authoritative male tone	News-style reads, formal narration, brand announcements
shimmer	Clear, bright female tone	Product demos, marketing, upbeat assistant interactions

How the picker works

Because each voice keeps the same character on either model, the picker only needs two inputs to score it:

Use case — narration, assistant/agent, audiobook, advertisement, or character — because a calm reader and a punchy promo voice rarely overlap.
Tone — warm, neutral, or authoritative — which biases toward the voices that carry that feel.

It then surfaces the top matches with notes on speed (the API’s speed parameter, 0.25–4.0) and audio format so you can drop the choice straight into your request.
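
The scoring described above can be sketched roughly as follows. This is not the picker's actual implementation: the tag sets are a loose reading of the character guide table, and the weights (2 for a use-case match, 1 for a tone match) and the names VOICES and rank_voices are assumptions made for illustration.

```python
# Hypothetical sketch of a use-case + tone scorer.
# The tags and weights below are illustrative assumptions, not the picker's real data.
VOICES = {
    "alloy":   {"uses": {"narration"},                  "tones": {"neutral"}},
    "echo":    {"uses": {"narration"},                  "tones": {"warm"}},
    "fable":   {"uses": {"audiobook", "character"},     "tones": {"warm"}},
    "nova":    {"uses": {"assistant"},                  "tones": {"warm"}},
    "onyx":    {"uses": {"narration", "advertisement"}, "tones": {"authoritative"}},
    "shimmer": {"uses": {"assistant", "advertisement"}, "tones": {"neutral"}},
}


def rank_voices(use_case, tone, top_n=3):
    """Return the top_n voice names, best match first."""

    def score(name):
        tags = VOICES[name]
        # A use-case match counts double (assumed weighting); a tone match counts once.
        return 2 * (use_case in tags["uses"]) + (tone in tags["tones"])

    # Sort by descending score, breaking ties alphabetically so results are stable.
    ranked = sorted(VOICES, key=lambda name: (-score(name), name))
    return ranked[:top_n]


print(rank_voices("audiobook", "warm"))  # ['fable', 'echo', 'nova']
```

Weighting the use case above the tone is one possible design choice; it keeps a voice that suits the medium ahead of one that only matches the mood.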

Speed and format settings

The speed parameter (0.25 to 4.0, default 1.0) adjusts pacing without changing the voice identity. Some practical anchor points:

0.85–0.9 — Slowed slightly for complex instructions or accessibility use
1.0 — Normal conversational pace
1.1–1.15 — Brisk summary or notification style
1.25+ — Speed-listening; quality degrades noticeably above 1.5
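
One way to keep these anchor points consistent is to store them as named presets and validate the value before building a request. The sketch below is a minimal illustration: the preset names and the resolve_speed and build_speech_request helpers are hypothetical, and the payload keys (model, voice, input, speed, response_format) are the request fields as I understand the speech endpoint, so check them against the current API reference before relying on them.

```python
# Hypothetical presets based on the anchor points above.
SPEED_PRESETS = {
    "instructions": 0.9,      # slowed slightly for complex steps
    "conversational": 1.0,    # default pace
    "summary": 1.1,           # brisk summary or notification
    "speed_listening": 1.25,  # faster listening
}

MIN_SPEED, MAX_SPEED = 0.25, 4.0
VOICE_NAMES = {"alloy", "echo", "fable", "nova", "onyx", "shimmer"}


def resolve_speed(speed):
    """Accept a preset name or a number and return a validated float."""
    value = SPEED_PRESETS[speed] if isinstance(speed, str) else float(speed)
    if not MIN_SPEED <= value <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}, got {value}")
    return value


def build_speech_request(text, voice="alloy", speed="conversational",
                         model="tts-1", response_format="mp3"):
    """Build a request payload as a plain dict; nothing is sent over the network."""
    if voice not in VOICE_NAMES:
        raise ValueError(f"unknown voice: {voice}")
    return {
        "model": model,
        "voice": voice,
        "input": text,
        "speed": resolve_speed(speed),
        "response_format": response_format,
    }


payload = build_speech_request("Press the reset button for five seconds.",
                               voice="nova", speed="instructions")
```

The resulting dict is only data; it would still need to be sent as the JSON body of a speech request, or passed to whichever client library you use.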

Audio formats to request from the API:

mp3 — Default; good for general use and web delivery
opus — Lower file size with good quality; best for streaming in real time
aac — Smaller than mp3, compatible with most modern players
wav / flac — Lossless; use when you will edit the audio in a DAW afterward
pcm — Raw audio samples; useful for piping into audio processing pipelines
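
Because pcm output has no header, most players and editors will not open it directly. The sketch below wraps raw samples in a WAV container using Python's standard wave module. It assumes 24 kHz, 16-bit, mono samples; those values are assumptions here and should be confirmed against the current API documentation. The pcm_to_wav name is hypothetical.

```python
import io
import wave


def pcm_to_wav(pcm_bytes, sample_rate=24_000, sample_width=2, channels=1):
    """Wrap headerless PCM samples in a WAV container and return the bytes.

    The defaults (24 kHz, 2 bytes per sample, mono) are assumptions about the
    pcm output; adjust them if the documented format differs.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()


# One second of silence stands in for real pcm output from the API.
silence = b"\x00\x00" * 24_000
wav_bytes = pcm_to_wav(silence)
```

If the audio is only going to be edited afterward, requesting wav or flac directly avoids this step; the wrapper is mainly useful when a pipeline already consumes the raw pcm stream.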

Tips for choosing a voice

Audition with your real script. A generic “hello world” hides how a voice handles your actual sentences, names, and technical terms.
Match voice to medium. Onyx and alloy read long-form well; nova and shimmer feel friendlier for assistants and welcome flows. Fable stands out for anything with a narrative arc.
Set speed deliberately. Drop to ~0.9 for instructions, push to ~1.1–1.2 for snappy summaries.
Pick format by destination. mp3/aac to ship, opus to stream, wav/flac when you will edit the audio afterward.
Be consistent within a product. Switching voices between screens or sections feels jarring. Pick one voice for the whole product or one per distinct persona (assistant vs narrator), and stick with it.