Generate
Convert text to natural-sounding speech with AI voices
Generate lets you turn any text into lifelike audio using AI voices. Choose from a variety of voices, enter your text, and download the audio file.
How to Generate Speech
1. Choose a voice from the dropdown. Voices are grouped by language. Click the play button to preview how each voice sounds.
2. Type or paste the text you want to convert. Character limits depend on the voice type (see table below).
3. Click Generate or press Cmd+Enter (Mac) / Ctrl+Enter (Windows). Wait for processing to complete.
4. Preview the audio in the player, then click Download to save the file.
Voice Types
| Voice Type | Character Limit | Output Formats | Special Features |
|---|---|---|---|
| Gpro voices | 1,000 characters | WAV | Add style instructions for tone/emotion |
| Grok voices | 1,000 characters | MP3, WAV | Multi-language support with expressive tags |
| Standard voices | 500 characters | MP3 | Use emotion tags in your text |
Gpro voices show a text box where you can describe how you want the voice to sound (e.g., "warm and friendly" or "serious and professional").
Grok voices support both MP3 and WAV output. They also accept expressive tags such as [laugh], [sigh], and [pause], and wrapper tags such as <whisper>...</whisper> and <soft>...</soft>, for fine-grained control over delivery.
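If you prepare text outside the page, a quick length check against the limits in the table can save a failed attempt. The following is a minimal sketch, not part of the product: the dictionary keys and the check_text helper are hypothetical names, and it assumes every character (including tags and whitespace) counts toward the limit, which this page does not confirm.

```python
# Limits and formats copied from the Voice Types table.
# The lowercase keys are illustrative names, not product identifiers.
CHARACTER_LIMITS = {
    "gpro": 1000,
    "grok": 1000,
    "standard": 500,
}

OUTPUT_FORMATS = {
    "gpro": ("WAV",),
    "grok": ("MP3", "WAV"),
    "standard": ("MP3",),
}


def check_text(voice_type: str, text: str) -> tuple[bool, int]:
    """Return (fits, remaining_characters) for the given voice type.

    Assumption: every character, including tags and whitespace,
    counts toward the limit.
    """
    limit = CHARACTER_LIMITS[voice_type]
    remaining = limit - len(text)
    return remaining >= 0, remaining


if __name__ == "__main__":
    fits, remaining = check_text("standard", "Hello! <laugh> That was a good one.")
    print(f"fits={fits}, remaining={remaining}")
```

A negative remaining value tells you how many characters to trim before pasting the text into the page.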
Emotion Tags (Standard voices)
Standard voices support emotion tags you can insert directly into your text. Available tags vary by language.
Example: Hello! <laugh> That was a good one.
Credits
Credits are calculated based on text length and voice type:
- Gpro voices: ~1 credit per token (based on text and audio length)
- Grok voices: 4 credits per 100 characters
- Standard voices: based on estimated audio duration
You need at least 10 credits to generate audio. Your remaining credits are shown at the top of the page.
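For Grok voices, the stated rate makes a rough estimate easy to work out by hand or in a script. The sketch below is only an approximation under stated assumptions: the function names are hypothetical, and it treats the rate as strictly proportional with no rounding, since this page does not say how partial blocks of 100 characters are billed. Gpro and Standard voices are left out because their costs depend on tokens and audio duration, which cannot be derived from text length alone.

```python
# Rate and minimum taken from the Credits section.
GROK_CREDITS_PER_100_CHARS = 4
MIN_CREDITS_TO_GENERATE = 10


def estimate_grok_credits(text: str) -> float:
    """Estimate Grok voice credits, assuming strictly proportional billing."""
    return GROK_CREDITS_PER_100_CHARS * len(text) / 100


def can_generate(remaining_credits: float) -> bool:
    """Check the stated minimum balance required to generate audio."""
    return remaining_credits >= MIN_CREDITS_TO_GENERATE


if __name__ == "__main__":
    sample = "a" * 250
    print(estimate_grok_credits(sample))  # 4 credits per 100 chars * 2.5
    print(can_generate(9))
```

Under these assumptions, a full 1,000-character Grok request comes to about 40 credits; the actual charge shown on the page takes precedence over this estimate.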