
How to write prompts for voice AI agents
We created a tactical guide sharing a bunch of best practice for developers on how to tune your prompts to make your voice agent's speech more conversational. Sharing the fundamentals here đ
Written and spoken language are fundamentally different in a number of ways that impact how you should approach building voice AI agents.
Consider how a human might communicate the same information:
Written: "The meeting is scheduled for 2:00 PM on February 14th, 2025, in Conference Room B."
Spoken: "We're meeting at two this afternoon in, uh, Conference Room B."
The differences run deeper than formatting. Written language is edited, polished, and permanent. Spoken language is messy, immediate, and ephemeral. Itâs full of:
Filler words
Sentences that trail off or restart mid-thought
Contextual shortcuts ("that thing we discussed")
Emotional color through tone and pace
When an agent speaks with perfect grammar and formal structure, it feels unsettling or just plain âweirdâ to users.
Thankfully, improving your agentâs speech can be as simple as tuning your prompts.
This guide includes a range of easy to implement tips and example prompts to help you build LLM-powered voice agents that sound more human...
The fundamentals: From text to speech
Tell your LLM to speak, not write
This one is straightforward, but important: Begin your voice agentâs prompt with a single instruction to shift the LLM's entire response pattern and decrease the chance of responses that include content optimized for written text (e.g. URLs, email addresses spelled with @ symbols, formatting that makes no sense when spoken, etc). This can be as simple as:
You are having a spoken conversation. Your responses will be read aloud by a text-to-speech system. Speak naturally, as if talking to a friend.
Format for ears, not eyes
Spoken language is processed differently than written text, so it can be helpful to apply the â6th-grade reading level testâ and add clear instructions to your system prompt:
- Use simple vocabulary and short sentences
- Never use bullet points, numbered lists, or formatted text
- Avoid parentheses, brackets, or quotation marks in speech
- If you must mention a special character, spell it out
- Never include emojis (they can't be spoken)
- Never use symbols that have no pronunciation (@#$%^&*)
When tuning your prompts, itâs a good idea to test every response with the "speakable content" test: read it aloud. If you stumble or it sounds weird, rewrite it.
Avoid robotic speech
Perfect speech sounds unnatural to humans. Directly including specific examples of natural speech patterns in your prompt can help output more natural responses. For example:
Try to use natural speech. Don't use robotic speech. This is what I mean:
# Robotic:
"I have found three restaurants matching your criteria.
The first option is Luigi's Italian Restaurant located at 123 Main Street."
# Natural:
"Okay, so... I found three places that could work.
First up is, uh, Luigi's - it's an Italian place on Main Street."
If you found any of this helpful or interesting, you can read the full post on our blog. Please let us know if you have any questions or know any advice to that we missed!
Replies