System Prompts & Instructions
The system prompt (what you write under Core Behaviour in the Agent Studio) is the single most important configuration on your agent. It defines persona, goals, rules, voice, and how the agent handles tools and knowledge. A great chat reply can be a terrible voice reply: too long, full of markdown, packed with acronyms, written like a memo instead of like something a human would say out loud. This guide walks through how to write a Core Behaviour prompt that makes a thinnestAI agent sound natural in production, across English, Hindi, Hinglish, and the other Indian languages we support. Use it alongside the Behaviour tab of any agent in the Agent Studio.
Why voice prompts are different
When an agent runs through a cascaded STT → LLM → TTS pipeline, the LLM in the middle has no idea it’s being spoken aloud. It defaults to writing the way it always writes: full paragraphs, bullet lists, “Certainly!”, URLs with https://, currency formatted as ₹1,500. All of that sounds wrong when a TTS reads it back at 1× speed on a phone line.
Even on a speech-to-speech model (Gemini Live, OpenAI Realtime), the model still needs to be told to be brief, to ask one question at a time, and to wait for the caller to react. Voice callers are not patient. They will interrupt. They will hang up.
So the goal of a voice prompt is two-part:
- Tell the agent what to do (goals, rules, tools, persona).
- Tell the agent how to sound doing it (length, formatting, fillers, pacing).
A working skeleton
Most production voice prompts on thinnestAI follow roughly this shape. Treat it as a starting outline, not a template: you’ll grow or trim each section based on what your agent actually has to do.
Identity
Start with a one- or two-line identity statement. It anchors the rest of the prompt: every output rule and behaviour later is “in service of being this person.”
Output rules
This block is the single biggest lever for making your agent sound voice-native. Without it, you’ll get markdown read aloud, currency mispronounced, and three-paragraph monologues. A good baseline:
Conversational style
Output rules tell the agent what not to do. Conversational style tells it how to actually sound human while doing it. This is where most prompts thin out, and where most agents start sounding robotic.
Filler words and acknowledgments
LLMs don’t reach for “uh”, “hmm”, or “achha” on their own, because they were trained on edited text where those got cut. You have to put them back. Don’t ask for them on every turn, though, or it sounds performative.
Self-corrections
Real speech includes course-corrections mid-sentence. Models won’t do this without examples.
Phrase variation across turns
Without prompting, agents open every other turn with “Sure” or “Got it”. The third time you hear it, the illusion breaks.
Personality as behaviour
“Be friendly” is meaningless to an LLM. Show what friendly sounds like in your agent’s voice:
Pauses, emotion, and non-verbal sounds
If your TTS provider supports SSML or expression tags, you can shape pacing and emotion directly. Support varies, so check the provider page for Sarvam, Aero, or whichever TTS you’re using before relying on tags. A typical pattern for <break> tags (where supported):
For expression tags such as [chuckles] or [sighs] (ElevenLabs v3 and a few others), cap them at one per turn at most, so each one keeps its punch.
If your stack doesn’t support these tags, leave them out entirely. A <break> tag spoken aloud is worse than no pause at all.
Goals
After style, tell the agent what it’s actually trying to accomplish.
Tools
If the agent has access to tools, give it a short policy on when and how to use them. The detailed parameter descriptions belong on each tool definition; the prompt just needs a behaviour rule.
Execute, don’t narrate
The single most common voice-tool bug: the model says “I’ve booked your meeting” or “I’ve saved your details” without ever actually calling the tool. From the caller’s perspective the agent confirms something that never happened: the calendar invite never arrives, the contact never lands in the spreadsheet. Tell the prompt explicitly that this isn’t allowed:
Describe actions, not function names
When you reference a tool inside your prompt, describe what it does; don’t name the function. If you write “call the save-to-sheet function”, the model takes it literally and tries to invoke a function literally named save-to-sheet, which usually isn’t what’s registered.
Two-step flows: check before acting
For booking, ordering, payment, or anything that mutates state, the underlying API often validates the input before accepting it. A booking call with a guessed time gets rejected because that slot is taken; a payment call with an invalid card fails on charge. Make the agent run the read-only “check” tool before the write tool:
Read back spelling-critical inputs
Voice transcription mishears emails, names, phone numbers, and addresses more often than you’d think. “hello@thinnest.ai” gets captured as “hello@thinnest.ar”. “Bharath” becomes “Bharat”. “nine eight seven” becomes “nine ate seven”. If any of these are about to be passed to a tool (to book a meeting, place an order, send an SMS), the wrong value gets locked in and the caller never knows. Force a read-back step in the prompt:
A read-back does two things:
- Forces the agent to slow down at exactly the moment callers expect care. Booking a meeting with someone who didn’t confirm your email back feels sloppy; reading it back is what an attentive human would do.
- Catches STT errors before they become tool failures. Cal.com / Stripe / your CRM will reject hello@thinnest.ar with a generic 400, so the agent loops and the caller waits. A read-back catches it for free, before any tool fires.
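To make the read-back concrete, here is a minimal sketch of how an email or phone number can be turned into something a caller can verify by ear. The helper names are hypothetical and not part of thinnestAI; the output is the kind of phrasing you might paste into the prompt as an example of a good read-back.

```python
DIGIT_NAMES = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def spell_email_for_readback(email: str) -> str:
    """Turn an email into a phrase a caller can check by ear (illustrative helper)."""
    local, _, domain = email.partition("@")
    # Spell the local part one character at a time, since that is where
    # mishearings are hardest to notice.
    spelled_local = " ".join(local)
    # Read the domain as words, with "dot" between its parts.
    spoken_domain = " dot ".join(domain.split("."))
    return f"{spelled_local} at {spoken_domain}"


def spell_digits_for_readback(number: str) -> str:
    """Read a phone number digit by digit, skipping spaces and dashes."""
    return " ".join(DIGIT_NAMES[int(ch)] for ch in number if ch in "0123456789")


print(spell_email_for_readback("hello@thinnest.ai"))
print(spell_digits_for_readback("987-654"))
```

Hearing the address spelled out one letter at a time gives the caller a chance to catch a wrong character before any tool call is made.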
Never leave dead air
This is the #1 voice agent failure mode in production. The agent says something like “Let me check that for you” and then stops: no answer, no follow-up, just silence. The caller waits, eventually says “hello? you there?”, and the agent re-greets like the call just started. Catastrophic for trust. The cause is almost always the prompt. Three patterns produce it, often together:
- Filler-only acknowledgments. Phrases like “let me check our latest documentation”, “let me look that up”, “one moment please”, “give me a second”, “main check karta hoon”, said alone, without the answer that should follow in the same response.
- Pre-tool-call filler instructions that frame the filler as a precursor to an action (a tool call, a lookup) and don’t tell the model what to do when there’s no action to take. The model says the filler and stops, waiting for an action it never decides to make.
- Generic “be thoughtful” personality instructions (“use phrases like ‘hmm, let me think about that’ to sound natural”) that the model latches onto when uncertain — emitting the thoughtful filler as its entire response.
How to fix an agent that goes silent after a filler
If you’re already seeing this behavior, fix the prompt, not the runtime:
- Find and delete any sentences that tell the agent to use filler phrases alone. Examples to remove or rewrite:
  - “Use occasional ‘hmm’ or ‘let me think about that’ to simulate thoughtfulness” → drop entirely, or rewrite to require the thought to be followed by the answer in the same response.
  - “If uncertain, say ‘let me check our latest documentation’” → change to “If uncertain, state your best understanding in one sentence and offer to escalate or send the exact details by email.”
  - “Can I put you on a brief hold while I check this?” → change to “Pulling that up… here’s what I found: [answer].”
- Add the “Never leave dead air” block above to your prompt as its own section, even if you already have a “Conversational style” section.
- Test the failure mode explicitly by asking your agent a question your knowledge base doesn’t fully cover. The agent should either give a best-guess answer and offer to escalate, or directly say “I’m not sure, want me to take a callback?”. It should NOT say “let me check that for you” and stop.
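When testing for this failure mode across many transcripts, a small offline check can flag suspicious turns for you. The sketch below is an assumption-laden approximation: it treats any short agent turn that contains a known filler phrase as dangling. The phrase list, the word threshold, and the function name are all hypothetical and should be tuned to your own agent.

```python
# Hypothetical filler list; extend it with the phrases your own agent uses.
FILLER_PHRASES = (
    "let me check",
    "let me look that up",
    "one moment please",
    "give me a second",
    "main check karta hoon",
)


def is_dangling_filler(response: str, max_words: int = 12) -> bool:
    """Flag a short agent turn that contains a filler and little else."""
    text = response.lower()
    is_short = len(text.split()) <= max_words
    has_filler = any(phrase in text for phrase in FILLER_PHRASES)
    return is_short and has_filler


turns = [
    "Let me check that for you.",
    "Let me check. Your renewal is due on the fifth of May, "
    "and the premium is unchanged.",
]
for turn in turns:
    print(is_dangling_filler(turn), "|", turn)
```

A flagged turn is only a hint to go and listen to that call; the fix still belongs in the prompt.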
Why this is a prompt fix and not a runtime fix
The runtime can detect a dangling filler after the fact (short response that ends with a known filler phrase) and force a continuation. We considered shipping that as a safety net. But the model’s “let me check…” emission is itself a prompt instruction being followed literally. Fix the prompt and the symptom disappears at the source. Runtime detection would still leave a 1-2 second pause while the agent re-prompts itself, and the user perceives the gap.
Keep the call alive
Voice agents have an annoying failure mode: once they’ve completed a task (booked the meeting, saved the contact, answered the FAQ), they read the closing instructions in the prompt as a cue that the call is over, and either go silent or rush to sign off. Then the caller asks a follow-up and the agent ignores it, or the line goes dead. Add an explicit rule that nothing ends the call except the caller themselves:
Guardrails
Make boundaries explicit. Voice callers will test them, sometimes accidentally (“can you also tell me about XYZ”), sometimes deliberately.
Handoff procedure
Tell the agent how to hand off, not just when. Otherwise it improvises, and improvised handoffs leak data or skip critical fields.
Knowledge grounding
If your agent uses knowledge, pin it down hard. Voice agents that hallucinate prices or policies erode trust in seconds. On thinnestAI, knowledge is not stitched into the prompt as a literal block. Instead, every datasource you attach under the Knowledge tab becomes searchable through a built-in tool the model can call (search_knowledge_base). The agent decides when to retrieve, runs the search, and grounds its next sentence on what came back. That’s why your prompt needs to tell the agent to use that tool aggressively for facts, and not to answer factual questions without it.
You don’t need a {{ placeholder }} for the knowledge base itself; attaching the datasource under the Knowledge tab is what wires it up. Your prompt only needs to enforce the usage discipline: search first, refuse cleanly, never invent.
If you’re seeing the agent answer factual questions without searching, the fix is almost always a stronger rule in this section (and sometimes a stronger description on the tool itself). See knowledge for setup details.
Language detection and switching
For Indian-language agents, language handling is a prompt-level concern. Don’t ask the caller “which language do you prefer?”; match their first turn instead.
Variables in your prompt
This is where most thinnestAI agents pick up their per-caller personalisation, and it’s worth a dedicated section because it changes how you write the rest of the prompt.
Syntax
thinnestAI prompts support inline placeholders using double-curly syntax. Whitespace inside the braces is optional: {{ user_name }} and {{user_name}} behave the same.
Behaviour on lookup:
- Primitive value (string, number, boolean) → substituted as text. 1500 becomes 1500.
- Array or object → substituted as JSON. ["hindi","english"] becomes the literal string ["hindi", "english"]. Useful for handing a list of options to the model.
- Undefined / missing → substituted as an empty string. The agent won’t crash; it just sees no value. Plan for this in the prompt (see “Defaults” below).
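The three lookup rules above can be sketched in a few lines of Python. This is an illustration of the described behaviour, not thinnestAI's actual implementation: the `render` function and its regular expression are assumptions, and how the platform renders booleans as text is not specified here.

```python
import json
import re

# Matches {{ name }} and {{name}}; whitespace inside the braces is optional.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, variables: dict) -> str:
    """Substitute placeholders following the three lookup rules (sketch)."""
    def substitute(match: re.Match) -> str:
        value = variables.get(match.group(1))
        if value is None:
            return ""                 # undefined or missing: empty string
        if isinstance(value, (list, dict)):
            return json.dumps(value)  # array or object: JSON text
        return str(value)             # primitive: plain text
    return PLACEHOLDER.sub(substitute, template)


variables = {"user_name": "Asha", "fee": 1500, "langs": ["hindi", "english"]}
print(render("Caller: {{user_name}}. Fee: {{ fee }}.", variables))
print(render("Supported languages: {{ langs }}", variables))
print(render("Policy: {{ policy_number }}", variables))
```

The last line shows why missing values need planning: the sentence still renders, but with nothing after "Policy:".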
Where variables come from
A single agent execution merges variables from two scopes:
- Agent-level variables. Defined under the Variables section in the Behaviour tab of Agent Studio. These are stored against the agent itself and available on every run. Good for things that change per-deployment but not per-call: brand name, support hours, escalation email, default greeting tone.
- Node-level variables. Defined on individual nodes inside the flow editor. Scoped to a single node’s execution and can override agent-level values of the same name.
Both scopes are referenced in the prompt with the same {{ name }} syntax.
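The override rule between the two scopes behaves like a dictionary merge in which the later source wins. A minimal sketch, assuming both scopes are available as plain dictionaries (the function and variable names are illustrative, not platform APIs):

```python
def merge_scopes(agent_vars: dict, node_vars: dict) -> dict:
    """Combine both scopes; node-level values win when names collide."""
    # Later entries overwrite earlier ones, so node_vars takes precedence.
    return {**agent_vars, **node_vars}


agent_vars = {"company_name": "Acme Insurance", "greeting_tone": "warm"}
node_vars = {"greeting_tone": "formal"}  # applies to this node's execution only

print(merge_scopes(agent_vars, node_vars))
```

Here `greeting_tone` resolves to the node's value while `company_name` falls through from the agent level.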
Static vs computed variables
Each variable has a data type, set when you create it:
- String, number, boolean, array, object: static values you type into the dashboard.
- Python function: a small, sandboxed expression that runs at lookup time. The platform evaluates it safely (no imports, no file I/O, no network) and substitutes the result. Useful for time-aware prompts:
The evaluator accepts safe expressions (arithmetic, comparisons, literals, f-strings) and a small whitelist of datetime helpers. Anything more complex should be computed in your backend and passed in as a dispatched variable instead.
A working example
A typical voice agent for an insurance renewal call might define these agent-level variables in the dashboard:
| Name | Type | Value |
|---|---|---|
| company_name | string | Acme Insurance |
| support_phone | string | one eight hundred two two two three |
| escalation_email | string | escalations@acme.in |
| today | python_function | date.today().isoformat() |

The per-call values arrive from dispatch metadata:
| Name | Source |
|---|---|
| user_name | dispatch metadata (CRM lookup at call start) |
| policy_number | dispatch metadata |
| current_premium | dispatch metadata |
| last_premium | dispatch metadata |
Defaults and missing values
Because undefined variables silently become empty strings, write your prompts so the agent reads cleanly even when some values are missing. Two patterns work well:
- Inline default. Surround the placeholder with words that still read fine when the value is empty.
- Whole-line default. Author a fallback at the variable definition (in the dashboard) so undefined never reaches the prompt at all. The lookup uses the default whenever the per-call value is missing.
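Both patterns can be illustrated with a small renderer that falls back to a default when the per-call value is missing. This is a sketch under assumptions: `render_with_defaults` and the `defaults` dictionary are hypothetical stand-ins for the fallback you would author on the variable definition in the dashboard.

```python
import re

PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_with_defaults(template: str, per_call: dict, defaults: dict) -> str:
    """Fill placeholders, using the default when the per-call value is missing."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        value = per_call.get(name)
        if value in (None, ""):
            # Whole-line default: the fallback authored with the variable.
            value = defaults.get(name, "")
        return str(value)
    return PLACEHOLDER.sub(substitute, template)


defaults = {"user_name": "the caller"}

# Whole-line default: the sentence stays complete without a per-call value.
print(render_with_defaults("You are speaking with {{ user_name }}.", {}, defaults))

# Inline default: the surrounding words read fine even when the value is empty.
print(render_with_defaults("Policy number, if known: {{ policy_number }}", {}, defaults))
```

The second line ends after the colon, which is harmless; compare that with "Your policy number is ." to see why the surrounding wording matters.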
Where this fits in the prompt
Most prompts cluster variable-bearing lines into two blocks near the top, after the identity statement:
Voice style preamble
Separately from your Core Behaviour prompt, every agent on thinnestAI has a Voice Style Preamble, a short, voice-specific block applied automatically to voice calls (and not to text chat). You’ll see it in the right column of the Behaviour tab in Agent Studio. The default preamble enforces the basics: speak in short sentences, use contractions, never read markdown or symbols aloud, spell out money and dates, ask one question at a time, and add light filler words. Most agents can leave it untouched. You’d customise it when:
- The agent operates in a specific language and you want the filler list adapted (e.g. swap English fillers for Marathi ones).
- The use-case needs domain-specific reading rules — for instance, a healthcare agent that must spell out dosages digit by digit.
- You want a stricter or looser baseline (very formal for legal contexts; very casual for consumer support).
Opening line
Short, warm, varied. Long openings are the most common reason callers hang up in the first three seconds.
A complete example
Pulling everything together, here’s a tight prompt for an inbound policy-renewal agent. It’s intentionally short: every line earns its place.
Iterating
Voice prompts get better through real-call feedback faster than any other kind of prompt. The two cheapest loops, both built into the dashboard:
Activity transcripts
Under the Activity tab in Agent Studio you get every call this agent has handled, with the full transcript and the audio recording side-by-side. Listen to five or ten in a row and mark every line that:
- Reads markdown, an emoji, a URL with a protocol, or an unspelled number aloud → tighten Output rules.
- Runs longer than two sentences when one would do → tighten the length rule, add an explicit “ask one thing at a time” line.
- Opens consecutive turns with the same acknowledgment → add a Phrase variation section.
- Sounds robotic, scripted, or apologetic → add or expand the Personality section with concrete behaviour examples.
- Hallucinates a fact you know isn’t in the knowledge base → strengthen the Knowledge section’s refusal rule, or add the missing fact to a datasource.
- Has the caller repeat themselves → check the prompt for stacked clarifying questions, then look at STT/turn-taking settings under Voice.
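Some of these checks can be roughed out in code to pre-screen exported transcripts before you listen. The sketch below is a heuristic aid, not a dashboard feature: the check names, the regular expressions, and the two-sentence limit are assumptions, and it covers only the mechanical items (markdown, URLs with a protocol, digits, length, repeated openers).

```python
import re

CHECKS = {
    "markdown": re.compile(r"\*\*|__|`|^#{1,6}\s|^\s*[-*]\s", re.MULTILINE),
    "url_with_protocol": re.compile(r"https?://", re.IGNORECASE),
    "unspelled_number": re.compile(r"\d"),
}


def sentence_count(text: str) -> int:
    # Split on sentence-ending punctuation followed by whitespace or the end,
    # so the dots inside a URL or domain are not counted.
    parts = re.split(r"[.!?।]+(?:\s+|$)", text)
    return sum(1 for part in parts if part.strip())


def lint_agent_turn(text: str, max_sentences: int = 2) -> list[str]:
    """Return the names of the checks a single agent turn fails."""
    issues = [name for name, pattern in CHECKS.items() if pattern.search(text)]
    if sentence_count(text) > max_sentences:
        issues.append("too_long")
    return issues


def first_word(turn: str) -> str:
    words = turn.split()
    return words[0].strip(",.!?").lower() if words else ""


def repeated_openers(turns: list[str]) -> list[int]:
    """Indexes of turns that open with the same word as the turn before."""
    openers = [first_word(turn) for turn in turns]
    return [i for i in range(1, len(openers))
            if openers[i] and openers[i] == openers[i - 1]]


print(lint_agent_turn("**Your premium** is ₹1,500. See https://acme.in"))
print(repeated_openers(["Sure, I can help.", "Sure. What's the policy number?"]))
```

Tone, hallucinated facts, and callers repeating themselves still need a human ear; the script only narrows down which calls to open first.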
Evaluations
When you spot a specific user input that breaks the agent, pin it into the Evaluations tab as a test case. Define what the right behaviour should look like (a tone check, a “did it call the right tool”, a regex on the answer), then re-run the whole eval set after every prompt change. This is the difference between “I think the prompt is better now” and “the eval suite that failed twelve cases yesterday passes all twelve today.” Especially valuable when you have multiple people editing the prompt: without an eval suite, fixes for one bug quietly re-break others.
Versioning
Every save creates a new version under the Versions tab. If a prompt change makes things worse in production, roll back to the last known-good version in one click rather than trying to remember which line you edited. See versioning for the full workflow. Small prompt changes have surprisingly large effects on voice behaviour. Change one rule, run a fresh call, listen. Repeat.
Related
- Voice models — STT, TTS, and realtime model setup
- Knowledge — grounding agents in your content
- Evaluations — automated prompt-quality regression tests
- Workflows — when one prompt isn’t enough
- Memory and context — caller context that persists across calls
- Versioning — roll back prompt changes that misbehave

