Audio Ambience

Two perceived-latency hacks borrowed from production voice agents:

Background sound — a low-volume office-chatter loop running for the entire call. Masks silence, makes the agent feel “present in a room” rather than floating in a vacuum.
Thinking sound — a short keyboard-typing clip that plays while the agent is in its “thinking” state. Masks LLM time-to-first-token and the gaps that open up during tool calls / agent handoffs.

Both are powered by LiveKit’s BackgroundAudioPlayer under the hood. The agent’s TTS / Gemini Live audio plays on top — the ambience is mixed at low volume so it doesn’t compete with the spoken reply.
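
As a rough illustration of what "mixed at low volume" means, the sketch below adds a scaled ambience signal underneath speech samples. It assumes 16-bit PCM samples and a plain linear gain; the function name `mix_with_ambience` and its parameters are hypothetical and are not part of LiveKit's API, whose mixer may work differently.

```python
from typing import List

INT16_MIN, INT16_MAX = -32768, 32767


def mix_with_ambience(
    speech: List[int], ambience: List[int], ambience_gain: float = 0.3
) -> List[int]:
    """Add a quiet ambience signal under speech, clamping to the int16 range."""
    if not 0.0 <= ambience_gain <= 1.0:
        raise ValueError("ambience_gain must be between 0.0 and 1.0")
    mixed = []
    for i, speech_sample in enumerate(speech):
        # Loop the ambience clip when it is shorter than the speech buffer.
        ambience_sample = ambience[i % len(ambience)] if ambience else 0
        total = speech_sample + int(ambience_sample * ambience_gain)
        # Clamp so loud passages cannot overflow the 16-bit range.
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed
```

With a gain of 0.3 the ambience contributes less than a third of its original amplitude, which is why the spoken reply stays clearly on top.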

Where to find it

In Agent Studio → Voice Configuration → Advanced → Call Control, toggle the Audio Ambience card on. Click the gear icon to open the configuration modal — every setting lives there.

Configuration

Background Sound

Option	Behaviour
Off	No background sound.
Office Chatter (built-in)	Subtle call-centre chatter loop. Plays continuously throughout the call.
Custom	Upload your own `.mp3` or `.wav` (up to 25 MB).

A volume slider (0–100%, default 30%) sits below the picker.

Thinking Sound

Option	Behaviour
Off	No thinking sound.
Keyboard Typing (built-in)	Plays while the agent is in its “thinking” state, i.e. waiting on the LLM, a tool call, or an agent handoff.
Custom	Upload your own `.mp3` or `.wav` (up to 25 MB).

A volume slider (0–100%, default 50%) sits below the picker. The thinking sound is triggered automatically by LiveKit’s session events, so you don’t have to wire anything to make it play during tool calls or handoffs.
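
The two slots and their defaults can be pictured as a small config object. This is only a sketch: the class names, field names, and mode strings (`SlotConfig`, `AudioAmbienceConfig`, `"off"`, `"builtin"`, `"custom"`) are assumptions made for illustration, not the platform's stored schema. Only the options and default volumes come from the tables above.

```python
from dataclasses import dataclass, field
from typing import Optional

MODES = ("off", "builtin", "custom")  # hypothetical names for the three options


@dataclass
class SlotConfig:
    mode: str = "builtin"
    volume_percent: int = 30
    custom_url: Optional[str] = None  # only meaningful when mode == "custom"

    def validate(self) -> None:
        if self.mode not in MODES:
            raise ValueError(f"unknown mode: {self.mode!r}")
        if not 0 <= self.volume_percent <= 100:
            raise ValueError("volume_percent must be between 0 and 100")
        if self.mode == "custom" and not self.custom_url:
            raise ValueError("custom mode needs an uploaded clip URL")

    def gain(self) -> float:
        """Convert the 0-100% slider value to a 0.0-1.0 gain; 'off' is silent."""
        self.validate()
        return 0.0 if self.mode == "off" else self.volume_percent / 100


@dataclass
class AudioAmbienceConfig:
    # Defaults mirror the sliders: 30% for background, 50% for thinking.
    background: SlotConfig = field(
        default_factory=lambda: SlotConfig(volume_percent=30)
    )
    thinking: SlotConfig = field(
        default_factory=lambda: SlotConfig(volume_percent=50)
    )
```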

Custom audio uploads

Click Custom for either slot to enable the upload zone. Drag in a `.mp3` or `.wav` file (or click to pick one from disk). The file is uploaded to the same audio asset endpoint used by the rest of the platform, and its URL is stored on the agent config. You can preview the uploaded clip with the play button before saving. Use the trash icon to remove the uploaded file.
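
If you prepare clips in a script before uploading, a quick local check against the stated limits can save a failed upload. The helper below is a hypothetical sketch, not the platform's validator. It assumes "25 MB" means 25 × 1024 × 1024 bytes, which may differ from how the upload endpoint counts.

```python
import os
from typing import List

ALLOWED_EXTENSIONS = {".mp3", ".wav"}
# Assumption: 25 MB is read as 25 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes <= 0:
        problems.append("file is empty")
    elif size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 25 MB")
    return problems
```

For a file on disk, `check_upload(path, os.path.getsize(path))` runs both checks in one call.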

Audio Ambience is best tested on a real phone call. In a browser test (laptop speakers and mic, no echo cancellation), the agent can hear its own ambience and treat it as user speech. Use headphones, or test from a phone.

How it interacts with the audio path

The ambience and thinking clips are published on the agent’s outbound audio track. They don’t go through the noise-cancellation path (NC only processes inbound caller audio).
The mixer waits ~3 seconds after the session starts before the background streams attach, so the greeting TTS has time to drive the audio path. Without that delay, LiveKit’s audio mixer drops the background streams on first-frame timeout.
When the call ends, the player is closed cleanly and any temp-downloaded custom audio files are deleted.
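
The start-up delay and end-of-call cleanup described above can be sketched as a small lifecycle object. Everything here is illustrative: `AmbienceLifecycle`, `stage_custom_clip`, and the `attached` flag are hypothetical stand-ins, and the real player's internals are not shown. The 3-second default follows the approximate figure given above.

```python
import asyncio
import os
import tempfile
from typing import List


class AmbienceLifecycle:
    """Toy model: delayed attach at session start, temp-file cleanup at call end."""

    def __init__(self, attach_delay: float = 3.0) -> None:
        self.attach_delay = attach_delay
        self.attached = False
        self.temp_files: List[str] = []

    def stage_custom_clip(self, data: bytes, suffix: str = ".wav") -> str:
        # Stand-in for downloading a custom clip to a temporary file.
        fd, path = tempfile.mkstemp(suffix=suffix)
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        self.temp_files.append(path)
        return path

    async def start(self) -> None:
        # Wait before attaching so the greeting can start driving the audio path.
        await asyncio.sleep(self.attach_delay)
        self.attached = True

    async def close(self) -> None:
        self.attached = False
        for path in self.temp_files:
            try:
                os.remove(path)
            except FileNotFoundError:
                pass  # already gone; nothing left to clean up
        self.temp_files.clear()
```

Making `close` tolerate missing files keeps cleanup safe to call more than once, for example when a call ends during start-up.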

Cost

Audio Ambience is free — there’s no per-minute charge.

Quick start

Open your agent in Agent Studio → Voice Configuration.
Go to Advanced → Call Control.
Toggle Audio Ambience on. Both slots default to the built-in clips (office chatter for background, keyboard typing for thinking).
Click the gear icon if you want to upload your own clips, change volumes, or pick “Off” for either slot.
Save and test on a real phone (or with headphones).

Next Steps

Noise Cancellation — clean up inbound caller audio.
Voice Configuration — provider, voice, interruption settings.
