# Noise Cancellation

> Pick a noise-cancellation engine for inbound caller audio and dial in how aggressive it should be — four open-source engines unified behind one strength slider.

Inbound caller audio almost always has background noise — markets,
traffic, family chatter, fan hum. Noise cancellation cleans up that
audio **before** it reaches the speech recognizer, which improves
transcript accuracy and reduces misinterpretations.

thinnestAI ships **four open-source noise-cancellation engines** out of
the box, each with different strengths. They sit behind a single
**Strength** slider so you don't have to learn a new control surface
when you switch engines.

## Where to find it

In **Agent Studio → Voice Configuration → Advanced → Voice Settings**,
toggle **Noise Cancellation** on. An engine picker and strength slider
appear right below.

## Engines

### GTCRN (recommended default)

Grouped Temporal Convolutional Recurrent Network. A 535 KB MIT-licensed
streaming ONNX model that runs at 16 kHz on CPU.

* **Best for:** general use. The lightest engine and the gentlest on
  quiet or short utterances ("twenty-fifth", "five PM") — these get
  clipped by more aggressive engines.
* **CPU:** lowest of the four.
* **License:** MIT.

This is the default for new agents.

### Hush

Weya AI's two-stage stack: DeepFilterNet3 + an auxiliary speaker
separation head. Suppresses background noise **and** competing voices
(markets, busy households, family in the background).

* **Best for:** noisy environments where multiple people are talking.
* **CPU:** highest of the four.
* **License:** MIT (DeepFilterNet) + Apache 2.0.

If the Hush model bundle isn't available on a deployment, the bot
silently falls back to GTCRN.

### RNNoise

Mozilla / Xiph's RNNoise via the `pyrnnoise` wrapper.

* **Best for:** the leanest CPU footprint when steady-state noise is
  the main problem (fan hum, AC).
* **CPU:** very low.
* **License:** Apache 2.0.

RNNoise lets more steady-state noise through than GTCRN, so pick it
when the call host is CPU-constrained.

### DTLN

Dual-Signal Transformation LSTM Network. Legacy / opt-in.

* **Best for:** backward compatibility with agents saved before
  GTCRN became the default.
* **Caveat:** known to over-suppress short user replies (e.g. "25th",
  "tomorrow") on multi-turn calls. Prefer GTCRN unless you have a
  specific reason.

## Strength slider

Every engine is wrapped in a **wet/dry blend adapter** so the slider
behaves the same regardless of which engine you pick:

| Strength          | Behaviour                                                                                        |
| ----------------- | ------------------------------------------------------------------------------------------------ |
| **0%**            | Engine is bypassed entirely — same CPU as "Off".                                                 |
| **30%** (default) | Empirically validated sweet spot — preserves quiet/short utterances, removes most ambient noise. |
| **70%**           | Aggressive — removes more steady noise, but may clip quiet syllables on short replies.           |
| **100%**          | Pure denoised output.                                                                            |

The default is **30%** because, in multi-turn testing, 70% strength
dropped short utterances ("25th", "five PM") often enough that
Deepgram Nova-3 produced only an interim transcript and never committed
a final one within the 2 s STT timeout.

If you switch engines, the strength setting carries over.
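
The wet/dry blend described above can be sketched in a few lines. This is a minimal illustration, not thinnestAI's actual adapter: it assumes a plain linear mix between the original ("dry") frame and the engine's output ("wet"), and the names `blend_wet_dry`, `process_frame`, and `denoise` are hypothetical.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # one frame of mono PCM samples as floats


def blend_wet_dry(dry: Sequence[float], wet: Sequence[float], strength: float) -> Frame:
    """Linearly mix the original (dry) and denoised (wet) samples.

    strength=0.0 returns the dry signal, strength=1.0 returns the wet signal.
    The linear mix is an assumption; the source only names a wet/dry blend.
    """
    if len(dry) != len(wet):
        raise ValueError("dry and wet frames must have the same length")
    s = min(max(strength, 0.0), 1.0)  # clamp the slider value to [0, 1]
    return [(1.0 - s) * d + s * w for d, w in zip(dry, wet)]


def process_frame(
    frame: Sequence[float],
    denoise: Callable[[Sequence[float]], Sequence[float]],
    strength: float = 0.30,  # documented default of 30%
) -> Frame:
    """Run a (hypothetical) engine callable and blend its output with the input."""
    if strength <= 0.0:
        # 0% bypasses the engine entirely, so no denoising CPU is spent.
        return list(frame)
    return blend_wet_dry(frame, denoise(frame), strength)
```

Because the blend sits outside the engine, any callable that maps a frame to a denoised frame of the same length can be passed as `denoise`, which is consistent with the slider behaving the same for every engine. A production adapter would presumably also need to keep the dry and wet signals time-aligned if the engine adds latency; that is not shown here.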

## How it interacts with other features

* **Speech-to-Speech mode:** Noise cancellation runs on the inbound
  audio path before it reaches Gemini Live, so all four engines work
  the same way in S2S as in Cascaded mode.
* **Audio Ambience:** Background sound is published on the *outbound*
  agent-track and is unaffected by noise cancellation (which only
  processes inbound caller audio).
* **Krisp / BVC** (LiveKit Cloud): Listed in the type for
  backward compatibility with saved configs, but not selectable on
  self-hosted deployments. Saved configs that reference them are
  silently re-routed to GTCRN at runtime, with no Krisp surcharge billed.
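
The fallback rules on this page (legacy Krisp / BVC configs re-routed to GTCRN, and Hush falling back to GTCRN when its model bundle is missing) can be summarized as a small resolution function. This is a sketch under stated assumptions: the engine identifier strings, the `hush_available` flag, and the function name `resolve_engine` are hypothetical and are not taken from the thinnestAI codebase.

```python
# Hypothetical identifiers for the four selectable engines.
SELECTABLE_ENGINES = {"gtcrn", "hush", "rnnoise", "dtln"}
# Legacy LiveKit Cloud values that may still appear in saved configs.
LEGACY_CLOUD_ENGINES = {"krisp", "bvc"}
DEFAULT_ENGINE = "gtcrn"


def resolve_engine(saved_engine: str, hush_available: bool = True) -> str:
    """Map a saved engine name to the engine that would run at call time."""
    name = saved_engine.strip().lower()
    if name in LEGACY_CLOUD_ENGINES:
        # Saved Krisp / BVC configs are silently re-routed to GTCRN.
        return DEFAULT_ENGINE
    if name == "hush" and not hush_available:
        # Hush falls back to GTCRN when its model bundle is not deployed.
        return DEFAULT_ENGINE
    if name in SELECTABLE_ENGINES:
        return name
    raise ValueError(f"unknown noise-cancellation engine: {saved_engine!r}")
```

For example, `resolve_engine("krisp")` and `resolve_engine("hush", hush_available=False)` both resolve to `"gtcrn"`, while `resolve_engine("rnnoise")` is returned unchanged.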

## Picking an engine — quick guide

| Caller environment                                    | Engine              |
| ----------------------------------------------------- | ------------------- |
| Office / home / general                               | **GTCRN** (default) |
| Market / café / noisy household                       | **Hush**            |
| Steady fan / AC noise on a CPU-tight box              | **RNNoise**         |
| You're migrating an old agent that was tuned for DTLN | **DTLN**            |

## Next Steps

* [Voice Configuration](/docs/voice/voice-configuration) — pick STT/TTS providers for the cascaded pipeline.
* [Audio Ambience](/docs/voice/ambience) — add background sound on the outbound agent path.
* [Speech-to-Speech](/docs/voice/speech-to-speech) — switch to a single Gemini Live realtime model.
