Noise Cancellation
Inbound caller audio almost always has background noise: markets, traffic, family chatter, fan hum. Noise cancellation cleans up that audio before it reaches the speech recognizer, which dramatically improves transcript accuracy and reduces misinterpretations. thinnestAI ships four open-source noise-cancellation engines out of the box, each with different strengths. They sit behind a single Strength slider, so you don’t have to learn a new control surface when you switch engines.
Where to find it
In Agent Studio → Voice Configuration → Advanced → Voice Settings, toggle Noise Cancellation on. An engine picker and a strength slider appear directly below it.
Engines
GTCRN (recommended default)
Grouped Temporal Convolutional Recurrent Network. A 535 KB, MIT-licensed streaming ONNX model that runs at 16 kHz on CPU.
- Best for: general use. The lightest engine and the gentlest on quiet or short utterances (“twenty-fifth”, “five PM”), which more aggressive engines tend to clip.
- CPU: lowest of the four.
- License: MIT.
Hush
Weya AI’s two-stage stack: DeepFilterNet3 plus an auxiliary speaker-separation head. Suppresses background noise and competing voices (markets, busy households, family in the background).
- Best for: noisy environments where multiple people are talking.
- CPU: highest of the four.
- License: MIT (DeepFilterNet) + Apache 2.0.
RNNoise
Mozilla / Xiph’s RNNoise via the pyrnnoise wrapper.
- Best for: the leanest CPU footprint when steady-state noise is the main problem (fan hum, AC).
- CPU: very low.
- License: Apache 2.0.
DTLN
Dual-Signal Transformation LSTM Network. Legacy / opt-in.
- Best for: backward compatibility with agents saved before GTCRN became the default.
- Caveat: known to over-suppress short user replies (e.g. “25th”, “tomorrow”) on multi-turn calls. Prefer GTCRN unless you have a specific reason.
Strength slider
Every engine is wrapped in a wet/dry blend adapter, so the slider behaves the same regardless of which engine you pick:
| Strength | Behaviour |
|---|---|
| 0% | Engine is bypassed entirely — same CPU as “Off”. |
| 30% (default) | Empirically validated sweet spot — preserves quiet/short utterances, removes most ambient noise. |
| 70% | Aggressive — removes more steady noise, but may clip quiet syllables on short replies. |
| 100% | Pure denoised output. |
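The wet/dry blend behind the slider can be pictured as a per-sample mix of the original ("dry") and denoised ("wet") audio. The sketch below is illustrative only and is not thinnestAI's actual adapter: the function name `blend_wet_dry`, the `denoise` callable, and the linear mixing law are all assumptions made for the example.

```python
from typing import Callable, List, Sequence

Frame = Sequence[float]


def blend_wet_dry(
    dry: Frame,
    denoise: Callable[[Frame], Frame],
    strength: float,
) -> List[float]:
    """Mix the original (dry) frame with the denoised (wet) frame.

    `strength` is the slider value as a fraction (0.0 = 0%, 1.0 = 100%).
    Hypothetical helper: a simple linear mix is assumed here.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    if strength == 0.0:
        # Bypass: the engine is never called, so the cost matches "Off".
        return list(dry)
    wet = denoise(dry)
    if len(wet) != len(dry):
        raise ValueError("denoiser must return a frame of the same length")
    # Linear crossfade: strength 1.0 yields the pure denoised output.
    return [(1.0 - strength) * d + strength * w for d, w in zip(dry, wet)]
```

Under these assumptions, a stand-in denoiser that returns silence halves every sample at a strength of 0.5, and a strength of 0.0 returns the input untouched without calling the denoiser at all.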
How it interacts with other features
- Speech-to-Speech mode: Noise cancellation runs on the inbound audio path before it reaches Gemini Live, so all four engines work the same way in S2S as in Cascaded mode.
- Audio Ambience: Background sound is published on the outbound agent-track and is unaffected by noise cancellation (which only processes inbound caller audio).
- Krisp / BVC (LiveKit Cloud): Still listed in the engine type for backward compatibility with saved configs, but not selectable on self-hosted deployments. Saved configs that reference them are silently re-routed to GTCRN at runtime, and no Krisp surcharge is billed.
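The re-routing of legacy Krisp / BVC values can be thought of as a small normalization step applied when a saved config is loaded. The following is a rough sketch under stated assumptions: the engine identifiers (`"gtcrn"`, `"hush"`, `"rnnoise"`, `"dtln"`, `"krisp"`, `"bvc"`) and the function `resolve_engine` are hypothetical names, not thinnestAI's actual schema, and rejecting unknown values is a choice made for the example.

```python
# Hypothetical identifiers; the real config values may differ.
SELECTABLE_ENGINES = {"gtcrn", "hush", "rnnoise", "dtln"}
LEGACY_CLOUD_ENGINES = {"krisp", "bvc"}
DEFAULT_ENGINE = "gtcrn"


def resolve_engine(saved_value: str) -> str:
    """Map a saved engine value to the engine that would run."""
    name = saved_value.strip().lower()
    if name in SELECTABLE_ENGINES:
        return name
    if name in LEGACY_CLOUD_ENGINES:
        # Silent fallback: no error or warning is raised for the caller.
        return DEFAULT_ENGINE
    # Assumption for this sketch: anything else is a config error.
    raise ValueError(f"unknown noise-cancellation engine: {saved_value!r}")
```

In this sketch, `resolve_engine("krisp")` and `resolve_engine("bvc")` both fall back to the default engine, while the four selectable engines pass through unchanged.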
Picking an engine — quick guide
| Caller environment | Engine |
|---|---|
| Office / home / general | GTCRN (default) |
| Market / café / noisy household | Hush |
| Steady fan / AC noise on a CPU-tight box | RNNoise |
| You’re migrating an old agent that was tuned for DTLN | DTLN |
Next Steps
- Voice Configuration — pick STT/TTS providers for the cascaded pipeline.
- Audio Ambience — add background sound on the outbound agent path.
- Speech-to-Speech — switch to a single Gemini Live realtime model.

