Noise Cancellation

Inbound caller audio almost always has background noise — markets, traffic, family chatter, fan hum. Noise cancellation cleans up that audio before it reaches the speech recognizer, which dramatically improves transcript accuracy and reduces misinterpretations. thinnestAI ships four open-source noise-cancellation engines out of the box, each with different strengths. They sit behind a single Strength slider so you don’t have to learn a new control surface when you switch engines.

Where to find it

In Agent Studio → Voice Configuration → Advanced → Voice Settings, toggle Noise Cancellation on. An engine picker and strength slider appear right below.

Engines

GTCRN

Grouped Temporal Convolutional Recurrent Network. A 535 KB, MIT-licensed streaming ONNX model that runs at 16 kHz on CPU.
  • Best for: general use. The lightest engine and the gentlest on quiet or short utterances (“twenty-fifth”, “five PM”), which more aggressive engines tend to clip.
  • CPU: lowest of the four.
  • License: MIT.
This is the default for new agents.

Hush

Weya AI’s two-stage stack: DeepFilterNet3 + an auxiliary speaker separation head. Suppresses background noise and competing voices (markets, busy households, family in the background).
  • Best for: noisy environments where multiple people are talking.
  • CPU: highest of the four.
  • License: MIT (DeepFilterNet) + Apache 2.0.
If the Hush model bundle isn’t available on a deployment, the bot silently falls back to GTCRN.

RNNoise

Mozilla / Xiph’s RNNoise via the pyrnnoise wrapper.
  • Best for: the leanest CPU footprint when steady-state noise is the main problem (fan hum, AC).
  • CPU: very low.
  • License: Apache 2.0.
Lets more steady-state noise through than GTCRN, so pick RNNoise when the call host is CPU-constrained.

DTLN

Dual-Signal Transformation LSTM Network. Legacy / opt-in.
  • Best for: backward-compatibility with agents saved before GTCRN became the default.
  • Caveat: known to over-suppress short user replies (e.g. “25th”, “tomorrow”) on multi-turn calls. Prefer GTCRN unless you have a specific reason.

Strength slider

Every engine is wrapped in a wet/dry blend adapter so the slider behaves the same regardless of which engine you pick:
| Strength | Behaviour |
| --- | --- |
| 0% | Engine is bypassed entirely; same CPU as “Off”. |
| 30% (default) | Empirically validated sweet spot: preserves quiet/short utterances, removes most ambient noise. |
| 70% | Aggressive: removes more steady noise, but may clip quiet syllables on short replies. |
| 100% | Pure denoised output. |

The default is 30% because, in multi-turn testing, 70% strength dropped short utterances (“25th”, “five PM”) often enough that Deepgram Nova-3 produced only an interim transcript and never committed a final one within the 2 s STT timeout. The strength setting carries over when you switch engines.
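
The wet/dry behaviour in the table above can be sketched as follows. This is a minimal illustration, not thinnestAI's implementation: it assumes a linear mix, assumes the engine returns a frame that is time-aligned with its input, and uses hypothetical names (`blend_wet_dry`, `process_frame`, `denoise`).

```python
from typing import Callable, List, Sequence

Frame = Sequence[float]  # one block of mono PCM samples in [-1.0, 1.0]


def blend_wet_dry(dry: Frame, wet: Frame, strength: float) -> List[float]:
    """Linear mix (an assumption): 0.0 keeps the raw input, 1.0 keeps only the denoised output."""
    if len(dry) != len(wet):
        raise ValueError("dry and wet frames must have the same length")
    s = min(max(strength, 0.0), 1.0)  # clamp the slider value to [0, 1]
    return [(1.0 - s) * d + s * w for d, w in zip(dry, wet)]


def process_frame(
    frame: Frame,
    denoise: Callable[[Frame], Frame],  # hypothetical stand-in for any of the four engines
    strength: float = 0.30,             # 30% default from the table above
) -> List[float]:
    """Run one frame through the engine and blend it with the original audio."""
    if strength <= 0.0:
        # At 0% the engine is never called, so the CPU cost matches "Off".
        return list(frame)
    wet = denoise(frame)
    return blend_wet_dry(frame, wet, strength)
```

Because the adapter only needs a `denoise` callable, swapping engines leaves the slider semantics unchanged, which is the property the Strength slider relies on.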

How it interacts with other features

  • Speech-to-Speech mode: Noise cancellation runs on the inbound audio path before it reaches Gemini Live, so all four engines work the same way in S2S as in Cascaded mode.
  • Audio Ambience: Background sound is published on the outbound agent-track and is unaffected by noise cancellation (which only processes inbound caller audio).
  • Krisp / BVC (LiveKit Cloud): Still listed in the engine type for backward compatibility with saved configs, but not selectable on self-hosted deployments. Saved configs that reference them are silently re-routed to GTCRN at runtime, and no Krisp surcharge is billed.
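
The two silent fallbacks described on this page (Hush without its model bundle, and saved Krisp / BVC configs) can be pictured as one resolution step at call start. The sketch below is an assumption about how such routing might look; the engine identifiers and the `resolve_engine` function are hypothetical and not taken from the product's API.

```python
# Hypothetical identifiers for illustration only.
SELECTABLE_ENGINES = {"gtcrn", "hush", "rnnoise", "dtln"}
LEGACY_CLOUD_ENGINES = {"krisp", "bvc"}  # kept only so old saved configs still load
DEFAULT_ENGINE = "gtcrn"


def resolve_engine(saved_engine: str, hush_bundle_available: bool) -> str:
    """Map the engine stored in an agent config to the engine actually run."""
    name = saved_engine.strip().lower()
    if name in LEGACY_CLOUD_ENGINES:
        # Saved Krisp / BVC configs are re-routed to the default engine.
        return DEFAULT_ENGINE
    if name == "hush" and not hush_bundle_available:
        # Hush falls back when its model bundle is missing on this deployment.
        return DEFAULT_ENGINE
    if name in SELECTABLE_ENGINES:
        return name
    raise ValueError(f"unknown noise-cancellation engine: {saved_engine!r}")
```

For example, `resolve_engine("krisp", hush_bundle_available=True)` and `resolve_engine("hush", hush_bundle_available=False)` both resolve to `"gtcrn"` in this sketch.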

Picking an engine — quick guide

| Caller environment | Engine |
| --- | --- |
| Office / home / general | GTCRN (default) |
| Market / café / noisy household | Hush |
| Steady fan / AC noise on a CPU-tight box | RNNoise |
| You’re migrating an old agent that was tuned for DTLN | DTLN |
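
If you provision agents from a script, the quick guide above reduces to a small lookup. The following is only an illustrative sketch: the `CallerEnvironment` categories and the `recommend_engine` helper are hypothetical names, not part of any thinnestAI SDK.

```python
from enum import Enum


class CallerEnvironment(Enum):
    """Hypothetical categories mirroring the rows of the quick guide."""
    GENERAL = "office / home / general"
    NOISY_MULTI_SPEAKER = "market / cafe / noisy household"
    STEADY_NOISE_LOW_CPU = "steady fan / AC noise on a CPU-tight box"
    LEGACY_DTLN_AGENT = "old agent tuned for DTLN"


# Engine recommended for each environment, as listed in the quick guide.
RECOMMENDED_ENGINE = {
    CallerEnvironment.GENERAL: "GTCRN",
    CallerEnvironment.NOISY_MULTI_SPEAKER: "Hush",
    CallerEnvironment.STEADY_NOISE_LOW_CPU: "RNNoise",
    CallerEnvironment.LEGACY_DTLN_AGENT: "DTLN",
}


def recommend_engine(environment: CallerEnvironment) -> str:
    """Return the engine the quick guide suggests for this caller environment."""
    return RECOMMENDED_ENGINE[environment]
```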
