Knowledge Bases

Knowledge bases let you give your agents access to your own data — documents, websites, FAQs, product information, and more. Instead of relying solely on the LLM’s training data, your agent can search your content and provide accurate, up-to-date answers grounded in your information.

How Knowledge Works

thinnestAI uses Retrieval Augmented Generation (RAG) to connect your documents to your agents:

You upload content — PDFs, web pages, text, or other sources.
thinnestAI processes it — The content is chunked, embedded, and indexed for search.
A user asks a question — Your agent receives the message.
The agent searches your knowledge base — Finding the most relevant chunks of content.
The agent generates a response — Using the retrieved content to formulate an accurate answer.

User: "What's your refund policy?"
            │
            ▼
    ┌───────────────────┐
    │  Agent searches    │
    │  knowledge base    │
    │  for "refund       │
    │  policy"           │
    └───────┬───────────┘
            │ Finds relevant chunks from your docs
            ▼
    ┌───────────────────┐
    │  Agent generates   │
    │  response using    │
    │  your actual       │
    │  policy document   │
    └───────────────────┘
            │
            ▼
Agent: "Our refund policy allows full refunds within 30 days
        of purchase. After 30 days, we offer store credit..."

This grounds the agent’s answers in your own content rather than in the model’s training data, which reduces the risk of hallucinated information.
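The final step of the flow above, generating a response from retrieved content, can be pictured as prompt assembly. The following is a minimal sketch, not thinnestAI's actual implementation: the helper name `build_grounded_prompt` and the prompt wording are assumptions made for illustration.

```python
def build_grounded_prompt(question: str, chunks: list[str]) -> str:
    """Assemble an LLM prompt that restricts the answer to retrieved chunks.

    Hypothetical helper for illustration; the platform's real prompt
    format is not documented here.
    """
    if not chunks:
        context = "(no relevant content found)"
    else:
        # Number the chunks so the model can refer back to them.
        context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


prompt = build_grounded_prompt(
    "What's your refund policy?",
    [
        "Full refunds are available within 30 days of purchase.",
        "After 30 days, we offer store credit.",
    ],
)
print(prompt)
```

Passing an empty chunk list produces a prompt that tells the model no content was found, which is one simple way to discourage an invented answer.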

Creating a Knowledge Base

Step 1: Navigate to Knowledge

Go to the Knowledge section in the thinnestAI dashboard.
Click Create Knowledge Base.
Give it a name and optional description:

Field	Example
Name	Product Documentation
Description	All product docs, FAQs, and support articles

Step 2: Add Sources

Click Add Source to upload your content. thinnestAI supports multiple source types:

Files — Upload PDF, DOCX, TXT, and more
URLs — Paste a web page URL to scrape its content
YouTube — Extract transcripts from YouTube videos
Text — Paste text directly
GitHub — Import a repository’s documentation
Azure Blob — Connect to Azure Blob Storage
SharePoint — Import from SharePoint sites
Excel — Import data from Excel workbooks

See Knowledge Sources for detailed instructions on each type.

Step 3: Wait for Processing

After adding sources, thinnestAI processes them:

Extraction — Content is extracted from the source format.
Chunking — Content is split into meaningful segments.
Embedding — Each chunk is converted into a vector embedding.
Indexing — Embeddings are stored for fast similarity search.

Processing typically takes a few seconds for text and up to a few minutes for large documents. You’ll see a progress indicator.
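To make the chunking, embedding, and indexing stages concrete, here is a toy version of that pipeline. It is a sketch under stated assumptions: the chunking rule (pack paragraphs up to a word budget), the bag-of-words "embedding", and every function name are stand-ins chosen for illustration, not the models or parameters thinnestAI uses.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 50) -> list[str]:
    """Split on blank lines, then pack paragraphs into chunks up to max_words.

    A single paragraph longer than max_words is kept whole.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current, count = [], [], 0
    for para in paragraphs:
        n = len(para.split())
        if current and count + n > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def embed(text: str) -> dict[str, float]:
    """Toy embedding: an L2-normalised bag-of-words vector (stand-in for a real model)."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {word: c / norm for word, c in counts.items()} if norm else {}


def build_index(text: str, max_words: int = 50) -> list[tuple[str, dict[str, float]]]:
    """The index is simply a list of (chunk, embedding) pairs."""
    return [(chunk, embed(chunk)) for chunk in chunk_text(text, max_words)]


def search(index, query: str, top_k: int = 1) -> list[str]:
    """Rank chunks by cosine similarity (dot product of normalised vectors)."""
    q = embed(query)
    scored = [
        (sum(weight * vec.get(word, 0.0) for word, weight in q.items()), chunk)
        for chunk, vec in index
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


doc = """Refund policy

Full refunds are available within 30 days of purchase.

Shipping

Orders ship within two business days."""

index = build_index(doc, max_words=11)
print(search(index, "refund within 30 days"))
```

Because each heading stays attached to the paragraph that follows it, the refund chunk and the shipping chunk end up separate, which is the same reason clearly sectioned documents tend to retrieve better.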

Assigning Knowledge to Agents

Once your knowledge base is ready, connect it to an agent:

Go to Agents and select your agent.
Scroll to the Knowledge section.
Click Add Knowledge Base.
Select the knowledge base you created.
Save.

An agent can have multiple knowledge bases. For example:

Product Docs — for product questions
Company Policies — for policy inquiries
FAQ — for common questions

The agent searches across all assigned knowledge bases automatically.

Via the API

curl -X PUT https://api.thinnest.ai/agents/{agent_id}/knowledge \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "knowledge_base_ids": ["kb_abc123", "kb_def456"]
  }'
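The same request can be built from Python with only the standard library. This sketch mirrors the curl call above and nothing more: the agent ID `agent_123` is a made-up placeholder, the helper name is hypothetical, and the response format is not covered here, so the request is constructed but not sent.

```python
import json
import urllib.request


def build_assign_request(agent_id: str, kb_ids: list[str], api_key: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request shown in the curl example."""
    body = json.dumps({"knowledge_base_ids": kb_ids}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://api.thinnest.ai/agents/{agent_id}/knowledge",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# "agent_123" is a placeholder ID used only for this example.
req = build_assign_request("agent_123", ["kb_abc123", "kb_def456"], "YOUR_API_KEY")
print(req.get_method(), req.full_url)

# To send it, pass the request to urllib.request.urlopen(req).
```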

How Search Works

thinnestAI uses hybrid search, which combines two complementary methods:

Semantic Search

Finds content based on meaning, not just exact keywords. If a user asks “How do I get my money back?”, semantic search will find your “Refund Policy” document even though the words don’t match exactly.

Full-Text Search

Traditional keyword matching that finds content containing the exact terms used. This catches specific names, product codes, and technical terms that semantic search might miss.

Hybrid Results

Both search methods run in parallel, and their results are combined and ranked by relevance. This helps your agent find the right information whether the user asks casually or uses precise terminology.
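This page does not say how the two result lists are merged. One common technique for combining rankings is reciprocal rank fusion (RRF), sketched below purely as an illustration of the idea; it is an assumption, not a description of thinnestAI's ranking, and the document IDs are invented.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists of document IDs; a higher fused score ranks earlier."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score (descending), breaking ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


semantic_hits = ["refund-policy", "returns-faq", "shipping"]  # ranked by meaning
keyword_hits = ["sku-4411", "refund-policy"]                  # ranked by exact terms

print(reciprocal_rank_fusion([semantic_hits, keyword_hits]))
```

A document that appears in both lists accumulates score from each, so it rises to the top, while a keyword-only hit such as a product code still makes it into the merged results.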

How Agents Access the KB

The retrieval mechanism depends on the agent’s voice mode:

Voice mode	Mechanism	Latency
Cascaded (STT → LLM → TTS)	Auto-injection. Every user turn triggers a hybrid search before the LLM responds; the top passages are placed in the model’s context. The LLM sees the KB chunks alongside the user’s question and answers from them.	~300-700 ms hidden in the cascaded turn
Speech-to-Speech (Gemini Live / OpenAI Realtime)	Function calling. A `search_knowledge_base` tool is registered automatically when knowledge is attached; the realtime model calls it when it detects a knowledge-relevant question.	~300-600 ms tool round-trip on KB-related turns; 0 ms on chitchat
Chat (text agents)	Auto-injection on every user message.	~300-700 ms per message

In all three cases the underlying search is identical (hybrid semantic + full-text, tenant-isolated to the agent’s assigned source IDs); only the trigger differs. See Speech-to-Speech → Knowledge Base in S2S mode for the S2S-specific details.
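The difference between the two triggers can be sketched as follows. Only the tool name `search_knowledge_base` comes from this page; the function names, the `query` argument, and the context format are assumptions for illustration, and `fake_search` stands in for the real hybrid search.

```python
from typing import Callable

Search = Callable[[str], list[str]]


def auto_inject(user_message: str, search: Search) -> str:
    """Cascaded and chat modes: search on every turn and place passages in the context."""
    passages = search(user_message)
    context = "\n".join(passages)
    return f"Context:\n{context}\n\nUser: {user_message}"


def handle_tool_call(name: str, arguments: dict, search: Search) -> list[str]:
    """Speech-to-speech mode: search runs only when the model calls the tool."""
    if name != "search_knowledge_base":
        raise ValueError(f"unknown tool: {name}")
    # The "query" argument name is an assumption for this sketch.
    return search(arguments["query"])


def fake_search(query: str) -> list[str]:
    # Stand-in for the hybrid search; always returns one fixed passage.
    return ["Full refunds are available within 30 days of purchase."]


print(auto_inject("What's your refund policy?", fake_search))
print(handle_tool_call("search_knowledge_base", {"query": "refund policy"}, fake_search))
```

In both paths the same search function is called, which matches the point above that only the trigger differs.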

When retrieval is slow or comes up empty

Knowledge retrieval is designed to fail gracefully — a call never stalls on it:

Slow lookup — if a search takes unusually long, the agent still answers within a few seconds rather than going silent and waiting indefinitely. The caller always gets a timely response.
No match — when the knowledge base has nothing relevant to the question, the agent says so plainly (“I don’t have that detail”) instead of inventing an answer.
Service hiccup — if the knowledge base can’t be reached for a moment, the agent acknowledges it (“I’m having trouble pulling that up right now”) and offers to follow up — it won’t guess at prices, policies, or other specifics.

This keeps the agent honest: it answers from your content when it can, and is upfront when it can’t.
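The three behaviours above amount to a timeout plus explicit fallbacks. The sketch below shows one way to express that pattern; the two-second default, the function name, and the choice to reuse the "having trouble" message for slow lookups are assumptions, not documented platform settings.

```python
import concurrent.futures
import time

NO_MATCH = "I don't have that detail."
UNAVAILABLE = "I'm having trouble pulling that up right now."


def retrieve_with_fallback(search, query, timeout_s=2.0):
    """Return (passages, fallback_message); the message is None when passages were found."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(search, query)
    try:
        passages = future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # Slow lookup: stop waiting so the caller gets a timely reply.
        return [], UNAVAILABLE
    except Exception:
        # Service hiccup: acknowledge the problem instead of guessing.
        return [], UNAVAILABLE
    finally:
        pool.shutdown(wait=False)
    if not passages:
        # No match: say so plainly.
        return [], NO_MATCH
    return passages, None


# Stub search functions that simulate each situation.
def ok(query):
    return ["Refunds within 30 days."]


def empty(query):
    return []


def broken(query):
    raise ConnectionError("knowledge base unreachable")


def slow(query):
    time.sleep(0.3)
    return ["too late"]


for fn in (ok, empty, broken):
    print(fn.__name__, retrieve_with_fallback(fn, "refund policy"))
print("slow", retrieve_with_fallback(slow, "refund policy", timeout_s=0.05))
```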

Knowledge Isolation

By default, knowledge isolation is enabled to prevent cross-agent data leakage. Each agent’s vector search is scoped to its own knowledge sources, ensuring that Agent A cannot accidentally access Agent B’s private documents. This is important for:

Multi-tenant deployments — Different customers’ agents never see each other’s data.
Security-sensitive use cases — Legal, healthcare, or financial agents with strict data boundaries.
Team setups — Each team member agent searches only its own assigned knowledge, not the entire organization’s data.

Knowledge isolation works automatically. When an agent performs a vector search, the query includes a filter for the agent’s assigned knowledge base IDs. No additional configuration is needed. If you want agents to share a knowledge base, simply assign the same knowledge base to multiple agents. They will both have access to the shared content while still being isolated from any knowledge bases not explicitly assigned to them.
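The filtering described above can be illustrated with a toy in-memory store. This is a sketch of the idea only: the `Chunk` type, the `scoped_search` function, the keyword match, and the `kb_shared` ID are all hypothetical and do not reflect the platform's storage or query layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    kb_id: str
    text: str


def scoped_search(chunks: list[Chunk], query: str, assigned_kb_ids: set[str]) -> list[Chunk]:
    """Return matching chunks, considering only the agent's assigned knowledge bases."""
    terms = query.lower().split()
    return [
        chunk
        for chunk in chunks
        if chunk.kb_id in assigned_kb_ids  # isolation filter
        and any(term in chunk.text.lower() for term in terms)  # toy keyword match
    ]


store = [
    Chunk("kb_abc123", "Refunds are issued within 30 days."),
    Chunk("kb_def456", "Internal refund escalation contacts."),
    Chunk("kb_shared", "Refund requests go through the support portal."),
]

agent_a = {"kb_abc123", "kb_shared"}
agent_b = {"kb_def456", "kb_shared"}

print([c.kb_id for c in scoped_search(store, "refund", agent_a)])
print([c.kb_id for c in scoped_search(store, "refund", agent_b)])
```

Both agents see the shared knowledge base, but neither sees the other's private one, because the filter is applied before any matching happens.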

Tips for Better Knowledge

Be specific — Focused, well-organized documents perform better than giant catch-all files.
Use headings — Documents with clear headings and sections help chunking produce better results.
Keep it current — Update your knowledge base when your content changes. Delete outdated sources.
Test with questions — After adding sources, test by asking your agent questions you’d expect users to ask.
Separate by topic — Use multiple knowledge bases for different topics (product, policy, technical) rather than one massive knowledge base.

What’s Next

Knowledge Sources — Detailed guide on adding each type of source.
Supported Formats & Limits — File formats, size limits, and best practices.
