Chat

The Chat API is how your users interact with thinnestAI agents. Send a message, get a response — it's that simple. Under the hood, you get streaming, session persistence, and intelligent rate limiting.

Quick Start

Send a message to your agent:

curl -X POST https://api.thinnest.ai/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "agent_abc123",
    "message": "What are your business hours?",
    "session_id": "session_xyz789"
  }'

Chat API Endpoint

POST /chat

Send a message to an agent and receive a response.

Request Body:

Field       Type     Required  Description
agent_id    string   Yes       The ID of the agent to chat with
message     string   Yes       The user's message
session_id  string   No        Session ID for conversation continuity. If omitted, a new session is created.
stream      boolean  No        Enable streaming responses (default: true)
metadata    object   No        Additional context to pass to the agent

Example Request:

curl -X POST https://api.thinnest.ai/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "agent_abc123",
    "message": "Tell me about your return policy",
    "session_id": "session_xyz789",
    "stream": true
  }'

Non-Streaming Response:

{
  "response": "Our return policy allows returns within 30 days of purchase...",
  "session_id": "session_xyz789",
  "usage": {
    "input_tokens": 45,
    "output_tokens": 128,
    "total_tokens": 173
  }
}

Streaming Responses (SSE)

By default, the Chat API streams responses using Server-Sent Events (SSE). This gives your users a real-time typing experience instead of waiting for the full response.

How SSE Works

When stream: true (the default), the response is delivered as a series of events:

data: {"type": "token", "content": "Our"}
data: {"type": "token", "content": " return"}
data: {"type": "token", "content": " policy"}
data: {"type": "token", "content": " allows"}
...
data: {"type": "done", "session_id": "session_xyz789", "usage": {"input_tokens": 45, "output_tokens": 128}}
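Client code only needs to strip the `data: ` prefix and decode the JSON that follows. A minimal Python sketch of that parsing, with field names taken from the events shown above:

```python
import json


def parse_sse_line(line: str):
    """Parse one SSE line into an event dict; return None for
    blank lines and anything that is not a data line."""
    if not line.startswith("data: "):
        return None
    return json.loads(line[len("data: "):])


def join_tokens(events):
    """Concatenate the content of token events into the full reply."""
    return "".join(e["content"] for e in events if e.get("type") == "token")
```
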

Consuming SSE in JavaScript

const response = await fetch('https://api.thinnest.ai/chat', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    agent_id: 'agent_abc123',
    message: 'What are your business hours?',
    session_id: 'session_xyz789',
  }),
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop(); // a chunk may end mid-line; keep the partial line for the next read

  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = JSON.parse(line.slice(6));
      if (data.type === 'token') {
        // Append to your UI
        console.log(data.content);
      } else if (data.type === 'done') {
        // Response complete
        console.log('Done:', data.usage);
      }
    }
  }
}

Consuming SSE in Python

import json

import httpx

with httpx.stream(
    "POST",
    "https://api.thinnest.ai/chat",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "agent_id": "agent_abc123",
        "message": "What are your business hours?",
        "stream": True,
    },
) as response:
    for line in response.iter_lines():
        if line.startswith("data: "):
            data = json.loads(line[6:])
            if data["type"] == "token":
                print(data["content"], end="", flush=True)
            elif data["type"] == "done":
                print()  # response complete

Session Management

Sessions maintain conversation history so your agent remembers previous messages.

How Sessions Work

  • New conversation: Omit session_id — the API creates a new session and returns its ID.
  • Continue conversation: Include the session_id from a previous response.
  • Session storage: Conversation history is stored in PostgreSQL and persists across requests.
  • Session expiry: Sessions remain active for 24 hours of inactivity by default.

Managing Sessions

# Start a new conversation (no session_id)
curl -X POST https://api.thinnest.ai/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "agent_abc123",
    "message": "Hi, I need help with my order"
  }'
# Response includes: "session_id": "session_new123"

# Continue the conversation
curl -X POST https://api.thinnest.ai/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "agent_abc123",
    "message": "The order number is #12345",
    "session_id": "session_new123"
  }'
# Agent remembers the previous context
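In application code, the same session bookkeeping can be wrapped in a small helper. This is a sketch, not part of any official SDK; `send` stands in for whatever HTTP call you make, so the session logic is visible without network code:

```python
class ChatSession:
    """Track the session_id returned by the Chat API so that
    follow-up messages continue the same conversation."""

    def __init__(self, agent_id, send):
        # `send` is any callable taking a payload dict and returning
        # the parsed JSON response body.
        self.agent_id = agent_id
        self.send = send
        self.session_id = None

    def chat(self, message):
        payload = {"agent_id": self.agent_id, "message": message}
        if self.session_id is not None:
            payload["session_id"] = self.session_id  # continue the conversation
        response = self.send(payload)
        self.session_id = response["session_id"]  # remember for the next turn
        return response
```

The first `chat()` call omits `session_id`, so the API creates a new session; every later call reuses the ID from the previous response.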

Session Metadata

You can pass metadata with each message to provide additional context:

{
  "agent_id": "agent_abc123",
  "message": "Check my account status",
  "session_id": "session_xyz789",
  "metadata": {
    "user_email": "customer@example.com",
    "plan": "enterprise",
    "source": "web"
  }
}

Your agent's tools can access this metadata to personalize responses.
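One way to keep the optional fields straight is a small payload builder. This is a sketch, not a library API; it simply mirrors the request body documented above:

```python
def build_chat_payload(agent_id, message, session_id=None, metadata=None):
    """Assemble a Chat API request body, including the optional
    session_id and metadata fields only when they are set."""
    payload = {"agent_id": agent_id, "message": message}
    if session_id is not None:
        payload["session_id"] = session_id
    if metadata:
        payload["metadata"] = metadata
    return payload
```
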

Rate Limiting

The Chat API enforces rate limits to ensure fair usage and platform stability.

Default Limits

Plan        Requests per Minute  Concurrent Connections
Free        20                   2
Pro         120                  10
Enterprise  Custom               Custom

Rate Limit Headers

Every response includes rate limit information:

X-RateLimit-Limit: 120
X-RateLimit-Remaining: 117
X-RateLimit-Reset: 1709654400
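These headers let a client throttle itself before hitting the limit. A sketch of the arithmetic, assuming `X-RateLimit-Reset` is a Unix timestamp in seconds as shown above:

```python
def seconds_until_reset(headers, now):
    """Return how long (in seconds) to wait before sending the next
    request: 0 while requests remain in the window, otherwise the
    time until the window resets."""
    remaining = int(headers["X-RateLimit-Remaining"])
    if remaining > 0:
        return 0
    reset_at = int(headers["X-RateLimit-Reset"])  # Unix epoch seconds
    return max(0, reset_at - now)
```
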

Handling Rate Limits

When you exceed the limit, you receive a 429 Too Many Requests response:

{
  "detail": "Rate limit exceeded. Please retry after 12 seconds.",
  "retry_after": 12
}

Implement exponential backoff in your client:

async function chatWithRetry(payload, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch('https://api.thinnest.ai/chat', {
      method: 'POST',
      headers: {
        'Authorization': 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(payload),
    });

    if (response.status === 429) {
      const retryAfter = Number(response.headers.get('Retry-After')) || 5; // header values are strings
      await new Promise(r => setTimeout(r, retryAfter * 1000));
      continue;
    }

    return response;
  }
  throw new Error('Max retries exceeded');
}
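The same backoff logic in Python, sketched with an injectable `send` callable (returning status code, headers, and body) and an injectable `sleep`, so the retry behavior can be tested without a live endpoint:

```python
import time


def chat_with_retry(payload, send, max_retries=3, sleep=time.sleep):
    """Retry on 429, honouring the Retry-After header with a 5-second
    fallback, up to max_retries attempts."""
    for _ in range(max_retries):
        status, headers, body = send(payload)
        if status == 429:
            retry_after = int(headers.get("Retry-After", 5))
            sleep(retry_after)
            continue
        return status, body
    raise RuntimeError("Max retries exceeded")
```
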

Error Handling

Status Code  Meaning                          Action
200          Success                          Process the response
400          Bad request (missing fields)     Check your request body
401          Invalid API key                  Verify your credentials
404          Agent not found                  Check the agent ID
429          Rate limited                     Wait and retry
500          Server error                     Retry with backoff
503          Service temporarily unavailable  Retry after a moment
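For client code, the table boils down to one question: is the error worth retrying? A minimal predicate based on the statuses above:

```python
# Per the table: rate limits and server-side failures are transient;
# client errors (400, 401, 404) will not succeed on retry.
RETRYABLE_STATUSES = {429, 500, 503}


def should_retry(status_code: int) -> bool:
    """Return True if the request is worth retrying with backoff."""
    return status_code in RETRYABLE_STATUSES
```
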
