Dash — Self-Learning Database Agent
Dash is a self-learning PostgreSQL tool that turns your agent into a data analyst. It generates SQL from natural language, learns from every interaction, auto-corrects mistakes, and gets smarter over time.
Inspired by OpenAI's in-house data agent and built on the Agno Dash architecture.
How It Works
User asks a question
│
▼
┌──────────────────────────┐
│ 6-Layer Context Assembly │ ← Schema + Annotations + Knowledge
│ ask_database() │ + Docs + Learnings + Runtime
└──────────┬───────────────┘
▼
┌──────────────────────────┐
│ SQL Generation │ Agent writes SQL grounded in context
└──────────┬───────────────┘
▼
┌──────────────────────────┐
│ Query Execution │ ← Read-only enforcement, 30s timeout
│ run_query() │
└──────────┬───────────────┘
┌────┴────┐
▼ ▼
Success Failure
│ │
▼ ▼
Store as Auto-diagnose → Fix → Retry
Knowledge         └─ Store as Learning

Every successful query becomes Knowledge — a proven pattern for future use. Every corrected error becomes a Learning — a mistake the system won't repeat.
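The loop above can be sketched in Python. Everything here is illustrative: `generate_sql`, `execute`, and the store shape are assumptions for the sketch, not Dash's actual internals.

```python
def answer(question, store, generate_sql, execute, max_retries=2):
    """Illustrative query loop: success feeds Knowledge, a corrected
    failure feeds Learnings. Not Dash's real implementation."""
    sql = generate_sql(question, store)
    failed_sql, last_error = None, None
    for _ in range(1 + max_retries):
        ok, result = execute(sql)
        if ok:
            if last_error is not None:
                # A corrected failure becomes a Learning the agent won't repeat.
                store["learnings"].append(
                    {"failed": failed_sql, "error": last_error, "fixed": sql})
            # Every successful query becomes Knowledge, a proven pattern.
            store["knowledge"].append({"question": question, "sql": sql})
            return result
        failed_sql, last_error = sql, result
        # Auto-diagnose: regenerate SQL with the error message in context.
        sql = generate_sql(question, store, error=result, failed_sql=sql)
    raise RuntimeError(f"query failed after {1 + max_retries} attempts: {last_error}")
```

Note that a Learning is only recorded once the corrected query actually succeeds, so the store never accumulates unverified fixes.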
The 6 Layers of Context
When Dash assembles context for SQL generation, it draws from six distinct sources:
| Layer | What It Provides | Source |
|---|---|---|
| 1. Schema | Tables, columns, types, PKs, FKs, indexes, row counts | Auto-discovered from database |
| 2. Annotations | Per-table descriptions, per-column notes, metric definitions, business rules | Your configuration |
| 3. Knowledge | Proven query patterns that worked before (hybrid search) | Learned from successful queries |
| 4. Docs | External documentation, policies, MCP context | Your configuration |
| 5. Learnings | Error corrections — what failed and how it was fixed | Learned from query failures |
| 6. Runtime | Live stats, sample data, column distributions | On-demand introspection |
The more your agent uses Dash, the richer layers 3 and 5 become.
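Conceptually, assembling the six layers amounts to bundling them into one prompt-ready structure. The sketch below is a minimal illustration: the class and function names, and the flat `store` dict, are assumptions, not Dash's API, and the hybrid search over Knowledge is elided to a simple truncation.

```python
from dataclasses import dataclass, field

@dataclass
class QueryContext:
    schema: dict = field(default_factory=dict)       # Layer 1: auto-discovered
    annotations: dict = field(default_factory=dict)  # Layer 2: your configuration
    knowledge: list = field(default_factory=list)    # Layer 3: proven patterns
    docs: str = ""                                   # Layer 4: external docs
    learnings: list = field(default_factory=list)    # Layer 5: error corrections
    runtime: dict = field(default_factory=dict)      # Layer 6: live stats

def assemble_context(question: str, store: dict) -> QueryContext:
    """Gather all six layers into one bundle for SQL generation."""
    return QueryContext(
        schema=store.get("schema", {}),
        annotations=store.get("annotations", {}),
        knowledge=store.get("knowledge", [])[:5],  # top matches; hybrid search elided
        docs=store.get("docs", ""),
        learnings=store.get("learnings", []),
        runtime=store.get("runtime", {}),
    )
```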
Setup
From the Dashboard
- Go to your agent's Tools section.
- Click Add Tool and select Dash (Self-Learning DB).
- Fill in the connection details:
- Host — Your PostgreSQL server address
- Database Name — The database to connect to
- Username / Password — Database credentials
- Schema — Default schema (usually public)
- Optionally configure business annotations (see below).
- Save the agent.
Via the API
curl -X POST https://api.thinnest.ai/agents \
-H "Authorization: Bearer $THINNESTAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Data Analyst",
"instructions": "You are a data analyst that answers business questions by querying the database.",
"tools": [
{
"type": "dash_postgres",
"config": {
"host": "your-db-host.com",
"port": 5432,
"db_name": "analytics",
"user": "readonly_user",
"password": "your_password",
"schema": "public",
"read_only": true,
"enable_learning": true,
"max_retries": 2,
"annotations": {
"description": "E-commerce analytics database",
"notes": "Fiscal year starts April 1. All monetary amounts are in USD cents.",
"metrics": {
"revenue": "SUM(amount) FROM orders WHERE status = 'completed'",
"churn_rate": "Users who cancelled in period / Total users at period start"
},
"tables": {
"orders": {
"description": "Customer orders. One row per order.",
"columns": {
"status": "Values: pending, completed, cancelled, refunded",
"total_amount": "In USD cents. Includes tax and shipping.",
"net_amount": "In USD cents. Pre-tax, pre-shipping."
}
},
"users": {
"description": "Registered users. Soft-deleted via deleted_at.",
"columns": {
"plan": "Values: free, starter, pro, enterprise",
"mrr": "Monthly recurring revenue in USD cents"
}
}
}
}
}
}
]
  }'

Configuration
| Field | Type | Default | Description |
|---|---|---|---|
| host | string | Required | PostgreSQL host |
| port | integer | 5432 | PostgreSQL port |
| db_name | string | Required | Database name |
| user | string | Required | Database username |
| password | string | Required | Database password |
| schema | string | "public" | Default schema |
| read_only | boolean | true | Only allow SELECT queries |
| enable_learning | boolean | true | Enable self-learning loop |
| max_retries | integer | 2 | Auto-correction retry attempts (0-5) |
| annotations | object | {} | Business annotations (see below) |
| docs_context | string | "" | External documentation context |
Business Annotations
Annotations are the secret to accurate SQL generation. They encode tribal knowledge that exists nowhere in the database schema:
{
"description": "Company analytics database — contains all customer, order, and product data",
"notes": "Fiscal year starts April 1. All timestamps are UTC. Soft deletes use deleted_at column.",
"metrics": {
"revenue": "SUM(total_amount) / 100.0 FROM orders WHERE status = 'completed'",
"active_users": "COUNT(DISTINCT user_id) FROM sessions WHERE created_at > NOW() - INTERVAL '30 days'",
"churn_rate": "cancelled_users / total_users_at_period_start * 100"
},
"tables": {
"orders": {
"description": "One row per customer order. Created when checkout completes.",
"columns": {
"status": "Enum: pending → processing → completed → refunded. Cancelled is a separate flow.",
"total_amount": "Integer, USD cents. Includes tax + shipping. Use net_amount for pre-tax.",
"created_at": "When the order was placed. Use this for revenue date, not updated_at."
}
},
"users": {
"description": "One row per registered user. Soft-deleted via deleted_at (not hard deleted).",
"columns": {
"plan": "Current subscription: free, starter, pro, enterprise. Historical plans in plan_history table.",
"email": "Always lowercase. Unique constraint. Some test accounts use @example.com."
}
}
}
}

Available Functions
Dash registers these functions that the agent can call:
| Function | Purpose |
|---|---|
| discover_schema() | Auto-discover tables, columns, types, constraints, indexes |
| ask_database(question) | Assemble 6-layer context for a natural language question |
| run_query(query, question) | Execute SQL with read-only enforcement and learning |
| record_correction(original, error, corrected, question) | Store an error correction as a Learning |
| get_table_stats(table) | Row counts, numeric stats, null rates, date ranges |
| get_sample_data(table, limit) | Preview actual data values |
| get_column_distribution(table, column, limit) | Value frequency analysis |
| suggest_visualization(result) | Recommend chart types for query results |
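As a rough illustration of the kind of heuristic a function like suggest_visualization() might apply, here is a naive mapping from result shape to chart type. The rules below are assumptions for the sketch, not Dash's actual logic.

```python
def suggest_visualization(columns, rows):
    """Naive chart heuristic: illustrative only, not Dash's real logic."""
    if len(columns) == 1 and len(rows) == 1:
        return "metric"   # a single number -> big-number card
    if any("date" in c or "month" in c for c in columns):
        return "line"     # a time dimension -> trend line
    if len(columns) == 2:
        return "bar"      # category + value -> bar chart
    return "table"        # anything wider -> plain table
```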
Dual Memory System
Dash maintains two types of persistent memory:
Knowledge (Proven Patterns)
When a query succeeds, Dash stores it as Knowledge — a validated pattern that future queries can reference. Over time, this builds a library of working SQL for your specific database.
Query: "What were our top customers last quarter?"
SQL: SELECT u.name, SUM(o.total_amount)/100.0 as revenue
FROM orders o JOIN users u ON o.user_id = u.id
WHERE o.status = 'completed' AND o.created_at >= '2026-01-01'
GROUP BY u.name ORDER BY revenue DESC LIMIT 10
Result: 10 rows ✓ → Stored as Knowledge

Learnings (Error Corrections)
When a query fails and is corrected, Dash stores the before/after pair as a Learning. The next time a similar question comes up, the agent sees the correction and avoids the same mistake.
Failed: SELECT * FROM users WHERE plan = 'premium'
Error: No rows — 'premium' is not a valid plan value
Fixed: SELECT * FROM users WHERE plan = 'pro'
Result: 247 rows ✓ → Stored as Learning

Persistence
Knowledge and learnings are stored in a JSON file keyed by your database connection. They persist across agent restarts and sessions. The store is capped at 500 knowledge patterns and 200 corrections to prevent unbounded growth.
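A minimal sketch of such a capped, connection-keyed JSON store follows. The caps mirror the numbers above, but the file layout, field names, and hashing scheme are assumptions for illustration.

```python
import json
import hashlib
from pathlib import Path

MAX_KNOWLEDGE, MAX_LEARNINGS = 500, 200  # caps described above

def store_path(host, db_name, base=Path("~/.dash")):
    # Key the file by connection so separate databases never share memory.
    key = hashlib.sha256(f"{host}/{db_name}".encode()).hexdigest()[:16]
    return Path(base).expanduser() / f"{key}.json"

def save(path, knowledge, learnings):
    path.parent.mkdir(parents=True, exist_ok=True)
    payload = {
        "knowledge": knowledge[-MAX_KNOWLEDGE:],   # keep newest, drop oldest
        "learnings": learnings[-MAX_LEARNINGS:],
    }
    path.write_text(json.dumps(payload))

def load(path):
    if not path.exists():
        return {"knowledge": [], "learnings": []}
    return json.loads(path.read_text())
```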
Auto-Retry on Errors
When a query fails, Dash doesn't just return an error. It builds diagnostic context:
- Error classification — Identifies the error type (missing column, syntax error, type mismatch, timeout, etc.)
- Relevant past corrections — Searches learnings for similar failures that were previously fixed
- Fix suggestions — Recommends using discover_schema() or get_table_stats() to verify assumptions
The agent then rewrites the SQL and retries automatically, up to max_retries attempts. If the retry succeeds, the correction is stored as a Learning.
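The diagnostic step can be sketched as follows. The error classes, regex rules, and hint wording are illustrative, not Dash's actual taxonomy.

```python
import re

def classify_error(message):
    """Map a PostgreSQL error message to a coarse class (illustrative taxonomy)."""
    rules = [
        (r"column .* does not exist", "missing_column"),
        (r"relation .* does not exist", "missing_table"),
        (r"syntax error", "syntax_error"),
        (r"operator does not exist|invalid input syntax", "type_mismatch"),
        (r"statement timeout|canceling statement", "timeout"),
    ]
    for pattern, label in rules:
        if re.search(pattern, message, re.IGNORECASE):
            return label
    return "unknown"

def build_diagnostics(message, learnings):
    """Combine the error class with similar past corrections and a fix hint."""
    label = classify_error(message)
    # Surface past corrections whose errors fall in the same class.
    similar = [l for l in learnings if classify_error(l["error"]) == label]
    hint = ("Use discover_schema() to verify names"
            if label in ("missing_column", "missing_table")
            else "Use get_table_stats() to verify assumptions")
    return {"class": label, "similar_corrections": similar, "hint": hint}
```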
Security
Read-Only Mode (Default)
By default, Dash blocks all data-modifying statements:
- INSERT, UPDATE, DELETE
- DROP, TRUNCATE, ALTER, CREATE
- GRANT, REVOKE, COPY
- Piggyback attacks like SELECT ...; DROP TABLE ...
Query Timeout
All queries have a 30-second timeout to prevent runaway operations.
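A simplified version of the read-only guard might allowlist statement prefixes and split on semicolons to catch piggybacked statements. This naive sketch is an assumption, not Dash's actual enforcement; robust validation would use a real SQL parser (e.g. a data-modifying CTE like WITH ... INSERT would slip past a prefix check).

```python
ALLOWED_PREFIXES = ("SELECT", "WITH", "EXPLAIN")

def check_read_only(sql):
    """Reject non-read statements, including piggybacked ones.
    Naive prefix check for illustration; real validation needs a parser."""
    # Split on semicolons so 'SELECT ...; DROP TABLE ...' is caught too.
    for stmt in filter(None, (s.strip() for s in sql.split(";"))):
        first = stmt.split(None, 1)[0].upper()
        if first not in ALLOWED_PREFIXES:
            raise PermissionError(f"blocked statement: {first}")
    return sql

# Server-side, a timeout like the 30s limit can back up the check, e.g.:
#   SET statement_timeout = '30s';
```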
Best Practices
- Always use a read-only database user — Even with read-only mode enabled, defense-in-depth is important.
- Use a replica — Point Dash at a read replica, not your primary database.
- Limit schema access — Only expose the schemas and tables the agent needs.
Example Conversations
Simple Question
User: How many active users do we have?
Agent: [calls ask_database("How many active users do we have?")]
Agent: [calls run_query("SELECT COUNT(*) FROM users WHERE deleted_at IS NULL", "How many active users do we have?")]
Agent: You have 12,847 active users.

Question with Learning
User: What's our monthly revenue trend?
Agent: [calls ask_database("What's our monthly revenue trend?")]
→ Context includes annotation: "All amounts are in USD cents"
→ Context includes Knowledge: Similar past query used /100.0 conversion
Agent: [calls run_query with proper cents-to-dollars conversion]
Agent: Here's your monthly revenue trend:
| Month | Revenue |
|----------|-------------|
| Jan 2026 | $1,234,567 |
| Feb 2026 | $1,345,678 |
| Mar 2026 | $1,456,789 |
Visualization: Line Chart — Plot Revenue over Month

Auto-Correction
User: Show me orders by product category
Agent: [calls run_query("SELECT category, COUNT(*) FROM orders GROUP BY category")]
→ Error: column "category" does not exist
Dash returns: QUERY FAILED (attempt 1/3)
Diagnostic: Column name may be misspelled. Use discover_schema() to verify.
Agent: [calls discover_schema()]
→ Sees: orders has product_id, products table has category column
Agent: [calls run_query("SELECT p.category, COUNT(*) FROM orders o JOIN products p ON o.product_id = p.id GROUP BY p.category")]
→ Success! Stores the correction as a Learning.
Agent: [calls record_correction(...)]

Integration with Agent Learning
Dash's self-learning operates independently but complements the platform's Agent Learning system:
- Dash learnings are SQL-specific — query patterns and corrections stored locally per database
- Agent learnings are conversation-level — feedback and corrections stored in the platform database
Both systems make your agent smarter over time, at different levels.
Next Steps
- Built-in Tools — Browse all available tools
- Agent Learning — Learn about the platform-wide learning system
- Knowledge Sources — Add documents and data to your agent's knowledge base