
Dash — Self-Learning Database Agent

Dash is a self-learning PostgreSQL tool that turns your agent into a data analyst. It generates SQL from natural language, learns from every interaction, auto-corrects mistakes, and gets smarter over time.

Inspired by OpenAI's in-house data agent and built on the Agno Dash architecture.

How It Works

User asks a question
              │
              ▼
┌────────────────────────────┐
│  6-Layer Context Assembly  │  ← Schema + Annotations + Knowledge
│  ask_database()            │    + Docs + Learnings + Runtime
└─────────────┬──────────────┘
              ▼
┌────────────────────────────┐
│  SQL Generation            │  Agent writes SQL grounded in context
└─────────────┬──────────────┘
              ▼
┌────────────────────────────┐
│  Query Execution           │  ← Read-only enforcement, 30s timeout
│  run_query()               │
└─────────────┬──────────────┘
         ┌────┴────┐
         ▼         ▼
     Success     Failure
         │         │
         ▼         ▼
  Store as     Auto-diagnose → Fix → Retry
  Knowledge      └─ Store as Learning

Every successful query becomes Knowledge — a proven pattern for future use. Every corrected error becomes a Learning — a mistake the system won't repeat.

The 6 Layers of Context

When Dash assembles context for SQL generation, it draws from six distinct sources:

| Layer | What It Provides | Source |
|-------|------------------|--------|
| 1. Schema | Tables, columns, types, PKs, FKs, indexes, row counts | Auto-discovered from database |
| 2. Annotations | Per-table descriptions, per-column notes, metric definitions, business rules | Your configuration |
| 3. Knowledge | Proven query patterns that worked before (hybrid search) | Learned from successful queries |
| 4. Docs | External documentation, policies, MCP context | Your configuration |
| 5. Learnings | Error corrections — what failed and how it was fixed | Learned from query failures |
| 6. Runtime | Live stats, sample data, column distributions | On-demand introspection |

The more your agent uses Dash, the richer layers 3 and 5 become.
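Conceptually, the assembly step is a concatenation of whichever layers have content. A minimal sketch of that idea (the layer names come from the table above; the function and sample values are hypothetical):

```python
def assemble_context(schema: str, annotations: str, knowledge: list[str],
                     docs: str, learnings: list[str], runtime: str) -> str:
    """Combine the six context layers into one prompt block for SQL generation."""
    sections = [
        ("SCHEMA", schema),
        ("ANNOTATIONS", annotations),
        ("KNOWLEDGE", "\n".join(knowledge)),   # proven query patterns
        ("DOCS", docs),
        ("LEARNINGS", "\n".join(learnings)),   # past error corrections
        ("RUNTIME", runtime),
    ]
    # Skip empty layers so the prompt stays compact
    return "\n\n".join(f"## {name}\n{body}" for name, body in sections if body)

context = assemble_context(
    schema="orders(id, user_id, status, total_amount, created_at)",
    annotations="total_amount is in USD cents",
    knowledge=["Top customers: JOIN orders/users, divide amounts by 100.0"],
    docs="",
    learnings=["'premium' is not a valid plan value; use 'pro'"],
    runtime="",
)
```

Empty layers (here, Docs and Runtime) simply drop out of the assembled prompt.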

Setup

From the Dashboard

  1. Go to your agent's Tools section.
  2. Click Add Tool and select Dash (Self-Learning DB).
  3. Fill in the connection details:
    • Host — Your PostgreSQL server address
    • Database Name — The database to connect to
    • Username / Password — Database credentials
    • Schema — Default schema (usually public)
  4. Optionally configure business annotations (see below).
  5. Save the agent.

Via the API

curl -X POST https://api.thinnest.ai/agents \
  -H "Authorization: Bearer $THINNESTAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Data Analyst",
    "instructions": "You are a data analyst that answers business questions by querying the database.",
    "tools": [
      {
        "type": "dash_postgres",
        "config": {
          "host": "your-db-host.com",
          "port": 5432,
          "db_name": "analytics",
          "user": "readonly_user",
          "password": "your_password",
          "schema": "public",
          "read_only": true,
          "enable_learning": true,
          "max_retries": 2,
          "annotations": {
            "description": "E-commerce analytics database",
            "notes": "Fiscal year starts April 1. All monetary amounts are in USD cents.",
            "metrics": {
              "revenue": "SUM(amount) FROM orders WHERE status = 'completed'",
              "churn_rate": "Users who cancelled in period / Total users at period start"
            },
            "tables": {
              "orders": {
                "description": "Customer orders. One row per order.",
                "columns": {
                  "status": "Values: pending, completed, cancelled, refunded",
                  "total_amount": "In USD cents. Includes tax and shipping.",
                  "net_amount": "In USD cents. Pre-tax, pre-shipping."
                }
              },
              "users": {
                "description": "Registered users. Soft-deleted via deleted_at.",
                "columns": {
                  "plan": "Values: free, starter, pro, enterprise",
                  "mrr": "Monthly recurring revenue in USD cents"
                }
              }
            }
          }
        }
      }
    ]
  }'
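If you prefer Python over curl, the same agent-creation request can be sketched with the standard library (the endpoint, tool type, and config keys come from the example above; the helper functions and environment variable names are illustrative):

```python
import json
import os
import urllib.request


def build_payload() -> dict:
    """Assemble the agent-creation body shown in the curl example."""
    return {
        "name": "Data Analyst",
        "instructions": "You are a data analyst that answers business "
                        "questions by querying the database.",
        "tools": [{
            "type": "dash_postgres",
            "config": {
                "host": "your-db-host.com",
                "port": 5432,
                "db_name": "analytics",
                "user": "readonly_user",
                # Avoid hard-coding secrets; read them from the environment
                "password": os.environ.get("DB_PASSWORD", ""),
                "schema": "public",
                "read_only": True,
                "enable_learning": True,
                "max_retries": 2,
            },
        }],
    }


def create_agent(api_key: str) -> dict:
    """POST the payload to the agents endpoint and return the parsed response."""
    req = urllib.request.Request(
        "https://api.thinnest.ai/agents",
        data=json.dumps(build_payload()).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())
```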

Configuration

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| host | string | Required | PostgreSQL host |
| port | integer | 5432 | PostgreSQL port |
| db_name | string | Required | Database name |
| user | string | Required | Database username |
| password | string | Required | Database password |
| schema | string | "public" | Default schema |
| read_only | boolean | true | Only allow SELECT queries |
| enable_learning | boolean | true | Enable self-learning loop |
| max_retries | integer | 2 | Auto-correction retry attempts (0-5) |
| annotations | object | {} | Business annotations (see below) |
| docs_context | string | "" | External documentation context |

Business Annotations

Annotations are the secret to accurate SQL generation. They encode tribal knowledge that exists nowhere in the database schema:

{
  "description": "Company analytics database — contains all customer, order, and product data",

  "notes": "Fiscal year starts April 1. All timestamps are UTC. Soft deletes use deleted_at column.",

  "metrics": {
    "revenue": "SUM(total_amount) / 100.0 FROM orders WHERE status = 'completed'",
    "active_users": "COUNT(DISTINCT user_id) FROM sessions WHERE created_at > NOW() - INTERVAL '30 days'",
    "churn_rate": "cancelled_users / total_users_at_period_start * 100"
  },

  "tables": {
    "orders": {
      "description": "One row per customer order. Created when checkout completes.",
      "columns": {
        "status": "Enum: pending → processing → completed → refunded. Cancelled is a separate flow.",
        "total_amount": "Integer, USD cents. Includes tax + shipping. Use net_amount for pre-tax.",
        "created_at": "When the order was placed. Use this for revenue date, not updated_at."
      }
    },
    "users": {
      "description": "One row per registered user. Soft-deleted via deleted_at (not hard deleted).",
      "columns": {
        "plan": "Current subscription: free, starter, pro, enterprise. Historical plans in plan_history table.",
        "email": "Always lowercase. Unique constraint. Some test accounts use @example.com."
      }
    }
  }
}
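Before the model can use them, annotations like these presumably get flattened into plain lines it can read alongside the schema. A hypothetical rendering step (the function and output format are assumptions, not Dash's actual internals):

```python
def render_annotations(ann: dict) -> str:
    """Flatten an annotations object into prompt-ready lines (illustrative)."""
    lines = []
    if desc := ann.get("description"):
        lines.append(f"Database: {desc}")
    if notes := ann.get("notes"):
        lines.append(f"Notes: {notes}")
    for name, defn in ann.get("metrics", {}).items():
        lines.append(f"Metric {name}: {defn}")
    for table, info in ann.get("tables", {}).items():
        lines.append(f"Table {table}: {info.get('description', '')}")
        for col, note in info.get("columns", {}).items():
            lines.append(f"  {table}.{col}: {note}")
    return "\n".join(lines)

text = render_annotations({
    "description": "Company analytics database",
    "metrics": {"revenue": "SUM(total_amount) / 100.0 FROM orders"},
    "tables": {"users": {"description": "One row per registered user.",
                         "columns": {"plan": "free, starter, pro, enterprise"}}},
})
```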

Available Functions

Dash registers these functions that the agent can call:

| Function | Purpose |
|----------|---------|
| discover_schema() | Auto-discover tables, columns, types, constraints, indexes |
| ask_database(question) | Assemble 6-layer context for a natural language question |
| run_query(query, question) | Execute SQL with read-only enforcement and learning |
| record_correction(original, error, corrected, question) | Store an error correction as a Learning |
| get_table_stats(table) | Row counts, numeric stats, null rates, date ranges |
| get_sample_data(table, limit) | Preview actual data values |
| get_column_distribution(table, column, limit) | Value frequency analysis |
| suggest_visualization(result) | Recommend chart types for query results |
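To make the last entry concrete, a chart-selection heuristic of the kind suggest_visualization() implies might look like this (the signature, keyword list, and rules are illustrative assumptions, not Dash's actual logic):

```python
def _looks_temporal(name: str) -> bool:
    """Crude check for a time-like column name (assumption for this sketch)."""
    return any(k in name.lower() for k in ("date", "month", "time", "created"))


def suggest_visualization(columns: list[str], rows: list[tuple]) -> str:
    """Recommend a chart type from the shape of a query result (illustrative)."""
    if len(columns) == 2 and all(isinstance(r[1], (int, float)) for r in rows):
        # One label column plus one numeric column: line for time series, else bar
        return "line" if _looks_temporal(columns[0]) else "bar"
    if len(columns) == 1:
        return "single value" if len(rows) == 1 else "list"
    return "table"
```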

Dual Memory System

Dash maintains two types of persistent memory:

Knowledge (Proven Patterns)

When a query succeeds, Dash stores it as Knowledge — a validated pattern that future queries can reference. Over time, this builds a library of working SQL for your specific database.

Query: "What were our top customers last quarter?"
SQL: SELECT u.name, SUM(o.total_amount)/100.0 as revenue
     FROM orders o JOIN users u ON o.user_id = u.id
     WHERE o.status = 'completed' AND o.created_at >= '2026-01-01'
     GROUP BY u.name ORDER BY revenue DESC LIMIT 10
Result: 10 rows ✓ → Stored as Knowledge
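Layer 3's "hybrid search" typically mixes keyword and semantic matching. As a rough illustration of the keyword half only, stored patterns could be scored by term overlap with the new question (everything here is a simplification, not Dash's retrieval code):

```python
def keyword_score(question: str, stored_question: str) -> float:
    """Fraction of the new question's terms that appear in a stored pattern."""
    q = set(question.lower().split())
    s = set(stored_question.lower().split())
    return len(q & s) / len(q) if q else 0.0

knowledge = [
    "What were our top customers last quarter?",
    "How many active users do we have?",
]
question = "Who were our top customers this quarter?"
# Pick the stored pattern with the highest overlap score
best = max(knowledge, key=lambda k: keyword_score(question, k))
```

In practice this would be combined with embedding similarity, but the effect is the same: past questions about "top customers" surface when a similar question arrives.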

Learnings (Error Corrections)

When a query fails and is corrected, Dash stores the before/after pair as a Learning. The next time a similar question comes up, the agent sees the correction and avoids the same mistake.

Failed:  SELECT * FROM users WHERE plan = 'premium'
Error:   No rows — 'premium' is not a valid plan value
Fixed:   SELECT * FROM users WHERE plan = 'pro'
Result:  247 rows ✓ → Stored as Learning

Persistence

Knowledge and learnings are stored in a JSON file keyed by your database connection. They persist across agent restarts and sessions. The store is capped at 500 knowledge patterns and 200 corrections to prevent unbounded growth.
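A capped, JSON-backed store of this shape can be sketched with `collections.deque`, which silently drops the oldest entry once its cap is hit (the caps come from the paragraph above; the file layout and class are illustrative assumptions):

```python
import json
from collections import deque
from pathlib import Path

MAX_KNOWLEDGE, MAX_LEARNINGS = 500, 200  # caps from the docs


class DashStore:
    """Illustrative capped store, keyed by database connection."""

    def __init__(self, path: str, connection_key: str):
        self.path = Path(path)
        self.key = connection_key
        data = json.loads(self.path.read_text()) if self.path.exists() else {}
        entry = data.get(connection_key, {})
        # deque(maxlen=...) evicts the oldest item once the cap is reached
        self.knowledge = deque(entry.get("knowledge", []), maxlen=MAX_KNOWLEDGE)
        self.learnings = deque(entry.get("learnings", []), maxlen=MAX_LEARNINGS)

    def save(self) -> None:
        data = json.loads(self.path.read_text()) if self.path.exists() else {}
        data[self.key] = {
            "knowledge": list(self.knowledge),
            "learnings": list(self.learnings),
        }
        self.path.write_text(json.dumps(data, indent=2))
```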

Auto-Retry on Errors

When a query fails, Dash doesn't just return an error. It builds diagnostic context:

  1. Error classification — Identifies the error type (missing column, syntax error, type mismatch, timeout, etc.)
  2. Relevant past corrections — Searches learnings for similar failures that were previously fixed
  3. Fix suggestions — Recommends using discover_schema() or get_table_stats() to verify assumptions

The agent then rewrites the SQL and retries automatically, up to max_retries attempts. If the retry succeeds, the correction is stored as a Learning.
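The loop described above amounts to something like the following sketch, where `generate_sql`, `execute`, and `diagnose` stand in for the agent and Dash internals (all hypothetical names):

```python
class QueryError(Exception):
    """Raised by execute() when a query fails (stand-in for real DB errors)."""


def run_with_retries(question, generate_sql, execute, diagnose,
                     record_correction, max_retries=2):
    """Retry failed SQL with diagnostic context; store the fix as a Learning."""
    sql = generate_sql(question, hint=None)
    failed_sql, last_error = None, None
    for attempt in range(max_retries + 1):
        try:
            result = execute(sql)
        except QueryError as err:
            if attempt == max_retries:
                raise                                # out of retries
            failed_sql, last_error = sql, str(err)
            hint = diagnose(err)                     # error class + past fixes
            sql = generate_sql(question, hint=hint)  # agent rewrites the SQL
        else:
            if failed_sql is not None:
                # The retry worked: store the before/after pair as a Learning
                record_correction(failed_sql, last_error, sql, question)
            return result
```

Note that the correction is only recorded when a rewritten query actually succeeds, matching the behavior described above.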

Security

Read-Only Mode (Default)

By default, Dash blocks all data-modifying statements:

  • INSERT, UPDATE, DELETE
  • DROP, TRUNCATE, ALTER, CREATE
  • GRANT, REVOKE, COPY
  • Piggyback attacks like SELECT ...; DROP TABLE ...
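A guard of this kind can be approximated with a keyword check plus a multi-statement check. This sketch is deliberately conservative and is not Dash's actual enforcement; in production, pair it with a read-only database role:

```python
import re

# Statement keywords the guard rejects (from the list above)
FORBIDDEN = {"insert", "update", "delete", "drop", "truncate", "alter",
             "create", "grant", "revoke", "copy"}


def is_read_only(sql: str) -> bool:
    """Reject modifying statements and piggybacked second statements."""
    body = sql.strip().rstrip(";")
    if ";" in body:
        return False  # e.g. SELECT ...; DROP TABLE ...
    # Tokenize on [a-z_]+ so column names like created_at don't match "create"
    tokens = set(re.findall(r"[a-z_]+", body.lower()))
    starts_ok = body.lower().startswith(("select", "with", "explain"))
    return starts_ok and tokens.isdisjoint(FORBIDDEN)
```

The token check errs on the side of rejection (a SELECT that merely mentions a forbidden word in a string literal would be blocked), which is the right failure mode for a safety guard.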

Query Timeout

All queries have a 30-second timeout to prevent runaway operations.

Best Practices

  • Always use a read-only database user — Even with read-only mode enabled, defense-in-depth is important.
  • Use a replica — Point Dash at a read replica, not your primary database.
  • Limit schema access — Only expose the schemas and tables the agent needs.

Example Conversations

Simple Question

User: How many active users do we have?

Agent: [calls ask_database("How many active users do we have?")]
Agent: [calls run_query("SELECT COUNT(*) FROM users WHERE deleted_at IS NULL", "How many active users do we have?")]

Agent: You have 12,847 active users.

Question with Learning

User: What's our monthly revenue trend?

Agent: [calls ask_database("What's our monthly revenue trend?")]
  → Context includes annotation: "All amounts are in USD cents"
  → Context includes Knowledge: Similar past query used /100.0 conversion

Agent: [calls run_query with proper cents-to-dollars conversion]

Agent: Here's your monthly revenue trend:
| Month    | Revenue     |
|----------|-------------|
| Jan 2026 | $1,234,567  |
| Feb 2026 | $1,345,678  |
| Mar 2026 | $1,456,789  |

Visualization: Line Chart — Plot Revenue over Month

Auto-Correction

User: Show me orders by product category

Agent: [calls run_query("SELECT category, COUNT(*) FROM orders GROUP BY category")]
  → Error: column "category" does not exist

Dash returns: QUERY FAILED (attempt 1/3)
  Diagnostic: Column name may be misspelled. Use discover_schema() to verify.

Agent: [calls discover_schema()]
  → Sees: orders has product_id, products table has category column

Agent: [calls run_query("SELECT p.category, COUNT(*) FROM orders o JOIN products p ON o.product_id = p.id GROUP BY p.category")]
  → Success! Stores the correction as a Learning.

Agent: [calls record_correction(...)]

Integration with Agent Learning

Dash's self-learning operates independently but complements the platform's Agent Learning system:

  • Dash learnings are SQL-specific — query patterns and corrections stored locally per database
  • Agent learnings are conversation-level — feedback and corrections stored in the platform database

Both systems make your agent smarter over time, at different levels.
