Continuous Improvement

Self-Learning Loop

Agents that get smarter with every conversation. User feedback automatically improves responses over time.

Agent Gets Smarter: 💬 Conversations → 👍 Feedback → 🧠 Training → ✨ Improved → (repeat)

How It Works

Every interaction is an opportunity to learn. Here's the continuous improvement cycle:

💬

1. Conversations Happen

Users chat with your agent. Every message, response, and interaction is captured and stored with full context.

// Each conversation is logged
{
  "session_id": "user_abc123",
  "messages": [
    { "role": "user", "content": "How do I reset my password?" },
    { "role": "agent", "content": "Click Settings > Security > Reset Password..." }
  ]
}
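
A minimal sketch of how such a record could be modeled in application code, assuming only the field names shown in the logged example above. The `Message` and `Conversation` classes and the `to_log_record` helper are hypothetical names chosen for illustration; they are not part of the platform's API.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Message:
    role: str      # "user" or "agent", as in the logged example
    content: str


@dataclass
class Conversation:
    session_id: str
    messages: list[Message] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        """Append one turn to the conversation."""
        self.messages.append(Message(role, content))

    def to_log_record(self) -> str:
        """Serialize to the JSON shape shown in the logged example."""
        return json.dumps(asdict(self))


conv = Conversation(session_id="user_abc123")
conv.add("user", "How do I reset my password?")
conv.add("agent", "Click Settings > Security > Reset Password...")
record = conv.to_log_record()
```

Keeping the session ID and the ordered message list together means a later rating on any single message can still be traced back to the turns that preceded it.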

👍

2. Users Give Feedback

Simple thumbs up/down buttons let users rate responses. This creates labeled training data automatically.

🧠

3. Training Data Generated

Positive feedback becomes training examples. Negative feedback flags responses for review and improvement.

// Automatically generated training pair
{
  "prompt": "User asks: How do I reset my password?",
  "completion": "To reset your password, go to Settings → Security...",
  "rating": "positive",
  "weight": 1.0
}
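
One way the pairing could work, sketched under assumptions: the closest user message before a rated agent reply becomes the prompt, and the reply itself becomes the completion. The function name `build_training_pair` is hypothetical, and the "User asks:" prefix and fixed weight of 1.0 simply mirror the example above; the platform's actual generation logic may differ.

```python
from typing import Optional


def build_training_pair(messages: list[dict], reply_index: int,
                        rating: str) -> Optional[dict]:
    """Pair a rated agent reply with the user message that preceded it.

    Returns None when the indexed message is not an agent reply or when
    no earlier user message exists.
    """
    if not 0 <= reply_index < len(messages):
        return None
    reply = messages[reply_index]
    if reply["role"] != "agent":
        return None
    # Walk backwards to find the closest preceding user message.
    for earlier in reversed(messages[:reply_index]):
        if earlier["role"] == "user":
            return {
                "prompt": f"User asks: {earlier['content']}",
                "completion": reply["content"],
                "rating": rating,
                "weight": 1.0,  # assumed constant; real weighting may differ
            }
    return None
```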
✨

4. Agent Improves

Training data is used to fine-tune the model or update the agent's knowledge base. After training, the agent's responses become more specific and complete, as in this comparison:

Before Training

"You can reset your password in settings."

After Training

"Go to Settings → Security → Reset Password. You'll receive an email with a reset link valid for 24 hours."

Learning Metrics

Track how your agent improves over time:

💬 Conversations: 12.4K
👍 Positive Rate: 89%
📈 Improvement: +23%
🎯 Training Examples: 847

Feedback Over Time

[Chart: positive and negative feedback per week, Week 1 through Week 6]

What Gets Learned

Signal | What It Teaches | Example
👍 Thumbs Up | This response pattern works well | Detailed step-by-step answers preferred
👎 Thumbs Down | This response needs improvement | Vague answers get flagged
Follow-up Questions | Response was incomplete | "What do you mean by that?"
Conversation Success | Goal was achieved | User completed booking
Session Length | Engagement quality | Longer = more helpful

Training Pipeline

┌──────────────────────────────────────────────────────────────────────────────────┐
│                          SELF-LEARNING PIPELINE                                  │
└──────────────────────────────────────────────────────────────────────────────────┘

  Live Conversations                    Feedback Collection
  ──────────────────                    ───────────────────
         │                                      │
         ▼                                      ▼
  ┌─────────────┐                      ┌─────────────────┐
  │  Messages   │                      │  👍 / 👎 Ratings │
  │  Logged     │                      │  Per Message    │
  └──────┬──────┘                      └────────┬────────┘
         │                                      │
         └──────────────┬───────────────────────┘
                        │
                        ▼
         ┌──────────────────────────────┐
         │     TRAINING DATA STORE      │
         │                              │
         │  • Positive examples (👍)    │
         │  • Negative examples (👎)    │
         │  • Context & metadata        │
         └──────────────┬───────────────┘
                        │
                        ▼
         ┌──────────────────────────────┐
         │      TRAINING PROCESSOR      │
         │                              │
         │  1. Filter quality examples  │
         │  2. Format for fine-tuning   │
         │  3. Balance positive/negative│
         └──────────────┬───────────────┘
                        │
          ┌─────────────┴─────────────┐
          │                           │
          ▼                           ▼
  ┌───────────────┐           ┌───────────────┐
  │  FINE-TUNE    │           │  KNOWLEDGE    │
  │  MODEL        │           │  BASE UPDATE  │
  │               │           │               │
  │  Custom model │           │  Add Q&A pairs│
  │  trained on   │           │  to semantic  │
  │  your data    │           │  search       │
  └───────┬───────┘           └───────┬───────┘
          │                           │
          └─────────────┬─────────────┘
                        │
                        ▼
         ┌──────────────────────────────┐
         │       IMPROVED AGENT         │
         │                              │
         │  Better responses, learned   │
         │  from real user feedback     │
         └──────────────────────────────┘
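
The three Training Processor steps in the diagram can be sketched roughly as follows. This is an illustration under stated assumptions, not the platform's implementation: the function name `process_examples`, the minimum completion length, and the cap on negative examples relative to positive ones are all hypothetical choices.

```python
import random


def process_examples(examples: list[dict], min_completion_chars: int = 20,
                     max_negative_ratio: float = 0.25,
                     seed: int = 0) -> list[dict]:
    """Filter, format, and balance rated examples before training."""
    # 1. Filter: drop empty prompts and very short completions
    #    (the length threshold is an assumed quality heuristic).
    kept = [
        e for e in examples
        if e["prompt"].strip()
        and len(e["completion"].strip()) >= min_completion_chars
    ]

    # 2. Format: keep only the fields a fine-tuning job would need.
    formatted = [
        {
            "prompt": e["prompt"].strip(),
            "completion": e["completion"].strip(),
            "rating": e["rating"],
        }
        for e in kept
    ]

    # 3. Balance: cap negatives at a fraction of the positive count,
    #    sampling with a fixed seed so runs are repeatable.
    positives = [e for e in formatted if e["rating"] == "positive"]
    negatives = [e for e in formatted if e["rating"] == "negative"]
    cap = int(len(positives) * max_negative_ratio)
    if len(negatives) > cap:
        negatives = random.Random(seed).sample(negatives, cap)
    return positives + negatives
```

The seeded sampler is a deliberate choice: two runs over the same feedback produce the same training set, which makes a change in agent quality easier to attribute to new data rather than to sampling noise.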

API Reference

POST /api/feedback/:messageId

Submit feedback for a specific message.

// Request
{
  "rating": "positive",  // or "negative"
  "comment": "Very helpful!"  // optional
}

// Response
{
  "success": true,
  "training_queued": true
}
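
A hedged client-side sketch of this call using only the Python standard library. The base URL `https://example.com`, the message ID `msg_123`, and the absence of an authentication header are assumptions for illustration; the endpoint path and body fields come from the request shown above.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def build_feedback_request(base_url: str, message_id: str, rating: str,
                           comment: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /api/feedback/:messageId request."""
    if rating not in ("positive", "negative"):
        raise ValueError("rating must be 'positive' or 'negative'")
    body = {"rating": rating}
    if comment is not None:
        body["comment"] = comment  # optional field
    # Percent-encode the ID so it is safe to place in the URL path.
    safe_id = urllib.parse.quote(message_id, safe="")
    return urllib.request.Request(
        url=f"{base_url}/api/feedback/{safe_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_feedback_request("https://example.com", "msg_123",
                             "positive", "Very helpful!")
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
```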

GET /api/analytics/:agentId/training-data

Export training data generated from feedback.

// Response
{
  "total_examples": 847,
  "positive": 752,
  "negative": 95,
  "data": [
    {
      "prompt": "How do I cancel my subscription?",
      "completion": "Go to Account Settings...",
      "rating": "positive"
    }
  ]
}
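
A short sketch of what a consumer might do with this export, assuming the response shape shown above. The helper names `summarize_export` and `to_jsonl` are hypothetical, and JSON Lines is only one plausible target format for a fine-tuning job.

```python
import json


def summarize_export(export: dict) -> dict:
    """Check the counts and compute the share of positive examples."""
    total = export["total_examples"]
    positive = export["positive"]
    negative = export["negative"]
    if positive + negative != total:
        raise ValueError("positive + negative does not equal total_examples")
    rate = positive / total if total else 0.0
    return {"total": total, "positive_rate": rate}


def to_jsonl(export: dict) -> str:
    """Render the exported examples as one JSON object per line."""
    return "\n".join(json.dumps(row) for row in export["data"])


export = {
    "total_examples": 847,
    "positive": 752,
    "negative": 95,
    "data": [
        {
            "prompt": "How do I cancel my subscription?",
            "completion": "Go to Account Settings...",
            "rating": "positive",
        }
    ],
}
summary = summarize_export(export)
```

With the sample counts, 752 of 847 examples are positive, which rounds to 89%.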

POST /api/agents/:agentId/train

Trigger a training run with collected feedback data.

// Request
{
  "min_examples": 100,
  "include_negative": false
}

// Response
{
  "training_id": "train_xyz789",
  "status": "queued",
  "examples_count": 752
}
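
A sketch of how a caller might decide whether a run is worth triggering. It assumes that `min_examples` is a threshold the available examples must reach and that `include_negative` controls whether negatively rated examples count toward it; neither semantic is confirmed by this page. The function name `plan_training_run` is hypothetical.

```python
def plan_training_run(positive: int, negative: int, min_examples: int = 100,
                      include_negative: bool = False) -> dict:
    """Build the request body and estimate how many examples a run would use."""
    available = positive + (negative if include_negative else 0)
    return {
        "request": {
            "min_examples": min_examples,
            "include_negative": include_negative,
        },
        "examples_count": available,
        # Assumed rule: a run only makes sense once the threshold is met.
        "ready": available >= min_examples,
    }


plan = plan_training_run(positive=752, negative=95)
```

Under these assumptions, 752 positive examples with `include_negative` set to false yields an `examples_count` of 752, matching the sample response above.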

Zero Manual Work: The learning loop is automatic from end to end. Users give feedback naturally, ratings become training data without manual labeling, and your agent improves continuously.

Start Learning Today

Create an agent and watch it get smarter with every conversation.