Self-Learning Loop
Agents that get smarter with every conversation. User feedback automatically improves responses over time.
How It Works
Every interaction is an opportunity to learn. Here's the continuous improvement cycle:
1. Conversations Happen
Users chat with your agent. Every message, response, and interaction is captured and stored with full context.
// Each conversation is logged
{
  "session_id": "user_abc123",
  "messages": [
    { "role": "user", "content": "How do I reset my password?" },
    { "role": "agent", "content": "Click Settings > Security > Reset Password..." }
  ]
}
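The logging step above can be sketched as a minimal in-memory session store. This is illustrative only: `SessionLog` and its methods are hypothetical names, not part of any real API.

```python
from dataclasses import dataclass, field

@dataclass
class SessionLog:
    """Minimal in-memory conversation log, keyed by session ID."""
    session_id: str
    messages: list = field(default_factory=list)

    def add(self, role: str, content: str) -> dict:
        """Capture one turn with its role so later steps keep full context."""
        entry = {"role": role, "content": content}
        self.messages.append(entry)
        return entry

# Mirrors the logged conversation shown above
session = SessionLog("user_abc123")
session.add("user", "How do I reset my password?")
session.add("agent", "Click Settings > Security > Reset Password...")
```

A production store would persist each turn (database, event log) rather than hold it in memory, but the shape of the captured data is the same.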
2. Users Give Feedback
Simple thumbs up/down buttons let users rate responses. This creates labeled training data automatically.
3. Training Data Generated
Positive feedback becomes training examples. Negative feedback flags responses for review and improvement.
// Automatically generated training pair
{
  "prompt": "User asks: How do I reset my password?",
  "completion": "To reset your password, go to Settings → Security...",
  "rating": "positive",
  "weight": 1.0
}
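A sketch of how a rated exchange might become the training pair above. The field names mirror the example; `make_training_pair` and the zero-weight handling of negatives are assumptions, not a documented algorithm.

```python
def make_training_pair(user_msg: str, agent_msg: str, rating: str) -> dict:
    """Turn one rated exchange into a fine-tuning example.

    Assumption: positive examples get full weight, while negative ones
    are kept at zero weight so they are reviewed rather than trained on.
    """
    return {
        "prompt": f"User asks: {user_msg}",
        "completion": agent_msg,
        "rating": rating,
        "weight": 1.0 if rating == "positive" else 0.0,
    }

pair = make_training_pair(
    "How do I reset my password?",
    "To reset your password, go to Settings → Security...",
    "positive",
)
```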
4. Agent Improves
Training data is used to fine-tune the model or update the agent's knowledge base. The agent gives better responses.
"You can reset your password in settings."
"Go to Settings β Security β Reset Password. You'll receive an email with a reset link valid for 24 hours."
Learning Metrics
Track how your agent improves over time:
Feedback Over Time
What Gets Learned
| Signal | What It Teaches | Example |
|---|---|---|
| 👍 Thumbs Up | This response pattern works well | Detailed step-by-step answers preferred |
| 👎 Thumbs Down | This response needs improvement | Vague answers get flagged |
| Follow-up Questions | Response was incomplete | "What do you mean by that?" |
| Conversation Success | Goal was achieved | User completed booking |
| Session Length | Engagement quality | Longer = more helpful |
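One simple way to aggregate the 👍/👎 signals from the table into a trackable metric is a per-period thumbs-up rate. This is an illustrative computation, not a documented formula:

```python
def thumbs_up_rate(ratings: list) -> float:
    """Fraction of rated messages marked positive (0.0 when unrated)."""
    if not ratings:
        return 0.0
    return ratings.count("positive") / len(ratings)

# Hypothetical weekly feedback buckets
weekly = {
    "week_1": ["positive", "negative", "positive", "positive"],
    "week_2": ["positive", "positive", "positive", "negative", "positive"],
}
trend = {week: thumbs_up_rate(r) for week, r in weekly.items()}
```

A rising trend suggests the loop is working; a flat or falling one points at the flagged negative examples for review.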
Training Pipeline
┌─────────────────────────────────────────────────────────────┐
│                   SELF-LEARNING PIPELINE                    │
└─────────────────────────────────────────────────────────────┘

  Live Conversations           Feedback Collection
          │                            │
          ▼                            ▼
  ┌───────────────┐           ┌─────────────────┐
  │   Messages    │           │ 👍 / 👎 Ratings  │
  │    Logged     │           │   Per Message   │
  └───────┬───────┘           └────────┬────────┘
          │                            │
          └──────────────┬─────────────┘
                         │
                         ▼
         ┌───────────────────────────────┐
         │      TRAINING DATA STORE      │
         │                               │
         │  • Positive examples (👍)     │
         │  • Negative examples (👎)     │
         │  • Context & metadata         │
         └───────────────┬───────────────┘
                         │
                         ▼
         ┌───────────────────────────────┐
         │      TRAINING PROCESSOR       │
         │                               │
         │  1. Filter quality examples   │
         │  2. Format for fine-tuning    │
         │  3. Balance positive/negative │
         └───────────────┬───────────────┘
                         │
            ┌────────────┴────────────┐
            │                         │
            ▼                         ▼
   ┌────────────────┐        ┌────────────────┐
   │   FINE-TUNE    │        │   KNOWLEDGE    │
   │     MODEL      │        │  BASE UPDATE   │
   │                │        │                │
   │  Custom model  │        │  Add Q&A pairs │
   │  trained on    │        │  to semantic   │
   │  your data     │        │  search        │
   └───────┬────────┘        └───────┬────────┘
           │                         │
           └────────────┬────────────┘
                        │
                        ▼
         ┌───────────────────────────────┐
         │        IMPROVED AGENT         │
         │                               │
         │  Better responses, learned    │
         │  from real user feedback      │
         └───────────────────────────────┘
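The TRAINING PROCESSOR stage above can be sketched in a few lines. The quality filter and the negative-example cap are assumptions chosen to illustrate the three listed steps, not the product's actual thresholds:

```python
def process(examples: list, max_neg_ratio: float = 0.2) -> list:
    """Filter, balance, and format feedback examples for fine-tuning.

    1. Filter: drop examples with an empty prompt or completion.
    2. Balance: cap negatives at `max_neg_ratio` of the positive count.
    3. Format: keep only the fields a fine-tuning job needs.
    """
    quality = [e for e in examples if e.get("prompt") and e.get("completion")]
    pos = [e for e in quality if e["rating"] == "positive"]
    neg = [e for e in quality if e["rating"] == "negative"]
    neg = neg[: int(len(pos) * max_neg_ratio)]
    return [
        {"prompt": e["prompt"], "completion": e["completion"], "rating": e["rating"]}
        for e in pos + neg
    ]

# Hypothetical batch: 5 positives, 3 negatives, 1 unusable example
examples = (
    [{"prompt": f"q{i}", "completion": "answer", "rating": "positive"} for i in range(5)]
    + [{"prompt": f"n{i}", "completion": "answer", "rating": "negative"} for i in range(3)]
    + [{"prompt": "bad", "completion": "", "rating": "positive"}]
)
training_set = process(examples)
```

With the 0.2 cap, the 3 negatives are trimmed to 1 so they inform review without dominating training.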
API Reference
Submit feedback for a specific message.
// Request
{
  "rating": "positive",       // or "negative"
  "comment": "Very helpful!"  // optional
}
// Response
{
  "success": true,
  "training_queued": true
}
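Client-side, the request body above can be assembled and validated before sending. The endpoint URL is not shown in this reference, so this sketch stops at the payload; `build_feedback` is a hypothetical helper:

```python
def build_feedback(rating: str, comment: str = None) -> dict:
    """Assemble the feedback request body, validating the rating value."""
    if rating not in ("positive", "negative"):
        raise ValueError("rating must be 'positive' or 'negative'")
    body = {"rating": rating}
    if comment:  # the comment field is optional
        body["comment"] = comment
    return body

payload = build_feedback("positive", "Very helpful!")
```

Validating before the request keeps malformed ratings from ever reaching the training queue.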
Export training data generated from feedback.
// Response
{
  "total_examples": 847,
  "positive": 752,
  "negative": 95,
  "data": [
    {
      "prompt": "How do I cancel my subscription?",
      "completion": "Go to Account Settings...",
      "rating": "positive"
    }
  ]
}
Trigger a training run with collected feedback data.
// Request
{
  "min_examples": 100,
  "include_negative": false
}
// Response
{
  "training_id": "train_xyz789",
  "status": "queued",
  "examples_count": 752
}
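Before triggering a run, a client might check the export counts against its own `min_examples` threshold. A minimal sketch using the response fields shown above (`ready_to_train` is a hypothetical helper, not part of the API):

```python
def ready_to_train(export: dict, min_examples: int, include_negative: bool) -> bool:
    """Decide whether enough feedback has accumulated to queue a run.

    When negatives are excluded, only the positive count is usable.
    """
    usable = export["total_examples"] if include_negative else export["positive"]
    return usable >= min_examples

# Counts from the export response above
export_stats = {"total_examples": 847, "positive": 752, "negative": 95}
ok = ready_to_train(export_stats, min_examples=100, include_negative=False)
```

With 752 positive examples against a threshold of 100, the check passes, matching the `examples_count` of 752 in the training response.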
Zero Manual Work: The entire learning loop is automatic. Users give feedback naturally, and your agent improves continuously without any manual training or data labeling.
Start Learning Today
Create an agent and watch it get smarter with every conversation.