# Quick Start
This guide will help you get started with dataknobs-bots quickly by building your first AI chatbot.
## Installation
Install dataknobs-bots using pip:
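Assuming the package is published on PyPI under the same name as the import:

```bash
pip install dataknobs-bots
```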
For this quickstart, we'll use Ollama for local LLM inference:
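Install Ollama from its website, then pull the model used throughout this guide (the exact model tag is taken from the configuration examples below):

```bash
ollama pull gemma3:1b
```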
## Your First Chatbot
Let's create a simple chatbot with memory:
```python
import asyncio
from dataknobs_bots import DynaBot, BotContext


async def main():
    # Step 1: Define configuration
    config = {
        "llm": {
            "provider": "ollama",
            "model": "gemma3:1b",
            "temperature": 0.7,
            "max_tokens": 1000
        },
        "conversation_storage": {
            "backend": "memory"
        },
        "memory": {
            "type": "buffer",
            "max_messages": 10
        }
    }

    # Step 2: Create bot from configuration
    bot = await DynaBot.from_config(config)

    # Step 3: Create conversation context
    context = BotContext(
        conversation_id="quickstart-001",
        client_id="demo-client",
        user_id="user-123"
    )

    # Step 4: Start chatting
    messages = [
        "Hello! What can you help me with?",
        "My name is Alice. Can you remember it?",
        "What's my name?"
    ]
    for message in messages:
        print(f"User: {message}")
        response = await bot.chat(message, context)
        print(f"Bot: {response}\n")


if __name__ == "__main__":
    asyncio.run(main())
```
Save this as quickstart.py and run it:
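For example:

```bash
python quickstart.py
```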
### What's Happening?
1. **Configuration** - We define the bot's behavior using a dictionary:
    - `llm`: Specifies the LLM provider (Ollama) and model (gemma3:1b)
    - `conversation_storage`: Where conversations are stored (memory)
    - `memory`: How the bot remembers context (buffer with 10 messages)
2. **Bot Creation** - `DynaBot.from_config(config)` creates a bot from the configuration
3. **Conversation Context** - `BotContext` encapsulates the conversation:
    - `conversation_id`: Unique identifier for this conversation
    - `client_id`: Tenant/application identifier
    - `user_id`: User identifier
4. **Chatting** - `bot.chat(message, context)` sends a message and gets a response
## Streaming Responses
For a better user experience, you can stream responses token-by-token:
```python
import asyncio
from dataknobs_bots import DynaBot, BotContext


async def main():
    config = {
        "llm": {
            "provider": "ollama",
            "model": "gemma3:1b"
        },
        "conversation_storage": {
            "backend": "memory"
        }
    }

    bot = await DynaBot.from_config(config)
    context = BotContext(
        conversation_id="streaming-demo",
        client_id="demo-client"
    )

    # Stream the response - each chunk is an LLMStreamResponse
    print("Bot: ", end="", flush=True)
    async for chunk in bot.stream_chat("Tell me a short story", context):
        print(chunk.delta, end="", flush=True)
    print()  # Newline after streaming


if __name__ == "__main__":
    asyncio.run(main())
```
### Streaming Benefits
- Better UX - Users see responses immediately as they're generated
- Lower Perceived Latency - First tokens appear right away
- Interruptible - Users can stop generation if needed
### Streaming vs Non-Streaming
| Method | Returns | Best For |
|---|---|---|
| `chat()` | Complete `str` | Simple integrations, batch processing |
| `stream_chat()` | `AsyncGenerator[LLMStreamResponse]` | Interactive UIs, real-time display |
## Per-Request Config Overrides
Override LLM settings per-request without creating a new bot. Useful for A/B testing, dynamic model selection, and cost optimization.
```python
import asyncio
from dataknobs_bots import DynaBot, BotContext


async def main():
    config = {
        "llm": {
            "provider": "ollama",
            "model": "gemma3:1b",  # Default model
            "temperature": 0.7
        },
        "conversation_storage": {"backend": "memory"}
    }

    bot = await DynaBot.from_config(config)
    context = BotContext(conversation_id="override-demo", client_id="demo")

    # Use default settings
    response = await bot.chat("Hello!", context)

    # Override model for a specific request
    response = await bot.chat(
        "Explain quantum computing",
        context,
        llm_config_overrides={
            "model": "llama3.1:8b",  # Use a more capable model
            "temperature": 0.3,
            "max_tokens": 2000
        }
    )

    # Works with streaming too
    async for chunk in bot.stream_chat(
        "Write a creative poem",
        context,
        llm_config_overrides={"temperature": 0.9}
    ):
        print(chunk.delta, end="", flush=True)


if __name__ == "__main__":
    asyncio.run(main())
```
### Supported Override Fields
| Field | Description |
|---|---|
| `model` | Model identifier (e.g., "gpt-4-turbo") |
| `temperature` | Sampling temperature (0.0-2.0) |
| `max_tokens` | Maximum tokens in response |
| `top_p` | Nucleus sampling threshold |
| `stop_sequences` | Stop generation sequences |
| `seed` | Random seed for reproducibility |
Learn more about config overrides →
## Adding a Knowledge Base (RAG)
Let's enhance our bot with RAG (Retrieval Augmented Generation):
```python
import asyncio
from pathlib import Path

from dataknobs_bots import DynaBot, BotContext


async def main():
    # Create a simple knowledge base
    docs_dir = Path("./my_docs")
    docs_dir.mkdir(exist_ok=True)

    # Create a sample document
    (docs_dir / "product_info.md").write_text("""
# Product Information

## Features
- Fast processing
- Easy to use
- Scalable architecture

## Pricing
- Basic: $10/month
- Pro: $50/month
- Enterprise: Contact sales
""")

    # Configuration with knowledge base
    config = {
        "llm": {
            "provider": "ollama",
            "model": "gemma3:1b"
        },
        "conversation_storage": {
            "backend": "memory"
        },
        "knowledge_base": {
            "enabled": True,
            "documents_path": str(docs_dir),
            "vector_store": {
                "backend": "faiss",
                "dimension": 384
            },
            "embedding_provider": "ollama",
            "embedding_model": "nomic-embed-text"
        }
    }

    # Create bot
    bot = await DynaBot.from_config(config)

    # Create context
    context = BotContext(
        conversation_id="rag-quickstart",
        client_id="demo-client"
    )

    # Ask questions about the documents
    questions = [
        "What are the product features?",
        "How much does the Pro plan cost?"
    ]
    for question in questions:
        print(f"User: {question}")
        response = await bot.chat(question, context)
        print(f"Bot: {response}\n")


if __name__ == "__main__":
    asyncio.run(main())
```
Before running, pull the embedding model:
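The embedding model tag matches the `embedding_model` value in the configuration above:

```bash
ollama pull nomic-embed-text
```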
### What Changed?
We added a `knowledge_base` section to the configuration:

- `enabled: True` - Enables RAG
- `documents_path` - Directory containing markdown documents
- `vector_store` - Vector database configuration (FAISS)
- `embedding_provider` / `embedding_model` - Embedding model configuration
The bot now:

1. Loads and chunks markdown documents
2. Creates embeddings for each chunk
3. Stores embeddings in FAISS
4. Retrieves relevant chunks for each question
5. Injects retrieved context into prompts
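The retrieval loop above can be illustrated with a toy, dependency-free sketch. This is *not* the library's implementation: it substitutes word-count vectors for real embeddings and a linear scan for FAISS, purely to show the chunk-embed-retrieve-prompt flow:

```python
# Toy illustration of the RAG loop: chunk, "embed" (word counts), retrieve, prompt.
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: bag-of-words counts
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


# Stand-in for document loading and chunking
chunks = [
    "Features: fast processing, easy to use, scalable architecture",
    "Pricing: Basic $10/month, Pro $50/month, Enterprise contact sales",
]
# Stand-in for the vector store: (chunk, vector) pairs
index = [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(question: str) -> str:
    # Return the chunk most similar to the question
    q = embed(question)
    return max(index, key=lambda item: cosine(q, item[1]))[0]


# Retrieved context is injected into the prompt before the LLM call
prompt = f"Context: {retrieve('How much does the Pro plan cost?')}\nQuestion: ..."
```

A production setup replaces `embed` with the configured embedding model and `index` with FAISS, but the control flow is the same.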
## Creating a Tool-Using Agent
Let's create an agent that can use tools:
```python
import asyncio
from typing import Any, Dict

from dataknobs_bots import DynaBot, BotContext
from dataknobs_llm.tools import Tool


# Step 1: Define a custom tool
class CalculatorTool(Tool):
    def __init__(self, precision: int = 2):
        super().__init__(
            name="calculator",
            description="Performs basic arithmetic operations"
        )
        self.precision = precision

    @property
    def schema(self) -> Dict[str, Any]:
        return {
            "type": "object",
            "properties": {
                "operation": {
                    "type": "string",
                    "enum": ["add", "subtract", "multiply", "divide"],
                    "description": "The operation to perform"
                },
                "a": {
                    "type": "number",
                    "description": "First number"
                },
                "b": {
                    "type": "number",
                    "description": "Second number"
                }
            },
            "required": ["operation", "a", "b"]
        }

    async def execute(
        self,
        operation: str,
        a: float,
        b: float,
        **kwargs
    ) -> float:
        if operation == "add":
            result = a + b
        elif operation == "subtract":
            result = a - b
        elif operation == "multiply":
            result = a * b
        elif operation == "divide":
            if b == 0:
                raise ValueError("Cannot divide by zero")
            result = a / b
        else:
            raise ValueError(f"Unknown operation: {operation}")
        return round(result, self.precision)


# Step 2: Configure bot with tools and ReAct reasoning
async def main():
    config = {
        "llm": {
            "provider": "ollama",
            "model": "phi3:mini"  # Better for tool use
        },
        "conversation_storage": {
            "backend": "memory"
        },
        "reasoning": {
            "strategy": "react",
            "max_iterations": 5,
            "verbose": True
        },
        "tools": [
            {
                "class": "__main__.CalculatorTool",
                "params": {"precision": 2}
            }
        ]
    }

    # Create bot
    bot = await DynaBot.from_config(config)

    # Create context
    context = BotContext(
        conversation_id="agent-quickstart",
        client_id="demo-client"
    )

    # Ask questions that require calculation
    questions = [
        "What is 15 multiplied by 23?",
        "If I have $100 and spend $37.50, how much do I have left?"
    ]
    for question in questions:
        print(f"User: {question}")
        response = await bot.chat(question, context)
        print(f"Bot: {response}\n")


if __name__ == "__main__":
    asyncio.run(main())
```
Before running, pull the model:
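The model tag matches the `llm.model` value in the configuration above:

```bash
ollama pull phi3:mini
```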
### What's New?
We added:
1. **Custom Tool** - `CalculatorTool` implements the `Tool` interface:
    - `name` and `description` - Identify the tool
    - `schema` - JSON schema defining parameters
    - `execute()` - Performs the actual calculation
2. **ReAct Reasoning** - Enables the Reasoning + Acting pattern:
    - `strategy: "react"` - Uses ReAct reasoning
    - `max_iterations: 5` - Maximum reasoning steps
    - `verbose: True` - Shows reasoning trace
3. **Tools Configuration** - Loads the tool from configuration:
    - `class` - Fully qualified class name
    - `params` - Constructor parameters
The agent now:

1. Receives a question
2. Reasons about how to answer it
3. Decides to use the calculator tool
4. Executes the tool
5. Uses the result to formulate an answer
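The tool's arithmetic is plain Python, so it can be sanity-checked on its own before wiring it into the agent. This hypothetical standalone function mirrors the body of `CalculatorTool.execute()` without the `Tool` base class:

```python
def calculate(operation: str, a: float, b: float, precision: int = 2) -> float:
    # Mirrors CalculatorTool.execute() from the agent example
    if operation == "add":
        result = a + b
    elif operation == "subtract":
        result = a - b
    elif operation == "multiply":
        result = a * b
    elif operation == "divide":
        if b == 0:
            raise ValueError("Cannot divide by zero")
        result = a / b
    else:
        raise ValueError(f"Unknown operation: {operation}")
    return round(result, precision)


# The two sample questions above reduce to:
calculate("multiply", 15, 23)      # 345
calculate("subtract", 100, 37.50)  # 62.5
```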
## Multi-Tenant Setup
Here's how to set up a multi-tenant bot serving multiple clients:
```python
import asyncio
from dataknobs_bots import BotRegistry, BotContext


async def main():
    # Create bot registry with base configuration
    base_config = {
        "llm": {
            "provider": "ollama",
            "model": "gemma3:1b"
        },
        "conversation_storage": {
            "backend": "memory"
        }
    }

    registry = BotRegistry(
        config=base_config,
        cache_ttl=300,  # Cache bots for 5 minutes
        max_cache_size=1000
    )

    # Register clients with custom configurations
    await registry.register_client(
        "client-a",
        {
            "memory": {"type": "buffer", "max_messages": 10},
            "prompts": {
                "system": "You are a helpful customer support assistant."
            }
        }
    )
    await registry.register_client(
        "client-b",
        {
            "memory": {"type": "buffer", "max_messages": 20},
            "prompts": {
                "system": "You are a technical expert."
            }
        }
    )

    # Get bots for different clients
    bot_a = await registry.get_bot("client-a")
    bot_b = await registry.get_bot("client-b")

    # Chat with client A's bot
    context_a = BotContext(
        conversation_id="conv-a-001",
        client_id="client-a"
    )
    response_a = await bot_a.chat("How can I reset my password?", context_a)
    print(f"Client A: {response_a}\n")

    # Chat with client B's bot
    context_b = BotContext(
        conversation_id="conv-b-001",
        client_id="client-b"
    )
    response_b = await bot_b.chat("Explain async/await in Python", context_b)
    print(f"Client B: {response_b}\n")


if __name__ == "__main__":
    asyncio.run(main())
```
### Multi-Tenancy Benefits
- Client Isolation - Each client has separate conversations and configuration
- Efficient Caching - Bot instances are cached and reused
- Centralized Management - Single registry manages all bots
- Horizontal Scaling - Stateless design enables scaling across multiple servers
## Next Steps
Now that you've built your first bots, explore more advanced features:
- User Guide - Comprehensive tutorials and patterns
- Configuration Reference - All configuration options
- Tools Development - Create custom tools
- Architecture - System design and scaling
- Examples - More complete examples
## Common Issues
### Model Not Found
Solution: Pull the model first:
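Assuming the default model from this guide (substitute whichever model your configuration names):

```bash
ollama pull gemma3:1b
```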
### Import Error
Solution: Install the package:
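Assuming the PyPI package name matches the import:

```bash
pip install dataknobs-bots
```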
### FAISS Not Found
Solution: Install FAISS:
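The CPU build is sufficient for local development (use `faiss-gpu` if you need GPU acceleration):

```bash
pip install faiss-cpu
```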
## Tips
- Start Simple - Begin with a basic chatbot, then add features incrementally
- Use Ollama - Great for development and testing without API costs
- Enable Verbose - Set `verbose: True` in the reasoning config to see what's happening
- Check Logs - Enable logging to debug issues
- Read Examples - The examples directory has complete working code
## Summary
You've learned how to:
- ✅ Create a basic chatbot with memory
- ✅ Stream responses in real-time
- ✅ Override LLM config per-request (A/B testing, model switching)
- ✅ Add a knowledge base (RAG)
- ✅ Build a tool-using agent
- ✅ Set up multi-tenant bots
You're now ready to build sophisticated AI agents with dataknobs-bots!