User Guide¶
Complete guide to using DataKnobs Bots with tutorials and how-to guides.
Table of Contents¶
- Getting Started
- Basic Tutorials
- Tutorial 1: Your First Chatbot
- Tutorial 2: Adding Memory
- Tutorial 3: Streaming Responses
- Per-Request Config Overrides
- Conversation Undo & Rewind
- Tutorial 4: Building a RAG Chatbot
- Tutorial 5: Creating Tool-Using Agents
- Tutorial 6: Building Guided Wizard Flows
- Advanced Topics
- Multi-Tenant Deployment
- Custom Tools Development
- Production Deployment
- Common Patterns
- Troubleshooting
Getting Started¶
Prerequisites¶
- Python 3.12 or higher
- Basic understanding of async/await in Python
- (Optional) Ollama installed for local LLM testing
Installation¶
# Basic installation
pip install dataknobs-bots
# With PostgreSQL support
pip install dataknobs-bots[postgres]
# With all optional dependencies
pip install dataknobs-bots[all]
Install Ollama (Optional, for Local Testing)¶
Basic Tutorials¶
Tutorial 1: Your First Chatbot¶
Build a simple conversational chatbot in 5 minutes.
Step 1: Create the Bot Configuration¶
# first_bot.py
import asyncio
from dataknobs_bots import DynaBot, BotContext
async def main():
# Configuration
config = {
"llm": {
"provider": "ollama",
"model": "gemma3:1b",
"temperature": 0.7,
"max_tokens": 500
},
"conversation_storage": {
"backend": "memory"
}
}
# Create bot
bot = await DynaBot.from_config(config)
print("✓ Bot created successfully!")
# Create context
context = BotContext(
conversation_id="tutorial-1",
client_id="my-app"
)
# Chat loop
print("\nChat with the bot (type 'quit' to exit):\n")
while True:
user_input = input("You: ")
if user_input.lower() == "quit":
break
response = await bot.chat(user_input, context)
print(f"Bot: {response}\n")
if __name__ == "__main__":
asyncio.run(main())
Step 2: Run the Bot¶
Step 3: Try It Out¶
You: Hello!
Bot: Hi there! How can I help you today?
You: What can you do?
Bot: I'm a conversational AI assistant. I can chat with you about various topics, answer questions, and help with tasks.
You: quit
What's Happening?¶
- Configuration: Defines LLM (Ollama) and storage (in-memory)
- Bot Creation: from_config() creates a configured bot
- Context: Identifies the conversation
- Chat: bot.chat() processes messages and returns responses
Adding a System Prompt¶
You can add a system prompt to customize the bot's behavior:
config = {
"llm": {
"provider": "ollama",
"model": "gemma3:1b",
},
"conversation_storage": {
"backend": "memory"
},
# Add a system prompt (smart detection: if not in prompts library,
# treated as inline content)
"system_prompt": "You are a helpful coding assistant. Be concise and technical."
}
DynaBot uses smart detection for system prompts:
- If the string exists in the prompts library → used as a template reference
- If not → treated as inline content
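The detection rule above can be pictured as a small lookup. The sketch below is illustrative only, with a plain dict standing in for the prompts library; `resolve_system_prompt` is a hypothetical helper, not part of the library's API:

```python
# Hypothetical sketch of the smart-detection rule: a dict stands in
# for the prompts library.
def resolve_system_prompt(value: str, prompts_library: dict[str, str]) -> str:
    if value in prompts_library:
        # Known name: resolve it as a template reference
        return prompts_library[value]
    # Unknown string: treat it as inline prompt content
    return value

library = {"coding_assistant": "You are a helpful coding assistant."}
print(resolve_system_prompt("coding_assistant", library))  # template content
print(resolve_system_prompt("You are a pirate.", library))  # inline content
```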
Next Steps¶
- Try different models: llama3.1:8b, phi3:mini
- Adjust temperature (0.0 = focused, 1.0 = creative)
- Change max_tokens for longer/shorter responses
- See CONFIGURATION.md for all system prompt options
Tutorial 2: Adding Memory¶
Add conversation memory so the bot remembers previous messages.
Step 1: Add Memory Configuration¶
# memory_bot.py
import asyncio
from dataknobs_bots import DynaBot, BotContext
async def main():
config = {
"llm": {
"provider": "ollama",
"model": "gemma3:1b",
},
"conversation_storage": {
"backend": "memory"
},
# Add memory configuration
"memory": {
"type": "buffer",
"max_messages": 10 # Remember last 10 messages
}
}
bot = await DynaBot.from_config(config)
context = BotContext(
conversation_id="tutorial-2",
client_id="my-app",
user_id="user-123"
)
# Test memory
print("Testing conversation memory:\n")
response1 = await bot.chat("My name is Alice", context)
print(f"Bot: {response1}\n")
response2 = await bot.chat("What is my name?", context)
print(f"Bot: {response2}\n")
response3 = await bot.chat("Tell me about yourself", context)
print(f"Bot: {response3}\n")
if __name__ == "__main__":
asyncio.run(main())
Step 2: Run and Observe¶
Output:
Testing conversation memory:
Bot: Nice to meet you, Alice! How can I help you today?
Bot: Your name is Alice!
Bot: I'm an AI assistant designed to have helpful, harmless conversations...
Understanding Memory Types¶
Buffer Memory (what we used):
- Keeps the last N messages
- Fast and simple
- Good for most use cases
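The trimming behavior of buffer memory can be sketched with a bounded deque. This is a simplified stand-in for illustration, not the library's actual implementation:

```python
from collections import deque

# Simplified stand-in for buffer memory: keep only the last N messages.
class BufferMemorySketch:
    def __init__(self, max_messages: int = 10):
        # deque with maxlen silently drops the oldest entry when full
        self.messages = deque(maxlen=max_messages)

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

mem = BufferMemorySketch(max_messages=4)
for i in range(6):
    mem.add("user", f"message {i}")

print([m["content"] for m in mem.messages])  # the two oldest messages are gone
```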
Summary Memory (LLM-based compression):
"memory": {
"type": "summary",
"recent_window": 10 # Keep last 10 messages verbatim, summarize older
}
Vector Memory (Semantic recall):
"memory": {
"type": "vector",
"embedding_provider": "ollama",
"embedding_model": "nomic-embed-text",
"backend": "faiss",
"dimension": 384
}
Composite Memory (Combine strategies):
"memory": {
"type": "composite",
"strategies": [
{"type": "summary", "recent_window": 10},
{
"type": "vector",
"backend": "memory",
"dimension": 384,
"embedding_provider": "ollama",
"embedding_model": "nomic-embed-text"
}
]
}
See CONFIGURATION.md for full details on composite memory and vector memory tenant scoping.
Tutorial 3: Streaming Responses¶
Stream LLM responses token-by-token for better user experience.
Why Streaming?¶
- Better UX: Users see responses as they're generated
- Lower Latency: First tokens appear immediately
- Interactive: Users can interrupt or cancel if the response isn't useful
Step 1: Basic Streaming¶
# streaming_bot.py
import asyncio
from dataknobs_bots import DynaBot, BotContext
async def main():
config = {
"llm": {
"provider": "ollama",
"model": "gemma3:1b",
},
"conversation_storage": {
"backend": "memory"
}
}
bot = await DynaBot.from_config(config)
context = BotContext(
conversation_id="streaming-demo",
client_id="my-app"
)
print("Bot: ", end="", flush=True)
# Stream response token by token — each chunk is an LLMStreamResponse
async for chunk in bot.stream_chat("Write a haiku about coding", context):
print(chunk.delta, end="", flush=True)
print() # Newline after response
await bot.close()
if __name__ == "__main__":
asyncio.run(main())
Step 2: Run and See Streaming¶
Output (tokens appear one-by-one):
Step 3: Accumulating the Full Response¶
# If you need the complete response
full_response = ""
async for chunk in bot.stream_chat("Tell me a joke", context):
full_response += chunk.delta
print(chunk.delta, end="", flush=True)
print()
print(f"\n[Total length: {len(full_response)} characters]")
Step 4: Streaming with a Web API¶
# api_streaming.py
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from dataknobs_bots import DynaBot, BotContext
app = FastAPI()
bot = None
@app.on_event("startup")
async def startup():
global bot
bot = await DynaBot.from_config(config)
@app.post("/chat/stream")
async def stream_chat(request: ChatRequest):
context = BotContext(
conversation_id=request.conversation_id,
client_id=request.client_id
)
async def generate():
async for chunk in bot.stream_chat(request.message, context):
yield chunk.delta
return StreamingResponse(generate(), media_type="text/plain")
Streaming vs Non-Streaming¶
| Feature | chat() | stream_chat() |
|---|---|---|
| Return type | str | AsyncGenerator[LLMStreamResponse, None] |
| Response timing | All at once | Token by token |
| Middleware hook | after_message() | post_stream() |
| Memory updates | After response | After stream completes |
| Best for | Simple integrations | Interactive UIs |
Error Handling in Streaming¶
try:
async for chunk in bot.stream_chat("Hello", context):
print(chunk.delta, end="", flush=True)
except Exception as e:
print(f"\nStreaming error: {e}")
# Middleware's on_error() is automatically called
# Memory is NOT updated with partial responses
Per-Request Config Overrides¶
Override LLM configuration on a per-request basis without creating a new bot instance.
Why Use Config Overrides?¶
- A/B Testing: Compare models or parameters without redeployment
- Dynamic Model Selection: Switch models based on request type
- Cost Optimization: Use cheaper models for simple queries
- Fallback Routing: Route to different models for specific use cases
Basic Usage¶
# Override model and temperature for a single request
response = await bot.chat(
"Explain quantum computing in simple terms",
context,
llm_config_overrides={
"model": "gpt-4-turbo",
"temperature": 0.3
}
)
Streaming with Overrides¶
async for chunk in bot.stream_chat(
"Write a creative poem",
context,
llm_config_overrides={
"model": "claude-3-opus",
"temperature": 0.9,
"max_tokens": 2000
}
):
print(chunk.delta, end="", flush=True)
Supported Override Fields¶
| Field | Type | Description |
|---|---|---|
| model | str | Model identifier (e.g., "gpt-4-turbo", "llama3.2:8b") |
| temperature | float | Sampling temperature (0.0-2.0) |
| max_tokens | int | Maximum tokens in response |
| top_p | float | Nucleus sampling threshold |
| stop_sequences | list[str] | Stop generation at these sequences |
| seed | int | Random seed for reproducibility |
| options | dict | Provider-specific options |
Tracking Override Usage¶
The bot automatically tracks which overrides were applied in the conversation metadata:
# Chat with overrides
response = await bot.chat(
"Hello",
context,
llm_config_overrides={"model": "gpt-4-turbo"}
)
# Check what was used
conversation = await bot.get_conversation(context.conversation_id)
tree = conversation.message_tree
assistant_nodes = tree.find_nodes(
lambda node: node.data.message and node.data.message.role == "assistant"
)
# See which overrides were applied
metadata = assistant_nodes[-1].data.metadata
print(metadata.get("config_overrides_applied"))
# Output: {"model": "gpt-4-turbo"}
Use Cases¶
A/B Testing Models:
import random
model = random.choice(["gpt-4", "claude-3-sonnet"])
response = await bot.chat(
message,
context,
llm_config_overrides={"model": model}
)
Query Complexity Routing:
# Simple queries → faster/cheaper model
# Complex queries → more capable model
if len(message) < 50:
overrides = {"model": "gpt-3.5-turbo"}
else:
overrides = {"model": "gpt-4"}
response = await bot.chat(message, context, llm_config_overrides=overrides)
Creative vs Factual Responses:
# High temperature for creative tasks
creative_response = await bot.chat(
"Write a poem about coding",
context,
llm_config_overrides={"temperature": 0.9}
)
# Low temperature for factual queries
factual_response = await bot.chat(
"What is the capital of France?",
context,
llm_config_overrides={"temperature": 0.1}
)
Conversation Undo & Rewind¶
DynaBot supports undoing turns and rewinding conversations to earlier points. Undo navigates the conversation tree to a checkpoint recorded before the turn, creating a new branch. The original path is preserved.
Undo the Last Turn¶
from dataknobs_bots import DynaBot, BotContext
bot = await DynaBot.from_config(config)
context = BotContext(conversation_id="undo-demo", client_id="my-app")
await bot.chat("Hello", context)
await bot.chat("Tell me about Python", context)
# Undo the last turn (removes "Tell me about Python" + its response)
result = await bot.undo_last_turn(context)
print(f"Undid: {result.undone_user_message}")
print(f"Remaining turns: {result.remaining_turns}")
# Next chat() creates a new branch from the checkpoint
await bot.chat("Tell me about Rust instead", context)
Rewind to a Specific Turn¶
await bot.chat("First message", context) # Turn 0
await bot.chat("Second message", context) # Turn 1
await bot.chat("Third message", context) # Turn 2
# Rewind to after turn 0 (removes turns 1 and 2)
result = await bot.rewind_to_turn(context, 0)
# Rewind to start (removes all turns)
result = await bot.rewind_to_turn(context, -1)
UndoResult¶
Both methods return an UndoResult dataclass:
from dataknobs_bots import UndoResult
# Fields:
# undone_user_message: str - The user message that was undone
# undone_bot_response: str - The bot response that was undone
# remaining_turns: int - Number of user turns remaining
# branching: bool - Whether the undo created a branch
What Gets Rolled Back¶
Undo coordinates rollback across all DynaBot subsystems:
| Component | Rollback behavior |
|---|---|
| Conversation tree | Navigates to checkpoint node; next message branches |
| Memory (Buffer/Summary) | Pops messages added since checkpoint |
| Wizard FSM state | Restores from per-node metadata |
| Memory banks | Removes records whose source_node_id is not an ancestor of the checkpoint |
Limitations¶
- VectorMemory does not support undo (pop_messages raises NotImplementedError)
- SummaryMemory can only undo messages still in its recent window; summarized messages cannot be individually undone
- Undo does not reverse external side effects (tool calls that wrote to a database, sent emails, etc.)
Tutorial 4: Building a RAG Chatbot¶
Create a chatbot that answers questions using your documents.
Step 1: Prepare Documents¶
# Create a docs directory
mkdir my_docs
# Add some documents
echo "Our company was founded in 2020 by Alice and Bob." > my_docs/about.txt
echo "We offer Premium ($99/month) and Enterprise ($299/month) plans." > my_docs/pricing.txt
echo "Email support@company.com for help or call 555-0123." > my_docs/contact.txt
Step 2: Create RAG Bot¶
# rag_bot.py
import asyncio
from dataknobs_bots import DynaBot, BotContext
async def main():
config = {
"llm": {
"provider": "ollama",
"model": "gemma3:1b",
},
"conversation_storage": {
"backend": "memory"
},
# Enable knowledge base
"knowledge_base": {
"enabled": True,
"documents_path": "./my_docs",
"vector_store": {
"backend": "faiss",
"dimension": 384,
"collection": "my_knowledge"
},
"embedding_provider": "ollama",
"embedding_model": "nomic-embed-text",
"chunking": {
"max_chunk_size": 500
}
}
}
print("Creating RAG bot and indexing documents...")
bot = await DynaBot.from_config(config)
print("✓ Bot ready!\n")
context = BotContext(
conversation_id="tutorial-3",
client_id="my-app"
)
# Ask questions about documents
questions = [
"When was the company founded?",
"What are the pricing plans?",
"How can I contact support?",
]
for question in questions:
print(f"Question: {question}")
response = await bot.chat(question, context)
print(f"Answer: {response}\n")
if __name__ == "__main__":
asyncio.run(main())
Step 3: Pull Required Model¶
Step 4: Run the RAG Bot¶
Output:
Creating RAG bot and indexing documents...
✓ Bot ready!
Question: When was the company founded?
Answer: According to the documents, the company was founded in 2020 by Alice and Bob.
Question: What are the pricing plans?
Answer: We offer two pricing plans: Premium at $99/month and Enterprise at $299/month.
Question: How can I contact support?
Answer: You can email support@company.com or call 555-0123 for help.
How RAG Works¶
User Question
↓
1. Convert to embedding
↓
2. Search knowledge base
↓
3. Retrieve relevant chunks
↓
4. Add chunks to LLM context
↓
5. Generate answer with context
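The pipeline above can be sketched end-to-end with toy stand-ins: simple word-overlap scoring replaces real embeddings and vector search, and the "LLM call" is just the augmented prompt. Illustrative only; the real knowledge base uses the configured embedding model and backend:

```python
import re

# Toy document chunks (same content as the tutorial's my_docs files)
chunks = [
    "Our company was founded in 2020 by Alice and Bob.",
    "We offer Premium ($99/month) and Enterprise ($299/month) plans.",
    "Email support@company.com for help or call 555-0123.",
]

def tokens(text: str) -> set[str]:
    # Crude tokenizer standing in for an embedding model
    return set(re.findall(r"[a-z0-9$]+", text.lower()))

def retrieve(question: str, k: int = 1) -> list[str]:
    # Word-overlap scoring standing in for vector similarity search
    q = tokens(question)
    return sorted(chunks, key=lambda c: -len(q & tokens(c)))[:k]

question = "When was the company founded?"
context_chunk = retrieve(question)[0]
# The retrieved chunk is injected into the LLM context before generation
prompt = f"Context: {context_chunk}\n\nQuestion: {question}"
print(prompt)
```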
Tutorial 5: Creating Tool-Using Agents¶
Build an agent that can use tools to perform actions.
Step 1: Define Custom Tools¶
# tools.py
from dataknobs_llm.tools import Tool
from typing import Dict, Any
class CalculatorTool(Tool):
"""Tool for arithmetic operations."""
def __init__(self, precision: int = 2):
super().__init__(
name="calculator",
description="Performs basic arithmetic: add, subtract, multiply, divide"
)
self.precision = precision
@property
def schema(self) -> Dict[str, Any]:
return {
"type": "object",
"properties": {
"operation": {
"type": "string",
"enum": ["add", "subtract", "multiply", "divide"],
"description": "Operation to perform"
},
"a": {"type": "number", "description": "First number"},
"b": {"type": "number", "description": "Second number"}
},
"required": ["operation", "a", "b"]
}
async def execute(self, operation: str, a: float, b: float) -> float:
operations = {
"add": lambda x, y: x + y,
"subtract": lambda x, y: x - y,
"multiply": lambda x, y: x * y,
"divide": lambda x, y: x / y if y != 0 else float('inf')
}
result = operations[operation](a, b)
return round(result, self.precision)
class WeatherTool(Tool):
"""Mock weather tool (in real use, call weather API)."""
def __init__(self):
super().__init__(
name="get_weather",
description="Get current weather for a location"
)
@property
def schema(self) -> Dict[str, Any]:
return {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City name or location"
}
},
"required": ["location"]
}
async def execute(self, location: str) -> str:
# Mock weather data (in real use, call API)
weather_data = {
"new york": "Sunny, 72°F",
"london": "Cloudy, 15°C",
"tokyo": "Rainy, 20°C"
}
location_lower = location.lower()
return weather_data.get(location_lower, f"Weather data not available for {location}")
Step 2: Create Agent with Tools¶
# agent.py
import asyncio
from dataknobs_bots import DynaBot, BotContext
async def main():
config = {
"llm": {
"provider": "ollama",
"model": "phi3:mini", # phi3 is good with tools
},
"conversation_storage": {
"backend": "memory"
},
# Enable ReAct reasoning
"reasoning": {
"strategy": "react",
"max_iterations": 5,
"verbose": True, # See reasoning steps
"store_trace": True
},
# Configure tools
"tools": [
{
"class": "tools.CalculatorTool",
"params": {"precision": 2}
},
{
"class": "tools.WeatherTool",
"params": {}
}
]
}
print("Creating agent with tools...\n")
bot = await DynaBot.from_config(config)
print("✓ Agent ready!\n")
context = BotContext(
conversation_id="tutorial-4",
client_id="my-app"
)
# Tasks requiring tools
tasks = [
"What is 15 multiplied by 7?",
"What's the weather in Tokyo?",
"Calculate 100 divided by 4, then add 10 to that result"
]
for task in tasks:
print(f"Task: {task}\n")
response = await bot.chat(task, context)
print(f"Agent: {response}\n")
print("-" * 60 + "\n")
if __name__ == "__main__":
asyncio.run(main())
Step 3: Run the Agent¶
Output:
Creating agent with tools...
✓ Agent ready!
Task: What is 15 multiplied by 7?
[Iteration 1]
Thought: I need to multiply 15 by 7
Action: calculator
Action Input: {"operation": "multiply", "a": 15, "b": 7}
Observation: 105
[Iteration 2]
Thought: I have the answer
Agent: 15 multiplied by 7 is 105.
------------------------------------------------------------
Task: What's the weather in Tokyo?
[Iteration 1]
Thought: I need to check the weather
Action: get_weather
Action Input: {"location": "Tokyo"}
Observation: Rainy, 20°C
[Iteration 2]
Thought: I have the weather information
Agent: The weather in Tokyo is rainy with a temperature of 20°C.
Understanding ReAct¶
ReAct = **Rea**soning + **Act**ing
Each iteration:
1. Thought: What should I do?
2. Action: Which tool to use?
3. Observation: What did the tool return?
4. Repeat or Answer: Continue or provide the final answer
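The loop can be sketched as plain Python. This is a toy driver with a scripted "model" in place of real LLM output, not the library's ReAct implementation:

```python
# Toy ReAct driver: scripted reasoning steps stand in for LLM output.
tools = {"calculator": lambda a, b: a * b}

script = [
    # Iteration 1: think, act, observe
    {"thought": "I need to multiply 15 by 7", "action": ("calculator", 15, 7)},
    # Iteration 2: no action needed, emit the final answer
    {"thought": "I have the answer", "answer": "15 multiplied by 7 is {obs}."},
]

def run_react(script, tools):
    observation = None
    for step in script:
        if "action" in step:
            name, a, b = step["action"]
            observation = tools[name](a, b)  # Observation: the tool's result
        else:
            return step["answer"].format(obs=observation)  # Final answer

print(run_react(script, tools))  # 15 multiplied by 7 is 105.
```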
Tutorial 6: Building Guided Wizard Flows¶
Create multi-step conversational wizards with validation and branching.
You'll Learn:
- Creating wizard configuration files
- Stage-based data collection
- JSON Schema validation
- Navigation commands (back, skip, restart)
- Lifecycle hooks

Use Cases:
- User onboarding flows
- Form-based data collection
- Multi-step configuration wizards
- Guided decision trees
Step 1: Create Wizard Configuration¶
Create a wizard.yaml file:
# wizard.yaml
name: onboarding-wizard
version: "1.0"
description: User onboarding flow
stages:
- name: welcome
is_start: true
prompt: "Welcome! What type of project are you creating?"
schema:
type: object
properties:
project_type:
type: string
enum: [web, mobile, api]
suggestions:
- "Web application"
- "Mobile app"
- "API service"
transitions:
- target: name_project
condition: "data.get('project_type')"
- name: name_project
prompt: "What would you like to name your project?"
help_text: "Choose a descriptive name (3-30 characters)"
schema:
type: object
properties:
project_name:
type: string
minLength: 3
maxLength: 30
required: ["project_name"]
transitions:
- target: features
- name: features
prompt: "Which features do you need? (comma-separated)"
can_skip: true
schema:
type: object
properties:
features:
type: array
items:
type: string
suggestions:
- "Authentication"
- "Database"
- "File uploads"
transitions:
- target: complete
- name: complete
is_end: true
prompt: |
Your project is configured!
- Name: {{project_name}}
- Type: {{project_type}}
- Features: {{features}}
Step 2: Create Wizard Bot¶
# wizard_bot.py
import asyncio
from dataknobs_bots import DynaBot, BotContext
async def main():
config = {
"llm": {
"provider": "ollama",
"model": "gemma3:1b"
},
"conversation_storage": {
"backend": "memory"
},
"reasoning": {
"strategy": "wizard",
"wizard_config": "wizard.yaml",
"strict_validation": True
}
}
print("Creating wizard bot...\n")
bot = await DynaBot.from_config(config)
print("✓ Wizard ready!\n")
context = BotContext(
conversation_id="onboarding-001",
client_id="my-app"
)
# Interactive loop
print("Start the wizard by sending any message.")
print("Type 'quit' to exit.\n")
while True:
user_input = input("You: ").strip()
if user_input.lower() == 'quit':
break
response = await bot.chat(user_input, context)
print(f"Bot: {response}\n")
if __name__ == "__main__":
asyncio.run(main())
Step 3: Run the Wizard¶
Example Session:
Creating wizard bot...
✓ Wizard ready!
Start the wizard by sending any message.
Type 'quit' to exit.
You: hello
Bot: Welcome! What type of project are you creating?
Suggestions: Web application, Mobile app, API service
You: I want to build a web app
Bot: Great choice! What would you like to name your project?
You: MyAwesomeProject
Bot: Which features do you need? (comma-separated)
Suggestions: Authentication, Database, File uploads
(You can skip this step by saying "skip")
You: authentication, database
Bot: Your project is configured!
- Name: MyAwesomeProject
- Type: web
- Features: ['authentication', 'database']
Bot-Initiated Greeting¶
Instead of requiring the user to send the first message, wizard bots can greet the user
proactively using bot.greet():
context = BotContext(conversation_id="onboarding-001", client_id="my-app")
# Bot speaks first — renders the start stage's response_template or uses LLM
greeting = await bot.greet(context)
if greeting:
print(f"Bot: {greeting}")
# Now the user's first message answers the wizard's question
while True:
user_input = input("You: ").strip()
if user_input.lower() == "quit":
break
response = await bot.chat(user_input, context)
print(f"Bot: {response}\n")
This works with wizard bots only. Non-wizard bots return None from greet(). See the
Configuration Reference for details.
Navigation Commands¶
Users can navigate naturally using keyword commands:
| Say | Effect |
|---|---|
| "back" / "go back" / "previous" | Return to previous stage |
| "skip" / "skip this" | Skip optional stage (if can_skip: true) |
| "restart" / "start over" | Begin from start |
Example:
You: Actually, go back
Bot: Returning to previous step. What would you like to name your project?
You: restart
Bot: Starting over. Welcome! What type of project are you creating?
These keywords are the defaults. You can customize them per-wizard or per-stage via
the navigation section in wizard settings. See the
Configuration Reference for details.
Conversation Tree Branching:
When a wizard stage is revisited via back or restart, the conversation tree creates a sibling branch from the point where the stage was previously entered. This preserves earlier conversation paths rather than chaining messages deeper into a single linear chain. For example, restarting a wizard that was on the greeting stage creates a new greeting node as a sibling of the original, both sharing the same parent node.
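The sibling-branch behavior can be illustrated with a toy tree. The `Node` class below is hypothetical, not the actual conversation-tree API:

```python
# Toy illustration of restart branching: the re-entered stage becomes a
# sibling of the original stage node, not a deeper child.
class Node:
    def __init__(self, label: str, parent: "Node | None" = None):
        self.label, self.parent, self.children = label, parent, []
        if parent is not None:
            parent.children.append(self)

root = Node("conversation start")
greeting_v1 = Node("greeting (original path)", parent=root)
answer_v1 = Node("user answer", parent=greeting_v1)  # preserved after restart

# "restart" re-enters the greeting stage as a new sibling branch
greeting_v2 = Node("greeting (after restart)", parent=root)

print([c.label for c in root.children])  # both greeting nodes share one parent
```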
Adding Lifecycle Hooks¶
Customize behavior at stage transitions:
from dataknobs_bots.reasoning.wizard_hooks import WizardHooks
# Create hooks instance
hooks = WizardHooks()
# Log every stage entry
def log_entry(stage: str, data: dict):
print(f"[Entering {stage}] Data so far: {data}")
hooks.on_enter(log_entry)
# Validate before exit (stage-specific)
async def validate_exit(stage: str, data: dict):
name = data.get("project_name", "")
if name.lower() in ["test", "temp"]:
raise ValueError("Please choose a more descriptive name")
hooks.on_exit(validate_exit, stage="name_project")
# Save on completion
async def save_project(data: dict):
print(f"Saving project: {data}")
# Save to database, create files, etc.
hooks.on_complete(save_project)
Configuration-based hooks (for YAML configs):
reasoning:
strategy: wizard
wizard_config: wizard.yaml
hooks:
on_enter:
- "myapp.hooks:log_entry"
- function: "myapp.hooks:validate_welcome"
stage: welcome # Stage-specific hook
on_complete:
- "myapp.hooks:save_project"
Conditional Branching¶
Create dynamic flows based on user input:
stages:
- name: experience
prompt: "What's your experience level?"
schema:
type: object
properties:
level:
type: string
enum: [beginner, intermediate, advanced]
transitions:
- target: beginner_path
condition: "data.get('level') == 'beginner'"
- target: advanced_path
condition: "data.get('level') == 'advanced'"
- target: intermediate_path # Default
- name: beginner_path
prompt: "Let's start with the basics..."
# ... beginner-specific flow
- name: advanced_path
prompt: "Here are the advanced options..."
# ... advanced-specific flow
Understanding Wizard Reasoning¶
| Strategy | Best For | Key Feature |
|---|---|---|
| Simple | Basic Q&A | Single LLM call |
| ReAct | Tool use | Iterative reasoning |
| Wizard | Multi-step flows | FSM-backed stages |
Wizard reasoning is ideal when you need:
- Structured data collection
- Input validation per step
- Conditional flow branching
- Progress tracking
- User navigation (back/skip)
Advanced Topics¶
Multi-Tenant Deployment¶
Deploy a single bot instance serving multiple clients.
# multi_tenant_bot.py
import asyncio
from dataknobs_bots import DynaBot, BotContext
async def handle_client_request(
bot: DynaBot,
client_id: str,
user_id: str,
message: str
):
"""Handle request from a specific client."""
context = BotContext(
conversation_id=f"{client_id}-{user_id}",
client_id=client_id,
user_id=user_id,
session_metadata={
"client_name": f"Client {client_id}",
"subscription": "premium"
}
)
response = await bot.chat(message, context)
return response
async def main():
# Shared bot configuration
config = {
"llm": {"provider": "ollama", "model": "gemma3:1b"},
"conversation_storage": {
"backend": "postgres", # Shared storage
"host": "localhost",
"database": "multi_tenant_db"
},
"memory": {"type": "buffer", "max_messages": 10}
}
bot = await DynaBot.from_config(config)
# Simulate multiple clients
tasks = [
handle_client_request(bot, "client-A", "user-1", "Hello from Client A"),
handle_client_request(bot, "client-B", "user-2", "Hello from Client B"),
handle_client_request(bot, "client-A", "user-3", "Another user from A"),
]
responses = await asyncio.gather(*tasks)
for i, response in enumerate(responses):
print(f"Response {i+1}: {response}\n")
if __name__ == "__main__":
asyncio.run(main())
Key Points:
- Single bot instance
- Separate client_id for each tenant
- Conversations isolated by ID
- Shared storage with tenant partitioning
Custom Tools Development¶
See TOOLS.md for comprehensive guide.
Quick Example:
from dataknobs_llm.tools import Tool
from typing import Dict, Any
import httpx
class StockPriceTool(Tool):
"""Get current stock price."""
def __init__(self, api_key: str):
super().__init__(
name="get_stock_price",
description="Get current stock price for a ticker symbol"
)
self.api_key = api_key
@property
def schema(self) -> Dict[str, Any]:
return {
"type": "object",
"properties": {
"ticker": {
"type": "string",
"description": "Stock ticker symbol (e.g., AAPL, GOOGL)"
}
},
"required": ["ticker"]
}
async def execute(self, ticker: str) -> Dict[str, Any]:
async with httpx.AsyncClient() as client:
response = await client.get(
f"https://api.example.com/stock/{ticker}",
headers={"Authorization": f"Bearer {self.api_key}"}
)
data = response.json()
return {
"ticker": ticker,
"price": data["price"],
"change": data["change"]
}
Usage:
config = {
# ... other config
"tools": [
{
"class": "my_tools.StockPriceTool",
"params": {"api_key": "${STOCK_API_KEY}"}
}
]
}
Production Deployment¶
Configuration for Production¶
# production.yaml
llm:
provider: openai
model: gpt-4
api_key: ${OPENAI_API_KEY}
temperature: 0.7
max_tokens: 2000
conversation_storage:
backend: postgres
host: ${DB_HOST}
port: 5432
database: ${DB_NAME}
user: ${DB_USER}
password: ${DB_PASSWORD}
pool_size: 20
max_overflow: 10
memory:
type: buffer
max_messages: 20
reasoning:
strategy: react
max_iterations: 5
verbose: false
store_trace: false
# Logging middleware
middleware:
- class: middleware.RequestLoggingMiddleware
params:
log_level: INFO
- class: middleware.MetricsMiddleware
params:
statsd_host: ${STATSD_HOST}
Docker Deployment¶
# Dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]
# docker-compose.yml
version: '3.8'
services:
postgres:
image: postgres:14
environment:
POSTGRES_DB: botdb
POSTGRES_USER: botuser
POSTGRES_PASSWORD: ${DB_PASSWORD}
volumes:
- postgres_data:/var/lib/postgresql/data
ports:
- "5432:5432"
bot:
build: .
environment:
- OPENAI_API_KEY=${OPENAI_API_KEY}
- DB_HOST=postgres
- DB_NAME=botdb
- DB_USER=botuser
- DB_PASSWORD=${DB_PASSWORD}
depends_on:
- postgres
ports:
- "8000:8000"
deploy:
replicas: 3
volumes:
postgres_data:
Health Checks¶
# app.py
from fastapi import FastAPI
from dataknobs_bots import DynaBot
app = FastAPI()
bot = None
@app.on_event("startup")
async def startup():
global bot
bot = await DynaBot.from_config(config)
@app.get("/health")
async def health_check():
return {"status": "healthy", "bot_ready": bot is not None}
@app.post("/chat")
async def chat(request: ChatRequest):
context = BotContext(
conversation_id=request.conversation_id,
client_id=request.client_id,
user_id=request.user_id
)
response = await bot.chat(request.message, context)
return {"response": response}
Common Patterns¶
Pattern 1: Configuration per Environment¶
import os
import yaml
def load_config():
env = os.getenv("ENV", "development")
config_file = f"config/{env}.yaml"
with open(config_file) as f:
config = yaml.safe_load(f)
return config
config = load_config()
bot = await DynaBot.from_config(config)
Pattern 2: Dynamic Tool Loading¶
config = {
# ... base config
"tool_definitions": {
"calculator": {
"class": "tools.CalculatorTool",
"params": {"precision": 2}
},
"weather": {
"class": "tools.WeatherTool",
"params": {}
}
},
"tools": [] # Empty initially
}
# Load tools based on user subscription
if user.has_feature("calculator"):
config["tools"].append("xref:tools[calculator]")
if user.has_feature("weather"):
config["tools"].append("xref:tools[weather]")
bot = await DynaBot.from_config(config)
Pattern 3: Conversation Handoff¶
async def escalate_to_human(conversation_id: str):
"""Transfer conversation to human agent."""
# Get conversation history
history = await storage.load_conversation(conversation_id)
# Send to human agent system
await human_agent_system.create_ticket(
conversation_id=conversation_id,
history=history,
priority="high"
)
# Update conversation metadata
await storage.update_metadata(
conversation_id,
{"status": "escalated", "escalated_at": datetime.now()}
)
Troubleshooting¶
Issue: Bot responses are too slow¶
Possible Causes:
- Using a large LLM model
- Knowledge base search is slow
- Network latency to the LLM API
Solutions:
# Use a faster model
config["llm"]["model"] = "gemma3:1b" # Instead of "llama3.1:70b"
# Reduce max_tokens
config["llm"]["max_tokens"] = 500 # Instead of 2000
# Use local LLM (Ollama)
config["llm"]["provider"] = "ollama"
# Optimize knowledge base
config["knowledge_base"]["chunking"]["max_chunk_size"] = 300 # Smaller chunks
Issue: Out of memory errors¶
Possible Causes:
- Too many cached conversations
- Vector memory using too much RAM
- Large knowledge base in memory
Solutions:
# Use buffer memory instead of vector
config["memory"] = {"type": "buffer", "max_messages": 10}
# Use external vector store
config["knowledge_base"]["vector_store"]["backend"] = "pinecone"
# Implement conversation cache eviction
# (Future feature)
Issue: Knowledge base doesn't find relevant docs¶
Possible Causes:
- Poor chunking strategy
- Embeddings don't match query semantics
- Wrong similarity threshold
Solutions:
# Adjust chunking
config["knowledge_base"]["chunking"] = {
"max_chunk_size": 500 # Larger chunks
}
# Try different embedding model
config["knowledge_base"]["embedding_model"] = "text-embedding-3-large"
# Return more results
# In query: kb.query(query, k=10) # Instead of k=3
Issue: Tools not being called¶
Possible Causes:
- Tool description not clear
- Model not good at tool use
- Max iterations too low
Solutions:
# Use a model better at tool use
config["llm"]["model"] = "phi3:mini" # Or "gpt-4"
# Increase max iterations
config["reasoning"]["max_iterations"] = 10
# Improve tool descriptions
class MyTool(Tool):
def __init__(self):
super().__init__(
name="my_tool",
description="VERY CLEAR description of what this tool does, when to use it, and what it returns" # Be explicit!
)
Debug Mode¶
Enable verbose logging:
import logging
logging.basicConfig(level=logging.DEBUG)
config["reasoning"]["verbose"] = True
config["reasoning"]["store_trace"] = True
Next Steps¶
- Explore Examples: Check out examples/ for more patterns
- Read API Docs: See API.md for complete API reference
- Configuration Deep Dive: Read CONFIGURATION.md
- Build Custom Tools: Follow TOOLS.md guide
- Understand Architecture: Study ARCHITECTURE.md
Getting Help¶
- GitHub Issues: Report bugs or request features
- Discussions: Ask questions and share ideas
- Documentation: You're reading it!
- Examples: Working code examples