Memory Chatbot Example¶
A chatbot with conversation memory for context-aware responses.
Overview¶
This example demonstrates:
- Buffer memory for conversation context
- Context-aware responses using message history
- Configuration-based memory setup
Prerequisites¶
```bash
# Install Ollama: https://ollama.ai/

# Pull the required model
ollama pull gemma3:1b

# Install dataknobs-bots
pip install dataknobs-bots
```
What Changed from Simple Chatbot?¶
We added a memory section to the configuration:
```python
config = {
    "llm": {
        "provider": "ollama",
        "model": "gemma3:1b"
    },
    "conversation_storage": {
        "backend": "memory"
    },
    "memory": {
        "type": "buffer",    # Buffer memory (sliding window)
        "max_messages": 10   # Keep last 10 messages
    }
}
```
How It Works¶
Buffer Memory¶
Buffer memory maintains a sliding window of recent messages:
- User sends message → Added to buffer
- Bot responds → Response added to buffer
- Buffer exceeds max_messages → Oldest messages removed
- Next message → Recent history included in context
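The sliding-window behavior above can be sketched with a plain `collections.deque`. This is a simplified stand-in for the library's buffer, not its actual implementation:

```python
from collections import deque


class SlidingBuffer:
    """Minimal sliding-window buffer: keeps only the last max_messages."""

    def __init__(self, max_messages: int = 10):
        # A deque with maxlen drops the oldest entry automatically
        self.buffer = deque(maxlen=max_messages)

    def add(self, role: str, content: str) -> None:
        self.buffer.append({"role": role, "content": content})

    def context(self) -> list[dict]:
        """Recent history to include in the next LLM call."""
        return list(self.buffer)


memory = SlidingBuffer(max_messages=4)
for i in range(6):
    memory.add("user", f"message {i}")

# Only the 4 most recent messages survive
print([m["content"] for m in memory.context()])
# → ['message 2', 'message 3', 'message 4', 'message 5']
```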
Memory Flow¶
```mermaid
graph LR
    A[User Message] --> B[Add to Buffer]
    B --> C[Get Recent Context]
    C --> D[LLM with Context]
    D --> E[Bot Response]
    E --> B
```
Complete Code¶
"""Chatbot with memory example.
This example demonstrates:
- Buffer memory configuration
- Context retention across messages
- Memory limits and management
- How the bot remembers previous conversation
Required Ollama model:
ollama pull gemma3:1b
"""
import asyncio
from dataknobs_bots import BotContext, DynaBot
async def main():
"""Run a chatbot with memory."""
print("=" * 60)
print("Chatbot with Memory Example")
print("=" * 60)
print()
print("This example shows a chatbot that remembers context.")
print("Required: ollama pull gemma3:1b")
print()
# Configuration with buffer memory
config = {
"llm": {
"provider": "ollama",
"model": "gemma3:1b",
"temperature": 0.7,
"max_tokens": 500,
},
"conversation_storage": {
"backend": "memory",
},
"memory": {
"type": "buffer",
"max_messages": 10, # Remember last 10 messages
},
"prompts": {
"helpful_assistant": "You are a helpful AI assistant with excellent memory. "
"You remember details from earlier in the conversation and can reference them."
},
"system_prompt": {
"name": "helpful_assistant",
},
}
print("Creating bot with buffer memory...")
bot = await DynaBot.from_config(config)
print("✓ Bot created successfully")
print(f"✓ Memory: Buffer (max {config['memory']['max_messages']} messages)")
print()
# Create context for this conversation
context = BotContext(
conversation_id="memory-chat-001",
client_id="example-client",
user_id="demo-user",
)
# Conversation demonstrating memory
messages = [
"Hello! My name is Alice and I love reading science fiction.",
"What's your favorite sci-fi book?",
"Do you remember my name?",
"What did I tell you I love to read?",
"Can you recommend a sci-fi book for me based on what you know about my interests?",
]
for i, user_message in enumerate(messages, 1):
print(f"[{i}] User: {user_message}")
response = await bot.chat(
message=user_message,
context=context,
)
print(f"[{i}] Bot: {response}")
print()
# Add a small delay between messages
if i < len(messages):
await asyncio.sleep(1)
print("=" * 60)
print("Conversation complete!")
print()
print("Memory demonstration:")
print("- The bot remembered the user's name (Alice)")
print("- The bot remembered the user's interest (science fiction)")
print("- The bot used this context to make relevant recommendations")
print()
print(f"Memory buffer stores last {config['memory']['max_messages']} messages")
if __name__ == "__main__":
asyncio.run(main())
Running the Example¶
Expected Output¶
The bot now remembers previous messages:
```text
User: My name is Alice.
Bot: Hello Alice! Nice to meet you.

User: What's my name?
Bot: Your name is Alice.

User: What did I just tell you?
Bot: You told me your name is Alice.
```
Memory Types¶
Buffer Memory¶
Simple sliding window (used in this example):
**Pros:** Fast, simple, predictable

**Cons:** Limited context; doesn't prioritize important information
Summary Memory¶
LLM-based compression of older messages:
**Pros:** Very long effective context; preserves key points

**Cons:** Loses exact wording of old messages
Uses the bot's LLM by default. For a dedicated summarization model:
"memory": {
"type": "summary",
"recent_window": 10,
"llm": {
"provider": "ollama",
"model": "gemma3:1b" # Lightweight model for summaries
}
}
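The compression idea can be illustrated with a short sketch. The `summarize` function below is a hypothetical stand-in for the LLM call a real summary memory would make; only the windowing logic is the point:

```python
def summarize(messages: list[str]) -> str:
    # Stand-in for an LLM call; a real summary memory would prompt the model here
    return f"[summary of {len(messages)} earlier messages]"


def build_context(history: list[str], recent_window: int = 10) -> list[str]:
    """Keep the last recent_window messages verbatim; compress the rest."""
    if len(history) <= recent_window:
        return list(history)
    older, recent = history[:-recent_window], history[-recent_window:]
    return [summarize(older)] + recent


history = [f"msg {i}" for i in range(25)]
ctx = build_context(history, recent_window=10)
print(ctx[0])    # → [summary of 15 earlier messages]
print(len(ctx))  # → 11
```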
Vector Memory¶
Semantic search over conversation history:
"memory": {
"type": "vector",
"max_messages": 100,
"top_k": 5, # Retrieve 5 most relevant messages
"embedding_provider": "ollama",
"embedding_model": "nomic-embed-text"
}
**Pros:** Finds relevant messages regardless of recency

**Cons:** Slower; requires an embedding model
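The retrieval step amounts to ranking stored messages by embedding similarity and keeping the `top_k`. A minimal sketch with toy 2-d vectors standing in for real `nomic-embed-text` embeddings (not the library's implementation):

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_top_k(query_vec: list[float], stored: list[dict], k: int = 2) -> list[str]:
    """Return the k stored messages most similar to the query embedding."""
    ranked = sorted(stored, key=lambda m: cosine(query_vec, m["vec"]), reverse=True)
    return [m["text"] for m in ranked[:k]]


# Toy "embeddings"; a real vector memory would embed each message on write
stored = [
    {"text": "I love sci-fi books", "vec": [1.0, 0.1]},
    {"text": "The weather is nice", "vec": [0.0, 1.0]},
    {"text": "Recommend a sci-fi novel", "vec": [0.9, 0.2]},
]

print(retrieve_top_k([1.0, 0.0], stored, k=2))
# → ['I love sci-fi books', 'Recommend a sci-fi novel']
```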
Tenant scoping: Use default_metadata and default_filter for multi-tenant isolation:
"memory": {
"type": "vector",
"backend": "pgvector",
"dimension": 768,
"embedding_provider": "ollama",
"embedding_model": "nomic-embed-text",
"default_metadata": {"user_id": "u123"}, # Tagged on writes
"default_filter": {"user_id": "u123"}, # Scoped on reads
}
Composite Memory¶
Combine multiple strategies for best-of-both-worlds context:
"memory": {
"type": "composite",
"primary": 0, # Index of primary strategy
"strategies": [
{
"type": "summary",
"recent_window": 10
},
{
"type": "vector",
"backend": "memory",
"dimension": 384,
"embedding_provider": "ollama",
"embedding_model": "nomic-embed-text"
}
]
}
All strategies receive every message. On read, primary results appear first, then deduplicated secondary results. If any strategy fails, the composite continues with the remaining ones.
**Pros:** Combines recent-context awareness with semantic recall

**Cons:** Uses more resources (multiple stores, possible embedding calls)
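The read-path merge (primary results first, then deduplicated secondary results) can be sketched as a simple ordered union. This illustrates the behavior described above, not the library's actual code:

```python
def merge_results(primary: list[str], secondary: list[str]) -> list[str]:
    """Primary results first, then secondary results not already present."""
    seen = set(primary)
    merged = list(primary)
    for msg in secondary:
        if msg not in seen:
            seen.add(msg)
            merged.append(msg)
    return merged


recent = ["msg 8", "msg 9", "msg 10"]   # e.g. from the summary strategy
semantic = ["msg 2", "msg 9", "msg 5"]  # e.g. from the vector strategy

print(merge_results(recent, semantic))
# → ['msg 8', 'msg 9', 'msg 10', 'msg 2', 'msg 5']
```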
Choosing max_messages¶
| max_messages | Use Case | Token Usage |
|---|---|---|
| 5-10 | Short conversations | Low |
| 10-20 | Standard conversations | Medium |
| 20-50 | Long conversations | High |
| 50+ | Document-length conversations | Very High |
Recommendation: Start with 10-20 for most use cases.
Key Takeaways¶
- ✅ Context Awareness - Bot remembers conversation history
- ✅ Easy Configuration - Just add a `memory` section
- ✅ Sliding Window - Automatic management of context size
- ✅ Token Efficiency - Only recent messages included
Customization¶
Longer Memory¶
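For longer memory, the buffer configuration presumably just needs a larger `max_messages` (a sketch based on the buffer options shown earlier; mind the token cost):

```python
"memory": {
    "type": "buffer",
    "max_messages": 50  # Keep more history at the cost of more tokens per request
}
```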
Summary Memory¶
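A minimal summary-memory configuration, mirroring the options shown in the Memory Types section above:

```python
"memory": {
    "type": "summary",
    "recent_window": 10  # Messages kept verbatim; older ones are summarized
}
```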
Semantic Memory¶
"memory": {
"type": "vector",
"max_messages": 100,
"embedding_provider": "ollama",
"embedding_model": "nomic-embed-text"
}
Composite Memory (Summary + Vector)¶
"memory": {
"type": "composite",
"strategies": [
{"type": "summary", "recent_window": 10},
{
"type": "vector",
"backend": "memory",
"dimension": 384,
"embedding_provider": "ollama",
"embedding_model": "nomic-embed-text"
}
]
}
What's Next?¶
To add knowledge retrieval, see the RAG Chatbot Example.
Related Examples¶
- Simple Chatbot - Basic bot without memory
- RAG Chatbot - Add knowledge base
- Multi-Tenant Bot - Multiple clients