
dataknobs-bots Complete API Reference

Complete auto-generated API documentation from source code docstrings.

💡 Also see:

- Curated API Guide - Hand-crafted tutorials and examples
- Package Overview - Introduction and getting started
- Source Code - View on GitHub


dataknobs_bots

DataKnobs Bots - Configuration-driven AI agents.

Modules:

Name Description
api

FastAPI integration components for dataknobs_bots.

artifacts

Artifact management for conversational workflows.

bot

Bot core components.

config

Configuration utilities for DynaBot.

context

Context management for conversational workflows.

generators

Deterministic content generators for structured output production.

knowledge

Knowledge base implementations for DynaBot.

memory

Memory implementations for DynaBot.

middleware

Middleware components for bot request/response lifecycle.

providers

Provider creation utilities for dataknobs-bots.

reasoning

Reasoning strategies for DynaBot.

registry

Registry module for bot registration storage and management.

review

Review system for validating artifacts.

rubrics

Rubric-based evaluation system for structured content assessment.

testing

Testing utilities for dataknobs-bots.

tools

Tools for DynaBot.

utils

Utility functions and helpers for the dataknobs_bots package.

Classes:

Name Description
BotContext

Runtime context for bot execution.

BotManager

Manages multiple DynaBot instances for multi-tenancy.

BotRegistry

Multi-tenant bot registry with caching and environment support.

DynaBot

Configuration-driven chatbot leveraging the DataKnobs ecosystem.

UndoResult

Result of an undo operation.

ConfigDraftManager

File-based draft manager for interactive config creation.

ConfigTemplate

A reusable DynaBot configuration template.

ConfigTemplateRegistry

Registry for managing and applying configuration templates.

ConfigValidator

Pluggable validation engine for DynaBot configurations.

DraftMetadata

Metadata for a configuration draft.

DynaBotConfigBuilder

Fluent builder for DynaBot configurations.

DynaBotConfigSchema

Queryable registry of valid DynaBot configuration options.

TemplateVariable

Definition of a template variable.

ToolCatalog

Registry mapping tool names to class paths and default configuration.

ToolEntry

Metadata for a tool in the catalog.

ValidationResult

Result of validating a configuration.

RAGKnowledgeBase

RAG knowledge base using dataknobs-xization for chunking and vector search.

BufferMemory

Simple buffer memory keeping last N messages.

CompositeMemory

Combines multiple memory strategies into one.

Memory

Abstract base class for memory implementations.

SummaryMemory

Memory that summarizes older messages to maintain long context windows.

VectorMemory

Vector-based semantic memory using dataknobs-data vector stores.

CostTrackingMiddleware

Middleware for tracking LLM API costs and usage.

LoggingMiddleware

Middleware for tracking conversation interactions.

Middleware

Base class for bot middleware.

ReActReasoning

ReAct (Reasoning + Acting) strategy.

ReasoningStrategy

Abstract base class for reasoning strategies.

SimpleReasoning

Simple reasoning strategy that makes direct LLM calls.

StrategyCapabilities

Declares what a reasoning strategy manages autonomously.

StrategyRegistry

Registry mapping strategy names to their factories.

BotTestHarness

High-level test helper for ALL DynaBot behavioral tests.

CaptureReplay

Loads a capture JSON file and creates pre-loaded EchoProviders.

TurnResult

Result of a single bot.chat() or bot.greet() turn.

WizardConfigBuilder

Fluent builder for wizard configuration dicts.

AddKBResourceTool

Tool for adding a resource to the knowledge base resource list.

CheckKnowledgeSourceTool

Tool for verifying a knowledge source directory exists and has content.

GetTemplateDetailsTool

Tool for getting detailed information about a template.

IngestKnowledgeBaseTool

Tool for writing the KB ingestion manifest and finalizing KB config.

KnowledgeSearchTool

Tool for searching the knowledge base.

ListAvailableToolsTool

Tool for listing tools available to configure for a bot.

ListKBResourcesTool

Tool for listing currently tracked knowledge base resources.

ListTemplatesTool

Tool for listing available configuration templates.

PreviewConfigTool

Tool for previewing the configuration being built.

RemoveKBResourceTool

Tool for removing a resource from the knowledge base resource list.

SaveConfigTool

Tool for saving/finalizing the configuration.

Functions:

Name Description
normalize_wizard_state

Normalize wizard metadata to canonical structure.

create_default_catalog

Create a new ToolCatalog pre-populated with built-in tools.

create_knowledge_base_from_config

Create knowledge base from configuration.

create_memory_from_config

Create memory instance from configuration.

create_reasoning_from_config

Create reasoning strategy from configuration.

register_strategy

Register a custom reasoning strategy.

inject_providers

Inject LLM providers into a DynaBot instance for testing.

Attributes:

Name Type Description
default_catalog ToolCatalog

Module-level singleton catalog pre-populated with built-in tools.

Attributes

default_catalog module-attribute

default_catalog: ToolCatalog = ToolCatalog()

Module-level singleton catalog pre-populated with built-in tools.
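A module-level singleton of this shape is typically just a bare instantiation at import time, so every importer shares one pre-populated instance. The sketch below illustrates the pattern with a hypothetical `MiniCatalog` (not the real `ToolCatalog` API; the registered class path is illustrative only):

```python
class MiniCatalog:
    """Hypothetical stand-in for a catalog mapping tool names to class paths."""

    def __init__(self) -> None:
        self._entries: dict[str, str] = {}

    def register(self, name: str, class_path: str) -> None:
        # Later registrations with the same name overwrite earlier ones
        self._entries[name] = class_path

    def list_tools(self) -> list[str]:
        return sorted(self._entries)


# Module-level singleton: created once at import, shared by all importers
default_catalog = MiniCatalog()
default_catalog.register("knowledge_search", "dataknobs_bots.tools.KnowledgeSearchTool")
print(default_catalog.list_tools())
```

Because the instance is created at import time, any module that does `from ... import default_catalog` sees the same registrations; `create_default_catalog` exists for callers who want a fresh, independent catalog instead.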

Classes

BotContext dataclass

BotContext(
    conversation_id: str,
    client_id: str,
    user_id: str | None = None,
    session_metadata: dict[str, Any] = dict(),
    request_metadata: dict[str, Any] = dict(),
)

Runtime context for bot execution.

Supports dict-like access for dynamic attributes via request_metadata. Use context["key"] or context.get("key") for dynamic data.
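The dict-like access pattern described above can be sketched with a minimal stand-in dataclass (a simplified illustration of the pattern, not the actual BotContext implementation):

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class MiniContext:
    """Simplified stand-in showing BotContext's dict-like access pattern."""

    conversation_id: str
    client_id: str
    request_metadata: dict[str, Any] = field(default_factory=dict)

    def __getitem__(self, key: str) -> Any:
        # Delegates to request_metadata; raises KeyError if absent
        return self.request_metadata[key]

    def __setitem__(self, key: str, value: Any) -> None:
        self.request_metadata[key] = value

    def __contains__(self, key: str) -> bool:
        return key in self.request_metadata

    def get(self, key: str, default: Any = None) -> Any:
        return self.request_metadata.get(key, default)


ctx = MiniContext(conversation_id="conv-1", client_id="client-1")
ctx["locale"] = "en-US"           # stored in request_metadata
print(ctx["locale"])              # dict-style read
print("locale" in ctx)            # membership check
print(ctx.get("missing", "n/a"))  # safe read with default
```

Note that fixed fields such as `conversation_id` remain ordinary attributes; only dynamic, per-request data flows through the `[]`/`get` interface backed by `request_metadata`.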

Attributes:

Name Type Description
conversation_id str

Unique identifier for the conversation

client_id str

Identifier for the client/tenant

user_id str | None

Optional user identifier

session_metadata dict[str, Any]

Metadata for the session

request_metadata dict[str, Any]

Metadata for the current request (also used for dict-like access)

Methods:

Name Description
__getitem__

Get item from request_metadata using dict-like access.

__setitem__

Set item in request_metadata using dict-like access.

__contains__

Check if key exists in request_metadata.

get

Get item from request_metadata with optional default.

copy

Create a copy of this context with optional field overrides.

Functions
__getitem__
__getitem__(key: str) -> Any

Get item from request_metadata using dict-like access.

Parameters:

Name Type Description Default
key str

Key to retrieve

required

Returns:

Type Description
Any

Value from request_metadata

Raises:

Type Description
KeyError

If key not found in request_metadata

Source code in packages/bots/src/dataknobs_bots/bot/context.py
def __getitem__(self, key: str) -> Any:
    """Get item from request_metadata using dict-like access.

    Args:
        key: Key to retrieve

    Returns:
        Value from request_metadata

    Raises:
        KeyError: If key not found in request_metadata
    """
    return self.request_metadata[key]
__setitem__
__setitem__(key: str, value: Any) -> None

Set item in request_metadata using dict-like access.

Parameters:

Name Type Description Default
key str

Key to set

required
value Any

Value to store

required
Source code in packages/bots/src/dataknobs_bots/bot/context.py
def __setitem__(self, key: str, value: Any) -> None:
    """Set item in request_metadata using dict-like access.

    Args:
        key: Key to set
        value: Value to store
    """
    self.request_metadata[key] = value
__contains__
__contains__(key: str) -> bool

Check if key exists in request_metadata.

Parameters:

Name Type Description Default
key str

Key to check

required

Returns:

Type Description
bool

True if key exists in request_metadata

Source code in packages/bots/src/dataknobs_bots/bot/context.py
def __contains__(self, key: str) -> bool:
    """Check if key exists in request_metadata.

    Args:
        key: Key to check

    Returns:
        True if key exists in request_metadata
    """
    return key in self.request_metadata
get
get(key: str, default: Any = None) -> Any

Get item from request_metadata with optional default.

Parameters:

Name Type Description Default
key str

Key to retrieve

required
default Any

Default value if key not found

None

Returns:

Type Description
Any

Value from request_metadata or default

Source code in packages/bots/src/dataknobs_bots/bot/context.py
def get(self, key: str, default: Any = None) -> Any:
    """Get item from request_metadata with optional default.

    Args:
        key: Key to retrieve
        default: Default value if key not found

    Returns:
        Value from request_metadata or default
    """
    return self.request_metadata.get(key, default)
copy
copy(**overrides: Any) -> BotContext

Create a copy of this context with optional field overrides.

Creates shallow copies of session_metadata and request_metadata dicts to avoid mutation issues between the original and copy.

Parameters:

Name Type Description Default
**overrides Any

Field values to override in the copy

{}

Returns:

Type Description
BotContext

New BotContext instance with copied values

Example

ctx = BotContext(conversation_id="conv-1", client_id="client-1")
ctx2 = ctx.copy(conversation_id="conv-2")
ctx2.conversation_id  # 'conv-2'

Source code in packages/bots/src/dataknobs_bots/bot/context.py
def copy(self, **overrides: Any) -> "BotContext":
    """Create a copy of this context with optional field overrides.

    Creates shallow copies of session_metadata and request_metadata dicts
    to avoid mutation issues between the original and copy.

    Args:
        **overrides: Field values to override in the copy

    Returns:
        New BotContext instance with copied values

    Example:
        >>> ctx = BotContext(conversation_id="conv-1", client_id="client-1")
        >>> ctx2 = ctx.copy(conversation_id="conv-2")
        >>> ctx2.conversation_id
        'conv-2'
    """
    return BotContext(
        conversation_id=overrides.get("conversation_id", self.conversation_id),
        client_id=overrides.get("client_id", self.client_id),
        user_id=overrides.get("user_id", self.user_id),
        session_metadata=overrides.get(
            "session_metadata", dict(self.session_metadata)
        ),
        request_metadata=overrides.get(
            "request_metadata", dict(self.request_metadata)
        ),
    )

BotManager

BotManager(
    config_loader: ConfigLoaderType | None = None,
    environment: EnvironmentConfig | str | None = None,
    env_dir: str | Path = "config/environments",
)

Manages multiple DynaBot instances for multi-tenancy.

Deprecated. Use BotRegistry or InMemoryBotRegistry instead.

BotManager handles:

- Bot instance creation and caching
- Client-level isolation
- Configuration loading and validation
- Bot lifecycle management
- Environment-aware resource resolution (optional)

Each client/tenant gets its own bot instance, which can serve multiple users. The underlying DynaBot architecture ensures conversation isolation through BotContext with different conversation_ids.

Attributes:

Name Type Description
bots

Cache of bot_id -> DynaBot instances

config_loader

Optional configuration loader (sync or async)

environment_name str | None

Current environment name (if environment-aware)

Example
# Basic usage with inline configuration
manager = BotManager()
bot = await manager.get_or_create("my-bot", config={
    "llm": {"provider": "openai", "model": "gpt-4o"},
    "conversation_storage": {"backend": "memory"},
})

# With environment-aware configuration
manager = BotManager(environment="production")
bot = await manager.get_or_create("my-bot", config={
    "bot": {
        "llm": {"$resource": "default", "type": "llm_providers"},
        "conversation_storage": {"$resource": "db", "type": "databases"},
    }
})

# With config loader function
def load_config(bot_id: str) -> dict:
    return load_yaml(f"configs/{bot_id}.yaml")

manager = BotManager(config_loader=load_config)
bot = await manager.get_or_create("my-bot")

# List active bots
active_bots = manager.list_bots()

Initialize BotManager.

Parameters:

Name Type Description Default
config_loader ConfigLoaderType | None

Optional configuration loader. Can be:

- An object with a .load(bot_id) method (sync or async)
- A callable function: bot_id -> config_dict (sync or async)
- None (configurations must be provided explicitly)

None
environment EnvironmentConfig | str | None

Environment name or EnvironmentConfig for resource resolution. If None, environment-aware features are disabled unless an EnvironmentAwareConfig is passed to get_or_create(). If a string, loads environment config from env_dir.

None
env_dir str | Path

Directory containing environment config files. Only used if environment is a string name.

'config/environments'

Methods:

Name Description
get_or_create

Get existing bot or create new one.

get

Get a bot without creating it if it doesn't exist.

remove

Remove bot instance.

reload

Reload bot instance with fresh configuration.

list_bots

List all active bot IDs.

get_bot_count

Get count of active bots.

clear_all

Clear all bot instances.

get_portable_config

Get portable configuration for storage.

__repr__

String representation.

Source code in packages/bots/src/dataknobs_bots/bot/manager.py
def __init__(
    self,
    config_loader: ConfigLoaderType | None = None,
    environment: EnvironmentConfig | str | None = None,
    env_dir: str | Path = "config/environments",
):
    """Initialize BotManager.

    Args:
        config_loader: Optional configuration loader.
            Can be:
            - An object with a `.load(bot_id)` method (sync or async)
            - A callable function: bot_id -> config_dict (sync or async)
            - None (configurations must be provided explicitly)
        environment: Environment name or EnvironmentConfig for resource resolution.
            If None, environment-aware features are disabled unless
            an EnvironmentAwareConfig is passed to get_or_create().
            If a string, loads environment config from env_dir.
        env_dir: Directory containing environment config files.
            Only used if environment is a string name.
    """
    warnings.warn(_DEPRECATION_MESSAGE, DeprecationWarning, stacklevel=2)

    self._bots: dict[str, DynaBot] = {}
    self._config_loader = config_loader
    self._env_dir = Path(env_dir)

    # Load environment config if specified
    self._environment: EnvironmentConfig | None = None
    if environment is not None:
        try:
            from dataknobs_config import EnvironmentConfig

            if isinstance(environment, str):
                self._environment = EnvironmentConfig.load(environment, env_dir)
            else:
                self._environment = environment
            logger.info(f"Initialized BotManager with environment: {self._environment.name}")
        except ImportError:
            logger.warning(
                "dataknobs_config not installed, environment-aware features disabled"
            )
    else:
        logger.info("Initialized BotManager")
Attributes
environment_name property
environment_name: str | None

Get current environment name, or None if not environment-aware.

environment property
environment: EnvironmentConfig | None

Get current environment config, or None if not environment-aware.

Functions
get_or_create async
get_or_create(
    bot_id: str,
    config: dict[str, Any] | EnvironmentAwareConfig | None = None,
    use_environment: bool | None = None,
    config_key: str = "bot",
) -> DynaBot

Get existing bot or create new one.

Parameters:

Name Type Description Default
bot_id str

Bot identifier (e.g., "customer-support", "sales-assistant")

required
config dict[str, Any] | EnvironmentAwareConfig | None

Optional bot configuration. Can be:

- dict with resolved values (traditional)
- dict with $resource references (requires environment)
- EnvironmentAwareConfig instance

If not provided and config_loader is set, the configuration will be loaded.

None
use_environment bool | None

Whether to use environment-aware resolution.

- True: Use environment for $resource resolution
- False: Use config as-is (no resolution)
- None (default): Auto-detect based on whether the manager has an environment configured or config is an EnvironmentAwareConfig

None
config_key str

Key within config containing bot configuration. Defaults to "bot". Set to None to use root config. Only used when use_environment is True.

'bot'

Returns:

Type Description
DynaBot

DynaBot instance

Raises:

Type Description
ValueError

If config is None and no config_loader is set

Example
# Traditional usage (no environment resolution)
manager = BotManager()
bot = await manager.get_or_create("support-bot", config={
    "llm": {"provider": "openai", "model": "gpt-4"},
    "conversation_storage": {"backend": "memory"},
})

# Environment-aware usage with $resource references
manager = BotManager(environment="production")
bot = await manager.get_or_create("support-bot", config={
    "bot": {
        "llm": {"$resource": "default", "type": "llm_providers"},
        "conversation_storage": {"$resource": "db", "type": "databases"},
    }
})

# Explicit environment resolution control
bot = await manager.get_or_create(
    "support-bot",
    config=my_config,
    use_environment=True,
    config_key="bot"
)
Source code in packages/bots/src/dataknobs_bots/bot/manager.py
async def get_or_create(
    self,
    bot_id: str,
    config: dict[str, Any] | EnvironmentAwareConfig | None = None,
    use_environment: bool | None = None,
    config_key: str = "bot",
) -> DynaBot:
    """Get existing bot or create new one.

    Args:
        bot_id: Bot identifier (e.g., "customer-support", "sales-assistant")
        config: Optional bot configuration. Can be:
            - dict with resolved values (traditional)
            - dict with $resource references (requires environment)
            - EnvironmentAwareConfig instance
            If not provided and config_loader is set, will load configuration.
        use_environment: Whether to use environment-aware resolution.
            - True: Use environment for $resource resolution
            - False: Use config as-is (no resolution)
            - None (default): Auto-detect based on whether manager has
              an environment configured or config is EnvironmentAwareConfig
        config_key: Key within config containing bot configuration.
                   Defaults to "bot". Set to None to use root config.
                   Only used when use_environment is True.

    Returns:
        DynaBot instance

    Raises:
        ValueError: If config is None and no config_loader is set

    Example:
        ```python
        # Traditional usage (no environment resolution)
        manager = BotManager()
        bot = await manager.get_or_create("support-bot", config={
            "llm": {"provider": "openai", "model": "gpt-4"},
            "conversation_storage": {"backend": "memory"},
        })

        # Environment-aware usage with $resource references
        manager = BotManager(environment="production")
        bot = await manager.get_or_create("support-bot", config={
            "bot": {
                "llm": {"$resource": "default", "type": "llm_providers"},
                "conversation_storage": {"$resource": "db", "type": "databases"},
            }
        })

        # Explicit environment resolution control
        bot = await manager.get_or_create(
            "support-bot",
            config=my_config,
            use_environment=True,
            config_key="bot"
        )
        ```
    """
    # Return cached bot if exists
    if bot_id in self._bots:
        logger.debug(f"Returning cached bot: {bot_id}")
        return self._bots[bot_id]

    # Load configuration if not provided
    if config is None:
        if self._config_loader is None:
            raise ValueError(
                f"No configuration provided for bot '{bot_id}' "
                "and no config_loader is set"
            )
        config = await self._load_config(bot_id)

    # Determine whether to use environment resolution
    is_env_aware_config = False
    try:
        from dataknobs_config import EnvironmentAwareConfig

        is_env_aware_config = isinstance(config, EnvironmentAwareConfig)
    except ImportError:
        pass

    should_use_environment = use_environment
    if should_use_environment is None:
        # Auto-detect: use environment if manager has one or config is EnvironmentAwareConfig
        should_use_environment = self._environment is not None or is_env_aware_config

    # Create new bot
    logger.info(f"Creating new bot: {bot_id} (environment_aware={should_use_environment})")

    if should_use_environment:
        bot = await DynaBot.from_environment_aware_config(
            config,
            environment=self._environment,
            env_dir=self._env_dir,
            config_key=config_key,
        )
    else:
        # Traditional path - use config as-is
        bot = await DynaBot.from_config(config)

    # Cache and return
    self._bots[bot_id] = bot
    return bot
get async
get(bot_id: str) -> DynaBot | None

Get a bot without creating it if it doesn't exist.

Parameters:

Name Type Description Default
bot_id str

Bot identifier

required

Returns:

Type Description
DynaBot | None

DynaBot instance if exists, None otherwise

Source code in packages/bots/src/dataknobs_bots/bot/manager.py
async def get(self, bot_id: str) -> DynaBot | None:
    """Get bot without creating if doesn't exist.

    Args:
        bot_id: Bot identifier

    Returns:
        DynaBot instance if exists, None otherwise
    """
    return self._bots.get(bot_id)
remove async
remove(bot_id: str) -> bool

Remove bot instance.

Parameters:

Name Type Description Default
bot_id str

Bot identifier

required

Returns:

Type Description
bool

True if bot was removed, False if didn't exist

Source code in packages/bots/src/dataknobs_bots/bot/manager.py
async def remove(self, bot_id: str) -> bool:
    """Remove bot instance.

    Args:
        bot_id: Bot identifier

    Returns:
        True if bot was removed, False if didn't exist
    """
    if bot_id in self._bots:
        logger.info(f"Removing bot: {bot_id}")
        del self._bots[bot_id]
        return True
    return False
reload async
reload(bot_id: str) -> DynaBot

Reload bot instance with fresh configuration.

Parameters:

Name Type Description Default
bot_id str

Bot identifier

required

Returns:

Type Description
DynaBot

New DynaBot instance

Raises:

Type Description
ValueError

If no config_loader is set

Source code in packages/bots/src/dataknobs_bots/bot/manager.py
async def reload(self, bot_id: str) -> DynaBot:
    """Reload bot instance with fresh configuration.

    Args:
        bot_id: Bot identifier

    Returns:
        New DynaBot instance

    Raises:
        ValueError: If no config_loader is set
    """
    if self._config_loader is None:
        raise ValueError("Cannot reload without config_loader")

    # Remove existing bot
    await self.remove(bot_id)

    # Create new one
    return await self.get_or_create(bot_id)
list_bots
list_bots() -> list[str]

List all active bot IDs.

Returns:

Type Description
list[str]

List of bot identifiers

Source code in packages/bots/src/dataknobs_bots/bot/manager.py
def list_bots(self) -> list[str]:
    """List all active bot IDs.

    Returns:
        List of bot identifiers
    """
    return list(self._bots.keys())
get_bot_count
get_bot_count() -> int

Get count of active bots.

Returns:

Type Description
int

Number of active bot instances

Source code in packages/bots/src/dataknobs_bots/bot/manager.py
def get_bot_count(self) -> int:
    """Get count of active bots.

    Returns:
        Number of active bot instances
    """
    return len(self._bots)
clear_all async
clear_all() -> None

Clear all bot instances.

Useful for testing or when restarting the service.

Source code in packages/bots/src/dataknobs_bots/bot/manager.py
async def clear_all(self) -> None:
    """Clear all bot instances.

    Useful for testing or when restarting the service.
    """
    logger.info("Clearing all bot instances")
    self._bots.clear()
get_portable_config
get_portable_config(
    config: dict[str, Any] | EnvironmentAwareConfig,
) -> dict[str, Any]

Get portable configuration for storage.

Extracts portable config (with $resource references intact, environment variables unresolved) suitable for storing in registries or databases.

Parameters:

Name Type Description Default
config dict[str, Any] | EnvironmentAwareConfig

Configuration to make portable. Can be dict or EnvironmentAwareConfig.

required

Returns:

Type Description
dict[str, Any]

Portable configuration dictionary

Example
manager = BotManager(environment="production")

# Get portable config from EnvironmentAwareConfig
portable = manager.get_portable_config(env_aware_config)

# Store in registry (portable across environments)
await registry.store(bot_id, portable)
Source code in packages/bots/src/dataknobs_bots/bot/manager.py
def get_portable_config(
    self,
    config: dict[str, Any] | EnvironmentAwareConfig,
) -> dict[str, Any]:
    """Get portable configuration for storage.

    Extracts portable config (with $resource references intact,
    environment variables unresolved) suitable for storing in
    registries or databases.

    Args:
        config: Configuration to make portable.
            Can be dict or EnvironmentAwareConfig.

    Returns:
        Portable configuration dictionary

    Example:
        ```python
        manager = BotManager(environment="production")

        # Get portable config from EnvironmentAwareConfig
        portable = manager.get_portable_config(env_aware_config)

        # Store in registry (portable across environments)
        await registry.store(bot_id, portable)
        ```
    """
    return DynaBot.get_portable_config(config)
__repr__
__repr__() -> str

String representation.

Source code in packages/bots/src/dataknobs_bots/bot/manager.py
def __repr__(self) -> str:
    """String representation."""
    bots = ", ".join(self._bots.keys())
    env = f", environment={self._environment.name!r}" if self._environment else ""
    return f"BotManager(bots=[{bots}], count={len(self._bots)}{env})"

BotRegistry

BotRegistry(
    backend: RegistryBackend | None = None,
    environment: EnvironmentConfig | str | None = None,
    env_dir: str | Path = "config/environments",
    cache_ttl: int = 300,
    max_cache_size: int = 1000,
    validate_on_register: bool = True,
    config_key: str = "bot",
)

Multi-tenant bot registry with caching and environment support.

The BotRegistry manages multiple bot instances for different clients/tenants. It provides:

- Pluggable storage backends via the RegistryBackend protocol
- Environment-aware configuration resolution
- Portability validation to ensure configs work across environments
- LRU-style caching with TTL for bot instances
- Thread-safe access

This enables:

- Multi-tenant SaaS platforms
- A/B testing with different bot configurations
- Horizontal scaling with stateless bot instances
- Cross-environment deployment with portable configs

Attributes:

Name Type Description
backend RegistryBackend

Storage backend for configurations

environment EnvironmentConfig | None

Environment for $resource resolution

cache_ttl int

Time-to-live for cached bots in seconds

max_cache_size int

Maximum number of bots to cache

Example
from dataknobs_bots.bot import BotRegistry
from dataknobs_bots.registry import InMemoryBackend

# Create registry
registry = BotRegistry(
    backend=InMemoryBackend(),
    environment="production",
    cache_ttl=300,
)
await registry.initialize()

# Register portable configuration
await registry.register("client-123", {
    "bot": {
        "llm": {"$resource": "default", "type": "llm_providers"},
    }
})

# Get bot for a client
bot = await registry.get_bot("client-123")

# Use the bot
response = await bot.chat(message, context)

Initialize bot registry.

Parameters:

Name Type Description Default
backend RegistryBackend | None

Storage backend for configurations. If None, uses InMemoryBackend.

None
environment EnvironmentConfig | str | None

Environment name or EnvironmentConfig for $resource resolution. If None, configs are used as-is without environment resolution.

None
env_dir str | Path

Directory containing environment config files. Only used if environment is a string name.

'config/environments'
cache_ttl int

Cache time-to-live in seconds (default: 300)

300
max_cache_size int

Maximum cached bots (default: 1000)

1000
validate_on_register bool

If True, validate config portability when registering (default: True)

True
config_key str

Key within config containing bot configuration. Defaults to "bot". Used during environment resolution.

'bot'

Methods:

Name Description
initialize

Initialize the registry and backend.

close

Close the registry and backend.

register

Register or update a bot configuration.

get_bot

Get bot instance for a client.

get_config

Get stored configuration for a bot.

get_registration

Get full registration including metadata.

unregister

Remove a bot registration (hard delete).

deactivate

Deactivate a bot registration (soft delete).

exists

Check if an active bot registration exists.

list_bots

List all active bot IDs.

count

Count active bot registrations.

get_cached_bots

Get list of currently cached bot IDs.

clear_cache

Clear all cached bot instances.

register_client

Register or update a client's bot configuration.

remove_client

Remove a client from the registry.

get_cached_clients

Get list of currently cached client IDs.

__repr__

String representation.

Source code in packages/bots/src/dataknobs_bots/bot/registry.py
def __init__(
    self,
    backend: RegistryBackend | None = None,
    environment: EnvironmentConfig | str | None = None,
    env_dir: str | Path = "config/environments",
    cache_ttl: int = 300,
    max_cache_size: int = 1000,
    validate_on_register: bool = True,
    config_key: str = "bot",
):
    """Initialize bot registry.

    Args:
        backend: Storage backend for configurations.
            If None, uses InMemoryBackend.
        environment: Environment name or EnvironmentConfig for
            $resource resolution. If None, configs are used as-is
            without environment resolution.
        env_dir: Directory containing environment config files.
            Only used if environment is a string name.
        cache_ttl: Cache time-to-live in seconds (default: 300)
        max_cache_size: Maximum cached bots (default: 1000)
        validate_on_register: If True, validate config portability
            when registering (default: True)
        config_key: Key within config containing bot configuration.
            Defaults to "bot". Used during environment resolution.
    """
    self._backend = backend or InMemoryBackend()
    self._env_dir = Path(env_dir)
    self._cache_ttl = cache_ttl
    self._max_cache_size = max_cache_size
    self._validate_on_register = validate_on_register
    self._config_key = config_key

    # Bot instance cache: bot_id -> (DynaBot, cached_timestamp)
    self._cache: dict[str, tuple[DynaBot, float]] = {}
    self._lock = asyncio.Lock()
    self._initialized = False

    # Load environment config if specified
    self._environment: EnvironmentConfig | None = None
    if environment is not None:
        try:
            from dataknobs_config import EnvironmentConfig as EnvConfig

            if isinstance(environment, str):
                self._environment = EnvConfig.load(environment, env_dir)
            else:
                self._environment = environment
            logger.info(f"BotRegistry using environment: {self._environment.name}")
        except ImportError:
            logger.warning(
                "dataknobs_config not installed, environment-aware features disabled"
            )
Attributes
backend property
backend: RegistryBackend

Get the storage backend.

environment property
environment: EnvironmentConfig | None

Get current environment config, or None if not environment-aware.

environment_name property
environment_name: str | None

Get current environment name, or None if not environment-aware.

cache_ttl property
cache_ttl: int

Get cache TTL in seconds.

max_cache_size property
max_cache_size: int

Get maximum cache size.

Functions
initialize async
initialize() -> None

Initialize the registry and backend.

Must be called before using the registry.

Source code in packages/bots/src/dataknobs_bots/bot/registry.py
async def initialize(self) -> None:
    """Initialize the registry and backend.

    Must be called before using the registry.
    """
    if not self._initialized:
        await self._backend.initialize()
        self._initialized = True
        logger.info("BotRegistry initialized")
close async
close() -> None

Close the registry and backend.

Closes all cached bot instances and the storage backend.

Source code in packages/bots/src/dataknobs_bots/bot/registry.py
async def close(self) -> None:
    """Close the registry and backend.

    Closes all cached bot instances and the storage backend.
    """
    async with self._lock:
        for bot_id, (bot, _) in self._cache.items():
            await self._close_bot(bot_id, bot)
        self._cache.clear()
    await self._backend.close()
    self._initialized = False
    logger.info("BotRegistry closed")
register async
register(
    bot_id: str,
    config: dict[str, Any],
    status: str = "active",
    skip_validation: bool = False,
) -> Registration

Register or update a bot configuration.

Stores a portable configuration in the backend. By default, validates that the configuration is portable (no resolved local values).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `bot_id` | `str` | Unique bot identifier | *required* |
| `config` | `dict[str, Any]` | Bot configuration dictionary (should be portable) | *required* |
| `status` | `str` | Registration status | `'active'` |
| `skip_validation` | `bool` | If True, skip portability validation | `False` |

Returns:

| Type | Description |
| --- | --- |
| `Registration` | Registration object with metadata |

Raises:

| Type | Description |
| --- | --- |
| `PortabilityError` | If config is not portable and validation is enabled |

Example
# Register with portable config
reg = await registry.register("support-bot", {
    "bot": {
        "llm": {"$resource": "default", "type": "llm_providers"},
    }
})
print(f"Registered at: {reg.created_at}")

# Update existing registration
reg = await registry.register("support-bot", new_config)
print(f"Updated at: {reg.updated_at}")
Source code in packages/bots/src/dataknobs_bots/bot/registry.py
async def register(
    self,
    bot_id: str,
    config: dict[str, Any],
    status: str = "active",
    skip_validation: bool = False,
) -> Registration:
    """Register or update a bot configuration.

    Stores a portable configuration in the backend. By default, validates
    that the configuration is portable (no resolved local values).

    Args:
        bot_id: Unique bot identifier
        config: Bot configuration dictionary (should be portable)
        status: Registration status (default: active)
        skip_validation: If True, skip portability validation

    Returns:
        Registration object with metadata

    Raises:
        PortabilityError: If config is not portable and validation is enabled

    Example:
        ```python
        # Register with portable config
        reg = await registry.register("support-bot", {
            "bot": {
                "llm": {"$resource": "default", "type": "llm_providers"},
            }
        })
        print(f"Registered at: {reg.created_at}")

        # Update existing registration
        reg = await registry.register("support-bot", new_config)
        print(f"Updated at: {reg.updated_at}")
        ```
    """
    # Validate portability if enabled
    if self._validate_on_register and not skip_validation:
        validate_portability(config)

    # Validate capability requirements if environment is available
    if self._validate_on_register and self._environment and not skip_validation:
        from .validation import validate_bot_capabilities

        # Extract the bot section if config_key is set
        bot_section = config.get(self._config_key, config) if self._config_key else config
        cap_warnings = validate_bot_capabilities(bot_section, self._environment)
        for warning in cap_warnings:
            logger.warning("Bot %s: %s", bot_id, warning)

    # Store in backend
    registration = await self._backend.register(bot_id, config, status)

    # Invalidate cache for this bot
    async with self._lock:
        if bot_id in self._cache:
            old_bot, _ = self._cache.pop(bot_id)
            await self._close_bot(bot_id, old_bot)
            logger.debug(f"Invalidated cache for bot: {bot_id}")

    logger.info(f"Registered bot: {bot_id}")
    return registration
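Portable configurations defer environment-specific values behind `$resource` references (as in the example above), which the registry resolves at bot-creation time. A small helper for inspecting which references a config will resolve — `find_resource_refs` is an illustrative sketch, not part of the library:

```python
def find_resource_refs(config, path=""):
    """Collect (path, resource-name) pairs for every {"$resource": ...}
    node in a nested config structure. Handy for auditing what a
    portable config will resolve against an environment at load time."""
    refs = []
    if isinstance(config, dict):
        if "$resource" in config:
            refs.append((path, config["$resource"]))
        for key, value in config.items():
            refs.extend(find_resource_refs(value, f"{path}.{key}".lstrip(".")))
    elif isinstance(config, list):
        for i, value in enumerate(config):
            refs.extend(find_resource_refs(value, f"{path}[{i}]"))
    return refs


cfg = {"bot": {"llm": {"$resource": "default", "type": "llm_providers"}}}
print(find_resource_refs(cfg))  # [('bot.llm', 'default')]
```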
get_bot async
get_bot(bot_id: str, force_refresh: bool = False) -> DynaBot

Get bot instance for a client.

Bots are cached for performance. If a cached bot exists and hasn't expired, it's returned. Otherwise, a new bot is created from the stored configuration with environment resolution applied.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `bot_id` | `str` | Bot identifier | *required* |
| `force_refresh` | `bool` | If True, bypass cache and create fresh bot | `False` |

Returns:

| Type | Description |
| --- | --- |
| `DynaBot` | DynaBot instance for the client |

Raises:

| Type | Description |
| --- | --- |
| `KeyError` | If no registration exists for the bot_id |
| `ValueError` | If bot configuration is invalid |

Example
# Get cached bot
bot = await registry.get_bot("client-123")

# Force refresh (e.g., after config change)
bot = await registry.get_bot("client-123", force_refresh=True)
Source code in packages/bots/src/dataknobs_bots/bot/registry.py
async def get_bot(
    self,
    bot_id: str,
    force_refresh: bool = False,
) -> DynaBot:
    """Get bot instance for a client.

    Bots are cached for performance. If a cached bot exists and hasn't
    expired, it's returned. Otherwise, a new bot is created from the
    stored configuration with environment resolution applied.

    Args:
        bot_id: Bot identifier
        force_refresh: If True, bypass cache and create fresh bot

    Returns:
        DynaBot instance for the client

    Raises:
        KeyError: If no registration exists for the bot_id
        ValueError: If bot configuration is invalid

    Example:
        ```python
        # Get cached bot
        bot = await registry.get_bot("client-123")

        # Force refresh (e.g., after config change)
        bot = await registry.get_bot("client-123", force_refresh=True)
        ```
    """
    async with self._lock:
        # Check cache
        if not force_refresh and bot_id in self._cache:
            bot, cached_at = self._cache[bot_id]
            if time.time() - cached_at < self._cache_ttl:
                logger.debug(f"Returning cached bot: {bot_id}")
                return bot

        # Close stale/replaced bot if present
        if bot_id in self._cache:
            old_bot, _ = self._cache.pop(bot_id)
            await self._close_bot(bot_id, old_bot)

        # Load configuration from backend
        config = await self._backend.get_config(bot_id)
        if config is None:
            raise KeyError(f"No bot configuration found for: {bot_id}")

        # Create bot with environment resolution if configured
        if self._environment is not None:
            logger.debug(f"Creating bot with environment resolution: {bot_id}")
            bot = await DynaBot.from_environment_aware_config(
                config,
                environment=self._environment,
                env_dir=self._env_dir,
                config_key=self._config_key,
            )
        else:
            # Traditional path - use config as-is
            # Extract bot config if wrapped in config_key
            bot_config = config.get(self._config_key, config)
            logger.debug(f"Creating bot without environment resolution: {bot_id}")
            bot = await DynaBot.from_config(bot_config)

        # Cache the bot
        self._cache[bot_id] = (bot, time.time())
        logger.info(f"Created bot: {bot_id}")

        # Evict old entries if cache is full
        if len(self._cache) > self._max_cache_size:
            await self._evict_oldest()

        return bot
get_config async
get_config(bot_id: str) -> dict[str, Any] | None

Get stored configuration for a bot.

Returns the portable configuration as stored, without environment resolution applied.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `bot_id` | `str` | Bot identifier | *required* |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any] \| None` | Configuration dict if found, None otherwise |

Source code in packages/bots/src/dataknobs_bots/bot/registry.py
async def get_config(self, bot_id: str) -> dict[str, Any] | None:
    """Get stored configuration for a bot.

    Returns the portable configuration as stored, without
    environment resolution applied.

    Args:
        bot_id: Bot identifier

    Returns:
        Configuration dict if found, None otherwise
    """
    return await self._backend.get_config(bot_id)
get_registration async
get_registration(bot_id: str) -> Registration | None

Get full registration including metadata.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `bot_id` | `str` | Bot identifier | *required* |

Returns:

| Type | Description |
| --- | --- |
| `Registration \| None` | Registration if found, None otherwise |

Source code in packages/bots/src/dataknobs_bots/bot/registry.py
async def get_registration(self, bot_id: str) -> Registration | None:
    """Get full registration including metadata.

    Args:
        bot_id: Bot identifier

    Returns:
        Registration if found, None otherwise
    """
    return await self._backend.get(bot_id)
unregister async
unregister(bot_id: str) -> bool

Remove a bot registration (hard delete).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `bot_id` | `str` | Bot identifier | *required* |

Returns:

| Type | Description |
| --- | --- |
| `bool` | True if removed, False if not found |

Source code in packages/bots/src/dataknobs_bots/bot/registry.py
async def unregister(self, bot_id: str) -> bool:
    """Remove a bot registration (hard delete).

    Args:
        bot_id: Bot identifier

    Returns:
        True if removed, False if not found
    """
    # Remove from cache
    async with self._lock:
        if bot_id in self._cache:
            old_bot, _ = self._cache.pop(bot_id)
            await self._close_bot(bot_id, old_bot)

    result = await self._backend.unregister(bot_id)
    if result:
        logger.info(f"Unregistered bot: {bot_id}")
    return result
deactivate async
deactivate(bot_id: str) -> bool

Deactivate a bot registration (soft delete).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `bot_id` | `str` | Bot identifier | *required* |

Returns:

| Type | Description |
| --- | --- |
| `bool` | True if deactivated, False if not found |

Source code in packages/bots/src/dataknobs_bots/bot/registry.py
async def deactivate(self, bot_id: str) -> bool:
    """Deactivate a bot registration (soft delete).

    Args:
        bot_id: Bot identifier

    Returns:
        True if deactivated, False if not found
    """
    # Remove from cache
    async with self._lock:
        if bot_id in self._cache:
            old_bot, _ = self._cache.pop(bot_id)
            await self._close_bot(bot_id, old_bot)

    result = await self._backend.deactivate(bot_id)
    if result:
        logger.info(f"Deactivated bot: {bot_id}")
    return result
exists async
exists(bot_id: str) -> bool

Check if an active bot registration exists.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `bot_id` | `str` | Bot identifier | *required* |

Returns:

| Type | Description |
| --- | --- |
| `bool` | True if registration exists and is active |

Source code in packages/bots/src/dataknobs_bots/bot/registry.py
async def exists(self, bot_id: str) -> bool:
    """Check if an active bot registration exists.

    Args:
        bot_id: Bot identifier

    Returns:
        True if registration exists and is active
    """
    return await self._backend.exists(bot_id)
list_bots async
list_bots() -> list[str]

List all active bot IDs.

Returns:

| Type | Description |
| --- | --- |
| `list[str]` | List of active bot identifiers |

Source code in packages/bots/src/dataknobs_bots/bot/registry.py
async def list_bots(self) -> list[str]:
    """List all active bot IDs.

    Returns:
        List of active bot identifiers
    """
    return await self._backend.list_ids()
count async
count() -> int

Count active bot registrations.

Returns:

| Type | Description |
| --- | --- |
| `int` | Number of active registrations |

Source code in packages/bots/src/dataknobs_bots/bot/registry.py
async def count(self) -> int:
    """Count active bot registrations.

    Returns:
        Number of active registrations
    """
    return await self._backend.count()
get_cached_bots
get_cached_bots() -> list[str]

Get list of currently cached bot IDs.

Returns:

| Type | Description |
| --- | --- |
| `list[str]` | List of bot IDs with cached instances |

Source code in packages/bots/src/dataknobs_bots/bot/registry.py
def get_cached_bots(self) -> list[str]:
    """Get list of currently cached bot IDs.

    Returns:
        List of bot IDs with cached instances
    """
    return list(self._cache.keys())
clear_cache async
clear_cache() -> None

Clear all cached bot instances.

Closes each bot before removing it from cache. Does not affect stored registrations.

Source code in packages/bots/src/dataknobs_bots/bot/registry.py
async def clear_cache(self) -> None:
    """Clear all cached bot instances.

    Closes each bot before removing it from cache.
    Does not affect stored registrations.
    """
    async with self._lock:
        for bot_id, (bot, _) in self._cache.items():
            await self._close_bot(bot_id, bot)
        self._cache.clear()
    logger.debug("Cleared bot cache")
register_client async
register_client(client_id: str, bot_config: dict[str, Any]) -> None

Register or update a client's bot configuration.

Deprecated: Use `register` instead.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `client_id` | `str` | Client/tenant identifier | *required* |
| `bot_config` | `dict[str, Any]` | Bot configuration dictionary | *required* |
Source code in packages/bots/src/dataknobs_bots/bot/registry.py
async def register_client(
    self, client_id: str, bot_config: dict[str, Any]
) -> None:
    """Register or update a client's bot configuration.

    .. deprecated::
        Use :meth:`register` instead.

    Args:
        client_id: Client/tenant identifier
        bot_config: Bot configuration dictionary
    """
    await self.register(client_id, bot_config)
remove_client async
remove_client(client_id: str) -> None

Remove a client from the registry.

Deprecated: Use `unregister` instead.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `client_id` | `str` | Client/tenant identifier | *required* |
Source code in packages/bots/src/dataknobs_bots/bot/registry.py
async def remove_client(self, client_id: str) -> None:
    """Remove a client from the registry.

    .. deprecated::
        Use :meth:`unregister` instead.

    Args:
        client_id: Client/tenant identifier
    """
    await self.unregister(client_id)
get_cached_clients
get_cached_clients() -> list[str]

Get list of currently cached client IDs.

Deprecated: Use `get_cached_bots` instead.

Returns:

| Type | Description |
| --- | --- |
| `list[str]` | List of client IDs with cached bots |

Source code in packages/bots/src/dataknobs_bots/bot/registry.py
def get_cached_clients(self) -> list[str]:
    """Get list of currently cached client IDs.

    .. deprecated::
        Use :meth:`get_cached_bots` instead.

    Returns:
        List of client IDs with cached bots
    """
    return self.get_cached_bots()
__repr__
__repr__() -> str

String representation.

Source code in packages/bots/src/dataknobs_bots/bot/registry.py
def __repr__(self) -> str:
    """String representation."""
    env = f", environment={self._environment.name!r}" if self._environment else ""
    return (
        f"BotRegistry(backend={self._backend!r}, "
        f"cached={len(self._cache)}{env})"
    )

DynaBot

DynaBot(
    llm: AsyncLLMProvider,
    prompt_builder: AsyncPromptBuilder,
    conversation_storage: ConversationStorage,
    tool_registry: ToolRegistry | None = None,
    memory: Memory | None = None,
    knowledge_base: KnowledgeBase | None = None,
    kb_auto_context: bool = True,
    reasoning_strategy: Any | None = None,
    middleware: list[Middleware] | None = None,
    system_prompt_name: str | None = None,
    system_prompt_content: str | None = None,
    system_prompt_rag_configs: list[dict[str, Any]] | None = None,
    default_temperature: float = 0.7,
    default_max_tokens: int = 1000,
    context_transform: Callable[[str], str] | None = None,
    max_tool_iterations: int = _DEFAULT_MAX_TOOL_ITERATIONS,
    tool_timeout: float = _DEFAULT_TOOL_TIMEOUT,
    tool_loop_timeout: float = _DEFAULT_TOOL_LOOP_TIMEOUT,
)

Configuration-driven chatbot leveraging the DataKnobs ecosystem.

DynaBot provides a flexible, configuration-driven bot that can be customized for different use cases through YAML/JSON configuration files.

Added in version 0.14.0: DynaBot-level tool execution loop — strategies that pass tools to the LLM but do not execute `tool_calls` themselves (e.g. `SimpleReasoning`) now have their tool calls executed automatically by the bot pipeline.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `llm` | | LLM provider for generating responses |
| `prompt_builder` | | Prompt builder for managing prompts |
| `conversation_storage` | | Storage backend for conversations |
| `tool_registry` | | Registry of available tools |
| `memory` | | Optional memory implementation for context |
| `knowledge_base` | | Optional knowledge base for RAG |
| `reasoning_strategy` | | Optional reasoning strategy |
| `middleware` | `list[Middleware]` | List of middleware for request/response processing |
| `system_prompt_name` | | Name of the system prompt template to use |
| `system_prompt_content` | | Inline system prompt content (alternative to name) |
| `system_prompt_rag_configs` | | RAG configurations for inline system prompts |
| `default_temperature` | | Default temperature for LLM generation |
| `default_max_tokens` | | Default max tokens for LLM generation |

Initialize DynaBot.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `llm` | `AsyncLLMProvider` | LLM provider instance | *required* |
| `prompt_builder` | `AsyncPromptBuilder` | Prompt builder instance | *required* |
| `conversation_storage` | `ConversationStorage` | Conversation storage backend | *required* |
| `tool_registry` | `ToolRegistry \| None` | Optional tool registry | `None` |
| `memory` | `Memory \| None` | Optional memory implementation | `None` |
| `knowledge_base` | `KnowledgeBase \| None` | Optional knowledge base | `None` |
| `kb_auto_context` | `bool` | Whether to auto-inject KB results into messages. When False, the KB is still available for tool-based access but not automatically queried on every message. | `True` |
| `reasoning_strategy` | `Any \| None` | Optional reasoning strategy | `None` |
| `middleware` | `list[Middleware] \| None` | Optional list of Middleware instances | `None` |
| `system_prompt_name` | `str \| None` | Name of system prompt template (mutually exclusive with content) | `None` |
| `system_prompt_content` | `str \| None` | Inline system prompt content (mutually exclusive with name) | `None` |
| `system_prompt_rag_configs` | `list[dict[str, Any]] \| None` | RAG configurations for inline system prompts | `None` |
| `default_temperature` | `float` | Default temperature (0-1) | `0.7` |
| `default_max_tokens` | `int` | Default max tokens to generate | `1000` |
| `context_transform` | `Callable[[str], str] \| None` | Optional callable applied to each content string (KB chunks, memory context) before it is injected into the prompt. Use this to sanitize or fence external content against prompt injection. | `None` |
| `max_tool_iterations` | `int` | Maximum number of tool execution rounds before returning. When a strategy returns a response with `tool_calls`, DynaBot executes the tools and re-generates. This cap prevents infinite loops when the model keeps requesting the same tools. | `_DEFAULT_MAX_TOOL_ITERATIONS` |
| `tool_timeout` | `float` | Per-tool execution timeout in seconds. If a single tool call exceeds this duration, it is cancelled and an error observation is recorded. | `_DEFAULT_TOOL_TIMEOUT` |
| `tool_loop_timeout` | `float` | Wall-clock budget in seconds for the tool execution loop (across all iterations). Checked at the start of each iteration and before each LLM re-call. For `chat()`, the LLM re-call is also bounded by the remaining budget via `asyncio.wait_for()`. For `stream_chat()`, a streaming re-call that starts within budget runs to completion (async generators cannot be reliably cancelled mid-chunk). Individual tool executions are always bounded by `tool_timeout`. | `_DEFAULT_TOOL_LOOP_TIMEOUT` |
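The interaction between the per-call `tool_timeout` and the wall-clock `tool_loop_timeout` can be illustrated with stdlib `asyncio` alone. This is a simplified standalone sketch of the pattern described above (`call_tool` and `tool_loop` are hypothetical stand-ins, not DynaBot internals):

```python
import asyncio
import time


async def call_tool(duration: float, tool_timeout: float) -> str:
    """Run one (fake) tool call, bounded by a per-call timeout."""
    try:
        await asyncio.wait_for(asyncio.sleep(duration), timeout=tool_timeout)
        return "ok"
    except asyncio.TimeoutError:
        return "error: tool call timed out"


async def tool_loop(durations, tool_timeout=0.05, loop_timeout=0.2):
    """Execute tool calls until done or the wall-clock budget is spent."""
    deadline = time.monotonic() + loop_timeout
    results = []
    for d in durations:
        if time.monotonic() >= deadline:  # budget check each iteration
            results.append("error: tool loop budget exhausted")
            break
        results.append(await call_tool(d, tool_timeout))
    return results


print(asyncio.run(tool_loop([0.01, 0.5, 0.01])))
# The slow call is cut at tool_timeout, but the loop keeps
# going while the overall budget remains.
```

As in DynaBot, a per-call timeout converts a hung tool into an error observation rather than aborting the whole turn, while the loop-level deadline caps total latency.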

Methods:

Name Description
register_provider

Register an auxiliary LLM/embedding provider by role.

get_provider

Get a registered provider by role.

from_config

Create DynaBot from configuration.

from_environment_aware_config

Create DynaBot with environment-aware configuration.

get_portable_config

Extract portable configuration for storage.

chat

Process a chat message.

greet

Generate a bot-initiated greeting before the user speaks.

stream_chat

Stream chat response token by token.

get_conversation

Retrieve conversation history.

clear_conversation

Clear a conversation's history.

get_wizard_state

Get current wizard state for a conversation.

close

Close the bot and clean up resources.

__aenter__

Async context manager entry.

__aexit__

Async context manager exit - ensures cleanup.

get_conversation_manager

Get a cached conversation manager by conversation ID.

undo_last_turn

Undo the last conversational turn (user message + bot response).

rewind_to_turn

Rewind conversation to after the given turn number.

Source code in packages/bots/src/dataknobs_bots/bot/base.py
def __init__(
    self,
    llm: AsyncLLMProvider,
    prompt_builder: AsyncPromptBuilder,
    conversation_storage: ConversationStorage,
    tool_registry: ToolRegistry | None = None,
    memory: Memory | None = None,
    knowledge_base: KnowledgeBase | None = None,
    kb_auto_context: bool = True,
    reasoning_strategy: Any | None = None,
    middleware: list[Middleware] | None = None,
    system_prompt_name: str | None = None,
    system_prompt_content: str | None = None,
    system_prompt_rag_configs: list[dict[str, Any]] | None = None,
    default_temperature: float = 0.7,
    default_max_tokens: int = 1000,
    context_transform: Callable[[str], str] | None = None,
    max_tool_iterations: int = _DEFAULT_MAX_TOOL_ITERATIONS,
    tool_timeout: float = _DEFAULT_TOOL_TIMEOUT,
    tool_loop_timeout: float = _DEFAULT_TOOL_LOOP_TIMEOUT,
):
    """Initialize DynaBot.

    Args:
        llm: LLM provider instance
        prompt_builder: Prompt builder instance
        conversation_storage: Conversation storage backend
        tool_registry: Optional tool registry
        memory: Optional memory implementation
        knowledge_base: Optional knowledge base
        kb_auto_context: Whether to auto-inject KB results into messages.
            When False, the KB is still available for tool-based access
            but not automatically queried on every message.
        reasoning_strategy: Optional reasoning strategy
        middleware: Optional list of Middleware instances
        system_prompt_name: Name of system prompt template (mutually exclusive with content)
        system_prompt_content: Inline system prompt content (mutually exclusive with name)
        system_prompt_rag_configs: RAG configurations for inline system prompts
        default_temperature: Default temperature (0-1)
        default_max_tokens: Default max tokens to generate
        context_transform: Optional callable applied to each content string
            (KB chunks, memory context) before it is injected into the
            prompt.  Use this to sanitize or fence external content
            against prompt injection.
        max_tool_iterations: Maximum number of tool execution rounds
            before returning.  When a strategy returns a response with
            ``tool_calls``, DynaBot executes the tools and re-generates.
            This cap prevents infinite loops when the model keeps
            requesting the same tools.
        tool_timeout: Per-tool execution timeout in seconds.  If a
            single tool call exceeds this duration, it is cancelled
            and an error observation is recorded.
        tool_loop_timeout: Wall-clock budget in seconds for the
            tool execution loop (across all iterations).  Checked
            at the start of each iteration and before each LLM
            re-call.  For ``chat()``, the LLM re-call is also
            bounded by the remaining budget via
            ``asyncio.wait_for()``.  For ``stream_chat()``, a
            streaming re-call that starts within budget runs to
            completion (async generators cannot be reliably
            cancelled mid-chunk).  Individual tool executions are always
            bounded by ``tool_timeout``.
    """
    self.llm = llm
    self.prompt_builder = prompt_builder
    self.conversation_storage = conversation_storage
    self.tool_registry = tool_registry or ToolRegistry()
    self.memory = memory
    self.knowledge_base = knowledge_base
    self._kb_auto_context = kb_auto_context
    self.reasoning_strategy = reasoning_strategy
    self.middleware: list[Middleware] = middleware or []
    self.system_prompt_name = system_prompt_name
    self.system_prompt_content = system_prompt_content
    self.system_prompt_rag_configs = system_prompt_rag_configs
    self.default_temperature = default_temperature
    self.default_max_tokens = default_max_tokens
    self._context_transform = context_transform
    self._max_tool_iterations = max_tool_iterations
    if tool_timeout < 0:
        raise ValueError(
            f"tool_timeout must be non-negative, got {tool_timeout}"
        )
    if tool_loop_timeout < 0:
        raise ValueError(
            f"tool_loop_timeout must be non-negative, got "
            f"{tool_loop_timeout}"
        )
    self._tool_timeout = tool_timeout
    self._tool_loop_timeout = tool_loop_timeout
    self._owns_llm = True  # Set False by from_config() when llm= injected
    self._conversation_managers: dict[str, ConversationManager] = {}
    self._turn_checkpoints: dict[str, list[tuple[str, int]]] = {}
    self._providers: dict[str, AsyncLLMProvider] = {}
Attributes
all_providers property
all_providers: dict[str, AsyncLLMProvider]

All registered providers keyed by role.

Always includes "main" (self.llm). Subsystems add their own entries during construction. Returns a fresh dict (snapshot) on each call.

Functions
register_provider
register_provider(role: str, provider: AsyncLLMProvider) -> None

Register an auxiliary LLM/embedding provider by role.

Providers registered here are included in all_providers for observability and enumeration. The registry is a catalog — it does not manage provider lifecycle. Each subsystem closes the providers it created (originator-owns-lifecycle).

The "main" role is reserved for self.llm and cannot be overwritten.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `role` | `str` | Unique role identifier (e.g. `"memory_embedding"`). | *required* |
| `provider` | `AsyncLLMProvider` | The provider instance. | *required* |
Source code in packages/bots/src/dataknobs_bots/bot/base.py
def register_provider(self, role: str, provider: AsyncLLMProvider) -> None:
    """Register an auxiliary LLM/embedding provider by role.

    Providers registered here are included in ``all_providers`` for
    observability and enumeration.  The registry is a catalog — it
    does not manage provider lifecycle.  Each subsystem closes the
    providers it created (originator-owns-lifecycle).

    The ``"main"`` role is reserved for ``self.llm`` and cannot be
    overwritten.

    Args:
        role: Unique role identifier (e.g. ``"memory_embedding"``).
        provider: The provider instance.
    """
    if role == PROVIDER_ROLE_MAIN:
        logger.warning(
            "Cannot register provider with reserved role %r — "
            "use the 'llm' constructor parameter instead",
            PROVIDER_ROLE_MAIN,
        )
        return
    self._providers[role] = provider
get_provider
get_provider(role: str) -> AsyncLLMProvider | None

Get a registered provider by role.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `role` | `str` | Provider role identifier. | *required* |

Returns:

| Type | Description |
| --- | --- |
| `AsyncLLMProvider \| None` | The provider, or None if not registered. |

Source code in packages/bots/src/dataknobs_bots/bot/base.py
def get_provider(self, role: str) -> AsyncLLMProvider | None:
    """Get a registered provider by role.

    Args:
        role: Provider role identifier.

    Returns:
        The provider, or ``None`` if not registered.
    """
    if role == PROVIDER_ROLE_MAIN:
        return self.llm
    return self._providers.get(role)
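The role-keyed catalog semantics above (reserved `"main"` role, fresh snapshot from `all_providers`) can be sketched in a few lines of plain Python. `ProviderCatalog` below is a hypothetical standalone illustration using plain objects in place of providers, not the library's class:

```python
class ProviderCatalog:
    """Sketch of a role-keyed provider catalog with a reserved 'main'
    role, mirroring register_provider/get_provider semantics."""

    MAIN = "main"

    def __init__(self, main_provider):
        self._main = main_provider
        self._providers = {}

    def register(self, role: str, provider) -> bool:
        if role == self.MAIN:
            return False  # reserved: the main provider is fixed at construction
        self._providers[role] = provider
        return True

    def get(self, role: str):
        if role == self.MAIN:
            return self._main
        return self._providers.get(role)

    @property
    def all_providers(self) -> dict:
        # Fresh snapshot on each call, always including "main"
        return {self.MAIN: self._main, **self._providers}


catalog = ProviderCatalog("main-llm")
catalog.register("memory_embedding", "embed-model")
print(catalog.get("main"))            # main-llm
print(catalog.register("main", "x"))  # False (reserved role)
```

Note the catalog holds references only: as documented above, lifecycle stays with whichever subsystem created each provider.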
from_config async classmethod
from_config(
    config: dict[str, Any],
    *,
    llm: AsyncLLMProvider | None = None,
    middleware: list[Middleware] | None = None,
) -> DynaBot

Create DynaBot from configuration.

Parameters:

- `config` (`dict[str, Any]`, *required*): Configuration dictionary containing:
    - `llm`: LLM configuration (provider, model, etc.). Optional when the `llm` kwarg is provided.
    - `conversation_storage`: Storage configuration. Two modes:
        - `backend`: Database backend key for the default DataknobsConversationStorage (e.g. `"memory"`, `"sqlite"`, `"postgres"`).
        - `storage_class`: Dotted import path to a custom ConversationStorage class (e.g. `"myapp.storage:AcmeStorage"`). The class must implement `ConversationStorage`, including the async `create(config)` classmethod.
    - `tools`: Optional list of tool configurations
    - `memory`: Optional memory configuration
    - `knowledge_base`: Optional knowledge base configuration
    - `reasoning`: Optional reasoning strategy configuration
    - `middleware`: Optional middleware configurations (ignored when the `middleware` kwarg is provided)
    - `prompts`: Optional prompts library (dict of name -> content)
    - `system_prompt`: Optional system prompt configuration (see below)
    - `config_base_path`: Optional base directory for resolving relative config file paths (e.g. wizard_config). When set, relative paths in nested configs are resolved against this directory instead of the current working directory.
- `llm` (`AsyncLLMProvider | None`, default `None`): Pre-built LLM provider. When provided, `config["llm"]` is optional and the provider is used as-is (no initialization or cleanup — the caller owns the lifecycle). Use this to share a single provider across multiple bot instances.
- `middleware` (`list[Middleware] | None`, default `None`): Pre-built middleware list. When provided, replaces any middleware defined in config.

Returns:

| Type | Description |
| --- | --- |
| `DynaBot` | Configured DynaBot instance |

System Prompt Formats

The `system_prompt` can be specified in multiple ways:

- **String**: Smart detection: if the string exists as a template name in the prompt library, it's used as a template reference; otherwise it's treated as inline content.
- **Dict with `name`**: `{"name": "template_name"}`: explicit template reference
- **Dict with `name` + `strict`**: `{"name": "template_name", "strict": true}`: raises an error if the template doesn't exist
- **Dict with `content`**: `{"content": "inline prompt text"}`: inline content
- **Dict with `content` + `rag_configs`**: inline content with RAG enhancement
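Written out as plain dicts, the accepted shapes look like this (template names, prompt text, and the `rag_configs` entry are illustrative values, not library defaults):

```python
# Each entry is a valid value for config["system_prompt"].
system_prompt_variants = [
    # String: template name if it exists in the prompt
    # library, otherwise treated as inline content
    "helpful_assistant",
    # Explicit template reference
    {"name": "helpful_assistant"},
    # Template reference that errors if the template is missing
    {"name": "helpful_assistant", "strict": True},
    # Inline content
    {"content": "You are a helpful bot."},
    # Inline content with RAG enhancement (rag_configs
    # shape is illustrative)
    {"content": "You are a helpful bot.",
     "rag_configs": [{"query": "style guide"}]},
]
```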
Example
bot = await DynaBot.from_config(config)

# With a shared provider
shared_llm = OllamaProvider({"provider": "ollama", "model": "llama3.2"})
await shared_llm.initialize()
bot = await DynaBot.from_config(
    {"conversation_storage": {"backend": "memory"}},
    llm=shared_llm,
)

# With pre-built middleware
bot = await DynaBot.from_config(config, middleware=[my_middleware])
Source code in packages/bots/src/dataknobs_bots/bot/base.py
@classmethod
async def from_config(
    cls,
    config: dict[str, Any],
    *,
    llm: AsyncLLMProvider | None = None,
    middleware: list[Middleware] | None = None,
) -> DynaBot:
    """Create DynaBot from configuration.

    Args:
        config: Configuration dictionary containing:
            - llm: LLM configuration (provider, model, etc.).
              Optional when the ``llm`` kwarg is provided.
            - conversation_storage: Storage configuration.  Two modes:
                - ``backend``: Database backend key for the default
                  DataknobsConversationStorage (e.g. ``"memory"``,
                  ``"sqlite"``, ``"postgres"``).
                - ``storage_class``: Dotted import path to a custom
                  ConversationStorage class (e.g.
                  ``"myapp.storage:AcmeStorage"``).  The class must
                  implement ``ConversationStorage`` including the
                  async ``create(config)`` classmethod.
            - tools: Optional list of tool configurations
            - memory: Optional memory configuration
            - knowledge_base: Optional knowledge base configuration
            - reasoning: Optional reasoning strategy configuration
            - middleware: Optional middleware configurations (ignored
              when the ``middleware`` kwarg is provided)
            - prompts: Optional prompts library (dict of name -> content)
            - system_prompt: Optional system prompt configuration (see below)
            - config_base_path: Optional base directory for resolving
              relative config file paths (e.g. wizard_config). When set,
              relative paths in nested configs are resolved against this
              directory instead of the current working directory.
        llm: Pre-built LLM provider.  When provided, ``config["llm"]``
            is optional and the provider is used as-is (no initialization
            or cleanup — the caller owns the lifecycle).  Use this to
            share a single provider across multiple bot instances.
        middleware: Pre-built middleware list.  When provided, replaces
            any middleware defined in config.

    Returns:
        Configured DynaBot instance

    System Prompt Formats:
        The system_prompt can be specified in multiple ways:

        - String: Smart detection - if the string exists as a template name
          in the prompt library, it's used as a template reference; otherwise
          it's treated as inline content.

        - Dict with name: `{"name": "template_name"}` - explicit template reference
        - Dict with name + strict: `{"name": "template_name", "strict": true}` -
          raises error if template doesn't exist
        - Dict with content: `{"content": "inline prompt text"}` - inline content
        - Dict with content + rag_configs: inline content with RAG enhancement

    Example:
        ```python
        bot = await DynaBot.from_config(config)

        # With a shared provider
        shared_llm = OllamaProvider({"provider": "ollama", "model": "llama3.2"})
        await shared_llm.initialize()
        bot = await DynaBot.from_config(
            {"conversation_storage": {"backend": "memory"}},
            llm=shared_llm,
        )

        # With pre-built middleware
        bot = await DynaBot.from_config(config, middleware=[my_middleware])
        ```
    """
    if llm is not None:
        # Caller-owned provider — skip creation/initialization.
        # Caller is responsible for lifecycle (initialize/close).
        llm_config = config.get("llm", {})
        bot = await cls._build_from_config(
            config, llm, llm_config, middleware_override=middleware
        )
        bot._owns_llm = False  # Caller owns lifecycle
        return bot

    # Create LLM provider from config
    llm_config = config["llm"]

    from dataknobs_llm.llm import LLMProviderFactory

    created_llm = LLMProviderFactory(is_async=True).create(llm_config)
    await created_llm.initialize()

    # Everything below can fail; ensure the provider is closed on error
    # so we don't leak aiohttp sessions or other resources.
    try:
        return await cls._build_from_config(
            config, created_llm, llm_config,
            middleware_override=middleware,
        )
    except Exception:
        await created_llm.close()
        raise
from_environment_aware_config async classmethod
from_environment_aware_config(
    config: EnvironmentAwareConfig | dict[str, Any],
    environment: EnvironmentConfig | str | None = None,
    env_dir: str | Path = "config/environments",
    config_key: str = "bot",
) -> DynaBot

Create DynaBot with environment-aware configuration.

This is the recommended entry point for environment-portable bots. Resource references ($resource) are resolved against the environment config, and environment variables are substituted at instantiation time (late binding).

Parameters:

- `config` (`EnvironmentAwareConfig | dict[str, Any]`, required): EnvironmentAwareConfig instance or dict with $resource references. If a dict, it will be wrapped in EnvironmentAwareConfig.
- `environment` (`EnvironmentConfig | str | None`, default `None`): Environment name or EnvironmentConfig instance. If None, auto-detects from the DATAKNOBS_ENVIRONMENT env var. Ignored if config is already an EnvironmentAwareConfig.
- `env_dir` (`str | Path`, default `'config/environments'`): Directory containing environment config files. Only used if environment is a string name.
- `config_key` (`str`, default `'bot'`): Key within config containing the bot configuration. Set to None to use the root config.

Returns:

`DynaBot`: Fully initialized DynaBot instance with resolved resources

Example
# With portable config dict
config = {
    "bot": {
        "llm": {
            "$resource": "default",
            "type": "llm_providers",
            "temperature": 0.7,
        },
        "conversation_storage": {
            "$resource": "conversations",
            "type": "databases",
        },
    }
}
bot = await DynaBot.from_environment_aware_config(config)

# With explicit environment
bot = await DynaBot.from_environment_aware_config(
    config,
    environment="production",
    env_dir="configs/environments"
)

# With EnvironmentAwareConfig instance
from dataknobs_config import EnvironmentAwareConfig
env_config = EnvironmentAwareConfig.load_app("my-bot", ...)
bot = await DynaBot.from_environment_aware_config(env_config)
Note

The config should use $resource references for infrastructure:

bot:
  llm:
    $resource: default      # Logical name
    type: llm_providers     # Resource type
    temperature: 0.7        # Behavioral param (portable)

The environment config provides concrete bindings:

resources:
  llm_providers:
    default:
      provider: openai
      model: gpt-4
      api_key: ${OPENAI_API_KEY}
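
The ${OPENAI_API_KEY} placeholder above is substituted at instantiation time. A minimal sketch of such late-binding substitution (an illustration of the concept, not the library's actual implementation):

```python
import os
import re

def substitute_env(value: str) -> str:
    """Replace ${VAR} placeholders with current environment values."""
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: os.environ.get(m.group(1), ""),
        value,
    )

# Late binding: the value is read when the bot is built,
# not when the config file was written.
os.environ["OPENAI_API_KEY"] = "sk-demo"
resolved = substitute_env("api_key: ${OPENAI_API_KEY}")
# resolved == "api_key: sk-demo"
```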

Source code in packages/bots/src/dataknobs_bots/bot/base.py
@classmethod
async def from_environment_aware_config(
    cls,
    config: EnvironmentAwareConfig | dict[str, Any],
    environment: EnvironmentConfig | str | None = None,
    env_dir: str | Path = "config/environments",
    config_key: str = "bot",
) -> DynaBot:
    """Create DynaBot with environment-aware configuration.

    This is the recommended entry point for environment-portable bots.
    Resource references ($resource) are resolved against the environment
    config, and environment variables are substituted at instantiation time
    (late binding).

    Args:
        config: EnvironmentAwareConfig instance or dict with $resource references.
               If dict, will be wrapped in EnvironmentAwareConfig.
        environment: Environment name or EnvironmentConfig instance.
                    If None, auto-detects from DATAKNOBS_ENVIRONMENT env var.
                    Ignored if config is already an EnvironmentAwareConfig.
        env_dir: Directory containing environment config files.
                Only used if environment is a string name.
        config_key: Key within config containing bot configuration.
                   Defaults to "bot". Set to None to use root config.

    Returns:
        Fully initialized DynaBot instance with resolved resources

    Example:
        ```python
        # With portable config dict
        config = {
            "bot": {
                "llm": {
                    "$resource": "default",
                    "type": "llm_providers",
                    "temperature": 0.7,
                },
                "conversation_storage": {
                    "$resource": "conversations",
                    "type": "databases",
                },
            }
        }
        bot = await DynaBot.from_environment_aware_config(config)

        # With explicit environment
        bot = await DynaBot.from_environment_aware_config(
            config,
            environment="production",
            env_dir="configs/environments"
        )

        # With EnvironmentAwareConfig instance
        from dataknobs_config import EnvironmentAwareConfig
        env_config = EnvironmentAwareConfig.load_app("my-bot", ...)
        bot = await DynaBot.from_environment_aware_config(env_config)
        ```

    Note:
        The config should use $resource references for infrastructure:
        ```yaml
        bot:
          llm:
            $resource: default      # Logical name
            type: llm_providers     # Resource type
            temperature: 0.7        # Behavioral param (portable)
        ```

        The environment config provides concrete bindings:
        ```yaml
        resources:
          llm_providers:
            default:
              provider: openai
              model: gpt-4
              api_key: ${OPENAI_API_KEY}
        ```
    """
    from dataknobs_config import EnvironmentAwareConfig, EnvironmentConfig

    # Wrap dict in EnvironmentAwareConfig if needed
    if isinstance(config, dict):
        # Load or use provided environment
        if isinstance(environment, EnvironmentConfig):
            env_config = environment
        else:
            env_config = EnvironmentConfig.load(environment, env_dir)

        config = EnvironmentAwareConfig(
            config=config,
            environment=env_config,
        )
    elif environment is not None:
        # Switch environment on existing EnvironmentAwareConfig
        config = config.with_environment(environment, env_dir)

    # Resolve resources and env vars (late binding happens here)
    if config_key:
        resolved = config.resolve_for_build(config_key)
    else:
        resolved = config.resolve_for_build()

    # Delegate to existing from_config
    return await cls.from_config(resolved)
get_portable_config staticmethod
get_portable_config(
    config: EnvironmentAwareConfig | dict[str, Any],
) -> dict[str, Any]

Extract portable configuration for storage.

Returns configuration with $resource references intact and environment variables unresolved. This is the config that should be stored in registries or databases for cross-environment portability.

Parameters:

- `config` (`EnvironmentAwareConfig | dict[str, Any]`, required): EnvironmentAwareConfig instance or portable dict

Returns:

`dict[str, Any]`: Portable configuration dictionary

Example
from dataknobs_config import EnvironmentAwareConfig

# From EnvironmentAwareConfig
env_config = EnvironmentAwareConfig.load_app("my-bot", ...)
portable = DynaBot.get_portable_config(env_config)

# Store portable config in registry
await registry.store(bot_id, portable)

# Dict passes through unchanged
portable = DynaBot.get_portable_config({"bot": {...}})
Source code in packages/bots/src/dataknobs_bots/bot/base.py
@staticmethod
def get_portable_config(
    config: EnvironmentAwareConfig | dict[str, Any],
) -> dict[str, Any]:
    """Extract portable configuration for storage.

    Returns configuration with $resource references intact
    and environment variables unresolved. This is the config
    that should be stored in registries or databases for
    cross-environment portability.

    Args:
        config: EnvironmentAwareConfig instance or portable dict

    Returns:
        Portable configuration dictionary

    Example:
        ```python
        from dataknobs_config import EnvironmentAwareConfig

        # From EnvironmentAwareConfig
        env_config = EnvironmentAwareConfig.load_app("my-bot", ...)
        portable = DynaBot.get_portable_config(env_config)

        # Store portable config in registry
        await registry.store(bot_id, portable)

        # Dict passes through unchanged
        portable = DynaBot.get_portable_config({"bot": {...}})
        ```
    """
    # Import here to avoid circular dependency at module level
    try:
        from dataknobs_config import EnvironmentAwareConfig

        if isinstance(config, EnvironmentAwareConfig):
            return config.get_portable_config()
    except ImportError:
        pass

    # Dict passes through (assumed already portable)
    return config
chat async
chat(
    message: str,
    context: BotContext,
    temperature: float | None = None,
    max_tokens: int | None = None,
    rag_query: str | None = None,
    llm_config_overrides: dict[str, Any] | None = None,
    plugin_data: dict[str, Any] | None = None,
    **kwargs: Any,
) -> str

Process a chat message.

Parameters:

- `message` (`str`, required): User message to process
- `context` (`BotContext`, required): Bot execution context
- `temperature` (`float | None`, default `None`): Optional temperature override
- `max_tokens` (`int | None`, default `None`): Optional max tokens override
- `rag_query` (`str | None`, default `None`): Optional explicit query for knowledge base retrieval. If provided, this is used instead of the message for RAG. Useful when the message contains literal text to analyze (e.g., "Analyze this prompt: [prompt text]") but you want to search for analysis techniques instead.
- `llm_config_overrides` (`dict[str, Any] | None`, default `None`): Optional dict to override LLM config fields for this request only. Supported fields: model, temperature, max_tokens, top_p, stop_sequences, seed, options.
- `plugin_data` (`dict[str, Any] | None`, default `None`): Optional dict to seed turn.plugin_data before middleware runs. Enables caller-managed lifecycle patterns (e.g., passing a DB session handle that middleware can use and finally_turn can close).
- `**kwargs` (`Any`, default `{}`): Additional arguments

Returns:

`str`: Bot response as string

Example
context = BotContext(
    conversation_id="conv-123",
    client_id="client-456",
    user_id="user-789"
)
response = await bot.chat("Hello!", context)

# With explicit RAG query
response = await bot.chat(
    "Analyze this: Write a poem about cats",
    context,
    rag_query="prompt analysis techniques evaluation"
)

# With LLM config overrides (switch model per-request)
response = await bot.chat(
    "Explain quantum computing",
    context,
    llm_config_overrides={"model": "gpt-4-turbo", "temperature": 0.9}
)
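The plugin_data lifecycle pattern above can be sketched with a hypothetical middleware; the class and session names here are illustrative, not part of the package:

```python
import asyncio
from types import SimpleNamespace

class SessionCleanupMiddleware:
    """Hypothetical middleware: closes a caller-provided DB session."""

    async def finally_turn(self, turn):
        # finally_turn fires on success and on error, so the
        # caller-owned handle is always released.
        session = turn.plugin_data.pop("db_session", None)
        if session is not None:
            await session.close()

class FakeSession:
    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True

async def demo() -> bool:
    session = FakeSession()
    # Stand-in for the TurnState seeded via plugin_data={"db_session": ...}
    turn = SimpleNamespace(plugin_data={"db_session": session})
    await SessionCleanupMiddleware().finally_turn(turn)
    return session.closed
```

In practice the caller would pass `plugin_data={"db_session": session}` to `bot.chat(...)` and register the middleware on the bot, rather than invoking the hook directly.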
Source code in packages/bots/src/dataknobs_bots/bot/base.py
async def chat(
    self,
    message: str,
    context: BotContext,
    temperature: float | None = None,
    max_tokens: int | None = None,
    rag_query: str | None = None,
    llm_config_overrides: dict[str, Any] | None = None,
    plugin_data: dict[str, Any] | None = None,
    **kwargs: Any,
) -> str:
    """Process a chat message.

    Args:
        message: User message to process
        context: Bot execution context
        temperature: Optional temperature override
        max_tokens: Optional max tokens override
        rag_query: Optional explicit query for knowledge base retrieval.
                  If provided, this is used instead of the message for RAG.
                  Useful when the message contains literal text to analyze
                  (e.g., "Analyze this prompt: [prompt text]") but you want
                  to search for analysis techniques instead.
        llm_config_overrides: Optional dict to override LLM config fields
                  for this request only. Supported fields: model, temperature,
                  max_tokens, top_p, stop_sequences, seed, options.
        plugin_data: Optional dict to seed ``turn.plugin_data`` before
                  middleware runs.  Enables caller-managed lifecycle
                  patterns (e.g., passing a DB session handle that
                  middleware can use and ``finally_turn`` can close).
        **kwargs: Additional arguments

    Returns:
        Bot response as string

    Example:
        ```python
        context = BotContext(
            conversation_id="conv-123",
            client_id="client-456",
            user_id="user-789"
        )
        response = await bot.chat("Hello!", context)

        # With explicit RAG query
        response = await bot.chat(
            "Analyze this: Write a poem about cats",
            context,
            rag_query="prompt analysis techniques evaluation"
        )

        # With LLM config overrides (switch model per-request)
        response = await bot.chat(
            "Explain quantum computing",
            context,
            llm_config_overrides={"model": "gpt-4-turbo", "temperature": 0.9}
        )
        ```
    """
    turn = TurnState(
        mode=TurnMode.CHAT,
        message=message,
        context=context,
        rag_query=rag_query,
        temperature=temperature,
        max_tokens=max_tokens,
        llm_config_overrides=llm_config_overrides,
        plugin_data=plugin_data or {},
    )
    try:
        await self._prepare_turn(turn)
        response = await self._generate_response(
            turn.manager, temperature, max_tokens, llm_config_overrides
        )

        # DynaBot-level tool execution loop.  Strategies that handle
        # tool_calls internally (e.g. ReAct) return responses without
        # tool_calls, so this loop is a no-op for them.
        loop_start = time.monotonic()
        for _iteration in range(self._max_tool_iterations):
            if (
                not self.tool_registry
                or not getattr(response, "tool_calls", None)
            ):
                break
            if time.monotonic() - loop_start >= self._tool_loop_timeout:
                logger.warning(
                    "Tool execution loop exceeded wall-clock timeout "
                    "(%.1fs)",
                    self._tool_loop_timeout,
                    extra={
                        "conversation_id": getattr(
                            turn.manager, "conversation_id", None
                        ),
                    },
                )
                break
            await self._execute_tools(turn, response.tool_calls)
            # Accumulate usage from intermediate LLM calls
            turn.accumulate_usage(response)
            # Enforce remaining loop budget on the LLM re-call
            remaining = self._tool_loop_timeout - (
                time.monotonic() - loop_start
            )
            if remaining <= 0:
                logger.warning(
                    "Tool loop budget exhausted before LLM re-call "
                    "(%.1fs budget)",
                    self._tool_loop_timeout,
                    extra={
                        "conversation_id": getattr(
                            turn.manager, "conversation_id", None
                        ),
                    },
                )
                break
            try:
                response = await asyncio.wait_for(
                    turn.manager.complete(
                        tools=list(self.tool_registry) or None,
                        temperature=temperature or self.default_temperature,
                        max_tokens=max_tokens or self.default_max_tokens,
                        llm_config_overrides=llm_config_overrides,
                    ),
                    timeout=remaining,
                )
            except (TimeoutError, asyncio.TimeoutError):
                logger.warning(
                    "LLM re-call exceeded remaining tool loop "
                    "budget (%.1fs remaining of %.1fs)",
                    remaining,
                    self._tool_loop_timeout,
                    extra={
                        "conversation_id": getattr(
                            turn.manager, "conversation_id", None
                        ),
                    },
                )
                break
        else:
            # Loop completed without break — cap hit
            if self.tool_registry and getattr(
                response, "tool_calls", None
            ):
                logger.warning(
                    "Tool execution loop reached max iterations (%d) "
                    "with pending tool_calls",
                    self._max_tool_iterations,
                    extra={
                        "conversation_id": getattr(
                            turn.manager, "conversation_id", None
                        ),
                    },
                )

        turn.response = response
        turn.response_content = self._extract_response_content(response)
        turn.populate_from_response(response, self.llm)
        await self._finalize_turn(turn)
        return turn.response_content
    except Exception as e:
        await self._call_on_error_middleware(e, message, context)
        raise
    finally:
        await self._call_finally_turn_middleware(turn)
greet async
greet(
    context: BotContext,
    *,
    initial_context: dict[str, Any] | None = None,
    plugin_data: dict[str, Any] | None = None,
) -> str | None

Generate a bot-initiated greeting before the user speaks.

Delegates to the reasoning strategy's greet() method. Returns None if the bot has no reasoning strategy or the strategy does not support greetings (e.g. non-wizard strategies).

No user message is added to conversation history — the greeting is a bot-initiated assistant message only.

Parameters:

- `context` (`BotContext`, required): Bot execution context
- `initial_context` (`dict[str, Any] | None`, default `None`): Optional dict of initial data to seed into the reasoning strategy's state before generating the greeting. For wizard strategies, these values are merged into wizard_state.data so they are available to the start stage's prompt template and transforms.
- `plugin_data` (`dict[str, Any] | None`, default `None`): Optional dict to seed turn.plugin_data before middleware runs. See chat() for details. When reasoning_strategy is None, no turn is initiated but finally_turn still fires if plugin_data was provided, ensuring cleanup.

Returns:

`str | None`: Greeting string, or None if the bot does not support greetings

Note

Middleware lifecycle for greet:

- on_turn_start(turn) and before_message("") are called before greeting generation.
- after_turn(turn) and after_message(...) are called on success (only when a response is generated).
- finally_turn(turn) fires on success, on error, and when the strategy returns None (no greeting).
- If an error occurs, on_error hooks receive message="" since there is no user message.
- If a middleware hook itself fails, on_hook_error is called on all middleware.

Example
context = BotContext(conversation_id="conv-123", client_id="harness")
greeting = await bot.greet(context, initial_context={"user_name": "Alice"})
if greeting:
    print(f"Bot says: {greeting}")
Source code in packages/bots/src/dataknobs_bots/bot/base.py
async def greet(
    self,
    context: BotContext,
    *,
    initial_context: dict[str, Any] | None = None,
    plugin_data: dict[str, Any] | None = None,
) -> str | None:
    """Generate a bot-initiated greeting before the user speaks.

    Delegates to the reasoning strategy's ``greet()`` method. Returns
    ``None`` if the bot has no reasoning strategy or the strategy does
    not support greetings (e.g. non-wizard strategies).

    No user message is added to conversation history — the greeting
    is a bot-initiated assistant message only.

    Args:
        context: Bot execution context
        initial_context: Optional dict of initial data to seed into
            the reasoning strategy's state before generating the
            greeting. For wizard strategies, these values are merged
            into ``wizard_state.data`` so they are available to the
            start stage's prompt template and transforms.
        plugin_data: Optional dict to seed ``turn.plugin_data`` before
            middleware runs.  See ``chat()`` for details.

            When ``reasoning_strategy`` is ``None``, no turn is
            initiated but ``finally_turn`` still fires if
            ``plugin_data`` was provided, ensuring cleanup.

    Returns:
        Greeting string, or None if the bot does not support greetings

    Note:
        Middleware lifecycle for greet: ``on_turn_start(turn)`` and
        ``before_message("")`` are called before greeting generation;
        ``after_turn(turn)`` and ``after_message(...)`` are called on
        success (only when a response is generated);
        ``finally_turn(turn)`` fires on success, error, and when
        the strategy returns ``None`` (no greeting).
        If an error occurs, ``on_error`` hooks receive
        ``message=""`` since there is no user message.  If a
        middleware hook itself fails, ``on_hook_error`` is called on
        all middleware.

    Example:
        ```python
        context = BotContext(conversation_id="conv-123", client_id="harness")
        greeting = await bot.greet(context, initial_context={"user_name": "Alice"})
        if greeting:
            print(f"Bot says: {greeting}")
        ```
    """
    if not self.reasoning_strategy:
        if plugin_data is not None:
            turn = TurnState(
                mode=TurnMode.GREET,
                message="",
                context=context,
                plugin_data=plugin_data,
            )
            await self._call_finally_turn_middleware(turn)
        return None

    turn = TurnState(
        mode=TurnMode.GREET,
        message="",
        context=context,
        initial_context=initial_context,
        plugin_data=plugin_data or {},
    )
    try:
        await self._prepare_turn(turn)

        response = await self.reasoning_strategy.greet(
            manager=turn.manager,
            llm=self.llm,
            initial_context=initial_context,
        )

        if response is None:
            return None

        turn.response = response
        turn.response_content = self._extract_response_content(response)
        # Note: greet responses are not checked for tool_calls.
        # Greetings are bot-initiated and strategies are not expected
        # to request tool calls during greet.  If this assumption
        # changes, add the tool execution loop here (matching
        # chat/stream_chat).
        turn.populate_from_response(response, self.llm)
        await self._finalize_turn(turn)
        return turn.response_content
    except Exception as e:
        await self._call_on_error_middleware(e, "", context)
        raise
    finally:
        await self._call_finally_turn_middleware(turn)
stream_chat async
stream_chat(
    message: str,
    context: BotContext,
    temperature: float | None = None,
    max_tokens: int | None = None,
    rag_query: str | None = None,
    llm_config_overrides: dict[str, Any] | None = None,
    plugin_data: dict[str, Any] | None = None,
    **kwargs: Any,
) -> AsyncGenerator[LLMStreamResponse, None]

Stream chat response token by token.

Similar to chat() but yields LLMStreamResponse objects as they are generated, providing both the text delta and rich metadata (usage, finish_reason, is_final) for each chunk.

Parameters:

- `message` (`str`, required): User message to process
- `context` (`BotContext`, required): Bot execution context
- `temperature` (`float | None`, default `None`): Optional temperature override
- `max_tokens` (`int | None`, default `None`): Optional max tokens override
- `rag_query` (`str | None`, default `None`): Optional explicit query for knowledge base retrieval. If provided, this is used instead of the message for RAG.
- `llm_config_overrides` (`dict[str, Any] | None`, default `None`): Optional dict to override LLM config fields for this request only. Supported fields: model, temperature, max_tokens, top_p, stop_sequences, seed, options.
- `plugin_data` (`dict[str, Any] | None`, default `None`): Optional dict to seed turn.plugin_data before middleware runs. See chat() for details.
- `**kwargs` (`Any`, default `{}`): Additional arguments passed to LLM

Yields:

`LLMStreamResponse` objects with `.delta` (text), `.is_final`, `.usage`, and `.finish_reason` attributes.

Example
context = BotContext(
    conversation_id="conv-123",
    client_id="client-456",
    user_id="user-789"
)

# Stream and display in real-time
async for chunk in bot.stream_chat("Explain quantum computing", context):
    print(chunk.delta, end="", flush=True)
print()  # Newline after streaming

# Accumulate response
full_response = ""
async for chunk in bot.stream_chat("Hello!", context):
    full_response += chunk.delta

# With LLM config overrides
async for chunk in bot.stream_chat(
    "Explain quantum computing",
    context,
    llm_config_overrides={"model": "gpt-4-turbo"}
):
    print(chunk.delta, end="", flush=True)
Note

Conversation history is automatically updated after streaming completes. When a reasoning_strategy is configured, the strategy produces the complete response and it is emitted as a single stream chunk.

Cleanup guarantee: finally_turn middleware fires via a finally block inside the async generator. In Python, async generator finally blocks execute only when the generator is fully consumed, explicitly closed (await gen.aclose()), or garbage collected. Callers that break out of the stream early should use contextlib.aclosing to guarantee prompt cleanup::

from contextlib import aclosing

async with aclosing(bot.stream_chat("msg", ctx)) as stream:
    async for chunk in stream:
        if done:
            break  # aclose() fires finally_turn
Source code in packages/bots/src/dataknobs_bots/bot/base.py
async def stream_chat(
    self,
    message: str,
    context: BotContext,
    temperature: float | None = None,
    max_tokens: int | None = None,
    rag_query: str | None = None,
    llm_config_overrides: dict[str, Any] | None = None,
    plugin_data: dict[str, Any] | None = None,
    **kwargs: Any,
) -> AsyncGenerator[LLMStreamResponse, None]:
    """Stream chat response token by token.

    Similar to chat() but yields ``LLMStreamResponse`` objects as they are
    generated, providing both the text delta and rich metadata (usage,
    finish_reason, is_final) for each chunk.

    Args:
        message: User message to process
        context: Bot execution context
        temperature: Optional temperature override
        max_tokens: Optional max tokens override
        rag_query: Optional explicit query for knowledge base retrieval.
                  If provided, this is used instead of the message for RAG.
        llm_config_overrides: Optional dict to override LLM config fields
                  for this request only. Supported fields: model, temperature,
                  max_tokens, top_p, stop_sequences, seed, options.
        plugin_data: Optional dict to seed ``turn.plugin_data`` before
                  middleware runs.  See ``chat()`` for details.
        **kwargs: Additional arguments passed to LLM

    Yields:
        LLMStreamResponse objects with ``.delta`` (text), ``.is_final``,
        ``.usage``, and ``.finish_reason`` attributes.

    Example:
        ```python
        context = BotContext(
            conversation_id="conv-123",
            client_id="client-456",
            user_id="user-789"
        )

        # Stream and display in real-time
        async for chunk in bot.stream_chat("Explain quantum computing", context):
            print(chunk.delta, end="", flush=True)
        print()  # Newline after streaming

        # Accumulate response
        full_response = ""
        async for chunk in bot.stream_chat("Hello!", context):
            full_response += chunk.delta

        # With LLM config overrides
        async for chunk in bot.stream_chat(
            "Explain quantum computing",
            context,
            llm_config_overrides={"model": "gpt-4-turbo"}
        ):
            print(chunk.delta, end="", flush=True)
        ```

    Note:
        Conversation history is automatically updated after streaming completes.
        When a reasoning_strategy is configured, the strategy produces the
        complete response and it is emitted as a single stream chunk.

        **Cleanup guarantee:** ``finally_turn`` middleware fires via a
        ``finally`` block inside the async generator.  In Python, async
        generator ``finally`` blocks execute only when the generator is
        fully consumed, explicitly closed (``await gen.aclose()``), or
        garbage collected.  Callers that break out of the stream early
        should use ``contextlib.aclosing`` to guarantee prompt cleanup::

            from contextlib import aclosing

            async with aclosing(bot.stream_chat("msg", ctx)) as stream:
                async for chunk in stream:
                    if done:
                        break  # aclose() fires finally_turn
    """
    turn = TurnState(
        mode=TurnMode.STREAM,
        message=message,
        context=context,
        rag_query=rag_query,
        temperature=temperature,
        max_tokens=max_tokens,
        llm_config_overrides=llm_config_overrides,
        plugin_data=plugin_data or {},
    )
    streaming_error: Exception | None = None
    stream_fully_consumed = False

    try:
        await self._prepare_turn(turn)

        # Track tool_calls across streaming rounds so the tool
        # execution loop can pick them up after the initial stream.
        pending_tool_calls: list[Any] | None = None

        if self.reasoning_strategy:
            # Delegate to the strategy's stream_generate().
            # Strategies with true streaming (SimpleReasoning) yield
            # LLMStreamResponse chunks; others yield a single complete
            # response that we wrap as a stream chunk.
            async for chunk in self.reasoning_strategy.stream_generate(
                manager=turn.manager,
                llm=self.llm,
                tools=list(self.tool_registry) or None,
                temperature=temperature or self.default_temperature,
                max_tokens=max_tokens or self.default_max_tokens,
                llm_config_overrides=llm_config_overrides,
            ):
                if isinstance(chunk, LLMStreamResponse):
                    turn.stream_chunks.append(chunk.delta)
                    if chunk.is_final or chunk.usage:
                        turn.populate_from_final_stream_chunk(
                            chunk, self.llm
                        )
                    # Intercept tool_calls: suppress is_final so the
                    # consumer knows more content may follow.
                    if chunk.tool_calls and self.tool_registry:
                        pending_tool_calls = chunk.tool_calls
                        yield LLMStreamResponse(
                            delta=chunk.delta,
                            is_final=False,
                            usage=chunk.usage,
                            model=chunk.model,
                        )
                    else:
                        yield chunk
                else:
                    # Strategy yielded a complete LLMResponse — wrap it
                    content = self._extract_response_content(chunk)
                    turn.stream_chunks.append(content)
                    turn.populate_from_response(chunk, self.llm)
                    # Check for tool_calls on the LLMResponse
                    if (
                        getattr(chunk, "tool_calls", None)
                        and self.tool_registry
                    ):
                        pending_tool_calls = chunk.tool_calls
                        yield LLMStreamResponse(
                            delta=content, is_final=False,
                        )
                    else:
                        yield LLMStreamResponse(
                            delta=content,
                            is_final=True,
                            finish_reason="stop",
                        )
        else:
            # No reasoning strategy — stream directly from LLM
            async for chunk in turn.manager.stream_complete(
                tools=list(self.tool_registry) or None,
                llm_config_overrides=llm_config_overrides,
                temperature=temperature or self.default_temperature,
                max_tokens=max_tokens or self.default_max_tokens,
                **kwargs,
            ):
                turn.stream_chunks.append(chunk.delta)
                if chunk.is_final or chunk.usage:
                    turn.populate_from_final_stream_chunk(chunk, self.llm)
                if chunk.tool_calls and self.tool_registry:
                    pending_tool_calls = chunk.tool_calls
                    yield LLMStreamResponse(
                        delta=chunk.delta,
                        is_final=False,
                        usage=chunk.usage,
                        model=chunk.model,
                    )
                else:
                    yield chunk

        # DynaBot-level tool execution loop for streaming.
        # Execute pending tool_calls, then re-stream until no
        # more tool_calls or max iterations reached.
        loop_start = time.monotonic()
        for _iteration in range(self._max_tool_iterations):
            if not pending_tool_calls or not self.tool_registry:
                break
            if time.monotonic() - loop_start >= self._tool_loop_timeout:
                logger.warning(
                    "Streaming tool execution loop exceeded "
                    "wall-clock timeout (%.1fs)",
                    self._tool_loop_timeout,
                    extra={
                        "conversation_id": getattr(
                            turn.manager, "conversation_id", None
                        ),
                    },
                )
                break
            await self._execute_tools(turn, pending_tool_calls)
            # Accumulate usage from intermediate streaming rounds
            turn.accumulate_usage_from_stream()
            pending_tool_calls = None

            # Check remaining budget before starting LLM re-stream
            remaining = self._tool_loop_timeout - (
                time.monotonic() - loop_start
            )
            if remaining <= 0:
                logger.warning(
                    "Streaming tool loop budget exhausted before "
                    "LLM re-stream (%.1fs budget)",
                    self._tool_loop_timeout,
                    extra={
                        "conversation_id": getattr(
                            turn.manager, "conversation_id", None
                        ),
                    },
                )
                break

            async for chunk in turn.manager.stream_complete(
                tools=list(self.tool_registry) or None,
                temperature=temperature or self.default_temperature,
                max_tokens=max_tokens or self.default_max_tokens,
                llm_config_overrides=llm_config_overrides,
            ):
                turn.stream_chunks.append(chunk.delta)
                if chunk.is_final or chunk.usage:
                    turn.populate_from_final_stream_chunk(
                        chunk, self.llm
                    )
                if chunk.tool_calls and self.tool_registry:
                    pending_tool_calls = chunk.tool_calls
                    yield LLMStreamResponse(
                        delta=chunk.delta,
                        is_final=False,
                        usage=chunk.usage,
                        model=chunk.model,
                    )
                else:
                    yield chunk
        else:
            # Loop completed without break — cap hit
            if pending_tool_calls and self.tool_registry:
                logger.warning(
                    "Streaming tool execution loop reached max "
                    "iterations (%d) with pending tool_calls",
                    self._max_tool_iterations,
                    extra={
                        "conversation_id": getattr(
                            turn.manager, "conversation_id", None
                        ),
                    },
                )

        stream_fully_consumed = True

    except Exception as e:
        streaming_error = e
        await self._call_on_error_middleware(e, message, context)
        raise
    finally:
        # Only finalize when the stream was fully consumed (not
        # on early exit via aclose/break, which would write
        # partial data to conversation history).
        if streaming_error is None and stream_fully_consumed:
            turn.response_content = "".join(turn.stream_chunks)
            await self._finalize_turn(turn)
        await self._call_finally_turn_middleware(turn)
get_conversation async
get_conversation(conversation_id: str) -> Any

Retrieve conversation history.

This method fetches the complete conversation state including all messages, metadata, and the message tree structure. Useful for displaying conversation history, debugging, analytics, or exporting conversations.

Parameters:

Name Type Description Default
conversation_id str

Unique identifier of the conversation to retrieve

required

Returns:

Type Description
Any

ConversationState object containing the full conversation history, or None if the conversation does not exist

Example
# Retrieve a conversation
conv_state = await bot.get_conversation("conv-123")

# Access messages
messages = conv_state.message_tree

# Access metadata
print(conv_state.metadata)
See Also
  • clear_conversation(): Clear/delete a conversation
  • chat(): Add messages to a conversation
Source code in packages/bots/src/dataknobs_bots/bot/base.py
async def get_conversation(self, conversation_id: str) -> Any:
    """Retrieve conversation history.

    This method fetches the complete conversation state including all messages,
    metadata, and the message tree structure. Useful for displaying conversation
    history, debugging, analytics, or exporting conversations.

    Args:
        conversation_id: Unique identifier of the conversation to retrieve

    Returns:
        ConversationState object containing the full conversation history,
        or None if the conversation does not exist

    Example:
        ```python
        # Retrieve a conversation
        conv_state = await bot.get_conversation("conv-123")

        # Access messages
        messages = conv_state.message_tree

        # Access metadata
        print(conv_state.metadata)
        ```

    See Also:
        - clear_conversation(): Clear/delete a conversation
        - chat(): Add messages to a conversation
    """
    return await self.conversation_storage.load_conversation(conversation_id)
clear_conversation async
clear_conversation(conversation_id: str) -> bool

Clear a conversation's history.

This method removes the conversation from both persistent storage and the internal cache. The next chat() call with this conversation_id will start a fresh conversation. Useful for:

  • Implementing "start over" functionality
  • Privacy/data deletion requirements
  • Testing and cleanup
  • Resetting conversation context

Parameters:

Name Type Description Default
conversation_id str

Unique identifier of the conversation to clear

required

Returns:

Type Description
bool

True if the conversation was deleted, False if it didn't exist

Example
# Clear a conversation
deleted = await bot.clear_conversation("conv-123")

if deleted:
    print("Conversation deleted")
else:
    print("Conversation not found")

# Next chat will start fresh
response = await bot.chat("Hello!", context)
Note

This operation is permanent and cannot be undone. The conversation cannot be recovered after deletion.

See Also
  • get_conversation(): Retrieve conversation before clearing
  • chat(): Will create new conversation after clearing
Source code in packages/bots/src/dataknobs_bots/bot/base.py
async def clear_conversation(self, conversation_id: str) -> bool:
    """Clear a conversation's history.

    This method removes the conversation from both persistent storage and the
    internal cache. The next chat() call with this conversation_id will start
    a fresh conversation. Useful for:

    - Implementing "start over" functionality
    - Privacy/data deletion requirements
    - Testing and cleanup
    - Resetting conversation context

    Args:
        conversation_id: Unique identifier of the conversation to clear

    Returns:
        True if the conversation was deleted, False if it didn't exist

    Example:
        ```python
        # Clear a conversation
        deleted = await bot.clear_conversation("conv-123")

        if deleted:
            print("Conversation deleted")
        else:
            print("Conversation not found")

        # Next chat will start fresh
        response = await bot.chat("Hello!", context)
        ```

    Note:
        This operation is permanent and cannot be undone. The conversation
        cannot be recovered after deletion.

    See Also:
        - get_conversation(): Retrieve conversation before clearing
        - chat(): Will create new conversation after clearing
    """
    # Remove from cache if present
    if conversation_id in self._conversation_managers:
        del self._conversation_managers[conversation_id]

    # Delete from storage
    return await self.conversation_storage.delete_conversation(conversation_id)
get_wizard_state async
get_wizard_state(conversation_id: str) -> dict[str, Any] | None

Get current wizard state for a conversation.

This method provides public access to wizard state without requiring access to private conversation managers. It checks the in-memory manager first (most current) and falls back to persisted storage.

Parameters:

Name Type Description Default
conversation_id str

Conversation identifier

required

Returns:

Type Description
dict[str, Any] | None

Wizard state dict with canonical structure, or None if no wizard active or conversation not found.

The returned dict follows the canonical schema:

{
    "current_stage": str,
    "stage_index": int,
    "total_stages": int,
    "progress": float,
    "completed": bool,
    "data": dict,
    "can_skip": bool,
    "can_go_back": bool,
    "suggestions": list[str],
    "history": list[str],
}

Example
# Get wizard state for a conversation
state = await bot.get_wizard_state("conv-123")

if state:
    print(f"Current stage: {state['current_stage']}")
    print(f"Progress: {state['progress'] * 100:.0f}%")
    print(f"Collected data: {state['data']}")
Source code in packages/bots/src/dataknobs_bots/bot/base.py
async def get_wizard_state(self, conversation_id: str) -> dict[str, Any] | None:
    """Get current wizard state for a conversation.

    This method provides public access to wizard state without requiring
    access to private conversation managers. It checks the in-memory
    manager first (most current) and falls back to persisted storage.

    Args:
        conversation_id: Conversation identifier

    Returns:
        Wizard state dict with canonical structure, or None if no wizard
        active or conversation not found.

    The returned dict follows the canonical schema:
        {
            "current_stage": str,
            "stage_index": int,
            "total_stages": int,
            "progress": float,
            "completed": bool,
            "data": dict,
            "can_skip": bool,
            "can_go_back": bool,
            "suggestions": list[str],
            "history": list[str],
        }

    Example:
        ```python
        # Get wizard state for a conversation
        state = await bot.get_wizard_state("conv-123")

        if state:
            print(f"Current stage: {state['current_stage']}")
            print(f"Progress: {state['progress'] * 100:.0f}%")
            print(f"Collected data: {state['data']}")
        ```
    """
    # Fast path: in-memory cache
    manager = self._conversation_managers.get(conversation_id)
    if manager and manager.metadata:
        wizard_meta = manager.metadata.get("wizard")
        if wizard_meta:
            return self._normalize_wizard_state(wizard_meta)

    # Slow path: fall back to persisted storage
    state = await self.conversation_storage.load_conversation(conversation_id)
    if state and state.metadata:
        wizard_meta = state.metadata.get("wizard")
        if wizard_meta:
            return self._normalize_wizard_state(wizard_meta)

    return None
close async
close() -> None

Close the bot and clean up resources.

This method closes the LLM provider, conversation storage backend, reasoning strategy, and releases associated resources like HTTP connections and database connections. Should be called when the bot is no longer needed, especially in testing or when creating temporary bot instances.

Example
bot = await DynaBot.from_config(config)
try:
    response = await bot.chat("Hello", context)
finally:
    await bot.close()
Note

After calling close(), the bot should not be used for further operations. Create a new bot instance if needed.

Source code in packages/bots/src/dataknobs_bots/bot/base.py
async def close(self) -> None:
    """Close the bot and clean up resources.

    This method closes the LLM provider, conversation storage backend,
    reasoning strategy, and releases associated resources like HTTP
    connections and database connections. Should be called when the bot
    is no longer needed, especially in testing or when creating temporary
    bot instances.

    Example:
        ```python
        bot = await DynaBot.from_config(config)
        try:
            response = await bot.chat("Hello", context)
        finally:
            await bot.close()
        ```

    Note:
        After calling close(), the bot should not be used for further operations.
        Create a new bot instance if needed.
    """
    # Each subsystem owns the lifecycle of the providers it created.
    # The provider registry is a catalog for observability — it does
    # not manage lifecycle.  DynaBot only closes self.llm (the main
    # provider it created).

    # Close subsystems — each closes its own providers and resources.
    if self.knowledge_base:
        try:
            await self.knowledge_base.close()
        except Exception:
            logger.exception("Error closing knowledge base")

    if self.reasoning_strategy:
        try:
            await self.reasoning_strategy.close()
        except Exception:
            logger.exception("Error closing reasoning strategy")

    if self.memory:
        try:
            await self.memory.close()
        except Exception:
            logger.exception("Error closing memory store")

    # Close conversation storage
    if self.conversation_storage:
        try:
            await self.conversation_storage.close()
        except Exception:
            logger.exception("Error closing conversation storage")

    # Close main LLM provider only if DynaBot created it.
    # When from_config(llm=...) was used, the caller owns the lifecycle.
    if self._owns_llm and self.llm and hasattr(self.llm, "close"):
        try:
            await self.llm.close()
        except Exception:
            logger.exception("Error closing main LLM provider")
__aenter__ async
__aenter__() -> Self

Async context manager entry.

Returns:

Type Description
Self

Self for use in async with statement

Source code in packages/bots/src/dataknobs_bots/bot/base.py
async def __aenter__(self) -> Self:
    """Async context manager entry.

    Returns:
        Self for use in async with statement
    """
    return self
__aexit__ async
__aexit__(
    exc_type: type[BaseException] | None,
    exc_val: BaseException | None,
    exc_tb: TracebackType | None,
) -> None

Async context manager exit - ensures cleanup.

Parameters:

Name Type Description Default
exc_type type[BaseException] | None

Exception type if an exception occurred

required
exc_val BaseException | None

Exception value if an exception occurred

required
exc_tb TracebackType | None

Exception traceback if an exception occurred

required
Source code in packages/bots/src/dataknobs_bots/bot/base.py
async def __aexit__(
    self,
    exc_type: type[BaseException] | None,
    exc_val: BaseException | None,
    exc_tb: TracebackType | None,
) -> None:
    """Async context manager exit - ensures cleanup.

    Args:
        exc_type: Exception type if an exception occurred
        exc_val: Exception value if an exception occurred
        exc_tb: Exception traceback if an exception occurred
    """
    await self.close()
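The entry/exit pair above enables `async with` usage. A minimal sketch, assuming `config` and `context` are defined as in the earlier examples on this page:

```python
# Hedged sketch: `config` and `context` are assumed to be defined as in
# the earlier examples on this page.
async def run_once(config, context):
    bot = await DynaBot.from_config(config)
    async with bot:  # __aexit__ calls close() even if chat() raises
        return await bot.chat("Hello", context)
```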
get_conversation_manager
get_conversation_manager(conversation_id: str) -> ConversationManager | None

Get a cached conversation manager by conversation ID.

Returns None if no manager exists for the given ID (i.e. no turn has been processed for that conversation yet). Use this for cross-layer integration testing (e.g. injecting LLM-layer ConversationMiddleware into a manager after construction).

Parameters:

Name Type Description Default
conversation_id str

Conversation identifier

required

Returns:

Type Description
ConversationManager | None

Cached ConversationManager, or None
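For test setup, a small helper sketch (hedged: `bot` is any constructed DynaBot instance; the helper name is illustrative):

```python
# Hedged sketch: `bot` is assumed to be a constructed DynaBot instance.
def has_active_manager(bot, conversation_id: str) -> bool:
    """True once at least one turn has been processed for the conversation."""
    return bot.get_conversation_manager(conversation_id) is not None
```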

Source code in packages/bots/src/dataknobs_bots/bot/base.py
def get_conversation_manager(
    self, conversation_id: str
) -> ConversationManager | None:
    """Get a cached conversation manager by conversation ID.

    Returns ``None`` if no manager exists for the given ID (i.e. no
    turn has been processed for that conversation yet).  Use this for
    cross-layer integration testing (e.g. injecting LLM-layer
    ``ConversationMiddleware`` into a manager after construction).

    Args:
        conversation_id: Conversation identifier

    Returns:
        Cached ConversationManager, or None
    """
    return self._conversation_managers.get(conversation_id)
undo_last_turn async
undo_last_turn(context: BotContext) -> UndoResult

Undo the last conversational turn (user message + bot response).

Navigates the conversation tree back to the node_id recorded before the last turn started. The next chat() call will create a new branch from that point. The original branch is preserved in the tree.

Also rolls back:

  • Memory layer (pop N messages based on node depth difference)
  • Wizard FSM state (restored from per-node metadata)
  • Memory banks (reverted via backend-managed checkpointing)

Parameters:

Name Type Description Default
context BotContext

Bot execution context (identifies the conversation).

required

Returns:

Type Description
UndoResult

UndoResult with details about what was undone.

Raises:

Type Description
ValueError

If there's nothing to undo (at start of conversation).
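A hedged usage sketch, assuming `bot` and `context` exist as in the earlier chat() examples, showing how to handle the nothing-to-undo case:

```python
# Hedged sketch: `bot` and `context` are assumed to exist as in the
# earlier chat() examples on this page.
async def undo_with_feedback(bot, context) -> str:
    try:
        result = await bot.undo_last_turn(context)
    except ValueError:
        return "Nothing to undo"
    return (
        f"Undid {result.undone_user_message!r}; "
        f"{result.remaining_turns} turn(s) remain"
    )
```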

Source code in packages/bots/src/dataknobs_bots/bot/base.py
async def undo_last_turn(self, context: BotContext) -> UndoResult:
    """Undo the last conversational turn (user message + bot response).

    Navigates the conversation tree back to the node_id recorded before
    the last turn started. The next chat() call will create a new branch
    from that point. The original branch is preserved in the tree.

    Also rolls back:
    - Memory layer (pop N messages based on node depth difference)
    - Wizard FSM state (restored from per-node metadata)
    - Memory banks (reverted via backend-managed checkpointing)

    Args:
        context: Bot execution context (identifies the conversation).

    Returns:
        UndoResult with details about what was undone.

    Raises:
        ValueError: If there's nothing to undo (at start of conversation).
    """
    conv_id = context.conversation_id
    manager = self._conversation_managers.get(conv_id)
    if manager is None or manager.state is None:
        raise ValueError("No active conversation")

    checkpoints = self._turn_checkpoints.get(conv_id, [])
    if not checkpoints:
        raise ValueError("Nothing to undo")

    checkpoint_node_id, checkpoint_mem_count = checkpoints.pop()

    # Identify what we're undoing (last user message + last bot response).
    # For user messages, prefer raw_content from node metadata so that
    # UndoResult.undone_user_message reflects the original user input
    # rather than the KB/memory-augmented version.
    undone_user = ""
    undone_bot = ""
    nodes = manager.state.get_current_nodes()
    for node in reversed(nodes):
        role = node.message.role
        if role == "assistant" and not undone_bot:
            content = node.message.content
            undone_bot = content if isinstance(content, str) else str(content)
        elif role == "user" and not undone_user:
            raw = node.metadata.get("raw_content")
            if raw is not None:
                undone_user = raw
            else:
                content = node.message.content
                undone_user = content if isinstance(content, str) else str(content)
            break

    # Navigate back — next add_message() creates a sibling branch
    await manager.switch_to_node(checkpoint_node_id)

    # Roll back memory — use stored message count for accuracy
    current_mem_count = 0
    if self.memory:
        try:
            current_mem_count = len(await self.memory.get_context(""))
        except Exception:
            current_mem_count = 0
    messages_to_pop = current_mem_count - checkpoint_mem_count
    if self.memory and messages_to_pop > 0:
        try:
            await self.memory.pop_messages(messages_to_pop)
        except (ValueError, NotImplementedError):
            logger.warning(
                "Memory pop_messages failed for %d messages",
                messages_to_pop,
                exc_info=True,
            )

    # Restore wizard FSM state from checkpoint node's metadata
    self._restore_wizard_from_node(manager, checkpoint_node_id)

    # Revert banks via backend-managed checkpointing
    self._undo_banks_to_checkpoint(checkpoint_node_id)

    # Count remaining turns
    remaining_messages = manager.messages
    user_count = sum(
        1 for m in remaining_messages
        if (m.get("role") if isinstance(m, dict) else getattr(m, "role", "")) == "user"
    )

    return UndoResult(
        undone_user_message=undone_user,
        undone_bot_response=undone_bot,
        remaining_turns=user_count,
        branching=True,
    )
rewind_to_turn async
rewind_to_turn(context: BotContext, turn: int) -> UndoResult

Rewind conversation to after the given turn number.

Turn 0 is the first user-bot exchange. Rewinding to turn -1 means back to the start (before any user messages).

Parameters:

Name Type Description Default
context BotContext

Bot execution context.

required
turn int

Turn number to rewind to (-1 for conversation start).

required

Returns:

Type Description
UndoResult

UndoResult with details about what was undone.

Raises:

Type Description
ValueError

If turn number is invalid.
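The turn-to-checkpoint arithmetic can be sketched in isolation (`checkpoints` below is a stand-in list; the real entries are node IDs recorded per turn):

```python
# Stand-in checkpoint list: one entry per completed turn;
# checkpoints[0] was recorded before turn 0 started.
checkpoints = ["node-a", "node-b", "node-c"]

turn = 0                    # rewind to just after the first exchange
target_count = turn + 1     # keep only checkpoints[0]
turns_to_undo = len(checkpoints) - target_count
print(turns_to_undo)        # → 2 (two undo_last_turn() calls)
```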

Source code in packages/bots/src/dataknobs_bots/bot/base.py
async def rewind_to_turn(
    self, context: BotContext, turn: int
) -> UndoResult:
    """Rewind conversation to after the given turn number.

    Turn 0 is the first user-bot exchange. Rewinding to turn -1
    means back to the start (before any user messages).

    Args:
        context: Bot execution context.
        turn: Turn number to rewind to (-1 for conversation start).

    Returns:
        UndoResult with details about what was undone.

    Raises:
        ValueError: If turn number is invalid.
    """
    conv_id = context.conversation_id
    checkpoints = self._turn_checkpoints.get(conv_id, [])
    target_count = turn + 1  # checkpoints[0] is before turn 0

    if target_count < 0 or target_count > len(checkpoints):
        raise ValueError(
            f"Invalid turn {turn}: conversation has "
            f"{len(checkpoints)} turns"
        )

    turns_to_undo = len(checkpoints) - target_count
    result = None
    for _ in range(turns_to_undo):
        result = await self.undo_last_turn(context)

    if result is None:
        raise ValueError("Nothing to undo")
    return result

UndoResult dataclass

UndoResult(
    undone_user_message: str,
    undone_bot_response: str,
    remaining_turns: int,
    branching: bool,
)

Result of an undo operation.

ConfigDraftManager

ConfigDraftManager(
    output_dir: Path,
    draft_prefix: str = "_draft-",
    max_age_hours: float = 24.0,
    metadata_key: str = "_draft",
)

File-based draft manager for interactive config creation.

Manages the lifecycle of configuration drafts: creation, incremental updates, finalization, and cleanup of stale drafts.

Draft files are named {prefix}{draft_id}.yaml and stored in the output directory. When a config_name is provided, a named alias file {config_name}.yaml is also maintained.
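A hedged lifecycle sketch (the import path and the config keys are assumptions based on this page's module listing; the actual layout may differ):

```python
from pathlib import Path

def start_draft(output_dir: Path) -> str:
    # Assumed import path, based on this page's module listing.
    from dataknobs_bots.config import ConfigDraftManager

    manager = ConfigDraftManager(output_dir=output_dir)
    # Config keys here are illustrative, not a validated schema.
    draft_id = manager.create_draft({"llm": {"model": "gpt-4o"}}, stage="llm")
    manager.update_draft(
        draft_id,
        {"llm": {"model": "gpt-4o"}, "memory": {"type": "buffer"}},
        stage="memory",
        config_name="demo-bot",  # also maintains the demo-bot.yaml alias
    )
    return draft_id
```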

Initialize the draft manager.

Parameters:

Name Type Description Default
output_dir Path

Directory for draft and config files.

required
draft_prefix str

Prefix for draft file names.

'_draft-'
max_age_hours float

Default maximum age for stale draft cleanup.

24.0
metadata_key str

Key used to store draft metadata in config files.

'_draft'

Methods:

Name Description
create_draft

Create a new draft from a config dict.

update_draft

Update an existing draft.

get_draft

Retrieve a draft and its metadata.

finalize

Finalize a draft into a completed configuration.

discard

Discard a draft by removing its file.

list_drafts

List all current drafts.

cleanup_stale

Remove drafts older than the specified age.

Attributes:

Name Type Description
output_dir Path

The output directory for drafts.

Source code in packages/bots/src/dataknobs_bots/config/drafts.py
def __init__(
    self,
    output_dir: Path,
    draft_prefix: str = "_draft-",
    max_age_hours: float = 24.0,
    metadata_key: str = "_draft",
) -> None:
    """Initialize the draft manager.

    Args:
        output_dir: Directory for draft and config files.
        draft_prefix: Prefix for draft file names.
        max_age_hours: Default maximum age for stale draft cleanup.
        metadata_key: Key used to store draft metadata in config files.
    """
    self._output_dir = output_dir
    self._draft_prefix = draft_prefix
    self._max_age_hours = max_age_hours
    self._metadata_key = metadata_key
Attributes
output_dir property
output_dir: Path

The output directory for drafts.

Functions
create_draft
create_draft(config: dict[str, Any], stage: str | None = None) -> str

Create a new draft from a config dict.

Parameters:

Name Type Description Default
config dict[str, Any]

Configuration dictionary to save as draft.

required
stage str | None

Current wizard stage.

None

Returns:

Type Description
str

The generated draft ID.

Source code in packages/bots/src/dataknobs_bots/config/drafts.py
def create_draft(
    self,
    config: dict[str, Any],
    stage: str | None = None,
) -> str:
    """Create a new draft from a config dict.

    Args:
        config: Configuration dictionary to save as draft.
        stage: Current wizard stage.

    Returns:
        The generated draft ID.
    """
    draft_id = uuid.uuid4().hex[:8]
    now = datetime.now(timezone.utc).isoformat()

    metadata = DraftMetadata(
        draft_id=draft_id,
        created_at=now,
        last_updated=now,
        stage=stage,
    )

    self._write_draft(draft_id, config, metadata)
    logger.info(
        "Created draft %s at stage '%s'",
        draft_id,
        stage,
        extra={"draft_id": draft_id, "stage": stage},
    )
    return draft_id
update_draft
update_draft(
    draft_id: str,
    config: dict[str, Any],
    stage: str | None = None,
    config_name: str | None = None,
) -> None

Update an existing draft.

Parameters:

Name Type Description Default
draft_id str

The draft ID to update.

required
config dict[str, Any]

Updated configuration dictionary.

required
stage str | None

Current wizard stage.

None
config_name str | None

Optional name for the config file alias.

None

Raises:

Type Description
FileNotFoundError

If the draft file does not exist.

Source code in packages/bots/src/dataknobs_bots/config/drafts.py
def update_draft(
    self,
    draft_id: str,
    config: dict[str, Any],
    stage: str | None = None,
    config_name: str | None = None,
) -> None:
    """Update an existing draft.

    Args:
        draft_id: The draft ID to update.
        config: Updated configuration dictionary.
        stage: Current wizard stage.
        config_name: Optional name for the config file alias.

    Raises:
        FileNotFoundError: If the draft file does not exist.
    """
    draft_path = self._draft_path(draft_id)
    if not draft_path.exists():
        raise FileNotFoundError(f"Draft not found: {draft_id}")

    existing = self._read_file(draft_path)
    existing_meta = existing.get(self._metadata_key, {})
    now = datetime.now(timezone.utc).isoformat()

    metadata = DraftMetadata(
        draft_id=draft_id,
        created_at=existing_meta.get("created_at", now),
        last_updated=now,
        stage=stage or existing_meta.get("stage"),
        config_name=config_name or existing_meta.get("config_name"),
    )

    self._write_draft(draft_id, config, metadata)

    # Also write named alias file if config_name is set
    if metadata.config_name:
        self._write_named_file(metadata.config_name, config, metadata)

    logger.info(
        "Updated draft %s at stage '%s'",
        draft_id,
        stage,
        extra={"draft_id": draft_id, "stage": stage},
    )
get_draft
get_draft(draft_id: str) -> tuple[dict[str, Any], DraftMetadata] | None

Retrieve a draft and its metadata.

Parameters:

Name Type Description Default
draft_id str

The draft ID to retrieve.

required

Returns:

Type Description
tuple[dict[str, Any], DraftMetadata] | None

Tuple of (config_dict, metadata), or None if not found.

Source code in packages/bots/src/dataknobs_bots/config/drafts.py
def get_draft(
    self, draft_id: str
) -> tuple[dict[str, Any], DraftMetadata] | None:
    """Retrieve a draft and its metadata.

    Args:
        draft_id: The draft ID to retrieve.

    Returns:
        Tuple of (config_dict, metadata), or None if not found.
    """
    draft_path = self._draft_path(draft_id)
    if not draft_path.exists():
        return None

    data = self._read_file(draft_path)
    meta_dict = data.pop(self._metadata_key, {})
    metadata = DraftMetadata.from_dict(meta_dict)
    return data, metadata
finalize
finalize(draft_id: str, final_name: str | None = None) -> dict[str, Any]

Finalize a draft into a completed configuration.

Strips draft metadata, writes the final config file, and removes the draft file.

Parameters:

Name Type Description Default
draft_id str

The draft ID to finalize.

required
final_name str | None

Name for the final config file. If not provided, uses the config_name from draft metadata.

None

Returns:

Type Description
dict[str, Any]

The finalized configuration dict (without draft metadata).

Raises:

Type Description
FileNotFoundError

If the draft does not exist.

ValueError

If no final name can be determined.

Source code in packages/bots/src/dataknobs_bots/config/drafts.py
def finalize(
    self,
    draft_id: str,
    final_name: str | None = None,
) -> dict[str, Any]:
    """Finalize a draft into a completed configuration.

    Strips draft metadata, writes the final config file, and
    removes the draft file.

    Args:
        draft_id: The draft ID to finalize.
        final_name: Name for the final config file. If not provided,
            uses the config_name from draft metadata.

    Returns:
        The finalized configuration dict (without draft metadata).

    Raises:
        FileNotFoundError: If the draft does not exist.
        ValueError: If no final name can be determined.
    """
    result = self.get_draft(draft_id)
    if result is None:
        raise FileNotFoundError(f"Draft not found: {draft_id}")

    config, metadata = result
    name = final_name or metadata.config_name
    if not name:
        raise ValueError(
            "No final_name provided and draft has no config_name set"
        )

    # Write final file without metadata
    self._ensure_output_dir()
    final_path = self._output_dir / f"{name}.yaml"
    self._write_yaml(final_path, config)

    # Remove draft file
    draft_path = self._draft_path(draft_id)
    if draft_path.exists():
        draft_path.unlink()

    logger.info(
        "Finalized draft %s as '%s'",
        draft_id,
        name,
        extra={"draft_id": draft_id, "final_name": name},
    )
    return config
discard
discard(draft_id: str) -> bool

Discard a draft by removing its file.

Parameters:

Name Type Description Default
draft_id str

The draft ID to discard.

required

Returns:

Type Description
bool

True if the draft was found and removed, False otherwise.

Source code in packages/bots/src/dataknobs_bots/config/drafts.py
def discard(self, draft_id: str) -> bool:
    """Discard a draft by removing its file.

    Args:
        draft_id: The draft ID to discard.

    Returns:
        True if the draft was found and removed, False otherwise.
    """
    draft_path = self._draft_path(draft_id)
    if draft_path.exists():
        draft_path.unlink()
        logger.info("Discarded draft %s", draft_id)
        return True
    return False
list_drafts
list_drafts() -> list[DraftMetadata]

List all current drafts.

Returns:

Type Description
list[DraftMetadata]

List of DraftMetadata for all draft files.

Source code in packages/bots/src/dataknobs_bots/config/drafts.py
def list_drafts(self) -> list[DraftMetadata]:
    """List all current drafts.

    Returns:
        List of DraftMetadata for all draft files.
    """
    result: list[DraftMetadata] = []
    if not self._output_dir.exists():
        return result

    for path in sorted(self._output_dir.glob(f"{self._draft_prefix}*.yaml")):
        try:
            data = self._read_file(path)
            meta_dict = data.get(self._metadata_key, {})
            if meta_dict:
                result.append(DraftMetadata.from_dict(meta_dict))
        except Exception:
            logger.exception("Failed to read draft: %s", path)
    return result
cleanup_stale
cleanup_stale(max_age_hours: float | None = None) -> int

Remove drafts older than the specified age.

Also strips stale draft metadata blocks from named config files.

Parameters:

Name Type Description Default
max_age_hours float | None

Maximum age in hours. Defaults to the manager's configured max_age_hours.

None

Returns:

Type Description
int

Number of stale drafts removed.

Source code in packages/bots/src/dataknobs_bots/config/drafts.py
def cleanup_stale(self, max_age_hours: float | None = None) -> int:
    """Remove drafts older than the specified age.

    Also strips stale draft metadata blocks from named config files.

    Args:
        max_age_hours: Maximum age in hours. Defaults to the
            manager's configured max_age_hours.

    Returns:
        Number of stale drafts removed.
    """
    age_limit = max_age_hours if max_age_hours is not None else self._max_age_hours
    cutoff = time.time() - (age_limit * 3600)
    cleaned = 0

    if not self._output_dir.exists():
        return 0

    # Clean draft files
    for path in self._output_dir.glob(f"{self._draft_prefix}*.yaml"):
        try:
            data = self._read_file(path)
            meta = data.get(self._metadata_key, {})
            last_updated = meta.get("last_updated", "")
            if last_updated and _parse_timestamp(last_updated) < cutoff:
                path.unlink()
                cleaned += 1
                logger.info("Cleaned stale draft: %s", path.name)
        except Exception:
            logger.exception("Failed to cleanup draft: %s", path)

    # Strip stale metadata from named config files
    for path in self._output_dir.glob("*.yaml"):
        if path.name.startswith(self._draft_prefix):
            continue
        try:
            data = self._read_file(path)
            meta = data.get(self._metadata_key, {})
            if not meta:
                continue
            last_updated = meta.get("last_updated", "")
            if last_updated and _parse_timestamp(last_updated) < cutoff:
                data.pop(self._metadata_key, None)
                self._write_yaml(path, data)
                logger.info(
                    "Stripped stale metadata from %s", path.name
                )
        except Exception:
            logger.exception(
                "Failed to strip metadata from: %s", path
            )

    return cleaned

ConfigTemplate dataclass

ConfigTemplate(
    name: str,
    description: str = "",
    version: str = "1.0.0",
    tags: list[str] = list(),
    variables: list[TemplateVariable] = list(),
    structure: dict[str, Any] = dict(),
)

A reusable DynaBot configuration template.

Templates define a configuration structure with variable placeholders ({{var}}) that are substituted when the template is applied.

Attributes:

Name Type Description
name str

Template identifier (underscores internally).

description str

Human-readable description.

version str

Semantic version string.

tags list[str]

Tags for filtering and categorization.

variables list[TemplateVariable]

List of template variables.

structure dict[str, Any]

The config structure with {{var}} placeholders.

Methods:

Name Description
get_required_variables

Get variables that must be provided.

get_optional_variables

Get variables that have defaults or are not required.

to_dict

Convert to dictionary representation.

from_dict

Create a ConfigTemplate from a dictionary.

from_yaml_file

Load a ConfigTemplate from a YAML file.

Functions
get_required_variables
get_required_variables() -> list[TemplateVariable]

Get variables that must be provided.

Source code in packages/bots/src/dataknobs_bots/config/templates.py
def get_required_variables(self) -> list[TemplateVariable]:
    """Get variables that must be provided."""
    return [v for v in self.variables if v.required]
get_optional_variables
get_optional_variables() -> list[TemplateVariable]

Get variables that have defaults or are not required.

Source code in packages/bots/src/dataknobs_bots/config/templates.py
def get_optional_variables(self) -> list[TemplateVariable]:
    """Get variables that have defaults or are not required."""
    return [v for v in self.variables if not v.required]
to_dict
to_dict() -> dict[str, Any]

Convert to dictionary representation.

Source code in packages/bots/src/dataknobs_bots/config/templates.py
def to_dict(self) -> dict[str, Any]:
    """Convert to dictionary representation."""
    return {
        "name": self.name,
        "description": self.description,
        "version": self.version,
        "tags": self.tags,
        "variables": [v.to_dict() for v in self.variables],
        "structure": self.structure,
    }
from_dict classmethod
from_dict(data: dict[str, Any]) -> ConfigTemplate

Create a ConfigTemplate from a dictionary.

Parameters:

Name Type Description Default
data dict[str, Any]

Dictionary with template fields.

required

Returns:

Type Description
ConfigTemplate

A new ConfigTemplate instance.

Source code in packages/bots/src/dataknobs_bots/config/templates.py
@classmethod
def from_dict(cls, data: dict[str, Any]) -> ConfigTemplate:
    """Create a ConfigTemplate from a dictionary.

    Args:
        data: Dictionary with template fields.

    Returns:
        A new ConfigTemplate instance.
    """
    variables = [
        TemplateVariable.from_dict(v) for v in data.get("variables", [])
    ]
    return cls(
        name=data.get("name", ""),
        description=data.get("description", ""),
        version=data.get("version", "1.0.0"),
        tags=data.get("tags", []),
        variables=variables,
        structure=data.get("structure", {}),
    )
from_yaml_file classmethod
from_yaml_file(path: Path) -> ConfigTemplate

Load a ConfigTemplate from a YAML file.

Parameters:

Name Type Description Default
path Path

Path to the YAML file.

required

Returns:

Type Description
ConfigTemplate

A new ConfigTemplate instance.

Raises:

Type Description
FileNotFoundError

If the file does not exist.

YAMLError

If the file is not valid YAML.

Source code in packages/bots/src/dataknobs_bots/config/templates.py
@classmethod
def from_yaml_file(cls, path: Path) -> ConfigTemplate:
    """Load a ConfigTemplate from a YAML file.

    Args:
        path: Path to the YAML file.

    Returns:
        A new ConfigTemplate instance.

    Raises:
        FileNotFoundError: If the file does not exist.
        yaml.YAMLError: If the file is not valid YAML.
    """
    with open(path) as f:
        data = yaml.safe_load(f)
    if data is None:
        data = {}
    template = cls.from_dict(data)
    if not template.name:
        template.name = path.stem.replace("-", "_")
    return template

ConfigTemplateRegistry

ConfigTemplateRegistry()

Registry for managing and applying configuration templates.

Supports registration, tag-based filtering, variable validation, and template application with variable substitution.

Methods:

Name Description
register

Register a template.

get

Get a template by name.

list_templates

List templates, optionally filtered by tags.

load_from_file

Load and register a template from a YAML file.

load_from_directory

Load and register all templates from a directory.

apply_template

Apply a template with variable substitution.

validate_variables

Validate variables against a template's requirements.

Source code in packages/bots/src/dataknobs_bots/config/templates.py
def __init__(self) -> None:
    self._templates: dict[str, ConfigTemplate] = {}
Functions
register
register(template: ConfigTemplate) -> None

Register a template.

Parameters:

Name Type Description Default
template ConfigTemplate

The template to register.

required
Source code in packages/bots/src/dataknobs_bots/config/templates.py
def register(self, template: ConfigTemplate) -> None:
    """Register a template.

    Args:
        template: The template to register.
    """
    self._templates[template.name] = template
    logger.debug("Registered template: %s", template.name)
get
get(name: str) -> ConfigTemplate | None

Get a template by name.

Parameters:

Name Type Description Default
name str

Template name.

required

Returns:

Type Description
ConfigTemplate | None

The template, or None if not found.

Source code in packages/bots/src/dataknobs_bots/config/templates.py
def get(self, name: str) -> ConfigTemplate | None:
    """Get a template by name.

    Args:
        name: Template name.

    Returns:
        The template, or None if not found.
    """
    return self._templates.get(name)
list_templates
list_templates(tags: list[str] | None = None) -> list[ConfigTemplate]

List templates, optionally filtered by tags.

Parameters:

Name Type Description Default
tags list[str] | None

If provided, only return templates that have all specified tags.

None

Returns:

Type Description
list[ConfigTemplate]

List of matching templates.

Source code in packages/bots/src/dataknobs_bots/config/templates.py
def list_templates(
    self, tags: list[str] | None = None
) -> list[ConfigTemplate]:
    """List templates, optionally filtered by tags.

    Args:
        tags: If provided, only return templates that have all specified tags.

    Returns:
        List of matching templates.
    """
    templates = list(self._templates.values())
    if tags:
        tag_set = set(tags)
        templates = [t for t in templates if tag_set.issubset(set(t.tags))]
    return templates
load_from_file
load_from_file(path: Path) -> ConfigTemplate

Load and register a template from a YAML file.

Parameters:

Name Type Description Default
path Path

Path to the YAML file.

required

Returns:

Type Description
ConfigTemplate

The loaded template.

Source code in packages/bots/src/dataknobs_bots/config/templates.py
def load_from_file(self, path: Path) -> ConfigTemplate:
    """Load and register a template from a YAML file.

    Args:
        path: Path to the YAML file.

    Returns:
        The loaded template.
    """
    template = ConfigTemplate.from_yaml_file(path)
    self.register(template)
    return template
load_from_directory
load_from_directory(directory: Path) -> int

Load and register all templates from a directory.

Scans for *.yaml and *.yml files, skipping files named README or base.

Parameters:

Name Type Description Default
directory Path

Directory to scan.

required

Returns:

Type Description
int

Number of templates loaded.

Source code in packages/bots/src/dataknobs_bots/config/templates.py
def load_from_directory(self, directory: Path) -> int:
    """Load and register all templates from a directory.

    Scans for ``*.yaml`` and ``*.yml`` files, skipping files named
    ``README`` or ``base``.

    Args:
        directory: Directory to scan.

    Returns:
        Number of templates loaded.
    """
    count = 0
    for ext in ("*.yaml", "*.yml"):
        for path in sorted(directory.glob(ext)):
            if path.stem.lower() in ("readme", "base"):
                continue
            try:
                self.load_from_file(path)
                count += 1
            except Exception:
                logger.exception("Failed to load template from %s", path)
    return count
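Directory loading skips README and base files (matched case-insensitively by stem) before attempting a load. The filtering alone can be sketched as follows; the file names are hypothetical:

```python
import tempfile
from pathlib import Path

SKIP_STEMS = ("readme", "base")

def template_paths(directory: Path) -> list[Path]:
    """Collect *.yaml then *.yml paths, skipping README/base stems."""
    paths: list[Path] = []
    for ext in ("*.yaml", "*.yml"):
        for path in sorted(directory.glob(ext)):
            if path.stem.lower() in SKIP_STEMS:
                continue
            paths.append(path)
    return paths

with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    for name in ("README.yaml", "base.yaml", "support-bot.yaml", "faq.yml"):
        (d / name).touch()
    names = [p.name for p in template_paths(d)]

assert names == ["support-bot.yaml", "faq.yml"]
```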
apply_template
apply_template(name: str, variables: dict[str, Any]) -> dict[str, Any]

Apply a template with variable substitution.

Deep-copies the template structure and substitutes all {{var}} placeholders with values from the variables dict.

Parameters:

Name Type Description Default
name str

Template name.

required
variables dict[str, Any]

Variable values to substitute.

required

Returns:

Type Description
dict[str, Any]

The resolved configuration dict.

Raises:

Type Description
KeyError

If the template is not found.

Source code in packages/bots/src/dataknobs_bots/config/templates.py
def apply_template(
    self,
    name: str,
    variables: dict[str, Any],
) -> dict[str, Any]:
    """Apply a template with variable substitution.

    Deep-copies the template structure and substitutes all ``{{var}}``
    placeholders with values from the variables dict.

    Args:
        name: Template name.
        variables: Variable values to substitute.

    Returns:
        The resolved configuration dict.

    Raises:
        KeyError: If the template is not found.
    """
    template = self._templates.get(name)
    if template is None:
        raise KeyError(f"Template not found: {name}")

    # Build full variable map: user values + defaults
    var_map = _build_variable_map(template, variables)

    structure = copy.deepcopy(template.structure)
    result: dict[str, Any] = substitute_template_vars(
        structure, var_map, preserve_missing=True
    )
    return result
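The substitution step — deep-copy, then replace each {{var}} token — can be mirrored with a small recursive walker. This stand-in reflects an assumption about substitute_template_vars (string-level replacement, missing placeholders preserved); it is not the library function itself:

```python
import copy
import re
from typing import Any

_VAR = re.compile(r"\{\{(\w+)\}\}")

def substitute(value: Any, variables: dict[str, Any]) -> Any:
    # Recurse into dicts/lists; replace {{name}} in strings, keep unknown ones.
    if isinstance(value, dict):
        return {k: substitute(v, variables) for k, v in value.items()}
    if isinstance(value, list):
        return [substitute(v, variables) for v in value]
    if isinstance(value, str):
        return _VAR.sub(
            lambda m: str(variables.get(m.group(1), m.group(0))), value
        )
    return value

structure = {"llm": {"provider": "{{provider}}", "model": "{{model}}"}}
resolved = substitute(copy.deepcopy(structure), {"provider": "openai"})
assert resolved["llm"]["provider"] == "openai"
assert resolved["llm"]["model"] == "{{model}}"  # missing var preserved
```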
validate_variables
validate_variables(name: str, variables: dict[str, Any]) -> ValidationResult

Validate variables against a template's requirements.

Checks that required variables are present and that values match any defined choices constraints.

Parameters:

Name Type Description Default
name str

Template name.

required
variables dict[str, Any]

Variable values to validate.

required

Returns:

Type Description
ValidationResult

ValidationResult with any issues found.

Source code in packages/bots/src/dataknobs_bots/config/templates.py
def validate_variables(
    self,
    name: str,
    variables: dict[str, Any],
) -> ValidationResult:
    """Validate variables against a template's requirements.

    Checks that required variables are present and that values
    match any defined choices constraints.

    Args:
        name: Template name.
        variables: Variable values to validate.

    Returns:
        ValidationResult with any issues found.
    """
    template = self._templates.get(name)
    if template is None:
        return ValidationResult.error(f"Template not found: {name}")

    result = ValidationResult.ok()

    for var in template.variables:
        if var.required and var.name not in variables:
            if var.default is None:
                result = result.merge(
                    ValidationResult.error(
                        f"Missing required variable: {var.name}"
                    )
                )
        if var.choices is not None and var.name in variables:
            value = variables[var.name]
            if value not in var.choices:
                result = result.merge(
                    ValidationResult.error(
                        f"Variable '{var.name}' has invalid value '{value}'. "
                        f"Valid choices: {var.choices}"
                    )
                )

    return result
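The two checks above — required variables present (unless they have a default) and values within choices — can be sketched as a plain error collector. The spec entries mirror the TemplateVariable fields; the variable names are illustrative:

```python
from typing import Any

def check_variables(spec: list[dict], variables: dict[str, Any]) -> list[str]:
    """Collect error strings for missing required vars and invalid choices."""
    errors: list[str] = []
    for var in spec:
        name = var["name"]
        if var.get("required") and name not in variables and var.get("default") is None:
            errors.append(f"Missing required variable: {name}")
        choices = var.get("choices")
        if choices is not None and name in variables and variables[name] not in choices:
            errors.append(
                f"Variable '{name}' has invalid value '{variables[name]}'. "
                f"Valid choices: {choices}"
            )
    return errors

spec = [
    {"name": "provider", "required": True, "choices": ["openai", "anthropic"]},
    {"name": "temperature", "required": False, "default": 0.7},
]
assert check_variables(spec, {"provider": "openai"}) == []
assert check_variables(spec, {}) == ["Missing required variable: provider"]
assert "invalid value" in check_variables(spec, {"provider": "groq"})[0]
```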

ConfigValidator

ConfigValidator(schema: DynaBotConfigSchema | None = None)

Pluggable validation engine for DynaBot configurations.

Runs a pipeline of validators against a config dict and collects all errors and warnings into a single ValidationResult.

Example
validator = ConfigValidator()

# Add custom validator
def check_api_key(config):
    if "api_key" in str(config):
        return ValidationResult.warning("Config contains an API key")
    return ValidationResult.ok()

validator.register_validator("api_key_check", check_api_key)
result = validator.validate(my_config)

Initialize the validator.

Parameters:

Name Type Description Default
schema DynaBotConfigSchema | None

Optional config schema for schema-based validation.

None

Methods:

Name Description
register_validator

Register a named validation function.

validate

Run all validators against a configuration.

validate_completeness

Check that a config has the minimum required fields.

validate_portability

Check that a config is portable across environments.

validate_component

Validate a specific component section of the config.

Source code in packages/bots/src/dataknobs_bots/config/validation.py
def __init__(self, schema: DynaBotConfigSchema | None = None) -> None:
    """Initialize the validator.

    Args:
        schema: Optional config schema for schema-based validation.
    """
    self._schema = schema
    self._validators: dict[str, ValidatorFn] = {}
Functions
register_validator
register_validator(name: str, validator: ValidatorFn) -> None

Register a named validation function.

Parameters:

Name Type Description Default
name str

Unique name for this validator.

required
validator ValidatorFn

Function that takes a config dict and returns ValidationResult.

required
Source code in packages/bots/src/dataknobs_bots/config/validation.py
def register_validator(self, name: str, validator: ValidatorFn) -> None:
    """Register a named validation function.

    Args:
        name: Unique name for this validator.
        validator: Function that takes a config dict and returns ValidationResult.
    """
    self._validators[name] = validator
    logger.debug("Registered validator: %s", name)
validate
validate(config: dict[str, Any]) -> ValidationResult

Run all validators against a configuration.

Runs completeness check, schema validation (if schema provided), and all registered custom validators.

Parameters:

Name Type Description Default
config dict[str, Any]

Configuration dictionary to validate.

required

Returns:

Type Description
ValidationResult

Merged ValidationResult from all validators.

Source code in packages/bots/src/dataknobs_bots/config/validation.py
def validate(self, config: dict[str, Any]) -> ValidationResult:
    """Run all validators against a configuration.

    Runs completeness check, schema validation (if schema provided),
    and all registered custom validators.

    Args:
        config: Configuration dictionary to validate.

    Returns:
        Merged ValidationResult from all validators.
    """
    result = self.validate_completeness(config)

    if self._schema is not None:
        result = result.merge(self._schema.validate(config))

    for name, validator in self._validators.items():
        try:
            result = result.merge(validator(config))
        except Exception:
            logger.exception("Validator '%s' raised an exception", name)
            result = result.merge(
                ValidationResult.error(f"Validator '{name}' failed with an error")
            )

    return result
validate_completeness
validate_completeness(config: dict[str, Any]) -> ValidationResult

Check that a config has the minimum required fields.

A valid DynaBot config must have at minimum an LLM configuration and conversation storage configuration.

Parameters:

Name Type Description Default
config dict[str, Any]

Configuration dictionary to check.

required

Returns:

Type Description
ValidationResult

ValidationResult with errors for missing required fields.

Source code in packages/bots/src/dataknobs_bots/config/validation.py
def validate_completeness(self, config: dict[str, Any]) -> ValidationResult:
    """Check that a config has the minimum required fields.

    A valid DynaBot config must have at minimum an LLM configuration
    and conversation storage configuration.

    Args:
        config: Configuration dictionary to check.

    Returns:
        ValidationResult with errors for missing required fields.
    """
    result = ValidationResult.ok()

    # Check for LLM config (flat or portable format)
    bot = config.get("bot", config)
    has_llm = "llm" in bot
    if not has_llm:
        result = result.merge(
            ValidationResult.error(
                "Missing required 'llm' configuration. "
                "Set llm.provider and llm.model, or use a $resource reference."
            )
        )

    # Check for conversation storage
    has_storage = "conversation_storage" in bot
    if not has_storage:
        result = result.merge(
            ValidationResult.error(
                "Missing required 'conversation_storage' configuration. "
                "Set conversation_storage.backend, "
                "conversation_storage.storage_class, "
                "or use a $resource reference."
            )
        )

    return result
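The minimum-fields rule accepts both the flat format and the portable format (where settings live under a bot key). A standalone sketch of that lookup; the `$resource` value string below is only a placeholder, not the documented reference syntax:

```python
from typing import Any

REQUIRED_SECTIONS = ("llm", "conversation_storage")

def missing_sections(config: dict[str, Any]) -> list[str]:
    # Portable configs nest settings under "bot"; flat configs do not.
    bot = config.get("bot", config)
    return [section for section in REQUIRED_SECTIONS if section not in bot]

flat = {"llm": {"provider": "openai"}, "conversation_storage": {"backend": "memory"}}
portable = {"bot": {"llm": "$resource-placeholder"}}
assert missing_sections(flat) == []
assert missing_sections(portable) == ["conversation_storage"]
assert missing_sections({}) == ["llm", "conversation_storage"]
```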
validate_portability
validate_portability(config: dict[str, Any]) -> ValidationResult

Check that a config is portable across environments.

Wraps the portability checker from registry.portability to return a ValidationResult instead of raising exceptions.

Parameters:

Name Type Description Default
config dict[str, Any]

Configuration dictionary to check.

required

Returns:

Type Description
ValidationResult

ValidationResult with portability issues as warnings.

Source code in packages/bots/src/dataknobs_bots/config/validation.py
def validate_portability(self, config: dict[str, Any]) -> ValidationResult:
    """Check that a config is portable across environments.

    Wraps the portability checker from registry.portability to return
    a ValidationResult instead of raising exceptions.

    Args:
        config: Configuration dictionary to check.

    Returns:
        ValidationResult with portability issues as warnings.
    """
    try:
        issues = validate_portability(config, raise_on_error=False)
    except PortabilityError as e:
        return ValidationResult.error(str(e))

    if issues:
        return ValidationResult(
            valid=True,
            warnings=[f"Portability: {issue}" for issue in issues],
        )
    return ValidationResult.ok()
validate_component
validate_component(component: str, config: dict[str, Any]) -> ValidationResult

Validate a specific component section of the config.

Parameters:

Name Type Description Default
component str

Component name (e.g., 'llm', 'memory').

required
config dict[str, Any]

The component's configuration dictionary.

required

Returns:

Type Description
ValidationResult

ValidationResult for that component.

Source code in packages/bots/src/dataknobs_bots/config/validation.py
def validate_component(
    self, component: str, config: dict[str, Any]
) -> ValidationResult:
    """Validate a specific component section of the config.

    Args:
        component: Component name (e.g., 'llm', 'memory').
        config: The component's configuration dictionary.

    Returns:
        ValidationResult for that component.
    """
    if self._schema is None:
        return ValidationResult.ok()

    schema = self._schema.get_component_schema(component)
    if schema is None:
        return ValidationResult.warning(
            f"No schema registered for component '{component}'"
        )

    return _validate_against_schema(component, config, schema)

DraftMetadata dataclass

DraftMetadata(
    draft_id: str,
    created_at: str,
    last_updated: str,
    stage: str | None = None,
    complete: bool = False,
    config_name: str | None = None,
)

Metadata for a configuration draft.

Attributes:

Name Type Description
draft_id str

Unique identifier for the draft.

created_at str

ISO 8601 creation timestamp.

last_updated str

ISO 8601 last update timestamp.

stage str | None

Current wizard stage when draft was saved.

complete bool

Whether the draft represents a complete config.

config_name str | None

Optional name for the final config file.

Methods:

Name Description
to_dict

Convert to dictionary representation.

from_dict

Create DraftMetadata from a dictionary.

Functions
to_dict
to_dict() -> dict[str, Any]

Convert to dictionary representation.

Source code in packages/bots/src/dataknobs_bots/config/drafts.py
def to_dict(self) -> dict[str, Any]:
    """Convert to dictionary representation."""
    result: dict[str, Any] = {
        "id": self.draft_id,
        "created_at": self.created_at,
        "last_updated": self.last_updated,
        "complete": self.complete,
    }
    if self.stage is not None:
        result["stage"] = self.stage
    if self.config_name is not None:
        result["config_name"] = self.config_name
    return result
from_dict classmethod
from_dict(data: dict[str, Any]) -> DraftMetadata

Create DraftMetadata from a dictionary.

Parameters:

Name Type Description Default
data dict[str, Any]

Dictionary with metadata fields.

required

Returns:

Type Description
DraftMetadata

A new DraftMetadata instance.

Source code in packages/bots/src/dataknobs_bots/config/drafts.py
@classmethod
def from_dict(cls, data: dict[str, Any]) -> DraftMetadata:
    """Create DraftMetadata from a dictionary.

    Args:
        data: Dictionary with metadata fields.

    Returns:
        A new DraftMetadata instance.
    """
    return cls(
        draft_id=data.get("id", data.get("draft_id", "")),
        created_at=data.get("created_at", ""),
        last_updated=data.get("last_updated", ""),
        stage=data.get("stage"),
        complete=data.get("complete", False),
        config_name=data.get("config_name"),
    )
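Note the key asymmetry above: to_dict writes the identifier under id, while from_dict accepts either id or draft_id. A minimal mirror of that roundtrip (a cut-down stand-in, not the real DraftMetadata):

```python
from dataclasses import dataclass

@dataclass
class Meta:
    """Minimal stand-in for DraftMetadata covering the id/draft_id asymmetry."""
    draft_id: str
    created_at: str = ""
    last_updated: str = ""
    complete: bool = False

    def to_dict(self) -> dict:
        # Serializes the identifier under "id", matching DraftMetadata.to_dict.
        return {"id": self.draft_id, "created_at": self.created_at,
                "last_updated": self.last_updated, "complete": self.complete}

    @classmethod
    def from_dict(cls, data: dict) -> "Meta":
        # Accepts "id" (preferred) or legacy "draft_id".
        return cls(
            draft_id=data.get("id", data.get("draft_id", "")),
            created_at=data.get("created_at", ""),
            last_updated=data.get("last_updated", ""),
            complete=data.get("complete", False),
        )

m = Meta("abc12345", "2024-01-01T00:00:00+00:00", "2024-01-02T00:00:00+00:00")
assert Meta.from_dict(m.to_dict()) == m          # roundtrip is lossless
assert Meta.from_dict({"draft_id": "abc12345"}).draft_id == "abc12345"
```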

DynaBotConfigBuilder

DynaBotConfigBuilder(schema: DynaBotConfigSchema | None = None)

Fluent builder for DynaBot configurations.

Provides setter methods for each DynaBot component that return self for method chaining. Consumer-specific sections are added via set_custom_section().

Two output formats:

- build() returns the flat format compatible with DynaBot.from_config()
- build_portable() returns the environment-aware format with $resource references and a bot wrapper key

Initialize the builder.

Parameters:

Name Type Description Default
schema DynaBotConfigSchema | None

Optional schema for validation. If not provided, a default schema is created.

None

Methods:

Name Description
set_llm

Set the LLM provider configuration (flat/direct format).

set_llm_resource

Set the LLM configuration using a $resource reference.

set_conversation_storage

Set the conversation storage backend (flat/direct format).

set_conversation_storage_resource

Set conversation storage using a $resource reference.

set_conversation_storage_class

Set conversation storage using a custom ConversationStorage class.

set_memory

Set the memory configuration.

set_config_base_path

Set base path for resolving relative config file paths.

set_reasoning

Set the reasoning strategy.

set_reasoning_wizard

Set wizard reasoning with a config path, inline dict, or WizardConfig.

set_system_prompt

Set the system prompt configuration.

set_knowledge_base

Set the knowledge base configuration.

add_tool

Add a tool to the bot configuration.

add_tool_by_name

Add a tool to the config by looking up its catalog entry.

add_tools_by_name

Add multiple tools by name from the catalog.

add_middleware

Add middleware to the bot configuration.

set_custom_section

Set a custom (domain-specific) config section.

from_template

Initialize the builder from a template.

merge_overrides

Merge override values into the current configuration.

validate

Validate the current configuration.

build

Build the flat configuration dict.

build_portable

Build the portable configuration with $resource references.

to_yaml

Serialize the portable configuration as YAML.

reset

Reset the builder to an empty state.

from_config

Create a builder pre-populated from an existing config.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def __init__(self, schema: DynaBotConfigSchema | None = None) -> None:
    """Initialize the builder.

    Args:
        schema: Optional schema for validation. If not provided, a
            default schema is created.
    """
    self._schema = schema or DynaBotConfigSchema()
    self._config: dict[str, Any] = {}
    self._custom_sections: dict[str, Any] = {}
    self._validator = ConfigValidator(self._schema)
Functions
set_llm
set_llm(provider: str, model: str | None = None, **kwargs: Any) -> Self

Set the LLM provider configuration (flat/direct format).

Parameters:

Name Type Description Default
provider str

LLM provider name (e.g., 'ollama', 'openai').

required
model str | None

Model name or identifier.

None
**kwargs Any

Additional provider-specific settings (temperature, max_tokens, etc.).

{}

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def set_llm(
    self,
    provider: str,
    model: str | None = None,
    **kwargs: Any,
) -> Self:
    """Set the LLM provider configuration (flat/direct format).

    Args:
        provider: LLM provider name (e.g., 'ollama', 'openai').
        model: Model name or identifier.
        **kwargs: Additional provider-specific settings
            (temperature, max_tokens, etc.).

    Returns:
        self for method chaining.
    """
    llm_config: dict[str, Any] = {"provider": provider}
    if model is not None:
        llm_config["model"] = model
    llm_config.update(kwargs)
    self._config["llm"] = llm_config
    return self
set_llm_resource
set_llm_resource(
    resource_name: str = "default",
    resource_type: str = "llm_providers",
    **overrides: Any,
) -> Self

Set the LLM configuration using a $resource reference.

Parameters:

Name Type Description Default
resource_name str

Resource name to resolve at runtime.

'default'
resource_type str

Resource type category.

'llm_providers'
**overrides Any

Override values applied after resolution.

{}

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def set_llm_resource(
    self,
    resource_name: str = "default",
    resource_type: str = "llm_providers",
    **overrides: Any,
) -> Self:
    """Set the LLM configuration using a $resource reference.

    Args:
        resource_name: Resource name to resolve at runtime.
        resource_type: Resource type category.
        **overrides: Override values applied after resolution.

    Returns:
        self for method chaining.
    """
    llm_config: dict[str, Any] = {
        "$resource": resource_name,
        "type": resource_type,
    }
    llm_config.update(overrides)
    self._config["llm"] = llm_config
    return self
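The resulting section carries a `$resource` marker that is resolved at runtime, with overrides layered on top after resolution. A sketch of the dict shape produced by the source above:

```python
# Shape produced by set_llm_resource("default", temperature=0.0);
# the override value here is illustrative.
llm_config = {"$resource": "default", "type": "llm_providers"}
llm_config.update({"temperature": 0.0})  # overrides applied after resolution
assert llm_config == {
    "$resource": "default",
    "type": "llm_providers",
    "temperature": 0.0,
}
```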
set_conversation_storage
set_conversation_storage(backend: str, **kwargs: Any) -> Self

Set the conversation storage backend (flat/direct format).

Parameters:

Name Type Description Default
backend str

Storage backend name (e.g., 'memory', 'sqlite').

required
**kwargs Any

Additional backend-specific settings.

{}

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def set_conversation_storage(
    self,
    backend: str,
    **kwargs: Any,
) -> Self:
    """Set the conversation storage backend (flat/direct format).

    Args:
        backend: Storage backend name (e.g., 'memory', 'sqlite').
        **kwargs: Additional backend-specific settings.

    Returns:
        self for method chaining.
    """
    storage_config: dict[str, Any] = {"backend": backend}
    storage_config.update(kwargs)
    self._config["conversation_storage"] = storage_config
    return self
set_conversation_storage_resource
set_conversation_storage_resource(
    resource_name: str = "conversations",
    resource_type: str = "databases",
    **overrides: Any,
) -> Self

Set conversation storage using a $resource reference.

Parameters:

Name Type Description Default
resource_name str

Resource name to resolve at runtime.

'conversations'
resource_type str

Resource type category.

'databases'
**overrides Any

Override values applied after resolution.

{}

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def set_conversation_storage_resource(
    self,
    resource_name: str = "conversations",
    resource_type: str = "databases",
    **overrides: Any,
) -> Self:
    """Set conversation storage using a $resource reference.

    Args:
        resource_name: Resource name to resolve at runtime.
        resource_type: Resource type category.
        **overrides: Override values applied after resolution.

    Returns:
        self for method chaining.
    """
    storage_config: dict[str, Any] = {
        "$resource": resource_name,
        "type": resource_type,
    }
    storage_config.update(overrides)
    self._config["conversation_storage"] = storage_config
    return self
set_conversation_storage_class
set_conversation_storage_class(storage_class: str, **kwargs: Any) -> Self

Set conversation storage using a custom ConversationStorage class.

The class must implement ConversationStorage and provide an async create(config: dict) -> ConversationStorage classmethod.

Parameters:

Name Type Description Default
storage_class str

Dotted import path to the storage class (e.g., "myapp.storage:AcmeConversationStorage").

required
**kwargs Any

Additional config passed to create().

{}

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def set_conversation_storage_class(
    self,
    storage_class: str,
    **kwargs: Any,
) -> Self:
    """Set conversation storage using a custom ConversationStorage class.

    The class must implement ``ConversationStorage`` and provide an async
    ``create(config: dict) -> ConversationStorage`` classmethod.

    Args:
        storage_class: Dotted import path to the storage class
            (e.g., ``"myapp.storage:AcmeConversationStorage"``).
        **kwargs: Additional config passed to ``create()``.

    Returns:
        self for method chaining.
    """
    storage_config: dict[str, Any] = {"storage_class": storage_class}
    storage_config.update(kwargs)
    self._config["conversation_storage"] = storage_config
    return self
set_memory
set_memory(memory_type: str, **kwargs: Any) -> Self

Set the memory configuration.

Parameters:

Name Type Description Default
memory_type str

Memory type (e.g., 'buffer', 'vector').

required
**kwargs Any

Additional memory settings (max_messages, etc.).

{}

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def set_memory(self, memory_type: str, **kwargs: Any) -> Self:
    """Set the memory configuration.

    Args:
        memory_type: Memory type (e.g., 'buffer', 'vector').
        **kwargs: Additional memory settings (max_messages, etc.).

    Returns:
        self for method chaining.
    """
    memory_config: dict[str, Any] = {"type": memory_type}
    memory_config.update(kwargs)
    self._config["memory"] = memory_config
    return self
set_config_base_path
set_config_base_path(path: str | Path) -> Self

Set base path for resolving relative config file paths.

When set, relative paths in nested configs (e.g. wizard_config) are resolved against this directory instead of the current working directory.

Parameters:

Name Type Description Default
path str | Path

Base directory path (string or Path object).

required

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def set_config_base_path(self, path: str | Path) -> Self:
    """Set base path for resolving relative config file paths.

    When set, relative paths in nested configs (e.g. ``wizard_config``)
    are resolved against this directory instead of the current working
    directory.

    Args:
        path: Base directory path (string or Path object).

    Returns:
        self for method chaining.
    """
    self._config["config_base_path"] = str(path)
    return self
set_reasoning
set_reasoning(strategy: str, **kwargs: Any) -> Self

Set the reasoning strategy.

Parameters:

Name Type Description Default
strategy str

Reasoning strategy (e.g., 'simple', 'react', 'wizard').

required
**kwargs Any

Additional strategy settings.

{}

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def set_reasoning(self, strategy: str, **kwargs: Any) -> Self:
    """Set the reasoning strategy.

    Args:
        strategy: Reasoning strategy (e.g., 'simple', 'react', 'wizard').
        **kwargs: Additional strategy settings.

    Returns:
        self for method chaining.
    """
    reasoning_config: dict[str, Any] = {"strategy": strategy}
    reasoning_config.update(kwargs)
    self._config["reasoning"] = reasoning_config
    return self
set_reasoning_wizard
set_reasoning_wizard(
    wizard_config: str | dict[str, Any] | WizardConfig, **kwargs: Any
) -> Self

Set wizard reasoning with a config path, inline dict, or WizardConfig.

When wizard_config is a WizardConfig object, the caller is responsible for writing it to disk via wizard_config.to_file() before the bot loads.

When wizard_config is a dict, it is stored inline in the reasoning config and loaded via WizardConfigLoader.load_from_dict() at bot startup.

Parameters:

Name Type Description Default
wizard_config str | dict[str, Any] | WizardConfig

Path to a wizard YAML file, an inline dict (compatible with WizardConfigLoader.load_from_dict()), or a WizardConfig object (whose name is used as the config path identifier).

required
**kwargs Any

Additional reasoning settings (extraction_config, etc.).

{}

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def set_reasoning_wizard(
    self,
    wizard_config: str | dict[str, Any] | WizardConfig,
    **kwargs: Any,
) -> Self:
    """Set wizard reasoning with a config path, inline dict, or WizardConfig.

    When ``wizard_config`` is a ``WizardConfig`` object, the caller
    is responsible for writing it to disk via ``wizard_config.to_file()``
    before the bot loads.

    When ``wizard_config`` is a ``dict``, it is stored inline in the
    reasoning config and loaded via
    ``WizardConfigLoader.load_from_dict()`` at bot startup.

    Args:
        wizard_config: Path to a wizard YAML file, an inline dict
            (compatible with ``WizardConfigLoader.load_from_dict()``),
            or a ``WizardConfig`` object (whose ``name`` is used as
            the config path identifier).
        **kwargs: Additional reasoning settings
            (extraction_config, etc.).

    Returns:
        self for method chaining.
    """
    if isinstance(wizard_config, dict):
        return self.set_reasoning(
            "wizard", wizard_config=wizard_config, **kwargs
        )
    config_path = (
        wizard_config
        if isinstance(wizard_config, str)
        else wizard_config.name
    )
    return self.set_reasoning(
        "wizard", wizard_config=config_path, **kwargs
    )
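The dispatch above can be summarized: dicts are stored inline, strings are treated as file paths, and `WizardConfig` objects contribute their `name`. A stand-in sketch of that routing (the wizard path and inline dict contents are illustrative):

```python
from typing import Any


def wizard_reasoning_section(wizard_config: Any) -> dict[str, Any]:
    # Mirrors set_reasoning_wizard: inline dicts pass through as-is;
    # anything else is reduced to a path-like identifier.
    if isinstance(wizard_config, dict):
        return {"strategy": "wizard", "wizard_config": wizard_config}
    path = wizard_config if isinstance(wizard_config, str) else wizard_config.name
    return {"strategy": "wizard", "wizard_config": path}


assert wizard_reasoning_section("wizards/intake.yaml") == {
    "strategy": "wizard",
    "wizard_config": "wizards/intake.yaml",
}
inline = {"states": [{"name": "start"}]}
assert wizard_reasoning_section(inline)["wizard_config"] is inline
```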
set_system_prompt
set_system_prompt(
    content: str | None = None,
    name: str | None = None,
    rag_configs: list[dict[str, Any]] | None = None,
) -> Self

Set the system prompt configuration.

Provide either content (inline prompt) or name (template reference). Optionally add RAG configurations for prompt enhancement.

Parameters:

Name Type Description Default
content str | None

Inline prompt content.

None
name str | None

Prompt template name.

None
rag_configs list[dict[str, Any]] | None

RAG configurations for prompt enhancement.

None

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def set_system_prompt(
    self,
    content: str | None = None,
    name: str | None = None,
    rag_configs: list[dict[str, Any]] | None = None,
) -> Self:
    """Set the system prompt configuration.

    Provide either ``content`` (inline prompt) or ``name`` (template
    reference). Optionally add RAG configurations for prompt enhancement.

    Args:
        content: Inline prompt content.
        name: Prompt template name.
        rag_configs: RAG configurations for prompt enhancement.

    Returns:
        self for method chaining.
    """
    if content is not None and name is None and rag_configs is None:
        self._config["system_prompt"] = content
    else:
        prompt_config: dict[str, Any] = {}
        if content is not None:
            prompt_config["content"] = content
        if name is not None:
            prompt_config["name"] = name
        if rag_configs is not None:
            prompt_config["rag_configs"] = rag_configs
        self._config["system_prompt"] = prompt_config
    return self
set_knowledge_base
set_knowledge_base(**kwargs: Any) -> Self

Set the knowledge base configuration.

Parameters:

Name Type Description Default
**kwargs Any

Knowledge base settings (enabled, type, vector_store, embedding, etc.).

{}

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def set_knowledge_base(self, **kwargs: Any) -> Self:
    """Set the knowledge base configuration.

    Args:
        **kwargs: Knowledge base settings (enabled, type,
            vector_store, embedding, etc.).

    Returns:
        self for method chaining.
    """
    self._config["knowledge_base"] = dict(kwargs)
    return self
add_tool
add_tool(tool_class: str, **params: Any) -> Self

Add a tool to the bot configuration.

Parameters:

Name Type Description Default
tool_class str

Fully qualified tool class name.

required
**params Any

Tool constructor parameters.

{}

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def add_tool(self, tool_class: str, **params: Any) -> Self:
    """Add a tool to the bot configuration.

    Args:
        tool_class: Fully qualified tool class name.
        **params: Tool constructor parameters.

    Returns:
        self for method chaining.
    """
    tools = self._config.setdefault("tools", [])
    tool_entry: dict[str, Any] = {"class": tool_class}
    if params:
        tool_entry["params"] = dict(params)
    tools.append(tool_entry)
    return self
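Each tool entry is a dict with a `class` key, and a `params` key is added only when parameters are supplied. A sketch of the entry shape (the tool class path here is hypothetical, not a class shipped by the package):

```python
from typing import Any

# Entry shape appended to config["tools"] by add_tool; the class
# path below is a made-up example.
tool_entry: dict[str, Any] = {"class": "myapp.tools.CalculatorTool"}
params = {"precision": 4}
if params:
    tool_entry["params"] = dict(params)

assert tool_entry == {
    "class": "myapp.tools.CalculatorTool",
    "params": {"precision": 4},
}
```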
add_tool_by_name
add_tool_by_name(
    catalog: ToolCatalog, name: str, **param_overrides: Any
) -> Self

Add a tool to the config by looking up its catalog entry.

Resolves the tool name to a class path via the catalog and adds it with default params (overridable).

Parameters:

Name Type Description Default
catalog ToolCatalog

Tool catalog for name resolution.

required
name str

Tool name to look up.

required
**param_overrides Any

Override default params.

{}

Returns:

Type Description
Self

self for method chaining.

Raises:

Type Description
NotFoundError

If tool name is not in the catalog.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def add_tool_by_name(
    self,
    catalog: ToolCatalog,
    name: str,
    **param_overrides: Any,
) -> Self:
    """Add a tool to the config by looking up its catalog entry.

    Resolves the tool name to a class path via the catalog and adds
    it with default params (overridable).

    Args:
        catalog: Tool catalog for name resolution.
        name: Tool name to look up.
        **param_overrides: Override default params.

    Returns:
        self for method chaining.

    Raises:
        NotFoundError: If tool name is not in the catalog.
    """
    config = catalog.to_bot_config(name, **param_overrides)
    tools = self._config.setdefault("tools", [])
    tools.append(config)
    return self
add_tools_by_name
add_tools_by_name(
    catalog: ToolCatalog,
    names: Sequence[str],
    overrides: dict[str, dict[str, Any]] | None = None,
) -> Self

Add multiple tools by name from the catalog.

Parameters:

Name Type Description Default
catalog ToolCatalog

Tool catalog for name resolution.

required
names Sequence[str]

Tool names to add.

required
overrides dict[str, dict[str, Any]] | None

Per-tool param overrides keyed by tool name.

None

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def add_tools_by_name(
    self,
    catalog: ToolCatalog,
    names: Sequence[str],
    overrides: dict[str, dict[str, Any]] | None = None,
) -> Self:
    """Add multiple tools by name from the catalog.

    Args:
        catalog: Tool catalog for name resolution.
        names: Tool names to add.
        overrides: Per-tool param overrides keyed by tool name.

    Returns:
        self for method chaining.
    """
    configs = catalog.to_bot_configs(names, overrides)
    tools = self._config.setdefault("tools", [])
    tools.extend(configs)
    return self
add_middleware
add_middleware(middleware_class: str, **params: Any) -> Self

Add middleware to the bot configuration.

Parameters:

Name Type Description Default
middleware_class str

Fully qualified middleware class name.

required
**params Any

Middleware constructor parameters.

{}

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def add_middleware(self, middleware_class: str, **params: Any) -> Self:
    """Add middleware to the bot configuration.

    Args:
        middleware_class: Fully qualified middleware class name.
        **params: Middleware constructor parameters.

    Returns:
        self for method chaining.
    """
    middleware = self._config.setdefault("middleware", [])
    mw_entry: dict[str, Any] = {"class": middleware_class}
    if params:
        mw_entry["params"] = dict(params)
    middleware.append(mw_entry)
    return self
set_custom_section
set_custom_section(key: str, value: Any) -> Self

Set a custom (domain-specific) config section.

This is the extension point for consumers to add sections like educational, customer_service, domain, etc.

Parameters:

Name Type Description Default
key str

Section key name.

required
value Any

Section value (dict, list, or scalar).

required

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def set_custom_section(self, key: str, value: Any) -> Self:
    """Set a custom (domain-specific) config section.

    This is the extension point for consumers to add sections like
    ``educational``, ``customer_service``, ``domain``, etc.

    Args:
        key: Section key name.
        value: Section value (dict, list, or scalar).

    Returns:
        self for method chaining.
    """
    self._custom_sections[key] = value
    return self
from_template
from_template(template: ConfigTemplate, variables: dict[str, Any]) -> Self

Initialize the builder from a template.

Deep-copies the template structure, substitutes variables, and uses the result as the builder's base configuration.

Parameters:

Name Type Description Default
template ConfigTemplate

The template to apply.

required
variables dict[str, Any]

Variable values for substitution.

required

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def from_template(
    self,
    template: ConfigTemplate,
    variables: dict[str, Any],
) -> Self:
    """Initialize the builder from a template.

    Deep-copies the template structure, substitutes variables, and
    uses the result as the builder's base configuration.

    Args:
        template: The template to apply.
        variables: Variable values for substitution.

    Returns:
        self for method chaining.
    """
    from .templates import _build_variable_map

    from dataknobs_config.template_vars import substitute_template_vars

    var_map = _build_variable_map(template, variables)
    structure = copy.deepcopy(template.structure)
    resolved: dict[str, Any] = substitute_template_vars(
        structure, var_map, preserve_missing=True
    )

    # If structure has a 'bot' key, use its contents as the config
    if "bot" in resolved:
        self._config = dict(resolved.pop("bot"))
        # Remaining top-level keys become custom sections
        for key, value in resolved.items():
            self._custom_sections[key] = value
    else:
        self._config = resolved

    return self
merge_overrides
merge_overrides(overrides: dict[str, Any]) -> Self

Merge override values into the current configuration.

Performs recursive dict merge for nested dictionaries.

Parameters:

Name Type Description Default
overrides dict[str, Any]

Override values to merge.

required

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def merge_overrides(self, overrides: dict[str, Any]) -> Self:
    """Merge override values into the current configuration.

    Performs recursive dict merge for nested dictionaries.

    Args:
        overrides: Override values to merge.

    Returns:
        self for method chaining.
    """
    self._config = _deep_merge(self._config, overrides)
    return self
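The exact implementation of `_deep_merge` is not shown on this page, but the documented behavior (recursive merge for nested dicts) conventionally looks like the sketch below, assuming non-dict override values simply replace the base value:

```python
from typing import Any


def deep_merge(base: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    # A typical recursive merge: nested dicts merge key-by-key;
    # everything else is replaced by the override value.
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


base = {"llm": {"provider": "ollama", "model": "llama3"}, "memory": {"type": "buffer"}}
result = deep_merge(base, {"llm": {"model": "llama3.1"}})
assert result["llm"] == {"provider": "ollama", "model": "llama3.1"}
assert result["memory"] == {"type": "buffer"}
```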
validate
validate() -> ValidationResult

Validate the current configuration.

Returns:

Type Description
ValidationResult

ValidationResult with any errors and warnings.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def validate(self) -> ValidationResult:
    """Validate the current configuration.

    Returns:
        ValidationResult with any errors and warnings.
    """
    config = self._build_internal()
    return self._validator.validate(config)
build
build() -> dict[str, Any]

Build the flat configuration dict.

The returned dict is compatible with DynaBot.from_config(). Validates before returning and raises ValueError if there are errors.

Returns:

Type Description
dict[str, Any]

Flat configuration dictionary.

Raises:

Type Description
ValueError

If the configuration has validation errors.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def build(self) -> dict[str, Any]:
    """Build the flat configuration dict.

    The returned dict is compatible with ``DynaBot.from_config()``.
    Validates before returning and raises ValueError if there are errors.

    Returns:
        Flat configuration dictionary.

    Raises:
        ValueError: If the configuration has validation errors.
    """
    config = self._build_internal()
    result = self._validator.validate(config)
    if not result.valid:
        raise ValueError(
            "Configuration validation failed:\n"
            + "\n".join(f"  - {e}" for e in result.errors)
        )
    for warning in result.warnings:
        logger.warning("Config warning: %s", warning)
    return config
build_portable
build_portable() -> dict[str, Any]

Build the portable configuration with $resource references.

Wraps the config under a bot key and includes any custom sections as top-level siblings.

Returns:

Type Description
dict[str, Any]

Portable configuration dict with bot wrapper.

Raises:

Type Description
ValueError

If the configuration has validation errors.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def build_portable(self) -> dict[str, Any]:
    """Build the portable configuration with $resource references.

    Wraps the config under a ``bot`` key and includes any custom
    sections as top-level siblings.

    Returns:
        Portable configuration dict with ``bot`` wrapper.

    Raises:
        ValueError: If the configuration has validation errors.
    """
    config = self._build_internal()
    result = self._validator.validate(config)
    if not result.valid:
        raise ValueError(
            "Configuration validation failed:\n"
            + "\n".join(f"  - {e}" for e in result.errors)
        )
    for warning in result.warnings:
        logger.warning("Config warning: %s", warning)

    # Separate core bot config from custom sections
    bot_config: dict[str, Any] = {}
    custom: dict[str, Any] = {}
    for key, value in config.items():
        if key in self._custom_sections:
            custom[key] = value
        else:
            bot_config[key] = value

    portable: dict[str, Any] = {"bot": bot_config}
    portable.update(custom)
    return portable
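The separation step above can be sketched in isolation: keys that were registered as custom sections are lifted out to sit beside the `bot` wrapper (the section names below are illustrative):

```python
from typing import Any


def to_portable(config: dict[str, Any], custom_keys: set[str]) -> dict[str, Any]:
    # Mirrors the split in build_portable: custom sections become
    # top-level siblings of the "bot" wrapper.
    bot_config = {k: v for k, v in config.items() if k not in custom_keys}
    custom = {k: v for k, v in config.items() if k in custom_keys}
    return {"bot": bot_config, **custom}


merged = {"llm": {"provider": "ollama"}, "educational": {"grade": 5}}
portable = to_portable(merged, {"educational"})
assert portable == {
    "bot": {"llm": {"provider": "ollama"}},
    "educational": {"grade": 5},
}
```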
to_yaml
to_yaml() -> str

Serialize the portable configuration as YAML.

Returns:

Type Description
str

YAML string representation.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def to_yaml(self) -> str:
    """Serialize the portable configuration as YAML.

    Returns:
        YAML string representation.
    """
    portable = self.build_portable()
    return yaml.dump(portable, default_flow_style=False, sort_keys=False)
reset
reset() -> Self

Reset the builder to an empty state.

Returns:

Type Description
Self

self for method chaining.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
def reset(self) -> Self:
    """Reset the builder to an empty state.

    Returns:
        self for method chaining.
    """
    self._config = {}
    self._custom_sections = {}
    return self
from_config classmethod
from_config(config: dict[str, Any]) -> DynaBotConfigBuilder

Create a builder pre-populated from an existing config.

Supports both flat format and portable format (with bot wrapper).

Parameters:

Name Type Description Default
config dict[str, Any]

Existing configuration dictionary.

required

Returns:

Type Description
DynaBotConfigBuilder

A new builder instance with the config loaded.

Source code in packages/bots/src/dataknobs_bots/config/builder.py
@classmethod
def from_config(cls, config: dict[str, Any]) -> DynaBotConfigBuilder:
    """Create a builder pre-populated from an existing config.

    Supports both flat format and portable format (with ``bot`` wrapper).

    Args:
        config: Existing configuration dictionary.

    Returns:
        A new builder instance with the config loaded.
    """
    builder = cls()
    if "bot" in config:
        bot = dict(config["bot"])
        builder._config = bot
        for key, value in config.items():
            if key != "bot":
                builder._custom_sections[key] = value
    else:
        builder._config = dict(config)
    return builder
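This is the inverse of `build_portable()`: a `bot` wrapper is unwrapped into the core config, and any remaining top-level keys become custom sections. A sketch of the unwrapping (section names are illustrative):

```python
from typing import Any


def split_portable(
    config: dict[str, Any],
) -> tuple[dict[str, Any], dict[str, Any]]:
    # Mirrors from_config: portable format is unwrapped;
    # flat format passes through with no custom sections.
    if "bot" in config:
        bot = dict(config["bot"])
        custom = {k: v for k, v in config.items() if k != "bot"}
        return bot, custom
    return dict(config), {}


bot, custom = split_portable(
    {"bot": {"llm": {"provider": "openai"}}, "domain": {"name": "sales"}}
)
assert bot == {"llm": {"provider": "openai"}}
assert custom == {"domain": {"name": "sales"}}
```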

DynaBotConfigSchema

DynaBotConfigSchema()

Queryable registry of valid DynaBot configuration options.

Auto-registers the 8 default DynaBot components on initialization. Consumers can register additional extensions for domain-specific sections.

Methods:

Name Description
register_component

Register a core DynaBot component schema.

register_extension

Register a consumer-specific config extension.

get_component_schema

Get the JSON Schema for a component.

get_extension_schema

Get the JSON Schema for an extension.

get_valid_options

Get valid options for a field within a component or extension.

validate

Validate a config against all registered schemas.

get_full_schema

Get the combined schema for all components and extensions.

to_description

Generate a human-readable description for LLM system prompts.

Source code in packages/bots/src/dataknobs_bots/config/schema.py
def __init__(self) -> None:
    self._components: dict[str, ComponentSchema] = {}
    self._extensions: dict[str, ComponentSchema] = {}
    self._register_defaults()
Functions
register_component
register_component(
    name: str,
    schema: dict[str, Any],
    description: str = "",
    required: bool = False,
) -> None

Register a core DynaBot component schema.

Parameters:

Name Type Description Default
name str

Component name.

required
schema dict[str, Any]

JSON Schema-like definition.

required
description str

Human-readable description.

''
required bool

Whether this component is required.

False
Source code in packages/bots/src/dataknobs_bots/config/schema.py
def register_component(
    self,
    name: str,
    schema: dict[str, Any],
    description: str = "",
    required: bool = False,
) -> None:
    """Register a core DynaBot component schema.

    Args:
        name: Component name.
        schema: JSON Schema-like definition.
        description: Human-readable description.
        required: Whether this component is required.
    """
    self._components[name] = ComponentSchema(
        name=name,
        description=description,
        schema=schema,
        required=required,
    )
    logger.debug("Registered component schema: %s", name)
register_extension
register_extension(
    name: str, schema: dict[str, Any], description: str = ""
) -> None

Register a consumer-specific config extension.

Extensions are domain-specific sections (e.g., 'educational', 'customer_service') that aren't part of the core DynaBot schema.

Parameters:

Name Type Description Default
name str

Extension name.

required
schema dict[str, Any]

JSON Schema-like definition.

required
description str

Human-readable description.

''
Source code in packages/bots/src/dataknobs_bots/config/schema.py
def register_extension(
    self,
    name: str,
    schema: dict[str, Any],
    description: str = "",
) -> None:
    """Register a consumer-specific config extension.

    Extensions are domain-specific sections (e.g., 'educational',
    'customer_service') that aren't part of the core DynaBot schema.

    Args:
        name: Extension name.
        schema: JSON Schema-like definition.
        description: Human-readable description.
    """
    self._extensions[name] = ComponentSchema(
        name=name,
        description=description or f"Extension: {name}",
        schema=schema,
    )
    logger.debug("Registered extension schema: %s", name)
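An extension schema is a JSON-Schema-like dict describing the section's fields. The sketch below shows a plausible shape for an `educational` extension; the field names and enum values are hypothetical, not defined by the package:

```python
from typing import Any

# Hypothetical JSON-Schema-like definition for a domain section.
educational_schema: dict[str, Any] = {
    "type": "object",
    "properties": {
        "grade_level": {"type": "integer", "enum": [1, 2, 3, 4, 5]},
        "subject": {"type": "string"},
    },
}

# With a DynaBotConfigSchema instance in hand, registration would look like:
#   schema.register_extension(
#       "educational", educational_schema,
#       description="Settings for tutoring bots.",
#   )

assert "grade_level" in educational_schema["properties"]
```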
get_component_schema
get_component_schema(name: str) -> dict[str, Any] | None

Get the JSON Schema for a component.

Parameters:

Name Type Description Default
name str

Component name.

required

Returns:

Type Description
dict[str, Any] | None

JSON Schema dict, or None if not registered.

Source code in packages/bots/src/dataknobs_bots/config/schema.py
def get_component_schema(self, name: str) -> dict[str, Any] | None:
    """Get the JSON Schema for a component.

    Args:
        name: Component name.

    Returns:
        JSON Schema dict, or None if not registered.
    """
    component = self._components.get(name)
    if component is not None:
        return component.schema
    return None
get_extension_schema
get_extension_schema(name: str) -> dict[str, Any] | None

Get the JSON Schema for an extension.

Parameters:

Name Type Description Default
name str

Extension name.

required

Returns:

Type Description
dict[str, Any] | None

JSON Schema dict, or None if not registered.

Source code in packages/bots/src/dataknobs_bots/config/schema.py
def get_extension_schema(self, name: str) -> dict[str, Any] | None:
    """Get the JSON Schema for an extension.

    Args:
        name: Extension name.

    Returns:
        JSON Schema dict, or None if not registered.
    """
    ext = self._extensions.get(name)
    if ext is not None:
        return ext.schema
    return None
get_valid_options
get_valid_options(component: str, field_name: str) -> list[str]

Get valid options for a field within a component or extension.

Parameters:

Name Type Description Default
component str

Component or extension name.

required
field_name str

Field name to query.

required

Returns:

Type Description
list[str]

List of valid option strings.

Source code in packages/bots/src/dataknobs_bots/config/schema.py
def get_valid_options(self, component: str, field_name: str) -> list[str]:
    """Get valid options for a field within a component or extension.

    Args:
        component: Component or extension name.
        field_name: Field name to query.

    Returns:
        List of valid option strings.
    """
    comp = self._components.get(component) or self._extensions.get(component)
    if comp is not None:
        return comp.get_valid_options(field_name)
    return []
validate
validate(config: dict[str, Any]) -> ValidationResult

Validate a config against all registered schemas.

Parameters:

Name Type Description Default
config dict[str, Any]

Full DynaBot configuration dict.

required

Returns:

Type Description
ValidationResult

ValidationResult with all schema violations.

Source code in packages/bots/src/dataknobs_bots/config/schema.py
def validate(self, config: dict[str, Any]) -> ValidationResult:
    """Validate a config against all registered schemas.

    Args:
        config: Full DynaBot configuration dict.

    Returns:
        ValidationResult with all schema violations.
    """
    result = ValidationResult.ok()
    bot = config.get("bot", config)

    for name, comp in self._components.items():
        if comp.required and name not in bot:
            result = result.merge(
                ValidationResult.error(f"Missing required component: {name}")
            )
        if name in bot and isinstance(bot[name], dict):
            result = result.merge(
                _validate_against_schema(name, bot[name], comp.schema)
            )

    for name, ext in self._extensions.items():
        if name in bot and isinstance(bot[name], dict):
            result = result.merge(
                _validate_against_schema(name, bot[name], ext.schema)
            )

    return result
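The flow above — a required-component check followed by per-section schema validation — can be sketched with plain dicts. This is an illustrative stand-in, not the library's implementation: the real `_validate_against_schema` helper is not shown here, so the hypothetical `_check_section` below only enforces `enum` constraints.

```python
# Illustrative stand-in mirroring the documented validate() flow; the
# real _validate_against_schema helper is not shown here, so
# _check_section below only enforces enum constraints.
def _check_section(name, section, schema):
    errors = []
    for field, spec in schema.get("properties", {}).items():
        if "enum" in spec and field in section and section[field] not in spec["enum"]:
            errors.append(f"{name}.{field}: invalid value {section[field]!r}")
    return errors

def validate_sketch(config, components):
    """components maps name -> {"required": bool, "schema": {...}}."""
    bot = config.get("bot", config)  # accept wrapped or bare configs
    errors = []
    for name, comp in components.items():
        if comp["required"] and name not in bot:
            errors.append(f"Missing required component: {name}")
        if isinstance(bot.get(name), dict):
            errors.extend(_check_section(name, bot[name], comp["schema"]))
    return errors

components = {
    "llm": {
        "required": True,
        "schema": {"properties": {"provider": {"enum": ["openai", "anthropic"]}}},
    },
}
print(validate_sketch({"bot": {"llm": {"provider": "openai"}}}, components))  # []
print(validate_sketch({"bot": {}}, components))  # ['Missing required component: llm']
```

Note that `config.get("bot", config)` lets the validator accept both a wrapped config (`{"bot": {...}}`) and a bare component mapping.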
get_full_schema
get_full_schema() -> dict[str, Any]

Get the combined schema for all components and extensions.

Returns:

Type Description
dict[str, Any]

Dict mapping component/extension names to their schemas.

Source code in packages/bots/src/dataknobs_bots/config/schema.py
def get_full_schema(self) -> dict[str, Any]:
    """Get the combined schema for all components and extensions.

    Returns:
        Dict mapping component/extension names to their schemas.
    """
    result: dict[str, Any] = {}
    for name, comp in self._components.items():
        result[name] = {
            "description": comp.description,
            "required": comp.required,
            "schema": comp.schema,
        }
    for name, ext in self._extensions.items():
        result[name] = {
            "description": ext.description,
            "required": False,
            "extension": True,
            "schema": ext.schema,
        }
    return result
to_description
to_description() -> str

Generate a human-readable description for LLM system prompts.

Returns:

Type Description
str

Structured text describing all available configuration options.

Source code in packages/bots/src/dataknobs_bots/config/schema.py
def to_description(self) -> str:
    """Generate a human-readable description for LLM system prompts.

    Returns:
        Structured text describing all available configuration options.
    """
    lines: list[str] = ["# DynaBot Configuration Options", ""]

    lines.append("## Core Components")
    lines.append("")
    for name, comp in self._components.items():
        req = " (required)" if comp.required else " (optional)"
        lines.append(f"### {name}{req}")
        if comp.description:
            lines.append(comp.description)
        props = comp.schema.get("properties", {})
        if props:
            lines.append("")
            for field_name, field_schema in props.items():
                desc = field_schema.get("description", "")
                enum_values = field_schema.get("enum")
                line = f"- **{field_name}**"
                if desc:
                    line += f": {desc}"
                if enum_values:
                    line += f" (options: {', '.join(str(v) for v in enum_values)})"
                lines.append(line)
        lines.append("")

    if self._extensions:
        lines.append("## Extensions")
        lines.append("")
        for name, ext in self._extensions.items():
            lines.append(f"### {name}")
            if ext.description:
                lines.append(ext.description)
            props = ext.schema.get("properties", {})
            if props:
                lines.append("")
                for field_name, field_schema in props.items():
                    desc = field_schema.get("description", "")
                    line = f"- **{field_name}**"
                    if desc:
                        line += f": {desc}"
                    lines.append(line)
            lines.append("")

    return "\n".join(lines)

TemplateVariable dataclass

TemplateVariable(
    name: str,
    description: str = "",
    type: str = "string",
    required: bool = False,
    default: Any = None,
    choices: list[Any] | None = None,
    validation: dict[str, Any] | None = None,
)

Definition of a template variable.

Attributes:

Name Type Description
name str

Variable name used in {{name}} placeholders.

description str

Human-readable description.

type str

Variable type (string, integer, boolean, enum, array).

required bool

Whether the variable must be provided.

default Any

Default value if not provided.

choices list[Any] | None

Valid values for enum-type variables.

validation dict[str, Any] | None

JSON Schema constraints for the value.

Methods:

Name Description
to_dict

Convert to dictionary representation.

from_dict

Create a TemplateVariable from a dictionary.

Functions
to_dict
to_dict() -> dict[str, Any]

Convert to dictionary representation.

Source code in packages/bots/src/dataknobs_bots/config/templates.py
def to_dict(self) -> dict[str, Any]:
    """Convert to dictionary representation."""
    result: dict[str, Any] = {
        "name": self.name,
        "type": self.type,
        "required": self.required,
    }
    if self.description:
        result["description"] = self.description
    if self.default is not None:
        result["default"] = self.default
    if self.choices is not None:
        result["choices"] = self.choices
    if self.validation is not None:
        result["validation"] = self.validation
    return result
from_dict classmethod
from_dict(data: dict[str, Any]) -> TemplateVariable

Create a TemplateVariable from a dictionary.

Parameters:

Name Type Description Default
data dict[str, Any]

Dictionary with variable fields.

required

Returns:

Type Description
TemplateVariable

A new TemplateVariable instance.

Source code in packages/bots/src/dataknobs_bots/config/templates.py
@classmethod
def from_dict(cls, data: dict[str, Any]) -> TemplateVariable:
    """Create a TemplateVariable from a dictionary.

    Args:
        data: Dictionary with variable fields.

    Returns:
        A new TemplateVariable instance.
    """
    return cls(
        name=data["name"],
        description=data.get("description", ""),
        type=data.get("type", "string"),
        required=data.get("required", False),
        default=data.get("default"),
        choices=data.get("choices"),
        validation=data.get("validation"),
    )

ToolCatalog

ToolCatalog()

Bases: Registry[ToolEntry]

Registry mapping tool names to class paths and default configuration.

Provides a single source of truth for tool metadata, enabling config builders to reference tools by name and produce correct bot/wizard configs.

Built on Registry[ToolEntry] for thread safety, metrics, and consistent error handling.

Example
catalog = ToolCatalog()
catalog.register_tool(
    name="knowledge_search",
    class_path="dataknobs_bots.tools.knowledge_search.KnowledgeSearchTool",
    description="Search the knowledge base.",
    tags=("general", "rag"),
    requires=("knowledge_base",),
)
config = catalog.to_bot_config("knowledge_search", k=10)

Initialize the catalog.

Methods:

Name Description
register_tool

Register a tool in the catalog.

register_entry

Register a pre-built ToolEntry.

register_from_dict

Register a tool from a dict (e.g., loaded from YAML).

register_many_from_dicts

Register multiple tools from dicts.

register_from_class

Register a tool class that provides catalog_metadata().

list_tools

List all registered tools, optionally filtered by tags.

get_names

Get all registered tool names.

to_bot_config

Generate a bot config tool entry for the named tool.

to_bot_configs

Generate bot config entries for multiple tools.

get_requirements

Get the union of all requirements for the given tool names.

check_requirements

Check that tool requirements are satisfied by a config dict.

instantiate_tool

Import and instantiate a tool from its catalog entry.

create_tool_registry

Create a ToolRegistry populated from catalog entries.

to_dict

Serialize entire catalog to a dict (for YAML output).

from_dict

Create a catalog from a dict (e.g., loaded from YAML).

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def __init__(self) -> None:
    """Initialize the catalog."""
    super().__init__("tool_catalog", enable_metrics=True)
Functions
register_tool
register_tool(
    name: str,
    class_path: str,
    description: str = "",
    default_params: dict[str, Any] | None = None,
    tags: Sequence[str] = (),
    requires: Sequence[str] = (),
) -> None

Register a tool in the catalog.

Parameters:

Name Type Description Default
name str

Tool's runtime name.

required
class_path str

Fully-qualified class path.

required
description str

Human-readable description.

''
default_params dict[str, Any] | None

Default constructor params.

None
tags Sequence[str]

Categorization tags.

()
requires Sequence[str]

Dependency identifiers.

()
Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def register_tool(
    self,
    name: str,
    class_path: str,
    description: str = "",
    default_params: dict[str, Any] | None = None,
    tags: Sequence[str] = (),
    requires: Sequence[str] = (),
) -> None:
    """Register a tool in the catalog.

    Args:
        name: Tool's runtime name.
        class_path: Fully-qualified class path.
        description: Human-readable description.
        default_params: Default constructor params.
        tags: Categorization tags.
        requires: Dependency identifiers.
    """
    entry = ToolEntry(
        name=name,
        class_path=class_path,
        description=description,
        default_params=default_params or {},
        tags=frozenset(tags),
        requires=frozenset(requires),
    )
    self.register(name, entry)
register_entry
register_entry(entry: ToolEntry) -> None

Register a pre-built ToolEntry.

Parameters:

Name Type Description Default
entry ToolEntry

ToolEntry to register.

required
Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def register_entry(self, entry: ToolEntry) -> None:
    """Register a pre-built ToolEntry.

    Args:
        entry: ToolEntry to register.
    """
    self.register(entry.name, entry)
register_from_dict
register_from_dict(data: dict[str, Any]) -> None

Register a tool from a dict (e.g., loaded from YAML).

Parameters:

Name Type Description Default
data dict[str, Any]

Dict with name and class_path keys.

required
Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def register_from_dict(self, data: dict[str, Any]) -> None:
    """Register a tool from a dict (e.g., loaded from YAML).

    Args:
        data: Dict with ``name`` and ``class_path`` keys.
    """
    entry = ToolEntry.from_dict(data)
    self.register(entry.name, entry)
register_many_from_dicts
register_many_from_dicts(entries: list[dict[str, Any]]) -> None

Register multiple tools from dicts.

Parameters:

Name Type Description Default
entries list[dict[str, Any]]

List of tool definition dicts.

required
Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def register_many_from_dicts(self, entries: list[dict[str, Any]]) -> None:
    """Register multiple tools from dicts.

    Args:
        entries: List of tool definition dicts.
    """
    for data in entries:
        self.register_from_dict(data)
register_from_class
register_from_class(tool_class: type) -> None

Register a tool class that provides catalog_metadata().

Computes class_path automatically from the class's module path.

Parameters:

Name Type Description Default
tool_class type

A tool class with a catalog_metadata() classmethod.

required

Raises:

Type Description
ValueError

If tool_class does not implement catalog_metadata().

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def register_from_class(self, tool_class: type) -> None:
    """Register a tool class that provides ``catalog_metadata()``.

    Computes ``class_path`` automatically from the class's module path.

    Args:
        tool_class: A tool class with a ``catalog_metadata()`` classmethod.

    Raises:
        ValueError: If tool_class does not implement ``catalog_metadata()``.
    """
    if not hasattr(tool_class, "catalog_metadata") or not callable(
        tool_class.catalog_metadata
    ):
        raise ValueError(
            f"{tool_class.__name__} does not implement catalog_metadata()"
        )
    meta = tool_class.catalog_metadata()
    class_path = f"{tool_class.__module__}.{tool_class.__qualname__}"
    self.register_tool(
        name=meta["name"],
        class_path=class_path,
        description=meta.get("description", ""),
        default_params=meta.get("default_params"),
        tags=meta.get("tags", ()),
        requires=meta.get("requires", ()),
    )
list_tools
list_tools(tags: Sequence[str] | None = None) -> list[ToolEntry]

List all registered tools, optionally filtered by tags.

Parameters:

Name Type Description Default
tags Sequence[str] | None

If provided, return only tools that have ANY of the specified tags (union semantics).

None

Returns:

Type Description
list[ToolEntry]

List of matching ToolEntry instances.

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def list_tools(
    self,
    tags: Sequence[str] | None = None,
) -> list[ToolEntry]:
    """List all registered tools, optionally filtered by tags.

    Args:
        tags: If provided, return only tools that have ANY of the
            specified tags (union semantics).

    Returns:
        List of matching ToolEntry instances.
    """
    entries = self.list_items()
    if tags:
        tag_set = frozenset(tags)
        entries = [e for e in entries if e.tags & tag_set]
    return entries
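The ANY-tag (union) filter can be illustrated with plain dict entries standing in for `ToolEntry` instances — a tool matches if it shares at least one tag with the query:

```python
# Sketch of the union-semantics tag filter used by list_tools();
# entries are plain dicts here rather than ToolEntry instances.
def filter_by_tags(entries, tags=None):
    if not tags:
        return list(entries)  # no filter: return everything
    tag_set = frozenset(tags)
    return [e for e in entries if frozenset(e["tags"]) & tag_set]

tools = [
    {"name": "knowledge_search", "tags": ["general", "rag"]},
    {"name": "calculator", "tags": ["math"]},
]
names = [e["name"] for e in filter_by_tags(tools, ["rag", "math"])]
# both tools match: union semantics require only one shared tag each
```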
get_names
get_names() -> list[str]

Get all registered tool names.

Returns:

Type Description
list[str]

List of tool names.

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def get_names(self) -> list[str]:
    """Get all registered tool names.

    Returns:
        List of tool names.
    """
    return self.list_keys()
to_bot_config
to_bot_config(name: str, **param_overrides: Any) -> dict[str, Any]

Generate a bot config tool entry for the named tool.

Returns a dict suitable for DynaBot._resolve_tool(): {"class": "full.class.path", "params": {...}}

Parameters:

Name Type Description Default
name str

Tool name to look up.

required
**param_overrides Any

Override default params.

{}

Returns:

Type Description
dict[str, Any]

Bot config dict for the tool.

Raises:

Type Description
NotFoundError

If tool name is not registered.

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def to_bot_config(self, name: str, **param_overrides: Any) -> dict[str, Any]:
    """Generate a bot config tool entry for the named tool.

    Returns a dict suitable for ``DynaBot._resolve_tool()``:
    ``{"class": "full.class.path", "params": {...}}``

    Args:
        name: Tool name to look up.
        **param_overrides: Override default params.

    Returns:
        Bot config dict for the tool.

    Raises:
        NotFoundError: If tool name is not registered.
    """
    entry = self.get(name)
    return entry.to_bot_config(**param_overrides)
to_bot_configs
to_bot_configs(
    names: Sequence[str], overrides: dict[str, dict[str, Any]] | None = None
) -> list[dict[str, Any]]

Generate bot config entries for multiple tools.

Parameters:

Name Type Description Default
names Sequence[str]

Tool names to include.

required
overrides dict[str, dict[str, Any]] | None

Per-tool param overrides keyed by tool name.

None

Returns:

Type Description
list[dict[str, Any]]

List of bot config dicts.

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def to_bot_configs(
    self,
    names: Sequence[str],
    overrides: dict[str, dict[str, Any]] | None = None,
) -> list[dict[str, Any]]:
    """Generate bot config entries for multiple tools.

    Args:
        names: Tool names to include.
        overrides: Per-tool param overrides keyed by tool name.

    Returns:
        List of bot config dicts.
    """
    overrides = overrides or {}
    return [
        self.to_bot_config(name, **overrides.get(name, {}))
        for name in names
    ]
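The per-tool override merge can be sketched as follows. The `defaults` mapping and the `tools.*` class paths are hypothetical; the point is the merge order — entry defaults first, then the override dict keyed by tool name:

```python
# Sketch of per-tool override handling in to_bot_configs(): each
# tool's default params are merged with its entry in `overrides`,
# keyed by tool name. Class paths below are hypothetical.
defaults = {
    "knowledge_search": {"k": 5},
    "calculator": {},
}

def to_bot_configs_sketch(names, overrides=None):
    overrides = overrides or {}
    configs = []
    for name in names:
        params = {**defaults[name], **overrides.get(name, {})}  # overrides win
        cfg = {"class": f"tools.{name}"}
        if params:
            cfg["params"] = params  # omitted entirely when empty
        configs.append(cfg)
    return configs

cfgs = to_bot_configs_sketch(
    ["knowledge_search", "calculator"],
    overrides={"knowledge_search": {"k": 10}},
)
# cfgs[0] carries the overridden k=10; cfgs[1] has no params key
```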
get_requirements
get_requirements(names: Sequence[str]) -> frozenset[str]

Get the union of all requirements for the given tool names.

Parameters:

Name Type Description Default
names Sequence[str]

Tool names to check.

required

Returns:

Type Description
frozenset[str]

Set of all requirement identifiers across the named tools.

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def get_requirements(self, names: Sequence[str]) -> frozenset[str]:
    """Get the union of all requirements for the given tool names.

    Args:
        names: Tool names to check.

    Returns:
        Set of all requirement identifiers across the named tools.
    """
    reqs: set[str] = set()
    for name in names:
        entry = self.get(name)
        reqs.update(entry.requires)
    return frozenset(reqs)
check_requirements
check_requirements(
    tool_names: Sequence[str], config: dict[str, Any]
) -> list[str]

Check that tool requirements are satisfied by a config dict.

Returns a list of warning messages for any unmet requirements. Tools with no requirements are always satisfied.

Parameters:

Name Type Description Default
tool_names Sequence[str]

Names of tools to check.

required
config dict[str, Any]

Bot config dict to check against (top-level keys).

required

Returns:

Type Description
list[str]

List of warning strings (empty if all requirements met).

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def check_requirements(
    self,
    tool_names: Sequence[str],
    config: dict[str, Any],
) -> list[str]:
    """Check that tool requirements are satisfied by a config dict.

    Returns a list of warning messages for any unmet requirements.
    Tools with no requirements are always satisfied.

    Args:
        tool_names: Names of tools to check.
        config: Bot config dict to check against (top-level keys).

    Returns:
        List of warning strings (empty if all requirements met).
    """
    warnings: list[str] = []
    for name in tool_names:
        entry = self.get(name)
        for req in sorted(entry.requires):
            if req not in config:
                warnings.append(
                    f"Tool '{name}' requires '{req}' "
                    f"but it is not configured"
                )
    return warnings
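A requirement is considered met when its identifier appears as a top-level key in the config dict. A self-contained sketch (with a hypothetical `tool_requires` mapping in place of catalog lookups):

```python
# Sketch of check_requirements(): a requirement is met when its key
# appears at the top level of the bot config dict. tool_requires is
# a hypothetical stand-in for catalog entry lookups.
tool_requires = {
    "knowledge_search": {"knowledge_base"},
    "calculator": set(),  # no requirements: always satisfied
}

def check_requirements_sketch(tool_names, config):
    warnings = []
    for name in tool_names:
        for req in sorted(tool_requires[name]):
            if req not in config:
                warnings.append(
                    f"Tool '{name}' requires '{req}' but it is not configured"
                )
    return warnings

print(check_requirements_sketch(["knowledge_search", "calculator"], {"llm": {}}))
# ["Tool 'knowledge_search' requires 'knowledge_base' but it is not configured"]
```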
instantiate_tool
instantiate_tool(name: str, **param_overrides: Any) -> Any

Import and instantiate a tool from its catalog entry.

Uses resolve_callable() to import the class, then instantiates it with default_params merged with overrides. Prefers from_config() if the class defines it.

Parameters:

Name Type Description Default
name str

Tool name to instantiate.

required
**param_overrides Any

Override default params.

{}

Returns:

Type Description
Any

Instantiated tool.

Raises:

Type Description
NotFoundError

If name not in catalog.

ImportError

If class cannot be imported.

ValueError

If resolved class is not callable.

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def instantiate_tool(self, name: str, **param_overrides: Any) -> Any:
    """Import and instantiate a tool from its catalog entry.

    Uses ``resolve_callable()`` to import the class, then instantiates
    it with ``default_params`` merged with overrides. Prefers
    ``from_config()`` if the class defines it.

    Args:
        name: Tool name to instantiate.
        **param_overrides: Override default params.

    Returns:
        Instantiated tool.

    Raises:
        NotFoundError: If name not in catalog.
        ImportError: If class cannot be imported.
        ValueError: If resolved class is not callable.
    """
    from dataknobs_bots.tools.resolve import resolve_callable

    entry = self.get(name)
    tool_class = resolve_callable(entry.class_path)
    params = dict(entry.default_params)
    params.update(param_overrides)

    if hasattr(tool_class, "from_config") and callable(
        tool_class.from_config
    ):
        return tool_class.from_config(params)
    return tool_class(**params) if params else tool_class()
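The construction rules — merge defaults with overrides, prefer `from_config()` when the class defines it, otherwise call the constructor — can be demonstrated with two dummy tool classes (illustrative only, not library classes):

```python
# Sketch of instantiate_tool()'s construction rules with dummy tool
# classes: from_config() is preferred when present; otherwise the
# class is called directly with the merged params.
class FromConfigTool:
    def __init__(self, k):
        self.k = k

    @classmethod
    def from_config(cls, params):
        return cls(**params)

class PlainTool:
    def __init__(self, k=1):
        self.k = k

def build(tool_class, default_params, **overrides):
    params = {**default_params, **overrides}  # overrides win over defaults
    if hasattr(tool_class, "from_config") and callable(tool_class.from_config):
        return tool_class.from_config(params)
    return tool_class(**params) if params else tool_class()

a = build(FromConfigTool, {"k": 5}, k=10)  # constructed via from_config, k=10
b = build(PlainTool, {})                   # direct call with no params
```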
create_tool_registry
create_tool_registry(
    names: Sequence[str] | None = None,
    overrides: dict[str, dict[str, Any]] | None = None,
    strict: bool = False,
) -> Any

Create a ToolRegistry populated from catalog entries.

Imports and instantiates each named tool, registering them in a new ToolRegistry.

Parameters:

Name Type Description Default
names Sequence[str] | None

Tool names to include (default: all registered).

None
overrides dict[str, dict[str, Any]] | None

Per-tool param overrides keyed by tool name.

None
strict bool

If True, raise on instantiation failure. If False (default), skip failed tools and log warnings.

False

Returns:

Type Description
Any

ToolRegistry with instantiated tools.

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def create_tool_registry(
    self,
    names: Sequence[str] | None = None,
    overrides: dict[str, dict[str, Any]] | None = None,
    strict: bool = False,
) -> Any:
    """Create a ToolRegistry populated from catalog entries.

    Imports and instantiates each named tool, registering them in a
    new ``ToolRegistry``.

    Args:
        names: Tool names to include (default: all registered).
        overrides: Per-tool param overrides keyed by tool name.
        strict: If True, raise on instantiation failure.
            If False (default), skip failed tools and log warnings.

    Returns:
        ToolRegistry with instantiated tools.
    """
    from dataknobs_llm.tools import ToolRegistry

    registry = ToolRegistry()
    target_names = list(names) if names else self.list_keys()
    overrides = overrides or {}

    for name in target_names:
        try:
            tool = self.instantiate_tool(name, **overrides.get(name, {}))
            registry.register_tool(tool)
        except Exception as e:
            if strict:
                raise
            logger.warning(
                "Failed to instantiate tool '%s': %s", name, e
            )

    return registry
to_dict
to_dict() -> dict[str, Any]

Serialize entire catalog to a dict (for YAML output).

Returns:

Type Description
dict[str, Any]

Dict with tools key containing list of tool dicts.

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def to_dict(self) -> dict[str, Any]:
    """Serialize entire catalog to a dict (for YAML output).

    Returns:
        Dict with ``tools`` key containing list of tool dicts.
    """
    return {
        "tools": [entry.to_dict() for entry in self.list_items()]
    }
from_dict classmethod
from_dict(data: dict[str, Any]) -> ToolCatalog

Create a catalog from a dict (e.g., loaded from YAML).

Parameters:

Name Type Description Default
data dict[str, Any]

Dict with tools key containing list of tool dicts.

required

Returns:

Type Description
ToolCatalog

New ToolCatalog populated from the data.

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
@classmethod
def from_dict(cls, data: dict[str, Any]) -> ToolCatalog:
    """Create a catalog from a dict (e.g., loaded from YAML).

    Args:
        data: Dict with ``tools`` key containing list of tool dicts.

    Returns:
        New ToolCatalog populated from the data.
    """
    catalog = cls()
    for tool_data in data.get("tools", []):
        catalog.register_from_dict(tool_data)
    return catalog
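A catalog file in the `to_dict()`/`from_dict()` shape might look like the fragment below (a hypothetical example; only `name` and `class_path` are required — the other fields are omitted from serialized output when empty):

```yaml
tools:
  - name: knowledge_search
    class_path: dataknobs_bots.tools.knowledge_search.KnowledgeSearchTool
    description: Search the knowledge base.
    default_params:
      k: 5
    tags: [general, rag]
    requires: [knowledge_base]
```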

ToolEntry dataclass

ToolEntry(
    name: str,
    class_path: str,
    description: str = "",
    default_params: dict[str, Any] = None,
    tags: frozenset[str] = frozenset(),
    requires: frozenset[str] = frozenset(),
)

Metadata for a tool in the catalog.

Captures the information needed to:

- Generate bot config entries (class path + params)
- Reference tools in wizard stage configs (name)
- Discover tools by capability (tags)
- Validate tool dependencies (requires)


Methods:

Name Description
__post_init__

Set default_params to empty dict if None.

to_dict

Serialize to dict (suitable for YAML output).

from_dict

Deserialize from dict (e.g., loaded from YAML).

to_bot_config

Generate a bot config tool entry.

Functions
__post_init__
__post_init__() -> None

Set default_params to empty dict if None.

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def __post_init__(self) -> None:
    """Set default_params to empty dict if None."""
    if self.default_params is None:
        object.__setattr__(self, "default_params", {})
to_dict
to_dict() -> dict[str, Any]

Serialize to dict (suitable for YAML output).

Omits empty/default fields for clean output.

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def to_dict(self) -> dict[str, Any]:
    """Serialize to dict (suitable for YAML output).

    Omits empty/default fields for clean output.
    """
    result: dict[str, Any] = {
        "name": self.name,
        "class_path": self.class_path,
    }
    if self.description:
        result["description"] = self.description
    if self.default_params:
        result["default_params"] = dict(self.default_params)
    if self.tags:
        result["tags"] = sorted(self.tags)
    if self.requires:
        result["requires"] = sorted(self.requires)
    return result
from_dict classmethod
from_dict(data: dict[str, Any]) -> ToolEntry

Deserialize from dict (e.g., loaded from YAML).

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
@classmethod
def from_dict(cls, data: dict[str, Any]) -> ToolEntry:
    """Deserialize from dict (e.g., loaded from YAML)."""
    return cls(
        name=data["name"],
        class_path=data["class_path"],
        description=data.get("description", ""),
        default_params=data.get("default_params") or {},
        tags=frozenset(data.get("tags") or ()),
        requires=frozenset(data.get("requires") or ()),
    )
to_bot_config
to_bot_config(**param_overrides: Any) -> dict[str, Any]

Generate a bot config tool entry.

Returns a dict suitable for DynaBot._resolve_tool(): {"class": "full.class.path", "params": {...}}

Parameters:

Name Type Description Default
**param_overrides Any

Override default params.

{}

Returns:

Type Description
dict[str, Any]

Bot config dict for this tool.

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def to_bot_config(self, **param_overrides: Any) -> dict[str, Any]:
    """Generate a bot config tool entry.

    Returns a dict suitable for ``DynaBot._resolve_tool()``:
    ``{"class": "full.class.path", "params": {...}}``

    Args:
        **param_overrides: Override default params.

    Returns:
        Bot config dict for this tool.
    """
    params = dict(self.default_params)
    params.update(param_overrides)
    config: dict[str, Any] = {"class": self.class_path}
    if params:
        config["params"] = params
    return config

ValidationResult dataclass

ValidationResult(
    valid: bool, errors: list[str] = list(), warnings: list[str] = list()
)

Result of validating a configuration.

Attributes:

Name Type Description
valid bool

Whether the configuration passed validation.

errors list[str]

List of error messages (validation failures).

warnings list[str]

List of warning messages (non-blocking issues).

Methods:

Name Description
merge

Merge another validation result into this one.

ok

Create a successful validation result.

error

Create a failed validation result with a single error.

warning

Create a successful validation result with a warning.

to_dict

Convert to dictionary representation.

Functions
merge
merge(other: ValidationResult) -> ValidationResult

Merge another validation result into this one.

The merged result is valid only if both results are valid.

Parameters:

Name Type Description Default
other ValidationResult

Another validation result to merge.

required

Returns:

Type Description
ValidationResult

A new ValidationResult with combined errors and warnings.

Source code in packages/bots/src/dataknobs_bots/config/validation.py
def merge(self, other: ValidationResult) -> ValidationResult:
    """Merge another validation result into this one.

    The merged result is valid only if both results are valid.

    Args:
        other: Another validation result to merge.

    Returns:
        A new ValidationResult with combined errors and warnings.
    """
    return ValidationResult(
        valid=self.valid and other.valid,
        errors=self.errors + other.errors,
        warnings=self.warnings + other.warnings,
    )
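The merge semantics — AND on validity, concatenation of errors and warnings — can be shown with a minimal stand-in dataclass (not the library class itself):

```python
from dataclasses import dataclass, field

# Minimal stand-in for ValidationResult showing merge() semantics:
# the combined result is valid only if both inputs are valid, and
# errors/warnings are concatenated in order.
@dataclass
class Result:
    valid: bool
    errors: list = field(default_factory=list)
    warnings: list = field(default_factory=list)

    def merge(self, other: "Result") -> "Result":
        return Result(
            valid=self.valid and other.valid,
            errors=self.errors + other.errors,
            warnings=self.warnings + other.warnings,
        )

ok = Result(valid=True)
warn = Result(valid=True, warnings=["memory not configured"])
err = Result(valid=False, errors=["Missing required component: llm"])

merged = ok.merge(warn).merge(err)
# merged.valid is False: one invalid input poisons the chain
```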
ok classmethod
ok() -> ValidationResult

Create a successful validation result.

Source code in packages/bots/src/dataknobs_bots/config/validation.py
@classmethod
def ok(cls) -> ValidationResult:
    """Create a successful validation result."""
    return cls(valid=True)
error classmethod
error(message: str) -> ValidationResult

Create a failed validation result with a single error.

Parameters:

Name Type Description Default
message str

The error message.

required
Source code in packages/bots/src/dataknobs_bots/config/validation.py
@classmethod
def error(cls, message: str) -> ValidationResult:
    """Create a failed validation result with a single error.

    Args:
        message: The error message.
    """
    return cls(valid=False, errors=[message])
warning classmethod
warning(message: str) -> ValidationResult

Create a successful validation result with a warning.

Parameters:

Name Type Description Default
message str

The warning message.

required
Source code in packages/bots/src/dataknobs_bots/config/validation.py
@classmethod
def warning(cls, message: str) -> ValidationResult:
    """Create a successful validation result with a warning.

    Args:
        message: The warning message.
    """
    return cls(valid=True, warnings=[message])
to_dict
to_dict() -> dict[str, Any]

Convert to dictionary representation.

Source code in packages/bots/src/dataknobs_bots/config/validation.py
def to_dict(self) -> dict[str, Any]:
    """Convert to dictionary representation."""
    return {
        "valid": self.valid,
        "errors": self.errors,
        "warnings": self.warnings,
    }
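
Taken together, these helpers compose naturally: results from several validators can be merged, and one failure fails the whole merge. The sketch below mirrors the documented behavior with a stand-in dataclass (`LocalResult` is a hypothetical name used so the snippet runs without dataknobs_bots installed; the `valid`, `errors`, and `warnings` fields come from `to_dict` above):

```python
from dataclasses import dataclass, field

@dataclass
class LocalResult:
    """Stand-in mirroring ValidationResult's documented fields."""
    valid: bool
    errors: list = field(default_factory=list)
    warnings: list = field(default_factory=list)

    def merge(self, other: "LocalResult") -> "LocalResult":
        # Valid only if both results are valid; errors/warnings concatenate.
        return LocalResult(
            valid=self.valid and other.valid,
            errors=self.errors + other.errors,
            warnings=self.warnings + other.warnings,
        )

ok = LocalResult(valid=True)
warn = LocalResult(valid=True, warnings=["model not pinned"])
err = LocalResult(valid=False, errors=["missing vector_store"])

merged = ok.merge(warn).merge(err)
print(merged.valid)     # False: one failed result fails the merge
print(merged.errors)    # ['missing vector_store']
print(merged.warnings)  # ['model not pinned']
```

Because `merge` returns a new result rather than mutating in place, chains like the one above are safe to reuse across validators.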

RAGKnowledgeBase

RAGKnowledgeBase(
    vector_store: Any,
    embedding_provider: Any,
    chunking_config: dict[str, Any] | None = None,
    merger_config: MergerConfig | None = None,
    formatter_config: FormatterConfig | None = None,
)

Bases: KnowledgeBase

RAG knowledge base using dataknobs-xization for chunking and vector search.

This implementation:

- Parses markdown documents using dataknobs-xization
- Chunks documents intelligently based on structure
- Stores chunks with embeddings in vector store
- Provides semantic search for relevant context

Attributes:

Name Type Description
vector_store

Vector store backend from dataknobs_data

embedding_provider

LLM provider for generating embeddings

chunking_config

Configuration for document chunking

Initialize RAG knowledge base.

Parameters:

Name Type Description Default
vector_store Any

Vector store backend instance

required
embedding_provider Any

LLM provider with embed() method

required
chunking_config dict[str, Any] | None

Configuration for chunking:

- max_chunk_size: Maximum chunk size in characters
- combine_under_heading: Combine text under same heading
- quality_filter: ChunkQualityConfig for filtering
- generate_embeddings: Whether to generate enriched embedding text

None
merger_config MergerConfig | None

Configuration for chunk merging (optional)

None
formatter_config FormatterConfig | None

Configuration for context formatting (optional)

None

Methods:

Name Description
from_config

Create RAG knowledge base from configuration.

load_markdown_document

Load and chunk a markdown document from a file.

load_documents_from_directory

Load all markdown documents from a directory.

load_json_document

Load and chunk a JSON document by converting it to markdown.

load_yaml_document

Load and chunk a YAML document by converting it to markdown.

load_csv_document

Load and chunk a CSV document by converting it to markdown.

load_from_directory

Load documents from a directory using KnowledgeBaseConfig.

load_markdown_text

Load markdown content from a string.

query

Query knowledge base for relevant chunks.

hybrid_query

Query knowledge base using hybrid search (text + vector).

format_context

Format search results for LLM context.

count

Get the number of chunks in the knowledge base.

clear

Clear all documents from the knowledge base.

save

Save the knowledge base to persistent storage.

providers

Return the embedding provider, keyed by role.

set_provider

Replace the embedding provider if the role matches.

close

Close the knowledge base and release resources.

__aenter__

Async context manager entry.

__aexit__

Async context manager exit - ensures cleanup.

Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
def __init__(
    self,
    vector_store: Any,
    embedding_provider: Any,
    chunking_config: dict[str, Any] | None = None,
    merger_config: MergerConfig | None = None,
    formatter_config: FormatterConfig | None = None,
):
    """Initialize RAG knowledge base.

    Args:
        vector_store: Vector store backend instance
        embedding_provider: LLM provider with embed() method
        chunking_config: Configuration for chunking:
            - max_chunk_size: Maximum chunk size in characters
            - combine_under_heading: Combine text under same heading
            - quality_filter: ChunkQualityConfig for filtering
            - generate_embeddings: Whether to generate enriched embedding text
        merger_config: Configuration for chunk merging (optional)
        formatter_config: Configuration for context formatting (optional)
    """
    self.vector_store = vector_store
    self.embedding_provider = embedding_provider
    self.chunking_config = chunking_config or {
        "max_chunk_size": 500,
        "combine_under_heading": True,
    }

    # Initialize merger and formatter
    self.merger = ChunkMerger(merger_config) if merger_config else ChunkMerger()
    self.formatter = ContextFormatter(formatter_config) if formatter_config else ContextFormatter()
Functions
from_config async classmethod
from_config(config: dict[str, Any]) -> RAGKnowledgeBase

Create RAG knowledge base from configuration.

Parameters:

Name Type Description Default
config dict[str, Any]

Configuration dictionary with:

- vector_store: Vector store configuration
- embedding: Nested embedding config dict (preferred), e.g. {"provider": "ollama", "model": "nomic-embed-text"}
- embedding_provider / embedding_model: Legacy flat keys
- chunking: Optional chunking configuration
- documents_path: Optional path to load documents from
- document_pattern: Optional glob pattern for documents

required

Returns:

Type Description
RAGKnowledgeBase

Configured RAGKnowledgeBase instance

Example
config = {
    "vector_store": {
        "backend": "faiss",
        "dimensions": 768,
        "collection": "docs"
    },
    "embedding": {
        "provider": "ollama",
        "model": "nomic-embed-text",
    },
    "chunking": {
        "max_chunk_size": 500
    },
    "documents_path": "./docs"
}
kb = await RAGKnowledgeBase.from_config(config)
Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
@classmethod
async def from_config(cls, config: dict[str, Any]) -> "RAGKnowledgeBase":
    """Create RAG knowledge base from configuration.

    Args:
        config: Configuration dictionary with:
            - vector_store: Vector store configuration
            - embedding: Nested embedding config dict (preferred), e.g.
              ``{"provider": "ollama", "model": "nomic-embed-text"}``
            - embedding_provider / embedding_model: Legacy flat keys
            - chunking: Optional chunking configuration
            - documents_path: Optional path to load documents from
            - document_pattern: Optional glob pattern for documents

    Returns:
        Configured RAGKnowledgeBase instance

    Example:
        ```python
        config = {
            "vector_store": {
                "backend": "faiss",
                "dimensions": 768,
                "collection": "docs"
            },
            "embedding": {
                "provider": "ollama",
                "model": "nomic-embed-text",
            },
            "chunking": {
                "max_chunk_size": 500
            },
            "documents_path": "./docs"
        }
        kb = await RAGKnowledgeBase.from_config(config)
        ```
    """
    from dataknobs_data.vector.stores import VectorStoreFactory

    from ..providers import create_embedding_provider

    # Create vector store
    vs_config = config["vector_store"]
    factory = VectorStoreFactory()
    vector_store = factory.create(**vs_config)
    await vector_store.initialize()

    # Create embedding provider
    embedding_provider = await create_embedding_provider(config)

    # Create merger config if specified
    merger_config = None
    if "merger" in config:
        merger_config = MergerConfig(**config["merger"])

    # Create formatter config if specified
    formatter_config = None
    if "formatter" in config:
        formatter_config = FormatterConfig(**config["formatter"])

    # Create instance
    kb = cls(
        vector_store=vector_store,
        embedding_provider=embedding_provider,
        chunking_config=config.get("chunking", {}),
        merger_config=merger_config,
        formatter_config=formatter_config,
    )

    # Load documents if path provided
    if "documents_path" in config:
        await kb.load_documents_from_directory(
            config["documents_path"], config.get("document_pattern", "**/*.md")
        )

    return kb
load_markdown_document async
load_markdown_document(
    filepath: str | Path, metadata: dict[str, Any] | None = None
) -> int

Load and chunk a markdown document from a file.

Reads the file and delegates to load_markdown_text for parsing, chunking, embedding, and storage.

Parameters:

Name Type Description Default
filepath str | Path

Path to markdown file

required
metadata dict[str, Any] | None

Optional metadata to attach to all chunks

None

Returns:

Type Description
int

Number of chunks created

Example
num_chunks = await kb.load_markdown_document(
    "docs/api.md",
    metadata={"category": "api", "version": "1.0"}
)
Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
async def load_markdown_document(
    self, filepath: str | Path, metadata: dict[str, Any] | None = None
) -> int:
    """Load and chunk a markdown document from a file.

    Reads the file and delegates to :meth:`load_markdown_text` for
    parsing, chunking, embedding, and storage.

    Args:
        filepath: Path to markdown file
        metadata: Optional metadata to attach to all chunks

    Returns:
        Number of chunks created

    Example:
        ```python
        num_chunks = await kb.load_markdown_document(
            "docs/api.md",
            metadata={"category": "api", "version": "1.0"}
        )
        ```
    """
    filepath = Path(filepath)
    with open(filepath, encoding="utf-8") as f:
        markdown_text = f.read()

    return await self.load_markdown_text(
        markdown_text,
        source=str(filepath),
        metadata=metadata,
    )
load_documents_from_directory async
load_documents_from_directory(
    directory: str | Path, pattern: str = "**/*.md"
) -> dict[str, Any]

Load all markdown documents from a directory.

Parameters:

Name Type Description Default
directory str | Path

Directory path containing documents

required
pattern str

Glob pattern for files to load (default: `**/*.md`)

'**/*.md'

Returns:

Type Description
dict[str, Any]

Dictionary with loading statistics:

- total_files: Number of files processed
- total_chunks: Total chunks created
- errors: List of errors encountered

Example
results = await kb.load_documents_from_directory(
    "docs/",
    pattern="**/*.md"
)
print(f"Loaded {results['total_chunks']} chunks from {results['total_files']} files")
Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
async def load_documents_from_directory(
    self, directory: str | Path, pattern: str = "**/*.md"
) -> dict[str, Any]:
    """Load all markdown documents from a directory.

    Args:
        directory: Directory path containing documents
        pattern: Glob pattern for files to load (default: **/*.md)

    Returns:
        Dictionary with loading statistics:
            - total_files: Number of files processed
            - total_chunks: Total chunks created
            - errors: List of errors encountered

    Example:
        ```python
        results = await kb.load_documents_from_directory(
            "docs/",
            pattern="**/*.md"
        )
        print(f"Loaded {results['total_chunks']} chunks from {results['total_files']} files")
        ```
    """
    directory = Path(directory)
    results = {"total_files": 0, "total_chunks": 0, "errors": []}

    for filepath in directory.glob(pattern):
        if not filepath.is_file():
            continue

        try:
            num_chunks = await self.load_markdown_document(
                filepath, metadata={"filename": filepath.name}
            )
            results["total_files"] += 1
            results["total_chunks"] += num_chunks
        except Exception as e:
            results["errors"].append({"file": str(filepath), "error": str(e)})

    return results
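
The default `**/*.md` pattern relies on pathlib's recursive glob, so nested documents are picked up while other extensions are skipped. This can be checked in isolation (the temporary layout below is illustrative):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "guide").mkdir()
    (root / "intro.md").write_text("# Intro", encoding="utf-8")
    (root / "guide" / "setup.md").write_text("# Setup", encoding="utf-8")
    (root / "notes.txt").write_text("not matched", encoding="utf-8")

    # "**/*.md" recurses into subdirectories but skips other extensions
    matched = sorted(p.name for p in root.glob("**/*.md"))
    print(matched)  # ['intro.md', 'setup.md']
```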
load_json_document async
load_json_document(
    filepath: str | Path,
    metadata: dict[str, Any] | None = None,
    schema: str | None = None,
    transformer: ContentTransformer | None = None,
    title: str | None = None,
) -> int

Load and chunk a JSON document by converting it to markdown.

This method converts JSON data to markdown format using ContentTransformer, then processes it like any other markdown document.

Parameters:

Name Type Description Default
filepath str | Path

Path to JSON file

required
metadata dict[str, Any] | None

Optional metadata to attach to all chunks

None
schema str | None

Optional schema name (requires transformer with registered schema)

None
transformer ContentTransformer | None

Optional ContentTransformer instance with custom configuration

None
title str | None

Optional document title for the markdown

None

Returns:

Type Description
int

Number of chunks created

Example
# Generic conversion
num_chunks = await kb.load_json_document(
    "data/patterns.json",
    metadata={"content_type": "patterns"}
)

# With custom schema
transformer = ContentTransformer()
transformer.register_schema("pattern", {
    "title_field": "name",
    "sections": [
        {"field": "description", "heading": "Description"},
        {"field": "example", "heading": "Example", "format": "code"}
    ]
})
num_chunks = await kb.load_json_document(
    "data/patterns.json",
    transformer=transformer,
    schema="pattern"
)
Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
async def load_json_document(
    self,
    filepath: str | Path,
    metadata: dict[str, Any] | None = None,
    schema: str | None = None,
    transformer: ContentTransformer | None = None,
    title: str | None = None,
) -> int:
    """Load and chunk a JSON document by converting it to markdown.

    This method converts JSON data to markdown format using ContentTransformer,
    then processes it like any other markdown document.

    Args:
        filepath: Path to JSON file
        metadata: Optional metadata to attach to all chunks
        schema: Optional schema name (requires transformer with registered schema)
        transformer: Optional ContentTransformer instance with custom configuration
        title: Optional document title for the markdown

    Returns:
        Number of chunks created

    Example:
        ```python
        # Generic conversion
        num_chunks = await kb.load_json_document(
            "data/patterns.json",
            metadata={"content_type": "patterns"}
        )

        # With custom schema
        transformer = ContentTransformer()
        transformer.register_schema("pattern", {
            "title_field": "name",
            "sections": [
                {"field": "description", "heading": "Description"},
                {"field": "example", "heading": "Example", "format": "code"}
            ]
        })
        num_chunks = await kb.load_json_document(
            "data/patterns.json",
            transformer=transformer,
            schema="pattern"
        )
        ```
    """
    import json

    filepath = Path(filepath)

    # Read JSON
    with open(filepath, encoding="utf-8") as f:
        data = json.load(f)

    # Convert to markdown
    if transformer is None:
        transformer = ContentTransformer()

    markdown_text = transformer.transform_json(
        data,
        schema=schema,
        title=title or filepath.stem.replace("_", " ").title(),
    )

    return await self.load_markdown_text(
        markdown_text,
        source=str(filepath),
        metadata=metadata,
    )
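
When no explicit `title` is passed, the method above derives one from the file stem. That fallback is plain stdlib and can be reproduced directly (`default_title` is a hypothetical helper for illustration):

```python
from pathlib import Path

def default_title(filepath: str) -> str:
    # Mirrors the fallback: stem, underscores to spaces, title case
    return Path(filepath).stem.replace("_", " ").title()

print(default_title("data/design_patterns.json"))  # Design Patterns
print(default_title("faq.json"))                   # Faq
```

The same derivation applies to load_yaml_document and load_csv_document below.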
load_yaml_document async
load_yaml_document(
    filepath: str | Path,
    metadata: dict[str, Any] | None = None,
    schema: str | None = None,
    transformer: ContentTransformer | None = None,
    title: str | None = None,
) -> int

Load and chunk a YAML document by converting it to markdown.

Parameters:

Name Type Description Default
filepath str | Path

Path to YAML file

required
metadata dict[str, Any] | None

Optional metadata to attach to all chunks

None
schema str | None

Optional schema name (requires transformer with registered schema)

None
transformer ContentTransformer | None

Optional ContentTransformer instance with custom configuration

None
title str | None

Optional document title for the markdown

None

Returns:

Type Description
int

Number of chunks created

Example
num_chunks = await kb.load_yaml_document(
    "data/config.yaml",
    metadata={"content_type": "configuration"}
)
Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
async def load_yaml_document(
    self,
    filepath: str | Path,
    metadata: dict[str, Any] | None = None,
    schema: str | None = None,
    transformer: ContentTransformer | None = None,
    title: str | None = None,
) -> int:
    """Load and chunk a YAML document by converting it to markdown.

    Args:
        filepath: Path to YAML file
        metadata: Optional metadata to attach to all chunks
        schema: Optional schema name (requires transformer with registered schema)
        transformer: Optional ContentTransformer instance with custom configuration
        title: Optional document title for the markdown

    Returns:
        Number of chunks created

    Example:
        ```python
        num_chunks = await kb.load_yaml_document(
            "data/config.yaml",
            metadata={"content_type": "configuration"}
        )
        ```
    """
    filepath = Path(filepath)

    # Convert to markdown
    if transformer is None:
        transformer = ContentTransformer()

    markdown_text = transformer.transform_yaml(
        filepath,
        schema=schema,
        title=title or filepath.stem.replace("_", " ").title(),
    )

    return await self.load_markdown_text(
        markdown_text,
        source=str(filepath),
        metadata=metadata,
    )
load_csv_document async
load_csv_document(
    filepath: str | Path,
    metadata: dict[str, Any] | None = None,
    title: str | None = None,
    title_field: str | None = None,
    transformer: ContentTransformer | None = None,
) -> int

Load and chunk a CSV document by converting it to markdown.

Each row becomes a section with the first column (or title_field) as heading.

Parameters:

Name Type Description Default
filepath str | Path

Path to CSV file

required
metadata dict[str, Any] | None

Optional metadata to attach to all chunks

None
title str | None

Optional document title for the markdown

None
title_field str | None

Column to use as section title (default: first column)

None
transformer ContentTransformer | None

Optional ContentTransformer instance with custom configuration

None

Returns:

Type Description
int

Number of chunks created

Example
num_chunks = await kb.load_csv_document(
    "data/faq.csv",
    title="Frequently Asked Questions",
    title_field="question"
)
Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
async def load_csv_document(
    self,
    filepath: str | Path,
    metadata: dict[str, Any] | None = None,
    title: str | None = None,
    title_field: str | None = None,
    transformer: ContentTransformer | None = None,
) -> int:
    """Load and chunk a CSV document by converting it to markdown.

    Each row becomes a section with the first column (or title_field) as heading.

    Args:
        filepath: Path to CSV file
        metadata: Optional metadata to attach to all chunks
        title: Optional document title for the markdown
        title_field: Column to use as section title (default: first column)
        transformer: Optional ContentTransformer instance with custom configuration

    Returns:
        Number of chunks created

    Example:
        ```python
        num_chunks = await kb.load_csv_document(
            "data/faq.csv",
            title="Frequently Asked Questions",
            title_field="question"
        )
        ```
    """
    filepath = Path(filepath)

    # Convert to markdown
    if transformer is None:
        transformer = ContentTransformer()

    markdown_text = transformer.transform_csv(
        filepath,
        title=title or filepath.stem.replace("_", " ").title(),
        title_field=title_field,
    )

    return await self.load_markdown_text(
        markdown_text,
        source=str(filepath),
        metadata=metadata,
    )
load_from_directory async
load_from_directory(
    directory: str | Path,
    config: KnowledgeBaseConfig | None = None,
    progress_callback: Any | None = None,
) -> dict[str, Any]

Load documents from a directory using KnowledgeBaseConfig.

This method uses the xization DirectoryProcessor to process documents with configurable patterns, chunking, and metadata. It supports markdown, JSON, and JSONL files with streaming for large files.

Parameters:

Name Type Description Default
directory str | Path

Directory path containing documents

required
config KnowledgeBaseConfig | None

Optional KnowledgeBaseConfig. If not provided, attempts to load from knowledge_base.json/yaml in the directory, or uses defaults.

None
progress_callback Any | None

Optional callback function(file_path, num_chunks) for progress

None

Returns:

Type Description
dict[str, Any]

Dictionary with loading statistics:

- total_files: Number of files processed
- total_chunks: Total chunks created
- files_by_type: Count of files by type (markdown, json, jsonl)
- errors: List of errors encountered
- documents: List of processed document info

Example
# With auto-loaded config from directory
results = await kb.load_from_directory("./docs")

# With explicit config
config = KnowledgeBaseConfig(
    name="product-docs",
    default_chunking={"max_chunk_size": 800},
    patterns=[
        FilePatternConfig(pattern="api/**/*.json", text_fields=["title", "description"]),
        FilePatternConfig(pattern="**/*.md"),
    ],
    exclude_patterns=["**/drafts/**"],
)
results = await kb.load_from_directory("./docs", config=config)
print(f"Loaded {results['total_chunks']} chunks from {results['total_files']} files")
Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
async def load_from_directory(
    self,
    directory: str | Path,
    config: KnowledgeBaseConfig | None = None,
    progress_callback: Any | None = None,
) -> dict[str, Any]:
    """Load documents from a directory using KnowledgeBaseConfig.

    This method uses the xization DirectoryProcessor to process documents
    with configurable patterns, chunking, and metadata. It supports markdown,
    JSON, and JSONL files with streaming for large files.

    Args:
        directory: Directory path containing documents
        config: Optional KnowledgeBaseConfig. If not provided, attempts to load
               from knowledge_base.json/yaml in the directory, or uses defaults.
        progress_callback: Optional callback function(file_path, num_chunks) for progress

    Returns:
        Dictionary with loading statistics:
            - total_files: Number of files processed
            - total_chunks: Total chunks created
            - files_by_type: Count of files by type (markdown, json, jsonl)
            - errors: List of errors encountered
            - documents: List of processed document info

    Example:
        ```python
        # With auto-loaded config from directory
        results = await kb.load_from_directory("./docs")

        # With explicit config
        config = KnowledgeBaseConfig(
            name="product-docs",
            default_chunking={"max_chunk_size": 800},
            patterns=[
                FilePatternConfig(pattern="api/**/*.json", text_fields=["title", "description"]),
                FilePatternConfig(pattern="**/*.md"),
            ],
            exclude_patterns=["**/drafts/**"],
        )
        results = await kb.load_from_directory("./docs", config=config)
        print(f"Loaded {results['total_chunks']} chunks from {results['total_files']} files")
        ```
    """
    import numpy as np

    directory = Path(directory)

    # Load or use provided config
    if config is None:
        config = KnowledgeBaseConfig.load(directory)

    # Create processor
    processor = DirectoryProcessor(config, directory)

    # Track results
    results: dict[str, Any] = {
        "total_files": 0,
        "total_chunks": 0,
        "files_by_type": {"markdown": 0, "json": 0, "jsonl": 0},
        "errors": [],
        "documents": [],
    }

    # Process each document
    for doc in processor.process():
        doc_info: dict[str, Any] = {
            "source": doc.source_file,
            "type": doc.document_type,
            "chunks": 0,
            "errors": doc.errors,
        }

        if doc.has_errors:
            results["errors"].extend([
                {"file": doc.source_file, "error": err}
                for err in doc.errors
            ])
            results["documents"].append(doc_info)
            continue

        # Process chunks for this document
        vectors = []
        ids = []
        metadatas = []

        source_stem = Path(doc.source_file).stem

        for chunk in doc.chunks:
            # Get text for embedding
            text_for_embedding = chunk.get("embedding_text") or chunk.get("text", "")

            if not text_for_embedding:
                continue

            # Generate embedding
            embedding = await self.embedding_provider.embed(text_for_embedding)

            # Convert to numpy if needed
            if not isinstance(embedding, np.ndarray):
                embedding = np.array(embedding, dtype=np.float32)

            # Build chunk ID
            chunk_index = chunk.get("chunk_index", len(vectors))
            chunk_id = f"{source_stem}_{chunk_index}"

            # Build metadata
            chunk_metadata = {
                "text": chunk.get("text", ""),
                "source": doc.source_file,
                "chunk_index": chunk_index,
                "document_type": doc.document_type,
            }

            # Add chunk-specific metadata
            if "metadata" in chunk:
                chunk_metadata.update(chunk["metadata"])

            # Add document-level metadata
            if doc.metadata:
                for key, value in doc.metadata.items():
                    if key not in chunk_metadata:
                        chunk_metadata[key] = value

            vectors.append(embedding)
            ids.append(chunk_id)
            metadatas.append(chunk_metadata)

        # Batch insert into vector store
        if vectors:
            await self.vector_store.add_vectors(
                vectors=vectors, ids=ids, metadata=metadatas
            )

        doc_info["chunks"] = len(vectors)
        results["total_files"] += 1
        results["total_chunks"] += len(vectors)
        results["files_by_type"][doc.document_type] += 1
        results["documents"].append(doc_info)

        # Call progress callback if provided
        if progress_callback:
            progress_callback(doc.source_file, len(vectors))

    return results
load_markdown_text async
load_markdown_text(
    markdown_text: str, source: str, metadata: dict[str, Any] | None = None
) -> int

Load markdown content from a string.

Parses, chunks, embeds, and stores the markdown text. This is the shared implementation used by load_markdown_document, load_json_document, load_yaml_document, and load_csv_document.

Parameters:

Name Type Description Default
markdown_text str

Markdown content to load

required
source str

Source identifier for metadata

required
metadata dict[str, Any] | None

Optional metadata to attach to all chunks

None

Returns:

Type Description
int

Number of chunks created

Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
async def load_markdown_text(
    self,
    markdown_text: str,
    source: str,
    metadata: dict[str, Any] | None = None,
) -> int:
    """Load markdown content from a string.

    Parses, chunks, embeds, and stores the markdown text.  This is
    the shared implementation used by :meth:`load_markdown_document`,
    :meth:`load_json_document`, :meth:`load_yaml_document`, and
    :meth:`load_csv_document`.

    Args:
        markdown_text: Markdown content to load
        source: Source identifier for metadata
        metadata: Optional metadata to attach to all chunks

    Returns:
        Number of chunks created
    """
    import numpy as np

    # Parse markdown
    tree = parse_markdown(markdown_text)

    # Build quality filter config if specified
    quality_filter = None
    if "quality_filter" in self.chunking_config:
        qf_config = self.chunking_config["quality_filter"]
        if isinstance(qf_config, ChunkQualityConfig):
            quality_filter = qf_config
        elif isinstance(qf_config, dict):
            quality_filter = ChunkQualityConfig(**qf_config)

    # Chunk the document with enhanced options
    chunks = chunk_markdown_tree(
        tree,
        max_chunk_size=self.chunking_config.get("max_chunk_size", 500),
        heading_inclusion=HeadingInclusion.IN_METADATA,
        combine_under_heading=self.chunking_config.get("combine_under_heading", True),
        quality_filter=quality_filter,
        generate_embeddings=self.chunking_config.get("generate_embeddings", True),
    )

    # Process and store chunks
    vectors = []
    ids = []
    metadatas = []

    # Generate a base ID from source
    source_stem = Path(source).stem if source else "doc"

    for i, chunk in enumerate(chunks):
        # Use embedding_text if available, otherwise use chunk text
        text_for_embedding = chunk.metadata.embedding_text or chunk.text

        # Generate embedding
        embedding = await self.embedding_provider.embed(text_for_embedding)

        # Convert to numpy if needed
        if not isinstance(embedding, np.ndarray):
            embedding = np.array(embedding, dtype=np.float32)

        # Prepare metadata with new fields
        chunk_id = f"{source_stem}_{i}"
        chunk_metadata = {
            "text": chunk.text,
            "source": source,
            "chunk_index": i,
            "heading_path": chunk.metadata.heading_display or chunk.metadata.get_heading_path(),
            "headings": chunk.metadata.headings,
            "heading_levels": chunk.metadata.heading_levels,
            "line_number": chunk.metadata.line_number,
            "chunk_size": chunk.metadata.chunk_size,
            "content_length": chunk.metadata.content_length,
        }

        # Merge with user metadata
        if metadata:
            chunk_metadata.update(metadata)

        vectors.append(embedding)
        ids.append(chunk_id)
        metadatas.append(chunk_metadata)

    # Batch insert into vector store
    if vectors:
        await self.vector_store.add_vectors(
            vectors=vectors, ids=ids, metadata=metadatas
        )

    return len(chunks)
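
The chunk IDs stored above follow a source-stem-plus-index scheme, which matters if you later need to locate or deduplicate chunks in the vector store. The scheme can be reproduced with just pathlib (`make_chunk_id` is a hypothetical helper for illustration):

```python
from pathlib import Path

def make_chunk_id(source: str, index: int) -> str:
    # Mirrors load_markdown_text: "<source stem>_<chunk index>",
    # falling back to "doc" when no source is given
    stem = Path(source).stem if source else "doc"
    return f"{stem}_{index}"

print(make_chunk_id("docs/api.md", 0))  # api_0
print(make_chunk_id("", 3))             # doc_3
```

Note that two files with the same stem in different directories would collide under this scheme, so distinct source paths with distinct stems are assumed.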
query async
query(
    query: str,
    k: int = 5,
    filter_metadata: dict[str, Any] | None = None,
    min_similarity: float = 0.0,
    merge_adjacent: bool = False,
    max_chunk_size: int | None = None,
) -> list[dict[str, Any]]

Query knowledge base for relevant chunks.

Parameters:

Name Type Description Default
query str

Query text to search for

required
k int

Number of results to return

5
filter_metadata dict[str, Any] | None

Optional metadata filters

None
min_similarity float

Minimum similarity score (0-1)

0.0
merge_adjacent bool

Whether to merge adjacent chunks with same heading

False
max_chunk_size int | None

Maximum size for merged chunks (uses merger config default if not specified)

None

Returns:

Type Description
list[dict[str, Any]]

List of result dictionaries with:

- text: Chunk text
- source: Source file
- heading_path: Heading hierarchy
- similarity: Similarity score
- metadata: Full chunk metadata

Example
results = await kb.query(
    "How do I configure the database?",
    k=3,
    merge_adjacent=True
)
for result in results:
    print(f"[{result['similarity']:.2f}] {result['heading_path']}")
    print(result['text'])
Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
async def query(
    self,
    query: str,
    k: int = 5,
    filter_metadata: dict[str, Any] | None = None,
    min_similarity: float = 0.0,
    merge_adjacent: bool = False,
    max_chunk_size: int | None = None,
) -> list[dict[str, Any]]:
    """Query knowledge base for relevant chunks.

    Args:
        query: Query text to search for
        k: Number of results to return
        filter_metadata: Optional metadata filters
        min_similarity: Minimum similarity score (0-1)
        merge_adjacent: Whether to merge adjacent chunks with same heading
        max_chunk_size: Maximum size for merged chunks (uses merger config default if not specified)

    Returns:
        List of result dictionaries with:
            - text: Chunk text
            - source: Source file
            - heading_path: Heading hierarchy
            - similarity: Similarity score
            - metadata: Full chunk metadata

    Example:
        ```python
        results = await kb.query(
            "How do I configure the database?",
            k=3,
            merge_adjacent=True
        )
        for result in results:
            print(f"[{result['similarity']:.2f}] {result['heading_path']}")
            print(result['text'])
        ```
    """
    import numpy as np

    # Generate query embedding
    query_embedding = await self.embedding_provider.embed(query)

    # Convert to numpy if needed
    if not isinstance(query_embedding, np.ndarray):
        query_embedding = np.array(query_embedding, dtype=np.float32)

    # Search vector store
    search_results = await self.vector_store.search(
        query_vector=query_embedding,
        k=k,
        filter=filter_metadata,
        include_metadata=True,
    )

    # Format results
    results = []
    for _chunk_id, similarity, chunk_metadata in search_results:
        if chunk_metadata and similarity >= min_similarity:
            results.append(
                {
                    "text": chunk_metadata.get("text", ""),
                    "source": chunk_metadata.get("source", ""),
                    "heading_path": chunk_metadata.get("heading_path", ""),
                    "similarity": similarity,
                    "metadata": chunk_metadata,
                }
            )

    # Apply chunk merging if requested
    if merge_adjacent and results:
        # Update merger config if max_chunk_size specified
        if max_chunk_size is not None:
            merger = ChunkMerger(MergerConfig(max_merged_size=max_chunk_size))
        else:
            merger = self.merger

        merged_chunks = merger.merge(results)
        results = merger.to_result_list(merged_chunks)

    return results
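The merge step above groups consecutive results that share a heading. The helper below is a minimal, self-contained sketch of that idea; the real `ChunkMerger`/`MergerConfig` pair in the library is more configurable, so treat this as illustrative only:

```python
# Illustrative sketch of merging adjacent chunks with the same heading_path.
# Hypothetical helper: the actual ChunkMerger in dataknobs_bots may behave
# differently (e.g. separators, metadata handling).

def merge_adjacent_chunks(results, max_merged_size=1000):
    merged = []
    for r in results:
        if (
            merged
            and merged[-1]["heading_path"] == r["heading_path"]
            and len(merged[-1]["text"]) + len(r["text"]) + 1 <= max_merged_size
        ):
            # Same heading and still under the size cap: append the text
            merged[-1]["text"] += "\n" + r["text"]
            # Keep the best similarity among the merged members
            merged[-1]["similarity"] = max(merged[-1]["similarity"], r["similarity"])
        else:
            merged.append(dict(r))
    return merged

results = [
    {"heading_path": "Setup > DB", "text": "Part one.", "similarity": 0.9},
    {"heading_path": "Setup > DB", "text": "Part two.", "similarity": 0.8},
    {"heading_path": "Usage", "text": "Other section.", "similarity": 0.7},
]
merged = merge_adjacent_chunks(results)
# The two "Setup > DB" chunks collapse into a single entry
```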
hybrid_query async
hybrid_query(
    query: str,
    k: int = 5,
    text_weight: float = 0.5,
    vector_weight: float = 0.5,
    fusion_strategy: str = "rrf",
    text_fields: list[str] | None = None,
    filter_metadata: dict[str, Any] | None = None,
    min_similarity: float = 0.0,
    merge_adjacent: bool = False,
    max_chunk_size: int | None = None,
) -> list[dict[str, Any]]

Query knowledge base using hybrid search (text + vector).

Combines keyword matching with semantic vector search for improved retrieval quality. Uses Reciprocal Rank Fusion (RRF) or weighted score fusion to combine results.

Parameters:

Name Type Description Default
query str

Query text to search for

required
k int

Number of results to return

5
text_weight float

Weight for text search (0.0 to 1.0)

0.5
vector_weight float

Weight for vector search (0.0 to 1.0)

0.5
fusion_strategy str

Fusion method - "rrf" (default), "weighted_sum", or "native"

'rrf'
text_fields list[str] | None

Fields to search for text matching (default: ["text"])

None
filter_metadata dict[str, Any] | None

Optional metadata filters

None
min_similarity float

Minimum combined score (0-1)

0.0
merge_adjacent bool

Whether to merge adjacent chunks with same heading

False
max_chunk_size int | None

Maximum size for merged chunks

None

Returns:

Type Description
list[dict[str, Any]]

List of result dictionaries with:

- text: Chunk text
- source: Source file
- heading_path: Heading hierarchy
- similarity: Combined similarity score
- text_score: Score from text search (if available)
- vector_score: Score from vector search (if available)
- metadata: Full chunk metadata

Example
# Default RRF fusion
results = await kb.hybrid_query(
    "How do I configure the database?",
    k=5,
)

# Weighted toward vector search
results = await kb.hybrid_query(
    "database configuration",
    k=5,
    text_weight=0.3,
    vector_weight=0.7,
)

# Weighted sum fusion
results = await kb.hybrid_query(
    "configure database",
    k=5,
    fusion_strategy="weighted_sum",
)

for result in results:
    print(f"[{result['similarity']:.2f}] {result['heading_path']}")
    print(f"  text_score={result.get('text_score', 'N/A')}")
    print(f"  vector_score={result.get('vector_score', 'N/A')}")
    print(result['text'])
Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
async def hybrid_query(
    self,
    query: str,
    k: int = 5,
    text_weight: float = 0.5,
    vector_weight: float = 0.5,
    fusion_strategy: str = "rrf",
    text_fields: list[str] | None = None,
    filter_metadata: dict[str, Any] | None = None,
    min_similarity: float = 0.0,
    merge_adjacent: bool = False,
    max_chunk_size: int | None = None,
) -> list[dict[str, Any]]:
    """Query knowledge base using hybrid search (text + vector).

    Combines keyword matching with semantic vector search for improved
    retrieval quality. Uses Reciprocal Rank Fusion (RRF) or weighted
    score fusion to combine results.

    Args:
        query: Query text to search for
        k: Number of results to return
        text_weight: Weight for text search (0.0 to 1.0)
        vector_weight: Weight for vector search (0.0 to 1.0)
        fusion_strategy: Fusion method - "rrf" (default), "weighted_sum", or "native"
        text_fields: Fields to search for text matching (default: ["text"])
        filter_metadata: Optional metadata filters
        min_similarity: Minimum combined score (0-1)
        merge_adjacent: Whether to merge adjacent chunks with same heading
        max_chunk_size: Maximum size for merged chunks

    Returns:
        List of result dictionaries with:
            - text: Chunk text
            - source: Source file
            - heading_path: Heading hierarchy
            - similarity: Combined similarity score
            - text_score: Score from text search (if available)
            - vector_score: Score from vector search (if available)
            - metadata: Full chunk metadata

    Example:
        ```python
        # Default RRF fusion
        results = await kb.hybrid_query(
            "How do I configure the database?",
            k=5,
        )

        # Weighted toward vector search
        results = await kb.hybrid_query(
            "database configuration",
            k=5,
            text_weight=0.3,
            vector_weight=0.7,
        )

        # Weighted sum fusion
        results = await kb.hybrid_query(
            "configure database",
            k=5,
            fusion_strategy="weighted_sum",
        )

        for result in results:
            print(f"[{result['similarity']:.2f}] {result['heading_path']}")
            print(f"  text_score={result.get('text_score', 'N/A')}")
            print(f"  vector_score={result.get('vector_score', 'N/A')}")
            print(result['text'])
        ```
    """
    from dataknobs_data.vector.hybrid import (
        FusionStrategy,
        HybridSearchConfig,
        reciprocal_rank_fusion,
        weighted_score_fusion,
    )
    import numpy as np

    # Generate query embedding
    query_embedding = await self.embedding_provider.embed(query)

    # Convert to numpy if needed
    if not isinstance(query_embedding, np.ndarray):
        query_embedding = np.array(query_embedding, dtype=np.float32)

    # Check if vector store supports hybrid search natively
    has_hybrid = hasattr(self.vector_store, "hybrid_search")

    # Default text fields for knowledge base chunks
    search_text_fields = text_fields or ["text"]

    # Map string to FusionStrategy enum
    strategy_map = {
        "rrf": FusionStrategy.RRF,
        "weighted_sum": FusionStrategy.WEIGHTED_SUM,
        "native": FusionStrategy.NATIVE,
    }
    strategy = strategy_map.get(fusion_strategy.lower(), FusionStrategy.RRF)

    if has_hybrid and strategy == FusionStrategy.NATIVE:
        # Use vector store's native hybrid search
        config = HybridSearchConfig(
            text_weight=text_weight,
            vector_weight=vector_weight,
            fusion_strategy=strategy,
            text_fields=search_text_fields,
        )
        hybrid_results = await self.vector_store.hybrid_search(
            query_text=query,
            query_vector=query_embedding,
            text_fields=search_text_fields,
            k=k,
            config=config,
            filter=filter_metadata,
        )

        # Convert HybridSearchResult to our result format
        results = []
        for hr in hybrid_results:
            if hr.combined_score >= min_similarity:
                # Extract metadata from record
                record_metadata = {}
                if hasattr(hr.record, "data"):
                    record_metadata = hr.record.data or {}
                elif hasattr(hr.record, "metadata"):
                    record_metadata = hr.record.metadata or {}

                results.append({
                    "text": record_metadata.get("text", ""),
                    "source": record_metadata.get("source", ""),
                    "heading_path": record_metadata.get("heading_path", ""),
                    "similarity": hr.combined_score,
                    "text_score": hr.text_score,
                    "vector_score": hr.vector_score,
                    "metadata": record_metadata,
                })
    else:
        # Client-side hybrid search implementation
        # Step 1: Vector search
        vector_results = await self.vector_store.search(
            query_vector=query_embedding,
            k=k * 2,  # Get more for fusion
            filter=filter_metadata,
            include_metadata=True,
        )

        # Step 2: Text search (simple keyword matching on stored chunks)
        # For vector stores without text search, we search in retrieved chunks
        # and also do a broader metadata-based text match if supported

        # Build vector result map
        vector_scores: list[tuple[str, float]] = []
        chunks_by_id: dict[str, dict[str, Any]] = {}

        for chunk_id, similarity, chunk_metadata in vector_results:
            if chunk_metadata:
                vector_scores.append((chunk_id, similarity))
                chunks_by_id[chunk_id] = chunk_metadata

        # Simple text matching on chunk content
        query_lower = query.lower()
        query_terms = query_lower.split()
        text_scores: list[tuple[str, float]] = []

        for chunk_id, chunk_metadata in chunks_by_id.items():
            text_content = ""
            for field in search_text_fields:
                value = chunk_metadata.get(field, "")
                if value:
                    text_content += " " + str(value)

            text_content_lower = text_content.lower()

            # Calculate text match score
            if query_lower in text_content_lower:
                # Exact phrase match
                score = 1.0
            else:
                # Term overlap score
                matched_terms = sum(1 for term in query_terms if term in text_content_lower)
                score = matched_terms / len(query_terms) if query_terms else 0.0

            if score > 0:
                text_scores.append((chunk_id, score))

        # Sort text scores descending
        text_scores.sort(key=lambda x: x[1], reverse=True)

        # Step 3: Fuse results
        if strategy == FusionStrategy.WEIGHTED_SUM:
            total = text_weight + vector_weight
            if total > 0:
                norm_text = text_weight / total
                norm_vector = vector_weight / total
            else:
                norm_text = norm_vector = 0.5

            fused = weighted_score_fusion(
                text_results=text_scores,
                vector_results=vector_scores,
                text_weight=norm_text,
                vector_weight=norm_vector,
                normalize_scores=True,
            )
        else:
            # Default to RRF
            fused = reciprocal_rank_fusion(
                text_results=text_scores,
                vector_results=vector_scores,
                k=60,
                text_weight=text_weight,
                vector_weight=vector_weight,
            )

        # Build result list
        text_score_map = dict(text_scores)
        vector_score_map = dict(vector_scores)

        results = []
        for chunk_id, combined_score in fused[:k]:
            if combined_score < min_similarity:
                continue

            chunk_metadata = chunks_by_id.get(chunk_id)
            if not chunk_metadata:
                continue

            results.append({
                "text": chunk_metadata.get("text", ""),
                "source": chunk_metadata.get("source", ""),
                "heading_path": chunk_metadata.get("heading_path", ""),
                "similarity": combined_score,
                "text_score": text_score_map.get(chunk_id),
                "vector_score": vector_score_map.get(chunk_id),
                "metadata": chunk_metadata,
            })

    # Apply chunk merging if requested
    if merge_adjacent and results:
        if max_chunk_size is not None:
            merger = ChunkMerger(MergerConfig(max_merged_size=max_chunk_size))
        else:
            merger = self.merger

        merged_chunks = merger.merge(results)
        results = merger.to_result_list(merged_chunks)

    return results
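The client-side fallback above scores text matches by term overlap and then fuses the two rankings. The sketch below shows both steps in isolation, using the standard RRF formula (score = Σ weight / (k + rank)); the library's own helpers in `dataknobs_data.vector.hybrid` may differ in details:

```python
# Standalone sketch of the two client-side steps: term-overlap text
# scoring and Reciprocal Rank Fusion (RRF). Illustrative only.

def text_match_score(query: str, text: str) -> float:
    q, t = query.lower(), text.lower()
    if q in t:  # exact phrase match
        return 1.0
    terms = q.split()
    hits = sum(1 for term in terms if term in t)
    return hits / len(terms) if terms else 0.0

def rrf(text_results, vector_results, k=60, text_weight=0.5, vector_weight=0.5):
    # score(doc) = sum over ranked lists of weight / (k + rank)
    scores = {}
    for weight, ranked in ((text_weight, text_results), (vector_weight, vector_results)):
        for rank, (doc_id, _score) in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores.items(), key=lambda x: x[1], reverse=True)

text_ranked = [("a", 1.0), ("b", 0.5)]
vector_ranked = [("b", 0.9), ("c", 0.8)]
fused = rrf(text_ranked, vector_ranked)
# "b" appears in both lists, so it accumulates a contribution from each
```

Note that RRF only uses ranks, not the raw scores, which is why it needs no score normalization; the weighted_sum strategy normalizes scores instead.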
format_context
format_context(results: list[dict[str, Any]], wrap_in_tags: bool = True) -> str

Format search results for LLM context.

Convenience method to format results using the configured formatter.

Parameters:

Name Type Description Default
results list[dict[str, Any]]

Search results from query()

required
wrap_in_tags bool

Whether to wrap in <knowledge_base> tags

True

Returns:

Type Description
str

Formatted context string

Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
def format_context(
    self,
    results: list[dict[str, Any]],
    wrap_in_tags: bool = True,
) -> str:
    """Format search results for LLM context.

    Convenience method to format results using the configured formatter.

    Args:
        results: Search results from query()
        wrap_in_tags: Whether to wrap in <knowledge_base> tags

    Returns:
        Formatted context string
    """
    context = self.formatter.format(results)
    if wrap_in_tags:
        context = self.formatter.wrap_for_prompt(context)
    return context
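A plain-Python illustration of what a context formatter like this produces. The actual formatter is configurable, so the exact layout below is an assumption; only the wrap_in_tags behavior is taken from the docstring:

```python
# Illustrative formatter: renders query() results as an LLM context block.
# The real formatter in dataknobs_bots is pluggable; this layout is a guess.

def format_context_sketch(results, wrap_in_tags=True):
    sections = []
    for r in results:
        header = f"## {r['heading_path']} (source: {r['source']})"
        sections.append(f"{header}\n{r['text']}")
    context = "\n\n".join(sections)
    if wrap_in_tags:
        context = f"<knowledge_base>\n{context}\n</knowledge_base>"
    return context

out = format_context_sketch(
    [{"heading_path": "Setup", "source": "docs/setup.md", "text": "Run init."}]
)
```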
count async
count(filter: dict[str, Any] | None = None) -> int

Get the number of chunks in the knowledge base.

Delegates to the underlying vector store's count method.

Parameters:

Name Type Description Default
filter dict[str, Any] | None

Optional metadata filter to count only matching chunks

None

Returns:

Type Description
int

Number of chunks stored (optionally filtered)

Example
total = await kb.count()
domain_count = await kb.count(filter={"domain_id": "my-domain"})
Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
async def count(self, filter: dict[str, Any] | None = None) -> int:
    """Get the number of chunks in the knowledge base.

    Delegates to the underlying vector store's count method.

    Args:
        filter: Optional metadata filter to count only matching chunks

    Returns:
        Number of chunks stored (optionally filtered)

    Example:
        ```python
        total = await kb.count()
        domain_count = await kb.count(filter={"domain_id": "my-domain"})
        ```
    """
    return await self.vector_store.count(filter)
clear async
clear() -> None

Clear all documents from the knowledge base.

Warning: This removes all stored chunks and embeddings.

Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
async def clear(self) -> None:
    """Clear all documents from the knowledge base.

    Warning: This removes all stored chunks and embeddings.
    """
    if hasattr(self.vector_store, "clear"):
        await self.vector_store.clear()
    else:
        raise NotImplementedError(
            "Vector store does not support clearing. "
            "Consider creating a new knowledge base with a fresh collection."
        )
save async
save() -> None

Save the knowledge base to persistent storage.

This persists the vector store index and metadata to disk. Only applicable for vector stores that support persistence (e.g., FAISS).

Example
await kb.load_markdown_document("docs/api.md")
await kb.save()  # Persist to disk
Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
async def save(self) -> None:
    """Save the knowledge base to persistent storage.

    This persists the vector store index and metadata to disk.
    Only applicable for vector stores that support persistence (e.g., FAISS).

    Example:
        ```python
        await kb.load_markdown_document("docs/api.md")
        await kb.save()  # Persist to disk
        ```
    """
    if hasattr(self.vector_store, "save"):
        await self.vector_store.save()
providers
providers() -> dict[str, Any]

Return the embedding provider, keyed by role.

Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
def providers(self) -> dict[str, Any]:
    """Return the embedding provider, keyed by role."""
    from dataknobs_bots.bot.base import PROVIDER_ROLE_KB_EMBEDDING

    if self.embedding_provider is not None:
        return {PROVIDER_ROLE_KB_EMBEDDING: self.embedding_provider}
    return {}
set_provider
set_provider(role: str, provider: Any) -> bool

Replace the embedding provider if the role matches.

Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
def set_provider(self, role: str, provider: Any) -> bool:
    """Replace the embedding provider if the role matches."""
    from dataknobs_bots.bot.base import PROVIDER_ROLE_KB_EMBEDDING

    if role == PROVIDER_ROLE_KB_EMBEDDING:
        self.embedding_provider = provider
        return True
    return False
close async
close() -> None

Close the knowledge base and release resources.

This method:

- Saves the vector store to disk (if persistence is configured)
- Closes the vector store connection
- Closes the embedding provider (releases HTTP sessions)

Should be called when done using the knowledge base to prevent resource leaks (e.g., unclosed aiohttp sessions).

Example
kb = await RAGKnowledgeBase.from_config(config)
try:
    await kb.load_markdown_document("docs/api.md")
    results = await kb.query("How do I configure?")
finally:
    await kb.close()
Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
async def close(self) -> None:
    """Close the knowledge base and release resources.

    This method:
    - Saves the vector store to disk (if persistence is configured)
    - Closes the vector store connection
    - Closes the embedding provider (releases HTTP sessions)

    Should be called when done using the knowledge base to prevent
    resource leaks (e.g., unclosed aiohttp sessions).

    Example:
        ```python
        kb = await RAGKnowledgeBase.from_config(config)
        try:
            await kb.load_markdown_document("docs/api.md")
            results = await kb.query("How do I configure?")
        finally:
            await kb.close()
        ```
    """
    # Close vector store (will save if persist_path is set)
    if hasattr(self.vector_store, "close"):
        await self.vector_store.close()

    # Close embedding provider (releases HTTP client sessions)
    if hasattr(self.embedding_provider, "close"):
        await self.embedding_provider.close()
__aenter__ async
__aenter__() -> Self

Async context manager entry.

Returns:

Type Description
Self

Self for use in async with statement

Example
async with await RAGKnowledgeBase.from_config(config) as kb:
    await kb.load_markdown_document("docs/api.md")
    results = await kb.query("How do I configure?")
# Automatically saved and closed
Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
async def __aenter__(self) -> Self:
    """Async context manager entry.

    Returns:
        Self for use in async with statement

    Example:
        ```python
        async with await RAGKnowledgeBase.from_config(config) as kb:
            await kb.load_markdown_document("docs/api.md")
            results = await kb.query("How do I configure?")
        # Automatically saved and closed
        ```
    """
    return self
__aexit__ async
__aexit__(
    exc_type: type[BaseException] | None,
    exc_val: BaseException | None,
    exc_tb: TracebackType | None,
) -> None

Async context manager exit - ensures cleanup.

Parameters:

Name Type Description Default
exc_type type[BaseException] | None

Exception type if an exception occurred

required
exc_val BaseException | None

Exception value if an exception occurred

required
exc_tb TracebackType | None

Exception traceback if an exception occurred

required
Source code in packages/bots/src/dataknobs_bots/knowledge/rag.py
async def __aexit__(
    self,
    exc_type: type[BaseException] | None,
    exc_val: BaseException | None,
    exc_tb: types.TracebackType | None,
) -> None:
    """Async context manager exit - ensures cleanup.

    Args:
        exc_type: Exception type if an exception occurred
        exc_val: Exception value if an exception occurred
        exc_tb: Exception traceback if an exception occurred
    """
    await self.close()

BufferMemory

BufferMemory(max_messages: int = 10)

Bases: Memory

Simple buffer memory keeping last N messages.

This implementation uses a fixed-size buffer that keeps the most recent messages in memory. When the buffer is full, the oldest messages are automatically removed.

Attributes:

Name Type Description
max_messages

Maximum number of messages to keep in buffer

messages deque[dict[str, Any]]

Deque containing the messages

Initialize buffer memory.

Parameters:

Name Type Description Default
max_messages int

Maximum number of messages to keep

10

Methods:

Name Description
add_message

Add message to buffer.

get_context

Get all messages in buffer.

clear

Clear all messages from buffer.

pop_messages

Remove and return the last N messages from the buffer.

Source code in packages/bots/src/dataknobs_bots/memory/buffer.py
def __init__(self, max_messages: int = 10):
    """Initialize buffer memory.

    Args:
        max_messages: Maximum number of messages to keep
    """
    self.max_messages = max_messages
    self.messages: deque[dict[str, Any]] = deque(maxlen=max_messages)
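The `deque(maxlen=...)` in `__init__` is what enforces the buffer limit: appending past capacity silently drops the oldest entry, which is exactly the eviction behavior described above.

```python
from collections import deque

# A deque with maxlen evicts from the left when a new item is appended
# past capacity, mirroring how BufferMemory keeps only the last N messages.
messages = deque(maxlen=3)
for i in range(5):
    messages.append({"role": "user", "content": f"msg {i}"})

contents = [m["content"] for m in messages]
# Only the three most recent messages remain
```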
Functions
add_message async
add_message(
    content: str, role: str, metadata: dict[str, Any] | None = None
) -> None

Add message to buffer.

Parameters:

Name Type Description Default
content str

Message content

required
role str

Message role

required
metadata dict[str, Any] | None

Optional metadata

None
Source code in packages/bots/src/dataknobs_bots/memory/buffer.py
async def add_message(
    self, content: str, role: str, metadata: dict[str, Any] | None = None
) -> None:
    """Add message to buffer.

    Args:
        content: Message content
        role: Message role
        metadata: Optional metadata
    """
    self.messages.append({"content": content, "role": role, "metadata": metadata or {}})
get_context async
get_context(current_message: str) -> list[dict[str, Any]]

Get all messages in buffer.

The current_message parameter is not used in buffer memory since we simply return all buffered messages in order.

Parameters:

Name Type Description Default
current_message str

Not used in buffer memory

required

Returns:

Type Description
list[dict[str, Any]]

List of all buffered messages

Source code in packages/bots/src/dataknobs_bots/memory/buffer.py
async def get_context(self, current_message: str) -> list[dict[str, Any]]:
    """Get all messages in buffer.

    The current_message parameter is not used in buffer memory since
    we simply return all buffered messages in order.

    Args:
        current_message: Not used in buffer memory

    Returns:
        List of all buffered messages
    """
    return list(self.messages)
clear async
clear() -> None

Clear all messages from buffer.

Source code in packages/bots/src/dataknobs_bots/memory/buffer.py
async def clear(self) -> None:
    """Clear all messages from buffer."""
    self.messages.clear()
pop_messages async
pop_messages(count: int = 2) -> list[dict[str, Any]]

Remove and return the last N messages from the buffer.

Parameters:

Name Type Description Default
count int

Number of messages to remove from the end.

2

Returns:

Type Description
list[dict[str, Any]]

The removed messages in the order they were stored.

Raises:

Type Description
ValueError

If count exceeds available messages or is < 1.

Source code in packages/bots/src/dataknobs_bots/memory/buffer.py
async def pop_messages(self, count: int = 2) -> list[dict[str, Any]]:
    """Remove and return the last N messages from the buffer.

    Args:
        count: Number of messages to remove from the end.

    Returns:
        The removed messages in the order they were stored.

    Raises:
        ValueError: If count exceeds available messages or is < 1.
    """
    if count < 1:
        raise ValueError(f"count must be >= 1, got {count}")
    if count > len(self.messages):
        raise ValueError(
            f"Cannot pop {count} messages, only {len(self.messages)} available"
        )
    removed = []
    for _ in range(count):
        removed.append(self.messages.pop())
    removed.reverse()
    return removed

CompositeMemory

CompositeMemory(strategies: list[Memory], *, primary_index: int = 0)

Bases: Memory

Combines multiple memory strategies into one.

All sub-strategies receive every add_message() call independently. On get_context(), the primary strategy's results appear first, followed by deduplicated results from secondary strategies.

Graceful degradation: if any strategy fails on a read or write, the composite logs a warning and continues with the remaining strategies.

Attributes:

Name Type Description
primary Memory

The primary memory strategy (results appear first).

strategies list[Memory]

All sub-strategies in order.

Initialize composite memory.

Parameters:

Name Type Description Default
strategies list[Memory]

List of memory strategy instances.

required
primary_index int

Index of the primary strategy in the list.

0

Raises:

Type Description
ValueError

If strategies is empty or primary_index is out of range.

Methods:

Name Description
add_message

Forward message to all strategies.

get_context

Collect context from all strategies, primary first.

clear

Clear all strategies. Log and continue on individual failures.

pop_messages

Delegate to primary strategy only.

close

Close all strategies. Log and continue on individual failures.

providers

Aggregate providers from all sub-strategies.

set_provider

Forward to all sub-strategies; return True if any accepted.

Source code in packages/bots/src/dataknobs_bots/memory/composite.py
def __init__(
    self,
    strategies: list[Memory],
    *,
    primary_index: int = 0,
) -> None:
    """Initialize composite memory.

    Args:
        strategies: List of memory strategy instances.
        primary_index: Index of the primary strategy in the list.

    Raises:
        ValueError: If strategies is empty or primary_index is out of range.
    """
    if not strategies:
        raise ValueError("CompositeMemory requires at least one strategy")
    if primary_index < 0 or primary_index >= len(strategies):
        raise ValueError(
            f"primary_index {primary_index} out of range for "
            f"{len(strategies)} strategies"
        )
    self._strategies = strategies
    self._primary_index = primary_index
Attributes
primary property
primary: Memory

The primary memory strategy.

strategies property
strategies: list[Memory]

All sub-strategies (defensive copy).

Functions
add_message async
add_message(
    content: str, role: str, metadata: dict[str, Any] | None = None
) -> None

Forward message to all strategies.

If a strategy raises, the error is logged and remaining strategies still receive the message.

Source code in packages/bots/src/dataknobs_bots/memory/composite.py
async def add_message(
    self, content: str, role: str, metadata: dict[str, Any] | None = None
) -> None:
    """Forward message to all strategies.

    If a strategy raises, the error is logged and remaining strategies
    still receive the message.
    """
    for i, strategy in enumerate(self._strategies):
        try:
            await strategy.add_message(content, role, metadata)
        except _STRATEGY_ERRORS:
            logger.warning(
                "Memory strategy %d (%s) failed on add_message",
                i,
                type(strategy).__name__,
                exc_info=True,
            )
get_context async
get_context(current_message: str) -> list[dict[str, Any]]

Collect context from all strategies, primary first.

Results from the primary strategy appear first. All results are deduplicated by (role, content) — if a message with the same role and content already appeared, it is not repeated.

Source code in packages/bots/src/dataknobs_bots/memory/composite.py
async def get_context(self, current_message: str) -> list[dict[str, Any]]:
    """Collect context from all strategies, primary first.

    Results from the primary strategy appear first. All results are
    deduplicated by ``(role, content)`` — if a message with the same
    role and content already appeared, it is not repeated.
    """
    results: list[dict[str, Any]] = []
    seen: set[tuple[str, str]] = set()

    # Primary first
    try:
        primary_results = await self.primary.get_context(current_message)
        for msg in primary_results:
            key = (msg.get("role", ""), msg.get("content", ""))
            if key not in seen:
                results.append(msg)
                seen.add(key)
    except _STRATEGY_ERRORS:
        logger.warning(
            "Primary memory strategy (%s) failed on get_context",
            type(self.primary).__name__,
            exc_info=True,
        )

    # Secondaries — skip primary, dedup by (role, content)
    for i, strategy in enumerate(self._strategies):
        if i == self._primary_index:
            continue
        try:
            secondary_results = await strategy.get_context(current_message)
            for msg in secondary_results:
                key = (msg.get("role", ""), msg.get("content", ""))
                if key not in seen:
                    results.append(msg)
                    seen.add(key)
        except _STRATEGY_ERRORS:
            logger.warning(
                "Memory strategy %d (%s) failed on get_context",
                i,
                type(strategy).__name__,
                exc_info=True,
            )

    return results
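The `(role, content)` deduplication used above can be exercised in isolation. This standalone version assumes the same ordering rule: primary results first, then any secondary messages not already seen.

```python
# Standalone version of the (role, content) dedup in
# CompositeMemory.get_context. Illustrative; error handling omitted.
def dedup_contexts(primary, *secondaries):
    results = []
    seen = set()
    for batch in (primary, *secondaries):
        for msg in batch:
            key = (msg.get("role", ""), msg.get("content", ""))
            if key not in seen:
                results.append(msg)
                seen.add(key)
    return results

primary = [{"role": "user", "content": "hi"}]
secondary = [
    {"role": "user", "content": "hi"},         # duplicate, dropped
    {"role": "assistant", "content": "hello"},  # new, kept
]
merged = dedup_contexts(primary, secondary)
```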
clear async
clear() -> None

Clear all strategies. Log and continue on individual failures.

Source code in packages/bots/src/dataknobs_bots/memory/composite.py
async def clear(self) -> None:
    """Clear all strategies. Log and continue on individual failures."""
    for i, strategy in enumerate(self._strategies):
        try:
            await strategy.clear()
        except _STRATEGY_ERRORS:
            logger.warning(
                "Memory strategy %d (%s) failed on clear",
                i,
                type(strategy).__name__,
                exc_info=True,
            )
pop_messages async
pop_messages(count: int = 2) -> list[dict[str, Any]]

Delegate to primary strategy only.

Secondary strategies (especially vector) may not support undo. If the primary doesn't support it, NotImplementedError propagates.

Source code in packages/bots/src/dataknobs_bots/memory/composite.py
async def pop_messages(self, count: int = 2) -> list[dict[str, Any]]:
    """Delegate to primary strategy only.

    Secondary strategies (especially vector) may not support undo.
    If the primary doesn't support it, NotImplementedError propagates.
    """
    return await self.primary.pop_messages(count)
close async
close() -> None

Close all strategies. Log and continue on individual failures.

Source code in packages/bots/src/dataknobs_bots/memory/composite.py
async def close(self) -> None:
    """Close all strategies. Log and continue on individual failures."""
    for i, strategy in enumerate(self._strategies):
        try:
            await strategy.close()
        except _STRATEGY_ERRORS:
            logger.warning(
                "Memory strategy %d (%s) failed on close",
                i,
                type(strategy).__name__,
                exc_info=True,
            )
providers
providers() -> dict[str, Any]

Aggregate providers from all sub-strategies.

If multiple strategies expose the same role, the last one wins and a warning is logged.

Source code in packages/bots/src/dataknobs_bots/memory/composite.py
def providers(self) -> dict[str, Any]:
    """Aggregate providers from all sub-strategies.

    If multiple strategies expose the same role, the last one wins
    and a warning is logged.
    """
    result: dict[str, Any] = {}
    for i, strategy in enumerate(self._strategies):
        for role, provider in strategy.providers().items():
            if role in result:
                logger.warning(
                    "Provider role %r already registered by an earlier "
                    "strategy; strategy %d (%s) overwrites it",
                    role,
                    i,
                    type(strategy).__name__,
                )
            result[role] = provider
    return result
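The last-one-wins aggregation amounts to repeated `dict.update` calls over the strategies in order. A standalone sketch (the role names here are illustrative strings, not the package's actual role constants):

```python
from typing import Any


def aggregate_providers(strategy_providers: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge provider dicts in strategy order; later strategies overwrite earlier ones."""
    result: dict[str, Any] = {}
    for providers in strategy_providers:
        result.update(providers)  # duplicate roles: the later strategy wins
    return result


merged = aggregate_providers([
    {"memory_embedding": "provider_a"},
    {"memory_embedding": "provider_b", "summary_llm": "provider_c"},
])
# "memory_embedding" resolves to provider_b, from the later strategy
```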
set_provider
set_provider(role: str, provider: Any) -> bool

Forward to all sub-strategies; return True if any accepted.

Source code in packages/bots/src/dataknobs_bots/memory/composite.py
def set_provider(self, role: str, provider: Any) -> bool:
    """Forward to all sub-strategies; return True if any accepted."""
    accepted = False
    for strategy in self._strategies:
        if strategy.set_provider(role, provider):
            accepted = True
    return accepted

Memory

Bases: ABC

Abstract base class for memory implementations.

Methods:

Name Description
add_message

Add message to memory.

get_context

Get relevant context for current message.

clear

Clear all memory.

providers

Return LLM providers managed by this memory, keyed by role.

set_provider

Replace a provider managed by this memory.

close

Release resources held by this memory implementation.

pop_messages

Remove and return the last N messages from memory.

Functions
add_message abstractmethod async
add_message(
    content: str, role: str, metadata: dict[str, Any] | None = None
) -> None

Add message to memory.

Parameters:

Name Type Description Default
content str

Message content

required
role str

Message role (user, assistant, system, etc.)

required
metadata dict[str, Any] | None

Optional metadata for the message

None
Source code in packages/bots/src/dataknobs_bots/memory/base.py
@abstractmethod
async def add_message(
    self, content: str, role: str, metadata: dict[str, Any] | None = None
) -> None:
    """Add message to memory.

    Args:
        content: Message content
        role: Message role (user, assistant, system, etc.)
        metadata: Optional metadata for the message
    """
    pass
get_context abstractmethod async
get_context(current_message: str) -> list[dict[str, Any]]

Get relevant context for current message.

Parameters:

Name Type Description Default
current_message str

The current message to get context for

required

Returns:

Type Description
list[dict[str, Any]]

List of relevant message dictionaries

Source code in packages/bots/src/dataknobs_bots/memory/base.py
@abstractmethod
async def get_context(self, current_message: str) -> list[dict[str, Any]]:
    """Get relevant context for current message.

    Args:
        current_message: The current message to get context for

    Returns:
        List of relevant message dictionaries
    """
    pass
clear abstractmethod async
clear() -> None

Clear all memory.

Source code in packages/bots/src/dataknobs_bots/memory/base.py
@abstractmethod
async def clear(self) -> None:
    """Clear all memory."""
    pass
providers
providers() -> dict[str, Any]

Return LLM providers managed by this memory, keyed by role.

Subsystems declare the providers they own so that the bot can register them in the provider catalog without reaching into private attributes. The default returns an empty dict (no providers).

Returns:

Type Description
dict[str, Any]

Dict mapping provider role names to provider instances.

Source code in packages/bots/src/dataknobs_bots/memory/base.py
def providers(self) -> dict[str, Any]:
    """Return LLM providers managed by this memory, keyed by role.

    Subsystems declare the providers they own so that the bot can
    register them in the provider catalog without reaching into
    private attributes.  The default returns an empty dict (no
    providers).

    Returns:
        Dict mapping provider role names to provider instances.
    """
    return {}
set_provider
set_provider(role: str, provider: Any) -> bool

Replace a provider managed by this memory.

Called by inject_providers to wire a test provider into the actual subsystem, not just the registry catalog. The default returns False (role not recognized). Concrete subclasses override to accept their known roles.

Parameters:

Name Type Description Default
role str

Provider role name (e.g. PROVIDER_ROLE_MEMORY_EMBEDDING).

required
provider Any

Replacement provider instance.

required

Returns:

Type Description
bool

True if the role was recognized and the provider updated, False otherwise.

Source code in packages/bots/src/dataknobs_bots/memory/base.py
def set_provider(self, role: str, provider: Any) -> bool:
    """Replace a provider managed by this memory.

    Called by ``inject_providers`` to wire a test provider into the
    actual subsystem, not just the registry catalog.  The default
    returns ``False`` (role not recognized).  Concrete subclasses
    override to accept their known roles.

    Args:
        role: Provider role name (e.g. ``PROVIDER_ROLE_MEMORY_EMBEDDING``).
        provider: Replacement provider instance.

    Returns:
        ``True`` if the role was recognized and the provider updated,
        ``False`` otherwise.
    """
    return False
close async
close() -> None

Release resources held by this memory implementation.

The default is a no-op. Subclasses that create providers or open connections (e.g. VectorMemory, SummaryMemory) should override to clean up.

Source code in packages/bots/src/dataknobs_bots/memory/base.py
async def close(self) -> None:  # noqa: B027 — intentional no-op default
    """Release resources held by this memory implementation.

    The default is a no-op.  Subclasses that create providers or open
    connections (e.g. ``VectorMemory``, ``SummaryMemory``) should
    override to clean up.
    """
pop_messages async
pop_messages(count: int = 2) -> list[dict[str, Any]]

Remove and return the last N messages from memory.

Used for conversation undo. The count is determined by the caller based on node depth difference (not a fixed 2).

Parameters:

Name Type Description Default
count int

Number of messages to remove from the end.

2

Returns:

Type Description
list[dict[str, Any]]

The removed messages in the order they were stored.

Raises:

Type Description
NotImplementedError

If the implementation does not support undo.

Source code in packages/bots/src/dataknobs_bots/memory/base.py
async def pop_messages(self, count: int = 2) -> list[dict[str, Any]]:
    """Remove and return the last N messages from memory.

    Used for conversation undo. The count is determined by the caller
    based on node depth difference (not a fixed 2).

    Args:
        count: Number of messages to remove from the end.

    Returns:
        The removed messages in the order they were stored.

    Raises:
        NotImplementedError: If the implementation does not support undo.
    """
    raise NotImplementedError(
        f"{type(self).__name__} does not support pop_messages"
    )

SummaryMemory

SummaryMemory(
    llm_provider: AsyncLLMProvider,
    recent_window: int = 10,
    summary_prompt: str | None = None,
    *,
    owns_llm_provider: bool = False,
)

Bases: Memory

Memory that summarizes older messages to maintain long context windows.

Maintains a rolling buffer of recent messages. When the buffer exceeds a configurable threshold, the oldest messages are compressed into a running summary using the LLM provider. This trades exact message recall for a much longer effective context window.

get_context() returns the summary (if any) as a system message, followed by the recent verbatim messages.

Attributes:

Name Type Description
llm_provider

LLM provider used for generating summaries

recent_window

Number of recent messages to keep verbatim

summary_prompt

Template for the summarization prompt

Initialize summary memory.

Parameters:

Name Type Description Default
llm_provider AsyncLLMProvider

Async LLM provider for generating summaries

required
recent_window int

Number of recent messages to keep verbatim. When the buffer has more than recent_window messages, the oldest are summarized.

10
summary_prompt str | None

Custom summarization prompt template. Must contain {existing_summary} and {new_messages} placeholders.

None
owns_llm_provider bool

Whether this instance owns the provider's lifecycle. True when a dedicated provider was created for this memory; False when reusing the bot's main LLM.

False

Methods:

Name Description
add_message

Add a message and trigger summarization if the buffer is full.

get_context

Return the running summary followed by recent messages.

providers

Return the summary LLM provider for catalog registration.

set_provider

Replace the summary LLM provider if the role matches.

close

Close the LLM provider if this instance owns it.

clear

Clear both the running summary and the message buffer.

pop_messages

Remove and return the last N messages from the recent window.

Source code in packages/bots/src/dataknobs_bots/memory/summary.py
def __init__(
    self,
    llm_provider: AsyncLLMProvider,
    recent_window: int = 10,
    summary_prompt: str | None = None,
    *,
    owns_llm_provider: bool = False,
) -> None:
    """Initialize summary memory.

    Args:
        llm_provider: Async LLM provider for generating summaries
        recent_window: Number of recent messages to keep verbatim.
                      When the buffer has more than ``recent_window``
                      messages, the oldest are summarized.
        summary_prompt: Custom summarization prompt template. Must
                       contain ``{existing_summary}`` and
                       ``{new_messages}`` placeholders.
        owns_llm_provider: Whether this instance owns the provider's
            lifecycle. True when a dedicated provider was created for
            this memory; False when reusing the bot's main LLM.
    """
    self.llm_provider = llm_provider
    self.recent_window = recent_window
    self.summary_prompt = summary_prompt or DEFAULT_SUMMARY_PROMPT
    self._owns_llm_provider = owns_llm_provider
    self._messages: deque[dict[str, Any]] = deque()
    self._summary: str = ""
Functions
add_message async
add_message(
    content: str, role: str, metadata: dict[str, Any] | None = None
) -> None

Add a message and trigger summarization if the buffer is full.

When the number of buffered messages exceeds recent_window, the oldest messages are summarized into the running summary using the LLM provider. On LLM failure, older messages are dropped to keep the buffer within bounds (graceful degradation).

Parameters:

Name Type Description Default
content str

Message content

required
role str

Message role (user, assistant, system)

required
metadata dict[str, Any] | None

Optional metadata for the message

None
Source code in packages/bots/src/dataknobs_bots/memory/summary.py
async def add_message(
    self, content: str, role: str, metadata: dict[str, Any] | None = None
) -> None:
    """Add a message and trigger summarization if the buffer is full.

    When the number of buffered messages exceeds ``recent_window``,
    the oldest messages are summarized into the running summary using
    the LLM provider. On LLM failure, older messages are dropped to
    keep the buffer within bounds (graceful degradation).

    Args:
        content: Message content
        role: Message role (user, assistant, system)
        metadata: Optional metadata for the message
    """
    self._messages.append(
        {"content": content, "role": role, "metadata": metadata or {}}
    )

    if len(self._messages) > self.recent_window:
        await self._summarize_oldest()
get_context async
get_context(current_message: str) -> list[dict[str, Any]]

Return the running summary followed by recent messages.

Parameters:

Name Type Description Default
current_message str

The current message (not used by summary memory, kept for interface compatibility)

required

Returns:

Type Description
list[dict[str, Any]]

List of message dicts. If a summary exists it is the first element with role="system"; the remaining elements are the recent verbatim messages.

Source code in packages/bots/src/dataknobs_bots/memory/summary.py
async def get_context(self, current_message: str) -> list[dict[str, Any]]:
    """Return the running summary followed by recent messages.

    Args:
        current_message: The current message (not used by summary memory,
                        kept for interface compatibility)

    Returns:
        List of message dicts. If a summary exists it is the first
        element with ``role="system"``; the remaining elements are
        the recent verbatim messages.
    """
    context: list[dict[str, Any]] = []

    if self._summary:
        context.append(
            {
                "content": f"[Conversation summary]: {self._summary}",
                "role": "system",
                "metadata": {"is_summary": True},
            }
        )

    context.extend(self._messages)
    return context
providers
providers() -> dict[str, Any]

Return the summary LLM provider for catalog registration.

Always reports the provider for discovery and observability. The _owns_llm_provider flag controls lifecycle (close()), not visibility — consistent with VectorMemory, RAGKnowledgeBase, and WizardReasoning.

Source code in packages/bots/src/dataknobs_bots/memory/summary.py
def providers(self) -> dict[str, Any]:
    """Return the summary LLM provider for catalog registration.

    Always reports the provider for discovery and observability.
    The ``_owns_llm_provider`` flag controls lifecycle (``close()``),
    not visibility — consistent with VectorMemory, RAGKnowledgeBase,
    and WizardReasoning.
    """
    from dataknobs_bots.bot.base import PROVIDER_ROLE_SUMMARY_LLM

    if self.llm_provider is not None:
        return {PROVIDER_ROLE_SUMMARY_LLM: self.llm_provider}
    return {}
set_provider
set_provider(role: str, provider: Any) -> bool

Replace the summary LLM provider if the role matches.

Source code in packages/bots/src/dataknobs_bots/memory/summary.py
def set_provider(self, role: str, provider: Any) -> bool:
    """Replace the summary LLM provider if the role matches."""
    from dataknobs_bots.bot.base import PROVIDER_ROLE_SUMMARY_LLM

    if role == PROVIDER_ROLE_SUMMARY_LLM:
        self.llm_provider = provider
        return True
    return False
close async
close() -> None

Close the LLM provider if this instance owns it.

When a dedicated provider was created for this memory (via the llm config key), this instance owns its lifecycle. When the bot's main LLM was passed in as a fallback, the bot owns it.

Source code in packages/bots/src/dataknobs_bots/memory/summary.py
async def close(self) -> None:
    """Close the LLM provider if this instance owns it.

    When a dedicated provider was created for this memory (via the
    ``llm`` config key), this instance owns its lifecycle. When the
    bot's main LLM was passed in as a fallback, the bot owns it.
    """
    if self._owns_llm_provider and self.llm_provider and hasattr(self.llm_provider, "close"):
        try:
            await self.llm_provider.close()
        except Exception:
            logger.exception("Error closing summary LLM provider")
clear async
clear() -> None

Clear both the running summary and the message buffer.

Source code in packages/bots/src/dataknobs_bots/memory/summary.py
async def clear(self) -> None:
    """Clear both the running summary and the message buffer."""
    self._messages.clear()
    self._summary = ""
pop_messages async
pop_messages(count: int = 2) -> list[dict[str, Any]]

Remove and return the last N messages from the recent window.

Only messages still in the recent buffer can be popped. Messages that have already been summarized are irreversibly compressed and cannot be individually removed.

Parameters:

Name Type Description Default
count int

Number of messages to remove from the end.

2

Returns:

Type Description
list[dict[str, Any]]

The removed messages in the order they were stored.

Raises:

Type Description
ValueError

If count exceeds available (unsummarized) messages or is < 1.

Source code in packages/bots/src/dataknobs_bots/memory/summary.py
async def pop_messages(self, count: int = 2) -> list[dict[str, Any]]:
    """Remove and return the last N messages from the recent window.

    Only messages still in the recent buffer can be popped. Messages that
    have already been summarized are irreversibly compressed and cannot be
    individually removed.

    Args:
        count: Number of messages to remove from the end.

    Returns:
        The removed messages in the order they were stored.

    Raises:
        ValueError: If count exceeds available (unsummarized) messages
            or is < 1.
    """
    if count < 1:
        raise ValueError(f"count must be >= 1, got {count}")
    if count > len(self._messages):
        raise ValueError(
            f"Cannot pop {count} messages, only {len(self._messages)} "
            f"unsummarized messages available in the recent window"
        )
    removed = []
    for _ in range(count):
        removed.append(self._messages.pop())
    removed.reverse()
    return removed
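The rolling-buffer behavior described above can be sketched with a stubbed summarizer in place of the LLM provider. This is a toy reproduction of the buffering logic, not the real `SummaryMemory` (which batches summarization differently and calls an actual LLM):

```python
import asyncio
from collections import deque
from typing import Any


class FakeSummarizer:
    """Stand-in for the LLM: reports how many messages have been compressed."""

    async def summarize(self, existing: str, messages: list[dict[str, Any]]) -> str:
        prior = int(existing.split()[0]) if existing else 0
        return f"{prior + len(messages)} messages summarized"


class RollingSummaryBuffer:
    """Toy reproduction of the summarize-oldest-when-full pattern."""

    def __init__(self, summarizer: FakeSummarizer, recent_window: int = 3) -> None:
        self.summarizer = summarizer
        self.recent_window = recent_window
        self._messages: deque[dict[str, Any]] = deque()
        self._summary = ""

    async def add_message(self, content: str, role: str) -> None:
        self._messages.append({"content": content, "role": role})
        if len(self._messages) > self.recent_window:
            oldest = [self._messages.popleft()]
            self._summary = await self.summarizer.summarize(self._summary, oldest)

    async def get_context(self) -> list[dict[str, Any]]:
        context: list[dict[str, Any]] = []
        if self._summary:
            context.append(
                {"content": f"[Conversation summary]: {self._summary}", "role": "system"}
            )
        context.extend(self._messages)
        return context


async def demo() -> list[dict[str, Any]]:
    mem = RollingSummaryBuffer(FakeSummarizer(), recent_window=3)
    for i in range(5):
        await mem.add_message(f"message {i}", "user")
    return await mem.get_context()


context = asyncio.run(demo())
# first element is the system summary; the last 3 messages stay verbatim
```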

VectorMemory

VectorMemory(
    vector_store: Any,
    embedding_provider: Any,
    max_results: int = 5,
    similarity_threshold: float = 0.7,
    default_metadata: dict[str, Any] | None = None,
    default_filter: dict[str, Any] | None = None,
    owns_embedding_provider: bool = False,
    owns_vector_store: bool = False,
)

Bases: Memory

Vector-based semantic memory using dataknobs-data vector stores.

This implementation stores messages with vector embeddings and retrieves relevant messages based on semantic similarity.

Attributes:

Name Type Description
vector_store

Vector store backend from dataknobs_data.vector.stores

embedding_provider

LLM provider for generating embeddings

max_results

Maximum number of results to return

similarity_threshold

Minimum similarity score for results

Initialize vector memory.

Parameters:

Name Type Description Default
vector_store Any

Vector store backend instance

required
embedding_provider Any

LLM provider with embed() method

required
max_results int

Maximum number of similar messages to return

5
similarity_threshold float

Minimum similarity score (0-1)

0.7
default_metadata dict[str, Any] | None

Metadata merged into every add_message() call. Caller-supplied metadata overrides these defaults. Use for tenant scoping, e.g. {"user_id": "u123"}.

None
default_filter dict[str, Any] | None

Filter merged into every get_context() search call. Use to scope reads to a tenant, e.g. {"user_id": "u123"}.

None
owns_embedding_provider bool

If True, close() will close the embedding provider. Set by from_config for resources it creates. Default False for externally-injected providers.

False
owns_vector_store bool

If True, close() will close the vector store. Set by from_config for resources it creates. Default False for externally-injected stores.

False

Methods:

Name Description
from_config

Create VectorMemory from configuration.

add_message

Add message with vector embedding.

get_context

Get semantically relevant messages.

providers

Return the embedding provider, keyed by role.

set_provider

Replace the embedding provider if the role matches.

close

Close owned resources.

clear

Clear all vectors from memory.

Source code in packages/bots/src/dataknobs_bots/memory/vector.py
def __init__(
    self,
    vector_store: Any,
    embedding_provider: Any,
    max_results: int = 5,
    similarity_threshold: float = 0.7,
    default_metadata: dict[str, Any] | None = None,
    default_filter: dict[str, Any] | None = None,
    owns_embedding_provider: bool = False,
    owns_vector_store: bool = False,
):
    """Initialize vector memory.

    Args:
        vector_store: Vector store backend instance
        embedding_provider: LLM provider with embed() method
        max_results: Maximum number of similar messages to return
        similarity_threshold: Minimum similarity score (0-1)
        default_metadata: Metadata merged into every ``add_message()``
            call. Caller-supplied metadata overrides these defaults.
            Use for tenant scoping, e.g. ``{"user_id": "u123"}``.
        default_filter: Filter merged into every ``get_context()``
            search call. Use to scope reads to a tenant, e.g.
            ``{"user_id": "u123"}``.
        owns_embedding_provider: If True, ``close()`` will close the
            embedding provider. Set by ``from_config`` for resources
            it creates. Default False for externally-injected providers.
        owns_vector_store: If True, ``close()`` will close the vector
            store. Set by ``from_config`` for resources it creates.
            Default False for externally-injected stores.
    """
    self.vector_store = vector_store
    self.embedding_provider = embedding_provider
    self.max_results = max_results
    self.similarity_threshold = similarity_threshold
    self._default_metadata = default_metadata or {}
    self._default_filter = default_filter or {}
    self._owns_embedding_provider = owns_embedding_provider
    self._owns_vector_store = owns_vector_store
Functions
from_config async classmethod
from_config(config: dict[str, Any]) -> VectorMemory

Create VectorMemory from configuration.

Parameters:

Name Type Description Default
config dict[str, Any]

Configuration dictionary with:

- backend: Vector store backend type
- dimension: Vector store dimension (singular; default 1536)
- collection: Collection/index name (optional)
- embedding: Nested embedding config dict (preferred), e.g. {"provider": "ollama", "model": "nomic-embed-text", "dimensions": 768}
- embedding_provider / embedding_model: Legacy flat keys. Note: dimensions (plural) at the top level is forwarded to the embedding provider, not the vector store. Use dimension (singular) for the vector store size.
- max_results: Max results to return (default 5)
- similarity_threshold: Min similarity score (default 0.7)

required

Returns:

Type Description
VectorMemory

Configured VectorMemory instance

Source code in packages/bots/src/dataknobs_bots/memory/vector.py
@classmethod
async def from_config(cls, config: dict[str, Any]) -> "VectorMemory":
    """Create VectorMemory from configuration.

    Args:
        config: Configuration dictionary with:
            - backend: Vector store backend type
            - dimension: Vector store dimension (singular; default 1536)
            - collection: Collection/index name (optional)
            - embedding: Nested embedding config dict (preferred), e.g.
              ``{"provider": "ollama", "model": "nomic-embed-text",
              "dimensions": 768}``
            - embedding_provider / embedding_model: Legacy flat keys.
              Note: ``dimensions`` (plural) at the top level is forwarded
              to the embedding provider, not the vector store.  Use
              ``dimension`` (singular) for the vector store size.
            - max_results: Max results to return (default 5)
            - similarity_threshold: Min similarity score (default 0.7)

    Returns:
        Configured VectorMemory instance
    """
    from dataknobs_data.vector.stores import VectorStoreFactory

    from ..providers import create_embedding_provider

    # Create vector store
    store_config = {
        "backend": config.get("backend", "memory"),
        "dimensions": config.get("dimension", 1536),
    }

    # Add optional store parameters
    if "collection" in config:
        store_config["collection_name"] = config["collection"]
    if "persist_path" in config:
        store_config["persist_path"] = config["persist_path"]

    # Merge any additional store_params
    if "store_params" in config:
        store_config.update(config["store_params"])

    factory = VectorStoreFactory()
    vector_store = factory.create(**store_config)
    await vector_store.initialize()

    # Create embedding provider
    embedding_provider = await create_embedding_provider(config)

    return cls(
        vector_store=vector_store,
        embedding_provider=embedding_provider,
        max_results=config.get("max_results", 5),
        similarity_threshold=config.get("similarity_threshold", 0.7),
        default_metadata=config.get("default_metadata"),
        default_filter=config.get("default_filter"),
        owns_embedding_provider=True,
        owns_vector_store=True,
    )
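Putting the keys documented above together, a configuration passed to `from_config` might look like the following sketch. The key names follow the docstring; the backend, model, and tenant values are placeholders:

```python
# Illustrative configuration for VectorMemory.from_config.
vector_memory_config = {
    "backend": "memory",           # vector store backend type
    "dimension": 768,              # store dimension (singular!)
    "collection": "chat_memory",   # optional collection/index name
    "embedding": {                 # nested embedding config (preferred form)
        "provider": "ollama",
        "model": "nomic-embed-text",
        "dimensions": 768,         # plural: forwarded to the embedding provider
    },
    "max_results": 5,
    "similarity_threshold": 0.7,
    "default_metadata": {"user_id": "u123"},  # tenant scoping on writes
    "default_filter": {"user_id": "u123"},    # tenant scoping on reads
}
```

Note the singular/plural distinction: `dimension` sizes the vector store, while `dimensions` inside `embedding` configures the embedding provider.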
add_message async
add_message(
    content: str, role: str, metadata: dict[str, Any] | None = None
) -> None

Add message with vector embedding.

Parameters:

Name Type Description Default
content str

Message content

required
role str

Message role

required
metadata dict[str, Any] | None

Optional caller-supplied metadata. Merged after default_metadata (from init) and system base fields (content, role, timestamp, id). Caller metadata has highest precedence.

None
Source code in packages/bots/src/dataknobs_bots/memory/vector.py
async def add_message(
    self, content: str, role: str, metadata: dict[str, Any] | None = None
) -> None:
    """Add message with vector embedding.

    Args:
        content: Message content
        role: Message role
        metadata: Optional caller-supplied metadata. Merged after
            ``default_metadata`` (from init) and system base fields
            (``content``, ``role``, ``timestamp``, ``id``).
            Caller metadata has highest precedence.
    """
    # Generate embedding
    embedding = await self.embedding_provider.embed(content)

    # Convert to numpy array if needed
    if not isinstance(embedding, np.ndarray):
        embedding = np.array(embedding, dtype=np.float32)

    # Merge order: defaults < base fields (system-controlled) < caller metadata
    msg_metadata = dict(self._default_metadata)
    msg_metadata.update({
        "content": content,
        "role": role,
        "timestamp": datetime.now().isoformat(),
        "id": str(uuid4()),
    })
    if metadata:
        msg_metadata.update(metadata)

    # Store in vector store
    await self.vector_store.add_vectors(
        vectors=[embedding], ids=[msg_metadata["id"]], metadata=[msg_metadata]
    )
get_context async
get_context(current_message: str) -> list[dict[str, Any]]

Get semantically relevant messages.

Parameters:

Name Type Description Default
current_message str

Current message to find context for

required

Returns:

Type Description
list[dict[str, Any]]

List of relevant message dictionaries sorted by similarity

Source code in packages/bots/src/dataknobs_bots/memory/vector.py
async def get_context(self, current_message: str) -> list[dict[str, Any]]:
    """Get semantically relevant messages.

    Args:
        current_message: Current message to find context for

    Returns:
        List of relevant message dictionaries sorted by similarity
    """
    # Generate query embedding
    query_embedding = await self.embedding_provider.embed(current_message)

    # Convert to numpy array if needed
    if not isinstance(query_embedding, np.ndarray):
        query_embedding = np.array(query_embedding, dtype=np.float32)

    # Search for similar vectors
    search_kwargs: dict[str, Any] = {
        "query_vector": query_embedding,
        "k": self.max_results,
        "include_metadata": True,
    }
    if self._default_filter:
        search_kwargs["filter"] = dict(self._default_filter)

    results = await self.vector_store.search(**search_kwargs)

    # Format results
    context = []
    for _vector_id, similarity, msg_metadata in results:
        if msg_metadata and similarity >= self.similarity_threshold:
            context.append(
                {
                    "content": msg_metadata.get("content", ""),
                    "role": msg_metadata.get("role", ""),
                    "similarity": similarity,
                    "metadata": msg_metadata,
                }
            )

    return context
providers
providers() -> dict[str, Any]

Return the embedding provider, keyed by role.

Source code in packages/bots/src/dataknobs_bots/memory/vector.py
def providers(self) -> dict[str, Any]:
    """Return the embedding provider, keyed by role."""
    from dataknobs_bots.bot.base import PROVIDER_ROLE_MEMORY_EMBEDDING

    if self.embedding_provider is not None:
        return {PROVIDER_ROLE_MEMORY_EMBEDDING: self.embedding_provider}
    return {}
set_provider
set_provider(role: str, provider: Any) -> bool

Replace the embedding provider if the role matches.

Source code in packages/bots/src/dataknobs_bots/memory/vector.py
def set_provider(self, role: str, provider: Any) -> bool:
    """Replace the embedding provider if the role matches."""
    from dataknobs_bots.bot.base import PROVIDER_ROLE_MEMORY_EMBEDDING

    if role == PROVIDER_ROLE_MEMORY_EMBEDDING:
        self.embedding_provider = provider
        return True
    return False
close async
close() -> None

Close owned resources.

Only closes resources that this instance owns (created in from_config). Externally-injected resources are left open for the caller to manage.

Source code in packages/bots/src/dataknobs_bots/memory/vector.py
async def close(self) -> None:
    """Close owned resources.

    Only closes resources that this instance owns (created in
    ``from_config``). Externally-injected resources are left open
    for the caller to manage.
    """
    if (
        self._owns_embedding_provider
        and self.embedding_provider
        and hasattr(self.embedding_provider, "close")
    ):
        try:
            await self.embedding_provider.close()
        except Exception:
            logger.exception("Error closing embedding provider")

    if (
        self._owns_vector_store
        and self.vector_store
        and hasattr(self.vector_store, "close")
    ):
        try:
            await self.vector_store.close()
        except Exception:
            logger.exception("Error closing vector store")
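The ownership rule above (only resources created in from_config are closed; injected ones are left for the caller) can be sketched standalone. OwnedResource and MemorySketch are illustrative stand-ins, not dataknobs_bots classes:

```python
import asyncio

class OwnedResource:
    """Stand-in resource that records whether close() was called."""
    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True

class MemorySketch:
    """Mirrors the _owns_* guard used by close() above."""
    def __init__(self, provider: OwnedResource, owns_provider: bool) -> None:
        self.embedding_provider = provider
        self._owns_embedding_provider = owns_provider

    async def close(self) -> None:
        # Close only resources this instance owns.
        if self._owns_embedding_provider and hasattr(self.embedding_provider, "close"):
            await self.embedding_provider.close()

injected = OwnedResource()  # passed in by the caller
owned = OwnedResource()     # as if created in from_config

async def main() -> None:
    await MemorySketch(injected, owns_provider=False).close()
    await MemorySketch(owned, owns_provider=True).close()

asyncio.run(main())
assert injected.closed is False  # caller still manages it
assert owned.closed is True      # owned, so closed
```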
clear async
clear() -> None

Clear all vectors from memory.

Delegates to the vector store's clear() method. Use with caution if the store is shared across multiple memory instances.

Raises:

Type Description
NotImplementedError

If the backing vector store does not support clear().

Source code in packages/bots/src/dataknobs_bots/memory/vector.py
async def clear(self) -> None:
    """Clear all vectors from memory.

    Delegates to the vector store's ``clear()`` method. Use with caution
    if the store is shared across multiple memory instances.

    Raises:
        NotImplementedError: If the backing vector store does not
            support ``clear()``.
    """
    if hasattr(self.vector_store, "clear"):
        await self.vector_store.clear()
    else:
        raise NotImplementedError(
            "Vector store does not support clearing. "
            "Consider creating a new VectorMemory instance with a fresh collection."
        )
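Because clear() raises NotImplementedError for stores without a clear() method, callers that run against arbitrary backends may want a defensive wrapper. This is a sketch; StoreWithoutClear and safe_clear are illustrative names, not part of dataknobs_bots:

```python
import asyncio

class StoreWithoutClear:
    """Stand-in for a vector store backend lacking clear()."""

async def safe_clear(vector_store) -> bool:
    # Mirror of the hasattr guard in clear() above.
    if hasattr(vector_store, "clear"):
        await vector_store.clear()
        return True
    raise NotImplementedError("Vector store does not support clearing.")

async def main() -> bool:
    try:
        return await safe_clear(StoreWithoutClear())
    except NotImplementedError:
        # Fall back, e.g. recreate the memory with a fresh collection.
        return False

cleared = asyncio.run(main())
assert cleared is False
```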

CostTrackingMiddleware

CostTrackingMiddleware(
    track_tokens: bool = True, cost_rates: dict[str, Any] | None = None
)

Bases: Middleware

Middleware for tracking LLM API costs and usage.

Monitors token usage across different providers (Ollama, OpenAI, Anthropic, etc.) to help optimize costs and track budgets.

Attributes:

Name Type Description
track_tokens

Whether to track token usage

cost_rates

Token cost rates per provider/model

usage_stats

Accumulated usage statistics by client_id

Example
# Create middleware with default rates
middleware = CostTrackingMiddleware()

# Or with custom rates
middleware = CostTrackingMiddleware(
    cost_rates={
        "openai": {
            "gpt-4o": {"input": 0.0025, "output": 0.01},
        },
    }
)

# Get stats
stats = middleware.get_client_stats("my-client")
total = middleware.get_total_cost()

# Export to JSON
json_data = middleware.export_stats_json()

Initialize cost tracking middleware.

Parameters:

Name Type Description Default
track_tokens bool

Enable token tracking

True
cost_rates dict[str, Any] | None

Optional custom cost rates (merged with defaults)

None

Methods:

Name Description
on_turn_start

Log estimated input tokens at the start of a turn.

after_turn

Track costs after turn completion using TurnState data.

on_error

Log errors but don't track costs for failed requests.

on_hook_error

Track middleware hook failures.

get_client_stats

Get usage statistics for a client.

get_all_stats

Get all usage statistics.

get_total_cost

Get total cost across all clients.

get_total_tokens

Get total tokens across all clients.

clear_stats

Clear usage statistics.

export_stats_json

Export all statistics as JSON.

export_stats_csv

Export statistics as CSV (one row per client).

Source code in packages/bots/src/dataknobs_bots/middleware/cost.py
def __init__(
    self,
    track_tokens: bool = True,
    cost_rates: dict[str, Any] | None = None,
):
    """Initialize cost tracking middleware.

    Args:
        track_tokens: Enable token tracking
        cost_rates: Optional custom cost rates (merged with defaults)
    """
    self.track_tokens = track_tokens
    # Merge custom rates with defaults
    self.cost_rates = self.DEFAULT_RATES.copy()
    if cost_rates:
        for provider, rates in cost_rates.items():
            if provider in self.cost_rates:
                if isinstance(rates, dict) and isinstance(
                    self.cost_rates[provider], dict
                ):
                    self.cost_rates[provider].update(rates)
                else:
                    self.cost_rates[provider] = rates
            else:
                self.cost_rates[provider] = rates

    self._usage_stats: dict[str, dict[str, Any]] = {}
    self._logger = logging.getLogger(f"{__name__}.CostTracker")
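The rate merge in __init__ is one level deep: a custom entry for a known provider is merged into that provider's default table, but an individual model entry is replaced wholesale. A non-mutating sketch of that merge (the default numbers below are illustrative, not the package's actual DEFAULT_RATES):

```python
# Illustrative defaults; the real DEFAULT_RATES lives in middleware/cost.py.
DEFAULT_RATES = {
    "openai": {
        "gpt-4o": {"input": 0.0025, "output": 0.01},
        "gpt-4o-mini": {"input": 0.00015, "output": 0.0006},
    },
}

def merge_rates(defaults: dict, custom: dict) -> dict:
    """Non-mutating variant of the provider-level merge in __init__."""
    merged = defaults.copy()
    for provider, rates in custom.items():
        if provider in merged and isinstance(rates, dict) and isinstance(merged[provider], dict):
            # Merge per provider: unspecified models keep their defaults.
            merged[provider] = {**merged[provider], **rates}
        else:
            merged[provider] = rates
    return merged

rates = merge_rates(
    DEFAULT_RATES,
    {"openai": {"gpt-4o": {"input": 0.002, "output": 0.008}}},
)
assert rates["openai"]["gpt-4o"] == {"input": 0.002, "output": 0.008}  # replaced wholesale
assert "gpt-4o-mini" in rates["openai"]                                # default preserved
```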
Functions
on_turn_start async
on_turn_start(turn: TurnState) -> str | None

Log estimated input tokens at the start of a turn.

Parameters:

Name Type Description Default
turn TurnState

Turn state at the start of the pipeline.

required

Returns:

Type Description
str | None

None (no message transform).

Source code in packages/bots/src/dataknobs_bots/middleware/cost.py
async def on_turn_start(self, turn: TurnState) -> str | None:
    """Log estimated input tokens at the start of a turn.

    Args:
        turn: Turn state at the start of the pipeline.

    Returns:
        None (no message transform).
    """
    # Estimate input tokens (rough approximation: ~4 chars per token)
    estimated_tokens = len(turn.message) // 4
    self._logger.debug("Estimated input tokens: %d", estimated_tokens)
    return None
after_turn async
after_turn(turn: TurnState) -> None

Track costs after turn completion using TurnState data.

Uses real token usage when the provider reports it, otherwise estimates from message/response text length (~4 chars per token).

Parameters:

Name Type Description Default
turn TurnState

Completed turn state with usage and response data.

required
Source code in packages/bots/src/dataknobs_bots/middleware/cost.py
async def after_turn(self, turn: TurnState) -> None:
    """Track costs after turn completion using TurnState data.

    Uses real token usage when the provider reports it, otherwise
    estimates from message/response text length (~4 chars per token).

    Args:
        turn: Completed turn state with usage and response data.
    """
    if not self.track_tokens:
        return

    client_id = turn.context.client_id
    provider = turn.provider_name or "unknown"
    model = turn.model or "unknown"

    if turn.usage:
        input_tokens = int(
            turn.usage.get(
                "input",
                turn.usage.get("prompt_tokens", 0),
            )
        )
        output_tokens = int(
            turn.usage.get(
                "output",
                turn.usage.get("completion_tokens", 0),
            )
        )
        estimated = False
    else:
        # Estimate from text length (~4 chars per token).
        # Note: turn.message is the user's message before KB/memory
        # augmentation.  The actual LLM input includes system prompt,
        # KB chunks, and memory context, so this underestimates real
        # input tokens.  When real usage data is available (above
        # branch), this fallback is not reached.
        input_tokens = len(turn.message) // 4
        output_tokens = len(turn.response_content) // 4
        estimated = True

    hook_counter = (
        "stream_turns" if turn.is_streaming else "chat_turns"
    )
    cost = self._record_usage(
        client_id, hook_counter,
        provider, model, input_tokens, output_tokens,
    )

    total = self._usage_stats[client_id]["total_cost_usd"]
    mode_label = turn.mode.value
    est_marker = " (estimated)" if estimated else ""
    self._logger.info(
        "Turn complete (%s) - Client %s: %s/%s - "
        "%d in + %d out tokens%s, cost: $%.6f, total: $%.6f",
        mode_label, client_id, provider, model,
        input_tokens, output_tokens, est_marker, cost, total,
    )
on_error async
on_error(error: Exception, message: str, context: BotContext) -> None

Log errors but don't track costs for failed requests.

Parameters:

Name Type Description Default
error Exception

The exception that occurred

required
message str

User message that caused the error

required
context BotContext

Bot context

required
Source code in packages/bots/src/dataknobs_bots/middleware/cost.py
async def on_error(
    self, error: Exception, message: str, context: BotContext
) -> None:
    """Log errors but don't track costs for failed requests.

    Args:
        error: The exception that occurred
        message: User message that caused the error
        context: Bot context
    """
    client_id = context.client_id
    if client_id not in self._usage_stats:
        self._usage_stats[client_id] = self._new_client_stats(client_id)
    self._usage_stats[client_id]["on_error_calls"] += 1

    self._logger.warning(
        "Error during request for client %s: %s", client_id, error,
    )
on_hook_error async
on_hook_error(hook_name: str, error: Exception, context: BotContext) -> None

Track middleware hook failures.

Parameters:

Name Type Description Default
hook_name str

Name of the hook that failed

required
error Exception

The exception raised by the middleware hook

required
context BotContext

Bot context

required
Source code in packages/bots/src/dataknobs_bots/middleware/cost.py
async def on_hook_error(
    self, hook_name: str, error: Exception, context: BotContext
) -> None:
    """Track middleware hook failures.

    Args:
        hook_name: Name of the hook that failed
        error: The exception raised by the middleware hook
        context: Bot context
    """
    client_id = context.client_id
    if client_id not in self._usage_stats:
        self._usage_stats[client_id] = self._new_client_stats(client_id)
    self._usage_stats[client_id]["on_hook_error_calls"] += 1

    self._logger.warning(
        "Middleware hook %s failed for client %s: %s",
        hook_name, client_id, error,
    )
get_client_stats
get_client_stats(client_id: str) -> dict[str, Any] | None

Get usage statistics for a client.

Parameters:

Name Type Description Default
client_id str

Client identifier

required

Returns:

Type Description
dict[str, Any] | None

Usage statistics or None if not found

Source code in packages/bots/src/dataknobs_bots/middleware/cost.py
def get_client_stats(self, client_id: str) -> dict[str, Any] | None:
    """Get usage statistics for a client.

    Args:
        client_id: Client identifier

    Returns:
        Usage statistics or None if not found
    """
    return self._usage_stats.get(client_id)
get_all_stats
get_all_stats() -> dict[str, dict[str, Any]]

Get all usage statistics.

Returns:

Type Description
dict[str, dict[str, Any]]

Dictionary mapping client_id to statistics

Source code in packages/bots/src/dataknobs_bots/middleware/cost.py
def get_all_stats(self) -> dict[str, dict[str, Any]]:
    """Get all usage statistics.

    Returns:
        Dictionary mapping client_id to statistics
    """
    return self._usage_stats.copy()
get_total_cost
get_total_cost() -> float

Get total cost across all clients.

Returns:

Type Description
float

Total cost in USD

Source code in packages/bots/src/dataknobs_bots/middleware/cost.py
def get_total_cost(self) -> float:
    """Get total cost across all clients.

    Returns:
        Total cost in USD
    """
    return float(
        sum(stats["total_cost_usd"] for stats in self._usage_stats.values())
    )
get_total_tokens
get_total_tokens() -> dict[str, int]

Get total tokens across all clients.

Returns:

Type Description
dict[str, int]

Dictionary with 'input', 'output', and 'total' token counts

Source code in packages/bots/src/dataknobs_bots/middleware/cost.py
def get_total_tokens(self) -> dict[str, int]:
    """Get total tokens across all clients.

    Returns:
        Dictionary with 'input', 'output', and 'total' token counts
    """
    input_tokens = sum(
        stats["total_input_tokens"] for stats in self._usage_stats.values()
    )
    output_tokens = sum(
        stats["total_output_tokens"] for stats in self._usage_stats.values()
    )
    return {
        "input": input_tokens,
        "output": output_tokens,
        "total": input_tokens + output_tokens,
    }
clear_stats
clear_stats(client_id: str | None = None) -> None

Clear usage statistics.

Parameters:

Name Type Description Default
client_id str | None

If provided, clear only this client. Otherwise clear all.

None
Source code in packages/bots/src/dataknobs_bots/middleware/cost.py
def clear_stats(self, client_id: str | None = None) -> None:
    """Clear usage statistics.

    Args:
        client_id: If provided, clear only this client. Otherwise clear all.
    """
    if client_id:
        if client_id in self._usage_stats:
            del self._usage_stats[client_id]
    else:
        self._usage_stats.clear()
export_stats_json
export_stats_json(indent: int = 2) -> str

Export all statistics as JSON.

Parameters:

Name Type Description Default
indent int

JSON indentation level

2

Returns:

Type Description
str

JSON string of all statistics

Source code in packages/bots/src/dataknobs_bots/middleware/cost.py
def export_stats_json(self, indent: int = 2) -> str:
    """Export all statistics as JSON.

    Args:
        indent: JSON indentation level

    Returns:
        JSON string of all statistics
    """
    return json.dumps(self._usage_stats, indent=indent)
export_stats_csv
export_stats_csv() -> str

Export statistics as CSV (one row per client).

Returns:

Type Description
str

CSV string with headers

Source code in packages/bots/src/dataknobs_bots/middleware/cost.py
def export_stats_csv(self) -> str:
    """Export statistics as CSV (one row per client).

    Returns:
        CSV string with headers
    """
    lines = [
        "client_id,total_requests,total_input_tokens,total_output_tokens,total_cost_usd"
    ]
    for client_id, stats in self._usage_stats.items():
        lines.append(
            f"{client_id},{stats['total_requests']},"
            f"{stats['total_input_tokens']},{stats['total_output_tokens']},"
            f"{stats['total_cost_usd']:.6f}"
        )
    return "\n".join(lines)
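The CSV produced by export_stats_csv() reads back cleanly with the stdlib csv module. A sketch, using a sample row that mirrors the header shown above (the numbers are illustrative):

```python
import csv
import io

# Sample output in the shape export_stats_csv() emits.
sample = (
    "client_id,total_requests,total_input_tokens,total_output_tokens,total_cost_usd\n"
    "my-client,3,1200,450,0.004500"
)

# DictReader keys each row by the header line.
rows = list(csv.DictReader(io.StringIO(sample)))
assert rows[0]["client_id"] == "my-client"
assert int(rows[0]["total_input_tokens"]) == 1200
assert float(rows[0]["total_cost_usd"]) == 0.0045
```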

LoggingMiddleware

LoggingMiddleware(
    log_level: str = "INFO",
    include_metadata: bool = True,
    json_format: bool = False,
)

Bases: Middleware

Middleware for tracking conversation interactions.

Logs all user messages and bot responses with context for monitoring, debugging, and analytics.

Uses the unified TurnState hooks:

  • on_turn_start — logs incoming user message
  • after_turn — logs turn completion with response, usage, tools

Attributes:

Name Type Description
log_level

Logging level to use (default: INFO)

include_metadata

Whether to include full context metadata

json_format

Whether to output logs in JSON format

Example
# Basic usage
middleware = LoggingMiddleware()

# With JSON format for log aggregation
middleware = LoggingMiddleware(
    log_level="INFO",
    include_metadata=True,
    json_format=True
)

Initialize logging middleware.

Parameters:

Name Type Description Default
log_level str

Logging level (DEBUG, INFO, WARNING, ERROR)

'INFO'
include_metadata bool

Whether to log full context metadata

True
json_format bool

Whether to output in JSON format

False

Methods:

Name Description
on_turn_start

Log incoming user message at the start of a turn.

after_turn

Log turn completion with unified data for all turn types.

on_error

Called when an error occurs during message processing.

on_hook_error

Called when a middleware hook itself raises.

Source code in packages/bots/src/dataknobs_bots/middleware/logging.py
def __init__(
    self,
    log_level: str = "INFO",
    include_metadata: bool = True,
    json_format: bool = False,
):
    """Initialize logging middleware.

    Args:
        log_level: Logging level (DEBUG, INFO, WARNING, ERROR)
        include_metadata: Whether to log full context metadata
        json_format: Whether to output in JSON format
    """
    self.log_level = log_level
    self.include_metadata = include_metadata
    self.json_format = json_format
    self._logger = logging.getLogger(f"{__name__}.ConversationLogger")
    self._logger.setLevel(getattr(logging, log_level.upper()))
Functions
on_turn_start async
on_turn_start(turn: TurnState) -> str | None

Log incoming user message at the start of a turn.

Parameters:

Name Type Description Default
turn TurnState

Turn state at the start of the pipeline.

required

Returns:

Type Description
str | None

None (no message transform).

Source code in packages/bots/src/dataknobs_bots/middleware/logging.py
async def on_turn_start(self, turn: TurnState) -> str | None:
    """Log incoming user message at the start of a turn.

    Args:
        turn: Turn state at the start of the pipeline.

    Returns:
        None (no message transform).
    """
    log_data: dict[str, Any] = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": "user_message",
        "mode": turn.mode.value,
        "client_id": turn.context.client_id,
        "user_id": turn.context.user_id,
        "conversation_id": turn.context.conversation_id,
        "message_length": len(turn.message),
    }

    if self.include_metadata:
        log_data["session_metadata"] = turn.context.session_metadata
        log_data["request_metadata"] = turn.context.request_metadata

    if self.json_format:
        self._logger.info(json.dumps(log_data))
    else:
        self._logger.info("User message: %s", log_data)

    # Log content at DEBUG level (first 200 chars)
    self._logger.debug("Message content: %.200s...", turn.message)
    return None
after_turn async
after_turn(turn: TurnState) -> None

Log turn completion with unified data for all turn types.

Parameters:

Name Type Description Default
turn TurnState

Completed turn state.

required
Source code in packages/bots/src/dataknobs_bots/middleware/logging.py
async def after_turn(self, turn: TurnState) -> None:
    """Log turn completion with unified data for all turn types.

    Args:
        turn: Completed turn state.
    """
    log_data: dict[str, Any] = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": "turn_complete",
        "mode": turn.mode.value,
        "client_id": turn.context.client_id,
        "user_id": turn.context.user_id,
        "conversation_id": turn.context.conversation_id,
        "response_length": len(turn.response_content),
    }

    if turn.usage:
        log_data["tokens_used"] = turn.usage
    if turn.provider_name:
        log_data["provider"] = turn.provider_name
    if turn.model:
        log_data["model"] = turn.model
    if turn.tool_executions:
        log_data["tool_executions"] = len(turn.tool_executions)

    if self.include_metadata:
        log_data["session_metadata"] = turn.context.session_metadata
        log_data["request_metadata"] = turn.context.request_metadata

    if self.json_format:
        self._logger.info(json.dumps(log_data))
    else:
        self._logger.info("Turn complete: %s", log_data)

    # Log content at DEBUG level (first 200 chars)
    self._logger.debug("Response content: %.200s...", turn.response_content)
on_error async
on_error(error: Exception, message: str, context: BotContext) -> None

Called when an error occurs during message processing.

Parameters:

Name Type Description Default
error Exception

The exception that occurred

required
message str

User message that caused the error

required
context BotContext

Bot context

required
Source code in packages/bots/src/dataknobs_bots/middleware/logging.py
async def on_error(
    self, error: Exception, message: str, context: BotContext
) -> None:
    """Called when an error occurs during message processing.

    Args:
        error: The exception that occurred
        message: User message that caused the error
        context: Bot context
    """
    log_data = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": "error",
        "client_id": context.client_id,
        "user_id": context.user_id,
        "conversation_id": context.conversation_id,
        "error_type": type(error).__name__,
        "error_message": str(error),
    }

    if self.json_format:
        self._logger.error(json.dumps(log_data), exc_info=error)
    else:
        self._logger.error(
            "Error processing message: %s", log_data, exc_info=error
        )
on_hook_error async
on_hook_error(hook_name: str, error: Exception, context: BotContext) -> None

Called when a middleware hook itself raises.

Parameters:

Name Type Description Default
hook_name str

Name of the hook that failed

required
error Exception

The exception raised by the middleware hook

required
context BotContext

Bot context

required
Source code in packages/bots/src/dataknobs_bots/middleware/logging.py
async def on_hook_error(
    self, hook_name: str, error: Exception, context: BotContext
) -> None:
    """Called when a middleware hook itself raises.

    Args:
        hook_name: Name of the hook that failed
        error: The exception raised by the middleware hook
        context: Bot context
    """
    log_data = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": "hook_error",
        "hook_name": hook_name,
        "client_id": context.client_id,
        "user_id": context.user_id,
        "conversation_id": context.conversation_id,
        "error_type": type(error).__name__,
        "error_message": str(error),
    }

    if self.json_format:
        self._logger.warning(json.dumps(log_data), exc_info=error)
    else:
        self._logger.warning(
            "Middleware hook %s failed: %s", hook_name, log_data, exc_info=error
        )

Middleware

Base class for bot middleware.

Middleware provides hooks into the bot request/response lifecycle. All hooks are concrete no-ops — subclasses override only the hooks they need.

Preferred hooks (receive full TurnState):

  • on_turn_start(turn) — before processing; can write plugin_data and optionally transform the message.
  • after_turn(turn) — after any turn completes (chat, stream, greet); unified successor to after_message and post_stream.
  • finally_turn(turn) — fires after every turn on both success and error paths. Use for resource cleanup. For stream_chat, requires full consumption or aclosing().
  • on_tool_executed(execution, context) — after each tool call.

Legacy hooks (kept for backward compatibility):

  • before_message(message, context) — use on_turn_start instead.
  • after_message(response, context, **kwargs) — use after_turn instead.
  • post_stream(message, response, context) — use after_turn instead.

Error hooks (no TurnState equivalent — still primary):

  • on_error(error, message, context) — request failed.
  • on_hook_error(hook_name, error, context) — a hook failed.
Error semantics

on_error fires when the bot request fails — the caller does NOT receive a response. on_hook_error fires when a middleware's own hook raises after the request already succeeded — the caller DID receive a response, but a middleware could not complete its post-processing.

Example
class MyMiddleware(Middleware):
    async def on_turn_start(self, turn):
        turn.plugin_data["started"] = True
        return None  # or return transformed message

    async def after_turn(self, turn):
        log.info("Turn %s done", turn.mode.value)

    async def on_error(self, error, message, context):
        log.error("Request failed: %s", error)

Methods:

Name Description
before_message

Called before processing user message.

after_message

Called after generating bot response (non-streaming).

post_stream

Called after streaming response completes.

on_error

Called when a request-level error occurs during message processing.

on_hook_error

Called when a middleware hook itself raises an exception.

on_turn_start

Called at the start of every turn, before message processing.

after_turn

Called after any turn completes (chat, stream, or greet).

finally_turn

Called after every turn, on both success and error paths.

on_tool_executed

Called after each tool execution within a turn.

Functions
before_message async
before_message(message: str, context: BotContext) -> None

Called before processing user message.

Deprecated: Use on_turn_start instead, which provides the full TurnState including plugin_data for cross-middleware communication and supports message transforms.

Parameters:

Name Type Description Default
message str

User's input message

required
context BotContext

Bot context with conversation and user info

required
Source code in packages/bots/src/dataknobs_bots/middleware/base.py
async def before_message(
    self, message: str, context: BotContext
) -> None:
    """Called before processing user message.

    .. deprecated::
        Use ``on_turn_start`` instead, which provides the full
        ``TurnState`` including ``plugin_data`` for cross-middleware
        communication and supports message transforms.

    Args:
        message: User's input message
        context: Bot context with conversation and user info
    """
after_message async
after_message(response: str, context: BotContext, **kwargs: Any) -> None

Called after generating bot response (non-streaming).

Deprecated: Use after_turn instead, which fires for all turn types (chat, stream, greet) and provides the full TurnState with usage data, tool executions, and plugin data.

Parameters:

Name Type Description Default
response str

Bot's generated response

required
context BotContext

Bot context

required
**kwargs Any

Additional data (e.g., tokens_used, response_time_ms, provider, model)

{}
Source code in packages/bots/src/dataknobs_bots/middleware/base.py
async def after_message(
    self, response: str, context: BotContext, **kwargs: Any
) -> None:
    """Called after generating bot response (non-streaming).

    .. deprecated::
        Use ``after_turn`` instead, which fires for all turn types
        (chat, stream, greet) and provides the full ``TurnState``
        with usage data, tool executions, and plugin data.

    Args:
        response: Bot's generated response
        context: Bot context
        **kwargs: Additional data (e.g., tokens_used, response_time_ms, provider, model)
    """
post_stream async
post_stream(message: str, response: str, context: BotContext) -> None

Called after streaming response completes.

Deprecated: Use after_turn instead, which fires for all turn types and provides real token usage data from the provider.

Parameters:

Name Type Description Default
message str

Original user message that triggered the stream

required
response str

Complete accumulated response from streaming

required
context BotContext

Bot context

required
Source code in packages/bots/src/dataknobs_bots/middleware/base.py
async def post_stream(
    self, message: str, response: str, context: BotContext
) -> None:
    """Called after streaming response completes.

    .. deprecated::
        Use ``after_turn`` instead, which fires for all turn types
        and provides real token usage data from the provider.

    Args:
        message: Original user message that triggered the stream
        response: Complete accumulated response from streaming
        context: Bot context
    """
on_error async
on_error(error: Exception, message: str, context: BotContext) -> None

Called when a request-level error occurs during message processing.

This hook fires when the bot request fails (preparation, generation, or memory/middleware post-processing in before_message). The caller does NOT receive a response.

Parameters:

Name Type Description Default
error Exception

The exception that occurred

required
message str

User message that caused the error

required
context BotContext

Bot context

required
Source code in packages/bots/src/dataknobs_bots/middleware/base.py
async def on_error(
    self, error: Exception, message: str, context: BotContext
) -> None:
    """Called when a request-level error occurs during message processing.

    This hook fires when the bot request fails (preparation, generation,
    or memory/middleware post-processing in ``before_message``). The
    caller does NOT receive a response.

    Args:
        error: The exception that occurred
        message: User message that caused the error
        context: Bot context
    """
on_hook_error async
on_hook_error(hook_name: str, error: Exception, context: BotContext) -> None

Called when a middleware hook itself raises an exception.

This fires when a post-generation middleware hook raises: after_turn, finally_turn, on_tool_executed, after_message, post_stream, or on_error. On the success path, the response was already delivered; on the error path (e.g. finally_turn after a failed turn), it was not. In either case, the middleware could not complete its own post-processing (e.g., a logging sink was unreachable, a metrics backend timed out).

Note: on_turn_start exceptions are NOT routed here — they are re-raised to abort the request (matching before_message semantics), so on_error fires instead.

Parameters:

Name Type Description Default
hook_name str

Name of the hook that failed (e.g. "after_turn", "finally_turn", "on_tool_executed", "on_error")

required
error Exception

The exception raised by the middleware hook

required
context BotContext

Bot context

required
Source code in packages/bots/src/dataknobs_bots/middleware/base.py
async def on_hook_error(
    self, hook_name: str, error: Exception, context: BotContext
) -> None:
    """Called when a middleware hook itself raises an exception.

    This fires when a post-generation middleware hook raises:
    ``after_turn``, ``finally_turn``, ``on_tool_executed``,
    ``after_message``, ``post_stream``, or ``on_error``.  On the
    success path, the response was already delivered; on the error
    path (e.g. ``finally_turn`` after a failed turn), it was not.
    In either case, the middleware could not complete its own
    post-processing (e.g., a logging sink was unreachable, a
    metrics backend timed out).

    Note: ``on_turn_start`` exceptions are NOT routed here —
    they are re-raised to abort the request (matching
    ``before_message`` semantics), so ``on_error`` fires instead.

    Args:
        hook_name: Name of the hook that failed (e.g.
            ``"after_turn"``, ``"finally_turn"``,
            ``"on_tool_executed"``, ``"on_error"``)
        error: The exception raised by the middleware hook
        context: Bot context
    """
on_turn_start async
on_turn_start(turn: TurnState) -> str | None

Called at the start of every turn, before message processing.

Receives the full TurnState including plugin_data for cross-middleware communication. Middleware can:

  • Write to turn.plugin_data to share data with downstream pipeline participants (LLM middleware, tools, after_turn).
  • Return a transformed message string to replace turn.message before it reaches the LLM (e.g., PII stripping, attack sanitization). Transforms chain: each middleware receives the message as modified by the previous one.
  • Return None to leave the message unchanged.

Parameters:

Name Type Description Default
turn TurnState

Turn state at the start of the pipeline.

required

Returns:

Type Description
str | None

Transformed message string, or None to keep the original.

Source code in packages/bots/src/dataknobs_bots/middleware/base.py
async def on_turn_start(
    self, turn: TurnState
) -> str | None:
    """Called at the start of every turn, before message processing.

    Receives the full ``TurnState`` including ``plugin_data`` for
    cross-middleware communication. Middleware can:

    - Write to ``turn.plugin_data`` to share data with downstream
      pipeline participants (LLM middleware, tools, ``after_turn``).
    - Return a transformed message string to replace ``turn.message``
      before it reaches the LLM (e.g., PII stripping, attack
      sanitization). Transforms chain: each middleware receives the
      message as modified by the previous one.
    - Return ``None`` to leave the message unchanged.

    Args:
        turn: Turn state at the start of the pipeline.

    Returns:
        Transformed message string, or ``None`` to keep the original.
    """
    return None
after_turn async
after_turn(turn: TurnState) -> None

Called after any turn completes (chat, stream, or greet).

Provides the full TurnState with usage data, tool executions, and response content regardless of how the turn was initiated. This is the unified successor to after_message and post_stream — implement this for uniform post-turn handling.

The legacy hooks (after_message / post_stream) continue to fire as well, so existing middleware is unaffected.

Parameters:

Name Type Description Default
turn TurnState

Complete turn state with all pipeline data.

required
Source code in packages/bots/src/dataknobs_bots/middleware/base.py
async def after_turn(self, turn: TurnState) -> None:
    """Called after any turn completes (chat, stream, or greet).

    Provides the full ``TurnState`` with usage data, tool executions,
    and response content regardless of how the turn was initiated.
    This is the unified successor to ``after_message`` and
    ``post_stream`` — implement this for uniform post-turn handling.

    The legacy hooks (``after_message`` / ``post_stream``) continue
    to fire as well, so existing middleware is unaffected.

    Args:
        turn: Complete turn state with all pipeline data.
    """
finally_turn async
finally_turn(turn: TurnState) -> None

Called after every turn, on both success and error paths.

Use this for resource cleanup (closing DB sessions, releasing locks, flushing buffers) that must happen regardless of outcome. after_turn is conditional — it does not fire on error paths or when greet() returns None. Do not assume after_turn has already run when writing finally_turn logic.

For chat() and greet(), this hook fires reliably via a finally block. For stream_chat() (an async generator), the finally block fires only when the generator is fully consumed, explicitly closed (aclose()), or garbage collected. Callers that break out of the stream early should use contextlib.aclosing to guarantee prompt cleanup.

plugin_data populated by on_turn_start (or seeded from the call site via the plugin_data parameter on chat() / stream_chat() / greet()) is available here.

This is an observational hook — failures are logged and reported via on_hook_error but do not prevent other middleware from running.

Parameters:

Name Type Description Default
turn TurnState

Turn state at the end of the pipeline. On error paths or no-strategy greet() paths, response_content may be empty and manager may be None. plugin_data is always available.

required
Source code in packages/bots/src/dataknobs_bots/middleware/base.py
async def finally_turn(self, turn: TurnState) -> None:
    """Called after every turn, on both success and error paths.

    Use this for resource cleanup (closing DB sessions, releasing
    locks, flushing buffers) that must happen regardless of outcome.
    ``after_turn`` is conditional — it does not fire on error paths
    or when ``greet()`` returns ``None``.  Do not assume
    ``after_turn`` has already run when writing ``finally_turn``
    logic.

    For ``chat()`` and ``greet()``, this hook fires reliably via a
    ``finally`` block.  For ``stream_chat()`` (an async generator),
    the ``finally`` block fires only when the generator is fully
    consumed, explicitly closed (``aclose()``), or garbage
    collected.  Callers that break out of the stream early should
    use ``contextlib.aclosing`` to guarantee prompt cleanup.

    ``plugin_data`` populated by ``on_turn_start`` (or seeded from
    the call site via the ``plugin_data`` parameter on ``chat()`` /
    ``stream_chat()`` / ``greet()``) is available here.

    This is an observational hook — failures are logged and reported
    via ``on_hook_error`` but do not prevent other middleware from
    running.

    Args:
        turn: Turn state at the end of the pipeline.  On error paths
            or no-strategy ``greet()`` paths, ``response_content``
            may be empty and ``manager`` may be ``None``.
            ``plugin_data`` is always available.
    """
on_tool_executed async
on_tool_executed(execution: ToolExecution, context: BotContext) -> None

Called after each tool execution within a turn.

Fired once per tool invocation, before after_turn. All on_tool_executed calls happen post-turn during _finalize_turn(), not in real-time as tools execute — this hook is for auditing and logging, not for aborting or rate-limiting mid-turn.

Ordering note: DynaBot-level tool executions appear first, followed by strategy-level executions (e.g. ReAct). In practice only one source produces executions per turn.

Parameters:

Name Type Description Default
execution ToolExecution

Record of the tool execution (name, params, result, error, duration).

required
context BotContext

Bot context for the current turn.

required
Source code in packages/bots/src/dataknobs_bots/middleware/base.py
async def on_tool_executed(
    self, execution: ToolExecution, context: BotContext
) -> None:
    """Called after each tool execution within a turn.

    Fired once per tool invocation, before ``after_turn``.  All
    ``on_tool_executed`` calls happen **post-turn** during
    ``_finalize_turn()``, not in real-time as tools execute — this
    hook is for auditing and logging, not for aborting or
    rate-limiting mid-turn.

    Ordering note: DynaBot-level tool executions appear first,
    followed by strategy-level executions (e.g. ReAct).  In
    practice only one source produces executions per turn.

    Args:
        execution: Record of the tool execution (name, params, result,
            error, duration).
        context: Bot context for the current turn.
    """

ReActReasoning

ReActReasoning(
    max_iterations: int = 5,
    verbose: bool = False,
    store_trace: bool = False,
    artifact_registry: Any | None = None,
    review_executor: Any | None = None,
    context_builder: Any | None = None,
    extra_context: dict[str, Any] | None = None,
    prompt_refresher: Callable[[], str] | None = None,
    greeting_template: str | None = None,
)

Bases: ReasoningStrategy

ReAct (Reasoning + Acting) strategy.

This strategy implements the ReAct pattern, where the LLM:

  1. Reasons about what to do (Thought)
  2. Takes an action (using tools if needed)
  3. Observes the result
  4. Repeats until the task is complete

This is useful for:

  • Multi-step problem solving
  • Tasks requiring tool use
  • Complex reasoning chains

Attributes:

Name Type Description
max_iterations

Maximum number of reasoning loops

verbose

Whether to enable debug-level logging

store_trace

Whether to store reasoning trace in conversation metadata

Example
strategy = ReActReasoning(
    max_iterations=5,
    verbose=True,
    store_trace=True
)
response = await strategy.generate(
    manager=conversation_manager,
    llm=llm_provider,
    tools=[search_tool, calculator_tool]
)

Initialize ReAct reasoning strategy.

Parameters:

Name Type Description Default
max_iterations int

Maximum reasoning/action iterations

5
verbose bool

Enable debug-level logging for reasoning steps

False
store_trace bool

Store reasoning trace in conversation metadata

False
artifact_registry Any | None

Optional ArtifactRegistry for artifact management

None
review_executor Any | None

Optional ReviewExecutor for running reviews

None
context_builder Any | None

Optional ContextBuilder for building conversation context

None
extra_context dict[str, Any] | None

Optional extra key-value pairs to merge into the ToolExecutionContext for every tool call (e.g. banks, custom state)

None
prompt_refresher Callable[[], str] | None

Optional callback that returns a fresh system prompt string. Called after tool execution in each iteration to update system_prompt_override in the next manager.complete() call. This prevents stale context when mutating tools change artifact/bank state mid-loop.

None
greeting_template str | None

Optional Jinja2 template for bot-initiated greetings (inherited from ReasoningStrategy).

None

Methods:

Name Description
from_config

Create ReActReasoning from a configuration dict.

generate

Generate response using ReAct loop.

Source code in packages/bots/src/dataknobs_bots/reasoning/react.py
def __init__(
    self,
    max_iterations: int = 5,
    verbose: bool = False,
    store_trace: bool = False,
    artifact_registry: Any | None = None,
    review_executor: Any | None = None,
    context_builder: Any | None = None,
    extra_context: dict[str, Any] | None = None,
    prompt_refresher: Callable[[], str] | None = None,
    greeting_template: str | None = None,
):
    """Initialize ReAct reasoning strategy.

    Args:
        max_iterations: Maximum reasoning/action iterations
        verbose: Enable debug-level logging for reasoning steps
        store_trace: Store reasoning trace in conversation metadata
        artifact_registry: Optional ArtifactRegistry for artifact management
        review_executor: Optional ReviewExecutor for running reviews
        context_builder: Optional ContextBuilder for building conversation context
        extra_context: Optional extra key-value pairs to merge into the
            ToolExecutionContext for every tool call (e.g. banks, custom state)
        prompt_refresher: Optional callback that returns a fresh system
            prompt string.  Called after tool execution in each iteration
            to update ``system_prompt_override`` in the next
            ``manager.complete()`` call.  This prevents stale context
            when mutating tools change artifact/bank state mid-loop.
        greeting_template: Optional Jinja2 template for bot-initiated
            greetings (inherited from ReasoningStrategy).
    """
    super().__init__(greeting_template=greeting_template)
    self.max_iterations = max_iterations
    self.verbose = verbose
    self.store_trace = store_trace
    self._artifact_registry = artifact_registry
    self._review_executor = review_executor
    self._context_builder = context_builder
    self._extra_context = extra_context
    self._prompt_refresher = prompt_refresher
Attributes
artifact_registry property
artifact_registry: Any | None

Get the artifact registry if configured.

review_executor property
review_executor: Any | None

Get the review executor if configured.

context_builder property
context_builder: Any | None

Get the context builder if configured.

Functions
from_config classmethod
from_config(config: dict[str, Any], **_kwargs: Any) -> ReActReasoning

Create ReActReasoning from a configuration dict.

Parameters:

Name Type Description Default
config dict[str, Any]

Configuration dict with optional keys: max_iterations, verbose, store_trace, greeting_template.

required
**_kwargs Any

Ignored (no KB or provider injection needed).

{}

Returns:

Type Description
ReActReasoning

Configured ReActReasoning instance.

Source code in packages/bots/src/dataknobs_bots/reasoning/react.py
@classmethod
def from_config(cls, config: dict[str, Any], **_kwargs: Any) -> ReActReasoning:  # type: ignore[override]
    """Create ReActReasoning from a configuration dict.

    Args:
        config: Configuration dict with optional keys:
            max_iterations, verbose, store_trace, greeting_template.
        **_kwargs: Ignored (no KB or provider injection needed).

    Returns:
        Configured ReActReasoning instance.
    """
    return cls(
        max_iterations=config.get("max_iterations", 5),
        verbose=config.get("verbose", False),
        store_trace=config.get("store_trace", False),
        greeting_template=config.get("greeting_template"),
    )
generate async
generate(
    manager: Any, llm: Any, tools: list[Any] | None = None, **kwargs: Any
) -> Any

Generate response using ReAct loop.

The ReAct loop:

  1. Generate a response (may include tool calls)
  2. If tool calls are present, execute them
  3. Add observations to the conversation
  4. Repeat until there are no more tool calls or max iterations is reached

Parameters:

Name Type Description Default
manager Any

ConversationManager instance

required
llm Any

LLM provider instance

required
tools list[Any] | None

Optional list of available tools

None
**kwargs Any

Generation parameters

{}

Returns:

Type Description
Any

Final LLM response

Source code in packages/bots/src/dataknobs_bots/reasoning/react.py
async def generate(
    self,
    manager: Any,
    llm: Any,
    tools: list[Any] | None = None,
    **kwargs: Any,
) -> Any:
    """Generate response using ReAct loop.

    The ReAct loop:
    1. Generate response (may include tool calls)
    2. If tool calls present, execute them
    3. Add observations to conversation
    4. Repeat until no more tool calls or max iterations

    Args:
        manager: ConversationManager instance
        llm: LLM provider instance
        tools: Optional list of available tools
        **kwargs: Generation parameters

    Returns:
        Final LLM response
    """
    # Clear any stale tool executions from a previous call.
    # Each generate() call should start with a fresh list so
    # concurrent async calls on the same strategy instance don't
    # accumulate records from earlier calls.
    self._tool_executions.clear()

    if not tools:
        # No tools available, fall back to simple generation
        logger.info(
            "ReAct: No tools available, falling back to simple generation",
            extra={"conversation_id": manager.conversation_id},
        )
        return await manager.complete(**kwargs)

    # Initialize trace if enabled
    trace = [] if self.store_trace else None

    # Get log level based on verbose setting
    log_level = logging.DEBUG if self.verbose else logging.INFO

    logger.log(
        log_level,
        "ReAct: Starting reasoning loop",
        extra={
            "conversation_id": manager.conversation_id,
            "max_iterations": self.max_iterations,
            "tools_available": len(tools),
        },
    )

    # Track previous iteration's tool calls for duplicate detection
    prev_tool_calls: list[tuple[str, str]] | None = None

    # ReAct loop
    for iteration in range(self.max_iterations):
        iteration_trace = {
            "iteration": iteration + 1,
            "tool_calls": [],
        }

        logger.log(
            log_level,
            "ReAct: Starting iteration",
            extra={
                "conversation_id": manager.conversation_id,
                "iteration": iteration + 1,
                "max_iterations": self.max_iterations,
            },
        )

        # Generate response with tools
        try:
            response = await manager.complete(tools=tools, **kwargs)
        except ToolsNotSupportedError as e:
            logger.error(
                "ReAct: Model '%s' does not support tools — "
                "returning graceful response to user",
                e.model,
                extra={"conversation_id": manager.conversation_id},
            )
            return LLMResponse(
                content=(
                    "I'm configured to use tools for this task, but my "
                    "current language model doesn't support tool calling. "
                    "Please contact the administrator to update the model "
                    "configuration."
                ),
                model=e.model,
                finish_reason="error",
            )

        # Check if we have tool calls
        if not hasattr(response, "tool_calls") or not response.tool_calls:
            # No tool calls, we're done
            logger.log(
                log_level,
                "ReAct: No tool calls in response, finishing",
                extra={
                    "conversation_id": manager.conversation_id,
                    "iteration": iteration + 1,
                },
            )

            if trace is not None:
                iteration_trace["status"] = "completed"
                trace.append(iteration_trace)
                await self._store_trace(manager, trace)

            return response

        num_tool_calls = len(response.tool_calls)
        logger.log(
            log_level,
            "ReAct: Executing tool calls",
            extra={
                "conversation_id": manager.conversation_id,
                "iteration": iteration + 1,
                "num_tools": num_tool_calls,
                "tools": [tc.name for tc in response.tool_calls],
            },
        )

        # Duplicate detection: compare (name, sorted params JSON)
        # with previous iteration to avoid infinite loops
        current_calls = [
            (tc.name, json.dumps(tc.parameters, sort_keys=True))
            for tc in response.tool_calls
        ]

        if prev_tool_calls is not None and current_calls == prev_tool_calls:
            logger.warning(
                "ReAct: Duplicate tool calls detected, breaking loop",
                extra={
                    "conversation_id": manager.conversation_id,
                    "iteration": iteration + 1,
                    "duplicate_calls": [tc.name for tc in response.tool_calls],
                },
            )

            # Add explanatory message so the final LLM call doesn't
            # see dangling tool_calls with no corresponding observations.
            tool_names = [tc.name for tc in response.tool_calls]
            await manager.add_message(
                content=(
                    f"System notice: The tools {tool_names} were already "
                    "called with identical parameters in the previous step. "
                    "Their results are already in the conversation above. "
                    "Please use those results to respond to the user."
                ),
                role="system",
            )

            if trace is not None:
                iteration_trace["status"] = "duplicate_tool_calls_detected"
                trace.append(iteration_trace)
                await self._store_trace(manager, trace)

            break

        prev_tool_calls = current_calls

        # Build execution context for tools that need it
        tool_context = ToolExecutionContext.from_manager(manager)

        # Extend context with artifact/review infrastructure if available
        extra_context: dict[str, Any] = {}
        if self._artifact_registry is not None:
            extra_context["artifact_registry"] = self._artifact_registry
        if self._review_executor is not None:
            extra_context["review_executor"] = self._review_executor
        if self._context_builder is not None:
            try:
                conversation_context = await self._context_builder.build(manager)
                extra_context["conversation_context"] = conversation_context
            except Exception as e:
                logger.warning("Failed to build conversation context: %s", e)
        if self._extra_context:
            extra_context.update(self._extra_context)
        if extra_context:
            tool_context = tool_context.with_extra(**extra_context)

        # Execute all tool calls
        for tool_call in response.tool_calls:
            tool_trace = {
                "name": tool_call.name,
                "parameters": tool_call.parameters,
            }

            try:
                # Find the tool
                tool = self._find_tool(tool_call.name, tools)
                if not tool:
                    observation = f"Error: Tool '{tool_call.name}' not found"
                    tool_trace["status"] = "error"
                    tool_trace["error"] = "Tool not found"

                    logger.warning(
                        "ReAct: Tool not found",
                        extra={
                            "conversation_id": manager.conversation_id,
                            "iteration": iteration + 1,
                            "tool_name": tool_call.name,
                        },
                    )
                else:
                    # Execute the tool with context injection
                    # Context-aware tools will extract _context and use it
                    # Regular tools will ignore _context via **kwargs
                    t0 = time.monotonic()
                    result = await tool.execute(
                        **tool_call.parameters, _context=tool_context
                    )
                    duration_ms = (time.monotonic() - t0) * 1000
                    try:
                        observation = f"Tool result: {json.dumps(result, default=str)}"
                    except (TypeError, ValueError):
                        observation = f"Tool result: {result}"
                    tool_trace["status"] = "success"
                    tool_trace["result"] = str(result)

                    # Record for DynaBot on_tool_executed middleware hook
                    self._tool_executions.append(ToolExecution(
                        tool_name=tool_call.name,
                        parameters=tool_call.parameters,
                        result=result,
                        duration_ms=duration_ms,
                    ))

                    logger.log(
                        log_level,
                        "ReAct: Tool executed successfully",
                        extra={
                            "conversation_id": manager.conversation_id,
                            "iteration": iteration + 1,
                            "tool_name": tool_call.name,
                            "result_length": len(str(result)),
                        },
                    )

                # Add observation using role="tool" so providers can
                # pair it with the assistant's tool_calls in history.
                await manager.add_message(
                    content=f"Observation from {tool_call.name}: {observation}",
                    role="tool",
                    name=tool_call.name,
                    tool_call_id=tool_call.id,
                )

            except Exception as e:
                # Handle tool execution errors — use role="tool" so the
                # error is paired with the tool call in conversation.
                error_msg = f"Error executing tool {tool_call.name}: {e!s}"
                tool_trace["status"] = "error"
                tool_trace["error"] = str(e)

                # Record failed execution for middleware hook
                self._tool_executions.append(ToolExecution(
                    tool_name=tool_call.name,
                    parameters=tool_call.parameters,
                    error=str(e),
                ))

                logger.error(
                    "ReAct: Tool execution failed",
                    extra={
                        "conversation_id": manager.conversation_id,
                        "iteration": iteration + 1,
                        "tool_name": tool_call.name,
                        "error": str(e),
                    },
                    exc_info=True,
                )

                await manager.add_message(
                    content=error_msg,
                    role="tool",
                    name=tool_call.name,
                    tool_call_id=tool_call.id,
                )

            if trace is not None:
                iteration_trace["tool_calls"].append(tool_trace)

        if trace is not None:
            iteration_trace["status"] = "continued"
            trace.append(iteration_trace)

        # Refresh system prompt so the next iteration sees current
        # artifact/bank state (e.g. after load_from_catalog).
        if self._prompt_refresher is not None:
            kwargs["system_prompt_override"] = self._prompt_refresher()

    else:
        # for-else: only reached when the loop exhausts all iterations
        # without a break (i.e. not triggered by duplicate detection)
        logger.log(
            log_level,
            "ReAct: Max iterations reached, generating final response",
            extra={
                "conversation_id": manager.conversation_id,
                "iterations_used": self.max_iterations,
            },
        )

        if trace is not None:
            trace.append({"status": "max_iterations_reached"})
            await self._store_trace(manager, trace)

    # Refresh prompt for the final complete() call as well.
    if self._prompt_refresher is not None:
        kwargs["system_prompt_override"] = self._prompt_refresher()

    return await manager.complete(**kwargs)
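The duplicate-detection key used inside the loop can be extracted for illustration: two tool calls compare equal when the name and the canonicalized parameters match, regardless of dict key order.

```python
import json

def call_key(name: str, parameters: dict) -> tuple[str, str]:
    # sort_keys=True makes the JSON canonical, so parameter
    # ordering cannot defeat the duplicate check.
    return (name, json.dumps(parameters, sort_keys=True))
```

This is why a model re-issuing the same call with reordered arguments still trips the "duplicate tool calls detected" break rather than looping forever.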

ReasoningStrategy

ReasoningStrategy(*, greeting_template: str | None = None)

Bases: ABC

Abstract base class for reasoning strategies.

Reasoning strategies control how the bot processes information and generates responses. Different strategies can implement different levels of reasoning complexity.

All strategies support an optional greeting_template — a Jinja2 template string rendered with initial_context variables when greet() is called. Strategies that need richer greeting behavior (e.g. WizardReasoning with FSM-driven stage responses) override greet() entirely.

Parameters:

Name Type Description Default
greeting_template str | None

Optional Jinja2 template for bot-initiated greetings. Variables from initial_context are available as top-level template variables (e.g. {{ user_name }}).

None

Examples:

  • Simple: Direct LLM call
  • Chain-of-Thought: Break down reasoning into steps
  • ReAct: Reason and act in a loop with tools

Methods:

Name Description
capabilities

Declare what this strategy manages autonomously.

from_config

Create a strategy instance from a configuration dict.

get_source_configs

Extract source configuration dicts from a strategy config.

get_and_clear_tool_executions

Return tool executions recorded during the last generate() call.

greet

Generate an initial bot greeting before the user speaks.

add_source

Add a retrieval source to this strategy.

providers

Return LLM providers managed by this strategy, keyed by role.

set_provider

Replace a provider managed by this strategy.

close

Release resources held by this strategy.

generate

Generate response using this reasoning strategy.

stream_generate

Stream response using this reasoning strategy.

Source code in packages/bots/src/dataknobs_bots/reasoning/base.py
def __init__(self, *, greeting_template: str | None = None) -> None:
    self._greeting_template = greeting_template
    self._tool_executions: list[ToolExecution] = []
Functions
capabilities classmethod
capabilities() -> StrategyCapabilities

Declare what this strategy manages autonomously.

The default returns no capabilities. Concrete strategies override to declare their actual capabilities.

Returns:

Type Description
StrategyCapabilities

Frozen dataclass describing strategy capabilities.

Source code in packages/bots/src/dataknobs_bots/reasoning/base.py
@classmethod
def capabilities(cls) -> StrategyCapabilities:
    """Declare what this strategy manages autonomously.

    The default returns no capabilities.  Concrete strategies
    override to declare their actual capabilities.

    Returns:
        Frozen dataclass describing strategy capabilities.
    """
    return StrategyCapabilities()
from_config classmethod
from_config(config: dict[str, Any], **_kwargs: Any) -> Self

Create a strategy instance from a configuration dict.

The base implementation extracts greeting_template and passes it to the constructor. Concrete strategies with richer configuration override this classmethod.

Parameters:

Name Type Description Default
config dict[str, Any]

Strategy configuration dict.

required
**_kwargs Any

Additional context (e.g. knowledge_base).

{}

Returns:

Type Description
Self

Configured strategy instance.

Source code in packages/bots/src/dataknobs_bots/reasoning/base.py
@classmethod
def from_config(cls, config: dict[str, Any], **_kwargs: Any) -> Self:
    """Create a strategy instance from a configuration dict.

    The base implementation extracts ``greeting_template`` and
    passes it to the constructor.  Concrete strategies with richer
    configuration override this classmethod.

    Args:
        config: Strategy configuration dict.
        **_kwargs: Additional context (e.g. ``knowledge_base``).

    Returns:
        Configured strategy instance.
    """
    return cls(greeting_template=config.get("greeting_template"))
get_source_configs classmethod
get_source_configs(config: dict[str, Any]) -> list[dict[str, Any]]

Extract source configuration dicts from a strategy config.

DynaBot calls this after creating the strategy to discover which sources to construct and wire in via :meth:add_source. The default looks for a top-level "sources" key, which is the convention used by GroundedReasoning.

Strategies with non-standard config layouts (e.g. HybridReasoning, where sources are nested under "grounded") override this to return the correct list.

Parameters:

Name Type Description Default
config dict[str, Any]

The full strategy configuration dict.

required

Returns:

Type Description
list[dict[str, Any]]

List of source configuration dicts (may be empty).

Source code in packages/bots/src/dataknobs_bots/reasoning/base.py
@classmethod
def get_source_configs(cls, config: dict[str, Any]) -> list[dict[str, Any]]:
    """Extract source configuration dicts from a strategy config.

    ``DynaBot`` calls this after creating the strategy to discover
    which sources to construct and wire in via :meth:`add_source`.
    The default looks for a top-level ``"sources"`` key, which is
    the convention used by ``GroundedReasoning``.

    Strategies with non-standard config layouts (e.g.
    ``HybridReasoning``, where sources are nested under
    ``"grounded"``) override this to return the correct list.

    Args:
        config: The full strategy configuration dict.

    Returns:
        List of source configuration dicts (may be empty).
    """
    return config.get("sources", [])
get_and_clear_tool_executions
get_and_clear_tool_executions() -> list[ToolExecution]

Return tool executions recorded during the last generate() call.

Strategies that execute tools (e.g. ReAct) append ToolExecution records to self._tool_executions during their generation loop. DynaBot calls this after generate() returns to collect the records and fire on_tool_executed middleware hooks.

Returns:

Type Description
list[ToolExecution]

List of tool execution records (cleared after retrieval).

Source code in packages/bots/src/dataknobs_bots/reasoning/base.py
def get_and_clear_tool_executions(self) -> list[ToolExecution]:
    """Return tool executions recorded during the last generate() call.

    Strategies that execute tools (e.g. ReAct) append
    ``ToolExecution`` records to ``self._tool_executions`` during
    their generation loop.  DynaBot calls this after
    ``generate()`` returns to collect the records and fire
    ``on_tool_executed`` middleware hooks.

    Returns:
        List of tool execution records (cleared after retrieval).
    """
    result = list(self._tool_executions)
    self._tool_executions.clear()
    return result
greet async
greet(
    manager: ReasoningManagerProtocol,
    llm: Any,
    *,
    initial_context: dict[str, Any] | None = None,
    **kwargs: Any,
) -> Any | None

Generate an initial bot greeting before the user speaks.

The default implementation renders greeting_template (if set) with initial_context variables using Jinja2 and returns the result as an LLMResponse. Returns None when no template is configured.

WizardReasoning fully overrides this with FSM-driven greeting generation from the wizard's start stage.

Parameters:

Name Type Description Default
manager ReasoningManagerProtocol

ConversationManager or compatible manager instance

required
llm Any

LLM provider instance

required
initial_context dict[str, Any] | None

Optional dict of data available as Jinja2 template variables (e.g. {"user_name": "Alice"} makes {{ user_name }} resolve to "Alice").

None
**kwargs Any

Additional generation parameters

{}

Returns:

Type Description
Any | None

LLMResponse if a greeting was generated, or None

Source code in packages/bots/src/dataknobs_bots/reasoning/base.py
async def greet(
    self,
    manager: ReasoningManagerProtocol,
    llm: Any,
    *,
    initial_context: dict[str, Any] | None = None,
    **kwargs: Any,
) -> Any | None:
    """Generate an initial bot greeting before the user speaks.

    The default implementation renders ``greeting_template`` (if set)
    with ``initial_context`` variables using Jinja2 and returns the
    result as an ``LLMResponse``.  Returns ``None`` when no template
    is configured.

    ``WizardReasoning`` fully overrides this with FSM-driven greeting
    generation from the wizard's start stage.

    Args:
        manager: ConversationManager or compatible manager instance
        llm: LLM provider instance
        initial_context: Optional dict of data available as Jinja2
            template variables (e.g. ``{"user_name": "Alice"}``
            makes ``{{ user_name }}`` resolve to ``"Alice"``).
        **kwargs: Additional generation parameters

    Returns:
        LLMResponse if a greeting was generated, or None
    """
    if self._greeting_template is None:
        return None
    context = initial_context or {}
    env = jinja2.Environment(undefined=jinja2.Undefined)
    text = env.from_string(self._greeting_template).render(**context)
    return LLMResponse(content=text, model="template", finish_reason="stop")
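As a standalone illustration of the default behavior, the sketch below mirrors the rendering step with plain Jinja2; the `LLMResponse` wrapper is replaced by a bare string, and `render_greeting` is an illustrative name, not part of the package API:

```python
import jinja2

def render_greeting(greeting_template, initial_context=None):
    # Mirrors the default greet() path: no template -> no greeting.
    if greeting_template is None:
        return None
    context = initial_context or {}
    # jinja2.Undefined renders missing variables as empty strings
    # rather than raising, matching the environment used above.
    env = jinja2.Environment(undefined=jinja2.Undefined)
    return env.from_string(greeting_template).render(**context)

print(render_greeting("Hello {{ user_name }}!", {"user_name": "Alice"}))  # Hello Alice!
print(render_greeting(None))  # None
```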
add_source
add_source(source: Any) -> None

Add a retrieval source to this strategy.

Strategies that declare manages_sources=True in their capabilities() MUST override this method. DynaBot calls it during config-driven source construction.

The default raises NotImplementedError so that a 3rd-party strategy that forgets to implement it fails loudly.

Parameters:

Name Type Description Default
source Any

A GroundedSource instance (or compatible).

required

Raises:

Type Description
NotImplementedError

If not overridden by a subclass.

Source code in packages/bots/src/dataknobs_bots/reasoning/base.py
def add_source(self, source: Any) -> None:
    """Add a retrieval source to this strategy.

    Strategies that declare ``manages_sources=True`` in their
    :meth:`capabilities` MUST override this method.  ``DynaBot``
    calls it during config-driven source construction.

    The default raises ``NotImplementedError`` so that a 3rd-party
    strategy that forgets to implement it fails loudly.

    Args:
        source: A ``GroundedSource`` instance (or compatible).

    Raises:
        NotImplementedError: If not overridden by a subclass.
    """
    raise NotImplementedError(
        f"{type(self).__name__} does not implement add_source(). "
        f"Strategies that declare manages_sources=True in "
        f"capabilities() must override this method."
    )
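A minimal sketch of the override pattern, using plain strings as stand-ins for `GroundedSource` instances (the class name is illustrative; a real strategy would subclass `ReasoningStrategy` and declare `manages_sources=True` in `capabilities()`):

```python
class SourceManagingStrategy:
    """Illustrative strategy that manages its own retrieval sources."""

    def __init__(self):
        self._sources = []

    def add_source(self, source):
        # DynaBot calls this once per configured source after
        # factory creation; store it for later retrieval calls.
        self._sources.append(source)

strategy = SourceManagingStrategy()
strategy.add_source("docs-index")  # stand-in for a GroundedSource
strategy.add_source("faq-index")
print(len(strategy._sources))  # 2
```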
providers
providers() -> dict[str, Any]

Return LLM providers managed by this strategy, keyed by role.

Subsystems declare the providers they own so that the bot can register them in the provider catalog without reaching into private attributes. The default returns an empty dict (no providers).

Returns:

Type Description
dict[str, Any]

Dict mapping provider role names to provider instances.

Source code in packages/bots/src/dataknobs_bots/reasoning/base.py
def providers(self) -> dict[str, Any]:
    """Return LLM providers managed by this strategy, keyed by role.

    Subsystems declare the providers they own so that the bot can
    register them in the provider catalog without reaching into
    private attributes.  The default returns an empty dict (no
    providers).

    Returns:
        Dict mapping provider role names to provider instances.
    """
    return {}
set_provider
set_provider(role: str, provider: Any) -> bool

Replace a provider managed by this strategy.

Called by inject_providers to wire a test provider into the actual subsystem, not just the registry catalog. The default returns False (role not recognized). Concrete subclasses override to accept their known roles.

Parameters:

Name Type Description Default
role str

Provider role name (e.g. PROVIDER_ROLE_EXTRACTION).

required
provider Any

Replacement provider instance.

required

Returns:

Type Description
bool

True if the role was recognized and the provider updated, False otherwise.

Source code in packages/bots/src/dataknobs_bots/reasoning/base.py
def set_provider(self, role: str, provider: Any) -> bool:
    """Replace a provider managed by this strategy.

    Called by ``inject_providers`` to wire a test provider into the
    actual subsystem, not just the registry catalog.  The default
    returns ``False`` (role not recognized).  Concrete subclasses
    override to accept their known roles.

    Args:
        role: Provider role name (e.g. ``PROVIDER_ROLE_EXTRACTION``).
        provider: Replacement provider instance.

    Returns:
        ``True`` if the role was recognized and the provider updated,
        ``False`` otherwise.
    """
    return False
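A sketch of a concrete override, with `providers()` included to show the pairing; the role name `"extraction"` is an illustrative stand-in for a real role constant such as `PROVIDER_ROLE_EXTRACTION`, and strings stand in for provider instances:

```python
class ExtractionOwningStrategy:
    """Illustrative strategy that owns one provider under the 'extraction' role."""

    def __init__(self, extraction_provider):
        self._extraction_provider = extraction_provider

    def providers(self):
        # Expose owned providers for the catalog, keyed by role.
        return {"extraction": self._extraction_provider}

    def set_provider(self, role, provider):
        # Accept only roles this strategy recognizes.
        if role == "extraction":
            self._extraction_provider = provider
            return True
        return False

strategy = ExtractionOwningStrategy("real-provider")
assert strategy.set_provider("extraction", "test-provider") is True
print(strategy.providers())  # {'extraction': 'test-provider'}
assert strategy.set_provider("unknown-role", "x") is False
```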
close async
close() -> None

Release resources held by this strategy.

Default no-op. Subclasses that hold resources (LLM providers, database connections, asyncio tasks) should override to release them. Called by DynaBot.close().

Source code in packages/bots/src/dataknobs_bots/reasoning/base.py
async def close(self) -> None:  # noqa: B027
    """Release resources held by this strategy.

    Default no-op. Subclasses that hold resources (LLM providers,
    database connections, asyncio tasks) should override to release
    them. Called by ``DynaBot.close()``.
    """
generate abstractmethod async
generate(
    manager: ReasoningManagerProtocol,
    llm: Any,
    tools: list[Any] | None = None,
    **kwargs: Any,
) -> Any

Generate response using this reasoning strategy.

Pass tools through to manager.complete(tools=tools) so the LLM can see available tools. If the returned response contains tool_calls, DynaBot will execute them automatically in a post-strategy loop — strategies do not need to handle tool execution unless they want full control (like ReActReasoning).

Strategies that execute tools internally should record them via self._tool_executions.append(ToolExecution(...)) and consume tool_calls before returning, so the DynaBot loop is a no-op.

Parameters:

Name Type Description Default
manager ReasoningManagerProtocol

ConversationManager or compatible manager instance

required
llm Any

LLM provider instance

required
tools list[Any] | None

Optional list of available tools. Forward to manager.complete(tools=tools) for LLM visibility.

None
**kwargs Any

Additional generation parameters (temperature, max_tokens, etc.)

{}

Returns:

Type Description
Any

LLM response object

Example
response = await strategy.generate(
    manager=conversation_manager,
    llm=llm_provider,
    tools=[search_tool, calculator_tool],
    temperature=0.7,
    max_tokens=1000
)
Source code in packages/bots/src/dataknobs_bots/reasoning/base.py
@abstractmethod
async def generate(
    self,
    manager: ReasoningManagerProtocol,
    llm: Any,
    tools: list[Any] | None = None,
    **kwargs: Any,
) -> Any:
    """Generate response using this reasoning strategy.

    Pass ``tools`` through to ``manager.complete(tools=tools)`` so
    the LLM can see available tools.  If the returned response
    contains ``tool_calls``, ``DynaBot`` will execute them
    automatically in a post-strategy loop — strategies do not need
    to handle tool execution unless they want full control (like
    ``ReActReasoning``).

    Strategies that execute tools internally should record them via
    ``self._tool_executions.append(ToolExecution(...))`` and
    consume ``tool_calls`` before returning, so the DynaBot loop
    is a no-op.

    Args:
        manager: ConversationManager or compatible manager instance
        llm: LLM provider instance
        tools: Optional list of available tools.  Forward to
            ``manager.complete(tools=tools)`` for LLM visibility.
        **kwargs: Additional generation parameters (temperature, max_tokens, etc.)

    Returns:
        LLM response object

    Example:
        ```python
        response = await strategy.generate(
            manager=conversation_manager,
            llm=llm_provider,
            tools=[search_tool, calculator_tool],
            temperature=0.7,
            max_tokens=1000
        )
        ```
    """
    pass
stream_generate async
stream_generate(
    manager: ReasoningManagerProtocol,
    llm: Any,
    tools: list[Any] | None = None,
    **kwargs: Any,
) -> AsyncIterator[Any]

Stream response using this reasoning strategy.

The default implementation wraps generate() and yields the complete response as a single item. Subclasses that support true token-level streaming (e.g. SimpleReasoning) should override this to yield incremental chunks.

Parameters:

Name Type Description Default
manager ReasoningManagerProtocol

ConversationManager or compatible manager instance

required
llm Any

LLM provider instance

required
tools list[Any] | None

Optional list of available tools

None
**kwargs Any

Additional generation parameters

{}

Yields:

Type Description
AsyncIterator[Any]

LLM response or stream chunk objects

Source code in packages/bots/src/dataknobs_bots/reasoning/base.py
async def stream_generate(
    self,
    manager: ReasoningManagerProtocol,
    llm: Any,
    tools: list[Any] | None = None,
    **kwargs: Any,
) -> AsyncIterator[Any]:
    """Stream response using this reasoning strategy.

    The default implementation wraps ``generate()`` and yields the
    complete response as a single item.  Subclasses that support true
    token-level streaming (e.g. ``SimpleReasoning``) should override
    this to yield incremental chunks.

    Args:
        manager: ConversationManager or compatible manager instance
        llm: LLM provider instance
        tools: Optional list of available tools
        **kwargs: Additional generation parameters

    Yields:
        LLM response or stream chunk objects
    """
    result = await self.generate(manager, llm, tools=tools, **kwargs)
    yield result
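The default wrap-and-yield behavior can be sketched standalone; the `generate` stub below stands in for a concrete strategy's implementation:

```python
import asyncio

async def generate(**kwargs):
    # Stand-in for a concrete strategy's generate().
    return "full response"

async def stream_generate(**kwargs):
    # Default implementation: the complete response as a single chunk.
    result = await generate(**kwargs)
    yield result

async def collect():
    return [chunk async for chunk in stream_generate()]

print(asyncio.run(collect()))  # ['full response']
```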

SimpleReasoning

SimpleReasoning(*, greeting_template: str | None = None)

Bases: ReasoningStrategy

Simple reasoning strategy that makes direct LLM calls.

This is the most straightforward strategy - it simply passes the conversation to the LLM and returns the response without any additional reasoning steps.

Use this when:

- You want direct, fast responses
- The task doesn't require complex reasoning
- You're using a powerful model that doesn't need guidance

Example
strategy = SimpleReasoning()
response = await strategy.generate(
    manager=conversation_manager,
    llm=llm_provider,
    temperature=0.7
)

Methods:

Name Description
generate

Generate response with a simple LLM call.

stream_generate

Stream response with true token-level streaming.

Source code in packages/bots/src/dataknobs_bots/reasoning/simple.py
def __init__(self, *, greeting_template: str | None = None) -> None:
    super().__init__(greeting_template=greeting_template)
Functions
generate async
generate(
    manager: Any, llm: Any, tools: list[Any] | None = None, **kwargs: Any
) -> Any

Generate response with a simple LLM call.

Parameters:

Name Type Description Default
manager Any

ConversationManager instance

required
llm Any

LLM provider instance (not used directly)

required
tools list[Any] | None

Optional list of tools

None
**kwargs Any

Generation parameters

{}

Returns:

Type Description
Any

LLM response

Source code in packages/bots/src/dataknobs_bots/reasoning/simple.py
async def generate(
    self,
    manager: Any,
    llm: Any,
    tools: list[Any] | None = None,
    **kwargs: Any,
) -> Any:
    """Generate response with a simple LLM call.

    Args:
        manager: ConversationManager instance
        llm: LLM provider instance (not used directly)
        tools: Optional list of tools
        **kwargs: Generation parameters

    Returns:
        LLM response
    """
    # Use the conversation manager's generate method
    # which handles the LLM call with the conversation history
    return await manager.complete(tools=tools, **kwargs)
stream_generate async
stream_generate(
    manager: Any, llm: Any, tools: list[Any] | None = None, **kwargs: Any
) -> AsyncIterator[Any]

Stream response with true token-level streaming.

Delegates to manager.stream_complete() which yields LLMStreamResponse chunks as they arrive from the provider.

Parameters:

Name Type Description Default
manager Any

ConversationManager instance

required
llm Any

LLM provider instance (not used directly)

required
tools list[Any] | None

Optional list of tools

None
**kwargs Any

Generation parameters

{}

Yields:

Type Description
AsyncIterator[Any]

LLM stream response chunks

Source code in packages/bots/src/dataknobs_bots/reasoning/simple.py
async def stream_generate(
    self,
    manager: Any,
    llm: Any,
    tools: list[Any] | None = None,
    **kwargs: Any,
) -> AsyncIterator[Any]:
    """Stream response with true token-level streaming.

    Delegates to ``manager.stream_complete()`` which yields
    ``LLMStreamResponse`` chunks as they arrive from the provider.

    Args:
        manager: ConversationManager instance
        llm: LLM provider instance (not used directly)
        tools: Optional list of tools
        **kwargs: Generation parameters

    Yields:
        LLM stream response chunks
    """
    async for chunk in manager.stream_complete(tools=tools, **kwargs):
        yield chunk

StrategyCapabilities dataclass

StrategyCapabilities(manages_sources: bool = False)

Declares what a reasoning strategy manages autonomously.

Used by DynaBot and other consumers to decide which orchestration steps to perform (e.g. source construction, auto-context) without hard-coding strategy names.

All fields default to False; concrete strategies override only the capabilities they possess. New fields can be added with default=False without breaking existing strategies.

Attributes:

Name Type Description
manages_sources bool

Strategy manages its own retrieval sources (grounded/hybrid). When True, DynaBot performs config-driven source construction after factory creation and disables redundant auto_context.
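The capability-driven dispatch can be sketched by re-declaring the dataclass (a mirror of the documented fields for illustration, not an import from the package):

```python
from dataclasses import dataclass

@dataclass
class StrategyCapabilities:
    # Mirrors the documented field; new capabilities would be
    # added here with default=False.
    manages_sources: bool = False

def needs_source_construction(caps: StrategyCapabilities) -> bool:
    # Consumers branch on capabilities, never on strategy names.
    return caps.manages_sources

print(needs_source_construction(StrategyCapabilities()))  # False
print(needs_source_construction(StrategyCapabilities(manages_sources=True)))  # True
```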

StrategyRegistry

StrategyRegistry()

Registry mapping strategy names to their factories.

Unlike PluginRegistry (which caches singleton instances and has a (key, config) factory signature), StrategyRegistry creates a fresh instance per call — strategies are per-bot, not singletons.

When PluginRegistry gains a create() method for fresh-instance factory invocation (consumer-gaps plan Item 65), this class should be migrated to use it as its backing store.

Methods:

Name Description
register

Register a strategy factory under the given name.

create

Create a strategy instance from a config dict.

get_factory

Return the factory for a strategy name, or None.

is_registered

Check whether a strategy name is registered.

list_keys

Return sorted list of registered strategy names.

Source code in packages/bots/src/dataknobs_bots/reasoning/registry.py
def __init__(self) -> None:
    self._factories: dict[str, StrategyFactory] = {}
    self._initialized = False
    self._lock = threading.RLock()
Functions
register
register(
    name: str, factory: StrategyFactory, *, override: bool = False
) -> None

Register a strategy factory under the given name.

Parameters:

Name Type Description Default
name str

Strategy name (used in config strategy field).

required
factory StrategyFactory

A ReasoningStrategy subclass or callable (config, **kwargs) -> ReasoningStrategy.

required
override bool

If True, silently replace an existing registration. Otherwise raise ValueError.

False

Raises:

Type Description
ValueError

If name is already registered and override is False.

Source code in packages/bots/src/dataknobs_bots/reasoning/registry.py
def register(
    self,
    name: str,
    factory: StrategyFactory,
    *,
    override: bool = False,
) -> None:
    """Register a strategy factory under the given name.

    Args:
        name: Strategy name (used in config ``strategy`` field).
        factory: A ``ReasoningStrategy`` subclass or callable
            ``(config, **kwargs) -> ReasoningStrategy``.
        override: If ``True``, silently replace an existing
            registration.  Otherwise raise ``ValueError``.

    Raises:
        ValueError: If ``name`` is already registered and
            ``override`` is ``False``.
    """
    self._ensure_builtins()
    canonical = name.lower()
    with self._lock:
        if canonical in self._factories and not override:
            raise ValueError(
                f"Strategy '{canonical}' is already registered. "
                f"Use override=True to replace it."
            )
        self._factories[canonical] = factory
    logger.debug("Registered strategy '%s'", canonical)
create
create(config: dict[str, Any], **kwargs: Any) -> ReasoningStrategy

Create a strategy instance from a config dict.

Extracts config["strategy"] (default "simple"), looks up the factory, and calls it. For ReasoningStrategy subclasses the factory is cls.from_config(config, **kwargs). For plain callables the factory is called as factory(config, **kwargs).

Parameters:

Name Type Description Default
config dict[str, Any]

Strategy configuration dict (must contain strategy key).

required
**kwargs Any

Forwarded to the factory (e.g. knowledge_base).

{}

Returns:

Type Description
ReasoningStrategy

Configured strategy instance.

Raises:

Type Description
ValueError

If the strategy name is not registered.

Source code in packages/bots/src/dataknobs_bots/reasoning/registry.py
def create(
    self,
    config: dict[str, Any],
    **kwargs: Any,
) -> ReasoningStrategy:
    """Create a strategy instance from a config dict.

    Extracts ``config["strategy"]`` (default ``"simple"``), looks up
    the factory, and calls it.  For ``ReasoningStrategy`` subclasses
    the factory is ``cls.from_config(config, **kwargs)``.  For plain
    callables the factory is called as ``factory(config, **kwargs)``.

    Args:
        config: Strategy configuration dict (must contain
            ``strategy`` key).
        **kwargs: Forwarded to the factory (e.g. ``knowledge_base``).

    Returns:
        Configured strategy instance.

    Raises:
        ValueError: If the strategy name is not registered.
    """
    self._ensure_builtins()
    name = config.get("strategy", "simple").lower()
    factory = self._factories.get(name)
    if factory is None:
        available = ", ".join(sorted(self._factories))
        raise ValueError(
            f"Unknown reasoning strategy: '{name}'. "
            f"Available strategies: {available}"
        )

    if isinstance(factory, type) and issubclass(factory, ReasoningStrategy):
        return factory.from_config(config, **kwargs)
    return factory(config, **kwargs)
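The register/create round trip can be sketched with a standalone mini-registry that mirrors the documented semantics (lowercase canonical names, `"simple"` default, ValueError on duplicates and unknown names); locking and built-in registration are omitted:

```python
class MiniStrategyRegistry:
    def __init__(self):
        self._factories = {}

    def register(self, name, factory, *, override=False):
        canonical = name.lower()  # names are canonicalized to lowercase
        if canonical in self._factories and not override:
            raise ValueError(f"Strategy '{canonical}' is already registered.")
        self._factories[canonical] = factory

    def create(self, config, **kwargs):
        name = config.get("strategy", "simple").lower()
        factory = self._factories.get(name)
        if factory is None:
            raise ValueError(f"Unknown reasoning strategy: '{name}'")
        return factory(config, **kwargs)  # fresh instance per call

registry = MiniStrategyRegistry()
registry.register("MyStrategy", lambda config, **kw: {"config": config, **kw})
instance = registry.create({"strategy": "MYSTRATEGY"}, knowledge_base="kb")
print(instance["knowledge_base"])  # kb
```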
get_factory
get_factory(name: str) -> StrategyFactory | None

Return the factory for a strategy name, or None.

Source code in packages/bots/src/dataknobs_bots/reasoning/registry.py
def get_factory(self, name: str) -> StrategyFactory | None:
    """Return the factory for a strategy name, or ``None``."""
    self._ensure_builtins()
    return self._factories.get(name.lower())
is_registered
is_registered(name: str) -> bool

Check whether a strategy name is registered.

Source code in packages/bots/src/dataknobs_bots/reasoning/registry.py
def is_registered(self, name: str) -> bool:
    """Check whether a strategy name is registered."""
    self._ensure_builtins()
    return name.lower() in self._factories
list_keys
list_keys() -> list[str]

Return sorted list of registered strategy names.

Source code in packages/bots/src/dataknobs_bots/reasoning/registry.py
def list_keys(self) -> list[str]:
    """Return sorted list of registered strategy names."""
    self._ensure_builtins()
    return sorted(self._factories)

BotTestHarness

BotTestHarness(
    bot: Any,
    provider: EchoProvider,
    extractor: ConfigurableExtractor | None,
    context: Any,
)

High-level test helper for ALL DynaBot behavioral tests.

Wraps the full setup ceremony (bot creation, provider injection, tool registration, middleware wiring, context management) into one object. Use create() to build, chat()/greet() to run turns.

For non-wizard tests, use bot_config= with any DynaBot config:

Example
async with await BotTestHarness.create(
    bot_config={
        "llm": {"provider": "echo", "model": "test"},
        "conversation_storage": {"backend": "memory"},
        "reasoning": {"strategy": "simple"},
    },
    main_responses=[
        tool_call_response("my_tool", {"q": "test"}),
        text_response("Here are the results"),
    ],
    tools=[my_tool],
    middleware=[my_middleware],
) as harness:
    result = await harness.chat("search")
    assert result.response == "Here are the results"
    # Streaming: harness.bot.stream_chat("msg", harness.context)

For wizard tests, use wizard_config= with WizardConfigBuilder:

Example
async with await BotTestHarness.create(
    wizard_config=config,
    main_responses=["Got it!", "All set!"],
    extraction_results=[
        [{"name": "Alice"}],
        [{"domain_id": "chess"}, {"name": "Alice", "domain_id": "chess"}],
    ],
) as harness:
    result = await harness.chat("My name is Alice")
    assert harness.wizard_data["name"] == "Alice"
    assert harness.wizard_stage == "gather"

Methods:

Name Description
create

Create a harness with a fully wired DynaBot.

chat

Run a chat turn and capture wizard state.

greet

Run a greet turn and capture wizard state.

close

Close the bot and release resources.

Attributes:

Name Type Description
wizard_stage str | None

Current wizard stage from the last turn.

wizard_data dict[str, Any]

Wizard state data dict from the last turn.

wizard_state dict[str, Any] | None

Full wizard state from the last turn.

last_response str

Response text from the last turn.

turn_count int

Number of turns executed.

bot Any

The underlying DynaBot instance.

context Any

The BotContext used for this harness's turns.

provider EchoProvider

The main EchoProvider (for call history assertions).

extractor ConfigurableExtractor | None

The ConfigurableExtractor (for call verification).

Source code in packages/bots/src/dataknobs_bots/testing.py
def __init__(
    self,
    bot: Any,
    provider: EchoProvider,
    extractor: ConfigurableExtractor | None,
    context: Any,
) -> None:
    self._bot = bot
    self._provider = provider
    self._extractor = extractor
    self._context = context
    self._turn_count = 0
    self._last_result: TurnResult | None = None
Attributes
wizard_stage property
wizard_stage: str | None

Current wizard stage from the last turn.

wizard_data property
wizard_data: dict[str, Any]

Wizard state data dict from the last turn.

wizard_state property
wizard_state: dict[str, Any] | None

Full wizard state from the last turn.

last_response property
last_response: str

Response text from the last turn.

turn_count property
turn_count: int

Number of turns executed.

bot property
bot: Any

The underlying DynaBot instance.

context property
context: Any

The BotContext used for this harness's turns.

provider property
provider: EchoProvider

The main EchoProvider (for call history assertions).

extractor property
extractor: ConfigurableExtractor | None

The ConfigurableExtractor (for call verification).

Functions
create async classmethod
create(
    *,
    wizard_config: dict[str, Any] | None = None,
    bot_config: dict[str, Any] | None = None,
    main_responses: list[Any] | None = None,
    extraction_results: list[list[dict[str, Any]]] | None = None,
    system_prompt: str = "You are a helpful assistant.",
    conversation_id: str = "test-conv",
    client_id: str = "test",
    extraction_scope: str = "current_message",
    tools: list[Any] | None = None,
    middleware: list[Any] | None = None,
    strict_tools: bool = True,
    strict: bool = False,
) -> BotTestHarness

Create a harness with a fully wired DynaBot.

Provide either wizard_config (auto-wires bot config) or bot_config (full control).

Parameters:

Name Type Description Default
wizard_config dict[str, Any] | None

Wizard config dict (e.g. from WizardConfigBuilder.build()). Auto-wires EchoProvider, ConfigurableExtractor, and memory storage.

None
bot_config dict[str, Any] | None

Complete bot config dict for DynaBot.from_config(). When provided, wizard_config is ignored.

None
main_responses list[Any] | None

Responses to queue on the main EchoProvider. Accepts strings or LLMResponse objects (e.g. from text_response() / tool_call_response()).

None
extraction_results list[list[dict[str, Any]]] | None

Per-turn extraction results. Each inner list contains dicts for one turn's extraction calls. Flattened into a ConfigurableExtractor sequence internally.

None
system_prompt str

System prompt text.

'You are a helpful assistant.'
conversation_id str

Conversation ID for the test context.

'test-conv'
client_id str

Client ID for the test context.

'test'
extraction_scope str

Default extraction scope for the wizard. Only applies when wizard_config is used; ignored when bot_config is provided directly.

'current_message'
tools list[Any] | None

Optional list of Tool instances to register on the bot. Useful for ReAct strategy tests that need tool execution.

None
middleware list[Any] | None

Optional list of Middleware instances to append to the bot. Useful for testing middleware hooks like after_turn and on_tool_executed.

None
strict_tools bool

If True (default), the EchoProvider raises ValueError when a scripted response contains tool_calls but no tools were provided to complete(). Set to False for tests that intentionally exercise unexpected tool_calls with no registered tools.

True
strict bool

If True, the EchoProvider raises ResponseQueueExhaustedError when all scripted responses have been consumed, instead of falling back to echo behavior. Catches under-scripted tests.

False

Returns:

Type Description
BotTestHarness

Configured BotTestHarness instance.

Raises:

Type Description
ValueError

If neither wizard_config nor bot_config is provided.

Source code in packages/bots/src/dataknobs_bots/testing.py
@classmethod
async def create(
    cls,
    *,
    wizard_config: dict[str, Any] | None = None,
    bot_config: dict[str, Any] | None = None,
    main_responses: list[Any] | None = None,
    extraction_results: list[list[dict[str, Any]]] | None = None,
    system_prompt: str = "You are a helpful assistant.",
    conversation_id: str = "test-conv",
    client_id: str = "test",
    extraction_scope: str = "current_message",
    tools: list[Any] | None = None,
    middleware: list[Any] | None = None,
    strict_tools: bool = True,
    strict: bool = False,
) -> BotTestHarness:
    """Create a harness with a fully wired DynaBot.

    Provide either ``wizard_config`` (auto-wires bot config) or
    ``bot_config`` (full control).

    Args:
        wizard_config: Wizard config dict (e.g. from
            ``WizardConfigBuilder.build()``). Auto-wires EchoProvider,
            ConfigurableExtractor, and memory storage.
        bot_config: Complete bot config dict for ``DynaBot.from_config()``.
            When provided, ``wizard_config`` is ignored.
        main_responses: Responses to queue on the main EchoProvider.
            Accepts strings or ``LLMResponse`` objects (e.g. from
            ``text_response()`` / ``tool_call_response()``).
        extraction_results: Per-turn extraction results. Each inner list
            contains dicts for one turn's extraction calls. Flattened
            into a ``ConfigurableExtractor`` sequence internally.
        system_prompt: System prompt text.
        conversation_id: Conversation ID for the test context.
        client_id: Client ID for the test context.
        extraction_scope: Default extraction scope for the wizard.
            Only applies when ``wizard_config`` is used; ignored
            when ``bot_config`` is provided directly.
        tools: Optional list of ``Tool`` instances to register on the
            bot. Useful for ReAct strategy tests that need tool
            execution.
        middleware: Optional list of ``Middleware`` instances to append
            to the bot. Useful for testing middleware hooks like
            ``after_turn`` and ``on_tool_executed``.
        strict_tools: If True (default), the EchoProvider raises
            ValueError when a scripted response contains tool_calls
            but no tools were provided to complete(). Set to False
            for tests that intentionally exercise unexpected
            tool_calls with no registered tools.
        strict: If True, the EchoProvider raises
            ``ResponseQueueExhaustedError`` when all scripted
            responses have been consumed, instead of falling back
            to echo behavior.  Catches under-scripted tests.

    Returns:
        Configured ``BotTestHarness`` instance.

    Raises:
        ValueError: If neither ``wizard_config`` nor ``bot_config``
            is provided.
    """
    from .bot.base import DynaBot
    from .bot.context import BotContext

    if bot_config is None and wizard_config is None:
        raise ValueError(
            "Either wizard_config or bot_config must be provided"
        )

    # Build extraction results
    extractor: ConfigurableExtractor | None = None
    if extraction_results is not None:
        flat_results = [
            SimpleExtractionResult(data=data, confidence=0.9)
            for turn_results in extraction_results
            for data in turn_results
        ]
        extractor = ConfigurableExtractor(results=flat_results)

    # Build bot config if not provided
    if bot_config is None:
        assert wizard_config is not None
        wizard_cfg = copy.deepcopy(wizard_config)

        existing_settings = wizard_cfg.get("settings", {})
        if "extraction_scope" not in existing_settings:
            wizard_cfg["settings"] = {
                "extraction_scope": extraction_scope,
                **existing_settings,
            }

        # When scripted extraction results are provided, force LLM
        # extraction on stages that would otherwise use verbatim
        # capture (single required string field).  Without this,
        # the ConfigurableExtractor is silently bypassed and tests
        # get the raw user message instead of scripted results.
        #
        # This applies to ALL schema stages uniformly.  In multi-stage
        # wizards where a specific stage should still use verbatim
        # capture, set ``capture_mode="verbatim"`` explicitly on that
        # stage — the guard below respects explicit overrides at both
        # the top-level and collection_config levels.
        if extraction_results is not None:
            for stage_def in wizard_cfg.get("stages", []):
                if (
                    stage_def.get("schema")
                    and stage_def.get("capture_mode") in (None, "auto")
                ):
                    col = stage_def.get("collection_config") or {}
                    if col.get("capture_mode") in (None, "auto"):
                        stage_def["capture_mode"] = "extract"

        bot_config = {
            "llm": {"provider": "echo", "model": "echo-test"},
            "conversation_storage": {"backend": "memory"},
            "prompts": {
                "assistant": system_prompt,
            },
            "system_prompt": "assistant",
            "reasoning": {
                "strategy": "wizard",
                "wizard_config": wizard_cfg,
                "extraction_config": {
                    "provider": "echo",
                    "model": "echo-extraction",
                },
            },
        }

    # Create bot
    bot = await DynaBot.from_config(bot_config)

    # Close the original provider created by from_config() — we replace
    # it with a fresh EchoProvider that has a clean response queue.
    original_provider = bot.llm
    if hasattr(original_provider, "close"):
        await original_provider.close()

    # Create a fresh provider with known state
    provider = EchoProvider(
        {"provider": "echo", "model": "echo-test"},
        strict_tools=strict_tools,
        strict=strict,
    )
    if main_responses:
        provider.set_responses(main_responses)

    # Inject fresh provider and extractor
    inject_providers(bot, main_provider=provider, extractor=extractor)

    # Register tools if provided
    if tools:
        for tool in tools:
            bot.tool_registry.register_tool(tool)

    # Append middleware if provided
    if middleware:
        for mw in middleware:
            bot.middleware.append(mw)

    context = BotContext(
        conversation_id=conversation_id,
        client_id=client_id,
    )

    return cls(
        bot=bot,
        provider=provider,
        extractor=extractor,
        context=context,
    )
chat async
chat(message: str, **kwargs: Any) -> TurnResult

Run a chat turn and capture wizard state.

Parameters:

Name Type Description Default
message str

User message.

required
**kwargs Any

Additional kwargs passed to bot.chat().

{}

Returns:

Type Description
TurnResult

TurnResult with response and wizard state snapshot.

Source code in packages/bots/src/dataknobs_bots/testing.py
async def chat(self, message: str, **kwargs: Any) -> TurnResult:
    """Run a chat turn and capture wizard state.

    Args:
        message: User message.
        **kwargs: Additional kwargs passed to ``bot.chat()``.

    Returns:
        ``TurnResult`` with response and wizard state snapshot.
    """
    response = await self._bot.chat(message, self._context, **kwargs)
    self._turn_count += 1

    state = await self._bot.get_wizard_state(
        self._context.conversation_id,
    )
    result = TurnResult(
        response=response or "",
        wizard_stage=state["current_stage"] if state else None,
        wizard_data=state.get("data", {}) if state else {},
        wizard_state=state,
        turn_index=self._turn_count,
    )
    self._last_result = result
    return result
greet async
greet(**kwargs: Any) -> TurnResult

Run a greet turn and capture wizard state.

Parameters:

Name Type Description Default
**kwargs Any

Additional kwargs passed to bot.greet().

{}

Returns:

Type Description
TurnResult

TurnResult with response and wizard state snapshot.

Source code in packages/bots/src/dataknobs_bots/testing.py
async def greet(self, **kwargs: Any) -> TurnResult:
    """Run a greet turn and capture wizard state.

    Args:
        **kwargs: Additional kwargs passed to ``bot.greet()``.

    Returns:
        ``TurnResult`` with response and wizard state snapshot.
    """
    response = await self._bot.greet(self._context, **kwargs)
    self._turn_count += 1

    state = await self._bot.get_wizard_state(
        self._context.conversation_id,
    )
    result = TurnResult(
        response=response or "",
        wizard_stage=state["current_stage"] if state else None,
        wizard_data=state.get("data", {}) if state else {},
        wizard_state=state,
        turn_index=self._turn_count,
    )
    self._last_result = result
    return result
close async
close() -> None

Close the bot and release resources.

Source code in packages/bots/src/dataknobs_bots/testing.py
async def close(self) -> None:
    """Close the bot and release resources."""
    await self._bot.close()

CaptureReplay

CaptureReplay(data: dict[str, Any])

Loads a capture JSON file and creates pre-loaded EchoProviders.

Capture files contain serialized LLM request/response pairs from real provider runs, organized by turn. CaptureReplay deserializes these and creates EchoProviders queued with the correct responses, enabling deterministic replay of captured conversations.

Attributes:

Name Type Description
metadata dict[str, Any]

Capture session metadata (description, model info, timestamps)

turns list[dict[str, Any]]

List of turn dicts with wizard state, user messages, bot responses

format_version str

Capture file format version

Example
replay = CaptureReplay.from_file("captures/quiz_basic.json")

# Get providers for replay
main = replay.main_provider()
extraction = replay.extraction_provider()

# Or inject directly into a bot
replay.inject_into_bot(bot)

Methods:

Name Description
from_file

Load a capture replay from a JSON file.

from_dict

Create a CaptureReplay from a dict (e.g., already-parsed JSON).

main_provider

Create an EchoProvider queued with main-role responses.

extraction_provider

Create an EchoProvider queued with extraction-role responses.

inject_into_bot

Replace providers on a DynaBot with capture-replay EchoProviders.

Source code in packages/bots/src/dataknobs_bots/testing.py
def __init__(
    self,
    data: dict[str, Any],
) -> None:
    self.format_version: str = data.get("format_version", "1.0")
    self.metadata: dict[str, Any] = data.get("metadata", {})
    self.turns: list[dict[str, Any]] = data.get("turns", [])
    self._data = data

    # Pre-separate LLM calls by role for provider creation
    self._main_responses: list[LLMResponse] = []
    self._extraction_responses: list[LLMResponse] = []
    self._parse_calls()
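A minimal capture dict matching the top-level keys the constructor reads. This is a hand-written sketch, not a real capture file; the per-turn structure is parsed by `_parse_calls()`, which is not shown in this excerpt, so `turns` is left empty:

```python
# Top-level capture-file shape, mirroring CaptureReplay.__init__ above.
# The contents of each turn entry are an assumption not covered here.
capture = {
    "format_version": "1.0",
    "metadata": {"description": "quiz run", "model": "echo-test"},
    "turns": [],
}

# __init__ falls back to these defaults when keys are missing:
format_version = capture.get("format_version", "1.0")
metadata = capture.get("metadata", {})
turns = capture.get("turns", [])
```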
Functions
from_file classmethod
from_file(path: str | Path) -> CaptureReplay

Load a capture replay from a JSON file.

Parameters:

Name Type Description Default
path str | Path

Path to the capture JSON file

required

Returns:

Type Description
CaptureReplay

CaptureReplay instance

Raises:

Type Description
FileNotFoundError

If the file does not exist

JSONDecodeError

If the file is not valid JSON

Source code in packages/bots/src/dataknobs_bots/testing.py
@classmethod
def from_file(cls, path: str | Path) -> CaptureReplay:
    """Load a capture replay from a JSON file.

    Args:
        path: Path to the capture JSON file

    Returns:
        CaptureReplay instance

    Raises:
        FileNotFoundError: If the file does not exist
        json.JSONDecodeError: If the file is not valid JSON
    """
    with open(path) as f:
        data = json.load(f)
    return cls(data)
from_dict classmethod
from_dict(data: dict[str, Any]) -> CaptureReplay

Create a CaptureReplay from a dict (e.g., already-parsed JSON).

Parameters:

Name Type Description Default
data dict[str, Any]

Capture data dict

required

Returns:

Type Description
CaptureReplay

CaptureReplay instance

Source code in packages/bots/src/dataknobs_bots/testing.py
@classmethod
def from_dict(cls, data: dict[str, Any]) -> CaptureReplay:
    """Create a CaptureReplay from a dict (e.g., already-parsed JSON).

    Args:
        data: Capture data dict

    Returns:
        CaptureReplay instance
    """
    return cls(data)
main_provider
main_provider() -> EchoProvider

Create an EchoProvider queued with main-role responses.

Returns:

Type Description
EchoProvider

EchoProvider with responses in capture order

Source code in packages/bots/src/dataknobs_bots/testing.py
def main_provider(self) -> EchoProvider:
    """Create an EchoProvider queued with main-role responses.

    Returns:
        EchoProvider with responses in capture order
    """
    provider = EchoProvider({"provider": "echo", "model": "capture-replay"})
    if self._main_responses:
        provider.set_responses(self._main_responses)
    return provider
extraction_provider
extraction_provider() -> EchoProvider

Create an EchoProvider queued with extraction-role responses.

Returns:

Type Description
EchoProvider

EchoProvider with responses in capture order

Source code in packages/bots/src/dataknobs_bots/testing.py
def extraction_provider(self) -> EchoProvider:
    """Create an EchoProvider queued with extraction-role responses.

    Returns:
        EchoProvider with responses in capture order
    """
    provider = EchoProvider({"provider": "echo", "model": "capture-replay"})
    if self._extraction_responses:
        provider.set_responses(self._extraction_responses)
    return provider
inject_into_bot
inject_into_bot(bot: Any) -> None

Replace providers on a DynaBot with capture-replay EchoProviders.

Creates main and extraction EchoProviders from the captured data and injects them into the bot using inject_providers.

Parameters:

Name Type Description Default
bot Any

A DynaBot instance

required
Source code in packages/bots/src/dataknobs_bots/testing.py
def inject_into_bot(self, bot: Any) -> None:
    """Replace providers on a DynaBot with capture-replay EchoProviders.

    Creates main and extraction EchoProviders from the captured data
    and injects them into the bot using ``inject_providers``.

    Args:
        bot: A DynaBot instance
    """
    inject_providers(
        bot,
        main_provider=self.main_provider(),
        extraction_provider=self.extraction_provider() if self._extraction_responses else None,
    )

TurnResult dataclass

TurnResult(
    response: str,
    wizard_stage: str | None = None,
    wizard_data: dict[str, Any] = dict(),
    wizard_state: dict[str, Any] | None = None,
    turn_index: int = 0,
)

Result of a single bot.chat() or bot.greet() turn.

Captures the bot response along with a snapshot of wizard state at the end of the turn.

Attributes:

Name Type Description
response str

Bot response text.

wizard_stage str | None

Current wizard stage after this turn, or None if no wizard.

wizard_data dict[str, Any]

Wizard state data dict after this turn.

wizard_state dict[str, Any] | None

Full normalized wizard state after this turn, or None.

turn_index int

One-based turn index (1 = first turn).

Attributes
response instance-attribute
response: str

Bot response text.

wizard_stage class-attribute instance-attribute
wizard_stage: str | None = None

Current wizard stage after this turn, or None if no wizard.

wizard_data class-attribute instance-attribute
wizard_data: dict[str, Any] = field(default_factory=dict)

Wizard state data dict after this turn.

wizard_state class-attribute instance-attribute
wizard_state: dict[str, Any] | None = None

Full normalized wizard state after this turn, or None.

turn_index class-attribute instance-attribute
turn_index: int = 0

One-based turn index (1 = first turn).

WizardConfigBuilder

WizardConfigBuilder(name: str, version: str = '1.0')

Fluent builder for wizard configuration dicts.

Replaces verbose inline dict construction (40+ lines) with a readable chained API. Performs build-time validation to catch common mistakes.

Example
config = (WizardConfigBuilder("my-wizard")
    .stage("gather", is_start=True, prompt="Tell me your name.")
        .field("name", field_type="string", required=True)
        .field("domain", field_type="string")
        .transition("done", "data.get('name') and data.get('domain')")
    .stage("done", is_end=True, prompt="All done!")
    .settings(extraction_scope="current_message")
    .build())

Methods:

Name Description
stage

Add a stage to the wizard config.

field

Add a field to the current stage's schema.

transition

Add a transition from the current stage.

settings

Set wizard-level settings.

build

Build and validate the wizard config dict.

Source code in packages/bots/src/dataknobs_bots/testing.py
def __init__(self, name: str, version: str = "1.0") -> None:
    self._name = name
    self._version = version
    self._stages: list[dict[str, Any]] = []
    self._settings: dict[str, Any] = {}
    self._current_stage: dict[str, Any] | None = None
Functions
stage
stage(
    name: str,
    *,
    is_start: bool = False,
    is_end: bool = False,
    prompt: str = "",
    response_template: str | None = None,
    mode: str | None = None,
    extraction_scope: str | None = None,
    auto_advance: bool | None = None,
    skip_extraction: bool | None = None,
    derivation_enabled: bool | None = None,
    recovery_enabled: bool | None = None,
    confirm_first_render: bool | None = None,
    confirm_on_new_data: bool | None = None,
    can_skip: bool | None = None,
    skip_default: bool | None = None,
    can_go_back: bool | None = None,
    reasoning: str | None = None,
    max_iterations: int | None = None,
    capture_mode: str | None = None,
    routing_transforms: list[str] | None = None,
    **extra_fields: Any,
) -> WizardConfigBuilder

Add a stage to the wizard config.

After calling stage(), subsequent field() and transition() calls apply to this stage.

Parameters:

Name Type Description Default
name str

Stage name (unique identifier).

required
is_start bool

Whether this is the start stage.

False
is_end bool

Whether this is an end stage.

False
prompt str

Stage prompt text.

''
response_template str | None

Jinja2 template rendered after extraction to confirm captured data.

None
mode str | None

Stage mode (e.g. "conversation").

None
extraction_scope str | None

Per-stage extraction scope override.

None
auto_advance bool | None

Per-stage auto-advance override.

None
skip_extraction bool | None

Whether to skip extraction on this stage.

None
derivation_enabled bool | None

Per-stage field derivation override. Set to False to suppress derivation on this stage.

None
recovery_enabled bool | None

Per-stage recovery pipeline override. Set to False to suppress all recovery on this stage.

None
confirm_first_render bool | None

Whether to pause for confirmation on first render when new data is extracted. Default True. Set to False to skip confirmation and evaluate transitions immediately.

None
confirm_on_new_data bool | None

Whether to re-confirm when schema property values change on subsequent renders.

None
can_skip bool | None

Whether the user can skip this stage.

None
skip_default bool | None

Default value to use when the stage is skipped.

None
can_go_back bool | None

Whether the user can navigate back from this stage.

None
reasoning str | None

Reasoning strategy for this stage (e.g. "react").

None
max_iterations int | None

Maximum ReAct iterations for this stage.

None
capture_mode str | None

Extraction capture mode — "auto" (default), "verbatim" (raw input), or "extract" (force LLM extraction).

None
routing_transforms list[str] | None

List of transform function names to execute before transition condition evaluation.

None
**extra_fields Any

Additional stage config fields passed through to the stage dict verbatim. Use for less common fields (e.g. llm_assist=True, navigation={...}) without needing explicit builder parameters.

{}

Returns:

Type Description
WizardConfigBuilder

Self for method chaining.

Source code in packages/bots/src/dataknobs_bots/testing.py
def stage(
    self,
    name: str,
    *,
    is_start: bool = False,
    is_end: bool = False,
    prompt: str = "",
    response_template: str | None = None,
    mode: str | None = None,
    extraction_scope: str | None = None,
    auto_advance: bool | None = None,
    skip_extraction: bool | None = None,
    derivation_enabled: bool | None = None,
    recovery_enabled: bool | None = None,
    confirm_first_render: bool | None = None,
    confirm_on_new_data: bool | None = None,
    can_skip: bool | None = None,
    skip_default: bool | None = None,
    can_go_back: bool | None = None,
    reasoning: str | None = None,
    max_iterations: int | None = None,
    capture_mode: str | None = None,
    routing_transforms: list[str] | None = None,
    **extra_fields: Any,
) -> WizardConfigBuilder:
    """Add a stage to the wizard config.

    After calling ``stage()``, subsequent ``field()`` and
    ``transition()`` calls apply to this stage.

    Args:
        name: Stage name (unique identifier).
        is_start: Whether this is the start stage.
        is_end: Whether this is an end stage.
        prompt: Stage prompt text.
        response_template: Jinja2 template rendered after extraction
            to confirm captured data.
        mode: Stage mode (e.g. ``"conversation"``).
        extraction_scope: Per-stage extraction scope override.
        auto_advance: Per-stage auto-advance override.
        skip_extraction: Whether to skip extraction on this stage.
        derivation_enabled: Per-stage field derivation override.
            Set to ``False`` to suppress derivation on this stage.
        recovery_enabled: Per-stage recovery pipeline override.
            Set to ``False`` to suppress all recovery on this stage.
        confirm_first_render: Whether to pause for confirmation on
            first render when new data is extracted. Default ``True``.
            Set to ``False`` to skip confirmation and evaluate
            transitions immediately.
        confirm_on_new_data: Whether to re-confirm when schema
            property values change on subsequent renders.
        can_skip: Whether the user can skip this stage.
        skip_default: Default value to use when the stage is skipped.
        can_go_back: Whether the user can navigate back from this
            stage.
        reasoning: Reasoning strategy for this stage
            (e.g. ``"react"``).
        max_iterations: Maximum ReAct iterations for this stage.
        capture_mode: Extraction capture mode — ``"auto"``
            (default), ``"verbatim"`` (raw input), or ``"extract"``
            (force LLM extraction).
        routing_transforms: List of transform function names to
            execute before transition condition evaluation.
        **extra_fields: Additional stage config fields passed through
            to the stage dict verbatim. Use for less common fields
            (e.g. ``llm_assist=True``, ``navigation={...}``) without
            needing explicit builder parameters.

    Returns:
        Self for method chaining.
    """
    # Finalize previous stage
    self._finalize_current_stage()

    stage: dict[str, Any] = {"name": name, "prompt": prompt}
    if is_start:
        stage["is_start"] = True
    if is_end:
        stage["is_end"] = True
    if response_template is not None:
        stage["response_template"] = response_template
    if mode is not None:
        stage["mode"] = mode
    if extraction_scope is not None:
        stage["extraction_scope"] = extraction_scope
    if auto_advance is not None:
        stage["auto_advance"] = auto_advance
    if skip_extraction is not None:
        stage["skip_extraction"] = skip_extraction
    if derivation_enabled is not None:
        stage["derivation_enabled"] = derivation_enabled
    if recovery_enabled is not None:
        stage["recovery_enabled"] = recovery_enabled
    if confirm_first_render is not None:
        stage["confirm_first_render"] = confirm_first_render
    if confirm_on_new_data is not None:
        stage["confirm_on_new_data"] = confirm_on_new_data
    if can_skip is not None:
        stage["can_skip"] = can_skip
    if skip_default is not None:
        stage["skip_default"] = skip_default
    if can_go_back is not None:
        stage["can_go_back"] = can_go_back
    if reasoning is not None:
        stage["reasoning"] = reasoning
    if max_iterations is not None:
        stage["max_iterations"] = max_iterations
    if capture_mode is not None:
        stage["capture_mode"] = capture_mode
    if routing_transforms is not None:
        stage["routing_transforms"] = routing_transforms
    if extra_fields:
        # Prevent accidental override of structural keys set by
        # positional/explicit parameters above.
        reserved = {"name", "prompt", "is_start", "is_end"}
        safe_fields = {
            k: v for k, v in extra_fields.items()
            if k not in reserved
        }
        stage.update(safe_fields)

    self._current_stage = stage
    return self
field
field(
    name: str,
    *,
    field_type: str = "string",
    required: bool = False,
    description: str | None = None,
    enum: list[str] | None = None,
    default: Any = None,
    x_extraction: dict[str, Any] | None = None,
) -> WizardConfigBuilder

Add a field to the current stage's schema.

Must be called after stage().

Parameters:

Name Type Description Default
name str

Field name.

required
field_type str

JSON Schema type ("string", "integer", etc.).

'string'
required bool

Whether this field is required.

False
description str | None

Field description.

None
enum list[str] | None

Allowed values.

None
default Any

Default value.

None
x_extraction dict[str, Any] | None

Extraction hints (x-extraction schema extension).

None

Returns:

Type Description
WizardConfigBuilder

Self for method chaining.

Raises:

Type Description
ValueError

If no current stage is set.

Source code in packages/bots/src/dataknobs_bots/testing.py
def field(
    self,
    name: str,
    *,
    field_type: str = "string",
    required: bool = False,
    description: str | None = None,
    enum: list[str] | None = None,
    default: Any = None,
    x_extraction: dict[str, Any] | None = None,
) -> WizardConfigBuilder:
    """Add a field to the current stage's schema.

    Must be called after ``stage()``.

    Args:
        name: Field name.
        field_type: JSON Schema type (``"string"``, ``"integer"``, etc.).
        required: Whether this field is required.
        description: Field description.
        enum: Allowed values.
        default: Default value.
        x_extraction: Extraction hints (``x-extraction`` schema extension).

    Returns:
        Self for method chaining.

    Raises:
        ValueError: If no current stage is set.
    """
    if self._current_stage is None:
        raise ValueError("field() must be called after stage()")

    schema = self._current_stage.setdefault("schema", {
        "type": "object",
        "properties": {},
        "required": [],
    })
    props = schema.setdefault("properties", {})

    field_def: dict[str, Any] = {"type": field_type}
    if description is not None:
        field_def["description"] = description
    if enum is not None:
        field_def["enum"] = enum
    if default is not None:
        field_def["default"] = default
    if x_extraction is not None:
        field_def["x-extraction"] = x_extraction

    props[name] = field_def

    if required:
        req_list = schema.setdefault("required", [])
        if name not in req_list:
            req_list.append(name)

    return self
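Successive `field()` calls accumulate into a single JSON Schema object on the stage. The plain-dict sketch below reproduces that accretion (shape taken from the `field()` source above) without the builder class:

```python
# Schema dict built up by two field() calls on one stage:
# field("name", required=True) then field("domain").
schema: dict = {"type": "object", "properties": {}, "required": []}

for name, ftype, required in [
    ("name", "string", True),
    ("domain", "string", False),
]:
    schema["properties"][name] = {"type": ftype}
    if required and name not in schema["required"]:
        schema["required"].append(name)
```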
transition
transition(
    target: str, condition: str | None = None, priority: int | None = None
) -> WizardConfigBuilder

Add a transition from the current stage.

Must be called after stage().

Parameters:

Name Type Description Default
target str

Target stage name.

required
condition str | None

Python expression evaluated against wizard state.

None
priority int | None

Transition evaluation priority (lower = first).

None

Returns:

Type Description
WizardConfigBuilder

Self for method chaining.

Raises:

Type Description
ValueError

If no current stage is set.

Source code in packages/bots/src/dataknobs_bots/testing.py
def transition(
    self,
    target: str,
    condition: str | None = None,
    priority: int | None = None,
) -> WizardConfigBuilder:
    """Add a transition from the current stage.

    Must be called after ``stage()``.

    Args:
        target: Target stage name.
        condition: Python expression evaluated against wizard state.
        priority: Transition evaluation priority (lower = first).

    Returns:
        Self for method chaining.

    Raises:
        ValueError: If no current stage is set.
    """
    if self._current_stage is None:
        raise ValueError("transition() must be called after stage()")

    transitions = self._current_stage.setdefault("transitions", [])
    t: dict[str, Any] = {"target": target}
    if condition is not None:
        t["condition"] = condition
    if priority is not None:
        t["priority"] = priority
    transitions.append(t)
    return self
settings
settings(**kwargs: Any) -> WizardConfigBuilder

Set wizard-level settings.

Parameters:

Name Type Description Default
**kwargs Any

Settings key-value pairs (e.g. extraction_scope="current_message", scope_escalation={"enabled": True}).

{}

Returns:

Type Description
WizardConfigBuilder

Self for method chaining.

Source code in packages/bots/src/dataknobs_bots/testing.py
def settings(self, **kwargs: Any) -> WizardConfigBuilder:
    """Set wizard-level settings.

    Args:
        **kwargs: Settings key-value pairs (e.g.
            ``extraction_scope="current_message"``,
            ``scope_escalation={"enabled": True}``).

    Returns:
        Self for method chaining.
    """
    self._settings.update(kwargs)
    return self
build
build() -> dict[str, Any]

Build and validate the wizard config dict.

Returns:

Type Description
dict[str, Any]

Complete wizard configuration dict compatible with WizardConfigLoader.load_from_dict().

Raises:

Type Description
ValueError

If validation fails (no start stage, no end stage, transition to nonexistent stage).

Source code in packages/bots/src/dataknobs_bots/testing.py
def build(self) -> dict[str, Any]:
    """Build and validate the wizard config dict.

    Returns:
        Complete wizard configuration dict compatible with
        ``WizardConfigLoader.load_from_dict()``.

    Raises:
        ValueError: If validation fails (no start stage, no end stage,
            transition to nonexistent stage).
    """
    self._finalize_current_stage()

    config: dict[str, Any] = {
        "name": self._name,
        "version": self._version,
        "stages": list(self._stages),
    }
    if self._settings:
        config["settings"] = dict(self._settings)

    self._validate(config)
    return config
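For reference, the dict that `build()` emits for the two-stage example in the class docstring has this shape. It is assembled by hand here from the `stage()`, `field()`, `transition()`, and `build()` sources shown above, not captured from a real run:

```python
# Expected output shape of WizardConfigBuilder("my-wizard")...build()
# for the docstring example (hand-written sketch).
expected = {
    "name": "my-wizard",
    "version": "1.0",
    "stages": [
        {
            "name": "gather",
            "prompt": "Tell me your name.",
            "is_start": True,
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "domain": {"type": "string"},
                },
                "required": ["name"],
            },
            "transitions": [
                {
                    "target": "done",
                    "condition": "data.get('name') and data.get('domain')",
                },
            ],
        },
        {"name": "done", "prompt": "All done!", "is_end": True},
    ],
    "settings": {"extraction_scope": "current_message"},
}
```

Note that `settings` only appears when at least one `settings()` call was made, matching the `if self._settings:` guard in `build()`.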

AddKBResourceTool

AddKBResourceTool(knowledge_dir: Path | None = None)

Bases: ContextAwareTool

Tool for adding a resource to the knowledge base resource list.

Supports adding file references (from the source directory) or inline content that gets written to the knowledge directory.

Wizard data read/written:

- _kb_resources: list[dict] — resource list (append)
- domain_id: str — used for knowledge directory organization

Attributes:

Name Type Description
_knowledge_dir

Optional base directory for writing inline content.

Initialize the tool.

Parameters:

Name Type Description Default
knowledge_dir Path | None

Base directory for knowledge files. Used when writing inline content to disk. Resolved from wizard data _knowledge_dir if not provided here.

None

Methods:

Name Description
catalog_metadata

Return catalog metadata for this tool class.

execute_with_context

Add a KB resource.

Source code in packages/bots/src/dataknobs_bots/tools/kb_tools.py
def __init__(self, knowledge_dir: Path | None = None) -> None:
    """Initialize the tool.

    Args:
        knowledge_dir: Base directory for knowledge files. Used when
            writing inline content to disk. Resolved from wizard data
            ``_knowledge_dir`` if not provided here.
    """
    super().__init__(
        name="add_kb_resource",
        description=(
            "Add a resource to the bot's knowledge base. Can add "
            "a file from the source directory or inline content."
        ),
    )
    self._knowledge_dir = knowledge_dir
Attributes
schema property
schema: dict[str, Any]

Return JSON Schema for tool parameters.

Functions
catalog_metadata classmethod
catalog_metadata() -> dict[str, Any]

Return catalog metadata for this tool class.

Source code in packages/bots/src/dataknobs_bots/tools/kb_tools.py
@classmethod
def catalog_metadata(cls) -> dict[str, Any]:
    """Return catalog metadata for this tool class."""
    return {
        "name": "add_kb_resource",
        "description": (
            "Add a resource to the bot's knowledge base."
        ),
        "tags": ("configbot", "kb"),
    }
execute_with_context async
execute_with_context(
    context: ToolExecutionContext,
    path: str,
    title: str = "",
    resource_type: str = "file",
    content: str | None = None,
    description: str | None = None,
    **kwargs: Any,
) -> dict[str, Any]

Add a KB resource.

Parameters:

Name Type Description Default
context ToolExecutionContext

Execution context with wizard state.

required
path str

Resource path or filename.

required
title str

Optional display title.

''
resource_type str

Type of resource ('file' or 'inline').

'file'
content str | None

Inline content (required if resource_type='inline').

None
description str | None

Optional resource description.

None

Returns:

Type Description
dict[str, Any]

Dict with add result.

Source code in packages/bots/src/dataknobs_bots/tools/kb_tools.py
async def execute_with_context(
    self,
    context: ToolExecutionContext,
    path: str,
    title: str = "",
    resource_type: str = "file",
    content: str | None = None,
    description: str | None = None,
    **kwargs: Any,
) -> dict[str, Any]:
    """Add a KB resource.

    Args:
        context: Execution context with wizard state.
        path: Resource path or filename.
        title: Optional display title.
        resource_type: Type of resource ('file' or 'inline').
        content: Inline content (required if resource_type='inline').
        description: Optional resource description.

    Returns:
        Dict with add result.
    """
    wizard_data = _get_wizard_data_ref(context)
    resources: list[dict[str, Any]] = wizard_data.setdefault(
        "_kb_resources", []
    )

    # Check for duplicate
    existing_paths = {r["path"] for r in resources}
    if path in existing_paths:
        return {
            "success": False,
            "error": f"Resource already exists: {path}",
            "existing_resources": len(resources),
        }

    resource: dict[str, Any] = {
        "path": path,
        "type": resource_type,
    }
    if title:
        resource["title"] = title
    if description:
        resource["description"] = description

    # Handle inline content — write to knowledge directory
    if resource_type == "inline":
        if not content:
            return {
                "success": False,
                "error": "Content is required for inline resources",
            }
        kb_dir = _resolve_knowledge_dir(self._knowledge_dir, wizard_data)
        if kb_dir is None:
            return {
                "success": False,
                "error": "No knowledge directory configured",
            }
        domain_id = wizard_data.get("domain_id", "default")
        target_dir = kb_dir / domain_id
        target_dir.mkdir(parents=True, exist_ok=True)
        target_path = target_dir / path
        target_path.write_text(content, encoding="utf-8")
        resource["source"] = str(target_path)
        logger.debug(
            "Wrote inline resource: %s",
            target_path,
            extra={"conversation_id": context.conversation_id},
        )

    resources.append(resource)

    logger.debug(
        "Added KB resource: %s (type=%s)",
        path,
        resource_type,
        extra={"conversation_id": context.conversation_id},
    )

    return {
        "success": True,
        "resource": resource,
        "total_resources": len(resources),
    }
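The duplicate-path guard at the top of `execute_with_context` can be sketched as plain dict logic, independent of the `ToolExecutionContext` machinery (names and result keys mirror the source above):

```python
# Standalone sketch of the duplicate check and append performed by
# AddKBResourceTool.execute_with_context (file resources only).
resources = [{"path": "faq.md", "type": "file"}]

def add_resource(path: str, resource_type: str = "file") -> dict:
    existing_paths = {r["path"] for r in resources}
    if path in existing_paths:
        return {
            "success": False,
            "error": f"Resource already exists: {path}",
            "existing_resources": len(resources),
        }
    resource = {"path": path, "type": resource_type}
    resources.append(resource)
    return {
        "success": True,
        "resource": resource,
        "total_resources": len(resources),
    }

dup = add_resource("faq.md")      # rejected: path already present
ok = add_resource("glossary.md")  # appended
```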

CheckKnowledgeSourceTool

CheckKnowledgeSourceTool()

Bases: ContextAwareTool

Tool for verifying a knowledge source directory exists and has content.

Checks the specified path for files matching common document patterns and records the results in wizard data for subsequent tools.

Wizard data written:

- source_verified: bool — whether the source was found
- files_found: list[str] — matching file names
- _source_path_resolved: str — the resolved absolute path
- _kb_resources: list[dict] — initialized if not present

Initialize the tool.

Methods:

Name Description
catalog_metadata

Return catalog metadata for this tool class.

execute_with_context

Check the knowledge source directory.

Attributes:

Name Type Description
schema dict[str, Any]

Return JSON Schema for tool parameters.

Source code in packages/bots/src/dataknobs_bots/tools/kb_tools.py
def __init__(self) -> None:
    """Initialize the tool."""
    super().__init__(
        name="check_knowledge_source",
        description=(
            "Check if a knowledge source directory exists and contains "
            "files that can be used for the knowledge base."
        ),
    )
Attributes
schema property
schema: dict[str, Any]

Return JSON Schema for tool parameters.

Functions
catalog_metadata classmethod
catalog_metadata() -> dict[str, Any]

Return catalog metadata for this tool class.

Source code in packages/bots/src/dataknobs_bots/tools/kb_tools.py
@classmethod
def catalog_metadata(cls) -> dict[str, Any]:
    """Return catalog metadata for this tool class."""
    return {
        "name": "check_knowledge_source",
        "description": (
            "Check if a knowledge source directory exists and "
            "contains usable files."
        ),
        "tags": ("configbot", "kb"),
    }
execute_with_context async
execute_with_context(
    context: ToolExecutionContext,
    source_path: str,
    file_patterns: list[str] | None = None,
    **kwargs: Any,
) -> dict[str, Any]

Check the knowledge source directory.

Parameters:

Name Type Description Default
context ToolExecutionContext

Execution context with wizard state.

required
source_path str

Path to the knowledge source directory.

required
file_patterns list[str] | None

Optional glob patterns to match files.

None

Returns:

Type Description
dict[str, Any]

Dict with verification results.

Source code in packages/bots/src/dataknobs_bots/tools/kb_tools.py
async def execute_with_context(
    self,
    context: ToolExecutionContext,
    source_path: str,
    file_patterns: list[str] | None = None,
    **kwargs: Any,
) -> dict[str, Any]:
    """Check the knowledge source directory.

    Args:
        context: Execution context with wizard state.
        source_path: Path to the knowledge source directory.
        file_patterns: Optional glob patterns to match files.

    Returns:
        Dict with verification results.
    """
    wizard_data = _get_wizard_data_ref(context)
    patterns = file_patterns or _DEFAULT_GLOB_PATTERNS

    path = Path(source_path).expanduser().resolve()
    if not path.exists() or not path.is_dir():
        wizard_data["source_verified"] = False
        wizard_data["files_found"] = []
        logger.debug(
            "Knowledge source not found: %s",
            path,
            extra={"conversation_id": context.conversation_id},
        )
        return {
            "exists": False,
            "error": f"Directory not found: {source_path}",
            "files_found": [],
        }

    # Find matching files
    found_files: list[str] = []
    for pattern in patterns:
        for match in path.glob(pattern):
            if match.is_file():
                found_files.append(match.name)
    found_files = sorted(set(found_files))

    # Update wizard data
    wizard_data["source_verified"] = True
    wizard_data["files_found"] = found_files
    wizard_data["_source_path_resolved"] = str(path)
    if "_kb_resources" not in wizard_data:
        wizard_data["_kb_resources"] = []

    # Auto-populate _kb_resources with discovered files
    resources: list[dict[str, Any]] = wizard_data["_kb_resources"]
    existing_paths = {r.get("path") for r in resources}
    for fname in found_files:
        if fname not in existing_paths:
            resources.append(
                {"path": fname, "type": "file", "source": str(path / fname)}
            )

    logger.debug(
        "Checked knowledge source: %s (%d files)",
        path,
        len(found_files),
        extra={"conversation_id": context.conversation_id},
    )

    return {
        "exists": True,
        "path": str(path),
        "files_found": found_files,
        "file_count": len(found_files),
        "patterns_checked": patterns,
    }
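The directory check and glob-based discovery above can be sketched with the stdlib alone. The default patterns (_DEFAULT_GLOB_PATTERNS) are not shown in this snippet, so the "*.md"/"*.txt" defaults below are placeholder assumptions:

```python
import pathlib
import tempfile

def check_source(source_path: str, patterns: tuple[str, ...] = ("*.md", "*.txt")) -> dict:
    """Verify a directory exists and list files matching glob patterns (sketch)."""
    path = pathlib.Path(source_path).expanduser().resolve()
    if not path.is_dir():
        return {"exists": False, "files_found": []}
    # Deduplicate names across patterns, keep files only, sort for stable output
    found = sorted({m.name for pat in patterns for m in path.glob(pat) if m.is_file()})
    return {"exists": True, "path": str(path), "files_found": found, "file_count": len(found)}

tmp = tempfile.mkdtemp()
pathlib.Path(tmp, "faq.md").write_text("...", encoding="utf-8")
pathlib.Path(tmp, "notes.txt").write_text("...", encoding="utf-8")
result = check_source(tmp)
missing = check_source(tmp + "/does-not-exist")
```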

GetTemplateDetailsTool

GetTemplateDetailsTool(template_registry: ConfigTemplateRegistry)

Bases: ContextAwareTool

Tool for getting detailed information about a template.

Returns the full template definition including all variables, their types, defaults, and constraints.

Attributes:

Name Type Description
_registry

Template registry to query.

Initialize the tool.

Parameters:

Name Type Description Default
template_registry ConfigTemplateRegistry

Registry containing available templates.

required

Methods:

Name Description
catalog_metadata

Return catalog metadata for this tool class.

from_config

Create from YAML-compatible configuration.

execute_with_context

Get template details.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
def __init__(self, template_registry: ConfigTemplateRegistry) -> None:
    """Initialize the tool.

    Args:
        template_registry: Registry containing available templates.
    """
    super().__init__(
        name="get_template_details",
        description=(
            "Get detailed information about a specific configuration "
            "template, including all variables and their requirements."
        ),
    )
    self._registry = template_registry
Attributes
schema property
schema: dict[str, Any]

Return JSON Schema for tool parameters.

Functions
catalog_metadata classmethod
catalog_metadata() -> dict[str, Any]

Return catalog metadata for this tool class.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
@classmethod
def catalog_metadata(cls) -> dict[str, Any]:
    """Return catalog metadata for this tool class."""
    return {
        "name": "get_template_details",
        "description": (
            "Get detailed information about a specific "
            "configuration template."
        ),
        "tags": ("configbot",),
        "requires": ("template_registry",),
        "default_params": {"template_dir": "configs/templates"},
    }
from_config classmethod
from_config(config: dict[str, Any]) -> GetTemplateDetailsTool

Create from YAML-compatible configuration.

Parameters:

Name Type Description Default
config dict[str, Any]

Dict with template_dir key pointing to a directory containing template YAML files.

required

Returns:

Type Description
GetTemplateDetailsTool

Configured GetTemplateDetailsTool instance.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
@classmethod
def from_config(cls, config: dict[str, Any]) -> GetTemplateDetailsTool:
    """Create from YAML-compatible configuration.

    Args:
        config: Dict with ``template_dir`` key pointing to a
            directory containing template YAML files.

    Returns:
        Configured GetTemplateDetailsTool instance.
    """
    from pathlib import Path

    template_dir = config.get("template_dir", "configs/templates")
    registry = ConfigTemplateRegistry()
    path = Path(template_dir)
    if path.is_dir():
        registry.load_from_directory(path)
    return cls(template_registry=registry)
execute_with_context async
execute_with_context(
    context: ToolExecutionContext, template_name: str, **kwargs: Any
) -> dict[str, Any]

Get template details.

Parameters:

Name Type Description Default
context ToolExecutionContext

Execution context.

required
template_name str

Name of the template.

required

Returns:

Type Description
dict[str, Any]

Dict with template details, or error if not found.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
async def execute_with_context(
    self,
    context: ToolExecutionContext,
    template_name: str,
    **kwargs: Any,
) -> dict[str, Any]:
    """Get template details.

    Args:
        context: Execution context.
        template_name: Name of the template.

    Returns:
        Dict with template details, or error if not found.
    """
    template = self._registry.get(template_name)
    if template is None:
        return {
            "error": f"Template not found: {template_name}",
            "available": [
                t.name for t in self._registry.list_templates()
            ],
        }

    logger.debug(
        "Retrieved template details: %s",
        template_name,
        extra={"conversation_id": context.conversation_id},
    )

    return {
        "name": template.name,
        "description": template.description,
        "version": template.version,
        "tags": template.tags,
        "variables": [v.to_dict() for v in template.variables],
        "required_variables": [
            v.to_dict() for v in template.get_required_variables()
        ],
        "optional_variables": [
            v.to_dict() for v in template.get_optional_variables()
        ],
    }
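The miss-path behavior above (report the error plus what is available) is worth noting when driving this tool from an LLM. A stdlib-only sketch with a stand-in registry (TinyTemplateRegistry is hypothetical; the real ConfigTemplateRegistry has a richer surface):

```python
class TinyTemplateRegistry:
    """Minimal stand-in for the registry's get/list surface (sketch)."""

    def __init__(self) -> None:
        self._templates: dict[str, dict] = {}

    def register(self, template: dict) -> None:
        self._templates[template["name"]] = template

    def get(self, name: str):
        return self._templates.get(name)

    def list_names(self) -> list[str]:
        return sorted(self._templates)

def get_template_details(registry: TinyTemplateRegistry, name: str) -> dict:
    template = registry.get(name)
    if template is None:
        # Mirror the tool's miss-path: name the error and list what IS available
        return {"error": f"Template not found: {name}", "available": registry.list_names()}
    return dict(template)

registry = TinyTemplateRegistry()
registry.register({"name": "support-bot", "description": "RAG support bot"})
hit = get_template_details(registry, "support-bot")
miss = get_template_details(registry, "sales-bot")
```

Returning the available names on a miss lets the LLM self-correct instead of retrying blindly.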

IngestKnowledgeBaseTool

IngestKnowledgeBaseTool(knowledge_dir: Path | None = None)

Bases: ContextAwareTool

Tool for writing the KB ingestion manifest and finalizing KB config.

Writes a manifest.json file listing resources and chunking parameters, and updates wizard data with the final KB configuration for inclusion in the bot config.

Wizard data read:

- _kb_resources: list[dict] — resources to include
- domain_id: str — domain identifier
- files_found: list[str] — auto-discovered files (fallback)
- _source_path_resolved: str — resolved source path

Wizard data written:

- kb_config: dict — final KB configuration for the bot config
- kb_resources: list[dict] — finalized resource list (public key)
- ingestion_complete: bool — whether ingestion manifest was written

Attributes:

Name Type Description
_knowledge_dir

Optional base directory for knowledge files.

Initialize the tool.

Parameters:

Name Type Description Default
knowledge_dir Path | None

Base directory for knowledge files. Resolved from wizard data _knowledge_dir if not provided here.

None

Methods:

Name Description
catalog_metadata

Return catalog metadata for this tool class.

execute_with_context

Write ingestion manifest and finalize KB config.

Source code in packages/bots/src/dataknobs_bots/tools/kb_tools.py
def __init__(self, knowledge_dir: Path | None = None) -> None:
    """Initialize the tool.

    Args:
        knowledge_dir: Base directory for knowledge files. Resolved
            from wizard data ``_knowledge_dir`` if not provided here.
    """
    super().__init__(
        name="ingest_knowledge_base",
        description=(
            "Finalize and ingest the knowledge base resources. "
            "Writes an ingestion manifest and prepares the KB "
            "configuration for the bot."
        ),
    )
    self._knowledge_dir = knowledge_dir
Attributes
schema property
schema: dict[str, Any]

Return JSON Schema for tool parameters.

Functions
catalog_metadata classmethod
catalog_metadata() -> dict[str, Any]

Return catalog metadata for this tool class.

Source code in packages/bots/src/dataknobs_bots/tools/kb_tools.py
@classmethod
def catalog_metadata(cls) -> dict[str, Any]:
    """Return catalog metadata for this tool class."""
    return {
        "name": "ingest_knowledge_base",
        "description": (
            "Finalize and ingest the knowledge base resources."
        ),
        "tags": ("configbot", "kb"),
    }
execute_with_context async
execute_with_context(
    context: ToolExecutionContext, chunk_size: int = 512, **kwargs: Any
) -> dict[str, Any]

Write ingestion manifest and finalize KB config.

Parameters:

Name Type Description Default
context ToolExecutionContext

Execution context with wizard state.

required
chunk_size int

Size of text chunks.

512

Returns:

Type Description
dict[str, Any]

Dict with ingestion result.

Source code in packages/bots/src/dataknobs_bots/tools/kb_tools.py
async def execute_with_context(
    self,
    context: ToolExecutionContext,
    chunk_size: int = 512,
    **kwargs: Any,
) -> dict[str, Any]:
    """Write ingestion manifest and finalize KB config.

    Args:
        context: Execution context with wizard state.
        chunk_size: Size of text chunks.

    Returns:
        Dict with ingestion result.
    """
    wizard_data = _get_wizard_data_ref(context)
    domain_id = wizard_data.get("domain_id", "default")
    resources = wizard_data.get("_kb_resources", [])
    source_path = wizard_data.get("_source_path_resolved")

    # Fallback: if no explicit resources, use auto-discovered files
    if not resources and wizard_data.get("files_found"):
        resources = [
            {"path": f, "type": "file"}
            for f in wizard_data["files_found"]
        ]

    if not resources:
        return {
            "success": False,
            "error": "No resources to ingest. Add resources first.",
        }

    kb_dir = _resolve_knowledge_dir(self._knowledge_dir, wizard_data)
    if kb_dir is None:
        return {
            "success": False,
            "error": "No knowledge directory configured",
        }

    # Write manifest
    manifest_dir = kb_dir / domain_id
    manifest_dir.mkdir(parents=True, exist_ok=True)
    manifest = {
        "domain_id": domain_id,
        "source_path": source_path,
        "resources": resources,
        "chunking": {
            "chunk_size": chunk_size,
        },
    }
    manifest_path = manifest_dir / "manifest.json"
    manifest_path.write_text(
        json.dumps(manifest, indent=2), encoding="utf-8"
    )

    # Build KB config for the bot configuration
    kb_config: dict[str, Any] = {
        "enabled": True,
        "type": "rag",
        "documents_path": str(manifest_dir),
        "chunking": {
            "chunk_size": chunk_size,
        },
    }

    # Update wizard data with finalized KB config
    wizard_data["kb_config"] = kb_config
    wizard_data["kb_resources"] = resources
    wizard_data["ingestion_complete"] = True

    logger.info(
        "Wrote KB manifest for '%s' with %d resources",
        domain_id,
        len(resources),
        extra={
            "domain_id": domain_id,
            "resource_count": len(resources),
            "conversation_id": context.conversation_id,
        },
    )

    return {
        "success": True,
        "domain_id": domain_id,
        "manifest_path": str(manifest_path),
        "resource_count": len(resources),
        "chunk_size": chunk_size,
    }
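The manifest layout written above (<kb_dir>/<domain_id>/manifest.json with resources and chunking parameters) can be sketched directly from the source:

```python
import json
import pathlib
import tempfile

def write_manifest(kb_dir: str, domain_id: str, resources: list[dict],
                   chunk_size: int = 512) -> pathlib.Path:
    """Write a manifest.json under kb_dir/domain_id (sketch of the tool's output)."""
    manifest_dir = pathlib.Path(kb_dir) / domain_id
    manifest_dir.mkdir(parents=True, exist_ok=True)
    manifest = {
        "domain_id": domain_id,
        "resources": resources,
        "chunking": {"chunk_size": chunk_size},
    }
    manifest_path = manifest_dir / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest_path

kb_dir = tempfile.mkdtemp()
path = write_manifest(kb_dir, "support", [{"path": "faq.md", "type": "file"}])
loaded = json.loads(path.read_text(encoding="utf-8"))
```

The directory holding the manifest then becomes documents_path in the generated kb_config.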

KnowledgeSearchTool

KnowledgeSearchTool(knowledge_base: Any, name: str = 'knowledge_search')

Bases: ContextAwareTool

Tool for searching the knowledge base.

This tool allows LLMs to search the bot's knowledge base for relevant information during conversations.

Demonstrates the umbrella pattern for tools:

- Static dependency: knowledge_base (via constructor injection)
- Dynamic context: conversation_id, user_id (via ToolExecutionContext)

Example
# Create tool with knowledge base (static dependency)
tool = KnowledgeSearchTool(knowledge_base=kb)

# Register with bot
bot.tool_registry.register_tool(tool)

# LLM can now call the tool
# Context is automatically injected by reasoning strategy
results = await tool.execute(
    query="How do I configure the database?",
    max_results=3
)

Initialize knowledge search tool.

Parameters:

Name Type Description Default
knowledge_base Any

RAGKnowledgeBase instance to search

required
name str

Tool name (default: knowledge_search)

'knowledge_search'

Methods:

Name Description
catalog_metadata

Return catalog metadata for this tool class.

execute_with_context

Execute knowledge base search with context.

Attributes:

Name Type Description
schema dict[str, Any]

Get JSON schema for tool parameters.

Source code in packages/bots/src/dataknobs_bots/tools/knowledge_search.py
def __init__(self, knowledge_base: Any, name: str = "knowledge_search"):
    """Initialize knowledge search tool.

    Args:
        knowledge_base: RAGKnowledgeBase instance to search
        name: Tool name (default: knowledge_search)
    """
    super().__init__(
        name=name,
        description="Search the knowledge base for relevant information. "
        "Use this when you need to find documentation, examples, or "
        "specific information to answer user questions.",
    )
    # Static dependency - doesn't change per-request
    self.knowledge_base = knowledge_base
Attributes
schema property
schema: dict[str, Any]

Get JSON schema for tool parameters.

Returns:

Type Description
dict[str, Any]

JSON Schema for the tool parameters

Functions
catalog_metadata classmethod
catalog_metadata() -> dict[str, Any]

Return catalog metadata for this tool class.

Source code in packages/bots/src/dataknobs_bots/tools/knowledge_search.py
@classmethod
def catalog_metadata(cls) -> dict[str, Any]:
    """Return catalog metadata for this tool class."""
    return {
        "name": "knowledge_search",
        "description": (
            "Search the knowledge base for relevant information."
        ),
        "tags": ("general", "rag"),
        "requires": ("knowledge_base",),
    }
execute_with_context async
execute_with_context(
    context: ToolExecutionContext,
    query: str,
    max_results: int = 3,
    **kwargs: Any,
) -> dict[str, Any]

Execute knowledge base search with context.

Parameters:

Name Type Description Default
context ToolExecutionContext

Execution context with conversation/user info

required
query str

Search query text

required
max_results int

Maximum number of results (default: 3)

3
**kwargs Any

Additional arguments (ignored)

{}

Returns:

Type Description
dict[str, Any]

Dictionary with search results:

- query: Original query
- results: List of relevant chunks
- num_results: Number of results found
- conversation_id: ID of conversation (if available)

Example
result = await tool.execute(
    query="How do I configure the database?",
    max_results=3
)
for chunk in result['results']:
    print(f"{chunk['heading_path']}: {chunk['text']}")
Source code in packages/bots/src/dataknobs_bots/tools/knowledge_search.py
async def execute_with_context(
    self,
    context: ToolExecutionContext,
    query: str,
    max_results: int = 3,
    **kwargs: Any,
) -> dict[str, Any]:
    """Execute knowledge base search with context.

    Args:
        context: Execution context with conversation/user info
        query: Search query text
        max_results: Maximum number of results (default: 3)
        **kwargs: Additional arguments (ignored)

    Returns:
        Dictionary with search results:
            - query: Original query
            - results: List of relevant chunks
            - num_results: Number of results found
            - conversation_id: ID of conversation (if available)

    Example:
        ```python
        result = await tool.execute(
            query="How do I configure the database?",
            max_results=3
        )
        for chunk in result['results']:
            print(f"{chunk['heading_path']}: {chunk['text']}")
        ```
    """
    # Clamp max_results to valid range
    max_results = max(1, min(10, max_results))

    # Log search with context for observability
    logger.debug(
        "Knowledge search",
        extra={
            "query": query,
            "max_results": max_results,
            "conversation_id": context.conversation_id,
            "user_id": context.user_id,
        },
    )

    # Search knowledge base
    results = await self.knowledge_base.query(query, k=max_results)

    # Format response with optional context info
    response: dict[str, Any] = {
        "query": query,
        "results": [
            {
                "text": r["text"],
                "source": r["source"],
                "heading": r["heading_path"],
                "similarity": round(r["similarity"], 3),
            }
            for r in results
        ],
        "num_results": len(results),
    }

    # Include conversation_id for traceability if available
    if context.conversation_id:
        response["conversation_id"] = context.conversation_id

    return response
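Two details of the implementation above are easy to miss: max_results is silently clamped to 1..10, and similarity scores are rounded to three decimals in the response. A stdlib sketch of both:

```python
def clamp_max_results(n: int) -> int:
    """Clamp to the tool's valid 1..10 range."""
    return max(1, min(10, n))

def format_results(query: str, raw: list[dict]) -> dict:
    """Shape raw KB hits into the tool's response payload (sketch)."""
    return {
        "query": query,
        "results": [
            {
                "text": r["text"],
                "source": r["source"],
                "heading": r["heading_path"],
                "similarity": round(r["similarity"], 3),
            }
            for r in raw
        ],
        "num_results": len(raw),
    }

raw_hits = [
    {"text": "Set the DB URL in config.yaml.", "source": "config.md",
     "heading_path": "Setup > Database", "similarity": 0.87654},
]
response = format_results("How do I configure the database?", raw_hits)
```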

ListAvailableToolsTool

ListAvailableToolsTool(available_tools: list[dict[str, Any]])

Bases: ContextAwareTool

Tool for listing tools available to configure for a bot.

Takes a constructor-injected catalog of available tools and lets the LLM browse them, optionally filtering by category. The catalog data is consumer-specific — each DynaBot consumer provides its own list.

Attributes:

Name Type Description
_tools

The available tool catalog.

Initialize the tool.

Parameters:

Name Type Description Default
available_tools list[dict[str, Any]]

List of tool descriptors. Each dict should have at minimum name and description keys. Optional: category, params, class.

required

Methods:

Name Description
catalog_metadata

Return catalog metadata for this tool class.

execute_with_context

List available tools.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
def __init__(self, available_tools: list[dict[str, Any]]) -> None:
    """Initialize the tool.

    Args:
        available_tools: List of tool descriptors. Each dict should
            have at minimum ``name`` and ``description`` keys.
            Optional: ``category``, ``params``, ``class``.
    """
    super().__init__(
        name="list_available_tools",
        description=(
            "List tools that can be added to the bot configuration. "
            "Optionally filter by category."
        ),
    )
    self._tools = available_tools
Attributes
schema property
schema: dict[str, Any]

Return JSON Schema for tool parameters.

Functions
catalog_metadata classmethod
catalog_metadata() -> dict[str, Any]

Return catalog metadata for this tool class.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
@classmethod
def catalog_metadata(cls) -> dict[str, Any]:
    """Return catalog metadata for this tool class."""
    return {
        "name": "list_available_tools",
        "description": (
            "List tools that can be added to the bot configuration."
        ),
        "tags": ("configbot",),
    }
execute_with_context async
execute_with_context(
    context: ToolExecutionContext, category: str | None = None, **kwargs: Any
) -> dict[str, Any]

List available tools.

Parameters:

Name Type Description Default
context ToolExecutionContext

Execution context.

required
category str | None

Optional category to filter by.

None

Returns:

Type Description
dict[str, Any]

Dict with matching tools, count, and available categories.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
async def execute_with_context(
    self,
    context: ToolExecutionContext,
    category: str | None = None,
    **kwargs: Any,
) -> dict[str, Any]:
    """List available tools.

    Args:
        context: Execution context.
        category: Optional category to filter by.

    Returns:
        Dict with matching tools, count, and available categories.
    """
    if category:
        filtered = [
            t for t in self._tools
            if t.get("category", "").lower() == category.lower()
        ]
    else:
        filtered = list(self._tools)

    categories = sorted({
        t["category"] for t in self._tools if "category" in t
    })

    logger.debug(
        "Listed %d available tools (category=%s)",
        len(filtered),
        category,
        extra={"conversation_id": context.conversation_id},
    )

    return {
        "tools": filtered,
        "count": len(filtered),
        "categories": categories,
    }

ListKBResourcesTool

ListKBResourcesTool()

Bases: ContextAwareTool

Tool for listing currently tracked knowledge base resources.

Reads _kb_resources and _source_path_resolved from wizard data to show what resources have been added so far.

Initialize the tool.

Methods:

Name Description
catalog_metadata

Return catalog metadata for this tool class.

execute_with_context

List KB resources.

Attributes:

Name Type Description
schema dict[str, Any]

Return JSON Schema for tool parameters.

Source code in packages/bots/src/dataknobs_bots/tools/kb_tools.py
def __init__(self) -> None:
    """Initialize the tool."""
    super().__init__(
        name="list_kb_resources",
        description=(
            "List the knowledge base resources that have been added "
            "to the current bot configuration."
        ),
    )
Attributes
schema property
schema: dict[str, Any]

Return JSON Schema for tool parameters.

Functions
catalog_metadata classmethod
catalog_metadata() -> dict[str, Any]

Return catalog metadata for this tool class.

Source code in packages/bots/src/dataknobs_bots/tools/kb_tools.py
@classmethod
def catalog_metadata(cls) -> dict[str, Any]:
    """Return catalog metadata for this tool class."""
    return {
        "name": "list_kb_resources",
        "description": (
            "List the knowledge base resources added to the "
            "current bot configuration."
        ),
        "tags": ("configbot", "kb"),
    }
execute_with_context async
execute_with_context(
    context: ToolExecutionContext, **kwargs: Any
) -> dict[str, Any]

List KB resources.

Parameters:

Name Type Description Default
context ToolExecutionContext

Execution context with wizard state.

required

Returns:

Type Description
dict[str, Any]

Dict with resource list and source path.

Source code in packages/bots/src/dataknobs_bots/tools/kb_tools.py
async def execute_with_context(
    self,
    context: ToolExecutionContext,
    **kwargs: Any,
) -> dict[str, Any]:
    """List KB resources.

    Args:
        context: Execution context with wizard state.

    Returns:
        Dict with resource list and source path.
    """
    wizard_data = _get_wizard_data_ref(context)
    resources = wizard_data.get("_kb_resources", [])
    source_path = wizard_data.get("_source_path_resolved")

    logger.debug(
        "Listed %d KB resources",
        len(resources),
        extra={"conversation_id": context.conversation_id},
    )

    return {
        "resources": resources,
        "count": len(resources),
        "source_path": source_path,
    }

ListTemplatesTool

ListTemplatesTool(template_registry: ConfigTemplateRegistry)

Bases: ContextAwareTool

Tool for listing available configuration templates.

Allows the LLM to discover what templates are available, optionally filtered by tags.

Attributes:

Name Type Description
_registry

Template registry to query.

Initialize the tool.

Parameters:

Name Type Description Default
template_registry ConfigTemplateRegistry

Registry containing available templates.

required

Methods:

Name Description
catalog_metadata

Return catalog metadata for this tool class.

from_config

Create from YAML-compatible configuration.

execute_with_context

List available templates.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
def __init__(self, template_registry: ConfigTemplateRegistry) -> None:
    """Initialize the tool.

    Args:
        template_registry: Registry containing available templates.
    """
    super().__init__(
        name="list_templates",
        description=(
            "List available bot configuration templates. "
            "Optionally filter by tags to find templates for "
            "specific use cases."
        ),
    )
    self._registry = template_registry
Attributes
schema property
schema: dict[str, Any]

Return JSON Schema for tool parameters.

Functions
catalog_metadata classmethod
catalog_metadata() -> dict[str, Any]

Return catalog metadata for this tool class.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
@classmethod
def catalog_metadata(cls) -> dict[str, Any]:
    """Return catalog metadata for this tool class."""
    return {
        "name": "list_templates",
        "description": (
            "List available bot configuration templates."
        ),
        "tags": ("configbot",),
        "requires": ("template_registry",),
        "default_params": {"template_dir": "configs/templates"},
    }
from_config classmethod
from_config(config: dict[str, Any]) -> ListTemplatesTool

Create from YAML-compatible configuration.

Parameters:

Name Type Description Default
config dict[str, Any]

Dict with template_dir key pointing to a directory containing template YAML files.

required

Returns:

Type Description
ListTemplatesTool

Configured ListTemplatesTool instance.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
@classmethod
def from_config(cls, config: dict[str, Any]) -> ListTemplatesTool:
    """Create from YAML-compatible configuration.

    Args:
        config: Dict with ``template_dir`` key pointing to a
            directory containing template YAML files.

    Returns:
        Configured ListTemplatesTool instance.
    """
    from pathlib import Path

    template_dir = config.get("template_dir", "configs/templates")
    registry = ConfigTemplateRegistry()
    path = Path(template_dir)
    if path.is_dir():
        registry.load_from_directory(path)
    return cls(template_registry=registry)
execute_with_context async
execute_with_context(
    context: ToolExecutionContext, tags: list[str] | None = None, **kwargs: Any
) -> dict[str, Any]

List available templates.

Parameters:

Name Type Description Default
context ToolExecutionContext

Execution context.

required
tags list[str] | None

Optional tags to filter by.

None

Returns:

Type Description
dict[str, Any]

Dict with list of template summaries.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
async def execute_with_context(
    self,
    context: ToolExecutionContext,
    tags: list[str] | None = None,
    **kwargs: Any,
) -> dict[str, Any]:
    """List available templates.

    Args:
        context: Execution context.
        tags: Optional tags to filter by.

    Returns:
        Dict with list of template summaries.
    """
    templates = self._registry.list_templates(tags=tags)

    logger.debug(
        "Listed %d templates (tags=%s)",
        len(templates),
        tags,
        extra={"conversation_id": context.conversation_id},
    )

    return {
        "templates": [
            {
                "name": t.name,
                "description": t.description,
                "version": t.version,
                "tags": t.tags,
                "variables_count": len(t.variables),
                "required_variables": [
                    v.name for v in t.get_required_variables()
                ],
            }
            for t in templates
        ],
        "count": len(templates),
    }

PreviewConfigTool

PreviewConfigTool(
    builder_factory: Callable[[dict[str, Any]], DynaBotConfigBuilder],
)

Bases: ContextAwareTool

Tool for previewing the configuration being built.

Uses a consumer-provided builder_factory to construct the configuration from wizard data. This is the key extension point: the factory encapsulates domain-specific logic.

Attributes:

Name Type Description
_builder_factory

Callable that creates a configured builder from wizard data.

Initialize the tool.

Parameters:

Name Type Description Default
builder_factory Callable[[dict[str, Any]], DynaBotConfigBuilder]

Function that takes wizard collected data and returns a configured DynaBotConfigBuilder. This is where consumers inject domain-specific config logic.

required

Methods:

Name Description
catalog_metadata

Return catalog metadata for this tool class.

from_config

Create from YAML-compatible configuration.

execute_with_context

Preview the current configuration.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
def __init__(
    self,
    builder_factory: Callable[[dict[str, Any]], DynaBotConfigBuilder],
) -> None:
    """Initialize the tool.

    Args:
        builder_factory: Function that takes wizard collected data
            and returns a configured DynaBotConfigBuilder. This is
            where consumers inject domain-specific config logic.
    """
    super().__init__(
        name="preview_config",
        description=(
            "Preview the bot configuration being built from the "
            "current wizard data. Shows what the final config will "
            "look like."
        ),
    )
    self._builder_factory = builder_factory
Attributes
schema property
schema: dict[str, Any]

Return JSON Schema for tool parameters.

Functions
catalog_metadata classmethod
catalog_metadata() -> dict[str, Any]

Return catalog metadata for this tool class.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
@classmethod
def catalog_metadata(cls) -> dict[str, Any]:
    """Return catalog metadata for this tool class."""
    return {
        "name": "preview_config",
        "description": (
            "Preview the bot configuration being built from "
            "the current wizard data."
        ),
        "tags": ("configbot",),
        "requires": ("builder_factory",),
    }
from_config classmethod
from_config(config: dict[str, Any]) -> PreviewConfigTool

Create from YAML-compatible configuration.

Parameters:

Name Type Description Default
config dict[str, Any]

Dict with builder_factory key — a dotted import path to a callable that accepts wizard data and returns a DynaBotConfigBuilder.

required

Returns:

Type Description
PreviewConfigTool

Configured PreviewConfigTool instance.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
@classmethod
def from_config(cls, config: dict[str, Any]) -> PreviewConfigTool:
    """Create from YAML-compatible configuration.

    Args:
        config: Dict with ``builder_factory`` key — a dotted
            import path to a callable that accepts wizard data
            and returns a ``DynaBotConfigBuilder``.

    Returns:
        Configured PreviewConfigTool instance.
    """
    from .resolve import resolve_callable

    factory_ref = config["builder_factory"]
    factory = resolve_callable(factory_ref)
    return cls(builder_factory=factory)
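A `from_config` entry for this tool might look like the following fragment. The dotted path `my_app.config.build_bot_config` is hypothetical; it stands in for any importable callable that accepts wizard data and returns a `DynaBotConfigBuilder`:

```yaml
# Hypothetical PreviewConfigTool.from_config payload — only the
# builder_factory key is required; the dotted path is illustrative.
builder_factory: my_app.config.build_bot_config
```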
execute_with_context async
execute_with_context(
    context: ToolExecutionContext, format: str = "summary", **kwargs: Any
) -> dict[str, Any]

Preview the current configuration.

Parameters:

Name Type Description Default
context ToolExecutionContext

Execution context with wizard state.

required
format str

Output format ('summary', 'full', or 'yaml').

'summary'

Returns:

Type Description
dict[str, Any]

Dict with the configuration preview.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
async def execute_with_context(
    self,
    context: ToolExecutionContext,
    format: str = "summary",
    **kwargs: Any,
) -> dict[str, Any]:
    """Preview the current configuration.

    Args:
        context: Execution context with wizard state.
        format: Output format ('summary', 'full', or 'yaml').

    Returns:
        Dict with the configuration preview.
    """
    wizard_data = _get_wizard_data(context)
    if not wizard_data:
        return {"error": "No wizard data available for preview"}

    try:
        builder = self._builder_factory(wizard_data)
        config = builder._build_internal()
    except Exception as e:
        logger.exception("Failed to build config for preview")
        return {"error": f"Failed to build configuration: {e}"}

    logger.debug(
        "Generated config preview (format=%s)",
        format,
        extra={"conversation_id": context.conversation_id},
    )

    if format == "yaml":
        return {"yaml": yaml.dump(config, default_flow_style=False, sort_keys=False)}
    elif format == "full":
        return {"config": config}
    else:
        return _build_summary(config)

RemoveKBResourceTool

RemoveKBResourceTool()

Bases: ContextAwareTool

Tool for removing a resource from the knowledge base resource list.

Wizard data read/written:

- _kb_resources: list[dict] — resource list (removed by path)

Initialize the tool.

Methods:

Name Description
catalog_metadata

Return catalog metadata for this tool class.

execute_with_context

Remove a KB resource.

Attributes:

Name Type Description
schema dict[str, Any]

Return JSON Schema for tool parameters.

Source code in packages/bots/src/dataknobs_bots/tools/kb_tools.py
def __init__(self) -> None:
    """Initialize the tool."""
    super().__init__(
        name="remove_kb_resource",
        description=(
            "Remove a resource from the bot's knowledge base "
            "resource list."
        ),
    )
Attributes
schema property
schema: dict[str, Any]

Return JSON Schema for tool parameters.

Functions
catalog_metadata classmethod
catalog_metadata() -> dict[str, Any]

Return catalog metadata for this tool class.

Source code in packages/bots/src/dataknobs_bots/tools/kb_tools.py
@classmethod
def catalog_metadata(cls) -> dict[str, Any]:
    """Return catalog metadata for this tool class."""
    return {
        "name": "remove_kb_resource",
        "description": (
            "Remove a resource from the bot's knowledge base "
            "resource list."
        ),
        "tags": ("configbot", "kb"),
    }
execute_with_context async
execute_with_context(
    context: ToolExecutionContext, path: str, **kwargs: Any
) -> dict[str, Any]

Remove a KB resource.

Parameters:

Name Type Description Default
context ToolExecutionContext

Execution context with wizard state.

required
path str

Path of the resource to remove.

required

Returns:

Type Description
dict[str, Any]

Dict with removal result.

Source code in packages/bots/src/dataknobs_bots/tools/kb_tools.py
async def execute_with_context(
    self,
    context: ToolExecutionContext,
    path: str,
    **kwargs: Any,
) -> dict[str, Any]:
    """Remove a KB resource.

    Args:
        context: Execution context with wizard state.
        path: Path of the resource to remove.

    Returns:
        Dict with removal result.
    """
    wizard_data = _get_wizard_data_ref(context)
    resources: list[dict[str, Any]] = wizard_data.get("_kb_resources", [])

    original_count = len(resources)
    updated = [r for r in resources if r["path"] != path]

    if len(updated) == original_count:
        return {
            "success": False,
            "error": f"Resource not found: {path}",
            "available": [r["path"] for r in resources],
        }

    wizard_data["_kb_resources"] = updated

    logger.debug(
        "Removed KB resource: %s",
        path,
        extra={"conversation_id": context.conversation_id},
    )

    return {
        "success": True,
        "removed": path,
        "remaining_resources": len(updated),
    }
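The removal itself is a plain filter on the resource list. A self-contained sketch of that logic, mirroring the source above outside the tool class:

```python
# Mirrors execute_with_context: drop the entry whose "path" matches,
# and report an error when nothing matched.
resources = [{"path": "docs/a.md"}, {"path": "docs/b.md"}]
path = "docs/a.md"

updated = [r for r in resources if r["path"] != path]
if len(updated) == len(resources):
    result = {"success": False, "error": f"Resource not found: {path}"}
else:
    result = {
        "success": True,
        "removed": path,
        "remaining_resources": len(updated),
    }
```

Note that a second removal of the same path would hit the error branch, since the list no longer shrinks.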

SaveConfigTool

SaveConfigTool(
    draft_manager: ConfigDraftManager,
    on_save: Callable[[str, dict[str, Any]], Any] | None = None,
    builder_factory: Callable[[dict[str, Any]], DynaBotConfigBuilder]
    | None = None,
    portable: bool = False,
)

Bases: ContextAwareTool

Tool for saving/finalizing the configuration.

Finalizes the draft and writes the final config file. Optionally calls a consumer-provided callback for post-save actions (e.g., registering the bot with a manager).

When portable=True, the builder's build_portable() method is used instead of _build_internal(), producing a config with a bot wrapper key suitable for environment-aware deployment.

Attributes:

Name Type Description
_draft_manager

Draft manager for file operations.

_on_save

Optional callback invoked after successful save.

_builder_factory

Optional factory for building config from wizard data.

_portable

Whether to use portable (bot-wrapped) output format.

Initialize the tool.

Parameters:

Name Type Description Default
draft_manager ConfigDraftManager

Manager for draft file operations.

required
on_save Callable[[str, dict[str, Any]], Any] | None

Optional callback called with (config_name, config) after successful save. Can be used for post-save actions like bot registration.

None
builder_factory Callable[[dict[str, Any]], DynaBotConfigBuilder] | None

Optional factory to build final config from wizard data before saving.

None
portable bool

When True, use build_portable() for output (wraps config under bot key with custom sections as siblings). When False (default), use _build_internal() for flat format.

False

Methods:

Name Description
catalog_metadata

Return catalog metadata for this tool class.

from_config

Create from YAML-compatible configuration.

execute_with_context

Save the configuration.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
def __init__(
    self,
    draft_manager: ConfigDraftManager,
    on_save: Callable[[str, dict[str, Any]], Any] | None = None,
    builder_factory: Callable[[dict[str, Any]], DynaBotConfigBuilder] | None = None,
    portable: bool = False,
) -> None:
    """Initialize the tool.

    Args:
        draft_manager: Manager for draft file operations.
        on_save: Optional callback called with (config_name, config)
            after successful save. Can be used for post-save actions
            like bot registration.
        builder_factory: Optional factory to build final config from
            wizard data before saving.
        portable: When True, use ``build_portable()`` for output
            (wraps config under ``bot`` key with custom sections as
            siblings). When False (default), use ``_build_internal()``
            for flat format.
    """
    super().__init__(
        name="save_config",
        description=(
            "Save and finalize the bot configuration. Writes the "
            "final config file and optionally activates the bot."
        ),
    )
    self._draft_manager = draft_manager
    self._on_save = on_save
    self._builder_factory = builder_factory
    self._portable = portable
Attributes
schema property
schema: dict[str, Any]

Return JSON Schema for tool parameters.

Functions
catalog_metadata classmethod
catalog_metadata() -> dict[str, Any]

Return catalog metadata for this tool class.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
@classmethod
def catalog_metadata(cls) -> dict[str, Any]:
    """Return catalog metadata for this tool class."""
    return {
        "name": "save_config",
        "description": (
            "Save and finalize the bot configuration."
        ),
        "tags": ("configbot",),
        "requires": ("draft_manager",),
    }
from_config classmethod
from_config(config: dict[str, Any]) -> SaveConfigTool

Create from YAML-compatible configuration.

Parameters:

Name Type Description Default
config dict[str, Any]

Dict with keys:

- config_dir (str): Output directory for configs.
- builder_factory (str, optional): Dotted import path.
- on_save (str, optional): Dotted import path.
- portable (bool, optional): Use portable output format.

required

Returns:

Type Description
SaveConfigTool

Configured SaveConfigTool instance.

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
@classmethod
def from_config(cls, config: dict[str, Any]) -> SaveConfigTool:
    """Create from YAML-compatible configuration.

    Args:
        config: Dict with keys:
            - ``config_dir`` (str): Output directory for configs.
            - ``builder_factory`` (str, optional): Dotted import path.
            - ``on_save`` (str, optional): Dotted import path.
            - ``portable`` (bool, optional): Use portable output format.

    Returns:
        Configured SaveConfigTool instance.
    """
    from pathlib import Path

    config_dir = config.get("config_dir", "configs")
    manager = ConfigDraftManager(output_dir=Path(config_dir))

    on_save = None
    factory = None
    if "on_save" in config or "builder_factory" in config:
        from .resolve import resolve_callable

        if "on_save" in config:
            on_save = resolve_callable(config["on_save"])
        if "builder_factory" in config:
            factory = resolve_callable(config["builder_factory"])

    portable = config.get("portable", False)
    return cls(
        draft_manager=manager,
        on_save=on_save,
        builder_factory=factory,
        portable=portable,
    )
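Putting those keys together, a hypothetical `from_config` payload could look like this; the `my_app` dotted paths are illustrative, not part of the package:

```yaml
# Hypothetical SaveConfigTool.from_config payload — dotted paths must
# resolve to importable callables at load time.
config_dir: configs
builder_factory: my_app.config.build_bot_config
on_save: my_app.registry.register_bot
portable: true
```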
execute_with_context async
execute_with_context(
    context: ToolExecutionContext,
    config_name: str | None = None,
    activate: bool = False,
    **kwargs: Any,
) -> dict[str, Any]

Save the configuration.

Parameters:

Name Type Description Default
context ToolExecutionContext

Execution context with wizard state.

required
config_name str | None

Name for the config file.

None
activate bool

Whether to activate the bot.

False

Returns:

Type Description
dict[str, Any]

Dict with save result (success, file path, etc.).

Source code in packages/bots/src/dataknobs_bots/tools/config_tools.py
async def execute_with_context(
    self,
    context: ToolExecutionContext,
    config_name: str | None = None,
    activate: bool = False,
    **kwargs: Any,
) -> dict[str, Any]:
    """Save the configuration.

    Args:
        context: Execution context with wizard state.
        config_name: Name for the config file.
        activate: Whether to activate the bot.

    Returns:
        Dict with save result (success, file path, etc.).
    """
    wizard_data = _get_wizard_data(context)
    if not wizard_data:
        return {"success": False, "error": "No wizard data available"}

    # Determine config name
    name = config_name or wizard_data.get("domain_id") or wizard_data.get("config_name")
    if not name:
        return {
            "success": False,
            "error": "No config_name provided and no domain_id in wizard data",
        }

    # Build final config
    if self._builder_factory is not None:
        try:
            builder = self._builder_factory(wizard_data)
            if self._portable:
                config = builder.build_portable()
            else:
                config = builder._build_internal()
        except Exception as e:
            return {"success": False, "error": f"Failed to build configuration: {e}"}
    else:
        config = {
            k: v for k, v in wizard_data.items() if not k.startswith("_")
        }

    # Check for existing draft — finalize cleans up the draft file,
    # but we always use the freshly-built config (draft may be stale)
    draft_id = wizard_data.get("_draft_id")
    if draft_id:
        try:
            self._draft_manager.finalize(draft_id, final_name=name)
        except FileNotFoundError:
            logger.warning("Draft %s not found, saving directly", draft_id)
    final_config = config

    # Write the final file
    output_dir = self._draft_manager.output_dir
    output_dir.mkdir(parents=True, exist_ok=True)
    final_path = output_dir / f"{name}.yaml"
    with open(final_path, "w") as f:
        yaml.dump(final_config, f, default_flow_style=False, sort_keys=False)

    logger.info(
        "Saved configuration '%s' to %s",
        name,
        final_path,
        extra={
            "config_name": name,
            "activate": activate,
            "conversation_id": context.conversation_id,
        },
    )

    # Run consumer callback
    if self._on_save is not None:
        try:
            self._on_save(name, final_config)
        except Exception:
            logger.exception("on_save callback failed for '%s'", name)

    return {
        "success": True,
        "config_name": name,
        "file_path": str(final_path),
        "activated": activate,
    }

Functions

normalize_wizard_state

normalize_wizard_state(wizard_meta: dict[str, Any]) -> dict[str, Any]

Normalize wizard metadata to canonical structure.

Handles both old nested format (fsm_state.current_stage) and new flat format (current_stage directly).

Parameters:

Name Type Description Default
wizard_meta dict[str, Any]

Raw wizard metadata from manager or storage

required

Returns:

Type Description
dict[str, Any]

Normalized wizard state dict with canonical fields:
current_stage, stage_index, total_stages, progress, completed,
data, can_skip, can_go_back, suggestions, history, stages,
subflow_depth, and (when in a subflow) subflow_stage.

Source code in packages/bots/src/dataknobs_bots/bot/base.py
def normalize_wizard_state(wizard_meta: dict[str, Any]) -> dict[str, Any]:
    """Normalize wizard metadata to canonical structure.

    Handles both old nested format (fsm_state.current_stage) and
    new flat format (current_stage directly).

    Args:
        wizard_meta: Raw wizard metadata from manager or storage

    Returns:
        Normalized wizard state dict with canonical fields:
        current_stage, stage_index, total_stages, progress, completed,
        data, can_skip, can_go_back, suggestions, history, stages,
        subflow_depth, and (when in a subflow) subflow_stage.
    """
    # Handle nested fsm_state format (legacy)
    fsm_state = wizard_meta.get("fsm_state", {})

    # Prefer direct fields, fall back to fsm_state
    current_stage = (
        wizard_meta.get("current_stage")
        or wizard_meta.get("stage")  # Old response format
        or fsm_state.get("current_stage")
    )

    result: dict[str, Any] = {
        "current_stage": current_stage,
        "stage_index": (
            wizard_meta.get("stage_index") or fsm_state.get("stage_index", 0)
        ),
        "total_stages": wizard_meta.get("total_stages", 0),
        "progress": wizard_meta.get("progress", 0.0),
        "completed": wizard_meta.get("completed", False),
        "data": wizard_meta.get("data") or fsm_state.get("data", {}),
        "can_skip": wizard_meta.get("can_skip", False),
        "can_go_back": wizard_meta.get("can_go_back", True),
        "suggestions": wizard_meta.get("suggestions", []),
        "history": wizard_meta.get("history") or fsm_state.get("history", []),
        "stages": wizard_meta.get("stages", []),
    }

    # Subflow context: present when wizard is executing a subflow
    subflow_stage = wizard_meta.get("subflow_stage")
    if subflow_stage:
        result["subflow_stage"] = subflow_stage
        result["subflow_depth"] = 1  # _build_wizard_metadata exposes top subflow
    else:
        result["subflow_depth"] = 0

    return result
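The key compatibility step is the fallback chain for the current stage. A minimal, self-contained sketch of that resolution order, extracted from the function above:

```python
# Fallback order used for current_stage: the flat "current_stage"
# field wins, then the old response-format "stage" field, then the
# legacy nested fsm_state["current_stage"].
def resolve_stage(meta):
    fsm_state = meta.get("fsm_state", {})
    return (
        meta.get("current_stage")
        or meta.get("stage")
        or fsm_state.get("current_stage")
    )

flat = {"current_stage": "collect_name"}
old_response = {"stage": "collect_name"}
legacy = {"fsm_state": {"current_stage": "collect_name"}}

# All three formats normalize to the same stage name.
assert resolve_stage(flat) == resolve_stage(old_response) == resolve_stage(legacy)
```

When none of the three fields is present, the result is `None`, matching the behavior of the full function.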

create_default_catalog

create_default_catalog() -> ToolCatalog

Create a new ToolCatalog pre-populated with built-in tools.

Returns a fresh catalog (not the module-level singleton) so consumers can extend it without affecting other users of default_catalog.

Returns:

Type Description
ToolCatalog

New ToolCatalog with all built-in tools registered.

Source code in packages/bots/src/dataknobs_bots/config/tool_catalog.py
def create_default_catalog() -> ToolCatalog:
    """Create a new ToolCatalog pre-populated with built-in tools.

    Returns a fresh catalog (not the module-level singleton) so consumers
    can extend it without affecting other users of ``default_catalog``.

    Returns:
        New ToolCatalog with all built-in tools registered.
    """
    catalog = ToolCatalog()
    for entry in default_catalog.list_items():
        catalog.register_entry(entry)
    return catalog

create_knowledge_base_from_config async

create_knowledge_base_from_config(config: dict[str, Any]) -> KnowledgeBase

Create knowledge base from configuration.

Parameters:

Name Type Description Default
config dict[str, Any]

Knowledge base configuration with:

- type: Type of knowledge base (currently only 'rag' supported)
- vector_store: Vector store configuration
- embedding_provider: LLM provider for embeddings
- embedding_model: Model to use for embeddings
- chunking: Optional chunking configuration
- documents_path: Optional path to load documents
- document_pattern: Optional file pattern

required

Returns:

Type Description
KnowledgeBase

Configured knowledge base instance

Raises:

Type Description
ValueError

If knowledge base type is not supported

Example
config = {
    "type": "rag",
    "vector_store": {
        "backend": "memory",
        "dimensions": 384
    },
    "embedding_provider": "echo",
    "embedding_model": "test"
}
kb = await create_knowledge_base_from_config(config)
Source code in packages/bots/src/dataknobs_bots/knowledge/__init__.py
async def create_knowledge_base_from_config(config: dict[str, Any]) -> KnowledgeBase:
    """Create knowledge base from configuration.

    Args:
        config: Knowledge base configuration with:
            - type: Type of knowledge base (currently only 'rag' supported)
            - vector_store: Vector store configuration
            - embedding_provider: LLM provider for embeddings
            - embedding_model: Model to use for embeddings
            - chunking: Optional chunking configuration
            - documents_path: Optional path to load documents
            - document_pattern: Optional file pattern

    Returns:
        Configured knowledge base instance

    Raises:
        ValueError: If knowledge base type is not supported

    Example:
        ```python
        config = {
            "type": "rag",
            "vector_store": {
                "backend": "memory",
                "dimensions": 384
            },
            "embedding_provider": "echo",
            "embedding_model": "test"
        }
        kb = await create_knowledge_base_from_config(config)
        ```
    """
    kb_type = config.get("type", "rag").lower()

    if kb_type == "rag":
        return await RAGKnowledgeBase.from_config(config)
    else:
        raise ValueError(
            f"Unknown knowledge base type: {kb_type}. " f"Available types: rag"
        )

create_memory_from_config async

create_memory_from_config(
    config: dict[str, Any], llm_provider: Any | None = None
) -> Memory

Create memory instance from configuration.

Parameters:

Name Type Description Default
config dict[str, Any]

Memory configuration with 'type' field and type-specific params

required
llm_provider Any | None

Optional LLM provider instance, required for summary memory

None

Returns:

Type Description
Memory

Configured Memory instance

Raises:

Type Description
ValueError

If memory type is not recognized or required params missing

Example
# Buffer memory
config = {
    "type": "buffer",
    "max_messages": 10
}
memory = await create_memory_from_config(config)

# Vector memory
config = {
    "type": "vector",
    "backend": "faiss",
    "dimension": 768,
    "embedding_provider": "ollama",
    "embedding_model": "nomic-embed-text"
}
memory = await create_memory_from_config(config)

# Summary memory (uses bot's LLM as fallback)
config = {
    "type": "summary",
    "recent_window": 10,
}
memory = await create_memory_from_config(config, llm_provider=llm)

# Summary memory with its own dedicated LLM
config = {
    "type": "summary",
    "recent_window": 10,
    "llm": {
        "provider": "ollama",
        "model": "gemma3:1b",
    },
}
memory = await create_memory_from_config(config)

# Composite memory (multiple strategies)
config = {
    "type": "composite",
    "strategies": [
        {"type": "buffer", "max_messages": 50},
        {
            "type": "vector",
            "backend": "memory",
            "dimension": 768,
            "embedding_provider": "ollama",
            "embedding_model": "nomic-embed-text",
        },
    ],
    "primary": 0,
}
memory = await create_memory_from_config(config)
Source code in packages/bots/src/dataknobs_bots/memory/__init__.py
async def create_memory_from_config(
    config: dict[str, Any],
    llm_provider: Any | None = None,
) -> Memory:
    """Create memory instance from configuration.

    Args:
        config: Memory configuration with 'type' field and type-specific params
        llm_provider: Optional LLM provider instance, required for summary memory

    Returns:
        Configured Memory instance

    Raises:
        ValueError: If memory type is not recognized or required params missing

    Example:
        ```python
        # Buffer memory
        config = {
            "type": "buffer",
            "max_messages": 10
        }
        memory = await create_memory_from_config(config)

        # Vector memory
        config = {
            "type": "vector",
            "backend": "faiss",
            "dimension": 768,
            "embedding_provider": "ollama",
            "embedding_model": "nomic-embed-text"
        }
        memory = await create_memory_from_config(config)

        # Summary memory (uses bot's LLM as fallback)
        config = {
            "type": "summary",
            "recent_window": 10,
        }
        memory = await create_memory_from_config(config, llm_provider=llm)

        # Summary memory with its own dedicated LLM
        config = {
            "type": "summary",
            "recent_window": 10,
            "llm": {
                "provider": "ollama",
                "model": "gemma3:1b",
            },
        }
        memory = await create_memory_from_config(config)

        # Composite memory (multiple strategies)
        config = {
            "type": "composite",
            "strategies": [
                {"type": "buffer", "max_messages": 50},
                {
                    "type": "vector",
                    "backend": "memory",
                    "dimension": 768,
                    "embedding_provider": "ollama",
                    "embedding_model": "nomic-embed-text",
                },
            ],
            "primary": 0,
        }
        memory = await create_memory_from_config(config)
        ```
    """
    memory_type = config.get("type", "buffer").lower()

    if memory_type == "buffer":
        return BufferMemory(max_messages=config.get("max_messages", 10))

    elif memory_type == "vector":
        return await VectorMemory.from_config(config)

    elif memory_type == "summary":
        # Track whether a dedicated provider was created (owns lifecycle)
        # vs reusing the bot's main LLM (bot owns lifecycle)
        has_dedicated_llm = "llm" in config
        summary_llm = await _resolve_summary_llm(config, llm_provider)
        return SummaryMemory(
            llm_provider=summary_llm,
            recent_window=config.get("recent_window", 10),
            summary_prompt=config.get("summary_prompt"),
            owns_llm_provider=has_dedicated_llm,
        )

    elif memory_type == "composite":
        strategy_configs = config.get("strategies", [])
        strategies: list[Memory] = []
        try:
            for strategy_config in strategy_configs:
                strategy = await create_memory_from_config(
                    strategy_config, llm_provider
                )
                strategies.append(strategy)
            if not strategies:
                raise ValueError(
                    "Composite memory requires at least one strategy "
                    "in 'strategies' list"
                )
            return CompositeMemory(
                strategies=strategies,
                primary_index=config.get("primary", 0),
            )
        except Exception:
            # Clean up any already-initialized strategies
            for s in strategies:
                try:
                    await s.close()
                except Exception:
                    logger.warning(
                        "Failed to close strategy during cleanup: %s",
                        type(s).__name__,
                        exc_info=True,
                    )
            raise

    else:
        raise ValueError(
            f"Unknown memory type: {memory_type}. "
            f"Available types: buffer, composite, summary, vector"
        )

create_reasoning_from_config

create_reasoning_from_config(
    config: dict[str, Any], *, knowledge_base: Any | None = None
) -> ReasoningStrategy

Create reasoning strategy from configuration.

Delegates to the StrategyRegistry singleton. Built-in strategies (simple, react, wizard, grounded, hybrid) are registered automatically; third-party strategies can be added via register_strategy.

See each strategy class's from_config() for available config keys (e.g. ReActReasoning.from_config, WizardReasoning.from_config).

Parameters:

Name Type Description Default
config dict[str, Any]

Reasoning configuration dict. The strategy key selects the strategy type (default "simple"). All other keys are forwarded to the strategy's from_config() classmethod.

required
knowledge_base Any | None

Optional knowledge base instance forwarded as a kwarg to the strategy factory.

None

Returns:

Type Description
ReasoningStrategy

Configured reasoning strategy instance.

Raises:

Type Description
ValueError

If strategy type is not registered.

Example
# Simple reasoning
config = {"strategy": "simple"}
strategy = create_reasoning_from_config(config)

# Grounded reasoning (deterministic KB retrieval)
config = {
    "strategy": "grounded",
    "intent": {"mode": "extract", "num_queries": 3},
    "retrieval": {"top_k": 5},
}
strategy = create_reasoning_from_config(config, knowledge_base=kb)
Source code in packages/bots/src/dataknobs_bots/reasoning/__init__.py
def create_reasoning_from_config(
    config: dict[str, Any],
    *,
    knowledge_base: Any | None = None,
) -> ReasoningStrategy:
    """Create reasoning strategy from configuration.

    Delegates to the :class:`StrategyRegistry` singleton.  Built-in
    strategies (simple, react, wizard, grounded, hybrid) are registered
    automatically; 3rd-party strategies can be added via
    :func:`register_strategy`.

    See each strategy class's ``from_config()`` for available config
    keys (e.g. ``ReActReasoning.from_config``,
    ``WizardReasoning.from_config``).

    Args:
        config: Reasoning configuration dict.  The ``strategy`` key
            selects the strategy type (default ``"simple"``).  All
            other keys are forwarded to the strategy's
            ``from_config()`` classmethod.
        knowledge_base: Optional knowledge base instance forwarded
            as a kwarg to the strategy factory.

    Returns:
        Configured reasoning strategy instance.

    Raises:
        ValueError: If strategy type is not registered.

    Example:
        ```python
        # Simple reasoning
        config = {"strategy": "simple"}
        strategy = create_reasoning_from_config(config)

        # Grounded reasoning (deterministic KB retrieval)
        config = {
            "strategy": "grounded",
            "intent": {"mode": "extract", "num_queries": 3},
            "retrieval": {"top_k": 5},
        }
        strategy = create_reasoning_from_config(config, knowledge_base=kb)
        ```
    """
    return get_registry().create(config, knowledge_base=knowledge_base)

register_strategy

register_strategy(
    name: str, factory: StrategyFactory, *, override: bool = False
) -> None

Register a custom reasoning strategy.

Parameters:

Name Type Description Default
name str

Strategy name (used in reasoning.strategy config).

required
factory StrategyFactory

ReasoningStrategy subclass or factory callable.

required
override bool

Replace existing registration if True.

False

Example:

from dataknobs_bots.reasoning.registry import register_strategy

class MyStrategy(ReasoningStrategy):
    ...

register_strategy("my_strategy", MyStrategy)
Source code in packages/bots/src/dataknobs_bots/reasoning/registry.py
def register_strategy(
    name: str,
    factory: StrategyFactory,
    *,
    override: bool = False,
) -> None:
    """Register a custom reasoning strategy.

    Args:
        name: Strategy name (used in ``reasoning.strategy`` config).
        factory: ``ReasoningStrategy`` subclass or factory callable.
        override: Replace existing registration if ``True``.

    Example::

        from dataknobs_bots.reasoning.registry import register_strategy

        class MyStrategy(ReasoningStrategy):
            ...

        register_strategy("my_strategy", MyStrategy)
    """
    _registry.register(name, factory, override=override)

inject_providers

inject_providers(
    bot: Any,
    main_provider: AsyncLLMProvider | None = None,
    extraction_provider: AsyncLLMProvider | None = None,
    *,
    extractor: Any | None = None,
    **role_providers: AsyncLLMProvider,
) -> None

Inject LLM providers into a DynaBot instance for testing.

For main_provider, directly replaces bot.llm (the "main" role is always served from this attribute, not the registry catalog).

For extraction_provider and **role_providers, updates both the registry catalog and the actual subsystem wiring via set_provider().

For extractor, calls strategy.set_extractor() to replace the reasoning strategy's extractor entirely. Use this to inject a ConfigurableExtractor (which is not an AsyncLLMProvider and cannot be wired through set_provider()).

Lifecycle note: bot.close() closes self.llm (the main provider) unconditionally, so an injected main_provider is closed together with the bot. For subsystem providers (memory embedding, extraction), ownership flags control whether close() acts on them.

If bot does not implement register_provider, catalog registration is skipped; only subsystem wiring via set_provider() is performed.

Parameters:

Name Type Description Default
bot Any

A DynaBot instance (or any object with llm and reasoning_strategy attributes).

required
main_provider AsyncLLMProvider | None

Provider to use for main LLM calls. If None, the existing provider is kept.

None
extraction_provider AsyncLLMProvider | None

Provider to use for schema extraction. If None, the existing provider is kept.

None
extractor Any | None

A ConfigurableExtractor (or compatible object) to replace the wizard's SchemaExtractor directly. Mutually exclusive with extraction_provider.

None
**role_providers AsyncLLMProvider

Additional providers keyed by role name (e.g. memory_embedding=echo_provider). Each provider is registered in the catalog AND wired into the owning subsystem via set_provider().

{}
Example
from dataknobs_llm import EchoProvider
from dataknobs_bots.testing import inject_providers

main = EchoProvider()
extraction = EchoProvider()
inject_providers(bot, main, extraction)
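The example above covers only the positional providers. The wiring contract for the keyword paths (main replaces bot.llm directly; extraction and role providers go through the catalog and set_provider(); extractor and extraction_provider are mutually exclusive) can be sketched with stubs. Everything below is a stand-in re-implemented inline for illustration, not the real DynaBot, provider, or inject_providers code, and the "extraction" role string stands in for the PROVIDER_ROLE_EXTRACTION constant:

```python
# Stub stand-ins for DynaBot, its reasoning strategy, and providers.
class StubProvider:
    def __init__(self, name):
        self.name = name

class StubStrategy:
    def __init__(self):
        self.providers = {}

    def set_provider(self, role, provider):
        self.providers[role] = provider

class StubBot:
    def __init__(self):
        self.llm = StubProvider("original-main")
        self.reasoning_strategy = StubStrategy()
        self.catalog = {}

    def register_provider(self, role, provider):
        self.catalog[role] = provider

def inject(bot, main=None, extraction=None, *, extractor=None):
    # Mirrors the documented mutual-exclusion check.
    if extractor is not None and extraction is not None:
        raise ValueError("extractor and extraction_provider are mutually exclusive")
    if main is not None:
        bot.llm = main  # "main" role bypasses the catalog entirely
    if extraction is not None:
        bot.register_provider("extraction", extraction)  # catalog entry
        bot.reasoning_strategy.set_provider("extraction", extraction)  # wiring

bot = StubBot()
inject(bot, main=StubProvider("echo"), extraction=StubProvider("echo-x"))
assert bot.llm.name == "echo"
assert bot.catalog["extraction"].name == "echo-x"
```

Note that the main provider lands only on `bot.llm` and never in the catalog, matching the statement above that the "main" role is served from that attribute.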
Source code in packages/bots/src/dataknobs_bots/testing.py
def inject_providers(
    bot: Any,
    main_provider: AsyncLLMProvider | None = None,
    extraction_provider: AsyncLLMProvider | None = None,
    *,
    extractor: Any | None = None,
    **role_providers: AsyncLLMProvider,
) -> None:
    """Inject LLM providers into a DynaBot instance for testing.

    For ``main_provider``, directly replaces ``bot.llm`` (the ``"main"``
    role is always served from this attribute, not the registry catalog).

    For ``extraction_provider`` and ``**role_providers``, updates both the
    registry catalog and the actual subsystem wiring via ``set_provider()``.

    For ``extractor``, calls ``strategy.set_extractor()`` to replace
    the reasoning strategy's extractor entirely.  Use this to inject a
    ``ConfigurableExtractor`` (which is not an ``AsyncLLMProvider`` and
    cannot be wired through ``set_provider()``).

    **Lifecycle note:** ``bot.close()`` will close ``self.llm`` (the main
    provider) unconditionally — the caller should be aware that an
    injected ``main_provider`` will be closed when the bot is closed.
    For subsystem providers (memory embedding, extraction), ownership
    flags control whether ``close()`` acts on them.

    If ``bot`` does not implement ``register_provider``, catalog
    registration is skipped; only subsystem wiring via ``set_provider()``
    is performed.

    Args:
        bot: A DynaBot instance (or any object with ``llm`` and
            ``reasoning_strategy`` attributes).
        main_provider: Provider to use for main LLM calls. If None,
            the existing provider is kept.
        extraction_provider: Provider to use for schema extraction.
            If None, the existing provider is kept.
        extractor: A ``ConfigurableExtractor`` (or compatible object)
            to replace the wizard's ``SchemaExtractor`` directly.
            Mutually exclusive with ``extraction_provider``.
        **role_providers: Additional providers keyed by role name
            (e.g. ``memory_embedding=echo_provider``).  Each provider
            is registered in the catalog AND wired into the owning
            subsystem via ``set_provider()``.

    Example:
        ```python
        from dataknobs_llm import EchoProvider
        from dataknobs_bots.testing import inject_providers

        main = EchoProvider()
        extraction = EchoProvider()
        inject_providers(bot, main, extraction)
        ```
    """
    if extractor is not None and extraction_provider is not None:
        raise ValueError(
            "extractor and extraction_provider are mutually exclusive"
        )

    if main_provider is not None:
        bot.llm = main_provider

    if extractor is not None:
        strategy = getattr(bot, "reasoning_strategy", None)
        if strategy is not None and hasattr(strategy, "set_extractor"):
            strategy.set_extractor(extractor)
        else:
            logger.warning(
                "Bot has no reasoning_strategy.set_extractor — "
                "skipping extractor injection"
            )

    if extraction_provider is not None:
        from dataknobs_bots.bot.base import PROVIDER_ROLE_EXTRACTION

        # Update the registry entry
        if hasattr(bot, "register_provider"):
            bot.register_provider(PROVIDER_ROLE_EXTRACTION, extraction_provider)

        # Also update the actual extractor so subsystem calls use it
        strategy = getattr(bot, "reasoning_strategy", None)
        if strategy is None:
            logger.warning(
                "Bot has no reasoning_strategy — skipping extraction provider injection"
            )
        elif hasattr(strategy, "set_provider"):
            strategy.set_provider(PROVIDER_ROLE_EXTRACTION, extraction_provider)
        else:
            # Fallback for strategies without set_provider (e.g. test stubs)
            extractor = getattr(strategy, "_extractor", None)
            if extractor is None:
                logger.warning(
                    "Reasoning strategy has no _extractor — "
                    "skipping extraction provider injection"
                )
            else:
                extractor.provider = extraction_provider
                if hasattr(extractor, "_owns_provider"):
                    extractor._owns_provider = False

    # Wire role-based providers into catalog AND subsystems
    for role, provider in role_providers.items():
        if hasattr(bot, "register_provider"):
            bot.register_provider(role, provider)

        # Wire into the actual subsystem that owns this role
        _wire_role_provider(bot, role, provider)