Testing DynaBot¶

This guide covers the testing utilities provided by dataknobs-bots for writing reliable, maintainable tests for wizard bots and DynaBot interactions.

Overview¶

Testing DynaBot wizard flows requires wiring together an LLM provider, conversation storage, extraction, and wizard configuration. The dataknobs_bots.testing module provides high-level helpers that reduce this from ~50 lines of boilerplate to ~5 lines.

Key principle: Test wizard behavior through the public API (DynaBot.from_config() + bot.chat()) rather than by constructing WizardReasoning directly and poking internal state.

BotTestHarness¶

BotTestHarness is the preferred way to test wizard bots. It wraps DynaBot.from_config(), EchoProvider, and ConfigurableExtractor into a single object with chat()/greet() methods and automatic wizard state capture.

Basic Usage¶

from dataknobs_bots.testing import BotTestHarness, WizardConfigBuilder

config = (WizardConfigBuilder("my-wizard")
    .stage("gather", is_start=True, prompt="Tell me your name and topic.")
        .field("name", field_type="string", required=True)
        .field("topic", field_type="string", required=True)
        .transition("done", "data.get('name') and data.get('topic')")
    .stage("done", is_end=True, prompt="All done!")
    .build())

async with await BotTestHarness.create(
    wizard_config=config,
    main_responses=["Got it!", "All done!"],
    extraction_results=[
        [{"name": "Alice", "topic": "math"}],
    ],
) as harness:
    result = await harness.chat("I'm Alice and I like math")
    assert harness.wizard_data["name"] == "Alice"
    assert harness.wizard_data["topic"] == "math"
    assert harness.wizard_stage == "done"

What `create()` Does¶

Builds a bot config dict with EchoProvider as main LLM
Flattens extraction_results into a ConfigurableExtractor sequence
Calls DynaBot.from_config() to create a real bot
Injects the ConfigurableExtractor via inject_providers()
Queues main_responses on the EchoProvider

Properties¶

Property	Type	Description
`harness.wizard_stage`	`str \\| None`	Current wizard stage after last turn
`harness.wizard_data`	`dict`	Wizard state data after last turn
`harness.wizard_state`	`dict \\| None`	Full normalized wizard state
`harness.last_response`	`str`	Bot response from last turn
`harness.turn_count`	`int`	Number of turns executed
`harness.bot`	`DynaBot`	The underlying bot (for advanced assertions)
`harness.provider`	`EchoProvider`	Main provider (for call history)
`harness.extractor`	`ConfigurableExtractor \\| None`	Extractor (for call verification)

TurnResult¶

Each chat() and greet() call returns a TurnResult:

result = await harness.chat("hello")
assert result.response == "Got it!"
assert result.wizard_stage == "gather"
assert result.wizard_data == {"name": "Alice"}
assert result.turn_index == 1  # 1-based

Per-Turn Extraction Results¶

extraction_results groups extraction calls by turn. The harness flattens them into a ConfigurableExtractor sequence internally, but you think in turns:

extraction_results=[
    # Turn 1: one extraction call
    [{"name": "Alice"}],
    # Turn 2: two calls (initial + escalated scope)
    [{"domain_id": "chess"}, {"name": "Alice", "domain_id": "chess"}],
]

Full Bot Config¶

For complete control, pass bot_config instead of wizard_config:

async with await BotTestHarness.create(
    bot_config={
        "llm": {"provider": "echo", "model": "test"},
        "conversation_storage": {"backend": "memory"},
        "reasoning": {
            "strategy": "wizard",
            "wizard_config": {...},
            "extraction_config": {"provider": "echo", "model": "ext"},
        },
    },
    main_responses=["Hello!"],
) as harness:
    await harness.chat("hi")

WizardConfigBuilder¶

Fluent builder for wizard config dicts. Replaces verbose 40-line inline dicts with a readable chained API, and validates at build time.

from dataknobs_bots.testing import WizardConfigBuilder

config = (WizardConfigBuilder("quiz-wizard")
    .stage("gather", is_start=True, prompt="What topic?")
        .field("topic", field_type="string", required=True)
        .field("level", field_type="string", required=True, default="beginner")
        .transition("quiz", "data.get('topic') and data.get('level')")
    .stage("quiz", prompt="Answer the question.")
        .field("answer", field_type="string", required=True)
        .transition("done", "data.get('answer')")
    .stage("done", is_end=True, prompt="All done!")
    .settings(extraction_scope="current_message")
    .build())

Methods¶

Method	Description
`.stage(name, *, is_start, is_end, prompt, mode, extraction_scope, auto_advance, skip_extraction)`	Add a stage
`.field(name, *, field_type, required, description, enum, default, x_extraction)`	Add a field to the current stage
`.transition(target, condition, priority)`	Add a transition from the current stage
`.settings(**kwargs)`	Set wizard-level settings
`.build()`	Validate and return the config dict

Build-Time Validation¶

build() raises ValueError on:

No start stage defined
No end stage defined
Transition to a nonexistent stage name

inject_providers¶

Injects LLM providers and extractors into a DynaBot instance for testing.

from dataknobs_bots.testing import inject_providers
from dataknobs_llm import EchoProvider
from dataknobs_llm.testing import ConfigurableExtractor

bot = await DynaBot.from_config(config)

# Replace main LLM
inject_providers(bot, main_provider=EchoProvider({...}))

# Replace extraction provider (swaps provider inside existing SchemaExtractor)
inject_providers(bot, extraction_provider=EchoProvider({...}))

# Replace entire extractor (e.g., with ConfigurableExtractor)
inject_providers(bot, extractor=ConfigurableExtractor(results=[...]))

extractor and extraction_provider are mutually exclusive.

Extraction Testing Utilities¶

Two approaches for testing extraction, depending on what you need to verify:

ConfigurableExtractor (Bypass Extraction Pipeline)¶

Returns pre-configured results without calling any LLM. Use this when testing wizard flow behavior, not extraction quality:

from dataknobs_llm.testing import ConfigurableExtractor, SimpleExtractionResult

extractor = ConfigurableExtractor(results=[
    SimpleExtractionResult(data={"name": "Alice"}, confidence=0.9),
    SimpleExtractionResult(data={"topic": "math"}, confidence=0.5),
])

# Track calls
assert len(extractor.extract_calls) == 0
await extractor.extract("text", schema={})
assert len(extractor.extract_calls) == 1

scripted_schema_extractor (Real Extraction Pipeline)¶

Creates a real SchemaExtractor backed by scripted EchoProvider responses. Exercises the full extraction pipeline (prompt building, JSON parsing, confidence scoring):

from dataknobs_llm.testing import scripted_schema_extractor

extractor, ext_provider = scripted_schema_extractor([
    '{"name": "Alice", "topic": "math"}',
])

# Use with WizardReasoning directly
reasoning = WizardReasoning(wizard_fsm=fsm, extractor=extractor)

# Or inject into a bot
inject_providers(bot, extractor=extractor)

Anti-Patterns¶

Avoid these patterns in bot tests:

Anti-Pattern	Why It's Wrong	Use Instead
`WizardReasoning()` + `reasoning.generate(manager)`	Bypasses `from_config()`, middleware, context pipeline	`BotTestHarness.create()` + `harness.chat()`
`bot._conversation_managers` access	Couples to internal cache implementation	`bot.get_wizard_state()` or `harness.wizard_data`
`strategy._extractor = extractor`	Private attribute injection	`strategy.set_extractor(ext)` or `inject_providers(bot, extractor=ext)`
Per-file `_get_wizard_state()` helpers	Duplicated internal metadata access	`harness.wizard_stage` / `harness.wizard_data`
`MagicMock(spec=ConversationManager)`	Mocks hide integration bugs	`BotTestHarness` creates real bots via `from_config()`
Inline 40-line wizard config dicts	Verbose, error-prone, copy-pasted	`WizardConfigBuilder` fluent API

Exception: Internal-Method Unit Tests¶

Tests that verify WizardReasoning internal logic (_evaluate_condition, _can_auto_advance, transform flows) are legitimate unit tests. These may use WizardReasoning directly with the conversation_manager_pair conftest fixture. They test specific internal methods, not wizard flow behavior.