User Guide

Welcome to the Dataknobs User Guide. This section provides detailed information on using Dataknobs effectively for building knowledge-centric applications.

Getting Started

Core Capabilities

AI Agents & Chatbots

Package: dataknobs-bots

Configuration-driven AI agents with memory, RAG, reasoning strategies, and multi-tenancy.

Use Cases: Customer support bots, virtual assistants, knowledge base Q&A, multi-user chat systems

Configuration Management

Package: dataknobs-config

Flexible configuration with environment variable support, factory patterns, and cross-references.

Use Cases: Multi-environment deployments, dynamic backend selection, application configuration

Data Abstraction

Package: dataknobs-data

Unified interface across Memory, File, PostgreSQL, Elasticsearch, and S3 backends with transactions and migrations.

Use Cases: Backend-agnostic data access, ETL pipelines, multi-backend applications, data migration
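
A sketch of what backend-agnostic access looks like, reusing the PostgresDatabase and S3Database classes from the workflows below; the search and save method names are illustrative assumptions rather than the verified API:

from dataknobs_data import PostgresDatabase, S3Database

def archive_closed_tickets(source, target):
    # Both handles expose the same record interface, so the copy loop
    # does not care which backend sits behind each one.
    for record in source.search({"status": "closed"}):  # assumed method name
        target.save(record)                              # assumed method name

archive_closed_tickets(
    PostgresDatabase(connection_string="postgresql://..."),
    S3Database(bucket="ticket-archive"),
)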

Workflow Orchestration

Package: dataknobs-fsm

Finite State Machine framework for building robust, testable data processing pipelines.

Use Cases: ETL pipelines, data validation, multi-step processing, workflow automation

LLM Integration

Package: dataknobs-llm

Multi-provider LLM integration with prompt management, conversations, versioning, and tool calling.

Use Cases: Chatbots, content generation, code analysis, document summarization, Q&A systems
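
As a rough sketch of the provider/model configuration style (the same style used by the bot example below); LLMClient and its complete method are assumed names, not the documented entry point:

from dataknobs_llm import LLMClient  # assumed entry point

# Provider and model are selected by configuration, so switching
# providers is a config change rather than a code change.
llm = LLMClient({"provider": "openai", "model": "gpt-4"})

# Assumed call signature, shown for illustration only.
reply = llm.complete("Summarize the attached release notes in two sentences.")
print(reply)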

Data Structures

Package: dataknobs-structures

Core data structures for organizing knowledge: trees, documents, record stores, conditional dictionaries.

Use Cases: Hierarchical data, document management, knowledge graphs, data organization
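
For example, a small hierarchy might be modeled with the tree structure; Tree, add_child, and walk are assumed names used only to illustrate the idea:

from dataknobs_structures import Tree  # assumed import path

# Build a small product taxonomy.
root = Tree("products")
hardware = root.add_child("hardware")   # assumed method
hardware.add_child("sensors")
root.add_child("software")

# Traverse the hierarchy, e.g. to feed a search index.
for node in root.walk():                # assumed traversal helper
    print(node.data)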

Utilities

Package: dataknobs-utils

Utility functions for JSON manipulation, file operations, HTTP requests, and more.

Use Cases: JSON processing, file handling, search integration, API interactions
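
A quick sketch of the kind of JSON helper involved; the module and function names are assumptions, so check the dataknobs-utils reference for the exact API:

from dataknobs_utils import json_utils  # assumed module name

doc = {"user": {"name": "Ada", "roles": ["admin", "editor"]}}

# Flatten nested JSON into dotted key paths for indexing or diffing.
flat = json_utils.flatten(doc)          # assumed helper
# e.g. {"user.name": "Ada", "user.roles.0": "admin", "user.roles.1": "editor"}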

Text Processing

Package: dataknobs-xization

Text normalization, tokenization, masking, and lexical analysis for NLP and data processing.

Use Cases: Data anonymization, text preprocessing, NLP pipelines, search indexing
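
The normalization helper used in the scaled-up workflow below can also be called directly; the output described in the comment is an assumption about its default behavior:

from dataknobs_xization import normalize

raw = "  Call ME at 555-0100!  "
clean = normalize.basic_normalization_fn(raw)
print(clean)  # e.g. lower-cased, trimmed text ready for tokenization or masking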

Common Workflows

Building a Data Pipeline

Combine FSM, Data, and Config packages:

from dataknobs_fsm import SimpleFSM
from dataknobs_data import database_factory
from dataknobs_config import Config

# Load configuration
config = Config("pipeline.yaml")
config.register_factory("database", database_factory)

# Access database through config
source_db = config.get_instance("databases", "source")
target_db = config.get_instance("databases", "target")

# Define FSM workflow
fsm = SimpleFSM({
    "states": [...],
    "arcs": [...]
})

# Expose the databases to workflow states via the FSM context
fsm.context["source"] = source_db
fsm.context["target"] = target_db

# `data` is the input payload to run through the pipeline
result = fsm.process(data)

Building an AI Chatbot

Combine Bots, LLM, and Data packages:

from dataknobs_bots import BotRegistry
from dataknobs_data import PostgresDatabase

# Persistent storage for conversations
db = PostgresDatabase(connection_string="...")

# Configure bot with memory and RAG
bot_config = {
    "llm": {"provider": "openai", "model": "gpt-4"},
    "memory": {"type": "vector", "db": db},
    "knowledge_base": {"type": "elasticsearch", "index": "docs"}
}

registry = BotRegistry()
bot = registry.create_bot("support", bot_config)

# Multi-session conversations with persistence
response = bot.chat("How do I reset my password?", session_id="user123")

Processing Text at Scale

Combine FSM, Data, and Xization packages:

from dataknobs_fsm import SimpleFSM
from dataknobs_data import S3Database
from dataknobs_xization import normalize

# Read from S3, process, write back
s3_db = S3Database(bucket="documents")

fsm_config = {
    "name": "text_processor",
    "states": [
        {"name": "load", "is_start": True},
        {"name": "normalize"},
        {"name": "save", "is_end": True}
    ],
    "arcs": [
        {
            "from": "load",
            "to": "normalize",
            "transform": {
                "type": "inline",
                "code": "lambda data, ctx: normalize.basic_normalization_fn(data['text'])"
            }
        }
    ]
}

fsm = SimpleFSM(fsm_config)
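
To run documents through the pipeline, the S3 handle can be exposed via the FSM context and each payload passed to process, mirroring the data-pipeline example above; the "storage" key and payload shape are illustrative:

# Mirrors the data-pipeline example above; the context key and payload
# shape are illustrative assumptions.
fsm.context["storage"] = s3_db
result = fsm.process({"text": "Raw Document TEXT to normalize..."})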

Learning Path

Beginners - Start Here:

  1. Quick Start - Get familiar with basic concepts
  2. Basic Usage - Learn core data structures and utilities
  3. Examples - See practical applications

Intermediate - Build Applications:

  1. Configuration System - Environment management
  2. Data Abstraction - Backend-agnostic data access
  3. FSM Workflows - Build robust pipelines
  4. Advanced Usage - Advanced patterns

Advanced - AI & Complex Systems:

  1. LLM Integration - Integrate language models
  2. AI Agents - Build intelligent chatbots
  3. Streaming Workflows - Real-time processing
  4. Production Best Practices - Deploy at scale

Package Integration

Dataknobs packages are designed to work together seamlessly:

  • Config + Data: Dynamic backend configuration
  • Data + FSM: Database access in workflows
  • LLM + Bots: LLM integration in AI agents
  • Bots + Data: Persistent conversation memory
  • FSM + LLM: LLM calls in workflow states
  • Utils + Everything: Common utilities across all packages

Additional Resources