FSM Examples¶
This section contains practical examples showing how to use the FSM package for real-world data processing tasks. All examples use the actual FSM implementation with proper import paths and API usage.
Available Examples¶
Core Examples (In Repository)¶
These complete, runnable examples are located in packages/fsm/examples/:
1. Database ETL Pipeline (database_etl.py)¶
A comprehensive example showing how to build a production-ready ETL pipeline using the FSM framework.
Key Features:
- Uses SimpleFSM with DataHandlingMode.COPY for transaction safety
- Multi-stage data extraction, transformation, and loading
- Custom function registration for ETL operations
- Error handling with rollback states
- Batch processing with configurable size
- Data validation and quality checks
- Business metrics calculation (revenue, customer segments)
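The batching and metrics ideas above can be sketched in plain Python, independent of the FSM API (the `batches` helper and the revenue field are illustrative, not code from the example itself):

```python
from typing import Iterable, Iterator

def batches(rows: Iterable[dict], size: int) -> Iterator[list]:
    """Yield fixed-size batches from an iterable of records."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

def transform(row: dict) -> dict:
    # Example business metric: revenue = quantity * unit_price
    return {**row, "revenue": row["quantity"] * row["unit_price"]}

rows = [{"quantity": q, "unit_price": 2.0} for q in range(5)]
loaded = [transform(r) for batch in batches(rows, size=2) for r in batch]
print(len(loaded), loaded[3]["revenue"])  # 5 6.0
```

In the actual example, the extract, transform, and load steps are separate FSM states, with the batch size supplied through configuration.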
2. Data Processing Pipeline (data_pipeline_example.py)¶
A robust data processing pipeline demonstrating validation, enrichment, and aggregation.
Key Features:
- Custom transform functions (ITransformFunction interface)
- Data validation with required fields and type checking
- Data enrichment with computed fields and categorization
- Resource management for tracking processing statistics
- Error handling and recovery patterns
- Single record and batch processing modes
3. Data Validation Pipeline (data_validation_pipeline.py)¶
Demonstrates a validation pipeline for data quality assurance.
Key Features:
- Schema validation
- Data type checking
- Business rule validation
- Error collection and reporting
- Configurable validation rules
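The schema and type checks can be sketched in plain Python (a hypothetical `validate` helper, not the pipeline's actual code):

```python
def validate(record: dict, schema: dict) -> list:
    """Return a list of validation errors; an empty list means the record is valid."""
    errors = []
    for field, expected_type in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            errors.append(f"bad type for {field}: {type(record[field]).__name__}")
    return errors

schema = {"id": int, "email": str}
print(validate({"id": 1, "email": "a@example.com"}, schema))  # []
print(validate({"id": "x"}, schema))  # ['bad type for id: str', 'missing field: email']
```

In the FSM version, failed records are routed to an error-collection state instead of raising, so a bad record does not abort the batch.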
4. End-to-End Streaming (end_to_end_streaming.py)¶
Comprehensive demonstration of FSM's streaming capabilities for processing large datasets.
Key Features:
- True end-to-end streaming through FSM states
- File-to-file streaming with transformations
- Real-time stream processing from generators
- Multi-stage streaming pipelines
- Memory-efficient chunk-based processing
- Automatic backpressure management
- Progress tracking and monitoring
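The chunked, multi-stage streaming idea can be sketched with plain generators (an illustration of the pattern only; the FSM's streaming engine handles backpressure and state transitions itself):

```python
from typing import Iterator

def stream_chunks(source, chunk_size: int) -> Iterator[list]:
    """Group a (possibly unbounded) stream into memory-bounded chunks."""
    chunk = []
    for item in source:
        chunk.append(item)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk

def pipeline(source):
    # Multi-stage streaming: each stage consumes the previous one lazily,
    # so no stage ever materializes the full dataset.
    doubled = (x * 2 for x in source)
    for chunk in stream_chunks(doubled, chunk_size=3):
        yield sum(chunk)

print(list(pipeline(iter(range(7)))))  # [6, 24, 12]
```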
5. Large File Processor (large_file_processor.py)¶
Shows how to process large files efficiently using streaming.
Key Features:
- Streaming mode for memory efficiency
- Chunk-based processing
- Progress tracking
- Error recovery for partial failures
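The core memory-efficiency idea is reading fixed-size chunks rather than loading the whole file; a minimal sketch (the example's real processing and progress reporting are richer than this):

```python
import os
import tempfile

def process_file(path: str, chunk_size: int = 1024) -> int:
    """Read a file in fixed-size chunks so memory usage stays bounded."""
    bytes_seen = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            bytes_seen += len(chunk)  # a progress-tracking hook would go here
    return bytes_seen

# Demonstrate on a throwaway temp file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 5000)
print(process_file(tmp.name, chunk_size=1024))  # 5000
os.unlink(tmp.name)
```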
6. Advanced Debugging Examples (advanced_debugging.py, advanced_debugging_simple.py)¶
Demonstrates the AdvancedFSM debugging capabilities.
Key Features:
- Step-by-step execution
- Breakpoint debugging
- Execution hooks and monitoring
- State inspection
- Performance profiling
7. LLM Conversation System (llm_conversation.py)¶
An FSM-based conversational AI system.
Key Features:
- Conversation state management
- Context handling
- Multi-turn dialogue support
- Intent recognition states
- Response generation workflow
8. Text Normalization with Regex (normalize_file_example.py, normalize_file_with_regex.py)¶
Powerful text processing using regular expressions directly in YAML configurations.
Key Features:
- Regular expressions in inline transforms
- Field preservation pattern
- Whitespace and punctuation normalization
- Email and URL pattern matching
- Phone number and SSN masking
- Duplicate word removal
- Multiple format conversions (snake_case, kebab-case, CamelCase)
- Pattern extraction and detection
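A few of these normalizations can be sketched with the standard `re` module (the patterns below are illustrative, not copied from the example files):

```python
import re

def normalize(text: str) -> str:
    """Whitespace collapse, duplicate-word removal, and SSN masking."""
    text = re.sub(r"\s+", " ", text).strip()             # collapse whitespace runs
    text = re.sub(r"\b(\w+)(\s+\1)+\b", r"\1", text)     # drop repeated words
    text = re.sub(r"\b\d{3}-\d{2}-\d{4}\b", "***-**-****", text)  # mask SSNs
    return text

print(normalize("the  the   quick quick fox, SSN 123-45-6789"))
# the quick fox, SSN ***-**-****
```

In the examples themselves, equivalent patterns live inside the YAML configuration as inline transforms rather than in Python source.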
9. Regex Transform Configurations (regex_transforms.yaml, regex_workflow.yaml)¶
Ready-to-use YAML configurations for text processing.
Key Features:
- Field-by-field transformations
- All-in-one transform patterns
- Sensitive data masking
- Pattern extraction (emails, URLs, hashtags, mentions)
- Format standardization
Documentation Examples (Guides)¶
File Processing Workflow¶
Detailed guide on file processing patterns.
API Orchestration¶
Guide for coordinating multiple API calls.
LLM Conversation¶
Building conversation systems with FSM.
LLM Chain Processing¶
Multi-step LLM processing chains (to be implemented).
Quick Start Examples¶
Basic FSM with SimpleFSM¶
```python
from dataknobs_fsm.api.simple import SimpleFSM
from dataknobs_fsm.core.data_modes import DataHandlingMode

# Define configuration
config = {
    "name": "simple_example",
    "states": [
        {"name": "start", "is_start": True},
        {"name": "process"},
        {"name": "end", "is_end": True}
    ],
    "arcs": [
        {
            "from": "start",
            "to": "process",
            "transform": {
                "type": "inline",
                "code": "lambda data, ctx: {**data, 'processed': True}"
            }
        },
        {"from": "process", "to": "end"}
    ]
}

# Create and run FSM
fsm = SimpleFSM(config, data_mode=DataHandlingMode.COPY)
result = fsm.process({"input": "data"})
print(f"Result: {result['data']}")
```
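Note that the inline transform is a lambda stored as a string; conceptually, the loader turns that string into a callable, which you can mimic directly (a sketch of the idea, not the FSM's actual loading code):

```python
# How an inline transform string like the one above might be applied
# (the real loader may differ; this just shows the concept).
transform_src = "lambda data, ctx: {**data, 'processed': True}"
transform = eval(transform_src)  # compile the inline code into a callable

data = {"input": "data"}
result = transform(data, None)   # ctx is unused by this particular transform
print(result)  # {'input': 'data', 'processed': True}
```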
Debugging with AdvancedFSM¶
```python
from dataknobs_fsm import AdvancedFSM, ExecutionMode
import asyncio

async def debug_example():
    # Create FSM with debug mode
    fsm = AdvancedFSM(
        "config.yaml",
        execution_mode=ExecutionMode.DEBUG
    )

    # Add breakpoint
    fsm.add_breakpoint("process")

    # Create context and run
    context = fsm.create_context({"input": "data"})
    await fsm.run_until_breakpoint(context)
    print(f"Stopped at: {context.current_state}")

    # Continue execution
    await fsm.step(context)

asyncio.run(debug_example())
```
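The run-until-breakpoint flow can be illustrated with a toy async stepper (this is not the AdvancedFSM implementation, only the control pattern it exposes):

```python
import asyncio

async def run_until_breakpoint(states, breakpoints, start=0):
    """Visit states one at a time; pause when a breakpoint state is reached."""
    i = start
    while i < len(states) - 1:
        i += 1
        await asyncio.sleep(0)      # yield control, as a real async FSM would
        if states[i] in breakpoints:
            return states[i], i     # paused at a breakpoint
    return states[i], i             # reached the final state

states = ["start", "process", "end"]
state, i = asyncio.run(run_until_breakpoint(states, {"process"}))
print(state)  # process
```

A subsequent `step` call would simply resume from index `i` and advance one state.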
Example Features¶
Each example demonstrates:
- Correct API Usage: Using SimpleFSM or AdvancedFSM properly
- Data Handling Modes: When to use COPY, REFERENCE, or DIRECT
- Custom Functions: Registering and using custom transform functions
- Error Handling: Proper error states and recovery
- Real-world Patterns: Practical solutions to common problems
Running Examples¶
Examples are located in the packages/fsm/examples/ directory:
```bash
# Navigate to the FSM package
cd packages/fsm

# Run the database ETL example
uv run python examples/database_etl.py

# Run the data processing pipeline
uv run python examples/data_pipeline_example.py

# Run the data validation pipeline
uv run python examples/data_validation_pipeline.py

# Run the large file processor
uv run python examples/large_file_processor.py

# Run advanced debugging example
uv run python examples/advanced_debugging.py

# Run LLM conversation example
uv run python examples/llm_conversation.py

# Run text normalization example
uv run python examples/normalize_file_example.py

# Run regex transformation examples
uv run python examples/normalize_file_with_regex.py

# Test regex YAML configurations
uv run python examples/test_regex_yaml.py
```
Running with Custom Parameters¶
Most examples accept command-line arguments:
```bash
# Database ETL with custom batch size
uv run python examples/database_etl.py --batch-size 500

# File processor with specific input
uv run python examples/large_file_processor.py --input data.csv --output processed.json

# Debugging with specific config
uv run python examples/advanced_debugging.py --config custom_fsm.yaml

# Process file with regex transformations
uv run python examples/normalize_file_example.py
```
Prerequisites¶
Before running the examples, ensure you have:
- FSM package installed
- Required dependencies for specific examples
- Environment setup:
  - For LLM examples: Set API keys (OPENAI_API_KEY, ANTHROPIC_API_KEY)
  - For database examples: SQLite is used by default (no setup needed)
Key Concepts Demonstrated¶
The examples showcase important FSM concepts:
Data Handling Modes¶
- COPY Mode: Used in database_etl.py for transaction safety
- REFERENCE Mode: Used in large_file_processor.py for memory efficiency
- DIRECT Mode: Shown in performance-critical sections
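The practical difference between the modes can be illustrated in plain Python (a conceptual sketch of copy-vs-mutate semantics; the FSM's actual internals may differ):

```python
import copy

def step_copy(data):
    local = copy.deepcopy(data)   # COPY: transform works on an isolated copy,
    local["touched"] = True       # so a failed step can be rolled back safely
    return local

def step_direct(data):
    data["touched"] = True        # DIRECT: the caller's object is mutated in
    return data                   # place, avoiding copy overhead

original = {"value": 1}
copied = step_copy(original)
print("touched" in original, "touched" in copied)  # False True
step_direct(original)
print("touched" in original)  # True
```

REFERENCE sits between the two: states share one object without duplicating it, which is why it suits large payloads where copying would be too expensive.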
Custom Functions¶
```python
# Register custom functions with SimpleFSM
def transform_data(state):
    data = state.data.copy()
    # Transform logic
    return data

fsm = SimpleFSM(
    config,
    custom_functions={"transform": transform_data}
)
```
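Conceptually, registration maps a name to a callable that the configuration can then reference by that name. A minimal sketch of the idea (a hypothetical registry, not the FSM's internal one):

```python
# A minimal name-to-callable registry, mirroring how custom_functions is
# used above (the FSM's actual resolution logic may differ).
registry = {}

def register(name):
    def deco(fn):
        registry[name] = fn
        return fn
    return deco

@register("transform")
def transform_data(state):
    data = dict(state["data"])    # copy before mutating
    data["normalized"] = True
    return data

# A config referencing "transform" would be resolved through the registry:
result = registry["transform"]({"data": {"id": 7}})
print(result)  # {'id': 7, 'normalized': True}
```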
Error Handling¶
```python
# Configuration with error states
config = {
    "states": [
        {"name": "process"},
        {"name": "error"},
        {"name": "rollback"}
    ],
    "arcs": [
        {
            "from": "process",
            "to": "error",
            "pre_test": {
                "type": "inline",
                "code": "lambda data, ctx: 'error' in data"
            }
        }
    ]
}
```
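The pre_test acts as a guard: the arc is only taken when its test passes. The routing idea can be sketched in plain Python (hypothetical arc-selection logic, not the FSM engine itself):

```python
# Each arc carries a guard; the first arc whose guard passes is taken.
arcs = [
    {"to": "error", "pre_test": lambda data, ctx: "error" in data},
    {"to": "end",   "pre_test": lambda data, ctx: True},  # fallback arc
]

def next_state(data, ctx=None):
    for arc in arcs:
        if arc["pre_test"](data, ctx):
            return arc["to"]

print(next_state({"error": "boom"}))  # error
print(next_state({"value": 1}))       # end
```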
Regular Expressions in YAML¶
```yaml
# Use regex directly in inline transforms
arcs:
  - from: start
    to: normalize
    transform:
      type: inline
      code: |
        lambda data, ctx: {
          **data,
          'normalized': __import__('re').sub(
            r'\s+',  # Pattern: runs of whitespace
            ' ',     # Replacement: single space
            data.get('text', '')
          ).strip()
        }
```

Note that inside a YAML literal block scalar (`|`), backslashes are not escape characters, so the pattern is written `r'\s+'`; writing `r'\\s+'` there would match a literal backslash instead of whitespace.
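For clarity, the same inline transform written as ordinary Python (the sample input is illustrative, and `ctx` is unused by this transform):

```python
import re

# Equivalent of the YAML inline transform above
transform = lambda data, ctx: {
    **data,
    "normalized": re.sub(r"\s+", " ", data.get("text", "")).strip(),
}

print(transform({"text": "  hello   world  "}, None)["normalized"])  # hello world
```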
Learn More¶
- API Reference - Complete API documentation
- Patterns Guide - Pre-built integration patterns
- Data Modes Guide - Understanding data handling
- Contributing Guide - Submit your examples