Adding Configuration Support to DataKnobs Packages¶
This guide provides step-by-step instructions for adding configuration support to new or existing DataKnobs packages and classes.
When to Add Configuration Support¶
Add configuration support when:
- Your class might be instantiated from external configuration files
- Users need to configure your class without modifying code
- Your class is part of a larger system that uses dependency injection
- You want to support environment-based configuration
- Your class has complex initialization parameters
Step-by-Step Implementation¶
1. Add Dependencies¶
First, ensure your package depends on dataknobs-config:
# pyproject.toml
[project]
dependencies = [
    "dataknobs-config>=0.1.0",
    # ... other dependencies
]

[tool.uv.sources]
dataknobs-config = { workspace = true }
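2. Import ConfigurableBase¶
Import the base class in the module that defines your class:
from dataknobs_config import ConfigurableBase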
3. Update Class Definition¶
Option A: New Class¶
from dataknobs_config import ConfigurableBase
from typing import Dict, Any, Optional


class MyConfigurableClass(ConfigurableBase):
    """A configurable class that follows DataKnobs patterns.

    Configuration Options:
        param1 (str): Description of param1 (required)
        param2 (int): Description of param2 (default: 100)
        param3 (bool): Description of param3 (default: False)
    """

    def __init__(self, config: Optional[Dict[str, Any]] = None):
        """Initialize from configuration dictionary.

        Args:
            config: Configuration dictionary with class parameters
        """
        self.config = config or {}
        self.param1 = self.config.get("param1")
        self.param2 = self.config.get("param2", 100)
        self.param3 = self.config.get("param3", False)

        if not self.param1:
            raise ValueError("param1 is required in configuration")

        self._initialize()

    def _initialize(self):
        """Perform any complex initialization."""
        # Your initialization logic here
        pass

    @classmethod
    def from_config(cls, config: dict) -> "MyConfigurableClass":
        """Create instance from configuration dictionary.

        This method is called by the Config.get_instance() method.

        Args:
            config: Configuration dictionary

        Returns:
            Configured instance of the class
        """
        return cls(config)
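The pattern above can be tried out without the library installed. The following is a minimal sketch, not the library's code: `ConfigurableBase` here is a hypothetical empty stand-in for `dataknobs_config.ConfigurableBase`, and the `Greeter` class with its `name` and `punctuation` options is invented purely for illustration.

```python
from typing import Any, Dict, Optional


class ConfigurableBase:
    """Hypothetical stand-in for dataknobs_config.ConfigurableBase."""


class Greeter(ConfigurableBase):
    """Illustrative class: 'name' is required, 'punctuation' defaults to '!'."""

    def __init__(self, config: Optional[Dict[str, Any]] = None):
        self.config = config or {}
        self.name = self.config.get("name")
        self.punctuation = self.config.get("punctuation", "!")
        if not self.name:
            raise ValueError("name is required in configuration")

    @classmethod
    def from_config(cls, config: dict) -> "Greeter":
        # Same entry point a configuration loader would call.
        return cls(config)

    def greet(self) -> str:
        return f"Hello, {self.name}{self.punctuation}"


# Both construction paths go through the same __init__ logic.
direct = Greeter({"name": "Ada"})
built = Greeter.from_config({"name": "Ada", "punctuation": "."})
```

Because `from_config()` simply delegates to the constructor, defaults and required-field checks live in one place and behave the same on either path.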
Option B: Existing Class (Backward Compatible)¶
from dataknobs_config import ConfigurableBase
from typing import Dict, Any, Optional


# SomeBaseClass stands for whatever the class already inherits from.
class ExistingClass(SomeBaseClass, ConfigurableBase):
    """An existing class with added configuration support.

    Can be instantiated either directly or from configuration.

    Direct instantiation:
        obj = ExistingClass("value1", 42, flag=True)

    Configuration instantiation:
        obj = ExistingClass.from_config({
            "param1": "value1",
            "param2": 42,
            "flag": True
        })

    Configuration Options:
        param1 (str): Description of param1
        param2 (int): Description of param2
        flag (bool): Optional flag (default: False)
    """

    def __init__(self,
                 param1: Optional[str] = None,
                 param2: Optional[int] = None,
                 flag: bool = False,
                 config: Optional[Dict[str, Any]] = None):
        """Initialize with parameters or configuration.

        Args:
            param1: First parameter (used if config not provided)
            param2: Second parameter (used if config not provided)
            flag: Optional flag
            config: Configuration dictionary (overrides other params)
        """
        if config:
            # Configuration-based initialization
            self.param1 = config.get("param1", param1)
            self.param2 = config.get("param2", param2)
            self.flag = config.get("flag", flag)
        else:
            # Direct initialization
            self.param1 = param1
            self.param2 = param2
            self.flag = flag

        # Call parent class initialization
        super().__init__()

    @classmethod
    def from_config(cls, config: dict) -> "ExistingClass":
        """Create instance from configuration dictionary."""
        return cls(config=config)
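To make the precedence rule concrete, here is a small self-contained sketch of the same constructor logic. `LegacyClient` is a hypothetical name and has no base classes, so the example runs on its own; it only illustrates how values in `config` override positional arguments while missing keys fall back to them.

```python
from typing import Any, Dict, Optional


class LegacyClient:
    """Hypothetical existing class that gained a `config` keyword."""

    def __init__(
        self,
        param1: Optional[str] = None,
        param2: Optional[int] = None,
        flag: bool = False,
        config: Optional[Dict[str, Any]] = None,
    ):
        if config:
            # Values in `config` win; the other arguments act as fallbacks.
            self.param1 = config.get("param1", param1)
            self.param2 = config.get("param2", param2)
            self.flag = config.get("flag", flag)
        else:
            self.param1, self.param2, self.flag = param1, param2, flag

    @classmethod
    def from_config(cls, config: dict) -> "LegacyClient":
        return cls(config=config)


direct = LegacyClient("value1", 42, flag=True)
configured = LegacyClient.from_config(
    {"param1": "value1", "param2": 42, "flag": True}
)
# Mixed call: param1 comes from config, param2 from the positional argument.
mixed = LegacyClient("fallback", 7, config={"param1": "from_config"})
```

Existing callers that pass positional arguments are unaffected, which is what keeps the change backward compatible.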
4. Handle Complex Configurations¶
For classes with nested configurations or references:
class ComplexConfigurableClass(ConfigurableBase):
    """A class with complex configuration needs."""

    def __init__(self, config: Optional[Dict[str, Any]] = None):
        self.config = config or {}

        # Handle nested configuration
        self.database_config = self.config.get("database", {})
        self.cache_config = self.config.get("cache", {})

        # Define the attributes up front so they exist even when a
        # section is missing from the configuration
        self.db = None
        self.cache_size = None
        self.cache_ttl = None

        # Initialize sub-components
        self._init_database()
        self._init_cache()

    def _init_database(self):
        """Initialize database from nested config."""
        if self.database_config:
            from dataknobs_data import Database
            self.db = Database.create(
                self.database_config.get("backend", "memory"),
                self.database_config
            )

    def _init_cache(self):
        """Initialize cache from nested config."""
        if self.cache_config:
            self.cache_size = self.cache_config.get("size", 1000)
            self.cache_ttl = self.cache_config.get("ttl", 3600)
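One way to keep nested sections manageable is to parse each section into a small typed object. The sketch below is an assumption about how that could look, not part of the DataKnobs API: `CacheSettings` and `from_section` are hypothetical names, and the defaults mirror the `size` and `ttl` values used above.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class CacheSettings:
    """Hypothetical typed view of the 'cache' section."""

    size: int = 1000
    ttl: int = 3600

    @classmethod
    def from_section(cls, section: Optional[Dict[str, Any]]) -> "CacheSettings":
        # A missing or empty section yields the defaults.
        section = section or {}
        return cls(
            size=section.get("size", 1000),
            ttl=section.get("ttl", 3600),
        )


config = {"database": {"backend": "memory"}, "cache": {"size": 500}}
cache = CacheSettings.from_section(config.get("cache"))
no_cache = CacheSettings.from_section({}.get("cache"))
```

Parsing a section once in the constructor means the rest of the class reads attributes instead of repeating `.get()` calls with defaults.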
5. Implement Factory Pattern (Optional)¶
For creating different implementations based on configuration:
from typing import Any

from dataknobs_config import FactoryBase


class MyClassFactory(FactoryBase):
    """Factory for creating MyClass instances based on type."""

    def create(self, **config) -> Any:
        """Create instance based on configuration.

        Args:
            **config: Configuration including 'type' field

        Returns:
            Instance of appropriate class
        """
        # Remove 'type' so only class parameters reach the implementation
        class_type = config.pop("type", "default")

        if class_type == "advanced":
            from .advanced import AdvancedImplementation
            return AdvancedImplementation(config)
        elif class_type == "simple":
            from .simple import SimpleImplementation
            return SimpleImplementation(config)
        else:
            # Unknown types fall back to the default implementation
            from .default import DefaultImplementation
            return DefaultImplementation(config)
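When the number of implementations grows, the `if`/`elif` chain can be replaced by a lookup table. The sketch below is a standalone variant under stated assumptions: it does not use `FactoryBase`, the three implementation classes are hypothetical placeholders, and unlike the factory above it rejects unknown types instead of falling back to the default.

```python
from typing import Any, Callable, Dict


class _Implementation:
    """Hypothetical base for the illustrative implementations below."""

    def __init__(self, config: Dict[str, Any]):
        self.config = config


class DefaultImplementation(_Implementation):
    pass


class SimpleImplementation(_Implementation):
    pass


class AdvancedImplementation(_Implementation):
    pass


# Maps the 'type' field to the class that handles it.
REGISTRY: Dict[str, Callable[[Dict[str, Any]], Any]] = {
    "default": DefaultImplementation,
    "simple": SimpleImplementation,
    "advanced": AdvancedImplementation,
}


def create(**config: Any) -> Any:
    """Build an implementation chosen by the 'type' key (default: 'default')."""
    class_type = config.pop("type", "default")
    try:
        implementation = REGISTRY[class_type]
    except KeyError:
        raise ValueError(
            f"Unknown type {class_type!r}; expected one of {sorted(REGISTRY)}"
        ) from None
    return implementation(config)


simple = create(type="simple", level=2)
fallback = create(level=1)
```

Failing loudly on an unknown `type` tends to surface typos in configuration files early; whether that is preferable to a silent default depends on the package.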
Testing Configuration Support¶
1. Unit Tests¶
import pytest
from mypackage import MyConfigurableClass


class TestConfigSupport:
    """Test configuration support for MyConfigurableClass."""

    def test_direct_instantiation(self):
        """Test direct instantiation still works."""
        obj = MyConfigurableClass({"param1": "value1"})
        assert obj.param1 == "value1"

    def test_from_config_method(self):
        """Test from_config classmethod."""
        config = {
            "param1": "value1",
            "param2": 200,
            "param3": True
        }
        obj = MyConfigurableClass.from_config(config)
        assert obj.param1 == "value1"
        assert obj.param2 == 200
        assert obj.param3 is True

    def test_missing_required_param(self):
        """Test that missing required parameters raise errors."""
        with pytest.raises(ValueError, match="param1 is required"):
            MyConfigurableClass({})

    def test_default_values(self):
        """Test that default values are applied."""
        obj = MyConfigurableClass({"param1": "value1"})
        assert obj.param2 == 100  # default value
        assert obj.param3 is False  # default value
2. Integration Tests¶
from dataknobs_config import Config
from mypackage import MyConfigurableClass


def test_config_integration():
    """Test integration with Config class."""
    config = Config()
    config.load({
        "my_objects": [{
            "name": "test_object",
            "class": "mypackage.MyConfigurableClass",
            "param1": "test_value",
            "param2": 300
        }]
    })

    # Test that object can be built from config
    obj = config.get_instance("my_objects", "test_object")
    assert isinstance(obj, MyConfigurableClass)
    assert obj.param1 == "test_value"
    assert obj.param2 == 300


def test_environment_variables(monkeypatch):
    """Test environment variable substitution."""
    # pytest's monkeypatch fixture restores the environment after the test
    monkeypatch.setenv("MY_PARAM", "env_value")

    config = Config()
    config.load({
        "my_objects": [{
            "name": "env_test",
            "class": "mypackage.MyConfigurableClass",
            "param1": "${MY_PARAM}"
        }]
    })

    obj = config.get_instance("my_objects", "env_test")
    assert obj.param1 == "env_value"
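For readers unfamiliar with `${VAR}` substitution, the sketch below shows the general idea the environment-variable test relies on. It is not the `dataknobs_config` implementation, whose details this guide does not show; `substitute_env` is a hypothetical helper, and leaving unknown variables untouched is an assumption made for the example.

```python
import re
from typing import Any, Mapping

# Matches ${NAME} where NAME is a typical environment variable name.
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_env(value: Any, env: Mapping[str, str]) -> Any:
    """Recursively replace ${NAME} placeholders using `env`."""
    if isinstance(value, str):
        # Unknown names are left as-is rather than replaced with "".
        return _PATTERN.sub(lambda m: env.get(m.group(1), m.group(0)), value)
    if isinstance(value, dict):
        return {key: substitute_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [substitute_env(item, env) for item in value]
    return value


resolved = substitute_env(
    {"param1": "${MY_PARAM}", "tags": ["${MISSING}", 3]},
    {"MY_PARAM": "env_value"},
)
```

Passing the environment as a mapping (for example `os.environ`) rather than reading it globally keeps the helper easy to test.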
Documentation Requirements¶
1. Class Docstring¶
Always document configuration options in the class docstring:
class WellDocumentedClass(ConfigurableBase):
    """A well-documented configurable class.

    This class can be instantiated directly or from configuration files
    using the DataKnobs configuration system.

    Configuration Options:
        host (str): Server hostname (required)
        port (int): Server port (default: 8080)
        timeout (int): Connection timeout in seconds (default: 30)
        ssl (bool): Enable SSL/TLS (default: False)
        credentials (dict): Optional credentials dictionary with:
            - username (str): Username for authentication
            - password (str): Password for authentication

    Example Configuration:
        servers:
          - name: production
            class: mypackage.WellDocumentedClass
            host: prod.example.com
            port: 443
            ssl: true
            credentials:
              username: ${API_USER}
              password: ${API_PASSWORD}

    Example Usage:
        >>> from dataknobs_config import Config
        >>> config = Config("config.yaml")
        >>> server = config.get_instance("servers", "production")
    """
2. README Examples¶
Add configuration examples to your package README:
## Configuration Support

This package supports the DataKnobs configuration system. All main classes
inherit from `ConfigurableBase` and can be instantiated from configuration files.

### Example Configuration

```yaml
# config.yaml
my_services:
  - name: processor
    class: mypackage.DataProcessor
    input_dir: /data/input
    output_dir: /data/output
    batch_size: 100
  - name: validator
    class: mypackage.DataValidator
    rules_file: /config/rules.yaml
    strict_mode: true
```

### Loading from Configuration

```python
from dataknobs_config import Config

config = Config("config.yaml")
processor = config.get_instance("my_services", "processor")
validator = config.get_instance("my_services", "validator")
```

Common Patterns¶
Pattern 1: Optional Dependencies¶
class OptionalDependencyClass(ConfigurableBase):
    """Class with optional dependencies based on configuration."""

    def __init__(self, config: Optional[Dict[str, Any]] = None):
        self.config = config or {}
        self.backend = self.config.get("backend", "basic")

        if self.backend == "advanced":
            try:
                import advanced_library  # noqa: F401 (availability check only)
            except ImportError as exc:
                # Chain the original error so the root cause stays visible
                raise ImportError(
                    "Advanced backend requires 'advanced_library'. "
                    "Install with: pip install mypackage[advanced]"
                ) from exc
            # Run setup outside the try block so its own errors are not masked
            self._setup_advanced()
        else:
            self._setup_basic()
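An alternative to the `try`/`except ImportError` check is to ask the import system whether a module exists without importing it. The sketch below uses the standard library's `importlib.util.find_spec`; `module_available` and `choose_backend` are hypothetical helper names, and `json` stands in for the optional dependency only so that the example runs anywhere.

```python
import importlib.util


def module_available(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    # Note: for dotted names, find_spec imports the parent package and raises
    # ModuleNotFoundError if that parent is missing, so pass top-level names.
    return importlib.util.find_spec(module_name) is not None


def choose_backend(requested: str, module_name: str) -> str:
    """Validate that the optional dependency for `requested` is installed."""
    if requested == "advanced" and not module_available(module_name):
        raise ImportError(
            f"Advanced backend requires {module_name!r}. "
            "Install with: pip install mypackage[advanced]"
        )
    return requested


selected = choose_backend("advanced", "json")
basic = choose_backend("basic", "module_that_is_not_installed_xyz")
```

This keeps the availability check cheap and separate from setup, at the cost of not detecting a module that is present but fails during import.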
Pattern 2: Validation¶
from dataknobs_config import ConfigurableBase, ValidationError


class ValidatedClass(ConfigurableBase):
    """Class with configuration validation."""

    @classmethod
    def from_config(cls, config: dict) -> "ValidatedClass":
        """Create instance with validation."""
        # Validate required fields
        required = ["field1", "field2"]
        missing = [f for f in required if f not in config]
        if missing:
            raise ValidationError(f"Missing required fields: {missing}")

        # 'port' is treated as optional here; validate it only when present
        port = config.get("port")
        if port is not None:
            # Validate type (bool is a subclass of int, so reject it explicitly)
            if isinstance(port, bool) or not isinstance(port, int):
                raise ValidationError("'port' must be an integer")
            # Validate range
            if not (1 <= port <= 65535):
                raise ValidationError("'port' must be between 1 and 65535")

        return cls(config)
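Raising on the first problem means users fix configuration errors one at a time. A variation, sketched below with a hypothetical `validate` function that has no dependency on `dataknobs_config`, collects every problem into a list so they can all be reported at once. The field names follow the example above.

```python
from typing import Any, Dict, List


def validate(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; empty means valid."""
    errors: List[str] = []

    for field in ("field1", "field2"):
        if field not in config:
            errors.append(f"missing required field: {field}")

    port = config.get("port")
    if port is not None:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(port, bool) or not isinstance(port, int):
            errors.append("'port' must be an integer")
        elif not 1 <= port <= 65535:
            errors.append("'port' must be between 1 and 65535")

    return errors


ok = validate({"field1": "a", "field2": "b", "port": 8080})
two_problems = validate({"field1": "a", "port": 0})
wrong_type = validate({"field1": "a", "field2": "b", "port": "80"})
```

A `from_config()` implementation could call such a function and raise a single error that joins all messages.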
Pattern 3: Lazy Initialization¶
class LazyInitClass(ConfigurableBase):
    """Class with lazy initialization from config."""

    def __init__(self, config: Optional[Dict[str, Any]] = None):
        self.config = config or {}
        self._connection = None

    @property
    def connection(self):
        """Lazily initialize connection when first accessed."""
        if self._connection is None:
            self._initialize_connection()
        return self._connection

    def _initialize_connection(self):
        """Initialize connection from config."""
        host = self.config.get("host", "localhost")
        port = self.config.get("port", 8080)
        # create_connection is a placeholder for your own connection helper
        self._connection = create_connection(host, port)
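On Python 3.8 and later, the standard library's `functools.cached_property` expresses the same lazy pattern with less bookkeeping. The sketch below is illustrative only: `LazyClient` is a hypothetical class, and a `(host, port)` tuple stands in for a real connection object so the example stays self-contained.

```python
from functools import cached_property
from typing import Any, Dict, Optional, Tuple


class LazyClient:
    """Hypothetical class whose 'connection' is built on first access."""

    def __init__(self, config: Optional[Dict[str, Any]] = None):
        self.config = config or {}
        self.connect_calls = 0  # counts how often the connection is built

    @cached_property
    def connection(self) -> Tuple[str, int]:
        self.connect_calls += 1
        host = self.config.get("host", "localhost")
        port = self.config.get("port", 8080)
        return (host, port)  # stand-in for a real connection object


client = LazyClient({"host": "db.internal"})
calls_before = client.connect_calls
first = client.connection
second = client.connection  # served from the cache, no second build
```

Note that `cached_property` stores the value on the instance, so resetting the connection means deleting the attribute (`del client.connection`).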
Checklist¶
Before considering configuration support complete:
- Class inherits from ConfigurableBase
- from_config() classmethod is implemented
- Configuration options are documented in class docstring
- Unit tests verify configuration-based instantiation
- Integration tests verify Config.get_instance() works
- README includes configuration examples
- Backward compatibility is maintained (if updating existing class)
- Environment variable substitution is tested
- Default values are documented and tested
- Required parameters are validated
- Error messages are helpful and specific
Troubleshooting¶
Issue: ImportError when using Config.get_instance()¶
Solution: Ensure the module path in the "class" entry of your configuration is the full dotted path and that the module is importable from the environment where the configuration is loaded:
# Correct: full module path
"class": "mypackage.submodule.MyClass"
# Incorrect: missing package prefix
"class": "submodule.MyClass"
Issue: TypeError on instantiation¶
Solution: Implement from_config() to handle the config dictionary:
@classmethod
def from_config(cls, config: dict):
    return cls(config)  # or cls(**config) if appropriate
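The choice between `cls(config)` and `cls(**config)` depends on the constructor's signature. The following sketch uses a hypothetical `Server` class with two required keyword-style parameters to show how passing the dictionary positionally produces a `TypeError`, while unpacking it in `from_config()` works.

```python
from typing import Any, Dict, Optional


class Server:
    """Hypothetical class whose constructor takes individual parameters."""

    def __init__(self, host: str, port: int):
        self.host = host
        self.port = port

    @classmethod
    def from_config(cls, config: Dict[str, Any]) -> "Server":
        # Unpack the dictionary because __init__ expects separate arguments.
        return cls(**config)


config = {"host": "localhost", "port": 8080}

error: Optional[str] = None
try:
    Server(config)  # the dict is bound to 'host'; 'port' is missing
except TypeError as exc:
    error = str(exc)

server = Server.from_config(config)
```

If the constructor instead accepts a single `config` dictionary, as in Option A, `cls(config)` is the right call.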
Issue: Circular imports¶
Solution: Use lazy imports inside methods:
def _init_component(self):
    # Import here to avoid circular dependency
    from .other_module import OtherClass
    self.component = OtherClass(self.config.get("component", {}))
Next Steps¶
- Review the Configuration System Documentation
- Look at examples in the dataknobs-data package
- Test your implementation with the Config class
- Add your class to the package documentation
- Consider contributing your patterns back to this guide