Configuration System Overview
The DataKnobs Config package provides configuration management for applications that run in multiple environments and need dynamic configuration.
Core Concepts
Configuration Structure
The configuration system organizes settings as a dictionary of lists keyed by type. Each configuration type (e.g., "databases", "caches", "services") holds a list of configuration objects:
databases:
  - name: primary
    host: localhost
    port: 5432
caches:
  - name: redis
    host: localhost
    port: 6379
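To make the dictionary-of-lists layout concrete, here is a minimal sketch in plain Python of how an item might be selected by name or by index. The `lookup` helper and `config_data` are hypothetical names used for illustration only; they are not part of the `dataknobs_config` API.

```python
from typing import Union

# Hypothetical stand-in for a loaded configuration: a dict of lists keyed by type.
config_data = {
    "databases": [{"name": "primary", "host": "localhost", "port": 5432}],
    "caches": [{"name": "redis", "host": "localhost", "port": 6379}],
}


def lookup(data: dict, type_name: str, key: Union[str, int] = 0) -> dict:
    """Return one item of a type, selected by list index or by its "name" field."""
    items = data[type_name]
    if isinstance(key, int):
        return items[key]
    for item in items:
        if item.get("name") == key:
            return item
    raise KeyError(f"{type_name}[{key}] not found")


primary = lookup(config_data, "databases", "primary")  # select by name
first_cache = lookup(config_data, "caches", 0)         # select by index
```

Because every type is a list, index access always works, while name access relies on each item carrying a `name` key.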
Key Features
- Multi-format Support: Load from YAML, JSON, or Python dictionaries
- Environment Overrides: Override any configuration value via environment variables
- Factory Pattern: Register and use factories for object construction
- Cross-references: Reference configurations across types using xref syntax
- Variable Substitution: Use environment variables within configuration files
- Validation: Built-in validation and normalization
- Caching: Intelligent caching of constructed objects
Basic Usage
Loading Configuration
from dataknobs_config import Config
# Load from file
config = Config.from_file("config.yaml")
# Load from dictionary
config = Config({
    "databases": [
        {"name": "primary", "host": "localhost", "port": 5432}
    ]
})
# Load with environment overrides
config = Config.from_file("config.yaml", apply_env_overrides=True)
Accessing Configuration
# Get all databases
databases = config.get("databases")
# Get specific database by name
primary_db = config.get("databases", "primary")
# Get by index
first_db = config.get("databases", 0)
# Get with default
cache = config.get("caches", "redis", default={"host": "localhost"})
Setting Configuration
# Set entire type
config.set("caches", [{"name": "redis", "host": "localhost"}])
# Update specific item
config.set("databases", "primary", {"host": "prod.example.com"})
# Add new item
config.add("databases", {"name": "analytics", "host": "analytics.example.com"})
Configuration Files
YAML Format
# config.yaml
databases:
  - name: primary
    host: ${DB_HOST:localhost}  # Variable substitution with default
    port: 5432
    pool_size: 20
  - name: analytics
    host: analytics.example.com
    port: 5432
    pool_size: 10
caches:
  - name: redis
    factory: "myapp.cache.RedisFactory"
    host: localhost
    port: 6379
    ttl: 3600
services:
  - name: api
    database: "xref:databases[primary]"  # Cross-reference
    cache: "xref:caches[redis]"
    port: 8000
JSON Format
{
  "databases": [
    {
      "name": "primary",
      "host": "localhost",
      "port": 5432
    }
  ],
  "caches": [
    {
      "name": "redis",
      "host": "localhost",
      "port": 6379
    }
  ]
}
Cross-References
The xref syntax allows referencing configurations across types:
services:
  - name: api
    # Reference by name
    database: "xref:databases[primary]"
    # Reference by index
    cache: "xref:caches[0]"
    # Reference first item (default)
    queue: "xref:queues"
References are resolved recursively, and circular references are detected during resolution.
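The resolution rules above can be sketched in plain Python. This is an illustration of the idea, not the package's implementation: the regular expression, `find_item`, and `resolve` are hypothetical, and raising `ValueError` on a cycle is an assumption.

```python
import re

# Matches "xref:type", "xref:type[name]" and "xref:type[0]".
XREF = re.compile(r"^xref:(?P<type>\w+)(?:\[(?P<key>[^\]]+)\])?$")


def find_item(data, type_name, key):
    """Pick an item by index, by name, or the first item when no key is given."""
    items = data[type_name]
    if key is None:
        return items[0]
    if key.isdigit():
        return items[int(key)]
    for item in items:
        if item.get("name") == key:
            return item
    raise KeyError(f"{type_name}[{key}] not found")


def resolve(data, value, _active=()):
    """Recursively replace xref strings; _active tracks references being resolved."""
    if isinstance(value, dict):
        return {k: resolve(data, v, _active) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve(data, v, _active) for v in value]
    if isinstance(value, str):
        match = XREF.match(value)
        if match:
            ref = (match["type"], match["key"])
            if ref in _active:
                raise ValueError(f"circular reference: {value}")
            target = find_item(data, match["type"], match["key"])
            return resolve(data, target, _active + (ref,))
    return value


data = {
    "databases": [{"name": "primary", "host": "localhost"}],
    "caches": [{"name": "redis", "port": 6379}],
    "services": [{"name": "api",
                  "database": "xref:databases[primary]",
                  "cache": "xref:caches[0]"}],
}
api = resolve(data, data["services"][0])
```

Tracking the chain of references currently being resolved, rather than every reference ever seen, lets two services share the same database without being reported as a cycle.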
Variable Substitution
Configuration files can use environment variables:
database:
  host: ${DB_HOST}              # Required variable
  port: ${DB_PORT:5432}         # With default value
  password: ${DB_PASS:-secret}  # Bash-style default
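As a rough illustration of the three forms shown above, the following sketch expands `${VAR}`, `${VAR:default}`, and `${VAR:-default}` with a regular expression. The `substitute` function is hypothetical, and two behaviors are assumptions: a missing required variable raises `KeyError`, and a default applies only when the variable is unset (not when it is empty).

```python
import os
import re

# ${NAME}, ${NAME:default} or ${NAME:-default}
VAR = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-?(?P<default>[^}]*))?\}"
)


def substitute(text, env=None):
    """Expand variable placeholders in text using env (os.environ by default)."""
    env = os.environ if env is None else env

    def replace(match):
        name = match["name"]
        if name in env:
            return env[name]
        if match["default"] is not None:
            return match["default"]
        raise KeyError(f"required variable {name} is not set")

    return VAR.sub(replace, text)


host = substitute("${DB_HOST}", {"DB_HOST": "db.internal"})
port = substitute("${DB_PORT:5432}", {})
password = substitute("${DB_PASS:-secret}", {})
```

Substitution in this sketch always yields strings, so a value such as the port would still need converting to an integer afterwards.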
Object Construction
The configuration system can automatically construct objects using registered factories or class references:
# Register a factory
config.register_factory("database", DatabaseFactory())

# In configuration
databases:
  - name: primary
    factory: "database"  # Uses registered factory
    host: localhost
# Or use class directly
caches:
  - name: redis
    class: "myapp.cache.RedisCache"
    host: localhost

# Get constructed objects
db = config.construct("databases", "primary")
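One way to picture construction from a `class` entry is a dotted-path import followed by a call with the remaining keys as keyword arguments, with the result cached. The sketch below is an assumption about the general technique, not the package's code; the `construct` function and the `durations` type are hypothetical, and a standard-library class stands in for an application class.

```python
import importlib
from typing import Any, Dict, Tuple

# Cache of constructed objects, keyed by (type name, item name).
_constructed: Dict[Tuple[str, str], Any] = {}


def construct(type_name: str, item: dict) -> Any:
    """Import the class named by item["class"] and build it once per item."""
    key = (type_name, item["name"])
    if key in _constructed:
        return _constructed[key]
    module_path, _, class_name = item["class"].rpartition(".")
    cls = getattr(importlib.import_module(module_path), class_name)
    # Everything except the bookkeeping keys is passed to the constructor.
    kwargs = {k: v for k, v in item.items() if k not in ("name", "class")}
    _constructed[key] = cls(**kwargs)
    return _constructed[key]


# datetime.timedelta is used here only because it ships with Python.
ttl_item = {"name": "ttl", "class": "datetime.timedelta", "seconds": 3600}
ttl = construct("durations", ttl_item)
```

A second call with the same type and name returns the cached object instead of building a new one.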
Settings and Defaults
Configure global settings and type-specific defaults:
from dataknobs_config import Settings
settings = Settings()
# Set type-specific defaults
settings.set_defaults("databases", {
    "port": 5432,
    "pool_size": 20
})
# Set global defaults
settings.set_global_defaults({
    "timeout": 30
})
# Apply to config
config = Config.from_file("config.yaml", settings=settings)
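The effect of layered defaults can be sketched with plain dictionaries. The precedence used here (explicit item values over type-specific defaults over global defaults) is an assumption for illustration, and `apply_defaults` is a hypothetical helper rather than a `dataknobs_config` function.

```python
def apply_defaults(data, global_defaults, type_defaults):
    """Return a copy of data with defaults filled in for every item."""
    result = {}
    for type_name, items in data.items():
        per_type = type_defaults.get(type_name, {})
        # Later dicts win: global < type-specific < the item's own values.
        result[type_name] = [
            {**global_defaults, **per_type, **item} for item in items
        ]
    return result


data = {
    "databases": [
        {"name": "primary", "host": "localhost"},
        {"name": "analytics", "port": 5433},
    ]
}
filled = apply_defaults(
    data,
    global_defaults={"timeout": 30},
    type_defaults={"databases": {"port": 5432, "pool_size": 20}},
)
```

In this sketch the `analytics` item keeps its own port, while `primary` picks up the type-level port and both receive the global timeout.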
Merging Configurations
Combine multiple configuration sources:
# Load base configuration
base_config = Config.from_file("base.yaml")
# Load environment-specific overrides
env_config = Config.from_file("production.yaml")
# Merge configurations
base_config.merge(env_config)
# Or merge a file directly into an already loaded configuration
config = Config.from_file("base.yaml")
config.merge_file("production.yaml")
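The text does not spell out the merge semantics, so the following is only one plausible reading: items are matched by `name`, matching items have their keys overridden, and unmatched items are appended. The `merge` function is a hypothetical illustration in plain Python.

```python
import copy


def merge(base, override):
    """Return a new dict of lists with override applied on top of base."""
    merged = copy.deepcopy(base)
    for type_name, items in override.items():
        existing = merged.setdefault(type_name, [])
        by_name = {i["name"]: i for i in existing if "name" in i}
        for item in items:
            name = item.get("name")
            target = by_name.get(name) if name is not None else None
            if target is None:
                new_item = copy.deepcopy(item)
                existing.append(new_item)
                if name is not None:
                    by_name[name] = new_item
            else:
                target.update(copy.deepcopy(item))
    return merged


base = {"databases": [{"name": "primary", "host": "localhost", "port": 5432}]}
production = {
    "databases": [
        {"name": "primary", "host": "prod.example.com"},
        {"name": "analytics", "host": "analytics.example.com"},
    ]
}
merged = merge(base, production)
```

Under these assumptions the production host replaces `localhost` while the base port is kept, and the `analytics` database is added as a second item.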
Validation
The configuration system provides built-in validation:
# Validate configuration
try:
    config.validate()
except ConfigurationError as e:
    print(f"Invalid configuration: {e}")
# Custom validation
def validate_database(db_config):
    if db_config.get("port", 0) < 1024:
        raise ValueError("Port must be >= 1024")
config.add_validator("databases", validate_database)
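A per-type validator registry like the one used above could work roughly as follows. This is a self-contained sketch under stated assumptions: the module-level `validators` registry and the `validate` function that collects error messages (instead of raising on the first failure) are hypothetical choices, not the package's behavior.

```python
from collections import defaultdict

# Hypothetical registry: type name -> list of validator callables.
validators = defaultdict(list)


def add_validator(type_name, func):
    validators[type_name].append(func)


def validate(data):
    """Run every registered validator on every item and collect the failures."""
    errors = []
    for type_name, items in data.items():
        for item in items:
            for check in validators[type_name]:
                try:
                    check(item)
                except ValueError as exc:
                    errors.append(f"{type_name}[{item.get('name')}]: {exc}")
    return errors


def validate_database(db_config):
    if db_config.get("port", 0) < 1024:
        raise ValueError("Port must be >= 1024")


add_validator("databases", validate_database)
errors = validate({
    "databases": [
        {"name": "primary", "port": 5432},
        {"name": "legacy", "port": 80},
    ]
})
```

Collecting all failures in one pass makes it possible to report every invalid item at once rather than fixing them one by one.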
Best Practices
- Use Type Organization: Group related configurations by type (databases, caches, services)
- Leverage Cross-References: Avoid duplication by referencing shared configurations
- Environment-Specific Files: Use separate files for different environments
- Variable Substitution: Keep sensitive data in environment variables
- Factory Registration: Register reusable factories for common object types
- Validation: Add custom validators for critical configurations
- Defaults: Use settings to define sensible defaults
Advanced Features
Dynamic Configuration
# Reload configuration
config.reload()
# Watch for changes
config.watch(callback=on_config_change)
# Clear caches
config.clear_cache()
Custom Factories
from dataknobs_config import FactoryBase
# Database is the application's own class (not shown here)
class DatabaseFactory(FactoryBase):
    def create(self, **config):
        # Custom initialization logic
        config.setdefault("pool_size", 20)
        return Database(**config)
config.register_factory("database", DatabaseFactory())
Configuration Export
# Export to dictionary
config_dict = config.to_dict()
# Export to YAML
yaml_str = config.to_yaml()
# Export to JSON
json_str = config.to_json()
Examples
For practical examples, see:
- Database Configuration Example
- Service Configuration Example
- Multi-Environment Setup