Component Configuration Framework
Overview
Traditional microscopy processing systems hardcode assumptions about data dimensions and multiprocessing strategies. The ComponentConfiguration framework eliminates these assumptions by providing a generic configuration abstraction that can represent any enum-based component system.
@dataclass(frozen=True)
class ComponentConfiguration(Generic[T]):
all_components: Set[T]
multiprocessing_axis: T
default_variable: List[T]
default_group_by: Optional[T]
This enables the same processing engine to work with different component structures - wells/sites/channels, timepoints/batches/conditions, or any other dimensional organization - without code changes.
Core Constraint
The framework enforces one fundamental processing constraint: the multiprocessing axis cannot be used as a variable component. This prevents ambiguous processing behavior where the same component would be used for both task partitioning and data grouping.
def validate_combination(self, variable_components: List[T], group_by: Optional[T]) -> None:
"""Validate that group_by is not in variable_components."""
if group_by and group_by in variable_components:
raise ValueError(f"group_by {group_by.value} cannot be in variable_components")
This constraint is enforced at configuration creation and during step validation.
ComponentConfiguration Usage
Component configurations define the dimensional structure and processing behavior for a system.
Creating Configurations
# Standard OpenHCS configuration
config = ComponentConfigurationFactory.create_configuration(
StandardComponents,
multiprocessing_axis=StandardComponents.WELL,
default_variable=[StandardComponents.SITE],
default_group_by=StandardComponents.CHANNEL
)
# Custom temporal analysis configuration
config = ComponentConfigurationFactory.create_configuration(
TimeSeriesComponents,
multiprocessing_axis=TimeSeriesComponents.TIMEPOINT,
default_variable=[TimeSeriesComponents.WELL, TimeSeriesComponents.SITE],
default_group_by=TimeSeriesComponents.CONDITION
)
The factory automatically calculates remaining components available for user selection.
ComponentConfigurationFactory
The factory creates configurations with automatic default resolution when defaults aren’t specified.
# Explicit configuration
config = ComponentConfigurationFactory.create_configuration(
MyComponents,
multiprocessing_axis=MyComponents.BATCH,
default_variable=[MyComponents.SAMPLE],
default_group_by=MyComponents.CONDITION
)
# Auto-resolved defaults
config = ComponentConfigurationFactory.create_configuration(
MyComponents,
multiprocessing_axis=MyComponents.BATCH
# default_variable and default_group_by auto-resolved from remaining components
)
Auto-resolution uses the first remaining component as default_variable and the second as default_group_by.
Integration Examples
Component configurations drive enum generation and validation across OpenHCS subsystems.
Dynamic Enum Creation
# Configuration drives enum creation
config = get_openhcs_config()
remaining = config.get_remaining_components()
# AllComponents: Complete dimensional space
AllComponents = Enum('AllComponents', {c.name: c.value for c in config.all_components})
# VariableComponents: User-selectable components
VariableComponents = Enum('VariableComponents', {c.name: c.value for c in remaining})
Validation Integration
# Configuration drives validation
validator = GenericValidator(config)
result = validator.validate_step(
variable_components=[VariableComponents.SITE],
group_by=GroupBy.CHANNEL
)
Multiprocessing Integration
# Configuration drives task coordination
coordinator = MultiprocessingCoordinator(config)
tasks = coordinator.create_tasks(orchestrator, pipeline_definition)
Extension Examples
Custom Component Systems
class TimeSeriesComponents(Enum):
WELL = "well"
TIMEPOINT = "timepoint"
CHANNEL = "channel"
FIELD = "field"
# Temporal parallelization strategy
timeseries_config = ComponentConfigurationFactory.create_configuration(
TimeSeriesComponents,
multiprocessing_axis=TimeSeriesComponents.TIMEPOINT,
default_variable=[TimeSeriesComponents.WELL, TimeSeriesComponents.FIELD],
default_group_by=TimeSeriesComponents.CHANNEL
)
Common Gotchas:
Don’t use the multiprocessing axis as a variable component - validation will fail
Component keys are cached on initialization - call
clear_component_cache()if input directory changesDict pattern keys must match actual component values, not enum names