Systematic Refactoring Framework
Authoritative guide for OpenHCS architectural decisions and refactoring approaches.
Core Architectural Philosophy
Pragmatic OOP/FP Balance
Use OOP for:
Contracts and interfaces (ABCs)
Stateful systems (configuration, I/O, UI)
Polymorphism (multiple implementations)
Encapsulation (complex state management)
Use FP for:
Data transformation (image processing, math operations)
Configuration resolution (lazy evaluation, hierarchical chains)
Validation logic (stateless functions)
Utility functions (pure functions, no side effects)
Code Quality Principles
Declarative, terse, elegant:
from enum import Enum

# Good: Declarative enum-driven configuration
class ProcessingContract(Enum):
    PURE_3D = "_execute_pure_3d"
    PURE_2D = "_execute_pure_2d"

    def execute(self, registry, func, image, *args, **kwargs):
        # The enum value names the registry method to call
        method = getattr(registry, self.value)
        return method(func, image, *args, **kwargs)

# Bad: Imperative conditional logic
def execute_contract(contract_type, registry, func, image, *args, **kwargs):
    if contract_type == "pure_3d":
        return registry._execute_pure_3d(func, image, *args, **kwargs)
    elif contract_type == "pure_2d":
        return registry._execute_pure_2d(func, image, *args, **kwargs)
Strict separation of concerns:
Business logic: Domain operations isolated from framework
I/O operations: All file operations through FileManager
Configuration: Declarative dataclass-based, separate from logic
UI logic: Framework-agnostic service layer with UI adapters
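As a rough illustration of the last point, the sketch below keeps the logic in a plain service class and pushes framework details into an adapter. `PipelineService`, `UIAdapter`, and `RecordingAdapter` are hypothetical names chosen for this example, not OpenHCS classes.

```python
from abc import ABC, abstractmethod
from typing import List


class UIAdapter(ABC):
    """Hypothetical contract that every UI framework adapter must satisfy."""

    @abstractmethod
    def show_message(self, text: str) -> None: ...


class PipelineService:
    """Framework-agnostic service: it knows nothing about any UI toolkit."""

    def __init__(self, ui: UIAdapter) -> None:
        self.ui = ui  # Dependency is explicit, not looked up globally

    def add_step(self, steps: List[str], name: str) -> List[str]:
        if not name:
            raise ValueError("step name required")
        updated = [*steps, name]  # Return a new list; do not mutate the input
        self.ui.show_message(f"Added step: {name}")
        return updated


class RecordingAdapter(UIAdapter):
    """Stand-in adapter that records messages instead of drawing widgets."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def show_message(self, text: str) -> None:
        self.messages.append(text)


adapter = RecordingAdapter()
service = PipelineService(adapter)
steps = service.add_step([], "denoise_3d")
```

Because the service depends only on the `UIAdapter` contract, the same logic could in principle sit behind a terminal UI, a GUI, or a test double.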
Fundamental Refactoring Principles
Fail-Loud Philosophy
Eliminate defensive programming. Let Python’s exceptions bubble up:
# Forbidden: Defensive programming with silent failures
if hasattr(obj, 'method'):
    result = obj.method()
else:
    result = default_value  # Masks bugs

# Forbidden: getattr with fallbacks
result = getattr(obj, 'attribute', default_value)  # Masks missing attributes

# Required: Let Python fail naturally
def process_data(data: Array) -> Array:
    if data.ndim != 3:
        raise ValueError(f"Expected 3D array, got {data.ndim}D")
    return transform(data)

# Required: Error handling ONLY where errors are expected
try:
    result = gpu_operation(data)
except CudaError as e:
    raise MemoryConversionError(
        source_type="cupy", target_type="torch",
        method="GPU_conversion", reason=str(e)
    ) from e
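A small runnable sketch of why the fallback form is forbidden: a single typo in an attribute name is invisible with `getattr(..., default)` but surfaces immediately with plain attribute access. `Settings` and `num_workers` are hypothetical names used only for this demonstration.

```python
from dataclasses import dataclass


@dataclass
class Settings:
    num_workers: int = 4


settings = Settings(num_workers=8)

# Defensive: the misspelled name silently yields the fallback value,
# so the program keeps running with 1 worker instead of 8.
masked = getattr(settings, "num_worker", 1)

# Fail-loud: the same typo raises AttributeError at the point of the bug.
# (The try/except exists only so this demo can record the outcome.)
try:
    settings.num_worker
    failed_loudly = False
except AttributeError:
    failed_loudly = True
```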
Stateless Architecture
Prefer pure functions over stateful classes:
# Good: Pure function with explicit dependencies
def validate_pipeline_config(config: PipelineConfig,
                             available_functions: Set[str]) -> ValidationResult:
    errors = []
    for step in config.steps:
        if step.function_name not in available_functions:
            errors.append(f"Unknown function: {step.function_name}")
    return ValidationResult(errors)

# Bad: Stateful validator with hidden dependencies
class PipelineValidator:
    def __init__(self):
        self.function_registry = get_global_registry()  # Hidden dependency

    def validate(self, config):
        # Stateful validation logic
        ...
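To make the pure-function version runnable in isolation, the sketch below supplies minimal stand-ins for `PipelineConfig`, `Step`, and `ValidationResult`. These definitions are assumptions made for the example; the real OpenHCS types may look different.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Step:
    function_name: str


@dataclass(frozen=True)
class PipelineConfig:
    steps: List[Step]


@dataclass(frozen=True)
class ValidationResult:
    errors: List[str]

    @property
    def ok(self) -> bool:
        return not self.errors


def validate_pipeline_config(config: PipelineConfig,
                             available_functions: Set[str]) -> ValidationResult:
    # The output depends only on the two arguments: no globals, no I/O.
    errors = [
        f"Unknown function: {step.function_name}"
        for step in config.steps
        if step.function_name not in available_functions
    ]
    return ValidationResult(errors)


config = PipelineConfig(steps=[Step("denoise_3d"), Step("typo_fn")])
result = validate_pipeline_config(config, {"denoise_3d"})
```

Since the set of available functions is passed in, a test can exercise the validator without constructing or patching a global registry.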
Dataclass Patterns
Use dataclasses for declarative configuration:
from dataclasses import dataclass
from typing import Any, Dict, List

@dataclass(frozen=True)
class StepConfig:
    function_name: str
    parameters: Dict[str, Any]
    memory_type: MemoryType = MemoryType.NUMPY

    def validate(self) -> List[str]:
        errors = []
        if not self.function_name:
            errors.append("function_name required")
        return errors
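A short sketch of what `frozen=True` provides in practice, using a reduced `StepConfig` with a hypothetical `sigma` field: assignments are rejected, and variants are derived with `dataclasses.replace`.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class StepConfig:
    function_name: str
    sigma: float = 1.0  # Hypothetical parameter, for illustration only


base = StepConfig(function_name="denoise_3d")

# Mutation is rejected, so a config cannot drift after construction.
# (The try/except exists only so this demo can record the outcome.)
try:
    base.sigma = 2.0
    mutated = True
except FrozenInstanceError:
    mutated = False

# Variants are derived explicitly, leaving the original untouched.
tuned = replace(base, sigma=2.0)
```

Note that `frozen=True` only blocks attribute rebinding. A mutable field such as a `Dict` can still be modified in place, so treat such fields as read-only by convention.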
ABC Contract Enforcement
Use ABCs to enforce explicit contracts:
from abc import ABC, abstractmethod

class StorageBackend(ABC):
    @abstractmethod
    def load(self, path: str) -> bytes: pass

    @abstractmethod
    def save(self, data: bytes, path: str) -> None: pass

class FileSystemBackend(StorageBackend):
    def load(self, path: str) -> bytes:
        with open(path, 'rb') as f:
            return f.read()

    def save(self, data: bytes, path: str) -> None:
        with open(path, 'wb') as f:
            f.write(data)
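The enforcement happens at instantiation time. The sketch below uses a hypothetical `IncompleteBackend` that omits `save()`, and shows that Python refuses to create it.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    @abstractmethod
    def load(self, path: str) -> bytes: ...

    @abstractmethod
    def save(self, data: bytes, path: str) -> None: ...


class IncompleteBackend(StorageBackend):
    """Hypothetical backend that forgot to implement save()."""

    def load(self, path: str) -> bytes:
        return b""


# Instantiation fails with TypeError before any I/O can happen.
# (The try/except exists only so this demo can capture the message.)
try:
    IncompleteBackend()
    error_message = ""
except TypeError as exc:
    error_message = str(exc)
```

The error message names the missing method, so a contract violation is reported where the class is first used rather than at some later call site.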
Enum-Driven Configuration
Replace magic strings with enums:
from enum import Enum

# Good: Enum-driven behavior
class MemoryType(Enum):
    NUMPY = "numpy"
    TORCH = "torch"
    CUPY = "cupy"

    def convert_array(self, array, target_type: 'MemoryType'):
        # Dispatch to the _to_<target> converter defined on this enum
        converter = getattr(self, f"_to_{target_type.value}")
        return converter(array)

# Bad: Magic strings
def convert_array(array, source_type: str, target_type: str):
    if source_type == "numpy" and target_type == "torch":
        return torch.from_numpy(array)
    elif source_type == "torch" and target_type == "numpy":
        return array.numpy()
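Enums also give a fail-loud way to handle strings that arrive from outside, such as config files or CLI arguments. A minimal sketch, reusing the `MemoryType` values above:

```python
from enum import Enum


class MemoryType(Enum):
    NUMPY = "numpy"
    TORCH = "torch"
    CUPY = "cupy"


# Parse external strings once, at the boundary.
parsed = MemoryType("torch")

# Unknown values raise ValueError instead of falling through silently.
# (The try/except exists only so this demo can record the outcome.)
try:
    MemoryType("tensorflow")
    rejected = False
except ValueError:
    rejected = True

# The list of valid values is derived from the enum, not hardcoded.
valid_values = [member.value for member in MemoryType]
```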
OpenHCS-Specific Patterns
Lazy Configuration Resolution
Resolve each field hierarchically, from step to pipeline to global; the first non-None value wins:
def resolve_field_value(field_name: str,
                        step_config: Optional[StepConfig],
                        pipeline_config: Optional[PipelineConfig],
                        global_config: GlobalConfig) -> Any:
    # Breadth-first resolution
    for config in [step_config, pipeline_config, global_config]:
        if config and hasattr(config, field_name):
            value = getattr(config, field_name)
            if value is not None:
                return value
    return None
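The resolution order can be exercised with small stand-in configs. In this sketch the field names (`num_workers`, `memory_type`) are hypothetical, and it assumes every level declares the same fields, so it uses plain `getattr` and lets an unknown field name raise `AttributeError`.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class GlobalConfig:
    num_workers: int = 4
    memory_type: str = "numpy"


@dataclass(frozen=True)
class PipelineConfig:
    num_workers: Optional[int] = None  # None means "inherit"
    memory_type: Optional[str] = None


@dataclass(frozen=True)
class StepConfig:
    num_workers: Optional[int] = None
    memory_type: Optional[str] = None


def resolve_field_value(field_name: str,
                        step_config: Optional[StepConfig],
                        pipeline_config: Optional[PipelineConfig],
                        global_config: GlobalConfig) -> Any:
    # Most specific level first; the first non-None value wins.
    for config in (step_config, pipeline_config, global_config):
        if config is None:
            continue
        value = getattr(config, field_name)  # Unknown field raises AttributeError
        if value is not None:
            return value
    return None


step = StepConfig(memory_type="torch")
pipeline = PipelineConfig(num_workers=8)
defaults = GlobalConfig()
```

Here `memory_type` resolves at the step level, `num_workers` falls back to the pipeline, and anything left unset at both levels comes from the global defaults.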
FileManager I/O Abstraction
All I/O operations must go through FileManager:
# Good: FileManager abstraction
def save_results(results: List[Array],
                 output_paths: List[str],
                 filemanager: FileManager) -> None:
    for result, path in zip(results, output_paths):
        filemanager.save_array(result, path)

# Bad: Direct file system access
def save_results(results: List[Array], output_paths: List[str]) -> None:
    for result, path in zip(results, output_paths):
        np.save(path, result)  # Bypasses backend system
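One practical payoff of this rule is that backends become swappable. The sketch below is a simplified stand-in, not the real OpenHCS `FileManager` API: it assumes a `FileManager` that forwards bytes to an injected `StorageBackend`, and adds a hypothetical `MemoryBackend` for tests.

```python
from abc import ABC, abstractmethod
from typing import Dict


class StorageBackend(ABC):
    @abstractmethod
    def load(self, path: str) -> bytes: ...

    @abstractmethod
    def save(self, data: bytes, path: str) -> None: ...


class MemoryBackend(StorageBackend):
    """Hypothetical in-memory backend, handy for tests."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def load(self, path: str) -> bytes:
        return self._store[path]  # Missing path raises KeyError

    def save(self, data: bytes, path: str) -> None:
        self._store[path] = data


class FileManager:
    """Simplified stand-in: routes every call to the injected backend."""

    def __init__(self, backend: StorageBackend) -> None:
        self.backend = backend

    def save(self, data: bytes, path: str) -> None:
        self.backend.save(data, path)

    def load(self, path: str) -> bytes:
        return self.backend.load(path)


filemanager = FileManager(MemoryBackend())
filemanager.save(b"\x00\x01", "plate1/A01.bin")
loaded = filemanager.load("plate1/A01.bin")
```

Code written against the manager never touches the disk in this configuration, which a direct `np.save` call could not offer.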
3D→3D Function Contracts
All OpenHCS functions maintain 3D→3D contracts:
@register_function("denoise_3d")
def denoise_volume(volume: Array3D) -> Array3D:
    """Denoise 3D volume. Input and output must be 3D."""
    if volume.ndim != 3:
        raise ValueError(f"Expected 3D volume, got {volume.ndim}D")
    denoised = apply_denoising(volume)
    if denoised.ndim != 3:
        raise ValueError(f"Function violated 3D→3D contract: returned {denoised.ndim}D")
    return denoised
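When many functions share this contract, the two checks could be factored into a decorator. The sketch below is one possible approach, not an existing OpenHCS helper: `enforce_3d` is a hypothetical name, and `FakeArray` stands in for any array type that exposes `ndim` (a NumPy array, for instance).

```python
from dataclasses import dataclass
from functools import wraps
from typing import Any, Callable


def enforce_3d(func: Callable[..., Any]) -> Callable[..., Any]:
    """Check that both the input and the output expose ndim == 3."""

    @wraps(func)
    def wrapper(volume: Any, *args: Any, **kwargs: Any) -> Any:
        if volume.ndim != 3:
            raise ValueError(f"Expected 3D volume, got {volume.ndim}D")
        result = func(volume, *args, **kwargs)
        if result.ndim != 3:
            raise ValueError(
                f"{func.__name__} violated 3D→3D contract: returned {result.ndim}D"
            )
        return result

    return wrapper


@dataclass(frozen=True)
class FakeArray:
    """Minimal stand-in for an array type; only ndim matters here."""
    ndim: int


@enforce_3d
def identity(volume: FakeArray) -> FakeArray:
    return volume


@enforce_3d
def bad_projection(volume: FakeArray) -> FakeArray:
    return FakeArray(ndim=2)  # Deliberately breaks the contract
```

Calling `bad_projection` with a 3D input raises a `ValueError` that names the offending function, which keeps the failure close to its cause.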
Systematic Refactoring Process
Step 1: Identify Violations
Code smells to look for:
hasattr() checks with fallback logic
getattr() calls with default values
try/except blocks where errors aren’t expected
Magic strings instead of enums
Direct file system access bypassing FileManager
Defensive programming patterns
Hardcoded lists that should use enums
Stateful classes that could be pure functions
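The first two smells in this list can be located mechanically. The following is a heuristic sketch using the standard `ast` module; `find_defensive_calls` is a hypothetical helper, and it only flags candidates for human review (it will miss aliased or indirect calls).

```python
import ast
from typing import List, Tuple


def find_defensive_calls(source: str) -> List[Tuple[int, str]]:
    """Return (line, description) for hasattr() and getattr() with a default."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Only consider direct calls to a bare name, e.g. hasattr(...)
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
            continue
        if node.func.id == "hasattr":
            findings.append((node.lineno, "hasattr() check"))
        elif node.func.id == "getattr" and len(node.args) == 3:
            # A third positional argument is the fallback value
            findings.append((node.lineno, "getattr() with default"))
    return sorted(findings)


sample = '''
if hasattr(obj, "method"):
    result = obj.method()
value = getattr(obj, "attribute", None)
method = getattr(registry, name)
'''
findings = find_defensive_calls(sample)
```

The two-argument `getattr` on the last line of the sample is not reported, since it fails loudly when the attribute is missing.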
Step 2: Apply Patterns
Create ABCs for similar functionality
Extract dependencies to constructor parameters
Remove dispatch layers in favor of direct method calls
Use Generic[T] and metaprogramming for true genericism
Replace defensive code with explicit error handling
Standardize interfaces across subsystems
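As one possible reading of the `Generic[T]` item, the sketch below shows a typed registry that can serve several subsystems with a single implementation. `Registry` is a hypothetical class written for this example.

```python
from typing import Callable, Dict, Generic, TypeVar

T = TypeVar("T")


class Registry(Generic[T]):
    """Hypothetical typed registry with fail-loud registration and lookup."""

    def __init__(self) -> None:
        self._items: Dict[str, T] = {}

    def register(self, name: str, item: T) -> None:
        if name in self._items:
            raise ValueError(f"Duplicate registration: {name}")
        self._items[name] = item

    def get(self, name: str) -> T:
        return self._items[name]  # Unknown name raises KeyError


functions: Registry[Callable[[int], int]] = Registry()
functions.register("double", lambda x: x * 2)
doubled = functions.get("double")(21)
```

The type parameter lets a static checker verify what each registry holds, while lookups and duplicate registrations still fail loudly at runtime.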
Step 3: Validate Changes
Refactoring validation checklist:
[ ] All I/O operations go through FileManager
[ ] No defensive programming patterns remain
[ ] Error handling only where errors are expected
[ ] Enums used instead of magic strings
[ ] ABCs define clear contracts
[ ] Functions maintain 3D→3D contracts where applicable
[ ] Breadth-first traversal used for recursive operations
[ ] Lazy resolution follows step → pipeline → global hierarchy
Decision Framework
When to Refactor
Refactor when:
Code violates OpenHCS architectural principles
Defensive programming patterns are present
Magic strings are used instead of enums
Similar functionality lacks consistent interfaces
Direct file system access bypasses FileManager
Don’t refactor when:
Code already follows OpenHCS patterns
Changes would break existing contracts
Refactoring adds complexity without clear benefit
Architectural Decision Process
1. Identify the domain: Configuration, I/O, processing, or UI?
2. Choose paradigm: OOP for contracts/state, FP for transformations
3. Apply patterns: Use established OpenHCS patterns for the domain
4. Validate design: Ensure fail-loud behavior and clear contracts
5. Test integration: Verify compatibility with existing systems
This framework ensures consistent, maintainable, and robust code across the OpenHCS codebase.