DtypeConfig System

Module: openhcs.core.config Status: STABLE

Overview

Traditional image processing pipelines hardcode dtype conversion assumptions for each backend. The DtypeConfig system eliminates these assumptions by providing a unified configuration that controls dtype conversion behavior across all memory backends (NumPy, CuPy, PyTorch, TensorFlow).

Quick Reference

from openhcs.core.config import GlobalPipelineConfig, DtypeConfig, DtypeConversion

# Set global default: preserve input dtype with scaling
config = GlobalPipelineConfig(
    dtype_config=DtypeConfig(
        default_dtype_conversion=DtypeConversion.PRESERVE_INPUT
    )
)

# Override per-step: use native output dtype (no scaling)
step = FunctionStep(
    function='my_gpu_function',
    dtype_config=DtypeConfig(
        default_dtype_conversion=DtypeConversion.NATIVE_OUTPUT
    )
)

Configuration Hierarchy

DtypeConfig follows the standard OpenHCS configuration hierarchy:

Priority Order (highest to lowest):

  1. Step-level override: FunctionStep(dtype_config=...)

  2. Pipeline-level default: PipelineConfig(dtype_config=...)

  3. Global default: GlobalPipelineConfig(dtype_config=...)

This enables setting a global policy while allowing per-step exceptions for specific processing requirements.

Conversion Modes

NATIVE_OUTPUT (Default)

Backend functions return their natural output dtype without scaling.

# GPU function returns float32 [0.0, 1.0]
# Output: float32 [0.0, 1.0] (no conversion)

dtype_config = DtypeConfig(
    default_dtype_conversion=DtypeConversion.NATIVE_OUTPUT
)

When to use: GPU processing pipelines where float32 is preferred, or when downstream steps expect normalized float values.

PRESERVE_INPUT

Backend functions scale output to match input dtype range.

# Input: uint16 [0, 65535]
# GPU function returns: float32 [0.0, 1.0]
# Output: uint16 [0, 65535] (scaled back)

dtype_config = DtypeConfig(
    default_dtype_conversion=DtypeConversion.PRESERVE_INPUT
)

When to use: Mixed CPU/GPU pipelines where dtype consistency is critical, or when saving results that must match input format.

Backend Integration

DtypeConfig is automatically injected into every memory backend decorator:

from openhcs.core.memory.decorators import numpy_memory, cupy_memory

@numpy_memory
def process_numpy(image: np.ndarray, dtype_config: DtypeConfig = None) -> np.ndarray:
    # dtype_config automatically injected by decorator
    # Conversion applied based on dtype_config.default_dtype_conversion
    return processed_image

@cupy_memory
def process_cupy(image: cp.ndarray, dtype_config: DtypeConfig = None) -> cp.ndarray:
    # Same injection pattern for all backends
    return processed_image

The decorator system handles:

  • Automatic injection: dtype_config parameter added to function signature

  • Context resolution: Resolves from step → pipeline → global hierarchy

  • Conversion application: Applies scaling based on configuration mode

  • Type preservation: Maintains dtype semantics across backend boundaries

Scaling Semantics

Range-Based Transformation

Scaling uses dtype-specific ranges to preserve data fidelity:

# uint8 → float32 conversion
# Input range: [0, 255]
# Output range: [0.0, 1.0]
# Formula: float_value = uint8_value / 255.0

# float32 → uint16 conversion
# Input range: [0.0, 1.0]
# Output range: [0, 65535]
# Formula: uint16_value = float_value * 65535.0

Supported dtypes: uint8, uint16, uint32, int8, int16, int32, float16, float32, float64

Dtype-Preserving Percentile Normalization

Percentile normalization maintains dtype while rescaling intensity range:

from openhcs.processing.backends.processors.percentile_utils import (
    percentile_normalize_preserve_dtype
)

# Input: uint16 [100, 50000] (raw microscopy data)
# Normalize to [2%, 98%] percentiles
# Output: uint16 [0, 65535] (full dynamic range)

normalized = percentile_normalize_preserve_dtype(
    image,
    lower_percentile=2.0,
    upper_percentile=98.0
)

This enables contrast enhancement without dtype conversion overhead.

Common Patterns

Global GPU Pipeline

# All GPU steps use native float32 output
config = GlobalPipelineConfig(
    dtype_config=DtypeConfig(
        default_dtype_conversion=DtypeConversion.NATIVE_OUTPUT
    )
)

Mixed CPU/GPU Pipeline

# Global: preserve dtype for consistency
global_config = GlobalPipelineConfig(
    dtype_config=DtypeConfig(
        default_dtype_conversion=DtypeConversion.PRESERVE_INPUT
    )
)

# GPU step: override to use native float32
gpu_step = FunctionStep(
    function='gpu_denoising',
    dtype_config=DtypeConfig(
        default_dtype_conversion=DtypeConversion.NATIVE_OUTPUT
    )
)

Per-Step Override

# Most steps preserve dtype
pipeline_config = PipelineConfig(
    dtype_config=DtypeConfig(
        default_dtype_conversion=DtypeConversion.PRESERVE_INPUT
    )
)

# Final visualization step uses native float32
viz_step = FunctionStep(
    function='create_overlay',
    dtype_config=DtypeConfig(
        default_dtype_conversion=DtypeConversion.NATIVE_OUTPUT
    )
)

Implementation Notes

🔬 Source Code:

  • Configuration: openhcs/core/config.py (line 289)

  • Decorator injection: openhcs/core/memory/decorators.py (line 18)

  • Scaling logic: openhcs/core/memory/dtype_scaling.py (line 15)

  • Percentile utils: openhcs/processing/backends/processors/percentile_utils.py (line 121)

🏗️ Architecture:

  • ../architecture/memory-type-system - Memory backend architecture

  • ../architecture/configuration-management-system - Configuration hierarchy

📊 Performance:

  • Zero-copy when input/output dtypes match

  • Single scaling operation when conversion needed

  • Framework-specific optimizations (CuPy uses GPU kernels)

Key Design Decisions

Why inject into every backend?

Ensures consistent dtype behavior across all memory types without requiring backend-specific configuration.

Why default to NATIVE_OUTPUT?

GPU backends naturally produce float32, and forcing conversion adds overhead. Users opt-in to dtype preservation when needed.

Why separate from ProcessingConfig?

Dtype conversion is orthogonal to processing logic (grouping, variable components). Separate config enables independent evolution.

Common Gotchas

  • Don’t assume dtype preservation: Default mode is NATIVE_OUTPUT, which may change dtype

  • Percentile normalization preserves dtype: Use percentile_normalize_preserve_dtype when dtype consistency is critical

  • Step overrides are absolute: Step-level dtype_config completely replaces pipeline/global config (no merging)

  • Scaling is range-based: Float values outside [0.0, 1.0] will be clipped when converting to integer dtypes