# Analysis Consolidation System

OpenHCS includes an analysis consolidation system that runs automatically after a pipeline finishes. It aggregates the per-well CSV analysis results into MetaXpress-compatible summary tables, so the output can be loaded by existing microscopy analysis workflows.

## Overview

The analysis consolidation system addresses the challenge of managing multiple analysis outputs across wells and analysis types. Its main features are:

  • Automatic aggregation: Consolidates CSV files from multiple wells into single summary tables

  • MetaXpress compatibility: Generates output in a format compatible with MetaXpress analysis tools

  • Pipeline integration: Automatically triggers after pipeline completion

  • Flexible configuration: Configurable output formats, file patterns, and metadata

## Architecture Components

### Core Modules

  • consolidate_analysis_results.py: Primary consolidation engine that processes CSV files and creates summary tables

  • consolidate_special_outputs.py: Handles special output types and custom aggregation patterns

  • metaxpress.py: MetaXpress format compatibility and legacy analysis support

### Configuration Classes

The system uses two main configuration dataclasses:

from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class AnalysisConsolidationConfig:
    """Configuration for automatic analysis results consolidation."""
    enabled: bool = True
    metaxpress_style: bool = True
    well_pattern: str = r"([A-Z]\d{2})"
    file_extensions: tuple[str, ...] = (".csv",)
    exclude_patterns: tuple[str, ...] = (r".*consolidated.*", r".*metaxpress.*", r".*summary.*")
    output_filename: str = "metaxpress_style_summary.csv"

@dataclass(frozen=True)
class PlateMetadataConfig:
    """Configuration for plate metadata in MetaXpress-style output."""
    barcode: Optional[str] = None
    plate_name: Optional[str] = None
    plate_id: Optional[str] = None
    description: Optional[str] = None
    acquisition_user: str = "OpenHCS"
    z_step: str = "1"
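
Because both classes are declared with `frozen=True`, their fields cannot be reassigned after construction. The following is a minimal sketch of how overrides might be applied instead, either through the constructor or through `dataclasses.replace`. The classes here are trimmed local stand-ins that mirror a few of the fields above so the snippet runs on its own; they are not the OpenHCS imports, and the barcode and plate name are made-up values.

```python
from dataclasses import dataclass, replace, FrozenInstanceError
from typing import Optional


# Trimmed stand-ins mirroring the definitions above (not the OpenHCS imports).
@dataclass(frozen=True)
class AnalysisConsolidationConfig:
    enabled: bool = True
    well_pattern: str = r"([A-Z]\d{2})"
    output_filename: str = "metaxpress_style_summary.csv"


@dataclass(frozen=True)
class PlateMetadataConfig:
    barcode: Optional[str] = None
    plate_name: Optional[str] = None
    acquisition_user: str = "OpenHCS"


# Override defaults at construction time (illustrative values).
metadata = PlateMetadataConfig(barcode="PLATE-0001", plate_name="Example plate")

# Derive a modified copy; the original instance is left untouched.
defaults = AnalysisConsolidationConfig()
custom = replace(defaults, well_pattern=r"([A-Z]\d{1,2})")

# Direct assignment is rejected on frozen dataclasses.
try:
    defaults.enabled = False
    assignment_allowed = True
except FrozenInstanceError:
    assignment_allowed = False
```

Deriving copies rather than mutating a shared instance keeps a configuration stable while a plate is being processed.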

## Orchestrator Integration

The analysis consolidation system integrates directly with the PipelineOrchestrator and runs automatically after plate execution:

# In orchestrator.py
def run_compiled_plate(self, compiled_contexts: Dict[str, Any]) -> Dict[str, Any]:
    # ... execute pipeline ...

    # Run automatic analysis consolidation if enabled
    shared_context = get_current_global_config(GlobalPipelineConfig)
    if shared_context.analysis_consolidation_config.enabled:
        # Find results directory from compiled contexts
        results_dir = self._find_results_directory(compiled_contexts)

        if results_dir and results_dir.exists():
            csv_files = list(results_dir.glob("*.csv"))
            if csv_files:
                consolidate_analysis_results(
                    results_directory=str(results_dir),
                    well_ids=axis_ids,  # List of well IDs to consolidate
                    consolidation_config=shared_context.analysis_consolidation_config,
                    plate_metadata_config=shared_context.plate_metadata_config
                )

Automatic Triggering: The system automatically detects when analysis results are available and triggers consolidation without user intervention.

## MetaXpress Format Support

### Header Structure

The system generates MetaXpress-compatible headers with plate metadata:

from typing import Dict, List

import pandas as pd

def create_metaxpress_header(summary_df: pd.DataFrame, plate_metadata: Dict[str, str]) -> List[List[str]]:
    """Create MetaXpress-style header rows with metadata."""
    header_rows = [
        ['Barcode', plate_metadata.get('barcode', 'OpenHCS-Plate')],
        ['Plate Name', plate_metadata.get('plate_name', 'OpenHCS Analysis Results')],
        ['Plate ID', plate_metadata.get('plate_id', '00000')],
        ['Description', plate_metadata.get('description', 'Consolidated analysis results')],
        ['Acquisition User', plate_metadata.get('acquisition_user', 'OpenHCS')],
        ['Z Step', plate_metadata.get('z_step', '1')]
    ]
    return header_rows

### Column Organization

MetaXpress-style column ordering:

  1. Well column first: Primary identifier for each row

  2. Grouped by analysis type: Columns grouped by analysis method

  3. Sorted within groups: Consistent ordering within each analysis type
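
The ordering rules above can be sketched as a single sort. This is an illustration only: `order_columns` is a hypothetical helper, it assumes column names carry the analysis type as a trailing `(analysis_type)` suffix, and it uses alphabetical order within each group as one possible "consistent ordering". The actual OpenHCS ordering may differ.

```python
import re
from typing import List

# Hypothetical helper: assumes a "Metric (analysis_type)" naming scheme;
# the real OpenHCS ordering logic may differ.
_SUFFIX = re.compile(r"\(([^()]+)\)\s*$")


def order_columns(columns: List[str], well_column: str = "Well") -> List[str]:
    """Return columns with the well column first, then grouped and sorted."""

    def sort_key(name: str):
        match = _SUFFIX.search(name)
        analysis_type = match.group(1) if match else ""
        # Group by analysis type, then sort alphabetically inside the group.
        return (analysis_type, name)

    rest = sorted((c for c in columns if c != well_column), key=sort_key)
    return [well_column] + rest if well_column in columns else rest


ordered = order_columns([
    "Intensity Mean (intensity_analysis)",
    "Cell Count (cell_counting)",
    "Well",
    "Cell Area Mean (cell_counting)",
])
```

Here `ordered` starts with `Well`, followed by the two `cell_counting` columns and then the `intensity_analysis` column.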

### Output Format

The consolidated output follows MetaXpress conventions:

Barcode,OpenHCS-Plate-001
Plate Name,Cell Analysis Results
Plate ID,12345
Description,Consolidated analysis results from OpenHCS pipeline: 96 wells analyzed
Acquisition User,OpenHCS
Z Step,1
Well,Cell Count (cell_counting),Cell Area Mean (cell_counting),Intensity Mean (intensity_analysis)
A01,245,156.7,0.823
A02,198,142.3,0.756
...
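
To make the layout concrete, here is a minimal sketch that serializes metadata rows, the column header, and one row per well using only the standard library. `write_metaxpress_csv` is a hypothetical name, not the OpenHCS writer, and the data is a shortened version of the example above.

```python
import csv
import io
from typing import Dict, List, Sequence


def write_metaxpress_csv(
    header_rows: Sequence[Sequence[str]],
    columns: Sequence[str],
    records: List[Dict[str, object]],
) -> str:
    """Serialize metadata rows, then the column header, then one row per well."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(header_rows)  # key/value metadata lines
    writer.writerow(columns)       # table header starting with "Well"
    for record in records:
        # Missing measurements are written as empty cells.
        writer.writerow([record.get(col, "") for col in columns])
    return buffer.getvalue()


text = write_metaxpress_csv(
    header_rows=[["Barcode", "OpenHCS-Plate-001"], ["Z Step", "1"]],
    columns=["Well", "Cell Count (cell_counting)"],
    records=[
        {"Well": "A01", "Cell Count (cell_counting)": 245},
        {"Well": "A02", "Cell Count (cell_counting)": 198},
    ],
)
lines = text.splitlines()
```

Because the metadata lines precede the table, a reader that expects a plain CSV has to skip them first. With pandas, for a file that has six metadata lines as in the example above, `pd.read_csv(path, skiprows=6)` would start parsing at the `Well` header row.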

## Pipeline Function Integration

The system provides a pipeline-compatible function for use in FunctionStep objects:

@numpy_func
@special_outputs(("consolidated_results", materialize_consolidated_results))
def consolidate_analysis_results_pipeline(
    image_stack: np.ndarray,
    results_directory: str,
    consolidation_config: AnalysisConsolidationConfig,
    plate_metadata_config: PlateMetadataConfig
) -> tuple[np.ndarray, pd.DataFrame]:
    """Pipeline-compatible version of consolidate_analysis_results."""

    summary_df = consolidate_analysis_results(
        results_directory=results_directory,
        consolidation_config=consolidation_config,
        plate_metadata_config=plate_metadata_config,
        output_path=None  # Handled by materialization
    )

    return image_stack, summary_df

Special Outputs Integration: Uses the @special_outputs decorator to handle DataFrame materialization through the OpenHCS special outputs system.

## File Pattern Recognition

### Well ID Extraction

The system uses configurable regex patterns to extract well IDs from filenames:

# Default pattern for standard 96/384-well plates
well_pattern = r"([A-Z]\d{2})"  # Matches A01, B12, etc.

# Custom patterns can be configured
well_pattern = r"([A-Z]\d{1,2})"  # Matches A1, A01, etc.
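
A short sketch of how such a pattern could be applied to filenames follows. `extract_well_id` is a hypothetical helper, and the use of `re.search` (first match anywhere in the name) is an assumption about the matching strategy. The filenames are made up.

```python
import re
from typing import Optional


def extract_well_id(filename: str, well_pattern: str = r"([A-Z]\d{2})") -> Optional[str]:
    """Return the first capture group matched in the filename, or None."""
    match = re.search(well_pattern, filename)
    return match.group(1) if match else None


strict = extract_well_id("A01_cell_counts.csv")
missing = extract_well_id("A1_cell_counts.csv")
relaxed = extract_well_id("A1_cell_counts.csv", r"([A-Z]\d{1,2})")
ambiguous = extract_well_id("T01_B03_counts.csv")
```

With the default pattern, `strict` is `"A01"` and `missing` is `None`, since `A1` has only one digit; the relaxed pattern recovers `"A1"`. The last call returns `"T01"` rather than `"B03"`, which shows that any earlier uppercase-letter-plus-digits token in a filename will be picked up first, so the pattern may need anchoring for naming schemes that include such tokens.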

### File Filtering

  • Include patterns: File extensions to process (default: .csv)

  • Exclude patterns: Patterns to skip (consolidated files, summaries, etc.)

file_extensions = (".csv",)
exclude_patterns = (r".*consolidated.*", r".*metaxpress.*", r".*summary.*")
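
The two filters can be combined as in the sketch below. `select_result_files` is a hypothetical helper whose defaults copy the values shown above. Matching exclude patterns against the file name only, and doing so case-insensitively, are assumptions made for this example.

```python
import re
from pathlib import Path
from typing import Iterable, List, Sequence


def select_result_files(
    paths: Iterable[str],
    file_extensions: Sequence[str] = (".csv",),
    exclude_patterns: Sequence[str] = (r".*consolidated.*", r".*metaxpress.*", r".*summary.*"),
) -> List[str]:
    """Keep files with an allowed extension whose name matches no exclude pattern."""
    # Assumption: patterns are applied case-insensitively to the file name.
    compiled = [re.compile(p, re.IGNORECASE) for p in exclude_patterns]
    selected = []
    for path in paths:
        name = Path(path).name
        if Path(path).suffix.lower() not in file_extensions:
            continue  # wrong extension
        if any(rx.match(name) for rx in compiled):
            continue  # looks like a previously consolidated output
        selected.append(path)
    return selected


kept = select_result_files([
    "results/A01_cell_counts.csv",
    "results/A01_overlay.png",
    "results/metaxpress_style_summary.csv",
    "results/consolidated_intensity.csv",
])
```

Only `results/A01_cell_counts.csv` survives. Excluding names that contain `summary` or `consolidated` keeps the output of an earlier run from being read back in as input on the next run.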

## Analysis Type Detection

The system automatically detects analysis types from filename patterns:

from pathlib import Path

def detect_analysis_type(file_path: str) -> str:
    """Detect analysis type from filename patterns."""
    filename = Path(file_path).stem.lower()

    # Common analysis type patterns
    if 'cell_count' in filename or 'counting' in filename:
        return 'cell_counting'
    elif 'intensity' in filename or 'fluorescence' in filename:
        return 'intensity_analysis'
    elif 'morphology' in filename or 'shape' in filename:
        return 'morphology_analysis'
    else:
        return 'general_analysis'
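
The same rules can also be expressed as an ordered table, which makes the precedence between branches explicit. This is an alternative sketch of the logic above, not the OpenHCS implementation; the `RULES` name and the `rules` parameter are introduced here for illustration, and the filenames are made up.

```python
from pathlib import Path
from typing import Sequence, Tuple

# Ordered (keywords, analysis_type) rules mirroring the if/elif chain above.
# Earlier rules win when a filename matches more than one.
RULES: Sequence[Tuple[Tuple[str, ...], str]] = (
    (("cell_count", "counting"), "cell_counting"),
    (("intensity", "fluorescence"), "intensity_analysis"),
    (("morphology", "shape"), "morphology_analysis"),
)


def detect_analysis_type(file_path: str, rules=RULES) -> str:
    """Return the analysis type of the first rule whose keyword is in the stem."""
    stem = Path(file_path).stem.lower()
    for keywords, analysis_type in rules:
        if any(keyword in stem for keyword in keywords):
            return analysis_type
    return "general_analysis"


counts = detect_analysis_type("A01_Cell_Counts.csv")
mixed = detect_analysis_type("C03_shape_intensity.csv")
other = detect_analysis_type("D04_tracks.csv")
```

`counts` is `"cell_counting"` and `other` falls back to `"general_analysis"`. `mixed` resolves to `"intensity_analysis"` because the intensity rule is checked before the morphology rule, so filenames that mention several analysis types are classified by rule order rather than by word order.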

## Configuration Integration

The analysis consolidation system integrates with the global configuration system:

from dataclasses import dataclass, field

@dataclass(frozen=True)
class GlobalPipelineConfig:
    # ... other config fields ...

    # Field names match the attributes read in run_compiled_plate
    analysis_consolidation_config: AnalysisConsolidationConfig = field(default_factory=AnalysisConsolidationConfig)
    """Configuration for automatic analysis results consolidation."""

    plate_metadata_config: PlateMetadataConfig = field(default_factory=PlateMetadataConfig)
    """Configuration for plate metadata in consolidated outputs."""

Global Context: Configuration is accessible throughout the pipeline execution via the global context system.

## Benefits and Use Cases

### Workflow Integration

  • Seamless MetaXpress compatibility: Direct import into existing analysis workflows

  • Automatic execution: No manual consolidation steps required

  • Consistent formatting: Standardized output across different analysis types

### Data Management

  • Single summary files: Reduces file proliferation from multi-well analyses

  • Structured metadata: Preserves experimental context and plate information

  • Flexible aggregation: Configurable summarization strategies per analysis type

### Analysis Efficiency

  • Immediate availability: Consolidated results available immediately after pipeline completion

  • Standard format: Compatible with downstream statistical analysis tools

  • Quality control: Consistent data structure enables automated validation

Together, these features let OpenHCS analysis results flow into existing laboratory data management and analysis pipelines without manual reformatting, which matters most in high-throughput microscopy workflows.