ROI Extraction and Streaming System

Status: Production | Introduced: 2025-10 | Stability: Stable

The Problem: Scattered ROI Handling

Cell segmentation and counting pipelines generate labeled masks that need to be converted into regions of interest (ROIs) for further analysis. Without a unified ROI system, each visualization tool (Napari, Fiji, OMERO) requires custom code to extract and format ROIs. Additionally, ROIs need to be materialized to multiple backends simultaneously (disk for archival, OMERO for server storage, Napari for visualization), creating complex coordination logic scattered throughout the codebase.

The Solution: Unified ROI Extraction and Multi-Backend Materialization

The ROI (Region of Interest) system provides backend-agnostic extraction, representation, and materialization of regions of interest from segmentation masks. ROIs are automatically extracted from labeled masks generated by cell counting and segmentation functions, then materialized to multiple backends simultaneously (disk, OMERO, Napari, Fiji) through a unified interface.

Architecture

Core Components

ROIs are represented using immutable dataclasses defined in openhcs/core/roi.py:

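from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass(frozen=True)
class ROI:
    """Region of Interest with metadata."""
    shapes: List[Any]  # Shape objects (see Shape Types below)
    metadata: Dict[str, Any] = field(default_factory=dict)  # e.g. label, area, centroid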

Shape Types:

PolygonShape: Closed polygon defined by vertex coordinates (Nx2 array of (y, x))
MaskShape: Binary mask with bounding box
PointShape: Single point (y, x)
EllipseShape: Ellipse with center and radii
PolylineShape: Open path defined by (y, x) coordinates (used for skeleton branches)
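
To illustrate what "immutable" means in practice, here is a minimal, self-contained sketch. The PointShape fields (y, x) are an assumption inferred from the list above; they are not copied from openhcs/core/roi.py, and the real classes may differ.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class PointShape:
    """Hypothetical stand-in for the real PointShape: a single (y, x) point."""
    y: float
    x: float


@dataclass(frozen=True)
class ROI:
    """Region of Interest with metadata (mirrors the definition above)."""
    shapes: List[Any]
    metadata: Dict[str, Any] = field(default_factory=dict)


roi = ROI(shapes=[PointShape(y=12.5, x=40.0)], metadata={"label": 1})

# Rebinding an attribute on a frozen dataclass raises FrozenInstanceError.
try:
    roi.metadata = {}
    rebind_blocked = False
except FrozenInstanceError:
    rebind_blocked = True
```

Note that frozen=True only blocks attribute rebinding. The shapes list and metadata dict can still be mutated in place, so they are best treated as read-only by convention.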

ROIs are extracted from labeled segmentation masks using scikit-image:

from openhcs.core.roi import extract_rois_from_labeled_mask

rois = extract_rois_from_labeled_mask(
    labeled_mask,
    min_area=10,  # Minimum area in pixels
    extract_contours=True  # Extract polygon contours vs binary masks
)

Extraction Process:

Use skimage.measure.regionprops to get region properties
Filter regions by minimum area threshold
Extract metadata (label, area, perimeter, centroid, bbox)
Extract shapes (polygon contours or binary masks)
Create immutable ROI objects
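
The filtering and metadata steps above can be sketched without any dependencies. This is a simplified stand-in written in plain Python, not the actual implementation: it replaces skimage.measure.regionprops with a manual pixel scan, omits perimeter and contour extraction, and assumes that regions with an area equal to min_area are kept. The function name extract_region_metadata is hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List, Sequence


def extract_region_metadata(
    labeled_mask: Sequence[Sequence[int]], min_area: int = 10
) -> List[Dict[str, Any]]:
    """Collect per-label metadata from a 2D labeled mask (0 = background)."""
    # Group pixel coordinates by label (stand-in for regionprops).
    pixels = defaultdict(list)
    for y, row in enumerate(labeled_mask):
        for x, label in enumerate(row):
            if label != 0:
                pixels[label].append((y, x))

    regions = []
    for label in sorted(pixels):
        coords = pixels[label]
        area = len(coords)
        if area < min_area:  # Drop regions below the area threshold
            continue
        ys = [y for y, _ in coords]
        xs = [x for _, x in coords]
        regions.append({
            "label": label,
            "area": area,
            "centroid": (sum(ys) / area, sum(xs) / area),
            # Half-open (min_row, min_col, max_row, max_col), the regionprops convention
            "bbox": (min(ys), min(xs), max(ys) + 1, max(xs) + 1),
        })
    return regions


mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 2],
    [0, 0, 0, 0],
]
regions = extract_region_metadata(mask, min_area=2)  # label 2 (area 1) is filtered out
```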

Skeleton masks require specialized extraction to preserve branch topology and connectivity. The standard extract_rois_from_labeled_mask() approach fragments skeletons at junctions, losing connectivity information.

Problem with Standard Extraction:

Labeling a skeleton with scipy.ndimage.label() treats it as an ordinary binary mask, which:

Fragments branches at junction points
Loses connectivity and topology information
Creates disconnected regions instead of continuous paths
Cannot distinguish individual branches

Solution: skan Branch Path Extraction:

Use skan.Skeleton to extract actual branch paths:

from skan import Skeleton
from openhcs.core.roi import PolylineShape, ROI

# skeleton_mask: binary skeleton for one Z-slice
# z_idx: index of that Z-slice (provided by the calling code)
skeleton_obj = Skeleton(skeleton_mask)

# Extract each branch as a separate ROI
rois = []
for branch_idx in range(skeleton_obj.n_paths):
    # (N, 2) array of (row, col) = (y, x) pixel coordinates along this branch
    path_coords = skeleton_obj.path_coordinates(branch_idx)

    # PolylineShape represents an open path (not a closed polygon)
    shape = PolylineShape(coordinates=path_coords)

    # Create ROI with metadata
    metadata = {
        'position': z_idx,
        'label': f'Skeleton_Z{z_idx:03d}_Branch{branch_idx:03d}',
        'branch_index': branch_idx,
        'path_length': len(path_coords)  # Number of pixels on the path
    }

    rois.append(ROI(shapes=[shape], metadata=metadata))

Benefits:

Preserves skeleton topology and connectivity
One ROI per continuous branch (not per connected component)
Compatible with skan’s graph-based skeleton analysis
Enables visual validation of skeleton analysis results
Proper polyline ROIs that represent skeleton structure

Implementation: openhcs/processing/backends/analysis/skan_axon_analysis.py::_skeleton_mask_to_rois()
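
The per-branch metadata can be tried out without skan installed. The sketch below rebuilds the metadata dict from the loop above for a hand-written path. The helper polyline_length is hypothetical and not part of the documented code; it is included only to show that path_length counts pixels, which differs from the geometric length of the path.

```python
import math
from typing import Any, Dict, Sequence, Tuple

Coord = Tuple[float, float]  # (y, x)


def branch_metadata(z_idx: int, branch_idx: int, path_coords: Sequence[Coord]) -> Dict[str, Any]:
    """Build the metadata dict used for one skeleton branch ROI."""
    return {
        "position": z_idx,
        "label": f"Skeleton_Z{z_idx:03d}_Branch{branch_idx:03d}",
        "branch_index": branch_idx,
        "path_length": len(path_coords),  # Pixel count, not geometric length
    }


def polyline_length(path_coords: Sequence[Coord]) -> float:
    """Hypothetical helper: sum of Euclidean distances between consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(path_coords, path_coords[1:]))


path = [(0, 0), (1, 1), (2, 2)]  # A short diagonal branch
meta = branch_metadata(z_idx=2, branch_idx=7, path_coords=path)
length = polyline_length(path)
```

For this diagonal path, path_length is 3 while the geometric length is 2·√2, so the two should not be used interchangeably in downstream measurements.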

ROIs are materialized to multiple backends simultaneously during pipeline execution:

from typing import List, Optional, Union

import numpy as np

def materialize_segmentation_masks(
    data: List[np.ndarray],
    path: str,
    filemanager,
    backends: Union[str, List[str]],
    backend_kwargs: Optional[dict] = None
) -> str:
    """Materialize segmentation masks as ROIs to multiple backends."""

Backend-Specific Formats:

Disk: JSON file + ImageJ ROI format (.roi files)
OMERO: omero.model.RoiI objects linked to images
Napari: ZMQ streaming as shapes layers
Fiji: ZMQ streaming as ImageJ ROIs to ROI Manager
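
Because backends accepts either a single name or a list, a materializer has to normalize it before dispatching. The following is a minimal sketch of that fan-out, not the actual OpenHCS code. The writers registry, the backend names used as keys, and the assumption that backend_kwargs is keyed by backend name are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional, Union


def normalize_backends(backends: Union[str, List[str]]) -> List[str]:
    """Accept a single backend name or a list of names."""
    return [backends] if isinstance(backends, str) else list(backends)


def materialize_to_backends(
    rois: List[Any],
    path: str,
    backends: Union[str, List[str]],
    writers: Dict[str, Callable[..., str]],
    backend_kwargs: Optional[Dict[str, Dict[str, Any]]] = None,
) -> List[str]:
    """Send the same ROIs to every requested backend and collect the results."""
    backend_kwargs = backend_kwargs or {}
    results = []
    for name in normalize_backends(backends):
        kwargs = backend_kwargs.get(name, {})  # Per-backend options (assumed layout)
        results.append(writers[name](rois, path, **kwargs))
    return results


# Fake writers that only record what they were asked to do.
calls = []

def fake_writer(tag: str) -> Callable[..., str]:
    def write(rois, path, **kwargs):
        calls.append((tag, path, kwargs))
        return f"{tag}:{path}"
    return write

writers = {"disk": fake_writer("disk"), "napari_stream": fake_writer("napari_stream")}
results = materialize_to_backends(
    ["roi"], "A01_rois.json", ["disk", "napari_stream"], writers,
    backend_kwargs={"napari_stream": {"napari_port": 5559}},
)
```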

Per-Channel Materialization

For dict patterns (e.g., processing multiple channels), ROIs are materialized separately for each channel to preserve channel identity:

Path Structure:

# Single channel
A01_segmentation_masks_step7_rois.json

# Multi-channel (dict pattern)
A01_w1_segmentation_masks_step7_rois.json  # Channel 1
A01_w2_segmentation_masks_step7_rois.json  # Channel 2
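
The naming scheme above can be reproduced with a small helper. This is an illustrative sketch only: the function roi_json_name and its parameters are hypothetical, and the pattern is inferred from the example filenames rather than taken from the codebase.

```python
from typing import Optional


def roi_json_name(
    well_id: str, output_key: str, step_index: int, channel: Optional[str] = None
) -> str:
    """Build an ROI filename; the channel part appears only for dict patterns."""
    parts = [well_id]
    if channel is not None:
        parts.append(f"w{channel}")
    parts.append(f"{output_key}_step{step_index}_rois.json")
    return "_".join(parts)


single = roi_json_name("A01", "segmentation_masks", 7)
multi = [roi_json_name("A01", "segmentation_masks", 7, channel=c) for c in ("1", "2")]
```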

Implementation (openhcs/core/steps/function_step.py):

# For dict patterns, materialize each channel separately
if is_dict_pattern and dict_keys:
    for dict_key in dict_keys:
        # Construct channel-specific path
        channel_path = f"{dir_part}/{well_id}_w{dict_key}_{rest}"

        # Load channel data
        channel_data = filemanager.load(channel_path, Backend.MEMORY.value)

        # Materialize to all backends
        result_path = mat_func(channel_data, str(analysis_path),
                               filemanager, backends, backend_kwargs)

Backend-Specific Implementation

Disk Backend

Location: openhcs/io/disk.py

Saves ROIs in two formats:

JSON Format: Human-readable ROI data with metadata
ImageJ Format: Binary .roi files for ImageJ/Fiji compatibility

def _save_rois(self, rois: List, output_path: Path, **kwargs) -> str:
    """Save ROIs to disk in JSON and ImageJ formats."""

    # Save JSON
    roi_data = [self._roi_to_dict(roi) for roi in rois]
    json_path = output_path.with_suffix('.json')
    json_path.write_text(json.dumps(roi_data, indent=2))

    # Save ImageJ ROIs
    imagej_dir = output_path.with_name(f"{output_path.stem}_rois")
    for roi in rois:
        # Convert to ImageJ format using roifile library
        ...
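
The JSON half of this method relies on _roi_to_dict, which is not shown. Below is one plausible shape for such a converter, offered as a sketch under stated assumptions: the dict layout, the PolygonShape stand-in, and the function name roi_to_dict are hypothetical and may not match what openhcs/io/disk.py actually writes.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class PolygonShape:
    """Hypothetical stand-in: polygon vertices as (y, x) pairs."""
    coordinates: List[Tuple[float, float]]


@dataclass(frozen=True)
class ROI:
    shapes: List[Any]
    metadata: Dict[str, Any] = field(default_factory=dict)


def roi_to_dict(roi: ROI) -> Dict[str, Any]:
    """Convert an ROI into plain lists and dicts that json can serialize."""
    return {
        "metadata": dict(roi.metadata),
        "shapes": [
            {"type": "polygon", "coordinates": [list(p) for p in s.coordinates]}
            for s in roi.shapes
            if isinstance(s, PolygonShape)
        ],
    }


roi = ROI(shapes=[PolygonShape([(0, 0), (0, 4), (3, 4)])], metadata={"label": 1})
text = json.dumps([roi_to_dict(roi)], indent=2)
restored = json.loads(text)  # Round trip to confirm the data survives
```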

OMERO Backend

Location: openhcs/io/omero_local.py

Creates omero.model.RoiI objects and links them to images:

def _save_rois(self, rois: List, output_path: Path, images_dir: str = None, **kwargs) -> str:
    """Create OMERO ROIs and link to images."""

    # Find corresponding image
    image_id = self._find_image_id_from_path(output_path, images_dir)

    # Create OMERO ROI
    omero_roi = omero.model.RoiI()
    omero_roi.setImage(omero.model.ImageI(image_id, False))

    # Add shapes
    for roi in rois:
        for shape in roi.shapes:
            if isinstance(shape, PolygonShape):
                polygon = omero.model.PolygonI()
                # Convert coordinates to OMERO format
                ...
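
The coordinate conversion elided above is not shown in this document. As a hedged illustration of what it involves: the OME model stores polygon vertices as a string of space-separated "x,y" pairs, while ROI shapes here use (y, x) order, so the axes must be swapped. The helper to_omero_points is hypothetical; in the real backend the resulting string would still need to be wrapped and assigned to the polygon through the OMERO API.

```python
from typing import Iterable, Tuple


def to_omero_points(coords_yx: Iterable[Tuple[float, float]]) -> str:
    """Convert (y, x) vertices into an 'x1,y1 x2,y2 ...' points string."""
    return " ".join(f"{float(x)},{float(y)}" for y, x in coords_yx)


points = to_omero_points([(1.0, 2.0), (3.5, 4.0)])
```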

Napari Streaming Backend

Location: openhcs/io/napari_stream.py

Streams ROIs as shapes layers via ZMQ:

def _save_rois(self, rois: List, output_path: Path, napari_port: int, **kwargs) -> str:
    """Stream ROIs to Napari as shapes layer."""

    # Extract layer name from output_path
    layer_name = output_path.stem  # e.g., "A01_w1_segmentation_masks_step7_rois"

    # Convert ROIs to Napari shapes format
    shapes_data = []
    for roi in rois:
        for shape in roi.shapes:
            if isinstance(shape, PolygonShape):
                shapes_data.append({
                    'type': 'polygon',
                    'coordinates': shape.coordinates.tolist(),
                    'metadata': roi.metadata
                })

    # Send via ZMQ
    message = {
        'type': 'shapes',
        'shapes': shapes_data,
        'layer_name': layer_name
    }
    publisher.send_json(message)

Viewer Server (openhcs/runtime/napari_stream_visualizer.py):

def _process_shapes_message(self, data: Dict[str, Any]):
    """Process shapes/ROIs message and add to Napari viewer."""

    shapes_data = data.get('shapes', [])
    layer_name = data.get('layer_name', 'ROIs')

    # Convert to Napari format (one vertex array per polygon)
    napari_shapes = []
    for shape_dict in shapes_data:
        if shape_dict['type'] == 'polygon':
            napari_shapes.append(np.array(shape_dict['coordinates']))

    # Update the existing layer, or create it on first use
    # (self.layers is assumed to map layer names to layer objects)
    if layer_name in self.layers:
        self.layers[layer_name].data = napari_shapes
    else:
        self.layers[layer_name] = self.viewer.add_shapes(
            napari_shapes,
            shape_type='polygon',
            name=layer_name,
            edge_color='red',
            face_color='transparent'
        )
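
The message format exchanged between the streaming backend and the viewer server can be exercised without ZMQ or Napari. The sketch below builds the payload, passes it through json (standing in for the socket), and recovers the polygons on the receiving side. The helper names are hypothetical, and the ROIs are plain dicts rather than real ROI objects.

```python
import json
from typing import Any, Dict, List


def build_shapes_message(rois: List[Dict[str, Any]], layer_name: str) -> Dict[str, Any]:
    """Sender side: pack polygon ROIs into a JSON-serializable message."""
    shapes = [
        {"type": "polygon", "coordinates": roi["coordinates"], "metadata": roi["metadata"]}
        for roi in rois
    ]
    return {"type": "shapes", "shapes": shapes, "layer_name": layer_name}


def polygons_from_message(message: Dict[str, Any]) -> List[List[List[float]]]:
    """Receiver side: keep polygon vertex lists, skip unknown shape types."""
    return [s["coordinates"] for s in message.get("shapes", []) if s["type"] == "polygon"]


rois = [{"coordinates": [[0, 0], [0, 5], [4, 5]], "metadata": {"label": 1}}]
wire = json.dumps(build_shapes_message(rois, "A01_w1_segmentation_masks_step7_rois"))
received = json.loads(wire)
polygons = polygons_from_message(received)
```

One practical caveat: metadata taken from NumPy-based measurements may contain NumPy scalar types, which the standard json module cannot serialize, so converting them with int() or float() before sending may be necessary.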

Fiji Streaming Backend

Location: openhcs/io/fiji_stream.py

Streams ROIs as ImageJ ROIs to ROI Manager via ZMQ:

def _save_rois(self, rois: List, output_path: Path, fiji_port: int, **kwargs) -> str:
    """Stream ROIs to Fiji ROI Manager."""

    from roifile import ImagejRoi
    import base64

    # Extract descriptive prefix
    roi_prefix = output_path.stem  # e.g., "A01_w1_segmentation_masks_step7_rois"

    # Convert ROIs to ImageJ format
    roi_bytes_list = []
    for roi in rois:
        for shape in roi.shapes:
            if isinstance(shape, PolygonShape):
                # Convert to ImageJ ROI
                coords_xy = shape.coordinates[:, [1, 0]]  # Swap to (x, y)
                ij_roi = ImagejRoi.frompoints(coords_xy)

                # Set descriptive name
                ij_roi.name = f"{roi_prefix}_ROI_{roi.metadata['label']}"
                roi_bytes_list.append(ij_roi.tobytes())

    # Encode and send via ZMQ
    encoded_rois = [base64.b64encode(b).decode('utf-8') for b in roi_bytes_list]
    message = {'type': 'rois', 'rois': encoded_rois}
    publisher.send_json(message)

Viewer Server (openhcs/runtime/fiji_viewer_server.py):

def _process_rois_message(self, data: Dict[str, Any]):
    """Process ROIs message and add to Fiji ROI Manager."""
    # Module-level imports assumed: base64, os, tempfile

    rois_encoded = data.get('rois', [])

    # Get ROI Manager
    rm = RoiManager.getInstance()
    if rm is None:
        rm = RoiManager()

    # Decode and add ROIs
    for roi_encoded in rois_encoded:
        roi_bytes = base64.b64decode(roi_encoded)

        # RoiDecoder reads from a file path, so write the bytes to a temporary file
        with tempfile.NamedTemporaryFile(suffix='.roi', delete=False) as tmp:
            tmp.write(roi_bytes)
            tmp_path = tmp.name

        try:
            # Load ROI using ImageJ's RoiDecoder
            java_roi = RoiDecoder(tmp_path).getRoi()
            rm.addRoi(java_roi)
        finally:
            os.unlink(tmp_path)  # delete=False leaves cleanup to us
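
Binary ROI data cannot be placed in a JSON message directly, which is why both sides use base64. The round trip can be checked in isolation with the sketch below. The helper names are hypothetical, and the byte strings are arbitrary placeholders rather than valid ImageJ ROI files.

```python
import base64
import json
from typing import Any, Dict, List


def encode_rois(roi_bytes_list: List[bytes]) -> List[str]:
    """Sender side: base64-encode each binary ROI as ASCII text."""
    return [base64.b64encode(b).decode("utf-8") for b in roi_bytes_list]


def decode_rois(message: Dict[str, Any]) -> List[bytes]:
    """Receiver side: recover the original bytes from the message."""
    return [base64.b64decode(s) for s in message.get("rois", [])]


raw = [b"\x00\x01placeholder", b"\xff\xfe more bytes"]
wire = json.dumps({"type": "rois", "rois": encode_rois(raw)})  # Stands in for send_json
decoded = decode_rois(json.loads(wire))
```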

Integration with Pipeline System

Special Outputs Registration

Functions register ROI outputs using the @special_outputs decorator with writer-based materialization specs:

from openhcs.processing.materialization import MaterializationSpec, CsvOptions, ROIOptions

@special_outputs(
    ("cell_counts", MaterializationSpec(CsvOptions(filename_suffix="_details.csv"))),
    ("segmentation_masks", MaterializationSpec(ROIOptions())),
)
def count_cells_single_channel(..., return_segmentation_mask: bool = False):
    """Count cells and optionally return segmentation masks."""

    if return_segmentation_mask:
        return output_stack, cell_counts, labeled_masks
    else:
        return output_stack, cell_counts
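
How the decorator records its specs is not described here. As a rough mental model only, a registration decorator of this kind could attach the named specs to the function so that the pipeline can later pair them with the extra return values. Everything in the sketch below is hypothetical, including the decorator name, the special_output_specs attribute, and the use of strings in place of MaterializationSpec objects; the real @special_outputs may work differently.

```python
from typing import Any, Callable, Tuple


def special_outputs_sketch(*specs: Tuple[str, Any]) -> Callable:
    """Hypothetical decorator: remember (name, spec) pairs on the function."""
    def decorate(func: Callable) -> Callable:
        func.special_output_specs = dict(specs)
        return func
    return decorate


@special_outputs_sketch(("cell_counts", "csv"), ("segmentation_masks", "roi"))
def count_cells_sketch(stack, return_segmentation_mask=False):
    cell_counts = [len(stack)]
    labeled_masks = ["mask"]  # Placeholder for real labeled masks
    if return_segmentation_mask:
        return stack, cell_counts, labeled_masks
    return stack, cell_counts


result = count_cells_sketch([1, 2, 3], return_segmentation_mask=True)
main_output, extras = result[0], result[1:]
# Pair the extra return values with the registered names, in declaration order.
named = dict(zip(count_cells_sketch.special_output_specs, extras))
```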

Materialization Workflow

Execution: Function returns special outputs (e.g., labeled masks)
Storage: Special outputs saved to memory backend with channel-specific paths
Materialization: After step completion, format writers are invoked
Multi-Backend: ROIs saved to all backends (disk + streaming) simultaneously

Path Transformation:

Memory:   /memory/plate_001/results/A01_w1_segmentation_masks_step7.pkl
Analysis: /disk/plate_001_outputs/images_results/A01_w1_segmentation_masks_step7_rois.json
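
The transformation shown above amounts to keeping the file stem, appending the _rois.json suffix, and relocating the file to the analysis directory. A minimal sketch follows, with a hypothetical function name; how the analysis directory itself is derived from the plate path is not covered here and is passed in as a parameter.

```python
from pathlib import PurePosixPath


def analysis_roi_path(memory_path: str, analysis_dir: str) -> str:
    """Map a memory-backend path to the ROI JSON path in the analysis directory."""
    stem = PurePosixPath(memory_path).stem  # Drops the directory and the .pkl suffix
    return str(PurePosixPath(analysis_dir) / f"{stem}_rois.json")


result = analysis_roi_path(
    "/memory/plate_001/results/A01_w1_segmentation_masks_step7.pkl",
    "/disk/plate_001_outputs/images_results",
)
```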

Usage Examples

Basic Usage

from openhcs.processing.backends.analysis.cell_counting_cpu import count_cells_single_channel
from openhcs.core.config import DetectionMethod, DtypeConversion

# Enable segmentation mask return
Step(
    func={
        '1': (count_cells_single_channel, {
            'min_cell_area': 40,
            'max_cell_area': 200,
            'detection_method': DetectionMethod.WATERSHED,
            'return_segmentation_mask': True  # Enable ROI extraction
        })
    },
    group_by=GroupBy.CHANNEL,
    variable_components=[VariableComponents.SITE]
)

Multi-Channel with Streaming

Step(
    func={
        '1': (count_cells_single_channel, {'return_segmentation_mask': True}),
        '2': (count_cells_single_channel, {'return_segmentation_mask': True})
    },
    group_by=GroupBy.CHANNEL,
    napari_streaming_config=LazyNapariStreamingConfig(napari_port=5559),
    fiji_streaming_config=LazyFijiStreamingConfig(fiji_port=5560)
)

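Result: For a plate with two wells (A01 and B03), this produces 4 separate ROI layers/sets (2 wells × 2 channels). Names are abbreviated here; the actual names also include the output key and step index (e.g., A01_w1_segmentation_masks_step7_rois):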

Napari: A01_w1_rois, A01_w2_rois, B03_w1_rois, B03_w2_rois
Fiji: ROIs named A01_w1_rois_ROI_1, A01_w2_rois_ROI_1, etc.