External Integrations Overview =============================== The Problem: Isolated Processing Pipelines ------------------------------------------- Scientific image processing rarely happens in isolation. Researchers need to integrate OpenHCS pipelines with external tools: OMERO servers for data storage, Napari/Fiji for visualization, custom analysis tools, and cloud services. Without a unified integration approach, each external tool requires custom code, leading to duplicated logic, inconsistent error handling, and brittle connections that break when tools update. Executive Summary ----------------- OpenHCS implements a comprehensive integration strategy with the bioimage analysis ecosystem, providing seamless interoperability with visualization tools (Napari, Fiji) and data management platforms (OMERO). These integrations follow consistent architectural patterns based on inter-process communication (IPC) via ZeroMQ, enabling real-time streaming, remote execution, and location-transparent data access. This document provides a high-level overview of OpenHCS's integration architecture and how the different components work together to create a unified bioimage analysis platform. Boundary Split -------------- The integration stack is now explicitly split across repositories: - ``zmqruntime`` owns transport and execution lifecycle primitives (REQ/REP + PUB/SUB channels, polling, typed execution status contracts). - ``polystore`` owns streaming payload semantics and receiver-side projection utilities (component-mode grouping, batch/debounce engines, viewer payload constants). - ``pyqt-reactive`` owns generic manager/browser UI infrastructure (polling shells, tree sync/rebuild, aggregation policies). - ``openhcs`` owns domain wiring only (pipeline/orchestrator semantics, OpenHCS-specific adapters, UX policies). OpenHCS architecture docs focus on OpenHCS wrapper behavior and link to external module docs for generic abstraction internals. Integration Philosophy ---------------------- Core Principles ~~~~~~~~~~~~~~~ 1. **Reusable Patterns**: Extract common IPC patterns into generic base classes 2. **Location Transparency**: Same API for local and remote operations 3. **Zero-Copy Performance**: Minimize data movement through shared memory and direct file access 4. **Process Isolation**: Independent processes for stability and resource management 5. **Fail-Loud Design**: Clear error messages, no silent failures 6. **Production-Grade**: Designed for institutional deployment, not just research prototypes Unified Architecture ~~~~~~~~~~~~~~~~~~~~ All OpenHCS integrations share a common dual-channel ZeroMQ pattern: .. code-block:: text ┌─────────────────────────────────────────────────────────┐ │ ZMQ Base Classes │ │ ┌──────────────────────┐ ┌──────────────────────┐ │ │ │ ZMQServer │ │ ZMQClient │ │ │ │ - Dual channels │ │ - Connection mgmt │ │ │ │ - Ping/pong │ │ - Auto-spawn │ │ │ │ - Process lifecycle │ │ - Multi-instance │ │ │ └──────────────────────┘ └──────────────────────┘ │ └─────────────────────────────────────────────────────────┘ │ ┌─────────────────┼─────────────────┐ │ │ │ ▼ ▼ ▼ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ Napari │ │ Fiji │ │ OMERO │ │ Streaming │ │ Streaming │ │ Execution │ │ │ │ │ │ Server │ └──────────────┘ └──────────────┘ └──────────────┘ Integration Components ---------------------- Napari Integration ~~~~~~~~~~~~~~~~~~ **Purpose**: Real-time visualization of processing results **Architecture**: OpenHCS wrapper over ``zmqruntime`` transport plus ``polystore`` streaming semantics and receiver projection **Key Features**: - Zero-copy shared memory transfer - Component-aware layer organization - Persistent viewers across pipeline runs - Multi-instance support (multiple viewers on different ports) - Network streaming capability (local → remote) **Use Cases**: - Pipeline development and debugging - Quality control during batch processing - Multi-step workflow validation - Remote HPC visualization **Documentation**: See :doc:`napari_integration_architecture` OMERO Integration ~~~~~~~~~~~~~~~~~ **Purpose**: Data management and institutional deployment **Architecture**: Storage backend + execution server + web UI **Key Features**: - Zero-copy server-side file access - Virtual backend pattern (no real filesystem) - Code-based pipeline serialization - Web-based pipeline submission - Automatic instance management **Use Cases**: - Institutional core facilities - High-throughput screening - Collaborative research - Teaching and training **Documentation**: See :doc:`omero_backend_system` Fiji Integration ~~~~~~~~~~~~~~~~ **Purpose**: Interoperability with ImageJ/Fiji ecosystem **Architecture**: OpenHCS wrapper over ``zmqruntime`` transport plus ``polystore`` streaming semantics and receiver projection **Key Features**: - Zero-copy shared memory transfer - Automatic hyperstack building from component metadata - PyImageJ integration for native Fiji functionality - Persistent viewers across pipeline runs - Multi-instance support (multiple viewers on different ports) **Use Cases**: - Leveraging ImageJ/Fiji plugin ecosystem - Macro scripting integration - CZT-based visualization workflows - Cross-platform bioimage analysis **Documentation**: See :doc:`fiji_streaming_system` Architectural Patterns ---------------------- Pattern 1: Dual-Channel ZeroMQ Communication ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Problem**: Need both synchronous request/response and asynchronous data streaming **Solution**: Separate control and data channels .. code-block:: text Control Channel (REQ/REP): - Port: Base port (e.g., 5555, 7777) - Purpose: Handshake, commands, status queries - Pattern: Synchronous request/response Data Channel (PUB/SUB): - Port: Base port + 1000 (e.g., 6555, 8777) - Purpose: Image streaming, progress updates - Pattern: Asynchronous publish/subscribe **Benefits**: - Control messages don't block data streaming - Reliable handshake before data transfer - Independent scaling of control and data throughput **Implementations**: - Napari: Image streaming + viewer control - Fiji: Image streaming + macro execution - OMERO: Pipeline execution + progress updates Pattern 2: Instance Management ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Problem**: Need to connect to existing services or start new ones **Solution**: Check → Connect → Start pattern .. code-block:: python class ServiceManager: def connect(self, timeout: int) -> bool: # 1. Check if already connected if self.is_connected(): return True # 2. Try to connect to existing instance if self.is_service_running(): return self._connect_to_service() # 3. Start new instance if needed if self._start_service(): return self._wait_and_connect(timeout) return False **Benefits**: - Reuses existing instances (faster, resource-efficient) - Automatic startup when needed (user-friendly) - Graceful degradation with clear errors **Implementations**: - ``NapariStreamVisualizer``: Manages Napari viewer processes - ``OMEROInstanceManager``: Manages OMERO server connections - ``ZMQExecutionClient``: Manages execution server connections Pattern 3: Zero-Copy Data Transfer ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Problem**: Large image arrays are expensive to copy between processes **Solution**: Shared memory for local IPC, direct file access for storage **Napari/Fiji Streaming**: .. code-block:: python # Create shared memory block shm = shared_memory.SharedMemory(create=True, size=data.nbytes) shm_array = np.ndarray(data.shape, dtype=data.dtype, buffer=shm.buf) shm_array[:] = data[:] # Send metadata + shared memory name (not data) message = { 'shm_name': shm.name, 'shape': data.shape, 'dtype': str(data.dtype) } # Receiver attaches to same memory shm = shared_memory.SharedMemory(name=message['shm_name']) data = np.ndarray(message['shape'], dtype=message['dtype'], buffer=shm.buf) **OMERO Server-Side**: .. code-block:: python # Direct file access (no API overhead) local_path = omero_data_dir / user_id / fileset_id / filename data = tifffile.imread(local_path) # Zero-copy mmap possible **Benefits**: - Minimal memory overhead - Maximum throughput - Reduced latency Pattern 4: Code-Based Serialization ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Problem**: Pickle fails with enums, custom classes, version mismatches **Solution**: Generate executable Python code .. code-block:: python # Client: Object → Code config_code = generate_python_source(Assignment("config", config_obj)) # Result: "config = GlobalPipelineConfig(num_workers=4, ...)" # Server: Code → Object namespace = {} exec(config_code, namespace) config_obj = namespace['config'] **Benefits**: - Human-readable - Version-independent - Debuggable - Network-safe (JSON strings) **Implementations**: - OMERO remote execution - PyQt UI bidirectional conversion - Future: Pipeline sharing and templates Integration Workflows --------------------- Workflow 1: Local Development with Napari ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: text Developer Machine: OpenHCS Pipeline ↓ FileManager (Memory/Disk backends) ↓ NapariStreamingBackend ↓ (shared memory) Napari Viewer Process **Characteristics**: - Single machine - Immediate visual feedback - Interactive parameter tuning - Rapid iteration Workflow 2: Remote Execution with OMERO ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: text Researcher's Browser: OMERO.web UI ↓ (HTTP/AJAX) Django Plugin ↓ (ZeroMQ) OMERO Server Machine: Execution Server ↓ (zero-copy file access) OMERO Data ↓ (GPU processing) Results → OMERO **Characteristics**: - Multi-tier architecture - Server-side GPU processing - No data movement - Web-based interface Workflow 3: Hybrid Remote Execution + Visualization ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: text OMERO Server: Execution Server ↓ (processing) Results ↓ (network streaming) Researcher's Workstation: Napari Viewer (real-time visualization) **Characteristics**: - Processing near data (server-side) - Visualization near user (local) - Network-based streaming - Best of both worlds **Implementation Status**: Architecture supports this, requires: - Network-aware Napari streaming (change ``localhost`` to remote IP) - Bandwidth-aware compression - Authentication/encryption Performance Characteristics --------------------------- Napari Streaming ~~~~~~~~~~~~~~~~ ================= ============= ============================ Metric Value Notes ================= ============= ============================ Latency ~50ms Timer-based polling Throughput 10 images/tick Batch processing Memory Overhead ~0% Shared memory (zero-copy) Startup Time ~2s Viewer process spawn ================= ============= ============================ OMERO Server-Side Execution ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ================= ============= ============================ Metric Value Notes ================= ============= ============================ File Access ~0ms overhead Direct file system access API Access ~100ms/image Network + API overhead Serialization ~1ms Code generation GPU Speedup 10-100x vs CPU processing ================= ============= ============================ ZeroMQ Communication ~~~~~~~~~~~~~~~~~~~~ ================= ============= ============================ Metric Value Notes ================= ============= ============================ Handshake ~10ms Ping/pong verification Message Latency <1ms Local IPC Network Latency ~10-50ms Depends on network Throughput >1GB/s Shared memory path ================= ============= ============================ See Also -------- - :doc:`napari_integration_architecture` - Napari integration details - :doc:`fiji_streaming_system` - Fiji streaming architecture - :doc:`omero_backend_system` - OMERO backend architecture - :doc:`../guides/omero_integration` - OMERO integration guide - :doc:`../guides/viewer_management` - Viewer management guide - :doc:`../guides/fiji_viewer_management` - Fiji viewer management