Data Dimensions
High-content screening data has multiple dimensions - wells, sites, channels, Z-planes, and timepoints. OpenHCS provides systematic ways to organize processing across these dimensions through variable components and group by parameters.
Understanding Microscopy Data Dimensions
Typical HCS data structure:
Plate/
├── A01_s1_w1.tif # Well A01, Site 1, Channel 1
├── A01_s1_w2.tif # Well A01, Site 1, Channel 2
├── A01_s2_w1.tif # Well A01, Site 2, Channel 1
├── A01_s2_w2.tif # Well A01, Site 2, Channel 2
├── A02_s1_w1.tif # Well A02, Site 1, Channel 1
└── ...
Dimensions:
Well: Sample position (A01, A02, B01, etc.) - represents experimental conditions
Site: Imaging position within a well (1, 2, 3, etc.) - multiple fields of view
Channel: Fluorescence channel (1, 2, 3, etc.) - different markers or wavelengths
Z-Index: Z-plane depth (1, 2, 3, etc.) - for 3D imaging
Timepoint: Time series point (1, 2, 3, etc.) - for live imaging
Variable Components
Variable components tell OpenHCS how to group files for processing. They define which dimensions vary within each processing group.
from openhcs.constants.constants import VariableComponents
# Available variable components
VariableComponents.SITE # Process each site separately
VariableComponents.CHANNEL # Process each channel separately
VariableComponents.Z_INDEX # Process each Z-plane separately
VariableComponents.TIMEPOINT # Process each timepoint separately
VariableComponents.WELL # Process each well separately
Most Common: Process Each Site Separately
from openhcs.core.steps.function_step import FunctionStep
# Process each site independently (most common pattern)
step = FunctionStep(
func=(normalize_images, {}),
variable_components=[VariableComponents.SITE],
name="normalize"
)
What this does: Groups files by (well, channel, z_index) and processes each site separately.
Example grouping: - Group 1: [A01_s1_w1.tif, A01_s1_w2.tif] → Process site 1 of well A01 - Group 2: [A01_s2_w1.tif, A01_s2_w2.tif] → Process site 2 of well A01 - Group 3: [A02_s1_w1.tif, A02_s1_w2.tif] → Process site 1 of well A02
When to use: Most image processing operations (filtering, segmentation, analysis) that work on complete images from one imaging position.
Process Each Channel Separately
# Process each channel independently
step = FunctionStep(
func=(create_composite, {}),
variable_components=[VariableComponents.CHANNEL],
name="composite"
)
What this does: Groups files by (well, site, z_index) and processes each channel separately.
Example grouping: - Group 1: [A01_s1_w1.tif, A01_s2_w1.tif] → Process channel 1 across all sites - Group 2: [A01_s1_w2.tif, A01_s2_w2.tif] → Process channel 2 across all sites
When to use: Operations that combine data across sites for each channel (creating channel composites, channel-specific normalization).
Process Each Z-Plane Separately
# Process each Z-plane independently for 3D analysis
step = FunctionStep(
func=(max_projection_across_sites, {}),
variable_components=[VariableComponents.Z_INDEX],
name="z_projection"
)
What this does: Groups files by (well, site, channel) and processes each Z-plane separately.
Process Each Timepoint Separately
# Process each timepoint independently for time series analysis
step = FunctionStep(
func=(track_cells_over_time, {}),
variable_components=[VariableComponents.TIMEPOINT],
name="cell_tracking"
)
What this does: Groups files by (well, site, channel, z_index) and processes each timepoint separately.
Example grouping (ImageXpress format): - Group 1: [A01_s1_w1_t001.tif] → Process timepoint 1 - Group 2: [A01_s1_w1_t002.tif] → Process timepoint 2 - Group 3: [A01_s1_w1_t003.tif] → Process timepoint 3
When to use: Time series analysis, cell tracking, temporal dynamics studies.
Microscope Format Support:
ImageXpress:
_t001,_t002, etc. (e.g.,A01_s1_w1_t003.tif)Opera Phenix:
sk1,sk2, etc. (e.g.,r01c01f01p01-ch1sk5.tif)OMERO: Timepoint dimension from OMERO metadata
OpenHCS:
_t001,_t002, etc. (native format)
Accessing Timepoint Values:
All microscope handlers provide get_timepoint_values() to retrieve available timepoints:
from openhcs.microscopes import get_microscope_handler
handler = get_microscope_handler('/path/to/plate', microscope_type='imagexpress')
# Get all timepoints for a specific well
timepoints = handler.get_timepoint_values(well_id='A01')
# Returns: ['001', '002', '003', ...]
# Timepoint is now a first-class component dimension
# alongside channel, z-index, site, and well
When to use: 3D imaging where you need to combine or analyze across Z-planes, such as creating maximum projections.
Multiple Variable Components
# Process each site and channel combination separately
step = FunctionStep(
func=(single_image_analysis, {}),
variable_components=[VariableComponents.SITE, VariableComponents.CHANNEL],
name="single_image"
)
What this does: Creates separate groups for each unique combination of site and channel.
When to use: Operations that work on individual images rather than image stacks.
Group By Parameter
The group_by parameter works with dictionary function patterns to route different data to different functions.
from openhcs.constants.constants import GroupBy
# Route different channels to different functions
step = FunctionStep(
func={
'1': (analyze_nuclei, {}), # Channel 1 → nuclei analysis
'2': (analyze_neurites, {}) # Channel 2 → neurite analysis
},
group_by=GroupBy.CHANNEL,
variable_components=[VariableComponents.SITE]
)
How Group By Works
Data Grouping: Files are first grouped by
variable_componentsFunction Routing: Within each group, data is routed to functions based on
group_byExecution: Each function processes its assigned data
Example with channel routing:
Files: A01_s1_w1.tif, A01_s1_w2.tif
Step 1 - Group by variable_components=[SITE]:
Group: [A01_s1_w1.tif, A01_s1_w2.tif] # Same site
Step 2 - Route by group_by=CHANNEL:
Channel 1: A01_s1_w1.tif → analyze_nuclei()
Channel 2: A01_s1_w2.tif → analyze_neurites()
Available Group By Options
GroupBy.CHANNEL # Route by channel number (primary use case)
Common Data Organization Patterns
Site-by-Site Processing
Most common pattern for standard image processing:
# Process each imaging site independently
step = FunctionStep(
func=(segment_cells, {}),
variable_components=[VariableComponents.SITE],
name="segmentation"
)
Use cases: Filtering, segmentation, feature extraction, most analysis operations.
Channel-Specific Analysis
Different analysis for different fluorescent markers:
# Different analysis for each channel
step = FunctionStep(
func={
'1': (count_nuclei, {}), # DAPI channel
'2': (measure_intensity, {}), # GFP channel
'3': (detect_structures, {}) # RFP channel
},
group_by=GroupBy.CHANNEL,
variable_components=[VariableComponents.SITE],
name="channel_analysis"
)
Use cases: Multi-marker experiments where each channel represents different biological features.
Multi-Channel Processing
Different processing for different fluorescent markers:
# Different preprocessing for different channels
step = FunctionStep(
func={
'1': [(gaussian_filter, {'sigma': 1.0}), (tophat, {'selem_radius': 15})], # DAPI preprocessing
'2': [(gaussian_filter, {'sigma': 2.0}), (enhance_contrast, {'percentile_range': (1, 99)})] # GFP preprocessing
},
group_by=GroupBy.CHANNEL,
variable_components=[VariableComponents.SITE],
name="channel_preprocessing"
)
Use cases: Multi-marker experiments where each channel requires different preprocessing approaches.
Z-Stack Processing
Processing 3D image stacks:
# Combine Z-planes into maximum projection
step = FunctionStep(
func=(max_projection, {}),
variable_components=[VariableComponents.Z_INDEX],
name="z_projection"
)
Use cases: 3D imaging where you need to combine or analyze across Z-planes.
Choosing the Right Organization
Consider Your Analysis Goal:
Single image operations: Use
[SITE, CHANNEL]to process individual imagesMulti-channel analysis: Use
[SITE]with channel-specific functionsCross-site analysis: Use
[CHANNEL]to combine data across sitesWell-level summaries: Use
[WELL]to analyze entire wells
Consider Your Data Structure:
2D images: Typically use
[SITE]3D stacks: May need
[Z_INDEX]for projection operationsTime series: May need
[TIME]for temporal analysisMulti-condition: Use
group_by=WELLfor condition-specific processing
Performance Considerations:
Parallel processing: More variable components = more parallel groups
Memory usage: Fewer variable components = larger data groups = more memory per group
I/O efficiency: Group organization affects how data is loaded and cached
The data dimensions system provides systematic control over how your analysis processes the multiple dimensions of HCS data, enabling both simple single-image operations and complex multi-dimensional workflows.