Compare State-Epoch Data Between Groups¶
This tool uses 1.0 compute credits per hour.
Overview¶
The Compare State-Epoch Data Between Groups workflow compares state-epoch activity, correlation, and modulation metrics across two experimental groups using the CSV/H5 outputs generated by the Compare State Data Across Epochs tool. It ingests the per-group activity_per_state_epoch_data.csv, correlations_per_state_epoch_data.csv (or raw correlation H5 files), and modulation_vs_baseline_data.csv, then produces harmonized group tables, statistical summaries, and publication-ready previews. The comparison can be performed across states or epochs (one dimension per run) and supports trace-only, event-only, or dual-measure analyses.
Key capabilities:
- Validates that both groups share the same baseline state-epoch reference, state list, epoch list, and baseline modulation metadata.
- Supports paired or unpaired comparisons, automatic or user-specified parametric tests, and multiple correction strategies (Bonferroni, FDR, etc.).
- Calculates per-group descriptive statistics, ANOVA-style summaries, pairwise tests, and optional linear mixed-model (LMM) analyses for cell-level metrics.
- Reclassifies modulation significance at a new α threshold when requested, preserving the alpha/2 logic used in the baseline tool.
- Runs either two-group comparisons or a single-group collapse; single-group mode aggregates all provided recordings and reports within-group ANOVA/pairwise statistics while retaining the same output directory layout.
Design Benefits¶
- Structured inputs: By reusing the single-tool outputs (activity, correlation, modulation), each group already shares state/epoch labels, baseline metadata, and scaling, so comparison code can stay focused on higher-level logic.
- Orthogonal flow control: Two explicit branches—trace vs event and state vs epoch—allow users to toggle modalities or comparison axes independently without touching other parameters.
- Targeted statistics: Limiting the engine to 2-way ANOVAs (state-by-group or epoch-by-group) avoids 3-way complexity while matching how the single tool reports per-dimension summaries; multiple-comparison correction is applied within each tested context.
Potential Future Expansion
Future versions may add a third comparison dimension to support a full three-way ANOVA (e.g., state × epoch × group). If you’d like this feature, please reach out to inscopix.support@bruker.com.
Note
- Group 2 inputs are optional; the tool can run in "single group" mode to collapse recordings and produce per-group ANOVA/pairwise outputs. Single-group runs still require at least two subjects so statistical tests remain valid.
- When correlation data is provided in H5 format, it must include the `trace` and/or `event` groups produced by Compare State Data Across Epochs. The tool converts them to the same schema as the CSV files automatically.
Input Data¶
Compatibility
This tool is designed to work exclusively with outputs from the Compare State Data Across Epochs tool. It is not compatible with outputs from the Compare Neural Activity Across States tool, as the data structures differ between these workflows.
All input files must come directly from Compare State Data Across Epochs and use the same configuration (states, epochs, baseline definitions, scaling choices). Each run must include at least two activity_per_state_epoch_data.csv files for Group 1 because the statistical tests require more than one subject. You may omit Group 2 entirely to run a single-group analysis; when two groups are supplied, every modality present for Group 1 must also be provided for Group 2 with matching file counts so comparisons remain balanced.
| Source Parameter | File Type | File Format |
|---|---|---|
| Group 1 Activity CSV Files | epoch_activity_data | csv |
| Group 1 Correlation Files | correlation_data, correlation_data | csv, h5 |
| Group 1 Modulation CSV Files | modulation_data | csv |
| Group 2 Activity CSV Files | epoch_activity_data | csv |
| Group 2 Correlation Files | correlation_data, correlation_data | csv, h5 |
| Group 2 Modulation CSV Files | modulation_data | csv |
Minimum recordings & pairing rules
- Provide at least two `activity_per_state_epoch_data.csv` files per group; correlation and modulation modalities (when supplied) must also contain ≥2 files so ANOVA/pairwise/LMM tests have sufficient degrees of freedom.
- When two groups are provided, each modality must have matching file counts across groups. Paired analyses additionally require `subject_matching` to identify at least two matched subject pairs per modality; otherwise the run aborts with a descriptive error.
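The counting rules above can be expressed as a small pre-flight check. This is a minimal sketch rather than the tool's own validation code: the function name `validate_file_counts`, the modality keys, and the error messages are all hypothetical.

```python
from typing import Dict, List, Optional


def validate_file_counts(
    group1: Dict[str, List[str]],
    group2: Optional[Dict[str, List[str]]] = None,
    min_files: int = 2,
) -> None:
    """Raise ValueError when the minimum-count or parity rules are violated.

    Each dict maps a modality name ("activity", "correlation", "modulation";
    hypothetical keys) to a list of file paths. A missing or empty list means
    the modality was not supplied.
    """
    # Activity files are always required for Group 1.
    if len(group1.get("activity", [])) < min_files:
        raise ValueError(f"Group 1 needs at least {min_files} activity files")

    # Any optional modality that is supplied must also meet the minimum.
    for name, files in group1.items():
        if files and len(files) < min_files:
            raise ValueError(f"Group 1 {name}: need at least {min_files} files")

    if group2 is None:  # single-group mode: nothing else to compare against
        return

    # Two-group mode: every modality must have the same number of files.
    for name in sorted(set(group1) | set(group2)):
        n1 = len(group1.get(name, []))
        n2 = len(group2.get(name, []))
        if n1 != n2:
            raise ValueError(
                f"{name}: file counts differ across groups ({n1} vs {n2})"
            )
```

Running a check like this before submitting a job surfaces count mismatches early, before any files are parsed.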
Group consistency requirements¶
- Baseline match: Both groups must share identical `baseline_state` and `baseline_epoch` values embedded in their modulation CSVs.
- State/Epoch match: The ordered lists of state names and epoch names must match exactly; otherwise execution stops with an informative error.
- Subject identifiers: Subject IDs (from `normalized_subject_id`) must follow the standardized format. When `data_pairing="paired"`, `subject_matching` aligns files using one of the supported strategies: `number` (match digits in filenames), `filename` (exact basename), or `order` (original list order, used as the fallback). At least two matched subjects are required.
- Modality parity: In two-group runs, correlation and modulation inputs must be present for Group 2 whenever they exist for Group 1 so the tool can build balanced statistics.
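To make the three `subject_matching` strategies concrete, the sketch below pairs two file lists by order, by basename, or by digits in the filename. It is an illustration under assumptions, not the tool's implementation: the helper names are hypothetical, and the `number` rule is assumed here to compare the first run of digits in each basename as a string.

```python
import re
from pathlib import Path
from typing import List, Tuple


def _first_number(path: str) -> str:
    """Return the first run of digits in the basename, or '' if none."""
    match = re.search(r"\d+", Path(path).name)
    return match.group(0) if match else ""


def _basename(path: str) -> str:
    return Path(path).name


def match_subjects(
    group1: List[str], group2: List[str], strategy: str = "order"
) -> List[Tuple[str, str]]:
    """Pair Group 1 files with Group 2 files using the chosen strategy."""
    if strategy == "order":
        pairs = list(zip(group1, group2))  # keep the provided ordering
    elif strategy in ("number", "filename", "name"):
        key = _first_number if strategy == "number" else _basename
        lookup = {key(p): p for p in group2}
        # Files without a usable key, or without a partner, are dropped.
        pairs = [
            (p, lookup[key(p)]) for p in group1 if key(p) and key(p) in lookup
        ]
    else:
        raise ValueError(f"Unknown subject_matching strategy: {strategy!r}")

    if len(pairs) < 2:
        raise ValueError("Paired analysis requires at least two matched subjects")
    return pairs
```

With `order`, the result depends entirely on how the file lists were supplied, so `number` or `filename` is the safer choice when the two lists may be sorted differently.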
Correlation inputs options¶
- CSV path: `correlations_per_state_epoch_data.csv` files for each recording. This is the preferred (future default) format.
- H5 path: `pairwise_correlation_heatmaps.h5` files with `trace/<state-epoch>` and/or `event/<state-epoch>` datasets. These remain supported for backward compatibility; the tool automatically converts them into the same column structure as the CSV files.
If neither CSV nor H5 correlation data is provided, correlation analyses and previews are skipped.
Parameters¶
| Parameter | Required? | Default | Description |
|---|---|---|---|
| Group 1 Activity CSV Files | True | N/A | List of activity_per_state_epoch_data.csv files for group 1 (from state_epoch_baseline_analysis outputs) |
| Group 1 Correlation Files | False | N/A | Optional correlation data files for group 1 (from state_epoch_baseline_analysis outputs). Provide to enable correlation analyses; accepts correlations_per_state_epoch_data.csv or pairwise_correlation_heatmaps.h5 files |
| Group 1 Modulation CSV Files | False | N/A | Optional modulation_vs_baseline_data.csv files for group 1 (from state_epoch_baseline_analysis outputs). Provide to enable modulation analyses |
| Group 1 Name | True | N/A | Name for group 1 (e.g., 'Control') |
| Group 1 Color | False | N/A | Color for group 1 visualizations (e.g., 'blue') |
| Group 2 Activity CSV Files | False | N/A | List of activity_per_state_epoch_data.csv files for group 2 (optional for two-group comparison) |
| Group 2 Correlation Files | False | N/A | Correlation data files for group 2 (optional for two-group comparison). Required when group 2 activity files are provided and group 1 correlation files are supplied. Accepts correlations_per_state_epoch_data.csv or pairwise_correlation_heatmaps.h5 files |
| Group 2 Modulation CSV Files | False | N/A | modulation_vs_baseline_data.csv files for group 2 (optional for two-group comparison). Required when group 2 activity files are provided and group 1 modulation files are supplied |
| Group 2 Name | False | N/A | Name for group 2 (e.g., 'Treatment'). Required if group 2 files are provided. |
| Group 2 Color | False | N/A | Color for group 2 visualizations (e.g., 'red') |
| Comparison Dimension | True | N/A | Dimension to compare: 'states' (compare across behavioral states) or 'epochs' (compare across time epochs) |
| Measure Source | False | N/A | Data source to analyze across activity, correlation, and modulation: 'trace' (use trace-based measures, fallback to event if missing), 'event' (use event-based measures exclusively), or 'both' (analyze both trace and event separately) |
| State Colors | False | N/A | Optional list of hex color codes for states (e.g., ["#FF0000", "#00FF00", "#0000FF"]). Colors will be assigned to states in the order they appear in the data. If not provided, colors will be extracted from CSV files or auto-generated. |
| Epoch Colors | False | N/A | Optional list of hex color codes for epochs (e.g., ["#0000FF", "#FFFF00", "#FF00FF"]). Colors will be assigned to epochs in the order they appear in the data. If not provided, colors will be extracted from CSV files or auto-generated. |
| Modulation Colors | False | 'green,blue,black' | Comma-separated list of matplotlib compatible colors representing up-modulated, down-modulated, and non-modulated neurons respectively (e.g., 'green,blue,black') |
| Data Pairing | True | N/A | Type of data pairing: 'unpaired' (independent samples) or 'paired' (matched subjects across groups) |
| Subject Matching | False | N/A | Method for matching subjects between groups (for paired analysis): 'order' (match by file order), 'number' (match by numeric IDs), or 'filename' (match by filename) |
| Correlation Statistic | False | N/A | Type of per-cell correlation statistic to analyze: 'max' (maximum), 'min' (minimum), or 'mean' (average) |
| Significance Threshold | False | N/A | Significance threshold for statistical tests (default: 0.05). Leave empty to use pre-computed modulation classifications. |
| Multiple Comparison Correction | False | N/A | Method for multiple comparison correction |
| Multiple Comparison Scope | False | global | Scope for multiple comparison correction. Global: correct across ALL tests from all strata (recommended, most conservative). Within-stratum: correct only within each stratum (e.g., each epoch separately when comparing states). Global correction prevents inflation of Type I error when performing stratified analyses. |
| Effect Size | False | N/A | Method for calculating effect size |
| Group Comparison Type | True | N/A | Type of statistical test to perform. Two-tailed tests for differences in either direction, one-tailed tests for directional hypotheses. |
| Parametric | False | N/A | Indicates whether to perform a parametric test. If set to 'auto', a parametric test will be used if the data follows a normal distribution and there are at least 8 observations. Otherwise, a non-parametric test will be used. |
| Enable LMM Analysis | False | N/A | Enable linear mixed model analysis to support imbalanced designs. Disable to skip LMM processing. |
| Save LMM Output Files | False | N/A | Persist linear mixed model result tables to CSV outputs. Disable to keep LMM results in memory only. |
Highlights:
- `comparison_dimension`: choose `"states"` or `"epochs"` to drive the aggregation and visualization axis.
- `measure_source`: `"trace"`, `"event"`, or `"both"` to control whether analyses run on traces, events, or both modalities independently.
- `correlation_statistic`: `"max"`, `"mean"`, or `"min"` selects the per-cell correlation metric that feeds the summaries and previews (positive/negative population averages are always added).
- `data_pairing` & `subject_matching`: set `"paired"` vs `"unpaired"` plus the matching rule used only in paired mode (`number` for digits inside filenames, `filename`/`name` for exact basename matches, or `order` to keep the provided ordering). Paired mode requires ≥2 matched subjects.
- `multiple_correction` & `multiple_correction_scope`: configure family-wise error control globally or within strata (`within_stratum`, with `per_condition` accepted as an alias).
- `parametric`: `"auto"` (data-driven parametric/non-parametric selection), `"True"` (force parametric; raises an error if assumptions fail), or `"False"` (always use non-parametric tests).
- `enable_lmm_analysis`: toggles the optional linear mixed-effects pipeline for cell-level measures; `save_lmm_outputs` adds per-comparison CSVs summarizing model fits.
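The difference between the two correction scopes is easiest to see with numbers. The sketch below applies a plain Bonferroni adjustment either across all tests or separately within each stratum. It is illustrative only: `correct_pvalues` is a hypothetical helper, Bonferroni stands in for whichever correction method is configured, and the tool's internal implementation may differ.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def bonferroni(pvals: List[float]) -> List[float]:
    """Multiply each p-value by the family size, capped at 1.0."""
    n = len(pvals)
    return [min(1.0, p * n) for p in pvals]


def correct_pvalues(
    tests: List[Tuple[str, float]], scope: str = "global"
) -> List[float]:
    """Correct (stratum, p_unc) pairs globally or within each stratum."""
    if scope == "global":
        # One family: every test from every stratum.
        return bonferroni([p for _, p in tests])

    if scope in ("within_stratum", "per_condition"):
        # One family per stratum (e.g. each epoch when comparing states).
        by_stratum: Dict[str, List[int]] = defaultdict(list)
        for i, (stratum, _) in enumerate(tests):
            by_stratum[stratum].append(i)
        corrected = [0.0] * len(tests)
        for indices in by_stratum.values():
            adjusted = bonferroni([tests[i][1] for i in indices])
            for i, p in zip(indices, adjusted):
                corrected[i] = p
        return corrected

    raise ValueError(f"Unknown scope: {scope!r}")


tests = [("baseline", 0.01), ("baseline", 0.04), ("training", 0.03)]
print(correct_pvalues(tests, "global"))
print(correct_pvalues(tests, "within_stratum"))
```

In this example the global scope multiplies every p-value by three, whereas the within-stratum scope multiplies the two `baseline` tests by two and leaves the single `training` test unchanged, which is why the global setting is the more conservative choice.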
Workflow¶
Processing steps¶
- Validation & metadata alignment
  - Ensures required CSV/H5 files exist and share the same states, epochs, baseline, and color metadata.
  - Normalizes group names/colors (or generates defaults) and records them for output labeling.
- Subject matching & pairing
  - Aligns subjects across groups using the configured `subject_matching` rule (`number`, `filename`, or `order`) whenever `data_pairing="paired"`.
  - Handles paired vs. unpaired scenarios and logs dropped or unmatched subjects.
- Measure selection
  - Builds unified column sets (activity, per-cell correlations, population correlations, modulation counts) based on `measure_source` and `correlation_statistic`.
  - For modulation data, can reclassify neurons at a custom significance threshold while respecting alpha/2 directionality.
- Statistical testing
  - Produces ANOVA-style summaries and pairwise test tables for each metric.
  - Optional LMM analysis (when enabled and data are cell-level) uses subject and cell IDs as random effects to detect subtle group-by-state/epoch interactions.
  - Multiple-comparison corrections (Bonferroni, FDR, etc.) are applied either globally or within each stratum.
- Visualization
  - Generates per-group boxplots/CDFs for activity and correlations, modulation distributions, and comparison-level summaries for the selected dimension.
  - Preview SVGs are stored alongside the CSV outputs in `.previews/` subdirectories.
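As a rough illustration of the measure-selection step, the sketch below shows how `measure_source` could decide which columns are analyzed, including the documented fallback from trace to event measures. The function `select_measures` is hypothetical, and identifying columns by a `trace` or `event` token in their names is an assumption based on the example output tables, not a documented rule.

```python
from typing import Dict, List


def select_measures(
    columns: List[str], measure_source: str = "trace"
) -> Dict[str, List[str]]:
    """Return the columns to analyze, keyed by modality."""
    # Assumption: measure columns carry "trace" or "event" as a name token,
    # e.g. mean_trace_activity or mean_event_rate.
    trace = [c for c in columns if "trace" in c.split("_")]
    event = [c for c in columns if "event" in c.split("_")]

    if measure_source == "trace":
        # Documented behavior: fall back to event measures if trace is missing.
        return {"trace": trace} if trace else {"event": event}
    if measure_source == "event":
        return {"event": event}
    if measure_source == "both":
        return {"trace": trace, "event": event}
    raise ValueError(f"Unknown measure_source: {measure_source!r}")
```

Returning a dictionary keyed by modality keeps the `"both"` case simple, since each modality can then be passed through the same statistics code independently.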
Outputs¶
All outputs are organized under the chosen output_dir with descriptive subfolders. Key artifacts include:
- Per-group combined data
  - `<group>_combined_activity_data/<comparison>_<group>_combined_activity_data.csv`
  - `<group>_combined_trace_correlation_data/<comparison>_<group>_combined_trace_correlation_data.csv`
  - `<group>_combined_modulation_data/<comparison>_<group>_combined_modulation_data.csv`
  - Each folder contains matching preview SVGs inside `.previews/`, including `<group>_{trace|event}_activity_boxplot.svg`, `<stat>_{trace|event}_correlation_{boxplot|cdf}.svg` for every requested `correlation_statistic`, `<positive|negative>_{trace|event}_population_boxplot.svg`, and `<group>_{trace|event}_modulation_distribution.svg` (event variants appear only when event data are analyzed).
- Trace statistical summaries
  - `trace_aov_comparisons/<comparison>_trace_aov_comparisons.csv`
  - `trace_pairwise_comparisons/<comparison>_trace_pairwise_comparisons.csv`
- Event statistical summaries (when event data are available)
  - `event_aov_comparisons/<comparison>_event_aov_comparisons.csv`
  - `event_pairwise_comparisons/<comparison>_event_pairwise_comparisons.csv`
- Optional LMM outputs
  - Saved only when `enable_lmm_analysis` and `save_lmm_outputs` are `True`; file names follow the same `<comparison>_<measure>_lmm_results.csv` convention.
- Comparison previews
  - Trace comparisons: `states_comparison_trace_activity.svg`, `states_comparison_trace_correlation.svg`, `states_comparison_trace_positive_correlation.svg`, `states_comparison_trace_negative_correlation.svg`, `states_comparison_trace_modulation.svg`, `states_comparison_trace_up_modulated_counts.svg`, and `states_comparison_trace_down_modulated_counts.svg` (automatically switched to the `epochs_` prefix when `comparison_dimension="epochs"`).
  - Event comparisons: the same set of filenames with `_event_` instead of `_trace_` are generated whenever event measures are part of the analysis.
All CSVs include metadata columns for state/epoch identifiers, group labels, subject IDs, source filenames, and the statistical annotations (test type, effect size, corrected p-values) needed for downstream reporting.
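When post-processing results in a script, the naming patterns above can be assembled with `pathlib`. This is a sketch under assumptions: both helper names are hypothetical, the `<comparison>` placeholder is passed through unchanged because its exact value is not specified here, and group names are assumed to appear in paths exactly as entered.

```python
from pathlib import Path


def combined_csv_path(output_dir: str, comparison: str, group: str, kind: str) -> Path:
    """Build <group>_combined_<kind>_data/<comparison>_<group>_combined_<kind>_data.csv."""
    stem = f"{group}_combined_{kind}_data"
    return Path(output_dir) / stem / f"{comparison}_{stem}.csv"


def comparison_preview_name(comparison_dimension: str, measure: str, plot: str) -> str:
    """Build e.g. states_comparison_trace_activity.svg or the epochs_ variant."""
    if comparison_dimension not in ("states", "epochs"):
        raise ValueError("comparison_dimension must be 'states' or 'epochs'")
    if measure not in ("trace", "event"):
        raise ValueError("measure must be 'trace' or 'event'")
    return f"{comparison_dimension}_comparison_{measure}_{plot}.svg"


print(combined_csv_path("results", "states", "Control", "activity").as_posix())
print(comparison_preview_name("epochs", "event", "up_modulated_counts"))
```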
Example Combined Activity Output¶
| state | epoch | group_name | normalized_subject_id | filename | cell_index | mean_trace_activity | mean_event_rate |
|---|---|---|---|---|---|---|---|
| rest | baseline | Control | subj_001 | recording_001.isxd | 0 | 0.125 | 0.018 |
| rest | baseline | Control | subj_001 | recording_001.isxd | 1 | 0.142 | 0.012 |
| rest | baseline | Treatment | subj_005 | recording_005.isxd | 0 | 0.163 | 0.020 |
| rest | training | Control | subj_001 | recording_001.isxd | 0 | 0.179 | 0.024 |
| rest | training | Treatment | subj_005 | recording_005.isxd | 0 | 0.211 | 0.029 |
Example Combined Correlation Output¶
| state | epoch | group_name | normalized_subject_id | filename | max_trace_correlation | mean_trace_correlation | positive_trace_correlation |
|---|---|---|---|---|---|---|---|
| rest | baseline | Control | subj_001 | recording_001.isxd | 0.83 | 0.21 | 0.34 |
| rest | baseline | Treatment | subj_005 | recording_005.isxd | 0.87 | 0.24 | 0.36 |
| rest | training | Control | subj_001 | recording_001.isxd | 0.78 | 0.18 | 0.30 |
| rest | training | Treatment | subj_005 | recording_005.isxd | 0.81 | 0.22 | 0.33 |
Example Combined Modulation Output¶
| state | epoch | group_name | filename | cell_index | trace_modulation_scores | trace_p_values | trace_modulation | trace_up_modulation_number | trace_down_modulation_number |
|---|---|---|---|---|---|---|---|---|---|
| rest | training | Control | recording_001.isxd | 0 | 0.18 | 0.032 | 1 | 12 | 3 |
| rest | training | Control | recording_001.isxd | 1 | -0.09 | 0.210 | 0 | 12 | 3 |
| rest | training | Treatment | recording_005.isxd | 0 | 0.31 | 0.004 | 1 | 15 | 1 |
| rest | training | Treatment | recording_005.isxd | 2 | -0.22 | 0.012 | -1 | 15 | 1 |
Example Trace ANOVA Output¶
| comparison | state_or_epoch | effect | df1 | df2 | SS | MS | F | p_unc | p_corr | effect_size |
|---|---|---|---|---|---|---|---|---|---|---|
| trace_activity | rest | group | 1 | 10 | 0.024 | 0.024 | 6.42 | 0.030 | 0.060 | 0.39 |
| trace_activity | rest | epoch | 2 | 20 | 0.011 | 0.005 | 2.14 | 0.143 | 0.286 | 0.21 |
| trace_activity | rest | interaction | 2 | 20 | 0.004 | 0.002 | 0.81 | 0.459 | 0.918 | 0.09 |
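For downstream reporting, an ANOVA table like the one above can be filtered with the standard library alone. The sketch below embeds the three example rows as CSV text and lists the effects that fall below a chosen threshold; `significant_effects` is a hypothetical helper, and the column names are taken from the example table rather than from a formal schema.

```python
import csv
import io
from typing import List, Tuple

# The three example rows from the table above, as they might appear on disk.
EXAMPLE_AOV_CSV = """\
comparison,state_or_epoch,effect,df1,df2,SS,MS,F,p_unc,p_corr,effect_size
trace_activity,rest,group,1,10,0.024,0.024,6.42,0.030,0.060,0.39
trace_activity,rest,epoch,2,20,0.011,0.005,2.14,0.143,0.286,0.21
trace_activity,rest,interaction,2,20,0.004,0.002,0.81,0.459,0.918,0.09
"""


def significant_effects(
    csv_text: str, alpha: float = 0.05, column: str = "p_corr"
) -> List[Tuple[str, str, str]]:
    """Return (comparison, state_or_epoch, effect) for rows with p < alpha."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["comparison"], row["state_or_epoch"], row["effect"])
        for row in reader
        if float(row[column]) < alpha
    ]


print(significant_effects(EXAMPLE_AOV_CSV))                  # corrected p-values
print(significant_effects(EXAMPLE_AOV_CSV, column="p_unc"))  # uncorrected p-values
```

On these example rows the group effect passes at α = 0.05 only on the uncorrected p-value (0.030), not on the corrected one (0.060), which shows why reports should state which column was used.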
Example Trace Pairwise Output¶
| comparison | contrast | state_or_epoch | A | B | paired | parametric | statistic | dof | p_unc | p_corr | p_adjust | effect_size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| trace_activity | group | rest | Control | Treatment | False | True | -2.53 | 10 | 0.030 | 0.060 | bonf | -0.72 |
| trace_activity | epoch | Control | baseline | training | True | True | -1.61 | 5 | 0.164 | 0.328 | bonf | -0.37 |
Example LMM Output (optional)¶
| measure | fixed_effect | estimate | std_error | df | t_value | p_unc | group | state_or_epoch |
|---|---|---|---|---|---|---|---|---|
| trace_activity | group | -0.041 | 0.016 | 96 | -2.62 | 0.010 | Treatment | rest |
| trace_activity | epoch | 0.018 | 0.007 | 96 | 2.57 | 0.012 | Treatment | training |
Previews¶
The preview examples below correspond to the epochs comparison mode (Group 1 vs. Group 2 across epochs) using trace metrics. When comparison_dimension="states" or when measure_source includes events ("event" or "both"), the tool generates the same family of figures with state-level layouts and/or event-specific data, following the exact styling shown here.
Additional per-group previews for positive/negative population correlations and the event-based activity/correlation/modulation plots are produced automatically when those modalities are analyzed. Similarly, comparison-level modulation prevalence figures (states/epochs_comparison_{trace|event}_modulation.svg) accompany the up/down count charts even though only the trace-based epoch examples are illustrated below.