Combine and Compare Correlation Data
This tool uses 1.0 compute credits per hour.
Overview
This tool combines cell-cell correlation data generated by the Compare Neural Circuit Correlations Across States tool from multiple recordings, restricted to the states specified by the user. It calculates and compares several correlation metrics (average positive and negative correlations, and a per-cell correlation statistic) across recordings, states, and experimental groups.
New version changes
- Input Format Change: This tool now requires correlation data in HDF5 (`.h5`) format, specifically the file generated by the Compare Neural Circuit Correlations Across States tool. Previous versions used CSV outputs.
- Benefits of H5 Input: The H5 format stores the full raw correlation matrices for each state. This enables:
  - Single-cell Level Analysis: Calculation and comparison of cell-specific statistics such as the maximum, minimum, or mean correlation (`statistic` parameter).
  - Detailed Average Correlations: Separate calculation and comparison of the average positive and average negative correlations across all cell pairs within a state.
- For Users of the Previous Correlation Tool: If you previously used the Compare Neural Circuit Correlations Across States tool and saved its outputs, you can use this updated combine-and-compare tool by providing the `*.h5` file generated by that tool as input. The previous CSV outputs are no longer the primary input for this combined analysis tool.
Parameters
Parameter | Required? | Default | Description |
---|---|---|---|
Group 1 Correlation Data Files | True | N/A | Select correlation data from the first group to use for analysis |
Group 1 Name | True | group1 | Name of the first group |
Group 1 Color | True | tab:red | Color of the first group |
Group 2 Correlation Data Files | False | N/A | Select correlation data from the second group to use for analysis |
Group 2 Name | False | N/A | Name of the second group |
Group 2 Color | True | tab:orange | Color of the second group |
State Names | True | N/A | Names of analyzed states |
State Colors | True | N/A | Colors of analyzed states |
Comparison Type | False | N/A | Type of statistical test to perform |
Multiple Comparison Correction method | True | N/A | Method for correcting for multiple comparisons |
Effect Size Method | True | N/A | Method for calculating the effect size |
Data Pairing | False | unpaired | Indicates whether observations should be paired for statistical comparison |
Subject Matching Method | False | order | Method for matching subjects between groups in paired analysis |
Significance Threshold | True | 0.05 | p-value threshold for classifying neurons as up- or down-modulated |
Input Files
Source Parameter | File Type | File Format |
---|---|---|
Group 1 Correlation Data Files | correlation_data | h5 |
Group 2 Correlation Data Files | correlation_data | h5 |
The input files have the following requirements:
- File Type: Input files must be HDF5 files (`.h5`) generated by the Compare Neural Circuit Correlations Across States tool.
- Group Size: If data for a group is provided, that group must contain at least two H5 files.
- State Matching: The `State Names` parameter must be a comma-separated list of strings that match, ignoring case, the names of the datasets within the input H5 files that you wish to analyze.
- State Consistency: All input H5 files should contain datasets for the states specified in `State Names`.
- State Colors: The number of `state_colors` provided must equal the number of `State Names` provided.
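The state matching and color count requirements can be sketched as a small check. This is an illustrative sketch, not the tool's own validation code: the helper name `resolve_states` is hypothetical, and it assumes that state colors are also supplied as a comma-separated string.

```python
def resolve_states(state_names, state_colors, dataset_names):
    """Map each requested state to its dataset name and color (hypothetical helper).

    state_names / state_colors: comma-separated strings (the color format is an assumption).
    dataset_names: names of the datasets found in one input H5 file.
    """
    names = [s.strip() for s in state_names.split(",") if s.strip()]
    colors = [c.strip() for c in state_colors.split(",") if c.strip()]
    if len(names) != len(colors):
        raise ValueError(
            f"{len(colors)} colors given for {len(names)} state names"
        )

    # Case-insensitive lookup from lowercased name to the stored dataset name.
    lookup = {name.lower(): name for name in dataset_names}
    resolved = {}
    for name, color in zip(names, colors):
        if name.lower() not in lookup:
            raise ValueError(f"state '{name}' not found in input file")
        resolved[lookup[name.lower()]] = color
    return resolved


mapping = resolve_states(
    "Immobile, mobile",
    "tab:blue, tab:green",
    ["immobile", "mobile", "other"],
)
```

Here `mapping` pairs the datasets `immobile` and `mobile` with their colors, and the unrequested `other` dataset is ignored.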
Additionally, each input H5 file is expected to contain top-level datasets where:
- The name of each dataset corresponds to a state (e.g., "immobile", "mobile", "other").
- The value of each dataset is a 2D NumPy array representing the cell-cell Pearson correlation matrix for that state. The diagonal elements are expected to be zero.
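A quick sanity check of the expected matrix layout might look like the following. It is a dependency-free sketch in which nested lists stand in for the 2D NumPy arrays stored in the file; the helper name `check_correlation_matrix` and the tolerance are assumptions. The symmetry and value-range checks follow from the definition of a Pearson correlation matrix, not from a stated requirement of the tool.

```python
def check_correlation_matrix(matrix, tol=1e-9):
    """Validate one state's correlation matrix and return the number of cells."""
    n = len(matrix)
    if n == 0 or any(len(row) != n for row in matrix):
        raise ValueError("correlation matrix must be square and non-empty")
    for i in range(n):
        # The diagonal (self-correlation) is expected to be zeroed out.
        if abs(matrix[i][i]) > tol:
            raise ValueError(f"diagonal element {i} is not zero")
        for j in range(n):
            if not -1.0 <= matrix[i][j] <= 1.0:
                raise ValueError(f"value at ({i}, {j}) is outside [-1, 1]")
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                raise ValueError(f"matrix is not symmetric at ({i}, {j})")
    return n


# Stand-in for the top-level datasets of one input file (synthetic values).
recording = {
    "immobile": [[0.0, 0.3], [0.3, 0.0]],
    "mobile": [[0.0, -0.1], [-0.1, 0.0]],
}
n_cells = {state: check_correlation_matrix(m) for state, m in recording.items()}
```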
Algorithm Description
The tool follows a simple three-step process:
- Data Processing: Reads correlation data from input files, filters for specified states, and calculates summary statistics.
- Statistical Analysis: Compares correlations across states and groups to identify significant differences.
- Output Generation: Saves combined data, statistical results, and creates visualization plots.
Data Processing
- Read Data: The tool reads correlation data from H5 files for each group.
- Filter States: Only the user-specified states are included in the analysis.
- Calculate Average Correlations: For each recording and state, the tool calculates average positive and negative correlations.
- Calculate Cell Statistics: For each individual cell, the tool calculates the specified statistic (max, min, or mean) of that cell's correlations with other cells.
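The two summary calculations can be illustrated on a small synthetic matrix. This sketch makes assumptions that the description above does not confirm: each cell pair is counted once (upper triangle), "positive" and "negative" mean strictly above and below zero, and the diagonal is excluded from each cell's statistic. The function names are hypothetical, and nested lists stand in for NumPy arrays.

```python
from statistics import mean


def average_correlations(matrix):
    """Return (average positive, average negative) correlation over unique cell pairs."""
    n = len(matrix)
    pairs = [matrix[i][j] for i in range(n) for j in range(i + 1, n)]
    positive = [v for v in pairs if v > 0]
    negative = [v for v in pairs if v < 0]
    nan = float("nan")
    return (
        mean(positive) if positive else nan,
        mean(negative) if negative else nan,
    )


def cell_statistic(matrix, statistic="max"):
    """Return one value per cell: max, min, or mean of its correlations with other cells."""
    func = {"max": max, "min": min, "mean": mean}[statistic]
    return [
        func([v for j, v in enumerate(row) if j != i])
        for i, row in enumerate(matrix)
    ]


# Synthetic 3-cell correlation matrix for a single state.
corr = [
    [0.0, 0.4, -0.2],
    [0.4, 0.0, 0.1],
    [-0.2, 0.1, 0.0],
]
avg_pos, avg_neg = average_correlations(corr)
max_per_cell = cell_statistic(corr, "max")
```

For this matrix the unique pairs are 0.4, -0.2, and 0.1, so the positive pairs average to about 0.25, the single negative pair gives -0.2, and the per-cell maxima are 0.4, 0.4, and 0.1.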
Statistical Comparisons
The tool uses functions from the pingouin package for statistical analysis.
- Average Correlation Analysis:
  - Single Group: Compares average positive/negative correlations across states using a one-way Repeated Measures ANOVA (`pingouin.rm_anova`).
  - Two Groups (Paired): Compares average positive/negative correlations across states and groups using a two-way Repeated Measures ANOVA (`pingouin.rm_anova`).
  - Two Groups (Unpaired): Compares average positive/negative correlations across states (within-subject factor) and groups (between-subject factor) using a Mixed ANOVA (`pingouin.mixed_anova`).
- Cell-level Statistic Analysis:
  - State Comparison: Uses Linear Mixed Models (LMM) (`pingouin.linear_regression`) to compare the cell-level statistic across states, accounting for within-subject variability and potentially group differences. This handles the nested structure (cells within subjects).
  - Group Comparison: First averages the cell-level statistic per subject/state/group, then compares these subject-level averages between groups using ANOVA (RM-ANOVA for paired, Mixed ANOVA for unpaired).
- Pairwise Comparisons: Following significant ANOVA or LMM results, pairwise tests (`pingouin.pairwise_tests`) are performed to pinpoint differences between specific states or groups. The user selects the multiple comparison correction method and effect size calculation.
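The choice between ANOVA variants and the first step of the group comparison can be sketched in plain Python. This is a minimal illustration rather than the tool's implementation: the helper names `choose_anova` and `subject_level_means`, the row layout, and the returned labels are hypothetical. The values come from the example cell statistic table in the Outputs section.

```python
from collections import defaultdict
from statistics import mean


def choose_anova(n_groups, data_pairing="unpaired"):
    """Return a label for the ANOVA variant described above (hypothetical helper)."""
    if n_groups == 1:
        return "one-way rm_anova"
    if data_pairing == "paired":
        return "two-way rm_anova"
    return "mixed_anova"


def subject_level_means(rows):
    """Average a cell-level statistic per (group, subject, state)."""
    buckets = defaultdict(list)
    for row in rows:
        key = (row["group"], row["subject"], row["state"])
        buckets[key].append(row["value"])
    return {key: mean(values) for key, values in buckets.items()}


# Maximum correlation per cell, as in the example cell statistic table.
cell_rows = [
    {"group": "Group 1", "subject": 1, "state": "immobile", "cell": 1, "value": 0.36},
    {"group": "Group 1", "subject": 1, "state": "immobile", "cell": 2, "value": 0.45},
    {"group": "Group 1", "subject": 1, "state": "mobile", "cell": 1, "value": 0.39},
    {"group": "Group 1", "subject": 1, "state": "mobile", "cell": 2, "value": 0.48},
]
subject_means = subject_level_means(cell_rows)
test_for_two_unpaired_groups = choose_anova(2, "unpaired")
```

Averaging to one value per subject and state before the group comparison means that subjects, not individual cells, are the units compared between groups.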
Outputs
Combination Outputs
Combined Data (CSV)
Two types of combined data files are generated per group:
-
Average Correlations: A CSV file containing the calculated average positive and negative correlations for each recording and state.
-
Cell Statistic Correlations: A CSV file containing the calculated cell-level statistic (max, min, or mean) for each cell, state, and recording.
Example Average Correlation Data:
File | State | Positive Correlation | Negative Correlation | Subject | Group |
---|---|---|---|---|---|
recording1.h5 | immobile | 0.08 | -0.07 | 1 | Group 1 |
recording1.h5 | mobile | 0.09 | -0.08 | 1 | Group 1 |
recording2.h5 | immobile | 0.07 | -0.06 | 2 | Group 1 |
recording2.h5 | mobile | 0.10 | -0.09 | 2 | Group 1 |
Example Cell Statistic Data (Maximum Correlation):
File | State | Cell | Max Correlation | Subject | Group |
---|---|---|---|---|---|
recording1.h5 | immobile | 1 | 0.36 | 1 | Group 1 |
recording1.h5 | immobile | 2 | 0.45 | 1 | Group 1 |
recording1.h5 | mobile | 1 | 0.39 | 1 | Group 1 |
recording1.h5 | mobile | 2 | 0.48 | 1 | Group 1 |
Combination Previews (SVG)
For each group, multiple visualization files are generated as previews for the combined data:
- Average Correlation Previews:
  - Two boxplot visualizations for average positive and negative correlations.
- Cell-level Statistic Previews:
  - One CDF plot showing the distribution of cell-level correlation statistics.
  - One boxplot with individual data points for cell-level correlation statistics.
These plots show the distribution of the respective correlation values across all recordings in the group, with separate lines or groupings colored by state.
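For readers unfamiliar with CDF plots, the curve for each state can be understood as an empirical cumulative distribution: the values are sorted and each one is paired with the fraction of values at or below it. The sketch below illustrates that idea using the values from the example cell statistic table. How the tool itself computes and draws the curve is not specified here, and the helper name `empirical_cdf` is hypothetical.

```python
def empirical_cdf(values):
    """Return (value, cumulative fraction) points for an empirical CDF."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


# Maximum correlation per cell, grouped by state (from the example table).
max_corr_by_state = {
    "immobile": [0.36, 0.45],
    "mobile": [0.39, 0.48],
}
cdf_by_state = {
    state: empirical_cdf(values) for state, values in max_corr_by_state.items()
}
```

Each state then yields one step curve, so a state whose curve lies further to the right has generally larger cell-level values.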
Statistical Comparison Outputs
Statistical Results (CSV)
Two CSV files summarize the statistical tests:
-
ANOVA Results: Contains summary results from statistical tests comparing groups and states, including test statistics, p-values, and analysis parameters.
-
Pairwise Comparisons: Contains detailed results from follow-up tests that identify specific differences between conditions, including effect sizes and corrected p-values.
Note: The example tables below are illustrative and may not exactly match all columns generated by every possible test configuration.
Example ANOVA Results:
Source | P-value | F-statistic | Stat Method | Measure | Comparison | Analysis Level | Multiple Correction | Data Pairing |
---|---|---|---|---|---|---|---|---|
state | 0.002 | 477.74 | rm_anova | positive_correlation | Average Correlation | subject | bonf | paired |
group | 0.241 | 6.32 | rm_anova | positive_correlation | Average Correlation | subject | bonf | paired |
state * group | 0.114 | 7.79 | rm_anova | positive_correlation | Average Correlation | subject | bonf | paired |
state | 0.0002 | 4052.66 | rm_anova | max_correlation | Max Correlation | subject | bonf | paired |
group | 0.219 | 7.78 | rm_anova | max_correlation | Max Correlation | subject | bonf | paired |
Example Pairwise Comparisons:
A | B | Contrast | T | P-unc | P-corr | Effect Size | Stat Method | Measure | Comparison | Analysis Level | Multiple Correction | Data Pairing |
---|---|---|---|---|---|---|---|---|---|---|---|---|
center | quad 1 | center vs quad 1 | 0.009 | 0.994 | 1.0 | 0.004 | paired_ttest | activity | trace_activity | subject | bonf | unpaired |
center | quad 2 | center vs quad 2 | -0.010 | 0.993 | 1.0 | -0.005 | paired_ttest | activity | trace_activity | subject | bonf | unpaired |
center | quad 4 | center vs quad 4 | 1.084 | 0.358 | 1.0 | 0.542 | paired_ttest | activity | trace_activity | subject | bonf | unpaired |
Drug | Vehicle | group | -0.384 | 0.738 | - | -0.192 | unpaired_ttest | activity | trace_activity | subject | bonf | unpaired |
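The `bonf` value in the Multiple Correction column refers to Bonferroni correction, in which each uncorrected p-value is multiplied by the number of comparisons and capped at 1. The sketch below applies it to the three state contrasts from the example table, assuming they form a single family of comparisons. The function name `bonferroni` is a hypothetical helper and not part of the tool.

```python
def bonferroni(p_values):
    """Bonferroni-adjust a family of p-values: multiply by the family size, cap at 1."""
    m = len(p_values)
    return [min(p * m, 1.0) for p in p_values]


# Uncorrected p-values (P-unc) of the three state contrasts in the example table.
p_unc = [0.994, 0.993, 0.358]
p_corr = bonferroni(p_unc)
```

All three adjusted values are capped at 1.0, which agrees with the P-corr column of the example.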
Statistical Comparison Previews (SVG)
Several plots visualize the statistical results:
- Average Correlation Distribution: Shows how average positive and negative correlations differ across states and groups.
- Correlation State Comparison: Displays cell-level statistic comparisons across different states.
- Correlation Group Comparison: Shows how correlation statistics differ between experimental groups.