Combine and Compare Correlation Data

Compute Credits

This tool uses 1.0 compute credits per hour.

Overview

This tool combines cell-cell correlation data that the Compare Neural Circuit Correlations Across States tool generated from multiple recordings. It focuses on the states the user specifies, and it calculates and compares several correlation metrics across recordings, states, and experimental groups.

New version changes

  • Input Format Change: This tool now requires correlation data in HDF5 (.h5) format, specifically the file generated by the Compare Neural Circuit Correlations Across States tool. Previous versions used CSV outputs instead.

  • Benefits of H5 Input: The H5 format stores the full raw correlation matrices for each state. This enables:

    • Single-cell Level Analysis: Calculation and comparison of cell-specific statistics like the maximum, minimum, or mean correlation (statistic parameter).
    • Detailed Average Correlations: Separate calculation and comparison of the average positive and average negative correlations across all cell pairs within a state.
  • For Users of the Previous Correlation Tool: If you previously ran the Compare Neural Circuit Correlations Across States tool and saved its outputs, you can use this updated combine-and-compare tool by providing the *.h5 file that tool generated as input. The previous CSV outputs are no longer the primary input for this tool.

Parameters

| Parameter | Required? | Default | Description |
|---|---|---|---|
| Group 1 Correlation Data Files | True | N/A | Select correlation data from the first group to use for analysis |
| Group 1 Name | True | group1 | Name of the first group |
| Group 1 Color | True | tab:red | Color of the first group |
| Group 2 Correlation Data Files | False | N/A | Select correlation data from the second group to use for analysis |
| Group 2 Name | False | N/A | Name of the second group |
| Group 2 Color | True | tab:orange | Color of the second group |
| State Names | True | N/A | Names of analyzed states |
| State Colors | True | N/A | Colors of analyzed states |
| Comparison Type | False | N/A | Type of statistical test to perform |
| Multiple Comparison Correction Method | True | N/A | Method for correcting for multiple comparisons |
| Effect Size Method | True | N/A | Method for calculating the effect size |
| Data Pairing | False | unpaired | Indicates whether observations should be paired for statistical comparison |
| Subject Matching Method | False | order | Method for matching subjects between groups in paired analysis |
| Significance Threshold | True | 0.05 | p-value threshold for classifying neurons as up- or down-modulated |

Input Files

| Source Parameter | File Type | File Format |
|---|---|---|
| Group 1 Correlation Data Files | correlation_data | h5 |
| Group 2 Correlation Data Files | correlation_data | h5 |

The input files have the following requirements:

  • File Type: Input files must be HDF5 files (.h5) generated by the Compare Neural Circuit Correlations Across States tool.
  • Group Size: If data for a group is provided, that group must contain at least two H5 files.
  • State Matching: The State Names parameter must be a comma-separated list of strings, each matching the name of a dataset in the input H5 files that you wish to analyze. Matching is case-insensitive, but the names must otherwise be identical.
  • State Consistency: All input H5 files should contain datasets for the states specified in State Names.
  • State Colors: The number of State Colors provided must equal the number of State Names provided.
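
The requirements above can be checked before any analysis runs. The following is a minimal sketch, not the tool's own validation code: the function name `validate_inputs` is hypothetical, and it assumes State Colors is passed as a comma-separated string in the same way as State Names.

```python
def validate_inputs(state_names, state_colors, dataset_names_per_file):
    """Check the listed requirements; raise ValueError on the first failure.

    state_names / state_colors: comma-separated strings (the colors format
    is an assumption). dataset_names_per_file: one list of dataset names
    per input H5 file in the group.
    """
    states = [s.strip() for s in state_names.split(",") if s.strip()]
    colors = [c.strip() for c in state_colors.split(",") if c.strip()]

    # State Colors: one color per state.
    if len(colors) != len(states):
        raise ValueError(f"{len(colors)} colors given for {len(states)} states")

    # Group Size: a group that is provided needs at least two files.
    if len(dataset_names_per_file) < 2:
        raise ValueError("a group must contain at least two H5 files")

    # State Matching / Consistency: every file must hold every requested
    # state, compared case-insensitively.
    resolved = []
    for names in dataset_names_per_file:
        lookup = {name.lower(): name for name in names}
        missing = [s for s in states if s.lower() not in lookup]
        if missing:
            raise ValueError(f"states not found in file: {missing}")
        resolved.append([lookup[s.lower()] for s in states])
    return states, resolved


states, resolved = validate_inputs(
    "Immobile, mobile",
    "tab:blue, tab:green",
    [["immobile", "mobile", "other"], ["immobile", "Mobile"]],
)
```

Here `resolved` holds, for each file, the dataset names as they are actually spelled in that file, so later lookups do not depend on the case the user typed.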

Additionally, each input H5 file is expected to contain top-level datasets where:

  • The name of each dataset corresponds to a state (e.g., "immobile", "mobile", "other").
  • The value of each dataset is a 2D NumPy array representing the cell-cell Pearson correlation matrix for that state. The diagonal elements are expected to be zero.
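
To make the expected layout concrete, the sketch below writes a small file with that structure using h5py and reads it back. The file name, state names, and random matrices are placeholders chosen for illustration; real inputs come from the Compare Neural Circuit Correlations Across States tool.

```python
import os
import tempfile

import h5py
import numpy as np

rng = np.random.default_rng(0)


def random_correlation_matrix(n_cells, n_samples=200):
    """Pearson correlation matrix of random traces, with a zeroed diagonal."""
    traces = rng.normal(size=(n_cells, n_samples))
    corr = np.corrcoef(traces)
    np.fill_diagonal(corr, 0.0)
    return corr


# Write one top-level dataset per state (placeholder file and state names).
path = os.path.join(tempfile.mkdtemp(), "recording1.h5")
with h5py.File(path, "w") as f:
    for state in ("immobile", "mobile", "other"):
        f.create_dataset(state, data=random_correlation_matrix(5))

# Read every top-level dataset back into a dict of NumPy arrays.
with h5py.File(path, "r") as f:
    matrices = {name: f[name][()] for name in f.keys()}

for name, matrix in matrices.items():
    print(name, matrix.shape)
```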

Algorithm Description

The tool follows a simple three-step process:

  1. Data Processing: Reads correlation data from input files, filters for specified states, and calculates summary statistics.
  2. Statistical Analysis: Compares correlations across states and groups to identify significant differences.
  3. Output Generation: Saves combined data, statistical results, and creates visualization plots.

Data Processing

  1. Read Data: The tool reads correlation data from H5 files for each group.
  2. Filter States: Only the user-specified states are included in the analysis.
  3. Calculate Average Correlations: For each recording and state, the tool calculates average positive and negative correlations.
  4. Calculate Cell Statistics: For each individual cell, the tool calculates the specified statistic (max, min, or mean) of that cell's correlations with other cells.
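
Steps 3 and 4 can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the tool's implementation: it assumes each cell pair is counted once (upper triangle), that the positive and negative averages are taken over the strictly positive and strictly negative pairs respectively, and that the diagonal is excluded from the per-cell statistic. The function names are hypothetical.

```python
import numpy as np


def average_correlations(corr):
    """Mean of positive and of negative pairwise correlations (each pair once)."""
    pairs = corr[np.triu_indices_from(corr, k=1)]
    positive = pairs[pairs > 0]
    negative = pairs[pairs < 0]
    avg_pos = float(positive.mean()) if positive.size else float("nan")
    avg_neg = float(negative.mean()) if negative.size else float("nan")
    return avg_pos, avg_neg


def cell_statistic(corr, statistic="max"):
    """Per-cell max, min, or mean correlation with all other cells."""
    off_diag = np.array(corr, dtype=float)
    np.fill_diagonal(off_diag, np.nan)  # ignore each cell's self-entry
    reducers = {"max": np.nanmax, "min": np.nanmin, "mean": np.nanmean}
    return reducers[statistic](off_diag, axis=1)


corr = np.array([
    [0.0, 0.4, -0.2],
    [0.4, 0.0, 0.1],
    [-0.2, 0.1, 0.0],
])
avg_pos, avg_neg = average_correlations(corr)  # 0.25 and -0.2
cell_max = cell_statistic(corr, "max")         # [0.4, 0.4, 0.1]
```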

Statistical Comparisons

The tool uses functions from the pingouin package for statistical analysis.

  1. Average Correlation Analysis:

    • Single Group: Compares average positive/negative correlations across states using a one-way Repeated Measures ANOVA (pingouin.rm_anova).
    • Two Groups (Paired): Compares average positive/negative correlations across states and groups using a two-way Repeated Measures ANOVA (pingouin.rm_anova).
    • Two Groups (Unpaired): Compares average positive/negative correlations across states (within-subject factor) and groups (between-subject factor) using a Mixed ANOVA (pingouin.mixed_anova).
  2. Cell-level Statistic Analysis:

    • State Comparison: Uses Linear Mixed Models (LMM) (pingouin.linear_regression) to compare the cell-level statistic across states, accounting for within-subject variability and potentially group differences. This handles the nested structure (cells within subjects).
    • Group Comparison: First, averages the cell-level statistic per subject/state/group. Then, compares these subject-level averages between groups using ANOVA (RM-ANOVA for paired, Mixed ANOVA for unpaired).
  3. Pairwise Comparisons: Following significant ANOVA or LMM results, pairwise tests (pingouin.pairwise_tests) are performed to pinpoint differences between specific states or groups. The user selects the multiple comparison correction method and effect size calculation.

Outputs

Combination Outputs

Combined Data (CSV)

Two types of combined data files are generated per group:

  1. Average Correlations: A CSV file containing the calculated average positive and negative correlations for each recording and state.

  2. Cell Statistic Correlations: A CSV file containing the calculated cell-level statistic (max, min, or mean) for each cell, state, and recording.

Example Average Correlation Data:

| File | State | Positive Correlation | Negative Correlation | Subject | Group |
|---|---|---|---|---|---|
| recording1.h5 | immobile | 0.08 | -0.07 | 1 | Group 1 |
| recording1.h5 | mobile | 0.09 | -0.08 | 1 | Group 1 |
| recording2.h5 | immobile | 0.07 | -0.06 | 2 | Group 1 |
| recording2.h5 | mobile | 0.10 | -0.09 | 2 | Group 1 |

Example Cell Statistic Data (Maximum Correlation):

| File | State | Cell | Max Correlation | Subject | Group |
|---|---|---|---|---|---|
| recording1.h5 | immobile | 1 | 0.36 | 1 | Group 1 |
| recording1.h5 | immobile | 2 | 0.45 | 1 | Group 1 |
| recording1.h5 | mobile | 1 | 0.39 | 1 | Group 1 |
| recording1.h5 | mobile | 2 | 0.48 | 1 | Group 1 |

Combination Previews (SVG)

For each group, multiple visualization files are generated as previews for the combined data:

Average Correlation Previews:

  • Two boxplot visualizations for average positive and negative correlations.

Cell-level Statistic Previews:

  • One CDF plot showing the distribution of cell-level correlation statistics.
  • One boxplot with individual data points for cell-level correlation statistics.

These plots show the distribution of the respective correlation values across all recordings in the group, with separate lines or groupings colored by state.
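
For readers unfamiliar with CDF plots, the curve for each state is an empirical cumulative distribution of the cell-level values. The sketch below shows one common way to compute it, using the example Max Correlation values from the table above. The helper name `empirical_cdf` is hypothetical, and the tool's plotting code may compute the curve differently.

```python
import numpy as np


def empirical_cdf(values):
    """Return sorted values and the fraction of observations <= each value."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, x.size + 1) / x.size
    return x, y


# Example Max Correlation values, grouped by state.
max_corr_by_state = {
    "immobile": [0.36, 0.45],
    "mobile": [0.39, 0.48],
}
cdf_by_state = {
    state: empirical_cdf(values) for state, values in max_corr_by_state.items()
}
```

Each `(x, y)` pair can then be drawn as a step line in the color assigned to that state.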

Example: Max Correlation CDF for Group 1
Example: Max Correlation CDF for Group 2

Statistical Comparison Outputs

Statistical Results (CSV)

Two CSV files summarize the statistical tests:

  1. ANOVA Results: Contains summary results from statistical tests comparing groups and states, including test statistics, p-values, and analysis parameters.

  2. Pairwise Comparisons: Contains detailed results from follow-up tests that identify specific differences between conditions, including effect sizes and corrected p-values.

Note: The example tables below are illustrative and may not exactly match all columns generated by every possible test configuration.

Example ANOVA Results:

| Source | P-value | F-statistic | Stat Method | Measure | Comparison | Analysis Level | Multiple Correction | Data Pairing |
|---|---|---|---|---|---|---|---|---|
| state | 0.002 | 477.74 | rm_anova | positive_correlation | Average Correlation | subject | bonf | paired |
| group | 0.241 | 6.32 | rm_anova | positive_correlation | Average Correlation | subject | bonf | paired |
| state * group | 0.114 | 7.79 | rm_anova | positive_correlation | Average Correlation | subject | bonf | paired |
| state | 0.0002 | 4052.66 | rm_anova | max_correlation | Max Correlation | subject | bonf | paired |
| group | 0.219 | 7.78 | rm_anova | max_correlation | Max Correlation | subject | bonf | paired |

Example Pairwise Comparisons:

| A | B | Contrast | T | P-unc | P-corr | Effect Size | Stat Method | Measure | Comparison | Analysis Level | Multiple Correction | Data Pairing |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| center | quad 1 | center vs quad 1 | 0.009 | 0.994 | 1.0 | 0.004 | paired_ttest | activity | trace_activity | subject | bonf | unpaired |
| center | quad 2 | center vs quad 2 | -0.010 | 0.993 | 1.0 | -0.005 | paired_ttest | activity | trace_activity | subject | bonf | unpaired |
| center | quad 4 | center vs quad 4 | 1.084 | 0.358 | 1.0 | 0.542 | paired_ttest | activity | trace_activity | subject | bonf | unpaired |
| Drug | Vehicle | group | -0.384 | 0.738 | - | -0.192 | unpaired_ttest | activity | trace_activity | subject | bonf | unpaired |
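
The P-corr column is easier to read with the correction written out. With the Bonferroni method (bonf), each uncorrected p-value is multiplied by the number of comparisons in the family and capped at 1. The sketch below applies this to the three state comparisons shown above, assuming for illustration that they form the whole family; the tool may count comparisons differently.

```python
def bonferroni(p_values):
    """Bonferroni correction: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Uncorrected p-values of the three state comparisons in the example table.
p_unc = [0.994, 0.993, 0.358]
p_corr = bonferroni(p_unc)
print(p_corr)
```

Even the smallest value, 0.358, exceeds 1 once multiplied by three, which is why every corrected value in these rows is 1.0.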

Statistical Comparison Previews (SVG)

Several plots visualize the statistical results:

  1. Average Correlation Distribution: Shows how average positive and negative correlations differ across states and groups.
     Example: Average Correlation Distribution
  2. Correlation State Comparison: Displays cell-level statistic comparisons across different states.
     Example: Max Correlation State Comparison
  3. Correlation Group Comparison: Shows how correlation statistics differ between experimental groups.
     Example: Max Correlation Group Comparison