c2v.tl.associations

c2v.tl.associations(adata, response_key, response_field='obsm', response_transform=None, use_rep=None, use_raw=None, layers=None, method='pearson', p_adjustment_method='fdr_bh', concat_layers=False, random_state=42, progress_bar=False, mask_key=None, clr_pseudocount=0.001, binomial=False, **gam_kwargs)

Calculate the correlation between each feature in adata and a response variable. A correlation coefficient, a p-value, and a false discovery rate (FDR) are computed for each feature. Results are stored in adata.varm under the keys “{X_name}:corr”, “{X_name}:pvalue”, and “{X_name}:FDR”, where {X_name} is the name of the feature representation (e.g., “X”, “X_raw”, or a layer name). If p_adjustment_method is None, no multiple testing correction is performed.
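The key scheme above can be illustrated with a minimal, dependency-free sketch. It is not the library's implementation: pearson_r and result_keys are hypothetical helpers, the data are made up, and a plain dict stands in for adata.varm.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def result_keys(x_name):
    """Build the three documented key names for one feature representation."""
    return [f"{x_name}:{suffix}" for suffix in ("corr", "pvalue", "FDR")]


# Hypothetical toy data: rows are cells, columns are features.
features = [[1.0, 4.0], [2.0, 3.0], [3.0, 2.0], [4.0, 1.0]]
response = [2.0, 4.0, 6.0, 8.0]

# One correlation per feature (column), stored under the "X:corr" key.
columns = list(zip(*features))
varm = {result_keys("X")[0]: [pearson_r(col, response) for col in columns]}
```

Here the first feature rises with the response and the second falls with it, so the two stored coefficients have opposite signs.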

Parameters:
adata AnnData

Annotated data matrix at the cell level.

response_key str | list[str] | None

Key in adata.obs or adata.obsm containing the response variable. May be a list only if response_field=“obs”; in that case the response variable is a multi-column DataFrame.

response_field Literal["obs", "obsm"], optional

Whether the response variable is in adata.obs or adata.obsm, by default “obsm”.

response_transform None | Literal["logit", "log1p", "sqrt", "clr"], optional

Transform to apply to the response variable before correlation calculation, by default None.
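As a rough guide to what the four transform names usually mean, the sketch below applies the textbook definitions to one row of response values. transform_response is a hypothetical helper, not part of c2v; the function's actual handling of edge cases (for example proportions of exactly 0 or 1 under “logit”) and the exact place where the pseudocount is added are assumptions here.

```python
import math


def transform_response(row, method=None, pseudocount=1e-3):
    """Apply a textbook version of one documented transform to a row of values."""
    if method is None:
        return list(row)
    if method == "logit":
        # Log-odds; defined only for values strictly between 0 and 1.
        return [math.log(p / (1.0 - p)) for p in row]
    if method == "log1p":
        return [math.log1p(v) for v in row]
    if method == "sqrt":
        return [math.sqrt(v) for v in row]
    if method == "clr":
        # Centered log-ratio: log values minus their row mean.
        # Assumption: the pseudocount is added before taking logs.
        logs = [math.log(v + pseudocount) for v in row]
        mean_log = sum(logs) / len(logs)
        return [v - mean_log for v in logs]
    raise ValueError(f"unknown transform: {method!r}")
```

A CLR-transformed row sums to zero by construction, which is a quick sanity check on any implementation.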

use_rep Literal["X", "layers"] | None, optional

Whether to use adata.X or adata.layers for feature representations. If None, adata.X is used when layers is None and adata.layers otherwise. By default None.
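The default resolution described for use_rep can be written out as a small decision function. resolve_use_rep is a hypothetical name used only for illustration; the validation step is an assumption and may not match what the library does with unexpected values.

```python
def resolve_use_rep(use_rep=None, layers=None):
    """Return "X" or "layers" following the documented default rule."""
    if use_rep is None:
        # No explicit choice: fall back on whether layers were requested.
        return "X" if layers is None else "layers"
    if use_rep not in ("X", "layers"):
        # Assumption: anything else is rejected.
        raise ValueError(f"use_rep must be 'X', 'layers', or None, got {use_rep!r}")
    return use_rep
```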

use_raw bool | None, optional

Whether to use adata.raw.X as the feature representation, by default None.

layers list[str] | str | None, optional

Layers in adata.layers to use for feature representations if use_rep=“layers”. If None and use_rep=“layers”, all layers are used. By default None.

method Literal["pearson", "spearman", "gam"], optional

Method for correlation calculation, by default “pearson”.
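To make the difference between “pearson” and “spearman” concrete: Spearman correlation is the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below uses only the standard library; average_ranks, pearson_r, and spearman_r are hypothetical helpers and not the routines c2v calls. The “gam” method is not covered here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation computed on average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))
```

For a monotonic but nonlinear relationship such as x = [1, 2, 3, 4] and y = [1, 8, 27, 64], the ranks agree perfectly while the raw values do not lie on a line, so the rank-based coefficient is the larger of the two.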

p_adjustment_method str, optional

Method for multiple testing correction, passed to statsmodels.stats.multitest.multipletests, default is “fdr_bh”.
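The default “fdr_bh” is the Benjamini-Hochberg procedure. In practice the adjustment is delegated to statsmodels, as stated above; the dependency-free sketch below only shows what the adjustment does to a list of p-values. bh_adjust is a hypothetical helper name.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted
```

Each raw p-value is scaled by n / rank, where rank is its position in ascending order, and the running minimum enforces monotonicity and caps the result at 1.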

random_state int, optional

Random seed for GAM model training, by default 42.

progress_bar bool, optional

Whether to show a progress bar, by default False.

mask_key str | None | Literal[False], optional

Key in adata.obs or adata.obsm containing a boolean mask to filter cells, by default None.

clr_pseudocount float, optional

Pseudocount to add to expression values before CLR transformation, by default 1e-3.

binomial bool, optional

If True, treat the response as count data and use binomial modeling: the response matrix is interpreted as integer counts, and proportions and row totals are computed internally. For method='gam', a Binomial GLM is fitted (with freq_weights=totals). For method='pearson' or method='spearman', a weighted correlation is computed on the proportions, with weights equal to the row totals. By default False.
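The weighted-correlation case can be sketched as follows, assuming the standard weighted Pearson formula (weighted means, weighted covariance, weighted variances). The exact formula used internally is not stated here, so weighted_pearson and the toy counts are illustrative assumptions only.

```python
import math


def weighted_pearson(x, y, w):
    """Weighted Pearson correlation with non-negative observation weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(var_x * var_y)


# Hypothetical count response: rows are cells, columns are categories.
counts = [[2, 8], [30, 70], [45, 55]]

# Row totals become the weights; counts divided by totals give proportions.
totals = [sum(row) for row in counts]
proportions = [[c / t for c in row] for row, t in zip(counts, totals)]

# Correlate one hypothetical feature with the first category's proportion.
feature = [1.0, 2.0, 3.0]
first_category = [row[0] for row in proportions]
r = weighted_pearson(feature, first_category, totals)
```

Weighting by row totals means a cell observed with 100 counts influences the coefficient ten times as much as a cell observed with 10.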

**gam_kwargs dict, optional

Additional keyword arguments to pass to gam.

concat_layers bool

Return type:

None

Returns:

None. The adata object is modified in place, with correlation results stored in adata.varm.