omicverse.single.find_markers
- omicverse.single.find_markers(adata, groupby, method='cosg', n_genes=50, key_added=None, use_raw=None, layer=None, groups='all', reference='rest', corr_method='benjamini-hochberg', rankby_abs=False, tie_correct=False, pts=True, **kwargs)
Find marker genes for each cluster / group in single-cell data.
A unified wrapper supporting multiple algorithms. For the statistical methods (t-test, wilcoxon, logreg), the implementation is ported directly from scanpy, so there is no scanpy runtime dependency. Results are stored in adata.uns[key_added] using the same structured-array format as sc.tl.rank_genes_groups, so all downstream tools (including omicverse.single.get_markers() and omicverse.pl.markers_dotplot()) work out of the box.
- Parameters:
  - adata – Annotated data matrix. Data must be log-normalised for statistical tests; raw counts are expected for method='cosg'.
  - groupby – Key in adata.obs to group cells by (e.g. 'leiden').
  - method – Algorithm. One of:
    - 'cosg': cosine-similarity-based, fast, recommended for large datasets.
    - 't-test': Welch's t-test.
    - 't-test_overestim_var': t-test with per-group variance overestimation (conservative).
    - 'wilcoxon': Wilcoxon rank-sum / Mann-Whitney U test.
    - 'logreg': logistic regression (requires scikit-learn).
    Default: 'cosg'.
  - n_genes – Top marker genes per group to keep. Default: 50.
  - key_added – Key in adata.uns to write results to. Default: 'rank_genes_groups'.
  - use_raw – Use adata.raw for expression values. None (default) means use raw if it exists (matching scanpy behaviour).
  - layer – Layer to use instead of adata.X. Default: None.
  - groups – Groups to compute markers for: 'all' or a list of names. Default: 'all'.
  - reference – Reference group. 'rest' (default) compares each group against the union of all other cells; a group name restricts the comparison to that group only.
  - corr_method – Multiple-testing correction. 'benjamini-hochberg' (default) or 'bonferroni'. Ignored for 'cosg' and 'logreg'.
  - rankby_abs – Rank genes by absolute score instead of raw score. Default: False.
  - tie_correct – Apply tie correction for 'wilcoxon'. Default: False.
  - pts – Compute the fraction of cells expressing each gene (stored as adata.uns[key_added]['pts']). Default: True (per the signature above).
  - **kwargs – Forwarded to the underlying method (e.g. mu for cosg, or sklearn parameters for logreg).
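The two corr_method options differ in how strongly they adjust p-values. The following is a minimal pure-Python sketch of the textbook Bonferroni and Benjamini-Hochberg procedures, shown only to illustrate that difference. The function names bonferroni and benjamini_hochberg and the toy p-values are illustrative assumptions; this is not the code omicverse runs internally.

```python
def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(p * m, 1.0) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values for one group (illustrative only).
pvals = [0.01, 0.04, 0.03, 0.20]
bh = benjamini_hochberg(pvals)
bonf = bonferroni(pvals)
print("benjamini-hochberg:", bh)
print("bonferroni:        ", bonf)
```

Bonferroni controls the family-wise error rate and is therefore the more conservative choice; Benjamini-Hochberg controls the false discovery rate and typically retains more genes at the same threshold.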
- Return type:
None
- Returns:
None. Results are written to adata.uns[key_added].
- Examples:
    >>> import omicverse as ov
    >>> ov.single.find_markers(adata, groupby='leiden', method='cosg')
    >>> df = ov.single.get_markers(adata, n_genes=5)
    >>> ov.pl.markers_dotplot(adata, groupby='leiden', n_genes=5)
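The pts option described in the parameter list reports, per group, the fraction of cells expressing each gene. The sketch below illustrates that quantity on a toy matrix in plain Python. It assumes that "expressing" means a value greater than zero, and the helper name fraction_expressing and the data are hypothetical; the actual layout of adata.uns[key_added]['pts'] is not shown here.

```python
def fraction_expressing(expression, labels):
    """Return {group: [fraction of cells with a value > 0, one entry per gene]}."""
    n_genes = len(expression[0])
    counts = {}  # group -> per-gene count of expressing cells
    sizes = {}   # group -> number of cells in the group
    for row, group in zip(expression, labels):
        sizes[group] = sizes.get(group, 0) + 1
        gene_counts = counts.setdefault(group, [0] * n_genes)
        for j, value in enumerate(row):
            if value > 0:  # assumption: any positive value counts as "expressed"
                gene_counts[j] += 1
    return {g: [c / sizes[g] for c in counts[g]] for g in counts}


# Toy data: 4 cells x 3 genes, two clusters labelled "0" and "1".
expression = [
    [1.2, 0.0, 0.5],
    [0.8, 0.0, 0.0],
    [0.0, 2.1, 0.3],
    [0.0, 1.7, 0.0],
]
labels = ["0", "0", "1", "1"]
pts = fraction_expressing(expression, labels)
print(pts)
```

A gene that scores highly for a group but is expressed in only a small fraction of its cells is often a weaker marker, which is why this fraction is useful alongside the ranking score.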