User API

Import OmicVerse as:

import omicverse as ov
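
Before using the names listed on this page, it can help to confirm that the package is available in the current environment. The sketch below uses only the Python standard library. The helper name `installed_version` is hypothetical (it is not part of OmicVerse), and the code assumes the distribution is installed under the name `omicverse`.

```python
from importlib import metadata


def installed_version(dist_name="omicverse"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("omicverse is not installed in this environment")
else:
    print(f"omicverse {version} is installed")
```

Once the import succeeds, every entry below is reached through the `ov` alias, for example `ov.pp.qc` or `ov.pl.embedding`.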

Data IO

read

Read common omics file formats into AnnData or pandas DataFrame.

io.read_h5ad

Read an .h5ad file.

io.read_10x_h5

Read a 10x Genomics HDF5 matrix file.

io.read_10x_mtx

Read a 10x Genomics Matrix Market directory.

io.read_nanostring

Read a NanoString-formatted dataset.

io.read_visium_hd

Read 10x Visium HD outputs with a single entry point.

Preprocessing (pp)

Quality control & filtering

pp.qc

Perform quality control on a dictionary of AnnData objects.

pp.filter_cells

Filter cell outliers based on counts and numbers of genes expressed.

pp.filter_genes

Filter genes based on number of cells or counts.
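
The thresholding idea behind pp.filter_cells and pp.filter_genes can be sketched in plain Python. This is an illustration of the concept only, not the OmicVerse implementation: the helper names `cells_to_keep` and `genes_to_keep`, their parameters, and the toy matrix are all assumptions made for this example.

```python
# Toy count matrix: rows are cells, columns are genes (made-up numbers).
counts = [
    [5, 0, 0, 1],
    [0, 0, 0, 0],
    [3, 2, 0, 4],
]


def cells_to_keep(matrix, min_counts=1, min_genes=1):
    """Return indices of cells with enough total counts and expressed genes."""
    keep = []
    for i, row in enumerate(matrix):
        total = sum(row)
        n_genes = sum(1 for value in row if value > 0)
        if total >= min_counts and n_genes >= min_genes:
            keep.append(i)
    return keep


def genes_to_keep(matrix, min_cells=1):
    """Return indices of genes detected in at least min_cells cells."""
    keep = []
    for j, column in enumerate(zip(*matrix)):
        n_cells = sum(1 for value in column if value > 0)
        if n_cells >= min_cells:
            keep.append(j)
    return keep


kept_cells = cells_to_keep(counts, min_counts=3, min_genes=2)
kept_genes = genes_to_keep(counts, min_cells=2)
print(kept_cells, kept_genes)
```

Here the empty second cell is dropped, and only the genes detected in at least two cells survive.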

pp.scrublet

Predict cell doublets using Scrublet with optional GPU acceleration.

Normalisation & feature selection

pp.normalize_total

Normalize counts per cell.

pp.log1p

Log-transform expression values with log(1 + x).
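
As a worked illustration of the two steps above (per-cell total-count normalization followed by log(1 + x)), here is a minimal pure-Python sketch. It is not the OmicVerse implementation: the helper names `scale_to_total` and `log1p_matrix`, the `target_sum` value, and the toy counts are assumptions chosen for this example.

```python
import math

# Toy count matrix: rows are cells, columns are genes (made-up numbers).
counts = [
    [10, 0, 30],
    [1, 1, 2],
]


def scale_to_total(matrix, target_sum):
    """Scale each row (cell) so that its counts sum to target_sum."""
    normalized = []
    for row in matrix:
        total = sum(row)
        # An all-zero cell stays all-zero; this also avoids dividing by zero.
        factor = target_sum / total if total > 0 else 0.0
        normalized.append([value * factor for value in row])
    return normalized


def log1p_matrix(matrix):
    """Apply log(1 + x) element-wise."""
    return [[math.log1p(value) for value in row] for row in matrix]


norm = scale_to_total(counts, target_sum=100)
logged = log1p_matrix(norm)
print(norm)
print(logged)
```

After normalization both cells sum to the same total, so the log-transformed values are comparable across cells despite their very different sequencing depths.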

pp.highly_variable_genes

Run HVG detection and write flags/statistics into adata.var.

pp.highly_variable_features

Select highly variable features (HVF/HVG) for downstream modeling.

pp.normalize_pearson_residuals

Normalize count matrix using Pearson residuals.
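
For readers unfamiliar with Pearson residuals, the sketch below shows one common analytic formulation: the expected count for each cell and gene is estimated from the row and column totals, and the residual is scaled by a negative-binomial variance. Whether pp.normalize_pearson_residuals uses exactly this formulation is an assumption. The helper name `pearson_residuals`, the overdispersion `theta=100`, and clipping at the square root of the number of cells are illustrative choices, not confirmed OmicVerse defaults.

```python
import math

# Toy count matrix: rows are cells, columns are genes (made-up numbers).
counts = [
    [4, 0, 6],
    [1, 3, 6],
]


def pearson_residuals(matrix, theta=100.0):
    """Analytic Pearson residuals under a negative-binomial null model."""
    n_cells = len(matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(column) for column in zip(*matrix)]
    total = sum(row_sums)
    clip = math.sqrt(n_cells)  # assumed clipping threshold
    residuals = []
    for i, row in enumerate(matrix):
        out = []
        for j, x in enumerate(row):
            # Expected count if cell depth and gene abundance were independent.
            mu = row_sums[i] * col_sums[j] / total
            if mu == 0:
                out.append(0.0)
                continue
            r = (x - mu) / math.sqrt(mu + mu * mu / theta)
            out.append(max(-clip, min(clip, r)))
        residuals.append(out)
    return residuals


res = pearson_residuals(counts)
for row in res:
    print([round(value, 3) for value in row])
```

A residual of zero means the observed count matches the expectation; large positive or negative values mark genes that deviate from it in a given cell.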

pp.recover_counts

Given log-normalized gene expression data, recover the raw read/UMI counts by inferring the unknown size factors.

Dimensionality reduction & graph

pp.pca

Perform principal component analysis (PCA) on the data stored in an AnnData object.

pp.neighbors

Compute a neighborhood graph of observations [McInnes18].

pp.umap

Run UMAP on an AnnData object, choosing the implementation based on settings.mode. The arguments are those documented for scanpy.tl.umap.

pp.tsne

Compute t-SNE coordinates for cells.

pp.mde

Run MDE (Minimum Distortion Embedding) from a latent representation.

Clustering

pp.leiden

Run Leiden clustering.

pp.louvain

Run Louvain clustering on the precomputed kNN graph.

Batch correction & scaling

pp.scale

Scale the input AnnData object.

pp.regress

Regress out technical covariates from each gene.

pp.regress_and_scale

Scale the regressed layer and store it as a new analysis layer.

pp.remove_cc_genes

Remove cell-cycle-correlated genes from highly_variable_features.

pp.score_genes_cell_cycle

Score cell cycle phases using predefined or custom gene sets.

Single-cell (single)

Annotation

single.pySCSA

Automated cell-type annotation using SCSA marker-enrichment scoring.

single.MetaTiME

MetaTiME wrapper for tumor microenvironment cell-state annotation.

single.CellVote

Ensemble cell-type annotation manager with multiple backends.

single.gptcelltype

Annotate cluster cell types with a remote LLM service.

single.gptcelltype_local

Annotate cell types with a local instruction-tuned LLM.

single.CellOntologyMapper

Cell ontology mapping class using NLP.

single.Annotation

Unified single-cell annotation manager for cell-type labeling.

single.AnnotationRef

Reference-based label transfer helper for single-cell annotation.

Trajectory & cell fate

single.TrajInfer

Trajectory inference class for single-cell data analysis.

single.Velo

RNA velocity analysis wrapper for directional cell-state transition inference.

single.Fate

Adaptive ridge-regression framework for pseudotime-associated gene discovery.

single.cytotrace2

Predict developmental potency with CytoTRACE2.

Cell structure

single.MetaCell

SEACells-based metacell construction workflow.

single.DEG

Differential gene-expression testing wrapper for single-cell datasets.

single.SCENIC

single.aucell

Calculate gene signature enrichment scores using AUCell algorithm.

single.geneset_aucell

Calculate the AUCell score for a given gene set.

single.cellphonedb_v5

Run CellPhoneDB statistical analysis with proper file handling.

single.Drug_Response

Predict drug sensitivity from single-cell transcriptomes using CaDRReS models.

Batch correction & integration

single.Batch

Run MultiMAP to correct batch effects within a single AnnData object.

single.pySIMBA

SIMBA wrapper for single-cell batch integration and graph-embedding construction.

single.Integration

Run MultiMAP to integrate a number of AnnData objects from various multi-omics experiments into a single joint dimensionally reduced space.

Multi-omics

single.pyMOFA

Train MOFA models for latent factor discovery across multiple omics layers.

single.pyMOFAART

Load pretrained MOFA models for downstream factor interpretation.

single.GLUE_pair

Pair RNA and ATAC cells using GLUE latent embeddings and neighbor matching.

single.pyTOSICA

TOSICA wrapper for pathway-informed transformer-based cell-type annotation.

Topic modelling

single.cNMF

Consensus NMF workflow wrapper for robust gene-program discovery.

Bulk RNA-seq (bulk)

bulk.pyDEG

Differential-expression analysis helper for bulk RNA-seq count tables.

bulk.pyGSEA

Gene Set Enrichment Analysis (GSEA) wrapper for ranked gene lists.

bulk.pyPPI

Protein-protein interaction (PPI) analysis wrapper based on STRING.

bulk.pyTCGA

TCGA (The Cancer Genome Atlas) data analysis module.

bulk.Deconvolution

Bulk RNA-seq deconvolution class for inferring cell-type fractions from single-cell references.

bulk.Matrix_ID_mapping

Map gene IDs in the input data to gene symbols using a reference table.

bulk.batch_correction

Perform batch effect correction using ComBat algorithm.

bulk.geneset_enrichment

Perform pathway enrichment analysis using Enrichr-compatible gene-set libraries.

Spatial transcriptomics (space)

space.clusters

Perform clustering analysis on spatial transcriptomics data using multiple methods.

space.Deconvolution

Spatial deconvolution pipeline that aligns scRNA-seq references with spatial transcriptomics.

space.pySTAGATE

A class representing the PyTorch implementation of STAGATE (Spatial Transcriptomics Analysis using Graph Attention autoEncoder).

space.pySTAligner

STAligner for spatial transcriptomics data integration.

space.pySpaceFlow

SpaceFlow spatial flow analysis class.

space.Tangram

Tangram spatial deconvolution class for cell type mapping.

space.STT

Spatial Transition Tensor (STT) analysis class.

space.GASTON

GASTON spatial depth estimation and clustering.

space.Cal_Spatial_Net

Construct spatial neighbor networks for spatial integration.

space.spatial_neighbors

Build a spatial neighborhood graph from coordinates stored in adata.obsm.

space.moranI

Compute Moran's I spatial autocorrelation for gene expression.
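
Moran's I measures whether neighboring locations carry similar values. The sketch below computes the standard statistic, I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, in plain Python. It is not the space.moranI implementation: the helper name `morans_i`, the binary adjacency weights, and the four-spot toy layout are assumptions for illustration.

```python
def morans_i(values, weights):
    """Moran's I for one feature given a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [value - mean for value in values]
    weight_sum = sum(sum(row) for row in weights)
    # Weighted sum of products of deviations over all pairs of locations.
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    squared = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / squared)


# Four spots on a line; each spot is a neighbor of the adjacent spots.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth = morans_i([1, 2, 3, 4], adjacency)       # gradual gradient
alternating = morans_i([1, 0, 1, 0], adjacency)  # checkerboard-like pattern
print(smooth, alternating)
```

The smooth gradient gives a positive value (neighbors resemble each other), while the alternating pattern gives a negative one.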

Bulk-to-Single (bulk2single)

bulk2single.BulkTrajBlend

Integrate bulk and single-cell information to infer transitional cell-state trajectories.

bulk2single.Bulk2Single

VAE-based bulk-to-single framework for reconstructing pseudo single cells from bulk RNA-seq.

bulk2single.Single2Spatial

Deep-learning mapper that projects single-cell profiles onto spatial coordinates.

Foundation Models (fm)

fm.run

Execute a foundation model task.

fm.list_models

List available single-cell foundation models.

fm.get_registry

Get the global model registry singleton.

fm.describe_model

Get detailed specification for a foundation model.

fm.select_model

Select the best foundation model for a task and dataset.

fm.preprocess_validate

Validate data compatibility with a model and suggest preprocessing.

fm.profile_data

Profile an AnnData file to detect species, gene scheme, and modality.

fm.interpret_results

Generate QA metrics and visualizations for model results.

fm.ModelSpec

Complete specification for a foundation model.

fm.ModelRegistry

Registry of available single-cell foundation models.

Plotting (pl)

Embedding & dimensionality

pl.embedding

Scatter plot for a user-specified embedding basis (e.g. umap, pca).

pl.embedding_celltype

Plot an embedding colored by cell type.

pl.embedding_density

Plot cluster-specific density on an existing embedding.

pl.embedding_multi

Create embedding scatter plots for multi-modal data (MuData) or single-cell data.

pl.embedding_atlas

Render large-scale embeddings with Datashader.

pl.pca

Plot PCA embedding.

pl.umap

Plot UMAP embedding.

pl.tsne

Plot t-SNE embedding.

Differential expression

pl.volcano

Create a volcano plot for differential expression analysis.

pl.marker_heatmap

Create a dot plot heatmap showing marker gene expression using PyComplexHeatmap.

pl.rank_genes_groups_dotplot

Create a dot plot from rank_genes_groups results.

pl.dotplot

Make a dot plot of the expression values of var_names.

pl.markers_dotplot

Dot plot of marker genes; a clean drop-in replacement for rank_genes_groups_dotplot().

Cell proportion & composition

pl.cellproportion

Plot cell proportion of each cell type in each visual cluster.

pl.cellstackarea

Plot the cell type percentage in each groupby category.

pl.venn

Create a Venn diagram to visualize set overlaps.

pl.bardotplot

Create a combined bar-and-dot summary plot by groups.

Distribution

pl.violin

Enhanced violin plot compatible with omicverse's interface.

pl.violin_box

pl.boxplot

Create a boxplot with jittered points to visualize data distribution across categories.

pl.plot_boxplots

Grouped boxplot visualization.

Spatial

pl.spatial

Scatter plot in spatial coordinates, aligned with scanpy.pl.spatial behavior.

pl.plot_spatial

Create spatial plot from Visium data with color gradient and interpolation.

pl.highlight_spatial_region

Mark a rectangular region on a spatial plot.

Cell communication

pl.cpdb_heatmap

Create a dot heatmap of CellPhoneDB interaction counts between cell types.

pl.cpdb_network

Create a circular network plot of CellPhoneDB cell-cell interactions.

pl.cpdb_chord

Create a chord diagram visualization of CellPhoneDB interactions.

pl.CellChatViz

Visualization helper for CellPhoneDB cell-cell communication outputs.

Colours & palettes

pl.palette_112

Built-in color palette, stored as a Python list.

pl.palette_28

Built-in color palette, stored as a Python list.

pl.sc_color

Built-in color palette, stored as a Python list.

pl.ForbiddenCity

Forbidden City traditional-color palette utility.

pl.optim_palette

Optimized palette for plotting.

pl.colormaps_palette

Returns a colormap palette.

Datasets

datasets.pbmc3k

Load PBMC 3k dataset from URL.

datasets.zebrafish

Zebrafish dataset from Saunders et al. (2019).

datasets.pancreatic_endocrinogenesis

Pancreatic endocrinogenesis.

datasets.dentate_gyrus

The Dentate Gyrus dataset used in https://github.com/velocyto-team/velocyto-notebooks/blob/master/python/DentateGyrus.ipynb.

datasets.create_mock_dataset

Create a mock single-cell dataset for testing statistical functions.

datasets.predefined_signatures

Dictionary of predefined gene signatures.