User API

Import OmicVerse as:

import omicverse as ov

Data IO

`read`	Read common omics file formats into AnnData or pandas DataFrame.
`io.read_h5ad`	Read an `.h5ad` file.
`io.read_10x_h5`	Read a 10x Genomics HDF5 matrix file.
`io.read_10x_mtx`	Read a 10x Genomics Matrix Market directory.
`io.read_nanostring`	Read Nanostring formatted dataset.
`io.read_visium_hd`	Read 10x Visium HD outputs with a single entry point.

Preprocessing (pp)

Quality control & filtering

`pp.qc`	Perform quality control on a dictionary of AnnData objects.
`pp.filter_cells`	Filter cell outliers based on counts and numbers of genes expressed.
`pp.filter_genes`	Filter genes based on number of cells or counts.
`pp.scrublet`	Predict cell doublets using Scrublet with optional GPU acceleration.
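
The filtering entries above describe thresholding on per-cell counts, per-cell expressed genes, and per-gene cell coverage. As a minimal conceptual sketch (not OmicVerse's implementation), the same idea can be written in plain Python. The helper names, parameter names, and toy matrix below are assumptions made for illustration only.

```python
# Conceptual sketch only: a tiny dense count matrix (cells x genes).
# The helpers and thresholds here are hypothetical, not OmicVerse's API.
counts = [
    [5, 0, 3, 0],  # cell 0
    [0, 0, 1, 0],  # cell 1
    [2, 4, 0, 0],  # cell 2
]


def filter_cells(matrix, min_counts=0, min_genes=0):
    """Return indices of cells with enough total counts and expressed genes."""
    keep = []
    for i, row in enumerate(matrix):
        total = sum(row)
        n_genes = sum(1 for value in row if value > 0)
        if total >= min_counts and n_genes >= min_genes:
            keep.append(i)
    return keep


def filter_genes(matrix, min_cells=0):
    """Return indices of genes detected in at least `min_cells` cells."""
    keep = []
    for j in range(len(matrix[0])):
        n_cells = sum(1 for row in matrix if row[j] > 0)
        if n_cells >= min_cells:
            keep.append(j)
    return keep


kept_cells = filter_cells(counts, min_counts=2, min_genes=2)
kept_genes = filter_genes(counts, min_cells=1)
print(kept_cells, kept_genes)
```

In practice the thresholds depend on the assay and tissue, so the values used here carry no recommendation.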

Normalisation & feature selection

`pp.normalize_total`	Normalize counts per cell.
`pp.log1p`	Log-transform expression values with `log(1 + x)`.
`pp.highly_variable_genes`	Run HVG detection and write flags/statistics into `adata.var`.
`pp.highly_variable_features`	Select highly variable features (HVF/HVG) for downstream modeling.
`pp.normalize_pearson_residuals`	Normalize count matrix using Pearson residuals.
`pp.recover_counts`	Given log-normalized gene expression data, recover the raw read/UMI counts by inferring the unknown size factors.
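
To make the first two entries above concrete, here is a small sketch of what "normalize counts per cell" followed by `log(1 + x)` means numerically. It is written in plain Python under the assumption of a dense cells-by-genes matrix. The function names mirror the table for readability, but the code and the `target_sum` parameter are illustrative and are not taken from OmicVerse.

```python
import math


def normalize_total(matrix, target_sum=10_000.0):
    """Rescale each cell (row) so that its counts sum to `target_sum`."""
    normalized = []
    for row in matrix:
        total = sum(row)
        # Cells with zero counts are left as all zeros to avoid dividing by zero.
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([value * scale for value in row])
    return normalized


def log1p(matrix):
    """Apply log(1 + x) element-wise."""
    return [[math.log1p(value) for value in row] for row in matrix]


# Two cells with the same composition but a 10x difference in depth,
# plus one empty cell.
counts = [[1, 3], [10, 30], [0, 0]]
normalized = normalize_total(counts, target_sum=100.0)
logged = log1p(normalized)
print(normalized)
```

After normalization the first two cells become identical, which is the point of the step: differences in sequencing depth no longer dominate the comparison.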

Dimensionality reduction & graph

`pp.pca`	Performs Principal Component Analysis (PCA) on the data stored in a scanpy AnnData object.
`pp.neighbors`	Compute a neighborhood graph of observations [McInnes18].
`pp.umap`	Run UMAP on AnnData, choosing the implementation based on `settings.mode`. The arguments are documented in `scanpy.tl.umap`.
`pp.tsne`	Compute t-SNE coordinates for cells.
`pp.mde`	Run MDE (Minimum Distortion Embedding) from a latent representation.

Clustering

`pp.leiden`	Run Leiden clustering.
`pp.louvain`	Run Louvain clustering on the precomputed kNN graph.

Batch correction & scaling

`pp.scale`	Scale the input AnnData object.
`pp.regress`	Regress out technical covariates from each gene.
`pp.regress_and_scale`	Scale the regressed layer and store it as a new analysis layer.
`pp.remove_cc_genes`	Remove cell-cycle-correlated genes from `highly_variable_features`.
`pp.score_genes_cell_cycle`	Score cell cycle phases using predefined or custom gene sets.

Single-cell (single)

Annotation

`single.pySCSA`	Automated cell-type annotation using SCSA marker-enrichment scoring.
`single.MetaTiME`	MetaTiME wrapper for tumor microenvironment cell-state annotation.
`single.CellVote`	Ensemble cell-type annotation manager with multiple backends.
`single.gptcelltype`	Annotate cluster cell types with a remote LLM service.
`single.gptcelltype_local`	Annotate cell types with a local instruction-tuned LLM.
`single.CellOntologyMapper`	Cell ontology mapping class using NLP.
`single.Annotation`	Unified single-cell annotation manager for cell-type labeling.
`single.AnnotationRef`	Reference-based label transfer helper for single-cell annotation.

Trajectory & cell fate

`single.TrajInfer`	Trajectory inference class for single-cell data analysis.
`single.Velo`	RNA velocity analysis wrapper for directional cell-state transition inference.
`single.Fate`	Adaptive ridge-regression framework for pseudotime-associated gene discovery.
`single.cytotrace2`	Predict developmental potency with CytoTRACE2.

Cell structure

`single.MetaCell`	SEACells-based metacell construction workflow.
`single.DEG`	Differential gene-expression testing wrapper for single-cell datasets.
`single.SCENIC`
`single.aucell`	Calculate gene signature enrichment scores using AUCell algorithm.
`single.geneset_aucell`	Calculate the AUC-ell score for a given gene set.
`single.cellphonedb_v5`	Run CellPhoneDB statistical analysis with proper file handling.
`single.Drug_Response`	Predict drug sensitivity from single-cell transcriptomes using CaDRReS models.

Batch correction & integration

`single.Batch`	Run MultiMAP to correct batch effect within a single AnnData object.
`single.pySIMBA`	SIMBA wrapper for single-cell batch integration and graph-embedding construction.
`single.Integration`	Run MultiMAP to integrate a number of AnnData objects from various multi-omics experiments into a single joint dimensionally reduced space.

Multi-omics

`single.pyMOFA`	Train MOFA models for latent factor discovery across multiple omics layers.
`single.pyMOFAART`	Load pretrained MOFA models for downstream factor interpretation.
`single.GLUE_pair`	Pair RNA and ATAC cells using GLUE latent embeddings and neighbor matching.
`single.pyTOSICA`	TOSICA wrapper for pathway-informed transformer-based cell-type annotation.

Topic modelling

`single.cNMF`	Consensus NMF workflow wrapper for robust gene-program discovery.

Bulk RNA-seq (bulk)

`bulk.pyDEG`	Differential-expression analysis helper for bulk RNA-seq count tables.
`bulk.pyGSEA`	Gene Set Enrichment Analysis (GSEA) wrapper for ranked gene lists.
`bulk.pyPPI`	Protein-protein interaction (PPI) analysis wrapper based on STRING.
`bulk.pyTCGA`	TCGA (The Cancer Genome Atlas) data analysis module.
`bulk.Deconvolution`	Bulk RNA-seq deconvolution class for inferring cell-type fractions from single-cell references.
`bulk.Matrix_ID_mapping`	Map gene IDs in the input data to gene symbols using a reference table.
`bulk.batch_correction`	Perform batch effect correction using ComBat algorithm.
`bulk.geneset_enrichment`	Perform pathway enrichment analysis using Enrichr-compatible gene-set libraries.
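
Pathway enrichment of a gene list is commonly assessed with an over-representation test. The sketch below shows one standard formulation, the upper-tail hypergeometric probability, in plain Python. It is an illustration of the statistic only: whether `bulk.geneset_enrichment` uses exactly this test is not stated here, and the function and argument names are hypothetical.

```python
from math import comb


def hypergeom_pvalue(n_universe, n_pathway, n_query, n_overlap):
    """P(X >= n_overlap) when n_query genes are drawn without replacement
    from a universe of n_universe genes, n_pathway of which are in the pathway."""
    denominator = comb(n_universe, n_query)
    upper = min(n_pathway, n_query)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / denominator


def enrichment_pvalue(query_genes, pathway_genes, universe_genes):
    """Set-based wrapper: restrict both sets to the universe, then test the overlap."""
    universe = set(universe_genes)
    query = set(query_genes) & universe
    pathway = set(pathway_genes) & universe
    return hypergeom_pvalue(len(universe), len(pathway), len(query), len(query & pathway))


# Toy numbers: 20 genes in the universe, 5 in the pathway,
# 4 in the query list, 3 of which fall in the pathway.
p_value = hypergeom_pvalue(20, 5, 4, 3)
print(p_value)
```

When many gene sets are tested at once, the resulting p-values would normally be adjusted for multiple testing; that step is omitted here.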

Spatial transcriptomics (space)

`space.clusters`	Perform clustering analysis on spatial transcriptomics data using multiple methods.
`space.Deconvolution`	Spatial deconvolution pipeline that aligns scRNA-seq references with spatial transcriptomics.
`space.pySTAGATE`	A class representing the PyTorch implementation of STAGATE (Spatial Transcriptomics Analysis using Graph Attention autoEncoder).
`space.pySTAligner`	STAligner for spatial transcriptomics data integration.
`space.pySpaceFlow`	SpaceFlow spatial flow analysis class.
`space.Tangram`	Tangram spatial deconvolution class for cell type mapping.
`space.STT`	Spatial Transition Tensor (STT) analysis class.
`space.GASTON`	GASTON spatial depth estimation and clustering.
`space.Cal_Spatial_Net`	Construct spatial neighbor networks for spatial integration.
`space.spatial_neighbors`	Build a spatial neighborhood graph from coordinates stored in `adata.obsm`.
`space.moranI`	Compute Moran's I spatial autocorrelation for gene expression.
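
For readers unfamiliar with the statistic behind the last entry above, Moran's I for values x_i with spatial weights w_ij is I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all weights. The following plain-Python sketch computes it on a toy one-dimensional chain of spots. The helper names and the weight layout are assumptions for illustration and do not reflect how `space.moranI` is implemented or called.

```python
def morans_i(values, weights):
    """Moran's I for `values` given `weights` as a dict {(i, j): w_ij}."""
    n = len(values)
    mean = sum(values) / n
    deviations = [value - mean for value in values]
    total_weight = sum(weights.values())
    cross = sum(w * deviations[i] * deviations[j] for (i, j), w in weights.items())
    variance_sum = sum(d * d for d in deviations)
    return (n / total_weight) * (cross / variance_sum)


def chain_weights(n):
    """Binary, symmetric weights linking each spot to its neighbours on a line."""
    weights = {}
    for i in range(n - 1):
        weights[(i, i + 1)] = 1.0
        weights[(i + 1, i)] = 1.0
    return weights


weights = chain_weights(4)
clustered = morans_i([1.0, 1.0, 0.0, 0.0], weights)    # similar values sit together
alternating = morans_i([1.0, 0.0, 1.0, 0.0], weights)  # neighbours always differ
print(clustered, alternating)
```

Positive values indicate that neighbouring spots tend to have similar expression, while negative values indicate a checkerboard-like pattern.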

Bulk-to-Single (bulk2single)

`bulk2single.BulkTrajBlend`	Integrate bulk and single-cell information to infer transitional cell-state trajectories.
`bulk2single.Bulk2Single`	VAE-based bulk-to-single framework for reconstructing pseudo single cells from bulk RNA-seq.
`bulk2single.Single2Spatial`	Deep-learning mapper that projects single-cell profiles onto spatial coordinates.

Foundation Models (fm)

`fm.run`	Execute a foundation model task.
`fm.list_models`	List available single-cell foundation models.
`fm.get_registry`	Get the global model registry singleton.
`fm.describe_model`	Get detailed specification for a foundation model.
`fm.select_model`	Select the best foundation model for a task and dataset.
`fm.preprocess_validate`	Validate data compatibility with a model and suggest preprocessing.
`fm.profile_data`	Profile an AnnData file to detect species, gene scheme, and modality.
`fm.interpret_results`	Generate QA metrics and visualizations for model results.
`fm.ModelSpec`	Complete specification for a foundation model.
`fm.ModelRegistry`	Registry of available single-cell foundation models.

Plotting (pl)

Embedding & dimensionality

`pl.embedding`	Scatter plot for user specified embedding basis (e.g. umap, pca, etc).
`pl.embedding_celltype`	Plot an embedding colored by cell type.
`pl.embedding_density`	Plot cluster-specific density on an existing embedding.
`pl.embedding_multi`	Create embedding scatter plots for multi-modal data (MuData) or single-cell data.
`pl.embedding_atlas`	Render large-scale embeddings with Datashader.
`pl.pca`	Plot PCA embedding.
`pl.umap`	Plot UMAP embedding.
`pl.tsne`	Plot t-SNE embedding.

Differential expression

`pl.volcano`	Create a volcano plot for differential expression analysis.
`pl.marker_heatmap`	Create a dot plot heatmap showing marker gene expression using PyComplexHeatmap.
`pl.rank_genes_groups_dotplot`	Create a dot plot from rank_genes_groups results.
`pl.dotplot`	Make a dot plot of the expression values of var_names.
`pl.markers_dotplot`	Dot plot of marker genes — clean drop-in for `rank_genes_groups_dotplot()`.
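
A volcano plot places each gene at (log2 fold change, -log10 p-value) and usually highlights genes that pass both an effect-size and a significance cutoff. The sketch below shows only that bookkeeping in plain Python. The cutoffs, gene names, and numbers are made up for illustration, and nothing here is taken from the signature of `pl.volcano`.

```python
import math


def classify_gene(log2fc, pvalue, fc_threshold=1.0, p_threshold=0.05):
    """Label a gene as 'up', 'down', or 'ns' (not significant)."""
    if pvalue < p_threshold and log2fc >= fc_threshold:
        return "up"
    if pvalue < p_threshold and log2fc <= -fc_threshold:
        return "down"
    return "ns"


# Hypothetical differential-expression results: gene -> (log2FC, p-value).
results = {
    "GeneA": (2.5, 0.001),
    "GeneB": (-1.8, 0.01),
    "GeneC": (0.3, 0.2),
    "GeneD": (3.0, 0.3),
}

labels = {gene: classify_gene(fc, p) for gene, (fc, p) in results.items()}
# y-axis coordinate of each gene on the volcano plot.
neg_log10_p = {gene: -math.log10(p) for gene, (_, p) in results.items()}
print(labels)
```

GeneD illustrates why both cutoffs matter: its fold change is large, but the p-value does not support calling it differentially expressed.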

Cell proportion & composition

`pl.cellproportion`	Plot cell proportion of each cell type in each visual cluster.
`pl.cellstackarea`	Plot the cell type percentage in each groupby category.
`pl.venn`	Create a Venn diagram to visualize set overlaps.
`pl.bardotplot`	Create a combined bar-and-dot summary plot by groups.
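
The quantity behind composition plots such as a cell-proportion bar chart is the fraction of each cell type within each group. A minimal sketch of that computation in plain Python follows. The function name and the toy labels are hypothetical, and the plotting functions listed above may compute or normalise this differently.

```python
from collections import Counter, defaultdict


def cell_proportions(groups, cell_types):
    """Return {group: {cell_type: fraction of cells in that group}}."""
    counts = defaultdict(Counter)
    for group, cell_type in zip(groups, cell_types):
        counts[group][cell_type] += 1
    return {
        group: {ct: n / sum(counter.values()) for ct, n in counter.items()}
        for group, counter in counts.items()
    }


# Toy per-cell annotations: one group label and one cell-type label per cell.
groups = ["s1", "s1", "s1", "s1", "s2", "s2"]
cell_types = ["T", "T", "B", "T", "B", "B"]
proportions = cell_proportions(groups, cell_types)
print(proportions)
```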

Distribution

`pl.violin`	Enhanced violin plot compatible with omicverse's interface.
`pl.violin_box`
`pl.boxplot`	Create a boxplot with jittered points to visualize data distribution across categories.
`pl.plot_boxplots`	Grouped boxplot visualization.

Spatial

`pl.spatial`	Scatter plot in spatial coordinates, aligned with `scanpy.pl.spatial` behavior.
`pl.plot_spatial`	Create spatial plot from Visium data with color gradient and interpolation.
`pl.highlight_spatial_region`	Mark a rectangular region on a spatial plot.

Cell communication

`pl.cpdb_heatmap`	Create a dot heatmap of CellPhoneDB interaction counts between cell types.
`pl.cpdb_network`	Create a circular network plot of CellPhoneDB cell-cell interactions.
`pl.cpdb_chord`	Create a chord diagram visualization of CellPhoneDB interactions.
`pl.CellChatViz`	Visualization helper for CellPhoneDB cell-cell communication outputs.

Colours & palettes

`pl.palette_112`	Predefined color palette, stored as a Python list.
`pl.palette_28`	Predefined color palette, stored as a Python list.
`pl.sc_color`	Predefined color palette, stored as a Python list.
`pl.ForbiddenCity`	Forbidden City traditional-color palette utility.
`pl.optim_palette`	Optimized palette for plotting.
`pl.colormaps_palette`	Returns a colormap palette.
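
A list-style palette is typically consumed by pairing its entries with the categories to be plotted. The sketch below shows one simple way to do that in plain Python, cycling through the colors when there are more categories than colors. The helper name and the two-color list are assumptions for illustration; the hex codes are placeholders and not the contents of any OmicVerse palette.

```python
from itertools import cycle


def assign_colors(categories, palette):
    """Map each distinct category (sorted) to a color, reusing colors if needed."""
    unique = sorted(set(categories))
    return dict(zip(unique, cycle(palette)))


# Placeholder palette with fewer colors than categories.
palette = ["#1f77b4", "#ff7f0e"]
color_map = assign_colors(["T", "B", "NK", "B"], palette)
print(color_map)
```

Sorting the categories first keeps the mapping stable between runs, so the same cell type receives the same color in every figure.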

Datasets

`datasets.pbmc3k`	Load PBMC 3k dataset from URL.
`datasets.zebrafish`	Zebrafish dataset from Saunders et al. (2019).
`datasets.pancreatic_endocrinogenesis`	Pancreatic endocrinogenesis.
`datasets.dentate_gyrus`	The Dentate Gyrus dataset used in https://github.com/velocyto-team/velocyto-notebooks/blob/master/python/DentateGyrus.ipynb.
`datasets.create_mock_dataset`	Create a mock single-cell dataset for testing statistical functions.
`datasets.predefined_signatures`	Dictionary of predefined signatures.