Developer guide¶
Note
To better understand the following guide, you may check out our publication first to learn about the general idea.
Below we describe the main components of the framework and how to extend the existing implementations.
Framework¶
The omicverse code is stored in the omicverse folder of the GitHub repository; its __init__.py
file takes care of importing the library functions.
The omicverse framework is primarily composed of the following components.
utils
: Utility functions, including data, plotting, etc.
pp
: Preprocessing, including quality control, normalization, etc.
bulk
: Analysis of bulk omics sequencing data such as RNA-seq or Proper-seq.
single
: Analysis of single-cell omics sequencing data such as scRNA-seq or scATAC-seq.
space
: Analysis of spatial RNA-seq data.
bulk2single
: Integration of bulk RNA-seq and single-cell RNA-seq data.
externel
: Additional related modules, including RNA-seq tools, bundled to avoid installation problems and dependency conflicts.
The __init__.py
file in each folder imports that folder's public functions. The functions themselves are written in files whose names start with an underscore (_*.py).
For Developer¶
Externel module¶
In most cases, writing a module function from scratch is difficult. Therefore, we introduced the externel
module. You can clone an entire package from GitHub and move its folder into the externel
folder. When doing so, check that the package's license allows this and that it does not conflict with OmicVerse's GPL license. You then need to modify the imports:
any package that is not a dependency of OmicVerse must be changed from a top-level import to a function-level import.
.
├── omicverse
├───── externel
├──────── STT
├─────────── __init__.py
├─────────── pl
├─────────── tl
Make sure that none of the imports introduces a conflict.
The following is an error because dgl is not included in the default requirements.txt of OmicVerse, so the top-level import fails for any user who has not installed it:
import dgl

def calculate():
    dgl.run()

The correct import is:

def calculate():
    import dgl
    dgl.run()
We recommend wrapping the import in try/except
to catch import errors and point the user to the correct installation page.
def calculate():
    try:
        import dgl
    except ImportError:
        raise ImportError(
            'Please install the dgl from https://www.dgl.ai/pages/start.html'
        )
    dgl.run()
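When an externel package has many functions that depend on the same optional library, the try/except pattern above can be factored into a small helper. The sketch below is only an illustration under stated assumptions: require_optional is a hypothetical name and not part of omicverse, and the caller supplies the installation URL.

```python
import importlib


def require_optional(module_name: str, install_url: str):
    """Import an optional dependency at call time (hypothetical helper).

    Returns the imported module, or raises ImportError with a message
    that points the user to the installation page.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"Please install {module_name} from {install_url}"
        ) from err


def calculate():
    # Function-level import: nothing is imported until calculate() is called.
    # "json" stands in for an optional package such as dgl so the sketch runs anywhere.
    json = require_optional("json", "https://docs.python.org/3/library/json.html")
    return json.dumps({"status": "ok"})


print(calculate())
```

Because the import happens inside the function, importing the package itself never fails; the error only appears when a user calls a function whose optional dependency is missing.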
Main module¶
If you want to open a pull request for omicverse, first be clear about which module your new functionality belongs to. For example, TOSICA
is an algorithm from the single-cell domain, so you need to add a _tosica.py
file inside the single
folder of omicverse
and add the line from ._tosica import pyTOSICA
to the __init__.py
of that folder, so that omicverse exposes the new functionality.
.
├── omicverse
├───── single
├──────── __init__.py
├──────── _tosica.py
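The layout above can be tried out on a throwaway package. The following sketch builds a small package in a temporary directory and imports it, to show how a private _*.py file is re-exported by the folder's __init__.py. All names in it (demo_omics, _demo.py, pyDemo) are hypothetical stand-ins and are not part of omicverse.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Write demo_omics/single/{__init__.py,_demo.py} under root (hypothetical names)."""
    single = root / "demo_omics" / "single"
    single.mkdir(parents=True)
    # Top-level __init__.py exposes the sub-module, like omicverse/__init__.py.
    (root / "demo_omics" / "__init__.py").write_text("from . import single\n")
    # The implementation lives in a private file that starts with an underscore.
    (single / "_demo.py").write_text(
        "class pyDemo:\n"
        "    def describe(self):\n"
        "        return 'implemented in _demo.py'\n"
    )
    # The folder's __init__.py re-exports the public entry point.
    (single / "__init__.py").write_text("from ._demo import pyDemo\n")


def load_demo_package():
    """Build the demo package in a temporary directory and import it."""
    tmp = tempfile.mkdtemp()
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    return importlib.import_module("demo_omics")


pkg = load_demo_package()
print(pkg.single.pyDemo().describe())
```

Users only ever touch the name exported by __init__.py, so the private file can be renamed or split later without changing the public interface.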
All functions require parameter descriptions in the following format:
import anndata

def preprocess(adata:anndata.AnnData, mode:str='scanpy', target_sum:int=50*1e4, n_HVGs:int=2000,
               organism:str='human', no_cc:bool=False)->anndata.AnnData:
    """
    Preprocesses the AnnData object adata using either a scanpy or a pearson residuals workflow for size normalization
    and highly variable genes (HVGs) selection, and calculates signature scores if necessary.

    Arguments:
        adata: The data matrix.
        mode: The mode for size normalization and HVGs selection. It can be either 'scanpy' or 'pearson'.
            If 'scanpy', performs size normalization using scanpy's normalize_total() function and selects HVGs
            using pegasus' highly_variable_features() function with batch correction. If 'pearson', selects HVGs
            using scanpy's experimental.pp.highly_variable_genes() function with pearson residuals method and performs
            size normalization using scanpy's experimental.pp.normalize_pearson_residuals() function.
        target_sum: The target total count after normalization.
        n_HVGs: The number of HVGs to select.
        organism: The organism of the data. It can be either 'human' or 'mouse'.
        no_cc: Whether to remove cc-correlated genes from HVGs.

    Returns:
        adata: The preprocessed data matrix.
    """
Pull request¶
- First fork omicverse, then git clone your fork from your own repository.
- When you have finished developing the new function, open a pull request and wait for it to be reviewed and merged.
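Before opening a pull request, it can help to check that every parameter of a new function is described in its docstring, following the Arguments/Returns format shown in the Main module section. The sketch below is one possible check, not an official omicverse tool: undocumented_params and the example function scale are hypothetical, and the parsing assumes each entry starts a line as "name: description".

```python
import inspect
import re


def undocumented_params(func):
    """Return the parameter names of func with no entry under 'Arguments:'."""
    doc = inspect.getdoc(func) or ""
    # Grab the text between the "Arguments:" and "Returns:" headers.
    match = re.search(
        r"Arguments:[ \t]*\n(.*?)(?:\n[ \t]*Returns:|\Z)", doc, flags=re.S
    )
    section = match.group(1) if match else ""
    # An entry is a line that starts with "<name>:".
    documented = set(re.findall(r"^[ \t]*(\w+)[ \t]*:", section, flags=re.M))
    return [
        name for name in inspect.signature(func).parameters
        if name not in documented
    ]


def scale(values: list, factor: float = 2.0, clip: bool = False) -> list:
    """
    Multiplies every value by a constant factor.

    Arguments:
        values: The numbers to scale.
        factor: The multiplier applied to each value.

    Returns:
        scaled: The scaled numbers.
    """
    scaled = [v * factor for v in values]
    return [min(v, 1.0) for v in scaled] if clip else scaled


# "clip" is in the signature but missing from the docstring.
print(undocumented_params(scale))  # → ['clip']
```

A continuation line that happens to start with a word followed by a colon would be counted as an entry, so treat the result as a hint rather than a strict validation.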