{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Identify the driver regulators of cell fate decisions\n", "CEFCON is a computational tool for deciphering driver regulators of cell fate decisions from single-cell RNA-seq data. It takes a prior gene interaction network and expression profiles from scRNA-seq data associated with a given developmental trajectory as inputs, and consists of three main components, including cell-lineage-specific gene regulatory network (GRN) construction, driver regulator identification and regulon-like gene module (RGM) identification.\n", "\n", "Check out [(Wang et al., Nature Communications, 2023)](https://www.nature.com/articles/s41467-023-44103-3) for the detailed methods and applications.\n", "\n", "Code: [https://github.com/WPZgithub/CEFCON](https://github.com/WPZgithub/CEFCON)\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", " ____ _ _ __ \n", " / __ \\____ ___ (_)___| | / /__ _____________ \n", " / / / / __ `__ \\/ / ___/ | / / _ \\/ ___/ ___/ _ \\ \n", "/ /_/ / / / / / / / /__ | |/ / __/ / (__ ) __/ \n", "\\____/_/ /_/ /_/_/\\___/ |___/\\___/_/ /____/\\___/ \n", "\n", "Version: 1.5.6, Tutorials: https://omicverse.readthedocs.io/\n" ] } ], "source": [ "import omicverse as ov\n", "#print(f\"omicverse version: {ov.__version__}\")\n", "import scanpy as sc\n", "#print(f\"scanpy version: {sc.__version__}\")\n", "import pandas as pd\n", "from tqdm.auto import tqdm\n", "ov.plot_set()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Data loading and processing\n", "Here, we use the mouse hematopoiesis data provided by [Nestorowa et al. 
(2016, Blood).](https://doi.org/10.1182/blood-2016-05-716480)\n", "\n", "**The scRNA-seq data requires processing to extract lineage information for the CEFCON analysis.** Please refer to the [original notebook](https://github.com/WPZgithub/CEFCON/blob/e74d2d248b88fb3349023d1a97d3cc8a52cc4060/notebooks/preprocessing_nestorowa16_data.ipynb) for detailed instructions on preprocessing scRNA-seq data." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Load mouse_hsc_nestorowa16_v0.h5ad\n" ] }, { "data": { "text/plain": [ "AnnData object with n_obs × n_vars = 1645 × 3000\n", " obs: 'E_pseudotime', 'GM_pseudotime', 'L_pseudotime', 'label_info', 'n_genes', 'leiden', 'cell_type_roughly', 'cell_type_finely'\n", " var: 'n_cells', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'E_pseudotime_logFC', 'GM_pseudotime_logFC', 'L_pseudotime_logFC'\n", " uns: 'cell_type_finely_colors', 'cell_type_roughly_colors', 'draw_graph', 'hvg', 'leiden', 'leiden_colors', 'lineages', 'neighbors', 'pca', 'tsne', 'umap'\n", " obsm: 'X_draw_graph_fa', 'X_pca'\n", " varm: 'PCs'\n", " layers: 'raw_count'\n", " obsp: 'connectivities', 'distances'" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "adata = ov.single.mouse_hsc_nestorowa16()\n", "adata" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "CEFCON exploits an available global, **context-free gene interaction network** as prior knowledge, from which it extracts the cell-lineage-specific gene interactions according to the gene expression profiles derived from scRNA-seq data associated with a given developmental trajectory. \n", "\n", "You can download the prior network from [Zenodo](https://zenodo.org/records/8013900). **CEFCON only provides prior networks for human and mouse data analysis**. 
For other species, you must provide the prior network manually.\n", "\n", "The CEFCON authors provide several prior networks; of these, 'nichenet' yields the best results." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Load the prior gene interaction network: nichenet. #Genes: 25345, #Edges: 5290993\n" ] } ], "source": [ "prior_network = ov.single.load_human_prior_interaction_network(dataset='nichenet') " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**For human scRNA-seq data, do not run the next step. It converts the gene symbols of the prior network to mouse gene symbols, which would result in errors.**\n", "\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Convert genes of the prior interaction network to mouse gene symbols:\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "8a71aaf3892d4640872a7e28d51d604e", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Processing: 0%| | 0/10 [00:00, ?it/s]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Server 'http://asia.ensembl.org/biomart/' is OK\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Converting ambiguous gene symbols: 0%| | 0/202510 [00:00, ?it/s]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "The converted prior gene interaction network: #Genes: 18579, #Edges: 5029532\n" ] }, { "data": { "text/html": [ "
|   | from | to |
|---|---|---|
| 0 | Klf2 | Dlgap1 |
| 2 | Klf2 | Bhlhe40 |
| 3 | Klf2 | Rps6ka1 |
| 4 | Klf2 | Pxn |
| 5 | Klf2 | Ube2v1 |
| ... | ... | ... |
| 837982 | Zranb1 | Zfp141 |
| 837983 | Zranb1 | Zfy1 |
| 837984 | Zranb1 | Zfy2 |
| 837987 | Zscan21 | Zfy1 |
| 837988 | Zscan21 | Zfy2 |

5029532 rows × 2 columns
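
The converted prior network is an edge list with a `from` and a `to` column. Before passing it to CEFCON, it can be useful to check how many targets each source gene has and how many genes the network covers. The following is a minimal sketch that assumes each row is one directed edge; it runs on a toy frame built by hand from the first rows shown above, not on the full network, and the names `edges`, `out_degree` and `genes` are illustrative rather than part of omicverse.

```python
import pandas as pd

# Toy edge list copied from the head of the table above.
# The real converted network has 5,029,532 edges; `edges` is a hypothetical name.
edges = pd.DataFrame(
    {
        "from": ["Klf2", "Klf2", "Klf2", "Klf2", "Klf2"],
        "to": ["Dlgap1", "Bhlhe40", "Rps6ka1", "Pxn", "Ube2v1"],
    },
    index=[0, 2, 3, 4, 5],
)

# Number of distinct targets per source gene (out-degree, assuming directed edges).
out_degree = edges.groupby("from")["to"].nunique().sort_values(ascending=False)

# Every gene that appears at either end of an edge.
genes = set(edges["from"]) | set(edges["to"])

print(out_degree)
print(f"#Genes: {len(genes)}, #Edges: {len(edges)}")
```

On the real data, the same two lines would be applied to the `prior_network` frame instead of the toy `edges` frame.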

|   | influence_score | is_driver_regulator | is_MFVS_driver | is_MDS_driver | is_TF |
|---|---|---|---|---|---|
| JUN | 7.352254 | True | True | True | True | True |
| GATA1 | 7.071392 | True | True | True | True | True |
| FOS | 6.930125 | True | True | True | True | True |
| GATA2 | 6.683559 | True | False | True | True | True |
| MEIS1 | 5.851068 | True | True | True | True | True |
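
The result table flags each gene with boolean columns next to its influence score. As a hedged illustration of how such a table could be narrowed down, the sketch below keeps transcription factors that are marked as driver regulators, ranks them by `influence_score`, and then selects the genes flagged by both the MFVS and the MDS criterion. The frame is rebuilt by hand from the first four boolean columns of the five rows shown above; the variable names `results`, `drivers` and `consensus` are assumptions made for this example, and the way the table is obtained from CEFCON is not shown here.

```python
import pandas as pd

# Hand-built copy of the rows shown above; `results` is a hypothetical name.
results = pd.DataFrame(
    {
        "influence_score": [7.352254, 7.071392, 6.930125, 6.683559, 5.851068],
        "is_driver_regulator": [True, True, True, True, True],
        "is_MFVS_driver": [True, True, True, False, True],
        "is_MDS_driver": [True, True, True, True, True],
        "is_TF": [True, True, True, True, True],
    },
    index=["JUN", "GATA1", "FOS", "GATA2", "MEIS1"],
)

# Keep transcription factors flagged as driver regulators, highest score first.
drivers = results[results["is_driver_regulator"] & results["is_TF"]].sort_values(
    "influence_score", ascending=False
)

# Genes supported by both driver criteria (MFVS and MDS).
consensus = drivers[drivers["is_MFVS_driver"] & drivers["is_MDS_driver"]]

print(consensus.index.tolist())
```

In this toy subset, GATA2 is dropped from `consensus` because its `is_MFVS_driver` flag is `False`, even though it ranks fourth by influence score.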