{ "cells": [ { "cell_type": "markdown", "id": "6c84b310", "metadata": {}, "source": [ "# Using bulk RNA-seq to generate 'interrupted' cells to interpolate scRNA-seq\n", "\n", "The limited number of cells available for single-cell sequencing has led to 'interruptions' in the study of cell development and differentiation trajectories. In contrast, bulk RNA-seq of whole tissues contains, in principle, these 'interrupted' cells. To our knowledge, there is no algorithm for extracting 'interrupted' cells from bulk RNA-seq, and there is a lack of tools that effectively bridge the gap between bulk RNA-seq and single-cell RNA-seq analyses.\n", "\n", "We developed BulkTrajBlend in OmicVerse, which is specifically designed to address cell continuity in single-cell sequencing. BulkTrajBlend first deconvolves single-cell data from bulk RNA-seq and then uses a GNN-based overlapping community discovery algorithm to identify contiguous cells in the generated single-cell data.\n", "\n", "Colab_Reproducibility: https://colab.research.google.com/drive/1HulVXQIlUEcpGRDZo4MxcHYOjnVhuCC-?usp=sharing\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "fcf0728b-7bfd-473f-806b-642b164c00f3", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", " ____ _ _ __ \n", " / __ \____ ___ (_)___| | / /__ _____________ \n", " / / / / __ `__ \/ / ___/ | / / _ \/ ___/ ___/ _ \ \n", "/ /_/ / / / / / / / /__ | |/ / __/ / (__ ) __/ \n", "\____/_/ /_/ /_/_/\___/ |___/\___/_/ /____/\___/ \n", "\n", "Version: 1.5.6, Tutorials: https://omicverse.readthedocs.io/\n" ] } ], "source": [ "import omicverse as ov\n", "from omicverse.utils import mde\n", "import scanpy as sc\n", "import scvelo as scv\n", "ov.plot_set()\n" ] }, { "cell_type": "markdown", "id": "308eb6bc", "metadata": {}, "source": [ "## Loading data\n", "\n", "For illustration, we use data from dentate gyrus neurogenesis, which comprises multiple heterogeneous subpopulations.\n", "\n", 
"We utilized single-cell RNA-seq data (GEO accession: GSE95753) obtained from the dentate gyrus of the hippocampus in rats, along with bulk RNA-seq data (GEO accession: GSE74985). " ] }, { "cell_type": "code", "execution_count": 2, "id": "08829c90-d988-4c9e-bf76-31c7f38b6d9a", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "AnnData object with n_obs × n_vars = 2930 × 13913\n", " obs: 'clusters', 'age(days)', 'clusters_enlarged'\n", " uns: 'clusters_colors'\n", " obsm: 'X_umap'\n", " layers: 'ambiguous', 'spliced', 'unspliced'" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "adata=scv.datasets.dentategyrus()\n", "adata" ] }, { "cell_type": "code", "execution_count": 3, "id": "56db2ac5-20a0-49bf-9ae6-e4acf45dc611", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
| | dg_d_1 | dg_d_2 | dg_d_3 | dg_v_1 | dg_v_2 | dg_v_3 | ca4_1 | ca4_2 | ca4_3 | ca3_d_1 | ... | ca3_v_3 | ca2_1 | ca2_2 | ca2_3 | ca1_d_1 | ca1_d_2 | ca1_d_3 | ca1_v_1 | ca1_v_2 | ca1_v_3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Adat1 | 70 | 46 | 49 | 150 | 150 | 99 | 164 | 33 | 29 | 76 | ... | 64 | 87 | 86 | 21 | 42 | 143 | 23 | 26 | 10 | 23 |
| Gm12094 | 0 | 103 | 0 | 21 | 5 | 2 | 0 | 5 | 0 | 0 | ... | 0 | 10 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |
| Olfr203 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Mageb5b | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Top2a | 0 | 0 | 5 | 0 | 19 | 0 | 0 | 18 | 1 | 0 | ... | 0 | 37 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 |
5 rows × 24 columns
| | nocd_Cck-Tox | nocd_Microglia | nocd_OPC | nocd_Astrocytes | nocd_Mossy | nocd_OL | nocd_Cajal Retzius | nocd_Endothelial_1 | nocd_Granule immature | nocd_Neuroblast | nocd_Cck-Tox_1 | nocd_Endothelial | nocd_GABA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C_1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| C_2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| C_3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| C_4 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| C_5 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |