General Statistics (Overview)
Total Cells: 9,798
Total Genes: 14,589
Highly Variable Genes: 2,000
Median Genes/Cell: 0
Median UMIs/Cell: 0
Analysis Steps: 7/7
📋 Dataset Summary: This single-cell RNA-seq dataset contains
9,798 cells and 14,589 genes.
After quality control and feature selection, 2,000 highly variable genes
(13.7% of total) were identified for downstream analysis.
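The 13.7% figure follows directly from the two counts in the summary. A minimal check in Python (the variable names are illustrative and not part of the report):

```python
# Counts taken from the dataset summary above.
total_genes = 14_589
n_hvgs = 2_000

# Share of genes retained as highly variable, as a percentage.
hvg_percent = 100 * n_hvgs / total_genes
print(f"{hvg_percent:.1f}% of genes selected as HVGs")
```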
Gene Expression Analysis (Feature Selection)
Gene Expression Overview and HVG Selection
✅ Feature Selection Results: 2,000 highly variable genes were selected
from 14,589 total genes (13.7%). These features will be used
for dimensionality reduction and downstream analyses.
Principal Component Analysis (Dimensionality Reduction)
PCA Results and Variance Explained
🔧 PCA Parameters
• Number of components: 50
• Data layer: scaled
• Use highly variable genes: True
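The "variance explained" reported for a PCA is usually the share of total variance carried by each component. The sketch below uses made-up per-component variances (the report does not list the actual values) purely to illustrate the calculation; in a real run the denominator would be the total variance of the scaled data, not only the retained components.

```python
from itertools import accumulate

# Hypothetical per-component variances (eigenvalues); NOT values from this report.
component_variances = [12.0, 6.0, 3.0, 2.0, 1.0]

# Simplification: treat the listed components as carrying all of the variance.
total = sum(component_variances)

# Fraction of total variance explained by each component.
explained_ratio = [v / total for v in component_variances]

# Running total, useful when deciding how many components to keep.
cumulative_ratio = list(accumulate(explained_ratio))

for i, (ratio, cumulative) in enumerate(zip(explained_ratio, cumulative_ratio), start=1):
    print(f"PC{i}: {ratio:.3f} (cumulative {cumulative:.3f})")
```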
Batch Effect Correction (Integration)
Batch Correction Comparison: Before and After Integration
🔄 Integration Methods Applied: Two batch correction methods, Harmony and scVI, were run.
X_scVI was automatically selected as the integration method;
detailed benchmarking metrics are not available.
🔧 Integration Parameters
• Harmony PCs: 50
• scVI latent dimensions: 30
• scVI layers: 2
• Best method: X_scVI
Cell Clustering (7 Clusters)
Final Clustering Results
🎯 Clustering Summary: Automated clustering identified
7 distinct cell clusters using the SCCAF algorithm
with Leiden clustering. Results are visualized using MDE (Minimum Distortion Embedding).
Cell Cycle Analysis (Phase Distribution)
Cell Cycle Phase Distribution and Scores
Cell Cycle Phase | Cell Count | Percentage | Status |
---|---|---|---|
G1 | 4,813 | 49.1% | ✅ Normal |
S | 2,792 | 28.5% | ✅ Normal |
G2M | 2,193 | 22.4% | ✅ Normal |
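Phase labels of this kind are commonly derived from two per-cell scores, one for the S gene set and one for the G2M gene set. The sketch below assumes the Scanpy-style rule (G1 when both scores are negative, otherwise the phase with the higher score); whether this pipeline applies exactly that rule is an assumption, and the example scores are invented. The percentage column is then recomputed from the cell counts in the table.

```python
def assign_phase(s_score: float, g2m_score: float) -> str:
    """Assumed Scanpy-style rule for calling a cell cycle phase."""
    if s_score < 0 and g2m_score < 0:
        return "G1"  # neither the S nor the G2M programme is active
    # Otherwise pick the stronger programme; ties fall back to S.
    return "G2M" if g2m_score > s_score else "S"

# Invented (s_score, g2m_score) pairs for three cells; not data from the report.
example_cells = [(-0.2, -0.1), (0.4, 0.1), (0.05, 0.3)]
phases = [assign_phase(s, g) for s, g in example_cells]
print(phases)

# Cell counts copied from the table above.
phase_counts = {"G1": 4813, "S": 2792, "G2M": 2193}
total_cells = sum(phase_counts.values())
percentages = {p: round(100 * n / total_cells, 1) for p, n in phase_counts.items()}
print(total_cells, percentages)
```

The counts sum to the 9,798 cells reported in the overview, and the recomputed percentages agree with the table.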
Integration Method Benchmark (Auto-Selected)
🏆 Best Method: X_scVI was automatically selected as the integration method.
Detailed benchmarking metrics are not available.
🔧 Available Integration Methods
• Harmony: ✅ Available
• scVI: ✅ Available
• Selected: X_scVI
Analysis Pipeline Status (Workflow)
Analysis Step | Status | Parameters |
---|---|---|
🔍 Quality Control & Filtering | ✅ Completed | mode: seurat; min_cells: 3; min_genes: 200 (+ 10 more) |
⚙️ Preprocessing & Normalization | ✅ Completed | mode: shiftlog|pearson; target_sum: 500000.0; n_HVGs: 2000 (+ 1 more) |
📏 Data Scaling | ✅ Completed | Default parameters |
📈 Principal Component Analysis | ✅ Completed | layer: scaled; n_pcs: 50 |
🔄 Cell Cycle Scoring | ✅ Completed | s_genes: ['Cdca7', 'Mcm4', 'Mcm7', 'Rfc2', 'Ung', 'Mcm6', 'Rrm1', 'Slbp', 'Pcna', 'Atad2', 'Tipin', 'Mcm5', 'Uhrf1', 'Polr1b', 'Dtl', 'Prim1', 'Fen1', 'Hells', 'Gmnn', 'Pold3', 'Nasp', 'Chaf1b', 'Gins2', 'Pola1', 'Msh2', 'Casp8ap2', 'Cdc6', 'Ubr7', 'Ccne2', 'Wdr76', 'Tyms', 'Cdc45', 'Clspn', 'Rrm2', 'Dscc1', 'Rad51', 'Usp1', 'Exo1', 'Blm', 'Rad51ap1', 'Cenpu', 'E2f8', 'Mrpl36']; g2m_genes: ['Cbx5', 'Aurkb', 'Cks1b', 'Cks2', 'Jpt1', 'Hmgb2', 'Anp32e', 'Lbr', 'Tmpo', 'Top2a', 'Tacc3', 'Tubb4b', 'Ncapd2', 'Rangap1', 'Cdk1', 'Smc4', 'Kif20b', 'Cdca8', 'Ckap2', 'Ndc80', 'Dlgap5', 'Hjurp', 'Ckap5', 'Bub1', 'Ckap2l', 'Ect2', 'Kif11', 'Birc5', 'Cdca2', 'Nuf2', 'Cdca3', 'Nusap1', 'Ttk', 'Aurka', 'Mki67', 'Pimreg', 'Ccnb2', 'Tpx2', 'Hjurp', 'Anln', 'Kif2c', 'Cenpe', 'Gtse1', 'Kif23', 'Cdc20', 'Ube2c', 'Cenpf', 'Cenpa', 'Hmmr', 'Ctcf', 'Psrc1', 'Cdc25c', 'Nek2', 'Gas2l3', 'G2e3'] |
🎵 Harmony Integration | ✅ Completed | n_pcs: 50 |
🧬 scVI Integration | ✅ Completed | n_layers: 2; n_latent: 30; gene_likelihood: nb |
📊 Method Benchmarking | ❌ Not Completed | Default parameters |
🎯 SCCAF Clustering Analysis | ❌ Not Completed | Default parameters |
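
The status column above can be tallied programmatically. The sketch below hard-codes the step names and statuses from the table (emoji dropped) and counts completed versus not completed steps; the data structure is illustrative and is not an OmicVerse API.

```python
from collections import Counter

# (step, completed) pairs transcribed from the status table above.
steps = [
    ("Quality Control & Filtering", True),
    ("Preprocessing & Normalization", True),
    ("Data Scaling", True),
    ("Principal Component Analysis", True),
    ("Cell Cycle Scoring", True),
    ("Harmony Integration", True),
    ("scVI Integration", True),
    ("Method Benchmarking", False),
    ("SCCAF Clustering Analysis", False),
]

# Count steps by status and list the ones that did not finish.
tally = Counter("completed" if done else "not completed" for _, done in steps)
pending = [name for name, done in steps if not done]
print(tally["completed"], "of", len(steps), "steps completed; pending:", pending)
```
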
📋 Pipeline Summary: This analysis was run with the OmicVerse lazy function pipeline.
The pipeline automatically performed quality control, normalization, scaling, PCA, cell cycle scoring,
and batch correction (Harmony and scVI). The status table above reports method benchmarking and
SCCAF clustering as not completed for this run.
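
As an illustration of the normalization step, the `shiftlog` mode with `target_sum: 500000.0` from the status table is assumed here to mean scaling each cell's counts to a total of 500,000 and then applying a log(1 + x) transform. That reading, the function name, and the toy counts below are assumptions rather than details stated in the report.

```python
import math

TARGET_SUM = 500_000.0  # target_sum value from the status table

def shiftlog_normalize(counts, target_sum=TARGET_SUM):
    """Scale one cell's counts to target_sum, then apply log(1 + x).

    Hypothetical helper for illustration; not an OmicVerse function.
    """
    total = sum(counts)
    if total == 0:
        # An empty cell has nothing to scale; return zeros unchanged.
        return [0.0 for _ in counts]
    return [math.log1p(c * target_sum / total) for c in counts]

# Toy UMI counts for one cell across four genes (invented values).
toy_cell = [0, 5, 20, 75]
normalized = shiftlog_normalize(toy_cell)
print([round(v, 3) for v in normalized])
```

Zero counts stay at zero after the transform, and undoing the log recovers values that sum to the target total.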