Tabula

⚠️ Status: partial | Version: federated-v1


Overview

Tabula is a tabular transformer trained with privacy-preserving federated learning. It uses a 60,697-gene vocabulary, quantile-binned expression values, and FlashAttention.

When to choose Tabula

Choose Tabula when you need privacy-preserving analysis, embeddings from a federated-trained model, or perturbation prediction with a tabular modeling approach.


Specifications

| Property | Value |
| --- | --- |
| Model | Tabula |
| Version | federated-v1 |
| Tasks | embed, annotate, integrate, perturb |
| Modalities | RNA |
| Species | human |
| Gene IDs | custom (60,697 gene vocabulary) |
| Embedding Dim | 192 |
| GPU Required | Yes |
| Min VRAM | 8 GB |
| Recommended VRAM | 16 GB |
| CPU Fallback | No |
| Adapter Status | ⚠️ partial |
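
Because the model requires a GPU and has no CPU fallback, it can be worth checking available memory before starting a run. The sketch below compares a device's memory against the 8 GB and 16 GB figures from the table. The helper names (`vram_tier`, `detect_vram_gb`) and the 0.5 GiB tolerance are assumptions made for illustration and are not part of omicverse; the tolerance exists because drivers typically report slightly less than a card's nominal capacity. Detection relies on PyTorch being installed and returns `None` otherwise.

```python
from typing import Optional

MIN_VRAM_GB = 8           # "Min VRAM" from the specifications
RECOMMENDED_VRAM_GB = 16  # "Recommended VRAM" from the specifications
TOLERANCE_GB = 0.5        # assumed slack: reported memory is usually below nominal


def vram_tier(total_gb: float, tolerance_gb: float = TOLERANCE_GB) -> str:
    """Classify a GPU's memory against Tabula's listed requirements."""
    effective = total_gb + tolerance_gb
    if effective < MIN_VRAM_GB:
        return "insufficient"
    if effective < RECOMMENDED_VRAM_GB:
        return "minimum"
    return "recommended"


def detect_vram_gb(device_index: int = 0) -> Optional[float]:
    """Return the total memory of a CUDA device in GiB, or None if unavailable."""
    try:
        import torch  # optional dependency, imported lazily
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    props = torch.cuda.get_device_properties(device_index)
    return props.total_memory / 1024**3


# A nominal 16 GB card that reports 15.7 GiB is still treated as recommended.
print(vram_tier(15.7))
```

In practice, `detect_vram_gb()` would be called first and its result passed to `vram_tier()`; a `None` result means no usable CUDA device was found, in which case Tabula cannot run.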

Quick Start

import omicverse as ov

# 1. Check model spec
info = ov.fm.describe_model("tabula")

# 2. Profile your data
profile = ov.fm.profile_data("your_data.h5ad")

# 3. Validate compatibility
check = ov.fm.preprocess_validate("your_data.h5ad", "tabula", "embed")

# 4. Run inference
result = ov.fm.run(
    task="embed",
    model_name="tabula",
    adata_path="your_data.h5ad",
    output_path="output_tabula.h5ad",
    device="auto",
)

# 5. Interpret results
metrics = ov.fm.interpret_results("output_tabula.h5ad", task="embed")

Input Requirements

| Requirement | Detail |
| --- | --- |
| Gene ID scheme | custom (60,697 gene vocabulary) |
| Preprocessing | Gene expression is quantile-binned. Model uses its own 60,697 gene vocabulary for tokenization. |
| Data format | AnnData (.h5ad) |
| Batch key | .obs column for batch integration (optional) |
| Label key | .obs column for cell type labels (optional) |
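
To make "quantile-binned" concrete, here is a minimal NumPy sketch of one common per-cell binning scheme: zeros stay in bin 0, and nonzero values are assigned to bins by their rank among the cell's nonzero values. This is an illustration only. The function name `quantile_bin`, the bin count, and the exact edge handling are assumptions; Tabula's actual binning parameters are not stated here, and `ov.fm.run()` is expected to handle preprocessing itself.

```python
import numpy as np


def quantile_bin(expr, n_bins: int = 10) -> np.ndarray:
    """Bin one cell's expression vector by the quantiles of its nonzero values.

    Zeros map to bin 0; nonzero values map to bins 1..n_bins.
    """
    expr = np.asarray(expr, dtype=float)
    binned = np.zeros(expr.shape, dtype=np.int64)
    nonzero = expr > 0
    if not nonzero.any():
        return binned
    # n_bins - 1 interior cut points computed from the nonzero values only.
    edges = np.quantile(expr[nonzero], np.linspace(0, 1, n_bins + 1)[1:-1])
    # searchsorted yields 0..n_bins-1; shift by 1 to reserve bin 0 for zeros.
    binned[nonzero] = np.searchsorted(edges, expr[nonzero], side="left") + 1
    return binned


cell = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 0.0, 5.0])
print(quantile_bin(cell, n_bins=5))  # [0 1 2 3 4 0 5]
```

Binning per cell makes the token values depend on relative expression within a cell rather than on absolute counts, which reduces sensitivity to sequencing depth.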

Output Keys

After running ov.fm.run(), results are stored in the AnnData object:

| Key | Location | Description |
| --- | --- | --- |
| X_tabula | adata.obsm | Cell embeddings (192-dim) |
| tabula_pred | adata.obs | Predicted cell type labels |

import scanpy as sc

adata = sc.read_h5ad("output_tabula.h5ad")
embeddings = adata.obsm["X_tabula"]  # shape: (n_cells, 192)

# Downstream analysis
sc.pp.neighbors(adata, use_rep="X_tabula")
sc.tl.umap(adata)
sc.tl.leiden(adata, resolution=0.5)
sc.pl.umap(adata, color=["leiden"])
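
Before running the downstream steps above, it may help to confirm that the output file actually contains the documented keys. The sketch below is a hypothetical helper (`check_tabula_outputs` is not part of omicverse or scanpy) that checks for `X_tabula` and `tabula_pred` and verifies the 192-dimensional embedding shape. It only relies on `adata.obsm`, `adata.obs`, and `adata.n_obs`, so the example uses a small stand-in object instead of a real AnnData file. Note that `tabula_pred` is presumably written only when an annotation task has been run.

```python
from types import SimpleNamespace

import numpy as np

EXPECTED_DIM = 192  # embedding size listed for Tabula


def check_tabula_outputs(adata) -> dict:
    """Report which documented Tabula output keys are present and well-formed."""
    report = {
        "has_embedding": "X_tabula" in adata.obsm,
        "has_predictions": "tabula_pred" in adata.obs,
    }
    if report["has_embedding"]:
        emb = np.asarray(adata.obsm["X_tabula"])
        # One row per cell, one column per embedding dimension.
        report["shape_ok"] = emb.shape == (adata.n_obs, EXPECTED_DIM)
        # NaN or inf values would break neighbor-graph construction.
        report["finite"] = bool(np.isfinite(emb).all())
    return report


# Stand-in for an AnnData object with 5 cells and embeddings but no predictions.
fake = SimpleNamespace(n_obs=5, obsm={"X_tabula": np.zeros((5, 192))}, obs={})
print(check_tabula_outputs(fake))
```

With a real file, the same function can be called on the object returned by `sc.read_h5ad("output_tabula.h5ad")`.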

Resources


Hands-On Tutorial

For a step-by-step walkthrough with code, see the Tabula Tutorial Notebook.