Skip to content

OmicVerse MCP Full Start

This page is the complete getting-started path for OmicVerse MCP. It sits between the short Quick Start and the more specialized reference pages.

Who This Is For

Read this page if:

  • the quick start feels too compressed
  • you want one continuous setup-to-analysis walkthrough
  • you want both the commands and the reasoning behind the default choices

What You Will End Up With

By the end of this page, you will have:

  • an OmicVerse MCP server running
  • a client connected through stdio or local HTTP
  • a dataset loaded as an adata_id
  • a standard preprocessing workflow completed
  • a plot and marker analysis generated
  • a persisted .h5ad you can restore later

Step 1: Install

pip install omicverse[mcp]
python -m omicverse.mcp --version

If you are working from a clone:

pip install -e ".[mcp]"

Step 2: Choose a Transport

OmicVerse MCP supports two local transports.

Common phase selections

# Core only
python -m omicverse.mcp --phase P0

# Default
python -m omicverse.mcp --phase P0+P0.5

# Include advanced P2 tools
python -m omicverse.mcp --phase P0+P0.5+P2

Option A: stdio

Use this when you want the simplest setup and want Claude to own the MCP process lifecycle.

python -m omicverse.mcp --phase P0+P0.5

Option B: streamable-http

Use this when you want a separately managed MCP process, clearer logs, or easier reconnect behavior.

NUMBA_CACHE_DIR=/tmp/numba_cache MPLCONFIGDIR=/tmp/mpl \
python -m omicverse.mcp \
  --transport streamable-http \
  --host 127.0.0.1 \
  --port 8765 \
  --http-path /mcp \
  --phase P0+P0.5

Default Recommendation

Start with stdio. Move to local HTTP when debugging, using larger datasets, or keeping one MCP process alive across reconnects.

Step 3: Connect a Client

Claude Code with stdio

{
  "mcpServers": {
    "omicverse": {
      "command": "python",
      "args": ["-m", "omicverse.mcp", "--phase", "P0+P0.5"]
    }
  }
}

If you want Claude Code to launch the full P2 rollout directly:

{
  "mcpServers": {
    "omicverse": {
      "command": "python",
      "args": ["-m", "omicverse.mcp", "--phase", "P0+P0.5+P2"]
    }
  }
}

Claude Code with local HTTP

{
  "mcpServers": {
    "omicverse": {
      "type": "http",
      "url": "http://127.0.0.1:8765/mcp"
    }
  }
}

If you use Claude Code, you can also read the Claude Code walkthrough.

Step 4: Understand What Crosses the Boundary

AnnData objects stay on the server side. The client does not receive the full in-memory object. Instead, it receives lightweight handles such as:

  • adata_id for datasets
  • artifact_id for plots and files
  • instance_id for P2 class-backed tools

That means a typical successful load looks like:

{
  "ok": true,
  "tool_name": "ov.utils.read",
  "outputs": [
    {
      "type": "object_ref",
      "ref_type": "adata",
      "ref_id": "adata_a1b2c3d4e5f6"
    }
  ]
}

Step 5: Load Data

You can start from a built-in dataset:

Load the built-in pbmc3k dataset

or from a local file:

Load the pbmc3k.h5ad file

Built-in loaders currently include:

  • ov.datasets.pbmc3k
  • ov.datasets.pbmc8k
  • ov.datasets.seqfish

Step 6: Inspect Before You Analyze

Before asking for preprocessing, inspect the current dataset. This is important because it lets the model see what is actually present in obs, var, obsm, and uns.

Useful prompts:

Describe the current adata
What is the first gene in var?
Does CD3D exist in var_names?
Inspect adata.obsm
Inspect adata.uns
Show value counts for leiden

These route to:

  • ov.adata.describe
  • ov.adata.peek
  • ov.adata.find_var
  • ov.adata.inspect
  • ov.adata.value_counts

Step 7: Run the Standard Workflow

The default analysis chain is:

  1. ov.pp.qc
  2. ov.pp.scale
  3. ov.pp.pca
  4. ov.pp.neighbors
  5. ov.pp.umap
  6. ov.pp.leiden

Natural-language request:

Run QC, scale, PCA with 50 components, build neighbors, compute UMAP, and run Leiden clustering at resolution 1.0

The server enforces prerequisites. For example, PCA requires the scaled layer, and neighbors requires X_pca.

Step 8: Plot and Interpret

After the embedding and clustering are ready, ask for plots:

Plot the UMAP colored by leiden
Show me a violin plot of n_genes grouped by leiden cluster
Create a dot plot for genes CD3D, CD79A, LYZ, NKG7 grouped by leiden

This typically uses:

  • ov.pl.embedding
  • ov.pl.violin
  • ov.pl.dotplot

Plot outputs are registered as artifacts.

Step 9: Marker Analysis and Enrichment

Once clusters exist, ask for marker analysis:

Find marker genes for each Leiden cluster and show me the top 5 per cluster
Plot a marker gene dotplot
Run COSG to rank marker genes
Perform pathway enrichment on the marker genes
Plot the pathway enrichment results

This can use:

  • ov.single.find_markers
  • ov.single.get_markers
  • ov.pl.markers_dotplot
  • ov.single.cosg
  • ov.single.pathway_enrichment
  • ov.single.pathway_enrichment_plot

Step 10: Persist and Restore

The analysis state is in memory until you persist it.

Save:

Save the current dataset to disk

Restore later:

Restore the dataset from /path/to/file.h5ad

Under the hood this uses:

  • ov.persist_adata
  • ov.restore_adata

Optional Step 11: Move to P2

If you need advanced class-backed tools such as DEG, annotation, metacells, DCT, or topic modeling, start the server with:

python -m omicverse.mcp --phase P0+P0.5+P2

P2 tools may appear in tool listings but still be unavailable if their optional dependencies are not installed.

Troubleshooting Checklist

Tools are missing

  • verify the --phase
  • ask the client to run ov.list_tools

A tool is unavailable

  • ask the client to run ov.describe_tool
  • check missing optional dependencies

The dataset handle is gone

  • the server may have restarted
  • reload the data or use ov.restore_adata

A long-running task is confusing to debug

  • prefer local HTTP mode
  • inspect ov.list_traces, ov.get_trace, ov.list_events, and ov.get_health

Tips and Best Practices

  1. Start with P0+P0.5. Only add +P2 when you need advanced class-backed tools.
  2. Use --persist-dir if you want saved datasets to survive across sessions.
  3. Let the client track adata_id for you instead of managing handles manually.
  4. Use ov.adata.* inspection tools before asking for interpretation of obs, var, obsm, or uns.
  5. Use ov.describe_tool when you want prerequisite or availability details.
  6. Use remote deployment when the dataset or dependency stack outgrows your local machine.
  7. Persist before stopping if you plan to continue later.

Where to Go Next