omicverse.space.pySTAligner

omicverse.space.pySTAligner(adata, hidden_dims: list = [512, 30], n_epochs: int = 1000, lr: float = 0.001, batch_key: str = 'batch_name', key_added: str = 'STAligner', gradient_clipping: float = 5, weight_decay: float = 0.0001, margin: float = 1, verbose: bool = False, random_seed: int = 666, iter_comb=None, knn_neigh: int = 100, Batch_list=None, device=torch.device) -> None

STAligner for spatial transcriptomics data integration.

STAligner is a deep learning method for integrating spatial transcriptomics data across different experimental conditions, technologies, and developmental stages. It combines graph neural networks with mutual nearest neighbors to preserve both transcriptional and spatial relationships during integration.

The method works by:

1. Constructing spatial neighborhood graphs
2. Learning batch-invariant embeddings
3. Aligning similar regions across batches
4. Preserving spatial organization
5. Enabling cross-condition comparison
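
The mutual-nearest-neighbor idea used for cross-batch alignment can be illustrated with a small, self-contained sketch. This is not the library's implementation: the helper names `knn_indices` and `mutual_nearest_neighbors` are hypothetical, the points are toy 2-D embeddings, and Euclidean distance is an assumption. A pair (i, j) is kept only when point i of the first batch has j among its k nearest neighbors in the second batch and vice versa.

```python
import math


def knn_indices(queries, references, k):
    """For each query point, return the set of indices of its k nearest reference points."""
    result = []
    for q in queries:
        order = sorted(range(len(references)), key=lambda j: math.dist(q, references[j]))
        result.append(set(order[:k]))
    return result


def mutual_nearest_neighbors(batch_a, batch_b, k):
    """Return sorted (i, j) pairs that are k-nearest neighbors of each other across batches."""
    a_to_b = knn_indices(batch_a, batch_b, k)
    b_to_a = knn_indices(batch_b, batch_a, k)
    return sorted(
        (i, j)
        for i, neighbors in enumerate(a_to_b)
        for j in neighbors
        if i in b_to_a[j]
    )


# Toy embeddings for two batches (illustrative values only).
batch_a = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
batch_b = [(0.1, 0.0), (5.1, 5.0), (9.0, 9.0)]

pairs = mutual_nearest_neighbors(batch_a, batch_b, k=1)
print(pairs)
```

In this toy case the second point of `batch_a` and the third point of `batch_b` have no mutual partner, so they contribute no pair. In the parameter list below, `knn_neigh` plays the role of `k` in this search.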

Parameters:
  • adata (AnnData) – Combined multi-batch AnnData for integration.

  • hidden_dims (list, default=[512, 30]) – Hidden dimensions of STAligner encoder.

  • n_epochs (int, default=1000) – Total training epochs.

  • lr (float, default=0.001) – Optimizer learning rate.

  • batch_key (str, default='batch_name') – Batch column in adata.obs.

  • key_added (str, default='STAligner') – Output embedding key in adata.obsm.

  • gradient_clipping (float, default=5) – Max norm for gradient clipping.

  • weight_decay (float, default=0.0001) – L2 regularization term.

  • margin (float, default=1) – Margin used in triplet loss during alignment.

  • verbose (bool, default=False) – Whether to print detailed training logs.

  • random_seed (int, default=666) – Random seed for reproducibility.

  • iter_comb (list, optional) – Batch-pair list for MNN comparison.

  • knn_neigh (int, default=100) – K for mutual nearest-neighbor search.

  • Batch_list (list, optional) – Per-batch AnnData list aligned to batch_key.

  • device (torch.device, default: CUDA if available, otherwise CPU) – Device used for model training.
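
To make the roles of `margin` and `gradient_clipping` more concrete, here is a minimal pure-Python sketch of a triplet margin loss and of clipping a gradient by its norm. It assumes Euclidean distance and the standard textbook forms of both operations. The function names are hypothetical and are not part of omicverse, and the actual training code may differ in detail.

```python
import math


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0) with Euclidean distance."""
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)


def clip_by_norm(grad, max_norm=5.0):
    """Rescale a gradient vector so that its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]


anchor = (0.0, 0.0)
positive = (0.0, 1.0)

# Negative is already farther than the positive by more than the margin: no penalty.
loss_easy = triplet_margin_loss(anchor, positive, (0.0, 3.0), margin=1.0)

# Negative is only slightly farther than the positive: the loss is positive.
loss_hard = triplet_margin_loss(anchor, positive, (0.0, 1.5), margin=1.0)

# A gradient with norm 10 is scaled down to norm 5; its direction is unchanged.
clipped = clip_by_norm([6.0, 8.0], max_norm=5.0)
```

A larger `margin` demands a wider gap between matched and unmatched spots before the loss reaches zero. A smaller `gradient_clipping` value caps each update step more tightly.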

Attributes:
  • adata (AnnData) – Combined data containing all batches.

  • model (STAligner) – Neural network model for integration.

  • loader (DataLoader) – PyTorch Geometric data loader.

  • device (torch.device) – Computing device (GPU/CPU).

  • optimizer (torch.optim.Optimizer) – Adam optimizer for training.

Examples:

    >>> import scanpy as sc
    >>> import omicverse as ov
    >>> # Load data
    >>> adata1 = sc.read_visium(...)
    >>> adata2 = sc.read_visium(...)
    >>> # Construct spatial networks
    >>> ov.space.Cal_Spatial_Net(adata1, rad_cutoff=100)
    >>> ov.space.Cal_Spatial_Net(adata2, rad_cutoff=100)
    >>> # Combine data
    >>> adata = adata1.concatenate(adata2)
    >>> # Initialize STAligner
    >>> staligner = ov.space.pySTAligner(
    ...     adata=adata,
    ...     batch_key='batch',
    ...     Batch_list=[adata1, adata2]
    ... )
    >>> # Train model
    >>> staligner.train()
    >>> # Get integrated embeddings
    >>> embeddings = staligner.predicted()
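
The `iter_comb` parameter is described above only as a batch-pair list. One plausible way to build it, assuming each pair is a tuple of integer positions into `Batch_list` and that only consecutive sections are compared, is sketched below. Both the pair format and the `section_ids` names are assumptions for illustration, not confirmed by this page.

```python
# Hypothetical section names, in the same order as the AnnData objects in Batch_list.
section_ids = ["slice_A", "slice_B", "slice_C"]

# Pair each section with the next one, so MNN search runs only between neighbors.
iter_comb = [(i, i + 1) for i in range(len(section_ids) - 1)]
print(iter_comb)
```

Restricting the comparison to adjacent sections keeps the number of pairwise searches linear in the number of batches. Comparing every pair would instead enumerate all combinations.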