omicverse.alignment.ref
- omicverse.alignment.ref(index_path, t2g_path, fasta_paths=None, gtf_paths=None, cdna_path=None, workflow='standard', d=None, k=None, threads=8, overwrite=False, temp_dir='tmp', make_unique=False, include=None, exclude=None, dlist=None, dlist_overhang=1, aa=False, max_ec_size=None, nucleus=False, f2=None, c1=None, c2=None, flank=None, feature=None, no_mismatches=False, distinguish=False, **kwargs)
Build a kallisto index and transcript-to-gene mapping files via kb ref.
- Parameters:
index_path (str) – Output path for generated kallisto index file.
t2g_path (str) – Output path for transcript-to-gene mapping table.
fasta_paths (str|list[str]|None, optional) – Input transcript/genome FASTA file(s) used to build the reference.
gtf_paths (str|list[str]|None, optional) – Input GTF annotation file(s) aligned with fasta_paths.
cdna_path (str|None, optional) – Optional cDNA FASTA output path required by some workflows.
workflow (str, optional) – kb workflow mode (for example standard, nucleus, lamanno, kite).
d (str|None, optional) – Prebuilt reference bundle shortcut (kb ref -d).
k (int|None, optional) – K-mer length for kallisto index construction.
threads (int, optional) – Number of threads for kb execution.
overwrite (bool, optional) – Whether to overwrite existing output files.
temp_dir (str, optional) – Temporary directory root for kb intermediate files.
make_unique (bool, optional) – Make feature IDs unique when duplicate names are detected.
include (list[dict[str,str]]|None, optional) – Attribute filters to include specific transcript/gene records.
exclude (list[dict[str,str]]|None, optional) – Attribute filters to exclude records.
dlist (str|None, optional) – Decoy list file for selective-alignment workflows.
dlist_overhang (int, optional) – Overhang length used when generating decoy targets.
aa (bool, optional) – Enable amino-acid mode where supported by kb workflow.
max_ec_size (int|None, optional) – Maximum equivalence-class size for index generation.
nucleus (bool, optional) – Shortcut to switch from standard to nucleus workflow.
f2 (str|None, optional) – Secondary FASTA output/input path for nucleus/velocity workflows.
c1 (str|None, optional) – Spliced transcript capture output path for velocity workflows.
c2 (str|None, optional) – Intronic transcript capture output path for velocity workflows.
flank (int|None, optional) – Flanking sequence length for selected workflows.
feature (str|None, optional) – Feature FASTA path used by kite workflow.
no_mismatches (bool, optional) – Disable mismatches for feature barcoding workflows.
distinguish (bool, optional) – Distinguish overlapping features when supported by kb.
**kwargs – Additional kb flags passed through (for example kallisto, bustools, opt_off).
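The parameters above can be assembled as in the following minimal sketch. The helper names build_ref_kwargs and run_ref are hypothetical, as are the output file names (index.idx, t2g.txt, and so on). The sketch also assumes that the nucleus and lamanno workflows need the f2, c1 and c2 paths, which this page suggests but does not state outright. The import of omicverse is deferred so that the argument-building step can be inspected without running kb.

```python
import os
from typing import Any, Dict, List, Union


def build_ref_kwargs(
    out_dir: str,
    fasta: Union[str, List[str]],
    gtf: Union[str, List[str]],
    workflow: str = "standard",
    threads: int = 8,
) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for omicverse.alignment.ref.

    The file names placed under out_dir are illustrative choices, not names
    required by omicverse or kb.
    """
    kwargs: Dict[str, Any] = {
        "index_path": os.path.join(out_dir, "index.idx"),
        "t2g_path": os.path.join(out_dir, "t2g.txt"),
        "fasta_paths": fasta,
        "gtf_paths": gtf,
        "cdna_path": os.path.join(out_dir, "cdna.fa"),
        "workflow": workflow,
        "threads": threads,
        "overwrite": False,
    }
    if workflow in ("nucleus", "lamanno"):
        # Assumption: these workflows also need the secondary FASTA (f2)
        # and the two transcript capture lists (c1, c2).
        kwargs["f2"] = os.path.join(out_dir, "intron.fa")
        kwargs["c1"] = os.path.join(out_dir, "cdna_t2c.txt")
        kwargs["c2"] = os.path.join(out_dir, "intron_t2c.txt")
    return kwargs


def run_ref(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Call omicverse.alignment.ref with the prepared arguments.

    omicverse is imported lazily; running this requires omicverse and the
    kb toolchain to be installed.
    """
    import omicverse as ov

    return ov.alignment.ref(**kwargs)
```

A call such as run_ref(build_ref_kwargs("ref", "genome.fa", "genes.gtf")) would then build a standard-workflow reference under ref/, assuming the input files exist.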
- Returns:
Metadata dictionary with workflow info and generated output paths.
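The key names of the returned dictionary are not listed here, so a cautious caller may prefer to check the requested output paths directly rather than rely on particular keys. The helper missing_outputs below is a hypothetical sketch of such a check. It uses only the standard library and makes no assumption about the structure of the metadata.

```python
import os
from typing import Iterable, List


def missing_outputs(paths: Iterable[str]) -> List[str]:
    """Return the subset of paths that do not exist as regular files.

    Intended to be called with the output paths passed to
    omicverse.alignment.ref (for example index_path and t2g_path).
    """
    return [p for p in paths if not os.path.isfile(p)]
```

After a build, an empty result from missing_outputs([index_path, t2g_path]) indicates that both requested files were written.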
- Return type: