omicverse.alignment.parallel_fastq_dump

omicverse.alignment.parallel_fastq_dump(sra_id, threads=1, outdir='.', tmpdir=None, min_spot_id=1, max_spot_id=None, split_files=False, gzip=False, **kwargs)

Download SRA data in parallel using parallel-fastq-dump.

This function wraps the parallel-fastq-dump tool, which retrieves sequencing data for an NCBI SRA (Sequence Read Archive) accession and writes it as FASTQ, spreading the work across multiple threads to reduce wall-clock time.
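Because the function wraps an external tool, it can help to confirm that the tool is installed before starting a long download. The following is a minimal sketch, assuming the wrapper invokes an executable named `parallel-fastq-dump` found on `PATH`; the helper name `require_tool` is hypothetical and not part of omicverse.

```python
import shutil


def require_tool(name="parallel-fastq-dump"):
    """Return the full path of an executable, or raise if it is not on PATH.

    Assumption: the wrapper shells out to an executable with this name.
    """
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(
            f"Required executable '{name}' was not found on PATH; "
            "install it before calling the wrapper."
        )
    return path
```

Calling `require_tool()` first turns a missing installation into an immediate, readable error rather than a failure partway through a pipeline.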

Parameters:

sra_id (str) – SRA accession ID (for example SRR2244401).
threads (int, optional) – Number of parallel threads used by parallel-fastq-dump.
outdir (str, optional) – Output directory for downloaded FASTQ files.
tmpdir (str|None, optional) – Temporary directory for chunk/intermediate files.
min_spot_id (int, optional) – Minimum SRA spot ID to download.
max_spot_id (int|None, optional) – Maximum SRA spot ID to download. None downloads all remaining spots.
split_files (bool, optional) – Split paired-end reads into separate *_1/*_2 FASTQ files.
gzip (bool, optional) – Compress output FASTQ files using gzip.
**kwargs – Additional flags passed through to parallel-fastq-dump.
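To make the interaction between `threads`, `min_spot_id`, and `max_spot_id` concrete, the sketch below divides an inclusive spot range into contiguous chunks, one per thread. It illustrates the general idea of splitting a spot range for parallel processing; whether parallel-fastq-dump uses exactly this arithmetic is an assumption, and `split_spot_range` is a hypothetical helper, not part of omicverse or the tool.

```python
def split_spot_range(min_spot_id, max_spot_id, threads):
    """Split the inclusive range [min_spot_id, max_spot_id] into contiguous chunks.

    Returns a list of (start, end) tuples, at most one per thread.
    Illustrative only: the real tool's chunking may differ.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    if max_spot_id < min_spot_id:
        raise ValueError("max_spot_id must be >= min_spot_id")

    total = max_spot_id - min_spot_id + 1
    # Never create more chunks than there are spots.
    n_chunks = min(threads, total)
    base, extra = divmod(total, n_chunks)

    chunks = []
    start = min_spot_id
    for i in range(n_chunks):
        # The first `extra` chunks absorb one leftover spot each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        chunks.append((start, end))
        start = end + 1
    return chunks
```

With `min_spot_id=1`, `max_spot_id=100000`, and `threads=8`, each chunk in this sketch covers 12,500 spots, so a bounded `max_spot_id` is a convenient way to test a pipeline on a small, evenly divided subset.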

Return type:

Dict[str, Union[str, int]]

Returns:

dict[str,str|int] – Download metadata including input parameters and discovered output FASTQ paths.
Examples:

>>> import omicverse as ov
>>> # Download SRA data with 4 threads and split files
>>> result = ov.alignment.parallel_fastq_dump(
...     sra_id='SRR2244401',
...     threads=4,
...     outdir='fastq_output/',
...     split_files=True,
...     gzip=True
... )
>>> # Download with spot range limit
>>> result = ov.alignment.parallel_fastq_dump(
...     sra_id='SRR2244401',
...     threads=8,
...     outdir='fastq_output/',
...     min_spot_id=1,
...     max_spot_id=100000,
...     split_files=True,
...     gzip=True
... )
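The returned metadata includes the discovered FASTQ paths, but the key names are not documented here. As a fallback that does not depend on those keys, the sketch below lists FASTQ files in the output directory directly. It assumes output files are named after the accession and end in `.fastq` or `.fastq.gz` (for example `SRR2244401_1.fastq.gz` when `split_files` and `gzip` are enabled); `find_fastq_files` is a hypothetical helper, not part of omicverse.

```python
from pathlib import Path


def find_fastq_files(outdir, sra_id):
    """List FASTQ files in `outdir` whose names start with `sra_id`.

    Assumption: outputs are named '<sra_id>*.fastq' or '<sra_id>*.fastq.gz'.
    """
    suffixes = (".fastq", ".fastq.gz")
    return sorted(
        path
        for path in Path(outdir).glob(f"{sra_id}*")
        if path.is_file() and path.name.endswith(suffixes)
    )
```

After a download into `fastq_output/`, `find_fastq_files('fastq_output/', 'SRR2244401')` returns the matching paths in sorted order, which places the `_1` file before the `_2` file for paired-end data.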