Pipeline Steps

Each of the algorithms below, if selected, runs in parallel, subject to available resources.

1. SAMtools stats

samtools stats collects basic statistics from BAM files including read counts, qualities, GC content, insert sizes, read lengths, proper pairing, and duplicated bases.

2. Picard CollectWgsMetrics

picard CollectWgsMetrics collects coverage metrics from WGS BAM files.
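
To make "coverage metrics" concrete, here is a minimal sketch of two typical summaries: the mean depth, and the fraction of positions covered at or above a given depth. It is not Picard's implementation. The function name `coverage_summary`, the output keys, and the default thresholds are assumptions made for illustration, and the read and base filtering a real tool applies (mapping quality, base quality, duplicates) is left out.

```python
from typing import Dict, Sequence


def coverage_summary(depths: Sequence[int],
                     thresholds: Sequence[int] = (10, 30)) -> Dict[str, float]:
    """Summarize per-base depths (hypothetical helper, not a Picard API).

    Returns the mean depth and, for each threshold t, the fraction of
    positions whose depth is at least t.
    """
    n = len(depths)
    if n == 0:
        raise ValueError("depths must not be empty")

    summary = {"mean_coverage": sum(depths) / n}
    for t in thresholds:
        # Fraction (0.0 to 1.0) of positions covered at >= t reads.
        summary[f"pct_{t}x"] = sum(1 for d in depths if d >= t) / n
    return summary


if __name__ == "__main__":
    # Toy per-base depths for a five-base region.
    print(coverage_summary([0, 10, 20, 30, 40]))
```

In practice the per-base depths come from the aligned reads in the BAM file; the toy list above only stands in for them.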

3. Picard CollectHsMetrics

picard CollectHsMetrics collects hybrid-selection (HS) metrics from targeted sequencing BAM files, such as exome data.

4. Qualimap bamqc

qualimap bamqc collects basic statistics and coverage metrics from BAM files and can write its report as HTML or PDF. Because Qualimap bamqc uses a lot of memory, it should not be run within uclahs-cds/metapipeline-DNA.

5. mosdepth coverage and quantize

mosdepth coverage by windows provides fast BAM/CRAM depth calculation, reported per window. The quantize option creates a BED file labeling regions that fall within specified coverage thresholds, similar to GATK's CallableLoci tool.
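
The idea behind quantize can be sketched in a few lines: assign each position to a coverage bin, then merge runs of adjacent positions that share a bin into BED-style intervals. This is an illustrative sketch only, not mosdepth's code. The names `bin_label` and `quantize`, the `lo:hi` label format, and the requirement that the first bin edge be 0 are assumptions made for the example.

```python
import bisect
from typing import List, Sequence, Tuple

Interval = Tuple[int, int, str]  # (start, end, label), 0-based half-open


def bin_label(depth: int, edges: Sequence[int]) -> str:
    """Return a 'lo:hi' label for the bin containing depth.

    edges are ascending lower bounds; the last bin is open-ended.
    """
    i = bisect.bisect_right(edges, depth) - 1
    lo = edges[i]
    hi = edges[i + 1] if i + 1 < len(edges) else "inf"
    return f"{lo}:{hi}"


def quantize(depths: Sequence[int], edges: Sequence[int]) -> List[Interval]:
    """Merge consecutive positions that fall in the same coverage bin."""
    if not edges or edges[0] != 0 or list(edges) != sorted(set(edges)):
        raise ValueError("edges must be strictly ascending and start at 0")

    intervals: List[Interval] = []
    for pos, depth in enumerate(depths):
        label = bin_label(depth, edges)
        if intervals and intervals[-1][2] == label:
            # Same bin as the previous position: extend the open interval.
            start, _, _ = intervals[-1]
            intervals[-1] = (start, pos + 1, label)
        else:
            intervals.append((pos, pos + 1, label))
    return intervals


if __name__ == "__main__":
    # Bins: [0, 1), [1, 4), [4, inf)
    for start, end, label in quantize([0, 0, 2, 3, 5, 5, 0], [0, 1, 4]):
        print(f"chr1\t{start}\t{end}\t{label}")
```

With edges `[0, 1, 4]`, positions with no coverage, low coverage, and coverage of 4 or more each get their own label, so the resulting intervals show which regions meet a chosen depth cutoff.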

6. FastQC

FastQC aims to provide a QC report that can spot problems originating either in the sequencer or in the starting library material.