Pipeline Steps

Each of the algorithms below runs only if selected. Selected algorithms run in parallel, subject to available resources.

  • Note on duplicate reads: by default, SAMtools stats includes reads marked as duplicates in its statistics. Set the option samtools_remove_duplicates to true to exclude them. Picard CollectWgsMetrics and Qualimap bamqc exclude reads marked as duplicates by default.

1. SAMtools stats

samtools stats collects basic statistics from BAM files including read counts, qualities, GC content, insert sizes, read lengths, proper pairing, and duplicated bases.
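For orientation, a standalone invocation of this step might be assembled as in the sketch below. It is not the pipeline's own code: the helper name samtools_stats_command and the file name sample.bam are hypothetical, and it is an assumption that the samtools_remove_duplicates option corresponds to the samtools stats flag --remove-dups, which excludes reads marked as duplicates.

```python
import shlex
from typing import List


def samtools_stats_command(bam_path: str, remove_duplicates: bool = False) -> List[str]:
    """Build an argument list for `samtools stats` (hypothetical helper)."""
    command = ["samtools", "stats"]
    if remove_duplicates:
        # --remove-dups excludes reads flagged as duplicates from the statistics.
        # Assumption: this mirrors what samtools_remove_duplicates = true does.
        command.append("--remove-dups")
    command.append(bam_path)
    return command


# Print the command line for inspection; "sample.bam" is a placeholder.
print(shlex.join(samtools_stats_command("sample.bam", remove_duplicates=True)))
```

samtools stats writes its report to standard output, so a caller would normally redirect that output to a file.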

2. Picard CollectWgsMetrics

picard CollectWgsMetrics collects coverage metrics from WGS BAM files.
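As a rough illustration, CollectWgsMetrics takes an input BAM, a reference FASTA, and an output path. The sketch below builds such a command using Picard's legacy KEY=VALUE argument syntax. The helper name, the picard.jar location, and all file names are assumptions for illustration only, and the pipeline may invoke Picard differently.

```python
import shlex
from typing import List


def collect_wgs_metrics_command(
    bam_path: str,
    reference_fasta: str,
    output_path: str,
    picard_jar: str = "picard.jar",  # assumed location of the Picard jar
) -> List[str]:
    """Build an argument list for Picard CollectWgsMetrics (hypothetical helper)."""
    return [
        "java", "-jar", picard_jar, "CollectWgsMetrics",
        f"I={bam_path}",          # input BAM
        f"O={output_path}",       # metrics text file written by Picard
        f"R={reference_fasta}",   # reference the BAM was aligned to
    ]


# Print the command line for inspection; all paths are placeholders.
print(shlex.join(collect_wgs_metrics_command("sample.bam", "genome.fa", "wgs_metrics.txt")))
```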

3. Qualimap bamqc

qualimap bamqc collects basic statistics and coverage metrics from BAM files. Example output: HTML, PDF. Qualimap bamqc is memory-intensive and should not be run within uclahs-cds/metapipeline-DNA.
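Because memory is the main constraint for this step, a standalone run usually sets the Java heap size explicitly. The sketch below builds such a command with Qualimap's --java-mem-size option and requests both report formats. The helper name, the default of 4G, and the file names are illustrative assumptions, not values taken from the pipeline.

```python
import shlex
from typing import List


def qualimap_bamqc_command(
    bam_path: str,
    output_dir: str,
    java_mem_size: str = "4G",  # assumed heap size; adjust to the BAM and host
    threads: int = 1,
) -> List[str]:
    """Build an argument list for `qualimap bamqc` (hypothetical helper)."""
    return [
        "qualimap", "bamqc",
        "-bam", bam_path,
        "-outdir", output_dir,
        "-outformat", "PDF:HTML",   # write both PDF and HTML reports
        "-nt", str(threads),        # number of threads
        f"--java-mem-size={java_mem_size}",
    ]


# Print the command line for inspection; all paths are placeholders.
print(shlex.join(qualimap_bamqc_command("sample.bam", "qualimap_out", java_mem_size="8G", threads=4)))
```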