Inputs and Configuration

| Input or Parameter/Flag | Required | Type | Description |
|---|---|---|---|
| input.BAM | yes | path | BAM file for which to calculate coverage. The path is provided in the input YAML. |
| target_BED | yes | path | BED file specifying target intervals (defines the regions used for target and off-target coverage operations). |
| save_intermediate_files | yes | boolean | Whether to save intermediate files. |
| reference_dict | yes | path | Human genome reference dictionary file used in BED to INTERVAL_LIST conversion. Required if collecting metrics. |
| reference_dbSNP | yes | path | dbSNP reference VCF file, with proper chromosome encoding and compression. See discussion. Required if performing off-target read depth calculation. |
| genome_sizes | yes | path | Reference file listing chromosomes and their lengths, used by bedtools slop. Required for off-target read depth workflows. .fai files are accepted. |
| target_depth | no | boolean | Whether to calculate per-base read depth in targeted regions. Default false. |
| off_target_depth | no | boolean | Whether to perform off-target read depth calculation at dbSNP loci. Default true. |
| output_enriched_target_file | no | boolean | Whether to output a new target file containing coverage-enriched off-target dbSNP loci. Default true. |
| min_read_depth | no | integer | Minimum read depth threshold for an off-target locus to be considered enriched and included in the new target file. Default 30. |
| min_base_quality | no | integer | Minimum base quality for a read to be counted in depth calculation by samtools depth. Applies to both off-target and on-target calculations. Default 20. |
| min_mapping_quality | no | integer | Minimum mapping quality for a read to be counted in depth calculation by samtools depth. Applies to both off-target and on-target calculations. Default 20. |
| collect_metrics | no | boolean | Whether to run CollectHsMetrics. Default true. |
| target_interval_list | no | path | Interval list file specifying the target intervals used by CollectHsMetrics to calculate coverage. If not provided, the intervals are generated from the target BED. |
| bait_BED | no | path | BED file with bait locations, used to generate the bait interval list for CollectHsMetrics. If not provided, the target BED is used. |
| bait_interval_list | no | path | Interval list file specifying the bait intervals used by CollectHsMetrics. If not provided, the intervals are generated from the bait BED. |
| save_interval_list | yes | boolean | Whether to save a copy of any generated interval lists. Files are saved to the output_dir. |
| save_all_dbSNP | no | boolean | Whether to save a copy of the read depth BED file for all dbSNP loci generated by the off-target workflows. Default false. |
| save_raw_target_bed | no | boolean | Whether to save a copy of the per-base target read depth BED with uncollapsed intervals. Default false. |
| off_target_slop | no | integer | Number of base pairs to add to either side of the target file coordinates so that these flanking regions are excluded from off-target read depth calculation. Default 500. |
| dbSNP_slop | no | integer | Number of base pairs to add to either side of off-target dbSNP loci to generate off-target intervals. The purpose is to merge adjacent dbSNP loci into single intervals before merging with the target intervals. Default 150. |
| coverage_cap | no | integer | COVERAGE_CAP parameter for CollectHsMetrics. Sets the coverage threshold at which to stop calculating coverage. |
| near_distance | no | integer | NEAR_DISTANCE parameter for CollectHsMetrics. Sets the maximum distance in bp of a read from the nearest probe (bait) for it to be counted as "near probe" in metrics calculations. Default 250. |
| samtools_depth_extra_args | no | string | Extra arguments for samtools depth. |
| picard_CollectHsMetrics_extra_args | no | string | Extra arguments for picard CollectHsMetrics. |
| merge_operation | no | string | Operation performed on the read depth column values when intervals are collapsed during bedtools merge. Default 'collapse'. See the bedtools documentation for other options. |
| work_dir | no | path | Path of the working directory for Nextflow. When included in the sample config file, Nextflow intermediate files and logs are saved to this directory. With ucla_cds, the default is /scratch and should only be changed for testing/development. Changing this directory to /hot can lead to high server latency, and changing it to /tmp can lead to disk space limitations. |
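
Several of the reference files above are described as needed only when a particular workflow is enabled. The following is a minimal sketch of how those dependencies could be checked before launching a run. It is not part of the pipeline: `check_params` and `DOCUMENTED_DEFAULTS` are hypothetical names, the defaults are copied from the table, and treating the "Required if ..." notes as conditional requirements is an interpretation (the table itself marks these files as required). The BAM input is left out because it is supplied through the input YAML rather than as a parameter.

```python
from typing import Any, Dict, List

# Defaults copied from the parameter table. Parameters without a documented
# default (for example coverage_cap) are intentionally left out.
DOCUMENTED_DEFAULTS: Dict[str, Any] = {
    "target_depth": False,
    "off_target_depth": True,
    "output_enriched_target_file": True,
    "min_read_depth": 30,
    "min_base_quality": 20,
    "min_mapping_quality": 20,
    "collect_metrics": True,
    "save_all_dbSNP": False,
    "save_raw_target_bed": False,
    "off_target_slop": 500,
    "dbSNP_slop": 150,
    "near_distance": 250,
    "merge_operation": "collapse",
}


def check_params(user_params: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: list missing files implied by the enabled workflows."""
    # User-supplied values override the documented defaults.
    params = {**DOCUMENTED_DEFAULTS, **user_params}
    problems: List[str] = []

    # The target BED is needed by every workflow.
    if not params.get("target_BED"):
        problems.append("target_BED is required")

    # Assumption: reference_dict is only needed when CollectHsMetrics runs.
    if params["collect_metrics"] and not params.get("reference_dict"):
        problems.append("reference_dict is required when collect_metrics is true")

    # Assumption: dbSNP and genome sizes are only needed for off-target depth.
    if params["off_target_depth"]:
        for name in ("reference_dbSNP", "genome_sizes"):
            if not params.get(name):
                problems.append(f"{name} is required when off_target_depth is true")

    return problems
```

Because `collect_metrics` and `off_target_depth` both default to true, calling `check_params({"target_BED": "targets.bed"})` reports all three reference files as missing. Disabling both workflows leaves `target_BED` as the only file the sketch asks for.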