Overview

The call-gSV Nextflow pipeline calls structural variants (SVs) and copy number variants (CNVs) using Delly and Manta. It can also regenotype previously identified SVs or CNVs with Delly. The pipeline is suitable for detecting copy-number variable deletion and tandem duplication events as well as balanced rearrangements such as inversions or reciprocal translocations, and it validates the output quality with BCFtools. It has been engineered to run in a four-layer stack in a scalable, cloud-based environment: CycleCloud, Slurm, Nextflow and Docker. It has been validated with the SMC-HET dataset and the GRCh38 reference genome, using paired-end FASTQs that were back-extracted from BAMs created by BAM Surgeon.

Node Specific Config File Settings

| Config File | Available Node (cpus / memory) | Designated Process 1 (cpus / memory) | Designated Process 2 (cpus / memory) | Designated Process 3 (cpus / memory) |
| --- | --- | --- | --- | --- |
| F2.config | 2 / 3 GB | call_gSV_Delly; 1 / 2 GB | call_gCNV_Delly; 1 / 2 GB | call_gSV_Manta; 1 / 2 GB* |
| F16.config | 16 / 26 GB | call_gSV_Delly; 1 / 8 GB | call_gCNV_Delly; 1 / 8 GB | call_gSV_Manta; 6 / 8 GB |
| F32.config | 32 / 62.8 GB | call_gSV_Delly; 1 / 15 GB | call_gCNV_Delly; 1 / 15 GB | call_gSV_Manta; 12 / 15 GB |
| F72.config | 72 / 136.8 GB | call_gSV_Delly; 1 / 30 GB | call_gCNV_Delly; 1 / 30 GB | call_gSV_Manta; 24 / 30 GB |
| M64.config | 64 / 950 GB | call_gSV_Delly; 1 / 60 GB | call_gCNV_Delly; 1 / 60 GB | call_gSV_Manta; 30 / 60 GB |

* Manta SV calling does not work on an F2 node because the node's resources are incompatible with Manta. To test the pipeline on tasks that do not involve Manta, set run_manta = false in the sample-specific config file.
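
To illustrate how one row of the table might be expressed in Nextflow configuration, the F16.config allocations could be sketched as below. The `process` scope and `withName` selectors are standard Nextflow config syntax, and the process names and values are taken from the table. How the pipeline's own config files are actually laid out is an assumption here, so treat this as a sketch rather than the contents of the shipped file.

```groovy
// Hypothetical sketch of per-process resources for an F16 node (16 cpus / 26 GB).
// Values mirror the F16.config row of the table; the file layout is assumed.
process {
    withName: call_gSV_Delly {
        cpus = 1
        memory = 8.GB
    }
    withName: call_gCNV_Delly {
        cpus = 1
        memory = 8.GB
    }
    withName: call_gSV_Manta {
        cpus = 6
        memory = 8.GB
    }
}
```

With these values the three processes together request 8 cpus and 24 GB, which fits within the 16 cpus and 26 GB the node provides.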