parseFusionCatcher

parseFusionCatcher takes the fusion transcripts identified by FusionCatcher and saves them as a GVF file. The GVF file can later be used to call variant peptides with callVariant.

Reference Version

The versions of the reference genome FASTA, proteome FASTA, and annotation GTF MUST be consistent across all analyses.

Usage

usage: moPepGen parseFusionCatcher [-h] -i <file> -o <file>
                                   [--max-common-mapping <number>]
                                   [--min-spanning-unique <number>] --source
                                   SOURCE [-g <file>] [-a <file>]
                                   [--reference-source {GENCODE,ENSEMBL}]
                                   [--index-dir [<file>]]
                                   [--debug-level <value|number>] [-q]

Parse the FusionCatcher result to GVF format of variant records for moPepGen
to call variant peptides. The genome

optional arguments:
  -h, --help            show this help message and exit
  -i <file>, --input-path <file>
                        File path to FusionCatcher's output TSV file. Valid
                        formats: ['.tsv', '.txt'] (default: None)
  -o <file>, --output-path <file>
                        File path to the output file. Valid formats: ['.gvf']
                        (default: None)
  --max-common-mapping <number>
                        Maximal number of common mapping reads. (default: 0)
  --min-spanning-unique <number>
                        Minimal spanning unique reads. (default: 5)
  --source SOURCE       Variant source (e.g. gSNP, sSNV, Fusion) (default:
                        None)
  --debug-level <value|number>
                        Debug level. (default: INFO)
  -q, --quiet           Quiet (default: False)

Reference Files:
  -g <file>, --genome-fasta <file>
                        Path to the genome assembly FASTA file. Only ENSEMBL
                        and GENCODE are supported. Its version must be the
                        same as the annotation GTF and proteome FASTA
                        (default: None)
  -a <file>, --annotation-gtf <file>
                        Path to the annotation GTF file. Only ENSEMBL and
                        GENCODE are supported. Its version must be the same as
                        the genome and proteome FASTA. (default: None)
  --reference-source {GENCODE,ENSEMBL}
                        Source of reference genome and annotation. (default:
                        None)
  --index-dir [<file>]  Path to the directory of index files generated by
                        moPepGen generateIndex. If given, --genome-fasta,
                        --proteome-fasta and --annotation-gtf will be
                        ignored. (default: None)
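
As an illustration of how the options above fit together, the following is a minimal Python sketch that assembles the argument list for one run. The helper name `build_parse_fusioncatcher_cmd` and the file names are hypothetical and not part of moPepGen. The extension checks mirror the valid formats listed in the help text, and the requirement that either `--index-dir` or both reference files be given is an assumption of this sketch, not documented behaviour.

```python
from pathlib import Path


def build_parse_fusioncatcher_cmd(
    input_path,
    output_path,
    source,
    index_dir=None,
    genome_fasta=None,
    annotation_gtf=None,
    max_common_mapping=0,
    min_spanning_unique=5,
):
    """Return the argument list for one parseFusionCatcher run (hypothetical helper)."""
    # Valid formats are taken from the help text above.
    if Path(input_path).suffix not in {".tsv", ".txt"}:
        raise ValueError("input must end in .tsv or .txt")
    if Path(output_path).suffix != ".gvf":
        raise ValueError("output must end in .gvf")

    cmd = [
        "moPepGen", "parseFusionCatcher",
        "--input-path", str(input_path),
        "--output-path", str(output_path),
        "--source", source,
        "--max-common-mapping", str(max_common_mapping),
        "--min-spanning-unique", str(min_spanning_unique),
    ]
    if index_dir is not None:
        # When an index directory is given, the individual reference files are ignored.
        cmd += ["--index-dir", str(index_dir)]
    elif genome_fasta is not None and annotation_gtf is not None:
        cmd += [
            "--genome-fasta", str(genome_fasta),
            "--annotation-gtf", str(annotation_gtf),
        ]
    else:
        # Assumption of this sketch: some reference must always be supplied.
        raise ValueError("give index_dir, or both genome_fasta and annotation_gtf")
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)` on a machine where moPepGen is installed.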

Arguments

-h, --help

show this help message and exit

-i, --input-path <file> Path

File path to FusionCatcher's output TSV file. Valid formats: ['.tsv', '.txt']

-o, --output-path <file> Path

File path to the output file. Valid formats: ['.gvf']

--max-common-mapping <number> int

Maximal number of common mapping reads.
Default: 0

--min-spanning-unique <number> int

Minimal spanning unique reads.
Default: 5
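
To make the two read-count thresholds concrete, here is a minimal Python sketch of how such a filter could work. It is not moPepGen's implementation. It assumes that the FusionCatcher TSV exposes the counts in columns named `Counts_of_common_mapping_reads` and `Spanning_unique_reads`, and that a record is kept when its common mapping reads do not exceed `--max-common-mapping` and its spanning unique reads are at least `--min-spanning-unique`. The function names are hypothetical.

```python
import csv

# Column names assumed for FusionCatcher's output TSV.
COMMON_COL = "Counts_of_common_mapping_reads"
UNIQUE_COL = "Spanning_unique_reads"


def passes_filters(row, max_common_mapping=0, min_spanning_unique=5):
    """Return True if a fusion record satisfies both thresholds (assumed semantics)."""
    return (
        int(row[COMMON_COL]) <= max_common_mapping
        and int(row[UNIQUE_COL]) >= min_spanning_unique
    )


def filter_fusions(handle, max_common_mapping=0, min_spanning_unique=5):
    """Read a tab-separated file handle and return the rows that pass the filters."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        row
        for row in reader
        if passes_filters(row, max_common_mapping, min_spanning_unique)
    ]
```

Under these assumptions and the default thresholds, a record with 0 common mapping reads and 5 spanning unique reads is kept, while one with 2 common mapping reads or only 3 spanning unique reads is dropped.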

--source str

Variant source (e.g. gSNP, sSNV, Fusion)

-g, --genome-fasta <file> Path

Path to the genome assembly FASTA file. Only ENSEMBL and GENCODE are supported. Its version must be the same as the annotation GTF and proteome FASTA.

-a, --annotation-gtf <file> Path

Path to the annotation GTF file. Only ENSEMBL and GENCODE are supported. Its version must be the same as the genome and proteome FASTA.

--reference-source str

Source of reference genome and annotation.
Choices: ['GENCODE', 'ENSEMBL']

--index-dir <file> Path

Path to the directory of index files generated by moPepGen generateIndex. If given, --genome-fasta, --proteome-fasta and --annotation-gtf will be ignored.

--debug-level <value|number> str

Debug level.
Default: INFO

-q, --quiet

Quiet
Default: False