parseArriba
parseArriba takes the fusion transcripts identified by Arriba and saves them as a GVF file. The GVF file can later be used to call variant peptides with callVariant.
Reference Version
The versions of the reference genome FASTA, proteome FASTA, and annotation GTF MUST be consistent across all analyses.
Usage
usage: moPepGen parseArriba [-h] -i <file> -o <file>
[--min-split-read1 <value>]
[--min-split-read2 <value>]
[--min-confidence <choice>] --source SOURCE
[-g <file>] [-a <file>]
[--reference-source {GENCODE,ENSEMBL}]
[--index-dir [<file>]]
[--debug-level <value|number>] [-q]
Parse the Arriba result to GVF format of variant records for moPepGen to call
variant peptides.
optional arguments:
-h, --help show this help message and exit
-i <file>, --input-path <file>
File path to Arriba's output TSV file. Valid formats:
['.tsv', '.txt'] (default: None)
-o <file>, --output-path <file>
File path to the output file. Valid formats: ['.gvf']
(default: None)
--min-split-read1 <value>
Minimal split_read1 value. (default: 1)
--min-split-read2 <value>
Minimal split_read2 value. (default: 1)
--min-confidence <choice>
Minimal confidence value. (default: medium)
--source SOURCE Variant source (e.g. gSNP, sSNV, Fusion) (default:
None)
--debug-level <value|number>
Debug level. (default: INFO)
-q, --quiet Quiet (default: False)
Reference Files:
-g <file>, --genome-fasta <file>
Path to the genome assembly FASTA file. Only ENSEMBL
and GENCODE are supported. Its version must be the
same as the annotation GTF and proteome FASTA
(default: None)
-a <file>, --annotation-gtf <file>
Path to the annotation GTF file. Only ENSEMBL and
GENCODE are supported. Its version must be the same as
the genome and proteome FASTA. (default: None)
--reference-source {GENCODE,ENSEMBL}
Source of reference genome and annotation. (default:
None)
--index-dir [<file>] Path to the directory of index files generated by
moPepGen generateIndex. If given, --genome-fasta,
--proteome-fasta and --annotation-gtf will be
ignored. (default: None)
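As an illustration of how the options above fit together, the following is a minimal Python sketch that assembles a parseArriba command line. The helper name build_parse_arriba_command and the file names in the usage example are hypothetical and are not part of moPepGen. Only flags listed in the help text are used. The sketch assumes that the FASTA and GTF options should be omitted whenever --index-dir is given, since the help text states they are ignored in that case.

```python
from pathlib import Path

INPUT_SUFFIXES = {".tsv", ".txt"}  # valid input formats per the help text
OUTPUT_SUFFIXES = {".gvf"}         # valid output format per the help text


def build_parse_arriba_command(input_path, output_path, source,
                               index_dir=None, genome_fasta=None,
                               annotation_gtf=None, min_confidence="medium"):
    """Return the argument list for a `moPepGen parseArriba` call.

    Hypothetical helper for illustration; it only builds the command
    and does not run it.
    """
    if Path(input_path).suffix not in INPUT_SUFFIXES:
        raise ValueError(f"unsupported input format: {input_path}")
    if Path(output_path).suffix not in OUTPUT_SUFFIXES:
        raise ValueError(f"unsupported output format: {output_path}")

    cmd = [
        "moPepGen", "parseArriba",
        "-i", str(input_path),
        "-o", str(output_path),
        "--source", source,
        "--min-confidence", min_confidence,
    ]
    if index_dir is not None:
        # Assumption: with --index-dir the reference file options are
        # ignored, so they are left out entirely.
        cmd += ["--index-dir", str(index_dir)]
    else:
        if genome_fasta is not None:
            cmd += ["-g", str(genome_fasta)]
        if annotation_gtf is not None:
            cmd += ["-a", str(annotation_gtf)]
    return cmd


# Usage with made-up file names; pass the list to subprocess.run(cmd, check=True)
# to actually execute it.
cmd = build_parse_arriba_command(
    "fusions.tsv", "arriba.gvf", source="Fusion", index_dir="index"
)
print(" ".join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.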
Arguments
-h, --help
show this help message and exit
-i, --input-path <file> Path
File path to Arriba's output TSV file. Valid formats: ['.tsv', '.txt']
-o, --output-path <file> Path
File path to the output file. Valid formats: ['.gvf']
--min-split-read1 <value> int
Minimal split_read1 value.
Default: 1
--min-split-read2 <value> int
Minimal split_read2 value.
Default: 1
--min-confidence <choice> str
Minimal confidence value.
Default: medium
Choices: ['low', 'medium', 'high']
--source str
Variant source (e.g. gSNP, sSNV, Fusion)
-g, --genome-fasta <file> Path
Path to the genome assembly FASTA file. Only ENSEMBL and GENCODE are supported. Its version must be the same as the annotation GTF and proteome FASTA.
-a, --annotation-gtf <file> Path
Path to the annotation GTF file. Only ENSEMBL and GENCODE are supported. Its version must be the same as the genome and proteome FASTA.
--reference-source str
Source of reference genome and annotation.
Choices: ['GENCODE', 'ENSEMBL']
--index-dir <file> Path
Path to the directory of index files generated by moPepGen generateIndex. If given, --genome-fasta, --proteome-fasta and --annotation-gtf will be ignored.
--debug-level <value|number> str
Debug level.
Default: INFO
-q, --quiet
Quiet
Default: False
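To make the three filtering options concrete, the sketch below shows one plausible reading of --min-split-read1, --min-split-read2, and --min-confidence: a fusion record is kept only if it meets all three minimums. This is an assumption about how the thresholds combine, not a copy of moPepGen's implementation. The column names split_reads1, split_reads2, and confidence follow Arriba's TSV output; the function name, the confidence ranking table, and the toy records with placeholder gene names are hypothetical.

```python
import csv
import io

# Assumed ordering of the --min-confidence choices, lowest to highest.
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def passes_thresholds(row, min_split_read1=1, min_split_read2=1,
                      min_confidence="medium"):
    """Return True if a fusion record meets all three minimums.

    Hypothetical helper; defaults mirror the documented defaults.
    """
    return (
        int(row["split_reads1"]) >= min_split_read1
        and int(row["split_reads2"]) >= min_split_read2
        and CONFIDENCE_RANK[row["confidence"]] >= CONFIDENCE_RANK[min_confidence]
    )


# Toy input with only the columns needed here and made-up gene names.
EXAMPLE_TSV = (
    "#gene1\tgene2\tsplit_reads1\tsplit_reads2\tconfidence\n"
    "GENE_A\tGENE_B\t5\t3\thigh\n"
    "GENE_C\tGENE_D\t0\t4\thigh\n"
    "GENE_E\tGENE_F\t2\t2\tlow\n"
)

reader = csv.DictReader(io.StringIO(EXAMPLE_TSV), delimiter="\t")
kept = [(r["#gene1"], r["gene2"]) for r in reader if passes_thresholds(r)]
print(kept)  # [('GENE_A', 'GENE_B')]
```

With the default settings, the second toy record is dropped because split_reads1 is below 1, and the third is dropped because its confidence is below medium.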