parseCIRCexplorer

parseCIRCexplorer takes the circRNA results identified by CIRCexplorer and saves them as a GVF file. The GVF file can later be used to call variant peptides with callVariant. Note that only known circRNAs are supported (*_circular_known.txt).

Reference Version

The versions of the reference genome FASTA, proteome FASTA, and annotation GTF MUST be consistent across all analyses.

Usage

usage: moPepGen parseCIRCexplorer [-h] -i <file> -o <file> [--circexplorer3]
                                  [--min-read-number <number>]
                                  [--min-fpb-circ <number>]
                                  [--min-circ-score <number>]
                                  [--intron-start-range <number>]
                                  [--intron-end-range <number>] --source
                                  SOURCE [-a <file>]
                                  [--reference-source {GENCODE,ENSEMBL}]
                                  [--index-dir [<file>]]
                                  [--debug-level <value|number>] [-q]

Parse CIRCexplorer result to a TSV format for moPepGen to call variant
peptides

optional arguments:
  -h, --help            show this help message and exit
  -i <file>, --input-path <file>
                        File path to CIRCexplorer's TSV output. Valid formats:
                        ['.tsv', '.txt'] (default: None)
  -o <file>, --output-path <file>
                        File path to the output file. Valid formats: ['.gvf']
                        (default: None)
  --circexplorer3       Using circRNA resutls from CIRCexplorer3 (default:
                        False)
  --min-read-number <number>
                        Minimal number of junction read counts. (default: 1)
  --min-fpb-circ <number>
                        Minimal CRICscore value for CIRCexplorer3. Recommends
                        to 1, defaults to None (default: None)
  --min-circ-score <number>
                        Minimal CIRCscore value for CIRCexplorer3. Recommends
                        to 1, defaults to None (default: None)
  --intron-start-range <number>
                        The range of difference allowed between the intron
                        start and the reference position. (default: -2,0)
  --intron-end-range <number>
                        The range of difference allowed between the intron end
                        and the reference position. (default: -100,5)
  --source SOURCE       Variant source (e.g. gSNP, sSNV, Fusion) (default:
                        None)
  --debug-level <value|number>
                        Debug level. (default: INFO)
  -q, --quiet           Quiet (default: False)

Reference Files:
  -a <file>, --annotation-gtf <file>
                        Path to the annotation GTF file. Only ENSEMBL and
                        GENCODE are supported. Its version must be the same as
                        the genome and proteome FASTA. (default: None)
  --reference-source {GENCODE,ENSEMBL}
                        Source of reference genome and annotation. (default:
                        None)
  --index-dir [<file>]  Path to the directory of index files generated by
                        moPepGen generateIndex. If given, --genome-fasta,
                        --proteome-fasta and --anntotation-gtf will be
                        ignored. (default: None)
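
As an illustration of the usage above, here is a minimal sketch that assembles a parseCIRCexplorer command line from Python. Only the flags come from the help text; the helper name, the file paths, and the "circRNA" source label are hypothetical placeholders.

```python
import shlex


def build_parse_circexplorer_cmd(input_path, output_path, index_dir, source,
                                 circexplorer3=False, min_read_number=None):
    """Assemble the argument list for `moPepGen parseCIRCexplorer`.

    Hypothetical helper: only the flags are taken from the help text above.
    """
    cmd = [
        "moPepGen", "parseCIRCexplorer",
        "-i", input_path,          # CIRCexplorer TSV output (.tsv or .txt)
        "-o", output_path,         # output GVF file
        "--index-dir", index_dir,  # directory created by moPepGen generateIndex
        "--source", source,        # label recorded as the variant source
    ]
    if circexplorer3:
        cmd.append("--circexplorer3")
    if min_read_number is not None:
        cmd += ["--min-read-number", str(min_read_number)]
    return cmd


# Placeholder paths and source label; replace with your own.
cmd = build_parse_circexplorer_cmd(
    "sample_circular_known.txt", "sample_circRNA.gvf", "index/", "circRNA",
    min_read_number=2,
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once moPepGen is installed and the index directory exists.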

Arguments

-h, --help

show this help message and exit

-i, --input-path <file> Path

File path to CIRCexplorer's TSV output. Valid formats: ['.tsv', '.txt']

-o, --output-path <file> Path

File path to the output file. Valid formats: ['.gvf']

--circexplorer3

Use circRNA results from CIRCexplorer3.
Default: False

--min-read-number <number> int

Minimum number of junction reads.
Default: 1

--min-fpb-circ <number> float

Minimum FPBcirc value for CIRCexplorer3. A value of 1 is recommended.
Default: None

--min-circ-score <number> float

Minimum CIRCscore value for CIRCexplorer3. A value of 1 is recommended.
Default: None
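
The three thresholds above can be read as a simple record filter. The sketch below is an assumption about how they combine, not moPepGen's actual implementation: the function and parameter names are hypothetical, a value equal to a threshold is assumed to pass, and a threshold of None is assumed to disable that check.

```python
def passes_filters(read_number, fpb_circ=None, circ_score=None,
                   min_read_number=1, min_fpb_circ=None, min_circ_score=None):
    """Return True if a circRNA record meets every active threshold.

    Assumptions (not confirmed by the documentation): comparisons are
    inclusive, and a threshold left as None is skipped.
    """
    if read_number < min_read_number:
        return False
    if min_fpb_circ is not None and fpb_circ < min_fpb_circ:
        return False
    if min_circ_score is not None and circ_score < min_circ_score:
        return False
    return True


# With the defaults, only the junction read count is checked.
print(passes_filters(read_number=3))
# With the recommended CIRCexplorer3 thresholds of 1, a low CIRCscore fails.
print(passes_filters(read_number=3, fpb_circ=2.5, circ_score=0.4,
                     min_fpb_circ=1, min_circ_score=1))
```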

--intron-start-range <number> str

The allowed range of differences between the intron start and the reference position, given as two comma-separated integers.
Default: -2,0

--intron-end-range <number> str

The allowed range of differences between the intron end and the reference position, given as two comma-separated integers.
Default: -100,5
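
To make the range format concrete, here is a minimal sketch of parsing a "lower,upper" string and testing a position against it. The function names are hypothetical, and both the sign convention (observed minus reference) and the inclusive bounds are assumptions rather than documented moPepGen behavior.

```python
def parse_range(text):
    """Parse a 'lower,upper' string such as '-2,0' into a pair of ints."""
    lower, upper = (int(part) for part in text.split(","))
    if lower > upper:
        raise ValueError(f"lower bound exceeds upper bound: {text!r}")
    return lower, upper


def within_range(observed, reference, allowed):
    """Check whether observed - reference falls inside the allowed range.

    Assumes inclusive bounds and an observed-minus-reference difference.
    """
    lower, upper = allowed
    return lower <= observed - reference <= upper


start_range = parse_range("-2,0")    # default of --intron-start-range
end_range = parse_range("-100,5")    # default of --intron-end-range

print(within_range(998, 1000, start_range))   # difference of -2
print(within_range(1001, 1000, start_range))  # difference of +1
print(within_range(2004, 2000, end_range))    # difference of +4
```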

--source str

Variant source (e.g. gSNP, sSNV, Fusion)

-a, --annotation-gtf <file> Path

Path to the annotation GTF file. Only ENSEMBL and GENCODE are supported. Its version must be the same as the genome and proteome FASTA.

--reference-source str

Source of reference genome and annotation.
Choices: ['GENCODE', 'ENSEMBL']

--index-dir <file> Path

Path to the directory of index files generated by moPepGen generateIndex. If given, --genome-fasta, --proteome-fasta and --annotation-gtf will be ignored.

--debug-level <value|number> str

Debug level.
Default: INFO

-q, --quiet

Quiet
Default: False