Var

Format

cg sv ?options? bamfile ?resultfile?

Summary

generic command to call structural variants based on the alignment in bamfile.

Description

The command can be used to call structural variants using different methods. The methods are implemented in separate commands such as cg sv_manta; The -method options determines which one is used. The resultfile is a tsv file containing the called structural variants. If ""resultfile is not given, a file name will be derived from the bam file name: If the bamfile has a name of the form map-alignmethod-sample.bam, the result file will will be var-svcaller-alignmethod-sample.tsv.zst. Other result files may also be created, depending on the method used.

cg sv can distribute structural variant calling by analysing separate regions in parallel and combining the results (-distrreg option). This command can be distributed on a cluster or using multiple cores with job options (more info with cg help joboptions)

Arguments

bamfile: alignment on which to call variants
resultfile: filename of the resulting structural variant calls

Options

-method svcaller: use the given method (svcaller) to call structural variants. Default svcaller is manta. Possible values are: manta, lumpy sniffles, npinv, cg, gridds
-refseq refseq: genomic reference sequence
-regionfile regionfile: only call variants in the region(s) given in regionfile
-regmincoverage mincoverage: If no regionfile is given, variants will be called only in regions with a coverage of at least mincoverage (default 3)
-distrreg: distribute regions for parallel processing (default s50000000). Possible options are 0: no distribution (also empty) 1: default distribution (s50000000) schr or schromosome: each chromosome processed separately chr or chromosome: each chromosome processed separately, except the unsorted, etc. with a _ in the name that will be combined), a number: distribution into regions of this size a number preceded by an s: distribution into regions targeting the given size, but breaks can only occur in unsequenced regions of the genome (N stretches) a number preceded by an r: distribution into regions targeting the given size, but breaks can only occur in large (>=100000 bases) repeat regions a number preceded by an g: distribution into regions targeting the given size, but breaks can only occur in large (>=200000 bases) regions without known genes g: distribution into regions as given in the <refdir>/extra/reg_*_distrg.tsv file; if this does not exist uses g5000000 a filename: the regions in the file will be used for distribution
-split 1/0: indicate if in the datafiles, multiple alternative genotypes are split over different lines (default 1)
-t number (-threads): number of threads used by each job (if the method supports threads)
-cleanup 0/1: Use 0 to no cleanup some of the temporary files (vcf, delvar, ...)

Home

Contact

Installation

Documentation

Var

Format

Summary

Description

Arguments

Options

Category