GenomeComb
This howto gives examples of how to view results in a projectdir using the cg viz GUI.
This howto uses the small example and test data set ori_mixed_yri_mx2, which can be downloaded from the genomecomb website. This data set was derived from publicly available exome and genome sequencing data by extracting only raw data covering the region of the MX2 gene (on chr21) and a part of the ACO2 gene (on chr22).
This howto expects a processed projectdir in tmp/mixed_yri_mx2. This can be created by following the directions in howto_process_project. Alternatively, you can copy it from the expected dir (adapt the path if needed):
cp -a expected/mixed_yri_mx2 tmp/mixed_yri_mx2
Start up cg viz using
cg viz tmp/mixed_yri_mx2/compar/annot_compar-mixed_yri_mx2.tsv.zst
This opens the annotated combined variant file (format described in tsv) using cg viz, allowing you to browse through the table (even if it is millions of lines long).
The first thing you may want to do is make the title/header row higher. You can do this by dragging its edge down. You can also make columns wider by dragging their edges.
You can use the Fields button to limit the number of fields you want to see. The list on the right of the dialog shows the currently displayed fields. The list on the left shows available fields. Sample specific fields are indicated by having a - followed by a sample suffix. We will select to display only a limited set of sample specific fields:
The sample fields have a specific format, e.g.
When combining sample results, process_project checks, for each variant that is not present in a sample's variant list, whether this is because the sample is actually reference at that position (zyg = r) or because the position is unsequenced (zyg = u), according to the criteria used. Other data, such as the coverage or quality of the "variant" call, is also added for the reference calls where possible. However, for unsequenced variants (zyg-* = u) many fields (e.g. quality) will remain empty (where there is a reference call but no variant call) or contain ? when completely unknown (e.g. gatk is not even called on regions with coverage < 5).
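The naming convention for sample specific fields can be illustrated with a minimal Python sketch. This is not genomecomb code: the helper split_sample_field and the example header are hypothetical, and the sketch assumes that general (non-sample) field names never contain a - themselves. Only the zyg codes mentioned in this howto are listed.

```python
# Meaning of the zyg codes mentioned in this howto (other codes exist).
ZYG_CODES = {
    "m": "homozygous",
    "r": "reference",
    "u": "unsequenced",
}

def split_sample_field(column):
    """Split a column name into (field, sample suffix).

    Returns (column, None) for fields without a sample suffix.
    Assumption: the first "-" separates the field name from the sample suffix.
    """
    field, sep, sample = column.partition("-")
    return field, (sample if sep else None)

# Hypothetical header, modelled on the field names used in this howto.
header = [
    "chromosome", "begin", "end", "type", "ref", "alt",
    "zyg-gatk-rdsbwa-gilNA19240mx2",
    "quality-gatk-rdsbwa-gilNA19240mx2",
]

# Group the sample specific fields by their sample suffix.
by_sample = {}
for col in header:
    field, sample = split_sample_field(col)
    if sample is not None:
        by_sample.setdefault(sample, []).append(field)

for sample, fields in by_sample.items():
    print(sample, fields)
```

Here the sample suffix gatk-rdsbwa-gilNA19240mx2 ends up with the fields zyg and quality, while chromosome, begin and the other general fields are left out.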
You can use Query to show only lines that fit a number of criteria. The query language is the same as that supported by cg select, and the specifics can be found in the cg select help. You can type a query directly into the Query field at the top, e.g. type $zyg-gatk-rdsbwa-gilNA19240mx2 == "m" and press Enter to select only variants that are homozygous gatk calls for sample gilNA19240mx2.
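To make the effect of such a query concrete, the following Python sketch applies the same condition to a tiny, invented tab-separated table. It only emulates what the query selects and is not how cg viz or cg select evaluate queries; the rows and their values are made up for illustration.

```python
import csv
import io

# Invented example data; only the column naming follows this howto.
tsv = (
    "chromosome\tbegin\tend\ttype\tref\talt\tzyg-gatk-rdsbwa-gilNA19240mx2\n"
    "21\t100\t101\tsnp\tA\tG\tm\n"
    "21\t200\t201\tsnp\tC\tT\tr\n"
    "22\t300\t301\tsnp\tG\tA\tu\n"
    "22\t400\t401\tsnp\tT\tC\tm\n"
)

rows = list(csv.DictReader(io.StringIO(tsv), delimiter="\t"))

# Plain-Python counterpart of: $zyg-gatk-rdsbwa-gilNA19240mx2 == "m"
homozygous = [r for r in rows if r["zyg-gatk-rdsbwa-gilNA19240mx2"] == "m"]

for r in homozygous:
    print(r["chromosome"], r["begin"], r["end"], r["ref"], r["alt"])
```

Only the two rows whose zyg value is m remain; the reference (r) and unsequenced (u) rows are filtered out.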
You can use the "Query" button to get help in building queries. The main part of the query builder is still a (larger) text field where you can edit the query as text.
The buttons and selection lists make it easy to add components to your query. You can for instance select one or more fields in the first list, an operator in the second and values in the third (some common/example values are in the list for selection). Clicking the and button then adds the query component made this way at the cursor position using "and" logic, while the or button does the same using "or" logic. The condition, field, value and comp (comparison) buttons add these parts of the selection at the cursor.
Using the functions button you can select from all supported functions. Double click to insert the function with parameters based on the currently selected fields, operator and values. You can always still edit the result. The button block on the right gives shortcuts to some common functions.
The "EasyQuery" button can be used for adding some common queries in an easier but less flexible way.
Select which fields to sort on; choose the ones with a - prefix for a reverse sort.
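The idea of a sort list in which a - prefix marks a reverse sort can be sketched in Python as follows. This is only an illustration under stated assumptions: sort_rows is a hypothetical helper, the rows are invented, and the plain Python comparison used here may order values differently from the sort used by genomecomb.

```python
def sort_rows(rows, keys):
    """Sort rows on several fields; a leading "-" on a key means reverse order.

    Uses the fact that Python's sort is stable: sorting on the last key
    first and the first key last gives a multi-field sort.
    """
    result = list(rows)
    for key in reversed(keys):
        reverse = key.startswith("-")
        name = key[1:] if reverse else key
        result.sort(key=lambda row: row[name], reverse=reverse)
    return result

# Invented example rows.
rows = [
    {"chromosome": "21", "begin": 100},
    {"chromosome": "22", "begin": 300},
    {"chromosome": "21", "begin": 200},
    {"chromosome": "22", "begin": 400},
]

# Ascending on chromosome, descending on begin within each chromosome.
ordered = sort_rows(rows, ["chromosome", "-begin"])
for row in ordered:
    print(row["chromosome"], row["begin"])
```

Only a - at the very start of a key is treated as the reverse marker, so sample specific field names that contain a - further on are not affected.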
The Summaries button can be used to create summary data. This provides functionality similar to the -g and -gc options in cg select (more info in the cg select help), but you can select fields etc. in the GUI.
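As a rough idea of what a simple summary computes, the Python sketch below counts how many variants have each zyg value for one sample. It is a plain-Python emulation on invented rows, not the genomecomb implementation, and it covers only the simplest case of grouping on a single field.

```python
from collections import Counter

# Invented example rows; only the field name follows this howto.
field = "zyg-gatk-rdsbwa-gilNA19240mx2"
rows = [
    {"begin": "100", field: "m"},
    {"begin": "200", field: "r"},
    {"begin": "300", field: "u"},
    {"begin": "400", field: "m"},
]

# Group on the zyg field and count the rows in each group.
counts = Counter(row[field] for row in rows)

for value, count in sorted(counts.items()):
    print(value, count)
```

The result has one line per distinct value of the grouping field, with the number of variants in that group.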
Make the tree view on the left larger by dragging the dividing line to the right. Here you can select other result files to view.