Bcol_make
Format
cg bcol make ?options? bcolfile column ?srcfile?
Summary
Creates a bcol file
Description
The bcol_make commands creates a bcol file
based on tab-separated data. The input data is given via stdin. The
data in one of the columns (given as a parameter) in the file is
converted to the binary format.
Values in the input file can be assigned to specific positions
using the poscol parameter, which indicates which field will contain
the position. A default value will be added in the bcol file for
positions skipped in the input file. If the endcol parameter is also
given, all values between first pos ition and end will be set to the
value in the value column. The optional chromosomecol parameter can
be used to indicate which field must be used to assign values to a
specific chromosome. If the input file contains fields with names
recognized for chromosome, begin or end, these will be automatically
used for these parameters. If these fields are present, but you do
not want to use them, set the parameters explicitely to empty
Multiple columns (of the same type) can be stored in one bcol
file, e.g. for variant annotation (a different score for each
potential alternative allele) using the -m and -l options.
Arguments
- bcolfile
- name of bcolfile to be created
- column
- Name of the column containing the data
- srcfile
- Name of the file used for providing input data (if not given,
default is from stdin)
Options
- -t type
- type of data, should be one described in the description of
the format (bcol)
- --precision precision
- default precision for display of data in bcol
- -d defaultvalue (--default)
- value that will be given for positions not in the input file
- -h header (--header)
- set to 0 (default is 1) if the input data does not have a
header (in this case chromosomecol and poscol must be given by
number, starting from 0)
- -c chromosomecol (--chromosomecol)
- name of the column that contains the chromosome
- -n chromosomename (--chromosomename)
- if no chromosomecol is present (no -c), this name will be
given as chromosome (default is empty)
- -p poscol
- name of the column that contains the positions
- -e endcol
- name of the column that contains the the end positions (in
case of regions, this cannot be used with -m)
- -co compressionlevel (--compress)
- compression level used (default 8): 1 is very fast; 8 is
slower, but compresses better, 0 can be used for no compression
- -l multilist (--multilist)
- a comma separated list with the names of all columns, e.g.
A,C,G,T for variant annotation. This option must be given together
with the next option (-m)
- -m multicol (--multicol)
- name of the column containing a list of names the values will
be assigned to. If a name in multilist is not present in
this list for a row, the default value will be assigned.
Category
Format Conversion