GenomeComb
cg fastq_split ?options? infile outtemplate
split a (large) fastq file in multiple smaller ones
cg fastq_split will split a (large) fastq file in several smaller fastq files. You can either specify the target number of sequences per output file (-numseq option), or the target number of files you want to generate (-parts option). outtemplate is a filename (which may contain a path) that determines the names of the files the parts of the fastq file will be written to. The file part will be prepended with p<part>_., e.g. if outtemplate is results/test.fastq.gz, the results files will be named: results/p1_test.fastq.gz results/p2_test.fastq.gz ...
The command supports parallel processing using the joboptions, mainly for the second part of the command, namely compressing the resulting fastq files. The first part (splitting up the file) is not parallellized, but can be run as a single job on a cluster when using the -parts option. Using -numseq, the first part is always run directly (not as a job), because the command cannot know up front how many parts will be generated, and thus not submit the follow up compression jobs
Conversion