FASTQ – Guessing between 4 formats

The FASTQ format description is a huge mess, and it is easy to get lost trying to work out which quality encoding your file uses. Because of that, I went looking and found a script at https://github.com/brentp/bio-playground/blob/master/reads-utils/guess-encoding.py that guesses the encoding of your files. Feed it the quality lines (every fourth line of the file) with the command:

awk 'NR % 4 == 0' your.fastq | python guess-encoding.py [options]

So far the script supports the following encodings, each given as the (minimum, maximum) ASCII code of its quality characters:

'Sanger': (33, 73),
'Solexa': (59, 104),
'Illumina-1.3': (64, 104),
'Illumina-1.5': (67, 104)
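
For my file the output was:

Solexa  Illumina-1.3    (66, 104)

The observed range (66, 104) fits inside both the Solexa and the Illumina-1.3 ranges, so both are reported. Illumina-1.5 is ruled out because 66 is below its minimum of 67, and Sanger because 104 is above its maximum of 73.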
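
To make the range check concrete, here is a minimal Python sketch of the idea. It is not the code of guess-encoding.py: the function names (quality_lines, observed_range, compatible_encodings) are made up for this illustration, and the table is copied from the list above.

```python
from itertools import islice

# (min, max) ASCII codes per encoding, copied from the list in this post.
RANGES = {
    "Sanger": (33, 73),
    "Solexa": (59, 104),
    "Illumina-1.3": (64, 104),
    "Illumina-1.5": (67, 104),
}


def quality_lines(lines):
    """Yield every 4th line (lines 4, 8, 12, ...), like awk 'NR % 4 == 0'."""
    return islice(lines, 3, None, 4)


def observed_range(qual_lines):
    """Return the (min, max) ASCII code seen over all quality characters."""
    lo, hi = None, None
    for line in qual_lines:
        codes = [ord(c) for c in line.rstrip("\r\n")]
        if not codes:
            continue  # skip empty lines
        lo = min(codes) if lo is None else min(lo, min(codes))
        hi = max(codes) if hi is None else max(hi, max(codes))
    return lo, hi


def compatible_encodings(lo, hi, ranges=RANGES):
    """List encodings whose allowed range contains the observed range."""
    return [name for name, (rmin, rmax) in ranges.items()
            if rmin <= lo and hi <= rmax]


if __name__ == "__main__":
    import sys
    # Hypothetical usage: python this_sketch.py < your.fastq
    lo, hi = observed_range(quality_lines(sys.stdin))
    if lo is not None:
        print("\t".join(compatible_encodings(lo, hi)), (lo, hi), sep="\t")
```

With this sketch the whole FASTQ file goes on standard input, so the awk step is not needed. A file that only contains high qualities can match several encodings, which is why the answer is a list and not a single name.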

There is also this script:

./SolexaQA.pl reads1.fastq
