FastQ file: the common output from NGS sequencers

Most NGS sequencers save their output as text files in FASTQ format. In the modern version of this format, each sequence is written on four lines:

  1. The first contains an “@” symbol followed by the sequence name
  2. The DNA sequence itself
  3. A separator line: a “+”, optionally followed by the sequence name (repeated)
  4. The quality line

An example of a single sequencing read written in FASTQ format is:

@SRR5232030.1 1 length=101
NATCAATAGTATTCGTACCAATAGAACGAATATCCGCCAGCACCATTTGTTTGGCGGCGTCGCCCACCACGACAATGGAAACCACCGACGCAATACCGATT
+
#>BBABFFFFFFGGGGGGGGGGHHHHHGGGGGHHHGGGGGGHHHHHHHHHGHHGHGGGGGGGGGGGGGHHGGGGGHHHHHGHHHGGGGGGGGFGHHHGGGG
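
The four-line layout described above can be read with a few lines of Python. This is only a minimal sketch: the names `FastqRecord` and `read_fastq` are made up for illustration, it assumes every record occupies exactly four lines (no wrapped sequences), and the demo record is a shortened, hypothetical read built from the first five bases of the example.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One sequencing read (hypothetical container for this sketch)."""
    name: str      # header line without the leading "@"
    sequence: str  # the DNA bases
    quality: str   # one quality character per base


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream, assuming 4 lines per record."""
    while True:
        header = handle.readline()
        if not header:                 # end of file
            return
        header = header.rstrip("\r\n")
        if not header:                 # tolerate blank lines between records
            continue
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")

        if not header.startswith("@"):
            raise ValueError(f"header does not start with '@': {header!r}")
        if not separator.startswith("+"):
            raise ValueError(f"separator does not start with '+': {separator!r}")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lines differ in length")

        yield FastqRecord(header[1:], sequence, quality)


# Shortened, hypothetical record: the first five bases of the read above.
demo = io.StringIO("@read1 demo\nNATCA\n+\n#>BBA\n")
records = list(read_fastq(demo))
print(records[0].name, records[0].sequence, records[0].quality)
```

For real data, an established parser such as Biopython's `SeqIO.parse(handle, "fastq")` is likely a safer choice than a hand-written reader.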

The quality line uses a single character to represent the Phred score of each base, in the same order as the sequence. This means that the quality of the tenth base is encoded in the tenth character of the quality line.
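
To make the encoding concrete, the sketch below turns quality characters back into numbers. It rests on an assumption the text does not state: that the file uses the common Phred+33 convention, where the score is the character's ASCII code minus 33. The helper names `phred_scores` and `error_probability` are hypothetical, and the input is the first five characters of the quality line in the example read.

```python
from typing import List


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Convert a quality string to Phred scores.

    Assumes the Phred+33 encoding (offset=33); older Illumina files
    used an offset of 64, so check which one applies to your data.
    """
    return [ord(char) - offset for char in quality]


def error_probability(score: int) -> float:
    """Probability that a base call is wrong: Q = -10 * log10(p)."""
    return 10 ** (-score / 10)


# First five quality characters of the example read.
scores = phred_scores("#>BBA")
print(scores)

# The i-th score belongs to the i-th base of the sequence line.
for base, score in zip("NATCA", scores):
    print(base, score, f"{error_probability(score):.4f}")
```

Under this assumption the leading “#” decodes to a score of 2, which fits the uncalled “N” at the start of the read, while the frequent “H” decodes to 39.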
