Illumina reads naming scheme

The three reads below were produced by an Illumina sequencer and are in FASTQ format. This post describes the read naming scheme used by Illumina sequencers.

@M02007:58:000000000-AW0NA:1:1101:11070:1384 1:N:0:CAGAGGCA+CTAAGCCT
CCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGTCCGCACTCCTTTTGCACCCCTTCCCCGTGTTTGAAGC
+
6BCCCFEE9)88B@@FE@FCFGD7@CCFE,6,C@,CC,,<,8+++;6;,,6,;,,CB+:,:6,9+8,6,:,,,,,
@M02007:58:000000000-AW0NA:1:1101:19460:1444 1:N:0:AAGAGGCA+CTACGCCT
CCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGCCCCACCCCTGTTCCAGCCTTCCCGCGTGTTTGTTCC
+
@CCCCGGG>)=@FFGG<FFGGGG7,C,EE9C9FE,C,,,;,8,+,86:,,6,<,,;C,9,:,,++8+6,6,,,,,
@M02007:58:000000000-AW0NA:1:1101:19666:1451 1:N:0:AAGAGGCA+CTAAGCCT
CCCTACGGGAGGCAGCAGTAGGGAATCTTCCACAATGGGCGAAACCCTGTTCGAGCACCTCCGCTTCCGTGTAGC
+
<BCCCFEG>)=@CFFC<EFFFGGFFEEFCFEDFE,C@,,;+++++;6;,,6,6,6B+,,,:6+,,8,,,::,,,,

First, remember that the name is the part after the “@” and before any space (for the first read above, that is M02007:58:000000000-AW0NA:1:1101:11070:1384). It must be unique within a single file.

As described on the Illumina website, the read name is composed of these parts (separated by colons):

  1. Instrument (e.g. M02007)
  2. Run number (e.g. 58)
  3. Flowcell ID (e.g. 000000000-AW0NA)
    (These three fields are constant within a single FASTQ file produced from a single flowcell.)
  4. Lane
  5. Tile
  6. X coordinate
  7. Y coordinate
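
As a minimal sketch of how these seven fields can be pulled apart, the Python snippet below splits a header line at the first whitespace and then at the colons. The names `IlluminaReadName` and `parse_read_name` are made up for this illustration, and the code assumes the header follows exactly the seven-field layout listed above.

```python
from typing import NamedTuple


class IlluminaReadName(NamedTuple):
    """Hypothetical container for the seven colon-separated fields."""
    instrument: str
    run_number: int
    flowcell_id: str
    lane: int
    tile: int
    x: int
    y: int


def parse_read_name(header: str) -> IlluminaReadName:
    """Parse the read name from a FASTQ header line.

    Assumes the seven-field layout described above; anything after the
    first whitespace (the comment) is ignored.
    """
    header = header.strip()
    if header.startswith("@"):
        header = header[1:]  # drop the FASTQ record marker
    name = header.split()[0]  # the name ends at the first whitespace
    fields = name.split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, found {len(fields)}: {name!r}")
    instrument, run, flowcell, lane, tile, x, y = fields
    return IlluminaReadName(
        instrument, int(run), flowcell, int(lane), int(tile), int(x), int(y)
    )


first = parse_read_name(
    "@M02007:58:000000000-AW0NA:1:1101:11070:1384 1:N:0:CAGAGGCA+CTAAGCCT"
)
print(first.instrument, first.run_number, first.flowcell_id, first.tile)
```

The flowcell ID is kept as a string because it can contain letters and a dash, while the remaining numeric fields are converted to integers.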

As you can see, the name is followed, after a space, by a “comment” that specifies, among other things, the index used (CAGAGGCA+CTAAGCCT for the first read above).
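
The comment can be split in the same way. The sketch below assumes the four-field layout that Illumina documents for this part of the header (read number, filter flag, control number, index sequence); the post itself only mentions the index, so treat the other field meanings as an assumption to check against your own data. The function name `parse_comment` is hypothetical.

```python
def parse_comment(header: str) -> dict:
    """Parse the comment of a FASTQ header line.

    Assumes the layout <read>:<is filtered>:<control number>:<index>,
    which is an assumption based on Illumina's documented format.
    """
    parts = header.strip().split(maxsplit=1)
    if len(parts) < 2:
        raise ValueError("header has no comment")
    # Split at most three times so the index stays in one piece.
    fields = parts[1].split(":", 3)
    if len(fields) != 4:
        raise ValueError(f"expected 4 comment fields: {parts[1]!r}")
    read, filtered, control, index = fields
    return {
        "read": int(read),               # 1 or 2 for paired-end data
        "is_filtered": filtered == "Y",  # "N" means the read was not filtered out
        "control_number": int(control),
        "index": index,                  # e.g. two indexes joined by "+"
    }


comment = parse_comment(
    "@M02007:58:000000000-AW0NA:1:1101:11070:1384 1:N:0:CAGAGGCA+CTAAGCCT"
)
print(comment["read"], comment["index"])
```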
