Cluster Density Optimization on Illumina Sequencing Instruments


The amount of DNA loaded onto a flow cell is a critical parameter in Illumina sequencing because it determines the density of the clusters that form. If you load too little DNA, you’re likely to under-cluster the flow cell. Under-clustering usually maintains data quality but results in lower data output. If you load too much DNA, clusters form too close together (over-clustering), resulting in poor image resolution and analysis problems. Over-clustered flow cells have lower Q30 scores and reduced data output. In both cases, the cost is lower data output. The goal of any sequencing run is to hit the Goldilocks balance between under- and over-clustering. In this guide, we summarize Illumina’s recommendations for each instrument and discuss procedures to prevent over- and under-clustering.

Table 1. Optimal flow cell loading concentrations and cluster density

| Illumina instrument | Reagent chemistry version | Recommended flow cell loading concentration | Recommended insert size (bp) | Raw cluster density (K/mm²) | Reference |
|---|---|---|---|---|---|
| HiSeq X | v2.5 | 250+ pM * | 350 and 450 | 1255 – 1412 | 1 |
| HiSeq 3000 / 4000 | N/A | 250+ pM * | 350 and 450 | 1310 – 1524 | 2 |
| HiSeq 2500 High Output | v3 | 8 – 10 pM | N/A | 610 – 678 | 3 |
| HiSeq 2500 High Output | v4 | 8 – 10 pM | N/A | 870 – 930 | 3 |
| HiSeq Rapid Run | v1, v2 | 8 – 10 pM | N/A | 700 – 820 | 3 |
| NextSeq | v2 High and Mid Output | 1.8 pM | N/A | 129 – 165 | 4 |
| MiniSeq | High and Mid Output | 1.8 pM | N/A | 129 – 165 | 5 |
| MiSeq | v2 | 6 – 10 pM | N/A | 865 – 965 | 6 |
| MiSeq | v3 | 6 – 20 pM | N/A | 1200 – 1400 | 6 |

* HiSeq 3000/4000 and HiSeq X use patterned flow cells (billions of nanowells arranged in a structured pattern). Uniform cluster spacing and density make these flow cells less sensitive to loading concentration, but under-loading a patterned flow cell still results in fewer reads passing filter and fewer unique reads. Overloading also lowers the number of reads passing filter.
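To compare a run's reported raw cluster density against the ranges in Table 1, a small lookup can help. The following is a minimal sketch: the dictionary keys and the function name `classify_density` are hypothetical labels chosen for illustration, and the ranges are simply transcribed from Table 1.

```python
# Raw cluster density ranges (K/mm^2) transcribed from Table 1.
# The (instrument, chemistry) keys are labels chosen for this sketch.
RECOMMENDED_DENSITY = {
    ("HiSeq X", "v2.5"): (1255, 1412),
    ("HiSeq 3000/4000", "N/A"): (1310, 1524),
    ("HiSeq 2500 High Output", "v3"): (610, 678),
    ("HiSeq 2500 High Output", "v4"): (870, 930),
    ("HiSeq Rapid Run", "v1, v2"): (700, 820),
    ("NextSeq", "v2 High and Mid Output"): (129, 165),
    ("MiniSeq", "High and Mid Output"): (129, 165),
    ("MiSeq", "v2"): (865, 965),
    ("MiSeq", "v3"): (1200, 1400),
}


def classify_density(instrument, chemistry, observed_k_per_mm2):
    """Compare an observed raw density with the Table 1 range for a platform."""
    low, high = RECOMMENDED_DENSITY[(instrument, chemistry)]
    if observed_k_per_mm2 < low:
        return "under-clustered"
    if observed_k_per_mm2 > high:
        return "over-clustered"
    return "within range"


print(classify_density("MiSeq", "v3", 1500))  # above the 1200-1400 range
```

The ranges are treated as hard cut-offs here only for simplicity; a run slightly outside them is not necessarily a failed run.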

Best Practices for Avoiding Over/Under Clustering

Properly quantify your library

Inaccurate library quantification is the most common cause of over- or under-clustering. The most effective method for quantifying an NGS library is qPCR. The primers used in qPCR are similar to those used in cluster generation, so only double-stranded fragments carrying the proper adapters on both ends are efficiently amplified and quantified. With qPCR, you don’t have to worry about partially ligated library fragments or primer dimers skewing the measured concentration. Fluorometric methods that detect only double-stranded DNA, such as Qubit, are fairly accurate. However, the dyes in these assays also bind partially ligated double-stranded fragments and adapter dimers, which can inflate the apparent concentration of a library. We don’t recommend using a Bioanalyzer or spectrophotometer for library quantification. While the Bioanalyzer is a good tool for determining library size, its concentration measurements are less accurate. An accurate measurement is critical for the subsequent serial dilution and loading of the flow cell.
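The dilution arithmetic that follows quantification can be sketched as below. This is only the C1 x V1 = C2 x V2 calculation, not a protocol: the function name `dilution_volumes`, the 4 nM stock, and the 600 µL final volume are hypothetical values chosen for illustration, and the actual denaturation and dilution steps should follow the instrument's own guide.

```python
def dilution_volumes(stock_nM, target_pM, final_volume_ul):
    """Return (library_ul, diluent_ul) for a single-step dilution, C1*V1 = C2*V2."""
    stock_pM = stock_nM * 1000.0  # 1 nM = 1000 pM
    if target_pM > stock_pM:
        raise ValueError("target concentration exceeds stock concentration")
    library_ul = target_pM * final_volume_ul / stock_pM
    return library_ul, final_volume_ul - library_ul


# Hypothetical example: 4 nM library (by qPCR) diluted to 10 pM in 600 uL.
library_ul, diluent_ul = dilution_volumes(stock_nM=4.0, target_pM=10.0, final_volume_ul=600.0)
print(f"{library_ul:.2f} uL library + {diluent_ul:.2f} uL diluent")
```

In this example the single-step library volume comes out at 1.5 µL, which is difficult to pipette accurately. That is one reason the dilution is normally done serially, and why any error in the starting concentration carries straight through to the loaded amount.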

Ensure your libraries are of high quality

Spurious library products (adapter dimers, primer dimers, singly ligated templates) and inaccurate library size measurements can inflate the measured concentration, causing you to underload the flow cell and fall short of the optimal cluster density. In some circumstances, calculating your library’s concentration using the wrong template size can instead lead you to overload the flow cell. We recommend using a microfluidic instrument such as the Bioanalyzer or LabChip to size DNA products and detect spurious library products. Dimers and partially ligated products can be easily removed using magnetic bead-based size selection.
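To see why the size estimate matters, consider converting a mass concentration (for example, from a fluorometric assay) to molarity. The sketch below uses the commonly used approximation of 660 g/mol per base pair of double-stranded DNA; the function name `ng_per_ul_to_nM` and the example values are hypothetical.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of double-stranded DNA


def ng_per_ul_to_nM(conc_ng_per_ul, mean_fragment_bp):
    """Convert a dsDNA mass concentration to molarity (nM) for a given mean size."""
    return conc_ng_per_ul / (AVG_BP_MASS * mean_fragment_bp) * 1e6


# Same hypothetical 2 ng/uL reading, interpreted with three different size estimates.
for assumed_bp in (300, 400, 500):
    print(assumed_bp, "bp ->", round(ng_per_ul_to_nM(2.0, assumed_bp), 2), "nM")
```

For the same mass reading, assuming a fragment size smaller than the true size overestimates the molarity (leading to underloading), while assuming a larger size underestimates it (leading to overloading).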

Check sequence diversity of your library

Sequence diversity refers to the proportion of each nucleotide at each position across the templates in a library. Libraries with an equal proportion of each nucleotide are considered balanced. The cluster density and flow cell loading recommendations in Table 1 assume that your library is sufficiently diverse. If your library is not sufficiently diverse or balanced, reduce the loading amounts recommended in Table 1 by at least 10 – 20%. The exact reduction needed depends on several factors, such as your library’s insert size and whether the first bases of read 1 are sufficiently diverse. Whenever you have a low-diversity library, consider spiking in a high-diversity library such as PhiX to increase the overall nucleotide diversity.
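As a rough illustration of the 10 – 20% reduction, the adjusted loading concentration can be computed as follows. The function name `adjusted_loading_range` is hypothetical, and the result is only a starting point, since the text notes that the exact reduction depends on the library.

```python
def adjusted_loading_range(recommended_pM, min_reduction=0.10, max_reduction=0.20):
    """Return (low_pM, high_pM) after reducing a recommended loading concentration."""
    if not 0.0 <= min_reduction <= max_reduction < 1.0:
        raise ValueError("reductions must satisfy 0 <= min <= max < 1")
    low_pM = recommended_pM * (1.0 - max_reduction)
    high_pM = recommended_pM * (1.0 - min_reduction)
    return low_pM, high_pM


# Hypothetical example: a low-diversity library on a platform loaded at 10 pM.
low_pM, high_pM = adjusted_loading_range(10.0)
print(f"load roughly {low_pM:.1f} to {high_pM:.1f} pM")
```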
