In this guide we define sequencing coverage as the average number of reads that align to known
reference bases, i.e. (number of reads × read length) / target size, assuming that reads are
randomly distributed across the genome. Elsewhere, coverage has also been defined in terms of
breadth (i.e. assembly size / target size) and as the empirical average depth of an assembly
(i.e. (number of reads × read length) / assembly size).
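The three definitions above can be written out as a short Python sketch. The function names and the example run (600 million reads of 150 bp against a 3 Gb target) are our own illustrative assumptions, not values taken from the table.

```python
def expected_coverage(num_reads: int, read_length: int, target_size: int) -> float:
    """Average depth: (number of reads x read length) / target size."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return num_reads * read_length / target_size


def breadth(assembly_size: int, target_size: int) -> float:
    """Breadth of coverage: assembly size / target size."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return assembly_size / target_size


def empirical_depth(num_reads: int, read_length: int, assembly_size: int) -> float:
    """Empirical average depth: (number of reads x read length) / assembly size."""
    if assembly_size <= 0:
        raise ValueError("assembly_size must be positive")
    return num_reads * read_length / assembly_size


# Hypothetical run: 600 million reads of 150 bp against a 3 Gb target.
print(expected_coverage(600_000_000, 150, 3_000_000_000))  # 30.0
```

Note that the first formula only gives an average; it relies on the assumption stated above that reads are randomly distributed across the genome.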
While in general more coverage means that each base is covered by a larger number of aligned
sequence reads, coverage and read requirements can depend on several of the following
parameters:
In the table below we address 1-4. Simply click on the detection methods or applications below and
adjust genome size, number of reads and read length to fit the organism you’re sequencing. The
coverage values below apply to most organisms while the read recommendations are for mammalian
species with genome sizes of ~3 Gb. If you’re working with a smaller genome, you can
proportionately scale down the number of reads to get an estimate. Note that the coverage you need
can depend heavily on the experiment you’re trying to perform. In many cases, biological replicates
offer more value than a large number of reads for a single sample. The values described below are
what others in the field have determined necessary and are meant to serve as a starting point. To
most accurately determine the coverage you need in an experiment, a sequencing saturation analysis
should be performed.
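As a rough illustration of the proportional scaling described above, the sketch below rescales a read recommendation given for a ~3 Gb genome to a different genome size. The function name and the 120 Mb example genome are hypothetical, and the result is only a first estimate; it does not replace a sequencing saturation analysis.

```python
REFERENCE_GENOME_SIZE = 3_000_000_000  # ~3 Gb mammalian genome assumed by the read recommendations


def scale_reads(recommended_reads: float,
                genome_size: float,
                reference_size: float = REFERENCE_GENOME_SIZE) -> float:
    """Linearly scale a read recommendation from the reference genome size to another genome size."""
    if genome_size <= 0 or reference_size <= 0:
        raise ValueError("genome sizes must be positive")
    return recommended_reads * genome_size / reference_size


# Hypothetical example: a 20M-read recommendation applied to a 120 Mb genome.
print(scale_reads(20_000_000, 120_000_000))  # 800000.0
```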
This is an evolving coverage guide, meaning our goal is to improve it with new applications and
citations. We’d love your feedback. You can contact us about this and other sequencing-related
material at: science@genohub.com.
Category | Detection or Application | Recommended Coverage (x) or Reads (millions) | References |
Whole genome sequencing | Homozygous SNVs | 15x | Bentley et al., 2008 |
| Heterozygous SNVs | 33x | Bentley et al., 2008 |
| INDELs | 60x | Feng et al., 2014 |
| Genotype calls | 35x | Ajay et al., 2011 |
| CNV | 1-8x | Xie et al., 2009; Medvedev et al., 2010 |
Whole exome sequencing | Homozygous SNVs | 100x (3x local depth) | Clark et al., 2011; Meynert et al., 2013 |
| Heterozygous SNVs | 100x (13x local depth) | Clark et al., 2011; Meynert et al., 2013 |
| INDELs | not recommended | Feng et al., 2014 |
Transcriptome Sequencing | Differential expression profiling | 10-25M | Liu Y. et al., 2014; ENCODE 2011 RNA-Seq |
| Alternative splicing | 50-100M | Liu Y. et al., 2013; ENCODE 2011 RNA-Seq |
| Allele specific expression | 50-100M | Liu Y. et al., 2013; ENCODE 2011 RNA-Seq |
| De novo assembly | >100M | Liu Y. et al., 2013; ENCODE 2011 RNA-Seq |
DNA Target-Based Sequencing | ChIP-Seq | 10-14M (sharp peaks); 20-40M (broad marks) | Rozowsky et al., 2009; ENCODE 2011 Genome; Landt et al., 2012 |
| Hi-C | 100M | Belton, J.M. et al., 2012 |
| 4C (Circularized Chromosome Conformation Capture) | 1-5M | van de Werken, H.J.G. et al., 2012 |
| 5C (Chromosome Conformation Capture Carbon Copy) | 15-25M | Sanyal A. et al., 2012 |
| ChIA-PET (Chromatin Interaction Analysis by Paired-End Tag Sequencing) | 15-20M | Zhang, J. et al., 2012 |
| FAIRE-Seq | 25-55M | ENCODE 2011 Genome; Landt et al., 2012 |
| DNase I-Seq | 25-55M | Landt et al., 2012 |
DNA Methylation Sequencing | CAP-Seq | >20M | Long, H.K. et al., 2013 |
| MeDIP-Seq | 60M | Taiwo, O. et al., 2012 |
| RRBS (Reduced Representation Bisulfite Sequencing) | 10x | ENCODE 2011 Genome |
| Bisulfite-Seq | 5-15x; 30x | Ziller, M.J. et al., 2015; Epigenomics Road Map |
RNA-Target-Based Sequencing | CLIP-Seq | 10-40M | Cho J. et al., 2012; Eom T. et al., 2013; Sugimoto Y. et al., 2012 |
| iCLIP | 5-15M | Sugimoto Y. et al., 2012; Rogelj B. et al., 2012 |
| PAR-CLIP | 5-15M | Rogelj B. et al., 2012 |
| RIP-Seq | 5-20M | Lu Z. et al., 2014 |
Small RNA (microRNA) Sequencing | Differential Expression | ~1-2M | Metpally RPR et al., 2013; Campbell et al., 2015 |
| Discovery | ~5-8M | Metpally RPR et al., 2013; Campbell et al., 2015 |
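To turn a coverage recommendation from the table into a read count, the coverage definition used in this guide can be inverted: reads = coverage × genome size / read length. The sketch below does this for the whole genome sequencing rows. The 3 Gb genome and 150 bp read length are assumed example inputs, the helper name is hypothetical, and each read is counted individually (so a read pair counts as two reads).

```python
import math

# Coverage recommendations copied from the whole genome sequencing rows of the table.
WGS_COVERAGE = {
    "Homozygous SNVs": 15,
    "Heterozygous SNVs": 33,
    "Genotype calls": 35,
    "INDELs": 60,
}


def reads_for_coverage(coverage: float, genome_size: int, read_length: int) -> int:
    """Reads needed for a target average coverage: coverage x genome size / read length."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return math.ceil(coverage * genome_size / read_length)


# Assumed inputs for illustration: a 3 Gb genome sequenced with 150 bp reads.
for application, coverage in WGS_COVERAGE.items():
    reads = reads_for_coverage(coverage, 3_000_000_000, 150)
    print(f"{application}: {coverage}x -> {reads / 1e6:.0f}M reads")
```

These numbers are averages under the random-distribution assumption; actual requirements depend on library quality and the experiment being performed.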