Whole genome sequencing (WGS) refers to the comprehensive examination of a genome by reading and
stitching together short fragments to determine an organism’s complete chromosomal (nuclear) and
mitochondrial DNA sequence. De novo sequencing refers to sequencing a novel genome when a
reference or template sequence is not available. Sequencing reads are assembled as contigs
(contiguous consensus sequences from collections of overlapping reads). Once a de novo
genome has been completely sequenced, assembled and annotated, a draft or common reference sequence
is generated. Focused sequencing approaches such as exome or targeted resequencing are frequently
used to determine genomic variations such as single-nucleotide polymorphisms (SNPs), copy number
variations (CNVs), rearrangements and indels. While WGS is well suited on its own to determine
genomic variations, the sequencing depth afforded by focused or targeted resequencing is currently
achieved significantly more cost effectively.
Typically, WGS generates a single consensus sequence. This sequence does not distinguish between
variants on homologous chromosomes. Genome phasing assigns alleles to the maternal and paternal
chromosomes, providing haplotype information. Phased sequencing is important in genetic disorders
where it matters whether disrupted alleles lie in cis (on the same chromosome copy) or in trans
(on opposite copies). It is ideal for studies where variant linkage and allele expression are
important. For phasing applications we recommend a 10X Genomics approach that combines
GemCode/Chromium with Illumina HiSeq 2x125
reads. Deliverables of this service include 186 Gb of sequencing data which is approximately
a 48x genome.
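As a minimal sketch of what phasing adds, the snippet below classifies two heterozygous variants as cis or trans from phased genotype strings. It assumes the VCF-style convention in which `0|1` and `1|0` place the alternate allele on the first or second haplotype, and that both variants belong to the same phase block. The function names are hypothetical and are not part of any 10X Genomics software.

```python
def parse_phased_genotype(gt: str) -> tuple:
    """Split a phased genotype such as '0|1' into (haplotype 1 allele, haplotype 2 allele)."""
    if "|" not in gt:
        # A '/' separator (e.g. '0/1') means the genotype is unphased
        raise ValueError(f"genotype {gt!r} is not phased")
    first, second = gt.split("|")
    return int(first), int(second)


def cis_or_trans(gt_a: str, gt_b: str) -> str:
    """Classify two heterozygous variants from the same phase block (assumption)."""
    a = parse_phased_genotype(gt_a)
    b = parse_phased_genotype(gt_b)
    # The question only makes sense when both variants are heterozygous
    if sorted(a) != [0, 1] or sorted(b) != [0, 1]:
        raise ValueError("both variants must be heterozygous (0|1 or 1|0)")
    # Same haplotype carries both alternate alleles -> cis, otherwise trans
    return "cis" if a == b else "trans"
```

An unphased consensus call would report both variants simply as heterozygous, so the two cases above would be indistinguishable.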
Targeted sequencing is one of the most popular applications of next generation sequencing. Targeted
sequencing can be broken into three different approaches:
While whole genome sequencing and re-sequencing represent ~90% of all DNA based sequencing
applications, it’s important not to lose sight of the many newer protocols available to count or
detect epigenomic features. These include genotyping and measuring DNA-protein interactions and
epigenetic markers. Several examples of these protocols are listed below:
- DNA Extraction- The first step in any DNA-seq workflow is purifying DNA from cellular, plasma,
viral or microbiome samples. Isolation of DNA must be optimized so that the purified product has
high yield, purity and integrity. Methods to extract DNA from a sample can be broken down into the
following categories:
- Organic extraction
- Silica membrane
- Filter plate
- Magnetic beads
The DNA extraction method used is critical, especially for hard-to-extract samples. Specialized
procedures are available for DNA extraction from buccal swabs, cultured cells, tissue, food and
feed, cell-free DNA in plasma, and formalin-fixed, paraffin-embedded (FFPE) samples. If you need
help, fill out our complimentary consultation form and we'll be happy to offer our recommendations.
- DNA QC- Once DNA has been extracted, its concentration, size and integrity need to be measured.
Concentration is usually measured on a spectrophotometer or a fluorometric detection system
(e.g. Qubit). DNA is typically run on an agarose gel to examine size and integrity.
In lieu of agarose gels, several microfluidic instruments are available that produce an
electropherogram showing concentration, yield and size. Examples include the Bioanalyzer,
TapeStation, LabChip and Fragment Analyzer.
- Ordering Sequencing and/or Library Prep Services- Quotes for DNA-Seq services can be obtained
instantly on Genohub.
- DNA Sample Submission- Typically 100 to 1000 nanograms of DNA are required for
whole genome or whole exome sequencing. Targeted panels or amplicon based sequencing can use as
little as 1 to 10 ng of input material. Other applications will have specific input
requirements.
See our guide for recommendations on
shipping DNA samples.
- DNA Library Preparation- DNA or fragment DNA library preparation methods are
available for sequencing whole genomes, as well as targeted regions within genomes, e.g.
ChIP-seq, Methyl-seq and Amplicon-seq. The two main DNA library preparation methods are
ligation, where adapters are ligated onto end-repaired inserts, and transposition, where a
transposase simultaneously fragments and tags DNA in a single reaction called “tagmentation”
(Nextera chemistry). Tagmentation improves upon ligation based methods by combining several
library prep steps into one reaction. However, the tagmentation protocol is very sensitive to the
amount and length of starting DNA used. Conditions such as temperature and reaction time must be
tightly controlled and attention must be paid to biases introduced by any enzymatic protocol.
Library preparation kits that employ either ligation or tagmentation are described below.
- Sequencing- Parameters for your sequencing run will depend on your experiment.
For whole genome sequencing we generally recommend at least 30x coverage of a
human genome using a minimum of 2x150 bp reads. Longer PacBio or Roche 454 reads combined with
short Illumina reads are useful for obtaining longer contigs and closing gaps in a genome. See our
coverage guide for more information.
- Data Analysis- Data analysis requirements vary based on your application. They
range from processing sequencing reads from an instrument to aggregating and mining data
across multiple sample types. Data analysis can be categorized into three broad stages:
primary, secondary and tertiary analysis. To learn more about the types of DNA data analysis
available we recommend reading our bioinformatics data analysis listings page.
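To make the coverage recommendation in the Sequencing step concrete, here is a rough sketch of the arithmetic. It assumes coverage is simply total sequenced bases divided by genome size, and it assumes a human genome size of about 3.1 Gb. The function name is hypothetical and the calculation ignores duplicates, unmapped reads and uneven coverage.

```python
import math


def read_pairs_for_coverage(genome_size_bp, coverage, read_length_bp, paired=True):
    """Estimate how many reads (or read pairs) are needed to reach a target mean coverage."""
    if genome_size_bp <= 0 or coverage <= 0 or read_length_bp <= 0:
        raise ValueError("all inputs must be positive")
    bases_needed = genome_size_bp * coverage
    # A paired-end fragment contributes two reads worth of bases
    bases_per_fragment = read_length_bp * (2 if paired else 1)
    return math.ceil(bases_needed / bases_per_fragment)


# Assumed human genome size of ~3.1 Gb, 30x coverage, 2x150 bp reads
HUMAN_GENOME_BP = 3_100_000_000
pairs = read_pairs_for_coverage(HUMAN_GENOME_BP, 30, 150)
print(f"{pairs:,} read pairs")  # 310,000,000 read pairs
```

In practice it is worth ordering some margin above this estimate, since not every read is usable.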
DNA-Seq Library Preparation Kits
Beckman SPRIworks HT (Illumina-compatible)
This high throughput library prep kit is designed for use with the Beckman Coulter Biomek FXP
liquid handler. The kit contains enough reagents to construct 96 libraries in less than 6 hours,
or 3 hours if size selection is not performed. The automated protocol offers three SPRI size
selection options for recovering 150-350 bp, 250-450 bp and 350-700 bp insert sizes. Supported
sample inputs include at least 1 µg of sheared DNA, genomic DNA, cDNA or amplicons. While the kit
cost is reasonable, you will need an upfront investment to purchase the Biomek FXP liquid handler.
Protocol Overview:
- End repair
- Adenylation
- Ligation
- Optional bead size selection
- PCR (Automation optional, T-robot on Biomek FXP)
Bioo Scientific NEXTflex PCR-Free (Illumina-compatible)
Amplification biases and dropouts in coverage in high GC and AT rich genomic regions are the main
reasons why users would want to use this kit. While several polymerases claim to decrease gaps
in coverage and handle GC/AT rich regions, the standard to which each polymerase is benchmarked
is a PCR-free library. Launched in 2011, this kit is based on the Kozarewa et al., 2009 paper
which first described the approach. Taking advantage of adapters which contain flow cell and
primer binding regions, the user is able to stop the library construction process after adapter
ligation. Reduced library bias and fewer gaps in coverage allow users to prepare libraries from
genomes ranging from difficult, small bacterial genomes to whole human genomes. Because the library
is not amplified, the user must supply 500 ng to 2 µg of genomic DNA. The procedure takes
approximately 5 hours, with 4 hours of hands on time. While eliminating amplification may lead
you to think the procedure is faster, users are required to perform qPCR post ligation for
quantitation. Yields after ligation are typically sub-nanomolar, requiring careful dilution before
flow cell loading. Users are able to multiplex up to 96 samples using single indexed barcodes (6
or 8 base index).
Protocol Overview:
- Acoustic DNA shearing
- End-repair
- Adenylation
- Ligation
- Gel-free or gel size-selection
- qPCR quantitation
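Since PCR-free yields are described above as sub-nanomolar, a short sketch of the unit conversion and dilution arithmetic may help. It assumes the common approximation of 660 g/mol per base pair of double stranded DNA and the standard C1V1 = C2V2 dilution relation. The function names and example numbers are hypothetical, and qPCR remains the quantitation method the kit calls for.

```python
def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA mass concentration (ng/µL) to molarity (nM).

    Assumes an average mass of ~660 g/mol per base pair.
    """
    if mean_fragment_bp <= 0:
        raise ValueError("mean fragment length must be positive")
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_fragment_bp)


def dilution_volumes(stock_nm, target_nm, final_volume_ul):
    """Return (stock volume, diluent volume) in µL using C1*V1 = C2*V2."""
    if target_nm > stock_nm:
        raise ValueError("cannot dilute to a concentration above the stock")
    stock_ul = target_nm * final_volume_ul / stock_nm
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical library: 0.1 ng/µL with a 500 bp mean fragment length (adapters included)
print(round(library_molarity_nm(0.1, 500), 3))  # 0.303
```

A library at 0.1 ng/µL and 500 bp works out to roughly 0.3 nM, which illustrates why little room is left for dilution error.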
Bioo Scientific NEXTflex Rapid DNA-Seq (Illumina-compatible)
The NEXTflex Rapid DNA-Seq kit is a faster and more versatile kit compared to its predecessor,
the NEXTflex DNA-Seq kit. The kit accommodates DNA inputs between 1 ng and 1 µg and can produce
sequenceable libraries in under 2 hours with as few as 6 cycles of PCR. While similar in library
completion time to Nextera, the Rapid DNA-Seq kit is ligation based and does not use
transposases. The End Repair and Adenylation steps are combined into a single reaction reducing
time and bead cleanups. The kit contains 5 bead based size selection options post ligation:
300-400 bp, 350-500 bp, 400-600 bp, 500-700 bp and 650-800 bp. The kit utilizes “enhanced
adapter ligation technology” and offers compatibility with clinical FFPE and degraded DNA
samples. As with the earlier NEXTflex DNA-Seq kit, this kit is compatible with up to 96 barcoded
adapters.
Protocol Overview:
- End repair & adenylation
- Ligation
- PCR
Illumina TruSeq Nano Kit
As one of the most widely adopted library preparation kits on the market, the TruSeq Nano kit has
been thoroughly validated for use with many different genome types. The procedure takes about 1 day
to perform with ~8 hours of hands on time. Users are able to multiplex up to 24 samples using single
indexed barcodes (6 base index) or up to 96 using dual indexed barcodes (8 base indices).
Protocol Overview:
- Acoustic DNA shearing
- End-repair
- Adenylation
- Ligation
- Gel-free or gel size selection
- PCR
- QC
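To illustrate how index barcodes allow pooled samples to be separated again after sequencing, here is a minimal demultiplexing sketch. It assigns an index read to a sample when exactly one barcode lies within an allowed number of mismatches (Hamming distance). The sample names and 6 base sequences are made up for illustration and are not taken from any kit documentation. Production demultiplexers are considerably more involved.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read, sample_indexes, max_mismatches=1):
    """Return the sample whose barcode matches the index read, or None.

    A read is assigned only when exactly one barcode is within
    max_mismatches, so ambiguous reads are left unassigned.
    """
    hits = [
        sample
        for sample, barcode in sample_indexes.items()
        if hamming(index_read, barcode) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


# Hypothetical 6 base single-index barcodes
SAMPLE_INDEXES = {
    "sample_A": "ATCACG",
    "sample_B": "CGATGT",
    "sample_C": "TTAGGC",
}

print(assign_sample("ATCACT", SAMPLE_INDEXES))  # sample_A (one mismatch tolerated)
print(assign_sample("GGGGGG", SAMPLE_INDEXES))  # None (no barcode is close enough)
```

Tolerating a mismatch is only safe when the barcodes in a pool differ from each other at several positions, which is one reason to choose index sets carefully.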
Illumina Nextera XT
The Nextera XT kit was designed for preparing sequence ready libraries from samples consisting of
small genomes (bacteria, archaea, viruses), PCR amplicons and plasmids. Library preparation takes
90 minutes and only requires 1 ng of sample input. The kit uses a single transposase enzymatic
reaction to simultaneously fragment DNA and add adapters, and recommends as few as 12 cycles of
PCR. The kit includes a bead based normalization method that equalizes library amounts prior to
pooling and sequencing. This reduces the need to perform a lengthy qPCR step to measure library
concentration. The kit has barcoding options allowing the user to pool up to 96 samples together.
Protocol Overview:
- Tagmentation of genomic DNA
- PCR amplification
- Library normalization and pooling
Illumina TruSeq DNA PCR-Free
Amplification biases and dropouts in coverage in high GC and AT rich genomic regions are the main
reasons why users would want to use this kit. While several polymerases are now claiming to
decrease gaps in coverage and handle GC/AT rich regions, the standard to which each polymerase is
benchmarked is a PCR-free library. Launched in 2013, this kit is based on the Kozarewa et al., 2009
paper which first described the approach. Taking advantage of adapters which contain flow cell and
primer binding regions, the user is able to stop the library construction process after adapter
ligation. Reduced library bias and fewer gaps in coverage allow users to prepare libraries from
genomes ranging from difficult, small bacterial genomes to whole human genomes. Because the library
is not amplified, the user must supply at least 1-2 µg of genomic DNA. The procedure takes
approximately 5 hours, with 4 hours of hands on time. While eliminating amplification may lead you
to think the procedure is faster, the mandatory qPCR post ligation offsets the time saved. Yields
after ligation are typically sub-nanomolar, requiring careful dilution before flow cell loading.
Users are able to multiplex up to 24 samples using single indexed barcodes (6 base index) or up to
96 using dual indexed barcodes (8 base indices).
Protocol Overview:
- Acoustic DNA shearing
- End-repair
- Adenylation
- Ligation
- Gel-free or gel size selection
- qPCR quantitation
Kapa DNA Library (Illumina-compatible)
The Kapa DNA kit contains a novel enzyme, Kapa HiFi HotStart DNA Polymerase, which enables
amplification across a wide range of genomes with varying GC content. The enzyme is claimed to
reduce sequence bias and improve the uniformity of sequence coverage. The kit manual asks for
1-5 µg of sheared input dsDNA and follows a ligation based approach to adding adapters and
constructing Illumina compatible libraries. Kits are available in 10 and 50 reaction sizes with
options to order with Kapa's Real Time PCR Library quantification kits. If your sample genome has
considerable GC content resulting in dropouts in coverage, the Kapa HiFi HotStart DNA Polymerase is
a popular and effective option for amplification. Adapter barcodes are not provided with the
Illumina compatible kits.
Protocol Overview:
- End repair
- A-tailing
- Adapter ligation
- PCR
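Because several kits in this guide are chosen on the basis of GC content, it can be useful to check how GC rich a genome or region actually is before picking a polymerase or a PCR-free workflow. The sketch below computes the GC fraction of a sequence and of fixed-size windows along it. The function names and window sizes are hypothetical, and ambiguous bases such as N are excluded from the calculation by assumption.

```python
def gc_fraction(seq):
    """Fraction of called bases (A, C, G, T) that are G or C; 0.0 if none are called."""
    called = [base for base in seq.upper() if base in "ACGT"]
    if not called:
        return 0.0
    return sum(base in "GC" for base in called) / len(called)


def windowed_gc(seq, window, step):
    """Return (start, gc_fraction) for each full window along the sequence."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


print(gc_fraction("GGCCAATT"))        # 0.5
print(windowed_gc("GGGGAAAA", 4, 4))  # [(0, 1.0), (4, 0.0)]
```

Windows with very high or very low GC fractions are the regions where amplification dropouts are most likely to appear.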
NEB NEBNext Ultra DNA (Illumina-compatible)
Outperforming its predecessor, the NEBNext DNA library kit, the NEBNext Ultra reduces library
preparation time from 6 hours to under 3 hours, while allowing as little as 5 ng of input DNA.
While the kit does not include barcoded adapters, it is compatible with NEB’s Multiplex Oligos for
Illumina indexing. The indexed adapters contain a stem loop structure and are ligated immediately
after adenylation. The loop of the adapter contains a modified base that is cleaved using NEB’s
USER enzyme, revealing primer binding sites for amplification. 24 barcoded adapters are available
for multiplexing applications. The kit also contains NEBNext High Fidelity 2X PCR Master Mix, which
is designed to reduce GC bias.
Protocol:
- End repair / dA-Tailing
- Adapter ligation
- USER adapter cleavage
- PCR
NuGEN Encore NGS Library
This kit has been replaced with the NuGEN Encore Rapid Library Kit.
NuGEN Encore Rapid Library (Illumina-compatible)
The Encore Rapid Library Kit is designed to produce libraries from as little as 100 ng of double
stranded DNA or double stranded cDNA without PCR amplification. This workflow makes it compatible
with several applications, including RNA-Seq, Digital Gene Expression (DGE), genomic DNA sequencing
and amplicon sequencing. The Encore Rapid system is designed for integration into NuGEN’s Ovation
RNA-Seq System v2 and Ovation RNA-Seq FFPE systems. The absence of amplification steps makes this
protocol suited for analyzing genomes that have high GC content. Multiplexing is possible by
purchasing a separate barcode module. This allows the user to multiplex up to 96 samples using
inline or dedicated barcode designs.
Protocol:
- Fragment
- End-repair
- Add adapters and ligate
- Final repair
Life Tech Ion Plus Fragment (for the Ion Torrent platform)
The Ion Plus Fragment Kit is designed to produce Ion Torrent (PGM) compatible libraries with as
little as 100 ng of input DNA. The kit contains a proprietary Ion Shear™ enzyme which enzymatically
shears genomic DNA, eliminating the need to do this mechanically. As a result, the procedure can be
performed in less than 2 hours.
Protocol:
- Enzymatic shearing of DNA
- End-repair
- Blunt ended ligation of barcoded adapters
- Nick repair
- PCR amplification
PacBio DNA Template Prep (for the PacBio platform)
The PacBio Template Prep Kits require as little as 250 ng of sheared DNA input to create libraries
with insert sizes of 250 bp to <3 kb or 3 kb to 10 kb. The kit ligates unique universal hairpin
adapters (SMRTbell) onto double stranded DNA fragments. The SMRTbell template preparation method
creates a structurally linear and topologically circular DNA template, enabling consensus
sequencing of the same template. Once templates or libraries are constructed, single molecule,
real time sequencing can begin.
Protocol:
- Fragmentation
- End-repair
- A-tailing
- Ligation of adapters
- Annealing of sequencing primer to templates
- Polymerase binding
When to use paired-end or single reads in DNA-Seq applications
Paired end Illumina sequencing refers to sequencing both ends of a DNA fragment, while single end
or single read sequencing refers to sequencing from one end of a fragment. Single end sequencing is
usually sufficient for counting applications. For de novo whole genome sequencing, phased
sequencing or targeted sequencing, paired-end reads are recommended as they are more likely to
align well to a reference genome. Paired-end reads form longer contigs for de novo sequencing and
help fill in gaps in the consensus sequence. DNA alignments across repetitive regions are improved
with paired end reads, as is detection of rearrangements, indels and variants. The cost differences
and the importance of paired-end vs. single end for RNA applications can be found in our sequencing
guide.
Data analysis expectations
Data from a DNA-Seq run can be delivered as raw or 'analyzed'. Below are the data deliverables you
can expect from a whole genome sequencing service:
- Raw Data- Raw reads are typically delivered in FASTQ file format, which provides each read
together with its Phred quality scores.
- Quality of run- FastQC offers quality control checks on raw sequencing data so you can determine
whether to proceed with further analysis. These include base and sequence quality scores, GC
content, N content, length distribution, duplication levels, over-represented sequences and
k-mer content.
- Variant Calls and Alignments- Mapped reads are provided in BAM file format, while variant
calls, including SNVs, CNVs, indels and SVs, are provided in VCF file format.
- Annotations- Detailed information about breakpoints and interpretation of variants is often
provided in a CNS file format.
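As a small sketch of what the raw data deliverable contains, the code below reads four-line FASTQ records and decodes their Phred quality strings. It assumes the Phred+33 ASCII encoding used by current Illumina output, where a score Q corresponds to an error probability of 10^(-Q/10). The function names and the example read are hypothetical. Real pipelines would normally use an established parser and FastQC for this.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) tuples from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line is not needed
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield header[1:], seq, qual


def phred_scores(qual, offset=33):
    """Decode a quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in qual]


def mean_error_probability(qual):
    """Average per-base error probability, using P = 10 ** (-Q / 10)."""
    scores = phred_scores(qual)
    return sum(10 ** (-q / 10) for q in scores) / len(scores)


# Hypothetical single-read FASTQ content
example = io.StringIO("@read1\nACGT\n+\nII5!\n")
for name, seq, qual in read_fastq(example):
    print(name, seq, phred_scores(qual))  # read1 ACGT [40, 40, 20, 0]
```

A score of 40 corresponds to an expected error rate of 1 in 10,000 bases, while a score of 20 corresponds to 1 in 100.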
Whole human genome sequencing and re-sequencing services and costs
A. Whole genome library preparation and sequencing services cost between $1,700 and $1,900 per
sample
- 30x coverage guaranteed
- 2 week guaranteed turnaround time
- 2x150 bp paired end sequencing
- Sample QC, library prep QC and DNA library preparation
- Data delivered as FASTQ files with SNP/INDEL, copy number variation, and structural variation
reports
- Unlimited data storage
Search for Whole Genome Sequencing Services
B. Whole animal and plant genome library preparation and sequencing services cost between $1,700
and $1,800 per sample
- 700 million paired end reads per sample guaranteed
- 2 week guaranteed turnaround time
- Includes sample QC, library prep QC and DNA library preparation
- Data delivered as FASTQ files
- Unlimited data storage
Search for Whole Plant and Animal Genome Sequencing Services