Whole Transcriptome and mRNA Sequencing Guide

Library and data analysis recommendations, kits, services and costs



RNA-Seq

The term RNA-Seq refers to a next generation sequencing approach that offers a snapshot of the entire transcriptome or messenger RNA (mRNA) profile at a given moment in time. RNA-Seq allows for the detection of transcript isoforms, allele-specific gene expression, gene fusions, and single nucleotide variants, all without prior knowledge of the sample’s sequence composition. The term is something of a misnomer, because in most workflows the RNA itself is not sequenced directly. Single RNA strands are reverse transcribed into complementary DNA (cDNA), which is then converted to double-stranded DNA before being sequenced. So while the starting input material is RNA, the material loaded on the sequencing instrument is DNA.

Most applications of RNA-Seq fall under two broad categories:

Whole transcriptome sequencing

Whole transcriptome sequencing by RNA-Seq involves a snapshot measurement of the complete complement of transcripts in a cell, including mRNA and all non-coding RNAs. By looking at the whole transcriptome, researchers are able to determine global expression levels of each transcript, identify exons and introns, and map their boundaries. In addition, RNA-Seq can be used to identify splicing variants. To accurately look at the whole transcriptome, most library preparation protocols start with the removal of ribosomal RNA (rRNA), which otherwise takes up the majority of all sequencing reads. Assuming you’re not interested in ribosomal RNA, removing these transcripts allows more of the sequencing reads to be focused on the transcripts you actually want to sequence, giving you improved sensitivity toward lowly expressed transcripts.

Messenger RNA-Seq or mRNA-Seq

Messenger RNA-Seq or mRNA-Seq is a targeted RNA-Seq protocol that enriches for the polyadenylated (poly-A) transcripts of the transcriptome. mRNA-Seq is used to study transcription in disease states as well as expression in a variety of research applications. Only around 1-2% of the entire transcriptome consists of poly-A tailed RNA, the protein-coding portion. By targeting mRNA, effective sequencing depth is improved, as reads are dedicated to coding genes. This makes identifying rare variants and lowly expressed mRNA transcripts easier.

See Genohub's up-to-date list of available whole transcriptome and mRNA sequencing services.


Common Applications of Whole Transcriptome and mRNA-Seq


  1. Gene expression- Detection, measurement and comparison of coding transcripts from a population of cells. Target poly(A) mRNAs are enriched and cDNA synthesis performed using random primers, oligo-dT primers or by attaching adapters to mRNA fragments, followed by PCR amplification.
  2. Alternative splicing- Exon/intron boundaries are examined by long read or paired end sequencing.
  3. Non-coding RNA discovery- Identification and functional analysis of lowly expressed RNAs that are not translated into protein.
  4. Anti-sense- Examination of the subset of transcripts that represent the anti-sense orientation and correlating this with gene expression changes.
  5. Fusion detection- Identification of fusion transcripts, which arise from genomic rearrangements in cancer that can disrupt the activity of tumor suppressor genes or activate proto-oncogenes.
  6. Single cell- Examination of the complete transcriptome of a single isolated cell, as opposed to an averaging of the transcripts in millions of cells.


Other Applications of RNA-Seq


While whole transcriptome and mRNA-seq represent ~90% of all RNA based sequencing applications, it’s important not to lose sight of the many newer protocols available to detect transcription events, RNA-protein interactions, RNA modifications, RNA structure and low input RNA. Several examples of these are listed here:

RNA-protein interaction

  1. CLIP-Seq
  2. RIP-Seq
  3. Ribo-Seq
  4. PAR-CLIP
  5. iCLIP
  6. HITS-CLIP
RNA-structure examination
  1. CapSeq
  2. Frag-Seq
  3. SHAPE-MaP
  4. CIRS-Seq
  5. CIP-TAP
RNA transcription event detection
  1. GRO-Seq
  2. CAGE-Seq
  3. PARE-Seq
  4. FRT-Seq
  5. TAIL-Seq
  6. SMORE-Seq
Low RNA input RNA-Seq
  1. SMART-Seq
  2. SMARTer
  3. CEL-Seq
  4. Drop-Seq
  5. G&T-Seq
  6. STRT-Seq
Small RNA-Seq
  1. Serial adapter ligation based miRNA-Seq
  2. Adapter circularized miRNA-Seq
  3. Single adapter ligation based miRNA-Seq



RNA-Seq Workflow


  1. RNA Extraction- Total RNA is extracted by phenol chloroform, gel or column enrichment. Depending on the sample type, there are several considerations to keep in mind when extracting total RNA for next generation sequencing. These include ensuring that your RNA is free of organic solvents that might inhibit downstream library preparation, that the RNA is not fragmented into sizes too short for library preparation and sequencing, and that the quality of your total RNA is not diminished by your extraction method. If you need help, fill out our complimentary consultation form and we'll be happy to offer our recommendations.

  2. RNA QC- RNA sample quality varies significantly depending on the source of the total RNA. RNA from FFPE or degraded samples can have poor integrity, while total RNA extracted from fresh tissue can be in superb condition. The most common way to check RNA quality is to run total RNA on a gel or a Bioanalyzer (or similar electrophoresis instrument) and inspect the 28S and 18S rRNA bands. The 28S rRNA band should be approximately twice as intense as the 18S rRNA band on a gel. The RNA integrity number (RIN) can be measured using a Bioanalyzer or a similar instrument.

  3. RNA Sample Submission- Typically 100 ng – 1 µg of total RNA is required for mRNA and whole transcriptome sequencing. See our guide for recommendations on shipping RNA samples.

  4. Order Sequencing and/or Library Prep Services- Quotes for RNA-Seq services can be obtained instantly on Genohub.

  5. RNA Library Preparation- The type of RNA library preparation performed depends on whether you’re interested in examining the entire transcriptome or just coding transcripts, as in mRNA-seq. If examining the transcriptome, it’s recommended that rRNA and potentially other non-coding RNAs be removed from your total RNA (see methods for removing rRNA from total RNA). If you’re interested in interrogating mRNA, poly(T) oligonucleotides fixed to magnetic beads are added to total RNA and selectively bind to messenger RNAs. Anything not bound is removed during a wash step. mRNAs are then eluted from the beads and used in the first step of library preparation. While this doesn’t completely eliminate non-coding RNA, it does significantly reduce the proportion of rRNA in your final sequencing results. Once your depletion or selection strategy has been chosen, RNA-seq library preparation includes a reverse transcription step in which RNA is converted to cDNA, after which sequencing adapters are added, typically by ligation. The steps for library preparation include:
    • Total RNA isolation
    • mRNA enrichment or rRNA depletion from total RNA
    • RNA Fragmentation
    • Reverse transcription - 1st strand synthesis
    • Second strand synthesis
    • Ligation of adapters
    • Amplification

    Commonly used RNA-Seq library preparation kits for whole transcriptome or mRNA-Seq can be found below.


  6. Sequencing- Parameters for your sequencing run will depend on your experiment. As a general recommendation, for differential expression profiling we recommend at least 10-25M single-end 1x50 or 1x100 reads per sample. For de novo assembly or alternative splicing, we recommend around 100M paired-end 2x100 or 2x150 reads. See Transcriptome Sequencing in our coverage guide for more information.

  7. Data Analysis- Again, analysis requirements will depend on your experiment. Typically RNA-seq data is filtered, mapped and assembled. Quantification of splice variants, junctions and differential expression are commonly performed to interpret data. For general pipeline recommendations, go to: RNA-Seq Data Analysis Recommendations
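The read recommendations in the sequencing step above translate into a per-sample data yield with simple arithmetic (reads × read length). The following is a minimal sketch; it assumes that "100M paired reads" counts both mates of each pair, which is the convention used in the service descriptions later in this guide. The function and label names are illustrative only.

```python
def bases_per_sample(total_reads, read_length):
    """Total sequenced bases, counting every read (each mate of a pair) once."""
    return total_reads * read_length

# (label, total reads, read length in bp), taken from the recommendations above.
# Assumption: paired-end read counts include both mates.
recommendations = [
    ("differential expression, 1x50, 10M reads", 10_000_000, 50),
    ("differential expression, 1x100, 25M reads", 25_000_000, 100),
    ("de novo assembly or splicing, 2x100, 100M reads", 100_000_000, 100),
    ("de novo assembly or splicing, 2x150, 100M reads", 100_000_000, 150),
]

# Yield in gigabases (Gb) for each recommendation
yields_gb = {
    label: bases_per_sample(reads, length) / 1e9
    for label, reads, length in recommendations
}

for label, gb in yields_gb.items():
    print(f"{label}: {gb:.2f} Gb")
```

Dividing an instrument run's expected output by the per-sample yield gives a rough upper bound on how many samples can share a run, before allowing for index overhead and uneven pooling.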


    RNA-Seq Library Preparation Kits


    Bioo Scientific NEXTflex Directional RNA-Seq Kit

    The NEXTflex Rapid Directional RNA-Seq Kit is optimized for starting inputs between 10 ng – 1 µg of total RNA or 10 – 100 ng of purified mRNA or ribosomal depleted RNA. The kit utilizes a “directional” or “stranded” approach that identifies from which of two DNA strands a given RNA transcript was derived. When RNA is copied back into cDNA during RNA-Seq library prep, the information about which of the strands was copied into RNA is lost unless a “directional” or “stranded” approach is used to preserve its identity. Strandedness is useful for transcript annotation, increases the percentage of alignable reads and provides insight into antisense transcription. Stranded character is retained due to the directionality of the adapters added. During second strand synthesis, dUTP is used in place of dTTP. Just before PCR, Uracil DNA Glycosylase (UDG) is used to catalyze excision of the uracil bases, cleaving the uridine-containing strand so that it is not amplified.

    This kit’s unique characteristics include a reverse transcription step that’s performed at 50ºC instead of 42ºC. The NEXTflex Rapid Thermostable enzyme included in the kit reduces secondary structure in RNA before transcription, improves yield and read through in complex GC regions. The protocol combines end-repair with second strand synthesis, reducing the need for a separate step. The kit is available with magnetic mRNA poly(A) beads and up to 48 RNA barcodes.

    Protocol Overview:

    1. Poly A selection of mRNA or ribosomal depletion of rRNA
    2. Fragmentation with a high divalent cation buffer
    3. Random hexamer priming
    4. First strand synthesis with Actinomycin D (to prevent DNA-dependent DNA synthesis)
    5. Second strand synthesis (using dUTP instead of dTTP)
    6. A-tailing
    7. Adapter ligation
    8. UDG treatment (degrades the dUTP-containing second strand)
    9. PCR amplification

    Bioo Scientific NEXTflex Rapid RNA-Seq Kit

    The NEXTflex Rapid RNA-Seq Kit is similar to the Rapid Directional RNA-Seq kit above, but does not provide strand information. Second strand synthesis does not contain dUTP and UDG is not used prior to PCR. The kit does contain the same thermostable reverse transcriptase step and fast workflow. Magnetic mRNA beads and up to 48 barcodes are also available.


    Bioo Scientific NEXTflex Rapid Directional qRNA-Seq Kit

    Similar to the NEXTflex Rapid Directional RNA-Seq kit, this kit uses a stranded approach to determine from which of two DNA strands a given RNA transcript was derived. The main feature of this kit is the ‘q’ in qRNA-Seq, which stands for quantitative. The kit utilizes a set of 9,216 molecular labels during ligation so that each fragmented molecule is tagged with a label that is, with high probability, unique. These labels help differentiate between PCR duplicates and genuinely distinct fragments. De-duplication is typically performed using start and stop sites alone, so distinct fragments that happen to share the same start and stop sites are collapsed and lost during counting. By using stochastic labels, this kit gives a more accurate representation of transcript expression. The protocol is similar to the previously mentioned Directional RNA-Seq kit, but includes molecular indices and up to 96 sample barcodes, which are added during PCR.
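The difference between position-only de-duplication and label-aware de-duplication can be illustrated with a small sketch. This is not the kit's analysis software: the read tuples, coordinates and label sequences below are made up for illustration, and a real pipeline would read them from aligned BAM records.

```python
def count_unique_fragments(reads, use_labels=True):
    """Count distinct fragments among aligned reads.

    reads: iterable of (chrom, start, end, strand, label) tuples, where
    label is the molecular label attached to the fragment (hypothetical format).
    With use_labels=False, reads sharing coordinates are treated as duplicates.
    """
    seen = set()
    for chrom, start, end, strand, label in reads:
        key = (chrom, start, end, strand)
        if use_labels:
            key += (label,)
        seen.add(key)
    return len(seen)

reads = [
    ("chr1", 100, 250, "+", "AACGTTGC"),  # fragment A
    ("chr1", 100, 250, "+", "AACGTTGC"),  # PCR duplicate of fragment A
    ("chr1", 100, 250, "+", "GGTCATCA"),  # distinct molecule, same coordinates
    ("chr1", 300, 450, "+", "AACGTTGC"),  # distinct molecule, other coordinates
]

position_only = count_unique_fragments(reads, use_labels=False)
with_labels = count_unique_fragments(reads, use_labels=True)
print(position_only, with_labels)
```

Position-only counting reports two fragments here, while label-aware counting recovers the third molecule that shares coordinates with fragment A but carries a different label.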


    Clontech SMARTer

    SMART stands for “Switching Mechanism at 5’ End of RNA Template”. The procedure allows you to add known sequence to the 3’ and 5’ ends of the cDNA fragment without ligation. Generally, ligation based approaches yield a maximum of 40-50% doubly ligated product. By using a ligation-free approach, the user is able to start with significantly lower inputs of material. The Clontech SMARTer kit allows you to start with as little as 1-2 ng of total RNA, which makes this kit well suited to laser capture microdissection, cells sorted by flow cytometry or other techniques that yield small amounts of input RNA. The benefits of SMART technology are that it enriches for full length transcripts and maintains a true representation of the original mRNA transcripts, factors that are critical for transcriptome sequencing and gene expression analysis.

    Protocol Overview:

    1. A modified oligo(dT) or random hexamer primer primes the first strand synthesis reaction.
    2. When SMARTScribe RT reaches the 5’ end of the mRNA, the enzyme’s terminal transferase activity adds a few additional nucleotides to the 3’ end of the cDNA.
    3. The SMARTer oligo base pairs with the non-templated nucleotide stretch, creating an extended template.
    4. SMARTScribe RT switches templates and continues replicating to the end of the oligonucleotide.
    5. The resulting single stranded cDNA contains the 5’ end of the mRNA as well as sequence complementary to the SMARTer oligonucleotide.
    6. Covaris shearing of full length cDNA
    7. Library Preparation

    Epicentre ScriptSeq

    The ScriptSeq kit has been developed for preparing libraries from 500 pg - 50 ng of either ribosomal depleted or polyA enriched RNA. Directional and paired end libraries can be constructed using this kit for Illumina platforms.

    Protocol Overview:

    1. RNA Fragmentation
    2. Random hexamer tagging
    3. RNA removal
    4. Annealing of 3’-end blocked terminal tagging oligo
    5. cDNA synthesis
    6. DNA purification
    7. PCR amplification and barcode addition

    Gnomegen RNA Profiling Kit

    The Gnomegen RNA Profiling Kit is designed specifically for identifying the 3’ regions of RNA transcripts. It is marketed as a cost effective alternative to whole transcriptome sequencing, as it focuses mainly on identifying changes in RNA expression levels. It is not designed for RNA discovery, annotation or identification of alternative splicing sites. Library construction requires as little as 1 ng of total RNA and can be completed in 1 day.

    Protocol Overview:

    1. Fragmentation of Total RNA
    2. 5’adapter ligation
    3. Reverse transcription with a polyA RT primer
    4. PCR

    Illumina TruSeq RNA (Discontinued)

    The TruSeq RNA Kit from Illumina is designed for generating mRNA libraries from total RNA. The kit is optimized for starting inputs between 0.1 - 4 µg of total RNA, but the user may also start with the entire fraction of mRNA isolated from 0.1 - 4 µg of total RNA. The kit comes with polyA beads for the purification procedure. All reagents in the kit are master mixed for ease of use during library construction. The v2 version of this kit contains 12 indexing adapters, with up to 24 available. There are no other significant differences between v1 and v2. Illumina recommends the use of high quality starting material, particularly total RNA with an RNA integrity number (RIN) of 8 or higher. Alternatively, the user may run their sample on a gel and look for a 28S band that is twice as intense as the 18S band. The kit contains all the components you need for RNA-Seq library prep, except for the reverse transcriptase enzyme. Illumina recommends the purchase of SuperScript II for first strand synthesis.

    Protocol Overview:

    1. PolyA selection of mRNA
    2. Fragmentation with a high divalent cation buffer
    3. Random hexamer priming
    4. First strand synthesis (using SuperScript II)
    5. Second strand synthesis
    6. End repair
    7. A-tailing
    8. Adapter ligation
    9. PCR

    Illumina TruSeq Stranded

    The TruSeq Stranded kit is essentially the same as the Illumina TruSeq RNA kit except for the fact it provides “directional” or “stranded” information that identifies from which of two DNA strands a given RNA transcript was derived. When RNA is copied back into cDNA during RNA-Seq library prep, the information about which of the strands was copied into RNA is lost unless a “directional” or “stranded” approach is used to preserve its identity. Strandedness is useful for transcription annotation, increases the percentage of alignable reads and provides insight into antisense transcription. Stranded character is retained due to the directionality of the adapters. The P7 adapter is on the 3’ end of the cDNA strand. During second strand synthesis, dUTP is used in place of dTTP, eliminating the second strand during amplification since the PCR polymerase used cannot read through dUTP.

    The kit is optimized for 0.1 - 4 µg of total RNA input and may be used with either polyA or ribosomal depletion beads. The kit includes 12 adapter barcodes, with a total of 24 available in the “low-throughput” version of the kit. The “high-throughput” version contains 96 unique dual indexed adapters. The move to dual indexed adapters is a strategy that decreases the number of adapters that need to be synthesized to make 96. Essentially, 12 adapters containing unique indices are paired with 8 adapters containing another set of unique indices, allowing for 96 possible combinations while only 20 adapters need to be synthesized. There are no particular user benefits of “dual” over “single” indices. In fact, the main disadvantage of “dual” indices is that several sequencing cycles are lost reading “dark regions”.
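The combinatorial arithmetic behind dual indexing (12 × 8 = 96 combinations from 12 + 8 = 20 adapters) can be sketched as follows. The adapter names are placeholders invented for this example and are not Illumina's index names.

```python
from itertools import product

# Placeholder names for the two index sets (hypothetical, for illustration)
first_index_set = [f"A{n:02d}" for n in range(1, 13)]   # 12 adapters
second_index_set = [f"B{n:02d}" for n in range(1, 9)]   # 8 adapters

# Every pairing of one index from each set labels one sample
dual_index_pairs = list(product(first_index_set, second_index_set))

adapters_to_synthesize = len(first_index_set) + len(second_index_set)
samples_addressable = len(dual_index_pairs)

print(f"{adapters_to_synthesize} adapters give {samples_addressable} combinations")
```

Because each pair appears exactly once in the product, every sample in a 96-plex pool receives a distinct combination even though individual indices are reused across samples.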

    The TruSeq Stranded mRNA kit comes with the polyA beads, while the TruSeq Stranded Total RNA kit contains ribosomal depletion beads. The ribo-depletion beads are essentially from Epicentre (Ribo-Zero) and contain biotinylated probes that selectively bind and remove rRNA. By depleting rRNA, the user is able to sequence not only polyadenylated transcripts, but also a broad range of non-coding transcripts, including long non-coding RNA (lncRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA) and other species. The TruSeq Stranded Total RNA Kit can be purchased with Ribo-Zero Gold beads, designed for rRNA removal across human, mouse and rat species; Ribo-Zero Globin, for removal of rRNA and globin in a single step; and Ribo-Zero Plant, for specific removal of cytoplasmic, mitochondrial and chloroplast rRNA from leaf, seed and root tissue.

    Protocol Overview:

    1. PolyA selection of mRNA or ribosomal depletion of rRNA
    2. Fragmentation with a high divalent cation buffer
    3. Random hexamer priming
    4. First strand synthesis (using SuperScript II) with Actinomycin D (to prevent DNA-dependent DNA synthesis)
    5. Second strand synthesis (using dUTP instead of dTTP)
    6. End repair
    7. A-tailing
    8. Adapter ligation
    9. PCR (polymerase cannot read through dUTP)
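Because only the first cDNA strand survives the dUTP protocol above, the strand of the original transcript can be inferred from the strand a read aligns to. The sketch below assumes the common convention for dUTP-based libraries, in which read 1 is antisense to the transcript and read 2 is sense; confirm the orientation for your specific kit before relying on it. The function name is illustrative.

```python
def transcript_strand(alignment_strand, mate=1):
    """Infer the transcript strand for a read from a dUTP-stranded library.

    alignment_strand: '+' or '-', the genomic strand the read aligned to.
    mate: 1 for read 1 (or a single-end read), 2 for read 2.
    Assumption: read 1 is antisense to the transcript, read 2 is sense.
    """
    if alignment_strand not in ("+", "-"):
        raise ValueError("alignment_strand must be '+' or '-'")
    if mate == 1:
        # Read 1 comes from the first cDNA strand, so flip it
        return "-" if alignment_strand == "+" else "+"
    if mate == 2:
        # Read 2 has the same orientation as the original RNA
        return alignment_strand
    raise ValueError("mate must be 1 or 2")

# A pair whose read 1 aligns to '-' and read 2 to '+' supports a '+' transcript
pair_strands = (transcript_strand("-", mate=1), transcript_strand("+", mate=2))
print(pair_strands)
```

With an unstranded kit this inference is not possible, since reads from either cDNA strand are present in the library.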

    NEB NEBNext Ultra Directional RNA

    The NEB NEBNext Ultra Directional RNA kit requires 100 ng - 1 µg of total RNA or 10 - 100 ng of purified mRNA or ribosomal depleted RNA as starting input. The protocol is similar to Illumina’s TruSeq Stranded kit.


    See Genohub's up-to-date list of available library prep services for the following applications:

    RNA (polyA-selected) Illumina library prep services
    RNA (rRNA-depleted) Illumina library prep services

    If you're doing your own library preparation see the list of facilities that offer Illumina sequencing.



    RNA-Seq Data Analysis Recommendations


    Evaluating the quality of your data and extracting biological meaning is one of the most important steps in any RNA-Seq application. It’s important to discuss your project with an experienced bioinformatician or learn about the best tools to properly analyze your data. We do not recommend the one-pipeline-fits-all approach offered by several commercial black-box software providers. Each RNA tool (and there are many) needs to be carefully examined to see if it is the right tool for your application. With that disclaimer, we recommend starting with the Tuxedo suite of software for differential gene and transcript expression analysis of RNA-seq experiments (1). Tuxedo's tools enable short-read mapping, identification of splice junctions, and transcript and isoform detection. The tools are open source and, most importantly, use peer-reviewed statistical methods (2-5).

    A Tuxedo based RNA-Seq data analysis pipeline has three main components:

    1. Bowtie2 (2)
    2. TopHat (3)
    3. Cufflinks (4-5)

    Bowtie2 is a fast, memory efficient aligner designed to quickly align large sets of short reads to the genome. Bowtie is the basis for tools like TopHat and Cufflinks.

    TopHat is a fast splice junction mapper that can be used with Bowtie or Bowtie2. TopHat uses Bowtie to map RNA-seq reads and then analyzes the mapping results to identify splice junctions between exons.

    Cufflinks uses alignment data from TopHat to assemble RNA-seq reads into transcripts, estimate their abundance, and measure differential expression and regulation across the transcriptome. Its Cuffdiff component can measure differential expression at the CDS, gene, isoform and TSS level.
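To make the order of the three components concrete, the sketch below assembles the command lines for a basic run without executing anything. All file names and sample labels are hypothetical, and the workflow is simplified: for example, it passes the reference annotation straight to Cuffdiff and omits the merging of per-sample assemblies described in the published protocol (5). Only the basic thread (`-p`), annotation (`-G`), output directory (`-o`) and label (`-L`) options are used.

```python
import shlex

def tuxedo_commands(genome_fasta, index_prefix, annotation_gtf, samples, threads=4):
    """Return command lines for a basic Tuxedo run as lists of arguments.

    samples maps a condition label to a list of single-end FASTQ files,
    one per biological replicate. Nothing is executed here.
    """
    # Build the Bowtie2 index that TopHat aligns against
    commands = [["bowtie2-build", genome_fasta, index_prefix]]
    bams_by_condition = {}
    for condition, fastqs in samples.items():
        bams_by_condition[condition] = []
        for rep, fastq in enumerate(fastqs, start=1):
            name = f"{condition}_rep{rep}"
            # TopHat: spliced alignment guided by a known annotation
            commands.append(["tophat", "-p", str(threads), "-G", annotation_gtf,
                             "-o", f"{name}_tophat", index_prefix, fastq])
            bam = f"{name}_tophat/accepted_hits.bam"
            # Cufflinks: assemble transcripts and estimate abundances
            commands.append(["cufflinks", "-p", str(threads),
                             "-o", f"{name}_cufflinks", bam])
            bams_by_condition[condition].append(bam)
    # Cuffdiff: one comma-separated list of replicate BAMs per condition
    commands.append(["cuffdiff", "-p", str(threads), "-o", "cuffdiff_out",
                     "-L", ",".join(bams_by_condition), annotation_gtf]
                    + [",".join(bams) for bams in bams_by_condition.values()])
    return commands

commands = tuxedo_commands(
    "genome.fa", "genome_index", "genes.gtf",
    {"control": ["control_1.fastq", "control_2.fastq"],
     "treated": ["treated_1.fastq", "treated_2.fastq"]},
)
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping the commands as argument lists makes it straightforward to hand them to a job scheduler or to `subprocess.run` once the paths have been checked.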



    How Many Replicates are Needed for RNA-Seq?


    To properly interpret read count differences in RNA-Seq data you should consider where variation can be introduced. When thinking about the replicates you need for an RNA-Seq study, you're likely thinking about biological replicates. As a general answer we recommend at least 6 biological replicates per sample group. For a more detailed and ‘experimentally validated’ answer, we recommend you read How Many Replicates are Sufficient for Differential Gene Expression. Other sources of variation should also be considered when planning replicates. The main sources are:

    1. Sample variation - When you extract total RNA from a sample, only a small fraction of nucleic acid is actually sampled and represented in your library. This causes sampling variation and should be considered in your analysis.
    2. Technical variation - Library preparation for RNA-seq is a series of coordinated enzymatic reactions that may each contribute to variation between libraries. Technical variation should be controlled for in your experiment.
    3. Biological variation - This type of variation is what you are actually interested in measuring. The number of biological replicates you need for a whole transcriptome or mRNA-seq experiment depends on what you are trying to compare or measure statistically.



    Paired or Single-end Reads for RNA-Seq?


    In paired-end sequencing, both ends of each cDNA fragment are sequenced, as opposed to single-end sequencing, where only one end is sequenced. Advantages of paired-end sequencing for RNA applications include better alignment and transcript assembly. As a result, paired-end sequencing is recommended for the following RNA-seq applications:

    1. De novo assembly
    2. Discovery of novel non-coding RNAs
    3. Splice isoform detection
    4. Resolution at the 3’end of your transcript
    5. Identification of polycistronic mRNAs and operons

    Single end sequencing is sufficient for differential expression studies, where you’re interested in examining a profile of all coding transcripts in a sample. For counting applications, sequencing both ends of a transcript is not critical.



    Methods for removing rRNA from Total RNA


    The most common methods for removing rRNA from total RNA are:

    Oligo-bead based. Biotin labeled oligonucleotides complementary to rRNA or other non-coding RNAs are mixed with your total RNA, hybridized and pulled down by streptavidin beads.

    RNase H cleavage. DNA oligos designed to hybridize to rRNA are incubated with total RNA before RNase H is introduced. RNase H belongs to a family of non-sequence-specific endonucleases and catalyzes cleavage of the 3’O-P (phosphodiester) bonds of RNA in DNA/RNA duplexes. After a cleanup, rRNA is no longer available for reverse transcription.

    Priming. Reverse transcription primer sets can be designed, e.g. oligo(dT) primers or 'not so random' primers, to specifically avoid rRNA and other non-coding products.

    CRISPR/Cas9. Cas9 has been shown in at least one publication (6) to specifically cleave rRNA-derived sequences in sequencing libraries, depleting them from the final data.



    Considerations for Whole Transcriptome and mRNA Sequencing:


    1. What sequencing instruments are recommended for RNA-seq, specifically whole transcriptome or mRNA-sequencing?

    We recommend the Illumina NextSeq, HiSeq 2500 and HiSeq 3K/4K instruments. Each offers enough throughput and a sufficient range of read lengths to be suitable for all of the RNA sequencing applications described here.


    2. Will my RNA-Seq results be stranded or directional?

    Yes, almost all commercially available library preparation kits use techniques to preserve strand information. Directional or stranded RNA-seq identifies from which of two DNA strands a given RNA transcript was derived. When RNA is copied into cDNA during reverse transcription, the information about which of the strands was copied into RNA is lost unless a “directional” or “stranded” approach is used to preserve its identity. Strandedness is useful for transcript annotation, increases the percentage of alignable reads and provides insight into antisense transcription.


    3. What technique offers greater read-depth, mRNA-seq or whole transcriptome sequencing?

    mRNA-seq offers greater read depth than whole transcriptome sequencing for the same number of reads. mRNA-seq enriches for poly-adenylated RNA, so sequencing reads are focused on a smaller subset of RNA, giving you higher sequencing depth per transcript. For whole transcriptome work, rRNA depletion removes rRNA, focusing reads on mRNA and non-coding RNAs.


    4. How long of a sequencing read should I use for mRNA-seq and whole transcriptome sequencing?

    Longer read lengths are important for de novo transcript assembly and identifying transcript isoforms. We recommend paired 2x100 or 2x150 read lengths for these applications. For mRNA differential gene expression studies, a long read length is typically not required when there is an available reference genome. For these studies we recommend single end 1x50 or 1x100 read lengths.



    Direct RNA Sequencing - Actual Sequencing of RNA not DNA


    To date, almost all RNA-seq studies, including all Illumina based RNA sequencing, involve the sequencing of cDNA. In 2009, Helicos™ published a paper (7) describing their ability to directly sequence RNA; however, this technology is not widely commercially available. In 2015, Oxford Nanopore announced the ability to sequence RNA strands on the MinION™ and PromethION™ without needing to convert them to double stranded DNA. Directly sequencing RNA offers the following advantages:

    1. Direct sequencing of RNA eliminates biases associated with reverse transcription and provides a more accurate measure of transcript profiles.
    2. Modified RNA bases can be measured directly, whereas with cDNA conversion they are lost.
    3. It’s faster than cDNA library preparation, as no reverse transcription step is necessary.



    Whole Transcriptome and mRNA Sequencing Services:


    A. Standard differential mRNA expression library and sequencing services cost between $210 - $300 per sample

    1. 10 million reads per sample
    2. Minimum 1x50 single end read length
    3. mRNA library preparation with poly(A) beads
    4. Ligation based library preparation
    5. Appropriate for counting applications

    Search for Standard Differential mRNA Expression Sequencing Services


    B. Whole transcriptome library and sequencing services cost between $330 - $400 per sample

    1. 50 million paired end reads per sample (25M reads in each direction)
    2. Appropriate for examining mRNA and non-coding transcripts
    3. Ligation based library preparation
    4. Appropriate for more contextual examinations of the transcriptome

    Search for Whole Transcriptome Sequencing Services


    C. High depth RNA sequencing services cost between $780 - $900 per sample

    1. 200 million paired end reads per sample (100M reads in each direction)
    2. Paired-end reads that are 2x75 or greater in length
    3. Ideal for transcript discovery, splice site identification, gene fusion detection, de novo transcript assembly

    Search for High Depth RNA Sequencing Services
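The per-sample price ranges listed above can be combined with a replicate plan to get a rough project budget. The sketch below uses only the ranges quoted in this guide and defaults to the 6 biological replicates per group recommended earlier; the tier keys and function name are made up for illustration, and real quotes will vary by provider.

```python
# Per-sample price ranges in USD, as listed in tiers A, B and C above
PRICE_TIERS = {
    "standard_mrna": (210, 300),
    "whole_transcriptome": (330, 400),
    "high_depth": (780, 900),
}

def project_cost_range(tier, groups, replicates_per_group=6):
    """Return (low, high) total cost for groups x replicates samples."""
    low, high = PRICE_TIERS[tier]
    n_samples = groups * replicates_per_group
    return n_samples * low, n_samples * high

# Example: a two-group comparison (e.g. treated vs control), 6 replicates each
standard_two_groups = project_cost_range("standard_mrna", groups=2)
whole_two_groups = project_cost_range("whole_transcriptome", groups=2)
print(standard_two_groups, whole_two_groups)
```

For the two-group example this gives 12 samples, so the standard mRNA tier comes to between $2,520 and $3,600 in total.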



    References


    1. Trapnell, C., et al. (2013). Differential analysis of gene regulation at transcript resolution with RNA-seq. Nature Biotechnology 31, 46-53.
    2. Langmead, B., Salzberg, S. (2012). Fast gapped-read alignment with Bowtie 2. Nature Methods 9, 357-359.
    3. Trapnell, C., et al. (2009). TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25(9), 1105-1111.
    4. Trapnell, C., et al. (2010). Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology 28(5), 511-515.
    5. Trapnell, C., et al. (2012). Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nature Protocols 7, 562-578.
    6. Gu, W., et al. (2016). Depletion of Abundant Sequences by Hybridization (DASH): using Cas9 to remove unwanted high-abundance species in sequencing libraries and molecular counting applications. Genome Biology 17, 41.
    7. Ozsolak, F., et al. (2009). Direct RNA sequencing. Nature 461, 814-818.