RNA modifications regulate the complex life of transcripts. An experimental approach called LAIC-seq was developed to characterize modification levels on a transcriptome-wide scale. In this method, the modified and unmodified molecules are separated using antibodies specific for a given RNA modification (e.g., m6A). In essence, the biochemical separation yields three fractions: input, eluate, and supernatant, which are subjected to RNA-seq. In this work, we present a bioinformatics workflow that starts from RNA-seq data and infers gene-specific modification levels with a statistical model on a transcriptome-wide scale. Our workflow centers around the pulseR package, which was originally developed for the analysis of metabolic labeling experiments. We demonstrate how to analyze data without external normalization (i.e., in the absence of spike-ins), given high efficiency of separation, and how, alternatively, scaling factors can be derived from unmodified spike-ins. Importantly, our workflow provides an estimate of the uncertainty of modification levels in terms of confidence intervals for model parameters, such as gene expression and RNA modification levels. We also compare two alternative model parametrizations, the log-odds and the proportion of modified molecules, and discuss the pros and cons of each representation. In summary, our workflow is a versatile approach to RNA modification level estimation, which is open to any read-count-based experimental approach.
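The two parametrizations mentioned in the abstract, proportion and log-odds, are related by the logit transform. The following is a minimal Python sketch of that relationship, together with a naive point estimate from eluate and supernatant counts. It is not the pulseR model: the function names, the scaling-factor arguments, and the simple ratio estimator are assumptions made for illustration only, and the sketch produces no confidence intervals.

```python
import math


def proportion_to_log_odds(p):
    """Map a proportion in (0, 1) to log-odds (logit)."""
    return math.log(p / (1.0 - p))


def log_odds_to_proportion(x):
    """Map log-odds back to a proportion (inverse logit)."""
    return 1.0 / (1.0 + math.exp(-x))


def naive_modification_level(eluate, supernatant,
                             eluate_scale=1.0, supernatant_scale=1.0):
    """Illustrative ratio estimate of the modified fraction for one gene.

    eluate / supernatant: read counts in the two separated fractions.
    *_scale: hypothetical per-fraction scaling factors (e.g. derived from
    spike-ins); the default of 1.0 assumes the fractions are already comparable.
    """
    modified = eluate / eluate_scale
    unmodified = supernatant / supernatant_scale
    return modified / (modified + unmodified)


level = naive_modification_level(eluate=30, supernatant=70)
print(level, proportion_to_log_odds(level))
```

The log-odds scale is unbounded, which can be convenient for optimization and for symmetric intervals, whereas the proportion is easier to interpret directly.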
circtools, a modular, Python-based framework for circRNA-related tools that unifies several functionalities in a single command-line-driven software package, has been accepted for publication in Bioinformatics.
Please visit https://github.com/dieterich-lab/circtools for more information.
Our new article, “Exon junction complexes suppress spurious splice sites to safeguard transcriptome integrity”, is in press and will appear in Molecular Cell at the beginning of November. Congratulations to the team of authors:
Volker Boehm, Thiago Britto-Borges, Anna-Lena Steckelberg, Kusum K. Singh, Jennifer V. Gerbracht, Elif Gueney, Lorea Blazquez, Janine Altmüller, Christoph Dieterich, Niels H. Gehring
Productive splicing of human pre-mRNAs requires the correct selection of authentic splice sites (SS) from the large pool of potential SS. Although SS consensus sequence and splicing regulatory proteins are known to influence SS usage, the mechanisms ensuring the effective suppression of cryptic SS are insufficiently explored. Here, we find that many aberrant exonic SS are efficiently silenced by the exon junction complex (EJC), a multi-protein complex that is deposited on spliced mRNA near the exon-exon junction. Upon depletion of EJC proteins, cryptic SS are de-repressed, leading to the mis-splicing of a broad set of mRNAs. Mechanistically, the EJC-mediated recruitment of the splicing regulator RNPS1 inhibits cryptic 5′SS usage, while the deposition of the EJC core directly masks reconstituted 3′SS, thereby precluding transcript disintegration. Thus, the EJC protects the transcriptome of mammalian cells from inadvertent loss of exonic sequences and safeguards the expression of intact, full-length mRNAs.
Identification of circular RNAs with host gene-independent expression in human model systems for cardiac differentiation and disease.
Siede D(1), Rapti K(2), Gorska AA(2), Katus HA(2), Altmüller J(3), Boeckel JN(2), Meder B(2), Maack C(4), Völkers M(2), Müller OJ(2), Backs J(5), Dieterich C(6).
AIMS: Cardiovascular disease, one of the most common causes of death in western populations, is characterized by changes in RNA splicing and expression. Circular RNAs (circRNAs) originate from back-splicing events, which link a downstream 5′ splice site to an upstream 3′ splice site. Several back-splicing junctions (BSJ) have been described in heart biopsies from human, rat and mouse hearts (Werfel et al., 2016; Jakobi et al., 2016). Here, we use human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) to identify circRNA and host gene dynamics in cardiac development and disease. In parallel, we explore candidate interactions of selected homologs in mouse and rat via RIP-seq experiments.
METHODS AND RESULTS: Deep RNA sequencing of cardiomyocyte development and β-adrenergic stimulation uncovered 4518 circRNAs. The set of circular RNA host genes is enriched for chromatin modifiers and GTPase activity regulators. RNA-seq and qRT-PCR data showed that circular RNA expression is highly dynamic in the hiPSC-CM model, with 320 circRNAs showing significant expression changes. Intriguingly, 82 circRNAs are regulated independently of their host genes. We validated the same circRNA dynamics for circRNAs from ATXN10, CHD7, DNAJC6 and SLC8A1 in biopsy material from human dilated cardiomyopathy (DCM) and control patients. Finally, we could show that rodent homologs of circMYOD, circSLC8A1, circATXN7 and circPHF21A interact with either the ribosome or Argonaute2 protein complexes.
CONCLUSION: CircRNAs are dynamically expressed in a hiPSC-CM model of cardiac development and stress response. Some circRNAs show similar, host-gene independent expression dynamics in patient samples and may interact with the ribosome and RISC complex. In summary, the hiPSC-CM model uncovered a new signature of potentially disease relevant circRNAs which may serve as novel therapeutic targets.
There is a new publication out.
Jakobi T; Czaja-Hasse LF; Reinhardt R; Dieterich C, 2016. Profiling and Validation of the Circular RNA Repertoire in Adult Murine Hearts.
Title: Mondo complexes regulate TFEB via TOR inhibition to promote longevity in response to gonadal signals
Authors: Shuhei Nakamura, Özlem Karalay, Philipp S. Jäger, Makoto Horikawa, Corinna Klein, Kayo Nakamura, Christian Latza, Sven E. Templer, Christoph Dieterich & Adam Antebi
Joint work with our dear colleagues from the MPI-AGE has been accepted for publication in Nature Communications.
Motivation: Circular RNAs (circRNAs) are a poorly characterised class of molecules that were identified decades ago. Emerging high-throughput sequencing methods as well as first reports on confirmed functions have sparked new interest in this RNA species. However, the computational detection and quantification tools are still limited.
Results: We developed a pair of complementary software tools, DCC and CircTest. DCC uses output from the STAR read mapper to systematically detect back-splice junctions in next-generation sequencing data. DCC applies a series of filters and integrates data across replicate sets to arrive at a precise list of circRNA candidates. We assessed the detection performance of DCC on a newly generated mouse brain data set and publicly available sequencing data. Our software achieves a much higher precision than state-of-the-art competitors at similar sensitivity levels. Moreover, DCC estimates circRNA vs. host gene expression by counting junction and non-junction reads. These read counts are finally used by our R package CircTest to test for host gene-independence of circRNA expression across different experimental conditions. We demonstrate the benefits of this approach on previously reported age-dependent circRNAs in the fruit fly.
Availability: The source code of DCC and CircTest is licensed under the GNU General Public Licence (GPL) version 3 and available from https://github.com/dieterich-lab/DCC and https://github.com/dieterich-lab/circ-test.
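The idea of comparing circRNA and host gene expression through junction and non-junction read counts, as described in the Results above, can be illustrated with a small Python sketch. This is not the DCC or CircTest implementation: the function names, the example counts, and the plain ratio summary are assumptions for illustration, and no statistical test is performed here.

```python
def circ_fraction(junction_reads, non_junction_reads):
    """Fraction of reads at a back-splice site that support the circRNA.

    Returns None when there is no coverage at all.
    """
    total = junction_reads + non_junction_reads
    if total == 0:
        return None
    return junction_reads / total


def mean_circ_fraction(replicates):
    """Average the per-replicate fractions, skipping uncovered replicates."""
    fractions = [circ_fraction(j, n) for j, n in replicates]
    fractions = [f for f in fractions if f is not None]
    if not fractions:
        return None
    return sum(fractions) / len(fractions)


# Hypothetical (junction, non-junction) counts for one circRNA, two replicates each.
counts = {
    "control": [(12, 188), (10, 190)],
    "treated": [(40, 160), (44, 156)],
}

for condition, replicates in counts.items():
    print(condition, mean_circ_fraction(replicates))
```

A shift in this fraction between conditions suggests that the circRNA changes relative to its host gene. Whether such a shift is significant would need a proper statistical test across replicates.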