Dark matter RNA illuminates the puzzle of genome-wide association studies
© St. Laurent et al.; licensee BioMed Central Ltd. 2014
Received: 7 March 2014
Accepted: 22 May 2014
Published: 12 June 2014
In the past decade, numerous studies have made connections between sequence variants in human genomes and predisposition to complex diseases. However, most of these variants lie outside of the charted regions of the human genome whose function we understand; that is, the sequences that encode proteins. Consequently, the general concept of a mechanism that translates these variants into predisposition to diseases has been lacking, potentially calling into question the validity of these studies. Here we make a connection between the growing class of apparently functional RNAs that do not encode proteins and whose function we do not yet understand (the so-called ‘dark matter’ RNAs) and the disease-associated variants. We review advances made in a different genomic mapping effort – unbiased profiling of all RNA transcribed from the human genome – and provide arguments that the disease-associated variants exert their effects via perturbation of regulatory properties of non-coding RNAs existing in mammalian cells.
KeywordsGenome-wide association study Non-coding RNA vlincRNA Intronic RNA lncRNA RNA scaffold LincRNA Long Non-coding RNA Long intergenic non-coding RNA Very long intergenic non-coding RNA
Pervasive transcription: the answer to function behind the non-coding GWAS variants?
Glossary of technical terms
A system of regulation of gene activity in a cell that works by affecting the immediate surroundings of DNA, for example, by modifying various proteins that coat DNA inside the nucleus. Depending on the exact nature of the modification, DNA becomes either more or less accessible to cellular machinery that activates genes
A sequence of DNA that can regulate a target gene or genes over long distances
DNAse I hypersensitivity region
A region of DNA identified in an assay where chromatin is digested with DNAse I, an enzyme that degrades DNA. More accessible regions of chromatin, typically containing regulatory elements such as promoters and enhancers, are more susceptible to DNAse digestion and thus are enriched in DNAse I hypersensitivity regions
Gene Ontology (GO) term
GO is a an international initiative aimed at assigning controlled vocabulary, consisting of terms such as ‘regulation of apoptosis’ that define the functional property of each gene. This vocabulary is often very useful in understanding the biological meaning of a genomics experiment. For example, a list of genes activated during a disease would have a list of specific terms associated with each gene. Enrichment of specific terms in the list would suggest general cellular functions in which these genes participatem and give clues to the molecular functions underlying the disease
H1 embryonic stem cells
A line of human embryonic stem cells maintained in culture
A certain type of chemical modification of a protein that binds DNA. Important for reversible deactivation oftargeted portions of the genome
Part of an RNA molecule that is included immediately after transcription and removed during maturation of that molecule
RNA encoded by a DNA sequence that also encodes an intron of another transcript
A non-coding RNA activated upon DNA damage and in various tumor cell lines
A gene encoding an important regulator controlling activity of many genes. This gene has been associated with many cancers
Normal human epidermal keratinocytes (NHEK)
A line of primary keratinocytes maintained in culture
RNA that is not used as a template for protein synthesis
Massive transcription from unannotated regions of the genome
A molecule of RNA containing a long stretch of adenosine residues at the end
PRC2 chromatin signaling complex
A complex composed of multiple protein molecules that reversibly modifies chromatin and silences target genes
A sequence of DNA that is located immediately adjacent to a target gene and regulates its activity
A copy of a gene, presumed to be non-functional, although a number of recent examples describe both non-coding functions and occasionally coding functions for some of these loci
Regulation in trans
Regulation via interaction with molecules encoded by distal regions of the genome
RNA Pol II
A complex composed of multiple protein molecules responsible for synthesis of RNA, which is used as template for protein synthesis
A molecule of RNA produced by transcription, that is, copying of RNA from the DNA template
A protein that regulates expression of genes by binding to their promoters and/or enhancers
Transcription factor motif
A short DNA sequence recognized by a transcription factor or group of transcription factors, typically found in promoters and enhancers
A collection of all the RNA molecules (transcripts) in a cell or a tissue
Study of the transcriptome
Oocytes from frogs of genus Xenopus, an important model system for study of developmental biology, cell biology, molecular biology, toxicology, and neuroscience
In 2002, the first report of pervasive transcription  subsequently triggered a series of genome-mapping endeavors that have discovered large numbers of dark matter RNAs transcribed from much of the non-coding space of the human genome [10–12]. Although an object of considerable debate over the last decade , an increasing number of independent observations in different species [14–19] have confirmed these results by continually increasing the annotation of the dark matter transcriptome. Presently, little doubt remains that the human genome produces large amounts of RNA whose function we still do not understand [10–12]. In fact, non-coding (nc)RNA represents at least 75% of the human genome , and its relative mass outweighs that of protein-coding mRNA .
These large and well-validated datasets now provide a strong basis for an in-depth look at ncRNA as a possible answer to the mystery of the many GWAS hits that do not fall neatly into protein-coding regions of the human genome. The non-coding disease-associated polymorphisms from the GWAS studies may have uncovered a vast hidden regulatory layer composed of ncRNA transcripts and their network of interactions in the cell. Below we describe the evidence supporting this view, and explain how this perspective can solve a number of outstanding questions.
Undiscovered transcripts may underlie non-coding GWAS variants
As discussed by Mudge et al.  in their excellent review on functional transcriptomics, we have only begun to annotate the full complexity of RNAs encoded by the human genome. First, the database of complete, full-length cDNAs – the basis for gene annotations – still has surprisingly shallow coverage . In fact, for many gene loci, the database contains only a single complete cDNA. This implies that most protein-coding genes, even well-characterized ones, have yet undiscovered exons that would require highly sensitive in-depth profiling methods to reveal [22, 23]. With much less coverage than coding genes, the annotation situation for ncRNAs remains far more incomplete. The main reasons for this include the over-reliance on methods designed for protein-coding mRNAs, such as the use of polyA+ RNA for transcriptome profiling, the analytical focus on spliced RNAs, and the avoidance of intronic RNAs. The vast majority of RNA sequencing (RNAseq) experiments so far have profiled the polyA+ RNA fraction. Although this is informative for mRNAs, it leads to loss of significant complexity of ncRNAs , and a bias against the discovery of their unspliced versions. As a consequence, spliced versions of ncRNAs dominate the current annotated lists, which has resulted in an underestimate of their genomic coverage. It is very possible that, unlike protein-coding mRNAs, the longer, unspliced versions of the ncRNAs represent the functional forms. The abundance of ncRNAs in the nucleus compared to the cytosol  makes this a likely scenario.
Partly to bring order to this complexity, annotation efforts have classified ncRNAs by their physical characteristics. Typically, they are defined based on length (short, long, or very long), location relative to known genomic features (introns, genes, promoters, enhancers, or intergenic space), and overlap of known genes (sense or antisense) . Of greatest importance to GWAS,long ncRNAs (lncRNAs) include all the classes of ncRNAs greater than 200 nucleotides in length, such as long intergenic ncRNAs (lincRNAs) , very long intergenic ncRNAs (vlincRNAs; (>40 kb in length, see below), natural antisense RNAs, and intronic RNAs [19, 25–27]. Genomic annotation efforts typically focus on intergenic regions as the logical place to look for novel genes and transcripts, following the general notion that introns of known genes probably represent mere pre-mRNAs. However, this simple assumption has failed in the light of recent RNAseq datasets, as we have shown in a mouse inflammation time-course experiment . In fact, thousands of mouse introns can harbor functional ncRNAs that behave separately from their exonic counterparts , resonating with discoveries of independent intronic RNAs in other systems such as Xenopus oocytes . These observations support visionary ideas originally conceived by John Mattick almost 2 decades ago .
Strikingly, discovery of GWAS variants seem to favor different genomic elements depending on disease type (Figure 1). Only two disease types showed a preference for discovery of GWAS variants in coding regions of known genes (CDSs): metabolic diseases and anatomical diseases. By contrast, diseases affecting mental health favored annotated lncRNAs (lincRNAs and vlincRNAs), while cellular proliferation diseases favored introns (Figure 1). The even distribution in the non-disease trait category (which includes a large number of different phenotypes) provides a perspective for the contrasting results in the disease categories. Although understanding these observations will require additional research, we hope they raise the question of why variants in different diseases favor different categories of genomic elements. For example, it is tempting to speculate that the preference for promoters in anatomical diseases relates to the importance of tight control of gene expression during development.
As alluded to above, most of the existing lists of lncRNAs come from profiling of polyA+ RNA. By contrast, sequencing of total RNA from normal and tumor tissues recently uncovered vlincRNAs), a novel class of lncRNAs that showed statistically significant associations with GWAS variants . Thousands of these vlincsRNAs span at least 10% of the human genome, and probably span much more, once additional tissues are profiled. These RNAs range from 40 kb up to around 1 MB in length, and are controlled by typical RNA Pol II promoters. Interestingly, an intriguing subset of these RNAs – those controlled by promoters within endogenous retroviral elements – characterizes cancerous and pluripotent states.
Emerging patterns of lncRNA function in disease
A number of recent examples indicate that ncRNAs can underlie the function of non-coding GWAS SNPs. Perhaps the best-studied example is ANRIL, a lncRNA transcribed from the 9p21.3 locus. the single greatest GWAS risk factor for atherosclerosis that is currently known . Remarkably, the expression level of this ncRNA stands out as the variable most strongly associated with the disease phenotypes linked to 9p21.3 . Recent work has highlighted the importance of the atherogenic SNPs in this locus by demonstrating that ANRIL functions by trans-regulating over 900 genes . The study combined various in vitro experiments with measurement of gene expression in 2,280 patients with cardiovascular disease from the Leipzig Heart Study to show that these ANRIL-regulated genes can be classified into known atherogenic Gene Ontology terms such as ‘cell adhesion’ and ‘apoptosis’ .
The study further showed that ANRIL interacts with the PRC2 chromatin signaling complex, and requires an intact Alu sequence (a short repeated sequence present in thousands of copies in the human genome) for its regulatory effects . Delivery of the PRC2 complex, which promotes H3K27 trimethylation and repression of gene expression, to its targets via the Alu-mediated interaction thus provides an attractive model for ANRIL function. Other systems have previously provided similar examples of Alu repeats mediating intermolecular interactions between RNA molecules, and leading to functional consequences [37, 38]. All this evidence suggests that many other lncRNAs might function through intermolecular interactions mediated by Alu and other abundant repeated sequences in mammalian genomes.
The molecular mechanism of ANRIL function illustrates a potential general paradigm in gene expression regulation that promises to explain the function of large numbers of lncRNAs. In this model, one type of RNA species could regulate hundreds of targets in trans via intermolecular interactions, a mode of regulation previously associated primarily with short RNAs such as micro RNAs. However, it is becoming increasingly clear that lncRNAs can function in this manner, perhaps by providing scaffolds that bring together various protein and RNA molecules . In this regard, the functions of ANRIL parallel those of the lincRNA HOTAIR, which also trans-regulates hundreds of genes by changing the chromatin occupancy of PRC2 .
A newly discovered vlincRNA associated with Hemolysis, elevated liver enzymes, low platelet count syndrome provides another prominent example of this mode of function . While this syndrome represents a mendelian disorder, mapped using a traditional genetic analysis of affected families, the mutations occur in a non-coding region that harbors an ncRNA approximately 200 kb long. Subsequent analysis has shown that mutations can affect stability of this RNA , which also functions by trans-regulating hundreds of target genes.
Impressive as the aforementioned examples are, even more striking is the trend that has emerged from these and other investigations: more detailed examination of non-coding GWAS loci have increasingly led to the discovery of disease-relevant transcripts in the highlighted region. For example, a careful examination of the GWAS region at 5p14.1 that is implicated in autism led to the discovery of non-coding RNA antisense to a moesin pseudogene. Further experiments determined that this natural antisense RNA probably works in trans by lowering the level of moesin protein encoded by a gene on the X chromosome . Supporting this emerging trend, depletion (small interfering RNA or antisense RNA) or over-expression of lncRNAs, even the ultra-long vlincRNAs, now routinely results in apparent phenotypes [31, 41]. The growing wealth of examples precludes us from going into details of the studies or even citing all of them. Suffice to say that such functional analysis has begun to connect ncNAs, including those transcribed from GWAS loci, with processes such as cancer, heart disease, degenerative diseases, and senescence [31, 36, 40, 41, 43–45].
A complex transcriptome for complex diseases
The magnitude of functional transcripts in non-coding genomic space, many of which still remain either hidden or under-appreciated as functional RNAs, makes it almost certain that these RNAs will explain an important fraction of the non-coding GWAS hits. If so, then further intriguing questions arise. Why are mendelian disorders mostly explained by mutations affecting protein-coding exons, yet complex diseases are explained by mutations in ncRNAs? Does this notion form a pattern consistent with prevailingmechanistic models of how ncRNAs function? Does this pattern tell us something about the underlying systems biology of human cells?
The evidence thus far suggests positive answers to these questions. As described above in the few examples that have been worked out in detail, lncRNAs can act as trans regulators, potentially master regulators, of large numbers of known genes. Examples such as ANRIL, HOTAIR, lincRNA-p21, and HELLP highlight this emerging paradigm. A similar model of regulation occurs with transcription factors; however, a mutation in a long ncRNA would generally not have quite the same effect as a mutation in a protein. Considering that the former usually results in a much more flexible phenotype than the latter, a mutation would affect rather than abrogate the interaction affinity of the lncRNA with its partner molecules, such as proteins or other nucleic acids. Moreover, these interactions occur in the context of hundreds or thousands of competing and cooperative interactions with other lncRNAs in a complex ecosystem that controls signaling in the nucleus. The expected result would include small but cooperative and cumulative effects on a large number of downstream targets, thus displaying itself as a relatively subtle contribution to one or possibly many complex phenotypes.
Clearly, we are still at the early stages of understanding the full complexity of functional elements encoded in the human genome. However, recent results paint an emerging picture of a very complex regulatory network composed of numerous ncRNAs and their targets. Each ncRNA molecule in this network could potentially regulate hundreds of other RNAs in trans. GWAS variants could function by affecting this overarching layer of ncRNA regulation. In fact, the recent examples of this type of network regulation probably represent the tip of the iceberg of its true significance in complex diseases. Therefore, the time is right to bring the two fields together to fully unravel the underlying relationships.
In addition, the theme of ncRNA in disease brings with it an immediate implication for clinical research. If ncRNAs are as intricately involved in underlying disease mechanisms, as the data reviewed here suggest, then clinical transcriptome sequencing (including ncRNAs) has to come to the forefront of biomedical research. Indeed, first indications suggest that ncRNAs represent excellent biomarkers for cancer diagnostics [46, 47]. Considering the much larger complexity of ncRNAs compared with coding RNAs, the former represent a gigantic untapped potential for clinically relevant biomarkers, in addition to expanding our basic knowledge of molecular events leading to disease.
genome-wide association study
long intergenic non-coding RNAs
long non-coding RNA
National Human Genome Research Institute
single nucleotide polymorphism
very long intergenic non-coding RNA.
We would like to thank Dmitry Shtokalo, Maxim Ri, Denis Antonets, Olga Saik, Tim McCaffrey and Mark Mazaitis for their helpful discussions and help with figure generation.
- Haines JL, Hauser MA, Schmidt S, Scott WK, Olson LM, Gallins P, Spencer KL, Kwan SY, Noureddine M, Gilbert JR, Schnetz-Boutaud N, Agarwal A, Postel EA, Pericak-Vance MA: Complement factor H variant increases the risk of age-related macular degeneration. Science. 2005, 308: 419-421. 10.1126/science.1110359.View ArticlePubMedGoogle Scholar
- A Catalog of Published Genome-Wide Association Studies. http://www.genome.gov/gwastudies.
- Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins FS, Manolio TA: Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci U S A. 2009, 106: 9362-9367. 10.1073/pnas.0903103106.View ArticlePubMedPubMed CentralGoogle Scholar
- Barendse W: The effect of measurement error of phenotypes on genome wide association studies. BMC Genomics. 2011, 12: 232-10.1186/1471-2164-12-232.View ArticlePubMedPubMed CentralGoogle Scholar
- Cardon LR, Palmer LJ: Population stratification and spurious allelic association. Lancet. 2003, 361: 598-604. 10.1016/S0140-6736(03)12520-2.View ArticlePubMedGoogle Scholar
- Clayton DG, Walker NM, Smyth DJ, Pask R, Cooper JD, Maier LM, Smink LJ, Lam AC, Ovington NR, Stevens HE, Nutland S, Howson JM, Faham M, Moorhead M, Jones HB, Falkowski M, Hardenbol P, Willis TD, Todd JA: Population structure, differential bias and genomic control in a large-scale, case-control association study. Nat Genet. 2005, 37: 1243-1246. 10.1038/ng1653.View ArticlePubMedGoogle Scholar
- The Wellcome Trust Case Control Consortium: Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007, 447: 661-678. 10.1038/nature05911.View ArticlePubMed CentralGoogle Scholar
- Khurana E, Fu Y, Colonna V, Mu XJ, Kang HM, Lappalainen T, Sboner A, Lochovsky L, Chen J, Harmanci A, Das J, Abyzov A, Balasubramanian S, Beal K, Chakravarty D, Challis D, Chen Y, Clarke D, Clarke L, Cunningham F, Evani US, Flicek P, Fragoza R, Garrison E, Gibbs R, Gumus ZH, Herrero J, Kitabayashi N, Kong Y, Lage K, et al: Integrative annotation of variants from 1092 humans: application to cancer genomics. Science. 2013, 342: 1235587-10.1126/science.1235587.View ArticlePubMedPubMed CentralGoogle Scholar
- Kapranov P, Cawley SE, Drenkow J, Bekiranov S, Strausberg RL, Fodor SP, Gingeras TR: Large-scale transcriptional activity in chromosomes 21 and 22. Science. 2002, 296: 916-919. 10.1126/science.1068597.View ArticlePubMedGoogle Scholar
- Kapranov P, St Laurent G: Dark Matter RNA: Existence, Function, and Controversy. Front Genet. 2012, 3: 60.PubMedPubMed CentralGoogle Scholar
- Clark MB, Choudhary A, Smith MA, Taft RJ, Mattick JS: The dark matter rises: the expanding world of regulatory RNAs. Essays Biochem. 2013, 54: 1-16. 10.1042/bse0540001.View ArticlePubMedGoogle Scholar
- Mudge JM, Frankish A, Harrow J: Functional transcriptomics in the post-ENCODE era. Genome Res. 2013, 23: 1961-1973. 10.1101/gr.161315.113.View ArticlePubMedPubMed CentralGoogle Scholar
- Clark MB, Amaral PP, Schlesinger FJ, Dinger ME, Taft RJ, Rinn JL, Ponting CP, Stadler PF, Morris KV, Morillon A, Rozowsky JS, Gerstein MB, Wahlestedt C, Hayashizaki Y, Carninci P, Gingeras TR, Mattick JS: The reality of pervasive transcription. PLoS Biol. 2011, 9: e1000625-10.1371/journal.pbio.1000625. discussion e1001102View ArticlePubMedPubMed CentralGoogle Scholar
- Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda N, Oyama R, Ravasi T, Lenhard B, Wells C, Kodzius R, Shimokawa K, Bajic VB, Brenner SE, Batalov S, Forrest AR, Zavolan M, Davis MJ, Wilming LG, Aidinis V, Allen JE, Ambesi-Impiombato A, Apweiler R, Aturaliya RN, Bailey TL, Bansal M, Baxter L, Beisel KW, Bersano T, Bono H, et al: The transcriptional landscape of the mammalian genome. Science. 2005, 309: 1559-1563.View ArticlePubMedGoogle Scholar
- Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE, Kuehn MS, Taylor CM, Neph S, Koch CM, Asthana S, Malhotra A, Adzhubei I, Greenbaum JA, Andrews RM, Flicek P, Boyle PJ, Cao H, Carter NP, Clelland GK, Davis S, Day N, Dhami P, Dillon SC, Dorschner MO, Fiegler H, et al: Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007, 447: 799-816. 10.1038/nature05874.View ArticlePubMedGoogle Scholar
- Okazaki Y, Furuno M, Kasukawa T, Adachi J, Bono H, Kondo S, Nikaido I, Osato N, Saito R, Suzuki H, Yamanaka I, Kiyosawa H, Yagi K, Tomaru Y, Hasegawa Y, Nogami A, Schonbach C, Gojobori T, Baldarelli R, Hill DP, Bult C, Hume DA, Quackenbush J, Schriml LM, Kanapin A, Matsuda H, Batalov S, Beisel KW, Blake JA, Bradt D, et al: Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature. 2002, 420: 563-573. 10.1038/nature01266.View ArticlePubMedGoogle Scholar
- Bertone P, Stolc V, Royce TE, Rozowsky JS, Urban AE, Zhu X, Rinn JL, Tongprasit W, Samanta M, Weissman S, Gerstein M, Snyder M: Global identification of human transcribed sequences with genome tiling arrays. Science. 2004, 306: 2242-2246. 10.1126/science.1103388.View ArticlePubMedGoogle Scholar
- Cheng J, Kapranov P, Drenkow J, Dike S, Brubaker S, Patel S, Long J, Stern D, Tammana H, Helt G, Sementchenko V, Piccolboni A, Bekiranov S, Bailey DK, Ganesh M, Ghosh S, Bell I, Gerhard DS, Gingeras TR: Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science. 2005, 308: 1149-1154. 10.1126/science.1108625.View ArticlePubMedGoogle Scholar
- Kapranov P, Cheng J, Dike S, Nix DA, Duttagupta R, Willingham AT, Stadler PF, Hertel J, Hackermuller J, Hofacker IL, Bell I, Cheung E, Drenkow J, Dumais E, Patel S, Helt G, Ganesh M, Ghosh S, Piccolboni A, Sementchenko V, Tammana H, Gingeras TR: RNA maps reveal new RNA classes and a possible function for pervasive transcription. Science. 2007, 316: 1484-1488. 10.1126/science.1138341.View ArticlePubMedGoogle Scholar
- Djebali S, Davis CA, Merkel A, Dobin A, Lassmann T, Mortazavi A, Tanzer A, Lagarde J, Lin W, Schlesinger F, Xue C, Marinov GK, Khatun J, Williams BA, Zaleski C, Rozowsky J, Roder M, Kokocinski F, Abdelhamid RF, Alioto T, Antoshechkin I, Baer MT, Bar NS, Batut P, Bell K, Bell I, Chakrabortty S, Chen X, Chrast J, Curado J, et al: Landscape of transcription in human cells. Nature. 2012, 489: 101-108. 10.1038/nature11233.View ArticlePubMedPubMed CentralGoogle Scholar
- Kapranov P, St Laurent G, Raz T, Ozsolak F, Reynolds CP, Sorensen PH, Reaman G, Milos P, Arceci RJ, Thompson JF, Triche TJ: The majority of total nuclear-encoded non-ribosomal RNA in a human cell is 'dark matter' un-annotated RNA. BMC Biol. 2010, 8: 149-10.1186/1741-7007-8-149.View ArticlePubMedPubMed CentralGoogle Scholar
- Denoeud F, Kapranov P, Ucla C, Frankish A, Castelo R, Drenkow J, Lagarde J, Alioto T, Manzano C, Chrast J, Dike S, Wyss C, Henrichsen CN, Holroyd N, Dickson MC, Taylor R, Hance Z, Foissac S, Myers RM, Rogers J, Hubbard T, Harrow J, Guigo R, Gingeras TR, Antonarakis SE, Reymond A: Prominent use of distal 5' transcription start sites and discovery of a large number of additional exons in ENCODE regions. Genome Res. 2007, 17: 746-759. 10.1101/gr.5660607.View ArticlePubMedPubMed CentralGoogle Scholar
- Mercer TR, Gerhardt DJ, Dinger ME, Crawford J, Trapnell C, Jeddeloh JA, Mattick JS, Rinn JL: Targeted RNA sequencing reveals the deep complexity of the human transcriptome. Nat Biotechnol. 2012, 30: 99-104.View ArticleGoogle Scholar
- Khalil AM, Guttman M, Huarte M, Garber M, Raj A, Rivea Morales D, Thomas K, Presser A, Bernstein BE, van Oudenaarden A, Regev A, Lander ES, Rinn JL: Many human large intergenic noncoding RNAs associate with chromatin-modifying complexes and affect gene expression. Proc Natl Acad Sci U S A. 2009, 106: 11667-11672. 10.1073/pnas.0904715106.View ArticlePubMedPubMed CentralGoogle Scholar
- Kim TK, Hemberg M, Gray JM, Costa AM, Bear DM, Wu J, Harmin DA, Laptewicz M, Barbara-Haley K, Kuersten S, Markenscoff-Papadimitriou E, Kuhl D, Bito H, Worley PF, Kreiman G, Greenberg ME: Widespread transcription at neuronal activity-regulated enhancers. Nature. 2010, 465: 182-187. 10.1038/nature09033.View ArticlePubMedPubMed CentralGoogle Scholar
- Wahlestedt C: Targeting long non-coding RNA to therapeutically upregulate gene expression. Nat Rev Drug Discov. 2013, 12: 433-446. 10.1038/nrd4018.View ArticlePubMedGoogle Scholar
- Kapranov P, Willingham AT, Gingeras TR: Genome-wide transcription and the implications for genomic organization. Nat Rev Genet. 2007, 8: 413-423. 10.1038/nrg2083.View ArticlePubMedGoogle Scholar
- St Laurent G, Shtokalo D, Tackett M, Yang Z, Eremina T, Wahlestedt C, Inchima SU, Seilheimer B, McCaffrey TA, Kapranov P: Intronic RNAs constitute the major fraction of the non-coding RNA in mammalian cells. BMC Genomics. 2012, 13: 504-10.1186/1471-2164-13-504.View ArticlePubMedPubMed CentralGoogle Scholar
- Gardner EJ, Nizami ZF, Talbot CC, Gall JG: Stable intronic sequence RNA (sisRNA), a new class of noncoding RNA from the oocyte nucleus of Xenopus tropicalis. Genes Dev. 2012, 26: 2550-2559. 10.1101/gad.202184.112.View ArticlePubMedPubMed CentralGoogle Scholar
- Mattick JS: Introns: evolution and function. Curr Opin Genet Dev. 1994, 4: 823-831. 10.1016/0959-437X(94)90066-3.View ArticlePubMedGoogle Scholar
- St Laurent G, Shtokalo D, Dong B, Tackett M, Fan X, Lazorthes S, Nicolas E, Sang N, Triche T, McCaffrey T, Xiao W, Kapranov P: VlincRNAs controlled by retroviral elements are a hallmark of pluripotency and cancer. Genome Biology. 2013, 14: R73-10.1186/gb-2013-14-7-r73.View ArticlePubMedPubMed CentralGoogle Scholar
- Ernst J, Kheradpour P, Mikkelsen TS, Shoresh N, Ward LD, Epstein CB, Zhang X, Wang L, Issner R, Coyne M, Ku M, Durham T, Kellis M, Bernstein BE: Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. 2011, 473: 43-49. 10.1038/nature09906.View ArticlePubMedPubMed CentralGoogle Scholar
- Nissan A, Stojadinovic A, Mitrani-Rosenbaum S, Halle D, Grinbaum R, Roistacher M, Bochem A, Dayanc BE, Ritter G, Gomceli I, Bostanci EB, Akoglu M, Chen YT, Old LJ, Gure AO: Colon cancer associated transcript-1: a novel RNA expressed in malignant and pre-malignant human tissues. Int J Cancer. 2012, 130: 1598-1606. 10.1002/ijc.26170.View ArticlePubMedGoogle Scholar
- Ling H, Spizzo R, Atlasi Y, Nicoloso M, Shimizu M, Redis RS, Nishida N, Gafa R, Song J, Guo Z, Ivan C, Barbarotto E, De Vries I, Zhang X, Ferracin M, Churchman M, van Galen JF, Beverloo BH, Shariati M, Haderk F, Estecio MR, Garcia-Manero G, Patijn GA, Gotley DC, Bhardwaj V, Shureiqi I, Sen S, Multani AS, Welsh J, Yamamoto K, et al: CCAT2, a novel noncoding RNA mapping to 8q24, underlies metastatic progression and chromosomal instability in colon cancer. Genome Res. 2013, 23: 1446-1461. 10.1101/gr.152942.112.View ArticlePubMedPubMed CentralGoogle Scholar
- Pasmant E, Sabbagh A, Vidaud M, Bieche I: ANRIL, a long, noncoding RNA, is an unexpected major hotspot in GWAS. FASEB J. 2011, 25: 444-448. 10.1096/fj.10-172452.View ArticlePubMedGoogle Scholar
- Holdt LM, Hoffmann S, Sass K, Langenberger D, Scholz M, Krohn K, Finstermeier K, Stahringer A, Wilfert W, Beutner F, Gielen S, Schuler G, Gabel G, Bergert H, Bechmann I, Stadler PF, Thiery J, Teupser D: Alu elements in ANRIL non-coding RNA at chromosome 9p21 modulate atherogenic cell functions through trans-regulation of gene networks. PLoS Genet. 2013, 9: e1003588-10.1371/journal.pgen.1003588.View ArticlePubMedPubMed CentralGoogle Scholar
- Gong C, Tang Y, Maquat LE: mRNA-mRNA duplexes that autoelicit Staufen1-mediated mRNA decay. Nat Struct Mol Biol. 2013, 20: 1214-1220. 10.1038/nsmb.2664.View ArticlePubMedPubMed CentralGoogle Scholar
- Wang J, Gong C, Maquat LE: Control of myogenesis by rodent SINE-containing lncRNAs. Genes Dev. 2013, 27: 793-804. 10.1101/gad.212639.112.View ArticlePubMedPubMed CentralGoogle Scholar
- St Laurent G, Savva YA, Kapranov P: Dark matter RNA: an intelligent scaffold for the dynamic regulation of the nuclear information landscape. Front Genet. 2012, 3: 57.View ArticlePubMedPubMed CentralGoogle Scholar
- Gupta RA, Shah N, Wang KC, Kim J, Horlings HM, Wong DJ, Tsai MC, Hung T, Argani P, Rinn JL, Wang Y, Brzoska P, Kong B, Li R, West RB, van de Vijver MJ, Sukumar S, Chang HY: Long non-coding RNA HOTAIR reprograms chromatin state to promote cancer metastasis. Nature. 2010, 464: 1071-1076. 10.1038/nature08975.View ArticlePubMedPubMed CentralGoogle Scholar
- van Dijk M, Thulluru HK, Mulders J, Michel OJ, Poutsma A, Windhorst S, Kleiverda G, Sie D, Lachmeijer AM, Oudejans CB: HELLP babies link a novel lincRNA to the trophoblast cell cycle. J Clin Invest. 2012, 122: 4003-4011. 10.1172/JCI65171.View ArticlePubMedPubMed CentralGoogle Scholar
- Kerin T, Ramanathan A, Rivas K, Grepo N, Coetzee GA, Campbell DB: A noncoding RNA antisense to moesin at 5p14.1 in autism. Sci Transl Med. 2012, 4: 128ra140.View ArticleGoogle Scholar
- Askarian-Amiri ME, Crawford J, French JD, Smart CE, Smith MA, Clark MB, Ru K, Mercer TR, Thompson ER, Lakhani SR, Vargas AC, Campbell IG, Brown MA, Dinger ME, Mattick JS: SNORD-host RNA Zfas1 is a regulator of mammary development and a potential marker for breast cancer. RNA. 2011, 17: 878-891. 10.1261/rna.2528811.View ArticlePubMedPubMed CentralGoogle Scholar
- Khaitan D, Dinger ME, Mazar J, Crawford J, Smith MA, Mattick JS, Perera RJ: The melanoma-upregulated long noncoding RNA SPRY4-IT1 modulates apoptosis and invasion. Cancer Res. 2011, 71: 3852-3862. 10.1158/0008-5472.CAN-10-4460.View ArticlePubMedGoogle Scholar
- Abdelmohsen K, Panda A, Kang MJ, Xu J, Selimyan R, Yoon JH, Martindale JL, De S, Wood WH, Becker KG, Gorospe M: Senescence-associated lncRNAs: senescence-associated long noncoding RNAs. Aging Cell. 2013, 12: 890-900. 10.1111/acel.12115.View ArticlePubMedPubMed CentralGoogle Scholar
- Vergara IA, Erho N, Triche TJ, Ghadessi M, Crisan A, Sierocinski T, Black PC, Buerki C, Davicioni E: Genomic "Dark Matter" in Prostate Cancer: Exploring the Clinical Utility of ncRNA as Biomarkers. Front Genet. 2012, 3: 23.View ArticlePubMedPubMed CentralGoogle Scholar
- Reis EM, Verjovski-Almeida S: Perspectives of Long Non-Coding RNAs in Cancer Diagnostics. Front Genet. 2012, 3: 32.PubMedPubMed CentralGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.