CLC number: Q522

On-line Access: 2024-08-27

Received: 2023-10-17

Revision Accepted: 2024-05-08

Crosschecked: 2019-04-28

ORCID: Liang-Jiang Wang, https://orcid.org/0000-0002-6316-7962

Journal of Zhejiang University SCIENCE B 2019 Vol.20 No.6 P.476-487

http://doi.org/10.1631/jzus.B1900162


Genomic data mining for functional annotation of human long noncoding RNAs


Author(s):  Brian L. Gudenas, Jun Wang, Shu-Zhen Kuang, An-Qi Wei, Steven B. Cogill, Liang-Jiang Wang

Affiliation(s):  Department of Genetics and Biochemistry, Clemson University, Clemson, South Carolina 29634, USA

Corresponding email(s):   liangjw@clemson.edu

Key Words:  Long noncoding RNA, Functional annotation, Genomic data mining, Machine learning


Brian L. Gudenas, Jun Wang, Shu-Zhen Kuang, An-Qi Wei, Steven B. Cogill, Liang-Jiang Wang. Genomic data mining for functional annotation of human long noncoding RNAs[J]. Journal of Zhejiang University Science B, 2019, 20(6): 476-487.

@article{Gudenas2019,
title="Genomic data mining for functional annotation of human long noncoding RNAs",
author="Brian L. Gudenas and Jun Wang and Shu-Zhen Kuang and An-Qi Wei and Steven B. Cogill and Liang-Jiang Wang",
journal="Journal of Zhejiang University Science B",
volume="20",
number="6",
pages="476-487",
year="2019",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.B1900162"
}

%0 Journal Article
%T Genomic data mining for functional annotation of human long noncoding RNAs
%A Brian L. Gudenas
%A Jun Wang
%A Shu-Zhen Kuang
%A An-Qi Wei
%A Steven B. Cogill
%A Liang-Jiang Wang
%J Journal of Zhejiang University SCIENCE B
%V 20
%N 6
%P 476-487
%@ 1673-1581
%D 2019
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.B1900162

TY - JOUR
T1 - Genomic data mining for functional annotation of human long noncoding RNAs
A1 - Brian L. Gudenas
A1 - Jun Wang
A1 - Shu-Zhen Kuang
A1 - An-Qi Wei
A1 - Steven B. Cogill
A1 - Liang-Jiang Wang
JO - Journal of Zhejiang University Science B
VL - 20
IS - 6
SP - 476
EP - 487
SN - 1673-1581
Y1 - 2019
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.B1900162
ER -


Abstract: 
Life may have begun in an RNA world, which is supported by increasing evidence of the vital role that RNAs perform in biological systems. In the human genome, most genes actually do not encode proteins; they are noncoding RNA genes. The largest class of noncoding genes is known as long noncoding RNAs (lncRNAs), which are transcripts greater in length than 200 nucleotides, but with no protein-coding capacity. While some lncRNAs have been demonstrated to be key regulators of gene expression and 3D genome organization, most lncRNAs are still uncharacterized. We thus propose several data mining and machine learning approaches for the functional annotation of human lncRNAs by leveraging the vast amount of data from genetic and genomic studies. Recent results from our studies and those of other groups indicate that genomic data mining can give insights into lncRNA functions and provide valuable information for experimental studies of candidate lncRNAs associated with human disease.
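The working definition given in the abstract (transcripts longer than 200 nucleotides with no protein-coding capacity) can be illustrated with a minimal Python sketch. This is not the authors' method. The 200-nucleotide threshold comes from the abstract; the open reading frame (ORF) cutoff of 100 codons, the forward-strand-only scan, and the function names `longest_orf_codons` and `is_candidate_lncrna` are assumptions made here purely for illustration. Real annotation pipelines assess coding potential with far richer evidence.

```python
MIN_LNCRNA_LENGTH = 200  # from the abstract: lncRNAs are longer than 200 nt
MAX_ORF_CODONS = 100     # assumption: illustrative cutoff, not taken from the article

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_codons(seq):
    """Return the longest forward-strand ORF in codons.

    An ORF here starts at ATG and must end at an in-frame stop codon;
    the count includes the ATG and excludes the stop codon.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = 0
    for frame in range(3):
        in_orf = False
        run = 0  # codons counted since the current ATG
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if not in_orf:
                if codon == "ATG":
                    in_orf = True
                    run = 1
            elif codon in STOP_CODONS:
                best = max(best, run)
                in_orf = False
            else:
                run += 1
    return best


def is_candidate_lncrna(seq):
    """Crude filter: long enough, and no long stop-terminated ORF."""
    return len(seq) > MIN_LNCRNA_LENGTH and longest_orf_codons(seq) < MAX_ORF_CODONS


if __name__ == "__main__":
    noncoding_like = "ACGU" * 60                   # 240 nt, contains no ATG
    coding_like = "ATG" + "GCC" * 120 + "TAA"      # 366 nt, one long ORF
    print(is_candidate_lncrna(noncoding_like))
    print(is_candidate_lncrna(coding_like))
```

A filter like this only mirrors the length-and-coding-capacity definition; it says nothing about function, which is the problem the data mining and machine learning approaches in the article address.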

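Genomic data mining for functional annotation of human long noncoding RNAs

Summary: Growing evidence shows that RNAs play important roles in biological systems, and these findings support the hypothesis that life originated from RNA. In the human genome, most genes do not encode proteins and are called noncoding RNA genes. Long noncoding RNAs (lncRNAs) are the largest class among them, with transcripts longer than 200 nucleotides. Although some lncRNAs have been shown to be important elements regulating gene expression and 3D genome structure, most lncRNAs have not yet been studied or annotated. Using large amounts of genomic data, our group proposes several data mining and machine learning methods for the functional annotation of human lncRNAs. Recent results from our group and from other groups in the field show that genomic data mining can deepen the understanding of lncRNA functions and provide important information for experimental studies of disease-associated lncRNAs.
Key words: Long noncoding RNA (lncRNA); Functional annotation; Genomic data mining; Machine learning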

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2024 Journal of Zhejiang University-SCIENCE