CLC number: Q39
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2018-11-08
Hui An, Chang-shuai Wei, Oliver Wang, Da-hui Wang, Liang-wen Xu, Qing Lu, Cheng-yin Ye. An ensemble-based likelihood ratio approach for family-based genomic risk prediction[J]. Journal of Zhejiang University Science B, 2018, 19(12): 935-947.
@article{title="An ensemble-based likelihood ratio approach for family-based genomic risk prediction",
author="Hui An, Chang-shuai Wei, Oliver Wang, Da-hui Wang, Liang-wen Xu, Qing Lu, Cheng-yin Ye",
journal="Journal of Zhejiang University Science B",
volume="19",
number="12",
pages="935-947",
year="2018",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.B1800162"
}
%0 Journal Article
%T An ensemble-based likelihood ratio approach for family-based genomic risk prediction
%A Hui An
%A Chang-shuai Wei
%A Oliver Wang
%A Da-hui Wang
%A Liang-wen Xu
%A Qing Lu
%A Cheng-yin Ye
%J Journal of Zhejiang University SCIENCE B
%V 19
%N 12
%P 935-947
%@ 1673-1581
%D 2018
%I Zhejiang University Press & Springer
%R 10.1631/jzus.B1800162
TY - JOUR
T1 - An ensemble-based likelihood ratio approach for family-based genomic risk prediction
A1 - Hui An
A1 - Chang-shuai Wei
A1 - Oliver Wang
A1 - Da-hui Wang
A1 - Liang-wen Xu
A1 - Qing Lu
A1 - Cheng-yin Ye
JO - Journal of Zhejiang University Science B
VL - 19
IS - 12
SP - 935
EP - 947
SN - 1673-1581
Y1 - 2018
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.B1800162
ER -
Abstract: Objective: Family-based design is one of the most popular designs in genetic research and is well recognized for advantages such as robustness against population stratification and admixture. With vast amounts of genetic data collected from family-based studies, there is great interest in studying the role of genetic markers in risk prediction. This study aims to develop a new statistical approach for family-based risk prediction analysis that improves prediction accuracy over existing methods based on family history. Methods: We propose an ensemble-based likelihood ratio (ELR) approach, Fam-ELR, for family-based genomic risk prediction. Fam-ELR incorporates a clustered receiver operating characteristic (ROC) curve method to account for correlations among family samples, and uses a computationally efficient tree-assembling procedure for variable selection and model building. Results: In simulations, Fam-ELR is robust across various underlying disease models and pedigree structures, and it outperforms two existing family-based risk prediction methods. In a real-data application to a family-based genome-wide dataset of conduct disorder, Fam-ELR demonstrates its ability to integrate potential risk predictors and interactions into the model for improved accuracy, especially at the genome-wide level. Conclusions: Compared with existing approaches, such as the genetic risk-score approach, Fam-ELR can incorporate genetic variants with small or moderate marginal effects, together with their interactions, into an improved risk prediction model. It is therefore a robust and useful approach for high-dimensional family-based risk prediction, especially for complex diseases whose etiology is unknown or poorly understood.
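To make the two ingredients named in the abstract concrete, the following is a minimal sketch of a genotype likelihood ratio scored by an empirical ROC area. It is not the authors' Fam-ELR implementation: the function names, the add-one smoothing, and the toy data are assumptions made for illustration, and the sketch treats samples as independent, whereas Fam-ELR uses a clustered ROC method to handle correlated family members and a tree-assembling procedure to select markers.

```python
from collections import Counter

def likelihood_ratios(genotypes, labels):
    """Map each genotype value g to P(g | case) / P(g | control)."""
    cases = Counter(g for g, y in zip(genotypes, labels) if y == 1)
    controls = Counter(g for g, y in zip(genotypes, labels) if y == 0)
    n_case, n_ctrl = sum(cases.values()), sum(controls.values())
    levels = set(cases) | set(controls)
    k = len(levels)
    # Add-one smoothing (an assumption of this sketch) avoids division by zero
    # when a genotype is never observed among controls.
    return {
        g: ((cases[g] + 1) / (n_case + k)) / ((controls[g] + 1) / (n_ctrl + k))
        for g in levels
    }

def empirical_auc(scores, labels):
    """Mann-Whitney estimate of the area under the ROC curve; ties count 1/2."""
    case_scores = [s for s, y in zip(scores, labels) if y == 1]
    ctrl_scores = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > d else 0.5 if c == d else 0.0
        for c in case_scores
        for d in ctrl_scores
    )
    return wins / (len(case_scores) * len(ctrl_scores))

# Hypothetical toy data: minor-allele counts at one marker and disease status.
genotypes = [0, 0, 0, 1, 1, 1, 2, 2, 1, 0]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]

lr = likelihood_ratios(genotypes, labels)
scores = [lr[g] for g in genotypes]  # each subject is scored by its genotype's LR
auc = empirical_auc(scores, labels)
print(lr, auc)
```

Because the likelihood ratios are estimated and evaluated on the same samples, the resulting area is optimistic; a fair assessment would score held-out data, and with pedigree data the within-family correlation would also need to be accounted for.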
List of electronic supplementary materials
Table S1 Significant interaction effects identified by logistic regression in the genome-wide prediction