CLC number: Q39

On-line Access: 2018-12-03

Received: 2018-03-14

Revision Accepted: 2018-07-12

Crosschecked: 2018-11-08


Journal of Zhejiang University SCIENCE B 2018 Vol.19 No.12 P.935-947


An ensemble-based likelihood ratio approach for family-based genomic risk prediction

Author(s):  Hui An, Chang-shuai Wei, Oliver Wang, Da-hui Wang, Liang-wen Xu, Qing Lu, Cheng-yin Ye

Affiliation(s):  Department of Health Management, School of Medicine, Hangzhou Normal University, Hangzhou 310036, China

Corresponding email(s):   yechengyin@hznu.edu.cn

Key Words:  Family-based study, Genetic risk prediction, High-dimensional data

Hui An, Chang-shuai Wei, Oliver Wang, Da-hui Wang, Liang-wen Xu, Qing Lu, Cheng-yin Ye. An ensemble-based likelihood ratio approach for family-based genomic risk prediction[J]. Journal of Zhejiang University Science B, 2018, 19(12): 935-947.

Publisher: Zhejiang University Press & Springer

ISSN: 1673-1581

DOI: 10.1631/jzus.B1800162

Objective: As one of the most popular designs in genetic research, the family-based design is well recognized for its advantages, such as robustness against population stratification and admixture. With vast amounts of genetic data collected from family-based studies, there is great interest in studying the role of genetic markers in risk prediction. This study aims to develop a new statistical approach for family-based risk prediction analysis that improves prediction accuracy over existing methods based on family history.

Methods: We propose an ensemble-based likelihood ratio (ELR) approach, Fam-ELR, for family-based genomic risk prediction. Fam-ELR incorporates a clustered receiver operating characteristic (ROC) curve method to account for correlations among family samples, and uses a computationally efficient tree-assembling procedure for variable selection and model building.

Results: In simulations, Fam-ELR is robust across various underlying disease models and pedigree structures, and attains better performance than two existing family-based risk prediction methods. In a real-data application to a family-based genome-wide dataset of conduct disorder, Fam-ELR demonstrates its ability to integrate potential risk predictors and their interactions into the model for improved accuracy, especially at the genome-wide level.

Conclusions: Compared with existing approaches, such as the genetic risk-score approach, Fam-ELR is able to incorporate genetic variants with small or moderate marginal effects, together with their interactions, into an improved risk prediction model. It is therefore a robust and useful approach for high-dimensional family-based risk prediction, especially for complex diseases whose etiology is unknown or poorly understood.
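To make the likelihood ratio and ROC ideas in the abstract concrete, the following is a minimal sketch and not the authors' Fam-ELR implementation. It assumes a single marker coded 0/1/2, a binary disease status, and unrelated samples; the function names `likelihood_ratios` and `empirical_auc` and the toy data are hypothetical. It ignores within-family correlation, which the clustered ROC method named in the abstract is meant to account for, and it omits the tree-assembling step entirely.

```python
from collections import Counter


def likelihood_ratios(genotypes, status):
    """Estimate LR(g) = P(g | case) / P(g | control) for each genotype value."""
    cases = Counter(g for g, d in zip(genotypes, status) if d == 1)
    controls = Counter(g for g, d in zip(genotypes, status) if d == 0)
    n_case, n_ctrl = sum(cases.values()), sum(controls.values())
    lrs = {}
    for g in set(genotypes):
        p_case = cases[g] / n_case
        p_ctrl = controls[g] / n_ctrl
        # A genotype never observed in controls gets an infinite ratio.
        lrs[g] = p_case / p_ctrl if p_ctrl > 0 else float("inf")
    return lrs


def empirical_auc(scores, status):
    """Mann-Whitney estimate of the area under the ROC curve (ties count 0.5)."""
    case_scores = [s for s, d in zip(scores, status) if d == 1]
    ctrl_scores = [s for s, d in zip(scores, status) if d == 0]
    wins = 0.0
    for c in case_scores:
        for k in ctrl_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(ctrl_scores))


# Hypothetical toy data: genotype at one marker and disease status (1 = case).
genotypes = [0, 1, 2, 2, 1, 0, 0, 1]
status = [0, 0, 1, 1, 1, 0, 0, 1]

lrs = likelihood_ratios(genotypes, status)
scores = [lrs[g] for g in genotypes]  # score each subject by its genotype's LR
auc = empirical_auc(scores, status)
print(lrs, auc)
```

Scoring subjects by the likelihood ratio of their genotype and then measuring the AUC of that score is one simple way to judge how well a marker separates cases from controls; a method for family data would additionally need to handle the dependence between relatives when assessing that AUC.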







List of electronic supplementary materials

Table S1 Significant interaction effects identified by logistic regression in the genome-wide prediction
