Machine learning-based quantification for disease uncertainty increases the statistical power of genetic association studies
Open access
- Authors
- Park, Jun Young; Lee, Jang Jae; Lee, Younghwa; Lee, Dongsoo; Gim, Jungsoo; Farrer, Lindsay; Lee, Kun Ho; Won, Sungho
- Issue Date
- Sep-2023
- Publisher
- Oxford University Press
- Citation
- Bioinformatics, v.39, no.9
- Journal Title
- Bioinformatics
- Volume
- 39
- Number
- 9
- URI
- http://scholarworks.bwise.kr/kbri/handle/2023.sw.kbri/965
- DOI
- 10.1093/bioinformatics/btad534
- ISSN
- 1367-4803
- Abstract
- BACKGROUND: Allowing for increasingly large samples is key to identifying associations of genetic variants with Alzheimer's disease (AD) in genome-wide association studies (GWAS). Accordingly, we aimed to develop a method that incorporates patients with mild cognitive impairment (MCI) and unknown cognitive status in GWAS using a machine learning-based AD prediction model. RESULTS: Simulation analyses showed that the weighting imputed phenotypes (WIP) method increased statistical power compared to ordinary logistic regression using only AD cases and controls. Applied to real-world data, the penalized logistic method had the highest AUC (0.96) for AD prediction, and the WIP method performed well in terms of power. We identified an association (p < 5.0×10⁻⁸) of AD with several variants in the APOE region and rs143625563 in LMX1A. Our method, which allows the inclusion of individuals with MCI, improves the statistical power of GWAS for AD. We discovered a novel association with LMX1A. AVAILABILITY AND IMPLEMENTATION: Simulation codes can be accessed at https://github.com/Junkkkk/wGEE_GWAS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. © The Author(s) 2023. Published by Oxford University Press.
- Files in This Item
- There are no files associated with this item.
- Appears in Collections
- ETC > 1. Journal Articles