Statistical analysis for aggregated count data in genetic association studies
- Authors
- Choi, Haewon; Jung, Hye Young; Park, Taesung
- Issue Date
- Oct-2016
- Publisher
- Inderscience Publishers
- Keywords
- self-reported studies; association studies; CPD; SNP; calibration model
- Citation
- International Journal of Data Mining and Bioinformatics, v.16, no.1, pp.77 - 91
- Indexed
- SCIE; SCOPUS
- Journal Title
- International Journal of Data Mining and Bioinformatics
- Volume
- 16
- Number
- 1
- Start Page
- 77
- End Page
- 91
- URI
- https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/12610
- DOI
- 10.1504/ijdmb.2016.10000564
- ISSN
- 1748-5673
- Abstract
- In smoking behaviour studies, Cigarette Counts Per Day (CPD) are often reported in aggregated form, such as zero, one pack, two packs, and so on. Analysing such count data is challenging because of reporting bias and the difficulty of estimating an appropriate distribution. In this study, we set out to identify genetic variants, such as Single Nucleotide Polymorphisms (SNPs), that correlate with aggregated count data such as CPD. We first reviewed the existing approaches, in which the aggregated count data is the dependent variable and the SNP is an ordinal independent variable. We then considered a calibration model in which the SNP is the ordinal dependent variable and the aggregated count data is the independent variable. This calibration modelling approach is robust to the distributional assumptions about the count data. We applied the approach to CPD data from 4183 male samples in the Korean Association Resource project. Through simulation studies, we compared the performance of the proposed method with that of competing approaches.
- Appears in Collections
- COLLEGE OF SCIENCE AND CONVERGENCE TECHNOLOGY > ERICA Department of Mathematical Data Science > 1. Journal Articles
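
The abstract describes a calibration model with the SNP as the ordinal dependent variable and the aggregated count as the independent variable, but this record does not give the exact model form. The following is only a minimal sketch of one common choice, a cumulative-logit (proportional-odds) model, and is not taken from the paper. It assumes the genotype is coded 0, 1, 2 (minor allele count). The function names, cutpoints, slope values, and toy data are all hypothetical and chosen purely for illustration.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def genotype_probs(cpd, cutpoints, beta):
    """Return [P(G=0), P(G=1), P(G=2)] given a CPD value.

    Assumed model (not confirmed by the source):
        logit P(G <= j | cpd) = cutpoints[j] - beta * cpd,  j = 0, 1
    with cutpoints[0] < cutpoints[1]. A positive beta shifts
    probability toward higher genotype codes as CPD increases.
    """
    c0 = sigmoid(cutpoints[0] - beta * cpd)
    c1 = sigmoid(cutpoints[1] - beta * cpd)
    return [c0, c1 - c0, 1.0 - c1]


def log_likelihood(genotypes, cpds, cutpoints, beta):
    """Log-likelihood of observed genotypes with CPD as the covariate."""
    return sum(
        math.log(genotype_probs(x, cutpoints, beta)[g])
        for g, x in zip(genotypes, cpds)
    )


# Invented toy data: CPD heaped at pack multiples (0, 20, 40 cigarettes).
cpds = [0, 0, 20, 20, 40, 40]
genotypes = [0, 0, 1, 0, 2, 1]
cutpoints = (0.5, 2.0)  # hypothetical values, not estimates

ll_null = log_likelihood(genotypes, cpds, cutpoints, beta=0.0)
ll_alt = log_likelihood(genotypes, cpds, cutpoints, beta=0.05)
print(ll_null, ll_alt)
```

Because CPD enters only as a covariate on the right-hand side, this sketch never specifies a distribution for the heaped counts, which is the sense in which the abstract calls the calibration approach robust. In practice the cutpoints and slope would be estimated by maximum likelihood rather than fixed by hand.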