Abstract |
The analysis of whole-genome sequencing studies is challenging due to the large number of rare variants in noncoding regions and the lack of natural units for testing. We propose a statistical method to detect and localize rare and common risk variants in whole-genome sequencing studies based on a recently developed knockoff framework. The method can (1) prioritize causal variants over associations due to linkage disequilibrium, thereby improving interpretability; (2) help distinguish the signal due to rare variants from shadow effects of significant common variants nearby; (3) integrate multiple knockoffs for improved power, stability, and reproducibility; and (4) flexibly incorporate state-of-the-art and future association tests to achieve the benefits described here. In applications to whole-genome sequencing data from the Alzheimer's Disease Sequencing Project (ADSP) and COPDGene samples from the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program, we show that our method, compared with conventional association tests, can lead to substantially more discoveries.
|
Authors | Zihuai He, Linxi Liu, Chen Wang, Yann Le Guen, Justin Lee, Stephanie Gogarten, Fred Lu, Stephen Montgomery, Hua Tang, Edwin K Silverman, Michael H Cho, Michael Greicius, Iuliana Ionita-Laza |
Journal | Nature Communications (Nat Commun), Vol. 12, Issue 1, Pg. 3152 (May 25, 2021). ISSN: 2041-1723 [Electronic], England |
PMID | 34035245
(Publication Type: Evaluation Study, Journal Article, Research Support, N.I.H., Extramural, Research Support, N.I.H., Intramural, Research Support, Non-U.S. Gov't)
|
Topics |
- Algorithms
- Causality
- Computer Simulation
- Data Interpretation, Statistical
- Datasets as Topic
- Genetic Loci
- Genetic Predisposition to Disease
- Genome, Human
- Genome-Wide Association Study (methods)
- Humans
- Linkage Disequilibrium
- Markov Chains
- Models, Genetic
- Polymorphism, Single Nucleotide
- Reproducibility of Results
- Whole Genome Sequencing (methods)
|