Abstract |
Omics-based biomedical learning frequently relies on data of high dimensionality (up to thousands) and low sample size (dozens to hundreds), which challenges efficient deep learning (DL) algorithms, particularly in low-sample omics investigations. Here, a novel unsupervised feature aggregation tool, AggMap, was developed to Aggregate and Map omics features into multi-channel, 2D, spatially correlated, image-like feature maps (Fmaps) based on their intrinsic correlations. AggMap exhibits strong feature reconstruction capabilities on a randomized benchmark dataset, outperforming existing methods. With AggMap multi-channel Fmaps as inputs, newly developed multi-channel DL AggMapNet models outperformed state-of-the-art machine learning models on 18 low-sample omics benchmark tasks. AggMapNet exhibited better robustness in learning from noisy data and in disease classification. The AggMapNet explainable module, Simply-explainer, identified key metabolites and proteins for COVID-19 detection and severity prediction. The unsupervised AggMap algorithm, with its good feature restructuring abilities, combined with the supervised explainable AggMapNet architecture, establishes a pipeline for enhanced learning and interpretability of low-sample omics data.
|
Authors | Wan Xiang Shen, Yu Liu, Yan Chen, Xian Zeng, Ying Tan, Yu Yang Jiang, Yu Zong Chen |
Journal | Nucleic acids research
(Nucleic Acids Res)
Vol. 50
Issue 8
Pg. e45
(May 6, 2022)
ISSN: 1362-4962 [Electronic] England |
PMID | 35100418
(Publication Type: Journal Article, Research Support, Non-U.S. Gov't)
|
Copyright | © The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research. |
Topics |
- Algorithms
- COVID-19
- Deep Learning
- Humans
- Machine Learning
- Proteins
|