A reliable, easy-to-operate screening pipeline for disease-related noncoding RNA regulatory axes is urgently needed. To address this, we designed a hybrid pipeline, disease-related lncRNA-miRNA-mRNA regulatory axis prediction from multiomics (DLRAPom), which identifies risk biomarkers and disease-related lncRNA-miRNA-mRNA regulatory axes by adding a novel machine learning model to conventional analysis and combining the results with experimental validation. The pipeline consists of four parts: selecting hub
biomarkers by conventional bioinformatics analysis, discovering the most essential protein-coding biomarkers with a novel machine learning model, extracting the key lncRNA-miRNA-mRNA axis, and validating it experimentally. Our study is the first to propose a pipeline that predicts the interactions among
lncRNA, miRNA, and mRNA by combining weighted gene co-expression network analysis (WGCNA) and XGBoost. Unlike previously reported methods, DLRAPom uses an Optimized XGBoost model that reduces overfitting on multiomics data and thereby improves the generalization ability of the overall model in integrated multiomics analysis. Applying the pipeline to
gestational diabetes mellitus (GDM), we predicted nine risk protein-coding biomarkers and several potential lncRNA-miRNA-mRNA regulatory axes, all of which correlated with GDM. Among these axes, MALAT1/hsa-miR-144-3p/IRS1 was predicted to be the key axis and was identified as being associated with GDM for the first time. In short, DLRAPom is a flexible pipeline that can contribute to research on the molecular pathogenesis of diseases by predicting potential disease-related noncoding RNA regulatory networks and providing promising candidates for functional studies.