Access to
RNA secondary structure is a prerequisite for understanding and mastering
RNA function.
RNA secondary structures play an important role in cells: they can cause or contribute to
neurological disorders, and they have applications in the medical field. However, experimental methods for obtaining
RNA secondary structure are costly, laborious, and not universally applicable. Although computational methods can predict
the secondary structure of short-sequence RNAs with comparatively good accuracy, they cannot predict the structure of long-sequence RNAs or pseudoknots, which is the bottleneck of
RNA secondary structure prediction at present. In recent years, researchers have attempted to use deep learning algorithms to predict
RNA secondary structure and have achieved some success. However, because little secondary structure data is available for long-sequence RNAs, deep learning methods predict RNA secondary structure across species with low accuracy. Similarly,
RNA structures with pseudoknots are highly complex, and insufficient data causes deep learning algorithms to struggle to predict the secondary structure of
RNA containing pseudoknots. The
RNA data are encoded into grayscale images by a unique encoding method based on the real
RNA secondary structure and sequence information. The image data are then augmented to increase the amount of
RNA data, which addresses the shortage of training data for long sequences and for
RNA secondary structures with pseudoknots in current deep learning methods and provides a sound data foundation for deep learning. This paper proposes a multi-scale feature fusion conditional deep convolutional generative adversarial network (MSFF-CDCGAN), based on an improved conditional deep convolutional generative adversarial network (CDCGAN), to predict
RNA secondary structure. The experimental results show that the MSFF-CDCGAN model predicts the structures of long-sequence RNAs and pseudoknots more accurately than traditional prediction methods. This paper introduces generative adversarial networks (GANs) to
RNA secondary structure prediction for the first time. It uses a unique image encoding approach to expand the original
RNA data set, thus transforming the structure prediction problem into an image analysis problem and addressing the bottleneck in
RNA secondary structure prediction.
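
To make the idea of encoding a sequence and its known structure as a grayscale image concrete, the following is a minimal sketch of one possible scheme. It is not the encoding used in this paper, whose details are not given here. The grey levels in `BASE_LEVEL` and `PAIR_LEVEL`, the function names, and the choice of an L x L layout (nucleotides on the diagonal, base pairs off the diagonal) are all assumptions made for illustration. The structure is assumed to be given in extended dot-bracket notation, where a pseudoknot is written with a second bracket type.

```python
# Illustrative sketch only: the grey levels and the L x L layout below are
# assumptions, not the encoding described in the paper.

# Hypothetical grey levels for the four nucleotides.
BASE_LEVEL = {"A": 50, "C": 100, "G": 150, "U": 200}
# Hypothetical grey level marking a base pair.
PAIR_LEVEL = 255
# Closing bracket -> matching opening bracket (extended dot-bracket notation).
BRACKETS = {")": "(", "]": "[", "}": "{", ">": "<"}


def parse_pairs(structure):
    """Return sorted (i, j) base pairs from extended dot-bracket notation.

    Each bracket type has its own stack, so crossing pairs (pseudoknots)
    written with different bracket types are parsed correctly.
    """
    stacks = {opener: [] for opener in BRACKETS.values()}
    pairs = []
    for j, symbol in enumerate(structure):
        if symbol in stacks:
            stacks[symbol].append(j)
        elif symbol in BRACKETS:
            stack = stacks[BRACKETS[symbol]]
            if not stack:
                raise ValueError(f"unmatched '{symbol}' at position {j}")
            pairs.append((stack.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol '{symbol}' at position {j}")
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return sorted(pairs)


def encode_grayscale(sequence, structure):
    """Encode a sequence and its structure as an L x L grid of grey levels.

    The diagonal holds the nucleotide identity; cells (i, j) and (j, i)
    are set to PAIR_LEVEL when positions i and j are paired.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    n = len(sequence)
    image = [[0] * n for _ in range(n)]
    for i, base in enumerate(sequence.upper()):
        image[i][i] = BASE_LEVEL[base]
    for i, j in parse_pairs(structure):
        image[i][j] = PAIR_LEVEL
        image[j][i] = PAIR_LEVEL
    return image


# A toy 11-nucleotide example containing a pseudoknot ("[" and "]" cross "(" and ")").
toy_sequence = "GGAGCUCCAGC"
toy_structure = "((.[[.)).]]"
toy_image = encode_grayscale(toy_sequence, toy_structure)
```

Under this layout, a fixed-size input for a convolutional model would still require padding or resizing the grid, and the data expansion step described above is not covered by the sketch.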