Predicting Long Non-coding RNA-disease Associations using Multiple Features and Deep Learning

  • Van Tinh Nguyen
  • Dang Hung Tran
Keywords: lncRNA-disease associations prediction, weighted K nearest known neighbors, singular value decomposition, feature extraction, deep learning

Abstract

Various long non-coding RNAs (lncRNAs) have been shown to play crucial roles in biological processes in the human body, including cell cycle control, transcription, translation, epigenetic regulation, splicing, differentiation, and immune response. Discovering lncRNA-disease associations improves the understanding of complex human diseases at the molecular level and supports their diagnosis, treatment, and prevention. Discovering and verifying lncRNA-disease associations through biological experiments is costly, laborious, and time-consuming. It is therefore crucial to develop computational methods that predict lncRNA-disease associations and save time and resources. In this paper, we propose a new method to predict lncRNA-disease associations using multiple features and deep learning. Our method uses a weighted K-nearest known neighbors algorithm as a pre-processing step to eliminate the impact of the data sparsity problem. It then combines the linear and non-linear features extracted by singular value decomposition and deep learning techniques, respectively, to obtain better prediction performance. Under leave-one-out cross-validation (LOOCV) experiments, the proposed method achieves strong performance, with best AUC and AUPR values of 0.9702 and 0.8814, respectively. It outperforms the state-of-the-art SDLDA and NCPLDA methods on both the AUC and AUPR metrics, and could be considered a powerful tool for predicting lncRNA-disease associations.
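
The abstract names a weighted K-nearest known neighbors pre-processing step but does not give its formula. The sketch below illustrates the idea under the commonly used formulation of that algorithm: each entity's association profile is estimated from its K most similar entities that have at least one known association, with weights decaying by a factor eta per rank, and the estimates from the lncRNA side and the disease side are averaged. It is an assumption that the paper uses this formulation. The function name `wknkn`, the parameters `K` and `eta`, the choice to exclude the query entity from its own neighbors, and the toy similarity values are all hypothetical and are not taken from the paper.

```python
def _neighbor_profiles(M, S, K, eta):
    """Estimate each row of M from its K nearest *known* neighbors under S."""
    n, m = len(M), len(M[0])
    # "Known" entities are those with at least one recorded association.
    known = [i for i in range(n) if any(M[i])]
    out = [[0.0] * m for _ in range(n)]
    for q in range(n):
        # Assumption: the query entity is not its own neighbor.
        neigh = sorted((i for i in known if i != q),
                       key=lambda i: -S[q][i])[:K]
        norm = sum(S[q][i] for i in neigh)
        if norm == 0:
            continue  # no usable neighbor: leave the row at zero
        for rank, i in enumerate(neigh):
            w = (eta ** rank) * S[q][i]  # weight decays with neighbor rank
            for j in range(m):
                out[q][j] += w * M[i][j]
        out[q] = [v / norm for v in out[q]]
    return out


def wknkn(Y, S_l, S_d, K=2, eta=0.5):
    """Fill a sparse lncRNA x disease matrix Y with neighbor-based scores."""
    # lncRNA-side estimate: rows of Y are lncRNA profiles.
    Y_l = _neighbor_profiles(Y, S_l, K, eta)
    # Disease-side estimate: work on the transpose, then transpose back.
    Y_T = [list(col) for col in zip(*Y)]
    Y_d = [list(col) for col in zip(*_neighbor_profiles(Y_T, S_d, K, eta))]
    # Average both estimates; never lower an entry that is already known.
    return [[max(Y[i][j], (Y_l[i][j] + Y_d[i][j]) / 2)
             for j in range(len(Y[0]))] for i in range(len(Y))]


# Toy data (made up): lncRNA 1 and disease 2 have no known associations.
Y = [[1, 0, 0],
     [0, 0, 0],
     [0, 1, 0],
     [1, 1, 0]]
S_l = [[1.0, 0.8, 0.1, 0.6],
       [0.8, 1.0, 0.2, 0.5],
       [0.1, 0.2, 1.0, 0.7],
       [0.6, 0.5, 0.7, 1.0]]
S_d = [[1.0, 0.4, 0.3],
       [0.4, 1.0, 0.9],
       [0.3, 0.9, 1.0]]

Y_filled = wknkn(Y, S_l, S_d, K=2, eta=0.5)
for row in Y_filled:
    print([round(v, 4) for v in row])
```

In this toy run the all-zero row and the all-zero column receive graded scores between 0 and 1, while the known associations keep their value of 1. This is how such a step can soften sparsity before feature extraction.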

Published
2022-09-30