A novel one-class classification approach to accurately predict disease-gene association in acute myeloid leukemia cancer

Authors: Akram Vasighizaker aff001; Alok Sharma aff002; Abdollah Dehzangi aff007
Author affiliations: Electrical & Computer Engineering Department, Tarbiat Modares University, Tehran, Iran aff001; Institute for Integrated and Intelligent Systems, Griffith University, Brisbane, Queensland, Australia aff002; Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University (TMDU), Tokyo, Japan aff003; Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan aff004; School of Engineering and Physics, Faculty of Science Technology and Environment, University of the South Pacific, Suva, Fiji aff005; CREST, JST, Tokyo, Japan aff006; Department of Computer Science, Morgan State University, Baltimore, Maryland, United States of America aff007
Published in: PLoS ONE 14(12)
Category: Research Article
doi: 10.1371/journal.pone.0226115


Identifying disease-causing genes is an important step towards drug design and drug discovery. In disease gene identification and classification, the main aim is to identify disease genes, whereas identifying non-disease genes is of little or no significance. Hence, this task can be framed as a one-class classification problem. Existing machine learning methods typically treat known disease genes as the positive training set and unknown genes as negative samples in order to build a binary classification model. Here we propose a new One-class Classification Support Vector Machines (OCSVM) method to precisely classify candidate disease genes. Our aim is to build a model that focuses on detecting known disease-causing genes in order to increase sensitivity and precision. We investigate the impact of our proposed model using a benchmark consisting of a gene expression dataset for Acute Myeloid Leukemia (AML). Compared with traditional methods, our experimental results show the superiority of the proposed method in terms of precision, recall, and F-measure for detecting disease-causing genes for AML. The OCSVM code and our extracted AML benchmark are publicly available at: https://github.com/imandehzangi/OCSVM.
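For readers unfamiliar with the three evaluation measures named above, the following is a minimal sketch of how precision, recall, and F-measure can be computed for a set of predicted disease genes. It is not taken from the authors' repository: the function name `precision_recall_f` and the gene identifiers (`gene_a`, `gene_b`, and so on) are hypothetical placeholders, and the numbers are a toy illustration rather than results from the paper.

```python
def precision_recall_f(predicted, actual):
    """Return (precision, recall, F-measure) for two sets of gene identifiers.

    predicted: genes the classifier labels as disease genes
    actual:    genes known to be disease genes
    """
    true_pos = len(predicted & actual)    # predicted and truly disease-related
    false_pos = len(predicted - actual)   # predicted but not known disease genes
    false_neg = len(actual - predicted)   # known disease genes that were missed

    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    denom = precision + recall
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Hypothetical placeholder identifiers, not real benchmark data.
known_disease_genes = {"gene_a", "gene_b", "gene_c", "gene_d"}
predicted_disease_genes = {"gene_a", "gene_b", "gene_e"}

precision, recall, f_measure = precision_recall_f(
    predicted_disease_genes, known_disease_genes
)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_measure:.3f}")
```

In this toy case two of the three predictions are correct and two of the four known genes are recovered, so precision is 2/3, recall is 1/2, and the F-measure is 4/7.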

Keywords:

Acute myeloid leukemia – Algorithms – Drug discovery – Gene expression – Gene prediction – Machine learning – Support vector machines – Kernel methods


