Association of single nucleotide polymorphism and phenotype in type 2 of diabetes mellitus using Support Vector Regression and Genetic Algorithm

Ratu Mutiara Siregar(1*); Wisnu Ananta Kusuma(2); Annisa Annisa(3)

(1) Institut Pertanian Bogor
(2) Institut Pertanian Bogor
(3) Institut Pertanian Bogor
(*) Corresponding Author



Precision medicine aims to improve health care and patients' quality of life, including for patients with diabetes. Diabetes Mellitus (DM) is a multifactorial and heterogeneous group of disorders characterized by deficiency or failure to maintain normal glucose homeostasis. About 90% of all DM patients have Type 2 Diabetes Mellitus (T2DM). Biological characteristics and genetic information about T2DM can be obtained by searching for associations among Single Nucleotide Polymorphisms (SNPs), which makes it possible to determine the relationship between phenotypic and genotypic information and to identify genes associated with T2DM. This research uses Support Vector Regression combined with a Genetic Algorithm (SVR-GA) to select SNPs, after first computing correlation values using Spearman's rank correlation. Association mapping is then performed on the SNPs selected by SVR-GA, and epistasis interactions are checked. The procedure yielded 14 important SNPs. The model was evaluated using the mean absolute error (MAE), which was 0.02807; an MAE close to zero indicates that the model is acceptable. The genes identified through the association can help other researchers find the right treatment for T2DM patients according to their genetic profile.
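As a rough illustration of two of the steps named in the abstract, Spearman's rank correlation as a pre-filter and MAE as the evaluation metric, the following is a minimal Python sketch. It is not the authors' implementation, and it omits the SVR-GA selection step. The toy genotype and phenotype values, the SNP names, and the 0.5 correlation threshold are all assumptions made for the example.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute difference between observed and predicted."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: genotypes coded 0/1/2, one continuous phenotype.
genotypes = {
    "snp_a": [0, 1, 2, 2, 1, 0],
    "snp_b": [1, 1, 0, 2, 0, 1],
}
phenotype = [5.1, 6.0, 7.2, 7.5, 6.3, 5.0]

THRESHOLD = 0.5  # assumed cut-off, not taken from the paper

# Keep SNPs whose absolute rank correlation with the phenotype is high enough.
selected = [
    name for name, calls in genotypes.items()
    if abs(spearman(calls, phenotype)) >= THRESHOLD
]
print(selected)
```

In a pipeline like the one described, the SNPs that pass this filter would then go to the feature-selection model, and MAE would be computed on that model's predictions with `mean_absolute_error`.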


Association mapping; Epistasis; SNP; SVR-GA; Type 2 diabetes mellitus









Copyright (c) 2022 Ratu Mutiara Siregar, Wisnu Ananta Kusuma, Annisa Annisa

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.