A Hybrid Deep Learning–Machine Learning Approach for the Identification of Active Compounds in Blumea balsamifera (Sembung Leaves)


Kusnaeni Kusnaeni(1*); Prihatin Prihatin(2); Rahmatullah Rahmatullah(3); Mega Sartika Hafid(4); Muhammad Rifki Nisardi(5); Nurmalasari Nurmalasari(6); Afif Budi Andy B(7)

(1) Institut Teknologi Bacharuddin Jusuf Habibie
(2) Institut Teknologi Bacharuddin Jusuf Habibie
(3) Pemerintah Kabupaten Mamuju
(4) Institut Teknologi Bacharuddin Jusuf Habibie
(5) Institut Teknologi Bacharuddin Jusuf Habibie
(6) Institut Teknologi Bacharuddin Jusuf Habibie
(7) Universitas Sulawesi Barat
(*) Corresponding Author

  

Abstract


Blumea balsamifera (sembung) is a medicinal plant with well-documented antibacterial, anti-inflammatory, and analgesic properties. However, systematically identifying its bioactive compounds remains difficult because LC–MS (Liquid Chromatography–Mass Spectrometry) data are complex and high-dimensional. This study aims to develop a robust computational framework for automated compound identification using a hybrid modeling approach.

A hybrid model integrating Long Short-Term Memory (LSTM) and Extreme Gradient Boosting (XGBoost) was employed to improve feature extraction and classification performance. The LSTM component captured sequential dependencies in the spectral data, while XGBoost performed the classification through gradient boosting. This integration enables efficient handling of complex spectral patterns and improves predictive accuracy.

The proposed model achieved an accuracy of 91%, demonstrating strong performance in classifying and identifying bioactive compounds. Feature importance analysis identified several key compounds contributing to the model predictions, including Luteolin-7-methyl-ether, Umbelliferone, Blumeatin, Dihydroquercetin-7,4′-dimethylether, Chrysosplenol C, Blumealactone B, and Blumeaene E. These compounds are associated with known pharmacological activities, supporting the therapeutic relevance of B. balsamifera.

The proposed hybrid LSTM–XGBoost framework provides an effective and scalable approach for LC–MS-based compound identification. The method reduces analytical complexity, improves classification reliability, and offers a data-driven strategy for accelerating phytochemical research and bioactive compound validation.
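The hybrid idea described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the abstract does not specify the architecture, so the code assumes one common arrangement, in which the final hidden state of an LSTM serves as a fixed-length feature vector for a boosted classifier. The LSTM weights are random and untrained, plain gradient boosting with decision stumps stands in for XGBoost, and the toy "spectra" and all function names (`init_lstm`, `lstm_features`, `fit_boosting`, `synthetic_spectrum`) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_lstm(hidden_size, rng):
    """Random, untrained weights: per gate and unit, (input weight, recurrent weights, bias)."""
    return {
        gate: [
            (rng.uniform(-1, 1), [rng.uniform(-1, 1) for _ in range(hidden_size)], 0.0)
            for _ in range(hidden_size)
        ]
        for gate in "ifog"
    }


def lstm_features(sequence, weights):
    """Run one LSTM layer over a 1-D intensity sequence; return the final hidden state."""
    hidden_size = len(weights["i"])
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in sequence:
        # Pre-activations of the input (i), forget (f), output (o) and candidate (g) gates.
        pre = {
            gate: [wx * x + sum(w * hk for w, hk in zip(wh, h)) + b for wx, wh, b in units]
            for gate, units in weights.items()
        }
        # Cell state: keep part of the old memory, add the gated candidate values.
        c = [
            sigmoid(f) * cj + sigmoid(i) * math.tanh(g)
            for f, i, g, cj in zip(pre["f"], pre["i"], pre["g"], c)
        ]
        h = [sigmoid(o) * math.tanh(cj) for o, cj in zip(pre["o"], c)]
    return h


def fit_stump(X, residuals):
    """Find the single-feature threshold split that best fits the residuals (least squares)."""
    best = None
    for feat in range(len(X[0])):
        # Dropping the largest value guarantees both sides of the split are non-empty.
        for thr in sorted({row[feat] for row in X})[:-1]:
            left = [r for row, r in zip(X, residuals) if row[feat] <= thr]
            right = [r for row, r in zip(X, residuals) if row[feat] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, feat, thr, lm, rm)
    return best[1:]


def fit_boosting(X, y, rounds=15, learning_rate=0.5):
    """Gradient boosting on log-loss with stumps (a simplified stand-in for XGBoost)."""
    scores = [0.0] * len(X)
    stumps = []
    for _ in range(rounds):
        # Negative gradient of the log-loss with respect to the current scores.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        feat, thr, lm, rm = fit_stump(X, residuals)
        stumps.append((feat, thr, learning_rate * lm, learning_rate * rm))
        scores = [s + stumps[-1][2 if row[feat] <= thr else 3] for s, row in zip(scores, X)]
    return stumps


def predict(stumps, row):
    score = sum(left if row[feat] <= thr else right for feat, thr, left, right in stumps)
    return 1 if score > 0 else 0


def synthetic_spectrum(label, rng, length=30):
    """Hypothetical toy signal: class 0 peaks early, class 1 peaks late, plus small noise."""
    center = 8 if label == 0 else 22
    return [math.exp(-((t - center) ** 2) / 8.0) + rng.uniform(0, 0.05) for t in range(length)]


rng = random.Random(0)
weights = init_lstm(hidden_size=4, rng=rng)
labels = [k % 2 for k in range(40)]
features = [lstm_features(synthetic_spectrum(lab, rng), weights) for lab in labels]
stumps = fit_boosting(features, labels)
predictions = [predict(stumps, row) for row in features]
train_accuracy = sum(p == lab for p, lab in zip(predictions, labels)) / len(labels)
print(f"training accuracy on toy data: {train_accuracy:.2f}")
```

In a realistic setting the LSTM would be trained on labelled LC–MS data and the boosted stage would typically be a library implementation such as XGBoost, which adds second-order gradient information and regularization that this sketch omits. The accuracy printed here describes only the toy data and says nothing about the 91% reported in the study.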

Keywords


Hybrid Method; LSTM; Sembung Leaves; XGBoost

  
  


Digital Object Identifier

https://doi.org/10.33096/ilkom.v18i1.3195.165-179
  






Copyright (c) 2026 Kusnaeni Kusnaeni, Prihatin Prihatin, Rahmatullah Rahmatullah, Mega Sartika Hafid, Muhammad Rifki Nisardi, Nurmalasari Nurmalasari, Afif Budi Andy B

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.