Refining the Performance of Indonesian-Javanese Bilingual Neural Machine Translation Using Adam Optimizer
Fadia Irsania Putri (1), Aji Prasetya Wibawa (2*), Leonel Hernandez Collante (3)
(1) Universitas Negeri Malang
(2) Universitas Negeri Malang
(3) Institución Universitaria de Barranquilla IUB
(*) Corresponding Author
Abstract

This study develops a Neural Machine Translation (NMT) model for Indonesian and Javanese using a Long Short-Term Memory (LSTM) architecture. The dataset, sourced from online platforms, contains pairs of parallel sentences in both languages. Training was performed with the Adam optimizer, and its effectiveness was compared with a machine translation (MT) model trained without an optimizer. Adam was used to speed up convergence and stabilize training by dynamically adapting the learning rate. Model performance was assessed with BLEU (Bilingual Evaluation Understudy) scores to evaluate translation accuracy across different training epochs. The findings show that the Adam optimizer produced a significant improvement in model performance: at epoch 2000, the Adam-trained model achieved the highest BLEU score of 0.989957, reflecting very accurate translations, whereas the model without the optimizer scored lower. Furthermore, translations from Indonesian to Javanese were more accurate than those from Javanese to Indonesian, largely because of the intricate structure and multiple speech levels of the Javanese language. In summary, the LSTM method combined with the Adam optimizer substantially improved the accuracy of bidirectional translation between Indonesian and Javanese. This research contributes to the advancement of local-language translation technology, supports language preservation in the digital age, and holds promise for application to other regional languages.

Keywords: Adam Optimizer; BLEU Score; LSTM; Neural Machine Translation
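The abstract attributes the improvement to Adam's per-parameter adaptive learning rate. As a minimal illustration of the mechanism (not the paper's actual training code, and using a toy quadratic objective rather than an LSTM), the update rule can be sketched in plain Python:

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter."""
    m = b1 * m + (1 - b1) * grad          # first moment: running mean of gradients
    v = b2 * v + (1 - b2) * grad * grad   # second moment: running mean of squared gradients
    m_hat = m / (1 - b1 ** t)             # bias correction for early steps
    v_hat = v / (1 - b2 ** t)
    theta -= lr * m_hat / (math.sqrt(v_hat) + eps)  # adaptive, bounded step size
    return theta, m, v

# Toy demo: minimise f(x) = (x - 3)^2, whose gradient is 2(x - 3).
x, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    x, m, v = adam_step(x, 2 * (x - 3), m, v, t)
# x ends close to the minimum at 3
```

Because the step is scaled by the square root of the second-moment estimate, the effective learning rate shrinks for parameters with large, noisy gradients and grows for parameters with small ones, which is the convergence-stabilizing behaviour the study relies on.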
DOI: https://doi.org/10.33096/ilkom.v16i3.2467.271-282
Copyright (c) 2024 Fadia Irsania Putri, Aji Prasetya Wibawa
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.