Semantic Similarity Comparison Between Production Line Failures for Predictive Maintenance

Hilal Tekgöz; Sevinç İlhan Omurca; Kadir Yunus Koç; Umut Topçu; Osman Çelik

doi:10.54569/aair.1142568

Research Article

Year 2023, Volume: 3 Issue: 1, 1 - 11, 15.02.2023

Abstract

With the arrival of Industry 4.0 and the creation of smart factories, predictive maintenance has become even more important, and predictive maintenance systems are now widely used in the manufacturing industry. At the same time, text analysis and Natural Language Processing (NLP) techniques are attracting considerable attention from both research and industry because they can connect natural language with industrial solutions. The number of NLP studies in the literature has increased greatly. Although there are studies that apply NLP in predictive maintenance systems, we found no studies on Turkish NLP for predictive maintenance. This study focuses on the similarity analysis of failure texts that can be used in the predictive maintenance system we developed for VESTEL, one of the leading consumer electronics manufacturers in Turkey. In the manufacturing industry, operators record descriptions of the failures that occur on production lines as short texts. However, these descriptions are rarely used in predictive maintenance work. In this study, semantic text similarities between failure descriptions from the production line were compared using traditional word representations, modern word representations and Transformer models. Levenshtein, Jaccard, Pearson, and Cosine were used as similarity measures, and their effectiveness was compared. The experimental data, consisting of failure texts, were obtained from a consumer electronics manufacturer in Turkey. The experimental results show that the Jaccard similarity measure is not as successful at grouping semantically similar texts as the other three similarity measures. In addition, among the representation methods, Multilingual Universal Sentence Encoder (MUSE), Language-agnostic BERT Sentence Embedding (LaBSE), Bag of Words (BoW) and Term Frequency - Inverse Document Frequency (TF-IDF) outperform FastText and Language-Agnostic Sentence Representations (LASER) in discovering the semantics of failure descriptions. In brief, Pearson and Cosine are more effective at finding similar failure texts, and MUSE, LaBSE, BoW and TF-IDF are more successful at representing the failure texts.
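To make the four similarity measures named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes whitespace tokenization, a normalized form of Levenshtein distance, and a simple Bag of Words count vector built only from the two texts being compared; the function names and the two example failure descriptions are hypothetical and do not come from the study's data.

```python
import math
from collections import Counter


def levenshtein_similarity(a: str, b: str) -> float:
    """Character-level edit distance, normalized to a similarity in [0, 1]."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))


def jaccard_similarity(a: str, b: str) -> float:
    """Overlap of the two token sets: |A & B| / |A | B|."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def bow_vectors(a: str, b: str):
    """Bag of Words count vectors over the joint vocabulary of both texts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    vocab = sorted(set(ca) | set(cb))
    return [ca[w] for w in vocab], [cb[w] for w in vocab]


def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pearson_correlation(u, v) -> float:
    """Pearson correlation, computed as the cosine of mean-centered vectors."""
    if not u:
        return 0.0
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return cosine_similarity([x - mean_u for x in u],
                             [y - mean_v for y in v])


if __name__ == "__main__":
    # Hypothetical failure descriptions, used only for illustration.
    t1 = "conveyor belt motor stopped"
    t2 = "conveyor motor overheated and stopped"
    u, v = bow_vectors(t1, t2)
    print("Levenshtein:", levenshtein_similarity(t1, t2))
    print("Jaccard:    ", jaccard_similarity(t1, t2))
    print("Cosine:     ", cosine_similarity(u, v))
    print("Pearson:    ", pearson_correlation(u, v))
```

In this sketch, Levenshtein and Jaccard operate directly on the raw text, while Cosine and Pearson operate on vectors, so the same two functions could in principle be applied to TF-IDF weights or to sentence embeddings from a model such as MUSE or LaBSE instead of raw counts.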

Keywords

Predictive maintenance, Natural Language Processing, Sentence Similarity, Word Representation Methods

Supporting Institution

TUBİTAK

Project Number

3215073

Thanks

This work has been supported by TUBİTAK in Turkey under project number 3215073.

References

Chandrasekaran D, and Vijay M. "Evolution of semantic similarity—a survey." ACM Computing Surveys (CSUR) 54.2, 1-37, 2021.
Wang Y, et al. "A comparison of word embeddings for the biomedical natural language processing." Journal of biomedical informatics 87,12-20, 2018.
Liu J, Tianqi L, and Cong Y. “Newsembed: Modeling news through pre-trained document representations”, arXiv preprint arXiv:2106.00590, 2021.
Mikolov T, et al. "Efficient estimation of word representations in vector space." arXiv preprint arXiv:1301.3781, 2013.
Pennington J, Richard S, and Christopher D.M. “Glove: Global vectors for word representation”. Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), 2014.
Bojanowski P, et al. “Enriching word vectors with subword information”, Transactions of the association for computational linguistics 5, 135-146, 2017.
Devlin J, et al. “Bert: Pre-training of deep bidirectional transformers for language understanding”, arXiv preprint arXiv:1810.04805, 2018.
Mohammad S.M, and Graeme H. “Distributional measures of semantic distance: A survey”, arXiv preprint arXiv:1203.1858, 2012.
Akhbardeh F, Travis D, and Marcos Z. “NLP tools for predictive maintenance records in MaintNet”. Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: System Demonstrations. 2020.
Yang H, Aidan L, and Travis D. “Predictive maintenance for general aviation using convolutional transformers”. Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 36. No. 11. 2022.
Wellsandt S. et al. “Hybrid-augmented intelligence in predictive maintenance with digital intelligent assistants”. Annual Reviews in Control, 2022.
Maulud D.H., et al. “State of art for semantic analysis of natural language processing”, Qubahan Academic Journal 1.2, 21-28, 2021.
Netisopakul P, et al. “Improving the state-of-the-art in Thai semantic similarity using distributional semantics and ontological information”, Plos one 16.2, 2021.
Zhang P, et al. “Semantic similarity computing model based on multi model fine-grained nonlinear fusion”, IEEE Access 9, 8433-8443, 2021.
Saipech P, and Pusadee S. “Automatic Thai subjective examination using cosine similarity”. 2018 5th international conference on advanced informatics: concept theory and applications (ICAICTA). IEEE, 2018.
Chandrathlake R, et al. “A semantic similarity measure-based news posts validation on social media”. 2018 3rd International Conference on Information Technology Research (ICITR). IEEE, 2018.
Jin X, Shuwu Z, and Jie L. “Word semantic similarity calculation based on word2vec”. 2018 International Conference on Control, Automation and Information Sciences (ICCAIS). IEEE, 2018.
Pearson K. “Notes on Regression and Inheritance in the Case of Two Parents”. Proceedings of the Royal Society of London, 58, 240-242, 1895.
Zhou H, et al. “A new sampling method in particle filter based on Pearson correlation coefficient”. Neurocomputing 216 , 208-215, 2016.
Evans, J.D. “Straightforward statistics for the behavioral sciences”. Thomson Brooks/Cole Publishing Co, 1996.
Jayakodi K., Bandara M., and Meedeniya D. “An automatic classifier for exam questions with WordNet and Cosine similarity”. 2016 Moratuwa engineering research conference (MERCon). IEEE, 2016.
Jaccard P. “Nouvelles recherches sur la distribution florale”. Bull. Soc. Vaud. Sci. Nat., 44, 223-270, 1908.
Levenshtein, V.I. “Binary codes capable of correcting deletions, insertions, and reversals”. Soviet physics doklady. Vol. 10. No. 8. 1966.
Bosch A, Xavier M, and Robert M. “Which is the best way to organize/classify images by content?”. Image and vision computing 25.6, 778-791, 2007.
Aksu, M Ç, and Karaman E. “FastText ve Kelime Çantası Kelime Temsil Yöntemlerinin Turistik Mekanlar İçin Yapılan Türkçe İncelemeler Kullanılarak Karşılaştırılması”. Avrupa Bilim ve Teknoloji Dergisi 20, 311-320, 2020.
Trstenjak B, Sasa M, and Dzenana D. “KNN with TF-IDF based framework for text categorization”. Procedia Engineering 69, 1356-1364, 2014.
Uçar, M.K, Bozkurt M.R, and Bilgin C. “Signal Processing and Communications Applications Conference”. IEEE, 2017.
Jang, B.C, Inhwan K, and Jong W.K. “Word2vec convolutional neural networks for classification of news articles and tweets”. PloS one 14.8, 2019.
Lilleberg J, Yun Z, and Yanqing Z. “Support vector machines and word2vec for text classification with semantic features”. 2015 IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing (ICCI* CC). IEEE, 2015.
Santos I., Nadia N, and Luiza de Macedo M. “Sentiment analysis using convolutional neural network with fastText embeddings”. 2017 IEEE Latin American conference on computational intelligence (LA-CCI). IEEE, 2017.
Tekgöz H, Çelenli H.İ, and Omurca S.İ. “Semantic Similarity Comparison of Word Representation Methods in the Field of Health”. 2021 6th International Conference on Computer Science and Engineering (UBMK). IEEE, 2021.
Singh S, and Ausif M. “The NLP cookbook: modern recipes for transformer based deep learning architectures”. IEEE Access 9, 68675-68702, 2021.
Feng F, et al. “Language-agnostic bert sentence embedding”. arXiv preprint arXiv:2007.01852, 2020.
Yang Y, et al. "Multilingual universal sentence encoder for semantic retrieval." arXiv preprint arXiv:1907.04307, 2019.

There are 33 citations in total.

Details

Primary Language	English
Subjects	Artificial Intelligence
Journal Section	Research Articles
Authors	Hilal Tekgöz (0000-0001-5469-5125); Sevinç İlhan Omurca (0000-0003-1214-9235); Kadir Yunus Koç (0000-0003-0604-2749); Umut Topçu (0000-0002-8069-7973); Osman Çelik (0000-0003-3407-2101)
Project Number	3215073
Early Pub Date	February 13, 2023
Publication Date	February 15, 2023
Acceptance Date	November 2, 2022
Published in Issue	Year 2023 Volume: 3 Issue: 1

Cite

IEEE	H. Tekgöz, S. İlhan Omurca, K. Y. Koç, U. Topçu, and O. Çelik, “Semantic Similarity Comparison Between Production Line Failures for Predictive Maintenance”, Adv. Artif. Intell. Res., vol. 3, no. 1, pp. 1–11, 2023, doi: 10.54569/aair.1142568.

Cited By

Technical language processing for Prognostics and Health Management: applying text similarity and topic modeling to maintenance work orders

Journal of Intelligent Manufacturing

https://doi.org/10.1007/s10845-024-02323-4

Advances in Artificial Intelligence Research is an open access journal which means that the content is freely available without charge to the user or his/her institution. All papers are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, which allows users to distribute, remix, adapt, and build upon the material in any medium or format for non-commercial purposes only, and only so long as attribution is given to the creator.