Research Article

Semantic Similarity Comparison Between Production Line Failures for Predictive Maintenance

Year 2023, Volume: 3 Issue: 1, 1 - 11, 15.02.2023
https://doi.org/10.54569/aair.1142568

Abstract

With the arrival of Industry 4.0 and the creation of smart factories, predictive maintenance has become even more important, and predictive maintenance systems are now widely used in the manufacturing industry. At the same time, text analysis and Natural Language Processing (NLP) techniques are attracting considerable attention from both research and industry because they can bring natural language into industrial solutions, and the number of NLP studies in the literature has grown sharply. Although NLP has been studied in the context of predictive maintenance systems, we found no studies on Turkish NLP for predictive maintenance. This study focuses on the similarity analysis of failure texts for use in the predictive maintenance system we developed for VESTEL, one of the leading consumer electronics manufacturers in Turkey. In the manufacturing industry, operators record descriptions of the failures that occur on production lines as short texts, but these descriptions are seldom used in predictive maintenance work. In this study, semantic text similarities between failure descriptions from the production line were compared using traditional word representations, modern word representations, and Transformer models. Levenshtein, Jaccard, Pearson, and Cosine were used as similarity measures, and their effectiveness was compared. The experimental data, consisting of failure texts, were obtained from a consumer electronics manufacturer in Turkey. The experimental results show that the Jaccard similarity measure is less successful at grouping semantically similar texts than the other three similarity measures. In addition, among the embedding methods, Multilingual Universal Sentence Encoder (MUSE), Language-agnostic BERT Sentence Embedding (LaBSE), Bag of Words (BoW), and Term Frequency - Inverse Document Frequency (TF-IDF) outperform the FastText and Language-Agnostic Sentence Representations (LASER) models at capturing the semantics of failure descriptions. In short, Pearson and Cosine are more effective at finding similar failure texts, and MUSE, LaBSE, BoW, and TF-IDF are more successful at representing them.
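As an illustration of the four similarity measures named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes whitespace tokenization, a Bag of Words vocabulary built from only the two texts being compared, and a Levenshtein distance normalized by the longer string's length. All function names and the English sample texts are hypothetical, chosen for readability rather than taken from the study's Turkish data.

```python
import math
from collections import Counter


def levenshtein_similarity(a: str, b: str) -> float:
    """Character-level edit distance, rescaled to [0, 1] (1.0 = identical)."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    # Assumption: normalize by the longer string's length.
    return 1.0 - prev[-1] / max(len(a), len(b))


def jaccard_similarity(a: str, b: str) -> float:
    """Shared tokens divided by all distinct tokens across both texts."""
    ta, tb = set(a.split()), set(b.split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def bow_vectors(a: str, b: str):
    """Bag of Words count vectors over the union vocabulary of the two texts."""
    ca, cb = Counter(a.split()), Counter(b.split())
    vocab = sorted(set(ca) | set(cb))
    return [ca[w] for w in vocab], [cb[w] for w in vocab]


def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pearson_similarity(u, v) -> float:
    """Pearson correlation, computed as the cosine of mean-centered vectors."""
    if not u:
        return 0.0
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return cosine_similarity([x - mean_u for x in u],
                             [y - mean_v for y in v])


if __name__ == "__main__":
    # Hypothetical failure descriptions, for illustration only.
    text_a = "belt motor overheated"
    text_b = "motor overheated again"
    u, v = bow_vectors(text_a, text_b)
    print("Levenshtein:", levenshtein_similarity(text_a, text_b))
    print("Jaccard:    ", jaccard_similarity(text_a, text_b))
    print("Cosine:     ", cosine_similarity(u, v))
    print("Pearson:    ", pearson_similarity(u, v))
```

Levenshtein and Jaccard operate directly on the surface strings or token sets, whereas Cosine and Pearson operate on a vector representation, so any of the embedding methods the study compares could be substituted for the Bag of Words vectors used here. A vocabulary limited to two short texts makes Pearson values hard to interpret; a corpus-wide vocabulary would be the more realistic setting.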

Supporting Institution

TÜBİTAK

Project Number

3215073

Acknowledgments

This work was supported by TÜBİTAK in Turkey under project number 3215073.

References

  • Chandrasekaran D, and Vijay M. "Evolution of semantic similarity—a survey." ACM Computing Surveys (CSUR) 54.2, 1-37, 2021.
  • Wang Y, et al. "A comparison of word embeddings for the biomedical natural language processing." Journal of biomedical informatics 87, 12-20, 2018.
  • Liu J, Tianqi L, and Cong Y. “Newsembed: Modeling news through pre-trained document representations”, arXiv preprint arXiv:2106.00590, 2021.
  • Mikolov T, et al. "Efficient estimation of word representations in vector space." arXiv preprint arXiv:1301.3781, 2013.
  • Pennington J, Richard S, and Christopher D.M. “Glove: Global vectors for word representation”. Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), 2014.
  • Bojanowski P, et al. “Enriching word vectors with subword information”, Transactions of the association for computational linguistics 5, 135-146, 2017.
  • Devlin J, et al. “Bert: Pre-training of deep bidirectional transformers for language understanding”, arXiv preprint arXiv:1810.04805, 2018.
  • Mohammad S.M, and Graeme H. “Distributional measures of semantic distance: A survey”, arXiv preprint arXiv:1203.1858, 2012.
  • Akhbardeh F, Travis D, and Marcos Z. “NLP tools for predictive maintenance records in MaintNet”. Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: System Demonstrations. 2020.
  • Yang H, Aidan L, and Travis D. “Predictive maintenance for general aviation using convolutional transformers”. Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 36. No. 11. 2022.
  • Wellsandt S. et al. “Hybrid-augmented intelligence in predictive maintenance with digital intelligent assistants”. Annual Reviews in Control, 2022.
  • Maulud D.H., et al. “State of art for semantic analysis of natural language processing”, Qubahan Academic Journal 1.2, 21-28, 2021.
  • Netisopakul P, et al. “Improving the state-of-the-art in Thai semantic similarity using distributional semantics and ontological information”, Plos one 16.2, 2021.
  • Zhang P, et al. “Semantic similarity computing model based on multi model fine-grained nonlinear fusion”, IEEE Access 9, 8433-8443, 2021.
  • Saipech P, and Pusadee S. “Automatic Thai subjective examination using cosine similarity”. 2018 5th international conference on advanced informatics: concept theory and applications (ICAICTA). IEEE, 2018.
  • Chandrathlake R, et al. “A semantic similarity measure-based news posts validation on social media”. 2018 3rd International Conference on Information Technology Research (ICITR). IEEE, 2018.
  • Jin X, Shuwu Z, and Jie L. “Word semantic similarity calculation based on word2vec”. 2018 International Conference on Control, Automation and Information Sciences (ICCAIS). IEEE, 2018.
  • Pearson K. “Notes on Regression and Inheritance in the Case of Two Parents”. Proceedings of the Royal Society of London, 58, 240-242, 1895.
  • Zhou H, et al. “A new sampling method in particle filter based on Pearson correlation coefficient”. Neurocomputing 216, 208-215, 2016.
  • Evans, J.D. “Straightforward statistics for the behavioral sciences”. Thomson Brooks/Cole Publishing Co, 1996.
  • Jayakodi K., Bandara M., and Meedeniya D. “An automatic classifier for exam questions with WordNet and Cosine similarity”. 2016 Moratuwa engineering research conference (MERCon). IEEE, 2016.
  • Jaccard P. “Nouvelles recherches sur la distribution florale”. Bull. Soc. Vaud. Sci. Nat., 44, 223-270, 1908.
  • Levenshtein, V.I. “Binary codes capable of correcting deletions, insertions, and reversals”. Soviet physics doklady. Vol. 10. No. 8. 1966.
  • Bosch A, Xavier M, and Robert M. “Which is the best way to organize/classify images by content?”. Image and vision computing 25.6, 778-791, 2007.
  • Aksu, M Ç, and Karaman E. “FastText ve Kelime Çantası Kelime Temsil Yöntemlerinin Turistik Mekanlar İçin Yapılan Türkçe İncelemeler Kullanılarak Karşılaştırılması”. Avrupa Bilim ve Teknoloji Dergisi 20, 311-320, 2020.
  • Trstenjak B, Sasa M, and Dzenana D. “KNN with TF-IDF based framework for text categorization”. Procedia Engineering 69, 1356-1364, 2014.
  • Uçar, M.K, Bozkurt M.R, and Bilgin C. “Signal Processing and Communications Applications Conference”. IEEE, 2017.
  • Jang, B.C, Inhwan K, and Jong W.K. “Word2vec convolutional neural networks for classification of news articles and tweets”. PloS one 14.8, 2019.
  • Lilleberg J, Yun Z, and Yanqing Z. “Support vector machines and word2vec for text classification with semantic features”. 2015 IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing (ICCI* CC). IEEE, 2015.
  • Santos I., Nadia N, and Luiza de Macedo M. “Sentiment analysis using convolutional neural network with fastText embeddings”. 2017 IEEE Latin American conference on computational intelligence (LA-CCI). IEEE, 2017.
  • Tekgöz H, Çelenli H.İ, and Omurca S.İ. “Semantic Similarity Comparison of Word Representation Methods in the Field of Health”. 2021 6th International Conference on Computer Science and Engineering (UBMK). IEEE, 2021.
  • Singh S, and Ausif M. “The NLP cookbook: modern recipes for transformer based deep learning architectures”. IEEE Access 9, 68675-68702, 2021.
  • Feng F, et al. “Language-agnostic bert sentence embedding”. arXiv preprint arXiv:2007.01852, 2020.
  • Yang Y, et al. "Multilingual universal sentence encoder for semantic retrieval." arXiv preprint arXiv:1907.04307, 2019.
There are 33 citations in total.

Details

Primary Language English
Subjects Artificial Intelligence
Journal Section Research Articles
Authors

Hilal Tekgöz 0000-0001-5469-5125

Sevinç İlhan Omurca 0000-0003-1214-9235

Kadir Yunus Koç 0000-0003-0604-2749

Umut Topçu 0000-0002-8069-7973

Osman Çelik 0000-0003-3407-2101

Project Number 3215073
Early Pub Date February 13, 2023
Publication Date February 15, 2023
Acceptance Date November 2, 2022
Published in Issue Year 2023 Volume: 3 Issue: 1

Cite

IEEE H. Tekgöz, S. İlhan Omurca, K. Y. Koç, U. Topçu, and O. Çelik, “Semantic Similarity Comparison Between Production Line Failures for Predictive Maintenance”, Adv. Artif. Intell. Res., vol. 3, no. 1, pp. 1–11, 2023, doi: 10.54569/aair.1142568.

Advances in Artificial Intelligence Research is an open access journal which means that the content is freely available without charge to the user or his/her institution. All papers are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, which allows users to distribute, remix, adapt, and build upon the material in any medium or format for non-commercial purposes only, and only so long as attribution is given to the creator.