Journal of New Results in Science

1304-7981

Tokat Gaziosmanpasa University

10.54187/jnrs.1129440

Statistics

A novel data processing approach to detect fraudulent insurance claims for physical damage to cars

https://orcid.org/0000-0002-2364-9449

Ahmet Yücel

Ankara Yıldırım Beyazıt University

08 31 2022

Volume 11, Issue 2, pp. 120–131; received 06 11 2022, accepted 08 22 2022

2012

Some automobile insurance companies use automated detection systems to speed up payment decisions on claims for insured vehicles. Claims suspected of fraud are evaluated against empirical data from previously investigated claims. The main objective of this manuscript is to present a novel data processing approach and to demonstrate its potential for data classification. The approach was used to develop a machine learning-based sentiment classification model that characterizes property damage fraud in vehicle accidents and identifies indicators of fraudulent claims. To this end, Singular Value Decomposition-based components and correlation-based composite variables were created. Machine learning models were then developed, with predictors and composite variables selected using standard feature selection procedures. Five machine learning methods were used: Boosted Trees, Classification and Regression Trees, Random Forests, Artificial Neural Networks, and Support Vector Machines. For every method, the model that included composite variables achieved higher accuracy than its counterpart without them, and among these, the artificial neural network achieved the highest accuracy, 76.56%.
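The two feature-construction steps named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the claim texts and labels are invented toy data, the function names are hypothetical, and the composite variable is built as a sign-aligned average of standardized features whose absolute Pearson correlation with the fraud label reaches a threshold, which may differ from the construction used in the study. The classifier step is omitted.

```python
import numpy as np


def term_document_matrix(docs):
    """Count word occurrences; rows are documents, columns are vocabulary terms."""
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    column = {word: j for j, word in enumerate(vocab)}
    counts = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(docs):
        for word in doc.lower().split():
            counts[i, column[word]] += 1
    return counts, vocab


def svd_components(counts, k):
    """Return each document's scores on the k leading singular directions."""
    centered = counts - counts.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k] * s[:k]


def composite_variable(features, target, threshold):
    """Average the sign-aligned z-scores of features with |r| >= threshold.

    Assumed construction for illustration, not necessarily the paper's.
    Returns the composite vector and the selected column indices.
    """
    target_z = (target - target.mean()) / target.std()
    selected, aligned = [], []
    for j in range(features.shape[1]):
        col = features[:, j]
        if col.std() == 0:  # constant columns carry no signal
            continue
        col_z = (col - col.mean()) / col.std()
        r = float(np.mean(col_z * target_z))  # Pearson correlation
        if abs(r) >= threshold:
            selected.append(j)
            aligned.append(np.sign(r) * col_z)
    if not aligned:
        return np.zeros(len(target)), selected
    return np.mean(aligned, axis=0), selected


# Hypothetical claim descriptions and fraud labels (1 = fraudulent).
DOCS = [
    "rear bumper damage parked car unknown driver",
    "windshield crack from gravel on highway",
    "rear bumper damage parked car no witness",
    "door dent in garage reported same day",
    "total loss fire parked car no witness",
    "mirror broken by truck police report filed",
]
LABELS = np.array([1, 0, 1, 0, 1, 0])

counts, vocab = term_document_matrix(DOCS)
scores = svd_components(counts, k=2)  # SVD-based components
composite, picked = composite_variable(counts, LABELS, threshold=0.9)
flag_terms = [vocab[j] for j in picked]  # terms feeding the composite
```

In a full pipeline, the columns of `scores` and the `composite` vector would be appended to the other predictors before feature selection and model fitting. The threshold of 0.9 is chosen only to keep the toy example small.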

Artificial neural network; Tree-based decision systems; Support vector machines; Singular value decomposition; Data processing; Natural language processing
