Improving the Quality of Enterprise Data Management with Tree-Based Models
Abstract
Data credibility is essential for reliable decision-making and decision-support systems in Salesforce environments, particularly when processing transactional data such as the Online Retail Dataset. This work analyses how tree-based machine-learning models can improve data quality through transformation and cleaning steps that address missing values, duplicate records, and outliers. The approach covers data preparation, feature selection, model construction, and deployment within Salesforce processes to monitor data quality in both real-time and batch workflows. The tree-based models achieved precision of up to 90.8%, recall of up to 74%, and overall accuracy of up to 91.6%. After Salesforce integration, completeness increased by 12%, accuracy by 10%, and consistency by 15%. The system’s retraining mechanism and feedback loop are intended to guard against long-term data degradation in enterprise CRM environments.
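As a rough illustration of the three defect types named in the abstract and of the metrics it reports, the following minimal sketch flags rows with missing values, exact duplicates, or outlying quantities, then scores the flags against reviewer labels. It is a rule-based baseline, not the tree-based models evaluated in this work; the field names, sample rows, reviewer labels, and the 1.5 × IQR outlier threshold are all assumptions made for the example.

```python
from statistics import quantiles

# Hypothetical rows loosely shaped like Online Retail records (field names are assumptions).
rows = [
    {"invoice": "A1", "stock_code": "X", "quantity": 6, "unit_price": 2.55, "customer_id": "17850"},
    {"invoice": "A1", "stock_code": "X", "quantity": 6, "unit_price": 2.55, "customer_id": "17850"},
    {"invoice": "A2", "stock_code": "Y", "quantity": 8, "unit_price": 3.39, "customer_id": None},
    {"invoice": "A3", "stock_code": "Z", "quantity": 5, "unit_price": 1.25, "customer_id": "13047"},
    {"invoice": "A4", "stock_code": "W", "quantity": 9000, "unit_price": 0.50, "customer_id": "12583"},
    {"invoice": "A5", "stock_code": "V", "quantity": 7, "unit_price": 4.00, "customer_id": "14606"},
    {"invoice": "A6", "stock_code": "U", "quantity": 4, "unit_price": 1.65, "customer_id": "15311"},
]


def flag_rows(rows, k=1.5):
    """Flag a row if it has a missing value, repeats an earlier row, or has an IQR-outlier quantity."""
    quantities = [r["quantity"] for r in rows if r["quantity"] is not None]
    q1, _, q3 = quantiles(quantities, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)

    seen = set()
    flags = []
    for r in rows:
        key = tuple(sorted(r.items()))  # field names are unique, so sorting never compares values
        missing = any(v is None or v == "" for v in r.values())
        duplicate = key in seen
        seen.add(key)
        q = r["quantity"]
        outlier = q is not None and not (low <= q <= high)
        flags.append(missing or duplicate or outlier)
    return flags


def precision_recall_accuracy(y_true, y_pred):
    """Compute precision, recall, and accuracy for boolean labels (True = defective row)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy


flags = flag_rows(rows)

# Hypothetical reviewer labels: row 6 is marked defective for a reason the rules cannot see.
reviewer_labels = [False, True, True, False, True, True, False]
precision, recall, accuracy = precision_recall_accuracy(reviewer_labels, flags)
```

In this toy setup the rules miss one defective row, which lowers recall while precision stays high. That is the kind of gap a learned model trained on labelled records is meant to narrow.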
Keywords
- Artificial intelligence
- Customer relationship management
- Data quality improvement
- Tree-based models
- Machine learning
- Salesforce
Details
Primary Language
English
Subjects
Computer System Software
Journal Section
Research Article
Publication Date
June 30, 2026
Submission Date
April 16, 2025
Acceptance Date
February 16, 2026
Published in Issue
Year 2026 Volume: 22 Number: 2