This paper describes word similarity analysis in tax law using the Word2Vec model. By similarity analysis, we mean identifying relationships between similar terms in tax terminology. The Word2Vec model represents the meanings of words with vectors and identifies the semantic relationships of words through the proximity between these vectors.
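As a minimal sketch of what "proximity between vectors" means here, the snippet below ranks words by cosine similarity, the measure commonly used with Word2Vec embeddings. The three-dimensional vectors are invented for illustration and are not taken from the trained model; `uzak_kelime` is a hypothetical placeholder for an unrelated term, and the helper names are assumptions of this sketch.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors, made up for this example (real embeddings typically
# have on the order of a hundred or more dimensions).
toy_vectors = {
    "mükellef": [0.9, 0.1, 0.3],
    "kişi": [0.8, 0.2, 0.4],
    "firma": [0.7, 0.3, 0.1],
    "uzak_kelime": [0.1, 0.9, 0.2],  # hypothetical unrelated term
}


def most_similar(word, vectors, topn=2):
    """Return the topn other words ranked by cosine similarity to `word`."""
    target = vectors[word]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:topn]


neighbours = most_similar("mükellef", toy_vectors)
for word, score in neighbours:
    print(f"{word}: {score:.3f}")
```

With a trained model, the same ranking would be computed over the learned embeddings rather than hand-written vectors.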
This article analyses the semantic proximity of terms frequently used in tax law and visualises the relationships between these words. For example, the close relationships of the word ‘mükellef’ (taxpayer) with words such as ‘kişi’ (person), ‘tam’ (full), ‘dar’ (limited), ‘firma’ (firm), and ‘imalatçı’ (manufacturer) are represented through vectors. The paper also explains the mathematical structure of the models. It then describes the features of the NumPy, Gensim, Scikit-learn, and Matplotlib libraries of the Python programming language that are used in the study. To visualise the similarity analysis, the t-SNE algorithm, which projects high-dimensional data onto a two-dimensional plane, was used.
The main purpose of this paper is to help AI systems that can be used as tax advisors understand tax law better by modelling the conceptual relationships between its terms, and thereby to help such systems provide more accurate and consistent information.
Word2Vec, tax law, natural language processing (NLP), t-SNE algorithm, Skip-Gram Model, language model, visualisation
Primary Language | English |
---|---|
Subjects | Artificial Intelligence (Other) |
Journal Section | Research Article |
Authors | |
Publication Date | January 30, 2025 |
Submission Date | December 6, 2024 |
Acceptance Date | January 3, 2025 |
Published in Issue | Year 2025 Volume: 1 Issue: 1 |