Research Article

Advances in Transformer-Based Semantic Search: Techniques, Benchmarks, and Future Directions

Volume: 17, Number: 1, June 30, 2025

Abstract

Semantic search has developed rapidly as demand for accurate information retrieval has grown across fields ranging from expert knowledge systems to web search engines. Conventional search methods that rely solely on keyword matching often fail to capture user intent and contextual cues. This survey reviews recent advances in Transformer-based models such as BERT, RoBERTa, T5, and GPT, which use self-attention mechanisms and contextual embeddings to improve precision and recall across diverse domains. It discusses the key architectural elements underlying these models, including dual-encoder and cross-encoder frameworks, and examines how Dense Passage Retrieval extends their capabilities to large-scale applications. Practical considerations, such as domain adaptation and fine-tuning strategies, are reviewed to highlight their impact on real-world deployment. Benchmark evaluations (e.g., MS MARCO, TREC, and BEIR) are presented to illustrate performance gains over traditional information retrieval methods and to examine open challenges in interpretability, bias, and resource-intensive training. Finally, the survey identifies emerging trends (multimodal semantic search, personalized retrieval, and continual learning) that promise to make AI-driven information retrieval more efficient and interpretable.
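
To make the dual-encoder idea mentioned in the abstract concrete, the following is a minimal illustrative sketch, not the method of any model covered by the survey. It assumes a toy `encode` function (a normalized bag-of-words vector over a small vocabulary) as a stand-in for a Transformer encoder. The helper names `build_vocab`, `encode`, `dot`, and `retrieve`, and the sample passages, are hypothetical. What the sketch is meant to show is the retrieval pattern: passages are encoded once ahead of time, the query is encoded separately, and ranking reduces to an inner product between the two vectors.

```python
import math

def build_vocab(texts):
    """Map each distinct whitespace token to a vector index."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def encode(text, vocab):
    """Toy stand-in for a Transformer encoder (assumption):
    an L2-normalized bag-of-words vector over the given vocabulary."""
    vec = [0.0] * len(vocab)
    for token in text.lower().split():
        if token in vocab:
            vec[vocab[token]] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec

def dot(a, b):
    """Inner product, the similarity score used for ranking."""
    return sum(x * y for x, y in zip(a, b))

def retrieve(query, passages, passage_vecs, vocab, k=2):
    """Encode the query independently and rank precomputed passage vectors."""
    q = encode(query, vocab)
    scored = [(dot(q, p_vec), i) for i, p_vec in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(passages[i], score) for score, i in scored[:k]]

# Hypothetical three-passage corpus.
passages = [
    "dense passage retrieval with dual encoders",
    "boolean keyword search strings",
    "continual learning for neural networks",
]
vocab = build_vocab(passages)
# Offline step: passage vectors are computed once and reused for every query.
passage_vecs = [encode(p, vocab) for p in passages]

top = retrieve("dual encoders for passage retrieval", passages, passage_vecs, vocab, k=2)
for text, score in top:
    print(f"{score:.3f}  {text}")
```

Because the passage vectors do not depend on the query, they can be indexed in advance, which is what makes this pattern practical at large scale. A cross-encoder, by contrast, scores each query and passage pair jointly, so nothing can be precomputed; it is typically applied only to re-rank a short candidate list.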

Details

Primary Language

English

Subjects

Deep Learning, Knowledge Representation and Reasoning, Computer System Software, Computer Software, Software Engineering (Other)

Journal Section

Research Article

Publication Date

June 30, 2025

Submission Date

February 4, 2025

Acceptance Date

March 1, 2025

APA
Kamil, M., & Çakır, D. (2025). Advances in Transformer-Based Semantic Search: Techniques, Benchmarks, and Future Directions. Turkish Journal of Mathematics and Computer Science, 17(1), 145-166. https://doi.org/10.47000/tjmcs.1633092