Research Article

Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation

Year 2024, Volume: 5 Issue: 2, 121 - 132, 20.12.2024
https://doi.org/10.58769/joinssr.1574195

Abstract

This paper introduces a Retrieval-Augmented Generation (RAG) system designed to make the medical information in patient information leaflets more accessible and easier to understand. The system combines Optical Character Recognition (OCR), vector embeddings, hybrid search (semantic plus full-text search), and Large Language Models (LLMs) such as GPT-3.5 Turbo to process and answer natural language queries. Integrated into a single architecture, these components retrieve the relevant medical data and generate responses that are precise and formatted so that laypersons can easily understand them. The system was evaluated through a series of real-world case studies, which demonstrated its ability to provide reliable, contextually relevant medical advice and thereby improve users' access to essential health information. The case studies also point to areas for future enhancement, particularly user interaction and the integration of system feedback. This work underscores the potential of advanced AI tools to make critical medical information more approachable for the public.
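As an illustration of the hybrid search step mentioned in the abstract, the following minimal sketch merges a semantic ranking and a full-text ranking with reciprocal rank fusion (RRF). The abstract does not state which fusion method the system uses, so RRF, the constant `k=60`, the function name, and the leaflet section IDs are all assumptions made for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in;
    documents are returned by descending total score (ties broken by ID).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical leaflet sections returned by the two retrievers.
semantic_hits = ["dosage", "side_effects", "storage"]
full_text_hits = ["side_effects", "contraindications", "dosage"]

fused = reciprocal_rank_fusion([semantic_hits, full_text_hits])
print(fused)
```

In a RAG pipeline of this kind, the top entries of the fused ranking would typically be passed to the language model as context for answering the user's question.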

References

  • [1] Tian, S., Jin, Q., Yeganova, L., Lai, P. T., Zhu, Q., Chen, X., ... & Lu, Z. (2024). Opportunities and challenges for ChatGPT and large language models in biomedicine and health. Briefings in Bioinformatics, 25(1), bbad493.
  • [2] Jin, Q., Leaman, R., & Lu, Z. (2023). Retrieve, summarize, and verify: how will ChatGPT affect information seeking from the medical literature? Journal of the American Society of Nephrology, 34(8), 1302-1304.
  • [3] Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., ... & Kiela, D. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. Advances in Neural Information Processing Systems, 33, 9459-9474.
  • [4] Zhao, P., Zhang, H., Yu, Q., Wang, Z., Geng, Y., Fu, F., ... & Cui, B. (2024). Retrieval-augmented generation for AI-generated content: A survey. arXiv preprint arXiv:2402.19473.
  • [5] Alsentzer, E., Murphy, J. R., Boag, W., Weng, W. H., Jin, D., Naumann, T., & McDermott, M. (2019). Publicly available clinical BERT embeddings. arXiv preprint arXiv:1904.03323.
  • [6] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., ... & Amodei, D. (2020). Language models are few-shot learners. arXiv preprint arXiv:2005.14165.
  • [7] Borgeaud, S., Mensch, A., Hoffmann, J., Cai, T., Rutherford, E., Millican, K., ... & Sifre, L. (2022, June). Improving language models by retrieving from trillions of tokens. In International conference on machine learning (pp. 2206-2240). PMLR.
  • [8] Zhu, X., Lin, T., Anand, V., Calderwood, M., Clausen-Brown, E., Lueck, G., ... & Wu, C. (2023, April). Explicit and Implicit Semantic Ranking Framework. In Companion Proceedings of the ACM Web Conference 2023 (pp. 326-330).
  • [9] Starker, E. (2023). Azure Cognitive Search: Outperforming vector search with hybrid retrieval and ranking capabilities. Microsoft. https://techcommunity.microsoft.com/t5/ai-azure-ai-services/azure-cognitive-search-outperforming-vector-search-with-hybrid/m-p/3931019.
  • [10] Andersson, H. (2024). Retrieval-augmented generation with Azure Open AI.
  • [11] Bruch, S., Gai, S., & Ingber, A. (2023). An analysis of fusion functions for hybrid retrieval. ACM Transactions on Information Systems, 42(1), 1-35.
  • [12] Chen, X., Gao, C., Chen, C., Zhang, G., & Liu, Y. (2024). An Empirical Study on Challenges for OpenAI Developers. arXiv preprint arXiv:2408.05002.
  • [13] Frisoni, G., Mizutani, M., Moro, G., & Valgimigli, L. (2022, December). Bioreader: a retrieval-enhanced text-to-text transformer for biomedical literature. In Proceedings of the 2022 conference on empirical methods in natural language processing (pp. 5770-5793).

There are 13 references in total.

Details

Primary Language English
Subjects Artificial Intelligence (Other)
Section Research Articles
Authors

Serhan Ayberk Kılıç 0000-0001-9986-3751

Kasım Serbest 0000-0002-0064-4020

Publication Date 20 December 2024
Submission Date 27 October 2024
Acceptance Date 28 November 2024
Published in Issue Year 2024 Volume: 5 Issue: 2

Cite

APA Kılıç, S. A., & Serbest, K. (2024). Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation. Journal of Smart Systems Research, 5(2), 121-132. https://doi.org/10.58769/joinssr.1574195
AMA Kılıç SA, Serbest K. Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation. JoinSSR. December 2024;5(2):121-132. doi:10.58769/joinssr.1574195
Chicago Kılıç, Serhan Ayberk, and Kasım Serbest. “Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation”. Journal of Smart Systems Research 5, no. 2 (December 2024): 121-32. https://doi.org/10.58769/joinssr.1574195.
EndNote Kılıç SA, Serbest K (01 December 2024) Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation. Journal of Smart Systems Research 5 2 121–132.
IEEE S. A. Kılıç and K. Serbest, “Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation”, JoinSSR, vol. 5, no. 2, pp. 121–132, 2024, doi: 10.58769/joinssr.1574195.
ISNAD Kılıç, Serhan Ayberk - Serbest, Kasım. “Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation”. Journal of Smart Systems Research 5/2 (December 2024), 121-132. https://doi.org/10.58769/joinssr.1574195.
JAMA Kılıç SA, Serbest K. Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation. JoinSSR. 2024;5:121–132.
MLA Kılıç, Serhan Ayberk and Kasım Serbest. “Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation”. Journal of Smart Systems Research, vol. 5, no. 2, 2024, pp. 121-32, doi:10.58769/joinssr.1574195.
Vancouver Kılıç SA, Serbest K. Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation. JoinSSR. 2024;5(2):121-32.