Research Article

Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation

Year: 2024, Volume: 5, Issue: 2, Pages: 121-132, Published: 20.12.2024
https://doi.org/10.58769/joinssr.1574195

Abstract

This paper introduces a Retrieval-Augmented Generation (RAG) system designed to improve the accessibility and comprehension of medical information in patient information leaflets. The system combines Optical Character Recognition (OCR), vector embeddings, hybrid search that merges semantic and full-text search, and Large Language Models (LLMs) such as GPT-3.5 Turbo to process and answer natural language queries. Integrated into a single architecture, these components retrieve medical data accurately and generate responses that are both precise and formatted to be easily understood by laypersons. The system was evaluated through a series of real-world case studies, which demonstrated its ability to provide reliable, contextually relevant medical advice, thereby significantly improving users' access to essential health information. The case studies also point to areas for future enhancement, particularly user interaction and the integration of system feedback. This work underscores the potential of advanced AI tools to transform information accessibility in healthcare by making critical medical information more approachable for the public.
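The hybrid search step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the full-text and semantic result lists are merged, so the sketch below assumes Reciprocal Rank Fusion (RRF), one common fusion function for hybrid retrieval; the function name `reciprocal_rank_fusion`, the constant `k=60`, and the leaflet chunk ids are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. This is an assumed fusion method, not
    necessarily the one used by the system described in the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical chunk ids returned for the same query by two retrievers.
full_text_hits = ["dosage", "side_effects", "storage"]
semantic_hits = ["side_effects", "interactions", "dosage"]

fused = reciprocal_rank_fusion([full_text_hits, semantic_hits])
print(fused)
```

Chunks that rank well in both lists rise to the top of the fused ranking, while chunks found by only one retriever are kept but ranked lower. In a RAG pipeline, the top fused chunks would then be passed to the LLM as context for answering the user's question.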



Details

Primary Language: English
Subjects: Artificial Intelligence (Other)
Journal Section: Research Articles
Authors

Serhan Ayberk Kılıç 0000-0001-9986-3751

Kasım Serbest 0000-0002-0064-4020

Publication Date: December 20, 2024
Submission Date: October 27, 2024
Acceptance Date: November 28, 2024
Published in Issue: Year 2024, Volume 5, Issue 2

Cite

APA Kılıç, S. A., & Serbest, K. (2024). Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation. Journal of Smart Systems Research, 5(2), 121-132. https://doi.org/10.58769/joinssr.1574195
AMA Kılıç SA, Serbest K. Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation. JoinSSR. December 2024;5(2):121-132. doi:10.58769/joinssr.1574195
Chicago Kılıç, Serhan Ayberk, and Kasım Serbest. “Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation”. Journal of Smart Systems Research 5, no. 2 (December 2024): 121-32. https://doi.org/10.58769/joinssr.1574195.
EndNote Kılıç SA, Serbest K (December 1, 2024) Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation. Journal of Smart Systems Research 5 2 121–132.
IEEE S. A. Kılıç and K. Serbest, “Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation”, JoinSSR, vol. 5, no. 2, pp. 121–132, 2024, doi: 10.58769/joinssr.1574195.
ISNAD Kılıç, Serhan Ayberk - Serbest, Kasım. “Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation”. Journal of Smart Systems Research 5/2 (December 2024), 121-132. https://doi.org/10.58769/joinssr.1574195.
JAMA Kılıç SA, Serbest K. Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation. JoinSSR. 2024;5:121–132.
MLA Kılıç, Serhan Ayberk and Kasım Serbest. “Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation”. Journal of Smart Systems Research, vol. 5, no. 2, 2024, pp. 121-32, doi:10.58769/joinssr.1574195.
Vancouver Kılıç SA, Serbest K. Increasing the Efficiency of the Use of Patient Information Leaflets by Using Retrieval Augmented Generation. JoinSSR. 2024;5(2):121-32.