Research Article

Assessment of Association Rule Mining Using Interest Measures on the Gene Data

Year 2022, Volume: 4 Issue: 3, 286 - 292, 22.09.2022
https://doi.org/10.37990/medr.1088631

Abstract

Aim: Data mining is the process of discovering useful, previously unknown information in large-scale data. Health is one of the fields in which data mining is widely used. With data mining, the diagnosis and treatment of a disease and the risk factors affecting it can be determined quickly. Association rule mining is one such data mining technique. The aim of this study is to determine patient profiles by obtaining strong association rules with the Apriori algorithm, one of the association rule algorithms.
Material and Method: The data set used in the study consists of 205 acute myocardial infarction (AMI) patients. Genotype data for the FNDC5 polymorphisms (rs3480, rs726344, rs16835198) were also available for these patients. Support and confidence are the two basic measures used to evaluate the rules produced by the Apriori algorithm. The rules selected by these measures are correct but not strong. Therefore, interest measures are used alongside the two basic measures to obtain stronger rules. In this study, the interest measures lift, conviction, certainty factor, cosine, phi and mutual information were applied.
Results: In this study, 108 rules were obtained. After the proposed interest measures were applied, 29 of these rules qualified as strong.
Conclusion: The use of interest measures yielded stronger rules for the clinical decision-making process. The strong rules obtained will facilitate patient profiling and clinical decision making for AMI patients.
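
The measures named in the abstract can all be computed from four counts for a rule A → B. The following is a minimal Python sketch using common textbook definitions, not the authors' code (the references suggest the study itself used the R package arules). The function name `interest_measures`, its parameters, and the example counts are hypothetical, and mutual information is given here in its unnormalized form in nats, which may differ from the variant used in the study.

```python
import math


def interest_measures(n, n_a, n_b, n_ab):
    """Interest measures for a rule A -> B from transaction counts.

    n: total transactions; n_a, n_b: transactions containing A and B;
    n_ab: transactions containing both A and B.
    """
    if not (0 < n_a < n and 0 < n_b < n):
        raise ValueError("A and B must each occur in some but not all transactions")
    if not (max(0, n_a + n_b - n) <= n_ab <= min(n_a, n_b)):
        raise ValueError("n_ab is inconsistent with n, n_a and n_b")

    p_a, p_b, p_ab = n_a / n, n_b / n, n_ab / n
    confidence = n_ab / n_a                      # P(B | A)
    lift = confidence / p_b                      # 1.0 means A and B are independent
    # Conviction is infinite when the rule never fails (confidence == 1).
    conviction = math.inf if n_ab == n_a else (1 - p_b) / (1 - confidence)

    # Certainty factor: relative change in belief in B once A is known.
    if confidence > p_b:
        certainty = (confidence - p_b) / (1 - p_b)
    elif confidence < p_b:
        certainty = (confidence - p_b) / p_b
    else:
        certainty = 0.0

    cosine = p_ab / math.sqrt(p_a * p_b)
    phi = (p_ab - p_a * p_b) / math.sqrt(p_a * p_b * (1 - p_a) * (1 - p_b))

    # Mutual information over the 2x2 contingency table (unnormalized, nats).
    # Each cell is (joint count, row marginal count, column marginal count).
    cells = [
        (n_ab, n_a, n_b),
        (n_a - n_ab, n_a, n - n_b),
        (n_b - n_ab, n - n_a, n_b),
        (n - n_a - n_b + n_ab, n - n_a, n - n_b),
    ]
    mutual_information = sum(
        (c / n) * math.log(c * n / (r * k)) for c, r, k in cells if c > 0
    )

    return {
        "support": p_ab,
        "confidence": confidence,
        "lift": lift,
        "conviction": conviction,
        "certainty_factor": certainty,
        "cosine": cosine,
        "phi": phi,
        "mutual_information": mutual_information,
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; they are not taken from the study.
    for name, value in interest_measures(n=100, n_a=40, n_b=50, n_ab=30).items():
        print(f"{name}: {value:.4f}")
```

A rule that passes the support and confidence thresholds can still have a lift near 1 or a phi near 0, meaning the antecedent and consequent are close to independent. Filtering on the additional measures is one way such rules could be discarded, which is consistent with the reduction from 108 to 29 rules reported above.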

Thanks

We thank Ozge Dıs for her contribution to the study.

References

  • Bulduk B, Aktaş MC, Bulduk M. Akut Miyokard İnfarktüsü Sonrası Gelişen Ruhsal Bozukluklar. JAREN/Hemşirelik Akademik Araştırma Dergisi.3(Supp: 1):24-7.
  • Arslan AK, Colak C, Sarihan ME. Different medical data mining approaches based prediction of ischemic stroke. Computer methods and programs in biomedicine. 2016;130:87-92.
  • Chen M-S, Han J, Yu PS. Data mining: an overview from a database perspective. IEEE Transactions on Knowledge and data Engineering. 1996;8(6):866-83.
  • Piateski G, Frawley W. Knowledge discovery in databases: MIT press; 1991.
  • Crockett D, Eliason B. What is data mining in healthcare. Insights: Health Catalyst. 2016.
  • Jain D, Gautam S. Implementation of apriori algorithm in health care sector: a survey. International Journal of Computer Science and Communication Engineering. 2013;2(4):22-8.
  • Abdullah U, Ahmad J, Ahmed A, editors. Analysis of effectiveness of apriori algorithm in medical billing data mining. 2008 4th International Conference on Emerging Technologies; 2008: IEEE.
  • Zhang W-J, Ma D-L, Dong B, editors. The automatic diagnosis system of breast cancer based on the improved Apriori algorithm. 2012 International Conference on Machine Learning and Cybernetics; 2012: IEEE.
  • Akbaş KE, Kivrak M, Arslan AK, Çolak C, editors. Assessment of Association Rules based on Certainty Factor: an Application on Heart Data Set. 2019 International Artificial Intelligence and Data Processing Symposium (IDAP); 2019 21-22 Sept. 2019.
  • DİŞ Ö. Akut miyokard infarktus hastalarında serum irisin düzeylerinin ve genetik varyantlarının araştırılması/The investigation of serum irisin levels and genetic variants in patients with acute myocardial infarction. 2017.
  • Hornik K, Grün B, Hahsler M. arules-A computational environment for mining association rules and frequent item sets. Journal of Statistical Software. 2005;14(15):1-25.
  • Team R. RStudio: Integrated development for R (version 1.1.463) [Computer software]. Boston: RStudio, Inc. http://www.rstudio.com; 2018.
  • Rao S, Gupta P. Implementing Improved Algorithm Over APRIORI Data Mining Association Rule Algorithm 1. 2012.
  • Berzal F, Blanco I, Sánchez D, Vila M-A. Measuring the accuracy and interest of association rules: A new framework. Intelligent Data Analysis. 2002;6(3):221-35.
  • Agrawal R, Imieliński T, Swami A, editors. Mining association rules between sets of items in large databases. Acm sigmod record; 1993: ACM.
  • Manimaran J, Velmurugan T. Analysing the quality of association rules by computing an interestingness measures. Indian Journal of Science and Technology. 2015;8(15):1-12.
  • Han J, Pei J, Kamber M. Data mining: concepts and techniques: Elsevier; 2011.
  • Kiran A, Reddy K. Selecting a right interestingness measure for rare association rules. Management of Data. 2010;115:115-24.
  • Brin S, Motwani R, Ullman JD, Tsur S. Dynamic itemset counting and implication rules for market basket data. Acm Sigmod Record. 1997;26(2):255-64.
  • Tan P-N, Kumar V, Srivastava J. Selecting the right objective measure for association analysis. Information Systems. 2004;29(4):293-313.
  • Sathyavani D, Sharmila D. Optimized Weighted Association Rule Mining using Mutual information on Fuzzy Data. International Journal of scientific research and management (IJSRM). 2014;2(1):501-5.
  • Zhang C, Zhang S. Association rule mining: models and algorithms: Springer-Verlag; 2002.
  • Hahsler M. A probabilistic comparison of commonly used interest measures for association rules. United States Southern Methodist University. 2015.
  • Li W, Han J, Pei J, editors. CMAR: Accurate and efficient classification based on multiple class-association rules. Proceedings 2001 IEEE international conference on data mining; 2001: IEEE.
  • Shortliffe EH, Buchanan BG. A model of inexact reasoning in medicine. Rule-based expert systems. 1984:233-62.
  • Ilayaraja M, Meyyappan T, editors. Mining medical data to identify frequent diseases using Apriori algorithm. 2013 International Conference on Pattern Recognition, Informatics and Mobile Engineering; 2013: IEEE.
  • Nahar J, Imam T, Tickle KS, Chen Y-PP. Association rule mining to detect factors which contribute to heart disease in males and females. Expert Systems with Applications. 2013;40(4):1086-93.
  • Karaolis M, Moutiris JA, Papaconstantinou L, Pattichis CS, editors. Association rule analysis for the assessment of the risk of coronary heart events. 2009 Annual International Conference of the IEEE Engineering in Medicine and Biology Society; 2009: IEEE.
  • Safdari R, Saeedi MG, Arji G, Gharooni M, Soraki M, Nasiri M. A model for predicting myocardial infarction using data mining techniques. Iranian journal of medical informatics. 2013;2(4):1-6.
  • Licastro F, Ianni M, Ferrari R, Campo G, Buscema PM, Grossi E, et al., editors. A New Risk Chart for Acute Myocardial Infarction by a Innovative Algoritm. HEALTHINF; 2015.

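Turkish title: Gen Verileri Üzerinde İlginçlik Ölçütleri Kullanılarak Birliktelik Kuralları Madenciliğinin Uygulanması (Application of Association Rule Mining Using Interestingness Measures on Gene Data)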

Abstract

Aim: Data mining is the process of discovering useful information that has not previously been extracted from large-scale data. Health is one of the fields in which data mining is widely used. With data mining, the diagnosis and treatment of a disease and the risk factors affecting it can be determined quickly. Association rules are one of the data mining techniques. The aim of this study is to determine patient profiles by obtaining strong association rules with the Apriori algorithm, one of the association rule algorithms.
Material and Method: The data set used in the study consists of 205 acute myocardial infarction (AMI) patients. The patients also carry genotypes of the FNDC5 polymorphisms rs3480, rs726344 and rs16835198. Support and confidence measures are used to evaluate the rules obtained with the Apriori algorithm. However, the rules obtained with these measures are correct but not strong. Therefore, interestingness measures are used alongside the two basic measures to obtain stronger rules. In this study, the interestingness measures lift, conviction, certainty factor, cosine, correlation coefficient (phi) and mutual information were applied to reach stronger rules.
Results: 108 rules were obtained in the study. After the interestingness measures were also applied to these rules, 29 rules remained, and these were qualified as strong rules.
Conclusion: The use of interestingness measures in the clinical decision-making process yielded stronger rules. The strong rules obtained will facilitate patient profiling and clinical decision making for AMI patients.

There are 30 citations in total.

Details

Primary Language: English
Subjects: Clinical Sciences
Journal Section: Original Articles
Authors

Kübra Elif Akbaş 0000-0002-2804-000X

Mehmet Kıvrak 0000-0002-2405-8552

Ahmet Kadir Arslan 0000-0001-8626-9542

Tuğçe Yakınbaş 0000-0002-2116-8211

Hasan Korkmaz 0000-0001-8455-6724

Ebru Önalan 0000-0001-9968-8201

Cemil Çolak 0000-0001-5406-098X

Publication Date: September 22, 2022
Acceptance Date: May 13, 2022
Published in Issue: Year 2022, Volume: 4, Issue: 3

Cite

AMA Akbaş KE, Kıvrak M, Arslan AK, Yakınbaş T, Korkmaz H, Önalan E, Çolak C. Assessment of Association Rule Mining Using Interest Measures on the Gene Data. Med Records. September 2022;4(3):286-292. doi:10.37990/medr.1088631

