Research Article

Bayesian Networks and Association Analysis in Knowledge Discovery Process

Year 2012, Volume: 5 Issue: 2, 51 - 64, 01.06.2012

Abstract

Data mining is a statistical process for extracting useful information, unknown patterns, and interesting relationships from large databases. Many statistical methods are used in this process, two of which are Bayesian networks and association analysis. Bayesian networks are probabilistic graphical models that encode relationships among a set of random variables in a database. Because they have both causal and probabilistic aspects, they make it easy to combine information from data with expert knowledge. Bayesian networks can also represent knowledge about an uncertain domain and support strong inferences. Association analysis is a useful technique for detecting hidden associations and rules in large databases; it extracts previously unknown and surprising patterns from already known information. A drawback of association analysis is that many patterns are generated even if the data set is very small. Hence, suitable interestingness measures must be applied to eliminate uninteresting patterns.
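The point that even a very small data set yields many patterns can be illustrated with a minimal sketch. The transactions, the `min_support` threshold, and the function names below are hypothetical and are not taken from the article; support, confidence, and lift are used here only as familiar examples of rule measures, and the enumeration is brute force rather than an efficient mining algorithm.

```python
from itertools import combinations

# Hypothetical toy data: five market-basket transactions (not from the article).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(transactions, min_support=0.2):
    """Enumerate rules A -> B from all frequent itemsets by brute force.

    Assumes min_support > 0, so every antecedent has non-zero support.
    """
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for itemset in combinations(items, size):
            s = support(itemset, transactions)
            if s < min_support:
                continue
            # Split the frequent itemset into every antecedent/consequent pair.
            for k in range(1, size):
                for antecedent in combinations(itemset, k):
                    consequent = tuple(i for i in itemset if i not in antecedent)
                    confidence = s / support(antecedent, transactions)
                    lift = confidence / support(consequent, transactions)
                    rules.append((antecedent, consequent, s, confidence, lift))
    return rules


rules = mine_rules(TRANSACTIONS)
for antecedent, consequent, s, confidence, lift in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={s:.2f} confidence={confidence:.2f} lift={lift:.2f}")
print(len(rules), "rules from", len(TRANSACTIONS), "transactions")
```

With only three distinct items and five transactions, every itemset of two or more items passes the threshold and each one is split into several rules, which is why some measure is needed to decide which rules deserve attention.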



Bayesian networks and association analysis can be used together in knowledge discovery. Just as association rules can be used to build Bayesian networks, Bayesian networks can be used to define interestingness measures that identify interesting patterns. This study explains this mutual use of Bayesian networks and association analysis and illustrates it on a real-life problem.
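One way to read the idea of a Bayesian-network-based interestingness measure is to score an itemset by how far its observed support departs from the support the network predicts. The sketch below follows that reading as an assumption; it is not the measure defined in the article. The network structure, the conditional probabilities, the transactions, and all function names are hypothetical.

```python
from itertools import product

# Hypothetical network over binary variables (1 = item present).
# Format: variable -> (parents, {parent values: P(variable = 1 | parents)}).
# Variables are listed parents-first; all probabilities are made up.
NETWORK = {
    "bread": ((), {(): 0.8}),
    "milk": (("bread",), {(0,): 0.9, (1,): 0.7}),
    "butter": (("bread",), {(0,): 0.1, (1,): 0.5}),
}

# Hypothetical toy transactions (not from the article).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def joint_probability(assignment, network):
    """P(assignment) as the product of each variable's conditional probability."""
    p = 1.0
    for var, (parents, cpt) in network.items():
        p_one = cpt[tuple(assignment[q] for q in parents)]
        p *= p_one if assignment[var] == 1 else 1.0 - p_one
    return p


def expected_support(itemset, network):
    """Probability under the network that every item in `itemset` is present."""
    variables = list(network)
    total = 0.0
    # Sum the joint distribution over all assignments consistent with the itemset.
    for values in product((0, 1), repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(assignment[i] == 1 for i in itemset):
            total += joint_probability(assignment, network)
    return total


def observed_support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(set(itemset) <= t for t in transactions) / len(transactions)


def interestingness(itemset, network, transactions):
    """Absolute gap between observed and network-predicted support."""
    return abs(observed_support(itemset, transactions)
               - expected_support(itemset, network))


for itemset in [("bread", "milk"), ("milk", "butter"), ("bread", "butter")]:
    score = interestingness(itemset, NETWORK, TRANSACTIONS)
    print(itemset, round(score, 3))
```

Under this reading, an itemset whose support the network already predicts well scores close to zero and can be filtered out, while a large gap flags a pattern that the current background knowledge does not explain. Exhaustive enumeration of assignments is only practical for a handful of variables.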


Bilgi Keşfi Sürecinde Bayesci Ağlar ve Birliktelik Analizi


Abstract

Data mining is a statistical process in which useful information, unknown patterns, and interesting relationships are extracted from large data sets. Many statistical methods can be used in this process. Two of these methods are Bayesian networks and association analysis. Bayesian networks are graphical models that encode the probabilistic relationships among a set of random variables in a database. Because they have both causal and probabilistic properties, Bayesian networks make it easy to combine data with expert knowledge. Bayesian networks are also used to represent knowledge about the uncertain domain of the problem of interest, and they allow strong inferences to be made. Association analysis is a method for uncovering hidden associations, useful rules, and surprising patterns in large databases. A drawback of association analysis is that a large number of patterns are produced even when the data set is very small. For this reason, interestingness measures should be used to eliminate the uninteresting patterns.


Details

Primary Language Turkish
Subjects Engineering
Journal Section Articles
Authors

Derya Ersel

Süleyman Günay

Publication Date June 1, 2012
Published in Issue Year 2012 Volume: 5 Issue: 2

Cite

IEEE D. Ersel and S. Günay, “Bilgi Keşfi Sürecinde Bayesci Ağlar ve Birliktelik Analizi”, JSSA, vol. 5, no. 2, pp. 51–64, 2012.