Classification of Down Syndrome of Mice Protein Dataset on MongoDB Database

Fahriye Gemci Furat; Turgay Ibrikci

doi:10.17694/bajece.419553

Research Article

Year 2018, Volume: 6 Issue: 2, 112 - 117, 30.04.2018

Cited By: 1

Abstract

The mice protein expression data set contains samples both with and without
Down Syndrome. Understanding Down Syndrome treatment through mouse proteins is
important because the same treatment may be applicable to humans. In the
present study, the mice protein expression data set from the UCI repository is
classified using five classification methods: Bayesian Network, K-Nearest
Neighbor (KNN), Decision Table, Random Forest, and Support Vector Machine
(SVM). The classifiers are evaluated both with 10-fold cross-validation and
with an equal split of the data into training and test sets. With 10-fold
cross-validation, the data set was classified with 94.3519% accuracy in 0.06
seconds using Bayesian Network, 99.2593% accuracy in 0.01 seconds using KNN,
95.4630% accuracy in 1.2 seconds using Decision Table, 100% accuracy in 0.58
seconds using Random Forest, and 100% accuracy in 1.17 seconds using SVM. With
the equal train-test partition, the data set was classified with 95.3704%
accuracy in 0.22 seconds using Bayesian Network, 98.3333% accuracy in 0
seconds using KNN, 98.3333% accuracy in 0.72 seconds using Decision Table,
100% accuracy in 0.77 seconds using Random Forest, and 100% accuracy in 1.48
seconds using SVM.
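
The two evaluation protocols named in the abstract, 10-fold cross-validation and an equal train-test split, can be illustrated with a small sketch. This is not the authors' implementation and does not use their tooling or the mice protein data: the data are synthetic (two Gaussian classes), the classifier is a plain 1-nearest-neighbor stand-in for KNN, and all function names are hypothetical. It only shows how the two protocols partition the same samples differently.

```python
import random


def make_synthetic_data(n_per_class=60, n_features=5, seed=0):
    """Generate two well-separated Gaussian classes (placeholder data,
    NOT the mice protein expression data set)."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            features = [rng.gauss(center, 1.0) for _ in range(n_features)]
            data.append((features, label))
    rng.shuffle(data)
    return data


def predict_1nn(train, x):
    """Return the label of the training sample closest to x
    (squared Euclidean distance)."""
    nearest = min(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
    return nearest[1]


def count_correct(train, test):
    """Number of test samples whose predicted label matches the true label."""
    return sum(1 for x, y in test if predict_1nn(train, x) == y)


def cross_val_accuracy(data, k=10):
    """k-fold cross-validation: every sample is tested exactly once,
    by a model built from the other k-1 folds."""
    folds = [data[i::k] for i in range(k)]
    correct = 0
    for i, test in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        correct += count_correct(train, test)
    return correct / len(data)


def holdout_accuracy(data):
    """Equal split: first half for training, second half for testing."""
    half = len(data) // 2
    train, test = data[:half], data[half:]
    return count_correct(train, test) / len(test)


data = make_synthetic_data()
cv_acc = cross_val_accuracy(data, k=10)
split_acc = holdout_accuracy(data)
print(f"10-fold CV accuracy:  {cv_acc:.4f}")
print(f"50/50 split accuracy: {split_acc:.4f}")
```

In the cross-validation case each model is built from 90% of the samples and every sample contributes to the score, whereas the equal split builds from 50% and scores only the held-out half. That difference is one reason the two protocols can report different accuracies for the same classifier.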

Keywords

Bayesian Network, KNN, Decision Table, Random Forest, SVM, Classification, MongoDB, NoSQL

Details

Primary Language	English
Subjects	Engineering
Section	Research Article
Authors	Fahriye Gemci Furat; Turgay Ibrikci
Publication Date	April 30, 2018
Published in Issue	Year 2018 Volume: 6 Issue: 2

Cite

APA	Furat, F. G., & Ibrikci, T. (2018). Classification of Down Syndrome of Mice Protein Dataset on MongoDB Database. Balkan Journal of Electrical and Computer Engineering, 6(2), 112-117. https://doi.org/10.17694/bajece.419553

Cited By

LSTM-GRU Based Deep Learning Model with Word2Vec for Transcription Factors in Primates

Balkan Journal of Electrical and Computer Engineering

https://doi.org/10.17694/bajece.1191009