Research Article
Year 2018, Volume: 6 Issue: 2, 112 - 117, 30.04.2018
https://doi.org/10.17694/bajece.419553


Classification of Down Syndrome of Mice Protein Dataset on MongoDB Database


Abstract

The mice protein expression data set contains samples both with and without Down Syndrome. Studying Down Syndrome treatment through mice protein expression is important because the same treatment may be applicable to humans. In the present study, the mice protein expression data set from the UCI repository is classified using five classification methods: Bayesian Network, K-Nearest Neighbor (KNN), Decision Table, Random Forest, and Support Vector Machine (SVM). The algorithms are tested on the data set both with 10-fold cross-validation and with the data split equally into training and test sets. With 10-fold cross-validation, the data set was classified with 94.3519% accuracy in 0.06 seconds using Bayesian Network, 99.2593% accuracy in 0.01 seconds using KNN, 95.4630% accuracy in 1.2 seconds using Decision Table, 100% accuracy in 0.58 seconds using Random Forest, and 100% accuracy in 1.17 seconds using SVM. With the equal train-test partition, the data set was classified with 95.3704% accuracy in 0.22 seconds using Bayesian Network, 98.3333% accuracy in 0 seconds using KNN, 98.3333% accuracy in 0.72 seconds using Decision Table, 100% accuracy in 0.77 seconds using Random Forest, and 100% accuracy in 1.48 seconds using SVM.
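
The two evaluation protocols named in the abstract, 10-fold cross-validation and an equal train-test split, can be contrasted with a minimal Python sketch. This is not the authors' implementation: the 1-nearest-neighbor classifier, the synthetic two-cluster data, the class labels, and every function name below are assumptions chosen only to illustrate how the two protocols differ.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def predict_1nn(train, x):
    """Return the label of the training sample closest to x (squared Euclidean distance)."""
    nearest = min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
    return nearest[1]

def cross_val_accuracy(data, k=10, seed=0):
    """Pooled accuracy over k folds; every sample is held out exactly once."""
    hits = 0
    for fold in k_fold_indices(len(data), k, seed):
        held_out = set(fold)
        train = [d for i, d in enumerate(data) if i not in held_out]
        hits += sum(predict_1nn(train, data[i][0]) == data[i][1] for i in fold)
    return hits / len(data)

def holdout_accuracy(data, seed=0):
    """Accuracy with a shuffled equal split: first half trains, second half tests."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    train, test = shuffled[:half], shuffled[half:]
    hits = sum(predict_1nn(train, x) == y for x, y in test)
    return hits / len(test)

def make_toy_data(n_per_class=50, seed=1):
    """Hypothetical stand-in data: two well-separated 2-D Gaussian clusters."""
    rng = random.Random(seed)
    data = []
    for label, center in (("control", 0.0), ("down_syndrome", 10.0)):
        for _ in range(n_per_class):
            features = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            data.append((features, label))
    return data

data = make_toy_data()
print("10-fold cross-validation accuracy:", cross_val_accuracy(data))
print("Equal train-test split accuracy:", holdout_accuracy(data))
```

In this sketch, cross-validation tests every sample once while training on the other nine folds, whereas the equal split trains on only half of the data and tests on the remaining half. That difference in training-set size is one plausible reason the two protocols can report different accuracies for the same classifier.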


Details

Primary Language English
Subjects Engineering
Journal Section Research Articles
Authors

Fahriye Gemci Furat

Turgay Ibrikci

Publication Date April 30, 2018
Published in Issue Year 2018 Volume: 6 Issue: 2

Cite

APA Furat, F. G., & Ibrikci, T. (2018). Classification of Down Syndrome of Mice Protein Dataset on MongoDB Database. Balkan Journal of Electrical and Computer Engineering, 6(2), 112-117. https://doi.org/10.17694/bajece.419553

All articles published by BAJECE are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit and adapt the work provided the original work and source are appropriately cited.