Research Article

Boyut İndirgeme Yöntemlerinin Karşılaştırmalı Analizi

Year 2020, Volume 9, Issue 1, 107 - 113, 18.06.2020
https://doi.org/10.46810/tdfd.707200

Abstract

Databases today are growing rapidly. For example, an average of 300 hours of video is uploaded to YouTube every minute. Processing, storage, and transfer costs rise in proportion to data size. At the same time, the contents of high-dimensional data such as video and images are known to be largely similar. Reducing such high-dimensional raw data to lower dimensions is vitally important for image classification, detection, and meaningful information extraction.
Numerous techniques exist for reducing data dimensionality. Among classical machine learning techniques, PCA (Principal Component Analysis) and LDA (Linear Discriminant Analysis) stand out because they give the problem a mathematical basis for a solution, while among nonlinear techniques the Autoencoder (Auto-Encoding), a deep learning approach, attracts researchers' interest because it allows large datasets to be reduced.
In this study, the dimension reduction performance of the PCA, LDA, and Auto-Encoding (AE) methods was examined using real and synthetic (linear and nonlinear) data. The results obtained under specific criteria (time spent, reconstruction accuracy, etc.) are presented comparatively.

References

  • [1] Tharwat A. Principal component analysis - a tutorial. Int J Appl Pattern Recognit. 2016.
  • [2] Jamal A, Handayani A, Septiandri AA, Ripmiatin E, Effendi Y. Dimensionality Reduction using PCA and K-Means Clustering for Breast Cancer Prediction. Lontar Komput J Ilm Teknol Inf. 2018.
  • [3] Gu Q, Li Z, Han J. Linear discriminant dimensionality reduction. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2011.
  • [4] Analysis LD. Introduction to LDA LDA. Cancer Lett. 2005.
  • [5] Ng A. “Sparse autoencoder.” CS294A Lect notes 72. 2011;1(19).
  • [6] Çalışan M, Talu MF. Examination of the effect of the basic parameters of the auto-encoder on coding performance. In: IDAP 2017 - International Artificial Intelligence and Data Processing Symposium. 2017.
  • [7] MNIST Dataset [Internet]. [cited 2019 May 12]. Available from: http://yann.lecun.com/exdb/mnist/
  • [8] Keogh EJ, Pazzani MJ. A simple dimensionality reduction technique for fast similarity search in large time series databases. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2000.
  • [9] Bishop CM. Pattern Recognition and Machine Learning. Oxidation Communications. 2004.
  • [10] Han J, Kamber M, Pei J. Data Mining: Concepts and Techniques. Data Mining: Concepts and Techniques. 2012.
  • [11] Tantawi MM, Revett K, Salem A, Tolba MF. Fiducial feature reduction analysis for electrocardiogram (ECG) based biometric recognition. J Intell Inf Syst. 2013.
  • [12] Bruce LM, Koger CH, Li J. Dimensionality reduction of hyperspectral data using discrete wavelet transform feature extraction. IEEE Trans Geosci Remote Sens. 2002.
  • [13] Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, et al. Human-level control through deep reinforcement learning. Nature. 2015.
  • [14] Vinjamuri R, Patel V, Powell M, Mao ZH, Crone N. Candidates for synergies: Linear Discriminants versus principal components. Comput Intell Neurosci. 2014.
  • [15] Runnalls AR. Kullback-Leibler approach to Gaussian mixture reduction. IEEE Trans Aerosp Electron Syst. 2007.

Comparative Analysis of Dimension Reduction Methods

Abstract

Today's databases are growing rapidly. For example, an average of 300 hours of video is uploaded to YouTube every minute. Processing, storage, and transfer costs increase in proportion to the size of the data. On the other hand, the contents of high-dimensional data such as video and images are known to be largely similar. Reducing such high-dimensional raw data to low dimensions is vital for image classification, detection, and meaningful information extraction.
Many techniques are available for reducing data dimensionality. Among classical machine learning techniques, PCA (Principal Component Analysis) and LDA (Linear Discriminant Analysis) come to the fore because they give the problem a mathematical basis for a solution, while the Autoencoder, a nonlinear technique from the family of deep learning approaches, attracts researchers' attention because it allows large datasets to be reduced.
In this study, the dimension reduction performance of the PCA, LDA, and Auto-Encoding (AE) methods was investigated using real and synthetic (linear and nonlinear) data. The results obtained under specific criteria (time spent, reconstruction accuracy, etc.) are presented comparatively.
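
The two criteria named above, time spent and reconstruction accuracy, can be illustrated with a minimal sketch. This is not the authors' code: it covers only PCA (computed via SVD, which is an assumption about implementation), and the synthetic "linear" and "nonlinear" datasets, their sizes, and the function names are hypothetical stand-ins chosen for illustration.

```python
import time
import numpy as np

def pca_fit(X, k):
    """Return the data mean and the top-k principal directions (as rows)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def pca_reconstruction_mse(X, k):
    """Project X onto k components, map back, and return (MSE, seconds)."""
    start = time.perf_counter()
    mean, components = pca_fit(X, k)
    Z = (X - mean) @ components.T   # reduced representation, shape (n, k)
    X_hat = Z @ components + mean   # reconstruction in the original space
    elapsed = time.perf_counter() - start
    return float(np.mean((X - X_hat) ** 2)), elapsed

rng = np.random.default_rng(0)
n, d, k = 500, 20, 2

# Hypothetical "linear" data: a 2-D latent signal embedded linearly in 20-D, plus small noise.
latent = rng.normal(size=(n, k))
X_linear = latent @ rng.normal(size=(k, d)) + 0.01 * rng.normal(size=(n, d))

# Hypothetical "nonlinear" data: one latent variable passed through sines of different frequencies.
t = latent[:, :1]
X_nonlinear = np.hstack([np.sin(j * t) for j in range(1, d + 1)])
X_nonlinear = X_nonlinear + 0.01 * rng.normal(size=(n, d))

mse_linear, t_linear = pca_reconstruction_mse(X_linear, k)
mse_nonlinear, t_nonlinear = pca_reconstruction_mse(X_nonlinear, k)
print(f"linear:    MSE={mse_linear:.5f}  time={t_linear:.4f}s")
print(f"nonlinear: MSE={mse_nonlinear:.5f}  time={t_nonlinear:.4f}s")
```

On data like this, a linear method such as PCA would be expected to reconstruct the linearly embedded set almost exactly while leaving a much larger error on the nonlinear set, which is the kind of gap a nonlinear method such as an autoencoder is meant to close.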

There are 12 citations in total.

Details

Primary Language: Turkish
Subjects: Engineering
Journal Section: Articles
Authors:

Mücahit Çalışan 0000-0003-2651-5937

Muhammed Fatih Talu 0000-0003-1166-8404

Publication Date: June 18, 2020
Published in Issue: Year 2020

Cite

APA Çalışan, M., & Talu, M. F. (2020). Boyut İndirgeme Yöntemlerinin Karşılaştırmalı Analizi. Türk Doğa Ve Fen Dergisi, 9(1), 107-113. https://doi.org/10.46810/tdfd.707200
AMA Çalışan M, Talu MF. Boyut İndirgeme Yöntemlerinin Karşılaştırmalı Analizi. TDFD. June 2020;9(1):107-113. doi:10.46810/tdfd.707200
Chicago Çalışan, Mücahit, and Muhammed Fatih Talu. “Boyut İndirgeme Yöntemlerinin Karşılaştırmalı Analizi”. Türk Doğa Ve Fen Dergisi 9, no. 1 (June 2020): 107-13. https://doi.org/10.46810/tdfd.707200.
EndNote Çalışan M, Talu MF (June 1, 2020) Boyut İndirgeme Yöntemlerinin Karşılaştırmalı Analizi. Türk Doğa ve Fen Dergisi 9 1 107–113.
IEEE M. Çalışan and M. F. Talu, “Boyut İndirgeme Yöntemlerinin Karşılaştırmalı Analizi”, TDFD, vol. 9, no. 1, pp. 107–113, 2020, doi: 10.46810/tdfd.707200.
ISNAD Çalışan, Mücahit - Talu, Muhammed Fatih. “Boyut İndirgeme Yöntemlerinin Karşılaştırmalı Analizi”. Türk Doğa ve Fen Dergisi 9/1 (June 2020), 107-113. https://doi.org/10.46810/tdfd.707200.
JAMA Çalışan M, Talu MF. Boyut İndirgeme Yöntemlerinin Karşılaştırmalı Analizi. TDFD. 2020;9:107–113.
MLA Çalışan, Mücahit and Muhammed Fatih Talu. “Boyut İndirgeme Yöntemlerinin Karşılaştırmalı Analizi”. Türk Doğa Ve Fen Dergisi, vol. 9, no. 1, 2020, pp. 107-13, doi:10.46810/tdfd.707200.
Vancouver Çalışan M, Talu MF. Boyut İndirgeme Yöntemlerinin Karşılaştırmalı Analizi. TDFD. 2020;9(1):107-13.