Research Article

A new feature vector model for alignment-free DNA sequence similarity analysis

Volume: 40 Number: 3 October 9, 2022
  • Emre Delibaş *
  • Ahmet Arslan

Abstract

Advances in technology have led to the production of big data, including enormous amounts of biological data. A number of analysis methods have been developed to extract the information these data contain, and DNA sequence analysis has drawn particular attention in recent years. Alignment-free comparison methods have emerged as an alternative to alignment-based sequence comparison methods, which have high computational costs. These methods calculate sequence similarity from numerical characterizations of different dimensions. In this paper, we propose a novel alignment-free DNA sequence analysis method based on a feature extraction strategy. The method uses numerical characterization: for each sequence, it calculates the mean distance of the transitions, the mean distance of the nucleotide duplications, and the base frequencies. It then measures the similarity between the 7-dimensional vectors obtained through this feature extraction. To demonstrate the effectiveness of the method, we conducted a sequence similarity analysis on two DNA sequence datasets of different lengths. The results show that a simple and successful feature vector can be obtained when DNA sequences, which have many properties, are combined with appropriate and effective descriptors. With this strategy, reasonable results were obtained at a low computational cost.
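
The feature extraction outlined in the abstract can be illustrated with a rough sketch. The abstract does not define the descriptors precisely or say how the seven components are divided, so everything below is an assumption made for illustration: a "transition" is taken to be any position where two adjacent bases differ, a "duplication" any position where two adjacent bases are equal, the vector has only six components (four base frequencies plus two mean distances), and Euclidean distance stands in for the unspecified similarity measure. The names `mean_gap`, `feature_vector`, and `euclidean` are hypothetical and do not come from the paper.

```python
from collections import Counter
from math import sqrt


def mean_gap(positions):
    """Mean distance between consecutive positions; 0.0 if fewer than two."""
    if len(positions) < 2:
        return 0.0
    # Consecutive gaps telescope to (last - first).
    return (positions[-1] - positions[0]) / (len(positions) - 1)


def feature_vector(seq):
    """Illustrative 6-component vector (NOT the paper's 7-component one).

    Components: frequencies of A, C, G, T, then the mean distance between
    assumed "transitions" and between assumed "duplications".
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")

    counts = Counter(seq)
    freqs = [counts[base] / n for base in "ACGT"]

    # Assumption: a transition is an index i where seq[i] != seq[i + 1].
    transitions = [i for i in range(n - 1) if seq[i] != seq[i + 1]]
    # Assumption: a duplication is an index i where seq[i] == seq[i + 1].
    duplications = [i for i in range(n - 1) if seq[i] == seq[i + 1]]

    return freqs + [mean_gap(transitions), mean_gap(duplications)]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed measure)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    v1 = feature_vector("AACCGGTT")
    v2 = feature_vector("ACGTACGT")
    print(v1)
    print(euclidean(v1, v2))
```

Because each sequence is reduced to a short fixed-length vector in a single pass, comparing two sequences costs only a handful of arithmetic operations regardless of their lengths, which is the kind of saving over alignment that the abstract points to.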

Details

Primary Language

English

Subjects

Engineering

Journal Section

Research Article

Authors

Emre Delibaş *
0000-0001-7564-5020
Türkiye

Publication Date

October 9, 2022

Submission Date

February 8, 2021

Acceptance Date

March 23, 2021

Published in Issue

Year 2022 Volume: 40 Number: 3

APA
Delibaş, E., & Arslan, A. (2022). A new feature vector model for alignment-free DNA sequence similarity analysis. Sigma Journal of Engineering and Natural Sciences, 40(3), 610-619. https://izlik.org/JA73AJ74LA