Database Selection Shapes Protein Identification and Quantification in Proteomics
Abstract
Objective: This study evaluates how the composition, size, and subcellular specificity of protein databases affect protein identification, peptide coverage, and quantitative accuracy in MS-based proteomic analyses. Methods: Whole-proteome analyses were performed on HeLa cells. Label-free quantification (LFQ) of secretome samples from the breast cell lines MCF-10A and MDA-MB-231 was used to assess database-related effects on quantification. Nuclear proteome analyses used raw MS data generated in previous studies. Reviewed, unreviewed, and subcellular proteome-specific databases were systematically compared to evaluate the effect of database choice on LC-MS/MS results. Results: Including unreviewed protein entries did not meaningfully increase protein or peptide identifications. Using gene names rather than accession numbers gave more accurate results. In addition, subcellular proteome-specific databases improved the consistency of peptide-to-protein assignment and reduced ambiguity. Even modest differences in peptide counts propagated into downstream quantitative analyses and affected abundance ratios. Conclusion: Database choice plays a crucial role in the clarity, consistency, and quantitative reliability of proteomic data. Curated, reviewed, and target-specific subcellular databases enhance analytical confidence, which underscores the importance of careful database selection in MS-based proteomic studies.
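The gene-name result can be illustrated with a minimal sketch. It assumes that a reviewed and an unreviewed entry for the same gene appear as separate accessions in the search output; this is one plausible reading of the abstract, not the authors' stated pipeline. All accessions, gene names, and peptide counts below are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical search output: (accession, gene_name, peptide_count).
# These identifiers are placeholders, not real database entries.
hits = [
    ("ACC0001", "GENE_A", 12),  # e.g. a reviewed entry
    ("ACC0002", "GENE_A", 11),  # e.g. an unreviewed entry for the same gene
    ("ACC0003", "GENE_B", 7),
    ("ACC0004", "GENE_C", 4),
    ("ACC0005", "GENE_C", 4),
]


def count_by_accession(rows):
    """Number of distinct accessions, i.e. the accession-level protein count."""
    return len({acc for acc, _, _ in rows})


def collapse_by_gene(rows):
    """Keep, for each gene, the highest peptide count among its accessions."""
    best = defaultdict(int)
    for _, gene, n_peptides in rows:
        best[gene] = max(best[gene], n_peptides)
    return dict(best)


by_gene = collapse_by_gene(hits)
print(count_by_accession(hits), len(by_gene))  # prints: 5 3
```

In this toy input, counting by accession reports five identifications for what are only three genes, which is the kind of redundancy a gene-level summary removes.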
Details
Primary Language: English
Subjects: Proteomics and Intermolecular Interactions
Section: Research Article
Authors
Mehmet Sarıhan *, ORCID 0000-0002-1565-5718, Türkiye
Murat Kasap, ORCID 0000-0001-8527-2096, Türkiye
Gürler Akpınar, ORCID 0000-0002-9675-3714, Türkiye
Publication Date: February 27, 2026
Submission Date: December 30, 2025
Acceptance Date: January 28, 2026
Published in Issue: Year 2026, Volume 9