Predicting the 3-D structure of a protein from its sequence on the basis of a template protein structure remains one of the most accurate modeling techniques available today. However, template-based modeling depends heavily on the selection of a single template structure and on the sequence alignment between target and template. Particularly when the sequence identity between target and template is low, errors in the alignment introduce larger errors into the model structure. In this study, an iterative method for correcting such alignment mistakes is applied to a benchmark set from CASP in the extremely low sequence-identity regime. The protocol, which was developed and tested previously, evaluates alignment quality by building a rough 3-D model for each alignment and then uses a genetic algorithm to iteratively create a new set of alignments. Since the method evaluates models, not sequence alignments, structural features are automatically incorporated into the alignment protocol. In the current study, models were first built from structural alignments with the Modeller program to establish the maximum model quality that can be obtained from a given template structure with the iterative modeling protocol. The results and the correctly aligned segments produced by the iterative modeling protocol were then analyzed. Finally, it is shown that correctly aligned segments exist in the pool of alignments created by the protocol, so they could be selected if a good local fragment assessment scoring function were developed. Thus, improvement of modeling in the low sequence-identity regime is conceivable.
Homology modeling, Sequence-sequence alignment, Genetic algorithm, Molecular modeling, Structural alignment
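The iterative idea summarized in the abstract (score each alignment through the model it produces, then let a genetic algorithm propose new alignments) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the published protocol: an alignment is reduced to a list of per-segment integer shifts, `REFERENCE` is a hypothetical stand-in for a structural alignment, and `toy_score` replaces the real step of building a rough 3-D model and assessing it.

```python
import random

# Hypothetical "correct" per-segment shifts, standing in for a structural alignment.
REFERENCE = [0, 2, 2, -1, 0, 3]


def toy_score(alignment):
    """Placeholder for model building and assessment (higher is better).

    In a real protocol this would build a rough 3-D model from the
    alignment and score the model; here it only measures the distance
    of the shifts from REFERENCE.
    """
    return -sum(abs(x - r) for x, r in zip(alignment, REFERENCE))


def evolve_alignments(population, score_model, generations=20, rng=None):
    """Iteratively create new alignments with a simple genetic algorithm.

    population:  list of alignments, each a list of at least two ints
    score_model: callable mapping an alignment to a score
    """
    rng = rng or random.Random(0)
    pop = [list(a) for a in population]
    for _ in range(generations):
        # Rank alignments by the score of the model they would produce.
        ranked = sorted(pop, key=score_model, reverse=True)
        # Keep the better half unchanged (elitism).
        parents = ranked[: max(2, len(ranked) // 2)]
        children = []
        while len(parents) + len(children) < len(pop):
            a, b = rng.sample(parents, 2)
            # One-point crossover between two parent alignments.
            cut = rng.randrange(1, len(a))
            child = a[:cut] + b[cut:]
            # Mutation: move one segment by a single position.
            i = rng.randrange(len(child))
            child[i] += rng.choice((-1, 1))
            children.append(child)
        pop = parents + children
    return max(pop, key=score_model)


if __name__ == "__main__":
    seed_rng = random.Random(1)
    start = [[seed_rng.randint(-5, 5) for _ in REFERENCE] for _ in range(10)]
    best = evolve_alignments(start, toy_score, generations=30)
    print(best, toy_score(best))
```

Because the better half of each generation is carried over unchanged, the best score in the pool never decreases. The quality of the final alignment is bounded by how well the scoring function recognizes correct segments, which is the limitation the abstract points to.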
Primary Language | English |
---|---|
Subjects | Genetics (Other), Animal Cell and Molecular Biology, Protein Engineering |
Journal Section | Research Articles |
Authors | |
Early Pub Date | February 17, 2024 |
Publication Date | March 15, 2024 |
Submission Date | December 9, 2023 |
Acceptance Date | January 15, 2024 |
Published in Issue | Year 2024 |