Improving Profile Similarity Search and Alignment of Protein Sequences

Tong, Jing

Files

TONG-DISSERTATION-2015.pdf (5.27 MB)

Date

2015-11-20

Authors

Tong, Jing

Abstract

Protein function prediction is one of the most important problems in computational biology. The most reliable way to predict protein function is to detect homologs, since homologous proteins tend to possess conserved sequence motifs, the same structural folds, and similar functional sites. Current sequence-based homology search methods are still unable to detect many similarities that are evident from protein spatial structures. We present a new method, COMPADRE, that assesses the relationship between a query sequence and a database hit by considering the similarity between the query and the hit's known homologs. This method markedly improves the precision of homology detection. Successful homology-based function prediction also depends on an accurate alignment between a protein sequence and its homolog. Alignment errors are the main bottleneck for homology modeling when the query is only distantly related to the template, and alignment methods often misalign secondary structure elements by a few residues. We present a refinement method, SFESA, that improves pairwise sequence alignments by evaluating alignment variants generated by local shifts of template-defined secondary structure elements. The potential value of these methods for structure and function prediction is illustrated by the detection of homology between evolutionarily distant yet structurally similar protein domains.