Identifying Tandem Structural Repeat Motifs in Protein by Graph Spectral Analysis
Authors: Broto Chakrabarty, Nita Parekh
Conference: International Conference on Biomolecular Forms and Function (ICBFF 2013)
Date: 2013-01-08
Report no: IIIT/TR/2013/19
Abstract
BACKGROUND:
The ankyrin repeat is one of the most frequently observed structural motifs in proteins across all
kingdoms of life. Ankyrin repeat proteins are involved in a diverse set of cellular functions, acting as
transcriptional initiators, cell-cycle regulators, cytoskeletal proteins, ion transporters and signal
transducers; consequently, defects in ankyrin repeat proteins have been found in a
number of human diseases. Identifying these structural repeats at the sequence level is
difficult because of low conservation between the repeat copies. Thus, analysis at the structure
level is desirable.
RESULTS:
In this study, we propose a graph-based approach for the identification and analysis of ankyrin
repeats. The 3-dimensional topology of protein structures has been shown to be well captured
by protein contact graphs. The connectivity information of these networks is represented in
the adjacency matrix, and here we propose analysing the eigenspectra of the adjacency
matrix to identify structural repeats. A clear two-peak pattern corresponding to
the helix-turn-helix region of the ankyrin motif is observed in the principal eigenvector of
the adjacency matrix. The length distribution of this repetitive pattern, along with the
organization of the secondary structure elements, is used to design an algorithm to identify
ankyrin structural motifs. The analysis has been carried out on a non-redundant set of 51
proteins annotated in the UniProt database, and very good agreement with the annotations is
observed. Analysis of all the proteins in the alpha and alpha+beta classes of the SCOP database
has also been performed, and a number of novel repeats not annotated in the database have been
identified. The approach is then applied to other structural repeats such as the
tetratricopeptide repeat (TPR), Annexin, HEAT, ARM, etc.
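The spectral step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the contact graph is built from C-alpha coordinates with a 7 Å distance cutoff (the abstract does not state the cutoff), and the function names `contact_adjacency`, `principal_eigenvector` and `local_peaks` are hypothetical. It builds the adjacency matrix, extracts the eigenvector of the largest eigenvalue, and lists local maxima of that eigenvector, which is where a repeating peak pattern would be looked for.

```python
import numpy as np


def contact_adjacency(coords, cutoff=7.0):
    """Binary adjacency matrix of a protein contact graph.

    Residues i and j are linked when their C-alpha atoms lie within
    `cutoff` angstroms. The 7.0 default is an assumed value.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    adj = (dist <= cutoff).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-contacts
    return adj


def principal_eigenvector(adj):
    """Largest eigenvalue of a symmetric matrix and its eigenvector."""
    eigvals, eigvecs = np.linalg.eigh(adj)  # eigenvalues in ascending order
    vec = eigvecs[:, -1]
    # The sign of an eigenvector is arbitrary; make the components non-negative.
    if vec.sum() < 0:
        vec = -vec
    return eigvals[-1], vec


def local_peaks(profile):
    """Indices whose value is strictly greater than both neighbours."""
    p = np.asarray(profile)
    return [i for i in range(1, len(p) - 1) if p[i] > p[i - 1] and p[i] > p[i + 1]]


if __name__ == "__main__":
    # Toy input: nine points on a line, 3.8 angstroms apart (not a real protein).
    toy_coords = [[3.8 * i, 0.0, 0.0] for i in range(9)]
    adjacency = contact_adjacency(toy_coords)
    eigenvalue, eigenvector = principal_eigenvector(adjacency)
    print("largest eigenvalue:", round(float(eigenvalue), 4))
    print("peak positions:", local_peaks(eigenvector))
```

For a real structure, the coordinates would come from a PDB file, and the spacing between successive peaks would then be compared with the expected repeat length and the secondary structure assignment, as the abstract outlines.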
CONCLUSIONS:
The graph-based analysis of protein structures, along with domain information such as the
organization of the secondary structure architecture, provides a computationally efficient
approach for the identification of structural repeats.
Centre for Computational Natural Sciences and Bioinformatics