IIIT Hyderabad Publications
Deep-StRIP: Deep Learning Approach for Structural Repeat Identification in Proteins

Author: Kanak Garg
Date: 2022-12-22
Report no: IIIT/TH/2022/152
Advisor: Nita Parekh

Abstract

Internal repeats are observed to be three times more likely in eukaryotic proteins than in prokaryotic ones, suggesting that they confer an advantage on the organism. Proteins with eukaryote-specific functions, such as connective tissue proteins, cytoskeletal proteins, ribonucleoproteins, muscle proteins, brain and synaptic proteins, and cell adhesion proteins, are observed to contain internal repeats. Another major reason to study protein repeats is their ability to confer multiple binding and structural roles on proteins. Tandem repeat proteins occur at a high frequency in humans (30%) and have been linked to various complex diseases, e.g., Leucine-rich repeats in Parkinson's disease and ANK, HEAT, and ARM repeats in cancer. Their large surface-to-volume ratio, good target affinity, small size, and stable shape make repeat proteins important for biomedical applications. For example, designed ankyrin repeat proteins (DARPins) are being developed for targeted therapies. Because repeats are less constrained at the sequence level, detecting them at the structure level is desirable. Here we propose a deep learning-based approach for detecting two major classes of structural repeats, Class III and Class IV in Kajava’s classification, which together cover over 90% of known structural repeats, along with their repeat regions. We approach the problem by exploiting the structural information in the protein distance matrix, which captures the internal distances between residues in a protein chain. Performance is evaluated by comparison with other state-of-the-art methods and with annotations in UniProt.
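To make the notion of a protein distance matrix concrete, the following is a minimal sketch in Python, not the thesis's implementation. It assumes one representative atom per residue (for example the C-alpha atom, which is an assumption here, as the abstract does not say which atom is used) and computes all pairwise Euclidean distances with NumPy. The function name `distance_matrix` and the toy helix coordinates are hypothetical and serve only as illustration.

```python
import numpy as np

def distance_matrix(coords):
    """Return the N x N matrix of Euclidean distances between residues.

    coords: array-like of shape (N, 3), one representative atom per
    residue (assumed here to be C-alpha; the source does not specify).
    """
    coords = np.asarray(coords, dtype=float)
    # Broadcasting yields an (N, N, 3) array of coordinate differences.
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Toy chain: 12 points on an ideal helix (made-up coordinates, not real data).
t = np.arange(12)
toy_coords = np.stack(
    [2.3 * np.cos(1.745 * t), 2.3 * np.sin(1.745 * t), 1.5 * t], axis=1
)
D = distance_matrix(toy_coords)
```

The resulting matrix is symmetric with a zero diagonal. In this perfectly regular toy chain, every off-diagonal band holds a single constant value, which hints at why a regular structural arrangement could leave a recognizable pattern in the matrix that a learned model might pick up.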
Lastly, we discuss the improvements made to the NAPS (Network Analysis of Protein Structures) portal, which researchers use to visualize and analyze protein structures.

Full thesis: pdf
Centre for Computational Natural Sciences and Bioinformatics