How many moonlighting proteins are there in a genome?
Moonlighting proteins (MPs) are an important class of proteins that perform more than one independent cellular function. MPs have gained increasing attention in recent years because they have been found to play important roles in various systems, including disease development. However, current knowledge of MPs is still too limited to give a comprehensive picture of the cellular mechanisms underlying their functional diversity.
Currently, MPs are not labeled as such in biological databases, even when multiple distinct functions are known for a protein.
In this work, we used computational text mining to identify potential MPs from PubMed abstracts and functional descriptions in the UniProt protein database. The method we developed, which uses deep learning, detected MPs with high accuracy on a benchmark dataset of known MPs and non-MPs. We then applied it to three genomes (human, yeast, and Xenopus laevis) and found that about 2.5-35% of the proteomes are potential MPs.
This work was presented at the Conference on Intelligent Systems for Molecular Biology (ISMB) in Prague this summer and published in Bioinformatics.
DextMP: deep dive into text for predicting moonlighting proteins. Ishita K. Khan, Mansurul Bhuiyan, Daisuke Kihara, Bioinformatics, 33: i83–i91 (2017)
Contact: Daisuke Kihara http://kiharalab.org