Development of a Multi-Task Learning Framework for Simultaneous Prediction of Protein Secondary Structure, Solvent Accessibility, and Disorder Regions

Yasmine G. Al-Jabouri; Hend Majed Muhsen; Kadhim Naeem Ajel

Authors

Yasmine G. Al-Jabouri, Mustansiriyah University, College of Science, Palestine Street, Baghdad, Iraq
Hend Majed Muhsen, Middle Technical University College of Health & Medical
Kadhim Naeem Ajel, Mustansiriyah University, College of Science, Palestine Street, Baghdad, Iraq

Vol. 3 No. 3 (2026): American Journal of Bioscience and Clinical Integrity

Articles

March 14, 2026

Abstract

Accurate prediction of protein structural properties is important for protein function annotation and the design of protein therapeutics. Available protein databases, however, carry fragmented labels: no existing dataset provides labels for secondary structure, solvent accessibility, and disorder regions simultaneously. This lack of comprehensively labeled data stems from the intrinsic limitations of experimental methods and the purpose-oriented design of individual databases, and it makes it difficult to build models that accurately predict all three properties. In this paper, we present a multi-task learning framework that leverages partially labeled data from three different yet complementary datasets: CB513 (labeled for secondary structure and solvent accessibility), DisProt (labeled for disorder), and PISCES (providing additional sequences). Our model uses a shared bidirectional LSTM encoder followed by task-specific attention modules, with uncertainty-weighted loss balancing, to predict all three properties jointly. We trained the framework on 6,056 proteins with fragmented annotations and obtained a Q3 accuracy of 75.6% on secondary structure, an accuracy of 99.99% on solvent accessibility, and an accuracy of 59.7% (46.9% F1-score) on disorder. The multi-task model significantly outperformed our single-task baselines by 5.6% on disorder F1-score, indicating that shared representations learned from weak, fragmented signals on each task can lead to better accuracy on all tasks.
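The abstract does not give the exact loss formulation, so the following is only a minimal sketch of how uncertainty-weighted loss balancing might be combined with partially labeled data. It assumes a commonly used simplified form, total = sum over tasks of exp(-s_t) * L_t + s_t, where s_t is a learnable log-variance per task. The function names, the task keys, and the per-residue loss values are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math


def masked_mean(losses, mask):
    """Average per-residue losses over labeled positions only.

    Returns None when no position is labeled, so the task can be skipped.
    """
    labeled = [loss for loss, m in zip(losses, mask) if m]
    return sum(labeled) / len(labeled) if labeled else None


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine task losses as sum_t exp(-s_t) * L_t + s_t.

    task_losses: dict mapping task name -> mean loss, or None if unlabeled
    log_vars:    dict mapping task name -> s_t (assumed learnable in a real model)
    Tasks without labels contribute nothing to the total.
    """
    total = 0.0
    for task, loss in task_losses.items():
        if loss is None:
            continue  # no labels for this task in this sample
        s = log_vars[task]
        total += math.exp(-s) * loss + s
    return total


# Hypothetical protein of three residues from a CB513-like source:
# secondary structure and solvent accessibility are labeled, disorder is not.
task_losses = {
    "secondary_structure": masked_mean([0.5, 1.5, 1.0], [1, 1, 0]),
    "solvent_accessibility": masked_mean([0.2, 0.4, 0.6], [1, 1, 1]),
    "disorder": masked_mean([0.9, 0.8, 0.7], [0, 0, 0]),
}
log_vars = {"secondary_structure": 0.0, "solvent_accessibility": 0.0, "disorder": 0.0}
total = uncertainty_weighted_loss(task_losses, log_vars)
```

In this sketch a protein drawn from a disorder-only source would contribute only its disorder term, which is one way a single shared encoder could be trained on all three datasets without imputing missing labels.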

Al-Jabouri, Y. G., Muhsen, H. M., & Ajel, K. N. (2026). Development of a Multi-Task Learning Framework for Simultaneous Prediction of Protein Secondary Structure, Solvent Accessibility, and Disorder Regions. American Journal of Bioscience and Clinical Integrity, 3(3), 52–70. Retrieved from https://biojournals.us/index.php/AJBCI/article/view/2161
