Abstract
Background
Intrinsically disordered proteins (IDPs) are critical for numerous biological processes, yet their inherent conformational heterogeneity presents significant challenges for structural and functional characterization. Traditional experimental and computational methods often struggle to fully capture the dynamic ensembles that define IDP behavior.

Methods
This study integrates advanced molecular dynamics (MD) simulations with machine learning (ML) techniques to provide a more comprehensive understanding of IDP conformational dynamics. We performed extensive MD simulations on model IDPs, generating large datasets of conformational trajectories. Key structural features, including radius of gyration, root-mean-square deviation, and secondary structure content, were extracted. These features were then used to train and validate various ML models, including Random Forest classifiers and deep neural networks, to predict and classify distinct conformational states and transitions.

Results
Our integrated approach successfully identified and characterized several metastable conformational states within the IDP ensembles, demonstrating the power of ML to discern subtle patterns in high-dimensional MD data. The ML models achieved high accuracy in classifying these states and revealed key molecular features driving conformational transitions. Feature importance analysis highlighted specific residue interactions and local structural motifs crucial for IDP dynamics.

Conclusion
This research demonstrates a robust framework for combining MD simulations with ML to overcome the limitations of conventional methods in studying IDPs. The developed methodology offers a powerful tool for predicting conformational dynamics, aiding in the design of IDP-targeting therapeutics, and advancing our fundamental understanding of protein function in health and disease.
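
As an illustration of the feature-extraction step named in the Methods, the following is a minimal sketch of how two of the listed features, radius of gyration and root-mean-square deviation, could be computed per trajectory frame. It is not the study's actual pipeline: the function names (`radius_of_gyration`, `rmsd_no_fit`, `featurize`) are hypothetical, coordinates are assumed to be NumPy arrays of shape (n_frames, n_atoms, 3), and the RMSD here assumes frames are already superimposed on the reference, so no alignment is performed.

```python
import numpy as np


def radius_of_gyration(coords, masses=None):
    """Radius of gyration of one frame; coords has shape (n_atoms, 3).

    Uses equal weights when no masses are given (an assumption of this sketch).
    """
    coords = np.asarray(coords, dtype=float)
    if masses is None:
        masses = np.ones(len(coords))
    masses = np.asarray(masses, dtype=float)
    # Weighted center of the frame, then weighted mean squared distance to it.
    center = np.average(coords, axis=0, weights=masses)
    sq_dist = np.sum((coords - center) ** 2, axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))


def rmsd_no_fit(coords, reference):
    """RMSD to a reference frame WITHOUT superposition.

    Assumes both frames are already aligned; a real analysis would
    normally superimpose the frames first.
    """
    diff = np.asarray(coords, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))


def featurize(trajectory, reference):
    """Build an (n_frames, 2) feature matrix with columns [Rg, RMSD]."""
    traj = np.asarray(trajectory, dtype=float)
    return np.array(
        [[radius_of_gyration(frame), rmsd_no_fit(frame, reference)] for frame in traj]
    )
```

A matrix of this form, one row per frame, is the kind of input that a classifier such as a Random Forest would take; secondary structure content would add further columns but requires a dedicated assignment tool and is left out of this sketch.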