Kinetic Network Study of the Diversity and Temperature Dependence of Trp-Cage Folding Pathways: Combining Transition Path Theory with Stochastic Simulations

Weihua Zheng, Emilio Gallicchio, Nanjie Deng, Michael Andrec, Ronald M. Levy
2011 Journal of Physical Chemistry B  
We present a new approach to study a multitude of folding pathways and different folding mechanisms for the 20-residue mini-protein Trp-Cage using the combined power of replica exchange molecular dynamics (REMD) simulations for conformational sampling, transition path theory (TPT) for constructing folding pathways, and stochastic simulations for sampling the pathways in a high-dimensional structure space. REMD simulations of Trp-Cage with 16 replicas at temperatures between 270 and 566 K are carried out with an all-atom force field (OPLSAA) and an implicit solvent model (AGBNP). The conformations sampled from all temperatures are collected. They form a discretized state space that can be used to model the folding process. The equilibrium population for each state at a target temperature can be calculated using the weighted-histogram-analysis method (WHAM). By connecting states with similar structures and creating edges satisfying detailed balance conditions, we construct a kinetic network that preserves the equilibrium population distribution of the state space. After defining the folded and unfolded macrostates, committor probabilities (P_fold) are calculated by solving a set of linear equations for each node in the network, and pathways are extracted together with their fluxes using the TPT algorithm. By clustering the pathways into folding "tubes", a more physically meaningful picture of the diversity of folding routes emerges. Stochastic simulations are carried out on the network, and a procedure is developed to project sampled trajectories onto the folding tubes. The fluxes through the folding tubes calculated from the stochastic trajectories are in good agreement with the corresponding values obtained from the TPT analysis. The temperature dependence of the ensemble of Trp-Cage folding pathways is investigated. Above the folding temperature, a large number of diverse folding pathways with comparable fluxes flood the energy landscape. At low temperature, however, the folding transition is dominated by only a few localized pathways.

[...] complex systems with high barriers separating different folding routes. It is also challenging to apply transition path sampling (TPS) to large molecular systems with intervening metastable free-energy basins. 31 Another strategy for extracting kinetics consists of discretizing the state space and constructing rules for moving among those states.
The resulting scheme can be represented as a graph, a roadmap, or a network; 32-34 the kinetics on the graph are often assumed to be Markovian. 35-39 Discretization of the state space can be done by clustering structures according to their conformational differences; 37,40,41 the clusters should be chosen so as to satisfy the Markovian condition. 38,39,42 Recently, transition path theory (TPT) 43,44 was developed and applied to model the folding of the PinWW domain. 45 A Markovian network was constructed from many relatively short MD simulations at 360 K. Once the reactant and product states are defined and the committor probability (P_fold) 46 of each state is calculated, folding pathways and their fluxes can be extracted from the network using the TPT algorithm. 43,44 The ensemble of folding pathways at 360 K shows various disjoint paths leading to the native state. Stochastic simulations on a Markovian network using the Gillespie algorithm 47 are another powerful tool for exploring the thermodynamic and kinetic properties of the network. 48,49 Numerous reactive trajectories on the network can be collected from the simulations to derive quantities such as the equilibrium population and the P_fold value of each node, the flux, and the first passage time statistics for the reaction. While a single stochastic trajectory on the network is an approximation and abstraction of many all-atom trajectories in the continuous conformational space, a single pathway on the network defined in TPT is an abstract representation of a group of stochastic trajectories on the network. Results from stochastic simulations on the network not only serve as a benchmark that tests the validity of the TPT calculation but also provide additional conformational and kinetic information.
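To make the relation between the two approaches concrete, the following is a minimal Python sketch, not the procedure used in this work. It builds a toy five-node network whose rates satisfy detailed balance, obtains P_fold by solving the linear committor equations, and compares the result with a Gillespie-style estimate from stochastic trajectories. The node populations, the edge list, the rate form k_ij = k0 * sqrt(p_j / p_i), and the function names `build_rates`, `committor`, and `simulate_committor` are all assumptions introduced for illustration.

```python
import numpy as np

def build_rates(pop, edges, k0=1.0):
    """Rate matrix K[i, j] (i -> j) obeying pop[i] * K[i, j] == pop[j] * K[j, i].

    The square-root form is an assumed, illustrative choice; any symmetric
    prefactor would also satisfy detailed balance.
    """
    n = len(pop)
    K = np.zeros((n, n))
    for i, j in edges:
        K[i, j] = k0 * np.sqrt(pop[j] / pop[i])
        K[j, i] = k0 * np.sqrt(pop[i] / pop[j])
    return K

def committor(K, unfolded, folded):
    """Solve sum_j L[i, j] * q[j] = 0 on intermediate nodes,
    with q = 0 on unfolded nodes and q = 1 on folded nodes."""
    n = K.shape[0]
    L = K - np.diag(K.sum(axis=1))          # generator of the Markov process
    inter = [i for i in range(n) if i not in unfolded and i not in folded]
    fold = sorted(folded)
    A = L[np.ix_(inter, inter)]
    b = -L[np.ix_(inter, fold)].sum(axis=1)  # boundary term from q = 1 nodes
    q = np.zeros(n)
    q[fold] = 1.0
    q[inter] = np.linalg.solve(A, b)
    return q

def simulate_committor(K, start, unfolded, folded, n_traj, rng):
    """Gillespie-style trajectories from `start`; returns the fraction that
    reach the folded set first and the mean time to absorption."""
    hits, total_time = 0, 0.0
    for _ in range(n_traj):
        s, t = start, 0.0
        while s not in unfolded and s not in folded:
            rates = K[s]
            k_tot = rates.sum()
            t += rng.exponential(1.0 / k_tot)               # waiting time
            s = int(rng.choice(len(rates), p=rates / k_tot))  # next node
        hits += s in folded
        total_time += t
    return hits / n_traj, total_time / n_traj

# Hypothetical toy network: node 0 is unfolded, node 4 is folded.
pop = np.array([0.10, 0.05, 0.05, 0.10, 0.70])
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4)]
K = build_rates(pop, edges)
q = committor(K, unfolded={0}, folded={4})
q_sim, mean_time = simulate_committor(
    K, start=1, unfolded={0}, folded={4},
    n_traj=2000, rng=np.random.default_rng(0),
)
print("P_fold from linear equations:", q)
print("P_fold of node 1 from trajectories:", q_sim)
```

In this sketch the trajectory estimate for node 1 should approach the linear-algebra value as the number of trajectories grows, which is the sense in which stochastic simulation can serve as a check on a committor-based calculation. The waiting times are not needed for P_fold itself, but they are what a first passage time analysis would use.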
Replica exchange molecular dynamics (REMD) 50 was developed to enhance the ability to obtain canonical populations at a given temperature in complex systems by running many communicating simulations in parallel. The large range of temperatures used in REMD enables much better sampling at low temperatures by "borrowing" the fast kinetics at high temperatures. 51 However, since REMD involves temperature swaps between MD trajectories, it is not straightforward to obtain kinetic information from such simulations. 29,39,42,52 We have made use of a kinetic network model 53 in which we take advantage of the REMD sampling, build the nodes of the network from molecular conformations collected from REMD trajectories, and then construct edges using an ansatz based on structural similarity. By allowing local transitions between two nodes that are structurally similar, we can generate trajectories or pathways that are not realized in the original REMD simulation. While this model was shown to yield physically plausible kinetics, 53 the scheme we used to weight nodes arising from different simulation temperatures was such that thermodynamic parameters of the system were not exactly preserved. Recently, we presented an improved version of the kinetic network model 49 that is guaranteed to reproduce the potential of mean force (PMF) with respect to any reduced coordinates; the model was tested on a folding-like two-dimensional potential. Compared with previous work 54 that builds the Markov state model from low-temperature simulations, REMD provides a more thorough search of the conformational space of the system. In this paper, we apply our network model together with both
doi:10.1021/jp1089596 pmid:21254767 pmcid:PMC3059588 fatcat:f7ycca3cnfh3jhhoxyvhq6ggsi