METHODS AND APPARATUS FOR DOUBLE-INTEGRATION ORTHOGONAL SPACE TEMPERING

Abstract

The orthogonal space random walk (OSRW) algorithm is generalized to be the orthogonal space tempering (OST) method via the introduction of the orthogonal space sampling temperature. Moreover, a double-integration recursion method is developed to enable practically efficient and robust OST free energy calculations, and the algorithm is augmented by a novel -dynamics approach to realize both the uniform sampling of order parameter spaces and rigorous end point constraints. In the present work, the double-integration OST method is employed to perform alchemical free energy simulations, specifically to calculate the free energy difference between benzyl phosphonate and difluorobenzyl phosphonate in aqueous solution, to estimate the solvation free energy of the octanol molecule, and to predict the nontrivial Barnase-Barstar binding affinity change induced by the Barnase N58A mutation. As demonstrated in these model studies, the DI-OST method can robustly enable practically efficient free energy predictions, particularly when strongly coupled slow environmental transitions are involved.

Claims

1. A computer-implemented method for identifying one or more potentially therapeutic drug candidates by simulating systems of molecules through molecular dynamics, the method comprising: initializing a molecular dynamics engine in a first CPU or GPU; initializing an orthogonal space tempering (OST) engine in a second CPU or GPU, wherein the first CPU or GPU is independent of the second CPU or GPU; at each of a plurality of propagation steps of the molecular dynamics engine; determining, by the orthogonal space tempering (OST) engine, an alchemical free energy of a system comprising one or more ligands and a specific receptor in an orthogonal space by performing orthogonal space tempering utilizing a second-order generalized ensemble, wherein the second-order generalized ensemble is a generalized orthogonal space random walk (OSRW) method having an effective sampling boundary imposed by a pre-selected orthogonal space sampling temperature (T.sub.ES) and described by a modified energy function as $U_{m} = U_{o} () + f_{m} () + \frac{T_{ES} - T_{0}}{T_{ES}} .Math. g_{m} (, F_{})$ where is an alchemical order parameter, U.sub.O() stands for a hybrid energy function that is constructed on the basis of the constraints of U.sub.0(0)=U.sup.A and U.sub.1(0)=U.sup.B, wherein two ends states A and B are respectively represented by =1 and =0,f.sub.m() is adaptively updated to approach G.sub.0(), g.sub.m(,F.sub.) is adaptively updated to approach G.sub.O(,F.sub.) and the contribution of g.sub.m(,F.sub.) is scaled by a parameter $(T_{ES} - T_{0}) / T_{ES},$ wherein T.sub.0 is a system reservoir temperature and T.sub.ES is a preset parameter referred to as the orthogonal space sampling temperature; determining, by the orthogonal space tempering (OST) engine one or more forces between the one or more ligands and the specific receptor based upon the alchemical free energy of the system; providing the one or more forces to the molecular dynamics engine to accelerate the speed of the structural change of the one or more ligands to overcome one or more free energy barriers in the orthogonal space; predicting one or more chemical state related thermodynamic properties of the system based upon the determined alchemical free energy of the system; and identifying one or more potentially therapeutic drug candidates by identifying one or more ligands that are most likely to bind strongly to the specific receptor based upon the predicted one or more chemical state related thermodynamic properties of the system.

2. The method according to claim 1, wherein performing orthogonal space tempering further comprises, performing double-integration recursion.

3. The method according to claim 2, wherein: the double-integration recursion is based on dynamic reference restraining.

4. The method according to claim 1, wherein: the method further provides an output selected from the group consisting of, a molecular trajectory and the alchemical free energy of the system.

5. An apparatus for one or more potentially therapeutic drug candidates, the apparatus comprising: a molecular dynamics engine in a first CPU or GPU; an orthogonal space tempering (OST) engine in a second CPU or GPU, wherein the first CPU or GPU is independent of the second CPU or GPU; at each of a plurality of propagation steps, the orthogonal space tempering (OST) engine configured to: determine an alchemical free energy of a system comprising one or more ligands and a specific receptor in an orthogonal space by performing orthogonal space tempering utilizing a second-order generalized ensemble, wherein the second-order generalized ensemble is a generalized orthogonal space random walk (OSRW) method having an effective sampling boundary imposed by a pre-selected orthogonal space sampling temperature (T.sub.ES) and described by a modified energy function as $U_{w} = U_{o} () + f_{m} () + \frac{T_{ES} - T_{0}}{T_{ES}} .Math. g_{m} (, F_{})$ where is an alchemical order parameter, U.sub.O() stands for a hybrid energy function that is constructed on the basis of the constraints of U.sub.0(0)=U.sup.A and U.sub.1(0)=U.sup.B, wherein two ends states A and B are respectively represented by =1 and =0,f.sub.m() is adaptively updated to approach G.sub.0(), g.sub.m(,F.sub.) is adaptively updated to approach G.sub.0(,F.sub.) and the contribution of g.sub.m(,F.sub.) is scaled by a parameter $(T_{ES} - T_{0}) / T_{ES},$ wherein T.sub.0 is a system reservoir temperature and T.sub.ES is a preset parameter referred to as the orthogonal space sampling temperature; determine one or more forces between the one or more ligands and the specific receptor based upon the alchemical free energy of the system; provide the one or more forces to the molecular dynamics engine to accelerate the speed of the structural change of the one or more ligands to overcome one or more free energy barriers in the orthogonal space; the molecular dynamics engine configured to; predict one or more chemical state related thermodynamic properties of the system based upon the determined alchemical free energy of the system; identify one or more potentially therapeutic drug candidates by identifying one or more ligands that are most likely to bind strongly to the specific receptor based upon the predicted one or more chemical state related thermodynamic properties of the system.

6. The apparatus according to claim 5, further comprising an input generator configured to provide one or more inputs, including a molecular structure and the modified energy function to the OST engine and the molecular dynamics engine.

7. The apparatus according to claim 5, wherein the OST engine is further configured to provide one or more outputs including a molecular trajectory and the alchemical free energy of the system.

8. The apparatus according to claim 5, wherein the OST engine is further configured to perform double-integration recursion.

9. The apparatus according to claim 8, wherein the double integration recursion is based on dynamic reference restraining.

10. A non-transitory computer readable medium containing program instructions for identifying method one or more potentially therapeutic drug candidates by simulating systems of molecules through molecular dynamics, the method comprising: initializing a molecular dynamics engine in a first CPU or GPU; initializing an orthogonal space tempering (OST) engine in a second CPU or GPU, wherein the first CPU or GPU is independent of the second CPU or GPU; at each of a plurality of propagation steps of the molecular dynamics engine; determining, by the orthogonal space tempering (OST) engine, an alchemical free energy of a system comprising one or more ligands and a specific receptor in an orthogonal space by performing orthogonal space tempering utilizing a second-order generalized ensemble, wherein the second-order generalized ensemble is a generalized orthogonal space random walk (OSRW) method having an effective sampling boundary imposed by a pre-selected orthogonal space sampling temperature (T.sub.ES) and described by a modified energy function as $U_{m} = U_{o} () + f_{m} () + \frac{T_{ES} - T_{0}}{T_{ES}} .Math. g_{m} (, F_{})$ where is an alchemical order parameter, U.sub.O() stands for a hybrid energy function that is constructed on the basis of the constraints of U.sub.0(0)=U.sup.A and U.sub.1(0)=U.sup.B, wherein two ends states A and B are respectively represented by =1 and =0,f.sub.m() is adaptively updated to approach G.sub.0(), g.sub.m(,F.sub.) is adaptively updated to approach G.sub.0(,F.sub.) and the contribution of g.sub.m(,F.sub.) is scaled by a parameter $(T_{ES} - T_{0}) / T_{ES},$ wherein T.sub.0 is a system reservoir temperature and T.sub.ES is a preset parameter referred to as the orthogonal space sampling temperature; determining, by the orthogonal space tempering (OST) engine one or more forces between the one or more ligands and the specific receptor based upon the alchemical free energy of the system; providing the one or more forces to the molecular dynamics engine to accelerate the speed of the structural change of the one or more ligands to overcome one or more free energy barriers in the orthogonal space; predicting one or more chemical state related thermodynamic properties of the system based upon the determined alchemical free energy of the system; and identifying one or more potentially therapeutic drug candidates by identifying one or more ligands that are most likely to bind strongly to the specific receptor based upon the predicted one or more chemical state related thermodynamic properties of the system.

11. The non-transitory computer readable medium according to claim 10, further comprising instructions for initializing the OST engine and the molecular dynamic engine using one or more inputs, including a molecular structure and an energy function.

12. The non-transitory computer readable medium according to claim 10, further comprising instructions for providing one or more outputs, including a molecular trajectory and the alchemical free energy of the system.

13. The non-transitory computer readable medium according to claim 10, wherein performing orthogonal space tempering further comprises, performing double integration recursion.

14. The non-transitory computer readable medium according to claim 13, wherein the double integration recursion is based on dynamic reference restraining.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0038] FIG. 1 is a high level block diagram illustrating an apparatus and data flow in accordance with an embodiment of the present invention.

[0039] FIG. 2 is a high level block diagram illustrating an apparatus for carrying out the invention.

DETAILED DESCRIPTION

[0040] The present invention is focused on alchemical free energy simulations, by which protein-ligand binding affinity changes, protein-protein binding affinity changes, solvation energies, pK.sub.a values, and other chemical state related thermodynamic properties can be predicted. The disclosed DI-OST algorithm is also applicable to the geometry-based potential of mean force calculations.

[0041] To carry out alchemical free energy calculations, as described in Equation (1), a scaling parameter needs to be introduced to connect two target chemical states. A simplest hybrid energy function is the linear form shown in Equation (4).

U.sub.O()=(1)U.sub.S.sup.A+U.sub.S.sup.B+U.sub.e (4)

[0042] where U.sub.S.sup.A and U.sub.S.sup.B are the energy terms unique in the two end chemical states; U.sub.e represents the common environment energy terms shared by the two end states. When dummy atoms are employed in one of the end states, soft-cores potentials are commonly applied to treat the van der Walls terms or/and the electrostatic terms in U.sub.S.sup.A and U.sub.S.sup.B to avoid the end point singularity issue.

[0043] In GE alchemical free energy simulations, needs to be dynamically coupled with the motion of the rest of the system. Such extended dynamics can be realized either via the hybrid Monte Carlo method, where the scaling parameter jumps along a prearranged discrete ladder are enabled through the metropolis acceptance/rejection procedure, or via the -dynamics method, where moves in the continuous region between 0 and 1 are enabled through an extended Hamiltonian approach. The extended dynamics of the scaling parameter in OSRW are implemented on the basis of the -dynamics method. In the original -dynamics free energy calculation method, the scaling parameter is treated as a one-dimensional fictitious particle. In the present invention, especially to rigorously constrain between 0 and 1, a novel -dynamics approach is proposed. In this -dynamics, is set as the function (), the variable is treated as a one-dimensional fictitious particle, which travels periodically between and . In OSRW simulations, uniform distributions are targeted. Here, the usage of the -dynamics approach is mainly for the purpose of constraining the range. Actually, it is preferable to have uniform sampling in the space. For the above purpose, the functional form of () according to the design is as shown in Equation (5).

[00002] $\begin{matrix} () = {\begin{matrix} r .Math. .Math. \sin^{2} .Math. \frac{}{2}, .Math. .Math._{o} \\ a .Math. .Math. + b,_{o} < < -_{o} \\ - a .Math. .Math. + b,_{o} - < < -_{o} \\ r .Math. .Math. \sin^{2} .Math. \frac{}{2} + c, -_{o} .Math. .Math. \end{matrix} . & (5) \end{matrix}$

[0044] in which r=1/(1cos .sub.O+(2.sub.O)sin .sub.O), a=r/2 sin .sub.O, b=r/2(1cos .sub.O.sub.O sin .sub.O), and c=r/2(.sub.O)sin .sub.O+r/2(1cos .sub.O.sub.O sin .sub.O)r sin.sup.2((.sub.O)/2). In Equation (5), .sub.O is the parameter utilized to separate the linear region and the end-state (=0, 1) transition region. In OSRW and OST simulations, .sub.O should be set as a tiny value so that A is almost 1 and B is almost zero, thus the Jacobian contribution from the () function can be negligible. The propagation and the thermolyzation of the particle are based on the Langevin equation, the same as how the particle is treated in the original -dynamics method.

[0045] The OSRW method is based on the modified potential energy function as described in Equation (2). The OSRW algorithm has two recursion components: the recursion kernel to adaptively update g.sub.m(,F.sub.) toward its target function G.sub.O(,F.sub.) and the recursion slave to adaptively update f.sub.m() toward its target function G.sub.O() based on the concurrent g.sub.m(,F.sub.) function. In the original implementation, the metadynamics strategy is employed as the recursion kernel. Specifically, the free energy biased potential g.sub.m(,F.sub.) can be obtained by repetitively adding a relatively small Gaussian-shaped repulsive potential as explained in Equation (6).

[00003] $\begin{matrix} h_{o} .Math. \exp (- \frac{{.Math. - (t_{i}) .Math.}^{2}}{2 .Math. w_{1}^{2}}) .Math. \exp (- \frac{{.Math. F_{} - F_{} (t_{i}) .Math.}^{2}}{2 .Math. w_{2}^{2}}) & (6) \end{matrix}$

[0046] which is centered around [(t.sub.i),F.sub.(t.sub.i)] at the scheduled update time t.sub.i, and thereby discourages the system from often visited configurations. With this procedure repeated, the overall biasing potential shown in Equation (7)

[00004] $\begin{matrix} g_{m} (, F_{}) = \underset{t_{i}}{.Math.} .Math. h_{o} .Math. \exp (- \frac{{.Math. - (t_{i}) .Math.}^{2}}{2 .Math. w_{1}^{2}}) \exp (- \frac{{.Math. F_{} - F_{} (t_{i}) .Math.}^{2}}{2 .Math. w_{2}^{2}}) & (7) \end{matrix}$

[0047] will build up and eventually flatten the underlying curvature of the free energy surface in the (,F.sub.) space. Then, the free energy profile along the reaction coordinate (,F.sub.), which should eventually converge to G.sub.O(,F.sub.), can be estimated as g.sub.m(,F.sub.).

[0048] Since for a state , the free energy profile along its generalized force direction can be estimated as g.sub.m[,F.sub.()], the generalized force distribution should be proportional to exp{.sub.Og.sub.m[,F.sub.()]}, in which .sub.O represents 1/(kT.sub.O). On the basis of the above discussion, free energy derivatives at each state can be obtained as shown in Equation (8).

[00005] $\begin{matrix} \frac{{dG}_{o}}{d .Math. .Math.} .Math. {.Math.}_{^{.Math.}} .Math. = {F_{}}_{^{.Math.}} = \frac{_{F_{}} .Math. F_{} .Math. \exp .Math. {_{o} [g_{m} (, F_{})]} .Math. (-^{})}{_{F_{}} .Math. \exp .Math. {_{o} [g_{m} (, F_{})]} .Math. (-^{})} & (8) \end{matrix}$

[0049] Following the TI formula, the free energy change between the initial state with .sub.i, which is the lower bound of the collective variable range, and any target state with the order parameter can unfold as a function of shown in Equation (9).

[00006] $\begin{matrix} G_{o} () =_{_{i}}^{} .Math. \frac{{dG}_{o}}{d .Math. .Math.} .Math. {.Math.}_{^{}} .Math. d .Math. .Math.^{} & (9) \end{matrix}$

[0050] In the original OSRW implementation, the metadynamics strategy, as described in Equation (7), serves as the recursion kernel, the TI based formula (Equations (8) and (9)) serves as the recursion slave with f.sub.m() recursively set as instantaneously estimated G.sub.O().

[0051] On the basis of the above OSRW procedure, we carried out a free energy simulation study on the model system. The model simulation was performed on the basis of two-dimensional Langevin dynamics, where the temperature was set as 50 K and the particle mass was set as 100 g/mol. The OSRW simulation led to a converged free energy profile G.sub.O(x) [targeted as f.sub.m(x)], and a converged g.sub.m(x,U.sub.O/x) (in the model case, U.sub.O/x is the generalized force), where two energy minima are smoothly connected along U.sub.O/x at the transition state region. When converged, this represents the residual free energy surface after the free energy surface flattening treatment g.sub.m(x,U.sub.O/x) along the order parameter. [g.sub.m(x,U.sub.O/x)] reveals the fact that the residual free energy barrier exists around the transition state region. It can be traced along U.sub.O/x near the transition state, and more importantly, the residual barrier height (about 2.2 kcal/mol) is similar to that of the hidden energy barrier. In this model system, the generalized force can reveal the direction of the order-parameter-coupled hidden process, this is a prerequisite for efficient and accurate calculations of the target free energy profile G.sub.O(x).

[0052] To further understand the role of U.sub.O/x and the difference between the OSRW sampling [e.g., based on U.sub.O+f.sub.m(x)+g.sub.m(x,U.sub.O/x) as in Equation (2)] and the classical generalized ensemble sampling [e.g., based on U.sub.O+f.sub.m(x) as in Equation (1)], we respectively employed the biasing energy functions f.sub.m(x) and f.sub.m(x)+g.sub.m(x,U.sub.O/x), which were obtained in the recursion step, to perform two corresponding equilibrium generalized ensemble simulations. The OSRW sampling allows the system to travel repetitively between two energy minima, as in comparison to the classical generalized ensemble simulation, the system is trapped in the original energy minimum state due to the lack of sampling acceleration along the hidden dimension. Furthermore, according to the umbrella sampling reweighting relationship, the samples collected from the OSRW simulation can be employed to recover the free energy surface along x and y, the well-sampled region of which is the same as the target energy surface. As shown from this recovered free energy surface, the samples are more concentrated along the minimum energy path that connects two energy wells.

[0053] In an OSRW simulation, the sampling volume in the orthogonal space increases with the elongation of the simulation length. Additionally, the diffusion sampling overhead around the states, where no hidden barrier exists, continuously increases. As mentioned above, the OSRW method can be generalized to the orthogonal space tempering (OST) algorithm. The target energy function of the OST scheme is described in Equation (3). In the OST scheme, free energy surfaces along the generalized force direction are not completely flattened. Then, the orthogonal space effective sampling temperature T.sub.ES can impose an effective sampling boundary to ensure the long-time scale convergence. A larger T.sub.ES allows more efficient crossing of hidden free energy barriers but introduces more diffusion sampling overhead.

[0054] Interestingly, when T.sub.ES approaches the infinity limit, the OST method becomes the original OSRW algorithm; when T.sub.ES approaches the system reservoir temperature T.sub.O, the second-order GE sampling turns to the first-order GE sampling as described in Equation (1).

[0055] The metadynamics method according to the invention achieves adaptive recursions based on a dynamic force-balancing relationship. Its performance strongly depends on energy surface ruggedness and preset parameters. To improve the convergence behavior of OST, in the present work, we designed an alternative method to gain robust recursions.

[0056] Among various recursion methods, the adaptive biasing force (ABF) algorithm has a similar efficiency to that of the metadynamics algorithm. In contrast to the metadynamics technique, the ABF method has been mathematically proven; thus the usage of the ABF method as the recursion kernel, specifically via the calculation of the F.sub.-dependent free energy profile G.sub.O(,F.sub.) at each state, can ensure free energy convergence robustness. A challenging issue remains: how to numerically calculate the generalized force of F.sub. to estimate target F.sub.-dependent free energy profiles. As a matter of fact, calculating generalized forces of complex order parameters has been known to be a difficult issue in the ABE algorithm implementation. To circumvent this issue, in our OST implementation, we propose a dynamic reference restraining (DRR) recursion strategy. Specifically, the target OST potential described above with reference to Equation (3) is rewritten as Equation (10).

[00007] $\begin{matrix} U_{m} = U_{o} () + \frac{1}{2} .Math. {k_{} (F_{} -)}^{2} + f_{m} () + \frac{T_{ES} - T_{o}}{T_{ES}} .Math. g_{m} (,) & (10) \end{matrix}$

[0057] in which the generalized force fluctuation is restrained to the move of another dynamic particle . In Equation (10), f.sub.m() is still targeted toward G.sub.O(), and g.sub.m(,) is targeted toward G.sub.O(,), the negative of the free energy surface along (,) in the canonical ensemble with the energy function U.sub.O()+k (F.sub.)2G(), where G() is the -dependent free energy surface in the canonical ensemble with U.sub.O()+k (F.sub.).sup.2 as the energy function. On the basis of Equation (10), motions along F.sub. are indirectly activated via the restraining treatment to the dynamic reference: . Here, the dynamics of the particle are also realized through the same extended Hamiltonian method as in -dynamics or -dynamics, which was discussed above.

[0058] According to the OST target function in Equation (10), we need to design a recursion kernel to estimate G.sub.O(,) in order to adaptively update g.sub.m(,). To obtain the two-dimensional function G.sub.O(,), first, the ABF method is directly employed to calculate the dependent free energy profile at each state, specifically on the basis of the following TI relationship shown in Equation 11.

[00008] $\begin{matrix} G_{o^{}} (^{},) =_{}^{} .Math. {\frac{U_{o^{}} (,)}{} .Math. (-^{})}_{^{}} .Math. d .Math. .Math.^{} & (11) \end{matrix}$

[0059] Here, U.sub.O(,) represents U.sub.O()+k (F.sub.).sup.2; then U.sub.O(,)/ can be simply evaluated as k (F.sub.). It is noted that the numerical boundary of G.sub.O(,), i.e., the integration boundary in Equation (11), changes as the recursion proceeds. Following the general ABF strategy, <U.sub.O(,)/()> can be adaptively estimated as shown in Equation (12).

[00009] $\begin{matrix} \frac{\underset{i}{.Math.} .Math. - k_{} [F_{} (t_{i}) - (t_{i})] .Math. [(t_{i}) -^{}] .Math. [(t_{i}) -^{}]}{{.Math.}_{i}^{} .Math. [(t_{i}) -^{}] .Math. [(t_{i}) -^{}]} & (12) \end{matrix}$

[0060] where t.sub.i is the ith scheduled sample-collecting time. Equations (11) and (12) only allow the obtaining of the one-dimension function G.sub.O(,) at each state. The height of the G.sub.O(,) function can be recalibrated as shown in Equation (13).

[00010] $\begin{matrix} G_{o^{}} (^{},) = G_{o^{}} (^{},) - G_{o^{}, \min} (^{},) - RT .Math. .Math. \ln .Math._{}^{} .Math. \exp (- \frac{G_{o^{}} .Math. (^{},) - G_{o^{}, \min} (^{},)}{{kT}_{o}}) & (13) \end{matrix}$

[0061] where G.sub.O, min(,) is the lowest value in the free energy curve G.sub.O(,); G.sub.O(,) represents the post calibration function of G.sub.O(,). All of the calibrated one-dimension G.sub.O(,) functions can be assembled to be the target two-dimension G.sub.O(,) function. Then, g.sub.m(,) can be adaptively updated as instantaneously estimated G.sub.O(,). This calibration procedure is based on the g.sub.m(,) function definition in Equation (10), specifically to fulfill the condition that the target energy function for g.sub.m(,) free energy flattening treatment has already been flattened along the direction. In the DI-OST method according to the invention, Equations (11)-(13) constitute the recursion kernel.

[0062] Regarding the recursion slave, the TI formula in Equation (9) is still used to estimate G.sub.O(); then, (dG.sub.O/d)| at each state needs to be evaluated. Different from the recursion in the original OSRW algorithm, where the target function of the recursion kernel is G.sub.O(,F.sub.), here, the target function of the recursion kernel G.sub.O(,) does not provide direct information on generalized force F.sub. distributions. For the fact that F.sub. is restrained to , a simple but an approximate way of estimating (dG.sub.O/d)| can be made on the basis of the assumption of <>.sub.=<F>. Thus, (dG.sub.O/d)| can be approximated via Equation (14).

[00011] $\begin{matrix} {\frac{{dG}_{o}}{d .Math. .Math.} .Math.}_{^{}} = {F_{}}_{^{}}_{^{}} = \frac{_{}^{} .Math. .Math. .Math. \exp .Math. {[g_{m} (,)]} .Math. (-^{})}{_{}^{} .Math. \exp .Math. {[g_{m} (,)]} .Math. (-^{})} & (14) \end{matrix}$

[0063] To more rigorously estimate (dG.sub.O/d)|, G.sub.O(,F.sub.) needs to be calculated for each state as described above. Notably, the samples collected at the state with F.sub.=F.sub. can be considered as being obtained from multiple independent ensembles, each of which corresponds to a unique restraining reference value . According to the umbrella integration relationship, based on the samples from each (,) restraining ensemble, (dG.sub.O(,F.sub.)/dF.sub.)|F, can be estimated as

1/(.sub.O)(F.sub.F.sub..sup.,)/(.sub..sup.,).sup.2k.sub.(F.sub.), where F.sub..sup.,

stands for the average of the F.sub. values of all of the samples in the (,) restraining ensemble and .sub..sup., represents the variance of samples. Using the multihistogram approach to combine the estimations from all of the restraining ensembles that are visited at the state, (dG.sub.O(,F.sub.)/dF.sub.)|F, can be calculated as shown in Equation (15).

[00012] $\begin{matrix} {\frac{{dG}_{o} (^{}, F_{})}{{dF}_{}} .Math.}_{F_{}^{},^{}} = \frac{_{^{}}^{} .Math. (_{^{}, F_{}^{}}^{}) [\frac{1}{_{o}} .Math. \frac{F_{}^{} - \overline{F_{}^{^{},^{}}}}{{(_{}^{^{},^{}})}^{2}} - k_{} (F_{}^{} -^{})]}{_{^{}}^{} .Math. (_{^{}, F_{}^{}}^{})} & (15) \end{matrix}$

[0064] where (where (,F) denotes the total number of the (,F.sub.) samples that are collected from the restraining ensemble.

[0065] Then, based on the TI relationship, G.sub.O(,F.sub.) can be calculated according to Equation (16).

[00013] $\begin{matrix} {.Math. G_{o^{}} (^{}, F_{}) =_{F_{}^{}}^{} .Math. \frac{{dG}_{o} (^{}, F_{})}{{dF}_{}} .Math.}_{F_{}^{},^{}} .Math. {dF}_{}^{} & (16) \end{matrix}$

[0066] Again, like in Equation (11), the numerical boundary of G.sub.O(,F.sub.), i.e., the integration boundary in Equation (16), changes as the recursion proceeds. Following the corresponding derivation in the original OSRW method, we can obtain (dG.sub.O/d)| at the state using Equation 17.

[00014] $\begin{matrix} {\frac{{dG}_{o}}{d .Math. .Math.} .Math.}_{^{}} = {F_{}}_{^{}} = \frac{_{F_{}}^{} .Math. F_{} .Math. .Math. \exp .Math. {-_{o} [G_{o^{}} (, F_{})]} .Math. (-^{})}{_{F_{}} .Math. \exp .Math. {-_{o} [G_{o^{}} (, F_{})]} .Math. (-^{})} & (17) \end{matrix}$

[0067] On the basis of the corresponding TI formula in Equation (9), f.sub.m(), which is targeted as G.sub.O(), can then be adaptively updated. In the DI-OST method according to the invention, Equations (15)-(17) and (9) constitute the recursion slave. Notably, f.sub.m() does not have to be equal to G.sub.O() in a strict manner. Here, it is highly recommended to employ the approximate approach based on Equations (11)-(14) and (9) to update f.sub.m(), and the more rigorous approach based on Equations (15)-(17) and (9) to estimate G.sub.O(), because of the fact that <> in Equation (14), is directly estimated from -space ABF calculations (Equations (11) and (12)) and should converge faster. In the DI-OST method, both the recursion kernel and the recursion slave are based on the integration schemes. Therefore, it is named the double-integration recursion method.

[0068] The double-integration recursion based OST method is implemented in the orthogonal space sampling module, which is currently coupled with our customized CHARMM program. See, Brooks, B. R.; Bruccolleri, R. E.; Olafson, B. D.; States, D. J.; Swaminathan, S.; Karplus, M. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 1983, 4, 187-217 and Brooks, B. R.; Brooks, C. L.; Mackerell, A. D.; Nilsson, L.; Petrella, R. J.; Roux, B.; Won, Y.; Archontis, G.; Bartels, C.; Boresch, S.; Calfischk.; Caves, L.; Cui, Q.; Dinner, A. R.; Feig, M. Feig; Fischer, S.; Gao, J.; Hodoscek, M.; Im, W.; Kuczera, K.; Lazaridis, T.; Ma, J.; Ovchinnikov, V.; Paci, E.; Pastor, R. W.; Post, C. B.; Pu, J. Z.; Schaefer, M.; Tidor, B.; Venable, R. M.; Woodcock, H. L.; Wu, X.; Yang, W.; York, D. M.; Karplus, M. CHARMM: The biomolecular simulation program. J. Comput. Chem. 2009, 30, 1545-1614. CHARMM is available from Harvard University.

[0069] In the present invention, the following van der Waals soft-core potential form is employed to treat the atoms which are annihilated as illustrated in Equation (18).

[00015] $\begin{matrix} U_{softcore .Math. vdW} = (1 -) [\frac{A}{{(_{vdW} .Math.^{2} + r^{6})}^{2}} - \frac{B}{_{vdW} .Math.^{2} + r^{6}}] & (18) \end{matrix}$

[0070] where .sub.vdW is the van der Wools soft-core shifting parameter. It is noted that Equation (18) is different from the one in the currently released CHARMM program. The electrostatic soft-core potential is based on Equation (19).

[00016] $\begin{matrix} U_{softcore .Math. elec} = \frac{(1 -) .Math. Q_{A} .Math. Q_{B}}{\sqrt{_{elec} .Math. + r^{2}}} & (19) \end{matrix}$

[0071] where .sub.elec is the electrostatic soft-core shifting parameter. In Equations (18) and (19), the annihilation is assumed to occur at the state of =1; to be consistent, in this study, all of the dummy atoms are set at the state of =1.

[0072] In the present invention, the DI-OST method is demonstrated in the context of alchemical free energy simulation, specifically to calculate the free energy difference between benzyl phosphonate and difluorobenzyl phosphonate in aqueous solution, to estimate the solvation free energy of the octanol molecule, and to predict the Barnase-Barstar nontrivial binding affinity change induced by the Barnase N58A mutation.

[0073] The molecules of benzyl phosphonate (BP) and difluorobenzyl phosphonate (F2BP) are the side chain analogues of prototypical phosphotyrosine mimetics, which are common targets in drug discovery. The free energy difference between these two molecules in aqueous solution, G.sub.BP.fwdarw.F2BP.sup.aqueous, has been used as a test-bed to analyze free energy simulation methods. In practical studies, if combined with the free energy difference in gas phase G.sub.BP.fwdarw.F2BP.sup.gas, G.sub.BP.fwdarw.F2BP.sup.aqueousG.sub.BP.fwdarw.F2BP.sup.gas gives rise to the value of the solvation energy difference; if combined with the free energy difference in a protein binding site G.sub.BP.fwdarw.F2BP.sup.protein, G.sub.BP.fwdarw.F2BP.sup.proteinG.sub.BP.fwdarw.F2BP.sup.aqueous gives rise to the value of the binding free energy difference. Here, the test calculations on G.sub.BP.fwdarw.F2BP.sup.aqueous gives rise to the value of the binding free energy difference. Here, the test calculations on GBP.fwdarw.F2BP aqueous calculations on G.sub.BP.fwdarw.F2BP.sup.aqueous are used to comparatively evaluate the original OSRW method and the invention's DI-OST method in the aspects of algorithm robustness and long-time convergence. On the basis of each of the two methods, five sets of independent simulations were carried out.

[0074] The MD simulation setup was the same as the one in the earlier studies, where the BP and F2BP molecules are described with the CHARMM22 parameter. In total, 294 water molecules are included in the truncated octahedral box; the water molecules are treated with the TIP3P model. The diagram below shows the setup of the alchemical transition from BP to F2BP.

##STR00001##

[0075] For the fact that there is no vanishing atom in either of the end states, the linear hybrid energy function (as described by Equation (4)) is used in this model study.

[0076] In the five OSRW simulations, g(,F.sub.) (in Equation (7)) was updated every 10 time steps; the height of the Gaussian function h was set as 0.01 kcal/mol; the widths of the Gaussian function, 1 and 2, were set as 0.01 and 4 kcal/mol respectively, and f.sub.m() was updated (based on Equations (8) and (9)) once per 1000 time steps. In the five DI-OST simulations, the samples were collected every time step, g(,) was updated (based on Equations (11)-(13)) once per 1000 time steps, f.sub.m() was updated (based on Equations (17-19) and (9)) once per 1000 time steps, and T.sub.ES was set as 600 K (the system reservoir temperature is 300 K). The length of each simulation is 20 nanoseconds (ns),

[0077] The model calculation on the octanol solvation free energy is to understand the role of the orthogonal space sampling temperature T.sub.ES. The octanol molecule which is described by the CHARMM general force field (CGFF), is embedded in a truncated octahedral water box with a total of 713 TIP3P water molecules. In the alchemical free energy simulation setup, the solvated octanol molecule (=0) is changed to a gas phase molecule (=1), which does not have any interaction with the solvent molecules. Accordingly, all of the van der Waals and the electrostatic energy terms describing the solute-solvent interactions are subject to the soft-core treatment, in which .sub.vdW is set as 0.5 and .sub.elec is set as 5.0. Then, the solvation free energy of octanol G.sub.octanol.sup.solvation can be estimated as the negative of the free energy difference G.sub.=0.fwdarw.=1 between the two end states.

[0078] To understand the influence of T.sub.ES on sampling efficiency, two sets of independent DI-OST simulations were run, each of which includes eight simulations with T.sub.ES respectively set as 750 and 375 K (the system reservoir temperature is 300 K). The samples were collected every time step. gm(,) was updated (based on Equations (11)-(13)) once per 1000 time steps. f.sub.m() was also updated (based on Equations (17-19) and (9)) once per 1000 time steps. The length of each simulation is 17 ns.

[0079] The model study on the binding between barnase, an extracellular RNase of Bacillus amyloliquefaciens, and barstart, the intracellular polypeptide inhibitor of barnase demonstrates the DI-OST method in predicting mutation induced protein-protein binding affinity changes. The barnase N58A mutation is located at the second layer of the binding interface; this noncharging mutation causes about 3.1 kcal/mol of the binding affinity loss.

[0080] The DI-OST simulations were performed to calculate the alchemical free energy changes in two environments: G.sub.Asn.fwdarw.Ala.sup.complex in the barnase-barstar complex and G.sub.Asn.fwdarw.Ala.sup.barnase in the unbound barnase. The binding affinity change G.sub.Asn.fwdarw.Ala can be calculated as G.sub.Asn.fwdarw.Ala.sup.complexG.sub.Asn.fwdarw.Ala.sup.barnase. All of the systems are treated with the CHARMM27/CMAP model. In the model for the G.sub.Asn=Ala.sup.complex calculation, the barnase-barstar complex (with the PDB code of 1BRS) is embedded in the octahedral box with 18 902 water molecules; in the model for the G.sub.Asn.fwdarw.Ala.sup.barnase calculation, the unbound barnase (also based on the PDB code of 1BRS) is embedded in the octahedral box with 11 291 water molecules.

[0081] In the alchemical free energy simulation setup shown in the diagram below, the vanishing atoms in Asn58 (=0) are switched to the corresponding dummy atoms at =1. The bond, angle, and dihedral terms associated with the dummy atoms are set identical to the corresponding ones of the original asparagine residue. All of the van der Waals terms and the electrostatic energy terms associated with the dummy atoms are subject to the soft-core treatment, in which .sub.vdW was set as 0.5 and .sub.elec was set as 5.0. The three DI-OST simulations were performed with T.sub.ES set as 1500 K (the system reservoir temperature is 300 K); the samples were collected every time step g(,) was updated (based on Equations (11-13)) once per 1000 time steps. f.sub.m() was also updated (based on Equations (17-19) and (9)) once per 1000 time steps.

##STR00002##

[0082] The CGFF parameters were generated through the CHARMM-GUI server. The particle mesh ewald (PME) method63 was applied to take care of the long-range columbic interactions while the short-range interactions were totally switched off at 12 . The Nse-Hoover method was employed to maintain a constant reservoir temperature at 300 K, and the Langevin piston algorithm was used to maintain the constant pressure at 1 atm. The time step was set as 1 fs.

[0083] The results from one of the five DI-OST simulations are summarized as follows. In about 800 ps, the scaling parameter completed the first one-way trip, which started at =0. It is noted that free energy estimations are only possible when the sampling covers the entire space. At 820 ps, the initial estimation of G.sub.BP.fwdarw.F2BP.sup.aqueous gives 299.91 kcal/mol, which is very close to the finally converged result 299.77 kcal/mol. In the DI-OST scheme, the ((T.sub.EST.sub.O)/T.sub.ES)g.sub.m(,) biasing term enables the accelerating of moves, which through the restraint term k.sub.(F.sub.).sup.2 induces simultaneous fluctuation enlargement of the generalized force F.sub.. In these simulations, the restraint force constant k was set as 0.1 (kcal/mol).sup.1; F.sub. and are robustly synchronized. The recursive orthogonal space tempering treatment allows F.sub. fluctuations to be continuingly enlarged until around 8 ns; then the space sampling boundary imposed by T.sub.ES was reached. Subsequent recursion kernel and recursion slave updates enable continuous refinement of the g.sub.m(,) and f.sub.m() terms. At the end of the 20 ns simulation, the orthogonal space sampling temperature 600 K allows the fluctuations of and F.sub. to overcome 9KT strongly coupled free energy barriers that are hidden in the orthogonal space.

[0084] The BP and D2BP molecules differ only in their local polarity. One would expect moderate environment changes to be associated with the target alchemical transition; simulating the BP-D2BP transition may not fully demonstrate the sampling power of the DI-OST method. However, for its simplicity, this is an ideal system to test the robustness and the long-time convergence behavior of a free energy simulation method. The estimated free energies from the five DI-OST simulations converge to the average value of 299.77 kcal/mol, which quantitatively agrees with the results obtained from the classical free energy simulation studies. Notably, as mentioned above, in this model study, we only targeted our calculations on the estimation of the alchemical free energy difference, G.sub.BP.fwdarw.F2BP.sup.aqueous, the value of which alone does not have any physical meaning. With 20 ns of the simulation lengths, the variance of the five independently estimated values is as low as 0.01 kcal/mol. Within only 940 picoseconds (ps), all five DI-OST simulations had completed their first one-way trips. Then, the average of the estimated values is 299.82 kcal/mol, and the variance of the calculation results is 0.12 kcal/mol. In 2 ns, the average of the estimated values converges to 299.79 kcal/mol, and the variance of the calculation results is 0.04 kcal/mol. In DI-OST simulations, G.sub.O() [the negative of f.sub.m()] should converge faster than G.sub.O(,) [the negative of g.sub.m(,)] because of the fact that the free energy derivative dG.sub.O()/d is largely determined by the lower region of the free energy surface along (,F.sub.). Besides the sampling efficiency, the DI-OST method provides free energy estimation robustness and long-time convergence rigorousness.

[0085] As discussed above, the original OSRW method is limited in two aspects. First, the orthogonal space sampling temperature T.sub.ES is effectively infinity; thus, there is no boundary to restrict the magnitude of F.sub. fluctuation enlargement. The orthogonal space free energy surface flattening treatment enlarges F.sub. fluctuations boundlessly. In comparison with the DI-OST simulations, which have their sampling boundaries imposed by the finite T.sub.ES value (600 K), the OSRW simulations have ever-increasing sampling coverage. Consequently, both the average and the variance of the free energy results show time-dependent oscillatory behaviors. Second, the original OSRW method is based on the metadynamics recursion kernel. The metadynamic kernel provides extra dynamic boosts on moves. Then, the first one-way trips can be quickly completed (around 350 ps in average). Although the free energy estimations could be started earlier, both of the short-time and long-time convergence behaviors of the OSRW simulations are not nearly as good as those of the DI-OST simulations. For example, at 2 ns, the average of the free energy values from the OSRW simulations converges to 299.97 kcal/mol, and the variance of these results is about 0.10 kcal/mol. The metadynamics sampling in the OSRW simulations is by nature in the nonequilibrium regime; in comparison, the sampling in the DI-OST simulations starts in the near-equilibrium regime and rigorously approaches the equilibrium regime with the converging of the two recursion target functions. The robustness and the convergence behavior of OSRW simulations can be improved with the decreasing of the employed Gaussian height; however, it is expected that then the orthogonal space recursion (the recursion kernel) efficiency will be lower and the g.sub.m(,F.sub.) convergence will be slower.

[0086] The DI-OST algorithm allows the orthogonal space sampling strategy to be more robustly realized for free energy simulations. It should be noted that although in the above comparison, better robustness and long-time convergence behavior of the DI-OST simulations have been demonstrated; indeed, within the simulated time scale, the absolute performance of the OSRW simulations is also expected to be superior.

[0087] Among various alchemical free energy simulation applications, solvation free energy calculations are unique because of the fact that they may require extensive sampling but the results are still quantitatively verifiable by classical free energy simulations. In this study, we carried out solvation energy calculations on the octanol molecule to understand the role of the orthogonal space sampling temperature T.sub.ES in the DI-OST method.

[0088] As discussed above, the sampling length required to achieve the first one-way trip is a key factor in sampling efficiency measurement. The average of the first one-way trip sampling lengths in the eight T.sub.ES=750 K DI-OST simulations is 1.6 ns; the variance of these sampling lengths is 0.53 ns. In comparison, the average of the first one-way trip sampling lengths in the eight T.sub.ES=375 K DI-OST simulations is 3.57 ns, and the variance of the first one-way trip sampling lengths is 0.63 ns. The sampling bottleneck is located in the region of (0.7, 0.8); infrequent crossing of this region slows down overall round-trip diffusivity. The solute appearance/annihilation transition is the major event in this sampling bottleneck region.

[0089] It is noted that due to the employment of the soft-core potential, the solute appearance/annihilation transition is shifted from =1, the expected region when the linear hybrid alchemical potential is applied, to this new region. Solvent molecule reorganizations are the hidden events that are associated with solute insertions/annihilations. When the orthogonal space sampling temperature T.sub.ES is higher (for example 750 K), the magnitude of the F.sub. fluctuation is expected to be larger and hidden free energy barriers associated with solvent reorganizations can be more quickly crossed; thereby, the sampling of the bottleneck region can be more efficient.

[0090] With regard to the time-dependent averages of the estimated desolvation free energies from the eight T.sub.ES=750 K DI-OST simulations, and the time-dependent variances of the estimated desolvation free energies from the eight T.sub.ES=750 K DI-OST simulations, at around 2 ns, the average of the estimated values is 3.45 kcal/mol and the variance of these values is about 0.23 kcal/mol. At around 6 ns, the average of the estimated values drops to around 3.35 kcal/mol, while their variance decreases to 0.17 kcal/mol. At around 13.5 ns, the free energy estimations reach very nice convergence with the average value of 3.36 kcal/mol, and the estimation variance drops below 0.1 kcal/mol. By the inclusion of the long-range Lennard-Jones correction (0.79 kcal/mol), the predicted solvation energy, 4.150.1 kcal/mol, is in excellent agreement with the experimental value 4.09 kcal/mol. At 17 ns, a nicely converged g.sub.m(,) function was obtained with the variance further reduced to 0.08 kcal/mol.

[0091] The orthogonal space sampling temperature 750 K allows the fluctuations of and F.sub. to quickly escape 5 kT strongly coupled free energy barriers. In comparison, the eight T.sub.ES=350K DI-OST simulations have smaller sampling coverage in the orthogonal space. The lack of sampling in the orthogonal space not only leads to the longer sampling length requirement for the first one-way trips as discussed above but also leads to the slower convergence. At 17 ns, some of the T.sub.ES=350 K DI-OST simulations have not yet converged well because of the fact that the variance among them is still larger than 0.1 kcal/mol. As a result, the average of these values is about 0.05 kcal/mol away from the average of the values estimated from the T.sub.ES=750K simulations. With T.sub.ES=350 K, the orthogonal space sampling treatment temperature 350 K only allows the fluctuations of and F.sub. to escape less than 2 kT strongly coupled hidden free energy barriers.

[0092] As shown in the above analysis, the orthogonal space tempering treatment allows the sampling bottleneck regions, where hidden free energy barriers exist, to be more efficiently explored. If there is no hidden free energy barrier in the orthogonal space, a higher orthogonal space sampling temperature T.sub.ES may introduce more diffusion sampling overhead, which might lower free energy estimation precision. In practical biomolecular simulation studies, there usually exist large hidden free energy barriers, and then, obtaining accurate free energy estimation should be a higher priority than improving estimation precision, as long as the estimation precision is in a reasonable range. On the basis of our experience, when a new system is explored, we would like to recommend setting T.sub.ES in a range between 750 and 1500K.

[0093] It has been known that charge-charge interactions are directly responsible for the strong binding between Barnase and Barstar. The Barnase Asn58 residue is located at the second layer of the binding interface. As measured experimentally, the noncharging N58A mutation causes 3.1 kcal/mol of the binding affinity loss. This unusual electrostatic response suggests that nontrivial conformational changes are likely to be coupled with the N58A mutation. To quantitatively understand the N59A induced binding affinity change, a specialized technique like the DI-OST method should be applied to ensure adequate sampling of the coupled structural transitions. To confidently sample such conformational changes, in the DI-OST simulations, T.sub.ES is set at 1500 K.

[0094] Two DI-OST simulations, which are respectively based on the Barnase-Barstar (bound) complex structure and the Barnase (unbound) structure, were performed. In 4 ns, multiple round-trips were realized in both of the DI-OST simulations. It took the bound-state simulation only 1.1 ns to complete the first one-way trip, while it took the unbound-state sampling about 1.8 ns to cover the entire order parameter range. The dynamics of the scaling parameter in the unbound-state simulation reveals that the region of =0.4 is the sampling bottleneck area, where slow gating events need to occur for continuing travels. In 4 ns, good convergence was realized in both of the free energy simulations. Through the DI-OST recursion treatment, the -dependent free energy derivatives dG.sub.O/d were calculated; the binding affinity change G.sub.Asn.fwdarw.Ala is largely responsible for the difference that occurs near the alanine state (=1), where the two free energy derivative curves are distinct from each other. As discussed below, the conformational change of the mutated (N58A) Barnase induced by the binding/unbinding of Barstar is mainly responsible for G.sub.Asn.fwdarw.Ala. On the basis of the TI formula (Equation (9)), G.sub.Asn.fwdarw.Ala.sup.complex is estimated to be 94.0 kcal/mol and G.sub.Asn.fwdarw.Ala.sup.Barnase is estimated to be 91.1 kcal/mol; thus G.sub.Asn.fwdarw.Ala can be predicted to be 2.9 kcal/mol, which is in excellent agreement with the experimental value of 3.1 kcal/mol. The orthogonal space tempering treatment allows the fluctuations of and F.sub. to overcome 12-14 kT of the strongly coupled hidden free energy barriers.

[0095] The comparison of the crystal structures (1BRS and 1BNR) suggests that the Barnase protein has the identical conformation at the bound and the unbound states. The Barnase Asn58 is located on a Barstar-binding loop, but at the opposite side from the binding interface residues, for instance, Arg59. In these structures, the binding interface region on the Arg59-containing loop is zipped by the hydrogen bond between the amide group of Gly61 and the carbonyl group of Asn58; thereby Arg59 can be accurately positioned into the binding site. This zipped structure is further locked by two additional hydrogen bonds between the Asn58 side chain and the backbone amidelcarbonyl groups. In the bound-state DI-OST simulation, with residue 58 repeatedly interconverted between the two end chemical states: asparagine and alanine, the structure of the Arg59-containing loop stayed. unchanged, even when approached the alanine state (=1). The hydrogen bond between the amide group of Gly61 and the carbonyl group of Asn58 was not broken during the entire simulation. The fluctuation of the distance between residues 58 and 63 was modest. In contrast, in the unbound-state simulation, synchronously with the move, the Arg59-containing loop varied back and forth between the original zipped conformation (at the asparagine state when =0) and a newly formed unzipped conformation (at the alanine state when =1). When residue 58 turned to alanine, the distance between residues 58 and 63 increased, and when traveled back to the asparagine state, the canonical hydrogen bonds between these two residues were formed again. Correspondingly, the zipping hydrogen bond repetitively broke and reformed. On the unzipped loop of the unbound N59A mutant, Arg59 flips away from its wild-type gesture that is originally preorganized to bind Barstar.

[0096] The above analysis suggests that there is strong coupling between the Barnase-Barstar binding and the Arg59-containing loop zipping, and Asn58 plays a pivotal role in prestabilizing the zipped conformation of the Arg59-containing loop when Barnase is in the unbound state. Therefore, the Barnase-Barstar binding can be enhanced. When Asn58 is mutated to alanine, the Arg59-containing loop in the unbound Barnase is unzipped due to the loss of both the locking hydrogen bonds by Asn58 and the binding of the Barstar. When the N58A mutant binds Barstar, some free energy penalty needs to be paid in order to form the bound conformation, which, as revealed by the bound state DI-OST simulation, stays zipped in the Barstar-bound state regardless of the existence of Asn58. The two simulations share the similar free energy derivative curves near the asparagine (=0) state; this indicates that there is only modest contribution from the direct electrostatic interaction difference to the binding affinity change. In essence, the binding affinity change induced by the N58A mutation is largely responsible for the mutation-induced conformational change at the unbound state. The DI-OST method allows the corresponding conformational change to be synchronously sampled with the moves; therefore, the binding affinity change can be efficiently predicted.

[0097] The simulations described above were performed using a 16-core Intel 3.2 GHz cluster. However, as discussed below, other computing platforms may be preferred.

[0098] With reference to FIG. 1, in one embodiment, the invention 100 is realized with the implementation of two software-based engines: a molecular dynamics (MD) engine 135 and an orthogonal space tempering (OST) engine 105. The molecular dynamics engine 135 provides a computer simulation method for studying the physical movements of atoms and molecules, wherein the atoms and/or molecules are allowed to interact for a fixed period of time, thereby providing a view of the dynamic evolution of the system. The orthogonal space tempering engine 105 implements the above-derived orthogonal space recursion and propagation calculations. The molecular dynamics engine 135 is implemented in a first CPU or GPU and the orthogonal space tempering engine 105 is implemented in a second CPU or GPU, which is independent form the first CPU or GPU. In a particular embodiment, the molecular dynamics engine 135 is a modified version of Chemistry at HARvard Macromolecular Mechanics (CHARMM) and the orthogonal space tempering engine 105 is a modified version of FLOSS. The FLOSS software can be obtained from Florida State University.

[0099] In operation, an input generator 130 provides initialization inputs to the molecular dynamics engine 135 operating in the first GPU or CPU, which then passes the initialization input to an input interpreter 140. The input interpret interpreter 140 sends some of the inputs to the OST engine 105 to initialize the OST environment 110 operating in the second GPU or CPU. The OST engine and the molecular dynamics engine operate in parallel and pass information between each other, illustrated. At each molecular dynamics propagation step performed by the molecular dynamics propagation engine 145, the OST engine 110 feeds forces, which are generated by the orthogonal space recursion and propagation calculation engine 115 based on the OST recursion protocol, into the molecular dynamics engine 135 so that molecular motions will be altered from regular molecular dynamics behaviors. Altering the molecular motions by speeding-up the structural changes of the proteins, based upon the physics-based algorithm of the present invention, allows free energy barriers in the orthogonal space to be automatically overcome. Additionally, at each molecular dynamics propagation step, the OST engine 110 propagates the motions of the virtual molecules, the values of which are used in the generation of the forces to be fed to the molecular dynamics engine 135. In addition, at each step, the OST engine 110 also acquires samples from the molecular dynamics engine 135 to perform its data recursion operation. The OST engine 110 provides output 120 independent from the output 165 of the molecular dynamics engine 135 and has its own data structure for saving data to be used in its own adaptive recursive operations. The propagation steps of the OST engine 125 and the molecular dynamics engine 150 continue for a predetermined period of time, as previously discussed. The outputs can be used to predict the most viable new drug candidates.

[0100] The OST engine output includes four data files called dvdl.dat, free.dat, and g2d.pm3d.dat. The file dvdl.dat gives the time-dependent parameter changes. The file fic.dat gives the current free energy related information. The file free.dat gives the time-dependent estimated free energy values. The file g2d.pm3d.dat gives the orthogonal space free energy surface information.

[0101] In general, the OST engine 110 of the present invention is designed as an external machine, which has its own standalone internal operations that are effective in speeding-up the sampling speed in comparison to that of the molecular dynamics engine 135. The OST engine 110 has been implemented in both CPU and GPU environments to work with both CPU and GPU based molecular dynamics engine 135. The OST engine 110 has been designed to be flexibly plugged into a standard molecular dynamics engine 135.

[0102] Turning now to FIG. 2, an apparatus 200 for implementing the OST device of the present invention includes a processor 210 with associated memory 215, an input 205 and an output 220. As mentioned above, the test simulations were performed with a 16-core Intel 3.2 GHz cluster. However, it is believed that other computing platforms may be preferred. In particular, it is believed that GPUs are more powerful in implementing the invention than CPUs. The NVIDIA GPU platform (http://www.nvidia.com/object/gpu-applications.html) is the presently preferred platform.

[0103] There have been described and illustrated herein several embodiments of methods and apparatus for double-integration orthogonal space tempering. While particular embodiments of the invention have been described, it is not intended that the invention be limited thereto, as it is intended that the invention be as broad in scope as the art will allow and that the specification be read likewise. It will therefore be appreciated by those skilled in the art that yet other modifications could be made to the provided invention without deviating from its spirit and scope as claimed.

METHODS AND APPARATUS FOR DOUBLE-INTEGRATION ORTHOGONAL SPACE TEMPERING

Inventors

Cpc classification

Classification Explorer

G16B5/30

PHYSICS

Classification Explorer

G16C20/50

PHYSICS

Classification Explorer

G16B5/00

PHYSICS

Classification Explorer

G16C20/90

PHYSICS

Classification Explorer

G16C10/00

PHYSICS

International classification

Classification Explorer

G06F19/00

PHYSICS

Classification Explorer

G06F19/12

PHYSICS

Abstract

Claims

Description