Method and system for molecular dynamics simulation with stability control
09727690 · 2017-08-08
Assignee
Inventors
- Roldán Martínez Abán (Madrid, ES)
- Pablo Echenique Robba (Madrid, ES)
- Javier Sancho Sanz (Madrid, ES)
- Roberto Martínez Benito (Madrid, ES)
- Alfonso Campo Camacho (Madrid, ES)
- Álvaro López Medrano (Madrid, ES)
Cpc classification
G16B15/00
PHYSICS
G16B5/00
PHYSICS
International classification
Abstract
The present invention is applicable in the field of molecular dynamics, said invention consisting of computing methods for predicting the structure and function of biomolecules, and particularly of proteins, by means of simulating the protein folding process and the interaction process of proteins with other biomolecules in a solvent. More particularly, the invention relates to a method and a system for controlling simulation stability and for choosing the timestep used in the numerical integration of the equations of motion. The invention successfully reduces the molecular dynamics simulation time by means of optimizing the timestep choice through an adaptive control or allowing larger timesteps correcting the trajectories based on a power control.
Claims
1. A computer-implemented method for simulating systems of atoms by means of molecular dynamics, which comprises the use of at least one numerical integrator for integrating equations of motion of the atoms, the computer-implemented method comprising: controlling a power of at least part of the atoms by carrying out a power control loop that calculates the power of at least part of the atoms to provide a calculated power, and wherein a timestep of the at least one numerical integrator is adaptively modified depending on the calculated power of the atoms, wherein if the calculated power is less than a lower threshold of a predefined stability power range, the timestep of the at least one numerical integrator is increased, and wherein if the calculated power is greater than an upper threshold of the predefined stability power range, the timestep of the integrator is reduced; maintaining the power of at least part of the atoms within the predefined stability power range within which the simulation of the systems is considered stable, and integrating the equations of motion of the atoms at the rate marked by the timestep the at least one numerical integrator in each instant of the simulation of the systems to maintain the simulation stability as a result of the power control loop.
2. The computer-implemented method according to claim 1, further comprising reducing a velocity of the atoms when the calculated power is greater than the upper threshold of the predefined stability power range to maintain the simulation stability.
3. The computer-implemented method according to claim 2, wherein reducing the velocity of the atoms is performed by means of scaling the velocity to provide a new power equal to the calculated power of the upper threshold of the predefined stability power range.
4. The computer-implemented method according to claim 1, wherein the upper threshold of the predefined stability power range comprises a hundred times a maximum value of dynamic variables for a constant timestep simulation of 0.5 femtoseconds, and the lower threshold of the predefined stability power range comprises a hundred times the minimum value of the dynamic variables.
5. The computer-implemented method according to claim 1, wherein the power of the at least part of the atoms is calculated as a scalar product of a velocity times a force acting thereon.
6. The computer-implemented method according to claim 1, further comprising utilizing constraint algorithms and hydrogen atom mass distribution.
7. The computer-implemented method according to claim 1, wherein the power of at least part of the atoms is maintained within the predefined stability power range two or more predefined stability power ranges, each of which is associated with a different atom of the atoms.
8. The computer-implemented method according to claim 1, wherein a multiple timestep (MTS) integration is used, each of which is controlled independently.
9. The computer-implemented method according to claim 1, wherein the simulated systems of atoms corresponds with: a gas, a liquid, a nano-device, biomolecules, or proteins.
10. The computer-implemented method according to claim 1, further comprising: selecting drugs, wherein the simulated systems of atoms corresponds with a biomolecule; entering data corresponding to the selected drug; and upon completion of the simulation, generating data relating to positions, velocities, energies of the molecules and estimating macroscopic properties of the molecules and validity of the selected drug.
11. A system comprising a processor and a memory, the memory storing computer program instructions for simulating systems of atoms by means of molecular dynamics, which comprises the use of at least one numerical integrator for integrating the equations of motion of the atoms, the computer program instructions when executed by the processor cause the system to perform: controlling a power of at least part of the atoms of the system by carrying out a power control loop that calculates the power of at least part of the atoms to provide a calculated power, and wherein a timestep of the at least one numerical integrator is adaptively modified depending on the calculated power of the atoms, wherein if the calculated power is less than a lower threshold of a predefined stability power range, the timestep of the at least one numerical integrator is increased, and wherein if the calculated power is greater than an upper threshold of the predefined stability power range, the timestep of the integrator is reduced; maintaining the power of at least part of the atoms within the predefined stability power range within which the simulation of the systems is considered stable, and integrating the equations of motion of the atoms at the rate marked by the timestep the at least one numerical integrator in each instant of the simulation of the systems to maintain the simulation stability as a result of the power control loop.
12. The system according to claim 11, wherein the system comprises a data input means and a data output means, where said computer program instructions are further executable by the processor to cause the system to perform: selecting drugs, wherein the simulated systems of atoms corresponds with a biomolecule entering data corresponding to the selected drug; and upon completion of the simulation, generating data relating to positions, velocities, energies of the molecules and estimating macroscopic properties of the molecules and validity of the selected drug.
Description
DESCRIPTION OF THE DRAWINGS
(1)
(2)
(3)
(4)
(5)
(6)
PREFERRED EMBODIMENT OF THE INVENTION
(7)
(8) Said variables must be brought to the internal variables of the program.
(9) The next part is the core of the simulator, the integrator, which as has been mentioned, is responsible for calculating the positions, velocities and forces of the atoms of the system for each step of the simulation. The present invention relates precisely to this part of the integrator, particularly to the part of updating the coordinates (positions and velocities), modifying the velocities or the timestep.
(10) In a stable simulation, both total and individual power values are within a stability range of values which can be considered normal. Likewise, if the power values are within this stability range, the simulation is stable.
(11) This allows controlling simulation stability, which is somewhat difficult a priori, based on controlling a parameter, the power of the atoms.
(12) The invention successfully uses the maximum timestep possible, maintaining the simulation stable at all times, which can be achieved as mentioned, by maintaining the power within a range of values that can be called a stability range.
(13) There are different possibilities to that end.
(14) In each iteration i of the simulation corresponding to a discrete time instant, the state of each atom α is described by its position {right arrow over (r)}.sub.i.sup.α and velocity {right arrow over (v)}.sub.i.sup.α. Based on the positions of each atom, the force {right arrow over (F)}.sub.i.sup.α and the acceleration {right arrow over (a)}.sub.i.sup.α acting thereon can be calculated.
(15) The power p.sub.i.sup.α of the atom α in the instant i is defined as the scalar product of the velocity of an atom times the force to which it is subjected:
p.sub.i.sup.α={right arrow over (F)}.sub.i.sup.α.Math.v.sub.i.sup.α (11)
(16) p.sub.i is defined as the largest of all the p.sub.i.sup.α, i.e., the power value of the atom with the highest power.
(17) In Newtonian dynamics, power is a measurement of kinetic energy variation for a particle, and therefore in this case, for an atom.
(18) In the case of positive power, an atom gains kinetic energy, whereas in the case of negative power, an atom loses energy, the atom is damped.
(19) Given that a numerical integration of Newton's equation is being performed, a good integration maintaining this physical property is anticipated.
(20) The total power P.sub.i.sup.α can also be defined as the sum of the power p.sub.i.sup.α for all the N atoms of the system:
(21)
(22) Likewise, power variation Δp.sub.i.sup.α can be defined as the power difference for an atom between two consecutive iterations:
Δp.sub.i.sup.α=p.sub.i.sup.α−p.sub.i−1.sup.α (13)
and ΔP.sub.i can be defined as the total power variation between two iterations
ΔP.sub.i=P.sub.i−P.sub.i−1 (14)
(23) Δp.sub.i is defined as the largest of all the Δp.sub.i.sup.α.
(24) For these dynamic variables of the system, thresholds u, U, δ and Δ can be defined such that in good simulation conditions, the variables p.sub.i, P.sub.i, Δp.sub.i, ΔP.sub.i always have values less than the corresponding thresholds thereof.
(25) This threshold will depend on two simulation characteristics: the system to be simulated: both the force field used and the type of atoms of the system can affect these thresholds. The size of the system also affects the total amounts simulation conditions: the temperature, which is related to kinetic energy, will influence normal power values. The use of constraints will also have an effect since certain binding forces are eliminated. Likewise, more demand can be placed on the simulation quality, seeking good energy conservation, which involves a smaller timestep and also smaller power and therefore threshold values.
(26) The power dependency with respect to timestep can be seen in
(27) The system used is a ubiquitin protein molecule dissolved in water.
(28) The integrator used is the Velocity Verlet integrator and with a Berendsen type thermostat.
(29) The power is depicted on the vertical axis in au.Math.A.sup.2.Math.ps.sup.−3,
(30) au being the unit of atomic mass, A the unit of distance in amstrongs, and ps picoseconds.
(31) The timestep is depicted on the horizontal axis in femtoseconds
(1 fs=0.001 ps)
(32) The way of determining these parameters for simulation conditions is to obtain the statistics of the different variables from a short simulation in said conditions. A threshold can be obtained from the maximum of said dynamic variables by inspection.
(33) The threshold will be selected as the maximum plus a margin of 10% of the maximum power value for that system.
(34) Once a threshold for a type of system (gases, fluids, nano-devices, biomolecules) in temperature and pressure conditions is determined, this will be used for similar systems and conditions for one and the same simulator.
(35) The first iterations of the dynamics can be used as a threshold initialization step, automating the process.
(36) p.sub.i for the first 260000 iterations of ubiquitin protein simulation can be seen in
(37) It is also the case that the atoms having greater power are hydrogen atoms and they are what generally compromise system stability. An option which has been mentioned above is to increase the mass thereof at the expenses of heavy atoms to which they are bound.
(38) Once these threshold values are determined, there are several strategies compatible with one another for maintaining power in its normal range, maintaining the simulation stable.
(39) 1.—Timestep h Control:
(40) If the power is less than the threshold, the system is stable and it is possible to increase the timestep.
(41) If the power is greater, the system enters the instability zone and the timestep must be reduced in order to recover same.
(42) A simple algorithm based on this principle is the following. In each iteration: If variable <threshold then h=a.Math.h If not, then h=b.Math.h
(43) With a>1 and b<1, positive real numbers.
(44) It is convenient to choose b<1/a (conservative condition), such that the system can be stabilized.
(45) This type of control is complemented, when appropriate, with returns to prior steps: In each iteration: If variable <threshold then h=a.Math.h If not, then
h=b.Math.h
(46) if variable >>threshold then a new state i=state i-k
(47) end if end if
k being an integer.
(48) In practice with the criterion presented herein, the return to a prior step is only necessary when an excessively large threshold is chosen.
(49) Such algorithms have already been proven for molecular dynamics based on a different criterion, energy conservation (see S. J. Stuart, J. M. Hicks, M. T. Mury, An Iterative Variable-Timestep Algorithm for Molecular Dynamics Simulations).
(50) This control can be performed starting from any of the pairs (variable, threshold) based on power:
(p.sub.i,u) (15)
(P.sub.i,U) (16)
(Δp.sub.i,δ) (17)
(ΔP.sub.i,Δ) (18)
(51) Of the 4 pairs, the suggested pair is (p.sub.i,u) because it allows finer power control which results in greater gain.
(52) Next,
(53) The power dependency and therefore the recommendable threshold dependency with respect to timestep of the simulation is once again seen.
(54) It must be pointed out that a simulation at 2.5 femtoseconds (0.0025 ps) without power control would be unstable, so this method allows extending the timestep.
(55) The advantage of this power control-based adaptive control is that it allows more gain.
(56)
(57) An energy conservation-based control, an aforementioned criterion, is in blue.
(58) Another criterion, the alignment between position and velocity variation, is in green.
(59) The criterion presented herein is in red.
(60) A constraint control algorithm has been introduced to allow increasing the timestep.
(61) The small timestep variation and the gain with respect to the rest of the criteria is an indication of this being optimum for controlling timestep based on control loops.
(62) 2.—Individual Power Control
(63) A way of maintaining the power of each atom within the normal values is by rescaling the velocity of the atom if its power has exceeded the value threshold:
(64)
(65) p.sub.i.sup.α and {right arrow over (v)}.sub.i.sup.α being the power and velocity of the atom α in the iteration I, respectively.
(66) This problematic atom power correction allows increasing the timestep artificially.
(67) This together with the adaptive control of the timestep allows obtaining even greater timestep values.
(68) Along with constraint algorithms (WIGGLE) and hydrogen atom mass distribution, this allows values greater than 18 femtoseconds, as can be observed in
(69) 3.—Itemizing by Forces
(70) It is possible to itemize these variables for each type of force or group of forces.
(71) Therefore, the power due to the type of force or to the group of forces k in the iteration i for the atom α is:
p.sub.i.sup.α,k={right arrow over (F)}.sub.i.sup.α,k.Math.v.sub.i.sup.α (20)
(72) {right arrow over (F)}.sub.i.sup.α,k being the sum of the total force acting on the atom α type of force or the group of forces k.
(73) Similarly, it is possible to define the rest of the variables and thresholds.
(74) This can be itemized for each type of force, for example Van Der Waals, angles; or by groups of forces, for example bound, unbound.
(75) This separation by forces allows controlling the timestep independently for a multiple-timestep (MTS) integration, such as the case of RESPA, where the different timesteps are integrated at a different rate, following a criterion identical to that described in the first section.
(76) All these algorithms allow increasing the timestep by maintaining simulation stability.
(77) As mentioned, the choice of the threshold will depend on the simulation conditions.
(78) In some very conservative conditions, without using constraints or hydrogen mass redistribution and using BBK as an integrator at a temperature of 300 K, the stable timestep is 1.1 fs for a system formed by a ubiquitin-like protein, DHFR (dihydrofolate reductase) dissolved in water.
(79) Given that neither constraints nor hydrogen mass redistribution is used, these atoms are what cause integration problems.
(80) For each simulation step, the power control limits the velocity of the atoms according to equation (19).
(81) It is possible to obtain a timestep of 2.1 fs with this power control without modifying the global dynamics, the gain being 90% with respect to a conventional BBK in the same conditions.
(82) In this case the chosen threshold was 3100000 au.Math.A.sup.2.Math.ps.sup.−3, which is consistent with the values obtained for ubiquitin, a very similar system, at 1 fs.
(83) Given that the dynamics behavior of certain atoms can be very different, it is possible to use a different threshold for each atom
(84)
(85) Like hydrogens, the mass of one atom may typically vary greatly with respect to another atom, so said threshold can simply be a function of the mass of each atom.
(86)
(87) The stable timesteps are less than those for the preceding integrator, Verlet velocity and Berendsen thermostat, due particularly to the thermostat, since Berendsen re-scales the velocities, which therefore limits them, but damps the global dynamics. In contrast, BBK uses a Langevin type thermostat, which further reproduces the thermodynamic conditions to be simulated more accurately.
(88) If a threshold by type of force or type of atom is to be used, the way of determining it is similar to that of determining a single threshold. The corresponding behavior of the corresponding power for a very conservative timestep simulation is studied, and the maximum plus a margin is taken.