Method for searching for molecular stable structure, program for searching for molecular stable structure, and device for searching for molecular stable structure
11694767 · 2023-07-04
Assignee
Inventors
- Jun NAKABAYASHI (Ashigarakami-gun, JP)
- Shino Ohira (Ashigarakami-gun, JP)
- Kyosuke TSUMURA (Ashigarakami-gun, JP)
Cpc classification
G06N7/01
PHYSICS
G16B15/00
PHYSICS
International classification
G16B15/00
PHYSICS
G06N7/01
PHYSICS
Abstract
Provided are a method for searching for a molecular stable structure, a program for searching for a molecular stable structure, and a device for searching for a molecular stable structure, which are capable of acquiring a stable structure and various locally stable structures from a structural formula of a compound in a short time and with high accuracy. A three-dimensional structure is generated from the structural formula of the compound, and a locally stable structure is obtained from the three-dimensional structure. A one-dimensional or multidimensional energy distribution function for one or a plurality of internal coordinates and a probability distribution function of increasing a probability of low-energy internal coordinates are calculated from internal coordinates and an energy value of the locally stable structure. The method for searching for a molecular stable structure repeats the following processes: generating a three-dimensional structure based on the calculated probability distribution function; acquiring a locally stable structure; reflecting internal coordinates and an energy value of the obtained locally stable structure on the energy distribution function and the probability distribution function; and acquiring the locally stable structure, thereby obtaining a plurality of the locally stable structures and a structure with lowest energy. The program and the device for searching for a molecular stable structure execute the method.
Claims
1. A method for searching for a molecular stable structure, executed by a device comprising a processor, wherein the processor performs: a structural formula acquisition step of acquiring a structural formula of a compound; a first three-dimensional structure generation step of generating one or more three-dimensional structures in which internal coordinates of the structural formula are randomly set; a locally stable structure acquisition step of changing the internal coordinates of the three-dimensional structure to obtain a locally stable structure which is a structure with low energy; an energy acquisition step of obtaining internal coordinates of the locally stable structure and energy of the locally stable structure in the internal coordinates; an energy distribution function calculation step of calculating an energy distribution function which is a one-dimensional or multidimensional energy distribution function calculated for one or a plurality of internal coordinates constituting the compound and shows energy distribution of the locally stable structure with respect to the internal coordinates of the locally stable structure; a probability distribution function calculation step of calculating a probability distribution function of increasing a probability of low-energy internal coordinates from the energy distribution function; a second three-dimensional structure generation step of simultaneously changing one or more internal coordinates based on the probability distribution function and generating one or more three-dimensional structures using the determined internal coordinates; a repetition step of repeating, by using the three-dimensional structure generated in the second three-dimensional structure generation step, the locally stable structure acquisition step, the energy acquisition step, the energy distribution function calculation step, the probability distribution function calculation step, and the second three-dimensional structure generation step; and an output step of outputting, via a display, one or both of a plurality of the locally stable structures obtained in the locally stable structure acquisition step and a structure with lowest energy among the plurality of locally stable structures.
2. The method for searching for a molecular stable structure according to claim 1, wherein the locally stable structure is a structure having internal coordinates in which, in a case where the internal coordinates of the three-dimensional structure are changed in a direction of decreasing the energy, the energy is not further decreased.
3. The method for searching for a molecular stable structure according to claim 1, wherein the internal coordinates are determined by a dihedral angle obtained by coordinates of four atoms.
4. The method for searching for a molecular stable structure according to claim 3, wherein the energy distribution function calculation step is performed for all of the dihedral angles that the compound takes.
5. The method for searching for a molecular stable structure according to claim 1, wherein, in the probability distribution function calculation step, a function of accelerating computation is added to the probability distribution function.
6. The method for searching for a molecular stable structure according to claim 1, wherein, in the second three-dimensional structure generation step, either generating a random number and selecting internal coordinates with a highest probability distribution intensity based on the random number, or determining the internal coordinates by the probability distribution function is selected, and the three-dimensional structure is generated.
7. A non-transitory and tangible computer-readable recording medium that causes a computer to execute the method for searching for a molecular stable structure according to claim 1, in a case where a command stored in the recording medium is read by the computer.
8. A device for searching for a molecular stable structure, comprising: a processor configured to: acquire a structural formula of a compound; generate one or more three-dimensional structures; change internal coordinates of the three-dimensional structure to obtain a locally stable structure which is a structure with low energy; obtain internal coordinates of the locally stable structure and energy of the locally stable structure in the internal coordinates; calculate an energy distribution function which is an energy distribution function calculated for each internal coordinate of each atom constituting the compound and shows energy distribution of the locally stable structure with respect to the internal coordinates of the locally stable structure; and calculate a probability distribution function of increasing a probability of low-energy internal coordinates from the energy distribution function; and a display that displays the locally stable structure, wherein the processor generates the three-dimensional structure based on the acquired structural formula of the compound or the probability distribution function.
9. The device for searching for a molecular stable structure according to claim 8, wherein the processor is further configured to: acquire a structure with lowest energy from the locally stable structures that are obtained.
10. The device for searching for a molecular stable structure according to claim 8, wherein the device further comprises a non-transitory and tangible computer-readable recording medium, and the processor performs processing by referring to the recording medium.
11. The device for searching for a molecular stable structure according to claim 8, wherein the processor acquires the structural formula of a compound via a server and/or a database connected to the device via a network.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DESCRIPTION OF THE PREFERRED EMBODIMENTS
(22) Hereinafter, a method for searching for a molecular stable structure, a program for searching for a molecular stable structure, and a device for searching for a molecular stable structure according to an embodiment of the present invention will be described with reference to the accompanying drawings.
(23) <<Device for Searching for Molecular Stable Structure>>
(24)
(25) <Configuration of Processing Unit>
(26)
(27) The structural formula acquisition unit 105 acquires information such as a structural formula of a compound via a recording medium interface such as a DVD drive and a semiconductor memory terminal (not shown), and/or a network. The three-dimensional structure generation unit 110 randomly sets internal coordinates of the structural formula from the structural formula of the compound acquired by the structural formula acquisition unit 105, and generates one or more three-dimensional structures. In addition, the internal coordinates are determined based on a probability distribution function described below, and one or more three-dimensional structures are generated. The locally stable structure acquisition unit 115 changes the internal coordinates of the three-dimensional structure generated by the three-dimensional structure generation unit 110 and locally deforms the structure to acquire a locally stable structure which is a structure with low energy. Specifically, the locally stable structure is a structure in which even though the structure is deformed so that the energy is decreased, the energy is not further decreased. Further, the locally stable structure acquisition unit 115 comprises the most stable structure acquisition unit 116 and acquires a most stable structure with lowest energy from the obtained locally stable structures. In the present specification, “energy” is energy derived from a three-dimensional structure, and does not indicate energy resulting from change of one of internal coordinates described below.
(28) The energy acquisition unit 120 acquires the energy of the locally stable structure acquired by the locally stable structure acquisition unit 115. The energy distribution function calculation unit 125 calculates an energy distribution function showing distribution of the energy of the locally stable structure (structural energy) with respect to each of the internal coordinates of the locally stable structure. The energy distribution function is calculated for each internal coordinate constituting the compound. The probability distribution function calculation unit 130 calculates a probability distribution function of increasing a probability of low-energy internal coordinates from the energy distribution function.
(29) The output unit 135 outputs the locally stable structure acquired by the locally stable structure acquisition unit 115. In addition, the output unit 135 outputs the most stable structure obtained by the most stable structure acquisition unit 116. The display control unit 140 controls display of the acquired information and a processing result on a monitor 310. Details of processing of the method for searching for a molecular stable structure using these functions of the processing unit 100 will be described below. The processing by these functions is performed under control of the CPU 145.
(30) The function of each unit of the processing unit 100 described above can be realized by using various processors. The various processors include, for example, a CPU that is a general-purpose processor that executes software (program) to realize various functions. The various processors described above also include a programmable logic device (PLD) that is a processor of which a circuit configuration can be changed after manufacture, such as a field programmable gate array (FPGA). Further, a dedicated electric circuit that is a processor having a circuit configuration designed to be dedicated to execute specific processing, such as an application specific integrated circuit (ASIC), is also included in the various processors described above.
(31) The function of each unit may be realized by one processor, or may be realized by combining a plurality of processors. In addition, a plurality of functions may be realized by one processor. As a first example in which the plurality of functions are configured by one processor, as represented by a computer such as a client or a server, one processor is configured by a combination of one or more CPUs and software, and this processor realizes the plurality of functions. As a second example, as represented by a system on chip (SoC), a processor that realizes the functions of the entire system by using one integrated circuit (IC) chip is used. In this way, the various functions are configured by using one or more of the various processors described above as a hardware structure. Further, the hardware structure of these various processors is more specifically an electric circuit (circuitry) in which circuit elements such as semiconductor elements are combined.
(32) In a case where the processor or the electric circuit described above executes software (program), a processor (computer) readable code of the software to be executed is stored in a non-transitory recording medium such as the ROM 150 (see
(33) <Configuration of Storage Unit>
(34) The storage unit 200 is configured of a non-transitory recording medium such as a digital versatile disk (DVD), a hard disk, or various semiconductor memories, and a controller thereof, and stores an image and information shown in
(35) <Configuration of Display Unit and Operation Unit>
(36) The display unit 300 comprises the monitor 310 (display device), and can display an input image, an image and information stored in the storage unit 200, a result of the processing by the processing unit 100, and the like. The operation unit 400 includes a keyboard 410 and a mouse 420 as an input device and/or a pointing device, and a user can perform operations necessary for executing the method for searching for a molecular stable structure according to the present embodiment through these devices and a screen of the monitor 310. The operations that can be executed by the user include input of a structural formula of a compound, designation of a threshold value in calculating a probability distribution function, designation of a threshold value in generating a three-dimensional structure using a probability distribution function, and the like.
(37) <Processing in Device for Searching for Molecular Stable Structure>
(38) The above-described device 10 for searching for a molecular stable structure can search for a molecular stable structure in accordance with a user's instruction through the operation unit 400.
(39) <<Searching for Molecular Stable Structure>>
(40)
(41) In a case where determination is made in step S18 that a desired structure or a desired number of the locally stable structures or the most stable structures are not obtained, the method includes: an energy distribution function calculation step (step S20) of calculating an energy distribution function showing energy distribution of the locally stable structure with respect to the internal coordinates of the locally stable structure in each internal coordinate; a probability distribution function calculation step (step S22) of calculating a probability distribution function of increasing a probability of low-energy internal coordinates from the energy distribution function; and a second three-dimensional structure generation step (step S24) of generating one or more three-dimensional structures based on the probability distribution function.
(42) The energy distribution function may calculate a one-dimensional energy distribution function for each internal coordinate constituting a compound, and may calculate a two-dimensional energy distribution function using two internal coordinates or a multidimensional energy distribution function using a plurality of internal coordinates.
(43) In the probability distribution function calculation step, it is preferable to add a function of accelerating computation to the probability distribution function. The function of accelerating computation may include, but is not limited to, a white noise described below.
(44) After a three-dimensional structure is generated in step S24, the process returns to step S14, a locally stable structure is acquired from this three-dimensional structure, and the internal coordinates and energy value of the locally stable structure are acquired. The internal coordinates and the energy value are then reflected in the energy distribution function and the probability distribution function accumulated so far. By repeating step S14 to step S24, the probability distribution function obtained in step S22 comes to assign a high probability to the internal coordinates that yield low energy. Using this probability distribution function increases the probability of obtaining a locally stable structure with lower energy.
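The loop from step S14 to step S24 can be illustrated with a small self-contained sketch. Everything in it is an assumption made for illustration: the toy torsional energy, the 10-degree angle bins, the Boltzmann-type weighting with a constant noise term, and the random jitter inside the chosen bin are hypothetical stand-ins, not the expressions or the software of the embodiment.

```python
import math
import random

N_BINS, K_BT = 36, 0.596  # assumed 10-degree bins; k_B*T in kcal/mol near 300 K

def energy(angles):
    # Hypothetical toy torsion energy with wells near 60, 180 and 300 degrees
    # (180 is the lowest). This is NOT a real force field.
    return sum(1 + math.cos(math.radians(3 * a)) + 0.3 * (1 + math.cos(math.radians(a)))
               for a in angles)

def minimize(angles, step=100.0, n_iter=500):
    # Plain steepest descent on the toy energy, standing in for step S14.
    k = math.pi / 180.0
    for _ in range(n_iter):
        angles = [a + step * k * (3 * math.sin(math.radians(3 * a))
                                  + 0.3 * math.sin(math.radians(a))) for a in angles]
    return [a % 360.0 for a in angles]

def search(n_dihedrals=4, n_loops=60, r0=0.4, noise=0.02, seed=1):
    rng = random.Random(seed)
    # E[i][b]: lowest energy seen so far with dihedral i in bin b (assumed definition).
    E = [[math.inf] * N_BINS for _ in range(n_dihedrals)]
    found = []
    angles = [rng.uniform(0.0, 360.0) for _ in range(n_dihedrals)]   # step S12
    for _ in range(n_loops):
        local = minimize(angles)                                     # step S14
        e = energy(local)                                            # step S16
        found.append((e, local))
        e_min = min(x[0] for x in found)
        new_angles = []
        for i, a in enumerate(local):
            b = round(a / 10.0) % N_BINS
            E[i][b] = min(E[i][b], e)                                # step S20
            # Step S22: assumed Boltzmann-type weight plus constant noise;
            # unvisited bins (energy = inf) get weight exp(-inf) = 0 before noise.
            w = [math.exp(-(x - e_min) / K_BT) + noise for x in E[i]]
            total = sum(w)
            p = [x / total for x in w]
            if rng.random() > r0:                                    # step S24, case (1)
                chosen = rng.choices(range(N_BINS), weights=p)[0]
            else:                                                    # step S24, case (2)
                chosen = max(range(N_BINS), key=lambda j: p[j])
            # Jitter inside the chosen bin is an added assumption of this sketch.
            new_angles.append(chosen * 10.0 + rng.uniform(-5.0, 5.0))
        angles = new_angles
    return min(found), found

best, found = search()
```

Here `found` collects every locally stable structure of the toy model and `best` is the one with lowest energy, mirroring the two kinds of output described for step S26.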
(45) In a case where determination is made in step S18 that a desired structure or a desired number of the locally stable structures or the most stable structures are obtained, the method includes an output step (step S26) of outputting one of a plurality of the obtained locally stable structures or the most stable structure with lowest energy among the locally stable structures. By repeating processing from step S14 to step S24, the plurality of locally stable structures can be obtained. In addition, by selecting a structure with lowest energy among the locally stable structures, the most stable structure among the obtained structures can be obtained. Although it is not possible to objectively determine whether or not the obtained most stable structure is truly the most stable except for a specific compound, the greater the number of times the processing from step S14 to step S24 is repeated, the higher the probability that the obtained most stable structure is truly the most stable. In addition, it is possible to estimate, to some extent, whether or not the obtained most stable structure is truly the most stable from a state of convergence of the probability distribution function (shown in
(46) Hereinafter, each step will be described.
(47) <Structural Formula Acquisition Step (Step S10)>
(48) The structural formula acquisition step S10 is a step of acquiring a structural formula of a compound by inputting the structural formula of the compound according to a user's operation. Examples of the structure of the compound include a peptide having a plurality of amino acids bonded to one another and a molecular weight of 500 or more. In the present embodiment, in order to simplify description, dodecane (C₁₂H₂₆) shown in
(49) <First Three-Dimensional Structure Generation Step (Step S12)>
(50) The first three-dimensional structure generation step S12 is a step of generating one or more three-dimensional structures from the structural formula acquired in the structural formula acquisition step S10. The three-dimensional structure can be generated by randomly setting internal coordinates of the structural formula.
(51) The number of three-dimensional structures to be generated may be one or plural. By generating a plurality of the three-dimensional structures, it is possible to increase the number of locally stable structures obtained in the locally stable structure acquisition step S14 of the next step. Hereinafter, in order to simplify description, a case where one three-dimensional structure is generated will be described.
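As a minimal sketch of randomly setting internal coordinates, the following assumes that only the backbone dihedral angles are randomized and that each angle is drawn uniformly from [0, 360) degrees. The function names and the angle range are hypothetical; bond lengths and bond angles are left out of the sketch.

```python
import random

def random_dihedrals(n_dihedrals, rng):
    """Return one uniformly random angle in [0, 360) degrees per dihedral."""
    return [rng.random() * 360.0 for _ in range(n_dihedrals)]

def initial_structures(n_dihedrals, n_structures=1, seed=0):
    """Generate n_structures random dihedral-angle sets (one per 3D structure)."""
    rng = random.Random(seed)
    return [random_dihedrals(n_dihedrals, rng) for _ in range(n_structures)]

# A 12-carbon chain such as dodecane has 12 - 3 = 9 C-C-C-C backbone dihedrals.
structures = initial_structures(n_dihedrals=9, n_structures=3)
```

Generating several structures at once, as in the last line, corresponds to the case where a plurality of three-dimensional structures is passed to step S14.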
(52) Elements that determine the three-dimensional structure include, in a case where four atoms 600 exist, a bond length determined by arrangement (coordinates) of two atoms 600, a bond angle determined by arrangement of three atoms 600, and a dihedral angle determined by arrangement of four atoms 600, as shown in
(53) As another example of the internal coordinates, an interatomic distance can be used. In a case where an index of each atom is represented by a number and a distance between an atom x and an atom y is represented by d(x,y), a three-dimensional structure of a molecule can be uniquely determined by the following internal coordinates.
(54) d(1,2),
(55) d(1,3), d(2,3),
(56) d(1,4), d(2,4), d(3,4),
(57) d(2,5), d(3,5), d(4,5), . . . .
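The listing above can be generated programmatically. The sketch below assumes that the pattern continues as it does for atom 5, that is, each further atom is located by its distances to the three atoms immediately preceding it; the function name is hypothetical.

```python
def distance_coordinate_pairs(n_atoms):
    """List the (x, y) atom-index pairs whose distances d(x, y) serve as
    internal coordinates, following the pattern shown in the text."""
    pairs = []
    for y in range(2, n_atoms + 1):
        first = max(1, y - 3)  # use at most the three preceding atoms
        pairs.extend((x, y) for x in range(first, y))
    return pairs

print(distance_coordinate_pairs(5))
# [(1, 2), (1, 3), (2, 3), (1, 4), (2, 4), (3, 4), (2, 5), (3, 5), (4, 5)]
```

Under this assumption, a molecule of N atoms (N of 4 or more) is described by 3N - 6 distances, which matches the number of internal degrees of freedom of a nonlinear molecule.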
(58) Various methods are known for generating a three-dimensional structure from a structural formula, and the first three-dimensional structure generation step S12 can be performed by any of them without particular limitation.
(59) <Locally Stable Structure Acquisition Step (Step S14)>
(60) The locally stable structure acquisition step S14 is a step of acquiring a locally stable structure from the three-dimensional structure (steric structure) obtained in the first three-dimensional structure generation step S12.
(61) The locally stable structure is acquired by searching for a three-dimensional structure with low energy, starting from the three-dimensional structure randomly generated in the first three-dimensional structure generation step S12. Acquisition of the locally stable structure can be performed as follows. In
(62) Various methods of moving the structure can be adopted; for example, a steepest descent method, a conjugate gradient method, or a Berny method can be used, but the method is not particularly limited. Gaussian 09 and Amber can be used as software that performs such a computation.
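To illustrate the idea of moving a structure downhill until the energy no longer decreases, the following is a minimal steepest descent sketch on a hypothetical threefold torsional energy. The energy function, step size, and tolerance are assumptions chosen for illustration; the sketch does not reproduce the optimizers of Gaussian 09 or Amber.

```python
import math

def toy_energy(angles_deg):
    """Hypothetical torsional energy (arbitrary units) with minima at 60, 180, 300 degrees."""
    return sum(1.0 + math.cos(math.radians(3.0 * a)) for a in angles_deg)

def toy_gradient(angles_deg):
    """Derivative of toy_energy with respect to each angle, per degree."""
    k = math.radians(3.0)
    return [-k * math.sin(math.radians(3.0 * a)) for a in angles_deg]

def steepest_descent(angles_deg, step=50.0, tol=1e-8, max_iter=10000):
    """Follow the negative gradient until every gradient component is below tol."""
    angles = list(angles_deg)
    for _ in range(max_iter):
        grad = toy_gradient(angles)
        if max(abs(g) for g in grad) < tol:
            break
        angles = [a - step * g for a, g in zip(angles, grad)]
    return angles, toy_energy(angles)

# Each starting angle relaxes into the nearest energy well.
minimum, minimum_energy = steepest_descent([10.0, 170.0, 250.0])
```

The result depends on the starting point: each angle ends in the well it started nearest to, which is why repeated starts from different three-dimensional structures yield different locally stable structures.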
(63) As a locally stable structure, for example, in a case of a straight-chain alkane such as dodecane (C₁₂H₂₆), possible locally stable structures are as follows.
(64) As shown in
(65) <Energy Acquisition Step (Step S16)>
(66) The energy acquisition step S16 is a step of acquiring the energy value of the locally stable structure obtained in the locally stable structure acquisition step S14 and each C-C-C-C dihedral angle. In the energy acquisition step S16, the energy value and each dihedral angle may be obtained from the acquired locally stable structure, or they may be acquired as part of performing the above-described locally stable structure acquisition step S14.
(67) <Determination Step (Step S18)>
(68) The determination step S18 is a step of determining whether or not a desired structure or a desired number of the locally stable structures or the most stable structures are obtained. In the present embodiment, the purpose is to acquire the most stable structure or a plurality of locally stable structures having different three-dimensional structures. Therefore, in a case where determination is made that acquisition of the locally stable structure is not necessary, such as a case where a desired structure (most stable structure) is obtained, a desired number of the most stable structures are obtained, or a desired number of the locally stable structures are obtained, the most stable structure and the locally stable structure are output (output step).
(69) Furthermore, in a case where determination is made that acquisition of the locally stable structure is necessary, the process proceeds to the energy distribution function calculation step S20. In the following steps, a state in which one locally stable structure is acquired as a first loop will be described.
(70) <Energy Distribution Function Calculation Step (Step S20)>
(71) The energy distribution function calculation step S20 is a step of calculating an energy distribution function E(φ) showing distribution of energy with respect to a dihedral angle φ. The energy distribution function E(φ) is calculated for each dihedral angle.
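One way to represent E(φ) numerically is a table per dihedral angle. The sketch below assumes 10-degree bins and assumes that each bin stores the lowest structural energy observed among the locally stable structures whose dihedral falls in that bin; both choices are hypothetical. The two energies are the values quoted in this description, while the angles paired with them are made up for illustration.

```python
import math

N_BINS = 36  # assumed 10-degree bins

def bin_index(angle_deg, n_bins=N_BINS):
    """Map an angle in degrees to its bin index in [0, n_bins)."""
    return int((angle_deg % 360.0) / (360.0 / n_bins)) % n_bins

def empty_energy_distribution(n_dihedrals, n_bins=N_BINS):
    """One row per dihedral; math.inf marks bins with no structure yet."""
    return [[math.inf] * n_bins for _ in range(n_dihedrals)]

def update_energy_distribution(E, angles_deg, energy):
    """Record the structure's energy in the bin of each of its dihedral angles."""
    for i, angle in enumerate(angles_deg):
        b = bin_index(angle)
        E[i][b] = min(E[i][b], energy)
    return E

E = empty_energy_distribution(n_dihedrals=2)
update_energy_distribution(E, [180.0, 65.0], -61.8925)   # hypothetical angles
update_energy_distribution(E, [180.0, 295.0], -63.2096)  # hypothetical angles
```

Because the whole structural energy is written into every dihedral's table, each table reflects the energy of complete structures rather than the contribution of a single internal coordinate.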
(72) <Probability Distribution Function Calculation Step (Step S22)>
(73) The probability distribution function calculation step S22 is a step of calculating a probability distribution function p(φ) for increasing a probability of low-energy dihedral angle from the energy distribution function E(φ).
(74) The probability distribution function p(φ) can be obtained by the following mathematical expression.
(75)
(76) Here, E_min is the lowest energy among the structures generated so far (this is the first loop, so there is one locally stable structure and E_min = −61.8925 kcal/mol), k_B is the Boltzmann constant, T is an arbitrary temperature (300 K in this example), and C is a normalization constant.
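A Boltzmann-type weighting is one form consistent with the symbols E_min, k_B, T, and C defined here. The following is offered only as an assumption for illustration, not as the expression of the embodiment; in particular, normalizing so that the probabilities sum to one over the sampled angles φ_j is a hypothetical choice.

```latex
p(\varphi) = C \exp\!\left( -\frac{E(\varphi) - E_{\min}}{k_B T} \right),
\qquad
C = \left[ \sum_{j} \exp\!\left( -\frac{E(\varphi_j) - E_{\min}}{k_B T} \right) \right]^{-1}
```

With this form, the angle of the lowest-energy structure receives the largest weight, and an angle whose structure lies k_B T above E_min receives a weight smaller by a factor of e.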
(77)
(78) Next, using this probability distribution function p(φ), one or more three-dimensional structures are generated. However, since the probability distribution function is calculated from only one locally stable structure, it has a single peak with a probability distribution intensity of 1.0, and using it as-is would always select the same internal coordinate. Therefore, it is preferable that a probability distribution function p′(φ), to which a white noise e is added, is calculated, and that a three-dimensional structure is generated using this probability distribution function p′(φ) (second three-dimensional structure generation step S24). The probability distribution function p′(φ) with the white noise e added can be obtained by the following mathematical expression.
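The effect of the white noise can be illustrated with a short sketch. It assumes a Boltzmann-type weight for p(φ) over 10-degree bins and assumes that p′(φ) is formed by adding a constant e to every bin and renormalizing; both forms, the noise value, and the bin index used are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def probability_distribution(E_row, e_min, temperature=300.0):
    """Assumed Boltzmann-type weights over bins, normalized to sum to 1.
    Bins never visited hold math.inf and therefore get weight exp(-inf) = 0."""
    weights = [math.exp(-(e - e_min) / (K_B * temperature)) for e in E_row]
    total = sum(weights)
    return [w / total for w in weights]

def add_white_noise(p, noise=0.01):
    """Assumed form of p': add the constant noise to every bin, then renormalize."""
    noisy = [x + noise for x in p]
    total = sum(noisy)
    return [x / total for x in noisy]

# First loop: one locally stable structure, placed in a hypothetical bin 18.
E_row = [math.inf] * 36
E_row[18] = -61.8925
p = probability_distribution(E_row, e_min=-61.8925)
p_noisy = add_white_noise(p)
```

In this example `p` is 1.0 at the single visited bin and 0 elsewhere, so sampling from it could never leave that bin, whereas every bin of `p_noisy` has a small nonzero probability.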
(79)
(80)
(81)
(82) <Second Three-Dimensional Structure Generation Step (Step S24)>
(83) The second three-dimensional structure generation step S24 is a step of determining internal coordinates of a new three-dimensional structure based on the probability distribution function calculated in the probability distribution function calculation step S22 and generating one or more three-dimensional structures.
(84) A three-dimensional structure is generated based on the probability distribution function by determining each dihedral angle in turn. First, a random number R (a real number of 0 or more and 1 or less) is generated for one dihedral angle, and R_0 is set as an appropriate threshold value, for example, R_0 = 0.4. The generated random number R is compared with the threshold value R_0. In a case of R > R_0, the process proceeds to (1); in a case of R ≤ R_0, the process proceeds to (2).
(85) (1) In a case of R > R_0, a random number R (a real number of 0 or more and 1 or less) is generated again, and a dihedral angle φ is computed from the following mathematical expression. In this case, a plurality of dihedral angles can be selected by generating a plurality of random numbers. By selecting the plurality of angles, a plurality of three-dimensional structures can be generated.
$$\int_{0}^{\phi} p^{(\prime)}(\phi')\, d\phi' = R$$
(86) (2) In a case of R ≤ R_0, the dihedral angle at which the probability distribution function p(φ) or p′(φ) takes its maximum value is selected. Because the maximum is attained at a single angle, one angle is selected in this case.
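The two cases (1) and (2) can be sketched for a probability distribution stored as a list of bin probabilities. The discretization into bins and the function name are assumptions; the cumulative sum in case (1) is the discrete counterpart of solving the integral equation for φ.

```python
import random

def select_bin(p, rng, r0=0.4):
    """Pick one angle bin from the probabilities p using the two-case rule."""
    if rng.random() > r0:
        # Case (1): draw a fresh R and return the first bin at which the
        # cumulative probability exceeds R (inverse-CDF sampling).
        r = rng.random()
        cumulative = 0.0
        for b, prob in enumerate(p):
            cumulative += prob
            if cumulative > r:
                return b
        return len(p) - 1  # guard against floating-point round-off
    # Case (2): return the bin where the probability is largest.
    return max(range(len(p)), key=lambda b: p[b])

rng = random.Random(0)
p = [0.0] * 36
p[18] = 1.0                  # single-peak distribution, as in the first loop
chosen = select_bin(p, rng)  # both cases return bin 18 here
```

Raising R_0 makes the search exploit the current best angle more often, while lowering it makes the search sample more broadly from the distribution.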
(87)
(88) By performing selection of the dihedral angle for all dihedral angles, one or more three-dimensional structures are generated. In this way, a random number is generated, and some of the dihedral angles are set to a value at which the probability distribution function takes the maximum value, that is, to an angle with lowest energy, whereby it is possible to generate a three-dimensional structure having different peak valleys in the graph shown in
(89) <Repetition Step>
(90) After one or more three-dimensional structures are generated in the second three-dimensional structure generation step S24, the process returns to the locally stable structure acquisition step S14 and a locally stable structure is acquired by the same method. In the energy acquisition step S16, an energy value of the acquired locally stable structure and each dihedral angle are acquired. For example, it is assumed that a steric structure shown in
(91) Next, although determination is made in the determination step S18 whether or not a desired structure or a desired number of the most stable structures or the locally stable structures are obtained, this is the second loop and the structure is not obtained, so that the process proceeds to the energy distribution function calculation step S20.
(92)
(93) Next, in the probability distribution function calculation step S22, a probability distribution function is calculated using the energy distribution function calculated in the energy distribution function calculation step S20 and reflecting the newly obtained locally stable structure. As a method of calculating the probability distribution function, the same method as in the probability distribution function calculation step S22 of the first loop can be used. Since E_min in the expression is the energy of the structure with lowest energy among the locally stable structures obtained so far, E_min = −63.2096 kcal/mol, which is an energy value of the locally stable structure shown in
(94)
(95) After that, a three-dimensional structure is generated by using the probability distribution function to acquire a locally stable structure. Although a white noise is not added in
(96)
(97)
(98)
(99) With respect to this, according to the search method of the present embodiment, the most stable structure can be discovered in 45 searches by reflecting the dihedral angle and energy of the obtained locally stable structure on the probability distribution function. Even after the most stable structure is discovered, the same most stable structure can be rediscovered a plurality of times, and thus it is possible to obtain the most stable structure reliably, rather than by chance, with a small number of searches.
(100) In the above description, although the method of generating one three-dimensional structure in the first three-dimensional structure generation step S12 and the second three-dimensional structure generation step S24 is described, two or more three-dimensional structures may be generated. In that case, two or more locally stable structures can be obtained. By using the dihedral angle and the energy value of the obtained locally stable structure for calculating the energy distribution function, a highly accurate probability distribution function can be obtained with a small number of repetitions.
(101) Although the above embodiment is described using dodecane for ease of description, the locally stable structure and the most stable structure can be searched for by the same method for a cyclic peptide formed by a plurality of amino acids bonded to one another, which is expected as a drug candidate, as shown in
(102) <Output Step (Step S26)>
(103) In the determination step S18, in a case where determination is made that a desired number of the locally stable structures are obtained, in a case where determination is made that the most stable structure is obtained, or in a case where determination is made that a desired number of the most stable structures are obtained by repeating the above repetition step, the obtained locally stable structure or most stable structure is output in the output step S26. Although it is considered that the most stable structure with lowest energy is preferable from the viewpoint of the effect of the drug or the membrane permeability, it is preferable to output the locally stable structure in consideration of the other viewpoints, for example, a case where the most stable structure cannot be used as a drug due to the difficulty in formulation. In addition, the characteristics of the three-dimensional structure with low energy can be grasped, which can be useful for future search of the stable structure.
(104) <Effect of Method for Searching for Molecular Stable Structure and Program for Searching for Molecular Stable Structure>
(105) As described above, in the device 10 for searching for a molecular stable structure, a locally stable structure and the most stable structure of a structural formula of a compound can be searched for in a short time and with high accuracy by using the method for searching for a molecular stable structure and the program for searching for a molecular stable structure according to the present embodiment.
EXPLANATION OF REFERENCES
(106) 10: device for searching for molecular stable structure
100: processing unit
105: structural formula acquisition unit
110: three-dimensional structure generation unit
115: locally stable structure acquisition unit
116: most stable structure acquisition unit
120: energy acquisition unit
125: energy distribution function calculation unit
130: probability distribution function calculation unit
135: output unit
140: display control unit
145: CPU
150: ROM
155: RAM
200: storage unit
205: structure information
210: locally stable structure information
215: most stable structure information
220: energy distribution function information
225: probability distribution function information
300: display unit
310: monitor
400: operation unit
410: keyboard
420: mouse
500: external server
510: external database
600: atom
NW: network