METHOD FOR TWO-DIMENSIONAL NUCLEAR MAGNETIC RESONANCE DIFFUSION ORDERED SPECTROSCOPY BASED ON DEEP LEARNING
20240019515 · 2024-01-18
Inventors
- Yuqing Huang (Xiamen, CN)
- Zhong Chen (Xiamen, CN)
- Yu YANG (Xiamen, CN)
- Liubin WU (Xiamen, CN)
- Bo CHEN (Xiamen, CN)
Cpc classification
G01R33/4625
PHYSICS
International classification
Abstract
A method for processing two-dimensional (2D) nuclear magnetic resonance (NMR) Diffusion Ordered Spectroscopy (DOSY) based on deep learning comprises constructing a simulated dataset by generating simulated data using a mathematical model based on signal characteristics of the 2D NMR DOSY, generating labels for training a deep learning network model, wherein the labels comprise a first two-dimensional matrix, and two dimensions of the first two-dimensional matrix comprise chemical shift and diffusion coefficients, constructing the deep learning network model and setting training parameters of the deep learning network model, training the deep learning network model using the simulated dataset, and testing the deep learning network model.
Claims
1. A method for processing two-dimensional (2D) nuclear magnetic resonance (NMR) Diffusion Ordered Spectroscopy (DOSY) based on deep learning, comprising: step 1: constructing a simulated dataset by generating simulated data using a mathematical model based on signal characteristics of the 2D NMR DOSY, the mathematical model is as follows:
2. The method for processing 2D NMR DOSY based on deep learning according to claim 1, wherein: the body structure of the deep learning network model in the step 3 comprises a first linear layer followed by N body modules, each of the N body modules comprises a multi-head attention module, a feed-forward module, and two Add&Norm modules, wherein the multi-head attention module is a core architecture of the deep learning network model, the feed-forward module comprises two second linear layers, and the two second linear layers comprise a dropout layer and a nonlinear activation unit, and each of the two Add&Norm modules comprises a residual connection and a LayerNorm layer, wherein the multi-head attention module is constructed by: firstly obtaining three matrices Q, K, and V by passing through three different third linear layers using an input matrix, uniformly dividing into multiple blocks, feeding the multiple blocks into an attention module for calculation, wherein a mathematical model of the attention module is as follows:
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0019]
[0020]
[0021]
[0022] The present disclosure is further described below in combination with the accompanying drawings and embodiments.
DETAILED DESCRIPTION OF THE EMBODIMENTS
[0023] The present disclosure provides a method for processing two-dimensional (2D) Nuclear Magnetic Resonance (NMR) Diffusion Ordered Spectroscopy (DOSY) based on deep learning, which removes the need for prior knowledge and complicated parameter adjustment in traditional DOSY processing methods. Moreover, DOSY data of various dimensional sizes can be quickly processed using the method, yielding spectral peaks with excellent alignment, high resolution, and strong robustness.
[0024] To enable a person of skill in the art to better understand and implement the present disclosure, the present disclosure is further described below in combination with the accompanying drawings and embodiments. It should be understood that the embodiments described here are only used to explain and describe the present disclosure and are not intended to limit it.
[0025] In order to facilitate the description, related professional terms in the embodiments are described as follows:
[0026] Inverse Laplace Transform: ILT,
[0027] 2D NMR Diffusion Ordered Spectroscopy: 2D NMR DOSY, and
[0028] Adaptive Moment Estimation: Adam.
[0029] In this embodiment, a deep learning network model is trained on simulated data, the trained deep learning network model is used to process data, and a spectrum relating chemical shift to diffusion coefficients is obtained to analyze interactions between molecules and to identify the components of a mixture. When the deep learning network model is trained, the dimensional sizes of input S.sub.1 are 48×300×30, wherein 48 is the batch size, 300 is the dimensional size of the chemical shift, and 30 is the dimensional size of the input decay signals. The dimensional sizes of output data S.sub.2 are 48×300×140, wherein 48 is the batch size, 300 is the dimensional size of the chemical shift, and 140 is the dimensional size of the diffusion coefficients. The deep learning network model has good versatility. When DOSY data is processed, there are no restrictions on the dimensional size of the chemical shift or the dimensional size of the input decay signals. The output dimensional size of the chemical shift is the same as the input dimensional size of the chemical shift, and the output dimensional size of the diffusion coefficients is fixed at 140.
[0030] The specific steps are as follows:
[0031] Step 1: A simulated dataset is constructed by generating simulated data using a relevant mathematical model and adding simulated noise based on signal characteristics of the 2D NMR DOSY.
[0032] The relevant mathematical model is as follows.
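A form of the model consistent with the symbol definitions given for it can be sketched as follows. This is a reconstruction under assumptions, not the formula as filed: the signal name S, the symbols γ, δ, Δ, and ε for the magnetogyric ratio, gradient pulse width, corrected diffusion time, and Gaussian noise, the Stejskal-Tanner form of b, and the Lorentzian normalization (w.sub.i taken as the full width at half height) are all assumed.

```latex
S(f, b) = \sum_{l} C_l(f)\, e^{-D_l b} + \varepsilon,
\qquad
b = \gamma^{2} g^{2} \delta^{2} \Delta,
\qquad
C_l(f) = \sum_{i} \frac{A_i}{1 + \left( \dfrac{2\,(f - f_i)}{w_i} \right)^{2}}
```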
f is the nuclear resonance frequency, D.sub.l is the diffusion coefficient of the l-th molecular component, γ is the magnetogyric ratio, δ is the gradient pulse width, g is the pulsed field gradient amplitude, and Δ is the diffusion time corrected for the finite gradient pulse width. C.sub.l(f) is the spectrum of the l-th molecular component and is simulated as a linear combination of peaks with Lorentzian line shapes along the frequency dimension, wherein f.sub.i is the position of the i-th peak, w.sub.i is the full width at half height of the i-th peak, A.sub.i is the amplitude of the i-th peak, and ε is Gaussian noise. In this embodiment, b is a one-dimensional array of length 30 uniformly distributed over the interval from 0 to 1.2. D.sub.l is a random number between 0 and 14, l is a random integer between 1 and 3, f.sub.i is a random integer between 0 and 300, w.sub.i is 18, and A.sub.i is a random number between 0 and 1.
[0033] Step 2: Labels for training the deep learning network model are generated. The labels are a two-dimensional matrix whose two dimensions represent the chemical shift and the diffusion coefficients. In this embodiment, the dimensional size of the chemical shift is 300, the same as that of the simulated dataset, and the dimensional size of the diffusion coefficients is 140. The ILT method is used to generate the diffusion coefficient dimension, and a Gaussian distribution is used to represent the likelihood of each diffusion coefficient value. The central value of the Gaussian distribution (i.e., the position of a spectral peak) is the predicted value of the diffusion coefficient, and the full width at half height of the spectral peak corresponds to a confidence interval.
[0034] Step 3: The deep learning network model is constructed, and training parameters are set.
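Steps 1 and 2 can be illustrated with a minimal NumPy sketch. It is not the implementation of this embodiment. The function names, the noise level, the number of peaks per component, the uniform diffusion grid, the label peak width, and the weighting of each label row by the component spectrum are all assumptions made for illustration; only the ranges (b from 0 to 1.2 with 30 points, D.sub.l from 0 to 14, 1 to 3 components, w.sub.i of 18, sizes 300 and 140) come from the text.

```python
import numpy as np

def lorentzian(f, f0, w, a):
    """Lorentzian peak: centre f0, full width at half height w, amplitude a."""
    return a / (1.0 + (2.0 * (f - f0) / w) ** 2)

def simulate_sample(rng, n_f=300, n_b=30, n_d=140, d_max=14.0,
                    noise_sigma=0.01, label_fwhm=0.3, peaks_per_component=2):
    """Return (decays [n_f, n_b], label [n_f, n_d]) for one simulated mixture.

    noise_sigma, label_fwhm and peaks_per_component are assumed values.
    """
    f = np.arange(n_f, dtype=float)
    b = np.linspace(0.0, 1.2, n_b)            # 30 points, uniform on [0, 1.2]
    d_axis = np.linspace(0.0, d_max, n_d)     # assumed uniform diffusion grid
    sigma = label_fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))  # FWHM -> std dev

    signal = np.zeros((n_f, n_b))
    label = np.zeros((n_f, n_d))
    n_components = int(rng.integers(1, 4))    # 1 to 3 molecular components
    for _component in range(n_components):
        d_l = rng.uniform(0.0, d_max)         # diffusion coefficient D_l
        spectrum = np.zeros(n_f)
        for _peak in range(peaks_per_component):
            f_i = int(rng.integers(0, n_f))   # peak position
            a_i = rng.uniform(0.0, 1.0)       # peak amplitude
            spectrum += lorentzian(f, f_i, 18.0, a_i)
        # Decay of this component along the gradient dimension.
        signal += spectrum[:, None] * np.exp(-d_l * b)[None, :]
        # Label: Gaussian centred on D_l along the diffusion dimension.
        gauss = np.exp(-0.5 * ((d_axis - d_l) / sigma) ** 2)
        label += spectrum[:, None] * gauss[None, :]
    signal += rng.normal(0.0, noise_sigma, size=signal.shape)  # Gaussian noise
    return signal, label

rng = np.random.default_rng(0)
pairs = [simulate_sample(rng) for _ in range(48)]   # one batch of 48 samples
s1 = np.stack([p[0] for p in pairs])                # inputs, 48 x 300 x 30
s2 = np.stack([p[1] for p in pairs])                # labels, 48 x 300 x 140
```

Stacking 48 samples reproduces the batch shapes quoted for S.sub.1 and S.sub.2 in this embodiment.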
[0035] The deep learning network model is shown in
The outputs of the attention heads are concatenated (spliced) and passed through a third linear layer to form the output of the multi-head attention module. The feed-forward module consists of two fourth linear layers, a dropout layer, and a ReLU nonlinear activation unit. Each of the two Add&Norm modules comprises a residual connection and a LayerNorm layer. A.sub.h is the attention matrix calculated by the h-th head of the attention module, Q is the query matrix, K is the key matrix, V is the value matrix, and d.sub.k is the size of the last dimension of the key matrix.
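The symbols defined for the attention module match standard scaled dot-product attention. Assuming that is the intended model, it can be written as below, where Q.sub.h, K.sub.h, and V.sub.h denote the h-th blocks of Q, K, and V, H is the number of heads, and W.sub.O is the weight of the output linear layer. This notation is introduced here for illustration and is not taken from the original formula.

```latex
A_h = \operatorname{softmax}\!\left( \frac{Q_h K_h^{\top}}{\sqrt{d_k}} \right) V_h,
\qquad
\operatorname{MultiHead}(Q, K, V) = \left[ A_1, A_2, \ldots, A_H \right] W_O
```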
[0036] In this example, the linear layer preceding the body modules expands the dimensional size of the input decay signals from 30 to 140, with its parameters set to 30-140. Six body modules are provided (N = 6), and the multi-head attention module has 7 heads. In the feed-forward module, the parameters of the two fourth linear layers are set to 140-4096 and 4096-140, respectively, and the dropout rate is set to 0.001.
[0037] Step 4: The deep learning network model is trained using the generated simulated dataset.
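The dimensions given in paragraph [0036] can be checked with a small NumPy forward pass through the first linear layer and six body modules. This is a sketch under assumptions, not the network of this embodiment: weights are random, biases and the LayerNorm gain and offset are omitted, dropout is left out because it is inactive at inference, the chemical shift axis is treated as the sequence axis, and every function and parameter name is hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def multi_head_attention(x, w_q, w_k, w_v, w_o, n_heads):
    n_tokens, d_model = x.shape
    d_k = d_model // n_heads                  # 140 / 7 = 20 per head

    def split(m):                             # [tokens, d_model] -> [heads, tokens, d_k]
        return m.reshape(n_tokens, n_heads, d_k).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    weights = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_k))
    heads = weights @ v                       # [heads, tokens, d_k]
    concat = heads.transpose(1, 0, 2).reshape(n_tokens, d_model)
    return concat @ w_o                       # output linear layer

def body_module(x, p, n_heads=7):
    # Multi-head attention followed by Add&Norm.
    x = layer_norm(x + multi_head_attention(
        x, p["w_q"], p["w_k"], p["w_v"], p["w_o"], n_heads))
    # Feed-forward (140-4096-140 with ReLU) followed by Add&Norm.
    hidden = np.maximum(x @ p["w_1"], 0.0)
    return layer_norm(x + hidden @ p["w_2"])

def init(rng, *shape):
    return rng.normal(0.0, 0.02, size=shape)  # assumed initialisation

rng = np.random.default_rng(0)
d_in, d_model, d_ff, n_modules = 30, 140, 4096, 6
w_in = init(rng, d_in, d_model)               # first linear layer, 30-140
modules = [
    {"w_q": init(rng, d_model, d_model), "w_k": init(rng, d_model, d_model),
     "w_v": init(rng, d_model, d_model), "w_o": init(rng, d_model, d_model),
     "w_1": init(rng, d_model, d_ff), "w_2": init(rng, d_ff, d_model)}
    for _ in range(n_modules)
]

x = rng.random((300, d_in))                   # 300 chemical shift points x 30 decay points
h = x @ w_in
for p in modules:
    h = body_module(h, p)
```

With 7 heads and a model width of 140, each head works on 20 features, which is why the width must be divisible by the head count. The text does not describe an output layer after the last body module, so none is included here.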
[0038] As shown in
[0040] The DOSY data are first interpolated and fitted along the gradient dimension and then normalized (i.e., each of the input decay signals is divided by its first value, ensuring that the input decay signals decay from 1). The normalized data are then fed into the trained deep learning network model to output a two-dimensional matrix, as shown in a contour map illustrated in
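The interpolation and normalization described in paragraph [0040] can be sketched as follows. Linear interpolation with `np.interp` stands in for the interpolation and fitting procedure, which the text does not specify, and the function name, the target of 30 points, and the zero-value guard are assumptions.

```python
import numpy as np

def preprocess(decays, b_measured, n_points=30):
    """Resample each decay onto n_points gradient values and scale it to start at 1.

    decays: array [n_chemical_shift, n_gradients]; b_measured must be increasing.
    """
    decays = np.asarray(decays, dtype=float)
    b_measured = np.asarray(b_measured, dtype=float)
    b_grid = np.linspace(b_measured[0], b_measured[-1], n_points)
    # Linear interpolation along the gradient dimension, one decay per row.
    resampled = np.stack([np.interp(b_grid, b_measured, row) for row in decays])
    first = resampled[:, :1]
    safe_first = np.where(first == 0.0, 1.0, first)   # avoid division by zero
    return resampled / safe_first

# Two synthetic decays measured at 16 gradient values.
b = np.linspace(0.0, 1.0, 16)
raw = np.stack([2.5 * np.exp(-3.0 * b), 0.7 * np.exp(-8.0 * b)])
normalized = preprocess(raw, b)
```

After this step every decay starts at 1 and has 30 points, matching the input size used during training.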
[0041] The specific embodiments described in the specification are merely used to explain, in an exemplary manner, the spirit of the present disclosure. A person skilled in the art can modify or supplement the specific embodiments described herein, or replace them using similar methods, and it is intended that the present disclosure cover such modifications and variations.
[0042] The aforementioned embodiments are merely some embodiments of the present disclosure, and the scope of the disclosure is not limited thereto. Thus, it is intended that the present disclosure cover any modifications and variations of the presently presented embodiments provided they are made without departing from the appended claims and the specification of the present disclosure.