CIRCUIT SYSTEM WHICH EXECUTES A METHOD FOR PREDICTING SLEEP APNEA FROM NEURAL NETWORKS

20220409126 · 2022-12-29

Abstract

A method for predicting sleep apnea from neural networks that mainly includes the following steps: a) retrieving an original signal; b) retrieving at least one snoring signal from the original signal by a snoring signal segmentation algorithm and converting the snoring signal into a one-dimensional vector; c) applying a feature extraction algorithm to process the one-dimensional snoring signal vector and transform it into a feature matrix of two-dimensional vectors; and d) classifying the feature matrix by a neural network algorithm to obtain the number of times of sleep apnea and sleep hypopnea from the snoring signal. The method is thereby able to decide whether the snoring signal reveals indications of sleep apnea or sleep hypopnea.

Claims

1. A circuit system which executes a method for predicting sleep apnea from neural networks, comprising: a microphone for retrieving an original signal; an artificial intelligence (AI) device; a development board, wherein the microphone and the AI device are electrically connected to the development board; and a display; wherein the AI device segments the original signal based on a first threshold value and a second threshold value, utilizes a sliding window to linearly inspect the original signal and calculate a maximum value of the original signal, upon said maximum value being greater than said second threshold value, recognizes a snoring signal and a position thereof, keeps inspecting the original signal toward a right direction and obtains a sum value of an absolute value of the snoring signal, upon the sum value being less than the first threshold value, sets a stop position, keeps inspecting the original signal toward a left direction and obtains a sum value of an absolute value of the snoring signal, upon the sum value being less than the first threshold value, sets a start position, segments the signal falling between said start position and said stop position and recognizes it as a snoring signal vector with one dimension, applies a feature extraction algorithm to said one-dimensional snoring signal vector to transform the snoring signal into a feature matrix of two-dimensional vectors, and applies a neural network algorithm to said feature matrix of two-dimensional vectors for classifying and then provides a result indicating a number of times of sleep apnea and sleep hypopnea within the snoring signal to the display.

2. The circuit system as claimed in claim 1, wherein a formula for calculation of the first threshold value is M=mean(f(Y.sub.i>0)), where M representing the first threshold value, mean representing an average value, f( ) representing a down sampling formula and Y.sub.i representing a frame vector of the original signal, and a formula for calculation of the second threshold value is X=mean(N)+std(N), where X representing the second threshold value, mean representing an average value, std representing a standard deviation and N representing a natural number calculated by a formula: N=sort(abs(y)), where sort representing a sorting by numerical order, abs representing an absolute value and y representing the number of vectors the frame vector was segmented into.

3. The circuit system as claimed in claim 1, wherein a length of the snoring signal vector is defined to be 25000 frames.

4. The circuit system as claimed in claim 1, wherein the sliding window has window size of 1000.

5. The circuit system as claimed in claim 1, wherein the feature extraction algorithm has the Mel-Frequency Cepstral Coefficients for the feature extraction process, including procedures of pre-emphasis, framing and windowing, fast Fourier transform, Mel filter bank, nonlinear conversion and discrete cosine transform.

6. The circuit system as claimed in claim 1, wherein the neural network algorithm is a convolutional neural network algorithm, having a dense convolutional network model as a decision model.

7. The circuit system as claimed in claim 6, wherein the dense convolutional network model includes a plurality of dense blocks, a plurality of transition layers and a classification layer.

8. The circuit system as claimed in claim 7, wherein the plurality of transition layers includes a convolution process and a pooling process, and the classification layer is a softmax layer.

9. The circuit system as claimed in claim 7, wherein the plurality of dense blocks includes a dense layer, a batch normalization-rectified linear units-convolution layer and a growth rate.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0018] The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing (s) will be provided by the Office upon request and payment of the necessary fee.

[0019] FIG. 1 is a flow diagram of the present invention;

[0020] FIG. 2 is a schematic diagram illustrating an operation process of the present invention;

[0021] FIG. 3A is a schematic diagram of an original signal according to the present invention;

[0022] FIG. 3B is a schematic diagram of the original signal after normalization according to the present invention;

[0023] FIG. 3C is a schematic diagram of partial of the original signal according to the present invention;

[0024] FIG. 3D is a schematic diagram of the partial original signal after normalization being inspected for recognizing snoring signals according to the present invention;

[0025] FIG. 3E is a schematic diagram of a snoring signal being segmented according to the present invention;

[0026] FIG. 3F is a schematic diagram of a first single snoring signal after segmentation according to the present invention;

[0027] FIG. 3G is a schematic diagram of a second single snoring signal after segmentation according to the present invention;

[0028] FIG. 3H is a schematic diagram of a third single snoring signal after segmentation according to the present invention;

[0029] FIG. 3I is a schematic diagram of a fourth single snoring signal after segmentation according to the present invention;

[0030] FIG. 3J is a schematic diagram of a fifth single snoring signal after segmentation according to the present invention;

[0031] FIG. 3K is a schematic diagram of a sixth single snoring signal after segmentation according to the present invention;

[0032] FIG. 3L is a schematic diagram of a seventh single snoring signal after segmentation according to the present invention;

[0033] FIG. 4 is a schematic diagram illustrating procedures of the Mel-Frequency Cepstral Coefficients for a feature extraction process according to the present invention;

[0034] FIG. 5 is a schematic diagram illustrating a dense convolutional network model according to the present invention;

[0035] FIG. 6 is a schematic diagram illustrating structure of a dense block according to the present invention;

[0036] FIG. 7A is a schematic diagram illustrating a decision model of sleep apnea according to the present invention;

[0037] FIG. 7B is a schematic diagram illustrating a prediction process of the decision model of sleep apnea according to the present invention; and

[0038] FIG. 8 is a diagram illustrating a circuit system for executing the method shown in FIG. 1.

DETAILED DESCRIPTION

[0039] Referring to FIGS. 1-7B, the method for predicting sleep apnea from neural networks includes the following steps.

[0040] Step a: retrieving an original signal Y. In this embodiment, polysomnography (PSG) has been performed on multiple subjects. The variables include the apnea-hypopnea index (AHI), snoring index and minimum oxygen saturation (MOS). The AHI is the number of times obstructive apnea and hypopnea happened per hour of sleep. Apnea is defined when inhalation and exhalation stop for at least 10 seconds, and hypopnea is defined when the baseline ventilatory value is decreased by 50% or more, the oxygen saturation is decreased by 4% or more, and such reduction lasts more than 10 seconds. When performing PSG, the sound of snoring is recorded by a mini-microphone placed at a position above the suprasternal notch. But the present invention is not limited to such application.
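
The event definitions in this step can be captured in a short sketch. The function names, the event tuples, and the encoding of a complete cessation of airflow as a 100% flow reduction are illustrative assumptions, not part of the disclosed method:

```python
def classify_event(flow_reduction_pct, desat_pct, duration_s):
    """Classify one respiratory event using the definitions above.

    Complete cessation of inhalation/exhalation is encoded here as a
    100% flow reduction (an illustrative assumption)."""
    if duration_s < 10:            # both definitions require at least 10 s
        return "none"
    if flow_reduction_pct >= 100:
        return "apnea"
    if flow_reduction_pct >= 50 and desat_pct >= 4:
        return "hypopnea"
    return "none"

def ahi(events, hours_of_sleep):
    """Apnea-hypopnea index: counted events per hour of sleep."""
    counted = sum(1 for e in events
                  if classify_event(*e) in ("apnea", "hypopnea"))
    return counted / hours_of_sleep
```

For example, four candidate events of which one apnea and one hypopnea qualify, over eight hours of sleep, give an AHI of 0.25.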

[0041] Step b: applying a snoring signal segmentation algorithm G.sub.1 to the original signal Y to further retrieve at least one snoring signal B for segmentation and output the segmented snoring signals as one-dimensional vectors S. Since the original signal Y is an audio file recorded over an entire night, the data has to be processed in advance. The snoring signal segmentation algorithm G.sub.1 is applied to automatically sort out and segment the snoring signals from the original signal Y. But the present invention is not limited to such application.

[0042] With reference to FIGS. 3A-3L, the longitudinal direction is the magnitude and the transverse direction is the time. In this embodiment, the snoring signal segmentation algorithm G.sub.1 is performed for segmentation based on a first threshold value M and a second threshold value X. The snoring signal segmentation algorithm G.sub.1 has a sliding window W for linearly inspecting the original signal Y and, as illustrated in FIG. 3D, the algorithm G.sub.1 calculates a maximum value Xi of the original signal Y during the inspection. When the maximum value Xi is greater than the second threshold value X, a snoring signal B and a position of the snoring signal B are recognized. Then the inspection continues toward a right direction along the sliding window W and, as illustrated in FIG. 3E, a sum value Mi of an absolute value of the original signal Y is further obtained. When the sum value Mi is less than the first threshold value M, a stop position R is set. Then the inspection continues toward a left direction and a sum value Mi of an absolute value of the original signal Y is obtained again. When the sum value Mi is less than the first threshold value M, a start position L is set. Then the signal falling between the start position L and the stop position R is segmented and recognized as a snoring signal vector S with one dimension as illustrated in FIGS. 3F-3L. After the segmentation, a first single snoring signal vector S.sub.1, a second single snoring signal vector S.sub.2, a third single snoring signal vector S.sub.3, a fourth single snoring signal vector S.sub.4, a fifth single snoring signal vector S.sub.5, a sixth single snoring signal vector S.sub.6 and a seventh single snoring signal vector S.sub.7 are recognized. Since the snoring signal vectors are required to have the same length for further processing, a length of 25000 frames is set in this embodiment.
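
The two-threshold procedure just described can be sketched as follows. The use of window energy sums for the left/right scans, the zero-padding to a fixed length, and all variable names are assumptions for illustration, not the patented implementation:

```python
import numpy as np

def segment_snores(y, M, X, win=1000, target_len=25000):
    """Two-threshold sliding-window segmentation as described above:
    a window maximum above X marks a snore, then window energy sums
    are scanned right and left until they fall below M."""
    snores, i, n = [], 0, len(y)
    while i + win <= n:
        if np.max(y[i:i + win]) > X:           # second threshold: snore found
            r = i                              # scan right for stop position R
            while r + win <= n and np.sum(np.abs(y[r:r + win])) >= M:
                r += win
            stop = min(r + win, n)
            l = i                              # scan left for start position L
            while l - win >= 0 and np.sum(np.abs(y[l - win:l])) >= M:
                l -= win
            start = max(l - win, 0)
            seg = y[start:stop]
            if len(seg) < target_len:          # equalize vector lengths
                seg = np.pad(seg, (0, target_len - len(seg)))
            snores.append(seg[:target_len])
            i = stop                           # resume after this event
        else:
            i += win
    return snores
```

On a synthetic signal consisting of a single loud burst, this returns one fixed-length segment covering the burst plus one quiet window on each side.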

[0043] In addition, the formula for calculation of the first threshold value M is


M=mean(f(Y.sub.i>0)),

[0044] where M represents the first threshold value M; mean represents an average value; f( ) represents a downsampling formula; Y.sub.i represents a frame vector of the original signal Y, taken every 2 minutes and downsampled to a dimension of 400. The downsampling process equally segments the frame vectors Yi to the same dimension and retrieves the maximum value of each segment. The frame vectors Yi are thus downsampled to a vector of 1*400, thereby producing a more reliable value of the first threshold value M.

[0045] The formula for calculation of the second threshold value X is


X=mean(N)+std(N),

[0046] where X represents the second threshold value X; mean represents an average value; std represents a standard deviation; N represents a natural number calculated by a formula of


N=sort(abs(y)),

[0047] where sort represents a sorting by numerical order; abs represents an absolute value; y represents the number of vectors the frame vector was segmented into. In other words, the number of vectors is the result of the length of the frame vector Yi divided by the size of the sliding window W. When the sliding window W has a window size of 1000, the natural number is obtained and the second threshold value X can be further obtained.
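
One plausible reading of the two threshold formulas is sketched below. The interpretation of f( ) as max-pooling of the positive samples and of N as the sorted absolute window sums is an assumption drawn from the surrounding text, since the formulas leave some details open:

```python
import numpy as np

def first_threshold(frame, dim=400):
    """M = mean(f(Y_i > 0)): keep the positive samples, max-pool them
    down to `dim` values, then average (one plausible reading)."""
    pos = frame[frame > 0]
    segments = np.array_split(pos, dim)            # equal-sized segments
    pooled = np.array([s.max() for s in segments if s.size > 0])
    return pooled.mean()

def second_threshold(frame, win=1000):
    """X = mean(N) + std(N), with N the sorted absolute window sums:
    the frame is cut into len(frame)//win windows of size `win`."""
    n_windows = len(frame) // win
    sums = np.abs(frame[:n_windows * win]).reshape(n_windows, win).sum(axis=1)
    N = np.sort(sums)
    return N.mean() + N.std()
```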

[0048] Step c: applying a feature extraction algorithm G.sub.2 to the snoring signal vector S with one dimension to transform the snoring signal vector S into a feature matrix A of two-dimensional vector. Thereby the original signal Y is segmented into the plurality of single snoring signals S.sub.1, S.sub.2, S.sub.3, S.sub.4, S.sub.5, S.sub.6, S.sub.7 for further processing of the feature extraction algorithm G.sub.2. The feature extraction algorithm G.sub.2 has the Mel-Frequency Cepstral Coefficients (MFCC) for the feature extraction process, including procedures of pre-emphasis G.sub.21, framing and windowing G.sub.22, fast Fourier transform G.sub.23, Mel filter bank G.sub.24, nonlinear conversion G.sub.25 and discrete cosine transform G.sub.26 as illustrated in FIG. 4.

[0049] The pre-emphasis G.sub.21 aims to compensate for the attenuated portion of the single snoring signals S.sub.1, S.sub.2, S.sub.3, S.sub.4, S.sub.5, S.sub.6, S.sub.7 by a process defined as:


H.sub.preem(z)=1−α.sub.preemz.sup.−1,

[0050] where H.sub.preem represents the transfer function of the pre-emphasis process G.sub.21 and α.sub.preem represents the pre-emphasis coefficient.

[0051] The framing and windowing G.sub.22 has the single snoring signals S.sub.1, S.sub.2, S.sub.3, S.sub.4, S.sub.5, S.sub.6, S.sub.7 divided into shorter frames, each of which has a length of 20-40 milliseconds. In order to avoid significant changes between two frames, there is an overlapping area of 10 milliseconds between adjacent frames, and each frame is multiplied by the Hamming window to enhance the continuity between the borders of the frames. The signals close to the borders of the frames are slowly faded out to avoid discontinuity, and the energy spectrum of noise is weakened, so that the peaks of the sine waves of the signals become relatively prominent as well. If there is obvious discontinuity between adjacent frames, misleading energy distributions will appear in the subsequent fast Fourier transform, causing misjudgment of the analysis in that process. Therefore, the signals have to be multiplied by the Hamming window during this step.
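
The framing and windowing step can be sketched as below. A 25 ms frame with a 15 ms hop realizes the 10 ms overlap described above; the 16 kHz sample rate is an assumed value, as the disclosure does not fix one:

```python
import numpy as np

def frame_signal(x, sr=16000, frame_ms=25, hop_ms=15):
    """Cut a signal into overlapping frames and apply a Hamming window
    so the frame borders fade out smoothly."""
    frame_len = int(sr * frame_ms / 1000)     # 400 samples at 16 kHz
    hop = int(sr * hop_ms / 1000)             # 240 samples -> 10 ms overlap
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hamming(frame_len)            # tapers the frame borders
    return np.stack([x[i * hop:i * hop + frame_len] * window
                     for i in range(n_frames)])
```

One second of audio at 16 kHz yields 66 windowed frames of 400 samples each.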

[0052] The fast Fourier transform G.sub.23 is applied to convert the signals from the time domain to the frequency domain, and fast Fourier transform is the fast algorithm of discrete Fourier transform.

[0053] The Mel filter bank G.sub.24 is a set of band-pass filters that overlap with each other. Based on the Mel scale, it is linear below a frequency of 1000 Hz and logarithmic above it. The Mel scaling process is defined as:

mel=2595 log.sub.10(1+f/700),

[0054] where mel represents the result of the Mel filter bank; f represents the input frequency of the filter bank; and the numbers 2595 and 700 are fixed numbers that have been widely used in the filter process in much prior research. The energy spectrum is multiplied by a set of 16 triangular band-pass filters, and the outputs of the 16 filters are taken as the Mel-frequency spectrum.
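
The Mel scaling formula above, together with its inverse (which is what is actually used to place the centers of the triangular filters), translates directly to code:

```python
import numpy as np

def hz_to_mel(f):
    """The Mel scaling above: mel = 2595 * log10(1 + f / 700)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    """Inverse mapping, used when placing triangular filter centers."""
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)
```

Note that 1000 Hz maps to approximately 1000 mel, which is how the constants 2595 and 700 were originally calibrated.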

[0055] The discrete cosine transform G.sub.26 is applied for calculation of the MFCCs in each frame, and the conversion is based on the following equation:


C.sub.m=Σ.sub.k=1.sup.N log(y(k))*cos[m(k−0.5)π/N]

[0056] Thereby the snoring signal B can be converted into the MFCCs feature matrix A.
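The discrete cosine transform step above can be sketched as follows; the function name and the choice of 13 coefficients are illustrative assumptions, and the input is taken to be the per-frame filter-bank energies y(k):

```python
import numpy as np

def mfcc_from_filterbank(energies, n_coeffs=13):
    """DCT of the log filter-bank energies, per frame, following the
    equation above: C_m = sum_{k=1..N} log(y(k)) * cos[m*(k-0.5)*pi/N].
    Stacking frames row-wise yields the two-dimensional feature matrix."""
    energies = np.asarray(energies, dtype=float)
    N = energies.shape[-1]                    # number of filters, e.g. 16
    k = np.arange(1, N + 1)
    m = np.arange(n_coeffs)
    basis = np.cos(np.outer(m, (k - 0.5) * np.pi / N))   # (n_coeffs, N)
    return np.log(energies) @ basis.T                    # (frames, n_coeffs)
```

A frame whose 16 filter energies are all 1 has zero log energy everywhere, so every cepstral coefficient vanishes.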

[0057] Step d: applying a neural network algorithm G.sub.3 to the feature matrix A of two-dimensional vector for classifying and then providing a result indicating a number of times of sleep apnea and sleep hypopnea within the snoring signal B. After the feature extraction process, the snoring signal B is converted into a two-dimensional vector, and, as most image classification processes are performed by neural network algorithms, we can also apply the neural network algorithm G.sub.3 for classifying the feature matrix A. But the present invention is not limited to such application.

[0058] In this embodiment, the neural network algorithm G.sub.3 is a convolutional neural network algorithm which has a dense convolutional network (DN) model as a decision model. As illustrated in FIG. 5, the dense convolutional network model includes a plurality of dense blocks D, a plurality of transition layers T and a classification layer E. The transition layers T include a convolution process T.sub.1 and a pooling process T.sub.2, and the classification layer E is a softmax layer.

[0059] Further referring to FIG. 6, the dense blocks D include a dense layer I, a batch normalization-rectified linear units-convolution layer BR and a growth rate k. The growth rate k is the number of feature maps output from each layer. The DN model consists of multiple connected dense blocks D and transition layers T and is finally connected to the classification layer E; the dense blocks D are densely connected convolutional neural networks. The snoring signal B is segmented and labeled for further training. For instance, if the feature matrix A does not contain signals of sleep apnea and sleep hypopnea, it is labeled normal A.sub.1; if the feature matrix A contains signals of sleep apnea or sleep hypopnea, it is labeled abnormal A.sub.2. After sending the labeled signals into the DN model, a sleep apnea model F is produced and ready for operation. However, the present invention is not limited to such application.

[0060] Within the dense blocks D, any two layers are directly connected; therefore, the input of each layer in the network is the output of its previous layers, and the feature map of each layer is also transmitted directly to all the descendant layers. Such an approach enables the DN model to make efficient use of all-level features. The transition layers T are designed to reduce the size of the feature matrix. Since the final layer receives the concatenated outputs of all preceding layers in the dense blocks D, the model can be very large. Therefore, the transition layers T are employed to greatly reduce the number of parameters. With such structures, the DN model solves the problem of gradient vanishing that occurs when the network architecture is too deep and is resistant to over-fitting.
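
The dense connectivity and channel growth just described can be illustrated at the shape level with plain arrays. The random matrix multiplications below are placeholders for the BN-ReLU-Conv operations of FIG. 6, so this is a sketch of how feature maps accumulate and shrink, not a trained model:

```python
import numpy as np

def dense_block(x, n_layers, growth_rate, rng):
    """Shape-level sketch of dense connectivity: each layer consumes the
    concatenation of all earlier feature maps and adds growth_rate new
    ones, so the channel count grows by growth_rate per layer."""
    features = [x]
    for _ in range(n_layers):
        inp = np.concatenate(features, axis=-1)    # all previous outputs
        w = rng.standard_normal((inp.shape[-1], growth_rate))
        features.append(np.maximum(inp @ w, 0.0))  # placeholder for BN-ReLU-Conv
    return np.concatenate(features, axis=-1)

def transition(x, out_channels, rng):
    """Transition-layer sketch: shrink the channel count, as the
    transition layers T do between dense blocks."""
    w = rng.standard_normal((x.shape[-1], out_channels))
    return x @ w
```

Starting from 16 channels, four layers with growth rate 12 yield 16 + 4*12 = 64 channels, which the transition then compresses.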

[0061] Further referring to FIG. 2, the sleep apnea model F is able to predict a normal signal F.sub.1 and an abnormal signal F.sub.2. As illustrated in FIG. 7A, the sleep apnea model F has a ground truth displayed in blue, the normal signal F.sub.1 as normal snoring displayed in green and the abnormal signal F.sub.2 as obstructive sleep apnea (OSA) displayed in pink for establishing ground-truth data. Then, referring to FIG. 7B, the snoring signal B, displayed in red, is inserted into the sleep apnea model F for deciding whether the snoring signal B is a normal signal F.sub.1 or an abnormal signal F.sub.2 and further predicting whether it indicates sleep apnea or not.

[0062] Finally, please refer to FIG. 8. FIG. 8 is a diagram illustrating a circuit system 800 for executing the method shown in FIG. 1, wherein the circuit system 800 includes a microphone 802, an artificial intelligence (AI) device 804, a development board 806 and a display 808. For example, the AI device 804 can be an AI chip (KL520) which combines a reconfigurable artificial neural network (RANN) and model compression technology, and can support various machine learning frameworks and convolutional neural network (CNN) models. However, the present invention is not limited to the AI device 804 being the AI chip (KL520); that is, the AI device 804 can be other AI chips.

[0063] In addition, for example, the development board 806 can be a Raspberry Pi 4 development board, wherein an algorithm corresponding to the method shown in FIG. 1 can be converted into an image file by a processor 809 included in the development board 806, the processor stores the image file in a memory card 805 (e.g. a secure digital (SD) card), and the memory card 805 is inserted into a corresponding slot (e.g. an SD card slot). Thus, the artificial intelligence (AI) device 804 can execute the image file to control corresponding hardware included in the development board 806 to analyze original signals (shown in FIG. 3A) recorded by the microphone 802, and make the display 808 display an execution result (e.g. FIG. 7B) for a client. That is, the circuit system 800 can execute the algorithm corresponding to the method shown in FIG. 1 to generate the execution result (e.g. FIG. 7B). Then, the client can make a determination according to the execution result shown on the display 808. Of course, the display 808 can also display the original signals (shown in FIG. 3A) recorded by the microphone 802, or display text information (corresponding to the original signals) converted by the development board 806, although displaying the original signals or the corresponding text information may be of limited use to the client. In addition, the present invention is not limited to the development board 806 being the Raspberry Pi 4 development board; that is, the development board 806 can be other kinds of development boards (e.g. a Jetson Nano motherboard).

[0064] In addition, as shown in FIG. 8, the microphone 802, the AI device 804, and the display 808 can be electrically connected to the development board 806 through universal serial bus (USB) ports (e.g. USB 2.0 ports or USB 3.0 ports) 810, 812, 814. In addition, other components included in the development board 806 are not key points in the present invention, so other components are neglected and further description thereof is omitted.

[0065] Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.