G10L25/24

METHOD OF PROCESSING AUDIO DATA, ELECTRONIC DEVICE AND STORAGE MEDIUM

A method of processing audio data, an electronic device, and a storage medium are provided, which relate to the field of artificial intelligence, in particular to speech processing technology. The method includes: processing spectral data of the audio data to obtain first feature information; obtaining fundamental frequency indication information according to the first feature information, wherein the fundamental frequency indication information indicates valid audio data of the first feature information and invalid audio data of the first feature information; obtaining fundamental frequency information and spectral energy information according to the first feature information and the fundamental frequency indication information; and obtaining harmonic structure information of the audio data according to the fundamental frequency information and the spectral energy information.

ACOUSTIC ANALYSIS OF A RESPIRATORY THERAPY SYSTEM

A method and apparatus obtain information about a patient and/or a respiratory therapy system that is configured to deliver respiratory therapy to the patient. The respiratory therapy system may include a flow generator configured to generate a supply of pressurized air along an air circuit to a patient interface. A sound signal representing a sound in the air circuit may be processed to obtain cepstrum data. A time series of delay estimates based on acoustic signatures of the cepstrum data may be generated. Each acoustic signature may represent a reflection of sound from a patient interface along the air circuit. Variation in the time series of delay estimates may be analysed. One or more output indicators based on the variation may be generated. The one or more output indicators may concern patient and/or system status.
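
The cepstrum-based delay estimation described above can be illustrated with a minimal sketch. It is not the claimed implementation: the synthetic frame (a decaying pulse plus one circular echo), the function names, the minimum search lag, and the max-minus-min variation measure are all assumptions made for illustration. The idea shown is that a reflection delayed by D samples produces a peak in the real cepstrum at quefrency D, so tracking that peak over successive frames yields a time series of delay estimates.

```python
import cmath
import math


def cepstrum_at(frame, lags):
    """Real cepstrum of `frame`, evaluated only at the requested lags (naive DFT)."""
    n = len(frame)
    log_mag = []
    for k in range(n):
        bin_value = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(frame))
        log_mag.append(math.log(abs(bin_value) + 1e-12))
    # The log-magnitude spectrum of a real frame is real and symmetric,
    # so its inverse DFT reduces to a cosine sum.
    return {q: sum(v * math.cos(2 * math.pi * k * q / n)
                   for k, v in enumerate(log_mag)) / n
            for q in lags}


def estimate_delay(frame, min_lag=8):
    """Return the lag (in samples) of the largest cepstral peak above min_lag."""
    lags = range(min_lag, len(frame) // 2)
    cep = cepstrum_at(frame, lags)
    return max(cep, key=cep.get)


def synthetic_frame(delay, n=256, gain=0.5, decay=0.5):
    """Hypothetical test signal: a decaying pulse plus one circular echo."""
    source = [decay ** i for i in range(n)]
    return [source[i] + gain * source[(i - delay) % n] for i in range(n)]


def delay_variation(delays):
    """One simple (assumed) variation measure: the spread of the estimates."""
    return max(delays) - min(delays)


if __name__ == "__main__":
    series = [estimate_delay(synthetic_frame(d)) for d in (40, 40, 42, 41)]
    print(series, delay_variation(series))
```

The minimum lag skips the low-quefrency region, where the spectral envelope of the source itself dominates the cepstrum. A real system would use an FFT and measured sound from the air circuit rather than this synthetic frame.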

AI BASED SYSTEM AND METHOD FOR CORNERS OF TRUST FOR A CALLER

A computer captures a voice of a user. The computer determines a frequency spectrum and a voice pattern of the voice. The computer identifies one or more topics of the voice by transcribing the voice using natural language processing. When a conversation is intercepted, the computer identifies the user by matching the frequency spectrum of the voice to the frequency spectrum of the conversation and the pattern of the voice to the pattern of the conversation, and determines a trust score by comparing the one or more topics to one or more topics extracted from the conversation.

SYSTEM AND METHOD FOR MULTICHANNEL SPEECH DETECTION

Embodiments of the disclosure provide systems and methods for speech detection. The method may include receiving a multichannel audio input that includes a set of audio signals from a set of audio channels in an audio detection array. The method may further include processing the multichannel audio input using a neural network classifier to generate a series of classification results in a series of time windows for the multichannel audio input. The neural network classifier includes a causal temporal convolutional network (TCN) configured to determine a classification result for each time window based on portions of the multichannel audio input in the corresponding time window and one or more time windows before the corresponding time window. The method may additionally include determining whether the multichannel audio input includes one or more speech segments in the series of time windows based on the series of classification results.
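
The causality property of the TCN can be sketched as follows. This is a toy illustration, not the disclosed classifier: the two-layer structure, the hand-picked weights, the threshold, and all function names are assumptions, and a real network would be trained. What the sketch shows is that each output for window t is computed only from window t and earlier windows, with zero padding on the left.

```python
def causal_dilated_conv(frames, taps, dilation):
    """Causal 1-D convolution over time windows.

    frames: list of feature vectors, one per time window.
    taps:   list of weight vectors; tap j is applied j * dilation windows back.
    """
    outputs = []
    for t in range(len(frames)):
        acc = 0.0
        for j, weights in enumerate(taps):
            src = t - j * dilation
            if src < 0:
                continue  # zero padding on the left: no future windows are read
            acc += sum(w * x for w, x in zip(weights, frames[src]))
        outputs.append(acc)
    return outputs


def classify_windows(frames, threshold=0.5):
    """Toy two-layer causal stack with hypothetical, untrained weights."""
    layer1 = causal_dilated_conv(frames, [[0.5, 0.5], [0.25, 0.25]], dilation=1)
    hidden = [[max(0.0, v)] for v in layer1]  # ReLU, re-wrapped as 1-D features
    layer2 = causal_dilated_conv(hidden, [[0.6], [0.4]], dilation=2)
    return [score > threshold for score in layer2]


if __name__ == "__main__":
    # Two-channel energies per window: silence, then speech, then silence.
    windows = [[0.0, 0.0]] * 3 + [[1.0, 1.0]] * 4 + [[0.0, 0.0]] * 3
    print(classify_windows(windows))
```

Because the receptive field extends only backwards in time, a result for a window can be emitted as soon as that window arrives, which is what makes such a classifier usable for streaming detection.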

METHOD AND ELECTRONIC APPARATUS FOR DETECTING TAMPERING AUDIO, AND STORAGE MEDIUM

Disclosed are a method and an electronic apparatus for detecting tampered audio, and a storage medium. The method includes: acquiring a signal to be detected, and performing a wavelet transform of a first preset order on the signal to be detected so as to obtain a first low-frequency coefficient and first high-frequency coefficients corresponding to the signal to be detected, the number of which is equal to the first preset order; performing an inverse wavelet transform on the first high-frequency coefficients having an order greater than or equal to a second preset order so as to obtain a first high-frequency component signal corresponding to the signal to be detected; calculating a first Mel cepstrum feature of the first high-frequency component signal in units of frames, and concatenating the first Mel cepstrum features of a current frame signal and a preset number of frame signals.
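
The wavelet steps above can be sketched with a Haar wavelet, chosen here only because it is short to write; the abstract does not name a wavelet family. The sketch also assumes that "order" means decomposition level, with order 1 the finest detail level, and that coefficients outside the selected orders are set to zero before the inverse transform. All function names are hypothetical, and the Mel cepstrum and frame concatenation steps are not covered.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_step(x):
    """One Haar analysis step: pairwise averages (low) and differences (high)."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one Haar analysis step."""
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / SQRT2, (a - d) / SQRT2))
    return out


def haar_decompose(signal, levels):
    """Return (low-frequency coefficients, [detail order 1, ..., detail order levels]).

    The signal length must be divisible by 2 ** levels.
    """
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def haar_reconstruct(approx, details):
    """Inverse transform, starting from the coarsest level."""
    signal = list(approx)
    for detail in reversed(details):
        signal = haar_inverse_step(signal, detail)
    return signal


def high_frequency_component(signal, levels, min_order):
    """Rebuild a signal from the detail coefficients of order >= min_order only."""
    approx, details = haar_decompose(signal, levels)
    kept = [d if order >= min_order else [0.0] * len(d)
            for order, d in enumerate(details, start=1)]
    return haar_reconstruct([0.0] * len(approx), kept)
```

With all coefficients kept, the reconstruction returns the original signal, so zeroing selected coefficients isolates the remaining bands. A library such as PyWavelets would normally replace the hand-written transform.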

APPARATUS FOR DIAGNOSING DISEASE CAUSING VOICE AND SWALLOWING DISORDERS AND METHOD FOR DIAGNOSING SAME
20230130676 · 2023-04-27 ·

An apparatus for diagnosing a disease and a method for diagnosing a disease, in which: a plurality of voice signals are received to generate a first image signal and a second image signal which are image signals for each voice signal; a plurality of disease probability information for a target disease causing a voice change are extracted by using an artificial intelligence model determined according to the type of each voice signal and a generation method used to generate each image signal for the first image signal and the second image signal for each voice signal; and it is determined whether the target disease is negative or positive on the basis of the plurality of disease probability information.