G10L25/66

Speech fluency evaluation and feedback

Speech fluency evaluation and feedback tools are described. A computing device such as a smartphone may be used to collect speech (and/or other data). The collected data may be analyzed to detect various speech events (e.g., stuttering) and feedback may be generated and provided based on the detected speech events. The collected data may be used to generate a fluency score or other performance metric associated with speech. Collected data may be provided to a practitioner such as a speech therapist or physician for improved analysis and/or treatment.
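The scoring step described in the abstract could be sketched as follows. The event labels, per-event weights, and the saturating mapping are illustrative assumptions, not the patent's actual method.

```python
# Hypothetical sketch of mapping detected speech events to a fluency score.
# Event types, weights, and the 0-100 mapping are made-up placeholders.

def fluency_score(events, speech_duration_s, weights=None):
    """Map detected speech events to a 0-100 fluency score.

    events: list of (event_type, start_s, end_s) tuples,
            e.g. ("repetition", 1.2, 1.6)
    """
    if weights is None:
        weights = {"repetition": 1.0, "prolongation": 1.5, "block": 2.0}
    minutes = speech_duration_s / 60.0
    if minutes <= 0:
        raise ValueError("duration must be positive")
    # Weighted stutter-like events per minute of speech.
    rate = sum(weights.get(kind, 1.0) for kind, _, _ in events) / minutes
    # Simple saturating mapping: 0 events/min -> 100, 10+ events/min -> 0.
    return max(0.0, 100.0 * (1.0 - min(rate, 10.0) / 10.0))
```

A score like this could then be shown to the user as feedback or forwarded to a practitioner alongside the raw event data.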

VOICE CHARACTERISTIC-BASED METHOD AND DEVICE FOR PREDICTING ALZHEIMER'S DISEASE

A method and device for predicting Alzheimer's disease based on voice characteristics are provided. The device for predicting Alzheimer's disease according to an embodiment includes: a voice input unit configured to generate a voice sample by recording a voice of a subject; a data input unit configured to receive demographic information of the subject; a voice characteristic extraction unit configured to extract voice characteristics from the generated voice sample; and a prediction model that is pre-trained to predict presence or absence of Alzheimer's disease in the subject, based on the voice characteristics and the demographic information.
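The pipeline in the abstract (voice sample, acoustic features, demographics, pre-trained model) could look roughly like the sketch below. The two toy features, the logistic form, and all weights are illustrative assumptions standing in for the unspecified pre-trained model.

```python
# Illustrative sketch of the prediction pipeline: voice characteristics
# plus demographic inputs feed a "pre-trained" classifier. The features,
# weights, bias, and normalizations below are made-up placeholders.

import math

def extract_voice_characteristics(samples, sample_rate):
    """Toy acoustic features: mean absolute amplitude and zero-crossing rate."""
    n = len(samples)
    mean_abs = sum(abs(s) for s in samples) / n
    zcr = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0) / (n - 1)
    return [mean_abs, zcr]

def predict_alzheimers(samples, sample_rate, age, years_education):
    """Return a probability-like score in (0, 1) from a logistic model."""
    feats = extract_voice_characteristics(samples, sample_rate)
    x = feats + [age / 100.0, years_education / 20.0]
    weights = [-1.2, 2.5, 1.8, -0.9]   # placeholder "pre-trained" weights
    bias = -0.5
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))  # logistic output
```

In practice the feature set would be far richer (prosodic, spectral, and pause statistics) and the model would be trained on labeled clinical data.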

DISEASE PREDICTION DEVICE, PREDICTION MODEL GENERATION DEVICE, AND DISEASE PREDICTION PROGRAM

Provided is a device performing machine learning by extracting an acoustic feature value from conversational voice data and predicting a disease level of a subject on the basis of a disease prediction model to be generated by the machine learning, the device including: a matrix calculation unit 23 calculating a spatial delay matrix using a relation value of a plurality of types of acoustic feature values; and a matrix decomposition unit 24 calculating a matrix decomposition value from the spatial delay matrix, in which a relation value reflecting a non-linear and non-stationary relationship of the feature values can be obtained by calculating at least one of a DCCA coefficient and a mutual information amount as the relation value of the plurality of types of acoustic feature values, and the disease level of the subject can be predicted on the basis of the relation value.
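One of the two relation values the abstract names, the mutual information between a pair of acoustic feature sequences, can be estimated with a simple histogram plug-in, sketched below. The binning scheme and bin count are illustrative choices; the DCCA coefficient and the spatial delay matrix construction are omitted.

```python
# Rough sketch of one "relation value" from the abstract: the mutual
# information between two acoustic feature sequences, estimated from
# joint and marginal histograms. Bin count is an arbitrary choice.

import math

def mutual_information(xs, ys, bins=4):
    lo_x, hi_x = min(xs), max(xs)
    lo_y, hi_y = min(ys), max(ys)

    def bin_of(v, lo, hi):
        if hi == lo:
            return 0
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)

    n = len(xs)
    joint, px, py = {}, {}, {}
    for x, y in zip(xs, ys):
        bx, by = bin_of(x, lo_x, hi_x), bin_of(y, lo_y, hi_y)
        joint[(bx, by)] = joint.get((bx, by), 0) + 1
        px[bx] = px.get(bx, 0) + 1
        py[by] = py.get(by, 0) + 1
    mi = 0.0
    for (bx, by), c in joint.items():
        p_xy = c / n
        mi += p_xy * math.log(p_xy / ((px[bx] / n) * (py[by] / n)))
    return mi  # in nats; non-negative up to float error
```

Relation values like this, computed across pairs of feature types and time lags, could populate the delay matrix that the matrix decomposition unit then factorizes.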

ORAL FUNCTION VISUALIZATION SYSTEM, ORAL FUNCTION VISUALIZATION METHOD, AND RECORDING MEDIUM

An oral function visualization system includes: an outputter that outputs information for prompting a user to utter a predetermined voice; an obtainer that obtains an uttered voice of the user uttered in accordance with the output; an analyzer that analyzes the uttered voice obtained by the obtainer; and an estimator that estimates a state of oral organs of the user from a result of analysis of the uttered voice by the analyzer. The outputter outputs, based on the state of the oral organs of the user estimated by the estimator, information for the user to achieve a state of the oral organs suitable for utterance of the predetermined voice.

Method and device for detecting speech patterns and errors when practicing fluency shaping techniques
11517254 · 2022-12-06

A method and system for detecting errors when practicing fluency shaping exercises. The method includes setting each threshold of a set of thresholds to a respective predetermined initial value; analyzing a voice production to compute a set of first energy levels composing the voice production, wherein the voice production is of a user practicing a fluency shaping exercise; detecting at least one speech-related error based on the computed set of first energy levels, a set of second energy levels, and the set of thresholds, wherein the detection of the at least one speech-related error is with respect to the fluency shaping exercise being practiced by the user, wherein the set of second energy levels is determined based on a calibration process; and generating feedback indicating the detected at least one speech-related error.
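The energy-and-threshold comparison described above can be sketched minimally as follows. The frame length, the "gentle onset" error label, and the averaged calibration baseline are illustrative assumptions; the patent's thresholds apply per exercise type.

```python
# Minimal sketch of threshold-based error detection against a calibrated
# baseline. Frame size, the error label, and the baseline statistic are
# illustrative assumptions.

def short_time_energy(samples, frame_len=160):
    """Energy of each non-overlapping frame of the voice production."""
    return [sum(s * s for s in samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def detect_gentle_onset_errors(first_energies, calib_energies, threshold):
    """Flag frames whose energy exceeds the calibrated baseline by more
    than `threshold`, e.g. a too-abrupt onset in a gentle-onset exercise."""
    baseline = sum(calib_energies) / len(calib_energies)
    return [i for i, e in enumerate(first_energies)
            if e > baseline + threshold]
```

Here `first_energies` plays the role of the computed first energy levels and `calib_energies` the second, calibration-derived set; the indices returned would drive the feedback step.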

Devices, systems, and methods for real time surveillance of audio streams and uses therefor

Various examples are provided for surveillance of an audio stream. In one example, a method includes identifying presence or absence of a sound type of interest at a location during a time period; selecting the sound type from a library of sound type information to provide a collection of sound type information; incorporating the collection on a device proximate to the location; acquiring an audio stream from the location by the device to provide a locational audio stream; analyzing the locational audio stream to determine whether a sound type in the collection is present in the audio stream; and generating a notification to a user or computer if a sound type in the collection is present. The device can acquire and process the audio stream. In another example, a bulk sound type information library can be generated by identifying sound types of interest and including them based upon a confidence level.
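The match-and-notify loop described above could be sketched as below. The feature representation (a fixed-length vector per sound type) and the use of cosine similarity as the confidence level are illustrative assumptions.

```python
# Hypothetical sketch of matching an analyzed audio stream against an
# on-device collection of sound types and notifying on a match. The
# vector features and cosine-similarity threshold are assumptions.

import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def surveil(stream_features, collection, confidence_level=0.9, notify=print):
    """Compare each analyzed frame against the collection and notify
    when a sound type of interest is present."""
    hits = []
    for t, frame in enumerate(stream_features):
        for name, template in collection.items():
            if cosine(frame, template) >= confidence_level:
                notify(f"sound type '{name}' detected at frame {t}")
                hits.append((t, name))
    return hits
```

A real implementation would analyze the locational audio stream into such frame features on the device itself, keeping raw audio local and sending only notifications.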