Patent classifications
G10L21/00
SYSTEMS, METHODS, AND DEVICES FOR AUDIO CORRECTION
Systems, methods, and devices relating to audio correction are described. A first portion of content including first spoken audio content indicating first word(s) may be determined. Background audio content of the first portion of the content may be determined. A voice profile may be determined based on the first spoken audio content. Based on the voice profile, second spoken audio content indicating second word(s) to replace the first word(s) may be generated. Based on mixing the background audio content and the second spoken audio content, a second portion of content may be determined. In the content, the first portion of the content may be replaced with the generated second portion of content.
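The replacement step at the end of the abstract can be sketched as a mixing operation. This is a minimal illustration, not the patented implementation: it assumes the background audio has already been separated and the replacement speech already generated, and all names are hypothetical.

```python
import numpy as np

def replace_spoken_segment(content, start, end, new_speech, background):
    """Replace content[start:end] with the generated speech mixed over
    the (already separated) background audio for that span."""
    out = content.copy()
    out[start:end] = background[start:end] + new_speech[: end - start]
    return out

# Toy example: 8-sample "content", replacing samples 2..5.
content = np.ones(8)
background = np.full(8, 0.1)             # stand-in for separated background
new_speech = np.array([0.5, 0.5, 0.5])   # stand-in for generated speech
fixed = replace_spoken_segment(content, 2, 5, new_speech, background)
```

Samples outside the replaced span are left untouched; inside the span, the new speech and the preserved background are summed.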
Method and apparatus for audio data processing
Embodiments of the disclosure provide methods and apparatuses for processing audio data. The method can include: acquiring audio data with an audio capturing device, determining feature information of the enclosure in which the audio capturing device is located, and incorporating the feature information into the audio data as reverberation.
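One common way to impart an enclosure's acoustic characteristics to audio, as the abstract describes, is convolution with a room impulse response. The sketch below assumes the "feature information" takes that form; the function name and impulse response are illustrative, not from the patent.

```python
import numpy as np

def apply_room_reverb(dry, impulse_response):
    """Convolve dry audio with a room impulse response that encodes
    the enclosure's acoustic features, trimmed to the input length."""
    return np.convolve(dry, impulse_response)[: len(dry)]

dry = np.array([1.0, 0.0, 0.0, 0.0])
ir = np.array([1.0, 0.5, 0.25])   # hypothetical impulse response
wet = apply_room_reverb(dry, ir)
```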
Automated sig code translation using machine learning
A pharmacy management system for automated sig code translation using machine learning includes one or more processors and a memory storing instructions that, when executed by the one or more processors, cause the pharmacy management system to train a machine learning model to analyze sig codes, receive a sig code utterance, analyze the sig code utterance, and generate an output corresponding to the sig code utterance. A computer-implemented method includes training a machine learning model to analyze sig codes, receiving a sig code utterance, analyzing the sig code utterance, and generating an output corresponding to the sig code utterance. A non-transitory computer-readable medium contains program instructions that, when executed, cause a computer to: train a machine learning model to analyze sig codes, receive a sig code utterance, analyze the sig code utterance, and generate an output corresponding to the sig code utterance.
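Sig codes are pharmacy shorthand for dosing instructions (e.g. "1 TAB PO BID"). The patent trains a machine learning model; the sketch below substitutes a simple token lookup purely to illustrate what "translation" means here. The term table and function are hypothetical.

```python
# Illustrative subset of common sig abbreviations (not from the patent).
SIG_TERMS = {
    "TAB": "tablet", "CAP": "capsule", "PO": "by mouth",
    "QD": "once daily", "BID": "twice daily", "TID": "three times daily",
}

def translate_sig(sig_code):
    """Expand each sig token; unknown tokens pass through lowercased."""
    return " ".join(
        SIG_TERMS.get(token, token.lower())
        for token in sig_code.upper().split()
    )
```

For example, `translate_sig("1 TAB PO BID")` expands to a plain-language instruction.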
Notification system, notification method, and non-transitory computer readable medium storing program
A notification system includes: detection means (110) for detecting an acoustic event from voice data transmitted from a communication terminal held by a target person; and notification means (120) for sending a predetermined notification when the detection means (110) has detected the acoustic event. Accordingly, it is possible to determine the state of the target person regardless of that person's situation. Further, when the difference between an acoustic pattern of the voice data transmitted from the communication terminal and the acoustic patterns registered in advance is outside a predetermined range, a management server (101) does not send a notification, which prevents communication traffic from increasing due to unnecessary notifications.
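The suppression logic in the last sentence — notify only when the observed acoustic pattern is close enough to a registered one — can be sketched as a distance threshold. The Euclidean metric and all names below are assumptions for illustration.

```python
def should_notify(pattern, registered, max_distance):
    """Notify only when the observed acoustic pattern lies within a
    predetermined distance of at least one registered pattern."""
    dist = min(
        sum((a - b) ** 2 for a, b in zip(pattern, ref)) ** 0.5
        for ref in registered
    )
    return dist <= max_distance

registered = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical registered patterns
```

Patterns far from every registered one are dropped, keeping unnecessary notifications off the network.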
System to evaluate dimensions of pronunciation quality
The present invention provides a system for determining the language proficiency of a user in an evaluated language. A machine learning engine may be trained using audio file variables from a plurality of audio files and human-generated scores for comprehensibility, accentedness, and intelligibility for each audio file. The system may receive an audio file from a user and determine a plurality of audio file variables from the audio file. The system may apply the audio file variables to the machine learning engine to determine comprehensibility, accentedness, and intelligibility scores for the user. The system may determine one or more projects and/or classes for the user based on the user's comprehensibility, accentedness, and/or intelligibility scores.
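The final placement step — mapping the three scores to projects or classes — might look like the following. The tiers and thresholds are illustrative assumptions, not taken from the patent.

```python
def assign_class(comprehensibility, accentedness, intelligibility):
    """Map three 0-100 pronunciation scores to a hypothetical class
    tier using illustrative thresholds."""
    avg = (comprehensibility + accentedness + intelligibility) / 3
    if avg >= 80:
        return "advanced"
    if avg >= 50:
        return "intermediate"
    return "beginner"
```

A real system could weight the three dimensions differently or branch on each score individually.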
Mass media presentations with synchronized audio reactions
Systems and methods of the present disclosure provide a plurality of audio reactions from a plurality of client devices. The audio reactions are captured by microphones on the client devices and are time-stamped. The method also includes mixing the audio reactions by a mixer server to form a mixed audio reaction, and sending the mixed audio reaction to at least one of the client devices. The client device is adapted to play the mixed audio reaction and a mass media presentation. The mixed audio reaction and the mass media presentation are synchronized to create an audience effect for the mass media presentation. The present technology also provides echo removal, volume balancing, compression, and time stamping of an audio stream by the client device. Reactions from at least one of buttons and gestures activate synthesized sounds, for example clapping, booing, and cheering, which are mixed into the mixed audio reaction.
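The mixer-server step can be sketched as summing time-stamped clips into one track, with peak normalization standing in for the volume balancing the abstract mentions. All names are illustrative.

```python
import numpy as np

def mix_reactions(reactions, length):
    """Sum time-stamped reaction clips into one track; `reactions` is a
    list of (offset_in_samples, samples) pairs. Peak-normalize if the
    sum would clip (a stand-in for volume balancing)."""
    mixed = np.zeros(length)
    for offset, clip in reactions:
        end = min(length, offset + len(clip))
        mixed[offset:end] += clip[: end - offset]
    peak = np.max(np.abs(mixed))
    return mixed / peak if peak > 1.0 else mixed

clips = [(0, np.array([0.4, 0.4])), (1, np.array([0.8, 0.8]))]
track = mix_reactions(clips, 4)
```

Overlapping clips add where their timestamps coincide, and the result is scaled back into range before playback.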
ACOUSTIC DATA AUGMENTATION WITH MIXED NORMALIZATION FACTORS
A method, computer system, and computer program product for audio data augmentation are provided. Sets of audio data from different sources may be obtained. A respective normalization factor for at least two sources of the different sources may be calculated. The normalization factors from the at least two sources may be mixed to determine a mixed normalization factor. A first set of the sets may be normalized by using the mixed normalization factor to obtain training data for training an acoustic model.
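The augmentation described above can be sketched as: compute a per-source factor, average the factors, and normalize one set with the averaged value. RMS is used here as one plausible normalization factor; the patent does not specify it, and all names are illustrative.

```python
import numpy as np

def mixed_normalization(sets, weights=None):
    """Compute a per-source factor (RMS, one plausible choice), mix the
    factors by weighted average, and normalize the first set with it."""
    factors = [float(np.sqrt(np.mean(np.square(s)))) for s in sets]
    if weights is None:
        weights = [1.0 / len(factors)] * len(factors)
    mixed = sum(w * f for w, f in zip(weights, factors))
    return sets[0] / mixed, mixed

source_a = np.array([2.0, -2.0])   # RMS 2.0
source_b = np.array([4.0, -4.0])   # RMS 4.0
normalized, factor = mixed_normalization([source_a, source_b])
```

Normalizing one source with a factor blended from several sources yields training data whose scale reflects all of them, which is the augmentation effect the abstract describes.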