Patent classifications
G10L25/00
Matching output volume to a command volume
A speech recognition system automatically sets the volume of output audio based on the sound intensity of a command spoken by a user to adjust the output volume. The system can compensate for variation in the intensity of the captured speech command caused by the distance between the speaker and the audio capture device, the pitch of the spoken command, the acoustic profile of the system, and the relative intensity of ambient noise.
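A minimal sketch of the volume-matching idea, assuming the command and ambient levels have already been measured in dB; the constants and the function name output_volume_db are illustrative, not the patented implementation.

```python
def output_volume_db(command_db, ambient_db, base_db=-20.0, sensitivity=0.8):
    """Scale the output level with the command level, discounting ambient noise."""
    # How far the spoken command rises above the ambient noise floor.
    margin_db = command_db - ambient_db
    # Map that margin onto an output level around a base setting, clamped to a sane range.
    return max(-60.0, min(0.0, base_db + sensitivity * margin_db))


# Example: a command 25 dB above the noise floor yields a near-maximum output level.
print(output_volume_db(command_db=-30.0, ambient_db=-55.0))
```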
Enhancing comprehension in voice communications
Embodiments herein include receiving a request to modify an audio characteristic associated with a first user for a voice communication system. One or more suggested modified audio characteristics may be provided for the first user based, at least in part, on one or more audio preferences established by another user. An input of one or more modified audio characteristics may be received for the first user for the voice communication system. A user-specific audio preference may be associated with the first user for voice communications on the voice communication system, the user-specific audio preference including the one or more modified audio characteristics.
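A hypothetical sketch of storing user-specific audio preferences and deriving suggestions from preferences other users have established; the AudioPreference fields and the PreferenceStore class are assumptions, not the embodiment's interface.

```python
from dataclasses import dataclass


@dataclass
class AudioPreference:
    pitch_shift: float = 0.0   # semitones applied to the first user's audio
    speed_factor: float = 1.0  # playback-rate adjustment
    gain_db: float = 0.0       # volume adjustment


class PreferenceStore:
    def __init__(self):
        self._prefs = {}  # user_id -> AudioPreference

    def suggest(self, other_users_prefs):
        """Suggest characteristics based on preferences established by other users."""
        if not other_users_prefs:
            return AudioPreference()
        n = len(other_users_prefs)
        return AudioPreference(
            pitch_shift=sum(p.pitch_shift for p in other_users_prefs) / n,
            speed_factor=sum(p.speed_factor for p in other_users_prefs) / n,
            gain_db=sum(p.gain_db for p in other_users_prefs) / n,
        )

    def set_for_user(self, user_id, pref):
        """Associate the modified audio characteristics with the first user."""
        self._prefs[user_id] = pref
```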
Speech recognition for avionic systems
Voice-operable avionic systems and methods supporting utilization of speech recognition to facilitate control of avionic systems are disclosed. Utilizing speech recognition to control avionic systems may help reduce the head-down time of the flight crew. Safety features may also be implemented to ensure safety-critical commands are carried out as intended when commands are received through speech recognition. In addition, voice-operable avionic systems configured in accordance with embodiments of the inventive concepts disclosed herein may be implemented in ways that can help reduce the complexity and cost associated with obtaining certifications from aviation authorities.
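One way such a safety feature could look, sketched here as a simple confirmation gate; the command list, function names, and confirmation flow are assumptions for illustration, not the disclosed avionics design.

```python
SAFETY_CRITICAL = {"lower landing gear", "shut down left engine", "deploy speed brakes"}


def execute_voice_command(command, confirm):
    """Run a recognized command, requiring explicit confirmation when safety-critical."""
    command = command.strip().lower()
    if command in SAFETY_CRITICAL:
        # Read the command back and act only on an explicit affirmative response.
        if confirm(f"Confirm: {command}?") is not True:
            return "cancelled"
    return f"executing: {command}"


# Example with an auto-confirming callback standing in for the crew's read-back.
print(execute_voice_command("lower landing gear", confirm=lambda prompt: True))
```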
Connected device voice command support
Systems and techniques for connected device voice command support are described herein. A voice command may be received from a user. A set of connected devices proximate to the user may be identified. The voice command may be transformed into a command for the set of connected devices. The command may be communicated to the set of connected devices.
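A rough sketch of the fan-out described above; device discovery, the distance threshold, and the toy command mapping are hypothetical, not the described protocol.

```python
def handle_voice_command(utterance, devices, radius_m=10.0):
    """Transform an utterance into a command and route it to nearby connected devices."""
    # Identify the set of connected devices proximate to the user.
    nearby = [d for d in devices if d["distance_m"] <= radius_m]
    # Transform the voice command into a device-level command (toy mapping).
    if "lights off" in utterance.lower():
        command = {"capability": "light", "action": "off"}
    else:
        command = {"capability": "unknown", "action": "none"}
    # Communicate the command to each nearby device that supports the capability.
    return [(d["id"], command) for d in nearby if command["capability"] in d["capabilities"]]


devices = [
    {"id": "lamp-1", "distance_m": 3.0, "capabilities": {"light"}},
    {"id": "tv-1", "distance_m": 25.0, "capabilities": {"light", "media"}},
]
print(handle_voice_command("Lights off, please", devices))
```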
Generating communicative behaviors for anthropomorphic virtual agents based on user's affect
Systems and methods for automatically generating at least one of facial expressions, body gestures, vocal expressions, or verbal expressions for a virtual agent based on emotion, mood and/or personality of a user and/or the virtual agent are provided. Systems and methods for determining a user's emotion, mood and/or personality are also provided.
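As an illustration of the mapping step only, a table-driven sketch from an estimated user emotion to agent behaviors; the emotion labels and behavior choices are assumptions, not the systems' actual models.

```python
BEHAVIOR_BY_EMOTION = {
    "happy":   {"face": "smile",     "gesture": "open_palms", "voice": "bright"},
    "sad":     {"face": "concerned", "gesture": "lean_in",    "voice": "soft"},
    "angry":   {"face": "calm",      "gesture": "still",      "voice": "even"},
    "neutral": {"face": "neutral",   "gesture": "idle",       "voice": "neutral"},
}


def select_agent_behavior(user_emotion, agent_mood="neutral"):
    """Pick facial, gestural, and vocal expressions given the user's affect."""
    behavior = dict(BEHAVIOR_BY_EMOTION.get(user_emotion, BEHAVIOR_BY_EMOTION["neutral"]))
    # The agent's own mood can temper the selected vocal expression.
    if agent_mood == "sad" and behavior["voice"] == "bright":
        behavior["voice"] = "warm"
    return behavior
```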
Anaphora resolution for semantic tagging
A semantic tagging method may add context to a sentence in order to increase search efficiency. Regardless of an author's writing style, translating semantic concepts into tags may increase search efficiency. Automatic semantic tagging of documents may allow semantic search and reasoning. Text for semantic tagging may include an email, a website chat room, an internet forum, or a text message. Additional analysis may aggregate the general consensus on an emailed topic across multiple emails, whether in the same email chain or in separate emails. To increase search efficiency, the analysis of prior communications within the body of text may comprise analyzing structured contextual information to facilitate homophora resolution. The structured contextual information may include at least one of a sender email address, one or more recipient email addresses, a subject field, a message date and time stamp, and an attachment title.
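A toy sketch of using the structured contextual information for homophora resolution; the substitution heuristic and field names are assumptions for illustration only.

```python
def resolve_homophora(sentence, email_meta):
    """Replace generic references in a sentence with entities from email metadata."""
    substitutions = {
        "the attachment": email_meta.get("attachment_title", "the attachment"),
        "the sender": email_meta.get("sender", "the sender"),
        "this topic": email_meta.get("subject", "this topic"),
    }
    for phrase, entity in substitutions.items():
        sentence = sentence.replace(phrase, entity)
    return sentence


meta = {"sender": "alice@example.com", "subject": "the Q3 budget", "attachment_title": "budget.xlsx"}
print(resolve_homophora("Please review the attachment before we discuss this topic.", meta))
```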
Electronic apparatus and control method thereof
An electronic apparatus and a control method thereof are disclosed. The electronic apparatus includes a voice input unit configured to receive a user voice, a storage unit configured to store a plurality of voice print feature models representing a plurality of user voices and a plurality of utterance environment models representing a plurality of environmental disturbances, and a controller configured to, in response to a user voice being input through the voice input unit, extract utterance environment information of an utterance environment model among the plurality of utterance environment models corresponding to a location where the user voice is input, compare a voice print feature of the input user voice with the plurality of voice print feature models, revise a result of the comparison based on the extracted utterance environment information, and recognize a user corresponding to the input user voice based on the revised result.
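A minimal sketch of the score-revision idea under stated assumptions: each voice print model is reduced to a mean feature vector, similarity is a negative Euclidean distance, and the utterance-environment information is folded in by relaxing the acceptance threshold with the measured noise level. None of these choices come from the apparatus itself.

```python
import numpy as np


def recognize_user(feature, voiceprint_models, noise_level_db, base_threshold=-2.0):
    """Return the best-matching user, relaxing the accept threshold in noisier environments."""
    # Revise the decision criterion using the extracted utterance-environment information.
    threshold = base_threshold - 0.05 * max(noise_level_db, 0.0)
    best_user, best_score = None, float("-inf")
    for user_id, model_mean in voiceprint_models.items():
        # Compare the input voice print feature with each stored model.
        score = -float(np.linalg.norm(feature - model_mean))
        if score > best_score:
            best_user, best_score = user_id, score
    return (best_user, best_score) if best_score >= threshold else (None, best_score)
```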
Home appliance having speech recognition function
The present disclosure relates to a home appliance capable of being operated by speech of a user. The home appliance includes a main body forming an outer appearance, a microphone unit including at least one sensing portion disposed to face the front of the main body to detect speech of a user, and a speaker unit spaced apart from the microphone unit by a predetermined distance.
Voiceprint authentication method and apparatus
The present disclosure provides a voiceprint authentication method and a voiceprint authentication apparatus. The method includes: displaying a first character string to a user, in which the first character string includes a predilection character preset by the user, and the predilection character is displayed as a symbol corresponding to the predilection character in the first character string; obtaining a speech of the first character string read by the user; obtaining a first voiceprint identity vector of the speech of the first character string; and comparing the first voiceprint identity vector with a second voiceprint identity vector registered by the user to determine a result of the voiceprint authentication.
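A minimal sketch of the comparison step only, assuming the identity vectors are already extracted; cosine similarity and the 0.7 threshold are illustrative assumptions, and the character-masking and enrollment details are not reproduced here.

```python
import numpy as np


def authenticate(first_ivec, registered_ivec, threshold=0.7):
    """Accept the speaker if the two voiceprint identity vectors are similar enough."""
    cosine = float(np.dot(first_ivec, registered_ivec) /
                   (np.linalg.norm(first_ivec) * np.linalg.norm(registered_ivec)))
    return cosine >= threshold, cosine
```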
Single-sided speech quality measurement
A non-intrusive speech quality estimation technique is based on statistical or probability models such as Gaussian Mixture Models (“GMMs”). Perceptual features are extracted from the received speech signal and assessed by an artificial reference model formed using statistical models. The models characterize the statistical behavior of speech features. Consistency measures between the input speech features and the models are calculated to form indicators of speech quality. The consistency values are mapped to a speech quality score using a mapping optimized using machine learning algorithms, such as Multivariate Adaptive Regression Splines (“MARS”). The technique provides competitive or better quality estimates relative to known techniques while having lower computational complexity.
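A sketch of the single-sided pipeline under stated assumptions: scikit-learn's GaussianMixture stands in for the artificial reference model, the consistency measure is the mean per-frame log-likelihood, and a clipped linear map substitutes for the MARS mapping described above. Perceptual feature extraction is left abstract.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def train_reference_model(clean_speech_features, n_components=8):
    """Fit a GMM to perceptual features extracted from clean reference speech."""
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag", random_state=0)
    gmm.fit(clean_speech_features)  # shape: (num_frames, feature_dim)
    return gmm


def estimate_quality(gmm, test_features, slope=0.5, offset=5.0):
    """Consistency = mean log-likelihood of the received speech's features, mapped to a MOS-like score."""
    consistency = float(np.mean(gmm.score_samples(test_features)))
    return float(np.clip(slope * consistency + offset, 1.0, 5.0))
```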