Patent classifications
H04M1/271
METHOD OF IDENTIFYING CONTACTS FOR INITIATING A COMMUNICATION USING SPEECH RECOGNITION
A method and system on an electronic device use speech recognition to initiate a communication from a mobile device having access to contact information for a number of contacts. In one example, the method comprises receiving through an audio input interface a voice input for initiating a communication, extracting from the voice input a type of communication and at least part of a contact name, and outputting, to an output interface, a selectable list of all contacts from the contact information which have the part of the contact name and which have a contact address associated with the type of communication. The mobile device may also be configured to access remote contact information from a remote server.
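The filtering step in this abstract can be illustrated with a minimal sketch. It assumes speech recognition has already produced the communication type and the partial contact name; the `Contact` class, the `matching_contacts` function, and the sample data are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Contact:
    name: str
    # Maps a communication type (e.g. "call", "email") to a contact address.
    addresses: dict = field(default_factory=dict)


def matching_contacts(contacts, name_part, comm_type):
    """Return contacts whose name contains name_part (case-insensitive)
    and that have an address for the requested communication type."""
    needle = name_part.casefold()
    return [
        c for c in contacts
        if needle in c.name.casefold() and comm_type in c.addresses
    ]


# Hypothetical contact information for illustration only.
contacts = [
    Contact("John Smith", {"call": "555-0100", "email": "john@example.com"}),
    Contact("Johnny Lee", {"email": "lee@example.com"}),
    Contact("Mary Jones", {"call": "555-0101"}),
]

# Voice input such as "call John" is assumed to yield ("call", "john").
selectable = matching_contacts(contacts, "john", "call")
```

Here "Johnny Lee" matches the name fragment but is left out of the list because no address is stored for the "call" type.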
METHOD AND DEVICE FOR AUDIO INPUT ROUTING
A method on a mobile device for a wireless network is described. An audio input is monitored for a trigger phrase spoken by a user of the mobile device. A command phrase spoken by the user after the trigger phrase is buffered. The command phrase corresponds to a call command and a call parameter. A set of target contacts associated with the mobile device is selected based on respective voice validation scores and respective contact confidence scores. The respective voice validation scores are based on the call parameter. The respective contact confidence scores are based on a user context associated with the user. A call to a priority contact of the set of target contacts is automatically placed if the voice validation score of the priority contact meets a validation threshold and the contact confidence score of the priority contact meets a confidence threshold.
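The two-threshold decision described above might look like the following sketch. The abstract does not say how the priority contact is chosen or what the threshold values are; ranking by the product of the two scores and the values 0.8 and 0.7 are assumptions made purely for illustration, as are all names in the code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    voice_validation: float    # how well the spoken call parameter matches
    contact_confidence: float  # how likely the contact is, given user context


def pick_auto_call(
    candidates: List[Candidate],
    validation_threshold: float = 0.8,   # assumed value
    confidence_threshold: float = 0.7,   # assumed value
) -> Optional[Candidate]:
    """Return the priority contact if both of its scores meet their
    thresholds; otherwise return None so the user can be asked to choose."""
    if not candidates:
        return None
    # Assumption: the priority contact is the one with the highest score product.
    priority = max(
        candidates, key=lambda c: c.voice_validation * c.contact_confidence
    )
    if (priority.voice_validation >= validation_threshold
            and priority.contact_confidence >= confidence_threshold):
        return priority
    return None


targets = [Candidate("Alice", 0.9, 0.75), Candidate("Alan", 0.6, 0.9)]
choice = pick_auto_call(targets)
```

Returning `None` rather than the best available candidate keeps the automatic call limited to cases where both scores are high enough.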
Acoustic trigger detection
A method for selective transmission of audio data to a speech processing server uses detection of an acoustic trigger in the audio data in determining the data to transmit. Detection of the acoustic trigger makes use of an efficient computation approach that reduces the amount of run-time computation required, or equivalently improves accuracy for a given amount of computation, by combining a time delay structure, in which intermediate results of computations are reused at various time delays, thereby avoiding computation of new results, with a decomposition of certain transformations that requires fewer arithmetic operations without sacrificing significant performance. For a given amount of computation capacity, the combination of these two techniques provides improved accuracy as compared to current approaches.
Intelligent interactive voice response system for processing customer communications
A method and apparatus for processing a user call via an intelligent interactive voice response (IVR) call processing application are disclosed. One example method may include receiving a call from a user device, obtaining user information from the received call, comparing the user information to at least one pre-stored user information stored in a user databank associated with a user account, and calculating a first confidence level by comparing the user information to the pre-stored user information. The method may also include authorizing the user device to receive an offer based on the first confidence level, and transmitting the offer to the user authorized by the first confidence level.
SPEAKER RECOGNITION IN THE CALL CENTER
Utterances of at least two speakers in a speech signal may be distinguished and the associated speaker identified by use of diarization together with automatic speech recognition of identifying words and phrases commonly found in the speech signal. The diarization process clusters turns of the conversation while recognized special form phrases and entity names identify the speakers. A trained probabilistic model deduces which entity name(s) correspond to the clusters.
AUDIO CALL ANALYSIS
A device includes a communication interface, an input interface, and a processor. The communication interface is configured to receive an audio signal associated with an audio call. The input interface is configured to receive user input during the audio call. The processor is configured to generate an audio recording of the audio signal in response to receiving the user input. The processor is also configured to generate text by performing speech-to-text conversion of the audio recording. The processor is further configured to perform a comparison of the text to a pattern. The processor is also configured to identify, based on the comparison, a portion of the text that matches the pattern. The processor is further configured to provide the portion of the text and an option to a display. The option is selectable to initiate performance of an action corresponding to the pattern.
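The compare-and-offer step can be sketched as below, assuming the speech-to-text conversion has already produced a transcript. The patterns, action labels, and the `find_actions` function are hypothetical examples; the abstract does not specify which patterns or actions are used.

```python
import re

# Hypothetical pattern table: pattern name -> (compiled regex, action label).
PATTERNS = {
    "phone_number": (re.compile(r"\b\d{3}-\d{3}-\d{4}\b"), "Save to contacts"),
    "email": (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"), "Compose email"),
}


def find_actions(text):
    """Return (matched_text, action_label) pairs for every portion of the
    transcript that matches one of the known patterns."""
    results = []
    for pattern, action in PATTERNS.values():
        for match in pattern.finditer(text):
            results.append((match.group(0), action))
    return results


# Transcript assumed to come from speech-to-text conversion of the recording.
transcript = "you can reach me at 555-123-4567 tomorrow"
options = find_actions(transcript)
```

Each pair holds the matched portion of the text and the option a display could present alongside it.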
METHOD AND DEVICE FOR DISPLAYING APPLICATION FUNCTION INFORMATION, AND TERMINAL
Embodiments of the present disclosure provide a method and a device for displaying application function information, a terminal device and a storage medium. The method includes: displaying a content card associated with an application function on a screen of a terminal device; and performing a switching operation on the content card currently displayed on the screen when detecting that a preset switching condition is satisfied.
Communication device
The communication device comprises a 1st device remotely controlling implementer, a 2nd device remotely controlling implementer, and a video data transfer implementer.
TERMINAL HOLDER AND FAR-FIELD VOICE INTERACTION SYSTEM
Embodiments of the present disclosure disclose a terminal holder and a far-field voice interaction system. A specific implementation of the terminal holder includes: a far-field voice pickup device and a voice analysis device. The far-field voice pickup device receives voice sent by a user, and sends the voice to the voice analysis device. The voice analysis device analyzes the voice, determines whether the voice contains a preset wake-up word, and sends the voice to a terminal in communication connection with the terminal holder when the preset wake-up word is contained. This embodiment receives voice sent by a user through the terminal holder supporting a far-field voice pickup function, thereby facilitating the far-field voice control over the terminal.
LED Design Language for Visual Affordance of Voice User Interfaces
A method is implemented at an electronic device for visually indicating a voice processing state. The electronic device includes an array of visual indicators and one or more microphones. The electronic device collects via the one or more microphones audio inputs from an environment in proximity to the electronic device and initializes processing of the audio inputs. A state of the processing is then determined from among a plurality of predefined voice processing states, and for each of the visual indicators, a respective predetermined illumination specification is determined in association with the determined voice processing state. In accordance with the identified illumination specifications of the visual indicators, the electronic device synchronizes illumination of the array of visual indicators to provide a visual pattern indicating the determined voice processing state. The visual pattern is displayed on the surface of the electronic device and includes one or more discrete illumination elements.
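The mapping from a voice processing state to a per-indicator illumination specification can be sketched as a lookup table. The state names, the number of indicators, and the colors and brightness values below are all assumptions chosen for illustration; the abstract only says that the states and specifications are predefined.

```python
from enum import Enum

NUM_LEDS = 4  # assumed size of the indicator array


class VoiceState(Enum):
    # Hypothetical state names.
    LISTENING = "listening"
    THINKING = "thinking"
    RESPONDING = "responding"


# One (color, brightness) specification per indicator, for each state.
ILLUMINATION = {
    VoiceState.LISTENING: [("white", 1.0)] * NUM_LEDS,
    VoiceState.THINKING: [
        ("white", 0.3), ("white", 1.0), ("white", 0.3), ("white", 1.0),
    ],
    VoiceState.RESPONDING: [("blue", 0.6)] * NUM_LEDS,
}


def illumination_for(state):
    """Return a copy of the per-indicator specification list for a state."""
    return list(ILLUMINATION[state])


pattern = illumination_for(VoiceState.THINKING)
```

Looking up the whole list at once means all indicators receive their specifications from the same state, which is one simple way to keep the array synchronized.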