Patent classifications
G10L2015/221
Vehicle and control method thereof
A vehicle includes a communication device configured to communicate with a terminal capable of providing a communication function; a sensor configured to receive voice of a user; a storage configured to store a user pattern related to a call pattern of the user; and a controller configured to search for at least one name candidate corresponding to input voice when receiving the input voice, determine a threshold for a confidence score of the at least one name candidate based on the user pattern, and select a name corresponding to the input voice from among the at least one name candidate based on the determined threshold.
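The threshold selection described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the `Candidate` type, the `threshold_for` rule (lowering the threshold for frequently called names), and all numeric defaults are assumptions introduced here, since the abstract does not say how the user pattern maps to a threshold.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    confidence: float  # recognizer confidence score in [0, 1]


def threshold_for(name, call_counts, base=0.80, floor=0.50, step=0.05):
    """Assumed rule: each past call to a name lowers its threshold by `step`."""
    calls = call_counts.get(name, 0)
    return max(floor, base - step * calls)


def select_name(candidates, call_counts):
    """Return the best candidate whose score clears its own threshold, else None."""
    accepted = [
        c for c in candidates
        if c.confidence >= threshold_for(c.name, call_counts)
    ]
    if not accepted:
        return None
    return max(accepted, key=lambda c: c.confidence).name
```

With a call history of `{"Anna": 4}`, a candidate "Anna" at 0.65 is accepted while "Hannah" at 0.70 is rejected, because only the frequently called name gets the relaxed threshold.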
PRONUNCIATION TEACHING METHOD
A pronunciation teaching method is provided. A service account is provided in a social communication program to provide a pronunciation teaching program. The service account provides guidance information to a user account. The user reads the guidance information aloud by voice input, a voice input engine converts the speech into a text to be evaluated, and that text is transmitted directly to the service account. The service account provides an evaluation result to the corresponding user account according to the text to be evaluated. The social communication program provides the reception and transmission of text messages. The guidance information is text provided for users to pronounce. The evaluation result is related to the difference between the guidance information and the text to be evaluated. Accordingly, the pronunciation defects of users can be effectively detected. Corrective pronunciation exercises can be arranged specifically to improve the pronunciation accuracy of users and the efficiency of voice input.
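The evaluation step, which compares the guidance information with the text to be evaluated, might look like the sketch below. It assumes a word-level diff using Python's standard `difflib`. The function name `evaluate_pronunciation` and the shape of the returned result are hypothetical, and the abstract does not specify the actual comparison metric.

```python
import difflib


def evaluate_pronunciation(guidance, recognized):
    """Compare the prompt text with the speech-to-text result, word by word."""
    expected = guidance.lower().split()
    heard = recognized.lower().split()
    matcher = difflib.SequenceMatcher(a=expected, b=heard, autojunk=False)
    problems = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Pair what was expected with what the voice input engine produced.
            problems.append((" ".join(expected[i1:i2]), " ".join(heard[j1:j2])))
    return {"score": matcher.ratio(), "problems": problems}
```

For example, the guidance text "she sells sea shells" recognized as "she sells see shells" yields one problem pair, `("sea", "see")`, which could then drive the choice of follow-up exercises.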
SYSTEMS AND METHODS FOR IMPROVED AUDIO-VIDEO CONFERENCES
Systems and methods for efficient management of audio/video conferences are disclosed. The method includes receiving an audio question from a first user of a plurality of users connected to a conference, recording the audio question and preventing an immediate transmission of the audio question to the plurality of users connected to the conference, analyzing the recorded question and a recorded portion of the conference to determine that the question has been answered during the recorded portion of the conference, and in response to determining that the audio question has previously been answered, transmitting a relevant section of the recorded portion of the conference consisting of an answer to the audio question to the first user.
QUERY ROUTING FOR BOT-BASED QUERY RESPONSE
A method, system, and computer program product for routing queries to answer resources based on component parts and intents of a received query is provided. The method receives a query from a user. The query is analyzed to identify a set of entities associated with the query and generate an utterance representing the query. The method generates an intent classification for the utterance and a vector for the query. The vector is generated based on the set of entities, the utterance, and the intent classification. The method determines an answer resource for the query based on the vector and the intent classification of the query. In response to determining the answer resource, the method provides an answer interface based on the query, the vector, and the intent classification. The answer interface dynamically provides a response to the query.
SYSTEMS AND METHODS FOR SCRIPTED AUDIO PRODUCTION
A scripted audio production system is disclosed that decreases production time by improving the computerized processes and technological systems used for pronunciation research, script preparation, narration, editing, proofing, and mastering. The system enables the user to upload their manuscript and recorded audio of the narration of the manuscript to the system. The system then compares the recorded audio against the previously uploaded manuscript, and any mistakes or deviations from the manuscript are highlighted or otherwise indicated to the user. In other embodiments, after uploading the manuscript, the system enables the user to press “record,” and as soon as they start speaking, the scripted audio production technology system tracks the point within the manuscript from where they are reading. Any mistakes or deviations from the script are automatically highlighted. The narrator may then stop and re-record a sentence after a mistake. The system automatically pieces together the last-read audio into a clean file without the need for significant user interaction. The process may also be performed on the recorded audio by the narrator first uploading the audio and manuscript to the scripted audio production technology system.
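The step of piecing together the last-read audio can be illustrated with a small sketch. It assumes each recorded clip has already been aligned to a sentence index in the manuscript. The `(sentence_index, clip)` pair representation and the function name `assemble_clean_take` are hypothetical and are not taken from the abstract.

```python
def assemble_clean_take(takes):
    """Keep only the most recent take of each sentence, in manuscript order.

    `takes` is a list of (sentence_index, clip) pairs in recording order.
    """
    latest = {}
    for sentence_index, clip in takes:
        latest[sentence_index] = clip  # a later re-record overwrites the earlier one
    return [latest[i] for i in sorted(latest)]
```

If the narrator records sentences 0 and 1, re-records sentence 1 after a mistake, and then continues with sentence 2, the assembled list contains the first take of sentence 0, the second take of sentence 1, and the take of sentence 2.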
SECURE ENTERPRISE ACCESS WITH VOICE ASSISTANT DEVICES
Systems and methods are provided for optimizing and securing an enterprise voice service accessed by an external voice assistant device. An enterprise voice assistant installed on a client device acts as an enterprise voice service for an external voice assistant device. The enterprise voice assistant receives a voice query from the external voice assistant device. The voice query is processed using a machine learning model to extract an intent and at least one slot. The extracted intent and at least one slot are used to determine whether a response to the voice query can be generated using local enterprise data that was previously received and stored by the client device from a management server. The response is generated based on the determination by using the local enterprise data or by sending the extracted intent and at least one slot to and receiving the response from the management server.
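The local-or-server decision described in this abstract could be sketched as below. It assumes the intent and slots have already been extracted by the machine learning model. The lookup key format, the `local_data` dictionary, and the `ask_server` callback are all hypothetical stand-ins for the client's stored enterprise data and the management server.

```python
def answer_query(intent, slots, local_data, ask_server):
    """Answer from locally stored enterprise data when possible, else ask the server.

    Returns a (response, source) pair so the caller can see which path was taken.
    """
    # Slot values are assumed to be hashable (e.g. strings) so they can form a key.
    key = (intent, tuple(sorted(slots.items())))
    if key in local_data:
        return local_data[key], "local"
    return ask_server(intent, slots), "server"
```

A query whose intent and slots match an entry previously received from the management server is answered on the client device. Any other query is forwarded, with only the extracted intent and slots leaving the device.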
VIRTUAL ASSISTANT IDENTIFICATION OF NEARBY COMPUTING DEVICES
In one example, a method includes: receiving audio data generated by a microphone of a current computing device; identifying, based on the audio data, one or more computing devices that each emitted a respective audio signal in response to speech reception being activated at the current computing device; and selecting either the current computing device or a particular computing device from the identified one or more computing devices to satisfy a spoken utterance determined based on the audio data.
Audio-triggered augmented reality eyewear device
Systems, methods, and non-transitory computer readable media are provided for augmenting scenes viewed through displays of eyewear devices with audio-related image information. Scenes may be augmented by capturing, via a camera of the eyewear device, initial images of a scene, identifying features within the initial images, receiving audio-related image information (e.g., lyrics and/or images), registering the audio-related image information to the identified features, creating audio-based visual overlays including the audio-related image information registered to the identified features, and displaying the audio-based visual overlays over the scene.
INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING METHOD, AND RECORDING MEDIUM HAVING STORED THEREON INFORMATION PROCESSING PROGRAM
An information processing system according to one embodiment includes: a voice receiver which receives a first voice uttered by a first user of a first information processing device; a voice recognizer which recognizes the first voice received by the voice receiver; a display controller which causes a first text, which corresponds to the first voice recognized by the voice recognizer, to be displayed in each of first display areas of the first information processing device and a second information processing device, and a second display area of the first information processing device; and a correction reception portion which receives a correction operation of the first user for the first text displayed in the second display area.
Virtual assistant identification of nearby computing devices
In one example, a method includes: receiving audio data generated by a microphone of a current computing device; identifying, based on the audio data, one or more computing devices that each emitted a respective audio signal in response to speech reception being activated at the current computing device; and selecting either the current computing device or a particular computing device from the identified one or more computing devices to satisfy a spoken utterance determined based on the audio data.