Patent classifications
G10L15/30
ELECTRONIC APPARATUS AND CONTROLLING METHOD THEREOF
An electronic apparatus includes: a memory storing one or more instructions; and a processor connected to the memory and configured to control the electronic apparatus, wherein the processor is configured, by executing the one or more instructions, to: identify a first intention word and a first target word from first speech; acquire second speech, received after the first speech, based on at least one of the identified first intention word or the identified first target word not matching a word stored in the memory; acquire a similarity between the first speech and the second speech; and acquire response information based on the first speech and the second speech, based on the similarity being greater than or equal to a threshold value.
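The claimed flow can be sketched roughly as follows. All names here, the stored vocabulary, the character-level similarity metric, and the threshold value are assumptions for illustration, not details taken from the patent:

```python
from difflib import SequenceMatcher

STORED_WORDS = {"play", "stop", "music", "lights"}  # assumed stored vocabulary
THRESHOLD = 0.6  # assumed similarity threshold

def similarity(a, b):
    """Illustrative utterance similarity (character-level ratio)."""
    return SequenceMatcher(None, a, b).ratio()

def handle_speech(first, second=None):
    # Treat the first word as the intention word and the rest as the target word.
    intention, _, target = first.partition(" ")
    if intention in STORED_WORDS and target in STORED_WORDS:
        return "response:" + first
    # A word did not match the stored vocabulary: consider the follow-up speech,
    # combining both utterances only when they are sufficiently similar.
    if second is not None and similarity(first, second) >= THRESHOLD:
        return "response:" + first + " + " + second
    return None
```

A near-duplicate follow-up such as a self-correction ("play musik" then "play music") clears the similarity threshold, so both utterances feed the response; an unrelated first utterance with no follow-up yields no response.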
ELECTRONIC DEVICE AND METHOD OF CONTROLLING THEREOF
Disclosed is an electronic device. The electronic device may execute an application for transmitting and receiving at least one of text data or voice data with another electronic device using the communication module; in response to occurrence of at least one event, based on receiving at least one of text data or voice data from the other electronic device, identify that a confirmation is necessary using the digital assistant, based on the at least one of text data or voice data being generated based on a characteristic of an utterance using a digital assistant; generate a notification to request confirmation using the digital assistant based on the confirmation being necessary; and output the notification using the application.
A method for identifying that a confirmation is necessary may include identifying using voice data or text data that is received from another electronic device using a rule-based or AI algorithm.
When whether a confirmation is necessary is identified using the AI algorithm, the method may use machine learning, a neural network, or a deep learning algorithm.
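The rule-based variant of this check can be sketched as a keyword scan over the received message. The function names and the cue list below are assumptions for illustration, not part of the disclosure:

```python
# Assumed cues suggesting an assistant-generated message that warrants confirmation.
CONFIRMATION_CUES = ("schedule", "payment", "delete", "purchase")

def needs_confirmation(message):
    """Rule-based check: does the message content suggest confirmation is needed?"""
    text = message.lower()
    return any(cue in text for cue in CONFIRMATION_CUES)

def notify_if_needed(message):
    """Generate the confirmation notification only when the check fires."""
    if needs_confirmation(message):
        return "Please confirm via digital assistant: " + repr(message)
    return None
```

An AI-based variant would replace `needs_confirmation` with a trained classifier over the text or voice data, as the abstract indicates.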
APPENDING ASSISTANT SERVER REQUESTS, FROM A CLIENT ASSISTANT, WITH PROACTIVELY-AGGREGATED PERIPHERAL DEVICE DATA
Implementations relate to proactively aggregating client device data to append to client assistant data that is communicated to a server device in response to a user request to a client automated assistant. When a user request that is associated with, for example, a peripheral client device, is received at a client device, the client device can communicate, to a server device, data that embodies the user request (e.g., audio data and/or local speech recognition data), along with peripheral device data that was received before the client device received the user request. In this way, the client automated assistant can bypass expressly soliciting peripheral device data each time a user request is received at another client device. Instead, a peripheral device can proactively communicate device data to a client device so that the device data can be appended to request data communicated to the server device from a particular client device.
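The mechanism amounts to a client-side cache that peripherals fill ahead of time, whose contents are appended to every outgoing server request. A minimal sketch, with all class and field names assumed for illustration:

```python
class ClientAssistant:
    """Sketch: cache proactive peripheral updates; append them to server requests."""

    def __init__(self):
        self._peripheral_cache = {}  # device_id -> latest proactively reported state

    def on_peripheral_update(self, device_id, state):
        # Called by a peripheral device proactively, before any user request arrives.
        self._peripheral_cache[device_id] = state

    def build_server_request(self, request_data):
        # Append the cached peripheral data so the server need not solicit it
        # separately each time a user request is received.
        return {
            "request": request_data,
            "peripheral_data": dict(self._peripheral_cache),
        }
```

Because the cache is populated before the request arrives, the server sees current peripheral state without an extra round trip to the peripheral.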
Tracking specialized concepts, topics, and activities in conversations
Embodiments are directed to organizing conversation information. A tracker vocabulary may be provided to a universal model to predict a generalized vocabulary associated with the tracker vocabulary. A tracker model may be generated based on the portions of the universal model activated by the tracker vocabulary such that a remainder of the universal model may be excluded from the tracker model. Portions of a conversation stream may be provided to the tracker model. A match score may be generated based on the tracker model and the portions of the conversation stream such that the match score predicts if the portions of the conversation stream may be in the generalized vocabulary predicted for the tracker vocabulary. Tracker metrics may be collected based on the portions of the conversation and the match scores such that the tracker metrics may be included in reports or notifications.
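A toy rendering of this pipeline: the "universal model" is reduced to a generalization table, the tracker model keeps only the entries activated by the tracker vocabulary, and the match score is the overlap between the generalized vocabulary and a conversation chunk. The table contents and scoring are assumptions for illustration:

```python
# Assumed stand-in for the universal model: word -> generalized vocabulary.
UNIVERSAL_MODEL = {
    "price": {"cost", "pricing", "quote"},
    "demo": {"demonstration", "walkthrough"},
    "lunch": {"meal", "food"},
}

def build_tracker_model(tracker_vocabulary):
    # Keep only entries activated by the tracker vocabulary; the remainder of
    # the universal model is excluded from the tracker model.
    return {w: UNIVERSAL_MODEL[w] | {w}
            for w in tracker_vocabulary if w in UNIVERSAL_MODEL}

def match_score(tracker_model, utterance_words):
    """Fraction of the utterance falling in the predicted generalized vocabulary."""
    generalized = set().union(*tracker_model.values()) if tracker_model else set()
    hits = generalized & set(utterance_words)
    return len(hits) / max(len(utterance_words), 1)
```

A real implementation would extract activated subnetworks from a learned model rather than rows from a lookup table, but the exclusion-and-score structure is the same.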
Transcription of communications
A method to transcribe communications may include obtaining, at a first device, an audio signal that originates at a remote device during a communication session. The audio signal may be shared between the first device and a second device. The method may also include obtaining an indication that the second device is associated with a remote transcription system and, in response to the second device being associated with the remote transcription system, directing the audio signal to the remote transcription system by only one of the first device and the second device, instead of both devices directing the audio signal to the remote transcription system as occurs when the second device is not associated with the remote transcription system.
Systems and methods for voice identification and analysis
Obtaining configuration audio data including voice information for a plurality of meeting participants. Generating localization information indicating a respective location for each meeting participant. Generating a respective voiceprint for each meeting participant. Obtaining meeting audio data. Identifying a first meeting participant and a second meeting participant. Linking a first meeting participant identifier of the first meeting participant with a first segment of the meeting audio data. Linking a second meeting participant identifier of the second meeting participant with a second segment of the meeting audio data. Generating a GUI indicating the respective locations of the first and second meeting participants, and the GUI indicating a first transcription of the first segment and a second transcription of the second segment. The first transcription is associated with the first meeting participant in the GUI, and the second transcription is associated with the second meeting participant in the GUI.
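The segment-to-participant linking step can be illustrated with voiceprints as feature vectors and nearest-voiceprint assignment. The vectors, the cosine metric, and all names are assumptions for illustration, not details from the patent:

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors (illustrative metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def link_segments(voiceprints, segments):
    """voiceprints: {participant_id: vector}, built from configuration audio.
    segments: [(segment_id, vector)] from the meeting audio.
    Links each segment to the participant with the most similar voiceprint."""
    links = {}
    for seg_id, vec in segments:
        best = max(voiceprints, key=lambda pid: cosine(voiceprints[pid], vec))
        links[seg_id] = best
    return links
```

The resulting segment-to-participant links are what the GUI step consumes: each transcription is rendered next to the identifier and location of its linked participant.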