Patent classifications
G10L15/083
ELECTRONIC DEVICE WITH NON-PARTICIPANT IMAGE BLOCKING DURING VIDEO COMMUNICATION
An electronic device, computer program product, and method avoid presenting certain objects during a video communication session. During a video communication session with one or more second electronic devices, a controller of an electronic device identifies baseline image(s) from an image stream provided by an image capturing device of the electronic device. The baseline image includes a primary image portion of participant(s) and a scene of objects within the foreground or background of the participant(s), captured during an initial portion of the video communication session. The controller monitors the image stream for a subsequent detection of the primary image portion along with non-participant(s) or object(s) forming a secondary image portion that is not included within the baseline image(s). The controller responds to detecting the secondary image portion subsequently appearing within the image stream by communicating, to the one or more second electronic devices, a substitute image stream that does not present the secondary image portion.
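The baseline-comparison logic described in the abstract can be sketched as follows. This is a hypothetical Python illustration, not the patented implementation: `substitute_stream`, `detect_objects`, and `blur` are assumed names, and object detection is abstracted behind a callable.

```python
def substitute_stream(frames, detect_objects, blur):
    """Yield frames of the outgoing stream, substituting any frame whose
    detected objects go beyond the baseline captured at session start."""
    frames = iter(frames)
    first = next(frames)
    baseline = set(detect_objects(first))  # participant(s) + initial scene
    yield first
    for frame in frames:
        extras = set(detect_objects(frame)) - baseline
        # A newly appearing person or object is the "secondary image
        # portion"; send a substitute frame that hides it instead.
        yield blur(frame, extras) if extras else frame
```

With frames modeled as lists of labels, a late-arriving "cat" would be suppressed while the original participant passes through unchanged.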
DEVICE PAIRING USING WIRELESS COMMUNICATION BASED ON VOICE COMMAND CONTEXT
Pairing of multiple devices is initiated using a computer in an artificial intelligence (AI) ecosystem. A command is received at a computer to perform a user activity at a location, which includes pairing a user device to a selectable device at the location. The context of the command is analyzed, including a historical corpus of previous pairings and connection preferences. A device at the location is selected based on the analysis and the determined user activity. Pairing of the user device to the selected device at the location is automatically initiated based on the analysis of the context of the command. The automatic initiation includes adjusting settings on the user device based on that analysis. The user device is automatically paired to the selected device at the location to perform the user activity.
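A minimal sketch of the selection step, assuming a simple dictionary representation of devices and history; the field names (`id`, `location`, `capabilities`) and the frequency-based scoring are illustrative assumptions, not the patent's method.

```python
def select_device(command_ctx, devices, history):
    """Pick a device at the commanded location that supports the activity,
    preferring devices the historical corpus shows were paired before."""
    activity, location = command_ctx["activity"], command_ctx["location"]

    def past_pairings(d):
        # Score a candidate by how often it was paired for this activity.
        return sum(1 for h in history
                   if h["device"] == d["id"] and h["activity"] == activity)

    candidates = [d for d in devices
                  if d["location"] == location and activity in d["capabilities"]]
    return max(candidates, key=past_pairings, default=None)
```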
NETWORKED DEVICES, SYSTEMS, & METHODS FOR INTELLIGENTLY DEACTIVATING WAKE-WORD ENGINES
In one aspect, a playback device is configured to identify in an audio stream, via a second wake-word engine, a false wake word for a first wake-word engine that is configured to receive as input sound data based on sound detected by a microphone. The first and second wake-word engines are configured according to different sensitivity levels for false positives of a particular wake word. Based on identifying the false wake word, the playback device is configured to (i) deactivate the first wake-word engine and (ii) cause at least one network microphone device to deactivate a wake-word engine for a particular amount of time. While the first wake-word engine is deactivated, the playback device is configured to cause at least one speaker to output audio based on the audio stream. After a predetermined amount of time has elapsed, the playback device is configured to reactivate the first wake-word engine.
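The deactivate/reactivate cycle can be sketched in a few lines of Python. This is a hypothetical model, not the claimed system: the two engines are stand-in callables, `hold` is an assumed deactivation window, and notifying network microphone devices is reduced to a comment.

```python
class WakeWordGate:
    """Sketch: a second engine flags likely false wakes in the audio
    stream and temporarily deactivates the strict first engine."""

    def __init__(self, strict_engine, lenient_engine, hold=5.0):
        self.strict = strict_engine     # first engine (normal sensitivity)
        self.lenient = lenient_engine   # second engine (spots false wakes)
        self.hold = hold                # deactivation window, seconds
        self.muted_until = -1.0

    def process(self, chunk, now):
        if self.lenient(chunk):          # false wake word identified
            self.muted_until = now + self.hold
            return None                  # patent: also notify network mics
        if now < self.muted_until:       # first engine still deactivated
            return None
        return "wake" if self.strict(chunk) else None
```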
INFORMATION PROCESSOR, INFORMATION PROCESSING METHOD, AND PROGRAM
An information processor including: an operation control unit that controls a motion of an autonomous mobile body acting on the basis of recognition processing. In a case where a target sound, i.e., a target voice for voice recognition processing, is detected, the operation control unit moves the autonomous mobile body to a position, around an approach target, where an input level of a non-target sound that is not the target voice becomes lower, the approach target being determined on the basis of the target sound.
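As a rough illustration of "move to the position around the approach target where non-target sound is lowest", one might sample candidate positions on a ring around the target and score each by summed noise. Everything here is an assumption for the sketch: 2-D geometry, a 1/d² falloff model, and the function name.

```python
import math

def best_listening_position(target, noise_sources, radius=1.0, steps=16):
    """Sample positions on a circle around the approach target and return
    the one where the summed non-target sound level (1/d^2 model) is lowest."""
    def noise(p):
        return sum(1.0 / max(1e-6, (p[0] - nx) ** 2 + (p[1] - ny) ** 2)
                   for nx, ny in noise_sources)

    ring = [(target[0] + radius * math.cos(2 * math.pi * k / steps),
             target[1] + radius * math.sin(2 * math.pi * k / steps))
            for k in range(steps)]
    return min(ring, key=noise)
```

With a single noise source to the right of the target, the chosen spot lies on the far side of the target from the noise.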
Voice control method and apparatus, and computer storage medium
A voice control method can be applied to a first terminal and includes: receiving a user's voice operation instruction after the first terminal is activated, the voice operation instruction being used for controlling the first terminal to perform a target operation; sending an instruction execution request to a server after the voice operation instruction is received, the instruction execution request being used for requesting the server to determine, according to device information of terminals in a device network in which the first terminal is located, whether the first terminal is to respond to the voice operation instruction; and performing the target operation in a case where a response message is received from the server, the response message indicating that the first terminal is to respond to the voice operation instruction.
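The server-side arbitration step can be sketched as below. The `confidence` field is an assumed stand-in for whatever device information the server actually uses; the function name and message shapes are likewise hypothetical.

```python
def arbitrate(request, device_network):
    """Server-side sketch: grant the instruction execution request only
    to the network device best placed to respond (highest confidence)."""
    best = max(device_network, key=lambda d: d["confidence"])
    return {"respond": request["device_id"] == best["id"]}
```

Each terminal that heard the instruction sends such a request; only the one the server names performs the target operation, so a single utterance does not trigger every device in the network.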
Speech-processing system
A system may include first and second speech-processing systems with corresponding first and second wakewords. An utterance may contain two or more wakewords. The system determines which wakeword was spoken first and can send data to that wakeword's speech-processing system to perform further processing.
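The "which wakeword was spoken first" routing reduces, in a text-transcript sketch, to finding the earliest-occurring registered wakeword. The mapping of wakeword to system is an illustrative assumption.

```python
def route_by_first_wakeword(utterance, systems):
    """Route the utterance to the speech-processing system whose
    wakeword occurs earliest in the transcript; None if no wakeword."""
    hits = [(utterance.index(w), s) for w, s in systems.items() if w in utterance]
    return min(hits)[1] if hits else None
```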
MITIGATING FALSE POSITIVES AND/OR FALSE NEGATIVES IN HOT WORD FREE ADAPTATION OF AUTOMATED ASSISTANT
Hot word free adaptation of one or more function(s) of an automated assistant is performed responsive to determining, based on gaze measure(s) and/or active speech measure(s), that a user is engaging with the automated assistant. Implementations relate to various techniques for mitigating false positive and/or false negative occurrences of hot word free adaptation through utilization of personalized parameter(s) for at least some user(s) of an assistant device. The personalized parameter(s) are utilized in determining whether condition(s) are satisfied, where those condition(s), if satisfied, indicate that the user is engaging in hot word free interaction with the automated assistant and result in adaptation of function(s) of the automated assistant.
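A minimal sketch of the personalized-threshold condition check, assuming scalar gaze and speech measures in [0, 1]; the parameter names and default values are invented for illustration and are not from the patent.

```python
def is_engaging(gaze_measure, speech_measure, user_params):
    """Hot-word-free adaptation fires only when both measures clear the
    user's personalized thresholds (defaults here are illustrative)."""
    return (gaze_measure >= user_params.get("gaze_threshold", 0.8)
            and speech_measure >= user_params.get("speech_threshold", 0.6))
```

Raising a threshold for a given user trades false positives for false negatives, which is why the parameters are personalized rather than global.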
Methods and systems for predicting non-default actions against unstructured utterances
A method to adaptively predict non-default actions against unstructured utterances by an automated assistant operating in a computing system is provided. The method includes extracting voice features upon receiving an input utterance from at least one speaker by an automatic speech recognition (ASR) device, identifying the input utterance as an unstructured utterance based on the extracted voice features and a mapping drawn by the ASR device between the input utterance and one or more default actions, and obtaining at least one probable action to be performed in response to the unstructured utterance through a dynamic Bayesian network (DBN). The method further includes providing the at least one probable action obtained by the DBN to the speaker in order of the posterior probability of each action.
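The final ordering step — presenting probable actions by posterior probability — can be sketched directly; the DBN inference itself is out of scope here and the posteriors are assumed to be given.

```python
def rank_actions(posteriors):
    """Order the DBN's probable actions by posterior probability,
    highest first, the order in which they are offered to the speaker."""
    return sorted(posteriors, key=posteriors.get, reverse=True)
```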
SYSTEM FOR SETTING VOICE RECOGNITION RCU BY USING CLOUD SERVER, AND METHOD THEREFOR
A system for setting a voice recognition remote control unit (RCU) by using a cloud server, and a method therefor, are proposed. The system includes a communication interface configured to perform remote wireless communication with a device to be controlled and to receive a voice signal for registering device information of the device to be controlled, and an interface setting unit configured to set, in the communication interface, IR code information corresponding to the device to be controlled by comparing result data obtained by recognizing the voice signal with IR code information registered in the cloud server as a database (DB) for each item of device information (EDID).
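The comparison step can be sketched as a lookup of the recognized device information against the cloud-side DB. The substring match and field names below are assumptions for illustration; the patent's actual matching is not specified at this level.

```python
def resolve_ir_code(recognized_text, cloud_db):
    """Compare speech-recognition result data against the cloud DB of
    IR code entries and return the code to set on the RCU interface."""
    for entry in cloud_db:
        if entry["device_info"].lower() in recognized_text.lower():
            return entry["ir_code"]
    return None
```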
Visual responses to user inputs
Techniques for generating a visual response to a user input are described. A system may receive input data corresponding to a user input, determine that a first skill component is to determine a response to the user input, and determine that a second skill component is to determine supplemental content related to the user input. The system may also determine a template for presenting a visual response to the user input, where the template is configured for presenting the response and the supplemental content. The system may receive, from the first skill component, first image data corresponding to the response. The system may also receive, from the second skill component, second image data corresponding to the supplemental content. The system may send, to a device including a display, a command to present the first image data and the second image data using the template.
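The template-binding step can be sketched as mapping the two skills' image data into named template slots. The slot/role vocabulary (`"response"`, `"supplemental"`) is an assumption made for this sketch.

```python
def fill_template(slots, response_image, supplemental_image):
    """Bind the first skill's response image and the second skill's
    supplemental-content image to the display template's named slots."""
    images = {"response": response_image, "supplemental": supplemental_image}
    return {slot: images[role] for slot, role in slots.items()}
```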