G10L17/26

Apparatus, systems and methods for determining a commentary rating

Commentary rating determination systems and methods determine a commentary rating for commentary about a subject media content event that has been generated by a community member. An exemplary embodiment receives video information acquired by a 360° video camera, identifies a physical object from the received video information, determines a physical attribute associated with the identified physical object, wherein the determined physical attribute describes a characteristic of the identified physical object, compares the determined physical attribute of the identified physical object with a plurality of predefined physical object attributes stored in a database, and in response to identifying one of the plurality of predefined physical object attributes that matches the determined physical attribute, associates a quality value of the identified one of the plurality of predefined physical object attributes with the identified physical object. Then, the commentary rating is determined for the commentary based on the associated quality value.

METHOD FOR RECOGNIZING AT LEAST ONE NATURALLY EMITTED SOUND PRODUCED BY A REAL-LIFE SOUND SOURCE IN AN ENVIRONMENT COMPRISING AT LEAST ONE ARTIFICIAL SOUND SOURCE, CORRESPONDING APPARATUS, COMPUTER PROGRAM PRODUCT AND COMPUTER-READABLE CARRIER MEDIUM.
20220408184 · 2022-12-22

The disclosure relates to a method for recognizing at least one naturally emitted sound produced by a real-life sound source in an environment comprising at least one artificial sound source. The method is implemented by an audio recognition device, and it includes simultaneously obtaining a first audio signal from a first microphone located in the environment and a second audio signal from an audio acquisition device associated with the at least one artificial sound source; analyzing the first audio signal, delivering a first list of sound classes corresponding to sounds recognized in the first audio signal; analyzing the second audio signal, delivering a second list of sound classes corresponding to sounds recognized in the second audio signal; and delivering a third list of sound classes, comprising only sound classes included in the first list of sound classes which are not included in the second list of sound classes.
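
The final step of the abstract above amounts to a list difference, which can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name and the sound-class labels are hypothetical, and the two classifiers that produce the input lists are assumed to exist elsewhere.

```python
def natural_sound_classes(first_list, second_list):
    """Return the classes of first_list that do not appear in second_list.

    first_list:  classes recognized in the room microphone signal (assumed input)
    second_list: classes recognized in the artificial source's signal (assumed input)
    The order of first_list is preserved.
    """
    artificial = set(second_list)  # set gives constant-time membership tests
    return [cls for cls in first_list if cls not in artificial]


# Hypothetical labels: the room microphone hears everything, while the
# television's own audio feed explains "speech" and "doorbell".
room_classes = ["dog_bark", "doorbell", "speech"]
tv_classes = ["speech", "doorbell"]
third_list = natural_sound_classes(room_classes, tv_classes)
```

Here `third_list` keeps only the class that the artificial source cannot account for.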

ELECTRONIC APPARATUS AND METHOD OF CONTROLLING THE SAME
20220406308 · 2022-12-22

An electronic device includes a processor configured to: receive a user voice input, identify a state of the electronic device corresponding to at least one item related to the electronic device, select a voice recognition engine corresponding to the identified state, from among a plurality of voice recognition engines, based on correlations between the plurality of voice recognition engines and a plurality of states, and perform an operation corresponding to the user voice input based on the selected voice recognition engine.
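
One way to picture the engine selection described above is a lookup in a table of correlations between engines and device states. The sketch below is an assumption-laden illustration: the engine names, state names, correlation values, and the choice of "pick the highest correlation" are all hypothetical and are not taken from the abstract.

```python
# Hypothetical correlation table: engine name -> {device state -> correlation}.
CORRELATIONS = {
    "on_device_engine": {"power_off": 0.9, "playing_media": 0.4},
    "server_engine": {"power_off": 0.2, "playing_media": 0.8},
}


def select_engine(state, correlations=CORRELATIONS):
    """Pick the engine with the highest correlation to the identified state.

    An engine with no entry for the state is treated as having correlation 0.0
    (an assumption made for this sketch).
    """
    return max(correlations, key=lambda engine: correlations[engine].get(state, 0.0))


engine_when_off = select_engine("power_off")
engine_when_playing = select_engine("playing_media")
```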

Authentication Question Improvement Based on Vocal Confidence Processing

Methods, systems, and apparatuses are described herein for improving computer authentication processes using vocal confidence processing. A request for access to an account may be received. An authentication question may be provided to a user. Voice data indicating one or more vocal utterances by the user in response to the authentication question may be received. The voice data may be processed, and a first confidence score that indicates a degree of confidence of the user when answering the authentication question may be determined. An overall confidence score may be modified based on the first confidence score. Based on determining that the overall confidence score satisfies a threshold, data preventing the authentication question from being used in future authentication processes may be stored. The data may be removed when a time period expires.
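
The scoring and blocking flow described above can be sketched as follows. Several details are assumptions made only for illustration: the overall score is taken to be a running mean of per-answer confidence scores, "satisfies a threshold" is read as "falls at or below a cutoff", and the class and field names are hypothetical. The voice-processing step that produces each confidence score is out of scope here.

```python
from dataclasses import dataclass, field


@dataclass
class QuestionTracker:
    threshold: float = 0.5        # assumed cutoff for the overall score
    block_seconds: float = 3600.0  # assumed lifetime of the blocking data
    overall: dict = field(default_factory=dict)        # question_id -> (mean, count)
    blocked_until: dict = field(default_factory=dict)  # question_id -> expiry time

    def record(self, question_id, confidence, now):
        """Fold a new per-answer confidence score into the overall score."""
        mean, count = self.overall.get(question_id, (0.0, 0))
        count += 1
        mean += (confidence - mean) / count  # incremental running mean
        self.overall[question_id] = (mean, count)
        if mean <= self.threshold:
            # Store data that prevents future use of this question.
            self.blocked_until[question_id] = now + self.block_seconds
        return mean

    def is_usable(self, question_id, now):
        """Return True unless unexpired blocking data exists for the question."""
        expiry = self.blocked_until.get(question_id)
        if expiry is not None and now >= expiry:
            del self.blocked_until[question_id]  # time period expired
            return True
        return expiry is None


tracker = QuestionTracker(threshold=0.5, block_seconds=10.0)
tracker.record("first_pet", confidence=0.2, now=0.0)
```

Timestamps are passed in explicitly rather than read from a clock, which keeps the sketch deterministic and easy to test.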

PROHIBITING VOICE ATTACKS

In an approach for prohibiting voice attacks, a processor, in response to receiving a voice input from a source, determines, using a predetermined filter including an allowlist, that the voice input does not match any corresponding entry of the predetermined filter. A processor routes the voice input to an adversarial pipeline for processing. A processor identifies an adversarial example of the voice input using a predetermined connectionist temporal classification method. A processor generates a configurable distorted adversarial example using the adversarial example identified. In response to a user reply, a processor injects the configurable distorted adversarial example as noise into a voice stream of the user reply in real-time to alter the voice stream. A processor routes the altered voice stream to the source.

PRIVATE SPEECH FILTERINGS
20220406315 · 2022-12-22

In some examples, an electronic device comprises an image sensor to detect a user action, an audio input device to receive an audio signal, and a processor coupled to the audio input device and the image sensor. The processor is to determine that the audio signal includes private speech based on the user action, remove the private speech from the audio signal to produce a filtered audio signal, and transmit the filtered audio signal.
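
The removal step described above can be illustrated with a small sketch. It assumes the image-sensor side has already produced time intervals during which the user action marking private speech was detected; the function name, the interval representation, and the decision to drop samples outright (rather than mute them) are hypothetical choices for this example.

```python
def filter_private_speech(samples, sample_rate, private_intervals):
    """Drop samples whose timestamp falls inside any private interval.

    samples:           sequence of audio samples
    sample_rate:       samples per second
    private_intervals: list of (start_seconds, end_seconds), end exclusive,
                       assumed to come from the user-action detector
    """
    kept = []
    for index, sample in enumerate(samples):
        t = index / sample_rate  # timestamp of this sample in seconds
        if not any(start <= t < end for start, end in private_intervals):
            kept.append(sample)
    return kept


# Eight samples at 4 Hz; the user action was detected from 0.5 s to 1.0 s.
filtered = filter_private_speech(list(range(8)), 4, [(0.5, 1.0)])
```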

SYSTEM AND METHOD FOR COOPERATIVE PLAN-BASED UTTERANCE-GUIDED MULTIMODAL DIALOGUE
20220392454 · 2022-12-08

Methods and systems for multimodal conversational dialogue are disclosed. The multimodal conversational dialogue system includes multiple sensors to detect multimodal inputs from a user. The multimodal conversational dialogue system includes a multimodal semantic parser that performs semantic parsing and multimodal fusion of the multimodal inputs to determine a goal of the user. The multimodal conversational dialogue system includes a dialogue manager that generates a dialogue with the user in real-time. The dialogue includes system-generated utterances that are used to conduct a conversation between the user and the multimodal conversational dialogue system.

IDENTIFICATION SYSTEM DEVICE
20220374504 · 2022-11-24

An identification system device includes: an identification element processing unit that generates a plurality of identification elements based on sound information including a frequency of a sound source or a frequency of a sound; an ID conversion processing unit that generates an ID based on the sound information; an information generation processing unit that generates identification information by associating the ID with the identification elements; a memory unit that stores the identification information; and a judgment unit that compares the identification information with a plurality of newly generated identification elements to determine whether both are sound information from the same sound source. When the judgment unit determines that both are from the same sound source, the ID conversion processing unit generates a new ID related to the ID, and the information generation processing unit generates new identification information by associating the new ID with the newly generated identification elements.