G10L17/06

TECHNIQUES TO PROVIDE SENSITIVE INFORMATION OVER A VOICE CONNECTION

Embodiments may generally be directed to components and techniques to detect a request to provide banking account information over one or more voice connections, identify the requested banking account information, and generate speech data representing the requested banking account information. Embodiments further include communicating the speech data to another device.
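The flow described in the abstract can be sketched as follows. This is an illustrative stand-in only: the keyword table, account fields, and the byte-encoding "synthesizer" are assumptions, not details from the patent.

```python
# Sketch: detect a request for banking account information in a voice
# transcript, look up the requested field, synthesize speech data for it,
# and communicate that data to another device via a send callback.

REQUEST_KEYWORDS = {
    "routing number": "routing_number",
    "account number": "account_number",
}

ACCOUNTS = {"routing_number": "021000021", "account_number": "123456789"}

def detect_request(transcript: str):
    """Return the requested account field, if the utterance asks for one."""
    text = transcript.lower()
    for phrase, field in REQUEST_KEYWORDS.items():
        if phrase in text:
            return field
    return None

def synthesize(text: str) -> bytes:
    """Stand-in for a text-to-speech engine: return encoded speech data."""
    return text.encode("utf-8")

def handle_utterance(transcript: str, send) -> bool:
    """Detect, identify, generate, and communicate; True if handled."""
    field = detect_request(transcript)
    if field is None:
        return False
    send(synthesize(ACCOUNTS[field]))
    return True
```

In a real system `synthesize` would call a TTS engine and `send` would write to the voice connection; both are stubbed here to keep the sketch self-contained.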

HEARING SYSTEM INCLUDING A HEARING INSTRUMENT AND METHOD FOR OPERATING THE HEARING INSTRUMENT
20230047868 · 2023-02-16

A hearing system includes a hearing instrument for capturing a sound signal from an environment of the hearing instrument. The captured sound signal is processed, and the processed sound signal is output to a user of the hearing instrument. In a speech recognition step, the captured sound signal is analyzed to recognize speech intervals, in which the captured sound signal contains speech. In a speech enhancement procedure performed during recognized speech intervals, the amplitude of the processed sound signal is periodically varied according to a temporal pattern that is consistent with a stress rhythmic pattern of the user. A method for operating the hearing instrument is also provided.
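The amplitude variation described above can be sketched as a periodic gain applied only during recognized speech intervals. The 2 Hz stress rate, modulation depth, and sample rate below are illustrative assumptions; the patent ties the pattern to the user's own stress rhythm.

```python
import math

def enhance(samples, speech_mask, fs=16000, stress_rate_hz=2.0, depth=0.2):
    """Periodically vary the amplitude of the processed sound signal
    during speech intervals (sketch).

    samples      -- audio samples
    speech_mask  -- per-sample flags from the speech recognition step
    stress_rate_hz, depth -- illustrative stand-ins for a pattern
                    matched to the user's stress rhythm
    """
    out = []
    for n, (x, is_speech) in enumerate(zip(samples, speech_mask)):
        if is_speech:
            # Sinusoidal gain around unity at the assumed stress rate.
            gain = 1.0 + depth * math.sin(2 * math.pi * stress_rate_hz * n / fs)
        else:
            gain = 1.0  # leave non-speech intervals untouched
        out.append(x * gain)
    return out
```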

MULTI-USER VOICE ASSISTANT WITH DISAMBIGUATION

Disambiguating question answering responses by receiving voice command data associated with a first user, determining a first user identity according to the first user voice command data, determining a first user activity context according to the first user voice command data, determining a first response for the first user, receiving voice command data associated with a second user, determining a second user identity according to the second user voice command data, determining a second user activity context according to the second user voice command data, determining a second response for the second user, determining a predicted ambiguity between the first response and the second response, altering the first response according to the predicted ambiguity, and providing the first response and the second response.
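A minimal sketch of the disambiguation step: the ambiguity predictor here is a crude word-overlap heuristic and the alteration simply prefixes the addressee's name; both are assumptions standing in for whatever identity- and context-aware models the patent contemplates.

```python
def predicted_ambiguity(r1: str, r2: str) -> bool:
    """Crude stand-in predictor: responses that overlap heavily in
    wording are likely to be confused by the two users."""
    w1, w2 = set(r1.lower().split()), set(r2.lower().split())
    return len(w1 & w2) / max(len(w1 | w2), 1) > 0.5

def disambiguate(user1: str, r1: str, user2: str, r2: str):
    """Alter the responses when ambiguity is predicted, then provide both."""
    if predicted_ambiguity(r1, r2):
        r1 = f"{user1}, {r1}"
        r2 = f"{user2}, {r2}"
    return r1, r2
```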

Speaker based anaphora resolution
11580991 · 2023-02-14

A speech-processing system configured to determine entities corresponding to ambiguous words such as anaphora (“he,” “she,” “they,” etc.) included in an utterance. The system may associate incoming utterances with a speaker identification (ID), device ID, and other data. The system then tracks entities referred to in utterances so that if a later utterance includes an ambiguous entity reference, the system may take the speaker ID, device ID, etc. from the ambiguous reference, along with the text of the utterance and other data, and compare that information to previously mentioned entities (or other entities that may be relevant) to identify the entity mentioned in the ambiguous statement. Once the entity is determined, the system may then complete command processing of the utterance using the identified entity.
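The tracking-and-lookup idea can be sketched as below. The pronoun-to-gender table and the "same speaker or same device" preference are illustrative simplifications of the comparison against previously mentioned entities that the abstract describes.

```python
PRONOUNS = {"he": "male", "she": "female", "they": "group"}

class AnaphoraResolver:
    """Track entities per speaker/device; resolve ambiguous references."""

    def __init__(self):
        self.history = []  # (speaker_id, device_id, entity, category)

    def note_entity(self, speaker_id, device_id, entity, category):
        """Record an entity mentioned in an utterance."""
        self.history.append((speaker_id, device_id, entity, category))

    def resolve(self, speaker_id, device_id, pronoun):
        """Return the most recent matching entity, preferring mentions
        by the same speaker or on the same device; None if unresolved."""
        wanted = PRONOUNS.get(pronoun.lower())
        for spk, dev, entity, category in reversed(self.history):
            if category == wanted and (spk == speaker_id or dev == device_id):
                return entity
        return None
```

Once `resolve` returns an entity, command processing of the utterance can proceed with it, as the abstract describes.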

Target character video clip playing method, system and apparatus, and storage medium
11580742 · 2023-02-14

Provided are a target character video clip playing method, system, and apparatus, and a storage medium. The method comprises: using image recognition to perform target character recognition on an entire video, locating a plurality of video clips containing target characters, and obtaining a first playing time period set corresponding to the video clips; obtaining, according to the audio clips marked for each character within the entire video, a second playing time period set corresponding to the audio clips of the various characters; merging the time periods included in the playing time period sets to obtain a sum playing time period set for the target characters; and playing the video clips of the target characters in the order of the playing timelines within the sum playing time period set.
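The merging step can be sketched as a standard interval-union pass over the combined time period sets; the tuple representation `(start, end)` in seconds is an assumption for illustration.

```python
def merge_periods(video_periods, audio_periods):
    """Combine the first (image-based) and second (audio-based) playing
    time period sets and merge overlapping intervals, yielding the
    sorted 'sum' playing time period set for the target characters."""
    periods = sorted(video_periods + audio_periods)
    merged = []
    for start, end in periods:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous period: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

Playback then simply iterates the merged set in order of the playing timelines.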

Local voice data processing

Example techniques relate to local voice control in a media playback system. A satellite device (e.g., a playback device or microcontroller unit) may be configured to recognize a local set of keywords in voice inputs including context specific keywords (e.g., for controlling an associated smart device) as well as keywords corresponding to a subset of media playback commands for controlling playback devices in the media playback system. The satellite device may fall back to a hub device (e.g., a playback device) configured to recognize a more extensive set of keywords. In some examples, either device may fall back to the cloud for processing of other voice inputs.
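The tiered fallback can be sketched as below. The keyword sets and the `cloud` callback are placeholders; the abstract does not enumerate the actual command vocabularies.

```python
# Sketch: a satellite device recognizes a small local keyword set, falls
# back to a hub device with a more extensive set, and finally to the cloud.

SATELLITE_KEYWORDS = {"play", "pause", "light on", "light off"}
HUB_KEYWORDS = SATELLITE_KEYWORDS | {"next track", "skip", "group volume up"}

def process(voice_input, cloud):
    """Return (tier, result) for the tier that handled the input."""
    if voice_input in SATELLITE_KEYWORDS:
        return ("satellite", voice_input)
    if voice_input in HUB_KEYWORDS:
        return ("hub", voice_input)
    # Neither local set matched: fall back to cloud processing.
    return ("cloud", cloud(voice_input))
```

Checking the satellite set first keeps the common, latency-sensitive commands fully local.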