Patent classifications
G10L21/00
Altering undesirable communication data for communication sessions
This disclosure describes techniques implemented partly by a communications service for identifying and altering undesirable portions of communication data, such as audio data and video data, from a communication session between computing devices. For example, the communications service may monitor the communication session to alter or remove undesirable audio data, such as a dog barking or a doorbell ringing, and/or video data, such as rude gestures or inappropriate facial expressions. The communications service may stream the communication data for the communication session partly through managed servers and analyze the communication data to detect undesirable portions. The communications service may alter or remove the undesirable portions of communication data received from a first user device, such as by filtering, refraining from transmitting, or modifying those portions. The communications service may then send the modified communication data to a second user device engaged in the communication session.
Devices for encoding and decoding a watermarked signal
An electronic device configured for encoding a watermarked signal is described. The electronic device includes modeler circuitry. The modeler circuitry determines parameters based on a first signal and a first-pass coded signal. The electronic device also includes coder circuitry coupled to the modeler circuitry. The coder circuitry performs a first-pass coding on a second signal to obtain the first-pass coded signal and performs a second-pass coding based on the parameters to obtain a watermarked signal.
Devices for encoding and detecting a watermarked signal
A method for decoding a signal on an electronic device is described. The method includes receiving a signal. The method also includes extracting a bitstream from the signal. The method further includes performing watermark error checking on the bitstream for multiple frames. The method additionally includes determining whether watermark data is detected based on the watermark error checking. The method also includes decoding the bitstream to obtain a decoded second signal if the watermark data is not detected.
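The multi-frame detection step in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which error check is used or how per-frame results are combined, so the 8-bit CRC-derived check value, the `min_passing` vote threshold, and all function names here are hypothetical.

```python
import zlib


def frame_check_passes(candidate_bits: bytes, check_value: int) -> bool:
    """Return True if the candidate watermark bits match their check value.

    Assumption: each frame carries an 8-bit check derived from CRC-32.
    The abstract does not name a specific error-checking code.
    """
    return (zlib.crc32(candidate_bits) & 0xFF) == check_value


def watermark_detected(frames, min_passing: int) -> bool:
    """Decide over multiple frames whether watermark data is present.

    `frames` is a sequence of (candidate_bits, check_value) pairs taken
    from the extracted bitstream. Assumption: the watermark counts as
    detected when at least `min_passing` frames pass the check.
    """
    passing = sum(frame_check_passes(bits, check) for bits, check in frames)
    return passing >= min_passing


def choose_decoding(frames, min_passing: int) -> str:
    """Pick a decoding path from the detection result (labels are hypothetical)."""
    if watermark_detected(frames, min_passing):
        return "watermark_path"
    # Per the abstract: with no watermark detected, the bitstream is
    # decoded to obtain the decoded second signal.
    return "decode_second_signal"
```

Checking several frames before deciding makes a chance match on a single frame less likely to be mistaken for a watermark.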
Natural language dialogue method and natural language dialogue system
A natural language dialog method and a natural language dialog system are provided. In the method, a first speech input is received and parsed to generate at least one keyword included in the first speech input, so that a candidate list including at least one report answer is obtained. According to a properties database, one report answer is selected from the candidate list, and a first speech response is output according to the report answer. Other speech inputs are received, and a user's preference data is captured from the speech inputs. The user's preference data is stored in the properties database.
Method and apparatus of suppressing vocoder noise
A method and apparatus for suppressing vocoder noise are provided. In the method, first information and second information are received from a channel decoder, the first information indicating whether a decoded data frame has an error and the second information being a channel quality metric. Error concealment voice decoding is performed on the decoded data frame if the first information indicates that no channel decoding error has been generated and the second information is smaller than a predetermined first threshold. Normal voice decoding is performed on the decoded data frame if the first information indicates that no channel decoding error has been generated and the second information is equal to or larger than the first threshold.
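The decision rule in the abstract above can be written as a small selection function. This is a minimal sketch, not the patented implementation: the names `DecodeMode` and `select_decode_mode` are hypothetical, and the branch for a frame that does have a channel decoding error is an assumption, since the abstract only covers the error-free case.

```python
from enum import Enum


class DecodeMode(Enum):
    NORMAL = "normal"
    ERROR_CONCEALMENT = "error_concealment"


def select_decode_mode(frame_has_error: bool,
                       quality_metric: float,
                       first_threshold: float) -> DecodeMode:
    """Choose a voice decoding mode for one decoded data frame.

    frame_has_error: the "first information" from the channel decoder.
    quality_metric:  the "second information" (channel quality metric).
    """
    if frame_has_error:
        # Assumption: the abstract does not describe this case; applying
        # error concealment to an erroneous frame is a common default.
        return DecodeMode.ERROR_CONCEALMENT
    if quality_metric < first_threshold:
        # No decoding error reported, but channel quality is poor.
        return DecodeMode.ERROR_CONCEALMENT
    # No decoding error and quality at or above the threshold.
    return DecodeMode.NORMAL
```

The second branch is the notable one: a frame that passes the channel decoder's error check is still concealed when the quality metric falls below the threshold.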
Systems and methods for analyzing audio characteristics and generating a uniform soundtrack from multiple sources
Systems and methods analyze audio characteristics of digital audio files and generate a uniform soundtrack based on more than one of the digital audio files. The systems and methods comprise equalizing a content of each input digital audio file such that all input digital audio files are processible as a group. The input digital audio files are storable in a database and comprise a list of original recorded digital audio files from original recorded digital video files that were recorded from at least two different digital sources at the same event. The input digital audio files have previously been synchronized such that exact locations of the input digital audio files within the same event have been determined or identified. Moreover, the systems and methods comprise analyzing audio characteristics of the input digital audio files to detect a content quality of each input digital audio file, retrieving highest possible content qualities of the input digital audio files by cleaning the input digital audio files, and generating a unified soundtrack for one or more portions of the same event by merging more than one of the input digital audio files into an output digital audio file.
Communication system
Systems and methods for responding to spoken language input or multi-modal input are described herein. More specifically, one or more user intents are determined or inferred from the spoken language input or multi-modal input to determine one or more user goals via a dialogue belief tracking system. The systems and methods disclosed herein utilize the dialogue belief tracking system to perform actions based on the determined one or more user goals and allow a device to engage in human-like conversation with a user over multiple turns of a conversation. Sparing the user from having to explicitly state each intent and desired goal while still receiving the desired goal from the device improves a user's ability to accomplish tasks, perform commands, and get desired products and/or services. Additionally, the improved response to spoken language inputs from a user improves user interactions with the device.
Management, replacement and removal of explicit lyrics during audio playback
Unwanted audio, such as explicit language, may be removed during audio playback. An audio player may identify and remove unwanted audio while playing an audio stream. Unwanted audio may be replaced with alternate audio, such as non-explicit lyrics, a “beep”, or silence. Metadata may be used to describe the location of unwanted audio within an audio stream to enable the removal or replacement of the unwanted audio with alternate audio. An audio player may switch between clean and explicit versions of a recording based on the locations described in the metadata. The metadata, as well as both the clean and explicit versions of the audio data, may be part of a single audio file, or the metadata may be separate from the audio data. Additionally, real-time recognition analysis may be used to identify unwanted audio during audio playback.
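The metadata-driven replacement described in the abstract above can be illustrated with a short sketch. It assumes audio is a list of integer samples, that the clean and explicit versions are sample-aligned and of equal length, and that the metadata is a list of `(start, end)` sample ranges; the function name `render_playback` and this metadata layout are hypothetical.

```python
def render_playback(explicit, flagged_ranges, clean=None):
    """Return the samples to play, with flagged ranges replaced.

    explicit:       samples of the explicit recording.
    flagged_ranges: (start, end) sample indices, end exclusive, taken
                    from metadata describing where unwanted audio sits.
    clean:          optional sample-aligned clean version; when omitted,
                    flagged ranges are replaced with silence (zeros).
    """
    out = list(explicit)
    for start, end in flagged_ranges:
        # Clamp the range to the recording so bad metadata cannot
        # lengthen or shift the output.
        s = max(0, start)
        e = min(end, len(out))
        if s >= e:
            continue
        if clean is not None:
            out[s:e] = clean[s:e]      # switch to the clean version
        else:
            out[s:e] = [0] * (e - s)   # mute the flagged span
    return out
```

For example, `render_playback([1, 2, 3, 4, 5, 6], [(2, 4)])` mutes samples 2 and 3 and leaves the rest of the stream untouched.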
Method and system for confidential sentiment analysis
A confidential sentiment analysis method includes receiving call data, storing the call data including interaction metadata, generating a speech-to-text transcript corresponding to words spoken by one or more callers, generating an anonymized transcript by anonymizing personally identifiable words, and generating a sentiment score by analyzing the anonymized transcript. A computing system includes a processor and a memory including computer-executable instructions that, when executed by the processor, cause the system to receive call data, store the call data, generate a speech-to-text transcript, generate an anonymized transcript by anonymizing personally identifiable words, and generate a sentiment score based on the anonymized transcript. A non-transitory computer-readable medium contains program instructions that, when executed, cause a computer system to receive call data, store the call data, generate a speech-to-text transcript, generate an anonymized transcript by anonymizing personally identifiable words, and generate a sentiment score based on the anonymized transcript.
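The last two steps of the method above, anonymizing the transcript and then scoring only the anonymized text, can be sketched as follows. Everything here is an illustrative assumption: the abstract does not specify how personally identifiable words are found or how sentiment is computed, so the caller-supplied `pii_words` set, the digit rule, the tiny word lists, and the function names are all hypothetical.

```python
import re

# Hypothetical toy lexicons; a real system would use a trained model.
POSITIVE = {"great", "thanks", "happy"}
NEGATIVE = {"angry", "terrible", "cancel"}


def anonymize(transcript: str, pii_words: set) -> str:
    """Replace personally identifiable words with a placeholder.

    Assumption: a word is PII if it appears in `pii_words` (lowercase)
    or contains a digit, e.g. a phone or account number.
    """
    out = []
    for token in transcript.split():
        core = token.strip(".,!?").lower()
        if core in pii_words or any(ch.isdigit() for ch in core):
            out.append("[REDACTED]")
        else:
            out.append(token)
    return " ".join(out)


def sentiment_score(text: str) -> float:
    """Return a score in [-1, 1] from positive and negative word counts."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def score_call(transcript: str, pii_words: set) -> float:
    """Score only the anonymized transcript, as in the described method."""
    return sentiment_score(anonymize(transcript, pii_words))
```

Scoring runs on the output of `anonymize`, so the sentiment step never sees the redacted words.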
System and method for providing network coordinated conversational services
A system and method are described for providing automatic and coordinated sharing of conversational resources, e.g., functions and arguments, between network-connected servers and devices and their corresponding applications. In one aspect, a system for providing automatic and coordinated sharing of conversational resources includes a network having first and second network devices, each comprising a set of conversational resources, a dialog manager for managing a conversation and executing calls requesting a conversational service, and a communication stack for communicating messages over the network using conversational protocols. The conversational protocols establish coordinated network communication between the dialog managers of the first and second network devices to automatically share the sets of conversational resources of the two devices, when necessary, to perform their respective requested conversational services.