Patent classifications
G10L17/02
COMPUTER-IMPLEMENTED DETECTION OF ANOMALOUS TELEPHONE CALLS
Computer-implemented detection of anomalous telephone calls, for example detection of interconnect bypass fraud, is disclosed. A telephone call associated with user devices is analyzed remote from the user devices. A first set of multiple features, for example Mel Frequency Cepstral Coefficients, is derived from a call audio stream. The first set is converted to an embedding vector, for example via a model based on a Universal Background Model comprising a Gaussian Mixture Model, which model is preferably configured based on a training plurality of first sets of multiple features derived from a corresponding training plurality of audio streams. Occurrence, or probability of occurrence, of an anomalous telephone call is determined based on the embedding vector, for example via a back-end classifier, such as a Gaussian Backend Model, which classifier is preferably configured based on labels associated with the training plurality of audio streams.
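The pipeline in this abstract (frame features, a GMM-based background model, an embedding vector, a Gaussian back-end) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the UBM parameters are fixed by hand rather than trained, random vectors stand in for Mel Frequency Cepstral Coefficients, the embedding is a MAP-adapted mean offset supervector, and the back-end is a two-class Gaussian with a shared diagonal covariance. The names `embed_call`, `GaussianBackend`, and `fake_call` are hypothetical.

```python
import numpy as np

def ubm_posteriors(feats, weights, means, variances):
    """Responsibility of each diagonal-covariance UBM component for each frame."""
    diff = feats[:, None, :] - means[None, :, :]             # (T, C, D)
    log_pdf = -0.5 * np.sum(diff ** 2 / variances + np.log(2 * np.pi * variances), axis=2)
    log_post = np.log(weights) + log_pdf                     # (T, C)
    log_post -= log_post.max(axis=1, keepdims=True)          # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def embed_call(feats, weights, means, variances, relevance=16.0):
    """MAP-adapt the UBM means to one call; return the stacked mean offsets."""
    post = ubm_posteriors(feats, weights, means, variances)
    n = post.sum(axis=0)                                     # zeroth-order statistics (C,)
    first = post.T @ feats                                   # first-order statistics (C, D)
    alpha = (n / (n + relevance))[:, None]
    adapted = alpha * first / np.maximum(n, 1e-8)[:, None] + (1 - alpha) * means
    return (adapted - means).ravel()

class GaussianBackend:
    """Two-class Gaussian classifier with a shared diagonal covariance (assumed form)."""

    def fit(self, X, y):
        self.mu0 = X[y == 0].mean(axis=0)                    # normal calls
        self.mu1 = X[y == 1].mean(axis=0)                    # anomalous calls
        centered = np.where((y == 1)[:, None], X - self.mu1, X - self.mu0)
        self.var = centered.var(axis=0) + 1e-6
        return self

    def llr(self, x):
        """Log-likelihood ratio; positive values favour the anomalous class."""
        ll1 = -0.5 * np.sum((x - self.mu1) ** 2 / self.var)
        ll0 = -0.5 * np.sum((x - self.mu0) ** 2 / self.var)
        return ll1 - ll0

# Hand-set toy UBM: 2 components over 3-dimensional stand-in features.
rng = np.random.default_rng(0)
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0, 0.0], [3.0, 3.0, 3.0]])
variances = np.ones((2, 3))

def fake_call(shift, n_frames=200):
    """Synthetic frames; 'shift' mimics a channel effect on anomalous calls."""
    comp = rng.integers(0, 2, size=n_frames)
    return means[comp] + shift + rng.normal(size=(n_frames, 3))

labels = np.array([0] * 20 + [1] * 20)
train = np.stack([embed_call(fake_call(0.8 * y), weights, means, variances) for y in labels])
backend = GaussianBackend().fit(train, labels)

score_normal = backend.llr(embed_call(fake_call(0.0), weights, means, variances))
score_anomalous = backend.llr(embed_call(fake_call(0.8), weights, means, variances))
```

In a real system the UBM would be fitted on the training feature sets and the decision threshold on the log-likelihood ratio would be tuned on held-out labelled calls; the sketch only shows how the stages connect.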
SMART SPEAKER, MULTI-VOICE ASSISTANT CONTROL METHOD, AND SMART HOME SYSTEM
The invention discloses a smart loudspeaker that includes a voice input module, a language recognition module, and at least two voice assistants. The language recognition module receives voice information from the voice input module, determines the language category based on the voice information, and activates the voice assistant corresponding to that language category.
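The control flow described here (recognise the language, then activate the matching assistant) amounts to a dispatch table. The sketch below is an illustration under stated assumptions, not the patented design: text strings stand in for audio, `toy_identify_language` is a hypothetical placeholder for a real language recognition module, and the two assistants are stubs.

```python
from typing import Callable, Dict

Assistant = Callable[[str], str]

def toy_identify_language(voice_information: str) -> str:
    """Hypothetical stand-in: treat any CJK ideograph as Chinese, else English."""
    if any("\u4e00" <= ch <= "\u9fff" for ch in voice_information):
        return "zh"
    return "en"

class SmartSpeaker:
    """Routes voice information to the assistant registered for its language."""

    def __init__(self, identify_language: Callable[[str], str],
                 assistants: Dict[str, Assistant]) -> None:
        if len(assistants) < 2:
            raise ValueError("at least two voice assistants are required")
        self.identify_language = identify_language
        self.assistants = assistants

    def handle(self, voice_information: str) -> str:
        language = self.identify_language(voice_information)
        assistant = self.assistants.get(language)
        if assistant is None:
            return f"no assistant registered for language '{language}'"
        return assistant(voice_information)

speaker = SmartSpeaker(
    toy_identify_language,
    {
        "en": lambda text: f"[en assistant] {text}",
        "zh": lambda text: f"[zh assistant] {text}",
    },
)
reply_en = speaker.handle("turn on the light")
reply_zh = speaker.handle("打开灯")
```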
INFORMATION TRANSMISSION DEVICE, INFORMATION RECEPTION DEVICE, INFORMATION TRANSMISSION METHOD, RECORDING MEDIUM, AND SYSTEM
An information transmission device according to the present disclosure includes: an acoustic feature calculator that calculates an acoustic feature of a spoken voice; a speaker feature calculator that calculates a speaker feature from the acoustic feature using a deep neural network (DNN), the speaker feature being a feature unique to a speaker of the spoken voice; an analyzer that analyzes condition information indicating a condition to be used in calculating the speaker feature, based on the spoken voice; and an information transmitter that transmits the speaker feature and the condition information to an information reception device that performs speaker recognition processing on the spoken voice, as information to be used by the information reception device to recognize the speaker of the spoken voice.
Automatic detection and analytics using sensors
A method and device for automatic meeting detection and analysis. A mobile electronic device includes multiple sensors configured to selectively capture sensor data. A classifier is configured to analyze the sensor data to detect a meeting zone for a meeting with multiple participants. A processor device is configured to control the multiple sensors and the classifier to trigger sensor data capture.
Speech feature extraction apparatus, speech feature extraction method, and computer-readable storage medium
A speech feature extraction apparatus 100 includes: a voice activity detection unit 103 that drops non-voice frames from frames corresponding to an input speech utterance and calculates a posterior of being voiced for each frame; a voice activity detection process unit 106 that calculates function values, used as weights in pooling frames to produce an utterance-level feature, from the given voice activity detection posteriors; and an utterance-level feature extraction unit 112 that extracts an utterance-level feature from the frames on the basis of multiple frame-level features, using the function values.
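The weighting scheme in this abstract can be sketched as a weighted mean over frames. The following is a minimal illustration under assumptions that the abstract does not specify: frames are dropped when their voiced posterior falls below a fixed threshold, the weighting function defaults to the identity, and pooling is a weighted average. The name `pool_utterance` and its parameters are hypothetical.

```python
from typing import Callable, List, Sequence

def pool_utterance(
    frames: Sequence[Sequence[float]],
    vad_posteriors: Sequence[float],
    threshold: float = 0.5,
    weight_fn: Callable[[float], float] = lambda p: p,
) -> List[float]:
    """Drop low-posterior frames, then average the rest weighted by weight_fn(posterior)."""
    kept = [(f, weight_fn(p)) for f, p in zip(frames, vad_posteriors) if p >= threshold]
    if not kept:
        raise ValueError("no voiced frames above the threshold")
    total = sum(w for _, w in kept)
    if total <= 0:
        raise ValueError("pooling weights must sum to a positive value")
    dim = len(kept[0][0])
    return [sum(w * f[d] for f, w in kept) / total for d in range(dim)]

# Three 2-dimensional frame-level features; the last frame is treated as non-voice.
frame_features = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
posteriors = [0.5, 1.0, 0.1]
utterance_feature = pool_utterance(frame_features, posteriors)
```

With these inputs the third frame is discarded and the second frame counts twice as much as the first, so the pooled feature sits closer to `[3.0, 4.0]` than a plain average would.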
Speaker identity and content de-identification
One embodiment of the invention provides a method for speaker identity and content de-identification under privacy guarantees. The method comprises receiving input indicative of privacy protection levels to enforce, extracting features from a speech recorded in a voice recording, recognizing and extracting textual content from the speech, parsing the textual content to recognize privacy-sensitive personal information about an individual, generating de-identified textual content by anonymizing the personal information to an extent that satisfies the privacy protection levels and conceals the individual's identity, and mapping the de-identified textual content to a speaker who delivered the speech. The method further comprises generating a synthetic speaker identity based on other features that are dissimilar from the features to an extent that satisfies the privacy protection levels, and synthesizing a new speech waveform based on the synthetic speaker identity to deliver the de-identified textual content. The new speech waveform conceals the speaker's identity.