Patent classifications
G10L21/043
Keyword detection method and related apparatus
A keyword detection method includes: obtaining an enhanced speech signal of a to-be-detected speech signal, the enhanced speech signal corresponding to a target speech speed; performing speed adjustment on the enhanced speech signal to obtain a first speed-adjusted speech signal having a first speech speed, the first speech speed being different from the target speech speed; obtaining a first speech feature signal according to the first speed-adjusted speech signal; obtaining a detection result according to a first keyword detection result corresponding to the first speech feature signal, the detection result indicating whether a target keyword exists in the to-be-detected speech signal; and performing an operation corresponding to the target keyword in response to determining that the target keyword exists according to the detection result.
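The claimed pipeline (enhance, adjust speed, extract features, detect) can be sketched as toy code. This is a minimal illustration, not the patent's method: `adjust_speed` uses naive linear-interpolation resampling, and `detect_keyword` is a placeholder energy threshold standing in for a real keyword model; all names and parameters are hypothetical.

```python
import numpy as np

def adjust_speed(signal: np.ndarray, factor: float) -> np.ndarray:
    """Naive speed adjustment by linear-interpolation resampling.
    factor > 1 speeds the signal up; factor < 1 slows it down."""
    n_out = int(len(signal) / factor)
    old_idx = np.linspace(0, len(signal) - 1, n_out)
    return np.interp(old_idx, np.arange(len(signal)), signal)

def frame_energy_features(signal: np.ndarray, frame: int = 160) -> np.ndarray:
    """Toy per-frame log-energy features standing in for a real front end."""
    n = len(signal) // frame
    frames = signal[: n * frame].reshape(n, frame)
    return np.log1p(np.sum(frames ** 2, axis=1))

def detect_keyword(features: np.ndarray, threshold: float = 1.0) -> bool:
    """Placeholder detector: flags a keyword if any frame exceeds the threshold."""
    return bool(np.any(features > threshold))

# Stand-in for the enhanced speech signal at the target speech speed.
enhanced = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 8000))
# First speed-adjusted signal: a speech speed different from the target.
slowed = adjust_speed(enhanced, factor=0.5)
feats = frame_energy_features(slowed)
result = detect_keyword(feats)
```

A real implementation would run the detector on features from each speed-adjusted copy and combine the per-speed results into the final detection result.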
Daydream-aware information recovery system
A system includes a sensing module configured to collect physiological information from a user; an audio input device configured to capture auditory content that is in an environment proximate the user; and one or more processors configured to: determine that a user has entered a state of inwardly focused attention based on first physiological information collected from the user; and in response, record the auditory content captured by the audio input device.
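The trigger logic described (physiological signal crosses a state boundary, then ambient audio is buffered) can be sketched as a small controller. Everything here is hypothetical: the `attention_score` measure, the threshold, and the class and method names are illustrative, not the patent's design.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class InwardFocusRecorder:
    """Toy controller: starts buffering ambient audio once a physiological
    reading suggests the user has entered inwardly focused attention."""
    focus_threshold: float = 0.3
    recording: bool = False
    buffer: List[float] = field(default_factory=list)

    def on_physiology(self, attention_score: float) -> None:
        # attention_score is a hypothetical 0..1 outward-attention measure
        # derived from the sensing module's physiological information.
        if attention_score < self.focus_threshold:
            self.recording = True

    def on_audio_frame(self, frame: float) -> None:
        # Audio frames are only retained once the state has been detected.
        if self.recording:
            self.buffer.append(frame)

rec = InwardFocusRecorder()
rec.on_audio_frame(0.1)   # ignored: user still outwardly attentive
rec.on_physiology(0.1)    # score below threshold -> start recording
rec.on_audio_frame(0.2)   # captured for later recovery
```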
Systems and methods for generating a visual color display of audio-file data
Systems and methods for generating a visual color display of audio-file data are provided. The system includes a processor that performs a method including receiving audio-file data and generating filtered-audio data by processing the audio-file data through frequency-band filters, the frequency-band filters having different frequency bands. The method includes generating one or more waveforms corresponding to the filtered-audio data and displaying the waveforms superimposed in unique colors relative to one another. The method includes downsampling the waveforms, processing the waveforms through an envelope detector, and processing the waveforms through an expander and applying a gain factor. The waveforms have transparency levels at sections that are proportional, or inversely proportional, to the amplitudes at those sections.
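The processing chain (band filtering, downsampling, envelope detection, amplitude-to-transparency mapping) can be sketched as follows. This is a rough stand-in under stated assumptions: the FFT-mask band filter, the decimation factor, the band edges, and the function names are all illustrative, and the expander/gain stage is omitted.

```python
import numpy as np

def band_filter_fft(x: np.ndarray, sr: int, lo: float, hi: float) -> np.ndarray:
    """Crude band-pass via FFT masking (stands in for the claimed filters)."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1.0 / sr)
    spec[(freqs < lo) | (freqs >= hi)] = 0.0
    return np.fft.irfft(spec, n=len(x))

def envelope(x: np.ndarray, win: int = 64) -> np.ndarray:
    """Rectify-and-smooth envelope detector."""
    kernel = np.ones(win) / win
    return np.convolve(np.abs(x), kernel, mode="same")

def alpha_from_amplitude(env: np.ndarray, inverse: bool = False) -> np.ndarray:
    """Map per-sample amplitude to a 0..1 transparency level,
    proportional or (with inverse=True) inversely proportional."""
    a = env / (env.max() + 1e-12)
    return 1.0 - a if inverse else a

sr = 8000
t = np.linspace(0, 1, sr, endpoint=False)
audio = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 2000 * t)

bands = [(0, 1000), (1000, 4000)]  # hypothetical band edges, one color each
waveforms = [band_filter_fft(audio, sr, lo, hi) for lo, hi in bands]
downsampled = [w[::8] for w in waveforms]  # simple decimation
alphas = [alpha_from_amplitude(envelope(w)) for w in downsampled]
```

Each `alphas` array could then drive the per-section transparency of its band's colored waveform when the waveforms are drawn superimposed.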
METHOD FOR PROTECTING ANONYMITY FROM MEDIA-BASED INTERACTIONS
Disclosed herein are computing device, method, and computer-readable medium embodiments for altering video and/or audio data during a media-based interaction to protect the anonymity of one or more users in the interaction, with the purpose of removing bias from the interaction. In some aspects, video and audio data including facial information and speech data of a user may be streamed during the media-based interaction between client devices. The video and audio data may be altered in real time during the interaction to change the visual and/or audio aspects of one or more users. These alterations may be based on identifying visual and audio features from a list of one or more visual and audio identifiers associated with age, sex, and/or gender. The alterations result in a new virtual representation, presented during the interaction, that anonymizes the user.
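On the audio side, one simple form of real-time voice alteration is per-frame pitch shifting. The sketch below is only a toy illustration of that idea, not the patent's technique: it resamples each frame, which also changes its duration (a real anonymizer would use something duration-preserving such as a phase vocoder), and the function names and factor are hypothetical.

```python
import numpy as np

def naive_pitch_shift(frame: np.ndarray, factor: float) -> np.ndarray:
    """Resample a short audio frame; played back at the original rate,
    factor > 1 raises the pitch (and shortens the frame)."""
    n_out = int(round(len(frame) / factor))
    idx = np.linspace(0, len(frame) - 1, n_out)
    return np.interp(idx, np.arange(len(frame)), frame)

def anonymize_stream(frames, factor: float = 1.2):
    """Apply the per-frame pitch alteration across a streamed audio signal."""
    return [naive_pitch_shift(f, factor) for f in frames]

frames = [np.linspace(-1.0, 1.0, 100)]  # one stand-in 100-sample frame
altered = anonymize_stream(frames)
```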
SOUND SIGNAL REFINEMENT METHOD, SOUND SIGNAL DECODE METHOD, APPARATUS THEREOF, PROGRAM, AND STORAGE MEDIUM
There is provided a technology that, in a case where a sound signal derived from the same source sound signal is obtained from a code different from the code from which a decoded sound signal is obtained, improves the decoded sound signal by using the sound signal obtained from the different code. A decoded sound common signal is obtained by downmixing the decoded sound signal of each channel; the signal obtained by upmixing it (hereinafter, the upmixed common signal) is subjected to signal purification using the signal obtained by upmixing a monaural decoded sound signal (hereinafter, the upmixed monaural decoded sound signal), thereby generating a purified upmixed signal. In each channel, the upmixed common signal is subtracted from the decoded sound signal and the purified upmixed signal is added, thereby generating a purified decoded sound signal.
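The per-channel arithmetic (subtract the upmixed common signal, add the purified upmixed signal) can be written out numerically. This sketch makes illustrative assumptions: the downmix is a plain channel mean, the upmix is a plain copy into each channel, and the "purification" is a simple blend of the two estimates, whereas the patent leaves these operations to the codec.

```python
import numpy as np

def downmix(channels: np.ndarray) -> np.ndarray:
    """Decoded sound common signal: per-sample mean across channels (toy choice)."""
    return np.mean(channels, axis=0)

def upmix(mono: np.ndarray, n_channels: int) -> np.ndarray:
    """Trivial upmix: copy the mono signal into each channel."""
    return np.tile(mono, (n_channels, 1))

def purify(upmixed_common: np.ndarray, upmixed_mono: np.ndarray,
           weight: float = 0.5) -> np.ndarray:
    """Toy purification: blend the upmixed common signal with the
    upmixed monaural decoded sound signal."""
    return (1 - weight) * upmixed_common + weight * upmixed_mono

decoded = np.array([[1.0, 2.0, 3.0],       # decoded sound signal, channel 0
                    [3.0, 2.0, 1.0]])      # decoded sound signal, channel 1
mono_decoded = np.array([2.1, 1.9, 2.0])   # monaural decode from the other code

upmixed_common = upmix(downmix(decoded), 2)
purified_up = purify(upmixed_common, upmix(mono_decoded, 2))
# Per channel: subtract the upmixed common signal, add the purified signal.
refined = decoded - upmixed_common + purified_up
```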