Patent classifications
G10L21/04
Apparatus and method for processing an input audio signal using cascaded filterbanks
An apparatus for processing an input audio signal relies on a cascade of filterbanks, the cascade having a synthesis filterbank for synthesizing an audio intermediate signal from the input audio signal, the input audio signal being represented by a plurality of first subband signals generated by an analysis filterbank, wherein a number of filterbank channels of the synthesis filterbank is smaller than a number of channels of the analysis filterbank. The apparatus furthermore has a further analysis filterbank for generating a plurality of second subband signals from the audio intermediate signal, wherein the further analysis filterbank has a number of channels being different from the number of channels of the synthesis filterbank, so that a sampling rate of a subband signal of the plurality of second subband signals is different from a sampling rate of a first subband signal of the plurality of first subband signals.
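The rate relationship described in this abstract can be illustrated with a little arithmetic. The sketch below is not taken from the patent: it assumes critically sampled filterbanks, where an M-channel analysis bank yields subbands at 1/M of its input rate and an M-channel synthesis bank yields an output at M times its subband rate. The function name and the example channel counts are hypothetical.

```python
def cascade_rates(fs_hz, n_analysis, n_synthesis, n_further):
    """Return (first subband rate, intermediate rate, second subband rate).

    Assumption (not stated in the abstract): every filterbank is critically
    sampled, so an M-channel bank changes the rate by a factor of M.
    """
    if n_synthesis >= n_analysis:
        raise ValueError("synthesis bank must have fewer channels than the analysis bank")
    if n_further == n_synthesis:
        raise ValueError("further analysis bank must differ in size from the synthesis bank")
    first_subband_rate = fs_hz / n_analysis                # analysis: decimate by n_analysis
    intermediate_rate = first_subband_rate * n_synthesis   # smaller synthesis bank: lower output rate
    second_subband_rate = intermediate_rate / n_further    # further analysis: decimate again
    return first_subband_rate, intermediate_rate, second_subband_rate


# Hypothetical example values: 48 kHz input, 32-channel analysis,
# 8-channel synthesis, 32-channel further analysis.
rates = cascade_rates(48000, 32, 8, 32)
```

Under these assumptions the second subband signals run at a different rate than the first ones whenever the further analysis bank and the synthesis bank differ in size, which is the property the abstract states.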
SYSTEMS AND METHODS TO ALTER VOICE INTERACTIONS
Systems and methods are disclosed for providing voice interactions based on user context. Data is received that causes a voice interaction to be generated for output at a user device during an output time interval. In response, current user contextual data of the user device is retrieved. The voice interaction and output time interval are altered to increase consumption likelihood of the voice interaction based on the current user contextual data. The altered voice interaction is outputted at the user device during the altered output time interval.
METHOD AND APPARATUS FOR TRAINING DATA AUGMENTATION FOR END-TO-END SPEECH RECOGNITION
The present invention relates to a method of training data augmentation for end-to-end speech recognition. The method for training data augmentation for end-to-end speech recognition includes: combining speech augmentation data and text augmentation data; performing a dynamic augmentation process on each of the speech augmentation data and the text augmentation data that have been combined; and training the end-to-end speech recognition using the speech augmentation data and the text augmentation data that are subjected to the dynamic augmentation process.
Pitch emphasis apparatus, method and program for the same
Provided is pitch enhancement processing having little unnaturalness even in time segments for consonants, and having little unnaturalness to listeners caused by discontinuities even when time segments for consonants and other time segments switch frequently. A pitch emphasis apparatus carries out the following as the pitch enhancement processing: for a time segment in which a spectral envelope of a signal has been determined to be flat, obtaining an output signal for each time in the time segment, the output signal being a signal including a signal obtained by adding (1) a signal obtained by multiplying together the signal of a time further in the past than that time by a number of samples T_0 corresponding to a pitch period of the time segment, a pitch gain σ_0 of the time segment, a predetermined constant B_0, and a value greater than 0 and less than 1, to (2) the signal of that time.
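The addition described for flat-envelope segments can be written as y[n] = x[n] + a * B_0 * σ_0 * x[n - T_0], with 0 < a < 1. The following is a minimal sketch of that single step only, not the patented apparatus: the function and parameter names are hypothetical, samples before the start of the buffer are assumed to be zero, and any further terms or normalization the full method may apply are omitted.

```python
def emphasize_pitch(x, t0, sigma0, b0, alpha):
    """Add a scaled copy of the signal from t0 samples earlier to each sample.

    x      : list of input samples for the time segment
    t0     : number of samples corresponding to the pitch period (T_0)
    sigma0 : pitch gain of the segment (sigma_0)
    b0     : predetermined constant (B_0)
    alpha  : attenuation factor, strictly between 0 and 1
    """
    if t0 <= 0:
        raise ValueError("t0 must be a positive number of samples")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be greater than 0 and less than 1")
    scale = alpha * b0 * sigma0
    out = []
    for n, sample in enumerate(x):
        # Assumption: history before the buffer start is treated as silence.
        past = x[n - t0] if n >= t0 else 0.0
        out.append(sample + scale * past)
    return out


# Hypothetical example: an impulse with a two-sample pitch period.
y = emphasize_pitch([1.0, 0.0, 0.0, 0.0], t0=2, sigma0=0.5, b0=1.0, alpha=0.5)
```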
DISTRIBUTED ALGORITHM FOR AUTOMIXING SPEECH OVER WIRELESS NETWORKS
Systems and methods are disclosed for operating a wireless audio network including a plurality of wireless microphone units (e.g., wireless delegate units) and a central access point having a mixer. The wireless microphone units may perform voice detection and level sensing, and make a preliminary gating decision. The central access point may make a final gating decision, determine the granting of wireless communications channels, and generate a final mixed audio output signal.
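The split between a preliminary decision at each microphone unit and a final decision at the access point can be sketched as below. This is an illustration under stated assumptions, not the disclosed algorithm: the level threshold, the "loudest units first" ranking, and all names (`UnitReport`, `preliminary_gate`, `final_gate`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UnitReport:
    """What a wireless microphone unit might send to the access point."""
    unit_id: str
    voice_detected: bool
    level_db: float


def preliminary_gate(report, threshold_db=-40.0):
    # Unit-side decision: request a channel only if voice is present
    # and the sensed level clears an assumed threshold.
    return report.voice_detected and report.level_db >= threshold_db


def final_gate(reports, max_channels, threshold_db=-40.0):
    # Access-point decision: of the units that passed the preliminary gate,
    # grant the limited wireless channels to the loudest ones (assumed policy).
    candidates = [r for r in reports if preliminary_gate(r, threshold_db)]
    candidates.sort(key=lambda r: r.level_db, reverse=True)
    return [r.unit_id for r in candidates[:max_channels]]


reports = [
    UnitReport("A", True, -20.0),
    UnitReport("B", False, -10.0),  # loud but no voice detected
    UnitReport("C", True, -30.0),
    UnitReport("D", True, -50.0),   # voice detected but below threshold
]
granted = final_gate(reports, max_channels=1)
```

Only the granted units would then contribute to the final mix at the access point.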
SYSTEMS AND METHODS FOR PROVIDING AUDIO-FILE LOOP-PLAYBACK FUNCTIONALITY
Systems and methods for providing audio-file loop-playback functionality are provided. The system includes a processor that performs a method including setting a playback loop start-point based on a first selection of a button; setting a loop end-point, associating a loop with an audio file, and entering into the loop based on a second selection of the button; and exiting the loop based on a third selection of the button. Associating the loop with the audio file includes adding metadata to the audio file. The metadata associates the loop with a button. The method includes reentering the loop based on a fourth selection of the button and exiting the loop based on a fifth selection of the button.
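The sequence of button selections reads naturally as a small state machine. The sketch below is one possible interpretation, not the claimed implementation: the class name, the metadata layout, and the use of a playback position in seconds are all assumptions.

```python
class LoopButton:
    """Tracks one button's loop across successive selections."""

    def __init__(self, button_id):
        self.button_id = button_id
        self.start = None      # loop start-point (seconds, assumed unit)
        self.end = None        # loop end-point
        self.in_loop = False
        self.metadata = {}     # stands in for metadata added to the audio file

    def press(self, position):
        """Handle one selection at the given playback position.

        Returns True if playback is inside the loop after this selection.
        """
        if self.start is None:
            # First selection: set the loop start-point.
            self.start = position
        elif self.end is None:
            # Second selection: set the end-point, associate the loop with
            # the file (and this button) via metadata, and enter the loop.
            self.end = position
            self.metadata["loop"] = {
                "button": self.button_id,
                "start": self.start,
                "end": self.end,
            }
            self.in_loop = True
        else:
            # Third and later selections: alternate between exit and re-entry.
            self.in_loop = not self.in_loop
        return self.in_loop


button = LoopButton("A")
states = [button.press(p) for p in (10.0, 20.0, 25.0, 30.0, 35.0)]
```

Because the loop is stored in metadata keyed to the button, the fourth selection can re-enter the same loop without setting the points again.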