Patent classifications
G10L25/00
Method and apparatus for detecting correctness of pitch period
A method and an apparatus for detecting correctness of a pitch period, where the method for detecting correctness of a pitch period includes determining, according to an initial pitch period of an input signal in a time domain, a pitch frequency bin of the input signal, where the initial pitch period is obtained by performing open-loop detection on the input signal, determining, based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter, associated with the pitch frequency bin, of the input signal, and determining correctness of the initial pitch period according to the pitch period correctness decision parameter.
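The three steps in this abstract (period to frequency bin, decision parameter from the amplitude spectrum, correctness decision) can be sketched as follows. The abstract does not define the decision parameter, so the harmonic energy ratio, the `width` and `threshold` values, and all function names below are assumptions made purely for illustration.

```python
import cmath
import math


def amplitude_spectrum(frame):
    """Naive DFT amplitude spectrum for bins 0..N/2 (O(N^2), adequate for a sketch)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def pitch_frequency_bin(pitch_period, fft_size):
    """Map a time-domain pitch period (in samples) to the nearest frequency bin."""
    return round(fft_size / pitch_period)


def harmonic_energy_ratio(spectrum, f0_bin, width=1):
    """Hypothetical decision parameter: share of non-DC spectral energy that
    lies within `width` bins of a multiple of the pitch frequency bin."""
    total = sum(a * a for a in spectrum[1:])
    if f0_bin < 1 or total == 0.0:
        return 0.0
    bins = set()  # a set avoids double counting when windows overlap
    for h in range(f0_bin, len(spectrum), f0_bin):
        bins.update(range(max(1, h - width), min(len(spectrum) - 1, h + width) + 1))
    return sum(spectrum[k] ** 2 for k in bins) / total


def pitch_period_is_correct(frame, pitch_period, threshold=0.6):
    """Accept the initial pitch period if the assumed parameter clears the threshold."""
    spectrum = amplitude_spectrum(frame)
    f0_bin = pitch_frequency_bin(pitch_period, len(frame))
    return harmonic_energy_ratio(spectrum, f0_bin) >= threshold
```

A ratio of this kind cannot tell a correct period from an integer multiple of it, because the harmonics of the lower frequency still cover the true ones; a practical detector would need an additional check for that case.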
Computationally efficient speech classifier and related methods
In a general aspect, an apparatus for detecting speech can include a signal conditioning stage that receives a signal corresponding with acoustic energy, filters the received signal to produce a speech-band signal, calculates a first sequence of energy values for the received signal and calculates a second sequence of energy values for the speech-band signal. The apparatus can also include a detection stage including a plurality of speech and noise differentiators. The detection stage can be configured to receive the first and second sequences of energy values and, based on the first sequence of energy values and the second sequence of energy values, provide, for each speech and noise differentiator of the plurality of speech and noise differentiators, a respective speech-detection indication signal. The apparatus can also include a combination stage configured to combine the respective speech-detection indication signals and, based on the combination of the respective speech-detection indication signals, provide an indication of one of presence of speech in the received signal and absence of speech in the received signal.
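The detection and combination stages described above can be sketched as follows, taking the two energy sequences as given. The abstract does not say what the differentiators compute or how their outputs are combined, so the two differentiators, their thresholds, and the vote-counting rule here are hypothetical.

```python
def frame_energies(samples, frame_len):
    """Energy (sum of squares) of each full, non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def ratio_differentiator(full, band, min_ratio=0.5):
    """Assumed differentiator: speech if most of the energy is in the speech band."""
    return [f > 0 and b / f >= min_ratio for f, b in zip(full, band)]


def level_differentiator(full, band, noise_floor=1.0, factor=4.0):
    """Assumed differentiator: speech if band energy is well above a noise floor."""
    return [b >= factor * noise_floor for b in band]


def combine(indications, min_votes=2):
    """Assumed combination rule: speech if at least `min_votes` differentiators agree."""
    return [sum(votes) >= min_votes for votes in zip(*indications)]


def detect_speech(full, band):
    """Per-frame speech/no-speech indication from the two energy sequences."""
    indications = [
        ratio_differentiator(full, band),
        level_differentiator(full, band),
    ]
    return combine(indications)
```

Working only on per-frame energies keeps the detection stage cheap, which fits the title's emphasis on computational efficiency.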
Restroom maintenance systems having a voice activated virtual assistant
Exemplary embodiments of restroom monitoring systems having a virtual assistant include a communications gateway located in a restroom. The communications gateway has a processor, memory, short range communications circuitry, long range communications circuitry, a microphone and a speaker. The communications gateway contains logic for listening for a wake-up word and, upon detecting a wake-up word, capturing a request; logic for processing the request to determine what is being requested; logic for verifying the request with the requester; and one of a plurality of wave files and a voice synthesizer. The system further includes one or more dispensers located in the restroom. The one or more dispensers have short range communications circuitry for communicating status or product level to the communications gateway.
Digital audio processing device, digital audio processing method, and digital audio processing program
A local extremum calculator detects a local maximum sample and a local minimum sample of a digital audio signal. A number-of-sample detector detects a sample interval between the local maximum sample and the local minimum sample. A difference value calculator calculates difference values between adjacent samples. A correction value calculator calculates a first correction value by multiplying the difference value between the local maximum sample and a first adjacent sample by a coefficient and calculates a second correction value by multiplying the difference value between the local minimum sample and a second adjacent sample by the coefficient. When a periodic signal detector detects that the digital audio signal is a single sine wave, an adder/subtractor does not add the first correction value to the first adjacent sample, and does not subtract the second correction value from the second adjacent sample.
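The correction step described above can be sketched as follows. The abstract does not say which neighbor counts as the "adjacent sample", how the sample interval is used, or how a single sine wave is detected, so this sketch assumes both neighbors are corrected, ignores the interval, and takes the sine-wave decision as a flag supplied by the caller.

```python
def correct_extrema(samples, coefficient, is_single_sine=False):
    """Raise the neighbors of local maxima and lower the neighbors of local minima.

    Corrections are computed from the original samples and applied to a copy,
    so one correction never feeds into another.
    """
    out = list(samples)
    if is_single_sine:
        return out  # per the abstract, no correction is applied in this case
    for i in range(1, len(samples) - 1):
        prev, cur, nxt = samples[i - 1], samples[i], samples[i + 1]
        if cur > prev and cur > nxt:      # local maximum sample
            for j in (i - 1, i + 1):
                out[j] += coefficient * (cur - samples[j])
        elif cur < prev and cur < nxt:    # local minimum sample
            for j in (i - 1, i + 1):
                out[j] -= coefficient * (samples[j] - cur)
    return out
```

For example, `correct_extrema([0, 1, 4, 1, 0], 0.5)` leaves the peak at 4 and lifts each neighbor from 1 to 2.5.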
Voice aging using machine learning
This specification describes systems and methods for aging voice audio, in particular voice audio in computer games. According to one aspect of this specification, there is described a method for aging speech audio data. The method comprises: inputting an initial audio signal and an age embedding into a machine-learned age convertor model, wherein: the initial audio signal comprises speech audio; and the age embedding is based on an age classification of a plurality of speech audio samples of subjects in a target age category; processing, by the machine-learned age convertor model, the initial audio signal and the age embedding to generate an age-altered audio signal, wherein the age-altered audio signal corresponds to a version of the initial audio signal in the target age category; and outputting, from the machine-learned age convertor model, the age-altered audio signal.
Methods and apparatus for reducing stuttering
A feedback system may play back, to a user, an altered version of the user's voice in real time, in order to reduce stuttering by the user. The system may operate in different feedback modes at different times. For instance, the system may detect when the severity of a user's stuttering increases, which is indicative of the user habituating to the current feedback mode. The system may then switch to a different feedback mode. In some cases, the feedback modes include at least a Whisper mode, a Reverb mode, and a Harmony mode. In Whisper mode, the user's voice may be transformed to sound as if it were whispering in the user's ears. In Harmony mode, the user's voice may be altered as if the user were harmonizing with himself or herself. In Reverb mode, the user's voice may be altered so that it reverberates.
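The mode-switching behavior described above can be sketched as a small state machine. The abstract does not say how severity is measured or how an increase is detected, so the numeric severity score, the `rise_threshold` value, and the round-robin order of modes are assumptions for illustration.

```python
MODES = ["Whisper", "Reverb", "Harmony"]


class FeedbackModeSelector:
    """Switch to the next feedback mode when stuttering severity rises."""

    def __init__(self, modes=MODES, rise_threshold=0.2):
        self.modes = list(modes)
        self.rise_threshold = rise_threshold
        self.index = 0
        self.baseline = None  # lowest severity seen in the current mode

    def update(self, severity):
        """Feed one severity score (hypothetical scale) and return the mode to use."""
        if self.baseline is not None and severity - self.baseline > self.rise_threshold:
            # Severity rose well above the best level seen in this mode:
            # treat that as habituation and move to the next mode.
            self.index = (self.index + 1) % len(self.modes)
            self.baseline = None  # re-measure under the new mode
        elif self.baseline is None:
            self.baseline = severity
        else:
            self.baseline = min(self.baseline, severity)
        return self.modes[self.index]
```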
Apparatus and method for residential speaker recognition
A home assistant device captures voice signals from users in the home and extracts vocal features from these captured voice recordings. The device collects data about the current context in the home and requests from an aggregator a background model that is best adapted to the current context. This background model is obtained and locally used by the home assistant device to perform the speaker recognition. Home assistant devices from a plurality of homes contribute to the establishment of a database of background models by aggregating vocal features, clustering them according to the context and computing background models for the different contexts. These background models are then collected, clustered according to their contexts and aggregated by an aggregator in the database. Any home assistant device can then request from the aggregator the background model that best fits its current context, thus improving the speaker recognition.
Hands free always on near field wakeword solution
Apparatuses and systems for conserving power for a portable electronic device that monitors local audio for a wakeword are described herein. In a non-limiting embodiment, a portable electronic device may have two phases. The first phase may be a first circuit that stores an audio input while determining whether human speech is present in the audio input. The second phase may be a second circuit that activates when the first circuit determines that human speech is present in the audio input. The second circuit may receive the audio input from the first circuit, store the audio input, and determine whether a wakeword is present within the audio input.
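The two-phase arrangement described above can be modeled in software as follows. The abstract describes hardware circuits, so this is only a behavioral sketch: the class names, the buffer size, and the `is_speech` and `has_wakeword` callables are hypothetical stand-ins for the detectors.

```python
from collections import deque


class FirstStage:
    """Always-on stage: buffers recent audio while checking for speech."""

    def __init__(self, is_speech, buffer_frames=50):
        self.is_speech = is_speech
        self.buffer = deque(maxlen=buffer_frames)  # oldest frames drop off

    def process(self, frame):
        self.buffer.append(frame)
        return self.is_speech(frame)


class SecondStage:
    """Stage that runs only when the first stage reports speech."""

    def __init__(self, has_wakeword):
        self.has_wakeword = has_wakeword
        self.audio = []
        self.activations = 0

    def activate(self, buffered_audio):
        self.activations += 1
        self.audio = list(buffered_audio)  # store the audio handed over
        return self.has_wakeword(self.audio)


def monitor(frames, first, second):
    """Return True as soon as the second stage finds the wakeword."""
    for frame in frames:
        if first.process(frame) and second.activate(first.buffer):
            return True
    return False
```

The power saving in this model comes from `SecondStage.activate` being called only for frames the first stage flags as speech, so the more expensive wakeword check stays idle during silence.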
Apparatus for communicating with voice recognition device, apparatus with voice recognition capability and controlling method thereof
The present disclosure relates to an apparatus which communicates with a voice recognition device, and a method for controlling an apparatus with a voice recognition capability which operates in an Internet of Things environment configured by a 5G communication network. According to an exemplary embodiment of the present disclosure, an apparatus with a voice recognition capability includes a container which has one open surface and accommodates objects therein, a door which opens/closes the container, a sensor which senses an open/closed state of the door, a microphone which receives an external voice, a voice recognizer which recognizes a voice command received from the microphone, and a controller which controls an active state and an inactive state of the voice recognizer, in which the controller may predict whether the voice recognizer needs to be activated using a deep neural network model trained through machine learning.