G10L21/00

Respiratory biofeedback devices, systems, and methods
09779751 · 2017-10-03

Respiratory-based biofeedback devices, systems, and methods are provided. A respiratory biofeedback method includes producing a respiratory signal in response to a user's respiratory activity, generating an audio output signal that includes a modified version of the respiratory signal, and converting the audio output signal into sound waves output to the user to provide biofeedback. The sound waves can be output to the user in real-time response to the user's respiratory activity. A microphone can be used to generate the respiratory signal. The generated audio output signal can include the respiratory signal modified to increase a volume level of a portion of the respiratory signal where the volume level exceeds a specified volume level.
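
The volume modification described above can be illustrated with a minimal sketch. It assumes the respiratory signal is a list of samples in the range -1.0 to 1.0 and that the level of a portion is its mean absolute amplitude over a short frame. The function name `boost_loud_portions` and the parameters `threshold`, `gain`, and `frame` are hypothetical and do not come from the patent.

```python
def boost_loud_portions(samples, threshold=0.1, gain=2.0, frame=4):
    """Amplify each frame whose mean absolute level exceeds `threshold`.

    All parameter values are illustrative assumptions.
    """
    out = []
    for start in range(0, len(samples), frame):
        chunk = samples[start:start + frame]
        level = sum(abs(s) for s in chunk) / len(chunk)
        # Frames at or below the threshold pass through unchanged.
        factor = gain if level > threshold else 1.0
        out.extend(s * factor for s in chunk)
    return out


quiet = [0.01, -0.02, 0.01, -0.01]   # below the specified level
loud = [0.5, -0.4, 0.3, -0.5]        # above the specified level
result = boost_loud_portions(quiet + loud)
```

In this example only the second frame is amplified, so louder breathing is made more audible to the user while quiet portions are left as captured.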

Speech decoder with high-band generation and temporal envelope shaping
09779744 · 2017-10-03

A linear prediction coefficient of a signal represented in a frequency domain is obtained by performing linear prediction analysis in a frequency direction by using a covariance method or an autocorrelation method. After the filter strength of the obtained linear prediction coefficient is adjusted, filtering may be performed in the frequency direction on the signal by using the adjusted coefficient, whereby the temporal envelope of the signal is shaped. This reduces the occurrence of pre-echo and post-echo and improves the subjective quality of the decoded signal, without significantly increasing the bit rate, in frequency-domain bandwidth extension techniques such as spectral band replication (SBR).
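
The three steps above (linear prediction along frequency, strength adjustment, filtering along frequency) can be sketched as follows. This is an illustration under stated assumptions, not the patented decoder: the spectrum is real-valued for simplicity, the autocorrelation method is solved with the Levinson-Durbin recursion, strength is adjusted by scaling the i-th coefficient by `rho ** i` (a common bandwidth-expansion trick, assumed here), and an all-pole filter is applied. The abstract does not specify the filter form, and all function names are hypothetical.

```python
def autocorrelation(x, order):
    """Autocorrelation of x for lags 0..order."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Return the prediction-error filter [1, a1, ..., ap]."""
    a = [1.0] + [0.0] * order
    err = r[0]  # assumed non-zero, i.e. the input is not all zeros
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def adjust_strength(a, rho):
    """Weaken the filter: rho=1 keeps it, rho=0 disables it (assumed scheme)."""
    return [c * rho ** i for i, c in enumerate(a)]


def filter_along_frequency(spectrum, a):
    """All-pole filtering across frequency bins rather than across time."""
    out = []
    for k, x in enumerate(spectrum):
        y = x - sum(a[i] * out[k - i] for i in range(1, len(a)) if k - i >= 0)
        out.append(y)
    return out


# Toy spectrum that decays geometrically across frequency bins.
spectrum = [0.9 ** k for k in range(200)]
coeffs = levinson_durbin(autocorrelation(spectrum, 1), 1)
softened = adjust_strength(coeffs, 0.5)
shaped = filter_along_frequency([1.0, 0.0, 0.0], softened)
```

Because prediction runs across frequency bins, the resulting filter shapes the envelope of the signal in time, which is the duality the abstract relies on.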

Speech enhancement device and speech enhancement method
09779754 · 2017-10-03

A speech enhancement device includes: a speech production section detection unit configured to detect a speech production section in which a speaker produces speech, from an input signal generated by a speech input unit; a timer unit configured to measure an elapsed time from a starting point of the speech production section; a gain determination unit configured to determine a gain, which represents a level of enhancement of the input signal, according to the elapsed time; and an enhancement unit configured to enhance the input signal or a spectrum signal of the input signal in the speech production section according to the gain, whereby the input signal is enhanced only at necessary portions thereof.
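
A minimal sketch of the timer-driven gain follows. The abstract does not state how the gain depends on elapsed time, so the linear decay from `max_gain` to 1.0 over `ramp_s` seconds is purely an assumption, as are the function names and the per-frame speech flags standing in for the detection unit.

```python
def gain_for_elapsed(elapsed_s, max_gain=2.0, ramp_s=0.5):
    """Assumed rule: strongest enhancement at speech onset, fading to unity."""
    if elapsed_s >= ramp_s:
        return 1.0
    return max_gain - (max_gain - 1.0) * (elapsed_s / ramp_s)


def enhance(frames, is_speech, frame_s=0.25):
    """Scale only frames inside a speech production section."""
    out = []
    elapsed = None  # None means no speech section is currently open
    for frame, speech in zip(frames, is_speech):
        if not speech:
            elapsed = None          # the timer resets outside speech
            out.append(list(frame))
            continue
        elapsed = 0.0 if elapsed is None else elapsed + frame_s
        g = gain_for_elapsed(elapsed)
        out.append([s * g for s in frame])
    return out


frames = [[1.0], [1.0], [1.0], [1.0]]
flags = [False, True, True, True]
enhanced = enhance(frames, flags)
```

With these assumed values the first speech frame is doubled, the next is scaled by 1.5, and frames outside the speech section are left untouched.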

Augmenting speech segmentation and recognition using head-mounted vibration and/or motion sensors

Example methods and systems use multiple sensors to determine whether a speaker is speaking. Audio data in an audio-channel speech band detected by a microphone can be received. Vibration data in a vibration-channel speech band representative of vibrations detected by a sensor other than the microphone can be received. The microphone and the sensor can be associated with a head-mountable device (HMD). It is determined whether the audio data is causally related to the vibration data. If the audio data and the vibration data are causally related, an indication can be generated that the audio data contains HMD-wearer speech. Causally related audio and vibration data can be used to increase accuracy of text transcription of the HMD-wearer speech. If the audio data and the vibration data are not causally related, an indication can be generated that the audio data does not contain HMD-wearer speech.
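
One simple stand-in for the "causally related" test is to correlate the energy envelopes of the two channels. The sketch below assumes both envelopes are already band-limited to their speech bands and sampled at the same rate; the Pearson correlation, the `threshold` value, and the function names are hypothetical choices, not the method claimed in the patent.

```python
import math


def normalized_correlation(a, b):
    """Pearson correlation of two equal-length envelopes."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0


def contains_wearer_speech(audio_env, vibration_env, threshold=0.7):
    """Indicate wearer speech when the two envelopes move together."""
    return normalized_correlation(audio_env, vibration_env) >= threshold


audio_env = [0.1, 0.9, 0.8, 0.1, 0.7]
wearer_vibration = [x * 0.5 for x in audio_env]   # tracks the audio
other_vibration = [0.9, 0.1, 0.1, 0.9, 0.1]       # does not track it

wearer = contains_wearer_speech(audio_env, wearer_vibration)
bystander = contains_wearer_speech(audio_env, other_vibration)
```

A practical detector would likely also search over small time lags between the channels, which this sketch omits.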

Acoustic enhancement by leveraging metadata to mitigate the impact of noisy environments

A system for cloud acoustic enhancement is disclosed. In particular, the system may leverage metadata and cloud-computing network resources to mitigate the impact of noisy environments that may potentially interfere with user communications. In order to do so, the system may receive an audio stream including an audio signal associated with a user, and determine if the audio stream also includes an interference signal. The system may determine that the audio stream includes the interference signal if a portion of the audio stream correlates with metadata that identifies the interference signal. If the audio stream is determined to include the interference signal, the system may cancel the interference signal from the audio stream by utilizing the metadata and the cloud-computing network resources. Once the interference signal is cancelled, the system may transmit the audio stream including the audio signal associated with the user to an intended destination.

Object sound period detection apparatus, noise estimating apparatus and SNR estimation apparatus
09779762 · 2017-10-03

An object sound period detection apparatus includes a first calculating unit, a second calculating unit, a first detecting unit, and a second detecting unit. The first calculating unit calculates a first threshold every unit time. The second calculating unit calculates a second threshold every unit time. The first detecting unit compares a first feature amount based on an input signal with the first threshold and detects the object sound period in the input signal. The second detecting unit compares a second feature amount based on the input signal with the second threshold, detects the object sound period in the input signal, and outputs a detecting result. The first calculating unit calculates the first threshold based on the detecting result of the second detecting unit from the preceding unit time. The second calculating unit calculates the second threshold based on the detecting result of the first detecting unit in the same unit time.
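
The cross-coupling between the two detectors can be sketched as a loop over unit times. The abstract does not say how a detecting result changes a threshold, so the rule used here (lower the threshold by `relax` when the other detector reported object sound) is an assumption, and every name in the code is hypothetical.

```python
def detect_periods(feature1, feature2, base1=0.5, base2=0.5, relax=0.2):
    """Return the second detector's result for every unit time."""
    prev_second = False  # second detector's result one unit time earlier
    results = []
    for f1, f2 in zip(feature1, feature2):
        # First threshold depends on the PREVIOUS result of detector 2.
        th1 = base1 - relax if prev_second else base1
        first = f1 > th1
        # Second threshold depends on the CURRENT result of detector 1.
        th2 = base2 - relax if first else base2
        second = f2 > th2
        results.append(second)
        prev_second = second
    return results


feature1 = [0.6, 0.4, 0.4, 0.2]
feature2 = [0.4, 0.4, 0.2, 0.4]
periods = detect_periods(feature1, feature2)
```

In the second unit time, `feature1` is below its base threshold but is still accepted because the second detector reported sound one step earlier, which is the feedback path the abstract describes.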

Techniques for decreasing echo and transmission periods for audio communication sessions

A computer-implemented technique can include establishing an audio communication session between first and second computing devices and obtaining, by the first computing device, an audio input signal using audio data captured by a microphone. The first computing device can analyze the audio input signal to detect a speech input by its first user and can determine a duration of a detection period from when the audio input signal was obtained until the analyzing has completed. The first computing device can then transmit, to the second computing device, (i) a portion of the audio input signal beginning at a start of the speech input and (ii) the detection period duration, wherein receipt of the portion of the audio input signal and the detection period duration causes the second computing device to accelerate playback of the portion of the audio input signal to compensate for the detection period duration.
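
The compensation on the receiving side reduces to simple arithmetic. Assuming the second device plays at a constant factor `speed` greater than 1, it gains `speed - 1` seconds of audio for every second of wall-clock time, so the delay is recovered after `detection_period / (speed - 1)` seconds. The function name and the default factor are hypothetical; the abstract does not specify how playback is accelerated.

```python
def catch_up_plan(detection_period_s, speed=1.25):
    """Return (audio_seconds, wall_seconds) needed to absorb the delay."""
    if speed <= 1.0:
        raise ValueError("speed must exceed 1.0 to catch up")
    wall_seconds = detection_period_s / (speed - 1.0)
    audio_seconds = wall_seconds * speed
    return audio_seconds, wall_seconds


audio_s, wall_s = catch_up_plan(0.2, speed=1.25)
```

For a 0.2 s detection period at 1.25x, roughly 1.0 s of audio is played in about 0.8 s, after which playback can return to normal speed.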

Visual indication of an operational state

Architectures and techniques are described for visually indicating an operational state of an electronic device. In some instances, the electronic device comprises a voice-controlled device configured to interact with a user through voice input and visual output. The voice-controlled device may be positioned in a home environment, such as on a table in a room of the environment. The user may interact with the voice-controlled device through speech and the voice-controlled device may perform operations requested by the speech. As the voice-controlled device enters different operational states while interacting with the user, one or more lights of the voice-controlled device may be illuminated to indicate the different operational states.

Echo cancellation based on shared reference signals

An audio processing system configured to generate, based at least in part on captured sound, an audio signal that includes a speech component corresponding to a user's speech utterance and an audio component corresponding to audio output of another device is described herein. The audio processing system is also configured to receive a reference signal that corresponds to the audio output of the other device. The reference signal may be received as ultrasonic audio output of the other device or from a remote server. The audio processing system then processes the generated audio signal to remove at least a part of the generated audio signal that corresponds to the reference signal.
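
The removal step can be illustrated with a deliberately simplified sketch. It assumes the reference is already time-aligned with the captured signal and reaches the microphone scaled by a single unknown gain, which is estimated by least squares and subtracted. Real echo cancellers typically use adaptive multi-tap filters; the function name and the single-gain model are assumptions, not the system described in the patent.

```python
def remove_reference(captured, reference):
    """Subtract the best-fitting scaled copy of `reference` from `captured`."""
    energy = sum(r * r for r in reference)
    if energy == 0.0:
        return list(captured)  # nothing to cancel
    # Least-squares estimate of the gain with which the reference appears.
    gain = sum(c * r for c, r in zip(captured, reference)) / energy
    return [c - gain * r for c, r in zip(captured, reference)]


reference = [1.0, -1.0, 1.0, -1.0]   # audio output of the other device
speech = [1.0, 1.0, -1.0, -1.0]      # user's utterance
captured = [s + 0.5 * r for s, r in zip(speech, reference)]
cleaned = remove_reference(captured, reference)
```

The toy speech signal is chosen to be uncorrelated with the reference, so the gain estimate is exact and the speech is recovered; with real audio the estimate is only approximate.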