G10L21/055

AUDIO SCRIPTS FOR VARIOUS CONTENT

Disclosed are various embodiments for initiating playback of audio scripts that correspond to content such as books or songs. Verbal content can be captured via a microphone. The text of the verbal content can be assessed to determine whether an audio script specifies a sound effect that should be played at particular cue words within the content. The verbal content is assessed to determine whether a user reading aloud or singing a song has reached a cue word. When a cue word is reached, the sound effect can be played.
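
The cue-word matching described in this abstract can be illustrated with a minimal sketch. It assumes that speech recognition has already turned the microphone input into text, and that the audio script is a mapping from word positions in the content to sound-effect names. The names `CueTracker`, `feed`, `lookahead`, and the `.wav` file names are hypothetical and do not come from the disclosure.

```python
import re


def normalize(word):
    """Lowercase a word and strip punctuation so spoken and printed forms match."""
    return re.sub(r"[^a-z0-9']", "", word.lower())


class CueTracker:
    """Follows a reader through known content and reports cued sound effects.

    Hypothetical helper: audio_script maps a word index in the content
    to the name of the sound effect cued at that word.
    """

    def __init__(self, content_text, audio_script, lookahead=3):
        self.words = [normalize(w) for w in content_text.split()]
        self.script = audio_script
        self.lookahead = lookahead  # how many words ahead a match may land
        self.position = 0           # index of the next expected word

    def feed(self, recognized_text):
        """Consume recognized speech and return the effects whose cue was reached."""
        triggered = []
        for spoken in recognized_text.split():
            spoken = normalize(spoken)
            window = self.words[self.position:self.position + self.lookahead]
            if spoken and spoken in window:
                matched = self.position + window.index(spoken)
                # Fire every cue between the old position and the matched word,
                # so a skipped or misrecognized word does not lose its effect.
                for i in range(self.position, matched + 1):
                    if i in self.script:
                        triggered.append(self.script[i])
                self.position = matched + 1
        return triggered
```

For example, with the content "The old door creaked open, and thunder rolled." and the script `{3: "creak.wav", 6: "thunder.wav"}`, feeding "the old door" returns no effects, and feeding "creaked open" returns `["creak.wav"]`. A real system would hand the returned names to an audio player.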

AUDIO PROCESSING DEVICE AND METHOD

The present disclosure relates to an audio processing device and a method therefor allowing a localization position of a sound image to be readily changed. A coefficient computation unit 23 adds or subtracts coefficients k_Ls, k_L, k_C, k_R, and k_Rs set for respective channels by a control unit 21 to or from audio signals Ls, L, C, R, and Rs from a delay unit 22, respectively. A dividing unit divides an audio signal C from the coefficient computation unit into two channel outputs, outputs a signal obtained by multiplying an audio signal C resulting from the division by delay_α to a combining unit of a channel L, and outputs a signal obtained by multiplying an audio signal C resulting from the division by delay_β to a combining unit of a channel R. The present disclosure is applicable to a downmixer that downmixes audio signals from two or more channels to two channels.
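
A simplified sketch of a five-to-two downmix with a split center channel follows. It is a generic illustration, not the claimed method, and it makes two assumptions the abstract does not confirm: the per-channel coefficients are applied as linear gains, and delay_α and delay_β are treated as whole-sample delays on the two copies of the center signal. The function names `delay` and `downmix_5_to_2` are hypothetical.

```python
def delay(signal, n):
    """Delay a signal by n samples, zero-padding the start and keeping the length."""
    if n <= 0:
        return list(signal)
    return ([0.0] * n + list(signal))[:len(signal)]


def downmix_5_to_2(channels, gains, delay_alpha, delay_beta):
    """Downmix Ls, L, C, R, Rs to a left/right pair.

    Assumptions (not from the source): gains are linear multipliers, and
    delay_alpha / delay_beta are integer sample delays applied to the copy
    of C routed to the left and right outputs respectively.
    """
    center = [gains["C"] * x for x in channels["C"]]
    center_to_left = delay(center, delay_alpha)
    center_to_right = delay(center, delay_beta)

    left = [
        gains["L"] * l + gains["Ls"] * ls + c
        for l, ls, c in zip(channels["L"], channels["Ls"], center_to_left)
    ]
    right = [
        gains["R"] * r + gains["Rs"] * rs + c
        for r, rs, c in zip(channels["R"], channels["Rs"], center_to_right)
    ]
    return left, right
```

Delaying the center copy on one side more than on the other moves the perceived position of the center image toward the earlier side, which is one way a localization position could be shifted without remixing the other channels.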

Systems, devices, and methods for synchronizing audio

Disclosed herein are new techniques carried out by a computing system for determining delays of various components of an audio system to allow for accurate correction of these delays, which may improve the audio quality of live performances for listeners who hear audio reproduced by loudspeakers at live performance venues. In one implementation the computing system, which may comprise a transmitter device and one or more receiver devices, may be configured to perform functions, including receiving a first audio signal, receiving, via an audio input interface of the receiver, a second audio signal, and determining, based on the first audio signal and the second audio signal, an audio delay that is associated with the second audio signal. The computing system may be configured to perform further functions, including based on a determined cross-correlation between a downsampled audio signal and a filtered second audio signal, determining the audio signal delay.
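
Delay estimation by cross-correlation can be sketched in a few lines. This brute-force version assumes both signals are already at the same sample rate, and it omits the downsampling and filtering steps the abstract mentions. The names `estimate_delay`, `max_lag`, and `lag_to_seconds` are hypothetical.

```python
def estimate_delay(reference, delayed, max_lag):
    """Return the lag (in samples) at which `delayed` best matches `reference`.

    For each candidate lag, the two signals are multiplied sample by sample
    and summed; the lag with the largest sum is taken as the delay.
    """
    best_lag = 0
    best_score = float("-inf")
    for lag in range(max_lag + 1):
        # zip stops at the shorter sequence, so no bounds checks are needed.
        score = sum(r * d for r, d in zip(reference, delayed[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def lag_to_seconds(lag, sample_rate):
    """Convert a lag in samples to a delay in seconds."""
    return lag / sample_rate
```

The cost grows with the product of signal length and `max_lag`, which is one reason a system might correlate a downsampled copy of the signal. Production code would more likely use an FFT-based correlation.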

Method, Apparatus and Systems for Audio Decoding and Encoding

An audio processing system (100) accepts an audio bitstream having one of a plurality of predefined audio frame rates. The system comprises a front-end component (110), which receives a variable number of quantized spectral components, corresponding to one audio frame in any of the predefined audio frame rates, and performs an inverse quantization according to predetermined, frequency-dependent quantization levels. The front-end component may be agnostic of the audio frame rate. The audio processing system further comprises a frequency-domain processing stage (120) and a sample rate converter (130), which provide a reconstructed audio signal sampled at a target sampling frequency independent of the audio frame rate. By its frame-rate adaptability, the system can be configured to operate frame-synchronously in parallel with a video processing system that accepts plural video frame rates.

REMOTE VISUALIZATION OF REAL-TIME THREE-DIMENSIONAL (3D) FACIAL ANIMATION WITH SYNCHRONIZED VOICE
20210375020 · 2021-12-02

Described herein are methods and systems for remote visualization of real-time three-dimensional (3D) facial animation with synchronized voice. A sensor captures frames of a face of a person, each frame comprising color images of the face, depth maps of the face, voice data associated with the person, and a timestamp. The sensor generates a 3D face model of the person using the depth maps. A computing device receives the frames of the face and the 3D face model. The computing device preprocesses the 3D face model. For each frame, the computing device: detects facial landmarks using the color images; matches the 3D face model to the depth maps using non-rigid registration; updates a texture on a front part of the 3D face model using the color images; synchronizes the 3D face model with a segment of the voice data using the timestamp; and transmits the synchronized 3D face model and voice data to a remote device.
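
The timestamp-based synchronization step can be illustrated by selecting the slice of a voice buffer that falls between two consecutive frame timestamps. This is a sketch under assumptions not stated in the abstract: timestamps are integer milliseconds on a clock shared by video and audio, and the voice data is one continuous list of samples. The function and parameter names are hypothetical.

```python
def voice_segment(voice, sample_rate, voice_start_ms, frame_ms, next_frame_ms):
    """Return the voice samples captured between two frame timestamps.

    voice          -- list of audio samples, starting at voice_start_ms
    sample_rate    -- samples per second
    frame_ms       -- timestamp of the current frame, in milliseconds
    next_frame_ms  -- timestamp of the following frame, in milliseconds
    """
    # Integer arithmetic avoids floating-point rounding at segment boundaries.
    start = (frame_ms - voice_start_ms) * sample_rate // 1000
    end = (next_frame_ms - voice_start_ms) * sample_rate // 1000
    start = max(0, start)
    end = min(len(voice), end)
    return voice[start:end]
```

Each updated face model would then be paired with the segment returned for its frame before transmission, so the remote device can render the model and play the audio together.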

METHODS AND APPARATUS TO PERFORM SPEED-ENHANCED PLAYBACK OF RECORDED MEDIA
20220189509 · 2022-06-16

Methods, apparatus, systems, and articles of manufacture to perform speed-enhanced playback of recorded media are disclosed. Example apparatus to playback media disclosed herein comprise at least one memory, machine-readable instructions, and processor circuitry to execute the machine-readable instructions to parse an audio frame included in the media to determine a number of skip bytes included in the audio frame, compare the number of skip bytes to a threshold, associate the audio frame with a plurality of candidate frames identified in the media when the number of skip bytes satisfies the threshold, and calculate a speed-enhanced playback rate for the media based on the plurality of candidate frames identified in the media.