Patent classifications
G10L21/045
DATA PROCESSING APPARATUS, DATA PROCESSING METHOD, AND DATA PROCESSING PROGRAM
A data processing apparatus (1) includes a signal processing unit (31, 32) and an insertion/deletion unit (33). The signal processing unit performs predetermined signal processing on wirelessly received data frame by frame, each frame including a predetermined number of samples, and stores the data in buffers (41, 42). When the amount of data accumulated in a buffer is outside a predetermined range, the insertion/deletion unit (33) performs insertion/deletion processing that inserts or deletes data in units of samples.
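The buffer-level rule in this abstract can be illustrated with a minimal sketch. The abstract does not say which sample is inserted or deleted, or how an inserted sample is generated, so repeating or dropping the last sample of the frame is an assumption made here for illustration only. The function name `adjust_frame` and the parameters `buffered`, `low`, and `high` are hypothetical.

```python
def adjust_frame(frame, buffered, low, high):
    """Insert or delete one sample when the buffer level leaves [low, high].

    frame    -- sequence of samples in the current frame
    buffered -- amount of data currently accumulated in the buffer
    low/high -- bounds of the acceptable buffer level (hypothetical names)
    """
    samples = list(frame)
    if buffered < low and samples:
        # Buffer is running low: lengthen the frame by one sample.
        # Assumption: the inserted sample is a copy of the last one.
        samples.append(samples[-1])
    elif buffered > high and len(samples) > 1:
        # Buffer is too full: shorten the frame by one sample.
        # Assumption: the last sample is the one removed.
        samples.pop()
    # Inside the range the frame passes through unchanged.
    return samples
```

Working in units of samples rather than whole frames keeps each correction small, which is presumably why the abstract distinguishes the two.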
SPEECH PROCESSING METHOD AND APPARATUS
A speech processing method includes obtaining first speech information from a user, determining one or more similar speech segments in the first speech information, deleting one or more similar frames from each of the one or more similar speech segments to obtain second speech information, and analyzing the second speech information to determine a user intent corresponding to the first speech information. A duration of the first speech information exceeds a preset analysis duration threshold, and a duration of the second speech information does not exceed the preset analysis duration threshold.
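As a rough illustration of shortening speech by deleting similar frames, the sketch below drops frames that are nearly identical to their predecessor until the frame count fits a limit. The abstract does not define how similarity is measured, so representing each frame by a single number and comparing with an absolute tolerance is an assumption. The names `trim_similar_frames`, `max_frames`, and `tolerance` are hypothetical.

```python
def trim_similar_frames(frames, max_frames, tolerance):
    """Delete near-duplicate frames until at most max_frames remain.

    frames     -- one numeric feature per frame (assumed representation)
    max_frames -- stand-in for the analysis duration threshold
    tolerance  -- two adjacent frames count as similar within this distance
    """
    kept = list(frames)
    i = len(kept) - 1
    # Walk backwards so deletions do not shift the indices still to visit.
    while i > 0 and len(kept) > max_frames:
        if abs(kept[i] - kept[i - 1]) <= tolerance:
            del kept[i]
        i -= 1
    # The result can still exceed max_frames if too few frames are similar.
    return kept
```

For example, `trim_similar_frames([1.0, 1.0, 1.0, 5.0, 5.0, 9.0], 4, 0.1)` removes one frame from each run of repeated values and stops once four frames remain.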
Techniques for decreasing echo and transmission periods for audio communication sessions
A computer-implemented technique can include establishing an audio communication session between first and second computing devices and obtaining, by the first computing device, an audio input signal using audio data captured by a microphone. The first computing device can analyze the audio input signal to detect a speech input by its first user and can determine a duration of a detection period from when the audio input signal was obtained until the analyzing has completed. The first computing device can then transmit, to the second computing device, (i) a portion of the audio input signal beginning at a start of the speech input and (ii) the detection period duration, wherein receipt of the portion of the audio input signal and the detection period duration causes the second computing device to accelerate playback of the portion of the audio input signal to compensate for the detection period duration.
METHODS AND APPARATUS TO PERFORM SPEED-ENHANCED PLAYBACK OF RECORDED MEDIA
Methods, apparatus, systems, and articles of manufacture to perform speed-enhanced playback of recorded media are disclosed. Example apparatus to play back media disclosed herein comprise at least one memory, machine-readable instructions, and processor circuitry to execute the machine-readable instructions to parse an audio frame included in the media to determine a number of skip bytes included in the audio frame, compare the number of skip bytes to a threshold, associate the audio frame with a plurality of candidate frames identified in the media when the number of skip bytes satisfies the threshold, and calculate a speed-enhanced playback rate for the media based on the plurality of candidate frames identified in the media.
SYSTEM AND METHODOLOGY FOR MODULATION OF DYNAMIC GAPS IN SPEECH
A system capable of speech gap modulation is configured to: receive at least one composite speech portion, which comprises at least one speech portion and at least one dynamic-gap portion, wherein the speech portion(s) comprise at least one variable-value speech portion and the dynamic-gap portion(s) are associated with a pause in speech; receive at least one synchronization point, wherein the synchronization point(s) associate a point in time in the composite speech portion(s) with a point in time in other media portion(s); and modulate the dynamic-gap portion(s), based at least partially on the variable-value speech portion(s) and on the synchronization point(s), thereby generating at least one modulated composite speech portion. This facilitates improved synchronization of the modulated composite speech portion(s) and the other media portion(s) at the synchronization point(s) when combining the other media portion(s) and the audio-format modulated composite speech portion(s) into a synchronized multimedia output.
Information processing apparatus, information processing method, and program
An information processing apparatus includes an audio buffer unit, a reproduction time calculation unit, a position decision unit, and an insertion unit. The audio buffer unit retains first audio data that have not yet been reproduced, out of the first audio data received from another apparatus via a transmission path. The reproduction time calculation unit calculates a reproduction time of second audio data on the basis of at least one of a state of the first audio data retained in the audio buffer unit and a state of the transmission path. The second audio data are to be inserted and reproduced while the first audio data are being reproduced. The position decision unit decides an insertion position of the second audio data in the first audio data. The insertion unit controls a process of inserting the second audio data at the insertion position in the first audio data.
Methods and apparatus to perform speed-enhanced playback of recorded media
Methods, apparatus, systems and articles of manufacture to perform speed-enhanced playback of recorded media are disclosed. Example media playback devices disclosed herein are to determine a target number of frames of recorded media to drop during playback of the recorded media, the target number determined based on a difference between (1) a total number of frames of the recorded media and (2) a ratio of the total number of frames of the recorded media to a target playback rate. Disclosed example media playback devices are also to select a subset of the frames of the recorded media to drop during the playback of the recorded media, the subset being selected based on the target number of frames to drop and skip bytes included in the subset of frames.
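The target number of frames to drop described in this abstract is the total frame count minus the ratio of that count to the target playback rate. A minimal sketch of the arithmetic follows. Rounding the ratio to a whole number of frames is an assumption, as is the selection rule, which simply prefers the frames with the most skip bytes; the abstract states only that selection depends on the target number and on skip bytes. All function and parameter names are hypothetical.

```python
def target_frames_to_drop(total_frames, playback_rate):
    """Total frames minus (total frames / target playback rate)."""
    if playback_rate < 1:
        raise ValueError("speed-enhanced playback needs a rate of at least 1")
    # Assumption: round the ratio to a whole number of frames.
    return total_frames - round(total_frames / playback_rate)


def select_frames_to_drop(skip_bytes, playback_rate):
    """Return indices of frames to drop, given per-frame skip-byte counts.

    Assumption: frames with the most skip bytes are dropped first.
    """
    target = target_frames_to_drop(len(skip_bytes), playback_rate)
    ranked = sorted(range(len(skip_bytes)),
                    key=lambda i: skip_bytes[i], reverse=True)
    # Return the chosen indices in playback order.
    return sorted(ranked[:target])
```

For 1000 frames at a target rate of 1.25, the ratio is 800, so 200 frames are dropped and the remaining 800 play in 1/1.25 of the original time.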