G10L21/055

ELECTRONIC DEVICE FOR RECORDING CONTENTS DATA AND METHOD OF THE SAME

An electronic device according to various embodiments of the disclosure includes: a display configured to output image data of content based on execution of an application, a sound output module comprising circuitry configured to output audio data of the content, and a processor operatively connected to the display and the sound output module, wherein the processor is configured to: identify a schedule for sequentially receiving read tasks (RTs) at a specified time interval to encode audio segments sequentially input in a specified size into an audio buffer from the audio data, control time points at which the RTs are called, based on at least one of a situation in which the RTs are received according to the schedule and an audio buffer state, and encode the audio segments corresponding to the RTs received at the controlled time points.
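The idea of controlling when a read task is called can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the disclosure: the function name `next_call_time_ms`, the millisecond integer clock, and the rule itself (call no earlier than the scheduled time, and wait until the buffer holds one full segment).

```python
def next_call_time_ms(scheduled_ms, now_ms, buffered_bytes,
                      segment_bytes, fill_rate_bytes_per_ms):
    """Pick the time at which a read task (RT) should be called.

    Hypothetical policy: never call before the scheduled time, and if the
    audio buffer does not yet hold one full segment, delay the call until
    it is expected to, given a constant fill rate.
    """
    # A late RT (now > scheduled) is called immediately when data is ready.
    earliest = max(scheduled_ms, now_ms)
    deficit = segment_bytes - buffered_bytes
    if deficit <= 0:
        return earliest
    # Integer ceiling division: milliseconds needed to fill the deficit.
    wait_ms = -(-deficit // fill_rate_bytes_per_ms)
    return max(earliest, now_ms + wait_ms)
```

With a 1920-byte segment and a fill rate of 96 bytes per millisecond, a task arriving on schedule at 20 ms with only 960 bytes buffered would be deferred to 30 ms under this policy.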

IMAGE PROCESSING APPARATUS, IMAGE PICKUP DEVICE, IMAGE PROCESSING METHOD, AND PROGRAM

An image pickup device which captures sound and a moving image prevents deterioration in reproduction quality. A scene change detector detects a frame at the time of a scene change from among a plurality of frames imaged at a predetermined frame rate as a detection frame. A frame rate converting unit converts a frame rate of the frames imaged outside a detection period to a lower frame rate. A video reproduction time setting unit sets a reproduction time when reproduction is performed at the lower frame rate as a video reproduction time. An audio reproduction time setting unit sets an audio reproduction time at constant intervals for sounds recorded at constant intervals outside the detection period and sets an audio reproduction time in synchronization with the video reproduction time corresponding to the detection frame relative to sound recorded in the detection period.

ALIGNING PARAMETER DATA WITH AUDIO RECORDINGS
20230019463 · 2023-01-19 ·

Various techniques relate to aligning parameters and audio recordings obtained at a rescue scene. An example method includes receiving, from a first device, a first file including first measurements of a first parameter at first discrete times in a time interval. The first file further indicates a marker output by the first device during the time interval. The method also includes receiving, from a second device, a second file comprising second measurements of a second parameter at second discrete times in the time interval. The method includes detecting the marker output by the first device in the second measurements of the second parameter and, based on detecting the marker in the second measurements, generating aligned data by time-aligning the first measurements of the first parameter and the second measurements of the second parameter. The method further includes outputting the aligned data.
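A rough sketch of marker-based alignment follows. It assumes, purely for illustration, that the marker appears in the second device's measurements as the first sample whose magnitude reaches a threshold, and that the two clocks differ only by a constant offset. The names `detect_marker_time` and `align` are hypothetical and do not come from the disclosure.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, float]  # (time in seconds, measured value)


def detect_marker_time(samples: List[Sample], threshold: float) -> float:
    """Return the time of the first sample whose magnitude reaches threshold."""
    for t, value in samples:
        if abs(value) >= threshold:
            return t
    raise ValueError("marker not found in second measurements")


def align(first: List[Sample], marker_time_first: float,
          second: List[Sample], threshold: float) -> Dict[str, object]:
    """Shift the second device's timestamps onto the first device's clock."""
    # Offset between the two clocks, estimated from the shared marker.
    offset = detect_marker_time(second, threshold) - marker_time_first
    shifted = [(t - offset, value) for t, value in second]
    return {"offset": offset, "first": first, "second": shifted}
```

For example, if the first file reports the marker at 1.0 s and the marker is detected at 10.5 s in the second file, the second file's timestamps are shifted back by 9.5 s.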

Systems and Methods for Voice Based Audio and Text Alignment
20220399030 · 2022-12-15 ·

The present disclosure relates to systems and methods for temporally aligning media elements. Example methods include providing an audio input waveform based on an audio input and receiving a text input. The example method also includes converting the text input to a text-to-speech input waveform and extracting, with an audio feature extractor, characteristic audio features from the audio input waveform and the text-to-speech input waveform. The example method yet further includes comparing audio input waveform features and text-to-speech waveform features and, based on the comparison, temporally aligning a displayed version of the text input with the audio input.
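The abstract does not say how the two feature sequences are compared. One common choice for this kind of comparison is dynamic time warping (DTW), sketched below on one-dimensional features as an assumption rather than as the disclosed method. The function name `dtw_path` and the absolute-difference cost are illustrative.

```python
def dtw_path(a, b):
    """Align two 1-D feature sequences with dynamic time warping.

    Returns (total_cost, path), where path is a list of index pairs
    (i, j) matching a[i] to b[j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a only
                                 cost[i][j - 1],      # advance in b only
                                 cost[i - 1][j - 1])  # advance in both
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path
```

If `a` holds features of the text-to-speech waveform and `b` holds features of the audio input, each pair in the path suggests which audio frame corresponds to which synthesized frame, and therefore to which part of the text.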

METHOD AND DEVICE FOR GENERATING SPEECH MOVING IMAGE
20220398793 · 2022-12-15 ·

A device for generating a speech moving image according to an embodiment includes a first encoder that receives a person background image in which a portion related to speech of a person that is a video part of the speech moving image of the person is covered with a mask, extracts an image feature vector from the person background image, and compresses the extracted image feature vector, a second encoder that receives a speech audio signal that is an audio part of the speech moving image, extracts a voice feature vector from the speech audio signal, and compresses the extracted voice feature vector, a combination unit that generates a combination vector of the compressed image feature vector and the compressed voice feature vector, and an image reconstruction unit that reconstructs the speech moving image of the person with the combination vector as an input.

METHODS, SYSTEMS, AND DEVICES FOR ASSEMBLY OF LIVE AND RECORDED AUDIO CONTENT

Aspects of the subject disclosure may include, for example, receiving first audio content from a first communication device and receiving second audio content from a second communication device, and adjusting the first audio content and the second audio content. The adjusting of the first audio content and the second audio content can comprise: detecting a gap in the first audio content; analyzing the first audio content resulting in an audio analysis; generating filler audio content based on the audio analysis; and inserting the filler audio content into the gap of the first audio content. Further embodiments can include aggregating the first adjusted audio content with the second adjusted audio content resulting in aggregated audio content, and providing the aggregated audio content to a third communication device for playback. The third communication device plays the aggregated audio content. Other embodiments are disclosed.
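The gap-filling and aggregation steps can be sketched as follows. The representation is an assumption made for illustration: each stream is a list of frame amplitudes with `None` marking missing frames, the "analysis" is reduced to reading the neighbouring frames, and the filler is a linear ramp between them. The names `find_gaps`, `fill_gaps`, and `mix` are hypothetical.

```python
def find_gaps(frames):
    """Return (start, end) index ranges, end exclusive, of missing frames."""
    gaps, start = [], None
    for i, frame in enumerate(frames):
        if frame is None and start is None:
            start = i
        elif frame is not None and start is not None:
            gaps.append((start, i))
            start = None
    if start is not None:
        gaps.append((start, len(frames)))
    return gaps


def fill_gaps(frames):
    """Insert filler frames by interpolating between the gap's neighbours."""
    out = list(frames)
    for start, end in find_gaps(frames):
        left = frames[start - 1] if start > 0 else None
        right = frames[end] if end < len(frames) else None
        if left is None and right is None:
            left = right = 0.0          # whole stream missing: use silence
        elif left is None:
            left = right                # gap at the very beginning
        elif right is None:
            right = left                # gap at the very end
        steps = end - start + 1
        for k in range(start, end):
            out[k] = left + (right - left) * (k - start + 1) / steps
    return out


def mix(first, second):
    """Aggregate two adjusted streams by summing them frame by frame."""
    return [x + y for x, y in zip(first, second)]
```

A production system would likely generate filler from richer analysis (for example, background-noise characteristics), which this sketch does not attempt.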