H04N21/8106

Dynamic Transcoding for Enhancing Audio Playback
20220358943 · 2022-11-10

A first playback device is configured to: operate as part of a synchrony group that comprises the first playback device and a second playback device; obtain a first version of audio content that is encoded according to a first encoding format; determine that the first version of the audio content is unsuitable for playback by the second playback device; based on the determination, (i) decode the first version of the audio content and (ii) re-encode a second version of the audio content according to a second encoding format; transmit the second version of the audio content to the second playback device for playback; cause the second playback device to play back the second version of the audio content; and play back the first version of the audio content in synchrony with the playback of the second version of the audio content by the second playback device.
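
The decision described in this abstract can be sketched as a small planning function. This is only an illustration under stated assumptions: "unsuitable for playback" is modeled here as "the second device does not support the first encoding format", which is one possible reading; the `Device` class, `plan_group_playback`, and the format names are hypothetical and do not come from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical playback device with a set of decodable formats."""
    name: str
    supported_formats: frozenset


def plan_group_playback(source_format, first, second, fallback_format):
    """Return a mapping of device name -> format that device will play.

    The first device always plays the version it obtained. The second
    device receives the same version when it can decode it; otherwise the
    first device is assumed to decode and re-encode to fallback_format.
    """
    plan = {first.name: source_format}
    if source_format in second.supported_formats:
        # No transcoding needed: both devices play the first version.
        plan[second.name] = source_format
    elif fallback_format in second.supported_formats:
        # First version is unsuitable: send a re-encoded second version.
        plan[second.name] = fallback_format
    else:
        raise ValueError(f"{second.name} supports neither format")
    return plan


first = Device("first", frozenset({"flac", "aac"}))
second = Device("second", frozenset({"aac"}))
print(plan_group_playback("flac", first, second, "aac"))
```

The sketch covers only the format choice; the synchronized playback timing that the abstract also requires is outside its scope.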

SYSTEM AND METHOD FOR PROVIDING DIGITAL GRAPHICS AND ASSOCIATED AUDIOBOOKS
20220360855 · 2022-11-10

The present invention relates to a system and method for providing digital media such as comics with associated audio files such as audiobooks. The digital media and audio are provided using a computer-implemented application and a website. The system includes a server with associated databases for digitizing physical copies of comics and other graphical content, creating audio files using text-to-speech and natural language processing, and storing digital comics and audiobooks. A media player displays digital comics and an audio player plays audio files in a synchronized manner, allowing people to listen to comics without having to read the textual content contained in the digital media file.

Real Time Popularity Based Audible Content Acquisition

A personalized news service provides personalized news programs for its users by generating personalized combinations of audible versions of news stories derived from text-based versions of the news stories. The audible versions may be generated from the text-based version by a text-to-speech system, or by recording a person reading the text-based version aloud. To acquire recordings, the personalized news service can make a determination that a particular news story has a threshold extent of popularity. The news service can then transmit a request to a remote recording station for a recording of a verbal reading of the particular news story. The news service can then receive the requested recording from the remote recording station.

CONTENT ACCESS DEVICES THAT USE LOCAL AUDIO TRANSLATION FOR CONTENT PRESENTATION
20230095557 · 2023-03-30

A content access device uses local audio translation for content presentation. The content access device receives video and first audio data associated with a first language. The content access device uses translation software and/or other automated translation services to translate the first audio data to second audio data associated with a second language. The content access device synchronizes the video with the second audio data and outputs the video and the second audio data for presentation. The first audio data may be audio, text, and so on. The second audio data may be output as audio, text, and so on.

Methods and apparatus to perform an automated gain control protocol with an amplifier based on historical data corresponding to contextual data

Methods and apparatus to perform an automated gain control protocol with an amplifier based on historical data corresponding to contextual data are disclosed. Example apparatus disclosed herein are to select an automatic gain control (AGC) parameter for an AGC protocol based on historical data corresponding to contextual data, the contextual data including at least one of a time during which the AGC protocol is performed, a panelist identified by a meter, demographics of an audience identified by the meter, a location of the meter, a station identified by the meter, a media type identified by the meter, or a sound pressure level identified by the meter. The disclosed example apparatus are also to perform the AGC protocol based on the selected AGC parameter.
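
One way to picture "selecting an AGC parameter based on historical data corresponding to contextual data" is a lookup keyed by context. The sketch below is an assumption-laden illustration, not the disclosed apparatus: the `AgcHistory` class, the choice of a gain value in dB as the AGC parameter, the `(hour, station)` context tuple, and the use of the median are all hypothetical.

```python
from collections import defaultdict
from statistics import median


class AgcHistory:
    """Hypothetical store of past AGC gains, keyed by a context tuple."""

    def __init__(self, default_gain_db):
        self.default_gain_db = default_gain_db
        self._gains = defaultdict(list)

    def record(self, context, gain_db):
        """Remember the gain an earlier AGC run settled on in this context."""
        self._gains[context].append(gain_db)

    def select_gain(self, context):
        """Pick a starting gain: median of past gains, else the default."""
        past = self._gains.get(context)
        return median(past) if past else self.default_gain_db


history = AgcHistory(default_gain_db=0.0)
evening_news = (20, "station-A")  # (hour of day, station) as example context
for gain in (10.0, 12.0, 20.0):
    history.record(evening_news, gain)

print(history.select_gain(evening_news))      # median of recorded gains
print(history.select_gain((3, "station-B")))  # unseen context -> default
```

Starting the AGC protocol from a context-specific value rather than a fixed default is the point of the sketch; how the protocol then converges is not modeled.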

INCORPORATING INTERACTION ACTIONS INTO VIDEO DISPLAY THROUGH PIXEL DISPLACEMENT

A video processing method includes obtaining, in response to an interaction operation received on a portion of a first image, an adjustment parameter corresponding to the interaction operation. The adjustment parameter indicates an adjustment range of a display position of one or more pixels corresponding to the portion of the first image based on the interaction operation. The method further includes obtaining a displacement parameter of the one or more pixels in the portion of the first image, the displacement parameter representing a displacement of the one or more pixels between the first image and a second image displayed after the first image. The method also includes adjusting a display position of one or more pixels in the second image based on the adjustment parameter and the displacement parameter, and displaying the second image based on the adjusted display position of the one or more pixels.

PROGRAM SEARCHING FOR PEOPLE WITH VISUAL IMPAIRMENTS OR BLINDNESS

Disclosed herein are various embodiments for providing content searching for people with visual impairments or blindness. An embodiment operates by receiving a command to search multimedia content including both video content and audio content. One or more scene changes, including a first scene change, corresponding to the video content are determined. The search command is executed on the multimedia content. It is detected that the multimedia content has reached the first scene change responsive to executing the search command. An audio cue is audibly output responsive to the detection.
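
The detection step can be illustrated with a sorted list of scene-change timestamps. This is a minimal sketch under assumptions: the search command is taken to be a forward seek from `start` to `end` (in seconds), scene changes are already known, and the function name `cues_during_search` is hypothetical.

```python
from bisect import bisect_right


def cues_during_search(scene_changes, start, end):
    """Return the scene-change times crossed while seeking forward.

    scene_changes must be sorted ascending. A scene change exactly at
    `start` is treated as already passed; one exactly at `end` counts.
    """
    lo = bisect_right(scene_changes, start)
    hi = bisect_right(scene_changes, end)
    return scene_changes[lo:hi]


scene_changes = [12.0, 47.5, 90.0]
for t in cues_during_search(scene_changes, start=10.0, end=60.0):
    # A real device would play an audible cue here instead of printing.
    print(f"audio cue: scene change at {t}s")
```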

Method and apparatus for efficient delivery and usage of audio messages for high quality of experience

A method and a system for a virtual reality, augmented reality, mixed reality, or 360-degree Video environment are disclosed. The system receives Video Streams associated with audio and video scenes to be reproduced and Audio Streams associated with audio and video scenes to be reproduced. There are provided a Video decoder which decodes a signal from the Video Stream for the representation of the audio and video scene; an Audio decoder which decodes a signal from the Audio Stream for the representation of the audio and video scene to the user; and a region of interest processor deciding, based e.g. on the user's viewport, head orientation, movement data, or metadata, whether an Audio information message is to be reproduced. When the processor so decides, the reproduction of the Audio information message is caused.

System for jitter recovery from a transcoder

A system for transcoding a digital video stream using a transcoder includes receiving a digital video stream that includes an input video stream and extracting a first set of presentation time stamps from the input video stream, which are stored in a table. The system transcodes the input video stream including the first set of presentation time stamps from an initial set of characteristics to a modified set of characteristics including a second set of presentation time stamps that are different from the first set of presentation time stamps. The system processes the second set of presentation time stamps of the transcoded video stream to determine whether the second set of presentation time stamps includes jitter and, based upon determining that the second set of presentation time stamps includes jitter, modifies the second set of presentation time stamps based upon the first set of presentation time stamps in the table.
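
The table-based correction can be sketched as follows. The abstract does not say how jitter is detected or how the stamps are modified, so everything specific here is an assumption: frames are taken to map one-to-one and in order between input and output, the transcoder is assumed to add a constant offset to each presentation time stamp (PTS), jitter is defined as any deviation from that offset beyond a tolerance, and `recover_pts` is a hypothetical name.

```python
def recover_pts(input_pts, output_pts, tolerance=1):
    """Return output PTS values with jitter removed.

    input_pts is the table of stamps extracted before transcoding;
    output_pts are the stamps produced by the transcoder. Both are in
    the same tick units (for example, 90 kHz MPEG ticks).
    """
    if len(input_pts) != len(output_pts):
        raise ValueError("expected one output PTS per input PTS")
    if not input_pts:
        return []

    # Assume the first frame carries the nominal transcoder offset.
    offset = output_pts[0] - input_pts[0]
    has_jitter = any(
        abs((out - src) - offset) > tolerance
        for src, out in zip(input_pts, output_pts)
    )
    if not has_jitter:
        return list(output_pts)

    # Rebuild the output stamps from the stored input stamps.
    return [src + offset for src in input_pts]


table = [0, 3000, 6000, 9000]          # 3000 ticks per frame at 90 kHz
transcoded = [1000, 4005, 6990, 10000]  # second and third stamps drifted
print(recover_pts(table, transcoded))
```

Rebuilding from the stored table preserves the original frame spacing, which is why the input stamps are kept before transcoding in the first place.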

Terminal and operating method thereof
11615777 · 2023-03-28

A terminal may include: a display that is divided into at least two areas when a real-time broadcast, of which a user of the terminal is a host, starts through a broadcasting channel, one of the at least two areas being allocated to the host; an input/output interface that receives a voice of the host; a communication interface that receives, from a terminal of a certain guest among one or more guests who entered the broadcasting channel, one item selected from one or more items and a certain text; and a processor that generates a voice message by converting the certain text into the voice of the host or a voice of the certain guest.