Patent classifications
H04N21/234336
Automated voice translation dubbing for prerecorded video
A method for aligning a translation of original caption data with an audio portion of a video is provided. The method includes identifying, by a processing device, original caption data for a video that includes a plurality of caption character strings. The processing device identifies speech recognition data that includes a plurality of generated character strings and associated timing information for each generated character string. The processing device maps the plurality of caption character strings to the plurality of generated character strings using assigned values indicative of semantic similarities between character strings. The processing device assigns timing information to the individual caption character strings based on timing information of mapped individual generated character strings. The processing device aligns a translation of the original caption data with the audio portion of the video using assigned timing information of the individual caption character strings.
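The mapping-and-timing-transfer step described above can be sketched as follows. This is a hypothetical illustration: the patent's "assigned values indicative of semantic similarities" are not specified, so plain string similarity via `difflib` stands in as the score, and the function name and data shapes are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical sketch: map each caption string to its most similar
# ASR-generated string and inherit that string's timing. A real system
# would use a semantic similarity model rather than difflib's ratio.

def align_captions(captions, asr_results):
    """captions: list of caption strings.
    asr_results: list of (generated_text, start_seconds, end_seconds)."""
    aligned = []
    for cap in captions:
        best = max(
            asr_results,
            key=lambda r: SequenceMatcher(None, cap.lower(), r[0].lower()).ratio(),
        )
        aligned.append({"caption": cap, "start": best[1], "end": best[2]})
    return aligned
```

A translation of each caption could then be placed at the inherited start/end times when generating the dubbed audio track.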
ACTION SYNCHRONIZATION FOR TARGET OBJECT
A method for synchronizing an action of a target object with source audio is provided. Facial parameter conversion is performed on an audio parameter of the source audio at different time periods to obtain source parameter information of the source audio at the respective time periods. Parameter extraction is performed on a target video that includes the target object to obtain target parameter information of the target video. Image reconstruction is performed on the target object in the target video based on the source parameter information of the source audio and the target parameter information of the target video, to obtain a reconstructed image. Further, a synthetic video is generated based on the reconstructed image, the synthetic video including the target object, and the action of the target object being synchronized with the source audio.
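A minimal sketch of the reconstruction step, under the assumption that per-period source parameters (e.g. mouth shape derived from audio) override the corresponding parameters extracted from the target video, while the target's remaining parameters (pose, identity) are kept. All names and the dictionary-merge representation are illustrative, not the patent's actual parameter model.

```python
# Hypothetical sketch: merge source-audio facial parameters into the
# target video's per-frame parameters before image reconstruction.

def reconstruct_frames(source_params_by_period, target_params_by_frame):
    """source_params_by_period: {period: {"mouth": ...}, ...}
    target_params_by_frame: list of (period, {"pose": ..., "mouth": ...})."""
    frames = []
    for period, target in target_params_by_frame:
        merged = dict(target)                              # keep target pose etc.
        merged.update(source_params_by_period.get(period, {}))  # source drives mouth
        frames.append(merged)
    return frames
```

Each merged parameter set would feed the image-reconstruction stage, and the resulting frames would be assembled into the synthetic video.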
SYSTEMS AND METHODS FOR CREATING A 2D FILM FROM IMMERSIVE CONTENT
Systems, methods, and non-transitory computer-readable media can obtain data associated with a computer-based experience. The computer-based experience can be based on interactive real-time technology. At least one virtual camera can be configured within the computer-based experience in a real-time engine. Data associated with an edit cut of the computer-based experience can be obtained based on content captured by the at least one virtual camera. A plurality of shots that correspond to two-dimensional content can be generated from the edit cut of the computer-based experience in the real-time engine. Data associated with a two-dimensional version of the computer-based experience can be generated with the real-time engine based on the plurality of shots. The two-dimensional version can be rendered based on the generated data.
VIRTUAL LIVE VIDEO STREAMING METHOD AND APPARATUS, DEVICE, AND READABLE STORAGE MEDIUM
This application discloses a virtual live streaming method and apparatus, a device, and a storage medium. The method includes: acquiring live text content, which is text content to be broadcast by voice by a virtual character in a virtual live stream; segmenting the live text content to obtain sequentially arranged text segments; acquiring a live broadcast data packet for each text segment following the sequence of the text segments, the live broadcast data packet comprising mouth shape data corresponding to the text segment and being used for determining a mouth shape of the virtual character for that segment; and performing screen rendering based on the live broadcast data packets to obtain a live screen for the virtual live stream, the live screen comprising the virtual character expressing each text segment with the corresponding mouth shape.
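The segment-then-package flow above can be sketched as follows. This is an assumption-laden illustration: the segmentation rule (a simple length cutoff) and the placeholder mouth-shape data are invented here, since the patent does not specify either.

```python
# Hypothetical sketch of segmenting live text and building one ordered
# data packet per segment, with placeholder mouth-shape data.

def segment_text(text, max_len=12):
    """Split the live text into sequentially ordered segments."""
    segments, current = [], []
    for word in text.split():
        current.append(word)
        if len(" ".join(current)) >= max_len:
            segments.append(" ".join(current))
            current = []
    if current:
        segments.append(" ".join(current))
    return segments

def build_packets(segments):
    """One live broadcast data packet per segment; mouth-shape data is a
    per-word placeholder (a real system would emit viseme sequences)."""
    return [{"seq": i, "text": seg, "mouth_shapes": ["viseme"] * len(seg.split())}
            for i, seg in enumerate(segments)]
```

The renderer would then consume the packets in `seq` order, driving the character's mouth shape segment by segment.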
FILE ENCAPSULATION METHOD, FILE TRANSMISSION METHOD, FILE DECODING METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM
A file encapsulation method and apparatus, a file transmission method and apparatus, a file decoding method and apparatus, an electronic device, and a storage medium are provided. The file encapsulation method includes: obtaining an encoded target video and temporal layer information of samples determined during encoding of the target video; encapsulating the encoded target video according to the temporal layer information of the samples to generate a first encapsulated file, the first encapsulated file including the temporal layer information of the samples; and transmitting the first encapsulated file to a first device.
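A minimal sketch of the encapsulation step, assuming the file simply pairs each encoded sample with its temporal layer. The dictionary layout is illustrative only; an actual implementation would use a container format such as ISOBMFF rather than these names.

```python
# Hypothetical sketch: encapsulate encoded samples together with the
# temporal-layer information determined at encode time.

def encapsulate(encoded_samples, temporal_layers):
    """Pair each encoded sample with its temporal layer to form the file."""
    assert len(encoded_samples) == len(temporal_layers)
    return {
        "samples": [{"data": s, "temporal_layer": t}
                    for s, t in zip(encoded_samples, temporal_layers)],
        "max_temporal_layer": max(temporal_layers),
    }
```

Carrying the layer information in the encapsulated file lets a receiver drop higher temporal layers (e.g. for a lower frame rate) without re-parsing the bitstream.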
METHODS AND SYSTEMS FOR SELECTIVE PLAYBACK AND ATTENUATION OF AUDIO BASED ON USER PREFERENCE
Systems and methods are presented for filtering unwanted sounds from a media asset. Voice profiles of a first character and a second character are generated based on a first voice signal and a second voice signal received from the media device during a presentation. The user provides a selection to avoid a certain sound or voice associated with the second character. During a presentation of the media asset, a second audio segment is analyzed to determine, based on the voice profile of the second character, whether the second voice signal includes the voice of the second character. If so, the output characteristics of the second voice signal are adjusted to reduce the sound.
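The matching-and-attenuation step can be sketched as follows. This is a simplified stand-in: a voice profile is modeled as a plain feature vector and compared by cosine similarity, whereas a real system would use speaker embeddings; the threshold and gain values are assumptions.

```python
import math

# Hypothetical sketch: attenuate playback of an audio segment whose
# features match the voice profile the user chose to avoid.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def output_gain(segment_features, blocked_profile, threshold=0.9, attenuated=0.1):
    """Return the playback gain: reduced if the segment matches the profile."""
    if cosine(segment_features, blocked_profile) >= threshold:
        return attenuated
    return 1.0
```

The returned gain would be applied to the segment's output characteristics before it reaches the speakers.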
VISUAL ASSETS OF AUDIOVISUAL SIGNALS
In some examples, an electronic device includes a network interface and a processor. The processor is to analyze an audiovisual signal received via the network interface to identify a topic, identify information related to the topic, and cause a display device to display a visual asset for the information in a video representing the audiovisual signal.
VIDEO PROCESSING FOR ENABLING SPORTS HIGHLIGHTS GENERATION
One or more highlights of a video stream may be identified. The highlights may be segments of a video stream, such as a broadcast of a sporting event, that are of particular interest to one or more users. According to one method, at least a portion of the video stream may be stored. The portion of the video stream may be compared with templates of a template database to identify the one or more highlights. Each highlight may be a subset of the video stream that is deemed likely to match the one or more templates. The highlights, an identifier that identifies each of the highlights within the video stream, and/or metadata pertaining particularly to the one or more highlights may be stored to facilitate playback of the highlights for the users.
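The template-matching idea above can be sketched as follows. The scoring function here is a toy set-overlap measure over symbolic features; the patent's template database and matching criteria are not specified, so everything below is illustrative.

```python
# Hypothetical sketch: compare candidate segments of a video stream
# against templates and keep those that score above a threshold.

def match_score(segment_features, template_features):
    """Toy similarity: fraction of template features present in the segment."""
    template = set(template_features)
    return len(set(segment_features) & template) / len(template)

def find_highlights(segments, templates, threshold=0.6):
    """segments: list of {"features": [...], "start": s, "end": e}."""
    highlights = []
    for idx, seg in enumerate(segments):
        if any(match_score(seg["features"], t) >= threshold for t in templates):
            highlights.append({"id": idx, "start": seg["start"], "end": seg["end"]})
    return highlights
```

The returned identifiers and time ranges correspond to the stored metadata that facilitates later playback of just the highlights.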
METHOD FOR JUST-IN-TIME TRANSCODING OF BYTERANGE-ADDRESSABLE PARTS
A method including: ingesting a video segment and a set of video features of the video segment; estimating a part size distribution for the video segment based on the set of video features and a first rendition of the video segment; calculating a maximum expected part size based on a threshold percentile in the part size distribution; at a first time, transmitting, to a video player, a manifest file indicating a set of byterange-addressable parts of the video segment in the first rendition, each byterange-addressable part characterized by the maximum expected part size; at a second time, receiving a playback request for a first byterange-addressable part; transcoding the first byterange-addressable part; in response to the maximum expected part size exceeding a size of the first byterange-addressable part in the first rendition, appending padding data to the first byterange-addressable part; and transmitting the first byterange-addressable part to the video player.
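The sizing-and-padding logic can be sketched as follows, under the assumption that the maximum expected part size is taken at a percentile of estimated part sizes and that any transcoded part smaller than the advertised size is zero-padded. The percentile computation and filler byte are illustrative choices.

```python
# Hypothetical sketch: pick a maximum expected part size from the
# estimated part-size distribution, and pad transcoded parts up to it
# so byteranges advertised in the manifest remain valid.

def max_expected_part_size(part_sizes, percentile=0.95):
    """Return the part size at the given percentile of the distribution."""
    ordered = sorted(part_sizes)
    index = min(int(len(ordered) * percentile), len(ordered) - 1)
    return ordered[index]

def pad_part(part: bytes, target_size: int, filler: bytes = b"\x00") -> bytes:
    """Append padding so the part matches the size advertised in the manifest."""
    if len(part) < target_size:
        part += filler * (target_size - len(part))
    return part
```

Because every advertised byterange has the same fixed size, the manifest can be sent before any part is actually transcoded, which is what makes the just-in-time scheme work.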
Policy based transcoding
Methods and systems are disclosed for providing video content in response to requests in a content delivery system with greater speed and efficiency. In some aspects, network monitoring devices may gather content-specific and network performance metrics, from user devices and content delivery components, to provide input to a computing device for deciding whether to store or delete different versions of the same or different items of content. The decision may be based on a policy which may include a weighted score based on a combination of usage and network efficiency scores. In other aspects, methods and systems are provided to initially provide to a user device a stored version of a content item, and then switch, as needed, to a different version of the content item using on-demand transcoding.
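The weighted-score policy can be sketched as follows. The weights, threshold, and metric ranges are assumptions made for illustration; the patent only states that the score combines usage and network efficiency.

```python
# Hypothetical sketch: score each cached rendition by a weighted sum of
# a usage metric and a network-efficiency metric, and mark low scorers
# for deletion.

def policy_score(usage, efficiency, w_usage=0.6, w_efficiency=0.4):
    """Weighted combination of usage and network-efficiency scores in [0, 1]."""
    return w_usage * usage + w_efficiency * efficiency

def renditions_to_delete(renditions, threshold=0.5):
    """renditions: list of (name, usage, efficiency) tuples."""
    return [name for name, usage, efficiency in renditions
            if policy_score(usage, efficiency) < threshold]
```

A deleted rendition could still be served later via the on-demand transcoding path described above, trading storage for transcoding cost.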