G06F16/7844

Transcription of speech
09786283 · 2017-10-10

A speech media transcription system comprises a playback device arranged to play back speech delimited in segments. The system is programmed to provide, for a segment being transcribed, an adaptive estimate of the proportion of the segment that has not been transcribed by a transcriber. The device is arranged to play back that proportion of the segment, optionally after having already played back the entire segment. Additionally, a segmentation engine is arranged to divide speech media into a plurality of segments by identifying speech as such and using timing information but without using a machine conversion of the speech media into text or a representation of text.
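For illustration only, the adaptive estimate could be sketched as below. The abstract does not say how the estimate is computed. This sketch assumes it compares the number of words typed so far with the number expected from the segment duration and a running speaking-rate estimate. The function names, the words-per-second model, and the smoothing factor `alpha` are all hypothetical.

```python
def untranscribed_proportion(words_typed: int,
                             segment_seconds: float,
                             words_per_second: float) -> float:
    """Estimate the fraction of a segment not yet transcribed.

    Assumption: expected word count = duration * speaking rate.
    """
    expected_words = segment_seconds * words_per_second
    if expected_words <= 0:
        return 0.0
    # Clamp to [0, 1] so over-typing never yields a negative proportion.
    return min(1.0, max(0.0, 1.0 - words_typed / expected_words))


def replay_start(segment_start: float,
                 segment_seconds: float,
                 proportion: float) -> float:
    """Playback position from which the untranscribed tail would be replayed."""
    return segment_start + (1.0 - proportion) * segment_seconds


def update_rate(rate: float,
                words_in_segment: int,
                segment_seconds: float,
                alpha: float = 0.5) -> float:
    """Adapt the speaking-rate estimate after a segment is fully transcribed.

    Assumption: exponential smoothing toward the observed rate.
    """
    observed = words_in_segment / segment_seconds
    return (1.0 - alpha) * rate + alpha * observed
```

With a 10-second segment, an assumed rate of 2 words per second, and 10 words typed, the sketch estimates that half the segment remains and would replay from the midpoint.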

SYSTEMS AND METHODS FOR RECORDING MEDIA ASSETS
20170280191 · 2017-09-28

Systems and methods are provided to record portions of media assets. A user request is received to record a media asset, together with a criterion for recording portions of that media asset. A content recognition algorithm is executed against segments of the media asset to determine a set of keywords associated with those segments. Separately, a set of keywords associated with the criterion is generated. The sets of keywords are compared to identify segments that match the criterion. If it is determined that a first segment and a third segment each match the criterion and a second segment does not, a delete indicator is added to the second segment and the first and third segments are compared. If those segments match, the delete indicator is removed from the second segment.
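The delete-indicator logic can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the claimed method: two keyword sets are treated as matching when they share at least one keyword, every segment that fails the criterion starts with a delete indicator, and the helper names `matches` and `delete_flags` are hypothetical.

```python
from typing import List, Set


def matches(a: Set[str], b: Set[str]) -> bool:
    # Assumption: two keyword sets "match" if they share any keyword.
    return bool(a & b)


def delete_flags(segments: List[Set[str]], criterion: Set[str]) -> List[bool]:
    """Return one delete indicator per segment (True = marked for deletion)."""
    keep = [matches(seg, criterion) for seg in segments]
    # Assumption: segments that do not match the criterion are flagged.
    delete = [not k for k in keep]
    for i in range(1, len(segments) - 1):
        first_ok, second_ok, third_ok = keep[i - 1], keep[i], keep[i + 1]
        if first_ok and third_ok and not second_ok:
            # The flagged middle segment sits between two matching segments.
            # If its neighbours also match each other, clear the indicator.
            if matches(segments[i - 1], segments[i + 1]):
                delete[i] = False
    return delete
```

For example, with segments `[{"goal", "soccer"}, {"advert"}, {"soccer", "penalty"}, {"weather"}]` and criterion `{"soccer"}`, the advert segment is kept because its neighbours match each other, while the trailing weather segment stays flagged.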

SYSTEMS AND METHODS FOR CONTROLLING DISPLAY OF SUPPLEMENTARY DATA FOR VIDEO CONTENT

A processor-implemented method is disclosed. The method includes: obtaining metadata associated with a video; identifying one or more tradeable objects associated with video content of the video based on performing textual comparison between text of the metadata and a defined list of tradeable objects; determining one or more segments of the video corresponding to the one or more identified tradeable objects, the one or more video segments having respective playback start timestamps; receiving, via a client device during playback of the video, a user selection of a first one of the video segments; in response to receiving the user selection: generating supplementary display data associated with a first tradeable object corresponding to the first video segment; and sending, to the client device, the supplementary display data.
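A hedged sketch of the textual-comparison step is given below. It assumes the metadata arrives as chapter-like entries, each a `(start_timestamp, text)` pair, and that the comparison is a case-insensitive whole-word search. The abstract specifies neither detail, and `find_segments` is a hypothetical name.

```python
import re
from typing import List, Tuple


def find_segments(entries: List[Tuple[float, str]],
                  tradeable_objects: List[str]) -> List[Tuple[float, str]]:
    """Return (playback start timestamp, tradeable object) pairs.

    Assumption: each metadata entry describes one video segment.
    """
    found = []
    for start, text in entries:
        for obj in tradeable_objects:
            # Whole-word, case-insensitive comparison against the metadata text.
            pattern = r"\b" + re.escape(obj) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found.append((start, obj))
    return found
```

A user selection of one of the returned segments would then drive the lookup of supplementary display data for the associated object.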

Generating personalized clusters of multimedia content elements based on user interests

A system and method for generating personalized multimedia content element clusters. The method includes determining, based on at least one user interest, at least one personalized concept, wherein each personalized concept represents one of the at least one user interest; obtaining at least one multimedia content element related to a user; generating at least one signature for the at least one multimedia content element, each generated signature representing at least a portion of the at least one multimedia content element; determining, based on the generated at least one signature, at least one multimedia content element cluster, wherein each cluster includes a plurality of multimedia content elements sharing a common concept of the at least one personalized concept; and creating at least one personalized multimedia content element cluster by adding, to each determined cluster, at least one of the at least one multimedia content element sharing the common concept of the cluster.

Method and system for modeling image of interest to users
20170277785 · 2017-09-28

A system and method for modeling and distributing image data of interest to users is disclosed. Users on user devices such as mobile phones send request messages for image data captured by surveillance cameras of the system. The request messages include information for selecting the image data, such as camera number and time of recording of the image data, in examples. In response, an application server of the system collects the image data from the surveillance cameras, and supplies image data to the users based on a model that the application server creates and updates for each of the users. The model ranks image data of potential interest for each of the users, where the model is based on the information for selecting the image data provided by the users. Preferably, a machine learning application of the application server creates the model for each of the users.

Content Descriptor
20210406306 · 2021-12-30

An apparatus, method, system and computer-readable medium are provided for generating one or more descriptors that may potentially be associated with content, such as video or a segment of video. In some embodiments, a teaser for the content may be identified based on contextual similarity between words and/or phrases in the segment and one or more other segments, such as a previous segment. In some embodiments, an optical character recognition (OCR) technique may be applied to the content, such as banners or graphics associated with the content in order to generate or identify OCR'd text or characters. The text/characters may serve as a candidate descriptor(s). In some embodiments, one or more strings of characters or words may be compared with (pre-assigned) tags associated with the content, and if it is determined that the one or more strings or words match the tags within a threshold, the one or more strings or words may serve as a candidate descriptor(s). One or more candidate descriptor identification techniques may be combined.
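The tag-comparison technique might look like the sketch below. The abstract does not name a similarity measure, so this example substitutes the ratio from Python's standard `difflib.SequenceMatcher`. The function name `candidate_descriptors` and the default threshold of 0.8 are assumptions.

```python
from difflib import SequenceMatcher
from typing import List


def candidate_descriptors(strings: List[str],
                          tags: List[str],
                          threshold: float = 0.8) -> List[str]:
    """Keep strings (e.g. OCR'd text) that match any tag within the threshold."""
    candidates = []
    for s in strings:
        for tag in tags:
            # ratio() is 1.0 for identical strings and falls toward 0.0
            # as the strings diverge; compare case-insensitively.
            score = SequenceMatcher(None, s.lower(), tag.lower()).ratio()
            if score >= threshold:
                candidates.append(s)
                break  # one matching tag is enough
    return candidates
```

A fuzzy measure is used here because OCR output often contains small character errors, so an exact comparison against the pre-assigned tags would miss near matches.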

MEME PACKAGE GENERATION METHOD, ELECTRONIC DEVICE, AND MEDIUM

A computer-implemented method is provided. The method includes: acquiring at least one piece of target feedback information for a plurality of related videos associated with a target video, wherein the target video and the plurality of related videos are related to the same video producer; matching the at least one piece of target feedback information with the target video; determining at least one target video clip from the target video based on a matching result; and generating a meme package at least based on the at least one target video clip, the meme package being particular to the video producer.

VIDEO SECURITY SYSTEM
20210409825 · 2021-12-30

A system and method are provided for tagging portions of a video with an access-control list (ACL) covering those periods of time (e.g., video segments, series of video frames, etc.) within the video that require elevated permission to view or access. At playback time, the access-control list is dynamically enforced in real time to ensure that a user viewing or editing the video has permission to view upcoming portions of the video. When a user has insufficient permission to view a portion of the video, a filler frame, blurred content, or a blank frame is displayed in place of the actual video content. Audio may also be muted or beeped out during the periods of the video for which the user has insufficient permissions.
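A minimal sketch of playback-time enforcement follows. It assumes each ACL entry covers a half-open time range and names a single required permission. The `AclEntry` type, the `playback_action` function, and the returned action labels are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class AclEntry:
    start: float    # seconds, inclusive
    end: float      # seconds, exclusive
    required: str   # permission needed to view this period


def playback_action(t: float,
                    acl: List[AclEntry],
                    user_permissions: FrozenSet[str]) -> Tuple[str, str]:
    """Return (video action, audio action) for playback position t."""
    for entry in acl:
        in_range = entry.start <= t < entry.end
        if in_range and entry.required not in user_permissions:
            # Insufficient permission: substitute the frame and mute audio.
            return ("filler_frame", "mute")
    return ("show", "play")
```

Checking the ACL per playback position, rather than once per video, is what lets a single file serve viewers with different permission levels.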

VIDEO PROCESSING METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM THEREOF

A video processing method, an electronic device and a storage medium, which relate to the fields of video recognition and understanding and deep learning, are disclosed. The method may include: during video playback, for to-be-processed audio data that has not yet been played and that is determined according to a predetermined policy, performing the following processing: extracting a word/phrase meeting a predetermined requirement from text content corresponding to the audio data, as a tag of the audio data; determining a special effect animation corresponding to the audio data according to the tag; and superimposing the special effect animation on a corresponding video picture for display when the audio data begins to be played.

Method and apparatus for commenting video

Embodiments of the present disclosure provide a method and apparatus for commenting a video, and relate to the field of cloud computing. The method may include: acquiring content information of a to-be-processed video frame; constructing text description information based on the content information, the text description information being used to describe the content of the to-be-processed video frame; importing the text description information into a pre-trained text conversion model to obtain commentary text information corresponding to the text description information, the text conversion model being used to convert the text description information into the commentary text information; and converting the commentary text information into audio information.