G06V20/635

Subtitle extraction method and device, storage medium

A subtitle extraction method includes decoding a video to obtain video frames; performing an adjacency operation in a subtitle arrangement direction on pixels in the video frames to obtain adjacency regions in the video frames; determining video frames that include the same subtitle based on the adjacency regions; and determining subtitle regions in those frames based on the distribution positions of the adjacency regions. The method also includes constructing a component tree for at least two channels of the subtitle regions and using the constructed component tree to extract a contrasting extremal region corresponding to each channel; performing color enhancement processing on the contrasting extremal regions of the at least two channels to form color-enhanced contrasting extremal regions; and extracting the subtitle by merging the color-enhanced contrasting extremal regions of the at least two channels.
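As a rough illustration of the adjacency idea only (not the patented pipeline), the sketch below groups "on" pixels of a binarized frame into regions by bridging small gaps along the horizontal subtitle-arrangement direction and then merging overlapping runs on adjacent rows. The `max_gap` parameter and the single-region merge rule are assumptions made for the example.

```python
def horizontal_runs(row, max_gap=2):
    """Find runs of 1-pixels in a row, bridging gaps of up to max_gap zeros
    (the adjacency operation along the subtitle arrangement direction)."""
    runs, start, gap = [], None, 0
    for i, v in enumerate(row):
        if v:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap > max_gap:
                runs.append((start, i - gap))
                start, gap = None, 0
    if start is not None:
        runs.append((start, len(row) - 1 - gap))
    return runs


def adjacency_regions(frame, max_gap=2):
    """Merge runs on vertically adjacent rows whose columns overlap into
    regions; returns (row_min, row_max, col_min, col_max) per region.
    Simplification: a run joins at most one region from the previous row."""
    regions, prev = [], []
    for r, row in enumerate(frame):
        cur = []
        for c0, c1 in horizontal_runs(row, max_gap):
            idx = next((ri for p0, p1, ri in prev if c0 <= p1 and p0 <= c1), None)
            if idx is None:
                idx = len(regions)
                regions.append([r, r, c0, c1])
            else:
                reg = regions[idx]
                reg[1], reg[2], reg[3] = r, min(reg[2], c0), max(reg[3], c1)
            cur.append((c0, c1, idx))
        prev = cur
    return [tuple(reg) for reg in regions]
```

A subtitle line with a small inter-character gap then collapses into a single wide region whose bounding box approximates the subtitle region.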

Electronic device and control method thereof
11367283 · 2022-06-21

An electronic device is disclosed. The electronic device includes a communicator configured to receive a video consisting of a plurality of frames, a processor configured to sense a frame having a predetermined object in the received video, extract information from the sensed frame, and generate metadata by using the extracted information, and a memory configured to store the generated metadata.

A METHOD AND SYSTEM FOR MATCHING CLIPS WITH VIDEOS VIA MEDIA ANALYSIS
20220189174 · 2022-06-16

A method includes comparing each textless video clip to a plurality of portions of the video file corresponding to the full-length video file; determining each textless video clip that is similar to only one portion of the video file to be a matched pair; for each matched pair, identifying whether their text content is different, wherein identification of different text content dictates that the textless video clip corresponds to a portion of the video file having overlaid text; training a classifier to predict whether an area of text detected in the full-length video is overlaid text; determining the probability of each portion of the full-length video having overlaid text; determining each textless video clip that is similar to more than one portion of the video file to be a potential matched pair; and resolving the potential matched pairs with the determined probability.
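The resolution step can be pictured with a small sketch (the data shapes and the similarity threshold are assumptions, not from the patent): a clip similar to exactly one portion is a matched pair, and an ambiguous clip is resolved toward the candidate portion with the highest predicted probability of carrying overlaid text.

```python
def resolve_matches(similarity, overlay_prob, threshold=0.9):
    """similarity: {clip: {portion: score}}; overlay_prob: {portion: p}.
    Returns a {clip: portion} mapping."""
    matches = {}
    for clip, scores in similarity.items():
        # portions this textless clip is similar to
        candidates = [p for p, s in scores.items() if s >= threshold]
        if len(candidates) == 1:
            matches[clip] = candidates[0]          # unambiguous matched pair
        elif len(candidates) > 1:
            # potential matched pair: prefer the portion most likely to
            # carry overlaid text, per the trained classifier's probability
            matches[clip] = max(candidates, key=lambda p: overlay_prob[p])
    return matches
```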

Methods and systems for automatic generation and convergence of keywords and/or keyphrases from a media

Embodiments herein disclose methods and systems for automatic generation and convergence of keywords and/or keyphrases from a media. A method disclosed herein includes analyzing at least one source of the media to obtain at least one text, wherein the at least one source includes at least one audio portion and at least one visual portion. The method further includes extracting at least one keyword of a plurality of keywords from the extracted at least one text. The method further includes generating at least one keyphrase of a plurality of keyphrases for the extracted at least one keyword. The method further includes merging at least one of the at least one keyword and the at least one keyphrase to generate a plurality of elements from the media, wherein the plurality of elements includes a context-dependent set of at least one of the plurality of keywords and the plurality of keyphrases.
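A toy sketch of the extract-generate-merge flow; frequency-based keyword extraction and adjacent-bigram keyphrases are stand-ins chosen for illustration, since the abstract does not specify the underlying techniques.

```python
import re
from collections import Counter

# illustrative stopword list, not from the disclosure
STOPWORDS = {"the", "a", "of", "and", "to", "is", "in", "for", "from"}


def extract_keywords(text, top_n=5):
    """Keywords by simple term frequency after stopword removal."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    return [w for w, _ in Counter(words).most_common(top_n)]


def generate_keyphrases(text, keywords):
    """Keyphrases as adjacent pairs of extracted keywords."""
    words = re.findall(r"[a-z]+", text.lower())
    phrases = {f"{w1} {w2}" for w1, w2 in zip(words, words[1:])
               if w1 in keywords and w2 in keywords}
    return sorted(phrases)


def merge_elements(keywords, keyphrases):
    """Merge keywords and keyphrases into one element set, dropping
    keywords already covered by a keyphrase."""
    covered = {w for p in keyphrases for w in p.split()}
    return keyphrases + [k for k in keywords if k not in covered]
```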

Video caption generating method and apparatus, device, and storage medium

A video caption generating method is provided, performed by a computer device. The method includes encoding a target video by using an encoder of a video caption generating model to obtain a target visual feature of the target video; decoding the target visual feature by using a basic decoder of the video caption generating model to obtain a first selection probability corresponding to a candidate word; decoding the target visual feature by using an auxiliary decoder of the video caption generating model to obtain a second selection probability corresponding to the candidate word, a memory structure of the auxiliary decoder including reference visual context information corresponding to the candidate word; determining a decoded word among the candidate words according to the first selection probability and the second selection probability; and generating a video caption according to the decoded word.
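The final selection step might look like this in miniature; the linear combination weight is an assumption, since the abstract does not state how the two decoders' probabilities are fused.

```python
def select_word(p_basic, p_aux, weight=0.5):
    """Fuse the basic decoder's and auxiliary decoder's selection
    probabilities per candidate word and return the argmax word."""
    combined = {w: (1 - weight) * p_basic[w] + weight * p_aux.get(w, 0.0)
                for w in p_basic}
    return max(combined, key=combined.get)
```

Setting `weight=0.0` recovers the basic decoder alone, which makes the auxiliary decoder's contribution easy to ablate.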

METHODS AND APPARATUS TO MEASURE BRAND EXPOSURE IN MEDIA STREAMS
20230267733 · 2023-08-24

Methods and apparatus to measure brand exposure in media streams are disclosed. Example apparatus disclosed herein are to determine a first histogram based on at least one of luminance components or chrominance components of a first frame of video, and determine a second histogram based on at least one of luminance components or chrominance components of a second frame of the video. Disclosed example apparatus are also to detect a transition in the video based on the first histogram and the second histogram. Responsive to the detection of the transition in the video, disclosed example apparatus are further to process a region of interest within at least one of the first frame or the second frame to detect a logo in the region of interest.
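A minimal sketch of histogram-based transition detection, assuming grayscale frames given as flat lists of 0–255 luminance values, a total-variation distance between normalized histograms, and a hand-picked threshold:

```python
def luminance_histogram(frame, bins=16):
    """Normalized histogram of luminance values in [0, 255]."""
    hist = [0] * bins
    for y in frame:
        hist[min(y * bins // 256, bins - 1)] += 1
    total = sum(hist)
    return [h / total for h in hist]


def is_transition(frame_a, frame_b, threshold=0.5):
    """Flag a transition when the histograms of two frames differ enough."""
    ha = luminance_histogram(frame_a)
    hb = luminance_histogram(frame_b)
    distance = sum(abs(a - b) for a, b in zip(ha, hb)) / 2  # total variation
    return distance > threshold
```

A hard cut between a dark and a bright shot pushes the distance toward 1, while consecutive frames of the same shot stay near 0.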

Methods and Systems for Scoreboard Text Region Detection

A computing system automatically detects, within a digital video frame, a video frame region that depicts a textual expression of a scoreboard. The computing system (a) engages in an edge-detection process to detect edges of at least scoreboard image elements depicted by the digital video frame, with at least some of these edges being of the textual expression and defining alphanumeric shapes; (b) applies pattern-recognition to identify the alphanumeric shapes; (c) establishes a plurality of minimum bounding rectangles each bounding a respective one of the identified alphanumeric shapes; (d) establishes, based on at least two of the minimum bounding rectangles, a composite shape that encompasses the identified alphanumeric shapes that were bounded by the at least two minimum bounding rectangles; and (e) based on the composite shape occupying a particular region, deems the particular region to be the video frame region that depicts the textual expression.
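Steps (c)–(e) reduce to bounding-box arithmetic; here is a sketch under the assumption that the composite shape is the axis-aligned union of the per-glyph minimum bounding rectangles:

```python
def min_bounding_rect(points):
    """Axis-aligned minimum bounding rectangle of an alphanumeric shape,
    given as (x, y) points; returns (x0, y0, x1, y1)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def composite_shape(rects):
    """Composite region encompassing several minimum bounding rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))
```

The composite rectangle over a row of digit glyphs then marks the candidate scoreboard text region.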

System and method for automatically identifying locations in video content for inserting advertisement breaks
11336930 · 2022-05-17

An automated method is provided for identifying candidate locations in video content for inserting advertisement (ad) breaks. Each candidate location is a different offset time from the beginning of the video content. Different distinct characteristics of the video content are identified at offset times. Certain characteristics are desirable and certain other characteristics are not desirable. Candidate locations are identified which have the most desirable characteristics at particular offset times, but which do not have any of the undesirable characteristics at any of the offset times.
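One way to picture the selection logic (the characteristic labels and the count-based scoring are illustrative stand-ins, not the patent's): score each offset by its desirable characteristics and drop any offset carrying an undesirable one.

```python
def candidate_ad_breaks(characteristics, desirable, undesirable, top_n=3):
    """characteristics: {offset_seconds: set of characteristic labels}.
    Discard offsets with any undesirable characteristic, score the rest
    by how many desirable characteristics they have, and return the
    top-scoring offsets (ties broken by earlier offset)."""
    scored = []
    for offset, labels in characteristics.items():
        if labels & undesirable:
            continue  # an undesirable characteristic disqualifies the offset
        score = len(labels & desirable)
        if score:
            scored.append((offset, score))
    scored.sort(key=lambda t: (-t[1], t[0]))
    return [offset for offset, _ in scored[:top_n]]
```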

3D CAPTIONS WITH FACE TRACKING

Aspects of the present disclosure involve a system comprising a computer-readable storage medium storing at least one program and method for performing operations comprising: receiving, by one or more processors that implement a messaging application, a video feed from a camera of a user device; detecting, by the messaging application, a face in the video feed; in response to detecting the face in the video feed, retrieving a three-dimensional (3D) caption; modifying the video feed to include the 3D caption at a position in 3D space of the video feed proximate to the face; and displaying a modified video feed that includes the face and the 3D caption.

System and method for detecting and classifying direct response advertisements using fingerprints

System and method for detecting and classifying direct response advertisements. The system includes a unit for generating an advertisement candidate segment for an advertisement section detected from a broadcast stream; a matching unit for determining whether the candidate segment matches each advertisement segment stored in a database (DB); a unit for, if the matching unit determines that a segment matching the candidate segment is not present, determining whether the candidate segment is a direct response advertisement; a registration unit for storing the candidate segment, determined to be a direct response advertisement, as an advertisement segment that is the direct response advertisement in the DB; and a direct response advertisement grouping unit for, if the matching unit determines that an advertisement segment matching the candidate segment is present, and the matching segment is a direct response advertisement, grouping the candidate segment with DB-stored advertisement segments that are direct response advertisements.
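The decision flow can be sketched with toy integer fingerprints and a Hamming-distance match; the fingerprinting scheme, the distance bound, and the `looks_direct_response` classifier input are all assumptions made for illustration.

```python
def hamming(a, b):
    """Bit distance between two integer fingerprints."""
    return bin(a ^ b).count("1")


def process_candidate(db, fp, looks_direct_response, max_distance=4):
    """db: list of {'fp': int, 'dr': bool, 'group': int} segments.
    Match the candidate fingerprint against stored segments; group it
    with a matching direct response (DR) segment, or register it as a
    new DR group when no match exists and the (stubbed) classifier says
    it is a DR ad. Returns the group id, or None for non-DR outcomes."""
    for seg in db:
        if hamming(seg["fp"], fp) <= max_distance:
            if seg["dr"]:
                # group the candidate with the matching DB-stored DR segment
                db.append({"fp": fp, "dr": True, "group": seg["group"]})
                return seg["group"]
            return None  # matched a non-DR ad; nothing to register
    if looks_direct_response:  # classifier decision, stubbed as an input here
        group = max((s["group"] for s in db), default=-1) + 1
        db.append({"fp": fp, "dr": True, "group": group})
        return group
    return None
```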