G06V20/43

System for the automated, context sensitive, and non-intrusive insertion of consumer-adaptive content in video

Described herein is a method and system for automated, context-sensitive, and non-intrusive insertion of consumer-adaptive content in video. It assesses ‘context’ in the video that a consumer is viewing through multiple modalities and metadata about the video. The method and system analyze relevance for a consumer based on multiple factors, such as the profile information of the end user, the history of the content, social media and consumer interests, and professional or educational background, using patterns from multiple sources. The system also implements local context through search techniques that localize sufficiently large, homogeneous regions in the image that do not obscure protagonists or objects in focus and are viable candidate regions for inserting the intended content. This makes relevant, curated content available to the user with minimal effort and without hampering the viewing experience of the main video.
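The local-context search can be illustrated with a minimal sketch. It is not the patented method: it assumes a grayscale frame given as a list of pixel rows, uses pixel variance as a stand-in measure of homogeneity, and takes the bounding boxes of protagonists or in-focus objects as already known. The names `find_insertion_region`, `protected`, and `max_variance` are hypothetical.

```python
from statistics import pvariance


def overlaps(a, b):
    """Return True if two boxes, each (top, left, height, width), intersect."""
    at, al, ah, aw = a
    bt, bl, bh, bw = b
    return at < bt + bh and bt < at + ah and al < bl + bw and bl < al + aw


def find_insertion_region(frame, size, protected, max_variance):
    """Slide a window over the frame and return the most uniform free box.

    frame        -- grayscale image as a list of equal-length rows
    size         -- (height, width) of the content to insert
    protected    -- boxes that must stay visible (assumed to come from a detector)
    max_variance -- assumed homogeneity cutoff; windows above it are rejected
    """
    h, w = size
    rows, cols = len(frame), len(frame[0])
    best = None  # (variance, box) of the best candidate so far
    for top in range(rows - h + 1):
        for left in range(cols - w + 1):
            box = (top, left, h, w)
            if any(overlaps(box, p) for p in protected):
                continue  # would cover a protagonist or object in focus
            pixels = [frame[r][c]
                      for r in range(top, top + h)
                      for c in range(left, left + w)]
            var = pvariance(pixels)
            if var <= max_variance and (best is None or var < best[0]):
                best = (var, box)
    return None if best is None else best[1]
```

The function returns `None` when no window is both unprotected and uniform enough, in which case no content would be inserted into that frame.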

Real-time IoT device reliability and maintenance system and method
11520677 · 2022-12-06

The present invention generally relates to systems and methods for detecting and/or isolating any causes of defective and/or partially defective IoT devices or individual sensor device(s). In embodiments, the present invention generally relates to fixing, replacing, and/or troubleshooting IoT devices and/or individual sensor device(s) that are defective and/or partially defective.

Acquiring public opinion and training word viscosity model

A public opinion acquisition method and device, a word viscosity model training method and device, a server, and a medium are provided in the present disclosure. The present disclosure relates to the technical field of artificial intelligence, specifically to image recognition and natural language processing, and can be used in a cloud platform. A video public opinion acquisition method includes: receiving a public opinion acquisition request, the public opinion acquisition request including a public opinion keyword to be acquired; matching the public opinion keyword to be acquired with video data including a recognition result, wherein the recognition result is obtained by performing predefined content recognition on the video data, the predefined content recognition including text recognition and image recognition; and determining video data that matches the public opinion keyword to be acquired as result video data.
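The matching step can be sketched as follows. This is only an illustration under stated assumptions: recognition has already been run, its results are stored per video as recognized text plus a list of image labels, and matching is a case-insensitive comparison. `VideoRecord`, `ocr_text`, `image_labels`, and `match_videos` are hypothetical names, and the abstract does not say how the keyword is actually compared with the recognition result.

```python
from dataclasses import dataclass, field


@dataclass
class VideoRecord:
    """Video data together with its stored recognition result (assumed layout)."""
    video_id: str
    ocr_text: str = ""                                 # text recognition result
    image_labels: list = field(default_factory=list)   # image recognition result


def match_videos(keyword, records):
    """Return the ids of videos whose recognition result matches the keyword."""
    needle = keyword.casefold()
    results = []
    for rec in records:
        in_text = needle in rec.ocr_text.casefold()
        in_labels = any(needle == label.casefold() for label in rec.image_labels)
        if in_text or in_labels:
            results.append(rec.video_id)
    return results


records = [
    VideoRecord("v1", ocr_text="Acme recalls product"),
    VideoRecord("v2", image_labels=["acme"]),
    VideoRecord("v3", ocr_text="weather update"),
]
matched = match_videos("ACME", records)
```

Here `matched` holds the videos that would be returned as result video data for the keyword "ACME".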

Method and apparatus for generating commentary

Embodiments of the present disclosure provide a method and apparatus for generating a commentary. The method may include: acquiring at least one news cluster composed of pieces of news generated within a first preset time length, where the pieces of news in each news cluster are directed to a given news event; determining a target news cluster based on the at least one news cluster; determining, for each piece of news in the target news cluster, a score of being suitable for generating a commentary for the piece of news; and generating, based on a piece of target news, a commentary for the target news cluster, where the piece of target news is the piece of news having the highest score of being suitable for generating a commentary in the target news cluster.
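The selection logic can be sketched in a few lines. The abstract does not state how the target cluster is chosen or how suitability is scored, so both are assumptions here: the largest cluster is taken as the target, and the scoring function is supplied by the caller. `pick_target_news` and `word_count` are hypothetical names.

```python
def pick_target_news(clusters, score):
    """Choose a target cluster, then the best-scoring piece of news in it.

    clusters -- list of clusters, each a list of news texts on one event
    score    -- callable rating how suitable a piece is for a commentary
    """
    # Assumption: the target cluster is the one with the most pieces of news.
    target_cluster = max(clusters, key=len)
    # The target news is the piece with the highest suitability score.
    target_news = max(target_cluster, key=score)
    return target_cluster, target_news


def word_count(text):
    """Placeholder suitability score (assumed): longer pieces rank higher."""
    return len(text.split())


clusters = [
    ["Storm hits coast", "Storm hits coast and closes the main port"],
    ["Local team wins"],
]
cluster, news = pick_target_news(clusters, word_count)
```

The commentary would then be generated from `news` on behalf of the whole `cluster`.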

Intelligent cataloging method for all-media news based on multi-modal information fusion understanding

The present disclosure provides an intelligent cataloging method for all-media news based on multi-modal information fusion understanding. The method obtains multi-modal fusion features by unified representation and fusion understanding of video information, voice information, subtitle bar information, and character information in the all-media news, and uses the multi-modal fusion features to realize automatic slicing, automatic cataloging description, and automatic scene classification of news. The beneficial effect of the present disclosure is that it realizes the complete process of automatic comprehensive cataloging for all-media news, improves the accuracy and generalization of the cataloging method, and greatly reduces manual cataloging time by generating stripping marks, news cataloging descriptions, news classification labels, news keywords, and news characters based on the multi-modal fusion of video, audio, and text.

Learning representations of generalized cross-modal entailment tasks
11250299 · 2022-02-15

A method is provided for determining entailment between an input premise and an input hypothesis of different modalities. The method includes extracting features from the input hypothesis and from both the entirety of and regions of interest in the input premise. The method further includes deriving intra-modal relevant information while suppressing intra-modal irrelevant information, based on intra-modal interactions between elementary ones of the features of the input hypothesis and between elementary ones of the features of the input premise. The method also includes attaching cross-modal relevant information from the features of the input premise to the features of the input hypothesis to form a cross-modal representation, based on cross-modal interactions between pairs of different elementary features from different modalities. The method additionally includes classifying a relationship between the input premise and the input hypothesis using a label selected from the group consisting of entailment, neutral, and contradiction, based on the cross-modal representation.

Image display device and operating method for enlarging an image displayed in a region of a display and displaying the enlarged image variously

An image display device including a display configured to display a first image is provided. The image display device includes a controller configured to generate a second image by enlarging a part of the first image displayed in a first region of the display and to control the display to display a part of the second image in the first region, and a sensor configured to sense a user input for moving the second image. In response to the user input, the controller is configured to control the display to move and display the second image, within the first region.
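The move-within-region behavior can be sketched as a clamping computation. This is an illustration only, assuming the second image is the first region scaled by a uniform zoom factor and that the visible part is selected by a pixel offset into the enlarged image. `clamp_offset`, `pan`, and the sign convention of the drag are hypothetical.

```python
def clamp_offset(offset, region_size, zoom):
    """Clamp a one-axis offset so the enlarged image always fills the region."""
    enlarged_size = region_size * zoom
    max_offset = enlarged_size - region_size  # furthest the view can scroll
    return max(0, min(offset, max_offset))


def pan(offset_xy, drag_xy, region_wh, zoom):
    """Apply a user drag to the view offset and keep it inside the enlarged image.

    offset_xy -- current (x, y) offset of the visible window in the second image
    drag_xy   -- (dx, dy) from the sensed user input (sign convention assumed)
    region_wh -- (width, height) of the first region on the display
    zoom      -- enlargement factor used to generate the second image
    """
    return tuple(
        clamp_offset(o + d, size, zoom)
        for o, d, size in zip(offset_xy, drag_xy, region_wh)
    )


# A 100x80 region enlarged 2x gives a 200x160 second image.
moved = pan((0, 0), (50, -10), (100, 80), 2)
at_edge = pan((90, 70), (50, 50), (100, 80), 2)
```

Clamping keeps the displayed part of the second image confined to the first region, so the rest of the display is unaffected by the drag.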

Real-time IoT device reliability maintenance system and method
11726891 · 2023-08-15

The present invention generally relates to systems and methods for detecting and/or isolating any causes of defective and/or partially defective IoT devices or individual sensor device(s). In embodiments, the present invention generally relates to fixing, replacing, and/or troubleshooting IoT devices and/or individual sensor device(s) that are defective and/or partially defective.

System and method for eliminating bias in selectively edited video

Techniques for eliminating bias in selectively edited videos are provided. A request to release a video capturing a public safety incident is received. The video is edited to create an edited video. At least one civilian score and at least one public safety official score are computed based on the sentiment of the video. At least one edited civilian score and at least one edited public safety official score are computed based on the sentiment of the video. A first score is computed based on a combination of the civilian score and public safety official score. A second score is computed based on a combination of the edited civilian score and edited public safety official score. The first and second scores are compared to determine whether the difference between them exceeds a threshold. The edited video is released when the difference does not exceed the threshold.
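The release check can be sketched as below. The abstract does not define the "combination" of civilian and official scores, so this sketch assumes the difference of the two group means. The sentiment scores are taken as given inputs rather than computed from video. `combined_score` and `may_release` are hypothetical names.

```python
from statistics import mean


def combined_score(civilian_scores, official_scores):
    """Combine the two groups' sentiment scores into one number.

    Assumption: the combination is the difference of the group means.
    """
    return mean(civilian_scores) - mean(official_scores)


def may_release(original, edited, threshold):
    """Return True when editing has not shifted the combined score too far.

    original, edited -- (civilian_scores, official_scores) pairs
    threshold        -- largest acceptable absolute change (assumed metric)
    """
    first = combined_score(*original)   # score before editing
    second = combined_score(*edited)    # score after editing
    return abs(first - second) <= threshold


unchanged = may_release(([0.5, 0.5], [0.0, 0.0]), ([0.5, 0.5], [0.0, 0.0]), 0.1)
skewed = may_release(([0.5], [0.0]), ([-0.5], [0.5]), 0.25)
```

In the second call the edit moves the combined score from 0.5 to -1.0, which exceeds the threshold, so the edited video would be held back.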