G06V20/635

Video processing for embedded information card localization and content extraction
11615621 · 2023-03-28

Metadata for one or more highlights of a video stream may be extracted from one or more card images embedded in the video stream. The highlights may be segments of the video stream, such as a broadcast of a sporting event, that are of particular interest. According to one method, video frames of the video stream are stored. One or more information cards embedded in a decoded video frame may be detected by analyzing one or more predetermined video frame regions. Image segmentation, edge detection, and/or closed contour identification may then be performed on identified video frame region(s). Further processing may include obtaining a minimum rectangular perimeter area enclosing all remaining segments, which may then be further processed to determine precise boundaries of information card(s). The card image(s) may be analyzed to obtain metadata, which may be stored in association with at least one of the video frames.
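
A minimal sketch of the described localization steps using OpenCV; the region coordinates, Canny thresholds, and minimum-area filter below are illustrative assumptions rather than values from the patent:

```python
import cv2
import numpy as np

def locate_card(frame: np.ndarray, region: tuple):
    """Search one predetermined frame region for an embedded information card."""
    x, y, w, h = region
    roi = frame[y:y + h, x:x + w]

    # Edge detection on the candidate region (thresholds are illustrative).
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)

    # Closed-contour identification; drop small segments as noise.
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    segments = [c for c in contours if cv2.contourArea(c) > 100.0]
    if not segments:
        return None

    # Minimum rectangular perimeter enclosing all remaining segments,
    # mapped back to full-frame coordinates for further boundary refinement.
    cx, cy, cw, ch = cv2.boundingRect(np.vstack(segments))
    return (x + cx, y + cy, cw, ch)
```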

Optimized reduced bitrate encoding for titles and credits in video content

Embodiments include systems, methods, and computer-readable media for optimized reduced bitrate encoding for text-based content in video frames. Example methods may include determining that a first segment of video content includes a content scene, determining that a second segment of the video content includes text, and determining a first encoder configuration to encode the first segment of video content, where the first encoder configuration includes a first encoding parameter setting. Example methods may include determining a second encoder configuration to encode the second segment of the video content, where the second encoder configuration includes a second encoding parameter setting, encoding the first segment using the first encoder configuration, and encoding the second segment using the second encoder configuration. The first segment may be encoded at a first bitrate that is greater than a second bitrate at which the second segment is encoded.
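
A hedged sketch of the two-configuration idea; the bitrates and the ffmpeg invocation are illustrative assumptions (the abstract does not prescribe a particular encoder or parameter values):

```python
import subprocess

SCENE_KBPS = 4000  # first (content-scene) segments: higher bitrate
TEXT_KBPS = 800    # second (titles/credits) segments: reduced bitrate

def encode_segment(src: str, dst: str, start: float, dur: float, is_text: bool) -> None:
    """Encode one segment with the configuration chosen for its content type."""
    kbps = TEXT_KBPS if is_text else SCENE_KBPS
    subprocess.run([
        "ffmpeg", "-ss", str(start), "-t", str(dur), "-i", src,
        "-c:v", "libx264", "-b:v", f"{kbps}k", dst,
    ], check=True)

encode_segment("movie.mp4", "scene.mp4", 0.0, 120.0, is_text=False)
encode_segment("movie.mp4", "credits.mp4", 120.0, 90.0, is_text=True)
```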

Systems and methods of presenting video overlays

Systems and methods are provided for relocating an overlay that overlaps information in content. The systems and methods may comprise receiving a content item, the content item comprising a video image, and determining a first screen position of an information box (e.g., a score box) in the video image. Determining may be performed with image analysis and/or a machine learning model. The system receives an overlay image (e.g., a channel logo) with a second screen position and determines whether the second screen position (e.g., for the logo) overlaps the first screen position (e.g., for the score). In response to determining that the second screen position (e.g., of the logo) overlaps the first screen position (e.g., the score), the system modifies the second screen position (e.g., for the logo). The system then generates for display the overlay image on the video in the modified screen position. The system may not relocate the overlay if the overlay is high priority.
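
An illustrative sketch of the overlap test and relocation logic; the Rect layout, the candidate anchor positions, and the priority rule are assumptions:

```python
from dataclasses import dataclass

@dataclass
class Rect:
    x: int
    y: int
    w: int
    h: int

def overlaps(a: Rect, b: Rect) -> bool:
    return a.x < b.x + b.w and b.x < a.x + a.w and a.y < b.y + b.h and b.y < a.y + a.h

def place_overlay(info_box: Rect, overlay: Rect, frame_w: int, frame_h: int,
                  high_priority: bool = False) -> Rect:
    """Move the overlay (e.g., a channel logo) off the information box."""
    if high_priority or not overlaps(info_box, overlay):
        return overlay  # high-priority overlays are never relocated
    # Try the four frame corners and keep the first non-overlapping position.
    for cx, cy in [(0, 0), (frame_w - overlay.w, 0),
                   (0, frame_h - overlay.h), (frame_w - overlay.w, frame_h - overlay.h)]:
        candidate = Rect(cx, cy, overlay.w, overlay.h)
        if not overlaps(info_box, candidate):
            return candidate
    return overlay
```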

Acquiring public opinion and training word viscosity model

A public opinion acquisition method and device, a word viscosity model training method and device, a server, and a medium are provided in the present disclosure, which relates to the technical field of artificial intelligence, specifically to image recognition and natural language processing, and which can be used in a cloud platform. A video public opinion acquisition method includes: receiving a public opinion acquisition request, the public opinion acquisition request including a public opinion keyword to be acquired; matching the public opinion keyword to be acquired with video data including a recognition result, wherein the recognition result is obtained by performing predefined content recognition on the video data, the predefined content recognition including text recognition and image recognition; and determining video data that matches the public opinion keyword to be acquired as result video data.
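
A simplified sketch of matching a public opinion keyword against per-video recognition results; the record layout and the substring matching are assumptions:

```python
def match_videos(keyword: str, videos: list) -> list:
    """Return videos whose text- or image-recognition results mention the keyword."""
    kw = keyword.lower()
    return [
        v for v in videos
        if any(kw in item.lower()
               for item in v.get("text_recognition", []) + v.get("image_recognition", []))
    ]

result_video_data = match_videos("product recall", [
    {"id": "v1",
     "text_recognition": ["Breaking: product recall announced"],  # e.g., OCR of captions
     "image_recognition": ["news studio", "ticker"]},
])
```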

Semantically-guided template generation from image content

Techniques for template generation from image content include extracting information associated with an input image. The information comprises: 1) layout information indicating positions of content corresponding to a content type of a plurality of content types within the input image; and 2) text attributes indicating at least a font of text included in the input image. A user-editable template having the characteristics of the input image is generated based on the layout information and the text attributes.
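
A sketch of assembling a user-editable template from the extracted layout information and text attributes; the template schema below is an assumption, not the format from the disclosure:

```python
def build_template(layout: list, text_attrs: dict) -> dict:
    """layout: [{"type": "heading", "bbox": (x, y, w, h)}, ...]"""
    return {
        "placeholders": [
            {"content_type": item["type"], "bbox": item["bbox"], "editable": True}
            for item in layout
        ],
        "text_style": {"font": text_attrs.get("font", "sans-serif")},
    }

template = build_template(
    [{"type": "heading", "bbox": (40, 32, 520, 80)},
     {"type": "body", "bbox": (40, 140, 520, 300)}],
    {"font": "Helvetica"},
)
```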

CONTENT RECOGNITION METHOD AND APPARATUS, COMPUTER DEVICE, AND STORAGE MEDIUM

A method for content recognition includes acquiring, from a content for recognition, a text piece and a media piece associated with the text piece, performing a first feature extraction on the text piece to obtain text features, performing a second feature extraction on the media piece associated with the text piece to obtain media features, and determining feature association measures between the media features and the text features. A feature association measure for a first feature in the media features and a second feature in the text features indicates an association degree between the first feature and the second feature. The method further includes adjusting the text features based on the feature association measures to obtain adjusted text features, and performing a recognition based on the adjusted text features to obtain a content recognition result of the content. Apparatus and non-transitory computer-readable storage medium counterpart embodiments are also contemplated.
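
A NumPy sketch of the association-measure idea: text features are adjusted using media features weighted by a softmax over pairwise similarities. The dimensions and the dot-product similarity are assumptions:

```python
import numpy as np

def adjust_text_features(text: np.ndarray, media: np.ndarray) -> np.ndarray:
    """text: (T, d) text-piece features; media: (M, d) media-piece features."""
    scores = text @ media.T                        # (T, M) feature association measures
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    context = weights @ media                      # media context per text feature
    return text + context                          # adjusted text features

rng = np.random.default_rng(0)
adjusted = adjust_text_features(rng.normal(size=(5, 16)), rng.normal(size=(3, 16)))
```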

OBJECT CHARACTERIZATION USING ONE OR MORE NEURAL NETWORKS

Apparatuses, systems, and techniques are presented to detect one or more objects in one or more images. In at least one embodiment, one or more neural networks can be used to detect one or more objects in one or more images based, at least in part, on textual descriptions of the one or more objects.
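
A toy sketch of text-conditioned detection: candidate-region embeddings are matched to a textual-description embedding by cosine similarity. In practice both embeddings would come from neural networks, which are out of scope here, and the threshold is an assumption:

```python
import numpy as np

def match_regions(region_embs: np.ndarray, text_emb: np.ndarray, thresh: float = 0.5):
    """region_embs: (N, d) candidate regions; text_emb: (d,) description embedding."""
    r = region_embs / np.linalg.norm(region_embs, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb)
    return np.nonzero(r @ t >= thresh)[0]  # indices of regions matching the description
```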

PERSONALIZED VR CONTROLS AND COMMUNICATIONS
20230128658 · 2023-04-27

Systems and methods for personalized controls and communications in virtual environments are provided. A virtual reality (VR) profile may be stored in memory for a user. Such a VR profile may specify a cue associated with custom instructions executable to modify one or more virtual display elements. An interactive session associated with a virtual environment in which the user is participating via a user device may be monitored based on the VR profile stored for the user. The cue specified by the VR profile may be detected as being present in the monitored session. The virtual display elements may be modified within a presentation of the virtual environment provided to the user device in accordance with the executable instructions associated with the cue specified by the VR profile of the user.
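
An illustrative sketch of a stored VR profile whose cue, when detected in a monitored session, returns display-element modifications; all field names are assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class VRProfile:
    user_id: str
    cue: str                 # e.g., a phrase watched for in session events
    instructions: dict = field(default_factory=dict)  # display-element changes

def monitor_session(profile: VRProfile, session_events: list):
    """Return the display modifications once the profile's cue is detected."""
    for event in session_events:
        if profile.cue in event.lower():
            return profile.instructions
    return None

profile = VRProfile("u42", "score", {"scoreboard": {"visible": True, "scale": 1.5}})
mods = monitor_session(profile, ["chat: what's the score?"])
```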

Systems and methods for augmented reality application for annotations and adding interfaces to control panels and screens

Example implementations described herein involve systems and methods for providing a platform to facilitate augmented reality (AR) overlays, which can involve stabilizing video received from a first device for display on a second device and, for input made to a portion of the stabilized video at the second device, generating an AR overlay on a display of the first device corresponding to that portion of the stabilized video.
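
A sketch of one plausible mapping step: a tap on the stabilized video at the second device is mapped back to the first device's live view through the inverse of the stabilization homography. The matrix here is a toy example, not part of the described system:

```python
import numpy as np

def to_live_coords(pt_stabilized: tuple, H_stab: np.ndarray) -> tuple:
    """H_stab maps live-frame coords to stabilized coords; invert it for input."""
    p = np.array([pt_stabilized[0], pt_stabilized[1], 1.0])
    q = np.linalg.inv(H_stab) @ p
    return (q[0] / q[2], q[1] / q[2])  # where to anchor the AR overlay in the live view

H = np.array([[1.0, 0.0, 12.0],
              [0.0, 1.0, -8.0],
              [0.0, 0.0, 1.0]])  # toy translation-only stabilization
live_pt = to_live_coords((100.0, 50.0), H)
```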

Robust audio identification with interference cancellation

Audio distortion compensation methods that improve the accuracy and efficiency of audio content identification are described; the methods are also applicable to speech recognition. Methods to detect interference from speakers and other sources, and distortion introduced into the audio by the environment and devices, are discussed, along with additional methods to detect distortion in the content after performing search and correlation. The causes of actual distortion at each client are measured, registered, and learned in order to generate rules for determining likely distortion and interference sources. The learned rules are applied at the client: detected likely distortions are compensated, or heavily distorted sections are ignored, at the audio level or at the signature and feature level, depending on the compute resources available. Further methods are described to subtract the likely distortions in the query, both at the audio level and, after processing, at the signature and feature level.
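
A minimal spectral-subtraction sketch of removing a likely interference estimate from a query before identification; the frame size, absence of overlap, and zero-flooring rule are assumptions, and the interference estimate is assumed to be time-aligned and at least as long as the query:

```python
import numpy as np

def subtract_interference(query: np.ndarray, interference: np.ndarray,
                          frame: int = 1024) -> np.ndarray:
    """Subtract the interference magnitude spectrum frame by frame."""
    out = np.copy(query)
    for start in range(0, len(query) - frame + 1, frame):
        Q = np.fft.rfft(query[start:start + frame])
        I = np.fft.rfft(interference[start:start + frame])
        mag = np.maximum(np.abs(Q) - np.abs(I), 0.0)  # floor negative magnitudes at zero
        # Keep the query's phase and resynthesize the cleaned frame.
        out[start:start + frame] = np.fft.irfft(mag * np.exp(1j * np.angle(Q)), n=frame)
    return out
```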