Patent classifications
G06V20/46
AIR-CONDITIONING OPERATION TERMINAL, COMPUTER-READABLE MEDIUM AND AIR-CONDITIONING SYSTEM
An image acquisition unit (211) acquires a photographed image of an air-conditioning indoor unit that includes a plurality of air outlets. An image collation unit (212) collates the photographed image with a template image showing an air-conditioning indoor unit of the same type. An air outlet identification unit (213) determines an air outlet identifier for each air outlet in the photographed image, based on the collation result and on air outlet identification data that associates an air outlet identifier with each air outlet in the template image. An identification result display unit (214) displays the photographed image with the air outlet identifier superimposed on each air outlet.
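The collation-and-labelling pipeline can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes grayscale NumPy arrays, a brute-force sum-of-squared-differences template match, and hypothetical function names (`match_template`, `identify_outlets`).

```python
import numpy as np

def match_template(photo, template):
    """Slide the template over the photo and return the (row, col) offset
    with the smallest sum-of-squared-differences score (brute force)."""
    ph, pw = photo.shape
    th, tw = template.shape
    best, best_pos = float("inf"), (0, 0)
    for r in range(ph - th + 1):
        for c in range(pw - tw + 1):
            score = np.sum((photo[r:r + th, c:c + tw] - template) ** 2)
            if score < best:
                best, best_pos = score, (r, c)
    return best_pos

def identify_outlets(photo, template, outlet_data):
    """Translate each outlet position in the air outlet identification data
    by the match offset, yielding the position at which each identifier
    should be superimposed on the photographed image."""
    dr, dc = match_template(photo, template)
    return {oid: (r + dr, c + dc) for oid, (r, c) in outlet_data.items()}
```

A production system would use a robust matcher (e.g. normalized cross-correlation) rather than raw SSD, but the identifier-mapping step is the same: template coordinates plus match offset.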
METHOD OF PROCESSING IMAGE, ELECTRONIC DEVICE, AND STORAGE MEDIUM
A method of processing an image, an electronic device, and a storage medium, which relate to the field of artificial intelligence, in particular to computer vision and intelligent transportation technologies. The method includes: determining at least one key frame image in a scene image sequence captured by a target camera; determining a camera pose parameter associated with each key frame image in the at least one key frame image, according to a geographic feature associated with the key frame image; and projecting each scene image in the scene image sequence according to the camera pose parameter associated with the key frame image to obtain a target projection image, so as to generate a scene map based on the target projection image. The geographic feature associated with any key frame image indicates localization information of the target camera at the time instant of capturing that key frame image.
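The key-frame and pose-assignment steps can be sketched in simplified form. This is an assumption-laden toy: it stands in for the geographic feature with 2-D camera positions, uses a distance threshold for key-frame selection, and the function names (`select_key_frames`, `pose_for_frame`) are invented for illustration.

```python
import math

def select_key_frames(localizations, min_move=5.0):
    """Pick key frame indices: the first frame, then every frame whose
    camera position has moved at least `min_move` metres since the
    previous key frame. `localizations` is a list of (x, y) positions,
    one per frame, standing in for the geographic features."""
    keys = [0]
    for i, (x, y) in enumerate(localizations[1:], start=1):
        kx, ky = localizations[keys[-1]]
        if math.hypot(x - kx, y - ky) >= min_move:
            keys.append(i)
    return keys

def pose_for_frame(frame_idx, key_frames):
    """Each scene image inherits the pose of the latest key frame at or
    before it, so that every frame in the sequence can be projected
    with a consistent camera pose parameter."""
    return max(k for k in key_frames if k <= frame_idx)
```

The real method derives a full camera pose (rotation and translation) per key frame; the sketch only shows the bookkeeping of which pose each scene image uses.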
Temporal Approximation Of Trilinear Filtering
In one embodiment, a method includes receiving instructions to render a snapshot of a scene for a video, where the snapshot is to be displayed using a sequence of N frames, computing a mipmap-level determining factor for a texture appearing in the scene based on a scale of the texture on a pixel grid, selecting a mipmap level of the texture for each of the N frames based on the mipmap-level determining factor, where the mipmap levels selected for the N frames are non-uniform and temporally approximate the mipmap-level determining factor, rendering each of the N frames by sampling the mipmap level of the texture selected for that frame, and displaying the rendered N frames sequentially to represent the snapshot of the scene.
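The core idea — approximating a fractional mipmap level over time instead of blending two levels per frame — can be sketched with an error-diffusion scheme. This is a plausible reading of the abstract, not the claimed embodiment; the function name `temporal_mip_levels` is hypothetical.

```python
import math

def temporal_mip_levels(lod, n_frames):
    """Assign one integer mipmap level to each of N frames so that the
    levels' time average approximates the fractional mipmap-level
    determining factor `lod`. Error diffusion carries the rounding
    error forward, so coarser-level frames are spread evenly rather
    than clumped together."""
    lo = math.floor(lod)
    frac = lod - lo
    levels, acc = [], 0.0
    for _ in range(n_frames):
        acc += frac
        if acc >= 0.5:
            levels.append(lo + 1)   # sample the coarser mip this frame
            acc -= 1.0
        else:
            levels.append(lo)       # sample the finer mip this frame
    return levels
```

For example, a determining factor of 2.5 over 4 frames yields alternating levels 3, 2, 3, 2 — non-uniform per frame, but averaging to 2.5, which is what trilinear filtering would have blended spatially on every frame.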
Control apparatus, control system, control method, and storage medium
A control apparatus including an extraction unit configured to extract a subject from an image captured by an image capturing apparatus, an estimation unit configured to estimate a skeleton of the subject extracted by the extraction unit, and a control unit configured to control an angle of view of the image capturing apparatus based on a result of the estimation by the estimation unit.
Methods and apparatus to detect commercial advertisements associated with media presentations
Methods and apparatus to detect commercial advertisements associated with media presentations are disclosed. An example method involves receiving a video frame and detecting a change in box-formatting between the video frame and a subsequent video frame. A transition between the video frame and the subsequent video frame is indicated as a commercial advertisement transition based on the detected change in box-formatting.
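A minimal sketch of box-formatting change detection, assuming grayscale NumPy frames and taking "box-formatting" to mean letterbox bars (fully dark rows at the top and bottom of the frame) — one common reading, not necessarily the patent's exact definition. The function names are hypothetical.

```python
import numpy as np

def box_format(frame, black_thresh=16):
    """Return the heights (in rows) of the dark letterbox bars at the
    top and bottom of a grayscale frame."""
    dark = frame.max(axis=1) < black_thresh   # True for all-dark rows
    top = 0
    while top < len(dark) and dark[top]:
        top += 1
    bottom = 0
    while bottom < len(dark) - top and dark[len(dark) - 1 - bottom]:
        bottom += 1
    return top, bottom

def is_commercial_transition(frame_a, frame_b):
    """Flag a commercial advertisement transition when the box-formatting
    differs between a frame and the subsequent frame."""
    return box_format(frame_a) != box_format(frame_b)
```

In practice such a detector would also debounce over several frames to avoid flagging fades and scene cuts, but the per-pair comparison is the core signal the abstract describes.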
System, device, and method for generating and utilizing content-aware metadata
System, device, and method for generating and utilizing content-aware metadata, particularly for playback of video and other content items. A method includes: receiving a video file, and receiving content-aware metadata about visual objects that are depicted in said video file; and dynamically adjusting or modifying playback of that video file, on a video playback device, based on the content-aware metadata. The modifications include content-aware cropping, summarizing, watermarking, overlaying of other content elements, modifying playback speed, adding user-selectable indicators or areas around or near visual objects to cause a pre-defined action upon user selection, or other adjustments or modifications. Optionally, a modified and content-aware version of the video file is automatically generated or stored. Optionally, the content-aware metadata is stored internally or integrally within the video file, in its header or as a private channel; or is stored in an accompanying file.
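Of the listed modifications, content-aware cropping is the most mechanical, and can be sketched as follows. The sketch assumes the metadata supplies object bounding boxes as (x, y, w, h) tuples; the function name `content_aware_crop` is invented for illustration.

```python
def content_aware_crop(frame_w, frame_h, boxes, target_aspect):
    """Choose a crop window of the target aspect ratio (width/height)
    that keeps the annotated visual objects in view: centre the window
    on the union of the metadata bounding boxes, then clamp it to the
    frame. Returns (left, top, crop_w, crop_h)."""
    x0 = min(x for x, y, w, h in boxes)
    y0 = min(y for x, y, w, h in boxes)
    x1 = max(x + w for x, y, w, h in boxes)
    y1 = max(y + h for x, y, w, h in boxes)
    # Largest window of the target aspect that fits inside the frame.
    if frame_h * target_aspect <= frame_w:
        crop_h, crop_w = frame_h, frame_h * target_aspect
    else:
        crop_w, crop_h = frame_w, frame_w / target_aspect
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    left = min(max(cx - crop_w / 2, 0), frame_w - crop_w)
    top = min(max(cy - crop_h / 2, 0), frame_h - crop_h)
    return left, top, crop_w, crop_h
```

A typical use is reframing a 16:9 source into a 9:16 vertical window around the tracked object, recomputed per frame from the per-frame metadata.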
Scene change method and system combining instance segmentation and cycle generative adversarial networks
A scene change method and system combining instance segmentation and cycle generative adversarial networks are provided. The method includes: processing a video of a target scene and inputting it into an instance segmentation network to obtain segmented scene components, that is, mask cut images of the target scene; processing targets in the mask cut images by using cycle generative adversarial networks, according to the requirements of temporal attributes, to generate data in a style-migrated state; and compositing the style-migrated targets with unfixed spatial attributes into a style-migrated static scene along a specific spatial trajectory to achieve a scene change effect.
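The mask-cutting and trajectory-compositing steps (leaving the CycleGAN style migration itself aside) can be sketched on NumPy arrays. Everything here is illustrative: grayscale images, a boolean instance mask, and invented function names.

```python
import numpy as np

def mask_cut(scene, mask):
    """Instance-segmentation output: keep only the masked target pixels,
    zeroing the rest (the 'mask cut image')."""
    return np.where(mask, scene, 0)

def compose_along_trajectory(background, target, mask, trajectory):
    """Paste a (style-migrated) target into the static scene at each
    (row, col) point of a spatial trajectory, producing one composited
    frame per trajectory point."""
    frames = []
    th, tw = target.shape
    for r, c in trajectory:
        frame = background.copy()
        region = frame[r:r + th, c:c + tw]
        frame[r:r + th, c:c + tw] = np.where(mask, target, region)
        frames.append(frame)
    return frames
```

In the full method, `target` would be the CycleGAN-translated version of the mask cut image, and the trajectory would encode the temporal/spatial attributes the abstract mentions.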
Video event recognition method, electronic device and storage medium
Technical solutions for video event recognition relate to the fields of knowledge graphs, deep learning and computer vision. A video event graph is constructed, and each event in the video event graph includes: M argument roles of the event and respective arguments of the argument roles, with M being a positive integer greater than one. For a to-be-recognized video, respective arguments of the M argument roles of a to-be-recognized event corresponding to the video are acquired. According to the arguments acquired, an event is selected from the video event graph as a recognized event corresponding to the video.
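The final selection step — matching extracted arguments against the events in the video event graph — can be sketched with plain dictionaries. The scoring rule (count of matching argument-role values) and the function name are assumptions for illustration; the abstract does not specify the matching criterion.

```python
def recognize_event(event_graph, extracted_args):
    """Score each candidate event in the video event graph by how many
    of its argument-role arguments match the arguments extracted from
    the to-be-recognized video, and return the best-matching event.
    `event_graph` maps event name -> {argument role: argument}."""
    def score(args):
        return sum(1 for role, val in args.items()
                   if extracted_args.get(role) == val)
    return max(event_graph, key=lambda name: score(event_graph[name]))
```

A deployed system would use soft matching (embedding similarity per argument) rather than exact string equality, but the graph-lookup structure is the same.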
METHOD FOR EMBEDDING WATERMARK IN VIDEO DATA AND APPARATUS, METHOD FOR EXTRACTING WATERMARK IN VIDEO DATA AND APPARATUS, DEVICE, AND STORAGE MEDIUM
Disclosed in this application are a method and apparatus for embedding a watermark in video data, a method and apparatus for extracting a watermark from video data, a device, and a storage medium. The method for embedding the watermark includes: acquiring a target image frame in video data; performing time-frequency transformation on the target image frame to obtain target frequency domain data, the target frequency domain data comprising a matrix formed by frequency domain coefficients; changing the frequency domain coefficients in the target frequency domain data according to watermark data to obtain watermarked frequency domain data; performing inverse time-frequency transformation on the watermarked frequency domain data to obtain a watermarked image frame; and synthesizing watermarked video data according to the watermarked image frame.
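The transform/modify/inverse-transform loop can be sketched with a DCT (one common choice of time-frequency transform; the abstract does not name a specific transform, and the sign-based embedding rule below is a toy stand-in for the claimed coefficient-changing step).

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix C, so C @ x is the 1-D DCT of x
    and C.T inverts it."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (i + 0.5) * k / n)
    c[0] /= np.sqrt(2.0)
    return c

def embed_bit(block, bit, coef=(3, 2), strength=20.0):
    """Embed one watermark bit in a square image block: transform to
    the frequency domain, force the sign of one mid-frequency
    coefficient according to the bit, and transform back."""
    c = dct_matrix(block.shape[0])
    freq = c @ block @ c.T                     # forward 2-D DCT
    freq[coef] = strength if bit else -strength
    return c.T @ freq @ c                      # inverse 2-D DCT

def extract_bit(block, coef=(3, 2)):
    """Recover the bit from the sign of the same coefficient."""
    c = dct_matrix(block.shape[0])
    return int((c @ block @ c.T)[coef] > 0)
```

Real schemes spread each bit over many coefficients and frames for robustness to compression; this sketch only demonstrates that the coefficient change survives the inverse transform and can be read back.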
PEDESTRIAN SEARCH METHOD, SERVER, AND STORAGE MEDIUM
Provided are a pedestrian search method, a server, and a storage medium. The pedestrian search method is described as follows: pedestrian detection is performed on each segment of surveillance video to obtain multiple pedestrian tracks, where each pedestrian track of the multiple pedestrian tracks includes multiple video frame images of the same pedestrian; pedestrian tracks belonging to the same pedestrian are determined according to the video frame images in the multiple pedestrian tracks, and the pedestrian tracks of the same pedestrian are merged.
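The track-merging step can be sketched as a union-find over per-track appearance features. The choice of cosine similarity on mean embeddings, the threshold, and the function name are all illustrative assumptions; the abstract only says tracks of the same pedestrian are determined from their frame images and merged.

```python
import numpy as np

def merge_tracks(track_features, threshold=0.9):
    """Group pedestrian tracks: tracks whose appearance feature vectors
    (e.g. a mean re-ID embedding per track) have cosine similarity at
    or above the threshold are treated as the same pedestrian.
    Returns a list of groups of track indices (union-find merge)."""
    n = len(track_features)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    feats = [np.asarray(f, dtype=float) for f in track_features]
    feats = [f / np.linalg.norm(f) for f in feats]
    for i in range(n):
        for j in range(i + 1, n):
            if float(feats[i] @ feats[j]) >= threshold:
                parent[find(i)] = find(j)   # merge the two tracks

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

Transitivity falls out of the union-find: if track A matches B and B matches C, all three merge into one pedestrian even if A and C never directly exceed the threshold.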