Patent classifications
G06V20/44
SYSTEMS AND METHODS FOR GENERATING METADATA FOR A LIVE MEDIA STREAM
Systems and methods are described to dynamically generate metadata for a live media stream. The system determines that a first user on a social media network has started a live media stream. In response, the system identifies a topic of the live media stream based on a frame of the live media stream and identifies another person featured in the frame of the live media stream based on social connections of the first user in the social media network. The system then generates a title for the live media stream based on the identified topic and the identified person, and transmits a notification to a second user that the first user is streaming live, where the notification includes the generated title.
AUTOMATIC DETERMINATION AND MONITORING OF VEHICLES ON A RACETRACK WITH CORRESPONDING IMAGERY DATA FOR BROADCAST
Methods and systems are described for automatically tracking and analyzing imagery data of at least one vehicle on a racetrack. A video event management system with a plurality of video cameras positioned around the racetrack determines the presence of the at least one vehicle and, based on a weighted event score corresponding to dynamics of the at least one vehicle and other objects, captures video imagery and stills and generates at least one subframe. Excess video imagery data and excess stills data are discarded based on metadata of linked subframes.
Weakly-Supervised Action Localization by Sparse Temporal Pooling Network
Systems and methods for a weakly supervised action localization model are provided. Example models according to example aspects of the present disclosure can localize and/or classify actions in untrimmed videos using machine-learned models, such as convolutional neural networks. The example models can predict temporal intervals of human actions given video-level class labels with no requirement of temporal localization information of actions. The example models can recognize actions and identify a sparse set of keyframes associated with actions through adaptive temporal pooling of video frames, wherein the loss function of the model is composed of a classification error and a sparsity of frame selection. Following action recognition with sparse keyframe attention, temporal proposals for action can be extracted using temporal class activation mappings, and final time intervals can be estimated corresponding to target actions.
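The loss described in this abstract (a classification error plus a sparsity term on frame selection) can be illustrated with a minimal sketch. It assumes a single binary class, a linear classifier, sigmoid attention weights, and an L1 penalty as the sparsity measure; none of these details are stated in the abstract, and the function name and `beta` weight are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def sparse_temporal_pooling_loss(frame_features, attention_logits,
                                 class_weights, label, beta=0.1):
    """Return (total loss, per-frame attention weights).

    frame_features:   one feature vector per video frame
    attention_logits: one raw attention score per frame
    class_weights:    weights of an assumed linear classifier (one class)
    label:            video-level label, 0 or 1 (no temporal annotation)
    beta:             assumed weight of the sparsity term
    """
    # Per-frame attention in (0, 1): how much each frame contributes.
    attention = [sigmoid(a) for a in attention_logits]

    # Adaptive temporal pooling: attention-weighted sum of frame features.
    dim = len(frame_features[0])
    pooled = [
        sum(a * f[d] for a, f in zip(attention, frame_features))
        for d in range(dim)
    ]

    # Video-level classification error (binary cross-entropy).
    score = sigmoid(sum(w * p for w, p in zip(class_weights, pooled)))
    classification = -(label * math.log(score)
                       + (1 - label) * math.log(1.0 - score))

    # Sparsity of frame selection: L1 norm of the (non-negative) attention.
    sparsity = sum(attention)

    return classification + beta * sparsity, attention


features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
loss, attention = sparse_temporal_pooling_loss(
    features, [2.0, -2.0, 0.0], [0.5, -0.5], label=1)
```

Raising `beta` pushes more attention weights toward zero, so fewer frames are selected as keyframes, at the cost of a larger classification error.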
IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND NON-TRANSITORY COMPUTER-READABLE MEDIUM
An image processing apparatus (10) includes an image processing unit and an execution unit. The image processing unit processes a video generated by a surveillance camera (20), and thereby determines whether a person included in the video performs a first gesture. The execution unit executes first processing on a necessary condition that the first gesture is detected. A plurality of types of the first gesture exist, and the first processing is determined for each of the types. The execution unit then executes the first processing associated with the type of the detected first gesture.
Apparatus and method for associating images from two image streams
An apparatus configured to, based on first imagery (301) of at least part of a body of a user (204), and contemporaneously captured second imagery (302) of a scene, the second imagery comprising at least a plurality of images taken over time, and based on expression-time information indicative of when a user expression of the user (204) occurs, provide a time window (303) temporally extending from a first time (t−1) prior to the time (t) of the expression-time information, to a second time (t−5) comprising a time equal to or prior to the first time (t−1), the time window (303) provided to identify at least one expression-causing image (305) from the plurality of images of the second imagery (302) that was captured in said time window, and provide for recordal of the at least one expression-causing image (305) with at least one expression-time image (306) comprising at least one image from the first imagery (301).
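The time-window selection in this abstract can be sketched as a simple timestamp filter. The sketch assumes the scene images arrive as `(timestamp, image)` pairs and that the window runs from 5 to 1 time units before the expression; the offsets, units, and function name are illustrative assumptions, since (t−1) and (t−5) are only labels in the abstract.

```python
def expression_causing_images(scene_images, expression_time,
                              first_offset=1.0, second_offset=5.0):
    """Return scene images captured inside the window before an expression.

    scene_images:    iterable of (timestamp, image) pairs from the scene camera
    expression_time: time t at which the user expression occurred
    first_offset:    assumed gap between t and the end of the window
    second_offset:   assumed gap between t and the start of the window
    """
    if second_offset < first_offset:
        raise ValueError("second_offset must be >= first_offset")

    window_start = expression_time - second_offset  # the earlier bound
    window_end = expression_time - first_offset     # the later bound

    return [image for timestamp, image in scene_images
            if window_start <= timestamp <= window_end]


# Hypothetical scene frames, timestamped in seconds.
frames = [(3.0, "frame_a"), (6.0, "frame_b"), (8.5, "frame_c"), (9.8, "frame_d")]
candidates = expression_causing_images(frames, expression_time=10.0)
```

With the expression at t = 10.0, the window covers timestamps 5.0 through 9.0, so only the frames at 6.0 and 8.5 are returned as candidates to record alongside the expression-time image.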
TIME-OF-FLIGHT BASED 3D SURVEILLANCE SYSTEM WITH FLEXIBLE SURVEILLANCE ZONE DEFINITION FUNCTIONALITY
A surveillance system for detecting and/or characterizing movement of a monitored infrastructure. An improved compromise between tight zone surveillance and number of false alarms is provided by an improved control of a 3D surveillance device. An input functionality is provided for a user to define a 3D subzone within a 3D environment model. A change functionality allows the user to generate a redefined subzone by dragging one of the corner points of the 3D subzone to a different position within a 3D visualization of the 3D environment model, whereby the shape of the 3D subzone is distorted. The input functionality and the change functionality are used to provide to the 3D surveillance device spatial parameters associated with the redefined subzone and the 3D surveillance device is caused to generate an action in case a movement within the redefined subzone is detected by means of the 3D measurement data.
AUDIO-VIDEO-HAPTICS RECORDING AND PLAYBACK
Innovative techniques to generate a haptic stream are proposed. The proposed techniques allow a haptic stream to be captured along with an audio/video stream. In so doing, a full experience (audio, video, and haptics) may be reproduced during playback.
METHODS AND SYSTEMS FOR USING VIDEO SURVEILLANCE TO DETECT FALLING LIQUIDS
Apparatus and methods for automatically detecting falling fluid include receiving one or more live video feeds from one or more cameras. The live video feeds are received concurrently. The live video feeds are analyzed to detect a fluid leak using a motion algorithm. False positives are reduced and sensitivity for the fluid leak detection is improved using information from one or more integrated systems, in response to detecting the fluid leak. A notification is generated to inform a user of the detected fluid leak, in response to detecting the fluid leak after reducing the false positives.
Homography error correction using marker locations
An object tracking system that includes a sensor that is configured to capture frames of at least a portion of a global plane for a space. The system is configured to receive a first frame from the sensor and to identify a first pixel location and a second pixel location within the first frame. The system is further configured to determine (x,y) coordinates by applying a homography to the first pixel location and the second pixel location. The system is further configured to determine an estimated distance between the (x,y) coordinates, to determine an actual distance, and to determine a distance difference between the estimated distance and the actual distance. The system is further configured to compare the distance difference to a difference threshold level and to recompute the homography in response to determining that the distance difference exceeds the difference threshold level.
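The check described in this abstract can be sketched as follows. The sketch assumes the homography is a 3x3 matrix given as nested lists and that the two pixel locations belong to markers whose real separation is known; the function names and the sample matrix are hypothetical.

```python
import math


def apply_homography(H, pixel):
    """Map a pixel (u, v) to (x, y) on the global plane using a 3x3 matrix H."""
    u, v = pixel
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return (x / w, y / w)  # divide by w to leave homogeneous coordinates


def homography_needs_recompute(H, pixel_a, pixel_b, actual_distance, threshold):
    """Return (recompute flag, estimated distance) for two marker pixels."""
    xa, ya = apply_homography(H, pixel_a)
    xb, yb = apply_homography(H, pixel_b)

    # Estimated distance between the projected (x, y) coordinates.
    estimated = math.hypot(xb - xa, yb - ya)

    # Distance difference compared against the difference threshold level.
    difference = abs(estimated - actual_distance)
    return difference > threshold, estimated


# Hypothetical homography: a pure scale of 0.01 plane units per pixel.
H = [[0.01, 0.0, 0.0],
     [0.0, 0.01, 0.0],
     [0.0, 0.0, 1.0]]

recompute, estimated = homography_needs_recompute(
    H, (100, 100), (400, 500), actual_distance=5.0, threshold=0.1)
```

Here the two pixels are 300 and 400 pixels apart along each axis, which projects to a separation of about 5.0 plane units, so the flag stays false. If the measured marker separation were 6.0 instead, the difference of about 1.0 would exceed the threshold and signal that the homography should be recomputed.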
Intelligent cooking process flow
Techniques related to promoting consistent cooking event outcomes are disclosed. Natural language processing (NLP) is used to promote the consistent cooking event outcomes. Data is acquired from a sensor that is monitoring a cooking preparation area. Based on the data, an event is identified. The event is modeled using NLP, which then predicts a subsequent event that will likely occur in the cooking preparation area. NLP is also used to select a recipe. A list of instructions included in the selected recipe is displayed in a user interface.