Patent classifications
G06V20/41
Tracking positions using a scalable position tracking system
A scalable tracking system processes video of a space to track the positions of objects within it. The tracking system determines local coordinates for the objects within frames of the video and assigns these coordinates to time windows based on when the frames were received. The tracking system then combines or clusters local coordinates that have been assigned to the same time window to determine a combined coordinate for an object during that time window.
Tracking positions using a scalable position tracking system
A scalable tracking system processes video of a space to track the positions of people within it. The tracking system determines local coordinates for the people within frames of the video and assigns these coordinates to time windows based on when the frames were received. The tracking system then combines or clusters local coordinates that have been assigned to the same time window to determine a combined coordinate for a person during that time window.
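The window-and-cluster step described in the two abstracts above can be sketched as follows. This is a minimal illustration, not the patented method: the detection format `(timestamp_ms, x, y)`, the window size, and the use of a simple centroid as the combined coordinate are all assumptions.

```python
from collections import defaultdict

def assign_to_windows(detections, window_ms=100):
    """Group (timestamp_ms, x, y) detections into fixed-length time windows,
    keyed by window index."""
    windows = defaultdict(list)
    for t, x, y in detections:
        windows[t // window_ms].append((x, y))
    return windows

def combine_window(coords):
    """Combine the local coordinates in one window into a single combined
    coordinate (here: a plain centroid average)."""
    xs = [c[0] for c in coords]
    ys = [c[1] for c in coords]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

# Two detections land in window 0, one in window 1.
detections = [(10, 1.0, 2.0), (40, 1.2, 2.2), (130, 5.0, 5.0)]
windows = assign_to_windows(detections, window_ms=100)
combined = {w: combine_window(c) for w, c in windows.items()}
```

A real system would additionally associate coordinates with object identities before clustering; the sketch only shows the time-window bucketing and combination.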
COMPUTER-EXECUTABLE METHOD RELATING TO WEEDS AND COMPUTER SYSTEM
A computer-executable method relating to weeds, and a computer system. The method comprises: receiving an image (S11); recognizing one or more plants in the image in order to obtain the classification and/or names of the plants, and determining whether the plants are weeds (S12); and in response to determining that at least one plant is a weed, outputting information indicating that the at least one plant is a weed (S13).
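The recognize-then-flag flow of steps S12 and S13 above can be sketched at a high level. The species list and function names here are hypothetical; the abstract does not specify how recognition or the weed lookup is implemented.

```python
# Hypothetical lookup of species considered weeds (not from the patent).
WEED_SPECIES = {"dandelion", "crabgrass", "bindweed"}

def check_weeds(recognized_plants):
    """Given plant names returned by an image recognizer (S12), return the
    subset determined to be weeds, so that a notice can be output (S13)."""
    return [p for p in recognized_plants if p.lower() in WEED_SPECIES]
```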
VIDEO PROCESSING METHOD, VIDEO SEARCHING METHOD, TERMINAL DEVICE, AND COMPUTER-READABLE STORAGE MEDIUM
A video processing method, comprising: editing, according to a scenario, a video to be edited to obtain a target video (S100); acquiring feature parameters of the target video (S200); generating a keyword for the target video according to the feature parameters (S300); and storing the keyword in association with the target video (S400).
ACTIVITY RECOGNITION IN DARK VIDEO BASED ON BOTH AUDIO AND VIDEO CONTENT
Videos captured in low light conditions can be processed to identify an activity being performed in the video. The processing may use both the video and audio streams to identify the activity in the low light video. The video portion is processed to generate a darkness-aware feature, which may be used to modulate the features generated from the audio and video streams. The audio features may be used to generate a video attention feature, and the video features may be used to generate an audio attention feature. The audio and video attention features may also be used in modulating the audio and video features. The modulated audio and video features may be used to predict an activity occurring in the video.
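One way to picture darkness-aware modulation is as a scalar gate that shifts weight toward the audio features as the frame gets darker. This is a toy sketch under assumed definitions (the luminance-to-darkness mapping, the gating formula, and the fusion rule are illustrative, not the patented model):

```python
import math

def darkness_score(mean_luma):
    """Map mean frame luminance (0-255) to a darkness weight in (0, 1):
    darker frames give a weight near 1, brighter frames near 0.
    The sigmoid midpoint (60) and slope (15) are arbitrary choices."""
    return 1.0 / (1.0 + math.exp((mean_luma - 60) / 15))

def fuse(audio_feat, video_feat, mean_luma):
    """Blend audio and video feature vectors, weighting audio more heavily
    when the scene is dark."""
    d = darkness_score(mean_luma)  # near 1.0 in the dark
    return [d * a + (1.0 - d) * v for a, v in zip(audio_feat, video_feat)]
```

In the patent's architecture the modulation is learned and applied per-feature alongside cross-modal attention; the sketch only conveys the gating idea.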
QUERY OPTIMIZATION FOR DEEP CONVOLUTIONAL NEURAL NETWORK INFERENCES
A method may include generating views that materialize tensors generated by a convolutional neural network operating on an image. The method may further include determining the outputs of the convolutional neural network operating on the image with a patch occluding various portions of the image. The outputs may be determined by generating queries on the views that perform, based at least on the changes associated with occluding different portions of the image, partial re-computations of the views. A heatmap may be generated based on the outputs of the convolutional neural network. The heatmap may indicate the degree to which the different portions of the image contribute to the output of the convolutional neural network operating on the image. Related systems and articles of manufacture, including computer program products, are also provided.
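The heatmap the abstract describes comes from occlusion analysis: slide a patch over the image, re-run the model, and record how much the output drops at each position. The sketch below shows only that baseline procedure on a toy model; the patent's contribution, materializing intermediate tensors as views and issuing queries that partially re-compute them, is not reproduced here.

```python
def occlusion_heatmap(image, model, patch=2, fill=0.0):
    """Slide a patch x patch occluder over a 2-D image (list of rows) and
    accumulate, per pixel, how much the model output drops when that pixel
    is covered. Larger values mean the pixel mattered more."""
    h, w = len(image), len(image[0])
    base = model(image)
    heat = [[0.0] * w for _ in range(h)]
    for i in range(h - patch + 1):
        for j in range(w - patch + 1):
            occluded = [row[:] for row in image]
            for di in range(patch):
                for dj in range(patch):
                    occluded[i + di][j + dj] = fill
            drop = base - model(occluded)
            for di in range(patch):
                for dj in range(patch):
                    heat[i + di][j + dj] += drop
    return heat

# Toy "model": the sum of all pixels, so occluding bright regions
# causes a larger output drop than occluding dim ones.
model = lambda img: sum(sum(row) for row in img)
image = [[1.0] * 4 for _ in range(4)]
image[1][1] = 10.0  # one bright, influential pixel
heat = occlusion_heatmap(image, model)
```

Note the quadratic cost of naive occlusion (one full forward pass per patch position) is exactly what the patent's partial re-computation over materialized views is meant to avoid.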
AUTOMATED PAUSING OF AUDIO AND/OR VIDEO DURING A CONFERENCING SESSION
Embodiments include an audio analyzer to analyze audio data received from a user computing system operating as a participant in a conference managed by a conferencing application and to detect one or more audio pause conditions; a video analyzer to analyze video data received from the user computing system and to detect one or more video pause conditions; and a conferencing manager to automatically pause distribution of the audio data to other participants of the conference when the one or more audio pause conditions are detected and to automatically pause distribution of the video data to the other participants when the one or more video pause conditions are detected.
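The detect-then-pause logic on the audio side can be sketched as below. The specific pause conditions (prolonged silence, a spoken privacy phrase) and the threshold value are hypothetical examples; the abstract does not enumerate which conditions an embodiment uses.

```python
def audio_pause_conditions(rms_level, phrases_heard, silence_thresh=0.01):
    """Return the audio pause conditions detected for one analyzed frame.
    Both conditions here are illustrative assumptions."""
    conditions = []
    if rms_level < silence_thresh:
        conditions.append("silence")
    if any(p in phrases_heard for p in ("hold on", "one second")):
        conditions.append("privacy_phrase")
    return conditions

def should_pause_audio(rms_level, phrases_heard):
    """The conferencing manager pauses audio distribution when one or more
    audio pause conditions are detected."""
    return bool(audio_pause_conditions(rms_level, phrases_heard))
```

A video analyzer would mirror this shape with visual conditions (e.g. the participant leaving the frame), feeding the same conferencing-manager decision.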
Homography error correction
An object tracking system that includes a sensor that is configured to capture frames of at least a portion of a global plane for a space. The system is configured to receive a first frame from the sensor, to identify a pixel location within the first frame, and to determine an estimated sensor location for the sensor by applying a homography to the pixel location. The homography includes coefficients that translate between pixel locations in a frame from the sensor and (x,y) coordinates in the global plane. The system is further configured to determine an actual sensor location for the sensor and to determine a location difference between the estimated sensor location and the actual sensor location. The system is further configured to compare the location difference to a difference threshold level and to recompute the homography in response to determining that the location difference exceeds the difference threshold level.
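The error check described above, apply the homography to a pixel location, compare the estimate to the actual sensor location, and recompute when the difference exceeds a threshold, can be sketched with a standard 3x3 projective homography. The matrix layout and Euclidean distance metric are conventional assumptions; the patent does not fix either.

```python
def apply_homography(H, px, py):
    """Map a pixel location (px, py) to (x, y) coordinates on the global
    plane using a 3x3 homography H (nested lists), with the usual
    perspective divide."""
    x = H[0][0] * px + H[0][1] * py + H[0][2]
    y = H[1][0] * px + H[1][1] * py + H[1][2]
    w = H[2][0] * px + H[2][1] * py + H[2][2]
    return (x / w, y / w)

def needs_recompute(H, px, py, actual_xy, threshold):
    """True when the estimated sensor location drifts from the actual
    sensor location by more than the difference threshold."""
    ex, ey = apply_homography(H, px, py)
    ax, ay = actual_xy
    diff = ((ex - ax) ** 2 + (ey - ay) ** 2) ** 0.5
    return diff > threshold
```

With an identity homography the estimate equals the pixel location, which makes the trigger easy to check by hand.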
System, device, and method for generating and utilizing content-aware metadata
System, device, and method for generating and utilizing content-aware metadata, particularly for playback of video and other content items. A method includes: receiving a video file, and receiving content-aware metadata about visual objects that are depicted in said video file; and dynamically adjusting or modifying playback of that video file, on a video playback device, based on the content-aware metadata. The modifications include content-aware cropping, summarizing, watermarking, overlaying of other content elements, modifying playback speed, adding user-selectable indicators or areas around or near visual objects to cause a pre-defined action upon user selection, or other adjustments or modifications. Optionally, a modified and content-aware version of the video file is automatically generated or stored. Optionally, the content-aware metadata is stored internally or integrally within the video file, in its header or as a private channel; or is stored in an accompanying file.
Scene change method and system combining instance segmentation and cycle generative adversarial networks
A scene change method and system combining instance segmentation and cycle generative adversarial networks are provided. The method includes: processing a video of a target scene and inputting it into an instance segmentation network to obtain segmented scene components, that is, mask cut images of the target scene; processing targets in the mask cut images by using cycle generative adversarial networks, according to the requirements of temporal attributes, to generate data in a style-migrated state; and compositing the style-migrated targets, whose spatial attributes are not fixed, into a style-migrated static scene along a specific spatial trajectory to achieve a scene change effect.