METHOD AND APPARATUS FOR DETECTION AND TRACKING, AND STORAGE MEDIUM

In the field of video processing, a detection and tracking method and apparatus, and a storage medium, are provided. The method includes: performing feature point analysis on a video frame sequence to obtain feature points for each video frame; performing target detection on an extracted frame through a first thread, based on the feature points, to obtain a target box in the extracted frame; performing target box tracking in a current frame through a second thread, based on the feature points and the target box in the extracted frame, to obtain a result target box in the current frame; and outputting the result target box. Because target detection and target tracking are divided between two threads, the tracking frame rate is unaffected by the detection algorithm, and the target box for each video frame can be output in real time, improving real-time performance and stability.
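
A minimal Python sketch of the two-thread split this abstract describes: a detection thread refreshes target boxes on sampled frames while the main loop tracks on every frame, so detector latency never stalls the tracking output. The detect_boxes and track_boxes functions are hypothetical stand-ins, not the patent's algorithms.

```python
import queue
import threading

def detect_boxes(frame):
    # Hypothetical stand-in for the feature-point based detector.
    return [(0, 0, 10, 10)]

def track_boxes(frame, boxes):
    # Hypothetical stand-in for the feature-point based tracker.
    return boxes

frames_for_detection = queue.Queue(maxsize=4)
latest_boxes = []                      # most recent detector output (shared)
lock = threading.Lock()

def detection_worker():
    """First thread: run the (slow) detector on sampled extracted frames."""
    while True:
        frame = frames_for_detection.get()
        if frame is None:              # sentinel: shut down
            break
        boxes = detect_boxes(frame)
        with lock:
            latest_boxes[:] = boxes    # publish fresh target boxes

def run(video_frames, detect_every=10):
    """Second thread (here the main loop): track boxes on every frame."""
    threading.Thread(target=detection_worker, daemon=True).start()
    results = []
    for i, frame in enumerate(video_frames):
        if i % detect_every == 0 and not frames_for_detection.full():
            frames_for_detection.put(frame)        # sample a frame to detect
        with lock:
            boxes = list(latest_boxes)
        results.append(track_boxes(frame, boxes))  # per-frame real-time output
    frames_for_detection.put(None)
    return results
```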

Automated physical training system

Systems, methods, and computer-readable media comprising a virtual exercise board, which is represented by images on the screen of a pad device; wearable devices configured to attach to each shoe of a user and to collect and transmit touch data to the pad device; cameras for tracking movement and for calibrating against the data collected by the wearable devices; and computer programs for collecting and processing user data and generating outputs. In embodiments, features include augmented reality; performance ratings; automated workouts/protocols; a real-time progress bar; multi-location database capabilities; and reports.
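
One concrete piece of the camera/wearable calibration is time alignment: matching each shoe touch event to the nearest camera frame. A small illustrative sketch, not taken from the patent; the timestamps and event format are assumptions.

```python
from bisect import bisect_left

def nearest_frame(frame_times, touch_time):
    """Return the index of the camera frame closest in time to a touch event.

    frame_times must be sorted ascending (seconds)."""
    i = bisect_left(frame_times, touch_time)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(frame_times)]
    return min(candidates, key=lambda j: abs(frame_times[j] - touch_time))

# Hypothetical data: ~30 fps camera timestamps and two shoe touch events.
frame_times = [0.000, 0.033, 0.066, 0.100]
touch_events = [(0.034, "left_shoe"), (0.098, "right_shoe")]
for t, shoe in touch_events:
    print(shoe, "-> frame", nearest_frame(frame_times, t))
```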

Systems and methods for generating panning images

Images may be captured by a moving image capture device. A reference image and a background image may be selected from the images. The reference image may include a depiction of an object that blocks the view of the background. The background image may include a depiction of the background that is blocked by the object in the reference image. An object layer may be generated by segmenting the depiction of the object from the reference image. A background layer may be generated by combining the depiction of the background in the background image with the reference image. The background layer may be blurred and combined with the object layer to generate a panning image.
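
The layering can be sketched with OpenCV and NumPy as below. The object mask is assumed to come from a segmentation step not shown here, and the horizontal blur kernel is just one simple way to imitate panning streaks.

```python
import cv2
import numpy as np

def panning_image(reference, background, object_mask, blur_ksize=31):
    """Composite a sharp object layer over a motion-blurred background layer.

    reference, background: HxWx3 uint8 images; object_mask: HxW, nonzero on
    the object in the reference image."""
    mask3 = (object_mask[..., None] > 0).astype(np.float32)
    # Background layer: fill the object region of the reference image with
    # the corresponding pixels from the background image (captured when the
    # object had moved away).
    background_layer = reference * (1 - mask3) + background * mask3
    # Horizontal motion-blur kernel to simulate camera panning.
    kernel = np.zeros((blur_ksize, blur_ksize), np.float32)
    kernel[blur_ksize // 2, :] = 1.0 / blur_ksize
    blurred = cv2.filter2D(background_layer.astype(np.float32), -1, kernel)
    # Object layer: paste the sharp object back on top.
    return (reference * mask3 + blurred * (1 - mask3)).astype(np.uint8)
```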

INFORMATION PROCESSING DEVICE, CONTROL METHOD, AND STORAGE MEDIUM
20230021345 · 2023-01-26

The information processing device 4 includes an acquisition unit 40A, a structural feature point extraction unit 41A, a common coordinate system transformation unit 42A, and a structural feature point integration unit 43A. The acquisition unit 40A is configured to acquire captured images Im generated by plural image acquisition devices 5 and positional relation information Ip indicative of a positional relation among the plural image acquisition devices 5. The structural feature point extraction unit 41A is configured to extract, from each of the captured images Im, intra-image coordinate values Pi of structural feature points of a structure observed through a display device, the display device displaying a virtual object superimposed on a view. The common coordinate system transformation unit 42A is configured to convert, based on the positional relation information Ip, the intra-image coordinate values Pi of each structural feature point extracted from each of the captured images Im into individual coordinate values Pc in a common coordinate system, which is a common three-dimensional coordinate system. The structural feature point integration unit 43A is configured to determine, for each structural feature point, a representative coordinate value Pcr in the common coordinate system based on the individual coordinate values Pc.
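
The transformation and integration steps reduce to mapping each camera's measurement into one shared frame and aggregating. A minimal NumPy sketch, under the assumption that each camera already yields a 3-D point in its own frame (e.g., from depth), with the camera pose (R, t) standing in for the positional relation information Ip; the coordinate-wise median is one reasonable choice for the representative value Pcr.

```python
import numpy as np

def to_common(point_cam, R, t):
    """Map a 3-D point from one camera's frame into the common frame."""
    return R @ np.asarray(point_cam, float) + np.asarray(t, float)

def representative(points_common):
    """Representative coordinate value: coordinate-wise median of candidates."""
    return np.median(np.stack(points_common), axis=0)

# Hypothetical example: two cameras observe the same structural feature point.
R1, t1 = np.eye(3), np.zeros(3)
R2, t2 = np.eye(3), np.array([0.5, 0.0, 0.0])   # camera 2 offset by 0.5 m
pc = [to_common([1.00, 2.00, 3.00], R1, t1),
      to_common([0.52, 2.01, 2.99], R2, t2)]
print(representative(pc))                        # ~[1.01, 2.005, 2.995]
```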

MULTI-VIEW NEURAL HUMAN RENDERING
20230027234 · 2023-01-26

An image-based method of modeling and rendering a three-dimensional model of an object is provided. The method comprises: obtaining a three-dimensional point cloud at each frame of a synchronized, multi-view video of an object, wherein the video comprises a plurality of frames; extracting a feature descriptor for each point in the point cloud for the plurality of frames without storing the feature descriptor for each frame; producing a two-dimensional feature map for a target camera; and using an anti-aliased convolutional neural network to decode the feature map into an image and a foreground mask.
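
The step of producing the two-dimensional feature map for the target camera can be pictured as splatting each point's descriptor to the pixel it projects to, with a z-buffer keeping the nearest point. The sketch below is a deliberate simplification (pinhole intrinsics K, no anti-aliasing, decoder network omitted) and not the patent's implementation.

```python
import numpy as np

def feature_map(points, feats, K, H, W):
    """points: (N, 3) point cloud in target-camera coordinates;
    feats: (N, C) per-point descriptors; K: 3x3 camera intrinsics."""
    fmap = np.zeros((H, W, feats.shape[1]), np.float32)
    zbuf = np.full((H, W), np.inf, np.float32)
    proj = (K @ points.T).T                 # pinhole projection to pixels
    for i in range(len(points)):
        z = points[i, 2]
        if z <= 0:                          # behind the camera
            continue
        u = int(proj[i, 0] / proj[i, 2])    # pixel column
        v = int(proj[i, 1] / proj[i, 2])    # pixel row
        if 0 <= u < W and 0 <= v < H and z < zbuf[v, u]:
            zbuf[v, u] = z                  # nearest point wins the pixel
            fmap[v, u] = feats[i]
    return fmap                             # input to the decoding CNN
```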

Systems, methods, and computer-program products for assessing athletic ability and generating performance data

Methods, systems, and computer-program products used for assessing athletic ability and generating performance data. In one embodiment, athlete performance data is generated through computer-vision analysis of video of an athlete performing, e.g., during practice or gameplay. The generated performance data for the athlete may include, for example, maximum speed, maximum acceleration, time to maximum speed, transition time (e.g., time to change direction), closing speed (e.g., time to close the distance to another athlete), average separation (e.g., between the athlete and another athlete), play-making ability, athleticism (e.g., a weighted computation and/or combination of multiple metrics), and/or other performance data. This performance data may be used to generate and/or update a profile associated with the athlete, which can be utilized for recruiting, scouting, comparing, and/or assessing athletes with greater efficiency and precision.
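
Once tracking yields a position time series, several of the listed metrics are simple kinematics. An illustrative NumPy computation follows; the patent's exact definitions may differ, and the 95% cutoff used for time-to-max-speed is an assumption.

```python
import numpy as np

def basic_metrics(positions, dt):
    """positions: (N, 2) field coordinates in meters; dt: seconds per sample."""
    vel = np.diff(positions, axis=0) / dt        # per-step velocity vectors
    speed = np.linalg.norm(vel, axis=1)          # scalar speed per step
    accel = np.diff(speed) / dt                  # scalar acceleration
    max_speed = speed.max()
    return {
        "max_speed": float(max_speed),
        "max_acceleration": float(accel.max()),
        # time to first reach (assumed) 95% of the maximum speed
        "time_to_max_speed": float(np.argmax(speed >= 0.95 * max_speed)) * dt,
    }

def average_separation(a, b):
    """Mean distance between two athletes' synchronized tracks ((N, 2) each)."""
    return float(np.linalg.norm(a - b, axis=1).mean())
```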

METHOD, COMPUTER PROGRAM, APPARATUS AND SYSTEM
20230230376 · 2023-07-20

A method includes obtaining position information relating to an object in a sporting event, determining, based on the position information, that a start event has occurred, wherein the start event indicates a start of play of the sporting event, and generating, according to a result of the determination, an instruction to start storing position information relating to the object.
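
A minimal sketch of that control flow: store nothing until a start event is detected from the object's motion, then store every subsequent position. The speed-threshold test is an assumed stand-in for the patent's start-event determination.

```python
import math

def record_from_start(position_stream, speed_threshold=0.5, dt=0.04):
    """position_stream yields (x, y) positions sampled every dt seconds."""
    stored, prev, started = [], None, False
    for pos in position_stream:
        if prev is not None and not started:
            speed = math.dist(pos, prev) / dt
            if speed > speed_threshold:    # start of play detected
                started = True
        if started:
            stored.append(pos)             # instruction: start storing
        prev = pos
    return stored
```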

Image processing apparatus, image processing method, and storage medium
11704820 · 2023-07-18

To improve user convenience in adjusting vibration correction for an image captured by an image capturing apparatus, a feature portion evaluation unit refers to feature portions selected by a feature portion selection unit and determines whether the feature portions necessary for vibration isolation have been acquired from a reference image. The feature portion evaluation unit notifies the user of information about the feature portions of the reference image when the acquired feature portions do not satisfy a predetermined condition, that is, when their reliability does not reach a predetermined level.
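
The evaluation step amounts to checking whether the reference image yields enough reliable features and warning the user if not. An illustrative version using OpenCV corner detection; the feature type, counts, and thresholds are assumptions, not the patent's.

```python
import cv2

def evaluate_reference(reference_gray, min_features=50):
    """reference_gray: single-channel uint8 reference image."""
    pts = cv2.goodFeaturesToTrack(reference_gray, maxCorners=200,
                                  qualityLevel=0.01, minDistance=8)
    n = 0 if pts is None else len(pts)
    if n < min_features:
        # Reliability below the predetermined level: notify the user.
        print(f"Only {n} usable feature portions found; "
              "choose a more textured reference image.")
        return False
    return True
```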

Image processing apparatus, image processing method, and storage medium
11704805 · 2023-07-18

An image processing apparatus extracts a foreground image corresponding to an object included in a processing image by using a background image corresponding to the processing image, and also generates a new background image from the processing image. The apparatus determines whether the background image used for the extraction is allowed to be updated and, based on the result of that determination, updates it using the generated background image.
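
The two decisions (subtract against the current background; update the background model only when it is safe) can be sketched compactly with OpenCV. The foreground-ratio test below is an assumed update criterion, not the patent's.

```python
import cv2
import numpy as np

class BackgroundExtractor:
    def __init__(self, first_frame, diff_thresh=25, update_rate=0.05,
                 max_fg_ratio=0.1):
        """first_frame and later frames: HxWx3 BGR uint8 images."""
        self.background = first_frame.astype(np.float32)
        self.diff_thresh = diff_thresh
        self.update_rate = update_rate
        self.max_fg_ratio = max_fg_ratio

    def process(self, frame):
        # Extract foreground by differencing against the current background.
        diff = cv2.absdiff(frame.astype(np.float32), self.background)
        fg_mask = (diff.max(axis=2) > self.diff_thresh).astype(np.uint8)
        # Update the background only when the scene is mostly background,
        # so moving objects are not absorbed into the model.
        if fg_mask.mean() < self.max_fg_ratio:
            cv2.accumulateWeighted(frame.astype(np.float32),
                                   self.background, self.update_rate)
        return fg_mask
```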

System for the automated, context-sensitive, and non-intrusive insertion of consumer-adaptive content in video

Described herein is a method and system for automated, context-sensitive, and non-intrusive insertion of consumer-adaptive content in video. It assesses ‘context’ in the video a consumer is viewing through multiple modalities and through metadata about the video. The method and system analyze relevance for a consumer based on factors such as the end-user's profile information, content history, social media activity, consumer interests, and professional or educational background, using patterns drawn from multiple sources. The system also implements local context through search techniques that localize sufficiently large, homogeneous regions in the image that do not obscure protagonists or objects in focus but are viable candidate regions for insertion of the intended content. This makes relevant, curated content available to a user effortlessly, without hampering the viewing experience of the main video.
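
The local-context search can be approximated by scanning for large, low-variance windows that could host an overlay. A simple illustrative pass follows; the window size and variance threshold are assumptions, and a real system would also avoid faces and other salient objects.

```python
import numpy as np

def candidate_regions(gray, win=64, stride=32, max_std=8.0):
    """gray: 2-D uint8 frame. Returns (y, x) corners of candidate windows."""
    H, W = gray.shape
    candidates = []
    for y in range(0, H - win + 1, stride):
        for x in range(0, W - win + 1, stride):
            patch = gray[y:y + win, x:x + win]
            if patch.std() < max_std:   # homogeneous enough to host an overlay
                candidates.append((y, x))
    return candidates
```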