G06T2207/20084

Generative adversarial neural network assisted video reconstruction

A latent code defined in an input space is processed by the mapping neural network to produce an intermediate latent code defined in an intermediate latent space. The intermediate latent code may be used as an appearance vector that is processed by the synthesis neural network to generate an image. The appearance vector is a compressed encoding of data, such as video frames including a person's face, audio, and other data. Captured images may be converted into appearance vectors at a local device and transmitted to a remote device using much less bandwidth than transmitting the captured images. A synthesis neural network at the remote device reconstructs the images for display.
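
As a rough illustration of the bandwidth claim, the sketch below compares the size of one uncompressed frame with the size of one appearance vector. The frame resolution, the vector length, and the 32-bit float encoding are assumptions chosen for illustration; none of them comes from the abstract, and a real system would more likely be compared against a compressed video stream.

```python
def raw_frame_bytes(width: int, height: int, channels: int = 3) -> int:
    """Bytes in one uncompressed frame with 8 bits per channel."""
    return width * height * channels


def appearance_vector_bytes(length: int, bytes_per_value: int = 4) -> int:
    """Bytes in one appearance vector (assumed: 32-bit floats)."""
    return length * bytes_per_value


def bandwidth_ratio(width: int, height: int, vector_length: int) -> float:
    """How many times larger the raw frame is than the appearance vector."""
    return raw_frame_bytes(width, height) / appearance_vector_bytes(vector_length)


if __name__ == "__main__":
    # Hypothetical sizes: a 1024x1024 RGB frame and a 512-value vector.
    print(bandwidth_ratio(1024, 1024, 512))
```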

Method and system for detecting and picking up objects

A method includes steps of: capturing an image of a container; recognizing at least one object in the container based on the image; determining at least one first coordinate set corresponding to the at least one object; determining at least one second coordinate set that corresponds to target one(s) of the at least one first coordinate set and that relates to a fixed picking device of a robotic arm; adjusting position(s) of unfixed picking device(s) of the robotic arm if necessary; and controlling the robotic arm to pick up one(s) of the at least one object that correspond(s) to the at least one second coordinate set with the fixed picking device and/or at least one unfixed picking device.

Video visual relation detection methods and systems

Methods and systems for detecting visual relations in a video are disclosed. A method comprises: decomposing a video sequence into a plurality of segments; for each segment, detecting objects in frames of the segment; tracking the detected objects over the segment to form a set of object tracklets for the segment; for the detected objects, extracting object features; for pairs of object tracklets of the set of object tracklets, extracting relativity features indicative of a relation between the objects corresponding to the pair of object tracklets; forming relation feature vectors for pairs of object tracklets using the object features of objects corresponding to respective pairs of object tracklets and the relativity features of the respective pairs of object tracklets; and generating a set of segment relation prediction results from the relation feature vectors; generating a set of visual relation instances for the video sequence by merging the segment prediction results from different segments; and generating a set of visual relation detection results from the set of visual relation instances.
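
The first and last stages of this pipeline (segment decomposition and merging of per-segment predictions) can be sketched as follows. This is a minimal sketch under stated assumptions, not the method claimed: the overlapping fixed-length segments, the rule that a relation continues only when the same triplet appears in consecutive segments, and the averaged score are all illustrative choices, and the names `decompose`, `merge_segments`, and `RelationInstance` are hypothetical. Object detection, tracking, and feature extraction are omitted.

```python
from dataclasses import dataclass


@dataclass
class RelationInstance:
    triplet: tuple  # (subject, predicate, object)
    start: int      # first frame, inclusive
    end: int        # last frame, exclusive
    scores: list

    @property
    def score(self) -> float:
        # Assumed scoring rule: mean of the per-segment scores.
        return sum(self.scores) / len(self.scores)


def decompose(num_frames: int, seg_len: int, stride: int) -> list:
    """Split frames [0, num_frames) into (start, end) segments, end exclusive."""
    segments = []
    start = 0
    while start < num_frames:
        end = min(start + seg_len, num_frames)
        segments.append((start, end))
        if end == num_frames:
            break
        start += stride
    return segments


def merge_segments(segments: list, predictions: list) -> list:
    """Merge per-segment predictions into video-level relation instances.

    predictions[i] is a list of (triplet, score) pairs for segments[i].
    Assumes track identities are already consistent across segments.
    """
    active = {}    # triplet -> instance still being extended
    finished = []
    for (start, end), preds in zip(segments, predictions):
        seen = set()
        for triplet, score in preds:
            inst = active.get(triplet)
            if inst is None:
                active[triplet] = RelationInstance(triplet, start, end, [score])
            else:
                inst.end = end
                inst.scores.append(score)
            seen.add(triplet)
        # Close every instance that was not predicted in this segment.
        for triplet in list(active):
            if triplet not in seen:
                finished.append(active.pop(triplet))
    finished.extend(active.values())
    # Rank the merged instances to form the detection results.
    return sorted(finished, key=lambda inst: inst.score, reverse=True)
```

For example, with 70 frames, a segment length of 30, and a stride of 15, `decompose` yields four overlapping segments; a triplet predicted in the first two segments but not the third is merged into a single instance spanning frames 0 to 45.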

Image positioning system and image positioning method based on upsampling

An image positioning system based on upsampling and a method thereof are provided. The image positioning method based on upsampling fetches a region image covering a target from a wide region image, determines a rough position of the target, executes an upsampling process on the region image based on a neural network data model to obtain a super-resolution region image, maps the rough position onto the super-resolution region image, and analyzes the super-resolution region image to determine a precise position of the target. The disclosed example can significantly improve the efficiency of positioning and effectively reduce the required cost of hardware.
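
The coordinate mapping between the wide region image and the super-resolution region image can be sketched as below. This is only an illustration of the mapping step: the function names, the integer upsampling factor `scale`, and the top-left pixel-origin convention (ignoring half-pixel offsets) are assumptions, and the neural network upsampling and the analysis that yields the precise position are not shown.

```python
def to_super_res(rough_xy, region_origin, scale):
    """Map a rough position (wide-image pixels) onto the upsampled region.

    region_origin is the top-left corner of the region image within the
    wide region image; scale is the assumed upsampling factor.
    """
    x, y = rough_xy
    ox, oy = region_origin
    return ((x - ox) * scale, (y - oy) * scale)


def to_wide_image(precise_xy, region_origin, scale):
    """Map a precise position found in the super-resolution region back
    to (sub-pixel) coordinates of the wide region image."""
    px, py = precise_xy
    ox, oy = region_origin
    return (ox + px / scale, oy + py / scale)


if __name__ == "__main__":
    # Hypothetical values: region cropped at (100, 80), upsampled 4x.
    print(to_super_res((120, 85), (100, 80), 4))
    print(to_wide_image((83, 22), (100, 80), 4))
```

Because one pixel of the region image covers `scale` pixels of the super-resolution image, a position refined there maps back to the wide image with a resolution of 1/`scale` pixel.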

Apparatus and method for displaying contents on an augmented reality device

A system for displaying contents on an augmented reality (AR) device comprises a capturing module configured to capture a field of view of a user, a recording module configured to record the captured field of view, a user input controller configured to track a vision of the user towards one or more objects, and a server. The server comprises a determination module, an identifier, and an analyser. The determination module is configured to determine at least one object of interest. The identifier is configured to identify a frame containing a disappearance of the determined object of interest. The analyser is configured to analyse the identified frame based on at least one disappearance of the object of interest, and generate analysed data. A display module is configured to display a content of the object of interest on the AR device.

System and method for visually tracking persons and imputing demographic and sentiment data

A visual tracking system for tracking and identifying persons within a monitored location comprises a plurality of cameras and a visual processing unit. Each camera produces a sequence of video frames depicting one or more of the persons. The visual processing unit is adapted to maintain a coherent track identity for each person across the plurality of cameras using a combination of motion data and visual featurization data, and to further determine demographic data and sentiment data using the visual featurization data. The visual tracking system further has a recommendation module adapted to identify a customer need for each person using the sentiment data of the person in addition to context data, and to generate an action recommendation for addressing the customer need. The visual tracking system is operably connected to a customer-oriented device configured to perform a customer-oriented action in accordance with the action recommendation.

System and method for future forecasting using action priors

A system and method for future forecasting using action priors that include receiving image data associated with a surrounding environment of an ego vehicle and dynamic data associated with dynamic operation of the ego vehicle. The system and method also include analyzing the image data and detecting actions associated with agents located within the surrounding environment of the ego vehicle and analyzing the dynamic data and processing an ego motion history of the ego vehicle. The system and method further include predicting future trajectories of the agents located within the surrounding environment of the ego vehicle and a future ego motion of the ego vehicle within the surrounding environment of the ego vehicle.

Image denoising model training method, imaging denoising method, devices and storage medium

A training method for an image denoising model can include collecting multiple sample image groups through a shooting device, each sample image group including multiple frames of sample images with a same photographic sensitivity and sample images in different sample image groups having different photographic sensitivities. The method can further include acquiring a photographic sensitivity of each sample image group, determining a noise characterization image corresponding to each sample image group based on the photographic sensitivity, determining a training input image group and a target image associated with each sample image group, each training input image group including all or part of sample images in a corresponding sample image group and a corresponding noise characterization image, constructing multiple training pairs each including a training input image group and a target image, and training the image denoising model based on the multiple training pairs until the image denoising model converges.
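
The construction of one training pair from one sample image group can be sketched as follows. The abstract does not say how the noise characterization image or the target image is computed, so this sketch assumes a constant image filled with the normalized sensitivity and a pixel-wise mean of the group's frames; the function names and the `max_iso` normalization constant are hypothetical.

```python
def noise_characterization_image(iso, height, width, max_iso=6400.0):
    """Assumed form: a constant image whose value encodes the sensitivity."""
    value = iso / max_iso
    return [[value] * width for _ in range(height)]


def mean_image(frames):
    """Pixel-wise mean of equally sized single-channel frames
    (assumed stand-in for however the target image is determined)."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [sum(frame[r][c] for frame in frames) / n for c in range(width)]
        for r in range(height)
    ]


def build_training_pair(frames, iso, num_inputs):
    """Return (training input image group, target image) for one group.

    The input group holds the first num_inputs frames ("all or part" of
    the group) plus the noise characterization image.
    """
    height, width = len(frames[0]), len(frames[0][0])
    inputs = list(frames[:num_inputs])
    inputs.append(noise_characterization_image(iso, height, width))
    return inputs, mean_image(frames)
```

One such pair would be built per sample image group, so that groups captured at different sensitivities contribute different noise characterization images to training.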

Plant group identification

A farming machine moves through a field and includes an image sensor that captures an image of a plant in the field. A control system accesses the captured image and applies the image to a machine learned plant identification model. The plant identification model identifies pixels representing the plant and categorizes the plant into a plant group (e.g., plant species). The identified pixels are labeled as the plant group and a location of the pixels is determined. The control system actuates a treatment mechanism based on the identified plant group and location. Additionally, the images from the image sensor and the plant identification model may be used to generate a plant identification map. The plant identification map is a map of the field that indicates the locations of the plant groups identified by the plant identification model.
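
The step from labeled pixels to a location per plant group can be sketched as below. This is a minimal sketch assuming the model's output is a 2D grid of integer group labels with 0 as background, and that a group's location is the centroid of its pixels in image coordinates; both are illustrative assumptions, and the function name `group_locations` is hypothetical.

```python
def group_locations(label_mask, background=0):
    """Return {group label: (row centroid, column centroid)}.

    label_mask is a 2D list in which each entry is the plant group
    assigned to that pixel, or `background` for non-plant pixels.
    """
    totals = {}  # label -> (pixel count, row sum, column sum)
    for r, row in enumerate(label_mask):
        for c, label in enumerate(row):
            if label == background:
                continue
            count, row_sum, col_sum = totals.get(label, (0, 0, 0))
            totals[label] = (count + 1, row_sum + r, col_sum + c)
    return {
        label: (row_sum / count, col_sum / count)
        for label, (count, row_sum, col_sum) in totals.items()
    }


if __name__ == "__main__":
    # Hypothetical 3x3 mask with two plant groups (1 and 2).
    mask = [
        [0, 1, 1],
        [0, 1, 1],
        [2, 0, 0],
    ]
    print(group_locations(mask))
```

Accumulating these per-image locations, once converted to field coordinates, is one way a plant identification map could be assembled.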

Predictive use of quantitative imaging

The present disclosure provides systems and methods for predicting a disease state of a subject using ultrasound imaging and ancillary information to the ultrasound imaging. At least two quantitative measurements of a subject, including at least one measurement taken using ultrasound imaging, as part of quantified information can be identified. One of the quantitative measurements can be compared to a first predetermined standard, included as part of ancillary information to the quantified information, in order to identify a first initial value. Further, another of the quantitative measurements can be compared to a second predetermined standard, included as part of the ancillary information, in order to identify a second initial value. Subsequently, the quantified information can be correlated with the ancillary information using the first initial value and the second initial value to determine a final value that is predictive of a disease state of the subject.
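
The flow from measurements to a final value can be sketched as follows. The abstract does not specify how a measurement is compared to its standard or how the initial values are combined, so both rules here (relative deviation and a weighted sum) are assumptions made purely for illustration, and the function names are hypothetical.

```python
def initial_value(measurement: float, standard: float) -> float:
    """Assumed comparison rule: relative deviation from the standard."""
    return (measurement - standard) / standard


def final_value(initial_values, weights) -> float:
    """Assumed combination rule: weighted sum of the initial values."""
    if len(initial_values) != len(weights):
        raise ValueError("one weight is required per initial value")
    return sum(w * v for w, v in zip(weights, initial_values))


if __name__ == "__main__":
    # Hypothetical measurements and standards, not clinical data.
    first = initial_value(12.0, 10.0)
    second = initial_value(3.0, 4.0)
    print(final_value([first, second], [0.5, 0.5]))
```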