G06T2207/20132

VISION-BASED SAFETY MONITORING AND/OR ACTIVITY ANALYSIS
20230072434 · 2023-03-09

Presented herein are embodiments of a vision-based object perception system for activity analysis, safety monitoring, or both. Embodiments of the perception subsystem detect multi-class objects (e.g., construction machines and humans) in real-time while estimating the poses and actions of the detected objects. Safety monitoring embodiments and object activity analysis embodiments may be based on the perception result. To evaluate the performance of embodiments, a dataset was collected that includes multiple classes of objects under different lighting conditions, with human annotations. Experimental results show that the proposed action recognition approach outperforms the state-of-the-art approaches on top-1 accuracy by about 5.18%.
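The abstract reports results on top-1 accuracy, the standard metric for this kind of action-recognition evaluation. As a minimal sketch (the function name and label strings are illustrative, not from the patent), top-1 accuracy is simply the fraction of samples whose highest-scoring predicted class matches the human annotation:

```python
def top1_accuracy(predicted_labels, true_labels):
    """Fraction of samples whose top-ranked predicted class matches the annotation."""
    assert len(predicted_labels) == len(true_labels) and true_labels
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)

# Hypothetical example: two annotated frames, one correct prediction.
acc = top1_accuracy(["digging", "idle"], ["digging", "moving"])
```

A reported gain of "about 5.18%" on this metric means the proposed approach classifies roughly five more samples per hundred correctly than the strongest baseline.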

Hand Presence Over Keyboard Inclusiveness
20230131667 · 2023-04-27

A method includes accessing an image of a physical environment of a user, the image depicting a physical input device and a physical hand of the user, determining that a contrast between the physical input device and the physical hand depicted in the image is lower than a predetermined threshold, modifying the image to increase the contrast, determining a pose of the physical input device, generating a three-dimensional model representing the physical hand of the user, generating an image mask by projecting the three-dimensional model onto an image plane, generating a cropped image depicting at least the physical hand of the user in the image, rendering, based on the perspective of the user and the pose of the physical input device, a virtual input device to represent the physical input device, and displaying the cropped image depicting at least the physical hand over the rendered virtual input device.
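The first two steps above gate the rest of the pipeline on image contrast. A hedged sketch, assuming mean-intensity difference as the contrast measure and a linear stretch as the modification (the abstract specifies neither):

```python
def needs_contrast_boost(hand_pixels, device_pixels, threshold=30.0):
    """True when hand/keyboard contrast falls below the predetermined threshold."""
    hand_mean = sum(hand_pixels) / len(hand_pixels)
    device_mean = sum(device_pixels) / len(device_pixels)
    return abs(hand_mean - device_mean) < threshold

def boost_contrast(pixels, gain=1.5, pivot=128):
    """Simple linear contrast stretch of 8-bit intensities around a pivot."""
    return [max(0, min(255, int(pivot + gain * (p - pivot)))) for p in pixels]
```

For example, a dark hand over a dark keyboard (means 100 vs. 110) would trip the threshold, and the stretch would push intensities 100 and 200 apart to 86 and 236 before mask generation.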

SYSTEMS AND METHODS FOR ASSESSING TRAILER UTILIZATION
20230128009 · 2023-04-27

Methods for assessing trailer utilization are disclosed herein. An example method includes capturing an image featuring a trailer, and segmenting the image into a plurality of regions. For each region, the example method may include cropping the image to exclude data that exceeds a respective forward distance threshold, and iterating over each data point to determine whether or not a matching point is included. Responsive to whether or not a matching point is included for a respective data point, the method may include adding the respective data point or the matching point to a respective region based on a position of the respective data point. Further, the method may include calculating a normalized height of the respective region based on whether or not a gap is present in the respective region; and creating a 3D model visualization of the trailer that depicts trailer utilization.
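The normalized-height step can be sketched as follows. This is an illustrative reading, assuming a region is a list of (x, y, z) points, "normalized height" is the top point over trailer height, and a gap is a large jump between consecutive point heights; the discount applied when a gap is found is an invention of the sketch, not the patent's rule:

```python
def normalized_region_height(points, trailer_height, gap_threshold=0.5):
    """Normalized fill height of a region, discounted when a vertical gap exists."""
    if not points:
        return 0.0
    heights = sorted(p[2] for p in points)
    top = heights[-1]
    # A gap: consecutive heights differ by more than the threshold (in meters).
    has_gap = any(b - a > gap_threshold for a, b in zip(heights, heights[1:]))
    fill = top / trailer_height
    return fill * 0.5 if has_gap else fill
```

A region filled solidly to half the trailer height scores 0.5; the same top height above an air gap scores lower, so the 3D visualization reflects hollow loading.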

Training method and device of neural network for medical image processing, and medical image processing method and device
11636664 · 2023-04-25

The present disclosure provides a training method and device of a neural network for medical image processing, a medical image processing method and device, and an electronic apparatus for medical image processing based on a neural network. The training method includes performing a pre-processing process on an original image to obtain a pre-processed image, performing a data-augmenting process on the pre-processed image to obtain an augmented image retaining a pathological feature, the augmented image including at least one image with a first resolution and at least one image with a second resolution higher than the first resolution, and training the neural network by selecting the image with the first resolution and a part-cropped image from the image with the second resolution as training samples.
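The sample-selection step mixes whole low-resolution images with crops taken from the high-resolution images. A minimal sketch under assumed conventions (images as 2-D lists of pixel rows, crops taken from the top-left corner; real implementations would crop around the pathological feature):

```python
def build_training_samples(low_res_images, high_res_images, crop_size):
    """Combine whole low-resolution images with part-crops of high-resolution ones."""
    samples = list(low_res_images)            # first-resolution images, used whole
    for img in high_res_images:               # img: 2-D list of pixel rows
        crop = [row[:crop_size] for row in img[:crop_size]]
        samples.append(crop)                  # part-cropped second-resolution image
    return samples
```

Mixing scales this way lets one network see both global context (low resolution) and fine pathological detail (high-resolution crops) without exceeding a fixed input size.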

Substrate defect inspection apparatus, substrate defect inspection method, and storage medium
11636585 · 2023-04-25

An apparatus for classifying a defect generated in a substrate, includes: a first storage part for storing a first image data for defect classification determination, which includes a defect region in which the defect is generated and a surrounding region of the defect region; a first estimation part for estimating a first type of defect by using a deep learning system, based on the first image data; a second storage part for storing a second image data for defect classification estimation, which is obtained by expressing the defect region and the surrounding region as binarized data; a second estimation part for estimating a second type of defect by using a rule-based system, based on an attribute of the defect region extracted from the second image data; and a comprehensive determination part for comprehensively determining a type of defect based on the first and second types of defects.
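The comprehensive determination part reconciles the deep-learning estimate with the rule-based one. A hedged sketch of one possible policy (agreement wins; otherwise defer to the rule-based system unless the deep-learning confidence is very high; the threshold and tie-break are assumptions, not the patent's logic):

```python
def determine_defect_type(dl_type, dl_confidence, rule_type):
    """Combine deep-learning and rule-based defect estimates into one type."""
    if dl_type == rule_type:
        return dl_type                    # both estimation parts agree
    if dl_confidence >= 0.9:
        return dl_type                    # high-confidence DL result overrides
    return rule_type                      # otherwise defer to the rule-based system
```

The appeal of the two-path design is that the rule-based system stays interpretable for known defect morphologies while the deep-learning path covers defects the rules miss.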

Undamaged/damaged determination

The present invention relates to the determination of damage to portions of a vehicle. More particularly, the present invention relates to determining whether each part of a vehicle should be classified as damaged or undamaged and, optionally, the severity of the damage to each part. Aspects and/or embodiments seek to provide a computer-implemented method for determining the damage state of each part of a damaged vehicle, indicating whether each part is damaged or undamaged and, optionally, the severity of the damage, using images of the vehicle and trained models to assess the damage shown in those images.
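The per-part output described above might take the following shape. This is an illustrative record only: the assumption that the trained model yields a damage probability per part, and the thresholds mapping it to severity bands, are inventions of the sketch:

```python
def classify_part(part_name, damage_probability):
    """Map a model's per-part damage score to a state with optional severity."""
    if damage_probability < 0.5:
        return {"part": part_name, "state": "undamaged"}
    severity = "severe" if damage_probability >= 0.9 else "minor"
    return {"part": part_name, "state": "damaged", "severity": severity}

# Hypothetical usage over a parts list:
report = [classify_part(p, s) for p, s in [("door", 0.3), ("hood", 0.95)]]
```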

Device and method for item recommendation based on visual elements

A method, a device, and a non-transitory computer readable medium for item recommendation based on visual elements. The method includes: determining, by one or more processors, visual elements from an item image of an item; generating, by the one or more processors, an element descriptor for the item based on at least a part of the visual elements; and calculating, by the one or more processors, a compatibility value between the element descriptor and one or more other element descriptors for one or more other items.
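The compatibility calculation between element descriptors can be sketched concretely. Assuming descriptors are numeric feature vectors and taking cosine similarity as the compatibility measure (the abstract does not fix either choice):

```python
import math

def compatibility(descriptor_a, descriptor_b):
    """Cosine similarity between two element descriptors (assumed numeric vectors)."""
    dot = sum(a * b for a, b in zip(descriptor_a, descriptor_b))
    norm = (math.sqrt(sum(a * a for a in descriptor_a))
            * math.sqrt(sum(b * b for b in descriptor_b)))
    return dot / norm if norm else 0.0
```

Items whose visual-element descriptors score high against a query item's descriptor would then be surfaced as recommendations.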

Detecting user interface elements in robotic process automation using convolutional neural networks

Graphical elements in a user interface (UI) may be detected in robotic process automation (RPA) using convolutional neural networks (CNNs). Such processes may be particularly well-suited for detecting graphical elements that are too small to be detected using conventional techniques. The accuracy of detecting graphical elements (e.g., control objects) may be enhanced by providing neural network-based processing that is robust to changes in various UI factors, such as different resolutions, different operating system (OS) scaling factors, different dots-per-inch (DPI) settings, and changes due to UI customization of applications and websites, for example.
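One robustness factor the passage names is OS scaling and DPI variation. A hedged sketch of a normalization step that could precede the CNN, mapping a screenshot back to a canonical resolution (the formula and parameter names are assumptions for illustration):

```python
def normalize_scaling(width, height, os_scale_factor, target_dpi=96, screen_dpi=96):
    """Rescale screenshot dimensions to a canonical resolution before detection."""
    factor = (target_dpi / screen_dpi) / os_scale_factor
    return round(width * factor), round(height * factor)
```

For instance, a 4K capture taken at 200% OS scaling maps back to 1920x1080, so small control objects appear at the pixel sizes the network was trained on.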

Electronic tracking device for camera and related system for controlling image output of the camera

A trackable camera beacon is provided that is mountable onto a camera, so that the camera can be more easily tracked and automatically controlled. The camera beacon obtains lens data from the camera's lens and position data corresponding to the camera beacon, and outputs a unified, synchronized data packet containing position, orientation, and lens data. This output can also be used to control the camera, such as the focus, iris, and zoom parameters of the lens.
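The unified, synchronized packet might be laid out as below. Field names and the shared-timestamp convention are assumptions of the sketch, not the patent's wire format:

```python
def make_beacon_packet(timestamp, position, orientation, focus, iris, zoom):
    """Bundle position, orientation, and lens data into one synchronized packet."""
    return {
        "timestamp": timestamp,                 # shared clock ties all fields together
        "position": position,                   # (x, y, z) of the beacon
        "orientation": orientation,             # (pan, tilt, roll)
        "lens": {"focus": focus, "iris": iris, "zoom": zoom},
    }
```

Carrying one timestamp for all fields is what makes the packet "synchronized": a downstream controller can trust that the lens state and the pose describe the same instant.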

Localizing relevant objects in multi-object images

Solutions for localizing relevant objects in multi-object images include receiving a multi-object image; detecting a plurality of detected objects within the multi-object image; generating a primary heatmap for the multi-object image, the primary heatmap having at least one region of interest; determining a relevant detected object corresponding to a region of interest in the primary heatmap; determining an irrelevant detected object not corresponding to a region of interest in the primary heatmap; and indicating the relevant detected object as an output result but not indicating the irrelevant detected object as an output result. Some examples identify a plurality of objects that are visually similar to the relevant object and display the visually similar objects to a user, for example as recommendations of alternative catalog items on an e-commerce website. Some examples are able to identify a plurality of relevant objects and display multiple sets of visually similar objects.
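The relevance filter at the core of the steps above can be sketched as an overlap test between detection boxes and heatmap regions of interest. Representing both as (x1, y1, x2, y2) rectangles is an assumed simplification of the heatmap correspondence:

```python
def overlaps(box, region):
    """Axis-aligned overlap test between a detection box and a heatmap ROI."""
    return not (box[2] <= region[0] or region[2] <= box[0]
                or box[3] <= region[1] or region[3] <= box[1])

def relevant_detections(boxes, rois):
    """Keep only detected objects that correspond to some region of interest."""
    return [b for b in boxes if any(overlaps(b, r) for r in rois)]
```

Detections outside every ROI are dropped from the output, which is exactly the relevant/irrelevant split the abstract describes.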