G06V30/2504

METHOD AND APPARATUS FOR EVALUATING IMAGE RELATIVE DEFINITION, DEVICE AND MEDIUM

Provided are a method and apparatus for evaluating image relative definition, a device and a medium, relating to technologies such as computer vision, deep learning and intelligent medical care. A specific implementation solution is: extracting a multi-scale feature of each image in an image set, where the multi-scale feature represents definition features of objects of different sizes in an image; and scoring the relative definition of each image in the image set according to the multi-scale feature by using a pre-trained relative definition scoring model, where the model is trained to learn the features in the multi-scale feature that relate to image definition.
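The two steps above (multi-scale feature extraction, then relative scoring) can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how features are extracted or what the scoring model looks like. Here the variance of a Laplacian response at several resolutions stands in for the multi-scale feature, and a fixed weight vector stands in for the pre-trained scoring model. All function names and the `weights` parameter are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values


def downsample(img: Image) -> Image:
    """Halve resolution by averaging non-overlapping 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def laplacian_variance(img: Image) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels (a common sharpness proxy)."""
    h, w = len(img), len(img[0])
    resp = [
        img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1] - 4 * img[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    if not resp:
        return 0.0
    mean = sum(resp) / len(resp)
    return sum((r - mean) ** 2 for r in resp) / len(resp)


def multi_scale_feature(img: Image, scales: int) -> List[float]:
    """One sharpness value per scale; coarser scales respond to larger structures."""
    feats, cur = [], img
    for _ in range(scales):
        feats.append(laplacian_variance(cur))
        cur = downsample(cur)
    return feats


def rank_by_relative_definition(images: List[Image], weights: List[float]) -> List[int]:
    """Return image indices ordered from sharpest to blurriest.

    `weights` is an assumed stand-in for the trained scoring model.
    """
    scores = [
        sum(w * f for w, f in zip(weights, multi_scale_feature(im, len(weights))))
        for im in images
    ]
    return sorted(range(len(images)), key=lambda i: scores[i], reverse=True)


flat = [[0.5] * 16 for _ in range(16)]
sharp = [[float((x + y) % 2) for x in range(16)] for y in range(16)]
order = rank_by_relative_definition([flat, sharp], [1.0, 1.0, 1.0])
```

Only the ordering of the scores matters here, which matches the "relative" framing of the abstract: the sketch ranks images within a set rather than assigning an absolute quality value.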

Content-adaptive non-uniform image downsampling using predictive auxiliary convolutional neural network

Techniques are described for content-adaptive downsampling of digital images and videos for computer vision operations, such as semantic segmentation. A computer vision system comprises a memory, one or more processors operably coupled to the memory and a downsampling module configured for execution by the one or more processors to perform, based on a non-uniform sampling model trained to predict content-aware sampling parameters, downsampling input image data to generate downsampled image data. A segmentation module is configured for execution by the one or more processors to perform segmentation on the downsampled image to produce a segmentation result, such as a feature map that assigns pixels of the downsampled image data to object classes. An upsampling module is configured for execution by the one or more processors to perform upsampling according to the segmentation result to produce upsampled image data.
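As a rough one-dimensional illustration of content-adaptive sampling, the sketch below places more sample positions where an importance signal is high and then maps per-sample labels back to full resolution. The importance values stand in for the output of the trained sampling model, which the abstract does not specify; inverse-CDF placement and nearest-neighbour upsampling are assumptions chosen for brevity, and all names are hypothetical.

```python
from typing import List, Sequence


def nonuniform_sample_positions(importance: Sequence[float], n_samples: int) -> List[int]:
    """Pick n_samples indices whose density follows `importance` (inverse-CDF placement)."""
    total = sum(importance)
    cdf, acc = [], 0.0
    for v in importance:
        acc += v / total
        cdf.append(acc)
    positions, j = [], 0
    for k in range(n_samples):
        target = (k + 0.5) / n_samples  # evenly spaced quantiles
        while j < len(cdf) - 1 and cdf[j] < target:
            j += 1
        positions.append(j)
    return positions


def upsample_labels(positions: Sequence[int], labels: Sequence[str], length: int) -> List[str]:
    """Give every original index the label of its nearest sampled position."""
    out = []
    for i in range(length):
        nearest = min(range(len(positions)), key=lambda k: abs(positions[k] - i))
        out.append(labels[nearest])
    return out


# The right half of the signal is assumed to be nine times more important than the left.
importance = [1.0] * 8 + [9.0] * 8
positions = nonuniform_sample_positions(importance, 8)
restored = upsample_labels([0, 3], ["a", "b"], 4)
```

With this input, seven of the eight samples land in the high-importance half, so a segmentation step run on the samples would see that region at higher effective resolution.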

MEANS FOR USING MICROSTRUCTURE OF MATERIALS SURFACE AS A UNIQUE IDENTIFIER

The present application concerns the visual identification of materials or documents for tracking or authentication purposes. It describes methods to automatically authenticate an object by comparing images of the object with reference images, the object images being characterized by the fact that the visual elements used for comparison are non-disturbing to the naked eye. In some of the described approaches, it provides the operator with visible features to locate the area to be imaged. It also proposes ways for real-time implementation, enabling user-friendly detection using mobile devices such as smartphones.

Vision based light detection and ranging system using multi-fields of view
11790671 · 2023-10-17

A vision based light detection and ranging (LIDAR) system captures images including a targeted object and identifies the targeted object using an object recognition model. To identify the targeted object, the vision based LIDAR system determines a type of object and pixel locations or a boundary box associated with the targeted object. Based on the identification, the vision based LIDAR system directs a tracking beam onto one or more spots on the targeted object and detects distances to the one or more spots. The vision based LIDAR system updates the identification of the targeted object based on the one or more determined distances.

Image normalization increasing robustness of machine learning applications for medical images

A computer program, a system and a method for normalizing medical images from a type of image acquisition device using a machine learning unit are disclosed. An embodiment of the method includes receiving a set of image data with images; decomposing each of the images of the set of images into components by incorporating at least information from different settings of the image acquisition device-specific image processing algorithms; and normalizing each of the components via a machine learning unit by processing at least information from the different settings of the image acquisition device-specific processing algorithms to provide a set of normalized images with a relatively decreased variability score.

Joint training of neural networks using multi-scale hard example mining

An example apparatus for mining multi-scale hard examples includes a convolutional neural network to receive a mini-batch of sample candidates and generate basic feature maps. The apparatus also includes a feature extractor and combiner to generate concatenated feature maps based on the basic feature maps and extract the concatenated feature maps for each of a plurality of received candidate boxes. The apparatus further includes a sample scorer and miner to score the candidate samples with multi-task loss scores and select candidate samples with multi-task loss scores exceeding a threshold score.
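The final scoring and selection step can be sketched as follows. The abstract does not define the multi-task loss; this sketch assumes the common detection form of a classification term (negative log-likelihood of the true class) plus a weighted smooth-L1 box regression term. The candidate dictionary keys, `loc_weight` and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def smooth_l1(diff: float) -> float:
    """Smooth-L1 penalty: quadratic near zero, linear for large errors."""
    d = abs(diff)
    return 0.5 * d * d if d < 1.0 else d - 0.5


def multi_task_loss(
    p_true: float,
    box_pred: Sequence[float],
    box_target: Sequence[float],
    loc_weight: float = 1.0,
) -> float:
    """Assumed loss: classification NLL plus weighted box regression error."""
    cls_loss = -math.log(max(p_true, 1e-12))
    loc_loss = sum(smooth_l1(p - t) for p, t in zip(box_pred, box_target))
    return cls_loss + loc_weight * loc_loss


def mine_hard_examples(candidates: List[Dict], threshold: float) -> List[int]:
    """Return indices of candidates whose loss exceeds the threshold."""
    return [
        i
        for i, c in enumerate(candidates)
        if multi_task_loss(c["p_true"], c["box_pred"], c["box_target"]) > threshold
    ]


candidates = [
    # Confident, well-localised candidate: low loss, treated as easy.
    {"p_true": 0.99, "box_pred": [2, 2, 8, 8], "box_target": [2, 2, 8, 8]},
    # Low confidence and a poorly fitting box: high loss, treated as hard.
    {"p_true": 0.10, "box_pred": [0, 0, 10, 10], "box_target": [2, 2, 8, 8]},
]
hard = mine_hard_examples(candidates, threshold=1.0)
```

Only the second candidate is kept, so later training steps would concentrate on the example the network currently handles worst.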

Information processing apparatus

An information processing apparatus is configured to cause a first recognizer to execute a first recognition process that takes sensor information as input, and a second recognizer to execute a second recognition process that takes the sensor information as input, the second recognizer being configured to operate under different capability conditions from the first recognizer; determine one of a transmission necessity and a transmission priority of the sensor information depending on a difference between a first recognition result of the first recognition process and a second recognition result of the second recognition process; and transmit the sensor information to a server apparatus according to the determined one of the transmission necessity and the transmission priority.
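One way to picture the transmission decision is shown below. The abstract leaves open how the difference between the two recognition results is measured; this sketch assumes each result is a set of detected labels and uses Jaccard distance as the difference. The threshold value and function names are hypothetical.

```python
from typing import Set


def recognition_difference(first: Set[str], second: Set[str]) -> float:
    """Jaccard distance between two label sets: 0.0 if identical, 1.0 if disjoint."""
    union = first | second
    if not union:
        return 0.0  # both recognizers found nothing, so they agree
    return 1.0 - len(first & second) / len(union)


def should_transmit(first: Set[str], second: Set[str], threshold: float = 0.5) -> bool:
    """Transmission necessity: send sensor data only when the recognizers disagree enough."""
    return recognition_difference(first, second) >= threshold


def transmission_priority(first: Set[str], second: Set[str]) -> float:
    """Transmission priority: larger disagreement means higher priority."""
    return recognition_difference(first, second)


agree = should_transmit({"car"}, {"car"})
disagree = should_transmit({"car", "person"}, {"car"})
```

Sensor frames on which a lightweight recognizer and a more capable one disagree are the ones most likely to be informative for a server, which is the intuition behind gating transmission on the difference.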

System and method for reducing resources costs in visual recognition of video based on static scene summary

Embodiments may provide techniques for identifying images that reduce resource utilization by reducing the sampling of video frames for visual recognition. For example, in an embodiment, a method of visual recognition processing may be implemented in a computer system comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, the method comprising: coarsely segmenting video frames of a video stream into a plurality of clusters based on scenes of the video stream; sampling a plurality of video frames from each cluster; determining a quality of each cluster; and re-clustering the video frames of the video stream to improve the quality of at least some of the clusters.
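The segment, sample and assess steps can be sketched as below. The abstract does not state how scenes are detected or how cluster quality is defined. This sketch assumes each frame is summarised by a small feature vector, starts a new cluster whenever consecutive frames differ by more than a threshold, and uses the spread among sampled frames as a quality proxy (lower means more homogeneous). All names and thresholds are hypothetical, and the re-clustering step is omitted.

```python
from typing import List, Sequence

Frame = Sequence[float]  # assumed per-frame feature vector, e.g. a colour histogram


def l1(a: Frame, b: Frame) -> float:
    """Sum of absolute differences between two frame descriptors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def coarse_segment(frames: List[Frame], threshold: float) -> List[List[int]]:
    """Group consecutive frame indices; cut where adjacent frames differ sharply."""
    if not frames:
        return []
    clusters = [[0]]
    for i in range(1, len(frames)):
        if l1(frames[i], frames[i - 1]) > threshold:
            clusters.append([i])
        else:
            clusters[-1].append(i)
    return clusters


def sample_cluster(cluster: List[int], k: int) -> List[int]:
    """Pick up to k evenly spaced frame indices from a cluster."""
    if k <= 0:
        return []
    if len(cluster) <= k:
        return list(cluster)
    step = len(cluster) / k
    return [cluster[int(j * step)] for j in range(k)]


def cluster_spread(frames: List[Frame], sampled: List[int]) -> float:
    """Largest pairwise distance among sampled frames; small spread suggests one static scene."""
    return max(
        (l1(frames[a], frames[b]) for a in sampled for b in sampled),
        default=0.0,
    )


frames = [[0.0], [0.1], [0.0], [5.0], [5.1], [5.0]]
clusters = coarse_segment(frames, threshold=1.0)
picked = sample_cluster([0, 1, 2, 3, 4, 5], 2)
spread = cluster_spread(frames, clusters[0])
```

A cluster with a large spread would be a candidate for re-clustering, since its sampled frames no longer represent a single static scene.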

SYSTEMS AND METHODS FOR PLATFORM AGNOSTIC WHOLE BODY IMAGE SEGMENTATION

Presented herein are systems and methods that provide for automated analysis of three-dimensional (3D) medical images of a subject in order to automatically identify specific 3D volumes within the 3D images that correspond to specific anatomical regions (e.g., organs and/or tissue). Notably, the image analysis approaches described herein are not limited to a single particular organ or portion of the body. Instead, they are robust and widely applicable, providing for consistent, efficient, and accurate detection of anatomical regions, including soft tissue organs, in the entire body. In certain embodiments, the accurate identification of one or more such volumes is used to automatically determine quantitative metrics that represent uptake of radiopharmaceuticals in particular organs and/or tissue regions. These uptake metrics can be used to assess disease state in a subject, determine a prognosis for a subject, and/or determine efficacy of a treatment modality.

Object detection method, and training method for a target object detection model

A target object detection model is provided. The target object detection model includes a YOLOv3-Tiny sub-model. Through the target object detection model, low-level information in the YOLOv3-Tiny sub-model can be fused with the high-level information therein. Because the low-level information is put to further use, the comprehensiveness of target detection is effectively improved, and the detection of small targets is improved.
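The abstract does not describe how the low-level and high-level information are combined. One widely used pattern, shown here purely as an assumed illustration, is to upsample the coarse high-level feature map to the resolution of the fine low-level map and concatenate the two along the channel axis. Feature maps are plain nested lists indexed `[channel][row][col]`; the function names are hypothetical and no YOLOv3-Tiny code is involved.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # [channel][row][col]


def upsample_nearest_2x(fmap: FeatureMap) -> FeatureMap:
    """Double height and width by repeating each value (nearest-neighbour upsampling)."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]
            rows.append(wide)
            rows.append(list(wide))
        out.append(rows)
    return out


def fuse(low_level: FeatureMap, high_level: FeatureMap) -> FeatureMap:
    """Concatenate low-level channels with upsampled high-level channels."""
    up = upsample_nearest_2x(high_level)
    same_height = len(up[0]) == len(low_level[0])
    same_width = len(up[0][0]) == len(low_level[0][0])
    if not (same_height and same_width):
        raise ValueError("high-level map must be exactly half the low-level resolution")
    return low_level + up


low = [[[1.0, 2.0], [3.0, 4.0]]]  # one fine-resolution channel, 2x2
high = [[[9.0]]]                  # one coarse channel, 1x1
fused = fuse(low, high)
```

The fused map keeps the fine spatial detail of the low-level channel, which is the kind of information that small-object detection tends to depend on.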