G06V10/462

SYSTEMS AND METHODS FOR KEYPOINT DETECTION WITH CONVOLUTIONAL NEURAL NETWORKS
20230054821 · 2023-02-23

A keypoint detection system includes: a camera system including at least one camera; and a processor and memory, the processor and memory being configured to: receive an image captured by the camera system; compute a plurality of keypoints in the image using a convolutional neural network including: a first layer implementing a first convolutional kernel; a second layer implementing a second convolutional kernel; an output layer; and a plurality of connections between the first layer and the second layer and between the second layer and the output layer, each of the connections having a corresponding weight stored in the memory; and output the plurality of keypoints of the image computed by the convolutional neural network.
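The abstract above describes a two-layer convolutional network whose output layer yields keypoints. A minimal sketch of that pipeline in Python/NumPy, where the kernel weights `w1`/`w2`, the ReLU activations, and the thresholded local-maximum output stage are all illustrative assumptions standing in for the trained weights and output layer of an actual embodiment:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation of a single-channel image with a kernel."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def detect_keypoints(image, w1, w2, threshold=0.5):
    """Two stacked convolutional layers followed by a thresholded
    local-maximum output stage; returns (row, col) heatmap coordinates."""
    a1 = np.maximum(conv2d(image, w1), 0.0)   # first layer + ReLU
    a2 = np.maximum(conv2d(a1, w2), 0.0)      # second layer + ReLU
    heat = a2 / (a2.max() + 1e-8)             # normalised response map
    keypoints = []
    for i in range(1, heat.shape[0] - 1):
        for j in range(1, heat.shape[1] - 1):
            patch = heat[i - 1:i + 2, j - 1:j + 2]
            if heat[i, j] >= threshold and heat[i, j] == patch.max():
                keypoints.append((i, j))
    return keypoints
```

Each 3×3 valid convolution shrinks the map by one pixel per side, so a keypoint at heatmap position (2, 2) corresponds to image pixel (4, 4) after two layers.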

MEDICAL IMAGE PROCESSING APPARATUS AND MEDICAL IMAGE PROCESSING METHOD

A medical image processing apparatus according to an embodiment includes processing circuitry. The processing circuitry detects, for each of target points corresponding to feature points, a reference point having a spatial correlation with the target point in a medical image. The processing circuitry generates candidate points corresponding to the target point for each of the target points by using a detection model. The processing circuitry selects, for each of the target points, a candidate point based on a position feature indicating a spatial position relationship between the target point and the reference point. The processing circuitry selects, for each of a plurality of candidate point combinations, a candidate point combination based on a structural feature indicating a spatial structural relationship between the target points. The processing circuitry outputs feature points in the medical image based on the selected candidate point and candidate point combination.
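The per-target selection by position feature and the combination scoring by structural feature can be sketched as two small scoring functions. This is a toy illustration, not the patented detection model: `expected_offset` (the learned target-to-reference displacement) and `expected_pairwise` (expected inter-target distances) are assumed stand-ins for whatever features the apparatus actually learns.

```python
import numpy as np

def select_candidate(candidates, reference, expected_offset):
    """Position feature: pick the candidate whose displacement from the
    reference point best matches the expected spatial relationship."""
    candidates = np.asarray(candidates, dtype=float)
    reference = np.asarray(reference, dtype=float)
    errors = np.linalg.norm((candidates - reference) - expected_offset, axis=1)
    return int(np.argmin(errors))

def score_combination(points, expected_pairwise):
    """Structural feature: penalise deviation of pairwise distances between
    target points from their expected values (higher score is better)."""
    pts = np.asarray(points, dtype=float)
    err = 0.0
    for (i, j), d in expected_pairwise.items():
        err += abs(np.linalg.norm(pts[i] - pts[j]) - d)
    return -err
```

A real system would combine both scores over all candidate-point combinations before outputting the final feature points.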

LABELING DEVICE AND LEARNING DEVICE

A labeling device includes: an image-signal acquisition unit that acquires an image signal indicating an image captured by a camera; an image recognition unit that has learned by machine learning and performs image recognition on the captured image; and a learning-data-set generation unit that generates, by performing labeling on each object included in the captured image on the basis of a result of image recognition, a learning data set including image data corresponding to each object and label data corresponding to each object.
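The learning-data-set generation step, pairing a crop of each recognised object with its label, can be sketched as follows. The `detections` record layout (a bounding box plus a label per object) is an assumed stand-in for the image recognition unit's actual output format:

```python
def generate_learning_dataset(image, detections):
    """Build (image crop, label) pairs from recogniser output.
    `image` is a 2D list of pixels; each detection is a dict with
    'bbox' as (x0, y0, x1, y1) and a 'label' string."""
    dataset = []
    for det in detections:
        x0, y0, x1, y1 = det["bbox"]
        crop = [row[x0:x1] for row in image[y0:y1]]
        dataset.append({"image": crop, "label": det["label"]})
    return dataset
```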

MAP UPDATING METHOD AND APPARATUS, DEVICE, SERVER, AND STORAGE MEDIUM

Embodiments of the present disclosure provide a map updating method and apparatus, a device, a server, and a storage medium, which relate to the field of artificial intelligence and, in particular, to autonomous parking. In a specific implementation, an intelligent vehicle obtains driving data collected by its own vehicle sensor on a target road section, where the driving data includes at least first video data related to the environment of the target road section. The vehicle determines at least one image feature corresponding to each first image frame in the first video data, where the image feature includes at least an image local feature related to the environment of the target road section, and then updates the map data corresponding to the target road section according to the at least one image feature of each first image frame.
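The per-frame feature extraction and map update can be sketched as below. The variance-based "local feature" is purely a toy stand-in for a real local image descriptor, and the flat dictionary keyed by road-section ID is an assumed, simplified map store:

```python
import numpy as np

def local_features(frame, patch=3, top_k=2):
    """Toy local-feature extractor: return centres of the top-k patches
    with the highest intensity variance."""
    f = np.asarray(frame, dtype=float)
    scores = []
    for i in range(f.shape[0] - patch + 1):
        for j in range(f.shape[1] - patch + 1):
            centre = (i + patch // 2, j + patch // 2)
            scores.append((centre, f[i:i + patch, j:j + patch].var()))
    scores.sort(key=lambda s: -s[1])
    return [pos for pos, _ in scores[:top_k]]

def update_map(map_data, section_id, video_frames):
    """Replace the map entry for a target road section with the image
    features extracted from every frame of the captured video."""
    map_data[section_id] = [local_features(frame) for frame in video_frames]
    return map_data
```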

SCALABLE ROAD SIGN KNOWLEDGE GRAPH CONSTRUCTION WITH MACHINE LEARNING AND HUMAN IN THE LOOP

Systems and methods for constructing and managing a unique road sign knowledge graph across various countries and regions are disclosed. The system utilizes machine learning methods to assist humans when comparing a new sign template with a plurality of stored sign templates, to reduce or eliminate redundancy in the road sign knowledge graph. Such a machine learning method and system is also used in providing visual attributes of road signs such as sign shapes, colors, symbols, and the like. If the machine learning determines that the input road sign template is not found in the road sign knowledge graph, the input sign template can be added to the road sign knowledge graph. The road sign knowledge graph can be maintained to add sign templates that are not already in the knowledge graph but are found in the real world, by integrating human annotators' feedback during ground-truth generation for machine learning.
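The redundancy check before adding a template can be sketched as an embedding-similarity test. The cosine-similarity measure, the 0.9 threshold, and the list-of-nodes graph representation are all assumptions for illustration, not the disclosed system's actual matcher:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def add_if_novel(knowledge_graph, template, embedding, threshold=0.9):
    """Add a sign template only if no stored template's embedding is
    similar enough; borderline cases would be routed to a human
    annotator in the human-in-the-loop workflow."""
    for node in knowledge_graph:
        if cosine_similarity(node["embedding"], embedding) >= threshold:
            return False  # redundant: already represented in the graph
    knowledge_graph.append({"template": template, "embedding": embedding})
    return True
```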

Generating an audio signal from multiple microphones based on uncorrelated noise detection

An audio capture device selects between multiple microphones to generate an output audio signal depending on detected conditions. The audio capture device determines whether one or more microphones are wet or dry and selects one or more audio signals from the one or more microphones depending on their respective conditions. The audio capture device generates a mono audio output signal or a stereo output signal depending on the respective conditions of the one or more microphones.
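The wet/dry decision and the mono/stereo selection can be sketched as follows. Detecting a wet microphone via low zero-lag cross-correlation (uncorrelated noise) and the 0.5 threshold are illustrative assumptions about how the condition detection might work:

```python
import numpy as np

def is_uncorrelated(sig_a, sig_b, threshold=0.5):
    """Flag a microphone pair as carrying uncorrelated noise (e.g. one
    mic wet) when normalised correlation at zero lag is low."""
    a = np.asarray(sig_a, dtype=float)
    b = np.asarray(sig_b, dtype=float)
    a -= a.mean(); b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b) + 1e-12
    return bool(abs(float(a @ b)) / denom < threshold)

def mix_output(mic_signals, wet_flags):
    """Select dry microphones: stereo if two or more are dry, mono if
    exactly one, and fall back to the first microphone otherwise."""
    dry = [s for s, wet in zip(mic_signals, wet_flags) if not wet]
    if len(dry) >= 2:
        return {"mode": "stereo", "channels": dry[:2]}
    if len(dry) == 1:
        return {"mode": "mono", "channels": dry}
    return {"mode": "mono", "channels": [mic_signals[0]]}
```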

Image classification using a mask image and neural networks

Image classification using a generated mask image is performed by generating a mask image that extracts a target area from an input image, extracting an image feature map of the input image by inputting the input image in a first neural network including at least one image feature extracting layer, masking the image feature map by using the mask image, and classifying the input image by inputting the masked image feature map to a second neural network including at least one classification layer.
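The masking and classification stages can be sketched in NumPy. The global average pooling and the single linear layer standing in for the second neural network are simplifying assumptions; the element-wise masking of the feature map is the step the abstract describes:

```python
import numpy as np

def masked_classify(feature_map, mask, class_weights):
    """Mask a (C, H, W) image feature map element-wise with an (H, W)
    mask, pool it, and classify the pooled vector with a linear layer
    (a stand-in for the second network's classification layers)."""
    masked = feature_map * mask[None, :, :]                    # broadcast over channels
    pooled = masked.reshape(masked.shape[0], -1).mean(axis=1)  # global average pool
    logits = class_weights @ pooled
    return int(np.argmax(logits))
```

With a mask that keeps only the target area, features outside that area no longer influence the predicted class.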

Visual-inertial positional awareness for autonomous and non-autonomous tracking
11501527 · 2022-11-15

The described positional awareness techniques employ visual-inertial sensory data gathering and analysis hardware, described with reference to specific example implementations. They implement improvements in the use of sensors, techniques, and hardware design that can enable specific embodiments to provide positional awareness to machines with improved speed and accuracy.

Individual identifying device
11501517 · 2022-11-15

An imaging unit, an extraction unit, a feature amount pair generation unit, and an imaging parameter adjustment unit are included. The imaging unit acquires images obtained by imaging each of N (N ≥ 3) types of objects a plurality of times, setting the value of a specific imaging parameter, among a plurality of types of imaging parameters, to a certain candidate value and varying the values of the remaining imaging parameters. The extraction unit extracts a feature amount from each of the images. The feature amount pair generation unit generates, as a first feature amount pair for each of the N types of objects, a pair whose two feature amounts are extracted from images of objects of the same type, and generates, as a second feature amount pair for every combination of the N types of objects, a pair whose two feature amounts are extracted from images of objects of different types. The imaging parameter adjustment unit generates a first distribution of the collation scores of the first feature amount pairs and a second distribution of the collation scores of the second feature amount pairs, and, on the basis of the degree of separation between the first distribution and the second distribution, determines whether to adopt the candidate value.
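The adjustment unit's decision can be sketched as a separation test between the two collation-score distributions. Using a d-prime-style statistic and a fixed acceptance threshold are assumptions for illustration; the abstract only specifies that the decision is based on the degree of separation:

```python
import numpy as np

def separation_degree(same_type_scores, cross_type_scores):
    """d-prime style separation between the collation-score distribution
    of first (same-object-type) pairs and second (different-type) pairs."""
    g = np.asarray(same_type_scores, dtype=float)
    im = np.asarray(cross_type_scores, dtype=float)
    pooled_sd = np.sqrt((g.var() + im.var()) / 2 + 1e-12)
    return abs(g.mean() - im.mean()) / pooled_sd

def adopt_candidate(same_type_scores, cross_type_scores, min_separation=2.0):
    """Adopt the candidate imaging-parameter value only if the two score
    distributions are sufficiently separated."""
    return bool(separation_degree(same_type_scores, cross_type_scores) >= min_separation)
```

Well-separated distributions mean the parameter value makes same-type pairs reliably distinguishable from cross-type pairs.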

Vision system for object detection, recognition, classification and tracking and the method thereof

Aspects of the present disclosure are directed to, for example, a method for object detection, recognition, classification, and tracking using a distributed networked architecture. In some embodiments, the distributed networked architecture may include one or more sensor units, in which image acquisition and initial feature extraction are performed, and a gateway processor for further data processing. Some aspects of the present disclosure are also directed to a vision system for object detection, and to algorithms implemented in the vision system for executing the method acts for object detection, recognition, classification, and/or tracking.
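The split between edge-side feature extraction and gateway-side processing can be sketched with two small classes. The class names, the callable extractor/classifier, and the batch interface are all illustrative assumptions about how such a distributed architecture might be organised:

```python
class SensorUnit:
    """Acquires an image and performs the initial feature extraction
    at the edge, before transmitting features to the gateway."""
    def __init__(self, extractor):
        self.extractor = extractor

    def capture_and_extract(self, image):
        return self.extractor(image)

class GatewayProcessor:
    """Collects feature vectors from multiple sensor units and runs the
    heavier detection/classification stage centrally."""
    def __init__(self, classifier):
        self.classifier = classifier

    def process(self, feature_batches):
        return [self.classifier(f) for f in feature_batches]
```

Sending compact features instead of raw frames is the usual motivation for this split: it reduces network bandwidth while keeping the expensive processing on shared hardware.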