G06V10/811

METHOD AND DEVICE FOR MULTI-SENSOR DATA-BASED FUSION INFORMATION GENERATION FOR 360-DEGREE DETECTION AND RECOGNITION OF SURROUNDING OBJECT

Presented are a method and a device for multi-sensor data-based fusion information generation for 360-degree detection and recognition of a surrounding object. The present invention proposes a method for multi-sensor data-based fusion information generation for 360-degree detection and recognition of a surrounding object, the method comprising the steps of: acquiring a feature map from a multi-sensor signal by using a deep neural network; converting the acquired feature map into an integrated three-dimensional coordinate system; and generating a fusion feature map for performing recognition by using the converted integrated three-dimensional coordinate system.
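
The coordinate-conversion and fusion steps can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes scalar features in place of deep-network feature maps, translation-only sensor mounting offsets, and averaging as the fusion rule. The name `fuse_to_voxels` and all sample values are hypothetical.

```python
from collections import defaultdict

def fuse_to_voxels(sensors, voxel_size=1.0):
    """Scatter per-sensor features into one shared 3D voxel grid and average them.

    sensors: list of (offset, samples); offset is the assumed sensor position in
    the vehicle frame, samples is a list of ((x, y, z), feature) in the sensor frame.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for offset, samples in sensors:
        for point, feature in samples:
            # Sensor frame -> shared vehicle frame (translation only, by assumption).
            shared = tuple(p + o for p, o in zip(point, offset))
            key = tuple(int(c // voxel_size) for c in shared)
            sums[key][0] += feature
            sums[key][1] += 1
    # Fusion rule assumed here: mean of all features that land in the same voxel.
    return {key: total / count for key, (total, count) in sums.items()}


camera = ((1.0, 0.0, 0.0), [((2.2, 0.5, 0.0), 0.6)])
lidar = ((0.0, 0.0, 0.0), [((3.4, 0.1, 0.2), 0.8), ((-1.5, 0.0, 0.0), 0.3)])
fused = fuse_to_voxels([camera, lidar])
```

Here the camera sample and the first lidar sample fall into the same voxel once both are expressed in the shared frame, so their features are averaged; the second lidar sample occupies a voxel of its own.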

Hybrid reinforcement learning for autonomous driving

A method includes determining a current state of an environment of an autonomous agent, such as a vehicle. The method also includes determining, via a first neural network, a set of actions based on the current state. The method further includes determining whether further analysis of the set of actions is desired. The method selects an action from the set of actions using a model-based solution based on a reward and a risk of the action when further analysis is desired. The method also includes selecting the action from the set of actions according to a metric when further analysis is not desired. The method controls the autonomous agent to perform the selected action.
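
The two selection branches can be sketched as follows. This is an illustration under assumptions, not the claimed method: the candidate scores stand in for the first neural network's output, the "metric" is taken to be the highest score, and the model-based branch is reduced to reward minus weighted risk. All names and numbers are hypothetical.

```python
from typing import Callable, Sequence

def select_action(
    candidates: Sequence[str],
    scores: Sequence[float],
    needs_analysis: bool,
    reward_fn: Callable[[str], float],
    risk_fn: Callable[[str], float],
    risk_weight: float = 1.0,
) -> str:
    """Pick one action from the candidate set proposed by a policy network."""
    if needs_analysis:
        # Model-based branch: trade predicted reward against predicted risk.
        return max(candidates, key=lambda a: reward_fn(a) - risk_weight * risk_fn(a))
    # Fast branch: take the candidate with the best score (the assumed metric).
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best]


candidates = ["keep_lane", "change_left", "brake"]
scores = [0.5, 0.3, 0.2]
reward = {"keep_lane": 1.0, "change_left": 2.0, "brake": 0.5}
risk = {"keep_lane": 0.9, "change_left": 1.8, "brake": 0.1}

fast_choice = select_action(candidates, scores, False, reward.get, risk.get)
careful_choice = select_action(candidates, scores, True, reward.get, risk.get)
```

With these values the fast branch keeps the lane, while the model-based branch prefers braking because its reward-minus-risk margin is the largest.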

MULTI-MODAL TEST-TIME ADAPTATION

Systems and methods are provided for multi-modal test-time adaptation. The method includes inputting a digital image into a pre-trained Camera Intra-modal Pseudo-label Generator, and inputting a point cloud set into a pre-trained Lidar Intra-modal Pseudo-label Generator. The method further includes applying a fast two-dimensional (2D) model and a slow 2D model to the inputted digital image to apply pseudo-labels, and applying a fast three-dimensional (3D) model and a slow 3D model to the inputted point cloud set to apply pseudo-labels. The method further includes fusing pseudo-label predictions from the fast models and the slow models through an Inter-modal Pseudo-label Refinement module to obtain robust pseudo-labels, and measuring a prediction consistency for the pseudo-labels. The method further includes selecting confident pseudo-labels from the robust pseudo-labels and measured prediction consistencies to form a final cross-modal pseudo-label set as a self-training signal, and updating batch parameters utilizing the self-training signal.
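
A toy version of the fuse, check-consistency, and select steps is sketched below. The abstract does not specify the fusion or consistency rules, so this sketch assumes simple averaging of fast and slow class probabilities, treats agreement between the 2D and 3D predictions as the consistency measure, and uses a fixed confidence threshold. The function names and the threshold are hypothetical.

```python
def average(p, q):
    """Element-wise mean of two class-probability vectors."""
    return [(a + b) / 2 for a, b in zip(p, q)]

def argmax(p):
    return max(range(len(p)), key=p.__getitem__)

def pseudo_label(fast_2d, slow_2d, fast_3d, slow_3d, threshold=0.7):
    """Return a class index as pseudo-label, or None if the sample is rejected."""
    p2d = average(fast_2d, slow_2d)   # intra-modal fusion, camera branch
    p3d = average(fast_3d, slow_3d)   # intra-modal fusion, lidar branch
    # Assumed consistency check: both modalities must vote for the same class.
    if argmax(p2d) != argmax(p3d):
        return None
    joint = average(p2d, p3d)         # assumed inter-modal fusion rule
    label = argmax(joint)
    # Keep only confident pseudo-labels for self-training.
    return label if joint[label] >= threshold else None


agree = pseudo_label([0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.8, 0.2])
disagree = pseudo_label([0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.3, 0.7])
unsure = pseudo_label([0.55, 0.45], [0.55, 0.45], [0.55, 0.45], [0.55, 0.45])
```

Only the first sample would contribute to the self-training signal; the other two are dropped for inconsistency and low confidence respectively.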

Auto-labeling of driving logs using analysis-by-synthesis and unsupervised domain adaptation

Acquiring labeled data can be a significant bottleneck in the development of machine learning models that are accurate and efficient enough to enable safety-critical applications, such as automated driving. The process of labeling of driving logs can be automated. Unlabeled real-world driving logs, which include data captured by one or more vehicle sensors, can be automatically labeled to generate one or more labeled real-world driving logs. The automatic labeling can include analysis-by-synthesis on the unlabeled real-world driving logs to generate simulated driving logs, which can include reconstructed driving scenes or portions thereof. The automatic labeling can further include simulation-to-real automatic labeling on the simulated driving logs and the unlabeled real-world driving logs to generate one or more labeled real-world driving logs. The automatically labeled real-world driving logs can be stored in one or more data stores for subsequent training, validation, evaluation, and/or model management.

METHOD, APPARATUS, AND NON-TRANSITORY COMPUTER READABLE STORAGE MEDIUM FOR CONFIRMING A PERCEIVED POSITION OF A TRAFFIC LIGHT

A method, apparatus, and computer-readable medium for confirming a perceived position of a traffic light, by obtaining identifiers and results of a first perception of traffic lights associated with the identifiers, the results of the first perception including a first estimation of an ellipse encompassing each of the traffic lights, receiving results of a second perception of traffic lights associated with the identifiers, the results of the second perception including a second estimation of an ellipse encompassing each of the traffic lights, calculating, based on the first perception and the second perception, association parameters for each possible pair of estimated ellipses, selecting, based on the calculated association parameters for each possible pair of estimated ellipses, matching pairs of estimated ellipses, and fusing each matching pair of estimated ellipses.
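
The pairing and fusing steps might look like the sketch below. The abstract does not define the association parameters, so this example assumes the distance between ellipse centers as the only parameter, a greedy one-to-one selection under a distance gate, and parameter averaging as the fusion rule. `Ellipse`, `match_and_fuse`, and `max_cost` are hypothetical names.

```python
from dataclasses import dataclass
from math import hypot

@dataclass(frozen=True)
class Ellipse:
    cx: float  # center x
    cy: float  # center y
    a: float   # semi-major axis
    b: float   # semi-minor axis

def association_cost(e1: Ellipse, e2: Ellipse) -> float:
    # Assumed association parameter: Euclidean distance between centers.
    return hypot(e1.cx - e2.cx, e1.cy - e2.cy)

def match_and_fuse(first, second, max_cost=5.0):
    """Greedily pair ellipses from two perceptions and average each matched pair."""
    # Association cost for every possible pair, cheapest first.
    pairs = sorted(
        (association_cost(e1, e2), i, j)
        for i, e1 in enumerate(first)
        for j, e2 in enumerate(second)
    )
    used_first, used_second, fused = set(), set(), []
    for cost, i, j in pairs:
        if cost > max_cost:
            break  # every remaining pair is even farther apart
        if i in used_first or j in used_second:
            continue
        used_first.add(i)
        used_second.add(j)
        e1, e2 = first[i], second[j]
        fused.append(Ellipse((e1.cx + e2.cx) / 2, (e1.cy + e2.cy) / 2,
                             (e1.a + e2.a) / 2, (e1.b + e2.b) / 2))
    return fused


first = [Ellipse(10, 10, 2, 4), Ellipse(50, 12, 2, 4)]
second = [Ellipse(51, 13, 3, 5), Ellipse(11, 10, 2, 4), Ellipse(200, 200, 2, 4)]
fused = match_and_fuse(first, second)
```

The distant third ellipse in `second` has no partner within the gate and is left unmatched.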

IMAGE RECOGNITION DEVICE AND IMAGE RECOGNITION METHOD

An image recognition device is provided that can reduce power consumption, calculation amount, memory occupancy, and bus bandwidth occupancy while maintaining high recognition accuracy. The image recognition device includes: a plurality of signal processing modules that are connected to a plurality of image sensors having different functions, that perform signal processing based on image signals output from the connected image sensors and including an object within an imaging visual field, and whose power supply can each be controlled; a recognition processing unit that selectively performs recognition processing of the object using an output signal of one signal processing module, or recognition processing of the object with fusion of corresponding output signals of the plurality of signal processing modules; and a control unit that supplies power to at least one signal processing module, causes the recognition processing unit to perform recognition processing of the object without fusion, determines the reliability of the recognition processing result, and controls the power supply of another signal processing module based on the determination result.
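
The control flow described above can be sketched as follows, assuming a single scalar reliability score and a fixed threshold, neither of which is specified in the abstract. The callables stand in for the single-module and fused recognition paths; all names and values are hypothetical.

```python
from typing import Callable, Tuple

Result = Tuple[str, float]  # (label, reliability in [0, 1])

def recognize_with_power_control(
    run_primary: Callable[[], Result],
    run_fused: Callable[[], Result],
    threshold: float = 0.8,
) -> Tuple[Result, bool]:
    """Return the recognition result and whether the second module was powered."""
    result = run_primary()      # only one signal processing module is powered
    if result[1] >= threshold:
        return result, False    # reliable enough: the second module stays off
    # Low reliability: power the second module and recognize with fusion.
    return run_fused(), True


clear_result, clear_powered = recognize_with_power_control(
    lambda: ("pedestrian", 0.93), lambda: ("pedestrian", 0.99))
dark_result, dark_powered = recognize_with_power_control(
    lambda: ("unknown", 0.41), lambda: ("cyclist", 0.88))
```

The saving comes from the first case: when the single-sensor result is already reliable, the fused path is never invoked.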

LEARNING PROXY MIXTURES FOR FEW-SHOT CLASSIFICATION
20230111287 · 2023-04-13

A computer system and method are provided for training a machine learning system to perform a classification task by classifying input data into one of a plurality of classes. The system is configured to: receive per class training data from which per class representations can be derived, wherein each class is described by multiple representations; process the training data to form, for at least one class, a first proxy for a relatively global portion of an item of training data and multiple proxies for distinct relatively local portions of the item of training data, each proxy corresponding to a representation of the data belonging to that class.

Training and operating a machine learning system
11468687 · 2022-10-11

A method for training a machine learning system, in which image data are fed into a machine learning system with processing of at least a part of the image data by the machine learning system. The method includes synthetic generation of at least a part of at least one depth map that includes a plurality of depth information values. The at least one depth map is fed into the machine learning system with processing of at least a part of the depth information values of the at least one depth map. The machine learning system is then trained based on the processed image data and based on the processed depth information values of the at least one depth map, with adaptation of a parameter value of at least one parameter of the machine learning system, the adapted parameter value influencing an interpretation of input data by the machine learning system.

VEHICLE CONTROL APPARATUS
20230111488 · 2023-04-13

A vehicle control apparatus includes a camera, a detector that acquires position information of a target based on a reflected wave, and a microprocessor. The microprocessor is configured to estimate a position of the target based on position information acquired by the camera and the detector, control an actuator based on the estimated position, and detect a predetermined gradient state in which the road surface in front of a subject vehicle has an upward gradient of a predetermined degree or more with respect to the road surface at the current position of the subject vehicle. When the predetermined gradient state is detected, the microprocessor estimates the position of the target while lowering the reliability of the position information acquired by the camera for a target captured on the road surface in the predetermined gradient state.
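
One way to read the reliability adjustment is as a weighted average whose camera weight shrinks on a steep upward gradient. The sketch below assumes that reading, a one-dimensional range estimate, and hypothetical values for the gradient threshold and the camera penalty factor; none of these are given in the abstract.

```python
def is_steep_upward(front_grade_pct, current_grade_pct, threshold_pct=5.0):
    """Gradient state: the road ahead rises by at least threshold_pct relative to here."""
    return front_grade_pct - current_grade_pct >= threshold_pct

def fuse_range(camera_m, radar_m, camera_rel, radar_rel,
               steep_upward, camera_penalty=0.25):
    """Reliability-weighted average of the camera and reflected-wave range estimates."""
    if steep_upward:
        # Lower the camera's reliability while the gradient state is detected.
        camera_rel *= camera_penalty
    return (camera_rel * camera_m + radar_rel * radar_m) / (camera_rel + radar_rel)


flat = fuse_range(30.0, 20.0, 1.0, 1.0, is_steep_upward(1.0, 0.0))
slope = fuse_range(30.0, 20.0, 1.0, 1.0, is_steep_upward(8.0, 0.0))
```

On the flat road both sensors count equally and the estimate is 25.0 m; on the slope the estimate moves to 22.0 m, closer to the reflected-wave measurement.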

Object detection device

In an object detection device that is installed in a vehicle and detects an object outside the vehicle, a position calculator sets multiple candidate points representing candidate positions of the object, based on positions of feature points extracted from a first image captured at a first time. The multiple candidate points are set to be denser within a detection range, which is set based on a distance to the object detected by an ultrasonic sensor, than outside the detection range. The position calculator estimates positions of the multiple candidate points at a second time, which is after the first time, based on the positions of the multiple candidate points and movement information of the vehicle, and calculates the position of the object by comparing the estimated positions of the multiple candidate points at the second time with the positions of the feature points extracted from a second image captured at the second time.
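
The candidate-point idea can be illustrated with a simplified sketch. It assumes a pinhole camera, a single feature point, purely forward vehicle motion, and integer centimeter depths; the abstract does not specify any of these. Candidates are spaced finely around the ultrasonic distance and coarsely elsewhere, and the candidate whose predicted image position best matches the second image is selected. All names and numbers are hypothetical.

```python
def candidate_depths(max_depth_cm, sonar_cm, margin_cm, coarse_cm, fine_cm):
    """Candidate depths along the feature's viewing ray, denser near the sonar range."""
    coarse = set(range(coarse_cm, max_depth_cm + 1, coarse_cm))
    low = max(fine_cm, sonar_cm - margin_cm)
    high = min(max_depth_cm, sonar_cm + margin_cm)
    fine = set(range(low, high + 1, fine_cm))
    return sorted(coarse | fine)

def estimate_depth(u1, u2, forward_cm, depths):
    """Return the first-time depth whose predicted second-image position is closest to u2."""
    best_depth, best_error = None, float("inf")
    for z in depths:
        if z <= forward_cm:
            continue  # candidate would be at or behind the camera after the move
        # Pinhole model, forward motion only: u scales with z / (z - forward).
        predicted_u2 = u1 * z / (z - forward_cm)
        error = abs(predicted_u2 - u2)
        if error < best_error:
            best_depth, best_error = z, error
    return best_depth


depths = candidate_depths(500, sonar_cm=200, margin_cm=50, coarse_cm=100, fine_cm=25)
depth = estimate_depth(u1=100.0, u2=133.0, forward_cm=50, depths=depths)
```

Concentrating candidates near the ultrasonic reading keeps the search small while still resolving depth finely where the object is most likely to be.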