Patent classifications
G06T2207/30261
Robot for detecting human head and control method thereof
Disclosed are a robot and a control method thereof. The robot includes a camera configured to acquire a depth image of a specific area, and a processor configured to create an elevation map for the specific area based on the depth image and to determine whether a head of a person is in the specific area based on the elevation map. When determining whether the head of the person is in the specific area, the processor is further configured to partition pixels of the elevation map into floor pixels corresponding to a floor area and non-floor pixels corresponding to a non-floor area using elevation values of the elevation map, set paths through the elevation map with a subset of the non-floor pixels as starting points for the paths, and determine whether the head of the person is in the specific area based on whether the paths converge.
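The partition-and-converge idea in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the paths are routed, so steepest ascent is an assumption made here for illustration only; the names `partition_floor`, `climb`, `paths_converge`, and the thresholds `floor_max` and `min_fraction` are likewise hypothetical.

```python
def partition_floor(elev, floor_max=0.1):
    """Split elevation-map pixels into floor and non-floor lists of (row, col)."""
    floor, non_floor = [], []
    for r, row in enumerate(elev):
        for c, height in enumerate(row):
            (floor if height <= floor_max else non_floor).append((r, c))
    return floor, non_floor


def climb(elev, start):
    """Follow the highest 8-neighbor from start until no neighbor is higher.

    Steepest ascent is an assumption; the abstract only says paths are set.
    """
    rows, cols = len(elev), len(elev[0])
    r, c = start
    while True:
        best = (r, c)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and elev[nr][nc] > elev[best[0]][best[1]]):
                    best = (nr, nc)
        if best == (r, c):  # local maximum reached, path ends here
            return best
        r, c = best


def paths_converge(elev, starts, min_fraction=0.8):
    """Return (converged, end_pixel) for paths started at the given pixels."""
    ends = [climb(elev, s) for s in starts]
    most_common = max(set(ends), key=ends.count)
    return ends.count(most_common) / len(ends) >= min_fraction, most_common
```

In this sketch, a single raised blob whose paths all end at the same pixel counts as a head candidate, while a flat map yields no non-floor starting points at all and should be handled by the caller before `paths_converge` is invoked.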
Use of a Reference Image to Detect a Road Obstacle
Methods and systems for use of a reference image to detect a road obstacle are described. A computing device configured to control a vehicle may be configured to receive, from an image-capture device, an image of a road on which the vehicle is travelling. The computing device may be configured to compare the image to a reference image and identify a difference between the image and the reference image. Further, the computing device may be configured to determine a level of confidence for identification of the difference. Based on the difference and the level of confidence, the computing device may be configured to modify a control strategy associated with a driving behavior of the vehicle and control the vehicle based on the modified control strategy.
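A rough Python sketch of the compare, score, and decide flow follows. Everything specific in it is an assumption for illustration: grayscale images as nested lists of 0-255 values, a per-pixel difference threshold, a confidence defined as the mean normalized difference, and the strategy labels `"slow_down"` and `"keep_course"`. The abstract itself does not define any of these.

```python
def changed_pixels(image, reference, threshold=30):
    """Return (row, col) positions whose grayscale difference exceeds threshold."""
    return [
        (r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if abs(value - reference[r][c]) > threshold
    ]


def difference_confidence(image, reference, changed):
    """Mean absolute difference over the changed pixels, scaled to [0, 1].

    This scoring rule is a stand-in chosen for the sketch.
    """
    if not changed:
        return 0.0
    total = sum(abs(image[r][c] - reference[r][c]) for r, c in changed)
    return total / (255 * len(changed))


def choose_strategy(changed, confidence, min_pixels=2, min_confidence=0.5):
    """Pick a (hypothetical) control strategy from the difference and its confidence."""
    if len(changed) >= min_pixels and confidence >= min_confidence:
        return "slow_down"
    return "keep_course"
```

Keeping the confidence separate from the raw difference lets a caller ignore small or weak differences (for example lighting changes) without altering the comparison step.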
OBJECT DETECTION BASED ON THREE-DIMENSIONAL DISTANCE MEASUREMENT SENSOR POINT CLOUD DATA
Distance measurements are received from one or more distance measurement sensors, which may be coupled to a vehicle. A three-dimensional (3D) point cloud is generated based on the distance measurements. In some cases, 3D point clouds corresponding to distance measurements from different distance measurement sensors may be combined into one 3D point cloud. A voxelized model is generated based on the 3D point cloud. An object may be detected within the voxelized model, and in some cases may be classified by object type. If the distance measurement sensors are coupled to a vehicle, the vehicle may avoid the detected object.
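The merge-then-voxelize steps can be sketched in a few lines of Python. The voxel size, the `min_points` occupancy rule, and the function names are assumptions for illustration; the abstract does not specify how detection is performed within the voxelized model.

```python
import math
from collections import Counter


def merge_clouds(*clouds):
    """Combine point clouds from several sensors into one list of (x, y, z)."""
    return [point for cloud in clouds for point in cloud]


def voxelize(points, voxel_size=0.5):
    """Count points per voxel; keys are integer (i, j, k) voxel indices."""
    counts = Counter()
    for x, y, z in points:
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        counts[key] += 1
    return counts


def occupied_voxels(counts, min_points=2):
    """Treat voxels with at least min_points points as occupied (assumed rule)."""
    return {key for key, n in counts.items() if n >= min_points}
```

A point-count threshold is only one possible occupancy test; it is used here because it filters isolated returns with no extra machinery.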
BIRD'S EYE VIEW BASED VELOCITY ESTIMATION VIA SELF-SUPERVISED LEARNING
Systems and methods for determining velocity of an object associated with a three-dimensional (3D) scene may include: a LIDAR system generating two sets of 3D point cloud data of the scene from two consecutive point cloud sweeps; a pillar feature network encoding data of the point cloud data to extract two-dimensional (2D) bird's-eye-view embeddings for each of the point cloud data sets in the form of pseudo images, wherein the 2D bird's-eye-view embeddings for a first of the two point cloud data sets comprise pillar features for the first point cloud data set and the 2D bird's-eye-view embeddings for a second of the two point cloud data sets comprise pillar features for the second point cloud data set; and a feature pyramid network encoding the pillar features and performing a 2D optical flow estimation to estimate the velocity of the object.
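The data flow (two sweeps, one bird's-eye-view grid per sweep, displacement between grids, velocity) can be illustrated with a deliberately simplified Python sketch. It is not the learned pipeline the abstract describes: the pillar feature network is replaced by plain per-cell point counts, and the optical flow estimation is replaced by a single centroid shift for a scene containing one object. All names and parameters are hypothetical.

```python
def bev_counts(points, cell=1.0, size=8):
    """Bin (x, y, z) points into a size x size top-down grid of point counts.

    Stand-in for learned pillar features: each cell ("pillar") stores a count.
    """
    grid = [[0] * size for _ in range(size)]
    for x, y, _z in points:
        col, row = int(x // cell), int(y // cell)
        if 0 <= row < size and 0 <= col < size:
            grid[row][col] += 1
    return grid


def centroid(grid):
    """Count-weighted centroid (col, row) of a grid, or None if it is empty."""
    total = sum(map(sum, grid))
    if total == 0:
        return None
    cy = sum(r * v for r, row in enumerate(grid) for v in row) / total
    cx = sum(c * v for row in grid for c, v in enumerate(row)) / total
    return cx, cy


def estimate_velocity(grid_a, grid_b, cell, dt):
    """Velocity (vx, vy) from the centroid shift between two consecutive grids.

    Assumes both grids are non-empty and contain the same single object.
    """
    (ax, ay), (bx, by) = centroid(grid_a), centroid(grid_b)
    return (bx - ax) * cell / dt, (by - ay) * cell / dt
```

The point of the sketch is the unit conversion: a displacement measured in grid cells becomes a velocity once multiplied by the cell size and divided by the time between sweeps.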
BIRD'S EYE VIEW BASED VELOCITY ESTIMATION
Systems and methods for determining velocity of an object associated with a three-dimensional (3D) scene may include: a LIDAR system generating two sets of 3D point cloud data of the scene from two consecutive point cloud sweeps; a pillar feature network encoding data of the point cloud data to extract two-dimensional (2D) bird's-eye-view embeddings for each of the point cloud data sets in the form of pseudo images, wherein the 2D bird's-eye-view embeddings for a first of the two point cloud data sets comprise pillar features for the first point cloud data set and the 2D bird's-eye-view embeddings for a second of the two point cloud data sets comprise pillar features for the second point cloud data set; and a feature pyramid network encoding the pillar features and performing a 2D optical flow estimation to estimate the velocity of the object.
IMAGE-BASED DEPTH DATA AND LOCALIZATION
A vehicle can use an image sensor to both detect objects and determine depth data associated with the environment the vehicle is traversing. The vehicle can capture image data and lidar data using the various sensors. The image data can be provided to a machine-learned model trained to output depth data of an environment. Such models may be trained, for example, by using lidar data and/or three-dimensional map data associated with a region in which training images and/or lidar data were captured as ground truth data. The autonomous vehicle can further process the depth data and generate additional data including localization data, three-dimensional bounding boxes, and relative depth data and use the depth data and/or the additional data to autonomously traverse the environment, provide calibration/validation for vehicle sensors, and the like.
Tracking vehicles in a warehouse environment
This specification generally discloses technology for tracking vehicle positions in a warehouse environment. In some implementations, a system receives stereoscopic image data from a camera on a forklift. The system recognizes an object that is represented in the stereoscopic image data; identifies a representation of the recognized object in a spatial model that identifies, for each of a plurality of objects in an environment, a corresponding location of the object in the environment; determines the location of the recognized object in the environment; determines a relative position between the forklift and the recognized object, based on a portion of the received stereoscopic image data that represents the recognized object; and determines a location of the forklift in the environment, based on the determined location of the recognized object in the environment and the determined relative position between the forklift and the recognized object.
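The final step, recovering the forklift's location from a known object location and a measured relative position, reduces to a small piece of 2D geometry. The Python sketch below assumes a flat floor, a vehicle frame with x pointing forward and y pointing left, and a known vehicle heading; none of these conventions come from the abstract, and `locate_vehicle` and the `rack_A3` entry are hypothetical.

```python
import math


def locate_vehicle(object_world, object_relative, heading_rad=0.0):
    """Return the vehicle's (x, y) in world coordinates.

    object_world: object's location from the spatial model, world frame.
    object_relative: object's position in the vehicle frame (x forward, y left),
        e.g. as measured from stereoscopic image data.
    heading_rad: vehicle yaw in the world frame (assumed known).
    """
    ox, oy = object_world
    rx, ry = object_relative
    cos_h, sin_h = math.cos(heading_rad), math.sin(heading_rad)
    # Rotate the vehicle-frame offset into the world frame, then subtract it
    # from the object's world location to get the vehicle's location.
    wx = rx * cos_h - ry * sin_h
    wy = rx * sin_h + ry * cos_h
    return ox - wx, oy - wy


# Hypothetical spatial model: object identifier -> location in the environment.
spatial_model = {"rack_A3": (10.0, 5.0)}
```

With several recognized objects in view, each one gives an independent estimate of the vehicle location, and a caller could average them to reduce measurement noise.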
Information processing apparatus, vehicle, and information processing method for presence probability of object
An information processing apparatus according to one embodiment includes a processing circuit. The processing circuit calculates a first presence probability of an object present around a moving body with positional information measured by each of a plurality of sensors having different characteristics, acquires non-measurement information indicating that the positional information on the object has not been obtained for each of the sensors, and determines a second presence probability of the object based on the first presence probability and the non-measurement information.
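One way to picture the role of the non-measurement information is the Python sketch below. The fusion rule is an assumption made purely for illustration (average the first presence probabilities of the sensors that actually returned a measurement, and fall back to an "unknown" prior when none did); the abstract does not state how the second presence probability is computed, and `fuse_presence` is a hypothetical name.

```python
def fuse_presence(first_probs, not_measured, prior=0.5):
    """Combine per-sensor presence probabilities into a second probability.

    first_probs: first presence probability from each sensor.
    not_measured: per-sensor flags, True where no positional information
        on the object was obtained (the non-measurement information).
    prior: value returned when no sensor produced a measurement (assumed).
    """
    usable = [p for p, missing in zip(first_probs, not_measured) if not missing]
    if not usable:
        return prior
    return sum(usable) / len(usable)
```

The design point is that a sensor that saw nothing is excluded from the estimate instead of being counted as evidence that the cell is empty.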
Rendering operations using sparse volumetric data
A ray is cast into a volume described by a volumetric data structure, which describes the volume at a plurality of levels of detail. A first entry in the volumetric data structure includes a first set of bits representing voxels at a first, lowest one of the plurality of levels of detail, and values of the first set of bits indicate whether a corresponding one of the voxels is at least partially occupied by respective geometry. A set of second entries in the volumetric data structure describes voxels at a second level of detail, which represent subvolumes of the voxels at the first level of detail. The ray is determined to pass through a particular subset of the voxels at the first level of detail, and at least a particular one of the particular subset of voxels is determined to be occupied by geometry.
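A bitmask-per-entry layout of this kind can be sketched in Python as follows. The 4 x 4 x 4 grid per entry (64 bits), the dictionary of child entries, and the fixed-step ray march are all assumptions for illustration; the abstract specifies neither the grid dimensions nor the traversal method, and a production traversal would normally step voxel to voxel instead of sampling at fixed intervals.

```python
import math

N = 4  # assumed voxels per axis in each entry, so one entry is a 64-bit mask


def bit_index(x, y, z):
    """Map voxel coordinates inside one entry to a bit position."""
    return x + N * y + N * N * z


def set_voxel(mask, x, y, z):
    """Return mask with the bit for voxel (x, y, z) set (occupied)."""
    return mask | (1 << bit_index(x, y, z))


def is_occupied(mask, x, y, z):
    """True if the bit for voxel (x, y, z) is set."""
    return bool((mask >> bit_index(x, y, z)) & 1)


def is_occupied_fine(root_mask, children, coarse, fine):
    """Two-level lookup: check the coarse voxel, then its child entry."""
    if not is_occupied(root_mask, *coarse):
        return False
    return is_occupied(children[bit_index(*coarse)], *fine)


def first_hit(mask, origin, direction, steps=64, step_len=0.1):
    """March along a ray (in voxel units); return the first occupied voxel or None."""
    for i in range(steps):
        point = [o + d * step_len * i for o, d in zip(origin, direction)]
        voxel = tuple(math.floor(c) for c in point)
        if all(0 <= c < N for c in voxel) and is_occupied(mask, *voxel):
            return voxel
    return None
```

Because an unset coarse bit means the whole subvolume is empty, the two-level lookup can skip the child entry entirely, which is the main saving a sparse multi-level structure offers.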
Methods and apparatuses for object detection, and devices
A method for object detection includes: obtaining a plurality of to-be-determined targets in a to-be-detected image; determining confidences of the plurality of to-be-determined targets separately belonging to at least one category, determining categories of the plurality of to-be-determined targets according to the confidences, and determining position offset values corresponding to the respective categories of the plurality of to-be-determined targets; using the position offset values corresponding to the respective categories of the plurality of to-be-determined targets as position offset values of the plurality of to-be-determined targets; and determining position information and a category of at least one to-be-determined target in the to-be-detected image according to the categories of the plurality of to-be-determined targets, the position offset values of the plurality of to-be-determined targets, and the confidences of the plurality of to-be-determined targets belonging to the categories thereof.
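The selection logic in this abstract (score each candidate per category, pick the category, then use that category's position offset) can be sketched in Python as below. The data layout, the additive corner offsets, the `min_conf` cutoff, and the name `decode` are assumptions for illustration; the abstract does not define how offsets are applied to a candidate's position.

```python
def decode(candidates, min_conf=0.5):
    """Pick each candidate's category and apply that category's offsets.

    Each candidate is a dict with:
      "box": (x1, y1, x2, y2) initial position,
      "confidences": one confidence per category,
      "offsets": one (dx1, dy1, dx2, dy2) tuple per category.
    """
    results = []
    for cand in candidates:
        confs = cand["confidences"]
        # Category with the highest confidence for this candidate.
        category = max(range(len(confs)), key=confs.__getitem__)
        if confs[category] < min_conf:
            continue  # assumed rule: drop low-confidence candidates
        dx1, dy1, dx2, dy2 = cand["offsets"][category]
        x1, y1, x2, y2 = cand["box"]
        results.append({
            "category": category,
            "confidence": confs[category],
            "box": (x1 + dx1, y1 + dy1, x2 + dx2, y2 + dy2),
        })
    return results
```

Only the offsets belonging to the chosen category are used; the offsets predicted for the other categories are ignored.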