Patent classifications
G06T7/596
Systems and methods for decoding image files containing depth maps stored as metadata
Systems and methods in accordance with embodiments of the invention are configured to render images using light field image files containing an image synthesized from light field image data and metadata describing the image that includes a depth map. One embodiment of the invention includes a processor and memory containing a rendering application and a light field image file including an encoded image, a set of low resolution images, and metadata describing the encoded image, where the metadata comprises a depth map that specifies depths from a reference viewpoint for pixels in the encoded image. In addition, the rendering application configures the processor to: locate the encoded image within the light field image file; decode the encoded image; locate the metadata within the light field image file; and post-process the decoded image by modifying the pixels based on the depths indicated within the depth map and the set of low resolution images to create a rendered image.
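The depth-based post-processing step can be sketched in miniature: given a decoded image and the per-pixel depth map recovered from the file metadata, each pixel can be reprojected with a depth-dependent disparity. This is an illustrative sketch only; the function name and the pinhole disparity model d = f·B/Z are assumptions, not the patent's actual rendering pipeline.

```python
import numpy as np

def render_shifted_view(image, depth, baseline=1.0, focal=1.0):
    """Reproject each pixel horizontally by disparity = focal * baseline / depth.

    A toy stand-in for depth-based post-processing of a decoded image:
    pixels nearer the camera (small depth) shift farther than distant ones.
    """
    h, w = depth.shape
    out = np.zeros_like(image)
    for y in range(h):
        for x in range(w):
            d = int(round(focal * baseline / depth[y, x]))
            nx = x + d
            if 0 <= nx < w:
                out[y, nx] = image[y, x]
    return out
```

A full renderer would also fill disocclusion holes, e.g. from the file's set of low resolution images; the sketch leaves vacated pixels at zero.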
Self-implementing accurate body measurements by a user using a mobile device
A method includes receiving from a camera a 2D front-view digital image (FDI) and side-view digital image (SDI) of a human body; receiving one or more camera parameters (CPs); executing a front depth-estimation model (DEM) generated through pre-training a first machine-learning model (MLM) to convert the FDI and SDI into a front depth map (DM) and side DM, respectively; executing a rear DEM generated through pre-training a second MLM to estimate a rear DM based on the FDI, SDI, and front and side DMs; combining the front and rear DMs to generate a 3D model; executing a front key-point (FKP) estimation model generated through pre-training a third MLM to estimate one or more FKPs; and extracting a body measurement based on the 3D model and FKPs, the first and second MLMs each including a respective encoder and decoder, each respective decoder configured to be executed based on the CPs.
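One plausible way the final measurement-extraction step could combine the 3D model and key points is to treat a body cross-section as an ellipse whose axes come from the key-point width and the front-to-rear depth thickness. The geometry below (axis definitions, Ramanujan's perimeter approximation) is an illustrative assumption, not the claimed method.

```python
import math

def circumference_from_depths(width, front_depth, rear_depth):
    """Approximate a body cross-section as an ellipse: semi-axis a from the
    key-point width in the front image, semi-axis b from the front-to-rear
    thickness given by the two depth maps. Perimeter via Ramanujan's
    approximation pi * (3(a+b) - sqrt((3a+b)(a+3b)))."""
    a = width / 2.0
    b = (rear_depth - front_depth) / 2.0
    return math.pi * (3 * (a + b) - math.sqrt((3 * a + b) * (a + 3 * b)))
```

For a circular cross-section (a = b = r) the formula reduces exactly to 2πr, which makes the sketch easy to sanity-check.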
PERCEPTION UNCERTAINTY
A computer-implemented method of perceiving structure in an environment comprises steps of: receiving at least one structure observation input pertaining to the environment; processing the at least one structure observation input in a perception pipeline to compute a perception output; determining one or more uncertainty source inputs pertaining to the structure observation input; and determining for the perception output an associated uncertainty estimate by applying, to the one or more uncertainty source inputs, an uncertainty estimation function learned from statistical analysis of historical perception outputs.
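An uncertainty estimation function "learned from statistical analysis of historical perception outputs" can be sketched very simply as a lookup table: bin an uncertainty source input (e.g. detector confidence), record the spread of historical errors per bin, and report that spread at inference time. The binning scheme and the choice of standard deviation as the statistic are assumptions for illustration.

```python
import numpy as np

def fit_uncertainty_table(source_vals, errors, bins):
    """Statistical analysis of historical outputs: per bin of the
    uncertainty source input, record the std of observed errors."""
    idx = np.digitize(source_vals, bins)
    table = {}
    for b in range(len(bins) + 1):
        e = errors[idx == b]
        table[b] = float(np.std(e)) if e.size else None
    return table

def estimate_uncertainty(table, bins, source_val):
    """Apply the learned function to a new uncertainty source input."""
    return table[int(np.digitize([source_val], bins)[0])]
```

A real pipeline would use a richer regressor over several source inputs; the table captures the essential calibrate-then-lookup structure.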
MEDICAL IMAGING SYSTEM AND METHOD FOR GENERATING SCAN RANGE INDICATOR IN MEDICAL IMAGING SYSTEM
Provided are a medical imaging system and a method for generating a scan range indicator in the medical imaging system. The method includes projecting a 3D point cloud, generated based on a depth image of a patient, onto a plane perpendicular to a table supporting the patient and along a scanning direction, so as to generate a projected 2D point cloud; generating a patient 2D point cloud contour based on the projected 2D point cloud; determining, based on upper and lower boundary positions that are received in real time and indicate a scan range, two corresponding points on the patient 2D point cloud contour; and, using the scan positions of the corresponding points as base points, presenting an upper boundary line and a lower boundary line of the scan range on an image of the patient as a scan range indicator.
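The projection-and-contour steps can be sketched as follows: dropping the axis perpendicular to the table projects the cloud onto the scanning plane, and taking lateral extremes per scan position gives a crude 2D contour. The axis layout and the extremes-based contour are assumptions for illustration, not the patented algorithm.

```python
import numpy as np

def project_and_contour(points, scan_axis=0, lateral_axis=1):
    """Project 3D points onto the (scan, lateral) plane by dropping the
    table-normal axis, then take the lateral min/max at each scan position
    as a simple contour of the patient."""
    p2d = points[:, [scan_axis, lateral_axis]]
    contour = {}
    for s, l in p2d:
        lo, hi = contour.get(s, (l, l))
        contour[s] = (min(lo, l), max(hi, l))
    return contour
```

Looking up the contour at the real-time upper and lower boundary positions then yields the base points for drawing the scan range indicator.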
Camera array and methods of using captured images from a camera array for depth determination purposes
Camera arrays including multiple cameras spaced at various distances from one another are implemented and used. Multiple camera pairs with very different baselines between the cameras allow reliable depth determinations to be made. Disparity information obtained by comparing images of a closely spaced camera pair is used to limit the search for matching image portions in images of a more distantly spaced camera pair. This allows the search for matching image portions to be constrained so that its complexity scales at a lower rate than the rate at which the baselines between cameras increase. Depth determinations thus benefit from the accuracy obtained from large camera baselines in a manner that is efficient from a processor-utilization perspective, since comparing image portions can be processor intensive if implemented without constraints.
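The constraint described above follows from disparity scaling linearly with baseline (d = f·B/Z): a disparity estimate from the narrow pair predicts where the wide-pair match must lie, so only a small window around that prediction needs searching. The margin parameter and function name are illustrative.

```python
def wide_baseline_search_window(narrow_disparity, narrow_baseline,
                                wide_baseline, margin=2.0):
    """Since disparity d = f * B / Z is proportional to baseline B, the
    narrow-pair disparity scaled by the baseline ratio predicts the
    wide-pair disparity; searching only +/- margin around it keeps the
    matching cost nearly constant as baselines grow."""
    center = narrow_disparity * (wide_baseline / narrow_baseline)
    return (center - margin, center + margin)
```

Without this constraint, the wide pair's search range (and cost) would grow in proportion to the baseline ratio.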
Learned Stereo Synthetic Data
A method for training a learned stereo architecture includes receiving, from a graphic rendering system, a plurality of stereo image pairs comprising a variety of disparate scenes and scene parameters, where a first subset of stereo image pairs corresponds to a first baseline and a second subset of stereo image pairs corresponds to a second baseline different from the first baseline; inputting the plurality of stereo image pairs into a stereo architecture comprising one or more 3D convolution networks configured to learn disparity estimation based on the plurality of stereo image pairs; comparing disparity estimations from the stereo architecture with ground truth disparity from the graphic rendering system to generate training feedback; and adjusting one or more neural network models implemented by the stereo architecture based on the training feedback, thereby configuring the learned stereo architecture.
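The "training feedback" step — comparing predicted disparity against the renderer's ground truth — is commonly implemented as an end-point-error loss; a minimal sketch, assuming that choice of loss (the patent does not specify one):

```python
import numpy as np

def epe_loss(pred_disp, gt_disp):
    """End-point error: mean absolute difference between predicted and
    ground-truth disparity maps, a standard training signal for learned
    stereo networks."""
    return float(np.mean(np.abs(pred_disp - gt_disp)))
```

Averaging this loss over both baseline subsets is what pushes the network toward baseline-agnostic disparity estimation.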
System and method for performing quality control of manufactured models
Disclosed herein are example embodiments of methods and systems for identifying manufacturing defects of a manufactured dentition model. One of the methods for performing quality control comprises: determining whether the manufactured dentition model is a good or a defective product based on a statistical characteristic of a differences model. The differences model can be generated based on differences between scanned 3D patient-dentition data and scanned 3D manufactured-dentition data. The scanned 3D patient-dentition data can be generated using 3D data of a patient's dentition, and the scanned 3D manufactured-dentition data can be generated using 3D data of the manufactured dentition model. The manufactured dentition model can be a 3D printed model.
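A minimal sketch of the good/defective decision, assuming the two scans are already registered point-for-point and taking a high percentile of the point-wise differences as the "statistical characteristic" (the patent leaves the statistic and tolerance unspecified):

```python
import numpy as np

def classify_model(patient_pts, manufactured_pts, tol=0.1, quantile=95):
    """Differences model: per-point distance between registered scans.
    The model is judged good when the chosen percentile of the
    differences stays within tolerance."""
    diffs = np.linalg.norm(patient_pts - manufactured_pts, axis=1)
    stat = float(np.percentile(diffs, quantile))
    return ("good" if stat <= tol else "defective", stat)
```

Using a percentile rather than the maximum makes the decision robust to a few outlier scan points.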
Binocular vision-based environment sensing method and apparatus, and unmanned aerial vehicle
A binocular vision-based environment sensing method and apparatus is applied to an unmanned aerial vehicle. The unmanned aerial vehicle is provided with five binocular cameras. The first binocular camera is disposed at the front portion of the fuselage of the unmanned aerial vehicle. The second binocular camera is inclined upward and disposed between the left side of the fuselage and the upper portion of the fuselage of the unmanned aerial vehicle. The third binocular camera is inclined upward and disposed between the right side of the fuselage and the upper portion of the fuselage of the unmanned aerial vehicle. The fourth binocular camera is disposed at the lower portion of the fuselage of the unmanned aerial vehicle. The fifth binocular camera is disposed at the rear portion of the fuselage of the unmanned aerial vehicle. The method can simplify an omni-directional sensing system while reducing the sensing blind area.
Generation of a 3D point cloud of a scene
A method for generating a 3D point cloud of a scene is performed by an image processing device. The method obtains digital images depicting the scene. Each digital image is composed of pixels. The method includes segmenting each of the digital images into digital image segments. The method includes determining a depth vector and a normal vector per each of the digital image segments by applying multi-view stereo (MVS) processing to a subset of the pixels per each digital image segment. The method includes forming a map of depth vectors and normal vectors per each pixel in the digital images by, based on the determined depth and normal vectors per each of the digital image segments, estimating a 3D plane per digital image segment. The method includes generating the 3D point cloud of the scene as a combination of all the maps of depth vectors and normal vectors per each pixel in the digital images.
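The step of turning per-segment planes into per-pixel maps can be sketched with planes in the explicit form z = a·x + b·y + c (an illustrative parameterization; the patent describes planes via depth and normal vectors):

```python
import numpy as np

def pixelwise_depth_from_planes(labels, planes):
    """Given a segment label per pixel and a fitted plane z = a*x + b*y + c
    per segment, evaluate a dense depth map by reading each pixel's depth
    off its segment's plane (the normal is constant per plane)."""
    h, w = labels.shape
    depth = np.zeros((h, w))
    ys, xs = np.mgrid[0:h, 0:w]
    for seg, (a, b, c) in planes.items():
        m = labels == seg
        depth[m] = a * xs[m] + b * ys[m] + c
    return depth
```

Evaluating the plane at every pixel is what extends the sparse per-segment MVS result to a full depth/normal map, even for pixels that were excluded from the MVS subset.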
Image processing apparatus, image processing method, and storage medium
An object is to obtain three-dimensional shape data with high accuracy from three-dimensional shape data representing an approximate shape of an object. Three-dimensional shape data of an object existing in an image capturing space is obtained. Further, a plurality of distance images, each representing a distance to the object and corresponding to one of a plurality of viewpoints, is obtained. Then, correction is performed in which the three-dimensional shape data is evaluated based on the plurality of distance images and, based on results of the evaluation, a unit element estimated not to represent the shape of the object is deleted from among the unit elements configuring the three-dimensional shape data.
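The evaluate-and-delete correction resembles space carving: a unit element (voxel) lying clearly in front of the surface measured in any distance image cannot belong to the object and is removed. The flattened data layout and tolerance below are assumptions for illustration.

```python
def carve_voxels(voxel_view_distances, tol=0.5):
    """voxel_view_distances: per voxel, a list of (voxel_dist, measured_dist)
    pairs, one per viewpoint, giving the voxel's distance along that view's
    ray and the distance image's measurement on the same ray. A voxel more
    than tol in front of any measured surface is judged not to represent
    the object and is deleted; indices of kept voxels are returned."""
    kept = []
    for i, pairs in enumerate(voxel_view_distances):
        if all(vd >= md - tol for vd, md in pairs):
            kept.append(i)
    return kept
```

Requiring consistency with every viewpoint is what lets multiple distance images jointly tighten the approximate shape.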