Patent classifications
G06T3/00
Information processing device, information processing method, program, and movable object
The present technology relates to an information processing device, an information processing method, a program, and a movable object that help prevent a user from suffering motion sickness. The information processing device includes: a display-position setting unit configured to move, on the basis of the motion of a movable object, the display position of a first picture viewed from a predetermined point of view of the movable object; and a display control unit configured to perform display control based on the set display position. The present technology can be applied to, for example, a vehicle that displays a picture in superimposition on the ambient scenery.
Warping for spatial light modulating displays using eye tracking
Embodiments shift the color fields of a rendered image frame based on eye tracking data (e.g., the position of the user's pupils). An MR device obtains a first image frame having a set of color fields. The first image frame corresponds to a first position of the pupils of the viewer. The MR device then determines a second position of the pupils of the viewer based on, for example, data received from an eye tracking device coupled to the MR device. The MR device generates, based on the first image frame, a second image frame corresponding to the second position of the pupils. The second image frame is generated by shifting color fields by a shift value based on the second position of the pupils of the viewer. The MR device transmits the second image frame to a display device of the MR device to be displayed thereon.
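As a rough illustration of the color-field shift described above, the following sketch shifts each color field horizontally by an amount proportional to the change in pupil position. The function names, the per-field gain values, and the restriction to integer horizontal shifts are assumptions made for illustration; the abstract does not specify how the shift value is computed.

```python
def shift_field(field, dx, fill=0):
    """Shift a 2D color field (a list of rows) horizontally by dx pixels.

    Pixels shifted outside the field are dropped; vacated pixels get `fill`.
    """
    width = len(field[0])
    shifted = []
    for row in field:
        new_row = [fill] * width
        for x, value in enumerate(row):
            nx = x + dx
            if 0 <= nx < width:
                new_row[nx] = value
        shifted.append(new_row)
    return shifted


def warp_frame(frame, first_pupil_x, second_pupil_x, gains):
    """Build a second frame by shifting every color field of the first frame.

    `gains` (hypothetical) maps a field name to pixels of shift per unit of
    pupil displacement, so different fields can receive different shifts.
    """
    displacement = second_pupil_x - first_pupil_x
    return {
        name: shift_field(field, round(gains[name] * displacement))
        for name, field in frame.items()
    }
```

Giving each field its own gain is one plausible way to model a display that shows color fields sequentially, where each field is seen at a slightly different pupil position.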
Intuitive 3D transformations for 2D graphics
A graphics design system provides intuitive 3D transformations for 2D objects. A user interface element is presented on a 2D object or group of 2D objects. The user interface element comprises a combination of components for applying different 3D transformations, including at least one rotation component for rotating a 2D object or group of 2D objects around an axis and at least one translation component for translating the 2D object or group of 2D objects along at least one axis. 3D transformations are non-destructive and performed relative to axes local to a 2D object or 2D objects. When a 2D object or group of 2D objects is rotated around an axis, the other axes are rotated. As such, subsequent rotations and translations are performed based on the rotated axes. Additionally, editing actions associated with rotated 2D object(s) are performed in the rotated x-y plane of the rotated 2D object(s).
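The local-axis behavior described above can be sketched with an orientation matrix whose columns are the object's local axes expressed in world coordinates. The class and method names here are hypothetical, and the matrix representation is an assumption; the abstract does not say how the system stores the transformation.

```python
import math


def rotation_about(axis, angle):
    """3x3 rotation matrix about coordinate axis 0 (x), 1 (y) or 2 (z)."""
    c, s = math.cos(angle), math.sin(angle)
    if axis == 0:
        return [[1, 0, 0], [0, c, -s], [0, s, c]]
    if axis == 1:
        return [[c, 0, s], [0, 1, 0], [-s, 0, c]]
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


class Object2D:
    """Holds only a transform; the 2D artwork itself is never modified."""

    def __init__(self):
        # Columns of `orientation` are the local x, y, z axes in world space.
        self.orientation = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
        self.position = [0.0, 0.0, 0.0]

    def rotate_local(self, axis, angle):
        # Post-multiplying applies the rotation in the object's own frame,
        # which also rotates the other two local axes.
        self.orientation = matmul(self.orientation, rotation_about(axis, angle))

    def translate_local(self, axis, distance):
        # Move along the current (possibly rotated) local axis.
        for i in range(3):
            self.position[i] += distance * self.orientation[i][axis]
```

After a quarter turn about the local z axis, a translation along local x moves the object along world y, which is the "subsequent translations use the rotated axes" behavior.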
Method to improve accuracy of quantized multi-stage object detection network
An apparatus includes a memory and a processor. The memory may be configured to store image data of an input image. The processor may be configured to detect one or more objects in the input image using a quantized multi-stage object detection network, where quantization of the quantized multi-stage object detection network includes (i) generating quantized image data by performing a first data range analysis on the image data of the input image, (ii) generating a feature map and proposal bounding boxes by applying a region proposal network (RPN) to the quantized image data, (iii) performing a region of interest pooling operation on the feature map and a plurality of ground truth boxes corresponding to the proposal bounding boxes generated by the RPN, (iv) generating quantized region of interest pooling results by performing a second data range analysis on results from the region of interest pooling operation, and (v) applying a region-based convolutional neural network (RCNN) to the quantized region of interest pooling results.
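The abstract does not say what a "data range analysis" computes. One common reading is a min/max scan that fixes the scale of an affine quantizer, and the sketch below assumes exactly that. The function names and the unsigned-code layout are illustrative assumptions, not the patented method.

```python
def analyze_range(values):
    """Data range analysis (assumed here to be a simple min/max scan)."""
    return min(values), max(values)


def quantize(values, bits=8):
    """Map floats to unsigned integer codes using the observed range."""
    lo, hi = analyze_range(values)
    levels = (1 << bits) - 1
    # Guard against a constant input, where the range would be zero.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate floats from integer codes."""
    return [c * scale + lo for c in codes]
```

In a two-stage detector, the same routine could be run once on the input image and again on the region of interest pooling results, so that each stage is quantized against its own observed range.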
3D microgeometry and reflectance modeling
A system and method for three-dimensional (3D) microgeometry and reflectance modeling is provided. The system receives images comprising a first set of images of a face and a second set of images of the face. The faces in the first set of images and the second set of images are exposed to omni-directional lighting and directional lighting, respectively. The system generates a 3D face mesh based on the received images and executes a set of skin-reflectance modeling operations by using the generated 3D face mesh and the second set of images, to estimate a set of texture maps for the face. Based on the estimated set of texture maps, the system texturizes the generated 3D face mesh. The texturization includes an operation in which texture information, including microgeometry skin details and skin reflectance details, of the estimated set of texture maps is mapped onto the generated 3D face mesh.
Surgical navigation with stereovision and associated methods
A surgical guidance system has two cameras to provide a stereo image stream of a surgical field, and a stereo viewer. The system has a 3D surface extraction module that generates a first 3D model of the surgical field from the stereo image stream; a registration module for co-registering annotating data with the first 3D model; and a stereo image enhancer for graphically overlaying at least part of the annotating data onto the stereo image stream to form an enhanced stereo image stream for display, where the enhanced stereo image stream enhances a surgeon's perception of the surgical field. The registration module has an alignment refiner to adjust registration of the annotating data with the 3D model based upon matching of features within the 3D model and features within the annotating data; and in an embodiment, a deformation modeler to deform the annotating data based upon a determined tissue deformation.
Unsupervised real-to-virtual domain unification for end-to-end highway driving
An unsupervised real-to-virtual domain unification model for highway driving, or DU-drive, employs a conditional generative adversarial network to transform driving images in a real domain to their canonical representations in the virtual domain, from which vehicle control commands are predicted. Where there are multiple real datasets, a real-to-virtual generator may be independently trained for each real domain, and a global predictor may be trained with data from multiple real domains. Qualitative experiment results show that this model can effectively transform real images to the virtual domain while keeping only the minimal sufficient information, and quantitative results verify that such a canonical representation can eliminate domain shift and boost the performance of the control command prediction task.
Artificial neural network model and electronic device including the same
An electronic device is described that includes processing logic configured to receive input image data and generate output image data having a different format from the input image data using an artificial neural network model. The artificial neural network model includes a plurality of encoding layer units, including a plurality of layers located at a plurality of levels, respectively. The artificial neural network model also includes a plurality of decoding layer units including a plurality of layers and configured to form skip connections with the plurality of encoding layer units at the same levels. A first encoding layer unit of a first level receives a first input feature map and outputs a first output feature map, based on the first input feature map, to a subsequent encoding layer unit and to a decoding layer unit at the first level.
Method and apparatus for providing virtual clothing wearing service based on deep-learning
A method and apparatus provide a virtual clothing wearing service based on deep-learning. A virtual clothing wearing server based on deep-learning includes a communicator configured to receive a user image and a virtual wearing clothing image; a memory configured to store a program including first and second deep-learning models; and a processor configured to generate an image in which virtual wearing clothing is virtually dressed on a user. The program is configured to: generate, by the first deep-learning model, a transformed virtual wearing clothing image by transforming the virtual wearing clothing image in accordance with a body of the user in the user image based on the user image and the virtual wearing clothing image, and generate, by the second deep-learning model, the virtual wearing person image by dressing the transformed virtual wearing clothing on the body of the user based on the user image and the transformed virtual wearing clothing image.
Method of inserting an object into a sequence of images
The invention relates to a method of inserting an insertion object into a sequence of images. The insertion object may be an image, a video, or a three-dimensional model, which may be animated. Particularly, but not exclusively, the invention relates to the insertion of advertisement images into video, such as videos of sporting events. The method comprises capturing a sequence of images, the sequence of images comprising in order a first image, a second image, and a third image; estimating a first homographic transform from the first image to the third image; deriving a second homographic transform from the first image to the second image based on the first homographic transform; transforming the insertion object using the first homographic transform to form a first warped insertion image, and inserting the first warped insertion image into the third image of the sequence of images; and transforming the insertion object using the second homographic transform to form a second warped insertion image, and inserting the second warped insertion image into the second image of the sequence of images.
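The abstract does not state how the second homographic transform is derived from the first. A minimal sketch of one possibility, assuming the second image lies a fraction t of the way between the first and third images and that the motion is small enough for a linear blend with the identity to be a reasonable approximation, is shown below. The function names and the blending rule are assumptions for illustration only, and the input matrix is assumed to be normalized so that its bottom-right entry is 1.

```python
def apply_homography(h, point):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def interpolate_homography(h13, t=0.5):
    """Approximate the first-to-second transform from the first-to-third one.

    Assumption: blend the identity with h13 by the fraction t, then
    renormalize so that the bottom-right entry is 1 again.
    """
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    blended = [[(1.0 - t) * identity[i][j] + t * h13[i][j] for j in range(3)]
               for i in range(3)]
    s = blended[2][2]
    return [[value / s for value in row] for row in blended]
```

Warping the corners of the insertion object with `apply_homography` under each transform would give its placement in the second and third images respectively.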