Patent classifications
H04N13/268
Self-supervised training of a depth estimation model using depth hints
A method for training a depth estimation model with depth hints is disclosed. For each image pair: for a first image, a depth prediction is determined by the depth estimation model and a depth hint is obtained; the second image is projected onto the first image once to generate a synthetic frame based on the depth prediction and again to generate a hinted synthetic frame based on the depth hint; a primary loss is calculated with the synthetic frame; a hinted loss is calculated with the hinted synthetic frame; and an overall loss is calculated for the image pair based on a per-pixel determination of whether the primary loss or the hinted loss is smaller, wherein if the hinted loss is smaller than the primary loss, then the overall loss includes the primary loss and a supervised depth loss between depth prediction and depth hint. The depth estimation model is trained by minimizing the overall losses for the image pairs.
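The per-pixel loss selection described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `overall_loss`, the plain L1 form of the supervised depth loss, the mean reduction, and the choice to use only the primary loss where the hint is not better are all assumptions not stated in the abstract.

```python
import numpy as np

def overall_loss(primary_loss, hinted_loss, depth_pred, depth_hint):
    """Combine per-pixel losses for one image pair (illustrative sketch).

    primary_loss, hinted_loss: (H, W) losses computed from the synthetic
        frame and the hinted synthetic frame, respectively.
    depth_pred, depth_hint: (H, W) depth prediction and depth hint.
    """
    # Per-pixel decision: does the depth hint give a smaller loss
    # than the model's own depth prediction?
    hint_is_better = hinted_loss < primary_loss

    # Supervised depth loss between prediction and hint.
    # Assumption: plain L1; the abstract does not specify the form.
    supervised = np.abs(depth_pred - depth_hint)

    # Where the hint is better, add the supervised term to the primary
    # loss; elsewhere keep the primary loss alone (assumed).
    per_pixel = primary_loss + np.where(hint_is_better, supervised, 0.0)

    # Assumption: average over pixels to get a scalar for the image pair.
    return float(per_pixel.mean())
```

For example, with two pixels where only the first has a smaller hinted loss, the supervised term is added at the first pixel only.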
Method, apparatus, medium, terminal, and device for processing multi-angle free-perspective data
A method, an apparatus, a medium, a terminal, and a device for processing multi-angle free-perspective data are disclosed. The method includes: acquiring a data header file; determining a defined format of a data file according to a parsing result of the data header file; reading a data combination from the data file based on the defined format, where the data combination includes pixel data and depth data of multiple synchronized images, the multiple synchronized images have different perspectives with respect to a to-be-viewed area, and the pixel data and depth data of each of the multiple synchronized images are associated with each other; and performing image or video reconstruction of a virtual viewpoint according to the read data combination, where the virtual viewpoint is selected from a multi-angle free-perspective range, and the multi-angle free-perspective range is a range that supports switching the virtual viewpoint from which the to-be-viewed area is viewed.
INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND PROGRAM
An information processing device according to an embodiment of the present technology includes a movement-information acquisition unit, a gaze-information acquisition unit, and a display control unit. The movement-information acquisition unit acquires movement information about a gesture by a user. The gaze-information acquisition unit acquires information about a gazing point of the user. The display control unit controls a display device on the basis of the movement information. The display control unit causes the display device to display a first virtual object, which includes information relating to a target object, in a first region related to the target object, and to vary how the first virtual object is displayed on the basis of the position of the gazing point while the user is making the gesture.
AUGMENTED REALITY-BASED REMOTE GUIDANCE METHOD AND APPARATUS, TERMINAL, AND STORAGE MEDIUM
Embodiments disclose an augmented reality-based remote guidance method and apparatus, terminal, and storage medium. The method comprises the following steps: acquiring a two-dimensional video of a target scene, and sending the two-dimensional video to a remote terminal; if the guidance mode of the remote guidance is a marking mode, acquiring two-dimensional pixel coordinates corresponding to a point marked at the remote terminal in a marked image frame of the two-dimensional video; determining current camera coordinates corresponding to the marked point according to first three-dimensional coordinate estimation rules and the two-dimensional pixel coordinates, wherein the current camera coordinates are the current three-dimensional space coordinates of the marked point in a camera coordinate system; and rendering, according to a presentation mode and the current camera coordinates, a three-dimensional virtual model corresponding to the marked point so as to display the three-dimensional virtual model in the target scene.
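The step from two-dimensional pixel coordinates to camera coordinates can be illustrated with a standard pinhole back-projection. This is only one possible estimation rule and is not taken from the abstract, which does not define its "first three-dimensional coordinate estimation rules". The function name `pixel_to_camera`, the intrinsic parameters `fx`, `fy`, `cx`, `cy`, and the availability of a depth value for the marked point are all assumptions.

```python
import numpy as np

def pixel_to_camera(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) at a known depth into camera coordinates.

    Assumes an ideal pinhole camera with focal lengths (fx, fy) and
    principal point (cx, cy), all in pixels, and a depth measured along
    the camera's optical axis.
    """
    x = (u - cx) * depth / fx  # horizontal offset from the optical axis
    y = (v - cy) * depth / fy  # vertical offset from the optical axis
    return np.array([x, y, depth], dtype=float)
```

Under this model, a point marked at the principal point lies on the optical axis, so only its depth coordinate is non-zero.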
Image generation apparatus, image generation method, data structure, and program
The aim is to allow an observer wearing stereoscopic equipment to perceive a stereo image while an observer not wearing such equipment perceives a clear image. Based on an original image, an image containing phase-modulated components a and an image containing phase-modulated components b are generated. These images are designed so that an observer who sees the original image (or a subject represented by the original image) together with the image containing phase-modulated components a with one eye, and the original image (or the subject) together with the image containing phase-modulated components b with the other eye, perceives a stereo image, whereas an observer who sees the original image (or the subject), the image containing phase-modulated components a, and the image containing phase-modulated components b with the same eye(s) perceives the original image. The phase-modulated components a are generated by shifting the phase of spatial frequency components of the original image by a first phase, and the phase-modulated components b are generated by shifting the phase of the spatial frequency components of the original image by a second phase different from the first phase.
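Shifting the phase of an image's spatial frequency components can be sketched with a one-dimensional FFT along each row. This is an illustrative sketch only: the function name `phase_shift_rows`, the restriction to horizontal frequencies, and the choice of +π/2 and −π/2 as the two phases in the usage note are assumptions; the abstract only requires that the two phases differ.

```python
import numpy as np

def phase_shift_rows(image, phase):
    """Shift the phase of every horizontal frequency component by `phase`.

    image: 2-D array (rows x columns) of real values.
    phase: phase shift in radians.
    """
    spectrum = np.fft.fft(image, axis=-1)
    freqs = np.fft.fftfreq(image.shape[-1])
    # Positive frequencies get +phase and negative frequencies get -phase,
    # so a cosine cos(x) becomes cos(x + phase). The zero-frequency (DC)
    # term is left unchanged because sign(0) == 0.
    shifted = spectrum * np.exp(1j * phase * np.sign(freqs))
    # Keep the real part; any imaginary residue is discarded.
    return np.fft.ifft(shifted, axis=-1).real
```

With the assumed opposite phases, `phase_shift_rows(img, np.pi / 2)` and `phase_shift_rows(img, -np.pi / 2)` sum to zero for a zero-mean sinusoidal row, which is consistent with an observer who sees both components with the same eye perceiving only the original image.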
IMAGE GENERATION APPARATUS, IMAGE GENERATION METHOD, AND PROGRAM
Provided is an image generation technology capable of suppressing the unpleasant sensation caused by fluctuations in image quality as the viewer's viewpoint moves. The image generation technology includes an image generation unit configured to generate a pseudo viewpoint image $I_{\varphi_k}$ and a pseudo viewpoint image $I_{-\varphi_k}$ by using a disparity inducing edge $D_{\varphi_k}$ having a phase difference $\varphi_k$ from a viewpoint image $I$, where $\varphi_k$ ($1 \le k \le K$) is a real number satisfying $0 < \varphi_1 < \dots < \varphi_K \le \pi/2$, and an output image generation unit configured to generate an output image $\mathrm{Out}_m$ ($1 \le m \le 2K-1$) from the pseudo viewpoint images $I^{(m)}$ ($1 \le m \le 2K+1$). The viewpoint image $I$ is set as the pseudo viewpoint image $I_{\varphi_0}$, and the pseudo viewpoint images $I_{\varphi_k}$, $I_{-\varphi_k}$ ($0 \le k \le K$), arranged in the sequence $I_{\varphi_K}, I_{\varphi_{K-1}}, \dots, I_{\varphi_1}, I_{\varphi_0} (= I), I_{-\varphi_1}, \dots, I_{-\varphi_K}$, are set as $I^{(1)}, I^{(2)}, \dots, I^{(K)}, I^{(K+1)}, I^{(K+2)}, \dots, I^{(2K+1)}$. The output image $\mathrm{Out}_m$ and the output image $\mathrm{Out}_{m+1}$ ($1 \le m \le 2K-2$) include a phase modulation component that is canceled out when the two images are synthesized and visually recognized.