H04N13/261

Surgical video production system and surgical video production method
11185388 · 2021-11-30 · ·

A surgical video production system consists of: an input part which records time codes in a moving image and inputs a focal distance of a lens and information on surgical tools; a recognition part which generates recognition information by recognizing the surgical tools, organs, tissue, and objects in the moving image, identifies unfocused video and shaky video, indicates markers corresponding to a first time code and a last time code of each of the unfocused video and the shaky video, and indicates markers corresponding to the first time code and the last time code of the moving image corresponding to an event; an editing part which deletes a part of the moving image using the markers, or separates the moving image according to the event, to generate an edited image; and a transformation part which transforms the edited image into a stereoscopic image.
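
The marker step in this abstract can be illustrated with a minimal sketch. It assumes the recognition part has already produced one boolean flag per frame (unfocused or shaky); the function names `mark_segments` and `delete_marked` are hypothetical and do not come from the patent.

```python
def mark_segments(time_codes, flagged):
    """Return (first_time_code, last_time_code) markers for each flagged run.

    time_codes: one time code per frame, in playback order.
    flagged:    one boolean per frame, True if the frame is unfocused or shaky
                (how the flag is computed is outside this sketch).
    """
    markers = []
    start = None
    for i, bad in enumerate(flagged):
        if bad and start is None:
            start = i                      # a flagged run begins here
        elif not bad and start is not None:
            markers.append((time_codes[start], time_codes[i - 1]))
            start = None                   # the run ended on the previous frame
    if start is not None:                  # a run that reaches the last frame
        markers.append((time_codes[start], time_codes[-1]))
    return markers


def delete_marked(time_codes, markers):
    """Keep only the time codes that fall outside every marked range."""
    return [t for t in time_codes
            if not any(first <= t <= last for first, last in markers)]


if __name__ == "__main__":
    codes = [0, 1, 2, 3, 4, 5]
    flags = [False, True, True, False, False, True]
    marks = mark_segments(codes, flags)
    print(marks)
    print(delete_marked(codes, marks))
```

The editing part would then cut the moving image at these markers; separating the video by event could reuse the same run detection with an event label in place of the boolean flag.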

SYSTEMS, METHODS AND COMPUTER PROGRAM PRODUCTS FOR AUTOMATICALLY EXTRACTING INFORMATION FROM A FLOWCHART IMAGE
20210365679 · 2021-11-25 · ·

A method of extracting information from a flowchart image comprising a plurality of closed-shaped data nodes having text enclosed within, connecting lines connecting the plurality of closed-shaped data nodes and free text adjacent to the connecting lines includes receiving the flowchart image, detecting the closed-shaped data nodes, localizing the text enclosed within the closed-shaped data nodes, and masking the localized text to generate an annotated image. Lines in the annotated image are then detected to reconstruct them as closed-shaped data nodes and connecting lines. A tree frame with the plurality of closed-shaped data nodes and the connecting lines is extracted. The free text is then localized. Chunks of the free text oriented and positioned proximally together are assembled into text blocks using an orientation-based two-dimensional clustering.

Apparatus and method for processing a depth map

An apparatus for processing a depth map comprises a receiver (203) receiving an input depth map. A first processor (205) generates a first processed depth map by processing pixels of the input depth map in a bottom to top direction. The processing of a first pixel comprises determining a depth value for the first pixel for the first processed depth map as the furthest backwards depth value of: a depth value for the first pixel in the input depth map, and a depth value determined in response to depth values in the first processed depth map for a first set of pixels being below the first pixel. The approach may improve the consistency of depth maps, and in particular for depth maps generated by combining different depth cues.
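
The bottom-to-top pass can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: larger depth values are taken to mean "further back", the "first set of pixels below" is assumed to be the up-to-three neighbours in the row directly below, and the value derived from them is assumed to be the most forward (smallest) of those already-processed values. The function name `process_bottom_to_top` is hypothetical.

```python
def process_bottom_to_top(depth):
    """Bottom-to-top depth pass over a non-empty 2D list (top row first).

    Assumptions of this sketch: larger value = further from the viewer;
    each pixel is compared with the processed pixels at x-1, x, x+1 in the
    row below, and the most forward of those serves as the candidate.
    """
    height, width = len(depth), len(depth[0])
    out = [row[:] for row in depth]        # bottom row is kept unchanged
    for y in range(height - 2, -1, -1):    # walk upwards from the second-lowest row
        for x in range(width):
            below = out[y + 1][max(0, x - 1):min(width, x + 2)]
            candidate = min(below)         # most forward processed pixel below
            # take the furthest-backwards of the input value and the candidate
            out[y][x] = max(depth[y][x], candidate)
    return out


if __name__ == "__main__":
    depth_in = [
        [9, 9, 9],
        [9, 2, 9],   # the 2 floats in front of the ground (5) beneath it
        [5, 5, 5],
    ]
    for row in process_bottom_to_top(depth_in):
        print(row)
```

Under these assumptions, a pixel can never end up in front of everything beneath it, which is one way to read the consistency improvement the abstract mentions.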

Efficient implementation of joint bilateral filter

Some embodiments are directed to an integrated circuit and computer-implemented method for estimating a depth map from an image using a joint bilateral filter at reduced computational complexity. For that purpose, image data of an image is accessed as well as depth data of a template depth map. A joint bilateral filter is then applied to the template depth map using the image data as a range term in the joint bilateral filter, thereby obtaining an image-adapted depth map as output. The applying of the joint bilateral filter includes initializing a sum-of-weighted-depths volume and a sum-of-weights volume as respective empty data structures in a memory, performing a splatting operation to fill said volumes, performing a slicing operation to obtain an image-adapted depth volume, and performing an interpolation operation to obtain an image-adapted depth value of the image-adapted depth map for each pixel in the image.
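
The splat, slice and interpolate sequence can be illustrated on a single scanline. The sketch below is a simplification under several assumptions: it works on a 1D row of pixels rather than a full image, intensities lie in [0, 1], "slicing" is read as dividing the two volumes cell by cell, and the same linear weights are used for splatting and interpolation. The function name and the `cell` and `bins` parameters are hypothetical.

```python
def joint_bilateral_1d(intensity, depth, cell=4.0, bins=8):
    """Grid-based joint bilateral filtering of one scanline (illustrative).

    intensity: per-pixel image values in [0, 1], used as the range term.
    depth:     per-pixel template depth values.
    cell:      spatial size of one grid cell in pixels (assumed parameter).
    bins:      number of range bins (assumed parameter).
    """
    n = len(intensity)
    nx, nr = int((n - 1) / cell) + 2, bins + 1
    # the two volumes start out empty (all zeros)
    sum_wd = [[0.0] * nr for _ in range(nx)]
    sum_w = [[0.0] * nr for _ in range(nx)]

    def corners(x):
        """Grid cells surrounding pixel x, with their bilinear weights."""
        gx = x / cell
        gr = min(max(intensity[x], 0.0), 1.0) * (bins - 1)
        ix, ir = int(gx), int(gr)
        fx, fr = gx - ix, gr - ir
        return [(ix, ir, (1 - fx) * (1 - fr)), (ix + 1, ir, fx * (1 - fr)),
                (ix, ir + 1, (1 - fx) * fr), (ix + 1, ir + 1, fx * fr)]

    # splatting: accumulate weighted depths and weights into the volumes
    for x in range(n):
        for ix, ir, w in corners(x):
            sum_wd[ix][ir] += w * depth[x]
            sum_w[ix][ir] += w

    # slicing (as read here): per-cell normalisation gives the depth volume
    vol = [[sum_wd[i][j] / sum_w[i][j] if sum_w[i][j] > 0 else 0.0
            for j in range(nr)] for i in range(nx)]

    # interpolation: read the depth volume back at each pixel's grid position
    return [sum(w * vol[ix][ir] for ix, ir, w in corners(x)) for x in range(n)]


if __name__ == "__main__":
    image_row = [0.0] * 8 + [1.0] * 8      # dark half, bright half
    template = [1.0] * 8 + [5.0] * 8       # depth step aligned with the edge
    print(joint_bilateral_1d(image_row, template))
```

Because pixels on either side of an intensity edge fall into different range bins, their depths are not mixed, which is the edge-preserving behaviour expected of a joint bilateral filter. The cost of the grid depends on the number of cells rather than on a per-pixel filter window, which matches the reduced-complexity claim.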

Method, an apparatus and a computer program product for virtual reality
11218685 · 2022-01-04 · ·

The invention relates to a solution wherein a bitstream defining a presentation is generated, the presentation comprising an omnidirectional visual media content and a visual overlay. A first relative distance of the omnidirectional visual media content and a second relative distance of the visual overlay are indicated in the bitstream. Metadata indicative of a scale applicable to convert the first relative distance and the second relative distance to real-world distance units is also associated with the generated bitstream, wherein the scale is for deriving a binocular disparity for the visual overlay. The invention also concerns a solution for decoding the bitstream to obtain data for deriving binocular disparity for the visual overlay.
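
How a scale could turn relative distances into a binocular disparity can be sketched with elementary geometry. The abstract does not give a formula, so everything below is an assumption: the scale is taken as metres per relative unit, disparity is expressed as a vergence angle, and the interpupillary distance defaults to 0.063 m. The function names are hypothetical.

```python
import math


def vergence_angle(relative_distance, scale, ipd=0.063):
    """Angle (radians) between the two eyes' lines of sight to a point.

    relative_distance: unitless distance as signalled in the bitstream.
    scale:             metres per relative unit (assumed meaning of the metadata).
    ipd:               interpupillary distance in metres (assumed default).
    """
    distance_m = relative_distance * scale
    return 2.0 * math.atan(ipd / (2.0 * distance_m))


def overlay_disparity(overlay_rel, background_rel, scale, ipd=0.063):
    """Angular disparity of the overlay relative to the background content."""
    return (vergence_angle(overlay_rel, scale, ipd)
            - vergence_angle(background_rel, scale, ipd))


if __name__ == "__main__":
    # overlay at 1 unit, omnidirectional content at 10 units, 2 m per unit
    print(overlay_disparity(1.0, 10.0, scale=2.0))
```

Without the scale, the two relative distances only fix an ordering; the conversion to real-world units is what makes an absolute disparity computable.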

REAL-TIME MULTIVIEW VIDEO CONVERSION METHOD AND SYSTEM
20230328222 · 2023-10-12 ·

Systems and methods are directed to real-time multiview video conversion. This conversion may involve receiving a video stream including two-dimensional (2D) frames, where each 2D frame corresponds to a respective 2D video timestamp. In addition, a camera baseline and a center viewpoint are identified. These parameters may be user-specified or predetermined. A target timestamp for a view of a multiview frame may be determined based on the camera baseline and the center viewpoint. The view is generated from a subset of 2D frames having 2D video timestamps adjacent to the target timestamp. A multiview video is rendered for display, where the multiview video comprises the view of the multiview frame.
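
The timestamp mapping can be sketched as follows. The abstract does not state the formula, so this assumes that the camera baseline is expressed as a time offset between neighbouring views and that the center viewpoint is a (possibly fractional) view index; the view is then blended from the two 2D frames whose timestamps bracket the target. All function names are hypothetical.

```python
from bisect import bisect_left


def target_timestamps(center_time, num_views, baseline, center_view):
    """One target timestamp per view.

    baseline:    time offset between neighbouring views (assumed unit: seconds).
    center_view: view index that maps to center_time (may be fractional).
    """
    return [center_time + (v - center_view) * baseline for v in range(num_views)]


def adjacent_frames(frame_times, target):
    """Indices and blend weights of the 2D frames adjacent to the target time.

    frame_times must be strictly increasing. Targets outside the available
    range fall back to the nearest single frame.
    """
    i = bisect_left(frame_times, target)
    if i == 0:
        return [(0, 1.0)]
    if i == len(frame_times):
        return [(len(frame_times) - 1, 1.0)]
    t0, t1 = frame_times[i - 1], frame_times[i]
    w1 = (target - t0) / (t1 - t0)         # closer frame gets the larger weight
    return [(i - 1, 1.0 - w1), (i, w1)]


if __name__ == "__main__":
    frame_times = [0.0, 1.0, 2.0, 3.0]
    for view, t in enumerate(target_timestamps(1.5, 4, 0.5, 1.5)):
        print(view, t, adjacent_frames(frame_times, t))
```

A wider baseline spreads the views over a longer stretch of video, which for a moving camera corresponds to a larger apparent separation between views.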

Virtual reality environment based manipulation of multi-layered multi-view interactive digital media representations

Various embodiments of the present disclosure relate generally to systems and methods for generating multi-view interactive digital media representations in a virtual reality environment. According to particular embodiments, a plurality of images is fused into a first content model and a first context model, both of which include multi-view interactive digital media representations of objects. Next, a virtual reality environment is generated using the first content model and the first context model. The virtual reality environment includes a first layer and a second layer. The user can navigate through and within the virtual reality environment to switch between multiple viewpoints of the content model via corresponding physical movements. The first layer includes the first content model and the second layer includes a second content model, and selection of the first layer provides access to the second layer with the second content model.
