H04N19/17

ENCODING LIDAR SCANNED DATA FOR GENERATING HIGH DEFINITION MAPS FOR AUTONOMOUS VEHICLES
20220373687 · 2022-11-24

Embodiments relate to methods for efficiently encoding sensor data captured by an autonomous vehicle and building a high definition map from the encoded sensor data. The sensor data can be LiDAR data expressed as multiple image representations. Image representations that include critical LiDAR data undergo lossless compression, while image representations that include more error-tolerant LiDAR data undergo lossy compression. The compressed sensor data can then be transmitted to an online system for building a high definition map. When building the map, entities such as road signs and road lines are constructed such that, when encoded and compressed, the map consumes less storage space. The positions of entities are expressed relative to a reference centerline in the map, so each position can be expressed with fewer numerical digits than in conventional methods.
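As an illustrative sketch (not the patented implementation), expressing entity positions relative to a centerline can be modeled as storing the index of the nearest centerline vertex plus small coordinate deltas; the function names and the brute-force nearest-vertex projection are assumptions for clarity.

```python
# Toy sketch: encode map-entity positions as small offsets from a reference
# centerline so each stored coordinate needs fewer numerical digits.

def encode_relative(centerline, entities):
    """Encode each entity as (index of nearest centerline vertex, dx, dy)."""
    encoded = []
    for ex, ey in entities:
        # Closest centerline vertex (brute force, for clarity only).
        i = min(range(len(centerline)),
                key=lambda k: (centerline[k][0] - ex) ** 2
                            + (centerline[k][1] - ey) ** 2)
        cx, cy = centerline[i]
        # Deltas are small numbers even when absolute coordinates are large.
        encoded.append((i, round(ex - cx, 2), round(ey - cy, 2)))
    return encoded

def decode_relative(centerline, encoded):
    """Recover absolute positions from centerline index plus deltas."""
    return [(centerline[i][0] + dx, centerline[i][1] + dy)
            for i, dx, dy in encoded]
```

With a centerline anchored at large absolute coordinates (e.g. x near 1000), the stored deltas stay in single digits, which is the storage saving the abstract describes.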

VIDEO COMPRESSION BASED ON LONG RANGE END-TO-END DEEP LEARNING
20220377358 · 2022-11-24

At least a method and an apparatus are presented for efficiently encoding or decoding video. For example, a plurality of frames is provided to a motion estimator to produce an output comprising estimated motion information. The estimated motion information is provided to an auto-encoder or an auto-decoder to produce an output comprising a reconstructed motion field. The reconstructed motion field and one or more decoded frames of the plurality of frames are provided to a deep neural network to produce an output comprising a refined bi-directional motion field. The video is encoded or decoded based on the refined bi-directional motion field.
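The patent's motion estimator is a learned component; as a classical stand-in only, full-search block matching illustrates what "estimated motion information" looks like at the interface: one displacement vector per block. The function name, block size, and search range are illustrative assumptions.

```python
# Toy classical motion estimator (NOT the patent's learned pipeline):
# full-search block matching producing one (dy, dx) vector per block.
# Frames are 2-D lists of pixel intensities.

def block_match(prev, cur, block=4, search=2):
    H, W = len(cur), len(cur[0])
    motion = []
    for by in range(H // block):
        row = []
        for bx in range(W // block):
            y0, x0 = by * block, bx * block
            best, best_cost = (0, 0), None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = y0 + dy, x0 + dx
                    if y < 0 or x < 0 or y + block > H or x + block > W:
                        continue  # candidate falls outside the frame
                    # Sum of absolute differences between candidate and block.
                    cost = sum(abs(prev[y + i][x + j] - cur[y0 + i][x0 + j])
                               for i in range(block) for j in range(block))
                    if best_cost is None or cost < best_cost:
                        best, best_cost = (dy, dx), cost
            row.append(best)
        motion.append(row)
    return motion
```

In the abstract's pipeline, a field like this would be compressed by the auto-encoder and then refined by the deep neural network.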

METHOD FOR IDENTIFYING STATIONARY REGIONS IN FRAMES OF A VIDEO SEQUENCE
20220377355 · 2022-11-24

A method for identifying stationary regions in frames of a video sequence comprises receiving an encoded version of the video sequence, wherein the encoded version of the video sequence includes an intra-coded frame followed by a plurality of inter-coded frames; reading coding-mode information in the inter-coded frames of the encoded version of the video sequence, wherein the coding-mode information is indicative of blocks of pixels in the inter-coded frames being skip-coded; finding, using the read coding-mode information, one or more blocks of pixels that were each skip-coded in a respective plurality of consecutive frames in the encoded version of the video sequence; and designating each found block of pixels as a stationary region in the respective plurality of consecutive frames.
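A minimal sketch of the core step, assuming skip-mode information has already been parsed from the bitstream into one set of skip-coded block indices per inter-coded frame; the function name, data layout, and run threshold are assumptions, not the patent's terms.

```python
# Toy sketch: designate blocks as stationary when they are skip-coded in at
# least `min_run` consecutive inter-coded frames. `skip_maps` holds, per
# frame, a set of (row, col) block indices that were skip-coded.

def find_stationary(skip_maps, min_run):
    """Return {block: [(start_frame, end_frame), ...]} for qualifying runs."""
    runs = {}
    current = {}  # block -> frame index where its current skip run began
    for f, skipped in enumerate(skip_maps):
        for b in skipped:
            current.setdefault(b, f)  # open a run on first skip
        for b in list(current):
            if b not in skipped:      # run ended at frame f - 1
                if f - current[b] >= min_run:
                    runs.setdefault(b, []).append((current[b], f - 1))
                del current[b]
    for b, start in current.items():  # close runs still open at the end
        if len(skip_maps) - start >= min_run:
            runs.setdefault(b, []).append((start, len(skip_maps) - 1))
    return runs
```

The appeal of the method is that it works on coding-mode flags alone, with no pixel decoding.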

Processing system for reducing data amount of a point cloud
20220375029 · 2022-11-24

A processing system for reducing the data amount of a point cloud includes a sample rate controller and a transmitter. The sample rate controller receives a plurality of coordinates corresponding to the point cloud and samples them according to an adjustable sampling rate to generate a plurality of sampled coordinates, wherein the data amount of the plurality of coordinates is not less than the data amount of the plurality of sampled coordinates. The transmitter, coupled to the sample rate controller, outputs the plurality of sampled coordinates.
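A sketch of the sampling step, assuming a simple uniform stride; how the patent's sample rate controller actually adjusts the rate (e.g. to channel conditions) is not specified here, so the policy is left to the caller.

```python
# Toy sketch: downsample point-cloud coordinates at an adjustable rate.
# `coords` is a list of (x, y, z) tuples; `rate` is the target kept fraction.

def sample_points(coords, rate):
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must be in (0, 1]")
    step = max(1, round(1.0 / rate))  # uniform stride approximating `rate`
    return coords[::step]
```

The output is never larger than the input, matching the abstract's constraint that the sampled data amount does not exceed the original.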

Automatic extraction of secondary video streams

A system and method to automatically generate a secondary video stream based on an incoming primary video stream. The method includes performing video analytics on the primary video stream to generate one or more analysis results, detecting a first target of interest using the analysis results, automatically extracting a first secondary video stream that captures at least a portion of the first target of interest and has a field of view smaller than that of the primary video stream, tracking the first target of interest, displaying the first secondary video stream, detecting a second target of interest using the analysis results, automatically adapting the first secondary video stream from the primary video stream to capture a portion of the first and second targets of interest, tracking the second target of interest, and displaying the first secondary stream including the portion of the first and second targets of interest.
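The adaptation step (widening the secondary stream's field of view to cover both targets) can be sketched as computing the union of the two target bounding boxes, clamped to the primary frame; the margin and box convention are illustrative assumptions.

```python
# Toy sketch: smallest crop (secondary-stream field of view) covering two
# tracked targets. Boxes are (x0, y0, x1, y1) in primary-frame pixels.

def union_roi(box_a, box_b, frame_w, frame_h, margin=10):
    x0 = max(0, min(box_a[0], box_b[0]) - margin)
    y0 = max(0, min(box_a[1], box_b[1]) - margin)
    x1 = min(frame_w, max(box_a[2], box_b[2]) + margin)
    y1 = min(frame_h, max(box_a[3], box_b[3]) + margin)
    return (x0, y0, x1, y1)
```

Each new frame of the secondary stream would then be cropped from the primary stream at this region, keeping both targets in view as they move.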

Information processing apparatus and method

The present disclosure relates to an information processing apparatus and method that make it possible to suppress a reduction in encoding efficiency. Information relating to quantization of a three-dimensional position of an encoding target is generated. For example, the information relating to the quantization includes information relating to a coordinate system to be subjected to the quantization, information relating to a bounding box for normalization of position information of the encoding target, or information relating to a voxel for quantization of position information of the encoding target. In addition, three-dimensional information of the encoding target is restored from a signal string on the basis of the information relating to the quantization of the three-dimensional position of the encoding target. The present disclosure is applicable to, for example, an information processing apparatus, an image processing apparatus, an electronic device, an information processing method, a program, or the like.
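A minimal sketch of bounding-box normalization plus voxel quantization, with the side information a decoder would need to restore the positions; the function names, side-information layout, and bit depth are assumptions for illustration.

```python
# Toy sketch: quantize 3-D positions by normalizing into a bounding box and
# snapping to a 2**bits voxel grid; the decoder restores positions from the
# signaled side information (bounding box + bit depth).

def quantize(points, bbox_min, bbox_max, bits):
    scale = (1 << bits) - 1
    q = [tuple(round((c - lo) / (hi - lo) * scale)
               for c, lo, hi in zip(p, bbox_min, bbox_max))
         for p in points]
    side_info = {"bbox_min": bbox_min, "bbox_max": bbox_max, "bits": bits}
    return q, side_info

def dequantize(q, side_info):
    scale = (1 << side_info["bits"]) - 1
    lo, hi = side_info["bbox_min"], side_info["bbox_max"]
    return [tuple(l + idx / scale * (h - l)
                  for idx, l, h in zip(p, lo, hi)) for p in q]
```

The reconstruction error is bounded by half a voxel step per axis, which is why signaling the bounding box and voxel parameters as side information matters for efficiency.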

Landscape video stream compression using computer vision techniques
11508142 · 2022-11-22

A video encoder compresses video for real-time transmission to a video decoder of a remote teleoperator system that provides teleoperator support to a vehicle based on the real-time video. The video encoder recognizes one or more generic objects in the captured video that can be removed from the video without affecting the teleoperator's ability to control the vehicle. The video encoder removes regions of the video corresponding to the generic objects to compress the video, and generates a metadata stream encoding information about the removed objects. The video decoder generates replacement objects for the objects removed from the compressed video and inserts the rendered replacement objects into the relevant regions of the compressed video to reconstruct the scene.
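Both ends of this scheme can be sketched as a pair of functions: the encoder zeroes out detected regions and emits metadata, and the decoder re-inserts rendered stand-ins. Frames here are plain 2-D lists, and the detection and rendering interfaces are hypothetical simplifications.

```python
# Toy sketch of object-removal compression. `frame` is a 2-D list of pixel
# values; `detections` pairs a class label with a (x0, y0, x1, y1) box.

def compress_frame(frame, detections):
    """Encoder side: blank removed regions, emit metadata about them."""
    metadata = []
    for label, (x0, y0, x1, y1) in detections:
        for y in range(y0, y1):
            for x in range(x0, x1):
                frame[y][x] = 0  # blanked regions compress to almost nothing
        metadata.append({"label": label, "box": (x0, y0, x1, y1)})
    return frame, metadata

def reconstruct_frame(frame, metadata, render):
    """Decoder side: insert a rendered replacement into each removed region.
    `render(label, w, h)` returns a patch; its implementation is up to the
    decoder (e.g. a generic stand-in object)."""
    for item in metadata:
        x0, y0, x1, y1 = item["box"]
        patch = render(item["label"], x1 - x0, y1 - y0)
        for y in range(y0, y1):
            for x in range(x0, x1):
                frame[y][x] = patch[y - y0][x - x0]
    return frame
```

The bit savings come from the blanked regions; the metadata stream is tiny by comparison, since it carries only labels and boxes.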

Method and apparatus for geometry merge mode for point cloud coding
11593969 · 2023-02-28

A method of a geometry merge mode for point cloud coding (PCC) is performed by at least one processor and includes obtaining a candidate node of an octree partition of a point cloud for a current node of the octree partition, the current node being currently coded and the candidate node being previously coded. The method further includes obtaining an occupancy code of the obtained candidate node, constructing a candidate list including the obtained occupancy code of the candidate node, obtaining an occupancy code of the current node based on the constructed candidate list, and performing the PCC based on the obtained occupancy code of the current node.
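The merge idea can be sketched as: collect the occupancy codes of previously coded nodes into a candidate list, then signal only an index into that list for the current node. Occupancy codes are modeled here as 8-bit masks (one bit per octree child); the list size and de-duplication rule are illustrative assumptions.

```python
# Toy sketch of a geometry merge mode for octree-based point cloud coding.

def build_candidate_list(coded_occupancies, max_candidates=4):
    """Ordered, de-duplicated list of occupancy codes (8-bit masks, one bit
    per child of an octree node) from previously coded nodes."""
    candidates = []
    for code in coded_occupancies:
        if code not in candidates:
            candidates.append(code)
        if len(candidates) == max_candidates:
            break
    return candidates

def merge_occupancy(candidates, merge_index):
    """In merge mode only `merge_index` is signaled; the current node's
    occupancy code is copied from the candidate list."""
    return candidates[merge_index]
```

Coding a small index instead of a full occupancy code is where the compression gain comes from when neighboring nodes share occupancy patterns.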

Multi-parameter adaptive loop filtering in video processing

Devices, systems and methods for video processing are described. In an exemplary aspect, a method for video processing includes performing a conversion between a coded representation of a video comprising one or more video regions and the video. The coded representation includes first side information that provides a clipping parameter for filtering a reconstruction of a video unit of a video region using a non-linear adaptive loop filter, wherein the first side information is signaled together with second side information indicative of filter coefficients used in the non-linear adaptive loop filter.
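The non-linear part of such a filter can be sketched for a single sample: each neighbor-minus-center difference is clipped by a per-tap parameter before being weighted, limiting the influence of outlier neighbors. This omits the integer arithmetic and signaling of a real codec; names and values are illustrative.

```python
# Toy sketch: one sample of a non-linear adaptive loop filter. The clip
# values and coefficients play the role of the "side information" above.

def nonlinear_alf_sample(center, neighbors, coeffs, clip):
    def clamp(v, c):
        return max(-c, min(c, v))
    # Each neighbor difference is clipped before weighting, so a single
    # outlier neighbor cannot dominate the filtered result.
    acc = sum(w * clamp(n - center, c)
              for n, w, c in zip(neighbors, coeffs, clip))
    return center + acc
```

With clipping, a neighbor far from the center (e.g. across an edge) contributes at most `w * clip` to the output, which is why the clipping parameters are worth signaling alongside the coefficients.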