Methods and devices for encoding and reconstructing a point cloud

11627339 · 2023-04-11

Assignee

Inventors

Cpc classification

International classification

Abstract

This method comprises: —accessing (2) a point cloud (PC) comprising a plurality of points defined by attributes, said attributes including a spatial position of a point in a 3D space and at least one feature of the point;—segmenting (2) the point cloud into one or more clusters (C.sub.i) of points on the basis of the attributes of the points; and for at least one cluster (C.sub.i):—constructing (4) a similarity graph having a plurality of vertices and at least one edge, the similarity graph representing a similarity among neighboring points of the cluster (C.sub.i) in terms of the attributes, the plurality of vertices including vertices P.sub.i and P.sub.j corresponding to points of the cluster (C.sub.i);—assigning one or more weights w.sub.i,j to one or more edges connecting vertices P.sub.i and P.sub.j of the graph;—computing (6) a transform using the one or more assigned weights, said transform being characterized by coefficients; and—quantizing (8) and encoding (10) the transform coefficients.

Claims

1. A method comprising: segmenting a point cloud comprising a plurality of points into multiple clusters of points such that both similarities in spatial positions of the points in a 3D space and in at least one other attribute of the points are taken into account to segment the plurality of points, the at least one other attribute of a point comprising an information representative of a color, an information representative of a texture, or an information representative of the color and the texture of the point; once the point cloud is segmented into multiple clusters, representing each cluster of points with a graph-based representation, said graph-based representation having a plurality of vertices and at least one edge connecting two vertices, an edge comprising an indicator capturing similarities of multiple attributes of two neighboring points of a cluster; and for each cluster: applying a graph Fourier transform to each cluster based on the at least one indicator to obtain transform coefficients, and encoding the transform coefficients and an information associating a cluster index to a graph Fourier transform index to indicate which graph Fourier transform is used for the cluster.

2. The method of claim 1, wherein, for at least one cluster, the method further comprises transmitting the encoded transform coefficients and identifiers of the clusters representative of transforms applied to the transform coefficients.

3. A non-transitory computer readable storage medium having program code instructions stored thereon which are executable by a processor to cause the processor to implement the method according to claim 1.

4. The method of claim 1, wherein an indicator of an edge connecting two vertices Pi and Pj is a weight w.sub.i,j defined by: w i , j = exp { - .Math. g i - g j .Math. 2 .Math. c i - c j .Math. 2 2 σ 2 } where gi and gj are vectors representing, respectively, spatial position of points Pi and Pj, and cj are vectors representing, respectively, another feature of points Pi and Pj and σ is a parameter used to control the weight value.

5. The method of claim 4, wherein if a weight is less than a threshold, said weight is set to 0 and corresponding vertices are disconnected.

6. A device comprising electronic circuitry adapted for: segmenting a point cloud comprising a plurality of points into multiple clusters of points such that both similarities in spatial positions of the points in a 3D space and in at least one other attribute of the points are taken into account to segment the plurality of points, the at least one other attribute of a point comprising an information representative of a color, an information representative of a texture, or an information representative of the color and the texture of the point; once the point cloud is segmented in a plurality of clusters, representing each cluster of points with a graph-based representation of each cluster of points, said graph-based representation having a plurality of vertices and at least one edge connecting two vertices, an edge comprising an indicator capturing similarities of multiple attributes of two neighboring points of a cluster; and for each cluster: applying a graph Fourier transform to each cluster based on the at least one indicator to obtain transform coefficients, and encoding the transform coefficients and an information associating a cluster index to a graph Fourier transform index to indicate which graph Fourier transform is used for the cluster.

7. The device of claim 6, wherein, for at least one cluster, the electronic circuitry is further configured for transmitting the encoded transform coefficients and identifiers of the clusters representative of transforms applied to the transform coefficients.

8. The device of claim 6, wherein an indicator of an edge connecting two vertices Pi and Pj is a weight w.sub.i,j defined by: w i , j = exp { - .Math. g i - g j .Math. 2 .Math. c i - c j .Math. 2 2 σ 2 } where g.sub.i and g.sub.j are vectors representing, respectively, spatial position of points Pi and Pj, c.sub.i and c.sub.j are vectors representing, respectively, another feature of points Pi and Pj and σ is a parameter used to control the weight value.

9. The device of claim 8, wherein the electronic circuitry is further configured for setting a weight to 0 if the weight is less than a threshold, and for disconnecting corresponding vertices.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) The present invention is illustrated by way of examples, and not by way of limitation, in the figures of the accompanying drawings, in which like reference numerals refer to similar elements and in which: FIG. 1, already described, is a schematic view illustrating an octree-based point cloud representation according to the prior art; FIG. 2 already described, is a schematic view illustrating an overview of an octree data structure; FIG. 3 is a flowchart showing the steps of encoding a point cloud, according to an embodiment of the present invention; FIG. 4 is a flowchart showing the steps of reconstructing a point cloud, according to an embodiment of the present invention; FIG. 5 is a schematic view illustrating an encoder, according to an embodiment of the invention; and FIG. 6 is a schematic view illustrating a decoder, according to an embodiment of the invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

(2) The method of encoding a point cloud according to an embodiment of the present invention is illustrated in the flowchart of FIG. 3.

(3) The points of the point cloud PC are characterized by their attributes which include the geometry, i.e. the spatial position of each point and at least another attribute of the point, such as for example the color. We will assume in the following description that each point of the point cloud has two attributes: the geometry g in the 3D space and the color c in the RGB space.

(4) The point cloud data is classified into several clusters at step 2. As a result of this clustering, the point cloud data in each cluster can be approximated by a linear function as it has generally been observed that most point cloud data has a piecewise linear behavior.

(5) This clustering may be performed by any prior art clustering method.

(6) Advantageously, the clustering is performed using the normalized cuts technique modified, according to the present embodiment, in order to consider similarities between all the attributes of the points of the point cloud.

(7) Thus, when performing clustering, multi-lateral relations are considered. That is, both similarities in the geometry and in the other attributes, such as the color, are taken into account to segment the point cluster in order to achieve a clustering result that exploits the piecewise linear property in an optimal way.

(8) At the end of step 2, the point cloud PC is clustered into a plurality of clusters identified by corresponding indices C.sub.i, wherein 1≤i≤N, where N is the obtained number of clusters.

(9) The cluster indices are assigned randomly, while ensuring that a unique index is assigned for each cluster.

(10) Then, at step 4, a similarity graph is constructed for each cluster. This similarity graph represents the similarity among neighboring vertices, i.e. points, of the cluster and permits to acquire a compact representation of the cluster.

(11) A graph G={V, E, W} consists of a finite set of vertices V with cardinality |V|=N, a set of edges E connecting vertices, and a weighted adjacency matrix W. W is a real N×N matrix, where w.sub.i,j is the weight assigned to the edge (i, j) connecting vertices Pi and Pj. Only undirected graphs are considered here, which correspond to symmetric weighted adjacency matrices, i.e., w.sub.i,j=w.sub.j,i. Weights are assumed non-negative, i.e., w.sub.i,j≥0.

(12) In the similarity graph constructed at step 4, the weight w.sub.i,j between vertex Pi and vertex Pj is assigned as

(13) w i , j = exp { - .Math. g i - g j .Math. 2 .Math. c i - c j .Math. 2 2 σ 2 } . ( 1 )
Where g.sub.i and g.sub.j are the coordinate vectors respectively, i.e., g.sub.i=(x.sub.i, y.sub.i, z.sub.i).sup.T and g.sub.j=(x.sub.j, y.sub.j, z.sub.j).sup.T; c.sub.i and c.sub.j are the color vectors respectively, i.e., c.sub.i=(r.sub.i, g.sub.i, b.sub.i).sup.T and c.sub.j=(r.sub.i, g.sub.j, b.sub.j).sup.T. In this formula, ∥g.sub.i−g.sub.j∥.sup.2 describes the similarity in the geometry and ∥c.sub.i−c.sub.j∥.sup.2 describes the similarity in the color. σ is a parameter used to control the weight value and is often set empirically.

(14) In this way, both the geometry and color information are taken into consideration in the constructed graph, which is different from the usual definition of edge weights, which only considers the similarity in one attribute.

(15) According to a preferred embodiment and in order to reduce the number of connected vertices for sparse representation, a threshold t is used to set small weights to 0 as follows:
w.sub.i,j=0, if w.sub.i,j<t  (2).

(16) In this way, small weights won't be encoded so as to save coding bits. Also, as these weights are small, which means the connected vertices are dissimilar to some extent, setting these weights to 0 almost does not degrade the coding quality.

(17) Thus, the graph is constructed as follows.

(18) For each point Pi of the cluster, the possible edge weight between it and any other point Pj in the cluster is calculating according to equation (1). If w.sub.i,j>=t, an edge e.sub.i,j between Pi and Pj is added and the calculated weight w.sub.i,j is attached to the edge, otherwise, Pi and Pj are disconnected as a small weight means that there's a large discrepancy between them.

(19) Then, at step 6, the GFT is calculated for each cluster based on the constructed similarity graph.

(20) At this step, the weighted adjacency matrix is obtained from the weights of the graph.

(21) Then, the graph Laplacian matrix is computed.

(22) There exist different variants of Laplacian matrices.

(23) In one embodiment, the unnormalized combinatorial graph Laplacian is deployed, which is defined as L:=D−W, where D is the degree matrix defined as a diagonal matrix whose i-th diagonal element is the sum of all elements in the i-th row of W, i.e., d.sub.i,i=Σ.sub.j=1.sup.Nw.sub.i,j. Since the Laplacian matrix is a real symmetric matrix, it admits a set of real eigenvalues {λl}.sub.l=0, 1, . . . , N-1 with a complete set of orthonormal eigenvectors {ψ.sub.l}.sub.l=0, 1, . . . , N-1, i.e., Lψ.sub.l=λ.sub.lψ.sub.l, for l=0, 1, . . . , N−1.

(24) This Laplacian matrix is employed in one embodiment for two reasons.

(25) First, as the elements in each row of L sum to zero by construction, 0 is guaranteed to be an eigenvalue with [1 . . . 1].sup.T as the corresponding eigenvector. This allows a frequency interpretation of the GFT, where the eigenvalues λ.sub.l's are the graph frequencies and always have a DC component, which is beneficial for the compression of point cloud data consisting of many smooth regions. By connecting similar points and disconnecting dissimilar ones, high-frequency coefficient are reduced, which leads to a compact representation of the point cloud in the GFT domain.

(26) Second, GFT defaults to the well-known DCT when defined for a line graph (corresponding to the 1D DCT) or a 4-connectivity graph (2D DCT) with all edge weights equal to 1. That means that the GFT is at least as good as the DCT in sparse signal representation if the weights are chosen in this way. Due to the above two desirable properties, the unnormalized Laplacian matrix is used for the definition of the GFT as described in the following paragraph.

(27) The eigenvectors {ψ.sub.l}.sub.l=0, 1, . . . , N-1 of the Laplacian matrix are used to define the GFT. Formally, for any signal x∈custom character.sup.N residing on the vertices of G, its GFT {circumflex over (x)}∈custom character.sup.N is defined as
{circumflex over (x)}(l)=<ψ.sub.l,x>=Σ.sub.n=1.sup.Nψ.sub.l*(n)x(n),l=0,1, . . . ,N−1.  (3)
where {circumflex over (x)}(l) is the l-th GFT coefficient, where x(n) refers here to the attributes of the point n of the point cloud.

(28) For each cluster, the obtained GFT is identified by a GFT index.

(29) Thus, step 6 results in the GFT coefficients and the GFT indices.

(30) Then, at step 8, the GFT coefficients are quantized.

(31) At step 10, the quantized GFT coefficients are entropy coded, for example by using the CABAC method described in Marpe: “Context-Based Adaptive Binary Arithmetic Coding in the H.264/AVC Video Compression Standard”, IEEE Trans. Cir. and Sys. For Video Technol., vol. 13, no. 7, pages 620-636, 2003.

(32) Also, the overhead, constituted by the cluster indices and the GFT indices, is entropy coded at step 10 to indicate which GFT is used for which cluster. The encoded GFT coefficients, cluster indices and GFT indices are then transmitted, at step 12, to a decoder.

(33) FIG. 4 shows the steps of reconstructing a point cloud, i.e. the decoding steps implemented by the decoder after receiving the encoded GFT coefficients, cluster indices and GFT indices, according to an embodiment of the present disclosure.

(34) At step 20, the received GFT coefficients, cluster indices and GFT indices are entropy decoded.

(35) Then, at step 22, the decoded GFT coefficients are dequantized.

(36) At step 24, an inverse GFT is performed using the dequantized GFT coefficient and the decoded GFT indices which indicate the applied GFT.

(37) The inverse GFT is given by:
x(n)=Σ.sub.l=0.sup.N-1{circumflex over (x)}(l)ψ.sub.l(n),n=1,2, . . . ,N.  (4)
where x is the recovered signal representing the point cloud data of each cluster.

(38) According to an embodiment, the graph Fourier transform is calculated as follows, where each point in the point cloud is treated as a vertex in a graph.

(39) Firstly, each point is connected to its neighbors as long as they are similar. Two points are disconnected if there is a large discrepancy between them.

(40) Secondly, given the connectivity graph, the adjacency matrix W is defined, where w.sub.i,j=w.sub.j,i=1 if vertex i and j are connected, and 0 otherwise. The degree matrix D is then computed.

(41) In a third step, using computed matrices W and D, the graph Laplacian matrix is computed as L=D−W. The eigenvectors U of L are the basis vectors of the GFT. Finally, the attributes of the points of the point cloud are stacked into a column vector, and the GFT and inverse GFT are computed according to equations 3 and 4.

(42) By connecting similar points and disconnecting dissimilar ones, high-frequency coefficients are reduced, thus leading to compact representation of point clouds in the GFT domain.

(43) FIG. 5 is a block diagram of an exemplary embodiment of an encoder 30 implementing the encoding method of the present disclosure.

(44) Advantageously, the encoder 30 includes one or more processors and a memory 32.

(45) The encoder 30 comprises: a segmentation module 34 configured to segment the point cloud into clusters of points on the basis of the attributes of the points; a construction module 36 configured to construct, for each cluster, a similarity graph representing the similarity among neighboring points of the cluster in terms of the attributes; a computation module 38 configured to compute, for each cluster; a Graph Fourier Transform, GFT, based on the constructed similarity graph, the GFT being characterized by its coefficients; and a coding module 40 configured to encode the GFT coefficients.

(46) The encoder 30 also comprises a transmitter 42 configured to transmit to a decoder the encoded GFT coefficients and identifiers of the clusters and of the computed GFTs.

(47) According to the represented embodiment, a bus 44 provides a communication path between various elements of the encoder 30. Other point-to-point interconnection options (e.g. non-bus architecture) are also feasible.

(48) FIG. 6 is a block diagram of an exemplary embodiment of a decoder 50 implementing the reconstructing method of the present disclosure.

(49) Advantageously, the decoder 50 includes one or more processors and a memory 52.

(50) The decoder 50 comprises: a receiver 54 configured to receive encoded data including GFT coefficients associated with cluster indices and GFT indices; a decoding module 66 configured to decode the received data; and a reconstruction module 58 configured to reconstruct the clusters by performing for each cluster an inverse GFT.

(51) According to the represented embodiment, a bus 60 provides a communication path between various elements of the decoder 50. Other point-to-point interconnection options (e.g. non-bus architecture) are also feasible.

(52) While there has been illustrated and described what are presently considered to be the preferred embodiments of the present invention, it will be understood by those skilled in the art that various other modifications may be made, and equivalents may be substituted, without departing from the true scope of the present invention. Additionally, many modifications may be made to adapt a particular situation to the teachings of the present invention without departing from the central inventive concept described herein. Furthermore, an embodiment of the present invention may not include all of the features described above. Therefore, it is intended that the present invention is not limited to the particular embodiments disclosed, but that the invention includes all embodiments falling within the scope of the appended claims.

(53) Expressions such as “comprise”, “include”, “incorporate”, “contain”, “is” and “have” are to be construed in a non-exclusive manner when interpreting the description and its associated claims, namely construed to allow for other items or components which are not explicitly defined also to be present. Reference to the singular is also to be construed to be a reference to the plural and vice versa.

(54) A person skilled in the art will readily appreciate that various parameters disclosed in the description may be modified and that various embodiments disclosed and/or claimed may be combined without departing from the scope of the invention.

(55) For instance, even if the described graph transform is a GFT, other graph transforms may also be used such as, for example, wavelets on graphs, as described in D. Hammond, P. Vandergheynst, and R. Gribonval, “Wavelets on graphs via spectral graph theory” in Elsevier Appplied and Computational Harmonic Analysis, vol. 30, April 2010, pp. 129-150, and lifting transforms on graphs, as described in G. Shen, “Lifting transforms on graphs: Theory and applications” in Ph.D. dissertation, University of Southern California, 2010.