QoE-BASED ADAPTIVE ACQUISITION AND TRANSMISSION METHOD FOR VR VIDEO

Abstract

The present application discloses a QoE-based adaptive acquisition and transmission method for VR video, comprising the following steps: 1, capturing, by respective cameras in a VR video acquisition system, original videos with the same bit rate level, and compressing each original video into different bit rate levels; 2, selecting, by a server, a bit rate level for each original video for transmission, and synthesizing all of the transmitted original videos into a complete VR video; 3, performing, by the server, a segmentation process on the synthesized VR video, and compressing each video block into different quality levels; and 4, selecting, by the server, a quality level and a modulation and coding scheme (MCS) for each video block according to real-time viewing angle information of users and downlink channel bandwidth information received over a feedback channel, and transmitting each video block to a client.

Claims

1. A QoE-based adaptive acquisition and transmission method for VR video, applied in a network environment comprising C cameras, a VR video server and N clients; a transmission between the cameras and the VR video server being performed through an uplink, a transmission between the VR video server and the clients being performed through a downlink; and the downlink comprising a feedback channel from the clients to the VR video server; wherein the adaptive acquisition and transmission method for VR video is conducted as follows:

step 1, denoting C original videos taken by C cameras as {V.sub.1, V.sub.2, . . . , V.sub.c, . . . , V.sub.C} in the network environment, wherein V.sub.c represents an original video taken by a c-th camera, wherein 1≤c≤C; compressing the c-th original video V.sub.c into original videos with E bit rate levels, denoted as {V.sub.c.sup.1, V.sub.c.sup.2, . . . , V.sub.c.sup.e, . . . , V.sub.c.sup.E}, wherein V.sub.c.sup.e represents an original video with e-th bit rate level obtained after compressing the c-th original video V.sub.c, wherein 1≤e≤E;

step 2, establishing an objective function with a goal of maximizing a total utility value constituted with a sum of quality of experiences QoEs of N clients, and setting corresponding constraint conditions, thereby establishing an adaptive acquisition and transmission model for VR video;

step 3, solving the adaptive acquisition and transmission model for VR video with a KKT condition and a hybrid branch and bound method to obtain an uplink collecting decision variable and a downlink transmitting decision variable in the network environment;

step 4, selecting, by the VR video server, an original video with the e-th bit rate level for the c-th camera according to a value of the uplink collecting decision variable χ.sub.c,e.sup.UL, and receiving the original video of the e-th bit rate level selected by the c-th camera uploaded through the uplink, so that the VR video server receives original videos of corresponding bit rate levels selected by C cameras respectively;

step 5, performing, by the VR video server, a stitching and mapping process on the C original videos with the corresponding bit rate levels to synthesize a complete VR video;

step 6, performing, by the VR video server, a segmentation process on the complete VR video to obtain T video blocks, denoted as {T.sub.1, T.sub.2, . . . , T.sub.t, . . . , T.sub.T}, wherein T.sub.t represents any t-th video block, and 1≤t≤T; wherein the VR video server provides D bit rate selections for the t-th video block T.sub.t for a compressing process, thereby obtaining compressed video blocks with D different bit rate levels, denoted as {T.sub.t.sup.1, T.sub.t.sup.2, . . . , T.sub.t.sup.d, . . . , T.sub.t.sup.D}, T.sub.t.sup.d represents a compressed video block with a d-th bit rate level obtained after the t-th video block T.sub.t is compressed, wherein 1≤d≤D;

step 7, assuming that a set of modulation and coding schemes in the network environment is {M.sub.1, M.sub.2, . . . , M.sub.m, . . . , M.sub.M}, wherein M.sub.m represents an m-th modulation and coding scheme, and 1≤m≤M; and selecting, by the VR video server, the m-th modulation and coding scheme for the t-th video block T.sub.t; and selecting, by the VR video server, the compressed video block T.sub.t.sup.d with the d-th bit rate level of the t-th video block T.sub.t for any n-th client according to a value of the downlink transmitting decision variable χ.sub.t,d,m.sup.DL, and transmitting the selected compressed video block T.sub.t.sup.d with the d-th bit rate level of the t-th video block T.sub.t to the n-th client through the downlink with the m-th modulation and coding scheme; so that the n-th client receives compressed video blocks with corresponding bit rate levels of T video blocks through the corresponding modulation and coding scheme; and

step 8, performing, by the n-th client, decoding, mapping, and rendering processes on the received compressed video blocks with corresponding bit rate levels of the T video blocks, so as to synthesize a QoE-optimized VR video.

2. The adaptive acquisition and transmission method for VR video according to claim 1, wherein the step 2 is performed as follows:

step 2.1, establishing the objective function with formula (1):

$\begin{matrix} Max \sum_{n = 1}^{N} {QoE}_{n} = \sum_{n = 1}^{N} \log (\frac{\sum_{t = 1}^{T_{FoV}^{n}} \sum_{d = 1}^{D} \sum_{m = 1}^{M} λ_{t, d}^{DL} \cdot χ_{t, d, m}^{DL}}{λ_{t, D}^{DL}}) & (1) \end{matrix}$

formula (1) represents the sum of QoEs of N clients, which is the total utility value of the system; in formula (1) λ.sub.t,d.sup.DL represents a bit rate of the video block t with a quality level of d; λ.sub.t,D.sup.DL represents a bit rate when the video block t is transmitted at a highest quality level D; T.sub.FoV.sup.n indicates the video blocks covered in an FoV of the n-th client; when χ.sub.t,d,m.sup.DL=1, it means that the t-th video block is transmitted to the client through the downlink at the d-th bit rate level and the m-th modulation and coding scheme; and when χ.sub.t,d,m.sup.DL=0, it means that the t-th video block is not transmitted to the client through the downlink at the d-th bit rate level and the m-th modulation and coding scheme; and

step 2.2, establishing constraint conditions with formulas (2)-(7):

$\begin{matrix} \sum_{e = 1}^{E} χ_{c, e}^{UL} = 1, \forall c & (2) \\ \sum_{c = 1}^{C} \sum_{e = 1}^{E} χ_{c, e}^{UL} \cdot λ_{c, e}^{UL} \leq {BW}^{UL} & (3) \\ \sum_{m = 1}^{M} χ_{t, d, m}^{DL} = 1, \forall t, d & (4) \\ \sum_{d = 1}^{D} χ_{t, d, m}^{DL} = 1, \forall t, m & (5) \\ \sum_{t = 1}^{T_{FoV}^{n}} \sum_{d = 1}^{D} \sum_{m = 1}^{M} χ_{t, d, m}^{DL} \cdot [\frac{λ_{t, d}^{DL}}{R_{m}^{DL}}] \leq Y^{DL} & (6) \\ \sum_{d = 1}^{D} \sum_{m = 1}^{M} χ_{t, d, m}^{DL} \cdot λ_{t, d}^{DL} \leq \frac{1}{T} \cdot \sum_{e = 1}^{E} χ_{c, e}^{UL} \cdot λ_{c, e}^{UL}, \forall c, t & (7) \end{matrix}$

wherein formula (2) means that any c-th camera can select an original video of only one bit rate level to upload to the server; in formula (2), when χ.sub.c,e.sup.UL=1, it means that the c-th camera uploads an original video at e-th bit rate level to the server, and when χ.sub.c,e.sup.UL=0, it means that the c-th camera does not upload an original video at e-th bit rate level to the server; formula (3) indicates that a total bit rate of the transmitted C videos should not exceed a total bandwidth of an entire uplink channel; in formula (3), BW.sup.UL represents a value of the total bandwidth of the uplink channel; formula (4) indicates that when any t-th video block is transmitted to the client through the downlink at the d-th quality level, only one modulation and coding scheme can be selected; formula (5) indicates that when any t-th video block is transmitted to the client through the downlink with the m-th modulation and coding scheme, the transmitted video block can select only one bit rate level; formula (6) indicates that a total bit rate of all video blocks transmitted does not exceed a bit rate that all resource blocks in the entire downlink channel can provide; in formula (6), R.sub.m.sup.DL indicates a bit rate that can be provided by a single resource block when the m-th modulation and coding scheme is selected, Y.sup.DL represents a total number of all resource blocks in the downlink channel; formula (7) indicates that a bit rate of any t-th video block in the downlink of the network environment is not greater than a bit rate of an original video taken by any c-th camera in the uplink.

3. The adaptive acquisition and transmission method for VR video according to claim 2, wherein the step 3 is performed as follows:

step 3.1, performing a relaxation operation on the collecting decision variables χ.sub.c,e.sup.UL and the transmitting decision variables χ.sub.t,d,m.sup.DL of the adaptive acquisition and transmission model for VR video, and obtaining a continuous collecting decision variable and a continuous transmitting decision variable within a range of [0,1], respectively;

step 3.2, according to the constraint conditions of formula (2)-formula (7), denoting $\sum_{e = 1}^{E} χ_{c, e}^{UL} - 1$ as a function h.sub.1(χ.sub.c,e.sup.UL); denoting $\sum_{m = 1}^{M} χ_{t, d, m}^{DL} - 1$ as a function h.sub.2(χ.sub.t,d,m.sup.DL); denoting $\sum_{d = 1}^{D} χ_{t, d, m}^{DL} - 1$ as a function h.sub.3(χ.sub.t,d,m.sup.DL); denoting $\sum_{c = 1}^{C} \sum_{e = 1}^{E} χ_{c, e}^{UL} \cdot λ_{c, e}^{UL} - {BW}^{UL}$ as a function g.sub.1(χ.sub.c,e.sup.UL); denoting $\sum_{t = 1}^{T_{FoV}^{n}} \sum_{d = 1}^{D} \sum_{m = 1}^{M} χ_{t, d, m}^{DL} \cdot [\frac{λ_{t, d}^{DL}}{R_{m}^{DL}}] - Y^{DL}$ as a function g.sub.2(χ.sub.t,d,m.sup.DL); denoting $\sum_{d = 1}^{D} \sum_{m = 1}^{M} χ_{t, d, m}^{DL} \cdot λ_{t, d}^{DL} - \frac{1}{T} \cdot \sum_{e = 1}^{E} χ_{c, e}^{UL} \cdot λ_{c, e}^{UL}$ as a function g.sub.3(χ.sub.c,e.sup.UL,χ.sub.t,d,m.sup.DL); and calculating a Lagrangian function L(χ.sub.c,e.sup.UL,χ.sub.t,d,m.sup.DL,λ,μ) of a relaxed adaptive acquisition and transmission model for VR video with formula (8) as:

$\begin{matrix} L (χ_{c, e}^{UL}, χ_{t, d, m}^{DL}, λ, μ) = - \sum_{n = 1}^{N} {QoE}_{n} + λ_{1} h_{1} (χ_{c, e}^{UL}) + λ_{2} h_{2} (χ_{t, d, m}^{DL}) + λ_{3} h_{3} (χ_{t, d, m}^{DL}) + μ_{1} g_{1} (χ_{c, e}^{UL}) + μ_{2} g_{2} (χ_{t, d, m}^{DL}) + μ_{3} g_{3} (χ_{c, e}^{UL}, χ_{t, d, m}^{DL}) & (8) \end{matrix}$

in the formula (8), λ represents the Lagrangian coefficients of the equality constraint conditions in formulas (2)-(7), μ represents the Lagrangian coefficients of the inequality constraint conditions in formulas (2)-(7), λ.sub.1 represents a Lagrangian coefficient of the function h.sub.1(χ.sub.c,e.sup.UL), λ.sub.2 represents a Lagrangian coefficient of the function h.sub.2(χ.sub.t,d,m.sup.DL), λ.sub.3 is a Lagrangian coefficient of the function h.sub.3(χ.sub.t,d,m.sup.DL), μ.sub.1 is a Lagrangian coefficient of the function g.sub.1(χ.sub.c,e.sup.UL), μ.sub.2 is a Lagrangian coefficient of the function g.sub.2(χ.sub.t,d,m.sup.DL), and μ.sub.3 is a Lagrangian coefficient of the function g.sub.3(χ.sub.c,e.sup.UL,χ.sub.t,d,m.sup.DL), and QoE.sub.n represents quality of experience of the n-th client and:

$\begin{matrix} {QoE}_{n} = \log (\frac{\sum_{t = 1}^{T_{FoV}^{n}} \sum_{d = 1}^{D} \sum_{m = 1}^{M} λ_{t, d}^{DL} \cdot χ_{t, d, m}^{DL}}{λ_{t, D}^{DL}}) & (9) \end{matrix}$

step 3.3, obtaining the KKT conditions of the relaxed adaptive acquisition and transmission model for VR video as shown in formulas (10)-(15) below according to the Lagrangian function L(χ.sub.c,e.sup.UL,χ.sub.t,d,m.sup.DL,λ,μ) of formula (8):

$\begin{matrix} \frac{\partial L (χ_{c, e}^{UL}, χ_{t, d, m}^{DL}, λ, μ)}{\partial χ_{c, e}^{UL}} = λ_{1} \frac{\partial h_{1} (χ_{c, e}^{UL})}{\partial χ_{c, e}^{UL}} + μ_{1} \frac{\partial g_{1} (χ_{c, e}^{UL})}{\partial χ_{c, e}^{UL}} + μ_{3} \frac{\partial g_{3} (χ_{c, e}^{UL}, χ_{t, d, m}^{DL})}{\partial χ_{c, e}^{UL}} = 0 & (10) \\ \frac{\partial L (χ_{c, e}^{UL}, χ_{t, d, m}^{DL}, λ, μ)}{\partial χ_{t, d, m}^{DL}} = - \sum_{n = 1}^{N} \frac{\partial {QoE}_{n}}{\partial χ_{t, d, m}^{DL}} + λ_{2} \frac{\partial h_{2} (χ_{t, d, m}^{DL})}{\partial χ_{t, d, m}^{DL}} + λ_{3} \frac{\partial h_{3} (χ_{t, d, m}^{DL})}{\partial χ_{t, d, m}^{DL}} + μ_{2} \frac{\partial g_{2} (χ_{t, d, m}^{DL})}{\partial χ_{t, d, m}^{DL}} + μ_{3} \frac{\partial g_{3} (χ_{c, e}^{UL}, χ_{t, d, m}^{DL})}{\partial χ_{t, d, m}^{DL}} = 0 & (11) \\ g_{1} (χ_{c, e}^{UL}) \leq 0, g_{2} (χ_{t, d, m}^{DL}) \leq 0, g_{3} (χ_{c, e}^{UL}, χ_{t, d, m}^{DL}) \leq 0 & (12) \\ h_{1} (χ_{c, e}^{UL}) = 0, h_{2} (χ_{t, d, m}^{DL}) = 0, h_{3} (χ_{t, d, m}^{DL}) = 0 & (13) \\ λ_{1}, λ_{2}, λ_{3} \neq 0, μ_{1}, μ_{2}, μ_{3} \geq 0 & (14) \\ μ_{1} g_{1} (χ_{c, e}^{UL}) = 0, μ_{2} g_{2} (χ_{t, d, m}^{DL}) = 0, μ_{3} g_{3} (χ_{c, e}^{UL}, χ_{t, d, m}^{DL}) = 0 & (15) \end{matrix}$

solving the formulas (10)-(15), and obtaining an optimal solution χ.sub.relax and an optimal total utility value Z.sub.relax of the relaxed adaptive acquisition and transmission model for VR video; wherein the optimal solution χ.sub.relax comprises relaxed optimal solutions of the collecting decision variable χ.sub.c,e.sup.UL and the transmitting decision variable χ.sub.t,d,m.sup.DL;

step 3.4, using the optimal solution χ.sub.relax and the optimal total utility value Z.sub.relax as initial input parameters of the hybrid branch and bound method;

step 3.5, defining the number of branches in the hybrid branch and bound method as k, defining a lower bound of the optimal total utility value in the hybrid branch and bound method as L, and defining an upper bound of the optimal total utility value in the hybrid branch and bound method as U;

step 3.6, initializing k=0;

step 3.7, initializing L=0;

step 3.8, initializing U=Z.sub.relax;

step 3.9, denoting an optimal solution of a k-th branch as χ.sub.k and denoting a corresponding optimal total utility value as Z.sub.k, assigning a value of χ.sub.relax to χ.sub.k, and using the optimal solution χ.sub.k of the k-th branch as a root node;

step 3.10, determining whether there is a solution of χ.sub.k that does not meet a 0-1 constraint condition, if there is, dividing a relaxed optimal solution of χ.sub.k into a solution that meets the 0-1 constraint condition and a solution χ.sub.k(0,1) that does not meet the 0-1 constraint condition, and going to step 3.12; otherwise, expressing χ.sub.k as an optimal solution of the non-relaxed adaptive acquisition and transmission model for VR video;

step 3.11, randomly generating a number ε.sub.k for the k-th branch within a range of (0,1), and determining whether 0<χ.sub.k(0,1)<ε.sub.k is true; if true, adding a constraint condition “χ.sub.k(0,1)=0” to the non-relaxed adaptive acquisition and transmission model for VR video to form a sub-branch I of the k-th branch; otherwise, adding a constraint condition “χ.sub.k(0,1)=1” to the non-relaxed adaptive acquisition and transmission model for VR video to form a sub-branch II of the k-th branch;

step 3.12, solving the relaxed solutions of the sub-branch I and the sub-branch II of the k-th branch with the KKT condition, and using them as an optimal solution χ.sub.k+1 and an optimal total utility value Z.sub.k+1 to a (k+1)-th branch, wherein the χ.sub.k+1 comprises: relaxed solutions of the sub-branch I and the sub-branch II of the (k+1)-th branch;

step 3.13, determining whether the optimal solution χ.sub.k+1 of the (k+1)-th branch meets the 0-1 constraint condition, if so, finding a maximum value from the optimal total utility value Z.sub.k+1 and assigning it to L, and χ.sub.k+1∈{0,1}; otherwise, finding a maximum value from the optimal total utility value Z.sub.k+1 and assigning it to U, and χ.sub.k+1∈(0,1);

step 3.14, determining whether Z.sub.k+1<L is true; if so, cutting off the branch where the optimal solution χ.sub.k+1 of the (k+1)-th branch is located, assigning k+1 to k, and returning to step 3.10; otherwise, going to step 3.15;

step 3.15, determining whether Z.sub.k+1>L is true; if so, assigning k+1 to k, and returning to step 3.10; otherwise, going to step 3.16; and

step 3.16, determining whether Z.sub.k+1=L is true, if so, it means that the optimal solution of the non-relaxed adaptive acquisition and transmission model for VR video is the optimal solution χ.sub.k+1 of the (k+1)-th branch, and assigning χ.sub.k+1 to an optimal solution χ.sub.0-1 of the non-relaxed adaptive acquisition and transmission model for VR video, assigning Z.sub.k+1 corresponding to the χ.sub.k+1 to an optimal total utility value Z.sub.0-1 of the non-relaxed adaptive acquisition and transmission model for the VR video; otherwise, assigning k+1 to k, and returning to step 3.10.

Description

BRIEF DESCRIPTION OF DRAWINGS

[0054] FIG. 1 is an application scenario diagram of an acquisition and transmission method for streaming media of VR video proposed in the present disclosure;

[0055] FIG. 2 is a system structure diagram of an adaptive acquisition and transmission method proposed in the present disclosure.

DESCRIPTION OF EMBODIMENTS

[0056] In this embodiment, a QoE-based adaptive acquisition and transmission method for VR video, as shown in FIG. 1, is applied in a multi-user network environment comprising C cameras, a VR video server and N clients. Transmission between the cameras and the VR video server is performed through an uplink, and transmission between the VR video server and the clients is performed through a downlink. The downlink includes a feedback channel from the clients to the VR video server; the feedback channel feeds back real-time viewing angle information of the users and downlink bandwidth information to the server, assisting the server in its acquisition and transmission operations. As shown in FIG. 2, the method specifically includes the following steps:

[0057] step 1, denoting C original videos taken by C cameras as {V.sub.1, V.sub.2, . . . , V.sub.c, . . . , V.sub.C} in an application network environment, where V.sub.c represents an original video taken by a c-th camera, where 1≤c≤C;

[0058] obtaining E original videos with different bit rate levels after compressing the original video V.sub.c taken by the c-th camera, denoted as {V.sub.c.sup.1, V.sub.c.sup.2, . . . , V.sub.c.sup.e, . . . , V.sub.c.sup.E}, where V.sub.c.sup.e represents the original video with the e-th bit rate level obtained after compressing the original video V.sub.c, where 1≤e≤E;

[0059] step 2, establishing an objective function with a goal of maximizing a total utility value constituted by the sum of the qualities of experience (QoEs) of the N clients, and setting corresponding constraint conditions, thereby establishing an adaptive acquisition and transmission model for VR video with formulas (1)-(7);

[0060] The objective function:

[00012] $\begin{matrix} Max \sum_{n = 1}^{N} {QoE}_{n} = \sum_{n = 1}^{N} \log (\frac{\sum_{t = 1}^{T_{FoV}^{n}} \sum_{d = 1}^{D} \sum_{m = 1}^{M} λ_{t, d}^{DL} \cdot χ_{t, d, m}^{DL}}{λ_{t, D}^{DL}}) & (1) \end{matrix}$

[0061] formula (1) represents the sum of the QoEs of the N clients, which is the total utility value of the system; in formula (1), λ.sub.t,d.sup.DL represents the bit rate of video block t at quality level d; λ.sub.t,D.sup.DL represents the bit rate when video block t is transmitted at the highest quality level D; T.sub.FoV.sup.n represents the video blocks covered by the field of view (FoV) of the n-th client; when χ.sub.t,d,m.sup.DL=1, the t-th video block is transmitted to the client through the downlink at the d-th bit rate level with the m-th modulation and coding scheme; and when χ.sub.t,d,m.sup.DL=0, the t-th video block is not transmitted to the client through the downlink at the d-th bit rate level with the m-th modulation and coding scheme;
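
As an illustration only, the per-client term of formula (1) can be evaluated as in the following minimal Python sketch. The function name qoe, the nested-list layout of rates and chi, the zero-based indices, and the example numbers are all hypothetical. The sketch also assumes a natural logarithm and reads the denominator as the total highest-level bit rate over the video blocks in the FoV, which is one possible reading of λ.sub.t,D.sup.DL and is not stated in the disclosure.

```python
import math

def qoe(fov_tiles, rates, chi):
    """QoE of one client, following the shape of formula (1).

    fov_tiles  : indices t of the video blocks inside the client's FoV
    rates[t][d]: bit rate of block t at level d (last entry = highest level D)
    chi[t][d][m]: 0/1 decision for block t, level d, modulation and coding scheme m
    """
    delivered = sum(
        rates[t][d] * chi[t][d][m]
        for t in fov_tiles
        for d in range(len(rates[t]))
        for m in range(len(chi[t][d]))
    )
    # Assumption: normalise by the total top-level rate of the FoV blocks.
    best = sum(rates[t][-1] for t in fov_tiles)
    return math.log(delivered / best)

# Hypothetical example: two blocks, three levels, two schemes.
rates = [[1.0, 2.0, 4.0], [1.0, 2.0, 4.0]]
chi = [
    [[0, 0], [1, 0], [0, 0]],  # block 0: level index 1, scheme index 0
    [[0, 0], [0, 0], [0, 1]],  # block 1: level index 2, scheme index 1
]
value = qoe([0, 1], rates, chi)  # log((2 + 4) / (4 + 4))
```

Under this reading the QoE is 0 when every FoV block is delivered at the highest level and negative otherwise; a selection that delivers nothing makes the logarithm undefined and would need separate handling.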

[0062] The constraint conditions:

[00013] $\begin{matrix} \sum_{e = 1}^{E} χ_{c, e}^{UL} = 1, \forall c & (2) \\ \sum_{c = 1}^{C} \sum_{e = 1}^{E} χ_{c, e}^{UL} \cdot λ_{c, e}^{UL} \leq {BW}^{UL} & (3) \\ \sum_{m = 1}^{M} χ_{t, d, m}^{DL} = 1, \forall t, d & (4) \\ \sum_{d = 1}^{D} χ_{t, d, m}^{DL} = 1, \forall t, m & (5) \\ \sum_{t = 1}^{T_{FoV}^{n}} \sum_{d = 1}^{D} \sum_{m = 1}^{M} χ_{t, d, m}^{DL} \cdot [\frac{λ_{t, d}^{DL}}{R_{m}^{DL}}] \leq Y^{DL} & (6) \\ \sum_{d = 1}^{D} \sum_{m = 1}^{M} χ_{t, d, m}^{DL} \cdot λ_{t, d}^{DL} \leq \frac{1}{T} \cdot \sum_{e = 1}^{E} χ_{c, e}^{UL} \cdot λ_{c, e}^{UL}, \forall c, t & (7) \end{matrix}$

[0063] formula (2) indicates that any c-th camera can select an original video of only one bit rate level to upload to the server; in formula (2), when χ.sub.c,e.sup.UL=1, it means that the c-th camera uploads an original video at e-th bit rate level to the server, and when χ.sub.c,e.sup.UL=0, it means that the c-th camera does not upload an original video at e-th bit rate level to the server;

[0064] formula (3) indicates that a total bit rate of the transmitted C videos should not exceed a total bandwidth of the entire uplink channel; in formula (3), BW.sup.UL represents a value of the total bandwidth of the uplink channel;

[0065] formula (4) indicates that when any t-th video block is transmitted to the client through the downlink at the d-th quality level, only one modulation and coding scheme can be selected;

[0066] formula (5) indicates that when any t-th video block is transmitted to the client through the downlink with the m-th modulation and coding scheme, the transmitted video block can select only one bit rate level;

[0067] formula (6) indicates that the total bit rate of all transmitted video blocks does not exceed the bit rate that all resource blocks in the entire downlink channel can provide; in formula (6), R.sub.m.sup.DL indicates the bit rate that can be provided by a single resource block when the m-th modulation and coding scheme is selected, and Y.sup.DL represents the total number of resource blocks in the downlink channel;

[0068] formula (7) indicates that a bit rate of any t-th video block in the downlink of the network environment is not greater than a bit rate of an original video taken by any c-th camera in the uplink.
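
The uplink constraints of formulas (2) and (3) and the left-hand side of formula (6) can be checked for a candidate selection as in the following hedged Python sketch. The function names, the list-based data layout, and the example numbers are hypothetical, and the square brackets in formula (6) are assumed here to denote rounding up to a whole number of resource blocks, which the disclosure does not state explicitly.

```python
import math

def uplink_feasible(chi_ul, rate_ul, bw_ul):
    """Formulas (2) and (3): exactly one level per camera, total rate within BW^UL."""
    one_level_each = all(sum(row) == 1 for row in chi_ul)
    total = sum(
        x * r
        for row, rates in zip(chi_ul, rate_ul)
        for x, r in zip(row, rates)
    )
    return one_level_each and total <= bw_ul

def resource_blocks_used(fov_tiles, chi_dl, rate_dl, rb_rate):
    """Left-hand side of formula (6); [.] is read as a ceiling (assumption)."""
    return sum(
        chi_dl[t][d][m] * math.ceil(rate_dl[t][d] / rb_rate[m])
        for t in fov_tiles
        for d in range(len(rate_dl[t]))
        for m in range(len(rb_rate))
    )

# Hypothetical example: two cameras with two levels each.
chi_ul = [[0, 1], [1, 0]]
rate_ul = [[5, 10], [5, 10]]
ok = uplink_feasible(chi_ul, rate_ul, bw_ul=15)  # 10 + 5 <= 15

# Two blocks, three levels, two schemes; rb_rate[m] is the rate per resource block.
rate_dl = [[1.0, 2.0, 4.0], [1.0, 2.0, 4.0]]
chi_dl = [
    [[0, 0], [1, 0], [0, 0]],
    [[0, 0], [0, 0], [0, 1]],
]
used = resource_blocks_used([0, 1], chi_dl, rate_dl, rb_rate=[0.5, 1.5])
```

A selection satisfies formula (6) when the returned count does not exceed Y.sup.DL.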

[0069] Step 3, solving the adaptive acquisition and transmission model for VR video with a KKT condition and a hybrid branch and bound method to obtain an uplink collecting decision variable and a downlink transmitting decision variable in the network environment;

[0070] step 3.1, performing a relaxation operation on the collecting decision variable χ.sub.c,e.sup.UL and the transmitting decision variable χ.sub.t,d,m.sup.DL of the adaptive acquisition and transmission model for VR video, and obtaining a continuous collecting decision variable and a continuous transmitting decision variable within a scope of [0,1], respectively;

[0071] step 3.2, according to the constraint conditions of formula (2)-formula (7), denoting

[00014] $\sum_{e = 1}^{E} χ_{c, e}^{UL} - 1$

as a function h.sub.1(χ.sub.c,e.sup.UL); denoting

[00015] $\sum_{m = 1}^{M} χ_{t, d, m}^{DL} - 1$

as a function h.sub.2(χ.sub.t,d,m.sup.DL); denoting

[00016] $\sum_{d = 1}^{D} χ_{t, d, m}^{DL} - 1$

as a function h.sub.3(χ.sub.t,d,m.sup.DL); denoting

[00017] $\sum_{c = 1}^{C} \sum_{e = 1}^{E} χ_{c, e}^{UL} \cdot λ_{c, e}^{UL} - {BW}^{UL}$

as a function g.sub.1(χ.sub.c,e.sup.UL); denoting

[00018] $\sum_{t = 1}^{T_{FoV}^{n}} \sum_{d = 1}^{D} \sum_{m = 1}^{M} χ_{t, d, m}^{DL} \cdot [\frac{λ_{t, d}^{DL}}{R_{m}^{DL}}] - Y^{DL}$

as a function g.sub.2(χ.sub.t,d,m.sup.DL); denoting

[00019] $\sum_{d = 1}^{D} \sum_{m = 1}^{M} χ_{t, d, m}^{DL} \cdot λ_{t, d}^{DL} - \frac{1}{T} \cdot \sum_{e = 1}^{E} χ_{c, e}^{UL} \cdot λ_{c, e}^{UL}$

as a function g.sub.3(χ.sub.c,e.sup.UL,χ.sub.t,d,m.sup.DL); and a Lagrangian function L(χ.sub.c,e.sup.UL,χ.sub.t,d,m.sup.DL,λ,μ) of a relaxed adaptive acquisition and transmission model for VR video is calculated with formula (8) as:

[00020] $\begin{matrix} L (χ_{c, e}^{UL}, χ_{t, d, m}^{DL}, λ, μ) = - \sum_{n = 1}^{N} {QoE}_{n} + λ_{1} h_{1} (χ_{c, e}^{UL}) + λ_{2} h_{2} (χ_{t, d, m}^{DL}) + λ_{3} h_{3} (χ_{t, d, m}^{DL}) + μ_{1} g_{1} (χ_{c, e}^{UL}) + μ_{2} g_{2} (χ_{t, d, m}^{DL}) + μ_{3} g_{3} (χ_{c, e}^{UL}, χ_{t, d, m}^{DL}) & (8) \end{matrix}$

[0072] in the formula (8), λ represents the Lagrangian coefficients of the equality constraint conditions in formulas (2)-(7), μ represents the Lagrangian coefficients of the inequality constraint conditions in formulas (2)-(7), λ.sub.1 represents a Lagrangian coefficient of the function h.sub.1(χ.sub.c,e.sup.UL), λ.sub.2 represents a Lagrangian coefficient of the function h.sub.2(χ.sub.t,d,m.sup.DL), λ.sub.3 is a Lagrangian coefficient of the function h.sub.3(χ.sub.t,d,m.sup.DL), μ.sub.1 is a Lagrangian coefficient of the function g.sub.1(χ.sub.c,e.sup.UL), μ.sub.2 is a Lagrangian coefficient of the function g.sub.2(χ.sub.t,d,m.sup.DL), and μ.sub.3 is a Lagrangian coefficient of the function g.sub.3(χ.sub.c,e.sup.UL,χ.sub.t,d,m.sup.DL), and QoE.sub.n represents the quality of experience of the n-th client, and:

[00021] $\begin{matrix} {QoE}_{n} = \log (\frac{\sum_{t = 1}^{T_{FoV}^{n}} \sum_{d = 1}^{D} \sum_{m = 1}^{M} λ_{t, d}^{DL} \cdot χ_{t, d, m}^{DL}}{λ_{t, D}^{DL}}) & (9) \end{matrix}$

[0073] step 3.3, obtaining the KKT conditions of the relaxed adaptive acquisition and transmission model for VR video as shown in formulas (10)-(15) below according to the Lagrangian function L(χ.sub.c,e.sup.UL,χ.sub.t,d,m.sup.DL,λ,μ) of formula (8):

[00022] $\begin{matrix} \frac{\partial L (χ_{c, e}^{UL}, χ_{t, d, m}^{DL}, λ, μ)}{\partial χ_{c, e}^{UL}} = λ_{1} \frac{\partial h_{1} (χ_{c, e}^{UL})}{\partial χ_{c, e}^{UL}} + μ_{1} \frac{\partial g_{1} (χ_{c, e}^{UL})}{\partial χ_{c, e}^{UL}} + μ_{3} \frac{\partial g_{3} (χ_{c, e}^{UL}, χ_{t, d, m}^{DL})}{\partial χ_{c, e}^{UL}} = 0 & (10) \\ \frac{\partial L (χ_{c, e}^{UL}, χ_{t, d, m}^{DL}, λ, μ)}{\partial χ_{t, d, m}^{DL}} = - \sum_{n = 1}^{N} \frac{\partial {QoE}_{n}}{\partial χ_{t, d, m}^{DL}} + λ_{2} \frac{\partial h_{2} (χ_{t, d, m}^{DL})}{\partial χ_{t, d, m}^{DL}} + λ_{3} \frac{\partial h_{3} (χ_{t, d, m}^{DL})}{\partial χ_{t, d, m}^{DL}} + μ_{2} \frac{\partial g_{2} (χ_{t, d, m}^{DL})}{\partial χ_{t, d, m}^{DL}} + μ_{3} \frac{\partial g_{3} (χ_{c, e}^{UL}, χ_{t, d, m}^{DL})}{\partial χ_{t, d, m}^{DL}} = 0 & (11) \\ g_{1} (χ_{c, e}^{UL}) \leq 0, g_{2} (χ_{t, d, m}^{DL}) \leq 0, g_{3} (χ_{c, e}^{UL}, χ_{t, d, m}^{DL}) \leq 0 & (12) \\ h_{1} (χ_{c, e}^{UL}) = 0, h_{2} (χ_{t, d, m}^{DL}) = 0, h_{3} (χ_{t, d, m}^{DL}) = 0 & (13) \\ λ_{1}, λ_{2}, λ_{3} \neq 0, μ_{1}, μ_{2}, μ_{3} \geq 0 & (14) \\ μ_{1} g_{1} (χ_{c, e}^{UL}) = 0, μ_{2} g_{2} (χ_{t, d, m}^{DL}) = 0, μ_{3} g_{3} (χ_{c, e}^{UL}, χ_{t, d, m}^{DL}) = 0 & (15) \end{matrix}$

[0074] Formulas (10) and (11) represent the necessary conditions for the Lagrangian function L(χ.sub.c,e.sup.UL,χ.sub.t,d,m.sup.DL,λ,μ) to take an extreme value; formulas (12) and (13) represent the constraint conditions on the functions h.sub.1(χ.sub.c,e.sup.UL), h.sub.2(χ.sub.t,d,m.sup.DL), h.sub.3(χ.sub.t,d,m.sup.DL), g.sub.1(χ.sub.c,e.sup.UL), g.sub.2(χ.sub.t,d,m.sup.DL), g.sub.3(χ.sub.c,e.sup.UL,χ.sub.t,d,m.sup.DL); formula (14) represents the constraint conditions on the Lagrangian coefficients λ.sub.1, λ.sub.2, λ.sub.3, μ.sub.1, μ.sub.2, μ.sub.3; and formula (15) represents the complementary slackness condition.

[0075] Solving the formulas (10)-(15), and obtaining an optimal solution χ.sub.relax and an optimal total utility value Z.sub.relax of the relaxed adaptive acquisition and transmission model for VR video; where the optimal solution χ.sub.relax includes relaxed optimal solutions of the collecting decision variable χ.sub.c,e.sup.UL and the transmitting decision variable χ.sub.t,d,m.sup.DL;
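
As a worked illustration of the QoE term in formula (11), and assuming that log denotes the natural logarithm and that the denominator λ.sub.t,D.sup.DL of formula (9) is treated as a constant, the derivative of QoE.sub.n with respect to the relaxed variable of a video block t inside the FoV of the n-th client may be written as follows; the primed indices t', d', m' are introduced here only to keep the summation indices distinct from the fixed indices t, d, m:

$\frac{\partial {QoE}_{n}}{\partial χ_{t, d, m}^{DL}} = \frac{λ_{t, d}^{DL}}{\sum_{t' = 1}^{T_{FoV}^{n}} \sum_{d' = 1}^{D} \sum_{m' = 1}^{M} λ_{t', d'}^{DL} \cdot χ_{t', d', m'}^{DL}}$

Under the same assumptions, the constraint indexed by the same t and m gives $\frac{\partial h_{3}}{\partial χ_{t, d, m}^{DL}} = 1$, and $\frac{\partial g_{2}}{\partial χ_{t, d, m}^{DL}} = [\frac{λ_{t, d}^{DL}}{R_{m}^{DL}}]$, so a higher bit rate level raises both the marginal utility and the marginal resource block cost of a video block.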

[0076] step 3.4, using the optimal solution χ.sub.relax and the optimal total utility value Z.sub.relax as initial input parameters of the hybrid branch and bound method;

[0077] step 3.5, defining the number of branches in the algorithm as k, defining a lower bound of the optimal total utility value in the algorithm as L, and defining an upper bound of the optimal total utility value in the algorithm as U;

[0078] determining an output parameter of the hybrid branch and bound method:

[0079] let χ.sub.0-1 denote an optimal solution of a non-relaxed adaptive acquisition and transmission model for VR video;

[0080] let Z.sub.0-1 denote an optimal total utility value of the non-relaxed adaptive acquisition and transmission model for VR video;

[0081] step 3.6, initializing k=0;

[0082] step 3.7, initializing L=0;

[0083] step 3.8, initializing U=Z.sub.relax;

[0084] step 3.9, denoting an optimal solution of the k-th branch as χ.sub.k and denoting a corresponding optimal total utility value as Z.sub.k, assigning a value of χ.sub.relax to χ.sub.k, and using the optimal solution χ.sub.k of the k-th branch as a root node;

[0085] step 3.10, determining whether there is a solution of χ.sub.k that does not meet a 0-1 constraint condition, if there is, dividing a relaxed optimal solution of χ.sub.k into a solution that meets the 0-1 constraint condition and a solution χ.sub.k(0,1) that does not meet the 0-1 constraint condition, and going to step 3.12; otherwise, expressing χ.sub.k as the optimal solution of the non-relaxed adaptive acquisition and transmission model for VR video;

[0086] step 3.11, randomly generating a number ε.sub.k for the k-th branch within the range of (0,1), and determining whether 0<χ.sub.k(0,1)<ε.sub.k is true; if true, adding the constraint condition “χ.sub.k(0,1)=0” to the non-relaxed adaptive acquisition and transmission model for VR video to form a sub-branch I of the k-th branch; otherwise, adding the constraint condition “χ.sub.k(0,1)=1” to the non-relaxed adaptive acquisition and transmission model for VR video to form a sub-branch II of the k-th branch;

[0087] step 3.12, solving the relaxed solutions of the sub-branch I and the sub-branch II of the k-th branch with the KKT condition, and using them as an optimal solution χ.sub.k+1 and an optimal total utility value Z.sub.k+1 to the (k+1)-th branch, where the χ.sub.k+1 includes: relaxed solutions of the sub-branch I and the sub-branch II of the (k+1)-th branch;

[0088] step 3.13, determining whether the optimal solution χ.sub.k+1 of the (k+1)-th branch meets the 0-1 constraint condition, if so, finding a maximum value from the optimal total utility value Z.sub.k+1 and assigning it to L, and χ.sub.k+1∈{0,1}; otherwise, finding the maximum value from the optimal total utility value Z.sub.k+1 and assigning it to U, and χ.sub.k+1∈(0,1);

[0089] step 3.14, determining whether Z.sub.k+1<L is true; if so, cutting off the branch where the optimal solution χ.sub.k+1 of the (k+1)-th branch is located, assigning k+1 to k, and returning to step 3.10; otherwise, going to step 3.15;

[0090] step 3.15, determining whether Z.sub.k+1>L is true; if so, assigning k+1 to k, and returning to step 3.10; otherwise, going to step 3.16;

[0091] step 3.16, determining whether Z.sub.k+1=L is true, if so, it means that the optimal solution of the non-relaxed adaptive acquisition and transmission model for VR video is the optimal solution χ.sub.k+1 of the (k+1)-th branch, and assigning χ.sub.k+1 to an optimal solution χ.sub.0-1 of the non-relaxed adaptive acquisition and transmission model for VR video, assigning Z.sub.k+1 corresponding to the χ.sub.k+1 to an optimal total utility value Z.sub.0-1 of the non-relaxed adaptive acquisition and transmission model for the VR video; otherwise, assigning k+1 to k, and returning to step 3.10.
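
The bounding and pruning idea of steps 3.4-3.16 can be illustrated on a much smaller problem with the following Python sketch. It is not the hybrid method of the present disclosure: it covers only an uplink-style selection (one bit rate level per camera within a bandwidth limit), it uses a hypothetical utility equal to the sum of log bit rates, and it swaps the KKT-based relaxation and the random threshold ε.sub.k for a simpler optimistic bound obtained by ignoring the bandwidth limit for the cameras not yet decided. The function name branch_and_bound and the example numbers are assumptions made for illustration.

```python
import math

def branch_and_bound(rates, bw):
    """Pick one level per camera to maximise sum(log(rate)) with total rate <= bw.

    Returns (best_utility, best_choice); best_choice is None if no selection fits.
    """
    n = len(rates)
    # Optimistic utility of cameras c..n-1 when the bandwidth limit is ignored.
    tail_best = [0.0] * (n + 1)
    for c in range(n - 1, -1, -1):
        tail_best[c] = tail_best[c + 1] + math.log(max(rates[c]))

    best = [-math.inf, None]  # incumbent lower bound and its selection

    def visit(c, used, utility, choice):
        # Cut the branch when even its optimistic bound cannot beat the incumbent.
        if utility + tail_best[c] <= best[0]:
            return
        if c == n:
            best[0], best[1] = utility, list(choice)
            return
        for e, r in enumerate(rates[c]):
            if used + r <= bw:  # branch on the level chosen for camera c
                choice.append(e)
                visit(c + 1, used + r, utility + math.log(r), choice)
                choice.pop()

    visit(0, 0.0, 0.0, [])
    return best[0], best[1]

# Hypothetical example: three cameras, three levels each, bandwidth 15.
utility, choice = branch_and_bound([[1, 3, 6], [2, 4, 8], [2, 5, 9]], bw=15)
```

Here the incumbent plays the role of the lower bound L and the optimistic value plays the role of the upper bound U; a branch is cut when its bound cannot exceed L, in the spirit of step 3.14.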

[0092] step 4, selecting, by the VR video server, an original video with the e-th bit rate level for the c-th camera according to the value of the uplink collecting decision variable χ.sub.c,e.sup.UL, and receiving the original video of the e-th bit rate level selected by the c-th camera uploaded through the uplink, so that the VR video server receives original videos of corresponding bit rate levels selected by C cameras respectively;

[0093] step 5, performing, by the VR video server, a stitching and mapping process on the C original videos with the corresponding bit rate levels to synthesize a complete VR video;

[0094] step 6, performing, by the VR video server, a segmentation process on the complete VR video to obtain T video blocks, denoted as {T.sub.1, T.sub.2, . . . , T.sub.t, . . . , T.sub.T}, where T.sub.t represents any t-th video block, and 1≤t≤T;

[0095] the VR video server provides D bit rate selections for the t-th video block T.sub.t for a compressing process, thereby obtaining compressed video blocks with D different bit rate levels, denoted as {T.sub.t.sup.1, T.sub.t.sup.2, . . . , T.sub.t.sup.d, . . . , T.sub.t.sup.D}, where T.sub.t.sup.d represents a compressed video block with the d-th bit rate level obtained after the t-th video block T.sub.t is compressed, where 1≤d≤D.

[0096] step 7, assuming that the set of modulation and coding schemes in the network environment is {M.sub.1, M.sub.2, . . . , M.sub.m, . . . , M.sub.M}, where M.sub.m represents the m-th modulation and coding scheme, and 1≤m≤M; and selecting, by the VR video server, the m-th modulation and coding scheme for the t-th video block T.sub.t; and

[0097] selecting, by the VR video server, the compressed video block T.sub.t.sup.d with the d-th bit rate level of the t-th video block T.sub.t for any n-th client according to a value of the downlink transmitting decision variable χ.sub.t,d,m.sup.DL, and transmitting the selected compressed video block T.sub.t.sup.d with the d-th bit rate level of the t-th video block T.sub.t to the n-th client through the downlink with the m-th modulation and coding scheme; so that the n-th client receives compressed video blocks with corresponding bit rate levels of T video blocks through the corresponding modulation and coding scheme;

[0098] step 8, performing, by the n-th client, decoding, mapping, and rendering processes on the received compressed video blocks with corresponding bit rate levels of the T video blocks, so as to synthesize a QoE-optimized VR video.
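
Steps 4 and 7 read the selections directly from the solved 0-1 decision variables. A minimal Python sketch of that lookup is given below; the function names, the nested-list layout, and the zero-based indices are hypothetical and are used only for illustration.

```python
def uplink_levels(chi_ul):
    """Step 4: for each camera c, the level index e with chi_ul[c][e] == 1."""
    return [row.index(1) for row in chi_ul]

def downlink_choices(chi_dl):
    """Step 7: for each video block t, the (d, m) index pairs with chi_dl[t][d][m] == 1."""
    return [
        [
            (d, m)
            for d, per_level in enumerate(per_block)
            for m, x in enumerate(per_level)
            if x == 1
        ]
        for per_block in chi_dl
    ]

# Hypothetical example: two cameras with two levels; two blocks, three levels, two schemes.
levels = uplink_levels([[0, 1], [1, 0]])
choices = downlink_choices([
    [[0, 0], [1, 0], [0, 0]],
    [[0, 0], [0, 0], [0, 1]],
])
```

In this example camera 0 uploads level index 1 and camera 1 uploads level index 0, while block 0 is sent at level index 1 with scheme index 0 and block 1 at level index 2 with scheme index 1.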

CPC classification

H04N19/176; H04L65/613; H04N19/166; H04L65/612; H04L65/765; H04L65/80

International classification

H04N19/166; H04N19/176
