METHOD AND SYSTEM FOR ENCODING A VIDEO DATA SIGNAL, ENCODED VIDEO DATA SIGNAL, METHOD AND SYSTEM FOR DECODING A VIDEO DATA SIGNAL
20230276039 · 2023-08-31
Inventors
Cpc classification
H04N19/88
ELECTRICITY
H04N13/161
ELECTRICITY
H04W52/0216
ELECTRICITY
H04N19/70
ELECTRICITY
Y02D30/70
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
H04N19/597
ELECTRICITY
H04L12/1881
ELECTRICITY
H04N21/234327
ELECTRICITY
International classification
H04N13/161
ELECTRICITY
H04N19/88
ELECTRICITY
H04N21/2343
ELECTRICITY
H04N21/434
ELECTRICITY
H04N19/597
ELECTRICITY
Abstract
Video data signals are encoded such that the encoded video data signal includes at least a primary and at least a secondary video data signal. The primary and secondary video data signals are jointly compressed. The primary video data signal is compressed in a self-contained manner, and the secondary video data signal is compressed using data from the primary video data signal. The jointly compressed video data signal is split into separate bitstreams, at least a primary bitstream including data for the primary video data signal and at least a secondary bitstream including data for the secondary video data signal, whereafter the primary and secondary bitstreams are multiplexed into a multiplexed signal, and the primary and secondary signals are provided with separate codes.
Claims
1. A method for encoding a video signal including a first signal and a second signal, the method comprising: encoding, by an encoder, the video signal to form an encoded signal; splitting, by a splitter, the encoded signal into separate bitstreams including a first bitstream including data for the first signal and a second bitstream including data for the second signal; and assigning, by a multiplexer, a first code to the first bitstream to form a first coded bitstream, and assigning a second code to the second bitstream to form a second coded bitstream, wherein the second code is different from the first code.
2. The method of claim 1, wherein the encoding step encodes the first signal in a self-contained manner without using information from the second signal and encodes the second signal using information from the first signal.
3. The method of claim 1, further comprising outputting the first coded bitstream and the second coded bitstream to one of a three dimensional (3D) decoder system for providing a 3D view and a two dimensional (2D) decoder system for providing a 2D view.
4. The method of claim 1, wherein the first code is decodable by the 2D decoder system to allow the 2D decoder system to extract and decode the first coded bitstream and discard the second coded bitstream.
5. The method of claim 1, wherein the encoding step encodes the first signal such that the first bitstream comprises data with a first set of frames having a first frequency, and encodes the second signal such that the second bitstream comprises data with a second set of frames having a second frequency, the second frequency being higher than the first frequency.
6. The method of claim 1, wherein the encoding step compresses the first signal with a lower quantization factor than the second signal.
7. The method of claim 1, further comprising interleaving frames of the first signal with frames of the second signal to form the video signal.
8. A system for encoding a video signal including a first signal and a second signal, the system comprising: an encoder configured to encode the video signal to form an encoded signal; a splitter configured to split the encoded signal into separate bitstreams including a first bitstream including data for the first signal and a second bitstream including data for the second signal; and a multiplexer configured to assign a first code to the first bitstream to form a first coded bitstream, and assign a second code to the second bitstream to form a second coded bitstream, wherein the second code is different from the first code.
9. The system of claim 8, wherein the encoder is configured to encode the first signal in a self-contained manner without using information from the second signal and encode the second signal using information from the first signal.
10. The system of claim 8, wherein the multiplexer is configured to transmit the first coded bitstream and the second coded bitstream to one of an enhanced two-dimensional (2D) decoder system for providing a three-dimensional (3D) view and a non-enhanced 2D decoder system for providing a 2D view.
11. The system of claim 10, wherein the first code is decodable by the non-enhanced 2D decoder system to allow the non-enhanced 2D decoder system to extract and decode the first coded bitstream and discard the second coded bitstream.
12. The system of claim 8, wherein the encoder is configured to encode the first signal such that the first bitstream comprises data with a first set of frames having a first frequency, and encode the second signal such that the second bitstream comprises data with a second set of frames having a second frequency, the second frequency being higher than the first frequency.
13. The system of claim 8, wherein the encoder is configured to compress the first signal with a lower quantization factor than the second signal.
14. The system of claim 8, further comprising a combiner configured to combine the first signal and the second signal to form a combined signal for encoding by the encoder.
15. The system of claim 8, further comprising an interleaver configured to interleave frames of the first signal with frames of the second signal to form a combined signal for encoding by the encoder.
16. A non-transitory computer readable medium comprising computer instructions which, when executed by a processor, configure the processor to perform the method of encoding of claim 1.
17. A method for decoding, by a 2D decoder system, a video signal including a first coded bitstream and a second coded bitstream, the method comprising: receiving the first coded bitstream having a first code and receiving the second coded bitstream having a second code, the second code being different from the first code; recognizing, by a demultiplexer, at least one of the first code and the second code and demultiplexing the first coded bitstream and the second coded bitstream to form a first demultiplexed bitstream and a second demultiplexed bitstream; merging the first demultiplexed bitstream and the second demultiplexed bitstream to form a merged signal; and decompressing, by a 2D decoder, the merged signal to form a decompressed video signal.
18. The method of claim 17, wherein the first coded bitstream is self-contained, and wherein the decompressing step decompresses the first demultiplexed bitstream included in the merged signal without using information from the second demultiplexed bitstream, and decompresses the second demultiplexed bitstream included in the merged signal using data from the first demultiplexed bitstream included in the merged signal.
19. The method of claim 17, wherein the first code is decodable by the 2D decoder system to perform one of (i) the merging step and (ii) extract and decode the first coded bitstream and discard the second coded bitstream.
20. The method of claim 17, further comprising a splitter configured to split the merged signal to provide a left view and a right view.
21. A system for decoding a video signal including a first coded bitstream and a second coded bitstream, the system comprising: a demultiplexer configured to: receive the first coded bitstream having a first code and receiving the second coded bitstream having a second code, the second code being different from the first code, recognize at least one of the first code and the second code, and demultiplex the first coded bitstream and the second coded bitstream to form a first demultiplexed bitstream and a second demultiplexed bitstream; a merger configured to merge the first demultiplexed bitstream and the second demultiplexed bitstream to form a merged signal; and a 2D decoder configured to decompress the merged signal to form a decompressed video signal.
22. The system of claim 21, wherein the first coded bitstream is self-contained such that predicted pictures associated with the first coded bitstream are temporally predicted only from pictures associated with the first coded bitstream, and predicted pictures associated with the second coded bitstream are temporally predicted from the second coded bitstream.
23. The system of claim 21, wherein the first code is decodable by the 2D decoder.
24. The system of claim 21, further comprising a splitter configured to split the merged signal to provide a left view and a right view.
25. A non-transitory computer readable medium comprising computer instructions which, when executed by a processor, configure the processor to perform the method of encoding of claim 16.
Description
[0042]
[0043]
[0044]
[0045]
[0046]
[0047]
[0048]
[0049]
[0050]
[0051]
[0052]
[0053] The Figures are not drawn to scale. Generally, identical components are denoted by the same reference numerals in the Figures.
[0054]
[0055]
[0056] Although such a method does reduce the bit-rate (by about 25% compared to the method of
[0057] The object of the invention is therefore to provide a method which, on the one hand, reduces the bit rate compared to fully and separately encoding both views, while, on the other hand, still having standard video backward compatibility.
[0058] To this end a method for encoding video data signals in accordance with the invention is a method wherein a video data signal is encoded, the encoded video data signal comprising at least a primary and at least a secondary video data signal, wherein the primary and secondary video data signal are jointly compressed, the primary video data signal being compressed in a self-contained manner, and the secondary video data signal being compressed using data from the primary video data signal, the jointly compressed video data signal is split into separate bitstreams, the bitstreams comprising at least a primary bitstream comprising data for the primary video data signal and at least a secondary bitstream comprising data for the secondary video data signal, whereafter the primary and secondary bitstreams are multiplexed into a multiplexed signal, and the primary and secondary signals are provided with separate codes.
[0059] The method of the invention combines the advantages of prior methods while avoiding their respective drawbacks. It comprises jointly compressing two or more video data signals, followed by splitting the single compressed bitstream into two or more (primary and secondary) separate bitstreams: a “primary” one that is self-contained and is decodable by conventional video decoders, and one or more “secondary” sets of frames (so-called auxiliary-video-representation streams) that are dependent on the primary bitstream. The separate bitstreams are multiplexed, wherein the primary and secondary bitstreams are separate bitstreams provided with separate codes and transmitted. Prima facie it may seem superfluous and a waste of effort to first jointly compress signals only to split them again after compression and provide them with separate codes. In all known techniques the compressed video data signal is given a single code in the multiplexer. Prima facie the invention seems to add an unnecessary complexity to the encoding of the video data signal.
[0060] The inventors have however realized that splitting and separately packaging the primary and secondary bitstreams in the multiplexed signal (i.e., giving the primary and secondary bitstream separate codes in the multiplexer) has two results. On the one hand, a standard demultiplexer in a conventional video system will recognize the primary bitstream by its code and send it to the decoder, so that the standard video decoder receives only the primary stream, the secondary stream not having passed the demultiplexer. The standard video decoder is thus able to correctly process it as a standard video data signal, for instance a standard 2D video data signal and/or a standard 50 Hz video data signal, or a signal of base resolution. On the other hand, a specialized system such as a 3D system, a 100 Hz display system or a high resolution video decoder can completely reverse the encoding process and re-create the original enhanced bitstream before sending it to, for instance, a stereo decoder, a 100 Hz decoder or an HDTV decoder.
[0061] In an embodiment of the method of the invention a video data signal is encoded, the encoded video data signal comprising a first and at least a second view having frames, wherein the frames of the first and second view are interleaved to form an interleaved video sequence, whereafter the interleaved video sequence is compressed, wherein the frames of the first of the views are encoded and compressed without using frames of the second view, and the frames of the second view are encoded and compressed using frames of the first view, and where the compressed enhanced video data signal is split into a primary and a secondary bit stream each bit stream comprising frames, wherein the primary bit-stream comprises compressed frames for the first of the views, and the secondary bit-stream for the second of the views, the primary and secondary bit-stream forming separate bit-streams, whereafter the primary and secondary bit-stream are multiplexed into a multiplex signal, the primary and secondary bitstream being provided with separate codes.
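The interleaving step of this embodiment can be illustrated with a minimal sketch. It assumes a simple alternating pattern (one frame of the first view, then one frame of the second view); the names `Frame` and `interleave` are hypothetical and do not come from the description, and no compression is modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    view: str   # "L" (first view) or "R" (second view); labels are illustrative
    index: int  # temporal position of the frame within its own view


def interleave(first: List[Frame], second: List[Frame]) -> List[Frame]:
    """Alternate frames of the two views: first[0], second[0], first[1], ..."""
    if len(first) != len(second):
        raise ValueError("both views must have the same number of frames")
    sequence: List[Frame] = []
    for a, b in zip(first, second):
        sequence.extend([a, b])
    return sequence


left = [Frame("L", i) for i in range(3)]
right = [Frame("R", i) for i in range(3)]
sequence = interleave(left, right)
```

The resulting sequence resembles a single 2D sequence at twice the frame rate, which is what allows a 2D encoder to compress it.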
[0062]
[0063] The frames of the left and right view are interleaved in VI to provide a combined signal. The combined signal resembles a 2D signal. The 2D video encoder 5 encodes and compresses the combined interleaved signal. A special feature of the compression is that the frames of one of the views form a self-contained system, i.e., in compression no information from the other view is used for the compression. The frames of the other view are compressed using information from frames of the first view. The invention departs from the natural tendency to treat two views on an equal footing. In fact, the two views are not treated equally during compression. One of the views becomes the primary view, for which during compression no information is used from the other view; the other view is secondary. The frames of the primary view and the frames of the secondary view are split into a primary bitstream and a secondary bitstream by the Bit Stream Splitter BSS. The coding system comprises a multiplexer MUX which assigns a code, e.g., 0x01 for MPEG or 0x1B for H.264, recognizable for standard video as a video bitstream, to the primary bitstream and a different code, e.g., 0x20, to the secondary stream. The multiplexed signal is transmitted (T). In
[0064]
[0065] Because the primary stream 0x1B is a fully self-contained signal, the problem associated with the method of
[0066] The method of encoding of the invention allows a reduction of bit rate compared to compressing the two views separately. Thus both a reduction in bitrate as well as 2D backward compatibility are achieved.
[0067]
[0068]
[0069] When the interleaving scheme of
[0070] By interleaving the frames of the left and right view, then compressing them with a compression scheme which provides for one self-contained signal for one of the views, and then splitting the signal again into a primary bitstream (containing the self-contained signal) and a secondary bitstream (containing the non-self-contained signal), a bit rate reduction is achieved while yet providing a fully operational 2D backward compatible signal. The bit stream splitter creates the primary stream (0x1B) by concatenating all the Access Units (an Access Unit comprises the data for at least a frame) of the first view into a primary stream and creates a secondary stream by concatenating all the Access Units (AU) of the second view into the secondary bitstream. The multiplexer then assigns a different code to the primary and secondary stream.
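The code assignment in the multiplexer can be sketched as follows. This is a simplified illustration, not a real MPEG-2 transport stream multiplexer: the `Packet` tuple, the `multiplex` function and the use of an integer decoding time stamp for ordering are assumptions made for the example. The code values 0x1B and 0x20 are the example codes named in the description.

```python
from typing import List, Tuple

PRIMARY_CODE = 0x1B    # example code for an H.264 primary stream
SECONDARY_CODE = 0x20  # example code for the secondary stream

# (stream code, decoding time stamp, payload); a hypothetical packet layout
Packet = Tuple[int, int, bytes]


def multiplex(primary: List[Tuple[int, bytes]],
              secondary: List[Tuple[int, bytes]]) -> List[Packet]:
    """Tag every access unit with the code of its stream and order by time stamp."""
    packets: List[Packet] = [(PRIMARY_CODE, dts, data) for dts, data in primary]
    packets += [(SECONDARY_CODE, dts, data) for dts, data in secondary]
    return sorted(packets, key=lambda packet: packet[1])


primary_stream = [(0, b"L0"), (2, b"L1")]
secondary_stream = [(1, b"R0"), (3, b"R1")]
multiplexed = multiplex(primary_stream, secondary_stream)
```

Because each packet carries the code of its own stream, a receiver can tell the two streams apart without inspecting the compressed payload.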
[0071]
[0072] In this embodiment of the invention the SEI message is used inside the encoding system.
[0073] An access unit is taken in step 60.
[0074] In a first step 61 it is checked whether the current access unit comprises an SEI message.
[0075] If an access unit does not contain an SEI message, in this particular example the information on the set of frames to which the access unit belongs is deduced from previously received information. For instance, if the previously received information was: “if one access unit belongs to set A, the next belongs to set B”, it is not needed to provide each access unit with SEI information.
[0076] If the access unit does contain an SEI message, the validity of the SEI message is checked with regard to a previous SEI message in step 62.
[0077] The SEI messages give information on interleaving which usually is a known sequence. If the SEI message is invalid there is an error 63.
[0078] If the SEI message is valid, the next step 64 is taken.
[0079] For each access unit the relevant interleaving information is now available, either by means of the fact that there was no SEI message, in which case there was no change in SEI message with respect to a previous access unit, or the access unit has a valid SEI message.
[0080] In the next step 64 it is checked whether the Access Unit forms part of the primary view. If so, it is appended in step 65 to the primary view bitstream; if not, it is appended to the secondary view video bitstream in step 66. It is obvious that this sequence could be reversed. Once an access unit is dealt with and appended to either the primary or secondary bitstream, the next access unit is fetched in step 67 and the process is repeated. It is remarked that
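Steps 60 to 67 can be sketched as a small loop. The sketch makes several assumptions that are not in the description: the SEI message is modelled as an optional tuple of view labels (`"P"` for primary, `"S"` for secondary) giving the repeating interleaving pattern, and the validity check of step 62 is reduced to checking that the pattern is well formed. `AccessUnit` and `split_access_units` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class AccessUnit:
    payload: bytes
    # Hypothetical stand-in for an SEI message: the repeating view pattern.
    sei_pattern: Optional[Tuple[str, ...]] = None


def split_access_units(units: Iterable[AccessUnit]) -> Tuple[List[bytes], List[bytes]]:
    primary: List[bytes] = []
    secondary: List[bytes] = []
    pattern: Optional[Tuple[str, ...]] = None
    position = 0
    for unit in units:                                   # steps 60 and 67
        if unit.sei_pattern is not None:                 # step 61
            well_formed = len(unit.sei_pattern) > 0 and all(
                label in ("P", "S") for label in unit.sei_pattern)
            if not well_formed:                          # step 62 (simplified)
                raise ValueError("invalid SEI message")  # error 63
            pattern, position = unit.sei_pattern, 0
        elif pattern is None:
            raise ValueError("no interleaving information received yet")
        label = pattern[position % len(pattern)]         # reuse earlier information
        position += 1
        if label == "P":                                 # step 64
            primary.append(unit.payload)                 # step 65
        else:
            secondary.append(unit.payload)               # step 66
    return primary, secondary


units = [AccessUnit(b"a", ("P", "S")), AccessUnit(b"b"), AccessUnit(b"c"), AccessUnit(b"d")]
primary_stream, secondary_stream = split_access_units(units)
```

Only the first access unit carries interleaving information here; the following units are assigned from the previously received pattern, as in paragraph [0075].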
[0086]
[0087] The two bit-streams are kept synchronous at systems level, for instance thanks to DTS (Decoding Time Stamp) flags in an MPEG-2 transport stream (broadcast application) or in an MPEG-4 File Format (file storage applications). A syntax element at systems level may be used to indicate that the secondary bit stream is dependent on the primary bit stream.
[0088] It is remarked that the secondary stream is no longer a valid stream by itself. Often this will not be a problem. Should a problem occur, one can insert empty frames into the secondary stream, which will hardly increase the bit rate. Before the merging process these empty frames will have to be removed.
[0089] In embodiments regular changes of primary and secondary signal may be made. In the method the two views are not treated equally; the first view is a self-contained view, whereas the second view is derived from the first. This could lead to a small difference in quality between the left and right view which may, in time, lead to slightly different behavior of the left and right eye receiving the images. By regularly changing the primary view from left to right, for instance at scene changes, this can be avoided.
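Changing the primary view at scene changes can be sketched as below. The sketch assumes that a scene identifier is available for every frame pair and that the primary view toggles at every scene boundary; both are illustrative assumptions, and `assign_primary_view` is a hypothetical name.

```python
from typing import List, Sequence


def assign_primary_view(scene_ids: Sequence[int], start: str = "L") -> List[str]:
    """Return, per frame pair, which view ("L" or "R") acts as the primary view."""
    choices: List[str] = []
    current = start
    for i, scene in enumerate(scene_ids):
        # Toggle the primary view whenever a new scene starts.
        if i > 0 and scene != scene_ids[i - 1]:
            current = "R" if current == "L" else "L"
        choices.append(current)
    return choices


primary_per_pair = assign_primary_view([0, 0, 1, 1, 1, 2])
```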
[0090] In embodiments the quantization factor of the compression may differ for the primary and secondary bit streams, in the 3D example the primary and secondary views. Especially when there are more secondary views, as will be explained below, it may be useful to assign more bandwidth to the primary view than to the secondary view.
[0091]
[0092]
[0093] After encoding, the splitter splits the encoded stream into a primary stream prim and a secondary stream sec. The multiplexer mux generates a multiplexed signal comprising a bitstream 0x1B for the primary view, and a separate bitstream 0x20 for the secondary view and, as in the case of
[0094] The standard device comprises a demultiplexer which extracts from the multiplexed signal the primary bitstream 0x1B since it recognizes this bitstream by its code; it rejects the bitstream 0x20. The video decoder receives the primary bitstream 0x1B. Since this is a self-contained bitstream with a “normal” bit rate, the video decoder is able to decode the bitstream without great difficulty. Thus the encoding method and system are backward compatible.
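The behaviour of the standard demultiplexer can be sketched in a few lines. The packet layout (a code paired with a payload) and the name `standard_demux` are assumptions for illustration; the set of recognized codes uses the example values 0x01 and 0x1B from the description.

```python
from typing import Iterable, List, Tuple

# Codes a conventional device recognizes as video (example values from the text).
STANDARD_VIDEO_CODES = frozenset({0x01, 0x1B})


def standard_demux(packets: Iterable[Tuple[int, bytes]]) -> List[bytes]:
    """Forward payloads with a recognized code; silently drop everything else."""
    return [payload for code, payload in packets if code in STANDARD_VIDEO_CODES]


multiplexed = [(0x1B, b"P0"), (0x20, b"S0"), (0x1B, b"P1"), (0x20, b"S1")]
to_decoder = standard_demux(multiplexed)
```

The secondary packets never reach the decoder, so the decoder is not loaded with data it cannot interpret.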
[0095]
[0096] At the decoder side, the decoder comprises a 3D demultiplexer 3D demux. This demultiplexer sends the audio bitstream 0x03 to an audio decoder, extracts the two video bitstreams 0x1B (the primary bitstream) and 0x20 (the secondary bitstream) from the multiplexed signal and sends the two video bitstreams to their respective inputs at a Bit Stream Merger (BSM), which merges the primary and secondary stream again. The merged video stream is sent to a decoder which decodes the merged bitstream using a reverse coding method, providing a 3D video data signal.
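The 3D demultiplexer and the Bit Stream Merger can be sketched together. The sketch assumes that every packet carries a decoding time stamp and that merging restores the original access unit order by time stamp; the packet layout and the names `demux_3d` and `merge_streams` are hypothetical. The codes are the example values used for the primary, secondary and audio streams in the description.

```python
import heapq
from typing import Dict, Iterable, List, Tuple

PRIMARY_CODE, SECONDARY_CODE, AUDIO_CODE = 0x1B, 0x20, 0x03

Packet = Tuple[int, int, bytes]   # (stream code, decoding time stamp, payload)
TimedUnit = Tuple[int, bytes]     # (decoding time stamp, payload)


def demux_3d(packets: Iterable[Packet]) -> Dict[int, List[TimedUnit]]:
    """Route packets to per-stream lists keyed by their code."""
    streams: Dict[int, List[TimedUnit]] = {
        PRIMARY_CODE: [], SECONDARY_CODE: [], AUDIO_CODE: []}
    for code, dts, payload in packets:
        if code in streams:
            streams[code].append((dts, payload))
    return streams


def merge_streams(primary: List[TimedUnit], secondary: List[TimedUnit]) -> List[bytes]:
    """Re-create the single stream; both inputs must already be in time stamp order."""
    merged = heapq.merge(primary, secondary, key=lambda unit: unit[0])
    return [payload for _, payload in merged]


packets = [(0x1B, 0, b"P0"), (0x03, 0, b"A0"), (0x20, 1, b"S0"),
           (0x1B, 2, b"P1"), (0x20, 3, b"S1")]
streams = demux_3d(packets)
merged_stream = merge_streams(streams[PRIMARY_CODE], streams[SECONDARY_CODE])
```

The merged list has the same access unit order as the stream that left the encoder, which is what lets the decoder reverse the encoding.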
[0097] Thus, a specialized 3D video decoding system is able to decode a 3D video data signal, while yet a standard 2D video decoding system is also able to provide a high quality image.
[0098] In the examples given above the enhanced video data signal was a 3D signal comprising two views, a left and a right view.
[0099] The invention is not restricted to this situation, although it is highly suitable for this situation.
[0100] The invention is also useful when, instead of two views, a multiview signal is generated.
[0101]
[0102] In this example there are three views: a primary view, for instance a central view, and a number of secondary views, for instance a left and a right view. For the central view a self-contained bitstream is generated. For the two secondary views, for instance a left and right view, secondary, not self-contained, bitstreams are generated.
[0103] In this embodiment it may be useful to use in the compression a different quantization factor for the central view frames than for the secondary view frames especially if there are many secondary views.
[0104] This embodiment is useful for generation of a multiview signal using an MVC (Multi-View Coding) encoder. In the example of
[0105] A further embodiment of the invention is exemplified in
[0106] Another category of enhanced video data signals is one in which a higher frequency (for instance 100 Hz) signal is to be generated. The problem described above for stereo signals also applies to such video data signals. The majority of video display devices operate at standard frequencies and the decoders are designed for such frequencies.
[0107]
[0108] In a specialized video decoder the frames may be compressed in a scheme much the same as the scheme shown in
[0109]
[0110]
[0111] In the above embodiment the SVC stream is split along the frequency (time) axis. SVC also allows splitting frames along the resolution and/or the quantization axis (SNR, CGS (Coarse Grain Scalability), FGS (Fine Grain Scalability)) and/or the color sampling axis (4:4:4, 4:2:0, 4:2:2). In such embodiments the problem described above, i.e., the fact that a standard video decoder has problems handling the incoming bitstream, also occurs; in other words, compatibility problems occur.
[0112] Within the framework of the invention, in its broadest sense, the video stream is split into at least two bit streams, a primary and secondary video stream (see
[0113] The code provided to the primary stream is a standard code (e.g., 0x1B or 0x01) and is thus decodable by any normal standard non-scalable MPEG (for 0x01) or H.264 video decoder (for 0x1B) 9, whereas a specialized decoder 8 in accordance with the invention can take full advantage of the scalability encoding.
[0114] The invention is also embodied in any computer program product for a method or device in accordance with the invention. A computer program product should be understood to be any physical realization of a collection of commands enabling a processor -generic or special purpose-, after a series of loading steps (which may include intermediate conversion steps, like translation to an intermediate language, and a final processor language) to get the commands into the processor, to execute any of the characteristic functions of an invention. In particular, the computer program product may be realized as data on a carrier such as, e.g., a disk or tape, data present in a memory, data traveling over a network connection -wired or wireless-, or program code on paper. Apart from program code, characteristic data required for the program may also be embodied as a computer program product.
[0115] The invention also relates to devices comprising an encoding system in accordance with the invention, such as 3D video recording devices or high resolution video recording devices.
[0116] The invention also relates to display devices comprising a decoding system in accordance with the invention. Such devices may for instance be 3D video display devices, HDTV display devices or display devices with increased resolution.
[0117] The invention furthermore relates to a multiplexed video data signal comprising at least two related video data signals with separate codes (0x01, 0x1B, 0x20), wherein a first video data signal (0x01, 0x1B) is a self-contained video data signal and at least a second video data signal (0x20) is not. Using a demultiplexer it is easy to treat the two related, but different, video data signals differently without having to use a decoder to make the distinction. For standard devices, such as standard 2D video display devices or SDTV devices, the first self-contained signal can be forwarded to the decoder, without overloading the decoder with the second signal. Specialized video systems can make full use of the data in the two video data signals.
[0118] In short, the invention can be described as follows:
[0119] Video data signals are encoded such that the encoded video data signal comprises at least a primary and at least a secondary video data signal. The primary and secondary video data signal are jointly compressed. The primary video data signal is compressed in a self-contained manner, and the secondary video data signal is compressed using data from the primary video data signal. The jointly compressed video data signal is split into separate bitstreams, at least a primary bitstream comprising data for the primary video data signal and at least a secondary bitstream comprising data for the secondary video data signal, whereafter the primary and secondary bitstreams are multiplexed into a multiplexed signal, and the primary and secondary signals are provided with separate codes.
[0120] It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims.
[0121] In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim.
[0122] The word “comprising” does not exclude the presence of other elements or steps than those listed in a claim. The invention may be implemented by any combination of features of various different preferred embodiments as described above.