Method and apparatus for multi-threaded video decoding

09838703 · 2017-12-05

Abstract

A method and an apparatus for performing multi-threaded video decoding are disclosed. The method uses a multi-threaded scheme to process an encoded picture stream on a picture-by-picture basis. Multiple threads perform video decoding at the same time: a first thread parses input bits into syntax elements of one picture, a second thread decodes the parsed syntax elements of another picture into pixel values, and further threads handle non-reference pictures, such as bidirectional predictive pictures, each performing both the parsing of input bits into syntax elements and the subsequent decoding of the parsed syntax elements into pixel values. Therefore, the decoding speed is substantially increased, and the decoding efficiency is enhanced.

Claims

1. A method for performing video decoding, comprising: receiving a stream of pictures, each picture consisting essentially of a complete field or frame having one or more slices of a plurality of macroblocks, each picture being encoded in accordance with one of MPEG-1, MPEG-2, MPEG-4 and H.264; parsing input bits of a first one of the pictures into a plurality of syntax elements of the first picture; decoding the plurality of syntax elements of the first picture into pixel values and parsing input bits of a second different one of the pictures into a plurality of syntax elements of the second picture in a parallel manner; determining whether the second picture has dependency on one or more other pictures; and if the second picture has no dependency on one or more other pictures, decoding the plurality of syntax elements of the second picture into pixel values.

2. A method according to claim 1, wherein the parsing, decoding and determining are performed by a multi-threaded processor, the method further comprising: assigning a first thread of the multi-threaded processor to perform the parsing input bits of the first picture; and assigning a second thread of the multi-threaded processor to perform the parsing input bits of the second picture, wherein the first and second threads operate in parallel.

3. A method according to claim 2, further comprising: assigning a third thread for performing both parsing input bits of a third picture into a plurality of syntax elements of the third picture and subsequent decoding of the plurality of syntax elements of the third picture into pixel values, wherein the third picture depends on one or both of the first and second pictures.

4. A method according to claim 1, wherein the decoding of the second picture further comprises: if the second picture has dependency on one or more other pictures, decoding the plurality of syntax elements of the second picture into pixel values after decoding the one or more other pictures referred by the decoding operation of the second picture.

5. A method according to claim 1, wherein the first picture comprises one of an intra-coded picture (I picture) and a predictive picture (P picture).

6. A method according to claim 1, wherein the second picture comprises one of a predictive picture (P picture) and one or more bidirectional predictive picture (B picture).

7. A method according to claim 1, further comprising determining whether the picture to be decoded includes a plurality of slices, and if it is determined that the picture is a multi-slice picture, decoding the multi-slice picture by multiple threads of a multi-threaded processor.

8. A method according to claim 1, wherein receiving includes receiving the stream from a network.

9. A method according to claim 1, wherein receiving includes receiving the stream from a storage device.

10. An apparatus, comprising: a receiving unit that is adapted to receive a stream of pictures, each picture consisting essentially of a complete field or frame having one or more slices of a plurality of macroblocks, each picture being encoded in accordance with one of MPEG-1, MPEG-2, MPEG-4 and H.264; a first decoding unit that is adapted to parse input bits of a first one of the pictures into a plurality of syntax elements of the first picture; a second decoding unit that is adapted to decode the plurality of syntax elements of the first picture into pixel values, wherein the first decoding unit is further adapted to parse input bits of a second different one of the pictures into a plurality of syntax elements of the second picture in a parallel manner with the second decoding unit decoding the plurality of syntax elements of the first picture into pixel values; and a determining unit that is adapted to determine whether the second picture has dependency on one or more other pictures, wherein if the second picture has no dependency on one or more other pictures, the second decoding unit decodes the plurality of syntax elements of the second picture into pixel values.

11. An apparatus according to claim 10, wherein the apparatus is implemented as part of a mobile phone.

12. An apparatus according to claim 11, wherein the receiving unit is coupled to receive the stream from a network connection of the mobile phone.

13. An apparatus according to claim 10, wherein the apparatus is implemented as part of a digital versatile disk player.

14. An apparatus according to claim 13, wherein the receiving unit is coupled to receive the stream from an optical storage disk of the digital versatile disk player.

15. An apparatus according to claim 10, wherein the apparatus is implemented as part of a computer.

16. An apparatus, comprising: a buffer that is adapted to store at least a portion of a stream of pictures, each picture consisting essentially of a complete field or frame having one or more slices of a plurality of macroblocks, each picture being encoded in accordance with one of MPEG-1, MPEG-2, MPEG-4 and H.264; a video decoder coupled to the buffer, the video decoder having a first stage that is adapted to parse input bits of a first one of the pictures into a plurality of syntax elements of the first picture, and a stage that is adapted to decode the plurality of syntax elements of the first picture into pixel values, wherein the first stage is further adapted to parse input bits of a second different one of the pictures into a plurality of syntax elements of the second picture in a parallel manner with the second stage decoding the plurality of syntax elements of the first picture into pixel values; and a determining unit that is adapted to determine whether the second picture has dependency on one or more other pictures, wherein if the second picture has no dependency on one or more other pictures, the second decoding unit decodes the plurality of syntax elements of the second picture into pixel values.

17. An apparatus according to claim 16, wherein the apparatus is implemented as part of a mobile phone.

18. An apparatus according to claim 17, wherein the video decoder is implemented by one or both of hardware and software of the mobile phone.

19. An apparatus according to claim 16, wherein the apparatus is implemented as part of a digital versatile disk player.

20. An apparatus according to claim 19, wherein the video decoder is implemented by one or both of hardware and software of the digital versatile disk player.

21. An apparatus according to claim 16, wherein the apparatus is implemented as part of a computer.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) These and other aspects and features of the present invention will become apparent to those ordinarily skilled in the art upon review of the following description of specific embodiments of the invention in conjunction with the accompanying figures, wherein:

(2) The accompanying drawings are included to provide a further understanding of the invention, and are incorporated in and constitute a part of this specification. The drawings illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.

(3) FIG. 1 is a flowchart illustrating a conventional process for decoding a picture stream.

(4) FIG. 2 is a schematic diagram illustrating a conventional video decoding process.

(5) FIG. 3 is a flowchart illustrating a method for performing multi-threaded video decoding according to an embodiment of the present invention.

(6) FIG. 4 is a flowchart illustrating a method for performing multi-threaded video decoding according to another embodiment of the present invention.

(7) FIG. 5 is a flowchart illustrating the decoding process for the block A shown in FIG. 4.

(8) FIG. 6 is a flowchart illustrating the decoding process for the block B shown in FIG. 4.

(9) FIG. 7 is a flowchart illustrating the decoding process for the block C shown in FIG. 4.

(10) FIG. 8 is a block diagram illustrating the apparatus for performing multi-threaded video decoding according to an embodiment of the present invention.

(11) FIG. 9 is a block diagram illustrating the apparatus for performing multi-threaded video decoding according to another embodiment of the present invention.

(12) FIG. 10 is a schematic diagram illustrating the picture stream according to an embodiment of the present invention.

(13) FIG. 11 is a schematic diagram illustrating the multi-threaded scheme by decoding order according to an embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

(14) Reference will now be made in detail to the present preferred embodiments of the invention, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numbers are used in the drawings and the description to refer to the same or like parts.

(15) As seen in FIG. 2, the bitstream is conventionally decoded picture by picture because the processor can operate only in a single thread. However, now that multi-threaded processors are available, the decoding of the pictures of a video can be implemented by multiple decoding units, or by decoding instructions executed in multiple threads. The present invention applies such a multi-threaded scheme to accelerate decoding.

(16) FIG. 3 is a flowchart illustrating a method for performing multi-threaded video decoding according to an embodiment of the present invention. Referring to FIG. 3, the decoding process 300 is carried out by a video decoder with multiple threads. When a bitstream of a video is inputted into the video decoder, the syntax elements of a first picture are parsed (step 310). The syntax elements include the information of the first picture, such as picture start code (PSC), temporal reference (TR), picture type, motion vector type, motion vectors, and so on.

(17) Next, one thread of the decoding process starts to decode the first picture into a plurality of pixel values based on the parsed syntax elements of the first picture (step 320). In the meantime, the input bits of a second picture of the encoded picture stream are parsed into the syntax elements of the second picture by another thread of the decoding process (step 330).

(18) Before the decoding process starts to decode the second picture, it is determined whether the decoding operation of the second picture has dependency on one or more other pictures (step 340). If no dependency is found, the decoding process decodes the second picture directly (step 350). However, if the decoding operation of the second picture has dependency on other pictures, the decoding process further determines whether the one or more other pictures referenced by the decoding of the second picture have already been decoded (step 360).

(19) If the decoding of the pictures on which the second picture depends is not yet completed, the decoding of the second picture is postponed until all of those reference pictures have been decoded. The decoder then decodes the second picture by referring to them (step 370).
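
By way of illustration only, the flow of FIG. 3 may be sketched in Python as follows. The names parse, decode and decode_pair, the dictionary layout of a picture, and the placeholder "pixels(...)" strings are hypothetical and are not part of the disclosed embodiment; the sketch only mirrors the ordering of steps 310 to 370.

```python
from concurrent.futures import ThreadPoolExecutor

def parse(bits):
    # Hypothetical stand-in for parsing input bits into syntax elements.
    return {"name": bits["name"], "refs": list(bits["refs"])}

def decode(syntax):
    # Hypothetical stand-in for decoding syntax elements into pixel values.
    return f"pixels({syntax['name']})"

def decode_pair(first_bits, second_bits):
    decoded = {}
    first_syntax = parse(first_bits)                      # step 310
    with ThreadPoolExecutor(max_workers=2) as pool:
        first_job = pool.submit(decode, first_syntax)     # step 320
        second_job = pool.submit(parse, second_bits)      # step 330, in parallel
        second_syntax = second_job.result()
        if second_syntax["refs"]:                         # step 340
            # Step 360: wait until the referenced picture has been decoded.
            decoded[first_syntax["name"]] = first_job.result()
            missing = [r for r in second_syntax["refs"] if r not in decoded]
            if missing:
                raise ValueError(f"undecoded references: {missing}")
        # Step 350 (no dependency) or step 370 (references are ready).
        decoded[second_syntax["name"]] = decode(second_syntax)
        decoded[first_syntax["name"]] = first_job.result()
    return decoded

result = decode_pair({"name": "I0", "refs": []},
                     {"name": "P0", "refs": ["I0"]})
```

In the standard CPython interpreter the two workers illustrate the control flow rather than true processor-level parallelism; an actual decoder would rely on native threads or dedicated decoding units.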

(20) In the aforesaid decoding process, the pictures that depend on more than one reference picture are typically referred to as non-reference pictures, since they are not themselves used for prediction by any other picture. This is the common practice of most international standards followed by digital television and by optical storage disks such as DVD, high-definition DVD (HD-DVD) and Blu-ray Disc (BD). More precisely, such pictures are defined as bidirectional predictive pictures (B pictures) in the standard video codecs. A forward predictive picture (P picture), by contrast, makes a forward reference to a preceding intra-coded picture or to another forward predictive picture.

(21) Similarly, a bidirectional predictive picture makes forward and backward references to intra-coded pictures or forward predictive pictures. Therefore, if the aforementioned second picture is a bidirectional predictive picture, its decoding depends on other pictures, and the decoding process further determines whether the pictures that it references in the forward and backward directions have been decoded. Once the reference pictures have been decoded, the decoder may assign one thread to perform the entire operation for the bidirectional predictive picture, namely parsing its input bits into a plurality of syntax elements and subsequently decoding the parsed syntax elements into pixel values. Accordingly, the decoding speed is increased.

(22) To sum up, in the present invention, multiple threads perform video decoding at the same time: a first thread parses input bits into syntax elements of one picture, a second thread decodes the parsed syntax elements of another picture into pixel values, and a third thread performs the entire operation for a bidirectional predictive picture, namely parsing input bits into syntax elements and subsequently decoding the parsed syntax elements into pixel values. The same process is repeated for the other pictures until all the pictures in the picture stream have been decoded. However, the decoding process may vary for different types of pictures and is usually carried out in a particular order. Therefore, an embodiment that considers all the conditions of decoding pictures is further provided.

(23) FIG. 4 is a flowchart illustrating a method for performing multi-threaded video decoding according to another embodiment of the present invention. Referring to FIG. 4, the decoding process 400 is carried out by a video decoder with multiple threads. In step 410, a video stream is received by the video decoder either from a network or from an external storage device. In step 420, the picture header is read by the video decoder to obtain information about the picture. Then, in step 430, the decoder is initialized to find a reference picture. In the present embodiment, the reference picture is an intra-coded picture (I picture) or a predictive picture (P picture).

(24) In step 440, the decoder processes the picture stream in a parallel manner, with separate operations for the various pictures, such as the current reference picture, the preceding reference picture, and the non-reference picture. For example, the decoder can parse input bits into a plurality of syntax elements of one picture while decoding the parsed syntax elements of another picture into pixel values. The decoding process can be classified into three conditions (denoted as blocks A, B and C) according to the type of picture.

(25) In block A, the syntax elements of the current reference picture are parsed from the input bits; in block B, the preceding reference picture is decoded into pixel values based on its parsed syntax elements; in block C, the syntax elements of the non-reference picture are parsed from the input bits and then decoded into the pixel values of the non-reference picture. Each of these three blocks is implemented with different threads, such that they can be executed in a parallel manner.

(26) After these pictures are decoded, the decoded pixel values are output to buffer memory for display. Meanwhile, the decoder checks whether the decoding process 400 has reached the end of the picture stream. If there are still pictures not yet decoded, the decoding process 400 returns to step 420 to read the header of the next picture. When the decoder detects that the entire picture stream has been decoded, the decoding process 400 is terminated.

(27) In other embodiments, each picture may include multiple slices. In that case, multiple threads can also be used to perform the parsing of the syntax elements or the decoding of the pictures. The following embodiments describe the detailed process for each of the three conditions shown in FIG. 4.

(28) FIG. 5 is a flowchart illustrating the decoding process for the block A shown in FIG. 4. Referring to FIG. 5, the decoder is to parse the input bits into the syntax elements of a current reference picture. Accordingly, it is determined whether there are multiple slices in the current reference picture (step 510). If multiple slices exist, the decoder uses multiple threads (N+1 threads in this embodiment) to parse the input bits into the syntax elements of slices 0 to N of the current reference picture in a parallel manner, one thread per slice (step 520). Otherwise, the decoder needs only a single thread to parse the input bits into the syntax elements of the current reference picture (step 530).

(29) FIG. 6 is a flowchart illustrating the decoding process for the block B shown in FIG. 4. Referring to FIG. 6, the decoder is going to decode syntax elements into pixel values of a preceding reference picture. Accordingly, it is determined whether there are multiple slices in the preceding reference picture (step 610). If multiple slices exist, the decoder will use multiple threads (N+1 threads in this embodiment) to do the operation of decoding syntax elements into pixel values of slices (0 to N) of the preceding reference picture in a parallel manner for each slice (step 620). Otherwise, the decoder only needs to use a single thread to do the operation of decoding syntax elements into pixel values of the preceding reference picture (step 630).

(30) FIG. 7 is a flowchart illustrating the decoding process for the block C shown in FIG. 4. Referring to FIG. 7, the decoder is going to decode the preceding non-reference picture. Accordingly, it is determined whether there are multiple slices in the non-reference picture (step 710). If multiple slices exist, the decoder will use multiple threads (N+1 threads in this embodiment) to do the operation of parsing input bits into the syntax elements of slices (0 to N) and the subsequent operation of decoding syntax elements into pixel values of slices (0 to N) of the non-reference picture in a parallel manner for each slice (step 720). Otherwise, the decoder only needs to use a single thread to do the operation of parsing input bits into the syntax elements and the subsequent operation of decoding syntax elements into pixel values of the non-reference picture (step 730).
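
The per-slice branching common to FIGS. 5 to 7 may be sketched as follows, taking the parsing case of FIG. 5 as the example. The helper names parse_slice and parse_picture are hypothetical, and treating each byte of a slice as one syntax element is a placeholder assumption, not the parsing defined by any video standard.

```python
from concurrent.futures import ThreadPoolExecutor

def parse_slice(slice_bits):
    # Placeholder: treat every byte of the slice as one "syntax element".
    return list(slice_bits)

def parse_picture(slices):
    if len(slices) > 1:                                   # step 510
        # Step 520: one worker thread per slice (N+1 threads for slices 0..N).
        with ThreadPoolExecutor(max_workers=len(slices)) as pool:
            return list(pool.map(parse_slice, slices))
    # Step 530: a single-slice picture is parsed in the calling thread.
    return [parse_slice(slices[0])]

multi = parse_picture([b"\x01\x02", b"\x03"])
single = parse_picture([b"\x07"])
```

The same shape would apply to the decoding case of FIG. 6 and the combined parse-then-decode case of FIG. 7, with parse_slice replaced by the corresponding per-slice operation.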

(31) FIG. 8 is a block diagram illustrating the apparatus for performing multi-threaded video decoding according to an embodiment of the present invention. Referring to FIG. 8, the apparatus 800 includes a buffer 810, a first decoding unit 820, and a second decoding unit 830. Additionally, the apparatus can include an addressable storage medium or computer accessible medium, such as random access memory (RAM), an electrically erasable programmable read-only memory (EEPROM), masked read-only memory, one-time programmable memory, hard disks, floppy disks, laser disk players, digital video devices, Compact Disc ROMs, DVD-ROMs, other optical media, video tapes, audio tapes, magnetic recording tracks, electronic networks, and other techniques to transmit or store electronic content such as, by way of example, programs and data. The apparatus 800 may be used or implemented as part of the hardware or software included with a personal computer, a portable computer, a mobile phone, a personal digital assistant, a digital versatile disk player, or a television, but is not limited to them.

(32) The buffer 810 is suitable for receiving and storing the encoded pictures of a video from a network or from an external storage device. The first decoding unit 820 is coupled to the buffer 810 and is suitable for parsing the input bits into syntax elements, and the second decoding unit 830 is coupled to the buffer 810 and is suitable for decoding syntax elements into pixel values. Significantly, in the present embodiment, while the second decoding unit 830 is decoding the syntax elements of one picture, as parsed by the first decoding unit 820, into pixel values, the first decoding unit 820 can be parsing the input bits of another picture into syntax elements at the same time. Therefore, the video decoding can be divided into two stages, executed respectively by the first decoding unit and the second decoding unit, each of which can operate independently with multiple threads for different pictures or slices so as to accelerate the decoding process.
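
The two-stage arrangement of FIG. 8 resembles a producer and consumer pipeline, which may be sketched as follows. The function run_pipeline, the one-entry queue standing in for the hand-off between the two decoding units, and the placeholder strings are assumptions made for illustration only.

```python
import queue
import threading

def run_pipeline(pictures):
    parsed = queue.Queue(maxsize=1)   # hand-off between the two stages
    output = []

    def first_stage():
        # Plays the role of the first decoding unit: parse each picture.
        for name in pictures:
            parsed.put(f"syntax({name})")
        parsed.put(None)              # sentinel: end of the picture stream

    def second_stage():
        # Plays the role of the second decoding unit: decode parsed pictures.
        while True:
            syntax = parsed.get()
            if syntax is None:
                break
            output.append(f"pixels({syntax})")

    threads = [threading.Thread(target=first_stage),
               threading.Thread(target=second_stage)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return output

frames = run_pipeline(["I0", "P0", "P1"])
```

Because the queue holds one entry, the first stage can be parsing the next picture while the second stage is still decoding the previous one, which is the overlap described above.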

(33) FIG. 9 is a block diagram illustrating the apparatus for performing multi-threaded video decoding according to another embodiment of the present invention. Referring to FIG. 9, the apparatus 900 includes a receiving unit 910, a finding unit 920, a first decoding unit 930, a second decoding unit 940 and a determining unit 950. The apparatus 900 may be used or implemented as a portable computer, a mobile phone, a personal digital assistant, a digital versatile disk player, or a television, but is not limited to them.

(34) The receiving unit 910 is suitable for receiving and storing the encoded pictures of a video from a network or from an external storage device. The finding unit 920 is suitable for reading header information of the encoded picture stream to find a reference picture before multi-threaded video decoding starts. The first decoding unit 930 is suitable for parsing the input bits into syntax elements, and the second decoding unit 940 is suitable for decoding syntax elements into pixel values. The determining unit 950 is coupled to the first decoding unit 930 and the second decoding unit 940 and has two functions: the first is to determine whether a picture of the encoded picture stream is a reference picture or a non-reference picture, and the second is to determine whether the picture includes multiple slices. If the picture is determined to be a multiple-slice picture, the first decoding unit and the second decoding unit use multiple threads to process the multiple slices of the picture in a parallel manner.
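
The two determinations made by the determining unit 950 may be sketched as follows. The PictureHeader type and its fields are hypothetical, and the sketch follows the assumption of the present embodiment that I and P pictures are reference pictures and B pictures are non-reference pictures.

```python
from dataclasses import dataclass

@dataclass
class PictureHeader:
    picture_type: str   # "I", "P" or "B" (hypothetical field)
    slice_count: int    # number of slices in the picture (hypothetical field)

def is_reference(header):
    # First function: reference picture (I or P) or non-reference picture (B).
    return header.picture_type in ("I", "P")

def is_multi_slice(header):
    # Second function: does the picture contain more than one slice?
    return header.slice_count > 1

def threads_for(header):
    # One thread per slice for a multi-slice picture, otherwise one thread.
    return header.slice_count if is_multi_slice(header) else 1

b_header = PictureHeader(picture_type="B", slice_count=4)
```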

(35) For example, suppose that the picture stream, e.g. IBBPBBP . . . , wherein I, P and B refer to an I picture, a P picture and a B picture, respectively, has been received by the receiving unit 910. As defined in the video encoding/decoding standard, the decoding order would be I, P, B.sub.0, B.sub.1, P.sub.0, B.sub.2, B.sub.3, P.sub.1, B.sub.4, B.sub.5, and P.sub.2. In the present embodiment, after the finding unit 920 has found a reference I picture by reading the header information of the encoded picture stream, and before multi-threaded video decoding starts, the first picture to be decoded is the reference I picture. The syntax elements of the I picture are obtained by the first decoding unit 930, and the parsed syntax elements of the I picture are decoded into pixel values by the second decoding unit 940. Next, when the picture B.sub.0 is input, because the determining unit 950 determines that the B picture is a non-reference picture, the decoder parses the header of this non-reference picture in order to process it later. The decoder then gets the next picture, B.sub.1. It is also a non-reference picture, so the decoder handles it in the same way as B.sub.0. Next, the picture P.sub.0 is input, and the input bits of the P.sub.0 picture can be parsed into syntax elements by the first decoding unit 930 while the parsed syntax elements of the I picture are being decoded into pixel values by the second decoding unit 940.

(36) The decoder then gets the next picture, P.sub.1. It is a reference picture, so the input bits of the P.sub.1 picture can be parsed into syntax elements by the first decoding unit 930 while the parsed syntax elements of the P.sub.0 picture are being decoded into pixel values by the second decoding unit 940. Because the reference picture of B.sub.0 and B.sub.1 is P.sub.0, the decoding of B.sub.0 and B.sub.1 must be postponed until P.sub.0 has been decoded completely. Accordingly, when the decoder gets the next picture, the non-reference picture B.sub.4, it parses the header of this non-reference picture in order to process it later. Subsequently, one more non-reference picture, B.sub.5, is input, and the decoder handles it in the same way as B.sub.4. Next, the picture P.sub.2 is input. At this moment, P.sub.0 has been completely decoded by the second decoding unit 940. Therefore, while the input bits of the P.sub.2 picture are parsed into syntax elements by one thread of the first decoding unit 930 and the syntax elements of the P.sub.1 picture are decoded into pixel values by the second decoding unit 940, the input bits of B.sub.0 and B.sub.1 each require one thread of the first decoding unit 930 for parsing into syntax elements. When the input bits of B.sub.0 and B.sub.1 have been completely parsed into syntax elements, the syntax elements of B.sub.0 and B.sub.1 each require one thread of the second decoding unit 940 for decoding into pixel values. As such, the first decoding unit 930 and the second decoding unit 940 can operate independently with multiple threads for different pictures or slices so as to accelerate the decoding process.
In the other case, when the determining unit 950 determines that a picture of the encoded picture stream includes multiple slices, the present invention uses multiple threads to decode the multiple slices of that picture. Through the multi-threaded decoding process described above, the decoding speed can be substantially increased, such that the efficiency of the apparatus in the present embodiment is enhanced.

(37) In order to explain the concept of the present invention more clearly, a concrete exemplary embodiment is described. In the embodiment, the picture stream is assumed to be IBBPBBP . . . , wherein I, P and B refer to an I picture, a P picture and a B picture, respectively. FIG. 10 is a schematic diagram illustrating the picture stream according to an embodiment of the present invention. As illustrated, the display order of the pictures is B.sub.0, B.sub.1, P.sub.0, B.sub.2, B.sub.3, P.sub.1, B.sub.4, B.sub.5, P.sub.2, B.sub.6, B.sub.7, and P.sub.3. Accordingly, as defined in the video encoding/decoding standard, the decoding order would be I.sub.0, P.sub.0, B.sub.0, B.sub.1, P.sub.1, B.sub.2, B.sub.3, P.sub.2, B.sub.4, B.sub.5, and P.sub.3.
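
The relationship between display order and decoding order for an IBBP-type stream may be sketched as follows. The function decoding_order is hypothetical, and the sketch assumes that the display sequence begins with the picture I.sub.0 (written I0 in the code), as the IBBPBBP pattern implies, and that each B picture depends on the next reference picture in display order.

```python
def decoding_order(display):
    # Each reference picture (I or P) is moved ahead of the B pictures that
    # precede it in display order, since those B pictures depend on it.
    order, pending_b = [], []
    for pic in display:
        if pic.startswith("B"):
            pending_b.append(pic)     # hold until the next reference picture
        else:
            order.append(pic)         # the reference picture is decoded first
            order.extend(pending_b)   # then the B pictures that waited for it
            pending_b = []
    order.extend(pending_b)           # trailing B pictures, if any
    return order

order = decoding_order(["I0", "B0", "B1", "P0", "B2", "B3", "P1"])
# order is ["I0", "P0", "B0", "B1", "P1", "B2", "B3"]
```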

(38) FIG. 11 is a schematic diagram illustrating the decoding order for the multi-threaded scheme according to an embodiment of the present invention. Referring to FIG. 11, the unit of processing time is a time-slot. In the present embodiment, each time-slot has at least one thread and at most four threads, but the invention is not limited thereto.

(39) As seen in FIG. 11, when the picture I.sub.0 is input, the decoder uses one thread to parse the input bits of the picture I.sub.0 into the syntax elements of the picture I.sub.0 (denoted as I.sub.0-READ). Then, when the picture P.sub.0 is input, the decoder uses two threads: one to decode the syntax elements obtained by I.sub.0-READ into pixel values of the picture I.sub.0 (denoted as I.sub.0-DECODE), and one to parse the input bits of the picture P.sub.0 into the syntax elements of P.sub.0 (denoted as P.sub.0-READ). Next, when the picture B.sub.0 is input, because the B picture is a non-reference picture, the decoder parses its header and stores all NALUs (Network Abstraction Layer Units) of this non-reference picture in order to read and decode them later.

(40) The decoder then gets the next picture, B.sub.1. It is also a non-reference picture, so the decoder handles it in the same way as B.sub.0. Next, the picture P.sub.1 is input. It is a reference picture, so the decoder uses two threads: one to decode the syntax elements obtained by P.sub.0-READ into pixel values of P.sub.0 (denoted as P.sub.0-DECODE), and one to parse the input bits of the picture P.sub.1 into the syntax elements of P.sub.1 (denoted as P.sub.1-READ). However, B.sub.0 and B.sub.1 cannot be processed immediately because the reference picture of B.sub.0 and B.sub.1 is P.sub.0, and at this moment the decoder is still decoding P.sub.0. As a result, the decoding of B.sub.0 and B.sub.1 must be postponed until P.sub.0 has been decoded completely, that is, moved to the next time-slot.

(41) Next, the pictures B.sub.2 and B.sub.3 are input in succession. Their headers are likewise parsed and the pictures are stored in the decoder. After that, the picture P.sub.2 is input, and the decoder uses four threads: one thread decodes the syntax elements obtained by P.sub.1-READ into pixel values of P.sub.1 (denoted as P.sub.1-DECODE), another thread parses the input bits of the picture P.sub.2 into the syntax elements of P.sub.2 (denoted as P.sub.2-READ), and the other two threads respectively parse the input bits of B.sub.0 and B.sub.1 into syntax elements and decode those syntax elements into pixel values (denoted as B.sub.0-READ&DECODE and B.sub.1-READ&DECODE, respectively). The same happens for the pictures B.sub.4, B.sub.5, and P.sub.3: the decoder again uses four threads, one thread to decode the syntax elements obtained by P.sub.2-READ into pixel values of P.sub.2 (denoted as P.sub.2-DECODE), another thread to parse the input bits of the picture P.sub.3 into the syntax elements of P.sub.3 (denoted as P.sub.3-READ), and the other two threads respectively to parse the input bits of B.sub.2 and B.sub.3 into syntax elements and decode those syntax elements into pixel values (denoted as B.sub.2-READ&DECODE and B.sub.3-READ&DECODE, respectively). As described above, the decoding process follows the same rule for the current reference picture, the preceding reference picture and the non-reference pictures on a picture-by-picture basis under the multi-threaded scheme, so the detailed description of the decoding of the remaining pictures in the picture stream is omitted here.
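
The time-slot assignment walked through above may be reproduced by a short scheduling sketch. The function schedule and its list-of-lists output are hypothetical; picture names such as I0 and P0 stand for I.sub.0 and P.sub.0, and the task labels reuse the READ, DECODE and READ&DECODE notation of FIG. 11. The sketch stops at the arrival of the last reference picture, as the description does, and does not model the remaining pictures.

```python
def schedule(decode_order):
    slots = []
    prev_ref = None   # reference picture parsed in the previous time-slot
    waiting = []      # B pictures stored since the latest reference arrived
    ready = []        # B pictures postponed by one time-slot
    for pic in decode_order:
        if pic.startswith("B"):
            waiting.append(pic)       # header parsed, NALUs stored for later
            continue
        slot = []
        if prev_ref is not None:
            slot.append(f"{prev_ref}-DECODE")    # preceding reference picture
        slot.append(f"{pic}-READ")               # current reference picture
        slot.extend(f"{b}-READ&DECODE" for b in ready)
        slots.append(slot)
        # B pictures stored before this reference run in the next time-slot,
        # once the reference they depend on has finished decoding.
        ready, waiting = waiting, []
        prev_ref = pic
    return slots

slots = schedule(["I0", "P0", "B0", "B1", "P1", "B2", "B3",
                  "P2", "B4", "B5", "P3"])
```

Under these assumptions the fourth time-slot contains P1-DECODE, P2-READ, B0-READ&DECODE and B1-READ&DECODE, matching the four-thread case described for the arrival of P.sub.2.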

(42) It is worth noting that, according to experimental results, the processing time of a B picture is often half that of a P picture. Therefore, in the present embodiment, the reading and decoding operations of a B picture are processed in the same time-slot for best performance. However, in other embodiments, the reading and decoding operations of a B picture can also be processed in different time-slots.

(43) In summary, the present invention uses a multi-threaded processor and implements multiple threads to parse input bits into syntax elements of one picture and to decode syntax elements into pixel values of another picture in a parallel manner. Moreover, each of the slices in a picture can also be processed with its own thread. As a result, the waiting time in the decoding sequence is saved, and therefore a more efficient decoding method is obtained.

(44) It will be apparent to those skilled in the art that various modifications and variations can be made to the structure of the present invention without departing from the scope or spirit of the invention. In view of the foregoing, it is intended that the present invention cover modifications and variations of this invention provided they fall within the scope of the following claims and their equivalents.