Transmission apparatus, transmission method, reception apparatus, and reception method
11134254 · 2021-09-28
Assignee
Inventors
Cpc classification
H04N21/438
ELECTRICITY
H04N21/84
ELECTRICITY
H04N21/440281
ELECTRICITY
H04N21/4345
ELECTRICITY
H04N19/157
ELECTRICITY
H04N19/70
ELECTRICITY
H04N19/44
ELECTRICITY
H04N19/127
ELECTRICITY
H04N21/2362
ELECTRICITY
H04N19/156
ELECTRICITY
International classification
H04N21/4402
ELECTRICITY
H04N21/2362
ELECTRICITY
H04N21/84
ELECTRICITY
H04N21/434
ELECTRICITY
H04N19/70
ELECTRICITY
H04N19/127
ELECTRICITY
H04N19/157
ELECTRICITY
H04N19/44
ELECTRICITY
Abstract
A receiving side is enabled to perform excellent decode processing according to its decoding capability. An image encoding unit classifies image data of each picture constituting moving picture data into a plurality of layers, encodes the classified image data of the picture in each of the plurality of layers, and generates video data having the encoded image data of the picture in each of the plurality of layers. A data transmission unit transmits the video data. An information transmission unit transmits a level designation value of a bit stream and information on a layer range in each of a plurality of layer ranges having a different maximum layer.
Claims
1. A transmission apparatus, comprising: circuitry configured to classify image data of each picture including moving picture data into a plurality of layers, encode the classified image data of the picture in each of the plurality of layers, and generate video data having the encoded image data of the picture in each of the plurality of layers; transmit the video data; transmit layer identification information in a header of each picture, the layer identification information identifying each of the plurality of layers; and transmit information that specifies a plurality of level designation values, each of the plurality of level designation values having a corresponding one of a plurality of layer ranges of a bit stream, each of the plurality of layer ranges including a common one of the plurality of layers, wherein the layer identification information is different from the information that specifies the plurality of level designation values.
2. The transmission apparatus according to claim 1, wherein the circuitry is configured to insert the information in a layer of a container containing the video data and transmit the information.
3. The transmission apparatus according to claim 2, wherein the container is an MPEG2-TS, and the circuitry is configured to insert the information under a program map table and transmit the information.
4. The transmission apparatus according to claim 1, wherein the circuitry is configured to insert the information in a metafile having meta-information related to the video data, and transmit the information.
5. The transmission apparatus according to claim 4, wherein the metafile is a media presentation description (MPD) file.
6. The transmission apparatus according to claim 1, wherein the circuitry is configured to transmit, together with a profile, the information.
7. The transmission apparatus according to claim 1, wherein the circuitry is configured to insert the layer identification information into the header of a network abstraction layer (NAL) unit of each picture.
8. The transmission apparatus according to claim 1, wherein the circuitry is configured to insert the information into a sequence parameter set (SPS) unit of each picture.
9. The transmission apparatus according to claim 1, wherein the information that specifies the plurality of level designation values includes, for each of the plurality of layer ranges, a maximum layer value that indicates an identifier of an uppermost layer of the respective layer range and a minimum layer value that indicates a lowest layer of the respective layer range.
10. The transmission apparatus according to claim 1, wherein each of the plurality of layer ranges includes each layer below a maximum one of the plurality of layers included in the respective layer range.
11. A transmission method, comprising: classifying, using circuitry, image data of each picture including moving picture data into a plurality of layers, encoding the classified image data of the picture in each of the plurality of layers, and generating video data having the encoded image data of the picture in each of the plurality of layers; transmitting the video data, using the circuitry; transmitting layer identification information in a header of each picture, the layer identification information identifying each of the plurality of layers; and transmitting, using the circuitry, information that specifies a plurality of level designation values, each of the plurality of level designation values having a corresponding one of a plurality of layer ranges of a bit stream, each of the plurality of layer ranges including a common one of the plurality of layers, wherein the layer identification information is different from the information that specifies the plurality of level designation values.
12. A reception apparatus, comprising: circuitry configured to receive video data having encoded image data of pictures in each of a plurality of layers obtained by classifying image data of each picture including moving picture data into the plurality of layers and encoding the image data; receive layer identification information included in a header of each picture, the layer identification information identifying each of the plurality of layers; receive information that specifies a plurality of level designation values, each of the plurality of level designation values having a corresponding one of a plurality of layer ranges of a bit stream, each of the plurality of layer ranges including a common one of the plurality of layers; and extract, from the video data, encoded image data of pictures in a layer lower than a predetermined layer and decode the extracted encoded image data on the basis of the information and the layer identification information, wherein the layer identification information is different from the information that specifies the plurality of level designation values.
13. The reception apparatus according to claim 12, wherein the circuitry is configured to acquire the information from a layer of a container containing the video data.
14. The reception apparatus according to claim 12, wherein the circuitry is configured to acquire the information from a metafile having meta-information related to the video data.
15. A reception method, comprising: receiving, using circuitry, video data having encoded image data of pictures in each of a plurality of layers obtained by classifying image data of each picture including moving picture data into the plurality of layers and encoding the image data; receiving layer identification information in a header of each picture, the layer identification information identifying each of the plurality of layers; receiving, using the circuitry, information that specifies a plurality of level designation values, each of the plurality of level designation values having a corresponding one of a plurality of layer ranges of a bit stream, each of the plurality of layer ranges including a common one of the plurality of layers; and extracting, using the circuitry, from the video data, encoded image data of pictures in a layer lower than a predetermined layer and decoding the extracted encoded image data on the basis of the information and the layer identification information, wherein the layer identification information is different from the information that specifies the plurality of level designation values.
Description
BRIEF DESCRIPTION OF DRAWINGS
MODE FOR CARRYING OUT THE INVENTION
(19) Hereinafter, a mode for carrying out the invention (hereinafter referred to as an “embodiment”) is described. Note that the description is made in the following order:
(20) 1. Embodiment
(21) 2. Modified example
1. Embodiment
(22) [Transceiver System]
(23)
(24) The transmission apparatus 100 transmits a transport stream TS as a container on a broadcast wave. The transport stream TS contains video data having encoded image data of a picture in each of a plurality of layers obtained by classifying image data of each picture constituting moving picture data into the plurality of layers and encoding the image data. In this case, the image data is encoded, for example by H.264/AVC or H.265/HEVC encoding, so that a referred picture belongs to the same layer as the referring picture and/or a lower layer.
(25) Layer identification information is added to the encoded image data of the picture of each of the layers in order to identify the layer to which each picture belongs. In this embodiment, the layer identification information (“nuh_temporal_id_plus1” indicating temporal_id) is arranged in a header part of a NAL unit (nal_unit) of each picture. By adding the layer identification information in this manner, it is possible for a receiving side to selectively extract encoded image data in a layer lower than a predetermined layer and perform decode processing.
(26)
(27) The transport stream TS contains a single video stream. Furthermore, a level designation value of a bit stream and information on a layer range in each of a plurality of layer ranges having a different maximum layer are inserted in the transport stream TS. The information is inserted, for example, under a program map table (PMT).
(28)
(29)
(30)
(31) Here, a value of “level_idc” is described.
(32) Furthermore, for example, “level_idc” corresponding to a service of 2160/50P is “level5.1”, and the value is “153” in decimal and “0x99” in hexadecimal. To indicate this “level5.1”, “9”, which is the lower 4 bits, is to be described as “ls4b_sublayer_level_idc” in an HEVC descriptor, which will be described later. Furthermore, for example, “level_idc” corresponding to a service of 2160/100P is “level5.2”, and the value is “156” in decimal and “0x9c” in hexadecimal.
(33) Furthermore, for example, “level_idc” corresponding to a service of 4320/50P is “level6.1”, and the value is “183” in decimal and “0xb7” in hexadecimal. To indicate this “level6.1”, “7”, which is the lower 4 bits, is to be described as “ls4b_sublayer_level_idc” in an HEVC descriptor, which will be described later. Furthermore, for example, “level_idc” corresponding to a service of 4320/100P is “level6.2”, and the value is “186” in decimal and “0xba” in hexadecimal.
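The level values quoted above are consistent with the HEVC convention that the level designation value equals 30 times the level number. The following Python sketch reproduces that arithmetic. The helper names are hypothetical, and the reconstruction of a sublayer level from its lower 4 bits assumes that the sublayer level shares the upper 4 bits of “level_idc”, which the description does not state explicitly.

```python
def level_idc(major: int, minor: int = 0) -> int:
    """Return the level designation value for level <major>.<minor> (30 x level number)."""
    return 30 * major + 3 * minor


def lower_4_bits(value: int) -> int:
    """Return the lower 4 bits of an 8-bit level designation value."""
    return value & 0x0F


def sublayer_level_from_ls4b(top_level_idc: int, ls4b: int) -> int:
    """Rebuild a sublayer level designation value from its lower 4 bits.

    Assumption (not stated in the description): the sublayer level shares
    the upper 4 bits of the level_idc of the uppermost layer.
    """
    return (top_level_idc & 0xF0) | (ls4b & 0x0F)


LEVELS = {"level5.1": (5, 1), "level5.2": (5, 2), "level6.1": (6, 1), "level6.2": (6, 2)}

for name, (major, minor) in LEVELS.items():
    value = level_idc(major, minor)
    print(f"{name}: {value} (0x{value:02x}), lower 4 bits = 0x{lower_4_bits(value):x}")
```

Under the stated assumption, a receiver that reads “level_idc” = 0x9c and a lower-4-bit value of 9 would recover 0x99, that is, “level5.1”, for the sublayer range.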
(34) The reception apparatus 200 receives the above described transport stream TS transmitted from the transmission apparatus 100 on a broadcast wave or on an internet packet. The reception apparatus 200 extracts the encoded image data of the picture in the layer lower than the predetermined layer from the video data contained in the transport stream TS and decodes the encoded image data according to its own decoding capability. At this time, the reception apparatus 200 performs decoding on the basis of the level designation value of the bit stream and the information on the layer range in each of the layer ranges having a different maximum layer which are inserted in the transport stream TS as described above.
(35) [Configuration of the Transmission Apparatus]
(36)
(37) The encoder 102 inputs decoded moving picture data VB and hierarchically encodes the data. The encoder 102 classifies image data of each picture constituting the moving picture data VB into a plurality of layers. Then, the encoder 102 encodes the classified image data of the picture in each of the layers, and generates a video stream (video data) having the encoded image data of the picture in each of the layers.
(38) The encoder 102 performs encoding, such as H.264/AVC or H.265/HEVC. At this time, the encoder 102 performs encoding so that a picture to be referred (a referred picture) belongs to the same layer as the referring picture and/or a lower layer. The compressed data buffer (cpb: coded picture buffer) 103 temporarily stores the video stream containing the encoded image data of the picture in each of the layers and generated by the encoder 102.
(39) The multiplexer 104 reads and PES-packetizes the video stream stored in the compressed data buffer 103, multiplexes the video stream by transport-packetizing the video stream, and obtains the transport stream TS as a multiplexed stream. The transport stream TS contains a single video stream as described above. The multiplexer 104 inserts, in a layer of a container, the level designation value of the bit stream and the information on the layer range in the layer ranges (level layers) having a different maximum layer. The transmission unit 105 transmits the transport stream TS obtained by the multiplexer 104 to the reception apparatus 200 on a broadcast wave or an internet packet.
(40) [Insertion of Information]
(41) The insertion of information by the multiplexer 104 is further described. To insert the information, an existing HEVC descriptor (HEVC_descriptor) or a newly defined layer signaling descriptor (Layer_signaling descriptor) is used.
(42)
(43) An 8-bit field of “profile_idc” indicates a profile of a bit stream. An 8-bit field of “level_idc” indicates a level designation value of a bit stream in the uppermost layer. A 4-bit field of “ls4b_sublayer_level_idc” indicates a level designation value of a bit stream in a layer lower than the uppermost layer (for example, the layer one level below the uppermost layer). In this case, the lower 4 bits are arranged in hexadecimal.
(44) Furthermore, in the case of “temporal_layer_subset_flag=1”, there exists a 3-bit field of each of “temporal_id_min”, “temporal_id_max”, “temporal_id_sublayer_min”, and “temporal_id_sublayer_max”.
(45) “temporal_id_max” indicates a value of temporal_id of the uppermost layer of the layer range in which the maximum layer is the highest layer, that is, the uppermost layer, and “temporal_id_min” indicates a value of temporal_id of the lowest layer of the layer range.
(46) Furthermore, “temporal_id_sublayer_max” indicates a value of temporal_id of the uppermost layer of the layer range in which the maximum layer is lower than the uppermost layer (normally, the layer one level below the uppermost layer), and “temporal_id_sublayer_min” indicates a value of temporal_id of the lowest layer of the layer range.
(47) For example, a specific example of each field description in the hierarchical encoding example illustrated in
(48) “011” indicating temporal_id=3 is described in the 3-bit field of “temporal_id_max”, and “000” indicating temporal_id=0 is described in the 3-bit field of “temporal_id_min”. Furthermore, “010” indicating temporal_id=2 is described in the 3-bit field of “temporal_id_sublayer_max”, and “000” indicating temporal_id=0 is described in the 3-bit field of “temporal_id_sublayer_min”.
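The field values of this example can be written out as in the following Python sketch. It only converts each temporal_id to the 3-bit pattern quoted above; the helper name `to_3bit` and the dictionary layout are hypothetical, and the position of the fields within the descriptor is not modeled.

```python
def to_3bit(temporal_id: int) -> str:
    """Return the 3-bit binary pattern for a temporal_id in the range 0 to 7."""
    if not 0 <= temporal_id <= 7:
        raise ValueError("temporal_id must fit in a 3-bit field")
    return format(temporal_id, "03b")


# Values taken from the example: the full range covers layers 0 to 3,
# and the sublayer range covers layers 0 to 2.
fields = {
    "temporal_id_min": 0,
    "temporal_id_max": 3,
    "temporal_id_sublayer_min": 0,
    "temporal_id_sublayer_max": 2,
}

encoded = {name: to_3bit(value) for name, value in fields.items()}

for name, bits in encoded.items():
    print(f"{name} = {bits}")
```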
(49)
(50) The 8-bit field of “descriptor_tag” indicates a descriptor type, and indicates a layer signaling descriptor here. An 8-bit field of “descriptor_length” indicates the length (size) of the descriptor, and indicates the following number of bytes as the length of a descriptor.
(51) An 8-bit field of “overall_profile_idc” indicates a profile of the maximum range related to scalable encode tools. An 8-bit field of “highest_level_idc” indicates the maximum level of a scalable range. An 8-bit field of “number_of_profile_layers” indicates the number of profile layers having a scalable function. “number_of_level_layers” indicates the number of level layers.
(52) An 8-bit field of “layer_profile_idc[i]” indicates a profile of each profile layer. An 8-bit field of “layer_level_idc[i][j]” indicates a level of each level layer. An 8-bit field of “temporal_id_layer_min[i][j]” indicates a value of the minimum temporal_id in each level layer. An 8-bit field of “temporal_id_layer_max[i][j]” indicates a value of the maximum temporal_id in each level layer.
(53) For example, a specific example of each field description related to a level in the hierarchical encoding example illustrated in
(54) Then, with regard to a first level layer, “0x9c”, which is the value of “level5.2”, is described in the 8-bit field of “layer_level_idc[i][j]”, “100” indicating temporal_id=4 is described in the 8-bit field of “temporal_id_layer_max[i][j]”, and “000” indicating temporal_id=0 is described in the 8-bit field of “temporal_id_layer_min[i][j]”.
(55) Furthermore, with regard to a second level layer, “0x99”, which is the value of “level5.1”, is described in the 8-bit field of “layer_level_idc[i][j]”, “011” indicating temporal_id=3 is described in the 8-bit field of “temporal_id_layer_max[i][j]”, and “000” indicating temporal_id=0 is described in the 8-bit field of “temporal_id_layer_min[i][j]”.
(56) Furthermore, with regard to a third level layer, “0x96”, which is the value of “level5”, is described in the 8-bit field of “layer_level_idc[i][j]”, “010” indicating temporal_id=2 is described in the 8-bit field of “temporal_id_layer_max[i][j]”, and “000” indicating temporal_id=0 is described in the 8-bit field of “temporal_id_layer_min[i][j]”.
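The three level layers of this example can be represented as in the following Python sketch. The class name `LevelLayer` and the method `to_bytes` are hypothetical. The byte order used for serialization (level, minimum temporal_id, maximum temporal_id) simply follows the order in which the 8-bit fields are introduced above and is an assumption, not a statement of the actual descriptor syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelLayer:
    """One level layer: a level designation value and its temporal_id range."""
    layer_level_idc: int
    temporal_id_layer_min: int
    temporal_id_layer_max: int

    def to_bytes(self) -> bytes:
        # Assumed order of the three 8-bit fields; illustrative only.
        return bytes([self.layer_level_idc,
                      self.temporal_id_layer_min,
                      self.temporal_id_layer_max])


# Values taken from the example above.
LEVEL_LAYERS = [
    LevelLayer(0x9C, 0, 4),  # first level layer: level5.2, temporal_id 0 to 4
    LevelLayer(0x99, 0, 3),  # second level layer: level5.1, temporal_id 0 to 3
    LevelLayer(0x96, 0, 2),  # third level layer: level5, temporal_id 0 to 2
]

for layer in LEVEL_LAYERS:
    print(layer, layer.to_bytes().hex())
```

Every layer range starts at temporal_id=0, so the lowest layer is common to all ranges and only the maximum layer differs.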
(57) Here, a configuration example of a profile layer is described with reference to
(58)
(59)
(60) Note that the added value provided by the scalable extended stream is not limited to the above described improvement in image quality; it also applies to scalable extensions related to an increase in spatial resolution, an expansion of the color gamut, and an expansion of the luminance level. With regard to these streams, by analyzing a packet from a decoder input buffer of a receiver and appropriately discriminating the packet, it is possible to perform desired decoding.
(61) [Configuration of the Transport Stream TS]
(62)
(63) In the encoded image data of each picture, there exists an NAL unit, such as a VPS, SPS, PPS, SLICE, or SEI. As described above, the layer identification information on the picture (“nuh_temporal_id_plus1” indicating temporal_id) is arranged in the header of the NAL unit. “general_level_idc”, which is a level designation value of a bit stream, is inserted in the SPS.
(64) Furthermore, the transport stream TS contains a program map table (PMT) as program specific information (PSI). The PSI describes which program each elementary stream contained in the transport stream belongs to.
(65) In the PMT, there exists a program loop (Program loop) describing information related to an entire program. Furthermore, there exists an elementary loop having information related to each elementary stream in the PMT. In the configuration example, there exists a video elementary loop (video ES1 loop).
(66) In the video elementary loop, information, such as a stream type and a packet identifier (PID), corresponding to a video stream (video PES1), and a descriptor describing information related to the video stream are arranged. As one of the descriptors, the above described HEVC descriptor (HEVC_descriptor) or layer signaling descriptor (Layer_signaling descriptor) is inserted. Note that the layer signaling descriptor is not inserted when the element-added HEVC descriptor illustrated in
(67) The operations of the transmission apparatus 100 illustrated in
(68) The video stream containing the encoded data of the picture in each of the layers and generated by the encoder 102 is supplied to the compressed data buffer (cpb) 103 and temporarily stored.
(69) By the multiplexer 104, the video stream stored in the compressed data buffer 103 is read, PES-packetized, and multiplexed by being transport-packetized, and the transport stream TS as a multiplexed stream is obtained. The transport stream TS contains a single video stream.
(70) When the transport stream TS is generated by the multiplexer 104 in this manner, the level designation value of the bit stream and the information on the layer range in the layer ranges having a different maximum layer are inserted in the layer of the container. For example, the element-added HEVC descriptor (see
(71) [Configuration of the Reception Apparatus]
(72)
(73) The reception unit 202 receives the transport stream TS transmitted from the transmission apparatus 100 on a broadcast wave or on an internet packet. The demultiplexer 203 filters the transport stream TS with a PID filter to extract the TS packets constituting the video stream contained in the transport stream TS, and transmits the extracted TS packets to the compressed data buffer (cpb: coded picture buffer) 204.
(74) Furthermore, the demultiplexer 203 extracts section data from the transport stream TS, and transmits the section data to the CPU 201. The section data contains the above described HEVC descriptor (HEVC_descriptor) and layer signaling descriptor (Layer_signaling descriptor). The CPU 201 determines, from the layer ranges indicated by these descriptors with the level designation value of the bit stream and the information on the layer range, the layer range which the decoder 205 can decode, and transmits the information on the temporal ID (temporal_id) of that layer range to the decoder 205.
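The determination made by the CPU 201 can be sketched in Python as follows. This is a minimal illustration, not the claimed implementation: it assumes that the decoding capability is expressed as the highest level designation value the decoder supports and that a layer range is decodable when its level designation value does not exceed that capability. The function name and the tuple layout are hypothetical; the level layer values are those of the layer signaling descriptor example.

```python
from typing import Optional, Sequence, Tuple

# Each entry: (level designation value, minimum temporal_id, maximum temporal_id)
LevelLayer = Tuple[int, int, int]


def select_decodable_range(level_layers: Sequence[LevelLayer],
                           decoder_max_level_idc: int) -> Optional[Tuple[int, int]]:
    """Return the (min, max) temporal_id of the widest decodable layer range.

    Assumption: a range is decodable when its level designation value is
    less than or equal to the highest level the decoder supports.
    """
    candidates = [layer for layer in level_layers if layer[0] <= decoder_max_level_idc]
    if not candidates:
        return None  # no signaled layer range fits the decoder
    _, tid_min, tid_max = max(candidates, key=lambda layer: layer[0])
    return (tid_min, tid_max)


LEVEL_LAYERS = [(0x9C, 0, 4), (0x99, 0, 3), (0x96, 0, 2)]

print(select_decodable_range(LEVEL_LAYERS, 0x9C))  # decoder supporting level5.2
print(select_decodable_range(LEVEL_LAYERS, 0x99))  # decoder supporting level5.1
```

The returned pair corresponds to the information on the temporal ID of the decodable layer range that is passed to the decoder 205.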
(75) Furthermore, the demultiplexer 203 extracts a program clock reference (PCR) from the TS packet containing the PCR, and transmits the PCR to the CPU 201. Furthermore, the demultiplexer 203 extracts time stamps (DTS and PTS) inserted in a PES header for each picture, and transmits the time stamps to the CPU 201.
(76) The compressed data buffer (cpb) 204 temporarily stores the encoded image data of each picture according to the TS packet transferred from the demultiplexer 203. The decoder 205 reads and decodes the encoded image data of each picture stored in the compressed data buffer 204 at a decode timing supplied by a decoding time stamp (DTS) of the picture, and transmits the decoded image data to the decompressed data buffer (dpb: decoded picture buffer) 206. At this time, the decoder 205 selectively decodes only the encoded image data of the picture contained in the decodable layer range on the basis of the information on the temporal ID (temporal_id) of the decodable layer range supplied by the CPU 201.
(77)
(78) Furthermore, the decoder 205 includes a temporal ID analysis unit 205a and a decode processing unit 205b. The temporal ID analysis unit 205a sequentially reads the encoded data of each picture stored in the compressed data buffer 204 at the decode timing, and analyzes the information on the temporal ID (temporal_id) inserted in the NAL unit header. Then, the temporal ID analysis unit 205a transfers the encoded data to the decode processing unit 205b when determining the encoded data is within the decodable layer range, and discards the encoded data without transferring the encoded data to the decode processing unit 205b when determining the encoded data is not within the decodable layer range. Note that the information on the temporal ID (temporal_id) of the decodable layer range is supplied to the temporal ID analysis unit 205a by the CPU 201.
(79) For example, the case of the hierarchical encoding example of
(80) On the other hand, when the decoder 205 is a 50p decoder, “0 to 2” is supplied to the temporal ID analysis unit 205a as the information on the temporal ID of the decodable layer range by the CPU 201. Thus, the temporal ID analysis unit 205a transmits the encoded image data of the pictures in the layers 0 to 2 to the decode processing unit 205b. In contrast, the temporal ID analysis unit 205a discards the encoded image data of the picture in the layer 3 without transmitting the encoded image data to the decode processing unit 205b.
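The behavior of the temporal ID analysis unit 205a can be sketched in Python as follows, assuming H.265/HEVC NAL units, whose two-byte header carries “nuh_temporal_id_plus1” in the lowest 3 bits of the second byte. The function names and the `make_nal` test helper are hypothetical, and start codes and payload parsing are not modeled.

```python
from typing import Iterable, List, Tuple


def temporal_id_of(nal_unit: bytes) -> int:
    """Read temporal_id from the two-byte HEVC NAL unit header."""
    if len(nal_unit) < 2:
        raise ValueError("NAL unit is shorter than its 2-byte header")
    nuh_temporal_id_plus1 = nal_unit[1] & 0x07  # lowest 3 bits of the second byte
    return nuh_temporal_id_plus1 - 1


def filter_nal_units(nal_units: Iterable[bytes],
                     decodable_range: Tuple[int, int]) -> List[bytes]:
    """Keep only NAL units whose temporal_id lies within the decodable range."""
    tid_min, tid_max = decodable_range
    return [nal for nal in nal_units if tid_min <= temporal_id_of(nal) <= tid_max]


def make_nal(nal_unit_type: int, temporal_id: int, payload: bytes = b"") -> bytes:
    """Build a minimal NAL unit with nuh_layer_id = 0 (hypothetical test helper)."""
    header = bytes([(nal_unit_type & 0x3F) << 1, (temporal_id + 1) & 0x07])
    return header + payload


# Pictures in layers 0 to 3; a 50p decoder is given the range "0 to 2".
stream = [make_nal(1, tid, b"\x00") for tid in (0, 3, 2, 3, 1, 3)]
kept = filter_nal_units(stream, (0, 2))
print([temporal_id_of(nal) for nal in kept])
```

With the range “0 to 2”, the NAL units of layer 3 are dropped before they reach the decode processing, which mirrors the discard path described above.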
(81) Returning back to
(82) For example, when the frame rate of the image data of each picture after decoding is 50 fps and the display capability is 100 fps, the post-processing unit 207 performs interpolation processing on the image data of each picture after decoding so that the resolution in the time direction is doubled, and transmits the image data of 100 fps to the display unit 208.
(83) The display unit 208 is constituted by, for example, a liquid crystal display (LCD), an organic electro-luminescence (EL) panel, or the like. Note that, the display unit 208 may be an external device connected to the reception apparatus 200.
(84)
(85) Next, in step ST3, the decoder 205 determines whether the temporal ID (temporal_id) detected in step ST2 is within the decodable range. When the temporal ID is not within the decodable range, the decoder 205 does not perform the decode processing, and returns to the processing in step ST2. On the other hand, when the temporal ID is within the decodable range, the decoder 205 moves to the processing in step ST4. In step ST4, the decoder 205 performs the decode processing, and transfers the image data of the picture after decoding to the decompressed data buffer (dpb) 206.
(86) Next, in step ST5, the post-processing unit 207 reads, from the decompressed data buffer (dpb) 206, the image data of the picture to be displayed at the display timing. Next, in step ST6, the post-processing unit 207 determines whether a display frequency and a read frequency from the decompressed data buffer (dpb) 206 are different. When the frequencies are different, in step ST7, the post-processing unit 207 adjusts the read frequency to the display frequency by performing frame interpolation or thinning of the picture. After the processing in step ST7, the processing is terminated in step ST8. Furthermore, when the frequencies are not different in step ST6, the processing is immediately terminated in step ST8.
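Steps ST6 and ST7 can be illustrated by the following Python sketch. It is a simplification under stated assumptions: frame interpolation is approximated by repeating frames rather than synthesizing intermediate pictures, both frequencies are integers, and the function name `adjust_frame_rate` is hypothetical.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def adjust_frame_rate(frames: Sequence[Frame], read_fps: int, display_fps: int) -> List[Frame]:
    """Match the read frequency to the display frequency.

    Frames are repeated when display_fps is higher (a stand-in for frame
    interpolation) and thinned when display_fps is lower.
    """
    if read_fps <= 0 or display_fps <= 0:
        raise ValueError("frame rates must be positive")
    if read_fps == display_fps:
        return list(frames)  # ST6: frequencies do not differ, nothing to do
    out_count = len(frames) * display_fps // read_fps
    # ST7: map each output position back to the nearest earlier input frame.
    return [frames[i * read_fps // display_fps] for i in range(out_count)]


print(adjust_frame_rate(["p0", "p1", "p2"], 50, 100))        # repetition: 50 fps to 100 fps
print(adjust_frame_rate(["p0", "p1", "p2", "p3"], 100, 50))  # thinning: 100 fps to 50 fps
```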
(87) The operations of the reception apparatus 200 illustrated in
(88) Furthermore, the demultiplexer 203 extracts the section data from the transport stream TS and transmits the section data to the CPU 201. The CPU 201 determines the layer range which the decoder 205 can decode from the layer ranges described by the HEVC descriptor or the layer signaling descriptor with the level designation value of the bit stream and the information on the layer range, and transmits the information on the temporal ID (temporal_id) of the layer range to the decoder 205.
(89) The decoder 205 decodes the encoded image data of each picture stored in the compressed data buffer 204 at the decode timing of the picture, and transmits the decoded image data to the decompressed data buffer (dpb) 206, where the decoded image data is temporarily stored. In this case, the decoder 205 selectively decodes only the encoded image data of the pictures within the decodable layer range on the basis of the information on the temporal ID (temporal_id) of the decodable layer range supplied by the CPU 201.
(90) The image data of each picture stored in the decompressed data buffer (dpb) 206 is sequentially read at the display timing, and transmitted to the post-processing unit 207. The post-processing unit 207 performs interpolation, subsampling, or thinning on the image data of each picture to adjust the frame rate to the display capability. The image data of each picture processed by the post-processing unit 207 is supplied to the display unit 208, and the moving picture is displayed with the image data of each picture.
(91) As described above, in the transceiver system 10 illustrated in
2. Modified Example
(92) [Application to the MPEG-DASH-Based Stream Distribution System]
(93) Note that, in the above described embodiment, the example in which the container is a transport stream (MPEG-2 TS) has been described. However, the present technology can be similarly applied to a system having a configuration in which a stream is distributed to a reception terminal using a network, such as the internet. In an internet distribution, a stream is mainly distributed by a container of an MP4 or other formats.
(94)
(95) The DASH stream file server 31 generates, on the basis of media data (video data, audio data, subtitle data, or the like) of predetermined content, a stream segment conforming to DASH (hereinafter, appropriately referred to as a “DASH segment”), and transmits the segment in response to an HTTP request from the receiver. The DASH stream file server 31 may be a streaming-dedicated server, or may also serve as a web server.
(96) Furthermore, the DASH stream file server 31 transmits, in response to a request of a segment of a predetermined stream transmitted from the receiver 33 (33-1, 33-2, . . . and 33-N) through the CDN 34, the segment of the stream to the receiver, which is the request source, through the CDN 34. In this case, the receiver 33 performs the request by referring to the value of the rate described in a media presentation description (MPD) file and selecting a stream of an optimal rate according to a network environment where a client is placed.
(97) The DASH MPD server 32 is a server to generate an MPD file to acquire the DASH segment generated by the DASH stream file server 31. The MPD file is generated based on content metadata from a content management server (not illustrated) and an address (url) of the segment generated by the DASH stream file server 31.
(98) In an MPD format, using an element of representation for each stream of a video and audio, each attribute is described. For example, by separating the representation for each of a plurality of video data streams having a different rate, each rate is described in the MPD file. The receiver 33 can select an optimal stream according to the conditions of the network environment where the receiver 33 is placed by referring to the value of the rate as described above.
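The rate-based selection performed by the receiver 33 can be sketched in Python as follows. The representation list is illustrative data, not taken from an actual MPD file, and the key names `id` and `rate_bps` as well as the function name are hypothetical. The policy shown (highest rate that fits the measured network throughput, otherwise the lowest rate) is one plausible reading of “optimal”, which the description does not define further.

```python
from typing import Dict, List


def select_representation(representations: List[Dict], available_bps: int) -> Dict:
    """Pick the highest-rate representation that fits the available throughput.

    Falls back to the lowest-rate representation when none fits.
    """
    if not representations:
        raise ValueError("at least one representation is required")
    affordable = [r for r in representations if r["rate_bps"] <= available_bps]
    if affordable:
        return max(affordable, key=lambda r: r["rate_bps"])
    return min(representations, key=lambda r: r["rate_bps"])


# Illustrative representations, one per video data stream having a different rate.
REPRESENTATIONS = [
    {"id": "video-low", "rate_bps": 2_000_000},
    {"id": "video-mid", "rate_bps": 8_000_000},
    {"id": "video-high", "rate_bps": 20_000_000},
]

print(select_representation(REPRESENTATIONS, 9_000_000)["id"])
```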
(99)
(100)
(101) As illustrated in
(102) As illustrated in
(103) There exist information and the like on an address (url) to actually acquire segment data, such as a video or audio, in the media segment.
(104) Note that stream switching can be freely performed between the representations grouped by AdaptationSet. Thus, according to the conditions of a network environment where an IPTV client is placed, it is possible to select a stream of an optimal rate, and perform seamless moving picture distribution.
(105)
(106) The present technology can be applied to the stream distribution systems 30 and 30A illustrated in
(107) Furthermore, the transceiver system 10 constituted by the transmission apparatus 100 and the reception apparatus 200 has been described in the above described embodiment. However, the transceiver system to which the present technology can be applied is not limited to this. For example, the part of the reception apparatus 200 may be a set top box and a monitor connected by a digital interface, such as High-Definition Multimedia Interface (HDMI). Note that “HDMI” is a registered trademark.
(108) Furthermore, the present technology can also have the following configurations:
(109) (1) A transmission apparatus includes an image encoding unit which classifies image data of each picture constituting moving picture data into a plurality of layers, encodes the classified image data of the picture in each of the plurality of layers, and generates video data having the encoded image data of the picture in each of the plurality of layers,
(110) a data transmission unit which transmits the video data,
(111) and an information transmission unit which transmits a level designation value of a bit stream and information on a layer range in each of a plurality of layer ranges having a different maximum layer.
(112) (2) The transmission apparatus according to the (1),
(113) in which the information transmission unit inserts the information in a layer of a container containing the video data and transmits the information.
(114) (3) The transmission apparatus according to the (2),
(115) in which the container is an MPEG2-TS,
(116) and the information transmission unit inserts the information under a program map table and transmits the information.
(117) (4) The transmission apparatus according to the (1),
(118) in which the information transmission unit inserts the information in a metafile having meta-information related to the video data, and transmits the information.
(119) (5) The transmission apparatus according to the (4),
(120) in which the metafile is an MPD file.
(121) (6) The transmission apparatus according to any one of the (1) to (5),
(122) in which the information transmission unit transmits, together with information on a profile, the level designation value of the bit stream and the information on the layer range in each of the plurality of layer ranges having a different maximum layer.
(123) (7) A transmission method includes an image encoding step for classifying image data of each picture constituting moving picture data into a plurality of layers, encoding the classified image data of the picture in each of the plurality of layers, and generating video data having the encoded image data of the picture in each of the plurality of layers,
(124) a data transmitting step for transmitting the video data,
(125) and an information transmitting step for transmitting a level designation value of a bit stream and information on a layer range in each of a plurality of layer ranges having a different maximum layer.
(126) (8) A reception apparatus includes a data reception unit which receives video data having encoded image data of a picture in each of a plurality of layers obtained by classifying image data of each picture constituting moving picture data into the plurality of layers and encoding the image data,
(127) an information reception unit which receives a level designation value of a bit stream and information on a layer range in each of a plurality of layer ranges having a different maximum layer,
(128) and a processing unit which extracts, from the video data, the encoded image data of the picture in a layer lower than a predetermined layer and decodes the encoded image data on the basis of the information.
(129) (9) The reception apparatus according to the (8),
(130) in which the information reception unit acquires the information from a layer of a container containing the video data.
(131) (10) The reception apparatus according to the (8),
(132) in which the information reception unit acquires the information from a metafile having meta-information related to the video data.
(133) (11) A reception method includes a data receiving step for receiving video data having encoded image data of a picture in each of a plurality of layers obtained by classifying image data of each picture constituting moving picture data into the plurality of layers and encoding the image data,
(134) an information receiving step for receiving a level designation value of a bit stream and information on a layer range in each of a plurality of layer ranges having a different maximum layer,
(135) and a processing step for extracting, from the video data, the encoded image data of the picture in a layer lower than a predetermined layer and decoding the encoded image data on the basis of the information.
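The receiving-side processing of configurations (8) and (11) can be illustrated by the following minimal sketch. It is not the implementation of the embodiment. The names `LayerRange`, `Picture`, `select_layer_range`, and `extract_pictures` are hypothetical, the numeric level designation values are illustrative only, and it is assumed here that a decoder can handle a layer range when the level designation value signalled for that range does not exceed the level the decoder supports, and that pictures up to and including the maximum layer of the selected range are kept.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LayerRange:
    """Signalled entry: a level designation value and the layer range it covers."""
    level_idc: int   # level designation value of the bit stream for this range
    min_layer: int   # lowest layer in the range (hypothetical field name)
    max_layer: int   # highest layer in the range (hypothetical field name)


@dataclass(frozen=True)
class Picture:
    """Encoded picture carrying layer identification information in its header."""
    layer_id: int
    payload: bytes


def select_layer_range(ranges: List[LayerRange],
                       decoder_level_idc: int) -> Optional[LayerRange]:
    """Pick the decodable range with the highest maximum layer, or None."""
    # Assumption: a range is decodable when its level does not exceed the decoder's.
    decodable = [r for r in ranges if r.level_idc <= decoder_level_idc]
    if not decodable:
        return None
    return max(decodable, key=lambda r: r.max_layer)


def extract_pictures(pictures: List[Picture], chosen: LayerRange) -> List[Picture]:
    """Keep only the pictures whose layer lies inside the chosen range."""
    return [p for p in pictures
            if chosen.min_layer <= p.layer_id <= chosen.max_layer]


# Illustrative signalling: two layer ranges sharing the lowest layer.
signalled = [LayerRange(level_idc=153, min_layer=0, max_layer=3),
             LayerRange(level_idc=156, min_layer=0, max_layer=5)]
stream = [Picture(layer_id=i, payload=b"") for i in range(6)]

chosen = select_layer_range(signalled, decoder_level_idc=153)
if chosen is not None:
    to_decode = extract_pictures(stream, chosen)
```

In this sketch a receiver whose decoder supports only the lower of the two illustrative levels selects the first range and passes the pictures of layers 0 to 3 to the decoder, while a receiver supporting the higher level would select the second range and decode all layers.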
(136) The main feature of the present technology is that, by transmitting a level designation value of a bit stream and information on a layer range in each of a plurality of layer ranges having a different maximum layer when hierarchically encoded video data is transmitted, a receiving side can easily decode the encoded image data of the pictures in the layer range that matches its decoding performance (see
REFERENCE SIGNS LIST
(137)
10 Transceiver system
30, 30A MPEG-DASH-based stream distribution system
31 DASH stream file server
32 DASH MPD server
33-1, 33-2, . . . , 33-N, 35-1, 35-2, . . . , 35-M Receiver
34 Content delivery network (CDN)
36 Broadcast transmission system
100 Transmission apparatus
101 CPU
102 Encoder
103 Compressed data buffer (cpb)
104 Multiplexer
105 Transmission unit
200 Reception apparatus
201 CPU
202 Reception unit
203 Demultiplexer
203a Video multiplexing buffer
203b Section data buffer
204 Compressed data buffer (cpb)
205 Decoder
205a Temporal ID analysis unit
205b Decode processing unit
206 Decompressed data buffer (dpb)
207 Post-processing unit
208 Display unit