Transmission device, transmission method, reception device, and reception method
10542304 · 2020-01-21
Assignee
Inventors
Cpc classification
H04N21/8543
ELECTRICITY
H04N21/4312
ELECTRICITY
International classification
H04N21/235
ELECTRICITY
H04N21/431
ELECTRICITY
H04N21/8543
ELECTRICITY
Abstract
A video encoder generates a video stream including image data. A subtitle encoder generates a subtitle stream including subtitle information. An adjustment information insertion unit inserts luminance level adjustment information for adjusting the luminance level of a subtitle, into the video stream and/or the subtitle stream. A transmission unit transmits a container in a predetermined format containing the video stream and the subtitle stream. The reception side is enabled to perform subtitle luminance level adjustment.
Claims
1. A transmission device, comprising: processing circuitry configured to generate a video stream including image data, generate a subtitle stream including subtitle information, insert luminance level adjustment information into at least one of the video stream and the subtitle stream, the luminance level adjustment information including an average luminance value generated in accordance with the image data and including a threshold value indicating whether luminance of a subtitle foreground or a subtitle background in the subtitle stream is to be adjusted, and transmit a container containing the video stream and the subtitle stream.
2. The transmission device according to claim 1, wherein the luminance level adjustment information is luminance level adjustment information corresponding to an entire area of a screen and/or luminance level adjustment information corresponding to respective partition regions obtained by dividing the entire area of the screen by a predetermined number.
3. The transmission device according to claim 2, wherein the luminance level adjustment information is inserted into the video stream and includes a maximum luminance value and a minimum luminance value, the maximum luminance value and the minimum luminance value being generated in accordance with the image data.
4. The transmission device according to claim 3, wherein the threshold value of the luminance level adjustment information inserted into the video stream further includes a high-luminance threshold value, a low-luminance threshold value, and an average-luminance threshold value, the high-luminance threshold value, the low-luminance threshold value, and the average-luminance threshold value being set in accordance with electro-optical transfer function characteristics.
5. The transmission device according to claim 2, wherein the luminance level adjustment information is inserted into the subtitle stream and includes subtitle luminance range limit information.
6. The transmission device according to claim 5, wherein the threshold value of the luminance level adjustment information inserted into the subtitle stream further includes a high-luminance threshold value, a low-luminance threshold value, and an average-luminance threshold value, the high-luminance threshold value, the low-luminance threshold value, and the average-luminance threshold value being set in accordance with electro-optical transfer function characteristics.
7. The transmission device according to claim 5, wherein the luminance level adjustment information inserted into the subtitle stream further includes color space information.
8. The transmission device according to claim 1, wherein the processing circuitry is further configured to generate the subtitle stream in accordance with subtitle text information in Timed Text Markup Language (TTML), and insert the luminance level adjustment information using an element of metadata in a header of a TTML structure.
9. The transmission device according to claim 1, wherein the processing circuitry is further configured to generate the subtitle stream in accordance with subtitle text information in Timed Text Markup Language (TTML), and insert the luminance level adjustment information using an element of styling extension in a header of a TTML structure.
10. The transmission device according to claim 1, wherein the processing circuitry is further configured to generate the subtitle stream having segments as components, and insert a segment containing the luminance level adjustment information into the subtitle stream.
11. The transmission device according to claim 3, wherein the processing circuitry is further configured to insert identification information into the container, the identification information indicating that there is the luminance level adjustment information inserted in the video stream.
12. The transmission device according to claim 5, wherein the processing circuitry is further configured to insert identification information into the container, the identification information indicating that there is the luminance level adjustment information inserted in the subtitle stream.
13. The transmission device according to claim 12, wherein the identification information includes information indicating an insertion position of the luminance level adjustment information in the subtitle stream.
14. A transmission method, comprising: generating, by processing circuitry, a video stream including image data; generating, by the processing circuitry, a subtitle stream including subtitle information; inserting, by the processing circuitry, luminance level adjustment information into at least one of the video stream and the subtitle stream, the luminance level adjustment information including an average luminance value generated in accordance with the image data and including a threshold value indicating whether luminance of a subtitle foreground or a subtitle background in the subtitle stream is to be adjusted; and transmitting, by the processing circuitry, a container containing the video stream and the subtitle stream.
15. A reception device, comprising: processing circuitry configured to receive a container containing a video stream including image data and a subtitle stream including subtitle information, obtain image data by performing a decoding process on the video stream, obtain bitmap data of a subtitle by performing a decoding process on the subtitle stream, perform a luminance level adjustment process on the bitmap data in accordance with luminance level adjustment information inserted in at least one of the video stream and the subtitle stream, the luminance level adjustment information including an average luminance value generated in accordance with the image data and including a threshold value indicating whether luminance of a subtitle foreground or a subtitle background in the subtitle stream is to be adjusted, and superimpose bitmap data obtained after the luminance level adjustment on the obtained image data.
16. The reception device according to claim 15, wherein the processing circuitry is further configured to perform the luminance level adjustment using the luminance level adjustment information inserted in at least one of the video stream and the subtitle stream.
17. The reception device according to claim 15, wherein the processing circuitry is further configured to generate the luminance level adjustment information, and perform the luminance level adjustment using the generated luminance level adjustment information.
18. A reception method, comprising: receiving, by processing circuitry, a container containing a video stream including image data and a subtitle stream including subtitle information; obtaining, by the processing circuitry, image data by performing a decoding process on the video stream; obtaining, by the processing circuitry, bitmap data of a subtitle by performing a decoding process on the subtitle stream; performing, by the processing circuitry, a luminance level adjustment process on the bitmap data in accordance with luminance level adjustment information inserted in at least one of the video stream and the subtitle stream, the luminance level adjustment information including an average luminance value generated in accordance with the image data and including a threshold value indicating whether luminance of a subtitle foreground or a subtitle background in the subtitle stream is to be adjusted; and superimposing, by the processing circuitry, bitmap data obtained after the luminance level adjustment on the image data.
Description
MODE FOR CARRYING OUT THE INVENTION
(34) The following is a description of a mode for embodying the present technology (the mode will be hereinafter referred to as the embodiment). Explanation will be made in the following order.
(35) 1. Embodiment
(36) 2. Modifications
1. Embodiment
(37) [Example Configuration of a Transmission/Reception System]
(38)
(39) The transmission device 100 generates an MPEG2 transport stream TS as a container, and transmits this transport stream TS in the form of broadcast waves or a network packet. This transport stream TS contains a video stream including image data. This transport stream TS also contains a subtitle stream including subtitle information. Luminance level adjustment information for adjusting the luminance level of a subtitle is inserted into the video stream and/or the subtitle stream.
(40) The reception device 200 receives the transport stream TS transmitted from the transmission device 100. The reception device 200 obtains the image data by performing a decoding process on the video stream, and obtains bitmap data of the subtitle by performing a decoding process on the subtitle stream. Furthermore, in accordance with the luminance level adjustment information inserted into the video stream and/or the subtitle stream, the reception device 200 performs a luminance level adjustment process on bitmap data of the subtitle, and superimposes the adjusted bitmap data on the image data. It should be noted that, if the video stream and/or the subtitle stream does not have the luminance level adjustment information inserted therein, the reception device 200 generates luminance level adjustment information and uses the luminance level adjustment information.
(41)
(42) In the subtitle luminance level adjustment, the luminance level of the entire subtitle is adjusted in accordance with the luminance (the maximum luminance, the minimum luminance, and the average luminance) of the background image, and the subtitle luminance range is limited to a range R. Bordered subtitles are normally used as subtitles. A bordered subtitle has a rectangular border portion surrounding the text portion. The subtitle luminance range in this case means the luminance range of the entire region including both the text portion and the border portion.
(43) It should be noted that a rimmed subtitle may also be used as a subtitle, and is subjected to luminance level adjustment like a bordered subtitle. In this case, the rim portion is equivalent to the border portion. In this embodiment, a bordered subtitle is taken as an example and will be described below.
(44) In a case where the entire image is bright as shown in
(45) [Luminance Level Adjustment Information]
(46) The luminance level adjustment information to be transmitted from the transmission side is now described. The luminance level adjustment information includes luminance level adjustment information corresponding to an entire screen as shown in
(47) The maximum luminance value global_content_level_max, the minimum luminance value global_content_level_min, and the average luminance value global_content_level_ave that correspond to the entire screen are inserted into the video stream, and the maximum luminance values partition_content_level_max, the minimum luminance values partition_content_level_min, and the average luminance values partition_content_level_ave that correspond to the respective partitions are also inserted into the video stream. These values are obtained in accordance with the image data. It should be noted that the values corresponding to both the entire screen and the respective partitions are not necessarily inserted, and the values corresponding to either the entire screen or the respective partitions may be inserted.
(48) Further, a high-luminance threshold value Th_max, a low-luminance threshold value Th_min, and an average-luminance threshold value Th_ave for determining how the subtitle luminance is to be adjusted on the reception side are inserted into the video stream. These values are obtained in accordance with electro-optical transfer function characteristics (EOTF characteristics).
(49) A curve a in
(50) The high-luminance threshold value Th_max, the low-luminance threshold value Th_min, and the average-luminance threshold value Th_ave, which have been described above, are also inserted into the subtitle stream. It should be noted that these values need not be inserted into the subtitle stream, as long as they are inserted into the video stream. Subtitle luminance range limit information renderingdrange is also inserted into the subtitle stream. Color space information colorspace is further inserted into the subtitle stream.
(51) The luminance level adjustment information described above is inserted as an SEI message, for example, into the video stream. Therefore, the luminance level adjustment information is inserted into the video stream on a picture-by-picture basis, for example, as shown in
(52) It should be noted that the luminance level adjustment information may be inserted on a GOP-by-GOP basis or on some other basis. The luminance level adjustment information described above is also inserted into the subtitle stream on a subtitle display basis, for example.
(53) [Subtitle Luminance Level Adjustment]
(54) The subtitle luminance level adjustment on the reception side is now described. In a case where a subtitle is superimposed on an HDR image as the background image, the luminance contrast between the background image and the subtitle is great on the display, and objects with a large luminance difference coexist in the screen, leading to visual fatigue. To prevent that, the subtitle luminance level is adjusted while the HDR effect of the background image is maintained. In that case, the foreground region in the text portion of the subtitle and the background region in the border portion are controlled separately from each other, in accordance with the luminance in the background image.
(55) Referring now to
(56) Referring now to
(57) Referring now to
(58) Referring now to
(59) Subtitle luminance level adjustment is performed with global parameters for the screen in some cases, and is performed with parameters for the respective partitions in some other cases. First, subtitle luminance level adjustment with global parameters for the screen is described. As shown in the chart in
(60) Here, how the luminances Lf and Lb are determined is described. In a conventional method of transmitting subtitle information in the form of text, colors are designated with six-digit color codes or color names such as Red, Green, Blue, and White.
(61) In the example of TTML, color indicates the color of the foreground region that is the text portion of a subtitle, and backgroundColor indicates the color of the background region that is the border portion of the subtitle. The example shown in
(62) As described above, the subtitle color information transmission is conducted for the foreground region and the background region separately from each other, but is often performed in an RGB domain for either of the regions. In an RGB domain, the relationship between visibility and luminance is not a linear relationship. Therefore, subtitle luminance level adjustment is performed through a transfer from the RGB domain to a YCbCr domain (a luminance/chrominance domain) in the conversion described below.
(63) Color conversion depends on color spaces, and the expressions for converting the chromaticity values of R, G, and B into a luminance Y in the respective color spaces BT.709 and BT.2020, for example, are shown below.
Y = 0.212R + 0.715G + 0.072B (in the case of BT.709)
Y = 0.262R + 0.678G + 0.059B (in the case of BT.2020)
(64) On the reception side, color conversion is performed on the color information (R, G, B) about the foreground region and the background region of the subtitle, so that the luminances Lf and Lb are determined. As described above, color conversion depends on color spaces. Therefore, in this embodiment, the color space information colorspace about the color information (R, G, B) is inserted into the subtitle stream.
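The conversion from color information (R, G, B) to the luminances Lf and Lb can be sketched as follows. This is a minimal illustration, not the receiver's actual implementation: the function names, the color space keys "bt709" and "bt2020", the normalization of each channel to the range 0 to 1, and the example color codes are all assumptions made for the example. The weights are the three-decimal values quoted in the text.

```python
# Luma weights as quoted in the text (three decimals), keyed by color space.
# The key strings are illustrative; the text does not define colorspace values.
LUMA_WEIGHTS = {
    "bt709": (0.212, 0.715, 0.072),
    "bt2020": (0.262, 0.678, 0.059),
}

def parse_hex_color(code):
    """Turn a six-digit color code such as '#ffffff' into floats in 0..1."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) / 255.0 for i in (0, 2, 4))

def luminance(rgb, colorspace):
    """Return Y for an (R, G, B) triple using the weights of the color space."""
    wr, wg, wb = LUMA_WEIGHTS[colorspace]
    r, g, b = rgb
    return wr * r + wg * g + wb * b

# Hypothetical subtitle: white text (foreground) on a black border (background).
lf = luminance(parse_hex_color("#ffffff"), "bt709")
lb = luminance(parse_hex_color("#000000"), "bt709")
```

Because the weights differ between the two color spaces, the same (R, G, B) triple yields a different Y in each, which is why the color space has to be known before Lf and Lb are computed.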
(65) It should be noted that subtitle information may be transmitted in the form of bitmap data, instead of text. In this case, the luminances Lf and Lb can be obtained from CLUT outputs on the reception side.
(66) The subtitle luminance level adjustment in the bright scene of type a shown in
(67) In this case, as shown in
(68) Next, the subtitle luminance level adjustment in the dark scene of type b shown in
(69) In this case, as shown in
(70) Next, the subtitle luminance level adjustment in the dark scene with a high-luminance portion of type c shown in
(71) In this case, as shown in
(72) Next, the subtitle luminance level adjustment in the bright scene with a low-luminance portion of type d shown in
(73) In this case, as shown in
(74) Next, subtitle luminance level adjustment with parameters for the respective partitions is described. The maximum luminance value, the minimum luminance value, and the average luminance value of the entire screen cannot indicate the local luminance distribution. When the maximum luminance values, the minimum luminance values, and the average luminance values of the respective partitions are used, finer subtitle luminance level adjustment can be performed.
(75) Here, the screen is divided into the 24 partitions P0 through P23, and a subtitle is superimposed across eight partitions A, B, C, D, E, F, G, and H, as shown in
(76) As shown in
(77) In this case, for each partition, determinations similar to those in the subtitle luminance level adjustment with the above described parameters for the screen are made, and the final determinations are made in accordance with the rule of majority or degrees of priority, for example. In the example shown in the drawing, the determinations as to the partition C are employed (see
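The step of combining per-partition determinations into a final determination can be sketched as below. The text names the rule of majority and degrees of priority only as examples, so this is one possible reading: the decision labels ("adjust", "keep"), the function name, and the tie-breaking order are hypothetical and are not taken from the drawings.

```python
from collections import Counter

def decide_by_majority(partition_decisions, priority=()):
    """Pick the most common per-partition decision.

    partition_decisions maps a partition name to its decision label.
    When several labels tie, the first tied label found in `priority` wins;
    with no applicable priority, the alphabetically first tied label is used
    so that the result stays deterministic.
    """
    counts = Counter(partition_decisions.values())
    best = max(counts.values())
    tied = [label for label, n in counts.items() if n == best]
    if len(tied) == 1:
        return tied[0]
    for label in priority:
        if label in tied:
            return label
    return sorted(tied)[0]

# Hypothetical determinations for the eight partitions under the subtitle.
decisions = {"A": "keep", "B": "keep", "C": "adjust", "D": "keep",
             "E": "keep", "F": "adjust", "G": "keep", "H": "adjust"}
final = decide_by_majority(decisions, priority=("adjust", "keep"))
```

A pure priority scheme, in which one designated partition always decides, would replace the vote with a direct lookup of that partition's decision.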
(78) [Example Configuration of the Transmission Device]
(79)
(80) The control unit 101 is designed to include a central processing unit (CPU), and controls operations of the respective components of the transmission device 100 in accordance with a control program. The HDR camera 102 images an object, and outputs high dynamic range (HDR) video data (image data). This HDR video data has a contrast ratio of 0 to 100%*N (N being a number greater than 1) exceeding the luminance of the white peak of a conventional SDR image, such as a contrast ratio of 0 to 1000%. Here, the 100% level is equivalent to the luminance value of white, which is 100 cd/m², for example.
(81) A master monitor 103a is a monitor for grading the HDR video data obtained by the HDR camera 102. This master monitor 103a has a display luminance level corresponding to the HDR video data or suitable for grading the HDR video data.
(82) The HDR photoelectric conversion unit 103 applies HDR opto-electric transfer function characteristics to the HDR video data obtained by the HDR camera 102, to acquire transmission video data V1. The RGB/YCbCr conversion unit 104 converts the transmission video data V1 from an RGB domain to a YCbCr (luminance/chrominance) domain.
(83) In accordance with the transmission video data V1 converted to the YCbCr domain, the luminance level calculation unit 106 calculates, for each picture, the maximum luminance value global_content_level_max, the minimum luminance value global_content_level_min, and the average luminance value global_content_level_ave that correspond to the entire screen, and the maximum luminance values partition_content_level_max, the minimum luminance values partition_content_level_min, and the average luminance values partition_content_level_ave that correspond to the respective partition regions (partitions) obtained by dividing the screen by a predetermined number, for example.
(84)
(85) The pixel value comparison unit 106b receives an input of the respective values of each partition calculated by the pixel value comparison unit 106a. The pixel value comparison unit 106b compares the values of the respective partitions, to calculate the maximum luminance value global_content_level_max, the minimum luminance value global_content_level_min, and the average luminance value global_content_level_ave that correspond to the entire screen.
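The two-stage calculation, per-partition values first and whole-screen values derived from them, can be sketched as follows. This is an illustration under stated assumptions rather than the circuit described: the function names are hypothetical, the luma plane is a plain list of rows, and the screen is assumed to divide exactly into equal-size partitions.

```python
def partition_levels(luma, rows, cols):
    """Return (max, min, ave) for each partition, in raster order.

    `luma` is a 2-D list of luminance samples. The height and width are
    assumed to be exact multiples of `rows` and `cols`.
    """
    height, width = len(luma), len(luma[0])
    bh, bw = height // rows, width // cols
    levels = []
    for pr in range(rows):
        for pc in range(cols):
            block = [luma[y][x]
                     for y in range(pr * bh, (pr + 1) * bh)
                     for x in range(pc * bw, (pc + 1) * bw)]
            levels.append((max(block), min(block), sum(block) / len(block)))
    return levels

def global_levels(levels):
    """Combine per-partition values into whole-screen (max, min, ave).

    The mean of the partition means equals the screen mean only because
    all partitions are assumed to have the same number of samples.
    """
    g_max = max(level[0] for level in levels)
    g_min = min(level[1] for level in levels)
    g_ave = sum(level[2] for level in levels) / len(levels)
    return g_max, g_min, g_ave

# Tiny 2x4 example split into two partitions (left half, right half).
frame = [[0, 10, 100, 200],
         [20, 30, 300, 400]]
parts = partition_levels(frame, rows=1, cols=2)
screen = global_levels(parts)
```

Deriving the whole-screen values from the partition values avoids a second pass over the pixels, which matches the division of work between the two comparison stages.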
(86) Referring back to
(87) The video encoder 105 performs encoding such as MPEG4-AVC or HEVC on the transmission video data V1, to generate a video stream (PES stream) VS containing the encoded image data. The video encoder 105 also inserts the luminance level adjustment information for adjusting the luminance level of the subtitle into the video stream.
(88) Specifically, the maximum luminance value global_content_level_max, the minimum luminance value global_content_level_min, and the average luminance value global_content_level_ave that have been calculated by the luminance level calculation unit 106 and correspond to the entire screen are inserted into the video stream, and the maximum luminance values partition_content_level_max, the minimum luminance values partition_content_level_min, and the average luminance values partition_content_level_ave that have been calculated by the luminance level calculation unit 106 and correspond to the respective partitions are also inserted into the video stream. The high-luminance threshold value Th_max, the low-luminance threshold value Th_min, and the average-luminance threshold value Th_ave that have been set by the threshold value setting unit 107 are also inserted into the video stream.
(89) In this embodiment, the video encoder 105 inserts a newly defined luma dynamic range SEI message (Luma_dynamic_range SEI message) into the SEIs portion of each access unit (AU).
(90)
(91)
(92) When Luma_dynamic_range_cancel_flag is 0, there are the fields described below. The 8-bit field of coded_data_bit_depth indicates the number of encoded pixel bits. The 8-bit field of number_of_partitions indicates the number of partition regions (partitions) in the screen. If this value is smaller than 2, the screen is not divided. The 8-bit field of block_size indicates the block size, or the size of the regions obtained by dividing the entire screen by the number of partition regions.
(93) The 16-bit field of global_content_level_max indicates the maximum luminance value in the entire screen. The 16-bit field of global_content_level_min indicates the minimum luminance value in the entire screen. The 16-bit field of global_content_level_ave indicates the average luminance value in the entire screen. The 16-bit field of content_threshold_max indicates the high-luminance threshold value. The 16-bit field of content_threshold_min indicates the low-luminance threshold value. The 16-bit field of content_threshold_ave indicates the average-luminance threshold value.
(94) Further, when the number of partitions indicated by the field of number_of_partitions is 2 or greater, each partition contains the fields described below. The 16-bit field of partition_content_level_max indicates the maximum luminance value in the partition. The 16-bit field of partition_content_level_min indicates the minimum luminance value in the partition. The 16-bit field of partition_content_level_ave indicates the average luminance value in the partition.
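The field widths listed above can be turned into a small parser sketch. Several points are assumptions, since the syntax table itself is not reproduced here: the fields are taken to be byte-aligned, big-endian, and in the order in which the text lists them, and the input is taken to be the part of the message that follows Luma_dynamic_range_cancel_flag (whose width the text does not give). The function name is hypothetical.

```python
import struct

# Assumed layout: three 8-bit fields followed by six 16-bit fields.
HEADER = struct.Struct(">BBB6H")
# Assumed per-partition layout: max, min, ave as 16-bit fields.
PARTITION = struct.Struct(">3H")

HEADER_FIELDS = (
    "coded_data_bit_depth", "number_of_partitions", "block_size",
    "global_content_level_max", "global_content_level_min",
    "global_content_level_ave",
    "content_threshold_max", "content_threshold_min",
    "content_threshold_ave",
)

def parse_luma_dynamic_range(body):
    """Parse the fields that follow the cancel flag into a dict."""
    info = dict(zip(HEADER_FIELDS, HEADER.unpack_from(body, 0)))
    info["partitions"] = []
    offset = HEADER.size
    # Per-partition fields are present only when the screen is divided.
    if info["number_of_partitions"] >= 2:
        for _ in range(info["number_of_partitions"]):
            info["partitions"].append(PARTITION.unpack_from(body, offset))
            offset += PARTITION.size
    return info
```

Each entry in `info["partitions"]` is a (max, min, ave) tuple in the order the partitions appear in the message.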
(95) Referring back to
(96)
(97)
(98)
(99) Referring back to
(100) In this embodiment, the luminance level adjustment information for adjusting the luminance level of the subtitle is inserted into the subtitle stream SS. Specifically, the high-luminance threshold value Th_max, the low-luminance threshold value Th_min, the average-luminance threshold value Th_ave, the subtitle luminance range limit information renderingdrange, and the subtitle color space information colorspace are inserted into the subtitle stream SS.
(101) The insertion of the luminance level adjustment information is performed by the text format conversion unit 109 or the subtitle encoder 110. In a case where the text format conversion unit 109 performs the insertion of the luminance level adjustment information, the elements of the metadata in the header of the TTML structure are used, for example.
(102)
(103) Rendering control information as the luminance level adjustment information is indicated by ttm-ext:renderingcontrol. The high-luminance threshold value is indicated by ttm-ext:lumathmax, and Th_max as the actual value thereof is then written. The low-luminance threshold value is indicated by ttm-ext:lumathmin, and Th_min as the actual value thereof is then written. The average-luminance threshold value is indicated by ttm-ext:lumathave, and Th_ave as the actual value thereof is then written.
(104) The subtitle luminance range limit information is indicated by ttm-ext:renderingdrange, and Maxminratio is then written. Maxminratio indicates the ratio obtained by dividing the maximum luminance value of a subtitle by the minimum luminance value of the subtitle. When this value is 4, for example, the maximum luminance value of the subtitle after luminance adjustment is equal to or lower than four times the minimum luminance value.
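The meaning of Maxminratio can be checked with a short sketch. The text states only the constraint (maximum no greater than Maxminratio times minimum); how a receiver enforces it is not specified, so the clamping function below, which lowers the maximum and leaves the minimum untouched, is purely an assumed example. Both function names are hypothetical.

```python
def within_range_limit(l_max, l_min, maxminratio):
    """True when the subtitle luminance range satisfies the limit."""
    return l_max <= maxminratio * l_min

def limit_max(l_max, l_min, maxminratio):
    """One possible enforcement: cap the maximum at maxminratio * minimum.

    This choice is an assumption; raising the minimum instead would
    satisfy the same constraint.
    """
    return min(l_max, maxminratio * l_min)

# With Maxminratio = 4 and a minimum of 100, the maximum may not exceed 400.
ok = within_range_limit(400, 100, 4)
capped = limit_max(500, 100, 4)
```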
(105) Alternatively, in a case where the text format conversion unit 109 performs the insertion of the luminance level adjustment information, the elements of the styling extension in the header of the TTML structure may be used, for example. In this case, independent rendering control (luminance level adjustment) can be performed for each xml:id.
(106)
(107) The high-luminance threshold value is indicated by ttse:renderingcontrol:lumathmax, and Th_max as the actual value thereof is then written. The low-luminance threshold value is indicated by ttse:renderingcontrol:lumathmin, and Th_min as the actual value thereof is then written. The average-luminance threshold value is indicated by ttse:renderingcontrol:lumathave, and Th_ave as the actual value thereof is then written. The subtitle luminance range limit information is indicated by ttse:renderingcontrol:renderingdrange, and Maxminratio is then written.
(108) In a case where the subtitle encoder 110 performs the insertion of the luminance level adjustment information, a segment containing the luminance level adjustment information is inserted into the subtitle stream. In this embodiment, a newly defined subtitle rendering control segment (SRCS: Subtitle rendering control segment) is inserted into the subtitle stream.
(109)
(110) Also, this structure includes the luminance level adjustment information for each region. The 8-bit field of resion_id indicates the identifier for identifying the region. The 8-bit field of colorspace_type indicates the color space information. The 8-bit field of dynamicrange_type indicates the dynamic range information, or indicates the type of the EOTF characteristics of HDR. The 16-bit field of luma_th_max indicates the high-luminance threshold value. The 16-bit field of luma_th_min indicates the low-luminance threshold value. The 16-bit field of luma_th_ave indicates the average-luminance threshold value.
(111) The 8-bit field of renderingdrange indicates the subtitle luminance range limit information. This limit information indicates the ratio obtained by dividing the maximum luminance value of a subtitle by the minimum luminance value of the subtitle, for example. When this value is 4, for example, the maximum luminance value of the subtitle after luminance adjustment is equal to or lower than four times the minimum luminance value.
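The per-region fields of the subtitle rendering control segment can be read with a sketch like the one below. It covers only the region entry, because the segment header fields are not listed in the text above. The byte alignment, big-endian order, and field order (taken from the order of description) are assumptions, and the names `RegionControl` and `parse_region_control` are hypothetical.

```python
import struct
from collections import namedtuple

# Field names follow the text; the first field is written "resion_id" there
# and is exposed here under the conventional spelling.
RegionControl = namedtuple(
    "RegionControl",
    "region_id colorspace_type dynamicrange_type "
    "luma_th_max luma_th_min luma_th_ave renderingdrange",
)

# Assumed layout: 8, 8, 8, 16, 16, 16, 8 bits, with no padding.
REGION = struct.Struct(">BBB3HB")

def parse_region_control(data, offset=0):
    """Unpack one region entry starting at `offset`."""
    return RegionControl(*REGION.unpack_from(data, offset))
```

Under these assumptions one region entry occupies 10 bytes, so consecutive regions can be read by advancing `offset` by `REGION.size`.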
(112) Referring back to
(113) The system encoder 111 inserts identification information into the transport stream TS serving as a container. The identification information indicates that the luminance level adjustment information has been inserted into the video stream. In this embodiment, the system encoder 111 inserts an HDR rendering support descriptor HDR_rendering_support_descriptor into a program map table (PMT: Program Map Table).
(114)
(115) The flag HDR_flag indicates whether the service stream (video stream) is compatible with HDR. When the flag is 1, the service stream is compatible with HDR. When the flag is 0, the service stream is not compatible with HDR. The flag composition_control_flag indicates whether the luma dynamic range SEI message (Luma_dynamic_range SEI message) has been encoded in the video stream, that is, whether the luminance level adjustment information has been inserted into the video stream. When the flag is 1, the luma dynamic range SEI message has been encoded. When the flag is 0, the luma dynamic range SEI message has not been encoded. The 8-bit field of EOTF_type indicates the type of the EOTF characteristics of the video (the value of the VUI of the video stream).
(116) The system encoder 111 inserts further identification information into the transport stream TS serving as a container. The identification information indicates that the luminance level adjustment information has been inserted into the subtitle stream. In this embodiment, the system encoder 111 inserts a subtitle rendering metadata descriptor Subtitle_rendering_metadata_descriptor into the program map table (PMT: Program Map Table).
(117)
(118) The flag subtitle_text_flag indicates whether the subtitle is transmitted in the form of a text code. When the flag is 1, the subtitle is a text-encoded subtitle. When the flag is 0, the subtitle is not a text-encoded subtitle. The flag subtitle_rendering_control_flag indicates whether the luminance adjustment meta-information about the subtitle has been encoded, that is, whether the subtitle has the luminance level adjustment information inserted therein. When the flag is 1, the luminance adjustment meta-information has been encoded. When the flag is 0, the luminance adjustment meta-information has not been encoded.
(119) The 3-bit field of meta_container_type indicates the storage site or the insertion position of the luminance adjustment meta-information (luminance level adjustment information). In the 3-bit field of meta_container_type, 0 indicates the subtitle rendering control segment, 1 indicates an element of the metadata in the header of the TTML structure, and 2 indicates an element of the styling extension in the header of the TTML structure.
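A receiver could use these two fields to decide where to look for the luminance level adjustment information. The sketch below illustrates that lookup. The enum member names and the function name are hypothetical, and treating values 3 through 7 as "location unknown" is an assumption, since the text describes only the values 0, 1, and 2.

```python
from enum import IntEnum

class MetaContainerType(IntEnum):
    """Insertion positions for the values described in the text."""
    SUBTITLE_RENDERING_CONTROL_SEGMENT = 0
    TTML_HEAD_METADATA = 1
    TTML_HEAD_STYLING_EXTENSION = 2

def locate_adjustment_info(subtitle_rendering_control_flag, meta_container_type):
    """Return where the adjustment information is stored, or None.

    None is returned when the flag says nothing has been encoded, or when
    the 3-bit value is one that the text does not describe.
    """
    if not subtitle_rendering_control_flag:
        return None
    try:
        return MetaContainerType(meta_container_type & 0x7)
    except ValueError:
        return None

where = locate_adjustment_info(1, 2)
```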
(120) Operations in the transmission device 100 shown in
(121) The transmission video data V1 converted to the YCbCr domain is supplied to the video encoder 105 and the luminance level calculation unit 106. The luminance level calculation unit 106 calculates, for each picture, the maximum luminance value global_content_level_max, the minimum luminance value global_content_level_min, and the average luminance value global_content_level_ave that correspond to the entire screen, and the maximum luminance values partition_content_level_max, the minimum luminance values partition_content_level_min, and the average luminance values partition_content_level_ave that correspond to the respective partition regions (partitions) obtained by dividing the screen by a predetermined number (see
(122) Information about the electro-optical transfer function characteristics (EOTF characteristics) is supplied to the threshold value setting unit 107. In accordance with the EOTF characteristics, the threshold value setting unit 107 sets the high-luminance threshold value Th_max, the low-luminance threshold value Th_min, and the average-luminance threshold value Th_ave for determining how the subtitle luminance is to be adjusted on the reception side (see
(123) At the video encoder 105, encoding such as MPEG4-AVC or HEVC is performed on the transmission video data V1, and a video stream (PES stream) VS containing encoded image data is generated. At the video encoder 105, the luminance level adjustment information for adjusting the luminance level of the subtitle is also inserted into the video stream. That is, at the video encoder 105, a newly-defined luma dynamic range SEI message is inserted into a portion of SEIs in an access unit (AU) (see
(124) At the subtitle generation unit 108, text data (a character code) DT is generated as subtitle information. This text data DT is supplied to the text format conversion unit 109. At the text format conversion unit 109, the text data DT is converted into subtitle text information having display timing information, or into TTML (see
(125) At the subtitle encoder 110, the TTML obtained by the text format conversion unit 109 is converted into various segments, and a subtitle stream SS formed with a PES packet having these segments disposed in payloads is generated.
(126) The luminance level adjustment information for adjusting the luminance level of the subtitle is inserted into the subtitle stream SS. The insertion of the luminance level adjustment information is performed by the text format conversion unit 109 or the subtitle encoder 110. In a case where the insertion is performed by the text format conversion unit 109, the elements in the metadata (metadata) in the header of the TTML structure or the elements in the styling extension (styling extension) in the header of the TTML structure are used, for example (see
(127) The video stream VS generated by the video encoder 105 is supplied to the system encoder 111. The subtitle stream SS generated by the subtitle encoder 110 is supplied to the system encoder 111. At the system encoder 111, a transport stream TS including the video stream VS and the subtitle stream SS is generated. This transport stream TS is incorporated into broadcast waves or a network packet, and is transmitted to the reception device 200 by the transmission unit 112.
(128) At the system encoder 111 in this stage, identification information indicating that the video stream has the luminance level adjustment information inserted therein is inserted into the transport stream TS. That is, at the system encoder 111, an HDR rendering support descriptor is inserted into the program map table (PMT) (see
(129) [Structure of the Transport Stream TS]
(130)
(131) This example structure also includes a subtitle stream PES packet (Subtitle PES2) that is identified by PID2. The luminance level adjustment information (color space information, threshold values for combining, subtitle luminance range limit information, and the like) is inserted into an element of the metadata (metadata) in the header of the TTML structure, an element of the styling extension (styling extension) in the header of the TTML structure, or the subtitle rendering control segment.
(132) The transport stream TS also includes the program map table (PMT) as program specific information (PSI). The PSI is the information indicating to which programs the respective elementary streams included in the transport stream belong. The PMT includes a program loop (Program loop) in which information related to the entire program is written.
(133) The PMT also includes elementary stream loops having information related to the respective elementary streams. This example structure includes a video elementary stream loop (video ES loop) corresponding to the video stream, and a subtitle elementary stream loop (Subtitle ES loop) corresponding to the subtitle stream.
(134) In the video elementary stream loop (video ES loop), information such as the stream type and the packet identifier (PID) corresponding to the video stream is disposed, and a descriptor that describes information related to the video stream is also disposed. The value of Stream_type of this video stream is set at a value indicating an HEVC video stream, for example, and the PID information indicates PID1 allotted to the video stream PES packet (video PES1). An HEVC descriptor, a newly-defined HDR rendering support descriptor, or the like is inserted as the descriptor.
(135) In the subtitle elementary stream loop (Subtitle ES loop), information such as the stream type and the packet identifier (PID) corresponding to the subtitle stream is disposed, and a descriptor that describes information related to the subtitle stream is also disposed. The value of Stream_type of this subtitle stream is set at a value indicating a private stream, for example, and the PID information indicates PID2 allotted to the subtitle stream PES packet (Subtitle PES2). A newly-defined subtitle rendering metadata descriptor or the like is inserted as the descriptor.
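The arrangement of the PMT described above may be pictured with the following Python sketch. It is only a schematic data model under stated assumptions: the class names ElementaryStreamLoop and ProgramMapTable and the helper find_loop are hypothetical, and stream types, PIDs, and descriptors are represented by symbolic labels rather than by their encoded values.

```python
from dataclasses import dataclass, field


@dataclass
class ElementaryStreamLoop:
    stream_type: str          # symbolic label, not the numeric Stream_type value
    pid: str                  # symbolic PID label such as "PID1"
    descriptors: list = field(default_factory=list)


@dataclass
class ProgramMapTable:
    program_loop: list = field(default_factory=list)   # program-level information
    es_loops: list = field(default_factory=list)       # one entry per elementary stream


pmt = ProgramMapTable(
    es_loops=[
        ElementaryStreamLoop("HEVC video stream", "PID1",
                             ["HEVC descriptor", "HDR rendering support descriptor"]),
        ElementaryStreamLoop("private stream", "PID2",
                             ["subtitle rendering metadata descriptor"]),
    ]
)


def find_loop(table, pid):
    """Return the elementary stream loop carrying the given PID, or None."""
    return next((loop for loop in table.es_loops if loop.pid == pid), None)
```

A reception side could, for example, look up the loop for PID2 and check whether a subtitle rendering metadata descriptor is present before attempting to read the luminance adjustment meta-information.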
(136) [Example Configuration of the Reception Device]
(137)
(138) The control unit 201 is designed to include a central processing unit (CPU), and controls operations of the respective components of the reception device 200 in accordance with a control program. The reception unit 202 receives the transport stream TS in broadcast waves or a network packet transmitted from the transmission device 100. The system decoder 203 extracts the video stream VS and the subtitle stream SS from the transport stream TS. The system decoder 203 also extracts various kinds of information inserted in the transport stream TS (container), and sends the extracted information to the control unit 201.
(139) In this embodiment, this extracted information includes an HDR rendering support descriptor (see
(140) As the flag HDR_flag in the HDR rendering support descriptor is 1, the control unit 201 recognizes that the video stream (service stream) is compatible with HDR. As the flag composition_control_flag in the HDR rendering support descriptor is 1, the control unit 201 also recognizes that an encoded luma dynamic range SEI message is included in the video stream, or the video stream has the luminance level adjustment information inserted therein.
(141) As the flag subtitle_text_flag in the subtitle rendering metadata descriptor is 1, the control unit 201 also recognizes that the subtitle is transmitted in the form of a text code. As the flag subtitle_rendering_control_flag in the subtitle rendering metadata descriptor is 1, the control unit 201 also recognizes that the luminance adjustment meta-information about the subtitle has been encoded, or the subtitle has the luminance level adjustment information inserted therein.
(142) The video decoder 204 performs a decoding process on the video stream VS extracted by the system decoder 203, and outputs the transmission video data V1. The video decoder 204 also extracts the parameter sets and the SEI messages inserted in the respective access units constituting the video stream VS, and sends necessary information to the control unit 201.
(143) In this embodiment, the control unit 201 recognizes that the video stream includes an encoded luma dynamic range SEI message as described above. Thus, under the control of the control unit 201, the video decoder 204 also extracts the SEI message without fail, and obtains the luminance level adjustment information such as background image luminance values and threshold values for combining.
(144) The subtitle text decoder 205 performs a decoding process on the segment data of the respective regions in the subtitle stream SS, and thus obtains the text data and the control codes of the respective regions. The subtitle text decoder 205 also obtains, from the subtitle stream SS, the luminance level adjustment information such as the color space information, the threshold values for combining, and the subtitle luminance range limit information. In this embodiment, the control unit 201 recognizes that the subtitle luminance adjustment meta-information has been encoded as described above. Thus, under the control of the control unit 201, the subtitle text decoder 205 obtains the luminance level adjustment information without fail.
(145) The font decompression unit 206 performs font decompression in accordance with the text data and the control codes of the respective regions obtained by the subtitle text decoder 205, and thus obtains the bitmap data of the respective regions. The RGB/YCbCr conversion unit 208 converts the bitmap data from an RGB domain to a YCbCr (luminance/chrominance) domain. In this case, the RGB/YCbCr conversion unit 208 performs the conversion using a conversion equation suitable for the color space, in accordance with the color space information.
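A conversion that depends on the color space information may be sketched as follows. This example assumes, purely for illustration, that the color space is either ITU-R BT.709 or ITU-R BT.2020 and that the samples are normalized, full-range, non-quantized values; the dictionary LUMA_COEFFS and the function rgb_to_ycbcr are hypothetical names and are not taken from the embodiment.

```python
# Luma coefficients (Kr, Kb) from ITU-R BT.709 and ITU-R BT.2020.
LUMA_COEFFS = {
    "bt709": (0.2126, 0.0722),
    "bt2020": (0.2627, 0.0593),
}


def rgb_to_ycbcr(r, g, b, color_space):
    """Convert normalized R'G'B' (0..1) to Y' (0..1) and Cb, Cr (-0.5..0.5)."""
    kr, kb = LUMA_COEFFS[color_space]
    kg = 1.0 - kr - kb
    y = kr * r + kg * g + kb * b
    cb = (b - y) / (2.0 * (1.0 - kb))
    cr = (r - y) / (2.0 * (1.0 - kr))
    return y, cb, cr
```

The same input color yields different luminance values under the two coefficient sets, which is why the conversion equation has to be selected in accordance with the color space information.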
(146) The luminance level adjustment unit 209 performs luminance level adjustment on the subtitle bitmap data converted to the YCbCr domain, using the background image luminance values, the threshold values for combining, and the subtitle luminance range limit information. In this case, subtitle luminance level adjustment with the global parameters for the screen (see
(147) The video superimposition unit 210 superimposes the bitmap data of the respective regions having the luminance levels adjusted by the luminance level adjustment unit 209, on the transmission video data V1 obtained by the video decoder 204. The YCbCr/RGB conversion unit 211 converts the transmission video data V1 having the bitmap data superimposed thereon from the YCbCr (luminance/chrominance) domain to an RGB domain. In this case, the YCbCr/RGB conversion unit 211 performs the conversion using a conversion equation suitable for the color space, in accordance with the color space information.
(148) The HDR electro-optical conversion unit 212 applies HDR electro-optical transfer function characteristics to the transmission video data V1 converted to the RGB domain, and thus obtains display video data for displaying an HDR image. The HDR display mapping unit 213 performs display luminance adjustment on the display video data, in accordance with the maximum luminance display capability or the like of the CE monitor 214. The CE monitor 214 displays an HDR image in accordance with the display video data on which the display luminance adjustment has been performed. This CE monitor 214 is formed with a liquid crystal display (LCD) or an organic electroluminescence (EL) display, for example.
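The two stages described above may be illustrated with the following sketch. The particular transfer function is an assumption made only for this example: the SMPTE ST 2084 (PQ) electro-optical transfer function is used as one possible HDR characteristic, and the display mapping is reduced to a simple clip at the peak luminance of the monitor. The function names pq_eotf and display_map are hypothetical, and an actual HDR display mapping unit may well use a more elaborate tone-mapping curve.

```python
# SMPTE ST 2084 (PQ) constants.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_eotf(signal):
    """Map a normalized PQ code value (0..1) to luminance in cd/m^2."""
    p = signal ** (1.0 / M2)
    return 10000.0 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1.0 / M1)


def display_map(luminance, display_max):
    """Simplest possible display mapping: clip to the monitor's peak luminance."""
    return min(luminance, display_max)
```

With these definitions, display_map(pq_eotf(v), peak) gives the luminance actually presented for a code value v on a monitor whose maximum luminance display capability is peak.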
(149) Operations in the reception device 200 shown in
(150) At the system decoder 203, various kinds of information inserted in the transport stream TS (container) are also extracted, and are sent to the control unit 201. This extracted information includes an HDR rendering support descriptor (see
(151) As the flag HDR_flag in the HDR rendering support descriptor is 1, the control unit 201 recognizes that the video stream (service stream) is compatible with HDR. As the flag composition_control_flag in the HDR rendering support descriptor is 1, the control unit 201 also recognizes that a luma dynamic range SEI message in the video stream has been encoded.
(152) As the flag subtitle_text_flag in the subtitle rendering metadata descriptor is 1, the control unit 201 also recognizes that the subtitle is transmitted in the form of a text code. As the flag subtitle_rendering_control_flag in the subtitle rendering metadata descriptor is 1, the control unit 201 also recognizes that the luminance adjustment meta-information about the subtitle has been encoded.
(153) The video stream VS extracted by the system decoder 203 is supplied to the video decoder 204. At the video decoder 204, a decoding process is performed on the video stream VS, and the transmission video data V1 is obtained. At the video decoder 204, the luma dynamic range SEI message is also extracted from the video stream VS, and the luminance level adjustment information such as the background image luminance values and the threshold values for combining is obtained.
(154) The subtitle stream SS extracted by the system decoder 203 is supplied to the subtitle text decoder 205. At the subtitle text decoder 205, a decoding process is performed on the segment data of the respective regions included in the subtitle stream SS, and the text data and the control codes of the respective regions are obtained. At the subtitle text decoder 205, the luminance level adjustment information such as the color space information, the threshold values for combining, and the subtitle luminance range limit information is also obtained from the subtitle stream SS.
(155) The text data and the control codes of the respective regions are supplied to the font decompression unit 206. At the font decompression unit 206, font decompression is performed in accordance with the text data and the control codes of the respective regions, and the bitmap data of the respective regions is obtained. At the RGB/YCbCr conversion unit 208, this bitmap data is converted from an RGB domain to a YCbCr domain in accordance with color space information S, and is supplied to the luminance level adjustment unit 209.
(156) At the luminance level adjustment unit 209, luminance level adjustment is performed on the bitmap data of the respective regions converted to the YCbCr domain, in accordance with the background image luminance values, the threshold values for combining, and the subtitle luminance range limit information. In this case, subtitle luminance level adjustment with the global parameters for the screen (see
(157) The transmission video data V1 obtained by the video decoder 204 is supplied to the video superimposition unit 210. The bitmap data of the respective regions that has been subjected to the luminance level adjustment and been obtained by the luminance level adjustment unit 209 is supplied to the video superimposition unit 210. At the video superimposition unit 210, the bitmap data of the respective regions is superimposed on the transmission video data V1.
(158) The transmission video data V1 that has been obtained by the video superimposition unit 210 and has the bitmap data superimposed thereon is converted from the YCbCr (luminance/chrominance) domain to an RGB domain at the YCbCr/RGB conversion unit 211 in accordance with the designation indicated by color space information V, and is then supplied to the HDR electro-optical conversion unit 212. At the HDR electro-optical conversion unit 212, HDR electro-optical transfer function characteristics are applied to the transmission video data V1, so that the display video data for displaying an HDR image is obtained. The display video data is supplied to the HDR display mapping unit 213.
(159) At the HDR display mapping unit 213, display luminance adjustment is performed on the display video data in accordance with the maximum luminance display capability or the like of the CE monitor 214. The display video data subjected to such display luminance adjustment is supplied to the CE monitor 214. In accordance with this display video data, an HDR image is displayed on the CE monitor 214.
(160) It should be noted that the reception device 200 further includes a subtitle bitmap decoder 215 to cope with a situation where the subtitle information included in the subtitle stream SS is bitmap data. This subtitle bitmap decoder 215 performs a decoding process on the subtitle stream SS, to obtain subtitle bitmap data. This subtitle bitmap data is supplied to the luminance level adjustment unit 209.
(161) In this case, the subtitle information (transmission data) included in the subtitle stream SS is passed through the CLUT, and the CLUT output may already be in the YCbCr domain. Therefore, the subtitle bitmap data obtained by the subtitle bitmap decoder 215 is supplied directly to the luminance level adjustment unit 209. It should be noted that, in this case, the luminance Lf in the foreground region of the subtitle and the luminance Lb in the background region of the subtitle can be obtained from the CLUT output on the reception side.
(162) The reception device 200 further includes a luminance level calculation unit 216 to cope with a situation where the luma dynamic range SEI message in the video stream VS has not been encoded, and any background image luminance value cannot be obtained from the SEI message. This luminance level calculation unit 216 has a configuration similar to that of the luminance level calculation unit 106 (see
(163) In accordance with the transmission video data V1 obtained by the video decoder 204, the luminance level calculation unit 216 calculates, for each picture, the maximum luminance value global_content_level_max, the minimum luminance value global_content_level_min, and the average luminance value global_content_level_ave that correspond to the entire screen, and the maximum luminance values partition_content_level_max, the minimum luminance values partition_content_level_min, and the average luminance values partition_content_level_ave that correspond to the respective partition regions (partitions) obtained by dividing the screen by a predetermined number (see
(164) The reception device 200 also includes a threshold value setting unit 217 to cope with a situation where the luma dynamic range SEI message in the video stream VS has not been encoded or where the video stream VS includes an encoded luma dynamic range SEI message but does not include any threshold value for combining, and a situation where the subtitle stream SS does not include any threshold value for combining. This threshold value setting unit 217 has a configuration similar to that of the threshold value setting unit 107 in the transmission device 100 shown in
(165) In accordance with electro-optical transfer function characteristics (EOTF characteristics), this threshold value setting unit 217 sets a high-luminance threshold value Th_max, a low-luminance threshold value Th_min, and an average-luminance threshold value Th_ave for determining how the subtitle luminance is to be adjusted on the reception side (see
(166) The flowchart in
(167) Next, in step ST2, the reception device 200 determines whether there is luminance adjustment meta-information. If there is luminance adjustment meta-information, the reception device 200 in step ST3 detects the meta-information storage site, and obtains meta-information (color space information, threshold values for combining, and subtitle luminance range limit information) from the storage site. After this step ST3, the reception device 200 moves on to the process in step ST5. If there is no luminance adjustment meta-information, on the other hand, the reception device 200 in step ST4 sets a color space of a conventional type, threshold values for combining, and subtitle luminance range limit information. After this step ST4, the reception device 200 moves on to the process in step ST5.
(168) In step ST5, the reception device 200 determines whether the encoded data of the subtitle information is text-based data. If the encoded data of the subtitle information is text-based data, the reception device 200 in step ST6 decodes the text-based subtitle, and performs font decompression from the subtitle combining position and the character code, to obtain bitmap data. At this point, the bitmap data has the decompressed size and the colors of the foreground and the background. In step ST7, the reception device 200 calculates the luminance Lf of the foreground of the subtitle and the luminance Lb of the background of the subtitle, in accordance with the color space information. After this step ST7, the reception device 200 moves on to the process in step ST16.
(169) If the encoded data of the subtitle information is not text-based data, on the other hand, the reception device 200 in step ST8 decodes the subtitle stream, to obtain the subtitle bitmap data and the subtitle combining position. In step ST9, through the CLUT designated by the stream, the reception device 200 calculates the luminance Lf of the foreground of the subtitle and the luminance Lb of the background of the subtitle. After this step ST9, the reception device 200 moves on to the process in step ST16.
(170) In step ST11, the reception device 200 also reads the HDR rendering support descriptor, and searches for a luma dynamic range SEI in the video stream VS.
(171) Next, in step ST12, the reception device 200 determines whether there is a luma dynamic range SEI message. If there is a luma dynamic range SEI message, the reception device 200 in step ST13 reads the respective elements in the SEI message, and detects the background image luminance values and the threshold values for combining. After this step ST13, the reception device 200 moves on to the process in step ST15. If there is no luma dynamic range SEI message, on the other hand, the reception device 200 in step ST14 determines the background image luminance values by calculating the luminance level of the decoded image, and sets the threshold values for combining. After this step ST14, the reception device 200 moves on to the process in step ST15.
(172) In step ST15, the reception device 200 determines whether there is partition information. If there is partition information, the reception device 200 in step ST16 determines whether low-luminance and high-luminance objects are at a distance from the subtitle combining (superimposing) position. If such objects are not at a distance from the subtitle combining position, the reception device 200 moves on to the process in step ST18. If such objects are at a distance from the subtitle combining position, on the other hand, the reception device 200 in step ST17 performs a luminance level adjustment process, using the partition information. After this step ST17, the reception device 200 moves on to the process in step ST19.
(173) In step ST18, the reception device 200 performs a global luminance level adjustment process. After this step ST18, the reception device 200 moves on to the process in step ST19. In step ST19, the reception device 200 combines the subtitle with (or superimposes the subtitle on) the background image with the adjusted luminance. After this step ST19, the reception device 200 in step ST20 ends the process.
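The sequence of steps ST2 through ST20 described above may be traced with the following sketch, which merely returns the step labels visited for a given set of conditions. The function name reception_steps and its boolean parameters are hypothetical. Two simplifications are assumed: the subtitle-side steps and the video-side steps are listed one after the other for readability, and the case without partition information is assumed to proceed to the global process in step ST18.

```python
def reception_steps(has_subtitle_meta, is_text, has_sei, has_partition_info,
                    objects_distant):
    """Return the step labels visited for one picture, in the order described.

    The subtitle-side steps (ST2 to ST9) and the video-side steps (ST11 to ST14)
    are listed sequentially here purely for readability.
    """
    steps = ["ST2"]
    steps.append("ST3" if has_subtitle_meta else "ST4")
    steps.append("ST5")
    steps += ["ST6", "ST7"] if is_text else ["ST8", "ST9"]
    steps += ["ST11", "ST12"]
    steps.append("ST13" if has_sei else "ST14")
    steps.append("ST15")
    if has_partition_info:
        steps.append("ST16")
        steps.append("ST17" if objects_distant else "ST18")
    else:
        # Assumption: without partition information only the global process applies.
        steps.append("ST18")
    steps += ["ST19", "ST20"]
    return steps
```

For example, a text-based subtitle with meta-information, an SEI message, partition information, and distant high-luminance or low-luminance objects passes through step ST17 rather than step ST18.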
(174) The flowchart in
(175) In addition, if the minimum luminance value is lower than the low-luminance threshold value in step ST23, the reception device 200 in step ST25 determines whether the average luminance value is higher than the average-luminance threshold value. If the average luminance value is higher than the average-luminance threshold value, the reception device 200 in step ST26 corrects the luminance level of the background of the subtitle to fall within the range between the maximum luminance and the minimum luminance of the subtitle in a case where the ratio between the maximum luminance and the minimum luminance of the subtitle is designated (see
(176) Further, if the maximum luminance value is not higher than the high-luminance threshold value in step ST22, on the other hand, the reception device 200 in step ST28 determines whether the minimum luminance value is lower than the low-luminance threshold value. If the minimum luminance value is lower than the low-luminance threshold value, the reception device 200 in step ST29 corrects the luminance level of the foreground of the subtitle to fall within the range between the maximum luminance and the minimum luminance of the subtitle in a case where the ratio between the maximum luminance and the minimum luminance of the subtitle is designated (see
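The branches of the global luminance level adjustment process that are stated in the two preceding paragraphs may be summarized by the following sketch. It covers only those branches; where the text above does not state the outcome of a branch, the function returns None rather than guessing. The function names global_adjustment_target and clamp and their parameters are hypothetical, and clamp is only one simple way of making a luminance value fall within a designated range.

```python
def global_adjustment_target(level_max, level_min, level_ave,
                             th_max, th_min, th_ave):
    """Return which part of the subtitle is corrected, per the branches above.

    Returns "background", "foreground", or None where the passage above
    does not state the outcome.
    """
    if level_max > th_max:                      # step ST22
        if level_min < th_min:                  # step ST23
            if level_ave > th_ave:              # step ST25
                return "background"             # step ST26
        return None
    if level_min < th_min:                      # step ST28
        return "foreground"                     # step ST29
    return None


def clamp(value, low, high):
    """Limit a luminance value to the designated subtitle luminance range."""
    return max(low, min(value, high))
```

In this sketch, level_max, level_min, and level_ave stand for the background image luminance values, and th_max, th_min, and th_ave stand for the high-luminance, low-luminance, and average-luminance threshold values for combining.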
(177) It should be noted that, as for a luminance level adjustment process using the partition information, the reception device 200 performs the process shown in the flowchart in
(178) As described above, in the transmission/reception system 10 shown in
(179) Also, in the transmission/reception system 10 shown in
(180) Also, in the transmission/reception system 10 shown in
(181) <2. Modifications>
(182) In the above described example of an embodiment, the container is a transport stream (MPEG-2 TS). However, transport according to the present technology is not necessarily performed with a transport stream TS, but video layers can be obtained with some other packet by the same method in the case of ISOBMFF, MMT, or the like, for example. Also, a subtitle stream is not necessarily formed with a PES packet having TTML in segments disposed in multiple payloads as described above. Instead, the present technology can be embodied by setting TTML directly in a PES packet having the multiple payloads or in a section.
(183) The present technology may also be embodied in the structures described below.
(184) (1) A transmission device including:
(185) a video encoder that generates a video stream including image data;
(186) a subtitle encoder that generates a subtitle stream including subtitle information;
(187) an adjustment information insertion unit that inserts luminance level adjustment information into the video stream and/or the subtitle stream, the luminance level adjustment information being designed for adjusting a luminance level of a subtitle; and
(188) a transmission unit that transmits a container in a predetermined format, the container containing the video stream and the subtitle stream.
(189) (2) The transmission device of (1), in which
(190) the luminance level adjustment information is luminance level adjustment information corresponding to an entire screen and/or luminance level adjustment information corresponding to respective partition regions obtained by dividing the screen by a predetermined number.
(191) (3) The transmission device of (2), in which
(192) the luminance level adjustment information to be inserted into the video stream includes a maximum luminance value, a minimum luminance value, and an average luminance value that are generated in accordance with the image data.
(193) (4) The transmission device of (3), in which
(194) the luminance level adjustment information to be inserted into the video stream further includes a high-luminance threshold value, a low-luminance threshold value, and an average-luminance threshold value that are set in accordance with electro-optical transfer function characteristics.
(195) (5) The transmission device of (2), in which
(196) the luminance level adjustment information to be inserted into the subtitle stream includes subtitle luminance range limit information.
(197) (6) The transmission device of (5), in which
(198) the luminance level adjustment information to be inserted into the subtitle stream further includes a high-luminance threshold value, a low-luminance threshold value, and an average-luminance threshold value that are set in accordance with electro-optical transfer function characteristics.
(199) (7) The transmission device of (5) or (6), in which
(200) the luminance level adjustment information to be inserted into the subtitle stream further includes color space information.
(201) (8) The transmission device of any of (1) to (7), in which
(202) the subtitle encoder generates the subtitle stream in accordance with subtitle text information in TTML, and
(203) the adjustment information insertion unit inserts the luminance level adjustment information, using an element of metadata in a header of a TTML structure.
(204) (9) The transmission device of any of (1) to (7), in which
(205) the subtitle encoder generates the subtitle stream in accordance with subtitle text information in TTML, and
(206) the adjustment information insertion unit inserts the luminance level adjustment information, using an element of styling extension in a header of a TTML structure.
(207) (10) The transmission device of any of (1) to (7), in which
(208) the subtitle encoder generates the subtitle stream having segments as components, and
(209) the adjustment information insertion unit inserts a segment containing the luminance level adjustment information into the subtitle stream.
(210) (11) The transmission device of any of (1) to (10), further including
(211) an identification information insertion unit that inserts identification information into the container, the identification information indicating that there is the luminance level adjustment information inserted in the video stream.
(212) (12) The transmission device of any of (1) to (11), further including
(213) an identification information insertion unit that inserts identification information into the container, the identification information indicating that there is the luminance level adjustment information inserted in the subtitle stream.
(214) (13) The transmission device of (12), in which
(215) information indicating an insertion position of the luminance level adjustment information in the subtitle stream is added to the identification information.
(216) (14) A transmission method including:
(217) a video encoding step of generating a video stream including image data;
(218) a subtitle encoding step of generating a subtitle stream including subtitle information;
(219) an adjustment information insertion step of inserting luminance level adjustment information into the video stream and/or the subtitle stream, the luminance level adjustment information being designed for adjusting a luminance level of a subtitle; and
(220) a transmission step of transmitting a container in a predetermined format, the container containing the video stream and the subtitle stream, the container being transmitted by a transmission unit.
(221) (15) A reception device including:
(222) a reception unit that receives a container in a predetermined format, the container containing a video stream including image data and a subtitle stream including subtitle information;
(223) a video decoding unit that obtains image data by performing a decoding process on the video stream;
(224) a subtitle decoding unit that obtains bitmap data of a subtitle by performing a decoding process on the subtitle stream;
(225) a luminance level adjustment unit that performs a luminance level adjustment process on the bitmap data in accordance with luminance level adjustment information; and
(226) a video superimposition unit that superimposes bitmap data obtained by the luminance level adjustment unit after the luminance level adjustment, on the image data obtained by the video decoding unit.
(227) (16) The reception device of (15), in which
(228) the luminance level adjustment unit performs the luminance level adjustment, using the luminance level adjustment information inserted in the video stream and/or the subtitle stream.
(229) (17) The reception device of (15), further including
(230) a luminance level adjustment information generation unit that generates the luminance level adjustment information,
(231) in which the luminance level adjustment unit performs the luminance level adjustment, using the luminance level adjustment information generated by the luminance level adjustment information generation unit.
(232) (18) A reception method including:
(233) a reception step of receiving a container in a predetermined format, the container containing a video stream including image data and a subtitle stream including subtitle information, the container being received by a reception unit;
(234) a video decoding step of obtaining image data by performing a decoding process on the video stream;
(235) a subtitle decoding step of obtaining bitmap data of a subtitle by performing a decoding process on the subtitle stream;
(236) a luminance level adjustment step of performing a luminance level adjustment process on the bitmap data in accordance with luminance level adjustment information; and
(237) a video superimposition step of superimposing bitmap data obtained in the luminance level adjustment step after the luminance level adjustment, on the image data obtained in the video decoding step.
(238) (19) A transmission device including:
(239) a transmission unit that transmits a video stream in a container in a predetermined format, the video stream including transmission video data obtained through high dynamic range photoelectric conversion performed on high dynamic range image data; and
(240) an identification information insertion unit that inserts identification information into the container, the identification information indicating that the video stream is compatible with a high dynamic range.
(241) (20) A transmission device including:
(242) a transmission unit that transmits a video stream and a subtitle stream in a container in a predetermined format, the video stream including image data, the subtitle stream including text information about a subtitle; and
(243) an identification information insertion unit that inserts identification information into the container, the identification information indicating that the subtitle is transmitted in the form of a text code.
(244) The principal feature of the present technology lies in that luminance level adjustment information for adjusting the luminance level of a subtitle is inserted into a video stream VS and a subtitle stream SS, so that the reception side can perform preferred subtitle luminance level adjustment (see
REFERENCE SIGNS LIST
(245) 10 Transmission/reception system
100 Transmission device
101 Control unit
102 HDR camera
103 HDR photoelectric conversion unit
103a Master monitor
104 RGB/YCbCr conversion unit
105 Video encoder
106 Luminance level calculation unit
106a, 106b Pixel value comparison unit
107 Threshold value setting unit
108 Subtitle generation unit
109 Text format conversion unit
110 Subtitle encoder
111 System encoder
112 Transmission unit
200 Reception device
201 Control unit
202 Reception unit
203 System decoder
204 Video decoder
205 Subtitle text decoder
206 Font decompression unit
208 RGB/YCbCr conversion unit
209 Luminance level adjustment unit
210 Video superimposition unit
211 YCbCr/RGB conversion unit
212 HDR electro-optical conversion unit
213 HDR display mapping unit
214 CE monitor