TRANSMISSION DEVICE, TRANSMISSION METHOD, RECEPTION DEVICE, AND RECEPTION METHOD
20190123842 · 2019-04-25
Cpc classification
H04H20/28
ELECTRICITY
H04H60/74
ELECTRICITY
H04H60/35
ELECTRICITY
H04N21/435
ELECTRICITY
H04N21/236
ELECTRICITY
International classification
H04H20/28
ELECTRICITY
H04H60/35
ELECTRICITY
H04H60/74
ELECTRICITY
H04N21/236
ELECTRICITY
Abstract
To simplify transmission of a plurality of types of subtitle information.
A predetermined number of subtitle streams each having one piece or two or more pieces of subtitle information is generated, and a container of a predetermined format including the predetermined number of subtitle streams is transmitted. A reception side extracts one subtitle stream from the predetermined number of subtitle streams, extracts one piece of subtitle information from the one subtitle stream, decodes the one piece of subtitle information, and controls subtitle display.
Claims
1. A transmission device comprising: a subtitle encoding unit configured to generate a predetermined number of subtitle streams each having one piece or two or more pieces of subtitle information; and a transmission unit configured to transmit a container of a predetermined format including the predetermined number of subtitle streams.
2. The transmission device according to claim 1, wherein each of the predetermined number of subtitle streams has segmented subtitle information.
3. The transmission device according to claim 1, wherein the subtitle encoding unit generates a plurality of subtitle streams each having subtitle information of a different language, and each of the plurality of subtitle streams has a plurality of pieces of subtitle information each having different content.
4. The transmission device according to claim 1, wherein the subtitle encoding unit generates a plurality of subtitle streams each having subtitle information of different content, and each of the plurality of subtitle streams has a plurality of pieces of subtitle information each having a different language.
5. The transmission device according to claim 1, further comprising: an information insertion unit configured to insert information regarding each of the predetermined number of subtitle streams into the container.
6. The transmission device according to claim 5, wherein the information regarding each of the subtitle streams includes flag information indicating whether or not a corresponding subtitle stream has a plurality of pieces of subtitle information.
7. The transmission device according to claim 5, wherein the information regarding each of the subtitle streams includes identification information identifying a corresponding subtitle stream.
8. The transmission device according to claim 5, wherein the information regarding each of the subtitle streams includes identification information identifying each subtitle information that a corresponding subtitle stream has.
9. A transmission method comprising: a subtitle encoding step of generating a predetermined number of subtitle streams each having one piece or two or more pieces of subtitle information; and a transmission step of transmitting, by a transmission unit, a container of a predetermined format including the predetermined number of subtitle streams.
10. A reception device comprising: a reception unit configured to receive a container of a predetermined format including a predetermined number of subtitle streams each having one piece or two or more pieces of subtitle information; and a control unit configured to control first extraction processing of extracting one subtitle stream from the predetermined number of subtitle streams and second extraction processing of extracting one piece of subtitle information from the extracted one subtitle stream.
11. The reception device according to claim 10, wherein information regarding each of the predetermined number of subtitle streams is inserted in the container, and the control unit further controls display processing of user interface information for the first extraction processing and the second extraction processing on the basis of the information regarding each of the predetermined number of subtitle streams.
12. A reception method comprising: a reception step of receiving, by a reception unit, a container of a predetermined format including a predetermined number of subtitle streams each having one piece or two or more pieces of subtitle information; and a control step of controlling first extraction processing of extracting one subtitle stream from the predetermined number of subtitle streams and second extraction processing of extracting one piece of subtitle information from the extracted one subtitle stream.
Description
MODE FOR CARRYING OUT THE INVENTION
[0043] Hereinafter, a mode for implementing the present invention (hereinafter referred to as an embodiment) will be described. Note that the description will be given in the following order.
[0044] 1. Embodiment
[0045] 2. Modification
[0046] <1. Embodiment>
[Configuration Example of Transmission/Reception System]
[0047]
[0048] The transport stream TS includes a predetermined number of subtitle streams together with a video stream having video data and an audio stream having audio data. Each of the predetermined number of subtitle streams has one piece or two or more pieces of subtitle information. As the subtitle information, text information of a subtitle (caption), for example, TTML, a derivative format of the TTML, or the like can be considered. In this embodiment, the subtitle information is the TTML, and the subtitle stream has segmented subtitle information.
[0049] The broadcast transmission system 100 inserts information regarding each of the predetermined number of subtitle streams into the transport stream TS as a container. This information includes, for example, flag information indicating whether or not a corresponding subtitle stream has a plurality of pieces of subtitle information, identification information identifying the corresponding subtitle stream, identification information identifying each subtitle information that the corresponding subtitle stream has, and the like. With the information insertion, a reception side can appropriately perform display processing of user interface information for a user to perform a selection operation for desired subtitle display.
[0050] The television receiver 200 receives the transport stream TS sent from the broadcast transmission system 100. The television receiver 200 applies decoding processing to the video stream having video data to obtain the video data and applies decoding processing to the audio stream having audio data to obtain the audio data.
[0051] The television receiver 200 extracts one subtitle stream from the predetermined number of subtitle streams and extracts one piece of subtitle information from the extracted one subtitle stream according to a user's selection operation. Then, the television receiver 200 applies decoding processing to the extracted one piece of subtitle information to obtain bitmap data of a subtitle, and superimposes the bitmap data on the video data to obtain video data for display.
[0052] In this case, the television receiver 200 displays the user interface information (see
[0053] In this embodiment, it is assumed that a subtitle stream 1 (Packet id1) and a subtitle stream 2 (Packet id2) are included in the transport stream TS, and each of the subtitle stream 1 and the subtitle stream 2 has three pieces of subtitle information.
[0054] Here, the subtitle stream 1 has three pieces of subtitle information with the language of English and the content of normal, hard of hearing, and non-native, respectively. Furthermore, the subtitle stream 2 has three pieces of subtitle information with the language of French and the content of normal, hard of hearing, and non-native, respectively.
[0055]
[0056]
[0057] In language selection, selection of English or French is possible. Furthermore, in content selection (Subtitle Type Selection), selection of normal subtitle (Normal Subtitle), hard of hearing subtitle (Hard of Hearing Subtitle), or non-native subtitle (Non-native Subtitle) is possible. The illustrated example indicates a state in which normal subtitle in English has been selected.
[0058]
[0059] Here, Normal1 has a segment type of 1 because it is normal subtitle information, and is subtitle information for displaying "xxx yy", for example. Hard of hearing1 has a segment type of 2 because it is hard of hearing subtitle information, and is subtitle information for displaying "ggggjjjj", for example. Non-native1 has a segment type of 3 because it is non-native subtitle information, and is subtitle information for displaying "Fff hi", for example.
[0060]
[0061] Furthermore, the subtitle stream with display timing of T2 has subtitle information of Normal2, Hard of hearing2, and Non-native2.
[0062] Here, Normal2 has a segment type of 1 because it is normal subtitle information, and is subtitle information for displaying "xxx yy zzzz", for example. Hard of hearing2 has a segment type of 2 because it is hard of hearing subtitle information, and is subtitle information for displaying "G hg jkj jk", for example. Non-native2 has a segment type of 3 because it is non-native subtitle information, and is subtitle information for displaying "Fff hi jjj", for example.
[0063]
[0064] [Configuration Example of Stream Generation Unit of Broadcast Transmission System]
[0065]
[0066] The control unit 111 includes, for example, a central processing unit (CPU), and controls operation of each unit of the stream generation unit 110. The video encoder 112 receives video data DV and applies encoding to the video data DV to generate a video stream configured by a video PES packet having encoded video data in the payload. The audio encoder 113 receives audio data DA and applies encoding to the audio data DA to generate an audio stream configured by an audio PES packet having encoded audio data.
[0067] The text format conversion unit 114 receives text data (character code) DT and obtains Timed Text Markup Language (TTML) as subtitle information.
[0068] The metadata includes information of a title of the metadata, information of copyright, and the like. The styling includes information such as a position of region, a size, a color, a font (fontFamily), a font size (fontSize), and text alignment (textAlign), in addition to an identifier (id). The layout includes information such as offset (padding), a background color (backgroundColor), and alignment (displayAlign), in addition to an identifier (id) of the region where the subtitle is arranged. The body includes information of the subtitle. Display start timing and display end timing are described and text data is described for each subtitle.
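As a non-limiting illustration, the four parts of the TTML described above (metadata, styling, layout, and body) can be sketched with the Python standard library. The helper names build_ttml and q, the identifiers s1 and r1, and all attribute values are assumptions chosen for the example; only the element and attribute names come from TTML itself.

```python
import xml.etree.ElementTree as ET

# TTML namespaces (W3C TTML1).
TT = "http://www.w3.org/ns/ttml"
TTS = "http://www.w3.org/ns/ttml#styling"
TTM = "http://www.w3.org/ns/ttml#metadata"
XML = "http://www.w3.org/XML/1998/namespace"

ET.register_namespace("", TT)
ET.register_namespace("tts", TTS)
ET.register_namespace("ttm", TTM)


def q(ns, name):
    """Return a namespace-qualified name in ElementTree's {uri}name form."""
    return f"{{{ns}}}{name}"


def build_ttml(text, begin, end, lang="en"):
    """Build a minimal TTML document: head (metadata, styling, layout) and body."""
    tt = ET.Element(q(TT, "tt"), {q(XML, "lang"): lang})
    head = ET.SubElement(tt, q(TT, "head"))

    # metadata: title and copyright (illustrative values)
    metadata = ET.SubElement(head, q(TT, "metadata"))
    ET.SubElement(metadata, q(TTM, "title")).text = "Sample subtitle"
    ET.SubElement(metadata, q(TTM, "copyright")).text = "Sample copyright"

    # styling: region position and size, color, font, font size, text alignment
    styling = ET.SubElement(head, q(TT, "styling"))
    ET.SubElement(styling, q(TT, "style"), {
        q(XML, "id"): "s1",
        q(TTS, "origin"): "10% 80%",
        q(TTS, "extent"): "80% 10%",
        q(TTS, "color"): "white",
        q(TTS, "fontFamily"): "sansSerif",
        q(TTS, "fontSize"): "100%",
        q(TTS, "textAlign"): "center",
    })

    # layout: the region with padding, background color, and display alignment
    layout = ET.SubElement(head, q(TT, "layout"))
    ET.SubElement(layout, q(TT, "region"), {
        q(XML, "id"): "r1",
        "style": "s1",
        q(TTS, "padding"): "2px",
        q(TTS, "backgroundColor"): "black",
        q(TTS, "displayAlign"): "after",
    })

    # body: one subtitle with display start timing, display end timing, and text
    body = ET.SubElement(tt, q(TT, "body"))
    div = ET.SubElement(body, q(TT, "div"))
    p = ET.SubElement(div, q(TT, "p"), {"region": "r1", "begin": begin, "end": end})
    p.text = text
    return tt


doc = build_ttml("xxx yy", "00:00:01.000", "00:00:03.000")
xml_text = ET.tostring(doc, encoding="unicode")
```

The serialized xml_text is the kind of document that would subsequently be converted into a TTML segment.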
[0069] The text format conversion unit 114 obtains a plurality of types of TTML corresponding to the same display timing. In this embodiment, the following six types of TTML are obtained.
[0070] (1) TTML with the language of English and the content of normal, (2) TTML with the language of English and the content of hard of hearing, (3) TTML with the language of English and the content of non-native, (4) TTML with the language of French and the content of normal, (5) TTML with the language of French and the content of hard of hearing, and (6) TTML with the language of French and the content of non-native.
[0071] The subtitle encoder 115 converts the six types of TTML obtained in the text format conversion unit 114 into segments (TTML segments). Then, the subtitle encoder 115 generates a subtitle stream 1 including a subtitle PES packet in which the TTML segments of the above (1) to (3) with the language of English are arranged in the payload, and generates a subtitle stream 2 including a subtitle PES packet in which the TTML segments of the above (4) to (6) with the language of French are arranged in the payload.
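As a non-limiting illustration, the grouping of the six types of TTML into one subtitle stream per language can be sketched as follows. The function name group_by_language, the language keys, and the placeholder TTML strings are assumptions made for the example; the segment type values 0x01 to 0x03 are those used for normal, hard of hearing, and non-native subtitles in this description.

```python
from collections import defaultdict

# Segment type per content variant, as used in this description.
SEGMENT_TYPE = {"normal": 0x01, "hard_of_hearing": 0x02, "non_native": 0x03}


def group_by_language(ttml_variants):
    """Group (language, content, ttml) triples into one stream per language.

    Each stream is a list of (segment_type, ttml) pairs, in input order.
    """
    streams = defaultdict(list)
    for language, content, ttml in ttml_variants:
        streams[language].append((SEGMENT_TYPE[content], ttml))
    return dict(streams)


# Six TTML variants for one display timing (placeholder documents).
variants = [
    ("eng", "normal", "<tt>(1)</tt>"),
    ("eng", "hard_of_hearing", "<tt>(2)</tt>"),
    ("eng", "non_native", "<tt>(3)</tt>"),
    ("fra", "normal", "<tt>(4)</tt>"),
    ("fra", "hard_of_hearing", "<tt>(5)</tt>"),
    ("fra", "non_native", "<tt>(6)</tt>"),
]

streams = group_by_language(variants)
```

Here streams["eng"] corresponds to the subtitle stream 1 and streams["fra"] to the subtitle stream 2.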
[0072] Note that, in this embodiment, at least a font download segment (Font_download_segment) having download information for downloading a file of a font designated in font designation information of the TTML is also included in the subtitle streams 1 and 2. In other words, the subtitle encoder 115 inserts the font download segment into the payload of the subtitle PES packet configuring each of the subtitle streams 1 and 2.
[0073]
[0074] A field of Optional_PES_header( ) exists after PES_packet_length. In this field, time stamps such as PTS and DTS are arranged. After this field, a field of PES_packet_data_byte exists. This field corresponds to a PES payload. In this field, PES_data_byte_field( ) for storing data is arranged.
[0075]
[0076] An 8-bit field of subtitle_stream_id indicates an identifier for identifying the type of subtitle stream. In a case of a subtitle stream transmitting text information, the type is set to a new value, for example, 0x01, so that the stream can be distinguished from a subtitle stream of type 0x00, which transmits a conventional bitmap.
[0077] After this field of subtitle_stream_id, a field of TimedTextSubtitling_segments( ) exists following a pattern of 00001111. In this field, a subtitle segment (Subtitle_segment) is arranged. After this field, an 8-bit field of end_of_PES_data_field_marker exists. This field is a marker indicating end of the PES packet.
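As a non-limiting illustration, reading the fields described above can be sketched as follows. The per-segment header used here (the 00001111 pattern, then an 8-bit segment type and a 16-bit length) is a simplifying assumption made for the example and is not the syntax defined in this description; the marker value 0xFF in the sample data is likewise an assumption.

```python
import struct

SYNC_BYTE = 0x0F  # the bit pattern 00001111 that precedes a segment


def make_segment(segment_type, payload):
    """Build one segment using the assumed header: sync, type, 16-bit length."""
    return struct.pack(">BBH", SYNC_BYTE, segment_type, len(payload)) + payload


def parse_pes_data_field(data):
    """Return (subtitle_stream_id, [(segment_type, payload), ...]).

    Parsing stops at the first byte that is not the sync pattern, which is
    where the end_of_PES_data_field_marker is expected.
    """
    subtitle_stream_id = data[0]
    pos = 1
    segments = []
    while pos < len(data) and data[pos] == SYNC_BYTE:
        segment_type, length = struct.unpack_from(">BH", data, pos + 1)
        start = pos + 4  # skip sync (1), segment type (1), and length (2)
        segments.append((segment_type, data[start:start + length]))
        pos = start + length
    return subtitle_stream_id, segments


sample = (
    bytes([0x01])                               # subtitle_stream_id: text subtitle
    + make_segment(0x01, b"<tt>normal</tt>")
    + make_segment(0x02, b"<tt>hoh</tt>")
    + bytes([0xFF])                             # assumed end marker value
)
stream_id, segments = parse_pes_data_field(sample)
```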
[0078]
[0079]
[0080] Referring back to
[0081] In a case where the segment type is 0x01, 0x02, 0x03, 0x11, or 0x12, a TTML document (see
[0082]
[0083] A 16-bit field of original_network_id indicates identification information of a network to which download data is transmitted. A 16-bit field of transport_stream_id indicates identification information of individual transport streams. A 16-bit field of service_id indicates identification information of a service to be downloaded. In a case of a download target common to distribution media, a font file may be sent not by its own transport stream but by another transport stream, and as information for specifying a referenced private section in that case, the information of original_network_id, transport_stream_id, and service_id can be designated.
[0084] An 8-bit field of font_file_id indicates an identification number assigned to the font file. A 24-bit field of ISO_639_language_code indicates a code having three characters for identifying a language. For example, jpn indicates Japanese and eng indicates English. An 8-bit field of font_group_id indicates identification information of a font group and corresponds to the generic family of TTML. An 8-bit field of font_name_id indicates individual font names.
[0085] An 8-bit field of url_type indicates a type of a server. For example, 0x01 indicates a font server (uncompressed URL), 0x02 indicates a general server (uncompressed URL), 0x11 indicates a font server (compressed URL), and 0x12 indicates a general server (compressed URL). An 8-bit field of url_string_length indicates the length (size) of a character code portion indicating a character string of subsequent URL in the number of bytes. The character code is arranged in the field of char.
[0086] Referring back to
[0087] In this case, the TS formatter 116 inserts information regarding each of the two subtitle streams 1 and 2 included in the transport stream TS into a program map table (PMT). Specifically, the TS formatter 116 generates a text subtitle descriptor (Text_subtitle_descriptor) to be newly defined and having the information, and inserts the text subtitle descriptor into a subtitle elementary stream loop (Subtitle ES loop) corresponding to each of the subtitle streams 1 and 2.
[0088]
[0089] An 8-bit field of packet_type indicates a packet type, as illustrated in
[0090]
[0091] Referring back to
[0092] Note that, in this embodiment, at least a font file descriptor (Font_file_descriptor) having download information for downloading a file of a font designated in font designation information of the TTML is inserted into the subtitle elementary stream loop (Subtitle ES loop) corresponding to each of the subtitle streams 1 and 2.
[0093]
[0094] An operation of the stream generation unit 110 illustrated in
[0095] Furthermore, the audio data DA is supplied to the audio encoder 113. In the audio encoder 113, encoding is applied to the audio data DA, and an audio stream including an audio PES packet having encoded audio data is generated. This audio stream is supplied to the TS formatter 116.
[0096] Furthermore, the text data (character code) DT is supplied to the text format conversion unit 114. In the text format conversion unit 114, TTML as caption information is obtained (see
[0097] The six types of TTML obtained in the text format conversion unit 114 are supplied to the subtitle encoder 115. In the subtitle encoder 115, the six types of TTML are converted into segments (TTML segments) (see
[0098] Note that, in the subtitle encoder 115, at least the font download segment (Font_download_segment) having download information for downloading a file of a font designated in font designation information of the TTML is also included in the subtitle streams 1 and 2 (see
[0099] In the TS formatter 116, the video stream generated in the video encoder 112, the audio stream generated in the audio encoder 113, and the subtitle streams 1 and 2 generated in the subtitle encoder 115 are transport-packetized and multiplexed, and transport stream TS as a container (multiplexed stream) is generated.
[0100] In this case, in the TS formatter 116, a text subtitle descriptor (Text_subtitle_descriptor) having information regarding a corresponding subtitle stream is inserted (see
[0101] [Configuration Example of Transport Stream TS]
[0102]
[0103] In the subtitle 1 PES packet, three types of TTML segments having subtitle information with the language of English (=1st language) are inserted in the PES payload. In other words, in this PES payload, a TTML segment of normal subtitle (Normal subtitle) with the segment type of 0x01, a TTML segment of hard of hearing subtitle (Hard_of_hearing subtitle) with the segment type of 0x02, and a TTML segment of non-native subtitle (Non-native subtitle) with the segment type of 0x03 are inserted. Furthermore, in this PES payload, a font download segment having a segment type of 0x84 is also inserted.
[0104] Similarly, in the subtitle 2 PES packet, three types of TTML segments having subtitle information with the language of French (=2nd language) are inserted in the PES payload. In other words, in this PES payload, a TTML segment of normal subtitle (Normal subtitle) with the segment type of 0x01, a TTML segment of hard of hearing subtitle (Hard_of_hearing subtitle) with the segment type of 0x02, and a TTML segment of non-native subtitle (Non-native subtitle) with the segment type of 0x03 are inserted. Furthermore, in this PES payload, a font download segment having a segment type of 0x84 is also inserted.
[0105] Furthermore, the transport stream TS includes a program map table (PMT) as program specific information (PSI).
[0106] This PSI is information describing which program each elementary stream included in the transport stream TS belongs to. In the PMT, a program descriptor that describes information related to the entire program exists.
[0107] In this PMT, subtitle 1 elementary stream loop (Subtitle 1 ES loop) having information related to the subtitle stream 1 exists. In this loop, information such as a packet identifier (PID) is arranged and a descriptor describing information related to the subtitle stream is also arranged corresponding to the subtitle stream 1.
[0108] As this descriptor, a text subtitle descriptor (Text_subtitle_descriptor) and font file descriptor (Font_file_descriptor) are inserted (see
[0109] Furthermore, in this PMT, subtitle 2 elementary stream loop (Subtitle 2 ES loop) having information related to the subtitle stream 2 exists. In this loop, information such as a packet identifier (PID) is arranged and a descriptor describing information related to the subtitle stream is also arranged corresponding to the subtitle stream 2.
[0110] As this descriptor, a text subtitle descriptor (Text_subtitle_descriptor) and font file descriptor (Font_file_descriptor) are inserted (see
[0111] [Configuration Example of Television Receiver]
[0112] Furthermore, the television receiver 200 includes a CPU 221, a flash ROM 222, a DRAM 223, an internal bus 224, a remote control reception unit 225, a remote control transmitter 226, and a communication interface 227.
[0113] The CPU 221 controls operation of each unit of the television receiver 200. The flash ROM 222 stores control software and data. The DRAM 223 serves as a work area of the CPU 221. The CPU 221 loads the software and data read from the flash ROM 222 into the DRAM 223, starts the software, and controls each unit of the television receiver 200.
[0114] The remote control reception unit 225 receives a remote control signal (remote control code) transmitted from the remote control transmitter 226, and supplies the remote control code to the CPU 221. The CPU 221 controls each unit of the television receiver 200 on the basis of the remote control code. The CPU 221, the flash ROM 222, and the DRAM 223 are connected to the internal bus 224.
[0115] The communication interface 227 performs communication with a server existing on a network such as the Internet under the control of the CPU 221. The communication interface 227 is connected to the internal bus 224.
[0116] The reception unit 201 receives the transport stream TS transmitted on the broadcast wave from the broadcast transmission system 100. As described above, the transport stream TS includes the video stream, the audio stream, and the subtitle streams 1 and 2. The TS analysis unit 202 extracts streams of a video, an audio, and a subtitle from the transport stream TS.
[0117] In this case, the TS analysis unit 202 analyzes various types of information inserted in the header of each TS packet, and selectively extracts a TS packet including data of each PES packet of the video, the audio, or the subtitle on the basis of PID to obtain each stream of the video, the audio, or the subtitle.
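As a non-limiting illustration, the PID-based extraction described above can be sketched as follows. The sketch relies only on the standard MPEG-2 transport stream packet layout (188-byte packets, sync byte 0x47, 13-bit PID in the second and third header bytes); the function names and the PID values in the sample are assumptions made for the example, and reassembly of PES packets is omitted.

```python
TS_PACKET_SIZE = 188
TS_SYNC_BYTE = 0x47


def packet_pid(packet):
    """Return the 13-bit PID from the header of one TS packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != TS_SYNC_BYTE:
        raise ValueError("not a valid TS packet")
    return ((packet[1] & 0x1F) << 8) | packet[2]


def filter_by_pid(ts, wanted_pid):
    """Yield the TS packets in a byte string whose PID equals wanted_pid."""
    for offset in range(0, len(ts) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = ts[offset:offset + TS_PACKET_SIZE]
        if packet_pid(packet) == wanted_pid:
            yield packet


def make_packet(pid):
    """Build a dummy TS packet with the given PID and a zero-filled payload."""
    header = bytes([TS_SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes(TS_PACKET_SIZE - len(header))


# Three packets: two carry the hypothetical subtitle PID 0x0100.
ts = make_packet(0x0100) + make_packet(0x0101) + make_packet(0x0100)
subtitle_packets = list(filter_by_pid(ts, 0x0100))
```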
[0118] Furthermore, the TS analysis unit 202 analyzes the various types of information inserted in the header of each TS packet, extracts various types of information inserted in the transport stream TS on the basis of PID, and sends the information to the CPU 221. This information also includes the text subtitle descriptor and the font file descriptor (see
[0119] The CPU 221 obtains information regarding a corresponding subtitle stream from the text subtitle descriptor. This information includes, for example, flag information indicating whether or not the corresponding subtitle stream has a plurality of pieces of subtitle information, identification information identifying the corresponding subtitle stream, identification information identifying each subtitle information that the corresponding subtitle stream has, and the like. Furthermore, the CPU 221 acquires at least information for downloading a file of a font designated in font designation information of TTML from the font file descriptor.
[0120] The audio decoder 207 applies decoding processing to the audio stream extracted in the TS analysis unit 202 to obtain the audio data. The audio output circuit 208 applies necessary processing such as D/A conversion and amplification to the audio data, and supplies the audio data to the speaker 209. The video decoder 203 applies decoding processing to the video stream extracted in the TS analysis unit 202 to obtain the video data.
[0121] The subtitle decoder 210 applies decoding processing to the subtitle stream extracted in the TS analysis unit 202 to obtain the TTML from the timed text subtitle segment (TimedText subtitle segments).
[0122] In this case, only one of the two subtitle streams 1 and 2 included in the transport stream TS is selectively extracted and supplied from the TS analysis unit 202 to the subtitle decoder 210. Furthermore, in the subtitle decoder 210, only one of the three TTML segments included in the subtitle stream supplied from the TS analysis unit 202 is selectively extracted and decoded to obtain the TTML.
[0123] Selection of the stream is performed as the CPU 221 supplies information of the packet type (Packet type) (see
[0124] For example, in a case where English is selected, the packet type is 0x11, and the TS analysis unit 202 extracts the subtitle stream 1. Furthermore, for example, in a case where French is selected, the packet type is 0x12, and the TS analysis unit 202 extracts the subtitle stream 2.
[0125] Furthermore, selection of the TTML segment is performed as the CPU 221 supplies information of the segment type (Segment type) (see
[0126] For example, in the case where normal subtitle (Normal Subtitle) is selected, the segment type is 0x01, and the subtitle decoder 210 extracts a TTML segment including normal TTML. Furthermore, for example, in a case where hard of hearing subtitle (Hard of Hearing Subtitle) is selected, the segment type is 0x02, and the subtitle decoder 210 extracts a TTML segment including hard of hearing TTML. Furthermore, for example, in a case where non-native subtitle (Non-native Subtitle) is selected, the segment type is 0x03, and the subtitle decoder 210 extracts a TTML segment including non-native TTML.
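As a non-limiting illustration, the two-step selection described above (first a subtitle stream by packet type, then a TTML segment by segment type) can be sketched as follows. The nested dictionary standing in for the received streams, the function name select_subtitle, and the placeholder TTML strings are assumptions made for the example; the packet type and segment type values are those given in this description.

```python
PACKET_TYPE_BY_LANGUAGE = {"English": 0x11, "French": 0x12}
SEGMENT_TYPE_BY_CONTENT = {
    "Normal Subtitle": 0x01,
    "Hard of Hearing Subtitle": 0x02,
    "Non-native Subtitle": 0x03,
}


def select_subtitle(streams, language, content):
    """Pick one TTML from streams[packet_type][segment_type]."""
    packet_type = PACKET_TYPE_BY_LANGUAGE[language]    # first extraction: stream
    segment_type = SEGMENT_TYPE_BY_CONTENT[content]    # second extraction: segment
    return streams[packet_type][segment_type]


# Stand-in for the two received subtitle streams (placeholder TTML).
received = {
    0x11: {0x01: "<tt>en normal</tt>", 0x02: "<tt>en hoh</tt>", 0x03: "<tt>en nn</tt>"},
    0x12: {0x01: "<tt>fr normal</tt>", 0x02: "<tt>fr hoh</tt>", 0x03: "<tt>fr nn</tt>"},
}

chosen = select_subtitle(received, "French", "Hard of Hearing Subtitle")
```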
[0127] The subtitle decoder 210 sends TTML, which has been obtained by applying decoding processing to the extracted one TTML segment, to the CPU 221. The CPU 221 acquires caption display position information and the like from the TTML.
[0128] Furthermore, the subtitle decoder 210 extracts the font download segment (see
[0129] Furthermore, the subtitle decoder 210 converts text data (font data) of the caption (subtitle) at each caption display position (region) included in the TTML into bitmap data (binary image information) under the control of the CPU 221.
[0130] Here, the subtitle decoder 210 uses the file of the font designated in the font designation information of the TTML when obtaining the bitmap data of the caption under the control of the CPU 221. When the television receiver 200 does not have the font file designated in the font designation information, the CPU 221 appropriately downloads the font file from a broadcast signal (transport stream TS) or a server on the network on the basis of the download information inserted in the PES packet, the PMT, or the like, as described above, and uses the downloaded font file. Note that, when the file cannot be downloaded, the CPU 221 uses a substitute font file (for example, a default font file).
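As a non-limiting illustration, the order of preference described above (a font file already held by the receiver, then a downloaded font file, then a substitute font file) can be sketched as follows. The function name resolve_font, the download callable, and the use of OSError to signal a failed download are assumptions made for the example.

```python
def resolve_font(font_name, installed, download, default_font):
    """Return the font file to use for rendering.

    installed: dict mapping font names to font files held by the receiver.
    download:  callable returning a font file, or raising OSError on failure.
    """
    if font_name in installed:
        return installed[font_name]
    try:
        return download(font_name)
    except OSError:
        # The file cannot be downloaded, so fall back to the substitute font.
        return default_font


def failing_download(name):
    raise OSError("download failed: " + name)


installed_fonts = {"FontA": b"font-a-data"}

local = resolve_font("FontA", installed_fonts, failing_download, b"default")
fetched = resolve_font("FontB", installed_fonts, lambda name: b"font-b-data", b"default")
fallback = resolve_font("FontC", installed_fonts, failing_download, b"default")
```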
[0131] The video superimposition unit 204 superimposes the bitmap data of the caption at each caption display position obtained in the subtitle decoder 210 on the video data obtained in the video decoder 203 to obtain display video data under the control of the CPU 221. In this case, the CPU 221 performs control such that a superimposition position of the bitmap data of the caption is located at the caption display position determined by the subtitle display position information.
[0132] The panel drive circuit 205 drives the display panel 206 on the basis of the display video data obtained in the video superimposition unit 204. The display panel 206 is configured by, for example, a liquid crystal display (LCD), an organic electroluminescence (EL) display, or the like.
[0133] An operation of the television receiver 200 illustrated in
[0134] Furthermore, in the TS analysis unit 202, various types of information inserted in the transport stream TS are extracted and sent to the CPU 221. This information also includes the text subtitle descriptor and the font file descriptor (see
[0135] With the information, the CPU 221 obtains the information regarding the corresponding subtitle stream from the text subtitle descriptor. Furthermore, in the CPU 221, at least information for downloading a file of a font designated in font designation information of TTML is acquired from the font file descriptor.
[0136] The video stream extracted in the TS analysis unit 202 is supplied to the video decoder 203. In the video decoder 203, decoding processing is applied to the video PES stream and the video data is obtained.
[0137] Furthermore, the subtitle stream extracted in the TS analysis unit 202 is supplied to the subtitle decoder 210. In the subtitle decoder 210, decoding processing is applied to the subtitle stream, and the TTML is obtained from the timed text subtitle segments.
[0138] In this case, only one of the two subtitle streams 1 and 2 included in the transport stream TS is selectively extracted and supplied from the TS analysis unit 202 to the subtitle decoder 210. Furthermore, in the subtitle decoder 210, only one of the three TTML segments included in the subtitle stream supplied from the TS analysis unit 202 is selectively extracted and decoded to obtain the TTML.
[0139] Selection of the stream in the TS analysis unit 202 is performed under the control of the CPU 221 on the basis of the user's or system's language selection information. Note that selection of the TTML segment in the subtitle decoder 210 is performed under the control of the CPU 221 on the basis of the user's or system's content selection information. The user can cause a desired subtitle to be displayed by selecting language and content.
[0140] In the subtitle decoder 210, the font download segment is extracted from the subtitle stream obtained in the TS analysis unit 202 and is sent to the CPU 221. In the CPU 221, at least information for downloading a file of a font designated in font designation information of TTML is acquired from the font download segment.
[0141] The TTML obtained in the subtitle decoder 210 is sent to the CPU 221. In the CPU 221, the caption display position information and the like are acquired from the TTML.
[0142] Furthermore, in the subtitle decoder 210, the font download segment (see
[0143] Furthermore, in the subtitle decoder 210, the text data (font data) of the caption (subtitle) at each caption display position (region) included in the TTML is converted into bitmap data (binary image information) under the control of the CPU 221.
[0144] Here, in the subtitle decoder 210, the file of the font designated in the font designation information of the TTML is used when the bitmap data of the caption is obtained under the control of the CPU 221. When the television receiver 200 does not have the font file designated in the font designation information, the CPU 221 appropriately downloads the font file from a broadcast signal (transport stream TS) or a server on the network on the basis of the download information inserted in the PES packet, the PMT, or the like, as described above, and uses the downloaded font file. Note that, when the file cannot be downloaded, the CPU 221 uses a substitute font file (for example, a default font file).
[0145] The bitmap data of the caption at each caption display position output from the subtitle decoder 210 is supplied to the video superimposition unit 204. In the video superimposition unit 204, the bitmap data of the caption at each caption display position obtained in the subtitle decoder 210 is superimposed on the video data obtained in the video decoder 203, and the display video data is obtained. In this case, the CPU 221 controls the superimposition position of the bitmap data of the caption so that it is located at the caption display position determined by the caption display position information.
[0146] The display video data obtained in the video superimposition unit 204 is supplied to the panel drive circuit 205. In the panel drive circuit 205, the display panel 206 is driven on the basis of the display video data. With the operation, an image in which the caption (subtitle) is superimposed on each caption display position (region) is displayed on the display panel 206.
[0147] Furthermore, the audio stream extracted in the TS analysis unit 202 is supplied to the audio decoder 207. In the audio decoder 207, decoding processing is applied to the audio stream, and the audio data is obtained. This audio data is supplied to the audio output circuit 208. In the audio output circuit 208, necessary processing such as D/A conversion and amplification is performed on the audio data. Then, the processed audio data is supplied to the speaker 209. As a result, a sound output corresponding to the display image on the display panel 206 is obtained from the speaker 209.
[0148] As described above, in the transmission/reception system 10 illustrated in
[0149] Furthermore, in the transmission/reception system 10 illustrated in
[0150] Furthermore, in the transmission/reception system 10 illustrated in
[0151] <2. Modification>
[0152] Note that, in the above-described embodiment, the case in which the subtitle stream 1 (Packet id1) having the three pieces of subtitle information (TTML segments) with the language of English and the content of normal, hard of hearing and non-native and the subtitle stream 2 (Packet id2) having the three pieces of subtitle information (TTML segments) with the language of French and the content of normal, hard of hearing, and non-native are included in the transport stream TS generated in the broadcast transmission system 100 has been described.
[0153] However, an example in which a subtitle stream 1 (Packet id1) having subtitle information (TTML segment) with content of normal, a subtitle stream 2 (Packet id2) having subtitle information (TTML segment) with content of hard of hearing, and a subtitle stream 3 (Packet id3) having subtitle information (TTML segment) with content of non-native are included in a transport stream TS generated in a broadcast transmission system 100 can also be considered.
[0154]
[0155] Furthermore, the subtitle stream 3 has two pieces of subtitle information with the content of non-native and languages of English and French, respectively.
[0156]
[0157] First, in stream extraction processing (first extraction processing), a subtitle stream including the subtitle information for performing desired subtitle display is extracted from the subtitle streams 1, 2, and 3. Next, in subtitle information extraction processing (second extraction processing), the subtitle information for performing desired subtitle display is extracted from the extracted subtitle stream.
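The two-stage extraction can be sketched as follows for the modification in which each subtitle stream corresponds to one content type and holds one piece of subtitle information per language. The in-memory model, the function names, the numeric keys 1 to 3, and the placeholder TTML strings are assumptions introduced for illustration.

```python
from typing import Dict

# Hypothetical model of the three subtitle streams:
# stream number -> content type and subtitle information keyed by language.
STREAMS: Dict[int, dict] = {
    1: {"content": "normal",
        "info": {"English": "<tt>normal, English</tt>",
                 "French": "<tt>normal, French</tt>"}},
    2: {"content": "hard of hearing",
        "info": {"English": "<tt>hard of hearing, English</tt>",
                 "French": "<tt>hard of hearing, French</tt>"}},
    3: {"content": "non-native",
        "info": {"English": "<tt>non-native, English</tt>",
                 "French": "<tt>non-native, French</tt>"}},
}


def extract_stream(streams: Dict[int, dict], content: str) -> int:
    """First extraction processing: pick the stream with the desired content."""
    for number, stream in streams.items():
        if stream["content"] == content:
            return number
    raise LookupError(f"no subtitle stream with content {content!r}")


def extract_subtitle_info(stream: dict, language: str) -> str:
    """Second extraction processing: pick one piece of subtitle information."""
    return stream["info"][language]


def select_subtitle(streams: Dict[int, dict], content: str, language: str) -> str:
    """Run both extraction stages and return the selected subtitle information."""
    number = extract_stream(streams, content)
    return extract_subtitle_info(streams[number], language)
```

For example, `select_subtitle(STREAMS, "non-native", "French")` selects the subtitle stream 3 in the first stage and its French subtitle information in the second stage.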
[0158]
[0159]
[0160] PES packet Subtitle 2 PES that is a PES packet of the subtitle stream 2 identified with PID2, and further, a subtitle 3 PES packet Subtitle 3 PES that is a PES packet of the subtitle stream 3 identified with PID3 exist.
[0161] In the subtitle 1 PES packet, two types of TTML segments having subtitle information with the content of normal are inserted in a PES payload. In other words, in this PES payload, a TTML segment of English subtitle with the segment type of 0x11 and a TTML segment of French subtitle with the segment type of 0x12 are inserted. Furthermore, in this PES payload, a font download segment having a segment type of 0x84 is also inserted.
[0162] Similarly, in the subtitle 2 PES packet, two types of TTML segments having subtitle information with the content of hard of hearing are inserted in a PES payload. In other words, in this PES payload, a TTML segment of English subtitle with the segment type of 0x11 and a TTML segment of French subtitle with the segment type of 0x12 are inserted. Furthermore, in this PES payload, a font download segment having a segment type of 0x84 is also inserted.
[0163] Similarly, in the subtitle 3 PES packet, two types of TTML segments having subtitle information with the content of non-native are inserted in a PES payload. In other words, in this PES payload, a TTML segment of English subtitle with the segment type of 0x11 and a TTML segment of French subtitle with the segment type of 0x12 are inserted. Furthermore, in this PES payload, a font download segment having a segment type of 0x84 is also inserted.
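As an illustration, a receiver might sort the segments of one PES payload by segment type as sketched below. Only the three segment type values (0x11, 0x12, 0x84) are taken from the description above; the representation of a payload as a list of (segment type, body) pairs and the function name `split_payload` are assumptions, since the byte-level syntax of the payload is not reproduced in this passage.

```python
from typing import Dict, List, Tuple

# Segment type values used in this example of the PES payload.
SEGMENT_TYPE_LANGUAGE = {0x11: "English", 0x12: "French"}
FONT_DOWNLOAD_SEGMENT = 0x84


def split_payload(segments: List[Tuple[int, bytes]]
                  ) -> Tuple[Dict[str, bytes], List[bytes]]:
    """Separate TTML segments (keyed by language) from font download segments.

    `segments` is an assumed, already-parsed view of one PES payload:
    a list of (segment_type, body) pairs.
    """
    ttml: Dict[str, bytes] = {}
    font_downloads: List[bytes] = []
    for segment_type, body in segments:
        if segment_type in SEGMENT_TYPE_LANGUAGE:
            ttml[SEGMENT_TYPE_LANGUAGE[segment_type]] = body
        elif segment_type == FONT_DOWNLOAD_SEGMENT:
            font_downloads.append(body)
        # Segments of any other type are ignored in this sketch.
    return ttml, font_downloads
```

The same routine applies unchanged to the subtitle 1, 2, and 3 PES packets, because in this example the content type is distinguished by the stream and the language by the segment type.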
[0164] Furthermore, the transport stream TS includes a program map table (PMT) as program specific information (PSI). This PSI is information describing which program each elementary stream included in the transport stream TS belongs to. In the PMT, a program descriptor that describes information related to the entire program exists.
[0165] In this PMT, subtitle 1 elementary stream loop (Subtitle 1 ES loop) having information related to the subtitle stream 1 exists. In this loop, information such as a packet identifier (PID) is arranged and a descriptor describing information related to the subtitle stream is also arranged corresponding to the subtitle stream 1.
[0166] As this descriptor, a text subtitle descriptor (Text_subtitle_descriptor) and font file descriptor (Font_file_descriptor) are inserted (see
[0167] Furthermore, in this PMT, subtitle 2 elementary stream loop (Subtitle 2 ES loop) having information related to the subtitle stream 2 exists. In this loop, information such as a packet identifier (PID) is arranged and a descriptor describing information related to the subtitle stream is also arranged corresponding to the subtitle stream 2.
[0168] As this descriptor, a text subtitle descriptor (Text_subtitle_descriptor) and font file descriptor (Font_file_descriptor) are inserted (see
[0169] Furthermore, in this PMT, subtitle 3 elementary stream loop (Subtitle 3 ES loop) having information related to the subtitle stream 3 exists. In this loop, information such as a packet identifier (PID) is arranged and a descriptor describing information related to the subtitle stream is also arranged corresponding to the subtitle stream 3.
[0170] As this descriptor, a text subtitle descriptor (Text_subtitle_descriptor) and font file descriptor (Font_file_descriptor) are inserted (see
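The arrangement of the three subtitle elementary stream loops in the PMT can be modeled as in the following sketch. The class name `SubtitleEsLoop`, the helper functions, and the numeric PID values are hypothetical; only the descriptor names are taken from the description, and the internal fields of the descriptors are deliberately not modeled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SubtitleEsLoop:
    """One subtitle elementary stream loop of the PMT (simplified model)."""
    pid: int                                   # packet identifier of the stream
    descriptors: List[str] = field(default_factory=list)


def build_subtitle_loops(pids: List[int]) -> List[SubtitleEsLoop]:
    """Create one ES loop per subtitle stream, each carrying both descriptors."""
    return [
        SubtitleEsLoop(pid, ["Text_subtitle_descriptor", "Font_file_descriptor"])
        for pid in pids
    ]


def pids_with_descriptor(loops: List[SubtitleEsLoop], name: str) -> List[int]:
    """Return the PIDs of all loops that carry the named descriptor."""
    return [loop.pid for loop in loops if name in loop.descriptors]


# Hypothetical PID values standing in for PID1, PID2, and PID3.
loops = build_subtitle_loops([0x101, 0x102, 0x103])
```

A receiver could walk such a list to find the subtitle streams available in the program before performing the stream extraction processing.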
[0171] Furthermore, in the above-described embodiment, an example in which the container is a transport stream (MPEG-2 TS) has been described. However, the present technology is not limited to the container of MPEG-2 TS, and can be similarly realized with a container in another format such as MMT or ISOBMFF, for example.
[0172] Furthermore, in the above-described embodiment, the transmission/reception system 10 including the broadcast transmission system 100 and the television receiver 200 has been described. However, a configuration of a transmission/reception system to which the present technology can be applied is not limited to the transmission/reception system 10. For example, the television receiver 200 may have a configuration of a set top box, a monitor, and the like connected by a digital interface such as high-definition multimedia interface (HDMI). Note that HDMI is a registered trademark.
[0173] Furthermore, the present technology can also have the following configurations.
[0174] (1) A transmission device including:
[0175] a subtitle encoding unit configured to generate a predetermined number of subtitle streams each having one piece or two or more pieces of subtitle information; and
[0176] a transmission unit configured to transmit a container of a predetermined format including the predetermined number of subtitle streams.
[0177] (2) The transmission device according to (1), in which each of the predetermined number of subtitle streams has segmented subtitle information.
[0178] (3) The transmission device according to (1) or (2), in which
[0179] the subtitle encoding unit generates a plurality of subtitle streams each having subtitle information of a different language, and
[0180] each of the plurality of subtitle streams has a plurality of pieces of subtitle information each having different content.
[0181] (4) The transmission device according to (1) or (2), in which
[0182] the subtitle encoding unit generates a plurality of subtitle streams each having subtitle information of different content, and
[0183] each of the plurality of subtitle streams has a plurality of pieces of subtitle information each having a different language.
[0184] (5) The transmission device according to any one of (1) to (4), further including:
[0185] an information insertion unit configured to insert information regarding each of the predetermined number of subtitle streams into the container.
[0186] (6) The transmission device according to (5), in which
[0187] the information regarding each of the subtitle streams includes flag information indicating whether or not a corresponding subtitle stream has a plurality of pieces of subtitle information.
[0188] (7) The transmission device according to (5) or (6), in which
[0189] the information regarding each of the subtitle streams includes identification information identifying a corresponding subtitle stream.
[0190] (8) The transmission device according to any one of (5) to (7), in which
[0191] the information regarding each of the subtitle streams includes identification information identifying each subtitle information that a corresponding subtitle stream has.
(9) A transmission method including:
[0192] a subtitle encoding step of generating a predetermined number of subtitle streams each having one piece or two or more pieces of subtitle information; and
[0193] a transmission step of transmitting, by a transmission unit, a container of a predetermined format including the predetermined number of subtitle streams.
[0194] (10) A reception device including:
[0195] a reception unit configured to receive a container of a predetermined format including a predetermined number of subtitle streams each having one piece or two or more pieces of subtitle information; and
[0197] a control unit configured to control first extraction processing of extracting one subtitle stream from the predetermined number of subtitle streams and second extraction processing of extracting one piece of subtitle information from the extracted one subtitle stream.
[0198] (11) The reception device according to (10), in which
[0199] information regarding each of the predetermined number of subtitle streams is inserted in the container, and
[0200] the control unit further controls display processing of user interface information for the first extraction processing and the second extraction processing on the basis of the information regarding each of the predetermined number of subtitle streams.
[0202] (12) A reception method including:
[0203] a reception step of receiving, by a reception unit, a container of a predetermined format including a predetermined number of subtitle streams each having one piece or two or more pieces of subtitle information; and
[0204] a control step of controlling first extraction processing of extracting one subtitle stream from the predetermined number of subtitle streams and second extraction processing of extracting one piece of subtitle information from the extracted one subtitle stream.
[0205] A main characteristic of the present technology is to suppress an increase in the number of subtitle streams even if the number of types of subtitle information increases, and therefore to simplify transmission of a plurality of types of subtitle information, by generating and transmitting a subtitle stream including a plurality of pieces of subtitle information (see
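The effect on the number of subtitle streams can be checked with a short calculation. The sketch below compares three layouts for the example of two languages and three content types: one stream per piece of subtitle information (an assumed baseline used only for comparison), one stream per language as in the embodiment, and one stream per content type as in the modification. The function name and the `group_by` labels are introduced for illustration.

```python
def stream_count(num_languages: int, num_contents: int, group_by: str) -> int:
    """Number of subtitle streams needed for a given grouping.

    group_by = "none"     -- one stream per piece of subtitle information
                             (assumed baseline for comparison)
    group_by = "language" -- one stream per language, each holding all contents
    group_by = "content"  -- one stream per content, each holding all languages
    """
    if group_by == "none":
        return num_languages * num_contents
    if group_by == "language":
        return num_languages
    if group_by == "content":
        return num_contents
    raise ValueError(f"unknown grouping: {group_by!r}")


# Two languages (English, French) and three content types
# (normal, hard of hearing, non-native).
baseline = stream_count(2, 3, "none")         # 6 streams
embodiment = stream_count(2, 3, "language")   # 2 streams
modification = stream_count(2, 3, "content")  # 3 streams
```

Adding a further content type raises the baseline by one stream per language, while the per-language layout still needs only two streams.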
[0206]
REFERENCE SIGNS LIST
[0207] 10 Transmission/reception system
[0208] 100 Broadcast transmission system
[0209] 110 Stream generation unit
[0210] 111 Control unit
[0211] 112 Video encoder
[0212] 113 Audio encoder
[0213] 114 Text format conversion unit
[0214] 115 Subtitle encoder
[0215] 116 TS formatter
[0216] 200 Television receiver
[0217] 201 Reception unit
[0218] 202 TS analysis unit
[0219] 203 Video decoder
[0220] 204 Video superimposition unit
[0221] 205 Panel drive circuit
[0222] 206 Display panel
[0223] 207 Audio decoder
[0224] 208 Audio output circuit
[0225] 209 Speaker
[0226] 210 Subtitle decoder
[0227] 221 CPU
[0228] 227 Communication interface