3D video communications
09729847 · 2017-08-08
Cpc classification
H04N13/172
ELECTRICITY
Abstract
The disclosed embodiments relate to determining transmit formats and receive formats for a 3D video communication service. Sets of available 3D video communication transmit formats are received. Each set is associated with a client device. Sets of available 3D video communication receive formats are received. Each set is associated with one of the client devices. One format for transmission of 3D video communication from the at least one client device to the other client devices is determined for at least one of the client devices. The one format is a member of both the set of available 3D video communication transmit formats associated with the at least one client device and the received available 3D video communication receive formats associated with other client devices.
Claims
1. A method of determining transmit formats and receive formats for a 3D video communication service, comprising the steps of: receiving sets of available 3D video communication formats, each set of available 3D video communication formats being received from one of a plurality of client devices, each set of available 3D video communication formats comprising: at least one 3D transmit format for a type of a capture unit being associated with a client device; and at least one 3D receive format associated with a type of a display unit being associated with the client device; and using a preference criterion common for the plurality of client devices to determine a preference order for a plurality of 3D video transmit formats included in the sets of available 3D communication formats; based on the preference order for the plurality of 3D video transmit formats, comparing the sets of available 3D video communication formats to determine, for at least one client device, a particular 3D transmit format for transmission of 3D video communication by the type of the capture unit of the at least one client device that is a preferred 3D video transmit format and corresponds to a particular 3D receive format for receipt of 3D video communication by the type of the display device of at least one other client device; and transmitting, to the plurality of client devices, a message comprising the particular 3D transmit format to be used by the plurality of client devices during a video conference.
2. The method according to claim 1, wherein the method is performed in a central device.
3. The method according to claim 1, wherein the method is distributively performed in each one of the client devices.
4. The method according to claim 1, wherein each set of available 3D video communication transmit formats is associated with one capturing unit, each capturing unit being associated with one client device.
5. The method according to claim 4, wherein a preference order of the set of available 3D video communication transmit formats associated with said one client device is determined by said one capturing unit.
6. The method according to claim 1, wherein each set of available 3D video communication receive formats is associated with one display unit, each display unit being associated with one client device.
7. The method according to claim 6, wherein a preference order of the set of available 3D video communication receive formats associated with said one client device is determined by said one display unit.
8. The method according to claim 1, further comprising determining, based on the preference criterion common for the plurality of client devices, a preference order for a plurality of 3D video receive formats included in the sets of available 3D communication formats.
9. The method according to claim 1, further comprising determining, for at least one of the client devices, one format for reception of 3D video communication from the other client devices to the at least one client device, the one format for reception being a member of both the set of available 3D video communication receive formats associated with the at least one client device and the received available 3D video communication transmit formats associated with other client devices.
10. The method according to claim 1, further comprising receiving an indication of at least one client device being disconnected from the 3D video communication service; and repeating, for all client devices still connected to the 3D video communication service, said step of: determining one format for transmission of 3D video communication for at least one of the client devices.
11. The method according to claim 1, further comprising receiving an indication of at least one further client device being connected to the 3D video communication service; and repeating, for all client devices connected to the 3D video communication service, said steps of: receiving sets of available 3D video communication transmit formats, receiving sets of available 3D video communication receive formats, and determining one format for transmission of 3D video communication for at least one of the client devices.
12. The method according to claim 1, further comprising receiving an indication of the number of client devices to connect to a session of the 3D video communication service; and delaying said step of determining one format for transmission of 3D video communication for at least one of the client device until reception of an indication that at least a pre-determined fraction of the client devices have connected to the session of the 3D video communication service.
13. The method according to claim 12, wherein the pre-determined fraction corresponds to 50%, preferably 75%, most preferably 100%.
14. A non-transitory computer program of determining transmit formats and receive formats for 3D video communication, the computer program comprising computer program code which, when run on a processing unit, causes the processing unit to receive sets of available 3D video communication formats, each set of available 3D video communication formats being received from one of a plurality of client devices, each set of available 3D video communication formats comprising: at least one 3D transmit format for a type of a capture unit being associated with a client device; and at least one 3D receive format associated with a type of a display unit being associated with the client device; use a preference criterion common for the plurality of client devices to determine a preference order for a plurality of 3D video transmit formats included in the sets of available 3D communication formats; based on the preference order for the plurality of 3D video transmit formats, compare the sets of available 3D video communication formats to determine, for at least one of the client devices, a particular 3D transmit format for transmission of 3D video communication by the type of the capture unit of the at least one client device that is a preferred 3D video transmit format and corresponds to a particular 3D receive format for receipt of 3D video communication by the type of the display unit of at least one other client device; and transmit, to the plurality of client devices, a message comprising the particular 3D transmit format to be used by the plurality of client devices during a video conference.
15. A non-transitory computer program product comprising a computer program according to claim 14 and a computer readable means on which the computer program is stored.
16. A local client device for determining transmit formats and receive formats for 3D video communication, comprising: a receiver arranged to acquire a set of available 3D video communication transmit formats from a type of a local capturing unit and to acquire a set of available 3D video communication receive formats from a type of a local display unit; a transmitter arranged to transmit the acquired available 3D video communication transmit and receive formats to at least one remote client device; wherein the receiver is further arranged to receive acquired available 3D video communication formats, the available 3D video communication formats comprising: available 3D video communication transmit formats for a type of a remote capturing unit of the at least one remote client device; and available 3D video communication receive formats for a type of a remote displaying unit of the at least one remote client device; and a processing unit operable to: use a preference criterion common for the local client device and the remote client device to determine a preference order for a plurality of 3D video transmit formats included in the sets of available 3D communication formats; based on the preference order for the plurality of 3D video transmit formats, compare the set of available 3D video communication transmit formats for the type of the local capturing unit and the set of available 3D video communication receive formats for the type of the local display unit with the available 3D video communication formats from the at least one remote client device to determine at least one of: a first particular 3D transmit format for the type of capturing unit that is a preferred 3D video transmit format and corresponds to a first particular 3D receive format for receipt of 3D video communication by a type of a display unit of the at least one remote client device; and a second particular 3D receive format for the type of display unit that is a preferred 3D video transmit format 
and corresponds to a second particular 3D transmit format for transmission of 3D video communication by a type of a capturing unit of the at least one remote client device; and wherein the transmitter is further arranged to transmit, to the at least one remote client device, a message comprising the first particular 3D transmit format to be used by the at least one remote client device during a video conference.
17. The local client device according to claim 16, wherein the transmitter further is arranged to transmit the determined one format for transmission of 3D video communication to the remote client devices.
18. A central device for determining transmit formats and receive formats for 3D video communication, comprising: a receiver arranged to receive, from each of a plurality of client devices, a set of available 3D video communication formats, each set of available 3D video communication formats comprising: at least one 3D transmit format for a type of a capture unit being associated with a client device; and at least one 3D video communication receive format associated with a type of a display unit being associated with the client device; and a processing unit operable to: use a preference criterion common for the plurality of client devices to determine a preference order for a plurality of 3D video transmit formats included in the sets of available 3D video communication formats; based on the preference order for the plurality of 3D video transmit formats, compare the sets of available 3D video communication formats to determine, for at least one client device of the plurality of client devices, a particular 3D transmit format for transmission of 3D video communication by the type of the capture unit of the at least one client device that is a preferred 3D video transmit format and corresponds to a particular 3D receive format for receipt of 3D video communication by the type of the display unit of at least one other client device; and a transmitter arranged to transmit, to the plurality of client devices, a message comprising the particular 3D transmit format to be used by the plurality of client devices during a video conference.
19. The central device according to claim 18, wherein the message comprises one or more codec options associated with the particular 3D transmit format.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Embodiments of the invention will now be described, by way of non-limiting examples, references being made to the accompanying drawings, in which:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
DETAILED DESCRIPTION
(9) The invention will now be described more fully hereinafter with reference to the accompanying drawings, in which certain embodiments of the invention are shown. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided by way of example so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. Like numbers refer to like elements throughout the description.
(10) Consider the video communications systems 1a, 1b illustrated in
(11) For example, if a capture unit 6 in the form of a 2D camera and a display unit comprising a display and polarized glasses to be used by the user are connected to client device 2a, then client device 2a will be able to transmit plain 2D video but would prefer to receive frame compatible format video with stereo content. However, the stream(s) that client device 2a receives may differ from the desired format, as this also depends on the equipment of the other client devices 2b, 2c. For example, if client device 2b has a texture+depth camera, then client device 2a will only receive a frame compatible format video including texture+depth. In this case, client device 2a will have to adapt the received 3D video stream to a frame compatible format video including stereo. To this end, client device 2a will synthesize the new view from the information contained in the received texture+depth. Furthermore, if a new client device 2c is connected to the video communications system, then a new adaptation will be required.
(12) In general terms there are two main categories of capture and display equipment: 2D, which is kept for legacy, and 3D, which is in turn divided into different possibilities. In terms of capture units, 3D images can be obtained by means of a stereo camera where separate left and right views of the scene are captured, with a rig of multiple cameras where multiple views of the scene are captured, with a texture camera and a depth camera (e.g. an infrared camera), or with multiple texture cameras and multiple depth cameras. The selection of the camera determines the video format that can be transmitted from a client device. Similarly, 3D displays may be classified into stereo displays that require special purpose glasses and auto-stereoscopic displays that do not require such special purpose glasses. Displays with polarized or shutter glasses are typical cases of stereo displays with glasses, whereas parallax barriers or lenticular arrays are the typical technologies for two-view or multi-view auto-stereoscopic displays. Likewise, the selection of the display determines the video format that can be received by the client device.
(13) Table 1 summarizes examples of capture units and display units that can be connected to a 3D video conferencing system, such as the video communications systems 1a, 1b of
(14) TABLE 1 Capture units and display units for 3D video communications
Capture units | Display units
2D camera | 2D display
3D camera: | 3D display:
  Stereo camera | Display using polarized-glasses
  Rig with multiple cameras | Display using shutter-glasses
  Texture and depth cameras | Two-view auto-stereoscopic display (glasses free)
  Multiple texture + depth cameras | Multi-view auto-stereoscopic display (glasses free)
(15) For each possible capture unit type and display unit type, video format and coding options can also be determined. Table 2 summarizes video formats and coding options for the capture unit types listed in Table 1. As the skilled person understands, the 3D video streams delivered by one capture unit type may be combined in different ways, and therefore several options may be possible for each capture unit type.
(16) TABLE 2 Video format and coding options for different capture unit types
Capture unit | Video format and coding options
Conventional 2D camera | Conventional 2D video coding, e.g. H.264/AVC or HEVC
Stereo camera | Simulcast (i.e. two separated streams encoded with e.g. H.264/AVC or HEVC); Frame compatible formats (e.g. side-by-side, top-bottom), encoded with some of 2D codecs; Multi-view Video Coding (MVC); MPEG 3DV (currently under development in MPEG)
Rig with multiple cameras | Simulcast (i.e. separated streams); Multi-view Video Coding (a.k.a. MVC); MPEG 3DV
Texture + depth camera | Simulcast (i.e. two separated streams); Frame compatible formats; MPEG 3DV
Multiple texture + depth cameras | Simulcast (i.e. separated streams); Multi-view Video Coding (separately applied to texture and depth); MPEG 3DV
(17) Similarly, Table 3 lists the video format and coding options for the display unit types of Table 1. Table 3 considers the input formats to the display units, not the format needed to achieve a 3D experience. For example, for a polarized screen, the required input is a side-by-side stream that the display unit itself horizontally interlaces to match the polarized lines of the screen. Since this operation is performed directly by the display unit, the side-by-side format, rather than the interlaced format, is considered the explicitly required input.
(18) TABLE 3 Video format and coding options for each display unit type
Display unit | Video format and coding options
2D display | Conventional 2D video coding, e.g. H.264/AVC or HEVC
Polarized-glasses display | Frame compatible formats; Multi-view Video Coding (MVC); MPEG 3DV (for the texture + depth input)
Shutter-glasses display | Conventional 2D video coding but at double the frame rate; Multi-view Video Coding (MVC); MPEG 3DV
Two-view auto-stereoscopic display (glasses free) | Frame compatible formats; Multi-view Video Coding (MVC); MPEG 3DV (for the texture + depth input)
Multi-view auto-stereoscopic display (glasses free) | Frame compatible formats; Multi-view Video Coding (MVC); MPEG 3DV (for the texture + depth input)
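By way of a non-limiting illustration, the relationship between connected equipment and available formats in Tables 2 and 3 may be sketched as a lookup. The dictionary keys, the shortened format labels, and the helper name available_formats are assumptions made for this sketch only and are not part of the disclosure.

```python
# Format options per equipment type, transcribed from Tables 2 and 3.
# Keys, short labels and the function name are illustrative assumptions.
CAPTURE_FORMATS = {
    "2d_camera": ["2D"],
    "stereo_camera": ["simulcast", "frame_compatible", "MVC", "MPEG 3DV"],
    "multi_camera_rig": ["simulcast", "MVC", "MPEG 3DV"],
    "texture_depth_camera": ["simulcast", "frame_compatible", "MPEG 3DV"],
    "multi_texture_depth_cameras": ["simulcast", "MVC", "MPEG 3DV"],
}

DISPLAY_FORMATS = {
    "2d_display": ["2D"],
    "polarized_glasses": ["frame_compatible", "MVC", "MPEG 3DV"],
    "shutter_glasses": ["2D at double frame rate", "MVC", "MPEG 3DV"],
    "two_view_autostereoscopic": ["frame_compatible", "MVC", "MPEG 3DV"],
    "multi_view_autostereoscopic": ["frame_compatible", "MVC", "MPEG 3DV"],
}


def available_formats(capture_unit, display_unit):
    """Return the transmit and receive options for one client device."""
    return {
        "can_transmit": list(CAPTURE_FORMATS[capture_unit]),
        "can_receive": list(DISPLAY_FORMATS[display_unit]),
    }
```

For a client device with a 2D camera and a polarized-glasses display, such a lookup would yield plain 2D as the only transmit option and frame compatible formats among the receive options.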
(19) As noted above, one object of the disclosed subject matter is to provide automatic selection for video formats and coding depending on the equipment connected to each client device 2a, 2b, 2c. In order to do so, there is provided a client device 2a, 2b, 2c (preferably with functional components as illustrated in
(20) According to a first preferred embodiment the client devices 2a, 2b, 2c of
(21) According to a second preferred embodiment the central device 9 of
(22)
(23) Consider now the video communications systems 1a, 1b of
(24) According to the scenarios illustrated in
(25) For the embodiment illustrated in
(26) TABLE 4 Information contained in the messages sent by each client device
 | Client device 2a | Client device 2b | Client device 2c
Client device can transmit: | Plain 2D | FC T + D, T + D, Plain 2D | FC stereo, T + T, Plain 2D
Client device can receive: | Plain 2D, FC stereo, FC T + D, T + T, T + D, MT + MD | FC stereo, T + T, FC T + D, T + D, MT + MD, Plain 2D | MT + MD, T + D, FC T + D, FC stereo, T + T, Plain 2D
Legend:
FC T + D = frame compatible format including texture + depth
T + D = texture + depth in separated streams
FC stereo = frame compatible format including stereo
T + T = texture + texture in separated streams
MT + MD = multiple textures + multiple depths in separated streams
(27) At this point, according to the embodiment of
(28) If, for a client device, the first transmit format option is contained in all the “Client device can receive” lists of the other client devices, then preferably the first format becomes the transmission format of the client device. If the first transmit format is not contained in all of these lists, then the client device (or the central device) investigates the next format until there is a consensus between the formats of the client devices. If there is no consensus, then the client device will use a plain 2D video format to transmit, as this format is receivable by all client devices. For example, consider client device 2b. Its first transmission format is “frame compatible format including texture+depth”, which is contained in the “Client device can receive” lists of both client device 2a and client device 2c. So client device 2b sets its transmission format to “frame compatible format including texture+depth”.
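The selection just described may be sketched as follows. This is a minimal sketch only: the lists are transcribed from Table 4 as read here, with the first entry taken as the most preferred, and the names CAN_TRANSMIT, CAN_RECEIVE and choose_transmit_format are assumptions of the sketch rather than terms of the disclosure.

```python
PLAIN_2D = "Plain 2D"

# Lists as read from Table 4, most preferred format first.
CAN_TRANSMIT = {
    "2a": ["Plain 2D"],
    "2b": ["FC T + D", "T + D", "Plain 2D"],
    "2c": ["FC stereo", "T + T", "Plain 2D"],
}
CAN_RECEIVE = {
    "2a": ["Plain 2D", "FC stereo", "FC T + D", "T + T", "T + D", "MT + MD"],
    "2b": ["FC stereo", "T + T", "FC T + D", "T + D", "MT + MD", "Plain 2D"],
    "2c": ["MT + MD", "T + D", "FC T + D", "FC stereo", "T + T", "Plain 2D"],
}


def choose_transmit_format(sender, can_transmit, can_receive):
    """Pick the first transmit option that every other device can receive."""
    others = [device for device in can_receive if device != sender]
    for fmt in can_transmit[sender]:
        if all(fmt in can_receive[device] for device in others):
            return fmt
    # No consensus: fall back to plain 2D, assumed receivable by all devices.
    return PLAIN_2D
```

With the lists above, client device 2b would obtain “FC T + D”, in line with the example in the text, and client devices 2a and 2c would obtain “Plain 2D” and “FC stereo” respectively.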
(29) The “Client device can receive” lists may be ordered according to preference. A number of ways exist to determine the ordering of the “Client device can receive” list. For example, each display unit 7 (which is associated with a client device) may determine the priority ordering of the receive formats. Alternatively, there may be a global criterion to determine the priority ordering of the receive formats. In other words, the sets of available 3D video communication receive formats may have a preference order according to a preference criterion common for all client devices 2a, 2b, 2c.
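A common preference criterion of this kind may, for instance, be sketched as a shared ranking applied to every list. The ranking values below are purely hypothetical and chosen only for illustration; the disclosure does not specify a particular ranking, and the names COMMON_RANK and order_by_common_preference are assumptions of the sketch.

```python
# Hypothetical common ranking shared by all client devices
# (lower number = more preferred). Not taken from the disclosure.
COMMON_RANK = {
    "MT + MD": 0,
    "T + D": 1,
    "FC T + D": 2,
    "T + T": 3,
    "FC stereo": 4,
    "Plain 2D": 5,
}


def order_by_common_preference(formats, rank=COMMON_RANK):
    """Sort a format list by the shared ranking; unknown formats go last."""
    return sorted(formats, key=lambda fmt: rank.get(fmt, len(rank)))
```

Because every client device (or the central device) applies the same ranking, any two devices ordering the same set of formats arrive at the same preference order.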
(30) In a similar way, each client device 2a, 2b, 2c (or the central device 9) processes the “Client device can transmit” lists and compares their content to what the client device itself is arranged to receive. If the first option is available in the list of the client device itself, then the first option becomes the reception format. If the first option is not in the list, then the client device (or the central device 9) investigates the next option until a match is found. If there is no match, the client device will consider that it is receiving plain 2D from the remote client device.
(31) The “Client device can transmit” lists may be ordered according to preference. A number of ways exist to determine the ordering of the “Client device can transmit” list. For example, each capturing unit 6 may determine the priority ordering of the transmit formats. Each set of available 3D video communication transmit formats may, for example, be associated with one capturing unit 6 as in
(32) For example, client device 2b considers the “Client device can transmit” list of client device 2a, which starts with “plain 2D”. Since this format is in client device 2b's “Client device can receive” list, it is accepted as the transmit format of client device 2a. Then client device 2b goes through client device 2c's “Client device can transmit” list. The first option is “frame compatible format including stereo”, which is also included in client device 2b's “Client device can receive” list. This means that client device 2b is going to receive in format “plain 2D” from client device 2a and in format “frame compatible format including stereo” from client device 2c. Each client device 2a, 2b, 2c is thereby enabled to prepare the processing chain for receiving 3D video streams from the other client devices.
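The reception-side matching walked through above may be sketched as follows. The lists are again transcribed from Table 4 as read here, and the function name choose_receive_formats is an assumption of this sketch, not a term of the disclosure.

```python
PLAIN_2D = "Plain 2D"

# Lists as read from Table 4, most preferred format first.
CAN_TRANSMIT = {
    "2a": ["Plain 2D"],
    "2b": ["FC T + D", "T + D", "Plain 2D"],
    "2c": ["FC stereo", "T + T", "Plain 2D"],
}
CAN_RECEIVE = {
    "2a": ["Plain 2D", "FC stereo", "FC T + D", "T + T", "T + D", "MT + MD"],
    "2b": ["FC stereo", "T + T", "FC T + D", "T + D", "MT + MD", "Plain 2D"],
    "2c": ["MT + MD", "T + D", "FC T + D", "FC stereo", "T + T", "Plain 2D"],
}


def choose_receive_formats(receiver, can_transmit, can_receive):
    """For each remote sender, take its first transmit option that the
    receiver can also receive; assume plain 2D when nothing matches."""
    own = can_receive[receiver]
    result = {}
    for sender, options in can_transmit.items():
        if sender == receiver:
            continue
        result[sender] = next((fmt for fmt in options if fmt in own), PLAIN_2D)
    return result
```

For client device 2b this gives “Plain 2D” from client device 2a and “FC stereo” from client device 2c, as in the example. With these particular lists the pairwise result agrees with the transmit format each sender selects; with other lists, a pairwise match and a consensus across all receivers need not coincide.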
(33) This process may also be performed by the central device 9. Thus, according to the embodiment of
(34) The client devices 2a, 2b, 2c according to the embodiment of
(35) TABLE 5 Response information from the client devices
 | Client device 2a | Client device 2b | Client device 2c
Client device will transmit: | Plain 2D | FC T + D | FC stereo
Client device will receive: | FC T + D (from client device 2b), FC stereo (from client device 2c) | Plain 2D (from client device 2a), FC stereo (from client device 2c) | Plain 2D (from client device 2a), FC T + D (from client device 2b)
Legend:
FC stereo = frame compatible format including stereo
FC T + D = frame compatible format including texture + depth
(36) Each client device 2a, 2b, 2c may then start transmitting its 3D video stream in its determined transmit format and start receiving the 3D video streams of the other client devices in their determined transmit formats.
(37) In a case where an existing client device disconnects from an ongoing 3D video communication service, there is preferably a re-negotiation of the formats that can be transmitted and received. Particularly, in a step S10 an indication of at least one client device being disconnected from the 3D video communication service is received (either by the receiver 5 of the client devices 2a, 2b, 2c or by the receiver 12 of the central device 9). In a step S12 the step S6 (and possibly also S8) may then be repeated for all client devices still connected to the 3D video communication service. For example, consider a scenario in which client device 2a comprises or is operatively connected to a capture unit 6 in the form of a stereo camera, client device 2b and client device 2c both comprise or are operatively connected to display units 7 in the form of 2D screens, and a further client device (not illustrated) comprises or is operatively connected to a display unit in the form of a stereo display with polarized glasses. Assume further that client device 2a currently is transmitting a frame compatible format including stereo. If the further client device leaves the ongoing 3D video communication service, then client device 2a may start transmitting plain 2D video to adapt to the other two client devices 2b, 2c.
(38) In a case where a new client device connects to an ongoing 3D video communication service, there is preferably a re-negotiation of the formats that can be transmitted and received. Particularly, in a step S14 an indication of at least one further client device being connected to the 3D video communication service is received (either by the receiver 5 of the client devices 2a, 2b, 2c or by the receiver 12 of the central device 9). In a step S16 the steps S2, S4, S6 (and possibly also S8) may then be repeated for all client devices (including the at least one newly connected client device) connected to the 3D video communication service. For example, consider that client device 2a comprises or is operatively connected to a capture unit 6 in the form of a stereo camera, but that client devices 2b and 2c both comprise or are operatively connected to display units 7 in the form of 2D displays. In this case, client device 2a may transmit in a plain 2D format. However, if a further client device (not illustrated) later connects to the 3D video communication service and the further client device comprises or is operatively connected to a display unit in the form of a stereo display that requires polarized glasses, then client device 2a may transmit in a frame compatible format including stereo, since this will enable the user of the further client device to watch 3D video while client devices 2b and 2c will still be associated with the 2D content. Once the formats are decided, the further client device may join the 3D video communication service, and the client devices already connected to the ongoing 3D video communication service adapt their transmit and receive formats.
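The re-negotiation on disconnection (steps S10, S12) and on connection (steps S14, S16) may be sketched as a session object that repeats the determination whenever its membership changes. The class name Session, its methods, and the strict rule that a transmit format must be receivable by every other device are assumptions of this sketch; it does not attempt to reproduce the stereo-camera example given in the text.

```python
PLAIN_2D = "Plain 2D"


class Session:
    """Tracks connected devices and re-determines transmit formats
    each time a device joins or leaves (hypothetical helper)."""

    def __init__(self):
        self.devices = {}           # device id -> {"tx": [...], "rx": [...]}
        self.transmit_formats = {}  # device id -> chosen transmit format

    def join(self, device_id, tx, rx):
        # Connection indication: store the new lists, then repeat
        # the determination for all connected devices.
        self.devices[device_id] = {"tx": list(tx), "rx": list(rx)}
        self._renegotiate()

    def leave(self, device_id):
        # Disconnection indication: repeat the determination for
        # the devices that are still connected.
        self.devices.pop(device_id, None)
        self._renegotiate()

    def _renegotiate(self):
        self.transmit_formats = {}
        for sender, caps in self.devices.items():
            others = [c["rx"] for d, c in self.devices.items() if d != sender]
            self.transmit_formats[sender] = next(
                (fmt for fmt in caps["tx"] if all(fmt in rx for rx in others)),
                PLAIN_2D,
            )
```

As a usage illustration with invented capability lists: if a device A that can transmit “FC stereo” or “Plain 2D” is in a session with a device B that can receive both, A would transmit “FC stereo”; if a device C that can only receive “Plain 2D” then joins, A would fall back to “Plain 2D”, and it would return to “FC stereo” once C leaves.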
(39) In order to avoid unnecessary re-negotiation of the formats there may be a delay before the determination of formats is made. For example, in a step S18 an indication of the number of client devices which are to be connected to a session of the 3D video communication service is received. The step S6 of determining transmit format(s) may then be delayed, in a step S20, until reception of an indication that at least a pre-determined fraction of the client devices have connected to the session of the 3D video communication service. The pre-determined fraction may correspond to 50%, preferably 75%, most preferably 100% of the number of client devices scheduled to join the session.
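The delay condition of steps S18 and S20 may be sketched as a simple threshold check. The function name ready_to_determine and the default fraction of 1.0 (corresponding to the most preferred value of 100%) are assumptions of this sketch.

```python
def ready_to_determine(connected, expected, fraction=1.0):
    """Return True once at least `fraction` of the expected client
    devices have connected to the session."""
    if expected <= 0:
        raise ValueError("expected number of client devices must be positive")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in the interval (0, 1]")
    return connected / expected >= fraction
```

For a session scheduled for four client devices and a pre-determined fraction of 75%, the determination of step S6 would thus be delayed until the third client device has connected.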
(40) For a point-to-point 3D video communication session (i.e. a 3D video communication session involving only two client devices) the negotiation between the two client devices preferably considers only what the other client device is able to receive. In this way, each client device adapts its transmit format to the available receive formats of the other client device.
(41) The invention has mainly been described above with reference to a few embodiments. However, as is readily appreciated by a person skilled in the art, other embodiments than the ones disclosed above are equally possible within the scope of the invention, as defined by the appended patent claims.