VIDEOCONFERENCING SYSTEM ALLOWING A PARALLAX EFFECT ASSOCIATED WITH THE DIRECTION OF THE GAZE OF A USER TO BE DECREASED
20220366545 · 2022-11-17
Assignee
Inventors
CPC classification: H04N7/144, H04L65/403, H04N23/90 (Electricity)
International classification: H04L65/403 (Electricity)
Abstract
The invention relates to a videoconferencing system 1, comprising: a display screen 10, for displaying an image I.sub.e(t.sub.i) containing N images I.sub.int.sup.(k)(t.sub.i); a camera 20, for acquiring an image I.sub.c(t.sub.j); a single-pixel-imager-employing optical device suitable for determining N images I.sub.co.sup.(k)(t.sub.j) on the basis of sub-matrices SM.sub.imp.sup.(k)(t.sub.j) comprising: an optical source 31, suitable for irradiating an ocular portion P.sub.o(t.sub.j) of the face of the user; a matrix of single-pixel imagers that are suitable for reconstructing a correction image I.sub.co.sup.(k)(t.sub.j) on the basis of the light beam reflected by the ocular portion P.sub.o(t.sub.j); a processing unit 40, suitable for: determining, in each image I.sub.int.sup.(k)(t.sub.i) of the image I.sub.e(t.sub.i), a target point P.sub.c.sup.(k)(t.sub.j), then selecting N sub-matrices SM.sub.imp.sup.(k)(t.sub.j) each centred on a target point P.sub.c.sup.(k)(t.sub.j); correcting the image I.sub.c(t.sub.j), by replacing a region of the image I.sub.c(t.sub.j) representing the ocular portion P.sub.o(t.sub.j) with the N images I.sub.co.sup.(k)(t.sub.j).
Claims
1. A videoconferencing system, configured to transmit and receive multimedia signals to and from N remote videoconferencing systems, with N≥1, allowing a user to communicate in real time with N interlocutors using these remote systems, comprising: a display screen, comprising a matrix of emissive pixels that is configured to display, at various successive display times t.sub.i, an image I.sub.e(t.sub.i) containing N images I.sub.int.sup.(k)(t.sub.i) transmitted by the remote systems and depicting the face of the interlocutors; a camera, configured to acquire, at various successive acquisition times t.sub.j, an image I.sub.c(t.sub.j) of the face of the user; an optical device comprising single-pixel imagers, configured to determine N correction images I.sub.co.sup.(k)(t.sub.j) on the basis of sub-matrices SM.sub.imp.sup.(k)(t.sub.j) of at least one single-pixel imager, at the various acquisition times t.sub.j, comprising: at least one optical source, configured to emit a light beam of wavelength located outside of the visible spectrum and that irradiates a predefined angular region Z.sub.a covering an ocular portion P.sub.o(t.sub.j) of the face of the user containing his eyes; a matrix of single-pixel imagers, each configured to collect a part of the irradiating light beam reflected by the ocular portion P.sub.o(t.sub.j) and to reconstruct a correction image I.sub.co.sup.(k)(t.sub.j) on the basis of the collected light beam, and each comprising a single photosensitive region, the photosensitive regions being integrated into the display screen and located in a main region (Z.sub.p) of the display screen, in which main region the N images I.sub.int.sup.(k)(t.sub.i) of the interlocutors are located; a processing unit, configured to: determine, in each image I.sub.int.sup.(k)(t.sub.i) of the image I.sub.e(t.sub.i), a target point P.sub.c.sup.(k)(t.sub.j) located at the eyes of the interlocutor, then select N sub-matrices SM.sub.imp.sup.(k)(t.sub.j) each centred on a target point P.sub.c.sup.(k)(t.sub.j); correct the image I.sub.c(t.sub.j) by replacing a region of the image I.sub.c(t.sub.j) depicting the ocular portion P.sub.o(t.sub.j) with the N correction images I.sub.co.sup.(k)(t.sub.j), thus obtaining N corrected images I.sub.cc(t.sub.j) each to be transmitted to the remote system of the corresponding interlocutor.
2. The videoconferencing system according to claim 1, wherein the matrix of single-pixel imagers has a resolution equal to the resolution of the matrix of emissive pixels.
3. The videoconferencing system according to claim 1, wherein the region I.sub.c_po(t.sub.j) of the image I.sub.c(t.sub.j) depicting the ocular portion P.sub.o(t.sub.j) and replaced by a correction image I.sub.co.sup.(k)(t.sub.j) has a resolution higher than a resolution of a region I.sub.c_br(t.sub.j) of the image I.sub.c(t.sub.j) encircling the region I.sub.c_po(t.sub.j).
4. The videoconferencing system according to claim 3, wherein the region I.sub.c_br(t.sub.j) of the image I.sub.c(t.sub.j) has a resolution lower than a native resolution of the image I.sub.c(t.sub.j) during its acquisition by the camera.
5. The videoconferencing system according to claim 1, wherein the optical source is configured to emit a light beam that spatially scans the angular region Z.sub.a in a scan time T, the one or more single-pixel imagers of the N sub-matrices SM.sub.imp.sup.(k)(t.sub.j) being configured to perform n.sub.i×p.sub.i acquisitions during the scan time T.
6. The videoconferencing system according to claim 1, wherein the optical source comprises a matrix-array optical modulator and is configured to illuminate the entire angular region Z.sub.a simultaneously.
7. A method for videoconferencing with a user by means of the videoconferencing system according to claim 1, comprising the following steps: receiving N images I.sub.int.sup.(k)(t.sub.i) transmitted by the remote systems of the interlocutors; displaying, with the display screen, at various display times t.sub.i, an image I.sub.e(t.sub.i) containing the images I.sub.int.sup.(k)(t.sub.i); determining N target points P.sub.c.sup.(k)(t.sub.j) each located at the eyes of one interlocutor; determining N sub-matrices SM.sub.imp.sup.(k)(t.sub.j) of at least one single-pixel imager, said sub-matrices each being centred on one determined target point P.sub.c.sup.(k)(t.sub.j); acquiring an image I.sub.c(t.sub.j) of the face of the user with the camera at various acquisition times t.sub.j; determining an angular region Z.sub.a covering an ocular portion P.sub.o(t.sub.j) of the face of the user containing his eyes; emitting with the optical source a light beam of wavelength located outside of the visible spectrum and that irradiates the angular region Z.sub.a; determining N correction images I.sub.co.sup.(k)(t.sub.j) on the basis of sub-matrices SM.sub.imp.sup.(k)(t.sub.j) the one or more single-pixel imagers of which collect a part of the emitted light beam reflected by an ocular portion P.sub.o(t.sub.j) of the face of the user, which ocular portion is located in the angular region Z.sub.a; correcting the image I.sub.c(t.sub.j) acquired by the camera, by replacing a region depicting the ocular portion P.sub.o(t.sub.j) with the N correction images I.sub.co.sup.(k)(t.sub.j), and thus obtaining N corrected images I.sub.cc(t.sub.j); transmitting the N corrected images I.sub.cc(t.sub.j), each to the remote system of the corresponding interlocutor.
8. The videoconferencing method according to claim 7, wherein the angular region Z.sub.a(t.sub.j) is determined on the basis of a reference point P.sub.u(t.sub.j) determined in the image I.sub.c(t.sub.j) acquired by the camera and associated with the eyes of the user.
9. The videoconferencing method according to claim 8, wherein single-pixel imagers that do not belong to the determined N sub-matrices SM.sub.imp.sup.(k)(t.sub.j) are not activated in the step of emitting the light beam.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0036] Other aspects, aims, advantages and features of the invention will become more clearly apparent on reading the following detailed description of preferred embodiments thereof, this description being given by way of non-limiting example and with reference to the appended drawings.
DETAILED DESCRIPTION OF PARTICULAR EMBODIMENTS
[0046] In the figures and in the remainder of the description, the same references have been used to designate identical or similar elements. In addition, the various elements have not been shown to scale for the sake of clarity of the figures. Moreover, the various embodiments and variants are not mutually exclusive and may be combined with one another. Unless indicated otherwise, the terms “substantially”, “about” and “of the order of” mean to within 10%, and preferably to within 5%. Moreover, the terms “comprised between . . . and . . . ” and equivalents mean inclusive of limits, unless indicated otherwise.
[0048] The videoconferencing system 1 according to this embodiment comprises: [0049] a display screen, comprising a matrix of emissive pixels that is suitable for displaying, at various successive display times t.sub.i, with a frequency f.sub.e, an image I.sub.e(t.sub.i) containing N images I.sub.int.sup.(k)(t.sub.i) that are transmitted by the remote systems and that depict the face of the interlocutors (see
[0057] The operation of the videoconferencing system 1 according to the invention will now be presented succinctly, with reference to
[0058] A user uses a videoconferencing system 1 according to the invention to communicate here with two interlocutors, each interlocutor using a conventional remote system 2 representative of the prior art. Thus, these remote systems 2 do not allow parallax to be decreased.
[0059] A first interlocutor therefore looks at the display screen 2e of his remote system 2, while the camera films his face. Thus, the display screen 2e displays an image of the user at various successive display times, while the camera acquires an image I.sub.int.sup.(1)(t.sub.i) of this interlocutor at various successive acquisition times t.sub.i. Parallax results in a non-zero angle α, which is for example higher than 5° or even than 10°, between the optical axis passing through the collecting optical system 22 (see
[0060] The first remote system 2 transmits the acquired images I.sub.int.sup.(1)(t.sub.i) to the videoconferencing system 1, and the second remote system 2 transmits the acquired images I.sub.int.sup.(2)(t.sub.i) to the videoconferencing system 1. Of course, the two remote systems 2 transmit these acquired images to each other. These images form a video signal, which is accompanied by an audio signal, both signals thus forming a multimedia stream transmitted and received by each of the videoconferencing systems 1, 2.
[0061] In the same way, the user looks at one or other of the interlocutors displayed by the display screen 10 of the videoconferencing system 1, while the camera 20 films his face. Thus, the display screen 10 displays the images I.sub.int.sup.(1)(t.sub.i) and I.sub.int.sup.(2)(t.sub.i) of the interlocutors at various successive display times, while the camera 20 acquires an image I.sub.c(t.sub.j) of the user at various successive acquisition times t.sub.j. However, as described in detail below, two sub-matrices SM.sub.imp.sup.(k)(t.sub.j) of single-pixel imagers each determine an image I.sub.co.sup.(k)(t.sub.j) of a portion, referred to as the ocular portion P.sub.o(t.sub.j), of the face of the user (facial region containing the eyes). The index k is relative to the interlocutors: k=1 for the first interlocutor, and k=2 for the second interlocutor. In so far as the photosensitive regions 34 (see
[0062] Thus, when the user looks the first interlocutor displayed on the display screen 10 in the eyes, the corresponding image I.sub.co.sup.(1)(t.sub.j) determined by the sub-matrix SM.sub.imp.sup.(1)(t.sub.j) of single-pixel imagers shows the eyes of the user looking directly at the interlocutor. Thus, the parallax angle α is greatly decreased and here substantially zero. This is also the case with the image I.sub.co.sup.(2)(t.sub.j) when the user looks the second interlocutor displayed on the display screen 10 in the eyes.
[0063] The image I.sub.c(t.sub.j) acquired by the camera 20 is then corrected to form as many corrected images I.sub.cc.sup.(k)(t.sub.j) as there are interlocutors. The correction consists in replacing, with the image I.sub.co.sup.(1)(t.sub.j), the region of the base image I.sub.c(t.sub.j) representing the ocular portion P.sub.o(t.sub.j), thus obtaining the corrected image I.sub.cc.sup.(1)(t.sub.j) to be sent to the first interlocutor. The image I.sub.c(t.sub.j) is corrected in the same way with the image I.sub.co.sup.(2)(t.sub.j), and thus the corrected image I.sub.cc.sup.(2)(t.sub.j) to be sent to the second interlocutor is obtained. Thus, the interlocutor whom the user is looking in the eyes receives an image of the user with an almost zero parallax angle α, whereas the other interlocutor sees the user obviously not looking him in the eyes but looking to one side.
[0064] The videoconferencing system 1 will now be described in more detail, with reference to
[0065] The videoconferencing system 1 comprises a display screen 10 suitable for displaying an image I.sub.e(t.sub.i) at various successive display times t.sub.i, at a predefined frequency f.sub.e. It comprises a matrix of emissive pixels of n.sub.e×p.sub.e size, this size n.sub.e×p.sub.e corresponding to the resolution of the displayed images I.sub.e(t.sub.i). By way of example, the frequency f.sub.e may be 10 Hz, and the resolution of the displayed images I.sub.e(t.sub.i) may be 3840×2160 pixels (in the case of a 4K UHD screen).
[0066] As illustrated in
[0067] As illustrated in
[0068] The videoconferencing system 1 also comprises a camera 20 suitable for acquiring an image I.sub.c(t.sub.j), at various successive acquisition times t.sub.j, of the face of the user. It is here held by the rigid frame 11 of the display screen 10 (see
[0069] The videoconferencing system 1 further comprises a single-pixel-imager-employing optical device. This optical device is suitable for determining (reconstructing) N images, which are referred to as correction images I.sub.co.sup.(k)(t.sub.j), with k ranging from 1 to N, at the various acquisition times t.sub.j, these correction images I.sub.co.sup.(k)(t.sub.j) representing an ocular portion P.sub.o(t.sub.j) of the face of the user from various viewpoints. The viewpoints are the positions P.sub.c.sup.(k)(t.sub.j) of the target points located in proximity to the eyes of the interlocutors displayed on the display screen 10. To this end, the optical device comprises at least one radiating optical source 31 and a matrix of single-pixel imagers, and is connected to the processing unit 40.
[0070] The radiating optical source 31 is suitable for irradiating the ocular portion P.sub.o(t.sub.j) of the face of the user with a light beam F.sub.ec the wavelength of which is located outside of the visible spectrum, for example outside of the range extending from 380 nm to 780 nm (according to the definition given by the International Commission on Illumination). By way of example, the wavelength of the light beam F.sub.ec may be located in the near infrared (between 0.78 and 2 μm, 0.78 μm being excluded). The optical source 31 may comprise a laser diode 32 emitting a light beam at the desired wavelength. The optical source 31 further comprises a projecting optical system 33, suitable for transmitting and orienting the light beam F.sub.ec toward a predefined angular region Z.sub.a(t.sub.j), in which the ocular portion P.sub.o(t.sub.j) of the face of the user is located. The angular region Z.sub.a(t.sub.j) may be defined on the basis of the image I.sub.c(t.sub.j) acquired by the camera 20, at the acquisition frequency f.sub.c or at a lower frequency, or even once at the start of the videoconference. By way of example, the optical source 31 may be an optical phased array (OPA) such as that described in the article by Tyler et al. titled SiN integrated optical phased array for two-dimensional beam steering at a single near-infrared wavelength, Opt. Express 27, 5851-5858 (2019). As illustrated in
[0071] Each single-pixel imager comprises a single photosensitive region 34 suitable for delivering an electrical signal in response to detection of the reflected irradiating light beam. It may comprise a read-out circuit 37 and is connected to the processing unit 40. In this regard, a presentation of single-pixel photosensitive imagers is notably given in the article by Gibson et al. titled Single-pixel imaging 12 years on: a review, Opt. Express 28(19), 28190-28208 (2020) and in the article by Duarte et al. titled Single-Pixel Imaging via Compressive Sampling, IEEE Signal Processing Mag., Vol. 25, No. 2, pp. 83-91, 2008. Document FR3063411 also describes an example of a single-pixel imager.
[0072] As
[0073] As
[0074] Generally, a plurality of single-pixel imaging configurations are described in the literature, in which configurations the intensity and/or phase of the detection or illumination is optically modulated. It is however possible, as described here, not to optically modulate the irradiating light beam. Thus, in this embodiment, the irradiating light beam F.sub.ec(t.sub.j) is not optically modulated: the optical source 31 emits an irradiating light beam of small angular divergence, and performs a spatial scan of the predefined angular region Z.sub.a(t.sub.j), and therefore of the ocular portion P.sub.o(t.sub.j) of the face of the user. During the scan of the angular region Z.sub.a(t.sub.j), at least one single-pixel imager that has been activated (that of the sub-matrix SM.sub.imp.sup.(k)(t.sub.j) in proximity to a target point, the others remaining inactive) receives, on its photosensitive region 34 (photodiode), the light beam reflected by the ocular portion P.sub.o(t.sub.j). The irradiating light beam scans the angular region Z.sub.a(t.sub.j) in a time T and the photosensitive region 34 performs n.sub.i×p.sub.i acquisitions (for example 300×100), each acquisition corresponding to one different position of the irradiating light beam in the angular region Z.sub.a(t.sub.j), and therefore on the ocular portion P.sub.o(t.sub.j).
[0075] Thus, the processing unit 40 of the single-pixel imager constructs an angular orientation vector V.sub.oa the terms of which correspond to the angular orientation of the reflected light beam in a given frame of reference, here that of the single-pixel imager in question, at each acquisition time, and an optical intensity vector V.sub.io the terms of which correspond to the optical intensity of the reflected light beam acquired by the photosensitive region 34, at each acquisition time. The vectors V.sub.oa and V.sub.io are therefore (n.sub.i×p.sub.i)×1 in size. The processing unit 40 is then able to reconstruct a (correction) image I.sub.co.sup.(k)(t.sub.j) of the ocular portion P.sub.o(t.sub.j), the resolution of which is n.sub.i×p.sub.i (for example 300×100 pixels). It will be noted that this image is a greyscale image in so far as the irradiating light beam is here monochromatic and the single-pixel imager comprises only a single photosensitive region.
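The reconstruction described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the array names `v_oa` and `v_io` and the raster-scan ordering are assumptions. Since each acquisition pairs one angular orientation of the beam with one measured intensity, rebuilding the correction image amounts to scattering the intensity vector into an n.sub.i×p.sub.i grid.

```python
import numpy as np

def reconstruct_correction_image(v_oa, v_io, n_i, p_i):
    """Rebuild an (n_i x p_i) greyscale image from one scan.

    v_oa : (n_i*p_i, 2) array of (row, col) beam orientations, one per acquisition
    v_io : (n_i*p_i,) array of reflected intensities measured by the photodiode
    """
    img = np.zeros((n_i, p_i))
    rows, cols = v_oa[:, 0], v_oa[:, 1]
    img[rows, cols] = v_io  # one intensity sample per beam position
    return img

# hypothetical 300x100 scan, matching the example resolution given above
n_i, p_i = 300, 100
grid = np.stack(np.meshgrid(np.arange(n_i), np.arange(p_i), indexing="ij"), axis=-1)
v_oa = grid.reshape(-1, 2)          # raster-scan ordering of beam orientations
v_io = np.random.rand(n_i * p_i)    # stand-in for the measured intensities
image = reconstruct_correction_image(v_oa, v_io, n_i, p_i)
```

With a single monochromatic source and a single photosensitive region per imager, the result is necessarily a greyscale image, as noted above.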
[0076] The quality (notably in terms of sensitivity) of the correction images I.sub.co.sup.(k)(t.sub.j) may be improved when the terms of the vector V.sub.io are generated not just by the single-pixel imager in question but also by a few adjacent single-pixel imagers (for example 4×4 adjacent imagers). As a variant or in addition, to obtain a correction image I.sub.co.sup.(k)(t.sub.j), the optical source may perform a plurality of successive scans of the angular region Z.sub.a(t.sub.j), and therefore of the ocular portion P.sub.o(t.sub.j) of the face of the user, at a given acquisition time t.sub.j, the optical intensity acquired during a scan for a given angular orientation of the reflected light beam then being added to that acquired in the preceding scan.
[0077] It will be noted here that the single-pixel-imager-employing optical device may have other configurations. Thus, in the context of a so-called structured-illumination configuration (notably illustrated in
[0078] The videoconferencing system 1 comprises a processing unit 40. The latter is suitable for performing at least two key steps, namely determining the N target points P.sub.c.sup.(k)(t.sub.j) in the image I.sub.e(t.sub.i) displayed by the screen 10, and correcting the image I.sub.c(t.sub.j) on the basis of the N correction images I.sub.co.sup.(k)(t.sub.j) to obtain the N corrected images I.sub.cc.sup.(k)(t.sub.j) to be transmitted to the N interlocutors. Moreover, in this example, the processing unit interacts with the single-pixel-imager-employing optical device to determine the N correction images I.sub.co.sup.(k)(t.sub.j). It will be noted here that, in the context of the invention, to correct an image I.sub.c(t.sub.j) acquired by the camera and to obtain N corrected images to be transmitted to the N interlocutors, the single-pixel-imager-employing optical device does not activate all the single-pixel imagers, but only those located in sub-matrices SM.sub.imp.sup.(k)(t.sub.j) centred on the determined target points P.sub.c.sup.(k)(t.sub.j).
[0079] Thus, the processing unit 40 is suitable for determining the N target points P.sub.c.sup.(k)(t.sub.j) located in the image I.sub.e(t.sub.i) displayed by the display screen 10. A target point is a position in the image I.sub.e(t.sub.i) associated with the eyes of an interlocutor. It is the point upon which the user will fixate his gaze when he desires to speak to the interlocutor in question while looking him in the eyes. This target point may be defined as being the position of one of the eyes of the interlocutor, or even a median point located between both eyes.
[0080] To determine the target points P.sub.c.sup.(k)(t.sub.j) in the image I.sub.e(t.sub.i), the processing unit 40 recognizes features of the face of each interlocutor. Among these facial features, mention may be made for example of the general shape of the face, the position of the mouth, the position of the nose and the position of the eyes. This step may be performed at each display time t.sub.i, and therefore at the frequency f.sub.e, or even at a lower frequency or even once and only once at the start of the videoconference. The facial-recognition method employed is well known and not described in detail here. As regards the position of the eyes of the first interlocutor, in a frame of reference R.sub.e(O,X,Y) of the screen, where the origin O is for example located in the lower left-hand corner, X is the horizontal axis and Y the vertical axis, the position of his left eye is denoted P.sub.yg.sup.(1)(t.sub.j) and the position of his right eye is denoted P.sub.yd.sup.(1)(t.sub.j).
[0081] On the basis of the positions P.sub.yg.sup.(1)(t.sub.j) and P.sub.yd.sup.(1)(t.sub.j) of the eyes of the first interlocutor, the processing unit determines the target point P.sub.c.sup.(1)(t.sub.j). It also determines the position of the target point P.sub.c.sup.(2)(t.sub.j) associated with the eyes of the second interlocutor. In the case of a target point that is a median point located between both eyes, the y-coordinate of the target point may be identical to that of the eyes of the interlocutor in question, and the x-coordinate is equal to the average of those of the positions of the eyes.
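The median-point rule just described can be sketched as follows (a hypothetical helper; the source does not prescribe code, only the geometric rule: x-coordinate averaged, y-coordinate unchanged):

```python
def target_point(p_yg, p_yd):
    """Median target point between the left-eye position p_yg and the
    right-eye position p_yd, each an (x, y) pair in the screen frame
    R_e(O, X, Y): x is the average of the two eye positions, y is that
    of the eyes themselves."""
    x = (p_yg[0] + p_yd[0]) / 2.0  # average of the two x-coordinates
    y = p_yg[1]                    # same height as the eyes
    return (x, y)

# example: eyes at x = 100 and x = 140, both at height y = 50
print(target_point((100, 50), (140, 50)))  # (120.0, 50)
```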
[0082] On the basis of the positions of the various target points P.sub.c.sup.(k)(t.sub.j), with k ranging from 1 to N, the processing unit 40 determines the N sub-matrices SM.sub.imp.sup.(k)(t.sub.j) of single-pixel imagers. Each sub-matrix SM.sub.imp.sup.(k)(t.sub.j) of single-pixel imagers is centred on the target point P.sub.c.sup.(k)(t.sub.j) in question. It may comprise only a single single-pixel imager, i.e. the one located closest to the target point in question, or may comprise a plurality of single-pixel imagers, namely the single-pixel imager located closest to the target point in question and a plurality of adjacent single-pixel imagers, so as to increase the detection sensitivity.
[0083] As illustrated in
[0084] It will be noted that this step of determining sub-matrices SM.sub.imp.sup.(k)(t.sub.j) of single-pixel imagers may be performed at a frequency equal to or lower than the acquisition frequency f.sub.c, or even once and only once at the start of the videoconference in so far as the face of the interlocutors will change position little during the communication.
[0085] Next, the processing unit 40 is suitable for correcting the image I.sub.c(t.sub.j) on the basis of the N correction images I.sub.co.sup.(k)(t.sub.j) to obtain the N corrected images I.sub.cc.sup.(k)(t.sub.j) to be transmitted to the N interlocutors. To this end, it receives the image I.sub.c(t.sub.j) acquired at the acquisition time t.sub.j by the camera, and the N correction images I.sub.co.sup.(k)(t.sub.j). The correction images are first modified so that they have the colorimetric characteristics of the ocular portion represented in the image I.sub.c(t.sub.j). Next, the processing unit 40 determines N corrected images I.sub.cc.sup.(k)(t.sub.j), by replacing the ocular portion represented in the base image I.sub.c(t.sub.j) with each of the N modified correction images Im.sub.co.sup.(k)(t.sub.j). Each of the N corrected images I.sub.cc.sup.(k)(t.sub.j) is then transmitted to the interlocutor in question.
[0086] It will be noted that the N corrected images I.sub.cc.sup.(k)(t.sub.j) to be transmitted to the N interlocutors may have a foveated-imaging aspect, i.e. the ocular portion in the corrected image I.sub.cc.sup.(k)(t.sub.j) (obtained from a correction image I.sub.co.sup.(k)(t.sub.j)) has a higher resolution than the region of the image encircling this ocular portion. By way of example, the ocular portion may have a resolution equal to the particularly high resolution of the display screen 10, and the region encircling the ocular portion may have a resolution lower than the native resolution of the base image I.sub.c(t.sub.j) of the camera. This allows the weight in bytes of the video streams transmitted to the remote systems to be decreased. This aspect is described in detail below with reference to
[0087] Thus, the videoconferencing system 1 according to the invention allows the parallax effect associated with the direction of the gaze of the user when he is communicating with any one of the N interlocutors while looking him in the eyes to be decreased effectively, in so far as it uses a single-pixel-imager-employing optical device integrated into the display screen 10, of which only single-pixel imagers located in proximity to target points of the interlocutors are activated. There is thus a clear difference between it and use of a more conventional matrix-array imager integrated into the display screen, such as that described in document WO2019/165124. In addition, the weight in bytes of the video streams transmitted by the videoconferencing system 1 to the remote systems remains unchanged because it is associated with the image acquired by the camera and not with the image acquired by the matrix-array imager integrated into the screen of document WO2019/165124. Preferably, the weight of the images transmitted to the remote systems 2 may be low when a foveated-imaging technique is used.
[0089] Step 100: The videoconferencing system 1 receives, in real time, the multi-media streams (video and audio signals) generated by N remote systems 2 of the various interlocutors.
[0090] Step 110: The display screen 10 displays the image I.sub.e(t.sub.i) at various display times t.sub.i, at a frequency f.sub.e. The displayed image I.sub.e(t.sub.i) contains the N images I.sub.int.sup.(k)(t.sub.i) of the interlocutors. By way of example, the image I.sub.e(t.sub.i) has a resolution n.sub.e×p.sub.e of 3840×2160 pixels and the display frequency f.sub.e is equal to 10 Hz. The images I.sub.int.sup.(k)(t.sub.i) of the interlocutors are here placed side-by-side horizontally.
[0091] Step 200: The processing unit 40 determines the position P.sub.c.sup.(k)(t.sub.j) of the target points associated with the N interlocutors, with k ranging from 1 to N. This step may be performed at the various acquisition times t.sub.j of the camera or may be performed at a lower frequency, or may even be performed once and only once at the start of the videoconference. The processing unit 40 recognizes the face of each interlocutor displayed in the image I.sub.e(t.sub.i) and determines the position P.sub.c.sup.(k)(t.sub.j) of the N target points.
[0092] Step 210: The processing unit 40 then determines the N sub-matrices SM.sub.imp.sup.(k)(t.sub.j) of single-pixel imagers associated with the determined target points P.sub.c.sup.(k)(t.sub.j). To do this, it determines the single-pixel imager located closest to the position P.sub.c.sup.(k)(t.sub.j) of the target point in question and, preferably, a plurality of neighbouring single-pixel imagers. The number of single-pixel imagers in each sub-matrix is chosen to improve the quality of the correction image I.sub.co.sup.(k)(t.sub.j) to be reconstructed. The other single-pixel imagers may remain inactive.
[0093] Step 300: In parallel to steps 110, 200 and 210, the camera 20 acquires an image I.sub.c(t.sub.j) of the face of the user at various successive acquisition times t.sub.j. The acquisition frequency f.sub.c may be equal to the display frequency f.sub.e or preferably be lower than it. It may here be equal to 10 Hz. The image I.sub.c(t.sub.j) has a resolution n.sub.c×p.sub.c, for example equal to 1280×720 pixels.
[0094] Step 310: The processing unit 40 then determines the angular region Z.sub.a(t.sub.j) in which the ocular portion P.sub.o(t.sub.j) of the face of the user is located. This step may be performed at the acquisition frequency f.sub.c, or at a lower frequency, or even once and only once at the start of the videoconference. Here also, the processing unit 40 determines the position P.sub.u(t.sub.j) of a reference point associated with the eyes of the user, in the acquired image I.sub.c(t.sub.j). This reference point may be a median position between the two eyes of the user. Next, on the basis of the properties of the collecting optical device 22 of the camera 20, the processing unit 40 determines an angular region Z.sub.a(t.sub.j) covering the ocular portion P.sub.o(t.sub.j) of the face of the user, i.e. the portion of his face that contains his two eyes.
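Step 310 can be sketched with a simple pinhole-camera model (an assumption: the source only states that Z.sub.a(t.sub.j) is derived from the reference point P.sub.u(t.sub.j) and the properties of the collecting optical system 22; the function name, the field-of-view parameter and the fixed angular margins are illustrative):

```python
import math

def angular_region(p_u, image_size, hfov_deg, margin_deg=(12.0, 6.0)):
    """Estimate the angular region Z_a around the reference point P_u.

    p_u        : (x, y) pixel position of the reference point between the eyes
    image_size : (width, height) of the camera image in pixels
    hfov_deg   : horizontal field of view of the collecting optics (assumed known)
    margin_deg : (horizontal, vertical) half-extent of Z_a around P_u
    Returns (az_min, az_max, el_min, el_max) in degrees.
    """
    w, h = image_size
    # focal length in pixel units, from the horizontal field of view
    f_px = (w / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    az = math.degrees(math.atan((p_u[0] - w / 2.0) / f_px))  # azimuth of P_u
    el = math.degrees(math.atan((p_u[1] - h / 2.0) / f_px))  # elevation of P_u
    return (az - margin_deg[0], az + margin_deg[0],
            el - margin_deg[1], el + margin_deg[1])
```

A reference point at the image centre thus yields a region centred on the optical axis, consistent with the description above.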
[0095] Step 400: The single-pixel-imager-employing optical device determines the N correction images I.sub.co.sup.(k)(t.sub.j), having, as viewpoint, the position P.sub.c.sup.(k)(t.sub.j) of the various target points. These correction images are determined (reconstructed) by the sub-matrices SM.sub.imp.sup.(k)(t.sub.j) of single-pixel imagers associated with the target points. To do this, the optical source 31 emits an irradiating light beam that spatially scans the ocular portion of the face of the user in a time T. The radiating light beam has a wavelength here located in the near infrared, and is of small angular divergence. Each sub-matrix SM.sub.imp.sup.(k)(t.sub.j) of single-pixel imagers acquires the reflected light beam in n.sub.i×p.sub.i measurements. The read-out circuits of each sub-matrix SM.sub.imp.sup.(k)(t.sub.j) receive a synchronization signal from the single-pixel-imager-employing optical device, and read and store in memory each detection signal acquired by each of the photosensitive regions 34. The processing unit 40 then determines the N correction images I.sub.co.sup.(k)(t.sub.j). It will be noted that each correction image I.sub.co.sup.(k)(t.sub.j) may then be modified to correct an effect of perspective.
[0096] Step 410: The processing unit 40 then modifies the N correction images I.sub.co.sup.(k)(t.sub.j) so that they have the colorimetric characteristics of the ocular portion displayed in the image I.sub.c(t.sub.j). The region I.sub.c,po(t.sub.j) of the image I.sub.c(t.sub.j) comprising the ocular portion of the face of the user is firstly over-sampled to make it the same resolution as each of the correction images I.sub.co.sup.(k)(t.sub.j). The region I.sub.c,po(t.sub.j) of the image I.sub.c(t.sub.j) is then decomposed into a space separating chroma and luminance, for example in the CIELAB (1976) colour space, also denoted the L*a*b* colour space, which is a space in which colours are characterized by three quantities (along three axes). A colour is characterized by a point located in the L*a*b* space, in which the value along the a* axis expresses red/green character (positive if red, negative if green), the value along the b* axis expresses yellow/blue character (positive if yellow, negative if blue), and in which the value along the vertical L* axis expresses lightness (derived from luminance), which ranges from black for L*=0 to white for L*=100. Next, to each pixel of the correction images I.sub.co.sup.(k)(t.sub.j) are attributed the colorimetric characteristics associated with the corresponding pixel of the region I.sub.c,po(t.sub.j) of the image I.sub.c(t.sub.j), and thus the N modified correction images Im.sub.co.sup.(k)(t.sub.j) are obtained. Thus, the L* portion of the correction image I.sub.co.sup.(k)(t.sub.j) is preserved, but its a* and b* coordinates are replaced by those of the region I.sub.c,po(t.sub.j) of the image I.sub.c(t.sub.j).
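The chroma transfer of step 410 can be sketched as follows, assuming the camera region has already been over-sampled and converted to L*a*b* (the function and argument names are illustrative; an RGB-to-Lab conversion, e.g. from an image-processing library, is taken as given):

```python
import numpy as np

def colorize_correction(corr_L, region_lab):
    """Keep the L* channel of the greyscale correction image and take the
    a* and b* chroma of the camera region, per the step described above.

    corr_L     : (H, W) correction image scaled to L* values in [0, 100]
    region_lab : (H, W, 3) L*a*b* version of the over-sampled camera
                 region I_c,po, channels ordered (L*, a*, b*)
    """
    out = region_lab.copy()
    out[..., 0] = corr_L  # lightness from the single-pixel correction image
    return out            # a* (red/green) and b* (yellow/blue) stay those of the camera
```

Converting the result back from L*a*b* to RGB then yields the modified correction image Im.sub.co.sup.(k)(t.sub.j).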
[0097] Step 420: The processing unit determines the N corrected images I.sub.cc.sup.(k)(t.sub.j) to be transmitted to the N interlocutors. To do this, each modified correction image Im.sub.co.sup.(k)(t.sub.j) is superposed on the image I.sub.c(t.sub.j). In other words, the region I.sub.c,po(t.sub.j) of the image I.sub.c(t.sub.j) is replaced by a modified correction image Im.sub.co.sup.(k)(t.sub.j), and thus a corrected image I.sub.cc.sup.(k)(t.sub.j) is obtained.
[0098] It will be noted that it is advantageous, in the context of application of a foveated-imaging technique, to consider here a ‘degraded’ version of the base image I.sub.c(t.sub.j), i.e. a version I.sub.c,br(t.sub.j) of the base image I.sub.c(t.sub.j) having a resolution lower than the initial resolution. Thus, each corrected image I.sub.cc.sup.(k)(t.sub.j) contains a high-resolution region that corresponds to the ocular portion (drawn from the modified correction image Im.sub.co.sup.(k)(t.sub.j)) and a low-resolution region that encircles the ocular portion.
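The foveated composition of steps 420 and [0098] can be sketched as follows (names and the degradation factor are illustrative assumptions; the source only specifies a lower-resolution surround and a full-resolution ocular region):

```python
import numpy as np

def foveated_corrected_image(base, corr, top, left, factor=2):
    """Degrade the base image I_c, then paste the full-resolution modified
    correction image over the ocular region.

    base        : (H, W) base image (H, W assumed divisible by factor here)
    corr        : (h, w) modified correction image for one interlocutor
    (top, left) : position of the ocular region in the base image
    """
    # crude degradation: subsample, then re-expand onto the original grid
    low = base[::factor, ::factor].repeat(factor, axis=0).repeat(factor, axis=1)
    out = low[:base.shape[0], :base.shape[1]].copy()
    h, w = corr.shape
    out[top:top + h, left:left + w] = corr  # high-resolution ocular portion
    return out
```

One such composition per interlocutor yields the N corrected images I.sub.cc.sup.(k)(t.sub.j) to be transmitted.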
[0099] Step 500: The processing unit then transfers the corrected image I.sub.cc.sup.(1)(t.sub.j) to the remote system of the first interlocutor, and the corrected image I.sub.cc.sup.(2)(t.sub.j) to the remote system of the second interlocutor. Thus, when the user looks the first interlocutor in the eyes (i.e. by looking at the target position P.sub.c.sup.(1)(t.sub.j)) the corrected image I.sub.cc.sup.(1)(t.sub.j) shows the user with a parallax angle of substantially zero. This interlocutor then sees the user looking him in the eyes. In contrast, the other interlocutor sees the user not looking directly at him, but looking to one side.
[0100] Particular embodiments have just been described. Various modifications and variants will be obvious to anyone skilled in the art.