Image display device, image display method, and program
09736450 · 2017-08-15
CPC classification: H04N13/361 (ELECTRICITY); H04N13/349 (ELECTRICITY)
Abstract
An image display device includes a region of interest extraction unit, a parallax image generation unit, and a 3D image display unit. The region of interest extraction unit generates a depth image signal by depth image conversion employing a depth threshold, the depth image signal including information on a distance in a three-dimensional space between a viewpoint and each pixel of a two-dimensional image including a region of interest desired to be noted by an observer, the depth image conversion being such that a depth value indicating the distance between the viewpoint and each pixel of the two-dimensional image is converted to a depth value for 2D display when the depth value is equal to or larger than the depth threshold. The parallax image generation unit generates a both-eye parallax image having a parallax in the region of interest alone, from the two-dimensional image and an image obtained by conversion of a region of interest image representing the region of interest at each of the both-eye viewpoints, based on the two-dimensional image and the depth image signal.
Claims
1. An image display device, comprising: a region of interest extraction unit that i) generates a depth image signal by depth image conversion employing a depth threshold generated from an amount of movement of images in a 2D/3D mixed display, the depth image signal including information on a distance in a three-dimensional space between a viewpoint and each pixel of a two-dimensional image including a region of interest desired to be noted by an observer, the depth image conversion being such that a depth value for each pixel, the depth value indicating the distance between the viewpoint and each pixel of the two-dimensional image, is converted to a new depth value for 2D display when the depth value is equal to or larger than the depth threshold, and ii) extracts the region of interest based on the generated depth image signal; a parallax image generation unit that generates left- and right-eye parallax images each having a parallax in a new region of interest resulting from the new depth value and having no parallax in a region other than the new region of interest, from said two-dimensional image and an image obtained by conversion of a region of interest image to be displayed in the region of interest at each of right and left viewpoints, based on said two-dimensional image and said depth image signal; and a 3D image display unit that displays the left-eye parallax image and the right-eye parallax image, wherein the region of interest extraction unit comprises: an image segmentation unit that receives a first two-dimensional image of a most recent frame and a second two-dimensional image of a frame immediately preceding the most recent frame of the first two-dimensional image, divides each of the first and second two-dimensional images into divided regions each having characteristic(s) common to the first and second two-dimensional images, and generates a divided first two-dimensional image and a divided second two-dimensional image; an optical flow computing unit that computes a difference value between center positions of gravity of a divided region of the divided first two-dimensional image and a divided region of the divided second two-dimensional image, as an optical flow of the divided region of the divided first two-dimensional image; a depth image computing unit that generates a depth estimation value for each pixel of the divided region of the divided first two-dimensional image based on luminance information on the divided region of the divided first two-dimensional image; and a depth image conversion unit that computes the depth threshold from the optical flow, thereby performing the depth image conversion on the depth estimation value.
2. An image display device, comprising: a region of interest extraction unit that i) generates a depth image signal by depth image conversion employing a depth threshold generated from an amount of movement of images in a 2D/3D mixed display, the depth image signal including information on a distance in a three-dimensional space between a viewpoint and each pixel of a two-dimensional image including a region of interest desired to be noted by an observer, the depth image conversion being such that a depth value for each pixel, the depth value indicating the distance between the viewpoint and each pixel of the two-dimensional image, is converted to a new depth value for 2D display when the depth value is equal to or larger than the depth threshold, and ii) extracts the region of interest based on the generated depth image signal; a parallax image generation unit that generates left- and right-eye parallax images each having a parallax in a new region of interest resulting from the new depth value and having no parallax in a region other than the new region of interest, from said two-dimensional image and an image obtained by conversion of a region of interest image to be displayed in the region of interest at each of right and left viewpoints, based on said two-dimensional image and said depth image signal; and a 3D image display unit that displays the left-eye parallax image and the right-eye parallax image, wherein the region of interest extraction unit comprises: an image segmentation unit that receives first left- and right-eye two-dimensional images of a most recent frame and second left- and right-eye two-dimensional images of a frame immediately preceding the most recent frame of the first left- and right-eye two-dimensional images, divides each of the first and second left-eye two-dimensional images, and the first and second right-eye two-dimensional images, into divided regions each having common characteristic(s), thereby generating divided first left- and right-eye two-dimensional images, and divided second left- and right-eye two-dimensional images, respectively; an optical flow computing unit that computes a difference value between center positions of gravity of the divided regions of the divided first and second left-eye two-dimensional images or a difference value between center positions of gravity of the divided regions of the divided first and second right-eye two-dimensional images, as an optical flow for each of the divided regions of the divided first left- and right-eye two-dimensional images, respectively; a depth image computing unit that generates a depth estimation value for each pixel of the divided region of the divided first left- or right-eye two-dimensional image, based on a parallax amount between the divided regions of the divided first left- and right-eye two-dimensional images; and a depth image conversion unit that computes the depth threshold from the optical flow, thereby performing the depth image conversion on the depth estimation value.
3. An image display method, comprising: extracting, from a two-dimensional image, a region of interest desired to be noted by an observer, by i) generating a depth image signal through depth image conversion employing a depth threshold generated from an amount of movement of images in a 2D/3D mixed display, the depth image signal including information on a distance in a three-dimensional space between a viewpoint and each pixel of the two-dimensional image including the region of interest, the depth image conversion being such that a depth value for each pixel, the depth value indicating the distance between the viewpoint and each pixel of the two-dimensional image, is converted to a new depth value for 2D display when the depth value is equal to or larger than the depth threshold, and ii) extracting the region of interest based on the generated depth image signal; generating a parallax image by generating left- and right-eye parallax images each having a parallax in a new region of interest resulting from the new depth value and having no parallax in a region other than the new region of interest, from said two-dimensional image and an image obtained by conversion of a region of interest image to be displayed in the region of interest at each of right and left viewpoints, based on said two-dimensional image and said depth image signal; and displaying a 3D image by displaying the left-eye parallax image and the right-eye parallax image, wherein said extracting the region of interest comprises: receiving a first two-dimensional image of a most recent frame and a second two-dimensional image of a frame immediately preceding the most recent frame of the first two-dimensional image, dividing each of the first and second two-dimensional images into divided regions each having characteristic(s) common to the first and second two-dimensional images, and generating divided first and second two-dimensional images; computing a difference value between center positions of gravity of a divided region of the divided first two-dimensional image and a divided region of the divided second two-dimensional image, as an optical flow of the divided region of the divided first two-dimensional image; generating a depth estimation value for each pixel of the divided region of the divided first two-dimensional image based on luminance information on the divided region of the divided first two-dimensional image; and computing the depth threshold from the optical flow, thereby performing the depth image conversion on the depth estimation value.
4. An image display method, comprising: extracting, from a two-dimensional image, a region of interest desired to be noted by an observer, by i) generating a depth image signal through depth image conversion employing a depth threshold generated from an amount of movement of images in a 2D/3D mixed display, the depth image signal including information on a distance in a three-dimensional space between a viewpoint and each pixel of the two-dimensional image including the region of interest, the depth image conversion being such that a depth value for each pixel, the depth value indicating the distance between the viewpoint and each pixel of the two-dimensional image, is converted to a new depth value for 2D display when the depth value is equal to or larger than the depth threshold, and ii) extracting the region of interest based on the generated depth image signal; generating a parallax image by generating left- and right-eye parallax images each having a parallax in a new region of interest resulting from the new depth value and having no parallax in a region other than the new region of interest, from said two-dimensional image and an image obtained by conversion of a region of interest image to be displayed in the region of interest at each of right and left viewpoints, based on said two-dimensional image and said depth image signal; and displaying a 3D image by displaying the left-eye parallax image and the right-eye parallax image, wherein said extracting the region of interest comprises: receiving first left- and right-eye two-dimensional images of a most recent frame and second left- and right-eye two-dimensional images of a frame immediately preceding the most recent frame of the first left- and right-eye two-dimensional images, dividing each of the first and second left-eye two-dimensional images, and the first and second right-eye two-dimensional images, into divided regions each having common characteristic(s), thereby generating divided first left- and right-eye two-dimensional images, and divided second left- and right-eye two-dimensional images, respectively; computing a difference value between center positions of gravity of the divided regions of the divided first and second left-eye two-dimensional images, or a difference value between center positions of gravity of the divided regions of the divided first and second right-eye two-dimensional images, as an optical flow for each of the divided regions of the divided first left- and right-eye two-dimensional images, respectively; generating a depth estimation value for each pixel of the divided region of the divided first left- or right-eye two-dimensional image, based on a parallax amount between the divided regions of the divided first left- and right-eye two-dimensional images; and computing the depth threshold from the optical flow, thereby performing the depth image conversion on the depth estimation value.
5. A computer program recorded on a non-transient recording medium that, upon execution by a processor of a computer equipped with an image display device, causes the computer to execute: a region of interest extraction process of extracting, from a two-dimensional image, a region of interest desired to be noted by an observer, by i) generating a depth image signal through depth image conversion employing a depth threshold generated from an amount of movement of images in a 2D/3D mixed display, the depth image signal including information on a distance in a three-dimensional space between a viewpoint and each pixel of the two-dimensional image including the region of interest, the depth image conversion being such that a depth value for each pixel, the depth value indicating the distance between the viewpoint and each pixel of the two-dimensional image, is converted to a new depth value for 2D display when the depth value is equal to or larger than the depth threshold, and ii) extracting the region of interest based on the generated depth image signal; a parallax image generation process of generating a parallax image by generating left- and right-eye parallax images having a parallax in a new region of interest resulting from the new depth value and having no parallax in a region other than the new region of interest, from said two-dimensional image and an image obtained by conversion of a region of interest image to be displayed in the region of interest at each of right and left viewpoints, based on said two-dimensional image and said depth image signal; and a 3D image display process of displaying the left-eye parallax image and the right-eye parallax image, wherein said region of interest extraction process comprises: an image segmentation process of receiving a first two-dimensional image of a most recent frame and a second two-dimensional image of a frame immediately preceding the most recent frame of the first two-dimensional image, dividing each of the first and second two-dimensional images into divided regions each having characteristic(s) common to the first and second two-dimensional images, and generating divided first and second two-dimensional images; a moving amount computing process of computing a difference value between center positions of gravity of a divided region of the divided first two-dimensional image and a divided region of the divided second two-dimensional image, as an optical flow of the divided region of the divided first two-dimensional image; a depth image computing process of generating a depth estimation value for each pixel of the divided region of the divided first two-dimensional image based on luminance information on the divided region of the divided first two-dimensional image; and a depth image conversion process of converting the depth estimation value to a depth value for 2D display when the depth estimation value is equal to or larger than a depth threshold computed from said optical flow.
6. A computer program recorded on a non-transient recording medium that, upon execution by a processor of a computer equipped with an image display device, causes the computer to execute: a region of interest extraction process of extracting, from a two-dimensional image, a region of interest desired to be noted by an observer, by i) generating a depth image signal through depth image conversion employing a depth threshold generated from an amount of movement of images in a 2D/3D mixed display, the depth image signal including information on a distance in a three-dimensional space between a viewpoint and each pixel of the two-dimensional image including the region of interest, the depth image conversion being such that a depth value for each pixel, the depth value indicating the distance between the viewpoint and each pixel of the two-dimensional image, is converted to a new depth value for 2D display when the depth value is equal to or larger than the depth threshold, and ii) extracting the region of interest based on the generated depth image signal; a parallax image generation process of generating a parallax image by generating left- and right-eye parallax images having a parallax in a new region of interest resulting from the new depth value and having no parallax in a region other than the new region of interest, from said two-dimensional image and an image obtained by conversion of a region of interest image to be displayed in the region of interest at each of right and left viewpoints, based on said two-dimensional image and said depth image signal; and a 3D image display process of displaying the left-eye parallax image and the right-eye parallax image, wherein said region of interest extraction process comprises: an image segmentation process of receiving first left- and right-eye two-dimensional images of a most recent frame and second left- and right-eye two-dimensional images of a frame immediately preceding the most recent frame of the first left- and right-eye two-dimensional images, dividing each of the first and second left-eye two-dimensional images, and the first and second right-eye two-dimensional images, into divided regions each having common characteristic(s), thereby generating divided first left- and right-eye two-dimensional images and divided second left- and right-eye two-dimensional images, respectively; an optical flow computing process of computing a difference value between center positions of gravity of the divided regions of the divided first and second left-eye two-dimensional images or a difference value between center positions of gravity of the divided regions of the divided first and second right-eye two-dimensional images, as an optical flow of each of the divided regions of the divided first left- and right-eye two-dimensional images, respectively; a depth image computing process of generating a depth estimation value for each pixel of the divided region of the divided first left- or right-eye two-dimensional image, based on a parallax amount between the divided regions of the divided first left- and right-eye two-dimensional images; and a depth image conversion process of converting the depth estimation value to a depth value for 2D display when the depth estimation value is equal to or larger than a depth threshold computed from the optical flow.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
PREFERRED MODES
(51) Note that references to the drawings by way of marks, symbols, or figure numbers in the present disclosure are intended merely to aid understanding and not for purposes of limitation.
(52) [First Exemplary Embodiment]
(53) For an image display device for display of a mixture of 2D and 3D images according to a first exemplary embodiment of the present invention, a region of interest extraction unit for automatically converting a region of interest into a 3D image is also provided, in addition to a parallax image generation unit for generating a left-eye parallax image and a right-eye parallax image and a 3D image display unit for performing the display of the mixture of 2D and 3D images.
(54) A configuration of an image display device according to the first exemplary embodiment of the present invention will be described with reference to
(55) The parallax image generation unit 120 computes parallax information on each pixel of the region of interest image from the depth image 2010 obtained by the conversion, thereby generating a left-eye parallax image 2000Lo and a right-eye parallax image 2000Ro. Finally, the 3D image display unit 130 rearranges the generated left-eye parallax image 2000Lo and the generated right-eye parallax image 2000Ro to perform display of a mixture of 2D and 3D images.
(56) A configuration of each process block will be described in order from input to output.
(58) In order to increase the speed and accuracy of the processes described later, such as optical flow computation and depth image estimation, processing is not performed per pixel; instead, computation is performed for each of the regions obtained by dividing a received two-dimensional image in advance. A plurality of the two-dimensional images (1000, 2000) is successively supplied to the image segmentation unit 10. In the following description, the two-dimensional image of the immediately preceding frame is written as the two-dimensional image 1000, and the two-dimensional image of the current frame as the two-dimensional image 2000. The image segmentation unit 10 performs a division process of dividing each of the received two-dimensional images (1000, 2000) into regions having similar pixel characteristics (color information and position information) by referring to coordinate values and color information on the two-dimensional image. That is, each two-dimensional image is divided into regions having similar characteristics. A labeling process is then performed on each region obtained by the division, and the pixel values of the resulting output image are the labeling values of the respective divided regions. The signal obtained by the division of the two-dimensional image 1000 is denoted as the divided two-dimensional image 1000a, and the signal obtained by the division of the two-dimensional image 2000 as the divided two-dimensional image 2000a. A specific example of the image segmentation process will be described later.
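The division and labeling processes above can be sketched as follows. This is a minimal illustration, not the embodiment's actual segmentation: it groups 4-connected pixels sharing an identical value (a stand-in for the color/position similarity criterion) and labels each region in order, so that the pixel values of the output image are the labeling values of the divided regions. The function name and the exact-match criterion are assumptions for illustration.

```python
from collections import deque

import numpy as np


def segment_and_label(img):
    """Divide an image into regions of identical pixel value and label each
    region in order; the output pixel values are the region labels."""
    h, w = img.shape
    labels = np.zeros((h, w), dtype=int)  # 0 = not yet labeled
    next_label = 1
    for sy in range(h):
        for sx in range(w):
            if labels[sy, sx]:
                continue
            # flood-fill the 4-connected region sharing this pixel's value
            value = img[sy, sx]
            queue = deque([(sy, sx)])
            labels[sy, sx] = next_label
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not labels[ny, nx] and img[ny, nx] == value):
                        labels[ny, nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels
```

Applied to a frame such as the three-ball example later in the text, each ball (whose pixels share one value) becomes a single labeled region, matching the divided two-dimensional images 1000a and 2000a.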
(59) Next, the divided two-dimensional images (1000a, 2000a) are supplied to the optical flow computing unit 12, and a correspondence (optical flow) between respective regions of the two-dimensional image 1000 of the immediately preceding frame and the two-dimensional image 2000 of the current frame is estimated, using the color information, the luminance information, and area information. Specifically, a difference value between the centers of gravity of the regions corresponding to each other derived from the color information, the luminance information, and the like is output as an optical flow 2000c of this region.
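The centroid-difference computation performed by the optical flow computing unit 12 can be sketched as below. The sketch assumes region correspondence has already been established (here simply by sharing the same label value between the two divided images); the function name is illustrative.

```python
import numpy as np


def region_optical_flow(labels_prev, labels_cur, label):
    """Optical flow of one region: the difference between the centers of
    gravity of the corresponding regions in the divided images of the
    immediately preceding frame and the current frame."""
    cy0, cx0 = np.argwhere(labels_prev == label).mean(axis=0)
    cy1, cx1 = np.argwhere(labels_cur == label).mean(axis=0)
    return (cy1 - cy0, cx1 - cx0)  # displacement in pixels (rows, columns)
```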
(60) Further, the divided two-dimensional images (1000a, 2000a) and the optical flow of each region output from the optical flow computing unit 12 are supplied to the depth image computing unit 11. The depth image computing unit 11 estimates a depth value of each region, by referring to the position information, the optical flow, the luminance information, and the like of each region.
(61) A relationship between a parallax amount and a depth value will be described, using
(62) As described above, a depth value can be estimated from a parallax amount. The input signal, however, carries information from only one camera. Accordingly, the depth value is estimated from the luminance information on the divided two-dimensional image 2000a.
(63) The depth value estimated by the depth image computing unit 11 and the optical flow 2000c of each region output from the optical flow computing unit 12 are supplied to the depth image conversion unit 13. As described above, the smaller the distance between a viewpoint and an object region is, the larger the parallax amount is. Accordingly, when an object region near to the viewpoint is extracted as the region of interest, a more natural representation of a mixture of 2D and 3D images is obtained. Then, the depth image conversion unit 13 performs a conversion process on a depth image signal 2000d indicating the depth value, using conversion expressions shown in Equations (1) and (2).
D1 < Dth(v): D2 = D1  (1)
D1 ≥ Dth(v): D2 = 255  (2)
where D1 denotes the depth value before the conversion, D2 denotes a depth value obtained by the conversion, Dth denotes a depth threshold, and v denotes the optical flow 2000c.
(64) Equation (2) applies a depth threshold determined from the optical flow of each region. When the depth value of a certain region is equal to or larger than the depth threshold, the depth value of the target object represented by this region is converted to a depth value for 2D display. The depth value for 2D display is determined according to the method of arranging the right and left cameras in the 3D space. When the right and left cameras are disposed in parallel, the depth value for 2D display corresponds to a point infinitely distant from the viewpoint. When the right and left cameras are disposed by the shift-sensor method, the depth value for 2D display indicates the distance to the screen surface of each camera. An example of the conversion of the depth value using stereo cameras arranged in parallel will be described here. A target object having a depth value equal to or larger than the depth threshold is regarded as infinitely distant from the viewpoint, and its depth value is converted to 255. Equation (1) indicates that no conversion is performed when the depth value of a region is smaller than the depth threshold. By performing the conversion process as described above, a region whose converted depth value is not 255 is extracted as the region of interest image. The conversion creates the perception of so-called image pop-up, in which the image appears to come out of the screen. The depth image signal obtained by the conversion is indicated by reference numeral 2010.
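The conversion of Equations (1) and (2) can be written compactly as below. The threshold is passed in as a callable mapping an optical-flow magnitude to Dth(v), so any of the threshold relationships discussed later can be plugged in; the function name is an illustrative assumption.

```python
import numpy as np


def convert_depth(depth, flow_mag, dth):
    """Depth image conversion of Equations (1) and (2): depth values at or
    beyond the threshold Dth(v) are converted to 255 (2D display, infinitely
    distant for parallel cameras); nearer values pass through unchanged.
    `dth` is a callable mapping an optical-flow magnitude to the threshold."""
    threshold = dth(flow_mag)
    return np.where(depth < threshold, depth, 255)
```

A region whose converted depth value is not 255 is then extracted as the region of interest.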
(65) Next, a relationship between the depth threshold and the optical flow will be described. According to Equations (1) and (2), an object region near to a viewpoint is displayed stereoscopically. A region having a large optical flow may come close to an observer after several frames to become the region of interest. Thus, by setting the depth threshold to be high in advance for 3D display, a more interesting mixture of 2D and 3D images having a rich power of expression can be created. In order to achieve that purpose, the relationship between the depth threshold and the optical flow is defined as a linear relationship as indicated by Equation (3).
Dth(v) = k × v + D0  (3)
where k indicates a proportionality coefficient between the depth threshold and the optical flow, and D0 indicates the depth threshold for a still object.
(66) When an object makes extremely rapid movement at a time of observation of the 3D screen, the observer may not be able to keep track of the movement. In order to prevent such a phenomenon, a relationship between the depth threshold and the optical flow as shown in each of Equations (4) and (5) may be used.
v < vth: Dth = k1 × v + D1  (4)
v ≥ vth: Dth = k2 × v + D2  (5)
That is, when the optical flow of the object becomes equal to or larger than a certain value vth, the proportionality coefficient k2 between the depth threshold and the optical flow is set to a negative value, or is reduced below the proportionality coefficient k1 (k2 < k1). The conversion from a 2D image to a 3D image is thereby made moderate, reducing the burden on the observer's eyes.
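The piecewise-linear threshold of Equations (4) and (5) can be sketched as follows. All coefficient values here are illustrative assumptions, chosen so the two segments meet at vth; Equation (3) is the special case of a single slope k and intercept D0.

```python
def depth_threshold(v, k1=0.8, k2=-0.2, vth=10.0, d01=100.0, d02=110.0):
    """Piecewise-linear depth threshold of Equations (4) and (5): slope k1
    below vth, and a smaller (here negative) slope k2 at or above vth, so
    that very fast-moving regions are not pushed into 3D display."""
    if v < vth:
        return k1 * v + d01
    return k2 * v + d02
```

With these assumed coefficients the threshold rises with the optical flow up to vth and then falls, so rapid movement yields a lower threshold and hence a more moderate 2D-to-3D conversion.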
(67) Next, a method of extracting a region of interest will be specifically explained. A description will be given about a process of extracting a region of interest from a three-dimensional space in which three balls as shown in
(68) As shown in
(69) In step S01, the photographed two-dimensional images (1000, 2000) of the frames are chronologically supplied to the region of interest extraction unit 110. The two-dimensional image 1000 of the immediately preceding frame in that case is shown in
(70) In step S02, the division process and the labeling process are performed to output the divided two-dimensional image 1000a and the divided two-dimensional image 2000a. In the division process, the received two-dimensional image 1000 of the immediately preceding frame and the two-dimensional image 2000 of the current frame are divided into regions having similar pixel characteristics (such as color information and position information), by referring to coordinate values and the color information. In the labeling process, the regions obtained by the division are labeled in order. Since pixel values of pixels indicating a same one of the balls are equal in the received two-dimensional images 1000 and 2000, an image corresponding to each of the three balls is set to one divided region. The pixel value of the image corresponding to each of the three balls in each of the divided two-dimensional image 1000a and the divided two-dimensional image 2000a is the labeled value of the corresponding region. The divided two-dimensional image 1000a of the immediately preceding frame is shown in
(71) In step S03, a correspondence between respective regions of the divided two-dimensional image 1000a of the immediately preceding frame and the divided two-dimensional image 2000a of the current frame is estimated, using the color information and the luminance information on the divided regions. As a result, the correspondence between the respective regions of the divided two-dimensional image 1000a of the immediately preceding frame and the divided two-dimensional image 2000a of the current frame is as shown in
(72) Next, an optical flow between the respective corresponding regions of the divided two-dimensional image 2000a of the current frame and the divided two-dimensional image 1000a of the immediately preceding frame is supplied to the depth image computing unit 11. The depth image computing unit 11 refers to the position information, the optical flow, and the luminance information of each region to estimate the depth image signal 2000d for each region.
(73) Here, the depth is estimated from luminance information on the input image. Assume in advance that the light source is disposed between the objects and the viewpoint. Then the luminance difference among the three balls in the received two-dimensional image can be considered to arise from the difference in their distances from the light source, and it can be estimated that the larger the luminance value, the smaller the depth value. Based on the luminance information on the divided two-dimensional image 2000a of the current frame (the ball 1 has the same luminance value as the ball 2, and the ball 3 has the largest luminance value), the depth image signal 2000d as shown in
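The luminance-based estimation above can be sketched per region as follows. The linear inversion onto the 0..255 range is an illustrative assumption; the embodiment only requires that the estimated depth decrease as luminance increases, given a light source between the objects and the viewpoint.

```python
import numpy as np


def estimate_depth_from_luminance(img, labels):
    """Per-region depth estimation from luminance: brighter regions are
    taken to be nearer, so each region's depth estimation value is its mean
    luminance inverted onto the 0..255 range."""
    depth = np.zeros(img.shape, dtype=float)
    for label in np.unique(labels):
        mean_lum = img[labels == label].mean()
        depth[labels == label] = 255.0 - mean_lum
    return depth
```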
(74) In step S04, the depth threshold is computed from the optical flow of each region, and the depth image signal 2000d is converted. When the depth value of a certain region is equal to or larger than the computed threshold, the target object to be displayed in this region is 2D-displayed. Thus, the target object is regarded as infinitely distant from the viewpoint, and the depth value of this region is converted to 255, as indicated by Equation (2). Conversely, when the depth value of the region is less than the threshold, the depth value is output without alteration, as indicated by Equation (1). Assume that the relationship between the depth threshold and the optical flow used for converting the depth image signal 2000d is a linear relationship as shown in
(75) In step S05, the depth image signal 2010 obtained by the conversion is output.
(76) The relationship between the depth threshold and the optical flow has been assumed to be represented by a linear function; a relationship other than a linear function may also be used.
(77) As described above, by performing depth conversion in view of the optical flow for each pixel of the two-dimensional image using the region of interest extraction unit 110, the depth image signal including information corresponding to the region of interest can be generated automatically. Accordingly, the parallax image generation unit 120 can restore the distance between the object and each of the right and left viewpoints from the depth image signal 2010 obtained by the conversion, compute a parallax amount for each pixel of the current frame image 2000, and generate the left-eye parallax image 2000Lo and the right-eye parallax image 2000Ro by displacing each pixel according to the computed parallax amount.
(78) The process performed by the parallax image generation unit 120 is indicated by Equation (6), which gives the shift amount Δu(u, v) of a pixel (u, v) of the current frame image 2000.
(79)
where z(u, v) indicates the distance between one of the right and left viewpoints and the point in three-dimensional space corresponding to the pixel (u, v) in the current frame image, and can be computed from the depth image signal 2010 obtained by the conversion. IOD indicates the distance between the right and left viewpoints, and Fov indicates the field of view. That is, when the depth value of a target pixel is large, the pixel is distant from the viewpoint and the shift amount Δu is reduced; a pixel whose depth value has been converted to 255 is regarded as infinitely distant from the viewpoint, so its parallax amount is zero. Conversely, when the depth value of a target pixel is small, the pixel is near the viewpoint and the shift amount Δu increases.
(80) Next, using the computed shift amount, the pixel value of the pixel (u, v) of the current frame image is applied to the coordinate (u−Δu, v) of the left-eye parallax image 2000Lo and the coordinate (u+Δu, v) of the right-eye parallax image 2000Ro. By these processes, the left-eye parallax image 2000Lo and the right-eye parallax image 2000Ro, having a parallax in the region of interest alone, can be generated.
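The shift-and-displace process can be sketched as follows. Since the body of Equation (6) is not reproduced here, the sketch uses a hypothetical 1/z form that matches the stated behavior (a large z gives a small shift, an infinite z gives zero shift); IOD, Fov, the pixel scale, and the dictionary image representation are all illustrative assumptions.

```python
import math

def shift_amount(z, iod=6.5, fov=math.radians(60), scale=320.0):
    # Hypothetical stand-in for Equation (6): the shift falls off as 1/z,
    # and an infinitely distant point (z = inf) yields zero parallax.
    if math.isinf(z):
        return 0.0
    return iod / (2.0 * z * math.tan(fov / 2.0)) * scale

def generate_parallax_images(image, z_of_pixel):
    # image: {(u, v): pixel_value}; z_of_pixel: {(u, v): distance z(u, v)}.
    left, right = {}, {}
    for (u, v), value in image.items():
        du = round(shift_amount(z_of_pixel[(u, v)]))
        left[(u - du, v)] = value   # pixel applied to (u - du, v) of the left-eye image
        right[(u + du, v)] = value  # pixel applied to (u + du, v) of the right-eye image
    return left, right
```

A pixel whose depth was converted to 255 (z treated as infinite) lands at the same coordinate in both parallax images, so only the region of interest carries parallax.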
(81) Finally, the 3D image display unit rearranges the generated left-eye parallax image and right-eye parallax image to display the mixture of the 2D and 3D images.
(82) In this exemplary embodiment, when a region of interest is extracted, the depth value and the optical flow of a target object are considered. An object having a large optical flow may come close to the observer after several frames and may therefore become the region of interest. Accordingly, the depth threshold is raised in advance so that the object having the large optical flow is 3D-displayed, and a more interesting mixture of 2D and 3D contents with a rich power of expression can thereby be created. On the other hand, so that the eyes are not left unable to keep track of a rapid movement, the depth threshold is made to depend on the optical flow; the conversion from 2D display to 3D display then proceeds moderately, reducing the burden on the eyes. Further, the depth value of a region other than the region of interest is converted to a depth value for 2D display, so that only the region of interest is automatically converted into a 3D image.
(83) [Variation Example]
(84) In this exemplary embodiment, left-eye two-dimensional images (1000L and 2000L) and right-eye two-dimensional images (1000R and 2000R) of a plurality of frames can also be supplied to an image display device 1a, as shown in
(85)
(86) Then, the optical flow computing unit 12a estimates a correspondence between respective regions of a divided two-dimensional image (1000La) of an immediately preceding frame and a divided two-dimensional image (2000La) of a current frame and a correspondence between respective regions of a divided two-dimensional image (1000Ra) of the immediately preceding frame and a divided two-dimensional image (2000Ra) of the current frame, using color information and luminance information on each of the right-eye and left-eye two-dimensional image signals obtained by the division. Then, the optical flow computing unit 12a outputs a difference value between the centers of gravity of the regions of the divided two-dimensional image (1000La) and the divided two-dimensional image (2000La) corresponding to each other, as an optical flow 2000Lc of the region. The optical flow computing unit 12a also outputs a difference value between the centers of gravity of the regions of the divided two-dimensional image (1000Ra) and the divided two-dimensional image (2000Ra) corresponding to each other as an optical flow 2000Rc of the region.
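The per-region optical flow described above, i.e., the difference value between the centers of gravity of corresponding regions in consecutive frames, can be sketched as follows; representing each region as a list of pixel coordinates is an illustrative assumption.

```python
def centroid(pixels):
    # Center of gravity of a region given as a list of (x, y) coordinates.
    n = len(pixels)
    return (sum(p[0] for p in pixels) / n, sum(p[1] for p in pixels) / n)

def region_optical_flow(prev_region_pixels, curr_region_pixels):
    # Optical flow of a region = difference between the centers of gravity
    # of the corresponding regions in the preceding and current frames.
    px, py = centroid(prev_region_pixels)
    cx, cy = centroid(curr_region_pixels)
    return (cx - px, cy - py)
```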
(87) Next, the divided left-eye two-dimensional image 1000La of the immediately preceding frame, the divided right-eye two-dimensional image 1000Ra of the immediately preceding frame, the divided left-eye two-dimensional image 2000La of the current frame, the divided right-eye two-dimensional image 2000Ra of the current frame, and the optical flows between the respective corresponding regions of the right-eye and left-eye two-dimensional images of the current frame are supplied to the depth image computing unit 11a. The depth image computing unit 11a estimates a correspondence between the respective regions of the divided left-eye two-dimensional image 2000La of the current frame and the divided right-eye two-dimensional image 2000Ra of the current frame, by referring to the optical flow of and luminance information on the respective regions of the divided left-eye two-dimensional image 2000La of the current frame and the divided right-eye two-dimensional image 2000Ra of the current frame. The depth image computing unit 11a estimates a depth value from a parallax amount of the region obtained from the centers of gravity of the regions corresponding to each other.
(88) Finally, the computed depth value and the optical flow 2000Lc are supplied to the depth image conversion unit 13a. Then, employing a depth threshold Dth with an optical flow used as a parameter, a region of interest is determined. The depth value of a region other than the region of interest is then converted to the depth value for 2D display. Then, a depth image 2010L obtained by the conversion is output.
(89) A method of extracting the region of interest when the left-eye two-dimensional images (1000L and 2000L) and the right-eye two-dimensional images (1000R and 2000R) of the plurality of frames are supplied to the region of interest extraction unit 110a in
(90) In step S11, the left-eye two-dimensional images (1000L, 2000L) and the right-eye two-dimensional images (1000R, 2000R) of the plurality of frames which have been photographed are chronologically supplied to the region of interest extraction unit 110a. These input images are shown in
(91) In step S12, an image segmentation process is performed on the left-eye two-dimensional images (1000L, 2000L) and the right-eye two-dimensional images (1000R, 2000R) of the plurality of frames that have been received.
(92) In step S13, using the color information and the luminance information, the correspondence between the respective regions of the divided two-dimensional image (1000La) of the immediately preceding frame and the divided two-dimensional image (2000La) of the current frame and the correspondence between the respective regions of the divided two-dimensional image (1000Ra) of the immediately preceding frame and the divided two-dimensional image (2000Ra) of the current frame are estimated, and the difference value between the centers of gravity of the regions of the divided two-dimensional images (1000La) and (2000La) corresponding to each other and the difference value between the centers of gravity of the regions of the divided two-dimensional images (1000Ra) and (2000Ra) corresponding to each other are output as the optical flows of the region (refer to
(93) Next, the divided left-eye two-dimensional image 1000La of the immediately preceding frame, the divided right-eye two-dimensional image 1000Ra of the immediately preceding frame, the divided left-eye two-dimensional image 2000La of the current frame, the divided right-eye two-dimensional image 2000Ra of the current frame, and the optical flows of the respective regions of the right-eye two-dimensional image and left-eye two-dimensional image of the current frame are supplied to the depth image computing unit 11a. The depth image computing unit 11a estimates a correspondence between the respective regions of the divided left-eye two-dimensional image 2000La of the current frame and the divided right-eye two-dimensional image 2000Ra of the current frame, by referring to the optical flows of the respective regions and the luminance information on the respective regions. Then, the depth image computing unit 11a obtains a parallax amount between the regions corresponding to each other, based on positions of the centers of gravity of the corresponding regions, thereby estimating a depth value.
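The stereo depth estimation in step S13 can be sketched as follows: the parallax amount of a region pair is taken from the horizontal offset of their centers of gravity. The standard stereo relation z = baseline × focal / disparity, and the baseline and focal-length values, are illustrative assumptions; the embodiment states only that the depth value is estimated from the parallax amount of the corresponding regions.

```python
def depth_from_stereo(left_region, right_region, baseline=6.5, focal=500.0):
    # left_region / right_region: lists of (x, y) pixel coordinates of a
    # pair of corresponding regions in the left- and right-eye images.
    lx = sum(p[0] for p in left_region) / len(left_region)
    rx = sum(p[0] for p in right_region) / len(right_region)
    disparity = abs(lx - rx)   # parallax amount from the centers of gravity
    if disparity == 0:
        return float('inf')    # no parallax: regarded as indefinitely distant
    return baseline * focal / disparity
```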
(94) In step S14, depth conversion is performed on one of the depth image signal 2000Ld for a left eye and a depth image signal 2000Rd for a right eye generated in step S13. The specific conversion process is the same as that in step S04 described above when one two-dimensional image has been received.
(95) In step S15, the depth image signal 2010L for the left eye obtained by the conversion is output. This depth image signal is supplied to the parallax image generation unit 120 as described above, thereby generating a left-eye parallax image 2000Lo and a right-eye parallax image 2000Ro in which only the region of interest has a parallax, according to Equation (6).
(96) When only a depth image signal 1000d and a two-dimensional image 1000 are supplied to an image display device 1, a region of interest extraction unit 110b can be configured as shown in
(97) The above description was given about a method of converting 2D display to 3D display by the depth conversion in order to create a perception of image pop-up. This method can also be applied to a case where a visual depth perception is created. When the depth value of a certain region is smaller than a computed threshold in that case, the depth value of the certain region is converted to a depth value for 2D display. On the contrary, when the depth value of this region is larger than the threshold, the depth value is output without alteration.
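The depth-perception variant inverts the comparison of Equations (1) and (2), as sketched below. The value used for 2D display (255, as in the pop-up case) is an assumption; the text says only that the near region's depth value is converted to a depth value for 2D display.

```python
def convert_depth_for_depth_perception(depth, threshold, depth_for_2d=255):
    # Variant creating a visual depth perception rather than pop-up:
    # a region with a depth value smaller than the threshold is flattened
    # to the 2D-display depth value, while a farther region keeps its
    # depth value and remains 3D.
    if depth < threshold:
        return depth_for_2d
    return depth
```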
(98) In this exemplary embodiment, when a region of interest is extracted, the depth value and the optical flow of a target object are considered. An object having a large optical flow may come close to the observer after several frames and may therefore become the region of interest. Accordingly, the depth threshold is raised in advance so that the object having the large optical flow is 3D-displayed, and a more interesting mixture of 2D and 3D contents with a rich power of expression can thereby be created. On the other hand, so that the eyes are not left unable to keep track of a rapid movement, the depth threshold is made to depend on the optical flow; the conversion from 2D display to 3D display then proceeds moderately, reducing the burden on the eyes. Further, the depth value of a region other than the region of interest is converted to a depth value for 2D display, so that only the region of interest is automatically converted into a 3D image.
(99) As described above, a mixture of 2D and 3D images, in which a region of interest is automatically extracted from images viewed from different viewpoints and only the region of interest is represented stereoscopically, can be displayed. Patent Document 3 discloses a technology of disposing an identification mark for each of a 2D image and a 3D image in a region other than a region of interest, but discloses neither a method of automatically extracting the region of interest nor a specific method of determining the region of interest. In this exemplary embodiment, a depth image signal is estimated from the amount of parallax between the received right-eye and left-eye two-dimensional images, so that more accurate depth information can be obtained. Further, a region of interest is extracted by employing a depth threshold function using an optical flow as a parameter. Thus, a mixture of 2D and 3D video contents can be created in which the object desired to be noted is extracted more smoothly and the burden on the eyes is reduced.
(100) [Second Exemplary Embodiment]
(101) Next, a second exemplary embodiment will be described in detail, with reference to drawings.
(102) A two-dimensional image 1000, a depth estimation LUT signal 1009 for estimating a depth value, and a depth threshold Dth for converting the depth value are supplied to the image display device 2. The depth estimation LUT signal 1009 is a look-up table signal for estimating the depth value from the shape and the area of each region.
(103)
(104) The image segmentation unit 20 receives a two-dimensional image, performs an image segmentation process based on coordinate values and color information, and outputs a divided two-dimensional image. Next, the divided two-dimensional image is supplied to the depth image generation unit 21, and a depth image signal 1000d is then generated by referring to a table that defines a relationship among the shape, the area, and the depth value of each region.
(105) The depth image conversion unit 23 refers to the received depth threshold, and performs the depth conversion process indicated by Equations (1) and (2) on the depth image signal 1000d generated by the depth image generation unit 21. The depth value of a region other than a region of interest is set to 255 in a depth image signal 1010 obtained by the conversion.
(106) Next, a specific region of interest extraction process in this exemplary embodiment will be explained using a button screen often used as an operation screen for industrial operation, as an example. The explanation of the process will be given with reference to a flowchart (in
(107) A screen as shown in
(108) In step S21, a still two-dimensional image 1000 is supplied to the region of interest extraction unit 210.
(109) In step S22, the process of dividing the received two-dimensional image into regions each having uniform pixel characteristics is performed, by referring to coordinate values and luminance values.
(110) In step S23, a depth value is assigned to each region by referring to the depth estimation LUT signal 1009 and using features (shape information and area) of each region of the divided two-dimensional image as parameters that are independent of each other. Specifically, the area of each region is first computed. In the example shown in
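The LUT-based assignment of step S23 can be sketched as follows; the keying of the table by a (shape, area) feature pair, and the default depth for features absent from the table, are illustrative assumptions about the depth estimation LUT signal 1009.

```python
def assign_depth_by_lut(regions, depth_lut, default_depth=255):
    # regions: {region_id: (shape, area_class)} features of each divided
    # region; depth_lut: hypothetical stand-in for the depth estimation
    # LUT signal 1009, mapping the independent features to a depth value.
    return {region_id: depth_lut.get(features, default_depth)
            for region_id, features in regions.items()}
```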
(111)
(112) In step S24, the depth image conversion indicated by Equations (1) and (2) is performed on the depth image signal 1000d generated in the above-mentioned step S23. When the depth threshold is set to 40 for
(113) In step S25, the depth image signal obtained by the conversion is output to the parallax image generation unit 120.
(114) In this exemplary embodiment, the description was directed to the process of automatically extracting a region of interest image and the process of automatically generating a depth image from one two-dimensional image. These processes can also be applied to a case where the input consists of right-eye and left-eye two-dimensional images and to a case where the input includes a depth image. When a depth image signal is included in the input image signal, the depth image generation unit 21 becomes unnecessary.
(115) In this exemplary embodiment, even if an input signal indicates a still two-dimensional image, the depth value of a region other than a region of interest can be converted to a depth value for 2D display, and only the region of interest can be automatically converted into a 3D image.
(116) [Third Exemplary Embodiment]
(117) Next, a third exemplary embodiment will be described in detail, with reference to drawings. In the first to second exemplary embodiments, generation of image data to be displayed by the 3D image display unit was mainly described. In this exemplary embodiment, control by a controller for image display will be described, in addition to generation of data to be displayed by a 3D image display unit.
(118) When the image display device is a liquid crystal monitor, it becomes possible to provide a 3D image having a rich power of expression with low power consumption by directly controlling the backlight of the liquid crystal monitor.
(119) A plurality of two-dimensional images (1000, 2000) are chronologically supplied to the image display device 3.
(120) The backlight control signal generation unit 34 computes and outputs a position signal 1100 for the LEDs of the backlight, indicating a position corresponding to the region of interest, that is, a region having a depth value other than 255, using the depth image signal 2010 obtained by the depth image conversion.
(121) The backlight control signal generation unit 34 refers to the luminance values of the two-dimensional image 2000 to output a luminance signal 1200 for each LED of the backlight. Further, the backlight control signal generation unit 34 outputs a luminance conversion LUT signal 1300 set in the backlight control signal generation unit in advance. The luminance conversion LUT signal 1300 indicates a reference table for converting the luminance signal. A specific example of luminance conversion using the luminance conversion LUT signal 1300 will be described later.
(122) The parallax image generation unit 320 generates a left-eye parallax image 2000Lo and a right-eye parallax image 2000Ro by shifting each pixel to a position corresponding to the computed parallax, based on the depth image signal 2010 obtained by the conversion, and outputs the left-eye parallax image 2000Lo and the right-eye parallax image 2000Ro to the 3D image display unit 330 simultaneously with output of the backlight control signals described above.
(123) The 3D image display unit 330 includes a liquid crystal controller 341, a backlight controller 342, a liquid crystal panel 343, and a LED backlight 344, as shown in
(124)
(125) Here, the specific process of the luminance conversion will be described. First, the 8-bit luminance signal is supplied to the backlight luminance conversion circuit 3421. It is then determined whether or not the luminance signal currently received corresponds to the region of interest, using the position signal indicating the region of interest. According to the result of the determination, an appropriate luminance value is looked up in the luminance conversion LUT signal 1300, thereby performing the luminance conversion. The newly generated luminance signal is then supplied to the shift register 3422, which receives the luminance signal one bit at a time and writes each bit into the register. When eight bits have been written, the shift register 3422 transfers the 8-bit signal to the latch register 3423. Finally, a switch signal for controlling the corresponding one or more of the LEDs is generated by a switch 3424; this switch signal controls each LED of the backlight.
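The table-lookup stage of the circuit above can be sketched as follows; the two separate 256-entry tables for the region of interest and the background, and the flat list of LED luminances, are illustrative assumptions about how the luminance conversion LUT signal 1300 is organized.

```python
def convert_backlight_luminance(led_luminances, roi_mask, lut_roi, lut_background):
    # Sketch of the backlight luminance conversion circuit 3421: each 8-bit
    # LED luminance is converted through a different lookup table depending
    # on whether the LED position falls inside the region of interest.
    converted = []
    for luminance, in_roi in zip(led_luminances, roi_mask):
        table = lut_roi if in_roi else lut_background
        converted.append(table[luminance])
    return converted
```

With an identity table for the region of interest and a dimming table for the background, an input luminance of 200 stays at 200 inside the region of interest while the background is reduced, which is the behavior that saves power.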
(126) Next, the reason why power consumption can be reduced due to the process of the luminance conversion by the backlight luminance conversion circuit 3421 will be specifically described, by referring to
(127) In order to implement the LCD screen having a luminance value of 200 in the region of interest and a luminance value of 50 in the region other than the region of interest, as shown in
(128) Then, in order to reduce the power consumption of the liquid crystal panel, the backlight luminance conversion circuit 3421 performs backlight control. First, each 8 bits of a luminance signal for the backlight in
(129) On the other hand, in order to obtain the luminance distribution of the LCD screen as shown in
(130) The above description was directed to the case where the backlight luminance conversion circuit 3421 for the LEDs is provided in the backlight controller 342; the backlight luminance conversion circuit 3421 may instead be provided within the parallax image generation unit 320. The description was directed to the configuration of the 3D image display unit based on the image display device 3, but the control in this exemplary embodiment can also be applied to the first and second exemplary embodiments. As described above, by giving different luminance values to the LEDs of the backlight for the region of interest and the LEDs for the background region, power consumption can be reduced. Note, however, that there are displays having no backlight; for such a display screen, it is needless to say that the use of (or reference to) the backlight per se may be dispensed with. For example, the backlight controller may be replaced by a luminance controller in general, and the LCD may be replaced with any other suitable display screen device.
(131) Each disclosure of the Patent Documents described above is incorporated herein by reference. Modifications and adjustments of the exemplary embodiments and an example are possible within the scope of the overall disclosure (including claims) of the present invention, and based on the basic technical concept of the invention. Various combinations and selections of various disclosed elements are possible within the scope of the claims of the present invention. That is, the present invention of course includes various variations and modifications that could be made by those skilled in the art according to the overall disclosure including the claims and the technical concept. To take an example, all the exemplary embodiments can be implemented by a hardware configuration, but the present invention is not limited to this implementation. All the processes can also be performed by causing a CPU (Central Processing Unit) to execute a computer program. In this case, the computer program can be provided by being stored in a storage medium, or can be provided by being transferred through the Internet or other communication medium. As the 3D image display unit in each of the first and second embodiments, a display device such as an LCD, an organic EL, an LED, a PDP, or the like can be applied. When the 3D image display unit is a light-emitting display device such as the organic EL, the LED, or the PDP in particular, power consumption is determined by the luminance of each pixel; a filtering process or gray-scale conversion can then be employed to reduce the luminance value of the region other than the region of interest, whereby a remarkable effect of reducing power consumption is obtained.
(132) It should be noted that other objects, features and aspects of the present invention will become apparent in the entire disclosure and that modifications may be done without departing the gist and scope of the present invention as disclosed herein and claimed as appended herewith.
(133) Also it should be noted that any combination or selection of the disclosed and/or claimed elements, matters and/or items may fall under the modifications aforementioned.