Apparatus and method for selecting camera providing input images to synthesize virtual view images
11706395 · 2023-07-18
CPC classification
H04N13/111
ELECTRICITY
Abstract
The present disclosure provides an apparatus and a method for selecting a camera providing input images to synthesize virtual view images. According to the present disclosure, a method of selecting a camera providing an input image to synthesize a virtual view image may comprise, for a camera providing an input image, determining whether or not the camera is comprised in a field of view (FoV) at a virtual view position and, in response to the camera determined to be comprised in the field of view, selecting the camera to synthesize the virtual view image, wherein the determining determines, by way of comparison, whether or not a direction from the virtual view position to a position of the camera is in the FoV at the virtual view position.
Claims
1. A method of selecting a camera providing an input image to synthesize a virtual view image, the method comprising: for a camera providing an input image, determining whether or not the camera is comprised in a field of view (FoV) at a virtual view position; and in response to the camera determined to be comprised in the field of view, selecting the camera to synthesize the virtual view image, wherein determining whether or not the camera is comprised in the FoV at the virtual view position is based on whether or not a direction from the virtual view position to a position of the camera is in the FoV at the virtual view position, wherein the position of the camera is represented on a virtual sphere with an extended radius, and wherein the position of the camera on the virtual sphere is determined as a point at which the camera meets the virtual sphere when the camera is moved from an actual camera position in a viewing direction of the camera.
2. The method of claim 1, further comprising determining, in response to the camera determined not to be comprised, whether or not there is an overlap between an FoV of the camera and the FoV at the virtual view position.
3. The method of claim 2, further comprising extending, in response to the determining that there is the overlap, the FoV at the virtual view position to comprise the camera.
4. The method of claim 3, wherein the extending of the FoV at the virtual view position comprises extending the FoV at the virtual view position to the FoV of the camera or extending the FoV at the virtual view position to an FoV with a predetermined size.
5. The method of claim 1, wherein, in response to a camera rig being spherical, the position of the camera on the virtual sphere is determined by moving by a difference of radius between the virtual sphere and the camera rig in the viewing direction of the camera.
6. The method of claim 1, wherein the virtual view position is outside the camera rig.
7. The method of claim 1, wherein, in response to a camera rig being equally radial in all directions, a viewing direction of the camera coincides with a direction from a center of the camera rig to a center of the camera.
8. The method of claim 7, wherein the determining comprises: defining the FoV as an area with four vertexes; defining straight lines connecting the virtual view position to the four vertexes; obtaining intersection points between the straight lines and a spherical surface of the camera rig; and selecting a camera located in a figure constructed by the intersection points.
9. An apparatus for selecting a camera providing an input image to synthesize a virtual view image, the apparatus comprising: a receiver for receiving information on a camera providing an input image; and a processor for, using the information on the camera providing the input image, determining, for the camera, whether or not the camera is comprised in a field of view (FoV) at a virtual view position, and in response to the camera determined to be comprised, selecting the camera to synthesize the virtual view image, wherein the processor determines whether or not the camera is comprised in the FoV at the virtual view position based on whether or not a direction from the virtual view position to a position of the camera is in the FoV at the virtual view position, wherein the position of the camera is represented on a virtual sphere with an extended radius, and wherein the position of the camera on the virtual sphere is determined as a point at which the camera meets the virtual sphere when the camera is moved from an actual camera position in a viewing direction of the camera.
10. The apparatus of claim 9, wherein the processor, in response to the camera determined not to be comprised, determines whether or not there is an overlap between an FoV of the camera and the FoV at the virtual view position.
11. The apparatus of claim 10, wherein the processor, in response to the determining that there is the overlap, extends the FoV at the virtual view position to comprise the camera.
12. The apparatus of claim 11, wherein, in response to the extending the FoV, the FoV at the virtual view position is extended to the FoV of the camera or is extended to an FoV with a predetermined size.
13. The apparatus of claim 9, wherein the virtual view position is outside the camera rig.
14. The apparatus of claim 9, wherein the information on the camera providing the input image comprises information indicating that a camera rig is equally radial in all directions and that a viewing direction of the camera coincides with a direction from a center of the camera rig to a center of the camera.
15. The apparatus of claim 14, wherein the processor, in determining whether or not the camera is comprised, defines the FoV as an area with four vertexes, obtains intersection points between straight lines connecting the virtual view position to the four vertexes and a spherical surface of the camera rig, and then selects a camera located in a figure constructed by the intersection points.
16. A non-transitory computer readable medium storing a computer program which, when executed, causes a computing device to perform a method comprising: for a camera providing an input image, determining whether or not the camera is comprised in a field of view (FoV) at a virtual view position; and in response to the camera determined to be comprised, selecting the camera to synthesize the virtual view image, wherein determining whether or not the camera is comprised in the FoV at the virtual view position is based on whether or not a direction from the virtual view position to a position of the camera is in the FoV at the virtual view position, wherein the position of the camera is represented on a virtual sphere with an extended radius, and wherein the position of the camera on the virtual sphere is determined as a point at which the camera meets the virtual sphere when the camera is moved from an actual camera position in a viewing direction of the camera.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION OF THE INVENTION
(11) Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings, which will be easily implemented by those skilled in the art. However, the present disclosure may be embodied in many different forms and is not limited to the embodiments described herein. In the following description of the embodiments of the present disclosure, a detailed description of known functions and configurations incorporated herein will be omitted when it may make the subject matter of the present disclosure rather unclear. In addition, parts not related to the description of the present disclosure in the drawings are omitted, and like parts are denoted by similar reference numerals.
(12) In the present disclosure, components that are distinguished from each other are intended to clearly illustrate each feature. However, it does not necessarily mean that the components are separate. That is, a plurality of components may be integrated into one hardware or software unit, or a single component may be distributed into a plurality of hardware or software units. Thus, unless otherwise noted, such integrated or distributed embodiments are also included within the scope of the present disclosure.
(13) In the present disclosure, components described in the various embodiments are not necessarily essential components, and some may be optional components. Accordingly, embodiments consisting of a subset of the components described in one embodiment are also included within the scope of the present disclosure. Also, embodiments that include other components in addition to the components described in the various embodiments are also included in the scope of the present disclosure.
(14) Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings.
(15) In the present disclosure, a view may have a meaning including a virtual view and a free view.
(16) Also, in the present disclosure, a camera and an input camera may be used with the same meaning.
(17) Also, in the present disclosure, an input image may include an omnidirectional light field input image and may be an image collected from a camera.
(18) Also, in the present disclosure, a position of a virtual view and a projection center of the virtual view may be used with the same meaning.
(19) Also, in the present disclosure, an FoV of a virtual view, an FoV of a virtual view position, and an FoV of a virtual view image may all be used with the same meaning.
(21) In the present disclosure, several embodiments will be disclosed including a method and apparatus for selecting a camera providing an input image for virtual view synthesis by using only camera parameters rather than geometry information. As described above, in the present disclosure, a significant effect may be expected when selecting a camera obtaining an omnidirectional light field image. However, this does not mean that the present disclosure may be used only to select an omnidirectional light field image. Whenever it is necessary to select a camera providing an input image for virtual view image synthesis, the present disclosure may be applicable.
(23) However, the structure of a camera rig obtaining an omnidirectional light field input image according to an embodiment of the present disclosure is not limited to the sphere-like camera rig of
(24) Although it is obvious that various structures of camera rigs as described above are available, when describing many embodiments of the present disclosure below, unless otherwise specified, it is assumed that the camera rig with the shape of
(25) A camera selection apparatus providing an input image for virtual view image synthesis may be configured in various structures. The apparatus may be designed to include a processor for selecting an input image and a camera rig. In this case, the locations of one or more cameras included in the camera rig, the radius of the camera rig, and a viewing direction of each camera may be identified from the camera rig by the processor, for example. However, the apparatus described in
(27) More particularly, as an example, an apparatus 202 for selecting a camera providing the input image includes a receiver 204 for receiving information on the camera and a processor 205 for determining whether or not the camera is included in a field of view (FoV) at a virtual view position by using information 201 on the camera providing the input image and, when it is determined that the camera is included, selecting the camera. Herein, the processor determines whether or not the camera is included by comparing a direction from the virtual view position to the position of the camera with the field of view at the virtual view position.
(28) For example, the information 201 on the camera providing the input image may serve as an input value of the apparatus 202 for selecting a camera providing the input image and be a value received by the receiver 204 of the apparatus 202. Information on a camera providing an input image may include the position of each of one or more cameras included in a camera rig, information on each camera, and information on a shape or a structure of the camera rig. Herein, the shape of the camera rig may be estimated by the processor 205 using the positions of the cameras included in the camera rig. However, the shape information itself may also be received through the receiver 204, but is not limited thereto. In addition, the information may include the viewing direction of each of the one or more cameras included in the camera rig. However, when the camera rig has an equally radial shape, each viewing direction coincides with the direction from the rig center to the camera center, and thus this information may be omitted.
(29) As an example, the positions of each of the one or more cameras may be represented by a position vector, and each direction that the camera is looking at may be given as a rotation matrix or a direction vector but is not limited thereto.
(30) When receiving the information on the camera rig from the receiver 204, the processor 205, as an example, may determine a virtual view position for synthesizing a virtual view image or identify a field of view at the virtual view position according to a preset virtual view position and then determine whether or not the camera is included in the field of view. Herein, for example, the processor may define a virtual sphere with a significantly extended radius and then determine the positions of each camera included in the camera rig on the extended sphere. For each camera, it is determined whether or not a vector connecting the position of the camera thus determined and a virtual view position is included in the field of view. When it is determined that the vector is included in the field of view, the camera may be selected, and an image collected from the camera, that is, an input image may be used to synthesize a virtual view image. This will be described in further detail in
(31) In addition, even when it is determined that the camera is not included in the field of view, it should be further determined whether an image collected from the camera, that is, a camera image may be actually used to synthesize a virtual view image. This will be described in further detail in
(33) As one embodiment, the selection method includes determining whether or not a camera is included in a field of view at a virtual view position (S301) and selecting the camera when it is determined that the camera is included (S302). Herein, the determining may include comparing a direction from the virtual view position to the position of the camera with the field of view at the virtual view position.
(34) For example, the determining whether or not a camera is included in a field of view at a virtual view position (S301) may be, more specifically, determining, for every camera included in a camera rig, whether or not the camera is included in the field of view at the virtual view position. In addition, this step may be performed after defining a virtual sphere with a significantly extended radius and determining each camera position on the sphere thus extended. The step S301 may include determining, for each camera, whether or not the camera position determined on the virtual sphere is included in the field of view.
(35) As an example, the position of each of the one or more cameras may be represented by a position vector, and the direction that each camera is looking at may be given as a rotation matrix or a direction vector. In addition, the determining of whether or not a camera is included in a field of view may include obtaining a vector oriented from the virtual view position to the position of each of the one or more cameras and determining whether or not the vector is included in the field of view at the virtual view position, but is not limited thereto. This will be described in further detail in
(36) Herein, when a camera rig is equally radial in all directions, simpler selection may be possible. In this regard, an embodiment may be described in further detail in
(37) When it is determined that the camera is included, an image of the camera selected by the selecting (S302) of the camera, that is, an input image may be used to synthesize a virtual view image.
(39) For example, as described above, the position of a camera included in the camera rig and its viewing direction may be given as a position vector C.sub.i and a rotation matrix R.sub.i or a direction vector, respectively, but are not limited thereto. A polar coordinate representation may also be considered. A virtual view position may exist either inside or outside the camera rig.
(41) As one embodiment, first, the selection method may define a virtual sphere 502 with a significantly extended radius. A position of the i-th camera (C.sub.i) included in a camera rig may be expressed as C.sub.i′ 503 on the virtual sphere 502. This may be performed for every camera included in the camera rig. When every C.sub.i′ is determined, a direction from a virtual view position 505 to C.sub.i′ 503 may be obtained. Herein, the direction and the position may be represented by a rotation matrix, a direction vector and a position vector but are not limited thereto. A direction of viewing a scene at the virtual view position V 505 may also be represented as a direction vector 501. There is no limit to the method of representing the direction. Whether or not the vector VC.sub.i′ 506 is included in a FoV 504 may be determined by considering the direction vector indicating a viewing direction at the virtual view position and the FoV 504. When it is determined that it is included in the FoV 504, a camera C.sub.i corresponding to the C.sub.i′ may be selected. A camera image collected from the selected camera C.sub.i may be used to synthesize an image at the virtual view position.
(42) More particularly, the virtual sphere 502 may be defined as a virtual sphere with a significantly extended radius. The radius of the extended sphere may be determined according to a maximum range of geometry information of a scene or may be set to any length that is significantly larger than the size of the camera rig. As the radius may also be determined by other methods, it is not limited to these methods.
(43) Next, by moving each camera included in a camera rig from each camera position by using a direction vector indicating a viewing direction of the camera, a point C.sub.i′ meeting the virtual sphere may be obtained. As an example, the calculation for obtaining C.sub.i′ may be performed simply by moving a point as shown in Formula (1) below but is not limited thereto.
C.sub.i′=C.sub.i+DirectionVector·(R′−R) Formula (1)
(44) That is, Formula (1) indicates that the camera position C.sub.i′ 503 on a virtual sphere is obtained by moving from a camera position C.sub.i by the difference between a radius R of the camera rig and a radius R′ of the virtual sphere along a direction vector (DirectionVector in the formula) indicating the viewing direction of the camera. Herein, R is the radius of the camera rig, and R′ is the radius of the virtual sphere. In addition, for every C.sub.i′ 503, a vector 506 from the virtual view position 505 to C.sub.i′ may be obtained. Whether or not it is included in the FoV may be determined by obtaining an angle between a direction vector of a virtual view and every vector VC.sub.i′ 506, as shown in Formula (2) below.
(45) cos.sup.−1((DirectionVector·VC.sub.i′)/(|DirectionVector||VC.sub.i′|))≤FoV/2 Formula (2)
(46) As an example, a camera may be selected by finding the index i of a camera satisfying the above comparative Formula (2). As this method determines whether or not the direction in which a camera is looking at a scene is included in the FoV, an input image may be selected not based on the center position of each camera but based on its viewing direction.
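The selection by Formulas (1) and (2) can be sketched in code. This is a minimal illustration under stated assumptions, not the disclosure's implementation; the function name `select_cameras` and its parameters are hypothetical.

```python
import numpy as np

def select_cameras(cam_positions, cam_directions, view_pos, view_dir,
                   fov_rad, R, R_prime):
    """Illustrative sketch: pick cameras whose viewing direction falls
    inside the virtual view's FoV, following Formulas (1) and (2)."""
    selected = []
    view_dir = view_dir / np.linalg.norm(view_dir)
    for i, (C_i, d_i) in enumerate(zip(cam_positions, cam_directions)):
        d_i = d_i / np.linalg.norm(d_i)
        # Formula (1): move the camera along its viewing direction onto
        # the virtual sphere of radius R'
        C_i_prime = C_i + d_i * (R_prime - R)
        # Vector from the virtual view position V to C_i'
        v = C_i_prime - view_pos
        v = v / np.linalg.norm(v)
        # Formula (2): the angle between the virtual view direction and
        # VC_i' must be within half the FoV
        angle = np.arccos(np.clip(np.dot(view_dir, v), -1.0, 1.0))
        if angle <= fov_rad / 2:
            selected.append(i)
    return selected
```

For an equally radial rig, `cam_directions` may simply be the normalized camera positions, matching the later description of such rigs.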
(48) More particularly, when a camera is selected to synthesize a virtual view image as described above, there may be a case in which a camera that is determined not to be included in the FoV may actually be used to synthesize the virtual view image. Accordingly, the method concerns a way of additionally selecting such a camera.
(49) The process of selecting a camera for synthesizing a virtual view image may include the method described in
(50) Even when the vector 506 of
(51) To make up for such a case, it is possible to apply a method of extending a range including a camera providing an input image by considering a virtual FoV that is extended from an actual FoV.
(52) For example, according to the method of
(53) For example, when determining whether or not there is an overlap 605 between FoVs, a method of comparing cosine values between an FoV of a virtual view and an FoV of C.sub.i+1 may be used. Whether or not an image of C.sub.i+1 may be used to synthesize a virtual view may be determined by comparing the two lines 606 and 608 constituting the FoV of the virtual view image and the two lines 609 and 610 constituting the FoV at the camera position. More particularly, for example, when C.sub.i+1 is to the right of the line 606 constituting the FoV at the virtual view position, it may be determined whether or not there is a contact point between the line 606 and the line 609. When C.sub.i+1 is to the left of the line on the left side of the virtual view direction vector 602, out of the lines constituting the virtual view FoV, it may be determined whether or not there is a contact point between that left-side line and the right line of the C.sub.i+1 FoV. Thus, it may be determined whether or not the camera may be included. The lines 606, 607, 608, 609 and 610 constituting the FoVs may be vectors, and determining an overlap between FoVs is not limited to the above method.
(54) As one embodiment, extending the FoV at the virtual view used for the comparison of Formula (2) may be considered so as to include an image of C.sub.i+1. For example, the line 606 may move to the line 607 according to the extension of the FoV. Herein, the extension may be as large as the FoV of an input image provided by C.sub.i+1, as shown in Formula (3) below, or as large as a preset FoV with an arbitrary size, but is not limited thereto. FoV.sub.Input represents the FoV of the input image, and FoV.sub.Extended, that is, the extended FoV, may be obtained by adding FoV.sub.Input and FoV.sub.Virtual.
FoV.sub.Extended=FoV.sub.Virtual+FoV.sub.Input Formula (3)
(55) Moreover, a method of remembering only the index of the camera without extending the FoV itself may be used, but the present disclosure is not limited thereto.
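The overlap test and the extension of Formula (3) might be sketched as follows, under the simplifying assumption that the FoVs are compared as angular ranges about the virtual view direction; the names `fov_overlaps` and `extended_fov` are illustrative, not from the disclosure.

```python
import numpy as np

def fov_overlaps(cam_angle, cam_fov, view_fov):
    """Angular-overlap test (simplified): the camera's FoV overlaps the
    virtual view's FoV when the camera's off-axis angle is less than the
    sum of the two half-FoVs."""
    return cam_angle < view_fov / 2 + cam_fov / 2

def extended_fov(view_fov, cam_fov):
    """Formula (3): FoV_Extended = FoV_Virtual + FoV_Input."""
    return view_fov + cam_fov

# A camera 50 degrees off-axis fails a 60-degree virtual FoV (half-angle
# 30 degrees), but its own 45-degree FoV overlaps, so the FoV is extended.
cam_angle = np.deg2rad(50)
view_fov, cam_fov = np.deg2rad(60), np.deg2rad(45)
if cam_angle > view_fov / 2 and fov_overlaps(cam_angle, cam_fov, view_fov):
    view_fov = extended_fov(view_fov, cam_fov)
    # the camera now falls within the extended half-FoV (52.5 degrees)
```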
(57) In a method of selecting a camera providing an input image for virtual view image synthesis according to an embodiment of the present disclosure, an available camera rig is not limited to the spherical shape as described above. Accordingly, even when the camera rig is not structured equally in all directions, the method may be applied in the same way. This will be described in further detail using
(58) In this case again, any virtual sphere 703 with an extended radius may be used in the same way as described in
(60) The above-described method of selecting a camera providing an input image may be applied regardless of the structure of the camera rig used. Accordingly, it is obvious that the method expressed by
(61) As one embodiment, for a camera rig that is equally radial in all directions, when the center of the camera rig is the origin as shown in
(62) First, there may be various projection formats of a virtual view image, such as the perspective projection and the equirectangular projection (ERP). However, as projection formats may be converted to one another, the perspective projection is assumed in describing embodiments of the present disclosure.
(63) As one embodiment, an FoV at a virtual view position may be expressed as a closed rectangular figure, that is, a focal plane 906, for convenience. Herein, four vectors V.sub.1 901, V.sub.2 902, V.sub.3 903 and V.sub.4 904 passing through a virtual view position 905 (the projection center of a virtual view image) and the four vertexes of the focal plane 906 may be defined. P.sub.1, P.sub.2, P.sub.3 and P.sub.4, which are the points where the vectors meet the spherical surface on which the camera centers of the camera rig are located, may be obtained as follows.
(64) First, a spherical camera rig may be expressed as follows.
(x−c.sub.x).sup.2+(y−c.sub.y).sup.2+(z−c.sub.z).sup.2=R.sup.2 Formula (4)
(65) The formula assumes that the center of the camera rig is (c.sub.x, c.sub.y, c.sub.z). For convenience of calculation, the coordinate system may be moved so that the center of the camera rig becomes the origin (0, 0, 0), as assumed above. Also, in order to obtain P.sub.1, P.sub.2, P.sub.3 and P.sub.4, the equation of a straight line may be expressed as follows.
(66) x=x.sub.1+ad, y=y.sub.1+bd, z=z.sub.1+cd Formula (5)
(67) As the straight line represented by Formula (5) is intended to obtain the respective points P.sub.1, P.sub.2, P.sub.3 and P.sub.4 at which V.sub.1 901, V.sub.2 902, V.sub.3 903 and V.sub.4 904 meet the spherical surface, the respective lines may have direction components a, b and c and pass through a point (x.sub.1, y.sub.1, z.sub.1) respectively. As illustrated in
(ad+x.sub.1).sup.2+(bd+y.sub.1).sup.2+(cd+z.sub.1).sup.2=R.sup.2 Formula (6)
(68) Formula (6) may be rearranged as follows.
(a.sup.2+b.sup.2+c.sup.2)d.sup.2+(2a(x.sub.1−c.sub.x)+2b(y.sub.1−c.sub.y)+2c(z.sub.1−c.sub.z))d+(x.sub.1−c.sub.x).sup.2+(y.sub.1−c.sub.y).sup.2+(z.sub.1−c.sub.z).sup.2−R.sup.2=0 Formula (7)
(69) Formula (7) may be rearranged into a quadratic equation with respect to d, which may be solved using the following quadratic formula.
(70) d=(−B±√(B.sup.2−4AC))/2A, where A=a.sup.2+b.sup.2+c.sup.2, B=2a(x.sub.1−c.sub.x)+2b(y.sub.1−c.sub.y)+2c(z.sub.1−c.sub.z), and C=(x.sub.1−c.sub.x).sup.2+(y.sub.1−c.sub.y).sup.2+(z.sub.1−c.sub.z).sup.2−R.sup.2 Formula (8)
(71) Using Formula (8), d may be found. Once d is determined, the x, y and z coordinates of P.sub.1, P.sub.2, P.sub.3 and P.sub.4 may be obtained.
(72) Meanwhile, since there may be two intersection points 1002 and 1003 between the equation of straight line and a sphere when a virtual viewpoint 1000 is within the sphere as in
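Formulas (5) through (8) amount to a standard line-sphere intersection. A sketch follows, with the root choice keeping the forward intersection (the larger d), consistent with discarding the intersection behind a viewpoint inside the sphere; the function name is an assumption for illustration.

```python
import math

def line_sphere_intersection(p0, direction, center, R):
    """Intersect the line p0 + d*direction with the sphere of radius R
    centered at `center`, solving the quadratic of Formulas (7)-(8)."""
    x1, y1, z1 = p0
    a, b, c = direction
    cx, cy, cz = center
    A = a * a + b * b + c * c
    B = 2 * a * (x1 - cx) + 2 * b * (y1 - cy) + 2 * c * (z1 - cz)
    C = (x1 - cx) ** 2 + (y1 - cy) ** 2 + (z1 - cz) ** 2 - R * R
    disc = B * B - 4 * A * C
    if disc < 0:
        return None  # the line misses the sphere
    # Larger root: the intersection in front of a viewpoint that lies
    # inside the sphere
    d = (-B + math.sqrt(disc)) / (2 * A)
    return (x1 + a * d, y1 + b * d, z1 + c * d)
```

Applying this to the four vertex vectors V.sub.1 through V.sub.4 yields P.sub.1 through P.sub.4.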
(73) As one embodiment, when finding all the four intersection points P.sub.1, P.sub.2, P.sub.3 and P.sub.4 by the above method, cameras inside a closed figure constructed by the intersection points may be selected.
(74) Herein, whether or not a camera is located inside the closed figure may be determined according to
(75) Herein, as one embodiment, since the four intersection points P.sub.1, P.sub.2, P.sub.3 and P.sub.4 may also be obtained as polar coordinates about the center (here, the origin) of the camera rig, when the coordinate of a camera included in the camera rig is within the region made by the polar coordinates of the intersection points, the corresponding camera may be selected. As each camera's viewing direction coincides with the direction of the camera's coordinate, this method may simply select a camera providing an input image without obtaining a vector from the virtual view position to every camera position.
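The polar-coordinate selection above can be illustrated with a simple sketch. The axis-aligned angular box below is a simplifying assumption (it ignores azimuth wrap-around at ±π and treats the region spanned by P.sub.1 through P.sub.4 as a rectangle in angle space); the function names are hypothetical.

```python
import math

def to_polar(p):
    """Azimuth and elevation (theta, phi) of point p about the origin."""
    x, y, z = p
    r = math.sqrt(x * x + y * y + z * z)
    return math.atan2(y, x), math.asin(z / r)

def cameras_in_region(corners, cam_centers):
    """Select indices of cameras whose polar angles fall inside the
    angular box spanned by the four intersection points P1..P4."""
    thetas, phis = zip(*(to_polar(p) for p in corners))
    selected = []
    for i, c in enumerate(cam_centers):
        t, p = to_polar(c)
        if min(thetas) <= t <= max(thetas) and min(phis) <= p <= max(phis):
            selected.append(i)
    return selected
```

Because each camera's viewing direction coincides with its coordinate on an equally radial rig, no per-camera vector from the virtual view position is needed in this check.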
(76) In addition, as shown in
(77) An input image collected from a camera that is selected through the above method may be expressed as a selected image as it is and may be used to synthesize a virtual view image.
(80) In addition, a server may be any type including a centralized type and a cloud server, and a terminal may also be any electronic device like a smartphone, a desktop, a laptop, a pad, etc. as long as they are capable of communication.
(81) As an example, a camera for virtual view image synthesis may be included in separate terminals 1202, 1203 and 1204, and an input image may be collected from a camera included in the terminals.
(82) For example, when it is assumed that a server selects a suitable camera, the process of
(83) The various forms of the present disclosure are not an exhaustive list of all possible combinations and are intended to describe representative aspects of the present disclosure, and the matters described in the various forms may be applied independently or in combination of two or more.
(84) For example, according to an embodiment of the present disclosure, a computer program stored in a medium for selecting a camera providing an input image to synthesize a virtual view image may comprise: for a camera providing an input image, determining whether or not the camera is comprised in a field of view (FoV) at a virtual view position; and in response to the camera determined to be comprised, selecting the camera to synthesize the virtual view image, wherein whether or not the camera is comprised is determined by comparing a direction from the virtual view position to a position of the camera with the FoV at the virtual view position.
(85) In addition, a computer that implements the computer program stored in the medium for selecting a camera providing an input image to synthesize a virtual view image may include a mobile information terminal, a smart phone, a mobile electronic device, and a stationary type computer, to which the present disclosure is not limited.
(86) In addition, various forms of the present disclosure may be implemented by hardware, firmware, software, or a combination thereof. In the case of hardware implementation, one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), general processors, controllers, microcontrollers, microprocessors, and the like may be used for implementation.
(87) The scope of the present disclosure includes software or machine-executable instructions (for example, an operating system, applications, firmware, programs, etc.) that enable operations according to the methods of various embodiments to be performed on a device or computer, and a non-transitory computer-readable medium in which such software or instructions are stored and are executable on a device or computer. It will be apparent to those skilled in the art that various substitutions, modifications, and changes are possible without departing from the technical features of the present disclosure. It is therefore to be understood that the scope of the present disclosure is not limited to the above-described embodiments and the accompanying drawings.