Systems And Methods For Changing The Direction Of View During Video Guided Clinical Procedures Using Real-Time Image Processing
20220383588 · 2022-12-01
Assignee
Inventors
- João Pedro DE ALMEIDA BARRETO (Coimbra, PT)
- Carolina DOS SANTOS RAPOSO (Coimbra, PT)
- Michel Goncalves ALMEIDA ANTUNES (Coimbra, PT)
- Rui Jorge Melo TEIXEIRA (Tondela, PT)
Cpc classification
G06T7/80
PHYSICS
A61B1/0005
HUMAN NECESSITIES
International classification
A61B1/00
HUMAN NECESSITIES
Abstract
Arthroscopes and laparoscopes are available in several lens cuts to cover different clinical situations of interest, with the surgeon having to exchange the optics in order to change the direction of view of the camera. The presently disclosed embodiments provide real-time image processing systems and methods that enable the user to arbitrarily change the direction of view of a surgical camera, with the advantage of avoiding the disruption in workflow caused by the physical exchange of the optics. In addition, the presently disclosed embodiments describe performing zoom along an arbitrary viewing direction (directional zoom), which makes it possible to increase the scale of a region of interest without decreasing the overall field-of-view or losing image contents.
Claims
1. A method for rendering an image Î.sub.i of a virtual target camera based on a source image I.sub.i captured by a real source camera, the source camera comprising a camera-head and a rigid endoscope with lens cut β that rotates in azimuth around a mechanical axis that intersects the image in a point Q, for which the focal length f, radial distortion ζ, and principal point O at a certain azimuth α.sub.0 are known, the method comprising: finding a location of the principal point O.sub.i in the source image I.sub.i with azimuth α.sub.i by rotating O around Q by an angular displacement in azimuth δ.sub.i=α.sub.i−α.sub.0; updating a camera model c.sub.s of the source camera that maps points x in a canonical image into points u in a pixel image according to the focal length f, radial distortion ζ, and location of the principal point O.sub.i; determining a 3D location of a vertical plane Π.sub.i that contains, or passes close to, the mechanical axis and the optical axis of the source camera, by finding and back-projecting a line n.sub.i where Π.sub.i intersects the source image I.sub.i; defining a 3D motion m between the target and source cameras as a rotation by an angle γ={circumflex over (β)}−β around a direction {right arrow over (n)}.sub.i that is normal to the vertical plane Π.sub.i, such that x=m({circumflex over (x)}; γ, {right arrow over (n)}.sub.i), with {circumflex over (x)} being a point in the canonical image of the target camera; computing a focal length {circumflex over (f)} and a location of the principal point Ô.sub.i for the target camera and deriving a camera model c.sub.t that maps points {circumflex over (x)} in a canonical image of the target image into points û in a pixel image of the target image according to the focal length {circumflex over (f)} and location of the principal point Ô.sub.i; and generating the target image Î.sub.i by mapping a plurality of pixels û in Î.sub.i into a point u in the source image I.sub.i with a mapping function w that is a composition of the camera model c.sub.s of the source camera, the 3D motion m, and an inverse of a camera model c.sub.t of the target camera.
2. The method of claim 1, wherein the azimuth α.sub.0 corresponds to a first notch position P, the method further comprising: processing the source image I.sub.i to detect a boundary with a center C.sub.i and a second notch position P.sub.i; wherein rotating O around Q by an angular displacement in azimuth comprises estimating the angular displacement in azimuth according to the first notch position P, the second notch position P.sub.i, and the point Q; wherein the line n.sub.i is defined by points O.sub.i and P.sub.i.
3. The method of claim 2, further comprising processing the rendered target image Î.sub.i to create a black frame defining an image region with center Ĉ and diameter {circumflex over (d)} and to create a notch by placing a visual mark at point {circumflex over (P)}.sub.i=Ĉ+{circumflex over (d)}/2 v.sub.i in the circular boundary, with v.sub.i being the 2D unit direction of image line n.sub.i.
4. The method of claim 3, wherein the image region comprises one or more of a circular shape, a conic shape, a rectangular shape, a hexagonal shape, or another polygonal shape.
5. The method of claim 2, wherein the source image I.sub.i comprises two or more source image frames, wherein point Q is determined by detecting one or more of point P.sub.i or point C.sub.i in successive ones of the two or more frames.
6. The method of claim 2, wherein processing the source image I.sub.i to detect a boundary with a center C.sub.i and a second notch position P.sub.i comprises detecting a plurality of second notch positions P.sub.i, wherein one of the plurality of second notch positions is used for determining the angular displacement in azimuth δ.sub.i.
7. The method of claim 1, wherein the azimuth α.sub.0 corresponds to a first notch position P, the method further comprising: processing the source image I.sub.i to detect a boundary with a center C.sub.i and a second notch position P.sub.i; wherein rotating O around Q by an angular displacement in azimuth comprises estimating the angular displacement in azimuth according to the first notch position P, the second notch position P.sub.i, and the point Q; wherein line n.sub.i is defined by: points O.sub.i and Q; or points O.sub.i and C.sub.i.
8. The method of claim 1, wherein the position of the principal point is given in a normalized reference frame disposed in or attached to the endoscope.
9. The method of claim 1, wherein the source camera is equipped with a sensor that measures the rotation of the endoscope with respect to the camera-head and estimates the angular displacement in azimuth δ.sub.i.
10. The method of claim 1, wherein the principal point O is coincident with the rotation center Q.
11. The method of claim 1, wherein a distortion {circumflex over (ζ)} of the target camera is set to zero.
12. The method of claim 1, wherein generating the target image Î.sub.i is performed using an image warping or pixel value interpolation comprising one or more of interpolation by nearest neighbors, bilinear interpolation, or bicubic interpolation.
13. The method of claim 1, wherein the principal point Ô.sub.i is made coincident with a center Ĉ of a boundary of the virtual image and computing the focal length {circumflex over (f)} comprises solving Φ({circumflex over (f)},{circumflex over (ζ)},{circumflex over (Θ)}/2,{circumflex over (d)}/2)=0, with Φ being a mathematical expression that relates focal length, radial distortion, image distance, and angle between back-projection rays.
14. The method of claim 1 wherein the principal point Ô.sub.i is made coincident with a center Ĉ of a boundary of the virtual image, the method comprising: transforming an origin [0,0].sup.T of the source image by a function g that is the composition of the camera model c.sub.s and the motion m to find a location Ō.sub.i in the source image that maps to the principal point Ô.sub.i; determining a limiting viewing angle
15. The method of claim 1, further comprising: transforming an origin [0,0].sup.T of the source image by a function g that is the composition of the camera model c.sub.s and the motion m to find a location Ō.sub.i in the source image that maps to the principal point Ô.sub.i; determining a limiting viewing angle
16. The method of claim 15, wherein: the source camera is chosen such that the cut angle of the rigid endoscope is approximately
17. The method of claim 15, wherein the {circumflex over (Θ)}, the distortion {circumflex over (ζ)}, the image resolution, and the diameter {circumflex over (d)} of the target camera are set with the same respective values as the source camera's and a cut angle {circumflex over (β)} of the target camera is set by a user through user selection of an amount of angular shift γ that is added to the lens cut β of the source camera.
18. The method of claim 17, wherein the distortion {circumflex over (ζ)} is zero.
19. The method of claim 18, wherein the angular shift γ is zero.
20. The method of claim 15, wherein a cut angle {circumflex over (β)} of the target camera is set by a user through user selection of an amount of angular shift γ that is added to the lens cut β of the source camera, wherein the angular shift γ is chosen by the user such that the principal point is placed in a region of interest in the target image and the radial distortion parameter {circumflex over (ζ)} is increased relative to the radial distortion ζ of the source camera to magnify the region of interest.
21. The method of claim 1, wherein a field of view {circumflex over (Θ)} of the virtual target camera and the distortion {circumflex over (ζ)} of the virtual target camera take the same values as the source camera's, and a lens cut {circumflex over (β)} of the virtual target camera is set by a user.
22. The method of claim 1 wherein a field of view {circumflex over (Θ)} of the virtual target camera takes the same value as the source camera's, and the lens cut {circumflex over (β)} of the virtual target camera and the distortion {circumflex over (ζ)} of the virtual target camera are set by a user.
23. The method of claim 22, wherein the lens cut {circumflex over (β)} and the distortion {circumflex over (ζ)} are set by the user to adapt the direction of view and produce a zoom effect that does not change the field of view.
24. The method of claim 1 wherein the field of view {circumflex over (Θ)} and the lens cut {circumflex over (β)} of the virtual target camera take the same values as the source camera's, and the distortion {circumflex over (ζ)} of the virtual target camera is set by a user.
25. The method of claim 1 wherein field of view {circumflex over (Θ)} of the virtual target camera takes the same value as the source camera's, and a lens cut {circumflex over (β)} and the focal length {circumflex over (f)} of the virtual target camera are set by the user.
26. The method of claim 25, wherein the principal point Ô.sub.i is made coincident with a center Ĉ of a boundary of the virtual image and computing the distortion {circumflex over (ζ)} comprises solving Φ({circumflex over (f)},{circumflex over (ζ)},{circumflex over (Θ)}/2, {circumflex over (d)}/2)=0, with Φ being a mathematical expression that relates focal length, radial distortion, image distance, and angle between back-projection rays.
27. The method of claim 25, wherein the principal point Ô.sub.i is made coincident with a center Ĉ of a boundary of the virtual image, the method comprising: transforming an origin [0,0].sup.T of the source image by a function g that is the composition of the camera model c.sub.s and the motion m to find a location Ō.sub.i in the source image that maps to the principal point Ô.sub.i; determining a limiting viewing angle
28. The method of claim 25, further comprising: transforming an origin [0,0].sup.T of the source image by a function g that is the composition of the camera model c.sub.s and the motion m to find a location Ō.sub.i in the source image that maps to the principal point Ô.sub.i; determining a limiting viewing angle
29. A system for rendering an image Î.sub.i of a virtual target camera based on a source image I.sub.i captured by a real source camera, the source camera comprising a camera-head and a rigid endoscope with lens cut β that rotates in azimuth around a mechanical axis that intersects the image in a point Q, for which the focal length f, radial distortion ζ, and principal point O at a certain azimuth α.sub.0 are known, the system comprising: a non-transitory computer-readable medium storing instructions; and a processor configured to execute the instructions to perform a method comprising: finding a location of the principal point O.sub.i in the source image I.sub.i with azimuth α.sub.i by rotating O around Q by an angular displacement in azimuth δ.sub.i=α.sub.i−α.sub.0; updating a camera model c.sub.s of the source camera that maps points x in a canonical image into points u in a pixel image according to the focal length f, radial distortion ζ, and location of the principal point O.sub.i; determining a 3D location of a vertical plane Π.sub.i that contains, or passes close to, the mechanical axis and the optical axis of the source camera, by finding and back-projecting a line n.sub.i where Π.sub.i intersects the source image I.sub.i; defining a 3D motion m between the target and source cameras as a rotation by an angle γ={circumflex over (β)}−β around a direction {right arrow over (n)}.sub.i that is normal to the vertical plane Π.sub.i, such that x=m({circumflex over (x)};γ,{right arrow over (n)}.sub.i), with {circumflex over (x)} being a point in the canonical image of the target camera; computing a focal length {circumflex over (f)} and a location of the principal point Ô.sub.i for the target camera and deriving a camera model that maps points {circumflex over (x)} into points û in the pixel image according to the focal length {circumflex over (f)} and location of the principal point Ô.sub.i; and generating the target image Î.sub.i by mapping a plurality of pixels û in Î.sub.i into a point u in the source image I.sub.i with a mapping function w that is a composition of the camera model c.sub.s of the source camera, the 3D motion m, and an inverse of a camera model c.sub.t of the target camera, wherein the camera model c.sub.t of the target camera maps points in a canonical image of the target camera into points in a pixel image of the target camera.
30. A method for rendering a virtual image Î.sub.i of a virtual target camera, the method comprising: obtaining a source image I.sub.i captured by a real source camera, the source camera comprising a medical scope; finding a location of a principal point O.sub.i in the source image I.sub.i with azimuth α.sub.i, wherein the real source camera is associated with a camera model c.sub.s that maps points x in a canonical image into points u in a pixel image according to the location of the principal point O.sub.i; determining a 3D location of a vertical plane Π.sub.i proximate a mechanical axis and an optical axis of the source camera, by finding and back-projecting a line n.sub.i where Π.sub.i intersects the source image I.sub.i; defining a 3D motion m between the target and source cameras as a rotation around a direction {right arrow over (n)}.sub.i that is normal to the vertical plane Π.sub.i; deriving a camera model c.sub.t that maps points {circumflex over (x)} in a canonical image of the virtual image Î.sub.i into points û in a pixel image of the virtual image Î.sub.i; and generating the target image Î.sub.i by mapping a plurality of pixels û in Î.sub.i into a point u in the source image I.sub.i with a mapping function w that is a composition of the camera model c.sub.s of the source camera, the 3D motion m, and an inverse of a camera model c.sub.t of the target camera.
Description
BRIEF DESCRIPTION OF THE DRAWING
[0021] For a more complete understanding of the present disclosure, reference is made to the following detailed description of exemplary embodiments considered in conjunction with the accompanying drawings.
[0022]
[0023]
[0024]
[0025]
[0026]
[0027]
[0028]
[0029]
[0030]
[0031]
[0032]
[0033]
[0034]
[0035]
[0036]
[0037]
DETAILED DESCRIPTION
[0038] It should be understood that, although an illustrative implementation of one or more embodiments is provided below, the various specific embodiments may be implemented using any number of techniques known by persons of ordinary skill in the art. The disclosure should in no way be limited to the illustrative embodiments, drawings, and/or techniques illustrated below, including the exemplary designs and implementations illustrated and described herein. Furthermore, the disclosure may be modified within the scope of the appended claims along with their full scope of equivalents.
[0039] Systems and methods for changing the direction of view during video guided clinical procedures are disclosed herein. The systems and methods may be used for clinical procedures including, but not limited to, arthroscopy, laparoscopy, endoscopy or other surgical procedures including minimally invasive orthopedic surgery procedures. The systems and methods can be used with real-time image processing or delayed image processing.
[0041] The rotation in azimuth 18 around the mechanical axis 12 of the endoscope 10 causes the optical axis 36 to describe a cone in space 32 (the cone of DoV) whose half-angle is the lens cut 20, as illustrated in
[0042] In this disclosure, 2D and 3D vectors are written in bold lower and upper case letters, respectively. Functions are represented by lower case italic letters, and angles by lower case Greek letters. Points and other geometric entities in the plane are represented in homogeneous coordinates, as is commonly done in projective geometry, with 2D linear transformations in the plane being represented by 3×3 matrices and equality being up to scale. In addition, when representing functions, the semicolon ';' is used to distinguish between variables (which appear to the left of the semicolon) and parameters (which appear to the right of the semicolon). Finally, different sections of the text are referenced by their paragraph numbers using the symbol §.
Image Warping
[0043] The disclosed methods and systems for the rendering of virtual views with an arbitrary shift in the inclination of the viewing direction relate to image warping techniques, in particular to software-based methods that create a virtual Pan-Tilt-Zoom (PTZ) camera from a wide Field of View (FoV) panoramic camera. In this case, the image that would be acquired by the PTZ camera (the target image) is rendered from the image acquired by the panoramic camera (the source image) through a function that maps pixels in one image into pixels in the other.
[0044] Without loss of generality, let w be the function that transforms pixel coordinates u.sub.t in the target image into pixel coordinates u.sub.s in the source image, as illustrated in
[0045] The existing warping techniques include, but are not limited to, direct mapping, inverse mapping, warping by re-sampling in the continuous or discrete image domain, warping by re-sampling and filtering, warping using a look-up table, warping using decomposable transformations and learned warping transformations.
[0046] The warping function w is the composition of functions c.sub.s and c.sub.t, corresponding to the camera models of the source and target cameras, respectively, with function m, which is the camera motion. The camera models c.sub.s and c.sub.t describe the mapping between the canonical image 22 in millimeters and the image in pixels 30. Since the source camera is a real camera, c.sub.s can be determined using an appropriate calibration method. On the other hand, the target camera model c.sub.t is chosen so that the desired imaging features (resolution, zoom, FoV, etc.) are predefined. Concerning function m, it describes the relative motion between virtual (target) and real (source) cameras. In more detail, it represents the rotation undergone by the virtual camera in 3D space that causes a homography mapping in projective coordinates between the canonical images of the source and target cameras.
[0047] Warping images acquired with endoscopic cameras is significantly more challenging than doing so with images acquired with conventional cameras for two main reasons. Firstly, the camera model c.sub.s changes at every frame time instant due to the relative rotation of the endoscopic lens with respect to the camera head, and this must be taken into account in building the warping function w. Secondly, the motion model m depends not only on the desired change in elevation γ but also on the mechanical change in azimuth δ that must be measured at every frame time instant. These challenges do not exist for conventional cameras because they do not present moving parts that interfere with the camera model.
Endoscopic Camera Model
[0048] This section introduces the endoscopic camera model c by describing the model of a general camera presenting radial distortion introduced by the optics, providing an overview of relevant concepts and explaining how the endoscopic camera can be described with an adaptive model that is updated at every frame time instant.
General Camera Model
[0050] In this model, the distortion function d is generic, meaning that it can be any distortion model in the literature such as Brown's polynomial model, the rational model, the fish-eye model, or the division model with one or more parameters, in which case ζ is a scalar or a vector, respectively. The distortion function d can also take into account other types of distortion such as, but not limited to, tangential distortion, prism distortion or perspective distortion caused by tilt of the sensor. Without loss of generality, it is assumed in the remainder of the description that d applies the first order division model, with ζ being a scalar.
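As an illustrative sketch (not reproduced from the disclosure), the first-order division model relates the distorted radial distance r.sub.d and the undistorted (pinhole) radial distance r.sub.u in the canonical image through the single scalar ζ:

```latex
% First-order division model (illustrative sketch):
% r_d is the distorted radius and r_u the undistorted radius in the canonical image.
\[
  r_u \;=\; \frac{r_d}{1 + \zeta\, r_d^{2}}
\]
```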
[0051] Moreover, it is known that a distance r in pixels measured in the image in pixels 30 between a point u and the principal point O (r=∥u−O∥) corresponds to an angle θ in 3D that depends on the joint effect of f and ζ, such that
Φ(f,ζ,θ,r)=0. (1)
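For concreteness, one possible instantiation of Φ, assuming the first-order division model sketched above and that the pixel image is obtained from the distorted canonical image by scaling with f and translating by O, is the following (an assumption for illustration, not necessarily the exact expression used in the disclosure):

```latex
\[
  \Phi(f,\zeta,\theta,r) \;=\; \tan\theta \;-\; \frac{r/f}{\,1 + \zeta\,(r/f)^{2}\,} \;=\; 0
\]
```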
Handling Lens Rotation
[0052] The rotation of the endoscope 10 with respect to the camera head 28 causes the principal point O, the center C of the circular boundary 24 and the notch 26, denoted as P, to rotate in the image in pixels 30 around a point Q, as illustrated in
[0053] Thus, if the calibration parameters f, O, ζ refer to a particular azimuth position α.sub.0, then a rotation in azimuth by an angle δ.sub.i=α.sub.i−α.sub.0 changes the position of the principal point to O.sub.i=r(O; δ.sub.i, Q), where r denotes a 2D rotation by an angle δ.sub.i around an axis going through Q. Similarly, the center of the circular boundary becomes C.sub.i=r(C; δ.sub.i, Q).
[0054] In other words, let f, O and ζ be the calibration parameters of a generic endoscopic camera at a particular azimuth position α.sub.0. It follows that the camera model that maps points x in the canonical image into points u in pixel coordinates can change at every frame time instant i, being given by u=c(x; f, O.sub.i, ζ) with O.sub.i=r(O; δ.sub.i, Q) and δ.sub.i=α.sub.i−α.sub.0, where α.sub.i is the angle in azimuth in frame i.
[0055] In the remainder of this description, it is assumed, without loss of generality, that both the calibration parameters f, O, ζ for a reference azimuth position α.sub.0 and the rotation center Q are known a priori, such that if the rotation δ.sub.i with respect to α.sub.0 is estimated at each frame time instant i, then the calibration parameters f, O.sub.i, ζ can be determined as previously described. It is relevant to note that if O is coincident with Q, then O.sub.i in every frame i will also be coincident with Q and thus this adaptation of the camera model c to the rotation of the scope is unnecessary. However, the misalignment between these entities occurs frequently due to mechanical tolerances in building the optics, as discussed previously. Since the effect of this misalignment on the calibration parameters is not negligible, it must be taken into account in practice.
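A minimal sketch of this per-frame update of the principal point and boundary center, assuming pixel coordinates and an angular displacement δ.sub.i in radians (function and variable names are illustrative, not from the disclosure):

```python
import numpy as np

def rotate_point_2d(point, angle_rad, pivot):
    """2D rotation r(point; angle, pivot): rotate `point` around `pivot` by `angle_rad`."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    R = np.array([[c, -s], [s, c]])
    return np.asarray(pivot, float) + R @ (np.asarray(point, float) - np.asarray(pivot, float))

def update_calibration_for_frame(O_ref, C_ref, Q, delta_i):
    """O_i = r(O; delta_i, Q) and C_i = r(C; delta_i, Q); f and zeta are unchanged."""
    return rotate_point_2d(O_ref, delta_i, Q), rotate_point_2d(C_ref, delta_i, Q)
```

When O coincides with Q the update is a no-op, which matches the observation above that the adaptation of the camera model then becomes unnecessary.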
Camera Rotation Motion in 3D
[0056] The method to determine the motion model m introduced in §§ [0043]-[0048] is now described, with the accompanying
[0057] As illustrated in
[0058] In ideal conditions, plane Π.sub.i 40 contains both the optical and the mechanical axes and intersects the FSM in the notch, whose purpose is to inform the surgeon about the direction of the lens cut. These conditions hold if and only if points Q, O.sub.i and P.sub.i in the image in pixels 30 are collinear and, in this case, line n.sub.i 38 is the line that contains all three points Q, O.sub.i, and P.sub.i. In real conditions, this does not usually happen due to the mechanical tolerances in lens manufacturing, leading to a non-coplanarity of the optical and mechanical axes and/or plane Π.sub.i 40 not going exactly through the notch of the FSM. Thus, in the remainder of this description, it is assumed, without loss of generality, that n.sub.i 38 is the line defined by the principal point O.sub.i and the notch P.sub.i. Thus, in projective coordinates, line n.sub.i 38 is computed by n.sub.i=P.sub.i×O.sub.i and the normal to plane Π.sub.i 40 can afterwards be determined in a straightforward manner by {right arrow over (n)}.sub.i=K.sub.i.sup.Tn.sub.i with K.sub.i being the matrix of intrinsic parameters in frame i.
[0059] Alternatively, since in ideal conditions points Q, O.sub.i and P.sub.i in the image in pixels 30 are collinear, as is the boundary center C.sub.i, line n.sub.i 38 can be obtained by determining the line that goes through points Q and O.sub.i or C.sub.i and O.sub.i, in which case points Q and C.sub.i replace P.sub.i in the previous equation, respectively.
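A sketch of this computation in homogeneous coordinates, assuming square pixels and zero skew so that the intrinsic matrix K.sub.i is built only from f and O.sub.i (an assumption for illustration; names are not from the disclosure):

```python
import numpy as np

def intrinsic_matrix(f, O_i):
    """K_i for frame i, assuming square pixels and zero skew."""
    return np.array([[f, 0.0, O_i[0]],
                     [0.0, f, O_i[1]],
                     [0.0, 0.0, 1.0]])

def reference_plane_normal(O_i, P_i, f):
    """n_i = P_i x O_i (image line through notch and principal point); plane normal is K_i^T n_i."""
    P_h = np.array([P_i[0], P_i[1], 1.0])
    O_h = np.array([O_i[0], O_i[1], 1.0])
    n_i = np.cross(P_h, O_h)                       # image line in projective coordinates
    normal = intrinsic_matrix(f, O_i).T @ n_i      # back-projection: plane normal, up to scale
    return n_i, normal / np.linalg.norm(normal)
```

Replacing P_i by Q or C_i in the cross product yields the alternative definitions of n.sub.i mentioned above.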
Further Considerations
[0060] In this disclosure, it is assumed, without loss of generality, that all measurements and computations are performed in dry environment. Adaptation to wet environment can be performed in a straightforward manner by multiplying the focal length by the ratio of the refractive indices of the two or more mediums where light travels before reaching the imaging sensor.
[0061] Another important consideration to make is that both the update of the camera model c and the detection of line n.sub.i 38 in the image in pixels 30 in every frame i can be performed by determining the angular displacement in azimuth δ.sub.i with respect to a reference angular position α.sub.0, which may be accomplished using exclusively image processing techniques. In this disclosure it is considered, without loss of generality, that the method disclosed in U.S. Application No. 62/911,950 for detecting the boundary with center C.sub.i and notch P.sub.i in every frame i is employed for this task. This method contemplates the possibility that the FSM contains more than one notch, which is useful for guaranteeing that at least one notch is always visible in the image, being able to detect multiple notches whose relative location is known and that are identified by their different shapes and sizes. Thus, it follows that the angular displacement δ.sub.i is the angle defined by points P.sub.i, Q and P (δ.sub.i=≮PQP.sub.i), with P being the position of one of the notches at the reference angular position α.sub.0. Alternatively, δ.sub.i can be estimated from the boundary centers C.sub.i and C, with C being the center of the boundary in the reference position: δ.sub.i=≮CQC.sub.i.
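A minimal sketch of the angle estimation from detected notch positions, assuming P, P.sub.i and Q are given in pixel coordinates (names are illustrative):

```python
import numpy as np

def azimuth_displacement(P_ref, P_i, Q):
    """Signed angle delta_i between rays Q->P (reference notch) and Q->P_i (current notch), in radians."""
    a = np.asarray(P_ref, float) - np.asarray(Q, float)
    b = np.asarray(P_i, float) - np.asarray(Q, float)
    return np.arctan2(a[0] * b[1] - a[1] * b[0], float(a @ b))
```

The same function applied to the boundary centers C and C.sub.i gives the alternative estimate δ.sub.i=≮CQC.sub.i.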
[0062] As discussed in § [0054], a distance r measured in the image in pixels 30 between any point and the principal point O corresponds to an angle θ in 3D that depends on both f and ζ and that can be computed using Equation 1. Let d be the length of the line segment whose endpoints are the two intersections of the boundary with the line n 38 in
Method for Changing the Direction of View (DoV)
[0063] This section presents the method disclosed in this disclosure for changing the DoV by rendering the video that would be acquired by a virtual camera with predefined characteristics located in the same 3D position as a real endoscopic camera. In particular, let I.sub.i be frame i acquired by the real endoscopic camera, henceforth referred to as the source camera, whose endoscope has a lens cut β and rotates in azimuth around a mechanical axis that intersects the image plane in the rotation center Q. The endoscopic camera's calibration for a reference angular position α.sub.0, corresponding to a particular notch position P, is known, meaning that its focal length f, radial distortion ζ and principal point O have been determined. The purpose of this method is to render an image Î.sub.i with resolution m×n and a circular boundary with diameter {circumflex over (d)} centered in point Ĉ that would be acquired by a virtual camera, henceforth referred to as the target camera, with a lens cut {circumflex over (β)}, Field-of-View {circumflex over (Θ)}, and distortion {circumflex over (ζ)} placed in the same 3D location as the source camera.
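For reference only, one possible way of grouping the quantities introduced in this paragraph (a sketch; field names are illustrative and not from the disclosure):

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class SourceCamera:
    """Calibration of the real endoscopic camera at the reference azimuth alpha_0."""
    f: float                    # focal length
    zeta: float                 # radial distortion
    O: Tuple[float, float]      # principal point at alpha_0
    Q: Tuple[float, float]      # image point where the mechanical axis intersects the image
    beta: float                 # lens cut of the endoscope

@dataclass
class TargetCamera:
    """Specification of the virtual camera whose image is to be rendered."""
    beta_hat: float             # desired lens cut
    theta_hat: float            # desired field of view
    zeta_hat: float             # desired radial distortion
    resolution: Tuple[int, int] # m x n
    d_hat: float                # diameter of the circular boundary
    C_hat: Tuple[float, float]  # boundary center
```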
[0064] As shown in
[0065] Line n.sub.i, which is the projection of the reference plane Π.sub.i introduced in §§ [0060]-[0064] onto the image plane, is determined as the line that goes through points O.sub.i and P.sub.i. As previously described, by back-projecting n.sub.i into the 3D space, plane Π.sub.i with normal {right arrow over (n)}.sub.i can be obtained, yielding motion model m, which is a rotation in 3D space around axis {right arrow over (n)}.sub.i by an angle γ that corresponds to the difference between the lens cuts of the target and source cameras, γ={circumflex over (β)}−β. Motion model m transforms points {circumflex over (x)} into points x in the canonical images of the target and source cameras, respectively, such that x=m({circumflex over (x)}; γ,{right arrow over (n)}.sub.i).
[0066] In order to derive the camera model for the target camera c.sub.t that maps points {circumflex over (x)} in its canonical image into points û in the image in pixels (û=c.sub.t ({circumflex over (x)}; {circumflex over (f)}, Ô.sub.i, {circumflex over (ζ)})), the focal length {circumflex over (f)} and the location of the principal point Ô.sub.i in every frame i must be computed. Details on the estimation of these parameters are given in §§ [0079]-[0091].
[0067] As a final step, the algorithm generates the target image Î.sub.i using image warping techniques, as described in §§ [0043]-[0048], where each pixel û in Î.sub.i is mapped into a point u in the source image I.sub.i by the mapping function w, which is the composition of functions c.sub.s, m and c.sub.t.sup.−1 (w=c.sub.s∘m∘c.sub.t.sup.−1), such that the color value of u can be interpolated. As previously described, this mapping function w may implement any method for image warping or pixel value interpolation.
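A minimal sketch of this rendering step, assuming the first-order division model and the helper conventions sketched earlier; the sign convention for γ, the use of OpenCV's remap as the bilinear interpolation backend, and all names are illustrative assumptions rather than the disclosed implementation:

```python
import numpy as np
import cv2  # used here only as one possible interpolation backend

def rodrigues(axis, angle):
    """Rotation matrix for a rotation of `angle` radians around the unit vector `axis`."""
    k = np.asarray(axis, float)
    k = k / np.linalg.norm(k)
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def undistort(xd, zeta):
    """First-order division model: distorted canonical points -> undistorted points."""
    return xd / (1.0 + zeta * np.sum(xd ** 2, axis=-1, keepdims=True))

def distort(xu, zeta):
    """Closed-form inverse of `undistort` for the first-order division model."""
    if zeta == 0.0:
        return xu
    ru = np.maximum(np.linalg.norm(xu, axis=-1, keepdims=True), 1e-12)
    rd = (1.0 - np.sqrt(np.maximum(1.0 - 4.0 * zeta * ru ** 2, 0.0))) / (2.0 * zeta * ru)
    return xu * rd / ru

def render_target(I_src, f, O_i, zeta, f_hat, O_hat, zeta_hat, gamma, normal, out_shape):
    """For every target pixel apply w = c_s o m o c_t^{-1} and sample the source image bilinearly."""
    h, w = out_shape
    uu, vv = np.meshgrid(np.arange(w), np.arange(h))
    u_hat = np.stack([uu, vv], axis=-1).reshape(-1, 2).astype(float)
    x_hat = undistort((u_hat - np.asarray(O_hat, float)) / f_hat, zeta_hat)            # c_t^{-1}
    rays = np.hstack([x_hat, np.ones((len(x_hat), 1))]) @ rodrigues(normal, gamma).T   # m (rotation homography)
    x = rays[:, :2] / rays[:, 2:3]
    u = f * distort(x, zeta) + np.asarray(O_i, float)                                  # c_s
    maps = u.reshape(h, w, 2).astype(np.float32)
    return cv2.remap(I_src, maps[..., 0], maps[..., 1], cv2.INTER_LINEAR)
```

Target pixels that map beyond the boundary of the source image fall in the empty region discussed in §§ [0077]-[0078] and are filled with black by the constant-border behavior of the interpolation backend.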
[0068] The disclosed method for changing the DoV assumes that the calibration parameters of the source camera for the reference position in azimuth α.sub.0 are known in advance. These can be determined in several different ways, which include, but are not limited to, using a set of calibration parameters predetermined in factory or representative of a set of similar endoscopic cameras and using an appropriate calibration method for performing calibration in the Operating Room before the medical procedure, such as the one disclosed in U.S. Pat. No. 9,438,897 (application no. 14/234,907) entitled “Method and apparatus for automatic camera calibration using one or more images of a checkerboard pattern”.
[0069] In addition, the presently disclosed method also assumes that the rotation center Q is known a priori. However, this is not a strict requirement since Q may be determined on-the-fly from points P.sub.i and/or C.sub.i detected in successive frames, using the method disclosed in U.S. Application No. 62/911,950. Another alternative is to determine the position of the principal point O.sub.i at every frame time instant i by making use of an estimate for the principal point given in a normalized reference frame that is attached to the circular boundary. In this case, since O.sub.i is obtained directly from the normalized estimate of the principal point, as disclosed in U.S. Application No. 62/911,950, it is not required to know the rotation center Q or the angular displacement in azimuth δ.sub.i a priori.
[0070] The estimation of the angular displacement in azimuth δ.sub.i, depicted in the second block of the diagram in
[0071] The method for changing the DoV disclosed herein may be used for different purposes by setting the parameters of the target camera to desired values. In particular, if the distortion of the target camera {circumflex over (ζ)} is set to zero, the images Î.sub.i rendered by the target camera will be distortion-free images. Also, in order to perfectly mimic a real endoscopic camera comprising an FSM, the rendered target images Î.sub.i may be processed to create a black frame such that meaningful image contents are within a circular region with center Ĉ and diameter {circumflex over (d)} and to create a notch by placing a visual mark in point {circumflex over (P)}.sub.i=Ĉ+{circumflex over (d)}/2 v.sub.i in the circular boundary, with v.sub.i being the unit direction of line n.sub.i (v.sub.i=(P.sub.i−O.sub.i)/∥P.sub.i−O.sub.i∥) computed in block 4 of the diagram in
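A sketch of this post-processing step, assuming an 8-bit color target image and a simple filled dot as the visual mark (the mark's shape and size, and all names, are illustrative):

```python
import numpy as np
import cv2

def apply_fsm_mask(img_bgr, C_hat, d_hat, v_i, mark_radius=6):
    """Black out pixels outside the circular region centered at C_hat with diameter d_hat,
    and place a visual mark (notch) at P_hat = C_hat + (d_hat / 2) * v_i on the boundary."""
    h, w = img_bgr.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    inside = (xx - C_hat[0]) ** 2 + (yy - C_hat[1]) ** 2 <= (d_hat / 2.0) ** 2
    out = np.where(inside[..., None], img_bgr, 0).astype(img_bgr.dtype)
    P_hat = (int(round(C_hat[0] + d_hat / 2.0 * v_i[0])),
             int(round(C_hat[1] + d_hat / 2.0 * v_i[1])))
    cv2.circle(out, P_hat, mark_radius, (0, 0, 0), thickness=-1)  # illustrative mark shape
    return out
```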
Adjusting the Target Camera Model
[0072] The disclosed warping function w is the composition of three functions: the camera model c.sub.s, for which the parameters are the calibration parameters of the real source camera at the current frame i, the motion model m that depends on the desired change in inclination γ and orientation {right arrow over (n)}.sub.i of the reference plane with respect to the camera-head, and the camera model c.sub.t for the virtual target camera that must be such that the FoV is {circumflex over (Θ)} for an image distortion of {circumflex over (ζ)} and image diameter {circumflex over (d)}.
[0073] While the two first functions are fully determined, it remains to define the focal length {circumflex over (f)} and principal point Ô.sub.i of the target camera to fully specify c.sub.t.
[0074] Since both functions c.sub.s and m are known, it is possible to anticipate the location Ō.sub.i where the principal point of the target image will be mapped by the warp function w in the source image (
[0075] The choice of parameters {circumflex over (f)} and Ô.sub.i for the target camera c.sub.t can be performed according to three different settings, for which an illustration is given in
[0076] Setting A: One possibility is to choose Ô.sub.i coincident with the center Ĉ of the boundary. Since the virtual center of rotation {circumflex over (Q)} is also assumed to be coincident with Ĉ then the principal point is stationary across frames i, independently of the rotation of the lens with respect to camera-head. In this case, and considering the interdependence between focal length, distortion, image diameter and FoV expressed by Φ, the focal length {circumflex over (f)} that grants the specified FoV of {circumflex over (Θ)} is the solution of Φ({circumflex over (f)},{circumflex over (ζ)},{circumflex over (Θ)}/2, {circumflex over (d)}/2)=0.
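A sketch of this computation, reusing the illustrative instantiation of Φ given after Equation 1 (first-order division model assumed) and a simple bisection; the root is assumed to be bracketed by the search interval and {circumflex over (Θ)}/2 is in radians:

```python
import numpy as np

def phi(f, zeta, theta, r):
    """Illustrative instantiation of Phi (first-order division model assumed, see Eq. 1)."""
    rd = r / f
    return np.tan(theta) - rd / (1.0 + zeta * rd ** 2)

def focal_for_fov(theta_hat, zeta_hat, d_hat, f_lo=1.0, f_hi=1e5, iters=60):
    """Setting A: solve Phi(f_hat, zeta_hat, theta_hat/2, d_hat/2) = 0 for f_hat by bisection."""
    args = (zeta_hat, theta_hat / 2.0, d_hat / 2.0)
    lo, hi = f_lo, f_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if phi(lo, *args) * phi(mid, *args) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

The same root-finding scheme can be reused when {circumflex over (f)} is fixed and the distortion {circumflex over (ζ)} is the unknown, as in the embodiments discussed later.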
[0077] The problem with this choice of {circumflex over (f)} and Ô.sub.i is that the rendered target image often has a region without visual content (the empty region) near the intersection of n.sub.i with the boundary (
[0078] The empty region arises whenever the selected warping function w maps the target image beyond the boundary of the source image where there are no visual contents (
[0079] Setting B: A possible solution to avoid the empty region artifact is to relax the FoV requirement and make it twice the limiting viewing angle
[0080] Setting C: This disclosure discloses an alternative method to choose the focal length and principal point of the target image that conciliates the change in inclination of the DoV by an angle γ with the rendering of an image with the desired FoV {circumflex over (Θ)} and no empty regions (
[0081] For illustrative purposes let the shift in the DoV be a positive angular offset γ>0 (
[0082] Note that if γ<0, the shift in the FoV is from up to down directions and the principal point translates upwards, i.e., Ô.sub.i=Ĉ−λv.sub.i.
[0083] If Θ≥{circumflex over (Θ)}, with Θ being the FoV of the source camera, then the disclosed method for selecting {circumflex over (f)} and Ô.sub.i always succeeds in rendering a target image with the desired FoV and no empty region. Conciliating these two features is important to provide a good user experience, but there are situations in which this is accomplished by significantly deviating the principal point towards the periphery of the target image. This might be undesirable for certain applications, especially the ones where the target camera is intended to mimic a particular real endoscopic camera, in which case the principal point is typically close to the image center.
Applications and Functionalities
[0084] The disclosed image processing methods for changing the DoV of endoscopic systems with exchangeable, rotatable optics can lead to multiple applications and be employed in several different systems. The disclosure describes some embodiments of these systems and applications, without prejudice to other possible applications.
Electronic Switch Between Rigid Endoscopes with Different Lens Cut β
[0085] Manufacturers of rigid endoscopes provide lenses with different angular offsets from the mechanical axis with the most common lens cuts being β=0°,30°,45°,70°. Although surgical procedures usually favor endoscopes with a particular cut angle, there are moments or steps in the surgery where a different lens cut would be more convenient. For example, in knee arthroscopy the preference goes for arthroscopes with 30° lens cut, but the meniscus inspection in anterior or intercondylar regions benefits from a cut angle of 0° or 70°.
[0086] The problem is that, since a lateral change in DoV requires physically switching the endoscope, which causes disruption and can involve risks for the patient, surgeons rarely do it in practice and perform the procedure with the same endoscope, even when the visualization is sub-optimal.
[0087] The disclosed method can be used in a system to process the images and video acquired by an endoscopy camera equipped with a lens with a wide FoV to empower the surgeon with an electronic switch between two or more virtual endoscopes with different lens cuts β. Such a system overcomes the above-mentioned difficulty, which precludes the surgeon from having the best possible visualization at every surgical moment or step.
[0088] Let the two desired virtual endoscopes have lens cuts {circumflex over (β)}.sub.1 and {circumflex over (β)}.sub.2 and FoVs {circumflex over (Θ)}.sub.1 and {circumflex over (Θ)}.sub.2, respectively. Although the disclosed method for selecting the focal length and principal point (
[0089] An embodiment is to use the disclosed methods in a system to electronically switch between the two most common cut angles used in arthroscopy. Since a standard 30° arthroscope has {circumflex over (β)}.sub.1=30°, {circumflex over (Θ)}.sub.1=110° and a 70° arthroscope has {circumflex over (β)}.sub.2=70°, {circumflex over (Θ)}.sub.2=90°, then it stems from the calculations above that the ideal source camera should have a cut angle β around 45° and FoV Θ of approximately 140°. These values are merely indicative and serve to assure the rendering of realistic target images with the principal point in the center region. Let f, ζ, O.sub.i be the exact calibration of the source camera, and let {circumflex over (d)}.sub.1, {circumflex over (d)}.sub.2 and {circumflex over (ζ)}.sub.1, {circumflex over (ζ)}.sub.2 be, respectively, the image diameters and distortions that complement the specifications for the two target cameras, and that can be either arbitrary or obtained from the calibration of real lenses. The method of
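The indicative 45°/140° figures can be recovered with a simple balance argument: choose the source cut angle so that the FoV needed to render each target without empty regions (its own FoV plus twice the magnitude of the corresponding shift in DoV) is the same for both targets. The sketch below encodes that assumption; it is consistent with the numbers in this paragraph but is not necessarily the exact expression used by the disclosure.

```python
def ideal_source_camera(beta1, theta1, beta2, theta2):
    """Balance theta1 + 2*(beta - beta1) = theta2 + 2*(beta2 - beta), assuming beta1 <= beta <= beta2.
    Returns (source cut angle, source FoV), both in degrees."""
    beta = (beta1 + beta2) / 2.0 + (theta2 - theta1) / 4.0
    theta = theta1 + 2.0 * (beta - beta1)
    return beta, theta

# 30-degree arthroscope (FoV 110) and 70-degree arthroscope (FoV 90):
print(ideal_source_camera(30.0, 110.0, 70.0, 90.0))  # -> (45.0, 140.0)
```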
Electronic Change of DoV, Distortion Correction and Directional Zoom
[0090] So far the method of
[0091] Another possible embodiment or application is to use the disclosed method to shift the DoV of a particular endoscopic camera while maintaining the overall FoV, in which case the principal point in the rendered images moves towards the periphery as the shift in DoV increases (
[0092] The system that is described can be further enhanced with the possibility of user controlled distortion, in which case the setting {circumflex over (ζ)}=0, γ=0 will cause the system to correct the radial distortion in the source video.
[0093] The disclosed system can also be used to implement what is referred to as directional zoom, in which case the angular shift γ is adjusted such that the principal point in the rendered image becomes overlaid with a region of interest (ROI) and the distortion {circumflex over (ζ)} is increased to magnify the ROI while maintaining the FoV and all visual contents in the image (
Other Applications
[0094] The disclosed methods of the schemes of
[0095] The disclosed methods can also be applied to other types of imagery such as fundus images in ophthalmology.
[0096] The method that is disclosed for selecting the focal length and position of principal point in the target image (
[0098] In its most basic configuration, computing system environment 1200 typically includes at least one processing unit 1202 and at least one memory 1204, which may be linked via a bus 1206. Depending on the exact configuration and type of computing system environment, memory 1204 may be volatile (such as RAM 1210), non-volatile (such as ROM 1208, flash memory, etc.) or some combination of the two. Computing system environment 1200 may have additional features and/or functionality. For example, computing system environment 1200 may also include additional storage (removable and/or non-removable) including, but not limited to, magnetic or optical disks, tape drives and/or flash drives. Such additional memory devices may be made accessible to the computing system environment 1200 by means of, for example, a hard disk drive interface 1212, a magnetic disk drive interface 1214, and/or an optical disk drive interface 1216. As will be understood, these devices, which would be linked to the system bus 1206, respectively, allow for reading from and writing to a hard disk 1218, reading from or writing to a removable magnetic disk 1220, and/or for reading from or writing to a removable optical disk 1222, such as a CD/DVD ROM or other optical media. The drive interfaces and their associated computer-readable media allow for the nonvolatile storage of computer readable instructions, data structures, program modules and other data for the computing system environment 1200. Those skilled in the art will further appreciate that other types of computer readable media that can store data may be used for this same purpose. Examples of such media devices include, but are not limited to, magnetic cassettes, flash memory cards, digital videodisks, Bernoulli cartridges, random access memories, nano-drives, memory sticks, other read/write and/or read-only memories and/or any other method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Any such computer storage media may be part of computing system environment 1200.
[0099] A number of program modules may be stored in one or more of the memory/media devices. For example, a basic input/output system (BIOS) 1224, containing the basic routines that help to transfer information between elements within the computing system environment 1200, such as during start-up, may be stored in ROM 1208. Similarly, RAM 1210, hard drive 1218, and/or peripheral memory devices may be used to store computer executable instructions comprising an operating system 1226, one or more applications programs 1228 (such as an application that performs the methods and processes of this disclosure), other program modules 1230, and/or program data 1232. Still further, computer-executable instructions may be downloaded to the computing environment 1200 as needed, for example, via a network connection.
[0100] An end-user, e.g., a customer, retail associate, and the like, may enter commands and information into the computing system environment 1200 through input devices such as a keyboard 1234 and/or a pointing device 1236. While not illustrated, other input devices may include a microphone, a joystick, a game pad, a scanner, etc. These and other input devices would typically be connected to the processing unit 1202 by means of a peripheral interface 1238 which, in turn, would be coupled to bus 1206. Input devices may be directly or indirectly connected to processor 1202 via interfaces such as, for example, a parallel port, game port, firewire, or a universal serial bus (USB). To view information from the computing system environment 1200, a monitor 1240 or other type of display device may also be connected to bus 1206 via an interface, such as via video adapter 1242. In addition to the monitor 1240, the computing system environment 1200 may also include other peripheral output devices, not shown, such as speakers and printers.
[0101] The computing system environment 1200 may also utilize logical connections to one or more computing system environments. Communications between the computing system environment 1200 and the remote computing system environment may be exchanged via a further processing device, such as a network router 1252, that is responsible for network routing. Communications with the network router 1252 may be performed via a network interface component 1254. Thus, within such a networked environment, e.g., the Internet, World Wide Web, LAN, or other like type of wired or wireless network, it will be appreciated that program modules depicted relative to the computing system environment 1200, or portions thereof, may be stored in the memory storage device(s) of the computing system environment 1200.
[0102] The computing system environment 1200 may also include localization hardware 1256 for determining a location of the computing system environment 1200. In embodiments, the localization hardware 1256 may include, for example only, a GPS antenna, an RFID chip or reader, a Wi-Fi antenna, or other computing hardware that may be used to capture or transmit signals that may be used to determine the location of the computing system environment 1200.
[0103] In a first aspect of the instant disclosure, a method is disclosed that, given video acquired by a real source camera that comprises a camera-head and a rigid endoscope with lens cut β that rotates in azimuth around a mechanical axis that intersects the image in a point Q, for which the focal length f, radial distortion ζ, and principal point O at a certain azimuth α.sub.0 are known, renders the video that would be acquired by a virtual target camera placed in the same 3D location as the source camera, but with a lens cut {circumflex over (β)}, Field-of-View {circumflex over (Θ)}, and distortion {circumflex over (ζ)}, where each successive target image Î.sub.i with resolution of m×n and circular boundary of diameter {circumflex over (d)} centered in point Ĉ is rendered by executing the following steps for each corresponding source image I.sub.i: (i) finding the location of the principal point O.sub.i in the source image I.sub.i with azimuth α.sub.i by rotating O around Q by an angular displacement in azimuth δ.sub.i=α.sub.i−α.sub.0; (ii) updating the camera model of the source camera that maps points x in the canonical image, which are the pinhole projection of points X in the 3D scene, into points u in the pixel image such that u=c.sub.s(x; f, O.sub.i, ζ); (iii) determining the 3D location of a vertical plane Π.sub.i that contains, or passes close to, the mechanical axis and the optical axis of the source camera, by finding and back-projecting the line n.sub.i where Π.sub.i intersects the source image I.sub.i; (iv) defining a 3D motion m between target and source cameras as the rotation by an angle γ={circumflex over (β)}−β around a direction {right arrow over (n)}.sub.i that is the normal of the vertical plane Π.sub.i, such that x=m({circumflex over (x)}; γ, {right arrow over (n)}.sub.i), with {circumflex over (x)} being a point in the canonical image of the target camera; (v) computing a focal length {circumflex over (f)} and location of the principal point Ô.sub.i for the target camera and deriving the corresponding camera model that maps points {circumflex over (x)} into points û in the pixel image according to û=c.sub.t({circumflex over (x)}; {circumflex over (f)}, Ô.sub.i,{circumflex over (ζ)}); and (vi) generating a target image Î.sub.i using image warping techniques where each pixel û in Î.sub.i is mapped into a point u in the source image I.sub.i by a mapping function w that is the composition of the functions c.sub.s, m and the inverse of c.sub.t computed in step (v) (w=c.sub.s∘m∘c.sub.t.sup.−1), such that the color value of u can be interpolated.
[0104] In an embodiment of the first aspect, the azimuth α.sub.0 is referenced by a particular notch position P, the source image I.sub.i is processed to detect the boundary with center C.sub.i and notch position P.sub.i, the angular displacement in azimuth is estimated as δ.sub.i=≮P.sub.iQP, and the line n.sub.i is defined by points O.sub.i and P.sub.i
[0105] In an embodiment of the first aspect, the method further comprises processing the rendered target image Î.sub.i to create a black frame such that meaningful image contents are within a circular region with center Ĉ and diameter {circumflex over (d)} and to create a notch by placing a visual mark in point {circumflex over (P)}.sub.i=Ĉ+{circumflex over (d)}/2 v.sub.i in the circular boundary, with v.sub.i being the 2D unit direction of image line n.sub.i.
[0106] In an embodiment of the first aspect, the region that contains meaningful image contents can take any desired geometric shape, such as, but not limited to, a conic shape, a rectangular shape, a hexagonal shape, or any other polygonal shape.
[0107] In an embodiment of the first aspect, point Q is determined on-the-fly using the detection of points P.sub.i and/or C.sub.i in successive frames, in which case it does not have to be known or determined a priori.
[0108] In an embodiment of the first aspect, line n.sub.i is alternatively defined by points O.sub.i and Q or O.sub.i and C.sub.i.
[0109] In an embodiment of the first aspect, the position of the principal point is given in a normalized reference frame attached to the circular boundary, in which case the computation of its pixel location O.sub.i at every frame time instant i can be accomplished without having to explicitly know the rotation center Q and angular displacement in azimuth δ.sub.i.
[0110] In an embodiment of the first aspect, the source camera is equipped with an optical encoder, or any other sensing device, that measures the rotation of the scope with respect to the camera-head and estimates the angular displacement in azimuth δ.sub.i of step (i).
[0111] In an embodiment of the first aspect, the principal point O is coincident with the rotation center Q, in which case the camera model of the source camera does not have to be updated at every frame time instant as in step (ii).
[0112] In an embodiment of the first aspect, the distortion of the target camera {circumflex over (ζ)} is set to zero in order for the rendered image Î.sub.i to be distortion free.
[0113] In an embodiment of the first aspect, the rendering of target image Î.sub.i in step (vi) is performed using any common method for image warping or pixel value interpolation including, but not limited to, interpolation by nearest neighbors, bilinear interpolation or bicubic interpolation.
[0114] In an embodiment of the first aspect, in step (v) the principal point is made coincident with the center of the boundary (Ô.sub.i=Ĉ) and the focal length is chosen such that the FoV of the target camera is the desired {circumflex over (Θ)}, in which case the value of {circumflex over (f)} is obtained by solving Φ({circumflex over (f)}, {circumflex over (ζ)}, {circumflex over (Θ)}/2, {circumflex over (d)}/2)=0, with Φ being the mathematical expression that relates focal length, radial distortion, image distance, and angle between back-projection rays, all of which are interdependent parameters.
[0115] In an embodiment of the first aspect, in step (v) the principal point is made coincident with the center of the boundary (Ô.sub.i=Ĉ) and the focal length is chosen such that the rendered image never has a region without visual content (an empty region), the method comprising: finding a location Ō.sub.i in the source image where the warping function will map the principal point Ô.sub.i, which can be determined by transforming the origin [0,0].sup.T by a function g that is the composition of the camera model c.sub.s and the motion m (g=c.sub.s∘m); determining a limiting viewing angle Ψ.sub.i that is the angle between the back-projection rays of Ō.sub.i and the point in the circular boundary closest to Ō.sub.i; if
[0116] In an embodiment of the first aspect, in step (vi) the focal length {circumflex over (f)} and principal point Ô.sub.i are such that the rendered image has no empty region and the FoV of the target camera is the specified value {circumflex over (Θ)}, the method comprising: finding a location Ō.sub.i in the source image where the warping function will map the principal point Ô.sub.i, which can be determined by transforming the origin [0,0].sup.T by a function g that is the composition of the camera model c.sub.s and the motion m (g=c.sub.s∘m); determining a limiting viewing angle
and obtaining the principal point as Ô.sub.i=Ĉ+λv.sub.i, otherwise making Ô.sub.i=Ĉ and finding {circumflex over (f)} by solving Φ({circumflex over (f)},{circumflex over (ζ)}, {circumflex over (Θ)}/2, {circumflex over (d)}/2)=0, with Φ being the mathematical expression that relates focal length, radial distortion, image distance, and angle between back-projection rays, all of which are interdependent parameters.
[0117] In an embodiment of the first aspect, the parameters of the target camera can be arbitrary, be predefined to accomplish a certain purpose, be equal to the calibration of a particular real camera equipped with a particular real endoscope with cut angle {circumflex over (β)}, be chosen by the user at the start of the procedure, or vary during operation according to particular events or user commands.
[0118] In an embodiment of the first aspect, the method is used in a system connected to a source camera for the purpose of empowering the user with the possibility of electronically switching between two virtual target cameras with lens cuts and FoVs of {circumflex over (β)}.sub.1, {circumflex over (Θ)}.sub.1 and {circumflex over (β)}.sub.2, {circumflex over (Θ)}.sub.2.
[0119]-[0120] In an embodiment of the first aspect, the source camera is chosen such that the cut angle of the rigid endoscope and the FoV take approximately the values that assure that the principal point in the target cameras is close to the center of the image (for the 30°/70° arthroscopy example of § [0089], a cut angle of approximately 45° and a FoV of approximately 140°).
[0121] In an embodiment of the first aspect, the FoV {circumflex over (Θ)}, the distortion {circumflex over (ζ)}, the image resolution and the diameter {circumflex over (d)} of the target camera are set with the same values as the source camera's, and the cut angle {circumflex over (β)} is set by the user, who controls the amount of angular shift γ that is added to the lens cut β of the source camera such that {circumflex over (β)}=β+γ.
[0122] In an embodiment of the first aspect, both the distortion {circumflex over (ζ)} and the angular shift γ are set to zero, in which case the system will correct the radial distortion in the source image.
[0123] In an embodiment of the first aspect, the distortion {circumflex over (ζ)} is set to zero for the target image to be rendered with no radial distortion independently of the chosen angular shift γ.
[0124] In an embodiment of the first aspect, the angular shift γ is chosen such that the principal point is placed in a region of interest in the target image and the radial distortion parameter {circumflex over (ζ)} is increased to magnify this region of interest while maintaining the FoV of the target camera and all contents visible.
[0125] In an embodiment of the first aspect, the source and/or target cameras have radial distortion described by any of the models known in the literature, including, but not limited to, learning-based models, Brown's polynomial model, the rational model, the fish-eye model, or the division model.
[0126] In an embodiment of the first aspect, the boundary with center C.sub.i and the notch P.sub.i are detected using a generic conic detection method, using machine learning or deep learning techniques, or any other image processing technique.
[0127] In an embodiment of the first aspect, multiple notch positions are detected and a particular notch of these multiple notches is used for determining the angular displacement in azimuth δ.sub.i as the angle defined by points P.sub.i, Q, P.
[0128] In an embodiment of the first aspect, the Field-of-View {circumflex over (Θ)} and the distortion {circumflex over (ζ)} of the virtual target camera take the same values as the source camera's, but the lens cut {circumflex over (β)} is set by the user at every frame time instant.
[0129] In an embodiment of the first aspect, the Field-of-View {circumflex over (Θ)} of the virtual target camera takes the same value as the source camera's, but the lens cut {circumflex over (β)} and the distortion {circumflex over (ζ)} are set by the user at every frame time instant.
[0130] In an embodiment of the first aspect, the Field-of-View {circumflex over (Θ)} and the lens cut {circumflex over (β)} of the virtual target camera take the same values as the source camera's, but the distortion {circumflex over (ζ)} is set by the user at every frame time instant to produce a zoom in or zoom out effect that does not change the FoV.
[0131] In an embodiment of the first aspect, the Field-of-View {circumflex over (Θ)} of the virtual target camera takes the same value as the source camera's, but the lens cut {circumflex over (β)} and the distortion {circumflex over (ζ)} are set by the user at every frame time instant to adapt the DoV and produce a zoom in or zoom out effect that does not change the FoV.
[0132] In an embodiment of the first aspect, the Field-of-View {circumflex over (Θ)} of the virtual target camera takes the same value as the source camera's, but the lens cut {circumflex over (β)} and the focal length {circumflex over (f)} are set by the user at every frame time instant, in which case the variable to solve for is not the focal length {circumflex over (f)} but distortion {circumflex over (ζ)} to produce a zoom in or zoom out effect that does not change the FoV.
[0133] While various embodiments have been described for purposes of this disclosure, such embodiments should not be deemed to limit the teaching of this disclosure to those embodiments. Various changes and modifications may be made to the elements and operations described above to obtain a result that remains within the scope of the systems and processes described in this disclosure. All patents, patent applications, and published references cited herein are hereby incorporated by reference in their entirety. It should be emphasized that the above-described embodiments of the present disclosure are merely possible examples of implementations, set forth for a clear understanding of the principles of the disclosure. Many variations and modifications may be made to the above-described embodiment(s) without departing substantially from the spirit and principles of the disclosure. It will be appreciated that several of the above-disclosed and other features and functions, or alternatives thereof, may be desirably combined into many other different systems or applications. All such modifications and variations are intended to be included herein within the scope of this disclosure, insofar as they fall within the scope of the appended claims.
[0134] The described embodiments are to be considered in all respects only as illustrative and not restrictive and the scope of the presently disclosed embodiments is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed systems and/or methods.