Apparatus and method for wide-range optical tracking during medical imaging
11308645 · 2022-04-19
Inventors
- Jakob Ehrl (Untergriesbach, DE)
- Julian Maclaren (Menlo Park, CA)
- Murat Aksoy (San Jose, CA, US)
- Roland Bammer (Palo Alto, CA)
CPC classification
G06T7/246
PHYSICS
G01R33/5608
PHYSICS
A61B5/055
HUMAN NECESSITIES
G01R33/56509
PHYSICS
G01R33/283
PHYSICS
A61B5/721
HUMAN NECESSITIES
International classification
G06T7/80
PHYSICS
A61B5/11
HUMAN NECESSITIES
G01R33/565
PHYSICS
A61B5/055
HUMAN NECESSITIES
G01R33/28
PHYSICS
Abstract
Methods to quantify motion of a human or animal subject during a magnetic resonance imaging (MRI) exam are provided. In particular, these algorithms make it possible to track head motion over an extended range by processing data obtained from multiple cameras. These methods extend current motion tracking approaches to a wider patient population.
Claims
1. A method of determining a position and orientation of an object in a medical imaging device, the method comprising: rigidly attaching one or more markers to the object, wherein each marker of the one or more markers comprises three or more feature points, wherein the three or more feature points of each marker of the one or more markers have known positions in a coordinate system of the corresponding marker; configuring two or more cameras to have partial or full views of at least one of the one or more markers; determining a camera calibration that provides transformation matrices T.sub.ij relating a coordinate system C.sub.i of camera i to a coordinate system C.sub.j of camera j, wherein i and j are index integers for the two or more cameras; forming two or more images of the one or more markers with the two or more cameras, wherein the known positions of the three or more feature points of each marker in the coordinate systems of the corresponding markers lead to image consistency conditions for images of the three or more feature points in the camera coordinate systems; wherein the image consistency conditions are relations that are true in images of the one or more markers because of known relative positions of the three or more feature points on each of the one or more markers; and solving the image consistency conditions to determine rigid-body transformation matrices M.sub.k relating coordinate systems MC.sub.k of each marker k to the coordinate systems of the two or more cameras, wherein k is an index integer for the one or more markers, whereby the position and orientation of the object is provided; wherein the solving the image consistency conditions to determine each rigid-body transformation matrix M.sub.k is performed with a least squares solution to an overdetermined system of linear equations; wherein the overdetermined system of linear equations for rigid-body transformation matrix M.sub.k is a set of two equations for each feature point of marker k that is seen by each of the two or more cameras; and wherein the overdetermined system of linear equations for rigid-body transformation matrix M.sub.k has coefficients of the rigid-body transformation matrix M.sub.k as unknowns to be solved for.
2. The method of claim 1, wherein the two or more cameras are compatible with magnetic fields of a magnetic resonance imaging system.
3. The method of claim 1, wherein the one or more markers include a position self-encoded marker.
4. The method of claim 1, wherein the object is a head of a human subject.
5. The method of claim 1, wherein the camera calibration is determined prior to installing the two or more cameras in the medical imaging device.
6. The method of claim 1, wherein the camera calibration includes referencing each camera to system coordinates of the medical imaging device and enforcing consistency conditions for the camera calibration.
7. The method of claim 1, wherein all visible feature points of the one or more markers in the images are used in the solving of the image consistency conditions.
8. The method of claim 1, wherein fewer than all visible feature points of the one or more markers in the images are used in the solving of the image consistency conditions.
9. The method of claim 1, wherein a frame capture timing of the two or more cameras is offset, whereby an effective rate of tracking can be increased.
10. The method of claim 1, wherein the two or more cameras are arranged to allow a marker tracking range in a head-feet direction of a patient being imaged.
11. The method of claim 1, further comprising applying motion correction to medical imaging data based on the position and orientation of the object.
12. The method of claim 11, wherein the motion correction is applied adaptively.
13. The method of claim 12, wherein two or more of the one or more markers are attached to the object, and further comprising performing analysis of a relative position of the two or more markers as a marker consistency check.
Description
DETAILED DESCRIPTION
A) General Principles
(15) To better appreciate the present invention, it is helpful to briefly summarize some embodiments before turning to the detailed examples below. An exemplary embodiment of the invention is a method of determining the position and orientation of an object in a medical imaging device. The method includes five main steps.
(16) 1) Providing one or more markers rigidly attached to the object, where each marker includes three or more feature points, and where the feature points of each marker have known positions in a coordinate system of the corresponding marker. In other words, the feature points are marker features that can be distinguished from each other in images and which have known relative positions with respect to each other, provided they are on the same marker.
(17) 2) Providing two or more cameras configured to have partial or full views of at least one of the markers.
(18) 3) Determining a camera calibration that provides transformation matrices T.sub.ij relating a coordinate system C.sub.i of camera i to a coordinate system C.sub.j of camera j. Here i and j are index integers for the two or more cameras. See Eqs. 1 and 3 below for examples of such transformation matrices.
(19) 4) Forming two or more images of the one or more markers with the two or more cameras. Here the known positions of the feature points of each marker in the coordinate system of the corresponding marker lead to image consistency conditions for images of the feature points in the camera coordinate systems. See Eqs. 2 and 4 below for examples of such consistency conditions. Image consistency conditions refer to relations that are true in images of the markers because of the known relative positions of the feature points on each marker. As a simple example, suppose three feature points are equally spaced along the x-direction of the marker coordinate system. That equal-spacing relation will lead to corresponding relations in any image that includes these three feature points. This kind of consistency condition is a single-image consistency condition, and differs from the image-to-image consistency checks performed to see whether a marker has moved, as described below.
(20) 5) Solving the image consistency conditions to determine transformation matrices M.sub.k relating the coordinate systems MC.sub.k of each marker k to the coordinate systems of the cameras, where k is an index integer for the one or more markers, whereby the position and orientation of the object is provided (see the examples of part B below).
(21) The cameras are preferably compatible with magnetic fields of a magnetic resonance imaging system. The one or more markers can include a position self-encoded marker. The object can be a head of a human subject.
(22) The camera calibration can be performed prior to installing the cameras in the medical imaging device. The camera calibration can include referencing each camera to system coordinates of the medical imaging device and enforcing consistency conditions for the camera calibration.
(23) All or fewer than all visible feature points of the markers in the images can be used in the solution of the image consistency conditions. A frame capture timing of the two or more cameras can be offset to increase an effective rate of tracking. The cameras can be arranged to increase a marker tracking range in a head-feet direction of a patient being imaged.
(24) The position and orientation of the object can be used to apply motion correction to medical imaging data. Such motion correction can be applied adaptively. In cases where two or more markers are attached to the object, analysis of the relative position of the two or more markers can be performed as a marker consistency check. If this marker consistency check fails, the motion correction can be disabled.
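As a minimal sketch of such a marker consistency check, assuming each marker pose is available as a 4-by-4 homogeneous matrix in a common coordinate frame; the function names, tolerance values, and reference-pose handling here are illustrative, not specified by the patent:

```python
import numpy as np

def relative_pose(T_a: np.ndarray, T_b: np.ndarray) -> np.ndarray:
    """Pose of marker B expressed in marker A's frame (both 4x4 homogeneous)."""
    return np.linalg.inv(T_a) @ T_b

def markers_consistent(T_a, T_b, T_rel_ref, rot_tol_deg=0.5, trans_tol_mm=0.5):
    """Return True if two markers on the same rigid object still agree with
    their reference relative pose; otherwise motion correction should be
    disabled rather than applied with a corrupted pose estimate."""
    D = np.linalg.inv(T_rel_ref) @ relative_pose(T_a, T_b)  # deviation from reference
    # Rotation angle of the deviation, from the trace of its 3x3 block.
    cos_theta = (np.trace(D[:3, :3]) - 1.0) / 2.0
    angle = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
    shift = np.linalg.norm(D[:3, 3])
    return angle <= rot_tol_deg and shift <= trans_tol_mm
```

If the check fails, the markers can no longer be assumed rigidly coupled (e.g., one has slipped), and the motion correction can be disabled as described above.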
(25) Solving the image consistency conditions can be performed with a least squares solution to an overdetermined system of linear equations (i.e., more equations than unknowns).
B) Examples
(34) The pose combination algorithm combines the individual pose estimates obtained from the different cameras into a single estimate.
(35) The estimates are combined using a weighted sum. For the translation component of pose, the combined estimate is given by
$$t_c = w_1 t_1 + w_2 t_2 + \cdots + w_n t_n$$
where $t_i$ is the vector translation component of the pose estimate from camera i and $w_i$ is the corresponding weight.
(36) The combined estimate of the rotation component of each pose is computed using a similar weighting procedure. However, simply averaging rotation matrices or Euler angles is not a mathematically valid approach. Instead, the rotation components derived from the individual camera views are first expressed as unit quaternions, q.sub.i. The combined estimate q.sub.c is then calculated using one of several known methods, such as spherical linear interpolation (slerp) or the method of Markley et al., “Averaging Quaternions”, Journal of Guidance, Control and Dynamics, Vol. 30, No. 4, 2007. In our experience, when the unit quaternions to be averaged all represent a similar rotation, a simple and computationally efficient approximation to these methods can be obtained using the following procedure:
(37) 1) Changing the sign of all unit quaternions with negative real part (q and −q represent the same rotation, but can't be easily averaged).
(38) 2) Taking the mean of all n unit quaternions by adding all components and dividing by n.
(39) 3) Renormalizing by dividing the result from (2) by its norm, so that the combined quaternion, q.sub.c, is a unit quaternion.
(40) If weighted averaging is desired, then weights can be easily included as part of Step (2).
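A minimal numpy sketch of this pose combination, assuming real-part-first unit quaternions that all represent a similar rotation (as the approximation requires); the function name and array conventions are illustrative:

```python
import numpy as np

def combine_poses(translations, quaternions, weights=None):
    """Combine per-camera pose estimates: weighted mean of the translation
    vectors, plus the sign-fix / mean / renormalize quaternion procedure
    of paragraphs (37)-(39), with optional weights as in (40)."""
    t = np.asarray(translations, dtype=float)  # shape (n, 3)
    q = np.asarray(quaternions, dtype=float)   # shape (n, 4), real part first
    n = len(t)
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()
    t_c = w @ t                         # t_c = w_1 t_1 + ... + w_n t_n
    q = np.where(q[:, :1] < 0, -q, q)   # step 1: q and -q are the same rotation
    q_c = w @ q                         # step 2: (weighted) component-wise mean
    q_c /= np.linalg.norm(q_c)          # step 3: renormalize to a unit quaternion
    return t_c, q_c
```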
(43) The augmented DLT algorithm determines the pose of the marker coordinate frame (W) with respect to a reference camera frame (arbitrarily chosen to be C.sub.1 in this example). This pose is represented by a 4-by-4 transformation matrix T.sub.WC1. Here, we are assuming that the extrinsic calibration of the camera system is already known, i.e., the transformation matrix T.sub.C1C2 linking the two coordinate frames.
(44) Cameras 1 and 2 track two points, $^{W}X_1$ and $^{W}X_2$, respectively. The left superscript W indicates that $^{W}X_1$ and $^{W}X_2$ are defined with respect to the coordinate frame W, i.e.,

$$^{C_1}X_1 = T_{WC_1}\,^{W}X_1$$
$$^{C_2}X_1 = T_{C_1C_2}\,T_{WC_1}\,^{W}X_1 \tag{1}$$
In practice, the coordinate frame W corresponds to the coordinate frame defined by the marker.
(45) Using the pinhole camera model, the projection of $^{C_1}X_1 = (^{C_1}x_1,\ ^{C_1}y_1,\ ^{C_1}z_1)$ onto the first camera image plane, $^{C_1}I_1 = (^{C_1}u_1^{(1)},\ ^{C_1}v_1^{(1)},\ -f^{(1)})$, can be determined as:
(46)
$$^{C_1}u_1^{(1)} = f^{(1)}\,\frac{^{C_1}x_1}{^{C_1}z_1}, \qquad ^{C_1}v_1^{(1)} = f^{(1)}\,\frac{^{C_1}y_1}{^{C_1}z_1} \tag{2}$$
where $f^{(1)}$ is the focal length of camera 1. Note that Eq. 2 uses the coordinates $^{C_1}X_1$, whereas what is actually known is $^{W}X_1$. Another important point is that the coordinates u and v in Eq. 2 are still defined with respect to the physical coordinate system C1 and are expressed in physical units (e.g., millimeters). In reality, however, the location of a projected point on a camera image is described in pixels. The conversion from detected pixel coordinates to physical coordinates (u, v) involves additional steps, such as re-centering to account for the offset between the centers of the lens and the detector, and correcting for radial and tangential lens distortions. These pixel-to-physical conversion rules are constant for a given camera and can be determined offline using well-known intrinsic camera calibration methods (e.g., Zhang Z. A flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence 2000; 22:1330-1334. doi: 10.1109/34.888718). Thus, without loss of generality, it can be assumed that the (u, v) coordinates in Eq. 2 are readily determined from the pixel coordinates on the image. The focal length $f^{(1)}$ can also be dropped from Eq. 2 by re-defining u′ and v′ such that u′=u/f and v′=v/f.
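As a small illustrative sketch of this pixel-to-physical conversion, assuming lens distortion has already been corrected separately and using the standard intrinsic parameters (focal lengths and principal point from an offline calibration such as Zhang's method); the names are hypothetical:

```python
def pixel_to_normalized(px, py, fx, fy, cx, cy):
    """Convert detected (distortion-corrected) pixel coordinates to the
    focal-length-free coordinates u' = u/f, v' = v/f used below in Eq. 4.
    fx, fy are the focal lengths in pixels; (cx, cy) is the principal point."""
    u_prime = (px - cx) / fx  # re-center on the principal point, divide by f
    v_prime = (py - cy) / fy
    return u_prime, v_prime
```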
(47) The transformation matrices between the marker and Camera 1, and between Camera 1 and Camera γ, can be defined as
(48)
$$T_{WC_1} = \begin{bmatrix} R_{WC_1} & t_{WC_1} \\ 0 & 1 \end{bmatrix}, \qquad T_{C_1C_\gamma} = \begin{bmatrix} R_{C_1C_\gamma} & t_{C_1C_\gamma} \\ 0 & 1 \end{bmatrix} \tag{3}$$
where γ is the camera index. In both cases, the 3-by-3 matrix R represents the rotation and the 3-by-1 vector t represents the translation. $T_{C_1C_\gamma}$ is already known through extrinsic camera calibration, and $T_{WC_1}$ is the marker pose that is to be determined using DLT. For an arbitrary point κ and camera γ, we can re-arrange Eq. 2 (dropping the focal length) to get:
$$^{C_\gamma}u_\kappa^{(\gamma)}\;{}^{C_\gamma}z_\kappa - {}^{C_\gamma}x_\kappa = 0$$
$$^{C_\gamma}v_\kappa^{(\gamma)}\;{}^{C_\gamma}z_\kappa - {}^{C_\gamma}y_\kappa = 0 \tag{4}$$
(49) Combining Eqs. 1, 3, and 4, and cascading the two equations for each detected point over all cameras, gives a single overdetermined system of linear equations.
(50) More explicitly, the matrix of this system has dimensions $2\sum_{\gamma=1}^{n_\gamma} n_\eta^{(\gamma)}$-by-12, where $n_\gamma$ is the total number of cameras and $n_\eta^{(\gamma)}$ is the number of points detected by camera γ: each detected point contributes the two rows of Eq. 4, and the 12 columns correspond to the unknown coefficients of the rotation and translation in $T_{WC_1}$. In cases where more than one marker is employed, a system of equations of the same form is set up and solved separately for each marker, yielding one rigid-body transformation per marker.
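A minimal numpy sketch of building and solving this system, under the conventions above: per-camera extrinsics $T_{C_1C_\gamma}$ as 4-by-4 matrices (identity for the reference camera), focal-length-free image coordinates (u′, v′), and the 12 coefficients of the top three rows of $T_{WC_1}$ as unknowns. The data-structure names are illustrative, and the recovered rotation block is not re-orthonormalized here:

```python
import numpy as np

def solve_marker_pose(observations, extrinsics, marker_points):
    """Augmented DLT: stack the two rows of Eq. 4 for every point detected
    by every camera, then solve the overdetermined 2N-by-12 linear system
    for T_WC1 = [R | t] in a least-squares sense.

    observations:  iterable of (gamma, kappa, u, v) tuples, where (u, v) are
                   focal-length-free image coordinates of point kappa in camera gamma
    extrinsics:    dict gamma -> 4x4 T_C1Cgamma (identity for camera 1)
    marker_points: dict kappa -> 3-vector of the point in the marker frame W
    """
    rows, rhs = [], []
    for gamma, kappa, u, v in observations:
        R, t = extrinsics[gamma][:3, :3], extrinsics[gamma][:3, 3]
        Xh = np.append(marker_points[kappa], 1.0)  # homogeneous ^W X_kappa
        # With M the unknown 3x4 top block of T_WC1: ^Cg X = R (M Xh) + t, so
        # u * ^Cg z - ^Cg x = 0 becomes (u R[2] - R[0]) . (M Xh) = t[0] - u t[2],
        # and similarly for v with R[1] and t[1].
        for img, a in ((u, 0), (v, 1)):
            c = img * R[2] - R[a]  # coefficients acting on the three rows of M
            rows.append(np.concatenate([c[0] * Xh, c[1] * Xh, c[2] * Xh]))
            rhs.append(t[a] - img * t[2])
    A, b = np.asarray(rows), np.asarray(rhs)   # A is 2N-by-12
    m, *_ = np.linalg.lstsq(A, b, rcond=None)  # least-squares solution
    # Assemble the 4x4 pose (rotation block not projected back onto SO(3)).
    return np.vstack([m.reshape(3, 4), [0.0, 0.0, 0.0, 1.0]])
```

With more than one marker, the same routine is simply run once per marker with that marker's detected points.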
(52) Solution of this system in the least-squares sense then yields the marker pose $T_{WC_1}$.
(53) The camera calibration itself must also be internally consistent: chaining the scanner-camera and camera-camera transformations around a closed loop must return the identity, i.e.,

$$T_{C_1S}\,T_{C_2C_1}\,T_{SC_2} = I \tag{5}$$
(54) Well-known iterative optimization methods can be used to modify the measured transformations such that the above equation holds, while satisfying constraints such as the following (a minimal residual check for Eq. 5 is sketched after this list):
(55) 1) Even distribution of errors between the scanner-camera cross-calibration transformations $T_{C_1S}$ and $T_{SC_2}$, and/or
(56) 2) No errors in $T_{C_2C_1}$, because camera-camera calibration can be done to far greater accuracy than scanner-camera calibration.
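A minimal residual check for Eq. 5, assuming 4-by-4 homogeneous transforms; it quantifies how far the chained calibration is from the identity, but the iterative redistribution of the error itself is left to the optimization methods mentioned above:

```python
import numpy as np

def loop_closure_residual(T_C1S, T_C2C1, T_SC2):
    """Residual of Eq. 5: rotation (degrees) and translation components of
    the deviation of T_C1S @ T_C2C1 @ T_SC2 from the identity."""
    E = T_C1S @ T_C2C1 @ T_SC2
    cos_theta = (np.trace(E[:3, :3]) - 1.0) / 2.0
    rot_err = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
    trans_err = np.linalg.norm(E[:3, 3])
    return rot_err, trans_err
```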
(57) Given more than two cameras, it is possible to formulate the optimal scanner-camera transformation in a least-squares sense as follows. Arbitrarily choosing C1 as the reference frame, one can obtain:
(58)
$$\tilde{T}_{C_1S} = T_{C_1S}, \qquad \tilde{T}_{C_2S} = T_{C_1S}\,T_{C_2C_1}, \qquad \ldots, \qquad \tilde{T}_{C_\gamma S} = T_{C_1S}\,T_{C_\gamma C_1} \tag{6}$$
(59) Here, $\tilde{T}_{C_1S}$, $\tilde{T}_{C_2S}$, and $\tilde{T}_{C_\gamma S}$ are the measured camera-to-scanner transformations for cameras 1, 2, and γ. As mentioned above, the transformation between camera and MRI scanner can be obtained using methods well known to those in the field. In addition, the camera-to-scanner transformations for all cameras can be obtained within one experiment without additional time overhead. In Eq. 6, $T_{C_\gamma C_1}$ represents the transformations between camera γ and camera 1, and can be obtained outside the MRI scanner with a high degree of accuracy. $T_{C_1S}$ in Eq. 6 is the reference-camera-to-scanner transformation that needs to be determined from the equations. Re-writing Eq. 6 as a least-squares problem:
(60)
$$\hat{T}_{C_1S} = \arg\min_{T_{C_1S}} \sum_{\gamma=1}^{n_\gamma} \left\lVert \tilde{T}_{C_\gamma S} - T_{C_1S}\,T_{C_\gamma C_1} \right\rVert_F^2 \tag{7}$$
Eq. 7 represents a linear-least-squares problem with respect to the variables in T.sub.C1S, so it can be solved using any available linear equation solver. It is also possible to solve Eq. 7 using non-linear methods, such as Levenberg-Marquardt or Gauss-Newton. One can also solve Eq. 7 by separating the rotational and translational components and solving for the rotational component of the transformation matrices first.
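A minimal sketch of the linear route, assuming 4-by-4 homogeneous transforms: stacking the transposed relations $T_{C_\gamma C_1}^{\mathsf T}\,T_{C_1S}^{\mathsf T} = \tilde{T}_{C_\gamma S}^{\mathsf T}$ turns Eq. 7 into a single ordinary least-squares solve. The final projection of the rotation block back onto a rotation matrix is an added practical detail, not prescribed by the patent:

```python
import numpy as np

def solve_reference_calibration(T_tilde_CgS, T_CgC1):
    """Solve Eq. 7: find T_C1S minimizing
    sum_gamma || T~_CgS - T_C1S @ T_CgC1 ||_F^2.
    Each term is linear in T_C1S, so stacking T_CgC1^T @ T_C1S^T = T~_CgS^T
    over all cameras gives one ordinary least-squares problem."""
    A = np.vstack([T.T for T in T_CgC1])       # (4 * n_cameras) x 4
    B = np.vstack([T.T for T in T_tilde_CgS])  # (4 * n_cameras) x 4
    X, *_ = np.linalg.lstsq(A, B, rcond=None)
    T_C1S = X.T
    # Project the 3x3 block onto the nearest rotation (Frobenius sense),
    # since the unconstrained solve does not enforce orthonormality.
    U, _, Vt = np.linalg.svd(T_C1S[:3, :3])
    T_C1S[:3, :3] = U @ np.diag([1.0, 1.0, np.linalg.det(U @ Vt)]) @ Vt
    T_C1S[3, :] = [0.0, 0.0, 0.0, 1.0]
    return T_C1S
```

The rotation-then-translation splitting mentioned above is an alternative to this joint solve and can improve conditioning when the transformations mix units.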