ABSTRACT Title of Document: PROSPECTIVE HEAD MOVEMENT CORRECTION FOR HIGH-RESOLUTION MRI USING AN IN-BORE OPTICAL TRACKING SYSTEM. Lei Qin, Doctor of Philosophy, 2009 Directed By: Professor Yang Tao, Fischell Department of Bioengineering In MRI of the human brain, subject motion is a major cause of magnetic resonance image quality degradation. To compensate the effects of head motion during data acquisition, an in-bore optical motion tracking system is proposed. The system comprises one or two MR compatible infrared cameras that are fixed on a holder right above and in front of the head coil. The resulting close proximity of the cameras to the object allows precise tracking of its movement. During image acquisition, the MRI scanner uses this tracking information to prospectively compensate for head motion by adjusting gradient field direction and RF phase and frequency. Experiments performed on subjects demonstrate the system?s robustness, exhibiting an accuracy of better than 0.1mm and 0.15?. PROSPECTIVE HEAD MOVEMENT CORRECTION FOR HIGH-RESOLUTION MRI USING AN IN-BORE OPTICAL TRACKING SYSTEM. By Lei Qin Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park, in partial fulfillment of the requirements for the degree of Doctor of Philosophy 2009 Advisory Committee: Professor Yang Tao, Chair Professor Rama Chellappa Dr. Jeff H Duyn Professor Elliot R McVeigh Professor Hubert Montas ? Copyright by Lei Qin 2009 ii Acknowledgements I would like to express my deepest gratitude to my supervisor, Dr. Jeff H Duyn, who has been an invaluable mentor for the past thee and half years at NIH. His insightful guidance and advice have made substantial differences in my work, as well as my life. His confidence in my capability has enabled much of my creative work. His encouragement and understanding have made lots of difficulties seem easier. I am also deeply grateful to my supervisor, Dr. Yang Tao, for seeing my research potential and pushing me to complete my PhD degree. His wide knowledge and his logical way of thinking have been of great value for me. His understanding and guidance have provided a good basis for the present thesis. His support and encouragement have been invaluable for the past four years. I would also like to thank Dr. Rama Chellappa, Dr. Elliot R McVeigh and Dr. Hubert Montas for serving on my committee and for their time, support, and helpful advices during my research and academic development. I also owe my most sincere gratitude to Dr. Alan Koretsky, for his guidance, encouragement, understanding and generous help in both professional and personal matters. He made me believe in myself when I was in doubt, showed me the right path when I was lost. I cannot thank him enough for all he did for me. I am also grateful to Dr. John Andrew Derbyshire from NHLBI for his help on my understanding of MR physics and sequence programming. I will never forget the long hours he spent with me on the scanner. He gave me tremendous inspiration in my work and gave me help whenever I needed it. iii I enjoyed every single day I worked in the Advance MRI lab. Every one in the lab is nice, generous and helpful to me. I want to thank Dr. Peter van Gelderen, Dr. Jacco de Zwart, Dr. Jonghoo Lee, Dr. Karin Shmueli, Dr. Bing Yao, Dr. Masaki Fukunaga, Dr. Marta Bianciardi, Dr. Shuming Wang, Dr. Joe Murphy-Boesch , Dr. Tieqiang Li and Susan Fulton. Special thanks go to Peter and Tieqiang for their guidance during my first steps into MR pulse sequence programming; and to Jacco for his help on my numerous computer problems. Only because of the help from all of you can I make this thesis possible. I would also like to express my appreciation to all present and former members of the Bio-imaging and Machine Vision Laboratory at University of Maryland for their constant suggestions and friendship, including Bin Zhu, Lu Jiang, Xin Chen, Hansong Jing, Xuemei Chen, Abby Vogel, Angela Vargas, and Gary Seibel. I would like to extend my deepest gratitude to my parents, who have always believed in me and knew I could accomplish the goal of being the first PhD in my family. And to my husband Dr. Fenghua Jin, thanks for the enduring love, encouragement, inspiration and support through all these years. His presence with me has made the whole work more enjoyable. Last, but far from least, I want to express my thanks to all the people who have made this thesis possible and because of whom my graduate experience has been one that I will cherish forever. iv Table of Contents Acknowledgements ....................................................................................................... ii Table of Contents ......................................................................................................... iv List of Figures .............................................................................................................. vi Chapter 1 Introduction .................................................................................................. 1 Chapter 2 Review of Literature ..................................................................................... 2 2.1 Principles of Magnetic Resonance Imaging ....................................................... 2 2.1.1 Nuclear Magnetic Resonance Signal ........................................................... 2 2.1.2 Spatial Localization for Imaging ................................................................. 5 2.1.3 Fast Imaging ? Echo Planar Imaging ......................................................... 10 2.2 Motion Artifacts ................................................................................................ 11 2.2.1 Reason of motion artifacts ......................................................................... 11 2.2.2 Correction for Motion Artifacts ................................................................. 13 2.3 Stereo Tracking System .................................................................................... 17 2.3.1 Camera Calibration .................................................................................... 17 2.3.2 3D Reconstruction ..................................................................................... 19 2.3.3 Object Tracking ......................................................................................... 20 2.3.4 3D Motion Estimation................................................................................ 21 Chapter 3 Objective .................................................................................................... 23 Chapter 4 Prospective Motion Correction for MRI using a Stereo Tracking System 24 4.1 System Setup ..................................................................................................... 24 4.2 System Calibration ............................................................................................ 26 4.2.1 CCD Camera Calibration ........................................................................... 27 4.2.2 Stereo Calibration ...................................................................................... 36 4.2.3 3D Reconstruction from Stereo ................................................................. 39 4.2.4 Integrated Optical and MRI System Calibration ....................................... 42 4.3 Real-time Motion Calculation........................................................................... 46 4.3.1 Kanade-Lucas-Tomasi Tracker .................................................................. 46 4.3.2 Stereo Matching ......................................................................................... 48 4.3.3 Rigid Motion Estimation............................................................................ 51 4.4 Motion Calculation in MRI coordinates ........................................................... 54 4.4.1 Motion in MRI physical coordinates ......................................................... 54 4.4.2 Motion in MRI logical coordinates ............................................................ 55 4.5 Motion Correction for MRI .............................................................................. 57 4.5.1 Rotation Correction .................................................................................... 57 4.5.2 Slice Selection Displacement Correction................................................... 58 4.5.3 Frequency Encoding Displacement Correction ......................................... 66 4.5.4 Phase Encoding Displacement Correction ................................................. 69 4.5.5 Summary of Position Compensation ......................................................... 72 4.5.6 K-space Line Re-acquisition ...................................................................... 73 4.6 Software Flowchart for the Motion Correction System .................................... 73 Chapter 5 Results and Discussion with the Stereo Motion Correction System for MRI ..................................................................................................................................... 75 v 5.1 Camera Calibration Test ................................................................................... 75 5.2 Stereo Reconstruction Validation ..................................................................... 76 5.3 Tracker Noise Evaluation ................................................................................. 77 5.4 Calibration Accuracy Between Cameras and MRI Coordinate System ........... 79 5.5 Pulse Sequence Test with a Phantom................................................................ 80 5.6 System Performance Test ................................................................................. 83 5.7 Volunteer Study ................................................................................................ 84 5.7.1 Breathing Related Motion .......................................................................... 85 5.7.2 Minimal Motion ......................................................................................... 86 5.7.3 Large Stepwise Motion .............................................................................. 87 5.7.4 Slow Motion............................................................................................... 88 5.8 Discussion ......................................................................................................... 90 Chapter 6 Prospective Motion Correction for MRI using a Single Camera ............... 94 6.1 System Setup ..................................................................................................... 94 6.2 Method .............................................................................................................. 95 6.2.1 Motion Computation .................................................................................. 97 6.2.2 Best Match Search ..................................................................................... 98 6.3 Result .............................................................................................................. 100 6.3.1 Accuracy of the Motion Estimation ......................................................... 100 6.3.2 Volunteer Study ....................................................................................... 103 6.4 Discussion ....................................................................................................... 104 Chapter 7 Conclusion ................................................................................................ 107 Chapter 8 Future Work ............................................................................................. 109 Chapter 9 Appendices ............................................................................................... 111 9.1 Derivation of motion in logical coordinate system ......................................... 111 9.2 MRI Coordinates from DICOM Image........................................................... 112 9.3 Gauss-Newton Algorithm ............................................................................... 114 9.4 Stereo System QA ........................................................................................... 117 Chapter 10 Bibliography ........................................................................................... 119 vi List of Figures Figure 2-1 Slice selection. Activating a given magnetic-field gradient creates a linear correspondence between the magnetic-field strength and the position along the gradient. An RF pulse with center frequency c? excites spins located at position cz ; the bandwidth of ?? and the gradient strength determine the slice thickness z? . ......................................................................................................... 6 Figure 2-2 Sequence diagram for a 2D gradient echo imaging sequence. RF pulse is a sinc function. Gz is the slice selection z-gradient. Gy is the phase encoding y- gradient, pictured as a series of horizontal lines to denote that it is being stepped regularly through increasing values during different repetition periods. Gx is the frequency encoding x-gradient and S is the echo. ................................................. 7 Figure 2-3 k-space trajectory for 2D gradient echo imaging sequence. The signal starts from kx=ky=0 after each slice selection rephasing lobe. Phase encoding gradient moves the k-space trajectory vertically to a ky depending on the current Gy (red vertical line). The dephasing lobe of Gx transverses the trajectory horizontally to kx,min (red horizontal line). The reversed read gradient then brings the spins to kx,max through a complete set of kx values (red arrows). The experiment is repeated by changing Gy to obtain a complete set of data (black arrows). .................................................................................................................. 9 Figure 2-4 (a) An EPI sequence with blipped phase encoding gradient. (b) K-space trajectory for the corresponding EPI sequence. ................................................... 11 Figure 2-5 An axial brain image exhibits ghosting due to head motion. Since the movement is random, the ghosts are smeared out along the phase encoding direction (left/right). ............................................................................................ 12 Figure 2-6 A high resolution T2* weighted GRE brain image. The resolution is 200?200?1000 micrometer, and the scan time is about 5 minutes. ................... 13 Figure 2-7 Navigator sequence and its application for motion correction. (a) X- NAV sequence for x-axis displacement measurement. The NAV echo is interleaved into the imaging sequence and is similar to an image echo, except that no phase encoding is applied. (b) A transverse section of the legs of a volunteer. Left leg moved from side to side while the right leg stayed stationary. The artifact caused by the left leg motion was marked with the arrow. (c) The X- NAV data obtained during the scan. The motion of the left leg and the static position of the right leg were clearly shown. (d) Motion corrected images using the NAV data from (c). The motion artifact of the left leg was eliminated. (Figures from Ehman and Felmlee, 1989) ........................................................... 15 Figure 2-8 k-space trajectory. (a) projection reconstruction; (b) spiral; (c) PROPELLER. ...................................................................................................... 16 Figure 4-1 Experimental setup. The top pane is an illustration of the position of the cameras and head coil in the MRI scanner bore. The bottom images show the real instruments. Cameras are highlighted with the red ellipse. ................................. 25 Figure 4-2 Calibration model. (a) Schematics of the phantom; (b) View of the actual phantom surface. .................................................................................................. 27 vii Figure 4-3 Pinhole camera geometry. OC is the camera center and also the origin of the camera coordinate system, Op is the principle point, M is a point in the space which has different coordinates in the world and camera coordinate systems, and md and mf are the image of M in the image plane with and without lens distortion. ............................................................................................................. 28 Figure 4-4 Camera calibration using a square target. (a) Projection of a square ABCD to the image plane; (b) The image and the four vanishing points { 1~V , 2~V , 3 ~V , 4 ~V }, and the horizon line H? ~ . ....................................................................... 31 Figure 4-5 Two views of the calibration target. The red crosses mark the extracted corners used for calibration, which have the number of 100 for each view. ....... 36 Figure 4-6 The epipolar geometry. M is a point in space, Oc and Oc? are the two camera centers. The camera baseline intersects each image plane at the epipoles e and e?. The camera centers, 3D space point M and its images m and m? lie in a common plane, called epipolar plane. The two lines me and m?e? are epipolar lines. ..................................................................................................................... 37 Figure 4-7 Calibration between the optical system and MRI. The calibration phantom was put in the field of view of both the cameras and the MRI scanner.43 Figure 4-8 Two images from the 3D scan of the calibration phantom. The red crossings mark the corners. The blue arrow marks an air bubble inside the phantom. .............................................................................................................. 44 Figure 4-9 Phantom corner position estimation in MRI coordinate system. (a) shows one side of a prism. Blue dots are the observed corner position along that side; green line is the estimated side and the red circle is the estimated corner position on the phantom surface. (b) shows the alignment between the extracted internal structure (green) of the phantom and an ideal model (red). The green lines are the side of the prisms and the red circles are the estimated corners position in MRI. ............................................................................................................................. 45 Figure 4-10 Epipolar lines estimated after stereo calibration. (a) shows image from one of the camera and two randomly chosen points. (b) shows the image from the other camera and the estimated epipolar lines corresponding to the two points in (a) respectively. .................................................................................................... 49 Figure 4-11 MRI physical and logical coordinate system. {OM, XM, YM, ZM} is MRI physical coordinate system, which doesn?t change after the MRI system is set up. {OL, XL, YL, ZL} is MRI logical coordinate system, which changes with different prescribed scan plane. XL is the frequency encoding direction, YL is phase encoding direction and ZL is slice selection direction. ........................................ 56 Figure 4-12 The change of RF frequency to select the same slice when motion occurs. Under the same slice-selective gradient, to follow the motion, the only way is to change the RF frequency. ..................................................................... 59 Figure 4-13 Slice selection sequences (RF pulse and z-gradient) and the three paths that generate the RF pulse, i.e. center frequency path, delta frequency path and phase path. t=0 is the start of TR and only one TR is shown here. .................... 60 Figure 4-14 Hybrid magnitude and phase image of the cylindrical phantom. Phase encoding gradient was zero. Frequency encoding was left-right direction. ........ 62 Figure 4-15 The hybrid space image after applying a linear RF frequency. Phase encoding gradient is set to zero. For each RF excitation, its frequency changes viii linearly with a step ?? . Perform FFT on each measured echo in k-space can we get a hybrid image. The magnitude of each line in the frequency encoding direction (shown with a rectangle shape) represents the object profile. The red line is to select an arbitrary point in the hybrid image, and the phases of those points also change linearly with a step ?? . Complex means to combine the magnitude and phase of the point in that line. ..................................................... 63 Figure 4-16 The hybrid phase image of the phantom when phase encoding gradient is set to zero. RF transmit frequency varies linearly along the phase encoding direction. (a) Phase correction is calculated with t? =264 s? ; (b) Phase correction is calculated with t? =180 s? ; (c) Phase correction is calculated with t? =230 s? ; (d) Phase correction is calculated with t? =480 s? . ....................... 65 Figure 4-17 Frequency encoding gradient and the acquisition window associated with it. Also the three paths that generate the receive frequency and phase. ...... 67 Figure 4-18 Hybrid phase image used for measurement of 't? . Phase encoding gradient is set to zero. RF receive frequency varies linearly along the phase encoding direction with a step of 100Hz. (a) Phase correction is calculated with 't? =144 s? ; (b) Phase correction is calculated with 't? =100 s? ; (c) Phase correction is calculated with 't? =200 s? . (d) Phase correction is calculated with 't? =300 s? . ......................................................................................................... 69 Figure 4-19 Software flowchart for the prospective head movement correction system for MRI. Calibration is all done offline and programmed with MATLAB. Real-time correction was programmed with C++ running in Linux. N is the number of total phase encoding steps for a scan. ................................................ 74 Figure 5-1 Distortion and correction for the camera. (a) Distortion model for the camera, containing radial and tangential distortion. The arrows and number on the curve mark the distortion direction and ......................................................... 76 Figure 5-2 Noise of the tracker. The maximum x translation error as a function of the number of tracked feature points. .................................................................. 78 Figure 5-3 Rotation and Translation parameters estimated from MRI images (solid lines)and from camera images transformed into MRI coordinates(dotted lines). The two sets of lines matching very well means the calibration between camera and MRI is accurate. ............................................................................................ 80 Figure 5-5 The MR hybrid image without phase encoding gradient. Small breathing induced head movement was observed during the scan of the volunteer. (a) without motion correction; (b) with motion correction. ...................................... 84 Figure 5-6 Prospective breathing induced motion correction for the 2D gradient echo acquisition. Images (a,b) show the evolution of motion parameters reported by the tracker during the acquisition of images (c) and (d) respectively. Image (c) was scanned without motion correction, and (d) was with motion correction; (e) and (f) were zoomed in on an area in the frontal brain indicated by the red square in images (c) and (d) respectively. ....................................................................... 86 Figure 5-7 Prospective motion correction data obtained for a volunteer that attempted to minimize head motion. Images (a,b) show the motion parameters reported by the tracker during the acquisition of images (c) and (d) respectively. Image (c) was scanned without motion correction, and (d) was with motion ix correction; (e) and (f) magnify an area in the front of the brain (the area marked by the red square in (c) and (d) respectively). ..................................................... 87 Figure 5-8 Example of the performance of the prospective motion correction method in the presence of large stepwise motion during the acquisition of multi-shot 2D gradient echo images on a normal volunteer. Images (a-c) show the motion parameters reported by the tracker during the acquisition of images (d-f) respectively. Image (d) was scanned without intentional motion during the scan, whereas (e,f) were scanned in the presence of motion; (e) was acquired without correction but (f) was acquired using the proposed method. (g-i) show additional detail in an area in the front of the brain that was indicated by the red square in images (d-f). ......................................................................................................... 88 Figure 5-9 Prospective motion correction in the presence of slow motion. The motion parameters reported by the tracker during the scan of images (d-f) are displayed in (a-c), respectively. Image (d) was acquired in the absence of intentional head motion, (e,f) during slow head motion, where, unlike (e), (f) was acquired with the proposed correction method. A magnification of the area in the red box in (d-f) is shown in (g-i). ........................................................................ 89 Figure 6-1 System setup. The camera was fixed right on top of the head coil. ....... 95 Figure 6-2 Method of the motion estimation using a single camera. N training EPI and camera images were acquired. From EPI images, 6DOF motion parameters were calculated from registration for each training data. During the real-time scan, the most similar camera image in the training set was found for the newly acquired camera image. The corresponding motion parameters from EPI were sent to MRI pulse sequence for prospective motion correction. ......................... 96 Figure 6-3 Motion parameters from the training data set. The total number of training images is 145. The motion is mainly rotation around z axis and translation in x axis. ........................................................................................... 101 Figure 6-4 Correlation of two randomly chosen images with all training images. Red crossings mark the chosen image, (a) the correlation between image No. 47 with others; (b) the correlation between image No. 100 with others. ................ 101 Figure 6-5 Prospective correction of large motion. (a) was acquired without intentional motion during the scan. (b c) were scanned in the presence of motion; (b) was acquired using the proposed method but (c) was without correction. (d,e) show the evolution of motion parameters during the acquisition of image (b,c), respectively. ....................................................................................................... 104 Figure 9-1 3D reconstruction error if camera parameters change. The solid lines projection is correct and the dotted projection is the result from different camera parameters with a small focal length and principle point difference. It?s obvious that the3D estimation magnify the camera parameters distinction. ................... 117 Figure 9-2 3D position estimation for a checker board pattern. Every circle represents a corner of the checker board. Red color shows position estimation from the left camera; blue color shows position estimation from the right camera; and green color shows those from the stereo. (a) when correct calibration parameters were used. (b) when focal length was changed with only 0.1mm. . 118 . 1 Chapter 1 Introduction Magnetic resonance imaging (MRI) has been widely used as a noninvasive clinical and research modality for the study of human anatomy. However, subject motion during scanning remains a severe problem and may degrade image quality below levels acceptable for clinical diagnosis. Recent improvements that yield high spatial resolution MRI of the brain (~ 0.2 mm) make this problem more acute, since for this application the tolerance to motion is reduced, and scan time is increased. The latter makes it more difficult for the subject to maintain the same position throughout the scan, which is especially problematic for children and patients. To compensate the effects of head motion during data acquisition, an in-bore optical motion tracking system is proposed. The system comprises one or two MR compatible infrared cameras that are fixed on a holder right above and in front of the head coil. The resulting close proximity of the camera(s) to the object allows precise tracking of its movement. During image acquisition, the MRI scanner uses this tracking information to prospectively compensate for head motion by adjusting gradient field direction and RF phase and frequency. Experiments performed on subjects demonstrate the system was able to improve MR image quality if motion occurred during acquisition. 2 Chapter 2 Review of Literature Magnetic resonance imaging (MRI) has been widely used as a noninvasive clinical and research modality for the study of human anatomy. It exploits the phenomenon of nuclear magnetic resonance (NMR) in an external magnetic field, whereby nuclei absorb and re-emit certain radio frequency (RF) waves during change in nuclear spin state. Image formation using NMR signals was developed by Lauterbur and Mansfield (Lauterbur, 1973; Mansfield and Grannell, 1973) in 1973 based on spatial encoding principles, which won them the 2003 Nobel Prize in Physiology and Medicine. Since then, MRI has undergone dramatic improvements in all the features that define image quality, such as resolution, signal-to-noise ratio (SNR), contrast enhancemant and speed. Specific structures, such as arteries, lesions, white matter fiber tracts, can also be visualized by manipulating RF fields and local magnetic field. 2.1 Principles of Magnetic Resonance Imaging 2.1.1 Nuclear Magnetic Resonance Signal The nucleus is composed of protons and neutrons, each of which intrinsically possesses angular momentum J , often called spin (Liang et al., 2000; Gerlach and Stern, 1924). Nuclei with an odd number of protons and/or an odd number of neutrons have a net spin. Because the nucleus is positively charged and possesses angular momentum, it possesses a magnetic moment ? also (Rabi et al., 1938). The relationship between the spin angular momentum and magnetic moment is 3 J?? = [2.1] Where ? is the gyromagnetic ratio. A related constant ? is also widely used, which is defined as ? ?? 2= [2.2] The value of ? or ? is nucleus-dependent. For hydrogen (proton), ? is 2.675?108 rad/s/T, ? is 42.58MHz/T. Because of its high concentration in the body in water and fat molecules, and their relatively high gyromagnetic ratio, the hydrogen proton is the source of signal for most clinical MRI exams. In the absence of an externally applied magnetic field, the hydrogen spins in biological materials have random orientations, therefore the vector sum of the magnetic moments, often referred to as net magnetization, denoted by 0M , is zero. However, if an external magnetic field 0B is applied, the spins will take only a discrete number of orientations, which equals to twice the nuclear-spin value plus one. For protons with a spin of ?, ? has two possible orientations with respect to 0B direction. Since spins in different orientations have different energy of interaction with the external magnetic field 0B , two energy states can be measured for protons, a high- and a low-energy state. The high-energy state is when spin is aligned opposing 0B , or spin-down ?E , while the low-energy state is when spin is aligned with 0B , or spin-up ?E . At equilibrium, the relative populations of the up and down states are given by the Boltzmann distribution as: 4 kT Bh kT Bh N N 00 1exp ?? ??? ? ?? ? ??= ? ? [2.3] Where h is Planck?s constant with the value h =1.0546?10-34J.s; k is Boltzmann?s constant, with the value k=1.381?10-23J/K; T is the absolute temperature of the sample in Kelvin. Since the net magnetization 0M is the vector sum of the individual magnetic moments, the excess of spins in the spin-up state results in an 0M aligned with the external field B0. To measure MRI signal by the RF receiver coil, 0M needs to be tipped away from the longitudinal direction (B0 direction) to the transverse plane (orthogonal to B0). This can be achieved by applying a radiofrequency (RF) pulse at the Larmor frequency, 00 B?? = [2.4] The RF pulse is normally turned on for a few microseconds or milliseconds. The created B1 magnetic field makes the spins to precess, As a result, the net magnetization is tipped away from B0 direction, creating a measurable transverse component xyM . At equilibrium, the fractional population difference kTBh 0? is very small, which is about 1 in 100,000 nuclei at 1.5T. Because the population difference is so small, the equilibrium net magnetization 0M is also small, making the measured MR signal xyM small. Therefore, to get a better signal-to-noise ratio (SNR) image, a stronger B0 is desirable. 5 2.1.2 Spatial Localization for Imaging A basic example of spatial localization is selective excitation and Fourier spatial encoding. To form an image, the first step is to localize the signals by exciting the magnetization only within a specified slice or slab of body (Bernstein et al., 2004; Lauterbur, 1973; Mansfield and Grannell, 1973). This slice selection is achieved by applying the RF pulse in the presence of a magnetic field gradient Gz. As given by the Larmor equation, the precession frequency is proportional to the field strength. The gradient therefore creates a variation in the Larmor frequency along the direction of the gradient. So, as shown in Figure 2-1, the RF pulse with a specific frequency c? excites only spins at a specific position cz because only those spins possess a resonant frequency that equals to the frequency of the applied RF pulse. The selection of another slice can be accomplished by changing the RF frequency, for example in Figure 2-1, increasing the RF frequency will excite a slice more superior while decreasing the RF frequency will select a slice more inferior. The thickness of the slice is determined by the bandwidth of the RF pulse and the amplitude of the gradient, as: zG z ? ??=? [2.5] Where z? is the slice thickness, Gz is the applied gradient field strength and ?? is the bandwidth of the RF pulse. 6 Figure 2-1 Slice selection. Activating a given magnetic-field gradient creates a linear correspondence between the magnetic-field strength and the position along the gradient. An RF pulse with center frequency c? excites spins located at position cz ; the bandwidth of ?? and the gradient strength determine the slice thickness z? . After the slice selection gradient is turned off, a phase encoding gradient (Gy) can be implemented. The magnetization accumulates a y-dependent phase when the y-gradient is kept on for a time y? : yG yyy ??? = [2.6] The encoded phase can be written in terms of the k-space spatial frequency (Twieg, 1983) in y direction: yk yy ?? 2= with yyy Gk ??= [2.7] The gradient Gy varies each time after excitation in a step-like fashion, to gather enough information for localizing the spin density in y direction. 7 The frequency encoding gradient Gx is applied after Gy. A negative dephasing lobe is followed by a positive lobe, during which an echo of the magnetization is formed and the signal is measured. The accumulated x-dependent phase is: xkxtG xxxx ??? 2== with xxx tGk ?= [2.8] The gradient Gx is a constant, and the time tx is a variable along the measurement period. These spatial encoding events can be depicted in a graphical format, called a pulse sequence. Figure 2-2 shows the basic gradient echo (GRE) sequence described above. Figure 2-2 Sequence diagram for a 2D gradient echo imaging sequence. RF pulse is a sinc function. Gz is the slice selection z-gradient. Gy is the phase encoding y-gradient, pictured as a series of horizontal lines to denote that it is being stepped regularly through increasing values during different repetition periods. Gx is the frequency encoding x-gradient and S is the measured echo signal. The spin density ),,( 0zyx? and the measured signal ),( yx kks satisfy the following equation: 8 dxdyykxkizyxkks yx fov yx ))(2exp(),,(),( 0 +?= ?? ?? [2.9] Therefore, sufficient k-space coverage is needed to reconstruct the image of the spin density with inverse 2D Fourier transform, yxyx fov yx dkdkykxkikkszyx ))(2exp(),(),,( 0 += ?? ?? [2.10] The k-space trajectory of the GRE imaging sequence is illustrated in Figure 2-3. The initial phase is 0== yx kk after each slice selection, corresponding to the center of the k-space. A phase encoding gradient moves the k-space trajectory vertically to a ky depending on the current Gy (red vertical line). The dephasing lobe of the frequency encoding gradient transverses the trajectory horizontally to kx,min (red horizontal line). The positive read gradient then brings the spins to kx,max, with signal being measured (red arrows). This gives one line in k-space, so the experiment is repeated by stepped changing Gy to obtain a complete set of data (black arrows). 9 Figure 2-3 k-space trajectory for 2D gradient echo imaging sequence. The signal starts from kx=ky=0 after each slice selection rephasing lobe. Phase encoding gradient moves the k-space trajectory vertically to a ky depending on the current Gy (red vertical line). The dephasing lobe of Gx transverses the trajectory horizontally to kx,min (red horizontal line). The reversed read gradient then brings the spins to kx,max through a complete set of kx values (red arrows). The experiment is repeated by changing Gy to obtain a complete set of data (black arrows). The MR signal is collected by uniformly sampling points in the k-space rather than by continuous monitoring, leading to discrete k-space coverage. ( )??? = ? = ?+????= 1 0 1 0 )(2exp),(),( M m N n yxyx yknxkmiynxmkks ?? [2.11] Where M and N are the number of sampled k-space data points in x and y direction; x? and y? are the pixel size. Therefore, the image of the spin densities can be reconstructed from the discrete inverse Fourier transform of the k-space data. 10 The total scan time required for 2D GRE acquisition is described by the following equation: NEXMTRTacq ??= [2.12] Where TR equals the repetition time and NEX equals the number of repetitions used (signal averaging). For a scan with TR 1s, phase encoding 256, and NEX 2, the total scan time is about 5 minutes. A higher resolution scan using much more phase encoding can prolong the scan time even more. Fast imaging strategies are therefore developed. 2.1.3 Fast Imaging ? Echo Planar Imaging The first ultra high-speed imaging technique was proposed by Mansfield in 1977 (Mansfield, 1977), named Echo Planar Imaging (EPI). Since then, various EPI techniques have been proposed. EPI encodes complete image position information in a single acquisition line rather than phase encoding a single line of k-space with every TR interval. Therefore, with a single excitation, an image can be formed. A GRE-EPI sequence is shown in Figure 2-4(a). Phase encoding of individual gradient echoes is through the use of a series of blipped Gy pulses. Figure 2-4(b) illustrates the k-space trajectory of the EPI sequence. 11 Figure 2-4 (a) An EPI sequence with blipped phase encoding gradient. (b) K-space trajectory for the corresponding EPI sequence. Because of the longer readout, EPI suffers more than GRE from various types of artifacts. In regions of high field inhomogeneities or at interfaces between tissues of different susceptibilities, image distortions or signal drop-out can be generated. The odd and even echoes shown in Figure 2-4(b) may have slight timing mismatches caused by imperfections in the gradients, which translate into phase errors and manifest themselves as duplicate images of the main image along the phase direction. 2.2 Motion Artifacts 2.2.1 Origin of motion artifacts For a slow high resolution structural scan, the head motion of an uncooperative volunteer causes motion artifacts primarily in the phase-encoding direction. The frequency encoding direction is less affected since the k-space data corresponding to the readout direction are collected over a relatively short period of a few milliseconds or less, and can be regarded as instantaneous on the time scale of most physiologic 12 motion. However, in the phase encoding direction, the positions are encoded by a step-wise changing magnetic field gradient prior to the acquisition of each line of k- space. Therefore, the time interval between neighboring two points in phase encoding direction is TR, which ranges from tens of milliseconds or several seconds and cannot be regarded as instantaneous. The motion will cause additional phase shifts to the spins from line to line, which will corrupt the spatial encoding and give rise to ghosting. Head motion is usually random motion, which makes the artifacts smeared out along the phase encoding direction and can obscure pathology (Figure 2-5). Figure 2-5 An axial brain image exhibits ghosting due to head motion. Since the movement is random, the ghosts are smeared out along the phase encoding direction (left/right). Recent improvements that yield high spatial resolution MRI of the brain (~ 0.2 mm, Figure 2-6) make this problem more acute, since for this application the tolerance to motion is reduced, and scan time is increased (Duyn et al., 2007; van Gelderen et al., 2007). The latter makes it more difficult for the subject to maintain 13 the same position throughout the scan, which is especially problematic for children and patients. Figure 2-6 A high resolution T2* weighted GRE brain image. The resolution is 200?200?1000 micrometer, and the scan time is about 5 minutes. 2.2.2 Correction for Motion Artifacts Several methods have been proposed to solve the head movement problem in MRI. All model head movement as rigid body motion with six degrees of freedom (DOF), namely three rotations and three translations along the MRI coordinate system. These parameters are then used to either retrospectively or prospectively compensate for the effects of motion on the image data. Retrospective motion correction addresses motion artifacts after the acquisition of a complete set of raw image data. While this might work well for in- plane motion, it is generally inadequate for through-plane motion, primarily because it cannot correct for the effects of this motion on the local magnetization history (i.e. 14 changes in saturation level of longitudinal magnetization due to motion-induced changes in the image-slice location). To avoid this problem prospective motion correction techniques have been proposed recently (Lee et al., 1998; Lee et al., 1996; Thesen et al., 2000; van der Kouwe et al., 2006; Ward et al., 2000). They track the head motion and rectify the acquisition planes correspondingly, by adjusting gradient direction and RF phases and frequencies. This avoids problems related to the effects of motion on spin history. The object motion parameters used by prospective and retrospective correction methods can be derived from MRI data using image or navigator based methods. Image based methods use image registration algorithms to detect the motion parameters (Friston et al., 1995; Friston et al., 1996; Jenkinson et al., 2002; Thesen et al., 2000). These techniques, while straightforward, can only calculate motion after acquiring a volume and are therefore always lagging the motion. Also, any ghosting and blurring artifacts that motion may cause may affect the accuracy of the registration algorithms. Navigator based motion correction methods acquire a motion-sensitive reference signal with the image (Ehman and Felmlee, 1989; Fu et al., 1995; Lee et al., 1998; Lee et al., 1996; van der Kouwe et al., 2006; Barnwell et al., 2007; Kadah et al., 2004). The earliest navigator method was capable of detecting only one- dimensional (1D) translation (Ehman and Felmlee, 1989) by employing a frequency encoding gradient but no phase encoding gradient before image acquisition. Soon after, orbital navigators (Fu et al., 1995; Ward et al., 2000), spherical navigators (Welch et al., 2002; Wyatt et al., 2005; Petrie et al., 2005), cloverleaf navigators (van 15 der Kouwe et al., 2006) and other navigators (Spuentrup et al., 2002; Thiel et al., 2002) (Wang et al., 2005; Pipe et al., 2002)were proposed to allow full tracking of 3D motion. However, the extra time required for measuring the navigator echoes led to an increased scan time. Figure 2-7 Navigator sequence and its application for motion correction. (a) X-NAV sequence for x-axis displacement measurement. The NAV echo is interleaved into the imaging sequence and is similar to an image echo, except that no phase encoding is applied. (b) A transverse section of the legs of a volunteer. Left leg moved from side to side while the right leg stayed stationary. The artifact caused by the left leg motion was marked with the arrow. (c) The X-NAV data obtained during the scan. The motion of the left leg and the static position of the right leg were clearly shown. (d) Motion corrected images using the NAV data from (c). The motion artifact of the left leg was eliminated. (source: Ehman and Felmlee, 1989) Self-navigating methods can also be used to retrospectively correct motion when combined with particular image acquisition techniques including projection acquisition (Glover and Noll, 1993; Glover and Pauly, 1992; Kim et al., 2008), spiral acquisition (Glover and Lee, 1995; Liu et al., 2004; Meyer et al., 1992; Hu and Glover, 2007), and PROPELLER (Pipe, 1999; Pipe and Zwart, 2006; Wang et al., 2005; Pipe et al., 2002). These all involve the collection of redundant MRI data, (a) (b) (c) (d) 16 usually including the central region of the k-space. The over-sampled central region of k-space provides intrinsic averaging of image features that reduces motion artifacts, and can be further used to correct spatial inconsistencies in position, rotation, and phase between acquisitions. One drawback of these methods is the increased scan time related to redundant k-space sampling. Figure 2-8 k-space trajectory. (a) projection; (b) spiral; (c) PROPELLER. Alternatively, external motion tracking systems can be used for MRI motion correction. These tracking systems rely on additional hardware, such as miniature coils (Derbyshire et al., 1998; Nevo et al., 2002), fiducial samples (Eviatar et al., 1997), optical reflectors (Eviatar et al., 1999) or stereo cameras (Dold et al., 2006; Tremblay et al., 2005; Zaitsev et al., 2006). Among them, stereo optical systems have been shown to give good motion correction results with reasonable accuracy retrospectively (Tremblay et al., 2005) as well as prospectively (Dold et al., 2006; Zaitsev et al., 2006). The stereo tracking system works in parallel with the scanner thus needs no extra scan time for motion detection in the imaging sequence. However, there are limitations to the current tracking systems, the most important being that the tracking target required for monitoring object position is either uncomfortable for the subjects (e.g., a mouthpiece used in Zaitsev et al., 2006), or (a) (b) (c) 17 hard to combine with close-fitting receiver arrays (e.g., a cap used in Tremblay et al., 2005). To overcome these limitations, we developed an in-bore video tracking system for prospective motion correction that monitors the position of features on the subject?s face. The system was designed to allow high resolution (~ 0.2mm) MRI at 7T in the presence of substantial head motion. 2.3 Stereo Tracking System Stereo tracking system includes two cameras, acquiring images of the same object simultaneously from slightly different view points. To determine the 3D motion of an object, the system needs to be accurately calibrated beforehand. The object?s motion parameters can be estimated by analyzing the image pairs captured by the stereo system. The tracking feature points are first detected in each image pair and the 3D position of these feature points are calculated through stereo triangulation. The 6- DOF motion parameters (3-DOF translations plus 3-DOF rotations) can be further estimated based on the 3D coordinates of these feature points. 2.3.1 Camera Calibration Camera calibration is a crucial issue in computer vision. It serves to determine the relationship between a 2D image perceived by a camera (or images perceived by multiple cameras) and the 3D information of the real object. Precise camera calibration is especially important for applications that involve metric measurements, such as dimensional measurements (Tomasi and Kanade, 1992), object motion 18 (Broggi, 1998; Charbonnier and Fournier, 1995), visual inspection (Newman and Jain, 1995), etc. Camera models usually involve of two sets of parameters: extrinsic and intrinsic. Extrinsic parameters, including a rotation and translation vector, model the geometrical relationship between the camera and the scene. Intrinsic parameters, including focal length, principal points, and distortions, model the internal geometry of a camera. They determine how to derive an image point position given the spatial position of the point with respect to the camera. The position and orientation of a stereo setup is also intrinsic since it is independent of scene structure. Intrinsic parameters remain constant after cameras are set up while extrinsic parameters may change with different scenes. The estimation of these two sets of parameters is called camera calibration, which allows one to relate the image measurements to the spatial structure of the observed scene. The existing camera calibration approaches can be classified into three categories: a) Linear closed-form methods. These methods construct a linear transformation to relate 3D points with their 2D projections (Abdel-Aziz and Karara, 1971; Faugeras and Toscani, 1986; Hall et al., 1982). Then least squares methods are used to get the closed-form solution. The advantage of these methods is the simplicity of the model and fast computation. However, they have the following disadvantages. First, the linear models don?t include lens distortion, which make these methods less accurate. Second, camera parameters are not explicit in these models, and constraints (i.e. rotation matrix should be orthonormal) are not considered. 19 b) Non-linear optimization methods. When lens distortion is considered in the camera model, the relationship between the 3D scene points and their 2D projections becomes non-linear. The non-linear error function is usually defined as the distance between 2D image points and the projection of the 3D points. Camera parameters can then be optimized iteratively by minimizing the defined error function (Faig, 1975; Slama et al., 1980). The advantage of these methods is that any distortion can be considered in the model and good accuracy can be achieved by reaching correct convergence. However, a good initial guess is required to guarantee correct convergence. c) Two-step methods. These methods first find closed-form solutions for some parameters, then do iterative optimization for the other or all of the parameters (Tsai, 1987; Weng et al., 1992b; Zhang, 2000). These methods take the advantages of the previous two methods and can always converge quickly to a correct solution due to the closed-form guess in the first step. Explicit intrinsic and extrinsic camera parameters and major distortions (namely, radial, tangential, decentering) can all be estimated with high accuracy in a short time. 2.3.2 3D Reconstruction 3D reconstruction computes the position of a point in 3D given its image in two views and the camera intrinsic parameters. The linear triangulation method (Bouguet, 1999; Hartley and Zisserman, 2003) back projects rays from the measured image points and find the intersection of the two rays for the 3D point. Since errors are inevitable in the measured points, the epipolar constraint (explained in Section 4.2.2 20 in detail) may not be satisfied and the intersection of the back-projected rays may not exist. Sampson error correction is applied to these measured points to correct first- order errors (Torr and Zisserman, 1997). This approximation is valid only when the error of the measured points is small. However, the epipolar constraint is still not satisfied exactly. A more costly triangulation method (Hartley and Sturm, 1997) can find the optimal 3D points by solving a sixth-degree polynomial. 2.3.3 Object Tracking Visual tracking is a very important component for systems with applications in robot control (Milella and Siegwart, 2006), surveillance (Coifman et al., 1998), human- computer interface (Gorodnichy and Roth, 2004), and visual reconstruction (Tomasi and Kanade, 1992). Various tracking methods have been proposed, which can be classified in four categories: model based (Kakadiaris and Metaxas, 2000; Zhou et al., 2004), region based (Hager and Belhumeur, 1998), contour based (Blake et al., 1993) and feature based (Shi and Tomasi, 1994; Yao and Chellappa, 1995; Zheng and Chellappa, 1993) methods. In this project, feature based tracking is used, because of its subpixel precision, speed, as well as robustness. The feature tracking process can be divided into two major subtasks: feature selection and feature tracking. Good features can be identified as edges or corners (Harris and Stephens, 1988) or zero crossings of the Laplacian of the image intensity (Marr et al., 1979). However, these feature selection criteria are often defined independently of the tracking algorithm. With the Kanade-Lucas-Tomasi (KLT) tracker, instead, a good feature is defined as the one that can be tracked well or that 21 can make the tracker work best (Lucas and Kanade, 1981; Tomasi and Kanatani, 1991). The tracker starts from Lucas and Kanade?s fast image registration technique that makes use of the spatial intensity gradient of the images to find a good match (Lucas and Kanade, 1981). After that initial work, Tomasi and Kanade developed a feature tracker based on sum of squared intensity differences (SSD) and assumed translational inter-frame motion (Tomasi and Kanatani, 1991). Then, Shi and Tomasi proposed an affine model to make the region matching better over longer time spans (Shi and Tomasi, 1994). Subsequently, Tommasini proposed an extension of this tracker to automatically reject spurious features, which made the tracker even more robust (Tommasini et al., 1998). Now, the KLT tracker has been widely used in many systems for high accuracy tracking, such as facial feature tracking, robot control, and flow velocity measurement. 2.3.4 3D Motion Estimation Computing a 3D rigid motion that maps a set of 3D points to another set is the basis of the absolute orientation problem (Horn, 1987; Horn et al., 1988). It is an important computer vision task and has numerous applications, such as robotics, pose estimation, and motion analysis. Solutions to this problem include closed-form and iterative methods, between which closed-form solution is preferred because of its robustness and efficiency. Iterative methods, however, may be trapped in local minima of the error function and always need a good initial guess. Because of these reasons, we adopted closed-form methods for our motion estimation. 22 Four major closed-form solutions are used for 3D rigid body transformations. The first one was proposed by Arun (Arun et al., 1987). In their derivation, rotation is represented as a standard orthonormal matrix and the solution is based on the singular value decomposition (SVD) of a covariance matrix of the data. However, this method will fail if the data is seriously noisy or becomes planar. Umeyama (1991) and Kanatani (1994) corrected this with different ways of derivation. Horn (1987) represented rotation using quaternions and the solution is the eigenvector corresponding to the largest positive eigenvalue of a symmetric matrix. Horn (1988) later also used an orthonormal matrix to represent rotation, and derived an alternative method that used manipulation of matrices and their eigenvalue-eigenvector decomposition. Walker et al (1991) presented a solution by representing rotation and translation together as a dual quaternion. These four algorithms were compared with respect to accuracy, stability and efficiency by Eggert et al (1997). The accuracy shows no discernible differences, while the SVD method (Arun et al., 1987; Umeyama, 1991) gives the best stability. Even though it is somewhat slower than Horn?s (1988) method, the superior accuracy and stability made us choose this method in our research. 23 Chapter 3 Objective The overall objective of this project is to design a new tracking system to prospectively correct head motion for MRI. This optical system should be compatible with MRI. It should be fast, robust, and accurate for tracking of head motion and should also be able to communicate with MR scanner before the MRI data are acquired. With the accomplishment of this project, sensitivity of brain MRI to head motion should be substantially reduced. Specifically, the objectives are: 1) To setup the optical tracking system. One or two MR compatible infrared cameras should be used for tracking with illumination of infrared LEDs. The tracking system needs to be able to determine the subject?s head position in real time (three rotations plus three translations). 2) To program MR pulse sequence to receive motion parameters from the optical tracking system and apply motion correction in real-time. Before each excitation, the MRI scan computer should be able to read the motion parameters to adjust the gradient field direction and RF phase and frequency in the imaging sequence to compensate the changes in position. 3) To evaluate the system accuracy and performance. Specifically, we need to evaluate the accuracy of the optical tracking system, the calibration accuracy of the combined optical and MRI system, and the acquired MRI image quality with and without motion compensation. 24 Chapter 4 Prospective Motion Correction for MRI using a Stereo Tracking System 4.1 System Setup MRI experiments were performed on a GE (Milwaukee, WI, USA) 7T MRI system, equipped with a detunable volume transmit coil and a 32-channel receive-only head coil (Nova Medical, Wilmington, MA, USA). The optical tracking system included two MR compatible infrared cameras (MRC Systems GmbH, Germany) as part of a stereo-vision system to measure the subject?s head motion in 3D. The size of the cameras was 28 x 18 x 30mm3. They were fixed on a holder right above and in front of the head coil (Figure 4-1). The distance and the angle between them were 12 cm and 28? respectively. Surrounding each camera lens, six infrared light emitting diodes were used to illuminate the field of view. The distance from the cameras to the face was around 8cm. The high spatial resolution of this setup (640 x 480 pixels, about 13 x 10cm2 field of view) allowed detection of very small movements. 25 Figure 4-1 Experimental setup. The top pane is an illustration of the position of the cameras and head coil in the MRI scanner bore. The bottom images show the real instruments. Cameras are highlighted with the red ellipse. The two cameras were connected to a filter box (MRC Systems GmbH, Germany) via the camera connector cable. This low pass filter, which has a cutoff frequency of 1MHz, was used to prevent interferences caused by the high frequency signals of the MR scanner, as well as possible infiltration of MR-frequency noise into the magnet room. The video output signals were then captured by a Matrox Morphis frame grabber (Matrox Electronic Systems Ltd. Quebec, Canada), using a standard BNC cable connection, by a tracking computer running a Linux operating system. Real-time tracking and motion estimation software was developed using C++. The tracking speed was 10Hz. The tracking computer communicated with the MRI scanner through a TCP/IP connection (Santos et al., 2004) to provide the console (scan computer) with the real-time motion parameters. This allowed the image acquisition pulse sequence to alter gradient direction and RF offset phases and frequencies to compensate for the changes in object position. 26 4.2 System Calibration A precise calibration is necessary to achieve the goal of motion correction for high- resolution MRI. A three step calibration procedure was developed, which included the calibration for each single camera, the calibration for the stereo, and the calibration between the stereo cameras and the MRI scanner. To achieve this, a calibration phantom was designed, the internal of which consisted of a grid of rectangular prisms (Figure 4-2a) carved out of a plastic form. The prisms were all of equal size, 10 x 10 x 18mm3 (length x width x height), and were 10 x 10mm2 apart, except those on the four corners, which were cut into triangular prisms to provide differentiation in the height direction. The open spaces in the phantom were filled in with saline to make them visible to MRI. The cover of the phantom was 2.4mm thick. On the surface of the phantom, a piece of paper with a black-and-white checkerboard representation of the internal grid was attached (Figure 4-2b). This design provided many corners for calibration, and the clear sharp corners improved calibration accuracy. 27 (a) (b) Figure 4-2 Calibration model. (a) Schematics of the phantom; (b) View of the actual phantom surface. 4.2.1 CCD Camera Calibration Camera calibration is required to build a model that can map between a 3D world and a 2D image. The pinhole model is a widely used one and its geometric illustration is presented in Figure 4-3: In this camera model, a 3D point TWWWW ZYX ],,[=M , is transformed from the world coordinate system to the camera coordinate system, TCCCC ZYX ],,[=M . Through central projection, the ideal (undistorted) image coordinate of the object point is [ ]Tuuu yx ,=m . However, common cameras are often characterized by a somewhat distorted projection behavior which causes the observed (distorted) image point shifting to [ ]Tddd yx ,=m . 28 Figure 4-3 Pinhole camera geometry. OC is the camera center and also the origin of the camera coordinate system, Op is the principle point, M is a point in the space which has different coordinates in the world and camera coordinate systems, and md and mu are the image of M in the image plane with and without lens distortion. In practice, the image in the image plane will be further sampled by a frame grabber card and stored in an image memory buffer. The relationship between a point WM in the world coordinate system and its image pm in image pixel coordinate can be represented as ( )CMRM ?= WC [4.1] ?? ? ?? ?== ?? ? ?? ?= ?? ? ?? ?= n n n cc cc u u u y xff ZY ZXf y x mm / / [4.2] ( ) ( ) ( ) ( )?? ? ?? ? +++++ +++++= ?? ? ?? ?= 22 12 4 2 2 1 22 21 4 2 2 1 221 221 uuuu uuuu d d d yrpyxprkrky xrpyxprkrkx y xm [4.3] Where 222 uu yxr += ?? ? ?? ? + += ?? ? ?? ?= 0 0 yyD xxD y x dy dx p p pm [4.4] 29 R and C are rotation matrix and translation vector which characterize the transformation between the world coordinate system and the camera coordinate system; f is the focal length; nm is the normalized coordinate; 1k and k2 are the radial lens distortion coefficients; p1 and p2 are the tangential lens distortion coefficients; Dx, Dy are the number of pixels per metric unit distance (For most imaging sensors currently manufactured, pixels may be assumed perfectly square, implying Dx=Dy. In the general case, they may be different); ),( 00 yx are the coordinates of the optical center (principal point) in the image coordinate system. The objective of the camera calibration procedure is to determine optimal values for these parameters based on image observations of a known calibration target. Since any calibration point WM and its corresponding image in the CCD sensor pm should satisfy Equation 4.1-4.4, the unknown parameters can be determined when the computed projection of a known structure (i.e. a checkerboard pattern) matches the best with the observed projection on the image. If enough (larger than the number of unknown camera parameters) calibration point pairs are given, the camera parameters can be estimated. A closed-form solution for the unknown parameters without considering lens distortion gives a good initial guess. In absence of lens distortion, the projection of WM to pm can be simplified as: ?? ? ?? ? + += ? ? ? ? ? ? ? ? ? ? ? ? + + =? ? ? ?? ?= 0 0 0 0 yyf xxf yZYfD xZXfD y x ny nx c c y c c x p p pm [4.5] 30 Where xx fDf = and yy fDf = represent the focal length of the camera in terms of pixel dimensions in x and y direction. Denoting the homogeneous coordinates of a vector [ ]Tyx ?,,=x by x~ , i.e. [ ]Tyx 1,,,~ ?=x , Equation 4.1 and 4.5 can be written as: [ ] [ ] WWW cy x p yf xf s MPMTRKMRCR0IK Mm ~~~ 10 ~ 0 0 0 1 ~ 0 0 ==? ? ? ?? ? ?= ? ? ? ? ? ? ? ? ? ? = [4.6] Where s is an arbitrary scale; K is a 3?3 matrix, mapping the normalized image coordinates to the image pixel coordinates; RCT ?= ; and P is a 3?4 matrix, called the perspective projection matrix. In matrix P, there are a total of 10 unknowns: 4 for K, 3 for R and 3 for C. The parameters contained in K are called the intrinsic camera parameters (distortion parameters are intrinsic parameters too, but are not estimated in this closed-form solution), and those in R and C are extrinsic parameters. Vanishing points can be used for calibration for its simplicity and ease of computation (Bouguet, 1999; Daniilidis and Ernst, 1996; Wang and Tsai, 1991). Figure 4-4 shows a perspective image of a square ABCD in space. 1nm , 2nm , 3nm and 4nm are the four corners of the square observed in the image plane after normalization, which means: piy x ni yf xf s mm ~ 1 /1 /1 ~ 0 0 ? ? ? ? ? ? ? ? ? ? ? ? = [4.7] 31 Figure 4-4 Camera calibration using a square target. (a) Projection of a square ABCD to the image plane; (b) The image and the four vanishing points { 1~V , 2~V , 3~V , 4~V }, and the horizon line H?~ . By assuming the optical center is at the center of the image, the intrinsic parameters to be estimated are therefore xf and yf only, and Equation 4.7 can be simplified as: ipiy x ni sf f s pKpm ~~ 1 0/1 0/1 ~ 1?= ? ? ? ? ? ? ? ? ? ? = [4.8] Where Kp is the intrinsic camera matrix without principle points, and )4,1(~ ?=iip is the homogeneous pixel coordinates of the corners after subtraction of the optical center. From the geometry in Figure 4-4, the two vanishing points 1~V and 2~V can be determined by: 32 21 432 144 323 211 432 211 ~~~~ ~~~~ ~~~~ ~~~~ ~~~~ ~~~~ ~~~~ VV? ??Vmm? mm? ??Vmm? mm? ??? ? ? ? ??? ? ??? ? ? ? ?? ?? ??? ? ? ? ?? ?? H nn nn nn nn [4.9] Where ? is the standard cross product. All the equalities in Equation 4.9 are defined up to scale. However, the normalized coordinates of the corners are not available since the focal lengths are unknown. But the pixel coordinates of the four corners )4,1(~ ?=iip are available, so Equation 4.9 should be written in pixel unit: p HpH p p p p p pp ppp p ppp p p p p p pp ppp p ppp ?K? VK?K?KV?KpKpK? ?KpKpK? VK?K?KV?KpKpK? ?KpKpK? ~~~ ~~)~()~(~~~ ~)~()~(~~ ~~)~()~(~~ ~~)~()~(~~~ ~)~()~(~~ ~~)~()~(~~ *1 2 1 4 *1 3 *1 2 4 *1 1 1 4 1 4 3 *1 3 1 2 1 3 1 1 2 *1 1 *1 1 2 *1 4 1 3 1 2 1 *1 2 1 1 1 1 ? ??? ??? ??? ??? ??? ??? ?? ? ? ? ? ? ? ? ????? ? ??? ??? ??? ????? ? ??? ??? ??? [4.10] Where *pK is the adjoint of Kp. Since K is a 3?3 matrix, and u, v are 3-vectors, we have )()()( * vuKKvKu ?=? (Faugeras and Robert, 1996). Because K is invertible, 1* )( ?= TKKK , therefore KK ?~** . And also explicitly, pp H ppp p p ppp p p 21 432 144 323 211 432 211 ~~~~ ~~~~ ~~~~ ~~~~ ~~~~ ~~~~ ~~~~ VV? ??Vpp? pp? ??Vpp? pp? ??? ? ? ? ??? ? ??? ? ? ? ?? ?? ??? ? ? ? ?? ?? [4.11] Because the two vanishing points 1~V and 2~V are mutually orthogonal: 0~)()~( )~()~(~~ 2 11 1 2 1 1 1 21 =? ??? ?? ?? p p T p Tp p p p p VKKV VKVKVV [4.12] 33 In Equation 4.12, p1~V and p2~V can be calculated from Equation 4.11, and are denoted here as Tp cba ],,[~~ 1111 ?V and Tp cba ],,[~~ 2222 ?V , so Equation 4.12 provides one constraint in the focal lengths: 021221221 =++ ccfbbfaa yx [4.13] Another constraint for the focal lengths comes from the fact that the diagonals of the square ABCD are orthogonal too. Therefore, similarly, we get: ?? ??? ??? ???? ?? ??? ??? ??? ??? ??? ??? ??? p p p Hp p p p p p Hp p p p ppp p ppp 4 1*1 6 *1 4 3 1*1 5 *1 3 6 *1 4 1 2 1 6 5 *1 3 1 1 1 5 ~ ~)~()~(~~ ~~)~()~(~~ ~~)~()~(~~ ~~)~()~(~~ VK?K?KV VK?K?KV ?KpKpK? ?KpKpK? [4.14] Then, since 0~)()~( )~()~(~~ 4 11 3 4 1 3 1 43 =? ??? ?? ?? p p T p Tp p p p p VKKV VKVKVV [4.15] We get another constraint for focal lengths with denoting Tp cba ],,[~~ 3333 ?V , and Tp cba ],,[~~ 4444 ?V : 043243243 =++ ccfbbfaa yx [4.16] So, xf and yf are the solution of Equation 4.13 and 4.16. Next, the extrinsic parameters can be initialized through homography (Zhang, 2000). Without loss of generality, the calibration pattern plane is assumed to be on 0=WZ plane. Denote the ith column of the rotation matrix R by ri. From Equation 4.6, we have 34 [ ] [ ] ? ? ? ? ? ? ? ? ? ? = ? ? ? ? ? ? ? ? ? ? ? ? = ? ? ? ? ? ? ? ? ? ? 1101 21321 W W W W p p Y XYX y x s TrrKTrrrK [4.17] Therefore, a calibration point WM~ (since ZW is 0, TWWW YX ]1,,[~ =M ) and its image pm ~ is related by a homography H: Wps MHm ~~ = with [ ]TrrKH 21= [4.18] To calculate the homography, we first define a 9? 1 vector Ha as TTTT ],,[ 321 hhh , where T ih (i=1,2,3) is the i th row of H. Then Equation 4.18 can be written as 0~~ ~~ ~ 0 ~ 0 ~ ~ 1 3 2 1 3 2 1 = ? ? ? ? ? ? ? ? ? ?? ?? ?? ? = =? =? ? ? ? ? ? ? ? ? ? ? ? = ? ? ? ? ? ? ? ? ? ? aHMM0 M0M Mh Mh Mh M h h h T Wp T W T T Wp TT W W T W T p W T p W T T T p p y x s sy sx y x s [4.19] When all the corners in our calibration checkerboard pattern were used(Figure 4-2b), which was 100, we had 100 of the above equations. These can be written as 0LHa = , where L is a 200? 9 matrix. Therefore, the solution of Ha is the right singular vector of L associated with the smallest singular value. Note that H is defined up to a scale factor. Denoting H by its column vector ][ 321 hhhH = , from Equation 4.18, we get ? ? ? ??? ? = ?= = = ?= ? ? ? ? 3 1 213 2 1 2 1 1 1 1 21 ][ hKT rrr hKr hKr HKTrr ? ? ? ? [4.20] 35 To assure r1 and r2 are unit vectors, 2/11 2 1 1 1 ? ? ? ? ? ? ? ? += ?? hKhK? (or choose either 1 1 1 hK ?=? or 21 1 hK ?=? since they are very close). However, because of noise, the computed ][ 321 rrrR = does not in general satisfy the properties of a rotation matrix, therefore we need to do the following: 213 222 12122 111 / , / rrr rrr rrrrr rrr ?= ? ?? ? [4.21] Where ??? means ?to be replaced by?, and <,> is the standard inner product. The solution of the intrinsic parameters from Equation 4.9 ? 4.16, and the extrinsic parameters from Equation 4.18 ? 4.21 assumes no lens distortion of the cameras. However, the MR compatible cameras used in this project exhibited significant lens distortion, especially radial distortion. So to estimate all the camera parameters, we need to use the already calculated parameters as an initial guess for the minimization of the nonlinear cost function: 2 1 1 212100 ),,,,,,,,,,(?? = = ? M i N j wjiiyxppij ppkkyxff MTRmm [4.22] Where ),,,,,,,,,,( 212100 wjiiyxp ppkkyxff MTRm is the projection of point WjM from the world coordinate system into the ith calibration image according to Equation 4.1 ? 4.4. The 3?3 rotation matrix Ri has 3 DOF only and is converted to a 3?1 vector through Rodrigues formula for computation efficiency (Faugeras, 1993). pijm is the measured pixel coordinate. M is the total number of views used for calibration 36 and N is the number of corners on the calibration target in each view. It has been shown that calibration errors decrease when more images are used (Zhang, 2000), so 20 ~ 30 calibration images were used in our project. Figure 4-5 shows two calibration images, with red ?+? marking the N=100 corners (Harris and Stephens, 1988). The nonlinear minimization is solved with sparse Gauss-Newton algorithm (Hartley and Zisserman, 2003, Section 9.3). Figure 4-5 Two views of the calibration target. The 100 red crosses on each image mark the extracted corners used for calibration. 4.2.2 Stereo Calibration After calibration of both cameras, the relationship between the two cameras can be estimated. Let Oc and Oc? be the camera centers of the left and right cameras and I, I? the corresponding image planes respectively (Figure 4-6). Given a point M in the world, its projection on the left image is m, and on the right image is m?. For every point on the line OcM, their projections in the left image are all m, but their projections in the right image are on a line 'ml , which is called the epipolar line of m. The line 'ml is the intersection of the plane I? with the plane ? , defined by m, Oc and (a) (b) 37 Oc?, which is called the epipolar plane. Because the epipolar geometry is symmetric, point m? in the right image corresponds to an epipolar line lm? in the left image, which is the intersection of the same plane ? with the left image plane I. Furthermore, all epipolar planes (with different M) contain the line OcOc?. They all intersect I at a common point e and intersect I? also at a common point e?, which are called epipoles. Figure 4-6 The epipolar geometry. M is a point in space, Oc and Oc? are the two camera centers. The camera baseline OcOc?intersects each image plane at the epipoles e and e?. Oc , Oc?, M and its images m and m? all lie in a common plane, called epipolar plane. The two lines me and m?e? are epipolar lines. The epipolar geometry provides a computational significance in stereo vision that for a point in one image, its correspondence in the other image must lie on the epipolar line, and therefore the search space for a correspondence is reduced from 2D to 1D. But to get that, the rotation Rs and translation Ts between the two camera coordinate systems need to be determined. Given a set of points in space Mij (i=1,?,M represents the number of images used for calibration, j=1,?,N represents the number of corners in each calibration image), and let Tcijcijcijcij ZYX ],,[=M , Tcijcijcijcij ZYX ],,[ '''' =M be the Euclidean coordinates of Mij in the two camera coordinate systems, then 38 ss TMRM += cijcij ' [4.23] From each camera?s calibration step, we get the extrinsic parameters. Also, since the two cameras get images of the same object simultaneously, they have the relation: iiiicijiicij iwjicij iwjicij TRRTMRRM TMRM TMRM 1''1'' ''' ?? ?+=? ?? ??? += += [4.24] Therefore, with the M calibration images, we get M rotation and translation estimations. The median of these is used as the initial guess to search for the optimal Rs and Ts by minimizing the following cost function: 2 1 1 ' 2 ' 1 ' 2 ' 12121 ),,,,,,,,,,,,,,(?? = = ? M i N j wjiippij ppkkppkk MTRTRK'Kmm ss (( [4.25] Where [ ]TTpijTpijpij ',mmm =( is the observed corner position in both the left and right images; wjM is the point in world coordinate system; ),,,,,,,,,,,,,,( '2'1'2'12121 wjiip ppkk'ppkk MTRTRKKm ss( is the projection of a space point Mwj to the two cameras by combining Equation 4.1 ? 4.4 and 4.25; K and K? are the calibration matrix of the two cameras, including 00 ,,, yxff yx and ' 0 ' 0 '' ,,, yxff yx ; 2121 ,,, ppkk and ' 2 ' 1 ' 2 ' 1 ,,, ppkk are distortion parameters; Ri, and Ti are the extrinsic parameters corresponding to the ith calibration image captured by one camera. The nonlinear minimization is also solved with the sparse Gauss-Newton algorithm (Hartley and Zisserman, 2003). Note that in Equation 4.25, all the intrinsic parameters of the two cameras are to be refined even though they have been determined already after calibration of each camera. The reason to do it here is that 39 the epipolar constraint can provide some refinement on these parameters for better calibration accuracy. 4.2.3 3D Reconstruction from Stereo With all the parameters of the stereo known, from any corresponding image points m and m? in the two camera images, we can retrieve the 3D coordinate of that point based on stereo triangulation (Hartley and Zisserman, 2003; Hartley and Sturm, 1997; Bouguet, 1999). From epipolar geometry in Figure 4-6, m, m?, Oc and Oc? must lie in the epipolar plane, therefore the back-projection of the two lines Ocm and Oc?m? intersect in 3D at the point M. The computation of a 3D point position from its image in two 2D views is called stereo triangulation. Let Tnnn yx ],[=m and Tnnn yx ],[ ''' =m be the normalized coordinates for points m and m? respectively, calculated using Equation 4.2 ? 4.4 after the two cameras being calibrated. Let Tcccc ZYX ],,[=M and Tcccc ZYX ],,[ '''' =M be the coordinates of M in the two camera coordinate systems respectively, from Equation 4.2, we can get: ''' ' '' ~ 1 ,~ 1 ncn n ccncn n cc Zy x ZZy x Z mMmM = ? ? ? ? ? ? ? ? ? ? == ? ? ? ? ? ? ? ? ? ? = [4.26] With the known Rs and Ts from stereo calibration, and ss TMRM += cc' from Equation 4.23, we get: ss TmRm += ncnc ZZ ~~ '' [4.27] 40 Since cZ and 'cZ are the unknowns, the above equation can be rewritten as: [ ] ss TmmR =? ? ? ?? ?? ' '~~ c c nn Z Z [4.28] Denote ]~~[ 'nn mmRA s?= (a 3?2 matrix), the least square solution is then: sTAAA TT c c Z Z 1 ' )( ?=? ? ? ?? ? [4.29] If there is no noise in m and m?, Equation 4.29 returns the exact 3D position of the point M, which is the intersection of the two back-projecting rays Ocm and Oc?m?. However, if noise exists, the two rays don?t intersect in general, it is thus necessary to estimate a best solution for the point in 3D world. This can be realized by Sampson correction of the measured points (Hartley and Zisserman, 2003). But before Sampson correction, the epipolar constraint needs to be reiterated here. For Equation 4.27, by taking the cross product with Ts, followed by the inner product with '~ nm , we obtain 0)~(~ ' =?? nn mRTm ss . This can also be written as: ss RTEmEm ?== ][0 ~~ ' where n T n [4.30] E is called the essential matrix, and ?][ sT is the skew-symmetric matrix of Ts, defined as: ? ? ? ? ? ? ? ? ? ? ? ? ? =? 0 0 0 ][ 12 13 23 ss ss ss TT TT TT sT [4.31] Equation 4.30 mathematically describes the epipolar constraint illustrated in Figure 4-6. In particular it means that '~ nm lies on the epipolar line nm mEl ~' = . 41 A typical observation of a noisy point correspondence 'nn mm ? doesn?t in general satisfy the epipolar constraint. They have to be corrected to 'ncnc mm ? which have minimum distance to the measured 'nn mm ? but satisfy the epipolar constraint 0~~ ' =ncTnc mEm exactly. To emphasize the dependency on TTnTn ],[ 'mmx = , we will write the epipolar constraint as 0)(C =xE . This cost function may be approximated by a Taylor expansion to first order: x E ExE ?xx?x ? ?+=+ C)(C)(C [4.32] The corrected TTncTncc ],[ 'mmx = satisfy the constraint, i.e. 0)(C =cE x . Since xc ?xx += , we have 0 C)(C = ? ?+ x E E ?xx . Denote the partial derivative matrix (Jacobian matrix) x?? EC as J and the cost )(C xE as ? , the Sampson correction term x? is thus the solution of equation ?x ?=J? : ?TTx 1)( ??= JJJ? [4.33] The corrected point is therefore: ?TTc 1)( ??= JJJxx [4.34] Explicitly, the cost here is nTn? mEm ~~ '= , and the Jacobian matrix is: ])~(,)~(,)~(,)~[( 212'1' nnnTnT? mEmEmEmExJ =??= [4.35] The elements of J in the previous equation are computed by the chain rule as: 1 '' '' 1 ) ~(~)0,0,1(~ ~~~~~ n T n T n n T n n n n n T n n xxx ? mEmE m mEmmmEmJ == ? ? ? ?= ? ?= ? ?= [4.36] 42 Then from Equation 4.34 the corrected point is: ?? ? ? ? ? ? ?? ? ? ? ? ? +++??? ? ? ? ? ? ? ? ? ? ? = ? ? ? ? ? ? ? ? ? ? ? ? 2 1 2 ' 1 ' 2 2 2 1 2 2 '2 1 ' ' ' ' ' ' )~( )~( )~( )~( )~()~()~()~( ~~ n n n T n T nnn T n T n T n n n n n nc nc nc nc y x y x y x y x mE mE mE mE mEmEmEmE mEm [4.37] After Sampson error correction, the 3D position of a point can be estimated using Equation 4.29. But since Sampson correction is a first-order correction, it is accurate only if the noise in each image is small, making it a suboptimal solution. The optimal solution could be found by minimizing the function: 0~~),(),(),(C '2''2' =+= ncTncncnncnnnE tosubjectdd mEmmmmmmm [4.38] Where d(*,*) is the Euclidean distance between the points. This cost function can be minimized using Newton iteration algorithm, but it is more computationally expensive. The Sampson error correction therefore gives a close solution but a much cheaper computation. 4.2.4 Integrated Optical and MRI System Calibration The optical tracking system measured the motion parameters in the camera coordinate system. These needed to be converted to the MRI coordinate system to be used by the imaging sequence. Therefore, an accurate calibration was necessary to estimate the rotation (Rmc) and translation (Tmc) between the camera and MRI coordinates. To achieve this, the phantom was put in the target field (approximately the position of a subject?s face) within the field of view of the scanner and video cameras (Figure 4-7). A 3D image of the phantom was acquired by the MRI scanner (resolution 0.4 x 0.4 x 0.7 mm3, TR=55ms, TE=6ms, FA=10?, Slice Number=40) and 43 the stereo measured the 3D position of the calibration corners on the surface pattern based on stereo triangulation including distortion correction for the cameras (Section 4.2.3). Figure 4-7 Calibration between the optical system and MRI scanner. The calibration phantom was put in the field of view of both the cameras and the MRI scanner. To estimate the surface corner positions in MRI coordinates, the internal structure needed to be extracted. Figure 4-8 shows two slices from the 3D scan of the calibration phantom. We used corners to describe the internal structure, which when connected are the sides of prisms. Harris corner detection (Harris and Stephens, 1988) was applied on these MR images. A potential problem may be caused by air bubbles inside the phantom, i.e. the blue arrow in Figure 4-8a, which could result in a false corner. To correct this, lines were fitted on the corners and the crossings of these lines were marked as real corners, shown with red ?+? in Figure 4-8. The MRI coordinates of these corners can be determined from their pixel coordinates and the slice position information from DICOM image header (Section 9.2). 44 Figure 4-8 Two images from the 3D scan of the calibration phantom. The red crossings mark the corners. The blue arrow marks an air bubble inside the phantom. For each side of the prism, we have forty positions in MRI coordinates since we scanned 40 slices. They were used to estimate the corresponding corner position on the phantom surface. Figure 4-9a shows the estimation of one side of a prism from the extracted corners. Blue dots were the observed corners (Figure 4-8), which obviously did not lie on a line because of noise. To avoid this, a line that minimizes the perpendicular distances from each of the points to the line needs to be fitted. Principal Components Analysis (PCA) is a good solution for 3D line fitting. Denote all the blue dots as Mmi=[Xmi,Ymi,Zmi]T and Mm=[ Mm1, Mm2,? Mmk] (i=1,2,?,k) where k is the total number of slices (k equals to 40 here). Principal components (P=[p1 p2 p3]) of Mm are the eigenvectors of the covariant matrix of Mm: DPPMM TTmm = [4.39] Since Mm is of size 3?k, PCA generates three orthogonal components. The first principal component (denote as p1) gives the direction vector of the line because (a) (b) 45 it explains as much of the variance in the points as is possible with one dimension (green line in Figure 4-9a). The second and third components (p2 and p3) define directions perpendicular to the line, and are the error term in the fitting (blue lines in Figure 4-9a). The function of the fitted line is therefore: 1)( pMl tmean nf += [4.40] Where t is the free variable for the line function. Then the position of the corresponding corner on the surface can be estimated because the thickness of the phantom cover is known as 2.4mm (red circle in Figure 4-9a). Figure 4-9 Phantom corner position estimation in MRI coordinate system. (a) shows one side of a prism. Blue dots are the observed corner position along that side; green line is the estimated side and the red circle is the estimated corner position on the phantom surface. (b) shows the alignment between the extracted internal structure (green) of the phantom and a simulated ideal model (red). The green lines are the side of the prisms and the red circles are the estimated corners position in MRI (all dimensions are in mm). By performing the line fitting on each side of all the prisms inside the phantom, one can estimate all the corner positions on the surface. These corners do not lie in a plane exactly because of noise. So they were further registered to a simulated phantom surface (checkerboard pattern). The corners on the simulated (a) (b) 46 phantom were transformed into the MRI coordinate system and were the corrected positions used for calibration. Figure 4-9b shows the results after alignment. Red prisms represent the simulated phantom, which has 10 x 10 x 18 mm3 rectangular prisms in the middle and triangular prisms on the four corners. Green prisms are structures extracted from the MRI 3D image. Red circles represent the calibration corner positions (on the camera-visible checkerboard grid, 2.4 mm from the liquid surface). Therefore, based on the simultaneous measurements of the same points in both the MRI scanner and stereo coordinates, the rotation (a 3?3 orthonormal matrix Rmc) and translation (a 3?1 vector Tmc) between them can be calculated using a singular value decomposition (SVD) method (Section 4.3.3). . 4.3 Real-time Motion Calculation 4.3.1 Kanade-Lucas-Tomasi Tracker The Kanade-Lucas-Tomasi (KLT) tracker (Lucas and Kanade, 1981; Tomasi and Kanatani, 1991; Shi and Tomasi, 1994) is a feature tracker widely used in the computer vision community. It offers sub-pixel accuracy and is very computationally efficient, therefore is utilized for the head tracking in this project. Denote ),( tI x as the image intensity, where x is the position of a point, and t is time. As the object moves, the image intensity changes in a complex way, but if the camera grabbing frequency is sufficiently high, or the movement between successive images is sufficiently small, we can model the two successive images as: 47 )(),(),( xxdx ntItI +=++ ? [4.41] Where ? is the time difference between sampling of the two frames; d is the displacement vector (dx,dy)T; and n is noise. The displacement vector d is the one that can minimize the residual error: [ ]2),(),()( ? ?++= W ii tItI xdxx ?? [4.42] Where xi are points in the Gaussian window W centered on x. When the displacement vector is small, the intensity function can be approximated by Taylor series expansion truncated to the linear term, as: dxxxdx ),(),(),( ??? +??++=++ tItItI iii [4.43] By dropping the time term in the intensity for simplicity and denoting ),(),()( tItIh iii xxx ?+= ? , ),()( ?+??= tI ii xxxg , we can rewrite the residual error as: [ ]2)()()( ? += W iih dxgxx? [4.44] To find the displacement d, we let the derivative of the error to zero: [ ] ?? ? =? =+=?? W ii W i T i i W ii h h )()()()( 0)()()(2)( xgxdxgxg xgdxgxdx? [4.45] In other words, we must solve the equation: eGd = [4.46] 48 Where G is a symmetric 2?2 matrix ?= W i T i )()( xgxgG and e is a 2?1 vector ?= W iih )()( xgxe . However, because of the linearization of Equation 4.43, the solution of d is only approximately correct. Iteration in Newton style is therefore needed to search for the right displacement by resampling )( ih x with new d in each step. Iteration is stopped when difference in d is negligible, which usually takes less than 5 iterations. It can also be seen from Equation 4.46 that to get a reliable displacement, both eigenvalues of G must be larger than noise level. Explicitly, two small eigenvalues mean a roughly flat intensity within a window; a large and a small eigenvalue correspond to an edge and two large eigenvalues represent corners. So, to select good reliable features, we accept a window if: ??? >)2,1min( [4.47] Where 2,1 ?? are the eigenvalues of G and ? is a predefined threshold. 4.3.2 Stereo Matching In Section 4.2.3, it was shown that the position of a 3D point can be reconstructed from two perspective images. To achieve a correct 3D reconstruction, point correspondences between two images need to be established. A large amount of work has been carried out for matching different images of a single scene (Chou and Chen, 1990; Weng et al., 1992a), but the results are not satisfactory if the only geometric constraint, i.e., the epipolar constraint, is not used. In our application, after 49 stereo calibration, the epipolar geometry is known, therefore robust correspondences can be established between the two images (Zhang et al., 1994). After applying KLT on one of the camera image, we got N feature points suitable for tracking. For each point mi (i=1,2,?,N) in image 1, a corresponding epipolar line can be found as nim mEl ~' = , where E is the essential matrix defined in Equation 4.30, nim~ is the normalized homogeneous coordinate for point mi. Figure 4-10 shows the two epipolar lines (red lines in b) calculated for two randomly picked points (red ?+? in a). Both lines contain the corresponding points. Therefore the search range for a matching point is narrowed to a band centered on the epipolar line. All corners 'ijm (j=1,?,number of corners) are detected in this band as candidates (Harris and Stephens, 1988) and correlations are computed to search for the corresponding point. (a) (b) Figure 4-10 Epipolar lines estimated after stereo calibration. (a) shows an image from one of the camera and two randomly picked points. (b) shows the image from the other camera and the estimated epipolar lines corresponding to the two points in (a) respectively. 50 Given a point mi in image one and candidate points 'ijm in image two, a correlation window of size (2n+1)? (2m+1) centered at each of these points was used. The correlation coefficient is defined as: [ ][ ] [ ] [ ] 2'2'2211 ' 2 ' 211 ' )()()()( )()()()( ),( ??? ? ??? ? ?+ ??? ? ??? ? ?+ ?+?+ = ?? ? ?? ? W ijij W ii W ijijii iji IIII IIII c ?? ? m?mm?m m?mm?m mm [4.48] Where ? are the points in the Gaussian window W centered on zero. )]12)(12/[()()( +++= ? ? mnII W kk ? ?mm is the average at point m of Ik (k=1,2). Correlation measures the relationship between two windows. Possible correlations range from +1 to ?1. A zero correlation indicates that there is no relationship between the windows. A correlation of ?1 indicates a perfect negative correlation, meaning that as the intensities in one window go up, the other go down. A correlation of +1 indicates a perfect positive correlation, meaning that intensities in both windows have the same pattern. A best match point for mi is therefore the one in 'ijm which has the highest correlation. However, there could be cases where no match point exist in image two for a point in image one (occlusion, outside of field-of-view, etc), so a minimum correlation threshold was set to avoid this problem. Therefore, the best match point is thresholdccm ijir _)},(max{ '' >= mm [4.49] In our implementation, n=m=7 for the correlation window, and a correlation threshold of was set to 0.7. 51 4.3.3 Rigid Motion Estimation 3D positions from the corresponding points in the two views can be calculated based on stereo triangulation. When these feature points are tracked in real-time, rigid motion parameters (rotation Rc, translation Tc) can be calculated. Given two sets of points },...,,{ 21 Nxxx and },...,,{ 21 Nyyy in 3D space at successive time, rigid motion parameters are the solution by minimizing the cost function (Arun et al., 1987; Horn, 1987; Horn et al., 1988; Kanatani, 1994; Umeyama, 1991; Haralick et al., 1989): ( ) 2 1 2 ? = +?= N i ii cc TxRy? [4.50] The minimum should occur when the partial derivative of 2? with respect to Tc equals zero. ( )( ) xRyT TxRyT cc cc c ?=? =?+?=?? ? = N i ii 1 2 0)1(2? [4.51] Where ?? == == N i i N i i NN 11 1,1 yyxx [4.52] Therefore once Rc is known, Tc can be calculated easily. To minimize 2? with respect to Rc, the orthonormal constraint of the rotation matrix needs to be used. Let [ ]TTTT 321 ,, rrrR c = , where each Tkr is a row of Rc. The constraints can be written as: kj k k T j k T k ?= == ,1 3,2,1,1 rr rr [4.53] 52 Construct the Lagrangian function for 2? with the above constraint, we get ( ) ? ?? = = = +++?+ ??= 3 1 326315214 1 3 1 22 222)1( k TTT k T kk N i k cki T kik Ty rrrrrrrr xr ???? ? [4.54] Where [ ] [ ]TcccTiiii TTTyyy 321321 , == cTy . Substitute the constraint of Tc in Equation 4.51 to Equation 4.54, we get ( ) ? ?? = = = +++?+ ???= 3 1 326315214 1 3 1 22 222)1( )( k TTT k T kk N i k i T kkik yy rrrrrrrr xxr ???? ? [4.55] Then take the partial derivative of 2? with respect to kr , and set them to zero, we get: ))(())(( ))(())(( ))(())(( 3 1 3 1 3326153 2 1 2 1 3622142 1 1 1 1 3524111 xxrrrrxxxx xxrrrrxxxx xxrrrrxxxx ??=+++?? ??=+++?? ??=+++?? ?? ?? ?? == == == i N i i N i T ii i N i i N i T ii i N i i N i T ii yy yy yy ??? ??? ??? [4.56] Let ? = ??= N i T ii 1 ))(( xxxxA ? ? ? ? ? ? ? ? ? ? = 365 624 541 ??? ??? ??? ? And 53 T i N i i i N i ii N i ii N i i yyyyyy )()( ))(())(())(( 1 3 1 32 1 21 1 1 yyxx xxxxxxB ??= ?? ? ?? ? ??????= ? ??? = === [4.57] Equation 4.56 can be rewritten as BR?ARRB?RAR ccccc =+?=+ TTT [4.58] Since A is symmetric, AA =T , therefore ( ) TTTTT cccccc ARRRARARR == is also symmetric. Both of the two terms on the left-side are symmetric, which means the right-side must also be symmetric TT cc RBBR = [4.59] If the singular value decomposition (SVD) of B is TUDVB = where U and V are orthonormal and D is diagonal, then TTTTTT ccc RVDURUDVUDVR == )( [4.60] It can be observed that a solution for Rc is simply TVUR c = [4.61] But Kanatani (Kanatani, 1994) proves that this solution gives a reflection instead of rotation if det(VUT)=-1. So the correct solution should be adjusted to T T U VU VRc ? ? ? ? ? ? ? ? ? ? = )det( 1 1 [4.62] Which obviously still satisfies Equation 4.60. Given 3D feature point positions at any two successive times, real time motion parameters can be calculated by first constructing the correlation matrix B 54 using Equation 4.57, then calculating the rotation Rc from the SVD of B using Equation 4.62 and finally determining translation Tc from Equation 4.51. 4.4 Motion Calculation in MRI coordinates 4.4.1 Motion in MRI physical coordinates During the actual experiment, a number of feature points on the object (or face) were automatically selected in the left camera image after which the corresponding points were identified in the right camera image (Section 4.3.2). After correction for the image distortion of the camera lenses, the position of these feature points could be calculated in 3D using triangulation (Section 4.2.3). The changes in position of these feature points in the camera images were tracked over time (Section 4.3.1), allowing calculation of changes in their 3D position. The translation and rotation parameters were then estimated from this collection of 3D points (Section 4.3.3). However, these parameters were in the camera coordinate system and needed to be transformed into MRI coordinate system based on the prior calibration data. At time t0, the relationship between the points MRI coordinates (Xm) and camera coordinates (Xc) can be described as follows, based on calibration data: mccmcm TXRX += )()( 00 tt [4.63] where Rmc and Tmc are the rotation and translation parameters for the conversion between MRI and camera system (Section 4.2.4). At time ti, the tracker measures the movement parameters in camera coordinate system, 55 cccc TXRX += )()( 0tti [4.64] Where Rc and Tc are the rotational and translational motion parameters. Using the above two equations, we can calculate the movement (rotation Rm and translation Tm) in MRI coordinates, mccmcmcmccmcm mccmcm mccmcmcmccmcmmccmcm TTRTRRRT RRRR TTRTRRRXRRRX ++?= =? ++?= ? ? ?? 1 1 1 0 1 )()( tt i [4.65] 4.4.2 Motion in MRI logical coordinates Since the pulse sequence used MRI logical coordinate system instead of physical coordinate system, the motion parameters calculated from Equation 4.65 need to be transformed into the logical coordinate system. MRI physical coordinate system refers to the standard axes of an MRI magnet system, shown in Figure 4-11. The subject is usually placed head first into the magnet and carried in by a sliding gantry on the patient table. If the person is supine (lying on his/her back), the head point along M MO Z (direction of the main magnetic field), right shoulder along M MO X (left-right direction), and nose along M MO Y (up-down direction). OM is the gradient isocenter, where the gradients produce zero magnetic field and which typically corresponds to the center of the magnet. This coordinate system does not change and is used in camera and MRI system calibration in Section 4.2.4. The MRI logical coordinate system, however, depends on the prescribed scan plane. As in Figure 4-11, OLXL, OLYL and OLZL correspond to frequency encoding, 56 phase encoding and slice selection directions respectively. These change with different scan plane prescription and are used in MRI pulse sequence. Figure 4-11 MRI physical and logical coordinate system. {OM, XM, YM, ZM} is MRI physical coordinate system, which doesn?t change after the MRI system is set up. {OL, XL, YL, ZL} is MRI logical coordinate system, which changes with different prescribed scan plane. XL is the frequency encoding direction, YL is phase encoding direction and ZL is slice selection direction. The rotation and translation ( mlR and mlT ) between these two coordinate systems can be read out from the pulse sequence after the scan-plane is prescribed: ))(()( 00 mllmlm TXRX += tt [4.66] Therefore, the final motion parameters as will be used by the pulse sequence should be (see derivation in Section 9.1): mlmcorr RRR = [4.67] mcorrcorr TRT 1?= [4.68] The motion parameters (Rcorr and Tcorr) were sent from the tracking system to the MRI scanner over a TCP/IP connection (Santos et al., 2004) for real time compensation of motion. 57 4.5 Motion Correction for MRI Motion correction was evaluated using a GRE pulse sequence. The sequence updated the scan parameters before each RF excitation. Correction for rotation was performed by adjusting the gradient rotation matrix; correction for translation in the slice selection and read out direction was performed by adjusting the RF transmit and receive frequencies and phases respectively; and correction for translation in the phase encoding direction was performed by adjusting the receive phase offset. Since the correction for translation [ ]Tzyx ddd ,,=corrT is different in all three directions, we will discuss each of them separately. 4.5.1 Rotation Correction MRI can acquire images in any plane, which is realized by varying the frequency- encoding, phase-encoding, and slice-selection gradient waveforms. The directions of these three gradients define the XL, YL, and ZL for MRI logical coordinate system (Section 4.4.2). Therefore, after the operator?s prescription of a scan plan, the logical gradient waveforms that are played out should be in this coordinate system. However, a magnetic field gradient system normally consists of only three orthogonal gradient coils that generate gradient field in the physical XM, YM and ZM axis. In order to create the three logical gradient waveforms, three physical waveforms should be digitally mixed. The mixing ratios are determined by angles of the prescribed scan plane and can be described with an orthonormal rotation matrix. 58 ? ? ? ? ? ? ? ? ? ? = ? ? ? ? ? ? ? ? ? ? )( )( )( )( )( )( )( tG tG tG t tG tG tG lZ lY lX mZ mY mX mlR [4.69] Where Gmk and Glk (k=X, Y, Z) are the physical and logical gradient waveforms respectively, mlR (0) is the same matrix in Equation 4.66. When motion occurs during the acquisition, the orientation of the gradient fields needs to be adjusted in real time to follow the motion. The rotation matrix Rcorr from Equation 4.67 is thus used in the pulse sequence to correct the combination of physical gradient waveforms to generate the desired logical gradient waveforms. 4.5.2 Slice Selection Displacement Correction It has been shown in Figure 2-1 that a slice can be excited by applying a slice- selective RF pulse. Since RF pulse is frequency-selective, only spins resonating at the same frequency can be excited. To make an RF pulse spatially selective, it is necessary to make the spin resonance frequency position-dependent, which is accomplished by applying a linear z-gradient field during the excitation. In the presence of this gradient, the Larmor frequency at position z is given by zGzGBz zz ???? +=+= 00 )()( [4.70] Therefore a slice at position z can be excited by an RF pulse with frequency )(z? . If, however, the object moves dz in this direction, frequency of the RF pulse needs to be adjusted to select the same slice for each excitation (Figure 4-12), )())((),(' 00 zzzzz dzGdzGBdz ++=++= ???? [4.71] 59 Figure 4-12 The change of RF frequency to select the same slice when motion occurs. Under the same slice-selective gradient, to follow the motion, the only way is to change the RF frequency. For a GE scanner, the RF frequency generation path can be broken up into 3 components: center frequency, slice selection delta frequency, and phase. The center frequency is 0? , which is only B0 dependent, e.g. 298MHz for a 7T scanner. The delta-frequency component creates an output which is added to the output of the center frequency path for slice selection ( zGz? in Equation 4.70). The phase component determines the phase of the RF transmit pulse. Without loss of generality, we assume z=0 from now on, because it is a constant after scan plane prescription and only dz is the variable changing with motion. Figure 4-13 shows the sequences for slice selection and the three components that determine RF frequency and phase. For the GRE sequence used in our application, a SINC RF excitation pulse in conjunction with a slice-selection gradient is used. The center frequency is a constant, which is 298MHz as mentioned above. After prescription by an operator, delta frequency ?d is set, which is zero here because z=0 (the black straight line of ?d ). Also, the phase ?d is zero. 60 Figure 4-13 Slice selection sequences (RF pulse and z-gradient) and the three components that generate the RF pulse, i.e. center frequency, delta frequency and phase. t=0 is the start of TR and only one TR is shown here. Figure 4-13 also shows a parameter isodelay, which is a very important parameter for an RF pulse. After the RF pulse is played out, because of precession caused by the slice selection gradient, the transverse magnetization will generally not be coherent across the slice. Instead there will be phase dispersion ( ztGz?? = ). A rephasing lobe with an opposite polarity to the slice selection gradient is therefore needed to compensate for the phase dispersion. The isodelay is the parameter used to calculate the optimal area of rephasing gradient. It is defined as the effective precession or dephasing time that results in the phase dispersion across the slice. As 61 shown in the figure, the rephasing gradient lobe (negative blue lobe) should have the same area as the slice selection gradient under isodelay: Rz AisodelayG =? [4.72] Where AR is the area of the slice rephasing lobe. The shape of the rephasing lobe does not affect the phase, only the area matters. For the SINC RF pulse used in GRE, the isodelay is nearly equal to one-half the RF pulse width. After the rephasing gradient lobe, phase dispersion of the transverse magnetization is eliminated for each RF excitation pulse. Since multiple RF excitations have to be applied for imaging (Section 2.1.2), the phase at the start point of isodelay should therefore also be consistent for each excitation. If no motion occured, 0== ?? dd , meaning no phase difference adjustment is made between RF excitations. However, if there is displacement in the z direction, the delta frequency component needs to be changed with an offset zz dGd ?? = , a phase therefore is accumulated. At the start of isodelay, the phase is: )1_( isodelayrfpwtdGd zz ?+?=?? [4.73] This phase obviously is not zero anymore (the blue line of ?d in Figure 4-13), which will not cause phase dispersion across slice, but rather phase difference between every RF excitations. This phase needs to be subtracted in the RF phase generation path to recover net zero phase at isodelay (the red line of ?d in Figure 4-13). All parameters in Equation 4.73 are known, except t? , which is the period of turning on the delta frequency before the RF pulse plays out. Therefore, to calculate the correct phase, the constant t? has to be measured. 62 To measure t? , a cylindrical shaped phantom was used. The advantage of this phantom is that when scanned axially (and the phantom axis is aligned with B0), every slice is exactly the same. We set phase encoding gradient to zero, and reconstructed the data with FFT transformation only in the frequency encoding direction. The resultant is therefore a hybrid space image, with every line in the frequency encoding direction corresponding to the profile of the phantom. Figure 4-14 shows the hybrid magnitude and phase image of the phantom (256?128 matrix, TR=100ms, TE=6ms, FA=10?, Slice Thickness=1mm). Every line is exactly the same due to identical phase encoding. Figure 4-14 Hybrid magnitude and phase image of the cylindrical phantom. Phase encoding gradient is zero. Frequency encoding is left-right direction. If a linear frequency is added to each RF excitation frequency, the selected slices will vary correspondingly. However, because the phantom is cylindrical, every slice along the slice-selection direction is the same. Therefore the measured signal could be regarded as from the same slice. When the phase encoding gradient is set to 63 zero, the hybrid magnitude image will not change, but the phase along the phase encoding direction will change linearly, as shown in Figure 4-15. Figure 4-15 The hybrid space image after applying a linear RF frequency. Phase encoding gradient is set to zero. For each RF excitation, its frequency changes linearly with a step ?? . Perform FFT on each measured echo in k-space we can get a hybrid image. The magnitude of each line in the frequency encoding direction (shown with a curvature shape) represents the object profile. The red line is to select an arbitrary point in the hybrid image, and the phases of those points also change linearly with a step ?? . Complex means to combine the magnitude and phase of the point in that line. Only three lines are shown for illustration. The RF frequency changed with a step ?? . The acquired k-space signals were reconstructed by FFT along only the frequency encoding direction. The magnitude profiles all have the same shape because the profile of the phantom is independent of slice position. However, the phase of any arbitrary point along the phase encoding direction (red line) should vary linearly corresponding to ?? , shown as ? , ?? ?+ , ?? ?+ 2 . Even though ?? can be calculated by using only two lines, improved precision is achieved when using all the information from an acquisition ? ? ?? ? ?=? ?? = + 1 1 1 *arg view i ii PP? [4.74] 64 Where view is the total number of phase encoding, Pi is the reconstructed complex value of an arbitrary point (must be on the scanned object) on the ith phase encoding line. Then because )2/1_( isodelayrfpwt ?+??=? ?? [4.75] We can find the only unknown parameter t? , which equals to 264 s? for the scanner used. If we calculate ?? as the negative value of Equation 4.75 and change the phase of the RF pulse with this ?? , the linear phase change along the phase encoding direction caused by ?? should disappear (keep the phase encoding gradient still at zero), meaning the phase accumulation at the start of isodelay is compensation to zero. Since this compensation depends on a correct measurement of t? , the effect of using various t? for RF phase compensation is illustrated in Figure 4-16, in which f? ( ???=? 2f ) was set to 100Hz; (a) uses the correct t? =264 s? and the resultant phase image is very homogeneous showing the fully compensated phase, while (b-d) uses t? =180 s? , 230 s? and 480 s? and the resultant images show obvious uncorrected residual phases. 65 Figure 4-16 The hybrid phase image of the phantom when phase encoding gradient is set to zero. RF transmit frequency varies linearly along the phase encoding direction. (a) Phase correction is calculated with t? =264 s? ; (b) Phase correction is calculated with t? =180 s? ; (c) Phase correction is calculated with t? =230 s? ; (d) Phase correction is calculated with t? =480 s? . It can also be seen from Figure 4-16 that the phase image is very sensitive to the change of t? (if f? is set big, e.g. 100Hz), therefore the image itself can be used for the measurement of t? . After setting a f? for RF pulse, start changing its phase (a) (b) (c) (d) 66 using various t? . The t? that results in the most homogeneous hybrid phase image is the correct t? for the scanner. 4.5.3 Frequency Encoding Displacement Correction A frequency encoding gradient waveform consists of two portions, a dephasing gradient lobe and a readout gradient lobe. The polarity of the dephasing gradient lobe is opposite to the readout gradient lobe, which is used to prepare the transverse magnetization so that an echo signal can be created at a later time. The time of the echo peak is determined by the area under the dephasing gradient lobe. Because of the similar reason as in the previous section, the echo reaches its maximum (all spins are in phase, without considering B0 inhomogeneity, susceptibility variations, etc) when the area under the readout lobe equals to the area of the dephasing lobe. To describe this quantitatively, denote the dephasing gradient as -Gx, the spins along the frequency encoding direction precess at different frequencies: )()( 0 xGBx x?= ?? [4.76] The phase accumulation due to the dephasing gradient (in a rotating frame with frequency of 00 B?? = ) is xTGx xd ?? ?=)( [4.77] Where T is the duration of dephasing gradient. When the readout gradient Gx is applied, an additional phase dispersion is introduced. The accumulated phase will be: )()( TtxGx x ?= ?? [4.78] 67 At a specific time techo=T, the phase in the above equation equals zero, meaning all the spins are in phase and form an echo. If we want the echo to be at the center of the k-space, the readout gradient must be on for a time Tacq=2T (Figure 4-17). So during the acquisition period from 0 to Tacq, the sampled data produces a symmetric k-space line spanning kx,min to kx,max (Figure 2-3). Figure 4-17 Frequency encoding gradient, the acquisition window associated with it, and the three components that generate the receive frequency and phase. When motion dx occurs at this direction, the RF receive frequency needs to be offset by xxdGd ?? = [4.79] 68 Given a receive bandwidth BW? and the target xFOV (x direction FOV), the readout gradient should be xFOV BWG x ?= ? [4.80] By substituting Equation 4.80 to 4.79, we can get the offset receive frequency xdxFOV BWd = ? [4.81] Similarly as in Figure 4-13, the offset frequency change results in an extra phase on the echo signal. It has to be subtracted from the receive phase for each acquisition to get rid of phase difference caused by different x displacement (Figure 4-17). )2/_'( gxwpwtdd +?= ?? [4.82] Where pw_gxw/2 is the pulse width (duration) of the readout gradient, 't? is the time interval between start of the delta frequency and data acquisition. To measure 't? , we again set phase encoding gradient to zero and reconstructed the hybrid phase image. Change the receive frequency linearly with a step f? (e.g. 100Hz) for each k-space line acquisition, and compensate receive phase using Equation 4.82 with various 't? . The 't? corresponding to the most homogeneous phase image was therefore the correct 't? of the scanner. Figure 4-18 shows four hybrid phase images corresponding to 't? = 144 s? , 100 s? , 200 s? and 300 s? . The homogeneous image (a) means the receive phase caused by frequency offset was all compensated and therefore 't? = 144 s? . The diagonal shape was due to the applied linear change of the receive frequency. 69 Figure 4-18 Hybrid phase image used for measurement of 't? . Phase encoding gradient is set to zero. RF receive frequency varies linearly along the phase encoding direction with a step of 100Hz. (a) Phase correction is calculated with 't? =144 s? ; (b) Phase correction is calculated with 't? =100 s? ; (c) Phase correction is calculated with 't? =200 s? . (d) Phase correction is calculated with 't? =300 s? . 4.5.4 Phase Encoding Displacement Correction Phase encoding gradient is applied while the magnetization is in the transverse plane but before the readout (Figure 2-2). By varying the phase encoding linearly, a linear (a) (b) (c) (d) 70 phase is introduced to the magnetization in the phase encoding direction (Figure 2-2). FFT is used to reconstruct the spatial information of the object. Since the frequency and phase encoding can be analyzed independently, Equation 2.11 can be simplified to one dimension: ?? = ???= 1 0 )2exp()()( N n yy ynkiynks ?? [4.83] Where yny ?= , y? and N are the pixel size and number of pixels in phase encoding direction respectively. For each phase encoding step, the gradient Gy changes with a constant step creating a constant yk? ( yyy Gk ???=? ). The N steps phase-encoding starting at the bottom edge of k-space (Figure 2-3) can therefore be written as: 1,...,1,0)2()( ?=??= NmkNmmk yy [4.84] The signal is: ?? = ?????= 1 0 ))2/(2exp()()( N n ykNmyniynms ?? [4.85] To satisfy the Nyquist criterion, the phase encoding step size yk? must be chosen so that: yNyFOVk y ?==? 11 [4.86] Substitute Equation 4.86 to 4.85, we have ? ? ? = ? = ??= ?????= 1 0 1 0 )/2exp()exp()( )1)2/(2exp()()( N n N n Nnminiyn yNNmyniynms ??? ?? [4.87] The image can be reconstructed from the discrete inverse Fourier transform: 71 )/2exp()()1)(( 1 0 Nnmimsyn N m n ?? ? ? = =?? [4.88] The factor (-1)n means to change the sign of the alternate elements of the output image. If there is dy displacement occurred in this direction, the phase of the receiver needs to be changed: )/2exp()1(2exp)1)(()( )/(2exp)()1)(( 1 0 / 1 0 / NmniydNmimsdyn ydnNmimsdyn n N m yydy y y N m ydyn y ??? ?? ? ??? ? ??? ? ??? ? ??? ? ??=+?? ?????? ?+=?+? ? ? ? = ? ? = ?+ [4.89] Note also that yyyy Gk Nk ?? )0( 2)0( =??= [4.90] Where )0(yG is the gradient for the first step phase encoding. ( )yyyyydy dkNidyN Nidyi ??=?? ? ? ??? ? ? ?= ??? ? ??? ? ? ?=? ? ??? expexpexp)1( / [4.91] Substitute Equation 4.90 to 4.91, we get ( )yyyydy dGi ??? )0(2exp)1( / =? ? [4.92] Therefore, the phase offset needed to follow dy displacement is: yyy dyFOV mGd ??? ? ??? ? += ????? 2)0(2 [4.93] The first phase offset term in Equation 4.93 is the phase accumulated in the first encoding step; the second term is a linear phase ramp, meaning the phase increases yFOVd y /2? per phase encoding step. 72 Note that most of the time, when off-center FOV is required, the shift theorem for Fourier transforms was simply used, which says to shift the image along a direction, a linear phase ramp is needed in the frequency space (the second term in Equation.4.93). But in our case, this is not exactly true because phase encoding started from 0 instead of ?N/2. When dy is a constant, the linear phase ramp does produce an off-center FOV. The constant phase (first term in Equation 4.93) does not affect the image contrast because magnitude image does not change at all and phase image has only a global shift. However, when dy varies with motion, the first term multiplied with dy is not constant anymore. Thus if not compensated, artifacts will appear. 4.5.5 Summary of Position Compensation The pulse sequence updated the scan parameters before each RF excitation. The correction of motion for the GRE sequence can be summarized as follows. a. Correction for rotation was performed by adjusting the gradient rotation matrix with Rcorr (Equation 4.67). b. Correction for slice selection translation was performed by adjusting the RF transmit frequency offset with zz dG? , and adjusting the RF phase with )1_( isodelayrfpwtdGd zzz ?+??= ?? (Equation 4.73). c. Correction for read out direction translation was performed by adjusting the RF receive frequency with xdxFOVBWd =? (Equation 4.81), and adjusting the receiver phase with )2/_'( gxwpwtdd x +??= ?? (Equation 4.82). 73 d. Correction for translation in the phase encoding direction was performed by adjusting the receive phase offset with yyyy dyFOVmGd ?? ? ? ??? ? += ???? ? 2)0(2 (Equation 4.93). (The total receiver phase offset is therefore the summation of phase shift in c and d, yx ddd ??? += ) 4.5.6 K-space Line Re-acquisition An additional feature of the pulse sequence was that it allowed for a re-acquisition of motion corrupted k-space lines. This feature was added to reduce the effect of motion that was too fast to be compensated by gradient and RF frequency and phase adjustment. Re-acquisition was performed when the motion that occurred during the acquisition of a given k-space line exceeded a preset threshold. The displacement threshold for the reacquisition of a k-space line was set to 0.2mm (because image resolution was chosen to be about 0.2mm). A rotation threshold was not used because head rotation was always perceived as a displacement as well. This is because rotation is defined relative to the center of the image, whereas the head generally pivots around the back of the head, where it is supported by the receive coil array, opposite of the part of the head viewed by the cameras. 4.6 Software Flowchart for the Motion Correction System Figure 4-19 shows the software flowchart of the proposed motion correction system. Calibration was done offline and was programmed with MATLAB. Real time 74 tracking software was running on Linux with C++. The tracking computer communicated with the MRI scanner through a TCP/IP connection to provide the console (scan computer) with the real-time motion parameters. Figure 4-19 Software flowchart for the prospective head movement correction system for MRI. Calibration is all done offline and programmed with MATLAB. Real-time correction is programmed with C++ running in Linux. N is the number of total phase encoding steps for a scan. 75 Chapter 5 Results and Discussion with the Stereo Motion Correction System for MRI 5.1 Camera Calibration Test The cameras used in this project have 4 mm lenses and were positioned about 8cm away from the volunteer?s face. Distortion of the images is very serious. Figure 5-1a shows the distortion model on each pixel of one of the stereo cameras. The blue cross in the center marked the image center and the blue circle marked the principal point. Each arrow represented the displacement of the pixel caused by lens distortion. The displacement was bigger at the corners of the image, which reached more than 50 pixels. It meant that without a correct camera calibration, the 3D estimation from stereo would not be accurate. Image (b) shows a checkerboard image acquired with the corresponding camera. The edges of the checkerboard pattern were not straight. Instead, they were curved because of the distortions. Image (c) shows the image after distortion correction, which recovered the straight checkerboard edges. (a) 76 (b) (c) Figure 5-1 Lens distortion and correction for a camera. (a) Distortion model of the camera. The arrows and number on the curve mark the distortion direction and amplitude. (b) an image of a checkerboard pattern. The lens distortion is obvious especially close to the corners of the image. (c) lens distortion corrected image of b. 5.2 Stereo Reconstruction Validation The checkerboard pattern was attached to a highly sensitive, 6-axis- positioning system (M-824 compact 6-axis-positioning system, Physik Instruments, GmbH & Co. KG, Germany) to test the stereo?s accuracy. The hexapod was driven by six high-resolution actuators with sub-micron precision Experiments were done outside of the scanner since the positioning system was not MR-compatible. The cameras were fixed firmly on top of the moving platform at a distance of about 8cm to simulate the position of the stereo with respect to the human face. The platform was controlled by a computer to move in the x, y, and z direction over known distances and rotate around the three axes with known degrees, where the x, y and z were defined as the two orthogonal axes of the platform, and the z axis was defined as pointing out of it. Camera images were acquired at each position to measure these movements. The discrepancy between the real and measured positions was evaluated 77 to determine the stereo?s accuracy, which turned out to be 0.04?0.03mm, 0.03?0.02mm and 0.02?0.01mm in x, y and z direction, and errors of rotation around x, y and z axes were 0.03?0.13?, -0.02?0.05? and 0.01?0.03? respectively. This high accuracy is well within the highest available MRI resolution, which garantees adequate motion correction. 5.3 Tracker Noise Evaluation Since the motion correction was performed for each acquired k-space line, the tracking noise was also an important factor. Our real-time tracking was based on feature points, which provided sub-pixel accuracy but was also sensitive to noise in the video images. The tracking noise was evaluated by monitoring a stationary object for several minutes by tracking 3 feature points. When the tracker was set up outside of the scanner, the noise was very small even by tracking only 3 feature points, as shown in Table 1. However, when the tracker was put inside the MRI, a significant increase of the error was observed, even without scanning. During a scan the noise level increased even further. This might be attributed to remaining interference of the MRI scanner with the video cameras. These findings suggest that there is an effect of the static magnetic field as well as of the switching gradients and/or RF. The maximum translation error exceeded 2mm, which was unacceptable for the high-resolution MRI motion correction. 78 Table 1 Accuracy and noise of the tracker when it was put outside or inside of the scanner. Tracker Outside Tracker Inside (no scan) Tracker Inside (during scan) Mean ? Max Mean ? Max Mean ? Max x (mm) 0.065 0.047 0.215 0.002 0.315 1.057 0.702 0.609 2.772 y (mm) 0.027 0.014 0.068 0.076 0.034 0.179 0.052 0.057 0.314 z (mm) 0.029 0.025 0.107 0.158 0.221 0.918 0.111 0.288 1.138 rx (?) 0.021 0.014 0.061 0.146 0.169 0.740 0.047 0.226 0.881 ry (?) 0.016 0.024 0.081 0.068 0.080 0.338 0.034 0.143 0.588 rz (?) 0.038 0.028 0.132 0.019 0.249 0.864 0.545 0.485 2.166 To reduce the tracking noise, multiple feature points were used. Figure 5-2 shows the maximum x-translation error as a function of the number of feature points. It shows that tracker noise decreased with increasing number of feature points. For our application, we opted to select around 20 points to keep the maximum error below 0.2 mm and 0.2 degree. Figure 5-2 Noise of the tracker. The maximum x translation error as a function of the number of tracked feature points. However, the number of feature points on the human face detectable with both video cameras was often below 20, in part because of the cameras? limited field of 79 view of about 13 x 10cm2. To overcome this we decided to place a sticker on the subject?s forehead that displayed a number of white triangles on a black background. 5.4 Calibration Accuracy Between Stereo and MRI Coordinate System The calibration phantom (Figure 4-2) was used to test the calibration accuracy between the stereo system and the MRI scanner. First, the phantom was put at an arbitrary position. The stereo measured the corner positions in 3D and transformed them to MRI coordinates based on the calibration results. Then 3D MR images were acquired, from which the 3D MRI coordinates of the corners were determined. Accuracy was obtained by measuring the discrepancy between the estimated position (obtained by using the camera model and calibration parameters) and real ones (measured from MR images). This procedure was repeated at different arbitrary phantom positions. The accuracy of these points was found to be 0.06?0.05mm, 0.10?0.08mm and 0.15?0.13mm in the x, y, and z direction in the MRI coordinate system respectively, which should be sufficient for the motion compensation task. The results are also visualized in Figure 5-3 where rotation and translation parameters are calculated from these points at different phantom positions. Solid lines show parameters estimated using MRI images. Dotted lines show parameters estimated using camera images but further transformed into MRI coordinate system. The two sets of estimation matched very well, which means the calibration between the two coordinate systems was accurate. 80 Figure 5-3 Rotation and Translation parameters estimated from MRI images (solid lines) and from camera images transformed into MRI coordinates (dotted lines). The two sets of lines matching very well means the calibration between camera and MRI is accurate. 5.5 Pulse Sequence Test with a Phantom To test the accuracy of RF frequency and phase adjustment for translational displacement in slice-selection, phase-encoding and frequency-encoding directions, a high resolution phantom was scanned with our modified GRE sequence. (Rotation doesn?t need to be tested for the sequence because it only requires rotation matrix adjustment but doesn?t need any frequency or phase compensation.) Halfway through the experiment, a translation of 10mm was executed and the scan was continued until the end with the phantom staying at the new position. The particular motion pattern was used to maximize errors near the center of ky-space, which causes the most salient artifacts. 81 Figure 5-4 shows the results of three sets of scans (256?128 matrix, TR= 100ms, TE= 6.9ms, FA=30o). The first column represents the control scan without motion, second and third columns represent scans with displacements, without correction and with correction respectively. Displacements were executed all in the OMZM direction, so for axial scans (a-c), displacements were in the slice-selection direction; for coronal scans (d-f) with frequency-encoding superior-inferior (OMZM), displacements were in the frequency-encoding direction; for coronal scans (g-i) with phase-encoding superior-inferior (OMZM), displacements were in the phase-encoding direction. It can be seen that for all cases, the displacements were compensated successfully, which means the measurement of scanner specific time t? and 't? is accurate and the strategy of RF frequency and phase adjustments are correct. Note that image (g-i) show an aliasing artifact because of the small FOV prescribed in the phase-encoding direction, which, however, does not affect the displacement compensation test. 82 (a) (b) (c) (d) (e) (f) (g) (h) (i) Figure 5-4 Translation compensation of a phantom. (a,d,g) were scanned statically; all others were scanned with the phantom locating in one position for the first half of the scan and the second half in another position 10mm apart. (b,e,h) were scanned without position correction, but (c,f, i) were with correction. Displacements in (a-c) were in slice-selection direction; (d-f) in frequency-encoding direction, and (g-i) in phase-encoding direction. 83 5.6 System Performance Test With all the separate parts of the prospective motion correction system validated, the performance of the system as a whole needed to be tested. The optimal way is to evaluate hybrid space data of a scan with motion. Because ideally without motion, all lines in the phase-encoding direction should be exactly aligned. Imperfect alignment would indicate imperfect motion compensation. If motion can be corrected fully, then an alignment should be achieved. We set phase encoding gradient to zero and reconstructed the hybrid data (resolution 0.23mm2, TR=100ms, TE=25ms, FA=10?). Figure 5-5a shows motion induced by the breathing of a volunteer recorded with the tracker during the scan. Image (b) was scanned without real-time motion correction and the lines in the phase encoding direction shifted during the scan. When prospective motion correction was applied for image (c) acquisition, those lines were all aligned. Since the in plane resolution was only 0.2mm, the perfectly aligned lines in Figure 5-5b suggested that the error of the whole correction system should be less than a pixel or 0.2mm. Therefore, this is a quick and coarse way to validate the accuracy and reliability of the system. 84 Figure 5-5 The MR hybrid image without phase encoding gradient. (a) Small breathing induced head movement was observed during the scan of the volunteer. (b) without motion correction; (c) with motion correction. 5.7 Volunteer Study The efficacy of the tracking system was studied on a normal volunteer under an institutional review board-approved protocol. A nominal in-plane spatial resolution of 0.23 x 0.23mm2, a slice thickness of 3mm, 300ms TR, 25ms TE and 30? flip angle were used. Four motion conditions were studied: one with cyclic breathing-related head motion (Section 5.7.1), one with minimal motion (Section 5.7.2), one with large stepwise motion (Section 5.7.3), and one with slow gradual motion (Section 5.7.4). (a) (b) (c) 85 5.7.1 Breathing Related Motion During this experiment, the volunteer was instructed to not pay any effort to lay still. Motion parameters reported by the tracker (Figure 5-6a,b) showed cyclic head motion with a period consistent with the respiratory cycle. The observed rotation was mainly around the x-axis and the translation along the z-axis (in the MRI physical coordinate system), which suggested a nodding like motion. The scan was performed twice, once without motion correction and once by using the prospective motion correction technique proposed here. Figure 5-6c,e showed that even though the motion was only about 2mm and 1?, ghosting and blurring artifacts were clearly observed in the high- resolution images (0.23 x 0.23mm2 in-plane resolution). The prospective line-by-line motion corrected image using our real-time compensation system showed much improved image quality and a virtual elimination of ghosting artifact (Figure 5-6d,f) as compared to the uncorrected images (Figure 5-6c,e). The reacquisition rate was 2.3%. 86 Figure 5-6 Prospective breathing induced motion correction for the 2D gradient echo acquisition. Images (a,b) show the evolution of motion parameters reported by the tracker during the acquisition of images (c) and (d) respectively. Image (c) was scanned without motion correction, and (d) was with motion correction; (e) and (f) were zoomed in on an area in the frontal brain indicated by the red square in images (c) and (d) respectively. 5.7.2 Minimal Motion During this experiment, the volunteer was instructed to stay as still as possible. Despite the efforts of the volunteer, the motion parameters reported by the tracker (Figure 5-7a,b) showed a small gradual and jittering motion. This scan was also repeated twice, once without (a,c,e) and once with (b,d,f) motion correction. Both uncorrected (c,e) and corrected (d,f) images have similar quality and show little evidence of ghosting. The reacquisition rate of Figure 5-7d was 4.6%. 87 Figure 5-7 Prospective motion correction data obtained for a volunteer that attempted to minimize head motion. Images (a,b) show the motion parameters reported by the tracker during the acquisition of images (c) and (d) respectively. Image (c) was scanned without motion correction, and (d) was with motion correction; (e) and (f) magnify an area in the front of the brain (the area marked by the red square in (c) and (d) respectively). 5.7.3 Large Stepwise Motion Experiments in the presence of strong intentional motion demonstrated the robustness of the prospective motion correction technique (Figure 5-8). The motion parameters reported by the tracker indicated a range of motion of close to 10 mm translation and 10 degree rotation. The severe ghosting and blurring artifacts observed without correction (e,h) were almost entirely avoided with prospective motion correction (f,i), and the image quality of (f,i) was very similar to (d,g), which were acquired in the absence of intentional motion and with motion correction on (motion parameters in a). The reacquisition rate for (f,i) was 10%. 88 Figure 5-8 Example of the performance of the prospective motion correction method in the presence of large stepwise motion during the acquisition of multi-shot 2D gradient echo images on a normal volunteer. Images (a-c) show the motion parameters reported by the tracker during the acquisition of images (d-f) respectively. Image (d) was scanned without intentional motion during the scan, whereas (e,f) were scanned in the presence of motion; (e) was acquired without correction but (f) was acquired using the proposed method. (g-i) show additional detail in an area in the front of the brain that was indicated by the red square in images (d-f). 5.7.4 Slow Motion The final motion type that was investigated was slow motion resulting in large displacements and/or substantial rotation angles. Even without any reacquisition this type of motion was handled well by the prospective motion correction system. The motion parameters reported by the tracker (Figure 5-9b,c) showed that the range of 89 the motion was close to 10 mm of translation and 10 degrees of rotation. The reacquisition rate for (f,i) was 0% since the reacquisition mode was turned off. Image quality of (f,i), when motion correction was used, was notably better than (e,h), acquired in the absence of motion correction. Furthermore, image quality of (f,i) was similar to (d,g), acquired in the absence of intentional motion (motion parameters in a). Figure 5-9 Prospective motion correction in the presence of slow motion. The motion parameters reported by the tracker during the scan of images (d-f) are displayed in (a-c), respectively. Image (d) was acquired in the absence of intentional head motion, (e,f) during slow head motion, where, unlike (e), (f) was acquired with the proposed correction method. A magnification of the area in the red box in (d-f) is shown in (g-i). 90 5.8 Discussion In this study, a method for the prospective correction of 3D motion for high- resolution MRI was developed and evaluated. Preliminary data obtained on a human volunteer show substantial image quality improvement for high resolution gradient echo imaging at 7T, in particular under conditions of periodic head motion related to breathing and intentional head rotation. To obtain adequate tracking accuracy, the cameras were put right above the MRI receiver coil, monitoring the human face from a very small distance. Since the stereo 3D reconstruction accuracy depends on the angle and relative distance between the two cameras, the best setup of the two cameras should be explored further for better accuracy. The accuracy of the calibration between the tracker and the MRI scanner is also essential for system performance. Since all the motion parameters were estimated in the camera coordinate frame, they had to be transformed precisely to the MRI coordinate frame to be used in the sequence. However, calibration was difficult in this system because the tracker can only see the surface of the calibration object, whereas the MRI can only see its internal structure. A previous publication (25) dealt with this issue in an iterative way. They used three gel-filled reflective spheres for estimating the initial coordinate transformation, and then iteratively refined this transformation based on the internal structures of the phantom. This method, though accurate, took more than 30 minutes to accomplish. This limitation was addressed in our project by designing a phantom with high resolution, well ordered structures. Since every parameter of the phantom was known, the relationship between the external 91 appearance (black-and-white checkerboard) and internal MR-visible structure was easily established, which simplified the calibration and shortened the calibration time to several minutes. The noise of the tracker was another important factor essential for reliable correction of MRI data. It was demonstrated here that at least 20 feature points needed to be tracked for adequate precision. Because of the setup of our cameras, only a part of a face, lacking a large number of feature points, could be seen from the cameras. This was the reason for using a ?tracking target? mounted to the forehead. If the cameras could be put further away from the face, direct tracking of the face curvatures could be used instead of these feature points. Fixing the cameras to the inside of the magnet bore, instead of to the transmit coil as was done here may be a solution. Another solution to this issue of limited field of view would be the use of cameras with a fish-eye lens, which would give a larger view for a given camera distance from the face. However, the camera distortion correction may be more complicated. It has also been demonstrated that the noise comes from the interference between the cameras and the MRI scanner, so a better shielding or hardware filtering could be a solution too. Due to the tracker?s relative low frame rate some fast motion could not be adequately corrected. For this reason, a reacquisition function was added to the sequence in this project. The number of re-acquired lines (reacquisition rate) was dependent on the type of head motion and quite variable, e.g. 2.3% for the conditions presented in Figure 5-7d and 10% for those in Figure 5-8f. Considering the resulting improvement in image quality and the small increase in total imaging time, this 92 sacrifice appears worthwhile. Nevertheless, improved tracking speed might reduce or eliminate the need for re-acquisition. This was confirmed in the ?slow motion? experiment, during which the re-acquisition was turned off. Under this condition, the tracking speed exceeded the head motion speed, resulting in excellent image quality without any reacquisition, as was seen in Figure 5-9(f,i). With the continuing increase of computational power, and improved motion tracking algorithms, it is therefore expected that the need for re-acquisition will be reduced. One possible area for improved tracking is the use of predicitive estimates for object motion, for example by using Kalman filtering. Although the motion correction scheme presented here was effective in reducing the effects of motion on image quality (e.g. compare Figure 5-6d and Figure 5-8f with correction versus the not corrected versions in Figure 5-6c and Figure 5-8e), there remain possibilities for improvement. For example, the motion-corrected images in Figure 5-8f and Figure 5-9f are slightly inferior to the corresponding images acquired during the minimal head motion (Figure 5-8d and Figure 5-9d). This apparently imperfect motion compensation could be due to a number of factors, for example phase changes due to small changes in local B0 amplitude, or subtle changes in B1 amplitude and phase. The contribution of these effects and their potential mitigation remains to be investigated. In this current method, quantitative validation of image quality improvement on volunteer study has not been fulfilled. One of the reasons is that since image resolution is only 0.2mm, the reference image (no motion) can hardly be exactly the same slice as in the with motion case, which makes quantitative comparison very 93 difficult. To get a better validation, we need to test this technique on more volunteers and ask neurologists to give objective judgments on the image quality. 94 Chapter 6 Prospective Motion Correction for MRI using a Single Camera A substantial amount of clinical and research MRI images are rendered unusable because of subject motion. To solve this problem, prospective motion correction based on external tracking systems has been proposed. Stereo optical tracking system has been shown to give good real-time motion correction with reasonable accuracy. However, the calibration of this system is very complicated and time consuming, as it requires a camera system calibration as well as a calibration between cameras and MRI system using dedicated phantoms. We therefore propose an alternative motion correction method for MRI that does not require calibration and can work with just a single video camera. 6.1 System Setup All experiments were performed on a GE (Milwaukee, WI, USA) 7T MRI system. An MR compatible infrared camera (MRC Systems GmbH, Germany) was fixed on a holder right above and in front of the head coil (Figure 6-1). Six infrared emitting diodes were used to illuminate the field of view. The distance from the camera to volunteer?s face was around 8cm. The high spatial resolution of this setup (320 x 240 pixels, about 13 x 10cm2 field of view focusing mostly on the nose of the volunteer) allowed the detection of sub-millimeter movements. 95 Figure 6-1 System setup. The camera was fixed right on top of the head coil. 6.2 Method Rather than using phantom calibration scans, this method uses training scans to estimate head position and link the camera image measurement to the MRI scanner coordinate system. During the training scan, the subject was asked to slowly move his head in the directions that were least restrained, including rotations in the axial and sagittal planes. N EPI (Section 2.1.3; parameters: 128?96 voxels over 24?18 cm2, 11 3mm slices with 0.5mm gap, TE 50ms, TR 1.2s) and camera images were acquired simultaneously (Figure 6-2). From the EPI images, motion parameters in the MRI coordinate system (3 rotations and 3 translations) between each volume were calculated with image registration method. Since camera images were acquired simultaneously with EPI, each camera image in the training data thus corresponded to a motion vector. The training data were then sorted based on the motion parameters to help speed up the real-time search. An ROI region (the nose) was also selected based on these training camera images. 96 Figure 6-2 Method of the motion estimation using a single camera. N training EPI and camera images were acquired. From EPI images, 6DOF motion parameters were calculated from registration for each training data. During the real-time scan, the most similar camera image in the training data was found for the newly acquired camera image. The corresponding motion parameters from EPI were sent to MRI pulse sequence for prospective motion correction. During the real-time scan, the correlation between the ROI of the newly captured camera image and the ones in the training camera images was calculated to find the most similar one. The corresponding motion parameters of that training image served as the estimation of the current head position. These motion parameters were then sent to the MRI scan computer at 10Hz speed. A gradient echo sequence 97 was modified to incorporate motion correction by adjusting gradient rotation and RF frequencies and phases before each RF excitation (Section 4.5). 6.2.1 Motion Computation The motion parameters can be calculated from the N EPI images in the training data through image registration. Image registration involves the mapping between a pair of images, so that one image after transforming with the mapping parameters can be matched to the other one. tt TXRY += ii [6.1] Where Xi=[xi1, xi2, xi3]T and Yi=[yi1, yi2, yi3]T are the corresponding points in the two images in MRI physical coordinate system (after transformation as in Section 9.2) and Rt, Tt are the rigid body transformation between them. The registration parameters can therefore be determined by minimizing the sum of squared intensity differences between images (Friston et al., 1995; Friston, 2007): ?? ?=??= i ii i ii f 222 );( ttt pXYTXRY? [6.2] Where f is the transformation function, and pt = [pt1, pt2,?, pt6]T is the vector containing three rotation and three translation parameters. Given n corresponding points in the two images, Equation 6.2 can be solved using Gauss-Newton algorithm (Section 9.3). For the mth iteration, the vector pt is updated as: ( ) bJJJpp tt TTmm 1)()1( ?+ += [6.3] 98 Where ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? = 6 )( 2 )( 1 )( 6 )( 2 2 )( 2 1 )( 2 6 )( 1 2 )( 1 1 )( 1 );();();( );();();( );();();( t m n t m n t m n t m t m t m t m t m t m p f p f p f p f p f p f p f p f p f ttt ttt ttt pXpXpX pXpXpX pXpXpX J ? ???? ? ? (3n?6 matrix) and ?? ? ? ? ? ? ?? ? ? ? ? ? ? ? ? = );( );( );( )( )( 22 )( 11 m nn m m f f f t t t pXY pXY pXY ? ? (3n?1 vector). For each iteration step, Yi are re-sampled using the new transformation matrix. The iterative process is repeated until 2? can not be decreased or for a fixed number of iterations. This approach is based on the assumption that the starting estimate is close to the correct solution and the image is smooth, which is the case in our application because the motion between successive images acquired in the training data is very small and the relative low resolution EPI images are smooth. Therefore the possibility that a solution stuck in a local minimum instead of a global minimum is small 6.2.2 Best Match Search The search for the camera image in the training data that most closely matched the newly acquired image was realized through a similarity measure. The correlation coefficient (CC) is optimal in similarity measure as defined in Equation 4.48, reiterated here for a correlation between images A and B within a window W: 99 ( )( ) ( ) ( ) 22 )()( )()( ??? ? ??? ? ? ??? ? ??? ? ? ?? = ?? ? ?? ? WW W BBAA BBAA CC xx x xx xx [6.4] Where A and B are the mean pixel values of image A and B within the window W respectively. To speed up the computation, the correlation in image space can be calculated in frequency space using FFT. Let AAA ?= )()(? xx , BBB ?= )()(? xx , then the numerator in Equation 6.4 is ?? ? ?? == 1 0 )(?)(?)(?)(? N nW n nBnABACC x xx [6.5] Where N is the size of the window. It can be calculated through )(? kFA , Fourier transform of )(? nA ; and )(? kFB , Fourier transform of )(? nB . Since ?? ? ? ?? ? ? = ? = ? = ? = ? = ? = ? = ? ? ?? ? ? ???= ??= ?=?????? ? 1 0 1 0' 1 0 1 0 1 0' 1 0 1 0 /)'(2)'(?)(?1 /2/'2)'(?/2)(?1 /2)(*?)(?1)(*?)(? N n N n N k N k N n N n N k BABA NkmnnienBnA N NmkieNknienBNnkienA N NmkiekFkF NkFkFIFFT ? ??? ? The quantity in parentheses is 0 for all values of n? except those of the form pNmnn +?=' , where p is any integer. At those values, it is 1. Therefore, ( ) ?? = +?=? 1 0 * )(?)(?)(?)(? N n BA pNmnBnAkFkFIFFT [6.6] By comparing Equation 6.6 with 6.5, the correlation coefficient can be calculated as follows: convert the two images to 1D vectors, FFT the two vectors, multiply one resulted transform by the complex conjugate of the other, inverse FFT 100 transform the product and the first component (m=0) is the correlation coefficient. All other components of ( ))(?)(? * kFkFIFFT BA ? are the values of the correlation at different lags. 6.3 Result 6.3.1 Accuracy of the Motion Estimation To estimate the accuracy of the method, several minutes of data were acquired, consisting of 145 EPI volumes and the corresponding 145 camera images. During this time, the subject slowly moved his head in the directions that were least restrained, mainly rotations in the axial plane. All the EPI images in the training set were registered to a reference (the first volume in the training set can be used as a reference) to get the motion parameters. The range of rotation around ZM axis was about 17 degrees and translation in XM direction was 18mm (Figure 6-3). Figure 6-4 shows the correlation of two randomly chosen images in the training data set with all images. The curve suggests that the correlation is higher when positions are close, which means CC is a good measurement for most closely match image search. 101 Figure 6-3 Motion parameters from the training data set. The total number of training images is 145. The motion is mainly rotation around ZM axis and translation in XM axis. Figure 6-4 Correlation of two randomly chosen images with all training images. Red crossings mark the chosen image, (a) the correlation between image No. 47 with others; (b) the correlation between image No. 100 with others. To test the accuracy of motion estimation, the data were divided into two sets, set A and set B. For every camera image in set B, the most similar image in set A was determined based on correlation. The motion parameters of this image in set A served as motion estimate for that image in set B, whereas the true motion was derived from (a) (b) 102 registration of the EPI image in set B to the reference. Difference between camera derived position and EPI registration parameters was used to validate motion estimation accuracy. A k-fold cross-validation method (Duda et al., 2001) was used, where k was chosen to be 10. Training images were divided into k folds. For each trial, one fold images were used as set B, while the remaining k-1 fold images formed set A. After repeating the validation k times by using all different sets of B, the average error across the k trials was computed. Translation error in XM, YM and ZM direction is 0.021? 0.249mm, 0.004? 0.052mm and 0.001 ? 0.066mm and rotation error around XM, YM and ZM axis is 0.001? 0.026?, -0.001? 0.050?, and -0.006 ? 0.129? respectively. However, note that this error estimation was only true if the sampled motion space covered the whole space the volunteer could move inside the head coil. For instance, if the sampled space during training stage only covered rotations about ZM, while during the real-time scan, the volunteer moved around XM axis, even though a most closely matched image in the training data could be determined based on the highest correlation coefficient, the estimated motion parameters were not correct. Therefore the key issue for this method is to sample as complete as possible the motion space (the volunteer moves as much as he can) during the training stage. 103 6.3.2 Volunteer Study A GRE sequence modified to incorporate motion correction by adjusting gradient rotation and RF frequencies and phases before each RF excitation was used on a normal volunteer under an institutional review board-approved protocol (parameters: 512?384 voxels over 24?18 cm2, 3mm slice thickness, TR=100ms, TE=30ms). During the real-time scan, the most similar camera image in the training data was found for the newly captured camera image based on the correlation coefficient inside the ROIs (nose area chosen from the training data). The corresponding motion parameters of that training image were sent to the MRI scan computer at 10Hz speed for prospective motion correction. Experiments were performed with and without motion correction each with similar motion. Results are shown in Figure 6-5. The image quality of (b) is similar to (a), and much better than (c). The results suggest that the new system has good precision to be used in high resolution MRI. Motion parameters shown in (d,e) were not smooth because of the not dense enough sampling of the training data acquisition. 104 Figure 6-5 Prospective correction of large motion. (a) was acquired without intentional motion during the scan. (b c) were scanned in the presence of motion; (b) was acquired using the proposed method but (c) was without correction. (d,e) show the evolution of motion parameters during the acquisition of image (b,c), respectively. 6.4 Discussion We propose a new motion correction method for MRI using single camera without any calibration. A short training scan is required to allow object motion estimation from the camera images. It does not need any tracking target, therefore is more user friendly. Initial testing results show that this method is very promising. The major factor that affects the motion estimation accuracy is the registration error from EPI images. Even thought registration can get sub-voxel accuracy, if motion occurs between the acquisition of the first and last slice of a volume, rigid body registration is not able to correct. This could be a problem in our training data (a) (b) (c) (d) (e) 105 acquisition, because the TR of EPI acquisition was set relatively long (1s) to get a good coverage of the brain, and the volunteer was asked to continuously move his head during that period of time. In addition, ghost artifacts in a single slice could make the registration even harder. A solution to this is that during the training period of time, the volunteer should move very slowly to introduce negligible artifacts coming from these two effects. Another issue that could affect the accuracy is the EPI image quality. The tradeoff for EPI?s fast imaging speed is that it suffers image distortions, especially in a high magnetic field like in our case, which may in turn affect the registration performance. Field inhomogeneity can cause image distortions and even signal loss. Even though a high order shim can ameliorate this problem to some extent, the inhomogeneities remain, particularly close to tissue-air and tissue-bone interfaces. And even the field can be shimmed to almost homogeneous in the brain for acquisition of the first image in the training, the field changes when volunteer moves for about 20 degrees and 20mm, which could result in different distortion in the images and cannot be recovered by solely registration. Furthermore, gradient non- linearity, Nyquist ghost, etc, could also distort EPI acquisitions (Cohen and Weisskoff, 1991; Farzaneh et al., 1990; Jezzard and Clare, 1999). The correlation coefficient was proved to be optimal in searching for the most similar image in the training data for a new image. It was not only accurate, but also insensitive to noise because global information in an image has been used. However, the accuracy for motion estimation does not just rely on the best image search. It depends on the completeness of the sampled motion space during training stage. If the 106 sample space is very limited, then the best match image is not able to provide correct motion parameters. In conclusion, to get the best performance with this method, several conditions have to be satisfied: EPI image quality should be good and the volunteer should move slowly and to the full extent (not large because of the tight receive coil used) during the training data acquisition. 107 Chapter 7 Conclusion Head motion during scanning can greatly affect MRI image quality and is a major reason for rendering data useless in both clinical and research application. High- resolution brain structural imaging is especially compromised by head movement because high-resolution scans have longer scan time. Two new motion correction methods were proposed here using either stereo or a single camera to measure and correct rigid body motion in real time. A gradient echo sequence was modified to accept motion estimates before each TR to adjust the gradient orientations and RF frequencies and phases for each excitation. Images were reconstructed on the scanner computer for immediate access. Experiments done with both of the two methods demonstrated a consistent improvement in image quality if motion occurred during the acquisition. The advantage of using an optical system was its totally parallel computation to the pulse sequence, therefore no extra scanning time was needed for motion detection. The camera set up right on top of the head coil provided very high sensitivity for motion estimation. For the motion correction method using stereo cameras, the position estimation accuracy was better than 0.1mm and 0.15?, which has been shown to be good enough for high-resolution imaging. However, it did have several drawbacks that a good calibration took some extra preparation time before experiments, and the ?tracking target? used for motion detection was not totally user friendly. The first problem was not a big issue since stereo calibration only needed to be done offline 108 once as long as the cameras were fixed solidly on the holder (Section 9.4), and the calibration between the cameras and the MRI scanner was simplified to just several minutes. For the second problem, the tracking target was used only to remove noise caused by the scanner. A better shielding of the cameras should be able to solve the problem. For the single camera method, no calibration was needed at all, but only a short training period. The big advantage of this method was that no tracking target was needed, because face features were used directly for motion detection. It is more user friendly. However, the accuracy depended a lot on EPI image quality and sampled motion space during the training stage. For the first problem, parallel imaging (Pruessmann et al., 1999) should be able to improve both the speed and image quality. For the second problem, the volunteer should be instructed to move as much as he can inside the coil during the training data acquisition. 109 Chapter 8 Future Work In this project, two systems were built for prospective motion correction for high- resolution MRI in 7T. The feasibility of using optical tracking system to help MRI image acquisition has been demonstrated. But future work can be carried out to improve the performance further. Both methods calculated motion after acquiring video data. Therefore, there is always a time delay for correction in the sequence. A better solution is to predict motion using the Kalman filter. Another positive feature about Kalman filter is its ability to remove tracking noise, which may further lead to an unnecessary tracking target for the stereo correction method. Currently, the motion calculation was done continuously and whenever it was ready, it was sent to the MRI scan computer. The MRI scan computer then read the motion parameters inside the registers and executed the correction. Therefore when MRI started to read the data in the registers, they may not be fully updated yet. As a result, the adjustment of gradients and RF frequencies and phases may use uncorrected parameters. A future work is to synchronize the motion computation and MRI motion correction to avoid this problem. A better way to acquire training data for the single camera method is to synchronize camera image acquisition with navigators. The purpose of training is to link camera images with motion parameters in MRI system. The EPI takes too long to achieve this. A 3D navigator, however, will take only several milliseconds to tens of milliseconds. As a result, the sampling of complete motion space will take less than 110 one minute. With this method, we will be able to combine the advantage of both optical system (parallel computation with the sequence) and navigator (fast acquisition), and is believed to be more practical. 111 Chapter 9 Appendices 9.1 Derivation of motion in logical coordinate system The rotation and translation ( mlR and mlT ) between MRI physical and logical coordinate systems can be read out from the pulse sequence after the scan-plane is prescribed as in Equation 4.66 (the time variable t0 is dropped for brevity): )( mllmlm TXRX += [9.1] New motion parameters are denoted as mR and mT in physical coordinate system. Use subscripts ?o? and ?n? to define the position before and after motion (short for ?old? and ?new?), we have )( mmmm TXRX += on [9.2] and )( mllmlm TXRX += oo [9.3] The new motion parameters in logical coordinate system Rn and Tn must also satisfy the physical and logical relation, as )( nlnm TXRX += nn [9.4] So, [ ] [ ] nmnmlmlmnlmlmn nmnmllmlmn nmnmmn nmmmn nmnl TTRTRRRXRRR TTRTXRRR TTRXRR TTXRR TXRX ?++= ?++= ?+= ?+= ?= ??? ?? ?? ? ? 111 11 11 1 1 )( )( o o o o nn [9.5] 112 Since the scan plane needs to follow the motion, it means nlX should equal o lX all the time, therefore the rotation term in Equation 9.5 should be the identity matrix and the translation should be zero: ?? ??? += =? ?? ??? =?+ = ??? ? mnmln mlmn nmnmlmlmn mlmn TRTT RRR 0TTRTRRR IRRR 111 1 [9.6] The motion parameters that need to be sent to the sequence for correction are the difference between the new ones and the old ones read-out from the sequence (i.e. mlR and mlT ), which are: mlmcorr RRR = [9.7] mcorcorr TRT 1?= r [9.8] 9.2 MRI Coordinates from DICOM Image To get the MRI physical coordinate (millimeter) of any point from its DICOM image (index) is not very straightforward, because they use two different coordinate systems and different units. DICOM uses patient coordinate system, meaning for axial scan Xd increases right to left Yd increases anterior to posterior Zd increases inferior to superior MRI physical coordinate system on the other hand is defined as: Xm increases left to right Ym increases posterior to anterior 113 Zm increases inferior to superior DICOM headers have directions within the image plane, i.e. the Xd and Yd direction, r1=Header.ImageOrientationPatient[1:3] [9.9] r2=Header.ImageOrientationPatient[4:6] [9.10] where Rd=[r1 r2 r3] is the rotation matrix describing the DICOM image plane in the patient coordinate system. The direction orthogonal to the plane, i.e. the Zd direction, is defined as 213 rrr ?= [9.11] Then inter-slice distances need to be determined for unit transformation. If only one slice is scanned, z = Header.Slicethickness [9.12] Otherwise, it can be calculated from slice positions, as z =( Header(i+1).ImagePositionPatient - Header(i).ImagePositionPatient)r3 [9.13] where i is the index of one slice. Voxel size is therefore, v = [Header.PixelSpacing(1) Header.PixelSpacing(2) z] [9.14] where PixelSpacing saves in-plane pixel size in DICOM. Position of the first voxel in DICOM?s patient coordinate system is v0 = Header.ImagePositionPatientT [9.15] To map voxels to the patient coordinates is ?? ? ?? ?= 1 )(* 0 __ 0 vvRH diagd patienttodicom [9.16] Where diag(v) is a 3?3 matrix with the diagonal elements coming from v. 114 Then map from the patient coordinate system adopted by DICOM to the MRI physical coordinate system is a rotation about the Z axis: ? ? ? ? ? ? ? ? ? ? ? ? ? ? = 1 1 1 1 __ MRItopatientH [9.17] Combine both transformations: patienttoDICOMMRItopatient ____ * HHH = [9.18] Any voxel in DICOM image has MRI coordinate as dm HXX = [9.19] Note that if MATLAB is used, the voxel index is from 1 instead of 0, a translation matrix ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? = 1 11 11 11 mat is needed to be multiplied with Equation 9.18 for a correct mapping. matHHH ** ____ patienttoDICOMMRItopatient= [9.20] 9.3 Gauss-Newton Algorithm Gauss-Newton iteration is a technique to locate the minimum of a multivariate function that is expressed as the sum of squares of non-linear functions (Kelley, 1999; Nocedal and Wright, 2006). In this section, we do not have the ambition to state new concepts on Newton iteration, but just to give a derivation for a better explanation in Section 4.2.1, 4.2.2 and 6.2.1. 115 Let f be the function that maps a measurement vector x through parameter vector p to another measurement y, the solution p is obtained when distance 2? is minimized: 22 );( pxy? f?= [9.21] By assuming );( pxf as a linear function in the neighborhood of p0, a Taylor series expansion will lead to the approximation )();();();(? 000 ppppxpxpx ???+= fff [9.22] The derivative of the error function at p is ));(?();( ? 2? 2 pxyp pxp? ff T ????=?? [9.23] And because );(? pxf is linear, p px p px ? ?= ? ? );();(? 0ff [9.24] Substitute Equation 9.22 and 9.24 to 9.23, we get ))();();(();(2? 0000 2 yppppxpxppxp? ????+??=?? fff T [ 9.25] Let the Jacobian matrix be: p pxJ ? ?= );( 0f [9.26] Then solve for p when 0? 2 =?? p? : 00 1)( p?JJJp += ? TT [9.27] Therefore, the solution p is obtained with an initial estimate p0 and refined iteratively according to: 116 nn TT n p?JJJp += ? + 1 1 )( [9.28] The above algorithm is suitable for minimization with respect to a small number of parameters, but not suitable for minimizing cost functions with respect to large numbers of parameters because of the heavy burden of matrix inverse in Equation 9.28. However, in this project, when many parameters need to be estimated, e.g. 6M+8 unknowns in Equation 4.22, the Jacobian matrix is very sparse and can be taken advantage for great time savings. Use Equation 4.22 as an example, the parameters to be estimated are intrinsic parameters Tyx ppkkyxff ],,,,,,,[ 212100=a and extrinsic parameters ii TR , (using ],...,,[ 621 iiii rrr=b to denote the 6DOF parameters describing them). Then p can be written as TT M TTT ),...,,,( 21 bbbap = [9.29] The Jacobian matrix is therefore ?? ? ? ? ? ? ?? ? ? ? ? ? =??= MM p B B B A A A p mJ ?? 2 1 2 1 [9.30] Where amA ??= /ipi , iipi bmB ??= / . The sparse condition is because of the fact that each projected points ipm depend only on a and ib but not any other )( ijj ?b , so that 0/ =?? jip bm . Then Equation 9.28 can be solved for each iteration step a lot faster based on this sparse matrix. 117 9.4 Stereo System QA The stereo camera calibration takes quite some time but it does not have to be done every time before using it, as long as nothing changes. Therefore, it is important to check the stereo to make sure nothing changed, i.e. a system QA. It can be realized by comparing the 3D position estimations for a checkerboard pattern using either camera or the stereo. Because the checkerboard size is known, we are able to estimate corner positions in 3D from a single camera based on homography (Equation 4.18 ? 4.21). The difference between single camera and stereo estimation can be very large even if a small change in the cameras? parameters changed (Figure 9-1) Figure 9-1 3D reconstruction error if camera parameters change. The solid lines projection is correct and the dotted projection is the result from different camera parameters with a small focal length and principle point change. It?s obvious that the 3D estimation magnify the camera parameters distinction. Figure 9-2 shows the 3D corner positions estimated using left camera (red circles), right camera (blue circles) and stereo (green circles). When correct calibration parameters were used, the three estimations match perfectly as shown in (a). But if a very small change is made on the focal length, e.g. 0.1mm, they differ quite a lot (b). Also, the difference between three estimations can be calculated. For 118 image (a), the maximum error between left-right camera estimation is 0.0098092; the maximum error between left-stereo camera estimation is 0.064398; and the maximum error between right-stereo camera estimation is 0.087699. But for image (b) the maximum error between left-right camera estimation is 0.48134; the maximum error between left-stereo camera estimation is 3.287; and the maximum error between right-stereo camera estimation is 3.107. Therefore, the stereo QA can be accomplished either visually or quantitatively. In real application, the calibration parameters we use do not change, while cameras? status may change accidentally. But the error magnification should be the same as in this simulation. Therefore, each time before the stereo system is used, run the system QA to make sure if a re-calibration is necessary. Figure 9-2 3D position estimation for a checkerboard pattern. Every circle represents a corner of the checkerboard. Red circles represent positions estimated from the left camera; blue circles represent positions estimated from the right camera; and green circles represent those estimated from the stereo. (a) correct calibration parameters were used. (b) focal length was changed with 0.1mm. (a) (b) 119 Chapter 10 Bibliography Abdel-Aziz, Y. I. and Karara, H. M. (1971) Direct linear transformation into object space coordinates in close-range photogrammetry. Proc. Symposium on Close- Range Photogrammetry, 1-18. Arun, K. S., Huang, T. S. and Blostein, S. D. (1987) Least-squares fitting of two 3D point sets. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 9, 698-700. Barnwell, J. D., Smith, J. K. and Castillo, M. (2007) Utility of navigator-prospective acquisition correction technique (PACE) for reducing motion in brain MR imaging studies. AJNR Am J Neuroradiol, 28, 790-1. Bernstein, M. A., King, K. F. and Zhou, Z. J. (2004) Handbook of MRI pulse sequences. Amsterdam ; Boston: Academic Press. Blake, A., Curwen, R. and Zisserman, A. (1993) A framework for spatiotemporal control in the tracking of visual contours. International Journal of Computer Vision, 11, 127-145. Bouguet, J. Y. (1999) Visual methods for three-dimensional modeling. Ph.D. Thesis: California Institute of Technology. Broggi, A. (1998) Vision-based diving assistance. Intelligent Systems and their Applications, IEEE, 13, 22-23. Charbonnier, L. and Fournier, A. (1995) Heading guidance and obstacles localization for an indoor mobile robot. Advanced Robotics, 1995. ICAR '95. Proceedings., 8th International Conference on. pp. 507-513. Chou, C.-H. and Chen, Y.-C. (1990) Moment-preserving pattern matching. Pattern Recognition, 23, 461-474. Cohen, M. S. and Weisskoff, R. M. (1991) Ultra-fast imaging. Magnetic Resonance Imaging, 9, 1-37. 120 Coifman, B., Beymer, D., McLauchlan, P. and Malik, J. (1998) A real-time computer vision system for vehicle tracking and traffic surveillance. Transportation Research Part C: Emerging Technologies, 6, 271-288. Daniilidis, K. and Ernst, J. (1996) Active intrinsic calibration using vanishing points. Pattern Recognition Letters, 17, 1179-1189. Derbyshire, J. A., Wright, G. A., Henkelman, R. M. and Hinks, R. S. (1998) Dynamic scan-plane tracking using MR position monitoring. J Magn Reson Imaging, 8, 924-32. Dold, C., Zaitsev, M., Speck, O., Firle, E. A., Hennig, J. and Sakas, G. (2006) Advantages and limitations of prospective head motion compensation for MRI using an optical motion tracking device. Acad Radiol, 13, 1093-103. Duda, R. O., Hart, P. E. and Stork, D. G. (2001) Pattern classification. New York: Wiley. Duyn, J. H., van Gelderen, P., Li, T. Q., de Zwart, J. A., Koretsky, A. P. and Fukunaga, M. (2007) High-field MRI of brain cortical substructure based on signal phase. Proc Natl Acad Sci U S A, 104, 11796-801. Eggert, D. W., Lorusso, A. and Fisher, R. B. (1997) Estimating 3-D rigid body transformations: a comparison of four major algorithms. Machine Vision and Applications, 9, 272-290. Ehman, R. L. and Felmlee, J. P. (1989) Adaptive technique for high-definition MR imaging of moving structures. Radiology, 173, 255-63. Eviatar, H., Saunders, J. and Hoult, D. (1997) Motion Compensation by gradient adjustment. Proc Intl Magn Reson Med, 1898. Eviatar, H., Schattka, B., Sharp, J., Rendell, J. and Alexander, M. (1999) Real time head motion correction for functional MRI. Proc Intl Magn Reson Med, 269. Faig, W. (1975) Calibration of close-range photogrammetric systems: mathematical formulation. Photogrammetric Engineering and Remote Sensing, 41, 1479- 1486. 121 Farzaneh, F., Riederer, S. J. and Pelc, N. J. (1990) Analysis of T2 limitations and off- resonance effects on spatial resolution and artifacts in echo-planar imaging. Magnetic Resonance in Medicine, 14, 123-139. Faugeras, O. (1993) Three-dimensional computer vision : a geometric viewpoint. Cambridge, Mass.: MIT Press. Faugeras, O. and Robert, L. (1996) What can two images tell us about a third one? International Journal of Computer Vision, 18, 5-19. Faugeras, O. and Toscani, G. (1986) The calibration problem for stereo. Computer Vision and Pattern Recognition, 1986. Proceedings., 1986 IEEE Computer Society Conference on. pp. 15-20. Friston, K., Ashburner, J., Frith, C., Poline, J., Heather, J. and Frackowiak, R. (1995) Spatial registration and normalization of images. Human Brain Mapping, 2, 165-189. Friston, K. J. (2007) Statistical parametric mapping : the analysis of funtional brain images. Amsterdam ; Boston: Elsevier/Academic Press. Friston, K. J., Williams, S., Howard, R., Frackowiak, R. S. and Turner, R. (1996) Movement-related effects in fMRI time-series. Magn Reson Med, 35, 346-55. Fu, Z. W., Wang, Y., Grimm, R. C., Rossman, P. J., Felmlee, J. P., Riederer, S. J. and Ehman, R. L. (1995) Orbital navigator echoes for motion measurements in magnetic resonance imaging. Magn Reson Med, 34, 746-53. Gerlach, W. and Stern, O. (1924) Ueber die Richtungsquantelung im Magnetfeld. Ann Phys, 74, 673. Glover, G. H. and Lee, A. T. (1995) Motion artifacts in fMRI: comparison of 2DFT with PR and spiral scan methods. Magn Reson Med, 33, 624-35. Glover, G. H. and Noll, D. C. (1993) Consistent projection reconstruction (CPR) techniques for MRI. Magn Reson Med, 29, 345-51. 122 Glover, G. H. and Pauly, J. M. (1992) Projection reconstruction techniques for reduction of motion effects in MRI. Magn Reson Med, 28, 275-89. Gorodnichy, D. O. and Roth, G. (2004) Nouse [`]use your nose as a mouse' perceptual vision technology for hands-free games and interfaces. Image and Vision Computing, 22, 931-942. Hager, G. D. and Belhumeur, P. N. (1998) Efficient region tracking with parametric models of geometry and illumination. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 20, 1025-1039. Hall, E. L., Tio, J. B. K., McPherson, C. A. and Sadjadi, F. A. (1982) Measuring Curved Surfaces for Robot Vision. Computer, 15, 42-54. Haralick, R. M., Joo, H., Lee, C., Zhuang, X., Vaidya, V. G. and Kim, M. B. (1989) Pose estimation from corresponding point data. Systems, Man and Cybernetics, IEEE Transactions on, 19, 1426-1446. Harris, C. and Stephens, M. (1988) A combined corner and edge detector. Alvey Vision Conference. pp. 147-151. Hartley, R. and Zisserman, A. (2003) Multiple view geometry in computer vision. Cambridge, UK ; New York: Cambridge University Press. Hartley, R. I. and Sturm, P. (1997) Triangulation. Computer vision and image understanding, 68, 146-157. Horn, B. K. P. (1987) Closed-form solution of absolute orientation using unit quaternions. J. Opt. Soc. Am. A, 4, 629-642. Horn, B. K. P., Hilden, H. M. and Negahdaripour, S. (1988) Closed-form solution of absolute orientation using orthonormal matrices. J. Opt. Soc. Am. A, 5, 1127- 1135. Hu, Y. and Glover, G. H. (2007) Three-dimensional spiral technique for high- resolution functional MRI. Magn Reson Med, 58, 947-51. 123 Jenkinson, M., Bannister, P., Brady, M. and Smith, S. (2002) Improved optimization for the robust and accurate linear registration and motion correction of brain images. Neuroimage, 17, 825-41. Jezzard, P. and Clare, S. (1999) Sources of distortion in functional MRI data. Human Brain Mapping, 8, 80-85. Kadah, Y. M., Abaza, A. A., Fahmy, A. S., Youssef, A. B., Heberlein, K. and Hu, X. P. (2004) Floating navigator echo (FNAV) for in-plane 2D translational motion estimation. Magn Reson Med, 51, 403-7. Kakadiaris, L. and Metaxas, D. (2000) Model-based estimation of 3D human motion. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 22, 1453- 1459. Kanatani, K. (1994) Analysis of 3-D rotation fitting. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 16, 543-549. Kelley, C. T. (1999) Iterative methods for optimization. Philadelphia: SIAM. Kim, S., Dougherty, L., Rosen, M. A., Song, H. K. and Poptani, H. (2008) Automatic correction of in-plane bulk motion artifacts in self-navigated radial MRI. Magn Reson Imaging, 26, 367-78. Lauterbur, P. C. (1973) Image formation by induced local interactions. Examples employing nuclear magnetic resonance. . Nature, 190-191. Lee, C. C., Grimm, R. C., Manduca, A., Felmlee, J. P., Ehman, R. L., Riederer, S. J. and Jack, C. R., Jr. (1998) A prospective approach to correct for inter-image head rotation in fMRI. Magn Reson Med, 39, 234-43. Lee, C. C., Jack, C. R., Jr., Grimm, R. C., Rossman, P. J., Felmlee, J. P., Ehman, R. L. and Riederer, S. J. (1996) Real-time adaptive motion correction in functional MRI. Magn Reson Med, 36, 436-44. Liang, Z.-P., Lauterbur, P. C. and IEEE Engineering in Medicine and Biology Society. (2000) Principles of magnetic resonance imaging : a signal processing perspective. Bellingham, Wash. New York: SPIE Optical Engineering Press ; 124 IEEE Press. Liu, C., Bammer, R., Kim, D. H. and Moseley, M. E. (2004) Self-navigated interleaved spiral (SNAILS): application to high-resolution diffusion tensor imaging. Magn Reson Med, 52, 1388-96. Lucas, B. D. and Kanade, T. (1981) An interative image registration technique with an application to stereo vision. International Joint Conference on Aritificial Intelligence, 1981. Proceedings.,, 674-679. Mansfield, P. (1977) Nulti-planar image formation using NMR spin-echoes. J Phys C Solid State Phys, 10, L55-L58. Mansfield, P. and Grannell, P. (1973) NMR 'diffraction' in solids? J Phys C Solid State Phys, 6, L422. Marr, D., Poggio, T. and Ullman, S. (1979) Bandpass channels, zero-crossings, and early visual information processing. J. Opt. Soc. Am., 69, 914-916. Meyer, C. H., Hu, B. S., Nishimura, D. G. and Macovski, A. (1992) Fast spiral coronary artery imaging. Magn Reson Med, 28, 202-13. Milella, A. and Siegwart, R. (2006) Stereo-Based Ego-Motion Estimation Using Pixel Tracking and Iterative Closest Point. Computer Vision Systems, 2006 ICVS '06. IEEE International Conference on. pp. 21-21. Nevo, E., Roth, A. and Hushek, M. (2002) An electromagnetic 3D locator system for use in MR scanners. Proc Intl Magn Reson Med, 334. Newman, T. S. and Jain, A. K. (1995) A Survey of Automated Visual Inspection. Computer vision and image understanding, 61, 231-262. Nocedal, J. and Wright, S. J. (2006) Numerical optimization. New York: Springer. Petrie, D. W., Costa, A. F., Takahashi, A., Yen, Y. F. and Drangova, M. (2005) Optimizing spherical navigator echoes for three-dimensional rigid-body motion detection. Magn Reson Med, 53, 1080-7. 125 Pipe, J. G. (1999) Motion correction with PROPELLER MRI: application to head motion and free-breathing cardiac imaging. Magn Reson Med, 42, 963-9. Pipe, J. G., Farthing, V. G. and Forbes, K. P. (2002) Multishot diffusion-weighted FSE using PROPELLER MRI. Magn Reson Med, 47, 42-52. Pipe, J. G. and Zwart, N. (2006) Turboprop: improved PROPELLER imaging. Magn Reson Med, 55, 380-5. Pruessmann, K. P., Weiger, M., Scheidegger, M. B. and Boesiger, P. (1999) SENSE: Sensitivity encoding for fast MRI. Magnetic Resonance in Medicine, 42, 952- 962. Rabi, I. I., Zacharias, J. R., Millman, S. and Kusch, P. (1938) A new method of measuring nuclear magnetic moments. Phys Rev, 53, 318. Santos, J. M., Wright, G. A. and Pauly, J. M. (2004) Flexible real-time magnetic resonance imaging framework. Conf Proc IEEE Eng Med Biol Soc, 2, 1048- 51. Shi, J. and Tomasi, C. (1994) Good features to track. Computer Vision and Pattern Recognition, 1994. Proceedings CVPR '94., 1994 IEEE Computer Society Conference on. pp. 593-600. Slama, C. C., Theurer, C. and Henriksen, S. W. (1980) Manual of photogrammetry. Falls Church, Va.: American Society of Photogrammetry. Spuentrup, E., Stuber, M., Botnar, R. M., Kissinger, K. V. and Manning, W. J. (2002) Real-time motion correction in navigator-gated free-breathing double-oblique submillimeter 3D right coronary artery magnetic resonance angiography. Invest Radiol, 37, 632-6. Thesen, S., Heid, O., Mueller, E. and Schad, L. R. (2000) Prospective acquisition correction for head motion with image-based tracking for real-time fMRI. Magn Reson Med, 44, 457-65. Thiel, T., Czisch, M., Elbel, G. K. and Hennig, J. (2002) Phase coherent averaging in magnetic resonance spectroscopy using interleaved navigator scans: 126 compensation of motion artifacts and magnetic field instabilities. Magn Reson Med, 47, 1077-82. Tomasi, C. and Kanade, T. (1992) Shape and motion from image streams under orthography: a factorization method. International Journal of Computer Vision, 9, 137-154. Tomasi, C. and Kanatani, K. (1991) Detection and tracking of point features. Tenical Report: CMU-CS-91-132. Tommasini, T., Fusiello, A., Trucco, E. and Roberto, V. (1998) Making good features track better. Computer Vision and Pattern Recognition, 1998. Proceedings. 1998 IEEE Computer Society Conference on. pp. 178-183. Torr, P. H. S. and Zisserman, A. (1997) Robust parameterization and computation of the trifocal tensor. Image and Vision Computing, 15, 591-605. Tremblay, M., Tam, F. and Graham, S. J. (2005) Retrospective coregistration of functional magnetic resonance imaging data using external monitoring. Magn Reson Med, 53, 141-9. Tsai, R. (1987) A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses. Robotics and Automation, IEEE Journal of, 3, 323-344. Twieg, D. B. (1983) The k-trajectory formulation of the NMR imaging process with applications in analysis and synthesis of imaging methods. Med Phys, 10, 610- 21. Umeyama, S. (1991) Least-squares estimation of transformation parameters between two point patterns. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 13, 376-380. van der Kouwe, A. J., Benner, T. and Dale, A. M. (2006) Real-time rigid body motion correction and shimming using cloverleaf navigators. Magn Reson Med, 56, 1019-32. 127 van Gelderen, P., de Zwart, J. A., Starewicz, P., Hinks, R. S. and Duyn, J. H. (2007) Real-time shimming to compensate for respiration-induced B0 fluctuations. Magn Reson Med, 57, 362-8. Walker, M. W., Shao, L. and Volz, R. A. (1991) Estimating 3-D location parameters using dual number quaternions. CVGIP: Image Understanding, 54, 358-367. Wang, F. N., Huang, T. Y., Lin, F. H., Chuang, T. C., Chen, N. K., Chung, H. W., Chen, C. Y. and Kwong, K. K. (2005) PROPELLER EPI: an MRI technique suitable for diffusion tensor imaging at high field strength with reduced geometric distortions. Magn Reson Med, 54, 1232-40. Wang, L. L. and Tsai, W.-H. (1991) Camera calibration by vanishing lines for 3-D computer vision. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 13, 370-376. Ward, H. A., Riederer, S. J., Grimm, R. C., Ehman, R. L., Felmlee, J. P. and Jack, C. R., Jr. (2000) Prospective multiaxial motion correction for fMRI. Magn Reson Med, 43, 459-69. Welch, E. B., Manduca, A., Grimm, R. C., Ward, H. A. and Jack, C. R., Jr. (2002) Spherical navigator echoes for full 3D rigid body motion measurement in MRI. Magn Reson Med, 47, 32-41. Weng, J., Ahuja, N. and Huang, T. S. (1992a) Matching two perspective views. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 14, 806- 825. Weng, J., Cohen, P. and Herniou, M. (1992b) Camera calibration with distortion models and accuracy evaluation. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 14, 965-980. Wyatt, C. L., Ari, N. and Kraft, R. A. (2005) Spherical navigator registration using harmonic analysis for prospective motion correction. Inf Process Med Imaging, 19, 738-49. Yao, Y.-S. and Chellappa, R. (1995) Tracking a dynamic set of feature points. Image Processing, IEEE Transactions on, 4, 1382-1395. 128 Zaitsev, M., Dold, C., Sakas, G., Hennig, J. and Speck, O. (2006) Magnetic resonance imaging of freely moving objects: prospective real-time motion correction using an external optical motion tracking system. Neuroimage, 31, 1038-50. Zhang, Z. (2000) A flexible new technique for camera calibration. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 22, 1330-1334. Zhang, Z., Deriche, R., Faugeras, O. and Luong, Q.-t. (1994) A robust technique for matching two uncalibrated images through the recovery of the unknown epipolar geometry. Tenical Report: INRIA. Zheng, Q. and Chellappa, R. (1993) Automatic feature point extraction and tracking in image sequences for unknown camera motion. Computer Vision, 1993. Proceedings., Fourth International Conference on. pp. 335-339. Zhou, S. K., Chellappa, R. and Moghaddam, B. (2004) Visual tracking and recognition using appearance-adaptive models in particle filters. Image Processing, IEEE Transactions on, 13, 1491-1506.