ABSTRACT

Title of dissertation: CALIBRATION AND METROLOGY USING STILL AND VIDEO IMAGES
Feng Guo, Doctor of Philosophy, 2007

Dissertation directed by: Professor Rama Chellappa, Department of Electrical and Computer Engineering and Department of Computer Science

Metrology, the measurement of real-world metrics, has been investigated extensively in computer vision for many applications. The prevalence of video cameras and sequences has led to the demand for fully automated systems. Most existing video metrology methods are simple extensions of still-image algorithms, which have certain limitations, requiring constraints such as parallelism of lines. New techniques are needed in order to achieve accurate results for broader applications. An important preprocessing step, and a topic closely related to metrology, is calibration using planar patterns. Existing approaches lack flexibility and robustness when extended to video sequences. This dissertation advances the state of the art in calibration and video metrology in three directions: (1) the concept of partial rectification is proposed, along with new calibration techniques using a circle with diverse types of constraints; (2) new calibration methods for video sequences using planar patterns undergoing planar motion are proposed; and (3) new algorithms extending video metrology to a wide range of applications are presented. A fully automated system using the new techniques has been built for measuring the wheelbases of vehicles.
CALIBRATION AND METROLOGY USING STILL AND VIDEO IMAGES

by Feng Guo

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2007

Advisory Committee:
Professor Rama Chellappa, Chair/Advisor
Professor Larry Davis
Professor David Jacobs
Professor Dave Mount
Professor Min Wu

© Copyright by Feng Guo 2007

Dedication

To my parents, Jihua Guo and Suyue Shi, who have always supported me in all of my endeavors. I would not be where I am were it not for their steadfast encouragement. And to my wife, Qing Xie, who showed unwavering faith in me even when I doubted myself the most, and who sat through many a physics lesson with patience, grace, and love, all for my sake.

ACKNOWLEDGMENTS

I would like to take this opportunity to thank my advisor, Professor Rama Chellappa, for giving me an invaluable opportunity to work on challenging and extremely interesting projects over the past several years. It has been a pleasure to work with and learn from such an extraordinary individual. This experience will benefit me forever.

Thanks are due to Professor Larry Davis, Professor David Jacobs, Professor Dave Mount, and Professor Min Wu for agreeing to serve on my thesis committee and for sparing their invaluable time to review the manuscript.

I am very grateful to Dr. Qinfen Zheng, my co-advisor; Professor Daniel Stancil and Professor T. E. (Ed) Schlesinger, my advisors at Carnegie Mellon University; and Dr. Dorin Comaniciu and Dr. Kevin S. Zhou, my supervisors at Siemens Corporate Research, for their support and guidance.

My colleagues at the Computer Vision Lab have enriched my graduate life in many ways and deserve special mention. My interactions with Aswin C. Sankaranarayanan, Haibin Ling, Zhanfeng Yue, Yang Ran, Ashok Veeraraghavan, Jie Shao, Jian Li, Seong-Wook Joo, Volkan Cevher, Xue Mei, and Naresh Cuntoor have been very fruitful.
I would also like to acknowledge the help and support of Fatima Bangura and other staff members in the department.

I owe my deepest thanks to my family: my parents Jihua and Suyue, and my sister Jun, who have always stood by me and guided me through my career; my wife, Qing, for always being there to support me and for being a constant source of encouragement during my Ph.D.; and my daughter, Katie, for bringing me endless happiness.

Last, but far from least, I want to express my thanks to all the people who have made this thesis possible and because of whom my graduate experience has been one that I will cherish forever.

Table of Contents

List of Tables
List of Figures

I INTRODUCTION AND BACKGROUND

1 Introduction
  1.1 Motivation
  1.2 Challenges in Video Metrology
  1.3 Existing Approaches and their Limitations
    1.3.1 Calibration
    1.3.2 Metrology
  1.4 Contributions
  1.5 Summary

2 Background
  2.1 Notation
  2.2 Elements of projective geometry
    2.2.1 Homogeneous coordinates and projective transformation
    2.2.2 Camera Model
    2.2.3 Line, conic and their transformations
    2.2.4 Vanishing point, vanishing line, and minimal calibration
    2.2.5 Pole-polar relationship
    2.2.6 Circular Points
  2.3 Invariants
    2.3.1 Invariant Forms
    2.3.2 Related Work
  2.4 Camera Calibration
    2.4.1 Calibrate Camera Using Planar Patterns
    2.4.2 Related works
  2.5 Metrology
    2.5.1 Categories
    2.5.2 Applications
    2.5.3 Related Work

II IMAGE CALIBRATION USING A CIRCLE AND OTHER CONSTRAINTS

3 Partial Rectification
  3.1 Introduction
  3.2 Partial Rectification
    3.2.1 View from Group Theory
  3.3 Partial Rectification Using a Circle
    3.3.1 Mapping center
    3.3.2 Solution Using Mapping Center
    3.3.3 Ambiguity and Solution in Multiple Constraints
    3.3.4 Solutions for Existing Partial Rectification
      3.3.4.1 A Line l Passing Through the Circle's Center
      3.3.4.2 A Vanishing Point v
      3.3.4.3 A Coplanar Circle
    3.3.5 Performance under noise

4 Constraints Related to Lengths
  4.1 Point Distance Constraint
    4.1.1 An Arbitrary Mapping Center Solution
    4.1.2 Corresponding Subgroup (Approach One)
    4.1.3 Corresponding Subgroup (Approach Two)
    4.1.4 Geometric Properties of The Solution Curve
    4.1.5 General Solution and Validation
    4.1.6 Perturbation Analysis
  4.2 Metric rectification using two distance constraints
    4.2.1 Uniqueness analysis
    4.2.2 Equal angles in two-solution cases
    4.2.3 Experiments
  4.3 Line distance constraint
  4.4 Equal Distance Constraint
    4.4.1 Corresponding subgroup approach
    4.4.2 General solution
    4.4.3 Invariant among the pattern
    4.4.4 Perturbation Analysis
    4.4.5 Detection of Center for Concentric Circles
    4.4.6 Experiments

5 Constraints Related to Angles
  5.1 Two Special Cases
    5.1.1 Case 1: Two Edges Tangent to The Circle
    5.1.2 Case 2: Vertex on the Circle
  5.2 Problem Reduction
  5.3 Identify the Subgroup
  5.4 Solution for Right Angles
    5.4.1 Geometric Properties of the Solution
    5.4.2 Validity of the Solution
    5.4.3 Synthetic Data Experiments
    5.4.4 Real Image Experiments
  5.5 Invariants of the Partially Rectified Pattern
    5.5.1 Approach 1
    5.5.2 Approach 2
    5.5.3 Application
    5.5.4 Experiments

III VIDEO CALIBRATION FOR OBJECTS IN PLANAR MOTION

6 Video Camera Calibration Using Objects in Planar Motion
  6.1 Introduction
  6.2 Calibration Using Line Segment
    6.2.1 Algorithm for Pin-hole Camera Model
    6.2.2 Algorithm for General Camera Model
    6.2.3 Experiment
  6.3 Calibration Using an Ellipse
    6.3.1 Area of an Ellipse
    6.3.2 Affine Rectification: From Projective to Affine
      6.3.2.1 Removing Translation and Skewness
      6.3.2.2 Transformation of Axes
      6.3.2.3 Calculate s and r with Respect to a Given Direction
      6.3.2.4 Estimation of Vanishing Lines from the Given Area Ratio of Two Ellipses
      6.3.2.5 Solve the Area Ratio Constraint
      6.3.2.6 Properties of The Solution
      6.3.2.7 Validation of The Solution
      6.3.2.8 Affine Rectification
    6.3.3 Metric Rectification: From Affine to Metric
    6.3.4 Calibration using synthetic data
  6.4 Rectification Using Arbitrary Shape
    6.4.1 Experiments
      6.4.1.1 Synthetic Leaf Data
      6.4.1.2 Stationary Video
      6.4.1.3 Camera undergoing Planar Motion
    6.4.2 Application
    6.4.3 Experiments

IV VIDEO METROLOGY

7 Video Metrology Using a Stationary Camera
  7.1 Introduction
  7.2 Two Lemmas
    7.2.0.1 Perturbation Analysis
    7.2.0.2 Perturbation Analysis
  7.3 Measure On-plane Line Segments
    7.3.1 Reference Lengths in Three Frames
    7.3.2 Perturbation Analysis
    7.3.3 Reference and Probe Lengths in Multiple Frames
    7.3.4 Performance in Noisy Environments
  7.4 Synthetic Data Experiments
    7.4.1 Metrology Algorithm
  7.5 System Implementation
    7.5.1 Wheels Detection and Tracking
    7.5.2 Minimal Calibration
    7.5.3 Indoor Video Experiments
    7.5.4 Outdoor Video Experiments
  7.6 Measure All Line Segments
    7.6.1 Simplified Case
    7.6.2 General Case
    7.6.3 Experiments
  7.7 Frame Selection for Camera in Planar Motion
    7.7.1 Synthetic Data Results
    7.7.2 Indoor Video Experiments

8 Summary and Future Research Directions
  8.1 Summary of Contributions
  8.2 Limitations and Future Work

Bibliography

List of Tables

4.1 Statistical performance comparison between Jiang's method and ours, measured by the distance (in pixels) between the detected centers of concentric circles and the ground truth.
5.1 Errors in the right angles of the chessboard appearing in the rectified image, in degrees (°). Each value corresponds to an angle in Figure 5.6.
5.2 Cumulative Match Characteristics (CMCs) of the matching of corners between two chessboard images.
6.1 Statistics of the calibration performance.
6.2 Performance comparison among Jiang's center-of-concentric-circles detection method [23], our method using the equal-distances constraint (Section 4.4.5, denoted "Dist"), and this method (denoted "Area").
7.1 Wheelbase metrology results.
7.2 Wheelbase metrology results for a camera in planar motion.

List of Figures

2.1 Illustration of metrology setup.
2.2 The pole-polar relationship. The lines lp = cp and lq = cq are the polars of the points p and q with respect to the conic c. The point r is the pole of the line pq with respect to c. The pole-polar relationship is projectively invariant. For a circle C, OP ⊥ OQ iff Lp ⊥ Lq.
2.3 Illustration of the cross-ratio (a) of points (b) of lines.
2.4 Patterns used for calibration: (a) concentric circles [25]; (b) coplanar circles [20]; (c) a circle and multiple lines passing through the center of the circle [51]; and (d) a circle and two sets of parallel lines [9].
3.1 Illustration of the homography transformation groups defined by partial rectifications. P: group of all projective transformations. Gsim: subgroup corresponding to similarities, which preserves all the geometric properties. Gcir: subgroup that keeps a unit circle C fixed. G1,2: subgroups that keep another geometric property besides keeping C fixed. Gsim = G1 ∩ G2.
3.2 Different rectifications of the same conic when choosing different mapping centers.
3.3 An example of partial rectification. The constraint is a pair of parallel lines, and the solution is the image of the diameter st.
3.4 Rectification performance when the mapping center is perturbed. (a) Arrow plot. Perturbation in the (b) angular direction (c) radial direction.
4.1 Illustration of obtaining an arbitrary solution for Problem 1.
4.2 Illustration of the corresponding subgroup solution for Problem 1, approach one.
4.3 Configuration of the subgroup for Problem 1. M and N are the tangent points and can be obtained as the intersection points between the unit circle C and P's polar line Lp. P, M and N are fixed points during the transformations in the subgroup.
4.4 The curve E as the solution for Problem 2. (a) When P is outside C, E is a part of an ellipse tangent to C; (b) otherwise, E is a whole ellipse entirely inside C.
4.5 Illustration of the three subcases when both points P1,2 are outside C. Red circles relate to P1 and blue triangles to P2.
4.6 Rectification of the signpost in front of Google's headquarters. (a) Google logo, feature points marked as red crosses. (b) Trace of the mapping center from each feature point. (c) Original image. (d) Rectified signpost.
4.7 Illustration of the contradiction when the mapping center X is not on the symmetric axis. E1,2 are the solutions for OP1,2 = d respectively.
4.8 Comparison of center detection algorithms. (a) Jiang's method [24]. (b) Our method; dotted lines are the solutions for the equality constraints. (c) Comparison in a zoomed-in image. Black: ground truth; blue: our detection results; red: Jiang's detection in each round, starting from the red one.
5.1 Solution curves in different cases. (a): P inside C; (b)(d): P on C; others: P outside C. (a)-(c): Q inside C; (d)(e): Q on C; others: Q outside C. (a)-(f): XY < 2; (g)-(i): XY = 2; (j)-(l): XY > 2. (j): 1/X^2 + 1/Y^2 < 1; (k): 1/X^2 + 1/Y^2 = 1; (l): 1/X^2 + 1/Y^2 > 1.
5.2 Illustration of the validity of solutions for a planar pattern. Due to the appearance of ∠srt, the shaded area is not valid. The valid solution is constrained to lie on the solid red curve.
5.3 Illustration of the synthetic data setup when focal length f = 50, camera height h = 5, elevation angle 45° and rotation angle 45°. (a) Detected integer points in gray pixels, and fitted circle and square in black curves. (b) Solution curves obtained using orthogonal-line constraints from four corners, shown in different colors.
5.4 Evaluation of methods I (straight line) and II (line with symbols) to estimate (a) focal length, (b) elevation angle, and (c) rotation angle, vs. the circle's radius. f = 50, rotation angle 45°, and h = 5. Red: elevation 30°; green: 45°; blue: 60°. Ground truth values are plotted as black solid lines.
5.5 Evaluation of the two methods. (a) Image plane. Method I uses the points of a checkerboard, shown as red dots; Method II uses a circle and four corners of the checkerboard as right angles, shown as solid blue curves. The solution curves are indicated as black dotted curves. (b) Estimated focal length vs. rotation angle when the elevation angle is 45°. Ground truth: f = 50.
5.6 Experiments for planar Euclidean structure. (a)-(d) School sign. (a) Detected circles and lines on the original image. (b) Zoomed solution curves, with the unzoomed view at upper right. (c) Rectified image using o as the mapping center. (d) Rectified images using ambiguous mapping centers 1-3. (e)-(g) Desk plane. (e) Detected circles and lines on the original image. (f) Solution curves. (g) Rectified image and the right angles used for evaluation.
5.7 Illustration of the invariant deduction.
5.8 Experiments for image registration. (a) Two images. (b) Match value for each pair of corners.
6.1 Experiments using synthetic data for vanishing line estimation. (a) A sample of the images of wheelbases and the vanishing line. (b) Log mesh plot of the variance of the calculated angle; very high values are truncated to include the minimum value, which is marked and is close to the ground truth cross mark (×). (c) Feet of the perpendiculars to the vanishing lines using 10, 20, 40 and 80 frames, ten times each; the cross mark (×) is the ground truth value. (d) The estimated angle using different numbers of frames; the ground truth value is 70°.
6.2 Given the vanishing line, an upright ellipse e0 can be used for area-ratio comparison instead of e. st and pq are the parallel and orthogonal axes of e. dp, dq and dst are the distances from p, q and s (t) to the vanishing line l∞, respectively.
6.3 Calculation of the transformed parallel axis.
6.4 Solution curves under four conditions. Solid circles are the circles whose areas are to be equalized; the one with a plus mark is a unit circle. The dashed curve is the solution curve, and only the solid part is valid. Dotted lines are common tangents of the two circles.
6.5 (a) An affine transformation keeps the area of an ellipse constant. (b) Examples of three ellipses with equal areas that cannot simultaneously be affine-transformed into the same shapes.
6.6 Synthetic experiment. (a) Three projected identical ellipses are shown as solid curves. The black ellipse is fixed and the others are randomly drawn. Dashed curves are the two solution curves that equalize the area ratio between the red/blue and the black ellipse. The dashed black line is the estimated vanishing line, while the gray solid line is the ground truth. (b) Rectified results (solid curves) compared with the ground truth (dashed curves). Two sets of parallel lines of a square and one solid circle are used for evaluation. (c) Magnified histogram of the two angles between the two sets of parallel lines. (d) The ratio of the two axes of the big circle (above, ground truth 1) and the focal length estimate (below, ground truth 150).
6.7 Synthetic data for projected leaves. (a)-(d): set 1; (e)-(h): set 2. (a)(e) Leaf images in the database. (b)(f) Projected blobs with fitted ellipses as solid curves and curves of poles as dashed curves. (c)(g) First-round rectification results; re-fitted ellipses are shown as dotted lines. (d) "False alarms". (h) Result after second-round rectification.
6.8 Rectification of a stationary video using a moving vehicle. (a) A sample frame when a car is turning. (b) Masks of the cars. (c) Curves of poles. (d) Rectification result.
6.9 Rectification of a video in planar motion using a tennis racket. (a) A sample frame. (b) Selected masks and all the fitted ellipses. (c) Rectification result for (a). (d) Rectification result for another frame.
6.10 Experiment with synthetic data for detection of the center of concentric circles. (a) The image. (b) The area of the ellipse. (c) The detection error in distance.
7.1 Illustration of Lemma 6.
7.2 An illustration of Lemma 7. mn is moved parallel to ok, where o is given and k is unknown. p and q are two points on the vanishing line.
7.3 Illustration of Problem 10 and its solutions. Red, green and blue line segments are the reference lengths and the black segment is the probe.
7.4 A poorly fitted ellipse is improved significantly by adding one more point. The red ellipse is fitted to the red (noisy) line segments; the blue ellipse is fitted to both red and blue line segments. The dashed black ellipse is the ground truth.
7.5 Synthetic data for the metrology algorithm. (a) Traces of wheelbases from two vehicles. (b) Mean metrology error. Red: result using rectification; green: our ellipse-fitting metrology algorithm without perturbation analysis; blue: our metrology algorithm with perturbation analysis (weighted). (c) Standard deviation of the metrology results.
7.6 Flow chart of the wheelbase measuring system.
7.7 Skew coordinates using the vehicle's moving direction.
7.8 Indoor sequence. (a) A sample frame, number 4190. (b) Masks after removing pixels with intensity lower than 0.00, 0.05, 0.10 and 0.15, respectively. (c) Samples of retrieved wheels from one vehicle; the same color indicates a pair (one front, one rear) from one frame. (d) Endpoints after parallel moving for two vehicles: red for the larger toy car, green for the smaller toy car; "+" indicates the common endpoint.
7.9 Sample frames from outdoor sequences. (a) Toyota Camry. (b) Honda Civic. (c) Hyundai Elantra. (d) Ford Explorer.
7.10 Samples of detected wheels. The detected wheel centers are indicated by white crosses.
7.11 Wheelbase metrology probability plots.
7.12 Illustration of Theorem 4. Red dashed and blue solid lines indicate frames 1 and 2, respectively. The black dash-dot line is the transformed frame 1.
7.13 Metrology of the tracked points from the reference wheelbase.
7.14 Synthetic data for frame selection. (a) The vanishing lines represented by their feet. Black crosses: the vanishing line of frame 0; green crosses: the vanishing lines of other frames; blue dots: fixed lines obtained from the homography matrices between frame 0 and the other frames; red circles: selected vanishing lines whose corresponding fixed line is close to the vanishing line of frame 0. (b) The standard deviations (STD) of the coordinates (x as blue crosses, y as red pluses) of the feet of the selected vanishing lines vs. the distance threshold d0 used. (c) The number of selected frames vs. d0 over 10,000 frames. (d) The ratio of the number of selected frames to the number of frames that should be selected, vs. d0.
7.15 Wheelbase metrology using a planar-motion camera.
Part I: INTRODUCTION AND BACKGROUND

Chapter 1: Introduction

1.1 Motivation

Metrology, the field of measuring the geometric attributes of an object, has been one of the oldest topics in many areas of science and technology and is still an area of active research. The prevalence of camera applications has motivated the study of metrology using computer vision techniques [11][22]. Compared to traditional methods, computer vision techniques have several advantages: (1) visualization allows better interaction between the human and the machine; (2) sensing is passive, without affecting the objects being measured; and (3) the equipment is relatively cheap. These advantages make metrology using cameras an important topic to study, with applications in surveillance systems.

Techniques developed for still-image metrology have reached maturity after more than a decade of work. Recently, video metrology has received more attention [3][33][43] because it has certain advantages over still-image metrology:

- Video provides more reliable results. Even a low-frame-rate video sequence contains hundreds of frames. The voluminous information available in a video makes video metrology preferable to image metrology. First, some restrictions imposed on images (e.g., that both reference and probe must appear concurrently without occlusion) can be eliminated. Second, because noise can be minimized statistically using the redundancy across video frames, video metrology does not require high-resolution frames; a moderately priced video camera can provide accurate measurement results.

- Video leads to easier automation. A video camera can be set up and then run twenty-four hours a day without human intervention, while still-image cameras need manual operation (e.g., a shutter click). In both image-based and video-based processing approaches, points used for metrology need to be marked.
Although automatic detection of feature points in images is feasible, these points in general do not have any physical interpretation, so the measurement results would be meaningless without knowledge of what the points represent. In video sequences, by contrast, point detection and tracking have become more reliable, and contextual interpretation of the detected points is possible.

As a topic in projective geometry, metrology is closely connected to other topics such as planar rectification, Euclidean structure reconstruction, and motion estimation. Among these, camera calibration has the closest tie to metrology because (1) calibration is a prerequisite step for metrology, and (2) measurement results can be used for calibration. Thus, research on calibration is important to metrology.

1.2 Challenges in Video Metrology

The procedure for video metrology is straightforward: set up the video camera, obtain minimal calibration, detect and track points related to the entity being measured, calculate the relative length, convert it into an absolute length using the reference, and estimate the error.

Two types of video cameras are most commonly used in video metrology: a stationary camera, typically used in surveillance systems, and a camera undergoing planar motion, typically mounted on moving vehicles or planes. A prerequisite for metrology is minimal calibration [11], defined as the combination of the vanishing line of the reference plane and the vertical vanishing point. Absolute measurement values can be obtained from a fully calibrated camera [7]. In many cases, however, full calibration is not available. Relative measurement results, i.e., the ratio of the lengths of two line segments, can be obtained using minimal calibration. If the absolute value of one of them (called the reference length) is given, the other (called the probe) can then be calculated. Before applying the metrology algorithm, points related to the quantities to be measured should be extracted from the video sequence using low-level techniques. In most cases, more than one estimate is obtained, so the multiple estimates must be fused into a single result with an error estimate.

Building an automated video metrology system presents several key challenges:

- It is difficult to automatically obtain the minimal calibration. Most existing camera calibration algorithms emphasize the accuracy of intrinsic parameters, while the determination of extrinsic parameters is handled using minimal calibration. For video, more emphasis must be placed on robustness to low-resolution or low-frame-rate sequences than on accuracy. Moreover, most calibration algorithms assume the availability of certain patterns or some prior knowledge of the image, which may not be available in the video sequence or may be hard to detect.

- It is possible to track a large number of moving points in a video sequence, but robust tracking without drift can be obtained for only a few. Further, many of the stable tracks may not have geometric significance. It is extremely important, yet very hard, to choose the right points for metric estimation.

- Usually, multiple estimates are obtained from different frames. These estimates have different error covariances, so a final estimate should not come from a simple mean or median operation. A method is required to fuse the results from different frames into an optimal estimate.

- It is difficult to apply metrology algorithms to a moving camera. Even for a camera exhibiting planar motion, the vanishing line may occupy different positions in different frames due to motion jitter. To the best of our knowledge, there is no discussion of how metrology should work in this situation.
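When images of two pairs of scene-parallel lines on the reference plane can be identified, the minimal-calibration quantities above follow from elementary homogeneous-coordinate operations: each vanishing point is the intersection of one pair, and the vanishing line joins the two vanishing points; both operations reduce to cross products. A minimal sketch in Python, using hypothetical pixel coordinates (not data from the dissertation):

```python
def cross(a, b):
    """Cross product of 3-vectors: gives the line through two homogeneous
    points, or the intersection point of two homogeneous lines."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# Each image line is the join of two tracked points [x, y, 1].
l1 = cross([0, 0, 1], [100, 10, 1])    # first pair of scene-parallel lines
l2 = cross([0, 50, 1], [100, 35, 1])
m1 = cross([0, 0, 1], [50, -15, 1])    # second pair
m2 = cross([0, 60, 1], [50, 75, 1])

v1 = cross(l1, l2)                     # vanishing point of the first pair
v2 = cross(m1, m2)                     # vanishing point of the second pair
horizon = cross(v1, v2)                # vanishing line of the reference plane

# Inhomogeneous coordinates of the vanishing points:
vp1 = (v1[0] / v1[2], v1[1] / v1[2])   # → (200.0, 20.0)
vp2 = (v2[0] / v2[2], v2[1] / v2[2])   # → (-100.0, 30.0)
```

In practice the tracked endpoints are noisy, so many line pairs would be combined (e.g., by least squares) rather than taking a single cross product per pair, which is part of why automatic minimal calibration is hard.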
1.3 Existing Approaches and their Limitations

1.3.1 Calibration

Zhang [55] proposed a simple calibration algorithm using a planar image, which has been widely adopted in video calibration. This method works well with a large cluster of planar points. When the scene has little texture, however, it may not provide reliable results. Many similar algorithms using circles [1][8][19][18][26][23][52] rely on detecting one or more ellipses, which is considered difficult for video sequences. Another method for calibration uses parallel lines [29]. As there is no general way to judge whether two lines are parallel, parallel lines are often manually marked. Another issue is that there may not be a sufficient number of parallel lines for reliably estimating the vanishing line. Thus, this method only works on some carefully selected images or videos. Several methods have been designed to overcome these limitations. For example, Lv et al. [33] extracted the vanishing line of the ground plane using a walking human, assuming that the height of the human blob is a constant. However, this approach is not robust [43], as a walking human is not a rigid object. Bose and Grimson [3] estimated the vanishing point under the assumption that vehicles move at constant speeds. In most cases, this assumption does not hold.

1.3.2 Metrology

Existing image-based metrology approaches can be classified into two categories: parallel-line mensuration and planar rectification. Criminisi et al. [12] contributed to height estimation in single and multiple views, where the line segments being compared are constrained to be parallel. Liebowitz and Zisserman [29] presented an algorithm for planar rectification using the constraints of two line segments or a fixed angle. Chen and Ip [9] rectified an image using a vanishing line and circular points and further measured distances. Wang et al. [48] measured on-plane line segments using two sets of perpendicular lines.
Existing video-based metrology techniques are simple extensions of image-based metrology approaches. For example, Bose and Grimson [3] rectified the plane using a stationary camera and Liebowitz's method [29]. Liang et al. [28] measured height using a purely translating camera, as a direct extension of Criminisi's method [11]. The limitations of image-based metrology methods still stand, and no error estimation is provided. Stauffer et al. [43] empirically developed a planar normalization method for a stationary video. Their result is empirical, and there is no theoretical proof that the result can be improved.

1.4 Contributions

The contributions of this dissertation are categorized into three parts: (1) a new approach for solving calibration problems; (2) robust calibration methods for video; and (3) an automatic video metrology algorithm and its system implementation.

We present a new calibration method using a circle and different types of constraints in Chapter 3. An intermediate set of partial rectification problems is identified using techniques from the field of group theory. This concept provides a general solution structure for partial rectification problems with different types of constraints. The concept is applied to calibration using planar patterns that include a circle. The solution is represented as a curve containing all valid mapping centers. To be more specific, two sets of constraints are studied. The first set involves distance constraints. Given the distance from a point to the circle, the partial rectification problem is solved by identifying the fixed points that are eigenvectors of the homography matrix (Section 4.1). This solution is used to solve problems with other constraints, such as the line distance constraint, the fixed arc angle constraint, etc. The solution is further extended to incorporate an equal-distance constraint, i.e., given two points with equal distance to the circle (Section 4.4).
The second set involves angle constraints. We deal specifically with the constraint of a pair of orthogonal lines (Section 5.4). Investigating the properties of the solution curve allows the solution to be easily obtained from the original image. As applications, we propose a new method to fit concentric circles, and a new method for image registration.

We then discuss new calibration methods for a video sequence in which a planar object moves on the ground. In Section 6.2, we present a two-step calibration algorithm using a line segment of fixed length moving on the reference plane. The first step obtains a coarse estimate under the assumption of a pin-hole camera model; the second step recursively tunes the estimate. Another calibration algorithm uses an ellipse in planar motion (Section 6.3). It also consists of two steps: (1) affine rectification by equalizing the sizes of the ellipses, and (2) metric rectification by equalizing the shapes of the ellipses. This algorithm has been generalized to any planar object moving on the reference plane.

Finally, the design of a metrology algorithm and the corresponding system implementation are discussed in Section 7.3. The problem is reduced to a concentric-circle fitting problem accompanied by an error analysis. Based on a technique for fast wheel-center detection, a real-time automated system for wheelbase measurement of a moving vehicle using a stationary video camera is built (Section 7.5). The measured wheelbases are used to determine vehicle sizes. When applying the algorithm to the measurement of stationary objects using a camera in planar motion, the jitter effects need to be removed. A novel method to compare the vanishing line between two frames using the plane-induced homography is described in Section 7.7. The method helps to select frames with similar calibration information for metrology.
1.5 Summary

The overall goal of this dissertation is to investigate video metrology problems, including feasibility, accuracy, implementation, and applications. New calibration methods are developed to be used in metrology work. A novel algorithm is developed to achieve high accuracy for video mensuration, followed by the implementation of a fully automated system.

Chapter 2

Background

2.1 Notation

In the rest of the dissertation, we use upper case letters to indicate shapes in the world system (or a partially rectified image) and the corresponding lower case letters for their projected images. For example, p is the image of a world point P. A line segment joining two endpoints P and Q is denoted as PQ and its length as ||PQ||. Furthermore, we label the reference plane as R and its vanishing line as L. The vertical vanishing point is denoted as z.

2.2 Elements of projective geometry

In this section, we introduce some fundamental terminology of projective geometry, including homogeneous coordinates, the vanishing point, the vanishing line, circular points, the pole-polar relationship, etc.

2.2.1 Homogeneous coordinates and projective transformation

A point in an n-dimensional space R^n with coordinates (x_1, x_2, ..., x_n) can be represented by a length-(n+1) sequence (kx_1, kx_2, ..., kx_n, k), where k is a non-zero real number. This representation is called homogeneous coordinates. A point's homogeneous coordinates are not uniquely defined, but only up to the scaling factor x_{n+1}. When x_{n+1} = 0, the point is at infinity. A projective transformation of the projective space R^n is represented by a linear transformation of homogeneous coordinates:

P' = H_{(n+1)×(n+1)} P   (2.1)

where H is an arbitrary non-singular matrix.

2.2.2 Camera Model

Let P = (X, Y, Z, 1)^T be the 3-D homogeneous coordinates of a world point and p = (x, y, 1)^T be the 2-D homogeneous coordinates of its projection on the image plane. They are related by a 3×4 projection matrix M:

p ≃ MP   (2.2)

and M can be written as

M = K(R | v)   (2.3)

where K is the 3×3 upper-triangular camera intrinsic matrix, R is the 3×3 rotation matrix, and v is the 3×1 translation vector.

2.2.3 Line, conic and their transformations

A line L in a plane is represented by an equation ax + by + c = 0, or by the vector (a, b, c)^T. A point P lies on the line L if and only if L^T P = 0. In homogeneous coordinates, the intersection of two lines L and L' can be written as P = L × L', and the line passing through two points P and P' can be written as L = P × P'. A conic is a curve described by a second-degree equation in the plane, of three main types: hyperbola, ellipse, and parabola. The equation of a conic in homogeneous coordinates is

a x_1^2 + 2b x_1 x_2 + c x_2^2 + 2d x_1 x_3 + 2e x_2 x_3 + f x_3^2 = 0   (2.4)

or, in matrix form, P^T C P = 0, where the conic coefficient matrix C is given by

C = | a b d |
    | b c e |
    | d e f |   (2.5)

Under the point transformation P' = HP, a line L is transformed to L' = H^{-T} L and a conic C is transformed to C' = H^{-T} C H^{-1}. For an ellipse E, we use O(E) to represent its center and a 2×2 matrix E to represent its shape, such that E = T^{-T} E_O T^{-1}, where

E_O = | E_11 E_12  0 |        T = | 1 0 O(E)_1 |
      | E_21 E_22  0 |            | 0 1 O(E)_2 |
      |  0    0   -1 |            | 0 0   1    |   (2.6)

We further use A(E) to represent the area of E.

2.2.4 Vanishing point, vanishing line, and minimal calibration

The image of a point at infinity V_∞ is called a vanishing point v_∞, which can be detected as the intersection of the images of a pair of parallel lines. For the parallel lines in all directions on a plane, the corresponding vanishing points form a straight line, called the vanishing line [22].

Figure 2.1: Illustration of the metrology setup (reference plane, vertical point, vanishing line, camera center).

A common metrology setup includes one or more parallel planes, which are called the reference plane(s).
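The transformation rules of Section 2.2.3 (P' = HP, L' = H^{-T}L, C' = H^{-T}CH^{-1}) can be checked numerically before moving on; incidence relations such as L^T P = 0 and P^T C P = 0 must survive any non-singular homography. This is an illustrative sketch with an arbitrary homography, not code from the dissertation.

```python
import numpy as np

# An arbitrary non-singular homography (det = 25).
H = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0]])
Hinv = np.linalg.inv(H)

# The point P = (1, 1, 1) lies on the line x + y - 2 = 0, i.e. L = (1, 1, -2).
L = np.array([1.0, 1.0, -2.0])
P = np.array([1.0, 1.0, 1.0])
assert abs(L @ P) < 1e-12          # incidence in the original plane: L^T P = 0

# Transform the point and the line; incidence must be preserved.
P2 = H @ P                         # P' = H P
L2 = Hinv.T @ L                    # L' = H^{-T} L
assert abs(L2 @ P2) < 1e-9

# P also lies on the circle x^2 + y^2 = 2, C = diag(1, 1, -2).
C = np.diag([1.0, 1.0, -2.0])
assert abs(P @ C @ P) < 1e-12      # P^T C P = 0
C2 = Hinv.T @ C @ Hinv             # C' = H^{-T} C H^{-1}
assert abs(P2 @ C2 @ P2) < 1e-9    # the image point lies on the image conic
```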
A reference plane generally refers to the ground plane in surveillance scenarios, denoted as R. In the rest of this work, the vanishing line always refers to the vanishing line of the reference plane, denoted as L. The vanishing point for the reference plane's normal direction is called the vertical (vanishing) point, denoted as z. Knowledge of the vanishing line and the vertical point (L, z) is called minimal calibration. An illustration of the setup with minimal calibration is shown in Figure 2.1. Given the 3×4 camera projection matrix M, the vertical point can be obtained as z = M(0, 0, 1, 0)^T = M_3. The vanishing line can be written as L = (M(1, 0, 0, 0)^T) × (M(0, 1, 0, 0)^T) = M_1 × M_2.

Figure 2.2: The pole-polar relationship. The lines l_p = cp and l_q = cq are the polars of the points p and q with respect to the conic c. The point r is the pole of the line pq with respect to c. The pole-polar relationship is projectively invariant. For a circle C, OP ⊥ OQ iff L_p ⊥ L_q.

2.2.5 Pole-polar relationship

The relationship between a pole and its polar is illustrated in Figure 2.2. The line l = cp is the polar of the point p with respect to the conic c, and the point p = c^{-1} l is the pole of l with respect to c [22]. The pole-polar relationship is invariant under projective transformation. For two points p and q, the intersection of their polars is the pole of the line pq (see Figure 2.2(b)):

(cp) × (cq) ≃ c^{-1}(p × q)   (2.7)

If C is a circle, the center O of C is the pole of the line at infinity L_∞, and the polar of a point at infinity passes through the center O.

2.2.6 Circular Points

The circular points, denoted as I, J = (1, ±i, 0)^T, are the pair of points on the line at infinity through which all circles pass. The absolute conic is defined as Ω_∞ = I J^T + J I^T. The image of the absolute conic (IAC) can be written as

ω = K^{-T} K^{-1}   (2.8)

from which follows the angle between two rays:

cos θ = (x_1^T ω x_2) / (√(x_1^T ω x_1) √(x_2^T ω x_2))   (2.9)

Thus, the IAC is very important to camera calibration and to the estimation of Euclidean structure (ES).

2.3 Invariants

Invariants are properties of geometric configurations that remain unchanged under an appropriate class of transformations. The simplest projective invariant, the straight line, is used precisely as the basic feature in vision systems. The cross-ratio is another example of a projective invariant. Invariants have been used in model-based vision, as shape descriptors for 3D objects, etc. [36]. In our research, invariants are used to localize points and are applied to metrology problems.

2.3.1 Invariant Forms

The invariants of the planar projective group have been extensively studied [36] and many forms have been discovered. Here we give a brief survey of the theory of invariants.

Figure 2.3: Illustration of the cross-ratio (a) of points (b) of lines.

Cross-Ratio. The cross-ratio of four collinear points is defined by

Cr(P_1, P_2, P_3, P_4) = ((P_1 - P_2)(P_3 - P_4)) / ((P_1 - P_4)(P_3 - P_2))   (2.10)

Changing the order of the four points on the projective line, six distinct values of the cross-ratio can be identified among the permutations. If we define Cr(P_1, P_2, P_3, P_4) as λ, the six distinct values are λ, 1/λ, 1 - λ, 1/(1 - λ), (λ - 1)/λ, and λ/(λ - 1), respectively [36]. Since points and lines are dual, there exists an identical cross-ratio of lines. The notation for the lines and angles is shown in Figure 2.3(b). The cross-ratio is defined as

Cr(L_1, L_2, L_3, L_4) = (sin α_13 sin α_24) / (sin α_23 sin α_14)   (2.11)

It is also a projective invariant.
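The projective invariance of the cross-ratio (2.10) can be verified directly: map four collinear points through an arbitrary homography and compare the two values. A minimal sketch (the helper name and the particular points are ours); for the image points, the x-coordinate serves as an affine parameter along the line, since the image line here is not vertical and no point maps to infinity.

```python
import numpy as np

def cross_ratio(t1, t2, t3, t4):
    """Cross-ratio of four collinear points given by parameters t_i along the line,
    following eq. (2.10): ((t1-t2)(t3-t4)) / ((t1-t4)(t3-t2))."""
    return ((t1 - t2) * (t3 - t4)) / ((t1 - t4) * (t3 - t2))

# Four collinear points on the line y = x, in homogeneous coordinates.
ts = [0.0, 1.0, 2.0, 5.0]
pts = [np.array([t, t, 1.0]) for t in ts]

H = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0]])          # arbitrary non-singular homography

# Map the points and normalize; take x as the line parameter in the image.
imgs = [(H @ p) / (H @ p)[2] for p in pts]
xs = [q[0] for q in imgs]

cr_world = cross_ratio(*ts)
cr_image = cross_ratio(*xs)
assert abs(cr_world - cr_image) < 1e-9   # the cross-ratio is preserved
```

For these particular points both values equal -0.6, even though the individual distance ratios between the points are badly distorted by H.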
Two-Line-Two-Point. Four coplanar features, namely a pair of lines L_1, L_2 and a pair of points P_1, P_2, define a quantity that remains unchanged under projective transformations:

2L2P(L_1, L_2, P_1, P_2) = ((L_1 · P_1)(L_2 · P_2)) / ((L_2 · P_1)(L_1 · P_2))   (2.12)

Five-Point. Given a set of five points on a plane, there exist three invariants:

I_1(X_1, X_2, X_3, X_4, X_5) = (M_421 M_532) / (M_432 M_521)   (2.13)
I_2(X_1, X_2, X_3, X_4, X_5) = (M_421 M_531) / (M_431 M_521)   (2.14)
I_3(X_1, X_2, X_3, X_4, X_5) = (M_431 M_532) / (M_432 M_531)   (2.15)

where M_ijk denotes the determinant |X_i X_j X_k|. Since I_1 = I_2 I_3, only two of the three invariants are independent.

2.3.2 Related Work

The history of invariant theory dates back hundreds of years. The early works are reviewed in Parshall and Rice's book [39]. Planar invariants have been used in metrology, model matching, and recognition. Cross-ratio invariants have been widely used in metrology algorithms [3][11]. Lourakis et al. [30] proposed an approach to planar surface matching based on the two-line-two-point invariant. Parameswaran and Chellappa [37] applied the five-point determinant invariants to view-invariant human action recognition. Burns [4] indicated the non-existence of general 3D invariants. He also analyzed the variance of a four-point set under perspective projection. Weiss and Ray [49] proposed a model-based 3-D invariant in determinant form and applied it to object recognition in single images. Parameswaran and Chellappa [38] extended it to video for human action recognition. Zhu et al. [57] proposed a six-point invariant and applied it to object recognition. Rao et al. [40] discovered that the matrix formed by n points in multiple views has rank at most 3. This property is applied to action recognition from trajectories.

2.4 Camera Calibration

When represented in homogeneous coordinates, an image point and the corresponding world point are related by a 3×4 matrix, which is determined by eight parameters of the camera. Five of them are intrinsic to the camera itself, the so-called intrinsic parameters.
They are the scale factors in the x and y coordinate directions, α_x and α_y, the skewness s, and the coordinates of the principal point (u_0, v_0). The other three parameters indicate the position of the camera relative to the reference plane. In this work, the height h, the elevation angle θ, and the rotation angle φ relative to the reference plane are adopted. Calibration includes two parts: determining the intrinsic parameters, i.e., the calibration matrix

K = | α_x  s   u_0 |
    |  0  α_y  v_0 |
    |  0   0    1  |

and determining the extrinsic parameters.

2.4.1 Calibrating a Camera Using Planar Patterns

Zhang [55] presented an algorithm to calibrate a camera using the homography matrices among different views of a planar pattern. The algorithm contains the following steps:

1. Define v_ij = (h_i1 h_j1, h_i1 h_j2 + h_i2 h_j1, h_i2 h_j2, h_i1 h_j3 + h_i3 h_j1, h_i3 h_j2 + h_i2 h_j3, h_i3 h_j3)^T and solve for the 6×1 vector b from

|      v_ij^T      | b = 0   (2.16)
| (v_ii - v_jj)^T  |

2. Set B = K^{-T} K^{-1} with b = (B_11, B_12, B_22, B_13, B_23, B_33)^T.

3. Extract the intrinsic parameters from B:

v_0 = (B_12 B_13 - B_11 B_23) / (B_11 B_22 - B_12^2)
λ   = B_33 - (B_13^2 + v_0 (B_12 B_13 - B_11 B_23)) / B_11
α_x = √(λ / B_11)
α_y = √(λ B_11 / (B_11 B_22 - B_12^2))
s   = -B_12 α_x^2 α_y / λ
u_0 = s v_0 / α_y - B_13 α_x^2 / λ   (2.17)

The last step is effectively a Cholesky factorization. Zhang's algorithm requires at least three images of one planar pattern, which are not always available. Under the pin-hole camera assumption (i.e., no skewness, unit aspect ratio, and coincidence of the principal point with the image center), the only unknown intrinsic parameter is the focal length f. The homography matrix can be written as

H = | f 0 0 | (H_x  H_y  H_0 + λH_z)   (2.18)
    | 0 f 0 |
    | 0 0 1 |

Using the property H_x · H_y = 0, f can be calculated from H.

Figure 2.4: Patterns used for calibration. (a) Concentric circles [25]. (b) Coplanar circles [20]. (c) A circle and multiple lines passing through the center of the circle [51]. (d) A circle and two sets of parallel lines [9].
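Step 3 above, equation (2.17), translates directly into code. The sketch below (function name ours) checks the closed form on a synthetic B built from known intrinsics, for which the recovery should be exact up to floating-point error.

```python
import numpy as np

def intrinsics_from_B(B):
    """Recover (alpha_x, alpha_y, s, u0, v0) from B ~ K^{-T} K^{-1}
    using the closed form of eq. (2.17)."""
    B11, B12, B13 = B[0, 0], B[0, 1], B[0, 2]
    B22, B23, B33 = B[1, 1], B[1, 2], B[2, 2]
    v0 = (B12 * B13 - B11 * B23) / (B11 * B22 - B12 ** 2)
    lam = B33 - (B13 ** 2 + v0 * (B12 * B13 - B11 * B23)) / B11
    ax = np.sqrt(lam / B11)
    ay = np.sqrt(lam * B11 / (B11 * B22 - B12 ** 2))
    s = -B12 * ax ** 2 * ay / lam
    u0 = s * v0 / ay - B13 * ax ** 2 / lam
    return ax, ay, s, u0, v0

# Synthetic check: build B from a known K and recover the five parameters.
K = np.array([[800.0, 0.5, 320.0],
              [0.0, 780.0, 240.0],
              [0.0, 0.0, 1.0]])
B = np.linalg.inv(K).T @ np.linalg.inv(K)
ax, ay, s, u0, v0 = intrinsics_from_B(B)
assert np.allclose([ax, ay, s, u0, v0], [800.0, 780.0, 0.5, 320.0, 240.0])
```

In practice B is only known up to scale from (2.16); the formulas absorb that scale into λ, which is why λ is recovered rather than assumed.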
2.4.2 Related works

Subsequent to Zhang's work, interest in calibration using planar patterns has been reignited. Meng et al. [34] calibrated a camera using a circle and multiple lines passing through its center. Yang et al. [52] published a method to calibrate a camera using three co-centered circles. Calibration using concentric circles was first proposed by Fremont et al. [16]. The features of this pattern were further investigated by Kim et al. [26], Gurdjos et al. [18], and Abad et al. [1], respectively. Jiang and Quan [23] demonstrated a practical algorithm to detect concentric circles. In all of the above solutions, the key step is to localize the projected common center of the two concentric circles. The solutions can be grouped into two main approaches: (1) recursively detect the midpoints of chords using the cross-ratio; and (2) solve for the projected center of the concentric circles as a degenerate circle in the linear combination of the two concentric circles. Another widely used pattern is coplanar circles, first proposed by Wu et al. [51]. They discussed an approach to detect the vanishing line from two planar circles using different relations among the circles. Chen et al. [8] presented a complicated method to calibrate a pin-hole camera using multiple coplanar circles. Gurdjos et al. [20] later proposed a simpler solution using the concept of a family of circles. Calibration using spheres has also received some attention; Agrawal and Davis [2], Daucher et al. [13], Ying and Zha [53], and Zhang et al. [54] investigated this problem, respectively. Zhang [56] also proposed a calibration method using one-dimensional objects. Cao and Foroosh [5] extended this method using a trapezoid. Cao and Shah [6] calibrated a camera using two vertical lines and their shadows. Gurdjos et al. [19] used confocal conics for calibration.
2.5 Metrology

Metrology in images and videos has been studied as an interesting problem with many applications. It has two components: spatial localization and dimension determination. The spatial localization component estimates an object's position relative to the environment; for example, the ball's position with respect to the goal post can be used to determine whether a goal has been scored in a soccer game [41]. Since the object's location changes with time, multiple concurrent views are generally needed. The dimension determination component estimates the distance between two points on the same object [11]. It has received more attention because the results can generally be used to recognize or identify the object itself. As the two points are fixed relative to the object in the world system, evidence accumulated over multiple frames always improves the measurement results. Metrology requires less calibration information compared to 3D reconstruction problems. A common setup includes one or more parallel reference planes. In stationary surveillance scenarios, the reference plane generally refers to the ground. Minimal calibration [12], defined as the combination of the vanishing line of the reference plane and the vertical vanishing point, is assumed to be available. The objective is to estimate the ratio of lengths between two parallel line segments in the world system. If one of them has a standard known length, the length of the other can be calculated from it.

2.5.1 Categories

Based on the properties of the objects, metrology can be categorized into rigid-object metrology and non-rigid-object metrology. Common examples of rigid objects include paintings, buildings, and vehicles. It is relatively simple to measure rigid objects since the distance between two points on the object remains constant in the video.
Measuring a non-rigid object is not only a geometry problem, but is also related to other issues such as selecting points or frames relatively invariant to the length. For example, to accurately measure the height of a walking human in a video, recognizing the human pose is important. From the relationship of the line segment to the reference plane, there are three types of metrology one can consider: length, height, and others. Length metrology measures line segments lying on or parallel to the reference plane; height metrology measures line segments perpendicular to the reference plane, or the distance from a point to the reference plane. Some other geometric properties, such as size (which is weakly defined) and 2D radius [43], have been considered as well, but only for approximate normalization rather than for accurate results. The task of metrology arises in many contexts: single-view metrology, multiple-view (normally two-view) metrology, and video metrology. Single-view and multiple-view metrology, also known as image metrology, have been well studied [11]. Since only limited information can be derived from a single image or multiple images, many assumptions have to be made, for example, a parallel relationship between two lines. Image metrology also needs manual steps such as endpoint localization of line segments. Due to the ubiquitous presence of video cameras, video metrology has recently received a lot of attention. In contrast to still-image metrology, it has the potential to be fully automated, and it does not require the parallelism assumption. Video metrology can further be classified as single stationary video metrology, multiple video metrology, and moving video metrology.

2.5.2 Applications

Metrology has a wide spectrum of applications. It can be a preprocessing step for recognition. For example, the height of a walking human excludes a large number of candidates from a gallery in a recognition system.
It can also be used for classification. For example, different types of vehicles may have very different ratios between their height and length (sedan vs. SUV), or very different ratios between their total length and wheelbase (sedan vs. hatchback). In our experiments using a single video (Section 7.5.4), we further show that metrology can be used to determine the vehicle category (compact, mid-size, or full-size). Detecting abnormal events is an essential issue for a surveillance system, and abnormal object sizes may constitute such events. For example, a vehicle height larger than normal generally implies that something has been added to the top of the vehicle. In a driver assistance system (DAS), it is important to distinguish people from vehicles in real time, and the size of the object is a good feature for this task. Also, properly estimating the distance from one vehicle to other objects prevents collisions and accidents, which also belongs to the topic of metrology. Metrology has applications in sports too. In a soccer game, for example, whether a goal has been scored is not easy to determine when the view is blocked by a player's body at that moment. Estimating the ball's position in each frame can be done using metrology algorithms, and metrology can resolve any disputes that may arise. One can think of similar applications in other sports like tennis and cricket. Metrology also helps to solve other computer vision problems. For example, in a tracking system, the size/height of an object can be treated as a feature. When two objects overlap and then separate from each other, this feature can be used to maintain tracking.

2.5.3 Related Work

It was realized early on that the ratio of two line segments can be recovered when the camera's parameters (intrinsic and extrinsic) are known. To simplify the problem, several common assumptions are often made: unit aspect ratio, zero skew, and coincidence of the principal point with the image center.
Caprile and Torre, in their classical work [7], developed an algorithm to compute the focal length and projection matrix using the properties of vanishing points from a single view. Liebowitz and Zisserman [29] presented a two-step algorithm to rectify perspective images of planes. The first step estimates the vanishing line from two detected vanishing points and transforms the image from projective to affine. Using three constraints, they then transformed the image from affine to metric. Triggs [46] presented a systematic error analysis. Criminisi et al. [12] considered the estimation of height from an uncalibrated image using projective geometry and the cross-ratio, which has become popular in height metrology. In a subsequent work [11], they extended this work to multiple reference planes. Given two sets of perpendicular lines, Wang et al. [48] measured line segments on the reference plane without rectification. Moons et al. [35] presented an algorithm for recovering 3D affine structure from two perspective views taken by a camera undergoing pure translation. In the context of video metrology, Stauffer et al. [43] built a linear model to normalize object size in a video. Finally, Bose and Grimson [3] rectified the plane by tracking objects moving along a straight line at a constant speed.

Part II

IMAGE CALIBRATION USING A CIRCLE AND OTHER CONSTRAINTS

Chapter 3

Partial Rectification

3.1 Introduction

Camera calibration [15][22] has received much attention over the last decade. The increased popularity of zoom lenses and frequent adjustment of camera settings demand that camera calibration be performed quickly and conveniently. Due to the improved quality of imaging systems, unwanted nonlinear distortions (e.g., radial distortion) can be ignored, which enables the use of simple planar patterns for calibration. Triggs [46] and Zhang [55] have developed point-based self-calibration methods using a planar pattern seen from multiple views.
Accurate calibration using this technique requires the points to be spread over the image plane. A common mode of failure for point-based methods is when the point correspondences are poor or feature points are occluded. Recently, many rectification algorithms using planar patterns that include lines and circles have been reported [8][9][17][20][23][25][34][47][52]. The camera can then be directly calibrated from the rectification result. These algorithms claim additional robustness, as lines and conics can be accurately detected even when they are partially occluded. Each of the existing rectification algorithms deals with a special planar pattern. Thus, a disadvantage of these algorithms is that their utility is limited, i.e., whether any of them works in the real world depends on whether the corresponding pattern appears or not. In this chapter, we identify an intermediate set of partial rectification problems. These problems are common to most rectification/calibration algorithms. The basic concept is to treat the given pattern as a set of independent constraints, and then rectify the plane by combining the solutions for each constraint. This concept is applied to planar patterns that include a circle. We introduce the term "mapping center" as the solution to these partial rectification problems and set up the rectification system. Two types of examples will be illustrated in the next two chapters.

3.2 Partial Rectification

Metric rectification, also referred to simply as rectification, recovers the shapes of planar objects from their images. The recovered geometric properties include ratios of lengths, angles, etc. The term "rectification" has been extended to recovering a portion of the geometric properties. For example, affine rectification only recovers parallelism by mapping the vanishing line to the line at infinity. In other words, the line at infinity is fixed during the homography transformation between the original plane and the recovered image.
To distinguish the original and extended concepts of the term "rectification", we use the term "partial rectification" for the extended concept. Formally, a partial rectification is defined as a set of transformations, any of which recovers one or more geometric properties of the plane or of a given pattern. As an example, an affine transformation recovers parallelism in all directions, and so it is one kind of partial rectification. A weaker partial rectification may only recover parallelism in a particular direction; its corresponding set of transformation matrices maps one vanishing point to the point at infinity. More examples will be discussed in the next two chapters.

3.2.1 View from Group Theory

To clarify partial rectification and its solution properties, we study it from a group-theory viewpoint. Projective transformations form a group P, i.e., P ∘ P → P. The similarity transformations G_sim form a subgroup of the projective group, G_sim ⊂ P, which keeps all geometric properties invariant (except for absolute length). The set of transformations that keep any one (or more) geometric property invariant forms a subgroup G_inv ⊂ P. Taking affine as an example, the invariant is the line at infinity; thus, affine transformations form a subgroup G_aff ⊂ P: G_aff ∘ G_aff → G_aff. Another example is the set of transformations that keep a unit circle C = diag(1, 1, -1) fixed: G_cir = {H | H^{-T} C H^{-1} = C}. It is obvious that G_cir ∘ G_cir → G_cir, so it is a subgroup of P. Let h_0 be the homography matrix from the world plane to the image plane. The solution to a metric rectification problem is a matrix H satisfying H ∘ h_0 ∈ G_sim. We define a subset S ⊂ P satisfying S ∘ h_0 = G_sim. Obviously, H can be obtained as any element of S. Conversely, since G_sim has been well studied, S can also be easily obtained from H as S = G_sim ∘ H. Thus, we may redefine S as the solution to the metric rectification problem. The solution is actually a right coset of G_sim in P: S_{sim, h_0} = G_sim ∘ h_0^{-1}.
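The closure claim for G_cir above can be checked numerically: with C = diag(1, 1, -1), a rotation about the center and a "hyperbolic" projective map both leave the unit circle fixed, and so does their product. This is an illustrative sketch (the helper name and the particular group elements are ours, not from the dissertation).

```python
import numpy as np

C = np.diag([1.0, 1.0, -1.0])  # the unit circle x^2 + y^2 = w^2 as a conic matrix

def fixes_circle(H):
    """True if H leaves the unit circle fixed: H^{-T} C H^{-1} = C (up to scale)."""
    Ct = np.linalg.inv(H).T @ C @ np.linalg.inv(H)
    return np.allclose(Ct / Ct[0, 0], C)

th, t = 0.7, 0.4
R = np.array([[np.cos(th), -np.sin(th), 0.0],
              [np.sin(th),  np.cos(th), 0.0],
              [0.0, 0.0, 1.0]])              # Euclidean rotation about the center
B = np.array([[np.cosh(t), 0.0, np.sinh(t)],
              [0.0, 1.0, 0.0],
              [np.sinh(t), 0.0, np.cosh(t)]])  # non-affine projective map fixing the circle

assert fixes_circle(R) and fixes_circle(B)
assert fixes_circle(R @ B)     # closure: the product is again in G_cir
```

That B fixes the circle follows from cosh^2 t - sinh^2 t = 1, so B^T C B = C entry by entry; B is genuinely projective (its last row is not (0, 0, 1)), which shows that G_cir is strictly larger than the rotations.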
Similarly as described for metric rectiflcation, for any arbitrary partial rectifl- cation problem, a corresponding subgroup can be determined, such that the solution becomes a right coset of the subgroup: Gpartial: S ? h0 = Gpartial. The geometric invariant corresponding to the subgroup is nothing but the geometric property to be recovered in the partial rectiflcation problem. For example, a partial rectiflca- tion problem is to recover a circle from its projection, e.g. , an ellipse. We let the projected circle be transformed to the unit circle C. Thus the partial rectiflcation problem corresponds to the subgroup Gcir. The solution of this partial rectiflcation becomes Gcir ? h?10 . The subgroup is not uniquely deflned. In the above example, we may use any circle C0 to replace the unit circle C, and the solution is determined only up to a similarity transformation. We now highlight a nice property of partial rectiflcation. Let there be two partial rectiflcation problems with solutionS1 andS2 respectively. We haveS1?h0 = G1 and S2 ?h0 = G2. It is easy to prove that (S1TS2)?h0 = (G2TG2). Thus, the intersection of the solutions is also a solution for a new partial rectiflcation problem, which recovers both geometric properties of the two previous partial rectiflcation problems. Given multiple independent geometric properties in a pattern, no transfor- mations keep them all invariant except for similarity transformations. We design several partial rectiflcation problems, so that for each geometric property, it is re- covered by at least one of the problems. Thus, the intersection of all the solutions of these partial rectiflcation problems transforms the projected pattern to a pattern 32 P Gcir Gsim G1 G2 Figure 3.1: Illustration of the homography transformation groups deflned by partial rectiflcations. P: group of all the projective transformations. Gsim: subgroup cor- responding to similarity, which keeps all the geometric properties. 
G_cir: subgroup that keeps the unit circle C fixed. G₁, G₂: subgroups that keep another geometric property invariant besides keeping C fixed. G_sim = G₁ ∩ G₂.

keeping all the independent geometric properties, which means it is the solution to the metric rectification problem.

Although the concept of partial rectification suggests a general solution for rectification, two problems must still be solved before it can be applied in practice: first, how to obtain the solutions of the partial rectification problems; and second, how to calculate the intersection of two solution sets. In the rest of this chapter, we concentrate on the latter for the case when a circle is contained in the pattern. Examples of solutions to the first problem will be discussed in the next two chapters.

Figure 3.2: Different rectifications of the same conic when choosing different mapping centers

3.3 Partial Rectification Using a Circle

We now consider a specific category of partial rectification problems, those which recover a projected circle, i.e., transformations that convert a given ellipse c into a circle. We first investigate the number of additional geometric properties needed for metric rectification, which is answered through an analysis of the degrees of freedom (DoF). A general projective transformation has eight DoFs while a similarity has four (two for translation, one for rotation and one for scaling). The four additional DoFs require four independent constraints to remove the distortion of an image. Forcing a general conic ax² + bxy + cy² + dx + ey + f = 0 to be a circle naturally provides two conditions: a = c and b = 0. Thus, two additional constraints are needed. If each partial rectification provides one more constraint, metric rectification can be achieved from two partial rectification problems. The relationships among the partial rectification problems are shown in Figure 3.1.

3.3.1 Mapping Center

Lemma 1. Circle rectification.
Given an ellipse (projected circle) c and a point o inside c, there is a unique matrix H (up to a similarity) that transforms o to a point O and c to a circle C centered at O.

Proof. Existence: Without loss of generality, let o be the origin (0, 0, 1)ᵀ. We represent the ellipse c and the transformation matrix H as

c = | A  B  D |        H = | a  0  0 |
    | B  C  E |            | b  c  0 |
    | D  E  F |            | d  e  f |

It is easy to verify that

H = | √(A + d² − b²)      0           0    |
    | (B + de)/c       √(C + e²)      0    |   (3.1)
    |   −D/f            −E/f        √(−F)  |

(with f = √(−F), d = −D/f, e = −E/f, c = √(C + e²), b = (B + de)/c and a = √(A + d² − b²) computed in that order) satisfies the requirement. Transforming with this matrix, O = (0, 0, 1)ᵀ is the origin, and C is a unit circle centered at O.

Uniqueness: Assume that two matrices H₁ and H₂ both satisfy the requirement. The matrix H₂H₁⁻¹ maps a circle C₁ and its center O₁ to a circle C₂ and its center O₂ respectively. Because the line at infinity is the polar of a circle's center (l∞ = C₁O₁ and l∞ = C₂O₂), as mentioned in the last section, the line at infinity is fixed by the transformation H₂H₁⁻¹. Thus, H₁ and H₂ are related by an affine transformation; because a circle remains a circle, it is in fact a similarity transformation.

We call the point o a mapping center because it is mapped to the center of the circle in the rectified plane. For a valid solution, the mapping center is always inside the projected circle.

Figure 3.3: An example of partial rectification. The constraint is a pair of parallel lines l₁, l₂, and the solution is the image of the diameter st.

3.3.2 Solution Using Mapping Center

Given a conic (ellipse) c as a projected circle, the rectification matrix H is determined by the mapping center o. The rectification problem is therefore equivalent to finding the mapping center o. Because o can be any point inside c, it has two degrees of freedom. Rectification can thus be viewed as reducing the DoFs from two to zero by incorporating additional constraints. The partial rectification problem with a circle reduces the DoFs by one.
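The construction in Lemma 1 can be checked numerically. The sketch below (my own illustration, assuming the conic matrix is scaled so that F < 0, i.e., the origin lies inside the ellipse) builds H from equation (3.1) and verifies that it sends the conic to the unit circle and the origin to its center:

```python
import numpy as np

def lemma1_H(c):
    """Rectifying matrix of eq. (3.1): maps the origin (assumed inside the
    ellipse c) to the circle's center and c to the unit circle.
    Assumes the symmetric conic matrix c is scaled so that F = c[2,2] < 0."""
    A, B, D = c[0]
    _, Cc, E = c[1]
    F = c[2, 2]
    f = np.sqrt(-F)
    d, e = -D / f, -E / f
    cl = np.sqrt(Cc + e * e)
    b = (B + d * e) / cl
    a = np.sqrt(A + d * d - b * b)
    return np.array([[a, 0.0, 0.0], [b, cl, 0.0], [d, e, f]])

# an example ellipse containing the origin (F < 0)
c = np.array([[2.0, 0.3, 0.1],
              [0.3, 1.0, -0.2],
              [0.1, -0.2, -1.0]])
H = lemma1_H(c)
# H^{-T} c H^{-1} = diag(1,1,-1) is equivalent to H^T diag(1,1,-1) H = c
assert np.allclose(H.T @ np.diag([1.0, 1.0, -1.0]) @ H, c)
# the origin is mapped to the circle's center
assert np.allclose((H @ np.array([0.0, 0.0, 1.0]))[:2], 0.0)
```

The two assertions restate the lemma: the conic is carried to the unit circle, and the mapping center goes to its center.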
In general, because there is still one remaining DoF, the solution to a partial rectification problem is a curve (or several curves) on which the mapping center resides. To illustrate this concept, we provide the simple example shown in Figure 3.3. The constraint is the projection of a pair of parallel lines, and the corresponding partial rectification problem is to rectify a projected circle c while making the given lines l₁ and l₂ parallel. The solution is obtained as follows: v = l₁ × l₂ is the vanishing point. Let l = cv be the polar of v with respect to c; it passes through the mapping center. Denote the two intersections of c and l as s and t. Because the mapping center is inside c, the solution is st, the segment of l inside c.

Given a projected circle with multiple constraints, the plane can be fully rectified as follows: (1) obtain the solution curve of the partial rectification problem arising from each constraint; (2) calculate the intersection(s) of all the curves: the true mapping center satisfies all the constraints and lies at the intersection; (3) obtain the homography matrix using Lemma 1 and rectify the pattern.

3.3.3 Ambiguity and Solution with Multiple Constraints

In general, two curves may have more than one intersection, so the solution of two partial rectification problems may not be unique. Unless the planar image is cut by the corresponding vanishing line, none of the intersections can be rejected. Thus, ambiguity may arise when there are only two constraints; it can be removed with multiple (≥ 3) constraints. If there are more than three constraints, however, in real cases the solution curves may not pass through a common point due to noise. In this situation, we use the point with the minimum sum of distances to the curves as the mapping center.
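The construction of Figure 3.3 can be sketched as follows. The function names are illustrative; the line-conic intersection is solved by parametrizing the polar and solving a quadratic, and complex roots indicate that the line misses the conic.

```python
import numpy as np

def conic_line_intersections(c, l):
    """Intersect the homogeneous line l with the conic c by parametrizing
    points p(t) = p0 + t*d on l and solving the quadratic p(t)^T c p(t) = 0."""
    a, b, w = l
    p0 = np.array([-w / a, 0.0, 1.0]) if abs(a) > abs(b) else np.array([0.0, -w / b, 1.0])
    d = np.array([b, -a, 0.0])                    # direction of l
    qa, qb, qc = d @ c @ d, 2 * p0 @ c @ d, p0 @ c @ p0
    return [p0 + t * d for t in np.roots([qa, qb, qc])]

def chord_of_mapping_centers(c, l1, l2):
    """Figure 3.3 construction: the mapping center lies on the chord st,
    the part of the vanishing point's polar inside the projected circle c."""
    v = np.cross(l1, l2)                          # vanishing point of the two lines
    return conic_line_intersections(c, c @ v)     # endpoints s, t of the polar chord

# sanity check with the unit circle and the images of two vertical lines x=1, x=2
C = np.diag([1.0, 1.0, -1.0])
s, t = chord_of_mapping_centers(C, np.array([1.0, 0.0, -1.0]),
                                   np.array([1.0, 0.0, -2.0]))
# the chord is the horizontal diameter, endpoints (±1, 0)
assert np.allclose(sorted(np.real([s[0], t[0]])), [-1.0, 1.0])
```

For the already-rectified unit circle and two vertical lines, the vanishing point lies at infinity in the y direction, and its polar is the horizontal diameter, as the assertion confirms.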
A better approach is to assume that each constraint is independent of the others (which may not hold in most cases), analyze the perturbation of each solution due to noise, and estimate the point most likely to be the mapping center.

3.3.4 Solutions for Existing Partial Rectification Problems

3.3.4.1 A Line l Passing Through the Circle's Center

Meng et al. have discussed patterns with multiple lines passing through the center of a circle [34]. This setting can be considered a collection of partial rectification problems: each constraint is the image of a line passing through the center of the circle, and the solution is the corresponding diameter of the circle.

3.3.4.2 A Vanishing Point v

Denote the polar line of the vanishing point v as l = cv; l is the line passing through the mapping center. When the vanishing line l_v is given [9], the solution is its pole x = c⁻¹l_v. This is actually a metric rectification problem, and x is the mapping center.

3.3.4.3 A Coplanar Circle

The mapping center can be estimated using Gurdjos's method [18].

3.3.5 Performance under Noise

Generally speaking, the projected circle should be robust because it is obtained from multiple detected points. Thus, in the performance analysis, the circle is always assumed to be noise free, which works well except when the projected circle in the image plane is very small. Under this assumption, the rectification performance relies only on the accuracy of the detected mapping center; in the rectified image, it relies only on the distance from the detected mapping center X to the ground truth O, denoted d_x = ‖OX‖. Because the detected mapping center X lies on the calculated solution curve S, if S does not pass through O, X cannot coincide with O. Indeed, d_x cannot be smaller than the distance from O to S, denoted d_s. We use d_s to indicate the performance of the solution.
It should be noted that even two solution curves with small d_s do not guarantee an accurate mapping center. An inverse problem is to study how the points on the plane are affected by an inaccurate mapping center. As shown in Figure 3.4(a), the mapping center X (red circle) is perturbed by a small amount in the x direction from O (blue cross). The displacements ∆P of points on the plane are displayed as arrows. Decompose ∆P into two components, one in the radial direction, ∆P_r, and one in the angular direction, ∆P_a, shown in (b) and (c) respectively. For points close to the x-axis, the displacements are mainly in the radial direction; for points close to the y-axis, they are mainly in the angular direction.

Figure 3.4: Rectification performance when the mapping center is disturbed. (a) Arrow plot. Perturbation in the (b) angular direction and (c) radial direction.

Chapter 4
Constraints Related to Lengths

In planar calibration problems, a typical constraint comes from metrology, i.e., the ratio between two lengths. In this chapter we discuss the partial rectification problem in the presence of such constraints.

4.1 Point Distance Constraint

A circle naturally provides a length, its radius, which can be used as the reference length. Unlike other line segments, the center of the circle, which serves as one endpoint of the radius, is not marked in the image. The other lengths come from a point on the plane. A single point does not contain any dimensional information, but its relative position with respect to the circle can be used. Rather than merely recording whether the point is "inside" or "outside" the circle, we define the distance from the point to the circle as the distance from the point to the circle's center (measured relative to the radius), and use this metric information.
Using this distance as a constraint, we address the following partial rectification problem:

Problem 1. Given an ellipse c as the image of a circle under a projective transformation, a point p not on the ellipse, and a positive number d_P (d_P > 1 when p is outside c, d_P < 1 otherwise), partially rectify the plane such that c is transformed to a circle C and p to a point P with ‖OP‖ = d_P r, where O is the center of the circle C and r is its radius. ‖OP‖ is defined as the distance from the point P to the circle C.

Without loss of generality, we assume r = 1 to simplify the derivation.

4.1.1 An Arbitrary Mapping Center Solution

An arbitrary mapping center in the solution set of Problem 1 can be obtained as:

Solution 1. Arbitrarily draw a line from p intersecting c at two points u and v, as shown in Figure 4.1. Select the point x satisfying

x = [(d_P + 1)up + (d_P − 1)vp − 2d_P uv] / [2d_P p − (d_P − 1)u − (d_P + 1)v]   (4.1)

where u, v, p and x denote scalar coordinates of the points along the line. Using x as the mapping center, the corresponding transformation converts c to the unit circle C and p to a point P whose distance to C is d_P.

Proof. Let O, U, V and P be the images of x, u, v and p under the transformation, respectively. Since the points are collinear, the cross ratio does not change:

(‖PU‖/‖PV‖)·(‖OV‖/‖OU‖) = (‖pu‖/‖pv‖)·(‖xv‖/‖xu‖) = −(d_P + 1)/(d_P − 1)   (4.2)

(with signed lengths), which leads to ‖OP‖ = d_P. Thus, the distance is recovered.

Figure 4.1: Illustration of obtaining an arbitrary solution to Problem 1.

Figure 4.2: Illustration of the corresponding-subgroup solution to Problem 1, approach one.

4.1.2 Corresponding Subgroup (Approach One)

Without loss of generality, the partially rectified pattern can be viewed as follows: a unit circle C and a point P on the x-axis at distance d_P from the circle. The transformations of the corresponding subgroup keep this pattern fixed. The problem is restated as follows:

Problem 2.
Given the unit circle C = diag(1, 1, −1) and a point P = (d, 0, 1)ᵀ, find the valid mapping centers such that the corresponding transformations keep both C and P fixed.

A simpler version of Problem 2 is to find the mapping centers on the x-axis, denoted X = (x, 0, 1)ᵀ. Let the circle C intersect the x-axis at A = (1, 0, 1)ᵀ and B = (−1, 0, 1)ᵀ. After the transformation, A′ and B′ are still on the circle and on the x-axis. There are two cases to consider:

Case 1: A′ = A and B′ = B. Because P′ = P, three collinear points are fixed by the transformation, so the line is pointwise fixed. In particular X = X′ = O; this is the trivial transformation.

Case 2: A′ = B and B′ = A. The cross ratio of the four collinear points X, P, A and B is invariant, written as

(‖AP‖/‖BP‖)·(‖BX‖/‖AX‖) = (‖BP‖/‖AP‖)·(‖AO‖/‖BO‖)   (4.3)

or, with signed lengths,

[(d − 1)/(d + 1)] · [(x + 1)/(x − 1)] = [(d + 1)/(d − 1)] · [(0 − 1)/(0 + 1)]   (4.4)

Solving (4.4), we have

x = 2d/(d² + 1)   (4.5)

Because P and C are fixed, the polar line of P with respect to C is also fixed. Denote this polar as L_p, and let it intersect the x-axis at Q and the circle at M and N. After the transformation, Q′ is still on the x-axis, and M and N are still on the unit circle. Thus these three points are fixed, and the line L_p is pointwise fixed by the transformation. We then have the following observation:

Lemma 2. The point (1/d, y, 1)ᵀ is fixed by the transformation corresponding to the mapping center (2d/(d² + 1), 0, 1)ᵀ, where y can be any number.

Let us return to Problem 2 and consider a mapping center not on the x-axis. Let X = (x, y, 1)ᵀ and denote θ = tan⁻¹(y/x). Draw the perpendicular PR to OX, as shown in Figure 4.2; then ‖OR‖ = d cos θ. From Lemma 2, we have

‖OX‖ = 2d cos θ / (d² cos² θ + 1)   (4.6)

Figure 4.3: Configuration of the subgroup for Problem 1. M and N are the tangent points, obtained as the intersection points between the unit circle C and P's polar line L_p.
P, M and N are fixed points under the transformations in the subgroup.

This leads to

x = 2d cos² θ / (d² cos² θ + 1)   (4.7)
y = x tan θ   (4.8)

To keep P fixed, the mapping center must therefore satisfy (d² + 1)x² − 2dx + y² = 0. That is the solution curve of the mapping center.

4.1.3 Corresponding Subgroup (Approach Two)

As mentioned above, P is fixed by the transformation. M and N, the two intersections between the polar of P and the circle C, are also fixed. When P is outside C, as shown in Figure 4.3, denote ∠MON = 2α. The coordinates of the three points are M = (cos α, sin α, 1)ᵀ, P = (1/cos α, 0, 1)ᵀ, and N = (cos α, −sin α, 1)ᵀ. When P is inside C, the three points can be represented in the same form, except that α becomes a complex number. The following derivation holds in both cases.

Denote the transformation matrix as H. The condition that P is fixed becomes P ∼ HP, i.e., P is an eigenvector of H. For the same reason, M and N are also eigenvectors of H. The three points are not collinear, so the three eigenvectors are independent. Let e_m, e_p and e_n be the eigenvalues corresponding to M, P and N. H can be written as

H = VDV⁻¹   (4.9)

where

V = (M P N) = | cos α   1/cos α    cos α |
              | sin α      0      −sin α |   (4.10)
              |   1        1         1   |

D = diag(e_m, e_p, e_n)   (4.11)

Because the unit circle is transformed into itself, we have C = H⁻ᵀCH⁻¹. Using (4.9) and the fact that D is symmetric (Dᵀ = D), this condition becomes

C = (VDV⁻¹)⁻ᵀ C (VDV⁻¹)⁻¹ = V⁻ᵀ D⁻¹ Vᵀ C V D⁻¹ V⁻¹   (4.12)

which can be further simplified to

D⁻¹ (Vᵀ C V) D⁻¹ = Vᵀ C V   (4.13)

Let A = Vᵀ C V. Simple manipulations yield

A = |     0          0       −2 sin² α |
    |     0       tan² α         0     |   (4.14)
    | −2 sin² α      0           0     |

a symmetric antidiagonal matrix. Since D⁻¹AD⁻¹ = A, the eigenvalues must satisfy

e_p = ±1   (4.15)
e_m e_n = 1   (4.16)

Letting e = e_m/e_p, e₊ = e + 1/e, and e₋ = e − 1/e, the transformation matrix H can be written as

H ∝ | e₊ cos² α − 2      −e₋ cos α sin α    −(e₊ − 2) cos α |
    | e₋ cos α sin α     −e₊ sin² α         −e₋ sin α       |   (4.17)
    | (e₊ − 2) cos α     −e₋ sin α          −e₊ + 2 cos² α  |

The mapping center X = (x_t, y_t, 1)ᵀ is transformed to the center of the circle, i.e., the origin O = (0, 0, 1)ᵀ. Using O ∼ HX, the following two conditions are obtained:

(e₊ cos α − 2/cos α) x_t − e₋ sin α · y_t = e₊ − 2   (4.18)
e₋ cos α · x_t − e₊ sin α · y_t = e₋   (4.19)

Solving (4.19) for e and eliminating e from (4.18), the equation of the curve of X is obtained as

x_t (x_t (1 + cos² α) − 2 cos α) + y_t² cos² α = 0   (4.20)

Using d = 1/cos α, the final result can be written as

[(1 + d²)/d²] x² − (2/d) x + y²/d² = 0   (4.21)

Thus the same result is obtained from both approaches.

4.1.4 Geometric Properties of the Solution Curve

Denote the left-hand side of equation (4.21) as f(x, y), so that (4.21) reads f(x, y) = 0. This curve is clearly a conic, and more specifically an ellipse; denote it E. By symmetry, E is symmetric with respect to the x-axis. It is also easy to verify that E passes through the origin O, f(0, 0) = 0, because the origin itself is always a valid mapping center.

The intersections between the circle C and E can be obtained by solving x² + y² = 1 together with (4.21). When P is outside C, two double intersections are obtained, M (twice) and N (twice), i.e., E is tangent to C at M and N. This can be explained as follows. When P is outside C, draw a line intersecting C very close to M, and let the intersections be U and V. There must be a point X on the segment UV satisfying (4.1), i.e., using X as the mapping center, the corresponding transformation keeps P fixed. Thus points of E come arbitrarily close to M; in other words, E passes through M. At the same time, any mapping center that preserves the distance cannot lie outside C, so E is not outside C. Hence E is tangent to C at both M and N. When P is inside C, E is tangent to C at complex points; in the real image, E lies inside C.

Theorem 1.
Given a circle C and a point P, denote the polar of P as L_p = CP, and let M and N be the two intersection points between C and L_p (they may not be real points). All the mapping centers that keep C and P fixed form a curve E, an ellipse passing through the origin O and tangent (intersecting twice) to C at the two points M and N.

Figure 4.4: The curve E as the solution to Problem 2. (a) When P is outside C, E is a partial ellipse tangent to C; (b) otherwise, E is a whole ellipse entirely inside C.

Further analysis shows that, when P is inside C, E separates C into two regions, with the inner region of E containing P. For any point in the inner region used as the mapping center, the distance from the transformed P to O is shorter than d; for any point in the outer region, the transformed distance is greater than d. When P is outside C, E separates C into three regions. Using any point in the region inside E as the mapping center, the transformed distance is greater than d; for a point in the other two regions, the transformed distance is smaller than d.

4.1.5 General Solution and Validation

Using the results above, the solution curve e on the original image can be obtained as follows:

1. Find an arbitrary solution partially rectifying the pattern; denote the corresponding matrix as H.

2. On the new pattern, calculate the ellipse E passing through the center of the partially rectified circle and tangent to the circle at the points M and N defined in Theorem 1.

3. Transform the ellipse E back to the original image: e = H⁻¹E.

Since an ellipse is determined by five boundary points, an alternative approach is described below:

1. Arbitrarily find five mapping centers as special solutions.

2. Fit an ellipse e through the five mapping centers.

However, not every point on e is a valid mapping center.
When p is outside c, the point p and the mapping center are required to be on the same side of the vanishing line, which means that p and the mapping center x must be on different sides of the line l_p, where l_p = cp is the polar of p with respect to c. Thus, the valid mapping centers form only an arc (a part) of e.

4.1.6 Perturbation Analysis

The perturbation analysis is performed on the world plane. When the point P moves by a small amount to P′ = P + ∆P, we study the corresponding dislocation of the mapping center. Because a perturbation in the angular direction ∆P_a does not change the distance from P to O, it has no effect on d_s; only the radial component ∆P_r need be considered. A simple calculation yields d_s = ∆P_r / |d_P² − 1|. The sensitivity of the solution to the input, defined as d_s/∆P_r, is 1/|d_P² − 1|. It is easy to see that when d_P = 1, i.e., P on C, the sensitivity is infinite; a tiny change of P in the radial direction results in an unbounded displacement of the mapping center. Indeed, a point on the circle contains no information and should not be used as a constraint. The mapping center is sensitive to points close to the circle's boundary; thus, when d_P ≈ 1, the solution curve may perform poorly.

4.2 Metric rectification using two distance constraints

Problem 3. Given an ellipse c as the image of a circle, two points p₁, p₂ and two positive numbers d₁, d₂, rectify the plane such that c is transformed to a unit circle C and p₁, p₂ to points P₁, P₂ satisfying ‖OP₁‖ = d₁ and ‖OP₂‖ = d₂, where O is the center of the circle C.

The solution to this problem is obvious: solve the two partial rectification problems given by the independent distance constraints; the solution is then the intersection of the two solution curves.

4.2.1 Uniqueness analysis

In general, two ellipses may have zero to four intersection points. We now analyze the uniqueness of the solution.
For the time being, when the two ellipses e₁ and e₂ are tangent at a point, we count that point as two coincident intersections. There are three cases to study.

• Case 1: both p₁ and p₂ are inside c.
Choose any point on the line segment p₁p₂ (which is always inside c) as the mapping center to transform the pattern; p₁ and p₂ are transformed to opposite sides of the center of the circle. We discuss the problem in this configuration. Without loss of generality, assume p₁ lies on the positive half of the x-axis and p₂ on the negative half. The two solution curves e₁ and e₂ of the corresponding partial rectifications are ellipses symmetric about the x-axis. Their intersections are also symmetric about the x-axis, so the number of solutions is even: 0, 2, or 4. Because the centers of e₁ and e₂ lie on the positive and negative halves of the x-axis respectively, there are at most two intersections. Since at least one intersection exists (the ground truth), there are two valid solutions, symmetric about the x-axis in this configuration.

• Case 2: p₁ is inside c and p₂ is outside c.
Select a point on the line segment p₁p₂ and inside c, and transform the pattern using it as the mapping center. The configuration becomes similar to Case 1, so there are two solutions.

• Case 3: both points are outside c.
The relationship of the solution curves is studied in the rectified image, i.e., the two curves already intersect at the center of the circle. Each solution curve is a minor arc (less than 180 degrees) of an ellipse, so the number of intersections is at most two. Based on how the four points m₁, n₁, m₂ and n₂ are ordered on the rectified circle c, there are three subcases:

  – sequence m₁n₁m₂n₂: in this case, the line segment p₁p₂ intersects c. The number of intersections is even; since the two curves intersect at the origin, there are two intersections.
  – sequence m₁m₂n₂n₁: in this case, the line p₁p₂ intersects c on the extension of the segment. For the same reason, there are two valid mapping centers.

  – sequence m₁m₂n₁n₂: in this case, the line p₁p₂ does not intersect c. The number of intersections is odd, which leads to one.

We summarize the results as follows: when the line p₁p₂ intersects c, there are two valid solutions; otherwise, there is one solution.

4.2.2 Equal angles in two-solution cases

When there are two solutions, denote the mapping centers as s and t. Let P^s₁, P^s₂ be the two points after rectification using s, and P^t₁, P^t₂ the two points using t.

Figure 4.5: Illustration of the three subcases when both points P₁, P₂ are outside C. Red circles relate to P₁ and blue triangles to P₂.

The angles after the two rectifications satisfy ∠P^s₁OP^s₂ + ∠P^t₁OP^t₂ = 2π.

Proof. Since there are two mapping centers, as proved above, the line p₁p₂ intersects c. Select a point on p₁p₂ and inside c as the mapping center to transform the pattern; we prove the property in this transformed configuration. The whole image is symmetric about the line p₁p₂, so the two mapping centers s and t are also symmetric about p₁p₂. Thus the two rectified angles ∠P^s₁OP^s₂ and ∠P^t₁OP^t₂ are equal as unsigned angles; measured with a consistent orientation, the reflection reverses the sense of rotation, giving the stated sum of 2π.

4.2.3 Experiments

An image taken at the Google headquarters is processed. The signpost is rectified using the outer loop of the second "o" in "Google" and ten detected feature points, as shown in Figure 4.6(c). The distances from the feature points to the circle are measured from a standard Google logo image, displayed in Figure 4.6(a). One ellipse is calculated as the trace of the mapping center from each distance constraint. The mapping center is detected as the point with the minimum sum of distances to the ten ellipses. Figure 4.6(b) shows the outer loop of the "o" in blue, the ten ellipses traced from the constraints in red, and the detected mapping center as a blue cross. The rectified image is shown in Figure 4.6(d). The error in the estimated right angles of the signpost in the rectified image is within 3°.

Figure 4.6: Rectification of the signpost in front of Google's headquarters. (a) Google logo, feature points marked as red crosses. (b) Traces of the mapping center from each feature point. (c) Original image. (d) Rectified signpost.

4.3 Line distance constraint

As a dual problem, the partial rectification problem for a line distance constraint is stated as:

Problem 4. Given an ellipse c, a line l not tangent to the ellipse, and a positive number d_L (d_L ≠ 1), partially rectify the plane such that c is transformed to a unit circle C and l to a line L whose distance to the center of the circle is d_L: ‖OL‖ = d_L, where O is the center of the circle C. ‖OL‖ is defined as the distance from the line L to the circle C.

Denote by P the pole of L: P = C⁻¹L; then ‖OP‖ = 1/d_L. The distance constraint from a line to a circle thus becomes a distance constraint from a point to a circle. Since the polar-pole relation is invariant under projective transformations, the image of P can be obtained as p = c⁻¹l. Problem 4 has thus been successfully reduced to Problem 1.

4.4 Equal Distance Constraint

A common situation is that the distance from a given point to a circle is unknown, but two points are known to be at the same distance from the circle (due to symmetry or other reasons). The partial rectification problem for this case is stated as:

Problem 5. Given an ellipse c and two points p₁, p₂ that are both outside or both inside the ellipse, partially rectify the plane such that c is transformed to a unit circle C and p₁, p₂ to points P₁, P₂ whose distances to the center of the circle are equal, i.e., ‖OP₁‖ = ‖OP₂‖, where O is the center of the circle C.
This problem can be solved using the solution to Problem 1.

4.4.1 Corresponding subgroup approach

Let the pattern be already partially rectified, i.e., ‖OP₁‖ = ‖OP₂‖. Let L∠ be the angle bisector of ∠P₁OP₂. We claim that the part of L∠ inside the circle C is the set of valid mapping centers, which can be proved using symmetry:

• For every point X on L∠ and inside C used as the mapping center, the rectified image satisfies ‖OP₁‖ = ‖OP₂‖.

Proof. Since ‖OP₁‖ = ‖OP₂‖, the line L∠ passes through the circle's center O, and P₁ and P₂ are symmetric about L∠. Because X is on L∠, without loss of generality we may assume L∠ is fixed by the transformation. Thus P₁ and P₂ remain symmetric about L∠ after the transformation, so ‖OP₁‖ = ‖OP₂‖.

• Conversely, if a point X used as the mapping center satisfies the equal-distance constraint, it must be on the line L∠.

Figure 4.7: Illustration of the contradiction when the mapping center X is not on the symmetry axis. E₁, E₂ are the solution curves for ‖OP₁‖ = d and ‖OP₂‖ = d respectively.

Proof. Suppose there exists a mapping center X that satisfies the equal-distance constraint but does not lie on L∠ (as shown in Figure 4.7); we show that a contradiction arises. Denote the equal distance after the transformation using X as the mapping center by d′. Let Y be the point symmetric to X with respect to L∠. If Y is used as the mapping center, the two distances from the points to the circle will also be d′. Let the ellipse E₁ be the solution curve of the partial rectification problem ‖OP₁‖ = d′. It must pass through both X and Y, two points on different sides of L∠, so E₁ must intersect L∠; denote (one of) the intersection points as Z. By symmetry, E₂, the solution curve for ‖OP₂‖ = d′, also passes through Z. Thus there are three points X, Y and Z, each of which, used as the mapping center, transforms the pattern so that the two distances from P₁, P₂ to the circle are d′.
As discussed above, there are at most two such solutions, resulting in a contradiction. Thus, all points satisfying the equal-distance constraint lie on the line L∠.

4.4.2 General solution

A special solution of Problem 5 is easily obtained: arbitrarily select a positive number d₀ (d₀ > 1 if both points are outside c and d₀ < 1 otherwise), let d₁ = d₂ = d₀, and solve Problem 3. Because two points determine a line, an alternative way to solve Problem 5 is described below:

1. Arbitrarily select d₁ (d₁ > 1 when both p₁, p₂ are outside c and d₁ < 1 otherwise). Denote by x₁ the mapping center satisfying the constraints ‖OP₁‖ = d₁ and ‖OP₂‖ = d₁.

2. Arbitrarily select d₂ ≠ d₁ and obtain the mapping center x₂ in the same way.

3. The part of the line x₁x₂ inside c is the solution of the problem.

4.4.3 Invariant among the pattern

Given an equal-distance-constrained pattern, neither the distance d nor the angle φ = ∠P₁OP₂/2 is known. As discussed above, when d is known, the pattern can be rectified and φ has a unique solution; conversely, when φ is known, d can also be obtained. There is an invariant under the partial rectification transformations.

Let P₁ = (m, n, 1)ᵀ and P₂ = (m, −n, 1)ᵀ, and denote their midpoint M = (m, 0, 1)ᵀ. The solution curve for the equal-distance constraint lies on the x-axis. Assume that M and P₁ are mapped to M′ = (m′, 0, 1)ᵀ and P′₁ = (m′, n′, 1)ᵀ respectively by a mapping center X = (X, 0, 1)ᵀ. We compute the distance d′ = ‖OP′₁‖ = √(m′² + n′²). Denote the two intersection points of the line XP₁ and the unit conic C as A = (x_a, y_a, 1)ᵀ and B = (x_b, y_b, 1)ᵀ. Straightforward computation shows that y_a and y_b are the two roots of a quadratic equation:

((m − X)² + n²) y² + 2X(m − X) n y + n²(X² − 1) = 0   (4.22)

Using Vieta's formulas [14], it can be shown that y_a and y_b satisfy

y_a + y_b = 2X(X − m) n / ((m − X)² + n²)   (4.23)

y_a y_b = n²(X² − 1) / ((m − X)² + n²)   (4.24)

After the transformation, A′B′ becomes a diameter of the circle.
Using the invariance of the cross ratio of the four collinear points X, P₁, A and B under the transformation X → O′, P₁ → P′₁, A → A′, B → B′,

(‖XA‖/‖XB‖)·(‖P₁B‖/‖P₁A‖) = (‖O′A′‖/‖O′B′‖)·(‖P′₁B′‖/‖P′₁A′‖)   (4.25)

which, in the y-coordinates along the line, reads

[(0 − y_a)(n − y_b)] / [(0 − y_b)(n − y_a)] = [(d′ − 1)(0 + 1)] / [(d′ + 1)(0 − 1)]   (4.26)

From (4.26), d′ can be solved as

d′ = n(y_a − y_b) / (n(y_a + y_b) − 2 y_a y_b)   (4.27)

Using (4.23) and (4.24), d′² − 1 can be written as

d′² − 1 = n²((y_a + y_b)² − 4 y_a y_b) / (n(y_a + y_b) − 2 y_a y_b)² − 1   (4.28)
        = [(m − X)² − n²(X² − 1) − (mX − 1)²] / (mX − 1)²   (4.29)
        = (X² − 1)(1 − m² − n²) / (mX − 1)²   (4.30)

The distance from M′ to O is obtained by setting n = 0 in (4.30): with m′ = ‖OM′‖,

m′² − 1 = (X² − 1)(1 − m²) / (mX − 1)²   (4.31)

Since d′² = m′² + n′², n′ can be solved as

n′² = (d′² − 1) − (m′² − 1) = (1 − X²) n² / (mX − 1)²   (4.32)

This leads to the following equation:

m′² + [(1 − m²)/n²] n′² = 1   (4.33)

This is a conic centered at O and passing through (1, 0, 1)ᵀ. The invariant can be described as

c = 1/(d² sin² φ) − cot² φ   (4.34)

where c = (1 − m²)/n² can be obtained from any partially rectified image.

4.4.4 Perturbation Analysis

Assume that the equal distance d and the angle 2φ = ∠P₁OP₂ are known. Let the detected mapping center lie on the x-axis and fix P₁, so that the perturbation of the mapping center comes only from the noisy P₂. Furthermore, as stated above, noise in the angular direction does not affect the result, so the perturbation comes only from the radial component ∆P₂r. If X′ denotes the projection of X onto OP₂, we have d′ = ‖OX′‖ = d sin(2φ). As proved above, ∆P₂r = d′|d_P² − 1|. The solution curve of this problem is the bisector of ∠P₁OP₂ passing through X; thus d_s = d cos φ. Finally, the sensitivity can be expressed as 2 sin(φ)|d_P² − 1|.

The performance is affected not only by the distances of the points but also by the angle. It should be noted that although d_P and φ are related in a pattern, neither of them is known; thus, sensitivity studies can only be done after the pattern is rectified.

4.4.5 Detection of Center for Concentric Circles

Center detection from projected concentric circles has long been considered an important and interesting problem [18][23][26][25].
The accuracy of the detection directly affects the accuracy of pattern rectification and of the subsequent calibration. The problem can be described as follows:

Problem 6. Given two sets of points pi (i = 1, 2, ..., m) and qj (j = 1, 2, ..., n) on two projected concentric circles cp and cq respectively, detect the projected center of the concentric circles.

Using the solution of Problem 5, a natural solution to the fitting problem can be obtained as described below:

1. Fit the conic cp using pi (i = 1, 2, ..., m).

2. For any two points (qj1, qj2) ∈ {qj}, which should be at the same distance from cp, solve the partial rectification problem using the equal-distance constraint and obtain the solution line l1,2.

3. Calculate the point x with the minimum sum of distances to all the solution lines; this is the projected center.

This method can also be used for fitting projected concentric circles. By analyzing the degrees of freedom, it is easily seen that fitting two concentric circles requires eight points. Jiang et al. [24] proved that the task can be done using four points on each circle. Our algorithm considers the case when five points lie on one circle (denoted c') while the other three lie on the other circle (denoted c). We first fit c' using the five points on it. For each pair of points (p1, p2) on c, because they are equidistant from c', a line can be drawn as the solution for the equal-distance constraint. The intersection of two such lines is the common center of the concentric circles. This algorithm extends to multiple points.

Many existing concentric-circle detection algorithms [1][18][26] fit the circles independently, which may result in poor performance when part of either circle is occluded. Another set of solutions [23] detects the four intersections of a line with the two circles and then estimates the center using the cross ratio. We compare our method with a representative algorithm from the latter class.
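Step 3 above is a small linear problem: with each solution line written as a·x + b·y + c = 0 and normalized so that a^2 + b^2 = 1, the minimizing point is found from a 2 × 2 least-squares system. A minimal sketch (NumPy assumed; it minimizes squared rather than plain distances, and the example lines are made up):

```python
import numpy as np

def nearest_point_to_lines(lines):
    """Point minimizing the sum of squared distances to a set of 2-D lines.

    Each row of `lines` is (a, b, c) for a*x + b*y + c = 0; rows are
    re-normalized so |a*x + b*y + c| is a Euclidean distance."""
    lines = np.asarray(lines, float)
    scale = np.linalg.norm(lines[:, :2], axis=1)
    n, c = lines[:, :2] / scale[:, None], lines[:, 2] / scale
    # minimize sum_i (n_i . x + c_i)^2  ->  least squares N x = -c
    return np.linalg.lstsq(n, -c, rcond=None)[0]

# two solution lines through (1, 2): x = 1 and y = 2
print(nearest_point_to_lines([(1, 0, -1), (0, 1, -2)]))  # ~ (1, 2)
```

With noisy solution lines the returned point is the least-squares consensus of all pairwise equal-distance constraints.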
4.4.6 Experiments

We use synthetic data to detect the center of concentric circles and compare our results with Jiang's algorithm [23]. Two concentric circles with radii in the ratio 1:2 are drawn on a plane. The region between the circles is colored black, and the projected black ring is blurred using a Gaussian filter.

Jiang's method is applied first, and its detection results are shown in Figure 4.8 (a). To be fair, the same points on the conics are used in our method; the resulting lines are shown in Figure 4.8 (b). A zoomed-in plot in Figure 4.8 (c) shows that in a noisy environment Jiang's method does not always converge to the ground truth, and is worse than ours.

Figure 4.8: Comparison of center detection algorithms. (a) Jiang's method [23]. (b) Our method; dotted lines are the solutions for equal-distance constraints. (c) Comparison in a zoomed-in image. Black *: ground truth; blue *: our detection results; red *: Jiang's detection in each round, starting from the red one.

We ran both methods on 100 different images and the statistical results are presented in Table 4.1. It can be seen that the results using Jiang's method do not converge to the ground truth, and our result is generally better than theirs.

Table 4.1: Statistical performance comparison between Jiang's method (rounds 1-6) and our method, measured by the distance in pixels between the detected center of the concentric circles and the ground truth.

round      1     2     3     4     5     6    Ours
mean     7.18  1.72  1.56  1.51  1.71  2.07   1.50
median   7.06  1.60  1.43  1.32  1.45  1.74   1.16
std      0.26  1.16  0.98  0.98  1.23  1.43   1.19

Chapter 5
Constraints Related to Angles

Besides (relative) length, the angle between two lines is another important quantity in measurements. Zero and right angles commonly appear in the real world and have been used in various projective-geometry topics [22]. This chapter discusses the role of a known angle as a constraint, stated as:

Problem 7.
Given a conic (ellipse) c, two lines lp and lq, and an angle value 0 < θ < π/2, partially rectify the pattern such that c is converted to a circle C and the angle between the transformed lines Lp and Lq is θ.

5.1 Two Special Cases

To gain some intuitive appreciation of the solution, we consider two special cases. Denote the intersection of Lp and Lq, i.e., the vertex of the angle, as R.

5.1.1 Case 1: Two Edges Tangent to the Circle

When Lp and Lq are both tangent to C, denote the tangent points as M and N respectively. It is obvious that OR bisects ∠MRN, i.e., ∠MRO = θ/2 and ||OR|| = 1/sin(θ/2). This case reduces to a distance constraint, discussed in the last chapter.

5.1.2 Case 2: Vertex on the Circle

When R is on the circle C, let the two edges intersect C at M and N, and denote the intersection of the two tangent lines at M and N as S. It easily follows that ∠MRN = θ ⇒ ∠MON = 2θ ⇒ ∠MSN = π - 2θ. Thus the problem is reduced to Case 1. In particular, when θ = π/2, MN is a diameter of the circle, that is, O lies on MN; the solution is then MN.

5.2 Problem Reduction

Let the projected circle center be o, and let the poles of lp and lq with respect to c be p and q respectively. On the partially rectified image, we have ∠POQ = π - θ. Thus, the problem can be rewritten as

Problem 8. Given a conic (ellipse) c, two points p and q, and an angle value 0 < θ < π/2, partially rectify the pattern such that c is converted into a circle C and ∠POQ = π - θ, where O is the circle center and P and Q are transformed from p and q respectively.

5.3 Identifying the Subgroup

Without loss of generality, the rectified circle is assumed to be the unit circle, and P and Q are located on two lines passing through the origin, Lop: x sin(α) = y cos(α) and Loq: x sin(β) = y cos(β), respectively. Denote the homography matrix as

H = [ a  b  c ]
    [ d  e  f ]    (5.1)
    [ *  *  * ]

Assume P = (rp cos α, rp sin α, 1)^T and Q = (rq cos β, rq sin β, 1)^T. After transformation, P' is on Lop.
Mathematically,

HP = (a rp cos α + b rp sin α + c,  d rp cos α + e rp sin α + f,  *)^T ∼ P' = (rp' cos α, rp' sin α, 1)^T    (5.2)

which leads to

(a rp cos α + b rp sin α + c) / (d rp cos α + e rp sin α + f) = cos α / sin α    (5.3)

Similarly, HQ ∼ Q' leads to

(a rq cos β + b rq sin β + c) / (d rq cos β + e rq sin β + f) = cos β / sin β    (5.4)

Because the circle is fixed by the transformation, H^{-T} C H^{-1} ∼ C; since C^{-1} = C, this is equivalent to H C H^T ∼ C, or

[ a^2 + b^2 - c^2    ad + be - cf       * ]
[ ad + be - cf       d^2 + e^2 - f^2    * ]  ∼  diag(1, 1, -1)    (5.5)
[ *                  *                  * ]

Comparing corresponding elements, two more constraints are obtained:

a^2 + b^2 - c^2 = d^2 + e^2 - f^2    (5.6)
ad + be - cf = 0    (5.7)

Let the mapping center be T = (x, y, 1)^T, which is mapped to O. From HT = O:

a x + b y + c = 0    (5.8)
d x + e y + f = 0    (5.9)

There are six equations; eliminating a, b, c, d, e and f leads to a single equation relating x and y, which is the solution curve.

5.4 Solution for Right Angles

The solution for arbitrary θ is complex. In man-made environments right angles appear much more frequently, so this section discusses the solution when θ = π/2. In the rectified pattern, let P = (X, 0, 1)^T lie on the x-axis and Q = (0, Y, 1)^T on the y-axis. Equations (5.3) and (5.4) become

d X + f = 0    (5.10)
b Y + c = 0    (5.11)

Using these to cancel c and f, (5.6) and (5.7) become

a^2 + b^2 - b^2 Y^2 = d^2 + e^2 - d^2 X^2    (5.12)
a d + b e - b d X Y = 0    (5.13)

and x, y can be solved from (5.8) and (5.9) as

x = b(eY - dX) / (ae - bd),    y = d(aX - bY) / (ae - bd)    (5.14)

Finally,

x(x - X) + y(y - Y) = -(ad + be)(aX - bY)(eY - dX) / (ae - bd)^2
                    = -XY bd (aX - bY)(eY - dX) / (ae - bd)^2
                    = -XY xy    (5.15)

5.4.1 Geometric Properties of the Solution

The solution of this partial rectification problem is

f(x, y) = x^2 + XY xy + y^2 - Xx - Yy = 0    (5.16)

It represents a quadratic curve, which can be an ellipse, a parabola, or a hyperbola, depending on whether |XY| is smaller than, equal to, or greater than 2, respectively.
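The classification by |XY| follows from the discriminant B^2 - 4AC of the quadratic part of f, with A = C = 1 and B = XY. A one-line check (illustrative helper, not notation from the text):

```python
def solution_curve_type(X, Y):
    """Type of f(x, y) = x^2 + X*Y*x*y + y^2 - X*x - Y*y = 0 from the
    discriminant B^2 - 4*A*C of its quadratic part (A = C = 1, B = X*Y)."""
    disc = (X * Y)**2 - 4.0
    if disc < 0:
        return "ellipse"
    if disc == 0:
        return "parabola"
    return "hyperbola"

print(solution_curve_type(1.0, 1.0), solution_curve_type(3.0, 1.0))
# ellipse hyperbola
```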
The shapes of the curves under different conditions are illustrated in Figure 5.1. Specifically, when the line PQ is tangent to the circle C, i.e., 1/X^2 + 1/Y^2 = 1, f(x, y) = 0 degenerates into two lines: the line x/X + y/Y = 1 is the line PQ, and the line Xx + Yy = 0 passes through O and the other tangent points from P and Q, as shown in Figure 5.1 (k). This corresponds to the solution obtained for special case 2. The first special case is displayed in Figure 5.1 (d).

Figure 5.1: Solution curves in different cases. (a) X, Y < 1; (b) X = 1, Y < 1; (c) X > 1, Y < 1, XY < 2; (d) X = Y = 1; (e) X > 1, Y = 1, XY < 2; (f) X, Y > 1, XY < 2; (g) X > 1, Y < 1, XY = 2; (h) X = 2, Y = 1; (i) X, Y > 1, XY = 2; (j) XY > 2, 1/X^2 + 1/Y^2 < 1; (k) XY > 2, 1/X^2 + 1/Y^2 = 1; (l) 1/X^2 + 1/Y^2 > 1. In (a), P is inside C; in (b) and (d), P is on C; otherwise P is outside C. In (a)-(c), Q is inside C; in (d) and (e), Q is on C; otherwise Q is outside C.

The curve f(x, y) = 0 possesses interesting geometric properties. In particular, it passes through the following points:

- The origin O, as f(0, 0) = 0. This is because the pattern has already been partially rectified before the transformation.

- The two points P and Q, as f(X, 0) = f(0, Y) = 0. A geometric explanation when P is inside the circle C is as follows: the transformation corresponding to the mapping center P maps P to P' = O, which is on the x-axis. For any arbitrary Q', rotating the coordinate frame can always make it lie on the y-axis while keeping P' and C fixed. Thus the mapping center P satisfies the constraint and must be on the solution curve. An identical argument can be made for Q.

- The intersection points of the two lines and the circle.
The intersections of the curve f(x, y) = 0 and C must satisfy

x^2 + XY xy + y^2 - Xx - Yy = 0,    x^2 + y^2 = 1    (5.17)

Subtracting x^2 + y^2 = 1 from (5.16) gives (Xx - 1)(Yy - 1) = 0. Since Lp is Xx = 1 and Lq is Yy = 1, these intersections lie on Lp and Lq, i.e., they are the intersections of the lines and the circle.

Because the pole-polar relationship is preserved under projective transformations, we finally conclude:

Solution 2. The solution for Problem 7 when θ = π/2 is a quadratic curve passing through the four intersection points between the lines lp, lq and the given conic c (which may not have real coordinates) and the two poles p and q of lp and lq with respect to c.

In general, six distinct points are enough to determine a quadratic curve. When one of the poles, say P, is on the circle, there are only four distinct points; however, f(x, y) = 0 is then tangent to C at P, giving enough conditions to solve for the curve. When both poles are on the circle, the problem becomes special case 1.

5.4.2 Validity of the Solution

For a coplanar pattern, any image point t and the conic c lie on the same side of the vanishing line. For a line segment ab, if both endpoints are on that side of c, then obviously every point of the segment is as well; thus only the two endpoints a and b need to be considered. As discussed in the last chapter, the mapping center and each endpoint must lie on different sides of the corresponding polar line, which leads to three cases:

- The line segment is inside the circle. The mapping center can be anywhere inside the circle.

- a is outside c and b is inside c. The mapping center cannot be in the circular-segment region Ra formed by c and the polar la = ca.

- Both a and b are outside c. Two circular segments Ra and Rb are excluded. The intersection of la and lb is the pole of ab, denoted pab. There are three subcases:
  - Line ab does not intersect c: the two circular segments share a common region.
  - The line segment ab intersects c: the two circular segments are separate.
  - The extension of ab intersects c: this is the same as the case with b inside c.

Figure 5.2: Illustration of the validity of solutions for a planar pattern. Due to the appearance of ∠srt, the shaded area is not valid. The valid solution is constrained to lie on the solid red curve.

Figure 5.2 illustrates an example of the region of validity. Let ∠srt be the detected angle, where s, r, and t are the detected endpoints and corner, and let lr, ls, lt be the corresponding polars. Each provides an invalid arc region formed by itself and the conic. Thus, the mapping center is constrained to lie in the unshaded area.
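Solution 2 makes the curve computable linearly: stacking one row per known point determines the six conic coefficients up to scale. A small numerical check of the statement (NumPy; the unit-circle configuration X = 2, Y = 3 is our own test case, not one from the text):

```python
import numpy as np

def fit_conic(pts):
    """Conic A x^2 + B xy + C y^2 + D x + E y + F = 0 through given points,
    via the smallest right singular vector of the design matrix."""
    x, y = np.asarray(pts, float).T
    M = np.column_stack([x*x, x*y, y*y, x, y, np.ones_like(x)])
    return np.linalg.svd(M)[2][-1]

X, Y = 2.0, 3.0                                  # poles outside the unit circle
pts = [(X, 0.0), (0.0, Y),                       # the two poles
       (1/X,  np.sqrt(1 - 1/X**2)),              # l_p: x = 1/X meets the circle
       (1/X, -np.sqrt(1 - 1/X**2)),
       ( np.sqrt(1 - 1/Y**2), 1/Y),              # l_q: y = 1/Y meets the circle
       (-np.sqrt(1 - 1/Y**2), 1/Y)]
coef = fit_conic(pts)
coef = coef / coef[0]                            # scale so the x^2 coefficient is 1
print(np.round(coef, 6))                         # coefficients (1, XY, 1, -X, -Y, 0)
```

The recovered coefficients match f(x, y) = x^2 + XY xy + y^2 - Xx - Yy, as Solution 2 predicts.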
The two circular segments shares common region. { Line segment ab intersect c. The two circular segments are separated. { The extension of ab intersect c. It is same as b inside c. 73 r s t lr lslt Figure 5.2: Illustration for the validity of solutions for a planar pattern. Due to the appearance of \srt, the shaded area is not valid. The valid solution is constrained to lie on the solid red curve. Figure 5.2 illustrates an example of the region of validity. Let \srt be the detected angle, where s, r, and t are the detected endpoints and corners. Let lr, ls, lt be the corresponding polars. Each provides an invalid arc region formed by itself and the conic. Thus, the mapping center is constrained to lie in the unshaded area. 5.4.3 Synthetic Data Experiments We compare our calibration method with a point-based method discussed in [44] using synthetic images. A pinhole camera model (i.e. , with zero skewness, unit aspect ratio and coincidence of the principle point and the image center) is assumed. In the flrst experiment, we use a pattern including a concentric square and a circle. Assuming that the coordinates of detected points are integers, we obtain the edges and flt the lines and conic as shown in 5.3 (a). To analyze the performance improvement due to adding a circle, we flx the size of square as one and change the radius of the circle from 0:1 to 3:9. Homography matrices are obtained using three methods: method I is the point-based method, which maps the corners of the square to (?1;?1;1)>; method II estimates the 74 ?25?20?15?10?5 0 5 10 15 20 25 ?25 ?20 ?15 ?10 ?5 0 5 10 15 20 25 (a) (b) Figure 5.3: Illustration for synthetic data setup when focal length f = 50, camera height h = 5, elevation angle = 45? and rotation angle ` = 45?. (a) Detected integer points in gray pixels, and fltted circle and square in black curves. (b) Solution curvesobtainedusingorthogonallineconstraintsfromfourcornersshownindifierent colors. 
mapping center using the projected circle and four orthogonal-line constraints from the corners of the square; method III is similar to method II but uses two parallel-line constraints instead.

Figure 5.4 illustrates a typical result from these three methods when φ = 45° and θ = 30°, 45°, 60°. Because methods II and III exhibit very similar performance and can hardly be distinguished from each other, results from method III are not plotted. In the figure, method II is worse than method I when the radius r of the circle is small, due to the poor fitting of the projected circle. However, when r increases to a value comparable to the side length of the square, our method outperforms the point-based one, since intuitively more information is used. Increasing the circle size does not always improve the calibration performance, because when the circle is large enough, the algorithm is affected by the inaccuracy of the square.

Figure 5.4: Evaluation of methods I (straight line) and II (line with symbols) for estimating (a) focal length, (b) elevation angle, and (c) rotation angle, vs. the circle's radius. f = 50, φ = 45°, and h = 5. Red: θ = 30°; green: θ = 45°; blue: θ = 60°. Ground-truth values are plotted as black solid lines.

In another experiment, we use a checkerboard pattern and a large circle whose diameter equals the side length of the checkerboard, as shown in Figure 5.5 (a). Method I uses the 9 × 9 corners of the checkerboard for point correspondences. Method II uses the circle and four constraints from the boundary corners. Figure 5.5 (b) presents the focal-length estimates obtained from both methods vs. φ, when θ = 45°. Method II uses only four constraints but achieves performance similar to method I.
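Method I's homography, from the four square corners to (±1, ±1, 1)^T, is the standard direct linear transform (DLT). A minimal sketch (NumPy; the corner coordinates below are made up for illustration):

```python
import numpy as np

def homography_dlt(src, dst):
    """Direct linear transform: 3x3 H with dst ~ H @ (x, y, 1), from >= 4 pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u*x, u*y, u])
        rows.append([0, 0, 0, -x, -y, -1, v*x, v*y, v])
    return np.linalg.svd(np.asarray(rows, float))[2][-1].reshape(3, 3)

# detected square corners (hypothetical) mapped to (+-1, +-1), as in method I
src = [(10, 10), (20, 12), (19, 22), (9, 20)]
dst = [(-1, -1), (1, -1), (1, 1), (-1, 1)]
H = homography_dlt(src, dst)
p = H @ np.array([10.0, 10.0, 1.0])
print(p[:2] / p[2])  # ~ (-1, -1)
```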
5.4.4 Real Image Experiments

Images of size 512 × 768 are taken with a Pentax DSLR camera. Lines and conics are retrieved from the edges of the images.

A school traffic sign was used for the first image (see Figure 5.6 (a)). The head of the girl is used as a circle, and lines extracted from the boundary of the sign are used. It provides three constraints from the edges: (i) orthogonality between left and bottom, (ii) orthogonality between right and bottom, and (iii) parallelism between left and right. The corresponding solutions of the partial rectification problem are shown in red, green and gray in Figure 5.6 (b). The common intersection of the three curves rectifies the image well; however, using only two of them leads to an ambiguity (see Figure 5.6 (d)).

Figure 5.5: Evaluation of the two methods. (a) Image plane. Method I uses the points of a checkerboard, shown as red dots; method II uses a circle and four corners of the checkerboard as right angles, shown as solid blue curves. The solution curves are indicated as black dotted curves. (b) Estimated focal length vs. rotation angle φ when θ = 45°; ground truth: f = 50.

The second image contains parts of two books and an occluded compact disc on a checkerboard. The image is rectified using the CD boundary as a circle and a corner of each book as a right angle. The orthogonal lines appearing on the checkerboard are used only to evaluate the rectification results. The errors in the right angles are shown in Table 5.1; they are around 1°.

Table 5.1: Error in the right angles of the checkerboard appearing on the rectified image, in degrees (°). Each value corresponds to an angle in Figure 5.6.

 1.58   0.84   0.41  -0.09  -0.49  -1.31  -1.24
 2.10   1.36   0.93   0.43   0.03  -0.79  -0.72
   -      -    1.13   0.62   0.23    -    -0.52
   -      -      -    0.81   0.42    -      -

Figure 5.6: Experiments for planar Euclidean structure.
(a)-(d) School sign. (a) Detected circles and lines on the original image. (b) Zoomed solution curves, with the unzoomed version at the upper right. (c) Rectified image using o as the mapping center. (d) Rectified images using the ambiguous mapping centers 1-3. (e)-(g) Desk plane. (e) Detected circles and lines on the original image. (f) Solution curves. (g) Rectified image and the right angles used for evaluation.

Figure 5.7: Illustration of the invariant deduction.

5.5 Invariants of the Partially Rectified Pattern

5.5.1 Approach 1

We investigate the properties of the partially rectified pattern. Let the line PQ intersect C at two points M and N, and let OD be perpendicular to PQ at D, as shown in Figure 5.7. Using the properties of right triangles, we have ||OD|| = XY / sqrt(X^2 + Y^2), ||PD|| = X^2 / sqrt(X^2 + Y^2) and ||QD|| = Y^2 / sqrt(X^2 + Y^2). Thus, the cross ratio of M, N, P and Q can be written as

Cr(M, N; P, Q) = (DP - DM)(DQ - DN) / ((DP + DN)(DQ + DM))
              = (r^2 - sqrt(r^2 (X^2 + Y^2) - X^2 Y^2)) / (r^2 + sqrt(r^2 (X^2 + Y^2) - X^2 Y^2))    (5.18)

where r is the radius of the circle. Since the cross ratio is invariant to projective transformation, I(P, Q) = (1 - (X/r)^2)(1 - (Y/r)^2) is a constant. In other words, in a partially rectified pattern where r = 1, MP · NP · MQ · NQ is a constant.

5.5.2 Approach 2

P and Q are constrained to lie on the x-axis and y-axis respectively, but their distances from the center, X and Y, are not fixed. Since Y changes with X, the relationship between X and Y is to be recovered; we will show that there is an invariant relating them. Let P map to P' = (X', 0, 1)^T and Q to Q' = (0, Y', 1)^T using a mapping center X = (m, n, 1)^T.
From (4.30), we have

X'^2 - 1 = (X^2 - 1)(1 - m^2 - n^2) / (mX - 1)^2    (5.19)
Y'^2 - 1 = (Y^2 - 1)(1 - m^2 - n^2) / (nY - 1)^2    (5.20)

Since X is on the solution curve, m and n satisfy m^2 + XY mn + n^2 - Xm - Yn = 0, leading to

(mX - 1)(nY - 1) = 1 + mn XY - mX - nY = 1 - m^2 - n^2    (5.21)

We therefore have

(X'^2 - 1)(Y'^2 - 1) = (X^2 - 1)(Y^2 - 1)    (5.22)

Thus, there is an invariant c:

c = (X^2 - 1)(Y^2 - 1)    (5.23)

Note that sqrt(X^2 - 1) is the length of the tangent line from P to the circle C, so the product of the two tangent-line lengths is invariant. Both approaches reach similar results; the first is more concise but limited to points inside the circle.

5.5.3 Application

If a pattern contains a circle and multiple pairs of orthogonal lines, each pair corresponds to a unique invariant value. Thus the pairwise correspondence between two views can be obtained by comparing the corresponding invariant values, which can be used for image registration. In fact, even if the two lines are not orthogonal, an invariant can still be calculated. Given multiple lines, the invariant of each pair of lines can be computed; two images can then be registered by matching the intersections with the same or similar invariants.

5.5.4 Experiments

The third experiment shows a multiple-view image registration example using the geometric invariant reported in Section 5.5. Figure 5.8 presents two views of a CD on a checkerboard. In each image, the 9 × 9 grid of the checkerboard is detected and the 81 intersections of orthogonal lines are to be matched. We use I2 = sign(I) / sqrt(|I|) as the indicator because it approximately reflects the distance from the line to the circle. We calculate the distance of each pair between the two views and rank them. A ranked image is shown in Figure 5.8 (b), where the diagonal is very bright, showing that correct matches always receive high ranks. The Cumulative Match Characteristics (CMCs) of the experiment at ranks 1, 2, 3, 5 and 8 are 20, 41, 52, 73 and 79 respectively.
Thus 97.5% of the corners can be matched within rank 8, i.e., within 10% of the total of 81 corners.

Figure 5.8: Experiments for image registration. (a) The two images. (b) Match value for each pair of corners.

Table 5.2: Cumulative Match Characteristics (CMCs) of the matching of corners between two checkerboard images.

Rank     1    2    3    4    5    8
        20   41   52   64   73   79

Part III
VIDEO CALIBRATION FOR OBJECTS IN PLANAR MOTION

Chapter 6
Video Camera Calibration Using Objects in Planar Motion

6.1 Introduction

Due to the prevalence of video applications in computer vision, video camera calibration has become increasingly important. Instead of deriving intrinsic parameters, i.e., the characteristics of the camera itself, video camera calibration focuses more on extrinsic parameters, i.e., the relation of the video sensor to the environment. We study planar motion, which is commonly observed in video sequences. It occurs in two scenarios: (1) a stationary camera acquiring images of objects moving on the ground (such as in surveillance applications), or (2) a camera itself undergoing planar motion (such as a camera mounted on a vehicle in a driver-assistance system).

Theoretically, all planar calibration methods can be used to calibrate a video camera undergoing planar motion. However, such approaches have many disadvantages. The point-based self-calibration method proposed by Zhang [55] requires reliable point detection, such as SIFT [31] or Harris corners [21], and point tracking [42]; furthermore, the points should be well distributed over the image plane to obtain an accurate result. Circle-based calibration methods [8][9][17][20][23][25][34][47][52] rely on detecting ellipses. Both point- and circle-based algorithms often suffer from failures of low-level feature detection, tracking or matching, especially in low-quality videos. Another approach is through 3-D reconstruction, e.g., planar factorization [27]; however, it is also susceptible to errors in feature tracking.
In this chapter, we study the (minimal) video calibration problem. We pursue a calibration method using a minimum number of reliably detected and tracked ground points, and discuss blob calibration when no point can be tracked.

6.2 Calibration Using Line Segments

Let a line segment MN move on the reference plane, generating images mt nt for frames t = 1, 2, ..., T. Using the constraint that the length of the segment ||Mt Nt|| is constant, the vanishing line is estimated from the given segments mt nt.

6.2.1 Algorithm for the Pinhole Camera Model

We first use a simplified camera model by assuming unit aspect ratio, zero skew, and coincidence of the principal point with the image center. Under this condition, the vanishing line is determined by one intrinsic parameter, the focal length f, and two extrinsic parameters, the elevation angle θ and the rotation angle φ. The transform matrix from the reference plane to the image plane can be written as

H = [  f cos φ    f sin φ cos θ    0 ]
    [ -f sin φ    f cos φ cos θ    0 ]    (6.1)
    [  0          -sin θ           h ]

where h is the height of the camera and does not affect the position of the vanishing line. The vanishing line can be calculated as

L = (sin φ sin θ,  cos φ sin θ,  f cos θ)^T    (6.2)

The inverse homography matrix can be written as H^{-1} = SAP, where S, A and P are similarity, affine, and perspective transform matrices respectively, structured as [29]

A = [ 1/β   -α/β   0 ]        P = [ 1      0      0    ]
    [ 0      1     0 ]            [ 0      1      0    ]    (6.3)
    [ 0      0     1 ]            [ L(1)   L(2)   L(3) ]

Simple matrix computation shows that

APH ∼ [ (cos φ + α sin φ)/β   (sin φ cos θ - α cos φ cos θ)/β   0       ]
      [ -sin φ                 cos φ cos θ                       0       ]    (6.4)
      [ 0                      0                                 h cos θ ]

By comparison, α and β can be represented as

α = -cos φ sin φ sin^2 θ / (cos^2 θ cos^2 φ + sin^2 φ),    β = cos θ / (cos^2 θ cos^2 φ + sin^2 φ)    (6.5)

If the vanishing line L is known, φ can be obtained from tan φ = L(1)/L(2). Applying P to the endpoints of the line segments rectifies them up to an affinity; denote the results as mt' nt'.
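Equation (6.2) can be sanity-checked numerically: the images under H of points at infinity of the reference plane must lie on L, and tan φ = L(1)/L(2). A short check (NumPy; the parameter values are arbitrary):

```python
import numpy as np

f, theta, phi, h = 1000.0, np.radians(60.0), np.radians(30.0), 5.0
H = np.array([[ f * np.cos(phi), f * np.sin(phi) * np.cos(theta), 0.0],
              [-f * np.sin(phi), f * np.cos(phi) * np.cos(theta), 0.0],
              [ 0.0,             -np.sin(theta),                  h  ]])   # eq. (6.1)
L = np.array([np.sin(phi) * np.sin(theta),
              np.cos(phi) * np.sin(theta),
              f * np.cos(theta)])                                          # eq. (6.2)

# images of points at infinity of the reference plane must lie on L
for tau in (0.0, 1.0, 2.5):
    p_inf = np.array([np.cos(tau), np.sin(tau), 0.0])
    print(abs(L @ (H @ p_inf)))                   # all ~ 0
print(np.isclose(np.tan(phi), L[0] / L[1]))       # True
```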
For any two frames t1 and t2, using the fact that the lengths of the segments are equal, Liebowitz et al. [29] proved the following constraint on α and β: in the 2-D complex space with α and β as the real and imaginary axes respectively, the point (α, β) lies on a circle with center (cα, 0) and radius r, where

cα = (Δxt1 Δyt1 - Δxt2 Δyt2) / (Δyt1^2 - Δyt2^2),    r = | (Δxt2 Δyt1 - Δxt1 Δyt2) / (Δyt1^2 - Δyt2^2) |    (6.6)

with Δxi = xp,i - xq,i and Δyi = yp,i - yq,i (i = t1, t2). Using (6.5), x = cos^2 θ satisfies

x^2 cos^2 φ ((cα cos φ - sin φ)^2 - r^2 cos^2 φ)
+ x (2 cos φ sin φ (cα cos φ - sin φ)(cα sin φ + cos φ) + 1 - 2 r^2 cos^2 φ sin^2 φ)
+ sin^2 φ ((cα sin φ + cos φ)^2 - r^2 sin^2 φ) = 0    (6.7)

Factoring the left side gives

(x cos^2 φ + sin^2 φ) ( x ((cα cos φ - sin φ)^2 - r^2 cos^2 φ) + (cα sin φ + cos φ)^2 - r^2 sin^2 φ ) = 0    (6.8)

Because cos^2 θ ≥ 0, the factor x cos^2 φ + sin^2 φ is always positive. Finally, we solve for cos^2 θ as

cos^2 θ = - ((cα sin φ + cos φ)^2 - r^2 sin^2 φ) / ((cα cos φ - sin φ)^2 - r^2 cos^2 φ)    (6.9)

From (6.9), we conclude that θ can be solved from a pair of nonparallel equal-length segments, given the vanishing line. Any pair of line segments appearing in the video can be used to estimate θ; in the noise-free case the values from different pairs should agree if the position of the vanishing line is correct, and otherwise these values will be far from the ground truth and different from one another. Using the variance of the θ values obtained from multiple pairs of line segments, we arrive at the coarse estimation algorithm summarized below.

1. Select line-segment pairs that are unlikely to be parallel in the world coordinate system.

2. For each candidate vanishing line, estimate cos^2 θ from each pair and calculate the variance var(cos^2 θ).

3. Choose the vanishing line with the smallest var(cos^2 θ).
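Equations (6.5) and (6.9) can be checked against each other: for any circle with center (cα, 0) passing through the true (α, β), formula (6.9) must return the true cos^2 θ. A small consistency check (NumPy; all numeric values are arbitrary):

```python
import numpy as np

def cos2theta(c_a, r, phi):
    """Eq. (6.9): cos^2(theta) from the constraint circle (c_a, r) and the
    rotation angle phi of the candidate vanishing line."""
    u = c_a * np.cos(phi) - np.sin(phi)
    v = c_a * np.sin(phi) + np.cos(phi)
    return -(v**2 - (r * np.sin(phi))**2) / (u**2 - (r * np.cos(phi))**2)

# pick a ground-truth (theta, phi), form (alpha, beta) by eq. (6.5), choose an
# arbitrary circle with center (c_a, 0) through (alpha, beta), recover theta
theta, phi = np.radians(55.0), np.radians(35.0)
K = np.cos(theta)**2 * np.cos(phi)**2 + np.sin(phi)**2
alpha = -np.cos(phi) * np.sin(phi) * np.sin(theta)**2 / K   # eq. (6.5)
beta = np.cos(theta) / K
c_a = 0.2
r = np.hypot(alpha - c_a, beta)
print(cos2theta(c_a, r, phi), np.cos(theta)**2)  # the two values agree
```

In the full algorithm this recovery is run for every segment pair, and the candidate vanishing line minimizing the variance of the recovered values is kept.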
6.2.2 Algorithm for a General Camera Model

As proved above, the vanishing line is the polar line of the image of the circle center with respect to the image of the circle. Thus, the estimate of the vanishing line can be refined from a starting point L0 using the following steps:

1. Let the iteration number p = 0.

2. Using Lp, move each segment mi ni parallel to itself (parallelism defined via Lp) so that one endpoint coincides with a common point o; the other endpoint becomes ki.

3. Fit an ellipse E through the points ki.

4. Estimate the vanishing line, denoted Lp+1.

5. Repeat steps 2-4 until the position of the vanishing line converges.

The first algorithm requires a pinhole camera, while the second removes this constraint but needs a starting point. The two can be combined by applying the first algorithm (under the pinhole assumption) to obtain a good initial estimate and then using the second algorithm to refine the result.

6.2.3 Experiment

To evaluate this method, an experiment using synthetic data is designed. Several line segments of unit length are randomly distributed over a 200 × 200 region on the reference plane. A pinhole camera with f = 1000 is used, located at height 250 with elevation angle 75°. Figure 6.1 (a) shows a sample image of the line segments and the vanishing line. Gaussian noise is added to all the endpoints. The vanishing line is represented using two parameters: d, the distance from the origin to the vanishing line, and φ, the direction of the vanishing line. Figure 6.1 (b) displays the logarithm of the variance of cos^2 θ over the vanishing-line parameter space. The variance reaches its minimum when the vanishing line is close to the ground truth.

We then study how many line segments are needed to obtain a stable estimate of the vanishing line. Four groups of experiments, with ten runs in each, are designed, using 10, 20, 40 and 80 line segments respectively. The estimation results, plotted as (d cos φ, d sin φ), are shown in Figure 6.1 (c), and the corresponding values of θ in Figure 6.1 (d). Generally, robust results can be obtained with roughly 20 line segments.
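Step 4 of the refinement algorithm in Section 6.2.2 relies on the pole-polar relation: the vanishing line is the polar of the projected circle center with respect to the fitted ellipse, which in matrix form is the single product l = C p. An illustrative sketch with the unit circle (not code from the text):

```python
import numpy as np

def polar_line(conic, point):
    """Polar line of a homogeneous point with respect to a 3x3 symmetric conic."""
    return conic @ point

C = np.diag([1.0, 1.0, -1.0])                 # the unit circle
l = polar_line(C, np.array([2.0, 0.0, 1.0]))  # pole at (2, 0)
print(l)  # [2, 0, -1], i.e. the chord-of-contact line x = 1/2
```

In the algorithm, `conic` would be the matrix of the ellipse fitted to the points ki and `point` the image of the common point o.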
6.3 Calibration Using an Ellipse Let et (t = 1;2;:::;T) be the images of an ellipse that is moving on the ground. We present a method to rectify the plane by transforming all et to same size and shape. The rectiflcation method contains two steps: (1) a?ne rectiflcation by equal- izing the ellipses area, and (2) metric rectiflcation by equalizing the ellipses shape. The rectiflcation result directly leads to calibration. 90 ?400?2000 200400 ?400 ?200 0 200 200400600800?100 10?4?3 ?2?1 01 d ? log(var? ) (a) (b) 3035 3035 3035 3035 350 360 370 380 10 frames20 frames40 frames80 frames 10 frames20 frames40 frames80 frames 69.5 70 70.5 71 (c) (d) Figure 6.1: Experiments using synthetic data for vanishing line estimation: (a) A sample of the images of wheelbases and the vanishing line (b) log mesh plot of the variance of calculated . Very high values are trunked to include the minimum value. The minimum value is marked as ?, which is close to the ground truth in cross mark (?). (c) Feet of perpendicular to the vanishing lines using 10, 20, 40 and 80 frames, ten times respectively. The cross mark (?) is the ground truth value. (d) The estimation of using difierent number of frames. The ground truth value is 70?. 91 6.3.1 Area of an Ellipse To illustrate this approach, a lemma is flrst introduced: Lemma 3. The area of an ellipse E equals ? divided by the square root of the determinant of its 2?2 shape matrix A(E): A(E) = ?=pkEk Proof. Let m and n be the semi-major and semi-minor axes respectively, the area of the ellipse can then be written as [14] A(E) = ?mn. The two eigenvalues of E are em = 1=m2 and en = 1=n2. Thus we have A(E) = ?=p(emen) = ?=pkEk. Lemma 3 directly leads to the following Corollary: Corollary 1. The area ratio between two ellipses is invariant to a?ne transforma- tion. Proof. 
Disregarding translation, an affine transformation can be written as a 2 × 2 matrix A:

E' = A^{-T} E A^{-1}    (6.10)

The area of the transformed ellipse becomes

A(E') = π / sqrt(|A^{-T} E A^{-1}|) = π / (sqrt(|A^{-T}|) sqrt(|E|) sqrt(|A^{-1}|)) = |A| A(E)    (6.11)

Translation does not change the area of an ellipse at all. Thus, the area ratio between two ellipses does not change under an affine transformation.

6.3.2 Affine Rectification: From Projective to Affine

Based on Corollary 1, removal of the projective distortion can be achieved by equalizing the areas of the projected ellipses. Note that any transformation matrix H can be decomposed into a product of two matrices, H = PA [29], where A is an affine matrix and P is related to the vanishing line l∞ = (l1, l2, l3)^T:

P = [ 1    0    0  ]
    [ 0    1    0  ]    (6.12)
    [ l1   l2   l3 ]

Since an affine transformation does not change the area ratio, without loss of generality the projective part is simplified to P, which is uniquely determined by the vanishing line.

6.3.2.1 Removing Translation and Skew

To this end, we investigate the area of an ellipse under a given projective transformation. Without loss of generality, we assume for simplicity of illustration that the vanishing line is parallel to the x-axis: l∞ = (0, 1, -l)^T. First, notice the following:

Lemma 4. If e' is an ellipse affinely transformed from e by translating and skewing parallel to the vanishing line (as shown in Figure 6.2), then A(E') = A(E).

Figure 6.2: Given the vanishing line, an upright ellipse e' can be used for area-ratio comparison instead of e. st and pq are the axes of e parallel and orthogonal to the vanishing line; dp, dq and dst are the distances from p, q and s (t) to the vanishing line l∞ respectively.

Proof. Consider an affine matrix A(s, t) translating and skewing the ellipse along
the x-direction, of the form

    A(s, t) = [ 1  s  t
                0  1  0
                0  0  1 ]    (6.13)

From

    [ 1 0 0 ]   [ 1 s t ]   [ 1 s t  ]   [ 1 s+t/l −t/l ]   [ 1 0 0 ]
    [ 0 1 0 ] · [ 0 1 0 ] = [ 0 1 0  ] = [ 0 1     0    ] · [ 0 1 0 ]    (6.14)
    [ 0 1 −l]   [ 0 0 1 ]   [ 0 1 −l ]   [ 0 0     1    ]   [ 0 1 −l]

we have P A(s, t) = A(s + t/l, −t/l) P. In other words, if e′ is obtained from e by the unit-determinant affine map A(s, t), then E′ is obtained from E by the unit-determinant affine map A(s + t/l, −t/l). Hence A(e′) = A(e) and A(E′) = A(E): the area ratio computed from e′ is the same as that computed from e. ∎

Below, we therefore assume that the ellipse is upright, i.e., its axes are orthogonal and parallel to the vanishing line.

Figure 6.3: Calculation of the transformed parallel axis.

6.3.2.2 Transformation of Axes

Lemma 5. Let PQ and ST be two line segments orthogonal and parallel to the vanishing line, respectively: pq ⊥ l∞, st ∥ l∞. After transformation P, their lengths become

    PQ = (l / (d_p d_q)) · pq,    ST = (1 / d_st) · st    (6.15)

where the distances d are defined as in Figure 6.2.

The proof follows directly from the fact that P(x, y, 1)ᵀ ∼ (x/(y − l), y/(y − l), 1)ᵀ.

Below we calculate the two axes of an ellipse E after the transformation; the geometric notation is illustrated in Figure 6.3. Assume that the shape of e is e = diag(r²s², r²/s²), where r and s are two parameters relating to the area and the shape of the ellipse, respectively, and let d be the distance from the center of e to the vanishing line.

The axis of E orthogonal to the vanishing line, denoted PQ ⊥ L∞, is transformed from pq, the corresponding axis of e. Using d_p = d − rs, d_q = d + rs, and pq = rs, we have

    PQ = l·rs / ((d − rs)(d + rs)) = l·rs / (d² − s²r²)    (6.16)

The parallel axis of E, denoted MN, is not transformed from st, the parallel axis of e, but from a line segment mn. Draw the perpendicular cz to l∞ from the center of e. Because z is the vanishing point, mz and nz are tangent to the ellipse e. We draw a co-centered supplementary circle e′ of radius r²/s² (see Figure 6.3).
Denote the intersections between the line mn and the circle e′ as m′ and n′. Clearly m′z and n′z are tangent to the circle e′, and furthermore mn = m′n′/s². Calculation shows that d_mn = d − s²r²/d and mn = √(s²r² − (s²r²/d)²) / s². Therefore

    MN = (√(s²r² − (s²r²/d)²) / s²) / (d − s²r²/d) = r / (s √(d² − s²r²))    (6.17)

The area of the ellipse E is proportional to the product of the two axes:

    A(E) ∝ MN · PQ = l r² / (d² − s²r²)^{3/2}    (6.18)

6.3.2.3 Calculating s and r with Respect to a Given Direction

We now remove the assumption that the vanishing line is parallel to the x-axis and let it have an arbitrary direction θ: l∞ : x cos θ + y sin θ + l = 0. To estimate l from the area ratio, we first calculate s, r, and d_c of each ellipse with respect to the direction θ. Denote the shape of e as e = (a, b; b, c). The parameter r represents the area of e and is independent of θ:

    r⁴ = det(e) = ac − b²    (6.19)

If we draw two lines tangent to e in the direction θ, the distance between them is 2sr; in other words, the lines x cos θ + y sin θ ± sr = 0 are tangent to ax² + 2bxy + cy² = 1. Solving with the tangency condition (discriminant Δ = 0) gives

    sr = √( (a sin²θ − 2b cos θ sin θ + c cos²θ) / (ac − b²) )    (6.20)

which can be simplified using (6.19) to

    s = √(a sin²θ − 2b cos θ sin θ + c cos²θ) / r³    (6.21)

6.3.2.4 Estimating the Vanishing Line from a Given Area Ratio of Two Ellipses

For two ellipses e₁ and e₂, after transformation P their area ratio becomes

    A(E₁) : A(E₂) = r₁² (d₂² − s₂²r₂²)^{3/2} / ( r₂² (d₁² − s₁²r₁²)^{3/2} )    (6.22)

Let d_c be the distance from the center of an ellipse to the line passing through the origin along the direction θ, l_o : x cos θ + y sin θ = 0, so that d = l − d_c. For a given area ratio value α, the vanishing line satisfies

    (l − d_{c2})² − s₂²r₂² = β^{1/3} ((l − d_{c1})² − s₁²r₁²)    (6.23)

where β = (α r₂²/r₁²)² is the change of the area ratio.

6.3.2.5 Solving the Area Ratio Constraint

Since it is easy to transform one of the ellipses into a circle, without loss of generality we assume that e₁ = C = diag(1, 1, −1) is a unit circle centered at the origin, and e₂ = e. Constraint (6.23) then becomes

    (l − (x_c cos θ + y_c sin θ))² − (a sin²θ − 2b cos θ sin θ + c cos²θ)/(ac − b²) = β(l² − 1)    (6.24)

where (x_c, y_c) is the center of e and, with a slight abuse of notation, β now denotes the constant β^{1/3} of (6.23). Using the pole of the vanishing line, v = (x, y, 1)ᵀ = C⁻¹ l∞, to represent the solution, we have

    (1 − (x_c cos θ + y_c sin θ) √(x² + y²))² − (a′ sin²θ − 2b′ cos θ sin θ + c′ cos²θ) − β + (1 − β)(x² + y²) = 0    (6.25)

where a′ = a/t, b′ = b/t, c′ = c/t and t = ac − b². Substituting cos θ = x/√(x² + y²) and sin θ = y/√(x² + y²), (6.25) leads to the solution of the problem:

    (x_c² − c′ + β)x² + 2(x_c y_c + b′)xy + (y_c² − a′ + β)y² − 2x_c x − 2y_c y + (1 − β) = 0    (6.26)

Equation (6.26) represents a conic. For any point v on this conic, using its polar line l_v = Cv as the vanishing line, the corresponding transformation gives the two converted ellipses the desired area ratio α.

6.3.2.6 Properties of the Solution

We now study the properties of the curve represented by (6.26). Not surprisingly, the curve passes through the origin x = 0, y = 0 when β = 1, which indicates that the line at infinity is a (trivial) solution that keeps the area ratio unaltered.

Equation (6.26) can be rewritten as

    (x_c x + y_c y − 1)² − (a′y² − 2b′xy + c′x²) + β(x² + y² − 1) = 0    (6.27)

Irrespective of the value of β, the curve always passes through four fixed points determined by the following two equations:

    (x_c x + y_c y − 1)² − (a′y² − 2b′xy + c′x²) = 0    (6.28)

    x² + y² − 1 = 0    (6.29)

Equation (6.29) shows that these points lie on the unit circle C. We further investigate the polar line corresponding to each fixed point. First of all, it is tangent to the circle C, and therefore satisfies l = 1. Substituting this into (6.24) and using (6.19), we find that the polar line satisfies

    x_c cos θ + y_c sin θ ± sr = 1    (6.30)

so the polar line is also tangent to the ellipse e; i.e., it is a common tangent line of the two ellipses. Remembering that this polar line is the vanishing line, we interpret the property as follows: as the vanishing line approaches the ellipse, the area of the transformed ellipse grows; when the vanishing line becomes tangent to the ellipse, the area becomes infinite.
If the vanishing line is a common tangent to the two ellipses, the area ratio is infinity over infinity, which can take any value. Thus we have:

Theorem 2. For transformations that make the area ratio of two ellipses equal to a given value, the corresponding poles of the vanishing lines lie on a conic, which passes through all the (real or complex) common tangent points (see Figure 6.4 for an illustration).

Figure 6.4: Solution curves under four conditions. Solid circles are the circles whose areas are to be equalized; the one with the plus mark is a unit circle. The dashed curve is the solution curve, of which only the solid part is valid. Dotted lines are the common tangents of the two circles.

A conic is determined by five points. Beyond the four fixed points, the fifth point fully determines the final area ratio of the ellipses after transformation.

6.3.2.7 Validation of the Solution

To guarantee that a vanishing line does not intersect the circle C, the pole of the vanishing line must lie inside C. The pole is further constrained because both ellipses should lie on the same side of the vanishing line. Using two circles (one of them the unit circle) as examples, the solution curves under different conditions are shown in Figure 6.4: the common tangent lines pass through the intersections between the solution curve and the unit circle.

6.3.2.8 Affine Rectification

Based on Theorem 2, we can affinely rectify a plane using three ellipses of equal area: (1) arbitrarily select one of the ellipses and transform it into a circle; (2) calculate the solution curves for area ratio 1 using (6.26); (3) affinely rectify the plane using as the vanishing line the polar of the intersection of two such curves. Because two curves may intersect at more than one point, this method can lead to multiple solutions.

6.3.3 Metric Rectification: From Affine to Metric

We now discuss an approach to metrically rectify the plane by equalizing the shapes of three ellipses of the same area. Denote the trace of E by T(E).
As we know,

    T(E) = e_m + e_n = 1/m² + 1/n²    (6.31)

Using Lemma 3, the two semi-axes m and n are the two roots of the following quadratic equation:

    x² − x √(T A²/π² + 2A/π) + A/π = 0    (6.32)

Thus, we have:

Theorem 3. The shape of an ellipse of fixed area is determined by the trace of its shape matrix.

Disregarding translation, rotation, and scaling, which do not change the metric properties except for absolute lengths, in this section we assume that (1) all the ellipses are centered at the origin; (2) the origin is invariant to the affine transformation; and (3) the transformation does not change the ellipse's area.

Figure 6.5: (a) An affine transformation that keeps the area of an ellipse constant. (b) An example of three ellipses with equal areas that cannot be simultaneously affinely transformed into the same shape.

Without loss of generality, we use a 2×2 matrix A for the affine transformation that affects the shape of the ellipse:

    A = [ 1/t  s
          0    t ]    (6.33)

Let the shape be E = (a, b; b, c). After applying the transformation, it becomes

    E′ = A⁻ᵀ E A⁻¹
       = [ t   0  ] [ a  b ] [ t  −s  ]
         [ −s 1/t ] [ b  c ] [ 0  1/t ]
       = [ at²       b − ast            ]
         [ b − ast   as² − 2sb/t + c/t² ]    (6.34)

The trace of the shape of the new ellipse is T(E′) = a(s² + t²) − 2bs/t + c/t².

Given multiple ellipses E_i (i = 1, 2, ..., n) of the same area, converting them into ellipses of the same shape amounts to equalizing their traces after transformation: T(E′_i) = a_i(s² + t²) − 2b_i s/t + c_i/t² = T₀, where T₀ is a constant to be determined. When n ≥ 3, this can be done by solving for (s²t² + t⁴, st, T₀t²) from the resulting linear equations and then computing s and t from them. Because T₀ and t⁴ are required to be positive, not every three ellipses of equal area can be affinely transformed into the same shape; Figure 6.5 (b) displays such a case.
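Multiplying the trace condition by t² gives a_i(s²t² + t⁴) − 2b_i(st) − T₀t² = −c_i, which is linear in the three unknowns (s²t² + t⁴, st, T₀t²). The following minimal sketch (our own NumPy illustration, not the dissertation's implementation; the function name is hypothetical) solves this system for n ≥ 3 equal-area shape matrices and recovers (s, t), returning None when the positivity conditions fail:

```python
import numpy as np

def metric_rectify_params(shapes):
    """Solve a_i*(s^2 t^2 + t^4) - 2*b_i*(s*t) - (T0*t^2) = -c_i for the
    unknowns u = (s^2 t^2 + t^4, s*t, T0*t^2), then recover (s, t).
    `shapes` is a list of (a, b, c) shape-matrix triples, one per
    equal-area ellipse.  Returns None when T0 or t^4 is non-positive,
    i.e. when the ellipses cannot share a common shape."""
    A = np.array([[a, -2.0 * b, -1.0] for a, b, c in shapes])
    rhs = np.array([-c for a, b, c in shapes])
    u1, u2, u3 = np.linalg.lstsq(A, rhs, rcond=None)[0]
    t4 = u1 - u2 ** 2          # since u1 = (s*t)^2 + t^4 and u2 = s*t
    if t4 <= 0 or u3 <= 0:     # not every equal-area triple is solvable
        return None
    t = t4 ** 0.25
    return u2 / t, t           # (s, t)
```

Applying the affine matrix of (6.33) with the recovered parameters equalizes the traces and hence, by Theorem 3, the shapes of the equal-area ellipses.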
6.3.4 Calibration Using Synthetic Data

We use an ellipse with semi-axes 2.5 and 1 fixed at the origin (which enables easy visual evaluation of the rectified image) and two identical ellipses randomly drawn in a 5×5 square. A camera with a focal length of 150 and an elevation angle of 70° is applied. We assume that the detected points on the ellipses all have integer coordinates, and the ellipses are fitted from them, which introduces noise. A sample image is shown in Figure 6.6 (a).

To evaluate the results, we select two sets of parallel lines and a circle, as shown in Figure 6.6 (b). The angles between the two sets of rectified parallel lines are calculated, which indicates the quality of the affine rectification. The ratio between the long and short axes of the recovered circle indicates the quality of the metric rectification. Finally, the focal length is estimated. The experiment was repeated 100 times and the results are displayed in Figure 6.6 (c) and (d). The standard deviation of the focal length is 30.3, which is not small. However, when the number of ellipses increases, the performance improves significantly; Table 6.1 shows the standard deviation of the estimated focal length versus the number of ellipses used.

Table 6.1: Statistics of the calibration performance

    # of ellipses used         3       5       7      9
    STD of the focal length    30.34   20.64   10.64  8.33

6.4 Rectification Using an Arbitrary Shape

Based on the theory above, we address a calibration method using an arbitrary shape in planar motion. It is based on a basic intuition: any region can be fitted by an ellipse. Given a planar shape moving on the plane, generating a blob in each frame, we fit an ellipse to the blob and use the ellipse for rectification. Obviously, the fitted ellipses may not be optimal after rectification, so we design a recursive rectification method to improve accuracy:

1. Detect the object blob by background subtraction [50] or other methods.
2. Select the largest blob as the reference.
3.
Fit each blob region with an ellipse.
4. Calculate the curve of poles for each ellipse with respect to the reference ellipse.
5. Select the point with the minimum sum of squared distances to the curves as the pole of the vanishing line.
6. Rectify the plane, and recompute the blobs on the rectified plane.
7. Repeat steps 3-6 until the refitted ellipses converge.

6.4.1 Experiments

6.4.1.1 Synthetic Leaf Data

We use two leaf shapes to rectify a projected plane. Using the width of the original image as the unit, in each experiment one leaf is arbitrarily placed in a 4×3 rectangle three times. The camera is placed at height 1 with an elevation angle of 75°. After rectification, the area of the normalized blobs is set to π. Typical resulting images are shown in Figure 6.7.

In the first experiment, the leaf's shape is hard to perceive from the three projected blobs. Although the quality of fit is poor (the fitted blobs are much smaller than they should be), the result is satisfactory: after rectification, the vanishing line is 540 units away from the origin, far outside the image region.

In the second experiment, a palmate leaf is used. The image appears more challenging to rectify because (1) one blob is significantly distorted projectively and (2) another blob contains only 300 pixels and its petiole is missing. In the first round, the estimate is not good enough, but further rounds of rectification provide improved results: the distances from the origin to the vanishing line in each round are 51, 461, 974 and 1122, respectively. The projective distortion is removed step by step.

6.4.1.2 Stationary Video

Using a web camera (Philips DMVC 300K/37L), we captured a video sequence from a tall building looking down at streets. The camera transmits one 480×640 frame per second. A total of 6 frames containing a vehicle, treated as a planar object moving on the ground plane, are used for rectification. Figure 6.8 shows the rectification result.
Although the vehicle is not a truly planar object, the rectification still works quite well, especially in removing the projective distortion.

6.4.1.3 Camera Undergoing Planar Motion

The last sequence comes from a camera in planar motion. The same camera was moved along a long table and captured images of a tennis racket cover placed on the ground. An intensity threshold is chosen to extract the black part of the cover as a planar object; varying the threshold from 25/255 to 75/255 does not significantly change the obtained mask. Altogether 14 frames are used. The rectification results are shown in Figure 6.9.

6.4.2 Application

We have already presented one method to detect the center of concentric circles. Now we present another method, which can also be used to detect projected concentric circles in some cases. Let there be two coplanar circles C_p = diag(1, 1, −1) and C_q = diag(1, 1, −r²) (0 < r < 1) with common center O at the origin; the area ratio between the two circles is r². Using a point P = (x, y, 1)ᵀ as the mapping center, based on (6.22), the area ratio after transformation can be written as

    A(c_q) : A(c_p) = r² (d² − 1)^{3/2} / (d² − r²)^{3/2}    (6.35)

where d² = 1/(x² + y²). From (6.35) we draw two conclusions: (1) using any point other than the common center O as the mapping center, the transformed area ratio always becomes smaller; and (2) the transformed area ratio depends only on the distance from the mapping center to the common center. In other words, all the mapping centers that transform the area ratio into a given value v < r² lie on a circle that is also centered at O. Viewed from another angle, the common center of the concentric circles lies inside the solution curve for any given area ratio value. This observation leads to a new method to detect the center of a set of projected concentric circles:

1. Fit the two ellipses c_p and c_q using the given information (c_q inside c_p).
2.
Using the center of c_q as the mapping center, calculate the corresponding area ratio r_qp after transformation.
3. Using r_qp as the area ratio constraint, obtain the corresponding solution curve c_qp, which should be an ellipse inside c_q.
4. Replace c_q with c_qp and repeat from step 2 until c_q converges.
5. Rectify the pattern and repeat from step 1.

Table 6.2: Performance comparison of Jiang's center-of-concentric-circles detection method [23], our method using the equal-distances constraint (Section 4.4.5, denoted "Dist"), and this method (denoted "Area")

              Jiang's   Dist (round)                              Area (round)
    round        -      1      2      3      4      5      6      1      2      3
    mean       7.18     1.72   1.56   1.51   1.71   2.07   1.50   5.62   0.23   0.25
    median     7.06     1.60   1.43   1.32   1.45   1.74   1.16   5.69   0.23   0.26
    std        0.26     1.16   0.98   0.98   1.23   1.43   1.19   0.78   0.08   0.07

6.4.3 Experiments

Using synthetic data, we first analyze the speed of this algorithm. Two projected concentric circles are shown in Figure 6.10 (a). The area of the inner circle and the distance between the detected center and the ground truth are plotted against the number of iterations in Figure 6.10 (b) and (c); the inner circle converges exponentially to the center. We then compare the results to Jiang's method using the same set of synthetic test images described in Section 4.4.5. The results for each round are listed in Table 6.2. It can be seen that this method outperforms both Jiang's method and our method derived from the equal-distance constraint described in Section 4.4.5.

Figure 6.6: Synthetic experiment. (a) Three projected identical ellipses are shown as solid curves. The black ellipse is fixed and the others are randomly drawn.
Dashed curves are the two solution curves that equalize the area ratio between the red/blue ellipse and the black ellipse. The dashed black line is the estimated vanishing line, while the gray solid line is the ground truth. (b) Rectified results (solid curves) compared with the ground truth (dashed curve); two sets of parallel lines of a square and one solid circle are used for evaluation. (c) Magnified histogram of the two angles between the two sets of parallel lines. (d) Axis ratio of the big circle (above, ground truth 1) and focal length estimate (below, ground-truth value 150).

Figure 6.7: Synthetic data for projected leaves. (a)-(d): set 1; (e)-(h): set 2. (a)(e): leaf images in the database. (b)(f): projected blobs, with fitted ellipses as solid curves and curves of poles as dashed curves. (c)(g): first-round rectification results; re-fitted ellipses are shown as dotted lines. (d): "false alarms". (h): result after the second round of rectification.

Figure 6.8: Rectification of a stationary video using a moving vehicle. (a) A sample frame while a car is turning. (b) Masks of the cars. (c) Curves of poles. (d) Rectification result.

Figure 6.9: Rectification of a video in planar motion using a tennis racket cover. (a) A sample frame. (b) Selected masks and all the fitted ellipses. (c) Rectification result for (a). (d) Rectification result for another frame.

Figure 6.10: Experiment with synthetic data for detection of the center of concentric circles.
(a) The image; (b) the area of the inner ellipse; (c) the detection error in distance.

Part IV

VIDEO METROLOGY

Chapter 7

Video Metrology Using a Stationary Camera

7.1 Introduction

Existing image metrology algorithms are limited in that they rely on the comparison of parallel line segments. Video metrology based directly on image metrology inherits this limitation. In fact, with multiple frames the limitation can be overcome. Thus, the video metrology problem we investigate in this chapter can be stated as:

Problem 9. Given a length on the reference plane in multiple frames as a reference, accurately estimate the length of another line segment on the reference plane using multiple frames.

First we show how the solution is derived. Then we describe our implementation of a wheelbase metrology system using a stationary video camera, followed by its performance evaluation. Finally, we discuss how metrology works with a camera in planar motion.

7.2 Two Lemmas

The difficulty of this problem lies in the complex relationship between the reference and probe line segments, because both of them may be scattered all over the image plane. To overcome it, two lemmas concerning line segments on the reference plane are provided, together with perturbation analysis.

Figure 7.1: Illustration of Lemma 6.

Lemma 6 enables metrology in 1-D space, and is stated as:

Lemma 6. Given four collinear points m, n, p and q in the image plane, denote by v the vanishing point along their common direction (see Fig. 7.1). The ratio between the two world line segments ‖MN‖ and ‖PQ‖ can be written as

    r = ‖MN‖ / ‖PQ‖ = (d_p d_q (d_m − d_n)) / (d_m d_n (d_p − d_q))    (7.1)

where d_t ≜ ‖vt‖ for t = m, n, p and q.

Lemma 6 can be easily proved using the cross-ratio [36].
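As a concrete numerical check of (7.1), the ratio can be evaluated directly from pixel coordinates once the vanishing point of the line is known. This short sketch is our own illustration (hypothetical function name, NumPy assumed); it presumes the point pairs are ordered consistently, e.g. d_m > d_n and d_p > d_q:

```python
import numpy as np

def length_ratio(v, m, n, p, q):
    """World-length ratio |MN|/|PQ| for two segments mn and pq lying on
    one image line with vanishing point v (Lemma 6, Eq. 7.1).
    Each argument is an image point (x, y); d_t = ||v - t||."""
    dm, dn, dp, dq = (np.linalg.norm(np.subtract(v, t)) for t in (m, n, p, q))
    return (dp * dq * (dm - dn)) / (dm * dn * (dp - dq))
```

For example, under the 1-D projective map x = (X + 1)/(0.1X + 1), which sends the world points X = 0, 2, 5, 8 to the image points 1, 2.5, 4, 5 and has its vanishing point at x = 10, the function recovers ‖MN‖/‖PQ‖ = 2/3.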
When ‖MN‖ = ‖PQ‖ and all the other points are known, Lemma 6 can also be used to localize q:

    d_q = d_m d_n d_p / (d_m d_p + d_m d_n − d_n d_p)    (7.2)

7.2.0.1 Perturbation Analysis

To analyze the accuracy of (7.1) under noisy conditions, we take the logarithm of both sides:

    ln r = ln d_p + ln d_q + ln(d_m − d_n) − ln d_m − ln d_n − ln(d_p − d_q)    (7.3)

and then take the derivative. Denoting the small disturbance of the distance of any point t by δ_t, the perturbation of the ratio becomes

    δr/r = (1/(d_m − d_n)) ((d_n/d_m) δ_m − (d_m/d_n) δ_n) − (1/(d_p − d_q)) ((d_q/d_p) δ_p − (d_p/d_q) δ_q)    (7.4)

Using Lemma 7, one can parallel-move a line segment to a given point on the image plane:

Lemma 7. Given a line segment mn and a point o, let mn and mo intersect the vanishing line L at points p and q, respectively. Denote by k the intersection point of lines nq and op (see Fig. 7.2). In the world system, OK is parallel to MN and ‖OK‖ = ‖MN‖.

Proof. Because p and q are vanishing points, OK is parallel to MN, and NK is parallel to MO. Thus MNKO forms a parallelogram, and we have ‖OK‖ = ‖MN‖. ∎

Following Lemma 7, the homogeneous coordinates of k are obtained as

    k ∼ (o × ((m × n) × L)) × (n × ((m × o) × L))    (7.5)

Figure 7.2: An illustration of Lemma 7. mn is parallel-moved to ok, where o is given and k is unknown; p and q are two points on the vanishing line.

Using the identity a × (b × c) = (a·c) b − (a·b) c, the fact that a × a = 0, and the equality of the scalar triple products (m × n)·o = (n × o)·m = det[m n o], which appear as a common factor, (7.5) simplifies to a representation of k using dot products alone:

    k ∼ (n·L)(o·L) m − (m·L)(n·L) o − (o·L)(m·L) n    (7.6)

7.2.0.2 Perturbation Analysis

By forcing the third element of each point's coordinates to be 1, (7.6) can be written as

    ((n·L)(o·L) − (m·L)(n·L) − (o·L)(m·L)) k = (n·L)(o·L) m − (m·L)(n·L) o − (o·L)(m·L) n    (7.7)

Taking the dot product of L with both sides of (7.7), we have

    1/(k·L) + 1/(m·L) = 1/(o·L) + 1/(n·L)    (7.8)

Taking the cross product of L with both sides of (7.7) and using (7.8), we have

    (k × L)/(k·L) + (m × L)/(m·L) = (o × L)/(o·L) + (n × L)/(n·L)    (7.9)

Draw a line L′ perpendicular to the vanishing line L in Fig. 7.2, and denote the distances from any point t to the lines L and L′ by h_t and g_t, respectively. The geometric interpretation of (7.8) and (7.9) is

    1/h_k + 1/h_m = 1/h_o + 1/h_n    (7.10)

    g_k/h_k + g_m/h_m = g_o/h_o + g_n/h_n    (7.11)

Denote the disturbance of a point t along the directions of L and L′ by δg_t and δh_t, respectively (t = m, n; we assume that o has no disturbance). From (7.10) and (7.11), the perturbations along the directions of L and L′ are

    δg_k = (h_k/h_n) δg_n + (h_k (g_m − g_k)/h_m²) δh_m − (h_k/h_m) δg_m − (h_k (g_n − g_k)/h_n²) δh_n    (7.12)

    δh_k = (h_k²/h_n²) δh_n − (h_k²/h_m²) δh_m    (7.13)

7.3 Measuring On-plane Line Segments

In this section, we present an algorithm for measuring on-plane line segments, with supporting perturbation analysis.

7.3.1 Reference Lengths in Three Frames

To illustrate the key idea of our algorithm, we first pose a simplified problem as an example, together with its solution and the corresponding perturbation analysis.

Problem 10. Given the reference length ‖MN‖ = 1 on the reference plane in three frames as m_i n_i (i = 1, 2, 3), estimate the length ‖OP‖ of any line segment op on the reference plane.

We analyze the problem in the image plane and show the corresponding picture in the world system. First, to make the reference lengths and the probe share a common endpoint, we parallel-move each m_i n_i so that m_i is mapped to o and n_i to a point k_i. In the world plane, ‖OK_i‖ = ‖M_i N_i‖ = 1. Thus, the three points K_i (i = 1, 2, 3) determine a unit circle C centered at O.
Let K₀ be the intersection point between C and OP, so that the radius ‖OK₀‖ = 1. If k₀, the image of K₀, is known, ‖OP‖ can be measured using Lemma 6. Thus, the key to solving this problem is to determine the image of C, which is an ellipse denoted by E.

Determining an ellipse requires at least five points on it. Three points k_i (i = 1, 2, 3) on E can be obtained using Lemma 7. Because the center o is given, we can extend each k_i o through o and localize a point k_{i+3} on the extension line such that ‖K_{i+3} O‖ = 1; this operation provides three more points. The ellipse E can then be fitted using k_i (i = 1, 2, ..., 6). We summarize the algorithm as follows:

1. Parallel-move m_i n_i to o k_i using Lemma 7.
2. Extend k_i o through o. Localize k_{i+3} on the extension line using Lemma 6, such that ‖K_{i+3} O‖ = ‖K_i O‖ = 1.
3. Fit an ellipse E passing through all six points k_i (i = 1, 2, ..., 6).
4. Obtain the intersection between E and op, denoted k₀.
5. Calculate r = ‖OP‖ : ‖OK₀‖ using Lemma 6 as the metrology result.

Figure 7.3: Illustration of Problem 10 and its solution in (a) the image plane and (b) the world system. Red, green and blue line segments are the reference lengths; the black segment is the probe.

7.3.2 Perturbation Analysis

We analyze the perturbation of K₀ introduced by disturbances of the K_i. Without loss of generality, O is assumed to coincide with the image center. The parameter vector E = (a, b, c, d, e)ᵀ represents the ellipse E satisfying ax² + bxy + cy² + dx + ey + f = 0. The coordinates of K_i are denoted (x_i, y_i) and its coefficient vector is defined as v_i = (x_i², x_i y_i, y_i², x_i, y_i)ᵀ. Letting G = (v₁ v₂ v₃ v₄ v₅)ᵀ, the ellipse can be solved as

    E = −f G⁻¹ u    (7.14)

where u = (1, 1, 1, 1, 1)ᵀ. Suppose the disturbance at each point K_i is (δx, δy); the corresponding disturbance of the coefficient matrix is δG.
From (7.14), the perturbation of the ellipse becomes

    δE = −G⁻¹ δG E    (7.15)

Using the condition v₀ᵀ E = −f, the perturbation of the coefficient vector of K₀ becomes

    δv₀ᵀ E = v₀ᵀ G⁻¹ δG E    (7.16)

Define r₀ = √(x₀² + y₀²) as the distance from K₀ to O. Representing δv₀ as (δr₀/r₀) v₀, the perturbation of r₀ is

    δr₀ = r₀ v₀ᵀ G⁻¹ δG G⁻¹ u    (7.17)

To further simplify (7.16), we assume that both the reference and probe lengths are much shorter than the distance from the point O to the vanishing line L, which leads to d = e = 0. The parameters of the ellipse then reduce to a 3×1 vector E = (a, b, c)ᵀ and the coefficient vector of K_i becomes v_i = (x_i², x_i y_i, y_i²)ᵀ. Writing D_ij = x_i y_j − x_j y_i, G⁻¹ can be expressed as

    G⁻¹ = [ −y₂y₃/(D₁₂D₃₁)          −y₃y₁/(D₂₃D₁₂)          −y₁y₂/(D₃₁D₂₃)
            (x₂y₃+x₃y₂)/(D₁₂D₃₁)    (x₃y₁+x₁y₃)/(D₂₃D₁₂)    (x₁y₂+x₂y₁)/(D₃₁D₂₃)
            −x₂x₃/(D₁₂D₃₁)          −x₃x₁/(D₂₃D₁₂)          −x₁x₂/(D₃₁D₂₃) ]    (7.18)

The term δG G⁻¹ u is of the same order as the disturbance. Denoting t_i = y_i/x_i (i = 0, 1, 2, 3), we have

    v₀ᵀ G⁻¹ = ( (x₀²/x₁²) (t₂−t₀)(t₃−t₀)/((t₂−t₁)(t₃−t₁)),
                (x₀²/x₂²) (t₃−t₀)(t₁−t₀)/((t₃−t₂)(t₁−t₂)),
                (x₀²/x₃²) (t₁−t₀)(t₂−t₀)/((t₁−t₃)(t₂−t₃)) )    (7.19)

In (7.19), each denominator is related to the angle ∠K_iOK_j, which is denoted θ_ij. Then we have

    ‖v₀ᵀ G⁻¹‖ ∼ min_i |θ₀i| / min_{i,j} |θ_ij|    (7.20)

7.3.3 Reference and Probe Lengths in Multiple Frames

In the general case, both the reference and the probe lengths appear in multiple frames, and Problem 9 can be formalized as:

Problem 11. Given a reference length ‖MN‖ = 1 in multiple frames m_i n_i (i = 1, 2, ..., f) and a probe line segment ST, also in multiple frames s_j t_j (j = 1, 2, ..., g), determine the length l of ST.

We first select an arbitrary point o on the image plane and parallel-move each m_i n_i to o p_i and each s_j t_j to o q_j. We then localize p_{i+f} on the line o p_i so that ‖OP_i‖ = ‖OP_{i+f}‖, and repeat the same steps to localize q_{j+g}.
Now consider the lengths of the new line segments in the world system: ‖OP_i‖ = 1 (i = 1, 2, ..., 2f) and ‖OQ_j‖ = l (j = 1, 2, ..., 2g). Denoting O's coordinates by (x_O, y_O), the points P_i and Q_j lie on the two concentric circles C_p : (x − x_O)² + (y − y_O)² − 1 = 0 and C_q : (x − x_O)² + (y − y_O)² − l² = 0, respectively. Define a degenerate circular conic centered at O with zero radius as

    C₀ = [ 1     0     −x_O
           0     1     −y_O
           −x_O  −y_O  x_O² + y_O² ]    (7.21)

The two circles can then be written as

    C_p = C₀ − diag(0, 0, 1),    C_q = C₀ − l² diag(0, 0, 1)    (7.22)

Denoting by H the 3×3 homography from the world reference plane to the image plane, the image of a circle C is known to be the ellipse E = H⁻ᵀ C H⁻¹. The images of the concentric circles are therefore of the form

    E_p = E₀ − L,    E_q = E₀ − l² L    (7.23)

where E₀ = H⁻ᵀ C₀ H⁻¹ and L = H⁻ᵀ diag(0, 0, 1) H⁻¹.

We analyze L and E₀ respectively. Writing the vanishing line as L = (L(1), L(2), L(3)), H⁻¹ can be written as

    H⁻¹ = α [ h₁₁   h₁₂   h₁₃
              h₂₁   h₂₂   h₂₃
              L(1)  L(2)  L(3) ]    (7.24)

Thus,

    L = α² [ L(1)L(1)  L(1)L(2)  L(1)L(3)
             L(1)L(2)  L(2)L(2)  L(2)L(3)
             L(1)L(3)  L(2)L(3)  L(3)L(3) ]    (7.25)

Since E₀ stands for a degenerate ellipse at the point o = (x_o, y_o), it is of the form a₀(x − x_o)² + b₀(x − x_o)(y − y_o) + c₀(y − y_o)² = 0 (we ignore the restriction b² < 4ac).
The ellipses E_p and E_q can then be written as

    a₀(x − x_o)² + b₀(x − x_o)(y − y_o) + c₀(y − y_o)² − α² v L̄ = 0    (7.26)

    a₀(x − x_o)² + b₀(x − x_o)(y − y_o) + c₀(y − y_o)² − β² v L̄ = 0    (7.27)

where β = αl, v = (x², xy, y², x, y, 1), and

    L̄ = (L(1)², 2L(1)L(2), L(2)², 2L(1)L(3), 2L(2)L(3), L(3)²)ᵀ

Define e = (a₀, b₀, c₀, α², β²)ᵀ and the row vectors

    u_p = ((x − x_o)², (x − x_o)(y − y_o), (y − y_o)², −v(p) L̄, 0)
    u_q = ((x − x_o)², (x − x_o)(y − y_o), (y − y_o)², 0, −v(q) L̄)

The coefficient matrix M is formed by stacking the rows u_{p_i} and u_{q_j}, and Problem 11 reduces to a linear fitting problem:

    M e = 0    (7.28)

The nontrivial least-squares solution is

    e = argmin ‖Me‖ subject to ‖e‖ = 1

which can be obtained from the singular value decomposition M = USVᵀ: e is the right singular vector corresponding to the smallest singular value of M [22].

Figure 7.4: A poorly fitted ellipse is improved significantly by adding one more point. The red ellipse is fitted from the red (noisy) line segments; the blue ellipse is fitted from both the red and blue line segments. The dashed black ellipse is the ground truth.

After determining α² and β², the length of the probe line segment follows directly as

    l = √(β² / α²)    (7.29)

This method also works for measuring the lengths of multiple line segments on the reference plane, by estimating α, β, and one further scale coefficient per additional segment.

7.3.4 Performance in Noisy Environments

Perturbation analysis for the case of multiple frames is very difficult. Intuitively, a poorly fitted ellipse may be improved significantly by adding one more point along a "decent" direction (Fig. 7.4). When there are enough reference lengths, the references inducing large perturbation errors can be ignored. Denote by δl_{i,j,k} the perturbation of the result computed from a triple {i, j, k} of reference lengths. The perturbation using multiple lengths is lower than that of any single triple: δl < δl_{i,j,k}.
We approximate the total perturbation by the lowest value over the triples:

    δl = min_{i,j,k} δl_{i,j,k}

During the measuring procedure, we assume that the disturbances of the endpoints of the input line segments are independent and identically distributed. The perturbation of each k_i due to the input noise can be derived using (7.7). After fitting the ellipses using the k_i, we decompose each perturbation vector into two components, one tangent and one normal to the ellipse, denoted t_i and n_i respectively. Since t_i does not affect the fitting result, only n_i is considered. The algorithm is modified as follows:

1. Fit the two ellipses using M as described before.
2. Calculate the perturbation of each reference and probe length along the normal direction of the fitted ellipses.
3. Estimate the total perturbation of each probe, denoted n_i.
4. Refit the two ellipses using a new matrix M′, where each row of M′ is the corresponding row of M multiplied by a factor proportional to 1/n_i.

This algorithm in effect assigns each point a weight during fitting that is inversely proportional to the effect of its perturbation. Since the effect of the perturbation is estimated from the fitted ellipses, an Expectation-Maximization (EM) style iteration could be applied: the E-step estimates the errors from the ellipses, and the M-step fits the ellipses to minimize those errors. However, experiments have shown that the full EM iteration may not be necessary, as the results after one iteration are usually acceptable.

Figure 7.5: Synthetic data for the metrology algorithm. (a) Traces of wheelbases from two vehicles. (b) Mean of metrology error.
Red: result using rectification; green: our ellipse-fitting metrology algorithm without perturbation analysis; blue: our metrology algorithm with perturbation analysis (weighted). (c) Standard deviation of the metrology result.

7.4 Synthetic Data Experiments

We illustrate the effectiveness of our metrology algorithm using synthetic data.

7.4.1 Metrology Algorithm

We first evaluate the performance of our method and compare it to the rectification method using synthetic data. Two sets of images of wheelbases lying on the reference plane are simulated. As shown in Fig. 7.5 (a), each set contains two main directions. The ratio between the two wheelbases is 1.60.

The noisy input data is generated by adding independent and identically distributed Gaussian random noise to all the endpoints of the line segments. Three methods are used to measure the ratio of the two sets of wheelbases. The traditional method first rectifies the image using Liebowitz's method [29], calculates the lengths in the rectified image, and then compares the means of the two sets. The rectification algorithm contains a step that estimates the common intersection point of multiple circles on the α−β plane, which is achieved by minimizing the sum of distances from that point to all the circles (the distance from a point to a circle is defined as the difference between the distance from the point to the circle's center and the circle's radius). The second method is our unweighted concentric-ellipse fitting algorithm. The last one incorporates the noise analysis and uses the weights. We tried different noise levels, and for each level ten sets of data were independently generated. The mean relative error of each method versus the noise level is plotted in Fig. 7.5 (b). Both of our methods outperform the traditional rectification method, and weighted fitting achieves the best result among all the methods. Fig. 7.5 (c) plots the standard deviation of the metrology results.
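The two ellipse-fitting variants compared above reduce to a single null-vector computation. The following is a minimal illustrative sketch, not the dissertation's implementation: the vanishing-line term vL is simplified to the constant 1, and the function name, the optional `weights` argument (which stands in for the row scaling of Section 7.3.4), and the synthetic data are assumptions.

```python
import numpy as np

def fit_concentric_ellipses(pts_p, pts_q, center, weights=None):
    """Sketch of the concentric-ellipse fit (7.28): stack one row per sample
    point, take the null vector e = (a', b', c', alpha^2, beta^2) of M via SVD,
    and return the probe length l = sqrt(beta^2/alpha^2) (7.29).
    Simplifications (not from the text): the vanishing-line term vL is taken
    as the constant 1, and `weights` implements the row scaling of Sec. 7.3.4
    (w_i inversely proportional to the estimated normal perturbation n_i)."""
    x0, y0 = center
    rows = []
    for x, y in pts_p:
        dx, dy = x - x0, y - y0
        rows.append([dx * dx, dx * dy, dy * dy, -1.0, 0.0])   # u_p rows
    for x, y in pts_q:
        dx, dy = x - x0, y - y0
        rows.append([dx * dx, dx * dy, dy * dy, 0.0, -1.0])   # u_q rows
    M = np.array(rows)
    if weights is not None:                 # weighted refit: M' = diag(w) M
        M = M * np.asarray(weights, float)[:, None]
    _, _, Vt = np.linalg.svd(M)             # e = argmin ||Me||, ||e|| = 1
    e = Vt[-1]
    return np.sqrt(e[4] / e[3])             # l = sqrt(beta^2 / alpha^2)

# synthetic check: reference points on x^2 + 2y^2 = 1, probe points at scale 3
t = np.linspace(0, 2 * np.pi, 12, endpoint=False)
p = np.c_[np.cos(t), np.sin(t) / np.sqrt(2)]
q = 3.0 * p
print(fit_concentric_ellipses(p, q, (0.0, 0.0)))   # ~3.0
```

With exact data the smallest singular value is numerically zero and the recovered ratio equals the true scale; passing `weights` proportional to 1/n_i reproduces the weighted variant.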
7.5 System Implementation

In this section, we present a real-time wheelbase measuring system using a stationary camera. The configuration of the system is presented in Figure 7.6.

Figure 7.6: Flow chart of the wheelbase measuring system: background subtraction, wheel detection and tracking, frame selection, vanishing line estimation, wheelbase mensuration, and other distance mensuration.

7.5.1 Wheel Detection and Tracking

After background subtraction [50], we estimate the vehicle's direction of motion using optical flow. We attach a system of skew coordinates (x′, y′) to the moving vehicle, where x′ is along the direction of motion and y′ is vertical, as shown in Fig. 7.7. Since each tire of a vehicle is black and the wheel cover is silver or gray, even a simple intensity filter can separate the wheel cover from the tire well. To detect the wheels, we preset a low threshold t_i and remove from the foreground vehicle mask the pixels whose intensity is lower than the threshold. As we gradually raise the threshold and keep removing the low-intensity pixels from the mask, the whole mask breaks into pieces step by step. The procedure stops when two similar blobs appear at the bottom (with low coordinate y′). These two blobs are treated as the masks of the wheel covers, and their centers are used as the detected wheel centers. The final threshold value is denoted as t.

A similar method is used to track the wheels. To speed up the operation, the initial threshold for frame f is set to the final threshold from frame f − 1 minus a small amount: t_i^f = t^{f−1} − δ. The detection region is also focused on the previous wheel center location plus the shift of the vehicle center. The tracking is performed in real time.

Figure 7.7: Skew coordinates (x′, y′) based on the vehicle's direction of motion.

7.5.2 Minimal Calibration

Treating the line connecting the detected wheel centers as the wheelbase, we estimate the vanishing line using the method presented in Chapter 6.
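The threshold-raising detector of Section 7.5.1 can be sketched as follows. This is an illustrative toy implementation: the connected-component helper, the 0.5 similarity heuristic, and the synthetic intensity image are assumptions, and the low-y′ check from the text is omitted.

```python
import numpy as np
from collections import deque

def blobs(mask):
    """4-connected components of a boolean mask, as lists of (row, col) pixels."""
    seen = np.zeros_like(mask, dtype=bool)
    H, W = mask.shape
    comps = []
    for i in range(H):
        for j in range(W):
            if mask[i, j] and not seen[i, j]:
                comp, q = [], deque([(i, j)])
                seen[i, j] = True
                while q:
                    r, c = q.popleft()
                    comp.append((r, c))
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                        if 0 <= rr < H and 0 <= cc < W and mask[rr, cc] and not seen[rr, cc]:
                            seen[rr, cc] = True
                            q.append((rr, cc))
                comps.append(comp)
    return comps

def detect_wheels(intensity, fg_mask, dt=0.05):
    """Strip low-intensity (tire) pixels from the foreground with a rising
    threshold t until two similar blobs (the bright wheel covers) remain;
    their centroids are taken as the wheel centers. The 0.5 similarity
    heuristic is an assumption, and the low-y' check is omitted."""
    t = 0.0
    while t <= 1.0:
        comps = sorted(blobs(fg_mask & (intensity >= t)), key=len, reverse=True)
        if len(comps) >= 2 and len(comps[1]) / len(comps[0]) > 0.5:
            return t, [tuple(np.mean(c, axis=0)) for c in comps[:2]]
        t += dt
    return None

# toy vehicle: 0.5 body, 0.1 tires, 0.9 wheel covers (all values illustrative)
intensity = np.full((10, 20), 0.5)
intensity[5:10, 1:6] = intensity[5:10, 14:19] = 0.1     # dark tires
intensity[6:9, 2:5] = intensity[6:9, 15:18] = 0.9       # bright covers
t, centers = detect_wheels(intensity, np.ones((10, 20), dtype=bool))
print(t, centers)   # threshold just above the body intensity; centers at the covers
```

On this toy image the mask first splits into body-plus-covers and only passes the similarity test once the body itself is removed, leaving the two cover blobs.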
7.5.3 Indoor Video Experiments

Two green radio-controlled toy Mitsubishi Lancer cars were used in our experiment. The larger car is at 1:6 scale and the smaller one at 1:16. Each frame has 480×640 pixels and the frame rate is 20 frames/second. Sample frames are shown in Fig. 7.8 (a). Figure 7.8 (b) illustrates the mask after removing the low-intensity pixels. Samples of the tracked wheels are shown in Fig. 7.8 (c). Fig. 7.8 (d) shows the endpoints of the wheelbases after the parallel-move and fitting steps have been performed.

Figure 7.8: Indoor sequence. (a) A sample frame (number 4190). (b) Mask after removing the pixels with intensity lower than 0.00, 0.05, 0.10, and 0.15 respectively. (c) Samples of retrieved wheels from one vehicle; the same color indicates a pair (one front, one rear) from one frame. (d) Endpoints after parallel moving of the two vehicles; red for the larger toy car, green for the smaller toy car, '+' indicates the common endpoint.

Table 7.1: Wheelbase metrology results

Year/Make/Model        Size/Category    Ground Truth (in.)   Metrology result (in.)
2004 Hyundai Elantra   compact sedan    102.7                102.83 ± 1.75
2001 Honda Civic       compact sedan    103.1                103.47 ± 2.77
2004 Toyota Camry      midsize sedan    107.1                108.02 ± 2.45
2004 Ford Explorer     large SUV        113.8                115.04 ± 2.32

Our video metrology result shows that the ratio of the two wheelbases is 39.2%. Actual measurement shows that the larger toy car's wheelbase is 41.90 cm and the smaller one's is 16.75 cm, a ratio of 40%. Our solution thus has an error of 1.95%, far better than the ratio implied by the toy producer's claimed scales ((1/16)/(1/6) = 0.375), which has a 6.25% error.

7.5.4 Outdoor Video Experiments

The video sequences were captured using cameras located more than 25 feet above the ground. Four types of vehicles were imaged: a 2004 Toyota Camry, a 2001 Honda Civic, a 2004 Hyundai Elantra, and a 2004 Ford Explorer.
The frame rate is 20 frames/second and each frame has 480×720 pixels. Sample frames are shown in Fig. 7.9 (a)-(d). Samples of detected wheels are shown in Fig. 7.10.

Figure 7.9: Sample frames from the outdoor sequences. (a) Toyota Camry (b) Honda Civic (c) Hyundai Elantra (d) Ford Explorer

Using the fitted ellipses, we can estimate the error of each of the line segments and then calculate the variance of the input noise from it. The probability distribution of the measurement result can then be obtained, as presented in Fig. 7.11. The measurement results and error estimates are displayed in Table 7.1. Although the Civic and the Elantra are not distinguishable, the Camry can be separated from them, and the Explorer differs from the Camry as well. Thus the result can be used to determine a vehicle's class.

Figure 7.10: Samples of detected wheels. The detected wheel centers are indicated by white crosses.

7.6 Measure All Line Segments

We briefly discuss how to measure line segments that do not lie on the reference plane, given a line segment on the reference plane as a reference. Since the reference lengths can be freely parallel-moved on the reference plane, the problem can be restated as:

Problem 12. Given a reference line segment m_i n_i, estimate the length of m_i p_i.

Let Q be the projection of P onto R. The results of the previous section show that mq can be measured using mn. The length of pq can be obtained using Criminisi's method [10]. ‖MP‖ then follows from the Pythagorean rule: ‖MP‖ = √(‖MQ‖² + ‖PQ‖²). The problem is thus reduced to localizing the projected point q on the image plane.
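Once ‖MQ‖ and ‖PQ‖ have been measured, combining them is a one-line computation; a trivial sketch (the helper name is hypothetical):

```python
import math

def length_MP(mq, pq):
    """||MP|| = sqrt(||MQ||^2 + ||PQ||^2): combine the in-plane measurement
    of MQ with the out-of-plane measurement of PQ."""
    return math.hypot(mq, pq)

print(length_MP(3.0, 4.0))   # 5.0
```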
Figure 7.11: Wheelbase metrology probability plots. (a) 2004 Toyota Camry (b) 2001 Honda Civic (c) 2004 Hyundai Elantra (d) 2004 Ford Explorer

7.6.1 Simplified Case

We first study a simplified version of the problem by adding a constraint on the point P:

Problem 13. Given a reference line segment mn, estimate the length of mp, under the constraint that the plane formed by M, N, and P is perpendicular to the reference plane R.

Let the lines mn and pz intersect at q, where z is the vertical vanishing point. Point q is the projected point because (1) the line PQ is perpendicular to R, as it passes through the vertical vanishing point, and (2) Q lies on the intersection of plane MNP and plane R.

7.6.2 General Case

To solve Problem 12 in the general case, at least two frames are required.

Theorem 4. Given reference lengths m_i n_i (i = 1, 2) with ‖M₁N₁‖ = ‖M₂N₂‖ and a point q₁ on the reference plane R, a point q₂ can be localized so that ‖Q₂M₂‖ = ‖Q₁M₁‖ and ∠Q₂M₂N₂ = ∠Q₁M₁N₁.

Proof. Draw an ellipse E_m such that for any point x on E_m, ‖XM₂‖ = ‖Q₁M₁‖, and an ellipse E_n such that for any point x on E_n, ‖XN₂‖ = ‖Q₁N₁‖. Denote an intersection point of the two ellipses as q₂. We have △M₂N₂Q₂ ≅ △M₁N₁Q₁, so q₂ satisfies the requirement.

Figure 7.12: Illustration of Theorem 4. Red dashed and blue solid lines indicate frames 1 and 2 respectively; the black dash-dot line is the transformed frame 1.

Figure 7.13: Metrology of the tracked points from the reference wheelbase: (a) side view, (b) top view.

Theorem 4 in effect suggests a rotation operation on the reference plane. We can then obtain q from two frames by rotating frame 1 to 1′ so that m′₁ = m₂ and
n′₁ = n₂. The vertical vanishing point z then becomes z′. Because q always lies on the line pz during the rotation, the intersection point of z′p′₁ and zp₂ is q₂ = q′₁ (see Fig. 7.12). A metrology method for multiple frames can then be designed:

1. Denote MN's midpoint as K, and find k_i in each frame.
2. Rotate each frame so that k′_i = k₀ and m′_i n′_i is parallel to the x-axis. Here k₀ can be any fixed point.
3. Localize the point q₀ that minimizes the total squared distance to the lines z′_i p′_i: q₀ = argmin_q Σ_i dist²(q, z′_i p′_i).
4. Project q₀ onto z′_i p′_i as q′_i.
5. Estimate ‖M′_i P′_i‖ through q′_i and combine the results.

7.6.3 Experiments

We use the same outdoor video sequences to measure several line segments on the SUV. By tracking feature points on the object using the KLT tracker [32][42][45], the geometric properties of an object can be derived. We track the window corners and rack crossing points of the Ford Explorer and then measure their spatial locations using our method. The window corners are assumed to form a vertical plane passing through the two wheel centers (so we can use the simplified method), while the rack points have no such constraint. The initial points are manually selected. Fig. 7.13 compares our result with the ground-truth images from the side and top views. Major errors occur when the tracked points drift away as frames are processed.

7.7 Frame Selection for Camera in Planar Motion

A camera undergoing planar motion moves parallel to the reference plane at a fixed height h, elevation angle θ, and rotation angle φ. The vanishing line L is fixed during the planar motion. Stationary objects shot by a camera in planar motion can be treated as objects in planar motion on the reference plane shot by a stationary camera. Thus, our metrology algorithm is applicable to a camera in planar motion. In practice, the planar motion camera is mounted on an object in planar motion, e.g., a vehicle moving on the reference plane. Under this condition, the height of the camera is invariant.
However, the two angles θ and φ may change with time, which makes the video jitter and the vanishing line unstable. One solution to this problem is to select a group of frames whose vanishing lines are similar to each other and use those frames for metrology.

The vanishing line can be represented using two parameters, the distance d from the origin o to L and the angle φ of L: L = (sin φ, cos φ, d)ᵀ. To compare the similarity of two vanishing lines, a simple camera model with unit aspect ratio and zero skew is adopted. Under this condition, we draw a perpendicular from o to L and denote its foot as f = (d cos φ, d sin φ, 1)ᵀ. Given a reference vanishing line L₀ and two vanishing lines L₁, L₂ close to L₀, we can prove the following property of the feet: if |θ₁ − θ₀| ≤ |θ₂ − θ₀| and |φ₁ − φ₀| ≤ |φ₂ − φ₀|, then ‖f₀f₁‖ ≤ ‖f₀f₂‖. Thus, the distance between the two feet can be used to indicate the distance between two vanishing lines. In the rest of this chapter, we use the foot f to represent the vanishing line.

The images from two frames 0 and 1 share a fixed line l₀,₁, which can be calculated as the only real eigenvector of Hᵀ, where H is the homography matrix between the frames. If the two frames have the same vanishing line L, it can be calculated as the fixed line between the two frames. On the other hand, given the vanishing line of frame 0: if l₀,₁ = L₀, then the vanishing line of frame 1 is the same as frame 0's; otherwise, they are different. Based on this fact, we propose the following criteria to select frames with vanishing lines similar to that of frame 0:

1. Set a predetermined distance threshold d₀.
2. Obtain the homography matrix H between frame 0 and the query frame.
3. Calculate the real eigenvector of Hᵀ, denoted as l.
4. Find the distance d between the line l and L₀. If d < d₀, the frame is selected.

For frames with very different vanishing lines, the eigenvectors are distributed sparsely. Assuming that many frames have vanishing lines similar to frame 0's, their eigenvectors are close to L₀.
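The four-step criterion can be sketched in a few lines. This is an illustrative sketch only: the function names and the synthetic homography construction are hypothetical, and the two lines are compared through their perpendicular feet, as above.

```python
import numpy as np

def fixed_line(H):
    """Fixed line of the inter-frame homography H: the real eigenvector of H^T,
    normalized as a line (a, b, c) with a^2 + b^2 = 1 (step 3)."""
    w, v = np.linalg.eig(H.T)
    l = v[:, int(np.argmin(np.abs(w.imag)))].real   # the (essentially) real one
    return l / np.hypot(l[0], l[1])

def foot(l):
    """Foot of the perpendicular from the origin to the line ax + by + c = 0."""
    a, b, c = l
    return np.array([-a * c, -b * c]) / (a * a + b * b)

def select_frame(H, L0, d0):
    """Step 4: keep the frame if the fixed line lies within d0 of the reference
    vanishing line L0, comparing the two lines through their feet."""
    return np.linalg.norm(foot(fixed_line(H)) - foot(np.asarray(L0, float))) < d0

# toy check: build H whose transpose has the single real eigenvector (0, 1, -5),
# i.e. the fixed line y = 5 (this construction is purely illustrative)
P = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [-5.0, 0.0, 0.0]])
c, s = np.cos(0.7), np.sin(0.7)
B = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])
H = (P @ B @ np.linalg.inv(P)).T
print(select_frame(H, (0.0, 1.0, -5.0), 0.5))   # fixed line coincides with L0
print(select_frame(H, (0.0, 1.0, -3.0), 1.0))   # feet (0, 5) vs (0, 3): too far
```

The rotation block in B gives Hᵀ one real eigenvalue and a complex pair, so the real eigenvector (the fixed line) is unique, matching the assumption in the text.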
Thus, when L₀ is unknown, it can be obtained by clustering the eigenvectors.

7.7.1 Synthetic Data Results

Our last experiment concerns frame selection. Let the elevation and rotation angles of frame 0 be θ₀ = 70° and φ₀ = 5°. The two angles of the other frames are perturbed independently by Gaussian noise with a standard deviation of 5°. The cameras' positions are randomly distributed in a square. The vanishing lines of these frames are shown in Fig. 7.14 (a) and compared with the fixed lines, which are more sparsely distributed. The fixed lines within a distance d₀ of L₀ are selected, as shown inside the gray circle, and the corresponding vanishing lines are marked with red circles. We make the following observations:

- Selected vanishing lines are comparatively closer to L₀, especially in the y (θ) direction.
- Selected vanishing lines may not be within the distance d₀ of L₀.
- Not all vanishing lines within the distance d₀ of L₀ are selected.

The algorithm is evaluated from three aspects, with 10,000 frames involved in total. First, the distribution of the selected vanishing lines is analyzed. Denote the coordinates of the foot of a vanishing line as f_x and f_y. Fig. 7.14 (b) displays the standard deviations (STD) of f_x and f_y vs. d₀. The figure shows that both std(f_x) and std(f_y) are roughly proportional to d₀, i.e., there exists a constant α satisfying f_x, f_y < α d₀.

Figure 7.14: Synthetic data for frame selection. (a) The vanishing lines represented by their feet. Black crosses: the vanishing line of frame 0; green crosses: the vanishing lines of other frames; blue dots: fixed lines obtained from the homography matrices between frame 0 and other frames.
Red circles: the selected vanishing lines, i.e., those whose corresponding fixed line is close to the vanishing line of frame 0. (b) The standard deviations (STD) of the coordinates (x: blue crosses; y: red pluses) of the feet of the selected vanishing lines vs. the distance threshold d₀. (c) The number of selected frames vs. d₀, out of 10,000 frames. (d) The ratio of the number of selected frames to the number of frames that should be selected, vs. d₀.

Thus, to select frames with vanishing lines within a distance d of L₀, we may simply set d₀ = d/α. The std(f_x) and std(f_y) for the unselected frames are 33 and 100 respectively. Although std(f_x) is slightly smaller than std(f_y) for the selected frames, more vanishing lines far from L₀ in the y direction are filtered out.

The second aspect we analyze is the number of frames selected, which is plotted in Fig. 7.14 (c). The number of frames is proportional to d₀².

The last question to answer is: of the vanishing lines within a region of size d, how many are selected? If we assume that the vanishing lines are uniformly distributed within a circle of radius r, their STD is around 0.41r. For each d₀, we create an ellipse with x-axis 2 std(f_x)/0.41 and y-axis 2 std(f_y)/0.41. The number of vanishing lines inside the ellipse, denoted n_v, can be treated as the number of frames that should be selected. The ratio of the number actually selected to n_v is plotted in Fig. 7.14 (d). Around 20% of the frames are selected using this method.

The synthetic data shows that this method works, though not perfectly. With too small a d₀, the number of selected frames may be small, which makes the metrology algorithm unstable; with too large a d₀, the vanishing lines of the selected frames may not be close to L₀, which makes the metrology algorithm inaccurate. Thus, a proper value of d₀ should be chosen.

7.7.2 Indoor Video Experiments

The last experiment involves four stationary toy vehicles: a silver Cobra, a black Hummer, a yellow SVT, and a blue Silverado.
The frame rate is 3 frames/second, and 500 frames are collected in total. Each frame was saved as a 480×640-pixel JPEG file. Eight wheel centers are detected and tracked automatically. Frame 250 is selected as the reference frame, and the homography matrices between each frame and frame 250 are calculated from the eight registered wheel centers. The vanishing line is obtained in two ways: using two sets of parallel lines, yielding (−0.0098, 1.0000, 259.0464)ᵀ, and by clustering the eigenvectors, yielding (−0.0056, 1.0000, 256.2997)ᵀ. The two results are close to each other. Using the threshold d₀ = 2, 18 of the 500 frames were selected and used for metrology.

The reference frame 250 is shown in Figure 7.15 (a). Figure 7.15 (b) displays the feet of the eigenvector lines; the red square is the foot of the vanishing line of frame 250 detected using parallel lines, and the blue circle is the cluster center of the eigenvectors. Frame 330 is shown in Fig. 7.15 (c), where no parallel line sets can be accurately detected. The endpoints of the parallel-moved line segments are shown in Figure 7.15 (d), together with the fitted concentric ellipses in corresponding colors. The comparison between the metrology results and the ground-truth values is summarized in Table 7.2. The error is around 3%, somewhat larger than the result from a stationary camera, but still small.
Figure 7.15: Wheelbase metrology using a planar motion camera. (a) Frame 250. (b) Feet of the eigenvector lines. (c) Frame 330. (d) Fitted concentric ellipses.

Table 7.2: Wheelbase metrology results for a camera in planar motion

Vehicle Color   Ground Truth (cm)   Metrology result (ratio)   Error (%)
Silver          41.4                1.000 ± 0.035              (reference)
Black           21.5                0.540 ± 0.034              3.83
Yellow          19.2                0.417 ± 0.020              3.11
Blue            17.8                0.464 ± 0.021              0.05

Chapter 8
Summary and Future Research Directions

In this dissertation, we have advanced current calibration and metrology research. The research contributions of this dissertation include a novel point of view on rectification problems, solutions to different types of constraints, the discovery of invariants, a set of video calibration methods, a video metrology algorithm, a jitter removal algorithm, and an automated wheelbase metrology system.

8.1 Summary of Contributions

First, we have presented a new method for calibration using planar patterns including a circle. We have provided a general solution structure for an intermediate set of partial rectification problems, so that the calibration problems can be solved constraint by constraint rather than case by case. We have studied two sets of constraints, based on distance and on angle. In the first set, we have solved the partial rectification problem given the distance from a point to a circle, whose solution is an ellipse tangent to the intersections between the projected circle and the polar of the point. We have proved that the solution curve for the equal-distance constraint is a straight line. In the second set, we specifically deal with the constraint of a pair of orthogonal lines; the solution curve passes through the two poles and the intersections between the two lines and the projected circle. We have obtained two byproducts by studying
We have obtained two byproducts by studying 145 the patterns: (1) we have developed a new method to detect the center of concentric circles, and (2) developed a new method for image registration. Second, we have presented several new video calibration methods. We have proposed a two-step method to estimate the vanishing line using a line segment moving on the reference plane. We have also proposed a method to rectify the plane using a moving ellipse, which also consists of two steps: (1) a?ne rectiflcation by equalizing the sizes of the ellipses, and (2) metric rectiflcation by equalizing the shapes of the ellipses. This algorithm has been generalized to any planar object moving on the reference plane. We have developed another algorithm to detect the center of concentric circles that outperforms existing algorithms. Finally, we have presented a new algorithm for video metrology. Rather than directly extending an image-based metrology, our algorithm uses the property of video and achieves higher accuracy. We have developed a wheelbase measurement system for surveillance video, and the error in measurement is within 2%. Thus, this system can determine the vehicle categories. 8.2 Limitations and Future Work Several open issues and intriguing research questions were identifled while con- ducting this research and performing the experiments. These issues and questions point to the following future research directions. 1. Automatic derivation of the relationship between lines, points and circles. In the experiments on calibration using circles, the lines and points forming the 146 constraints are identifled manually. There are few algorithms to automatically determine (1) the intersecting angles of two lines (parallel, perpendicular or not), and(2)therelationofpointsandthecircle, whichprohibitsouralgorithm to process a single view image automatically. In addition, detecting ellipse is not trivial. A robust and accurate ellipse detector has to be developed. 2. 
Solutions to more types of constraints. Several commonly seen constraints, including the equal-distance and right-angle constraints, have been solved. However, to build a system applicable to most scenes, we need to include as many constraints as possible. For example, 30-, 45-, 60- and 120-degree angle constraints appear frequently, and the corresponding solutions have not been presented in existing approaches.

3. Further study of video calibration using planar objects. We have presented an interesting and practical video calibration technique using moving planar objects. However, some challenges remain. Theoretically, the recursive method has not been proved to converge to the solution. Practically, the observed shapes are generally only approximately planar; the non-planar effect should be analyzed in order to build a robust calibration system.

4. Investigation of video metrology using multiple cameras. Our metrology algorithm can be extended to multiple cameras. New problems, such as camera placement and calibration, and sensor cooperation and fusion, arise, which makes this a good research topic.

Bibliography

[1] F. Abad, E. Camahort, and R. Vivó. Camera calibration using two concentric circles. In Image Analysis and Recognition: International Conference (1), pages 688–696, September 2004.

[2] M. Agrawal and L. S. Davis. Camera calibration using spheres: A semi-definite programming approach. In Proceedings of the Ninth IEEE International Conference on Computer Vision, pages 782–791, October 2003.

[3] B. Bose and E. Grimson. Ground plane rectification by tracking moving objects. In Proceedings of the Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, pages 94–101, October 2003.

[4] J. B. Burns, R. S. Weiss, and E. M. Riseman. View variation of point-set and line-segment features. IEEE Trans. Pattern Analysis and Machine Intelligence, 15(1):51–68, 1993.

[5] X. Cao and H. Foroosh.
Simple calibration without metric information using an isosceles trapezoid. In Seventeenth International Conference on Pattern Recognition (1), pages 104–107, August 2004.

[6] X. Cao and M. Shah. Camera calibration and light source estimation from images with shadows. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2), pages 918–923, June 2005.

[7] B. Caprile and V. Torre. Using vanishing points for camera calibration. International Journal of Computer Vision, 4(2):127–140, 1990.

[8] Q. Chen, H. Wu, and T. Wada. Camera calibration with two arbitrary coplanar circles. In Proceedings of the Eighth European Conference on Computer Vision (3), pages 521–532, May 2004.

[9] Y. Chen and H. H.-S. Ip. Planar metric rectification by algebraically estimating the image of the absolute conic. In Seventeenth International Conference on Pattern Recognition (4), pages 88–91, August 2004.

[10] A. Criminisi, I. D. Reid, and A. Zisserman. Single view metrology. In Proceedings of the Seventh IEEE International Conference on Computer Vision, pages 434–441, September 1999.

[11] A. Criminisi, I. D. Reid, and A. Zisserman. Single view metrology. International Journal of Computer Vision, 40(2):123–148, 2000.

[12] A. Criminisi, A. Zisserman, L. J. V. Gool, S. K. Bramble, and D. Compton. New approach to obtain height measurements from video. SPIE: Investigation and Forensic Science Technologies, 3576(1):227–238, 1999.

[13] N. Daucher, M. Dhome, and J. T. Lapresté. Camera calibration from spheres images. In Proceedings of the Third European Conference on Computer Vision (1), pages 449–454, May 1994.

[14] H. Dorrie. 100 Great Problems of Elementary Mathematics. Dover Publications, 1965.

[15] O. Faugeras. Three-Dimensional Computer Vision: A Geometric Viewpoint. MIT Press, Cambridge, MA, USA, 1993.

[16] V. Fremont and R. Chellali. Direct camera calibration using two concentric circles from a single view.
In International Conference on Artificial Reality and Telexistence, pages 93–98, December 2002.

[17] F. Guo. Plane rectification using a circle and points from a single view. In Eighteenth International Conference on Pattern Recognition (2), pages 9–12, August 2006.

[18] P. Gurdjos, A. Crouzil, and R. Payrissat. Another way of looking at plane-based calibration: The centre circle constraint. In Proceedings of the 7th European Conference on Computer Vision (4), pages 252–266, May 2002.

[19] P. Gurdjos, J.-S. Kim, and I.-S. Kweon. Euclidean structures from confocal conics: Theory and application to camera calibration. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 1214–1221, June 2006.

[20] P. Gurdjos, P. Sturm, and Y. Wu. Euclidean structure from n ≥ 2 parallel circles: Theory and algorithms. In Proceedings of the 9th European Conference on Computer Vision (1), pages 238–252, May 2006.

[21] C. Harris and M. Stephens. A combined corner and edge detector. In Alvey Vision Conference, pages 147–151, June 1988.

[22] R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, second edition, 2004.

[23] G. Jiang and L. Quan. Detection of concentric circles for camera calibration. In Proceedings of the 10th IEEE International Conference on Computer Vision, pages 333–340, October 2005.

[24] G. Jiang, H.-T. Tsui, L. Quan, and A. Zisserman. Geometry of single axis motions using conic fitting. IEEE Trans. Pattern Analysis and Machine Intelligence, 25(10):1343–1348, 2003.

[25] J.-S. Kim, P. Gurdjos, and I.-S. Kweon. Geometric and algebraic constraints of projected concentric circles and their applications to camera calibration. IEEE Trans. Pattern Analysis and Machine Intelligence, 27(4):637–642, 2005.

[26] J.-S. Kim, H.-W. Kim, and I.-S. Kweon. A camera calibration method using concentric circles for vision applications.
In Proceedings of the 5th Asian Conference on Computer Vision, pages 515–520, January 2002.

[27] J. Li and R. Chellappa. A factorization method for structure from planar motion. In IEEE Workshop on Motion and Video Computing, pages 154–159, January 2005.

[28] B. Liang and N. Pears. Visual navigation using planar homographies. In International Conference on Robotics and Automation, pages 205–210, May 2002.

[29] D. Liebowitz and A. Zisserman. Metric rectification for perspective images of planes. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 482–488, June 1998.

[30] M. I. A. Lourakis, S. T. Halkidis, and S. C. Orphanoudakis. Matching disparate views of planar surfaces using projective invariants. Image and Vision Computing, 18(9):673–683, 2000.

[31] D. G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110, 2004.

[32] B. D. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In International Joint Conference on Artificial Intelligence, pages 674–679, August 1981.

[33] F. Lv, T. Zhao, and R. Nevatia. Camera calibration from video of a walking human. IEEE Trans. Pattern Analysis and Machine Intelligence, 28(9):1513–1518, 2006.

[34] X. Meng, H. Li, and Z. Hu. A new easy camera calibration technique based on circular points. In Proceedings of the British Machine Vision Conference, September 2000.

[35] T. Moons, L. J. V. Gool, M. V. Diest, and E. J. Pauwels. Affine reconstruction from perspective image pairs obtained by a translating camera. In Applications of Invariance in Computer Vision, pages 297–316, October 1993.

[36] J. L. Mundy and A. Zisserman, editors. Geometric Invariance in Computer Vision. MIT Press, Cambridge, MA, USA, 1992.

[37] V. Parameswaran and R. Chellappa. Quasi-invariants for human action representation and recognition.
In Sixteenth International Conference on Pattern Recognition (1), pages 307–310, August 2002.

[38] V. Parameswaran and R. Chellappa. View invariants for human action recognition. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2), pages 613–619, June 2003.

[39] K. H. Parshall and A. C. Rice. The evolution of an international mathematical research community, 1800-1945: An overview and an agenda. Providence, RI: American Mathematical Society (AMS), History of Mathematics 23, pages 229–252, 2002.

[40] C. Rao, M. Shah, and T. Syeda-Mahmood. Invariance in motion analysis of videos. In Proceedings of the Eleventh ACM International Conference on Multimedia, November 2003.

[41] I. D. Reid and A. Zisserman. Goal-directed video metrology. In 4th European Conference on Computer Vision (2), pages 647–658, April 1996.

[42] J. Shi and C. Tomasi. Good features to track. In IEEE Conference on Computer Vision and Pattern Recognition, pages 593–600, June 1994.

[43] C. Stauffer, K. Tieu, and L. Lee. Robust automated planar normalization of tracking data. In Proceedings of the Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, October 2003.

[44] P. Sturm and S. Maybank. On plane-based camera calibration: A general algorithm, singularities, applications. In IEEE Conference on Computer Vision and Pattern Recognition, pages 432–437, June 1999.

[45] C. Tomasi and T. Kanade. Detection and tracking of point features. Technical Report CMU-CS-91-132, Carnegie Mellon University, April 1991.

[46] B. Triggs. Autocalibration from planar scenes. In 5th European Conference on Computer Vision (1), pages 89–108, June 1998.

[47] G. Wang, F. Wu, and Z. Hu. Novel approach to circular-points-based camera calibration. In Proceedings of the Second International Conference on Image and Graphics, pages 830–837, July 2002.

[48] G. Wang, Y. Wu, and Z. Hu. A novel approach for single view based plane metrology.
In Sixteenth International Conference on Pattern Recognition (2), pages 556–559, August 2002.

[49] I. Weiss and M. Ray. Model-based recognition of 3D objects from single images. IEEE Trans. Pattern Analysis and Machine Intelligence, 23(2):116–128, 2001.

[50] C. R. Wren, A. Azarbayejani, T. Darrell, and A. Pentland. Pfinder: Real-time tracking of the human body. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):780–785, 1997.

[51] Y. Wu, H. Zhu, Z. Hu, and F. Wu. Camera calibration from the quasi-affine invariance of two parallel circles. In Proceedings of the 8th European Conference on Computer Vision (1), pages 190–202, May 2004.

[52] C. Yang, F. Sun, and Z. Hu. Planar conic based camera calibration. In Sixteenth International Conference on Pattern Recognition, pages 1555–1558, September 2000.

[53] X. Ying and H. Zha. Linear approaches to camera calibration from sphere images or active intrinsic calibration using vanishing points. In Proceedings of the Tenth IEEE International Conference on Computer Vision, pages 596–603, October 2005.

[54] H. Zhang, G. Zhang, and K.-Y. K. Wong. Camera calibration with spheres: Linear approaches. In Proceedings of the International Conference on Image Processing (2), pages 1150–1153, September 2005.

[55] Z. Zhang. A flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11):1330–1334, 2000.

[56] Z. Zhang. Camera calibration with one-dimensional objects. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(7):892–899, 2004.

[57] Y. Zhu, L. Seneviratne, and S. Earles. Three dimensional object recognition using invariants. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 354–359, August 1995.