ABSTRACT

Title of dissertation: ALGORITHMS FOR SOLVING LINEAR AND POLYNOMIAL SYSTEMS OF EQUATIONS OVER FINITE FIELDS WITH APPLICATIONS TO CRYPTANALYSIS

Gregory Bard, Doctor of Philosophy, 2007

Dissertation directed by: Professor Lawrence C. Washington, Department of Mathematics

This dissertation contains algorithms for solving linear and polynomial systems of equations over GF(2). The objective is to provide fast and exact tools for algebraic cryptanalysis and other applications. Accordingly, it is divided into two parts.

The first part deals with polynomial systems. Chapter 2 contains a successful cryptanalysis of Keeloq, the block cipher used in nearly all luxury automobiles. The attack is more than 16,000 times faster than brute force, but queries 0.6 · 2^32 plaintexts. The polynomial systems of equations arising from that cryptanalysis were solved via SAT-solvers. Therefore, Chapter 3 introduces a new method of solving polynomial systems of equations by converting them into CNF-SAT problems and using a SAT-solver. Finally, Chapter 4 contains a discussion of how SAT-solvers work internally.

The second part deals with linear systems over GF(2), and other small fields (and rings). These occur in cryptanalysis when using the XL algorithm, which converts polynomial systems into larger linear systems. We introduce a new complexity model and data structures for GF(2)-matrix operations. This is discussed in Appendix B but applies to all of Part II. Chapter 5 contains an analysis of "the Method of Four Russians" for multiplication and a variant for matrix inversion, which is a factor of log n faster than Gaussian Elimination, and can be combined with Strassen-like algorithms. Chapter 6 contains an algorithm for accelerating matrix multiplication over small finite fields. It is feasible, but the memory cost is so high that it is mostly of theoretical interest. Appendix A contains some discussion of GF(2)-linear algebra and how it differs from linear algebra in R and C.
Appendix C discusses algorithms faster than Strassen's algorithm, and contains proofs that matrix multiplication, matrix squaring, triangular matrix inversion, LUP-factorization, general matrix inversion, and the taking of determinants are equicomplex. These proofs are already known, but are here gathered into one place in the same notation.

ALGORITHMS FOR SOLVING LINEAR AND POLYNOMIAL SYSTEMS OF EQUATIONS OVER FINITE FIELDS, WITH APPLICATIONS TO CRYPTANALYSIS

by Gregory V. Bard

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2007

Advisory Committee: Professor Lawrence C. Washington, Chair/Advisor; Professor William Adams; Professor Steven Tretter; Professor William Gasarch; Assistant Professor Jonathan Katz.

© Copyright by Gregory V. Bard 2007

Preface

Pigmaei gigantum humeris impositi plusquam ipsi gigantes vident.¹ (Attributed to Bernard of Chartres, ????)

One of the many reasons the subject of Mathematics is so beautiful is the continuing process of building one theorem, algorithm, or conjecture upon another. This can be compared to the construction of a cathedral, where each stone gets laid upon those that came before it with great care. As each mason lays his stone he can only be sure to put it in its proper place, and see that it rests plumb, level, and square with its neighbors. From that vantage point, it is impossible to tell what role it will play in the final edifice, or even if it will be visible. Could George Boole have imagined the digital computer?

Another interesting point is that the cathedrals of Europe almost universally took longer than one lifetime to build. Therefore those that laid the foundations had no chance at all of seeing the completed work. This is true in mathematics, also.
Fermat's Last Theorem, the Kepler Conjecture, the Insolubility of the Quintic, the Doubling of the Cube, and other well-known problems were only solved several centuries after they were proposed. And thus scholarly publication is a great bridge, which provides communication of ideas (at least in one direction) across the abyss of death.

One example is to contemplate the conic sections of Apollonius of Perga (circa ??? bc). Can one imagine how few of the ordinary or extraordinary persons of Western Europe in perhaps the seventh century ad would have known of them? Yet, ?? centuries after their introduction, they would be found, by Kepler, to describe the motions of astronomical bodies. In the late twentieth century, conic sections were studied, at least to some degree, by all high school graduates in the United States of America, and certainly in other countries as well.

Such is the nature of our business. Some papers might be read by only ten persons in a century. All we can do is continue to work, and hope that the knowledge we create is used for good and not for ill.

¹ Dwarves, standing on the shoulders of giants, can see further than giants.

An old man, going a lone highway,
Came at the evening, cold and gray,
To a chasm, vast and deep and wide,
Through which was flowing a sullen tide.
The old man crossed in the twilight dim;
The sullen stream had no fears for him;
But he turned when safe on the other side
And built a bridge to span the tide.

"Old man," said a fellow pilgrim near,
"You are wasting strength with building here;
Your journey will end with the ending day;
You never again must pass this way;
You have crossed the chasm, deep and wide;
Why build you the bridge at the eventide?"

The builder lifted his old gray head:
"Good friend, in the path I have come," he said,
"There followeth after me today
A youth whose feet must pass this way.
This chasm that has been naught to me
To that fair-haired youth may a pitfall be.
He, too, must cross in the twilight dim;
Good friend, I build this bridge for him."

by Will Allen Dromgoole

Foreword

The ignoraunte multitude doeth, but as it was euer wonte, enuie that knoweledge, whiche thei can not attaine, and wishe all men ignoraunt, like unto themself... Yea, the pointe in Geometrie, and the unitie in Arithmetike, though bothe be undiuisible, doe make greater woorkes, & increase greater multitudes, then the brutishe bande of ignoraunce is hable to withstande... But yet one commoditie moare... I can not omitte. That is the filyng, sharpenyng, and quickenyng of the witte, that by practice of Arithmetike doeth insue. It teacheth menne and accustometh them, so certainly to remember thynges paste: So circumspectly to consider thynges presente: And so prouidently to forsee thynges that followe: that it maie truelie bee called the File of witte. (Robert Recorde, ????, quoted from [vzGG03, Ch. 17])

Dedication

With pleasure, I dedicate this dissertation to my parents. To my father, who taught me much in mathematics, most especially calculus, years earlier than I would have been permitted to see it. And to my mother, who has encouraged me in every endeavor.

Acknowledgments

I am deeply indebted to my advisor, Professor Lawrence C. Washington, who has read and re-read many versions of this document, and every other research paper I have written to date, provided me with countless hours of his time, and given me sound advice on matters technical, professional, and personal. He has had many students, and all I have spoken to are in agreement that he is a phenomenal advisor; it is humbling to know that we can never possibly equal him. We can only try to emulate him.

I have been very fortunate to work with Dr. Nicolas T. Courtois, currently Senior Lecturer at University College London. It suffices to say that the field of algebraic cryptanalysis has been revolutionized by his work.
He permitted me to study with him for four months in Paris, and later even to stay at his home in England. Much of my work is joint work with him.

My most steadfast supporter has been my best friend and paramour, Patrick Studdard. He has read every paper I have written to date, even this dissertation. He has listened to me drone on about topics like group theory or ring theory and has tolerated my workaholism. My life and my work would be much more empty without him.

My first autonomous teaching experience was at American University, a major stepping stone in my life. The Department of Computer Science, Audio Technology and Physics had faith in me and put their students in my hands. I am so very grateful for those opportunities. In particular I would like to thank Professor Michael Gray and Professor Angela Wu, who gave me much advice. The departmental secretary, Yana Shabaev, was very kind to me on numerous occasions and I am grateful for all her help. I also received valuable teaching advice from my future father-in-law, Emeritus Professor Albert Studdard, of the Philosophy Department at the University of North Carolina at Pembroke.

My admission to the Applied Mathematics and Scientific Computation (AMSC) Program at the University of Maryland was unusual. I had been working as a PhD student in Computer Engineering for several years when I decided to add a Masters in AMSC as a side task. Later, I switched, and made it the focus of my PhD. Several faculty members were important in helping me make that transition, including Professor Virgil D. Gligor, Professor Richard Schwartz, and Professor Jonathan Rosenberg. But without the program chair, Professor David Levermore, and my advisor, Professor Lawrence C. Washington, this would have been impossible. I am grateful that many rules were bent.

Several governmental agencies have contributed to my education.
In reverse chronological order:

- The National Science Foundation (NSF) VIGRE grant to the University of Maryland provided the Mathematics Department with funds for "Dissertation Completion Fellowships." These are exceedingly generous semester-long gifts, and I had the pleasure to receive two of them. I am certain this dissertation would look very different without that valuable support.

- I furthermore received travel funding and access to super-computing under the NSF grant for the Sage project [sag], Grant Number 0555776.

- The European Union established the ECRYPT organization to foster research among cryptographers. The ECRYPT foundation was kind enough to support me during my research visit in Paris. Much of the work of this dissertation was done while working there.

- The Department of the Navy, Naval Sea Systems Command, Carderock Station, supported me with a fascinating internship during the summer of 2003. They furthermore allowed me to continue some cryptologic research toward my graduate degrees while working there.

- The National Security Agency was my employer from May of 1998 until September of 2002. Not only did I have the chance to work alongside and learn from some of the most dedicated and professional engineers, computer scientists and mathematicians who have ever been assembled into one organization, but also I was given five semesters of financial support, plus several summer sessions, which was very generous, especially considering that two of those semesters and one summer were during wartime. Our nation owes an enormous debt to the skill, perseverance, self-discipline, and creativity of the employees of the National Security Agency. Several employees there were supporters of my educational hopes, but security regulations and agency traditions would forbid my mentioning them by name.

Professor Virgil D. Gligor, Professor Jonathan Katz, and Professor William R.
Franklin (of RPI) have all been my academic advisors at one time or another, and I thank them for their guidance. Each has taught me from their own view of computer security, and it is interesting how many points of view there really are. Our field is enriched by this diversity of scholarly background. Likewise, Dr. Harold Snider gave me important advice during my time at Oxford. My Oxford days were an enormous influence on my decision to switch into Applied Mathematics from Computer Engineering.

My alma mater, the Rensselaer Polytechnic Institute, has furnished me with many skills. I cannot imagine a better training ground. My engineering degree was sufficiently mathematically rigorous that I succeeded in a graduate program in Applied Mathematics without having a Bachelor's degree in Mathematics; in fact, only two courses, Math 403 (Abstract Algebra) and Math 404 (Field Theory), were my transition. Yet I was practical enough to be hired into, and promoted several times at, the NSA, from a Grade 6 to a Grade 11 (out of a maximum of 15), including several commendations and a Director's Star. I am particularly indebted to the faculty there, and its founder, Stephen Van Rensselaer, former governor of New York and (twice) past Grand Master of Free & Accepted Masons of the State of New York. It was at Rensselaer that I learned of and was initiated into Freemasonry, and the Institute's mission of educating "the sons of farmers and mechanics" with "knowledge and thoroughness" echoes the teachings of Freemasonry. My two greatest friends, chess partners, and Brother Freemasons, Paul Bello and Abram Claycomb, were initiated there as well, and I learned much in discussions with them.

I learned of cryptography at a very young age, perhaps eight years old or so. Dr.
Haig Kafafian, also a Freemason and a family friend, took me aside after Church for many years and taught me matrices, classical cryptography, and eventually allowed me to be an intern at his software development project to make devices to aid the handicapped. There I met Igor Kalatian, who taught me the C language. I am grateful to them both.

Table of Contents

List of Tables
List of Figures
List of Abbreviations

1 Summary

I Polynomial Systems

2 An Extended Example: The Block-Cipher Keeloq
  2.1 Special Acknowledgment of Joint Work
  2.2 Notational Conventions and Terminology
  2.3 The Formulation of Keeloq
    2.3.1 What is Algebraic Cryptanalysis?
    2.3.2 The CSP Model
    2.3.3 The Keeloq Specification
    2.3.4 Modeling the Non-linear Function
    2.3.5 I/O Relations and the NLF
    2.3.6 Disposing of the Secret Key Shift-Register
    2.3.7 Describing the Plaintext Shift-Register
    2.3.8 The Polynomial System of Equations
    2.3.9 Variable and Equation Count
    2.3.10 Dropping the Degree to Quadratic
    2.3.11 Fixing or Guessing Bits in Advance
    2.3.12 The Failure of a Frontal Assault
  2.4 Our Attack
    2.4.1 A Particular Two Function Representation
    2.4.2 Acquiring an f^(8)_k-oracle
    2.4.3 The Consequences of Fixed Points
    2.4.4 How to Find Fixed Points
    2.4.5 How Far Must We Search?
    2.4.6 Fraction of Plainspace Required
    2.4.7 Comparison to Brute Force
    2.4.8 Some Lemmas
    2.4.9 Cycle Lengths in a Random Permutation
  2.5 Summary
  2.6 A Note about Keeloq's Utilization

3 Converting MQ to CNF-SAT
  3.1 Summary
  3.2 Introduction
    3.2.1 Application to Cryptanalysis
  3.3 Notation and Definitions
  3.4 Converting MQ to SAT
    3.4.1 The Conversion
      3.4.1.1 Minor Technicality
    3.4.2 Measures of Efficiency
    3.4.3 Preprocessing
    3.4.4 Fixing Variables in Advance
      3.4.4.1 Parallelization
    3.4.5 SAT-Solver Used
      3.4.5.1 Note About Randomness
  3.5 Experimental Results
    3.5.1 The Source of the Equations
    3.5.2 The Log-Normal Distribution of Running Times
    3.5.3 The Optimal Cutting Number
    3.5.4 Comparison with MAGMA, Singular
  3.6 Previous Work
  3.7 Conclusions
  3.8 Cubic Systems
  3.9 NP-Completeness of MQ
  3.10 Final Note on Sparse MQ and F5

4 How do SAT-Solvers Operate?
  4.1 The Problem Itself
    4.1.1 Conjunctive Normal Form
  4.2 Chaff and its Descendants
    4.2.1 Variable Management
    4.2.2 The Method of Watched Literals
    4.2.3 How to Actually Make This Happen
    4.2.4 Back-Tracking
  4.3 Enhancements to Chaff
    4.3.1 Learning
    4.3.2 The Alarm Clock
    4.3.3 The Third Finger
  4.4 Walk-SAT
  4.5 Economic Motivations
    4.5.1 An Interesting Coincidence

II Linear Systems

5 The Method of Four Russians
  5.1 Origins and Previous Work
    5.1.1 Strassen's Algorithm
  5.2 Rapid Subspace Enumeration
  5.3 The Four Russians Matrix Multiplication Algorithm
    5.3.1 Role of the Gray Code
    5.3.2 Transposing the Matrix Product
    5.3.3 Improvements
    5.3.4 A Quick Computation
    5.3.5 M4RM Experiments Performed by SAGE Staff
  5.4 The Four Russians Matrix Inversion Algorithm
    5.4.1 Stage 1
    5.4.2 Stage 2
    5.4.3 Stage 3
    5.4.4 A Curious Note on Stage 1 of M4RI
    5.4.5 Triangulation or Inversion?
  5.5 Experimental and Numerical Results
  5.6 Exact Analysis of Complexity
    5.6.1 An Alternative Computation
    5.6.2 Full Elimination, not Triangular
    5.6.3 The Rank of 3k Rows, or Why k + ε is not Enough
    5.6.4 Using Bulk Logical Operations
    5.6.5 M4RI Experiments Performed by SAGE Staff
      5.6.5.1 Determination of k
      5.6.5.2 Comparison to Magma
      5.6.5.3 The Transpose Experiment
  5.7 Pairing With Strassen's Algorithm for Matrix Multiplication
  5.8 The Unsuitability of Strassen's Algorithm for Inversion
    5.8.1 Bunch and Hopcroft's Solution
    5.8.2 Ibarra, Moran, and Hui's Solution

6 An Impractical Method of Accelerating Matrix Operations in Rings of Finite Size
  6.1 Introduction
    6.1.1 Feasibility
  6.2 The Algorithm over a Finite Ring
    6.2.1 Summary
    6.2.2 Complexity
    6.2.3 Taking Advantage of z ≠ 1
    6.2.4 The Transpose of Matrix Multiplication
  6.3 Choosing Values of b
    6.3.1 The "Conservative" Algorithm
    6.3.2 The "Liberal" Algorithm
    6.3.3 Comparison
  6.4 Over Finite Fields
    6.4.1 Complexity Penalty
    6.4.2 Memory Requirements
    6.4.3 Time Requirements
    6.4.4 The Conservative Algorithm
    6.4.5 The Liberal Algorithm
  6.5 Very Small Finite Fields
  6.6 Previous Work
  6.7 Notes
    6.7.1 Ring Additions
    6.7.2 On the Ceiling Symbol

A Some Basic Facts about Linear Algebra over GF(2)
  A.1 Sources
  A.2 Boolean Matrices vs GF(2) Matrices
  A.3 Why is GF(2) Different?
    A.3.1 There are Self-Orthogonal Vectors
    A.3.2 Something that Fails
    A.3.3 The Probability a Random Square Matrix is Singular or Invertible

B A Model for the Complexity of GF(2)-Operations
  B.1 The Cost Model
    B.1.1 Is the Model Trivial?
    B.1.2 Counting Field Operations
    B.1.3 Success and Failure
  B.2 Notational Conventions
  B.3 To Invert or to Solve?
  B.4 Data Structure Choices
    B.4.1 Dense Form: An Array with Swaps
    B.4.2 Permutation Matrices
  B.5 Analysis of Classical Techniques with our Model
    B.5.1 Naïve Matrix Multiplication
    B.5.2 Matrix Addition
    B.5.3 Dense Gaussian Elimination
    B.5.4 Strassen's Algorithm for Matrix Multiplication

C On the Exponent of Certain Matrix Operations
  C.1 Very Low Exponents
  C.2 The Equicomplexity Theorems
    C.2.1 Starting Point
    C.2.2 Proofs
    C.2.3 A Common Misunderstanding

Bibliography

List of Tables

2.1 Fixed points of random permutations and their 8th powers
3.1 CNF Expression Difficulty Measures for Quadratic Systems, by Cutting Number
3.2 Running Time Statistics in Seconds
3.3 Speeds of Comparison Trials between Magma, Singular and ANFtoCNF-MiniSAT
3.4 CNF Expression Difficulty Measures for Cubic Systems, by Cutting Number
5.1 M4RM Running Times versus Magma
5.2 Confirmation that k = 0.75 log2 n is not a good idea
5.3 Probabilities of a Fair-Coin-Generated n x n Matrix over GF(2) Having a Given Nullity
5.4 Experiment 1: Optimal Choices of k, and Running Time in Seconds
5.5 Running Times, in msec, Optimization Level 0
5.6 Percentage Error for Offset of k, From Experiment 1
5.7 Results of Experiment 3: Running Times, Fixed k = 8
5.8 Experiment 2: Running Times in Seconds under Different Optimizations, k = 8
5.9 Trials between M4RI and Gaussian Elimination (msec)
5.10 The Ineffectiveness of the Transpose Trick
5.11 Optimization Level 3, Flexible k
6.1 The Speed-Up and Extra Memory Usage Given by the Four Choices
6.2 Memory Required (in bits) and Speed-Up Factor (w/ Strassen) for Various Rings
B.1 Algorithms and Performance, for m x n Matrices

List of Figures

2.1 The Keeloq Circuit Diagram
3.1 The Distribution of Running Times, Experiment 1
3.2 The Distribution of the Logarithm of Running Times, Experiment 1
5.1 A Plot of M4RI's System Solving in Sage vs Magma
C.1 The Relationship of the Equicomplexity Proofs

List of Algorithms

1 Method of Four Russians, for Matrix Multiplication
2 Method of Four Russians, for Inversion
3 The Finite-Ring Algorithm
4 Fast M_b(R) multiplications for R a finite field but not GF(2)
5 Gram-Schmidt, over a field of characteristic zero
6 To compose two row swap arrays r and s, into t
7 To invert a row swap array r, into s
8 Naïve Matrix Multiplication
9 Dense Gaussian Elimination, for Inversion
10 Dense Gaussian Elimination, for Triangularization
11 Strassen's Algorithm for Matrix Multiplication

List of Abbreviations

f(x) = o(g(x))      lim_{x→∞} f(x)/g(x) = 0
f(x) = O(g(x))      there exist c, x0 such that for all x > x0, f(x) ≤ c·g(x)
f(x) = Ω(g(x))      there exist c, x0 such that for all x > x0, f(x) ≥ c·g(x)
f(x) = ω(g(x))      lim_{x→∞} g(x)/f(x) = 0
f(x) = Θ(g(x))      f(x) = O(g(x)) while simultaneously f(x) = Ω(g(x))
f(x) ~ g(x)         lim_{x→∞} f(x)/g(x) = 1

Cnf-Sat   The conjunctive normal form satisfiability problem
Des       The Data Encryption Standard
GF(q)     The finite field of size q
Hfe       Hidden Field Equations (a potential trap-door OWF)
M4rm      The Method of Four Russians for matrix multiplication
M4ri      The Method of Four Russians for matrix inversion
Mc        The Multivariate Cubic problem
Mq        The Multivariate Quadratic problem
Owf       One-way Function
Quad      The stream cipher defined in [BGP05]
Rref      Reduced Row Echelon Form
Sage      Software for Algebra and Geometry Experimentation
Sat       The satisfiability problem
Uutf      Unit Upper Triangular Form

Chapter 1

Summary

Generally speaking, it has been the author's objective to generate efficient and reliable computational tools to assist algebraic cryptanalysts.
In practice, this is a question of developing tools for systems of equations, which may be dense or sparse, linear or polynomial, over GF(2) or one of its extension fields. In addition, the author has explored the cryptanalysis of block ciphers and stream ciphers, targeting the block ciphers Keeloq and the Data Encryption Standard (Des), and the stream cipher Quad. Only Keeloq is presented here, because the work on the last two is still underway. The work on Des has seen preliminary publication in [CB06].

The dissertation is divided into two parts. The first deals with polynomial systems and actual cryptanalysis. The second deals with linear systems. Linear systems are important to cryptanalysis because of the XL algorithm [CSPK00], a standard method of solving overdefined polynomial systems of equations over finite fields by converting them into linear systems.

Chapter 2 is the most practical, and contains a detailed study of the block cipher Keeloq and presents a successful algebraic cryptanalysis attack. The cipher Keeloq is used in the key-less entry systems of nearly all luxury automobiles. Our attack is 2^14.77 times faster than brute force, but requires 0.6 · 2^32 plaintexts.

Chapter 3 has the largest potential future impact, and deals not with linear systems but with polynomial systems. Also, it deals not only with dense systems (as Part II does), but with sparse systems also. Since solving a quadratic system of equations is NP-hard, solving the Cnf-Sat problem is NP-hard, and all NP-Complete problems are polynomially reducible to each other, it makes sense to look for a method of using one to solve the other. This chapter shows how to convert quadratic systems of equations into Cnf-Sat problems, and that running off-the-shelf Sat-solvers on the result is an efficient method of solving this difficult problem.

Chapter 4 describes in general terms how Sat-solvers work. This tool is often viewed as a black box, which is unfortunate.
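The flavor of Chapter 3's conversion can be shown with a toy sketch (hypothetical Python, helper names mine). Each monomial receives a dummy variable via a standard AND encoding, and the GF(2) sum of monomials becomes an XOR constraint on those dummies. Here the XOR is expanded directly, which costs 2^(n-1) clauses for an n-term sum; that blow-up is exactly why the chapter "cuts" long sums into short pieces (Section 3.5.3, "The Optimal Cutting Number"):

```python
from itertools import product

def and_clauses(out, a, b):
    # CNF for out <-> (a AND b): the dummy-variable step, 3 clauses.
    return [[-a, -b, out], [a, -out], [b, -out]]

def xor_clauses(lits, parity):
    # CNF for lits[0] XOR ... XOR lits[-1] = parity (0 or 1).
    # Direct expansion: forbid every assignment of the wrong parity.
    clauses = []
    for bits in product([0, 1], repeat=len(lits)):
        if sum(bits) % 2 != parity:
            clauses.append([l if b == 0 else -l for l, b in zip(lits, bits)])
    return clauses

def satisfies(clauses, assign):
    # assign maps variable number -> 0/1; a clause needs one true literal.
    return all(any(assign[abs(l)] == (1 if l > 0 else 0) for l in c)
               for c in clauses)

# Encode the GF(2) equation x1*x2 + x3 = 1.
# Variables 1..3 are x1..x3; variable 4 is the dummy t for the monomial x1*x2.
cnf = and_clauses(4, 1, 2) + xor_clauses([4, 3], 1)
```

A real instance would hand `cnf` to a Sat-solver in DIMACS form; the brute-force `satisfies` check above is only for verifying the toy encoding.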
Chapter 4 contains no novel work, except that the author knows of no other exposition of how these tools operate, either for experts or for a general audience.

The second part begins with Chapter 5, and contains the Method of Four Russians, an algorithm published in the 1970s, but mostly forgotten since, for calculating a step of the transitive closure of a digraph, and thus also for squaring boolean matrices. Later it was adapted to matrix multiplication. This chapter provides an analysis of that algorithm, but also presents a related algorithm for matrix inversion that was anecdotally known among some French cryptanalysts. That inversion algorithm was not frequently used, however, because it was unclear how to eliminate the probability of abortion, how to handle memory management, and how to optimize the algorithm. The author's changes have made the probability of abortion negligible, and the algorithm has been implemented so that it outperforms Magma [mag] in some cases. The software tool Sage [sag], an open-source competitor to Magma, has adopted the author's software library (built on the Method of Four Russians) for its dense GF(2)-matrix operations, and this code is now deployed.

Chapter 6 has an algorithm of theoretical interest, but one which may find some practical application as well. This algorithm is for finite-ring matrix multiplication, with special attention and advantage to the finite-field case. The algorithm takes any "baseline" matrix multiplication algorithm which works over general rings, or over a class of rings that includes finite rings, and produces a faster version tailored to a specific finite ring. However, the memory required is enormous. Nonetheless, it is feasible for certain examples. The algorithm is based on the work of Atkinson and Santoro [AS88], but introduces many more details, optimizations, techniques, and detailed analysis. This chapter also modifies Atkinson and Santoro's complexity calculations.
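The multiplication variant of the Method of Four Russians (Chapter 5) can be sketched in a few lines. The following is a hypothetical Python toy (rows stored as integers, names mine), showing the two ingredients the chapter analyzes: a precomputed table of all 2^k GF(2) combinations of k rows of B, built at one XOR per entry by walking a Gray code (Section 5.3.1), then one table lookup per row per k-column chunk of A:

```python
def m4rm(A, B, n, k=4):
    # A, B: n x n GF(2) matrices; row i is an n-bit int, bit j = column j.
    C = [0] * n
    for start in range(0, n, k):
        w = min(k, n - start)
        # Table of all 2^w combinations of rows start..start+w-1 of B.
        # Gray-code order: consecutive indices differ in one bit, so each
        # table entry costs a single row XOR.
        T = [0] * (1 << w)
        prev = 0
        for i in range(1, 1 << w):
            gray = i ^ (i >> 1)
            j = (gray ^ prev).bit_length() - 1  # the bit that flipped
            T[gray] = T[prev] ^ B[start + j]
            prev = gray
        for r in range(n):
            idx = (A[r] >> start) & ((1 << w) - 1)
            C[r] ^= T[idx]  # add the precomputed combination of B's rows
    return C

def naive_mult(A, B, n):
    # Schoolbook comparison: row r of C is the XOR of the rows m of B
    # for which bit m of row r of A is set.
    C = [0] * n
    for r in range(n):
        for m in range(n):
            if (A[r] >> m) & 1:
                C[r] ^= B[m]
    return C
```

With k near log n, the table work amortizes away and each row costs n/k lookups instead of n row additions, which is the source of the log n speed-up over Gaussian-Elimination-style schoolbook work mentioned in the abstract.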
Three appendices round out Part II. Appendix A contains some discussion of GF(2)-linear algebra and how it differs from linear algebra in R and C. These facts are well-known. We introduce a new complexity model and data structures for GF(2)-matrix operations. This is discussed in Appendix B, but applies to all of Part II. Appendix C discusses algorithms faster than Strassen's algorithm, and contains proofs that matrix multiplication, matrix squaring, triangular matrix inversion, LUP-factorization, general matrix inversion, and the taking of determinants are equicomplex. These proofs are already known, but are here gathered into one place in the same notation. Finally, two software libraries were created during the dissertation work. The first was a very carefully coded GF(2)-matrix operations and linear algebra library, which included the Method of Four Russians. This library was adopted by Sage, and is written in traditional ANSI C. The second relates to Sat-solvers, and is a Java library for converting polynomials into Cnf-Sat problems. These two libraries are to be made publicly available as soon as the formalities of submitting the dissertation are completed. (The first is already available under the GPL, the Gnu Public License.)

Part I: Polynomial Systems

Chapter 2: An Extended Example: The Block-Cipher Keeloq

The purpose of this chapter is to supply a new, feasible, and economically relevant example of algebraic cryptanalysis. The block cipher "Keeloq"1 is used in the keyless-entry systems of most luxury automobiles. It has a secret key consisting of 64 bits, takes a plaintext of 32 bits, and outputs a ciphertext of 32 bits. The cipher consists of 528 rounds. Our attack is faster than brute force by a factor of around 2^14.77, as shown in Section 2.4.7 on page 27. A summary will be given in Section 2.5 on page 33.
This attack requires around 0.6·2^32 plaintexts, or 60% of the entire dictionary, as calculated in Section 2.4.5 on page 23. This chapter is written in the "chosen plaintext attack" model, in that we assume we can request the encryption of any plaintext and receive the corresponding ciphertext as encrypted under the secret key that we are trying to guess. This will be represented mathematically by oracle access to E_k(P) = C. However, it is easy to see that random plaintexts would permit the attack to proceed identically.

(Footnote 1: "Keeloq" is to be pronounced "key lock.")

2.1 Special Acknowledgment of Joint Work

The work described in this chapter was performed during a two-week visit to the Information Security Department of the University College of London's Ipswich campus. During that time the author worked with Nicolas T. Courtois. The content of this chapter is joint work. Nonetheless, the author has rewritten this text in his own words and notation, distinct from the joint paper [CB07]. Some proofs are found here which are not found in the paper.

2.2 Notational Conventions and Terminology

Evaluating the function f eight times will be denoted f^(8). For any ℓ-bit sequence, the least significant bit is numbered 0 and the most significant bit is numbered ℓ−1. If h(x) = x for some function h, then x is a fixed point of h. If h(h(x)) = x but h(x) ≠ x, then x is a "point of order 2" of h. In like manner, if h^(i)(x) = x but h^(j)(x) ≠ x for all j < i, then x is a point of order i of h.

... µ > 1 for the system to be expected to have at most one solution. As always with algebraic cryptanalysis, unless we make an assumption that is false, we always know the system of equations has at least one solution, because a message was sent. And thus we have a unique solution when µ > 1.
L_i = P_i                                                       for all i in [0, 31]

L_i = k_{i−32 mod 64} ⊕ L_{i−32} ⊕ L_{i−16} ⊕ L_{i−23} ⊕ L_{i−30} ⊕ L_{i−1}L_{i−12} ⊕ β_i
      ⊕ L_{i−6}L_{i−12} ⊕ L_{i−6}L_{i−30} ⊕ L_{i−12}L_{i−23} ⊕ L_{i−23}L_{i−30}
      ⊕ β_i L_{i−23} ⊕ β_i L_{i−12} ⊕ γ_i L_{i−23} ⊕ γ_i L_{i−12}   for all i in [32, 559]

β_i = L_{i−1}L_{i−6}                                            for all i in [32, 559]
γ_i = L_{i−1}L_{i−30}                                           for all i in [32, 559]
C_{i−528} = L_i                                                 for all i in [528, 559]

Even with µ = 2 this comes to 3168 equations and 3168 unknowns, well beyond the threshold of size for feasible polynomial system solving at the time this dissertation was written.

2.3.11 Fixing or Guessing Bits in Advance

Sometimes in Gröbner-basis algorithms or the XL algorithm, one fixes bits in advance [Cou04b, et al]. For example, in GF(2), there are only two possible values. Thus if one designates g particular variables, there are 2^g possible settings for them, but one needs to try only 2^g/2 on average if exactly one solution exists. For each guess, one rewrites the system of equations either by substituting the guessed values, or, if not, by adding additional equations of the form k_1 = 1, k_2 = 0, .... If the resulting Gröbner or XL running time is more than 2^g/2 times faster, this is a profitable move. In cryptanalysis, however, one generates a key, encrypts µ messages, and writes equations based on the plaintext-ciphertext pairs and various other constraints and facts. Therefore one knows the key. Instead of guessing all 2^g possible values, we simply guess correctly. However, two additional steps are then required. First, we must adjust the final running time by a factor of 2^g. Second, we must ensure that the system identifies a wrong guess as fast as, or faster than, it solves the system in the event of a correct guess. When the second condition does not hold, an alarm clock can be used. If, for example, the correctly guessed key bits cause a good solution after t_1 seconds, and after t_2 seconds the incorrectly guessed key bits cause the Sat-solver to declare the system unsatisfiable, then t_1 ≪ t_2 suggests the following technique: Guess some key bits.
After c·t_1 seconds, where 1 < c < t_2/t_1 is some constant, declare the system unsatisfiable and try another guess. Repeat until a good guess is found (i.e., the Sat-solver returns a valid key). We make no comment now on how to find the optimal c, but six standard deviations of t_1 is a good starting point.

2.3.12 The Failure of a Frontal Assault

First we tried a simple CSP. For various values of µ, we encrypted µ plaintext messages under one key, obtained the ciphertexts, and wrote equations as described already in Section 2.3.10 on page 15. We also used fewer rounds than 528, to see the impact of the number of rounds, as is standard. The experiments were an obvious failure, and so we began to look for a more efficient attack.

• With 64 rounds, µ = 4, and 10 key bits guessed, Singular required 70 seconds, and ElimLin required 10 seconds.

• With 64 rounds, and µ = 2 but the two plaintexts differing only in one bit (the least significant), Singular required 5 seconds, and ElimLin 20 seconds. MiniSat [ES05], using the techniques of Chapter 3, required 0.19 seconds. Note, it is natural that these attacks are faster, because many internal variables during the encryption will be identically valued for the first and second message.

• With 96 rounds, µ = 4, and 20 key bits guessed, MiniSat and the techniques of Chapter 3 required 0.3 seconds.

• With 128 rounds, and µ = 128, with a random initial plaintext and each subsequent plaintext an increment of the previous, and 30 key bits guessed, ElimLin required 3 hours.

• With 128 rounds, and µ = 2, with the plaintexts differing only in the least significant bit, and 30 key bits guessed, MiniSat required 2 hours.

These results on 128 rounds are slower than brute force. Therefore we did not try any larger number of rounds, or finish trying each possible combination of software and trial parameters. Needless to say, the 528-round versions did not terminate. Therefore, we need a new attack.
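The alarm-clock technique described above can be sketched as a simple search loop. In this illustrative Python, the solver interface and its running times are simulated stand-ins (measured in abstract time units), not a real Sat-solver:

```python
def alarm_clock_search(guesses, run_solver, t1, c=1.5):
    # Abort any solver run after c*t1 time units (the "alarm clock"), on the
    # assumption that a correct guess solves in about t1 while a wrong guess
    # would run far longer before being declared unsatisfiable.
    budget = c * t1
    total = 0.0
    for g in guesses:
        cost, key = run_solver(g)          # (time units consumed, key or None)
        if cost <= budget and key is not None:
            return key, total + cost       # good guess found
        total += min(cost, budget)         # aborted runs only cost the budget
    return None, total

# Simulated solver: guess 16 is "correct" (t1 = 1 unit); every wrong guess
# would take t2 = 100 units before reporting unsatisfiability.
def run_solver(g):
    return (1.0, "secret-key") if g == 16 else (100.0, None)

key, cost = alarm_clock_search(range(17), run_solver, t1=1.0)
```

With t_1 = 1 and t_2 = 100, the 16 wrong guesses cost only 1.5 units each instead of 100, so the total work is about 25 units rather than 1601, which is the entire point of the alarm clock.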
2.4 Our Attack

2.4.1 A Particular Two-Function Representation

Recall that every 64th round uses the same key bit. In other words, the same bit is used in rounds t, t+64, t+128, .... Note further that 528 = 8·64 + 16. Thus the key bits k_15,...,k_0 are used nine times, and the key bits k_63,...,k_16 are used eight times. With this in mind, it is clear that the operation of the cipher can be represented as

E_k(P) = g_k(f_k(f_k(··· f_k(P) ···)))   [with f_k applied 8 times]   = g_k(f_k^(8)(P)) = C

where f_k represents 64 rounds, and g_k the final 16 "extra" rounds.

2.4.2 Acquiring an f_k^(8)-oracle

Suppose we simply guess the 16 bits of the key denoted k_15,...,k_0. Of course, we will succeed with probability 2^−16. But at that point, we can evaluate g_k or its inverse g_k^−1. Then

g_k^−1(E_k(P)) = g_k^−1(g_k(f_k^(8)(P))) = f_k^(8)(P)

and our oracle for E_k now gives rise to an oracle for f_k^(8).

2.4.3 The Consequences of Fixed Points

For the moment, assume we find x and y such that f_k(x) = x and f_k(y) = y. At first, this seems strange to discuss at all. Because f_k(x) = x and therefore f_k^(8)(x) = x, we know E_k(x) = g_k(f_k^(8)(x)) = g_k(x). But g_k(x) is the part of the cipher that we can remove by guessing a quarter (16 bits) of the key. Therefore, if we "know something" about x, we know something about multiple internal points, the input, and the output of E_k(x). Now we will make this idea more precise. Intuitively, we now know 64 bits of input and 64 bits of output of the function f (32 bits of each from each message). This forms a very rigid constraint, and it is highly likely that only one key could produce these outputs. This means that if we solve the system of equations for that key, we will get exactly one answer, which is the secret key. The only question is whether the system of equations is rapidly solvable or not. The resulting system has equations for the 64 rounds of f.
For both x and y, there are equations for L_0,...,L_95 and 32 additional output equations, but the first 32 and the last 32 of these (in both cases) are of the forms L_i = x_i and L_{i+64} = x_i, and can be eliminated by substitution. Thus there are actually 96 + 32 − 32 − 32 = 64 equations (again, one per round) for each of x and y, and thus 128 total equations. We emphasize that this is the same system of equations as in Section 2.3.8 on page 13, but with only 64 rounds for each message. The x_i's and y_i's are known. Thus the unknowns are the 64 bits of the key, and the 32 "intermediate" values of L_i for each of x and y. This is 128 total unknowns. After translating from cubic into quadratic format, it becomes 384 equations and 384 unknowns. This is much smaller than the 3168 equations and 3168 unknowns we had before. In each case, ElimLin, Magma, Singular, and the methods of Chapter 3 solved the system for k_0,...,k_63 in time too short to measure accurately (i.e., less than 1 minute). It should be noted that we require two fixed points, not merely one, to make the attack work. One fixed point alone is not enough of a constraint to narrow the keyspace sufficiently. However, two fixed points were sufficient each time it was tried. Therefore, we will assume f has two or more fixed points, and adjust our probabilities of success accordingly. One way to look at this is to say that only those keys which result in two or more fixed points are vulnerable to our attack. However, since the key changes rapidly in most applications (see Section 2.6 on page 35), and since approximately 26.42% of random permutations GF(2)^32 → GF(2)^32 have this property (see Section 2.4.8 on page 30), we do not believe this to be a major drawback.

2.4.4 How to Find Fixed Points

Obviously a fixed point of f_k is a fixed point of f_k^(8) as well, but the reverse is not necessarily true. Stated differently, the set of fixed points of f_k^(8) will contain the set of all fixed points of f_k.
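This containment, and the way that guessing the "extra"-round key bits yields an f^(8)-oracle (Section 2.4.2), can be illustrated with a toy cipher. The 8-bit f and g below are invented stand-ins, not Keeloq itself:

```python
def f(k, x):
    # Toy keyed permutation on 8-bit values (a stand-in for 64 Keeloq rounds).
    x = (x + k) & 0xFF
    return ((x << 3) | (x >> 5)) & 0xFF           # rotate left by 3 bits

def g(k, x):                                      # toy "extra rounds" g_k
    return x ^ (k & 0xFF)

def g_inv(k, x):                                  # g_k is easy to invert
    return x ^ (k & 0xFF)

def f8(k, x):                                     # f_k^(8)
    for _ in range(8):
        x = f(k, x)
    return x

def E(k, x):                                      # E_k = g_k composed with f_k^(8)
    return g(k, f8(k, x))

key = 0x5A
# Correctly guessing the g-part of the key turns an E_k-oracle into an
# f^(8)-oracle, exactly as in Section 2.4.2:
oracle = lambda x: g_inv(key, E(key, x))
assert all(oracle(x) == f8(key, x) for x in range(256))

# Every fixed point of f is a fixed point of f^(8), but not conversely:
fix_f  = {x for x in range(256) if f(key, x) == x}
fix_f8 = {x for x in range(256) if f8(key, x) == x}
assert fix_f <= fix_f8
```

The attack then enumerates fix_f8 via the oracle and tests pairs of its elements as candidate fixed points of f, as described next.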
We will first calculate the set of fixed points of f_k^(8), which will be very small. We will try the attack given in the previous subsection using every pair of fixed points. If it is the case that f_k has two or more such fixed points, then one pair which we try will indeed be a pair of fixed points of f_k. This will produce the correct secret key. The other pairs will produce spurious keys or inconsistent systems of equations. But this is not a problem, because spurious keys can be easily detected and discarded. The running time required to solve the system of equations is too short to measure accurately, with a valid or an invalid pair. Recall that this is 384 equations and 384 unknowns, as compared to 3168, as explained in Section 2.4.3 on page 21. There are probably very few fixed points of f_k^(8), as we will prove below. And thus the running time of the entire attack depends only upon finding the set of fixed points of f_k^(8). One approach would be to iterate through all 2^32 possible plaintexts, using the f_k^(8)-oracle. This would clearly uncover all possible fixed points of f_k^(8), and if f_k has any fixed points, they would be included. However, this is not efficient. Instead, one can simply try plaintexts in sequence using the f_k^(8)-oracle. When the i-th fixed point x_i is found, one tries the attack with the i−1 pairs (x_1, x_i),...,(x_{i−1}, x_i). If two fixed points of f_k are to be found in x_1,...,x_i, the attack will succeed at this point, and we are done. Otherwise, continue until x_{i+1} is found and try the pairs (x_1, x_{i+1}),...,(x_i, x_{i+1}), and so forth.

Table 2.1: Fixed points of random permutations and their 8th powers

Size                     2^12    2^12     2^13     2^14     2^15     2^16
Experiments              1000    10,000   10,000   10,000   10,000   100,000
Abortions (n_1 < 2)      780     7781     7628     7731     7727     76,824
Good Examples (n_1 ≥ 2)  220     2219     2372     2269     2273     23,176
Average n_1              2.445   2.447    2.436    2.422    2.425    2.440
Average n_8              4.964   5.684    5.739    5.612    5.695    5.746
Average Location         2482    2483     4918     9752     19,829   39,707
Percentage (α)           60.60%  60.62%   60.11%   59.59%   60.51%   60.59%

2.4.5 How Far Must We Search?

One could generate a probability distribution on the possible values of n_1 and n_8, the numbers of fixed points of f_k and f_k^(8). However, if all we need to know is how many plaintexts must be tried until two fixed points of f are discovered, then this can be computed by an experiment. We generated 1000 or 10,000 random permutations of sizes 2^12, 2^13, 2^14, and 2^15, and 100,000 of size 2^16. Then we checked to see if they had two or more fixed points, and aborted if this was not the case. If two or more fixed points were indeed present, we tabulated the number of fixed points of the eighth power of that permutation under composition. Finally, we examined at which value the second fixed point of f was found, when iterating through the values of f^(8) and searching for its fixed points. The data is given in Table 2.1 on page 23. It shows that we must check around 60% of the possible plaintexts. It also confirms the values of n_1 = 2.39 (calculated in Section 2.4.8 on page 30) and n_8 = 5.39 (calculated in Section 2.4.9 on page 31).

2.4.6 Fraction of Plainspace Required

As calculated in Section 2.4.9 on page 31, we expect f to have a total of 3 points of orders 2, 4, or 8, in expectation. This is independent of the number of fixed points, so long as the number of fixed points is small. The probability of f having p fixed points is 1/(p!e), as calculated in Lemma 3 on page 29. Upon conditioning on f having at least two fixed points, this becomes

1 / (p!e(1 − 2/e))

Our plan is to check each fixed point of f^(8), and see if it is a fixed point of f (denoted "yes"), or not (denoted "no"), the latter meaning it is a point of order 2, 4, or 8. Upon the second "yes" result, we stop. How many checks must we perform until we stop? Denote the number of checks required as k. Now recall that we expect 3 points of order 2, 4, or 8, independent of the number of fixed points of f. This is shown in Section 2.4.9 on page 31.
If we have p fixed points of f, we will expect 3 + p fixed points of f^(8). For example, suppose there are p = 2 fixed points of f. In expectation, we will find 5 fixed points of f^(8). The possible arrangements are

k = 2: YYNNN
k = 3: YNYNN, NYYNN
k = 4: YNNYN, NYNYN, NNYYN
k = 5: YNNNY, NYNNY, NNYNY, NNNYY

The expected value of k in the above example is 4. Since the fixed points of f^(8) are expected to be uniformly distributed in the plainspace, we can model them as placed uniformly in the intervals {(0, 0.2), (0.2, 0.4),...,(0.8, 1.0)} (viewed as fractions of the plainspace). Alternatively, this is expecting the fixed points to be located at approximately {0.1, 0.3, 0.5, 0.7, 0.9}. This is obviously a crude model, but the aforementioned experiments confirm it. In our case, since k is expected to be 4, then 0.7 or 70% of the plaintext must be checked, in expectation, for p = 2. Of course, if the fixed points were not uniformly distributed, and we knew the distribution, we would find them faster than this. In general, 2 ≤ k ≤ 5, and (2k−1)/(2(p+3)) is the fraction of plaintext that needs to be computed. Call this value α. To find the probability distribution of α, we need only find a probability distribution for k, since we already have one for p. This tabulation will be simplified by observing that to the left of position k there is one yes, and all remaining are nos. To the right of position k one finds p−2 yeses, and all remaining are nos. The left of k has k−1 slots, and the right of k has p+3−k slots. Writing C(n, r) for the binomial coefficient, this gives us a number of arrangements:

C(k−1, 1) · C(p+3−k, p−2) = (k−1) · C(p+3−k, 5−k)

Since k ∈ {2, 3, 4, 5}, the total number of arrangements is

(1)·C(p+1, 3) + (2)·C(p, 2) + (3)·C(p−1, 1) + (4)·C(p−2, 0)

Thankfully, that totals to C(p+3, 3), the number of ways of putting three nos into p+3 slots.
This fact can be seen algebraically by expanding out the choice function above in each case. The probability of k = K is thus given by

Pr{k = K | p = P} = (K−1) C(P+3−K, 5−K) / C(P+3, 3)

Then we can apply the probability distribution of p:

Pr{k = K | p ≥ 2} = Σ_{p=2}^{∞} [(K−1) C(p+3−K, 5−K) / C(p+3, 3)] · 1/(p!e(1 − 2/e))

From this, the expected value of k can be found:

E[k | p ≥ 2] = Σ_{k=2}^{5} Σ_{p=2}^{∞} [k(k−1) C(p+3−k, 5−k) / C(p+3, 3)] · 1/(p!e(1 − 2/e))

Knowing that α, the fraction of the plainspace that we must search, is given by

α = (2k−1)/(2(p+3)) = (k − 1/2)/(p+3)

as shown above, we can substitute to obtain:

E[α | p ≥ 2] = Σ_{k=2}^{5} Σ_{p=2}^{∞} [(k − 1/2)/(p+3)] · [(k−1) C(p+3−k, 5−k) / C(p+3, 3)] · 1/(p!e(1 − 2/e))

This evaluates to 0.6297, which is close to the value given in Table 2.1 on page 23, considering the crudeness of the model.

2.4.7 Comparison to Brute Force

Recall that f has two or more fixed points with probability 1 − 2/e, and that we require f to have two or more. Our success probability is 2^−16 (1 − 2/e) ≈ 2^−17.92. A brute-force attack which would itself have probability 2^−17.92 of success would consist of guessing 2^46.08 possible keys and then aborting, because 46.08 + 17.92 = 64, the length of the key. Therefore, our attack must be faster than 2^46.08 encryptions' worth of guesses, or 528 · 2^46.08 ≈ 2^55.12 rounds. Our attack requires g_k^−1(E_k(P)), which needs an additional 16 rounds per plaintext. Even if we use the whole dictionary of 2^32 possible plaintexts, this comes to (528 + 16) · 2^32 ≈ 2^41.09 rounds, which is about 2^14.04 times faster than brute force. If instead we use (528 + 16)(3/5)2^32 ≈ 2^40.35 rounds (now an expected value, based on the last paragraph of the previous section), the attack is about 2^14.77 times faster than brute force.
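The constants used in this section can be spot-checked numerically. The short computation below follows the formulas of Sections 2.4.6 and 2.4.7 (truncating the infinite sums over p at 60 terms, where they have long since converged):

```python
from math import e, factorial, comb, log2

# Probability a large random permutation has >= 2 fixed points (Corollary 1):
p_ge2 = 1 - 2 / e                                          # ~ 0.26424

# Expected number of fixed points, conditioned on having at least two:
n1 = sum(p / (e * factorial(p)) for p in range(2, 60)) / p_ge2   # ~ 2.3922

# Points of order 2, 4, 8 contribute 3 more fixed points to f^(8):
n8 = n1 + 3                                                # ~ 5.39

# E[alpha | p >= 2], the expected fraction of the plainspace searched:
alpha = sum((k - 0.5) / (p + 3)
            * (k - 1) * comb(p + 3 - k, 5 - k) / comb(p + 3, 3)
            / (factorial(p) * e * p_ge2)
            for k in range(2, 6) for p in range(2, 60))    # ~ 0.6297

# log2 of the speedup over a brute-force attack of equal success probability:
full_dict = log2(528 * 2**46.08) - log2((528 + 16) * 2**32)        # ~ 14.04
expected  = log2(528 * 2**46.08) - log2((528 + 16) * 0.6 * 2**32)  # ~ 14.77
```

The computed E[α] ≈ 0.630 matches both the 0.6297 of the model and the roughly 60% observed in Table 2.1.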
2.4.8 Some Lemmas

This section provides some of the probability calculations needed in the previous sections. The argument in this section is that if (for random k) the function f_k : GF(2)^n → GF(2)^n is computationally indistinguishable from a random permutation from S_{2^n}, then f_k and f_k^(8) have various properties. Our f_k and f_k^(8) are not random permutations, but are based on the Keeloq specification. Since we are discussing the cryptanalysis of a block cipher, we conjecture that modeling f_k as a random permutation is a good model (as is common). If not, much easier attacks might exist. This is a standard assumption. However, we need only 3 facts from this analysis. First, the expected number of fixed points, if there are at least two, is about 2.3922. Second, the probability of having two or more fixed points is about 26.424%. Third, the expected number of fixed points of f^(8) is around 5.39. These particular facts were verified by the simulations given in Table 2.1 on page 23, and found to be reasonable.

Lemma 1 Both f and g are bijections.

Proof: Note that E_k is a permutation (bijection) for any specific fixed key, as must be the case for all block ciphers. Then further note that E_k(P) = g_k(f_k^(8)(P)) implies that f and g are bijections. Of course, if the domain or range of these functions were not finite, this argument would be incorrect. The only conclusion would be that the outermost function is surjective and that the innermost is injective. []

Lemma 2 If h : GF(2)^n → GF(2)^n is computationally indistinguishable from a random permutation, then h′(x) = h(x) ⊕ x is computationally indistinguishable from a random function.

Proof: If h′ is computationally distinguishable from a random function, that means there exists an Algorithm A which, in time polynomial in n and with probability ε, can distinguish between an oracle Ω being either h′ or a random function of appropriate domain and range.
Presumably this requires queries to Ω, and only polynomially many queries in n, since Algorithm A runs in time polynomial in n. Finally, ε is non-negligible in n, or more simply, 1/ε is bounded above by a polynomial in n. We create Algorithm B, which will distinguish between an oracle Ω being h or a random permutation with appropriate domain and range. First, run Algorithm A. Whenever Algorithm A asks for a query Ω(x), return Ω(x) ⊕ x. If Ω is h, then Ω(x) ⊕ x = h(x) ⊕ x = h′(x). Likewise, if Ω is a random permutation, then x ↦ Ω(x) ⊕ x acts as computationally indistinguishable from a random function, since it is a well-known theorem that random functions and random permutations cannot be distinguished in polynomial time [GGM86]. Algorithm B outputs "pseudorandom" or "random" if and only if Algorithm A does. This is a perfect simulation in either case, and therefore Algorithm A will be correct with probability ε, and therefore Algorithm B will be correct with probability ε. Thus h and a random permutation are computationally distinguishable. We have now proven that h′ being computationally distinguishable from a random function implies that h is computationally distinguishable from a random permutation. The contrapositive is our lemma. []

Lemma 3 If h : GF(2)^n → GF(2)^n is a random permutation, then the limit as n → ∞ of the probability that h has p fixed points is 1/(p!e).

Proof: If h′(x) = h(x) ⊕ x, and if h(y) = y, then h′(y) = 0. Thus the set of fixed points of h is merely the preimage of 0 under h′. By Lemma 2, h′ behaves as a random function. Thus the value of h′(y) for any particular y is an independently and identically distributed uniform random variable. The "Bernoulli trials" model therefore applies.
If |h′^(−1)(0)| is the size of the preimage of 0 under h′, then

lim_{n→∞} Pr{|h′^(−1)(0)| = p} = C(2^n, p) (2^−n)^p (1 − 2^−n)^(2^n − p)
  = C(2^n, p) (2^−n)^p (1 − 2^−n)^(2^n) (1 − 2^−n)^(−p)
  ≈ [2^n (2^n − 1)(2^n − 2)(2^n − 3)···(2^n − p + 1) / p!] (2^−n)^p (e^−1)(1)
  ≈ (1)(1 − 1·2^−n)(1 − 2·2^−n)(1 − 3·2^−n)···(1 − (p−1)·2^−n) e^−1 / p!
  ≈ 1/(p!e)

Thus h has p fixed points with probability 1/(p!e). []

Corollary 1 If h : GF(2)^n → GF(2)^n is a random permutation, then h has two or more fixed points with probability 1 − 2/e ≈ 0.26424. Assuming h has 2 or more fixed points, it has exactly 2 with probability (1/2e)/(1 − 2/e) ≈ 0.69611, and 3 or more with probability 1 − (1/2e)/(1 − 2/e) ≈ 0.30389.

Therefore it is useful to calculate the expected number of fixed points given that we assume there are 2 or more. This calculation is given by

[1/(1 − 2/e)] Σ_{i=2}^{∞} i/(e·i!) ≈ 2.3922

2.4.9 Cycle Lengths in a Random Permutation

It is well known that in a random permutation, the expected number of cycles of length m is 1/m, which is proven in the lemma at the end of this subsection. Thus the expected numbers of cycles of length 1, 2, 4, and 8 in f_k are 1, 1/2, 1/4, and 1/8. All of these cycles consist of fixed points of f_k^(8), providing 1, 2, 4, and 8 fixed points each, or a total of 1, 1, 1, and 1 expected fixed points, or 4 expected fixed points in total. Thus n_8 = 4 in the general case. In the special case of f_k having at least 2 fixed points, we claim the number should be largely unchanged. Remove the 2.39 expected fixed points from the domain of f_k. This new function has a domain and range of size 2^n − 2 and is still a permutation. Since we are assuming n is large, 2^n − 2 ≈ 2^n.
Indeed, if one makes a directed graph with one vertex for each x in the domain, with x having only one exit edge, which points to f(x), then those two fixed points are both individual islands disconnected from the main graph. Clearly, the remainder of the graph of the permutation f_k is unchanged by this removal. But that further implies that f_k, restricted to the original domain with the two fixed points removed, is unchanged on its other values. Thus f_k^(8) after the removal would still have 4 fixed points in expectation. We can then estimate 4 − 1 + 2.39 = 5.39 fixed points for f_k^(8), because 1.0 fixed points are expected from f_k in general, and 2.39 when f_k has at least two fixed points. An alternative way to look at this is that the fixed points of f_k^(8) are precisely the points of orders 1, 2, 4, and 8 of f_k. These four sets are mutually exclusive, and their cardinalities (as random variables for a random permutation) are asymptotically independent as the size of the domain goes to infinity. To see why this is true, imagine a directed graph G = (V, E) with one vertex for each domain point and one edge pointing from x to f_k(x) for all x in the domain. The fixed points of f are vertices that have only a self-loop as their edges. Therefore, a few can be deleted from the graph without changing its characteristics. Thus the expected number of points of order 2, 4, and 8 of a random permutation should remain unchanged upon requiring the permutation to have 2 or more fixed points. This comes to 2·(1/2) + 4·(1/4) + 8·(1/8) = 3 expected points. Since we require f_k to have two or more fixed points, it will have 2.39 in expectation, as calculated in Section 2.4.8 on page 30. Thus the expected number of fixed points of f_k^(8) is 5.39, when f is restricted to having two or more fixed points. See also the note at the end of this subsection.

Lemma 4 The expected number of cycles of length m in a random permutation h : D → D, in the limit as |D| → ∞, is 1/m.

Proof: Consider x, f(x), f(f(x)), ...
= y_1, y_2, y_3, .... Call the first repeated value among the y_i to be y_r. More precisely, y_r = y_1, and y_r ≠ y_i for 1 < i < r.