ABSTRACT
Title of Dissertation: NONLINEAR COMPLEXITY OF
BOOLEAN PERMUTATIONS
Thomas Gordon Draper, Doctor of Philosophy, 2009
Dissertation directed by: Dr. Lawrence C. Washington
Department of Mathematics
We introduce the concept of nonlinear complexity, where the complexity of a
function is determined by the number of nonlinear building blocks required for
construction. We group functions by linear equivalence, and induce a complexity
hierarchy for the a ne equivalent double cosets. We prove multiple invariants
of double cosets over the a ne general linear group, and develop a specialized
double coset equivalence test. This is used to classify the 16! permutations over
4 bits into 302 equivalence classes, which have a maximal nonlinear depth of 6.
In addition, we present a new complexity class de ned in terms of nonlinearity.
NONLINEAR COMPLEXITY OF BOOLEAN PERMUTATIONS
by
Thomas Gordon Draper
Dissertation submitted to the Faculty of the Graduate School of the
University of Maryland, College Park in partial ful llment
of the requirements for the degree of
Doctor of Philosophy
2009
Advisory Committee:
Dr. Lawrence C. Washington, Chairman/Advisor
Dr. Je ery Adams
Dr. John J. Benedetto
Dr. Kartik Prasanna
Dr. William Gasarch, Dean?s Representative
c Copyright by
Thomas Gordon Draper
2009
DEDICATION
To Tyra, Thomas, Lillian, Elizabeth and most of all Susan, who made
it all possible.
ii
ACKNOWLEDGEMENTS
Thoughtful consideration revealed many who contributed to my dis-
sertation, in ways large and small.
My parents head the list. From my  rst memories, they saw my
a nity for mathematics and nurtured it. They challenged, inspired,
balanced and loved me. I realize more and more how important this
was.
My professors and teachers at BYU challenged me further and helped
me set a higher standard for myself, both in mathematics and in
life. Their greatest contribution to me was helping me to center on
God and then to build my life around Him. Interestingly, I realize
that much of this communication was made possible because of our
common love of mathematics and science.
My job has been one of the greatest experiences of my life. I would
never have imagined the satisfaction received from puzzle solving with
impact. Again and again I have found myself in the right place at the
iii
right time for the right opportunity. I consider myself very blessed
and wish everyone could do what they love for a living. In addition,
my bosses and coworkers have been extremely supportive of my dis-
sertation work, even when it increased their own burden. I am very
grateful for this and I know this is not the norm for the workplace.
I wish to personally thank Dr. Washington and Dr. Gasarch. They
were both extremely helpful in directing and focusing my research
e orts. I admire them both as researchers and teachers.
Finally, I wish to thank my children and especially my wife. I know
we sacri ced much of our time together and I hope it will be worth
it. We have a long and exciting road ahead to share together. I hope
Tyra, Thomas, Lillian and Elizabeth will each  nd and enjoy their
talents. I am excited to see their lives unfold.
I conclude with an overwhelming appreciation for the love of my life,
Susan. She encouraged me when I was tired. She strengthened me
when I was weak. She burdened herself to make things easier for
me. She loved me. I will always be in her debt. I adore her more
than anyone else on this earth. I am looking forward to our many
adventures yet to come.
iv
TABLE OF CONTENTS
List of Tables xi
List of Figures xiii
1 Introduction 1
1.1 The Di culty with Di culty . . . . . . . . . . . . . . . . . . . . . 1
1.2 Combinatorics and Complexity . . . . . . . . . . . . . . . . . . . 2
1.3 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Summary of Primary Results . . . . . . . . . . . . . . . . . . . . 4
1.5 Classi cation Strategy . . . . . . . . . . . . . . . . . . . . . . . . 5
1.5.1 What Functions are Equivalent? . . . . . . . . . . . . . . . 5
1.5.2 How Do We Induce a Complexity Hierarchy? . . . . . . . . 6
1.5.3 What Can We Compute? . . . . . . . . . . . . . . . . . . . 7
2 Complexity Measures 8
2.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 A Di erent View of Complexity . . . . . . . . . . . . . . . . . . . 8
2.3 Transposition Complexity Measure . . . . . . . . . . . . . . . . . 9
2.4 Gate Complexity Measure . . . . . . . . . . . . . . . . . . . . . . 11
2.5 Reversible Gate Complexity Measure . . . . . . . . . . . . . . . . 12
v
2.6 Nonlinear Reversible Complexity Measure . . . . . . . . . . . . . 13
3 Reversible Circuits 15
3.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.2 Classical Circuit Diagrams . . . . . . . . . . . . . . . . . . . . . . 15
3.3 Basic Reversible Gates . . . . . . . . . . . . . . . . . . . . . . . . 16
3.3.1 NOT Gate . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3.2 Controlled NOT Gate (CNOT) . . . . . . . . . . . . . . . 18
3.3.3 SWAP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.3.4 To oli Gate . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.3.5 Example Permutation . . . . . . . . . . . . . . . . . . . . 21
3.3.6 Fredkin Gate . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.4 Linear transformations on reversible circuits . . . . . . . . . . . . 23
3.5 Controlled Linear Transformations . . . . . . . . . . . . . . . . . 25
3.6 A ne Transformations . . . . . . . . . . . . . . . . . . . . . . . . 26
3.7 Scratch Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.7.1 3-bit AND . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4 Which Permutations Should Be Equivalent? 32
4.1 Results and Application . . . . . . . . . . . . . . . . . . . . . . . 32
4.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.3 The Equivalence Choice . . . . . . . . . . . . . . . . . . . . . . . 33
4.4 A ne Equivalence of To oli and Fredkin Gates . . . . . . . . . . 35
4.5 Controlled A ne and Linear Permutations . . . . . . . . . . . . . 36
vi
5 Counting Equivalent Functions 37
5.1 Summary and Results . . . . . . . . . . . . . . . . . . . . . . . . 37
5.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.3 Counting Fixed Points Under a Double Coset Action . . . . . . . 40
5.4 Counting A ne Equivalent Double Cosets in S2n . . . . . . . . . 42
5.4.1 Example: A ne equivalent 3-bit permutations . . . . . . . 42
5.4.2 Counting A ne Equivalent n-bit Permutations . . . . . . 45
5.5 Counting A ne Equivalent Double Cosets in A2n . . . . . . . . . 46
5.6 Open problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
6 Hamiltonian Cycles over GLn(2) 53
6.1 Results and Application . . . . . . . . . . . . . . . . . . . . . . . 53
6.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
6.3 Existence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
6.4 Borel Subgroup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.5 Heuristic Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 61
6.6 Open Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
7 Double Coset Representatives 63
7.1 Results and Application . . . . . . . . . . . . . . . . . . . . . . . 63
7.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
7.3 Basis Fixing Permutations . . . . . . . . . . . . . . . . . . . . . . 65
7.4 Basis Permuting Permutations . . . . . . . . . . . . . . . . . . . . 66
7.5 Minimal Basis Fixing Permutations . . . . . . . . . . . . . . . . . 68
7.6 Open Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
vii
8 Multiple Rank Invariant 71
8.1 Results and Application . . . . . . . . . . . . . . . . . . . . . . . 71
8.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
8.3 The k-rank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
8.4 The Invariance of the k-rank . . . . . . . . . . . . . . . . . . . . . 74
8.5 Multiple Rank Invariant . . . . . . . . . . . . . . . . . . . . . . . 75
8.6 Categorization of 3-bit Permutations by MRI . . . . . . . . . . . . 76
8.7 Categorization of 4-bit Permutations by MRI . . . . . . . . . . . . 77
8.8 Open Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
9 Additional Invariants 79
9.1 Results and Application . . . . . . . . . . . . . . . . . . . . . . . 79
9.2 Parity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
9.3 Two-Way Permutations . . . . . . . . . . . . . . . . . . . . . . . . 80
9.4 Open Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
10 Complexity Theory 84
10.1 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
10.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
10.3 Nonlinear Complexity Class . . . . . . . . . . . . . . . . . . . . . 86
10.4 Nonlinear Polynomial Complexity . . . . . . . . . . . . . . . . . . 91
10.5 Uniform and Non-Uniform Nonlinear Circuits . . . . . . . . . . . 93
10.6 Open Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
11 Computation 96
11.1 Storage of Permutations . . . . . . . . . . . . . . . . . . . . . . . 96
11.1.1 Truth Table Form . . . . . . . . . . . . . . . . . . . . . . . 97
viii
11.1.2 Function Table Form . . . . . . . . . . . . . . . . . . . . . 98
11.1.3 Polynomial Form . . . . . . . . . . . . . . . . . . . . . . . 99
11.2 Quick double coset test . . . . . . . . . . . . . . . . . . . . . . . . 100
11.2.1 Composition via a Hamiltonian Path . . . . . . . . . . . . 100
11.2.2 Linearity Test . . . . . . . . . . . . . . . . . . . . . . . . . 101
11.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
12 Classi cation of 3-bit and 4-bit permutations 102
12.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
12.2 Permutations of Interest . . . . . . . . . . . . . . . . . . . . . . . 102
12.2.1 Modular Addition . . . . . . . . . . . . . . . . . . . . . . . 103
12.2.2 Modular Multiplication . . . . . . . . . . . . . . . . . . . . 103
12.2.3 Modular A ne . . . . . . . . . . . . . . . . . . . . . . . . 104
12.2.4 Modular Inverse . . . . . . . . . . . . . . . . . . . . . . . . 104
12.2.5 Modular Exponentiation . . . . . . . . . . . . . . . . . . . 105
12.2.6 Pseudo-Inverse over GF(2)n . . . . . . . . . . . . . . . . . 105
12.3 Complexity Class NLC(i;j) . . . . . . . . . . . . . . . . . . . . . . 106
12.4 3-bit Permutations . . . . . . . . . . . . . . . . . . . . . . . . . . 106
12.4.1 NLC(3;0) . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
12.4.2 NLC(3;1) . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
12.4.3 NLC(3;2) . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
12.4.4 NLC(3;3) . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
12.4.5 No 3-bit Permutations are One-Way . . . . . . . . . . . . 110
12.5 4-bit Permutations . . . . . . . . . . . . . . . . . . . . . . . . . . 110
12.6 Open Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
ix
13 Case Studies 114
13.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
13.2 The Hiltgen function . . . . . . . . . . . . . . . . . . . . . . . . . 114
13.3 Zech Logarithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
13.4 Incrementing modulo 2n . . . . . . . . . . . . . . . . . . . . . . . 118
13.5 Multiplication by 2n 1 + 1 modulo 2n . . . . . . . . . . . . . . . . 119
13.6 Addition and Subtraction modulo 2n . . . . . . . . . . . . . . . . 120
14 Conclusion 122
A Source Code 123
x
LIST OF TABLES
3.1 Truth Table: NOT Gate . . . . . . . . . . . . . . . . . . . . . . . 17
3.2 Wire Program: NOT Gate . . . . . . . . . . . . . . . . . . . . . . 18
3.3 Truth Table: Controlled NOT Gate . . . . . . . . . . . . . . . . . 18
3.4 Wire Program: CNOT Gate . . . . . . . . . . . . . . . . . . . . . 19
3.5 Wire Program: SWAP . . . . . . . . . . . . . . . . . . . . . . . . 19
3.6 Truth Table: SWAP . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.7 Wire Program: To oli Gate . . . . . . . . . . . . . . . . . . . . . 21
3.8 Truth Table: To oli Gate . . . . . . . . . . . . . . . . . . . . . . 21
3.9 Wire Program: f = (a2 a0a1;a1 1;a2 a0a1 a0) . . . . . . . 22
3.10 Truth Table: f = (a2 a0a1;a1 1;a2 a0a1 a0) . . . . . . . . 22
3.11 Truth Table: Fredkin Gate . . . . . . . . . . . . . . . . . . . . . . 23
4.1 Number of Permutation Equivalence Classes . . . . . . . . . . . . 34
5.1 Cycle Sets of AGL(3,2) . . . . . . . . . . . . . . . . . . . . . . . . 44
5.2 MAGMA function for counting double cosets . . . . . . . . . . . . 45
5.3 Number of A ne Equivalent Permutations . . . . . . . . . . . . . 46
5.4 Number of A ne Equivalent Even Permutations . . . . . . . . . . 51
8.1 Multiple Rank Invariant of 3-bit Permutations . . . . . . . . . . . 77
8.2 Multiple Rank Invariant of 4-bit Permutations . . . . . . . . . . . 78
xi
9.1 Invariant Table for 4-bit Permutations . . . . . . . . . . . . . . . 83
10.1 Space Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . 88
11.1 Truth Table: f = (a2 a0a1;a1 1;a2 a0a1 a0) . . . . . . . . 97
11.2 Monomial Conversion Table . . . . . . . . . . . . . . . . . . . . . 99
12.1 NLC(3;0) Permutations . . . . . . . . . . . . . . . . . . . . . . . . 107
12.2 NLC(3;1) Permutations . . . . . . . . . . . . . . . . . . . . . . . . 108
12.3 NLC(3;2) Permutations . . . . . . . . . . . . . . . . . . . . . . . . 109
12.4 NLC(3;3) Permutations . . . . . . . . . . . . . . . . . . . . . . . . 109
12.5 Invariant Table for each 4-bit Permutations Depth . . . . . . . . . 112
12.6 Depth of Various 4-bit Permutations . . . . . . . . . . . . . . . . 113
xii
LIST OF FIGURES
3.1 Full Adder from XORs, ANDs and ORs . . . . . . . . . . . . . . . 16
3.2 Circuit Diagram: NOT Gate . . . . . . . . . . . . . . . . . . . . . 17
3.3 Circuit Diagram: Controlled NOT Gate . . . . . . . . . . . . . . 18
3.4 Circuit Diagram: SWAP . . . . . . . . . . . . . . . . . . . . . . . 19
3.5 Circuit Diagram: To oli Gate . . . . . . . . . . . . . . . . . . . . 20
3.6 Circuit Diagram: f = (a2 a0a1;a1 1;a2 a0a1 a0) . . . . . . 21
3.7 Circuit Diagram: Fredkin Gate . . . . . . . . . . . . . . . . . . . 23
3.8 Circuit Diagram: Linear Matrix Example . . . . . . . . . . . . . . 24
3.9 Block matrix product acting on kets (A B = BA) . . . . . . . . 25
3.10 Two equivalent block matrix actions . . . . . . . . . . . . . . . . 25
3.11 Fredkin gate as a controlled linear transformation . . . . . . . . . 26
3.12 To oli gate as a controlled linear transformation . . . . . . . . . . 26
3.13 Equivalent A ne Transformations . . . . . . . . . . . . . . . . . . 27
3.14 3-bit AND Using 3 Wires? . . . . . . . . . . . . . . . . . . . . . . 28
3.15 3-bit AND: Second Attempt . . . . . . . . . . . . . . . . . . . . . 28
3.16 3-bit AND: Third Attempt . . . . . . . . . . . . . . . . . . . . . . 29
3.17 3-bit AND: Third Attempt (Revisited) . . . . . . . . . . . . . . . 29
3.18 3-bit AND: Fourth Attempt . . . . . . . . . . . . . . . . . . . . . 29
3.19 3-bit AND: Fifth Attempt . . . . . . . . . . . . . . . . . . . . . . 30
xiii
3.20 3-bit AND: Final Attempt . . . . . . . . . . . . . . . . . . . . . . 31
4.1 Equivalence of Fredkin and To oli gates . . . . . . . . . . . . . . 35
4.2 A ne Equivalent Controlled Transformations . . . . . . . . . . . 36
6.1 Extending the Active Cycle . . . . . . . . . . . . . . . . . . . . . 59
12.1 3-bit Permutations . . . . . . . . . . . . . . . . . . . . . . . . . . 107
13.1 Zech Logarithm for x3 +x+ 1 . . . . . . . . . . . . . . . . . . . . 117
13.2 Zech Logarithm for x4 +x+ 1 . . . . . . . . . . . . . . . . . . . . 118
13.3 Incrementing modulo 8 . . . . . . . . . . . . . . . . . . . . . . . . 119
13.4 Incrementing modulo 16 . . . . . . . . . . . . . . . . . . . . . . . 119
xiv
Chapter 1
Introduction
1.1 The Di culty with Di culty
In 1996, James L. Massey delivered the IACR Distinguished Lecture at EU-
ROCRYPT ?96, entitled \The Di culty with Di culty" [Mas96]. This talk,
which served as an inspirational starting point for my own research, addressed
why measuring the complexity of Boolean functions is so di cult. This thesis
presents a new method for measuring complexity and develops tools that allow
us to examine a previously unknown structure: the minimal complexity of all
16! = 20;922;789;888;000 4-bit Boolean permutations.
In particular, we introduce the notion of nonlinear complexity. We will prove
that all permutations may be constructed using alternating rounds of linear and
simple nonlinear permutations. In other words, every n-bit permutation p is a
composition of the form
p = L0N1L1N2L2   NnLn;
where each Li is an n-bit linear function, and each Ni is a simple nonlinear
permutation selected from a small set. The nonlinear complexity of a permu-
1
tation will be the minimum number of rounds of nonlinearity needed to build the
permutation.
To  nd the minimal nonlinearity of all 4-bit Boolean permutations, faster
double coset identi cation methods were needed. Special methods for working
with AGLn(2) double cosets were needed, since general double coset methods
would take months to compute the results given in this thesis.
Special Hamiltonian cycles accelerate the double coset equivalence test. Vari-
ous invariants are used to accelerate double coset identi cation. The mathemat-
ical theory behind each of these advances is developed in this thesis.
Using these new tools, we examine the nonlinear complexity of many common
3-bit and 4-bit permutations.
1.2 Combinatorics and Complexity
Let V = GF(2), the  nite  eld over two elements. Given a Boolean function
f : Vn!V, how di cult is f to compute? Typically, the di culty of computing
f is associated with how many \building blocks" are needed to construct f. One
common set is the AND, OR and NOT gates. AND, OR and NOT are a universal
set. This means any Boolean function can be constructed from a  nite number
of them. Such constructions are not unique, but there will be a minimal number
of gates needed to construct a function (There may also be multiple minimal
solutions).
Minimal solutions are found by exhaustively constructing all possible func-
tions and marking the minimal instances of a function by where they  rst occur.
This method is essentially a combinatorical search. Unfortunately, the number of
functions f : Vn!V is 22n. Due to the doubly exponential growth, classifying
2
the 225 = 4;294;967;296 5-bit to 1-bit functions is a challenge for current desktop
computers. If we could classify a billion billion functions (1018) per second, then
classifying all 227 = 2128 = 340282366920938463463374607431768211456 7-bit to
1-bit functions would take about 1013 years, which is a thousand times longer
than the current age of the universe (1:3 1010 years).
In response to this massive explosion of complexity, researchers chose to focus
on certain equivalence classes of functions.
1.3 History
In 1951, researchers at the \Computational Laboratory of Harvard" exhaus-
tively classi ed the 402 classes under permutation and complementation of the
input variables of the 65;536 four-variable Boolean functions [AtSotCL51]. After-
ward, mathematicians realized Polya theory could easily enumerate the number
of classes. This sparked a large amount of theoretical interest in counting the
number of Boolean functions in various manners [Lor64].
Studying methods for  nding optimal or minimal representations for these
functions was largely ignored. This is likely due to the fact that look-up tables
work very e ciently for small functions.
Quantum computing has forced mathematicians to reconsider the foundations
of computer science with the added constraint of reversibility. In the reversible
context, it is more natural to study Boolean permutations (n-bit to n-bit invert-
ible functions) as opposed to the more commonly studied Boolean functions (n to
1 bit functions). Many foundational ideas that were thought to be settled have
been resurrected. Even lowly addition has once again become a hot topic. In the
search for e cient programs on a quantum computer, the exact calculation of
3
small permutations is important. Look-up tables are problematic on a quantum
computer since computation space is at a premium. Due to the di culty of quan-
tum computation, we are willing to spend a great deal of classical computational
power to  nd an e cient quantum program.
This paper presents a new method for categorizing the complexity of a func-
tion in terms of the number of rounds of nonlinearity needed to achieve the func-
tion. Functions which di er from each other by a linear transformation will be
considered to have the same nonlinear complexity. A small set of simple nonlinear
permutations will be used to connect the linearly related equivalence classes. By
counting the minimal number of nonlinear transitions needed to create a function,
we can assign a partial ordering to the equivalence classes.
1.4 Summary of Primary Results
A new theory of complexity is introduced, based on nonlinearity. This new notion
of complexity proves useful in comparing the exact complexity of very small
functions, but also de nes new complexity classes that  t nicely into the current
complexity framework of computer science.
We show how Boolean permutations can be separated into double coset equiv-
alence classes based a ne transformations. We can then link the double cosets
via a special class of nonlinear functions. The nonlinear connections induce a
complexity hierarchy among Boolean Permutations.
Previous methods for  nding the complexity of Boolean functions focused on
counting the minimal number of gates needed to realize a function from a given
universal set. The doubly exponential growth of Boolean functions has made
research beyond 3-bit Boolean permutations and 5-to-1 bit Boolean functions
4
impractical.
The central contribution of this paper is to count rounds of nonlinearity in-
stead of counting gates. By relaxing the restriction from the  ne granularity of
gate count to a coarser measure, we can view the complexity structure present in
permutations previously too di cult to analyze.
Using the results of this paper, we will revisit the complexity of 3-bit permu-
tations in a new light and present a previously unknown complexity structure for
4-bit permutations.
1.5 Classi cation Strategy
1.5.1 What Functions are Equivalent?
Linear functions play an integral role in virtually every  eld of mathematics.
Why? Because we can solve, manipulate, invert, combine and expand them
in an understandable way that does not carry over very neatly into the realm
of nonlinear functions. We do not use linear functions because they are more
common than nonlinear functions. We use linear functions because they represent
the majority of problems that we know how to solve.
A linear function of a bit vector is a matrix where all of the entries are 0 and
1 and the resulting product is reduced modulo 2. This is the same as the ring of
matrices over the two-element  nite  eld GF(2). Of special interest are the linear
functions that are invertible. These linear functions form the group GLn(2) and
will be referred to as linear permutations since they permute the 2n n-bit vectors.
Considering n-bit permutations in this light, linear permutations might be
considered the permutations with minimal complexity. They form a subgroup in
5
which it is easy to multiply and invert. Their matrix form is much more compact
(n2 bits) than their associated permutation description (n2n bits). In fact, we
will want to slightly enlarge our de nition of minimal complexity permutations
to include all n-bit a ne permutations, since a ne permutations also share the
same nice properties.
After deciding that a ne permutations have minimal complexity, it is natural
to associate two nonlinear functions if they are equivalent under a ne transfor-
mations. This equivalence divides S2n into disjoint double coset where AGLn(2)
acts on the inputs and outputs of a permutation.
It is interesting to note that the current best known bound between the com-
plexity of a function and its inverse is based on a sparse linear transformation
with a dense inverse [Hil93].
1.5.2 How Do We Induce a Complexity Hierarchy?
We obviously need a nonlinear permutation to add to our a ne permutations, or
we will never be able to realize all permutations in S2n. There are two reasonable
choices here, and both will be considered.
First, the To oli and Fredkin gates are two di erent universal gates for re-
versible computation. We will prove that these gates are a ne equivalent, and
thus they both induce the same complexity structure. This will be referred to as
the To oli basis or BT.
Second, controlled a ne permutations are also universal for reversible com-
putation. Although this class of functions is not as simple as a single To oli
gate, controlled a ne functions have a complexity roughly equivalent to the lin-
ear transformations between each nonlinear round. They are also easy to invert
6
and cover a broader range of permutations faster. These gates will be referred to
as the controlled a ne basis or BCA.
Once we have chosen which nonlinear basis we wish to use, we can induce
the related complexity tree. We start with a ne permutations as our level zero
complexity (zero rounds of nonlinearity). Then all equivalence classes that can
be reached using one gate from our nonlinear basis become level one complexity
(one round of nonlinearity). Then the new equivalence classes that can be reached
from level one, become level two complexity and so on. We can then create a
graph illustrating how di erent permutations relate and the number of rounds
of nonlinearity needed to achieve a particular function gives a rough complexity
measure.
1.5.3 What Can We Compute?
Even though we have reduced the combinatorial complexity problem to a much
more algebraic problem, we still face formidable computational problems. The
equivalence classes themselves are double cosets, and there are currently no truly
satisfactory algorithms for double coset enumeration, equivalence or canonical
representation [Hol05].
Much of the paper develops the theory behind special methods for solving
a ne double coset problems speci cally over S2n. The nonlinear complexity
depth is computed for all 3 and 4 bit permutations.
7
Chapter 2
Complexity Measures
2.1 Summary
This chapter recalls three di erent, but known, complexity measures. These
measures serve as a useful benchmark for assessing the advantages of the nonlinear
complexity measure. The complexity measures introduced are:
 Transposition Complexity Measure
 Gate Complexity Measure
 Reversible Gate Complexity Measure
2.2 A Di erent View of Complexity
In computer science, studies of complexity usually focus on an in nite class of
functions. For example, the complexity class P, contains decision problems that
can be computed in polynomial time, and NP, contains decision problems whose
answer can be veri ed in polynomial time. The open question of whether or not
P = NP is the most important question in computer science.
8
However, we may be interested in a particular function and not an in nite
class. In designing a 64-bit adder, there will be optimal constructions given a set
of building blocks (e.g. logic gates, transistors, etc.). Knowing which complexity
class addition is in may provide a reasonable starting design, but  nding an
optimal design is a much harder problem.
Let us now consider a number of measures for assessing the optimal construc-
tion of a function.
2.3 Transposition Complexity Measure
One simple complexity measure in which a complete theory can be easily devel-
oped is the transposition complexity measure. Recall that a permutation p on
N =f1;2;:::;ngis a bijective map from N to itself. One common representation
for permutations is: 0
B@ 1 2 ::: n
p(1) p(2) ::: p(n)
1
CA
NOTE: Due to the computational nature of the permutations studied in this
paper, it will be more natural to consider \zero-based" permutations instead of
the more traditional \one-based" permutations used in most algebra texts. Thus,
permutations on n elements will be presented as:
0
B@ 0 1 ::: n 1
p(0) p(1) ::: p(n 1)
1
CA
Example 2.3.1. Consider the function p(x) = 3x + 3 mod 8. Evaluating the
9
function for each x in f0;1;2;3;4;5;6;7g yields:
0
B@ 0 1 2 3 4 5 6 7
3 6 1 4 7 2 5 0
1
CA
Lemma 2.3.2. Every permutation can be written as a composition of disjoint
cycles.
Proof. See [Jac74, 49]
Example 2.3.3. Decomposing the permutation 3x+3 mod 8 into cycles yields:
(0347)(1652)
A transposition is a permutation that swaps two elements and leaves all of
the other elements  xed. Essentially, it is the simplest non-trivial permutation.
Using the key fact that any cycle can be decomposed into a product of transpo-
sitions (e.g. (0347) = (47)(37)(07)), we arrive at another simple, yet powerful
result.
Lemma 2.3.4. Every permutation can be written as a composition of transposi-
tions. Furthermore, every m-cycle can be written as the composition of m 1
transpositions.
Proof. See [Jac74, 49-50]
We can now de ne the transposition complexity, CT, of a permutation
to be the minimum number of transpositions whose composition realizes that
permutation.
Lemma 2.3.5. If p is a permutation on f0;1;:::;n 1g then CT(p) = n c,
where c is the number of disjoint cycles in p.
10
Proof. See [Mas96]
Let us now consider the pros and cons of the transposition complexity mea-
sure.
 Pro: CT is easy to compute given a permutation.
 Pro: CT de nes a partial ordering on the complexity measure of permuta-
tions.
 Con: Linear functions have high complexity. Linear and a ne functions,
which are generally considered easy to compute, can obtain close to maximal
complexity.
 Con: A permutation and its inverse always have the same complexity. This
leads to the surprising result that there are no one-way permutations for
the transposition complexity measure. If we believe that there are one-
way permutations, then this complexity measure is not measuring the right
thing.
2.4 Gate Complexity Measure
This is probably the complexity measure that  rst comes to mind for most re-
searchers. First, choose a universal set of gates such as AND, OR and NOT,
or all possible 2-bit gates, or even just NAND gates. Then count the number
of gates necessary to compute a function. Functions are created by connecting
the gates in an acyclic network and then computing the output given a speci c
input. The gate complexity of a given function with respect to a given set of
universal gates is the minimum number of gates needed to realize the function.
11
Let us now consider how gate complexity stacks up against our criteria.
 Pro: Simple functions have simple constructions.
 Pro: Counting gates de nes a partial ordering on the complexity measure
of functions.
 Con: Computing the measure seems to be very di cult. It appears that
exhaustively enumerating all possible functions from acyclic networks is the
only way to  nd the minimum function in general.
 Con: Functions that are closely related can have very di erent looking gate
structures. It is also di cult to know if functions are only one gate away
from each other without exhaustively checking.
Current state of the art: Exhaustively searching the 226 = 264 functions that
map 6 bits to 1 bit or the (24)!  244 permutations over 4 bits is prohibitive.
In a search of the literature, the author has found no indication that these cases
have been exhausted. However, in 1997, the coordinated e orts of DESCHALL
exhaustively searched the 256 DES keys in an RSA Security challenge. Thus,
either of these classi cations could possibly be done by a large coordinated e ort.
To date, it seems that only the minimum number of gates needed for all 3-bit
permutations and 5-bit to 1-bit functions have been computed.
2.5 Reversible Gate Complexity Measure
Since it appears that look-up tables can not be used very e ciently on a quantum
computer, there has been increased interest in  nding the optimal gate construc-
tion of a function under the restrictions of reversibility.
12
Unfortunately, the search for optimal gate implementations in the reversible
context faces many of the same challenges that plague the gate complexity mea-
sure [MDM07].
 Pro: Counting reversible gates de nes a partial ordering on the complexity
measure of permutations.
 Pro: Has low complexity for simple functions. Since classical computation
can be e ciently simulated by a reversible computation, simple functions
will have simple reversible gate implementations.
 Con: Exhaustively enumerating all possible functions by applying new
gates again appears to be the only way to  nd the minimum function in
general.
2.6 Nonlinear Reversible Complexity Measure
Each of the previous complexity measures have been studied in depth, and will
not be treated further in this paper. However, each of these complexity measures
provides a useful measuring stick in evaluating new complexity measures. We
now present the basics of nonlinear complexity, and analyze its advantages.
Since linear functions are so simple, yet powerful, it is reasonable to suggest
that we consider two functions to be equivalent if a linear transformation on the
input and output of one makes it equal to the other. Using this basic notion of
equivalence, it is now natural to measure a function by the number of rounds of
nonlinearity necessary to achieve it.
Obviously, we must restrict the class of nonlinear connections, otherwise ev-
ery function is achievable in at most one round of nonlinearity { using itself!
13
The class of nonlinear connections should have low complexity and have inverses
that are easy to compute. Inspired again by quantum computing, the most nat-
ural nonlinear function is the To oli gate. This gate is universal for reversible
computation.
 Pro: Since the complexity measure graph can be constructed from the
identity up, we get a partial ordering similar to the gate complexity mea-
sure. Except in this case, we are counting the number of nonlinear functions
needed.
 Pro: Simple functions have low complexity. Linear functions have com-
plexity zero, since they require no rounds of nonlinearity. Simple nonlinear
functions have implementations requiring few NAND gates. These can be
converted into reversible functions using few To oli gates. Since each Tof-
foli gate can be used as a round of nonlinearity, this sets an upper bound
on the number of rounds of nonlinearity a function needs.
 Pro: We are able to compute a complexity graph for 4-bit permutations.
This is currently not feasible for the gate complexity measure or reversible
gate complexity measure.
 Con: Similar to combinatorical complexity measures, this approach also
quickly runs out of steam due to the doubly exponential size of the space
we are exploring.
Possibly the most exciting result of the nonlinear complexity measure is that
we are able to use more powerful mathematics to analyze the problem of com-
plexity.
14
Chapter 3
Reversible Circuits
3.1 Summary
This chapter introduces the basic gates of reversible computing and illustrates
how these gates may be composed. Three di erent methods for visualizing re-
versible circuits are introduced: circuit diagram, truth table and wire program.
The majority of the content can be found in the open literature. It is presented
here to provide the reader su cient background to understand the material in
the later chapters.
3.2 Classical Circuit Diagrams
Classical circuits may be represented as diagrams connecting the inputs and out-
puts by a combination of wires connected by universal gates. There are certain
rules to building circuits, such as no closed loops, but these restrictions will not
be discussed in this paper.
Example 3.2.1. One very common building block of circuit design is the full
adder, which computes the sum and carry of two bits and an incoming carry bit.
15
A common presentation is given in Figure 3.1.
Figure 3.1: Full Adder from XORs, ANDs and ORs
It is also useful to represent reversible circuits with diagrams. In [Ben73], Ben-
nett discusses the necessary conditions for reversible computing. For reversible
circuits, the conditions imply that
1. The number of wires is constant throughout the computation.
2. Every gate or action must be a permutation.
3. Wires may not be split or joined.
3.3 Basic Reversible Gates
The three most commonly used reversible gates are the NOT gate, the con-
trolled NOT gate or CNOT, and the To oli gate, also referred to as a controlled-
controlled-NOT gate, a doubly controlled NOT gate or simply CCNOT. As we
discuss reversible functions it is useful to view the functions in multiple forms,
the wire program, circuit diagram and truth table. We now present some basic
gates in each of the three forms.
16
In the wire programs presented in this thesis, the binary operation  = will
be used repeatedly. This notation is very useful in reversible computing since it
ensures that the end function is a permutation. Similar to a C program where
a+ = 2 means that the value of a is incremented by 2, hki = f(a0;a1;:::an)
means that the value of hki is updated by XORing it with the value of f.
3.3.1 NOT Gate
The NOT gate is the simplest and acts on only one bit. The wire program for
the NOT gate acting on the kth bit is simply hki = 1, In  gure 3.2, the wire
h0i is negated.
Figure 3.2: Circuit Diagram: NOT Gate
a0     b0 = a0 1
Table 3.1: Truth Table: NOT Gate
Input (a0) Output (b0)
0 1
1 0
Given an n wire circuit, the NOT gate applied to various wires will generate
any bit translation (e.g. f(x) = x b where b2Zn2 ).
17
Table 3.2: Wire Program: NOT Gate
Instruction h0i
a0
h0i = 1 a0 1
3.3.2 Controlled NOT Gate (CNOT)
The controlled NOT gate or CNOT adds the value of one wire to another. If hii
is the control and hji is the target, then the wire program for the CNOT gate is
hji =hii. In  gure 3.3, h0i is the control and h1i is the target.
Figure 3.3: Circuit Diagram: Controlled NOT Gate
a0  b0 = a0
a1     b1 = a1 a0
Table 3.3: Truth Table: Controlled NOT Gate
a0a1 b0b1
0 0 0 0
0 1 0 1
1 0 1 1
1 1 1 0
Note that the control and target can be any wires. The CNOT gates on n
wires generate the linear group GLn(2). Combining CNOT and NOT gates will
generate the a ne linear group AGLn(2).
18
Table 3.4: Wire Program: CNOT Gate
Instruction h0i h1i
a0 a1
h1i =h0i a0 a1 a0
3.3.3 SWAP
Lemma 3.3.1. The value of two lines can be swapped using 3 CNOTs. (Referred
to as SWAP.)
Proof. If we wish to switch the values stored inhiiandhji, we can simply apply
the circuit in Figure 3.4.
Figure 3.4: Circuit Diagram: SWAP
a0       b0 = a1
a1          b1 = a0
We can trace the values on line h0i and line h1i through the following wire
program:
Table 3.5: Wire Program: SWAP
Instruction h0i h1i
a0 a1
h1i =h0i a0 a1 a0
h0i =h1i a1 a1 a0
h1i =h0i a1 a0
19
Table 3.6: Truth Table: SWAP
a0a1 b0b1
0 0 0 0
0 1 1 0
1 0 0 1
1 1 1 1
This is a well known trick from computer science when you wish to switch the
value of two variables and not use a temporary variable.
3.3.4 To oli Gate
The doubly controlled NOT gate, CCNOT or To oli gate adds the product of
two wires to another. If hii and hji are the controls and hki is the target, then
the wire program for the CCNOT gate ishki =hiihji. In  gure 3.5,h0iandh1i
are the controls and h2i is the target.
Figure 3.5: Circuit Diagram: To oli Gate
a0  b0 = a0
a1  b1 = a1
a2     b2 = a2 a0a1
The To oli is probably the most well known universal reversible gate. Note
that both the NOT and CNOT gates can be derived from the To oli gate by
forcing certain control lines to be 1.
20
Table 3.7: Wire Program: To oli Gate
Instruction h0i h1i h2i
a0 a1 a2
h2i =h0i h1i a0 a1 a2 a0a1
Table 3.8: Truth Table: To oli Gate
a0a1a2 b0b1b2
0 0 0 0 0 0
0 0 1 0 0 1
0 1 0 0 1 0
0 1 1 0 1 1
1 0 0 1 0 0
1 0 1 1 0 1
1 1 0 1 1 1
1 1 1 1 1 0
3.3.5 Example Permutation
Example 3.3.2. Consider the function f(a0;a1;a2) = (a2  a0a1;a1  1;a2  
a0a1 a0). The circuit diagram, wire program and truth table for f are in  gure
??, and tables 3.9 and 3.10.
Figure 3.6: Circuit Diagram: f = (a2 a0a1;a1 1;a2 a0a1 a0)
a0      b0 = a2 a0a1
a1      b1 = a1 1
a2      b2 = a2 a0a1 a0
21
Table 3.9: Wire Program: f = (a2 a0a1;a1 1;a2 a0a1 a0)
Instruction h0i h1i h2i
a0 a1 a2
h1i = 1 a0 a1 1 a2
h2i =h0ih1i a0 a1 1 a2 a0a1 a0
h0i =h2i a2 a0a1 a1 1 a2 a0a1 a0
Table 3.10: Truth Table: f = (a2 a0a1;a1 1;a2 a0a1 a0)
a0a1a2 b0b1b2
0 0 0 0 1 0
0 0 1 1 1 1
0 1 0 0 0 0
0 1 1 1 0 1
1 0 0 0 1 1
1 0 1 1 1 0
1 1 0 1 0 0
1 1 1 0 0 1
3.3.6 Fredkin Gate
Another common universal gate for reversible computing is the Fredkin gate, or
controlled swap gate. While the To oli gate has two controls and one target, the
Fredkin gate has one control and two targets. If the control is 1, then the values
of the two targets are swapped. Again, if the control is 0, then no action takes
place.
Writing wire programs with the Fredkin gate is awkward since it updates the
value of two wires at once. For this reason, most wire programs will be written
22
Figure 3.7: Circuit Diagram: Fredkin Gate
a0  b0 = a0
a1
SWAP b1 = a1 a0a1 a0a2a
2 b2 = a2 a0a1 a0a2
Table 3.11: Truth Table: Fredkin Gate
a0a1a2 b0b1b2
0 0 0 0 0 0
0 0 1 0 0 1
0 1 0 0 1 0
0 1 1 0 1 1
1 0 0 1 0 0
1 0 1 1 1 0
1 1 0 1 0 1
1 1 1 1 1 1
using To oli gates instead of Fredkin gates. Due to the linear equivalence between
To oli and Fredkin gates (that will be proven later), the necessary conversion is
fairly trivial.
3.4 Linear transformations on reversible circuits
Linear transformations are central to the equivalence relation that will be devel-
oped later. Not surprisingly, linear permutations can be constructed using only
a small subset of the gates presented so far.
Lemma 3.4.1. The CNOT gates on n wires generate GLn(2).
23
Proof. Observe that the CNOT from hii to hji is the same as adding column i
to column j and the SWAP gate on hii and hji simply swaps columns i and j.
These elementary column operations can be used to reduce any invertible linear
transformation to the identity matrix. Thus, by simply reversing the echelon
reduction, any invertible linear transformation can be made from CNOTs.
Instead of representing a linear transformation as a series of controlled NOT
gates, we can represent the linear transformation as a single matrix action on a
group of wires. This can be done without hiding too much complexity since every
linear permutation may be constructed using O(n) CNOTs [KMS07].
As the binary state progresses from left to right, the binary state will be a
GF(2) column vector acted on by left multiplication of GF(2) matrices. Mirroring
the physics convention for quantum vectors, we will represent the column vectors
using the \ket" notation. (e.g. The column vector [1;1;0]T =j110i.)
Example 3.4.2. Consider the vector j110i acted upon by the linear transfor-
mation
h0 0 1
1 0 10 1 0
i
. The result will be the vector j110i multiplied on the left by the
matrix
h0 0 1
1 0 10 1 0
i
j110i=
h0 0 1
1 0 10 1 0
i
[1;1;0]T =
h0 0 1
1 0 10 1 0
ih1
10
i
=
h0
11
i
= [0;1;1]T =j011i:
Figure 3.8: Circuit Diagram: Linear Matrix Example
a0 0 0 1
1 0 1
0 1 0
a2
ja0a1a2i= a1 a0 a2 =ja2;a0 a2;a1i
a2 a1
24
Remark 3.4.3. In order to maintain a consistent notation with established no-
tation, actions are applied to the input ket from left to right. When considering
linear actions applied to the ket, the matrix multiplication is applied on the left
of the column vector. Thus, to retain readability, we will use a notation similar
to the opposite ring or Rop notation when multiplying matrices in block form.
Thus we de ne A B = BA for matrices A and B. Note that applying matrix A
and then B to ket jxi yields B(Ajxi) = (A B)jxi.
Figure 3.9: Block matrix product acting on kets (A B = BA)
A B A B=
It should also be noted that linear transformations can be enlarged to include
bit lines they do not a ect. Of course, the opposite is also true. Bit lines that
are una ected can be dropped out of the block matrix.
Figure 3.10: Two equivalent block matrix actions
1 0 0
0 a b
0 c da bc d
=
3.5 Controlled Linear Transformations
Controlled linear transformations will operate in a wire diagram as anticipated.
If the controls are all 1 then the diagram will apply the linear transformation.
25
Otherwise, no action is taken.
Example 3.5.1. We can express the Fredkin gate by controlling the matrix [ 0 11 0 ].
Figure 3.11: Fredkin gate as a controlled linear transformation
  
0 1
1 0
=
SWAP
Example 3.5.2. Since a single control a ne shift is equivalent to a linear trans-
formation, we can express the To oli gate as a controlled linear transformation.
Figure 3.12: To oli gate as a controlled linear transformation
  
 = 1 1
0 1    
3.6 A ne Transformations
An a ne transformation is simply a linear transformation plus the addition of a
vector. Flipping the state of a wire in reversible circuits is typically represented
by placing a  on the wire. While this notation is a little strange, it also makes
sense. It is simply a controlled NOT without the control, and therefore just a
NOT gate.
26
Example 3.6.1. We can express the a ne transformation
h0 0 1
1 0 00 1 1
i
+j110iin two
di erent ways.
Figure 3.13: Equivalent A ne Transformations
0 0 1 1
1 0 0 1
0 1 1 0
0 0 1
1 0 0
0 1 1
    
=     
Controlled a ne transformations will also operate in exactly the same manner
as controlled linear transformations. If the controls are all 1, then the a ne
transformation will be applied. Otherwise, no action is taken.
3.7 Scratch Space
Finally we consider scratch space, or additional computation space. For various
calculations, additional wires may be needed to store intermediate values or the
 nal calculation. In some cases, such additional wires are necessary, in other
cases, they simply speed up the computation. It should be noted that although
the To oli gate is universal, it is only universal given enough scratch space.
To illustrate how scratch space is used in reversible computation, we will
consider a simple problem: computing the AND of three wires.
3.7.1 3-bit AND
Suppose we wanted to construct the 3-bit AND from To oli, CNOT and NOT
gates. We will explore various ideas to show why two scratch bits are necessary
27
and give motivation for the idea that works.
Since a permutation is a bijection, there do not exist functions f and g such
that  gure 3.14 can have the output abc on the third wire (Two values must map
to one). Thus, we must utilize at least one additional wire to perform a reversible
computation for the 3-bit AND gate.
Figure 3.14: 3-bit AND Using 3 Wires?
a
?
f(a;b;c)
b g(a;b;c)
c abc
Now consider the circuit in  gure 3.15.
Figure 3.15: 3-bit AND: Second Attempt
a   a
b       b abc
c  c
0          abc
Figure 3.15 has computed the value of the 3-bit AND abc, but one of the
wires has been modi ed. It is often preferable not to modify the input variables
in the event they are needed for other computations. This is corrected in the
implementation in  gure 3.16.
Figure 3.16 now does seem to do the trick, but usually in reversible computing,
one wants to compute a value and have it XORed onto some set of bits. In this
case we are relying on the scratch bit to be initialized to zero for the 3-bit AND
to work. Figure 3.17 shows the circuit evaluated with an arbitrary scratch bit d.
28
Figure 3.16: 3-bit AND: Third Attempt
a   a
b           b
c  c
0           abc
Figure 3.17: 3-bit AND: Third Attempt (Revisited)
a   a
b           b acd
c  c
d           d acd abc
Computing the 3-bit AND and XORing its value on the fourth bit is a triply
controlled NOT. Upon further re ection, we realize that the CCCNOT, or triply
controlled NOT is an odd permutation on 4-bit wires. Since the NOT, CNOT
and CCNOT are all even permutations on 4-bit wires, we will need to add yet
another scratch bit to create a permutation equivalent to CCCNOT.
With 2 scratch bits, it doesn?t take long to  nd the construction in  gure
3.18.
Figure 3.18: 3-bit AND: Fourth Attempt
a  a
b  b
c  c
0      ab
e     e abc
29
Thus, in  gure 3.18 all of the input values are preserved and the 3-bit AND is
XORed onto the target wirehei. The only problem is that one of our scratch bit
still contains some left over information. For classical computation, this doesn?t
seem like much of an issue. But for reversible computation, erasure means heat
dissipation. We would prefer to \uncompute" the scratch bit to return it to
its original value. Furthermore, if the reversible computation is to be used on a
quantum computer, it is necessary to return any scratch space used to its original
state or it will destroy the superposition of the computation. Thus,  gure 3.19 is
still a better version of our 3-bit AND.
Figure 3.19: 3-bit AND: Fifth Attempt
a   a
b   b
c  c
0          0
e     e abc
Finally, there is one more surprising improvement that can be made. The
fourth scratch bit does not need to be initialized to zero for the computation to
work. For any value in d,  gure 3.20 successfully computes the 3-bit AND of
a;b;c, XORs the value onto e and returns d to it original value.
This amazing trick is even more useful in quantum computation. Qubit wires
can be used as computation scratch space, even when we cannot observe their
contents.
30
Figure 3.20: 3-bit AND: Final Attempt
a   a
b   b
c   c
d           d
e         e abc
3.8 Conclusion
Hopefully this small foray into reversible computation illustrates some of the
di culty in exploring permutation constructions. One important observation
to make at this point is the relationship between scratch space and one-way
permutations.
Any reversible computation that does not use any scratch space, or a compu-
tation that \uncomputes" the scratch space that it does use, is easily reversible.
All of the gates can be run in the reverse order.
Thus for a permutation to be one-way, some circuit must exist which can
compute one direction of the permutation e ciently. Furthermore, this circuit
must leave \dirty" scratch space (not \uncomputed"), otherwise the inverse com-
putation can simply run the circuit backwards.
31
Chapter 4
Which Permutations Should Be Equivalent?
4.1 Results and Application
The major results of this chapter are:
 A ne equivalence is a maximal equivalence relation in S2n.
 The two major universal reversible gates (To oli and Fredkin) are a ne
equivalent.
 Singly-controlled linear and a ne permutations are a ne equivalent.
4.2 Background
In the early 1950?s, Aiken[AtSotCL51] and Moore[Moo52] exhaustively computed
the 402 equivalence classes of the 65536 four-variable Boolean functions (4-bit to
1-bit) under permutation and complementation of the input variables. Shortly
thereafter, mathematicians found combinatorical methods for counting that were
far superior to the exhaustive methods used by Aiken and Moore. This work
32
aroused a lot of interest in using Polya theory to count equivalent Boolean func-
tions. References to much of this work can be found in [Lor64].
Permutations as well as functions may be classi ed by Polya theory. This
chapter explores various notions of equivalence, and why a ne equivalence is
considered optimal.
4.3 The Equivalence Choice
Any group action on the input and output variables will de ne an equivalence.
Aiken and Moore?s original research stemmed from the question of how many
4-input, 1-output \cans" had to be designed to create all 4-bit nonlinear func-
tions. Since the input plugs could be arranged in any order, equivalence under
permutation of the inputs was one of the  rst equivalences studied. The following
is a list of some of the more natural and common choices.
1. Complementation of variables.
2. Permutation of variables.
3. Complementation and permutation of variables.
4. Linear transformation of variables.
5. A ne transformation of variables.
Table 4.1 is extracted from [Lor64]. It is helpful to see how the di erent
equivalence relations on inputs and outputs change the number of classes of per-
mutations. For most of the equivalence relations, no closed form solution to count
them is known. A lower bound is provided for the last four.
33
Table 4.1: Number of Permutation Equivalence Classes
Equivalence n 1 2 3 4
None 2n! 2 24 40320 20922789888000
Complementation 2n!+(2n 1)22n 1!22
n 1
22n 1 6 924 81738720000
Permutation > 2n!(n!)2 2 7 1172 36325278240
Comp. & Perm. > 2n!22n(n!)2 1 2 52 142090700
Linear > 2n!22n2 2 2 10 52246
A ne > 2n!22n2+2n 1 1 4 302
Note that the group of a ne transformations contains all of the other listed
group actions as subgroups. One obvious question is whether or not there is a
larger subgroup than a ne transformations suitable for larger equivalence classes.
Theorem 4.3.1 (O?Nan-Scott). Let Fq be the  nite  eld with q elements and
Fnq the n-dimensional vector space over Fq. Let AGL(n;q), S(Fnq ) and A(Fnq ) be
the a ne general linear group, symmetric group and alternating group acting on
Fnq respectively. Then AGL(n;q)\A(Fnq ) is a maximal subgroup of A(Fnq ).
Proof. See [LPS87].
This proves that AGLn(2) is a maximal subgroup in A2n. Thus, adding any
nonlinear element of S2n to the a ne group will generate all of A2n or S2n.
Although S2n contains other maximal subgroups, a ne equivalence is the natural
choice if we wish linear functions to have low complexity.
34
4.4 A ne Equivalence of To oli and Fredkin
Gates
As we start our journey of studying the complexity of functions using a ne
equivalence, it is refreshing to see that the two universal gates for reversible
computing are in fact a ne equivalent.
Lemma 4.4.1. The To oli gate and the Fredkin gate are a ne equivalent.
Figure 4.1: Equivalence of Fredkin and To oli gates
  
0 1
1 0
= 0 1
1 1
1 1
0 1
1 1
1 0
Proof. We need to show that when the top wire is 0, no action as taken by both
sides, and when the top wire is 1, the values in the bottom two lines are swapped.
When the top wire is 0, the controlled gate is not applied. Thus no action is
taken on the left, and the action on the right is
[ 0 11 1 ] [ 1 11 0 ] = [ 1 11 0 ] [ 0 11 1 ] = [ 1 00 1 ];
which is the identity, and therefore takes no action.
In the case where the top wire is 1, the action on the right is
[ 0 11 1 ] [ 1 10 1 ] [ 1 11 0 ] = [ 1 11 0 ] [ 1 10 1 ] [ 0 11 1 ] = [ 0 11 0 ];
which is equivalent to a swap. Thus the To oli and Fredkin gates are a ne
equivalent.
35
4.5 Controlled A ne and Linear Permutations
Lemma 4.5.1. A singly controlled a ne permutation is equivalent to a singly
controlled linear permutation composed with an a ne permutation.
Proof. Separate the controlled a ne function into the linear permutation and vec-
tor addition. The singly controlled vector addition is just the product of multiple
CNOTs. Any composition of CNOTs is a linear permutation. Thus, the singly
controlled a ne permutation is equivalent to a controlled linear permutation and
a linear permutation.
Another way to consider the equivalence is: conditionally negating wire hii
by wire hji; which is the same as adding wire hii to wire hji.
Example 4.5.2. Consider the following controlled a ne permutation:
Figure 4.2: A ne Equivalent Controlled Transformations
   
0 1 1
1 0 1
= 0 1
1 0
     
    
Since the composition of CNOTs is simply a linear function, the controlled
a ne permutation on the left is linear (and therefore a ne) equivalent to the
controlled linear permutation on the right.
36
Chapter 5
Counting Equivalent Functions
5.1 Summary and Results
A method for counting a ne equivalent double cosets in S2n is reviewed. This
result is then extended to show how we can enumerate the a ne equivalent double
cosets in A2n.
5.2 Background
Polya theory is a very robust method for counting di erent types of objects,
subject to an equivalence relation. Let us  rst review the basic counting theorems
and how they are applied to counting the number of a ne equivalent double
cosets.
Definition 5.2.1. Let f and g be n-bit permutations in S2n. Then f and g
are a ne equivalent or f  g if there exist a0 and a1, a ne permutations in
AGLn(2) <S2n, such that f = a0ga1. This is equivalent to f and g belonging to
the same double coset of S2n where the action on the right and left is by AGLn(2)
Anytime we have a group acting on a set, the group action will divide the
37
set into disjoint orbits. The Cauchy-Frobenius Lemma, one of the most beautiful
results of combinatorical group theory, states that the number of orbits is equal
to the average number of  xed points by the group action.
We need to introduce some de nitions before proceeding with the proof.
Definition 5.2.2. Let G be a group acting on the set X. De ne Xg = fx2
Xjg x = xg. Xg is the set of  xed points under the action of G.
Definition 5.2.3. Let G be a group acting on the set X. De ne the stabilizer
Gx =fg2Gjg x = xg. Then Gx is the subgroup of G that  xes the point x.
Definition 5.2.4. Let G be a group acting on the set X. De ne G(x) =fy2
Xj9g3g x = yg. G(x) is the orbit of the element x.
Lemma 5.2.5. (Orbit-Stabilizer relation) Let G be a group acting on the set X.
Given x2X, the length of the orbit G(x) is equal to the index of the stabilizer:
jG(x)j=jG=Gxj;
or equivalently,
jG(x)j jGxj=jGj:
Proof. De ne a mapping  : G(x) ! G=Gx by gx 7! gGx. Then using the
following relations:
g1x = g2x () g 12 g1 2Gx () g1Gx = g2Gx;
we can see that  is injective, onto, and well de ned. Therefore, it is a bijection.
It follows immediately that
jG(x)j=jG=Gxj:
38
We now have the machinery needed to prove the following.
Theorem 5.2.6 (Cauchy-Frobenius). Let G be a group acting on the set X. The
number of disjoint orbits is equal to the average number of  xed points.
jG==Xj= 1jGj
X
g2G
jXgj;
where jG==Xj denotes the number of G-orbits in X.
Proof. De ne the function f : G X!f0;1g by
f(g;x) =
8
><
>:
1 if g x = x
0 otherwise
Then by summing over x or g we get:
X
g2G
jXgj=
X
g2G;x2X
f(g;x) =
X
x2X
jGxj
Recall the Orbit-Stabilizer relation (we use it twice):
For all x2X jGxj jG(x)j=jGj:
Any group action divides X = O1[O2[   [ON into a disjoint union of orbits,
and it is exactly these orbits we wish to count. Consider the right hand sum over
a single orbit generated by y.
X
x2Oy
jGxj=
X
x2Oy
jGyj=jGyj jG(y)j=jGj:
Thus, the sum of jGxj over any orbit is jGj. Therefore,
X
g2G
Xg =
X
x2X
jGxj=
X
x2O1[   [ON
jGxj=
NX
i=1
jGj= NjGj
where N is the number of orbits.
It follows immediately that
jG==Xj= N = 1jGj
X
g2G
Xg:
39
5.3 Counting Fixed Points Under a Double Coset
Action
De ne  : (AGLn(2) AGLn(2)) S2n !S2n by  ((a;b);p) = apb 1. In order
to count the number of equivalence classes we have, we need to  gure out how
many permutations are  xed by the action of (a;b).
For any permutation we can de ne the cycle set  to be a tuple ( 1; 2; 3;:::)
where each  i indicates the number of i-cycles in the permutation. For any
permutation on N elements,
X
i
i i = N:
Lemma 5.3.1. If the action of (a;b) on the permutation p  xes p, then a and b
must have the same cycle set.
Proof. Suppose the action (a;b)  xesp. Thenapb 1 = por equivalently, apb 1p 1 =
1. LetCk = (b0b1:::bk 1) be ak-cycle inb 1. ThenpCkp 1 = (p(b0)p(b1):::p(bk 1)).
This can be veri ed by checking that
pCkp 1(p(bi)) = pCk(bi) = p(bi+1):
Since conjugating by p maps k-cycles to k-cycles for all k, conjugating by p will
take b to another permutation with the same cycle structure. Therefore a must
also have the same cycle structure since a is the inverse of pbp 1.
Lemma 5.3.2. Given two permutations a and b with the same cycle set ( 1; 2;:::),
the number of permutations  xed by the action (a;b) is
Y
k
 k!k k:
40
Proof. Recall that when the action (a;b)  xes a permutation p, apb 1p 1 = 1. Let
Cbk = (b0b1:::bk 1) be a k-cycle in b 1. Then pCbkp 1 = (p(b0)p(b1):::p(bk 1)).
Given another k-cycle in a, Cak, there are k di erent permutation maps p that
will conjugate Cbk to become the inverse of Cak. This is simply due to the fact
that the permutation can be shifted in k possible ways, all of which are actually
the same permutation.
If we have  k k-cycles in b 1 to match with  k k-cycles in a, there are  k!
ways to choose the matching. With each matched pair we have shown that there
are k permutations with which we can conjugate Cbk to become Cak. Thus there
are  k!k k ways to match all k-cycles.
Since the permutation acts on each cycle independently, the total number of
permutations  xed by the action (a;b) is the product over all k.
For counting the number of double cosets in general, we would examine the
cycle sets for the left and right action.
Theorem 5.3.3. Let H < S2n and let H = fh2Hjh has cycle set  g. Then
the number of HH double cosets of S2n is
1
jHj2
X
 
jH j2
Y
k
 k!k k:
Proof. Recall the double coset group action of G = H H on S2n,  : (H 
H) S2n !S2n de ned by  ((a;b);p) = apb 1. By Theorem 5.2.6, the number
of orbits (double cosets) is
jG==S2nj = 1jGj
X
g2G
Xg
= 1jHj2
X
h1;h22H
X(h1;h2)
41
By Lemma 5.3.1 the number of  xed points is zero unless h1 and h2 are in
the same cycle set. This reduces our equation to
1
jHj2
X
 
X
h1;h22H 
X(h1;h2):
Applying Lemma 5.3.2, we get
1
jHj2
X
 
X
h1;h22H 
Y
k
 k!k k:
Since the summand is independent of h1 and h2, we arrive at the desired result:
1
jHj2
X
 
jH j2
Y
k
 k!k k:
5.4 Counting A ne Equivalent Double Cosets
in S2n
By theorem 5.3.3, the number of a ne equivalent double cosets in P is
1
jAGLn(2)j2
X
 
jH j2
Y
k
 k!k k
where H is the number of elements in H with the cycle type  . Recall that all
the elements of any conjugacy class have the same cycle type. To computejH j,
we will take the union of conjucacy classes in AGLn(2) with cycle type  .
5.4.1 Example: A ne equivalent 3-bit permutations
We will now use MAGMA[BCP97] to  nd the cycle sets for the a ne general
linear group over GF(2).
42
> Classes(AGL(3,2));
Conjugacy Classes
-----------------
[1] Order 1 Length 1
Rep Id($)
[2] Order 2 Length 7
Rep (1, 2)(3, 4)(5, 6)(7, 8)
[3] Order 2 Length 42
Rep (1, 5)(2, 6)
[4] Order 2 Length 42
Rep (1, 2)(3, 4)(5, 8)(6, 7)
[5] Order 3 Length 224
Rep (1, 7, 4)(2, 5, 8)
[6] Order 4 Length 84
Rep (1, 4, 3, 2)(5, 8, 7, 6)
[7] Order 4 Length 168
Rep (1, 6, 2, 5)(7, 8)
[8] Order 4 Length 168
Rep (1, 8, 2, 5)(3, 6, 4, 7)
[9] Order 6 Length 224
Rep (1, 5, 7, 8, 4, 2)(3, 6)
[10] Order 7 Length 192
Rep (1, 5, 2, 3, 4, 6, 8)
[11] Order 7 Length 192
Rep (1, 6, 3, 5, 8, 4, 2)
Note that some classes have the same cycle set, even though they are not
conjugate. These need to be grouped together in our counting. (i.e. conjugacy
classes 2 and 4 are not conjugate, but both have a (0;4) cycle set.)
43
Table 5.1: Cycle Sets of AGL(3,2)
Cycle Set Classes Total Count
(8) 1 1 12 8!18
(4;2) 3 42 422 4!142!22
(0;4) 2;4 49 492 4!24
(2;0;2) 5 224 2242 2!122!32
(0;0;0;2) 6;8 252 2522 2!42
(2;0;1;0;1) 7 168 1682 2!121!211!41
(0;1;0;0;0;1) 9 224 2242 1!211!61
(0;0;0;0;0;0;0;1) 10;11 384 3842 1!81
Thus, we calculate the number of a ne equivalent 3-bit permutations by
summing the terms in table 5.1, and then dividing by jAGL(3;2)j2 = 13442.
SUM = 12 8!18 + 422 4!142!22 + 492 4!24 + 2242 2!122!32 + 2522 2!42
+ 1682 2!121!211!41 + 2242 1!211!61 + 3842 1!81
= 7225344
Thus the total number of double cosets is
SUM
13442 =
7225344
13442 = 4:
44
5.4.2 Counting A ne Equivalent n-bit Permutations
Table 5.2 provides a general MAGMA program to compute the number of a ne
equivalent permutations for any n. The computation is mostly limited by the
di culty of  nding the conjugacy classes of AGLn(2).
Table 5.2: MAGMA function for counting double cosets
Count_AGL_double_cosets_in_Sym := function(n)
H:=AGL(n,2); // Permutation group in Sym(2^n)
ConjClass:=ConjugacyClasses(H);
CycleTypes:={}; // Union together similar cycle sets
for x in ConjClass do
CycleTypes:=CycleTypes join {CycleStructure(x[3])};
end for;
sum:=0; // Compute and add summands for each cycle type
for x in CycleTypes do
size:=0;
for y in ConjClass do
if (x eq CycleStructure(y[3]))
then size:=size+y[2]; end if;
end for;
summand:=size^2;
for y in x do
summand:=summand*Factorial(y[2])*(y[1]^y[2]);
end for;
sum:=sum+summand;
end for;
return sum/(#H)^2; // Divide by group order
end function;
Using the function in table 5.2, the author was able to compute the number
of a ne equivalent double cosets for n  7. The total count grows doubly
exponentially, so only the values for n 5 are provided in table 5.4.
45
Table 5.3: Number of A ne Equivalent Permutations
n A ne Equivalent Total Permutations (2n!)
2 1 24
3 4 40320
4 302 20922789888000
5 2569966041123963092 263130836933693530167218012160000000
5.5 Counting A ne Equivalent Double Cosets
in A2n
Since the To oli gate is an even permutation for all n > 3, the To oli gate
and AGLn(2) can only generate A2n. For this reason, it will be more natural to
examine the complexity structure of the double cosets in A2n and extend to S2n
via an odd permutation only when necessary.
Let G be a group acting on a set X by conjugation. Let CG(x) denote the
centralizer and xG the conjugacy class of an element x. We will consider both
S2n and A2n acting on A2n by conjugation. For any  2A2n,
jCS2n( )j j S2nj= n!
jCA2n( )j j A2nj= n!2 :
Note that
CA2n( ) = CS2n( )\A2n <CS2n( ):
We will see that either the centralizer or conjugacy class will be half-sized in A2n.
The following theorem details exactly what happens.
46
Theorem 5.5.1. Let S2n and A2n both act on A2n by conjugation, and let  2
A2n. Then
jCA2n( )j=
8
><
>:
jCS2n( )j if  has only odd distinct cycles.
1
2jCS2n( )j otherwise.
Proof. See [Sco87].
Corollary 5.5.2. Let S2n and A2n both act on A2n by conjugation, and let
 2A2n. Then
j A2nj=
8
><
>:
1
2j 
S2nj if  has only odd distinct cycles.
j S2nj otherwise.
Proof. The order of the centralizer times the orbit must be n!2 . Thus if the
centralizer doesn?t change in A2n, the conjugacy class must split. Likewise, if the
conjugacy class doesn?t split in A2n, the centralizer must be half as large.
Since most permutations do not have a cycle type with only distinct odd
cycles, the conjugacy class for most permutations is the same between S2n and
A2n. However, the centralizer is now half as large for these permutations.
Let  represent the distinct odd cycles and  the rest. Let  1 and  2 represent
the two A2n conjugacy classes of cycle type  . Then the equation for theorem
5.3.3 reduces to
1
jAGLn(2)j2
 X
 
jH j2 12
Y
k
 k!k k +
X
 ;i=1;2
jH ij2
Y
k
 k!k k
!
:
The number of a ne equivalent double cosets in A2n is close to half the
number in S2n due to the fact that most of the cycles types are not distinct odd
cycles.
47
Lemma 5.5.3. If the distinct odd cycles  of AGLn(2) split evenly into the two
A2n conjugacy classes  1 and  2, then the number of a ne equivalent double
cosets in A2n is exactly half the number in S2n. I.e. if for all distinct odd cycles
 
jHA2n 1 j=jHA2n 2 j= 12jHS2n j
Then A2n contains half of the double cosets of S2n.
Proof. Since the summand result is already correct for cycle types  , that are
not odd and distinct, we need only consider the summands associated with the
odd distinct cycle types  .
X
 ;i=1;2
jH ij2
Y
k
 k!k k =
X
 ;i=1;2
jH 2 j2
Y
k
 k!k k
= 14
X
 ;i=1;2
jH j2
Y
k
 k!k k
= 14
 
2 
X
 
jH j2
Y
k
 k!k k
!
= 12
X
 
jH j2
Y
k
 k!k k:
Thus the contribution from both the odd distinct cycles  and the other cycles
 is half in both cases. Therefore the number of a ne double cosets A2n will be
exactly half of S2n.
Elements of the same AGLn(2) conjugacy class, will obviously be in the same
A2n conjugacy class, thus if a A2n conjugacy class has a cycle type that is odd
and distinct, there is no guarantee that there will be a matching conjugacy class.
In order to test whether or not two permutations are in the same A2n conjugacy
class, we will use the following result.
48
Lemma 5.5.4. Let p and q be permutations in S2n with the same odd distinct
cycle structure. Since p and q have the same cycle structure, there exists  2S2n
such that p =  q  1. If  is odd, then p and q are in di erent A2n conjugacy
classes. If  is even then p and q are in the same A2n conjugacy class.
Proof. Find  such that p =  q  1. Such a  is guaranteed to exist since p and
q have the same cycle structure.
Assume  is an odd permutation. If p and q are in the same A2n conjugacy
class, then there exists an even permutation  such that q =  p  1. Thus
p =  q  1
=   p  1  1
= (  )p(  ) 1:
Therefore   is an odd permutation in the centralizer of p, which is a contra-
diction, since the centralizer of p is contained in A2n.
Assume that  is an even permutation. Since  2A2n, this implies that p
and q are in the same A2n conjugacy class.
Consider now the number of cycles in a permutation in S2n with odd distinct
cycles. Since the length of the cycles must sum to 2n, there must always be an
even number of distinct odd cycles. This leads us to a case that will pair two odd
distinct cycle AGLn(2) conjugacy classes.
Lemma 5.5.5. Let b be an a ne permutation in AGLn(2), n > 2, with an odd
distinct cycle structure  . Let m be the number of cycles in  . If m 2 (mod 4),
then b and b 1 are in di erent A2n conjugacy classes.
49
Proof. Consider the cycles in the permutation b. If every cycle is reversed, we
have permutation b 1. We will consider the parity of the permutation  that
reverses every cycle by conjugation.
Let Ck = (b1b2   bk) be a cycle of b. De ne  on Ck so that  (bi) = bk i+1.
Then
 Ck  1 =  (b1b2   bk)  1
= ( (b1) (b2)    (bk))
= (bk   b2b1):
Since k is odd, the midpoint of the cycle stays  xed, and  is thus composed of
k 1
2 transpositions.
Extending  over all cycles, we see that  is composed of 2n m2 transpositions.
Since m  2 (mod 4), 2n m2 is odd. Therefore  is an odd permutation. By
lemma 5.5.4, b and b 1 must be in di erent A2n conjugacy classes.
Since the inversion action stays within AGLn(2), there must be an exact
pairing between the two A2n conjugacy classes within AGLn(2). Thus all distinct
odd cycle types with 2 (mod 4) cycles will split evenly in the A2n conjugacy
classes. Thus the only cycle types which can a ect the sum are the distinct odd
cycle types with 0 (mod 4) cycles.
The case of n = 5 is the  rst instance where there exists a odd distinct cycle
type with 0 (mod 4) cycles. There are two conjugacy classes of size 15237120
with a (1;3;7;21) cycle structure. A parity check between the two conjugacy
classes reveals that they di er by an even conjugation. Lemma 5.5.4 indicates
that the two conjugacy classes will be in the same A2n conjugacy class.
Let NS be the number of a ne equivalent 5-bit permutations. Computing
50
the number of a ne equivalent 5-bit even permutations NA yields
NA = Ns2 + 15237120
2 3 7 21
2jAGLn(2)j2
= 1284983020561981548:
Is should be noted that we only added half of the summand since the other
half was already included in NS2 . The correction factor turns out to only make a
di erence of 2, since Ns2 = 1284983020561981546.
In the cases where n = 4;6, there are no odd distinct cycle types with 0
(mod 4) cycles. Thus, the number of a ne equivalent even permutations is ex-
actly half for those two cases. For n = 7 there are 10 conjugacy classes with odd
distinct cycle type and 0 (mod 4) cycles.
Table 5.4: Number of A ne Equivalent Even Permutations
n A ne Equivalent in A2n
3 2
4 151
5 1284983020561981548
6 38115488450430370396302626146823126191571813695482842576562378932
The fact that there are 151 a ne equivalent classes in A16 will be used in the
classi cation of 4-bit permutations.
5.6 Open problems
 Develop a general method for computing the conjugacy classes of AGLn(2).
51
A method was developed in [Hou06], but the results are incomplete. Hou?s
calculation indicated that the number of a ne equivalent 5-bit permuta-
tions was 2569966041123938084 instead of 2569966041123963092. A veri -
cation check by the author revealed that Hou was missing two conjugacy
classes found by MAGMA. While the canonical form (conjugacy class) of
linear matrices has been well studied, the problem is surprisingly not com-
pletely solved for  nding the canonical form of a ne transformations.
 Develop a general method for counting the a ne equivalent double cosets in
A2n. Are there always half as many a ne equivalent double cosets in A2n
when n is even (n> 2)?
52
Chapter 6
Hamiltonian Cycles over GLn(2)
6.1 Results and Application
Identifying whether or not two functions are a ne equivalent is critical to the
development of a nonlinear hierarchy. This basic computation, which is essentially
a double coset membership test, must be done many times, and thus requires a
very e cient implementation.
With permutations stored as bit vectors, CNOTs can be applied by simple
masking, shifting and XORing. These operations are much faster to implement
than generically composing a permutation and a linear function.
Since the double coset test requires exhausting over all bit matrices in GLn(2),
it is natural to ask whether a Hamiltonian cycle exists over GLn(2) where the
vertices are the invertible matrices and the edges correspond to adding one row
to another.
The major results of this chapter are:
 A proof that the subgroups of upper triangular matrices in GLn(2) have
Hamiltonian cycles.
53
 A heuristic algorithm for generating Hamiltonian cycles over GLn(2)
 The algorithm found Hamiltonian cycles over GLn(2) for n = 2;3;4;5.
The Hamiltonian cycles found are used to construct a highly e cient double
coset test.
6.2 Background
A graph contains a Hamiltonian cycle if there exists a sequence of connected
edges that pass through each vertex only once, returning to the the start vertex
at the end. The Knight?s Tour problem of  nding a sequence of knight moves
that touch each square of the chess board once is a famous Hamiltonian cycle
problem. The Gray code is another famous example of a Hamiltonian cycle. In
general, determining whether or not a given graph has a Hamiltonian cycle is
NP-complete.
Given a group G and a generating set S, the Cayley graph  =  (G;S) is
constructed by assigning each element of G to a vertex, and then connecting two
vertices if their di erence is in S. Since S is a generating set,  (G;S) must be
connected.
The following lemma and corollary will be used to construct Hamiltonian
cycles in this chapter.
Lemma 6.2.1. Let  (G;S) be the Cayley graph for a group G and generating
set S = f 0; 1;:::g. Let f : Zn ! f0;1;2;:::jSj 1g be a map such that
H = [ f(0) f(1)    f(n 1)] is a Hamiltonian cycle on  . Then any cyclic rotation
of H is a Hamiltonian cycle on  , i.e. [ f(k) f(k+1)    f(k+n 1)] is a Hamiltonian
cycle on  for all k2Z.
54
Proof. Since H = [ f(0) f(1)    f(n 1)] is a Hamiltonian cycle, we may list the
elements of G in the following order.
 f(0)
 f(0) f(1)
 f(0) f(1) f(2)
...
 f(0) f(1) f(2)    f(n 1) = e
If we now act on the elements of G by multiplying on the left by   1f(0), we get
another ordering of the elements of the group G.
e
 f(1)
 f(1) f(2)
...
 f(1) f(2)    f(n 1)
Recall that H is a Hamiltonian cycle implies
e =  f(0) f(1) f(2)    f(n 1):
Conjugating both sides by  f(0) yields
e =  f(1) f(2)    f(n 1) f(0):
55
Moving the  rst term e to the end, now gives
 f(1)
 f(1) f(2)
...
 f(1) f(2)    f(n 1)
 f(1) f(2)    f(n 1) f(0):
This ordering is associated with the Hamiltonian cycle [ f(1) f(2)    f(n 1) f(0)].
Since a single circular shift will yield a new Hamiltonian cycle, any circular
shift will yield a new Hamiltonian cycle. Thus [ f(k) f(k+1)    f(k+n 1)] is a
Hamiltonian cycle for any k.
Corollary 6.2.2. Let  (G;S) be the Cayley graph for a group G and generating
set S. Let G1 be a subgroup of G. If H is a Hamiltonian cycle over the vertices
in G1, then any rotation of H is a Hamiltonian cycle over G1.
Proof. Apply Lemma 6.2.1 the Cayley graph  (G1;S1) where S1 is the subset of
S used in H.
Corollary 6.2.3. Let  (G;S) be the Cayley graph for a group G and generating
set S. Let C1 be a left coset of G. If H is a Hamiltonian cycle over the vertices
in C1, then any rotation of H is a Hamiltonian cycle over C1.
Proof. Any Hamiltonian cycle over a coset will also be a Hamiltonian cycle over
the associated subgroup.
The Lov asz Conjecture simply states that every  nite connected Cayley
graph contains a Hamiltonian cycle. Many sub-cases of the the Lov asz Conjecture
56
have been proved. These cases mostly deal with special generating sets for the
symmetric group Sn. For a survey of these results see [CG96]. The author is not
aware of any results over GLn(2).
6.3 Existence
The computational e ect of an AND, shift and XOR is the same as applying a
single CNOT gate, which is the same as the linear operation of adding one row
to another. This will be demonstrated in detail later. It is well known from
linear algebra that simple row operations will generate any matrix. Thus, the
computational speed-up relies on the solution of the following problem.
Problem 6.3.1. Starting from the identity matrix, is it possible to step through
all possible invertible matrices in GLn(2) using only a single row operation at
each step?
In other words, is there a Hamiltonian cycle over GLn(2) with the generating
set S =f ijg, where  ij is the operation of adding row i to row j?
For n = 2, the problem turns out to be quite easy. Since  2ij = 1 for
all i;j, the only choice is to alternate between  12 and  21. It turns out that
[ 12; 21; 12; 21; 12; 21], is a Hamiltonian cycle as follows:
[ 1 00 1 ]  12 ! [ 1 01 1 ]  21 ! [ 0 11 1 ]  12 ! [ 0 11 0 ]  21 ! [ 1 11 0 ]  12 ! [ 1 10 1 ]  21 ! [ 1 00 1 ]
Finding a solution for n = 3 turned out to be not so easy. Early attempts to
solve this problem found many near misses. Even though the ability to add any
row to any other row gives a lot of freedom, the extra freedom made searching
very di cult. Appealing to the Lov asz conjecture, a solution should exist for any
generating set. Consider the following lemma.
57
Lemma 6.3.2. Let G = GLn(2) and let  i be the row operation that adds row i
to row i + 1. For  n, the row operation will wrap around and add row n to row
1. Then S =f i for i2[1::n]g is a generating set for G.
Proof. Each  i =  i;i+1. Noting that  ik kj ik kj =  ij, one can build up any
 ij.
Thus, in the case wheren = 3, instead of consideringS =f 12; 13; 23; 21; 31; 32g,
the Lov asz conjecture implies that a Hamiltonian cycle should exist using only
S =f 1; 2; 3g.
Using the reduced generating set, the following 24 long sequence was found.
When the sequence is repeated 7 times, it generates all 168 matrices in AGL(3,2).
 1 2 1 2 1 2 1 3 1 2 1 2 1 2 1 3 1 2 1 2 1 3 1 2
The cases where n = 2;3 are special in that they cannot be found using the
heuristic algorithm presented later.
6.4 Borel Subgroup
Removing one generator from S = f ig, namely  n, leaves a set that generates
the lower triangular matrices, since the rows can only add down. This subgroup,
also known as the Borel group, is one generator away from GLn(2). Hamiltonian
cycles on cosets of this subgroup could be pieced together to form a Hamiltonian
cycle for GLn(2). This is how the heuristic algorithm to be presented later will
work.
Theorem 6.4.1. Let Bn be the lower triangular subgroup of GLn(2) and let
S =f i for i2[1::(n 1)]g. Then the Cayley graph  (Bn;S) has a Hamiltonian
cycle.
58
Figure 6.1: Extending the Active Cycle
ve
Active cycle
vg
 n
 1
 n
 n
 k  k
Coset cycle
Proof. For n = 3, one can verify that [ 1 2 1 2 1 2 1 2] is a Hamiltonian cycle.
By induction, we assume that a Hamiltonian cycle exists for Bn. Identify Bn
with the subgroup of Bn+1 that is zero everywhere o the diagonal in the  nal
row. The cosets of Bn partition Bn+1. Each coset can be traversed using the
Hamiltonian cycle for Bn.
Note that  n commutes with all of the generators in Bn except for  n 1. The
following process may be used to generate a Hamiltonian cycle for all of Bn+1.
1. Start with the Hamiltonian cycle associated with the subgroup Bn. This
will be the  rst state of what will be referred to as the active cycle. Note
that all the cosets of Bn may also be traversed by the same transition order
as the Hamiltonian cycle for Bn. The goal is to connect each coset cycle to
the active cycle, eventually enlarging the active cycle to cover all of Bn+1.
2. Starting at the identity, e, follow the Hamiltonian path on the active cycle
and test each vertex by applying  n to see if  n transitions to a vertex not
on the active cycle.
59
3. Let vg be the  rst vertex on the active cycle such that vg  n is not on the
active cycle. Since S is a generating set, the only time this can fail is when
we have generated the entire Hamiltonian cycle. (Note: In the case where
transition away from the active cycle from the initial point e, one of the
incoming or outgoing edges to e must not be  n 1. Thus, the rest of the
argument will still apply.)
4. Since  n commutes with all of the generators for Bn except  n 1, the edge
transitioning into vg must be  n 1. See  gure 6.1. Otherwise, commuta-
tivity would imply that we would have transitioned away from the active
cycle on the previous step.
5. Let  k be the edge transitioning out of vg on the active cycle. Then k6= n,
since  n must be taking us to a new coset cycle. And k6= n 1, since  2n 1 =
e, and the vertices are distinct on a Hamiltonian path. Thus  n k =  k n.
6. By corollary 6.2.3, there is a Hamiltonian cycle over the coset and by corol-
lary 6.2.2, we may rotate the cycle so that the last edge is  k.
7. Since the coset cycle is a Hamiltonian cycle, the product of all the tran-
sitions is the identity (return to the starting point). Thus, if we do not
complete the last transition, the product of all the previous transitions
must by   1k which is simply  k since it has order 2.
8. Extend the active cycle to include the points on the coset cycle by branching
out from the active cycle at vg. Continue around the Hamiltonian cycle from
the coset, but stopping before applying the  nal transition  k. Return back
to the active cycle via  n. The total action is equivalent to  n k n which is
60
equivalent to  k due to commutativity. Thus we return at exactly the next
vertex on the active cycle and can now complete the trip back to ve.
9. If the active cycle now covers all of Bn+1 stop. Otherwise, return to step 2
to extend the active cycle by another coset cycle.
6.5 Heuristic Algorithm
Although the evidence is clearly in favor of a Hamiltonian cycle existing for
GLn(2), the author was unable to  nd a solution. However, the following heuristic
algorithm was used to  nd Hamiltonian cycles over GLn(2) for n = 4;5. In this
case note that  n =  n;1. The  nal  n adds the nth position to the  rst position.
The only di erence between this algorithm and the algorithm for  nding
Hamiltonian cycles over the Borel group is that we cannot prove that this al-
gorithm succeeds. It is possible to build an active cycle that cannot be extended,
but is also not the full Hamiltonian cycle.
1. Start with the Hamiltonian cycle associated with the Borel subgroup Bn.
2. Apply  n to each vertex vg. If  n transitions to a coset not contained in
the active cycle, and the incoming and outgoing transitions for vg are not
both  1 and  n 1, then we have found a vertex from where we can extend
the active cycle. NOTE: Such a vertex may not exist. This is where the
algorithm could fail.
3. Let  k be the transition where k6= 1;n 1. If  k is an incoming transition,
then choose the vertex immediately preceding vg to be our branching point.
61
4. In exactly the same manner as the previous algorithm, insert the new coset
cycle in the active cycle in the place of  k.
5. If the active cycle now covers all of GLn(2) stop. Otherwise, return to step
2 to extend the active cycle by another coset cycle.
6.6 Open Problems
 Prove that a Hamiltonian cycle exists over GLn(2) using only row addi-
tions. This could include all possible row additions, not just the reduced
set considered here.
 Find a \Gray Code" style algorithm for the Hamiltonian cycle. The current
method only  nds a path whose description is essentially a long description
of what move to do at each step. For Gray Code, there is a function f(t)
which indicates which bit should be  ipped at time t. The ideal answer
would be a simple function f(t) = (a;b) indicating the next step would be
to add row a to row b.
62
Chapter 7
Double Coset Representatives
7.1 Results and Application
This chapter de nes a Basis Fixing Permutation (BFP) as a permutation
that  xes the zero vector and each vector of weight one. Each double coset
contains at least one, and usually many BFPs. Additional reductions can be
applied to  nd a minimal BFP from a given BFP. Unfortunately, many double
cosets still contain multiple minimal BFPs.
This set of minimal BFPs is used to identify a given double coset. Once
found, minimal BFPs can quickly discover which class a given permutation is in.
Unfortunately,  nding all the minimal BFPs is computationally intensive.
The major results of this chapter are:
 Every permutation is a ne equivalent to a basis  xing permutation (BFP).
 Each coordinate polynomial of a BFP has a single linear term.
 Given a lexicographic ordering and a BFP, a minimal BFP can be quickly
computed.
63
A ne equivalent representatives for each double coset are used to accelerate
the identi cation algorithms.
7.2 Background
Let us  rst recall some basic terms used with linear and a ne transformations.
For the following de nitions, let K be a  eld and the vi?s elements of a vector
space over K.
Definition 7.2.1. The linear span of m vectors v1;:::;vm2Kn is
( mX
i=1
aivi
  
  
 ai2K
)
:
Similarly, the a ne span of m+ 1 vectors v0;:::;vn2Kn is
( mX
i=0
aivi
  
  
 
mX
i=0
ai = 1;ai2K
)
:
Definition 7.2.2. A set of m vectors v1;:::;vm is linearly independent if
mX
i=1
aivi = 0 =) 8i;ai = 0:
Similarly, a set of n+ 1 vectors v0;:::;vn is a nely independent if
nX
i=0
aivi = 0 and
nX
i=0
ai = 0 =) 8i;ai = 0:
Definition 7.2.3. Given n, the a ne basis vectors are e0 =~0 and e1;:::;en,
where each ei is a bit vector with zeros in every position except a 1 in the ith
position, for i in 0;1;:::;n.
64
7.3 Basis Fixing Permutations
Definition 7.3.1. A Basis Fixing Permutation (BFP) is a function that
sends each a ne basis vector to itself. Thus, f is basis  xing if f(~0) =~0;f(~e1) =
~e1:::;f(~en) = ~en.
The  rst major reduction that can be made is realizing that every permutation
is a ne equivalent to a BFP. This was proved by the author and subsequently
discovered to be in [Hou06].
Theorem 7.3.2. Every permutation in S2nis a ne equivalent to a basis  xing
permutation.
Proof. Suppose there exists a set of n+1 a ne independent vectors x0;x1;:::;xn
such that p(x0);p(x1);:::;p(xn) is also a ne independent. Then there exist a ne
maps a0;a1 such that for all i, a0(ei) = xi and a1(p(xi)) = ei. Thus a1pa0(ei) = ei
for all i and a1pa0 is therefore basis  xing. Thus the proof reduces to  nding an
a ne independent set that maps to an a ne independent set via p.
We prove by induction that for every k in 0  k  n there exists an a ne
independent setXk such that p(Xk) is also a ne independent. Clearly fork = 0;1
any Xk and p(Xk) is a ne independent. Assume that for k < n, Xk is a ne
independent. LethXkiaf denote the a ne span of ak+1 element setXk. Consider
the permutation p, restricted as follows:
p : V2nnhXkiaf !V2nnp(Xk):
Since
jp(V2nnhXkiaf)j+jV2nnhp(Xk)iafj= 2(2n 2k) > 2n (k+ 1) =jV2nnp(Xk)j;
65
the intersection p(V2nnhXkiaf)\(V2nnhp(Xk)iaf) is non-empty, and thus there
exists an x not in the spanhXkiaf that maps to p(x) not in the span ofhp(Xk)iaf.
Adding x to Xk, we create a new set Xk+1 = Xk[x, such that Xk+1 and p(Xk+1)
are both a nely independent.
Using the  nal Xn constructed in this manner, we can now  nd maps a0 and
a1 so that a1pa0 is a basis  xing permutation.
Lemma 7.3.3. Let p be a BFP. Then each coordinate function pi of p has the
form
pi(x1;:::;xn) = xi +fi(x1;:::;xn)
where fi contains no constant or linear terms.
Proof. p(e0) = e0 = ~0 implies no coordinate function can contain a 1. Given
that all coordinate functions have no constant terms, p(ei) = ei indicates that
the linear term for each pi is xi.
7.4 Basis Permuting Permutations
One important subgroup of GLn(2) is the group of matrices with a single one in
each row and column. This subgroup is isomorphic to Sn acting on f1;2;:::;ng
via the injective map  : Sn!GLn(2) de ned by the outer product sum
 (p) =
nX
i=1
  e
p(i)
 he
ij:
Thus  (p) is the linear function that maps the basis vectorjeiito the basis vector
  e
p(i)
 for each i2f1;2;:::;ng.
In AGLn(2), there is a subgroup isomorphic to Sn+1 acting onf0;1;2;:::;ng,
which allows e0 to be permuted with the basis vectors. We de ne this subgroup
via the injective map  : Sn+1 !AGLn(2) de ned by
66
 (q) =
nX
i=1
   e
q(i)
 he
ij+
  e
q(0)
 he
ij
 +  e
q(0)
 :
The preceding a ne form is applied to a vectorjviby  rst multiplying by the
matrix component and then adding the vector component. Thus
 (q)jvi=
nX
i=1
   e
q(i)
 he
ijjvi+
  e
q(0)
 he
ijjvi
 +  e
q(0)
 :
Definition 7.4.1. A Basis Permuting Permutation (BPP) is a function
that sends each a ne basis vector to another. Thus, p is basis permuting if for
all i in 0;1;:::;n, f(~ei) = ~ej for some j in 0;1;:::;n.
Lemma 7.4.2. All of the elements of  (Sn+1) < AGLn(2) are basis permuting
permutations.
Proof. Given q 2Q, verifying that q(jeji) maps to   eq(j) is su cient to prove
that q is a BPP. Recall that
heijjeji=
8
><
>:
1 i = j6= 0
0 otherwise.
When q is evaluated at e0 =~0, only the a ne component   eq(0) does not zero
out. The result q(je0i) =   eq(0) is as expected. Evaluating q at ej where j6= 0
gives:
67
q(jeji) =
" nX
i=1
   e
q(i)
 he
ij+
  e
q(0)
 he
ij
 +  e
q(0)
 
#
(jeji)
=
nX
i=1
   e
q(i)
 he
ijjeji+
  e
q(0)
 he
ijjeji
 +  e
q(0)
 
=   eq(j) +  eq(0) +  eq(0) 
=   eq(j) :
Thus, q permutes the a ne basis vectors, and is therefore a BPP.
Lemma 7.4.3. Any basis  xing permutation conjugated by a basis permuting
permutation is a basis  xing permutation.
Proof. Let q be a BPP and p be a BFP. Then for i in f0;1;:::ng
q 1pq(jeji) = q 1p(  eq(j) )
= q 1(  eq(j) )
=   eq 1(q(j)) 
= jeji:
Since q 1pq  xes all a ne basis vectors, q 1pq is a BFP.
7.5 Minimal Basis Fixing Permutations
Conjugating by the elements in Q< AGLn(2), we can construct many di erent
BFPs from a given BFP. Determining which BFP is minimal depends on the
68
lexicographic order chosen. For Gr obner bases, grevlex is preferred, since it is
usually easier to compute an ideal with respect to that order.
Definition 7.5.1. The Graded Reverse Lexicographic or grevlex is a poly-
nomial ordering. The ordering of monomials is decided  rst by degree. If the
degree is the same, it decides ties by the degree of the variables in reverse order.
Example 7.5.2. Polynomials ordered by grevlex.
 x1x3 <x0x1x2 since the degree(x1x3) = 2 < 3 = degree(x0x1x2).
 x1x2 <x0x3 since the degrees are equal, but x0x3 has the higher numbered
variable (x3).
 x0x1x3x4 <x0x2x3x4 since the degrees are equal, but the second monomial
contains the highest variable not common to both (x2).
Definition 7.5.2. The Grevlex Permutation Order is an ordering of BFPs.
Compare two permutations p and q, by considering their coordinate functions
(p1;:::;pn) and (q1;:::;qn). In the  rst case where pi and qi di er, the order is
decided by the grevlex order between pi and qi.
For a given BFP p, there exists a minimal BFP q 1pq for q2Q according to
the grevlex permutation order. Unfortunately, there can be many such minimal
BFPs in a given double coset. The true minimal BFP would be the minimum of
all such minimal BFPs.
For computational purposes, the author currently searches for all minimal
BFPs, and then identi es the minimum BFP from this set. This information is
stored in a table.
Algorithm for permutation identi cation:
69
1. Compute BFP from permutation.
2. Compute minimal BFP from BFP.
3. Look up minimum BFP associated with minimal BFP in table.
Once the table is created, permutation identi cation is quite fast. Unfortu-
nately, constructing the table is computationally intensive.
7.6 Open Problems
 Find an e cient algorithm for computing the minimum BFP for a double
coset.
 How e ciently can the parity of a permutation be computed from a BFP?
 Given a basis  xing function F(x1;:::;xn) = (x1 + f1;:::;xn + fn), how
di cult is it to determine whether F is a permutation?
 How hard is it to count the quadratic permutations?
70
Chapter 8
Multiple Rank Invariant
8.1 Results and Application
When comparing and categorizing permutations, the double coset test must be
performed many, many times. If two permutations can be distinguished via a
quick test, the double coset test is not needed.
This chapter develops a new rank invariant that can distinguish certain double
cosets. The invariant essentially measures how much of a permutation is linear,
quadratic, cubic, etc.
The major results of this chapter are:
 A proof that the multiple rank invariant is consistent with a ne equivalent
double cosets.
 An e cient algorithm for computing the multiple rank invariant.
 A categorization of how many equivalence classes are of each type for n =
3;4.
The multiple rank invariant is used extensively to speed up the computation
of the a ne equivalence complexity tree.
71
8.2 Background
Every Boolean permutation can be decomposed intoncoordinate functions (f1;:::;fn).
Each coordinate function fi : GF(2)n!GF(2) can be expressed as a polynomial
over GF(x1;:::;xn). Since the value for each xi is restricted to be exactly 0 or 1,
the relationship x2i = xi holds for each xi. A polynomial will be called reduced if
no variable has degree greater than 1.
8.3 The k-rank
Consider the  nite vector space over GF(2) where the independent vectors are
the 2n reduced monomials from GF(x1;:::;xn), namely
1;fxigni=1;fxixjgni;j=1;i6=j;:::x1x2   xn:
Then similarly to how the rank of a set of vectors is calculated, we can calcu-
late the rank of a set of polynomials.
Example 8.3.1. Consider the following sets of polynomials.
 The set fx1;x2;x1 +x2g has rank 2, since the third term is the sum of the
 rst two.
 The setfx1;x2;x1x2ghas rank 3, since all three terms are linearly indepen-
dent.
This now leads to two important de nitions.
Definition 8.3.1. The k-span of the variables X =fx1;:::;xng, is the linear
span of all monomials of degree k or less. We will denote the k-span by Xk.
72
Definition 8.3.2. The k-rank of a set of polynomials P = fp1;:::;pmg is
rank(P) + rank(Xk) rank(P[Xk). We will denote the k-rank by Rk(P).
The k-rank essentially measures how much of the rank of P is due to polyno-
mials of degree k or less. For the k-rank of a permutation p = (f1;:::;fn), the
set of polynomials will be the set of coordinate functions, i.e. P =ff1;:::;fng.
Example 8.3.3. Consider the following sets of polynomials.
 Let P1 = fx1;x2;x1x2g. Then R1(P1) = 2, since two of the terms can be
expressed by linear polynomials. R2(P1) = 3 since all polynomials can be
expressed by degree two or less polynomials.
 Let P2 = fx1 + x1x2;x2 + x1x2;x1x2g. Then R1(P2) = 2 and R2(P2) = 3.
This can be veri ed by calculation or recognizing that P2 was derived from
P1 by adding the last term to each of the  rst two. It can also be seen
by reducing by the monomial x1x2 and realizing that all of the remaining
terms are linear.
Example 8.3.4. Consider the following permutations.
 Let p1 = (x1;x2;x3 + x1x2). Then P1 = fx1;x2;x3 + x1x2g. As before,
R1(P1) = 2 and R2(P1) = 3.
 Let p2 = (x1 + x3 + x1x2;x2 + x1x3 + x1x2;x3 + x1x2). Then P1 = fx1 +
x3 +x1x2;x2 +x1x3 +x1x2;x3 +x1x2g. In this case R1(P1) = 1 since f1 +f3
is linear and R2(P1) = 3 since all of the coordinate functions are at most
quadratic.
73
8.4 The Invariance of the k-rank
Lemma 8.4.1. All the coordinate functions of an n-bit permutation are linearly
independent.
Proof. Let p be a permutation, and let P =ff1;:::;fng. Suppose P was linearly
dependent. Then there would exist a non-zero linear combination L such that
L(f1;:::;fn) = 0. Since L is a non-zero vector, it can be extended to an invertible
matrix M where L is the  rst row of M. Note that since M is invertible, M is
also a permutation.
Consider now the composition M P. Since M and P are both permutations,
M P must also be a permutation. However, if we consider the  rst coordinate
function of M P it is L(f1;:::;fn) = 0. Thus M P cannot be a permutation,
since every coordinate function of a permutation must be balanced.
Theorem 8.4.2. The k-rank of a permutation is invariant under the action of
AGLn(2) for all k 1.
Proof. Recall the de nition of k-rank for a permutation p:
Rk(p) = rank(p) + rank(Xk) rank(p[Xk)
We wish to show that Rk(a p b) = Rk(p) for all a;b2AGLn(2). This can be
done in two parts. Note that a creates new coordinate functions by summing up
various coordinate functions and b substitutes a ne combinations of variables in
place of the original variables.
First, consider the e ect of a. By the previous lemma, ap must have full
rank so rank(ap) = rank(p). Also note that ap is the linear combination of the
coordinate functions of p with the possible addition of the a ne vector 1, which
74
is contained in Xk for all k. Thus rank(ap[Xk) = rank(p[Xk). Thus,
Rk(ap) = rank(ap) + rank(Xk) rank(ap[Xk)
= rank(p) + rank(Xk) rank(p[Xk) = Rk(p)
Now consider the e ect ofb. Appealing to the previous lemma again, rank(pb) =
rank(p). Thus we only need to verify that rank(pb[Xk) = rank(p[Xk).
Notice that the coordinate functions of pb are really the original coordinate
functions f1;:::;fn evaluated at y1;:::;yn where yi = bi(x1;:::;xn).
Let Yk be the k-span of the yi. Then Yk Xk since any polynomial of degree
k+ 1 is the sum of monomials of degree less than or equal to k+ 1. Since b is an
invertible a ne function, we can write each xi as a a ne function of y1;:::;yn.
Thus, by the same argument Xk Yk and therefore Xk = Yk. Thus,
rank(pb[Xk) = rank(pjY [Yk) = rank(pjX[Xk = rank(p[Xk):
This implies,
Rk(pb) = rank(pb)+rank(Xk) rank(pb[Xk) = rank(p)+rank(Xk) rank(p[Xk) = Rk(p):
Since neither a ne function a or b a ects the rank, we can apply them in
either order and deduce that Rk(apb) = Rk(p).
Therefore the k-rank is invariant under a ne equivalence.
8.5 Multiple Rank Invariant
Before de ning the multiple rank invariant, it should be noted that no information
is gained using the n-rank invariant.
75
Lemma 8.5.1. No n-bit permutation has a coordinate function with an x1x2   xn
term.
Proof. Considering the truth table of x1x2   xn for n > 1, it only takes a non-
zero value at one point, namely (1;1;:::;1). Every other monomial is non-zero
at an even number of points. Since the coordinate function of a permutation
must be balanced, and therefore have an even number of 0?s and 1?s (for n> 1),
x1x2   xn cannot be contained in any coordinate function because it would make
the weight odd, and therefore not balanced.
Thus we only need to consider monomials of degree up to n 1 for permutation
coordinate functions.
Definition 8.5.1. The multiple rank invariant of an n-bit permutation is an
n 1 tuple of k invariant di erences. I(p) = (R1(p);R2(p) R1(p);:::;Rn 1(p) 
Rn 2(p).
Example 8.5.2. Consider the following permutations.
 Let p1 = (x1;x2;x3 +x1x2). As computed earlier, R1(P1) = 2 and R2(P1) =
3. Thus the multiple rank invariant is the tuple (2;1).
 Let p2 = (x1;x1 + x2;x3 + x1x2x4;x4 + x1x2). By inspection, R1(P2) = 2,
R2(P2) = 3 and R3(P2) = 4. Thus the multiple rank invariant is the tuple
(2;1;1).
8.6 Categorization of 3-bit Permutations by MRI
For the four 3-bit equivalence classes, the multiple rank invariant completely
distinguishes each class.
76
Table 8.1: Multiple Rank Invariant of 3-bit Permutations
MRI # of classes Description
(3,0) 1 A ne
(2,1) 1 To oli
(1,2) 1
(0,3) 1
We will discover later that the multiple rank invariant also captures the rela-
tive complexity of all 3-bit permutations, ordering them in terms of those requir-
ing 0,1,2 or 3 To oli gates.
8.7 Categorization of 4-bit Permutations by MRI
As seen in Table 8.2, the multiple rank invariant separates many of the 302 4-bit
equivalence classes, but fails to distinguish among some of the most prevalent
permutation types.
When the complexity tree is developed later, we will see that the multiple rank
invariant has a rough correlation to the nonlinear complexity of a permutation.
8.8 Open Problems
For n  4 the multiple rank invariant is always the same for p and p 1, even
when they are in di erent double cosets. Is this true for all n?
77
Table 8.2: Multiple Rank Invariant of 4-bit Permutations
MRI # of Classes Description
(4,0,0) 1 A ne
(3,1,0) 1 To oli
(3,0,1) 1 Triple controlled NOT
(2,2,0) 2
(2,1,1) 3
(2,0,2) 4
(1,3,0) 3
(1,2,1) 9
(1,1,2) 20
(1,0,3) 13
(0,3,1) 5
(0,2,2) 39
(0,1,3) 127
(0,0,4) 74
78
Chapter 9
Additional Invariants
9.1 Results and Application
In addition to the multiple rank invariant, there are a handful of other minor
invariants that can be used to re ne the double coset identi cation process. This
chapter presents a number of these invariants and their associated proofs.
The major results of this chapter are:
 The parity of a permutation is invariant under a ne equivalence.
 De nition of a two-way permutation and its use as an invariant.
These additional invariants are used extensively to speed up computation of
the a ne equivalence complexity tree.
9.2 Parity
The parity of a permutation is often a useful characteristic, and it will play a
central role in classifying the complexity of permutations.
79
Lemma 9.2.1. Any permutation that does not involve all of the bit wires is an
even permutation.
Proof. Let p be a permutation that does not involve a certain wire. Without loss
of generality, we can assume that the high bit is not involved in the permutation.
Then p can be decomposed into two disjoint cycle sets: those whose high bit is
0, and those whose high bit is 1. Furthermore, since the high bit is not involved
in p, the two cycle sets must have exactly the same structure. Thus the overall
permutation must be even.
Corollary 9.2.2. For n > 2, the parity of a permutation is invariant under
a ne equivalence.
Proof. All a ne permutations are generated by the NOT gate on any single bit,
and by CNOT gates between any two bits. When n > 2, there is at least one
bit that is not involved in the action of the NOT or the CNOT. Thus by Lemma
9.2.1, CNOT and NOT are both even permutations, and therefore any a ne
permutation is an even permutation.
As proved earlier, there are an equal number of even and odd double cosets.
The parity test is independent of the multiple rank invariant, distinguishing some
permutations with identical MRIs.
9.3 Two-Way Permutations
A one-way permutation is a permutation which is easy to compute, but its inverse
is hard to compute. Proving the existence of one-way permutations has proven
extremely di cult, and is closely related to the question of P ?= NP. We will not
80
be solving this problem here, but rather introduce the notion of the opposite of
a one-way permutation.
Definition 9.3.1. A permutation p, is a two-way permutation, if there exist
a;b2AGLn(2) such that apb = p 1.
Thus a two-way permutation is a ne equivalent to its inverse. This implies
that if a method was found to compute p (respectively p 1) faster, that would
automatically yield a method for computing p 1 (respectively p) faster. Thus no
two-way permutation has a chance of being a one-way permutation.
Lemma 9.3.2. If p is a two-way permutation, then every permutation a ne
equivalent to p is a two-way permutation.
Proof. Assume p is a two-way permutation. Then there exist a;b 2 AGLn(2)
such that apb = p 1. Consider any other element in the a ne double coset of p,
having the form gph. Then
(gph) 1 = h 1p 1g 1 = h 1apbg 1 = h 1ag 1(gph)h 1bg 1:
Thus the inverse of gph is a ne equivalent to gph. Therefore gph is a two-way
permutation.
Thus if an a ne equivalent double coset contains a two-way permutation,
then the entire double coset is made up of two-way permutations. This implies
that \two-way-ness" is invariant over the double coset. Now let us consider the
relationship between involutions and two-way permutations.
Lemma 9.3.3. If a permutation p a ne equivalent to an involution, then p is a
two-way permutation.
81
Proof. Suppose p is in a double coset containing an involution i. Then there exist
a;b2AGLn(2) such that apb = i. Thus,
(apb)(apb) = i2 = 1
pbapba = 1
bapba = p 1:
Therefore, p is a ne equivalent to p 1 and is therefore a two-way permutation.
It is unknown if the converse is true. Table 9.1 shows how the parity and
two-way invariants further re ne permutation identi cation.
9.4 Open Problems
 Is every two-way permutation a ne equivalent to an involution? For n 4
it is true.
 Are there other useful invariants?
82
Table 9.1: Invariant Table for 4-bit Permutations
Two-way not Two-way
MRI Even Odd Even Odd
(4,0,0) 1
(3,1,0) 1
(3,0,1) 1
(2,2,0) 2
(2,1,1) 1 2
(2,0,2) 3 1
(1,3,0) 3
(1,2,1) 1 4 4
(1,1,2) 11 3 2 4
(1,0,3) 4 9
(0,3,1) 3 2
(0,2,2) 12 3 10 14
(0,1,3) 14 35 42 36
(0,0,4) 32 26 6 10
83
Chapter 10
Complexity Theory
10.1 Results
The goal of this chapter is to illustrate the usefulness of determining the complex-
ity of a function by measuring the nonlinearity of a function. The major results
of this chapter are as follows:
 De nition of new complexity class based on nonlinearity of a computation.
 Proofs relating nonlinear complexity to AC and NC.
 Proofs relating the classical and nonlinear versions of P and P=poly.
10.2 Background
An excellent introduction to the complexity classes relevant to circuits is given in
[Vol99]. This section provides important complexity class de nitions from that
book.
To determine a circuit class, we  rst choose a basis for constructing functions.
Let us consider two of the most common bases.
84
Definition 10.2.1. B0 =f:;_2;^2gis the standard bounded fan-in basis. B1 =
f:;(_n)n2N;(^n)n2Ng is the standard unbounded fan-in basis.
In B0 you can only take the AND or OR of two values, whereas in B1 you
may take the AND or OR of as many wires as you wish. The depth of a circuit
is the length of the longest path from the input bits to the output bits.
Definition 10.2.2. Let B be a basis and let s;d : N ! N. We de ne the
following complexity classes:
1. SIZEB(s) is the class of all sets A  f0;1g for which there is a circuit
family C over basis B of size O(s) that accepts A as an input.
2. DEPTHB(d) is the class of all sets A f0;1g for which there is a circuit
family C over basis B of depth O(d) that accepts A as an input.
3. SIZE-DEPTHB(s;d) is the class of all sets A f0;1g for which there is a
circuit family C over basis B of size O(s) and depth O(d) that accepts A
as an input.
Two important complexity classes capture the notion of what circuits can be
parallelized e ciently: NC and AC.
Definition 10.2.3. For i 0, de ne
NCi = SIZE-DEPTHB0(nO(1);(logn)i);
and let
NC =
[
i 0
NCi:
Similarly, for i 0, de ne
ACi = SIZE-DEPTHB1(nO(1);(logn)i);
85
and let
AC =
[
i 0
ACi:
Thus NC and AC both represent polynomial sized circuits with polylog depth.
The only di erence is AC allows unbounded fan-in (using basis B1) as opposed
to bounded fan-in for NC (using basis B0). The following two results are well
known in complexity theory.
Theorem 10.2.4. For i 0,
NCi ACi:
Corollary 10.2.5.
NC = AC:
For a proof of these results, see [Vol99].
10.3 Nonlinear Complexity Class
Similar to the notion of size and depth for classical circuits, we introduce the
notions of width and depth for nonlinear reversible circuits. We also need to
introduce the notion of a nonlinear basis. Recall that Tijk is a To oli gate with
control wires i;j acting on the target wire k. Also recall that (i;j;A) represents
the controlled a ne permutation with i control wires dictating whether or not
the j j a ne transformation A is applied to the next j wires.
Definition 10.3.1. Let C be a reversible circuit acting on n wires. Then BT =
fTijk where i;j;k2[0;n)g is the To oli basis for C. BCA =f(i;j;A) where i2
[1;n 1);j2[1;n i) and A2AGLn(j)g is the controlled a ne basis for C.
86
The depth of a nonlinear circuit will be the number of times the nonlinear
basis had to be used. Note that multiple nonlinear basis gates could be used in
a single round as long as they were operating on independent wires.
Example 10.3.2. The 6-bit permutation computing T0;1;2 and T3;4;5 could be
done in one round of nonlinearity since the two To oli gates are acting on inde-
pendent sets of wires.
Definition 10.3.2. Let B be a nonlinear basis and let w;d : N!N. We de ne
the following complexity classes:
1. WIDTHB(s) is the class of all functions f :f0;1g !f0;1g for which there
is a reversible circuit family C over basis B using O(w) wires that computes
f.
2. DEPTHB(s) is the class of all functions f :f0;1g !f0;1g for which there
is a reversible circuit family C over basis B using O(d) rounds of B that
computes f.
3. WIDTH-DEPTHB(s) is the class of all functions f : f0;1g !f0;1g for
which there is a reversible circuit family C over basis B using O(w) wires
and O(d) rounds of B that computes f.
In the same spirit in which NC and AC were de ned, we now de ne the class
of nonlinear circuits with polynomial width and polylog depth.
Definition 10.3.3. For i 0, de ne
NLCi = WIDTH-DEPTHBCA(nO(1);(logn)i);
87
and let
NLC =
[
i 0
NLCi:
Note that NLC is de ned using the controlled a ne basis BCA. Another
complexity class similar in nature to NCi can be de ned using the To oli basis
BT.
Although NLC may seem overpowered, given its ability to perform arbitrary
a ne transformations between every nonlinear step, we will see that NLCi  ts
nicely within the AC hierarchy.
Lemma 10.3.4.
ACi NLCi:
Proof. Let C be a circuit in ACi with s(n) gates and d(n) depth. Since there are
only s(n) gates in the circuit, there are at most s(n) +n di erent wires to join in
a gate at any given time: n for the inputs and s(n) for the output of each gate.
Thus the maximal fan-in gate for the circuit has less than s(n) +n inputs.
Construct a nonlinear circuit of width w(n) = (s(n)+1)(s(n)+n). This space
will be allocated according to table 10.1.
Table 10.1: Space Requirements
Size Description
n Initial input
s(n)(s(n) +n) Wires to hold input for each of the s(n) gates.
s(n) Wires to hold output for each of the s(n) gates.
88
To populate the circuit with gates, we will follow the  ow of the original
circuit C. Let Gi be the set of gates computed at depth i in the circuit C. For
each gate g2Gi, we construct it using the following steps.
1. Using CNOTs, copy the value of the inputs to g to the input staging area
allocated for g.
2. If g is an OR gate, negate all input wires.
3. Apply a multiple controlled NOT from the inputs of g to the output wire
allocated for g.
4. If g is an OR gate, negate the output wire.
Note that step 3 is the only nonlinear step. All CNOTs and negations are
a ne functions, and can be computed in the linear round between each nonlinear
round.
Furthermore, since each g2Gi only depends on values from earlier rounds,
all g 2Gi can be computed in the same nonlinear round. Thus the nonlinear
circuit will also have depth d(n).
Since C 2 ACi, s(n) is polynomial. Thus w(n) = (s(n) + 1)(s(n) + n) is
also polynomial. Also, C 2 ACi implies d(n) = O((logn)i). Therefore, ACi  
NLCi.
Lemma 10.3.5.
NLCi NCi+1:
Proof. Recall that
NLCi = WIDTH-DEPTHBCA(nO(1);(logn)i):
89
This implies
NLCi =
[
k 0
WIDTH-DEPTHBCA(nk;(logn)i):
We will prove that for each k,
WIDTH-DEPTHBCA(nk;(logn)i) NCi+1:
The construction of NLCi by union will then imply that NLCi NCi+1.
Let C be a nonlinear circuit in WIDTH-DEPTHBCA(nk;(logn)i). Then C
consists of O((logn)i) rounds of alternating linearity and nonlinearity on a width
of O(nk) wires. Let us consider the depth in NC terms of the linear and nonlinear
rounds separately.
In each linear round, an a ne matrix acts on the O(nk) wires. Since each
output wire of the a ne transformation, is essentially the parity of some subset
of the wires, and PARITY is in NC1 (polynomial gates, log depth), the a ne
transformation can be computed with polynomial many gates and O(log(nk)) =
O(klogn) = O(logn) depth.
In each nonlinear round, multiple controlled a ne transformations are ap-
plied. For each wire i compute the output value ai as if the a ne transformation
it is associated with was applied. Similar to the reasoning above, this can be done
in O(logn) depth and polynomial gates. Also, for each i compute the control
value ci which is the AND of the control wires for the a ne function associated
with i. The AND of O(nk) inputs can also be computed in O(logn) depth in NC.
Let ni be the original value of wire i with no transformation applied. The correct
output for wire i is then
(ni^:ci)_(ai^ci):
Examination veri es that if all the controls are 1, the output of wire i will be ai,
90
and otherwise ni. This  nal computation only requires depth 2, thus the total
depth for the nonlinear round is O(logn).
Since the nonlinear circuit has O((logn)i) linear and nonlinear rounds, and
each round has NC depthO(logn), the entire circuit will have NC depthO((logn)i) 
O(logn) = O((logn)i+1).
Since the entire circuit uses only polynomially many gates,
WIDTH-DEPTHBCA(nk;(logn)i) NCi+1:
Furthermore, since this inclusion holds for all k,
NLCi NCi+1:
Theorem 10.3.6.
NLC = AC = NC:
Proof. Combining the two previous lemmas, we get
NCi ACi NLCi NCi+1  ACi+1  NLCi+1:
Since NC, AC and NLC are all in nite unions, this is su cient to prove NLC=
AC= NC.
10.4 Nonlinear Polynomial Complexity
Consider now the complexity class with polynomial width and a polynomial num-
ber of nonlinear rounds.
Theorem 10.4.1.
WIDTH-DEPTHBCA(nO(1);nO(1)) = SIZEB1(nO(1)):
91
Proof. For each k, let C be a circuit class in SIZEB1(nk). Convert every AND and
OR gate into NAND gates using NOT gates on the inputs and outputs as needed.
Let s(n) be the number of NAND gates necessary to compute C. Construct a
nonlinear circuit of width n + s(n). The  rst n wires will contain the function
input, and the last s(n) will be used to hold the output from each of the s(n)
NAND gates. Populate the circuit by computing the NAND gates in the same
order as for the circuit computation, storing each output in the s(n) available
positions. In the worst case, each NAND gate must be computed in a separate
nonlinear depth. Thus the depth of the nonlinear circuit is also O(nk). This
circuit conversion implies
SIZEB1(nk) WIDTH-DEPTHBCA(nk;nk):
Since this containment holds for any k,
SIZEB1(nO(1)) WIDTH-DEPTHBCA(nO(1);nO(1)):
Alternatively, for each k let C be a circuit class in WIDTH-DEPTHBCA(nk;nk).
Let us consider the number of gates required in terms of the linear and nonlinear
rounds separately.
In the linear rounds, we need to compute a matrix multiply followed by an
a ne shift. This can be done in O((nk)2) = O(n2k) gates.
Similarly, in the nonlinear rounds, compute the potential a ne output for
each bit ai for each wire i. This also requires O((nk)2) = O(n2k) gates. Also,
for each i compute the control value ci which is the AND of the control wires for
the a ne function associated with i. Finally, let ni be the original value of wire
i with no transformation applied. The correct output for wire i is then
(ni^:ci)_(ai^ci):
92
Using O(n2k) gates for each of the O(nk) wires potentially requires O(n3k) gates.
Combining together all O(nk) rounds of nonlinearity, we require O(n3k)  
O(nk) = O(n3k gates. Thus,
WIDTH-DEPTHBCA(nk;nk) SIZEB1(n4k):
Since this containment holds for any k,
WIDTH-DEPTHBCA(nO(1);nO(1)) SIZEB1(nO(1)):
Combining the two containments, we now conclude
WIDTH-DEPTHBCA(nO(1);nO(1)) = SIZEB1(nO(1)):
10.5 Uniform and Non-Uniform Nonlinear Cir-
cuits
Theorem 10.4.1 states that any nonlinear circuit with polynomial width and depth
can be converted into a circuit with polynomial gates. We can therefore derive a
relation between nonlinear circuits and the complexity classes P and P=poly.
The di erence between uniform and non-uniform polynomial circuits is subtle
yet important. For a circuit class to be uniform, all circuits must be described by a
 nite algorithm, or  nite set of instructions. Even though the circuit generated by
the algorithm may get larger and larger as n increases, the same  nite description
is used to generate every circuit.
For non-uniform circuit classes, there is no blueprint. Each circuit may be
di erent. Even if each circuit is polynomially sized, the collective description of
the circuits must be in nite.
93
Definition 10.5.1. De ne the following classes:
 P is the class of uniform circuits that run in polynomial time.
 P=poly is the class of non-uniform circuits that run in polynomial time.
 nonlinearP is the class of uniform circuits with polynomial width and rounds
of nonlinearity,
 nonlinearP=poly is the class of non-uniform circuits with polynomial width
and rounds of nonlinearity,
Corollary 10.5.2.
nonlinearP = P and nonlinearP=poly = P=poly:
Proof. Using theorem 10.4.1 we can convert back and forth between polynomial
circuits and nonlinear circuits with polynomial width and rounds of nonlinearity.
Since the conversion description is  nite, any uniformity will be preserved.
10.6 Open Problems
The results of this chapter show that measuring the complexity of a function or
algorithm via its nonlinearity captures most of the essence of \polynomialness".
It is hoped that this point of view will be useful in developing new complexity
proofs.
As quantum computing was part of the inspiration for  nding optimal re-
versible computation, it is interesting to consider what nonlinearity means in the
quantum computing context. For quantum computing, only the CNOT and arbi-
trary single unitary gates on qubits are needed for computation. It is unclear to
94
the author what subset of this computation should be chosen to form the a ne
equivalence.
95
Chapter 11
Computation
Due to the doubly exponential nature of boolean permutations, extremely e cient
computational methods are needed to perform many of the calculations in this
paper. This chapter outlines a number of the more important methods used. To
describe any function f : Vn!Vn, n2n storage bits are required. Even though
n-bit permutations theoretically require onlydlog2(2n!)e n2n 2n storage bits,
it is useful to store the permutations using all n2n bits, especially since there are
cases where we are unsure whether or not we have a permutation. Each of the
following algorithms will assume that all n2n bits are in a single multi-precision
integer.
Note that for n = 4, n2n = 64, and thus a 4-bit permutation may be stored
as a single word on a 64-bit machine. Thus the computation time increases
signi cantly when extending beyond 4-bit permutations.
11.1 Storage of Permutations
There are multiple ways to store bit permutations, and there are algorithmic
reasons for preferring each. In this section, we will discuss the three types used
96
in this paper and how to convert between them.
11.1.1 Truth Table Form
We can always describe a permutation using its truth table. Recall the example
permutation considered earlier:
Table 11.1: Truth Table: f = (a2 a0a1;a1 1;a2 a0a1 a0)
a0a1a2 b0b1b2
0 0 0 0 1 0
0 0 1 1 1 1
0 1 0 0 0 0
0 1 1 1 0 1
1 0 0 0 1 1
1 0 1 1 1 0
1 1 0 1 0 0
1 1 1 0 0 1
Given the values in a particular order, storing the inputs is unnecessary and
we only need to store the output of the permutation in the correct order. Since
the author works on a little endian architecture, it was easier to store the least
signi cant bits on the right. Thus the truth table is stored in the following order:
f(111)f(110)f(101)f(100)f(011)f(010)f(001)f(000):
Thus the permutation is stored as the binary word
w = 001100110011101000111 010|{z}
f(000)
:
97
Extract the value f(i) shifting w by i n bits to the right and then masking
o the right n bits (or low n bits). Note that
(1 <<n) 1 = 2n 1 = 1111111   12| {z }
n ones in binary
:
Thus to extract the value f(i) from the word w, we execute:
f(i) = ( w >> (i*n) ) & ((1<<n)-1)
11.1.2 Function Table Form
Another way to look at a permutation is essentially turning the truth on its side.
Each row is then the truth table of coordinate function. Again, due to little
endian-ness, we store the coordinate function of the least signi cant bit function
on the right, i.e. f2f1f0. Each bit function will also have its least signi cant bits
to the right. Thus for our example, each fi will be in the following order:
fi(111)fi(110)fi(101)fi(100)fi(011)fi(010)fi(001)fi(000):
The example permutation will be stored as follows:
w = 1001101000110011 01101010| {z }
f0
:
Obtain the truth table for the coordinate function fi shifting w by i 2n bits
to the right and then masking o the low 2n bits.
Thus to extract the value f(i) from the word w, we execute:
f_i = ( w >> ( i*(1<<n) ) ) & ( ( 1 << (1<<n) ) - 1 )
98
11.1.3 Polynomial Form
Since any boolean function can be represented as a polynomial, we can also
specify each coordinate function by indicating which terms are present in the
polynomial.
Given i where 0 i< 2n, i can represent each possible monomial by consider-
ing the binary expansion. The kth bit of i indicates whether or not the monomial
contains xk in the product. In the special case where i = 0, the monomial is just
1. An example conversion is given in table 11.2.
Table 11.2: Monomial Conversion Table
Position Binary Monomial
0 0 0 0 1
1 0 0 1 a0
2 0 1 0 a1
3 0 1 1 a1a0
4 1 0 0 a2
5 1 0 1 a2a0
6 1 1 0 a2a1
7 1 1 1 a2a1a0
Recall that the example permutation has coordinate functions:
b0 = a2 a0a1 1
b1 = a1
b2 = a2 a0a1 a0
99
Thus, coordinate function b0 will have 1?s in the positions 0;3;4, and would
have binary form 00011001.
Again the polynomials will be stored with the least signi cant bit functions
on the right so the example permutation would be stored as:
0001101000000100 00011001| {z }
b0
:
Obtain the polynomial for the coordinate function fi shifting w by i 2n bits
to the right and then masking o the low 2n bits.
Thus to extract the polynomial bi from the word w, we execute:
p_i = ( w >> ( i*(1<<n) ) ) & ( ( 1 << (1<<n) ) - 1 )
11.2 Quick double coset test
Recall that two permutations p and q are a ne equivalent if there exist a ne
functions a0 and a1 such that p = a0qa1. Thus testing two permutations for
equivalence is an instance of the Double-Coset Membership (DCM) problem.
Generic methods for solving DCM proved too slow for our application, so a DCM
test speci cally for a ne equivalence was developed.
11.2.1 Composition via a Hamiltonian Path
Note that p = a0qa1 implies a 10 = qa1p 1 or simply that qa1p 1 is an a ne
function. Computing two successive permutation compositions is quite demand-
ing, especially if we have to exhaust over all values for a1. We will try to avoid
calculating full compositions by computing qa1 for successive values of a1 using
a gray-code type ordering, and then testing qa1p 1 for linearity.
100
Using the Hamiltonian cycles developed in Chapter 6, we can successively
step through the values qa1 for all a1 2AGLn(2). Each step on the Hamiltonian
cycle is a CNOT, which can be implemented using a mask, shift and XOR. Each
qa1 is then composed with p 1 and tested for linearity.
11.2.2 Linearity Test
Given a permutation, how di cult is it to determine whether or not the permu-
tation is an a ne permutation? Since most of the linear tests are expected to
fail, the strategy used for the double coset test tries to  nd failures quickly.
Instead of computing the entire composition of qa1p 1, we can compute only 4
values of the truth table, and verify that they satisfy the expected a ne relation.
Example 11.2.1. Suppose t was an a ne 3-bit permutation. Then
t(000) +t(001) +t(100) +t(101) = 000:
Essentially, any a ne relation that held before t was applied must also hold after
t was applied.
For 4-bit permutations, this test discovers 12=13 of nonlinear permutations.
(There are 13 possibilities for the fourth value, of which only one satis es the a ne
relation.) This fast failure allows us to avoid computing an entire permutation
composition most of the time.
11.3 Conclusion
The specialized double coset membership test allows the entire 4-bit permutation
space to be explored in a matter of hours, as opposed to early implementations
which had an expected runtime of months.
101
Chapter 12
Classi cation of 3-bit and 4-bit Permutations
12.1 Summary
The machinery developed in the preceding chapters now gives us the opportunity
to examine the nonlinear complexity of various permutations. We will  rst de ne
some common permutations, and then compute their nonlinear complexity. The
assortment of problems chosen aims to provide some insight into what problems
are more di cult to compute.
12.2 Permutations of Interest
An n-bit permutation permutes 2n elements. Some permutations are naturally
de ned on sets whose size is not exactly a power of two. We are still interested in
the complexity of these permutations. In this initial classi cation of the nonlinear
complexity, we will consider permutations to be  xed in the positions which they
do not specify.
To save space, we will describe the bit permutations using integers instead
of binary. I.e. 6 = 1102. Thus, 3-bit permutations will permute the set
102
f0;1;2;:::;7g and 4-bit permutations will permute the set f0;1;2;:::;15g.
12.2.1 Modular Addition
Given a positive integer m where 2n 1 <m 2n and an integer a where 0 b<
m, de ne an n-bit permutation  by
 (x) =
8
><
>:
x+b mod m if x<m
x otherwise
Such a permutation will be referred to as the permutation x+b mod m.
Although we will consider  to be constant for values of x m, it may be
advantageous to allow some permutation of the upper values in  nding a per-
mutation with minimal complexity. For simplicity, this paper will only consider
permutations with  xed values outside of the modular range.
Example 12.2.1. The 3-bit permutation x+ 2 mod 6.
x 0 1 2 3 4 5 6 7
x+ 2 mod 6 2 3 4 5 0 1 6 7
12.2.2 Modular Multiplication
Given a positive integer m where 2n 1 < m  2n and an integer a where
gcd(a;m) = 1, de ne an n-bit permutation  by
 (x) =
8
><
>:
ax mod m if x<m
x otherwise
Such a permutation will be referred to as the permutation ax mod m.
103
Example 12.2.2. The 3-bit permutation 2x mod 5.
x 0 1 2 3 4 5 6 7
2x mod 5 0 2 4 1 3 5 6 7
12.2.3 Modular A ne
Given a positive integer m where 2n 1 < m  2n and integers a;b where
gcd(a;m) = 1 and 0 b<m, de ne an n-bit permutation  by
 (x) =
8
><
>:
ax+b mod m if x<m
x otherwise
Such a permutation will be referred to as the permutation ax+b mod m.
Example 12.2.3. The 4-bit permutation 5x+ 6 mod 11.
e 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
5x+ 6 mod 11 6 0 5 10 4 9 3 8 2 7 1 11 12 13 14 15
12.2.4 Modular Inverse
Given a positive integer m where 2n 1 < m 2n, de ne a n-bit permutation  
by
 (x) =
8
><
>:
x 1 mod m if x<m;(x;m) = 1
x otherwise
Thus is x is invertible modulo m,  (x) = x 1, otherwise x is  xed. Such a
permutation will be referred to as x 1 mod m.
Example 12.2.4. The 3-bit permutation x 1 mod 7.
x 0 1 2 3 4 5 6 7
x 1 mod 7 0 1 4 5 2 3 6 7
104
12.2.5 Modular Exponentiation
Given a prime p where 2n 1 < p  2n and a primitive root a, de ne an n-bit
permutation  by
 (x) =
8
><
>:
ax mod p if 0 <x<p
x otherwise
Such a permutation will be referred to as ax mod p.
Example 12.2.5. The 3-bit permutation 2x mod 5.
x 0 1 2 3 4 5 6 7
2x mod 5 0 2 4 3 1 5 6 7
12.2.6 Pseudo-Inverse over GF(2)n
Let f(x) 2 GF(x) be a primitive polynomial of degree n. Then GF(x)=f(x)
is isomorphic to GF(2n). Since f is primitive, powers of x generate all 2n 1
nonzero elements of GF(2n). The pseudo-inverse is the map  7! 2n 2, which
maps each nonzero element to its inverse and maps 0 to itself.
 ( ) =  2n 2
Each element of the  nite  eld will be expressed as a binary integer as follows:
 = an 1xn 1 +   +a1a+a0 = (an 1   a1a0)2:
Such a permutation will be referred to as x 1 mod f.
Example 12.2.6. The 3-bit permutation   1 mod x3 +x+ 1.
 0 1 2 3 4 5 6 7
  1 0 1 5 7 6 3 4 2
105
12.3 Complexity Class NLC(i;j)
The complexity trees used in this chapter were generated using the To oli gate
as the nonlinear transition. In the 4-bit case, where the To oli gate can only
generate even permutations, a triply controlled NOT or CCCNOT was allowed
for the  nal nonlinear round instead. We will use the following de nition in this
chapter.
Definition 12.3.1. The complexity class NLC(i;j) = WIDTH-DEPTHB(i;j),
where the nonlinear basis B is a To oli gate and a Ci 1-NOT (A NOT gate with
i 1 controls). Additionally, the Ci 1-NOT gate is only allowed in the  nal
nonlinear round.
Thus NLC(i;j) includes all permutations that can be computed in width less
than or equal to i, with at most j nonlinear rounds (as restricted in de nition).
The theory of the preceding chapters was used to compute the results in the
following sections.
12.4 3-bit Permutations
Since all 2-bit permutations are a ne functions, 3-bit permutations are the  rst
nontrivial case to examine. 3-bit permutations are also unique due to the fact
that the Tofolli gate is an odd permutation, and therefore generates all of S23
with AGL3(2).
As established earlier, there are four a ne equivalence classes of the 8! =
40320 3-bit permutations. Using the To oli gate as our nonlinear transition, we
can generate a canonical representative for each of the three nonlinear classes
106
(Figure 12.1). We will consider which permutations fall into each complexity
level.
Figure 12.1: 3-bit Permutations
NLC(3;1) NLC(3;2) NLC(3;3)
 
 
    
  
     
     
      
      
      
12.4.1 NLC(3;0)
The simplest 3-bit permutations are the a ne permutations or those requiring
zero rounds of nonlinearity. NLC(3;0) contains 1344 permutations. Table 12.1
provides some example permutations.
Table 12.1: NLC(3;0) Permutations
Permutation Parameters
ax mod 7 a = 2;4
5x mod 8
x 1 mod 7
12.4.2 NLC(3;1)
This is the simplest possible nonlinear complexity class, since all 2-bit permuta-
tions are a ne. This class contains 7 1344 = 9408 3-bit permutations. Obviously
107
this class includes the To oli and Fredkin gates, as well as the permutations in
table 12.2.
Table 12.2: NLC(3;1) Permutations
Permutation Parameters
x+ 1 mod 8
ax mod 7 a = 3;5;6
ax mod 8 a = 3;7
x 1 mod 5
12.4.3 NLC(3;2)
NLC(3;2) is the only nonlinear class of even 3-bit permutations. It is also the
largest class containing 14 1344 = 18816 which is almost half of the 3-bit per-
mutations. Example permutations are provided in table 12.3.
108
Table 12.3: NLC(3;2) Permutations
Permutation Parameters
x+ 1 mod 5
x+ 1 mod 7
4x mod 5
5x mod 6
2x mod 5
ax mod 7 a=3,5
12.4.4 NLC(3;3)
NLC(3;3) contains the 3-bit permutations with maximal nonlinear complexity
and has 8 1344 = 10752 members. Example permutations are provided in table
12.4.
Table 12.4: NLC(3;3) Permutations
Permutation Parameters
x+ 1 mod 6
ax mod 5 a = 2;3
3x mod 5
  1 mod x3 +x+ 1
109
12.4.5 No 3-bit Permutations are One-Way
It should be noted that every 3-bit permutation class contains an involution.
Therefore, by lemma 9.3.3, all 3-bit permutations are two-way permutations.
Thus no 3-bit permutation can be a one-way permutation.
12.5 4-bit Permutations
4 bit permutations contain the  rst examples of permutations which are not two-
way. These classes are of particular interest in the study of one-way permutations,
since the inverse of a permutation must be in a di erent double coset if there is
any chance of computing f faster than f 1.
Table 12.5 shows how the invariants sort the 302 equivalence classes at each
depth. The table indicates that the Multiple Rank Invariant corresponds loosely
with the nonlinear depth. Also it is interesting to note that two-way permutations
are represented at every complexity depth.
Table 12.6 illustrates some examples of various simple permutations in each
of the seven nonlinear complexity classes for 4-bit permutations.
12.6 Open Problems
 There is one unique class of 4-bit permutations with depth 6. Is any per-
mutation we are familiar with in this class?
 For 4-bit permutations, involutions occur at every nonlinear complexity
depth. Does this happen in general?
110
 Does one of the permutation classes that is not two-way have an imple-
mentation with additional scratch space that has provably fewer rounds of
nonlinearity than its inverse?
111
Table 12.5: Invariant Table for each 4-bit Permutations Depth
Two-way not Two-way
Depth MRI Even Odd Even Odd
NLC(4;0) (4,0,0) 1
NLC(4;1) (3,1,0) 1
(3,0,1) 1
NLC(4;2) (2,2,0) 2
(2,1,1) 1 2
NLC(4;3) (2,0,2) 2 1
(1,3,0) 3
(1,2,1) 1 4 4
(1,1,2) 3 1 2 4
NLC(4;4) (2,0,2) 1
(1,1,2) 7 2
(1,0,3) 3 9
(0,3,1) 3 2
(0,2,2) 7 2 8 14
(0,1,3) 3 12 14 26
NLC(4;5) (1,1,2) 1
(1,0,3) 1
(0,2,2) 5 1 2
(0,1,3) 11 23 28 10
(0,0,4) 31 26 6 10
NLC(4;6) (0,0,4) 1
112
Table 12.6: Depth of Various 4-bit Permutations
Depth Permutation Parameters
NLC(4;0) ax mod 15 a = 2;4;8
9x mod 16
NLC(4;1) ax mod 15 a = 7;11;13;14
ax mod 16 a = 5;13
NLC(4;2) x+ 1 mod 16
 x mod 11
ax mod 16 a = 3;7;11;15
NLC(4;3) x 1 mod 11
5x mod 12
NLC(4;4) x+ 1 mod n n = 9;15
 x mod 9
ax mod 10 a = 3;7;9
ax mod 11 a = 2;3;4;5;6;7;8;9
ax mod 12 a = 7;11
ax mod 13 a = 5;8
ax mod 14 a = 9;11;13
ax mod 13 a = 6;7
NLC(4;5) x+ 1 mod n n = 10;11;12;13;14
x 1 mod 13
ax mod 9 a = 2;4;5;7
ax mod 13 a = 2;3;4;6;7;9;10;11;12
ax mod 14 a = 3;5
  1 mod x4 +x+ 1
ax mod 11 a = 2;6;7;8
ax mod 13 a = 2;11
NLC(4;6) Nothing found
113
Chapter 13
Case Studies
13.1 Summary
Using the tools developed for classifying and identifying Boolean permutations,
we now focus on the study of particular permutations as follows.
 Hiltgen?s asymmetric permutation.
 Zech logarithms: 1 +!e = !Z(e) where !2GF(2n).
 Increment: x+ 1 (mod 2n).
 Multiplication: (2n 1 + 1)x (mod 2n).
 Addition and Subtraction.
13.2 The Hiltgen function
In 1990, the  rst ever computationally asymmetric permutation for the gate
complexity measure was found [HG92]. The permutation f(x1;x2;x3;x4) =
114
(y1;y2;y3;y4) is de ned as follows:
y1 = x1 x3
y2 = x2 x4(x1 x2)
y3 = x3 x4(x1 x2)
y4 = x4:
Likewise, the inverse permutation f 1(y1;y2;y3;y4) = (x1;x2;x3;x4) is de ned
as follows:
x1 = y1 y3 y4(y1 y2 y3)
x2 = y2 y4(y1 y2 y3)
x3 = y3 y4(y1 y2 y3)
x4 = y4:
By exhaustively enumerating all possible constructions of f and f 1, Hiltgen
discovered that f could be constructed using only 5 gates, while f 1 required at
least 6 gates. Thus, the permutationf is computationally asymmetrical according
to the gate complexity measure.
Examination of the Hiltgen permutation reveals that it is a ne equivalent to
the To oli gate. Thus the complexity di erence between f and f 1 is solely due
to the a ne component. It is also interesting to note the Hiltgen permutation has
very low nonlinear complexity (just one To oli gate). This was likely necessary
to ensure that all possible constructions could be exhausted.
115
13.3 Zech Logarithm
In computations over  nite  elds,  eld multiplication can be optimized at the
expense of  eld addition. The Zech logarithm is used to compute  eld addition
when multiplication has been optimized using powers of a primitive root [Jun93].
Typically, the Zech logarithm values are precomputed and stored in a table.
Definition 13.3.1. The Zech logarithm Z(e) of !e is the discrete logarithm
of 1 +!e.
1 +!e = !Z(e)
By de nition, Z(0) =1 and Z(1) = 0.
This can be used to compute the sum of two primitive root powers as follows:
!a +!b = !a(1 +!b a) = !a!Z(b a) = !a+Z(b a):
Since the discrete log ranges from 0 to 2n 2, we will use the value 2n 1 as
a placeholder for in nity.
Example 13.3.2. Let ! be a primitive root of the polynomial x3 +x+ 1. Then
the Zech logarithm will de ne the following 3-bit permutation. Be aware that
7 =1 in the permutation.
e 0 1 2 3 4 5 6 7
Z(e) 7 3 6 1 5 4 2 0
Computation reveals that the permutation is a ne equivalent to the nonlinear
permutation of depth 2. Figure 13.1 illustrates a circuit for Z(a0 +a1 2+a2 22).
Notice that the dashed box contains the only nonlinear part of the permuta-
tion.
116
Figure 13.1: Zech Logarithm for x3 +x+ 1
a0                    
a1                
a2                
_ _ _ _ 
 
 
 
 
 
 
 
 
 _ _ _ _
If we consider the other primitive degree 3 polynomial x3 + x2 + 1, then the
Zech logarithm will now de ne the following di erent 3-bit permutation.
e 0 1 2 3 4 5 6 7
Z(e) 7 5 3 2 6 1 4 0
The Zech logarithm for x3 +x2 + 1 also has nonlinear depth 2. Thus, the two
Zech logarithms are a ne equivalent, since there is only one 3-bit equivalence
class with nonlinear depth 2.
Example 13.3.3. Consider now the Zech logarithm associated with the degree
4 primitive polynomial x4 +x+ 1. It has the following truth table (15 =1):
e 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Z(e) 15 4 8 14 1 10 13 9 2 7 5 12 11 6 3 0
Identi cation by computation reveals the Zech logarithm for x4 +x+ 1 has a
nonlinear depth of 5. The Zech logarithm for x4 +x3 + 1 has the truth table:
e 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Z(e) 15 12 9 4 3 10 8 13 6 2 5 14 1 7 11 0
Computation then reveals that the two Zech logarithms are again a ne equiv-
117
Figure 13.2: Zech Logarithm for x4 +x+ 1
x0                                         x
1                                                   x
1                                                                    x
4                                                    
alent. In fact,
Zx4+x3+1(e) = Zx4+x+1(e);
where  x is the permutation which complements all variables. It is unknown if all
the Zech logarithms associated with primitive polynomials are a ne equivalent,
but it is true for n 4.
13.4 Incrementing modulo 2n
One of the simplest nonlinear permutations we can examine is the simple act of
adding 1. Let us  rst consider the 3-bit case.
x 0 1 2 3 4 5 6 7
I(x) 1 2 3 4 5 6 7 0
A computational check reveals that I(x) = x+ 1 (mod 8) requires 1 round of
nonlinearity and is thus a ne equivalent to the single To oli gate. Figure 13.3
illustrates a circuit for I(x) = I(x0 + 2 x1 + 22 x2).
Consider now the incrementation function on 4 bits, I(x) = x+1 (mod 16). A
computational check reveals that I(x) is an odd permutation requiring two rounds
of nonlinearity. Thus it must use one To oli gate and one triple controlled NOT
gate. Figure 13.4 illustrates a circuit for I(x) = I(x0 + 2 x1 + 22 x2 + 23 x3).
118
Figure 13.3: Incrementing modulo 8
x0       x0 + 1
x1      x1 +x0
x1     x2 +x0x1 + 1
Figure 13.4: Incrementing modulo 16
x0        x0 + 1
x1       x1 +x0
x2      x2 +x0x1
x3     x2 +x0x1
13.5 Multiplication by 2n 1 + 1 modulo 2n
In exploring a ne functions modulo various moduli, it was discovered that the
permutation (2n 1 + 1)x (mod 2n) always appeared to be linear. Upon further
investigation, the following theorem was proved.
Theorem 13.5.1. (2n 1 + 1)x (mod 2n) is a linear n-bit permutation. Further-
more, it is generated by adding the low bit to the high bit.
Proof. We will consider the two cases where x is an even and odd number. First,
suppose x = 2k is even. Then the multiplication  xes 2k.
(2n 1 + 1)2k k2n + 2k 2k (mod 2n)
Suppose now that x = 2k + 1 is odd. Then
(2n 1 + 1)(2k + 1) k2n + 2n 1 + 2k + 1 2n 1 + 2k + 1 (mod 2n):
Adding 2n 1 is the same as toggling the high bit of 2k + 1.
119
Since odd and even numbers can be distinguished by their low bit, we can
combine both cases into the single case of using a single controlled NOT from the
low bit to the high bit. Thus,
(2n 1+1)x (2n 1+1)(x0+2x1+   +2n 1xn 1) (x0+2x1+   +2n 1(x0 xn 1));
and therefore (2n 1 + 1)x (mod 2n) is a linear function.
13.6 Addition and Subtraction modulo 2n
Addition and subtraction are two of the most common and most useful functions.
Despite their apparent simplicity, a great deal of literature and research has shown
many surprising ways to improve the computation of addition (i.e. carry-look
ahead adders, carry-save adders, constant depth adders).
We will consider addition and subtraction using Two?s complement which
is the most common implementation on computers. Two?s complement allows
the same circuitry to add signed or unsigned numbers.
Example 13.6.1. Suppose we were working with 8-bit numbers and wanted to
 nd the two?s complement of 40 = 001010002.
001010002 Original value of 40
110101112 Complement
110110002 Add one
The  nal value 110110002 equals  40 in two?s complement. Note that
001010002 + 110110002 = 000000002;
ignoring the over ow carry bit.
120
The following theorem proves that addition and subtraction are in the same
a ne equivalence class. Any improvement to one automatically translates into
an improvement for the other. Thus addition and subtraction will always have
the same nonlinear complexity.
Theorem 13.6.2. For any positive integer n, 2?s complement addition and sub-
traction modulo 2n are a ne equivalent.
Proof. Recall that in 2?s complement,  a = a + 1, where a is a with every bit
complemented. Equivalently, a = a 1.
Consider the permutations +; : GF(2)2n ! GF(2)2n where for all a;b 2
GF(2)n,
+(a;b)7!(a;a+b mod 2n) and  (a;b)7!(a;a b mod 2n):
De ne A1;A2 2AGLn(2) where A1 complements the  rst n bits and A2 comple-
ments all 2n bits. Consider the action of A2 (+) A1.
A2 + A1(a;b) = A2 +(a;b)
= A2(a;a+b)
= A2(a; a 1 +b)
= A2(a;a b)
= (a;a b)
=  (a;b)
Since A1 and A2 are a ne permutations, + and  are in the same double
coset and are therefore a ne equivalent.
121
Chapter 14
Conclusion
The tools and theory developed in this thesis provide a new method for assessing
the complexity of boolean permutations via nonlinearity. Given the universal
nature of reversible computation, this implies the complexity of all classical com-
putations can be measured according to their nonlinearity.
The fact that the amount of nonlinearity in a permutation is highly correlated
with its complexity should not come as a surprise. Mathematicians are constantly
developing the linear theory of a  eld and then gently extending out into the realm
of nonlinearity.
It is hoped that the notion of nonlinearity as complexity will prove fruitful in
both the study of complexity theory as well as in the study of optimal functions
and algorithm theory.
Further advances in theory may make it possible to classify all 5-bit per-
mutations. In addition, the computational methods for computing the optimal
nonlinear complexity may lead to the discovery of a 4-bit or 5-bit permutation
that is \mildly one-way", that is requiring one more round of nonlinearity for p 1
than for p.
122
Appendix A
Appendix A: Source Code
#!/usr/bin/env python
# _size4.pyx
import random as rand # Avoid conflict with stdlib.h
ctypedef unsigned long long ulong
ctypedef unsigned int uint
cdef extern from "Python.h":
void* PyMem_Malloc(int)
void PyMem_Free(void* p)
cdef extern from "stdlib.h":
int random()
ctypedef long size_t
void* malloc(size_t size)
void free(void* ptr)
from size import rbin, bin
#############################################################
# BitStatic class
#############################################################
cdef class Size4:
cdef public uint n, nn, N
cdef public ulong one, m_identity, p_identity
cdef public ulong m_mask, r_mask, c_mask, p_mask, p_base
cdef ulong* x_mask
123
def __init__(self, uint n=4):
cdef ulong i,j
cdef ulong id, one
one = 1
self.n = n
self.nn = n*n
self.N = 1ull << n
# Define the Identity matrix
id = 0
for i in range(n):
id ^= (1ull << (n+1)*i)
self.m_identity = id
# Define the Identity permutation
id = 0
for i in range( 1 << n ):
id = id ^ ((i*1ull) << (n*i))
self.p_identity = id
# The matrix mask is n*n bits set to 1111..1
self.m_mask = (1ull << n*n) - 1
# The row mask is the low n bits set to 111..1
self.r_mask = (1ull << n) - 1
# The column mask is n 1?s spaced n bits apart
id = 0
for i in range(n):
id ^= (1ull << (i*n))
self.c_mask = id
# The permutation mask is 2**n 1?s
self.p_mask = 2*((1ull << (self.N*self.n-1)) - 1) + 1
self.p_base = self.p_mask // max(self.r_mask, 1)
# x_mask[i] is the locations of monomials with no x_i.
self.x_mask = <ulong*> malloc(sizeof(ulong)*self.n)
for i in range(self.n):
self.x_mask[i] = (self.p_mask // ((1 << (1<<(i+1)))-1) ) \
* ((1 << (1<<(i)))-1)
def x_test(cls):
for i in range(cls.n):
print rbin(cls.x_mask[i],64)
124
def p_id(cls):
???
Mysterious bug
p_id() != p_identity
???
return int(cls.p_identity)
def p_mk(cls):
return int(cls.p_mask)
################################################################
# Print methods
################################################################
def str_v(cls, v):
???Binary form of permutation
>>> print Size(3).str_v(0)
000
>>> print Size(3).str_v(6)
011
???
return rbin(v, cls.n)
def str_m(cls, m):
???Binary form of matrix
>>> print Size(3).str_m(0421)
100
010
001
>>> print Size(3).str_m(0153)
110
101
100
???
s = ??
for i in xrange(cls.n):
t = (m >> cls.n*i) & cls.r_mask
s += rbin(t, cls.n) + ?\n?
s = s[:-1]
return s
def str_a(cls, a):
???Binary form of affine matrix
>>> print Size(3).str_a(07421)
100
010
125
001
111
>>> print Size(3).str_a(04153)
110
101
100
001
???
s = ??
for i in xrange(cls.n + 1):
t = (a >> cls.n*i) & cls.r_mask
s += rbin(t, cls.n) + ?\n?
s = s[:-1]
return s
def str_p(cls, p):
???Binary form of permutation
>>> s = Size(3)
>>> print s.str_p(s.p_identity)
0 1 2 3 4 5 6 7
>>> print s.str_p(067452301)
1 0 3 2 5 4 7 6
???
s = str(p & cls.r_mask)
for i in xrange(1, cls.N):
s += ? ?+str((p >> (i*cls.n)) & cls.r_mask)
return s
def str_t(cls, p):
???Binary form of permutation
>>> s = Size(3)
>>> print s.str_t(s.p_identity),
000
001
010
011
100
101
110
111
>>> print s.str_t(067452301),
001
000
011
010
126
101
100
111
110
???
s = ??
for i in xrange(cls.N):
s += bin((p >> (i*cls.n)) & cls.r_mask, cls.n)+?\n?
return s
################################################################
# Conversion methods
################################################################
???
NOTE: Finite Functions acting on [0,1,...,2**n-1]
0 1 2
3 4 5 ==> 8 7 6 5 4 3 2 1 0
6 7 8
Functions are named as follows
_ReturnType_InputTypes_FunctionDescription
Possible types are:
m matrix
v vector
r row vector
c column vector
a affine
b bit
p permutation
? controlled affine (condition off of most significant bits)
???
cpdef ulong r_c(cls, ulong c):
???Converts column vector to row vector
>>> s = Size(3)
>>> print s.str_v(s.r_c(s.c_mask))
111
???
c ^= c >> (2*cls.n-2)
c ^= (c & (cls.r_mask << cls.n)) >> (cls.n - 1)
c &= cls.r_mask
return c
cpdef ulong c_r(cls, ulong r):
???Converts row vector to column vector
127
>>> s = Size(3)
>>> print s.str_v(s.r_c(s.c_r(s.r_mask)))
111
???
r ^= (r >> 1) << cls.n
r ^= (r >> 2) << 2*cls.n
r &= cls.c_mask
return r
cpdef ulong p_v(cls, ulong v):
???Returns permutation equal to adding vector v.
>>> s = Size(3)
>>> print s.str_p(s.p_v(1))
1 0 3 2 5 4 7 6
???
return cls.p_identity ^ (v * cls.p_base)
cpdef ulong p_m(cls, ulong m):
???Returns permutation equal to matrix multiplication by m.
>>> print Size(3).str_p(Size(3).p_m(0153))
0 3 5 6 1 2 4 7
???
cdef uint i
cdef ulong p
p = 0
for i in range(cls.N):
p ^= cls.r_rm_mul(i, m) << (i * cls.n)
return p
cpdef ulong p_a(cls, ulong a):
???Permutation equal to affine matrix multiplication by a.
>>> print Size(3).str_p(Size(3).p_a(04153))
4 7 1 2 5 6 0 3
???
cdef ulong p, i
p = 0
for i in range(cls.N):
p ^= cls.r_ra_mul(i, a) << (i * cls.n)
return p
cpdef ulong a_v(cls, ulong v):
???Returns affine matrix equal to adding vector v.
>>> print Size(3).str_a(Size(3).a_v(6))
100
010
128
001
011
???
return cls.m_identity ^ (v << cls.nn)
cpdef ulong m_v(cls, ulong v):
???Return matrix with char polynomial associated with v.
>>> print Size(3).str_m(Size(3).m_v(3))
010
001
110
???
cdef ulong m
m = cls.m_identity << 1
m ^= ( v << (cls.n - 1) * cls.n ) & cls.m_mask
return m
def a_m(cls, m):
return m
# TODO: Tensor two permutations together
################################################################
# Coercion methods
################################################################
cpdef ulong v_p(cls, ulong p):
???Returns zero shift of permutation.
Coerce - Inverts p_v
>>> s = Size(3)
>>> print s.str_v(s.v_p(s.p_identity))
000
???
return p & cls.r_mask
cpdef ulong m_p(cls, ulong p):
???Matrix function of permutation basis.
Coerce - Inverts p_m
>>> s = Size(3)
>>> print s.str_m(s.m_p(s.p_identity))
100
010
001
???
cdef uint i
129
m = 0
for i in range(cls.n):
m ^= (p >> ( (1 << i) * cls.n) & cls.r_mask) \
<< (i * cls.n)
return m
cpdef ulong a_p(cls, ulong p):
???Returns affine function of permutation basis.
Coerce - Inverts p_a
>>> s = Size(3)
>>> print s.str_a(s.a_p(s.p_identity))
100
010
001
000
>>> print s.str_a(s.a_p(067452301))
100
010
001
100
???
cdef uint i
cdef ulong v, a
a = 0
v = p & cls.r_mask
for i in range(cls.n):
a ^= (v ^ (p >> ( (1 << i) * cls.n) & cls.r_mask)) \
<< (i * cls.n)
return a ^ (v << cls.nn)
cpdef ulong v_a(cls, ulong a):
???Returns affine shift.
Coerce - Inverts a_v
>>> s = Size(3)
>>> print s.str_v(s.v_a(07421))
111
???
return (a >> cls.nn) & cls.r_mask
cpdef ulong m_a(cls, ulong a):
???Returns zero shift of permutation.
Coerce - Inverts convert.a_m
>>> s = Size(3)
>>> print s.str_m(s.m_a(07421))
100
130
010
001
???
return a & cls.m_mask
################################################################
# Get and set methods
################################################################
cpdef ulong r_pi_get(cls, ulong p, uint index):
???Returns the value p(i) from the permutation p.
>>> s = Size(3)
>>> print s.str_v(s.r_pi_get(s.p_identity, 4))
001
???
return (p >> cls.n*index) & cls.r_mask
cpdef ulong b_vi_get(cls, ulong v, uint index):
???
Candidate static method.
???
return (v >> index) & 1
cpdef ulong b_mij_get(cls, ulong m, uint i, uint j):
???Returns the (i,j) entry of the matrix m.
>>> s = Size(3)
>>> print s.b_mij_get(s.m_identity,0,0)
1
>>> print s.b_mij_get(s.m_identity,1,0)
0
???
return (m >> (i*cls.n +j)) & 1
???
_b_ci_get
_r_mi_get
_c_mi_get
_vib_set
???
################################################################
# Matrix methods
################################################################
cpdef ulong r_rm_mul(cls, ulong r, ulong m):
131
???Row vector times matrix
>>> s = Size(3)
>>> print s.str_v(s.r_rm_mul(1, 0421))
100
>>> print s.str_v(s.r_rm_mul(7, 0437))
000
>>> print s.str_v(s.r_rm_mul(7, 0467))
101
???
cdef ulong v
v = cls.c_r(r)
m &= v * cls.r_mask
return cls.r_m_xor_cols(m)
cpdef ulong r_ra_mul(cls, ulong r, ulong a):
???Row vector times affine matrix
>>> s = Size(3)
>>> print s.str_v(s.r_ra_mul(1, 03421))
010
>>> print s.str_v(s.r_ra_mul(7, 05437))
101
>>> print s.str_v(s.r_ra_mul(7, 01467))
001
???
cdef ulong v,m
v = cls.c_r(r)
m = a & (v * cls.r_mask)
return cls.r_m_xor_cols(m) ^ (a >> cls.nn)
cpdef ulong r_mr_mul(cls, ulong m, ulong r):
???Matrix vector times row
>>> s = Size(3)
>>> print s.str_v(s.r_mr_mul(0421, 1))
100
>>> print s.str_v(s.r_mr_mul(0665, 7))
000
>>> print s.str_v(s.r_mr_mul(0467, 7))
101
???
cdef ulong v
m &= r * cls.c_mask
v = cls.c_m_xor_rows(m)
return cls.r_c(v)
cpdef ulong r_ar_mul(cls, ulong a, ulong r):
132
???Affine matrix times row vecto
First multiply by matrix and then add affine component
>>> print Size(3).str_v(Size(3).r_ar_mul(04421, 1))
101
>>> print Size(3).str_v(Size(3).r_ar_mul(07566, 1))
110
???
cdef ulong m, v
m = a & (r * cls.c_mask) & cls.m_mask
v = cls.c_m_xor_rows(m)
return cls.r_c(v) ^ (a >> cls.nn)
cpdef ulong m_mm_mul(cls, ulong m1, ulong m2):
???Matrix multiplication
>>> s = Size(3)
>>> print s.str_m(s.m_mm_mul(0421, 0421))
100
010
001
>>> print s.str_m(s.m_mm_mul(311, 311))
101
010
001
???
cdef uint i
cdef ulong f,r,t
t = cls.m_m_transpose(m2)
r = cls.c_m_xor_rows(m1 & t)
for i in range(1, cls.n):
f = cls.m_mi_upshift_rows(m1, i)
f = cls.c_m_xor_rows(f & t)
f = cls.m_mi_upshift_rows(f, cls.n - i)
r ^= f << (cls.n - i)
r = cls.m_m_ishift_rows(r)
return r
cpdef ulong a_aa_mul(cls, ulong a1, ulong a2):
???Affine Matrix multiplication
>>> print Size(3).str_a(Size(3).a_aa_mul(0421, 0421))
100
010
001
000
>>> print Size(3).str_a(Size(3).a_aa_mul(07467, 07467))
133
101
010
001
010
???
cdef ulong m,v
m = cls.m_mm_mul(a1 & cls.m_mask, a2 & cls.m_mask)
v = cls.r_ra_mul((a1 >> cls.nn) & cls.r_mask, a2)
return m ^ (v << cls.nn)
cpdef ulong m_m_inv(cls, ulong m):
???Assuming the matrix is invertible. Returns the inverse.
>>> s = Size(3)
>>> print s.str_m(s.m_m_inv(0421))
100
010
001
>>> print s.str_m(s.m_m_inv(143))
010
001
111
>>> print s.str_m(s.m_m_inv(0124))
001
010
100
>>> print Size(2).str_m(Size(2).m_m_inv(6))
01
10
???
cdef ulong m1,m2,sval,oval,val
cdef uint i,j
m1 = m
m2 = cls.m_identity
for i in range(cls.n):
for j in range(i, cls.n): # pivot a 1 to (i,i) position
if cls.b_mij_get(m1, j, i) == 1:
break
m1 = cls.m_mij_swap_rows(m1, i, j)
m2 = cls.m_mij_swap_rows(m2, i, j)
val = (m1 >> i) & cls.c_mask # which rows added
val ^= 1ull << i*cls.n # zero out pivot position
sval = val * ((m1 >> i*cls.n) & cls.r_mask)
oval = val * ((m2 >> i*cls.n) & cls.r_mask)
m1 ^= sval
134
m2 ^= oval
return m2
cpdef ulong a_a_inv(cls, ulong a):
???Affine inverse. Assume multiplication on right.
>>> Size(3).a_a_inv(07567)
1003
???
cdef ulong m,v
m = cls.m_m_inv(a & cls.m_mask)
v = cls.r_rm_mul(a >> cls.nn, m)
return m ^ (v << cls.nn)
cpdef ulong m_m_transpose(cls, ulong m):
???Transpose of Bit Matrix
>>> Size(3).m_m_transpose(0421)
273
>>> Size(3).m_m_transpose(0153)
143
???
cdef ulong f
cdef uint i
f = 0
for i in range(cls.n):
t = cls.c_mask & (m >> i)
f ^= cls.r_c(t) << (cls.n * i)
return f
def m_random(cls):
???Returns a random invertible bit matrix
>>> s = Size(3); m=s.m_random();
>>> s.m_mm_mul(m, s.m_m_inv(m)) == s.identity
True
???
while True:
n = rand.randrange(1<<cls.nn)
if cls.is_inv_a(n):
return n
def a_random(cls):
???Returns a random invertible affine bit matrix
>>> s = Size(3); a=s.a_random();
>>> s.a_aa_mul(a, s.a_a_inv(a)) == s.identity
True
???
135
while True:
n = rand.randrange(1<<cls.nn)
if cls.is_inv_a(n):
return n ^ (rand.randrange(cls.N) << cls.nn)
################################################################
# Generator methods
################################################################
"""
def density2(cls):
???Generate all weight 2 words.
>>> for i in Size(4).density2(): print Size(4).str_v(i)
1100
1010
1001
0110
0101
0011
???
m = 3
for i in xrange(cls.n-1):
current = (m << i)
yield current
for j in xrange(i+1, cls.n-1):
current ^= m << j
yield current
return
def density3(cls):
???Generate all weight 2 words.
>>> for i in Size(5).density3(): print Size(5).str_v(i)
11100
11010
11001
10110
10101
10011
01110
01101
01011
00111
???
m = 7
136
for i in xrange(cls.n-2):
last = (m << i)
current = last
yield current
for j in xrange(i+1, cls.n-2):
for k in xrange(j+1, cls.n-1):
current ^= 3 << k
yield current
last ^= 5 << j
current = last
yield current
return
"""
################################################################
# Helper methods
################################################################
cpdef ulong r_m_xor_cols(cls, ulong m):
???XOR the values in each column to form a row vector.
>>> s = Size(3)
>>> s.r_m_xor_cols(s.m_identity) == s.r_mask
True
???
cdef uint i
for i in range(1, cls.n):
m ^= (m >> (cls.n * i) ) & cls.r_mask
m &= cls.r_mask
return m
cpdef ulong c_m_xor_rows(cls, ulong m):
???XOR the values in each row to form a column vector.
>>> s = Size(3)
>>> s.c_m_xor_rows(s.m_identity) == s.c_mask
True
???
cdef uint i
for i in range(1, cls.n):
m ^= (m >> i) & cls.c_mask
m &= cls.c_mask
return m
cpdef ulong m_mij_swap_rows(cls, ulong m, uint i, uint j):
???
>>> Size(3).m_mij_swap_rows(0421, 0, 2)
137
84
???
cdef ulong val
val = (m >> cls.n*i) ^ (m >> cls.n*j)
val &= cls.r_mask
m ^= val << cls.n*i
m ^= val << cls.n*j
return m
cpdef ulong m_mi_upshift_rows(cls, ulong m, int i):
??? Circular shift of rows upward
>>> Size(3).m_mi_upshift_rows(0421, 1)
98
???
return (m>>cls.n*i)^((m<<(cls.n*(cls.n-i)))&cls.m_mask)
cpdef ulong m_m_ishift_rows(cls, ulong m):
??? Circular shift ith row by i
>>> Size(3).m_m_ishift_rows(0421)
161
???
cdef uint i
cdef ulong f
f = m & cls.r_mask
for i in range(1, cls.n):
val = (m >> cls.n*i) & cls.r_mask
val = cls.r_r_circ_shift(val, i)
f ^= val << cls.n*i
return f
cpdef ulong r_r_circ_shift(cls, ulong r, uint i):
return ((r>>(cls.n-i)&cls.r_mask)^(r<<(i)))&cls.r_mask
cpdef ulong m_rr_outer_product(cls, ulong r1, ulong r2):
??? Outer product
>>> s = Size(3)
>>> print s.str_m(s.m_rr_outer_product(05, 07))
111
000
111
???
r1 = cls.c_r(r1)
return r1 * r2
138
???
m_m_canonical form
b_vv_dotproduct
???
################################################################
# Permutation methods
################################################################
cpdef ulong p_p_inv(cls, ulong p):
???Assuming the function is a permutation. Returns inverse.
>>> print Size(3).str_p(Size(3).p_p_inv(024710536))
3 4 7 1 6 2 0 5
???
cdef ulong i
cdef ulong f
f = 0
for i in range(cls.N):
f ^= i << ( ( (p >> i*cls.n) & cls.r_mask ) * cls.n )
return f
cpdef ulong p_p_reverse(cls, ulong p):
???Reverses permuation as a list
>>> print Size(3).str_p(Size(3).p_p_reverse(024710536))
2 4 7 1 0 5 3 6
???
cdef uint i
cdef ulong f
f = 0
for i in range(cls.N):
f ^= ((p>>i*cls.n)&cls.r_mask)<<(cls.N-i-1)*cls.n
return f
cpdef ulong p_pp_mul(cls, ulong p0, ulong p1):
???Composition function.
>>> print Size(3).str_p(Size(3).p_pp_mul(024710536,024710536))
4 0 7 6 3 2 1 5
>>> print Size(3).str_p(Size(3).p_pp_mul(024710536,053610742))
3 0 6 2 4 5 1 7
???
cdef uint i
cdef ulong f, p0i, p1j
f = 0
for i in range(cls.N):
139
p0i = (p0 >> i*cls.n) & cls.r_mask
p1j = (p1 >> p0i*cls.n) & cls.r_mask
f ^= p1j << cls.n*i
return f
cpdef ulong p_pa_mul(cls, ulong p, ulong a):
???Composition function.
>>> s = Size(3)
>>> print s.str_p(s.p_pa_mul(053610742,07124))
5 6 0 7 3 4 1 2
???
cdef uint i
cdef ulong f, pi
f = 0
for i in range(cls.N):
pi = (p >> i*cls.n) & cls.r_mask
f ^= cls.r_ra_mul(pi, a) << cls.n*i
return f
cpdef ulong p_ap_mul(cls, ulong a, ulong p):
???Composition function. (Slower than reversed composition)
>>> s = Size(3)
>>> print s.str_p(s.p_ap_mul(07124,053610742))
5 0 6 4 3 7 1 2
???
pa = cls.p_a(a)
return cls.p_pp_mul(pa,p)
cpdef ulong p_p_fixBasis(cls, ulong p):
cdef ulong a
while not cls.is_inv_a(cls.a_p(p)):
p = cls.p_ap_mul(cls.a_random(), p)
a = cls.a_a_inv(cls.a_p(p))
p = cls.p_pa_mul(p,a)
return p
cpdef uint is_a_pp_mul(cls, ulong p, ulong q):
cdef ulong v000,v001,v010,v100,v101,v110,v111
cdef ulong t
v000 = (q >> (cls.n * (p&cls.r_mask)))
v001 = (q >> (cls.n * ((p >> cls.n) & cls.r_mask)))
v010 = (q >> (cls.n * ((p >> 2*cls.n) & cls.r_mask)))
v011 = (q >> (cls.n * ((p >> 3*cls.n) & cls.r_mask)))
if ((v000 ^ v001 ^ v010 ^ v011) & cls.r_mask) != 0:
return 0
140
v100 = (q >> (cls.n * ((p >> 4*cls.n) & cls.r_mask)))
v101 = (q >> (cls.n * ((p >> 5*cls.n) & cls.r_mask)))
if ((v000 ^ v001 ^ v100 ^ v101) & cls.r_mask) != 0:
return 0
v110 = (q >> (cls.n * ((p >> 6*cls.n) & cls.r_mask)))
if ((v000 ^ v010 ^ v100 ^ v110) & cls.r_mask) != 0:
return 0
v111 = (q >> (cls.n * ((p >> 7*cls.n) & cls.r_mask)))
if ((v000 ^ v001 ^ v110 ^ v111) & cls.r_mask) != 0:
return 0
t = cls.p_pp_mul(p,q)
if cls.is_a_p(t) == 1: # Just test the darn thing then!
return t
else:
return 0
cpdef uint is_inv_a(cls, ulong a):
???Determines whether affine transformation is invertible.
>>> Size(3).is_inv_a(02743)
False
>>> Size(3).is_inv_a(07764)
True
???
ai = cls.a_a_inv(a)
return cls.a_aa_mul(a,ai) == cls.m_identity
cpdef int cmp_p(cls, ulong p0, ulong p1):
???Compare using lexicographic value
Examples:
>>> Size(2).cmp_p(0xe4,0xd9)
True
???
cdef uint i
cdef ulong val0, val1
for i in range(cls.N):
val0 = cls.r_pi_get(p0, i)
val1 = cls.r_pi_get(p1, i)
if val0 == val1:
continue
else:
return val0 < val1
return 0
cpdef ulong p_random_s(cls):
141
cdef ulong p
cdef ulong i,j
p = cls.p_identity
for i in range(cls.N):
j = rand.randrange(i,cls.N)
p = cls.p_pij_swap(p,i,j)
return p
cpdef ulong p_random(cls):
cdef ulong p
cdef ulong i,j
p = cls.p_identity
for i in range(cls.N):
j = (random() % (cls.N - i)) + i
p = cls.p_pij_swap(p,i,j)
return p
cpdef ulong p_pij_swap(cls, ulong p, uint i, uint j):
cdef ulong val
val = cls.r_pi_get(p,i) ^ cls.r_pi_get(p,j)
p ^= val << cls.n*i
p ^= val << cls.n*j
return p
cpdef ulong p_p_next(cls, ulong p):
cdef uint i,j
cdef ulong pi,pj
i = cls.N - 1
pj = cls.r_pi_get(p,i)
while True:
i = i -1
if i < 0:
return 0
pi = cls.r_pi_get(p, i)
if pi < pj:
break
pj = pi
j = cls.N
while cls.r_pi_get(p, j-1) <= pi:
j = j - 1
p = cls.p_pij_swap(p,i,j-1)
142
i = i + 1
j = cls.N -1
while i<j:
p = cls.p_pij_swap(p,i,j)
i += 1
j -= 1
return p
cpdef ulong p_p_not(cls, ulong p, ulong target):
???NOT Gate
>>> s = Size(3); p = s.p_identity
>>> print s.str_p(s.p_p_cnot(p,0))
1 0 3 2 5 4 7 6
???
return p ^ (cls.p_base << target)
cpdef ulong p_p_cnot(cls, ulong p, ulong source, ulong target):
???Control NOT
>>> s = Size(3); p = s.p_identity
>>> print s.str_p(s.p_p_cnot(p,0,1))
0 3 2 1 4 7 6 5
???
cdef ulong d
d = (p & (cls.p_base << source)) >> source
return p ^ (d << target)
cpdef ulong p_p_ccnot(cls, ulong p, ulong source1, \
ulong source2, ulong target):
???Double control NOT
>>> s = Size(3); p = s.p_identity
>>> print s.str_p(s.p_p_ccnot(p,0,1,2))
0 3 2 1 4 7 6 5
???
cdef ulong d
d = (p & (cls.p_base << source1)) >> source1
d &= (p & (cls.p_base << source2)) >> source2
return p ^ (d << target)
cpdef ulong p_p_cccnot(cls, ulong p, ulong source1, \
ulong source2, ulong source3, ulong target):
???Triple control NOT
>>> s = Size(4); p = s.p_identity
>>> print s.str_p(s.p_p_cccnot(p,0,1,2,3))
???
143
cdef ulong d
d = (p & (cls.p_base << source1)) >> source1
d &= (p & (cls.p_base << source2)) >> source2
d &= (p & (cls.p_base << source3)) >> source3
return p ^ (d << target)
def Test3(cls, p):
L = []
L.append(cls.p_identity)
L.append(cls.p_p_ccnot(L[0],0,1,2))
L.append(cls.p_p_ccnot(L[1],0,2,1))
L.append(cls.p_p_ccnot(L[2],1,2,0))
for i in range(4):
if cls.DC(p,L[i]):
return i
return ?Fail?
def Test4(cls,p):
if cls.b_p_parity(p) == 0:
return ?Even: ?+str(cls.Test4Even(p))
else:
return ?Odd: ?+str(cls.Test4Odd(p))
def order(cls, p):
???Find order of permutation
>>> s=Size(3); s.order(s.p_identity) == 1
True
>>> s.order(057136420)
7
???
d = p
i = 1
while True:
if d == cls.p_identity:
return i
d = cls.p_pp_mul(d,p)
i += 1
if i > 999:
return None
################################################################
# Rank Algorithms
################################################################
144
cpdef uint b_p_parity(cls, ulong p):
???Return the parity of the permutation.
>>> s = Size(4); p = s.p_identity
>>> s.b_p_parity(p)
0
>>> p = s.p_random(); s.b_p_parity(s.p_pp_mul(p,p))
0
>>> print Size(3).b_p_parity(067543210) # Toffoli Gate
1
???
cdef uint a,c,i,j
a = 0; c = 0
for j in range(cls.N):
if (a>>j)&1 == 0:
c ^= 1
i = j
while True:
a ^= 1<<i
i = cls.r_pi_get(p,i)
if i == j:
break
return c # (cls.N-c)%2
cpdef uint n_p_lowbit(cls, ulong p):
cdef uint n
n = 0
if (p & 0xffffffff) == 0:
n += 32
p = p >> 32
if (p & 0xffff) == 0:
n += 16
p = p >> 16
if (p & 0xff) == 0:
n += 8
p = p >> 8
if (p & 0xf) == 0:
n += 4
p = p >> 4
if (p & 0x3) == 0:
n += 2
p = p >> 2
if (p & 0x1) == 0:
n += 1
return n
145
cpdef uint is_a_p(cls, ulong p):
???Determines whether a permutation is an affine function.
>>> print Size(3).is_a_p(Size(3).p_identity)
True
>>> print Size(3).is_a_p(067543210) # Toffoli Gate
False
>>> print Size(3).is_a_p(067452301) # NOT Gate on lsb
True
???
cdef uint i
cdef ulong a,v
a = 0
v = p & cls.r_mask
for i in range(cls.n):
a ^= (v ^ (p >> ( (1 << i) * cls.n) & cls.r_mask)) \
<< (i * cls.n)
a ^= (v << cls.nn)
for i in range(cls.N):
if cls.r_ra_mul(i,a)!=((p>>(i*cls.n))&cls.r_mask):
return 0
return 1
cpdef uint is_a_p4(cls, ulong p):
???Determines whether a permutation is an affine function.
>>> print Size(4).is_a_p(Size(4).p_identity)
True
>>> print Size(3).is_a_p(067543210) # Toffoli Gate
False
>>> print Size(3).is_a_p(067452301) # NOT Gate on lsb
True
???
p ^= ( p & 0xf)*0x1111111111111111ull
p ^= ((p >> 4) & 0xf)*0x1010101010101010ull
p ^= ((p >> 8) & 0xf)*0x1100110011001100ull
if (p & 0xffff) != 0:
return 0
p ^= ((p >> 16) & 0xf)*0x1111000011110000ull
if (p & 0xffffffff) != 0:
return 0
p ^= ((p >> 32) & 0xf)*0x1111111100000000ull
return (p == 0)
cpdef uint is_t_p(cls, ulong p):
???Determines whether a permutation is a.e. to Toffoli.
146
???
cdef uint i
cdef ulong t,q
i = cls.n_p_signature(p)
if i == 11:
return 1
else:
return 0
cpdef uint is_t3_p(cls, ulong p):
???Determines whether a permutation is a.e. to Toffoli.
???
cdef uint i
cdef ulong t,q
i = cls.n_p_signature(p)
if i == 3:
return 1
else:
return 0
cpdef uint n_p_rank(cls, ulong p):
???Determines rank of the four vector truth tables.
???
cdef uint rank, n, a, b
rank = 0
while p != 0:
n = cls.n_p_lowbit(p)
b = n & 0x3
a = n - b
p ^= ((p>>b)&0x1111111111111111ull)*((p>>a)&0xf)
rank += 1
return rank
cpdef ulong p_p_reduce_affine(cls, ulong p):
p ^= ( p & 0xf)*0x1111111111111111ull
p ^= ((p >> 4) & 0xf)*0x1010101010101010ull
p ^= ((p >> 8) & 0xf)*0x1100110011001100ull
p ^= ((p >> 16) & 0xf)*0x1111000011110000ull
p ^= ((p >> 32) & 0xf)*0x1111111100000000ull
return p
cpdef ulong p_p_reduce_quadratic(cls, ulong p):
p ^= ((p >> 3*4) & 0xf)*0x1000100010001000ull
p ^= ((p >> 5*4) & 0xf)*0x1010000010100000ull
p ^= ((p >> 6*4) & 0xf)*0x1100000011000000ull
147
p ^= ((p >> 9*4) & 0xf)*0x1010101000000000ull
p ^= ((p >> 10*4) & 0xf)*0x1100110000000000ull
p ^= ((p >> 12*4) & 0xf)*0x1111000000000000ull
return p
cpdef ulong p_p_reduce_cubic(cls, ulong p):
p ^= ((p >> 7*4) & 0xf)*0x1000000010000000ull
p ^= ((p >> 11*4) & 0xf)*0x1000100000000000ull
p ^= ((p >> 13*4) & 0xf)*0x1010000000000000ull
p ^= ((p >> 14*4) & 0xf)*0x1100000000000000ull
return p
def rank_signature(cls, ulong p):
cdef uint prev, next
sig = [ cls.b_p_parity(p) ]
prev = cls.n_p_rank(p)
p = cls.p_p_reduce_affine(p)
next = cls.n_p_rank(p)
sig.append(prev - next)
prev = next
p = cls.p_p_reduce_quadratic(p)
next = cls.n_p_rank(p)
sig.append(prev - next)
prev = next
p = cls.p_p_reduce_cubic(p)
next = cls.n_p_rank(p)
sig.append(prev - next)
prev = next
return sig
cpdef uint n_p_signature(cls, ulong p):
cdef uint prev, next, sig
p = cls.p_p_reduce_affine(p)
sig = 4 - cls.n_p_rank(p)
p = cls.p_p_reduce_quadratic(p)
sig ^= (4 - sig - cls.n_p_rank(p)) << 3
return sig
cpdef uint n_p_signature2(cls, ulong p):
cdef uint sig
sig = cls.n_p_signature(p)
sig ^= cls.DC(p,cls.p_p_inv(p)) << 6
sig ^= cls.b_p_parity(p) << 7
return sig
148
def L_p_signature(cls, ulong p):
???Signature List (Linear, Quadratic, Parity,
CommutatorSize, involution, order3, order4)
???
cdef uint sig
sig = cls.n_p_signature(p)
L = []
L.append(sig&0x7)
L.append(sig>>3)
L.append(cls.b_p_parity(p))
L.append(len(cls.S_p_affineCommutator(p)))
L.append(cls.DC(p,cls.p_p_inv(p)))
???Order does not appear to be working
L.append(cls.b_p_AE_Order3(p))
L.append(cls.b_p_AE_Order4(p))
L.append(cls.b_p_AE_Order5(p))
L.append(cls.b_p_AE_Order6(p))
L.append(cls.b_p_AE_Order7(p))
L.append(cls.b_p_AE_Order8(p))
L.append(cls.b_p_AE_Order9(p))
L.append(cls.b_p_AE_Order16(p))
???
return L
################################################################
# Double Coset
################################################################
cpdef uint equivDC(cls, ulong a, ulong b):
cdef uint h
cdef ulong t
for h in range(1<<(cls.nn+cls.n)):
if cls.is_inv_a(h):
t = cls.p_pa_mul(a,h)
t = cls.p_pp_mul(t,b)
if cls.is_a_p_fastfail(t):
return 1
return 0
cpdef linear(cls):
cdef ulong h
H = set()
for h in (1<<cls.nn):
if cls.is_inv_a(h):
149
H.add(h)
return H
###############################################################
# Build up a coset test
###############################################################
cpdef uint DC(cls, ulong p, ulong q):
cdef ulong* coset
cdef ulong* qlist
cdef ulong t
cdef uint i,j
coset = <ulong*> malloc(sizeof(ulong)*20160)
qlist = <ulong*> malloc(sizeof(ulong)*16)
p = cls.p_p_inv(p)
coset[0] = p
p = cls.p_p_4bit_linear(p, coset+1)
for j in range(16):
qlist[j] = cls.p_ap_mul(cls.a_v(j),q)
for i in range(20160):
for j in range(16):
#if cls.is_a_pp_mul(coset[i],qlist[j]) != 0:
if cls.is_a_p4(cls.p_pp_mul(coset[i],qlist[j]))!=0:
free(coset)
free(qlist)
return 1
free(coset)
free(qlist)
return 0
cpdef ulong a_pp_DC(cls, ulong p, ulong q):
cdef ulong* coset
cdef ulong* qlist
cdef ulong t, a, pi
cdef uint i,j
coset = <ulong*> malloc(sizeof(ulong)*20160)
qlist = <ulong*> malloc(sizeof(ulong)*16)
pi = cls.p_p_inv(p) # pi = p^-1
coset[0] = pi
cls.p_p_4bit_linear(pi, coset+1)
for j in range(16):
qlist[j] = cls.p_ap_mul(cls.a_v(j),q)
150
for i in range(20160):
for j in range(16):
if cls.is_a_p4(cls.p_pp_mul(coset[i],qlist[j]))!=0:
t = cls.p_pp_mul(coset[i],qlist[j])
a = cls.a_a_inv(cls.a_p(t)) << 32
#a = cls.a_p(cls.p_p_inv(t)) << 32
#a = cls.a_p(t) << 32
t = cls.p_pp_mul(p,t)
t = cls.p_pp_mul(t,cls.p_p_inv(q))
free(coset)
free(qlist)
return a ^ cls.a_p(t)
free(coset)
free(qlist)
return 0
cpdef uint DC_Toff(cls, ulong p, ulong q):
??? Tests if p and q affinely differ by a Toffoli.
There exists a,b,c such that apbqc=T
???
cdef ulong* coset
cdef ulong* qlist
cdef ulong t
cdef uint i,j
coset = <ulong*> malloc(sizeof(ulong)*20160)
qlist = <ulong*> malloc(sizeof(ulong)*16)
p = cls.p_p_inv(p)
coset[0] = p
p = cls.p_p_4bit_linear(p, coset+1)
for j in range(16):
qlist[j] = cls.p_ap_mul(cls.a_v(j),q)
for i in range(20160):
for j in range(16):
if cls.is_t_p(cls.p_pp_mul(coset[i],qlist[j]))!=0:
free(coset)
free(qlist)
return 1
free(coset)
free(qlist)
return 0
cpdef uint DC_T3(cls, ulong p, ulong q):
??? Tests if p and q affinely differ by a CCCNOT.
There exists a,b,c such that apbqc=T^3
???
151
cdef ulong* coset
cdef ulong* qlist
cdef ulong t
cdef uint i,j
coset = <ulong*> malloc(sizeof(ulong)*20160)
qlist = <ulong*> malloc(sizeof(ulong)*16)
p = cls.p_p_inv(p)
coset[0] = p
p = cls.p_p_4bit_linear(p, coset+1)
for j in range(16):
qlist[j] = cls.p_ap_mul(cls.a_v(j),q)
for i in range(20160):
for j in range(16):
if cls.is_t3_p(cls.p_pp_mul(coset[i],qlist[j]))!=0:
free(coset)
free(qlist)
return 1
free(coset)
free(qlist)
return 0
#cpdef uint b_p_AE_Involution(cls, ulong p):
def b_p_AE_Involution(cls, p):
??? Tests if p is Affine Equivalent to an involution.
There exists a,b such that apb=i where i^2=1.
???
cdef ulong* coset
cdef ulong* qlist
cdef ulong t
cdef uint i,j
coset = <ulong*> malloc(sizeof(ulong)*20160)
qlist = <ulong*> malloc(sizeof(ulong)*16)
coset[0] = p
cls.p_p_4bit_linear(p, coset+1)
for j in range(16):
qlist[j] = cls.p_ap_mul(cls.a_v(j),p)
for i in range(20160):
for j in range(16):
t = cls.p_pp_mul(coset[i],qlist[j])
if cls.is_a_p4(t) != 0:
t = cls.p_p_inv(t)
t = cls.p_pp_mul(p,t)
if cls.p_pp_mul(t,t) == cls.p_identity:
free(coset)
free(qlist)
152
return 1
free(coset)
free(qlist)
return 0
# Hamiltonian cycle
cdef inline ulong a_aij_cnot(cls, ulong a, uint i, uint j):
return a ^ (((a>>i) & cls.p_base) << j)
cdef ulong p_p_01(cls, ulong p, ulong* coset):
p ^= ((p >> 1) & cls.p_base); coset[0] = p # CNOT(1,0)
p ^= ((p & cls.p_base) << 1); coset[1] = p # CNOT(0,1)
p ^= ((p >> 1) & cls.p_base); coset[2] = p # CNOT(1,0)
p ^= ((p & cls.p_base) << 1); coset[3] = p # CNOT(0,1)
p ^= ((p >> 1) & cls.p_base); coset[4] = p # CNOT(1,0)
return p
cdef ulong p_p_10(cls, ulong p, ulong* coset):
p ^= ((p & cls.p_base) << 1); coset[0] = p # CNOT(0,1)
p ^= ((p >> 1) & cls.p_base); coset[1] = p # CNOT(1,0)
p ^= ((p & cls.p_base) << 1); coset[2] = p # CNOT(0,1)
p ^= ((p >> 1) & cls.p_base); coset[3] = p # CNOT(1,0)
p ^= ((p & cls.p_base) << 1); coset[4] = p # CNOT(0,1)
return p
cdef ulong p_p_20(cls, ulong p, ulong* coset):
cdef ulong mask
mask = cls.p_base << 2
p = cls.p_p_10(p,coset)
p ^= ((p & mask) >> 1); coset[5] = p # CNOT(2,1)
p = cls.p_p_10(p,coset+6)
p ^= ((p & mask) >> 2); coset[11] = p # CNOT(2,0)
p = cls.p_p_10(p,coset+12)
p ^= ((p & mask) >> 1); coset[17] = p # CNOT(2,1)
p = cls.p_p_10(p,coset+18)
return p
cdef ulong p_p_21(cls, ulong p, ulong* coset):
cdef ulong mask
mask = cls.p_base << 2
p = cls.p_p_01(p,coset)
p ^= ((p & mask) >> 2); coset[5] = p # CNOT(2,0)
p = cls.p_p_01(p,coset+6)
p ^= ((p & mask) >> 1); coset[11] = p # CNOT(2,1)
p = cls.p_p_01(p,coset+12)
153
p ^= ((p & mask) >> 2); coset[17] = p # CNOT(2,0)
p = cls.p_p_01(p,coset+18)
return p
cdef ulong p_p_0212(cls, ulong p, ulong* coset):
cdef uint i,j
for i in range(7):
j = ((i>>2)^i)&1 # produces pattern 0101101
if j==0:
p = cls.p_p_20(p,coset+i*24)
else:
p = cls.p_p_21(p,coset+i*24)
if i<6:
p = cls.a_aij_cnot(p,j,2); coset[(i+1)*24-1] = p
return p
cdef ulong p_p_30(cls, ulong p, ulong* coset):
p = cls.p_p_0212(p,coset)
p = cls.a_aij_cnot(p,3,2); coset[168-1] = p
p = cls.p_p_0212(p,coset+168)
p = cls.a_aij_cnot(p,3,1); coset[2*168-1] = p
p = cls.p_p_0212(p,coset+2*168)
p = cls.a_aij_cnot(p,3,2); coset[3*168-1] = p
p = cls.p_p_0212(p,coset+3*168)
p = cls.a_aij_cnot(p,3,0); coset[4*168-1] = p
p = cls.p_p_0212(p,coset+4*168)
p = cls.a_aij_cnot(p,3,2); coset[5*168-1] = p
p = cls.p_p_0212(p,coset+5*168)
p = cls.a_aij_cnot(p,3,1); coset[6*168-1] = p
p = cls.p_p_0212(p,coset+6*168)
p = cls.a_aij_cnot(p,3,2); coset[7*168-1] = p
p = cls.p_p_0212(p,coset+7*168)
return p
cdef ulong p_p_31(cls, ulong p, ulong* coset):
p = cls.p_p_0212(p,coset)
p = cls.a_aij_cnot(p,3,2); coset[168-1] = p
p = cls.p_p_0212(p,coset+168)
p = cls.a_aij_cnot(p,3,0); coset[2*168-1] = p
p = cls.p_p_0212(p,coset+2*168)
p = cls.a_aij_cnot(p,3,2); coset[3*168-1] = p
p = cls.p_p_0212(p,coset+3*168)
p = cls.a_aij_cnot(p,3,1); coset[4*168-1] = p
p = cls.p_p_0212(p,coset+4*168)
p = cls.a_aij_cnot(p,3,2); coset[5*168-1] = p
154
p = cls.p_p_0212(p,coset+5*168)
p = cls.a_aij_cnot(p,3,0); coset[6*168-1] = p
p = cls.p_p_0212(p,coset+6*168)
p = cls.a_aij_cnot(p,3,2); coset[7*168-1] = p
p = cls.p_p_0212(p,coset+7*168)
return p
cdef ulong p_p_4bit_linear(cls, ulong p, ulong* coset):
p = cls.p_p_30(p,coset)
p = cls.a_aij_cnot(p,0,3); coset[1*1344-1] = p
p = cls.p_p_31(p,coset+1344)
p = cls.a_aij_cnot(p,2,3); coset[2*1344-1] = p
p = cls.p_p_31(p,coset+2*1344)
p = cls.a_aij_cnot(p,0,3); coset[3*1344-1] = p
p = cls.p_p_30(p,coset+3*1344)
p = cls.a_aij_cnot(p,1,3); coset[4*1344-1] = p
p = cls.p_p_31(p,coset+4*1344)
p = cls.a_aij_cnot(p,2,3); coset[5*1344-1] = p
p = cls.p_p_31(p,coset+5*1344)
p = cls.a_aij_cnot(p,0,3); coset[6*1344-1] = p
p = cls.p_p_30(p,coset+6*1344)
p = cls.a_aij_cnot(p,1,3); coset[7*1344-1] = p
p = cls.p_p_31(p,coset+7*1344)
p = cls.a_aij_cnot(p,0,3); coset[8*1344-1] = p
p = cls.p_p_31(p,coset+8*1344)
p = cls.a_aij_cnot(p,2,3); coset[9*1344-1] = p
p = cls.p_p_30(p,coset+9*1344)
p = cls.a_aij_cnot(p,1,3); coset[10*1344-1] = p
p = cls.p_p_31(p,coset+10*1344)
p = cls.a_aij_cnot(p,0,3); coset[11*1344-1] = p
p = cls.p_p_31(p,coset+11*1344)
p = cls.a_aij_cnot(p,2,3); coset[12*1344-1] = p
p = cls.p_p_30(p,coset+12*1344)
p = cls.a_aij_cnot(p,0,3); coset[13*1344-1] = p
p = cls.p_p_30(p,coset+13*1344)
p = cls.a_aij_cnot(p,3,2) # One wasted step.
p = cls.a_aij_cnot(p,2,3); coset[14*1344-1] = p
p = cls.p_p_30(p,coset+14*1344)
return p
###############################################################
# Set Manipulators
###############################################################
cdef ulong* A_S(cls, S):
155
cdef ulong* A
cdef uint i,n
n = len(S)
A = <ulong*> malloc(sizeof(ulong)*n)
i = 0
for x in S:
A[i] = x
i += 1
return A
cdef S_An(cls, ulong* A, uint n):
cdef uint i
S = set()
for i in range(n):
S.add(A[i])
return S
cdef ulong* A_An_inv(cls, ulong* A, uint n):
cdef uint i
for i in range(n):
A[i] = cls.p_p_inv(A[i])
return A
cdef ulong* A_p_leftAffineCoset(cls, ulong p):
cdef ulong* coset
cdef uint i,j
coset = <ulong*> malloc(sizeof(ulong)*20160*16)
coset[0] = p
p = cls.p_p_4bit_linear(p, coset+1)
for i in range(16):
for j in range(20160):
coset[20160*i+j] = coset[j] ^ (cls.p_base * i)
return coset
def S_p_leftAffineCoset(cls, ulong p):
cdef ulong* coset
coset = cls.A_p_leftAffineCoset(p)
return cls.S_An(coset,322560)
cdef ulong* A_p_leftLinearCoset(cls, ulong p):
cdef ulong* coset
cdef uint i,j
coset = <ulong*> malloc(sizeof(ulong)*20160)
coset[0] = p
p = cls.p_p_4bit_linear(p, coset+1)
156
return coset
def S_p_leftLinearCoset(cls, ulong p):
cdef ulong* coset
coset = cls.A_p_leftAffineCoset(p)
return cls.S_An(coset,20160)
cdef ulong* A_p_affineCommutator(cls, ulong p):
cdef ulong* S
cdef ulong* C
cdef uint i, count
S = <ulong*> malloc(sizeof(ulong)*20160*16)
C = <ulong*> malloc(sizeof(ulong)*20160*16+1)
S = cls.A_p_leftAffineCoset(p) # pH
p = cls.p_p_inv(p)
count = 1
for i in range(322560):
if cls.is_a_pp_mul(S[i],p) != 0: # pHp^-1
C[count] = cls.p_pp_mul(S[i],p)
count += 1
C[0] = count - 1
return C
def S_p_affineCommutator(cls, ulong p):
cdef ulong* S
cdef uint n
S = cls.A_p_affineCommutator(p)
return cls.S_An(S+1,S[0])
cdef ulong* A_p_linearCommutator(cls, ulong p):
cdef ulong* S
cdef ulong* C
cdef uint i, count
S = <ulong*> malloc(sizeof(ulong)*20160)
C = <ulong*> malloc(sizeof(ulong)*20160+1)
S = cls.A_p_leftLinearCoset(p) # pH
p = cls.p_p_inv(p)
count = 1
for i in range(20160):
if cls.is_a_pp_mul(S[i],p) != 0: # pHp^-1
if cls.v_p(cls.p_pp_mul(S[i],p)) == 0:
C[count] = cls.p_pp_mul(S[i],p)
count += 1
C[0] = count - 1
return C
157
def S_p_linearCommutator(cls, ulong p):
cdef ulong* S
cdef uint n
S = cls.A_p_linearCommutator(p)
return cls.S_An(S+1,S[0])
def S_p_affineTransversal(cls, ulong p):
cdef ulong t
H = cls.S_p_affineCommutator(p)
G = cls.S_p_leftAffineCoset(cls.p_identity)
for h in H:
G.remove(h)
H.remove(cls.p_identity) # t*id will be removed by pop()
T = set([ cls.p_identity ])
while len(G) > 0:
t = G.pop()
for h in H:
G.remove(cls.p_pp_mul(t,h))
T.add(t)
return T
def S_p_linearTransversalToff(cls):
???Affine shift commute through a Toffoli gate into linear.
Thus, the transversal can be only linear functions.
???
cdef ulong t,p
p = 0xbedcfa9836547210 # Toffoli
H = cls.S_p_linearCommutator(p)
G = cls.S_p_leftLinearCoset(cls.p_identity)
for h in H:
G.remove(h)
H.remove(cls.p_identity) # t*id will be removed by pop()
T = set([ cls.p_identity ])
while len(G) > 0:
t = G.pop()
for h in H:
G.remove(cls.p_pp_mul(t,h))
T.add(t)
return T
def is_pS_DC(cls, ulong p, S):
cdef ulong s
cdef uint sig
158
sig = cls.n_p_signature(p)
for s in S:
if cls.n_p_signature(s) != sig:
continue
if cls.DC(p,s) != 0:
return 1
return 0
def p_pS_DC(cls, ulong p, S):
cdef ulong s
cdef uint sig
sig = cls.n_p_signature(p)
for s in S:
if cls.n_p_signature(s) != sig:
continue
if cls.DC(p,s) != 0:
return s
return 0
cdef ulong is_pAn_DC(cls, ulong p, ulong* A, uint n):
cdef ulong* coset
cdef ulong t
cdef uint i,j,k
coset = <ulong*> malloc(sizeof(ulong)*20160)
p = cls.p_p_inv(p)
coset[0] = p
p = cls.p_p_4bit_linear(p, coset+1)
for i in range(20160):
for j in range(16):
t = coset[i] ^ (cls.p_base * j)
for k in range(n):
if cls.is_a_pp_mul(t,A[k]) != 0:
free(coset)
return 1
free(coset)
return 0
???
Is Basis Fixing
Is Normal (zero fixing)
Is involution
Cycle Index Polynomial
Conjugation
Random
Hilbert polynomial
159
One and two variable
???
if __name__ == ?__main__?:
import doctest, sys
doctest.testmod(sys.modules[__name__])
160
BIBLIOGRAPHY
[AtSotCL51] H. H. Aiken and the Sta of the Computation Laboratory, Synthesis
of electronic computing and control circuits, 1951, pp. 231{278.
[BCP97] Wieb Bosma, John Cannon, and Catherine Playoust, The magma
algebra system i: the user language, J. Symb. Comput. 24 (1997),
no. 3-4, 235{265.
[Ben73] C. H. Bennett, Logical reversibility of computation, IBM J. Res.
Develop. 17 (1973), 525{532. MR MR0449020 (56 #7325)
[CG96] Stephen J. Curran and Joseph A. Gallian, Hamiltonian cycles and
paths in cayley graphs and digraphs { a survey, Discrete Mathemat-
ics 156 (1996), no. 1-3, 1 { 18.
[HG92] A. P. Hiltgen and J. Ganz, On the existence of speci c complexity
{ asymmetric permutations, Technical Report, Signal and Informa-
tion Proc. Lab., ETH-Zurich (1992).
[Hil93] Alain P. L. Hiltgen, Constructions of feebly-one-way families of per-
mutations, Advances in cryptology|AUSCRYPT ?92 (Gold Coast,
1992), Lecture Notes in Comput. Sci., vol. 718, Springer, Berlin,
1993, pp. 422{434. MR MR1292706 (96e:94014)
161
[Hol05] Derek F. Holt, Handbook of computational group theory (discrete
mathematics and its applications), Chapman & Hall/CRC, January
2005.
[Hou06] Xiang-dong Hou, A nity of permutations of Fn2, Discrete Appl.
Math. 154 (2006), no. 2, 313{325. MR MR2194404 (2006j:06024)
[Jac74] Nathan Jacobson, Basic algebra I, W. H. Freeman and Company,
San Francisco, CA, USA, 1974, ISBN 0-7167-0453-6 (v. 1.).
[Jun93] Dieter Jungnickel, Finite  elds, Bibliographisches Institut,
Mannheim, 1993, Structure and arithmetics. MR MR1238714
(94g:11109)
[KMS07] Samuel A. Kutin, David Petrie Moulton, and Lawren M. Smithline,
Computation at a distance, 2007.
[Lor64] C.S. Lorens, Invertible boolean functions, Electronic Computers,
IEEE Transactions on EC-13 (1964), no. 5, 529{541.
[LPS87] Martin W. Liebeck, Cheryl E. Praeger, and Jan Saxl, A classi ca-
tion of the maximal subgroups of the  nite alternating and symmet-
ric groups, J. Algebra 111 (1987), no. 2, 365{383. MR MR916173
(89b:20008)
[Mas96] James L. Massey, The di culty with di culty, Presented EURO-
CRYPT 1996, in Zaragoza, Spain, 1996.
[MDM07] D. Maslov, G. W. Dueck, and D. M. Miller, Techniques for the
synthesis of reversible to oli networks, ACM Trans. Des. Autom.
Electron. Syst. 12 (2007), no. 4, 42.
162
[Moo52] E. F. Moore, A table for four-relay two-terminal contact networks,
1952.
[Sco87] W. R. Scott, Group theory, second ed., Dover Publications Inc.,
New York, 1987. MR MR896269 (88d:20001)
[Vol99] Heribert Vollmer, Introduction to circuit complexity, Texts in The-
oretical Computer Science. An EATCS Series, Springer-Verlag,
Berlin, 1999, A uniform approach. MR MR1704235 (2001b:68047)
163