ABSTRACT

Title of dissertation: PRACTICAL DYNAMIC SOFTWARE UPDATING
Iulian Gheorghe Neamtiu, Doctor of Philosophy, 2008
Dissertation directed by: Professor Michael Hicks, Department of Computer Science

This dissertation makes the case that programs can be updated while they run, with modest programmer effort, while providing certain update safety guarantees, and without imposing a significant performance overhead.

Few systems are designed with on-the-fly updating in mind. Those systems that permit it support only a very limited class of updates, and generally provide no guarantees that, following the update, the system will behave as intended. We tackle the on-the-fly updating problem using a compiler-based approach called dynamic software updating (DSU), in which a program is patched with new code and data while it runs. The challenge is in making DSU practical: it should support changes to programs as they occur in practice, yet be safe, easy to use, and not impose a large overhead.

This dissertation makes both theoretical contributions (formalisms for reasoning about, and ensuring, update safety) and practical contributions (Ginseng, a DSU implementation for C). Ginseng supports a broad range of changes to C programs, and performs a suite of safety analyses to ensure certain update safety properties. We performed a substantial study of using Ginseng to dynamically update six sizable C server programs, three single-threaded and three multi-threaded. The updates were derived from changes over long periods of time, ranging from 10 months to 4 years' worth of releases. Though the programs changed substantially, the updates were straightforward to generate, and performance measurements show that the overhead of Ginseng is detectable, but modest.

In summary, this dissertation shows that DSU can be practical for updating realistic applications as they are written now, and as they evolve in practice.

PRACTICAL DYNAMIC SOFTWARE UPDATING

by

Iulian Gheorghe Neamtiu

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2008

Advisory Committee:
Professor Michael Hicks, Chair/Advisor
Professor Bruce Jacob, Dean's Representative
Professor Jeffrey Foster
Professor Jeffrey Hollingsworth
Professor Neil Spring

Copyright by Iulian Gheorghe Neamtiu, 2008

To my parents, Elisabeta and Gheorghe

Acknowledgments

I am deeply indebted to my advisor, Mike Hicks, for his relentless help and guidance, academic and otherwise, that have made this work possible. I am grateful to Mike for the late hours we spent working together prior to paper deadlines, for his patience in showing me how to write a type system and prove it correct, and for a million other things. But apart from teaching me the "skills of the trade" for doing research, Mike has taught me something equally important: that I could achieve what at first seemed impossible, if I was willing to dedicate myself to it.

I want to thank Jeff Foster for his help and advice, and for teaching an excellent class on Program Analysis and Understanding in Fall 2003 that opened my eyes to the rigor and elegance of Programming Languages research.

My first advisor, Liviu Iftode, got me started on conducting research very early in my Ph.D. program. Working with Liviu and his students at Rutgers University, Florin Sultan and Aniruddha Bohra, put me on the right track for my dynamic updating work.
I also thank the other members of my Ph.D. committee, Jeff Hollingsworth, Bruce Jacob, and Neil Spring, for taking time to go over my dissertation and providing feedback and suggestions.

I was fortunate to collaborate with Gareth Stoyle, Gavin Bierman, Peter Sewell, Manuel Oriol, and Polyvios Pratikakis on both theoretical and practical aspects of this work. This dissertation has benefited greatly from our joint efforts.

The Programming Languages group at Maryland was a fun and productive environment to work in. Nikhil Swamy, Mike Furr, David Greenfieldboyce, Eric Hardisty, Chris Hayden, Pavlos Papageorgiou, Nick Petroni, Khoo-Yit Phang, and Saurabh Srivastava were always willing to help whenever I needed suggestions, ideas, or feedback.

Finally, I thank my family for their unwavering support that allowed me to start a Ph.D. and carry it out to completion. My wife Monica brought much needed sunshine into my graduate student life: she put up with my moods when I was stressed, stood by my side, and provided comfort and encouragement. She did all this while being in a similarly stressful position herself: first a Ph.D. student and, later on, an assistant professor. My parents Elisabeta and Gheorghe brought me up to respect and value education, and to strive for thoroughness. They have spared no effort in helping me succeed, and this dissertation is a small tribute to their endeavors.

Table of Contents

List of Figures
List of Abbreviations

1 Overview
  1.1 Motivation
  1.2 Ginseng
    1.2.1 Ginseng Compiler
    1.2.2 Patch Generator
    1.2.3 Runtime System and Update Points
    1.2.4 Version Consistency
  1.3 Evaluation
  1.4 Contributions

2 Software Evolution
  2.1 Introduction
  2.2 Approach
    2.2.1 AST Matching
    2.2.2 Change Detection and Reporting
    2.2.3 Implementation
  2.3 Implications for Dynamic Software Updating
  2.4 Conclusion

3 Single-threaded Implementation and Evaluation
  3.1 Introduction
  3.2 Enabling On-line Updates
    3.2.1 Function Indirection
    3.2.2 Type Wrapping
    3.2.3 Example
    3.2.4 Loops
  3.3 Safety Analysis
    3.3.1 Tracking Changes to Types
    3.3.2 Abstraction-Violating Aliases
    3.3.3 Unsafe Casts and Polymorphism
    3.3.4 Ginseng's Type Safety vs. Activeness Check
    3.3.5 Choosing Update Points
  3.4 Dynamic Patches
  3.5 Experience
    3.5.1 Applications
    3.5.2 Changes To Original Source Code
    3.5.3 Dynamic Updating Catalysts
    3.5.4 Summary
  3.6 Performance
    3.6.1 Application Performance
    3.6.2 Memory Footprint
    3.6.3 Service Disruption
    3.6.4 Compilation
  3.7 Acknowledgments
  3.8 Conclusion

4 Multi-threaded Implementation and Evaluation
  4.1 Introduction
  4.2 Induced Update Points
    4.2.1 Barrier Approach
    4.2.2 Relaxed Approach
    4.2.3 Discussion
  4.3 Implementation
    4.3.1 Replacing Active Code
    4.3.2 Concurrency Issues
  4.4 Experience
    4.4.1 Icecast
    4.4.2 Memcached
    4.4.3 Space Tyrant
    4.4.4 Source Code Changes
  4.5 Experiments
    4.5.1 Update Availability
    4.5.2 Application Performance
    4.5.3 Memory Footprint
    4.5.4 Compilation Time
  4.6 Conclusion

5 Version Consistency
  5.1 Introduction
  5.2 Contextual Effects
    5.2.1 Syntax
    5.2.2 Typing
    5.2.3 Semantics and Soundness
    5.2.4 Contextual Effect Inference
  5.3 Single-threaded Transactional Version Consistency
    5.3.1 Syntax
    5.3.2 Typing
    5.3.3 Operational Semantics
    5.3.4 Soundness
    5.3.5 Implementing Version Consistency for C Programs
    5.3.6 Experiments
  5.4 Relaxed Updates
    5.4.1 Syntax
    5.4.2 Typing
    5.4.3 Operational Semantics
    5.4.4 Soundness
  5.5 Multi-threaded Version Consistency
    5.5.1 Syntax
    5.5.2 Typing
    5.5.3 Operational Semantics
    5.5.4 Soundness
  5.6 Conclusion

6 Related Work
  6.1 Dynamic Software Updating
    6.1.1 Update Support
    6.1.2 Correctness of Dynamic Software Updating
    6.1.3 Multi-threaded Systems
    6.1.4 Operating Systems
    6.1.5 DSU in Languages Other Than C
    6.1.6 Edit and Continue Development
    6.1.7 Dynamic Code Patching
  6.2 Alternative (Non-DSU) Approaches to Online Updating
  6.3 Software Evolution
  6.4 Contextual Effects

7 Future Work
  7.1 DSU for Other Categories of Applications
  7.2 DSU for Operating Systems
  7.3 Safety of Dynamic Software Updating
    7.3.1 Dynamic Approaches
    7.3.2 Static Approaches
  7.4 Software Evolution

8 Conclusions

A Developing Updateable Software Using Ginseng
  A.1 Preparing Initial Sources
    A.1.1 Specifying Update Points
    A.1.2 Code Extraction
    A.1.3 Memory Allocation Functions
    A.1.4 Analysis
    A.1.5 Check-ins
  A.2 Dynamic Patches
    A.2.1 Type Transformers
    A.2.2 State Transformers

B Proteus-tx Proofs

C Relaxed Updates Proofs

D Multi-threading Proofs

Bibliography

List of Figures

1.1 Building and dynamically updating software with Ginseng. In Stage 1, Ginseng compiles a C program into an updateable application. In Stage 2 and later, dynamic patches are generated and loaded into the application.
2.1 High-level view of ASTdiff.
2.2 Two successive program versions.
2.3 Map generation algorithm.
2.4 Summary output produced for the code in Figure 2.2.
2.5 Density tree for struct/union field additions (Linux 2.4.20-2.4.21).
2.6 ASTdiff running time for various program sizes.
2.7 Function and global variable additions and deletions.
2.8 Function body and prototype changes.
2.9 Classifying changes to types.
3.1 Compiling a program to be dynamically updateable.
3.2 Updating a long-running loop using code extraction.
3.3 Ginseng dynamic patch for the update Zebra 0.92a → 0.93a.
3.4 Zebra architecture.
3.5 Vsftpd: simplified structure.
3.6 Zebra performance.
3.7 KissFFT: DSU impact on performance.
3.8 KissFFT: impact of optimizations on running time.
3.9 Memory footprints.
3.10 Patch application times.
3.11 DSU compilation time breakdown for updateable programs.
4.1 Contextual effects and their use for implementing induced update points.
4.2 Examples of version consistent updates.
4.3 Example of a version inconsistent update.
4.4 Pseudo-code for safety check and update protocol routines.
4.5 Icecast structure. DSU annotations are marked with "*".
5.1 Contextual effects source language.
5.2 Contextual effects type system.
5.3 Contextual effects operational semantics (partial).
5.4 Proteus-tx syntax, effects, and updates.
5.5 Proteus-tx typing (extends Figure 5.2).
5.6 Proteus-tx operational semantics.
5.7 Proteus-tx update safety.
5.8 Proteus-tx typing extensions for proving soundness.
5.9 Version consistency analysis results.
5.10 Source language for relaxed updates.
5.11 Selected type rules for relaxed updates.
5.12 Relaxed updates: operational semantics.
5.13 Relaxed updates: heap typing, trace and update safety.
5.14 Relaxed updates: typing extensions for proving soundness.
5.15 Multi-threaded syntax.
5.16 Multi-threaded additions for expression typing.
5.17 Multi-threaded configuration typing.
5.18 Multi-threaded operational semantics rules.
7.1 Spatio-temporal characteristics of long-running applications.
A.1 Directing Ginseng to perform code extraction.
A.2 Directing Ginseng to perform check-ins.
A.3 Type transformer example.
A.4 State transformer example.
B.1 Transaction effect extraction.

List of Abbreviations

AST   Abstract Syntax Tree
AVA   Abstraction-Violating Alias
DSU   Dynamic Software Updating
FTP   File Transfer Protocol
LOC   Lines Of Code
OS    Operating System
SSH   Secure SHell
TVC   Transactional Version Consistency
VC    Version Consistency
VM    Virtual Machine

Chapter 1

Overview

1.1 Motivation

Continuous operation is a requirement of many of today's computer systems. Examples range from pacemakers to cell phone base stations to nuclear power plant monitors. For ISPs, credit card providers, brokerages, and on-line stores, being available 24/7 is synonymous with staying in business: an hour of downtime can cost hundreds of thousands, or even millions, of dollars [84, 90, 34], and longer downtimes put companies at increasingly higher risk.

Despite the requirement that these systems "run forever," they must be updated to fix bugs and add new features. The most common update method today, from data centers to desktops to sensor networks, is to stop the system, install the update, and restart at the new version. For example, one study [65] found that 75% of nearly 6,000 outages of high-availability applications were planned outages for hardware and software maintenance. Another example is critical updates to Windows Vista, where an update can be considered so important that the operating system decides to apply it and reboot without giving the user the option to postpone installation [13].

Internet access is quickly becoming ubiquitous, so more software vendors release their operating system or application patches online. Unfortunately, this leads to more frequent application restarts or, when updating the OS, to more frequent reboots. To make a bad situation worse, experts suggest even higher patch release frequencies are needed, to reduce application and OS vulnerability [45].

As expected, this increase in patch release frequency, and hence restarts, is problematic. In a large enterprise, reboots can have a large administrative cost [116]. For embedded systems involved in mission-critical or medical applications, reboots are intolerable. For system administrators and end users, patches and reboots are burdensome: both categories are slow in applying patches because patches are disruptive and might introduce new bugs [97].

To fix these problems, we need to support on-line updates, i.e., applying software updates without having to restart or reboot. In prior work, many researchers have proposed variations of an approach to supporting on-line updates called Dynamic Software Updating (DSU). In this approach, a running program is patched with new code and data on-the-fly, while it runs.

This dissertation tackles the on-line updating problem using a fine-grained, compiler-based DSU approach. We compile programs specially so that they can be dynamically patched, generate most of a dynamic patch automatically, and finally, load the dynamic patch into the running program.
DSU is appealing because of its generality: in principle any program can be updated in a fine-grained way, without the need for redundant hardware or special-purpose software architectures. The challenge is in making DSU practical: it should be flexible and yet safe, efficient, and easy to use. We now describe each of these properties in detail.

1. Flexibility. DSU should be applicable to a broad range of applications (irrespective of abstraction level, software architecture, or level of concurrency), and permit arbitrary updates to applications (since the form of future updates cannot be predicted).

2. Efficiency. Applications should require few modifications to support DSU, patches should be easy to write, and updateability should not degrade application performance.

3. Safety. DSU should provide safety guarantees that give application developers (and patch developers) assurances that, following the update, the program will behave as intended.

Unfortunately, DSU systems presented in prior work fail to address one or more of these requirements. Many systems do not support all of the software changes as they appear in practice, i.e., they are not flexible. Other systems are flexible, but provide no update safety guarantees.

To address these problems, we have built Ginseng, a new DSU system for C programs that aims to support most changes that appear in practice, and to satisfy the three practicality criteria laid out above. We have chosen C because it is a very popular language in the construction of long-running software. The kernels of Linux and the BSD OS family are written in C. A survey on safety-critical software [98] used in aerospace, transportation, medical, and energy systems finds Ada, followed by assembler, C, and C++, to be the predominant programming languages used to construct such systems. Popular long-running Internet servers such as BIND, Apache, Sendmail, and OpenSSH are also written in C, motivating our decision to pursue C as the target language for our DSU system.

A DSU system must support the kinds of software changes that typically occur between releases. To find out how programs typically change, we studied the source code evolution of some long-running C programs. We built a tool named ASTdiff that parses two versions of a program, compares their abstract syntax trees, and reports the differences. We used ASTdiff to compare versions of several large C programs (BIND, OpenSSH, Apache, Vsftpd, GNU Zebra, and the Linux kernel) spanning several months to several years. We describe ASTdiff and our findings in Chapter 2. The results of the study show that, to enable long-term evolution, a DSU system must support the addition of new definitions (functions, data, or types), the replacement of existing definitions (data or functions), and changes to types (data representations, function signatures, and types of global variables). Similar studies on the Linux kernel [89] and several substantial Java applications [28] show that changes to function signatures and class interfaces are part of software evolution for all programs analyzed.

A large number of compiler- or library-based DSU systems have been developed for C [42, 47, 20, 6], C++ [52, 60], Java [17, 86, 31, 70], and functional languages like ML [32, 43] and Erlang [8]. Many do not support all of the changes needed to make dynamic updates in practice. For example, updates cannot change type definitions or function prototypes [86, 31, 52, 60, 6], or else only permit such changes for abstract types or encapsulated objects [60, 43].
In many cases, updates to active code (e.g., long-running loops) are disallowed [43, 70, 42, 47, 60], and data stored in local variables may not be transformed [50, 47, 42, 52]. Recent systems are more flexible, and support such changes [23, 24, 68], but provide no safety guarantees.

[Figure 1.1: Building and dynamically updating software with Ginseng. In Stage 1, Ginseng compiles a C program into an updateable application. In Stage 2 and later, dynamic patches are generated and loaded into the application.]

1.2 Ginseng

Ginseng is a compiler and tool suite for constructing updateable applications from C programs. Using Ginseng, we compile programs specially so that they can be dynamically patched, and generate most of a dynamic patch automatically. Ginseng performs a series of analyses that, when combined with runtime support, ensure that an update will not violate certain safety properties, while guaranteeing that data is kept up-to-date. We now proceed to presenting a high-level overview of our approach.

Ginseng consists of a compiler, a patch generator, and a runtime system for building updateable software. The compiler and patch generator are written in Objective Caml using the CIL framework [80]. The runtime system is a library written in C. Basic usage is illustrated in Figure 1.1, with Ginseng components in white boxes. There are two stages. First, for the initial version of a program, v0.c, the compiler generates an updateable executable v0, along with some type and analysis information (Version Data d0). The executable is then deployed. Second, when the program has changed to a new version (v1.c), the developer provides the new and old code to the patch generator to generate a patch p1.c representing the differences. This is passed to the compiler along with the current version information, and turned into a dynamic patch v0 → v1. The runtime system links the dynamic patch into the running program, completing the on-line update. This process continues for each subsequent program version.

1.2.1 Ginseng Compiler

The Ginseng compiler has two responsibilities: 1) it compiles programs to be dynamically updateable, and 2) it applies static analyses to ensure updates are safe even when type definitions change. We describe each of these in turn.

Compilation Techniques. The Ginseng compiler transforms an input C program so that existing functions will call replacement functions present in a dynamic patch, and data is converted to the latest representation whenever data types change.

The technique for updating functions is called function indirection; it permits old code to call new function versions by introducing a level of indirection (via a global variable) between a caller and the called function. To update a function to its new version, the runtime system dynamically loads the new function version and sets the indirection variable to the new function, so new calls go to the new function version.
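To make the mechanism concrete, the sketch below shows the shape of the indirection just described; the names (handler_v0, handler_ptr, and so on) are hypothetical illustrations, not Ginseng's actual generated identifiers.

    /* Original function, compiled into version 0 of the program. */
    int handler_v0(int req) { return req + 1; }

    /* Replacement, loaded later from a dynamic patch. */
    int handler_v1(int req) { return req + 2; }

    /* The compiler rewrites every call handler(x) into an indirect
       call (*handler_ptr)(x) through this global variable. */
    int (*handler_ptr)(int) = handler_v0;

    /* At update time, the runtime system retargets the pointer, so
       all subsequent calls execute the new version. */
    void update_handler(void) { handler_ptr = handler_v1; }

Because the indirection variable is global, even call sites compiled into old code reach the new version the next time they execute.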
Ginseng also must permit transformations to the state of the program, so the state is compatible with the new code. For this, Ginseng uses a technique called type wrappers: each definition of a named type T is converted into a "wrapped" version wT whose size is larger and allows room for future growth. When an update changes the definition of T in the original program, existing values of type wT in the compiled program must be transformed to have the new type's representation, to be compatible with the new code. This is done via a function called a type transformer. For example, if the old definition of T is struct { int x; } and the new definition is struct { int x; int y; }, the type transformer's job is to copy the old value of x and initialize y to a default value. Code is compiled to notice when a typed value is out of date and, if so, to apply the necessary type transformer.

Safety Analyses. Ginseng combines static analysis with runtime support to ensure that updates are always type-safe, even when changes are made to function prototypes or type definitions. While supporting the addition of new definitions, or the replacement of data and functions at the same type, is relatively straightforward, supporting changes to types is challenging: if the old and new programs assume different representations for a certain type, then old code accessing new data, or new code accessing old data, leads to a representation inconsistency, i.e., a violation of type safety. To illustrate this, consider the following simple program; the old version is on the left and the new program version is on the right. The update changes the signature of foo to accept two arguments instead of one.

    1 void foo(int i) { ... }        1 void foo(int i, int j) { ... }
    2 void bar() {                   2 void bar() {
    3   int i;                       3   int i, j;
    4   ...                          4   ...
    5   foo(i);                      5   foo(i, j);
    6   ...                          6   ...
    7 }                              7 }

Suppose the update is applied when the old program's execution reaches line 4. The new version of foo is loaded, and the call on line 5 will invoke the new version, passing it one argument, i. But this is incorrect, since the new version of foo expects two arguments. The correct thing to do is to postpone the update until after the call to foo. Ginseng performs two safety analyses (updateability analysis and abstraction-violating alias analysis) to ensure an update will not lead to such type safety violations, while guaranteeing that data is kept up-to-date. The basic idea is to examine the program to discover assumptions made about the types of updateable entities (i.e., functions or data) in the continuation of each program point. These assumptions become constraints on the timing of updates (Section 3.3 discusses the implementation of these analyses).

This is in contrast to previous approaches that focus on the updating mechanism, rather than update safety, and as a consequence support only limited-scale updates, or provide no safety guarantees.

1.2.2 Patch Generator

Another key factor in enabling dynamic updates to realistic programs is the ability to construct a dynamic patch automatically. The Ginseng patch generator (Section 3.4) has two responsibilities. First, it identifies those definitions (global variables, functions, or types) that have changed between versions. Second, for each type definition that has changed, it generates a type transformer function used to convert values from a type's old representation to the new one. The compiler inserts code so that the program will make use of these functions following a dynamic patch.

If the new code assumes an invariant about global state (e.g., certain files are open, certain threads are started, or a list is doubly-linked), this invariant has to hold after the update takes place. Users can write state transformer functions that are run at update time to convert state and run initialization code for new features, as necessary. Users also may adjust the generated type transformers as necessary.
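As a sketch of these two transformer concepts (the wrapper layout, names, and signatures below are illustrative assumptions, not Ginseng's actual generated code), the struct T example above might look like this:

    /* Old program: struct T { int x; };  the update adds a field y.
       The wrapped version carries a version tag and spare room so a
       grown representation still fits in place. */
    struct wT {
        unsigned version;             /* representation currently stored */
        union {
            struct { int x; }        v0;
            struct { int x; int y; } v1;
            char room[32];            /* slop for future growth */
        } u;
    };

    /* Type transformer, generated by the patch generator: convert a
       value from the old representation to the new one. */
    void T_xform_v0_to_v1(struct wT *d) {
        int old_x = d->u.v0.x;        /* copy the old value of x */
        d->u.v1.x = old_x;
        d->u.v1.y = 0;                /* initialize y to a default */
        d->version = 1;
    }

    /* State transformer, written by the user and run once at update
       time, re-establishes global invariants the new code relies on. */
    void state_xform(void) {
        /* e.g., open a log file the new version assumes is open */
    }

In this sketch, compiled code would check the version tag on each access and apply the pending transformer lazily, which matches the "notice when a typed value is out of date" behavior described above.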
1.2.3 Runtime System and Update Points

The dynamic update itself is carried out by the Ginseng runtime system (Section 3.4), which is linked into the updateable program. Once notified, the runtime system will cause a dynamic patch to be dynamically loaded and linked at the next safe update point. An update point is essentially a call to a runtime system function DSU_update(). Update points can be inserted manually, by the programmer, or automatically, by the compiler. Our safety analyses will annotate these points with constraints as to how definitions are allowed to change at each particular point. The runtime system will check that these constraints are satisfied by the current update, and if so, it "glues" the dynamic patch into the running program. In our experience, finding suitable update points in long-lived server programs is quite straightforward, and the analysis provides useful feedback as to whether the chosen spots are free from restrictions. Sections 3.2, 3.3, and 3.4 describe these features of Ginseng in detail.

A practical DSU system must strive to provide strong update safety guarantees without affecting update availability (the time from when an update becomes available to when it is applied).

Long-running programs amenable to dynamic updating are usually structured around event processing loops, where one loop iteration handles one event. For the single-threaded programs we have updated, we placed update points (calls to DSU_update) manually, at the completion of a top-level event-handling loop. While the manual enumeration of a few update points works well for single-threaded programs, in a multi-threaded program an update can only be applied when all threads have reached a safe update point. Since this situation is unlikely to happen naturally, we could imagine interpreting each occurrence of DSU_update() as part of a barrier: when a thread reaches a safe update point, it blocks until all other threads have done likewise, and the last thread to reach the barrier applies the update and releases the blocked threads. Unfortunately, because all threads must reach safe points, this approach may fail to apply an update in a timely fashion. Therefore, we must allow updates in the middle of the loop while still ensuring update safety.

1.2.4 Version Consistency

Performing an update in the middle of a loop can potentially lead to problems, even if the update is type-safe, because the update violates what we call version consistency: when programmers write the event processing code, they assume the loop body will execute code belonging to the same version. An update could violate that assumption. We solved this problem by allowing programmers to designate blocks of code as transactions whose execution must always be attributable to a single program version. An example of a transaction would be a loop iteration, which corresponds to processing an event. In Chapter 5 we present a formalism called contextual effects that can be used to reason about the past and future computation at each program point. Using a static analysis based on contextual effects, we can enforce version consistency even when an update is performed inside a transaction.
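The following hypothetical server skeleton pictures this placement; only the DSU_update() entry point is taken from the text, while the request type and handlers are invented for illustration:

    extern void DSU_update(void);      /* Ginseng runtime entry point */

    typedef struct request request_t;
    extern request_t *get_request(void);
    extern void dispatch(request_t *);
    extern void send_reply(request_t *);

    void event_loop(void) {
        for (;;) {
            /* One iteration handles one event; treated as a
               transaction, the whole iteration should be attributable
               to a single program version. */
            request_t *r = get_request();
            dispatch(r);
            send_reply(r);

            /* Quiescent point: no event is being processed and global
               invariants hold, so this is a safe update point. */
            DSU_update();
        }
    }

If an update applied between dispatch and send_reply replaced both functions, the iteration could execute a mix of old and new code; the contextual-effects analysis is what allows an update to be applied mid-iteration only when the iteration's execution can still be attributed to one version.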
Version consistency is a desirable property, but many systems designed to support long-term evolution [103, 12, 11, 6, 1, 23, 24] do not implement it.

Ginseng provides multi-threaded DSU support that is as flexible and safe as the single-threaded approach, while ensuring updates can be applied in a timely fashion. A key concept introduced in this dissertation, explained in Chapter 4, is that of induced update points. Induced update points help us accomplish our goal of balancing safety and availability. We allow programmers to designate update points in multi-threaded programs where global state is consistent, and writing an update is straightforward, as the global invariants hold at those points. The code between two update points constitutes a transaction. The update, however, can take place in between programmer-specified update points, at an induced update point. Our system enforces that an update appears to execute at an update point: if a code update takes place in between two update points, the execution trace can be attributed to exactly one program version. In other words, an update can be applied in the middle of a transaction, but the execution of a transaction is still attributable to a single program version. This flexibility is crucial in being able to update multi-threaded programs in a timely manner, without requiring all threads to reach a programmer-inserted update point simultaneously.

1.3 Evaluation

Ginseng's support for a broad range of changes to programs, along with safety and automation, has enabled us to implement long-term updates to single- and multi-threaded programs. We updated three open-source, single-threaded server programs with three to four years' worth of releases: Vsftpd (the Very Secure FTP daemon), the Sshd daemon from the OpenSSH suite, and the Zebra server from the GNU Zebra routing software package, for a total of 27 updates (Chapter 3). We were also able to perform type-safe, version-consistent updates to three multi-threaded programs: the Icecast streaming server, Memcached (a distributed memory object caching system), and the Space Tyrant game server. We considered one year's worth of releases for Icecast and Space Tyrant, and ten months for Memcached, for a total of 13 updates (Chapter 4).

Though these programs were not designed with updating in mind, we had to make only a handful of changes to their source code to make them safely updateable. Each dynamic update we performed was based on an actual release, and for each application, we applied updates corresponding to up to four years' worth of releases, totaling as many as twelve different patches in one case. To achieve these results, we developed several new implementation techniques, including new ways to handle the transformation of data whose type changes, to allow dynamic updates to infinite loops and active code, and to allow updates to take effect in programs with function pointers. Details are in Sections 3.2 and 4.3.1. Overhead due to updating is modest: application performance usually degrades by 0-10%, though for one of the programs the overhead was 32%. The memory footprint of updateable applications is 0-10% larger compared to unmodified applications, except for one application, where it is 46% larger.

The updates we performed to the six servers present a substantial demonstration that DSU can be practical: it can support on-line updates over a long period based on actual releases of real-world programs.
These servers are similar in that they keep long-lived, in-memory state, and rebooting the server is disruptive for the clients. However, they constitute only one category of long-running programs. Other long-running systems keep short-lived in-memory state (e.g., web servers), or store their state on disk (e.g., database systems). In Sections 7.1 and 7.2 we discuss how DSU would apply to these other categories of systems, and what the trade-offs are between using DSU and using traditional high-availability techniques.

1.4 Contributions

Based on our experience, we believe Ginseng makes significant headway toward meeting the DSU practicality criteria we have set forth above:

- Flexibility. Ginseng permits updates to single- and multi-threaded C programs. The six test programs are realistic, substantial, and most of them are widely used in constructing real-world Internet services. Ginseng supports changes to functions, types, and global variables, and as a result we could perform all the updates in the 10-month to 4-year time frames we considered. Patches were based on actual releases, even though the developers made changes without having dynamic updating in mind.

- Efficiency. We had to make very few changes to the application source code. Despite the fact that differences between releases were non-trivial, generating and testing patches was relatively straightforward. We developed tools to generate most of a dynamic patch automatically by comparing two program versions, reducing programmer work. We found that DSU overhead is modest for I/O-bound applications, but more pronounced for CPU-bound applications. Our novel version consistency property improves update availability, resulting in a smaller delay between the moment an update is available and the moment the update is applied.

- Safety. Updates cannot be applied at arbitrary points during a program's execution, because that could lead to safety violations. Ginseng performs a suite of static safety analyses to determine times during the running program's execution at which an update can be performed safely.

In summary, this dissertation makes the following contributions:

1. A practical framework to support dynamic updates to single- and multi-threaded C programs. Ours is the most flexible, and arguably the most safe, implementation of a DSU system to date.

2. A substantial study of the application of our system to six sizable C server programs, three single-threaded and three multi-threaded, over long periods of time ranging from 10 months to 4 years' worth of releases.

3. A novel type-theoretical system that generalizes standard effect systems, called contextual effects; contextual effects are useful when the past or future computation of the program is relevant at various program points, and have applications beyond DSU. We also present a formalism and soundness proof for our novel update correctness property, version consistency, which permits us to provide certain update safety guarantees for single- and multi-threaded programs.

4. An approach for comparing the source code of different versions of a C program, as well as a software evolution study of various versions of popular open source programs, including BIND, OpenSSH, Apache, Vsftpd, and the Linux kernel.

Chapter 2

Software Evolution

To effectively support dynamic updating, we first need to understand how software evolves.
This chapter presents an approach to characterizing the evolution of C programs, along with a study that analyzes how several substantial open-source programs have changed over years' worth of releases.

2.1 Introduction

We have developed a tool called ASTdiff that can quickly compute and summarize simple changes to successive versions of C programs by partially matching their abstract syntax trees. ASTdiff identifies the changes, additions, and deletions of global variables, types, and functions, and uses this information to report a variety of statistics. The Ginseng patch generator uses the ASTdiff output to determine the contents of a dynamic patch.

Our approach is based on the observation that for C programs, function names are relatively stable over time. We analyze the bodies of functions of the same name and match their abstract syntax trees structurally. During this process, we compute a bijection between type and variable names in the two program versions, which will help us determine changes to types and variables. If the old and new ASTs fail to match (modulo name changes), we consider this a change to that function's body, and will replace the entire function at the next update.

[Figure 2.1: High-level view of ASTdiff. Two program versions are parsed into ASTs; bijection computation produces name and type matchings, which feed a change detector that emits changes and statistics.]

We have used ASTdiff to study the evolution history of a variety of popular open source programs, including Apache, Sshd, Vsftpd, Bind, and the Linux kernel. This study has revealed trends that we have used to inform our design for DSU. In particular, we observed that function, type, and global variable additions are far more frequent than deletions. We also found that function bodies change frequently over time; function prototypes change as well, but not as frequently as function bodies do. Finally, type definitions (such as struct and union declarations) do change, but infrequently, and often in simple ways.

2.2 Approach

Figure 2.1 provides an overview of ASTdiff. We begin by parsing the two program versions to produce abstract syntax trees (ASTs), which we traverse in parallel to collect type and name mappings; these mappings will help us avoid reporting spurious changes due to renamings. With the mappings at hand, we detect and collect changes to report to the user, either directly or in summary form. In this section, we describe the matching algorithm, illustrate how changes are detected and reported, and describe our implementation and its performance.

    typedef int sz_t;                   typedef int size_t;

    int count;                          int counter;

    struct foo {                        struct bar {
      int i;                              int i;
      float f;                            float f;
      char c;                             char c;
    };                                  };

    int baz(int a, int b) {             int baz(int d, int e) {
      struct foo sf;                      struct bar sb;
      sz_t c = 2;                         size_t g = 2;
      sf.i = a + b + c;                   sb.i = d + e + g;
      count++;                            counter++;
    }                                   }
                                        void biff(void) {}

    Version 1                           Version 2

    Figure 2.2: Two successive program versions.

2.2.1 AST Matching

Figure 2.2 presents an example of two successive versions of a program. Assuming the example on the left is the initial version, ASTdiff discovers that the body of baz is unchanged, which is what we would like, because even though every line has been syntactically modified, the function in fact is structurally the same and produces the same output.
ASTdiff also determines that the type sz_t has been renamed size_t, the global variable count has been renamed counter, the structure foo has been renamed bar, and the function biff has been added.

To report these results, ASTdiff must find a mapping between the old and new names in the program, even though functions and type declarations have been reordered and modified. To do this, ASTdiff begins by finding function names that are common between program versions; our assumption is that function names do not change very often. ASTdiff then tries to match function bodies corresponding to the same function name in the old and new versions. The function body match helps us construct a bijection (i.e., a one-to-one, onto mapping) between names in the old and new versions.

We traverse the ASTs of the function bodies of the old and new versions simultaneously, adding entries to a LocalNameMap and a GlobalNameMap that map local variable names and global variable names, respectively. Two variables are considered equal if we encounter them in the same syntactic position in the two function bodies. For example, in Figure 2.2, parallel traversal of the two versions of baz results in the LocalNameMap

    a → d, b → e, sf → sb, c → g

and a GlobalNameMap with count → counter. Similarly, we form a TypeMap between named types (typedefs and aggregates) that are used in the same syntactic positions in the two function bodies. For example, in Figure 2.2, the name map pair sf → sb will introduce a type map pair struct foo → struct bar.

We define a renaming to be a name or type pair j1 → j2 where j1 → j2 exists in the bijection, j1 does not exist in the new version, and j2 does not exist in the old version. Based on this definition, ASTdiff will report count → counter and struct foo → struct bar as renamings, rather than additions and deletions. This approach ensures that consistent renamings are not presented as changes, and that type changes are decoupled from value changes, which helps us better understand how types and values evolve.

Figure 2.3 presents the pseudocode for our algorithm.

    procedure GenerateMaps(Version1, Version2)
      F1 ← set of all functions in Version 1
      F2 ← set of all functions in Version 2
      global TypeMap ← ∅
      global GlobalNameMap ← ∅
      for each function f ∈ F1 ∩ F2 do
        AST1 ← AST of f in Version 1
        AST2 ← AST of f in Version 2
        Match_Ast(AST1, AST2)

    procedure Match_Ast(AST1, AST2)
      local LocalNameMap ← ∅
      for each (node1, node2) ∈ (AST1, AST2) do
        if (node1, node2) = (t1 x1, t2 x2)                          // declaration
          then TypeMap ← TypeMap ∪ {t1 → t2}
               LocalNameMap ← LocalNameMap ∪ {x1 → x2}
        else if (node1, node2) = (y1 := e1 op e1', y2 := e2 op e2') // assignment
          then Match_Ast(e1, e2)
               Match_Ast(e1', e2')
               if isLocal(y1) and isLocal(y2)
                 then LocalNameMap ← LocalNameMap ∪ {y1 → y2}
               else if isGlobal(y1) and isGlobal(y2)
                 then GlobalNameMap ← GlobalNameMap ∪ {y1 → y2}
        else if ...                                                 // other syntactic forms
        else break

    Figure 2.3: Map generation algorithm.

We accumulate global maps TypeMap and GlobalNameMap, as well as a LocalNameMap per function body. We invoke the routine Match_Ast on each function common to the two versions. When we encounter a node with a declaration t1 x1 (a declaration of variable x1 with type t1) in one AST and t2 x2 in the other AST, we require x1 → x2 and t1 → t2. Similarly, when matching statements, for variables y1 and y2 occurring in the same syntactic position we add type pairs into the TypeMap, as well as name pairs into LocalNameMap or GlobalNameMap, depending on the storage class of y1 and y2.
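The renaming rule is simple to state in code; the following sketch is illustrative only (ASTdiff itself is written in OCaml, and the lookup callbacks here are hypothetical):

    /* A bijection pair j1 -> j2 is reported as a renaming only when
       the old name no longer exists in the new version and the new
       name did not exist in the old version; otherwise the pair is
       just a matched (possibly unchanged) name. */
    int is_renaming(const char *j1, const char *j2,
                    int (*exists_in_old)(const char *),
                    int (*exists_in_new)(const char *)) {
        return !exists_in_new(j1) && !exists_in_old(j2);
    }

Under this test, count → counter from Figure 2.2 qualifies as a renaming: count does not appear in Version 2, and counter does not appear in Version 1.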
    ------- Global Variables ----------
    Version1 : 1
    Version2 : 1
    renamed  : 1
    ------- Functions -----------------
    Version1 : 1
    Version2 : 2
    added    : 1
    locals/formals name changes : 4
    ------- Structs/Unions ------------
    Version1 : 1
    Version2 : 1
    renamed  : 1
    ------- Typedefs ------------------
    Version1 : 1
    Version2 : 1
    renamed  : 1

    Figure 2.4: Summary output produced for the code in Figure 2.2.

LocalNameMap will help us detect functions which are identical up to a renaming of local and formal variables, and GlobalNameMap is used to detect renamings for global variables and functions. As long as the ASTs have the same shape, we keep adding pairs to maps. If we encounter an AST mismatch (the break statement on the last line of the algorithm), we stop the matching process for that function and use the maps generated from the portion of the tree that did match.

2.2.2 Change Detection and Reporting

With the name and type bijections in hand, ASTdiff visits the functions, global variables, and types in the two programs to detect changes and collect statistics. We categorize each difference that we report either as an addition, deletion, or change.

    / : 111
      include/ : 109
        linux/ : 104
          fs.h : 4
          ide.h : 88
          reiserfs_fs_sb.h : 1
          reiserfs_fs_i.h : 2
          sched.h : 1
          wireless.h : 1
          hdreg.h : 7
        net/ : 2
          tcp.h : 1
          sock.h : 1
        asm-i386/ : 3
          io_apic.h : 3
      drivers/ : 1
        char/ : 1
          agp/ : 1
            agp.h : 1
      net/ : 1
        ipv4/ : 1
          ip_fragment.c : 1

    Figure 2.5: Density tree for struct/union field additions (Linux 2.4.20-2.4.21).

We report any function names present in one file and not the other as an addition, deletion, or renaming as appropriate. For functions in both files, we report that there is a change in the function body if there is a difference beyond the renamings that are represented in our name and type bijections. This can be used as an indication that the semantics of the function has changed, although this is a conservative assumption (i.e., semantics-preserving transformations such as code motion are flagged as changes). In our experience, whenever ASTdiff detects an AST mismatch, manual inspection has confirmed that the function semantics has indeed changed.

We similarly report additions, deletions, and renamings of global variables, and
In this diagram, changes reported at the leaf nodes (source files) are propagated up the branches, making clusters of changes easy to visualize. In this example, the include/linux/ directory and the include/linux/ide.h header file have a high density of changes. A potential over-conservatism of our matching algorithm is that having insuf- ficient name or type pairs could lead to renamings being reported as additions/dele- tions. The two reasons why we might miss pairs are partial matching of functions and function renamings. As mentioned previously, we stop adding pairs to maps when we detect an AST mismatch, so when lots of functions change their bodies, we miss name and type pairs. This could be mitigated by refining our AST comparison 23 to recover from a mismatch and continue matching after detecting an AST change. Because renamings are detected in the last phase of the process, functions that are renamed don?t have their ASTs matched, another reason for missing pairs. In order to avoid this problem, the bijection computation and function body matching would have to be iterated until a fixpoint is reached. Note that reporting spurious changes due to renamings do not affect the correctness of our DSU implementation. For example, reporting a function is as added and deleted instead of renamed would only cause more code to be loaded. In practice, however, we found the approach to be reliable. For the case studies in Section 2.3, we have manually inspected the ASTdiff output and the source code for renamings that are improperly reported as additions and deletions due to lack of constraints. We found that a small percentage (less than 3% in all cases) of the reported deletions were actually renamings. The only exception was an early version of Apache (versions 1.2.6-1.3.0) which had significantly more renamings, with as many as 30% of the reported deletions as spurious. 2.2.3 Implementation ASTdiff is constructed using CIL, an OCaml framework for C code analysis [80] that provides ASTs as well as some other high-level information about the source code. We have used it to analyze all releases of Vsftpd1 from inception (Nov. 2001) to March 2005; all releases of the OpenSSH Sshd daemon2 from inception (Oct 1999) 1http://vsftpd.beasts.org/ 2http://www.openssh.com/ 24 0 10 20 30 40 50 60 70 0 50 100 150 200 250 300 350 400 Time (s) Source code size (kLOC) ASTdiff total Parsing Figure 2.6: ASTdiff running time for various program sizes. to March 2005; 8 snapshots in the lifetime of Apache 1.x3 (Feb. 1998 to Oct. 2003); and portions of the lifetimes4 of the Linux kernel5 (versions 2.4.17, Dec. 2001 to 2.4.21, Jun. 2003) and BIND6 (versions 9.2.1, May 2002 to 9.2.3, Oct. 2003). The running time of ASTdiff is linear in the size of the input programs? ASTs. Figure 2.6 shows the running time of ASTdiff on our test applications, plotting source code size versus running time. Times are the average of 5 runs; the system used for experiments was a dual Xeon@2GHz with 1GB of RAM running Fedora Core 3. The top line is the total running time while the bottom line is the portion of the running time that is due to parsing, provided by CIL. The difference between the two lines is our analysis time. Computing changes for two versions of the largest test program takes slightly over one minute. The total time for running the analysis on the full repository (i.e., all the versions) for Vsftpd was 21 seconds (14 versions), for Sshd was 168 seconds (25 versions), and for Apache was 42 seconds (8 versions). 
3http://httpd.apache.org/ 4Analyzing earlier versions would have required older versions of gcc. 5http://kernel.org/ 6www.isc.org/products/BIND/ 25 2.3 Implications for Dynamic Software Updating This section explains how we used ASTdiff to characterize software changes and to guide the way we designed Ginseng. We are mainly interested in three aspects of software evolution: how often do definitions get deleted, how often do function signatures change, and how do type definitions change. The reason we consider these aspects important is that implementing deletion and supporting type changes safely is problematic for DSU systems. We present our findings as structured around asking and answering three research questions: Are function and variable deletions frequent, relative to the size of the program? When a programmer deletes a function or variable, we would expect a DSU implementation to delete that function from the running program when it is dynamically updated. However, implementing on-line deletion is difficult, because it is not safe to delete functions or variables that are currently in use (or will be in the future). Therefore, if definitions are rarely deleted over a long period, the benefit of cleaning up dead code may not be worth the cost of implementing a safe mechanism to do so. For simplicity, Ginseng does not unload unused functions and variables after they have been replaced and are no longer in use (Section 3.4). Figure 2.7 illustrates how Sshd, Vsftpd, and Apache have evolved over their lifetime. The x-axis plots time, and the y-axis plots the number of function and global variable definitions for various versions of these programs. Each graph shows the total number of functions and global variables for each release, the cumulative number of functions/variables added, and the cumulative number of functions/vari- 26 ables deleted (deletions are expressed as a negative number, so that the sum of deletions, additions, and the original program size will equal its current size).7 The rightmost points show the current size of each program, and the total number of additions and deletions to variables and functions over the program?s lifetime. According to ASTdiff, Vsftpd and Apache delete almost no functions, but Sshd deletes them steadily. For the purposes of our DSU question, Vsftpd and Apache could therefore reasonably avoid removing dead code, while doing so for Sshd would have a more significant impact (assuming functions are similar in size). Are changes to function prototypes frequent? Many DSU methodologies do not update a function whose type has changed. While it is easy, technically, to load or replace a function, a change to a function?s prototype can lead to type safety violations (Section 3.3). Figure 2.8 presents graphs similar to those in Figure 2.7. For each program, we graph the total number of functions, the cumulative number of functions whose body has changed, and the cumulative number of functions whose prototype has changed.8 As we can see from the figure, changes in prototypes are relatively infrequent for Apache and Vsftpd, especially compared to changes more generally. In contrast, functions and their prototypes have changed in Sshd far more rapidly, with the total number of changes over five years roughly four times the current number of functions, with a fair number of these resulting in changes in prototypes. 
In all cases we can see some changes to prototypes, meaning that supporting prototype changes in DSU is worthwhile.

Are changes to type definitions relatively simple? In most DSU systems, changes to type definitions (which include struct, union, enum, and typedef declarations in C programs) require an accompanying type transformer function to be supplied with the dynamic update. Each existing value of a changed type is converted to the new representation using this transformer function. Of course, this approach presumes that such a transformer function can be written easily; if changes to type definitions are fairly complex, writing a transformer function may be difficult.

Figure 2.9 plots the relative frequency of changes to struct, union, and enum definitions (the y-axis) against the number of fields (or enumeration elements, for enums) that were added or deleted in a given change (the x-axis). The y-axis is presented as a percentage of the total number of type changes across the lifetime of the program. We can see that most type changes affect only one or two fields; an exception is Sshd, where changing more than two fields is common. We also used ASTdiff to learn that fields rarely change type (not shown in the figure).

[Figure 2.7: Function and global variable additions and deletions over time (Sshd, Vsftpd, Apache; number of functions + global variables, added, deleted).]

[Figure 2.8: Function body and prototype changes over time (Sshd, Vsftpd, Apache; number of functions, body changes, prototype changes).]

[Figure 2.9: Classifying changes to types; relative frequency (%) vs. number of fields added/deleted (Linux, Vsftpd, Apache, Sshd, Bind).]

2.4 Conclusion

We have presented an approach to finding differences between program versions based on partial abstract syntax tree matching. Our algorithm uses AST matching to determine how types and variable names in different versions of a program correspond. We have constructed ASTdiff, a tool based on our approach, and used it to analyze several popular open-source projects over multiple years of their lifetimes. The software evolution insights gained from using ASTdiff (e.g., the ways types and functions change) have helped us in the design and implementation of Ginseng, our DSU system for C programs.

Chapter 3

Single-threaded Implementation and Evaluation

This chapter presents the implementation of Ginseng, an approach and tool suite for dynamically updating C programs, along with its evaluation on single-threaded programs. (The design, implementation, and evaluation of Ginseng on single-threaded programs are the result of joint efforts with Gareth Stoyle, Michael Hicks, Manuel Oriol, Gavin Bierman, and Peter Sewell; we present details on their contributions in Section 3.7.) Chapter 4 will discuss Ginseng's support for multi-threaded programs and its evaluation on them.

3.1 Introduction

Our primary considerations in designing Ginseng follow the three practicality criteria described in Chapter 1 (efficiency, flexibility, and safety). We believe these features are necessary for any DSU system aiming to support long-term evolution of realistic programs:

Efficiency.
DSU should permit writing applications in a natural style: while an application writer should anticipate that the software will be upgraded, she should not have to know what form that update will take. Similarly, writing dynamic updates should be as easy as possible. The performance of updateable applications should be in line with that of normally-compiled applications; if support for updating imposes a high overhead, DSU is not likely to be adopted.

Flexibility. The power and appeal of DSU is that it permits applications to change on the fly, at a fine granularity. Thus, programmers should be able to change data representations, change function prototypes, reorganize subroutines, etc., as they normally would.

Safety. Dynamic updates should not be hard to establish as correct. The harder it is to develop applications that use DSU and to prove their correctness, the more DSU's benefits of finer granularity and control are diminished.

To evaluate single-threaded Ginseng, we used it to dynamically upgrade three single-threaded servers: Vsftpd (the Very Secure FTP daemon), the Sshd daemon from the OpenSSH suite, and the Zebra server from the GNU Zebra routing software package. Based on our experience, we believe Ginseng squarely meets the first two criteria for the class of single-threaded server applications we considered, and makes significant headway toward the third. These programs are realistic, substantial, and in common use. Though they were not designed with updating in mind, we had to make only a handful of changes to their source code to make them safely updateable. Each dynamic update we performed was based on an actual release, and for each application we applied updates corresponding to at least three years' worth of releases, totaling as many as twelve different patches in one case.

To achieve these results, we developed several new implementation techniques, including new ways to handle the transformation of data whose type changes, to allow dynamic updates to active code, and to allow updates to take effect in programs with function pointers. Though we have not optimized our implementation, the overhead due to updating is modest: between 0 and 32% on the programs we tested.

Although the changes between releases were non-trivial, generating and testing patches was relatively straightforward. We developed tools that generate most of a dynamic patch automatically by comparing two program versions, reducing programmer work. More importantly, Ginseng performs two safety analyses to determine times during the running program's execution at which an update can be performed safely. The theoretical development of our first analysis, called the updateability analysis [106], is not a contribution of this dissertation; we present an implementation of that analysis for the full C programming language, along with some practical extensions for handling C's low-level features. These safety analyses assist in assuring correctness, though the programmer still needs a clear "big picture" of the application (e.g., the interactions between application components) and must establish and maintain global invariants.

A high-level overview of Ginseng's components was presented in Section 1.2.
The next three sections describe these components in detail, while Sections 3.5 and 3.6 describe our experience using Ginseng and evaluate its performance.

3.2 Enabling On-line Updates

To make programs dynamically updateable, we address two main problems. First, existing code must be able to call new versions of functions, whether via a direct call or via a function pointer. Second, the state of the program must be transformed to be compatible with the new code: for a type whose definition has changed, existing values of that type must be transformed to conform to the new definition. Ginseng employs two mechanisms to address these problems, respectively: function indirection and type wrapping. We discuss them in turn below, and then show how they can be combined to update active code.

3.2.1 Function Indirection

Function indirection is a standard technique [50] that permits old code to call new function versions by introducing a level of indirection between a caller and the called function, so that the called function's implementation can change. For each function f in the program, Ginseng introduces a global variable f_ptr that initially points to the first version of f. (Ginseng is more careful than we are in these examples about generating non-clashing variable names.) Ginseng encodes version information through name mangling, renaming the initial version of f to f_v0, the subsequent version to f_v1, and so on. Each direct call to f within the program is replaced with a call through *f_ptr. Ginseng also handles function pointers in an interesting way: if the program passes f as data (i.e., as a function pointer), Ginseng generates a wrapper function that calls *f_ptr, and passes this wrapper instead. To dynamically update f to version 1, the runtime system loads the new version f_v1 and then stores the address of f_v1 in f_ptr. While function indirection is not new, the idea of generating function wrappers to permit updates to a function whose address is taken is, to our knowledge, first introduced in this dissertation.

3.2.2 Type Wrapping

The Ginseng updating model enforces what we call representation consistency [106]: all values of type T in the program at a given time must logically be members of T's most recent version. The alternative would be to allow multiple versions of a type to coexist, where code and values of old and new type could interact freely within the program. (Hjálmtýsson and Gray [52] and Duggan [32] refer to these approaches as global update and passive partitioning, respectively.) Representation consistency is a useful property because it more closely models the "forward march" of a program's on-line evolution, making it easier to reason about.

To enforce representation consistency, Ginseng must ensure that when a particular type T's definition is updated, values of that type in the running program are updated as well. To do this, a dynamic patch defines a type transformer function used to transform a value v_T from T's old definition to its new one. Just like functions, types are associated with a version, and the type transformer c_T^{n->n+1} converts values of type T_n (i.e., the representation of T in version n) into values of type T_{n+1}. As we explain later, much of a type transformer function can be generated automatically via a simple comparison of the old and new definitions.

Given this basic mechanism, we must address two questions. First, when are type transformers to be used? Second, how is updateable data represented?
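Before turning to these questions, the function indirection scheme of Section 3.2.1 can be made concrete with a minimal sketch. This is an illustration in the spirit of Ginseng's output, not its exact generated code; the f_v0/f_ptr spellings follow the conventions above.

    /* Original source:
         void f(int x) { ... }
         ...
         f(5);                            // direct call
    */

    void f_v0(int x) { /* body of the initial version */ }

    void (*f_ptr)(int) = &f_v0;          /* indirection cell, one per function */

    void f_wrap(int x) { (*f_ptr)(x); }  /* passed wherever f is used as data */

    /* Every direct call f(5) is compiled to (*f_ptr)(5).
       Updating f to version 1 loads f_v1 and then executes: f_ptr = &f_v1; */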
Applying Type Transformers. To transform the existing values of type T_n, the runtime system must find them all and apply c_T^{n->n+1} to each. One approach would be to do this eagerly, at update time; this would require either implementing a garbage-collector-style tracing algorithm [43] or maintaining a registry of pointers to every (live) value of type T_n during execution [12]. More simply, we could restrict type transformation to only those data reachable from global variables, and require the programmer to implement the tracer manually [50]. Finally, we could do it lazily, as the program executes following the update [32, 17, 7].

Ginseng uses the lazy approach. The compiler renames version n of the user's definition of T to be T_n, and the definition of T simply wraps that of T_n, adding a version field. Given a value v_T (of wrapped type T), Ginseng inserts a coercion function called con_T (for concretization of T) that returns the underlying representation. This coercion is inserted wherever v_T is used concretely, i.e., in a way that depends on its definition; for example, this happens when accessing a field in a struct. Whenever con_T is called on v_T, the coercion function compares v_T's version n with the latest version m of T. If n < m, the necessary type transformer functions are composed and applied to v_T, changing it in place. That is, Ginseng automatically invokes the entire type transformer chain c_T^{n->n+1}, c_T^{n+1->n+2}, ..., c_T^{m-1->m} to yield the up-to-date value of type T_m.

The lazy approach has a number of benefits. First, it is not limited to processing only values reachable from global variables; stack-allocated values, or those reachable from stack-allocated values, are handled easily. Second, it amortizes transformation costs, reducing the potential pause at update time that would be required to transform all data in the program. The drawback is that per-type access during normal program execution is more expensive (due to the calls to con_T), and the programmer has little control over when type transformers are invoked, since this is determined by the program's execution. Therefore, transformers must be written to be timing-independent. In our experience, type transformers are invoked rarely, so it may be sensible to use a combination of eager and lazy application to reduce the total overhead.

Without care, it could be possible for a transformed value to end up being processed by old code, violating representation consistency. This could lead a con_T coercion to discover that the version n of v_T is actually greater than the version m of the type T expected by the code. A similar situation arises when function types change: old code might end up calling the new version of a function assuming it has the old signature. We solve these problems with novel safety analyses, described in more detail in Section 3.3.

Type Representations. While lazy type updating is not new [7], there has been little or no exploration of its implementation, particularly for a low-level language such as C. In our experience, a given type is likely to grow in size over time, so the representation of the wrapped type T must accommodate this. One approach is to define the wrapper type to use a fixed space, larger than the size of T_0 (padding). This strategy allows future updates to T that do not expand beyond the preallocated padding.
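For concreteness, here is a minimal sketch of the padded wrapper and the lazy, in-place coercion just described. The wrapper layout follows Figure 3.1 below; latest_version_T and the transformer table xform_T are invented stand-ins for Ginseng's runtime bookkeeping.

    #define X 64                      /* preallocated padding size (illustrative) */
    struct T0 { int x; int y; };      /* version-0 concrete representation */

    struct T {                        /* wrapped type */
        unsigned int version;
        union { struct T0 data; char padding[X]; } udata;
    };

    /* invented stand-ins for Ginseng's runtime bookkeeping:
       xform_T[n] applies c_T^{n->n+1} to the payload, in place */
    extern unsigned int latest_version_T;
    extern void (*xform_T[])(void *payload);

    struct T0 *con_T(struct T *abs) {
        while (abs->version < latest_version_T) {  /* compose the transformer chain */
            xform_T[abs->version](&abs->udata);    /* rewrite within the padding */
            abs->version++;
        }
        return &abs->udata.data;                   /* expose the up-to-date payload */
    }

Because the transformed data must fit within the padding, the rewrite can happen in the same storage, which is what makes the allocation strategy discussed next so simple.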
The main advantage of the padding approach is that the allocation strategy for wrapped data is straightforward: stack-allocated data in the source program is still stack-allocated in the compiled program, and similarly for malloced data. This is because type transformation happens in place: the transformed data overwrites the old data in the same storage. On the other hand, a data type cannot grow beyond the initial padding, hampering on-line evolution. Padding also changes the cache locality of data. For example, if a two-word structure in the original program is expanded to four words, then half as many elements can fit in a cache line.

An alternative approach would be to use indirection, and represent the wrapped type as a pointer to a value of the underlying type. This mechanism is used in the K42 operating system [60], which supports updating objects. The indirection approach solves the growth problem by allowing the size of the wrapped type to grow arbitrarily, but it introduces an extra dereference per access. More importantly, the indirection approach makes memory management more challenging: how should storage for the transformed data be allocated, and what is to happen to the now-unneeded old data? Also, when data is copied, the indirected data must be copied as well, to preserve the sharing semantics of the application. The simplest solution would be to have the compiler malloc new representations and free (or garbage collect) the old ones; this is less performance-friendly than stack allocation. Another alternative would be to use regions [109], which have lexically-scoped lifetimes (as with stack frames) but support dynamic allocation. Of course, a hybrid approach is also possible: data could start out with some padding, with an indirection added only if the padding is ever exceeded. Nevertheless, for simplicity, Ginseng employs the padding approach.

3.2.3 Example

Figure 3.1 presents a simple C program and how we compile it to be updateable. The main program is in function call: it creates a value t of type struct T and calls function foo (via apply) to set t's .x field to 1. The first listing is the original program; the second and third listings together form the resulting updateable program. The comments can be ignored for now; they are the results of the safety analysis, explained in the next section. First, all function definitions have been renamed to include a version, and Ginseng has introduced a _ptr variable for each function, to keep a pointer to the most current version; calls to functions are indirected through these pointers. Second, the definition of struct T is now a wrapper for struct T0, the original definition. The con_T function unwraps a struct T, potentially converting it to the latest representation via a call to DSU_transform (which invokes the type transformer if the value must be updated); con_T is called twice in call_v0 to extract the underlying value of t. Finally, Ginseng has generated foo_wrap to wrap an indirected call to foo; this wrapper is what is passed as a function pointer to apply.

Original program:

 1 struct T {
 2   int x; int y;
 3 };
 4
 5 void foo(int *x) {
 6   *x = 1;
 7 }
 8 void apply(void (*fp)(int *),
 9            int *x) {
10   fp(x);
11 }
12 void call() {
13   struct T t = {1,2};
14   apply(foo, &t.x);
15   t.y = 1;
16 }

Updateable program (type wrapper, indirection variables, and function wrapper):

 1 struct T {
 2   unsigned int version;
 3   union { struct T0 data;
 4           char padding[X]; } udata;
 5 };
 6 struct T0 *con_T(struct T *abs) {
 7   DSU_transform(abs);
 8   return &abs->udata.data;
 9 }
10
11 void *foo_ptr = &foo_v0;
12 void *apply_ptr = &apply_v0;
13 void *call_ptr = &call_v0;
14
15 void foo_wrap(int *x) {
16   (*foo_ptr)(x);
17 }

Updateable program (compiled function definitions, with safety analysis results as comments):

20 struct T0 { int x; int y; };
21
22 /* D = D' = {T}, L = {T}, x:T */
23 void foo_v0(int *x) { *x = 1; }
24
25 /* D = {foo,T}, D' = {T}, L = {}, x:T */
26 void apply_v0(void (*fp)(int *),
27               int *x) {
28   fp(x);
29 }
30
31 /* D = {T,apply}, D' = {}, L = {} */
32 void call_v0() {
33   struct T t = {0, {.data = {1,2}}};
34   /* D = {T,apply} */
35   (*apply_ptr)(foo_wrap,
36                &(con_T(&t))->x);
37   /* D = {T} */
38   (con_T(&t))->y = 1;
39   /* D = {} */
40
41 }

Figure 3.1: Compiling a program to be dynamically updateable.

3.2.4 Loops

When a function f is updated, in-flight calls are unaffected, but all subsequent calls, including recursive ones, invoke the new f. In general, this makes reasoning about the timeline of an update simpler. On the other hand, it presents a problem for functions that implement long-running or infinite loops: if such a function is updated while the old version is active, the new version may not take effect for some time, or may never take effect. This is a disadvantage of any updating system that prevents updates to active functions (Section 3.3.4).

We solve this problem with a transformation we call code extraction. To illustrate how it works, we present an example of updating a long-running loop by extracting the loop body into a separate function. If the function containing the block is later changed, this extracted function will notice the changes to the loop on the next iteration. As the code and state preceding the loop might have changed as well, the loop function must be parametrized by some extracted code state; this state is transformed using our standard type transformer mechanism on the next iteration of the loop. Code extraction using a separate function parametrized by state is similar to prior techniques in functional and parallel compilers (lambda lifting [59], procedure splitting [91], function outlining [115]) and to on-stack replacement in optimizing VMs [22, 2].

Original program:

#pragma DSU_extract("L1")

int foo(float g) {
  int x = 2;
  int y = 3;
  while (1) {
    L1: {
      x = x + 1;
      if (x == 8) break;
      else continue;
      if (x == 9) return 42;
    }
  }
  return 1;
}

Updateable program:

struct L1_xs {
  float *g; int *x; int *y;
};

int L1_extract(int *ret,
               struct L1_xs *xs) {
  *(xs->x) = *(xs->x) + 1;
  if (*(xs->x) == 8) {
    return 0;   // break
  } else {
    return 1;   // continue
  }
  if (*(xs->x) == 9) {
    *ret = 42;
    return 2;   // return
  }
  return 1;     // implicit continue
}

int foo(float g) {
  int x = 2;
  int y = 3;
  struct L1_xs xs;
  int retval;
  int code;
  xs.g = &g;    // init extracted code state
  xs.x = &x;
  xs.y = &y;
  while (1) {
    code = L1_extract(&retval, &xs);
    if (code == 0) break;
    else if (code == 1) continue;
    else return retval;
  }
  return 1;
}

Figure 3.2: Updating a long-running loop using code extraction.

For illustration, consider the original program in Figure 3.2. The programmer directs Ginseng that the code block labeled L1 should be extracted; the result is the updateable program shown below it.
The updateable program consists of the extracted function, L1_extract, and the rewritten original function foo. The function L1_extract takes two arguments: int *ret, used to pass back a return value, and struct L1_xs *xs, the "extracted state", which contains pointers to all of the local variables and parameters referenced in foo that might be needed by the code in L1; we can see in foo where this value is created. Within L1_extract, references to local variables (x) or parameters (g) have been changed to refer to them through *(xs). Within the function foo, L1_extract is called on each loop iteration. Within L1_extract, expressions that would have exited the loop, notably break, continue, and return statements, are changed to return a code: 0 for break, 1 for continue, and 2 for return. In foo, this return code is checked and the corresponding action is taken.

If in a subsequent program version the loop in foo were to change, the extracted versions of the two loop bodies would differ, with the new one updating the old one. The new version will be invoked on the loop's next iteration, and if the new loop requires additional state (e.g., new local variables or parameters were added to foo), this is handled by the type transformer function for struct L1_xs. This type transformer might perform side-effecting initialization as well, for code that would have preceded the execution of the current loop. Note that foo's callers are neither aware of nor affected by the loop extraction inside the body of foo.

When extracting infinite loops, nothing else needs to be done. However, if the loop might terminate, we must extract the code that follows the loop as well, so that an updated loop does not execute a stale postamble when it completes; we accomplish this by simply marking the postamble for code extraction, as we did with L1 above. The annotations the programmer needs to add for code extraction are described in detail in Section A.1.2.

A technique similar to code extraction, called stack reconstruction, is used in UpStare, another dynamic updating system [68]. Stack reconstruction allows the update developer to define a correspondence between program points in the old and new versions; at update time, the stacks of all active functions are converted into new-version stacks via user-specified functions. The advantage of stack reconstruction is that programmers do not need to identify in advance the code blocks, or loops, that need to be extracted.

Being able to replace code that is active on the stack was critical for supporting two of our three benchmark applications, Vsftpd and Sshd (Section 3.5). Both applications are structured around event loops: a parent process accepts incoming connection requests and forks; the forked child breaks out of the loop and executes the loop postamble. If the loop body and loop postamble change in later versions, this translates into updates to both extracted functions, so both the parent and the children get to execute the most up-to-date version.

3.3 Safety Analysis

When developing software with Ginseng, programmers designate points in the program where an update should take place; to indicate an update point, the programmer adds a call to the function DSU_update. Update points are usually placed at program points where the global state is consistent, e.g., at the end of an iteration of a long-running loop (Section 3.2.4). Placing update points where global invariants hold simplifies reasoning about update safety and writing the update.
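As a minimal illustration, consider a hypothetical server skeleton; only DSU_update is Ginseng's entry point, everything else is invented for the example:

    #include <sys/socket.h>

    extern void handle_connection(int fd);  /* invented application code */
    extern void DSU_update(void);           /* Ginseng runtime update point */

    void serve(int listen_fd) {
        for (;;) {
            int fd = accept(listen_fd, NULL, NULL);
            if (fd >= 0)
                handle_connection(fd);
            /* end of iteration: no partially-completed operations remain
               and global invariants hold, so an update is safe to attempt */
            DSU_update();
        }
    }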
However, correct update point placement raises an issue for the programmer, since the form of future updates cannot be predicted. The programmer therefore needs to know whether an update that arrives in the future could create problems if it takes effect at a given update point. To illustrate this, let us look again at the example in Figure 3.1. Suppose the program has just entered the call function: is it safe to update the type T? Generally speaking, the answer is no, because the code t.x assumes that t is a structure with a field x, and a change to the representation of t could violate this assumption, leading to unexpected behavior. In this section we look at how Ginseng uses static analysis to help the programmer avoid choosing bad update points like this one.

3.3.1 Tracking Changes to Types

The example above illustrates what can happen when old code accesses new data, essentially violating representation consistency. To prevent this situation, Ginseng applies a constraint-based, flow-sensitive updateability analysis [106] that annotates each update point with the set of types that may not be updated if representation consistency is to be preserved. This set is called the capability, because it defines those types that may be used by old code that might be on the call stack during execution. Of course, the capability is a conservative approximation, as it approximates all possible "stack shapes." It is computed by propagating concrete uses of data backwards along the control flow of the program to possible update points.

Statically-approximated capabilities are illustrated in Figure 3.1, where the sets labeled D in the comments define the current capability; on functions, D defines the input capability (the capability at the start of the function) and D' defines the output capability (the capability at the end of the function). When T appears in D, it means that the program has the capability to use data of type T concretely; an update must not revoke this capability while it is needed.

We now explain the capabilities for each function in the program (the compiled function definitions in Figure 3.1). For function foo (line 22), the input capability D contains T for two reasons: (1) T has a live pointer into it for the duration of the function (live pointers are captured by the set L, explained in Section 3.3.2), and (2) T appears in the output capability D' (i.e., it is used concretely in foo's continuation, in call). For function apply (line 25), the input and output capabilities contain T due to its concrete use in apply's continuation; foo appears in the input capability D because apply calls foo via the function pointer fp; and the live pointer set L is empty because no pointer into a type is live for the entire duration of apply (the last live pointer to T is dereferenced on line 28). For function call (line 31), the output capability D' is empty because nothing is left on the stack after call exits; the live pointer set L is empty because there is no live pointer into a type for the entire duration of call; finally, the input capability D contains T and apply, because they are used concretely (accessed and called, respectively) in the body of call.

At each program point, the capability D imposes a restriction on the functions and types that can be updated. For example, if we update apply at line 34, its type must either remain unchanged or the new type must be a subtype of the old type [106], because apply appears in the capability D at that point.
At line 37 we can perform an update that changes the type of apply or foo, because there is no call to them left on the stack; however, we cannot perform an update that changes the definition of T, because T is used concretely on the next line.

Programmers indicate where updates may occur by inserting into the program text a call to a special runtime system function, DSU_update. When our analysis sees this function, it annotates it with the current capability. At run time, this annotation is used to prevent updates that would violate the static determination of the analysis. Moreover, the runtime system ensures that if a type is updated, then any functions in the current program that use the type concretely are updated with it; that is, even if ASTdiff finds no difference between the ASTs of a function in the old and new program versions, we still load the new function version. This allows the static analysis to be less conservative. In particular, although the constraints on the form of capabilities induced by concrete usage are propagated backwards in the control flow, propagation does not continue into the callers of a function [106]. This propagation is not necessary because the update-time check ensures that all function calls are always compatible with any changed type representations.

The formalization and soundness proof of the updateability analysis are not part of this dissertation, and are presented elsewhere [106]. However, the implementation of this analysis for the full C language is one of the contributions of this dissertation. Our implementation extends the basic analysis to also track concrete uses of functions and global variables, which permits more flexible updates to them. In the former case, by considering a call a concrete use of a function, and function names as types, we can use the analysis to safely support a change to the type of the function. Similarly, in the latter case, by taking reads and writes of global variables as concrete uses, and the name of a global variable as a type, we can support representation changes to global variables. As shown in Section 2.3, the types of functions and global variables do change over time, so this extension has been critical to making DSU work for real programs.

The implementation also properly accounts for signals and for non-local control transfers via setjmp/longjmp, albeit quite conservatively. Since signal handlers can fire at any point in the program, we disallow occurrences of DSU_update inside a signal handler (or any function the handler might call), to avoid violating the assumptions of the analysis (we could allow update points there, but we would have to prevent updates that change type representations, function signatures, etc.). We model setjmp/longjmp as a non-local goto; that is, the updateability analysis assumes that any longjmp in the program could jump to any setjmp. The six server programs presented in Sections 3.5 and 4.4 do not employ setjmp/longjmp, but all of them use signals.

3.3.2 Abstraction-Violating Aliases

C's weak type system and low level of abstraction sometimes make it difficult to maintain the illusion that a wrapped type is the same as its underlying type. In particular, the use of unsafe casts and of the address-of (&) operator can reveal a type's representation through an alias. An example of this can be seen in Figure 3.1, where apply is called with the address of field x of t.
Within foo, which apply calls with this pointer, the statement *x = 1 is effectively a concrete use of T, but this fact is not apparent from x's type, which is simply int *. An update to the representation of struct T while within foo could therefore lead to a runtime error. A similar situation arises when a pointer to a typedef is used as a pointer to its concrete representation. We say that such aliases are abstraction violating.

One extreme solution would be to mark structs whose fields have their address taken as non-updateable. However, this can be relaxed by observing that updating T is dangerous only as long as an alias into a value of type T exists. Thus, if we know, at each possible update point, those types whose values might have live abstraction-violating aliases (AVAs), we can prevent just those types from being changed. We discover this set of types using an abstraction-violating alias analysis, which follows the general approach of effect reconstruction [67, 21, 5]. This analysis is described in Stoyle's dissertation [105].

The comments in Figure 3.1 illustrate the AVA analysis results for the example, where L is the set of types having live abstraction-violating aliases. L's contents are shown for each function, and the effect associated with variable x in functions foo and apply is shown to be T via the notation x:T. Looking at the example, we can see that the call function violates T's abstraction by taking the address of t.x and then passing this pointer to apply. This pointer is not used concretely in call, so it does not affect subsequent computation in that function: call's environment has no abstraction-violating pointers. As call is the only caller of apply, apply's associated L is empty. However, the environment of the body of apply does contain an abstraction-violating pointer, namely the parameter x. Thus, when apply calls foo via the pointer fp, T's abstraction is violated, and the L annotation for foo must contain T.

In the example, we consider all statements as possible update points, and so extend D according to the results of the AVA analysis. This is why, for example, T appears in the capability of both foo and apply: in both cases, T is in L or in the effect of a free variable in the environment (i.e., x). We do not show an annotation for foo_wrap because it is an auto-generated function (though Ginseng's safety analysis handles it properly).

3.3.3 Unsafe Casts and Polymorphism

To ensure that the program operates correctly, many representation-revealing casts must be disallowed. For example, given a declaration struct S { int x; int y; int z; }, a C programmer might use struct S as a subtype of struct T from Figure 3.1 by casting a struct S * to a struct T *. Given the way we represent updateable types, permitting this cast would be unsafe, since struct S and struct T might have distinct type transformers and version numbers, and treating one as the other may result in incorrect transformation. As a result, when our analysis discovers such a cast, it rules both types non-updateable.

However, it would be too restrictive to handle all casts by rendering the types involved non-updateable. For example, C programmers often use void * to program generic types. One might write a "generic" container library in which the function that inserts an element takes a void * as its argument, while the one that extracts an element returns a void *. The programmer casts the inserted element to void *, and casts the returned void * value back to its assumed type.
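A sketch of this idiom (names invented; any element type is erased to void * on insertion and recovered by a downcast on extraction):

    struct stack { void **items; int top; };

    void stack_push(struct stack *s, void *elem) {  /* upcast at the call site */
        s->items[s->top++] = elem;
    }

    void *stack_pop(struct stack *s) {              /* caller downcasts the result */
        return s->items[--s->top];
    }

    /* usage: neither cast reveals struct conn's representation
         stack_push(s, (void *)conn);
         struct conn *c = (struct conn *)stack_pop(s);   */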
This idiom corresponds to parametric polymorphism in languages like ML and Haskell. Programmers also encode existential types using void * to build constructs like callback functions. For example:

struct callback {
  void *env;
  void (*fp)(void *env, int arg);
};

void invoke(struct callback *cb, int arg) {
  cb->fp(cb->env, arg);
}

In this case, the env field of callback is existentially quantified: users can construct callbacks for which there exists some consistent type τ that can be given to the env field and to the first argument of fp, but the invoke function is indifferent to this type's actual identity. Because τ can differ between callbacks, C programmers must use the type void *, with upcasts when creating callbacks and downcasts when using them.

If these idioms are used correctly, they pose no problem for Ginseng's compilation approach, since they do not reveal anything about a type's representation. However, we cannot treat casts to and from void * as legal in general, because void * could be used to "launder" an unsafe cast. For example, we might cast a struct S * to a void *, and then the void * to a struct T *. Each cast may seem benign on its own, but the two become unsafe in combination. To handle this situation, our analysis annotates each void * type in the program with the set of concrete types that might have been cast to it; e.g., casting a struct T * to a void * would add struct T to the set. When casting a void * to a struct S *, the analysis ensures that the annotation on the void * contains a single element, which matches struct S. If it does not, this is a potentially unsafe cast, and both struct T and struct S are made non-updateable. Since our analysis is not context-sensitive, some legal downcasts will be forbidden, for example when a container library is used twice in the program to hold different object types. Fortunately, such context-sensitivity is rarely needed by the programs we have considered. In the worst case, we inspect the program manually to decide whether a cast is safe or not, and override the analysis results with a pragma. The annotations the programmer needs to add to override the analysis, along with some examples of their use, are presented in Section A.1.4.

3.3.4 Ginseng's Type Safety vs. Activeness Check

One popular way of ensuring proper timing is to restrict an update from taking place if it affects code that is actively executing, i.e., code that is referenced by the stack of a running thread [23, 24, 1]. We call this restriction the "activeness check." (A common criticism of the activeness check is that it is too strong: it precludes updates to code that never becomes inactive, e.g., the body of an infinite loop. In our experience, such updates are relatively rare, and in any case can be supported using techniques such as loop extraction, Section 3.2.4.) Unfortunately, while the activeness check precludes many problematic update times, not all problematic update times are ruled out.

Ginseng's safety check is comparable to the activeness check, though there are some differences. Our check permits updates to the body or signature of the current function, whereas the activeness check does not. However, since we take into account abstraction-violating aliases, we are more restrictive as to what types may be updated. For example, an alias p into a field of struct T can lead to a type safety violation if p is dereferenced after the definition of struct T changes. Ginseng's safety analysis only permits updates to struct T after the alias p is no longer live.

3.3.5 Choosing Update Points

In Section 3.3 we mentioned that programmers choose where to place update points.
Placing update points where global invariants hold simplifies reasoning about update safety and writing the update. We call such points quiescent points: points in the program at which there are no partially-completed operations and all global state is consistent (i.e., global invariants are satisfied). Dynamic updates are best applied at such quiescent points, so that writing an update is straightforward.

Ginseng adds constraints on the types that can change at a programmer-inserted update point, so that an update does not violate type safety. However, Ginseng does not provide guidance on where an update point should be placed; it only ratifies the programmer's decision in terms of type safety. A problem that can arise from bad update point placement is best illustrated by the following example. The first listing is the old program version; the second is the new version. The only change is moving the call to g from the body of h into the body of f.

Old version:

1 void g() { ... }
2 void f() { ... }
3
4 void h() {
5
6   f();
7   g();
8 }

New version:

1 void g() { ... }
2 void f() { g(); }
3
4 void h() {
5
6   f();
7
8 }

While the old and new programs essentially "do the same thing", a badly timed update can lead to unexpected behavior, even though the update is type safe. Suppose the update occurs on line 5 in the old program. The call to f will be to the new version, which calls g; f then returns to its caller, the old h, which calls g (line 7) again. Note that despite the update being type safe, we ended up calling g twice, which is problematic. If g is a memory deallocation function such as free, we end up freeing a location twice. If g is a logging function, we end up with a duplicated log entry. We can construct a symmetrical scenario where g is moved from f into h, and as a result of the update, we fail to call it at all.

This example illustrates the importance of update timing and its impact on update correctness. In Chapter 5 we will show how programmers can designate code blocks that "go together" (e.g., the body of function h in our example, or one iteration of an event-processing loop). Based on this programmer indication, Ginseng enforces a property named version consistency: all the functions and global variables in such a block are accessed at the same program version. In our example, Ginseng would prevent an update that changes both f and h from being applied at line 5, because this would lead to a version-inconsistent execution of the code in the body of h.

Note that a quiescent point is related to, but not identical with, a point with empty capability (Section 3.3); a quiescent point's capability may not necessarily be empty, although it is usually small. On the other hand, an empty capability does not imply quiescence, but rather indicates that there are no concrete uses of types beyond the current point.

3.4 Dynamic Patches

Patch Generation. For each new release we need to generate a dynamic patch, which consists of new and updated functions and global variables, type transformers, and state transformers. The Ginseng patch generator generates most of a dynamic patch automatically, in three steps.
First, it compares the old and new versions of a program using ASTdiff (Section 2.2) to discover the new and modified definitions. Second, it adds the new and changed definitions to the patch file, where unchanged definitions are made extern. Third, it generates type transformers for all changed types by attempting to construct a conversion from the old type into the new type [50]. For example, if a struct type has been extended by an extra field, the generator produces code to copy the common fields and adds a default initializer for the added one. This simplistic approach to patch generation is surprisingly effective, requiring few manual adjustments; in Section A.2 we present some concrete examples of how the programmer writes state transformers and adjusts the auto-generated type transformers.

After the patch is generated and the state and/or type transformers are written, we pass the resulting C file to Ginseng, and the final result is compiled to a shared library so that it can be linked into the running program. Ginseng compiles the patch just as it does the initial version of a program, but also introduces initialization code to be run at update time. The initialization code effectively "glues" the dynamic patch into the running program, as explained next.

Dynamic Patch Example. Figure 3.3 presents the source code for an actual dynamic patch, corresponding to the update from Zebra 0.92a to 0.93a. All the code, except the state transformer (DSU_state_xform, lines 6-7) and the programmer-adjusted parts of the type transformers (DSU_tt_vty_v0, lines 1-4), is auto-generated.

 1 void DSU_tt_vty_v0(struct vty_old *xin,
 2                    struct vty_new *xout,
 3                    struct DSU_wrapper_struct_vty *xnew)
 4 { ... }
 5
 6 void DSU_state_xform(void)
 7 { ... }
 8
 9
10 /* New functions */
11 int access_list_standard_v0 (...)
12 { ... }
13
14 /* Changed functions */
15 extern void vty_serv_sock_v0(unsigned short port, char *path);
16 void vty_serv_sock_v1(char const *hostname, unsigned short port, char *path)
17 { ... }
18
19 extern int config_write_access_ipv4_v0(struct DSU_wrapper_struct_vty *vty);
20 int config_write_access_ipv4_v1(struct DSU_wrapper_struct_vty *vty)
21 { ... }
22
23 void DSU_install_patch(void)
24 {
25   DSU_latest_tt_struct_vty = &DSU_tt_vty_v0;
26
27   vty_serv_sock_ptr = &vty_serv_sock_v1;
28   config_write_access_ipv4_ptr = &config_write_access_ipv4_v1;
29   access_list_standard_host_ptr = &access_list_standard_host_v0;
30
31   DSU_state_xform();
32 }
33
34 char *DSU_update_contents = "struct vty vty_serv_sock config_write_access_ipv4 ...";

Figure 3.3: Ginseng dynamic patch for the update Zebra 0.92a to 0.93a.

The first part of the patch (lines 1-7) contains the type and state transformers. The second part (lines 10-22) contains new and changed functions and global variables. Note how Ginseng performs name mangling by renaming each function definition according to its version: access_list_standard is a new function, hence its name ends in _v0, whereas vty_serv_sock and config_write_access are now at their second version. The function DSU_install_patch is the auto-generated "glue code" that installs the latest version of the type transformer for the types that have changed (line 25), sets the function pointers for new and changed functions (lines 27-29), and finally invokes the state transformer (line 31). Ginseng also includes the update contents (line 34): a string naming the set of functions, types, and global variables changed by the update.
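The transformer bodies are elided in Figure 3.3. For a hypothetical struct that gains one field between versions, the generated type transformer might look like the following (type and field names invented; the signature style loosely mirrors DSU_tt_vty_v0 above):

    struct conn_old { int fd; int uid; };                  /* version n   */
    struct conn_new { int fd; int uid; int n_requests; };  /* version n+1 */

    void DSU_tt_conn_v0(struct conn_old *xin, struct conn_new *xout) {
        xout->fd  = xin->fd;     /* auto-generated: copy the common fields */
        xout->uid = xin->uid;
        xout->n_requests = 0;    /* default initializer for the added field;
                                    adjusted by the programmer if needed */
    }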
Runtime System. To perform an update, the user sends a signal to the running program, which alerts the runtime system. When the program reaches a possible update point (i.e., the first call to DSU_update for single-threaded programs, or the first thread to reach an induced update point for multi-threaded programs), the runtime system will try to perform the update. First, Ginseng loads the shared library containing the dynamic patch into memory, using dlopen [64]. Then, the runtime system retrieves the update contents string DSU_update_contents. Now the runtime system is ready to perform the update safety check. For single-threaded programs, DSU_update_contents is checked against the capability of the update point (Section 3.3.1). For multi-threaded programs, DSU_update_contents is checked against each thread's capability and contextual effects (Section 4.2.1). The check prevents Ginseng from applying an update at an unsafe point. If the check fails, the runtime system gives control back to the program and will try to apply the update at a later update point. If the check succeeds, the runtime system installs the patch. Patch installation is very simple: Ginseng calls DSU_install_patch, which installs the type transformers for the updated types, redirects changed functions to their new versions, and finally invokes the state transformer if the user has provided one. Type transformers are not called at update time; they are invoked lazily, when the values to be updated are accessed.

Our current runtime system has two main limitations. First, we do not support patch unloading, so old code and data will persist following an update. This memory leak has been minimal in practice (between 21% and 40% after three years' worth of patches for our benchmark applications) because, as explained in Section 2.3, deletions are infrequent. Second, dynamic updates cannot be rolled back transactionally. If, during an update, an error is encountered in the Ginseng-generated glue code, the runtime system, or the user-supplied state transformer, we do not yet have a safe mechanism to abort the update and restore the state to the pre-update one. Possible approaches to solving this problem are saving the values of global variables prior to the update and restoring them upon failure [50], or using speculations [108] or transactional memory [15] to roll back the effects of a failed update. We leave these problems to future work.

3.5 Experience

We now present our experience with dynamically updating three single-threaded open-source programs: Vsftpd (the Very Secure FTP daemon), the OpenSSH Sshd daemon, and the Zebra routing daemon from the GNU Zebra routing package (http://www.zebra.org); in Chapter 4 we will present our experience with updating multi-threaded programs. We chose these programs because they are long-running, maintain soft state that could be usefully preserved across updates, and are in wide use. For each program we downloaded releases spanning several years and then applied the methodology shown in Figure 1.1. In particular, we compiled the earliest release to be updateable and started running it. Then we generated dynamic patches for subsequent releases and applied them on the fly, in release order, while the program was actively performing work (serving files, establishing connections, etc.). With this process, we identified key application features that make updating the applications easy or hard.
We also identified strong points of our approach (which enabled most of the updates to be generated automatically), along with issues that need to be addressed to make the updating process easier, more flexible, and applicable to a broader category of applications. In the rest of this section, we describe the applications, their evolution history, and the manual effort required to dynamically update them; identify application characteristics and Ginseng features that make updating feasible; and conclude by reviewing the factors that enabled us to meet the challenges set forth in Section 3.1.

3.5.1 Applications

Table 3.1 shows release and update information for each program. Columns 2-4 show the version number, release date, and program size for each release. Column 5 describes the nature of each release (as categorized at http://freshmeat.net/). To give a sense of the programmer effort for each update, column 6 shows the number of type transformers for that specific update, while column 7 presents the size of the state transformer in lines of code; "-" means no type or state transformers were needed for that release.

Program  Release    Date   Size    Description                Type xform  State xform
                           (LOC)                              (count)     (LOC)
Vsftpd   1.1.0      07/02  10,141
         1.1.1      10/02  10,245  Minor bug/security fixes       2          -
         1.1.2      10/02  10,540  Major feature enhanc.          5          -
         1.1.3      11/02  10,723  Minor feature enhanc.          3          -
         1.2.0      05/03  12,027  Major feature enhanc.          7          1
         1.2.1      11/03  12,662  Minor feature enhanc.          6          -
         1.2.2      04/04  12,691  Major bug/security fixes       3          -
         2.0.0      07/04  13,465  Major feature enhanc.          4          -
         2.0.1      07/04  13,478  Major bug/security fixes       -          -
         2.0.2pre2  07/04  13,531  Other                          -          -
         2.0.2pre3  03/05  14,712  Other                          -          -
         2.0.2      03/05  17,386  Major bug/security fixes       -          -
         2.0.3      03/05  17,424  Major bug/security fixes       1          -
Sshd     3.5p1      10/02  47,424
         3.6.1p1    04/03  49,120  Minor bug/security fixes       3          -
         3.6.1p2    04/03  49,134  Major bug/security fixes       -          -
         3.7.1p1    09/03  51,133  Major bug/security fixes       8          -
         3.7.1p2    04/03  51,145  Major bug/security fixes       -          -
         3.8p1      02/04  52,547  Other                          8          -
         3.8.1p1    04/04  52,549  Minor bug/security fixes       1          -
         3.9p1      08/04  53,979  Major feature enhanc.          8          -
         4.0        03/05  56,803  Minor feature enhanc.          9          -
         4.1        05/05  56,840  Minor bug/security fixes       3          -
         4.2p1      09/05  58,104  Minor bug/security fixes       7          -
Zebra    0.92a      08/01  41,630
         0.93a      07/02  40,649  Major bugfixes                11         30
         0.93b      09/02  40,679  Minor fixes                    1          1
         0.94       11/03  45,447  Minor security fixes           3         17
         0.95       03/05  45,546  Major bugfixes                 7          -
         0.95a      09/05  45,586  Other                          1          -

Table 3.1: Application releases.

                  Functions                     Types             Global variables
Program   Add  Del.  Proto chg.  Body chg.  Add  Del.  Chg.     Add  Del.  Chg.
Vsftpd     97   21       33         308      12    2     6       72    9    15
Sshd      131   19       85         752      27    2    19       70   19    29
Zebra     134   44       13         321      24    6     4       56   11    52

Table 3.2: Changes to applications.

We now briefly discuss each application, then describe how the applications changed over a three-year period, and finally discuss the manual effort required to dynamically update them.

Vsftpd stands for "Very Secure FTP Daemon" and is now the de facto FTP server in major Unix distributions. For our study, we considered the 13 versions from 1.1.0 through 2.0.3. As can be seen in Table 3.1, in the time frame we considered there were 3 major feature enhancements, 4 major bugfixes, 2 minor feature enhancements, and 1 minor bugfix.

Sshd is the SSH daemon from the OpenSSH suite, which is the standard open-source release of the widely-used secure shell protocols. We upgraded Sshd 10 times, corresponding to 11 OpenSSH releases (version 3.5p1 to 4.2p1) over three years.
GNU Zebra is a TCP/IP routing software package for building dedicated routers that support the RIP, OSPF, and BGP protocols on top of IPv4 or IPv6. It consists of protocol daemons (RIPd, OSPFd, BGPd) and a Zebra daemon that acts as a mediator between the protocol daemons and the kernel (Figure 3.4), storing and managing acquired routes. Storing routes in Zebra allows protocol daemons to be stopped and restarted without discarding and re-learning routes (which can be a time-consuming process). We upgraded Zebra 5 times, corresponding to 6 releases (version 0.92a to 0.95a) over 4 years.

[Figure 3.4: Zebra architecture. The protocol daemons (RIPd, OSPFd, BGPd) exchange route redistributions and announcements with the Zebra daemon, which performs route additions/deletions in the kernel.]

Evolution History. Table 3.2 contains the cumulative number of changes that occurred to the software over that span, computed using ASTdiff. "Types" refers to structs, unions, and typedefs together. Global variable changes consist of changes either to global variable types or to global variable static initializers. As an example reading of the table, notice that for Vsftpd, 97 functions were added, 21 were deleted, 33 functions had their prototype changed, and 308 functions had their bodies changed. For Sshd, 19 types changed; for Zebra, there were 52 global variable changes.

We mentioned in Chapter 2 that a dynamic software updating system must support changes, additions, and deletions of functions, types, and global variables if it is to handle realistic software evolution. Ginseng supports all of these changes; as a result, we have been able to dynamically update the three applications from the earliest to the latest versions we considered.

3.5.2 Changes To Original Source Code

Safely updating these applications with Ginseng required making a few small changes and additions to their source code. These changes amount to around 50 lines of code for Vsftpd and Sshd and 40 lines for Zebra, for each program version. The changes consisted of introducing named types for some global variables (we need to introduce such types for global variables whose addresses are taken, to support changes to these variables' types and static initializers); directives to the compiler (for adding update points, analysis, and code extraction, described in detail in Section A.1); and, in one case (Vsftpd), instantiating an existential use of void * (see Section 3.3.3). Another change to Vsftpd is discussed in the next subsection.

For each new release, we would use the Ginseng patch generator to generate the initial patch, and then verify or complete the auto-generated type transformers and write state transformers (where needed, which was rare, as can be seen in columns 6-7 of Table 3.1). This effort was typically minimal. Table 3.3 presents the breakdown of patch code, across all releases, into manual and auto-generated source code: the first column shows the number of source code lines we had to write for type and state transformers, the second column shows the lines we had to write to cope with changes in global variables' types or static initializers, and the third column shows the amount of code coming out of the patch generator. The code dealing with changes in static initializers for global variables is frequently a mere copy-paste of the variable's static initializer.

                    Source code (LOC)
Program   Type + state xform   Gvar changes   Patch generator
          (manual)             (manual)       (auto)
Vsftpd          162                930            83,965
Sshd            125                659           248,587
Zebra            49                244            43,173

Table 3.3: Patch source code breakdown.

3.5.3 Dynamic Updating Catalysts

In the process of updating the three applications, we discovered four factors that make programs amenable to dynamic updating.
Quiescence. As mentioned in Section 3.3.5, dynamic updates are best applied at quiescent points, so that writing an update is straightforward. However, it is the programmer's responsibility to find such points and indicate them to the compiler via DSU_update(). Fortunately, each application was structured around an event-processing loop, where the end of the loop defines a stable quiescent point: there are no pending function calls, little or no data on the stack, and the global state is consistent. At update time, new versions of the functions are installed and the global state is transformed, so the next iteration of the loop effectively executes the new program.

For instance, Vsftpd is structured around two infinite loops: one for accepting new client connections, and one for handling commands in existing connections (Figure 3.5).

int main() {
  init();
  conn = accept_loop();
  L1: {
    init_conn(conn);
    handle_conn(conn);
  }
}
int accept_loop() {
  L2: while (1) {
    fd = accept();
    if (!fork()) return fd;
  }
}
void handle_conn(fd) {
  L3: while (1) {
    read(cmd, fd);
  }
}

Figure 3.5: Vsftpd: simplified structure.

Each time a connection is accepted, the parent forks a new process and returns from the accept loop within the child process. The main function then initializes the connection and calls handle_conn to process user commands. To be able to update the long-running loops, and to handle updates following the accept loop in main, we used loop extraction (Section 3.2.4) at each of the three labeled locations so that they could be properly updated. Note that although L1 is not a loop, by using loop extraction we were able to update code on main's stack (the continuation of accept_loop()) without replacing main itself. For each of the three applications we used one programmer-inserted update point, in the main process, at the end of one iteration of the accept loop (hence both the parent process and each new child process spawned as a result of accepting a connection will always execute the latest version).

Functional State Transformation. Our mechanisms for transforming global state (state transformers) and local state (type transformers) assume that we can write a function that transforms old program state into new program state. Unfortunately, it is sometimes not possible to impose the semantics of the new application on the existing state. We encountered two such cases in our test applications. In the upgrade from Sshd 3.7.1p2 to Sshd 3.8p1, a new security feature was introduced: the user's Unix password is checked during the authentication phase, and if the password has expired, port forwarding will not be allowed on the SSH connection. However, when dynamically updating a live connection from version 3.7.1p2 to 3.8p1, the authentication phase has already passed, so the new policy is not enforced for existing connections (though they could be shut down forcibly). For new connection requests coming in after the update, the new check is, of course, performed.

A similar situation arose in going from Vsftpd 1.1.1 to 1.1.2. The new release introduced per-IP-address connection limits by mapping the ID of each connection process to a count associated with the remote IP address. These counts are incremented when a process is forked and decremented in a signal handler when a process dies. Unfortunately, following an update, any existing processes will not have been added to the newly introduced map, so the signal handler will not execute properly; in effect, the new state is not a function of the old state. In this case, the easy remedy is to modify the 1.1.2 signal handler to not decrement the count if the process ID is not known.
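A sketch of that remedy (all names and helpers are invented; only the guard against unknown process IDs is the point, and a production handler would loop over waitpid):

    #include <sys/types.h>
    #include <sys/wait.h>

    struct ip_entry { int count; };
    extern struct ip_entry *conn_map_lookup(pid_t pid);  /* invented map helper */

    void sigchld_handler(int sig) {
        pid_t pid = waitpid(-1, NULL, WNOHANG);
        struct ip_entry *e = conn_map_lookup(pid);
        if (e != NULL)    /* unknown pid: the process predates the update */
            e->count--;   /* only decrement counts for known processes */
    }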
Functional State Transformation. Our mechanisms for transforming global state (state transformers) and local state (type transformers) assume that we can write a function that transforms old program state into new program state. Unfortunately, sometimes it is not possible to impose the semantics of the new application on the existing state. We encountered two such cases in our test applications. In the upgrade from Sshd 3.7.1p2 to Sshd 3.8p1, a new security feature was introduced: the user's Unix password is checked during the authentication phase, and if the password has expired, port forwarding will not be allowed on the SSH connection. However, when dynamically updating a live connection from version 3.7.1p2 to 3.8p1, the authentication phase has already passed, so the new policy is not enforced for existing connections (though they could be shut down forcibly). For new connection requests coming in after the update, the new check is, of course, performed.

A similar situation arose in going from Vsftpd 1.1.1 to 1.1.2. The new release introduced per-IP address connection limits by mapping the ID of each connection process to a count associated with the remote IP address. These counts are incremented when a process is forked and decremented in a signal handler when a process dies. Unfortunately, following an update, any current processes will not have been added to the newly introduced map, and so the signal handler will not execute properly. In effect, the new state is not a function of the old state. In this case, the easy remedy is to modify the 1.1.2 signal handler to not decrement the count if the process ID is not known.
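The remedy can be as small as a guard in the handler. The sketch below is our own reconstruction of the kind of check involved; the struct and helper names are hypothetical, not Vsftpd's actual code.

#include <sys/types.h>

struct ip_count { int count; };
pid_t wait_reap_one(void);                 /* hypothetical helpers */
struct ip_count *ip_map_lookup(pid_t pid);

void sigchld_handler(int sig) {
  (void)sig;
  pid_t pid = wait_reap_one();
  struct ip_count *entry = ip_map_lookup(pid);
  if (entry != NULL)   /* unknown PID: the process was forked before */
    entry->count--;    /* the update, so it is simply skipped        */
}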
When transforming some value, a type transformer can only refer to the old version of the value and the latest version of global variables, which means that in principle some transformations may be difficult or impossible to carry out. In practice we did not find this to be a problem: for all 29 type transformers we had to write, the programmer effort was limited to initializing newly added struct fields.

Type-safe Programs. As mentioned in Section 3.3, low-level programming idioms might result in types being marked non-updateable by the analysis. Since a non-updateable type restricts the range of possible updates, we would like to maximize the number of updateable types; the solution is either a more precise analysis, or inspecting specific type uses by hand and overriding the analysis for that particular type. For the programs we have considered, the techniques presented in Sections 3.3.2 and 3.3.3 increased the precision of the analysis and thereby greatly reduced the need to inspect the program manually.

For instance, in Vsftpd, strings are represented by a struct mystr that carries the string proper along with its length and allocated size. The address of the string field is passed to functions, hence revealing struct mystr's representation, but our abstraction-violation analysis was able to detect that the aliases were temporary and did not escape the scope of the callee; hence the type was updateable at the conclusion of the call. Polymorphism is employed in all three programs; using the void * analysis (Section 3.3.3) we were able to detect type-safe uses of void * and reduce the number of casts that have to be manually inspected. Inline assembly can compromise type safety as well: we do not know how global variables, types and functions are used when passed to assembly code. Our analysis treats inline assembly conservatively by preventing changes to type definitions and to the types of functions or global variables used in inline assembly. However, when a manual inspection confirms that such uses are safe, we can decide to override the analysis; we had only one such situation, in Sshd. In the end, we manually overrode the analysis for only a handful of types: 0 for Vsftpd, 1 for Zebra, and 4 for Sshd.

Our type wrapping scheme relies on the fact that programs rarely depend on how types are physically laid out in memory, i.e., that types are treated abstractly in this respect. Fortunately, this was a good assumption for these programs. We could not type-wrap some "low-level" types, e.g., Vsftpd's representation of an IP address, since its layout is ultimately fixed by the OS syscall API. On the other hand, low-level structures like this one rarely change, since they are tied to external specifications.

Robust Design. We wanted our DSU approach to be general enough to be applied to off-the-shelf software, written without dynamic updates in mind (as was the case with our test applications). However, there are measures developers can take to make applications more update-friendly. Apart from the features mentioned above (quiescent points, type safety, and abstract types), we have also found defensive programming and extensive test cases to be helpful in developing and validating the updates. All three programs we looked at were written defensively, using assert liberally, which facilitated error detection and helped us spot Ginseng bugs relatively easily. By looking at the assertions in the code, we were able to identify the invariants the programs relied on, and preserve them across updates. Sshd comes with a rigorous test suite that provides extensive code coverage, and for Zebra and Vsftpd we created our own suites to test a broad range of features.

3.5.4 Summary

We believe we have addressed the DSU challenges set forth in Section 3.1. We did not have to change the applications extensively to render them updateable. Patch generation was mostly automatic, and writing the manual parts was easy. We were able to support a large variety of changes to the applications; as can be seen in Tables 3.1 and 3.2, the applications changed significantly during the three-to-four-year time frame we considered. Once we became familiar with the application structure (e.g., interaction between components, global invariants), writing patches was easy, with all the infrastructure generated automatically; the only manual tasks were to initialize newly added fields, write state transformers, or make some small code changes.

A combination of factors helped us address these challenges: (1) the programs were amenable to dynamic updating (easily identifiable quiescent points in the application, application changes that allowed updates to be written as functions from the old state to the new state, robust application design, and moderate use of type-unsafe, low-level code), and (2) Ginseng, especially its analysis refinements and support for automation, made the task of constructing and validating updates easy, even for applications in the range of 50-60 KLOC.

3.6 Performance

In this section, we evaluate the impact of our approach on updateable software. We analyzed the overhead introduced by DSU by subjecting the instrumented applications to a variety of "real world" tests. We considered the following aspects:

1. Application performance. We measured the overhead that updateability imposes on an application's performance by running stress tests. We found that DSU overhead is modest for I/O-bound applications, but significant for CPU-bound ones.

2. Memory footprint. Type wrapping, extra version checks and dynamic patches result in an increased memory footprint for DSU applications. We found the increase to be negligible for updateable and updated applications, but after stacking multiple patches, the memory footprint increase is detectable.

3. Service disruption. We measured the cost of performing an actual update while the application is in use. The update will cause a delay in the application's processing while the patch is loaded and applied, and will result in an amortized overhead as data is transformed. In all the updates we performed, even for large patches, we found the update time to be less than 5 ms.

4. Type wrapping overhead. In order to measure the impact of type wrapping on CPU-bound applications, we instrumented an application that performs computations on named types exclusively, KissFFT. We found type wrapping to introduce a significant overhead, in terms of both performance and memory footprint.
We also measured the running time of Ginseng when compiling our benchmark programs, to gauge the overhead of compilation and our analyses.

We conducted our experiments on dual Xeon 2GHz servers with 1GB of RAM, connected by a 100Mbps Fast Ethernet network. The systems ran Fedora Core 3, kernel version 2.6.10. All C code, generated by Ginseng or otherwise, was compiled with gcc 3.4.2 at optimization level -O2. We also compiled and ran the experiments at optimization level -O3, but apart from a slight increase in memory footprint (less than 1%), there was no detectable difference in performance. Unless otherwise noted, we report the median of 11 runs.

           Connection time (ms)
Program    stock    updateable      updated once    streak
Vsftpd      6.71     6.90 (2.83%)    7.04 (4.91%)    8.40 (25.18%)
Sshd       47.62    49.26 (3.44%)   49.50 (3.94%)   62.89 (32.06%)
Zebra       0.63     0.65 (3.17%)    0.65 (3.17%)    0.67 (6.34%)

           Transfer rate (MB/s)
Program    stock    updateable      updated once    streak
Vsftpd      7.95     7.95 (0%)       7.97 (0.25%)    7.98 (0.37%)
Sshd        7.85     7.84 (-0.12%)   7.83 (-0.25%)   7.84 (-0.12%)

Table 3.4: Server performance, in absolute numbers and relative to the stock version.

3.6.1 Application Performance

To assess the impact of updateability on application performance, we ran different stress tests on the updateable applications. For each application, we measure the performance of its most recent version under four configurations. The stock configuration is the application compiled normally, without updating. The updateable configuration is the application compiled with updating support. The updated once configuration is the application after performing one update, whereas the updated streak configuration is the application compiled from its oldest version and then dynamically updated multiple times to bring it to the most recent version; this configuration is useful for considering any longer-term effects on performance due to updating.

Vsftpd. We tested Vsftpd performance using two metrics: connection time and transfer rate. For connection time, we measured the time it took wget to sequentially request 500 files of size 0, and divided by 500. Since wget opens a new connection for each file, and disk transfers are not involved, we get a picture of the overhead DSU imposes on FTP clients. As seen in Table 3.4, the updateable, updated and streak-updated versions were 3%, 5% and 25% slower than the stock server. With a difference of at most 1.7 ms, we do not believe this to be a problem for FTP users.

These measurements seem to suggest a progressive slowdown due to updating. The primary reason for this appears to be poorer spatial locality. Using OProfile (http://oprofile.sourceforge.net), we measured the total cycles, instructions retired, and cache and TLB misses during benchmark runs of the one-update and streak-updated versions. We found that the effective CPI of the streak-updated version was consistently higher, and that this was attributable to cache and TLB misses. Such misses are understandable: code and data that were close together in the original program are now spread across multiple shared libraries.

We also measured the median transfer rate of a single 600 MB file to a single client. The results are shown in Table 3.4; since file transfer is a network-bound operation, the transfer rates of the different configurations are virtually identical.

Sshd. For Sshd we measured the same indicators as for Vsftpd: connection time and transfer rate. For the former, we loaded the server with 1000 concurrent requests and measured the total elapsed time, divided by 1000.
(Client-server authentication was based on public key, hence no manual intervention was needed.) Each client connection exited immediately after it was established (by running the exit command). The measured connection time is shown in Table 3.4. The updateable, updated and streak-updated versions were 3%, 4% and 32% slower than the stock server. Again, we do not think the 15 ms difference is going to be noticed in practice. The CPU-intensive nature of authentication and session key computation accounts for SSH connection time being almost 10 times larger than for FTP.

To measure the sustained transfer rate over SSH we used scp to copy a 600 MB file. As shown in Table 3.4, the results are similar to the Vsftpd benchmark: the transfer is network-bound and the DSU overhead is undetectable.

Zebra. Since Zebra is primarily used for route proxying and redistribution, the focus of the Zebra experiments was different than for Vsftpd and Sshd. First, we measured the overhead DSU imposes on route addition and deletion. We started each protocol daemon alone with Zebra, and programmed the protocol daemon to add and delete 100,000 routes. When passing routes through the updateable, updated and streak-updated versions of the Zebra daemon, the DSU overhead was 4%, 6% and 12%, compared to the stock case (first three clusters in Figure 3.6). Second, we measured route redistribution performance. We started the RIP daemon, turned on redistribution to the OSPF and BGP daemons, programmed the RIP daemon to add and delete 100,000 routes, and measured the time it took until the route updates were reflected back into the OSPF and BGP routing tables. Similarly, we timed redistribution of OSPF routes to the RIP and BGP daemons. BGP redistribution is not supported by Zebra. The DSU overhead in the route redistribution case (last two clusters in Figure 3.6) is the same as for the "no redistribution" case above: 4%, 6% and 12%, respectively.

[Figure 3.6: Zebra performance. Total time (s) for route addition/deletion through RIP, OSPF, and BGP, and for redistribution of RIP (R-RIP) and OSPF (R-OSPF) routes, under the stock, updateable, updated once, and updated streak configurations.]

Zebra offers a command line interface for remote administration, so, as a sanity check only, we measured the connection time for Zebra as well. We wrote a simple client that connects to the Zebra daemon, authenticates, executes a simple command ("show version") and then exits. We measured (Table 3.4) a 3%, 3%, and 6% increase in connection times for the updateable, updated once and streak-updated Zebra versions, respectively.

KissFFT. The overhead of DSU is dwarfed by I/O costs in our experiments. On the one hand, this is good because it illustrates that for a relevant class of applications, DSU is not cost-prohibitive. On the other hand, it does not give a sense of the costs of DSU for more compute-bound applications. To get a sense of this cost, we instrumented KissFFT (http://sourceforge.net/projects/kissfft), a Fast Fourier Transform library. Figure 3.7 shows the total time to perform 100,000 Fast Fourier Transforms on 10,000 points. The updateable, updated once and updated streak versions were on average more than twice as slow (a factor of 2.29x) as the stock version.

[Figure 3.7: KissFFT: DSU impact on performance. Running time (s): stock 11.89, updateable 27.28, updated once 27.59, updated streak 27.67.]
We analyzed KissFFT to understand the source of the overhead. The program operates on a large array of complex numbers, and each complex number is represented as a struct complex. Therefore, a con_complex coercion has to be performed before accessing fields. Moreover, each complex number carries some slop to accommodate future growth. Together, these two overheads can make a significant difference, as shown in Figure 3.8.

First, the compiler does not attempt to optimize away redundant cons; that is, KissFFT will perform consecutive cons on data that could not have been updated in between. As shown in the figure, hand-optimizing away redundant cons in the main loop yielded some improvement. Second, the added slop results in poor cache behavior, as far fewer complex numbers in the array fit in the cache. The figure shows the effect of setting the slop to 0, effectively just adding the version field to the struct. Avoiding redundant cons reduces the DSU penalty to 100%, eliminating the slop reduces the DSU penalty to 78%, and combining the two techniques yields a final DSU overhead of only 42%.

[Figure 3.8: KissFFT: impact of optimizations on running time (s): stock 11.89; no slop, with con optimization 16.93; no slop, without con optimization 21.16; with slop, with con optimization 23.78; updateable 27.28.]

We believe that in the future we could leverage static analysis to avoid introducing redundant cons, and could explore different updateable type representations (such as the hybrid solution described in Section 3.2.2) to reduce the overhead of the slop. Note, however, that KissFFT belongs to a category of performance-critical applications where the cost of DSU might outweigh the benefits; we discuss other such applications in Section 7.1. The point of our KissFFT experiments is to explore the cost of DSU in CPU-intensive applications in which uses of named types abound.
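To make the two costs concrete, the sketch below shows one plausible shape for a wrapped type and the per-access con check; the layout, names, and slop size are our assumptions for illustration (the actual representation is described in Section 3.2.2). Hoisting con_complex out of the loop and shrinking slop to zero correspond to the two optimizations measured in Figure 3.8.

/* Assumed wrapped layout: a version tag plus padded storage. */
struct complex_wrapped {
  unsigned version;              /* representation version tag */
  union {
    struct { double r, i; } v1; /* current representation */
    char slop[32];              /* spare room for future growth */
  } u;
};

extern unsigned latest_version;
void transform_complex(struct complex_wrapped *c); /* type transformer */

/* con check: bring a value up to date before its fields are accessed. */
static void con_complex(struct complex_wrapped *c) {
  if (c->version != latest_version)
    transform_complex(c);
}

double sum_real(struct complex_wrapped *a, int n) {
  double s = 0;
  for (int k = 0; k < n; k++) {
    con_complex(&a[k]);   /* one con per element access; consecutive */
    s += a[k].u.v1.r;     /* cons on data that cannot have changed   */
  }                       /* in between are redundant                */
  return s;
}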
3.6.2 Memory Footprint

Type wrapping, function indirection, version checking and loop extraction all consume extra space, so updateable applications have larger memory footprints. Figure 3.9 reports memory footprints for the four scenarios, with quartiles as error bars. (We present memory footprint data using error bars, as opposed to just numbers as we did for the performance experiments, because there is a lot of variability in the data.) Measurements were made using pmap at the conclusion of each throughput benchmark.

[Figure 3.9: Memory footprints (MB) for vsftpd, sshd, zebra, and kissfft under the stock, updateable, updated once, and updated streak configurations.]

For the updateable and updated cases, the only significant increase is displayed by KissFFT. The explanation is quite simple: KissFFT uses a large number of structs whose size grows by a factor greater than 2 due to type wrapping. The footprint increases for Vsftpd, Sshd and Zebra are overshadowed by OS variability. However, for the streak updates, the median footprint increase (relative to the stock version) is 21%, 40% and 27% for Vsftpd, Sshd and Zebra, respectively. The larger footprint increase for streak updates is expected, since dynamic patches for three years' worth of updates are added into the memory space of the running program and never unloaded (Section 3.4).

3.6.3 Service Disruption

One of the goals of DSU is to avoid the service interruption caused by the need to apply software patches. By applying these patches on-line, we preserve useful application state, leave connections open, and sustain service. However, the service will still be paused while new patch files are loaded, and service could be degraded somewhat due to the application of type transformers at patch time and thereafter.

Figure 3.10 illustrates the delay introduced by applying a patch; the delay includes loading the shared object, performing the dynamic linking and running the state transformer (type transformation time was hard to measure, and likely very small, so it is not included). The figure presents measurements for every patch to all of our program versions, and graphs the elapsed time against the size of the patch object files. We can see that patch application time increases linearly with the size of the patch. In terms of service interruption, DSU is minimally intrusive: in all cases, the time to perform an update was under 5 milliseconds.

[Figure 3.10: Patch application times. Update time (ms, 0-5) plotted against binary patch size (KB, 0-300).]

3.6.4 Compilation

The time to compile the first and last versions of our benchmarks is shown in Figure 3.11. The times are divided into analysis time, parsing and compilation time, and remaining tasks. In general, the majority of the overhead is due to the safety analyses (whole-program, constraint-based analyses). The overhead consists of time spent in the updateability analysis, the AVA analysis, and solving the constraints introduced by these analyses using Banshee [61]. Given that Ginseng is only needed in the final stages of development, i.e., when the application is about to be deployed or when a patch needs to be generated and compiled, this seems reasonable.

[Figure 3.11: DSU compilation time breakdown (misc, gcc/CIL, analysis) versus program size (kLOC) for vsftpd 1.1.0 and 2.0.3, zebra 0.93a and 0.95, and sshd 3.6.1 and 4.2.]

3.7 Acknowledgments

The design, implementation and evaluation of Ginseng on single-threaded programs are the result of joint work. Gareth Stoyle, Michael Hicks, Gavin Bierman, and Peter Sewell developed Proteus [106], an imperative calculus that forms the basis of Ginseng's update type safety. Gareth Stoyle and I worked together on implementing Proteus for the full C language, and on the initial design and implementation of the Ginseng compiler. Gareth Stoyle and Michael Hicks designed and implemented the abstraction-violating alias analysis. Michael Hicks wrote the initial support for code extraction. Manuel Oriol wrote the code that generates type transformers and helped test the preliminary versions of Ginseng. Michael Hicks and Manuel Oriol made the source code changes required to make Vsftpd updateable, and wrote the updates for Vsftpd.

3.8 Conclusion

This chapter has presented the implementation of single-threaded Ginseng, a system for updating C programs while they run. We have shown that Ginseng can be used to easily update realistic C programs over long stretches of their lifetimes, with only a modest performance decrease. Our system is arguably the most flexible of its kind, and our novel static analyses make it one of the safest. Our results suggest that dynamic software updating can be practical for upgrading running systems. In Chapter 4 we show how we have extended Ginseng to handle multi-threaded programs, along with our experience updating several multi-threaded servers.

Chapter 4

Multi-threaded Implementation and Evaluation

Chapter 3 presented Ginseng's implementation for single-threaded programs and its evaluation on several server programs. In this chapter we show how Ginseng supports multi-threaded programs.
Our goal is to provide DSU support that is as flexible and safe as Ginseng's single-threaded approach.

4.1 Introduction

While some prior work on DSU has considered multi-threaded programs, no prior system considers the question of safety in any depth: either no automatic support is provided, leaving the problem entirely to the programmer, or the automatic support is insufficient to establish safety (see Section 6.1.3). The main technical challenge we address is to ensure updates can be applied in a timely fashion, while providing certain safety guarantees.

The key concept that helps us accomplish our goal of balancing safety and availability is that of induced update points. Similar to our approach for updating single-threaded programs (Section 3.3.5), we allow programmers to designate update points via DSU_update(). We call these semantic update points, because they are points chosen by the programmer so that writing an update is straightforward, e.g., points where the global state is consistent. The update, however, can take place in between semantic update points, at an induced update point. Our system enforces that an update appears to execute at a semantic update point, and that execution is version-consistent. This means that even if a code update takes place in between two semantic update points, the execution trace can be attributed to exactly one program version.

We find that semantic update points serve as a useful mechanism for reasoning about update safety, while induced update points permit far greater flexibility in choosing when an update takes place. In particular, programmers can think of updates as possibly occurring at semantic update points only, while the run-time system can actually apply the update at any time so long as it maintains this illusion. This flexibility is crucial for updating multi-threaded programs, since it allows us to apply updates without imposing many synchronization constraints on threads.

Section 4.2 introduces our idea of semantic update points and explains how we implement them using the contextual effects analysis described in Chapter 5. Section 4.3 provides further details about our implementation in Ginseng and strategies for reaching a safe update point, and in particular demonstrates that our system is able to apply an update quickly without compromising safety. Section 4.4 describes our experience using our implementation to update three multi-threaded servers. Section 4.5 measures the performance of our approach, and shows that the overhead introduced by update support is detectable using micro-benchmarks but negligible in more realistic scenarios.

4.2 Induced Update Points

The manual enumeration of a few update points works well for single-threaded programs. However, in a multi-threaded program, an update can only be applied when all threads have reached a safe update point. Since this situation is unlikely to happen naturally, we could imagine interpreting each occurrence of DSU_update() as part of a barrier: when a thread reaches a safe update point, it blocks until all other threads have done likewise, and the last thread to reach the barrier applies the update and releases the blocked threads.
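A minimal sketch of this barrier reading of DSU_update(), using POSIX barriers and hypothetical runtime hooks (update_requested, apply_update), might look as follows; it elides the memory-ordering details a real implementation would need.

#include <pthread.h>

extern volatile int update_requested;  /* set when a patch arrives */
extern pthread_barrier_t dsu_barrier;  /* initialized for all threads */
void apply_update(void);               /* hypothetical runtime entry */

void DSU_update(void) {
  if (!update_requested)
    return;                            /* no pending patch: a no-op */
  /* All threads block here; exactly one gets the "serial" return
   * value and installs the patch. */
  if (pthread_barrier_wait(&dsu_barrier) ==
      PTHREAD_BARRIER_SERIAL_THREAD) {
    apply_update();
    update_requested = 0;
  }
  /* Second barrier holds the others until the patch is installed. */
  pthread_barrier_wait(&dsu_barrier);
}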
Unfortunately, because all threads must reach safe points, this approach may fail to apply an update in a timely fashion. With only one or two update points per thread, each thread may take a while to reach its safe point, and with many threads and few processors, it will take even longer for all threads to do so. More problematic are threads that are blocked, on I/O or on a condition variable, say, since they may take a while to get unblocked. In the worst case, a thread could be delayed indefinitely by the update protocol itself. For example, a thread t might reach the barrier while holding a lock that another thread t', yet to reach the barrier, is blocked on. To avoid deadlock we could imagine causing threads to sleep, rather than block, at update points. But then there is no guarantee the protocol will converge (essentially resulting in a kind of livelock). In all these cases, the update protocol degrades the normal application's performance as its threads are blocked or delayed.

Note that several systems that support dynamic updates to multi-threaded programs employ the activeness check we explained in Section 3.3.4: an update cannot take place if it affects code that is actively executing, i.e., code referenced by the stack of a running thread [23, 24, 1]. The problem is that programmer-specified update points are usually in active code, e.g., a long-running loop.

Our approach combines programmer annotations with static analysis and a runtime protocol that permits updates at programmer-specified update points, but increases the chances that a safe point can be reached in the presence of multiple threads. Given a program in which the programmer has designated safe update points using DSU_update() calls, we insert additional update points in between, which we call induced update points (these can be viewed as calls to a function DSU_induced_update()). (At the moment, induced update points are inserted manually, but with some more engineering the compiler could insert them automatically.) The additional update points provide more opportunities for threads to reach safe points, and thus the update should be able to take effect more quickly.

The key feature of induced update points, as enforced by the run-time system and a compile-time static analysis, is that if an update takes place while a thread is stopped at an induced update point, the program's execution will still appear as if the update took place at a semantic update point instead. More precisely, it will appear as if the update took place at the previously reached semantic update point, or at one of the semantic update points that could occur subsequently. The key benefit of induced update points is that the programmer is able to write the update code as if it will be applied at semantic update points, even though it could happen at potentially many more program points. This allows us to better balance safety with availability (in Section 5.3.6 we present experimental results that show how induced update points increase availability).

We implement induced update points by ensuring they preserve version consistency between semantic update points. That is, we can view each semantic update point as beginning (or ending) a transaction such that the execution of the transaction is version-consistent: even if an update takes place while the transaction executes, the execution can nevertheless be attributed to either the old or the new version. In Chapter 5 we explain transactions and version consistency in detail.

Given these high-level ideas, we explore compiler and run-time system support for implementing induced update points. We consider two possible approaches, which we dub the "barrier approach" and the "relaxed approach".
In the former, we still require all threads to be at update points (semantic or induced) when the update takes effect, which we can force by performing barrier synchronization, for example. The relaxed approach is similar to the barrier approach, but we no longer require threads to actually be stopped at an update point when an update takes effect. Instead, a thread can "check in" its effects (a restriction on the form of the update at a particular program point) and then proceed; thus, DSU_update() and DSU_induced_update() no longer block. An available update may proceed as long as it does not conflict with the combined effects of all the checked-in threads. We now proceed to describing the barrier and relaxed approaches in detail, along with the static analysis and runtime support required to implement them.

(a) Original program and contextual effects:

1  DSU_update();   // semantic up. point
2                  // α = {}        ω = {f,g,h,i,k}
3  f();            // {f}           {g,h,i,k}
4  g = 42;         // {f,g}         {h,i,k}
5  h();            // {f,g,h}       {i,k}
6
7  i();            // {f,g,h,i}     {k}
8  k();            // {f,g,h,i,k}   {}
9
10 DSU_update();   // semantic up. point

(b) Barrier approach:

1  DSU_update();   // semantic up. point
2  DSU_induced_update({}, {f,g,h,i,k});
3  f();
4  g = 42;
5  h();
6  DSU_induced_update({f,g,h}, {i,k});
7  i();
8  k();
9  DSU_induced_update({f,g,h,i,k}, {});
10 DSU_update();   // semantic up. point

(c) Relaxed approach:

1  DSU_update();   // semantic up. point
2  DSU_checkin({f,g,h}, {f,g,h,i,k});
3  f();
4  g = 42;
5  h();
6  DSU_checkin({f,g,h,i,k}, {i,k});
7  i();
8  k();
9  DSU_checkin({f,g,h,i,k}, {});
10 DSU_update();   // semantic up. point

Figure 4.1: Contextual effects and their use for implementing induced update points.

4.2.1 Barrier Approach

In the barrier approach, after an update has been requested, a thread blocks at an update point (induced or semantic) if it does not conflict with the update. When all threads are blocked, we can perform the update. The question is how to determine, once a thread reaches an induced (or semantic) update point, whether the update is safe; our definition of safe is that the update looks as if it was applied at a semantic update point for each thread. We make this determination using information from a static analysis that takes into account the programmer-designated semantic update points.

The static analysis first performs a standard effect inference on the program. In a traditional effect system, the effect ε of some program expression e characterizes an aspect of e's non-functional behavior, for example the names of the locks e acquires, or the abstract names of the memory locations e dereferences. For enforcing version consistency, an effect consists of the names of functions that are called and the names of global variables that are read or written. For example, the effect of the block { f(); g = 1; h(); }, where g is a global variable, would contain {f,g,h} because functions f and h are called, and g is written to. Functions f and h may have additional effects, in which case those effects would be included in the effect of the entire block.

Next, we compute a generalization of traditional effects which we call contextual effects (explained in more detail in Section 5.2). The contextual effect of an expression e consists of a three-tuple [α; ε; ω], where ε is the effect of e, as above; α is the prior effect, which characterizes the computation since the last semantic update point up to (but not including) e; and ω is the future effect, which characterizes the computation following e, up until the next semantic update point.
Thus, the contextual effect of the statement g = 1; within the block { f(); g = 1; h(); } would be [{f}; {g}; {h}]. Here, α = {f} because f is called prior to the write to g, and ω = {h} because h is called following the write to g.

In Figure 4.1 (a) we present a sample program with two semantic update points and several uses of functions and global variables in between. Contextual effects for the sample code are listed in comments on the right-hand side of Figure 4.1 (a). If the listed code appears in a function foo, then foo will be included in the prior and future effects for the scope of the function, but for clarity we omit the name of the enclosing function from our presentation.

We can use contextual effects to enforce version consistency as follows. First, the compiler computes the contextual effect of each induced update point, and passes the resulting prior and future effects to DSU_induced_update at run-time (i.e., the DSU_induced_update() call is changed to DSU_induced_update(α, ω, D), where D is the capability required for enforcing type safety, as described in Section 3.3.1). This is illustrated in Figure 4.1 (b). Second, when an update that changes definitions u becomes available, the system barrier-synchronizes all threads on safe update points. An update point is safe in one of two situations:

1. For all threads ti, u ∩ ωi = ∅, where ωi is the future effect of thread ti. In this case, the update will appear as if it took place at the next semantic update point.

2. For all threads ti, u ∩ αi = ∅, where αi is the prior effect of thread ti. In this case, the update has not changed any definitions the thread has already accessed, and thus, even if it changes definitions that the thread will access subsequently, the entire execution will appear as if the update took place at the prior semantic update point.

In database parlance, the first condition results in a roll-forward semantics, since the execution is as if due to the old version, while the second condition results in a rollback semantics, since the execution is as if due to the new version. We will heretofore refer to the prior and future effects together as VC effects, and to safety conditions 1 and 2 above as the VC check.
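Operationally, the VC check amounts to two set-intersection tests. The sketch below is a minimal illustration, with effects represented as bitsets over definition names (our own simplification; Ginseng computes the sets statically), mirroring the per-thread conflicts() test of the protocol shown later in Figure 4.4, with the capability set D elided.

#include <stdbool.h>
#include <stdint.h>

/* Effects as bitsets: bit d is set if definition d is in the set. */
typedef uint64_t effect_t;

/* A thread's restriction is violated only if the update touches both
 * its prior effect and its future effect. */
bool vc_check(effect_t u, const effect_t *alpha,
              const effect_t *omega, int nthreads) {
  for (int i = 0; i < nthreads; i++) {
    bool roll_forward = (u & omega[i]) == 0;  /* condition 1 */
    bool rollback     = (u & alpha[i]) == 0;  /* condition 2 */
    if (!roll_forward && !rollback)
      return false;  /* update would be version-inconsistent */
  }
  return true;
}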
To illustrate how these conditions enforce version consistency, in Figure 4.2 we show two possible updates to the program in Figure 4.1, along with a program trace that shows the version at which each function is executed or each global variable is accessed; this helps us determine whether the execution is version consistent.

(a) Roll forward update:            (b) Rollback update:

Program             Trace           Program             Trace
DSU_update();                       DSU_update();
f();                f, {1}          f();                f, {1,2}
g = 42;             g, {1}          g = 42;             g, {1,2}
h();                h, {1}          h();                h, {1,2}
  <-- update {f,g}                    <-- update {i,k}
i();                i, {1,2}        i();                i, {2}
k();                k, {1,2}        k();                k, {2}
DSU_update();                       DSU_update();

Figure 4.2: Examples of version consistent updates.

Figure 4.2 (a) shows the execution trace if an update to function f and variable g is applied after the call to h. We see that f, g and h appear in the trace with version set {1}, because they are accessed prior to the update. Functions i and k appear in the trace with version set {1,2} because they are not changed by the update, so their definition is the same in both versions. We can now illustrate the VC check. Figure 4.1 (a) shows the contextual effects at each program point, and we can see that at line 6 we have ω = {i,k}. So the VC check u ∩ ω = ∅ is satisfied, because u = {f,g} and ω = {i,k}. Therefore, the update is version consistent, because the execution of the block between the two semantic update points can be attributed to a single version: version 1. We call this a roll forward update, because the update appears to have been applied at the end of the block (the second semantic update point).

Similarly, Figure 4.2 (b) shows the execution trace if an update to functions i and k is applied after the call to h. We see that the trace is consistent again, since the execution can be attributed to version 2. At the point where the update is applied, the VC check u ∩ α = ∅ is satisfied, because u = {i,k} and α = {f,g,h}. Therefore, the update is version consistent, because the execution of the block in between semantic update points can be attributed to a single version: version 2. We call this a rollback update, because the update appears to have been applied at the beginning of the block (the first semantic update point).

In Figure 4.3 we present an example where version consistency is violated. At the point where the update is applied, we have α = {f,g,h}, ω = {i,k} and u = {h,i}.

Program             Trace
DSU_update();
f();                f, {1}
g = 42;             g, {1}
h();                h, {1}
  <-- update {h,i}
i();                i, {2}
k();                k, {1,2}
DSU_update();

Figure 4.3: Example of a version inconsistent update.

Here the VC check fails (we have u ∩ α ≠ ∅ and u ∩ ω ≠ ∅), and the update is deemed unsafe. Indeed, we can see that if we apply the update, the trace cannot be attributed to a single program version.

To understand how this approach performs in practice, we have implemented a thread synchronization protocol that builds on it. In this protocol, semantic update points and induced update points are no-ops if an update has not been requested. If an update has been requested and the current thread's restriction is compatible with the update, the thread blocks until all other threads have reached a semantic update point or an induced update point. After all threads are blocked, we apply the update. The results of running the experiments on our three test applications are discussed in detail in Section 4.5.1.

4.2.2 Relaxed Approach

The main problem with the barrier approach is that blocking all threads until they reach safe update points may create an undue delay, and even deadlock. For example, if thread T blocks at an induced update point while holding lock L, then any other thread that wants to acquire L cannot make progress and reach one of its semantic update points to allow the update to proceed.

To avoid blocking, we can adapt the barrier approach as follows. Instead of calling DSU_induced_update() with its prior and future effect, we call a different function, DSU_checkin(), that registers the union of the contextual effects computed at all program points between this and the next call to DSU_checkin(). In effect, DSU_checkin() stands in for a series of induced update points from its call site until the next DSU_checkin() or semantic update point. After performing the registration, the thread may continue running; even if an update becomes available prior to reaching the next update point, the update may take effect so long as it satisfies the safety condition described above: for all threads ti, u ∩ ωi = ∅ or u ∩ αi = ∅. If a thread reaches a check-in point or a semantic update point while an update is in progress, it pauses at that point until the update is finished, and then continues (at the new version).
To see how check-ins work, consider the example in Figure 4.1 (c), which shows the check-in points explicitly. The prior effect of the first DSU_checkin call on line 2 is α = {}, but notice that we check in α = {f,g,h} instead; this is because we now allow the possibility that an update could occur on line 3, 4, or 5, or prior to the call to DSU_checkin on line 6, and thus the checked-in α must approximate the prior effect at all of these program points. Generally speaking, at each check-in point we register the prior effect α ∪ ε, where α is the prior effect at that check-in point, and ε is the effect of the code between that point and the next check-in point or the next semantic update point, whichever comes first. The second argument to DSU_checkin, the future effect, contains the future effect ω only, because it already over-approximates the future effects of subsequent statements (up to the next semantic update point). For example, the future effect ω = {i,k} at line 6 is a sound approximation of future execution, should an update be applied at line 7 or 8, or prior to the check-in on line 9.

4.2.3 Discussion

An important aspect of both the barrier and relaxed approaches is where, and how often, induced update points (DSU_induced_update and DSU_checkin, respectively) should be placed. Having fewer induced update points reduces runtime overhead, but might impact liveness. Also, in the relaxed approach, because of over-provisioning on future effects, fewer check-in points mean stronger restrictions on what can be updated, which is detrimental to liveness as well. Currently, we manually place induced update points at the beginning and end of each stage in a thread loop body (Section 4.3.1). This strategy results in 3-4 induced update points per thread.

Note that we could also use an automated, adaptive scheme for choosing induced update points. If the runtime system observes that the current induced update point granularity is not sufficient, we can easily construct a "gratuitous" update whose only purpose is to replace the current function with one having more induced update points (Section 4.4).

We now present the protocol that implements the check-in based relaxed approach described above.
The pseudo-code for this protocol is shown in Figure 4.4.

1  // per-thread restrictions
2  typedef struct {
3    set α; set ω; set D;
4  } thread_restr;
5
6  thread_restr restriction[];
7  rwlock restriction_mutex;
8  mutex update_mutex;
9  volatile bool update_requested = 0;
10
11 // elements changed by the update
12 set update_contents;
13
14 bool conflicts(rst[], u)
15 {
16   bool res = false;
17   for (i = 0; i < nthreads; i++) {
18     if (rst[i].D ∩ u ≠ ∅ ||
19         (rst[i].α ∩ u ≠ ∅ &&
20          rst[i].ω ∩ u ≠ ∅))
21     { res = true; break; }
22   }
23   return res;
24 }
25 void DSU_checkin(α, ω, D)
26 {
27   i = thread_self();
28
29   read_lock(restriction_mutex);
30   restriction[i] = {α, ω, D};
31   unlock(restriction_mutex);
32
33   leader_selection();
34 }
35
36 void leader_selection()
37 {
38   if (update_requested) {
39     if (trylock(update_mutex) == OK) {
40       // thread 'i' is leader
41       leader();
42     }
43     else {
44       // thread 'i' is follower
45       follower();
46     }
47   }
48 }
49
50 void leader()
51 {
52   write_lock(restriction_mutex);
53   if (!conflicts(restriction,
54                  update_contents)) {
55     apply_update();
56     update_requested = 0;
57   }
58   // else: couldn't apply the update yet;
59   // give other threads a chance to be leader
60
61   // update protocol finished
62   unlock(restriction_mutex);
63   unlock(update_mutex);
64 }
65
66 void follower()
67 {
68   // do nothing; leader will check for
69   // safety and apply the update
70 }

Figure 4.4: Pseudo-code for safety check and update protocol routines.

We keep a global restriction array protected by a reader-writer lock (lines 2-7). If an update is signaled, the flag update_requested (defined on line 9) is set, and the names of the patch elements are written into update_contents (defined on line 12). This set is used by the safety check in function conflicts (lines 14-24). A thread reaching a check-in point (line 25) checks in its restriction first (lines 29-31), and, if an update has become available, calls leader_selection to initiate or join the update protocol. We use update_mutex to ensure only one thread is leader. If an update has been requested and no other thread has taken the lead (by acquiring update_mutex), the thread is declared a leader; it will possibly perform the update. If update_mutex is taken, the current thread will be a follower. The leader code (lines 50-64) checks whether the update is safe for all threads by comparing update_contents with each thread's restriction (effects and capability). The follower (lines 66-70) simply waits until the leader is done, so as to avoid changing a thread's restriction while the leader is busy performing the update safety check.

Note that this protocol does not guarantee progress, i.e., it is possible that the leader's conflict check fails. To alleviate this, Ginseng performs a static conflict check at patch compilation time. Check-in points in each thread are verified against update_contents and if, for a certain thread, the safety check would always fail at runtime (i.e., all check-in points in that thread conflict with the update), Ginseng notifies the user. In practice, however, we were always able to reach a safe update point within 2 seconds, and typically we could do so in under 8 ms. The results of running the experiments on our three test applications are presented in Section 4.5.1.

A potential shortcoming of our approach is the difficulty of writing state transformers in the presence of I/O. We use contextual effects to implement induced update points and provide the "illusion" that an update is applied at a semantic update point. This illusion, however, is only maintained if the program does not store state outside the process (e.g., via I/O), because contextual effects might fail to capture such state. Since the update could be applied at induced update points in the middle of several I/O operations, the programmer might need to adjust the state transformer based on whether certain I/O operations have completed or not. We have not encountered this situation in practice.

4.3 Implementation

The compiler and runtime system are built on top of single-threaded Ginseng, presented in Chapter 3. We now describe how we extended Ginseng to handle multi-threaded programs, and what the programmer has to do to prepare multi-threaded programs for compilation with Ginseng.
4.3.1 Replacing Active Code

Updating long-running programs raises the issue of having to update active code. Performing an activeness check (Section 3.3.4) to determine which functions can be updated is problematic because it prohibits updates to functions that contain active code. To cope with this, in Section 3.2.4 we introduced a technique called code extraction that permits updates to code on the stack (e.g., long-running loops). Ginseng extracts a programmer-indicated block (a loop body) into a separate function, so to apply an update, we only need to wait for the current iteration to finish, rather than having to wait for the loop to terminate.

However, multi-threading complicates things further: since multi-threaded server programs employ threads each running its own loop, to apply an update we would have to wait until each thread has reached the end of its current loop iteration. This condition is difficult to meet, if not impossible. An example would be producer/consumer threads where one thread is blocked while the other is doing work. To solve this problem, we have to permit updates to a loop body before the thread has completed the iteration. We accomplish this by partitioning long-running loops into "stages" that are designated for code extraction and hence can be updated independently. In Section 4.4.1 we show an example of using code extraction to permit updates to the Icecast server, and in Section 4.4.4 we list the number of instances of code and loop body extraction in our test programs.

4.3.2 Concurrency Issues

Since type wrapping changes the representation of updateable (named) types, we have to be careful not to introduce races. A read-read access in a multi-threaded program is race-free without wrapping, but could be problematic once wrappers are introduced. Since con functions can potentially call the type transformer to update a value to the current version, a read-read access can suddenly become a write-write access. To prevent this from happening, con functions use per-type locks and double-checked locking to make the version check fast while guaranteeing that type transformers are invoked atomically. Another way to avoid this problem is to convert data eagerly, at update time, an approach we might consider in future work.
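A minimal sketch of how such a con function might look is shown below; the type and variable names are ours, and a production version would also need the memory fences that classic double-checked locking requires, which we elide here.

#include <pthread.h>

struct T_wrapped {
  volatile unsigned version;  /* representation version tag */
  /* ... wrapped data ... */
};

extern unsigned latest_T_version;       /* bumped when a patch changes T */
extern pthread_mutex_t T_lock;          /* per-type lock for T */
void transform_T(struct T_wrapped *v);  /* type transformer for T */

/* Fast path: an unsynchronized read of the version tag, so the common
 * case (value already current) takes no lock. Slow path: re-check
 * under the per-type lock, so the transformer runs exactly once and a
 * potential write-write race becomes a serialized update. */
void con_T(struct T_wrapped *v) {
  if (v->version == latest_T_version)
    return;
  pthread_mutex_lock(&T_lock);
  if (v->version != latest_T_version)  /* double check under the lock */
    transform_T(v);                    /* sets v->version when done */
  pthread_mutex_unlock(&T_lock);
}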
4.4 Experience

We used Ginseng to dynamically update three open-source multi-threaded programs: the Icecast streaming media server (http://www.icecast.org), Memcached, a high-performance, distributed memory object caching system (http://www.danga.com/memcached/), and the Space Tyrant multiplayer gaming server (http://spacetyrant.com/). We chose these programs because they are long-running, maintain soft state that could be usefully preserved across updates, and exhibit a variety of threading models. In the remainder of this section, we describe the evolution of these programs during the period we considered, as well as the changes we had to make to prepare the programs for compilation with Ginseng.

Table 4.1 shows release and update information for each program. Columns 2-4 show the version number, release date and program size for each release. Column 5 contains the nature of each individual release (as described at http://freshmeat.net/ in the case of Icecast and Memcached; for Space Tyrant we provide our own characterization). Column 6 shows the number of type transformers for that specific update, while column 7 presents the size of the state transformer (in lines of code); "-" means no type or state transformers were needed for a particular release.

Program    Release    Date    Size     Description               Type xform   State xform
                               (LOC)                              (count)      (LOC)
Icecast    2.2.0      12/04   25,349
           2.3.0rc1   08/05   28,593   Major feature enhanc.     23           5
           2.3.0rc2   09/05   28,788   Other                      4           1
           2.3.0rc3   09/05   28,796   Other                      -           -
           2.3.1      11/05   29,079   Other                      7           1
Memcached  1.2.2      05/07    5,743
           1.2.3      07/07    5,732   Other                      1           -
           1.2.4      02/08    6,144   Major bugfixes             3           2
           1.2.5      03/08    6,345   Major bugfixes             2           -
Space      0.307      10/06   18,738
Tyrant     0.316      10/06   19,077   Minor feature enhanc.     11           -
           0.319      10/06   19,399   Minor feature enhanc.      2           -
           0.331      04/07   19,526   Minor bugfixes             -           -
           0.335      05/07   19,753   Minor feature enhanc.      -           6
           0.347      08/07   19,979   Minor bugfixes/enhanc.     1           2
           0.351      10/07   20,223   Minor feature enhanc.      2           1

Table 4.1: Application releases.

Table 4.2 shows the cumulative number of changes that occurred to the software over that span. "Types" refers to structs, unions and typedefs. These statistics reinforce the findings of Chapters 2 and 3: a dynamic software updating system must support changes, additions, and deletions for functions, types and global variables if it is to handle realistic software evolution.

              Functions           Types   Global variables
Program       Proto    Body
Icecast       10       292        25      1
Memcached     14       118         6      6
SpaceTyrant    0       107        11      5

Table 4.2: Changes to applications.

For each program, we downloaded several releases, converted the programs into updateable applications, wrote dynamic patches, applied all patches in release order, and performed testing/benchmarks. In Section 3.5 we laid out guidelines for preparing C programs for conversion into updateable C programs: we inform the Ginseng compiler about long-running loops and, if necessary, override Ginseng's safety analysis via user-inserted annotations. In this chapter we will focus on programmer effort and annotations that are specific to multi-threaded programs: identifying semantic update points and picking check-in points. We discuss how we chose semantic update points and check-in points for each program in Sections 4.4.1, 4.4.2, and 4.4.3.

Our experience with the multi-threaded servers we have considered is that, just like the single-threaded servers presented in Chapter 3, they perform a few high-level operations whose boundaries are easily identified as semantic update points. Examples of such operations are processing one event, accepting and dispatching a client connection, etc. In Section 4.4.1 we present a concrete illustration of how we picked semantic update points and check-in points in the Icecast server. We have also found that semantic update points tend to remain stable within a program's overall structure, even as the overall implementation changes. Nevertheless, the programmer does not have to pick the right induced update points at the outset. If, for example, it turns out that a function needs more check-in points, or its code extraction boundaries are too coarse, it is easy to construct a "gratuitous" dynamic update that replaces the function with a version that has more check-in points and finer-grained code extraction boundaries.
4.4.1 Icecast

Icecast is a streaming media server, a popular solution for building Internet radio stations. Updating Icecast on the fly enables media content providers to keep their streams alive 24/7, yet be protected with the latest security fixes. We considered five consecutive Icecast releases spanning 49 weeks (Table 4.1).

Icecast uses a fixed number of threads, each performing separate duties: accepting connections, handling incoming connections, reading from a media source, keeping statistics, etc. The principal threads are presented in Figure 4.5. Lines marked with "*" are annotations we had to insert; these are used by the Ginseng compiler, and denote long-running loops, semantic update points, check-in points, etc.

(a) Connection accept thread:

1    // accepts connections
2    while (1) {
3  *   DSU_update();
4  *   DSU_checkin();
5
6      if (!global_running())
7        break;
8
9  *   DSU_checkin();
10 *   DSU_extract {
11       con = accept_connection();
12       if (con) {
13         add_connection(con);
14       }
15 *   }
16 *   DSU_checkin();
17   }

(b) Connection handler thread:

18   // parses and handles connections
19   while (1) {
20 *   DSU_update();
21 *   DSU_checkin();
22
23     if (!global_running())
24       break;
25
26 *   DSU_checkin();
27 *   DSU_extract {
28       con = get_connection();
29       header = read_header(con->sock);
30       parser = httpp_create_parser(header);
31       con_type = httpp_get_var(parser,
32                     "protocol");
33       switch (con_type) {
34       case STATS:
35         handle_stats_request(con, parser);
36         break;
37       case GET:
38         // handle_get_request(con, parser)
39         fclient = client_create(con, parser);
40         fclient->next = pending_list;
41         pending_list = fclient;
42         break;
43       }
44 *   }
45 *   DSU_checkin();
46   }

(c) File serving thread:

47   // sends data to clients
48   while (run_fserv) {
49 *   DSU_update();
50 *   DSU_checkin();
51 *   DSU_extract {
52       fclient = active_list;
53       while (fclient != NULL) {
54         client_send_bytes(fclient, ...);
55         fclient = fclient->next;
56       }
57 *   }
58 *   DSU_checkin();
59 *   DSU_extract {
60       while (run_fserv) {
61         if (pending_list) {
62           to_move = pending_list;
63           pending_list = pending_list->next;
64           to_move->next = active_list;
65           active_list = to_move;
66         }
67         if (poll(client_file_desc, ...) > 0)
68           break;
69       }
70 *   }
71 *   DSU_checkin();
72   }

Figure 4.5: Icecast structure. DSU annotations are marked with "*".

For each thread, we placed a semantic update point at the beginning of that thread's long-running loop (lines 3, 20, and 49, respectively). In the loop bodies, we use check-ins (DSU_checkin) to snapshot the thread's current effect, and code extraction (DSU_extract) to permit updates to code on the stack. We placed check-in points and extraction boundaries around each separate stage in the loop bodies.

The connection accept thread's operation is split into two stages: checking for termination requests (lines 6-7) and waiting for incoming connections (lines 11-14). When a new connection is opened, the thread packs the information into a con structure and passes it to the pool of connection handler threads. A connection handler thread takes an accepted connection, uses an HTTP parser to parse the client's request, and dispatches the request according to its type, con_type. If, for instance, the client requests some statistics, handle_stats_request() will fire a new thread that sends statistics information to the client (line 35). If the client requests a file, handle_get_request() (inlined here for clarity, lines 38-41) will create a new fclient structure and add it to the file serving thread's working queue. The file serving thread's operation is also split into two stages: (1) tending to active clients (lines 52-57), i.e., sending listeners chunks of file contents via HTTP connections, and (2) moving new clients, which were generated by the connection handler and added to pending_list (lines 63-64), into the active client list, active_list (lines 65-66).

4.4.2 Memcached

Memcached is a high-performance, distributed memory object caching system that is used on popular sites such as Slashdot or Wikipedia to store pre-rendered HTML pages without having to access the database and render pages individually for each client (since the database is a bottleneck in these situations). Updating Memcached on the fly is essential to maintaining high web server throughput; taking Memcached down to install the next version flushes the in-memory cache and causes degraded operation while the cache fills up again in the new version. We considered four consecutive Memcached releases spanning 10 months (Table 4.1). Multi-threading was introduced in version 1.2.2, so we did not consider releases prior to that.

Memcached uses a homogeneous threading model, where all application threads (a user-configurable number) perform the same fixed task. Memcached uses the libevent library [95] to process client requests; each thread is associated with a
separate libevent instance called a base; events belonging to the same base will be processed by the same thread. To ensure that the execution associated with processing an event is version-consistent, we placed one semantic update point just prior to the start of processing an event, and another semantic update point after the event has finished processing.
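In libevent terms, that placement amounts to bracketing the per-event callback. The sketch below is a hypothetical handler using libevent's 1.x callback signature; the handler and helper names are ours, not Memcached's actual code.

#include <event.h>

void DSU_update(void);         /* Ginseng semantic update point */
void process_request(int fd);  /* hypothetical per-event work */

/* Each thread runs its own libevent base, so every event handler runs
 * entirely on one thread; bracketing the handler makes the work for
 * one event version-consistent. */
void on_client_event(int fd, short which, void *arg) {
  (void)which; (void)arg;
  DSU_update();          /* semantic update point: before the event */
  process_request(fd);
  DSU_update();          /* semantic update point: after the event */
}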
Changes to applications. Constructing updateable versions of each application required inserting some annotations and making a few changes to its source code, mainly indications to the Ginseng compiler. These changes have to be applied to each program version, but once we figured out the changes we had to make to the first version, we were able to automate the process and use patch to change the following versions.

The changes and annotations are presented in detail in Table 4.3. The second column shows the number of long-running loops designated for extraction. Identifying long-running loops is easy, as each long-running thread essentially executes a loop. We identified 11 such loops in Icecast and 7 in Space Tyrant; loop body extraction was not necessary for Memcached because iteration is done externally, in libevent. The reason why the number of extracted loops is higher than the number of runtime threads is that in Icecast, some threads are short-lived, and only used under certain configurations (we omitted these short-running threads from Figure 4.5 for brevity). In Space Tyrant, which has five distinct kinds of threads, we used loop extraction to extract two nested loops, so the total number of extracted loops was 7.

The third column shows the number of code blocks designated for extraction; the procedure we used to identify such blocks was to find long-running stages within the bodies of thread loops, and mark each stage for extraction. The fourth column shows the number of semantic update points. We placed a semantic update point at the beginning of each thread loop body for Icecast and Space Tyrant; in the case of Memcached, the two semantic update points delimit the processing of one event. The fifth column shows the number of check-in points. The rest of the changes (the number of lines of code changed is presented in column 6) consisted of:

- Directives to the compiler to override the conservativity of the safety analysis. Ginseng's analysis does not model existentially quantified types, so even though they are safely used, Ginseng reports a possible safety violation. We had to add two such directives in Icecast, three in Memcached and two in Space Tyrant.

- Changing four low-level, type-unsafe field access macros in Memcached into function calls, so Ginseng's compiler and safety analysis can reason about them (an example of this kind of rewrite is sketched below).
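For a flavor of the macro-to-function change, consider the following sketch; the item layout and names are hypothetical illustrations of ours, not Memcached's actual definitions.

    /* Hypothetical item header; real Memcached items are more involved. */
    typedef struct item {
        int nbytes;              /* size of the data that follows the header */
    } item;

    /* Before: a cast-heavy macro that the safety analysis cannot model. */
    #define ITEM_data(it) ((char *)(it) + sizeof(item))

    /* After: the same access as a function, which Ginseng's compiler and
     * safety analysis can reason about (and indirect through on update). */
    static char *item_data(item *it)
    {
        return (char *)it + sizeof(item);
    }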
                ------------- Annotations -------------    Source code
    Program     Extract   Extract   Semantic    Check-in   changes
                loop      code      upd. pts    points     (LOC)
    Icecast        11        18        11          34         42
    Memcached       0         0         2           2         23
    SpaceT          7        16         5          32         19

Table 4.3: Annotations and changes to source code.

Adjusting auto-generated patches. Manual intervention was also required to inspect (and complete, where necessary) the auto-generated type transformers, and to write state transformers when needed; across all patches, we had to write 80 lines of code for Icecast, 12 for Memcached and 81 for Space Tyrant.

4.5 Experiments

We performed a suite of experiments to evaluate how our approach balances safety and liveness, and to measure the overhead of using our approach to build and run updateable software. We considered the following aspects:

1. Update availability. To see how the protocols described in Sections 4.2.1 and 4.2.2 perform in practice, we measured, for each of the updates we considered (13 in total), the time from the moment an update was signaled to the time it could safely be applied. Our main findings are that the relaxed approach provides higher availability than the barrier approach, and that performing only the activeness check improves availability even more, though at the expense of safety.

2. Application performance. We measured the overhead that update support imposes on application performance, by running stress tests on unmodified and Ginseng-compiled versions of the same program. We also measured the additional overhead that Ginseng imposes on multi-threaded programs, as opposed to single-threaded Ginseng. We found that DSU overhead is modest for the applications we considered.

3. Memory footprint. We also measured the update support overhead in terms of memory footprint, again for unmodified applications, and updatable applications compiled with single- and multi-threaded versions of Ginseng. We found the increase to be negligible for Icecast and Memcached (less than 1% and 4%, respectively), but up to 46% for Space Tyrant.

4. Build time. We also measured the running time of Ginseng when building our test programs, to measure the overhead of compilation and our analyses. In all cases, the time to compile and link the programs was less than 30 seconds.

We conducted our experiments using a client-server setup, where the updateable applications ran on a quad-core Xeon 2.66GHz server with 4GB of RAM running Red Hat Enterprise Linux AS release 4, kernel version 2.6.9. The clients ran on a two-way SMP Xeon 2.8GHz machine with 3.6GB of RAM running Red Hat Enterprise Linux WS release 3, kernel version 2.4.21. The client and server systems were connected by a 100Mbps network. All C code (generated by Ginseng or otherwise) was compiled with gcc 3.4.6 at optimization level -O2.

4.5.1 Update Availability

To gain additional insights into the trade-offs of different thread synchronization protocols, in addition to P-Barrier and P-Relaxed, the two protocols presented in Section 4.2.1 and Section 4.2.2, we have implemented two other protocols, P-OpWait and P-PostRelaxed. We now proceed to describe these protocols in more detail.

P-OpWait. This is a simple, optimistic protocol. Whenever a thread reaches a check-in point, if an update has been requested, that thread yields by calling sched_yield (alternatively, we could have the thread sleep by calling nanosleep). This gives the other threads a chance to reach their check-in points as well. If the last thread discovers that all threads are at a check-in point, we do the safety check and apply the update if it is safe to do so. As we will see shortly, this protocol performs poorly in practice; it usually times out without being able to reach a safe update point.
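The following sketch conveys the gist of the P-OpWait check-in. All names other than sched_yield are hypothetical stand-ins for the runtime's internal state, and the real implementation also enforces the 30-second time-out described later in this section.

    #include <pthread.h>
    #include <sched.h>

    extern volatile int update_requested;   /* set when an update is signaled */
    extern int num_threads;                 /* application thread count       */
    extern void try_apply_update(void);     /* safety check + update, if safe */

    static pthread_mutex_t ck_lock = PTHREAD_MUTEX_INITIALIZER;
    static volatile int at_checkin = 0;     /* threads parked at a check-in   */

    void DSU_checkin(void)
    {
        if (!update_requested)
            return;                         /* fast path: no update pending */

        pthread_mutex_lock(&ck_lock);
        at_checkin++;
        if (at_checkin == num_threads) {    /* last thread in: try the update */
            try_apply_update();
            update_requested = 0;
        }
        pthread_mutex_unlock(&ck_lock);

        /* Otherwise yield, giving the remaining threads a chance to reach
         * their check-in points too (nanosleep would also work here). */
        while (update_requested && at_checkin < num_threads)
            sched_yield();

        pthread_mutex_lock(&ck_lock);
        at_checkin--;
        pthread_mutex_unlock(&ck_lock);
    }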
P-PostRelaxed. In this protocol, check-ins are no-ops if an update has not been requested. Once an update is requested, threads check in their restrictions and continue running. Once all threads have checked in their restrictions at least once, the update protocol is the leader/follower protocol used in P-Relaxed. The advantage of this protocol is low runtime overhead: we only perform check-ins when an update has actually been requested.

In addition to the four protocols presented so far, we also implemented type-safety-only versions of P-Relaxed and P-Barrier; we denote these P-Relaxed-D and P-Barrier-D. The reason we considered these D-only protocols was to compare our approach against one based on the activeness check.

The time to reach a safe update point (in ms) is presented for each protocol in Table 4.4. The first column shows the program, while the second column shows the update sequence number; e.g., entry "0" for Icecast corresponds to the update Icecast 2.2.0 → Icecast 2.3.0rc1, entry "0" for Memcached corresponds to the update Memcached 1.2.2 → Memcached 1.2.3, etc.

    Protocol:          P-Relaxed     P-Relaxed-D   P-OpWait      P-PostRelaxed    P-Barrier       P-Barrier-D
    #Clients:          4      16     4      16     4      16     4        16      4       16      4       16

    Icecast      0  1,750  1,068   103    102     X       X   17,037  10,243   1,062     945    506     945
                 1    2.1    2.1   1.2    1.3     X       X      805   1,233     976     942    971     937
                 2    0.7    0.7   0.6    0.6     X       X      603   1,000     478     937    975     940
                 3    2.3    2.3   1.4    1.4     X       X      691   1,303     484     941    963     935
    Memc-4       0    0.6    0.7   0.5    0.6   150      21      1.1     1.2     0.9       1    0.8     0.9
                 1    1.2    1.2   0.7    0.8   373      29      1.5     1.5     1.7     1.8      1       1
                 2    0.8    0.8   0.6    0.6   275      28      1.3     1.4       1       1    0.8     0.9
    Memc-16      0      1    1.1   0.6    0.6     X   1,163       X      2.3       X     1.7      X     1.2
                 1    2.8    2.8   0.9      1     X   1,255       X      3.6       X     4.5      X     1.8
                 2    1.4    1.4   0.8    0.8     X   3,465       X      2.9       X       2      X     1.3
    Space        0    3.4    5.2   1.5    1.6     X       X    3,513   3,492   3,505   3,439  3,503   3,472
    Tyrant       1    4.7    9.4   1.3    1.4     X       X    3,524   3,511   3,506   3,456  3,525   3,469
                 2      1    3.4   0.4    0.4     X       X    3,545   3,477   3,526   3,514  3,523   3,444
                 3    3.9    4.2   2.9      3     X       X    3,621   3,461   3,524   3,445  3,524   3,414
                 4    1.4    1.5   1.6      6     X       X    3,553   3,469   3,508   3,459  3,521   3,450
                 5    4.4    3.7   0.8    0.8     X       X    3,504   3,482   3,504   3,426  3,528   3,423

Table 4.4: Time to reach a safe update point using various protocols (ms).

We ran experiments for each update and measured the time it took the system from the moment the update was signaled to the moment it could safely be applied. We tested two configurations, 4 and 16 concurrent clients. The number of server-side threads varied, depending on the application and number of clients, as we explain below. Icecast has a fixed number of threads (in our configuration this number was 6 in Icecast 2.2.0, and 7 in later versions), regardless of the number of clients. (In the first version, the number of connection handler threads is configurable, but in later versions there is only one connection handling thread, so for update #0 we fixed the number of handlers to one, to keep things consistent with later versions.) Memcached has a thread pool with a configurable number of handler threads, independent of the number of clients. We present results for two configurations, one with four server threads (Memc-4), and one with 16 server threads (Memc-16). Space Tyrant uses two threads per connected client, plus three fixed threads that perform housekeeping. To summarize, the numbers of server-side threads were:
- Icecast: 6 for update #0, and 7 for updates #1-3.
- Memcached: 4 and 16 for Memc-4 and Memc-16, respectively.
- Space Tyrant: 11 (8 client handlers + 3 fixed) for the 4-client configuration, and 35 (32 client handlers + 3 fixed) for the 16-client configuration.

We are interested in measuring update availability (the time between when an update is signaled and when it can safely be applied), using the six protocols, while the server is under load (since thread activity is likely to obstruct updates from taking effect). The methodology for each program was to start the server, connect 4 (or 16) clients that are constantly asking for data, and while the server is performing work, send an update request. We then measured the time from the moment an update was requested to the moment the update could be safely applied, or timed out after 30 seconds. (We chose 30 seconds as a time-out value because we want to provide a reasonable update availability guarantee, and reduce the vulnerability window for security fixes, i.e., the time programs run without being patched.)

We performed each experiment 11 times; we report the median time to reach a safe update point for terminating runs. An "X" entry means that for that specific configuration, none of the 11 runs could reach such a point within 30 seconds. We also measured update loading times (the time to load a dynamic patch after reaching a safe point). Since this is not our focus, we omit the loading times; they are proportional to patch size, and in all cases were less than 3 ms.

Overall, P-Relaxed (columns 3 and 4 of Table 4.4) performs best. As expected, P-Relaxed-D (columns 5 and 6 of Table 4.4) reaches a safe point faster than P-Relaxed because it performs a less strict safety check; however, using P-Relaxed-D is potentially unsafe since it can lead to version consistency violations. P-OpWait (columns 7 and 8 of Table 4.4) performs poorly because it requires all threads to reach check-in points simultaneously, a condition difficult to meet when the server is under load.

Protocols P-PostRelaxed, P-Barrier and P-Barrier-D (columns 9-14 of Table 4.4) require threads to check in only after an update has been requested, so it takes some time until all the threads have checked in their effects. In the Memc-16 scenario we have 16 server threads, and with activity from only 4 clients, not all of the 16 threads get scheduled within the 30-second time-out window, hence the "X" entries. Note that P-Relaxed and P-Relaxed-D do not have this problem because the runtime system is aware of all thread effects at all times: if an update is compatible with each thread, then the first thread to reach a check-in point becomes leader and applies the update. A possible fix for P-PostRelaxed and P-Barrier would be to force a check-in 1) when a thread is being preempted, or 2) prior to a thread entering a blocking system call, so the runtime system can inspect each thread's effects without having to first wait for all the threads to run and reach a check-in point after an update has been signaled. Solution 1 is difficult to implement without modifying the scheduler. Solution 2 can be implemented by inserting check-ins only prior to blocking I/O. However, our performance experiments in Section 4.5.2 show that, in practice, the cost of always doing check-ins is modest, so P-Relaxed provides a good balance between overhead and availability.

The only situation where P-Relaxed takes longer to reach a safe point than a comparatively safe protocol (P-PostRelaxed or P-Barrier) is update #0 to Icecast, presented in the first Icecast row. P-Relaxed takes 1.75 and 1.06 seconds to reach a safe update point, compared to 1.06 and 0.94 seconds for P-Barrier. The reason why P-Relaxed takes longer to reach safety for this particular update is the numerous induced update points conflicting with this (particularly large) update. In the relaxed approach, for a term e delimited by check-ins, whose contextual effects are [α; ε; ω], we check in (α ∪ ε, ε ∪ ω). This effectively prevents anything in ω from being updated while e is being evaluated. For the Icecast update #0, the relaxed approach has to "skip" many induced update points because the safety check fails due to conflicts between u, the update contents, and the ωi associated with each thread i.
In contrast, using the barrier approach, the safety check is more precise, which permits us to reach a non-conflicting update point faster. However, in all other cases P-Relaxed reaches a safe point faster (sometimes orders of magnitude faster) than P-PostRelaxed or P-Barrier.

We can see that performing only the type safety check increases availability, as the time to reach a safe update point is lower for P-Relaxed-D compared to P-Relaxed, and for P-Barrier-D compared to P-Barrier. However, in practice we want to enforce version consistency, and the point of this experiment is only to compare our approach to other multi-threaded DSU systems that employ the activeness check. Performing only the activeness/type safety check essentially means we "pretend" that an induced update point is a semantic update point. This is clearly not what the programmer had intended, since global invariants might not be satisfied at an induced update point. As a consequence, applying the update based on the activeness check only can lead to errors at update time or later, in the execution of the new program version.

4.5.2 Application Performance

We evaluated the impact of dynamic update support on application performance using two metrics: 1) application-specific benchmarks, and 2) memory footprint. We report the performance results in Table 4.5 and the memory results in Table 4.6.

                                        Completion time (sec)
    #Clients:                     4                                          16
    Config.          Stock   Ginseng-ST       Ginseng-MT      Stock   Ginseng-ST       Ginseng-MT
    Remote:
      Icecast        11.09   11.08 (-0.09)    11.09 (0.00)    40.50   40.58 (0.20)     40.84 (0.84)
      Memcached      31.64   31.30 (-1.07)    31.58 (-0.19)   86.32   87.08 (0.88)     87.66 (1.55)
      SpaceT         44.54   44.48 (-0.13)    44.49 (-0.11)   44.62   44.51 (-0.25)    44.60 (-0.04)
    Local:
      Icecast         1.64    1.66 (1.22)      1.75 (6.71)     2.65    2.63 (-0.75)     2.70 (1.89)
      Memcached       7.58    7.89 (4.09)      7.92 (4.49)    31.37   32.13 (2.42)     32.55 (3.76)
      SpaceT         34.60   34.62 (0.06)     34.61 (0.03)    44.52   44.65 (0.29)     44.62 (0.22)

Table 4.5: Impact of update support on performance (absolute time, in seconds, and in % relative to the stock server).

For each application, we measured the performance of its most recent version under three configurations. The stock configuration forms our base for benchmarking, and consists of the application compiled normally, without support for updating. The Ginseng-ST configuration is the application compiled with the single-threaded Ginseng compiler, using a single-threaded runtime system. The Ginseng-MT configuration is the application compiled with the multi-threaded Ginseng compiler, using a multi-threaded runtime system. Comparing the Ginseng-ST and Ginseng-MT configurations is useful for considering the additional overhead that multi-threading support imposes on applications compiled with Ginseng, e.g., check-ins or locking in con functions.

For each application, we ran a specific benchmark and measured the completion time and memory footprint in all three configurations: stock, Ginseng-ST, and Ginseng-MT. The memory footprint was measured at the completion of each benchmark.

For Icecast, we measured the time it took the streaming server to serve eight mp3 files to a client, using wget as the client. The files have sizes 1, 2, ..., 8 MB. To eliminate jitter due to disk I/O, we directed wget to send both its output and the downloaded file to /dev/null. For Memcached, we ran a "slap" test that comes bundled with the server.
The test program spawns multiple clients in parallel, each client inserting key/value pairs into Memcached's hash table. We measured the time it took the test program to complete the insertion of 50,000 key/value pairs. For Space Tyrant, we created a scenario file that simulates a client performing 500 random moves across the universe, and spawned concurrent clients running this scenario. We measured the time it took the server to complete serving all the clients.

In Tables 4.5 and 4.6 we report the median completion time and median memory footprint across 11 runs. We ran each benchmark in two setups. The first setup, remote, shows the results of running the clients and server on separate machines, a scenario that models how the updatable servers would be used in practice. The second setup, local, shows the results of running the clients and server on the same machine (we used the quad-core machine mentioned above). The point of measuring overhead in a local configuration is to factor out network latency and bandwidth issues (while reducing the parallelism of the server). Similar to the update protocol experiments in Section 4.5.1, we report figures for 4 and 16 clients, respectively.

                                        Memory footprint (MB)
    #Clients:                     4                                          16
    Config.          Stock   Ginseng-ST       Ginseng-MT       Stock    Ginseng-ST        Ginseng-MT
    Remote:
      Icecast        69.83    70.08 (0.36)     72.32 (3.56)     70.36    70.09 (-0.39)     72.84 (3.52)
      Memcached      99.94   100.21 (0.27)     99.93 (-0.00)   223.74   223.42 (-0.14)    223.91 (0.08)
      SpaceT         62.75    90.15 (43.66)    92.04 (46.68)    63.42    90.18 (42.19)     91.81 (44.76)
    Local:
      Icecast        69.71    70.33 (0.89)     72.23 (3.62)     69.86    70.03 (0.24)      73.18 (4.75)
      Memcached      99.39    99.14 (-0.25)    99.67 (0.28)    222.67   223.23 (0.25)     223.05 (0.17)
      SpaceT         63.73    89.85 (40.99)    91.96 (44.29)    62.97    89.44 (42.03)     91.73 (45.68)

Table 4.6: Impact of update support on memory footprint (absolute footprint, in MB, and in % relative to the stock server).

Benchmark completion times are presented in columns 3-8 of Table 4.5. In the remote setup, for Icecast and Space Tyrant, the completion time is similar to the stock server. Memcached is, however, slower in the 16-thread configuration, with the multi-threaded updatable version 1.6% slower. In the local setup, the impact of update support on completion time is higher than in the remote setting; this is because, as expected, update support (e.g., check-ins, function and type indirection) slows down the application, and the slow-down cannot be masked by network latency. However, even in this scenario the slowdown is small.

4.5.3 Memory Footprint

Memory footprint overhead is presented in columns 3-8 of Table 4.6. As expected, local and remote setups exhibit the same memory footprint. Update support (function and type indirections, and the Ginseng runtime) increases the memory footprint of applications compiled with Ginseng. To quantify this impact, we measured the virtual memory footprint in the stock and updatable configurations using pmap. For Memcached the difference is almost imperceptible. For Icecast, the memory footprint increases by up to 4.8% compared to the stock server. For Space Tyrant, the increase is at most 46.7%; the reason the increase is so large has to do with the interaction between Space Tyrant and Ginseng. The median memory footprint for the stock version is around 63 MB, while the median memory footprint for Ginseng-compiled versions is around 92 MB. Space Tyrant uses a pre-allocated global array that keeps the game map, divided into 100,000 sectors, and the size of this array is 7.6 MB.
The Ginseng compilation scheme allows room for growth in each structure (Section 3.2.2); by default, the room for growth is equal to the initial structure size. Because the large array resides within two nested structures, its size becomes 31.2 MB, hence the growth to 92 MB and the 46% increase. This problem could be solved by keeping updatable data by reference or by allocating sector data on demand. Since in our experiments we used at most 8 concurrent clients, each patrolling at most 500 sectors, the actual memory overhead would be at most 11.7% for the 4-thread case, and 13.3% for the 16-thread case.
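As a rough sketch of why this blow-up occurs, the type-wrapping scheme of Section 3.2.2 pads each updateable struct with slack equal to its current size, and the padding compounds when wrapped structs nest. The types below are illustrative examples of ours, not Ginseng's actual generated code.

    struct sector {                 /* one map sector (hypothetical fields) */
        int owner;
        int contents;
    };

    /* Wrapped version: the original data plus equal-sized room to grow,
     * so a future update can add fields without reallocating. */
    union sector_w {
        struct sector data;
        char room[2 * sizeof(struct sector)];   /* 2x the original size */
    };

    /* A 100,000-sector map thus doubles in size, and doubles again when
     * the enclosing structure is itself wrapped: 7.6 MB of sector data
     * becomes roughly 31.2 MB. */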
In almost all cases, the multi-threaded setup (Ginseng-MT) has a slightly higher overhead on the application than the single-threaded setup (Ginseng-ST): around 3% more for Icecast and 4% more for Space Tyrant. This increase is due to extra work and extra memory requirements, e.g., check-ins (Section 4.2.3), and locks for con functions (Section 4.3.2).

4.5.4 Compilation Time

To provide a sense of Ginseng's compilation overhead, in Table 4.7 we present the time to compile and link each test program in two configurations. The first, normal configuration, without update support, uses gcc (which in turn calls ld); its build times are presented in column 3. The second configuration shows the build time for updateable applications using Ginseng; it consists of compiling C code into updateable C code using Ginseng, followed by gcc, and linking together with the Ginseng runtime system. The build times for this case are presented in column 4; the bulk of the time is spent in the safety analyses (Sections 3.3 and 5.3.5), and we imagine the figures could be improved with some more engineering.

    Program        Size (LOC)   Build time (sec)
                                gcc      Ginseng
    Icecast        25,452       2.3      27
    Memcached       5,752       0.7      4.2
    Space Tyrant   10,480       2.6      20.1

Table 4.7: Time to build (compile and link) the test programs.

4.6 Conclusion

In this chapter, we presented an approach to updating multi-threaded programs while they run, and showed how we have implemented this approach in Ginseng. Updating multi-threaded programs is more difficult than updating single-threaded code because of the tension between update safety and update availability. We resolve this tension using the novel concept of induced update points, which allows us to update multi-threaded programs with the same safety guarantees as in the single-threaded case, while still allowing updates to be applied promptly. We evaluated our approach on three realistic multi-threaded servers. We found that the programmer effort for building updateable versions of these applications was modest, and experiments show that update support does not significantly impact application performance.

Chapter 5

Version Consistency

In Chapter 4 we showed how we can perform timely updates to multi-threaded programs by using version consistency: ensuring that certain code blocks are atomic with respect to updating. In this chapter we provide formalisms for reasoning about, and soundness proofs for, version consistency.

5.1 Introduction

As mentioned in Section 4.1, to update multi-threaded programs, the user must designate semantic update points. Semantic update points serve as a useful mechanism for reasoning about update safety, while induced update points permit flexibility in choosing when an update takes place. In particular, programmers can think of updates as occurring only at semantic update points, while the runtime system can actually apply the update at any time, so long as it maintains this illusion. The crucial property that allows us to make updates appear as having taken place at a semantic update point is version consistency, a property that Ginseng enforces using static analysis information and runtime checks.

To implement version consistency, we developed an extension of type and effect systems called contextual effects, a formalism that permits reasoning about the effects of past and future computation. In Section 5.2 we present our contextual effects calculus and sketch its soundness proof. We then extend contextual effects with support for updates and transactions (version-consistent, lexically scoped code blocks), and prove that under certain conditions updates can be safely performed inside transactions, while preserving version consistency (Section 5.3). Next, we extend our update calculus with check-ins (Section 5.4) that help us model relaxed updates. Finally, in Section 5.5 we add multi-threading support to our relaxed update calculus and prove that relaxed updates to multi-threaded programs preserve version consistency.

5.2 Contextual effects

Type and effect systems provide a framework for reasoning about the possible side effects of a program's executions. Effects traditionally consider assignments or allocations, but can also track other events, such as functions called or operations performed. A standard type and effect system [66, 83] proves judgments ε;Γ ⊢ e : τ, where ε is the effect of the expression e. For many applications, knowing the effect of the context in which e appears is also useful. For example, if e includes a security-sensitive operation, then knowing the effect of execution prior to evaluating e could be used to support history-based access control [4, 100]. Conversely, knowing the effect of execution following e could be used for some forms of static garbage collection, e.g., to free initialization functions once initialization is complete [40].

In this section we introduce our core contextual effects calculus. Our contextual effect system proves judgments of the form Φ;Γ ⊢ e : τ, where Φ is a tuple [α; ε; ω] containing ε, the standard effect of e, and α and ω, the prior effect and future effect, respectively, of e's context. For example, in an application e1 e2, the prior effect of e2 includes the effect of e1, and likewise the future effect of e1 includes the effect of e2. We believe that contextual effects have many other uses, in particular any application in which the past or future computation of the program is relevant at various program points.

5.2.1 Syntax

Figure 5.1 presents our source language, which contains expressions e that consist of values v (integers or functions); variables; let binding; function application; and the conditional if0, which tests its integer-valued guard against 0. Our language also includes updateable references ref^L e, along with dereference and assignment. Here we annotate each syntactic occurrence of ref with a label L, which serves as the abstract name for the locations allocated at that program point. We use labels to define contextual effects. For simplicity we do not model recursive functions directly in our language, but they can be encoded using references.

    Expressions        e ::= v | x | let x = e in e | e e
                           | if0 e then e else e
                           | ref^L e | !e | e := e
    Values             v ::= n | λx.e
    Effects            α, ε, ω ::= ∅ | 1 | {L} | ε ∪ ε
    Contextual Effs.   Φ ::= [α; ε; ω]
    Types              τ ::= int | ref^ε τ | τ →^Φ τ
    Labels             L

Figure 5.1: Contextual effects source language

Our system uses two kinds of effect information.
An effect, written α, ε, or ω, is a possibly-empty set of labels, and may be 1, the set of all labels. A contextual effect, written Φ, is a tuple [α; ε; ω]. In our system, if e′ is a subexpression of e, and e′ has contextual effect [α; ε; ω], then:

- The current effect ε is the effect of evaluating e′ itself.
- The prior effect α is the effect of evaluating e up until we begin evaluating e′.
- The future effect ω is the effect of the remainder of the evaluation of e after e′ is fully evaluated.

Thus ε is the effect of e′ itself, and α ∪ ω is the effect of the context in which e′ appears; therefore α ∪ ε ∪ ω contains all locations accessed during the entire reduction of e.

To make contextual effects easier to work with, we introduce some shorthand. We write Φ^α, Φ^ε, and Φ^ω for the prior, current, and future effect components, respectively, of Φ. We also write Φ∅ for the empty effect [1; ∅; 1]; by subsumption, discussed below, an expression with this effect may appear in any context. In what follows, we refer to contextual effects simply as effects, for brevity.

5.2.2 Typing

We now present a type and effect system to determine the contextual effect of every subexpression in a program. Types τ, listed at the end of Figure 5.1, include the integer type int; reference types ref^ε τ, which denote a reference to memory of type τ, where the reference itself is annotated with a label L ∈ ε; and function types τ →^Φ τ′, where τ and τ′ are the domain and range types, respectively, and the function has contextual effect Φ.

Figure 5.2 presents our contextual type and effect system. The rules prove judgments of the form Φ;Γ ⊢ e : τ, meaning that in type environment Γ, expression e has type τ and contextual effect Φ.

Typing

    (TInt)     Φ∅;Γ ⊢ n : int

    (TVar)     Γ(x) = τ
               ------------------
               Φ∅;Γ ⊢ x : τ

    (TLet)     Φ1;Γ ⊢ e1 : τ1    Φ2;Γ,x:τ1 ⊢ e2 : τ2    Φ1 ▷ Φ2 ↪ Φ
               -------------------------------------------------------
               Φ;Γ ⊢ let x = e1 in e2 : τ2

    (TIf)      Φ1;Γ ⊢ e1 : int    Φ2;Γ ⊢ e2 : τ    Φ2;Γ ⊢ e3 : τ    Φ1 ▷ Φ2 ↪ Φ
               -------------------------------------------------------------------
               Φ;Γ ⊢ if0 e1 then e2 else e3 : τ

    (TRef)     Φ;Γ ⊢ e : τ
               ------------------------------
               Φ;Γ ⊢ ref^L e : ref^{L} τ

    (TDeref)   Φ1;Γ ⊢ e : ref^ε τ    Φ2^ε = ε    Φ1 ▷ Φ2 ↪ Φ
               ------------------------------------------------
               Φ;Γ ⊢ !e : τ

    (TAssign)  Φ1;Γ ⊢ e1 : ref^ε τ    Φ2;Γ ⊢ e2 : τ    Φ3^ε = ε    Φ1 ▷ Φ2 ▷ Φ3 ↪ Φ
               -----------------------------------------------------------------------
               Φ;Γ ⊢ e1 := e2 : τ

    (TLam)     Φ;Γ,x:τ′ ⊢ e : τ
               -------------------------------
               Φ∅;Γ ⊢ λx.e : τ′ →^Φ τ

    (TApp)     Φ1;Γ ⊢ e1 : τ1 →^Φf τ2    Φ2;Γ ⊢ e2 : τ1    Φ1 ▷ Φ2 ▷ Φf ↪ Φ
               ----------------------------------------------------------------
               Φ;Γ ⊢ e1 e2 : τ2

    (TSub)     Φ′;Γ ⊢ e : τ′    τ′ ≤ τ    Φ′ ≤ Φ
               -------------------------------------
               Φ;Γ ⊢ e : τ

Effect combinator

    (XFlow-Ctxt)  Φ1 = [α1; ε1; (ε2 ∪ ω2)]    Φ2 = [(α1 ∪ ε1); ε2; ω2]    Φ = [α1; (ε1 ∪ ε2); ω2]
                  ----------------------------------------------------------------------------------
                  Φ1 ▷ Φ2 ↪ Φ

Subtyping

    (SInt)     int ≤ int

    (SRef)     τ ≤ τ′    τ′ ≤ τ    ε ⊆ ε′
               ------------------------------
               ref^ε τ ≤ ref^{ε′} τ′

    (SFun)     τ′1 ≤ τ1    τ2 ≤ τ′2    Φ ≤ Φ′
               ----------------------------------
               τ1 →^Φ τ2 ≤ τ′1 →^{Φ′} τ′2

    (SCtxt)    α2 ⊆ α1    ε1 ⊆ ε2    ω2 ⊆ ω1
               ----------------------------------
               [α1; ε1; ω1] ≤ [α2; ε2; ω2]

Figure 5.2: Contextual effects type system

The first two rules, (TInt) and (TVar), assign the expected types and the empty effect, since values have no effect.
(TLet) types subexpressions e1 and e2, which have effects Φ1 and Φ2, respectively, and requires that these effects combine to form Φ, the effect of the entire expression. We use a call-by-value semantics, and hence the effect of the let should be the effect of e1 followed by the effect of e2. We specify the sequencing of effects with the combinator Φ1 ▷ Φ2 ↪ Φ, defined by (XFlow-Ctxt) in the middle part of Figure 5.2. Since e1 happens before e2, this rule requires that the future effect of e1 be ε2 ∪ ω2, i.e., everything that happens during the evaluation of e2, captured by ε2, plus everything that happens after, captured by ω2. Similarly, the past effect of e2 must be α1 ∪ ε1, since e2 happens just after e1. Lastly, the effect Φ of the entire expression has α1 as its prior effect, since e1 happens first; ω2 as its future effect, since e2 happens last; and ε1 ∪ ε2 as its current effect, since both e1 and e2 are evaluated. We write Φ1 ▷ Φ2 ▷ Φ3 ↪ Φ as shorthand for (Φ1 ▷ Φ2 ↪ Φ′) and (Φ′ ▷ Φ3 ↪ Φ).

(TIf) requires that its branches have the same type τ and effect Φ2, which can be achieved with subsumption (below), and uses ▷ to specify that Φ1, the effect of the guard, occurs before either branch. (TRef) types memory allocation, which has no effect but places the annotation L into a singleton effect {L} on the output type. This singleton effect can be increased as necessary by using subsumption.

(TDeref) types the dereference of a memory location of type ref^ε τ. In a standard effect system, the effect of !e is the effect of e plus the effect ε of accessing the pointed-to memory. Here, the effect of e is captured by Φ1, and because the dereference occurs after e is evaluated, (TDeref) puts Φ1 in sequence just before some Φ2 such that Φ2's current effect is ε. Therefore, by (XFlow-Ctxt), Φ^ε is Φ1^ε ∪ ε, and e's future effect Φ1^ω must include ε and the future effect of Φ2. On the other hand, Φ2^ω is unconstrained by this rule, but it will be constrained by the context, assuming the dereference is followed by another expression. (TAssign) is similar to (TDeref), combining the effects Φ1 and Φ2 of its subexpressions with a Φ3 whose current effect is ε.

(TLam) types the function body e and sets the effect on the function arrow to be the effect of e. The expression as a whole has no effect, since the function produces no run-time effects until it is actually called. (TApp) types function application, which combines Φ1, the effect of e1, with Φ2, the effect of e2, and Φf, the effect of the function.

The last rule in our system, (TSub), introduces subsumption on types and effects. The judgments τ′ ≤ τ and Φ′ ≤ Φ are defined at the bottom of Figure 5.2. (SInt), (SRef), and (SFun) are standard, with the usual co- and contravariance where appropriate. (SCtxt) defines subsumption on effects, which is covariant in the current effect, as expected, and contravariant in both the prior and future effects. To understand the contravariance, first consider an expression e with future effect ω1. Since future effects should soundly approximate (i.e., be a superset of) the locations that may be accessed in the future, we can use e in any context that accesses at most locations in ω1. Similarly, since past effects approximate locations that were accessed in the past, we can use e in any context that accessed at most locations in α1.
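As a small worked example of our own: consider let x = !a in !b, where a and b are references whose types carry labels {A} and {B}, respectively. If the whole expression types with Φ = [∅; {A, B}; ∅], then (XFlow-Ctxt) forces

    Φ1 = [∅; {A}; {B}]     (for !a: B has not yet been accessed)
    Φ2 = [{A}; {B}; ∅]     (for !b: A has already been accessed)

so that Φ1 ▷ Φ2 ↪ Φ. Each subexpression's prior and future effects thus record exactly what the surrounding evaluation has already done and has yet to do.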
5.2.3 Semantics and Soundness

The semantics of the contextual effects system, the formal definitions of its key soundness properties, and its soundness proof are due to Pratikakis and can be found in his dissertation [94]. We include the contextual effects semantics and the formal definitions of its key soundness properties here for completeness.

The top of Figure 5.3 gives some basic definitions needed for our operational semantics. We extend values v to include the form r^L, which is a run-time heap location r annotated with label L. We need to track labels through our operational semantics to formulate and prove soundness, but these labels need not exist at run-time. We define heaps H to be maps from locations to values. Finally, we extend typing environments Γ to assign types to heap locations.

    Values        v ::= ... | r^L
    Heaps         H ::= ∅ | H, r ↦ v
    Environments  Γ ::= ∅ | Γ, x : τ | Γ, r : τ

    [Id]      ⟨α, ω, H, v⟩ −→^∅ ⟨α, ω, H, v⟩

    [Call]    ⟨α, ω, H, e1⟩ −→^{ε1} ⟨α1, ω1, H1, λx.e⟩
              ⟨α1, ω1, H1, e2⟩ −→^{ε2} ⟨α2, ω2, H2, v2⟩
              ⟨α2, ω2, H2, e[x ↦ v2]⟩ −→^{ε3} ⟨α′, ω′, H′, v⟩
              ------------------------------------------------------
              ⟨α, ω, H, e1 e2⟩ −→^{ε1 ∪ ε2 ∪ ε3} ⟨α′, ω′, H′, v⟩

    [Ref]     ⟨α, ω, H, e⟩ −→^ε ⟨α′, ω′, H′, v⟩    r ∉ dom(H′)
              ------------------------------------------------------
              ⟨α, ω, H, ref^L e⟩ −→^ε ⟨α′, ω′, (H′, r ↦ v), r^L⟩

    [Deref]   ⟨α, ω, H, e⟩ −→^ε ⟨α′, ω′ ∪ {L}, H′, r^L⟩    r ∈ dom(H′)
              ----------------------------------------------------------
              ⟨α, ω, H, !e⟩ −→^{ε ∪ {L}} ⟨α′ ∪ {L}, ω′, H′, H′(r)⟩

    [Assign]  ⟨α, ω, H, e1⟩ −→^{ε1} ⟨α1, ω1, H1, r^L⟩
              ⟨α1, ω1, H1, e2⟩ −→^{ε2} ⟨α2, ω2 ∪ {L}, (H2, r ↦ v′), v⟩
              ---------------------------------------------------------------------
              ⟨α, ω, H, e1 := e2⟩ −→^{ε1 ∪ ε2 ∪ {L}} ⟨α2 ∪ {L}, ω2, (H2, r ↦ v), v⟩

    [If-T]    ⟨α, ω, H, e1⟩ −→^{ε1} ⟨α1, ω1, H1, v1⟩    v1 = 0
              ⟨α1, ω1, H1, e2⟩ −→^{ε2} ⟨α2, ω2, H2, v⟩
              ----------------------------------------------------------------
              ⟨α, ω, H, if0 e1 then e2 else e3⟩ −→^{ε1 ∪ ε2} ⟨α2, ω2, H2, v⟩

    [If-F]    ⟨α, ω, H, e1⟩ −→^{ε1} ⟨α1, ω1, H1, v1⟩    v1 = n ≠ 0
              ⟨α1, ω1, H1, e3⟩ −→^{ε3} ⟨α3, ω3, H3, v⟩
              ----------------------------------------------------------------
              ⟨α, ω, H, if0 e1 then e2 else e3⟩ −→^{ε1 ∪ ε3} ⟨α3, ω3, H3, v⟩

    [Let]     ⟨α, ω, H, e1⟩ −→^{ε1} ⟨α1, ω1, H1, v1⟩
              ⟨α1, ω1, H1, e2[x ↦ v1]⟩ −→^{ε2} ⟨α2, ω2, H2, v2⟩
              ----------------------------------------------------------------
              ⟨α, ω, H, let x = e1 in e2⟩ −→^{ε1 ∪ ε2} ⟨α2, ω2, H2, v2⟩

Figure 5.3: Contextual effects operational semantics (partial)

The bottom part of Figure 5.3 defines a big-step operational semantics for our language. The reduction rules are straightforward. [Id] reduces a value to itself without changing the state or the effects. [Call] evaluates the first expression to a function, the second expression to a value, and then the function body with the formal argument replaced by the actual argument. [Ref] generates a fresh location r, which is bound in the heap to v, and evaluates to r^L. [Deref] reads the location r in the heap and adds L to the standard evaluation effect. This rule requires that the future effect after evaluating e have the form ω′ ∪ {L}, i.e., L must be in the capability after evaluating e, but prior to dereferencing the result. Then L is added to α′ in the output configuration of the rule. Notice that ω′ ∪ {L} is a standard union, and so L may also be in ω′. This allows the same location to be accessed multiple times. [Assign] behaves similarly to [Deref]. Lastly, [If-T] and [If-F] give the two cases for conditionals, and [Let] binds x to the result of evaluating e1 inside of e2. Our semantics also includes rules (not shown) that produce err when the program tries to access a location that is not in the input capability, or when values are used at the wrong type.

Given this operational semantics, we can now prove that the contextual effect system in Figure 5.2 is sound. We now state our main lemmas and theorems. We begin with a standard definition of heap typing.

Definition 5.2.1 (Heap Typing). We say heap H is well-typed under Γ, written Γ ⊢ H, if dom(Γ)
= dom(H) and if for every r ∈ dom(H), we have Φ∅;Γ ⊢ H(r) : Γ(r).

Given this definition, we show the standard effect soundness theorem, which states that the program does not go wrong and that the standard effect Φ^ε captures the effect of evaluation.

Theorem 5.2.2 (Standard Effect Soundness). If Φ;Γ ⊢ e : τ and Γ ⊢ H and ⟨1, 1, H, e⟩ −→^ε ⟨1, 1, H′, R⟩, then there is a Γ′ ⊇ Γ such that R is a value v for which Φ∅;Γ′ ⊢ v : τ, where Γ′ ⊢ H′ and ε ⊆ Φ^ε.

Next, we show the operational semantics is adequate, in that it moves effects from the future to the past during evaluation.

Lemma 5.2.3 (Adequacy of Semantics). If ⟨α, ω, H, e⟩ −→^ε ⟨α′, ω′, H′, v⟩ then α′ = α ∪ ε and ω = ω′ ∪ ε.

Next we must define what it means for the statically-ascribed contextual effects of some expression e to be sound with respect to the effects of e's evaluation. Suppose that ep is a program that is well-typed according to typing derivation T and evaluates to some value v, as witnessed by an evaluation derivation D. Observe that each term e1 that is reduced in a subderivation of D is either a subterm of ep, or is derived from a subterm e2 of ep via reduction; in the latter case it is sound to give e1 the same type and effect that e2 has in T. To reason about the soundness of the effects, therefore, we must track the static effect of expression e2 as it is evaluated.

We do this by defining a new typed operational semantics that extends standard configurations with a typing derivation of the term in that configuration. The key property of this semantics is that it preserves the effect Φ of a term throughout its evaluation, and we prove that given standard evaluation and typing derivations of the original program, we can always construct a corresponding typed operational semantics derivation. Finally, we prove that given a typed operational semantics derivation, the effect Φ in the typing in each configuration conservatively approximates the actual prior and future effect.

Theorem 5.2.4 (Prior and Future Effect Soundness). If E :: ⟨T, α, ω, H, e⟩ −→^ε ⟨Tv, α′, ω′, H′, v⟩, where T :: Φ;Γ ⊢ e : τ, α ⊆ Φ^α and ω′ ⊆ Φ^ω, then for all sub-derivations Ei of E, Ei :: ⟨Ti, αi, ωi, Hi, ei⟩ −→^{εi} ⟨Tvi, α′i, ω′i, H′i, vi⟩, where Ti :: Φi;Γi ⊢ ei : τi, it will hold that αi ⊆ Φi^α and ω′i ⊆ Φi^ω.

The proof of the above theorem is by induction on the derivation, starting at the root and working towards the leaves, and relying on Theorem 5.2.2 and Lemma 5.2.3. Finally, the soundness of the contextual effects system follows as a corollary.

Theorem 5.2.5 (Contextual Effect Soundness). Given a program ep with no free variables, its typing T and its canonical evaluation D, we can construct a typed evaluation E such that for every sub-derivation E :: ⟨T, α, ω, H, e⟩ −→^ε ⟨Tv, α′, ω′, H′, v⟩ in E, where T :: Φ;Γ ⊢ e : τ, it is always the case that α ⊆ Φ^α, ε ⊆ Φ^ε, and ω′ ⊆ Φ^ω.

5.2.4 Contextual Effect Inference

The typing rules in Figure 5.2 form a checking system, but we would prefer to infer effect annotations rather than require the programmer to provide them. Here we sketch the inference process, which is straightforward and uses standard constraint-based techniques.

We change the rules in Figure 5.2 into inference rules by making three modifications. First, we make the rules syntax-driven by integrating (TSub) into the other rules [76]; second, we add effect variables to represent as-yet-unknown effects;
and third, we replace implicit equalities with explicit equality constraints. The resulting rules are mostly as expected, with one interesting difference for (TApp). We might expect inlining subsumption into (TApp) to yield the following rule:

    (*)  Φ1;Γ ⊢ e1 : τ1 →^{Φf} τ2    Φ2;Γ ⊢ e2 : τ′1    τ′1 ≤ τ1    Φ1 ▷ Φ2 ▷ Φf ↪ Φ
         --------------------------------------------------------------------------------
         Φ;Γ ⊢ e1 e2 : τ2

However, this would cause the inferred Φf effect to be larger than necessary if there are multiple calls to the same function. For example, consider the following code, where f is some one-argument function, x, y, and z are references, and A and B identify two program points:

    (if0 ... then /*A*/ (f 1; !x) else /*B*/ (f 2; !y)); !z

If we used rule (*), then from branch A we would require {x, z} ⊆ Φf^ω, and from branch B we would require {y, z} ⊆ Φf^ω, where Φf is the effect of function f. Putting these together, we would thus have Φf^ω = {x, y, z}. This result is appropriate, since any of those locations may be accessed after some call to f. However, consider the future effect Φ_A^ω at program point A. By (XFlow-Ctxt), Φ_A^ω would contain Φf^ω, and yet y will not be accessed once we reach A, since that access is on another branch. The analogous problem happens at program point B, whose future effect is polluted by x.

The problem is that our effect system conflates all calls to f. One solution would be to add Hindley-Milner style parametric polymorphism, which would address this particular example. However, even with Hindley-Milner polymorphism we would suffer the same problem at indirect function calls; e.g., in C, calls through function pointers would be monomorphic. The solution is to notice that inlining subsumption into (TApp) should not yield (*), but instead results in the following rule:

    (TApp′)  Φ1;Γ ⊢ e1 : τ1 →^{Φf} τ2    Φ2;Γ ⊢ e2 : τ′1    τ′1 ≤ τ1
             Φf ≤ Φ′f    Φ′f fresh    Φ1 ▷ Φ2 ▷ Φ′f ↪ Φ
             ------------------------------------------------------------
             Φ;Γ ⊢ e1 e2 : τ2

Applied to the above example, (TApp′) results in two constraints on the future effect of Φf:

    Φf^ω ⊆ Φ_fA^ω = {x, z}        Φf^ω ⊆ Φ_fB^ω = {y, z}

Here Φ_fA and Φ_fB are the fresh function effects at the calls to f in A and B, respectively. Notice that we have Φf^ω = {x, y, z}, as before, since f is called in both contexts. But now Φ_fA^ω need not contain y, and Φ_fB^ω need not contain x. Thus, with (TApp′), a function's effect summarizes all of its contexts, but does not cause the prior and future effects from different contexts to pollute each other.

To perform type inference, we apply our inference rules, viewing them as generating the constraints C in their hypotheses, given by the following grammar:

    C ::= τ ≤ τ′ | Φ ≤ Φ′ | Φ1 ▷ Φ2 ↪ Φ

We can then solve the constraints by performing graph reachability to find, for each effect variable, the set of base effects {L} or 1 that reach it. In practice, these constraints can be solved very efficiently using a toolkit such as Banshee [61].
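To illustrate the reachability step, here is a toy sketch of ours (Ginseng actually delegates constraint solving to Banshee): each subset constraint between effect variables becomes a graph edge, and solving is a fixed-point propagation of base labels along edges.

    #include <stdio.h>

    /* Toy constraint solver: edge[i][j] encodes eps_i <= eps_j, and the
     * solution for a variable is the union of base labels reaching it. */
    #define NVARS   4
    #define NLABELS 3

    static int edge[NVARS][NVARS];
    static int base[NVARS][NLABELS];   /* base[i][l]: label l in eps_i */

    static void propagate(void) {
        int changed = 1;
        while (changed) {              /* fixed point over all edges */
            changed = 0;
            for (int i = 0; i < NVARS; i++)
                for (int j = 0; j < NVARS; j++)
                    if (edge[i][j])
                        for (int l = 0; l < NLABELS; l++)
                            if (base[i][l] && !base[j][l]) {
                                base[j][l] = 1;
                                changed = 1;
                            }
        }
    }

    int main(void) {
        base[0][0] = 1;                 /* eps_0 contains {L0}            */
        base[1][1] = 1;                 /* eps_1 contains {L1}            */
        edge[0][2] = edge[1][2] = 1;    /* eps_0 <= eps_2, eps_1 <= eps_2 */
        edge[2][3] = 1;                 /* eps_2 <= eps_3                 */
        propagate();
        for (int l = 0; l < NLABELS; l++)
            if (base[3][l]) printf("L%d in eps_3\n", l);
        return 0;
    }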
5.3 Single-threaded Transactional Version Consistency

In Chapter 4 we showed how we allow programmers to specify semantic update points and how our runtime system permits updates in between semantic update points (i.e., at induced update points) while still maintaining the illusion that code between semantic update points executes at the same version. In this section we present a formalism and proof of version consistency.

We model semantic and induced update points using two syntactic elements: transactions tx B and update points update^{α,ω}. A transaction tx B designates a lexically scoped code block B whose execution should be version consistent. Transactions are a restricted form of semantic update point placement: instead of allowing arbitrary semantic update point placement, we require that semantic update points are paired, and delimit lexical scopes. In other words, the beginning and end of a transaction correspond to two semantic update points. We require lexical scoping because it makes modeling and reasoning about version consistency easier, without hampering flexibility in choosing update points: instead of inserting a semantic update point at the end of a loop iteration, the programmer simply designates the loop body as a transaction. An update point update^{α,ω} in our calculus represents an induced update point, where α and ω are the contextual effects at that point. As we will see in Section 5.3.3, contextual effects impose constraints on updates, to ensure that version consistency is preserved.

Transactional version consistency for DSU is similar to the property of isolation in database-style transactions: just as in the ACID model other operations can either see the entire effect of a transaction, or no effect at all, in DSU the execution of a lexical scope delimited by semantic update points can be attributed either to the old version or to the new version, but not both. Transactions allow us to reason more easily about version consistency without placing undue burden on the programmer: instead of having one update point at the beginning of a loop iteration (Section 4.4.4), we simply designate the loop body as a transaction. The formal property our calculus establishes is called transactional version consistency (TVC), meaning that transactions execute as if they were entirely the old version or entirely the new version, no matter where an update actually occurs.

5.3.1 Syntax

Figure 5.4 presents Proteus-tx, which extends the language from Section 5.2 to model transactional version-consistent dynamic updates, adapting the ideas of Proteus, our prior dynamic updating calculus [107].

    Definitions      d ::= main e | var g = v in d | fun f(x) = e in d
    Expressions      e ::= v | x | let x = e in e | e e
                         | if0 e then e else e
                         | ref e | !e | e := e
                         | tx e | update^{α,ω}
    Values           v ::= n | z
    Effects          α, ε, ω ::= ∅ | 1 | {z} | ε ∪ ε
    Global symbols   f, g, z ∈ GSym
    Dynamic updates  upd ::= {chg, add}
    Additions        add ∈ GSym ⇀ (τ × b)
    Changes          chg ∈ GSym ⇀ (τ × b)
    Bindings         b ::= v | λx.e

Figure 5.4: Proteus-tx syntax, effects, and updates

A Proteus-tx program is a definition d, which consists of an expression main e, possibly preceded by definitions of global symbols, written f, g, or z and drawn from a set GSym. The definition var g = v in d binds mutable variable g to v within the scope of d, and the definition fun f(x) = e in d binds f to a (possibly recursive) function with formal parameter x and body e.

Expressions e in Proteus-tx have several small differences from the language of Figure 5.1. We add global symbols z to the set of values v. We also remove anonymous lambda bindings to keep things simpler, for reasons discussed in Section 5.3.3. To mark transactions, we add a form tx e for a transaction whose body is e. We specify program points where dynamic updates may occur with the term update^{α,ω}, where the annotations α and ω specify the prior and future effects at the update point, respectively. When evaluation reaches update^{α,ω}, an available update is applied if its contents do not conflict with the future and prior effect annotations; otherwise evaluation proceeds without updating.
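For instance (an example of ours, not from the original development), the following program wraps a call to f in a transaction, with an induced update point before the call:

    var g = 0 in
    fun f(x) = g := x in
    main (tx (let y = update^{α,ω} in f 1))

At this update point nothing in the transaction has executed yet, so the prior effect annotation α need only cover the (empty) effect of the enclosing context, while the call to f and its write to g are still to come, so ω must include {f, g}. An update that changes f or g therefore cannot take effect here as if it occurred at the end of the transaction, but it can take effect as if it occurred at the beginning, in which case the call f 1 must run the new code.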
A dynamic update upd consists of a pair of partial functions chg and add that describe the changes and additions, respectively, of global symbol bindings. The range of these functions is pairs (τ, b), where b is the new or replacement value (which may be a function λx.e) and τ is its type. Note that Proteus-tx disallows type-altering updates, though Section 5.3.5 explains how they can be supported by employing ideas from our earlier work [107]. Also, although Ginseng allows state initialization functions, for simplicity we do not model them in Proteus-tx.

Finally, effects in Proteus-tx consist of sets of global symbol names z, which represent either a dereference of or assignment to z (if it is a variable) or a call to z (if it is a function name). Because updates in Proteus-tx can only change global symbols (and do not read or write through their contents), we can ignore the effects of normal references, and hence we use the syntax ref e instead of ref^L e.
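Continuing the example above (again our own illustration), an update that replaces the body of f while preserving its type could be written

    upd = { chg = { f ↦ (int →^Φ int, λx. g := 0) },  add = ∅ }

Since dom(upd.chg) = {f}, the safety check of Section 5.3.3 permits this update at the update point above only as if it occurred at the start of the transaction (the prior effect α does not mention f), after which the call f 1 runs the new body.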
In (TTransact), we include the prior and future effects of ?, from the outer context, into those of ?1, from the transaction body. This ensures that an update within a child transaction does not violate version consistency of its parent. However, we do not require the reverse?the components of ?1 need not be included in ?. This has two consequences. First, sequenced transactions are free to commit independently. For example, consider the following code tx { tx { /?A?/}; /?B?/ tx{/?C?/}} According to (TTransact), the effect at B is included in the prior and future effects of C and A, respectively, but not the other way around. Thus neither trans- action?s effect propagates into the other, and therefore does not influence any update operations in the other. The second consequence is that version consistency for a parent transaction ignores the effects of its child transactions. This resembles open nesting in con- currency transactions [81]. For example, suppose in the code above that A and C contain calls to a hash table T. Without the inner transaction markers, an update to T available at B would be rejected, because due to A it would overlap with the prior effect, and due to C it would overlap with the future effect. With the inner 141 transactions in place, however, the update would be allowed. As a result, the parent transaction could use the old version of the hash table in A and the new version in C. This treatment of nested transactions makes sense when inner transactions contain code whose semantics is largely independent of the surrounding context, e.g., the abstraction represented by a hash table is independent of where, or how often, it is used. Baumann et al.[11] have applied this semantics to successfully partition dynamic updates to the K42 operating system into independent, object- sized chunks. While we believe open nesting makes sense, we can see circumstances in which closed nesting might be more natural, so we expect to refine our approach in future work. 5.3.3 Operational Semantics Figures 5.6 and 5.7 define a small-step operational semantics that reduces configurations?n;?;H;e?, where n defines the current program version (a successful dynamic update increments n), ? is the transaction stack (explained shortly), H is the heap, and e is the active program expression. Reduction rules have the form ?n;?;H;e? ??? ?nprime;?prime;Hprime;eprime?, where the event ? on the arrow is either ?, a dynamic update that occurred (discussed below), or ?, the effect of the evaluation step. In our semantics, heaps map references r and global variables z to triples (?,b,?) consisting of a type ?, a binding b (defined in Figure 5.4), and a version 142 Defi nitio ns Heap s H ::= ?| rmapsto? (?, b, ?) ,H | zmapsto? (? ,b, ?) ,H Ve rsion sets ? ::= ?| {n }? ? Trace s ? ::= ?| (z, ?) ? ? Trans actio n sta cks ? ::= ?| (n, ?) ,? Ex pressio ns e ::= ... |r |intx e Ev en ts ? ::= ?| ? Up date Direc tion dir ::= bck |fwd Up date Bund les ? ::= (up d, dir ) Co mpil ation C( H ;mai ne ) = H ;e C( H ;fu nf (x )= ein d) = C( H, fmapsto? (? ?? ? ?prime ,?x.e, {0 }); d) C( H ;va rg = vin d) = C( H, gmapsto? (? ,v ,{ 0} );d ) Ev aluati on Con texts E ::= [] |E e| vE |le tx = E in e | ref E |! E |E := e| r:= E |g := E | if0 E th en eel se e Co mpu tatio n [let ] ?n ;(n prime, ?); H ;le tx = vin e? ?? ? ?n ;(n prime, ?); H ;e [x mapsto? v]? [ref ] ?n ;(n prime, ?); H ;ref v? ?? ? ?n ;(n prime, ?); H [r mapsto? (?, v, ?)]; r? rnegationslash? do m( H ) [deref ] ?n ;(n prime, ?); H ;! r? ?? ? 
?n ;(n prime, ?); H ;v ? H (r )= (?, v, ?) [assign ] ?n ;(n prime, ?); H ;r := v? ?? ? ?n ;(n prime, ?); H [r mapsto? (?, v, ?)]; v? r? do m( H ) [if-t ] ?n ;(n prime, ?); H ;if0 0th en e1 else e2 ? ?? ? ?n ;(n prime, ?); H ;e 1? [if-f ] ?n ;(n prime, ?); H ;if0 nprimeprime th en e1 else e2 ? ?? ? ?n ;(n prime, ?); H ;e 2? nprimeprime negationslash= 0 [cong ] ?n ;? ;H ;E [e] ? ?? ? ?n prime; ?prime ;H prime; E[ eprime] ? ?n ;? ;H ;e ?? ? ? ?n prime; ?prime ;H prime; eprime? [gv ar-deref ] ?n ;(n prime, ?); H ;! z? ?? {z } ?n ;(n prime, ?? (z, ?)); H ;v ? H (z) = (? ,v ,? ) [gv ar-assign ] ?n ;(n prime, ?); H ;z := v? ?? {z } ?n ;(n prime, ?? (z, ?)); H [z mapsto? (? ,v ,? )]; v? H (z) = (? ,v prime, ?) [cal l] ?n ;(n prime, ?); H ;z v? ?? {z } ?n ;(n prime, ?? (z, ?)); H ;e [x mapsto? v]? H (z) = (? ,?x.e, ?) [tx-st ar t] ?n ;(n prime, ?); H ;tx e? ?? ? ?n ;(n prime, ?) ,(n, ?); H ;intx e? [tx-cong-1 ] ?n ;(n primeprime, ?) ,?; H ;intx e? ?? ? ?n prime; U[( nprimeprime ,? )]? n prime, ?prime ;H prime; intx eprime? ?n ;? ;H ;e ?? ? ? ?n prime; ?prime ;H prime; eprime? [tx-cong-2 ] ?n ;? ;H ;intx e? ?? ? ?n prime; ?prime ;H prime; intx eprime? ?n ;? ;H ;e ?? ? ? ?n prime; ?prime ;H prime; eprime? [tx-end ] ?n ;(( nprime ,? prime) ,(n primeprime, ?primeprime )); H ;intx v? ?? ? ?n ;(n prime, ?prime );H ;v ? trac eOK (n primeprime, ?primeprime ) [upd ate ] ?n ;(n prime, ?); H ;up da te?, ?? ?? (up d, dir ) ?n + 1; U[( nprime ,? )]up d, dir n+1 ;U [H ]up d n+1 ;1 ? up date OK (up d, H, (?, ?) ,dir ) [no-upd ate ] ?n ;(n prime, ?); H ;up da te?, ?? ?? ? ?n ;(n prime, ?); H ;0 ? Figur e5 .6: Proteus-tx op erat iona lseman tics 143 Update Safety updateOK(upd,H,(?,?),dir) = dir = bck ? ??dom(updchg) = ? ? dir = fwd ? ? ?dom(updchg) = ? ? ? = types(H) ? ?upd = ?,types(updadd) ? ?z mapsto? (?,b,?) ? updchg.` ??;?upd turnstileleft b : ? ? heapType(?,z) = ?(z)? ? ?z mapsto? (?,b,?) ? updadd.` ??;?upd turnstileleft b : ? ? z /? dom(H)? Trace Safety traceOK(n,?) = (?(z,?) ? ?. n ? ?) Heap Updates U[(z mapsto? (?,b,?),H)]updn = 8 >< >: z mapsto? (?,bprime,{n}),U[H]updn if updchg(z) mapsto? (?,bprime) z mapsto? (?,b,? ?{n}),U[H]updn otherwise U[(r mapsto? (?,b,?),H)]updn = (r mapsto? (?,b,?)),U[H]updn U[?]updn = {z mapsto? (?,b,{n}) | z mapsto? (?,b) ? updadd} Heap Typing Environments types(?) = ? types(z mapsto? (?,b,?),Hprime) = z : heapType(?,z),types(Hprime) heapType(?1 ??? ?2,z) = ?1 ??? ?2 z ? ? heapType(?,z) = ref {z} ? ? negationslash= (?1 ??? ?2) Trace Stack Updates U[(nprime,?)]upd,fwdn = (nprime,?) U[(nprime,?)]upd,bckn = (n,Ut[?]updn ) Ut[?]updn = {(z,? ?{n} | z negationslash? dom(updchg)} ? {(z,?) | z ? dom(updchg)} Figure 5.7: Proteus-tx update safety set ?. The first and last components are relevant only for global symbols; the type ? is used to ensure that dynamic updates do not change the types of global bindings, and the version set ? contains all the program versions up to, and including, the current version since the corresponding variable was last updated. When an update occurs, new or changed bindings are given only the current version, while all other bindings have the current version added to their version set (i.e., we preserve the fact that the same binding was used in multiple program versions). As evaluation proceeds, we maintain a transaction stack ?, which is a list of 144 pairs (n,?) that track the currently active transactions. Here n is the version the program had when the transaction began, and ? is a trace. 
A trace is a set of pairs (z,ν), each of which represents a global symbol access paired with its version set at the time of use. The traces act as a log of dynamic events, and we track them in our semantics so we can prove that all global symbols accessed in a transaction come from the same version.

To evaluate a program d, we first compute C(∅,d) using the function C shown at the top of Figure 5.6, which yields a pair H;e. This function implicitly uses the types derived by typing d with the rules in Figure 5.5. Then we begin regular evaluation in the configuration ⟨0;(0,∅);H;e⟩; i.e., we evaluate e at version 0, with initial transaction stack (0,∅), and with the declared bindings H. This causes the top-level expression e in main e to be treated as if it were enclosed in a transaction block.

The first several reduction rules in Figure 5.6 are straightforward. [let], [ref], [deref], [assign], [if-t], and [if-f] are small-step versions of the rules in Figure 5.3, though normal references no longer have effects. None of these rules affects the current version or transaction stack. [cong] is standard.

[gvar-deref], [gvar-assign], and [call] each have effect {z} and add (z,ν) to the current transaction's trace, where ν is z's current version set. Notice that [call] performs dereference and application in one step, finding z in the heap and performing substitution. Since dynamic updates modify heap bindings, this ensures that every function call is to the most recent version. Notice also that although both functions and variables are stored in the heap, we assign regular function types to functions ((TDFun) in Figure 5.5) so that they cannot be assigned to within a program. Including λ-terms in the expression language would either complicate function typing or make it harder to define function updates, so we omit them to keep things simpler.

The next several rules handle transactions. [tx-start] pushes the pair (n,∅) onto the right of the transaction stack, where n is the current version and ∅ is the empty trace. The expression tx e is reduced to intx e, a new form that represents an actively-evaluating transaction. The form intx e does not appear in source programs, and its type rule matches that of tx e (see Figure 5.8). Next, [tx-cong-1] and [tx-cong-2] perform evaluation within an active transaction intx e by reducing e to e′. The latter rule applies if e's reduction does not include an update, in which case the effect ε of reducing e is treated as ∅ in the outer transaction. This corresponds to our model of transaction nesting, which does not consider the effects of inner transactions when updating outer transactions. Otherwise, if an update occurs, then [tx-cong-1] applies, and we use the function U to update version numbers on the outermost entry of the transaction stack. U is discussed shortly.

The key property guaranteed by Proteus-tx, that transactions are version consistent, is enforced by [tx-end], which gets stuck unless traceOK(n″,θ″) holds. This predicate, defined in Figure 5.7, states that every element (z,ν) in the transaction's trace θ″ satisfies n″ ∈ ν, meaning that when z was used, it could be attributed to version n″, the version of the transaction. If this predicate is satisfied, [tx-end] strips off intx and pops the top (rightmost) entry of the transaction stack.
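To make the check concrete, consider a hypothetical trace for a transaction that began at version n″ = 1:

    θ = {(z1,{0,1}), (z2,{1,2})}        traceOK(1,θ) holds, since 1 ∈ {0,1} and 1 ∈ {1,2}

Both accesses can be attributed to version 1, so [tx-end] may commit the transaction. Had the transaction also accessed a binding created by a later update, say (z3,{2}), then 1 ∉ {2}, traceOK would fail, and [tx-end] would be stuck.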
The last two rules handle dynamic updates. When update^{α,ω} is in redex position, these rules try to apply an available update bundle μ, which is a pair (upd,dir) consisting of an update (from Figure 5.4) and a direction dir indicating whether we should consider the update as occurring at the beginning or at the end of the transaction, respectively. If updateOK(upd,H,(α,ω),dir) is satisfied for some dir, then [update] applies and the update occurs. Otherwise [no-update] applies, and the update must be delayed.

If [update] applies, we increment the program's version number and update the heap using U[H]^{upd}_{n+1}, defined under Heap Updates in Figure 5.7. This function replaces global variables and adds new bindings according to the update. New and replaced bindings' version sets contain only the current version, while unchanged bindings add the current version to their existing version sets.

The updateOK() predicate is defined at the top of Figure 5.7. Its first two conjuncts enforce the update safety requirement discussed in Section 5.3. There are two cases. If dir = bck, then we require that the update not intersect the prior effect, so that the update will appear to have happened at the beginning of the transaction. In this case, we need to update the version number of the transaction to be the new version, and any elements in the trace not modified by the update can have the new version added to their version sets; i.e., the past effect can be attributed to the new version. To do this, [update] applies the function U[(n′,θ)]^{upd,dir}_{n+1}, defined under Trace Stack Updates in Figure 5.7, with dir = bck. The update applies to outer transactions as well, and thus [tx-cong-1] applies this same version-number replacement process across the transaction stack.

(TIntrans)
    Φ1;Γ ⊢ e : τ    Φ^α ⊆ Φ1^α    Φ^ω ⊆ Φ1^ω
    ⟹ Φ;Γ ⊢ intx e : τ

Heap typing
    dom(Γ) = dom(H)
    ∀z ↦ (τ →ε τ′, λx.e, ν) ∈ H.  Φ;Γ,x:τ ⊢ e : τ′  ∧  Γ(z) = τ →ε τ′  ∧  z ∈ ε
    ∀z ↦ (τ, v, ν) ∈ H.  Φ∅;Γ ⊢ v : τ  ∧  Γ(z) = ref^ε τ  ∧  z ∈ ε
    ∀r ↦ (τ, v, ν) ∈ H.  Φ∅;Γ ⊢ v : τ  ∧  Γ(r) = ref^ε τ
    ∀z ↦ (τ, b, ν) ∈ H.  n ∈ ν
    ⟹ n;Γ ⊢ H

(TC1)
    (f ∈ θ ⇒ f ∈ α)    (f ∈ ω ⇒ n ∈ ver(H,f))
    ⟹ [α;ε;ω], ∅; H ⊢ (n,θ)

(TC2)
    Φ′,R;H ⊢ σ    (f ∈ θ ⇒ f ∈ α)    (f ∈ ω ⇒ n ∈ ver(H,f))
    ⟹ [α;ε;ω], Φ′,R; H ⊢ (n,θ), σ

    where ver(H,f) = ν iff H(f) = (τ,b,ν)

Figure 5.8: Proteus-tx typing extensions for proving soundness

In the other case, if dir = fwd, we require that the remainder of the transaction not be affected by the update, so the update will appear to have happened at the end of the transaction. In this case we need not modify the transaction stack, and hence U[(n′,θ)]^{upd,dir}_n with dir = fwd simply returns (n′,θ). The remaining premises of updateOK() determine whether the update itself is well-formed: each replacement binding must have the same type as the original, and new and added bindings must type check in the context of the updated heap.

5.3.4 Soundness

We have proven that well-typed Proteus-tx programs are version-consistent. The main result is that a well-typed, well-formed program either reduces to a value or evaluates indefinitely while preserving typing and version consistency. To prove this we need two additional judgments, shown in Figure 5.8. Heap typing n;Γ ⊢ H extends Definition 5.2.1 from the core system; the additional conditions ensure that global symbols are well-typed, have well-formed effects, and include version n (presumed to be the current version) in their version sets.
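As a concrete, hypothetical instance of the two directions, suppose a transaction has prior effect α = {z} and future effect ω = {g}, and an update changes only z, i.e., dom(updchg) = {z}. Then:

    bck:  α ∩ dom(updchg) = {z} ≠ ∅    (rejected: the transaction already used z, so the update cannot appear to precede it)
    fwd:  ω ∩ dom(updchg) = ∅          (accepted: the rest of the transaction never touches z, so the update can be attributed to the transaction's end)

In the fwd case, the transaction stack is left unchanged.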
Stack well-formedness R;H ⊢ σ checks that a transaction stack σ is correctly approximated by a transaction effect R, which consists of a list of contextual effects Φ, one for each nested transaction. R is computed from a typing derivation in a straightforward way according to the function ⟦Φ;Γ ⊢ e : τ⟧ = R, extracting Φ1 from each occurrence of (TIntrans) recursively; the rules are not shown due to space constraints. Stack well-formedness ensures two properties. First, it ensures that each element in the trace θ is included in the corresponding prior effect α (i.e., f ∈ θ ⇒ f ∈ α). As a result, we know that bck updates that rewrite the stack will add the new version to all elements of the trace, since none have changed. Second, it ensures that elements in each transaction's current effect (i.e., the part yet to be executed) have the version of that transaction: f ∈ ω ⇒ n ∈ ver(H,f). With this we can prove the core result:

Theorem 5.3.1 (Single-step Soundness). If Φ;Γ ⊢ e : τ, where ⟦Φ;Γ ⊢ e : τ⟧ = R, and n;Γ ⊢ H, and Φ,R;H ⊢ σ, and traceOK(σ), then either e is a value, or there exist n′, H′, σ′, e′, and an event ev such that ⟨n;σ;H;e⟩ →^{ev} ⟨n′;σ′;H′;e′⟩ and Φ′;Γ′ ⊢ e′ : τ, where ⟦Φ′;Γ′ ⊢ e′ : τ⟧ = R′, and n′;Γ′ ⊢ H′, and Φ′,R′;H′ ⊢ σ′, and traceOK(σ′), for some Φ′, Γ′, R′.

The proof is based on progress and preservation lemmas, as is standard. Details are in Appendix B. From this lemma we can prove soundness:

Corollary 5.3.2 (Soundness). If Φ;Γ ⊢ e : τ and 0;Γ ⊢ H, then ⟨0;(0,∅);H;e⟩ ⇒_A ⟨n′;(n″,θ);H′;v⟩ for some value v, or else e evaluates indefinitely, where ⇒_A is the reflexive, transitive closure of the →^{ev} relation and A is the set of events labeling the steps taken.

5.3.5 Implementing Version Consistency for C Programs

Ginseng implements transactional version consistency for C using contextual effects. The programmer indicates transactional blocks as lexically scoped blocks delimited by semantic update points, and the compiler annotates these points with contextual effects information.

To perform effect inference, we first compute a context-sensitive points-to analysis using CIL [80]. Then we generate (context-insensitive) effect constraints (as described in Section 5.2.4) using labels derived from the points-to analysis, and we solve the constraints with Banshee [61]. After computing the contextual effects, Ginseng transforms the program to make it updateable, and transforms each occurrence of update^{α,ω} into a call to a function DSU_induced_update(α,ω,D). Here α and ω are the prior and future effects at the update point, pre-computed by our contextual effect inference, and D is the capability (Section 3.3.1), i.e., the set of type names whose definitions cannot be modified and variables or functions whose types cannot be modified.

When DSU_induced_update is called at run time, it checks whether an update is available and, if so, applies the update if it is both type safe (i.e., no variable or type in D has been changed by the update to have a different type) and version consistent (given α and ω). If an update is not safe, it is delayed and execution continues at the old version.
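The following C sketch illustrates the shape of this runtime check. The effect and capability representations, and the runtime hooks pending_update and apply_update, are hypothetical stand-ins invented here for illustration; they are not Ginseng's actual data structures.

    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical representations: an effect is a set of global symbol
       names; the capability D is the set of names whose types must not
       change across the update. */
    typedef struct { const char **syms; size_t n; } symset_t;

    typedef struct {
        symset_t changed;   /* dom(updchg): symbols whose bindings change */
        symset_t retyped;   /* names to which the patch gives new types   */
    } update_t;

    extern update_t *pending_update(void);  /* NULL if no update is queued */
    extern void apply_update(update_t *u);  /* install new code and data   */

    static bool disjoint(const symset_t *a, const symset_t *b) {
        for (size_t i = 0; i < a->n; i++)
            for (size_t j = 0; j < b->n; j++)
                if (strcmp(a->syms[i], b->syms[j]) == 0)
                    return false;
        return true;
    }

    /* alpha/omega are the prior and future effects pre-computed at this
       update point; D is the capability from the updatability analysis. */
    void DSU_induced_update(const symset_t *alpha, const symset_t *omega,
                            const symset_t *D)
    {
        update_t *u = pending_update();
        if (u == NULL)
            return;                     /* no update available */

        /* Type safety: nothing in D may be given a different type. */
        if (!disjoint(D, &u->retyped))
            return;                     /* delay; continue at old version */

        /* Version consistency: the update must appear to occur at the
           start (bck) or at the end (fwd) of the enclosing transaction. */
        bool bck_ok = disjoint(alpha, &u->changed);
        bool fwd_ok = disjoint(omega, &u->changed);
        if (bck_ok || fwd_ok)
            apply_update(u);
        /* otherwise delayed: execution continues at the old version */
    }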
State Transformation. Our version consistency condition is slightly more complicated in practice due to state transformers (described in Section 3.4). The programmer writes the state transformer as if it will be run at the beginning or end of a transaction, and our system must ensure that this appearance is true. That is, to allow an update to occur within a transaction, we must ensure that (1) the writes performed by the state transformer do not violate the version consistency of the current program transactions, and (2) the effects of the current transactions do not violate the version consistency of the state transformer itself. We achieve both ends by considering the update changes (dom(updchg)) and the state transformer's current effect ε_xf as the effect of the update when performing the usual checks for version consistency. For example, if an update point DSU_induced_update(α,ω,D) is reached within a transaction and ω ∩ (ε_xf ∪ dom(updchg)) = ∅, then the remaining actions of the transaction will not affect the state transformer, and vice versa. Therefore, the update appears to have occurred at the end of the transaction. Likewise, if α ∩ (ε_xf ∪ dom(updchg)) = ∅, then the effect of the transaction to this point has no bearing on the execution of the state transformer, and vice versa, so it is as if the update occurred at the beginning of the transaction. Note that because state transformers can also access the heap through global variables, we need to include accesses to standard heap references (i.e., names L, as in Section 5.2) in our effects.

A similar complication arises from the use of type transformers (Section 3.2.2). If type transformers have effects (e.g., they call functions or access global variables), these effects need to be taken into account in our version consistency condition. We can do this by simply adding the effects of all type transformers to dom(updchg), as we do with the effects of state transformers above. In our current implementation, we do not consider type transformer effects; we have manually inspected our type transformers and found them to be safe, since the type transformer code for our test applications is very simple. However, we plan to add this feature in future work.

5.3.6 Experiments

We measured the potential benefits of transactional version consistency by analyzing 12 dynamic updates to Vsftpd. When updating Vsftpd (Section 3.5.3), we manually placed one semantic update point at the end of an iteration of the long-running accept loop. Placing the update point there ensured that updates were de facto version consistent, as the entire loop iteration executes at the same (old or new) version. However, having a single update point hampers update availability, as we need to wait until the end of an iteration to apply the update. In this section, we try to determine how induced update points can improve availability by allowing updates to be applied inside transactions, while preserving version consistency.

We first designated the two long-running loops in Vsftpd (the accept loop and the command-processing loop) as transactions. Then, we modified Ginseng to seed the code used in transactions with candidate update points (i.e., induced update points). While we could conceivably insert induced update points at every statement, we found through manual examination that inserting them just before the return statement of any function reachable from within a transaction provides good coverage. Finally, we used Ginseng to infer the contextual effects and type modification restrictions at each induced update point, and computed at how many of them we could safely apply the update.
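To make the seeding concrete, here is a hypothetical fragment in the style of Vsftpd's command-processing loop (all names and the surrounding structure are invented for illustration, reusing the symset_t sketch above): Ginseng places an induced update point just before the return of each function reachable from within the transaction, alongside the manually placed point at the end of the loop iteration.

    /* Hypothetical illustration only; not Vsftpd's actual code. The
       symset_t sets stand for the per-point effects and capabilities that
       Ginseng pre-computes at compile time (names invented here). */
    typedef struct symset symset_t;

    extern symset_t alpha_17, omega_17, D_17;     /* for the induced point */
    extern symset_t alpha_end, omega_end, D_end;  /* for the manual point  */
    extern void DSU_induced_update(const symset_t *alpha,
                                   const symset_t *omega,
                                   const symset_t *D);

    struct session;                               /* opaque per-client state */
    extern int process_command(struct session *sess);

    static int handle_one_command(struct session *sess)
    {
        int rc = process_command(sess);
        /* Induced update point: seeded just before the return of a
           function reachable from within the transaction. */
        DSU_induced_update(&alpha_17, &omega_17, &D_17);
        return rc;
    }

    void command_loop(struct session *sess)       /* designated transaction */
    {
        for (;;) {
            if (handle_one_command(sess) < 0)
                break;
            /* Original, manually placed update point at iteration end. */
            DSU_induced_update(&alpha_end, &omega_end, &D_end);
        }
    }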
We conducted our experiments on an Athlon 64 X2 dual-core 4600 machine with 4GB of RAM, running Debian with Linux kernel version 2.6.18. Figure 5.9 summarizes our results. For each version, we list its size, the time Ginseng takes to pre-compute contextual effects and type modification restrictions, and the number of candidate update points that were automatically inserted. The analysis takes around 10 minutes for the largest example, and we expect that this time could be reduced with more engineering effort. The last two columns indicate how many update points are type safe, and how many are both type safe and version consistent, with respect to the update from the version in that row to the next version. Note that determining whether an update is type safe and version consistent is very fast, so we do not report the time for that computation.

    Version     Size (LOC)  Time (sec)  Candidate upd. points  Type-safe  VC-safe
    1.1.0       10,157      193         344                    300        33
    1.1.1       10,245      196         346                    19         9
    1.1.2       10,540      234         350                    25         8
    1.1.3       10,723      238         354                    19         8
    1.2.0       12,027      326         413                    31         9
    1.2.1       12,662      264         438                    368        146
    1.2.2       12,691      278         439                    32         9
    2.0.0       13,465      440         471                    392        9
    2.0.1       13,478      420         471                    459        9
    2.0.2pre2   13,531      632         471                    471        9
    2.0.2pre3   14,712      686         484                    484        8
    2.0.2       17,386      649         471                    468        9

Figure 5.9: Version consistency analysis results.

From the table, we can see that several induced update points are type safe and version consistent. We manually examined all of these points. For all program versions except 1.1.0, 1.2.1, and 2.0.2pre2, we found that roughly one-third of the VC-safe induced update points occur somewhere in the middle of a transaction, providing better potential update availability. Another third occur close to or just before the end of a transaction, and the last third occur in dead code, providing no advantage. For the remaining versions, 1.1.0, 1.2.1, and 2.0.2pre2, we found that roughly 10% of the induced update points are in the middle of transactions, and almost all the remaining ones are close to the end of a transaction, with a few more in dead code.

One reason so many safe induced update points tend to occur toward the end of the transaction is the field-insensitivity of the alias analysis we used. In Vsftpd, the type vsf_session contains a multitude of fields and is used pervasively throughout the code. The field-insensitive analysis causes spurious conflicts when one field is accessed early in the transaction but others are accessed later on, as is typical. This pushes the safe induced update points to the end of the transaction, following vsf_session's last use. We plan to integrate a field-sensitive alias analysis into Ginseng to remedy this problem (the sketch at the end of this subsection illustrates the issue).

Interestingly, there are generally far more updates that are exclusively type safe than updates that are both type safe and version consistent. We investigated some of these, and found that the reasons varied with the update. For example, the updates that do not change vsf_session (e.g., 1.1.0) have a high number of type-safe update points, while those that do (e.g., 1.1.1) have far fewer. This makes sense, given vsf_session's frequent use.

In summary, these results show that many induced update points are both type safe and version consistent, providing greater availability of updates than manual placement alone. We expect still more update availability with a more accurate alias analysis.
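The following hypothetical fragment, loosely modeled on how vsf_session is used, shows why field-insensitivity hurts update availability: all fields of the structure collapse onto one abstract location, so an early field access appears to conflict with every later one.

    /* Hypothetical fragment, loosely modeled on Vsftpd's vsf_session use. */
    struct vsf_session {
        int   data_fd;
        long  bytes_transferred;
        char *user;
        /* ... many more fields in the real structure ... */
    };

    extern int  open_data_connection(struct vsf_session *s);
    extern long send_file(struct vsf_session *s);

    void one_transfer(struct vsf_session *s)   /* one transaction iteration */
    {
        s->data_fd = open_data_connection(s);  /* early access to one field */
        /* A field-insensitive analysis gives every field of *s the same
           abstract location, so an update available here appears to
           conflict with the access below even though different fields
           are involved... */
        s->bytes_transferred += send_file(s);  /* late access, another field */
        /* ...pushing the first safe induced update point to this point,
           after the structure's last use. */
    }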
5.4 Relaxed Updates

The barrier approach for updating multi-threaded programs (Section 4.2.1) performs a synchronous safety check at update points: the contextual effects (α,ω) of a thread are computed statically, and a runtime check verifies whether they conflict with the update contents u. To avoid barrier synchronization, in Section 4.2.2 we proposed a relaxed approach in which threads periodically check in their effects. We call these relaxed updates, because an update does not necessarily take place at an update point. When an update becomes available, the runtime system does not have to barrier-synchronize to do the safety check; all it has to do is inspect the checked-in effects of each thread. To ensure update safety, we need a means to prove that check-ins are safe, i.e., that at runtime they over-approximate the actual thread effects.

Definitions
    Definitions       d   ::= main e | var g = v in d | fun f(x) = e in d
    Expressions       e   ::= v | x | let x = e in e | e e | if0 e then e else e
                            | ref e | !e | e := e | tx e | checkin^{α,ω}
    Values            v   ::= n | z
    Effects           α, ε, ω ::= ∅ | 1 | {z} | ε ∪ ε
    Cntxt. effects    Φ   ::= [α; ε; ω; ε^i; ε^o]
    Global symbols    f, g, z ∈ GSym
    Dynamic updates   upd ::= {chg, add}
    Additions         add ∈ GSym ⇀ (τ, b)
    Changes           chg ∈ GSym ⇀ (τ, b)
    Bindings          b   ::= v | λx.e

Figure 5.10: Source language for relaxed updates.

5.4.1 Syntax

The language syntax is presented in Figure 5.10; it extends the calculus in Figure 5.4 with support for check-ins (all additions and changes are highlighted). We only discuss these additions. Check-ins checkin^{α,ω} "snapshot" a thread's prior (α) and future (ω) effects. To account for the effects of code blocks between check-in points, we use two extra components in the type-level contextual effects. The contextual effect Φ of a term e is [α;ε;ω;ε^i;ε^o], where α, ε, and ω are the prior, normal, and future effects, as described in Section 5.3. The output check-in effect ε^o is the effect of the program after evaluating e, up to the next check-in point. The input check-in effect ε^i contains the effect from the start of e's evaluation up to the first check-in term in e, or in e's continuation. The relation ε^i = ε^o ∪ ε holds for terms that do not contain check-ins.

5.4.2 Typing

We present the relevant type rules in Figure 5.11. The rules not shown are identical to the ones in Figure 5.5, and we highlight the differences in the ones we present. The fundamental rule for contextual effects is (XFlow-Ctxt): if an expression e consists of two sub-expressions e1 and e2, where e1 is evaluated first, (XFlow-Ctxt) captures this by adding e1's normal effect (ε1) to e2's prior effect, and adding e2's normal effect (ε2) to e1's future effect; the combined effect Φ is a tuple whose prior effect is e1's prior effect, whose normal effect is the union of the normal effects of e1 and e2, and whose future effect is the future effect of e2. Note how e1's output check-in effect, ε^o_1, is the same as e2's input check-in effect ε^i_2; this ensures proper chaining of check-in effects. The subtyping rule, (TSub), allows an expression e whose contextual effect is Φ′ to be type-checked in a context Φ if Φ′ ≤ Φ; intuitively, this means that the prior and future components of Φ make fewer assumptions about the prior and future computations, but we can pretend that e's normal effect is larger. The rules for dereference (TDeref) and assignment (TAssign) ensure proper check-in effect chaining from Φ1 to Φ2 via the premise Φ2^{εi} = Φ2^{εo} ∪ ε.
The application rule, (TApp), pushes the prior and future effects of the caller into the callee, and adds the effect of the body of the callee to the effect of the call. The transaction rule, (TTransact), pushes Φ, the effect of the transaction's enclosing scope, into the transaction body. The rationale for this is to disallow an update in an inner transaction that could potentially break version consistency for an outer transaction.

(TCheckin) is the key rule that makes the static contextual effects available to the runtime system. Note how the runtime prior effect α′ contains not only the effects of the program so far, α, but also the effects of the evaluation up to the next check-in point, ε^o. This is necessary because we do not have control over when an update is applied in the interval from the current check-in point up to the next check-in point. The runtime future effect ω′ is a safe approximation of the type-level future effect ω. We provide an example of how the check-in effects work in practice in Section 4.2.2.

(XFlow-Ctxt)
    Φ1 = [α1; ε1; (ε2 ∪ ω2); ε^i_1; ε^i_2]
    Φ2 = [(α1 ∪ ε1); ε2; ω2; ε^i_2; ε^o_2]
    Φ  = [α1; (ε1 ∪ ε2); ω2; ε^i_1; ε^o_2]
    ⟹ Φ1 ▷ Φ2 ↪ Φ

(SCtxt)
    α2 ⊆ α1    ε1 ⊆ ε2    ω2 ⊆ ω1    ε^i_1 ⊆ ε^i_2    ε^o_2 ⊆ ε^o_1
    ⟹ [α1;ε1;ω1;ε^i_1;ε^o_1] ≤ [α2;ε2;ω2;ε^i_2;ε^o_2]

(TSub)
    Φ′;Γ ⊢ e : τ′    τ′ ≤ τ    Φ′ ≤ Φ
    ⟹ Φ;Γ ⊢ e : τ

(TDeref)
    Φ1;Γ ⊢ e : ref^ε τ    Φ2^ε = ε    Φ2^{εi} = Φ2^{εo} ∪ ε    Φ1 ▷ Φ2 ↪ Φ
    ⟹ Φ;Γ ⊢ !e : τ

(TAssign)
    Φ1;Γ ⊢ e1 : ref^ε τ    Φ2;Γ ⊢ e2 : τ    Φ3^ε = ε    Φ3^{εi} = Φ3^{εo} ∪ ε    Φ1 ▷ Φ2 ▷ Φ3 ↪ Φ
    ⟹ Φ;Γ ⊢ e1 := e2 : τ

(TApp)
    Φ1;Γ ⊢ e1 : τ1 →^{Φf} τ2    Φ2;Γ ⊢ e2 : τ1    Φ1 ▷ Φ2 ▷ Φ3 ↪ Φ
    Φ3^ε = Φf^ε    Φ3^α ⊆ Φf^α    Φ3^ω ⊆ Φf^ω    Φ3^{εi} = Φ3^{εo} ∪ Φf^{εi}    Φ3^{εo} ⊆ Φf^{εo}
    ⟹ Φ;Γ ⊢ e1 e2 : τ2

(TTransact)
    Φ1;Γ ⊢ e : τ    Φ^α ⊆ Φ1^α    Φ^ω ⊆ Φ1^ω
    ⟹ Φ;Γ ⊢ tx^{(Φ1^α ∪ Φ1^{εi}, Φ1^ω ∪ Φ1^ε)} e : τ

(TCheckin)
    α ∪ ε^o ⊆ α′    ω ⊆ ω′
    ⟹ [α; ε; ω; ∅; ε^o]; Γ ⊢ checkin^{α′,ω′} : int

Figure 5.11: Selected type rules for relaxed updates.

5.4.3 Operational Semantics

Our operational semantics extends the Proteus-tx semantics from Section 5.3.3 with support for check-ins; additions and changes are highlighted. The evaluation rules (Figure 5.12) are transitions between configurations of the form ⟨n;σ;H;e⟩ →^{ev} ⟨n′;σ′;H′;e′⟩, where n is a global program version, σ is a transaction stack, and H is the heap. A stack element has the form (n′,θ,ρ), where n′ is the program version associated with the transaction, the trace θ contains the bindings accessed so far in the transaction, and ρ is a runtime restriction, a pair of effects (ᾱ,ω̄). When a function is called or a global variable is dereferenced [gvar-deref], the name is added to the trace. When we start a transaction [tx-start], we push a new element onto the transaction stack, with an empty trace and an initial restriction ρ, and we mark the fact that we are evaluating inside a transaction using an intx marker. Reductions inside a transaction can proceed normally [tx-cong-2], or perform an update [tx-cong-1], in which case we update the stack using the U[] function. If the expression in the transaction body has been reduced to a value, we can exit the transaction via the [tx-end] rule. Updates [update] can only take place if the runtime restriction ρ does not conflict with the update contents upd. The [checkin] rule sets a new runtime restriction ρ.
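At the implementation level, a check-in can be as simple as publishing the statically pre-computed effect pair into a per-thread slot that the updater reads asynchronously. The sketch below is a minimal illustration under assumed runtime structures (the slot layout and function names are invented here, and symset_t is the hypothetical effect representation from the earlier sketch); Ginseng's actual mechanism is described in Section 4.2.2.

    #include <pthread.h>

    typedef struct symset symset_t;           /* set of global symbols */

    /* Hypothetical per-thread slot holding the last checked-in effects;
       the mutex is assumed to be initialized when the slot is created. */
    struct checkin_slot {
        pthread_mutex_t lock;
        const symset_t *prior;    /* over-approximates execution so far,
                                     up to the *next* check-in point    */
        const symset_t *future;   /* over-approximates the rest of the
                                     thread's execution                 */
    };

    extern struct checkin_slot *my_slot(void);  /* this thread's slot */

    /* Called at compiler-chosen check-in points with statically
       pre-computed effect sets. */
    void DSU_checkin(const symset_t *prior, const symset_t *future)
    {
        struct checkin_slot *s = my_slot();
        pthread_mutex_lock(&s->lock);
        s->prior  = prior;
        s->future = future;
        pthread_mutex_unlock(&s->lock);
    }

    /* The updater never barrier-synchronizes: it simply reads every
       thread's slot and tests the checked-in effects for conflicts
       with the update's contents. */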
The definitions of the update safety check, heap updates, and stack updates are presented in Figure 5.13; they are straightforward extensions of the definitions used for synchronous updates (see Figure 5.7 and Section 5.3.3).

5.4.4 Soundness

The goal of our formal system is to prove that relaxed updates are version consistent. We first need to introduce some auxiliary definitions (Figure 5.14). Configuration typing and (TIntrans) are the same as those used in Section 5.3.3. The definition of a well-formed transaction stack, R;H ⊢ σ, differs from the one used in the synchronous case in order to account for ρ, the runtime approximation of the contextual effects. (TC1) shows the well-formedness condition for one stack element. The first premise is unchanged; it ensures that each element in the trace θ is included in the corresponding prior effect α. The second premise ensures that elements in each transaction's current effect (i.e., the part yet to be executed, up to the first check-in) have the version of that transaction: f ∈ (ω ∪ ε^i) ⇒ n ∈ ver(H,f). The third premise ensures that the runtime approximation of the prior effect, ᾱ, covers the prior execution and the rest of the execution up to the next check-in point. It is necessary that ᾱ include not only the past execution but part of the future execution as well, because we do not know exactly where an update will be applied in the evaluation from the current redex to the next check-in point. The fourth premise ensures that the runtime approximation of the future effect, ω̄, covers the current term and its continuation.

Definitions
    Restrictions        ρ ::= (ᾱ, ω̄)
    Transaction stacks  σ ::= ∅ | (n,θ,ρ), σ
    Expressions         e ::= ... | r | tx^ρ e | intx e | checkin^ρ
    Heaps, version sets, traces, events, update directions and bundles,
    compilation, and evaluation contexts are as in Figure 5.6.

Computation
    The rules [let], [ref], [deref], [assign], [if-t], [if-f], [cong],
    [gvar-deref], [gvar-assign], [call], [tx-cong-2], and [tx-end] are as in
    Figure 5.6, with stack elements of the form (n′,θ,ρ). The rules that
    change or are new:

    [tx-start]   ⟨n;(n′,θ,ρ′);H; tx^ρ e⟩ →^∅ ⟨n;(n′,θ,ρ′),(n,∅,ρ);H; intx e⟩
    [tx-cong-1]  ⟨n;(n″,θ,ρ′),σ;H; intx e⟩ →^μ ⟨n′; U[(n″,θ,ρ′)]^μ_{n′}, σ′;H′; intx e′⟩
                     if ⟨n;σ;H; e⟩ →^μ ⟨n′;σ′;H′; e′⟩
    [update]     ⟨n;(n′,θ,ρ);H; e⟩ →^{(upd,dir)} ⟨n+1; U[(n′,θ,ρ)]^{upd,dir}_{n+1}; U[H]^{upd}_{n+1}; e⟩
                     if updateOK(upd, H, (ᾱ,ω̄), dir), where ρ = (ᾱ,ω̄)
    [checkin]    ⟨n;(n′,θ,ρ′);H; checkin^ρ⟩ →^∅ ⟨n;(n′,θ,ρ);H; 1⟩

Figure 5.12: Relaxed updates: operational semantics.
With this we can prove the core result:

Theorem 5.4.1 (Single-step Soundness). If Φ;Γ ⊢ e : τ, where ⟦Φ;Γ ⊢ e : τ⟧ = R, and n;Γ ⊢ H, and Φ,R;H ⊢ σ, and traceOK(σ), then either e is a value, or there exist n′, H′, σ′, e′, and an event ev such that ⟨n;σ;H;e⟩ →^{ev} ⟨n′;σ′;H′;e′⟩ and Φ′;Γ′ ⊢ e′ : τ, where ⟦Φ′;Γ′ ⊢ e′ : τ⟧ = R′, and n′;Γ′ ⊢ H′, and Φ′,R′;H′ ⊢ σ′, and traceOK(σ′), for some Φ′, Γ′, R′.

The proof is based on progress and preservation lemmas, as is standard. Details are in Appendix C.

5.5 Multi-threaded Version Consistency

The formal system presented in Section 5.4 ensures version consistency for relaxed updates to single-threaded programs. In this section, we extend the formalism so we can prove version consistency for multi-threaded programs. This is essential for ensuring safety in our multi-threaded Ginseng implementation (Chapter 4). Our multi-threaded calculus extends the calculus presented in Section 5.4 with the ability to model multiple threads. Again, we only present newly introduced rules, or rules that differ from the single-threaded calculus.

5.5.1 Syntax

The additions to the syntax are presented in Figure 5.15. In this calculus, each thread has its own transaction stack σ and its own evaluation context e. We use T to denote the sequence of all threads' evaluation contexts, as a sequence of pairs (σ,e). The new expression fork^ρ e models spawning a new thread: the current (parent) thread forks a new child thread whose body is e, where ρ is the child thread's initial runtime restriction.

5.5.2 Typing

The only new type rule is (TFork), the rule for spawning a thread (Figure 5.16). The child thread's initial restriction has two components. The prior effect, α′ ∪ ε^{i′}, ensures that an update in the child thread takes into account the prior execution in the parent (α′) and the effect of the evaluation up to the first check-in point in the child (ε^{i′}). The future effect consists solely of the effect of the evaluation in the child, ε′; this models the "exit" after the child thread has finished execution.

5.5.3 Operational Semantics

Configuration typing (Figure 5.17) is the same as the single-threaded configuration typing of Figure 5.8, n;Γ ⊢ H. This is because the heap H is shared among threads, there is only one global version n, and we require well-typing for each thread.
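Operationally, (TFork)'s initial restriction corresponds to seeding the child thread's check-in state before its body runs, so that an update arriving immediately after the fork is still checked against a sound approximation of the child's effects. The following sketch (a hypothetical wrapper, reusing the check-in API assumed earlier) illustrates the idea in C:

    #include <pthread.h>

    typedef struct symset symset_t;

    extern void DSU_checkin(const symset_t *prior, const symset_t *future);

    struct fork_args {
        void *(*body)(void *);
        void *arg;
        const symset_t *init_prior;   /* alpha' U epsilon^i' from (TFork) */
        const symset_t *init_future;  /* epsilon', the child's whole body */
    };

    static void *thread_trampoline(void *p)
    {
        struct fork_args *fa = p;
        /* Publish the child's initial restriction before running its
           body, so an update arriving immediately is checked against it. */
        DSU_checkin(fa->init_prior, fa->init_future);
        return fa->body(fa->arg);
    }

    int DSU_fork(pthread_t *tid, struct fork_args *fa)
    {
        return pthread_create(tid, NULL, thread_trampoline, fa);
    }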
The multi-threaded semantics (Figure 5.18) models an abstract machine that non-deterministically chooses a thread and lets that thread take a step; this is similar to multi-threading formal models used by others [36, 46]. Depending on the expression in redex position, the step taken is either one from the single-threaded semantics, or a concurrency-specific step such as thread creation ([fork]) or thread exit ([return]). The evaluation rules consist of transitions of form: ?n;H;T? ?(?,j) ?nprime;Hprime;Tprime? where T is the sequence of thread contexts, j is the thread taking a step, and ? is the effect of the evaluation. There is a single global version n and a single heap H, just like in the single-threaded semantics. Rule [fork] spawns a new thread by pushing a new thread context onto T; note that the new thread?s restriction comes from the (TFork) typing rule. Rule [return] does just the opposite: when a thread has finished evaluation (i.e., its evaluation context is a single value), it is removed from the context sequence. Rule [mt-update] states that we can perform a multi-threaded update if each thread can take an [update] step (Figure 5.12); this implies that no thread can conflict with the update. The crucial aspect of setting the [update] and [mt-update] rules this way is that the system can perform an asynchronous update, at any point, without the need to have an update in redex position (compare these rules with the synchronous [update] rule in Figure 5.6). 164 5.5.4 Soundness The goal of our formal system is to prove that updating a multi-threaded pro- gram preserves version consistency. In our system, a program is version-consistent if each thread is version-consistent. Albeit simple, this definition of version consis- tency essentially says that transactions in different threads are unrelated, though this might be considered too lax. With this we can prove the core result: Theorem 5.5.1 (Multi-threadingSoundness). LetT = (?1,e1).(?2,e2)...(?|T |,e|T |). Suppose we have the following: 1. nturnstileleftH,T 2. ?i?1..|T|. ?i,Ri;Hturnstileleft?i 3. ?i?1..|T|. traceOK(?i) Then for all ?i such that ?i,Ri;H turnstileleft?i, and traceOK(?i), either ei is a value, or there exist nprime,Hprime,Tprime,j,nprime,?prime such that: n;H;T ?(?,j) nprime;Hprime;Tprime Tprime = (?prime1,eprime1).(?prime2,eprime2)...(?prime|T |,eprime|T |) and we have: 1. nprimeturnstileleftHprime,Tprime (such that nprime;?primeturnstileleftHprime and?i?1..|T|prime. ?primei;?primeturnstilelefteprimei : ? a59Rprimei ?prime?? and some ?primei such that ? ?primei = ?i, if inegationslash= j 165 ? ?primei?[??i ??0;?primei;??i ;??ii ;??oi ], ?primei??0???i, if i = j 2. ?i?1..|T|prime. ?primei,Rprimei;Hprimeturnstileleft?primei 3. ?i?1..|T|prime. traceOK(?primei) The proof is based on progress and preservation lemmas, as is standard. De- tails are in Appendix D. 5.6 Conclusion This chapter first introduces contextual effects, which extend standard effect systems to capture the effect of the context in which each subexpression appears. Then, we present transactional version consistency, a new correctness condition for dynamic software updates. Finally, we present two formalisms based on contextual effects (relaxed updates and a multi-threaded calculus) that help us safely update multi-threaded programs. 166 Up date Safet y up dateOK (up d, H, (? ,? ),dir )= dir = bck ? ?? dom( up dch g) = ? ? dir = fwd ? ?? dom( up dch g) = ? ? ?= typ es (H ) ? ?upd = ?, typ es (up dad d) ? ?z mapsto? (?, b,? )? up dch g. (? 
Update Safety
    updateOK(upd, H, (α,ω), dir) =
        (dir = bck ⇒ α ∩ dom(updchg) = ∅)
      ∧ (dir = fwd ⇒ ω ∩ dom(updchg) = ∅)
      ∧ Γ = types(H) ∧ Γupd = Γ, types(updadd)
      ∧ ∀z ↦ (τ,b,ν) ∈ updchg. (Φ∅; Γupd ⊢ b : τ ∧ heapType(τ,z) = Γ(z))
      ∧ ∀z ↦ (τ,b,ν) ∈ updadd. (Φ∅; Γupd ⊢ b : τ ∧ z ∉ dom(H))

Trace Safety
    traceOK(n,θ,ρ) = ∀(z,ν) ∈ θ. n ∈ ν

Heap Updates and Heap Typing Environments: as in Figure 5.7.

Trace Stack Updates
    U[(n′,θ,ρ)]^{upd,fwd}_n = (n′,θ,ρ)
    U[(n′,θ,ρ)]^{upd,bck}_n = (n, Ut[θ]^{upd}_n, ρ)
    Ut[θ]^{upd}_n = {(z, ν ∪ {n}) | z ∉ dom(updchg)} ∪ {(z,ν) | z ∈ dom(updchg)}

Figure 5.13: Relaxed updates: heap typing, trace, and update safety.

(TIntrans)
    Φ1;Γ ⊢ e : τ    Φ^α ⊆ Φ1^α    Φ^ω ⊆ Φ1^ω
    ⟹ Φ;Γ ⊢ intx e : τ

Heap typing n;Γ ⊢ H: as in Figure 5.8.

(TC1)
    (f ∈ θ ⇒ f ∈ α)    (f ∈ (ω ∪ ε^i) ⇒ n ∈ ver(H,f))    ᾱ ⊇ (α ∪ ε^i)    ω̄ ⊇ (ω ∪ ε)
    ⟹ [α;ε;ω;ε^i;ε^o], ∅; H ⊢ (n,θ,(ᾱ,ω̄))

(TC2)
    Φ′,R;H ⊢ σ    (f ∈ θ ⇒ f ∈ α)    (f ∈ (ω ∪ ε^i) ⇒ n ∈ ver(H,f))    ᾱ ⊇ (α ∪ ε^i)    ω̄ ⊇ (ω ∪ ε)
    ⟹ [α;ε;ω;ε^i;ε^o], Φ′,R; H ⊢ (n,θ,(ᾱ,ω̄)), σ

    where ver(H,f) = ν iff H(f) = (τ,b,ν)

Figure 5.14: Relaxed updates: typing extensions for proving soundness.

Thread sequences  T ::= (σ,e) | (σ,e).T
Expressions       e ::= ... | fork^ρ e

Figure 5.15: Multi-threaded syntax.

(TFork)
    Φ′;Γ ⊢ e : τ ⇒ ∅    Φ′ ≡ [α′; ε′; ω′; ε^{i′}; ε^{o′}]
    ⟹ Φ;Γ ⊢ fork^{(α′ ∪ ε^{i′}, ε′)} e : int ⇒ ∅

Figure 5.16: Multi-threaded additions for expression typing.

    n;Γ ⊢ H    T = (σ1,e1).(σ2,e2)…(σ|T|,e|T|)
    ∀i ∈ 1..|T|. Φi;Γ ⊢ ei : τ ⇒ Ri, where Φi ≡ [αi;εi;ωi;ε^i_i;ε^o_i]
    ⟹ n ⊢ H;T

Figure 5.17: Multi-threaded configuration typing.

    [fork]       ⟨n;H; T1.(σi, E[fork^ρ e]).T2⟩ ⇒^{(∅,i)} ⟨n;H; T1.(σi, E[0]).((n,∅,ρ), e).T2⟩
    [return]     ⟨n;H; T1.((n″,θ″,ρ″), v).T2⟩ ⇒ ⟨n;H; T1.T2⟩
    [mt-update]  ⟨n;σi;H; ei⟩ →^{(upd,dir)} ⟨n+1; U[σi]^{upd,dir}_{n+1}; U[H]^{upd}_{n+1}; ei⟩  for each i ∈ 1..|T|
                 ⟹ ⟨n;H; (σ1,e1)…(σ|T|,e|T|)⟩ ⇒^{(upd,dir)}
                    ⟨n+1; U[H]^{upd}_{n+1}; (U[σ1]^{upd,dir}_{n+1}, e1)…(U[σ|T|]^{upd,dir}_{n+1}, e|T|)⟩
    [mt-cong]    ⟨n;σi;H; e⟩ →^{ev} ⟨n′;σ′i;H′; e′⟩
                 ⟹ ⟨n;H; T1.(σi, E[e]).T2⟩ ⇒^{(ev,i)} ⟨n′;H′; T1.(σ′i, E[e′]).T2⟩

Figure 5.18: Multi-threaded operational semantics rules.

Chapter 6

Related Work

Over the past thirty years, a variety of approaches have been proposed for dynamically updating running software. In this chapter we compare our approach with a few past systems, focusing on differences in functionality, safety, and updating model.
6.1 Dynamic Software Updating

6.1.1 Update Support

A large number of compiler- or library-based systems have been developed for C [42, 47, 20, 6], C++ [52, 60], Java [17, 86, 31, 70], and functional languages like ML [32, 43] and Erlang [8]. Many do not support all of the changes needed to make dynamic updates in practice. For example, updates cannot change type definitions or function prototypes [86, 31, 52, 60, 6], or may only make such changes for abstract types or encapsulated objects [60, 43]. In many cases, updates to active code (e.g., long-running loops) are disallowed [43, 70, 42, 47, 60], and data stored in local variables may not be transformed [50, 47, 42, 52]. Some approaches are intentionally less full-featured, targeting "fix and continue" development [58, 44] or dynamic instrumentation [20]. On the other hand, Erlang [8] and the system of Boyapati et al. [17] are both quite flexible, and have been used to build and upgrade significant applications.

Many systems employ the notion of a type or state transformer, as we do. Boyapati et al. [17] improve on our interface by letting one type transformer look at the old representation of an encapsulated object, which allows both the parent and the child to be transformed at once. In our setting, the child always has to be transformed independently of the parent, which can make writing transformers more complicated or impossible (e.g., if a field was moved from a child object into the parent), though we have not run into this problem as yet. Duggan [32] also proposes lazy dynamic updates to types using type transformers, with fold/unfold primitives similar to our conT/absT. Ours is the first work to explore the implementation of such primitives.

The most similar system to ours is DLpop, Hicks's work on providing dynamic updating in a type-safe C-like language called Popcorn [50]. While that system is fairly flexible, this work makes three substantial improvements. First, DLpop could not transform data in local variables, could not automatically update function pointers, and had no support for updating long-running loops. We have found all of these features to be important in the server programs we studied, and they are part of our current work. Second, while DLpop ensured that all updates were type-safe, it did not ensure they were representation-consistent (Section 3.3), as it permitted multiple versions of a type to coexist in the running program. In particular, when a type definition changed, DLpop required making a copy of existing data having the old type, opening the possibility that old code could operate on stale data. Finally, DLpop was evaluated on only a single program (a port of the Flash web server, about 8,000 LOC), and all updates to it were crafted by the author, rather than being official releases.

6.1.2 Correctness of Dynamic Software Updating

Several systems for on-line updates have been proposed. In this section we focus on how prior work controls update timing to assure that an update's effects are correct. Recall that in our approach, the programmer can control update timing by placement of update points (Section 3.3.5), and Ginseng adds constraints on the types that can change at an update point, to preserve type safety.

Most DSU systems disallow updates to code that is active, i.e., actually running or referenced by the call stack. The simplest approach to updating active code, taken by several recent systems [43, 69, 23], is to passively wait for it to become inactive.
This can be problematic for multi-threaded programs, since there is a greater possibility that active threads reference a to-be-updated object. To address this problem, the K42 operating system [60, 103, 12, 11] employs a quiescence protocol. Once an update for an object is requested, an adapter object is interposed to block subsequent callers of the object. Once the active threads have exited, the object is upgraded and the blocked callers are resumed. The danger is that dependencies between updated objects could result in a deadlock. While applying updates based on code inactivity is useful, inactiveness is not sufficient for ensuring correctness: update timings allowed by this approach can result in incorrect combinations of old and new behavior. In particular, version consistency may require delaying an update if to-be-updated objects are not currently active but were active earlier during the transaction.

Lee [63] proposed a generalization of the quiescence condition by allowing programmers to specify timing constraints on when elements of an update should occur; recent work [24] is similar. As an example, the condition "update P, Q when P, M, S idle" specifies that procedures P and Q should be updated only when procedures P, M, and S are not active. Lee provides some guidance for using these conditions. For example, if procedure P's type has changed, then an update to it and its callers should occur when all are inactive. Our work relies on programmer-designated transactions (which are higher-level and arguably easier to identify and specify), and uses program analysis to discover conditions that enforce transactional version consistency.

Ginseng's updatability analysis (Section 3.3) gathers the type constraints imposed by the active (old) code at each program point, and only allows an update to take place if it satisfies the constraints. This is more fine-grained than Lee's constraints: if the type of a function changes, we can update it even when its callers are active, so long as they will not call the updated function directly. Our current work is complementary, as a type-safe update is not necessarily version-consistent, and, depending on how transactions are specified, the reverse may also be true.

Our use of transactions to ensure version consistency resembles work by Boyapati et al. [17] on lazily upgrading objects in a persistent object store (POS). Using a type system that guarantees object encapsulation, their system ensures that an object's transformation function, used to initialize the state of a new version based on old state, sees only objects of the old version, which is similar to our version consistency property. How updates interact with application-level transactions is less clear to us. The assumption seems to be that updates to objects are largely semantically independent, so there is less concern about version-related dependencies between objects within a transaction.

6.1.3 Multi-threaded Systems

Several existing systems permit updates to multi-threaded programs. They tend to be either less flexible than Ginseng, or, if flexible, they provide no automatic safety support, leaving the problem entirely to the programmer. First, there are systems that do not permit updates to currently-running functions and rely on the activeness check for safety, such as the K42 operating system [103, 12, 11], OPUS [6], or Ksplice [1].
The problem with these approaches is two-fold: (1) the activeness check does not preclude badly timed updates (see Section 4.5.1), and (2) updating long-running multi-threaded programs requires updates to active functions; our approach permits updates to running code using code extraction (Section 3.2.4).

The second category is systems that do not employ the activeness check, such as Lucos [23], Polus [24], and UpStare [68]. Lucos and Polus employ binary rewriting in function prologues to redirect calls to the new function versions, and they permit updates to active code, but active functions continue to execute at the old version. This is problematic, as it can lead to type safety violations and precludes updates to long-running loops.

UpStare permits dynamic updates to multi-threaded C programs using a technique called stack reconstruction: an update can be applied at any point in any thread, and the stack state for all threads is reconstructed according to the new stack layout in the new program version. This resembles our code extraction technique for supporting sub-function-level updates to code, but UpStare's technique is more general, since it does not require explicit annotations for the code to be extracted. Albeit very flexible, this approach permits updates at points where the global state is likely not consistent, e.g., in the middle of a loop, without making the update appear at the beginning or end of the iteration. Another disadvantage of UpStare is that threads need to cooperate with the update coordinator thread; in other words, to apply an update, all threads but one must be blocked, and detecting whether threads are blocked is difficult. Our system uses check-ins and does not have this requirement, as the runtime system always keeps a safe approximation of each thread's effects; if a thread does not conflict with the update, it can continue executing while the update is applied.

6.1.4 Operating Systems

Chen et al. [23] have developed LUCOS, an approach to updating Linux using Xen-based virtualization. Xen is used for hardware decoupling, as well as for update management (initiation, rollback). Both detecting changes between versions and constructing updates are completely manual. To update functions, stack-walking and binary rewriting find and fix all references to old functions, redirecting them to the new versions. To update data, paging is set up to detect when an access to an old type value occurs, and, upon such an access, a transformer function converts the old type value to a new type value. Although the authors present some realistic updates from the 2.6.10 to 2.6.11 patch, the approach has some room for improvement. First, LUCOS does not require quiescence for an update to be applied, but this can be problematic: the update could be applied while updated type values are (or will be) used concretely or have live pointers into them, while updated functions are still on the stack, or while function signatures change. These issues can all lead to type safety violations. Manual patch construction (finding instances of changed data, finding all the functions that changed) is also unlikely to scale, especially in a system as large as the Linux kernel.

Baumann et al. [103, 10, 12] have worked on implementing dynamic updates in K42, an operating system developed at IBM Research. K42 is almost entirely object-oriented, and is written in C++.
In K42, dynamic updates are performed at the class level: since all the code is encapsulated behind class interfaces, a dynamic update to code or data consists of updates to one or more classes. All classes that might be subject to dynamic updating have to provide state-import and state-export methods; upon update, the new version imports the old version's exported state. This is similar to our type transformers, but in K42 the import and export methods are written manually, whereas in Ginseng type transformers are generated mostly automatically. Quiescence detection [103] is based on a mechanism similar to RCU [72]. The update is split into two phases: first, while existing calls finish, all incoming calls are tracked; second, after all untracked calls have finished, incoming calls are blocked while the system waits for the tracked calls to finish. At the end of the second phase the object is quiescent and can be dynamically updated. All objects of a certain type are accounted for using factories. Object updates can be performed either lazily or at update time. Changes to class interfaces are supported via adapters: to avoid changing callers when the interface of an object changes, an adapter exposes the old class interface to objects using the old interface. Encapsulation makes the task of updating K42 easier than updating C code: ADTs are explicit, so types and the functions operating on those types change together, and because all accesses go via indirection (OTT), abstraction-violating pointers are ruled out.

Ksplice [1] is a dynamic update approach for updating the Linux kernel. It supports changes to functions only (no changes to types or function signatures) by loading the new functions as a kernel module and inserting trampolines at the entry points of the old functions to redirect callers to the new function versions. Ksplice uses the activeness check to allow a function update only if the changed function does not appear on any thread's stack, and it exhibits the deficiencies of activeness-check-based DSU systems.

6.1.5 DSU in Languages Other Than C

DVM [70] and DUSC [86] add dynamic updating support to Java; both systems preserve update type safety by requiring that a class update keep the class interface unchanged. DVM changes the Java virtual machine to add type safety checks upon patch loading, and uses a split-phase mark-and-sweep algorithm that marks to-be-updated classes at update time and performs the update lazily. DUSC does not require changes to the JVM, because it uses source-to-source transformation tools (called proxy- and implementation-class builders) to redirect all accesses to classes that could potentially be updated to wrapper classes. Wrapper classes use delegation to direct each access to the most recent version of an updatable class. Upon update, the wrapper class instantiates an updated class object for each existing old instance, passing the state of the old instance as a parameter to the constructor of the new instance. Naming issues restrict the developer of updatable classes from using public or private fields directly: accesses must use methods instead. Since neither DVM nor DUSC supports changes to actively-running methods or class interfaces, the scope of possible updates is limited.

DynamicML [43] uses encapsulation to permit updates to ML modules. Updates cannot change function signatures, but the new interface can add functions, types, and values. DynamicML uses the ML garbage collector in an elegant fashion to support updates to values.
All updateable values are tagged with a runtime type, and when the generational garbage collector copies values from the from-space to the to-space, all tagged values are updated if needed. The garbage-collector approach can also help with rolling back an update: if an exception is thrown during an update, the system simply reverts to the from-space, effectively canceling any modifications performed during the update. Like K42, DynamicML uses encapsulation to permit whole-module updates without worrying about a module's implementation internals, at the expense of having to wait for the module to become quiescent before proceeding with the update. C has no modules or encapsulation boundaries, which allows us to update types, function signatures, and active code in an unrestricted fashion; however, the burden of reasoning about safety guarantees (beyond type safety) and appropriate update points falls on the programmer.

Stewart and Chakravarty [104] provide a solution for dynamic reconfiguration and dynamic updating of Haskell applications. Their approach is based on changing the software architecture so that configuration values, application state, and application code are separated; configuration and state are optional arguments to a modified application main function that is ready to accept existing state. A dynamic update or reconfiguration is then simply a matter of reinvoking main with a (possibly changed) configuration and the old state. Changes in type representations are dealt with prior to the update: before the global state is saved, the old state values are converted to the new format, so that upon reloading, the new code directly sees new data. Unlike Ginseng, this approach requires changes to the software architecture; it is also not clear to us how easy it is to separate data from code in a legacy application.

6.1.6 Edit and Continue Development

Microsoft's Visual Studio IDE allows programmers to make (and observe the effects of) small-scale changes to their applications after having started the application in the debugger. This feature, called "Edit and Continue", is available for several programming languages, such as C++, C#, and Visual Basic [74]. Though convenient, Edit and Continue only permits a limited set of changes to code and data, so this approach is suitable for incremental development rather than long-term software evolution. For example, according to the documentation, the C++ version of Edit and Continue does not support "most changes to global or static data", "changes to a data type that affect the layout of an object, such as data members of a class", or "adding more than 64k bytes of new code or data". In contrast, Ginseng permits such changes, but requires programs to be compiled specially, and offers no smooth integration with a debugger/IDE.

Java supports a limited class replacement mechanism, primarily meant to be used by debuggers, IDEs (for incremental development), and profilers [30]. The JVM Tool Interface (JVMTI [3]) is a programming interface that development and monitoring tools can use to communicate with (and control) the JVM. JVMTI provides a RedefineClasses() method that allows runtime class redefinition, but the only thing the new class can change is method bodies.
Threads having old methods on the stack continue to execute the old code, but all new invocations use the new method (JVMTI also supports a PopFrame() operation that can be used to discard currently running old methods, by effectively popping the old method's stack frame and returning to the call site). Method replacement in Java is hence similar to the approach we take in Ginseng: following an update, all calls to replaced functions go to the new versions. Although easy to implement [29], this method replacement scheme can lead to violations of version consistency.

6.1.7 Dynamic Code Patching

Dynamic code patching is a helpful technique for debugging, instrumentation, or small-scale updates, e.g., fixing a limited-scope security bug. Dyninst [20] uses binary rewriting to replace code in a process with programmer-defined code. Dyninst supports fine-grained changes to code within a function, and works on unmodified binaries. It is difficult, however, to achieve long-term code evolution using code patching alone: updating data and providing safety guarantees become necessary for non-trivial, realistic updates and multi-year changes.

6.2 Alternative (Non-DSU) Approaches to Online Updating

A typical approach to upgrading on-line systems is to use a load balancer [18, 85]. It redirects requests away from a to-be-updated application until the application is idle, at which point the application can be halted and replaced with a new version. Such approaches typically employ redundant hardware, which, while required for fault tolerance, is undesirable in some settings (e.g., upgrading a personal OS). Microvisor [65] employs a virtual-machine monitor (VMM) to follow this basic methodology on a single node. When an application or OS on a server node is to be upgraded, a second OS instance is started concurrently on the same node and upgraded. When the original instance becomes idle, applications are restarted on the new instance and the machine is devirtualized. While Microvisor avoids the need for extra hardware, it shares the same drawbacks as the load-balancing approach: applications must be stateless (so they can be stopped and restarted) or they must
It is less clear to us whether the upgrade can be totally dependency-agnostic, due to the presence of system-level state (e.g., open files, OS and application data in memory) which might need to be transferred as well. As noted above, checkpointing [92, 19] or process migration [102] can stop and restart the same version of an application but cannot support version changes, whereas DSU can change all visible state as necessary to be compatible with the new code. Indeed, one can imagine composing our approach with checkpointing to combine updating with process migration.

6.3 Software Evolution

A number of systems for identifying differences between programs have been developed. We discuss a few such systems briefly.

Yang [114] developed a system for identifying "relevant" syntactic changes between two versions of a program, filtering out irrelevant ones that would be produced by diff. Yang's solution matches parse trees (similar to ASTdiff) and can even match structurally different trees using heuristics. In contrast, ASTdiff stops at the very first node mismatch in order not to introduce spurious name or type bijections. Yang's tool cannot deal with variable renaming or type changes, and in general focuses more on finding a maximum syntactic similarity between two parse trees. We take the semantics of AST nodes into account, and distinguish between different program constructs (e.g., types, variables and functions) and the specific changes associated with them.

Horwitz [53] proposed a system for finding semantic, rather than syntactic, changes in programs. Two programs are semantically identical if the sequences of observable values they produce are the same, even if they are textually different. For example, with this approach semantics-preserving transformations such as code motion or instruction reordering would not be flagged as changes, while they would in our approach. Horwitz's algorithm runs on a limited subset of C that does not include functions, pointers, or arrays.

Binkley [14] proposes using semantic differencing to reduce the cost of regression testing: for a new version of a program, regression tests only need to be run for components whose behavior has changed. A semantic difference is defined as a difference in the program's behavior, e.g., values assigned to variables, values of boolean predicates, or return values of procedures. The target language used is a simplified one that does not contain pointers, arrays, or global variables.

Jackson and Ladd [57] developed a differencing tool that analyzes two versions of a procedure to identify changes in dependencies between formals, locals, and globals. Their approach is insensitive to local variable names, like ours, but their system performs no global analysis, does not consider type changes, and sacrifices soundness for the sake of suppressing spurious differences.

6.4 Contextual Effects

Contextual effects extend standard effect systems to capture the effect of the context in which each subexpression appears, i.e., the effect of evaluation both before and after the evaluation of the subexpression. We use contextual effects to determine the effects of past and future computation at each program point, and to enforce transactional version consistency while allowing updates to occur more frequently within programs.
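For reference in the discussion below, a contextual effect Φ = [α; ε; ω] pairs the standard effect ε with a prior effect α and a future effect ω, and the effects of two computations that run in sequence are combined with the ▷ operator. The following is a transcription, in LaTeX form, of the composition rule as used in this section (the full definition appears in Chapter 5):

    % e2's prior effect absorbs e1's standard effect, and e1's future
    % effect absorbs e2's standard effect:
    \Phi_1 \rhd \Phi_2 \hookrightarrow \Phi \iff
        \Phi^{\alpha} = \Phi_1^{\alpha},\quad
        \Phi_2^{\alpha} = \Phi_1^{\alpha} \cup \Phi_1^{\varepsilon},\quad
        \Phi^{\varepsilon} = \Phi_1^{\varepsilon} \cup \Phi_2^{\varepsilon},\quad
        \Phi_1^{\omega} = \Phi_2^{\varepsilon} \cup \Phi_2^{\omega},\quad
        \Phi^{\omega} = \Phi_2^{\omega}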
Several researchers have proposed extending standard effect systems [66, 83] to model more complex properties. One common approach is to use traces of actions as effects, rather than sets of actions. These traces can be used to check that resources are accessed in the correct order [55], to statically enforce history-based access control [100], and to check communication sequencing [83]. Skalka et al.'s trace effects include sequencing, choice, and recursion operators that precisely describe an execution history, in contrast to standard effects, which are sets of unordered events. We believe we could construct contextual versions of trace effects (and thus derive prior and future trace effects), at least for sequencing and choice, via an extension of our ▷ operator.

Nielson et al.'s communication analysis [83] uses a type and effect system to characterize the actions of a concurrent program and the temporal order among them. Their effects resemble our standard effects, but there are no prior or future effects in their system. Our prior and future effects, on the other hand, consist of sets, and do not capture order; we could imagine adding ordering to our system by using a sequencing operation on effects (;), and defining composition as (Φ1 ▷ Φ2)^ε = Φ1^ε; Φ2^ε instead of (Φ1 ▷ Φ2)^ε = Φ1^ε ∪ Φ2^ε. While these systems can model the ordering of events, they do not compute the prior or future effect at a program point. We believe we could combine trace effects with our approach to create a contextual trace effect system, which we leave for future work.

Hicks et al. [51] introduced continuation effects ε̂, which resemble the union ε ∪ ω of our standard and future effects. Judgments in this system have the form ε̂;Γ ⊢ e : τ;ε̂′, where ε̂′ describes the effect of e's continuation in the remainder of the program, and ε̂ is equivalent to ε ∪ ε̂′, where ε is the standard effect of e. The drawback of this formulation is that the standard effect ε of e cannot be recovered from ε̂, since (ε ∪ ε̂′) \ ε̂′ ≠ ε when ε ∩ ε̂′ ≠ ∅. This system also does not include prior effects.

Capabilities in Alias Types [101] and region systems like CL [110] are likewise related to our standard and future effects. A capability consists of a static approximation of the memory locations that are live in the program, and thus may be dereferenced in the current expression or in later evaluation. Because these systems assume their inputs are in continuation passing style (CPS), the effect of a continuation is equivalent to our future effects. The main differences are that we compute future effects at every program point (rather than only for continuations), that we compute prior effects, and that we do not require the input program to be CPS-converted.

Chapter 7

Future Work

In this dissertation we have shown that DSU can be practical for updating realistic applications as they are written now, and as they evolve in practice. However, the three dimensions of DSU practicality (flexibility, safety, and efficiency) could be explored further. In this chapter we lay out some possible directions for future work.

7.1 DSU for Other Categories of Applications

DSU is useful for long-running programs that do not tolerate reboots. In Figure 7.1 we present a spatio-temporal categorization of long-running applications, based on how long a history they keep, and where state is stored. In this dissertation we have focused on programs in quadrant 1, i.e., programs that keep long-lived, in-memory state.
We now discuss the other categories of applications (quadrants 2-4) from a DSU perspective, trying to answer questions such as "Does DSU provide significant benefits for this application class?" and "What are the costs of making a certain application class updateable on-the-fly?".

[Figure 7.1: Spatio-temporal characteristics of long-running applications. The quadrants are defined by history length (long/short) and state storage (memory/disk): 1 = long history, memory (operating systems, routers, SSH servers, caching web servers); 2 = short history, memory (DNS servers, non-caching web servers); 3 = short history, disk (email servers); 4 = long history, disk (DBMSs).]

1: Long history, volatile state. This is the class of applications we have focused on in this work. DSU is appealing for such programs because restarting the program causes the in-memory state to be lost, which causes disruption at the client, or performance degradation at the server. For example, restarting OpenSSHd shuts down all the long-lived SSH connections, which is disruptive for the clients, while restarting Icecast interrupts live streaming. Restarting Zebra causes all the routing information the server has accumulated to be lost; upon restart, the BGP, RIP and OSPF routing daemons using Zebra will have to take time to re-learn routes. Restarting Memcached will cause the underlying HTTP server to query the database for each web page, which degrades performance while the new instance of Memcached fills up its cache.

2: Short history, volatile state. Applications falling into this category are servers with short-lived connections/requests, e.g., NTP servers or non-caching web servers. The standard approaches to updating web servers are 1) denying access to new clients while the pending requests are completed, followed by update installation and restart at the new version, or 2) taking down a certain percentage of servers, updating them, and redirecting new requests away from the old web servers to an updated server running the new version. These stop/restart approaches have certain disadvantages. Shutting down the server while requests are pending is undesirable, especially for e-commerce applications. Waiting for pending requests to complete might have security implications, e.g., if the update fixes a security bug that might be triggered by a pending request. Redirecting new connections to a spare web server requires spare resources. All these issues can make DSU support attractive for web servers. However, if the system is already provisioned with spare hardware or spare virtual machines for fault tolerance, then upgrades based on redirecting new requests away from the old web server to a server running the new version make sense.

3: Short history, persistent state. An example of such an application is an email server such as Sendmail. The state of an email server (email messages) is persistent, but is under the server's control for only a short period, i.e., while the server is accepting the e-mail and delivering it to other servers or to users' mailboxes. The server's history (emails accepted and delivered in the past) is not significant to the current system state. Shutting down and restarting the server does not cause significant client disruption or slow operation during a warm-up phase. Therefore, DSU is a less compelling proposition for such short-history, persistent-state applications.¹

4: Long history, persistent state. Database systems are canonical examples of applications with long history and persistent state. Schemas for long-running databases need to evolve, but the current approach is tedious.
It involves changing the application to use the new schema, shutting down the database, creating a new table with the new schema, writing code that populates the new table and converts the old elements in the process, and finally restarting the system. ACID semantics guarantee that shutting down a server will not lead to corrupted state. However, being able to perform on-the-fly changes to a database schema without shutting down the DBMS has several advantages [27]: the process presented above is time consuming and mostly manual, the database is unavailable during the operation, and application soft state (e.g., caches) is lost.

Ultimately, application users have to decide whether the cost of running an updateable (DSU) application with a slight overhead outweighs the cost of having to shut down and restart the application, and bear the associated disruption or warm-up costs. In an enterprise where redundant hardware is used for fault tolerance anyway, the redirect/reboot approach might be less expensive than the cost (hardware resources, power consumption) incurred because of the DSU performance overhead. On the other hand, for low-cost systems (e.g., home routers, PCs, laptops, cell phones), the cost of extra hardware might be prohibitive, and users may prefer to pay a performance penalty in exchange for the convenience of not having to reboot.

¹ Incidentally, SMTP is designed with support for high availability using mail exchangers. Exchangers act as backup servers that can accept email messages when the primary server is down, and deliver them to recipients later, when the main server comes back up. This scheme makes SMTP even more tolerant of server reboots.

7.2 DSU for Operating Systems

It has become commonplace for OS vendors to release their patches online; end-users download a patch, install it, and restart the OS so that the patch can take effect. Experts suggest that a high release frequency is needed to reduce the vulnerability of end-user systems, but increased patch release frequency is burdensome for patch recipients. Users and administrators are slow to apply OS patches because patches are disruptive and might introduce new bugs [97]. The disruption stems from the OS's position at the bottom of the software stack: upon restart, all transient state is lost. The lack of confidence in the latest patch is due in part to difficulties in expressing and validating update safety.

Despite a lot of progress toward on-the-fly OS updates, updating a commodity OS kernel (e.g., moving to the next Linux kernel version) still requires rebooting. The main reason is that OS code is low-level, sizable, complex, highly concurrent and performance-critical. This presents a significant opportunity for improvement, and is an area worth exploring in future work. Updating an OS requires solving two main problems: finding a safe update point, and, after reaching a safe update point, redirecting execution to the new version.

Finding safe update points. As explained in Section 3.3.5, update timing is crucial for update correctness, and picking the right update point is up to the programmer. In Chapter 5 we showed that to preserve version consistency, updates should be applied at program points where they do not conflict with in-flight transactions. The challenge, therefore, is to define what constitutes an OS transaction, and to identify transactions in the OS code. One possible direction is to model the operating system as an event processor, where "processing an event" could be serving an interrupt/exception or running a bottom half.
If we define processing an event as a transaction, we can compute contextual effects at possible update points in the OS code, and leverage our transactional version consistency framework to determine whether applying the update at a certain program point is safe. Applying static and dynamic analyses to kernel code is not trivial, due to the presence of wild casts, inline assembly, preemption, non-standard control flow [96, 16], etc. Nevertheless, recent results [35] indicate that formally modeling and reasoning about some of these low-level operations is feasible.

Redirecting execution to the new version. One approach to OS updating is to perform an in-place update where new code and data are loaded into a running OS. This is the solution used by Ginseng and other systems [69, 23]. An alternative would be a checkpointing/VM approach [65]: save the entire OS state at the old version, create a new virtual machine that runs the new OS version, and import the old state into the new machine.

The difference between these two solutions lies in how they reorganize the memory image after an update. For the in-place approach, the concern is finding, converting and accommodating data whose representation has changed, which often requires indirection and imposes some overhead. In the checkpointing/VM approach, at update time we have to find and checkpoint all the data, convert data whose representation has changed, and reinstate it in the new virtual machine, a process that is complicated and time-consuming. However, this approach offers more flexibility in dealing with changes to data representations, and there is no steady-state overhead. The ideal solution would combine the ease of use of the in-place approach with the low overhead and flexibility of the VM approach.

7.3 Safety of Dynamic Software Updating

As argued throughout this dissertation, safety is a paramount concern for dynamic updating, as the consequences of violating update safety can be severe.² In Section 3.3.5 we showed how a type-safe update can go wrong due to improper timing. We would therefore like a DSU system to provide safety guarantees that give application developers (and patch developers) assurances that, following the update, the program will behave as intended. The question, though, is how to define update correctness.

² In a recent incident [62], a nuclear power plant was forced to shut down because a software update on a monitoring system reset the data on a control system. The incident was the result of applying an otherwise safe update to only one party in a distributed system: a perfectly safe update was applied on the monitoring system only, instead of both the monitoring and control systems being updated simultaneously. Allowing this inconsistency triggered data synchronization and the subsequent reset.

Gupta [47] proposed the following correctness property: given a program P1 and its new version P2, an update to the running P1 should eventually lead to a program state reachable from a normal execution of P2. That is, an update U to P1 is correct iff for all P1′ such that P1 ⟶* P1′, whenever P1′ ⟶U P12 there exists some P2′ such that P12 ⟶* P2′ and P2 ⟶* P2′. While illuminating, in practice this property is too strong, because it cannot account for bug fixes and some kinds of feature enhancements, which are the most frequent reasons a program is changed.
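Rendered in LaTeX, with ⟶* denoting zero or more normal evaluation steps and ⟶_U the application of update U, the property above reads:

    \forall P_1'.\;\big(P_1 \rightarrow^{*} P_1' \,\wedge\, P_1' \rightarrow_U P_{12}\big)
        \implies \exists P_2'.\; P_{12} \rightarrow^{*} P_2' \,\wedge\, P_2 \rightarrow^{*} P_2'

That is, every post-update execution must eventually converge on a state that the new program could have reached on its own, which is exactly what a bug fix is designed to prevent.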
As a consequence, we have to relax the update correctness condition, and define an update as correct if, following an update U to program P1, the resulting updated program exhibits a property φ. Here φ is defined in terms of a specific program verification technique. For example, when using regression testing, satisfying φ means that the updated program passes the new program's regression tests. When using type-based approaches to correctness, φ could mean that the type of a function has not changed as a result of an update. We now present several dynamic and static approaches that aim to satisfy this relaxed update correctness condition.

7.3.1 Dynamic Approaches

An obvious approach to finding update-introduced errors is to run regression tests after the update, as we would when testing a new release. The accuracy of finding update correctness errors depends on the existence of a thorough test suite for the application. Even so, this is not sufficient, because update timing and program state might influence the outcome of a test; to test the update, we would have to apply it at various update points, and in various program states. This can quickly lead to an explosion of states and update points that need to be tested. An interesting area for exploration is how to reduce this explosion by identifying equivalent update points and equivalent states, so that we apply the update at only one representative update point, and in one representative state, for each equivalence class.

Another approach to guaranteeing correctness after applying the update is to use checkpointing, speculations, or transactional memory and I/O [15], so we can roll back when applying an update has led to a program crash, exception, etc. The idea is to take a checkpoint, or start a transaction or speculation [108], prior to applying the update, and let the program run for a while. If the program crashes or throws an exception, we restart from the pre-update checkpoint, or abort the speculation/transaction. A similar approach named "band-aid patching" [99], based on Dyninst [20], forks extra processes that run the continuation of the update point in parallel, at the old and new versions. If the old or new version crashes, it is discarded. However, this approach needs to deal with resource duplication and the coexistence of old and new versions.
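A minimal sketch of this checkpoint-and-validate idea, using an OS-level fork as a (very coarse) checkpoint; apply_update() and run_probation() are hypothetical stand-ins for applying a dynamic patch and exercising the program for a while, and the sketch deliberately ignores externally visible I/O performed during the probation run.

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Hypothetical entry points, not Ginseng's API. */
    extern int  apply_update(const char *patch);
    extern void run_probation(void);

    /* Speculatively test an update in a forked child; commit in the
     * parent only if the child survives its probation run. */
    int try_update(const char *patch) {
        pid_t pid = fork();
        if (pid == 0) {                      /* child: speculate          */
            if (apply_update(patch) != 0)
                _exit(1);
            run_probation();                 /* crash/abort means failure */
            _exit(0);
        }
        int status;
        waitpid(pid, &status, 0);
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return apply_update(patch);      /* parent: commit for real   */
        fprintf(stderr, "update rejected: speculative run failed\n");
        return -1;                           /* roll back: keep old code  */
    }

As the band-aid patching discussion above suggests, the hard part in any real system is not the fork but the duplicated resources (sockets, files) shared between the speculative and the real run.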
7.3.2 Static Approaches

Several formalisms allow expressing and checking program correctness properties: type systems, model checking, theorem proving, etc. We sketch how these formalisms could be used to express and check our relaxed notion of update correctness.

Type-based Approaches. Advanced type systems can capture higher-level program correctness properties. For example, if the original application contains lock acquire/release operations, one possible specification for update correctness is that after applying the update, the locks are in the same state as they were in the original program. This could be modeled using refinement types [41, 71], type qualifiers [39, 25], or dependent types [88, 112, 113]. Type-based approaches to update safety have two main advantages. First, Ginseng performs type checking for, and has access to, both the old and the new program; therefore, a straightforward comparison between program elements' types in the old and new versions could reveal potential errors. Second, type checkers are fast and, with certain restrictions, decidable, so they do not place an undue burden on the programmers of updateable applications.

Model Checking and Theorem Proving. Advances in software model checking and theorem-prover-based verifiers have made these techniques increasingly popular. In the last few years, a variety of checkers and verifiers have been shown to efficiently verify substantial, realistic programs [48, 9, 49, 38, 37]. These features make checkers and verifiers suitable candidates for enforcing update correctness in Ginseng. Returning to our locking example, we could model lock operations via a state machine, and flag lock usage not conforming to the state machine as an error (e.g., taking a lock twice, or releasing an unlocked lock). Again, if the original application contains lock acquire/release operations, we could use model checking to ensure that, as a result of an update, the locks keep their state.

7.4 Software Evolution

Understanding how software changes over time can improve our ability to build and maintain it. We have access to the history of many sizable open source programs, but, to be able to tap the potential of large source code repositories, we need to build tools that effectively "mine" repositories and help paint a clearer picture of the software evolution process. Such tools can have beneficial implications beyond DSU systems. Apart from learning what classes of changes are frequent, and using them to design the next DSU systems, we can inform tool and language writers about what artifacts would be needed to facilitate change and make software easier to maintain.

Detecting common classes of changes developers make when writing and maintaining their programs can reveal certain "evolution patterns", such as refactoring. By providing automatic support for effecting such changes in development tools and environments, we cut down on time and errors. Similarly, mining repositories for common fixes enables us to infer "bug patterns" [111] that can be generalized and added to bug detectors [54], allowing us to root out certain classes of defects early in the process. One possible direction for continuing this work is to extend ASTdiff (Chapter 2) with support for tracking a larger set of program aspects, e.g., software complexity metrics, or support for inferring patterns in the observed changes. In fact, coming up with formal models or theories that allow us to understand how software evolves is listed as one of the most important challenges of software evolution [73, 26].

Chapter 8

Conclusions

The central idea of this dissertation is that dynamic software updating can be practical: programs can be updated while they run, with modest programmer effort, while providing certain update safety guarantees, and without imposing a significant performance overhead. As evidence for this thesis, we presented an approach and tool called Ginseng that permits constructing and applying dynamic updates to C programs, along with an evaluation of Ginseng on six realistic server applications. This dissertation shows that Ginseng makes DSU practical by meeting several criteria we believe to be critical for supporting long-term dynamic updates to C programs:

• Flexibility. Ginseng permits updates to single- and multi-threaded C programs. The six test programs are realistic, substantial, and most of them are widely used in constructing real-world Internet services. Ginseng supports changes to functions, types, and global variables, and as a result we could perform all the updates in the 10-month to 4-year time frames we considered.
Patches were based on actual releases, even though the developers made changes without having dynamic updating in mind.

• Efficiency. We had to make very few changes to the application source code. Despite the fact that the differences between releases were non-trivial, generating and testing patches was relatively straightforward. We developed tools that generate most of a dynamic patch automatically by comparing two program versions, reducing programmer work. We found that DSU overhead is modest for I/O-bound applications, but more pronounced for CPU-bound applications. Our novel version consistency property improves update availability, resulting in a smaller delay between the moment an update is available and the moment the update is applied.

• Safety. Updates cannot be applied at arbitrary points during a program's execution, because that could lead to safety violations. Ginseng performs a suite of static safety analyses to determine times during the running program's execution at which an update can be performed safely.

In summary, this dissertation makes the following contributions:

1. A practical framework to support dynamic updates to single- and multi-threaded C programs. Ours is the most flexible, and arguably the safest, implementation of a DSU system to date.

2. A substantial study of the application of our system to six sizable C server programs, three single-threaded and three multi-threaded, over long periods of time, ranging from 10 months to 4 years' worth of releases.

3. A novel type-theoretical system that generalizes standard effect systems, called contextual effects; contextual effects are useful when the past or future computation of the program is relevant at various program points, and have applications beyond DSU. We also present a formalism and soundness proof for our novel update correctness property, version consistency, which permits us to provide certain update safety guarantees for single- and multi-threaded programs.

4. An approach for comparing the source code of different versions of a C program, as well as a software evolution study of various versions of popular open source programs, including BIND, OpenSSH, Apache, Vsftpd and the Linux kernel.

Appendix A

Developing Updateable Software Using Ginseng

When developing updateable software with Ginseng, the programmer might need to intervene at two stages: when compiling the initial program, and when generating dynamic patches. We discuss these two aspects in turn.

A.1 Preparing Initial Sources

To prepare C programs for compilation with Ginseng, developers might have to annotate the source code with #pragma directives that indicate where the update points are, which loops ought to be extracted, which functions act like malloc, and so on.

A.1.1 Specifying Update Points

Update points in single-threaded programs, and semantic update points in multi-threaded programs, are points in the program at which there are no partially completed transactions and all global state is consistent (see Sections 3.5.3 and 4.2). Dynamic updates are best applied at such quiescent points, preferably ones that are stable throughout a system's lifetime. If the application/thread is structured around an event-processing loop, the end of the loop defines a stable quiescent point: there are no pending function calls, little or no data on the stack, and the global state is consistent. Once the user has identified such quiescent points in the program, an update point can be specified by inserting a call to the runtime system: DSU_update().
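As an illustration, the following sketch shows where such a call would go in an event-processing loop; get_request(), handle_request() and request_t are hypothetical application names, and only the DSU_update() call is Ginseng-specific.

    typedef struct request request_t;        /* hypothetical request type    */
    extern request_t *get_request(void);     /* hypothetical: block for work */
    extern void handle_request(request_t *); /* hypothetical: app logic      */
    extern void DSU_update(void);            /* Ginseng runtime entry point  */

    void event_loop(void) {
        for (;;) {
            request_t *req = get_request();
            handle_request(req);
            /* Quiescent point: no partially completed work, and global
               state is consistent; apply a pending patch, if any. */
            DSU_update();
        }
    }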
A.1.2 Code Extraction

Ginseng cannot replace code on the stack, but it can replace functions via function indirection (Section 3.2). Thus, to replace a code block that is on the stack, we can direct Ginseng to extract the block into a separate function, a technique called code extraction (Section 3.2.4). The user can request code extraction as indicated in Figure A.1. The first step is delimiting the code to be extracted by adding a labeled scope (e.g., FOO), and the second step is adding a #pragma DSU_extract("FOO") directive to the source file.

    /* Original program */      /* Prepared for code extraction */
    ...                         #pragma DSU_extract("FOO")
    f();                        ...
    g();                        FOO: {
    ...                           f();
                                  g();
                                }
                                ...

Figure A.1: Directing Ginseng to perform code extraction.

Ginseng also supports a combination of loop body extraction and automatically inserted update points at the end of an iteration of long-running loops, using the directive #pragma DSU_loopupd("label").

A.1.3 Memory Allocation Functions

Ginseng treats malloc and other memory allocation functions specially. Since these functions (named absT [106]) are used to construct wrapped type values, Ginseng has to properly initialize the version field of a type-wrapped value (Section 3.2.2). Ginseng recognizes malloc, calloc and alloca by default, but applications sometimes use custom memory allocators, so the names of allocation functions have to be communicated to the compiler using #pragma DSU_malloc("function_name"). For instance, OpenSSH uses a custom function for memory allocation called xmalloc, so the user has to notify Ginseng by adding the following line to the original program: #pragma DSU_malloc("xmalloc").

A.1.4 Analysis

As described in Section 3.5.3, Ginseng performs safety analyses to detect types used in a representation-dependent way that hampers future changes to a type's representation. For example, uses of sizeof, or unsafe type casts, that are legal in the current program version might become illegal in future versions, once the type representation has changed. A type used in an illegal fashion is deemed non-updateable; Ginseng will not use the type wrapping scheme for such a type, and its representation cannot change in future versions.

The programmer might have to guide Ginseng's safety analysis in certain cases. Since the analysis is monomorphic, it will not detect universal or existential uses of types, rendering certain types non-updateable even though they are used in a type-safe, representation-independent fashion. Conversely, the analysis might deem a type updateable when the programmer needs a fixed, non-wrapped representation for the type in question. To override the analysis and force a type (non-)updateable, Ginseng provides two pragma primitives, #pragma DSU_FORCE_NONUPDATABLE and #pragma DSU_FORCE_UPDATABLE. Their use is detailed below.

Whenever Ginseng encounters an "illegal" type use, it prints an error message of the form

    (<file>:<line>) setTypeNonUpdatable(<type>) (<reason>)

which points the programmer to the offending source code line; in some cases changes to the source code eliminate the offending use (e.g., instantiating an existential). When such changes are not effective, the last resort is forcing types updateable. For example, when updating Sshd we had to use the directive #pragma DSU_FORCE_UPDATABLE("struct Channel") to tell Ginseng that an existential use of struct Channel is update-safe.
Conversely, when updating Vsftpd we used #pragma DSU_FORCE_NONUPDATABLE("struct vsf_sysutil_ipv4addr") to prevent Ginseng from wrapping struct vsf_sysutil_ipv4addr, whose representation must exactly match the IP address format.

C lacks support for universal or existential polymorphism, so programmers have to resort to using void * for polymorphism. Ginseng checks all upcasts to void * and downcasts from void * to ensure no type "laundering" occurs (Section 3.3.3). Ginseng tracks all upcasts from an abstract type pointer T* into void * by annotating the void * and tracking its subsequent flow. If a void * flows to an abstract type pointer S*, with T ≠ S, both S and T are set non-updateable, to avoid representation inconsistencies. Whenever a downcast to S* from a void * with annotation T*, U*, V*, etc. is encountered, Ginseng emits an error message of the form

    (<file>:<line>) printVoidConstraints <S*> <= <T*, U*, ...>

The user can then decide whether this is a benign or problematic cast. Note that having non-updateable types in a program limits the range of possible updates, so developers should strive to reduce the number of such types.

A.1.5 Check-ins

Ginseng supports both synchronous (barrier-based) and asynchronous (relaxed) updates. Synchronous updates take place at a user-specified update point (marked by a user-inserted DSU_update(), as explained in Section 3.3). Asynchronous updates take place at an induced update point, or at an arbitrary, though safe, point inside a scoped check-in block (Section 4.2). To designate a check-in block, the user simply adds curly braces around the code block and a DSU_CHECKIN label to the block, as shown in Figure A.2; no #pragma is necessary. Scoped check-ins "snapshot" a safe approximation of the thread's current restriction plus the effects of executing the block; as a result, the effects of the block appear in both the prior and future restrictions for the entire execution of the block (Section 4.2.2). While, in our example, this prevents f, g, and s from changing, the advantage is that multi-threaded programs can perform updates without the need for blocking synchronization: as long as all threads have check-in effects that do not conflict with the update, the update can be performed right away.

    /* Original program */     /* Annotated with check-ins */
    struct S { int i; };       struct S { int i; };
    ...                        ...
    struct S s;                struct S s;
    ...                        ...
    f();                       DSU_CHECKIN_1: {
    s.i = 0;                     // {S,f,g} = α   {S,f,g} = ω
    g();                         f();
    ...                          s.i = 0;
                                 g();
                                 // {S,f,g} = α   {} = ω
                               }
                               ...

Figure A.2: Directing Ginseng to perform check-ins.

A.2 Dynamic Patches

Dynamic patches are generated mostly automatically by Ginseng, but (depending on the nature of the changes between versions) the programmer might still have to complete the auto-generated type transformers and write state transformers (Section 3.4). The source code for a patch consists of two files: a .patch.custom.c file containing state and type transformers, which can be tailored by the programmer, and a .patch.gen.c file containing definitions of new (or changed) types and functions. Ginseng generates both files automatically, but the programmer is only supposed to alter the former. We now provide examples of completing auto-generated type transformers and writing state transformers.
A.2.1 Type Transformers

When type representations change, type transformers convert values from the old representation to the new one. The Ginseng compiler automatically generates type transformer skeletons containing "best guess" conversion functions between representations, but the programmer still has to intervene to verify the auto-generated conversions and add initialization code where needed. For instance, if a struct type has changed, the stub consists of code that copies the preserved fields over from the old definition to the new one, and the programmer has to initialize newly added fields. Type transformers have the following signature:

    void DSU_tt_type(type_old *xin, type_new *xout, wrapped_type *xnew)

The arguments are pointers to the concrete type representations (xin and xout) and to the wrapped representation (xnew); most of the time xin and xout are sufficient for writing the conversion function, but when converting linked structures, e.g., trees or lists, xnew is needed as well. In most cases type is a struct, and the effort consists of initializing newly added fields.

As an example, Figure A.3 shows the Ginseng-generated type transformer for struct Authct in the update from Sshd version 3.7.1p2 to version 3.8p1. The new version adds a field force_pwchange. Ginseng generates code to copy the existing fields, but the programmer has to write the correct initializer for the newly introduced field.

    /* (a) Source code */
    // OLD program, sshd 3.7.1p2        // NEW program, sshd 3.8p1
    struct Authct_old {                 struct Authct_new {
      int failures;                       int failures;
      char *user;                         int force_pwchange;
      char *service;                      char *user;
      struct passwd *pw;                  char *service;
      char *style;                        struct passwd *pw;
    };                                    char *style;
                                        };

    /* (b) Type transformer */
    void tt_Authct(struct Authct_old *xin,
                   struct Authct_new *xout) {
      xout->failures = xin->failures;
      xout->force_pwchange = ??;   /* must be supplied by the programmer */
      xout->user = xin->user;
      xout->service = xin->service;
      xout->pw = xin->pw;
      xout->style = xin->style;
    }

Figure A.3: Type transformer example.

Depending on when and how the new code uses the newly added fields, writing the type transformer can range from trivial (assigning a default value) to impossible (Section 3.5.3). If no type has changed, the auto-generated .patch.custom.c will be empty, meaning there are no type transformers to be filled out. Note, however, that state transformers (described in the next section) might still be necessary.

A.2.2 State Transformers

A state transformer is an optional function, supplied by the programmer and invoked by the runtime system at update time (Section 3.4). The purpose of state transformers is two-fold: 1) to convert global state and establish the invariants the new program version expects, and 2) to run initialization code that the new program depends on but that is not part of the old program's initialization code.

    /* (a) Source code */
    // OLD program, zebra 0.93b
    struct route_table *rib_table_ipv4;
    struct route_table *static_table_ipv4;
    struct route_table *rib_table_ipv6;
    struct route_table *static_table_ipv6;

    // NEW program, zebra 0.94
    struct route_table *vrf[4];

    /* (b) State transformer */
    void DSU_state_xform() {
      vrf[0] = rib_table_ipv4;
      vrf[1] = rib_table_ipv6;
      vrf[2] = static_table_ipv4;
      vrf[3] = static_table_ipv6;
    }

Figure A.4: State transformer example.
Since a state transformer function is optional, it is not included by default in the .patch.custom.c file; the programmer has to add it using the following skeleton:

    void DSU_state_xform() { ... }

As an example, Figure A.4 shows the state transformer we had to write for the update from Zebra version 0.93b to version 0.94. The old version keeps routing tables in four different global variables (rib_table_ipv4, static_table_ipv4, rib_table_ipv6, and static_table_ipv6), whereas the new version uses a routing table array, vrf. The state transformer makes the array elements point to the associated routing tables. Just as in the type transformer case, state transformer complexity can range from trivial (if one is needed at all) to impossible (e.g., if at boot time hardware is initialized differently by the old and the new program). The most complicated cases we encountered were refactorings of global structures, where global state had to be transferred between the old and the new storage model.

Appendix B

Proteus-tx Proofs

Lemma B.0.1 (Weakening). If Φ;Γ ⊢ e : τ and Γ′ ⊇ Γ then Φ;Γ′ ⊢ e : τ.

Proof. By induction on the typing derivation of Φ;Γ ⊢ e : τ.

Lemma B.0.2 (Flow subtyping). If Φ1 ▷ Φ2 ↪ Φ then Φ1 ≤ Φ and Φ2 ≤ Φ.

Proof. Follows directly from the definitions.

Lemma B.0.3 (Subtyping reflexivity). τ ≤ τ for all τ.

Proof. Straightforward, from the definition of subtyping in Figure 5.2.

Lemma B.0.4 (Subtyping transitivity). For all τ, τ′, τ″, if τ ≤ τ′ and τ′ ≤ τ″ then τ ≤ τ″.

Proof. By simultaneous induction on τ ≤ τ′ and τ′ ≤ τ″. Notice that subtyping is syntax-directed, and this forces the final rule of each derivation to be the same.

case (SInt, SInt): From the definition of (SInt) we have int ≤ int, hence τ ≤ τ″ follows directly.

case (SRef, SRef): By (SRef) we have ref^ε τ ≤ ref^ε′ τ′ from premises τ ≤ τ′, τ′ ≤ τ and ε ⊆ ε′, and ref^ε′ τ′ ≤ ref^ε″ τ″ from premises τ′ ≤ τ″, τ″ ≤ τ′ and ε′ ⊆ ε″. By induction, τ ≤ τ′ and τ′ ≤ τ″ imply τ ≤ τ″, and τ″ ≤ τ′ and τ′ ≤ τ imply τ″ ≤ τ; moreover ε ⊆ ε′ ⊆ ε″ implies ε ⊆ ε″. We can now apply (SRef) to conclude ref^ε τ ≤ ref^ε″ τ″.

case (SFun, SFun): By (SFun) we have τ1 →^Φ τ2 ≤ τ1′ →^Φ′ τ2′ from premises τ1′ ≤ τ1, τ2 ≤ τ2′ and Φ ≤ Φ′, and τ1′ →^Φ′ τ2′ ≤ τ1″ →^Φ″ τ2″ from premises τ1″ ≤ τ1′, τ2′ ≤ τ2″ and Φ′ ≤ Φ″. We know Φ ≤ Φ′ and Φ′ ≤ Φ″ imply Φ ≤ Φ″, and by induction (as in the (SRef, SRef) case) τ1″ ≤ τ1 and τ2 ≤ τ2″. We can now apply (SFun) to conclude τ1 →^Φ τ2 ≤ τ1″ →^Φ″ τ2″.

Lemma B.0.5 (Value typing). If Φ;Γ ⊢ v : τ then Φ′;Γ ⊢ v : τ for all Φ′.

Proof. By induction on the typing derivation of Φ;Γ ⊢ v : τ.

case (TInt): Thus v ≡ n, and by (TInt) we have Φ∅;Γ ⊢ n : int, where Φ∅ ≡ [1; ∅; 1]. Since Φ′^α ⊆ 1, ∅ ⊆ Φ′^ε and Φ′^ω ⊆ 1, (SCtxt) gives Φ∅ ≤ Φ′, and with int ≤ int, (TSub) yields Φ′;Γ ⊢ n : int.

case (TGvar): We have Γ(f) = τ, so by (TGvar), Φ∅;Γ ⊢ f : τ. As in the previous case, Φ∅ ≤ Φ′ by (SCtxt) and τ ≤ τ by reflexivity, so (TSub) yields Φ′;Γ ⊢ f : τ.
case (TLoc): Similar to (TGvar).

case (TSub): We have Φ″;Γ ⊢ v : τ′ with τ′ ≤ τ and Φ″ ≤ Φ (by (SCtxt)). The result follows by induction on Φ″;Γ ⊢ v : τ′ and by applying (TSub).

Lemma B.0.6 (Subtyping derivations). If Φ;Γ ⊢ e : τ then we can construct a proof derivation of this judgment that ends in one use of (TSub) whose premise uses a rule other than (TSub).

Proof. By induction on Φ;Γ ⊢ e : τ.

case (TSub): We have Φ′;Γ ⊢ e : τ′ with τ′ ≤ τ and Φ′ ≤ Φ. By induction we have Φ″;Γ ⊢ e : τ″ with τ″ ≤ τ′ and Φ″ ≤ Φ′, where the derivation of Φ″;Γ ⊢ e : τ″ does not conclude with (TSub). By transitivity of subtyping (Lemma B.0.4) we have τ″ ≤ τ; we also have Φ″ ≤ Φ, and we get the desired result by a single use of (TSub).

case all others: Since the last rule in Φ;Γ ⊢ e : τ is not (TSub), we obtain the desired result by applying (TSub), where τ ≤ τ follows from the reflexivity of subtyping (Lemma B.0.3) and Φ ≤ Φ.

Lemma B.0.7 (Flow effect weakening). If Φ;Γ ⊢ e : τ where Φ ≡ [α; ε; ω], then Φ′;Γ ⊢ e : τ for any Φ′ ≡ [α′; ε; ω′] with α′ ⊆ α and ω′ ⊆ ω, where all uses of (TSub) applying Φ1 ≤ Φ2 have Φ1^α = Φ2^α and Φ1^ω = Φ2^ω.

Proof. By induction on Φ;Γ ⊢ e : τ.

case (TGvar), (TInt), (TVar): Trivial.

case (TUpdate): We have, by (TUpdate), [α; ∅; ω];Γ ⊢ update^{α″,ω″} : int from α ⊆ α″ and ω ⊆ ω″. Since α′ ⊆ α and ω′ ⊆ ω, we have α′ ⊆ α″ and ω′ ⊆ ω″, so (TUpdate) yields [α′; ∅; ω′];Γ ⊢ update^{α″,ω″} : int.

case (TTransact): We have Φ″;Γ ⊢ e : τ with Φ^α ⊆ Φ″^α and Φ^ω ⊆ Φ″^ω, concluding Φ;Γ ⊢ tx e : τ. Let Φ′ ≡ [α′; ε; ω′]. Since Φ′^α ⊆ Φ^α ⊆ Φ″^α and Φ′^ω ⊆ Φ^ω ⊆ Φ″^ω, we can apply (TTransact) to conclude Φ′;Γ ⊢ tx e : τ.

case (TIntrans): Similar to (TTransact).

case (TSub): We have Φ0;Γ ⊢ e : τ0 with τ0 ≤ τ and Φ0 ≤ Φ, where (SCtxt) gives Φ0^ε ⊆ Φ^ε, Φ^α ⊆ Φ0^α and Φ^ω ⊆ Φ0^ω. Let Φ″ ≡ [α′; Φ0^ε; ω′]. The judgment Φ″;Γ ⊢ e : τ0 follows by induction (which we can apply because α′ ⊆ α ⊆ Φ0^α and ω′ ⊆ ω ⊆ Φ0^ω), and Φ″ ≤ Φ′ holds by (SCtxt), with the prior and future components equal by construction; the result follows by (TSub).

case (TRef): We know that Φ;Γ ⊢ ref e : ref^ε τ follows from Φ;Γ ⊢ e : τ; we have Φ′;Γ ⊢ e : τ by induction, hence we get the result by (TRef).

case (TDeref): We know that Φ;Γ ⊢ !e : τ follows, by (TDeref), from Φ1;Γ ⊢ e : ref^ε τ, Φ2^ε = ε, and Φ1 ▷ Φ2 ↪ Φ. We have Φ′ ≡ [α′; Φ1^ε ∪ Φ2^ε; ω′] where α′ ⊆ Φ^α and ω′ ⊆ Φ^ω. Choose Φ1′ ≡ [α′; Φ1^ε; Φ2^ε ∪ ω′] and Φ2′ ≡ [α′ ∪ Φ1^ε; Φ2^ε; ω′]; hence Φ1′ ▷ Φ2′ ↪ Φ′ and Φ2′^ε = Φ2^ε = ε. We want to prove Φ′;Γ ⊢ !e : τ. Since α′ ⊆ Φ1^α and Φ2^ε ∪ ω′ ⊆ Φ1^ω, we can apply induction to get Φ1′;Γ ⊢ e : ref^ε τ, and we get the result by applying (TDeref).
case (TRet): Similar to (TDeref).

case (TApp): We know that Φ;Γ ⊢ e1 e2 : τ2 follows, by (TApp), from Φ1;Γ ⊢ e1 : τ1 →^{Φf} τ2, Φ2;Γ ⊢ e2 : τ1, Φ1 ▷ Φ2 ▷ Φ3 ↪ Φ, Φ3^ε = Φf^ε, Φ3^α ⊆ Φf^α and Φ3^ω ⊆ Φf^ω. We have Φ′ ≡ [α′; Φ1^ε ∪ Φ2^ε ∪ Φ3^ε; ω′] where α′ ⊆ Φ^α and ω′ ⊆ Φ^ω. Choose Φ1′ ≡ [α′; Φ1^ε; Φ2^ε ∪ Φ3^ε ∪ ω′], Φ2′ ≡ [α′ ∪ Φ1^ε; Φ2^ε; Φ3^ε ∪ ω′] and Φ3′ ≡ [α′ ∪ Φ1^ε ∪ Φ2^ε; Φ3^ε; ω′]; hence Φ3′^ε = Φ3^ε = Φf^ε and Φ1′ ▷ Φ2′ ▷ Φ3′ ↪ Φ′. We want to prove Φ′;Γ ⊢ e1 e2 : τ2. Since the prior effects of Φ1′ and Φ2′ shrink, and their future effects shrink, we can apply induction to get Φ1′;Γ ⊢ e1 : τ1 →^{Φf} τ2 and Φ2′;Γ ⊢ e2 : τ1, and the remaining premises of (TApp) follow as in the (TDeref) case, yielding the result.

case (TAssign), (TIf), (TLet): Similar to (TApp).

Definition B.0.8. If Φ;Γ ⊢ e : τ, ⟦Φ;Γ ⊢ e : τ⟧ = R, and Φ;Γ ⊢ e : τ; R′, then R ≡ R′, where ⟦Φ;Γ ⊢ e : τ⟧ is defined in Figure B.1.

Lemma B.0.9 (Left subexpression version consistency). If Φ, R; H ⊢ σ and Φ1 ▷ Φ2 ↪ Φ then Φ1, R; H ⊢ σ.

Proof. We know that Φ1 ▷ Φ2 ↪ Φ implies Φ ≡ [α1; ε1 ∪ ε2; ω2]. We have two cases:

R ≡ ∅: Thus σ ≡ (n, η), and by assumption (TC1) we have f ∈ η ⟹ f ∈ α1 and f ∈ ε1 ∪ ε2 ⟹ n ∈ ver(H, f). The result follows from (TC1) applied to Φ1 ≡ [α1; ε1; ω1], since f ∈ ε1 implies f ∈ ε1 ∪ ε2 and thus n ∈ ver(H, f).

R ≡ Φ′, R′: Thus σ ≡ ((n, η), σ′) and we must have, by (TC2), Φ′, R′; H ⊢ σ′, Φ ≡ [α1; ε1 ∪ ε2; ω2], f ∈ η ⟹ f ∈ α1, and f ∈ ε1 ∪ ε2 ⟹ n ∈ ver(H, f). The result follows by (TC2) applied to Φ1 ≡ [α1; ε1; ω1], where the first premise follows by assumption.
⟦(TInt): Φ∅;Γ ⊢ n : int⟧ = ∅
⟦(TVar): Γ(x) = τ ⟹ Φ∅;Γ ⊢ x : τ⟧ = ∅
⟦(TGvar): Γ(f) = τ ⟹ Φ∅;Γ ⊢ f : τ⟧ = ∅
⟦(TSub), with premise derivation D⟧ = R, where ⟦D⟧ = R
⟦(TUpdate)⟧ = ∅
⟦(TRef), with premise derivation D⟧ = R, where ⟦D⟧ = R
⟦(TDeref), with premise derivation D1⟧ = R1, where ⟦D1⟧ = R1
⟦(TAssign), with premise derivations D1, D2⟧ = R1 ⋈ R2, where ⟦D1⟧ = R1, ⟦D2⟧ = R2, and e1 ∉ v ⟹ R2 = ∅
⟦(TIf), with premise derivations D1, D2, D3⟧ = R1, where ⟦D1⟧ = R1, ⟦D2⟧ = ∅, ⟦D3⟧ = ∅
⟦(TLet), with premise derivations D1, D2⟧ = R1, where ⟦D1⟧ = R1 and ⟦D2⟧ = ∅
⟦(TApp), with premise derivations D1, D2⟧ = R1 ⋈ R2, where ⟦D1⟧ = R1, ⟦D2⟧ = R2, and e1 ∉ v ⟹ R2 = ∅
⟦(TTransact)⟧ = ∅
⟦(TIntrans), with premise derivation D1 :: Φ1;Γ ⊢ e : τ⟧ = Φ1, R1, where ⟦D1⟧ = R1

∅ ⋈ R = R        R ⋈ ∅ = R

Figure B.1: Transaction effect extraction.

Lemma B.0.10 (Subexpression version consistency). If Φ, R1 ⋈ R2; H ⊢ σ and Φ1 ▷ Φ2 ↪ Φ then

(i) R2 ≡ ∅ implies Φ1, R1; H ⊢ σ
(ii) R1 ≡ ∅ and Φ1^ε ≡ ∅ imply Φ2, R2; H ⊢ σ

Proof. (i) Since R2 ≡ ∅ by assumption, we have R1 = R1 ⋈ R2. We have two cases:
by assumption, we have R2 = R1 trianglerighttriangleleft R2. We have two cases: R2 ? ?: Thus we must have TC1 f ? ? ? f ? ?1 f ? ?1 ??2 ? nprime ? ver(H,f) [?1;?1 ??2;?2],?;H turnstileleft (?,?) where ? ? (?,?). Since ??1 ? ? and ??2 = ??1 ? ??1 we have ??2 = ??1 and the result follows from [TC1]: TC1 f ? ? ? f ? ?2 f ? ?2 ? nprime ? ver(H,f) [?2;?2;?2],?;H turnstileleft (?,?) R2 ? ?prime,Rprime: Thus we must have TC2 ?prime,Rprime;H turnstileleft ?prime ? ? [?1;?1 ??2;?2] f ? ? ? f ? ?1 f ? ?1 ??2 ? nprime ? ver(H,f) ?,?prime,Rprime;H turnstileleft ((?,?),?prime) where ? ? ((?,?),?prime). The result follows by [TC2] and ??2 = ??1 (because ??1 ? ? and ??2 = ??1 ???1): TC2 ?prime,Rprime;H turnstileleft ?prime ?2 ? [?2;?2;?2] f ? ? ? f ? ?2 f ? ?2 ? nprime ? ver(H,f) ?2,?prime,Rprime;H turnstileleft ((?,?),?prime) where the first premise follows by assumption. Lemma B.0.11 (Stack Shapes). If ?n;?;H;e? ???0 ?n;?prime;Hprime;eprime? then top(?) = (?,?) and top(?prime) = (?prime,?prime) where nprime = nprimeprime and ? ? ?prime. Proof. By induction on ?n;?;H;e? ???0 ?n;?prime;Hprime;eprime?. 216 Lemma B.0.12 (Update preserves heap safety). If n;? turnstileleft H and updateOK(upd,H,(?,?),dir) then n+1;U[?]upd turnstileleft U[H]updn+1. Proof. Let nprime = n + 1 and ?prime ? U[?]upd and Hprime ? U[H]updnprime . From the definition of heap typing (Figure 5.8), to prove nprime;?prime turnstileleft Hprime, we need to show: 1. dom(?prime) = dom(Hprime) 2. ?z mapsto? (?,v,?) ? Hprime. ??;?prime turnstileleft v : ? ? ?prime(z) = ref ? ? ? z ? ? 3. ?z mapsto? (? ??? ?prime,?(x).e,?) ? Hprime. ?;?prime,x : ? turnstileleft e : ?prime ? ?prime(z) = ? ??? ?prime ? z ? ?? ? z ? ?? 4. ?r mapsto? (?,v,?) ? Hprime. ??;?prime turnstileleft v : ? ? ?prime(r) = ref ? ? 5. ?z mapsto? (?,b,?) ? Hprime. nprime ? ? Proof by induction on H. case H ? ? : We have U[?]updnprime = updadd (modified to have version set {n + 1}), and thus dom(Hprime) = dom(updadd). Our assumption dom(H) = dom(?) implies that ? = ?, and thus ?prime = U[?]upd = types(updadd). 1. dom(?prime) = dom(types(updadd)) = dom(updadd) = dom(Hprime). 2. Since Hprime = updadd, this follows directly from updateOK(upd,H,(?,?),dir) given the definition of ?prime = U[?]upd = types(updadd). 3. Similar to 2. 4. Vacuously true, since r negationslash? dom(Hprime) = dom(updadd) for all r. 5. Holds by the definition of U[?]updnprime . case H ? (r mapsto? (?,b,?),Hprimeprime) : We have Hprime ? U[(r mapsto? (?,b,?),Hprimeprime)]updnprime = (r mapsto? (?,b,?)),U[Hprimeprime]updnprime . Our assumption dom(H) = dom(?) implies ? ? (r : ?,?primeprime) for some ?primeprime, where dom(Hprimeprime) = dom(?primeprime) and ?prime ? U[r : ?,?primeprime]upd = r : ?,U[?primeprime]upd. 1. By induction we know dom(U[?primeprime]upd) = dom(U[Hprimeprime]updnprime ). But dom(Hprime) = dom(r = (?,b,?)),U[Hprimeprime]updnprime ) = {r}?dom(U[Hprimeprime]updnprime ), and dom(?prime) = dom(r : ?,U[?primeprime]upd) = {r}?dom(U[?primeprime]upd). 2. Follows by induction, since r negationslash= z for all z. 3. Same as above. 4. For r, this follows by assumption, since it is clear that H(r) = U[H]updnprime (r) and ?(r) = U[?]upd(r), and for the rest of the heap the property follows by induction. 5. Follows by induction, since r negationslash= z for all z. case H ? (z = (?,b,?),Hprimeprime) : We have Hprime ? U[(z mapsto? (?,b,?),Hprimeprime)]updnprime = (z mapsto? (?,bprime,?prime)),U[Hprimeprime]updnprime . Our assumption dom(H) = dom(?) implies ? ? 
(z : heapType(z, τ), Σ″) for some Σ″, where dom(H″) = dom(Σ″) and Σ′ ≡ U[z : heapType(z, τ), Σ″]^upd = z : heapType(z, τ), U[Σ″]^upd.
1. Similar to the argument for the H ≡ (r ↦ (...), H″) case.
4. This follows by induction, since z ≠ r.
Now consider the remaining conditions, according to z with respect to upd_chg:
case z ∉ dom(upd_chg):
2. For z, this follows by assumption, since clearly H(z) = H′(z) up to version sets and Σ(z) = Σ′(z); the rest of the heap follows by induction.
3. Same as above.
5. We have U[(z ↦ (τ, b, ν), H″)]^upd_{n′} = (z ↦ (τ, b, ν ∪ {n′}), U[H″]^upd_{n′}), where n′ ∈ ν ∪ {n′} for z, and the rest follows by induction.
case z ∈ dom(upd_chg):
2. From the definition of updateOK(upd, H, (α, ω), dir) we know that (i) Φ∅; U[Σ]^upd ⊢ v′ : τ. Considering z, from the definition of heapType we have (ii) heapType(z, τ) = ref^ε τ where z ∈ ε. Combining (i) and (ii) yields Φ∅; Σ′ ⊢ v : τ ∧ Σ′(z) = ref^ε τ ∧ z ∈ ε. The property holds for the rest of the heap by induction.
3. Similar to the previous condition.
5. We have U[(z ↦ (τ, b, ν), H″)]^upd_{n′} = (z ↦ (τ, b′, {n′}), U[H″]^upd_{n′}), and obviously n′ ∈ {n′} for z, and the rest by induction.

The following lemma states that if we start with a well-typed program and a version-consistent trace and we take an update step, then afterward we still have a well-typed program whose trace is version-consistent.

Lemma B.0.13 (Update preservation). Suppose we have the following:
1. n ⊢ H, e : τ (such that Φ; Σ ⊢ e : τ; R and n; Σ ⊢ H for some Φ, Σ)
2. Φ, R; H ⊢ σ
3. traceOK(σ)
4. ⟨n; σ; H; e⟩ →μ ⟨n+1; σ′; H′; e⟩, where H′ ≡ U[H]^upd_{n+1}, Σ′ ≡ U[Σ]^upd, μ = (upd, dir), σ′ ≡ U[σ]^{upd,dir}_n, and top(σ′) = (n₀′, η′).

Then for some Φ′ such that Φ′^α = Φ^α, Φ′^ω = Φ^ω, and Φ′^ε ⊆ Φ^ε, and some Σ′ ⊇ Σ, we have that:
1. n+1 ⊢ H′, e : τ, where Φ′; Σ′ ⊢ e : τ; R and n+1; Σ′ ⊢ H′
2. Φ′, R; H′ ⊢ σ′
3. traceOK(σ′)
4. (dir = bck) ⟹ n₀′ = n+1, and (dir = fwd) ⟹ (f ∈ Φ^ω ⟹ ver(H, f) ⊆ ver(H′, f))

Proof. Since U[Σ]^upd ⊇ Σ, Φ; U[Σ]^upd ⊢ e : τ; R follows by weakening (Lemma B.0.1). Proceed by simultaneous induction on the typing derivation of e (n ⊢ H, e : τ) and on the evaluation derivation ⟨n; σ; H; e⟩ →μ ⟨n+1; σ′; H′; e⟩. Consider the last rule used in the evaluation derivation:

case [gvar-deref], [gvar-assign], [call], [let], [tx-start], [tx-end], [ref], [deref], [assign], [if-t], [if-f], [no-update]: Not possible, as these transitions cannot cause an update to occur.

case [update]: This implies e ≡ update^{α″,ω″} and thus ⟨n; (n₀, η); H; update^{α″,ω″}⟩ →μ ⟨n+1; U[(n₀, η)]^{upd,dir}_{n+1}; U[H]^upd_{n+1}; 1⟩, where μ ≡ (upd, dir) and updateOK(upd, H, (α″, ω″), dir). By subtyping derivations (Lemma B.0.6) we have a derivation ending in one use of (TSub) whose premise is (TUpdate): from α ⊆ α″ and ω ⊆ ω″, (TUpdate) gives Φu; Σ ⊢ update^{α″,ω″} : int; ∅ with Φu ≡ [α; ∅; ω], and then (TSub), with int ≤ int and Φu ≤ Φ ≡ [α; ε; ω], gives Φ; Σ ⊢ update^{α″,ω″} : int; ∅; by flow effect weakening (Lemma B.0.7), the prior and future effects are unchanged in the use of (TSub). Let Φ′ = Φu (hence Φ′^α = Φ^α, Φ′^ω = Φ^ω, and ∅ ⊆ Φ^ε as required) and (n₀′, η′) ≡ U[(n₀, η)]^{upd,dir}_{n+1}.
To prove 1., we get n + 1;?prime turnstileleft Hprime by Lemma B.0.12 and ?u;?prime turnstileleft 1 : int a59? by [Tint]. To prove 2., we must show ?u,?;Hprime turnstileleft (?prime,?prime). By assumption, we have TC1 f ? ? ? f ? ? f ? ? ? nprime ? ver(H,f) [?;?;?],?;H turnstileleft (?,?) We need to prove TC1 f ? ?prime ? f ? ? f ? ? ? nprimeprime ? ver(Hprime,f) [?;?;?],?;Hprime turnstileleft (?prime,?prime) We have the first premise by assumption (since dom(?) = dom(?prime) from the definition of U[(?,?)]upd,dirn+1 ). The second premise holds vacuously. To prove 3., we must show traceOK(?prime,?prime). Consider each possible update type: case dir = bck : From the definition of U[(?,?)]upd,bckn+1 , we know that nprimeprime = n + 1. Consider (f,?) ? ?; it must be the case that f negationslash? dom(updchg). This is because dir = bck implies ? ? dom(updchg) = ? and by assumption (from the first premise of [TC1] above) f ? ?. Therefore, since f negationslash? dom(updchg), its ?prime entry is (f,? ?{nprimeprime}), which is the required result. case dir = fwd : Since U[(?,?)]upd,fwdn+1 = (?,?), the result is true by assumption. To prove 4., we must show nprimeprime ? n + 1 ? (f ? ? ? ver(H,f) ? ver(Hprime,f)). Consider each possible update type: case dir = bck : From the definition of U[(?,?)]upd,bckn+1 , we know that nprimeprime = n + 1 so we are done. case dir = fwd : We have U[(?,?)]upd,fwdn+1 = (?,?), and from updateOK(upd,H,(?,?),dir) we know that f ? ? ? f negationslash? dom(updchg). From the definition of U[H]updn we know that U[(f mapsto? (?,b,?),H)]updn+1 = f mapsto? (?,b,?? {n+1}) if f negationslash? dom(updchg). This implies that for f ? ?, ver(H,f) = ? and ver(Hprime,f) = ??{n+1}, and therefore ver(H,f) ? ver(Hprime,f). case [tx-cong-1] : We have that?n;((?,?),?);H;intx e? ?? ? ?n+1;(U[(?,?)]upd,dirn+1 ,?prime);Hprime;intx eprime?follows from?n;?;H;e? ?? ? ?n + 1;?prime;Hprime;eprime? by [tx-cong-1], where ? ? (upd,dir). Let (?prime,?prime) ? U[(?,?)]upd,dirn+1 . By assumption and subtyping derivations (Lemma B.0.6) we have TSub TIntrans ?e;? turnstileleft e : ?prime a59R ? ? ??e ? ? ??e [?;?;?];? turnstileleft intx e : ?prime a59?e,R ?prime ? ? [?;?;?] ? [?;?;?] [?;?;?];? turnstileleft intx e : ? a59?e,R and by flow effect weakening (Lemma B.0.7) we know that ? and ? are unchanged in the use of (TSub). We have ?e ? [?e;?e;?e], so that ?e ? ? and ?e ? ?. To apply induction, we must show that ?e,R;H turnstileleft ? (which follows by inversion on ?,?e,R;H turnstileleft ((?,?),?)); ?e;? turnstileleft e : ?prime a59R (which follows by assumption); and n;? turnstileleft H (by assumption). By induction we have: (i) ?primee;?prime turnstileleft eprime : ?prime a59R and (ii) n + 1;?prime turnstileleft Hprime (iii) ?primee,R;Hprime turnstileleft ?prime (iv) traceOK(?prime) (v) (dir = bck) ? nprimeprime ? n + 1 ? (dir = fwd) ? (f ? ?e ? ver(H,f) ? ver(Hprime,f)) 219 where ?primee ? [?e;?primee;?e], ?primee ? ?e. Let ?prime = [?;?;?] (hence ?prime? = ??, ?prime? = ??, and ? ? ?? as required). To prove 1., we can show TSub TIntrans ?primee;?prime turnstileleft eprime : ? a59R ? ? ?prime?e ? ? ?prime?e ?prime;? turnstileleft intx eprime : ? a59?primee,R ?prime ? ? ?prime ? ?prime ?prime;? turnstileleft intx eprime : ? a59?primee,R The first premise of [TIntrans] follows by (i), and the second since ?e ? ? and ?e ? ?. To prove 2., we need to show that TC2 ?primee,R;Hprime turnstileleft ?prime f ? ?prime ? f ? ? f ? ? ? nprimeprime ? 
ver(Hprime,f) [?;?;?],?primee,R;Hprime turnstileleft ((?prime,?prime),?prime) We have the first premise by (iii), the second by assumption (since dom(?) = dom(?prime) from the definition of U[(?,?)]upd,dirn+1 ), and the last holds vacuously. To prove 3., we must show traceOK((?prime,?prime),?prime), which reduces to proving traceOK(?prime,?prime) since we have traceOK(?prime) from (iv). We have traceOK(?,?) by assumption. Consider each possible update type: case dir = bck : From the definition of U[(?,?)]upd,bckn+1 , we know that nprimeprime = n + 1. Consider (f,?) ? ?; it must be the case that f negationslash? dom(updchg). This is because dir = bck implies ?e ? dom(updchg) = ? and by assumption we have ? ? ?e (from (TIntrans)) and f ? ? (from the first premise of [TC1] above). Therefore, since f negationslash? dom(updchg), its ?prime entry is (f,? ?{nprimeprime}), which is the required result. case dir = fwd : Since U[(?,?)]upd,fwdn+1 = (?,?), the result is true by assumption. Part 4. follows directly from (v) and the fact that ?e ? ?. case [cong] : We have that ?n;?;H;E[e]? ?? ? ?n + 1;?prime;Hprime;E[e]? follows from ?n;?;H;e? ?? ? ?n + 1;?prime;Hprime;e? by [cong], where ? ? (upd,dir). Consider the shape of E: case : The result follows directly by induction. case E e2 : By assumption, we have ?;? turnstileleft (E e2)[e1] : ? a59 R. By subtyping derivations (Lemma B.0.6) we know we can construct a proof derivation of this ending in (TSub): TSub TApp ?1;? turnstileleft E[e1] : ?1 ???f ?prime2 a59R1 ?2;? turnstileleft e2 : ?1 a59? ?1 a3?2 a3?3 arrowhookleft? ?s ??3 = ??f ??3 ? ??f ??3 ? ??f E [e1] negationslash? v ? R2 = ? ?s;? turnstileleft (E e2)[e1] : ?prime2 a59R1 ?prime2 ? ?2 SCtxt ? ? [?;?;?] ?s ? [?;?1 ??2 ??f;?] (?1 ??2 ??f) ? ? ?s ? ? ?;? turnstileleft (E e2)[e1] : ?2 a59R1 and by flow effect weakening (Lemma B.0.7) we know that ? and ? are unchanged in the use of (TSub). By inversion on ?n;?;H;(E e2)[e1]? ?? ? ?n + 1;?prime;Hprime;(E e2)[e1]? we have ?n;?;H;e1? ?? ? ?n + 1;?prime;Hprime;eprime1?, and then applying [cong] we have ?n;?;H;E[e1]? ?? ? ?n + 1;?prime;Hprime;E[eprime1]?. From ?,R1;H turnstileleft ? we know that: f ? ? ? f ? ? f ? ? ? nprime ? ver(H,f) 220 where (?,?) is the top of ?. Since ? ? [?;?;?] and ?s ? [?;?s;?] and ?s = ?1 ? ?2 ? ?3 (where ?3 = ?f), we have f ? ? ? f ? ? f ? ?1 ? nprime ? ver(H,f) but since ?1 ? [?;?1;?1], we have ?1,R1;H turnstileleft ?. Hence we can apply induction on ?1;? turnstileleft E[e1] : ?1 ???f ?prime2 a59R1, yielding: (i) ?prime1;?prime turnstileleft E[eprime1] : ?1 ???f ?2 a59R1 and (ii) n + 1;?prime turnstileleft Hprime (iii) ?prime1,R1;Hprime turnstileleft ?prime (iv) traceOK(?prime) (v) (dir = bck) ? nprimeprime ? n + 1 ? (dir = fwd) ? (f ? ?1 ? ver(H,f) ? ver(Hprime,f)) where ?prime1 ? [?s;?prime1;?1] and ?prime1 ? ?1. Choose ?prime2 = [?1 ??prime1;?2;?2] and ?prime3 = [?1 ??prime1 ??2;?f;?s] and thus ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?primes and ?prime?3 = ??f. Let ?prime = [?;?prime1 ??2 ??f;?], where ?prime1 ??2 ??f ? ?, as required. To prove 1., we have n + 1;?prime turnstileleft Hprime by (ii), and apply (TApp): TApp ?prime1;?prime turnstileleft E[eprime1] : ?1 ???f ?prime2 a59R1 ?prime2;?prime turnstileleft e2 : ?1 a59? ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?primes ?prime?3 = ??f ?prime?3 ? ??f ?prime?3 ? ??f E[eprime1] negationslash? v ? R2 = ? 
?primes;?prime turnstileleft (E e2)[eprime1] : ?prime2 a59R1 The first premise follows by (i), the second because we have ?2;?prime turnstileleft e2 : ?1 by weakening (since ?prime ? ?) and then ?prime2;?prime turnstileleft e2 : ?1 by flow effect weakening (Lemma B.0.7) (which we can apply because ?prime?2 = ??2 , ?prime?2 = ??2, ?prime?2 = ?1 ? ?prime1 ??2 = ?1 ? ?1 hence ?prime?2 ? ??2) the third?sixth by choice of ?prime2, ?prime3 and ?primes, and the last as R2 ? ? by assumption. We can now apply (TSub): TSub ?prime;? turnstileleft (E e2)[eprime1] : ?prime2 a59R1 ?prime2 ? ?2 ?prime ? ?prime ?prime;? turnstileleft (E e2)[eprime1] : ?2 a59Rprime1 To prove part 2., we must show that ?prime,R1;Hprime turnstileleft ?prime. By inversion on ?,R1;H turnstileleft ? we have ? ? (?,?) or ? ? (?,?),?primeprime. We have two cases: ? ? (?,?): By (iii) we must have R1 ? ? such that TC1 f ? ?prime ? f ? ? f ? ?prime1 ? nprimeprime ? ver(Hprime,f) [?;?prime1;?1],?;Hprime turnstileleft (?prime,?prime) To achieve the desired result we need to prove: TC1 f ? ?prime ? f ? ? f ? ?prime1 ??2 ??f ? nprimeprime ? ver(Hprime,f) [?;?prime1 ??2 ??f;?],?;Hprime turnstileleft (?prime,?prime) The first premise is byassumption (since dom(?) = dom(?prime) from the definition ofU[(?,?)]upd,dirn+1 ). For the second premise, we need to show that for all f ? (?2??f) ? nprimeprime ? ver(Hprime,f) (for those f ? ?prime1 the result is by assumption). Consider each possible update type: case dir = bck : From the definition of U[(?,?)]upd,bckn+1 , we know that nprimeprime = n + 1; from the definition of U[H]updn we know that n + 1 ? ver(Hprime,f) for all f, hence nprimeprime ? ver(Hprime,f) for all f. case dir = fwd : From (v) we have that f ? ?1 ? ver(H,f) ? ver(Hprime,f). Since (?2 ? ?f) ? ?1 (by ?prime1 a3 ?prime2 a3 ?prime3 arrowhookleft? ?prime), we have f ? (?2 ? ?f) ? ver(H,f) ? ver(Hprime,f). By inversion on ?,R1;H turnstileleft ? we have f ? (?1 ? ?2 ? ?f) ? nprime ? ver(H,f), and thus f ? (?2 ? ?f) ? nprime ? ver(Hprime,f). We have U[(?,?)]upd,fwdn+1 = (?,?) hence nprimeprime = nprime, so finally we have f ? (?2 ??f) ? nprimeprime ? ver(Hprime,f). 221 ? ? (?,?),?primeprime By (iii), we must have R1 ? ?primeprime,Rprimeprime such that TC2 ?primeprime,Rprimeprime;Hprime turnstileleft ?primeprime ?prime1 ? [?;?prime1;?1] f ? ?prime ? f ? ? f ? ?prime1 ? nprimeprime ? ver(Hprime,f) ?prime1,?primeprime,Rprimeprime;Hprime turnstileleft ((?prime,?prime),?primeprime) We wish to show that TC2 ?primeprime,Rprimeprime;Hprime turnstileleft ?primeprime ?prime ? [?;?prime1 ??2 ??f;?] f ? ?prime ? f ? ? f ? (?prime1 ??2 ??f) ? nprimeprime ? ver(Hprime,f) ?prime,?primeprime,Rprimeprime;Hprime turnstileleft ((?prime,?prime),?primeprime) ?primeprime,Rprimeprime;Hprime turnstileleft ? follows by assumption while the third and fourth premises follow by the same argument as in the ? ? (?,?) case, above. Part 3. follows directly from (iv). Part 4. follows directly from (v) and the fact that ?1 ? ? (because ?1 ? ?2 ??f ??). case v E : By assumption, we have ?;? turnstileleft (v E)[e2] : ? a59 R. By subtyping derivations (Lemma B.0.6) we have: TSub TApp ?1;? turnstileleft v : ?1 ???f ?prime2 a59? ?2;? turnstileleft E[e2] : ?1 a59R2 ?1 a3?2 a3?3 arrowhookleft? ?s ??3 = ??f ??3 ? ??f ??3 ? ??f v negationslash? vprime ? R2 = ? ?s;? turnstileleft (v E)[e2] : ?prime2 a59R2 ?prime2 ? ?2 SCtxt ? ? [?;?;?] ?s ? [?;?1 ??2 ??f;?] (?1 ??2 ??f) ? ? ?s ? ? ?;? turnstileleft (v E)[e2] : ?2 a59R2 and by flow effect weakening (Lemma B.0.7) we know that ? 
and ? are unchanged in the use of (TSub). By inversion on ?n;?;H;(v E)[e2]? ?? ? ?n + 1;?prime;Hprime;(v E)[e2]? we have ?n;?;H;e2? ?? ? ?n + 1;?prime;Hprime;eprime2?, and then applying [cong] we have ?n;?;H;E[e2]? ?? ? ?n + 1;?prime;Hprime;E[e2]?. From ?,R2;H turnstileleft ? we know that: f ? ? ? f ? ? f ? ? ? nprime ? ver(H,f) where (?,?) is the top of ?. We have ? ? [?;?;?], ?s ? [?s;?s;?s], ?s ? ?, ?s = ?1 ? ?2 ? ?3 (where ?3 = ?f), ?2 ? [?2;?2;?2], ?2 ? ?1 ? ?1 = ? (since ?1 = ?; if it?s not ? we can construct a derivation for v that has ?1 = ? as argued in preservation (Lemma D.0.36), (TApp)-[Cong], case v E). We have f ? ? ? f ? ? f ? ?2 ? nprime ? ver(H,f) hence ?2,R2;H turnstileleft ? and we can apply induction on ?2;? turnstileleft E[e2] : ?1 ???f ?prime2 a59R2, yielding: (i) ?prime2;?prime turnstileleft E[e2] : ?1 a59R2 and (ii) n + 1;?prime turnstileleft Hprime (iii) ?prime2,R2;Hprime turnstileleft ?prime (iv) traceOK(?prime) (v) (dir = bck) ? nprimeprime ? n + 1 ? (dir = fwd) ? (f ? ?2 ? ver(H,f) ? ver(Hprime,f)) where ?prime2 ? [?2;?prime2;?2] and ?prime2 ? ?2. Choose ?prime1 = [?;?;?2 ??prime2] and ?prime3 = [???prime2;?f;?] and thus ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?prime and ?prime?3 = ??f. Let ?prime ? [?;?prime2 ??f;?] and thus ?prime2 ??f ? ? as required. 222 To prove 1., we have n + 1;?prime turnstileleft Hprime by (ii), and apply (TApp): TApp ?prime1;?prime turnstileleft v : ?1 ???f ?prime2 a59? ?prime2;?prime turnstileleft E[e2] : ?1 a59R2 ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?prime ?prime?3 = ??f ?prime?3 ? ??f ?prime?3 ? ??f v negationslash? vprime ? R2 = ? ?prime;?prime turnstileleft (v E)[e2] : ?prime2 a59R2 The first premise follows by value typing, the second by (i), the third?sixth by choice of ?prime1 and ?prime3, and the last holds vacuously. We can now apply (TSub): TSub ?prime;? turnstileleft (v E)[e2] : ?prime2 a59R2 ?prime2 ? ?2 ?prime ? ?prime ?prime;? turnstileleft (v E)[e2] : ?2 a59R2 To prove part 2., we must show that ?prime,R2;Hprime turnstileleft ?prime. By inversion on ?,R2;H turnstileleft ? we have ? ? (?,?) or ? ? (?,?),?primeprime. We have two cases: ? ? (?,?): By (iii) we must have R2 ? ? such that TC1 f ? ?prime ? f ? ? f ? ?prime2 ? nprimeprime ? ver(Hprime,f) [?;?prime2;?2],?;Hprime turnstileleft (?prime,?prime) To achieve the desired result we need to prove: TC1 f ? ?prime ? f ? ? f ? ?prime2 ??f ? nprimeprime ? ver(Hprime,f) [?;?prime2 ??f;?],?;Hprime turnstileleft (?prime,?prime) The first premise follows by assumption (since dom(?) = dom(?prime) from the definition of U[(?,?)]upd,dirn+1 ). For the second premise, we need to show that for all f ? ?f ? nprimeprime ? ver(Hprime,f) (for those f ? ?prime2 the result is by assumption). Consider each possible update type: case dir = bck : From the definition of U[(?,?)]upd,bckn+1 , we know that nprimeprime = n + 1; from the definition of U[H]updn we know that n + 1 ? ver(Hprime,f) for all f, hence nprimeprime ? ver(Hprime,f) for all f. case dir = fwd : From (v) we have that f ? ?2 ? ver(H,f) ? ver(Hprime,f). Thus ?f ? ?2 (by ?prime1a3?prime2a3 ?prime3 arrowhookleft? ?prime) implies f ? ?f ? ver(H,f) ? ver(Hprime,f). By inversion on ?,R2;H turnstileleft ? we have f ? (?2 ? ?f) ? nprime ? ver(H,f), and thus f ? ?f ? nprime ? ver(Hprime,f). We have U[(?,?)]upd,fwdn+1 = (?,?) hence nprimeprime = nprime, so finally we have f ? ?f ? nprimeprime ? ver(Hprime,f). ? ? (?,?),?primeprime By (iii), we must have R2 ? 
?primeprime,Rprimeprime such that TC2 ?primeprime,Rprimeprime;Hprime turnstileleft ?primeprime ?prime2 ? [?;?prime2;?2] f ? ?prime ? f ? ? f ? ?prime2 ? nprimeprime ? ver(Hprime,f) ?prime2,?primeprime,Rprimeprime;Hprime turnstileleft ((?prime,?prime),?primeprime) We wish to show that TC2 ?primeprime,Rprimeprime;Hprime turnstileleft ?primeprime ?prime ? [?;?prime2 ??f;?] f ? ?prime ? f ? ? f ? (?prime2 ??f) ? nprimeprime ? ver(Hprime,f) ?prime,?primeprime,Rprimeprime;Hprime turnstileleft ((?prime,?prime),?primeprime) ?primeprime,Rprimeprime;Hprime turnstileleft ? follows by assumption while the third and fourth premises follow by the same argument as in the ? ? (?,?) case, above. Part 3. follows directly from (iv). Part 4. follows directly from (v) and the fact that ?2 ? ?. 223 case all others : Similar to cases above. This lemma says that if take an evaluation step that is not an update, the version set of any z remains unchanged. Lemma B.0.14 (Non-update step version preservation). If ?n;?;H;e? ??? ?n;?prime;Hprime; eprime? then for all z ? dom(Hprime), ver(Hprime,z) = ver(H,z). Proof. By inspection of the evaluation rules. The following lemma states that if we start with a well-typed program and a version-consistent trace and we can take an evaluation step, then afterward we will still have a well-typed program whose trace is version-consistent. Lemma B.0.15 (Preservation). Suppose we have the following: 1. n turnstileleft H,e : ? (such that ?;? turnstileleft e : ? a59R and n;? turnstileleft H for some ? and ?) 2. ?,R;H turnstileleft ? 3. traceOK(?) 4. ?n;?;H;e? ??? ?n;?prime;Hprime; eprime? Then for some ?prime ? ? and ?prime ? [?? ??0;?prime;??] such that ?prime ??0 ? ??, we have: 1. n turnstileleft Hprime,eprime : ? where ?prime;?prime turnstileleft eprime : ? a59Rprime and n;?prime turnstileleft Hprime 2. ?prime,Rprime;Hprime turnstileleft ?prime 3. traceOK(?prime) Proof. Induction on the typing derivation n turnstileleft H,e : ?. By inversion, we have that ?;? turnstileleft e : ? a59R; consider each possible rule for the conclusion of this judgment: case (TInt-TVar-TGvar-TLoc) : These expressions do not reduce, so the result is vacuously true. case (TRef) : We have that: (TRef) ?;? turnstileleft e : ? a59R?;? turnstileleft ref e : ref ? ? a59R There are two possible reductions: case [ref] : We have that e ? v, R = ?, and ?n;(?,?);H;ref v? ??? ?n;(?,?);Hprime;r? where r /? dom(H) and Hprime = H,r mapsto? (?,v,?). Let ?prime = ?,r : ref ? ? and ?prime = ? (which is acceptable since ?prime? = ?? ? ?, ?prime ? ? ? ??, and ?prime? = ??), and Rprime = ?. We have part 1. as follows: (TSub) (TLoc) ?prime(r) = ref ? ?? ?;?prime turnstileleft r : ref ? ? a59? ref ? ? ? ref ? ? ?? ? ? ?;?prime turnstileleft r : ref ? ? a59? Heap well-formedness n;?prime turnstileleft H,r mapsto? (?,v,?) holds since ??;?prime turnstileleft v : ? follows by value typing (Lemma B.0.5) from ?;?prime turnstileleft v : ?, which we have by assumption and weakening; we have n;?prime turnstileleft H by weakening. To prove 2., we must show ?,?;Hprime turnstileleft (?,?). This follows by assumption since Hprime only contains an additional location (i.e., not a global variable) and nothing else has changed. Part 3. follows by assumption since ?prime = ?. 224 case [cong] : We have that ?n;?;H;ref E[eprimeprime]? ??? ?n;?prime;Hprime;ref E[eprimeprimeprime]? from ?n;?;H;eprimeprime? ??? ?n;?prime;Hprime;eprimeprimeprime?. By [cong], we have ?n;?;H;e? ??? ?n;?prime;Hprime;eprime? where e ? E[eprimeprime] and eprime ? 
E[eprimeprimeprime]. By induction we have: (i) ?prime;?prime turnstileleft eprime : ? a59Rprime and (ii) n;?prime turnstileleft Hprime (iii) ?prime,Rprime;Hprime turnstileleft ?prime (iv) traceOK(?prime) where ?prime? = ?? ? ?0, ?prime ? ?0 ? ??, and ?prime? = ??. We prove 1. using (ii), and applying [TRef] using (i): (TRef) ?prime;?prime turnstileleft eprime : ? a59Rprime?prime;?prime turnstileleft ref eprime : ref ? ? a59Rprime Part 2. follows directly from (iii), and part 3. follows directly from (iv). case (TDeref) : We know that (TDeref) ?1;? turnstileleft e : ref ?r ? a59R ??2 = ?r ?1 a3?2 arrowhookleft? ? ?;? turnstileleft !e : ? a59R We can reduce using either [gvar-deref], [deref], or [cong]. case [gvar-deref] : Thus we have e ? z such that ?n;(?,?);(Hprimeprime,z mapsto? (?prime,v,?));!z? ??{z} ?n;(?,? ?(z,?));(Hprimeprime,z mapsto? (?prime,v,?));v? (where H ? (Hprimeprime,z mapsto? (?prime,v,?))), by subtyping derivations (Lemma B.0.6) we have (TSub) (TGVar) ?(z) = ref ? prime r ?prime ??;? turnstileleft z : ref ?primer ?prime a59? ?prime ? ? ? ? ?prime ?primer ? ?r ref ?primer ?prime ? ref ?r ? ?? ? ?1 ?1;? turnstileleft z : ref ?r ? a59? and (TDeref) ?1;? turnstileleft z : ref ?r ? a59? ??2 = ?r ?1 a3?2 arrowhookleft? ? ?;? turnstileleft !z : ? a59? (where R = ?) and ? ? [??1;??1 ??r;??2 ]. Let ?prime = ?, ?prime = [??1 ?{z};?;??2 ] and Rprime = R = ?. Since z ? ?r (by n;? turnstileleft H) we have ??{z} ? (??1??r) hence ?prime?{z} ? ??. The choice of ?prime is acceptable since ?prime? = ?? ?{z}, ?prime ?{z} ? ??, and ?prime? = ??. To prove 1., we need to show that ?prime;? turnstileleft v : ? a59 ? (the rest of the premises follow by assumption of n turnstileleft H,!z : ?). H(z) = (?prime,v,?) and ?(z) = ref ?primer ?prime implies ?prime;? turnstileleft v : ?prime a59 ? by n;? turnstileleft H. The result follows by (TSub): (TSub) ?prime;? turnstileleft v : ?prime a59? ?prime ? ? ?prime ? ?prime?prime;? turnstileleft v : ? a59? For part 2., we know ?,?;H turnstileleft (?,?): (TC1) f ? ? ? f ? ??1 f ? (??1 ??r) ? nprime ? ver(H,f) [??1;??1 ??r;??2 ],?;H turnstileleft (?,?) and need to prove ?prime,?;H turnstileleft (?,? ?(z,?)), hence: (TC1) f ? (? ?(z,?)) ? f ? ??1 ?{z} f ? ? ? nprime ? ver(H,f) [??1 ?{z};?;??2 ],?;H turnstileleft (?,? ?(z,?)) The first premise is true by assumption for all f ? ?, and for (z,?) since z ? ??1 ?{z}. The second premise is vacuously true. For part 3., we need to prove traceOK(nprime,? ?(z,?)); we have traceOK(nprime,?) by assumption, hence need to prove that nprime ? ?. Since by assumption of version consistency we have that f ? ??1 ??r ? nprime ? ver(H,f) and ver(H,f) = ver(Hprime,f) = ? (by Lemma B.0.14), and {z} ? ?r (by n;? turnstileleft H), we have nprime ? ?. 225 case [deref] : Follows the same argument as the [gvar-deref] case above for part 1.; parts 2 and 3 follow by assumption since the trace has not changed. case [cong] : Here ?n;?;H;!e? ??? ?n;?prime;Hprime;!eprime? follows from ?n;?;H,e? ??? ?n;?prime;Hprime,eprime?. To apply induc- tion, we must have ?1,R;H turnstileleft ? which follows by Lemma B.0.9 since ?,R;H turnstileleft ? and ?1a3?2 arrowhookleft? ?. Hence we have: (i) ?prime1;?prime turnstileleft eprime : ref ?r ? a59Rprime (ii) n;?prime turnstileleft Hprime (iii) ?prime1,Rprime;Hprime turnstileleft ?prime (iv) traceOK(?prime) for some ?prime ? ? and some ?prime1 ? [??1 ? ?0;?prime1;??1 ] where ?prime1 ? ?0 ? ??1. Let ?prime2 = [??1 ? ?0;?r;??2 ] hence ?prime?2 = ?r and ?prime1a3?prime2 arrowhookleft? ?prime, where ?prime ? 
[??1 ??0;?prime1 ??r;??2 ] and (?prime1 ??r)??0 ? (?1 ??r) hence ?prime? = ?? ??0, ?prime ??0 ? ??, and ?prime? = ?? as required. We prove 1. by (ii) and by applying [TDeref]: (TDeref) ?prime1;?prime turnstileleft eprime : ref ?r ? a59Rprime ?prime?2 = ?r ?prime1 a3?prime2 arrowhookleft? ?prime ?prime;?prime turnstileleft !eprime : ? a59Rprime The first premise follows from (i) and the second and third premises follows by definition of ?prime and ?prime2. To prove part 2., we must show that ?prime,Rprime;Hprime turnstileleft ?prime. We have two cases: ?prime ? (?,?): By (iii) we must have Rprime ? ? such that (TC1) f ? ? ? f ? ??1 ??0 f ? ?prime1 ? nprime ? ver(Hprime,f) [??1 ??0;?prime1;??1 ],?;Hprime turnstileleft (?,?) To achieve the desired result we need to prove: (TC1) f ? ? ? f ? ??1 ??0 f ? (?prime1 ??r) ? nprime ? ver(Hprime,f) [??1 ??0;?prime1 ??r;??1 ],?;Hprime turnstileleft (?,?) The first premise follows directly from (iii). To prove the second premise, we observe that by Lemma B.0.11, top(?) = (nprime,?prime) where ?prime ? ?, and by inversion on ?;R;H turnstileleft ? we know (a) f ? ?prime ? f ? ??1, and (b) f ? ?1 ? ?r ? nprime ? ver(H,f). The second premise follows from (iii) and the fact that f ? ?r ? nprime ? ver(H,f) by (b), and for all f, ver(H,f) = ver(Hprime,f) by Lemma B.0.14. ?prime ? (?,?),?primeprime: By (iii), we must have Rprime ? ?primeprimeprime,Rprimeprimeprime such that (TC2) ?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft ?primeprime ?prime1 ? [??1 ??0;?prime1;??1 ] f ? ? ? f ? ??1 ??0 f ? ?prime1 ? nprime ? ver(Hprime,f) ?prime1,?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft (?,?),?primeprime We wish to show that (TC2) ?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft ?primeprime ?prime ? [??1 ??0;?prime1 ??r;??2 ] f ? ? ? f ? ??1 ??0 f ? ?prime1 ??r ? nprime ? ver(Hprime,f) ?prime,?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft (?,?),?primeprime The first and third premises follow from (iii), while the fourth premise follows by the same argument as in the ?prime ? (?,?) case, above. Part 3. follows directly from (iv). 226 case (TAssign) : We know that: (TAssign) ?1;? turnstileleft e1 : ref ?r ? a59R1 ?2;? turnstileleft e2 : ? a59R2 ??3 = ?r ?1 a3?2 a3?3 arrowhookleft? ? e1 negationslash? v ? R2 = ? ?;? turnstileleft e1 := e2 : ? a59R1 trianglerighttriangleleft R2 From R1 trianglerighttriangleleft R2 it follows that either R1 ? ? or R2 ? ?. We can reduce using [gvar-assign], [assign], or [cong]. case [gvar-assign] : This implies that e ? z := v with ?n;(?,?);(Hprimeprime,z mapsto? (?,vprime,?));z := v? ??{z} ?n;(?,? ?(z,?));(Hprimeprime,z mapsto? (?,v,?));v? where H ? (Hprimeprime,z mapsto? (?,vprime,?)). R1 ? ? and R2 ? ? (thus R1 trianglerighttriangleleft R2 ? ?). Let ?prime = ?, Rprime = ?, and ?prime = [???{z};?;??]. Since z ? ?r (by n;? turnstileleft H) we have ? ? (?1??2??r), hence ? ? {z} ? (?1 ? ?2 ? ?r) which means ?prime ? {z} ? ??. The choice of ?prime is acceptable since ?prime? = ?? ? {z}, ?prime ? {z} ? ??, and ?prime? = ??. We prove 1. as follows. Since ?2;? turnstileleft v : ? a59 ?, by value typing (Lemma B.0.5) we have ?prime;? turnstileleft v : ? a59 ?. n;? turnstileleft Hprime follows from n;? turnstileleft H and ?prime;? turnstileleft v : ? a59? (since ?? = ?). Parts 2. and 3. are similar to the (TDeref) case. case [assign] : Part 1. is similar to (gvar-assign); we have parts 2. and 3. by assumption. case [cong] : Consider the shape of E: case E := e : ?n;?;H;e1 := e2? ??? ?n;?prime;Hprime;eprime1 := e2? 
follows from ?n;?;H;e1? ??? ?n;?prime;Hprime;eprime1?. Since e1 negationslash? v ? R2 = ? by assumption, by Lemma B.0.10 we have ?1,R1;H turnstileleft ?, hence we can apply induction: (i) ?prime1;?prime turnstileleft eprime1 : ref ?r ? a59Rprime1 and (ii) n;?prime turnstileleft Hprime (iii) ?prime1,Rprime1;Hprime turnstileleft ?prime (iv) traceOK(?prime) for some ?prime ? ? and some ?prime1 ? [??1 ??0;?prime1;??1 ] where ?prime1??0 ? ?1 and ??1 ? ??2??r???3 . Let ? prime2 ? [??1 ??prime1 ??0;??2;?r ???3 ] ?prime3 ? [??1 ??prime1 ??0 ???2;?r;??3 ] Thus ?prime?3 = ?r and ?prime1a3?prime2a3?prime3 arrowhookleft? ?prime such that ?prime ? [??1 ??0;?prime1???2??r;??3 ] The choice of ?prime is acceptable since ?prime? = ?? ??0, (?prime1 ??r ??2)??0 ? (?1 ??r ??2) i.e., ?prime ??0 ? ?? and ?prime? = ?? as required). To prove 1., we have n;?prime turnstileleft Hprime by (ii), and apply (TAssign): (TAssign) ?prime1;?prime turnstileleft eprime1 : ref ?r ? a59Rprime1 (TSub) ?2;?prime turnstileleft e2 : ? a59R2 ? ? ? ??1 ??prime1 ??0 ? ??1 ???1 ??2 ? ??2 ?r ???3 ? ?r ???3 ?2 ? ?prime2 ?prime2;?prime turnstileleft e2 : ? a59R2 ?prime?3 = ?r ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?prime eprime1 negationslash? v ? R2 = ? ?prime;?prime turnstileleft eprime1 := e2 : ? a59Rprime1 trianglerighttriangleleft R2 Note that ?2;?prime turnstileleft e2 : ? follows from ?2;? turnstileleft e2 : ? by weakening (Lemma B.0.1). To prove part 2., we must show that ?prime,Rprime1;Hprime turnstileleft ?prime (since Rprime1 trianglerighttriangleleft R2 = Rprime1). By inversion on ?,R;H turnstileleft ? we have ? ? (?,?) or ? ? (?,?),?primeprime. We have two cases: ?prime ? (?,?): By (iii) we must have Rprime1 ? ? such that (TC1) f ? ? ? f ? ??1 ??0 f ? ?prime1 ? nprime ? ver(Hprime,f) [??1 ??0;?prime1;??1 ],?;Hprime turnstileleft (?,?) 227 To achieve the desired result we need to prove: (TC1) f ? ? ? f ? ??1 ??0 f ? (?prime1 ???2 ??r) ? nprime ? ver(Hprime,f) [??1 ??0;?prime1 ???2 ??r;??3 ],?;Hprime turnstileleft (?,?) The first premise follows directly from (iii). To prove the second premise, we observe that by Lemma B.0.11, top(?) = (nprime,?prime) where ?prime ? ?, and by inversion on ?;R;H turnstileleft ? we know (a) f ? ?prime ? f ? ??1, and (b) f ? ??1 ? ??2 ? ?r ? nprime ? ver(H,f). The second premise follows from (iii) and the fact that f ? ?r ? nprime ? ver(H,f) by (b), and for all f, ver(H,f) = ver(Hprime,f) by Lemma B.0.14. ?prime ? (?,?),?primeprime: By (iii), we must have Rprime1 ? ?primeprimeprime,Rprimeprimeprime such that (TC2) ?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft ?primeprime ?prime1 ? [??1 ??0;?prime1;??1 ] f ? ? ? f ? ??1 ??0 f ? ?prime1 ? nprime ? ver(Hprime,f) ?prime1,?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft (?,?),?primeprime We wish to show that (TC2) ?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft ?primeprime ?prime ? [??1 ??0;?prime1 ???2 ??r;??3 ] f ? ? ? f ? ??1 ??0 f ? (?prime1 ???2 ??r) ? nprime ? ver(Hprime,f) ?prime,?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft (?,?),?primeprime The first and third premises follow from (iii), while the fourth premise follows by the same argument as in the ?prime ? (?,?) case, above. Part 3. follows directly from (iv). case r := E : ?n;?;H;r := e2? ??? ?n;?prime;Hprime;r := eprime2? follows from ?n;?;H;e2? ??? ?n;?prime;Hprime;eprime2?. Since e1 ? r, by inversion R1 ? ?. By Lemma B.0.10 (which we can apply because ??1 ? ?; if ??1 negationslash? ? 
we can rewrite the derivation using value typing to make it so) we have ?2,R2;H turnstileleft ?, hence we can apply induction to get: (i) ?prime2;?prime turnstileleft eprime2 : ? a59Rprime2 (ii) n;?prime turnstileleft Hprime (iii) ?prime2,Rprime2;Hprime turnstileleft ?prime (iv) traceOK(?prime) for some ?prime ? ? and some ?prime2 ? [??2 ??0;?prime2;??2 ] where (?prime2??0) ? ??2; note ??2 ? ??1 (since ??1 ? ?) and ??2 ? ?3 ???3 . Let ? prime1 ? [??1 ??0;?;?prime2 ??r ???3 ] ?prime3 ? [??1 ??0 ??prime2;?r;??3 ] Thus ?prime?3 = ?r and ?prime1a3?prime2a3?prime3 arrowhookleft? ?prime such that ?prime ? [??1 ??0;?prime2??r;??3 ] and (?prime2??r)??0 ? (??2 ??r). The choice of ?prime is acceptable since ?prime? = ?? ??0, ?prime ??0 ? ?? and ?prime? = ?? as required). To prove 1., we have n;?prime turnstileleft Hprime by (ii), and we can apply [TAssign]: (TAssign) ?prime1;?prime turnstileleft r : ref ?r ? a59? ?prime2;?prime turnstileleft eprime2 : ? a59Rprime2 ?prime?r3 = ?r ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?prime r negationslash? v ? Rprime2 = ? ?prime;?prime turnstileleft r := eprime2 : ? a59? trianglerighttriangleleft Rprime2 Note that we have ?prime1;?prime turnstileleft r : ref ?r ? a59? from ?1;? turnstileleft r : ref ?r ? a59? by value typing and weakening To prove part 2., we must show that ?prime,Rprime2;Hprime turnstileleft ?prime (since R1 trianglerighttriangleleft R2 = Rprime2). By inversion on ?,R;H turnstileleft ? we have ? ? (?,?) or ? ? (?,?),?primeprime. We have two cases: ?prime ? (?,?): By (iii) we must have Rprime2 ? ? such that (TC1) f ? ? ? f ? ??2 ??0 f ? ?prime2 ? nprime ? ver(Hprime,f) [??2 ??0;?prime2;??2 ],?;Hprime turnstileleft (?,?) 228 To achieve the desired result we need to prove: (TC1) f ? ? ? f ? ??1 ??0 f ? (?r ??prime2) ? nprime ? ver(Hprime,f) [??1 ??0;?prime2 ??r;??3 ],?;Hprime turnstileleft (?,?) The first premise follows from (iii) since ??1 = ??2. To prove the second premise, we observe that by Lemma B.0.11, top(?) = (nprime,?prime) where ?prime ? ?, and by inversion on ?;R;H turnstileleft ? we know (a) f ? ?prime ? f ? ??1, and (b) f ? ?r ???2 ? nprime ? ver(H,f). The second premise follows from (iii) and the fact that f ? ?r ? nprime ? ver(H,f) by (b), and for all f, ver(H,f) = ver(Hprime,f) by Lemma B.0.14. ?prime ? (?,?),?primeprime: By (iii), we must have Rprime2 ? ?primeprimeprime,Rprimeprimeprime such that: (TC2) ?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft ?primeprime ?prime2 ? [??2 ??0;?prime2;??2 ] f ? ? ? f ? ??2 ??0 f ? ?prime2 ? nprime ? ver(Hprime,f) ?prime2,?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft (?,?),?primeprime We wish to show that (TC2) ?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft ?primeprime ?prime ? [??1 ??0;?prime2 ??r;??3 ] f ? ? ? f ? ???0 f ? ?prime2 ??r ? nprime ? ver(Hprime,f) ?prime,?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft (?,?),?primeprime The first and third premises follow from (iii), while the fourth premise follows by the same argument as in the ?prime ? (?,?) case, above. Part 3. follows directly from (iv). case (TUpdate) : case [no-update] : Thus we must have ?n;(?,?);H;update?primeprime,?primeprime? ?? ?n;(?,?);H;0? Let ?prime = ? and ?prime = ? (and thus ? ? ? ? ??, ?prime? = ?? ? ?, and ?prime? = ??) as required. For 1., ?;? turnstileleft 0 : int a59? follows from (TInt) and value typing and n;? turnstileleft H is true by assumption. Parts 2. and 3. follow by assumption. case (TIf) : We know that: (TIf) ?1;? turnstileleft e1 : int a59R ?2;? 
turnstileleft e2 : ? a59? ?2;? turnstileleft e3 : ? a59? ?1 a3?2 arrowhookleft? ? ?;? turnstileleft if0 e1 then e2 else e3 : ? a59R We can reduce using [if-t], [if-f] or [cong]. case [if-t] : This implies that e1 ? v hence R = ?. We have ?n;(?,?);H;if0 v then e2 else e3? ?? ?n;(?,?);H;e2? Let ?prime = ? and ?prime = ? (and thus ??? ? ??, ?prime? = ?? ??, and ?prime? = ??) as required. To prove 1., we have n;? turnstileleft H by assumption, and we have (TSub) ?2;? turnstileleft e2 : ? a59? ? ? ? ?2 ? ? ?;? turnstileleft e2 : ? a59? The first premise holds by assumption, the second by reflexivity of subtyping, and the third by Lemma B.0.2. 229 case [if-f] : This is similar to [if-t]. case [cong] : ?n;?;H;if0 e1 then e2 else e3? ??? ?n;?prime;Hprime;if0 eprime1 then e2 else e3? follows from ?n;?;H;e1? ??? ?n;?prime;Hprime;eprime1?. To apply induction, we must have ?1,R;H turnstileleft ? which follows by Lemma B.0.9 since ?,R;H turnstileleft ? and ?1 a3?2 arrowhookleft? ?. Hence we have: (i) ?prime1;?prime turnstileleft eprime1 : int a59Rprime and (ii) n;?prime turnstileleft Hprime (iii) ?prime1,Rprime;Hprime turnstileleft ?prime (iv) traceOK(?prime) for some ?prime ? ? and some ?prime1 ? [??1 ??0;?prime1;??1 ] where ?prime1 ??0 ? ??1. (Note that ??1 ? ??2 ???2 .) Let ?prime2 ? [??1 ? ?prime1 ? ?0;??2;??2 ]. Thus ?prime1 a3?prime2 arrowhookleft? ?prime so that ?prime ? [??1 ? ?0;?prime1 ? ??2;??2 ] where ?prime1 ??0 ???2 ? ??1 ???2 and ?prime? = ?? as required. To prove 1., we have n;?prime turnstileleft Hprime by (ii), and can apply (TIf): We prove 1. by (ii) and as follows: (TIf) (TSub) ?2;?prime turnstileleft e2 : ? a59? ? ? ? ?2 ? ?prime2 ?prime2;?prime turnstileleft e2 : ? a59? (TSub) ?2;?prime turnstileleft e2 : ? a59? ? ? ? ?2 ? ?prime2 ?prime2;?prime turnstileleft e3 : ? a59? ?prime1;?prime turnstileleft eprime1 : int a59Rprime1 ?prime1 a3?prime2 arrowhookleft? ?prime ?prime;?prime turnstileleft if0 eprime1 then e2 else e3 : ? a59Rprime Note that ?2;?prime turnstileleft e2 : ? a59 R follows from ?2;? turnstileleft e2 : ? a59 R by weakening (Lemma B.0.1) and likewise for ?2;?prime turnstileleft e3 : ? a59R . Parts 2. and 3. follow by an argument similar to (TDeref)-[cong] and (TAssign)-[cong]. case (TTransact) : We know that: (TTransact) ?primeprime;? turnstileleft e : ? a59? ?? ? ?primeprime? ?? ? ?primeprime? ?;? turnstileleft tx e : ? a59? We can reduce using [tx-start]: ?n;(?,?);H;tx e? ?? ?n;(n,?),(?,?);H;intx e? Let ?prime = ? and ?prime ? [??;?;??] (and thus ??? ? ??, ?prime? = ?? ??, and ?prime? = ??, as required). To prove 1., we have n;? turnstileleft H by assumption, and the rest follows by (TIntrans): (TIntrans) ?primeprime;? turnstileleft e : ? a59? ?prime? ? ?primeprime? ?prime? ? ?primeprime? ?prime;? turnstileleft intx e : ? a59?primeprime,? The first premise is true by assumption, and the second by choice of ?prime. We prove 2. as follows: (TC2) (TC1) f ? ? ? f ? ?primeprime? f ? ?primeprime? ? n ? ver(H,f) ?primeprime,?;H turnstileleft (n,?) f ? ? ? f ? ?? f ? ? ? nprime ? ver(H,f) [??;?;??],?primeprime,?;H turnstileleft (n,?),(?,?) First premise of [TC1] is true vacuously, and the second is true by n;? turnstileleft H, which we have by assumption. For [TC2], the first premise holds by inversion of ?,?;H turnstileleft (?,?), which we have by assumption, and the second holds vacuously. Part 3. follows easily: we have traceOK((?,?)) by assumption, traceOK((n,?)) is vacuously true, hence traceOK((n,?),(?,?)) is true. 
case (TIntrans) : We know that: (TIntrans) ?primeprime;? turnstileleft e : ? a59R ?? ? ?primeprime? ?? ? ?primeprime? ?;? turnstileleft intx e : ? a59?primeprime,R There are two possible reductions: 230 case [tx-end] : We have that e ? v and thus R ? ?; we reduce as follows: traceOK(nprimeprime,?prime) ?n;((?prime,?prime),(?,?));H;intx v? ?? ?n;(?,?);H;v? Let ?prime = ? and ?prime = ? (and thus ?prime? = ?? ??, ?prime ?? ? ??, and ?prime? = ?? as required). To prove 1., we know that n;? turnstileleft H follows by assumption and ?;? turnstileleft v : ? a59? by value typing. To prove 2., we must show that ?,?;H turnstileleft (?,?), but this is true by inversion on ?,?primeprime,?;H turnstileleft ((?prime,?prime),(?,?)). For 3., traceOK((?,?)) follows from traceOK(((?prime,?prime),(?,?))) (which is true by assumption). case [tx-cong-2] : We know that ?n;?;H;e? ?? ? ?nprime;?prime;Hprime;eprime? ?n;?;H;intx e? ??? ?nprime;?prime;Hprime;intx eprime? follows from ?n;?;H;e? ??? ?n;?prime;Hprime;eprime? (because the reduction does not perform an update, hence ? ? ?0 and we apply [tx-cong-2]). We have ?primeprime,R;H turnstileleft ? by inversion on ?,?primeprime,R;H turnstileleft ((?,?),?), hence by induction: (i) ?primeprimeprime;?prime turnstileleft eprime : ? a59Rprime and (ii) n;?prime turnstileleft Hprime (iii) ?primeprimeprime,Rprime;Hprime turnstileleft ?prime (iv) traceOK(?prime) for some ?prime ? ? and some ?primeprimeprime such that ?primeprimeprime? = ?primeprime? ??0, ?primeprimeprime ??0 ? ?primeprime?, and ?primeprimeprime? = ?primeprime?. Let ?prime = ? (hence ?prime? = ?prime? ?? , ?prime ?? ? ??, and ?prime? = ?? as required) and ?prime = ?. To prove 1., we have n;?prime turnstileleft Hprime by (ii), and we can apply [TIntrans]: (TIntrans) ?primeprimeprime;?prime turnstileleft eprime : ? a59Rprime ?prime? ? ?primeprimeprime? ?prime? ? ?primeprimeprime? ?prime;?prime turnstileleft intx eprime : ? a59?primeprimeprime,Rprime The first premise follows from (i), and the second holds because ?? ? ?primeprime? and ?? ? ?primeprime? by assumption and we picked ?prime = ? (hence ?prime? ? ?primeprimeprime? ?prime? ? ?primeprimeprime?). Part 2. follows directly from (iii). Part 3. follows directly from (iv). case (TLet) : We know that: (TLet) ?1;? turnstileleft e1 : ?1 a59R ?2;?,x : ?1 turnstileleft e2 : ?2 a59? ?1 a3?2 arrowhookleft? ? ?;? turnstileleft let x : ?1 = e1 in e2 : ?2 a59R We can reduce using either [let] or [cong]. case [let] : This implies that e1 ? v hence R ? ?. We have: ?n;(?,?);H;let x : ? = v in e? ?? ?n;(?,?);H;e[x mapsto? v]? To prove 1., we have n;? turnstileleft H by assumption; let ?prime = ? and ?prime = ?; since ?2 ? (?1 ? ?2), we can apply [TSub]: (TSub) ?2;?,x : ?1 turnstileleft e2 : ?2 a59? ?2 ? ?2 ?2 ? ? ?;?,x : ?1 turnstileleft e2 : ?2 a59? The first premise holds by assumption, the second by reflexivity of subtyping, and the third by Lemma B.0.2. By value typing we have ?;? turnstileleft v : ?1 a59 ?, so by substitution (Lemma B.0.17) we have ?;? turnstileleft e2[x mapsto? v] : ?2 a59?. Parts 2. and 3. hold by assumption. case [cong] : Similar to (TIf)-[Cong]. 231 case (TApp) : We know that: (TApp) ?1;? turnstileleft e1 : ?1 ???f ?2 a59R1 ?2;? turnstileleft e2 : ?1 a59R2 ?1 a3?2 a3?3 arrowhookleft? ? ??3 = ??f ??3 ? ??f ??3 ? ??f e1 negationslash? v ? R2 = ? ?;? turnstileleft e1 e2 : ?2 a59R1 trianglerighttriangleleft R2 We can reduce using either [call] or [cong]. case [call] : We have that ?n;(?,?);(Hprimeprime,z mapsto? (?,?(x).e,?));z v? ??{z} ?n;(?,? 
?(z,?));(Hprimeprime,z mapsto? (?,?(x).e,?));e[x mapsto? v]? (where H ? (Hprimeprime,z mapsto? (?,?(x).e,?))), and (TApp) ?1;? turnstileleft z : ?1 ???f ?2 a59? ?2;? turnstileleft v : ?1 a59? ?1 a3?2 a3?3 arrowhookleft? ? ??3 = ??f ??3 ? ??f ??3 ? ??f z negationslash? v ? R2 = ? ?;? turnstileleft z v : ?2 a59? where by subtyping derivations (Lemma B.0.6) we have (TSub) (TGVar) ?(z) = ? prime1 ???primef ?prime2 ??;? turnstileleft z : ?prime1 ???primef ?prime2 a59? ?1 ? ?prime1 ?prime2 ? ?2 ?primef ? ?f ?prime1 ???primef ?prime2 ? ?1 ???f ?2 ?? ? ?1 ?1;? turnstileleft z : ?1 ???f ?2 a59? Define ?f ? [?f;?f;?f] and ?primef ? [?primef;?primef;?primef]. Let ?prime = ?, Rprime = ? and choose ?prime = [??1 ?{z};?f;??3 ]. Since z ? ?primef (by n;? turnstileleft H) and ?primef ? ?f (by ?primef ? ?f) we have ?f ?{z} ? (?1 ??2 ??f). The choice of ?prime is acceptable since ?prime? = ?? ?{z}, ?prime??{z} ? ??, and ?prime? = ??. For 1., we have n;? turnstileleft Hprime by assumption; for the remainder we have to prove ?prime;? turnstileleft e[x mapsto? v] : ?2 a59?. First, we must prove that ?primef ? ?prime. Note that since {z} ? ?f by n;? turnstileleft Hprime, from ?1 a3?2 a3?3 arrowhookleft? ? and choice of ?prime we get ?prime?3 ?{z} ? ?f. We have: ?prime ? [??1 ?{z};?f;??3 ] (by choice of ?prime) ?f ? [?f;?f;?f] ?primef ? [?primef;?primef;?primef] ?primef ? ?f (by ?primef ? ?f) ?f ? ?primef (by ?primef ? ?f) ?f ? ?primef (by ?primef ? ?f) ?prime?3 ?{z} ? ?f (by assumption and choice of ?prime) ?prime?3 = ??1 ???1 ??prime?2 (by ?1 a3?2 a3?3 arrowhookleft? ?) ?prime?3 ? ?f (by assumption and choice of ?prime) Thus we have the result by [TSub] ?primef;? turnstileleft e[x mapsto? v] : ?prime2 a59? ?prime2 ? ?2 ?primef ? ?prime ?prime;? turnstileleft e[x mapsto? v] : ?2 By assumption, we have ?2;? turnstileleft v : ?1 a59?. By value typing and ?1 ? ?prime1 we have ?prime;? turnstileleft v : ?prime1 a59?. Finally by substitution we have ?prime;? turnstileleft e[x mapsto? v] : ?2 a59?. For part 2., we need to prove ?prime,?;H turnstileleft (?prime,?prime) where ?prime = ? ?(z,?) and nprimeprime = nprime, hence: (TC1) f ? (? ?(z,?)) ? f ? ?? ?{z} f ? ?f ? nprime ? ver(H,f) ?prime,?;H turnstileleft (?prime,?prime) The first premise is true by assumption and the fact that {z} ? {z}. The second premise is true by assumption. For part 3., we need to prove traceOK(? ?(z,?)); we have traceOK(?) by assumption, hence need to prove that nprime ? ?. Since by assumption we have that f ? ?1 ? ?2 ? ?f ? nprime ? ver(H,f) and {z} ? ?f, we have nprime ? ?. 232 case [cong] : case E e : ?n;?;H;e1 e2? ??? ?n;?prime;Hprime;eprime1 e2? follows from ?n;?;H;e1? ??? ?n;?prime;Hprime;eprime1?. Since e1 negationslash? v ? R2 = ? by assumption, by Lemma B.0.10 we have ?1,R1;H turnstileleft ? hence we can apply induction: (i) ?prime1;?prime turnstileleft eprime1 : ?1 ???f ?2 a59Rprime1 and (ii) n;?prime turnstileleft Hprime (iii) ?prime1,Rprime1;Hprime turnstileleft ?prime (iv) traceOK(?prime) for some ?prime ? ? and some ?prime1 ? [??1 ??0;?prime1;??1 ] where ?prime1??0 ? ?1 and ??1 ? ??2??f ???3 . Let ? prime2 ? [??1 ??prime1 ??0;??2;?f ???3 ] ?prime3 ? [??1 ??prime1 ??0 ???2;?f;??3 ] Thus ?prime?3 = ?f, ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?prime, ?prime?3 ? ??f and ?prime?3 ? ??f (since ?prime?3 ? ?0 ? ??3 and ?prime?3 = ??3 ). We have ?prime ? [??1 ??0;?prime1 ???2 ??f;??3 ]. The choice of ?prime is acceptable since ?prime? = ?? ? ?0, (?prime1 ? ?f ? ?2) ? ?0 ? (?1 ? ?2 ? ?f) i.e., ?prime ? ?0 ? ?? and ?prime? = ?? 
as required). To prove 1., we have n;?prime turnstileleft Hprime by (ii), and apply (TApp): (TApp) ?prime1;?prime turnstileleft eprime1 : ?1 ???f ?2 a59Rprime1 (TSub) ?2;?prime turnstileleft e2 : ?1 a59R2 ?1 ? ?1 ??1 ??prime1 ??0 ? ??1 ???1 ??2 ? ??2 ?f ???3 ? ?f ???3 ?2 ? ?prime2 ?prime2;?prime turnstileleft e2 : ?1 a59R2 ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?prime ?prime?3 = ??f ?prime?3 ? ??f ?prime?3 ? ??f eprime1 negationslash? v ? R2 = ? ?prime;?prime turnstileleft eprime1 e2 : ?2 a59Rprime1 trianglerighttriangleleft R2 Note that ?2;?prime turnstileleft e2 : ?1 a59 R2 follows from ?2;? turnstileleft e2 : ?1 a59 R2 by weakening (Lemma B.0.1). The last premise holds vacuously as R2 ? ? by assumption. To prove part 2., we must show that ?prime,Rprime;Hprime turnstileleft ?prime. The proof is similar to the (TAssign)- [cong] proof, case E := e but substituting ?f for ?r. Part 3. follows directly from (iv). case v E : ?n;?;H;v e2? ??? ?n;?prime;Hprime;v eprime2? follows from ?n;?;H;e2? ??? ?n;?prime;Hprime;eprime2?. For convenience, we make ??1 ? ?; if ??1 negationslash? ?, we can always construct a typing derivation of v that uses value typing to make ??1 ? ?. Note that ?1 a3?2 a3?3 arrowhookleft? ? would still hold since Lemma B.0.7 allows us to decrease ??2 to satisfy ??2 = ??1 ???1; similarly, since ??3 = ??1 ? ??1 ? ??2 we know that ??3 ? ??f would still hold if ??3 was smaller as a result of shrinking ??1 to be ?. Since e1 ? v, by inversion R1 ? ? and by Lemma B.0.10 (which we can apply since ??1 ? ?), we have ?2,R2;H turnstileleft ?; hence by induction: (i) ?prime2;?prime turnstileleft eprime2 : ?1 a59Rprime2 (ii) n;?prime turnstileleft Hprime (iii) ?prime2,Rprime2;Hprime turnstileleft ?prime (iv) traceOK(?prime) for some ?prime ? ? and some ?prime2 ? [??2 ??0;?prime2;??2 ] where (?prime2??0) ? ??2; note ??2 ? ??1 (since ??1 ? ?) and ??2 ? ?3 ???3 . Let ? prime1 ? [??1 ??0;?;?prime2 ??f ???3 ] ?prime3 ? [??1 ??0 ??prime2;?f;??3 ] Thus ?prime?3 = ?f, ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?prime, ?prime?3 ? ??f and ?prime?3 ? ??f (since ?prime?3 ? ?0 ? ??3 and ?prime?3 = ??3 ). We have ?prime ? [??1 ??0;?prime2 ??f;??3 ] and (?prime2 ??f)??0 ? (??2 ??f). The choice of ?prime is acceptable since ?prime? = ?? ??0, ?prime ??0 ? ?? and ?prime? = ?? as required). To prove 1., we have n;?prime turnstileleft Hprime by (ii), and we can apply [TApp]: (TApp) ?prime1;?prime turnstileleft v : ?1 ???f ?2 a59? ?prime2;?prime turnstileleft eprime2 : ?1 a59Rprime2 ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?prime ?prime?3 = ??f ?prime?3 ? ??f ?prime?3 ? ??f v negationslash? vprime ? Rprime2 = ? ?prime;?prime turnstileleft v eprime2 : ?2 a59? trianglerighttriangleleft Rprime2 233 (Note that ? trianglerighttriangleleft Rprime2 = Rprime2.) The first premise follows by value typing and weakening; the second by (i); the third?sixth by choice of ?prime, ?prime1, ?prime2, ?prime3; the last holds vacuously since R1 ? ? by assumption. To prove part 2., we must show that ?prime,Rprime;Hprime turnstileleft ?prime. The proof is similar to the (TAssign)- [cong] proof, case r := E but substituting ?f for ?r. Part 3. follows directly from (iv). case (TSub) : We have (TSub) ?primeprime;? turnstileleft e : ?primeprime a59R ?primeprime ? [?;?primeprime;?] ? ? [?;?;?] ?primeprime ? ? ?primeprime ? ? ?;? turnstileleft e : ? a59R since by flow effect weakening (Lemma B.0.7) we know that ? and ? are unchanged in the use of (TSub). We have ?n;?;H;e? ??? ?n;?prime;Hprime;eprime?. 
To apply induction we must show that n;? turnstileleft H, which we have by assumption, ?primeprime;? turnstileleft e : ?primeprime a59 R, which we also have by assumption, and ?primeprime,R;H turnstileleft ?, which follows easily since ?primeprime ? ?. Hence we have: (i) ?primeprimeprime;?prime turnstileleft eprime : ?primeprime a59Rprime and (ii) n;?prime turnstileleft Hprime (iii) ?primeprimeprime,Rprime;Hprime turnstileleft ?prime (iv) traceOK(?prime) for some ?prime ? ?, ?primeprimeprime such that ?primeprimeprime? = ? ? ?0, ?primeprimeprime? ? ?0 ? ?primeprime Let ?prime ? ?primeprimeprime, and thus ?prime? = ? ? ?0, ?prime? ??0 ? ? since ?primeprime ? ?, and ?prime? = ? as required. All results follow by induction. Lemma B.0.16 (Progress). If n turnstileleft H,e : ? (such that ?;? turnstileleft e : ? a59 R and n;? turnstileleft H) and for all ? such that ?,R;H turnstileleft ? and traceOK(?), then either e is a value, or there exist nprime,Hprime,?prime,eprime such that ?n;?;H; e? ??? ?nprime;?prime;Hprime; eprime?. Proof. Induction on the typing derivation n turnstileleft H,e : ?; consider each possible rule for the conclusion of this judgment: case (TInt-TGvar-TLoc) : These are all values. case (TVar) : Can?t occur, since local values are substituted for. case (TRef) : We must have that (TRef) ?;? turnstileleft eprime : ? a59R?;? turnstileleft ref eprime : ref ? ? a59R There are two possible reductions, depending on the shape of e: case eprime ? v : By inversion on ?;? turnstileleft v : ? a59 ? we know that R ? ? hence by inversion on ?,R;H turnstileleft ? we have ? ? (?,?). We have that ?n;(?,?);H;ref v? ?? n;(?,?);Hprime;r where r /? dom(H) and Hprime = H,r mapsto? (?,v,?) by (ref). case eprime negationslash? v : By induction, ?n;?;H;eprime? ??? ?nprime;?prime;Hprime;eprimeprime?and thus?n;?;H;(ref )[eprime]? ??? ?nprime;?prime;Hprime;(ref )[eprimeprime]? by [cong]. case (TDeref) : We know that (TDeref) ?1;? turnstileleft e : ref ?r ? a59R ??2 = ?r ?1 a3?2 arrowhookleft? ? ?;? turnstileleft !e : ? a59R Consider the shape of e: 234 case eprime ? v : Since v is a value of type ref ?r ?, we must have v ? z or v ? r. case eprime ? z : We have (TDeref) ?1;? turnstileleft z : ref ?r ? a59? ??2 = ?r ?1 a3?2 arrowhookleft? ? ?;? turnstileleft !z : ? a59? where by subtyping derivations (Lemma B.0.6) we have (TSub) (TGVar) ?(z) = ref ? prime r ?prime ??;? turnstileleft z : ref ?primer ?prime a59? ?prime ? ? ? ? ?prime ?primer ? ?r ref ?primer ?prime ? ref ?r ? ?? ? ?1 ?1;? turnstileleft z : ref ?r ? a59? By inversion on ?,?;H turnstileleft ? we have ? ? (?,?). By n;? turnstileleft H we have z ? dom(H) (and thus H ? Hprimeprime,z mapsto? (ref ?primer ?prime,v,?))) since ?(z) = ref ?primer ?prime. Therefore, we can reduce via [gvar-deref]: ?n;(?,?);(Hprimeprime,z mapsto? (ref ?primer ?prime,v,?)); !z? ??{z} ?n;(?,??(z,?));(Hprimeprime,z mapsto? (ref ?primer ?prime,v,?));v? case eprime ? r : Similar to the eprime ? z case above, but reduce using [deref]. case eprime negationslash? v : Let E ? ! so that e ? E[eprime]. To apply induction, we have ?1,R;H turnstileleft ? by Lemma B.0.9. Thus we get ?n;?;H;eprime? ??? ?nprime;?prime;Hprime;eprimeprime?, hence we have that ?n;?;H;E[eprime]? ??? ?nprime;?prime;Hprime;E[eprimeprime]? by [cong]. case (TAssign) : (TAssign) ?1;? turnstileleft e1 : ref ?r ? a59R1 ?2;? turnstileleft e2 : ? a59R2 ??3 = ?r ?1 a3?2 a3?3 arrowhookleft? ? e1 negationslash? v ? R2 = ? ?;? turnstileleft e1 := e2 : ? 
a59R1 trianglerighttriangleleft R2 Depending on the shape of e, we have: case e1 ? v1,e2 ? v2 : Since v1 is a value of type ref ?r ?, we must have v1 ? z or v1 ? r. The results follow by reasoning quite similar to [TDeref] above. case e1 ? v1,e2 negationslash? v : Let E ? v1 := so that e ? E[e2]. Since e1 is a value, R1 ? ? hence we have ?2,R;H turnstileleft ? by Lemma B.0.10 and we can apply induction. We have ?n;?;H;e2? ??? ?nprime;?prime;Hprime;eprime2?, and thus ?n;?;H;E[e2]? ??? ?nprime;?prime;Hprime;E[eprime2]? by [cong]. case e1 negationslash? v : Since e1 is a not value, R2 ? ? hence we have ?1,R;H turnstileleft ? by Lemma B.0.10 and we can apply induction. The rest follows by an argument similar to the above case. case (TUpdate) : By inversion on ?;? turnstileleft update?,? : int a59 R we have that R ? ?, hence by inversion on ?,?;H turnstileleft ? we have ? ? (?,?). If updateOK(upd,H,(?,?),dir) = tt, then update?,? reduces via [update], otherwise update?,? reduces via [no-update]. case (TIf) : (TIf) ?1;? turnstileleft e1 : int a59R ?2;? turnstileleft e2 : ? a59? ?2;? turnstileleft e3 : ? a59? ?1 a3?2 arrowhookleft? ? ?;? turnstileleft if0 e1 then e2 else e3 : ? a59R Depending on the shape of e, we have: 235 case e1 ? v : This implies R ? ? so by inversion on ?,?;H turnstileleft ? we have ? ? (?,?). Since the type of v is int, we know v must be an integer n. Thus we can reduce via either [if-t] or [if-f]. case e1 negationslash? v : Let E ? if0 then e2 else e3 so that e ? E[e1]. To apply induction, we have ?1,R;H turnstileleft ? by Lemma B.0.9. We have ?n;?;H;e1? ??? ?nprime;?prime;Hprime;eprime1? and thus ?n;?;H;E[e1]? ??? ?nprime;?prime;Hprime;E[eprime1]? by [cong]. case (TTransact) : We know that: (TTransact) ?prime;? turnstileleft e : ? a59? ?? ? ?prime? ?? ? ?prime? ?;? turnstileleft tx e : ? a59? By inversion on ?,?;H turnstileleft ? we have ? ? (?,?). Thus we can reduce by [tx-start]. case (TIntrans) : We know that: (TIntrans) ?prime;? turnstileleft e : ? a59R ?? ? ?prime? ?? ? ?prime? ?;? turnstileleft intx e : ? a59?prime,R Consider the shape of e: case e ? v : Thus (TIntrans) ?prime;? turnstileleft v : ? a59? ?? ? ?prime? ?? ? ?prime? ?;? turnstileleft intx v : ? a59?prime,? We have ?,?prime,?;H turnstileleft ? by assumption: (TC2) ?prime,?;H turnstileleft ? ? ? [?;?;?] f ? ? ? f ? ? f ? ? ? nprime ? ver(H,f) ?,?prime,?;H turnstileleft ((?prime,?prime),(?,?)) By inversion we have ? ? ((?prime,?prime),(?,?)); by assumption we have traceOK(nprimeprime,?primeprime) so we can reduce via [tx-end]. case e negationslash? v : We have ?,?prime,R;H turnstileleft ? by assumption. By induction we have ?n;?prime;H;eprime? ??? ?nprime;?primeprime;Hprime;eprimeprime?, hence by [tx-cong-2]: ?n;?prime;H;intx eprime? ??? ?nprime;?primeprime;Hprime;intx eprimeprime? case (TLet) : We know that: (TLet) ?1;? turnstileleft e1 : ?1 a59R ?2;?,x : ?1 turnstileleft e2 : ?2 a59? ?1 a3?2 arrowhookleft? ? ?;? turnstileleft let x : ?1 = e1 in e2 : ?2 a59R Consider the shape of e: case e1 ? v : Thus ?1;? turnstileleft v : ? a59? and by inversion on ?,?;H turnstileleft ? we have ? ? (?,?). We can reduce via [let]. case e1 negationslash? v : Let E ? let x : ?1 = in e2 so that e ? E[e1]. To apply induction, we have ?1,R;H turnstileleft ? by Lemma B.0.9. We have?n;?;H;e1? ??? ?nprime;?prime;Hprime;eprime1?and so?n;?;H;E[e1]? ??? ?nprime;?prime;Hprime;E[eprime1]? by [cong]. 236 case (TApp) : (TApp) ?1;? turnstileleft e1 : ?1 ???f ?2 a59R1 ?2;? turnstileleft e2 : ?1 a59R2 ?1 a3?2 a3?3 arrowhookleft? ? 
??3 = ??f ??3 ? ??f ??3 ? ??f e1 negationslash? v ? R2 = ? ?;? turnstileleft e1 e2 : ?2 a59R1 trianglerighttriangleleft R2 Depending on the shape of e, we have: case e1 ? v1,e2 ? v2 : Since v1 is a value of type ?1 ??? ?2, we must have v1 ? z, hence (TApp) ?1;? turnstileleft z : ?1 ???f ?2 a59? ?2;? turnstileleft v : ?1 a59? ?1 a3?2 a3?3 arrowhookleft? ? ??3 = ??f ??3 ? ??f ??3 ? ??f z negationslash? v ? R2 = ? ?;? turnstileleft z v : ?2 a59? where by subtyping derivations (Lemma B.0.6) we have (TSub) (TGVar) ?(z) = ? prime1 ???primef ?prime2 ??;? turnstileleft z : ?prime1 ???primef ?prime2 a59? ?1 ? ?prime1 ?prime2 ? ?2 ?primef ?f ?f ?prime1 ???primef ?prime2 ? ?1 ???f ?2 ?? ? ?1 ?1;? turnstileleft z : ?1 ???f ?2 a59? By inversion on ?,?;H turnstileleft ? we have ? ? (?,?). By n;? turnstileleft H we have z ? dom(H) and H ? (Hprimeprime,z mapsto? (?prime1 ???primef ?prime2,?(x).eprimeprime,?)) since ?(z) = ?prime1 ???primef ?prime2. By [call], we have: ?n;(?,?);(Hprimeprime,z mapsto? (?prime1 ???primef ?prime2,?(x).eprimeprime,?));z v? ??{z} ?n;(?,? ?(z,?));(Hprimeprime,z mapsto? (?prime1 ???primef ?prime2,?(x).eprimeprime,?));eprimeprime[x mapsto? v]? case e1 negationslash? v : Let E ? e2 so that e ? E[e1]. Since e1 is a not value, R2 ? ? hence we have ?1,R;H turnstileleft ? by Lemma B.0.10 and we can apply induction and we have: ?n;?;H;e1? ??? ?nprime;?prime;Hprime;eprime1?, and thus ?n;?;H;E[e1]? ??? ?nprime;?prime;Hprime;E[eprime1]? by [cong]. case e1 ? v1,e2 negationslash? v : Let E ? v1 so that e ? E[e2]. Since e1 is a value, R1 ? ? hence we have ?2,R;H turnstileleft ? by Lemma B.0.10 and we can apply induction. The rest follows similarly to the above case. case (TSub) : We know that: (TSub) ?1;? turnstileleft e : ?prime a59R ?prime ? ? ?1 ? [?;?1;?] ? ? [?;?;?] ?1 ? ? ?;? turnstileleft e : ? a59R If e is a value v we are done. Otherwise, since ?1,R;H turnstileleft ? follows from ?,R;H turnstileleft ? (by ??1 ? ?? and ??1 = ??); we have ?n;?;H;e? ??? ?nprime;?prime;Hprime;eprime? by induction. Lemma B.0.17 (Substitution). If ?;?,x : ?prime turnstileleft e : ? and ?;? turnstileleft v : ?prime then ?;? turnstileleft e[x mapsto? v] : ?. Proof. Induction on the typing derivation of ?;? turnstileleft e : ?. case (TInt) : Since e ? n and n[x mapsto? v] ? n, the result follows by (TInt). 237 case (TVar) : e is a variable y. We have two cases: case y = x : We have ? = ?prime and y[x mapsto? v] ? v, hence we need to prove that ?;? turnstileleft v : ? which is true by assumption. case y negationslash= x : We have y[x mapsto? v] ? y and need to prove that ?;? turnstileleft y : ?. By assumption, ?;?,x : ?prime turnstileleft y : ?, and thus (?,x : ?prime)(y) = ?; but since x negationslash= y this implies ?(y) = ? and we have to prove ?;? turnstileleft y : ? which follows by (Tvar). case (TGvar),(TLoc), (TUpdate) : Similar to (TInt). case (TRef) : We know that ?;?,x : ?prime turnstileleft ref e : ref ? ? and ?;? turnstileleft v : ?prime, and need to prove that ?;? turnstileleft (ref e)[x mapsto? v] : ref ? ?. By inversion on ?;?,x : ?prime turnstileleft ref e : ref ? ? we have ?;?,x : ?prime turnstileleft e : ?; applying induction to this, we have ?;? turnstileleft e[x mapsto? v] : ?. We can now apply [TRef]: (TRef) ?;? turnstileleft e[x mapsto? v] : ??;? turnstileleft ref (e[x mapsto? v]) : ref ? ? The desired result follows since ref (e[x mapsto? v]) ? (ref e)[x mapsto? v]. case (TDeref) : We know that ?;?,x : ?prime turnstileleft !e : ? and ?;? 
turnstileleft v : ?prime and need to prove that ?;? turnstileleft (!e)[x mapsto? v] : ?. By inversion on ?;?,x : ?prime turnstileleft !e : ? we have ?1;?,x : ?prime turnstileleft e : ref ?r ? and ?2 such that ?1 a3 ?2 arrowhookleft? ? and ? ? ?1 a3 ?2. By value typing we have ?1;? turnstileleft v : ?prime. We can then apply induction, yielding ?1;? turnstileleft e[x mapsto? v] : ref ?r ?. Finally, we apply (TDeref) (TDeref) ?1;? turnstileleft e[x mapsto? v] : ref ?r ? ??2 = ?r ?1 a3?2 arrowhookleft? ? ?;? turnstileleft !e[x mapsto? v] : ? Note that the second premise holds by inversion on ?;?,x : ?prime turnstileleft !e : ?. The desired result follows since !(e[x mapsto? v]) ? (!e)[x mapsto? v]. case (TSub) : We know that ?;?,x : ?prime turnstileleft e : ? and ?;? turnstileleft v : ?prime and need to prove that ?;? turnstileleft e[x mapsto? v] : ?. By inversion on ?;?,x : ?prime turnstileleft e : ? we have ?prime;?,x : ?prime turnstileleft e : ?prime. By value typing we have ?prime;?,x : ?prime turnstileleft v : ?prime. We can then apply induction, yielding ?prime;? turnstileleft e[x mapsto? v] : ?prime. Finally, we apply (TSub) (TSub) ?prime;? turnstileleft e[x mapsto? v] : ?prime ?prime ? ? ?prime ? ??;? turnstileleft e[x mapsto? v] : ? and get the desired result. case (TTransact),(TIntrans) : Similar to (TSub). case (TApp) : We know that (TApp) ?1;?,x : ?prime turnstileleft e1 : ?1 ???f ?2 ?2;?,x : ?prime turnstileleft e2 : ?1 ?1 a3?2 a3?3 arrowhookleft? ? ??3 = ??f ??3 ? ??f ??3 ? ??f ?;?,x : ?prime turnstileleft e1 e2 : ?2 where ?;? turnstileleft v : ?prime, and need to prove that ?;? turnstileleft (e1 e2)[x mapsto? v] : ?2. Call the first two premises above (1) and (2), and note that we have (3) ?;? turnstileleft v : ?prime ? ?1;? turnstileleft v : ?prime and (4) ?;? turnstileleft v : ?prime ? ?2;? turnstileleft v : ?prime by 238 the value typing lemma. By (1), (3) and induction we have ?1;? turnstileleft e1[x mapsto? v] : ?1 ???f ?2. Similarly, by (2), (4) and induction we have ?2;? turnstileleft e2[x mapsto? v] : ?1. We can now apply (TApp): (TApp) ?1;? turnstileleft e1[x mapsto? v] : ?1 ???f ?2 ?2;? turnstileleft e2[x mapsto? v] : ?1 ?1 a3?2 a3?3 arrowhookleft? ? ??3 = ??f ??3 ? ??f ??3 ? ??f ?;? turnstileleft e1[x mapsto? v] e2[x mapsto? v] : ?2 Since e1[x mapsto? v] e2[x mapsto? v] ? (e1 e2)[x mapsto? v] we get the desired result. case (TAssign-TIf-TLet) : Similar to (TApp). Theorem B.0.18 (Single-step Soundness). If ?;? turnstileleft e : ? where llbracket?;? turnstileleft e : ?rrbracket = R; and n;? turnstileleft H; and ?,R;H turnstileleft ?; and traceOK(?), then either e is a value, or there exist nprime, Hprime, ?prime, ?prime, eprime, and ? such that ?n;?;H;e? ??? ?nprime;?prime;Hprime;eprime? and ?prime;?prime turnstileleft eprime : ? where llbracket?prime;?prime turnstileleft eprime : ?rrbracket = Rprime; and nprime; ?prime turnstileleft Hprime; and ?prime,Rprime;Hprime turnstileleft ?prime; and traceOK(?prime) for some ?prime,?prime,Rprime. Proof. From progress (Lemma D.0.37), we know that if n turnstileleft H,e : ? then either e is a value, or there exist nprime,Hprime,?prime,?prime,eprime,? such that ?n;?;H;e? ??? ?nprime;?prime;Hprime;eprime?. If e is a value we are done. If e is not a value, then there are two cases. If ? = ? then the result follows from update preservation (Lemma B.0.13). If ? = ?0, then the result follows from preservation (Lemma D.0.36). 239 Appendix C Relaxed Updates Proofs Lemma C.0.19 (Weakening). If ?;? turnstileleft e : ? and ?prime ? ? 
then ?;?prime turnstileleft e : ?. Proof. By induction on the typing derivation of ?;? turnstileleft e : ?. Lemma C.0.20 (Subtyping reflexivity). ? ? ? for all ?. Proof. Straightforward, from the definition of subtyping in Figure 5.2. Lemma C.0.21 (Subtyping transitivity). For all ?,?prime,?primeprime, if ? ? ?prime and ?prime ? ?primeprime then ? ? ?primeprime. Proof. By simultaneous induction on ? ? ?prime and ?prime ? ?primeprime, similar to Lemma B.0.4 Lemma C.0.22 (Value typing). If ?;? turnstileleft v : ? then ?prime;? turnstileleft v : ? for all ?prime. Proof. By induction on the typing derivation of ?;? turnstileleft v : ?. Lemma C.0.23 (Subtyping Derivations). If ?;? turnstileleft e : ? then we can construct a proof derivation of this judgment that ends in one use of (TSub) whose premise uses a rule other than (TSub). Proof. By induction on ?;? turnstileleft e : ?. case (TSub) : We have TSub ?prime;? turnstileleft e : ?prime ?prime ? ? SCtxt ?prime? ? ?? ?? ? ?prime? ?? ? ?prime? ?prime?i ? ??i ??o ? ?prime?o ?prime ? ? ?;? turnstileleft e : ? By induction, we have TSub ?primeprime;? turnstileleft e : ?primeprime ?primeprime ? ?prime SCtxt ?primeprime? ? ?prime? ?prime? ? ?primeprime? ?prime? ? ?primeprime? ?primeprime?i ? ?prime?i ?prime?o ? ?primeprime?o ?primeprime ? ?prime ?prime;? turnstileleft e : ?prime where the derivation ?primeprime;? turnstileleft e : ?primeprime does not conclude with (TSub). By the transitivity of subtyping (Lemma C.0.21), we have ?primeprime ? ?; the rest of the premises follow by transitivity of ?, and finally we get the desired result by (TSub): TSub ?primeprime;? turnstileleft e : ?primeprime ?primeprime ? ? SCtxt ?primeprime? ? ?? ?? ? ?primeprime? ?? ? ?primeprime? ?primeprime?i ? ??i ??o ? ?primeprime?o ?primeprime ? ? ?;? turnstileleft e : ? case all others : Since we have that the last rule in ?;? turnstileleft e : ? is not (TSub), we have the desired result by applying (TSub) (where ? ? ? follows from the reflexivity of subtyping, Lemma C.0.20): TSub ?;? turnstileleft e : ? ? ? ? ? ? ??;? turnstileleft e : ? Lemma C.0.24 (Flow effect weakening). If ?;? turnstileleft e : ? where ? ? [?;?;?;?i;?o], then ?prime;? turnstileleft e : ? where ?prime ? [?prime;?;?prime;?i;?o], ?prime? ? ??, ?prime? ? ??, and all uses of [TSub] applying ?prime ? ? require ?prime? = ??, ?prime? = ??, ?prime?i = ??i, and ?prime?o = ??o. 240 Proof. By induction on ?;? turnstileleft e : ?. case (TGvar),(TInt),(TVar) : Trivial. case (TCheckin) : We have (TCheckin) ???o ? ?primeprime ? ? ?primeprime [?;?;?;?;?o];? turnstileleft checkin?primeprime,?primeprime:int Since ?prime ? ?, and ?prime ? ?, we can apply (TCheckin): (TCheckin) ?prime ??o ? ?primeprime ?prime ? ?primeprime [?prime;?;?prime;?;?o];? turnstileleft checkin?primeprime,?primeprime:int case (TTransact) : We have TTransact ?primeprime;? turnstileleft e : ? ?? ? ?primeprime? ?? ? ?primeprime? ?;? turnstileleft tx(?primeprime???primeprime?i,?primeprime???primeprime?) e : ? Let ?prime = [?prime;?;?prime;?iprime;?o]. Since ?prime? ? ??, and ?prime? ? ??, we can apply (TTransact): TTransact ?primeprime;? turnstileleft e : ? ?prime? ? ?primeprime? ?prime? ? ?primeprime? ?prime;? turnstileleft tx(?primeprime???primeprime?i,?primeprime???primeprime?) e : ? case (TIntrans) : Similar to (TTransact). case (TSub) : We have TSub ?prime;? turnstileleft e : ?prime ?prime ? ? ?prime? ? ?? ?? ? ?prime? ?? ? ?prime? ?prime?i ? ??i ??o ? ?prime?o ?prime ? ? ?;? turnstileleft e : ? Let ?primeprime = [??;?prime?;??;??i;??o]. 
Thus we have: TSub ?primeprime;? turnstileleft e : ?prime ?prime ? ? ?primeprime? ? ?? ?? = ?primeprime? ?? = ?primeprime? ??i = ?prime?i ??o = ?primeprime?o ?primeprime ? ? ?;? turnstileleft e : ? where the first premise follows by induction (which we can apply because ?primeprime? ? ?prime? and ?primeprime? ? ?prime? by assumption); the first premise of ?primeprime ? ? is by assumption, and the latter two premises are by definition of ?primeprime. case (TRef) : We know that TRef ?;? turnstileleft e : ??;? turnstileleft ref e : ref ? ? and have ?prime;? turnstileleft e : ? by induction, hence we get the result by (TRef). 241 case (TDeref) : We know that TDeref ?1;? turnstileleft e : ref ? ? ??2 = ? ??i2 = ??o2 ?? ?1 a3?2 arrowhookleft? ? ?;? turnstileleft !e : ? We have ?prime ? [?prime;??1???2;?prime;?iprime;?oprime] where ?prime ? ??, and ?prime ? ??. Choose ?prime1 ? [?prime;??1;??2??prime;?i;??2??o] and ?prime2 ? [?prime ???1;??2;?prime;??2 ??o;?o], hence ?prime1 ?? ?prime2, ?prime?2 = ??2 = ?, ?prime?i2 = ?prime?o2 ??, and ?prime ? ?prime1a3?prime2. We want to prove that ?prime;? turnstileleft !e : ?. Since ?prime ? ?, and ??2 ? ?prime ? ??2 ? ? we can apply induction to get ?prime1;? turnstileleft e : ref ? ? and we get the result by applying (TDeref): TDeref ?prime1;? turnstileleft e : ref ? ? ?prime?2 = ? ??i2 = ??o2 ?? ?prime1 a3?prime2 arrowhookleft? ?prime ?prime;? turnstileleft !e : ? case (TApp) : We know that TApp ?1;? turnstileleft e1 : ?1 ???f ?2 ?2;? turnstileleft e2 : ?1 ?1 a3?2 a3?3 arrowhookleft? ? ??3 = ??f ??3 ? ??f ??3 ? ??f ??i3 = ??o3 ???f ??o3 ? ??of ?;? turnstileleft e1 e2 : ?2 We have ?prime ? [?prime;??1 ? ??2 ? ??3;?prime;?i;?o1] where ?prime ? ?? and ?prime ? ??. Choose ?prime1 ? [?prime;??1;??2 ? ??3 ? ?prime;?i;?o1], ?prime2 ? [?prime ? ??1;??2;??3 ? ?prime;?o1;?o2], ?prime3 ? [?prime ? ??1 ? ??2;??3;?prime;?o2;?o], hence ?prime?3 = ??3 = ?f , ??i3 = ??o3 ? ??f, ??o3 ? ??of , and ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?prime. We want to prove that ?prime;? turnstileleft e1 e2 : ?2. Since ?prime ? ? and ??2 ???3 ??prime ? ??2 ???3 ??prime we can apply induction to get ?prime1;? turnstileleft e1 : ?1 ???f ?2. Similarly, since ?prime ???1 ? ????1 and ??3 ??prime ? ??3 ??, we can apply induction to get ?prime2;? turnstileleft e2 : ?1. We get the get the result by applying (TApp): TApp ?prime1;? turnstileleft e1 : ?1 ???primef ?2 ?prime2;? turnstileleft e2 : ?1 ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?prime ?prime?3 = ??f ?prime?3 ? ??f ?prime?3 ? ??f ?prime?i3 = ?prime?o3 ???f ?prime?o3 ? ??of ?prime;? turnstileleft e1 e2 : ?2 case (TAssign), (TIf), (TLet) : Similar to (TApp). Lemma C.0.25 (Left subexpression version consistency). If ?,R;H turnstileleft ? and ?1 a3?2 arrowhookleft? ? then ?1,R;H turnstileleft ?. Proof. We know: TC1 f ? ? ? f ? ? f ? (???i) ? nprime ? ver(H,f) ?? ? (???i) ?? ? (? ??) [?;?;?;?i;?o],?;H turnstileleft (nprime,?,?) We need to prove: TC1 f ? ? ? f ? ?1 f ? (?1 ??i1) ? nprime ? ver(H,f) ?? ? (?1 ??i1) ?? ? (?1 ??1) [?1;?1;?1;?i1;?o1],?;H turnstileleft (nprime,?,?) The first premise follows since ?1 ? ?. The second follows because ?i1 ? ?i and ?1 ? ?. The third follows because ?1 ? ? and ?i1 ? ?i. The fourth follows because ? ?? ? ? ??1 ??2 ? (? ??2)??1 ? ?1 ??1. 242 Lemma C.0.26 (Subexpression version consistency). If ?,R1 trianglerighttriangleleft R2;H turnstileleft ? and ?1 a3?2 arrowhookleft? ? then (i) R2 ? ? implies ?1,R1;H turnstileleft ? (ii) R1 ? ? and ??1 ? ? implies ?2,R2;H turnstileleft ? Proof. 
Similar to Lemma C.0.25. Lemma C.0.27 (Stack Shapes). If ?n;?;H;e? ???0 ?n;?prime;Hprime;eprime? then top(?) = (nprime,?,?) and top(?prime) = (nprimeprime,?prime,?prime) where nprime = nprimeprime, ? ? ?prime and ? = ?prime. Proof. By induction on ?n;?;H;e? ???0 ?n;?prime;Hprime;eprime?. Lemma C.0.28 (Update preserves heap safety). If n;? turnstileleft H and updateOK(upd,H,(?,?),dir) then n+1;U[?]upd turnstileleft U[H]updn+1. Proof. Same proof as Lemma B.0.12. The following lemma states that if we start with a well-typed program and a version-consistent trace and we take an update step, then afterward we will still have a well-typed program whose trace is version-consistent. Lemma C.0.29 (Update preservation). Suppose we have the following: 1. n turnstileleft H,e : ? (such that ?;? turnstileleft e : ? a59R and n;? turnstileleft H for some ?,?) 2. ?,R;H turnstileleft ? 3. traceOK(?) 4. ?n;?;H;e? ?? ? ?n + 1;?prime;Hprime; e? where Hprime ? U[H]updn+1, ?prime ? U[?]upd, ? = (upd,dir), ?prime ? U[?]upd,dirn , and top(?prime) = (nprimeprime,?prime,?prime). Then for some ?prime such that ?prime? = ??, ?prime? = ??, ?prime?i = ??i, ?prime?o = ??o, and ?prime? ? ?? and some ?prime ? ? we have that: 1. n + 1 turnstileleft Hprime,e : ? where ?prime;?prime turnstileleft e : ? a59R and n + 1;?prime turnstileleft Hprime 2. ?prime,R;Hprime turnstileleft ?prime 3. traceOK(?prime) 4. (dir = bck) ? nprimeprime ? n + 1 ? (dir = fwd) ? (f ? ? ? ver(H,f) ? ver(Hprime,f)) Proof. Since U[?]upd ? ?, ?;U[?]upd turnstileleft e : ? a59R follows by weakening (Lemma C.0.19). Proceed by simultaneous induction on the typing derivation of e (n turnstileleft H,e : ?) and on the evaluation derivation ?n;?;H;e? ?? ? ?n + 1;?prime;Hprime; e?. Consider the last rule used in the evaluation derivation: case [update] : We have ?n;(nprime,?,?);H;e? ??(upd,dir) ?nprimeprime;U?(nprime,?,?)?upd,dirnprimeprime ;U[H]updnprimeprime ;e? where nprimeprime ? n + 1. Let ?prime = ? and (nprimeprime,?prime,?prime) ? U[(nprime,?,?)]upd,dirn+1 . To prove 1., we get nprimeprime;?prime turnstileleft Hprime by Lemma C.0.28 and ?;?prime turnstileleft eprime : ? a59R by weakening. To prove 2., we must show ?,?;Hprime turnstileleft (nprimeprime,?prime,?prime). By assumption, we have TC1 f ? ? ? f ? ? f ? ???i ? nprime ? ver(H,f) ?? ? (???i) ?? ? (? ??) [?;?;?;?i;?o],?;H turnstileleft (nprime,?,?) We need to prove TC1 f ? ? ? f ? ? f ? ???i ? nprimeprimeprime ? ver(Hprime,f) ?? ? (???i) ?? ? (? ??) [?;?;?;?i;?o],?;H turnstileleft (nprimeprime,?prime,?prime) We have the first, third and fourth premises by assumption. For the second premise, we need to prove f ? ???i ? nprimeprimeprime ? ver(Hprime,f). Consider each possible update type: 243 case dir = bck : From the definition of U[(nprime,?,?)]upd,bckn+1 , we know that nprimeprimeprime = n+1; from the definition of U[H]updn+1 we know that n + 1 ? ver(Hprime,f) for all f, hence nprimeprimeprime ? ver(Hprime,f) for all f. case dir = fwd : From the definition of U[(nprime,?,?)]upd,bckn+1 , we know that nprimeprimeprime = nprime. Since ?? ? (? ? ?), from updateOK(upd,H,(??,??),dir) we know that ?f ? (? ? ?), f negationslash? dom(upd.UB), hence ver(Hprime,f) = ver(Hprime,f). Hence f ? ???i ? nprime ? ver(H,f) (assumption) implies f ? ???i ? nprimeprimeprime ? ver(Hprime,f). To prove 3., we must show traceOK(nprimeprime,?prime,?prime). Consider each possible update type: case dir = bck : From the definition of U[(nprime,?,?)]upd,bckn+1 , we know that nprimeprimeprime = n + 1. Consider (f,?) ? 
?; it must be the case that f negationslash? dom(updchg). This is because dir = bck implies ?? ?dom(updchg) = ? and by assumption (from [TC1] above) f ? ? and ?? ? ?. Therefore, since f negationslash? dom(updchg), its ?prime entry is (f,? ?{nprimeprimeprime}), which is the required result. case dir = fwd : Since U[(nprime,?,?)]upd,fwdn+1 = (nprime,?,?), the result is true by assumption. To prove 4., we must show nprimeprimeprime ? n + 1 ? (f ? ? ? ver(H,f) ? ver(Hprime,f)). Consider each possible update type: case dir = bck : From the definition of U[(nprime,?,?)]upd,bckn+1 , we know that nprimeprimeprime = n + 1 so we are done. case dir = fwd : We have U[(nprime,?,?)]upd,fwdn+1 = (nprime,?,?), and from updateOK(upd,H,(??,??),dir) and we know that f ? ?? ? f negationslash? dom(updchg) and by assumption (from [TC1] above) we know ?? ? ?. From the definition of U[H]updn we know that U[(f mapsto? (?,b,?),H)]updn+1 = f mapsto? (?,b,? ? {n + 1}) if f negationslash? dom(updchg). This implies that for f ? ?, ver(H,f) = ? and ver(Hprime,f) = ? ? {n + 1}, and therefore ver(H,f) ? ver(Hprime,f). case [tx-cong-1] : We have that?n;((nprime,?,?),?);H;intx e? ??? ?nprimeprime;U[(nprime,?,?)]?nprimeprime,?prime;Hprime;intx eprime?follows from?n;?;H;E[e]? ??? ?nprimeprime;?prime;Hprime;E[eprime]? by [tx-cong-1], where ? ? (upd,dir) and nprimeprime ? n+1. Let (nprimeprime,?prime,?prime) ? U[(nprime,?,?)]upd,dirn+1 . By assumption and subtyping derivations (Lemma C.0.23) we have TSub TIntrans ?e;? turnstileleft e : ?prime a59R ? ? ??e ? ? ??e [?;?;?;?i;?o];? turnstileleft intx e : ?prime a59?e,R ?prime ? ? [?;?;?;?i;?o] ? [?;?;?;?i;?o] [?;?;?;?i;?o];? turnstileleft intx e : ? a59?e,R and by flow effect weakening (Lemma C.0.24) we know that ?, ?, ?i and ?o are unchanged in the use of (TSub). We have ?e ? [?e;?e;?e;?ie;?oe], so that ?e ? ? and ?e ? ?. To apply induction, we must show that ?e,R;H turnstileleft ? (which follows by inversion on ?,?e,R;H turnstileleft ((nprime,?,?),?); ?e;? turnstileleft e : ?prime a59 R (which follows by assumption); and n;? turnstileleft H (by assumption). By induction we have: (i) ?primee;?prime turnstileleft eprime : ?prime a59R and (ii) n + 1;?prime turnstileleft Hprime (iii) ?primee,R;Hprime turnstileleft ?prime (iv) traceOK(?prime) (v) (dir = bck) ? nprimeprimeprime ? n + 1 ? (dir = fwd) ? (f ? ?e ? ver(H,f) ? ver(Hprime,f)) where ?primee ? [?e;?primee;?e;?ie;?oe], ?primee ? ?e. Let ?prime = [?;?;?;?i;?o] (hence ?prime? = ??, ?prime? = ??, ? ? ??, ?prime?i = ??i, and ?prime?o = ??o as required). To prove 1., we can show TSub TIntrans ?primee;?prime turnstileleft eprime : ? a59R ? ? ?prime?e ? ? ?prime?e ?prime;? turnstileleft intx eprime : ? a59?primee,R ?prime ? ? ?prime ? ?prime ?prime;? turnstileleft intx eprime : ? a59?primee,R 244 The first premise of [TIntrans] follows by (i), and the second ?fifth by assumption (from [?;?;?;?i;?o];? turnstileleft intx e : ?prime a59?e,R). To prove 2., we need to show that TC2 ?primee,R;Hprime turnstileleft ?prime f ? ?prime ? f ? ? f ? (???i) ? nprimeprimeprime ? ver(Hprime,f) ?? ? (???i) ?? ? (? ??) [?;?;?;?i;?o],?primee,R;Hprime turnstileleft ((nprimeprimeprime,?prime,?prime),?prime) We have the first premise by (iii), the second by assumption (since dom(?) = dom(?prime) from the definition of U[(nprime,?,?)]upd,dirn+1 ), the third holds vacuously, and the fourth and fifth follow by assumption (note that ?prime = ?). 
To prove 3., we must show traceOK((nprimeprimeprime,?prime,?prime),?prime), which reduces to proving traceOK((nprimeprime,?prime,?prime) since we have traceOK(?prime) from (iv). We have traceOK(nprime,?,?) by assumption. Consider each possible update type: case dir = bck : From the definition of U[(nprime,?,?)]upd,bckn+1 , we know that nprimeprimeprime = n + 1. Consider (f,?) ? ?; it must be the case that f negationslash? dom(updchg). This is because dir = bck implies ?e ?dom(updchg) = ? and by assumption we have ? ? ?e (from (TIntrans)), f ? ? (from the first premise of [TC1] above), and ?? ? (? ? ?i) (from the fourth premise of [TC1] above). Therefore, since f negationslash? dom(updchg), its ?prime entry is (f,? ?{nprime}), which is the required result. case dir = fwd : Since U[(nprime,?,?)]upd,fwdn+1 = (nprime,?,?), the result is true by assumption. Part 4. follows directly from (v) and the fact that ?e ? ?. case [cong] : We have that ?n;?;H;E[e]? ?? ? ?nprimeprime;?prime;Hprime;E[eprime]? follows from ?n;?;H;e? ?? ? ?nprimeprime;?prime;Hprime;eprime? by [cong], where ? ? (upd,dir). Consider the shape of E: case : The result follows directly by induction. case E e2 : By assumption, we have ?;? turnstileleft (E e2)[e1] : ? a59 R. By subtyping derivations (Lemma C.0.23) we know we can construct a proof derivation of this ending in (TSub): TSub TApp ?1;? turnstileleft E[e1] : ?1 ???f ?prime2 a59R1 ?2;? turnstileleft e2 : ?1 a59? ?1 a3?2 a3?3 arrowhookleft? ?s ??3 = ??f ??3 ? ??f ??3 ? ??f ??i3 = ??o3 ???f ??o3 ? ??of E [e1] negationslash? v ? R2 = ? ?s;? turnstileleft (E e2)[e1] : ?prime2 a59R1 ?prime2 ? ?2 SCtxt ? ? [?;?;?;?i;?o] ?s ? [?;?1 ??2 ??f;?;?i;?o] (?1 ??2 ??f) ? ? ?s ? ? ?;? turnstileleft (E e2)[e1] : ?2 a59R1 and by flow effect weakening (Lemma C.0.24) we know that ?, ? , ?i and ?o are unchanged in the use of (TSub). By inversion on ?n;?;H;(E e2)[e1]? ?? ? ?nprimeprime;?prime;Hprime;(E e2)[e1]? we have ?n;?;H;e1? ?? ? ?nprimeprime;?prime;Hprime;e1?, and then applying [cong] we have ?n;?;H;(E)[e1]? ?? ? ?nprimeprime;?prime;Hprime;(E)[eprime1]? From ?,R1;H turnstileleft ? we know that: f ? ? ? f ? ? f ? ???i ? nprime ? ver(H,f) ?? ? (???i) ?? ? (? ??) 245 where (nprime,?,?) is the top of ?. Since ? ? [?;?;?;?i;?o] and ?s ? [?;?s;?;?i;?o] and ?s = ?1 ??2 ??3 (where ?3 = ?f), we have ? ? ?1 hence: f ? ? ? f ? ?1 f ? ?1 ??i ? nprime ? ver(H,f) ?? ? (?1 ??i1) ?? ? (?1 ??1) but since ?1 ? [?;?1;?1;?i1;?o1], we have ?1,R1;H turnstileleft ?. Hence we can apply induction on ?1;? turnstileleft E[e1] : ?1 ???f ?prime2 a59R1, yielding: (i) ?prime1;?prime turnstileleft E[eprime1] : ?1 ???f ?2 a59R1 and (ii) n + 1;?prime turnstileleft Hprime (iii) ?prime1,R1;Hprime turnstileleft ?prime (iv) traceOK(?prime) (v) (dir = bck) ? nprimeprimeprime ? n + 1 ? (dir = fwd) ? (f ? ?1 ? ver(H,f) ? ver(Hprime,f)) where ?prime1 ? [?;?prime1;?1;?i;?o1] and ?prime1 ? ?1. Choose ?prime2 = [???prime1;?2;?2;?i2;?o2] and ?prime3 = [???prime1 ??2;?f;?;?i3;?o] and thus ?prime1a3?prime2a3?prime3 arrowhookleft? ?primes and ?prime?3 = ??f. Let ?prime = [?;?prime1 ??2 ??f;?;?i;?o], where ?prime1 ??2 ??f ? ?, as required. To prove 1., we have n + 1;?prime turnstileleft Hprime by (ii), and apply (TApp): TApp ?prime1;?prime turnstileleft E[eprime1] : ?1 ???f ?prime2 a59R1 ?prime2;?prime turnstileleft e2 : ?1 a59? ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?primes ?prime?3 = ??f ?prime?3 ? ??f ?prime?3 ? ??f ?prime?i3 = ?prime?o3 ???f ?prime?o3 ? ??of E[eprime1] negationslash? v ? R2 = ? 
?primes;?prime turnstileleft (E e2)[eprime1] : ?prime2 a59R1 The first premise follows by (i), the second because we have ?2;?prime turnstileleft e2 : ?1 by weakening (since ?prime ? ?) and then ?prime2;?prime turnstileleft e2 : ?1 by flow effect weakening (Lemma C.0.24) (which we can apply because ?prime?2 = ??2 , ?prime?2 = ??2, ?prime?2 = ?1 ? ?prime1, ??2 = ?1 ? ?1 hence ?prime?2 ? ??2 , ?prime?i2 = ??i2 , and ?prime?o2 = ??o2 ) , the third? eighth by choice of ?prime2, ?prime3 and ?primes, and the last as R2 ? ? by assumption. We can now apply (TSub): TSub ?prime;? turnstileleft (E e2)[eprime1] : ?prime2 a59R1 ?prime2 ? ?2 ?prime ? ?prime ?prime;? turnstileleft (E e2)[eprime1] : ?2 a59Rprime1 To prove part 2., we must show that ?prime,R1;Hprime turnstileleft ?prime. By inversion on ?,R1;H turnstileleft ? we have ? ? (nprime,?,?) or ? ? ((nprime,?,?),?primeprime). We have two cases: ? ? (nprime,?,?): Hence ?prime ? U[(nprime,?,?)]upd,dirnprimeprime ? (nprimeprime,?prime,?prime). By (iii) we must have R1 ? ? such that TC1 f ? ?prime ? f ? ?1 f ? (?prime1 ??i1) ? nprimeprimeprime ? ver(Hprime,f) ??prime ? (?1 ??i1) ??prime ? (?1 ??prime1) [?;?prime1;?1;?i1;?o1],?;Hprime turnstileleft (nprimeprime,?prime,?prime) To achieve the desired result we need to prove: TC1 f ? ?prime ? f ? ? f ? ((?prime1 ??2 ??f)??i1) ? nprimeprimeprime ? ver(Hprime,f) ??prime ? (???i1) ??prime ? (? ??prime1 ??2 ??f) [?;?prime1 ??2 ??f;?;?i;?o],?;Hprime turnstileleft (nprimeprime,?prime,?prime) Note that ? ? ?1. The first premise is by assumption (since dom(?) = dom(?prime) from the definition of U[(nprime,?,?)]upd,dirn+1 ). For the second premise, we need to show that for all f ? ((?2 ??f)??i1) ? nprimeprimeprime ? ver(Hprime,f) ; for those f ? (?prime1 ??i1) the result is by assumption. Consider each possible update type: 246 case dir = bck : From the definition of U[(nprime,?,?)]upd,bckn+1 , we know that nprimeprimeprime = n+1; from the definition of U[H]updn+1 we know that n+1 ? ver(Hprime,f) for all f, hence nprimeprimeprime ? ver(Hprime,f) for all f. case dir = fwd : From (v) we have that f ? ?1 ? ver(H,f) ? ver(Hprime,f). Since (?2??f) ? ?1 (by ?prime1a3 ?prime2a3?prime3 arrowhookleft? ?prime), we have ((?2??f)??i1) ? ?1 hence f ? ((?2??f)??i1) ? ver(H,f) ? ver(Hprime,f). By inversion on ?,R1;H turnstileleft ? we have f ? (?1 ??2 ??f) ? nprime ? ver(H,f), and thus f ? (?2??f)??i1) ? nprime ? ver(Hprime,f). We have U[(nprime,?,?)]upd,fwdn+1 = (nprime,?,?) hence nprimeprimeprime = nprime, so finally we have f ? ((?2 ??f))??i1) ? nprimeprimeprime ? ver(Hprime,f). The third and fourth premises follow by assumption since ?prime = ? and ?prime1 ? ?1. ? ? ((nprime,?,?),?primeprime) Hence ?prime ? U[(nprime,?,?)]upd,dirnprimeprime ? ((nprimeprimeprime,?prime,?prime),?primeprimeprime) By (iii), we must have R1 ? ?primeprime,Rprimeprime such that TC2 ?primeprime,Rprimeprime;Hprime turnstileleft ?primeprime ?prime1 ? [?;?prime1;?1;?i1;?o1] f ? ?prime ? f ? ?1 f ? (?prime1 ??i1) ? nprimeprimeprime ? ver(Hprime,f) ??prime ? (?1 ??i1) ??prime ? (?1 ??prime1) ?prime1,?primeprime,Rprimeprime;Hprime turnstileleft ((nprimeprimeprime,?prime,?prime),?primeprimeprime) We wish to show that TC2 ?primeprime,Rprimeprime;Hprime turnstileleft ?primeprime ?prime ? [?;?prime1 ??2 ??f;?;?i;?o] f ? ?prime ? f ? ? f ? ((?prime1 ??2 ??f)??i1) ? nprimeprimeprime ? ver(Hprime,f) ??prime ? (???i1) ??prime ? (? 
??prime1 ??2 ??f) ?prime,?primeprime,Rprimeprime;Hprime turnstileleft ((nprimeprimeprime,?prime,?prime),?primeprimeprime) ?primeprime,Rprimeprime;Hprime turnstileleft ?primeprime follows by assumption while the rest of the premises follow by the same argument as in the ? ? (nprime,?,?) case, above. Part 3. follows directly from (iv). Part 4. follows directly from (v) and the fact that ?1 ? ? (because ?1 ? ?2 ??f ??). case v E : By assumption, we have ?;? turnstileleft (v E)[e2] : ? a59 R. By subtyping derivations (Lemma C.0.23) we have: TSub TApp ?1;? turnstileleft v : ?1 ???f ?prime2 a59? ?2;? turnstileleft E[e2] : ?1 a59R2 ?1 a3?2 a3?3 arrowhookleft? ?s ??3 = ??f ??3 ? ??f ??3 ? ??f ??i3 = ??o3 ???f ??o3 ? ??of v negationslash? vprime ? R2 = ? ?s;? turnstileleft (v E)[e2] : ?prime2 a59R2 ?prime2 ? ?2 SCtxt ? ? [?;?;?;?i;?o] ?s ? [?;?1 ??2 ??f;?;?i;?o] (?1 ??2 ??f) ? ? ?s ? ? ?;? turnstileleft (v E)[e2] : ?2 a59R2 and by flow effect weakening (Lemma C.0.24) we know that ?, ?, ?i and ?o are unchanged in the use of (TSub). By inversion on ?n;?;H;(v E)[e2]? ?? ? ?nprimeprime;?prime;Hprime;(v E)[eprime2]? we have ?n;?;H;e2? ?? ? ?nprimeprime;?prime;Hprime;eprime2?, and then applying [cong] we have ?n;?;H;(E)[e2]? ?? ? ?nprimeprime;?prime;Hprime;(E)[eprime2]? From ?,R2;H turnstileleft ? we know that: f ? ? ? f ? ? f ? ???i ? nprime ? ver(H,f) ?? ? (???i) ?? ? (? ??) 247 where (nprime,?,?) is the top of ?. We have ? ? [?;?;?;?i;?o], ?s ? [?s;?s;?s;?is;?os], ?s ? ?, ?s = ?1 ??2 ??3 (where ?3 = ?f), ?2 ? [?2;?2;?2;?i2;?o2], ?2 ? ?1 ??1 = ? (since ?1 = ?; if it?s not ? we can construct a derivation for v that has ?1 = ? as argued in preservation (Lemma C.0.31), (TApp)-[Cong], case v E). Similarly, we have ?2 = ?1 = ? and ?i2 = ?i1 = ?i. We have f ? ? ? f ? ?2 f ? ?2 ??i2 ? nprime ? ver(H,f) ?? ? (?2 ??i2) ?? ? (?2 ??2) hence ?2,R2;H turnstileleft ? and we can apply induction on ?2;? turnstileleft E[e2] : ?1 ???f ?prime2 a59R2, yielding: (i) ?prime2;?prime turnstileleft E[e2] : ?1 a59R2 and (ii) n + 1;?prime turnstileleft Hprime (iii) ?prime2,R2;Hprime turnstileleft ?prime (iv) traceOK(?prime) (v) (dir = bck) ? nprimeprime ? n + 1 ? (dir = fwd) ? (f ? ?2 ? ver(H,f) ? ver(Hprime,f)) where ?prime2 ? [?2;?prime2;?2;?i2;?o2] and ?prime2 ? ?2. Choose ?prime1 = [?;?;?2 ??prime2;?i1;?o1] and ?prime3 = [???prime2;?f;?;?i3;?o] and thus ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?prime and ?prime?3 = ??f. Let ?prime ? [?;?prime2 ??f;?;?i;?o] and thus ?prime2 ??f ? ? as required. To prove 1., we have n + 1;?prime turnstileleft Hprime by (ii), and apply (TApp): TApp ?prime1;?prime turnstileleft v : ?1 ???f ?prime2 a59? ?prime2;?prime turnstileleft E[e2] : ?1 a59R2 ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?prime ?prime?3 = ??f ?prime?3 ? ??f ?prime?3 ? ??f ?prime?i3 = ?prime?o3 ??f ?prime?o3 ? ??of v negationslash? vprime ? R2 = ? ?prime;?prime turnstileleft (v E)[e2] : ?prime2 a59R2 The first premise follows by value typing, the second by (i), the third? eighth by choice of ?prime1 and ?prime3, and the last holds vacuously. We can now apply (TSub): TSub ?prime;? turnstileleft (v E)[e2] : ?prime2 a59R2 ?prime2 ? ?2 ?prime ? ?prime ?prime;? turnstileleft (v E)[e2] : ?2 a59R2 To prove part 2., we must show that ?prime,R2;Hprime turnstileleft ?prime. By inversion on ?,R2;H turnstileleft ? we have ? ? (nprime,?,?) or ? ? ((nprime,?,?),?primeprime). We have two cases: ? ? (nprime,?,?): By (iii) we must have R2 ? ? such that TC1 f ? ?prime ? f ? ?2 f ? ?prime2 ??i2 ? nprimeprimeprime ? ver(Hprime,f) ??prime ? 
(?2 ??i2) ??prime ? (?2 ??prime2) [?;?prime2;?2;?i2;?o2],?;Hprime turnstileleft (nprimeprime,?prime,?prime) To achieve the desired result we need to prove: TC1 f ? ?prime ? f ? ? f ? (?prime2 ??f)??i ? nprimeprimeprime ? ver(Hprime,f) ??prime ? (???i) ??prime ? (? ??prime2 ??f) [?;?prime2 ??f;?;?i;?o],?;Hprime turnstileleft (nprimeprime,?prime,?prime) Note that ?2 = ?1 = ?. The first premise follows by assumption (since dom(?) = dom(?prime) from the definition of U[(nprime,?,?)]upd,dirn+1 ). The third and fourth premise follow by assumption since ?i = ?i2, ? = ?2, ?1 = ? and ?2 = ? ? ?f. For the second premise, we need to show that for all f ? (?f ??i) ? nprimeprime ? ver(Hprime,f) (for those f ? ?prime2 ??i the result is by assumption). Consider each possible update type: 248 case dir = bck : From the definition of U[(nprime,?,?)]upd,bckn+1 , we know that nprimeprimeprime = n+1; from the definition of U[H]updn+1 we know that n+1 ? ver(Hprime,f) for all f, hence nprimeprimeprime ? ver(Hprime,f) for all f. case dir = fwd : From (v) we have that f ? ?2 ? ver(H,f) ? ver(Hprime,f). Then ?f ? ?2 (by ?prime1a3?prime2a3 ?prime3 arrowhookleft? ?prime) implies f ? ?f ? ver(H,f) ? ver(Hprime,f). By inversion on ?,R2;H turnstileleft ? we have f ? ((?2 ? ?f) ? ?i) ? nprime ? ver(H,f), and thus f ? ?f ? nprime ? ver(Hprime,f). We have U[(nprime,?,?)]upd,fwdn+1 = (nprime,?,?) hence nprimeprimeprime = nprime, so finally we have f ? (?f ??i) ? nprimeprimeprime ? ver(Hprime,f). The fourth and fifth premises follow by assumption since ?prime = ? and ?prime2 ? ?2. ? ? (nprimeprime,?,?),?primeprime By (iii), we must have R2 ? ?primeprime,Rprimeprime such that TC2 ?primeprime,Rprimeprime;Hprime turnstileleft ?primeprime ?prime2 ? [?;?prime2;?2;?i2;?o2] f ? ?prime ? f ? ?2 f ? ?prime2 ??i2 ? nprimeprimeprime ? ver(Hprime,f) ??prime ? (?2 ??i2) ??prime ? (?2 ??prime2) ?prime2,?primeprime,Rprimeprime;Hprime turnstileleft ((nprimeprimeprime,?prime,?prime),?primeprimeprime) We wish to show that TC2 ?primeprime,Rprimeprime;Hprime turnstileleft ?primeprime ?prime ? [?;?prime2 ??f;?;?i;?o] f ? ?prime ? f ? ? f ? (?prime2 ??f)??i ? nprimeprimeprime ? ver(Hprime,f) ??prime ? (???i) ??prime ? (? ??prime2 ??f) ?prime,?primeprime,Rprimeprime;Hprime turnstileleft ((nprimeprimeprime,?prime,?prime),?primeprimeprime) ?primeprime,Rprimeprime;Hprime turnstileleft ?primeprime follows by assumption while the third and fourth premises follow by the same argument as in the ? ? (nprime,?,?) case, above. Part 3. follows directly from (iv). Part 4. follows directly from (v) and the fact that ?2 ? ?. case all others : Similar to cases above. This lemma says that if take an evaluation step that is not an update, the version set of any z remains unchanged. Lemma C.0.30 (Non-update step version preservation). If ?n;?;H;e? ??? ?n;?prime;Hprime; eprime? then for all z ? dom(Hprime), ver(Hprime,z) = ver(H,z). Proof. By inspection of the evaluation rules. The following lemma states that if we start with a well-typed program and a version-consistent trace and we can take an evaluation step, then afterward we will still have a well-typed program whose trace is version-consistent. Lemma C.0.31 (Preservation). Suppose we have the following: 1. n turnstileleft H,e : ? (such that ?;? turnstileleft e : ? a59R and n;? turnstileleft H for some ? and ?) 2. ?,R;H turnstileleft ? 3. traceOK(?) 4. ?n;?;H;e? ??? ?n;?prime;Hprime; eprime? Then for some ?prime ? ? and ?prime ? [?? ??0;?prime;??;??i;??o] such that ?prime ??0 ? ??, we have: 1. 
n turnstileleft Hprime,eprime : ? where ?prime;?prime turnstileleft eprime : ? a59Rprime and n;?prime turnstileleft Hprime 249 2. ?prime,Rprime;Hprime turnstileleft ?prime 3. traceOK(?prime) Proof. Induction on the typing derivation n turnstileleft H,e : ?. By inversion, we have that ?;? turnstileleft e : ? a59R; consider each possible rule for the conclusion of this judgment: case (TInt-TVar-TGvar-TLoc) : These expressions do not reduce, so the result is vacuously true. case (TRef) : We have that: (TRef) ?;? turnstileleft e : ? a59R?;? turnstileleft ref e : ref ? ? a59R There are two possible reductions: case [ref] : We have that e ? v, R = ?, and ?n;(nprime,?,?);H;ref v? ??? ?n;(nprime,?,?);Hprime;r? where r /? dom(H) and Hprime = H,r mapsto? (?,v,?). Let ?prime = ?,r : ref ? ? and ?prime = ? (which is acceptable since ?prime? = ?? ??, ?prime ?? ? ??, ?prime? = ??, ?prime?i = ??i, ?prime?o = ??o, and Rprime = ?. We have part 1. as follows: (TSub) (TLoc) ?prime(r) = ref ? ?? ?;?prime turnstileleft r : ref ? ? a59? ref ? ? ? ref ? ? ?? ? ? ?;?prime turnstileleft r : ref ? ? a59? Heap well-formedness n;?prime turnstileleft H,r mapsto? (?,v,?) holds since ??;?prime turnstileleft v : ? follows by value typing (Lemma C.0.22) from ?;?prime turnstileleft v : ?, which we have by assumption and weakening; we have n;?prime turnstileleft H by weakening. To prove 2., we must show ?,?;Hprime turnstileleft (nprime,?,?). This follows by assumption since Hprime only contains an additional location (i.e., not a global variable) and nothing else has changed. Part 3. follows by assumption since ?prime = ?. case [cong] : We have that ?n;?;H;ref E[eprimeprime]? ??? ?n;?prime;Hprime;ref E[eprimeprimeprime]? from ?n;?;H;eprimeprime? ??? ?n;?prime;Hprime;eprimeprimeprime?. By [cong], we have ?n;?;H;e? ??? ?n;?prime;Hprime;eprime? where e ? E[eprimeprime] and eprime ? E[eprimeprimeprime]. By induction we have: (i) ?prime;?prime turnstileleft eprime : ? a59Rprime and (ii) n;?prime turnstileleft Hprime (iii) ?prime,Rprime;Hprime turnstileleft ?prime (iv) traceOK(?prime) where ?prime? = ?? ??0, ?prime ??0 ? ??, ?prime? = ??, ?prime?i = ??i, and ?prime?o = ??o. We prove 1. using (ii), and applying [TRef] using (i): (TRef) ?prime;?prime turnstileleft eprime : ? a59Rprime?prime;?prime turnstileleft ref eprime : ref ? ? a59Rprime Part 2. follows directly from (iii), and part 3. follows directly from (iv). case (TDeref) : We know that (TDeref) ?1;? turnstileleft e : ref ?r ? a59R ??2 = ?r ??i2 = ??o2 ??r ?1 a3?2 arrowhookleft? ? ?;? turnstileleft !e : ? a59R We can reduce using either [gvar-deref], [deref], or [cong]. 250 case [gvar-deref] : Thus we have e ? z such that ?n;(nprime,?,?);(Hprimeprime,z mapsto? (?prime,v,?));!z? ??{z} ?n;(nprime,? ?(z,?),?);(Hprimeprime,z mapsto? (?prime,v,?));v? (where H ? (Hprimeprime,z mapsto? (?prime,v,?))), by subtyping derivations (Lemma C.0.23) we have (TSub) (TGVar) ?(z) = ref ? prime r ?prime ??;? turnstileleft z : ref ?primer ?prime a59? ?prime ? ? ? ? ?prime ?primer ? ?r ref ?primer ?prime ? ref ?r ? ?? ? ?1 ?1;? turnstileleft z : ref ?r ? a59? and (TDeref) ?1;? turnstileleft z : ref ?r ? a59? ??2 = ?r ??i2 = ??o2 ??r ?1 a3?2 arrowhookleft? ? ?;? turnstileleft !z : ? a59? (where R = ?) and ? ? [??1;??1 ??r;??2 ;??i1 ;??o2 ]. We have ??i1 = ??o1 = ??i2 and ??i2 = ??o2 ??r. Let ?prime = ?, ?prime = [??1 ?{z};?;??2 ;??i1 ;??o2 ] and Rprime = R = ?. Since z ? ?r (by n;? turnstileleft H) we have ??{z} ? (??1 ??r) hence ?prime ?{z} ? ??. By the same argument we have {z} ? ??i1 . 
The choice of ?prime is acceptable since ?prime? = ?? ?{z}, ?prime ?{z} ? ??, ?prime? = ??, ?prime?i = ??i and ?prime?o = ??o. To prove 1., we need to show that ?prime;? turnstileleft v : ? a59 ? (the rest of the premises follow by assumption of n turnstileleft H,!z : ?). H(z) = (?prime,v,?) and ?(z) = ref ?primer ?prime implies ?prime;? turnstileleft v : ?prime a59 ? by n;? turnstileleft H. The result follows by (TSub): (TSub) ?prime;? turnstileleft v : ?prime a59? ?prime ? ? ?prime ? ?prime?prime;? turnstileleft v : ? a59? For part 2., we know ?,?;H turnstileleft (nprime,?,?): (TC1) f ? ? ? f ? ??1 f ? ((??1 ??r)???i1 ) ? nprime ? ver(H,f) ?? ? (??1 ???i1 ) ?? ? (??2 ???1 ??r) [??1;??1 ??r;??2 ;??i1 ;??o2 ],?;H turnstileleft (nprime,?,?) and need to prove ?prime,?;H turnstileleft (nprime,? ?(z,?),?), hence: (TC1) f ? (? ?(z,?)) ? f ? ??1 ?{z} f ? (????i1 ) ? nprime ? ver(H,f) ?? ? (??1 ?{z}???i1 ) ?? ? (??2 ??) [??1 ?{z};?;??2 ;??i1 ;??o2 ],?;H turnstileleft (nprime,? ?(z,?),?) The first premise is true by assumption for all f ? ?, and for (z,?) since z ? ??1 ?{z}. The second premise is vacuously true. The third premise is true by assumption and the fact that {z} ? ??i1 . The fourth premise is true by assumption. For part 3., we need to prove traceOK(nprime,? ?(z,?)); we have traceOK(nprime,?,?) by assumption, hence need to prove that nprime ? ?. Since by assumption of version consistency we have that f ? ??1 ? ?r ? nprime ? ver(H,f) and ver(H,f) = ver(Hprime,f) = ? (by Lemma C.0.30), and {z} ? ?r (by n;? turnstileleft H), we have nprime ? ?. case [deref] : Follows the same argument as the [gvar-deref] case above for part 1.; parts 2 and 3 follow by assumption since the trace has not changed. case [cong] : Here ?n;?;H;!e? ??? ?n;?prime;Hprime;!eprime? follows from ?n;?;H,e? ??? ?n;?prime;Hprime,eprime?. To apply in- duction, we must have ?1,R;H turnstileleft ? which follows by Lemma C.0.25 since ?,R;H turnstileleft ? and ?1 a3?2 arrowhookleft? ?. Hence we have: 251 (i) ?prime1;?prime turnstileleft eprime : ref ?r ? a59Rprime (ii) n;?prime turnstileleft Hprime (iii) ?prime1,Rprime;Hprime turnstileleft ?prime (iv) traceOK(?prime) for some ?prime ? ? and some ?prime1 ? [??1 ? ?0;?prime1;??1 ;??i1 ;??o1 ]. where ?prime1 ? ?0 ? ??1. Let ?prime2 = [??1 ? ?0;?r;??2 ;??i2 ;??o2 ] hence ?prime?2 = ?r and ?prime1 a3?prime2 arrowhookleft? ?prime, where ?prime ? [??1 ??0;?prime1 ??r;??2 ;??i1 ;??o2 ] and (?prime1 ? ?r) ? ?0 ? (?1 ? ?r) hence ?prime? = ?? ? ?0, ?prime ? ?0 ? ??, ?prime? = ?? , ?prime?i = ??i, and ?prime?o = ??o as required. We prove 1. by (ii) and by applying [TDeref]: (TDeref) ?prime1;?prime turnstileleft eprime : ref ?r ? a59Rprime ?prime?2 = ?r ?prime?i2 = ?prime?o2 ??r ?prime1 a3?prime2 arrowhookleft? ?prime ?prime;?prime turnstileleft !eprime : ? a59Rprime The first premise follows from (i) and the second and third premises follows by definition of ?prime and ?prime2. To prove part 2., we must show that ?prime,Rprime;Hprime turnstileleft ?prime. We have two cases: ?prime ? (nprime,?,?): By (iii) we must have Rprime ? ? such that (TC1) f ? ? ? f ? ??1 ??0 f ? (?prime1 ???i1 ) ? nprime ? ver(Hprime,f) ??prime ? (??1 ???i1 ) ??prime ? (??1 ??prime1) [??1 ??0;?prime1;??1 ;??i1 ;??o1 ],?;Hprime turnstileleft (nprime,?,?) To achieve the desired result we need to prove: (TC1) f ? ? ? f ? ??1 ??0 f ? ((?prime1 ??r)???i1 ) ? nprime ? ver(Hprime,f) ??prime ? (??1 ???i1 ??prime ? (??2 ??prime1 ??r) [??1 ??0;?prime1 ??r;??2 ;??i1 ;??o2 ],?;Hprime turnstileleft (nprime,?,?) 
The first premise follows directly from (iii). To prove the second premise, we observe that by Lemma C.0.27, top(?) = (nprime,?prime,?prime) where ?prime ? ?, and by inversion on ?;R;H turnstileleft ? we know f ? ?1 ? ?r ? nprime ? ver(H,f). Then the second premise follows because for all f, ver(H,f) = ver(Hprime,f) by Lemma C.0.30. The third premise follows directly by assumption. The fourth premise follows by assumption and the fact that ??1 ? ?? ??r. ?prime ? (nprime,?,?),?primeprime: By (iii), we must have Rprime ? ?primeprimeprime,Rprimeprimeprime such that (TC2) ?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft ?primeprime ?prime1 ? [??1 ??0;?prime1;??1 ;??i1 ;??o1 ] f ? ? ? f ? ??1 ??0 f ? (?prime1 ???i1 ) ? nprime ? ver(Hprime,f) ??prime ? (??1 ???i1 ) ??prime ? (??1 ??prime1) ?prime1,?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft (nprime,?,?),?primeprime We wish to show that (TC2) ?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft ?primeprime ?prime ? [??1 ??0;?prime1 ??r;??2 ;??i2 ;??o2 ] f ? ? ? f ? ??1 ??0 f ? ((?prime1 ??r)???i1 ) ? nprime ? ver(Hprime,f) ??prime ? (??1 ???i1 ) ??prime ? (?? ??prime1 ??r) ?prime,?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft (nprime,?,?),?primeprime The first and third premises follow from (iii), while the fourth, fifth and sixth premises follows by the same argument as in the ?prime ? (nprime,?,?) case, above. 252 Part 3. follows directly from (iv). case (TAssign) : We know that: (TAssign) ?1;? turnstileleft e1 : ref ?r ? a59R1 ?2;? turnstileleft e2 : ? a59R2 ??3 = ?r ??i3 = ??o3 ??r ?1 a3?2 a3?3 arrowhookleft? ? ?;? turnstileleft e1 := e2 : ? a59R1 trianglerighttriangleleft R2 From R1 trianglerighttriangleleft R2 it follows that either R1 ? ? or R2 ? ?. We can reduce using [gvar-assign], [assign], or [cong]. case [gvar-assign] : This implies that e ? z := v with ?n;(nprime,?,?);(Hprimeprime,z mapsto? (?,vprime,?));z := v? ??{z} ?n;(nprime,? ?(z,?),?);(Hprimeprime,z mapsto? (?,v,?));v? where H ? (Hprimeprime,z mapsto? (?,vprime,?)). R1 ? ? and R2 ? ? (thus R1 trianglerighttriangleleft R2 ? ?). Let ?prime = ?, Rprime = ?, and ?prime = [?? ? {z};?;??;??i1 ;??o2 ]. Since z ? ?r (by n;? turnstileleft H) we have ? ? (?1 ??2 ??r), hence ??{z} ? (?1 ??2 ??r) which means ?prime ?{z} ? ??. By the same argument we have {z} ? ??i1 . The choice of ?prime is acceptable since ?prime? = ?? ?{z}, ?prime ?{z} ? ??, ?prime? = ??, ?prime?i = ??i and ?prime?o = ??o. We prove 1. as follows. Since ?2;? turnstileleft v : ? a59 ?, by value typing (Lemma C.0.22) we have ?prime;? turnstileleft v : ? a59?. n;? turnstileleft Hprime follows from n;? turnstileleft H and ?prime;? turnstileleft v : ? a59? (since ?? = ?). Parts 2. and 3. are similar to the (TDeref) case. case [assign] : Part 1. is similar to (gvar-assign); we have parts 2. and 3. by assumption. case [cong] : Consider the shape of E: case E := e : ?n;?;H;e1 := e2? ??? ?n;?prime;Hprime;eprime1 := e2? follows from ?n;?;H;e1? ??? ?n;?prime;Hprime;eprime1?. Since e1 negationslash? v ? R2 = ? by assumption, we have R ? R1. To apply induction we must show ?1,R?;H turnstileleft ? This follows by an argument similar to Lemma C.0.25, because ??1 ? ?? , ??i1 ? ??i, and ??1 = ?? ? ??2 ? ?r hence ?? ? (?? ? ??i) implies ?? ? (??1 ? ??i1 ) and ?? ? (?? ???1 ???2 ??r) implies ?? ? (??1 ???1). Hence by induction we have (i) ?prime1;?prime turnstileleft eprime1 : ref ?r ? 
a59Rprime1 and (ii) n;?prime turnstileleft Hprime (iii) ?prime1,Rprime1;Hprime turnstileleft ?prime (iv) traceOK(?prime) for some ?prime ? ? and some ?prime1 ? [??1 ? ?0;?prime1;??1 ;??i1 ;??o1 ] where ?prime1 ? ?0 ? ?1 and ??1 ? ??2 ??r ???3 . Let ? prime2 ? [??1 ??prime1 ??0;??2;?r ???3 ;??i2 ;??o2 ] ?prime3 ? [??1 ??prime1 ??0 ???2;?r;??3 ;??i3 ;??o3 ] Thus ?prime?3 = ?r and ?prime1a3?prime2a3?prime3 arrowhookleft? ?prime such that ?prime ? [??1 ??0;?prime1 ???2 ??r;??3 ;??i1 ;??o3 ] The choice of ?prime is acceptable since ?prime? = ?? ??0, (?prime1 ??r ??2)??0 ? (?1 ??r ??2) i.e., ?prime ??0 ? ??, ?prime? = ?? , ?prime?i = ??i, and ?prime?o = ??o as required). To prove 1., we have n;?prime turnstileleft Hprime by (ii), and apply (TAssign): (TAssign) ?prime1;?prime turnstileleft eprime1 : ref ?r ? a59Rprime1 (TSub) ?2;?prime turnstileleft e2 : ? a59R2 ? ? ? ??1 ??prime1 ??0 ? ??1 ???1 ??2 ? ??2 ?r ???3 ? ?r ???3 ?prime?i2 = ??i2 ?prime?o2 = ??o2 ?2 ? ?prime2 ?prime2;?prime turnstileleft e2 : ? a59R2 ?prime?3 = ?r ?prime?i3 = ?prime?o3 ??r ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?prime ?prime;?prime turnstileleft e1 := e2 : ? a59Rprime1 trianglerighttriangleleft R2 Note that ?2;?prime turnstileleft e2 : ? follows from ?2;? turnstileleft e2 : ? by weakening (Lemma C.0.19). To prove part 2., we must show that ?prime,Rprime1;Hprime turnstileleft ?prime (since Rprime1 trianglerighttriangleleft R2 = Rprime1). By inversion on ?,R;H turnstileleft ? we have ? ? (nprime,?,?) or ? ? (nprime,?,?),?primeprime. We have two cases: 253 ?prime ? (nprime,?,?): By (iii) we must have Rprime1 ? ? such that (TC1) f ? ? ? f ? ??1 ??0 f ? (?prime1 ???i1 ) ? nprime ? ver(Hprime,f) ??prime ? (??1 ???i1 ) ??prime ? (??1 ??prime1) [??1 ??0;?prime1;??1 ;??i1 ;??o1 ],?;Hprime turnstileleft (nprime,?,?) To achieve the desired result we need to prove: (TC1) f ? ? ? f ? ??1 ??0 f ? ((?prime1 ???2 ??r)???i1 ) ? nprime ? ver(Hprime,f) ??prime ? (??1 ???i1 ) ??prime ? (? ??prime1 ???2 ??r) [??1 ??0;?prime1 ???2 ??r;??3 ;??i1 ;??o3 ],?;Hprime turnstileleft (nprime,?,?) The first premise follows directly from (iii). To prove the second premise, we ob- serve that by Lemma C.0.27, top(?) = (nprime,?prime,?prime) where ?prime ? ?, and by inversion on ?;R;H turnstileleft ? we know (a) f ? ?prime ? f ? ??1, and (b) f ? ??1 ???2 ??r ? nprime ? ver(H,f). The second premise follows from (iii) and the fact that f ? (?r ???2) ? nprime ? ver(H,f) by (b), and for all f, ver(H,f) = ver(Hprime,f) by Lemma C.0.30. The third premise follows directly by assumption. The fourth premise follows by assumption and the fact that ??1 ? ? ???2 ??r. ?prime ? (nprime,?,?),?primeprime: By (iii), we must have Rprime1 ? ?primeprimeprime,Rprimeprimeprime such that (TC2) ?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft ?primeprime ?prime1 ? [??1 ??0;?prime1;??1 ;??i1 ;??o1 ] f ? ? ? f ? ??1 ??0 f ? (?prime1 ???i1 ) ? nprime ? ver(Hprime,f) ??prime ? (??1 ???i1 ) ??prime ? (??1 ??prime1) ?prime1,?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft (nprime,?,?),?primeprime We wish to show that (TC2) ?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft ?primeprime ?prime ? [??1 ??0;?prime1 ???2 ??r;??3 ;??i1 ;??o3 ] f ? ? ? f ? ??1 ??0 f ? ((?prime1 ???2 ??r)???i1 ) ? nprime ? ver(Hprime,f) ??prime ? (??1 ???i1 ) ??prime ? (? 
??prime1 ???2 ??r) ?prime,?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft (nprime,?,?),?primeprime The first and third premises follow from (iii), while the fourth, fifth and sixth premises follows by the same argument as in the ?prime ? (nprime,?,?) case, above. Part 3. follows directly from (iv). case r := E : ?n;?;H;r := e2? ??? ?n;?prime;Hprime;r := eprime2? follows from ?n;?;H;e2? ??? ?n;?prime;Hprime;eprime2?. Since e1 ? r, by inversion R1 ? ?. and we have R ? R2. To apply induction we must show ?2,R?;H turnstileleft ?. This follows by an argument similar to (TDeref)-[cong], because ??2 ? ??1 ? ??, ??i2 ? ??i1 ? ??i, and ??2 = ??3 ? ?r hence ?? ? (?? ? ??i) implies ?? ? (??2 ???i2 ) and ?? ? (?? ???) implies ?? ? (??2 ???2). (i) ?prime2;?prime turnstileleft eprime2 : ? a59Rprime2 (ii) n;?prime turnstileleft Hprime (iii) ?prime2,Rprime2;Hprime turnstileleft ?prime (iv) traceOK(?prime) for some ?prime ? ? and some ?prime2 ? [??2 ? ?0;?prime2;??2 ;??i2 ;??o2 ] where (?prime2 ? ?0) ? ??2; note ??2 ? ??1 (since ??1 ? ?) and ??2 ? ?3 ???3 . Let ? prime1 ? [??1 ??0;?;?prime2 ??r ???3 ;??i2 ;??o2 ] ?prime3 ? [??1 ??0 ??prime2;?r;??3 ;??i3 ;??o3 ] 254 Thus ?prime?3 = ?r and ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?prime such that ?prime ? [??1 ??0;?prime2 ??r;??3 ;??i1 ;??o3 ] and (?prime2 ??r)??0 ? (??2 ??r). The choice of ?prime is acceptable since ?prime? = ?? ??0, ?prime ??0 ? ??, ?prime? = ??, ?prime?i = ??i and ?prime?o = ??o as required. To prove 1., we have n;?prime turnstileleft Hprime by (ii), and we can apply [TAssign]: (TAssign) ?prime1;?prime turnstileleft r : ref ?r ? a59? ?prime2;?prime turnstileleft eprime2 : ? a59Rprime2 ?prime?r3 = ?r ?prime?i3 = ?prime?o3 ??r ?prime1 a3?prime2 a3?prime3 arrowhookleft? ?prime ?prime;?prime turnstileleft r := eprime2 : ? a59? trianglerighttriangleleft Rprime2 Note that we have ?prime1;?prime turnstileleft r : ref ?r ? a59? from ?1;? turnstileleft r : ref ?r ? a59? by value typing and weakening To prove part 2., we must show that ?prime,Rprime2;Hprime turnstileleft ?prime (since R1 trianglerighttriangleleft R2 = Rprime2). By inversion on ?,R;H turnstileleft ? we have ? ? (nprime,?,?) or ? ? (nprime,?,?),?primeprime. We have two cases: ?prime ? (nprime,?,?): By (iii) we must have Rprime2 ? ? such that (TC1) f ? ? ? f ? ??2 ??0 f ? (?prime2 ???i2 ) ? nprime ? ver(Hprime,f) ??prime ? (??2 ???i2 ) ??prime ? (??2 ??prime2) [??2 ??0;?prime2;??2 ;??i2 ;??o2 ],?;Hprime turnstileleft (nprime,?,?) To achieve the desired result we need to prove: (TC1) f ? ? ? f ? ??1 ??0 f ? ((?r ??prime2)???i2 ) ? nprime ? ver(Hprime,f) ??prime ? (??1 ???i1 ) ??prime ? (?? ??prime2 ??r) [??1 ??0;?prime2 ??r;??3 ;??i1 ;??o3 ],?;Hprime turnstileleft (nprime,?,?) The first premise follows from (iii) since ??1 = ??2. To prove the second premise, we observe that by Lemma C.0.27, top(?) = (nprime,?prime,?) where ?prime ? ?, and by inversion on ?;R;H turnstileleft ? we know f ? ?1 ??r ? nprime ? ver(H,f). The second premise follows because we have f ? ((?1 ? ?r) ? ?i) ? nprime ? ver(H,f) by assumption and for all f, ver(H,f) = ver(Hprime,f) by Lemma C.0.30. The third premise follows directly by assumption since ??1 = ??2 and ??i1 = ??i2 . The fourth premise follows by assumption and the fact that ??2 ? ?? ??r. ?prime ? (nprime,?,?),?primeprime: By (iii), we must have Rprime2 ? ?primeprimeprime,Rprimeprimeprime such that: (TC2) ?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft ?primeprime ?prime2 ? [??2 ??0;?prime2;??2 ;??i2 ;??o2 ] f ? ? ? f ? ??2 ??0 f ? 
(?prime2 ???i2 ) ? nprime ? ver(Hprime,f) ??prime ? (??2 ???i2 ) ??prime ? (??2 ??prime2) ?prime2,?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft (nprime,?,?),?primeprime We wish to show that (TC2) ?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft ?primeprime ?prime ? [??1 ??0;?prime2 ??r;??3 ;??i1 ;??o3 ] f ? ? ? f ? ???0 f ? ((?prime2 ??r)???i2 ) ? nprime ? ver(Hprime,f) ??prime ? (??1 ???i1 ) ??prime ? (?? ??prime2 ??r) ?prime,?primeprimeprime,Rprimeprimeprime;Hprime turnstileleft (nprime,?,?),?primeprime The first and third premises follow from (iii), while the fourth, fifth and sixth premises follow by the same argument as in the ?prime ? (nprime,?,?) case, above. Part 3. follows directly from (iv). 255 case (TCheckin) : We know that: (TCheckin) ???o ? ?prime ? ? ?prime [?;?;?;?;?o];? turnstileleft checkin?prime,?prime:int a59? case [checkin] : Thus we must have: ?n;(nprime,?,?);H;checkin(?prime,?prime)? ??? ?n;(nprime,?prime,(?prime,?prime));H;1? Let ?prime = ? and ?prime = ? (and thus ?prime ? ? ? ??, ?prime? = ?? ? ?, ?prime? = ?? , ?prime?i = ??i, and ?prime?o = ??o ) as required. For 1., ?prime;? turnstileleft 1 : int a59 ? follows from (TInt) and value typing and n;? turnstileleft H is true by assumption. For part 2., we know (TC1) f ? ? ? f ? ? f ? (???) ? nprime ? ver(H,f) ?? ? (???) ?? ? (? ??) [?;?;?;?;?o],?;H turnstileleft (nprime,?,?prime) and need to prove: (TC1) f ? ? ? f ? ? f ? (???) ? nprime ? ver(H,f) ??prime ? (???) ??prime ? (? ??) [?;?;?;?;?o],?;H turnstileleft (nprime,?,?) The first premise is true by assumption. The second is vacuously true. The third and fourth premises follow since we know that ??prime ? ???o and ??prime ? ? by assumption. Part 3. follows by assumption. case (TIf) : We know that: (TIf) ?1;? turnstileleft e1 : int a59R ?2;? turnstileleft e2 : ? a59? ?2;? turnstileleft e3 : ? a59? ?1 a3?2 arrowhookleft? ? ?;? turnstileleft if0 e1 then e2 else e3 : ? a59R We can reduce using [if-t], [if-f] or [cong]. case [if-t] : This implies that e1 ? v hence R = ?. We have ?n;(nprime,?,?);H;if0 v then e2 else e3? ?? ?n;(nprime,?,?);H;e2? We have ?2 = ? (because ??1 ? ?; if ??1 negationslash? ? we can rewrite the derivation using value typing to make it so). Let ?prime = ? and ?prime = ? (and thus ??? ? ??, ?prime? = ????, ?prime? = ?? , and ?prime?i = ??i, ?prime?o = ??o as required). To prove 1., we have n;? turnstileleft H and ?;? turnstileleft e2 : ? a59? by assumption. Parts 2. and 3. also follow by assumption. case [if-f] : This is similar to [if-t]. case [cong] : ?n;?;H;if0 e1 then e2 else e3? ??? ?n;?prime;Hprime;if0 eprime1 then e2 else e3? follows from ?n;?;H;e1? ??? ?n;?prime;Hprime;eprime1?. To apply induction, we must have ?1,R;H turnstileleft ? which follows by Lemma C.0.25 since ?,R;H turnstileleft ? and ?1 a3?2 arrowhookleft? ?. (i) ?prime1;?prime turnstileleft eprime1 : int a59Rprime and (ii) n;?prime turnstileleft Hprime (iii) ?prime1,Rprime;Hprime turnstileleft ?prime (iv) traceOK(?prime) 256 for some ?prime ? ? and some ?prime1 ? [??1 ? ?0;?prime1;??1 ;??i1 ;??o1 ] where ?prime1 ? ?0 ? ??1. (Note that ??1 ? ??2 ???2 .) Let ?prime2 ? [??1 ? ?prime1 ? ?0;??2;??2 ;??i2 ;??o2 ]. Thus ?prime1 a3 ?prime2 arrowhookleft? ?prime so that ?prime ? [??1 ? ?0;?prime1 ? ??2;??2 ;??i1 ;??o2 ] where ?prime1??0???2 ? ??1???2, ?prime? = ??, ?prime?i = ??i, and ?prime?o = ??o as required. To prove 1., we have n;?prime turnstileleft Hprime by (ii), and can apply (TIf): We prove 1. by (ii) and as follows: (TIf) (TSub) ?2;?prime turnstileleft e2 : ? 
a59? ? ? ? ?2 ? ?prime2 ?prime2;?prime turnstileleft e2 : ? a59? (TSub) ?2;?prime turnstileleft e2 : ? a59? ? ? ? ?2 ? ?prime2 ?prime2;?prime turnstileleft e3 : ? a59? ?prime1;?prime turnstileleft eprime1 : int a59Rprime1 ?prime1 a3?prime2 arrowhookleft? ?prime ?prime;?prime turnstileleft if0 eprime1 then e2 else e3 : ? a59Rprime Note that ?2;?prime turnstileleft e2 : ? a59R follows from ?2;? turnstileleft e2 : ? a59R by weakening (Lemma C.0.19) and likewise for ?2;?prime turnstileleft e3 : ? a59R . Parts 2. and 3. follow by an argument similar to (TDeref)-[cong] and (TAssign)-[cong]. case (TTransact) : We know that: (TTransact) ?primeprime;? turnstileleft e : ? a59? ?? ? ?primeprime? ?? ? ?primeprime? ?;? turnstileleft tx(?primeprime???primeprime?i,?primeprime???primeprime?) e : ? a59? We can reduce using [tx-start]: ?n;(nprime,?,?);H;tx?prime e? ??? ?n;(nprime,?,?),(n,?,(?primeprime? ??primeprime?i,?primeprime? ??primeprime?));H;intx e? where ?prime ? (?primeprime? ? ?primeprime?i,?primeprime? ? ?primeprime?). Let ?prime = ? and ?prime ? [??;?;??;??i;??o] (and thus ?prime ? ? ? ??, ?prime? = ?? ? ?, ?prime? = ??, ?prime?i = ??i, and ?prime?o = ??o as required). To prove 1., we have n;? turnstileleft H by assumption, and the rest follows by (TIntrans): (TIntrans) ?primeprime;? turnstileleft e : ? a59? ?prime? ? ?primeprime? ?prime? ? ?primeprime? ?prime;? turnstileleft intx e : ? a59?primeprime,? The first premise is true by assumption, and the rest are true by choice of ?prime. We prove 2. as follows: (TC1) f ? ? ? f ? ?primeprime? f ? (?primeprime? ??primeprime?i) ? n ? ver(H,f) ?primeprime? ??primeprime?i ? ?primeprime? ??primeprime?i ?primeprime? ??primeprime? ? ?primeprime? ??primeprime? ?primeprime,?;H turnstileleft (n,?,(?primeprime? ??primeprime?i,?primeprime? ??primeprime?)) The first premise is true vacuously, the second is true by n;? turnstileleft H (which we have by assumption), and the third and fourth trivially hold. (TC2) ?primeprime,?;H turnstileleft (n,?,?prime) f ? ? ? f ? ?? f ? (????i) ? nprime ? ver(H,f) ?? ? (?? ???i) ?? ? (?? ??) [??;?;??;??i;??o],?primeprime,?;H turnstileleft (nprime,?,?),(n,?,?prime) We have proved the first premise above, the second premise holds vacuously, and the rest hold by inversion of ?,?;H turnstileleft (nprime,?,?). Part 3. follows easily: we have traceOK((nprime,?,?)) by assumption, traceOK((n,?,?prime)) is vacuously true, hence traceOK((nprime,?,?),(n,?,?prime)) is true. 257 case (TIntrans) : We know that: (TIntrans) ?primeprime;? turnstileleft e : ? a59R ?? ? ?primeprime? ?? ? ?primeprime? ?;? turnstileleft intx e : ? a59?primeprime,R There are two possible reductions: case [tx-end] : We have that e ? v and thus R ? ?; we reduce as follows: traceOK(nprimeprime,?primeprime,?primeprime) ?n;(nprime,?,?),(nprimeprime,?primeprime,?primeprime);H;intx v? ??? ?n;(nprime,?,?);H;v? Let ?prime = ? and ?prime = ? (and thus ?prime? = ????, ?prime?? ? ??, ?prime? = ??, ?prime?i = ??i, and ?prime?o = ??o as required). To prove 1., we know that n;? turnstileleft H follows by assumption and ?;? turnstileleft v : ? a59 ? by value typing. To prove 2., we must show that ?,?;H turnstileleft (nprime,?,?), but this is true by inversion on ?,?primeprime,?;H turnstileleft (nprime,?,?),(nprimeprime,?primeprime,?primeprime). For 3., traceOK((nprime,?,?)) follows from traceOK((nprime,?,?),(nprimeprime,?primeprime,?primeprime)) (which is true by assump- tion). case [tx-cong-2] : We know that ?n;?;H;e? ?? ? ?nprime;?prime;Hprime;eprime? ?n;?;H;intx e? ??? 
⟨n′;σ′;H′; intx e′⟩ follows from ⟨n;σ;H;e⟩ →ε ⟨n;σ′;H′;e′⟩ (because the reduction does not perform an update, the step effect is some ε₀, and we apply [tx-cong-2]). We have Φ″,R;H ⊢ σ by inversion on Φ,(Φ″,R);H ⊢ ((n′,ᾱ,ω̄),σ), hence by induction:

  (i) Φ‴;Γ′ ⊢ e′ : τ; R′   (ii) n;Γ′ ⊢ H′   (iii) Φ‴,R′;H′ ⊢ σ′   (iv) traceOK(σ′)

for some Γ′ ⊇ Γ and some Φ‴ such that Φ‴^α = Φ″^α ∪ ε₀, Φ‴^ε ∪ ε₀ ⊆ Φ″^ε, Φ‴^ω = Φ″^ω, Φ‴^ι = Φ″^ι, and Φ‴^o = Φ″^o. Let Φ′ = Φ (hence Φ′^α = Φ^α ∪ ε₀, Φ′^ε ∪ ε₀ ⊆ Φ^ε, Φ′^ω = Φ^ω, Φ′^ι = Φ^ι, and Φ′^o = Φ^o, as required) and R′ = R. To prove 1., we have n;Γ′ ⊢ H′ by (ii), and we can apply (TIntrans):

  (TIntrans)
    Φ‴;Γ′ ⊢ e′ : τ; R′    Φ′^α ⊆ Φ‴^α    Φ′^ω ⊆ Φ‴^ω
    ─────────────────────────────────────────────────
    Φ′;Γ′ ⊢ intx e′ : τ; (Φ‴,R′)

The first premise follows from (i), while the rest follow by assumption and the choice of Φ′. Part 2. follows directly from (iii) and Φ,(Φ″,R);H ⊢ (n′,ᾱ,ω̄),(n″,ᾱ″,ω̄″) (which we have by assumption). Part 3. follows directly from (iv).

case (TLet): We know that:

  (TLet)
    Φ₁;Γ ⊢ e₁ : τ₁; R    Φ₂;Γ,x:τ₁ ⊢ e₂ : τ₂; ∅    Φ₁ ▷ Φ₂ ↪ Φ
    ──────────────────────────────────────────────────────────
    Φ;Γ ⊢ let x:τ₁ = e₁ in e₂ : τ₂; R

We can reduce using either [let] or [cong].

case [let]: This implies that e₁ ≡ v, hence R ≡ ∅. We have:

  ⟨n;(n′,ᾱ,ω̄);H; let x:τ = v in e⟩ → ⟨n;(n′,ᾱ,ω̄);H; e[x ↦ v]⟩

We have Φ₂ = Φ (because Φ₁^ε ≡ ∅; if Φ₁^ε ≢ ∅ we can rewrite the derivation using value typing to make it so). Let Γ′ = Γ and Φ′ = Φ (and thus Φ′^α = Φ^α ∪ ε₀, Φ′^ε ∪ ε₀ ⊆ Φ^ε, Φ′^ω = Φ^ω, Φ′^ι = Φ^ι, and Φ′^o = Φ^o, as required). To prove 1., we have n;Γ ⊢ H and Φ;Γ,x:τ₁ ⊢ e₂ : τ₂; ∅ by assumption. By value typing we have Φ;Γ ⊢ v : τ₁; ∅, so by substitution (Lemma C.0.33) we have Φ;Γ ⊢ e₂[x ↦ v] : τ₂; ∅. Parts 2. and 3. hold by assumption.

case [cong]: Similar to (TIf)-[cong].

case (TApp): We know that:

  (TApp)
    Φ₁;Γ ⊢ e₁ : τ₁ →^Φf τ₂; R₁    Φ₂;Γ ⊢ e₂ : τ₁; R₂    Φ₁ ▷ Φ₂ ▷ Φ₃ ↪ Φ
    Φ₃^ε = Φf^ε    Φ₃^α ⊆ Φf^α    Φ₃^ω ⊆ Φf^ω    Φ₃^ι = Φ₃^o ∪ Φf^ε    Φ₃^o ⊆ Φf^o
    ──────────────────────────────────────────────────────────────────────
    Φ;Γ ⊢ e₁ e₂ : τ₂; R₁ ▷◁ R₂

We can reduce using either [call] or [cong].

case [call]: We have that

  ⟨n;(n′,ᾱ,ω̄);(H″, z ↦ (τ,λ(x).e,ν)); z v⟩ →{z} ⟨n;(n′,ᾱ∪{z},ω̄);(H″, z ↦ (τ,λ(x).e,ν)); e[x ↦ v]⟩

(where H ≡ H″, z ↦ (τ,λ(x).e,ν)), and

  (TApp)
    Φ₁;Γ ⊢ z : τ₁ →^Φf τ₂; ∅    Φ₂;Γ ⊢ v : τ₁; ∅    Φ₁ ▷ Φ₂ ▷ Φ₃ ↪ Φ
    Φ₃^ε = Φf^ε    Φ₃^α ⊆ Φf^α    Φ₃^ω ⊆ Φf^ω    Φ₃^ι = Φ₃^o ∪ Φf^ε    Φ₃^o ⊆ Φf^o
    ──────────────────────────────────────────────────────────────────────
    Φ;Γ ⊢ z v : τ₂; ∅

where by subtyping derivations (Lemma C.0.23) we have

  (TSub)
    (TGVar)  Γ(z) = τ′₁ →^Φ′f τ′₂  ⟹  Φ∅;Γ ⊢ z : τ′₁ →^Φ′f τ′₂; ∅
    τ₁ ≤ τ′₁    τ′₂ ≤ τ₂    Φ′f ≤ Φf  ⟹  τ′₁ →^Φ′f τ′₂ ≤ τ₁ →^Φf τ₂    Φ∅ ≤ Φ₁
    ──────────────────────────────────────────────────────────────────────
    Φ₁;Γ ⊢ z : τ₁ →^Φf τ₂; ∅

Define Φf ≡ [αf; εf; ωf; ιf; of] and Φ′f ≡ [α′f; ε′f; ω′f; ι′f; o′f]. Let Γ′ = Γ, R′ = ∅, and choose Φ′ = [Φ₁^α ∪ {z}; εf; Φ₃^ω; Φ₁^ι; Φ₃^o]. Since z ∈ ε′f (by n;Γ ⊢ H) and ε′f ⊆ εf (by Φ′f ≤ Φf), we have εf ∪ {z} ⊆ (ε₁ ∪ ε₂ ∪ εf). By the same argument we have {z} ⊆ Φ₁^ι. The choice of Φ′ is acceptable since Φ′^α = Φ^α ∪ {z}, Φ′^ε ∪ {z} ⊆ Φ^ε, Φ′^ω = Φ^ω, Φ′^ι = Φ^ι, and Φ′^o = Φ^o.

For 1., we have n;Γ ⊢ H′ by assumption; for the remainder we have to prove Φ′;Γ ⊢ e[x ↦ v] : τ₂; ∅. First, we must prove that Φ′f ≤ Φ′. Note that since z ∈ εf by n;Γ ⊢ H′, from Φ₁ ▷ Φ₂ ▷ Φ₃ ↪ Φ and the choice of Φ′ we get Φ′₃^α ∪ {z} ⊆ αf. We have:

  Φ′ ≡ [Φ₁^α ∪ {z}; εf; Φ₃^ω; Φ₁^ι; Φ₃^o]   (by choice of Φ′)
  Φf ≡ [αf; εf; ωf; ιf; of]
  Φ′f ≡ [α′f; ε′f; ω′f; ι′f; o′f]
  ε′f ⊆ εf                                   (by Φ′f ≤ Φf)
  αf ⊆ α′f                                   (by Φ′f ≤ Φf)
  ωf ⊆ ω′f                                   (by Φ′f ≤ Φf)
  ι′f ⊆ ιf                                   (by Φ′f ≤ Φf)
  of ⊆ o′f                                   (by Φ′f ≤ Φf)
  Φ′₃^α ∪ {z} ⊆ αf                           (by assumption and choice of Φ′)
  Φ′₃^α = Φ₁^α ∪ ε₁ ∪ Φ′₂^α                  (by Φ₁ ▷ Φ₂ ▷ Φ₃ ↪ Φ)
  Φ′₃^ω ⊆ ωf                                 (by assumption and choice of Φ′)
  Φ′₃^o ⊆ of                                 (by assumption and choice of Φ′)

Thus we have the result by (TSub):

  Φ′f;Γ ⊢ e[x ↦ v] : τ′₂; ∅    τ′₂ ≤ τ₂    Φ′f ≤ Φ′
  ─────────────────────────────────────────────────
  Φ′;Γ ⊢ e[x ↦ v] : τ₂; ∅

By assumption, we have Φ₂;Γ ⊢ v : τ₁; ∅. By value typing and τ₁ ≤ τ′₁ we have Φ′;Γ ⊢ v : τ′₁; ∅. Finally, by substitution, we have Φ′;Γ ⊢ e[x ↦ v] : τ₂; ∅.

For part 2., we need to prove Φ′,∅;H ⊢ (n″,ᾱ′,ω̄′) where ᾱ′ = ᾱ ∪ {z} and n″ = n′, hence:

  (TC1)
    f ∈ (ᾱ ∪ {z}) ⇒ f ∈ Φ^α ∪ {z}    f ∈ (εf ∪ Φ^ι) ⇒ n′ ≤ ver(H,f)
    Φ^α ∪ {z} ⊆ (ᾱ ∪ {z} ∪ Φ^ι)      (Φ^ω ∪ εf) ⊆ ω̄
    ─────────────────────────────────────────────────
    Φ′,∅;H ⊢ (n″,ᾱ′,ω̄′)

The first premise is true by assumption and the fact that {z} ⊆ {z}. The second premise is true by assumption. For part 3., we need to prove traceOK((n′, ᾱ ∪ {z}, ω̄)); we have traceOK((n′,ᾱ,ω̄)) by assumption, hence we need to prove that n′ ≤ ν. Since by assumption we have f ∈ (ε₁ ∪ ε₂ ∪ εf) ⇒ n′ ≤ ver(H,f), and z ∈ εf, we have n′ ≤ ν.

case [cong]:

case E e: ⟨n;σ;H; e₁ e₂⟩ →ε ⟨n;σ′;H′; e′₁ e₂⟩ follows from ⟨n;σ;H;e₁⟩ →ε ⟨n;σ′;H′;e′₁⟩. Since e₁ ≢ v implies R₂ ≡ ∅ by assumption, by Lemma C.0.26 we have Φ₁,R₁;H ⊢ σ, hence we can apply induction:

  (i) Φ′₁;Γ′ ⊢ e′₁ : τ₁ →^Φf τ₂; R′₁   (ii) n;Γ′ ⊢ H′   (iii) Φ′₁,R′₁;H′ ⊢ σ′   (iv) traceOK(σ′)

for some Γ′ ⊇ Γ and some Φ′₁ ≡ [Φ₁^α ∪ ε₀; ε′₁; Φ₁^ω; Φ₁^ι; Φ₁^o] where ε′₁ ∪ ε₀ ⊆ ε₁ and Φ₁^ω ⊆ ε₂ ∪ εf ∪ Φ₃^ω. Let

  Φ′₂ ≡ [Φ₁^α ∪ ε′₁ ∪ ε₀; ε₂; εf ∪ Φ₃^ω; Φ₂^ι; Φ₂^o]
  Φ′₃ ≡ [Φ₁^α ∪ ε′₁ ∪ ε₀ ∪ ε₂; εf; Φ₃^ω; Φ₃^ι; Φ₃^o]

Thus Φ′₃^ε = εf, Φ′₁ ▷ Φ′₂ ▷ Φ′₃ ↪ Φ′, Φ′₃^α ⊆ Φf^α, and Φ′₃^ω ⊆ Φf^ω (since Φ′₃^α ⊆ ε₀ ∪ Φ₃^α and Φ′₃^ω = Φ₃^ω). We have Φ′ ≡ [Φ₁^α ∪ ε₀; ε′₁ ∪ ε₂ ∪ εf; Φ₃^ω; Φ₁^ι; Φ₃^o]. The choice of Φ′ is acceptable since Φ′^α = Φ^α ∪ ε₀ and (ε′₁ ∪ εf ∪ ε₂) ∪ ε₀ ⊆ (ε₁ ∪ ε₂ ∪ εf), i.e., Φ′^ε ∪ ε₀ ⊆ Φ^ε, Φ′^ω = Φ^ω, Φ′^ι = Φ^ι, and Φ′^o = Φ^o, as required. To prove 1., we have n;Γ′ ⊢ H′ by (ii), and apply (TApp):

  (TApp)
    Φ′₁;Γ′ ⊢ e′₁ : τ₁ →^Φf τ₂; R′₁
    (TSub)  Φ₂;Γ′ ⊢ e₂ : τ₁; R₂    τ₁ ≤ τ₁    Φ₂ ≤ Φ′₂ componentwise
            (Φ₁^α ∪ ε₁ ⊆ Φ₁^α ∪ ε′₁ ∪ ε₀, ε₂ ⊆ ε₂, εf ∪ Φ₃^ω ⊆ εf ∪ Φ₃^ω, Φ₂^ι = Φ₂^ι, Φ₂^o = Φ₂^o)
            ⟹ Φ′₂;Γ′ ⊢ e₂ : τ₁; R₂
    Φ′₁ ▷ Φ′₂ ▷ Φ′₃ ↪ Φ′    Φ′₃^ε = Φf^ε    Φ′₃^α ⊆ Φf^α    Φ′₃^ω ⊆ Φf^ω    Φ′₃^ι = Φ′₃^o ∪ Φf^ε    Φ′₃^o ⊆ Φf^o
    ──────────────────────────────────────────────────────────────────────
    Φ′;Γ′ ⊢ e′₁ e₂ : τ₂; R′₁ ▷◁ R₂

Note that Φ₂;Γ′ ⊢ e₂ : τ₁; R₂ follows from Φ₂;Γ ⊢ e₂ : τ₁; R₂ by weakening (Lemma C.0.19). The last premise holds vacuously, as R₂ ≡ ∅ by assumption. To prove part 2., we must show Φ′,R′;H′ ⊢ σ′. The proof is similar to the (TAssign)-[cong] proof, case E := e, but substituting εf for εr. Part 3. follows directly from (iv).

case v E: ⟨n;σ;H; v e₂⟩ →ε ⟨n;σ′;H′; v e′₂⟩ follows from ⟨n;σ;H;e₂⟩ →ε ⟨n;σ′;H′;e′₂⟩. For convenience, we may assume Φ₁^ε ≡ ∅; if Φ₁^ε ≢ ∅, we can always construct a typing derivation of v that uses value typing to make Φ₁^ε ≡ ∅. Note that Φ₁ ▷ Φ₂ ▷ Φ₃ ↪ Φ would still hold, since Lemma C.0.24 allows us to decrease Φ₂^α to satisfy Φ₂^α = Φ₁^α ∪ ε₁; similarly, since Φ₃^α = Φ₁^α ∪ ε₁ ∪ ε₂, we know that Φ₃^α ⊆ Φf^α would still hold if Φ₃^α were smaller as a result of shrinking ε₁ to ∅. Since e₁ ≡ v, by inversion R₁ ≡ ∅, and by Lemma C.0.26 (which we can apply since Φ₁^ε ≡ ∅) we have Φ₂,R₂;H ⊢ σ; hence by induction:

  (i) Φ′₂;Γ′ ⊢ e′₂ : τ₁; R′₂   (ii) n;Γ′ ⊢ H′   (iii) Φ′₂,R′₂;H′ ⊢ σ′   (iv) traceOK(σ′)

for some Γ′ ⊇ Γ and some Φ′₂ ≡ [Φ₂^α ∪ ε₀; ε′₂; Φ₂^ω; Φ₂^ι; Φ₂^o] where ε′₂ ∪ ε₀ ⊆ ε₂; note Φ₂^α = Φ₁^α (since ε₁ ≡ ∅) and Φ₂^ω ⊆ εf ∪ Φ₃^ω. Let

  Φ′₁ ≡ [Φ₁^α ∪ ε₀; ∅; ε′₂ ∪ εf ∪ Φ₃^ω; Φ₁^ι; Φ₁^o]
  Φ′₃ ≡ [Φ₁^α ∪ ε₀ ∪ ε′₂; εf; Φ₃^ω; Φ₃^ι; Φ₃^o]

Thus Φ′₃^ε = εf, Φ′₁ ▷ Φ′₂ ▷ Φ′₃ ↪ Φ′, Φ′₃^α ⊆ Φf^α, and Φ′₃^ω ⊆ Φf^ω (since Φ′₃^α ⊆ ε₀ ∪ Φ₃^α and Φ′₃^ω = Φ₃^ω). We have Φ′ ≡ [Φ₁^α ∪ ε₀; ε′₂ ∪ εf; Φ₃^ω; Φ₁^ι; Φ₃^o] and (ε′₂ ∪ εf) ∪ ε₀ ⊆ (ε₂ ∪ εf). The choice of Φ′ is acceptable since Φ′^α = Φ^α ∪ ε₀, Φ′^ε ∪ ε₀ ⊆ Φ^ε, Φ′^ω = Φ^ω, Φ′^ι = Φ^ι, and Φ′^o = Φ^o, as required. To prove 1., we have n;Γ′ ⊢ H′ by (ii), and we can apply (TApp):

  (TApp)
    Φ′₁;Γ′ ⊢ v : τ₁ →^Φf τ₂; ∅    Φ′₂;Γ′ ⊢ e′₂ : τ₁; R′₂    Φ′₁ ▷ Φ′₂ ▷ Φ′₃ ↪ Φ′
    Φ′₃^ε = Φf^ε    Φ′₃^α ⊆ Φf^α    Φ′₃^ω ⊆ Φf^ω    Φ′₃^ι = Φ′₃^o ∪ Φf^ε    Φ′₃^o ⊆ Φf^o
    ──────────────────────────────────────────────────────────────────────
    Φ′;Γ′ ⊢ e₁ e′₂ : τ₂; ∅ ▷◁ R′₂

(Note that ∅ ▷◁ R′₂ = R′₂.) The first premise follows by value typing and weakening; the second by (i); the third through eighth by the choice of Φ′, Φ′₁, Φ′₂, Φ′₃. To prove part 2., we must show Φ′,R′;H′ ⊢ σ′. The proof is similar to the (TAssign)-[cong] proof, case r := E, but substituting εf for εr. Part 3. follows directly from (iv).

case (TSub): We have

  (TSub)
    Φ″;Γ ⊢ e : τ″; R    τ″ ≤ τ    Φ″ ≡ [α; ε″; ω; ι; o]    Φ ≡ [α; ε; ω; ι; o]    ε″ ⊆ ε
    ──────────────────────────────────────────────────────────────────────
    Φ;Γ ⊢ e : τ; R

since by flow effect weakening (Lemma C.0.24) we know that α and ω are unchanged in the use of (TSub). We have ⟨n;σ;H;e⟩ →ε ⟨n;σ′;H′;e′⟩. To apply induction we must show that n;Γ ⊢ H, which we have by assumption, Φ″;Γ ⊢ e : τ″; R, which we also have by assumption, and Φ″,R;H ⊢ σ.
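The bookkeeping above leans entirely on two operations on contextual effects: sequencing (Φ₁ ▷ Φ₂ ↪ Φ) and subeffecting (Φ ≤ Φ′). To make them concrete, here is a minimal OCaml sketch modeling effects as sets of global symbol names; this is an illustration under our own simplifying assumptions (three components only, and combinator names of our choosing), not Ginseng's implementation.

  (* Contextual effects: alpha = prior, eps = current, omega = future,
     each a set of global symbol names. *)
  module S = Set.Make (String)

  type phi = { alpha : S.t; eps : S.t; omega : S.t }

  (* Sequencing Phi1 |> Phi2 ~> Phi, as in the hypotheses of (TLet) and
     (TApp): the combined prior effect is Phi1's prior, the combined
     current effect joins both current effects, and the combined future
     effect is Phi2's future. *)
  let seq p1 p2 =
    { alpha = p1.alpha; eps = S.union p1.eps p2.eps; omega = p2.omega }

  (* Subeffecting Phi <= Phi': the supereffect may assume a smaller
     prior and future effect and a larger current effect. *)
  let sub p p' =
    S.subset p'.alpha p.alpha && S.subset p.eps p'.eps
    && S.subset p'.omega p.omega

  let () =
    let s xs = S.of_list xs in
    let p1 = { alpha = s []; eps = s [ "x" ]; omega = s [ "y"; "z" ] } in
    let p2 = { alpha = s [ "x" ]; eps = s [ "y"; "z" ]; omega = s [] } in
    let p = seq p1 p2 in
    assert (S.elements p.eps = [ "x"; "y"; "z" ]);
    assert (sub p1 p1)

Under this reading, the repeated obligations of the form Φ′^α = Φ^α ∪ ε₀ and Φ′^ε ∪ ε₀ ⊆ Φ^ε in the preservation cases simply say that one step's effect ε₀ migrates from the current effect into the prior effect.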
We prove Φ″,R;H ⊢ σ below. We know

  (TC1)
    f ∈ ᾱ ⇒ f ∈ α    f ∈ (ε ∪ ι) ⇒ n ≤ ver(H,f)    α ⊆ (ᾱ ∪ ι)    (ω ∪ ε) ⊆ ω̄
    ──────────────────────────────────────────────────────────────────────
    [α; ε; ω; ι; o],R;H ⊢ (n,ᾱ,ω̄)

and need to show

  (TC1)
    f ∈ ᾱ ⇒ f ∈ α    f ∈ (ε″ ∪ ι) ⇒ n ≤ ver(H,f)    α ⊆ (ᾱ ∪ ι)    (ω ∪ ε″) ⊆ ω̄
    ──────────────────────────────────────────────────────────────────────
    [α; ε″; ω; ι; o],R;H ⊢ (n,ᾱ,ω̄)

The first premise is true by assumption. The second follows easily by assumption and the fact that ε″ ⊆ ε. The third premise follows by assumption. The fourth premise similarly follows by assumption and by ε″ ⊆ ε. Hence we have:

  (i) Φ‴;Γ′ ⊢ e′ : τ″; R′   (ii) n;Γ′ ⊢ H′   (iii) Φ‴,R′;H′ ⊢ σ′   (iv) traceOK(σ′)

for some Γ′ ⊇ Γ and some Φ‴ such that Φ‴^α = α ∪ ε₀, Φ‴^ε ∪ ε₀ ⊆ ε″, Φ‴^ι = Φ″^ι, and Φ‴^o = Φ″^o. Let Φ′ ≡ Φ‴, and thus Φ′^α = α ∪ ε₀, Φ′^ε ∪ ε₀ ⊆ ε (since ε″ ⊆ ε), Φ′^ω = ω, Φ′^ι = Φ^ι, and Φ′^o = Φ^o, as required. All results follow by induction.

Lemma C.0.32 (Progress). If n ⊢ H, e : τ (such that Φ;Γ ⊢ e : τ; R and n;Γ ⊢ H), then for all σ such that Φ,R;H ⊢ σ and traceOK(σ), either e is a value, or there exist n′, H′, σ′, e′ such that ⟨n;σ;H;e⟩ →ε ⟨n′;σ′;H′;e′⟩.

Proof. Induction on the typing derivation n ⊢ H, e : τ; consider each possible rule for the conclusion of this judgment:

case (TInt), (TGvar), (TLoc): These are all values.

case (TVar): Cannot occur, since local values are substituted for.

case (TRef): We must have that

  (TRef)
    Φ;Γ ⊢ e′ : τ; R
    ───────────────────────────
    Φ;Γ ⊢ ref e′ : ref^ε τ; R

There are two possible reductions, depending on the shape of e:

case e′ ≡ v: By inversion on Φ;Γ ⊢ v : τ; ∅ we know that R ≡ ∅, hence by inversion on Φ,R;H ⊢ σ we have σ ≡ (n′,ᾱ,ω̄). We have that ⟨n;(n′,ᾱ,ω̄);H; ref v⟩ → ⟨n;(n′,ᾱ,ω̄);H′; r⟩, where r ∉ dom(H) and H′ = H, r ↦ (τ,v,∅), by [ref].

case e′ ≢ v: By induction, ⟨n;σ;H;e′⟩ →ε ⟨n′;σ′;H′;e″⟩, and thus ⟨n;σ;H;(ref [·])[e′]⟩ →ε ⟨n′;σ′;H′;(ref [·])[e″]⟩ by [cong].

case (TDeref): We know that

  (TDeref)
    Φ₁;Γ ⊢ e : ref^εr τ; R    Φ₂^ε = εr    Φ₂^ι = Φ₂^o ∪ εr    Φ₁ ▷ Φ₂ ↪ Φ
    ──────────────────────────────────────────────────────────────────────
    Φ;Γ ⊢ !e : τ; R

Consider the shape of e:

case e ≡ v: Since v is a value of type ref^εr τ, we must have v ≡ z or v ≡ r.

case v ≡ z: We have

  (TDeref)
    Φ₁;Γ ⊢ z : ref^εr τ; ∅    Φ₂^ε = εr    Φ₂^ι = Φ₂^o ∪ εr    Φ₁ ▷ Φ₂ ↪ Φ
    ──────────────────────────────────────────────────────────────────────
    Φ;Γ ⊢ !z : τ; ∅

where by subtyping derivations (Lemma C.0.23) we have

  (TSub)
    (TGVar)  Γ(z) = ref^ε′r τ′  ⟹  Φ∅;Γ ⊢ z : ref^ε′r τ′; ∅
    τ′ ≤ τ    τ ≤ τ′    ε′r ⊆ εr  ⟹  ref^ε′r τ′ ≤ ref^εr τ    Φ∅ ≤ Φ₁
    ──────────────────────────────────────────────────────────────────────
    Φ₁;Γ ⊢ z : ref^εr τ; ∅

By inversion on Φ,∅;H ⊢ σ we have σ ≡ (n′,ᾱ,ω̄). By n;Γ ⊢ H we have z ∈ dom(H) (and thus H ≡ H″, z ↦ (ref^ε′r τ′, v, ν)) since Γ(z) = ref^ε′r τ′. Therefore, we can reduce via [gvar-deref]:

  ⟨n;(n′,ᾱ,ω̄);(H″, z ↦ (τ′,v,ν)); !z⟩ →{z} ⟨n;(n′,ᾱ∪{z},ω̄);(H″, z ↦ (τ′,v,ν)); v⟩

case v ≡ r: Similar to the v ≡ z case above, but reduce using [deref].

case e ≢ v: Let E ≡ ![·] so that e ≡ E[e′]. To apply induction, we have Φ₁,R;H ⊢ σ by Lemma C.0.25. Thus we get ⟨n;σ;H;e′⟩ →ε ⟨n′;σ′;H′;e″⟩, hence we have that ⟨n;σ;H;E[e′]⟩ →ε ⟨n′;σ′;H′;E[e″]⟩ by [cong].

case (TAssign):

  (TAssign)
    Φ₁;Γ ⊢ e₁ : ref^εr τ; R₁    Φ₂;Γ ⊢ e₂ : τ; R₂    Φ₃^ε = εr    Φ₃^ι = Φ₃^o ∪ εr    Φ₁ ▷ Φ₂ ▷ Φ₃ ↪ Φ
    ──────────────────────────────────────────────────────────────────────
    Φ;Γ ⊢ e₁ := e₂ : τ; R₁ ▷◁ R₂

Depending on the shape of e, we have:

case e₁ ≡ v₁, e₂ ≡ v₂: Since v₁ is a value of type ref^εr τ, we must have v₁ ≡ z or v₁ ≡ r. The results follow by reasoning quite similar to (TDeref) above.

case e₁ ≡ v₁, e₂ ≢ v: Let E ≡ v₁ := [·] so that e ≡ E[e₂]. Since e₁ is a value, R₁ ≡ ∅, hence we have Φ₂,R;H ⊢ σ by Lemma C.0.26 and we can apply induction. We have ⟨n;σ;H;e₂⟩ →ε ⟨n′;σ′;H′;e′₂⟩, and thus ⟨n;σ;H;E[e₂]⟩ →ε ⟨n′;σ′;H′;E[e′₂]⟩ by [cong].

case e₁ ≢ v: Since e₁ is not a value, R₂ ≡ ∅, hence we have Φ₁,R;H ⊢ σ by Lemma C.0.26 and we can apply induction. The rest follows by an argument similar to the above case.

case (TCheckin):

  (TCheckin)
    ω ∪ o ⊆ ω′    α ⊆ α′
    ─────────────────────────────────────────
    [α; ε; ω; ι; o];Γ ⊢ checkin^α′,ω′ : int; ∅

By inversion on Φ;Γ ⊢ checkin^α′,ω′ : int; R we have that R ≡ ∅, hence by inversion on Φ,R;H ⊢ σ we have σ ≡ (n′,ᾱ,ω̄) and can reduce via [checkin].

case (TIf):

  (TIf)
    Φ₁;Γ ⊢ e₁ : int; R    Φ₂;Γ ⊢ e₂ : τ; ∅    Φ₂;Γ ⊢ e₃ : τ; ∅    Φ₁ ▷ Φ₂ ↪ Φ
    ──────────────────────────────────────────────────────────────────────
    Φ;Γ ⊢ if0 e₁ then e₂ else e₃ : τ; R

Depending on the shape of e, we have:

case e₁ ≡ v: This implies R ≡ ∅, so by inversion on Φ,∅;H ⊢ σ we have σ ≡ (n′,ᾱ,ω̄). Since the type of v is int, we know v must be an integer n. Thus we can reduce via either [if-t] or [if-f].

case e₁ ≢ v: Let E ≡ if0 [·] then e₂ else e₃ so that e ≡ E[e₁]. To apply induction, we have Φ₁,R;H ⊢ σ by Lemma C.0.25. We have ⟨n;σ;H;e₁⟩ →ε ⟨n′;σ′;H′;e′₁⟩ and thus ⟨n;σ;H;E[e₁]⟩ →ε ⟨n′;σ′;H′;E[e′₁]⟩ by [cong].

case (TTransact): We know that:

  (TTransact)
    Φ″;Γ ⊢ e : τ; ∅    Φ^α ⊆ Φ″^α    Φ^ω ⊆ Φ″^ω
    ─────────────────────────────────────────────
    Φ;Γ ⊢ tx^(Φ″^α ∪ Φ″^ι, Φ″^ω ∪ Φ″^ε) e : τ; ∅

By inversion on Φ,∅;H ⊢ σ we have σ ≡ (n′,ᾱ,ω̄). Thus we can reduce by [tx-start].

case (TIntrans): We know that:

  (TIntrans)
    Φ″;Γ ⊢ e : τ; R    Φ^α ⊆ Φ″^α    Φ^ω ⊆ Φ″^ω
    ─────────────────────────────────────────────
    Φ;Γ ⊢ intx e : τ; (Φ″,R)

Consider the shape of e:

case e ≡ v: Thus

  (TIntrans)
    Φ″;Γ ⊢ v : τ; ∅    Φ^α ⊆ Φ″^α    Φ^ω ⊆ Φ″^ω
    ─────────────────────────────────────────────
    Φ;Γ ⊢ intx v : τ; (Φ″,∅)

We have Φ,(Φ″,∅);H ⊢ σ by assumption:

  (TC2)
    Φ″,∅;H ⊢ (n″,ᾱ″,ω̄″)    f ∈ ᾱ ⇒ f ∈ Φ^α    f ∈ (Φ^ε ∪ Φ^ι) ⇒ n′ ≤ ver(H,f)
    Φ^α ⊆ (ᾱ ∪ Φ^ι)    (Φ^ω ∪ Φ^ε) ⊆ ω̄
    ──────────────────────────────────────────────────────────────────────
    [Φ^α; ε; Φ^ω; Φ^ι; Φ^o],(Φ″,∅);H ⊢ (n′,ᾱ,ω̄),(n″,ᾱ″,ω̄″)

By inversion we have σ ≡ ((n′,ᾱ,ω̄),(n″,ᾱ″,ω̄″)); by assumption we have traceOK((n″,ᾱ″,ω̄″)), so we can reduce via [tx-end].

case e ≢ v: We have Φ,(Φ″,R);H ⊢ σ by assumption. By induction we have ⟨n;σ;H;e⟩ →ε ⟨n′;σ″;H′;e″⟩, hence by [tx-cong-2]:

  ⟨n;σ;H; intx e⟩ →ε ⟨n′;σ″;H′; intx e″⟩

case (TLet): We know that:

  (TLet)
    Φ₁;Γ ⊢ e₁ : τ₁; R    Φ₂;Γ,x:τ₁ ⊢ e₂ : τ₂; ∅    Φ₁ ▷ Φ₂ ↪ Φ
    ──────────────────────────────────────────────────────────
    Φ;Γ ⊢ let x:τ₁ = e₁ in e₂ : τ₂; R

Consider the shape of e:

case e₁ ≡ v: Thus Φ₁;Γ ⊢ v : τ; ∅, and by inversion on Φ,∅;H ⊢ σ we have σ ≡ (n′,ᾱ,ω̄). We can reduce via [let].

case e₁ ≢ v: Let E ≡ let x:τ₁ = [·] in e₂ so that e ≡ E[e₁]. To apply induction, we have Φ₁,R;H ⊢ σ by Lemma C.0.25. We have ⟨n;σ;H;e₁⟩ →ε ⟨n′;σ′;H′;e′₁⟩ and so ⟨n;σ;H;E[e₁]⟩ →ε ⟨n′;σ′;H′;E[e′₁]⟩ by [cong].

case (TApp):

  (TApp)
    Φ₁;Γ ⊢ e₁ : τ₁ →^Φf τ₂; R₁    Φ₂;Γ ⊢ e₂ : τ₁; R₂    Φ₁ ▷ Φ₂ ▷ Φ₃ ↪ Φ
    Φ₃^ε = Φf^ε    Φ₃^α ⊆ Φf^α    Φ₃^ω ⊆ Φf^ω    Φ₃^ι = Φ₃^o ∪ Φf^ε    Φ₃^o ⊆ Φf^o
    ──────────────────────────────────────────────────────────────────────
    Φ;Γ ⊢ e₁ e₂ : τ₂; R₁ ▷◁ R₂

Depending on the shape of e, we have:

case e₁ ≡ v₁, e₂ ≡ v₂: Since v₁ is a value of type τ₁ →^Φf τ₂, we must have v₁ ≡ z, hence

  (TApp)
    Φ₁;Γ ⊢ z : τ₁ →^Φf τ₂; ∅    Φ₂;Γ ⊢ v : τ₁; ∅    Φ₁ ▷ Φ₂ ▷ Φ₃ ↪ Φ
    Φ₃^ε = Φf^ε    Φ₃^α ⊆ Φf^α    Φ₃^ω ⊆ Φf^ω    Φ₃^ι = Φ₃^o ∪ Φf^ε    Φ₃^o ⊆ Φf^o
    ──────────────────────────────────────────────────────────────────────
    Φ;Γ ⊢ z v : τ₂; ∅

where by subtyping derivations (Lemma C.0.23) we have

  (TSub)
    (TGVar)  Γ(z) = τ′₁ →^Φ′f τ′₂  ⟹  Φ∅;Γ ⊢ z : τ′₁ →^Φ′f τ′₂; ∅
    τ₁ ≤ τ′₁    τ′₂ ≤ τ₂    Φ′f ≤ Φf  ⟹  τ′₁ →^Φ′f τ′₂ ≤ τ₁ →^Φf τ₂    Φ∅ ≤ Φ₁
    ──────────────────────────────────────────────────────────────────────
    Φ₁;Γ ⊢ z : τ₁ →^Φf τ₂; ∅

By inversion on Φ,∅;H ⊢ σ we have σ ≡ (n′,ᾱ,ω̄). By n;Γ ⊢ H we have z ∈ dom(H) and H ≡ (H″, z ↦ (τ′₁ →^Φ′f τ′₂, λ(x).e″, ν)) since Γ(z) = τ′₁ →^Φ′f τ′₂. By [call], we have:

  ⟨n;(n′,ᾱ,ω̄);(H″, z ↦ (τ′₁ →^Φ′f τ′₂, λ(x).e″, ν)); z v⟩ →{z} ⟨n;(n′,ᾱ∪{z},ω̄);(H″, z ↦ (τ′₁ →^Φ′f τ′₂, λ(x).e″, ν)); e″[x ↦ v]⟩

case e₁ ≢ v: Let E ≡ [·] e₂ so that e ≡ E[e₁]. Since e₁ is not a value, R₂ ≡ ∅, hence we have Φ₁,R;H ⊢ σ by Lemma C.0.26 and we can apply induction: ⟨n;σ;H;e₁⟩ →ε ⟨n′;σ′;H′;e′₁⟩, and thus ⟨n;σ;H;E[e₁]⟩ →ε ⟨n′;σ′;H′;E[e′₁]⟩ by [cong].

case e₁ ≡ v₁, e₂ ≢ v: Let E ≡ v₁ [·] so that e ≡ E[e₂]. Since e₁ is a value, R₁ ≡ ∅, hence we have Φ₂,R;H ⊢ σ by Lemma C.0.26 and we can apply induction. The rest follows similarly to the above case.

case (TSub): We know that:

  (TSub)
    Φ₁;Γ ⊢ e : τ′; R    τ′ ≤ τ    Φ₁ ≡ [α; ε₁; ω; ι; o]    Φ ≡ [α; ε; ω; ι; o]    ε₁ ⊆ ε
    ──────────────────────────────────────────────────────────────────────
    Φ;Γ ⊢ e : τ; R

If e is a value v we are done. Otherwise, since Φ₁,R;H ⊢ σ follows from Φ,R;H ⊢ σ (by Φ₁^ε ⊆ Φ^ε and Φ₁^ω = Φ^ω), we have ⟨n;σ;H;e⟩ →ε ⟨n′;σ′;H′;e′⟩ by induction.
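The case analysis above has a direct computational reading: a closed, well-typed term is either a value or matches exactly one reduction rule. To make that concrete, here is a minimal OCaml sketch of a stepper for a tiny fragment of the calculus (integers, global functions, application, let); the constructor and rule names are ours, chosen only to echo [call], [let], and [cong], and the fragment omits effects, traces, and updates entirely.

  (* Every closed, well-formed non-value can take a step; `step`
     returns None only for values (or ill-typed stuck terms). *)
  type exp =
    | Int of int
    | Var of string                 (* local variable x *)
    | GVar of string                (* global function name z *)
    | App of exp * exp              (* e1 e2 *)
    | Let of string * exp * exp     (* let x = e1 in e2 *)

  let is_value = function Int _ | GVar _ -> true | _ -> false

  (* The heap maps global names to function bodies (x, e); we assume
     every GVar appearing in a program is bound, mirroring n;Γ ⊢ H. *)
  type heap = (string * (string * exp)) list

  let rec subst x v = function
    | Int n -> Int n
    | Var y -> if y = x then v else Var y
    | GVar z -> GVar z
    | App (e1, e2) -> App (subst x v e1, subst x v e2)
    | Let (y, e1, e2) ->
        (* a let that rebinds x shadows it in e2 *)
        Let (y, subst x v e1, if y = x then e2 else subst x v e2)

  let rec step (h : heap) (e : exp) : exp option =
    match e with
    | App (GVar z, v) when is_value v ->            (* [call] *)
        let x, body = List.assoc z h in
        Some (subst x v body)
    | App (e1, e2) when not (is_value e1) ->        (* [cong] on e1 *)
        Option.map (fun e1' -> App (e1', e2)) (step h e1)
    | App (v1, e2) ->                               (* [cong] on e2 *)
        Option.map (fun e2' -> App (v1, e2')) (step h e2)
    | Let (x, v, e2) when is_value v ->             (* [let] *)
        Some (subst x v e2)
    | Let (x, e1, e2) ->                            (* [cong] *)
        Option.map (fun e1' -> Let (x, e1', e2)) (step h e1)
    | _ -> None                                     (* values *)

  let () =
    let h = [ ("id", ("x", Var "x")) ] in
    assert (step h (App (GVar "id", Int 7)) = Some (Int 7));
    assert (step h (Int 7) = None)

The shape of `step` mirrors the proof: each typing rule's case either identifies a value, fires a computation rule when subterms are values, or defers to the subterm via a congruence rule.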
Lemma C.0.33 (Substitution). If Φ;Γ,x:τ′ ⊢ e : τ and Φ;Γ ⊢ v : τ′, then Φ;Γ ⊢ e[x ↦ v] : τ.

Proof. Induction on the typing derivation of Φ;Γ,x:τ′ ⊢ e : τ.

case (TInt): Since e ≡ n and n[x ↦ v] ≡ n, the result follows by (TInt).

case (TVar): e is a variable y. We have two cases:

case y = x: We have τ = τ′ and y[x ↦ v] ≡ v, hence we need to prove that Φ;Γ ⊢ v : τ, which is true by assumption.

case y ≠ x: We have y[x ↦ v] ≡ y and need to prove that Φ;Γ ⊢ y : τ. By assumption, Φ;Γ,x:τ′ ⊢ y : τ, and thus (Γ,x:τ′)(y) = τ; but since x ≠ y, this implies Γ(y) = τ, and we have to prove Φ;Γ ⊢ y : τ, which follows by (TVar).

case (TGvar), (TLoc), (TCheckin): Similar to (TInt).

case (TRef): We know that Φ;Γ,x:τ′ ⊢ ref e : ref^ε τ and Φ;Γ ⊢ v : τ′, and need to prove that Φ;Γ ⊢ (ref e)[x ↦ v] : ref^ε τ. By inversion on Φ;Γ,x:τ′ ⊢ ref e : ref^ε τ we have Φ;Γ,x:τ′ ⊢ e : τ; applying induction to this, we have Φ;Γ ⊢ e[x ↦ v] : τ. We can now apply (TRef):

  (TRef)
    Φ;Γ ⊢ e[x ↦ v] : τ
    ─────────────────────────────────────
    Φ;Γ ⊢ ref (e[x ↦ v]) : ref^ε τ

The desired result follows since ref (e[x ↦ v]) ≡ (ref e)[x ↦ v].

case (TDeref): We know that Φ;Γ,x:τ′ ⊢ !e : τ and Φ;Γ ⊢ v : τ′, and need to prove that Φ;Γ ⊢ (!e)[x ↦ v] : τ. By inversion on Φ;Γ,x:τ′ ⊢ !e : τ we have Φ₁;Γ,x:τ′ ⊢ e : ref^εr τ and Φ₂ such that Φ₁ ▷ Φ₂ ↪ Φ. By value typing we have Φ₁;Γ ⊢ v : τ′. We can then apply induction, yielding Φ₁;Γ ⊢ e[x ↦ v] : ref^εr τ. Finally, we apply (TDeref):

  (TDeref)
    Φ₁;Γ ⊢ e[x ↦ v] : ref^εr τ    Φ₂^ε = εr    Φ₂^ι = Φ₂^o ∪ εr    Φ₁ ▷ Φ₂ ↪ Φ
    ──────────────────────────────────────────────────────────────────────
    Φ;Γ ⊢ !e[x ↦ v] : τ

Note that the second premise holds by inversion on Φ;Γ,x:τ′ ⊢ !e : τ. The desired result follows since !(e[x ↦ v]) ≡ (!e)[x ↦ v].

case (TSub): We know that Φ;Γ,x:τ′ ⊢ e : τ and Φ;Γ ⊢ v : τ′, and need to prove that Φ;Γ ⊢ e[x ↦ v] : τ. By inversion on Φ;Γ,x:τ′ ⊢ e : τ we have Φ″;Γ,x:τ′ ⊢ e : τ″. By value typing we have Φ″;Γ ⊢ v : τ′. We can then apply induction, yielding Φ″;Γ ⊢ e[x ↦ v] : τ″. Finally, we apply (TSub)

  (TSub)
    Φ″;Γ ⊢ e[x ↦ v] : τ″    τ″ ≤ τ    Φ″ ≤ Φ
    ─────────────────────────────────────────
    Φ;Γ ⊢ e[x ↦ v] : τ

and get the desired result.

case (TTransact), (TIntrans): Similar to (TSub).

case (TApp): We know that

  (TApp)
    Φ₁;Γ,x:τ′ ⊢ e₁ : τ₁ →^Φf τ₂    Φ₂;Γ,x:τ′ ⊢ e₂ : τ₁    Φ₁ ▷ Φ₂ ▷ Φ₃ ↪ Φ
    Φ₃^ε = Φf^ε    Φ₃^α ⊆ Φf^α    Φ₃^ω ⊆ Φf^ω    Φ₃^ι = Φ₃^o ∪ Φf^ε    Φ₃^o ⊆ Φf^o
    ──────────────────────────────────────────────────────────────────────
    Φ;Γ,x:τ′ ⊢ e₁ e₂ : τ₂

where Φ;Γ ⊢ v : τ′, and need to prove that Φ;Γ ⊢ (e₁ e₂)[x ↦ v] : τ₂. Call the first two premises above (1) and (2), and note that we have (3) Φ;Γ ⊢ v : τ′ ⟹ Φ₁;Γ ⊢ v : τ′ and (4) Φ;Γ ⊢ v : τ′ ⟹ Φ₂;Γ ⊢ v : τ′ by the value typing lemma. By (1), (3), and induction we have Φ₁;Γ ⊢ e₁[x ↦ v] : τ₁ →^Φf τ₂. Similarly, by (2), (4), and induction we have Φ₂;Γ ⊢ e₂[x ↦ v] : τ₁. We can now apply (TApp):

  (TApp)
    Φ₁;Γ ⊢ e₁[x ↦ v] : τ₁ →^Φf τ₂    Φ₂;Γ ⊢ e₂[x ↦ v] : τ₁    Φ₁ ▷ Φ₂ ▷ Φ₃ ↪ Φ
    Φ₃^ε = Φf^ε    Φ₃^α ⊆ Φf^α    Φ₃^ω ⊆ Φf^ω    Φ₃^ι = Φ₃^o ∪ Φf^ε    Φ₃^o ⊆ Φf^o
    ──────────────────────────────────────────────────────────────────────
    Φ;Γ ⊢ e₁[x ↦ v] e₂[x ↦ v] : τ₂

Since e₁[x ↦ v] e₂[x ↦ v] ≡ (e₁ e₂)[x ↦ v], we get the desired result.

case (TAssign), (TIf), (TLet): Similar to (TApp).
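Before moving to the multi-threaded proofs, it may help to see the trace judgments of this appendix concretely. The following OCaml sketch gives one possible reading of a thread trace (n′, ᾱ, ω̄) and of traceOK, under our assumption that ᾱ records the globals accessed so far, ω̄ the accesses still expected, and that traceOK demands every recorded access be version-consistent with the transaction's start version; all field and function names here are ours, not part of the formal development.

  (* Traces and traceOK, read concretely (a sketch under the stated
     assumptions, not the thesis's exact definitions). *)
  module S = Set.Make (String)

  type trace = { n : int; abar : S.t; wbar : S.t }

  (* ver(H, f): version of global f in heap H, modeled as an assoc list
     from names to version numbers. *)
  let ver (h : (string * int) list) (f : string) : int = List.assoc f h

  (* traceOK: every global accessed so far still has a version no older
     than the version n' at which the transaction started -- i.e., the
     transaction has observed a single, consistent program version. *)
  let trace_ok (h : (string * int) list) (t : trace) : bool =
    S.for_all (fun f -> t.n <= ver h f) t.abar

  let () =
    let h = [ ("z", 3); ("w", 2) ] in
    let t = { n = 2; abar = S.of_list [ "z"; "w" ]; wbar = S.empty } in
    assert (trace_ok h t);
    (* if an update had left w at a version older than the transaction's
       start, the trace would no longer be consistent *)
    assert (not (trace_ok [ ("z", 3); ("w", 1) ] t))

On this reading, the proof obligations of the form "n′ ≤ ver(H,f)" that recur in the (TC1) and (TC2) cases are exactly the per-access checks that keep trace_ok invariant across reductions.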
Appendix D: Multi-threading Proofs

Lemma D.0.34 (Fork derivations). If Φ;Γ ⊢ E[fork^α,ω e] : τ; R then Φ;Γ ⊢ E[0] : τ; R.

Proof. By induction on E:

case E = [·]: By assumption, we have Φ;Γ ⊢ fork^α,ω e : int; ∅. We have Φ;Γ ⊢ 0 : int; ∅ by (TInt).

case E = v E′: By assumption, we have Φ;Γ ⊢ v E′[fork^α″,ω″ e] : τ; R. By subtyping derivations (Lemma B.0.6) we know we can construct a proof derivation of this ending in (TSub):

  (TSub)
    (TApp)
      Φ₁;Γ ⊢ v : τ₁ →^Φf τ′₂; ∅    Φ₂;Γ ⊢ E′[fork^α″,ω″ e] : τ₁; R    Φ₁ ▷ Φ₂ ▷ Φ₃ ↪ Φ′
      Φ₃^ε = Φf^ε    Φ₃^α ⊆ Φf^α    Φ₃^ω ⊆ Φf^ω    Φ₃^ι = Φ₃^o ∪ Φf^ε    Φ₃^o ⊆ Φf^o
      ──────────────────────────────────────────────────────────────────────
      Φ′;Γ ⊢ v E′[fork^α″,ω″ e] : τ′₂; R
    τ′₂ ≤ τ    Φ′ ≡ [α; ε′; ω]    Φ ≡ [α; ε; ω]    ε′ ⊆ ε
    ──────────────────────────────────────────────────────────────────────
    Φ;Γ ⊢ v E′[fork^α″,ω″ e] : τ; R

By induction we have Φ₂;Γ ⊢ E′[0] : τ₁; R. We can now apply (TApp):

  (TSub)
    (TApp)
      Φ₁;Γ ⊢ v : τ₁ →^Φf τ′₂; ∅    Φ₂;Γ ⊢ E′[0] : τ₁; R    Φ₁ ▷ Φ₂ ▷ Φ₃ ↪ Φ′
      Φ₃^ε = Φf^ε    Φ₃^α ⊆ Φf^α    Φ₃^ω ⊆ Φf^ω    Φ₃^ι = Φ₃^o ∪ Φf^ε    Φ₃^o ⊆ Φf^o
      ──────────────────────────────────────────────────────────────────────
      Φ′;Γ ⊢ v E′[0] : τ′₂; R
    τ′₂ ≤ τ    Φ′ ≡ [α; ε′; ω]    Φ ≡ [α; ε; ω]    ε′ ⊆ ε
    ──────────────────────────────────────────────────────────────────────
    Φ;Γ ⊢ v E′[0] : τ; R

case all others: By induction, similar to the above cases.

Preservation is very similar to the single-threaded version (Lemma C.0.31). n;Γ ⊢ H is unchanged, since it is independent of the number of threads. We require

  Φi;Γ ⊢ ei : τ; Ri ∧ Φi,Ri;H ⊢ σi ∧ traceOK(σi)  ⟹  Φ′i;Γ ⊢ e′i : τ; R′i ∧ Φ′i,R′i;H′ ⊢ σ′i ∧ traceOK(σ′i)

for each thread i, which we can prove by invoking the single-threaded proof and paying attention to MT-specific issues like (TFork) and (TReturn), which create and destroy a thread, respectively.

Lemma D.0.35 (Multithreaded VC non-interference). Let T = (σ₁,e₁).(σ₂,e₂)...(σ|T|,e|T|). Suppose we have the following:

1. n ⊢ H,T
2. ∀i ∈ 1..|T|. Φi,Ri;H ⊢ σi

and thread j takes a non-update evaluation step: ⟨n;σj;H;e⟩ →ε ⟨n′;σ′j;H′;e′⟩. Then for some Γ′ ⊇ Γ and for all threads i ∈ 1..|T| such that i ≠ j we have:

1. Φi;Γ′ ⊢ ei : τ; Ri
2. Φi,Ri;H′ ⊢ σi
3. traceOK(σ′i)

Proof. Part 1. is true by weakening, since Γ′ ⊇ Γ. We only need to prove 2. Proceed by induction on the typing derivation Φj;Γ ⊢ e : τ; Rj, only considering rules that change the heap:

case (TRef): We have that:

  (TRef)
    Φj;Γ ⊢ ej : τ; R
    ───────────────────────────
    Φj;Γ ⊢ ref ej : ref^ε τ; R

There are two possible reductions:

case [ref]: We have that ej ≡ v, R ≡ ∅, and ⟨n;(n′,ᾱj,ω̄j);H; ref v⟩ → ⟨n;(n′,ᾱj,ω̄j);H′; r⟩, where r ∉ dom(H), H′ = H, r ↦ (τ,v,∅), and Γ′ = Γ, r : ref^ε τ. To prove 2., we must show Φi,Ri;H′ ⊢ σi. This follows by assumption, since H′ only contains an additional location (i.e., not a global variable) and no heap element has undergone a version change.

case [cong]: We have ⟨n;σj;H; ref E[e″]⟩ → ⟨n;σ′j;H′; ref E[e‴]⟩ from ⟨n;σj;H;e″⟩ → ⟨n;σ′j;H′;e‴⟩. By [cong], we have ⟨n;σj;H;e⟩ → ⟨n;σ′j;H′;e′⟩, where e ≡ E[e″] and e′ ≡ E[e‴]. The result follows directly by induction.

case (TAssign): We know that:

  (TAssign)
    Φj1;Γ ⊢ e₁ : ref^εr τ; Rj1    Φj2;Γ ⊢ e₂ : τ; Rj2    Φ₃^ε = εr    Φ₃^ι = Φ₃^o ∪ εr    Φj1 ▷ Φj2 ▷ Φ₃ ↪ Φj
    ──────────────────────────────────────────────────────────────────────
    Φj;Γ ⊢ e₁ := e₂ : τ; Rj1 ▷◁ Rj2

From Rj1 ▷◁ Rj2 it follows that either Rj1 ≡ ∅ or Rj2 ≡ ∅. We can reduce using [gvar-assign], [assign], or [cong].

case [gvar-assign]: This implies that e ≡ z := v, with

  ⟨n;(n′,ᾱj,ω̄j);(H″, z ↦ (τ,v′,ν)); z := v⟩ →{z} ⟨n;(n′j,ᾱj∪{z},ω̄j);(H″, z ↦ (τ,v,ν)); v⟩

where H ≡ (H″, z ↦ (τ,v′,ν)), Rj1 ≡ ∅ and Rj2 ≡ ∅ (thus Rj1 ▷◁ Rj2 ≡ ∅), Γ′ = Γ, and R′j = ∅. Consider the case σi ≡ (n′,ᾱi,ω̄i). We know

  (TC1)
    f ∈ ᾱi ⇒ f ∈ αi    f ∈ (εi ∪ ιi) ⇒ ni ≤ ver((H″, z ↦ (τ,v′,ν)), f)    αi ⊆ (ᾱi ∪ ιi)    (ωi ∪ εi) ⊆ ω̄i
    ──────────────────────────────────────────────────────────────────────
    [αi; εi; ωi],∅;H ⊢ (ni,ᾱi,ω̄i)

and need to prove:

  (TC1)
    f ∈ ᾱi ⇒ f ∈ αi    f ∈ (εi ∪ ιi) ⇒ ni ≤ ver((H″, z ↦ (τ,v,ν)), f)    αi ⊆ (ᾱi ∪ ιi)    (ωi ∪ εi) ⊆ ω̄i
    ──────────────────────────────────────────────────────────────────────
    [αi; εi; ωi],∅;H′ ⊢ (ni,ᾱi,ω̄i)

All premises follow by assumption (no heap element has changed version). If σi ≡ (n′i, ᾱi∪{z}, ω̄i), the result follows by the same argument.

case [assign]: The proof follows by the same argument as in case (TDeref)-[deref].

case [cong]: In both cases (E := e and r := E) the result follows by induction.
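The multithreaded machine manipulated in the next two lemmas can also be read operationally: a program is a list of (trace, expression) threads, and a global transition either retires a finished thread, spawns a new one, or lets one thread take a single-threaded step. Below is a minimal OCaml sketch of that step relation; the names are ours, `step1` abstracts the single-threaded relation, and where the calculus is nondeterministic this sketch simply acts on the first enabled thread.

  (* A thread pairs a trace with an expression. *)
  type ('trace, 'exp) thread = 'trace * 'exp

  (* One multithreaded transition on the thread soup, echoing
     [return], [fork], and [mt-cong]. Returns None if no thread
     can act. *)
  let mt_step
      ~(is_value : 'exp -> bool)
      ~(as_fork : 'exp -> ('exp * 'exp) option) (* E[fork e] -> (E[0], e) *)
      ~(new_trace : 'trace)
      ~(step1 : 'exp -> 'exp option)
      (threads : ('trace, 'exp) thread list) :
      ('trace, 'exp) thread list option =
    let rec go pre = function
      | [] -> None
      | (_, e) :: rest when is_value e ->
          Some (List.rev_append pre rest)                     (* [return] *)
      | (t, e) :: rest -> (
          match as_fork e with
          | Some (e0, body) ->                                (* [fork] *)
              Some (List.rev_append pre ((t, e0) :: (new_trace, body) :: rest))
          | None -> (
              match step1 e with
              | Some e' ->                                    (* [mt-cong] *)
                  Some (List.rev_append pre ((t, e') :: rest))
              | None -> go ((t, e) :: pre) rest))
    in
    go [] threads

  let () =
    (* toy instantiation: expressions are ints, 0 is the only value,
       and the single-threaded step decrements *)
    let step1 e = if e > 0 then Some (e - 1) else None in
    match
      mt_step ~is_value:(fun e -> e = 0) ~as_fork:(fun _ -> None)
        ~new_trace:() ~step1 [ ((), 2); ((), 0) ]
    with
    | Some [ ((), 1); ((), 0) ] -> ()
    | _ -> assert false

Note how the structure anticipates the proofs: [fork] and [return] change only the shape of the thread list, [mt-cong] delegates to the single-threaded relation, and only the acting thread's entry changes, which is exactly why non-interference (Lemma D.0.35 above) is about heap writes rather than thread-list surgery.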
Lemma D.0.36 (Preservation). Let T = (σ₁,e₁).(σ₂,e₂)...(σ|T|,e|T|). Suppose we have the following:

1. n ⊢ H,T
2. ∀i ∈ 1..|T|. Φi,Ri;H ⊢ σi
3. ∀i ∈ 1..|T|. traceOK(σi)
4. ⟨n;T;H⟩ →(ε,j) ⟨n′;T′;H′⟩, where j is the thread taking a transition and ε is the evaluation effect.

Then, for T′ = (σ′₁,e′₁).(σ′₂,e′₂)...(σ′|T|,e′|T|) we have that:

1. n′ ⊢ H′,T′ (such that n′;Γ′ ⊢ H′ and ∀i ∈ 1..|T|. Φ′i;Γ′ ⊢ e′i : τ; R′i) for some n′, some Γ′ ⊇ Γ, and some Φ′i such that
   Φ′i = Φi, if i ≠ j
   Φ′i ≡ [Φi^α ∪ ε₀; ε′i; Φi^ω; Φi^ι; Φi^o] with ε′i ∪ ε₀ ⊆ Φi^ε, if i = j
2. ∀i ∈ 1..|T|. Φ′i,R′i;H′ ⊢ σ′i
3. ∀i ∈ 1..|T|. traceOK(σ′i)

Proof. By case analysis on the rule used to take an evaluation step:

case (fork): We have

  ⟨n;H; T₁.(σj, E[fork^α,ω e]).T₂⟩ →(∅,j) ⟨n;H; T₁.(σj, E[0]).((n,∅,ω̄m), e).T₂⟩

and we know that

  (TFork)
    Φm;Γ ⊢ e : τ; ∅    Φm ≡ [αm; εm; ωm]    (α, ω) ≡ (αm ∪ Φm^ι, ωm)
    ──────────────────────────────────────────────────────────────
    Φj;Γ ⊢ fork^α,ω e : int; ∅

Thus n′ = n and H′ = H; let Γ′ = Γ. We have n;Γ ⊢ H by assumption, so n′;Γ′ ⊢ H′ is immediate. Let j be the index of the thread whose context is E[fork^α,ω e], and m the index of the newly created thread. We get 1. and 2. for all threads i ∈ 1..|T|, i ≠ j,m, by assumption (since H′ = H) and by choosing Φ′i = Φi. For thread j, we have Φj;Γ ⊢ E[fork^α,ω e] : int; ∅ and need to prove Φ′j;Γ ⊢ E[0] : int; ∅. Let Φ′j = Φj; then Φ′j;Γ ⊢ E[0] : int; ∅ follows by Lemma D.0.34. Part 2., Φ′j,∅;H′ ⊢ σ′j, follows by assumption, since Φ′j = Φj and σ′j = σj. Part 3. similarly follows by assumption, since σ′j = σj. For the newly created thread m, we need to prove Φm;Γ ⊢ e : τ; ∅, which follows by assumption (from (TFork)), and 2., Φm,∅;H′ ⊢ σm, which we prove by (TC1):

  (TC1)
    f ∈ ᾱm ⇒ f ∈ αm    f ∈ (εm ∪ Φm^ι) ⇒ nm ≤ ver(H,f)    αm ⊆ (ᾱm ∪ Φm^ι)    (ωm ∪ εm) ⊆ ω̄m
    ──────────────────────────────────────────────────────────────────────
    [αm; εm; ωm],∅;H ⊢ (nm,ᾱm,ω̄m)

Since ((n,∅,ω̄m), e) is the new thread context (from [fork]), by inversion on Φm;Γ ⊢ e : τ; ∅ we have σm ≡ (nm,ᾱm,ω̄m), hence (nm,ᾱm,ω̄m) ≡ (n, ∅, (αm ∪ Φm^ι, ωm)). The first premise is vacuously true, since ᾱm ≡ ∅. The second premise follows directly from n;Γ ⊢ H (which states that ∀z ↦ (τ,b,ν) ∈ H. n ≤ ν), since nm ≡ n. The third and fourth premises follow directly, since (α, ω) ≡ (αm ∪ Φm^ι, ωm). For part 3, we need to prove traceOK(nm,ᾱm,ω̄m), which is vacuously true since ᾱm ≡ ∅.

case (return): We have ⟨n;H; T₁.((n″,ᾱ″,ω̄″), v).T₂⟩ → ⟨n;H; T₁.T₂⟩, hence n′ = n and H′ = H. Let Γ′ = Γ; then 1., 2., and 3. follow by assumption for all threads.

case (update): We know that

  ⟨n;σ₁;H;e₁⟩ →(upd,dir) ⟨n+1; U[σ₁]upd,dir n+1; U[H]upd n+1; e₁⟩
  ⟨n;σ₂;H;e₂⟩ →(upd,dir) ⟨n+1; U[σ₂]upd,dir n+1; U[H]upd n+1; e₂⟩
  ...
  ⟨n;σ|T|;H;e|T|⟩ →(upd,dir) ⟨n+1; U[σ|T|]upd,dir n+1; U[H]upd n+1; e|T|⟩
  ──────────────────────────────────────────────────────────────────────
  ⟨n;H;(σ₁,e₁).(σ₂,e₂)...(σ|T|,e|T|)⟩ →(upd,dir) ⟨n+1; U[H]upd n+1; (U[σ₁]upd,dir n+1, e₁).(U[σ₂]upd,dir n+1, e₂)...(U[σ|T|]upd,dir n+1, e|T|)⟩

We have n′ = n+1. Let Γ′ = U[Γ]upd; we have H′ = U[H]upd n+1. n+1;Γ′ ⊢ H′ follows directly from Lemma B.0.12. For each thread i, we have ⟨n;σi;H;ei⟩ →(upd,dir) ⟨n+1; U[σi]upd,dir n+1; U[H]upd n+1; ei⟩, hence Φ′i;Γ′ ⊢ e′i : τ; Ri, Φ′i,Ri;H′ ⊢ σ′i, and traceOK(σ′i) for each i follow from single-threaded update preservation (Lemma B.0.13).

case (mt-cong): Let j be the thread that takes a step. We have

  ⟨n;σj;H;e⟩ →ε ⟨n′;σ′j;H′;e′⟩
  ──────────────────────────────────────────────────────────
  ⟨n;H; T₁.(σj, E[e]).T₂⟩ →(ε,j) ⟨n′;H′; T₁.(σ′j, E[e′]).T₂⟩

hence n′;Γ′ ⊢ H′, Φ′j;Γ ⊢ e′j : τ; R′j, Φ′j,R′j;H′ ⊢ σ′j, and traceOK(σ′j) follow from single-threaded preservation (Lemma C.0.31). For all threads i ∈ 1..|T|, i ≠ j, we have σ′i = σi, Ri = R′i, and Φ′i = Φi, since they do not take any steps. Hence we have Φ′i;Γ′ ⊢ e′i : τ; R′i by assumption and weakening, and Φ′i,R′i;H′ ⊢ σ′i by assumption together with the observation that the only way j could have changed the heap was via [gvar-assign], [assign], or [ref], which does not affect Φ′i,R′i;H′ ⊢ σ′i (by Lemma D.0.35). Finally, we have traceOK(σ′i) by assumption, since σ′i = σi.

Progress is also similar to the single-threaded version; we pick a thread and prove that it can take a step.

Lemma D.0.37 (Progress). Let T = (σ₁,e₁).(σ₂,e₂)...(σ|T|,e|T|). Suppose we have the following:

1. n ⊢ H,T
2. ∀i ∈ 1..|T|. Φi,Ri;H ⊢ σi
3. ∀i ∈ 1..|T|. traceOK(σi)

Then for each thread i, either ei is a value, or there exist n′, H′, T′ such that ⟨n;H;T⟩ →(ε,j) ⟨n′;H′;T′⟩.

Proof. Case analysis on the structure of the entire program. Assume |T| > 0 and consider ei, for some i such that 1 ≤ i ≤ |T|.

case ei ≡ v: The thread context is (σi, v). By assumption, we have Φi;Γ ⊢ ei : τ; Ri, which in our case means Φi;Γ ⊢ v : τ; ∅, so R ≡ ∅; hence, by inversion on Φi,Ri;H ⊢ σi, we have σi ≡ (n″,ᾱ″,ω̄″) and we can reduce via [return]:

  ⟨n;H; T₁.((n″,ᾱ″,ω̄″), v).T₂⟩ → ⟨n;H; T₁.T₂⟩

case ei ≢ v: We have ⟨n;σi;H;ei⟩ →μ ⟨n′;σ′i;H′;e′i⟩ with μ ∈ {upd, ε}, from single-threaded progress (Lemma C.0.32). Proceed by case analysis on μ:

case ε:

case ei ≡ E[fork^α,ω e]: We can reduce using [fork]:

  ⟨n;H; T₁.(σi, E[fork^α,ω e]).T₂⟩ →(∅,i) ⟨n;H; T₁.(σi, E[0]).((n,∅,ω̄), e).T₂⟩

case ei ≢ E[fork^α,ω e]: We can apply [mt-cong]:

  ⟨n;σi;H;e⟩ →ε ⟨n′;σ′i;H′;e′⟩
  ──────────────────────────────────────────────────────────
  ⟨n;H; T₁.(σi, E[e]).T₂⟩ →(ε,i) ⟨n′;H′; T₁.(σ′i, E[e′]).T₂⟩
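Finally, the [update] transition from Lemma D.0.36, in which every thread observes the same new heap at version n+1, can be sketched operationally as follows. This is a simplified model under assumptions of our own: the heap maps names to (version, definition) pairs, an update is a list of replacement definitions, and the U[·] transformers are modeled only for the heap (traces would be transformed analogously).

  (* The shape of the whole-program [update] step: bump the version,
     rewrite the heap, leave expressions alone. A sketch; names ours. *)
  type 'def heap = (string * (int * 'def)) list
  type 'def upd = (string * 'def) list

  (* U[H]: install new definitions, stamping them with version n1;
     untouched bindings keep their old version. *)
  let u_heap (n1 : int) (u : 'def upd) (h : 'def heap) : 'def heap =
    List.map
      (fun (z, (v, d)) ->
        match List.assoc_opt z u with
        | Some d' -> (z, (n1, d'))
        | None -> (z, (v, d)))
      h

  (* The global update transition: all threads see the same new heap
     atomically, which is what lets the per-thread proofs reuse
     single-threaded update preservation. *)
  let update_step (n : int) (u : 'def upd) (h : 'def heap) : int * 'def heap =
    (n + 1, u_heap (n + 1) u h)

  let () =
    let h = [ ("handler", (1, "v1")); ("logger", (1, "v1")) ] in
    let n', h' = update_step 1 [ ("handler", "v2") ] h in
    assert (n' = 2);
    assert (List.assoc "handler" h' = (2, "v2"));
    assert (List.assoc "logger" h' = (1, "v1"))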