ABSTRACT

Dissertation Title: MEASURING AND MITIGATING POTENTIAL RISKS OF THIRD-PARTY RESOURCE INCLUSIONS

Soumya Indela
Doctor of Philosophy, 2021
Electrical and Computer Engineering

Advised by: Professor Dave Levin, Computer Science

In today's computer services, developers commonly use third-party resources such as libraries, hosting infrastructure, and advertisements. Using third-party components improves the efficiency and enhances the quality of developing custom applications. However, adopting third-party resources means adopting not only their benefits but also their vulnerabilities. Unfortunately, developers are often uninformed about these risks, and as a result their services are susceptible to various attacks. While there has been a great deal of work on how to develop secure services from scratch, the key focus of my thesis is quantifying the risks in the inclusion of third-party resources and looking into possible ways of mitigating them. Based on the fundamental ways that risks arise, we broadly classify them into direct and indirect risks. Direct risk is the risk that comes with invoking a third-party resource incorrectly, even if the third party is otherwise trustworthy, whereas indirect risk is the risk that comes with the third-party resource potentially acting in an untrustworthy manner, even if it were invoked correctly.

To understand the security-related direct risks in third-party inclusions, we study cryptographic frameworks. Developers often use these frameworks incorrectly and introduce security vulnerabilities. This is because current cryptographic frameworks erode abstraction boundaries: they do not encapsulate all the framework-specific knowledge, and they expect developers to understand security attacks and defenses. Starting from the documented misuse cases of cryptographic APIs, we infer five developer needs, and we show that a good API design would address these needs only partially.
Building on this observation, we propose APIs that are semantically meaningful for developers. We show how these interfaces can be implemented consistently on top of existing frameworks using novel and known design patterns, and we propose build-management hooks for isolating security workarounds needed during the development and test phases.

To understand the performance-related direct risks in third-party inclusions, we study resource hints in webpage HTML. Today's websites load a large number of resources, and thus spend a considerable amount of time issuing DNS requests, requesting resources, and waiting for responses. As an optimization for these time sinks, websites may include resource hints, such as DNS prefetch, preconnect, preload, prerender, and prefetch tags, in their HTML files; these cause clients to initiate DNS queries and resource fetches early in their page downloads, before encountering the precise resource to download. We explore whether websites are making effective use of resource hints, using techniques based on the tool we developed to obtain a complete snapshot of a webpage at a given point in time. We find that many popular websites are highly ineffective in their use of resource hints: they cause clients to query and connect to extraneous domains and to download unnecessary data, and some even use resource hints to bypass ad blockers.

To evaluate the indirect risks, we study the web topology. Users who visit benign, popular websites are unfortunately bombarded with malicious popups, malware-loading sites, and phishing sites. The questions we want to address here are: Which domains are responsible for such malicious activity? At what point in the process of loading a popular, trusted website does the trust break down, leading to dangerous content? To answer these questions, we first determine what third-party resources websites load (both directly and indirectly).
I present a tool that constructs the most complete map of a website's resource-level topology to date. This is surprisingly nontrivial; most prior work used only a single run of a single tool (e.g., Puppeteer or Selenium), but I show that this misses a significant fraction of resources. I then apply my tool to collect the resource topology graphs of 20,000 websites from the Alexa ranking, and analyze them to understand which third-party resource inclusions lead to malicious resources. I believe that these third-party inclusions are neither constant over time nor reliably blocked by existing ad blockers. We argue that greater accountability for these third parties can lead to a safer web.

ANALYZING AND MITIGATING POTENTIAL RISKS OF THIRD-PARTY RESOURCE INCLUSIONS

by

Soumya Indela

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy 2021

Advisory Committee:
Professor Dave Levin, Chair/Advisor
Professor Tudor Dumitras
Professor Charalampos Papamanthou
Professor Ashok Agrawala
Professor Lawrence Washington

© Copyright by Soumya Indela 2021

Acknowledgments

I would like to start by praying to God, who continually showers His blessings. I owe my gratitude to all the people who have made this thesis possible and because of whom my graduate experience has been one that I will cherish forever.

First and foremost, I would like to thank my advisor, Professor Dave Levin, for giving me an invaluable opportunity to work on extremely interesting projects over the past four years. He has always made himself available for advice and help, both in academics and in personal growth. It has been a pleasure to work with and learn from such an extraordinary individual. He will be an ideal that I aspire to for the rest of my career, and I hope I can live up to the high standards that he has instilled in me.
I am very grateful to Professor Tudor Dumitras, whose guidance in the initial stages of my PhD helped me develop an interest in cyber security research and also instilled in me the ability to collaborate and initiate technical discussions. I am immensely grateful that working under the guidance of such a proficient person paved the way to my thesis.

I would like to acknowledge my defense committee members, Professor Charalampos Papamanthou, Professor Ashok Agrawala, and Professor Lawrence Washington, for their valuable time and feedback. I would also like to thank my undergraduate advisor, Professor Sanjay Bose, and the professors at the University of Maryland, who instilled the thrill of research in me and trained me with the skill set to pursue it. I would also like to acknowledge the help and technical support from some of the University of Maryland staff members.

I owe my deepest thanks to my co-authors, Mukul Kulkarni and Kartik Nayak, and colleagues Matthew Lentz, Ivan Petrov, Zhihao Li, Stephen Herwig, Preston Tong, and Melissa Hoff, for their aid in improving my skill set and research expertise in cyber security. All our discussions have kept me motivated and helped keep my graduate student life exciting. My dear friends Sriram Vasudevan, Srikanth Govindarajan, Akshaya Sharma, Nikhil Valluru, Vidya Raju, Arun Shankar, Dwith CYN, and Amit Kumar have enriched my graduate life in many ways and deserve a special mention. Our interactions always made me think of the practical considerations in my research and expanded my world-view.

I feel greatly obliged to my parents (Raghuramulu, Jyothi), in-laws (Vidyasagar, Durga), siblings (Sravya, Paavan, Lalitsagar), grandparents, and extended family members for their endless support and encouragement in all my endeavours. They have played a significant role in shaping me into what I am today. The journey would not have been possible without my family and their belief in me.
It is only because of their teachings and guidance that I could achieve what I have.

Words cannot express my emotions, and it is almost impossible for me to verbalize my friendship with Raghuvaran Yaramasu, Pranali Shetty, Vidya Vemparala, Harika Matta, Sharadha Kalyanam, Naresh Maruthi, Rengarajan Sankaranarayanan, Ratan Vishwanath, Saif Mohammed, Kamala Raghavan, Shalvi Raj, Aakanksha Mishra, Sai Sumana, Vandana Banapuram, Sravya Amudapuram, Alok Kumar, and Ranjitha Devikere. Without their company, my journey would not have been so gratifying, and I would have found it extremely difficult to wade through all the highs and lows.

Last but not least, I am extremely indebted to my husband, Amarsagar Reddy Ramapuram Matavalam, who has always stood by me and guided me through the final phases of my PhD. He gave me the strength to persevere and the courage to pull through impossible odds at times. I would also like to express my gratitude for his substantial support during the Covid pandemic.

Finally, I have to particularly recognize the financial support received from the National Science Foundation, the Department of Defense, and the Maryland Procurement Office.

Table of Contents

Acknowledgements ii
Table of Contents v
List of Tables viii
List of Figures x
List of Abbreviations xii

Chapter 1: Introduction 1

Chapter 2: Related Work 7
2.1 Measuring and mitigating misuse of cryptographic APIs 7
2.1.1 Misuse of Cryptography 7
2.1.2 Design patterns for application security and privacy 8
2.1.3 Simplified usage of cryptographic APIs 9
2.1.4 Static analysis and type systems 10
2.2 Downloading Websites 11
2.2.1 Web Crawling Tools 11
2.2.2 Web topology 12
2.2.3 Comparison across crawls 13
2.3 Measuring third-party resource inclusion on the web 13
2.3.1 Online advertising 13
2.3.2 Mobile vs Desktop 15
2.3.3 Mobile Advertising 16
2.3.4 Trust in Third-party Resource Inclusions 17
2.4 Resource Hints 18

Chapter 3: Toward Semantic Interfaces for Cryptographic Frameworks 21
3.1 Overview 21
3.2 Problem Statement 25
3.3 Needs of Developers 29
3.4 Unified Framework for Secure Application Development 32
3.4.1 Semantic APIs 33
3.4.2 Integrating External Information 39
3.4.3 Managing Security Checks during Development and Testing 48
3.5 Case Studies 49
3.5.1 Case Study 1: Mobile Money Application 51
3.5.2 Case Study 2: Secure Messaging 53
3.6 Discussion 57
3.7 Conclusion 58

Chapter 4: Sound Methodology for Downloading Webpages 60
4.1 Overview 60
4.2 What Effect Do Tools Have? 63
4.2.1 Methodology 64
4.2.2 Results 67
4.2.3 Recommendations 71
4.3 Is Disagreement Caused by Dynamism? 71
4.4 How Many Refreshes? 75
4.4.1 Methodology 75
4.4.2 Results 76
4.4.3 Adaptive Reloading 79
4.4.4 Recommendations 83
4.5 Conclusion 83

Chapter 5: Resource Hints or Resource Waste? 85
5.1 Overview 86
5.2 Methodology 87
5.3 Resource Use and Misuse 89
5.3.1 Resource Hint Invocations 89
5.3.2 Resource Hint Usage 90
5.3.3 Resource Hint Links 95
5.4 Circumventing Ad Blockers 97
5.4.1 Ad Blockers and Resource Hints 97
5.4.2 URLs that Bypass Blocking 98
5.5 Limitations 99
5.6 Conclusion 100

Chapter 6: Measurement Study of the Malicious Web Topology 101
6.1 Overview 102
6.2 Experimental Methodology 106
6.2.1 Data Collection 107
6.2.2 Conceptual Graph 108
6.2.3 VirusTotal 111
6.3 Analysis Results 113
6.3.1 Factors Influencing the Topology 113
6.3.2 Trust breakdown: What are the intermediary benign domains that should be held accountable? 118
6.3.3 AdBlockers: Can we evaluate the effectiveness of various blocklists? 127
6.4 Proposed Mitigation Strategies 131
6.4.1 Trust Metric 131
6.4.2 Optimization Problem 136
6.4.3 Greedy Iterative Algorithm 139
6.5 Conclusion 142

Chapter 7: Conclusions 145

Bibliography 148

List of Tables

3.1 Mapping developer needs, mistakes, and solutions. CWE provides a unified, measurable set of software weaknesses [31]. For example, CWE 297 in the list corresponds to the mistake "Improper Validation of Certificate with Host Mismatch" (https://cwe.mitre.org/data/definitions/297.html). 32
3.2 Proposed Semantic API with functions for both application and library developers [46]. 33
4.1 Number of page loads necessary to obtain column% of domains for row% of webpages from the Alexa top-1000, using a Desktop UserAgent. 77
4.2 Number of page loads necessary to obtain column% of domains for row% of webpages from the Alexa top-1000, using a Mobile UserAgent. 77
4.3 Number of page loads necessary to obtain column% of edges for row% of webpages from the Alexa top-1000, using a Desktop (top) and Mobile (bottom) UserAgent. 77
4.4 Number of page loads necessary to obtain column% of resources for row% of webpages from the Alexa top-1000, using a Desktop (top) and Mobile (bottom) UserAgent. 78
4.5 Requisite page loads and amount of content received for different download strategies, when run against the 982 of the Alexa top-1000 websites that responded with a Desktop UserAgent. 79
4.6 Requisite page loads and amount of content received for different download strategies, when run against the 982 of the Alexa top-1000 websites that responded with a Mobile UserAgent. 80
5.1 Number of websites from the Alexa top-100k that invoke each given resource hint at least once. 90
5.2 Aggregated number of links per resource hint, and how many go unused. 91
5.3 Number of resources that would have been blocked by an ad blocker. 98
6.1 Graph properties for more popular websites. 115
6.2 Graph properties for less popular websites. 115
6.3 Graph properties when crawling with the Desktop UserAgent string. 117
6.4 Graph properties when crawling with the Mobile UserAgent string. 117
6.5 Bad domains classification for More Popular Websites. 130
6.6 Bad domains classification for Less Popular Websites. 130
6.7 For different thresholds on the scalar trust metric, the percentage of back-propagation nodes with metric less than the threshold and the corresponding percentage of bad domains loaded by more popular domains on desktop (top) and mobile (bottom). 132
6.8 For different thresholds on the scalar trust metric, the percentage of back-propagation nodes with metric less than the threshold and the corresponding percentage of bad domains loaded by less popular domains on desktop (top) and mobile (bottom). 133
6.9 The percentage of back-propagation nodes and the corresponding percentage of bad domains minimized using the vector trust metric. 135

List of Figures

3.1 HTTPS request in Python. 25
3.2 Communicate interface: example send and secureSend functions to perform a connection using the HTTP and HTTPS protocols. 34
3.3 General structure of the Regulator pattern. The dark blue arrows indicate updating of parameters by a regulator, which intermittently retrieves this data from an external source. The dark green arrows indicate updates where an application directly contacts the subject. 42
3.4 Sequence diagram showing the Push Model. 43
3.5 Sequence diagram showing the Pull Model. 44
3.6 Sequence diagram showing the Selective Pull Model. 45
3.7 Producing separate binaries for test and production environments using a build configuration. 48
3.8 A pom.xml file which selects different keystores for development and production environments. 50
4.1 Comparison of the domains, edges, and resources obtained by Crawlium and ZBrowse when obtaining the Alexa top-10k sites. These plots compare when both Desktop and Mobile UserAgent strings are used. 64
4.2 The same as Figure 4.1, but focused instead on less popular sites (a random selection of 10k sites from the Alexa top 10,001–1M most popular websites). Less popular sites tend to have more in common between the two tools. 64
4.3 How the domains found by only one tool compare to the domains found by the other. The tools show no significant difference for less popular sites. 68
4.4 When a tool obtains a unique edge, how often both tools observe the edge's domains. The tools show only slightly greater agreement for less popular domains. 70
4.5 The number of runs in which different domains and edges are obtained by Crawlium and ZBrowse, limited to the domains that required all 30 page loads. (Desktop only shown; Mobile results are very similar.) 72
4.6 The number of runs in which different domains and edges are obtained by Crawlium and ZBrowse, covering all domains regardless of the number of page loads. (Desktop only shown; Mobile results are very similar.) 73
4.7 Percentage of domains obtained by Crawlium and ZBrowse, and percentage of edges and resources obtained by ZBrowse, in subsequent page loads for the Alexa top-10k sites and 10k sites from among Alexa rank 10,001 to 1 million, using the Desktop UserAgent string. 81
4.8 Number of page loads for the adaptive strategy (δ = 3). 82
5.1 Fraction of Alexa top-100k sites that invoke each given resource hint (x-axis ordered by Alexa ranking, binned into buckets of size 1,000). 91
5.2 DNS Prefetch Usage (bucket size 1,000). 92
5.3 DNS Prefetch Usage ON. 92
5.4 Preconnect Usage. 94
5.5 Prefetch Usage. 94
5.6 Preload Usage. 94
5.7 CDF of file sizes of unused preloaded links. 94
6.1 Block diagram representing the experimental methodology. 102
6.2 One of the paths when loading mangapanda.com. 104
6.3 Conceptual graph template showing the types of nodes and edges in a typical graph. 109
6.4 Example: Alexa graph for tribunnews.com highlighting the root and bad nodes. 110
6.5 Alexa graph for tribunnews.com highlighting the nodes and edges in the back-propagation graph. 110
6.6 Example of the back-propagation graph for tribunnews.com. 111
6.7 Fraction of Alexa websites that load at least one bad domain as a function of Alexa ranking. 116
6.8 Plot of the frequency as a cumulative fraction of the number of domains. 120
6.9 Scatter plot of domains present in at least one back-propagation graph, relating the number of back-propagation graphs to the number of Alexa graphs. 121
6.10 Same plot as in Figure 6.9, but limited to only the nodes present in at least 10 Alexa graphs. 122
6.11 Distribution of the bad node count as a cumulative fraction of the number of benign domains. 124
6.12 Distribution of the hop count as a cumulative fraction of the number of (domain, bad domain) pairs. 126
6.13 Examples enumerating paths to a bad node in graphs with cycles, ensuring that the nodes in the cycles are counted only once. 129
6.14 Heuristic algorithm to identify the list of back-propagation nodes to be added to a blocklist in order to indirectly block bad domains. 139
6.15 Distribution of the fraction of bad nodes that are blocked as a cumulative fraction of back-propagation nodes added to the blocklist. 140

List of Abbreviations

AES Advanced Encryption Standard
API Application Program Interface
CA Certificate Authority
CDF Cumulative Distribution Function
CDN Content Delivery Network
CRL Certificate Revocation List
CWE Common Weakness Enumeration
DES Data Encryption Standard
DNS Domain Name System
DOM Document Object Model
DoS Denial-of-Service
E2LD Effective Second Level Domain
EC2 Elastic Compute Cloud
ECB Electronic CodeBook
FQDN Fully Qualified Domain Name
HTML HyperText Markup Language
HTTP HyperText Transfer Protocol
HTTPS HyperText Transfer Protocol Secure
JSSE Java Secure Socket Extension
MD5 Message Digest 5
MITM Man-in-the-Middle
NaCl Networking and Cryptography library
NIST National Institute of Standards and Technology
OCSP Online Certificate Status Protocol
OpenCCE Open CROSSING Crypto Expert
OS Operating System
RC4 Rivest Cipher 4
SDK Software Development Kit
SHA-1 Secure Hash Algorithm 1
SQL Structured Query Language
SSL Secure Sockets Layer
TCP Transmission Control Protocol
TLS Transport Layer Security
URL Uniform Resource Locator
VT VirusTotal

Chapter 1: Introduction

Computer services such as software and web development are large and sophisticated in design and usage. These services involve many complex components, including hosting infrastructure (CDNs, cloud) and content (libraries, advertising, tracking scripts, web fonts). Libraries are commonly used in advertising, analytics, cloud, and social media (Facebook) [12]. In most cases, developers use third-party resources. For example, 66% of PyPI packages are used by developers [23], and Content Delivery Networks like Akamai [85] deliver more than 20% of web traffic. Component-oriented development improves the efficiency and the quality of developing custom applications, thus encouraging the use of third-party resources.
Additionally, re-creating these resources incurs wasted time and redundant memory allocation. Sometimes it is safer to use trusted third-party code, especially in cryptography, where the standard advice is not to write your own code, as it is challenging to implement an algorithm securely [93, 96]. Overall, the benefits of using third-party content include improved operational speed, delivery time, and scalability.

Incorporating third-party resources involves a high level of trust, as it imports not only the third party's features but also its vulnerabilities. Thus, if the third-party resource gets compromised, the application is also compromised. Besides vulnerabilities in the resources themselves, there is the possibility of incorporating a resource incorrectly in the application; the interface via which the third-party resource is incorporated may also be at risk. Additionally, in the case of websites, most of the included resources are scripts, which have complete access to the Document Object Model (DOM) and therefore to personal data like credentials and session cookies, in turn compromising the website. Downloading malware without users' knowledge is another common risk associated with various applications.

Unfortunately, developers are oblivious to these vulnerabilities when using third-party resources, and thus seem to take on more risk than they realize. Developers need good documentation to ensure security, but documentation alone is likely insufficient [5]. Sometimes the developer has no control at all; for instance, in online advertising, the website developer is unaware of the actual advertisements that the ad networks on the website embed. Developers do not understand the various security threats and make critical mistakes, such as disabling security checks during the testing and development phase, and these checks remain disabled in production [43, 93].
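One way to keep such test-time workarounds out of production builds is to make the relaxed configuration reachable only when the build explicitly declares itself a test environment, an idea developed further as build-management hooks in Chapter 3. The following is a minimal, hypothetical Python sketch (the `APP_ENV` variable name is illustrative, not from this dissertation):

```python
import os
import ssl

def make_tls_context() -> ssl.SSLContext:
    """Return a TLS context whose certificate checks can only be relaxed
    in an explicitly declared test environment."""
    ctx = ssl.create_default_context()  # secure defaults: chain + hostname checks
    if os.environ.get("APP_ENV") == "test":
        # Workaround for self-signed test certificates, confined here so
        # it cannot silently survive into a production build.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx

# In a production build, the secure defaults are untouched.
os.environ["APP_ENV"] = "production"
assert make_tls_context().verify_mode == ssl.CERT_REQUIRED
```

The design choice is that the insecure branch is gated on a single, auditable flag rather than scattered through call sites, so removing the flag removes every workaround at once.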
Security-critical errors can creep into applications from large organizations with well-funded security teams as well. For example, 88% of Android applications using cryptographic APIs make at least one mistake [40]. These mistakes have severe real-world consequences. Recently, the crypt32.dll Windows module [86] and ads in Microsoft's ad-supported apps [112] have led to malware being installed on users' devices. Another example is the Heartbleed bug [39] in the OpenSSL library, which had long-term consequences. Due to improper input validation, the length value in request messages was never checked against the bounds of the actual input, causing a buffer over-read. This compromised up to 64 KB of sensitive data, including keys for certificates, resulting in the revocation of over 73% of vulnerable certificates [124].

The entire ecosystem could be improved by having tools that better inform developers about the risks they are taking on. In my thesis, I aim to study the hypothesis that "The potential risks in the use of third-party inclusions can be measured and mitigated."

To this end, I observe that risks arise either (a) at the interface via which the third-party resource is included or (b) via inclusions of yet another third-party resource, over which the developer no longer has control. Thus, based on the fundamental ways that risks arise, we broadly classify them into direct and indirect risks. Direct risk originates from invoking the third-party resource incorrectly, even if the third party is otherwise trustworthy, whereas indirect risk originates from the third-party resource potentially acting in an untrustworthy manner, even if it were invoked correctly.

An example of both direct and indirect risk arises in the context of online advertising. Including a DNS prefetch link is an example of a direct risk that may occur due to incorrect invocation.
An ad network may resell the space to other ad networks, which in turn allow third-party advertisers to load JavaScript into that iframe. In this case, malicious behaviour by either the ad networks or the advertisers can lead to indirect risks. Other examples of direct risks include critical mistakes in cryptography [40, 102, 118]. Pop-up ads [77, 81] and malicious website redirections are well-known examples of indirect risks. Direct and indirect risks are, by our definition, orthogonal: it is perfectly possible that a third-party resource is both malicious and invoked incorrectly. Since the risks in the inclusion of third-party resources are inherently different, I evaluate my thesis by exploring them separately.

My goal in studying direct risks is to understand the lapses in developing a secure interface and, ultimately, to define semantic rules for designing a secure API. To address this, I consider cryptographic APIs and identify some of the common mistakes, including exchanging keys without authenticating the endpoint [93], storing sensitive information in cleartext [93] or with weak protection [40], using parameters known to be insecure for block ciphers (e.g., the Electronic Codebook mode or non-random initialization vectors) [40], using encryption keys that are constant [40] or are generated from insufficient randomness [40, 93], and performing improper TLS certificate validation [43, 47, 93]. More specific examples of library misuse include using insecure defaults (ssl.VERIFY_CRL_CHECK_CHAIN should be used instead of ssl.VERIFY_DEFAULT to check for certificate revocation) and using incorrect parameters due to differences across libraries (for hostname verification and certificate validation, ssl.CERT_REQUIRED requires the integer 2 in cURL, whereas it is the boolean value TRUE, corresponding to the integer 1, in JSSE) [60].

Besides the security-related direct risks exemplified by cryptographic APIs, another concern is performance.
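To make one of these mistakes concrete, the danger of ECB mode can be demonstrated without any cryptographic library: because each block is encrypted independently, equal plaintext blocks always produce equal ciphertext blocks, leaking the structure of the plaintext. A toy sketch, where the keyed-hash "block cipher" is an illustrative stand-in for a real cipher:

```python
import hashlib

BLOCK = 16

def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    # Stand-in for a real block cipher: a deterministic keyed transform.
    # (NOT a secure cipher; it only models ECB's determinism.)
    return hashlib.sha256(key + block).digest()[:BLOCK]

def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    # ECB mode: every block is encrypted independently, with no IV
    # or chaining between blocks.
    return b"".join(toy_block_encrypt(key, plaintext[i:i + BLOCK])
                    for i in range(0, len(plaintext), BLOCK))

pt = b"ATTACK AT DAWN!!" * 2          # two identical 16-byte blocks
ct = ecb_encrypt(b"secret key", pt)

# The defining ECB leak: identical plaintext blocks map to identical
# ciphertext blocks, visible to anyone who sees the ciphertext.
assert ct[:BLOCK] == ct[BLOCK:2 * BLOCK]
```

Chaining modes with a random initialization vector avoid exactly this repetition, which is why non-random IVs are listed alongside ECB among the insecure parameter choices above.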
We study this in the context of resource hints, which can be included in the HTML of a webpage to instruct browsers to speculatively perform some part of a fetch [50] (e.g., to issue a DNS request for a domain that is likely to be loaded later in the HTML page). HTML resource hints are easy to include, easy to reason about, and, if used correctly, have the potential to significantly decrease webpage load times. Our analysis of resource hints, especially on websites that use Cloudflare, shows that the majority of websites do not make effective use of resource hints. For our comprehensive study of resource hints on the Alexa top-100,000 websites, we use techniques based on the sound methodology that we developed for downloading the resource-level topology of a website.

My goal in studying indirect risks is to understand the amount of trust developers place in third-party resources and to inform users and/or developers where trust breaks down. Towards this, I perform a measurement study to expose the trust relationships involved in incorporating third-party resources at various levels. For this measurement study, I consider the web topology and study the multiple redirections in the Web. For example, a webpage includes an ad network, which in turn loads various ads. Specifically, yahoo.com (Alexa rank 11) loads the malicious site https://dt.adsafeprotected.com both directly and through https://s.yimg.com. In the case of mangapanda.com, a click at any point on the webpage loads an ad in a new tab. This occurs due to a redirect to the malicious cobalten.com through the scripts srv.aftv-serving.bid and go.pub2srv.com. From these examples, it is clear that an important first step in the measurement study is to learn what the third-party resources are. To obtain a good snapshot of the topology of a webpage, I developed a sound methodology to construct the most comprehensive map of a website's resource-level topology.
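An illustrative first cut at "what does this page include" is to pull the statically declared inclusions and resource hints out of a page's HTML. The methodology developed in this dissertation goes much further, since dynamically loaded resources are invisible to static parsing. A minimal sketch using Python's standard html.parser, run here on a small hypothetical page (all domains invented):

```python
from html.parser import HTMLParser

# The five resource-hint rel values studied in Chapter 5.
HINT_RELS = {"dns-prefetch", "preconnect", "preload", "prefetch", "prerender"}

class InclusionParser(HTMLParser):
    """Collects statically declared resource URLs and resource hints."""
    def __init__(self):
        super().__init__()
        self.resources = []   # (tag, url) for script/img/iframe/link inclusions
        self.hints = []       # (rel, url) for resource-hint links

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link":
            rel = (a.get("rel") or "").lower()
            if rel in HINT_RELS:
                self.hints.append((rel, a.get("href")))
            elif a.get("href"):
                self.resources.append((tag, a["href"]))
        elif tag in ("script", "img", "iframe") and a.get("src"):
            self.resources.append((tag, a["src"]))

html = """
<html><head>
  <link rel="dns-prefetch" href="//ads.example.net">
  <link rel="preconnect" href="https://cdn.example.org">
  <script src="https://cdn.example.org/lib.js"></script>
</head><body>
  <img src="https://tracker.example.com/pixel.png">
</body></html>
"""

p = InclusionParser()
p.feed(html)
print(p.hints)       # hinted (rel, url) pairs
print(p.resources)   # statically declared inclusions
```

Comparing the hinted domains against the resources a browser actually fetches is the essence of the effectiveness question; scripts injected at runtime are exactly what such a static pass misses, motivating the browser-driven methodology of Chapter 4.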
Once the web topology is obtained, I analyze the data as a large graph with trust relationships using different metrics, and I provide information to users and developers on the third-party resources to which websites delegate trust. My dissertation ultimately seeks to understand some of the potential risks involved with using third-party resources, and to develop techniques to empirically measure and reason about these risks. That said, it is important to note that this is not a comprehensive study of all forms of risk across all possible applications. The two broad domains I consider in this dissertation—crypto APIs and the web—are, I believe, extremely important and highly representative, and the kinds of risks that I consider—weaker security, bad performance, or inclusion of malicious content—are also common in many settings. My hope is that my findings and methodologies can, to some extent, generalize to other domains, as well; I speak briefly to this in the conclusion of this dissertation.

Chapter 2: Related Work

In this chapter, I will describe related work as it pertains to measuring and mitigating misuse of cryptographic APIs in the context of direct risks. In the context of indirect risks, I will describe related work on downloading websites and measuring third-party resource inclusions in the web.

2.1 Measuring and mitigating misuse of cryptographic APIs

The related work includes empirical observations of cryptography misuse and attempts to prevent such misuse by formulating security design patterns and simplifying cryptographic APIs.

2.1.1 Misuse of Cryptography.

Egele et al. [40] performed an empirical study of cryptographic misuse in Android applications. Reaves et al. [93] studied 46 Android applications that perform financial transactions and reported instances of incorrect certificate validation, storing login credentials in clear text, and using poor authentication practices such as do-it-yourself cryptography. Georgiev et al.
[47] found that many widely used applications (Amazon's EC2 Java library, PayPal's merchant SDK), shopping carts (osCommerce, Ubercart) and mobile applications (Chase mobile banking) all perform broken certificate validation. Fahl et al. [43] showed that the root causes in SSL development are not simply careless developers, but also limitations and issues of the current SSL development paradigm. Acar et al. [5] perform a user study to show that participants who use Stack Overflow produced significantly less secure code than those using official documentation or published books. We add to this body of work by identifying systematic behavior differences among popular cryptographic frameworks when implementing the same functionality. However, our main focus is in presenting a solution to these problems.

2.1.2 Design patterns for application security and privacy.

Prior work has introduced several design patterns for application security and privacy [33, 53, 95, 98, 105, 107]. The early work of Yoder et al. [120] introduced several architectural patterns for enabling application security and presented a framework using the patterns to build secure applications. Sommerlad [105] introduced reverse proxy patterns to protect servers at the application layer at the network perimeter. Schumacher [99] proposed patterns for protection against cookies and pseudonymous email in the seminal paper on privacy patterns, and Hafiz [53] extended this work by suggesting patterns for the design of anonymity systems. An authentication enforcer pattern [108] is used to ensure that authentication happens at all relevant parts of the code. Kern et al. [64] designed a database query API that avoids SQL injection vulnerabilities. The effectiveness of these patterns has been questioned in recent studies. Heyman et al. [56] identify 220 security patterns, introduced over a ten-year period, and Hafiz et al.
[54] report duplicates among these patterns, as different authors describe similar concepts but give them different names. Yskout et al. [121] quantify the benefits of security patterns to developer productivity and conclude that design patterns do not reduce development time. However, they do not assess whether the use of these patterns leads to more secure code. Rather than identifying design patterns that capture common implementation techniques, we characterize and bridge the semantic gap between the needs of developers and the cryptographic APIs. Starting from common misuse patterns of existing APIs and from incorrect recommendations found in widely used programming resources, we design semantic APIs that can be used correctly by developers who lack an in-depth knowledge of cryptography.

2.1.3 Simplified usage of cryptographic APIs.

The NaCl cryptographic library [20] introduced simplified APIs, aiming to avoid some misuse patterns observed with existing cryptographic libraries. Fahl et al. [43] propose modifying the Android OS to provide the main SSL usage patterns as a service that can be added to apps via configuration, to prevent developers from implementing their own SSL code. OpenCCE [11] is a tool for managing software product lines that builds on the observation that many cryptographic solutions represent combinations of common cryptographic algorithms, parameterized at compile time. OpenCCE guides developers through the selection of the appropriate algorithms, and synthesizes both Java code and a usage protocol. This approach requires monitoring the code changes over time, through static analysis, to ensure that the usage protocol is not violated. Additionally, the code synthesized is tied to the library used and does not account for the need to incorporate external information at runtime.
Rather than creating a new library or a new service, we introduce a semantic layer on top of existing libraries, which allows us to provide unified APIs across different libraries, programming languages and platforms. Additionally, we design portable semantic APIs for bridging the gap between developer needs and cryptographic expertise, instead of static analysis tools.

2.1.4 Static analysis and type systems.

Recent work on type systems and static analysis aims to remove the burden of implementing security checks from developers and to generate proofs that an application satisfies certain security properties. For example, the analysis of information flow allows determining if a program satisfies certain confidentiality policies [35, 58]. Van Delft et al. [114] extended this approach to information-flow properties which change during program execution. FlowTracker [94] focuses on discovering time-based side channels in cryptographic libraries. Bodei et al. [22] focus on finding weaknesses in cryptographic protocols. We focus on misuses of the APIs exposed by popular frameworks, rather than on attacks against cryptographic protocols. Moreover, we observe that, for some security checks, developers need the flexibility to choose the most appropriate implementation. For example, there is currently no agreement about the best method for checking the revocation status of TLS certificates, and the choice is likely to be platform dependent [72]. Type systems and static analyses are complementary to a better API design, and they can improve the performance of well-established security protocols by moving the checks to compile time.

2.2 Downloading Websites

The related work includes the tools used to download websites, obtain the Web topology, and compare the various crawls to identify the appropriate parameters to download a website completely.
We hope that our results will lead to more papers exploring these various parameters and reporting on them, so that others may evaluate and reproduce results more accurately.

2.2.1 Web Crawling Tools

Before the widespread use of dynamic content in webpages, it would suffice to use curl or wget to download web content. But neither of these tools includes a JavaScript engine, and thus they miss a large portion of the web's content: even as early as 2012, Nikiforakis et al. [83] showed that more than 93% of the most popular websites include JavaScript from external sources. Many research projects seek to measure as many third-party resource inclusions as possible [9, 15–17, 59, 62, 67, 76, 78], making more sophisticated, headless browsers a necessity. To address this need, many researchers use Puppeteer [90], a Node.js library that gives programmatic control and data collection over a headless Chromium browser. In our research, we consider two tools that take complementary approaches in obtaining the Resource Tree: ZBrowse and Crawlium. ZBrowse [122] uses Node.js's built-in getResourceTree method for obtaining the DOM tree after the webpage has been loaded. It also augments this tree by collecting data from two network event triggers: requestWillBeSent and responseReceived. Rather than using Node.js's built-in resource tree construction, Crawlium [32] builds its own tree from the collection of the various network events. Crawlium triggers on the same network events as ZBrowse, plus others to capture data sent and received via web sockets, frame navigation, new execution contexts, the parsing of scripts, and when the console API is called.

2.2.2 Web topology

In our study, we will demonstrate several measurement parameters that can have a significant effect on the proportion of the inclusion graph a tool is able to obtain. As their motivation for creating Crawlium, Arshad et al. [10] observe that merely obtaining the DOM tree can miss critical resource inclusions.
ZBrowse uses Node.js's DOM tree, but overcomes this limitation at least in part by augmenting the tree with inclusions learned from network events [67, 122]. Although much prior research related to downloading content on webpages focuses solely on the resources, there is also work by Gibson et al. [48] in which the links between the resources are analyzed. Barabasi et al. [13] utilize the incoming and outgoing link distribution of the Web to understand the network topology in the context of self-organization and scaling in random networks, whereas Castillo et al. [26] study both link-based and content-based features, and use the topology of the Web graph to show that linked hosts tend to belong to the same class—either both are spam or both are non-spam. We study the web topology to identify the links, and ultimately the domains, that lead to trust breakdown and load malicious content.

2.2.3 Comparison across crawls

The papers introducing Crawlium [10] and ZBrowse [67] specify their tools' network events, but do not investigate multiple page loads. Other studies have investigated the variation of page content from one page load to another. Zeber et al. [123] and Englehardt and Narayanan [42]—as part of broader studies—both used OpenWPM, a Selenium-based web privacy measurement tool, to compare resources obtained between a pair of simultaneous or back-to-back crawls. Their results broadly agree, and indicate that the same third-party URLs are loaded 28% of the time and the typical overlap is about 90%. We study a slightly different question: how many page loads would we need in order to exhaustively obtain the inclusion graph? Our study also extends upon these prior efforts by comparing multiple tools and presenting an adaptive page-loading technique for obtaining more complete inclusion graphs.
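The event-driven tree construction that both ZBrowse and Crawlium perform can be sketched abstractly: each network event names the requested URL and the resource that initiated the request, and the inclusion tree is simply the accumulation of those edges. The Python sketch below is our own simplification, not code from either tool:

```python
from collections import defaultdict

def build_inclusion_tree(events):
    """events: iterable of (initiator_url, requested_url) pairs, in the
    order a requestWillBeSent-style hook would observe them."""
    children = defaultdict(list)
    for initiator, url in events:
        children[initiator].append(url)
    return dict(children)

# A hypothetical page load: the page pulls a script, which pulls an ad
# script, which pulls a tracking pixel.
events = [
    ("https://site.example/", "https://site.example/app.js"),
    ("https://site.example/app.js", "https://ads.example/ad.js"),
    ("https://ads.example/ad.js", "https://tracker.example/pixel.png"),
]
tree = build_inclusion_tree(events)
print(tree["https://site.example/app.js"])  # ['https://ads.example/ad.js']
```

The point of the sketch is that a DOM-only view would show where elements ended up, while the event stream records *who requested what*, which is the relationship the inclusion graph needs.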
2.3 Measuring third-party resource inclusion on the web

The related work includes advertising on desktop and mobile devices, the trust in third-party inclusions, and the mechanisms by which the resources are incorporated into the website.

2.3.1 Online advertising

Online advertising has become a cross-browser monetization platform introducing significant malicious activity. Prior work has shown that ad abuse leads to losses on the order of millions of dollars for advertisers, and that ad blocking as well as forced redirects cost publishers [104]. Using monetization tactics similar to DNSChanger, several large botnets (i.e., ZeroAccess and TDSS/TDL4) abuse the ad ecosystem at scale. Third-party services also make money through ad abuse. Chen et al. [28] and Thomas et al. [111] analyze the depth of financial ad abuse. Thomas et al. [111] find that ad injection introduces malware and that a small number of software developers support a large number of ad injectors. Li et al. [70] study web traces and obtain ad redirection chains related to advertising networks, and identify that the top-ranked Alexa websites and leading advertising networks are injected with malware. Targeted advertising uses specialised user data and has become more common due to the prevalent use of social media. Plane et al. [87] conduct a pilot study and a multiple-step survey in which users are provided with various advertising scenarios to understand targeted advertising. Andreou et al. [7] perform a case study on Facebook to identify that ads are displayed for different users based on the user interests, as a result of advertising networks obtaining user data that sometimes contain sensitive information. Cabanas et al. [25] show that a large portion of Facebook users in the European Union are linked with potentially sensitive interests, which can lead to leakage of confidential personal data, and that malicious third-party services can reveal the identity of Facebook users.
Recently, pop-up ads and forced redirections have become more prevalent, through which malware can be spread without the user's knowledge. Identifying the websites (third-party services, pop-ups, automatic redirect webpages, etc.) that lead to malware is important for safe web browsing. Many popular websites, when accessed, lead to fake advertisements [49] and pop-up ads [77, 81], which sometimes bypass ad blockers, causing inconvenience [73]. There are many ad blockers that prevent the loading of malicious domains that are indirectly loaded by a website, but some malicious third parties evade these ad-blocking mechanisms as well. The use of ad-blocking tools like Adblock Plus [1] and uBlock Origin affects advertising revenue streams. Nithyanand et al. [84] analyze the arms race between ad blocking and anti-adblocking, utilizing third-party services that are shared across multiple web pages. Thus, we need to analyze the detailed web topology to identify the source of malicious activity, and we propose to analyze the effectiveness of ad blockers. Guha et al. [51] study the challenges in measurement methodologies used for advertising networks and propose new metrics robust to noise present in ad distribution networks. They also identify measurement pitfalls and artifacts, and provide mitigation strategies. Their paper studies ad network distribution as a whole, whereas we focus on malicious activity introduced not only through advertising, but also in other ways, such as JavaScript obfuscation. Seifert et al. [100] utilize static attributes on an HTML page to detect and classify malicious web pages, whereas Poornachandran et al. [89] perform static analysis followed by a behavioral analysis to identify malicious advertisements. We move a step ahead and use the topology instead of the static HTML page to study malicious activity.

2.3.2 Mobile vs Desktop

While analyzing the web topology, we observed significant differences in the mobile version of a website compared to its desktop version.
While Botha et al. [24] study the difference in security, Johnson and Seeling [63] study the webpage object requests in mobile and desktop browsers. Botha et al. [24] explore the availability of security mechanisms in a mobile context similar to the desktop environment and conclude that the same protection level as in the desktop environment cannot be achieved due to usability issues. Johnson and Seeling [63] identify that the number of webpage object requests increases steadily and that the growth is slightly higher in desktop versions of webpages. Although there are mobile applications corresponding to web applications, an online survey and a user study conducted by Maurer et al. [74] identified that more people prefer using original content on mobile web browsers instead of the mobile application, especially on new-generation mobile devices. Thus, we analyze the Web on both desktop and mobile devices and compare the topologies, both in terms of the number of malicious domains and in terms of the number of insecure redirection links, to test our hypothesis that although desktop versions have a larger number of malicious domains, the number of links to these domains is smaller.

2.3.3 Mobile Advertising

With the rapid deployment of new mobile devices, there is a need to understand the security of mobile versions of websites. Mobile advertising has become an easy way to steal information on users' devices using advertising products [27]. Similar to web advertising, ad networks behind in-app advertising employ personalization to improve the effectiveness and profitability of their ad placement. A lot of in-app advertisements work at the mobile app-web interface, where users tap on an advertisement and are led to a web page which may further redirect, sometimes automatically, until the user reaches the final destination. Mobile advertising is more prone to malicious activity, one of the reasons being little to no use of ad blockers on mobile devices. Dong et al.
[37] explore various new ad frauds in mobile applications that include both static placement and dynamic interaction fraud, whereas Rastogi et al. [92] explore the interface between mobile applications and web links and identify that destination webpages may result in scams. Due to these vulnerabilities in the applications and the interface, significant user data is leaked from mobile devices. Meng et al. [75] study the amount of sensitive user information that mobile in-app advertising networks learn, and show that personalized ads can be used to reconstruct the data obtained by the ad networks. Also, there are various ways in which the data can be accessed. Son et al. [106] show that a few applications require access to external storage to cache videos and images, whereas Demetriou et al. [34] analyze the data that can be obtained by advertising networks from installed apps, libraries, other files and user inputs, all of which results in revealing sensitive data. We aim to identify the web applications that are more prone to malicious activity and possibly detect where and how such behaviour arises. We also analyze whether the basic ad blockers currently available on mobile devices can protect the device from malicious domains.

2.3.4 Trust in Third-party Resource Inclusions

Ikram et al. [59] study dependency chains in the Web ecosystem, focusing on suspicious or malicious third-party content that is indirectly loaded by first-party websites via dependency chains. While analyzing the malicious activity on the Web, we begin with what that paper does and obtain the dependency chains. We then utilize them to identify the links between first- and third-party domains, and ultimately to identify not just the domains that are outright bad (according to VirusTotal), but also the domains that indirectly load these domains when a user accesses multiple first-party websites (the Alexa top-ranked websites).
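The back-tracing idea can be sketched as a reverse reachability query on the inclusion graph: starting from the domains flagged as bad, walk the inclusion edges backwards to find every domain (and ultimately every first-party site) that transitively loads them. The code below is our own illustration; the domain names reuse the yahoo.com and mangapanda.com examples from the introduction, but the exact edge set is hypothetical:

```python
from collections import deque

def domains_reaching(bad, edges):
    """edges: dict mapping a domain to the set of domains it loads.
    Returns every domain with a directed path to a domain in `bad`."""
    # Reverse the edges so we can walk backwards from the bad domains.
    rev = {}
    for src, dsts in edges.items():
        for dst in dsts:
            rev.setdefault(dst, set()).add(src)
    seen, queue = set(), deque(bad)
    while queue:
        node = queue.popleft()
        for loader in rev.get(node, ()):
            if loader not in seen:
                seen.add(loader)
                queue.append(loader)
    return seen

edges = {
    "yahoo.com": {"s.yimg.com", "dt.adsafeprotected.com"},
    "s.yimg.com": {"dt.adsafeprotected.com"},
    "mangapanda.com": {"srv.aftv-serving.bid"},
    "srv.aftv-serving.bid": {"cobalten.com"},
}
print(sorted(domains_reaching({"cobalten.com"}, edges)))
# ['mangapanda.com', 'srv.aftv-serving.bid']
```

In a real crawl the edge set is far larger, but the same breadth-first walk identifies the intermediaries through which trust leaks toward the flagged domains.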
The major reason behind malicious activity in the Web is the trust first-party websites place on third-party resources. Chen et al. [29] study how network reputation and malicious activity are exploited in ad bidding between ad exchanges and advertisers. For new or unknown domains, the reputation score is determined based on known legitimate and malicious domains. Antonakakis et al. [8] propose a dynamic reputation system by building a domain model using passive DNS query data and network-related data. Sometimes, adversaries exploit domain ownership changes and use legitimate domains that are no longer in use to introduce malicious activity. Lever et al. show that ownership changes are exploited by adversaries, resulting in residual trust abuse that reduces the safety of the domain and of the users accessing such domains [69]. They additionally measure the extent of residual domain trust abuse and its growth in recent times [68]. Instead of using a known reputation, we first use VirusTotal to identify bad domains, and we propose to use the web topology to effectively define a better trust metric for the domains by back-tracing from the bad domains.

2.4 Resource Hints

There has been some prior work measuring the percent of websites that implement resource hints [55]. However, to the best of our knowledge, none investigates how optimally websites use resource hints. We intend to explore the actual fraction of resource hint links that websites use and whether they are properly enabling DNS prefetching for HTTPS. There are two main ways to enable DNS prefetching. According to the popular web browsers Firefox and Chrome, websites can contain the following meta tag in order to enable DNS prefetching [36, 119]: <meta http-equiv="x-dns-prefetch-control" content="on">. An alternative is to enable it within HTTP headers with the following syntax: "X-DNS-Prefetch-Control: on" [119].
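The two opt-in mechanisms can be captured in a few lines. The helper below is a hypothetical sketch, not browser source code; in particular, giving the HTTP header precedence over the meta tag when both are present is our simplifying assumption:

```python
# Illustrative sketch of the two opt-in mechanisms: an HTTP response header
# or an equivalent meta tag in the page's HTML.
PREFETCH_HEADER = "X-DNS-Prefetch-Control"
PREFETCH_META = '<meta http-equiv="x-dns-prefetch-control" content="on">'

def prefetch_enabled(headers, meta_content=None, https=True):
    """Approximate the default policy: under HTTPS, DNS prefetching is off
    unless the header or meta tag explicitly turns it on."""
    value = headers.get(PREFETCH_HEADER, meta_content)
    if value is not None:
        return value.lower() == "on"
    return not https  # default: on for plain HTTP, off for HTTPS

assert prefetch_enabled({}, https=True) is False
assert prefetch_enabled({PREFETCH_HEADER: "on"}, https=True) is True
assert prefetch_enabled({}, meta_content="on", https=True) is True
```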
Chrome denotes this requirement as a response to prevent eavesdroppers from "inferring the host names of hyperlinks" [36]. A wide variety of previous work has established that DNS resolution latency is one of the prime causes of slowdowns in page load times on the internet [52, 103, 109, 117]. Habib and Abrams demonstrated the damage caused by DNS latency in slowing down web browsing, and suggest initial mitigations, such as increasing the size of the DNS cache [52]. In their analysis of how to improve internet speed, Singla et al. assessed the latency bottleneck of DNS resolution as causing a 7.4x inflation over c-latency (that is, "the time it would take for light to travel round-trip along the shortest path between the same end-points") [103]. Sundaresan et al. empirically showed that DNS caching could reduce maximum page lookup times by between 15 and 50 milliseconds [109]. And Wang et al. used profiling to show that DNS lookup is the cause of almost 13% of page delay time during the loading process [117]. As a result of the significant contribution of DNS latency to slower internet speeds, a number of techniques have been proposed to mitigate this latency. These techniques include ideas varying from querying multiple DNS servers simultaneously to eliminating DNS resolvers. However, of all of these ideas, only one has caught on in practice [97, 116]. DNS prefetching has been demonstrated to provide significant latency improvements in theory and, as a result, has been implemented in practice by almost all major browsers [30, 82]. However, not every researcher is equally pleased with the idea of resource hints as a potential solution. There is a chance that resource hints can introduce new privacy concerns while browsing the web. In particular, as Krishnan and Monrose observed, because DNS prefetching causes all of the DNS requests for a web page to be sent at the same time, these requests are likely to be clustered together in DNS logs.
If these requests are sufficiently unique to allow identification of individual pages of websites, this could potentially leak information about the actual content users are viewing, rather than just what IPs they are visiting [65, 66]. In addition, resource hints could be used for various vectors of attack, such as framing attacks, targeted DoS attacks, cross-site forgery attacks, and data-analytic pollution attacks [115]. However, these attacks would require a malicious host for the webpages, and although resource hints open the door to new attacks, a malicious host already has the capability to implement various additional attacks against its users and others. Perhaps as a result of these security concerns, some major browsers have chosen to disable DNS prefetching by default under HTTPS, although it is still enabled by default under HTTP [36, 119]. Major browsers do not act on DNS prefetch hints within HTTPS files unless the page first explicitly turns DNS prefetching on, by including the X-DNS-Prefetch-Control HTTP header or the equivalent HTML meta tag. If at any point in the HTML DNS prefetching is turned off (content="off"), Chrome does not allow it to be turned back on.

Chapter 3: Toward Semantic Interfaces for Cryptographic Frameworks

This chapter focuses on direct risk—the risk that arises with invoking a third-party resource incorrectly, even if the third party is otherwise trustworthy. We study direct risks within the context of cryptographic frameworks. Several mature cryptographic frameworks are available, and they have been utilized for building complex applications. However, developers often use these frameworks incorrectly and introduce security vulnerabilities. This is because current cryptographic frameworks erode abstraction boundaries, as they do not encapsulate all the framework-specific knowledge and expect developers to understand security attacks and defenses.
In our paper [60], we characterize this semantic gap between the needs of developers and the cryptographic APIs, and we present techniques for bridging this gap.

3.1 Overview

Cryptographic algorithms are often a necessary building block for complex applications and libraries, for instance to implement secure client-server communications, to store data securely or to process payments. Several mature cryptographic frameworks are currently available for this task, including Oracle JSSE, IBM JSSE, BouncyCastle, and OpenSSL. These frameworks are implemented by cryptography experts, include state-of-the-art algorithms, and their code has been audited and analyzed with formal verification tools. They have also been used to build real-world software that provides strong security. Unfortunately, software developers who lack cryptography expertise often make critical mistakes when using these frameworks, including exchanging keys without authenticating the endpoint [93], storing sensitive information in cleartext [93] or with weak protection [40], using parameters known to be insecure (e.g., the Electronic Codebook mode or non-random initialization vectors) for block ciphers [40], using encryption keys that are constant [40] or are generated from insufficient randomness [40, 93], or performing improper TLS certificate validation [43, 47, 93]. The Common Weakness Enumeration dictionary [31], which provides a comprehensive taxonomy of frequent programming mistakes, includes 14 common implementation errors related to the use of cryptography. These errors allow attackers to impersonate legitimate users [43, 47, 93], to harvest sensitive personal information [43, 47, 93] and even to steal money [93].
The solutions that have been proposed for this problem include simplified cryptographic APIs [3, 20, 43, 110], secure default values for the parameters of cryptographic algorithms [47] and static analysis tools that discover bugs related to cryptography misuse [11, 40, 64]. These solutions do not address a more fundamental problem: the fact that current cryptographic frameworks erode abstraction boundaries, as they do not encapsulate all the framework-specific knowledge and expect developers to understand security attacks and defenses. In this chapter, we characterize the semantic gap between the needs of developers and the cryptographic APIs, and we present techniques for bridging this gap. For example, simplified cryptographic APIs do not provide a true separation of concerns, as they do not provide the flexibility that developers need to implement the complex business logic required (e.g., authentication and authorization services, e-commerce SDKs and integrated shopping carts [47], or mobile payment systems [93]). This problem could be addressed in part by following best practices in API design [21], but some challenges are specific to the cryptography domain. In particular, the security of applications using cryptography often depends on information external to the system. For example, the SHA-1 cryptographic hash function, introduced two decades ago, is no longer considered secure given the performance of modern hardware, yet many web sites still advertise TLS certificates that use SHA-1 for generating digital signatures. The checks for implementation choices that may lead to insecurity must be done at runtime, as in some cases the information changes frequently. For example, when TLS certificates are compromised, they must be revoked and reissued immediately to prevent man-in-the-middle attacks,¹ and client-side code must check for the revocation status of these certificates.
Moreover, developers need the flexibility to select the most appropriate mechanism for incorporating this information. For example, an application can check the revocation status of TLS certificates by downloading certificate revocation lists (CRLs), by using the Online Certificate Status Protocol (OCSP) or by implementing OCSP stapling. There is currently no agreement about what method is best, and the choice is likely to be platform dependent [72]. In consequence, it is difficult to define simplified APIs or statically verifiable security protocols that cover all the ways cryptography is used in the real world. Another domain-specific challenge is that developers often need to disable security checks in the development environment in order to run and test their application (e.g., by disabling server authentication with self-signed TLS certificates [43]), and sometimes these checks remain disabled in production because the developers do not understand the security threats associated with these workarounds [43, 93]. This suggests that designing good cryptography APIs is not sufficient for addressing the problem, and the solution must extend to the build management system. To address these challenges, we propose semantic APIs for cryptographic libraries, which expose the security decisions without requiring in-depth knowledge of attacks and defenses. We describe several design patterns for implementing these APIs, including three ways of incorporating external information, and we demonstrate how our APIs can be implemented on top of the existing cryptographic frameworks. Our APIs represent a first step toward striking the right balance between restricting the security decisions that developers make and giving them the flexibility needed for complex applications that use cryptography.

¹ Exploits for the Heartbleed vulnerability, which enabled breaking TLS certificates at scale, were observed in the wild less than 24 hours after the vulnerability was disclosed [38].
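As an illustration of this flexibility requirement, a semantic API can expose the revocation check as a pluggable strategy, so the developer chooses CRLs, OCSP, or stapling without re-implementing the security decision itself. The sketch below uses our own hypothetical names and stubbed checkers; it is not the API proposed later in this chapter:

```python
from abc import ABC, abstractmethod

class RevocationChecker(ABC):
    @abstractmethod
    def is_revoked(self, certificate) -> bool: ...

class CRLChecker(RevocationChecker):
    """Checks against a (stubbed) certificate revocation list."""
    def __init__(self, revoked_serials):
        self.revoked = set(revoked_serials)
    def is_revoked(self, certificate):
        return certificate["serial"] in self.revoked

class OCSPChecker(RevocationChecker):
    """Placeholder: a real implementation would query the OCSP responder."""
    def is_revoked(self, certificate):
        raise NotImplementedError("would contact the OCSP responder here")

def open_secure_channel(certificate, checker: RevocationChecker):
    """The semantic API states *what* is guaranteed (no revoked certs);
    the caller chooses *how* revocation is checked."""
    if checker.is_revoked(certificate):
        raise ConnectionError("certificate has been revoked")
    return "channel-established"

cert = {"serial": 7}
print(open_secure_channel(cert, CRLChecker(revoked_serials={3, 5})))
# channel-established
```

The security-relevant decision (refuse revoked certificates) lives inside the API, while the platform-dependent mechanism remains swappable.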
In addition to these semantic APIs, we propose compile-time checks to separate the development environment from the production environment. This allows for a clean definition of workarounds during development that should not be used in production. In summary, we make three contributions:

1. We identify new problems with the existing cryptographic APIs, and we classify the root causes of the new and the known programming mistakes related to using these APIs.

2. We present a solution to these problems by introducing semantic APIs for cryptographic operations. We also discuss design patterns for implementing these interfaces on top of existing cryptographic frameworks.

3. We propose build management hooks for isolating the workarounds used during development and testing.

The rest of the chapter is organized as follows. In Section 3.2 we outline our goals and non-goals. In Section 3.3 we review problems with existing cryptographic APIs not described in the prior work, and we categorize the needs of developers who use these APIs. In Section 3.4 we describe our solutions to these problems. In Section 3.5 we validate our solutions through several case studies. In Section 3.6 we discuss the remaining challenges.

3.2 Problem Statement

Figure 3.1: HTTPS request in Python.

Consider a developer, Alice, who wants to develop an application in Python and use the HTTPS protocol in her software to communicate securely with a web service called Binary Object Broker (BOB). The HTTPS protocol allows clients to connect to servers they have not encountered before, and with which they have no shared secrets, by utilizing the TLS protocol to exchange keys during the initial handshake. A common misuse of cryptographic APIs (CWE-322) is to perform a key exchange without first authenticating the server [31, 43, 93], which results in the establishment of a secure channel without first ensuring that the client has connected to the correct server.
This programming mistake allows an adversary to intercept the communication through a man-in-the-middle attack [57]. To prevent this attack, an HTTPS server must present a digital certificate, signed by a Certification Authority that the client trusts, which authenticates the server to the client. While Alice is not a cryptography expert, she tries to avoid such common mistakes by looking up the best practices for using Python libraries to establish a secure connection with the web server. This results in the code shown in Figure 3.1 (adapted from https://docs.python.org/2/library/ssl.html#ssl-security). As the figure shows, Alice creates a default context, which authenticates the BOB service by requesting a certificate and by verifying that the web server's hostname matches the one in the certificate. Although Alice believes that her implementation is secure, her code never checks whether the certificate has been revoked, leaving the application exposed to a man-in-the-middle attack. Because TLS certificates are sometimes compromised (e.g. in the wake of the Heartbleed vulnerability [124]), client-side TLS code must check the revocation status of certificates presented by servers during the TLS handshake to prevent an adversary from eavesdropping on the connection. The cause of this omission is that the default option (ssl.VERIFY_DEFAULT) in Python does not check for revoked certificates; Alice must explicitly specify context.verify_flags = ssl.VERIFY_CRL_CHECK_CHAIN in Figure 3.1 to ensure this. The fundamental cause of this error is that the library expects developers to have a good understanding of security attacks and defenses and to know details about the library's implementation and configuration.

This is just one example of the security vulnerabilities that can be introduced by misusing cryptography; recent studies [40, 43, 47, 93] have reported that such mistakes are common.
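Concretely, closing the revocation gap in Alice's setup comes down to one explicit setting on the context; a minimal sketch using Python's standard ssl module (loading an actual CRL via load_verify_locations is elided, and the handshake would fail without it):

```python
import ssl

# A default context already enables certificate and hostname verification...
context = ssl.create_default_context()
assert context.check_hostname                   # hostname verification is on
assert context.verify_mode == ssl.CERT_REQUIRED

# ...but revocation checking is NOT enabled by default: Alice must opt in.
context.verify_flags |= ssl.VERIFY_CRL_CHECK_CHAIN
# A real deployment must also load a current CRL, e.g. via
# context.load_verify_locations(cafile=...), before handshakes will succeed.
```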
The root causes of these mistakes fall into four categories:

1. No separation of concerns. Existing cryptographic frameworks are implemented by experts and can be used correctly. However, the APIs they expose do not encapsulate all the cryptographic knowledge, and they expect users such as Alice to understand security attacks and defenses and the subtle impact of various parameters used in the framework implementation. In consequence, when using these frameworks, developers cannot focus only on the application logic but must also learn about cryptography.

2. Diverse needs of developers. Multiple types of developers may need to use cryptographic frameworks. For example, functionality engineers (such as Alice) often have simple requirements, such as communicating securely over the Internet, while security engineers must implement more complex services that rely on cryptography. Prior efforts to simplify cryptographic APIs, to make them more suitable for developers who lack cryptographic expertise [20, 43], do not take into account the diversity of these developers' needs.

3. Need to incorporate external information. One of the reasons why existing cryptographic frameworks expose a dizzying array of options and parameters is to allow developers to react to the community's evolving understanding of cryptographic attacks and defenses. Some implementation choices are found to jeopardize security, and the rate at which this information is updated ranges from years (in the case of cryptographic algorithms, e.g. SHA-1) to days (in the case of certificate revocations). Developers must find a way to incorporate such external information into their systems.

4. Reliance on secure default values. The prior efforts to simplify cryptographic APIs remove choices available to developers, which leads to additional mistakes when programmers develop workarounds.
For example, while the default settings for the Android SSL library ensure correct certificate validation, developers often need to disable these validation checks during development and testing, by using self-signed certificates [93], and do not understand that using this workaround in production code fails to authenticate the server [43]. In addition to cryptographic APIs that ensure a separation of concerns and that provide secure defaults, developers need appropriate build management tools to define workarounds that should only be used in the development environment.

Goals and non-goals. We present ideas that would help developers to use cryptography correctly, by addressing the four challenges identified above. However, we do not aim to enforce a secure usage of cryptographic APIs. We do not think this is currently feasible, as developers could always resort to do-it-yourself crypto to bypass our enforcement points. In other words, we assume that, like Alice in our example, the developers are trying to write secure software, and we aim to make it easier for them to achieve this goal. Furthermore, we do not propose new cryptographic techniques or algorithms, and we do not describe any cryptographic weaknesses in the existing algorithms. Instead, we focus on preventing common mistakes in the way these algorithms are used in applications. Finally, we do not consider the mistakes made by developers who implement their own custom ways of encrypting data, authenticating users, etc., thereby bypassing cryptographic frameworks entirely.

3.3 Needs of Developers

From the context of programming mistakes reported in prior work [40, 43, 47, 93], we identify five general needs of developers who must use cryptographic APIs. By further investigating these needs, we also identify several challenges, not documented before, for using existing APIs correctly.

Need 1: Establish secure connections.
One of the most common reasons to use cryptographic frameworks is to implement client-server applications that communicate over secure channels, often using the HTTPS protocol. However, the counter-intuitive interfaces and parameters exposed by these frameworks can lead to programming mistakes. For example, JSSE performs hostname verification only if an algorithm field is correctly set to "HTTPS" [47], so developers who are not familiar with the steps of the TLS handshake are likely to establish an insecure connection. Similarly, in Python, hostname verification is skipped if the developer forgets to specify ssl.CERT_REQUIRED (Figure 3.1). In the cURL library, the parameter for requesting hostname verification and certificate validation is 2 (an integer), while in JSSE it is TRUE (a boolean); this confuses developers, who sometimes invoke the cURL API with 1 (the integer that corresponds to TRUE) [47].

We also identified cases where different cryptographic frameworks behave differently when providing the same functionality. This can lead to mistakes when developers move from one framework to another. For example, if a trusted certificate has expired, an IBM JSSE implementation will fail the handshake, even though the expired certificate is trusted, whereas an Oracle JSSE implementation will flag such a connection as secure and expect the developer to check for expired certificates. Developers who are not security experts need a cryptographic API that abstracts away these details and that minimizes astonishment [21].

Need 2: Store data securely. Another frequent requirement is to store sensitive data, e.g. personally identifiable information, keys, credit card information, account balances and other application-related information. However, developers sometimes store such information on the device in plaintext, use hard-coded keys or insufficiently random values for encryption, or allow the sensitive information to leak through log files [93].
Developers usually know which information handled by their applications is sensitive, but they make mistakes because they must invoke correct encryption operations each time the data is written to disk, in some form. Instead, a cryptographic API should decouple the task of specifying that certain data structures contain sensitive information from the secure storage primitives. For data marked as sensitive, these primitives should automatically encrypt the data in a way that ensures confidentiality (an unauthorized party cannot decrypt it) and integrity (the data cannot be forged or tampered with).

Need 3: Incorporate security-critical external information. Systems that were designed and implemented correctly a decade ago may be vulnerable today because of changes in the security landscape. For example, the DES encryption algorithm and the MD5 hashing algorithm were considered secure in the past, but today they are known to be insecure. Nevertheless, there are still applications that use DES and MD5. Similarly, as computing power increases, increasingly longer keys are necessary for providing security against brute force attacks. The cryptographic framework should check and enforce these recommendations transparently. This can be achieved by requesting the information, in a machine-readable format, from a trusted third party—perhaps the National Institute of Standards and Technology (NIST), which periodically publishes recommendations for the use of cryptographic algorithms and key lengths [14]. Web browsers use a similar pattern for determining the revocation status of TLS certificates, which is another type of external information that is critical for security.

Need 4: Use default parameters securely. Developers who lack a background in cryptography will often end up using the default values of parameters required by various algorithms.
Sometimes, the default values compromise security; for example, when the AES block cipher is used, the insecure ECB mode is the default in Python's PyCrypto [71] and in the Java JSSE libraries (as well as the resulting Android libraries) [40]. Cryptographic frameworks should provide default values that ensure security.

Need 5: Disable security checks during development and testing. Because security implies that certain operations will be disallowed, developers often need a way to bypass security checks in the development environment in order to test all the code paths. For example, when a developer starts building an SSL application, the code throws exceptions either due to the absence of a certificate or due to the use of a self-signed certificate. Many such developers then bypass SSL certificate validation, to be able to continue writing and testing their code. While these workarounds have a legitimate purpose in the development environment, they are sometimes deployed in a production environment, making the application vulnerable [43]. Cryptographic frameworks should provide a way for developers to specify that certain workarounds should be executed only in the development environment.

Table 3.1: Mapping developer needs, mistakes, and solutions. CWE provides a unified, measurable set of software weaknesses [31]; for example, CWE 297 in the list corresponds to the mistake "Improper Validation of Certificate with Host Mismatch" (https://cwe.mitre.org/data/definitions/297.html).

Developer Need | Example Mistake | Solution | CWE | Section
1. Establish secure connection | Allow expired certificates; skip hostname verification | Semantic APIs; integrating external information | 295, 297, 599, 319, 321, 322, 324, 327 | 3.4.1, 3.4.2
2. Store data securely | Secret keys stored unencrypted | Semantic APIs | 311, 312, 532 | 3.4.1
3. Incorporate security-critical information | Using SHA-1 digest | Integrating external information | 299, 327, 370, 676 | 3.4.2
4. Use default parameters securely | Using AES in ECB mode | Semantic APIs; setup build configuration | 276, 453 | 3.4.1, 3.4.3
5. Disable security checks during development | Using self-signed certificates in production environment | Setup build configuration | 296 | 3.4.3

3.4 Unified Framework for Secure Application Development

The needs, and a few of the mistakes, identified in the previous section are summarized in Table 3.1. To address these needs, we propose a unified framework with the following components:

1. Semantic APIs, which present the high-level functionality and security guarantees to the developers without exposing low-level implementation specifics. We show that these APIs can be implemented consistently on different platforms, without modifying the underlying cryptographic framework (Section 3.4.1).

2. Design patterns for integrating information from external trusted sources, transparently and at runtime (Section 3.4.2).

3. Ensuring correct compile-time procedures to separate the development environment from the production environment. This can be done using code annotations in Java (for instance, using profiles in the Spring framework) or using preprocessor macros in languages like C (Section 3.4.3).

Table 3.2: Proposed semantic API with functions for both application and library developers [46].

API | Parameters | Semantics
Functionality Engineers — Communicate interface (design patterns: Regulator, Proxy, Template):
send(sock, msg) | sock: socket for communication; msg: data to be sent/received; addr: address of the sender/receiver | Sends a message to the receiver
secureSend(sock, msg) | (as above) | Sends an authenticated, encrypted message
receive(sock) | (as above) | Receives a message from addr
secureReceive(sock) | (as above) | Receives an authenticated, encrypted message
connect(addr) | (as above) | Returns an established connection
secureConnect(addr) | (as above) | Returns a secure SSL connection (authenticity, confidentiality, integrity)
disconnect(sock) | (as above) | Disconnects the connection
Functionality Engineers — Storage interface (design patterns: Proxy, Template):
write(key, val) | key: search key (or filename); val: data to be stored | Writes data in plaintext
secureWrite(key, val) | (as above) | Writes encrypted data to file
read(key) | (as above) | Reads data from file
secureRead(key) | (as above) | Reads encrypted data from file
Security Engineers:
dispatch(sock, msg) | sock: receiver end-point | Sends the message through the socket
receive(sock) | sock: sender end-point | Receives a message through the socket
isConfidential(msg) | msg: data to be checked | Returns whether the data is confidential

3.4.1 Semantic APIs

In this section we introduce our semantic APIs, along with the use cases that motivated their design. Table 3.2 lists these APIs.

Figure 3.2: Communicate interface - example send and secureSend functions to perform a connection using HTTP and HTTPS protocols.

Our design goal is to allow only a few developers, which we call the Security Engineers, to be involved in making security decisions and provide them with the tools they
need to implement custom protocols that meet certain security requirements, whereas the other developers, called Functionality Engineers, should focus on functionality-specific needs, and their design choices should not affect security. We achieve this goal by ensuring that the functions written by the Functionality Engineers do not involve any security decisions, whereas the functions written by the Security Engineers perform all the security tasks; the latter functions can be used directly by the Functionality Engineers, without complete knowledge of the underlying implementation, and cannot be modified by them.

3.4.1.1 Functionality Engineers

Communicate Interface. The functionality required by developer need 1, described in Section 3.3, can be implemented with the connect, secureConnect, send, secureSend, receive and secureReceive functions. connect, send and receive allow communication without authentication or encryption, whereas secureConnect, secureSend and secureReceive ensure both properties.

From a functionality engineer's perspective, the secure functions are used in the same manner as their insecure counterparts. Such developers would use secureConnect/secureSend for sending sensitive data such as login credentials, SSNs, credit card numbers, etc. We must therefore ensure that these functions perform all security checks that are required for what developers expect to be a secure channel. A detailed description of TLS connection establishment for HTTPS, and the necessary checks to be performed, is as follows. An HTTPS connection, used for connecting to web servers, is the HTTP protocol run over an SSL/TLS connection. HTTPS is primarily used for two purposes: (1) authenticating the server, and (2) ensuring confidentiality of the messages sent to the server. An SSL connection is established by the following steps:

1. First, a hello message is exchanged between the client and the server.
The client sends all the cryptographic information, such as the cipher suites and the SSL/TLS protocol versions it supports.

2. Based on the client's information, the server responds with the cipher suite and the protocol version that will be used.

3. The server then presents its SSL certificates to the client to prove its identity. Each certificate contains details of the server such as name, location, etc., the time for which it is valid, a public key associated with the certificate, and a digital signature from a root certificate authority (or an intermediate certificate authority). Root certificate authority certificates are self-signed.

4. For every certificate, the client checks whether it is correctly signed by another certificate authority or whether the client implicitly trusts the certificate authority (if the certificate is self-signed). In addition, the client validates each certificate's expiry date and verifies the hostnames of the certificates.

5. The client also needs to verify that the certificate has not been revoked in the recent past. This is performed in one of three ways: (a) checking against a recently updated list of revoked certificates (a CRL); (b) obtaining the revocation status for every request using the Online Certificate Status Protocol (OCSP); or (c) having the server send a time-stamped OCSP response in addition to the certificate (OCSP stapling).

6. Once the client validates the certificate and the server is authenticated, the client and server perform a key exchange protocol to set up a symmetric secret key that is used to encrypt subsequent communication.

send can be used for all other communications and for channels that cannot be secured (e.g. text messages sent by a smartphone). When sending a message on a secure channel, the functionality engineer specifies only the address addr and the data msg. The API implementation should ensure that confidential messages are always sent using secureSend (as shown in Figure 3.2).
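The handshake steps above are driven by the TLS library itself; the sketch below, our own illustration using Python's standard ssl module, shows where client-side configuration hooks into steps 3-5 (the host name is illustrative, and actually connecting requires network access and a locally loaded CRL):

```python
import socket
import ssl

def make_client_context() -> ssl.SSLContext:
    """Configures the client-side checks for handshake steps 3-5."""
    context = ssl.create_default_context()   # trusted roots; sets
    # check_hostname = True and verify_mode = ssl.CERT_REQUIRED (steps 3-4)
    context.verify_flags |= ssl.VERIFY_CRL_CHECK_CHAIN  # revocation (step 5)
    return context

def open_https_channel(host: str, port: int = 443) -> ssl.SSLSocket:
    """Steps 1-2 and 6 (hello, negotiation, key exchange) run inside the
    library when the handshake is triggered by wrap_socket()."""
    raw = socket.create_connection((host, port))
    return make_client_context().wrap_socket(raw, server_hostname=host)

# Example (requires network access and a loaded CRL):
# channel = open_https_channel("example.com")
```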
This can be achieved by requiring security engineers to explicitly specify an isConfidential function, which takes a msg as a parameter and returns a boolean value indicating whether the data is confidential. The send function invokes isConfidential and returns an error, rather than sending confidential data over an insecure channel. The method returns true by default, so data that is not sensitive must be explicitly whitelisted. The security engineers define msg to be an instance of a specific superclass, which tags all data as sensitive or non-sensitive. This decouples the task of specifying which data is confidential from the task of implementing network communications, and ensures that the network programmer cannot compromise security by mistakenly using send in cases where secureSend should be used.

Storage Interface. The functionality required by developer need 2 can be implemented with the write, secureWrite, read and secureRead functions. write stores a value (or a file) in plaintext, whereas secureWrite ensures the integrity and confidentiality of the file. Whether the stored data is sensitive or not, a functionality engineer only needs to specify a value that is to be stored and a key that can be used to read the value. As in the Communicate interface, an isConfidential function must be specified that checks whether data written in plaintext is indeed non-sensitive, thus decoupling the data sensitivity checks from the application logic. The APIs presented to the functionality engineer by the write and secureWrite (or send and secureSend) methods are very similar and do not require any parameters that control how security is achieved. The Communicate and Storage interfaces provide a clean separation of concerns by hiding all the cryptography details from the functionality engineer and by decoupling the security specification from the implementation of the input-output functionality.
3.4.1.2 Security Engineers

Unlike functionality engineers, security engineers implement more complex interaction protocols (e.g. authentication and authorization, online shopping, payment processing), and the resulting libraries may be included in third-party applications. In consequence, security engineers need a better understanding of security principles, and they are responsible for exposing intuitive APIs to the functionality engineers. However, security engineers can also benefit from semantic cryptographic APIs. For example, a security engineer might implement the send and secureSend functions discussed above. Figure 3.2 outlines an implementation using the Proxy design pattern [46], which adds a wrapper and delegation to protect the real component from undue complexity.

connect/send. The connect function creates a socket for communication, whereas send transmits the message to the destination address. In this scenario, the data may be sent in plaintext. As mentioned earlier, this function invokes the isConfidential function to ensure that the data sent over the network is not sensitive.

secureConnect/secureSend. The secureConnect function creates a secure communication channel, e.g. by using HttpsUrlConnection for HTTPS in Java. The function performs all SSL checks that are not performed by the underlying libraries. For example, if the security engineer uses a JSSE library by Oracle, then this function checks for the expiry of certificates. The function then performs the necessary certificate revocation checks before transmitting the message to the destination. This requires incorporating information from external sources, which can be achieved using the Regulator pattern.

To address the semantic gap between the developers' needs and the cryptographic APIs provided, we allow security decisions to be made by Security Engineers alone, while Functionality Engineers focus on application-specific design choices.
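The proxy idea behind Figure 3.2 can be sketched as below. This is our own Python illustration, not the figure's code: the framework connection, its expiry field, and the check are all stubs. The proxy exposes the same interface as the real connection but runs the checks the underlying library omits (here, certificate expiry) before delegating.

```python
import time

class FrameworkConnection:
    """Stand-in for a connection object from an existing TLS framework
    (e.g. one that, like an Oracle JSSE implementation, does not reject
    an expired-but-trusted certificate on its own)."""
    def __init__(self, cert_expiry: float):
        self.cert_expiry = cert_expiry
        self.transmitted = []
    def transmit(self, data: bytes) -> None:
        self.transmitted.append(data)   # real framework I/O would happen here

class SecureConnectionProxy:
    """Proxy pattern: same interface as the wrapped connection, but every
    call first runs the checks the framework leaves to the developer."""
    def __init__(self, conn: FrameworkConnection):
        self._conn = conn
    def _check_certificate(self) -> None:
        if self._conn.cert_expiry < time.time():
            raise ConnectionError("server certificate has expired")
        # a full implementation would also check revocation here (Regulator)
    def transmit(self, data: bytes) -> None:
        self._check_certificate()   # extra check added by the proxy
        self._conn.transmit(data)   # then delegate to the framework
```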
3.4.2 Integrating External Information

In this section we describe the Regulator pattern for incorporating external information, such as parameters known to be insecure or certificates that have been revoked. We start with a motivating example for the proposed design pattern and then explain the pattern in standard format.

- Regulator: This pattern can address one of the most common mistakes when using cryptographic APIs: not checking whether TLS certificates have been revoked, and thus accepting potentially compromised certificates for authentication. The certificate revocation check can be implemented by downloading certificate revocation lists (CRLs), through the Online Certificate Status Protocol (OCSP), or through OCSP stapling [72]. Each of these methods requires a different mechanism to integrate the information provided by a trusted external source, and hence we propose separate (but closely related) design patterns for each method.

X.509 certificates are issued to a web server by certificate authorities (CAs). These certificates serve as proof to the client of the identity of the server it is connecting to. The certificates are valid for a certain duration of time, after which they need to be reissued. However, due to potential key compromises and other such reasons, the CA may revoke a certificate before it expires. Thus, even when the server presents a certificate, the client needs to additionally verify that the certificate has not been revoked. We present some of the methods used to check for revocation in the following.

1. Certificate Revocation Lists (CRLs): A CRL is a list of revoked certificates maintained by the CAs. To use CRLs, the client intermittently retrieves a fresh CRL from the CA and verifies a certificate against this CRL. Whether a client has the most recent list depends on the frequency of requests made to the CA, and this impacts the number of queries that the CA needs to support.
An intermediary can reduce load on the CA by retrieving these lists more frequently and pushing them to the clients.

2. Online Certificate Status Protocol (OCSP): OCSP is an alternative to CRLs where, for every certificate, the client verifies the certificate's revocation status with the CA (OCSP server).

3. OCSP with Stapling: In OCSP stapling, the server provides the certificate and the client validates it. If the certificate is revoked, the client then checks the time elapsed since the revocation. If the certificate was presented within some pre-defined time-frame since the revocation, the client accepts the certificate. Otherwise, the client forwards the certificate to the CA (OCSP server) for validation.

Pattern Name and Classification: Regulator is a behavioral design pattern. Like the traditional Observer pattern [46], Regulator defines a one-to-many relationship between objects and allows the dependent objects to update themselves automatically whenever the subject changes its state. However, there are two differences between the Observer and Regulator patterns:

1. The Regulator pattern does not expect the subject to be aware of all the dependent regulators, and hence the subject does not notify the regulators whenever it changes its state. On the contrary, it is the regulators' responsibility to check whether the subject has changed state.

2. The Regulator pattern does not allow the regulators to modify the state of the subject. This is crucial, since the subject is expected to be a standard or benchmark for all the regulators.

Intent: The intent of this design pattern is to provide a mechanism for integrating information from external sources. For example, it allows one to update various critical information based on the current standards. This ensures that the security checks are performed according to the current standards instead of outdated ones, thus avoiding weak implementations.
It is recommended that the Regulator pattern be used for every interface that must choose parameters from multiple available options by comparing them against some benchmark.

Motivation: The prime reason for using this pattern is the unfortunate observation that many of the critical errors in cryptographic implementations are caused by the use of weak cryptographic algorithms, like RC4 or the MD5 hash function. Another example is using obsolete certificate revocation lists (CRLs) to validate revoked certificates. We therefore propose that the implementation of such interfaces incorporate the Regulator pattern, which will update the lists periodically on its own and synchronize them with the currently approved standards (like those published by NIST [14]). This pattern could be used for periodic updates of CRLs and of the list of secure algorithms used for SSL handshake negotiation. We wish to highlight that attacks on implementations using specific parameters are much more frequent than attacks on the algorithms themselves. It is of utmost importance, therefore, to keep the parameter selection up-to-date with standards at run-time instead of compile-time.

Applicability: The Regulator design pattern should be used whenever security depends on performing checks with reference to content published by a trusted authority.

Figure 3.3: General structure of the Regulator pattern - The dark blue arrows indicate the updating of parameters by a regulator, which intermittently retrieves this data from an external source. The dark green arrows indicate updates where the application directly contacts the subject.

Structure: Figure 3.3 shows the high-level idea of how the Regulator design pattern can be used for updating the list of secure algorithms or the list of secure parameters used for encryption.
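The core of the pattern, independent of the delivery model, can be sketched as follows. This is our own illustration: the trusted source is a stub standing in for, e.g., a NIST-style feed of approved algorithms. Note the two properties stated above: the regulator polls the subject (the subject never notifies it), and the regulator never modifies the subject's state.

```python
class TrustedSource:
    """Stand-in for an external authority; regulators never modify it."""
    def __init__(self):
        self._approved = {"SHA-256", "SHA-3"}
        self.version = 1
    def fetch(self):
        return self.version, frozenset(self._approved)
    def withdraw(self, alg):          # only the authority updates its state
        self._approved.discard(alg)
        self.version += 1

class Regulator:
    """Checks the subject for changes itself; the subject is unaware of it."""
    def __init__(self, source: TrustedSource):
        self._source = source
        self._version, self._approved = source.fetch()
    def refresh(self):
        # Called intermittently (e.g. by a timer thread): it is the
        # regulator's responsibility to notice that the subject changed.
        version, approved = self._source.fetch()
        if version != self._version:
            self._version, self._approved = version, approved
    def is_approved(self, alg: str) -> bool:
        return alg in self._approved
```

An application consults regulator.is_approved("SHA-256") before selecting an algorithm, so parameter choices track the standard at run-time rather than being frozen at compile-time.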
We later present detailed examples of the Regulator pattern that can be used in different models of updating the standard information. We now propose three ways of implementing the Regulator pattern. These three implementation choices differ in the situations where they can be applied. We explain these differences below and give an example for each of the implementations.

Figure 3.4: Sequence diagram showing the Push model.

- Push: This implementation is illustrated in Figure 3.4, which corresponds to certificate revocation checks using CRLs. Here, the application stores a local copy of the CRL and uses it to verify whether the certificate provided by the server is valid. The regulator also maintains a copy of the CRL and updates its own copy periodically by downloading CRL updates from the certification authority (CA). The regulator then pushes the updated CRL to the application and replaces the application's local CRL with the updated one. The main advantage of this method is that the application (or client) can check for revocations locally. However, the disadvantage is that the application must store the (potentially large) list locally. This method can also be used in scenarios where the regulator must update the lists from multiple external sources.

Figure 3.5: Sequence diagram showing the Pull model.

- Pull: This implementation is illustrated in Figure 3.5, which corresponds to certificate revocation checks using the OCSP protocol. Here, the application checks the validity of the certificate by sending the certificate to an OCSP server (or to another trusted authority, e.g. a CA). The certificate is validated based on the response from the OCSP server (or CA). Here, the regulator relays the responses between the OCSP server (or CA) and the application. We do not recommend this method in practice, because it adds overhead to the (footnote 4: In the wake of events that triggered mass revocations (e.g.
the Heartbleed vulnerability [124]), CRLs grew by up to two orders of magnitude.) secure channel establishment, and because of the potential privacy concerns related to disclosing to a third party every HTTPS site visited. As a result of these drawbacks, OCSP is used only in rare cases [72].

Figure 3.6: Sequence diagram showing the Selective Pull model.

- Selective Pull: This implementation is similar to the Pull model, but in this case the information is downloaded only under certain conditions. This is illustrated in Figure 3.6, which corresponds to certificate revocation checks using OCSP stapling. Here, the application requests the regulator to validate the certificate. The certificate is accepted as valid based on the time-stamp at which the certificate was provided. If the regulator is not able to verify the certificate's validity locally, then it forwards the certificate to the OCSP server (or CA). The decision regarding accepting the certificate is then taken based on the response from the OCSP server (or CA) and relayed back to the application. This method may be used instead of the Pull method, as it reduces latency by allowing certificates to be cached for a certain period of time.

Beside certificate revocation, these patterns can also be used in other scenarios where information from some external source needs to be integrated transparently, without requiring the application developers to retrieve and interpret the information. For example, the Regulator pattern could be used to update the list of secure algorithms; a trusted third party (e.g. NIST, which already publishes standards for secure algorithms and key lengths [14]) could provide this information in a machine-readable format, suitable for ingestion by our framework. As mentioned in Section 3.3, NIST periodically publishes recommendations about secure cryptographic algorithms to use.
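The selective-pull variant can be sketched as a regulator that answers locally from a time-bounded cache and falls back to the external source only on a miss. The sketch is our own illustration: the OCSP responder is a stub, and the cache lifetime is an arbitrary example value.

```python
import time

class OCSPResponderStub:
    """Stand-in for the CA's OCSP server."""
    def __init__(self, revoked):
        self.revoked = set(revoked)
        self.queries = 0
    def status(self, serial):
        self.queries += 1
        return "revoked" if serial in self.revoked else "good"

class SelectivePullRegulator:
    def __init__(self, responder, max_age=300.0):
        self._responder = responder
        self._max_age = max_age   # how long a cached answer stays fresh
        self._cache = {}          # serial -> (status, fetched_at)
    def validate(self, serial):
        entry = self._cache.get(serial)
        if entry and time.time() - entry[1] < self._max_age:
            return entry[0]       # answered locally: no network round trip
        status = self._responder.status(serial)   # selective fall-back
        self._cache[serial] = (status, time.time())
        return status
```

Repeated checks of the same certificate within the freshness window never touch the network, which is exactly the latency advantage over the plain Pull model.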
Presently, adhering to these recommendations is up to application developers, leading to wide use of old, broken algorithms such as MD5 hashing. Instead, we propose that a trusted authority (such as NIST) maintain a service with a list of all algorithms that are secure at any point in time. Every client can periodically (say, once every three months) update their local list using an automated procedure. The application would hence use only those algorithms that are recently listed as secure by NIST.

Consequences: The Regulator pattern allows application developers to use only algorithms and parameters that are considered secure, and it allows library developers to co