ABSTRACT
Title of dissertation: ANALYZING INTERNET RELIABILITY REMOTELY WITH PROBING-BASED TECHNIQUES
Ramakrishna Padmanabhan
Doctor of Philosophy, 2018
Dissertation directed by: Professor Neil Spring
Department of Computer Science
Internet reliability for home users is increasingly important as a variety of
services that we use migrate to the Internet. Yet, we lack authoritative measures
of residential Internet reliability. Measuring reliability requires the detection of
Internet outage events experienced by home users. But residential Internet out-
ages are rare events. Further, they can affect relatively few users. Thus, detecting
residential Internet outages requires broad and longitudinal measurements of in-
dividual users’ Internet connections. However, such measurements of Internet
reliability are challenging to obtain accurately and at scale.
Probing-based remote outage detection techniques can scale, but their
accuracy is questionable. These techniques detect Internet outages across time as
well as across the IPv4 address space by sending active probes, such as pings
and traceroutes, to users’ IP addresses and using probe responses to infer Internet
connectivity. However, they can infer false outages because their foundational
assumption is sometimes invalid: that the lack of response to an active probe
is indicative of failure. In this dissertation, I show how to use probing-based
techniques to measure residential Internet reliability by defending the following
thesis: It is possible to remotely and accurately detect substantial outages experienced
by any device with a stable public IP address that typically responds to active probes and
use these outages to compare reliability across ISPs, media-types, geographical areas, and
weather conditions.
In the first part of the dissertation, I address the inaccuracy of probing-based
techniques’ detected outages and show how to use probe responses to correctly
detect outages. I illustrate two scenarios where the lack of response to an active
probe is not indicative of failure. In the first scenario, responses are delayed be-
yond the prober’s timeout, leading these techniques to infer packet-loss instead
of delay. In the second scenario, these techniques can falsely infer packet-loss
when the address they are probing gets dynamically reassigned. I examine how
often delayed responses and dynamic reassignment occur across ISPs to quantify
the inaccuracy of these techniques. I show how outages can be inferred correctly
even in networks with dynamic reassignment using complementary datasets that
can reveal whether an address was dynamically reassigned before, during, and
after a detected outage for that address.
In the second part of the dissertation, I motivate why the detection of in-
dividual addresses’ outages is necessary for analyzing residential reliability. An
individual address typically represents one residential customer; therefore, de-
tecting outages for individual addresses can allow capturing even small outages.
Prior probing-based techniques focus upon the detection of edge network out-
ages affecting a substantial set of addresses belonging to a BGP prefix or to a /24
address block. Here, I quantitatively demonstrate the extent to which prior tech-
niques can miss residential outages. I show that even individual address outages
occur rarely in most networks. When multiple simultaneous outages of related
individual addresses occur, there is likely a common underlying cause. With this
insight, I develop and evaluate an approach to find outage events that are statis-
tically unlikely to have occurred independently. I show that the majority of such
events do not affect entire /24 address blocks or BGP prefixes, and are therefore
not likely to be detected by existing techniques which look for outages at these
granularities.
In the final part of the dissertation, I show how to use individual addresses’
outages detected by probing-based techniques to assess Internet reliability across
media-types, geographical areas, and weather conditions. Individual outages are
not direct measures of reliability: they can occur independently because users
disable equipment or can be observed falsely due to dynamic address
renumbering. I use the insight that the statistical change in outage rate in different
challenging environments (e.g., thunderstorms) can quantitatively expose the
“inflation” in actual outages. I show how to study the effect of a challenging
environment upon the reliability of a group of addresses by analyzing the inflation
in outage rate for that group while the environment is present.
This dissertation’s contributions will help achieve comprehensive measure-
ments of Internet reliability that can be used to identify vulnerable networks and
their challenges, inform which enhancements can help networks improve relia-
bility, and evaluate the efficacy of deployed enhancements over time.
ANALYZING INTERNET RELIABILITY REMOTELY
WITH PROBING-BASED TECHNIQUES
by
Ramakrishna Padmanabhan
Dissertation submitted to the Faculty of the Graduate School of the
University of Maryland, College Park in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy
2018
Advisory Committee:
Professor Neil Spring, Chair/Advisor
Professor David Levin
Professor Bobby Bhattacharjee
Professor John Dickerson
Professor Mark Shayman
© Copyright by
Ramakrishna Padmanabhan
2018
Acknowledgments
Throughout the time I have worked on this dissertation, I have had
mentorship, help, and support from many people. To all of them: thank you very
much. I am truly grateful.
I would first like to thank my advisor, Neil Spring. His mentorship has
been invaluable in the making of this dissertation, and on a broader level, in the
making of the researcher that I am today. His passion for research and attention to
detail are qualities that I look up to and have learned to emulate. He has ever been
a bastion of constructive criticism and his feedback has helped tremendously in
honing my research skills. In addition, he has been flexible and accommodative
of my occasionally unusual schedules. I could not have asked for more from an
advisor.
There were several other mentors at the University of Maryland who had a
direct role in shaping this dissertation. I have worked closely with Dave Levin
and have benefitted much from his approach to research, his clear writing style,
his brilliant presentations, his generosity, and his wonderful sense of humor. He
has been a source of immense support and is an inspiration. Aaron Schulman
helped me choose the University of Maryland, introduced me to the problem I
tackled in this dissertation, and importantly, also helped me sustain my research
during difficult times. Aaron’s enthusiasm, curiosity, and work ethic inspired me to
persist with graduate school. I am also very grateful to Bobby Bhattacharjee; his
incisive questions and feedback throughout have helped this dissertation signifi-
cantly and have also helped me grow as a researcher. Ashok Agrawala provided
valuable guidance early on and Atif Memon gave helpful comments during the
thesis proposal. I would also like to thank my committee members, John Dick-
erson and Mark Shayman, for their feedback on the dissertation manuscript and
presentation.
Mentors outside UMD have also provided valuable assistance during dif-
ferent stages of the dissertation. Alberto Dainotti has been instrumental during
the finishing stages of the dissertation and has provided valuable feedback and
support. Arthur Berger is another mentor who has helped considerably during
the latter half of the dissertation; his thoughtfulness, ability to listen, and patience
are qualities that I aim to emulate. I am also grateful to Matthew Luckie, Amogh
Dhamdhere, and kc claffy, for introducing me to CAIDA and for helping me nav-
igate the academic research world outside UMD. I would also like to thank Dave
Plonka for showing me how to approach research with infectious enthusiasm. A
special thank you to Krishna Sivalingam for giving me the opportunity to pursue
my first research problem, and for his considerable help with graduate school
applications.
I am also very grateful to my colleagues and friends, who have made grad-
uate school an enjoyable experience. Aaron, Randy, Lex, Yunus, Greg, Karla, and
Jessy were brilliant colleagues who welcomed me to graduate school and helped
me learn the ropes. Matt Lentz has been an extraordinary colleague and friend,
helping me immensely with my research and presentations, while also being a
participant in some of the more memorable conversations I’ve had. I am also very
grateful to Zhihao Li for his friendship and support; I am especially privileged to
have been his first co-author on a paper. I am also grateful to Philipp Richter for
a delightful paper-writing experience together. James, Youndo, Stephen, Katura,
Richie, thank you all for your support and encouragement, especially during the
thesis proposal and dissertation defense. Thank you also for making the lab a
great place to hang out. A special thank you to Brandi Adams for being a won-
derful friend and for having shared in the journey. Sharron McElroy, thank you
for the conversations and for making the reimbursement process enjoyable. My
housemates and fellow PhD students, Bhaskar Ramasubramaniam, Amit Cha-
van, and Kartik Nayak: I am truly grateful for all the tea, food, help and support.
Anshul Sawant, Manish Purohit, Meethu Malu, and Sudha Rao, thank you for all
the memories over the years. And Kleoniki Vlachou, though your cakes were one
of the highlights of my first couple of years of graduate school, it is your support
and encouragement through the years that have been the true icing on the cake.
I am also grateful to the students I’ve had the privilege to mentor and teach.
Patrick Owen was the first undergraduate student I worked with and I am grati-
fied that he is continuing to work on similar topics. Working with Ramakrishnan
Sundara Raman and Reethika Ramesh was truly a delight; their enthusiasm and
eagerness to learn made for an excellent mentoring experience.
Last, but by no means the least, I would like to thank my family—including
my grandparents, uncles, aunts, cousins, and even my nephews and nieces—
for their unceasing love and encouragement. I am especially grateful to my par-
ents, Padmanabhan Raman and Lalitha Ramakrishnan for teaching me values
like dedication, diligence, and optimism; without these values, I couldn’t have
completed this dissertation. I am also immensely grateful to my parents for al-
ways giving me the freedom to pursue my dreams. Ranjani Padmanabhan, my
cheerful and witty sister, has been ever ready with her support. My wife, Janani
Saikumar, who has also been pursuing her PhD alongside me, is one of the prin-
cipal reasons this dissertation was possible. From providing me with initial en-
couragement to pursue graduate school, to continuous love and support during
the ebbs and flows of graduate life, she has played an immense part.
Table of Contents
Acknowledgements
List of Tables
List of Figures
1 Introduction
1.1 Background: state of the art in measuring residential Internet reliability
1.1.1 Challenges
1.1.2 Prior approaches
1.2 Thesis
1.3 Contributions
2 State of the art in residential Internet outage detection
2.1 Outage detection: an overview
2.2 On-premises outage detection techniques
2.3 Probing-based remote outage detection techniques
2.3.1 Trinocular detects failures of /24 address blocks
2.3.2 Thunderping detects failures of individual addresses during severe weather
2.3.3 Zmap was used to study Internet outages during Hurricane Sandy
2.4 Probing-based techniques can scale but require improved accuracy
2.4.1 Confusing delay with loss
2.4.2 Making false inferences about outages due to dynamic addressing
3 Mitigating false inferences due to early timeout
3.1 Challenges in selecting a timeout for probing techniques
3.1.1 Timeouts used in outage and connectivity studies
3.2 Primary dataset overview
3.2.1 Raw ISI survey data
3.2.2 Matched response latencies are capped at the timeout
3.2.3 Unmatched responses
3.2.3.1 Broadcast responses
3.2.3.2 Duplicate responses
3.3 Recommended Timeout Values
3.3.1 Incorporating unmatched responses
3.3.2 Recommended Timeout Values
3.4 Verification of long ping times
3.4.1 Are high latencies observed by other probing schemes?
3.4.2 Is it a particular survey or vantage point?
3.4.3 Is it ICMP?
3.4.4 Summary
3.5 Why do pings take so long?
3.5.1 Are satellites involved?
3.5.2 Autonomous Systems with the most high latency addresses
3.5.3 Is it the first ping?
3.5.4 Patterns associated with RTTs greater than 100 seconds
3.5.5 Summary
3.6 Conclusion and Discussion
4 Mitigating false inferences due to dynamic addressing
4.1 IP addresses can be proxies for end users
4.2 Probing-based techniques can make false outage inferences due to dynamic addressing
4.3 Dynamic addressing background
4.3.1 Dynamic Host Configuration Protocol
4.3.2 Point-to-Point Protocol
4.3.3 Potential dynamic address change causes
4.4 Related Work
4.5 The RIPE Atlas datasets
4.5.1 RIPE Atlas connection logs dataset
4.5.2 Probe filtering
4.5.3 Connection log entry filtering
4.5.4 k-root ping dataset
4.5.5 SOS-uptime dataset
4.5.6 Associating inter-connection gaps with outage events
4.6 Periodic address changes
4.6.1 Metric to detect periodic address durations
4.6.2 Periodic address changes by geography
4.6.3 Periodic address changes by AS
4.6.3.1 Is periodic renumbering prevalent across all ISPs?
4.6.3.2 Is periodic renumbering geographically correlated?
4.6.4 ISPs that renumber periodically
4.6.4.1 What fraction of probes is periodic?
4.6.4.2 Why are some address durations longer than the period?
4.6.4.3 Are changes synchronized?
4.7 Outage-caused address changes
4.7.1 Filtering falsely inferred power outages
4.7.2 Removing reboots caused by firmware updates
4.7.3 Most outages result in an address change for some ASes
4.7.4 Is there a relationship between outage duration and address changes?
4.8 Does a user’s dynamic address prefix change?
4.9 Using complementary datasets that provide IDs to confirm outages
4.9.1 CDN dataset provides IDs
4.9.2 Confirming outages detected by Thunderping
4.10 Conclusion and Discussion
5 The need for measuring individual address outages
5.1 Background: dependent residential outages can be small
5.1.1 Prior techniques focus upon larger disruptions
5.1.2 The Thunderping dataset yields per-address disruptions
5.2 Detecting dependent disruptions
5.2.1 Finding dependent events in an address aggregate
5.2.1.1 Choosing aggregate sets of IP addresses
5.2.1.2 Calculating the probability of disruption (Pd)
5.2.2 Applying our method to the Thunderping dataset
5.3 Properties of dependent disruptions
5.4 Dependent disruption events across ISPs
5.4.1 Dependent disruptions are more frequent at night for some ISPs
5.4.2 Dependent disruptions can recover together
5.4.3 Dependent disruptions can be multi-ISP
5.4.4 Dependent disruptions may not disrupt entire /24s
5.5 Conclusion
6 Analyzing weather’s effect on Internet Reliability
6.1 Introduction
6.2 Data Set
6.2.1 Dropouts, Defined
6.2.2 Dropouts, collected
6.2.3 Weather, classified
6.2.4 Data, Summarized
6.2.5 Why this data?
6.2.6 Dataset Limitations
6.3 Quantifying weather dropouts
6.3.1 Metric: Inflated probability of dropout
6.3.2 Verifying the metric will not be skewed by common dropouts correlating with weather
6.3.3 Summary
6.4 Weather Analysis
6.4.1 Relative dropout rates
6.4.2 Geographic variation
6.4.3 Continuous weather variables
6.5 Conclusions
7 Future Work
7.1 Tracking devices across IP addresses using IDs on a global scale
7.1.1 Dynamic DNS services
7.1.2 Open DNS resolvers
7.2 Classifying IP addresses
7.3 Identifying outage causes can help orthogonal reliability analyses
8 Conclusion
Bibliography
List of Tables
3.1 Incorporating unmatched responses
3.2 Recommended timeout values
3.3 Details of Zmap scans used in the analyses
3.4 ASes with the most addresses with RTTs greater than 1s
3.5 Continents with the most addresses with RTTs greater than 1s
3.6 ASes with the most addresses with RTTs greater than 100s
3.7 Patterns of latency and loss near high latency responses
4.1 RIPE Atlas Connection log sample
4.2 Filtering RIPE Atlas probes
4.3 Sample k-root ping data during the occurrence of a network outage
4.4 Sample SOS-uptime data during the reboot of a RIPE Atlas probe
4.5 ASes that renumber periodically
4.6 Probes likely to change addresses upon network outages are also likely to change addresses upon power outages.
4.7 Address changes across prefixes
4.8 Confirming Thunderping outages across link types
5.1 Dmin values for varying values of N and Pd
5.2 Dependent disruption events for different values of number of addresses that can potentially fail (N) and probability of disruption (Pd) from the Thunderping dataset
5.3 Dependent recovery
6.1 Summary of data set for large ISPs classified by link type
List of Figures
3.1 Survey-detected response latencies are capped at the timeout
3.2 Broadcast addresses that solicit responses in Zmap scans
3.3 Broadcast addresses that solicit responses in ISI surveys
3.4 Potential incorrect match caused by a broadcast response
3.5 Some addresses send multiple responses to a single echo request
3.6 Filtering unexpected responses is effective
3.7 Distribution of RTTs for all Zmap scans performed in 2015
3.8 Confirmation of high latency with Scamper
3.9 A longitudinal view of RTTs in ISI surveys
3.10 High RTTs are observed across ICMP, UDP, and TCP
3.11 High RTTs are not solely a property of satellite-only ISPs
3.12 The first response often has the largest RTT
3.13 Difference between initial RTT and observed minimum
3.14 Percentage of addresses in a /24 prefix showing a drop from the initial to the maximum
4.1 Cumulative distribution of total time fraction by continent
4.2 Cumulative distribution of total time fraction by AS
4.3 Cumulative distribution of total time fraction for German ASes
4.4 Periodic address changes in Orange
4.5 Periodic address changes in Deutsche Telekom
4.6 Number of unique RIPE Atlas probes that rebooted on each day of the year
4.7 Distribution of the conditional probability that an address change occurred given a network outage
4.8 Distribution of the conditional probability that an address change occurred given a power outage
4.9 Relationship between outage duration and address changes
4.10 Outage duration vs. probability of address change for addresses from various link types
5.1 Filtering lossy addresses
5.2 Potential N and Pd values in the Thunderping dataset
5.3 Distribution of the probability that detected dependent disruption events could have occurred independently
5.4 Dmin vs observed disruptions, for each detected dependent disruption event
5.5 Dependent disruption events detected per ISP
5.6 Dependent disruption events that began in each hour of the week
5.7 Durations of dependent disruption events
5.8 Multi-ISP dependent disruption events over time
5.9 Multi-ISP dependent disruption events during Hurricane Irma
5.10 Minimum actual disrupted addresses in a /24 vs. responsive addresses in a /24
5.11 Minimum actual disrupted addresses in a /24 vs. responsive addresses in a /24 for Comcast, Qwest, and Viasat
6.1 Weather does not occur most often during hours of the week when there are an inflated number of dropouts
6.2 Inflation in hourly probability of dropout for various link types across weather conditions
6.3 Inflation in hourly dropout probability by U.S. state for thunderstorm, rain, and snow
6.4 Inflation in hourly dropout probability by U.S. state for various weather conditions
6.5 Hourly dropout probability of hosts (all link types) as a function of the number of hours the hosts’ nearest U.S. airport received snow
6.6 Inflation in hourly dropout probability as a function of wind speed across link types
6.7 Inflation in hourly dropout probability as a function of temperature across link types
6.8 Inflation in hourly dropout probability as a function of different types of precipitation across link types
7.1 IP address renumbering in dynamic DNS domains over a week
Chapter 1: Introduction
Residential Internet reliability is increasingly important as a variety of services
that we use migrate to the Internet. Internet users today can communicate with
each other, perform financial transactions, plan their travel, and even obtain crit-
ical services such as health monitoring [1, 2] and emergency services [3, 4] from
their homes. Our dependence upon the Internet is rising further as more of our
home devices become connected. Consequently, reliable residential Internet con-
nectivity is critical.
Broad and longitudinal measurements of users’ Internet reliability can iden-
tify vulnerable networks, these networks’ challenges, and potential enhancements.
For instance, weather conditions such as thunderstorms, rain, and gales can
adversely affect Internet reliability [5]. Measurements can inform which areas are
particularly vulnerable to weather conditions. Comparing measurements against
other areas with similar weather conditions can provide insights on potential en-
hancements: for example, areas may be less vulnerable to gales where Internet
cables run underground. Once an enhancement is deployed, measurements can
reveal if the enhancement has resulted in improved Internet reliability.
The inferences from residential Internet reliability measurements can bene-
fit various stakeholders, including policymakers, Internet Service Providers (ISPs),
and residential users themselves. Policymakers around the world have begun
efforts to measure Internet reliability [6–9], since such measurements can drive
incentives and policies to improve reliability. ISPs can benefit from these mea-
surements in multiple ways. Since even large ISPs rely upon their users to in-
form them of network connectivity issues [10], they may be unaware of problems
in their own networks; these measurements can help ISPs recognize underlying
problems. Further, ISPs can learn about the reliability of their competitors. Mea-
surements of Internet reliability will also benefit residential users, since they can
take into account the reliability of ISPs in their geographic region when purchas-
ing Internet services.
However, we currently lack authoritative measures of residential Internet
reliability, due to several challenges associated with obtaining such measures.
1.1 Background: state of the art in measuring residential Internet
reliability
Intuitively, a reliable Internet connection is one that works continuously. In other
words, it experiences no outages.
Measuring Internet reliability, therefore, necessitates measuring Internet out-
ages and then using measured outages in a metric that represents some property
of outages. At a high level, an Internet outage is an event that prevents users
from communicating over the Internet. Since we expect outage events to be rare,
detecting them requires monitoring a broad set of residential users over long pe-
riods. After detecting outages over time for a group of residential addresses—say
addresses belonging to the same ISP or geographic region—the detected outages
can be used to calculate reliability metrics. An example metric is the outage rate,
the number of outages occurring over time for a group of addresses. These met-
rics can be used to compare different groups of addresses and to identify groups
with particularly low reliability.
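As a minimal sketch of the outage rate metric, suppose each detected outage is recorded as a (group, address) pair and each group was monitored for a known number of address-days. The record shapes, the per-address-day normalization, and the function name `outage_rate` are illustrative assumptions, not definitions taken from this dissertation.

```python
from collections import Counter


def outage_rate(outages, address_days):
    """Return outages per address-day for each monitored group.

    outages: iterable of (group, address) pairs, one per detected outage.
    address_days: dict mapping group -> total address-days monitored.
    Both input shapes are hypothetical and chosen only for illustration.
    """
    counts = Counter(group for group, _address in outages)
    return {
        group: counts[group] / days
        for group, days in address_days.items()
        if days > 0  # skip groups that were never monitored
    }


# Hypothetical data: two ISPs, each monitored for 1000 address-days.
outages = [
    ("isp_a", "192.0.2.1"),
    ("isp_a", "192.0.2.7"),
    ("isp_b", "198.51.100.3"),
]
rates = outage_rate(outages, {"isp_a": 1000, "isp_b": 1000})
```

Normalizing by monitored address-days, rather than reporting raw counts, keeps groups of different sizes and observation periods comparable.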
1.1.1 Challenges
Detecting residential Internet outages is challenging. The first challenge is the
scale of residential users on the Internet: there are millions of residential Inter-
net connections to monitor. Further, residential Internet outages can vary in the
number of affected users. They can affect entire cities during large power out-
ages. They can also affect just an individual house if a fallen tree branch damages
the last-mile link between the house and the ISP. Another challenge is the hetero-
geneity of residential Internet connections. Some connections are cable connec-
tions, where the home router typically has a stable public IPv4 address. Others
are DSL connections where the address assigned to the home router can change
every 24 hours. Residential Internet connections can also use satellite links, which
are prone to higher latencies.
Designing an outage detection system that can measure users broadly and
yet remain accurate across diverse heterogeneous Internet connections remains a
challenge.
1.1.2 Prior approaches
Prior approaches trade off either outage detection breadth or accuracy.
Edge network outage detection techniques detect outages broadly but focus
upon detecting outages that collectively affect a substantial number of addresses
in a group. The group may comprise addresses belonging to the same /24 address
block [11, 12], BGP prefix [13], or country [14]. Techniques seek such disrup-
tion events because individually, each large disruption has impact and their size
makes them easier to confirm, e.g., with operators. However, residential Inter-
net outages may be limited to a small neighborhood or apartment block; these
techniques are likely to miss such events. Thus, they trade off outage detection
accuracy for breadth.
On-premises techniques, such as RIPE Atlas [15], SamKnows [16], BISmark [17],
and DIMES [18] measure diverse aspects of users’ Internet connections,
including connectivity problems, but measure relatively few users. These techniques
deploy dedicated hardware or software at user premises that continuously
conducts ping, traceroute, DNS, and other measurements; some of these measurements
can be used to infer Internet connectivity problems.
Whereas on-premises techniques have fundamental scaling difficulties ow-
ing to manufacturing and deployment costs, hundreds of millions of IP addresses
respond to active probes [19]. Since many residences have at least one device with
a public IP address [20] (typically the home router), these IP addresses can be
probed remotely, from vantage points under researcher control, to measure their
connectivity. Thunderping [5] and Trinocular [11] adopt this approach to out-
age detection: they focus upon measuring only connectivity but do so for many
users. Since these techniques can send probes remotely from servers under their
control, without requiring any user involvement, they are able to detect outages
across time as well as across the IPv4 address space.
Though capable of measuring Internet outages broadly, probing-based re-
mote outage detection techniques can make false inferences about outages when
some scenarios occur [19, 21]. The likelihood of occurrence of these scenarios
varies across geographic regions, ISPs, and media type. Analyzing outages in
the presence of these confounding factors requires broad measurements of these
factors in turn.
1.2 Thesis
The goal of this dissertation is to provide broad, longitudinal, and accurate mea-
surements of Internet reliability across ISPs, media-types, and geographic loca-
tions in a variety of circumstances. I work towards this goal using the probing-
based technique due to its ability to scale. In the rest of the dissertation, I illus-
trate my approaches to mitigate probing-based techniques’ problems in measur-
ing residential Internet reliability by defending the following thesis:
It is possible to remotely and accurately detect substantial outages experienced by
any device with a stable public IP address that typically responds to active probes and
use these outages to compare reliability across ISPs, media-types, geographical areas, and
weather conditions.
• Device with a stable public IP address: This is a device connected to the Internet,
like a home-router, to which an ISP has assigned a public IP address such that
the assignment is either static, or dynamic in a manner that allows the duration
of dynamic assignment to be estimated.
• Substantial outage: I define a substantial outage to be an event where a device
with an Internet connection is unable to send or receive any packets for at least
11 minutes. Such outages are likely to inconvenience residential users.
• Accuracy of outage detection: An outage detection technique is accurate when it
correctly identifies every substantial outage event experienced by an Internet-
connected-device. There are no time-periods when the address experiences a
substantial outage but it goes undetected (false negatives). Similarly, there are
no time-periods classified as outages when the destination address is able to
receive packets from the Internet (false positives).
• Reliability: I measure reliability using the outage rate metric, which I define as
the raw count of outage events over measured time.
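As a minimal sketch of the last two definitions, the outage rate can be computed by counting only outages that last at least 11 minutes and dividing by the measured time. The `Outage` record, the function names, and the choice of minutes as the time unit are assumptions made for illustration; they are not taken from the measurement systems discussed in this dissertation.

```python
from dataclasses import dataclass

SUBSTANTIAL_OUTAGE_MINUTES = 11  # threshold from the definition of a substantial outage


@dataclass
class Outage:
    # Hypothetical record: times are minutes since the start of measurement.
    start_min: float
    end_min: float


def substantial_outages(outages):
    """Keep only outages that last at least 11 minutes."""
    return [o for o in outages
            if o.end_min - o.start_min >= SUBSTANTIAL_OUTAGE_MINUTES]


def outage_rate(outages, measured_minutes):
    """Raw count of substantial outage events divided by measured time."""
    if measured_minutes <= 0:
        raise ValueError("measured time must be positive")
    return len(substantial_outages(outages)) / measured_minutes
```

Here the rate is expressed in outage events per measured minute; any other time unit works as long as it is used consistently across the groups being compared.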
1.3 Contributions
To demonstrate the thesis, I measure two confounding factors, high latency
(Chapter 3) and dynamic address reassignment (Chapter 4), that can lead probing-based
outage detection techniques to make false outage inferences. In Chapter 5, I mo-
tivate the detection of individual addresses’ outages. I go on to show how to
measure Internet reliability in the presence of inference errors and unrelated out-
ages in Chapter 6. This dissertation is organized as follows:
Chapter 2: State of the art in residential Internet outage detection
I provide background and place existing work in Internet outage detection
in context. I describe the challenges that probing-based remote outage detection
techniques will need to address to measure residential Internet reliability. These
techniques study outages by sending active probes (such as ping’s echo requests)
and use probe responses to infer outages. They assume that a response to an
active probe indicates a working path to the probed user device and that lack of
response is indicative of failure. I illustrate two scenarios where this assumption
can be invalid, leading to potentially false outage inferences.
Chapter 3: Mitigating false inferences due to early timeout
I investigate the prevalence of delayed probe responses due to early time-
out. The lack of response to an active probe isn’t always indicative of loss; for
example, when responses are delayed beyond the prober’s timeout, the response
eventually arrives but the prober would never see the response because it timed
out too early. I report how commonly responses are delayed beyond timeouts in
networks around the world and use these measurements to discuss techniques
that would mitigate this problem.
Chapter 4: Mitigating false inferences due to dynamic addressing
I investigate how dynamic addressing can lead remote probing-based out-
age detection techniques to make false inferences about outages and techniques
to mitigate these false inferences. I measure the frequency and patterns in dy-
namic address reassignment for ISPs across the world. I also introduce a tech-
nique using a complementary dataset to determine whether an outage detected
for an address by a probing-based system is a false outage due to dynamic reas-
signment.
Chapter 5: The need for measuring individual address outages
I motivate the need to study individual address outages by showing that
individual address outage measurements can be used to find outage events that
are statistically unlikely to have occurred independently, and that many of these
events would not be detected by prior work. I describe how to use simultaneous
outages of individual addresses related to each other, by geography and ISP, to
identify outages that are highly unlikely to have occurred independently, and are
therefore likely to share a common underlying cause.
Chapter 6: Analyzing weather’s effect on Internet Reliability
I show how to measure and compare the reliability of groups of addresses—
like addresses belonging to the same ISP, media-type, geographic region—when
facing challenging environments. I consider one class of challenging environ-
ments that residential Internet connections can face: severe weather conditions.
I show how to use the inflation in outage rate to measure the effect of different
classes of weather upon various groups of addresses.
Chapter 7: Future Work
I describe directions for future work in measuring residential reliability us-
ing probing-based techniques.
Chapter 2: State of the art in residential Internet outage detection
In this chapter, I begin with an overview of Internet outage measurement with a
focus upon residential outage measurement. Next, I discuss probing-based tech-
niques to detect outages remotely in detail and show their potential to measure
residential users at scale. Then I illustrate scenarios where these techniques could
make false inferences about outages.
2.1 Outage detection: an overview
The architects of the Internet predicted that network outages could occur, and de-
signed the Internet to have the ability to route around outages [22]. As predicted,
a variety of factors cause outages in the Internet, including optical fiber cuts [23],
routing and infrastructure failures [24, 25], and hurricanes [5].
Large Internet outages that can affect packets from thousands of Internet
hosts have received attention from the research community [11, 13, 26–36].
Outages occurring in the Internet’s core can cause Internet path failures; researchers
have investigated transient Internet path failures caused by route changes [33–
36] and longer path failures caused by infrastructure device outages [13, 27–32].
Dainotti et al. [14] observe Internet Background Radiation traffic sent to IPv4
darknets to detect outages affecting entire countries.
Another class of techniques detects outages at the Internet’s edge, for net-
work prefixes or address blocks, but does not target outages of individual users’
Internet connections. Hubble studies reachability problems affecting BGP pre-
fixes [13]. Trinocular detects outages affecting /24 address blocks. Richter et al. [12]
use the observation point of a large CDN to detect periods of reduced activity
from /24 address blocks consistent with outages. CAIDA’s IODA system [37]
detects outages affecting countries, ASNs, and geographic provinces using three
complementary datasets: BGP updates from RouteViews [38] and RIPE RIS [39],
active probing data obtained with CAIDA’s implementation of the Trinocular
methodology, and IBR data using the technique introduced by Dainotti et al. [14].
However, outages that affect individual users have received comparatively
less attention [5, 40–42]. In the rest of this chapter, I classify these efforts to detect
outages into on-premises outage detection techniques and remote probing-based
outage detection techniques, and discuss their approaches and challenges.
2.2 On-premises outage detection techniques
Recognizing the need for long term measurements of residential Internet perfor-
mance, policymakers such as the FCC from the U.S., and Ofcom from the U.K.
have deployed the SamKnows hardware platform [16] inside residences to mea-
sure residential Internet connections continuously by performing active and pas-
sive measurements and reporting their results to users, ISPs, and policy makers.
RIPE NCC, the European RIR, has pioneered the RIPE Atlas [15] project and Sun-
daresan et al. the BISmark project [17], which also study user connectivity using
dedicated hardware measurement devices on user premises. On-premises tech-
niques can also use measurements from software deployed on user machines: the
DIMES project [18] and DASU [43] are two notable examples.
Hardware-based approaches can offer accurate reports about Internet con-
nectivity since the hardware devices are designed to make measurements contin-
uously as long as they are powered. These techniques have the ability to perform
a range of other measurements such as DNS anycast tests that can identify which
instance of a root-server is closest, and even passive measurement of the web-
sites that users access. However, these approaches are fundamentally limited in
scale since their hardware is expensive, distributing the hardware to users is time
consuming, and convincing users to keep their hardware running is challenging.
For example, the RIPE Atlas project, which began in 2010 and has been continu-
ously expanding across the world, has fewer than 10,000 probes that are currently
making measurements, out of more than 15,000 distributed probes.
While some of these costs can be offset by utilizing measurements from de-
ployed software on user systems [18, 43, 44] or using a combination of hardware
and software measurements [45], deploying software widely remains challeng-
ing. Separating user behavior, such as turning off their laptops, from Internet
outage events presents additional challenges for these techniques [44].
2.3 Probing-based remote outage detection techniques
Probing-based remote outage detection techniques can detect connectivity problems
remotely through active probing from servers under researcher control. Though
this approach will prevent certain types of measurements, such as DNS anycast
tests, it can measure Internet connectivity for individual users at scale. However,
existing techniques can infer false outages in some scenarios as I illustrate next.
Probing-based remote outage detection techniques study connectivity prob-
lems by sending active probes (such as ping’s echo requests) and use probe re-
sponses to infer connectivity problems. For example, an echo-response from the
end-host indicates that its network connection is working. If a previously re-
sponsive destination ceases to respond to probes, current techniques infer that
the destination could be experiencing connectivity problems. Thunderping [5],
Trinocular [11], and Zmap [46] have used this technique to detect outages, albeit
at different scales. I discuss each approach in detail next.
2.3.1 Trinocular detects failures of /24 address blocks
Trinocular pings addresses in 4M /24 address blocks and uses the responses
to detect Internet outages affecting entire blocks. Using historical data from the
ISI census [47], it models the responsiveness of blocks and finds subsets of
addresses within each block that are likely to respond to pings. In each 11-minute
round, the system pings a few of these addresses, selected at random from each
block. Trinocular then employs Bayesian inference to reason about responses
from blocks. When a block’s responsiveness is lower than expected, Trinocu-
lar probes the block at a faster rate and eventually detects an outage when the
follow-up probes also indicate the block’s lack of Internet connectivity.
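The Bayesian reasoning described above can be illustrated with a single belief update per probe. This is a minimal sketch under stated assumptions, not Trinocular's implementation: the function name, the `availability` parameter (the fraction of probed addresses in a block expected to answer when the block is up), and the small `false_positive` probability of seeing a response from a block that is down are all hypothetical.

```python
def update_belief(belief_up, responded, availability, false_positive=0.01):
    """Return the posterior probability that a block is up after one probe.

    belief_up:      prior probability that the block is up
    responded:      True if the probe drew a response
    availability:   assumed P(response | block up)
    false_positive: assumed P(response | block down)
    """
    if responded:
        like_up, like_down = availability, false_positive
    else:
        like_up, like_down = 1.0 - availability, 1.0 - false_positive
    numerator = like_up * belief_up
    return numerator / (numerator + like_down * (1.0 - belief_up))
```

With these assumed likelihoods, one missing response lowers the belief only moderately when `availability` is low, which is why several follow-up probes are needed before an outage can be declared with confidence.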
2.3.2 Thunderping detects failures of individual addresses during
severe weather
Thunderping pings sampled addresses from multiple ISPs in geographic areas in
the United States. Originally designed to evaluate how weather affects Internet
outages, the system uses PlanetLab vantage points to ping 100 IPv4 addresses
from multiple ISPs in U.S. counties with active weather alerts. Each address is
pinged from multiple PlanetLab vantage points (at least 3) every 11 minutes, and
addresses in a county are pinged six hours before, during, and after a weather
alert for that county.
2.3.3 Zmap was used to study Internet outages during Hurricane
Sandy
Zmap is an active probing technique designed to send packets of a specified type
(such as ICMP echo) to all IPv4 addresses at very fast speeds (under an hour in
2013 [46], under 5 minutes today [48]). A key to these speeds is that the Zmap tool
chooses to not store state at the prober; instead, response packets are matched
with sent ones by encoding destination-specific data in the sent packets. By us-
ing cyclic generators, Zmap probes destination addresses in a random order, re-
ducing probing burden on individual ISPs. However, Zmap’s decision to not
store state comes with a trade-off: retransmitting a probe upon detecting a
lost probe is difficult. The Zmap tool was used to detect Internet outages during
Hurricane Sandy [46].
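The two ideas above, visiting targets in a scattered order with a cyclic generator and validating responses without per-probe state, can be sketched as follows. This is an illustration under assumptions, not Zmap's code: the small prime 257 with primitive root 3 stands in for the much larger group a full IPv4 scan would need, and the HMAC-based tag, the function names, and the 32-bit tag width are hypothetical choices.

```python
import hashlib
import hmac


def cyclic_order(prime, generator, start=1):
    """Yield every integer in 1..prime-1 exactly once.

    Requires `generator` to be a primitive root modulo `prime`; the only
    state kept is the current value.
    """
    value = start
    for _ in range(prime - 1):
        yield value
        value = (value * generator) % prime


def probe_tag(secret, dst_ip):
    """Derive a 32-bit tag from the destination, to be carried in the probe."""
    digest = hmac.new(secret, dst_ip.encode(), hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def response_is_valid(secret, src_ip, echoed_tag):
    """Recompute the tag from the responder's address instead of looking up state."""
    return probe_tag(secret, src_ip) == echoed_tag
```

For example, `list(cyclic_order(257, 3))` begins 1, 3, 9, 27 and covers all 256 values before repeating. Because nothing is remembered about individual probes, the prober has no record of which ones went unanswered, which is the retransmission difficulty noted above.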
2.4 Probing-based techniques can scale but require improved accuracy
Since probing-based techniques send probes from machines under researcher control,
they control the number of addresses they probe and how frequently
they probe. The Zmap technique has demonstrated that it is possible to
send a ping to the entire IPv4 address space in less than five minutes [48].
However, probing-based remote outage detection techniques can infer false
outages as a consequence of their foundational assumption: that a response to
an active probe indicates a working path to the probed IP address and that lack
of response is indicative of failure. Current techniques can make false positive
inferences about outages in the following scenarios:
2.4.1 Confusing delay with loss
Traditionally, active-probe-based approaches, including Thunderping [5] and
Trinocular [11], time out probes after a few seconds. Responses
that arrive after the timeout will be reported as lost. When this happens,
existing techniques would infer loss though the responses are in fact merely de-
layed. Chapter 3 presents a measurement study on probe response latencies in
networks around the world and discusses approaches to disambiguate delayed
probes from lost probes.
2.4.2 Making false inferences about outages due to dynamic addressing
Consider an IP address that was previously responsive. If the host to which that
IP address was assigned changed its IP address as a result of dynamic addressing,
and if the probed IP address is not reassigned to any host, then echo responses
will cease to arrive. Existing techniques would thus infer false probe-loss and
consequently, false outages. Consider an alternate scenario where the probed IP
address has an outage. Suppose that at some point during the outage, the IP
address is reassigned to some other end-host which responds to probes. Existing
techniques would take the arrival of responses to signal the end of the outage
and would thus infer prematurely that the outage had ended. I address how to mitigate
false inferences due to dynamic address reassignment in Chapter 4.
Chapter 3: Mitigating false inferences due to early timeout
In this chapter, I begin by describing how probe responses delayed beyond time-
outs used by current probing-based techniques can lead to false probe-loss infer-
ences, and thereby to false outage inferences.
Next, I describe work with colleagues that measured how frequently re-
sponses to active probes are delayed beyond timeouts set by existing approaches.
We began by studying ping latencies from Internet-wide surveys [47] conducted
by ISI, including 9.64 billion ICMP Echo Responses from 4 million different IP
addresses in 2015, and identified addresses that are particularly likely to be sub-
ject to high delay. We then verified the high latencies by repeating measure-
ments using other probing techniques, comparing the statistics of various sur-
veys, and investigating high-latency behavior of ICMP compared to UDP and
TCP. Finally, we explained these distributions by isolating satellite links, consid-
ering sequences of latencies at a higher sampling rate, and classifying a complete
sample of the Internet address space through a modified Zmap client. These re-
sults are reproduced from our published work [19].
Using these results, I discuss how probing-based outage detection tech-
niques can mitigate false outage inferences caused by delayed responses.
3.1 Challenges in selecting a timeout for probing techniques
Conventional wisdom suggests that active probes on the Internet should time-
out after a few seconds. The belief is that after a few seconds there is a very
small chance that a probe and response will still exist in the network. Once a
probe times out, the prober can free the state associated with the probe, thereby
reclaiming memory.
Conventional wisdom also suggests that a single timed out probe is insuffi-
cient to reason about end-host failures, due to potential random loss on the Inter-
net. For most probing systems, any timed out active probes are followed up with
retransmissions to increase the confidence that a lack of response is due to an
outage event and not due to random loss on the Internet. These followup probes
will also have a timeout that is generally the same as the first attempt.
Setting correct timeouts is critical for probing-based remote outage detec-
tion techniques. These techniques infer outages based upon lost probes and probe
response loss is dependent upon the prober’s timeout. Additionally, since probe
timeouts trigger followup probes, setting appropriate timeouts is vital to these
techniques.
However, choosing an appropriate timeout is challenging. Selecting a time-
out value that is too low will ignore delayed responses and might add to conges-
tion by performing retransmissions to an already congested host. Timeout values
that are too high will delay retransmissions that can confirm an outage. In addi-
tion, too-high timeouts increase the amount of state that needs to be maintained
at a prober, since every probe will need to be stored until either the probe times
out, or the response arrives.
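The state cost of a long timeout can be estimated with a back-of-the-envelope bound: at a constant probing rate, the prober holds at most rate times timeout unanswered probes at once. The function name and the example rate below are hypothetical and are not drawn from any of the systems discussed.

```python
def max_outstanding_probes(probes_per_second, timeout_seconds):
    """Upper bound on probes held in memory while awaiting a response or a timeout."""
    return probes_per_second * timeout_seconds


# Hypothetical prober sending 10,000 probes per second:
short_timeout_state = max_outstanding_probes(10_000, 3)   # 30,000 probes
long_timeout_state = max_outstanding_probes(10_000, 60)   # 600,000 probes
```

Raising the timeout from 3 to 60 seconds multiplies the worst-case state by 20 in this example, which is the memory pressure that the trade-off above refers to.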
3.1.1 Timeouts used in outage and connectivity studies
Outage detection systems such as Trinocular [11] and Thunderping [5] tend to
use a 3 second timeout for active probes because it is the default TCP SYN/ACK
timeout [49]. Neither technique infers an outage if a single response is delayed
beyond the timeout, since they send follow-up probes to confirm suspected out-
ages. However, if a series of responses are delayed beyond the timeout, both
techniques can potentially infer false probe-loss and therefore, false outages.
Internet performance monitoring systems use a wide range of probe time-
outs. On the shorter side, iPlane [50] and Hubble [13] send ICMP echo requests
with a 2 second timeout. iPlane declares a host unresponsive after one failed
retransmission. Hubble waits two minutes after a failed probe then retransmits
probes six times and finally declares reachability with traceroutes. On the longer
side, Feamster et al. [51] used a one hour timeout after each probe. However, they
chose a long timeout to avoid errors due to clock drift between their probing and
probed hosts; they did not do so to account for links that have excessive delays.
PlanetSeer [52] assumed that four consecutive TCP timeouts (3.2–16 seconds) indicate
a path anomaly.
It is especially important for connectivity measurements from probing hard-
ware placed inside networks to have timeouts because of the limited memory in
the probing hardware. The RIPE Atlas [15] probing hardware sends continu-
ous pings to various hosts on the Internet to observe connectivity. The timeout
for their ICMP echo requests is 1 second [53]. The SamKnows probing hardware
uses a 3 second timeout for ICMP echo requests sent during loaded intervals [16].
We started this study with the expectation that these timeout values might
need minor adjustment to account for large buffers in times of congestion; what
we found was quite different.
3.2 Primary dataset overview
In this section, we describe the ISI survey dataset we use for our analysis of
ping latency. We perform a preliminary analysis of ping latency and find that
the dataset contains different types of responses that should (or should not) be
matched to identify high-latency responses. Finally, we describe techniques to
remove responses that could induce errors in the latency analysis.
3.2.1 Raw ISI survey data
ISI has conducted Internet wide surveys [47] since 2006. Precise details can be
found in Heidemann et al. [47], and technical details of the data format online [54],
but we present a brief overview here.
Each survey includes pings sent to approximately 24,000 /24 address blocks,
meant to represent 1% of all allocated IPv4 address space. Once an address block
is included, ICMP echo request probes are sent to all 256 addresses in the selected
/24 address blocks once every 11 minutes, typically for two weeks. The blocks
included in each survey consist of four classes, including blocks that were chosen
in 2006 and probed ever since, as well as samples of blocks that were responsive
in the last census—another ISI project that probes the entire address space, but
less frequently. However, we treat the union of these classes together.
We use data from 103 surveys taken between April 2006 and February 2015,
and performed initial studies based on 2011–2013 data, but focus on the most
recent of them, in January and February of 2015, for data quality and timeliness.
The dataset consists of all echo requests that were sent as part of the surveys
in this period, as well as all echo responses that were received. Of particular
importance is that echo responses received within, typically, three seconds of an
echo request to the same address are matched into a single record and given a
round-trip measurement precise to microseconds. Should an echo response take
four seconds to arrive, a “timeout” record is associated with the probe,
and an “unmatched” record is associated with the response. These two
packets have timestamps precise only to seconds. The dataset also includes ICMP
error responses (e.g., “host unreachable”); we ignore all probes associated with
such responses since the latency of ICMP error responses is not relevant.
In later sections, we will complement this dataset with results from Zmap [46]
and additional experiments including more frequent probing with Scamper [55]
and Scriptroute [56].
[Figure 3.1 plot: x-axis RTT (s); y-axis CDF; one curve each for the 50th, 80th, 90th, 95th, 98th, and 99th percentiles.]
Figure 3.1: CDF of percentile latency of survey-detected responses per IP address: Each
point represents an IP address and each curve represents the percentile from that IP ad-
dress’s response latencies. The slope of the latency percentiles increases around the 3
second mark, suggesting that ISI’s prober timed out responses that arrived after 3 sec-
onds.
3.2.2 Matched response latencies are capped at the timeout
In this section, we present the latencies we would observe when considering only
those responses that were matched to requests because they arrived within the
timeout. We call these responses survey-detected responses.
We aggregate round trip time measurements in terms of the distribution of
latency values per IP address, focusing on characteristic values: the median,
80th, 90th, 95th, 98th, and 99th percentile latencies. That is, we attempt to treat
each IP address equally, rather than treat each ping measurement equally. This
aggregation ensures that well-connected hosts that reply reliably are not over-
represented relative to hosts that reply infrequently.
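The per-address aggregation can be sketched as follows: compute each address's latency percentiles first, then build a CDF across addresses for one chosen percentile. This is a minimal illustration, assuming input as `(ip, rtt_seconds)` pairs and a nearest-rank percentile definition; the function names are hypothetical and the percentile method used for Figure 3.1 is not stated in the text.

```python
import math
from collections import defaultdict


def nearest_rank_percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending, non-empty list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def per_address_percentiles(samples, percentiles=(50, 80, 90, 95, 98, 99)):
    """Map each IP address to its own latency percentiles."""
    by_ip = defaultdict(list)
    for ip, rtt in samples:
        by_ip[ip].append(rtt)
    result = {}
    for ip, rtts in by_ip.items():
        rtts.sort()
        result[ip] = {p: nearest_rank_percentile(rtts, p) for p in percentiles}
    return result


def cdf_across_addresses(per_ip, pct):
    """CDF points (latency, fraction of addresses) for one percentile curve."""
    values = sorted(stats[pct] for stats in per_ip.values())
    n = len(values)
    return [(v, (i + 1) / n) for i, v in enumerate(values)]
```

An address with ten samples and an address with two samples each contribute exactly one point to every curve, which is what keeps frequently responding hosts from dominating the distribution.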
Taking ISI survey datasets from 2011–2013 together, we show a CDF of these
percentile values considering only survey-detected responses in Figure 3.1. Taken
literally, 95% of echo replies from 95% of addresses will arrive in less than 2.85
seconds. However, it is apparent that the distribution is clipped at the 3 second
mark, although a few responses were matched even after 7 seconds.
We observe three broad phases in this graph: (1) the lower 40% of addresses
show a reasonably tight distribution in which the 99th percentile stays close to the
98th; (2) the next 50% in which the median remains low but the higher percentiles
increase; and (3) the top 10% where the median rises above 0.5 seconds.
3.2.3 Unmatched responses
If a probe takes more than three seconds to solicit a response, it appears as if the
probe timed-out and the response was unsolicited or unmatched. Since it appears
from Figure 3.1 that three seconds is short enough that it is altering the distribu-
tion of round trip times, we are interested in matching these echo responses to
construct the complete distribution of round trip times.
Matching these responses to find delayed responses is not a simple matter,
however. In particular, we find two causes of unexpected responses that should
not yield samples of round trip times: unmatched responses solicited by echo
requests sent to broadcast addresses and apparent denial of service responses.
We match a delayed response with its corresponding request as follows:
Given an unmatched response having a source IP address, we look for the last
request sent to that IP address. If the last request timed out and has not been
matched, the latency is then the difference between the timestamp of the response
and the timestamp of the request. ISI recorded the timestamp of unmatched re-
sponses to a 1 second precision, thus the latencies of inferred delayed responses
are precise only to a second.
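The source-address matching rule above can be sketched as a single pass over time-ordered records. The record layout is an assumption made for illustration and is not the ISI data format: each event is a `(kind, timestamp, ip)` tuple, where `"timeout"` and `"matched"` describe a request sent to `ip` and `"unmatched"` describes a response received from `ip`.

```python
def match_delayed_responses(events):
    """Infer latencies of delayed responses from time-ordered events.

    Returns a list of (ip, latency_seconds) pairs, one for each unmatched
    response whose most recent request to the same address timed out and
    has not already been matched.
    """
    last_request = {}  # ip -> {"ts": request time, "open": timed out and unmatched}
    latencies = []
    for kind, ts, ip in events:
        if kind == "timeout":
            last_request[ip] = {"ts": ts, "open": True}
        elif kind == "matched":
            last_request[ip] = {"ts": ts, "open": False}
        elif kind == "unmatched":
            req = last_request.get(ip)
            if req is not None and req["open"]:
                latencies.append((ip, ts - req["ts"]))
                req["open"] = False  # later responses cannot reuse this request
    return latencies
```

Only the first unmatched response after a timed-out request yields a latency sample; a second response from the same address, or one that follows a request already matched within the timeout, is left unmatched.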
The presence of unexpected responses can lead to the inference of incor-
rect latencies for delayed responses using this technique: not all unexpected re-
sponses should be matched by source address. We thus develop filters to remove
unexpected responses from the set of unmatched responses.
We note that it is possible to match responses to requests explicitly using the
id and sequence numbers associated with ICMP echo requests, and even perhaps
using the payload. These attributes were not recorded in the ISI dataset, which
motivates us to develop the source address based scheme. We use these fields
when running Zmap or other tools to confirm high latencies in Section 3.4 below.
3.2.3.1 Broadcast responses
The dataset contains several instances where a ping to a destination times out,
but is closely followed by an unmatched response from a source address that is
within the same /24 address block, but different from the destination. In each
round of probing, this behavior repeats. Here, we analyze these unmatched re-
sponses, find that they are likely caused by probing broadcast addresses, and
filter them.
Network prefixes often include a broadcast address, where one address
within a subnet represents all devices connected to that prefix [57]. The broadcast
address in a network should be an address that is unlikely to be assigned to a real
host [57], such as the address whose host-part bits are all 1s or 0s, allowing us to
characterize broadcast addresses. Devices that receive an echo request sent to the
broadcast address may, depending on configuration, send a response [49], and
if sending a response, will use a source address that is their own. We call these
responses broadcast responses. No device should send an echo response with the
source address that is the broadcast destination of the echo request.
We hypothesize that pings that trigger responses from different addresses
within the same /24 address block result when the ping destination is a broadcast
address. We examine ping destinations that solicit a response from a different
address in the same /24 address block, and check if they appear to be broadcast
addresses.
We extended the ICMP probing module in the Zmap scanner [46] to embed
the destination into the echo request, then to extract the destination from the echo
response. Doing so allows us to infer the destination address to which the probe
was originally sent. Zmap collected the data and made it available for download
at scans.io.
We choose the Zmap scan conducted closest in time to the last ISI survey
we studied, on April 17 2015, to investigate the host-part bits of destination ad-
[Figure 3.2 plot: x-axis Last Octet (0 to 255); y-axis Broadcast addresses.]
Figure 3.2: Broadcast addresses that solicit responses in Zmap: Broadcast addresses usu-
ally have last octets whose last N bits are either 1 or 0 (where N > 1).
dresses that triggered responses from a different address in the same /24
address block. We plot the distribution of the last octets of these addresses in
Figure 3.2. Last octets whose last N bits are all 1s or all 0s, where N is greater than
1, such as 255, 0, 127, and 128, show spikes. These addresses are likely broadcast
addresses. On the other hand, last octets that end in binary ’01’ or ’10’ have very
few addresses.
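The bit pattern described above can be checked with a short helper that measures how many low-order bits of the last octet are identical. This is a sketch for illustration with hypothetical function names; it flags candidate broadcast addresses by their last octet only and does not confirm that an address actually behaves as one.

```python
def trailing_run_length(octet):
    """Number of identical low-order bits in an 8-bit value."""
    low_bit = octet & 1
    run = 0
    for _ in range(8):
        if (octet & 1) != low_bit:
            break
        run += 1
        octet >>= 1
    return run


def looks_like_broadcast(octet):
    """True if the last N bits are all 1s or all 0s for some N greater than 1."""
    return trailing_run_length(octet) > 1
```

Under this check, 255, 0, 127, and 128 all qualify, while any octet ending in binary 01 or 10, such as 254 or 1, does not.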
Broadcast responses exist in the dataset
We examine if unmatched responses in the ISI dataset are caused by pings sent to
broadcast addresses. Since broadcast responses are likely to be seen after an Echo
Request sent to a broadcast address, we find the most recently probed address
[Figure 3.3 plot: x-axis Last Octet (0 to 255); y-axis Unmatched responses.]
Figure 3.3: Broadcast addresses that solicit responses in ISI surveys: Number of un-
matched responses that followed a probe sent to address with last octet X. Last octets
with last N bits ending in 0s and 1s (where N > 1) observe spikes, likely caused by
broadcast responses. Not all unmatched responses are caused by broadcast responses,
however, since there exist roughly 10M unmatched responses distributed evenly across
all last octets.
within the same /24 prefix for each unmatched response. We then extract the last
octet of the most recently probed address. Figure 3.3 shows the distribution of un-
matched responses across these last octets. We find that around 10M unmatched
responses are distributed evenly across all last octets: these are unmatched re-
sponses that don’t seem to be broadcast responses. However, last octets that have
their last N bits all 1s or all 0s, where N is greater than 1, show spikes similar to
those in Figure 3.2.
If left in the data, broadcast responses could yield substantial latency over-
estimates in the following, common, scenario, which we illustrate in Figure 3.4.
Assume that the echo request sent to an address 211.4.10.254 is lost and that the
device is configured to respond to broadcast pings. The echo request sent to
211.4.10.254 could then be matched to the response to the request sent to 211.4.10.255,
the broadcast address of the enclosing prefix. This would lead to a latency based
on the interval between probing 211.4.10.254 and 211.4.10.255, as shown in the
figure.
Filtering broadcast responses
We develop a method which uses ISI’s non-random probing scheme to detect ad-
dresses that source broadcast responses. We call such addresses broadcast respon-
ders, and seek to filter all their responses. We believe that delayed responses are
likely to exhibit high variance in their response latencies, since congestion varies
over time. On the other hand, a broadcast response is likely to have relatively
stable latency.
ISI’s probing scheme sends probes to each address in a /24 address block
in a nonrandom sequence, allowing us to develop a filter that checks if a source
address responds to a broadcast address each round. Addresses are probed such
that last octets that are off by one, such as 254 and 255, receive pings spaced 330
seconds apart (half the probing interval of 11 minutes) as shown in Figure 3.4. For
every unmatched response with a latency of at least 10 seconds, the filter checks if
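[Figure 3.4 timeline:]
T = 0s     Ping 211.4.10.254    Reply from 211.4.10.254
T = 330s   Ping 211.4.10.255    Reply from 211.4.10.254 (broadcast response)
T = 660s   Ping 211.4.10.254    No reply (timeout)
T = 990s   Ping 211.4.10.255    Reply from 211.4.10.254 (false match to the T = 660s ping)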
Figure 3.4: We filter broadcast responses since they can lead to the inference of false la-
tencies. This figure illustrates a potential incorrect match caused by a broadcast response.
Echo requests sent to the broadcast address 211.4.10.255 at T = 330 and T = 990 seconds
solicit responses from 211.4.10.254. When a timeout occurs for a request sent directly to
211.4.10.254 at T = 660 seconds, we would falsely connect that request to the response at
T = 990 seconds.
the same source address had sent an unmatched response with a similar latency
in the previous round. We take an exponentially weighted moving average of
the number of times this occurs for a given source address with α = 0.01. Most
broadcast responders have the maximum of this moving average > 0.9, but since
probe-loss can potentially decrease this value, we mark IP addresses with values
> 0.2 and filter all their responses.
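
The filter described above can be sketched as follows. This is a minimal illustration under our own assumptions, not the code used in the study: the per-round input format, the function names, the initial average of zero, and the tolerance `SIMILARITY_S` for "similar latency" are all hypothetical, while α = 0.01, the 10 second minimum latency, and the 0.2 marking threshold come from the text.

```python
ALPHA = 0.01           # EWMA weight, from the text
MARK_THRESHOLD = 0.2   # mark addresses whose peak average exceeds this, from the text
MIN_LATENCY_S = 10.0   # only unmatched responses at least this late are considered
SIMILARITY_S = 2.0     # ASSUMPTION: how close two latencies must be to count as similar


def peak_ewma(latency_per_round):
    """Peak of the moving average of 'same latency as previous round' indicators.

    latency_per_round holds, for one source address, the latency in seconds of
    its unmatched response in each probing round, or None if there was none.
    """
    ewma = 0.0
    peak = 0.0
    prev = None
    for cur in latency_per_round:
        hit = (
            cur is not None
            and prev is not None
            and cur >= MIN_LATENCY_S
            and prev >= MIN_LATENCY_S
            and abs(cur - prev) <= SIMILARITY_S
        )
        ewma = ALPHA * (1.0 if hit else 0.0) + (1.0 - ALPHA) * ewma
        peak = max(peak, ewma)
        prev = cur
    return peak


def is_broadcast_responder(latency_per_round):
    """Apply the marking threshold to the peak of the moving average."""
    return peak_ewma(latency_per_round) > MARK_THRESHOLD
```

Under these assumptions, an address that answers the broadcast probe with a stable 330 second latency in every round approaches a peak of 1, whereas an address whose late responses vary from round to round, or appear only in isolated rounds, stays near 0.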
We confirm that we find broadcast responders correctly in the ISI surveys
by comparing the ones we found in the ISI 2015 surveys with broadcast respon-
ders from the Zmap dataset. Zmap detected 939,559 broadcast responders in
the April 17 2015 scan, of which 7212 had been addresses that provided Echo
Responses in ISI’s IT63w (20150117) and IT63c (20150206) datasets. The filter
detected 7044 (97.7%) of these as broadcast responders. We inspected the 168 re-
maining addresses and found that 154 addresses have 99th percentile latencies
below 2.5 seconds. Since ISI probes a /24 prefix only once every 2.5 seconds,
these addresses cannot be broadcast responders. Another 5 addresses have 99th
percentile latencies below 5 seconds; these are unlikely to be broadcast
responders as well.
The remaining 9 addresses had 99th percentile latencies in excess of 300s
and seem to be broadcast responders. Upon closer inspection, we found that
these addresses only occasionally sent an unmatched response: around once ev-
ery 50 rounds. The α parameter of the filter can tolerate some rounds with miss-
ing responses, but these addresses respond in so few rounds that they pass un-
detected. If these 9 are indeed broadcast responders as suggested by high 99th
percentile latencies, this yields a false negative rate of our filter of 0.13%.
3.2.3.2 Duplicate responses
Packets can be duplicated. A duplicated packet will not affect inferred latencies
as long as the original response to the original probe packet reaches the prober,
[Figure 3.5 plot: x-axis: Max responses per ping (log scale, 1 to 10M); y-axis: fraction of addresses (log scale, 1.0 down to 0.0000001).]
Figure 3.5: Maximum number of responses received for a single echo request, for IP
addresses that sent more than 2 responses to an echo request. The red dots indicate
instances where addresses responded to a single echo request with more than 1M echo
responses. We believe that these are caused by DoS attacks.
since our scheme ignores subsequent duplicate responses. However, we find that
some IP addresses respond many times to a single probe. In this case, the incom-
ing packets aren’t responses to probes, but are either caused by incorrect config-
urations or malicious behavior.
Figure 3.5 shows the distribution of the maximum number of echo responses
observed in response to a single echo request. Since broadcast responses can also
be interpreted as duplicate responses, we look only at IP addresses that sent more
than 2 echo responses for an echo request. Of 658,841 such addresses, we find that
4,985 (0.7%) sent at least 1,000 echo responses. The red dots in the figure show
26 addresses that sent more than one million echo responses, with one address
sending nearly 11 million responses in 11 minutes.
Zmap authors reported that they observed retaliatory DoS attacks in re-
sponse to their Internet-wide probes [46]. We believe that some of the responses
in the ISI dataset are also caused by DoS attacks.
We filter duplicate responses by ignoring IP addresses that ever responded
more than 4 times to a single echo request, based on observing the distribution
of duplicates shown in Figure 3.5. Packets can sometimes get duplicated on the
Internet, and we want to be selective in our filtering to remove as little as neces-
sary. Even if a response from the probed IP address is duplicated and a broadcast
response is also duplicated, there should be only 4 echo responses in the dataset.
We believe that IP addresses observing more than 4 echo responses to a single
echo request are either misconfigured or are participating in a DoS attack. In
either case, the latencies are not trustworthy.
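
The duplicate filter can be sketched as below. This is an illustration under stated assumptions: we assume each echo response has already been attributed to the echo request that solicited it, represented by a hypothetical `request_id`, and the function names are ours. Only the threshold of 4 comes from the text.

```python
from collections import Counter

MAX_RESPONSES_PER_REQUEST = 4  # threshold from the text


def addresses_to_discard(responses):
    """Addresses that ever sent more than 4 responses to a single request.

    responses is an iterable of (source_address, request_id) pairs, one pair
    per echo response received.
    """
    per_request = Counter(responses)
    return {
        addr
        for (addr, _request_id), count in per_request.items()
        if count > MAX_RESPONSES_PER_REQUEST
    }


def filter_responses(responses):
    """Drop every response from an address flagged by addresses_to_discard."""
    responses = list(responses)
    discarded = addresses_to_discard(responses)
    return [r for r in responses if r[0] not in discarded]
```

Note that the filter removes all responses from a flagged address, not only the excess ones, since none of that address's latencies can be trusted.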
3.3 Recommended Timeout Values
In this section, we analyze the ping latencies of all pings obtained from ISI’s Inter-
net survey datasets from 2015 to find reasonable timeout values. We demonstrate
the effectiveness of our matching scheme for recovering delayed responses from
the dataset. We then group the survey-detected responses and delayed responses
together to determine what timeout values would be necessary to recover vari-
ous percentiles of responses. Some IP addresses observe very high latencies in
                       Packets          Addresses
Survey-detected        9,644,670,150    4,008,703
Naive matching         9,768,703,324    4,008,830
Broadcast responses       33,775,148        9,942
Duplicate responses       67,183,853       20,736
Survey + Delayed       9,667,744,323    3,978,152
Table 3.1: Adding unmatched responses to survey-detected responses
the ISI dataset; we verify that these are real in Section 3.4 and examine causes in
Section 3.5.
3.3.1 Incorporating unmatched responses
ISI detected 9.64 Billion echo responses from 4 Million IP addresses in 2015 in
the IT63w (20150117) and IT63c (20150206) datasets, as shown in the first row
of Table 3.1. The next row shows the number of responses we would have ob-
tained if we had used a naive matching scheme where we simply matched each
unmatched response for an IP address with the last echo request for that IP
address, without filtering unexpected responses. The number of responses in-
creases by 1.3% to 9.77 Billion; however, this includes responses from addresses
that received broadcast responses and duplicate responses. After filtering unex-
pected responses, the number of IP addresses reduces to 99.23% of the original
addresses. Of 30,678 discarded IP addresses, 9,942 (32.4%) addresses were dis-
[Figure 3.6 plots: CDF (y-axis, 0.98 to 1.0) of per-address percentile latency (x-axis, latency in seconds, 0 to 600), with curves for the median and the 80th, 90th, 95th, 98th, and 99th percentiles. Panels: (a) Before filtering, (b) After filtering.]
Figure 3.6: CDF of Percentile latency per IP address before and after filtering unexpected
responses. Each point represents an IP address and each color represents the percentile
from that IP address’s response latencies. Before filtering unexpected responses, there are
bumps caused by broadcast responses at 330s, 165s and 495s, fractions of the 11 minute
(660s) probing interval.
carded because they also received broadcast responses. The majority of discarded
IP addresses, 20,736 (67.6%) were addresses that sent more than 4 echo responses
in response to a single echo request.
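
The rows of Table 3.1 can be cross-checked with simple arithmetic. The sketch below assumes, as the numbers suggest, that the final row is the naive-matching row minus the broadcast and duplicate rows; the variable names are ours.

```python
# Values copied from Table 3.1.
survey_packets, survey_addresses = 9_644_670_150, 4_008_703
naive_packets, naive_addresses = 9_768_703_324, 4_008_830
broadcast_packets, broadcast_addresses = 33_775_148, 9_942
duplicate_packets, duplicate_addresses = 67_183_853, 20_736

# Remove the two kinds of unexpected responses from the naive matching result.
combined_packets = naive_packets - broadcast_packets - duplicate_packets
combined_addresses = naive_addresses - broadcast_addresses - duplicate_addresses

discarded_addresses = broadcast_addresses + duplicate_addresses
recovered_packets = combined_packets - survey_packets

broadcast_share = round(100 * broadcast_addresses / discarded_addresses, 1)
duplicate_share = round(100 * duplicate_addresses / discarded_addresses, 1)
remaining_share = round(100 * combined_addresses / naive_addresses, 2)

print(combined_packets, combined_addresses)   # last row of the table
print(discarded_addresses, broadcast_share, duplicate_share, remaining_share)
print(recovered_packets)                      # delayed responses recovered
```

The computed totals reproduce the "Survey + Delayed" row and the percentages quoted in the text.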
Though the number of discarded IP addresses is relatively small, removing
them eliminates responses that cluster around 330, 165, and 495 seconds. Fig-
ure 3.6 shows the distribution of percentile latency per IP address before and
after filtering unexpected responses. Comparing these two graphs shows that
the “bumps” in the CDF are removed by the filtering.
After discarding addresses, our matching technique yields 23,074,173 ad-
ditional responses for the remaining addresses, giving us a total of 9.67 Billion
                              % of pings
% of addresses    1%     50%    80%    90%    95%    98%    99%
      1%          0.01   0.03   0.04   0.07   0.10   0.13   0.18
     50%          0.16   0.19   0.21   0.26   0.42   0.53   0.64
     80%          0.19   0.26   0.33   0.43   0.54   0.74   1.21
     90%          0.22   0.31   0.42   0.57   0.84   1.61   3
     95%          0.25   1.42   2.38   3      5      9      15
     98%          0.30   1.94   4      6      12     41     78
     99%          0.33   2.31   4      8      22     76     145
Table 3.2: Minimum timeout in seconds that would have captured c% of pings from r%
of IP addresses in the IT63w (20150117) and IT63c (20150206) datasets (where r is the row
number and c is the column number).
Echo Responses from 3.98 Million IP addresses. We perform our latency analysis
on this combined dataset.
3.3.2 Recommended Timeout Values
We now find retransmission thresholds which recover various percentiles of re-
sponses for the IP addresses from the combined dataset. For each IP address, we
find the 1st, 50th, 80th, 90th, 95th, 98th and 99th percentile latencies. We then find
the 1st, 50th, 80th, 90th, 95th, 98th and 99th percentiles of all the 1st percentile la-
tencies. We repeat this for each percentile and show the results in Table 3.2.
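
The percentile-of-percentiles computation can be sketched as follows. This is an illustration, not the analysis code used for Table 3.2: the input format and function names are hypothetical, and the nearest-rank percentile definition is an assumption, since the text does not say which definition was used.

```python
PERCENTILES = [1, 50, 80, 90, 95, 98, 99]


def percentile(values, p):
    """Nearest-rank percentile (p in 1..100) of a non-empty sequence."""
    ordered = sorted(values)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def timeout_table(latencies_by_address):
    """table[r][c]: timeout capturing c% of pings from r% of addresses.

    latencies_by_address maps each IP address to its list of ping latencies
    in seconds.
    """
    table = {r: {} for r in PERCENTILES}
    for c in PERCENTILES:
        # Step 1: the c-th percentile latency of every address.
        per_address = [
            percentile(latencies, c)
            for latencies in latencies_by_address.values()
            if latencies
        ]
        # Step 2: percentiles of those per-address values across addresses.
        for r in PERCENTILES:
            table[r][c] = percentile(per_address, r)
    return table
```

In the resulting table, the entry `table[95][95]` corresponds to the timeout needed to capture 95% of pings from 95% of addresses.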
The 1st percentile of an address’s latency will be close to the ideal latency
that its link can provide. We find that the 1st percentile latency is below 330ms for
99% of IP addresses: most addresses are capable of responding with low latency.
Further, 50% of pings from 50% of the addresses have latencies below 190ms,
showing that latencies tend to be low in general.
However, we see that a substantial fraction of IP addresses also have
surprisingly high latencies. For instance, capturing 95% of pings from 95% of
addresses requires waiting 5 seconds. Restated, at least 5% of pings from 5% of
addresses have latencies higher than 5 seconds. Thus, even setting a timeout as
high as 5 seconds will infer a false loss rate of at least 5% for these addresses.
Note that retrying lost pings cannot be used as a substitute for setting a
longer timeout since a retried ping is not an independent sample of latency. What-
ever caused the first one to be delayed is likely to cause the followup pings to be
delayed as well, as we show in Section 3.5.
At the extreme, we see 1% of pings from 1% of addresses having latency
above 145 seconds! These latencies are so high that we investigate these addresses
further. We now consider 60 seconds to be a reasonable timeout to balance progress with
response rate, at least when studying outages and latencies, although an ideal timeout
may vary for different settings. A timeout of 60 seconds easily covers 98% of pings
to 98% of addresses, yet does not seem long enough to slow measurements un-
necessarily.
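
The relationship between a timeout and the false loss rate it induces can be illustrated with a small sketch. The function name and the sample latencies below are hypothetical; the sketch simply counts responses that would arrive after the timeout and therefore be recorded as lost.

```python
def false_loss_rate(latencies_s, timeout_s):
    """Fraction of responses that arrive after timeout_s seconds."""
    if not latencies_s:
        raise ValueError("need at least one latency sample")
    late = sum(1 for rtt in latencies_s if rtt > timeout_s)
    return late / len(latencies_s)


# Hypothetical address: mostly fast, with a few very slow responses.
example = [0.2] * 94 + [6.0] * 5 + [150.0]
print(false_loss_rate(example, 5))    # responses slower than 5 s count as lost
print(false_loss_rate(example, 60))   # a 60 s timeout misses only the slowest
```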
3.4 Verification of long ping times
In this section, we address doubts that long observed ping times are real: that
they are a product of ISI’s probing scheme, that they might be caused by errors in
a particular data set, or that they might derive from discrimination against ICMP.
3.4.1 Are high latencies observed by other probing schemes?
Some of the latencies in Table 3.2 are so high that we considered if they could
be artifacts of ISI’s probing scheme. We investigate latencies obtained using two
other probing techniques, Zmap and scamper, and check if the high latencies
observed in the ISI datasets are reproducible.
Does Zmap observe high latencies?
We check for high latencies using the Zmap scanner [46]. As part of our extension
of the ICMP probing module in the Zmap scanner, we also embed the probe send
time into the echo request, and extract it from the echo response, allowing us to
estimate the RTT, albeit without the precision of kernel send timestamps.
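
The idea of carrying the send time inside the probe can be sketched as follows. This is not the Zmap module's code (Zmap is written in C); it is a Python illustration in which the payload layout and function names are our assumptions. It relies only on the fact that an ICMP echo response returns the data carried by the echo request.

```python
import struct

# ASSUMPTION: the payload is a single 64-bit float in network byte order
# holding the send time in seconds. The real module's layout may differ.
PAYLOAD_FORMAT = "!d"


def build_payload(send_time_s: float) -> bytes:
    """Payload to place in the echo request."""
    return struct.pack(PAYLOAD_FORMAT, send_time_s)


def rtt_from_echoed_payload(payload: bytes, receive_time_s: float) -> float:
    """Estimate the RTT from the payload echoed back in the response."""
    size = struct.calcsize(PAYLOAD_FORMAT)
    if len(payload) < size:
        raise ValueError("payload too short to hold a timestamp")
    (send_time_s,) = struct.unpack(PAYLOAD_FORMAT, payload[:size])
    return receive_time_s - send_time_s
```

Because the timestamp travels with the probe, the prober needs no per-probe state and can compute an RTT for a response that arrives arbitrarily late.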
Zmap has performed these scans since April 2015. Scans have been con-
ducted over a range of different times, different days of the week and across four
months in 2015 (as of Sep 5, 2015), as shown in Table 3.3. Typically, scans were
performed on Sundays or Thursdays, beginning at noon UTC time. However,
the scans on April 17, May 22, and June 15 were conducted on other days and
Scan Date       Day   Begin Time   Echo Responses
Apr 17, 2015    Fri   02:44        339M
Apr 19, 2015    Sun   12:07        340M
Apr 23, 2015    Thu   12:07        343M
Apr 26, 2015    Sun   12:07        343M
Apr 30, 2015    Thu   12:08        344M
May 3, 2015     Sun   12:08        344M
May 17, 2015    Sun   12:09        347M
May 22, 2015    Fri   00:57        371M
May 24, 2015    Sun   12:09        369M
May 31, 2015    Sun   12:09        362M
Jun 4, 2015     Thu   12:10        368M
Jun 15, 2015    Mon   13:53        357M
Jun 21, 2015    Sun   12:11        368M
Jul 2, 2015     Thu   12:00        369M
Jul 5, 2015     Sun   12:00        368M
Jul 9, 2015     Thu   12:00        369M
Jul 12, 2015    Sun   12:00        367M
Table 3.3: Zmap scan details: For each Zmap scan in Figure 3.7, the table shows the date,
day of the week, the time at which the scan began (in UTC time), and the number of
destinations that responded with Echo Responses.
at other times, increasing diversity. Each Zmap scan takes 10 and a half hours to
complete and recovers Echo Responses from around 350M addresses.
We choose all available scans and analyze the distribution of RTTs for the
Echo Responses in Figure 3.7. Most responses arrive with low latency, having
a median latency lower than 250ms for each scan. However, 5% of addresses
responded with RTTs greater than 1 second in each scan. Further, 0.1% of ad-
dresses responded with latencies exceeding 75 seconds in each scan although the
99.9th percentile latency exhibited some variation: the May 22 scan had the low-
est 99.9th percentile latency (77 seconds) whereas the July 9 scan had the highest
[Figure 3.7 plot: CDF (y-axis, 0.0 to 1.0) of RTT in seconds (x-axis, log scale, 0.01 to 100), with one curve per Zmap scan from Apr 17 to Jul 12.]
Figure 3.7: Distribution of RTTs for all Zmap scans performed in 2015. Around 5% of
addresses have latencies greater than 1s in each scan, and 0.1% of addresses observed
latencies in excess of 75s.
(102 seconds). We infer from these nearly identical latency distributions that high
latencies are persistent for a consistent fraction of addresses.
Does scamper also observe high latencies?
Both ISI and Zmap probe millions of addresses, and we investigate whether la-
tencies are affected by these probing schemes triggering rate-limits or firewalls.
We select a small sample of addresses that are likely to have high latencies from
the ISI dataset, probe them using scamper [55], and check for unusually high
latencies.
In the 2011 - 2013 ISI dataset, 20,095 IP addresses had at least 5% of their
pings with latencies 100 seconds and above. We chose 2000 random IP addresses
from this subset and sent 1000 pings to them, once every 10 seconds using scam-
per [55] and analyzed the responses. In this analysis, we used scamper’s default
packet response matching mechanism: so long as scamper continues to run, re-
ceived responses will be matched with sent packets. Because we used scamper’s
defaults, scamper ceased to run 2 seconds after the last packet was sent, so we
missed responses to the last few pings that arrived after scamper ceased running.
Although scamper can be configured to wait longer for responses, in later anal-
yses, we ran tcpdump simultaneously and matched responses to sent packets
separately.
Of the 2000 addresses, 1244 responded to our probes. Figure 3.8 shows
the percentile latency per IP address. The 95th percentile latency for 50% of the
addresses is now considerably lower, at 7.3s. This suggests that addresses prone
to extremely high latencies vary with time: we investigate addresses with this
behavior further in Section 3.5.
Nevertheless, Figure 3.8 shows that scamper also observes some instances
of very high latencies. 17% of addresses observe latencies greater than 100 sec-
onds for 1% of their pings. We therefore rule out the possibility that the high
latencies are a product of the probing scheme.
3.4.2 Is it a particular survey or vantage point?
ISI survey data are collected from four vantage points at different times. Vantage
points are identified by initial letter, and are in Marina del Rey, California, “w”; Ft.
Collins, Colorado, “c”; Fujisawa-shi, Kanagawa, Japan, “j”; and Athens, Greece,
“g”.
[Figure 3.8 plot: CDF (y-axis, 0 to 1.0) of RTT in seconds (x-axis, 0 to 600), with curves for the 95th and 99th percentiles.]
Figure 3.8: Confirmation of high latency: Percentile latency per IP address for 2000 ran-
domly chosen IP addresses from ISI’s 2011 - 2013 surveys that had > 5% of pings with
latencies 100s and above. Each point represents an IP address and the lines represent
the percentile latency from that IP address. 17% of them continue to observe 1% of their
pings with latencies > 100s.
In this section, we look at summary metrics of each of the surveys. In Fig-
ure 3.9, our intent was to ensure that the results were consistent from one survey
to the next, but we found a surprising result as well. The consistency of val-
ues is apparent: the median ping from the median address remains near 200ms
for the duration. However, there are exceptions in the following data sets: IT59j
(20140515), IT60j (20140723), IT61j (20141002), IT62g (20141210). These higher
sampled latencies are coincident with a substantial reduction in the fraction of re-
sponses that are matched: in typical ISI surveys, 20% of pings receive a response;
in these, between 0.02% and 0.2% see a response. It appears that these data sets
should not be considered further. Additionally, it54c (20130524), it54j (20130618),
[Figure 3.9 plots, 2006 to 2015. Top: Min Timeout (s), log scale from 0.01 to 100, with lines for the 1%, 50%, 80%, 90%, 95%, 98%, and 99% levels. Bottom: Percentage of successful pings (0 to 20) for each survey, marked by vantage point: c Colorado, g Greece, j Japan, w California.]
Figure 3.9: Top: Minimum timeout required to capture the cth percentile latency sample
from the cth percentile address in each survey, organized by time. Each point represents
the timeout required to capture, e.g., 95% of the responses from 95% of the addresses. The
1% line is indicative of the minimum. Bottom: Response rate for each survey; symbols
represent which vantage point was used. Surveys from Japan with very few successes
are not plotted on the top graph.
and it54w (20130430) were flagged by ISI as having high latency variation due to
a software error [58].
Ignoring the outliers, trends are apparent. The timeout necessary to cap-
ture 95% of responses from 95% of addresses increased from near two seconds
in 2007 to near five seconds in 2011. (We note that the apparent stability of this
line may be misleading; since the y-axis is a log scale and our latency estimates
are only precise to integer seconds when greater than 3, small variations will be
lost.) The 98th percentile latency from the 98th percentile address has increased
steadily since 2011, and the 99th increased from a modest 20 seconds in 2011 to a
surprising 140 in 2013. These latency observations are not isolated to individual
traces.
In sum, high latency is increasing, and although some surveys show atyp-
ical statistics, early 2015 datasets that we focus on appear typical of expected
performance.
3.4.3 Is it ICMP?
One might expect that high latencies could be a result of discrimination
against ICMP. RFC 1812 allows routers responding to ICMP to rate-limit replies [59,
60]; however, this limitation of ICMP should not substantially affect the results
since each address is meant to receive a ping from ISI once every eleven minutes.
Nevertheless, one can imagine firewalls or similar devices that would interfere
specifically with ICMP.
[Figure 3.10 plot: CDF (y-axis, 0.0 to 1.0) of RTT in seconds (x-axis, log scale, 0.1 to 100), with curves for ICMP, UDP, and TCP probes, each shown separately for seq 0 and for seq 1, 2.]
Figure 3.10: 98th percentile RTTs associated with high-latency IP addresses using differ-
ent probe protocols. The first probe of a triplet (seq 0) often has a higher latency than
the rest; TCP probes appear to have a similar distribution except for firewall-sourced
responses.
To evaluate this possibility, we selected high-latency addresses from the
IT63c (20150206) survey. To these addresses we sent a probe stream consisting of
three ICMP echo requests separated by one second, then 20 minutes later, three
UDP messages separated by one second, then again 20 minutes later, three TCP
ACK probes separated by one second. We avoided TCP SYNs because they may
appear to be associated with security vulnerability scanning. We then consider
the characteristics of these hosts in terms of the difference between ICMP delay
and TCP or UDP delay.
“High-latency” addresses to sample
We choose the top 5% of addresses when sorting by each of the median, 80th,
90th and 95th percentile latencies. Many of these sets of addresses overlap: those
who have among the highest medians are also likely to be among the highest 80th
percentiles. However, we considered these different sets to be important so that
the comparison would include both hosts with high persistent latency and those
with high occasional latency. After sampling 15,000 addresses from each of these
four sets, then removing duplicates, we obtain 53,875 addresses to probe.
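
The selection procedure can be sketched as below. This is an illustration under our own assumptions: the input format, function names, and fixed random seed are hypothetical, and the text does not state how ties or the exact 5% cutoff were handled.

```python
import random


def top_fraction(latency_by_address, fraction=0.05):
    """Addresses in the top `fraction` when sorted by the given latency value."""
    ranked = sorted(latency_by_address, key=latency_by_address.get, reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


def choose_probe_targets(stats, sample_size=15_000, fraction=0.05, seed=0):
    """Sample from the top addresses under each metric, then deduplicate.

    stats maps a metric name (for example 'median', 'p80', 'p90', 'p95') to a
    dict of {address: latency in seconds for that percentile}.
    """
    rng = random.Random(seed)
    targets = set()  # a set removes addresses sampled under several metrics
    for metric in sorted(stats):
        candidates = top_fraction(stats[metric], fraction)
        k = min(sample_size, len(candidates))
        targets.update(rng.sample(candidates, k))
    return targets
```

With four metrics and 15,000 samples each, at most 60,000 addresses can be selected; the 53,875 addresses reported above reflect the overlap between the sets.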
From these addresses, we found that only 5,219 responded to all probes
from all protocols on April 29, 2015. This is somewhat expected: Only 27,579
responded to any probe from any protocol.
To complete the probing, we use scamper [55] to send the probe stream to
each of the candidate addresses. Note that scamper uses a 2s timeout by default,
although the timeout can be configured. Instead of setting an alternate timeout
in scamper, we run tcpdump to collect all received packets, effectively creating
an “indefinite” timeout. This allows us to observe packets that arrive arbitrarily
late, since we continued to run tcpdump for days after scamper finished.
All protocols are treated the same (mostly)
For each protocol, we select the 98th percentile RTT per address and plot the
distribution in Figure 3.10. We noticed two obvious features of the data: that the
first packet of the triplet often had a noticeably different distribution of round
trip times, and that the TCP responses often had a mode around 200ms. We will
investigate the “first ping” problem in Section 3.5.3.
The TCP responses appear to be generated by firewalls that recognize that
the acknowledgment is not part of a connection and send a RST without notifying
the actual destination: this cluster of responses all had the same TTL and
applied to all probes to entire /24 blocks. That is, for each address that had such a
response, all other addresses in that /24 had the same.
Ignoring the quick TCP responses apparently from a firewall, it does not
appear that any protocol has significant preferential treatment among the high-
latency hosts. Of course, this observation does not show that prioritization does
not occur along any of these paths; our assertion is only that such prioritization,
if it exists, is not a source of the substantial latencies we observe.
3.4.4 Summary
In this section, we confirmed that extremely high latencies are also observed by
techniques besides ISI’s. We find that the high latencies are not a result of a few
individual ISI datasets, even though some did appear atypical. Further, high
latencies affect all protocols the same.
We also found that the prevalence of high latencies has been increasing since
2011. In 2015, a consistent 5% of addresses have latencies greater than a second.
[Figure 3.11 plots: 99th percentile RTT (s) (y-axis, log scale) against 1st percentile RTT (s) (x-axis), in two panels. Legend of satellite ISPs: Hughes, Viasat, Skylogic, BayCity, iiNet, On Line, Skymesh, Telesat, Horizon.]
Figure 3.11: Scatterplot of 1st and 99th percentile latencies for addresses with high values
of both in survey IT63c; Left omits satellite-only ISPs; Right includes only satellite-only
ISPs.
3.5 Why do pings take so long?
In this section, we aim to determine what causes high RTTs. We investigate the
RTTs of satellite links and find that they account for a small fraction of high RTT
addresses. We follow up with an analysis of Autonomous Systems and geo-
graphic locations that are most prone to two potentially different types of high
RTTs: RTTs greater than 1s and RTTs greater than 100s. We then investigate ad-
dresses that exhibit each type of RTT and find potential explanations.
3.5.1 Are satellites involved?
A reasonable hypothesis is that satellite links, widely known for their necessar-
ily high minimum latency, would also be responsible for very high maximum
latencies. Transmissions via geosynchronous satellite must transit 35,786km to a
satellite and back, leading to about 125 ms of one way delay [61, 62]. Another 125
ms for the return trip yields a theoretical minimum of 250ms.
We expect satellite ISPs to have high 1st percentile latencies, but we con-
sider whether they have high 99th percentile latencies as well. We use data from
ISI survey IT63c (20150206) for this analysis because it provides hundreds of ping
samples per IP address, and we wish to study relatively few addresses in some
detail. Figure 3.11 shows the plot of 1st percentile latencies vs. the 99th percentile
latencies for addresses in this survey. We separate addresses that the Maxmind
database maps to known satellite providers, including Hughes and ViaSat. At
left, we show the overall distribution without addresses from known satellite
ISPs; at right, we show only satellite ISPs. (Recall that the precision of values just
above the ISI timeout of three seconds is limited to integers; this creates the
horizontal bands.) The satellite-only ISPs plot shows that the 1st percentile RTT for
these addresses exceeds 500ms in all cases, showing that the RTTs are almost
double the theoretical minimum. There are some points in the left plot that remain
within the satellite-like cluster; at least some of these are from rural broadband
providers that provide both satellite and other connectivity, such as Xplornet
in Canada, which had at least one IP address with a first percentile below 0.5s.
Each satellite provider has a distinct cluster in this scatter plot, and two
smaller providers, Horizon and iiNet, have clusters of reports that produce near-
horizontal lines in the graph, with varying 1st percentile but fairly consistent 99th
                                         May 2015             June 2015            July 2015
ASN     Owner                        >1s     %     Rank   >1s     %     Rank   >1s     %     Rank
26599   TELEFONICA BRASIL            3.56M   80.4  1      3.87M   77.5  1      4.20M   77.0  1
26615   Tim Celular S.A.             1.35M   74.5  3      1.42M   71.5  2      1.72M   71.6  2
45609   Bharti Airtel Ltd.           1.46M   76.6  2      1.21M   81.0  3      1.03M   79.2  3
22394   Cellco Partnership           0.55M   73.4  8      0.58M   73.5  4      0.63M   72.7  4
1257    TELE2                        0.67M   69.5  5      0.42M   65.5  9      0.58M   67.4  5
27831   Colombia Movil               0.53M   68.8  9      0.54M   64.3  5      0.53M   62.8  6
6306    VENEZOLAN                    0.69M   77.3  4      0.41M   76.4  10     0.40M   75.7  10
9829    National Internet Backbone   0.57M   27.6  7      0.43M   30.9  7      0.43M   29.5  9
4134    Chinanet                     0.60M   1.5   6      0.38M   0.9   11     0.34M   0.9   11
35819   Etihad Etisalat (Mobily)     0.42M   54.0  10     0.43M   54.5  6      0.45M   55.8  8
Table 3.4: Autonomous Systems sorted by the addresses summed across three Zmap
scans for addresses that observed RTTs greater than 1s. The table shows for each AS: the
number and percentage of addresses with RTT greater than 1s and the rank in that scan.
percentile, as if queuing for these addresses is capped but the base distance to the
satellite varies by geography.
Although some satellite hosts do have remarkably high RTT values—up
to 517s—their 99th percentile values are predominantly below 3s. They do not
have such high 99th percentile values as the rest of the hosts with over 0.3s first
percentiles (those shown on the left graph). Thus, satellite ASes accounted for
very few of the high latency addresses.
3.5.2 Autonomous Systems with the most high latency addresses
Next, we investigate the ASes and geographic locations with the most high la-
tency addresses to identify relationships. For this analysis, we use Zmap scans
from 2015 to identify high latency addresses. Zmap pings every IPv4 address,
                   May 2015        June 2015       July 2015
Continent          >1s     %       >1s     %       >1s     %
South America      7.27M   26.7    7.41M   25.8    8.05M   26.9
Asia               5.56M   3.8     4.73M   3.4     4.56M   3.2
Europe             2.56M   2.7     2.09M   2.2     2.32M   2.4
Africa             1.12M   29.4    1.20M   30.3    1.30M   31.7
North America      0.93M   1.0     1.04M   1.1     1.14M   1.2
Oceania            0.08M   3.9     0.08M   3.7     0.08M   3.7
Table 3.5: Continents sorted by the addresses summed across three Zmap scans for
addresses that observed RTTs greater than 1s. The table shows for each continent: the
number and percentage of addresses with RTT greater than 1s in that scan.
thereby covering addresses from all ASes. We chose the May 22, Jun 21 and Jul
9 Zmap scans to study. These scans were conducted at different times of the day,
on different days of the week and in different months, as shown in Table 3.3.
For each of these Zmap scans, we use Maxmind to find the ASN and geographic
location for every address that responded.
ASes most prone to RTTs greater than 1 second
Figure 3.7 showed that the percentage of addresses that sent high latency Echo
Responses remained stable over time. In particular, around 5% of addresses ob-
served RTTs greater than a second in each scan. We refer to these addresses as
                                        May 2015             June 2015            July 2015
ASN     Owner                       >100s   %     Rank   >100s   %     Rank   >100s   %     Rank
26599   TELEFONICA BRASIL           51.9K   1.2   1      63.5K   1.3   1      77.6K   1.4   1
12430   VODAFONE ESPANA S.A.U.      12.8K   4.4   2      11.6K   4.1   2      14.6K   5.2   3
26615   Tim Celular S.A.            6.2K    0.3   7      9.4K    0.5   3      14.7K   0.6   2
3352    TELEFONICA DE ESPANA        8.5K    0.2   3      7.3K    0.1   5      7.5K    0.2   4
6306    VENEZOLAN                   7.5K    0.8   5      8.4K    1.5   4      6.6K    1.2   6
22394   Cellco Partnership          6.9K    0.9   6      6.6K    0.8   6      7.5K    0.9   5
27831   Colombia Movil              3.2K    0.4   10     5.0K    0.6   7      5.2K    0.6   7
45609   Bharti Airtel Ltd.          7.8K    0.4   4      2.6K    0.2   9      2.9K    0.2   9
35819   Etihad Etisalat (Mobily)    3.8K    0.5   9      3.9K    0.5   8      4.0K    0.5   8
1257    TELE2                       6.2K    0.4   8      1.7K    0.3   14     2.4K    0.3   12
Table 3.6: Autonomous Systems sorted by the addresses summed across three Zmap
scans for addresses that observed RTTs greater than 100s. The table shows for each AS:
the number and percentage of addresses with RTT greater than 100s and the rank in that
scan.
turtles and investigate their distribution across Autonomous Systems to identify
relationships.
For each Zmap scan, we found the turtles and identified their AS, and
ranked ASes by the number of contributed turtles. Finally, we summed the tur-
tles from each AS across the three scans and sort ASes accordingly and show the
top ten in Table 3.4. For example, AS26615 had the second-largest sum of turtles
across the three Zmap scans, but was ranked third within the May 2015 scan.
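
The ranking procedure can be sketched as follows. This is an illustration, not the analysis code: the input format (one tuple per responding address, with its ASN already looked up) and the function names are hypothetical, and ties are broken by ASN purely to make the output deterministic.

```python
from collections import Counter

TURTLE_RTT_S = 1.0  # RTT threshold for a turtle, from the text


def turtles_per_as(scan):
    """Count turtles per AS in one scan of (address, asn, rtt_s) tuples."""
    return Counter(asn for _addr, asn, rtt_s in scan if rtt_s > TURTLE_RTT_S)


def rank_within_scan(counts):
    """Map each AS to its rank (1 = most turtles) within a single scan."""
    ordered = sorted(counts, key=lambda asn: (-counts[asn], asn))
    return {asn: position + 1 for position, asn in enumerate(ordered)}


def rank_across_scans(scans):
    """Sort ASes by turtles summed over all scans.

    Returns a list of (asn, total_turtles, [rank in each scan]) tuples; a rank
    is None when the AS contributed no turtles in that scan.
    """
    per_scan = [turtles_per_as(scan) for scan in scans]
    total = Counter()
    for counts in per_scan:
        total.update(counts)
    ranks = [rank_within_scan(counts) for counts in per_scan]
    overall = sorted(total, key=lambda asn: (-total[asn], asn))
    return [(asn, total[asn], [r.get(asn) for r in ranks]) for asn in overall]
```

As in Table 3.4, an AS can hold one position in the overall ordering while holding a different rank within an individual scan.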
Inspecting the owners of each of these Autonomous Systems reveals that a
majority of them are cellular. AS26599 (TELEFONICA BRASIL), a cellular AS in
Brazil, has the most turtles, more than double that of the next largest AS in each
of the scans. The next two ASes, AS45609 (Bharti Airtel Ltd.), and AS26615 (Tim
Celular), are also cellular, and so are 5 of the remaining 7 ASes in the top 10.
Also notable is the percentage of responding addresses that are turtles for
these ASes. Most of the cellular ASes have around 70% of all probed addresses
being turtles. AS9829, one of the two ASes with turtles accounting for lower than
50% of probed addresses, is known to offer other services in addition to cellular.
AS4134, with only 1% of its probed addresses being turtles, is also known to offer
other services. We believe that the cellular addresses observe high RTTs while
others do not, explaining the low ratio of probed addresses with RTTs greater
than 1 second.
Finally, nine ASes were observed in the top ten in every scan. AS4134 was
the only exception, but it ranked 11th in the June and July scans. Thus, the
Autonomous Systems with the most turtles also remain consistent over time.
Table 3.5 shows the continents with the most turtles. South America and
Asia alone account for around 75% of all turtles. Further, around a quarter of
all addresses in South America and a third of the addresses in Africa experienced
RTTs greater than 1s in each scan. On the other hand, only 1% of North America’s
addresses are turtles (of which more than half come from a single ASN: AS22394).
ASes most prone to RTTs greater than 100 seconds
Next, we investigate the Autonomous Systems of addresses with RTTs greater
than 100 seconds in the three Zmap scans: we refer to these addresses as sleepy-
turtles. We consider whether these addresses are different from turtles to identify
whether there is a different underlying cause. Following the same process in
identifying ASes and sorting them as in Table 3.4, Table 3.6 shows Autonomous
Systems that are most prone to RTTs greater than 100 seconds.
We find that sleepy-turtles exhibit similarities to turtles. Every Autonomous
System in Table 3.6 is cellular. Further, the ranks of the Autonomous Systems
remain stable over time across the scans. However, there is more variation across
the scans for the percentage of sleepy-turtles among all probed addresses for an
AS. This suggests that the fraction of addresses experiencing RTTs greater than
100 seconds is less stable over time.
3.5.3 Is it the first ping?
RTTs that are consistently greater than a second are sufficiently high that interac-
tive application traffic would seem impractical with these delays. We suspected
that the latencies measured by ISI and Zmap might not be typical of application
traffic.
We considered two broad explanations—extraordinary persistent latency
due to oversized queues associated with low-bandwidth links, or extraordinary
temporary, initial latency due to MAC-layer time slot negotiation or device wake-
up.
In this section, we find that the latter appears to be a more likely expla-
nation, qualitatively consistent with prior investigations of GPRS performance
characteristics [63], but showing quantitatively more significant delay.
We extracted 236,937 IP addresses from the 20150206 ISI dataset (February
2015), including all addresses with a median RTT of at least one second. To se-
lect only responsive addresses that still had high latency, for each of these IP
addresses, we sent two pings, separated by five seconds, with a timeout of 60
seconds. We omit 151,769 addresses that did not respond to either probe and
1,994 addresses that responded, on average, within 200ms.
For the 83,174 addresses that remain, we wait approximately 80 seconds before
sending ten pings, once per second with the same 60-second timeout. We
next classify how the round trip time of the first ping, RTT1, differs from those of
the rest of the responded pings, RTT2 . . . RTTn, where n may be smaller than 10
if responses are missing. For most of these addresses, 51,646, the first response
took longer than the maximum of the rest. This suggests that roughly 2/3 of high
latency observations are a result of negotiation or wake-up rather than random
latency variation or persistent congestion. For 11,874, median(RTT2 . . . RTTn) <
RTT1 < max(RTT2 . . . RTTn), i.e., the first response took longer than the median,
but not the maximum, of the rest. The first response was smaller than the median
of the rest for a comparable 10,910. That the first is above or below the median in
roughly equal measure suggests that for these addresses there is little observed
penalty to the first ping. Finally, we omit analysis of 8,329 addresses because
we did not receive a response to, at least, the first probe, even though they did
respond to the initial pair of probes, and we omit an additional 415 addresses
that did respond to the first probe, but not to at least four probes overall (i.e., we
require n ≥ 4 before computing the median or maximum for comparison).
[Figure 3.12 plot omitted. Both panels share the x-axis RTT_1 - RTT_2 (from -1 to 1)
and have y-axes from 0.0 to 1.0. Legend entries: All (73119); RTT_1 > max (50663).]
Figure 3.12: Bottom: Difference between initial latency and second probe latency; values
around 1 indicate that both responses arrive at about the same time, values near zero
indicate that the RTTs were about the same. The second line includes only those where
RTT1 > max(RTT2 . . . RTTn). Top: The probability, given RTT1 − RTT2 on the
x-axis, that RTT1 > max(RTT2 . . . RTTn).
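The comparison of RTT1 against the median and maximum of the remaining responses, described in the text above, can be sketched as follows. The function name, the category labels, and the handling of exact ties (grouped with the lower category) are assumptions made for illustration; the text does not specify them.

```python
from statistics import median

def classify_first_ping(rtts, min_responses=4):
    """Classify one address from its ten pings.

    rtts: RTTs in seconds in send order, with None for a missing response.
    """
    if not rtts or rtts[0] is None:
        return "no_first_response"        # omitted: first probe unanswered
    rest = [r for r in rtts[1:] if r is not None]
    if 1 + len(rest) < min_responses:
        return "too_few_responses"        # omitted: fewer than n = 4 responses
    first = rtts[0]
    if first > max(rest):
        return "first_above_max"          # consistent with wake-up or negotiation
    if first > median(rest):
        return "first_above_median"
    return "first_at_or_below_median"

# Made-up RTT sequences, one per category.
examples = {
    "wake_up": [3.0, 1.0, 0.5, 0.4, None, 0.4, 0.4, 0.4, 0.4, 0.4],
    "middle": [0.6, 1.0, 0.5, 0.4],
    "low": [0.3, 1.0, 0.5, 0.4],
    "sparse": [2.0, 1.0, None, None],
}
labels = {name: classify_first_ping(r) for name, r in examples.items()}
```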
Can the overestimate be detected?
We show in Figure 3.12 the differences between the first and second round trip
times for all those that had a first and second response (1,311 addresses
responded to the first but not the second). Rarely, latency increases from the
first to the second (yielding a negative difference) or decreases sufficiently to
indicate reordering (yielding a difference greater than one second). Typical
among these addresses is for the second ping to be one second less than the
first, that is, for both responses to arrive at about the same time.
[Figure 3.13 plot omitted. x-axis: RTT_1 - min(RTT_2..RTT_n), from 0 to 10;
y-axis: CDF probability, from 0.0 to 1.0.]
Figure 3.13: Difference between initial latency and observed minimum. The typical setup
time is below four seconds.
We infer that a measurement approach that sent a second probe after one
second could detect this behavior. The top graph of Figure 3.12 shows the proba-
bility that the maximum will be less than the first based on the difference between
the first two latencies. (When the RTT difference exceeds 1 at the right edge of
the upper graph, there are very few samples in an environment of substantial
reordering.) Any significant drop from RTT1 to RTT2 is indicative of an overes-
timate with high probability.
How long does the negotiation or wake-up process take, and how
large is the overestimate?
We observe that this can be estimated by comparing the first round trip time to the
lowest seen among the ten probes. Of course, if the negotiation takes 15 seconds,
the first probe RTT will take at most 9 seconds longer than the last, so this data set
will treat all instances of a setup time between 10 and 60 seconds as taking 9. We
show in Figure 3.13 the differences between RTT1 and min(RTT2 . . . RTTn) for
those 51,646 addresses that had a higher first RTT than the maximum of the rest.
The median is 1.37 seconds, and 90% of the differences are below 4 seconds. Only
2% of the samples are above 8.5 seconds, suggesting that we do not underestimate
this time substantially, and we thus conclude that the wake-up or negotiation
process generally takes from one-half to four seconds.
[Figure 3.14 plot omitted. x-axis: Percentage of addresses with
RTT_1 > max(RTT_2..RTT_n), from 0 to 100; y-axis: CDF, from 0.0 to 1.0.]
Figure 3.14: Percentage of addresses in a /24 prefix showing a drop from the initial to the
maximum.
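The 9-second ceiling on the estimate can be illustrated with a toy model. The model is an assumption made only for this sketch: a device that answers nothing until it has woken up and then answers every queued probe at once, with a fixed base latency. The function names and the 0.2-second base latency are hypothetical.

```python
def simulated_rtts(wake_up, base=0.2, probes=10, spacing=1.0):
    """RTTs for probes sent `spacing` seconds apart to a device that
    cannot respond until `wake_up` seconds after the first probe."""
    return [max(wake_up - i * spacing, 0.0) + base for i in range(probes)]

def estimated_setup_time(rtts):
    """RTT_1 - min(RTT_2..RTT_n), ignoring missing (None) responses."""
    rest = [r for r in rtts[1:] if r is not None]
    return rtts[0] - min(rest)

short = estimated_setup_time(simulated_rtts(1.37))   # recovers about 1.37 s
long15 = estimated_setup_time(simulated_rtts(15.0))  # capped at about 9 s
long60 = estimated_setup_time(simulated_rtts(60.0))  # also capped at about 9 s
```

Under this model, any wake-up time longer than the nine seconds spanned by the ten probes produces the same estimate, which is why the small share of samples above 8.5 seconds matters for judging whether the estimate is biased.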
Are the addresses that show a high initial ping scattered across the
IP address space or clustered into /24s?
The 236,937 IP addresses that we decided to probe initially are from only 1,887
“/24” prefixes. This is somewhat fewer prefixes than would be expected, given
that there are 3.6M addresses in 34K prefixes in the overall 20150206 dataset. That
is, as one might expect, latencies greater than one second do seem to be a
property of the networks associated with selected prefixes. The 83,174 addresses that
responded are from only 1,230 prefixes. We show the percentage of responsive
addresses within each prefix that dropped from the initial ping to the maximum
of the rest in Figure 3.14. Several prefixes did not have an initial latency greater
than the maximum; these typically had very few responsive addresses. In other
prefixes, most addresses showed a reduction. Finally, the 51,646 that showed a
reduction from the initial ping are from only 1,083 prefixes. Of the 161 prefixes that
had only one address with above one-second median latency, only 39 showed a
reduction from the initial RTT to the maximum of the rest. Taken together, we
believe this distribution of addresses across relatively few prefixes indicates that the
wake-up behavior is associated with some providers but not restricted to them.

Pattern                          Pings  Events  Addrs
Low latency, then decay            615      13     10
Loss, then decay                  1528      81     33
Sustained high latency and loss   2994      21     14
High latency between loss           12      12     12
Table 3.7: We observed distinct patterns of latency and loss near high latency responses,
classifying all 5149 pings above 100 seconds from the sample.
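Grouping addresses into /24 prefixes and computing the per-prefix percentage that showed a reduction can be sketched as below, using Python's standard ipaddress module. The function names and the input format are assumptions, and apart from 191.225.110.96 the sample addresses are taken from documentation ranges with made-up outcomes.

```python
import ipaddress
from collections import defaultdict

def prefix24(addr):
    """Return the /24 prefix containing addr, e.g. '192.0.2.0/24'."""
    return str(ipaddress.ip_network(f"{addr}/24", strict=False))

def percent_reduced_per_prefix(reduced_by_addr):
    """reduced_by_addr: address -> True if RTT_1 > max(RTT_2..RTT_n)."""
    totals = defaultdict(int)
    reduced = defaultdict(int)
    for addr, is_reduced in reduced_by_addr.items():
        prefix = prefix24(addr)
        totals[prefix] += 1
        reduced[prefix] += int(is_reduced)
    return {p: 100.0 * reduced[p] / totals[p] for p in totals}

# Made-up outcomes for illustration only.
sample = {
    "191.225.110.96": True,
    "191.225.110.7": False,
    "192.0.2.10": True,
    "192.0.2.200": True,
    "198.51.100.1": False,
}
percentages = percent_reduced_per_prefix(sample)
```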
3.5.4 Patterns associated with RTTs greater than 100 seconds
Finally, we look at addresses with extraordinarily high latencies (greater than 100
seconds); in particular, we want to understand whether these high latencies are
an instance of a first-ping-like behavior, where wireless negotiation or buffering
during intermittent connectivity creates the high value, or, on the other hand, are
instances of extreme congestion. To separate the two types of events, we consider
a sequence of probes, looking for whether or not the latency diminishes after a
ping beyond 100 seconds.
We sample 3,000 of 38,794 addresses whose 99th percentile latency was
greater than 100 seconds in the IT63c (20150206) dataset. Of this sample, 1,400 re-
sponded. We sent each address 2000 ICMP Echo Request packets using Scamper,
spaced by 1 second. To collect responses with very high delays without altering
the Scamper timeout, we simultaneously run tcpdump to capture packets.
Ping samples that saw a round trip time above 100 seconds exist in the
context of a few very distinct patterns. Often, a series of successive ping responses
would be delivered together almost simultaneously, leading to a steady decay in
their round trip times. For example, after 136 seconds of no response from IP
address 191.225.110.96, we received all 136 responses over a one second interval:
every subsequent response’s round-trip latency was 1 second lower than the
previous. This pattern is sometimes preceded by a relatively low latency ping (<
10 seconds) and at other times, follows a few lost pings: we distinguish between
these two cases and call the former Low latency, then decay and the latter Loss, then
decay. It is possible that these are both observing the same underlying action on
the network, but we leave them separate since there are substantially many of
each.
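The decay signature described above, where responses to probes sent one second apart arrive together so that each RTT is about one second lower than the previous, can be detected with a sketch like the following. The function name, the 0.1-second tolerance, and the minimum run length of three are assumptions chosen for illustration, not parameters from the study.

```python
def decay_runs(rtts, spacing=1.0, tolerance=0.1, min_length=3):
    """Find runs in which each RTT is about `spacing` lower than the previous.

    rtts: RTTs in seconds in send order, with None for lost probes.
    Returns (start, end) index pairs, end exclusive.
    """
    runs = []
    start = 0
    for i in range(1, len(rtts) + 1):
        continues = (
            i < len(rtts)
            and rtts[i] is not None
            and rtts[i - 1] is not None
            and abs((rtts[i - 1] - rtts[i]) - spacing) <= tolerance
        )
        if not continues:
            if i - start >= min_length:
                runs.append((start, i))
            start = i
    return runs

# Made-up sequences: a low-latency ping followed by decay, and loss followed by decay.
low_then_decay = [0.3, 136.0, 135.0, 134.0, 133.0, 0.4]
loss_then_decay = [None, None, 5.0, 4.0, 3.0]
```

Whether the probe immediately before a detected run was answered quickly or lost would then separate Low latency, then decay from Loss, then decay.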
Another characteristic pattern is that a high round trip time is followed by
several responses of even greater latency, possibly with intermittent losses. This
behavior is usually sustained for several minutes with latencies remaining higher
than normal (>10 seconds) throughout the duration: we call this behavior Sus-
tained high latency and loss. Finally, there are some cases where a single ping has a
latency > 100 seconds and is preceded and followed by loss. We call these cases
High latency between loss.
We count the number of occurrences of each pattern in Table 3.7. For each
pattern, we show the number of pings greater than 100 seconds that were part of
that pattern, the number of instances of that pattern occurring, and the number
of unique addresses for which it occurred. We observe that the majority of events
and addresses are Loss, then decay, yet almost twice as many pings are part of
Sustained high latency and loss.
3.5.5 Summary
High latencies appear to be a property mainly of cellular Autonomous Systems,
though a few also appear on satellite links. Latencies in the ISI data that are
regularly above one second seem to be caused by the first-ping behavior associ-
ated with several addresses, where the first ping in a stream of pings has higher
latency than the rest. Egregiously high latencies, i.e., latencies greater than a hun-
dred seconds, occur in two broad patterns. In the first, latencies steadily decay
with each probe, as if clearing a backlog. In the second, latencies are continuously
high and are accompanied by loss, as if the network link is oversubscribed.
3.6 Conclusion and Discussion
Researchers use tools like ping to detect network outages, but have generally guessed
at the timeout after which a ping should be declared “failed” and an outage
suspected. The choice of timeout can affect the accuracy and timeliness of outage
detection: if too small, the outage detector may falsely assert an outage in the
presence of congestion; if too large, the outage detector may not pass the outage
along quickly for confirmation or diagnosis.
We investigated the latencies of responses in the ISI survey dataset to deter-
mine a good timeout, considering the distributions of latencies on a per-destination
basis. Foremost, latencies are higher than we expected, based on conventional
wisdom, and appear to have been increasing. We show that these high latencies
are not an artifact of measurement choices such as using ICMP or the particular
vantage points or probing schemes used, although different data sets vary
somewhat. We show that high latencies are not caused by links with a substantial base
latency, such as satellite links. Finally, we show that in many instances, the
initial communication to cellular wireless devices is largely to blame for high la-
tency measures. Similar spikes that may be consistent with handoff also dissipate
over time, to more conventional latencies that support application traffic. With
this data, researchers should be able to reason about what to expect in terms of
false outage detection for a given timeout and how to design probing methods to
account for these behaviors.
As memory capacity and performance become less of a limiting factor, we
believe that the lesson of this work is to design network measurement software
to approach outage detection using a method comparable to that of TCP: send
another probe after 3 seconds, but continue listening for a response to earlier
probes, at least for a duration based, at least in part, on the error rates implied by
Table 3.2.
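The recommendation above, retransmit early but keep listening, can be sketched as an offline evaluation of probe logs. Everything specific here is an assumption for illustration: the function names, the matching of responses to probes by sequence number, and the 60-second listening window (the text leaves the listening duration to be derived from Table 3.2).

```python
def evaluate_probes(sent, received, retry_after=3.0, listen_for=60.0):
    """Label each probe by how its response arrived.

    sent: sequence number -> send time (seconds).
    received: sequence number -> receive time, for probes that got a response.
    """
    status = {}
    for seq, t_sent in sent.items():
        t_recv = received.get(seq)
        if t_recv is None or t_recv - t_sent > listen_for:
            status[seq] = "lost"
        elif t_recv - t_sent <= retry_after:
            status[seq] = "on_time"
        else:
            status[seq] = "late"   # arrived after the retry but was still heard
    return status

def address_responsive(status):
    """An address counts as responsive if any probe was eventually answered."""
    return any(s != "lost" for s in status.values())

# Made-up trace: three probes sent 3 s apart; two answers arrive late, together.
sent = {0: 0.0, 1: 3.0, 2: 6.0}
received = {0: 7.5, 1: 7.6}
status = evaluate_probes(sent, received)
```

With a plain 3-second timeout and no continued listening, all three probes in this trace would be counted as failures, whereas the sketch records two late responses and treats the address as responsive.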
When investigating historical outage measurement data collected by
probing-based techniques, the timeouts used by the technique must be compared with
timeouts that would have captured almost all responses for the addresses pinged
by the technique. For example, Thunderping probes addresses only in U.S. net-
works. For these addresses, both the ISI and Zmap datasets showed that more
than 99% of ping responses arrived within the 5s timeout used by Thunderping.
Thus, the probability of false outage inferences due to high latency is small. However,
if the Thunderping technique had been used to ping addresses in South Amer-
ican cellular ISPs, there would be a significantly higher probability of detecting
false outages, since the 5s timeout would have missed many delayed responses.
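The comparison between a technique's timeout and the observed latency distribution reduces to asking what fraction of responses would arrive after the timeout. A minimal sketch follows; the function name and the RTT samples are made up and do not come from the ISI or Zmap datasets.

```python
def fraction_missed(rtts, timeout):
    """Fraction of responses that arrive later than `timeout` seconds."""
    return sum(1 for r in rtts if r > timeout) / len(rtts)

# Hypothetical RTT samples (seconds) for two populations of addresses.
mostly_fast = [0.2] * 95 + [3.0] * 4 + [8.0]
often_slow = [0.2] * 60 + [6.0] * 40

missed_fast = fraction_missed(mostly_fast, timeout=5.0)
missed_slow = fraction_missed(often_slow, timeout=5.0)
```

In the first population a 5-second timeout misses 1% of responses, while in the second it misses 40%, which is the kind of gap that would turn delayed responses into false outage inferences.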
Chapter 4: Mitigating false inferences due to dynamic addressing
In this chapter, I begin by describing a common assumption, that IP addresses
can be used as proxies for users, in Section 4.1. In Section 4.2, I discuss how
dynamic addressing can lead probing-based outage detection techniques to make
false inferences about outages.
Next, I describe work with colleagues to empirically measure the frequency
of dynamic addressing and the durations for which addresses are assigned to
residential home router devices in several networks around the world and the ef-
fect of outages upon dynamic reassignment [21]. The measurements we used are
sourced from the RIPE NCC’s Atlas project, which deploys small devices, called
probes, that conduct measurements from globally distributed networks [15]. The
RIPE Atlas dataset offers measurements that allow us to determine when an
IP address change occurred and what the addresses were before and after the
change. In addition, the dataset includes many measurements that provide con-
text about what was happening around the time of the address change. I was
able to use these measurements to detect when RIPE Atlas probes rebooted and
were not sending pings (indicating a power outage) and when their pings were
not getting responses (indicating a network outage). In a study with colleagues of
active RIPE Atlas probes in 2015, we found 3,038 RIPE Atlas probes with address
changes hosted across 929 ISPs and 156 countries [21]. Using the measurements
from RIPE Atlas, I identify networks where addresses are typically stable.
Finally, in Section 4.9 I discuss a technique to identify outages even in
networks where dynamic reassignment is common. Using a complementary dataset
that allows checking if an address for which a probing-based technique detected
an outage has remained the same before and after the detected outage, we are
able to confirm outages even in networks where dynamic reassignment is com-
mon.
4.1 IP addresses can be proxies for end users
Academia and industry often rely on a simplifying assumption that IP addresses
uniquely identify end-hosts [64–78]. This assumption allows researchers to track
end host behavior over time [5, 68, 69], or to count participating users in peer-
to-peer systems [64–66]. Many organizations create blacklists of suspicious IP
addresses based on previously observed malicious traffic associated with those
addresses [75–78].
Probing-based techniques like Thunderping [5] make a similar assumption:
a probed address is representative of a residential customer’s Internet connection.
Many residences have at least one device with a public IP address [20], typically
the home router. When a home router’s address stops responding to pings, it
could be evidence of a residential Internet outage.
All of these applications would benefit from understanding how often and
when dynamic addresses assigned to user devices change.
4.2 Probing-based techniques can make false outage inferences due
to dynamic addressing
When probing-based remote outage detection techniques send probes to an ad-
dress, they expect that the address continues to be assigned to the same end-host
for the entirety of the probing duration. Depending upon how a dynamic address
gets reassigned, these techniques can make false inferences about outages in two
ways:
• Detecting false outages Probing-based remote outage detection techniques de-
tect outages when a previously responsive address stops responding to probes.
However, if a dynamic address being probed is withdrawn from its host and is
not assigned to any other host, active probes to the address will no longer elicit
responses. These techniques will infer false probe-loss, leading them to infer
false outages.
• Detecting false outage duration These techniques detect outage duration by
continuing to probe an unresponsive address. When the address starts responding
to probes again, the outage is inferred to end. If a home router with a
public dynamic address has an outage and at some point during the outage, the
dynamic address is reassigned to some other home router which responds to
probes, probing-based remote outage detection techniques would incorrectly
infer that the outage ended.
My approach to mitigating these false inferences is to analyze how
frequently and for what reasons dynamic addresses are reassigned in various
networks. Using the results of this analysis, I identify networks where addresses
are typically stable.
4.3 Dynamic addressing background
An IP address can be used to uniquely identify the end-host it is assigned to until
the end-host’s address changes for some reason. The duration of time that a dy-
namic IP address continues to be assigned to the same CPE (Customer Premises
Equipment) device depends upon various causes that can induce the assigned IP
address to change. Here, I present techniques used for assigning dynamic ad-
dresses and the events and agents involved in dynamic address changes.
4.3.1 Dynamic Host Configuration Protocol
ISPs often use the Dynamic Host Configuration Protocol (DHCP) [79] for IP ad-
dress assignment. DHCP issues an IP address to a host for a lease duration con-
figured by the ISP. The host will try to renew the lease before it expires, typically
half-way into the lease. However, whether the same IP address is renewed, or a
different one is assigned, depends upon ISP policy. We speculate that the typical
behavior of ISPs using DHCP is to renew the lease of the currently assigned IP
address, since one of the stated design goals in the DHCP specification is that a
DHCP client should be assigned the same address in response to each request,
whenever possible. Thus, we typically expect an ISP using DHCP to change
the address of a CPE only if something happens to prevent the CPE from renewing
its lease (like an outage).
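The interaction between lease renewal and outages can be made concrete with a simplified model. The sketch below assumes that the client renews exactly half-way into each lease whenever it is connected, that a successful renewal restarts the full lease, and that the client renews immediately once connectivity returns; the function name and parameters are hypothetical. Whether an expired lease actually results in a different address still depends on ISP policy.

```python
def lease_expires_during_outage(lease, outage_start, outage_end, issued=0.0):
    """Return True if the lease runs out before the outage ends.

    All times are in seconds; `lease` must be positive.
    """
    # Apply every half-way renewal that succeeds before the outage begins.
    while issued + lease / 2 <= outage_start:
        issued += lease / 2
    # The lease in force when the outage begins expires at issued + lease.
    return issued + lease <= outage_end

ONE_HOUR = 3600.0
short_outage = lease_expires_during_outage(ONE_HOUR, 1000.0, 2000.0)
long_outage = lease_expires_during_outage(ONE_HOUR, 1000.0, 5000.0)
```

In this model, an outage shorter than half the lease never causes expiry, since at least half of the lease always remains when the outage begins.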
4.3.2 Point-to-Point Protocol
In some networks, end-hosts connect to an ISP using point-to-point links. For
these networks, the Point-to-Point Protocol (PPP) first configures and establishes
the point-to-point link [80]. Next, a Network Control Protocol (NCP) like the
Internet Protocol Control Protocol (IPCP) configures IP addresses [81]. The PPP
specification notes that the link will remain configured for communication until
the link is actively closed down through network administrator intervention or
when an inactivity timer expires.
4.3.3 Potential dynamic address change causes
Next, we identify the reasons dynamic addresses assigned using the above tech-
niques could change. We classify the following categories of address change:
• Changes after outages If the client is disconnected or loses power long enough
to fail to renew a DHCP lease, its address may be assigned to another; when
it returns, it may then get a new address. We call such changes outage-caused
address changes.
• Changes after reboot/reconnect While we expect addresses assigned through
traditional DHCP to change only when the outage duration is long enough to
prevent lease renewal, addresses assigned through PPP can change upon out-
ages of any duration. Any reboot or network reconnect event could cause the
client to forget its prior address and request a new one, or the state associated
with a connection may be lost. We call such address changes reboot-caused ad-
dress changes.
• Administrative address changes A purpose of dynamic address assignment is
to allow reconfiguration of the network; it is possible that a reconfiguration of
the DHCP server will force a change to the subnet on which the client lies. We
expect such reassignment to be rare.
• Periodic address changes We observe that some ISPs limit the session length
associated with an address, causing a reassignment after a fixed duration, typ-
ically one day to one week depending on the ISP.
Intuitively, the address change is either caused by the ISP (administrative
or periodic), or caused by the client (or an interruption in network service to the
client) in a reboot or outage.
4.4 Related Work
Previous work studied the performance of DHCP in small campus networks [82,
83] and settings where smartphone usage is widespread [84] and developed tech-
niques to reduce network address utilization and DHCP broadcast traffic. The
goal of those studies was to improve the performance of DHCP by tuning config-
uration.
Conceptually, so long as there is some uniquely identifying feature that
remains constant across a host’s address change, it is possible to track IP ad-
dress changes over time for that host. Several studies have used this broad
method [44, 47, 83, 85–89]. UDmap [85] studied dynamic address properties us-
ing Hotmail user login traces where the user’s login serves as the identifying
feature. Casado et al. [88] tracked clients using HTTP cookies when clients access
a CDN. Other studies [47, 87] used continuous responsiveness of an address itself
as the identifying feature, assuming that an address that responds continuously
belongs to the same user and that when an address stops responding to pings, it
has been reassigned.
While we share the same goal as these studies, our approach diverges in
that we are interested in the events associated with an address change. Previous
studies lacked access to end-host information that could reveal the cause of an
address change. One exception, Maier et al. [87], used access to the Radius server
of a European DSL provider from one urban area to identify why DSL sessions
terminated, and noted that the DSL provider often limited Radius session length
to 24 hours in that area. We extend this result to several ISPs in countries from
Europe, Asia, and South America, and identify other typical session length limits.
Argon et al. [44] used periodic measurements from end-hosts in the DIMES in-
frastructure [18]. DIMES software installed on an end-user computer is different
from RIPE Atlas hardware probes primarily in that it reports back only every 30-
60 minutes (as opposed to RIPE Atlas’s 3 minutes), the agent can be installed on
laptops that move (as opposed to RIPE Atlas probes that could move, but do not),
the hosts running DIMES are often powered down (resulting in limited uptime),
and DIMES hosts appear to have static IP addresses more often (they reported
60% had only one address). Nevertheless, Argon et al. observed that some small
ISPs exhibited address alternation with a 24 hour periodicity. In IPv6, the RFC for
privacy extensions for stateless address autoconfiguration recommends that IPv6
addresses be changed every 24 hours [90] and empirical results by Plonka and
Berger found that more than 90% of client IPv6 addresses were ephemeral [91].
We showed that 24 hour defaults are not uncommon in IPv4 as well.
These studies relied on relatively uncontrolled observations of the address
assigned to a device or user, in terms of whether the devices are active,
whether the users connect using multiple devices, and how frequently samples
are provided. As a consequence, the dynamic IP address churn rates reported by
these studies vary. While UDmap reported that over 30% of IP addresses have
inter-user durations of 1–3 days [85], Heidemann et al. reported that 90% of IP
addresses were occupied for less than a day [47]. Maier et al. [87] reported that
a major European ISP had per-user median durations of just 20 minutes during
their study in 2009 (we did not observe this duration in 2015). We believe that
the perspective of a device using the dynamically assigned network is necessary
for understanding the reasons behind the address change and for getting precise
information about the duration that any address is held. Further, since RIPE Atlas
probes provide continuous, longitudinal measurements enabling the inference of
successive addresses assigned to a CPE device, we perform the first analysis of
dynamic prefixes from which devices are assigned successive addresses.
4.5 The RIPE Atlas datasets
Analyzing periodic and administrative address changes requires visibility of the
dynamic addresses assigned to a sample of the ISP’s customers and the ability
to see these addresses change over time. Analyzing outage-caused and reboot-
caused address changes requires knowledge of the events occurring on the end-
host at the time of an address change. Prior studies of dynamic addressing have
typically relied on incoming connections that have a unique client identifier, such
as a user name, but changing addresses, and thus have no information about
what caused a change or precisely when it occurred. The RIPE Atlas dataset is
unique since it includes necessary information about both address changes and
contemporaneous events at the host.
The RIPE NCC’s Atlas project deploys small devices, called probes, that
conduct measurements from globally distributed networks [15]. In this section,
we first describe the connection logs dataset from RIPE Atlas that we use to detect
IP address changes. We then describe the k-root ping and SOS-uptime datasets
from RIPE Atlas that we use to learn about events occurring on end-hosts.
4.5.1 RIPE Atlas connection logs dataset
RIPE Atlas probes connect to the RIPE Atlas infrastructure through a single SSH
session over TCP port 443 (typically used by HTTPS) [92]. RIPE Atlas servers
record the establishment and termination of these connections in connection logs.
Table 4.1 shows connection log entries for a RIPE Atlas probe in the dataset for
the first five days in January 2015.
Connection logs record each TCP connection made by the probe to a central
controller and include the timestamp of the beginning and end of the connection
(defined by the last receipt of data), the peer address of the connection that rep-
resents the publicly visible IP address used by the probe, and a unique identifier
of the probe device. Probes are typically deployed behind the Customer Premises
Equipment (CPE) of a user, so that the publicly visible IP address appearing in
the connection logs belongs to the CPE. We term this address the “probe’s
address” or the “end-host address,” since it is the useful, publicly visible address
that the probe uses, even though the address may technically belong to the CPE
and the probe has a different, private, RFC 1918 address.
We find IP address changes by inspecting these connection logs. A new
entry in a probe’s connection log is created whenever an event occurs that causes the
existing TCP connection to break. This connection will break when the probe’s
IP address changes, when a probe reboots, or when there is an outage. We can
infer that the address changed between the end time of one connection and the
start time of the next, if the addresses differ in consecutive entries. For example,
in Table 4.1, there are seven address changes. Between changes, we can identify
the duration that the probe held an address, shown in hours. In this example,
each connection had a different address, so the address durations are equal to
the connection duration, though this is not always the case. The duration of the
first address is unknown because we do not know when that IP address was first
assigned to the probe; the duration of the last address is also unknown.

ID   Start time       End time        IP Address      Dur
206  Dec 31 03:21:34  Jan 1 02:57:37  91.55.174.103    NA
206  Jan 1 03:22:16   Jan 1 17:34:11  91.55.169.37   14.2
206  Jan 1 18:00:54   Jan 1 18:42:31  91.55.132.252   0.7
206  Jan 1 19:06:46   Jan 2 02:19:16  91.55.155.115   7.2
206  Jan 2 02:41:55   Jan 3 02:18:00  91.55.141.95   23.6
206  Jan 3 02:43:14   Jan 4 02:16:59  91.55.165.167  23.6
206  Jan 4 02:40:58   Jan 5 02:15:45  91.55.163.252  23.6
206  Jan 5 02:38:39   Jan 6 02:14:48  91.55.141.63     NA
Table 4.1: Connection log sample for the first five days of 2015. We compute the address
duration, shown in the last column in hours.
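The inference of address changes and address durations from connection logs can be sketched as follows, using the entries of Table 4.1 as input. The function names are hypothetical, the years are filled in from the caption (the December entry is taken to be in 2014), and measuring a duration from the start of the first connection with an address to the end of the last consecutive connection with that address is an assumption about how the general case is handled.

```python
from datetime import datetime

def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")

def address_durations(entries):
    """entries: (start, end, address) tuples in chronological order.

    Returns one (address, hours) pair per address held; hours is None when
    the beginning or end of the holding period lies outside the log.
    """
    spans = []  # [address, first start, last end] per run of equal addresses
    for start, end, addr in entries:
        if spans and spans[-1][0] == addr:
            spans[-1][2] = end            # same address: extend the current run
        else:
            spans.append([addr, start, end])
    result = []
    for i, (addr, first_start, last_end) in enumerate(spans):
        if i == 0 or i == len(spans) - 1:
            result.append((addr, None))   # first and last durations are unknown
        else:
            hours = (last_end - first_start).total_seconds() / 3600
            result.append((addr, round(hours, 1)))
    return result

LOG = [
    ("2014-12-31 03:21:34", "2015-01-01 02:57:37", "91.55.174.103"),
    ("2015-01-01 03:22:16", "2015-01-01 17:34:11", "91.55.169.37"),
    ("2015-01-01 18:00:54", "2015-01-01 18:42:31", "91.55.132.252"),
    ("2015-01-01 19:06:46", "2015-01-02 02:19:16", "91.55.155.115"),
    ("2015-01-02 02:41:55", "2015-01-03 02:18:00", "91.55.141.95"),
    ("2015-01-03 02:43:14", "2015-01-04 02:16:59", "91.55.165.167"),
    ("2015-01-04 02:40:58", "2015-01-05 02:15:45", "91.55.163.252"),
    ("2015-01-05 02:38:39", "2015-01-06 02:14:48", "91.55.141.63"),
]
entries = [(parse(s), parse(e), a) for s, e, a in LOG]
durations = address_durations(entries)
address_changes = len(durations) - 1
```

On this input the sketch finds seven address changes and reproduces the Dur column of Table 4.1.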
The interval between connections, in the example of Table 4.1, typically 20–
25 minutes, is information we also use in concert with other datasets described
below to determine the type and duration of the event that led to a new connec-
tion. An active RIPE Atlas probe should report experiments back to the central
controller about every three minutes [93]. When there is an address change, we
attribute the long delay between the end of one connection and the beginning of
the next to the probe waiting for TCP to exhaust its retransmission attempts
(RFC 1122 Section 4.2.3.5) [49].
We obtained connection logs from January 1, 2015 to December 31, 2015
belonging to 10,977 active RIPE Atlas probes that had been connected to their
central controllers for more than 30 days in 2015. We first found the list of active
probes as of December 31, 2015, using the RIPE Atlas probe archive [94], and
found 16,584 active probes. Next, we scraped each active probe’s connection logs
directly from the probe’s webpage [95]. Of these, 10,977 probes had been
connected to their central controllers for an aggregate duration of more than
30 days in 2015.
4.5.2 Probe filtering
We omit two sets of data from our analysis. The first is probes whose
connections can come from different addresses without any change to the
addresses assigned to them, for example, multihomed probes. The second is
connection log entries that represent movement from one location or provider to
another.
Once we omit a probe for anomalous behavior in connection logs, we omit that
probe from our analysis of the other RIPE Atlas datasets as well.
Table 4.2 provides an overview of the probes we omitted from the analysis.
Category                                     Probes
Total Probes                                 10,977
Not Analyzable
  Never changed                               3,073
  Dual Stack                                  3,728
  IPv6                                          237
  Multihomed / Core / Data-center (tags)        174
  Multihomed (alternating addresses)            511
  Only address change from 193.0.0.78           216
Analyzable (geography)                        3,038
  Multiple ASes                                 766
Analyzable (AS-level)                         2,272
Table 4.2: Of the 10,977 probes in the dataset, we are able to find address changes on 3,038
probes. 766 probes had addresses from multiple ASes; we discard address changes across
ASes for these probes from our geographic analysis and filter these probes altogether in
our AS-level analysis.
IPv6 and dual-stacked probes
Probes that communicate, even occasionally, over IPv6 are not useful for
understanding IPv4 address dynamics. We found 237 probes that made connections
solely over IPv6 and 3,728 that used both IPv4 and IPv6. The 3,728 that connect
over both protocols often alternated between address types, providing little in-
formation about the duration that the probe held any particular IPv4 address.
Concretely, if a dual-stacked probe established one TCP connection to the cen-
tral controller over IPv4 and the next TCP connection over IPv6, we cannot tell
whether or when the IPv4 address changed while the IPv6 connection was active.
We would need consecutive IPv4 connections from three different IPv4 addresses
to determine how long the probe held the address in the middle of the sequence.
In practice, a sequence of such IPv4 connections is rare for a dual-stack probe.
Multihomed and datacenter probes
We cannot use the connection logs dataset to observe address changes accurately
on multihomed probes (probes that have more than one available IP address con-
currently). For these probes, a connection from a new address could simply be
a connection from the other address assigned to the CPE, much like a dual-stack
probe. Probes at exchange points or in data centers are relatively few and seemed
more likely to be problematic (by exhibiting multihomed behavior) than instruc-
tive (by representing address changes experienced by customers).
To filter multihomed probes, we first looked for hints in user-provided “tags”
associated with a probe: 174 probes had at least one of the tags “multihomed,”
“datacentre,” or “core.” Tags are provided voluntarily and so probes may not be
tagged with those labels even if they were in fact multihomed; thus, we looked
for common features among the tagged probes which we could then use to omit
probes with similar behavior. The most common feature we found was that con-
nections from the tagged probes alternated between one fixed address and an-
other potentially changing address; we found this feature on 36 of the 174 tagged
probes. We found 511 other probes that matched this behavior and removed them
from the dataset. We expect that it is far more likely that when a host returns to
using a previously-used address, the host is choosing from among addresses it
holds for a long time rather than that the ISP reassigned a previously held ad-
dress to the host. We combine this behavioral, alternating-addresses, definition
of multihomed with the tags to choose probes to omit from analysis.
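The alternating-addresses behavior can be sketched as follows. This is only an
illustration under stated assumptions: the exact matching rule is not spelled
out above, so the sketch flags a probe when some address is returned to at
least min_returns times after other addresses were used in between. The
function name and the min_returns threshold are hypothetical.

```python
def looks_multihomed(addresses, min_returns=2):
    """Flag a probe whose connections keep returning to a previously used address.

    addresses: the probe's connection addresses in chronological order.
    min_returns: hypothetical threshold on how often one address must recur.
    """
    # Keep only address changes by collapsing consecutive repeats.
    changes = [a for i, a in enumerate(addresses) if i == 0 or a != addresses[i - 1]]

    seen = set()
    returns = {}  # address -> number of times the probe came back to it
    for addr in changes:
        if addr in seen:
            returns[addr] = returns.get(addr, 0) + 1
        seen.add(addr)
    return any(count >= min_returns for count in returns.values())


# One fixed address (192.0.2.10) alternating with changing addresses.
alternating = ["192.0.2.10", "198.51.100.1", "192.0.2.10", "198.51.100.7", "192.0.2.10"]
# An ordinary sequence of address changes with no address reused.
ordinary = ["198.51.100.1", "198.51.100.1", "198.51.100.7", "198.51.100.9"]
```

Here looks_multihomed(alternating) is true and looks_multihomed(ordinary) is
false. A stricter or looser threshold would trade missed multihomed probes
against wrongly discarded single-homed ones.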
4.5.3 Connection log entry filtering
We omit some entries in the connection log either because of properties of the
address involved or because the probe reported an address from one autonomous
system for one connection and an address from a different autonomous system for
the next connection. Removing these connection log entries does not generally
remove probes entirely from analysis.
Testing addresses
For some probes, the first address change was a transition away from the same
IP address, 193.0.0.78. This address belongs to the RIPE NCC, and was used to
test probes before they were shipped to volunteers. There were 427 probes that
started with this address; we remove this connection log entry for each of
them. After this removal, 216 of these probes had no further address changes in
2015, so we omitted them as well (Table 4.2).
Address changes across ASes
When attributing behavior to individual autonomous systems, we omit from
analysis any probes where address changes indicated a change from the address
space of one autonomous system to the address space of another. We used CAIDA’s
IP-to-AS dataset [96] to map each IP address to its autonomous system. CAIDA
publishes the IP-to-AS dataset monthly; thus, we found the month in which a
new IP address was assigned to a probe and used CAIDA’s IP-to-AS dataset for
that month to find the AS for that address. We found 766 probes with at least one
address change spanning different autonomous systems. These ASes could be
sibling ASes owned by the same ISP, but could also belong to different ISPs if the
owner of the probes switched ISPs. For our geographic analysis (Section 4.6.2),
we discarded the address changes spanning ASes for these probes, but retained
the address changes within the same AS. For our AS-level analysis of renumber-
ing behavior (Section 4.6.3), we made the conservative choice of filtering these
probes altogether.
Table 4.2 summarizes the dataset and the number of probes filtered. After
the filtering process we had 2,272 probes analyzable for AS-level renumbering
behavior, and 3,038 probes analyzable for geographic renumbering behavior. For
each analyzable probe in Table 4.2, we found address changes along with the time
of the address change and used them to find the duration for which addresses
were assigned before changing.
4.5.4 k-root ping dataset
We detect network outages using two items from the built-in RIPE Atlas probe
measurements. Every four minutes, each probe sends three pings to the k-root
DNS server and logs the number of sent pings and the number of successful re-
sponses [97]. Table 4.3 shows a sample of this log. Probes report the results of
ID      Timestamp         N sent   N success   LTS
16893   Jan 27 09:01:42   3        3           86
16893   Jan 27 09:05:48   3        0           151
16893   Jan 27 09:09:45   3        0           388
16893   Jan 27 09:13:36   3        0           619
16893   Jan 27 09:17:49   3        0           872
16893   Jan 27 09:21:40   3        0           1103
16893   Jan 27 09:25:39   3        3           1342
16893   Jan 27 09:29:36   3        3           146
Table 4.3: Sample of k-root ping dataset for probe ID 16893 when a network outage
occurred. We detect a network outage when pings to the k-root server are lost and when
this ping loss is accompanied by increasing Last Time Synchronized (LTS) values. Here
we detect a network outage beginning at Jan 27 09:05:48 and ending at Jan 27 09:21:40.
these and other measurements via HTTP POST to the central controller once ev-
ery four minutes. Along with the measurement data, the probe also reports the
current LTS or “last time synchronised” value. This value indicates when the
probe last synchronized its clock with that of the central controller. Typically,
probes synchronize their clocks by NTP or upon receipt of the HTTP verify re-
sponse from the controller [93], so in the absence of an outage, the reported LTS
value should be less than four minutes (240 seconds).
We use a combination of the ping responses and the LTS value to infer a
network outage, so that we have two (mostly) independent measurements that
indicate that the probe’s network has failed. We consider the network outage
to start at the first measurement where all pings to the k-root server were lost,
and to end at the last measurement where all pings were lost. If the LTS value
did not grow, that would indicate that the probe was still able to communicate
with the controller, and thus would not be an outage. Note that this interval
underestimates the duration of a network outage by up to eight minutes.
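As an illustration of this rule, the following sketch scans k-root records for
runs of fully lost measurements and keeps a run only if the LTS value grows
throughout it. It is a simplified reading of the procedure, not the code used
here: the record layout (timestamp, sent, success, lts) and the function name
are assumptions, and "growing" is interpreted as each LTS value being larger
than the one reported just before it.

```python
from datetime import datetime


def network_outages(records):
    """Return (start, end) timestamps of inferred network outages.

    records: chronological list of (timestamp, sent, success, lts) tuples.
    """
    outages = []
    run = []          # timestamps of the current run of fully lost measurements
    lts_grew = True   # whether LTS kept increasing during the current run
    prev_lts = None
    for ts, sent, success, lts in records:
        if sent > 0 and success == 0:
            if prev_lts is not None and lts <= prev_lts:
                lts_grew = False  # clock was re-synchronized: controller reachable
            run.append(ts)
        else:
            if run and lts_grew:
                outages.append((run[0], run[-1]))
            run, lts_grew = [], True
        prev_lts = lts
    if run and lts_grew:  # the log ended while the outage was still ongoing
        outages.append((run[0], run[-1]))
    return outages


# The records of Table 4.3 (the log is from 2015).
def jan27(h, m, s):
    return datetime(2015, 1, 27, h, m, s)


table_4_3 = [
    (jan27(9, 1, 42), 3, 3, 86),
    (jan27(9, 5, 48), 3, 0, 151),
    (jan27(9, 9, 45), 3, 0, 388),
    (jan27(9, 13, 36), 3, 0, 619),
    (jan27(9, 17, 49), 3, 0, 872),
    (jan27(9, 21, 40), 3, 0, 1103),
    (jan27(9, 25, 39), 3, 3, 1342),
    (jan27(9, 29, 36), 3, 3, 146),
]
detected = network_outages(table_4_3)
```

For the records of Table 4.3, this yields a single outage from 09:05:48 to
09:21:40, matching the table caption.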
4.5.5 SOS-uptime dataset
ID    Timestamp        Uptime counter value
206   Jan 1 03:15:18   262531
206   Jan 1 17:50:26   315038
206   Jan 1 17:50:55   19
206   Jan 1 17:53:59   203
206   Jan 1 18:59:44   4147
Table 4.4: Sample of SOS-uptime records from RIPE Atlas for January 1 2015 for probe ID
206. The third row shows that the uptime counter had reset 19 seconds before 17:50:55,
allowing us to infer that the probe rebooted at 17:50:36.
The SOS-uptime dataset contains probe uptime counter values over time.
The uptime counter on each probe is 64 bits long and counts the number of sec-
onds since the probe booted. Probes report their uptime counter value to the
central controller every time they make a new TCP connection to the controller.
We use the SOS-uptime dataset to determine when RIPE Atlas probes re-
booted by finding when the uptime counter was reset. For example, consider the
sample SOS-uptime records from the RIPE Atlas dataset for probe ID 206 shown
in Table 4.4. The first entry at 03:15:18 on January 1st shows that the probe had
been up for 262,531 seconds. Later that evening, the probe is shown to have been
up for 315,038 seconds, but the next uptime counter value reports that the probe
was up for only 19 seconds. We infer that a reboot occurred 19 seconds earlier, at
17:50:36.
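The reboot inference can be sketched in a few lines. This is a minimal
illustration that assumes the SOS-uptime records are available as
chronological (timestamp, uptime in seconds) pairs; the function name is
hypothetical. A reboot is inferred whenever the counter is smaller than in the
previous record, and the reboot time is the report time minus the reported
uptime.

```python
from datetime import datetime, timedelta


def infer_reboots(records):
    """Return inferred reboot times from (timestamp, uptime_seconds) records."""
    reboots = []
    prev_uptime = None
    for ts, uptime in records:
        # A counter lower than the previous one means it was reset in between.
        if prev_uptime is not None and uptime < prev_uptime:
            reboots.append(ts - timedelta(seconds=uptime))
        prev_uptime = uptime
    return reboots


# The records of Table 4.4 for probe ID 206.
table_4_4 = [
    (datetime(2015, 1, 1, 3, 15, 18), 262531),
    (datetime(2015, 1, 1, 17, 50, 26), 315038),
    (datetime(2015, 1, 1, 17, 50, 55), 19),
    (datetime(2015, 1, 1, 17, 53, 59), 203),
    (datetime(2015, 1, 1, 18, 59, 44), 4147),
]
reboots = infer_reboots(table_4_4)
```

Applied to Table 4.4, the sketch reports one reboot at 17:50:36. A reboot
followed by a long silent period, after which the new counter already exceeds
the old one, would be missed by this simple comparison.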
After finding reboot times, we use the k-root ping dataset to measure how
long each power outage lasted. When we detect a reboot, we use the difference in
time between successive pings to the k-root server to estimate the power outage
duration.
4.5.6 Associating inter-connection gaps with outage events
The next task is to synthesize these three datasets to identify outage events that
occur between TCP connections to the central controller. The TCP connection
to the central controller breaks when the IP address changes, when the probe
reboots, when the CPE reboots, or when there is a power outage or significant
network outage. For example, the reboot at 17:50:36 in Table 4.4 falls in the
gap between rows 2 and 3 of Table 4.1: it occurs after the end of the
connection log entry ending at 17:34:11 and before the start of the connection
log entry beginning at 18:00:54.
We use a priority ordering to assign outages to inter-connection gaps. If
the k-root dataset indicates a network outage in the gap, we associate the gap
with a network outage. If instead the SOS-uptime dataset indicates a reboot
coincident with missing attempted k-root pings from the k-root dataset, we
associate the gap with a power outage. If neither occurred, we mark the gap as
“no-outage”, indicating that the reconnection was not associated with any
outage.
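The priority ordering can be written down as a short sketch. It is
illustrative only and rests on several assumptions: network outages are given
as (start, end) pairs, reboots as timestamps, and k-root measurements as a
sorted list of timestamps; "missing attempted pings" is approximated by a gap
between the measurements surrounding the reboot that exceeds twice the
four-minute interval. The function names and the slack factor are
hypothetical.

```python
from datetime import datetime, timedelta


def pings_missing_around(reboot, ping_times, expected=timedelta(minutes=4), slack=2.0):
    """True if k-root measurements around the reboot are further apart than expected."""
    before = [t for t in ping_times if t <= reboot]
    after = [t for t in ping_times if t > reboot]
    if not before or not after:
        return True  # assumption: no measurement on one side counts as missing
    return (after[0] - before[-1]) > expected * slack


def classify_gap(gap_start, gap_end, network_outages, reboots, ping_times):
    """Label an inter-connection gap as 'network', 'power', or 'no-outage'."""
    # Highest priority: a network outage overlapping the gap.
    for start, end in network_outages:
        if start <= gap_end and end >= gap_start:
            return "network"
    # Next: a reboot inside the gap that coincides with missing k-root pings.
    for reboot in reboots:
        if gap_start <= reboot <= gap_end and pings_missing_around(reboot, ping_times):
            return "power"
    return "no-outage"


# Gap between rows 2 and 3 of Table 4.1; the ping timestamps are hypothetical.
gap = (datetime(2015, 1, 1, 17, 34, 11), datetime(2015, 1, 1, 18, 0, 54))
reboot = datetime(2015, 1, 1, 17, 50, 36)
pings = [datetime(2015, 1, 1, 17, 30, 5), datetime(2015, 1, 1, 18, 1, 10)]
label = classify_gap(gap[0], gap[1], [], [reboot], pings)
```

With the hypothetical ping timestamps above, the gap is labeled as a power
outage.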
4.6 Periodic address changes
ISPs can assign dynamic addresses for as long as they wish. In DHCP, long leases
simplify administration, while short leases can be more efficient in reclaiming
unused addresses. DHCP leases, however, are meant to be renewable by devices
that are still active. In this section, we look at periodic address reassignment: in-
stances where a device changes address periodically, despite actively using the
address. Periodic reassignment is atypical for devices using DHCP since a de-
vice that is continuously renewing its lease should continue to keep its current
address [79].
4.6.1 Metric to detect periodic address durations
If ISPs intentionally renumber after specific durations, we would expect those
address durations to be prominent in a distribution of all address durations be-
longing to that ISP. We initially considered studying distributions of raw address
durations, similar to the analyses by Maier et al. [87] and Moura et al. [86], but
found that short address-durations were overrepresented. For example, in Ta-
ble 4.1, inspecting the cumulative distribution of address durations would sug-
gest that only half the addresses (3 of 6) were assigned for 24 hours. However,
when trying to reason about the expected duration that an address will continue
to be assigned to the CPE, we would like to know the fraction of total time that
each duration accounted for. For example, in Table 4.1, the CPE was assigned 24
hour long addresses for roughly three-quarters of the total measured time. This
latter notion is more useful to find whether an ISP is using periodic durations
consistently, since the modes at intervals on the scale of days will be more visible.
To capture this notion we define a metric, the total time fraction. For a given
probe and an address duration d, we define the total time fraction for d as the
fraction of time spent by the probe in durations of length d. We compute the total
time fraction for a given probe and a duration d by obtaining the total address
time for the probe, and computing the fraction of the total address time that was
accounted for by address durations of length d. For a probe p, if n(d) is the
number of times the probe had an address duration d and D is an array
containing all address durations that were assigned to the probe, the total
time fraction for the address duration d is given by:
$$f^p_d = \frac{d \times n(d)}{\Sigma(D)}$$
We use a similar procedure for computing the total time fraction consid-
ering all probes in an ISP, country, or continent. We believe that the total time
fraction offers a better representation of the probability that an address was as-
signed for a certain duration than a simple inspection of the address durations.
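A worked sketch of the metric follows. It assumes that durations have already
been grouped into comparable values (here, whole hours), which is an assumption
of the sketch rather than a documented step; the function name and the sample
durations are hypothetical and are not the values of Table 4.1.

```python
from collections import Counter


def total_time_fractions(durations):
    """Map each duration d to d * n(d) / sum(D) for one probe's durations D."""
    total = sum(durations)
    counts = Counter(durations)  # n(d) for every distinct duration d
    return {d: d * n / total for d, n in counts.items()}


# Hypothetical probe: three 24-hour durations and three short ones.
sample = [24, 24, 24, 6, 10, 8]
fractions = total_time_fractions(sample)
share_24h = fractions[24]
```

In this sample only half of the durations (3 of 6) are 24 hours long, yet they
account for 72 of the 96 total hours, so the total time fraction at 24 hours
is 0.75.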
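[Figure 4.1 plot: cumulative distribution curves, one per continent. x-axis: IP
address-duration (log-scale), 1h to 2mo; y-axis: fraction of total
address-duration, 0.0 to 1.0. Legend: EU (784.3), NA (127.2), AS (97.79),
AF (48.06), SA (41.71), OC (27.27).]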
Figure 4.1: Cumulative distribution of total time fraction by continent. Modes (vertical
segments in the CDF) indicate periodic renumbering. Addresses in North America are
relatively long lived and free of periodic renumbering.
4.6.2 Periodic address changes by geography
We begin by inspecting how address durations vary across continents. We ex-
pected that address scarcity might affect address durations, leading to longer
durations in North America and shorter durations in Asia. We use RIPE At-
las’s probe database to find the country to which each probe belongs. Next, we
aggregate the address durations of probes by their respective countries and sub-
sequently, to their continents. Figure 4.1 shows the cumulative distribution of the
total time fraction for each continent, i.e., the y-axis shows the fraction of total ad-
dress duration accounted for by durations less than the x-axis value. The number
in parentheses in the legend for each continent shows the total address duration
for that continent in years (Σ(D)).
In Europe, Asia, Africa, and South America, address durations exhibit
well-defined modes, mostly at intervals that are multiples of 24 hours. The
most common mode is at exactly 24 hours: the total time fraction at 24 hours is
0.16 for European addresses, 0.16 for African addresses, and 0.07 for Asian
addresses. One-week address durations are also common in Europe, with a total
time fraction of 0.08 at 1 week. South American addresses exhibit multiple
modes: their total time fraction is 0.11 at 12 hours, 0.07 at 28 hours, 0.09 at
48 hours, and 0.03 at 192 hours (8 days).
The curves for North America and Oceania do not have well-defined modes,
suggesting that ISPs in these continents do not periodically change addresses.
Further, North American probes typically retain their dynamic addresses for much
longer durations than other continents; North American addresses spent more
than half of the total time in address durations longer than 50 days. This sug-
gests that IP addresses can be used as end-host identifiers in North America for
several weeks.
4.6.3 Periodic address changes by AS
We next considered whether the configuration decision to renumber periodically
was uniform across an AS, or could reflect some other feature. For example, pe-
riodic renumbering could be a result of an unexpected cron job on the RIPE Atlas
probe or a faulty DHCP client that could not renew. Periodic renumbering could
be due to government regulations in countries, perhaps as a privacy measure. It
could also simply reflect ISP policy, perhaps to hinder users from running web
servers as anecdotal evidence suggests [98]. Investigating AS-level behavior can
inform whether the periodic renumbering behavior is concentrated in some ASes
and absent in others, shedding light on its potential cause.
[Figure 4.2 plot: cumulative distribution curves, one per AS. x-axis: IP
address-duration (log-scale), 1h to 2mo; y-axis: fraction of total
address-duration, 0.0 to 1.0. Legend: Orange (79.71), DTAG (50.89), BT (45.43),
LGI (15.84), Verizon (23.3).]
Figure 4.2: Cumulative distribution of total time fractions for ASes with most RIPE Atlas
probes that yielded at least one address duration. Probes from Orange and DTAG spent
more than half of their total duration in periodic durations of 1 week and 1 day
respectively. BT also showed evidence of periodic renumbering with a mode at two weeks. On
the other hand, LGI and Verizon have no modes at any durations, and spent most of their
total time in durations that were weeks long.
4.6.3.1 Is periodic renumbering prevalent across all ISPs?
We first investigate the ASes with the largest deployment of RIPE Atlas probes
where we detected at least two instances of address changes. Recall that we only
obtain an address duration when the address began and ended during the inter-
val we studied, so that a minimum of two address changes are necessary for a
probe to yield an address duration. Figure 4.2 shows the cumulative distribution
of total time fractions for the five autonomous systems with the most probes that
yielded address durations. In this figure, Orange, an ISP from France, appears
to change addresses after a duration of 168 hours (1 week): 55% of its total ad-
dress duration was a week long. The German ISP, Deutsche Telekom AG (DTAG)
reassigns addresses after 24 hours: 76% of the total address duration lies in that
mode. British Telecom (BT) has a mode at 336 hours (2 weeks) with 13% of its total
duration being in 2 week intervals. We study these ASes further in Section 4.6.4.
The other two ISPs do not exhibit any evidence of periodic renumbering.
Liberty Global, an ISP to which probes spread across Europe belong, does not
appear to change addresses periodically and neither does Verizon (US). Among
these ASes, Verizon has the longest address durations.
Since periodic renumbering behavior is widespread in some ISPs and non-
existent in others, we conclude that the cause of periodic renumbering is likely
ISP policy.
4.6.3.2 Is periodic renumbering geographically correlated?
Next, we investigate how the periodic renumbering behavior of ISPs correlates
with the country in which they operate. Germany has more than a hundred RIPE
Atlas probes deployed across several ISPs, thus we study their address dura-
tions in Figure 4.3 for ISPs with probes that contributed at least 3 years of total
time. Many ISPs in Germany change addresses every 24 hours: 77% of the
duration in DTAG (AS 3320), 76% in Telefonica1 (AS 6805), 74% in Telefonica2
(AS 13184), and 29% in Vodafone (AS 3209) is 24 hours. We observe that the
’other’ ISPs also have a mode at 24 hours, suggesting that German ISPs are
particularly likely to renumber every 24 hours. However, this behavior is not
universal: Kabel Deutschland (AS 31334) and Kabel BW (AS 29562) do not exhibit
a mode at 24 hours; instead, more than 90% of their total address duration was
spent in durations longer than two weeks.
[Figure 4.3 plot: cumulative distribution curves, one per AS. x-axis: IP
address-duration (log-scale), 1h to 2mo; y-axis: fraction of total
address-duration, 0.0 to 1.0. Legend: DTAG (50.98), Vodafone (15.59),
Telefonica1 (11.43), Telefonica2 (12.32), others (15.53), Kabel DE (12.94),
Kabel BW (3.42).]
Figure 4.3: Cumulative distribution of total time fractions for ASes in Germany. Many
German ISPs appear to change addresses every 24 hours. However, some ISPs have more
stable addresses.
These results suggest that periodic renumbering behavior can exhibit some
geographic correlation, but is likely largely caused by ISP policy.
Private communication with a large European ISP confirmed that the ISP
renumbers every 24 hours, since the ISP considers this scheme to be more ’privacy
secure’ although there is no government regulation that forces this feature. The
ISP also reported that it uses PPPoE instead of DHCP for its DSL lines (which
accounted for the vast majority of its customers). Since periodic behavior would
be atypical of DHCP but consistent with PPP techniques for address assignment,
we speculate that periodic renumbering is a property of ISPs that use PPP.
4.6.4 ISPs that renumber periodically
In this section, we look specifically at ISPs that renumber periodically to infer the
period over which they renumber, the fraction of the ISPs’ probes which periodi-
cally renumber, how reliably the renumbering occurs at the end of the period, and
whether the renumbering is synchronized across probes. We classify a probe as
“periodic” when its total time fraction at some duration d exceeds 0.25. We set the
threshold to 0.25 because we expect a probe whose address is reassigned period-
ically to sometimes have a shorter duration, say, due to a reboot, and sometimes
have a longer duration, say, by receiving the same address again.
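The classification rule can be sketched as a small helper. It assumes that a
probe's total time fractions are available as a mapping from duration to
fraction; the function name and the example values are hypothetical.

```python
def periodic_duration(fractions, threshold=0.25):
    """Return the duration d whose total time fraction exceeds the threshold.

    fractions: mapping of duration -> total time fraction for one probe.
    Returns None when no duration exceeds the threshold (probe is not periodic).
    """
    if not fractions:
        return None
    # The duration with the largest fraction is the candidate period.
    d, fraction = max(fractions.items(), key=lambda item: item[1])
    return d if fraction > threshold else None


# Hypothetical probes: one dominated by 24-hour durations, one without a mode.
daily_probe = {24: 0.75, 6: 0.0625, 10: 0.1042, 8: 0.0833}
irregular_probe = {24: 0.2, 48: 0.2, 100: 0.2, 130: 0.2, 300: 0.2}
```

Here periodic_duration(daily_probe) returns 24, while
periodic_duration(irregular_probe) returns None.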
We consider autonomous systems having at least five probes with an ad-
dress change of which at least three probes are periodic, and provide an overview
of their renumbering period and behavior in Table 4.5. The periodic duration d is
shown in hours; 24 hour durations are typical. Renumbering in this table is pri-
marily a feature of central Europe, with some in Russia, Kazakhstan, Mauritius,
and South America. We describe the rest of the columns in the next subsections.
4.6.4.1 What fraction of probes is periodic?
Even for ISPs such as Orange and DTAG which have total time fraction at period
d in excess of 0.5, not all address durations equal d; some durations are shorter
and others longer as seen in Figure 4.2. One possible explanation is that only a
few probes in these ISPs were periodically renumbered while others were not.
Alternatively, periodic probes may sometimes have address durations not equal
to d. We find that it is usually a combination of both factors that leads to
non-periodic durations in these ISPs, although the extent to which each is
responsible varies by ISP.
In Table 4.5, the N column shows the number of probes with at least one
address change in the dataset. The next column, $f^p_d > 0.25$, shows the number
of periodic probes: those having a total time fraction of more than 0.25 at
duration d.
In some ISPs, only a small fraction of probes are periodically renumbered. For
example, only a fifth of the probes in BT were periodic with a 2-week period,
partially explaining why the total time fraction at 2-weeks for BT in Figure 4.2 is
only 0.13.
The subsequent columns, $f^p_d > 0.5$ and $f^p_d > 0.75$, show what percentage of
the periodic probes are persistently so, where the total time fraction at
duration d is more than half or three quarters. We show percentages rather than
raw counts in these columns to simplify the comparison, given that these
providers have different sizes. A high percentage indicates that most of the
periodic probes (with $f^p_d > 0.25$) are strongly so ($f^p_d > 0.75$). A low
percentage indicates that probes may either be reassigned early (due to
outages) or late (due to inconsistent reassignment). We can see that only 15%
of the periodic probes in BT had $f^p_d > 0.5$ and none had $f^p_d > 0.75$,
providing further explanation for why the total time fraction at 2 weeks for BT
is low.
Other ISPs have a much larger fraction of their probes that are periodic:
more than 80% of probes in Orange, DTAG, Telefonica Germany, A1 Telekom,
Hrvatski, ISKON, ANTEL, Global Village Telecom, Mauritius Telecom, Orange
Polska, and Digi Tavkozlesi are periodic. For each of these ISPs, more than 75%
of the periodic probes are persistently periodic, having $f^p_d > 0.5$. For
DTAG, Telefonica, A1 Telekom, Hrvatski, ANTEL, and Orange Polska, more than 75%
of the periodic probes have $f^p_d > 0.75$. Notable is Orange Polska, which has
four of its ten probes periodic at 24 hours and five more probes periodic at
22 hours, and 100% of them have a time fraction at their respective durations
greater than 0.75.
Probes in these ISPs typically have address durations capped at d. Address
durations can sometimes be shorter—potentially due to outages or reboot/reconnect
events as we show in Section 4.7—but can occasionally be larger than d as well.
We study these next.
4.6.4.2 Why are some address durations longer than the period?
Some address durations exceed the typical period, d, for an ISP. We would like to
determine whether this is a behavior limited to a few probes in the ISP
(potentially caused by unusually designed CPE devices), or if the
longer-than-typical
durations are spread across probes.
How many periodic probes have an address duration longer than d? We
expected that no address duration for such probes would exceed the periodic du-
ration d. That is, if the ISP was renumbering a probe on a schedule, then some
additional renumbering would be possible due to other reasons, but the probe
would never keep its address longer than d. This expectation does not hold.
The column MAX ≤ d shows the percentage of the periodic probes whose maximum
address duration was no greater than d (to count only those durations that
clearly exceeded d, we used d + 5% as the limit for this column). Across
all periodic probes, 94% of those that appear to be on a one-week renumbering
schedule did not have an address duration longer than one week; only 44% of
those that appeared to be on a one-day renumbering schedule had all durations
limited by twenty-four hours.
This fraction seemed surprisingly low. Why would so many probes show
daily renumbering, even reporting a total time fraction of 0.75, when the probe
might also keep its address longer? We considered two possible explanations that
would have the same symptoms: that a periodic renumbering was skipped or
that the same address was (perhaps by random chance) assigned again. In these
cases, rather than see an address change after 24 hours, we might see one at 48 or
even 72 hours. We term such address changes “Harmonics”, and consider what
fraction of the time all address changes are at or before d (as expected), or occur
at a multiple of d. The percentage of probes that match this loosened definition
(a superset of those in MAX ≤ d) appears in the last column of Table 4.5. Most
periodic probes from all ISPs except Global Village Telecom and SONATEL-AS
have maximum durations of this kind.
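The two per-probe checks behind the MAX ≤ d and Harmonic columns can be
sketched as follows. The 5% allowance on d comes from the text above; applying
the same allowance when testing whether a longer duration sits at a multiple of
d is an assumption of this sketch, as are the function names and the example
durations.

```python
def max_within_period(durations, d, tolerance=0.05):
    """True if no address duration clearly exceeds the period d (d + 5%)."""
    return max(durations) <= d * (1 + tolerance)


def is_harmonic(durations, d, tolerance=0.05):
    """True if every duration is at most d (+5%) or close to a multiple of d."""
    limit = d * (1 + tolerance)
    for x in durations:
        if x <= limit:
            continue  # ended at or before the period, as expected
        nearest_multiple = round(x / d) * d
        if abs(x - nearest_multiple) > d * tolerance:
            return False  # longer than d and not near 2d, 3d, ...
    return True


# Hypothetical daily-renumbered probe that once kept its address for two periods.
skipped_once = [24, 23.6, 48.1, 10]
# Hypothetical probe with a long duration that is not near a multiple of 24 hours.
irregular = [24, 60]
```

For skipped_once with d = 24, max_within_period is false but is_harmonic is
true, since 48.1 hours is within the allowance of 2d. For irregular,
is_harmonic is false.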
AS                      ASN    Country     d    N     f^p_d>0.25  f^p_d>0.5  f^p_d>0.75  MAX ≤ d  Harmonic
All                                        24   2272  193         88.6%      68.4%       43.5%    89.6%
All                                        168  2272  123         74.0%      13.8%       94.3%    98.4%
Orange                  3215   France      168  122   111         77%        14%         98%      99%
DTAG                    3320   Germany     24   63    51          96%        86%         78%      98%
Telefonica DE 2         6805   Germany     24   17    15          93%        80%         27%      93%
Telefonica DE 1         13184  Germany     24   14    14          93%        86%         21%      100%
PJSC Rostelecom         8997   Russia      24   22    13          100%       69%         23%      100%
BT                      2856   U.K.        337  67    13          15%        0%          38%      62%
Proximus                5432   Belgium     36   41    12          83%        8%          0%       83%
A1 Telekom              8447   Austria     24   12    11          100%       91%         73%      100%
Vodafone GmbH           3209   Germany     24   21    9           78%        11%         0%       89%
Hrvatski                5391   Croatia     24   7     7           100%       100%        43%      86%
ISKON                   13046  Croatia     24   6     6           83%        33%         0%       100%
ANTEL                   6057   Uruguay     12   6     6           100%       100%        33%      100%
Global Village Telecom  18881  Brazil      48   6     6           100%       67%         0%       17%
Mauritius Telecom       23889  Mauritius   24   6     5           100%       20%         20%      100%
JSC Kazakhtelecom       9198   Kazakhstan  24   15    5           80%        80%         60%      80%
Orange Polska           5617   Poland      22   10    5           100%       100%        60%      80%
VIPnet                  31012  Croatia     92   7     4           75%        0%          75%      75%
Proximus                5432   Belgium     24   41    4           50%        25%         0%       75%
Digi Tavkozlesi         20845  Hungary     168  4     4           100%       25%         100%     100%
Orange Polska           5617   Poland      24   10    4           100%       100%        50%      100%
Free SAS                12322  France      24   12    3           100%       67%         0%       67%
SONATEL-AS              8346   Europe      24   7     3           33%        33%         33%      33%
Net by Net              12714  Russia      47   7     3           100%       100%        67%      100%
Table 4.5: Autonomous systems that had at least three probes with a total time fraction
for duration d (in hours) greater than 0.25. $f^p_d > 0.25$ shows the number of probes that
had a total time fraction at d greater than 0.25; $f^p_d > 0.5$ and $f^p_d > 0.75$ show the
percentage of those probes that had fractions greater than 0.5 and 0.75 for the same duration.
MAX ≤ d shows the percentage of probes whose maximum duration was no greater
than d. “Harmonic” represents the percentage of probes that, if not renumbered after d,
are renumbered after some multiple of d hours. The ASes are sorted in decreasing order
of $f^p_d > 0.25$.
4.6.4.3 Are changes synchronized?
[Figure 4.4 plot: histogram. x-axis: hour of the day (GMT), 1 to 24; y-axis:
address changes, 0 to 150.]
Figure 4.4: Periodic address changes in Orange appear more evenly distributed among
the hours of the day.
[Figure 4.5 plot: histogram. x-axis: hour of the day (GMT), 1 to 24; y-axis:
address changes, 0 to 3000.]
Figure 4.5: Periodic address changes are more likely in some hours for Deutsche Telekom.
We imagine two broad strategies for daily renumbering: either leaving each
customer on an independent, free-running clock that resets after 24 hours, or
synchronizing all address changes to an off-peak time when few would be
interrupted. Both seem reasonable strategies: independent clocks seem simple to
implement, while synchronized address changes seem more likely to shuffle
addresses since many addresses are made available during the synchronized
interval. If one were to blacklist addresses for misbehavior, knowing which
strategy is in use would help to choose how long to keep the blacklist entry.
We expect
that plotting the time of day at which addresses change for each ISP will expose
whether the renumbering is synchronized.
For Orange and DTAG, the two ISPs with the most periodic probes, we find the
hour of the day in which each address duration of length d ended and show
these in Figure 4.4 and Figure 4.5. For Orange, periodic address changes are
not concentrated during any specific hours of the day. For DTAG, however,
periodic address changes occur more often during some hours of the day. In
private correspondence with a large European ISP, we learned that many CPE
devices come with an option to choose the time at which they should disconnect
and reconnect to receive a new address, as a privacy feature. Figure 4.5 is
consistent with this deployment scenario: almost three quarters of all periodic
address changes occur between hours 24 and 6 (in GMT). However, a quarter of
the periodic address changes happen at other hours of the day, suggesting that
some CPEs do not have this feature.
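The hour-of-day analysis can be sketched as a simple histogram. The sketch
assumes the end times of periodic address durations are available as
timestamps already expressed in GMT; the function names and the sample
timestamps are hypothetical. Hours are indexed 0 to 23 here, so the window
from hour 24 to hour 6 in the figures corresponds to indices 0 through 5.

```python
from collections import Counter
from datetime import datetime


def changes_by_hour(change_times):
    """Count address changes per hour of the day (index 0 = midnight hour)."""
    counts = Counter(t.hour for t in change_times)
    return [counts.get(hour, 0) for hour in range(24)]


def share_in_window(histogram, start_hour, end_hour):
    """Fraction of all changes falling in hours [start_hour, end_hour)."""
    total = sum(histogram)
    return sum(histogram[start_hour:end_hour]) / total if total else 0.0


# Hypothetical end times of four periodic address durations.
sample_changes = [
    datetime(2015, 3, 1, 1, 0, 0),
    datetime(2015, 3, 2, 2, 30, 0),
    datetime(2015, 3, 3, 3, 15, 0),
    datetime(2015, 3, 4, 14, 0, 0),
]
histogram = changes_by_hour(sample_changes)
night_share = share_in_window(histogram, 0, 6)
```

In this hypothetical sample, three of the four changes fall between midnight
and 6:00, giving a share of 0.75. A flat histogram would instead suggest
independent, free-running clocks.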
4.7 Outage-caused address changes
In Section 4.6.4, we saw that even probes from ISPs that renumber periodically
often have durations shorter than the typical period. In this section, we study an-
other potential cause of address change: outages occurring at the CPE (customer
premises equipment), due to loss of power or network connectivity. Here, we
quantify how frequently and for which probes an outage event at the CPE device
appears to cause the reassignment of its IP address. If an outage event occurs
at approximately the same time as an address change, we assume that the out-
age caused the address change. If an outage event occurs distant in time from an
address change, then we assume that the outage did not cause an address change.
There are three versions of RIPE Atlas probes: v1, v2, and v3. More than 75%
of probes are v3, although the distribution of versions within individual ISPs
varies. We find network outage events on all versions of probes since network
outages, by definition, occur while a probe is up and reporting measurements.
However, finding power outage events is confounded by the presence
of potential false positives and negatives. We address these in detail next and
describe our approach for filtering falsely inferred power outages.
4.7.1 Filtering falsely inferred power outages
The SOS-uptime data (Section 4.5.5) allows us to determine when the probe re-
booted. Ideally, however, we would like to know when the CPE rebooted. Fortu-
nately, probe reboots are often representative of CPE reboots due to a combina-
tion of how the RIPE NCC suggests that probes be installed [99] and expected
fate sharing of co-located devices powered together, as we describe next.
The RIPE Atlas probe gets power from USB; because of this design, the
probe can be powered by the USB port on the CPE and will be power-cycled
whenever the CPE reboots. When the probe is plugged into the CPE, or both
together are power-cycled, a probe reboot indicates that the CPE also rebooted.
These represent the typical cases that are useful for the analysis of power outage
related address changes. The potential error scenarios are as follows. When the
CPE alone is rebooted but the probe is not, we would not observe a power out-
age, leading to a false negative. When the probe alone is rebooted but the CPE is
not, we would detect a power outage, leading to a false positive. Although we
expect probe reboots to be rare, a specific scenario in which they occur is when
the probe receives a firmware upgrade. We discuss how to remove probe reboots
due to firmware upgrades below in Section 4.7.2.
Older probe hardware (v1, v2) can also confound our inference of power
outages, because these probes may reboot when they create new TCP connections,
since they are vulnerable to memory fragmentation [100]. Address changes
create new TCP connections and could induce such reboots, so for our power
outage analysis we discard data from these older probes.
4.7.2 Removing reboots caused by firmware updates
The RIPE Atlas servers push firmware updates to probes simultaneously. When
a probe’s TCP connection to the central controller breaks, the probe will reboot
and install the firmware update. Our goal is to filter reboots that were associated
with a firmware update, since these reboots are the result of a dropped
connection rather than its cause. Figure 4.6 shows the number of unique probes
that rebooted on each day of 2015. We observe five periods during the year when
probes experienced more than twice as many reboots as the median for at least
two consecutive days.
[Figure 4.6 plot. x-axis: day of the year (Jan to Dec); y-axis: rebooted probes (0 to 1000).]
Figure 4.6: Number of unique probes that rebooted on each day of the year. Days with
exceptionally many reboots follow the distribution of firmware updates. We indicate
days where updates seem to have been distributed with diamonds along the x-axis.
For each of these periods, we found the first day corresponding to the spike
and identified that day as when the firmware update was distributed. Some dates
(April 14, July 6, October 5) agree precisely with documented RIPE Atlas firmware
and UI updates [101]. Other dates are close (we observe March 23 instead of
March 28, and January 25 instead of January 14) but nevertheless show the same
spike in reboots. We then discard the first reboot for each probe that occurred
after the firmware update.
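The filtering rule above can be sketched in a few lines of Python. This is an illustrative sketch rather than the code used in the study: the function names, the input layout (a mapping from day to the number of rebooted probes, and a mapping from probe ID to sorted reboot dates), and the use of calendar dates instead of timestamps are all assumptions. The threshold (more than twice the median, for at least two consecutive days) follows the text.

```python
from datetime import date, timedelta
from statistics import median


def find_update_days(reboots_per_day, factor=2.0, min_run=2):
    """Return the first day of every run of at least `min_run` consecutive
    days whose reboot count exceeds `factor` times the median daily count.

    reboots_per_day: {date: number of unique probes that rebooted} (assumed layout).
    """
    threshold = factor * median(reboots_per_day.values())
    update_days, run = [], []
    for day in sorted(reboots_per_day):
        above = reboots_per_day[day] > threshold
        contiguous = bool(run) and day - run[-1] == timedelta(days=1)
        if above and (not run or contiguous):
            run.append(day)
        else:
            # The current run (if any) has ended; keep it if it was long enough.
            if len(run) >= min_run:
                update_days.append(run[0])
            run = [day] if above else []
    if len(run) >= min_run:
        update_days.append(run[0])
    return update_days


def drop_first_reboot_after_update(reboot_days, update_days):
    """For each probe, discard the first reboot on or after each update day.

    reboot_days: {probe_id: sorted list of reboot dates} (assumed layout).
    """
    filtered = {}
    for probe, days in reboot_days.items():
        kept = list(days)
        for update in update_days:
            first = next((d for d in kept if d >= update), None)
            if first is not None:
                kept.remove(first)
        filtered[probe] = kept
    return filtered
```

Treating a reboot on the update day itself as "after the update" is a choice made for this sketch; the text does not specify the boundary.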
4.7.3 Most outages result in an address change for some ASes
We found network and power outage events and associated them with inter-connection
gaps as described in Section 4.5. If the connection log entries on either
side of the inter-connection gap used different addresses, we infer that the event
caused an address change and call the address change an Address change with network
outage, Address change with power outage, or Address change with no-outage,
depending upon the event.
[Figure 4.7 plot. CDF per AS: Orange (101), DTAG (57), BT (43), LGI (83), Verizon (48). x-axis: probability of an address change given a network outage (0.0 to 1.0); y-axis: fraction of probes (0.0 to 1.0).]
Figure 4.7: Distribution of P(ac|nw) per probe for the ASes with the most probes that
had at least one address change. Probes in DTAG, Orange, and BT are far more likely to
change addresses upon a network outage than probes in Verizon and LGI.
For each individual probe, we consider the conditional probability of an address
change given a detected outage. P(ac|nw) represents the conditional probability
that an address change occurred given a network outage and P(ac|pw)
represents the same for a power outage. We estimate this probability using the
fraction of outages occurring contemporaneously with an address change (out of
the total number of outages). We show the distribution of these probabilities by
probe to assess whether the group of probes (by geography or ISP) is dominated
by those that always or seldom change addresses on an outage.
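As an illustration of the estimate described above, the following Python sketch computes the per-probe fraction of outages that coincided with an address change. The function names and the input layout (one `(probe_id, address_changed)` pair per outage of a given type) are assumptions made for this example; the minimum of three outages per probe mirrors the filter used for the figures in this section.

```python
from collections import defaultdict


def p_ac_given_outage(outages, min_outages=3):
    """Estimate P(ac | outage) per probe as
    (# outages with an address change) / (# outages).

    outages: iterable of (probe_id, address_changed) pairs for one outage
    type, e.g. network outages (assumed layout).
    Probes with fewer than `min_outages` outages are skipped.
    """
    total = defaultdict(int)
    changed = defaultdict(int)
    for probe, address_changed in outages:
        total[probe] += 1
        changed[probe] += int(address_changed)
    return {p: changed[p] / total[p] for p in total if total[p] >= min_outages}


def share_of_probes(probabilities, predicate):
    """Fraction of probes whose estimated probability satisfies `predicate`."""
    values = list(probabilities.values())
    return sum(1 for v in values if predicate(v)) / len(values)
```

For example, `share_of_probes(probs, lambda p: p == 1)` gives the fraction of probes for which every outage was accompanied by an address change.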
We find that the likelihood of address change upon an outage event differs
across ASes. Figure 4.7 shows the CDF of P(ac|nw) for the five ASes that host the
most probes with at least one address change and at least three network outage
events. We find that probes in ASes that periodically renumber (Orange, DTAG,
and BT) have high P(ac|nw) compared to probes from ASes that do not periodically
renumber (LGI and Verizon). Around half of the probes in both Orange and
DTAG had P(ac|nw) equal to 1: every network outage was accompanied by an
address change!
[Figure 4.8 plot. CDF per AS: Orange (97), DTAG (20), BT (33), LGI (49), Verizon (31). x-axis: probability of an address change given a power outage (0.0 to 1.0); y-axis: fraction of probes (0.0 to 1.0).]
Figure 4.8: Distribution of P(ac|pw) per probe for probes running version 3. As with
network outages, probes in DTAG and Orange are more likely to change addresses upon
power outage than probes in Verizon and LGI.
Figure 4.8 shows P(ac|pw) for these ASes. Recall that we discarded probes
with versions 1 and 2 due to their potential to reboot as a result of an address
change, so we have fewer samples. The AS-level behavior for power outages is
similar to network outages. DTAG and Orange tend to renumber frequently upon
power outages; half of the probes in Orange and 40% of the probes in DTAG have
P(ac|pw) equal to 1. Verizon and LGI do not renumber frequently upon power
outages; only about half of their probes had an address change even once upon
an outage. Since the likelihood of an address change upon an outage can also
depend upon the duration of the outage, we investigate the distribution of outage
durations and the likelihood of address changes for different outage durations in
Section 4.7.4.

AS                      ASN    Country  N     P(ac|nw) > 0.8  P(ac|nw) = 1  P(ac|pw) > 0.8  P(ac|pw) = 1
All                                     1113  29.1%           16.9%         28.3%           14.6%
Orange                  3215   France   84    79%             54%           77%             50%
Telecom Italia          3269   Italy    28    71%             50%           57%             21%
BT                      2856   U.K.     22    64%             55%           50%             14%
Proximus                5432   Belgium  20    70%             45%           60%             30%
DTAG                    3320   Germany  19    58%             47%           47%             42%
Vodafone GmbH           3209   Germany  12    83%             75%           58%             42%
Wind Telecomunicazioni  1267   Italy    12    67%             42%           83%             42%
SFR                     15557  France   16    38%             25%           50%             6%
ISKON                   13046  Croatia  6     100%            50%           83%             67%
PJSC Rostelecom         8997   Russia   7     71%             29%           57%             14%

Table 4.6: Probes likely to change addresses upon network outages are also likely to
change addresses upon power outages. The table shows autonomous systems with at
least five probes whose conditional probability of address change upon network outage
was greater than 0.8. The N column shows the number of probes with at least three
network outages and at least three power outages. P(ac|nw) > 0.8 and P(ac|nw) = 1
show the percentage of N for which the conditional probability of address change upon
network outage was greater than 0.8 and equal to 1 respectively, and P(ac|pw) > 0.8,
P(ac|pw) = 1 show the same for power outages.
Since the ASes in Figure 4.7 and Figure 4.8 exhibit such disparate behavior,
we considered whether some ASes are particularly likely to renumber upon outages. To
investigate this, we found the set of probes with at least three network outages
and at least three power outages. We then found probes with P(ac|nw) of 0.8 or
more and show ASes with 5 or more such probes in Table 4.6.
First, we observe strong geographic correlation; all these ISPs are in Europe.
Second, we observe that P(ac|pw) is also high; the P(ac|nw) > 0.8 and P(ac|pw) > 0.8
columns are similar for all these ISPs (although P(ac|pw) = 1 tends to be lower because our
power outage detection technique is more prone to false positives). This suggests
that both types of outages are likely to cause address changes. Third, we find
that 7 of the 10 ISPs also appeared in Table 4.5. Maier et al. [87] studied the logs
from an urban area of a major European ISP that used RADIUS to assign addresses:
neither CPE nor RADIUS servers remember addresses. The behavior of these ISPs
that nearly always renumber is consistent with the behavior of the large DSL
provider in that study. Private communication with a large European ISP whose
probes consistently had an address change upon outage confirmed that they use
PPPoE and RADIUS to assign addresses for their DSL lines. We expect that this
property can be used as evidence in inferring a device’s link type.
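[Figure 4.9 plots. Left: LGI; right: Orange. Top panels: number of outages (stacked bars: Not renumbered, Renumbered); bottom panels: % of outages. x-axis: outage duration, binned as < 5m, 5-10m, 10-20m, 20-30m, 30-60m, 1-3h, 3-6h, 6-12h, 12-24h, 1-3d, 3d-7d, >1w.]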
Figure 4.9: The likelihood of an address change (renumbering) given network or power
outages of different durations in LGI (left) and Orange (right). The top graph is a his-
togram; the complete bar represents the number of outages observed across all probes
in that AS. The lightly-shaded bar extends for those outages that also saw an address
change. The lower graph shows the same data as a percentage. Although relatively few
outages lasted longer than a day, the majority of these were coincident with an address
change in both ISPs. However, Orange (right) changed addresses even on the shortest
outages.
4.7.4 Is there a relationship between outage duration and address
changes?
Devices with dynamic addresses assigned using DHCP should typically retain their
addresses as long as they continue to renew their lease half-way into the lease
duration, as the standard recommends [79]. However, an outage could prevent them from
renewing their lease. Depending upon the address churn at the time, the address
they had previously been assigned may be reassigned to another device. In
this way, an outage longer than half a lease duration could potentially cause an
address change.
To investigate this, we analyzed the conditional probability of an address
change given the occurrence of network or power outages of different durations
for probes from LGI (AS 6830) and Orange (AS 3215) in Figure 4.9. For network
outages, we considered outages from all versions of probes while for power out-
ages, we only considered outages from probes running v3. We chose these ISPs
due to their difference in address change behavior upon the occurrence of out-
ages as seen in Figure 4.7 and Figure 4.8.
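The per-duration analysis can be illustrated with a short Python sketch that bins outages by duration and reports the share that coincided with an address change. The bin labels follow the x-axis of Figure 4.9; the function name, the input layout, and the treatment of bin boundaries (each bin includes its lower edge) are assumptions made for this example.

```python
from bisect import bisect_right

MINUTE, HOUR, DAY = 60, 3600, 86400

# Upper edges (in seconds) of all bins except the last, open-ended one.
EDGES = [5 * MINUTE, 10 * MINUTE, 20 * MINUTE, 30 * MINUTE, 60 * MINUTE,
         3 * HOUR, 6 * HOUR, 12 * HOUR, 24 * HOUR, 3 * DAY, 7 * DAY]
LABELS = ["<5m", "5-10m", "10-20m", "20-30m", "30-60m", "1-3h",
          "3-6h", "6-12h", "12-24h", "1-3d", "3d-7d", ">1w"]


def renumbering_by_duration(outages):
    """Bin outages by duration and compute how many were renumbered.

    outages: iterable of (duration_seconds, address_changed) pairs (assumed layout).
    Returns {bin label: (outages, renumbered outages, percent renumbered)}
    for every non-empty bin.
    """
    counts = {label: [0, 0] for label in LABELS}
    for duration, address_changed in outages:
        label = LABELS[bisect_right(EDGES, duration)]
        counts[label][0] += 1
        counts[label][1] += int(address_changed)
    return {label: (n, r, 100.0 * r / n)
            for label, (n, r) in counts.items() if n > 0}
```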
The behavior upon outages for the two ISPs is strikingly different. LGI’s
behavior appears consistent with what we would expect for dynamic addresses
assigned using DHCP: fewer than 3% of outages of up to an hour resulted in an
address change. More than 25% of outages that lasted at least twelve
hours resulted in an address change. This behavior is consistent with a DHCP
lease duration on the order of a few hours. Not every outage longer than twelve
hours resulted in an address change, consistent with DHCP behavior when a
client returns after an expired lease and the previously assigned address is still
available.
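The reasoning about lease durations can be made concrete with a small worked sketch. It assumes an idealized client that renews exactly half-way into the lease whenever it is connected, so that at the start of an outage between half a lease and a full lease remains. The function name and the returned labels are hypothetical, and an expired lease does not by itself imply an address change, since the old address may still be available.

```python
def lease_outcome(outage_len, lease_len):
    """Classify an outage against a DHCP lease of length `lease_len`.

    Assumes (as an idealization) that the client renews exactly at
    lease_len / 2 while connected, so the lease remaining at the start
    of an outage lies between lease_len / 2 and lease_len.
    Both arguments use the same time unit.
    """
    if outage_len < 0 or lease_len <= 0:
        raise ValueError("outage_len must be >= 0 and lease_len must be > 0")
    if outage_len <= lease_len / 2:
        return "lease survives"            # shorter than any possible remainder
    if outage_len > lease_len:
        return "lease expires"             # longer than any possible remainder
    return "depends on outage start"       # expires only if little lease remained


# With a hypothetical 4-hour lease:
for hours in (1, 3, 5):
    print(hours, "h outage:", lease_outcome(hours, 4))
```

Under these assumptions, a lease of a few hours would leave outages of up to an hour almost never expiring the lease, while outages of twelve hours or more would always expire it.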
For Orange, we found that even very short outages resulted in address
changes. 91% of outages that lasted less than five minutes resulted in an ad-
dress change, and for every outage duration longer than five minutes and shorter
than three hours, more than 75% occurred with an address change. For outages
between three hours and three days long, the percentage of address changes was
closer to 50%, suggesting the presence of some CPE devices that do not renumber
upon every outage. However, as the outage duration increases beyond 3 days, al-
most every outage results in an address change.
Private communication with a large European ISP confirmed that this behavior
is expected for PPPoE-based DSL lines in that ISP: any reboot/reconnect
event will result in the assignment of a new address from the ISP’s dynamic ad-
dress pool. Since outages of such short durations can result in an address change,
a simple reboot of the CPE (resulting in a power outage), or unplugging and
replugging the network cable (resulting in a network outage), can change the
dynamic address assigned to the end-user. That end-users can change their dy-
namically assigned address has implications for researchers and operators who
use IP addresses to identify end-hosts, particularly when IP addresses are being
used to blacklist malicious actors.
4.8 Does a user’s dynamic address prefix change?
It is tempting to expect that a new address, when reassigned, will typically be
drawn from nearby addresses, say, from the same enclosing /24 prefix. If such
an assumption were true, it would allow blacklisting of the enclosing prefix of a
malicious host, if it were thought that the malicious host could cause its address
to change via reboot or by waiting a day. However, we find that such locality of
addresses is rare and address changes typically span prefixes.
We examined whether dynamic address assignment also varies the enclosing
prefix, defined three ways. For each instance of address change that we
observed, we found the BGP prefix of the previous address and the new address
using CAIDA’s IP-to-AS dataset [96], as described in Section 4.5. We also extracted
the /16 and /8 prefixes from the previous and new addresses. We then
compared how often the prefix of the previous address differed from the prefix
of the new address. Table 4.7 presents the results for the overall AS-level dataset
with 2,272 probes and for the ten ASes with the most probes that had at least one
address change.
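The three prefix comparisons can be sketched with Python’s standard `ipaddress` module. This is only an illustration: the function names are hypothetical, and the list of routed prefixes passed to `bgp_prefix` stands in for a lookup against the IP-to-AS dataset, whose actual format is not modeled here.

```python
import ipaddress


def enclosing_prefix(addr, length):
    """Return the /length network that contains addr (e.g. its /16 or /8)."""
    return ipaddress.ip_network(f"{addr}/{length}", strict=False)


def bgp_prefix(addr, routed_prefixes):
    """Longest-prefix match of addr against a list of ip_network objects.

    `routed_prefixes` is a stand-in for a BGP prefix table (assumption).
    Returns None if no prefix covers the address.
    """
    ip = ipaddress.ip_address(addr)
    matches = [p for p in routed_prefixes if ip in p]
    return max(matches, key=lambda p: p.prefixlen, default=None)


def prefix_changes(prev_addr, new_addr, routed_prefixes):
    """Report whether an address change crossed BGP, /16, and /8 prefixes."""
    return {
        "diff_bgp": bgp_prefix(prev_addr, routed_prefixes)
                    != bgp_prefix(new_addr, routed_prefixes),
        "diff_16": enclosing_prefix(prev_addr, 16) != enclosing_prefix(new_addr, 16),
        "diff_8": enclosing_prefix(prev_addr, 8) != enclosing_prefix(new_addr, 8),
    }
```

A linear scan over the prefix list keeps the sketch short; a real analysis over many address changes would more likely use a radix tree for the longest-prefix match.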
ISPs varied prefixes even for consecutive addresses assigned to the same
customer; nearly half of the 166,644 total address changes we observed also changed
BGP prefixes. Unlike periodicity and renumbering upon outages, assigning ad-
dresses out of different prefixes appears to be a common behavior for ISPs. For
the ten ASes in Table 4.7, Verizon and DTAG had the lowest percentage of ad-
dress changes across prefixes, but even for these ASes, almost a quarter of all
address changes were across /16s and a fifth of all address changes were across
/8s. Thus, it is not just the dynamic addresses that change; their prefixes change
too. When a malicious actor receives a new address, even blacklisting the entire
enclosing /8 prefix of the old address would fail to prevent access for a third of
the address changes we observed.
                                    Diff BGP         Diff /16         Diff /8
AS              ASN   Country       Count    %       Count    %       Count    %
All                                 81,571   48.9%   79,430   47.7%   55,835   33.5%
Orange          3215  France        7,016    68%     6,961    67%     5,513    53%
LGI             6830  many          171      56%     168      55%     136      45%
BT              2856  U.K.          1,736    44%     2,685    68%     1,735    44%
DTAG            3320  Germany       4,706    24%     5,391    28%     4,610    24%
Verizon         701   U.S.          241      23%     241      23%     209      20%
Comcast         7922  U.S.          76       37%     74       36%     63       31%
Proximus        5432  Belgium       2,152    49%     2,331    53%     1,983    45%
Telecom Italia  3269  Italy         4,281    85%     4,412    88%     2,374    47%
Ziggo           9143  Netherlands   18       35%     22       43%     16       31%
Virgin Media    5089  U.K.          46       84%     49       89%     39       71%
Table 4.7: Number of address changes across prefixes. Diff BGP shows the number of
address changes where the previous address and the next address belonged to different
BGP prefixes. Diff /16 shows the number of address changes where the previous address
and the next address belonged to different /16 prefixes and Diff /8 shows the number
of address changes where the previous address and the next address belonged to differ-
ent /8 prefixes. The % column shows the percentage of total address changes for that
autonomous system.
4.9 Using complementary datasets that provide IDs to confirm outages
The results from the RIPE Atlas measurement study allow the identification of
networks with stable addresses. However, they also show the existence of net-
works where dynamic addressing is common. In this section, I show how to use a
complementary dataset to confirm outages detected in networks where dynamic
addressing can occur.
Recall that Section 4.2 described the different ways probing-based techniques
can make false inferences about outages when dynamic addressing occurs.
If the address being probed is withdrawn from its home router and not
reassigned to any other device, probe responses will cease to arrive and a false
outage will be inferred. If the home router experiences an outage causing its ad-
dress to cease to respond to probes, and before the outage ends, the address is
reassigned to some other device which responds to probes, probing-based tech-
niques will infer the occurrence of the outage correctly but will falsely conclude
that the outage ended before it did.
When a responsive address being probed is withdrawn from its host or
when the host it belongs to experiences an Internet outage, a probing-based tech-
nique will observe that responses cease to arrive. I define this event to be a
dropout. Formally: A dropout happens when the address attached to a residential link
transitions from being responsive to pings from multiple vantage points, to being
unresponsive from all of the vantage points. An observed dropout can be due to either an
outage or dynamic reassignment.
My key insight is that a complementary dataset which can yield an
unchanging identifier (an ID) uniquely associated with the device can provide
information about whether the device’s address changed. For instance, consider
the probe-ID field provided by RIPE Atlas, which uniquely identifies a device. If
the address associated with the device before and after the dropout is the same,
it is strong evidence that dynamic address reassignment did not occur. The only way that
the address before and after the dropout can be identical and yet for dynamic
reassignment to have occurred is if the device’s address changed to a new one
and then changed back to the original address. However, Section 4.8 showed that
subsequent addresses are often assigned from entirely different prefixes; thus, the
probability that a subsequent address is exactly the address that was assigned
before is small. Since dynamic reassignment is highly unlikely to have occurred,
we can infer that an outage occurred.
The ID-based approach provides two benefits:
• It can offer confirmation of the occurrence of an outage.
• It allows the estimation of outage recovery durations for the instances where
an outage is confirmed.
4.9.1 CDN dataset provides IDs
I measure how often addresses remained the same before and after a dropout
using a dataset of CDN software logs that contain a timestamp, a unique identifier
of the software installation on the client machine, and the public source IP address
visible to the CDN. The CDN offers a service to content owners whereby end
users can elect to install software that will improve the performance the client
experiences when accessing the content through the CDN. The CDN records logs
collected from its software installations on users’ desktop and laptop machines.
Each logline contains (among other fields) the timestamp at which the logline was
created, the unique identifier of the software installation on the machine (the ID),
and the public IP address seen by the CDN’s infrastructure at this time. Loglines
in the CDN software dataset are dependent on user activity, and therefore, their
frequency varies.
4.9.2 Confirming outages detected by Thunderping
For dropouts detected by the Thunderping system [5], I measure how frequently
the complementary dataset confirmed that the dropout was an outage. I used
all dropout events detected in three years (2015, 2016, 2017) in this analysis and
compared them against the CDN dataset during the same period.
To determine whether the address associated with a home router remained
the same before and after a detected dropout, I first collect all entries where the
address that experienced the dropout is present in the log up to one week before
the start time; this applies to only about one percent of the dropouts. The matched
address is ip_p (for previous), and I refer to the next address observed for that ID
after the dropout as ip_n (for next). There are three categories of comparison that
I show in Table 4.8:
1. When ip_p = ip_n, there was no apparent reassignment, which suggests that
an outage occurred and that an inferred outage duration is correct.
2. When ip_p ≠ ip_n, and the observation of ip_n was before ip_p became responsive
again, the address was reassigned and the inferred outage duration is
incorrect. An outage may or may not have occurred.
3. When ip_p ≠ ip_n, and the observation of ip_n was after ip_p became responsive
again, the address was reassigned but the address change may be independent.
Again, an outage may or may not have occurred.
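The three categories above can be expressed as a small classification function. The sketch below is illustrative: the names, the use of plain numeric timestamps, and the decision to treat an address that never became responsive again as the second category are assumptions, not details taken from the analysis.

```python
from enum import Enum


class DropoutCategory(Enum):
    SAME_ADDRESS = 1      # ip_p = ip_n: no apparent reassignment
    CHANGED_DURING = 2    # ip_n seen before ip_p became responsive again
    CHANGED_AFTER = 3     # ip_n seen after ip_p became responsive again


def classify_dropout(ip_p, ip_n, ip_n_seen_at, ip_p_responsive_again_at):
    """Classify a dropout using the ID-matched addresses around it.

    ip_p: address logged for the ID before the dropout.
    ip_n: next address logged for the same ID after the dropout.
    ip_n_seen_at: time at which ip_n was first logged.
    ip_p_responsive_again_at: time at which ip_p answered probes again,
        or None if it never did (treated here as CHANGED_DURING; assumption).
    """
    if ip_p == ip_n:
        return DropoutCategory.SAME_ADDRESS
    if ip_p_responsive_again_at is None or ip_n_seen_at < ip_p_responsive_again_at:
        return DropoutCategory.CHANGED_DURING
    return DropoutCategory.CHANGED_AFTER
```

Only the first category confirms the dropout as an outage with a trustworthy duration; the other two leave open whether an outage occurred.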
Table 4.8 shows that, across all link types, 60% of Thunderping’s detected
dropouts are not accompanied by address changes; thus the majority
of dropouts are outages. Additionally, the table shows that nearly all dropouts
for addresses with cable connections are outages, corroborating the results from
RIPE Atlas which suggested that cable addresses tend to be stable.
For DSL addresses, 31% of dropouts were confirmed. Without the complementary
dataset, all of these dropouts were suspect, since prior results showed
that DSL addresses tend to be renumbered frequently. However, through the use
of a dataset that provides IDs, I am able to confirm outages even in networks
where addresses are not stable.
                                            ip_p ≠ ip_n
Link Type  ip_p present  ip_p = ip_n    during         after
ALL        84837 (0.7%)  50973 (60.1%)  4765 (5.6%)    29047 (34.3%)
CABLE      21455 (1.1%)  18860 (88.0%)  354 (1.7%)     2221 (10.4%)
DSL        25061 (0.9%)  7761 (31.0%)   2857 (11.4%)   14422 (57.6%)
FIBER      1516 (1.0%)   853 (56.3%)    60 (4.0%)      603 (39.8%)
WISP       7381 (1.1%)   6013 (81.5%)   177 (2.4%)     1191 (16.1%)
SAT        9600 (0.4%)   6939 (72.3%)   241 (2.5%)     2412 (25.1%)
Table 4.8: Confirming Thunderping outages across link types: outages are confirmed
when ip_p = ip_n.
We next try to determine whether longer apparent outages correlate with
address changes. If short outages typically have no address change, we can at
least characterize short outage durations. However, if all dropouts lead to address
changes on recovery, the time until an address starts responding again is
more a function of address reuse than of recovery. Figure 4.10 shows the results
for each of the media types in our study. This uses the same data as in Table 4.8.
In Figure 4.10, the top graphs represent the raw histograms of apparent outage
duration, though only the distribution of dark bars (where the address is unchanged)
should be taken as a distribution of true outage duration. The bottom
graphs represent the fraction of outages having an address change or no address
change. At a high level, graphs with more dark shading correspond to media types
or durations that are more likely to be true outages rather than address renumbering. For
WISP and Cable, the bulk of the outages at most 3 hours long have very little
renumbering and outage durations can be estimated well. For DSL, even short
apparent outages are often accompanied by address changes, meaning that outage
duration should not be estimated based on responsiveness alone (estimating it
would require additional data from clients). We observe few Fiber outages, but
the time-dependence is more pronounced.
[Figure 4.10 plots. Panels: (a) Cable, (b) DSL, (c) Fiber, (d) WISP, (e) Satellite. Top panels: number of outages (legend: ip_p == ip_n, ip_p != ip_n); bottom panels: % of outages. x-axis: outage duration, binned as 20m, 30m, 1h, 3h, 6h, 12h, 1d, 3d, 1w, >1w.]
Figure 4.10: Outage duration vs. probability of address change for addresses from various
link types.
4.10 Conclusion and Discussion
In this chapter, I showed that dynamic address reassignment can confuse probing-
based techniques and lead them to make false inferences about outages. Next, I
conducted a measurement study with colleagues to infer and analyze patterns
of address changes using an existing set of logs from 3,038 globally distributed
RIPE Atlas probes that saw address changes in 2015. We found several factors in
play. Dynamic address durations vary by geography, with addresses from North
American ISPs persisting for weeks and addresses from many German ISPs as-
signed for a day. Dynamic addresses change as a result of network and power
outages in most ISPs. In some ISPs, an outage of any duration results in an ad-
dress change, while in others, the likelihood of address change increases with
outage duration. Using this study, I was able to identify which networks have sta-
ble addresses, where dynamic reassignment is uncommon. I also showed using
a complementary dataset that it is sometimes possible to confirm probing-based
techniques’ detected outages even in networks with frequent dynamic reassign-
ment.
Chapter 5: The need for measuring individual address outages
In this chapter, I develop and evaluate an approach to detect dependent Internet
disruption events that affect multiple residential addresses simultaneously using
measurements of individual address disruptions gathered with the Thunderping
technique. Borrowing terminology from Richter et al. [12], I define an Internet
disruption event for an address to be the abrupt loss of response to active probing
from that address.
Techniques that detect outages at the Internet’s edge often seek disruption
events affecting a substantial set of addresses. The set of addresses may comprise
those belonging to the same /24 address block [11, 12], BGP prefix [13], or coun-
try [14]. Techniques seek such disruption events because individually, each large
disruption has impact and their size makes them easier to confirm, e.g., with op-
erators. In contrast, disruptions affecting only a few users are harder to detect
with confidence. For example, the lack of response from a single address might
best be explained by a user switching off their home router—hardly an outage.
However, residential Internet outages may be limited to a small neighborhood or
apartment block; prior techniques are likely to miss such events.
In the rest of this chapter, I describe work with colleagues where we demon-
strate a technique that detects disruption events with quantifiable confidence,
by investigating the potential dependence between disruptions of multiple IP
addresses in a principled way. We apply a simple statistical method to a large
dataset of active probing measurements towards residential Internet users in the
US. We find times when multiple addresses experience a disruption simultane-
ously such that they are unlikely to have occurred independently; we call the
occurrence of such events dependent disruptions. We characterize these depen-
dent disruption events and present results that challenge conventional wisdom
on how such disruptions affect Internet address blocks. We show that many of
these events would be missed by existing techniques that do not perform indi-
vidual address outage detection.
5.1 Background: dependent residential outages can be small
Residential Internet connections are vulnerable. The last-mile link connecting
home routers to their ISP is typically not multi-homed and is therefore a single
point of failure. Further, last-mile links can be damaged by exposure to the ele-
ments or by broken tree limbs blown by the wind. Thus, residential outages may
be limited to a small neighborhood or apartment block.
5.1.1 Prior techniques focus upon larger disruptions
Prior techniques that detect edge Internet disruptions typically detect disruptions
that affect a group of addresses collectively. Like us, they also leverage the depen-
dence among the per-IP address “disruptions” that these larger disruptions cause.
However, they differ from our technique in that they look for dependence in large
aggregates (that is, so many addresses are affected at the same time that there
must be an evident anomaly) or limit their resolution to small address blocks,
looking only for outages that cause dependent disruptions for all the addresses
in a monitored block. Thus, these techniques may miss observing smaller resi-
dential failures.
For example, Trinocular looks only for outages affecting /24 address blocks [11].
Using historical data from the ISI census [47], it models the responsiveness of
blocks and finds addresses within each block that are likely to respond to pings.
The system pings a few of these addresses from each block at random in 11-
minute rounds. Trinocular then employs Bayesian inference to reason about re-
sponses from blocks. When a block’s responsiveness is lower than expected,
Trinocular probes the block at a faster rate and eventually detects an outage when
the follow-up probes also suggest the block’s lack of Internet connectivity. Since
Trinocular will not identify an outage if a single address in a block responds to
probing, Trinocular potentially neglects outages affecting /24 blocks only par-
tially, including larger outages affecting multiple /24 blocks.
Other systems have also investigated disruptions affecting entire blocks of
addresses. Recently, Richter et al. used CDN logs to detect disruptions affecting
/24 address blocks [12]. Hubble detects prefix-level unreachability problems [13].
The IODA system looks for the most impactful outages only, those causing an
extensive loss of connectivity for a geographical area or Autonomous System
(AS) [14, 37].
Disco [41] shares some features with our work: they also detect simulta-
neous disconnects of multiple RIPE Atlas probes within an ISP or geographic
region to infer outages. However, there are two major differences between the
Thunderping and RIPE Atlas datasets. At any given point in time, the Thun-
derping dataset typically consists of pings sent to roughly 50,000 addresses in
relatively small geographical areas with active severe weather alerts. The Disco
dataset consists of 10,000 RIPE Atlas probes distributed around the world; this
sparse distribution may prevent the detection of smaller outages localized to one
area (like a U.S. state). The second difference is that unlike Thunderping ping
data whose timestamps are only accurate to minutes, the timestamps available
in the RIPE Atlas datasets are accurate to seconds, permitting the use of Klein-
berg’s burst detection to detect bursts in probe disconnects. Discussions with the
authors of Disco suggested that Kleinberg’s burst detection model would not be
appropriate for the Thunderping data, although a more detailed evaluation of
the binomial test against Kleinberg’s burst detection in the Thunderping data is
future work.
5.1.2 The Thunderping dataset yields per-address disruptions
The key insight behind our technique is that simultaneous disruptions of multiple
individual IPv4 addresses could occur due to a common underlying cause. We
therefore require per-IP address disruptions.
Such data is present in the Thunderping dataset [5]. Thunderping pings
sampled IPv4 addresses from multiple ISPs in geographic areas in the United
States. Originally designed to evaluate how weather affects Internet outages, the
system uses PlanetLab vantage points to ping 100 IPv4 addresses from multiple
ISPs in each U.S. county with active weather alerts. Each address is pinged from
multiple PlanetLab vantage points (at least 3) every 11 minutes, and addresses in
a county are pinged six hours before, during, and after a weather alert.
Here, we analyze a dataset of Thunderping’s ping responses to detect dis-
ruptions for each probed address using Schulman and Spring’s technique [5].
When an address that is responsive stops responding to pings from all vantage
points that are currently probing it, we detect a disruption for that address. Since
a disruption is detected only when all vantage points declare unreachability, the
minimum duration of a disruption is 11 minutes (at the end of 11 minutes each
vantage point has pinged the address at least once). Thunderping continues to
probe an address after it has become unresponsive, allowing us to estimate how
long the unresponsive period lasted.
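The detection rule above can be sketched in Python as follows. This is a minimal illustration, not Thunderping's actual code: the input layout (one list of per-vantage-point booleans per 11-minute bin) and the function name `detect_disruptions` are assumptions made for the example.

```python
from typing import List, Optional, Tuple

def detect_disruptions(bins: List[List[bool]]) -> List[Tuple[int, Optional[int]]]:
    """Return (start_bin, end_bin) pairs for a single address.

    bins[i] holds one boolean per vantage point that probed the address in
    the i-th 11-minute bin (True = a response was received). This input
    layout is hypothetical. end_bin is the bin in which the address answered
    again, or None if it was still unresponsive when probing stopped.
    """
    disruptions: List[Tuple[int, Optional[int]]] = []
    responsive = False          # True once the address has answered a ping
    start: Optional[int] = None  # bin where the current disruption began
    for i, replies in enumerate(bins):
        if not replies:          # no vantage point probed in this bin
            continue
        if any(replies):
            if start is not None:        # address answered again: recovery
                disruptions.append((start, i))
                start = None
            responsive = True
        elif responsive and start is None:
            start = i            # every vantage point failed in this bin
    if start is not None:
        disruptions.append((start, None))
    return disruptions
```

Requiring a prior response before flagging a disruption mirrors the condition that only "an address that is responsive" can be disrupted; addresses that never answer are ignored.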
While per-IP address disruptions allow the detection of small disruptions,
not all per-address disruptions are the result of Internet connectivity
outages. For example, an individual user may decide to turn off their home
router. In the rest of this chapter, we show how to detect dependent disruption
events using per-address disruptions.
5.2 Detecting dependent disruptions
In this section, we apply binomial testing to identify dependent disruptions in
the outage dataset. First, we show how the binomial test works to rule out in-
dependent events and show how to apply the test to network outages in reason-
ably sized aggregates of addresses. Second, we apply this method to the outage
dataset, omitting addresses with excessive baseline loss rates and evaluating our
chosen aggregation method. Finally we summarize the dependent disruptions
we found in this dataset. This sets up analysis of these events (time of day, geog-
raphy, and scope) which we defer to the following section.
5.2.1 Finding dependent events in an address aggregate
When many addresses experience a disruption simultaneously, there could be a
common underlying cause. Such disruptions are statistically dependent. To iden-
tify these dependent events, our insight is to model address disruptions as in-
dependent events; when disruptions co-occur in greater numbers than the inde-
pendent model can explain, the disruptions must be dependent. Binomial testing
provides precisely this ability to find events that are highly unlikely to have oc-
curred independently.
Given N addresses, the binomial distribution gives the probability that D
of them were disrupted independently as:
\Pr[D \text{ independent failures}] = \binom{N}{D} \cdot P_d^{D} \, (1 - P_d)^{N-D} \qquad (5.1)
where Pd represents the per-address probability of disruption in the aggregate
of N addresses. To apply
this formula, we must first set a threshold probability below which we consider
the simultaneous disruption to be too unlikely to be independent. We set this
threshold to 0.01%. We then solve for Dmin, the smallest (whole) number of si-
multaneous disruptions with a smaller than 0.01% chance of occurring indepen-
dently.
Table 5.1 presents Dmin, computed for various values of N and Pd. This table
shows that, even for large aggregates of IP addresses, often few simultaneous
disruptions are necessary to be able to confidently conclude that a dependent
disruption has occurred. When applied to the Thunderping dataset, Dmin values
are typically below 8.
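The search for Dmin can be sketched as below. Two assumptions are made that the text does not spell out: the test uses the upper tail Pr[X ≥ D] (following the caption of Table 5.1, "Dmin or more addresses fail"), and a rate such as "1/hour" is converted to a per-bin probability of 11/60 for 11-minute bins. The function names are illustrative only.

```python
import math
from typing import Optional

def binomial_tail(n: int, p: float, d: int) -> float:
    """Pr[X >= d] for X ~ Binomial(n, p), with 0 < p < 1.

    Each term is evaluated in log space so that large binomial
    coefficients do not overflow.
    """
    if d <= 0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for k in range(d, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1)
                   - math.lgamma(n - k + 1) + k * log_p + (n - k) * log_q)
        total += math.exp(log_pmf)
    return total

def d_min(n: int, p: float, threshold: float = 1e-4) -> Optional[int]:
    """Smallest D whose tail probability is below the threshold (0.01%)."""
    for d in range(1, n + 1):
        if binomial_tail(n, p, d) < threshold:
            return d
    return None  # even all n addresses failing is not unlikely enough

BIN_MINUTES = 11  # assumed bin length, as used by Thunderping
pd_per_hour = BIN_MINUTES / 60
pd_per_week = BIN_MINUTES / (7 * 24 * 60)
print(d_min(10, pd_per_hour), d_min(100, pd_per_week))
```

The linear scan recomputes the tail for every candidate D, which is adequate for small aggregates; for the largest N in Table 5.1 a cumulative sum from D = N downward would avoid the repeated work.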
There are two practical challenges in applying this test. First, we must
choose aggregates of N IP addresses that define the scope of a dependent dis-
ruption: too large an aggregate will have too large a chance of simultaneous
independent failures and drive up D, while too small an aggregate may fail to
include all the addresses in an event. Second, we must estimate Pd for each ag-
gregate. We address each in turn.
           Dmin for Pd =
N          1/hour    1/day    1/week    1/month
10              8        3         2          2
50             21        5         3          2
100            35        7         4          3
500           126       14         6          4
1000          231       21         8          5
5000         1021       64        17          8
10000        1980      112        26         11
50000        9491      457        85         29
Table 5.1: Dmin values for varying values of N and Pd. There is less than 0.01% prob-
ability according to the binomial test that Dmin or more addresses fail for each N and
Pd.
5.2.1.1 Choosing aggregate sets of IP addresses
Our technique assumes some aggregate set of IP addresses among which to de-
tect a dependent disruption. We note that the correctness of our approach does
not depend on how this set is chosen—the binomial test will apply so long as
independent failures can be modeled by Pd. When applying our technique, IP
addresses must be aggregated into sets that are large enough to span interesting
disruption events, but not so large as to become insensitive to them.
In this chapter, we aggregate IP addresses based on the U.S. state and the
ASN they are in. State-ASN aggregates have the benefit of spanning multiple
prefixes (so we can observe whether more than one /24 is affected by a given
disruption event), but also being constrained to a common geographic region (so
hosts in an aggregate are likely to share similar infrastructure). There are two
limitations with this approach: states are not of uniform size, though the test
elegantly handles varying N, and a few ISPs use multiple ASNs, which may hide
some dependent failures. Alternate aggregations are possible.
5.2.1.2 Calculating the probability of disruption (Pd)
As a final consideration, we discuss how to estimate the probability of disruption,
Pd, from an empirical dataset of disruptions. We assume that the dataset can
be separated into a set of discrete “time bins”; this is common with ping-based
outage detection, such as Thunderping and Trinocular, which both consider 11-
minute bins of time. Pd can be estimated using the following equation:
P_d = \frac{\#\text{disruptions}}{\#\text{timebins}} \qquad (5.2)
Here, #timebins represents the total number of observation intervals used: if a
single host was measured across 10 time intervals and five other hosts were all
measured across 3, then #timebins = 10 + 3 · 5 = 25.
We only consider state-ASN aggregates where we were able to obtain a sta-
tistically significant value for Pd. For statistical significance, we adhere to the
following rule of thumb [102, Chapter 6]: we accept a state-ASN aggregate with
t timebins and estimated probability of disruption Pd only if:
t \cdot P_d \, (1 - P_d) \geq 10 \qquad (5.3)
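Equations 5.2 and 5.3 translate into a few lines of Python. The sketch below is illustrative: the function names and the per-host list of observed time bins are assumptions, and the example reuses the 10 + 3 · 5 = 25 time bins from the text with a made-up count of two disruptions.

```python
from typing import Iterable, Tuple

def estimate_pd(num_disruptions: int, bins_per_host: Iterable[int]) -> Tuple[float, int]:
    """Equation 5.2: disruptions divided by the total number of host time bins."""
    timebins = sum(bins_per_host)
    if timebins == 0:
        raise ValueError("no observation intervals")
    return num_disruptions / timebins, timebins

def is_statistically_significant(timebins: int, pd: float) -> bool:
    """Equation 5.3: the rule of thumb t * Pd * (1 - Pd) >= 10."""
    return timebins * pd * (1.0 - pd) >= 10

# One host measured across 10 bins and five hosts across 3 bins each.
pd, t = estimate_pd(2, [10, 3, 3, 3, 3, 3])
print(t, pd, is_statistically_significant(t, pd))
```

With so few time bins the aggregate fails the rule of thumb, which is the situation the filter in Equation 5.3 is meant to exclude.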
5.2.2 Applying our method to the Thunderping dataset
We investigate all ping responses in the Thunderping dataset from January 1,
2017 to December 31, 2017 and detect disruptions according to the methodol-
ogy described above. During this time, Thunderping had sent at least 100 pings
to 3,577,895 addresses and detected a total of 1,694,125 individual address dis-
ruptions affecting 1,193,812 unique addresses. Figure 5.1 shows the top 15 ISPs
whose addresses Thunderping had sampled most frequently. These ISPs include
large cable providers (Comcast, Charter, Suddenlink), DSL providers (Windstream,
Qwest, Centurytel), wireless ISPs (RISE Broadband), and satellite providers
(Viasat).
Filtering lossy addresses
We find that some pinged addresses experience unusually high ping loss rates.
These addresses see disruption very frequently, since high loss rates can result in
pings from all vantage points to these addresses failing together. Disruptions for
such addresses are even more challenging to interpret because a variety of causes
can result in high ping loss rates, such as high response latency [19] and ICMP
rate-limiting [103]. Thus, we find these addresses and remove them from the rest
of the analyses.
[Figure 5.1 plots. Left: CDF of loss rate per address during non-disrupted times
(3,577,895 addresses). Right: IP addresses (thousands) per ISP, with > 100 pings
and with > 100 pings and > 10% loss.]
Figure 5.1: (Left) The distribution of ping loss rates per IP address during times when
Thunderping believed an address was not experiencing a disruption. While most
addresses have low loss rates, 2% of addresses had loss rates exceeding 10%. (Right) The
fraction of addresses per ISP with ping loss rates exceeding 10% during non-disruption
periods, for the 15 ISPs with the most pinged addresses. We filter from all remaining
analyses any address whose loss rate exceeded 10%.
Figure 5.1 (left) shows the distribution of ping loss rates for IP addresses during
times when the addresses were not experiencing a disruption. 2% of addresses
have loss rates exceeding 10%. Figure 5.1 (right) shows the prevalence of these addresses
in the 15 ISPs whose addresses Thunderping had sampled most frequently. Some
ISPs have a higher concentration of addresses with high loss rates, such as Viasat,
Verizon Wireless, and Pavlov Media. However, even in these ISPs, the majority
of addresses do not have high loss rates. Thus, instead of filtering the ISPs
wholesale, we only remove the addresses whose loss rates exceeded 10% and do not
consider these addresses in the remaining analyses.
[Figure 5.2 plots. Left: x-axis: responsive addresses (0 to 150,000); y-axis: CDF.
Right: x-axis: probability of disruption; y-axis: CDF; legend: state-asn (1559).]
Figure 5.2: Potential N and Pd values in the Thunderping dataset: On the left, we show
the distribution of all addresses (across all state-ASN aggregates) pinged by Thunderping
that can potentially fail in each 11 minute time bin. On the right, we show the distribution
of the probability of disruption (Pd) for various state-ASN address aggregates.
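The loss-rate filter described in this subsection can be sketched as follows. The input mapping, the function name, and the example addresses (taken from the 192.0.2.0/24 documentation range) are hypothetical; the 10% cutoff and the 100-ping minimum come from the text.

```python
from typing import Dict, Tuple

def filter_lossy(ping_counts: Dict[str, Tuple[int, int]],
                 max_loss: float = 0.10,
                 min_pings: int = 100) -> Dict[str, float]:
    """Keep addresses with enough pings and a loss rate of at most max_loss.

    ping_counts maps an address to (pings_sent, pings_lost), counted only
    over periods in which the address was not considered disrupted.
    Returns the loss rate of every address that is kept.
    """
    kept = {}
    for addr, (sent, lost) in ping_counts.items():
        if sent < min_pings:
            continue                 # too few pings to judge the address
        loss_rate = lost / sent
        if loss_rate > max_loss:
            continue                 # lossy address: drop from the analyses
        kept[addr] = loss_rate
    return kept

sample = {"192.0.2.1": (200, 10), "192.0.2.2": (200, 30), "192.0.2.3": (50, 0)}
print(filter_lossy(sample))
```

Filtering per address rather than per ISP keeps the well-behaved majority of addresses in ISPs such as Viasat.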
Detecting dependent disruptions in the Thunderping dataset
We use Figure 5.2 to describe potential N and Pd values in the Thunderping
dataset. On the left, we show the distribution of addresses pinged by Thunder-
ping in each 11 minute timebin in 2017. The median number is roughly 50,000
addresses across all U.S. states and ISPs. Since many weather alerts tend to be ac-
tive at any given point of time, these addresses are likely to be distributed among
tens of state-ASN aggregates. In 2017, the maximum number of addresses that could
potentially fail in any state-ASN aggregate was 15,863. On the right, we show the
distribution of Pd values for all state-ASN aggregates that we considered. There
is extensive variation: addresses in some of these aggregates experience disrup-
tions only once every year, whereas in other aggregates they experience disrup-
tions more often than once per day. 1
For each state-ASN aggregate, for each 11-minute window during which
Thunderping had pinged addresses, we identify the maximum number of ad-
dresses that can potentially fail, N , i.e., all the addresses that are responsive to
pings at the beginning of the window. Next, we apply the binomial test for each
of these windows since we know N and Pd. When the number of disruptions
in a window is at least Dmin, we determine that a dependent disruption event
occurred in that window with a probability greater than 0.9999.
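The per-window procedure can be sketched as below. Instead of precomputing Dmin, this version computes the probability that the observed number of disruptions (or more) occurred independently and compares it with the 0.01% threshold, which yields the same decision. The tuple layout for windows and the function names are assumptions; the requirement of at least two disrupted addresses follows the discussion later in this section.

```python
import math
from typing import List, Tuple

def prob_at_least(n: int, p: float, d: int) -> float:
    """Pr[X >= d] for X ~ Binomial(n, p), computed as 1 - Pr[X < d].

    Only d terms are summed, which is cheap when d is small. The result is
    adequate for comparison against a 1e-4 threshold, but floating-point
    cancellation makes it imprecise once the true value is below ~1e-12.
    """
    log_p, log_q = math.log(p), math.log1p(-p)
    below = 0.0
    for k in range(min(d, n + 1)):
        below += math.exp(math.lgamma(n + 1) - math.lgamma(k + 1)
                          - math.lgamma(n - k + 1) + k * log_p + (n - k) * log_q)
    return max(0.0, 1.0 - below)

def detect_dependent_events(windows: List[Tuple[int, int, int]], pd: float,
                            threshold: float = 1e-4) -> List[Tuple[int, int, float]]:
    """Flag windows of one state-ASN aggregate with too many disruptions.

    Each window is (window_id, n_responsive, n_disrupted), where n_responsive
    counts addresses responsive at the start of the 11-minute window.
    """
    events = []
    for window_id, n, d in windows:
        if d < 2 or d > n:
            continue                  # need at least two disrupted addresses
        prob = prob_at_least(n, pd, d)
        if prob < threshold:
            events.append((window_id, d, prob))
    return events

# Assumed Pd of one disruption per week, expressed per 11-minute bin.
pd_weekly = 11 / (7 * 24 * 60)
print(detect_dependent_events([(0, 100, 1), (1, 100, 3), (2, 100, 4)], pd_weekly))
```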
N          Probability of disruption (Pd)
           1/hour > Pd ≥ 1/day   1/day > Pd ≥ 1/week   1/week > Pd ≥ 1/month   1/month > Pd
≤ 10       11 (0.1%)             486 (2.3%)            519 (2.5%)              179 (0.9%)
≤ 50       6 (0.0%)              1089 (5.2%)           1990 (9.6%)             868 (4.2%)
≤ 100      0 (0.0%)              863 (4.1%)            1229 (5.9%)             736 (3.5%)
≤ 500      0 (0.0%)              1807 (8.7%)           4328 (20.8%)            1360 (6.5%)
≤ 1000     0 (0.0%)              462 (2.2%)            1884 (9.0%)             405 (1.9%)
≤ 5000     0 (0.0%)              171 (0.8%)            1865 (9.0%)             458 (2.2%)
≤ 10000    0 (0.0%)              0 (0.0%)              83 (0.4%)               0 (0.0%)
≤ 50000    0 (0.0%)              0 (0.0%)              32 (0.2%)               0 (0.0%)
Table 5.2: Dependent disruption events for different values of number of addresses that
can potentially fail (N ) and probability of disruption (Pd) from the Thunderping dataset.
Of 20,831 total dependent disruption events, the majority were detected when Pd is low.
1 Since disruptions are a superset of outages and dynamic reassignment, frequent disruptions
are not necessarily indicative of poor Internet connectivity. Also, the existence of many aggre-
gates with few disruptions indicates that Thunderping often pinged addresses during weather
conditions that were not conducive to disruptions.
[Figure 5.3 plot. x-axis: probability that detected event occurred independently; y-axis: CDF.]
Figure 5.3: Figure 5.3 shows the distribution of the probability that the 20,831 detected
dependent disruption events could have occurred independently. For 90% of events, the
probability of occurring independently is less than 0.00005.
In total, we detected 20,831 events with dependent disruptions in 2017. Ta-
ble 5.2 shows the number of detected events for various values of N and Pd in
the Thunderping dataset in 2017. The majority of events were detected for state-
ASNs with Pd lower than once a week. From Figure 5.2, we know that close
to three-quarters of state-ASN aggregates fall in this category, showing that our
technique is able to detect dependent disruptions in most aggregates.
Next, we analyzed our confidence in these dependent disruptions. By
construction, the occurrence of Dmin or more disruptions has less than 0.01%
probability of happening independently according to the binomial test. We test
whether most detected dependent disruption events fall just under this 0.01%
threshold or are well clear of it.
Figure 5.3 shows the distribution of the probability that we incorrectly clas-
sify an independent event as dependent. The probability of occurring indepen-
dently is less than 0.005% for 90% of the events and less than 0.001% for 75%.
[Figure 5.4 plot. x-axis: minimum threshold for dependent disruption (Dmin); y-axis: disrupted addresses in dependent disruption (D).]
Figure 5.4: For each detected correlated disruption event, Figure 5.4 shows the Dmin
value on the x-axis and the corresponding number of observed disruptions on the y-
axis. 62% of the 20,831 detected events had more than Dmin observed disruptions. The
scatterplot adds a random gaussian offset to both x and y with mean of 0.1, clamped at
0.45, to show density.
Thus, the probability that detected events occurred independently is typically
much smaller than our choice of 0.01%.
How many addresses are disrupted dependently?
The Binomial test does not say that all of the addresses that were observed to
be disrupted during a dependent event were disrupted in a dependent manner.
Consider if Dmin is 4 and we detect an event where 7 addresses were disrupted.
The Binomial Test shows us that the event took place with very low probability.
However, that does not necessarily mean all 7 addresses were disrupted in a
dependent manner; up to 3 of them could have been disrupted independently with
up to 99.99% probability.
We call the set of addresses in a state-ASN aggregate that were disrupted in
the time-bin of a dependent event the observed group of addresses that were dis-
rupted, or the observed disrupted group for short. Of the observed disrupted group,
our assumption is that some were disrupted together in a dependent manner: we
call this subset the actual group of addresses that were disrupted, or actual dis-
rupted group. We obtain a minimum bound on the actual disrupted group by
subtracting Dmin − 1 from the observed disrupted group. For the 20,831 depen-
dent disruption events, the total addresses in all the observed disrupted groups
is 229,413 and the total addresses in all the minimum actual groups is 165,328.
We study the relationship between Dmin for a state-ASN aggregate on the
x-axis and the corresponding number of addresses in the observed group of dis-
rupted addresses (on the y-axis) in Figure 5.4. Each point corresponds to one of
the 20,831 detected events. Sometimes, a state-ASN aggregate had such low Pd
that even a single disruption in an 11-minute bin occurred with less than 0.01%
probability and therefore had a Dmin value of 1. However, since we are looking for
unlikely disruptions of multiple addresses, all our detected events observed at
least two addresses that were disrupted in the same time-bin.
12,911 (62%) detected events observed more than Dmin disruptions, corrob-
orating the result from Figure 5.3 that most detected events would have been
detected even with a stricter threshold.
We detected dependent disruption events with various sizes as shown in
Figure 5.4. There are 693 (3%) events with more than 50 observed disrupted ad-
dresses. For the largest detected event, we observed 913 addresses experience
disruptions in the same time-bin in AS33489 (Comcast) in Florida at 2017-09-
13T20:33 UTC time. This detected event correlates to the minute with a known
failure event for Comcast that was discussed in the Outages mailing list [104].
However, for most of the events, the size of the observed group of disrupted ad-
dresses is small: there were 2,593 (12%) with two, 2,969 (14%) with three, 2,776
(13%) with four, and 2,175 (10%) with five observed disrupted addresses. These
results highlight the ability of our technique to detect even small sized disrup-
tions with confidence.
5.3 Properties of dependent disruptions
In this section, we study various properties of dependent disruptions. For some
properties, we conduct additional analyses on specific ISPs in the Thunderping
dataset: Comcast (cable), Qwest (DSL) and Viasat (Satellite). These are three ISPs
whose addresses are pinged frequently by Thunderping (as seen in Figure 5.1)
and where we were able to detect in excess of a thousand dependent disruption
events (3109 events for Comcast, 1855 for Viasat, 1734 for Qwest).
[Figure 5.5 plot. x-axis: ISP; y-axis: dependent dropout events (0 to 3000).]
Figure 5.5: Figure 5.5 shows the number of dependent disruption events detected per
ISP. Note that these numbers are more a reflection of addresses sampled and pinged in
the Thunderping dataset than any major underlying problem in their infrastructure.
5.4 Dependent disruption events across ISPs
We grouped dependent disruption events by ISP to check if any ISPs contribute
an unusual number of events. Figure 5.5 shows the top 15 ISPs with dependent
disruption events. Most of the ISPs from Figure 5.1 are represented here as well,
suggesting that no ISPs are unduly biasing our results. These top 15 ISPs together
account for 13,643 (65%) of all detected events.
We emphasize that these results are not meant to reflect any underlying
problems with these ISPs; the Thunderping system samples and pings large ISPs
more frequently and consequently, finds more disrupted addresses in them. The
purpose of this analysis is to ensure that no ISP contributes unduly many events.
[Figure 5.6 plots. Panels: (a) Comcast, (b) Qwest, (c) Viasat. Bottom x-axis: hour of the week (UTC); top x-axis: hour of the week (CST, UTC - 6).]
Figure 5.6: Dependent disruption events that began in each hour of the week. ’Mon’ on
the bottom x-axis refers to midnight on Monday in UTC time. On the top x-axis, ’Mon’
refers to midnight at UTC-6 (CST).
5.4.1 Dependent disruptions are more frequent at night for some ISPs
Recent work has shown that disruptions tend to happen more frequently during
maintenance intervals close to midnight local time [12]. To obtain this result,
Richter et al. used proprietary data from a content delivery network, collected at
the granularity of every hour. Here, we investigate if our technique can identify
similar patterns of dependent disruptions.
Figure 5.6 shows that individual ISPs can have different behavior. Comcast
and Viasat have more dependent disruption events occurring close to midnight,
CST, on weekday nights. Qwest, on the other hand, does not appear to have
a clearly discernible pattern. Our results confirm those from prior work [12],
lending credence to our technique. Moreover, we are able to do so using public
(Thunderping) data and a granularity of every 11 minutes.
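The hour-of-week binning behind Figure 5.6 can be approximated with the standard library alone. This sketch assumes event start times are timezone-aware `datetime` objects and uses a fixed UTC offset for CST as in the figure; the function names are illustrative.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from typing import Iterable

def hour_of_week(ts: datetime, utc_offset_hours: int = 0) -> int:
    """Hour index in [0, 168), where 0 is midnight on Monday in the
    timezone given by utc_offset_hours. ts must be timezone-aware."""
    local = ts.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return local.weekday() * 24 + local.hour

def events_per_hour_of_week(starts: Iterable[datetime],
                            utc_offset_hours: int = 0) -> Counter:
    """Count dependent disruption events by the hour of the week they began."""
    return Counter(hour_of_week(ts, utc_offset_hours) for ts in starts)

# The largest event reported in this chapter began at 2017-09-13T20:33 UTC.
start = datetime(2017, 9, 13, 20, 33, tzinfo=timezone.utc)
print(hour_of_week(start), hour_of_week(start, utc_offset_hours=-6))
```

A fixed offset ignores daylight saving time, which matches the "CST, UTC - 6" axis of the figure but would need adjustment for a strict local-time analysis.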
5.4.2 Dependent disruptions can recover together
Here, we investigate whether dependent disruption events are accompanied by
dependent recovery. Since Thunderping continues to probe an IP address even
after it becomes unresponsive until the end of the weather alert, it can observe
when the address becomes responsive again. This responsiveness may signal that
the disruption for the address has ended. Multiple addresses that are disrupted
together and also recover together offer evidence that: (a) the event was indeed
dependent and (b) the event has ended, allowing estimation of the disruption’s
duration.
Most dependent disruptions also have correlated recoveries. Of 20,831 de-
pendent disruption events, 6,869 (33%) had all disrupted addresses recover dur-
ing the same 11-minute time-bin. Further, 14,789 (71%) disruption events had at
least half of the disrupted addresses recover together. Across all of the 20,831
dependent disruption events, there were 229,413 disrupted addresses in total. Of
these, 121,648 (53%) disrupted addresses—from 15,117 (73%) disruption events—
exhibited a dependent recovery with other addresses from that same group. This
indicates that dependent recovery is quite common.
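One way to summarize recovery for a single event is sketched below. The text does not define "recover together" operationally, so this sketch assumes it means the largest set of at least two addresses that became responsive again in the same 11-minute bin; the input format and names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Union

def recovery_summary(recovery_bins: List[Optional[int]]) -> Dict[str, Union[int, bool]]:
    """Summarize dependent recovery for one dependent disruption event.

    recovery_bins has one entry per disrupted address: the index of the
    11-minute bin in which the address responded again, or None if no
    recovery was observed before probing stopped.
    """
    observed = [b for b in recovery_bins if b is not None]
    together = 0
    if len(observed) >= 2:
        largest = Counter(observed).most_common(1)[0][1]
        together = largest if largest >= 2 else 0
    total = len(recovery_bins)
    return {
        "together": together,                      # addresses sharing a recovery bin
        "all": total > 0 and together == total,    # every address recovered together
        "at_least_half": together > 0 and 2 * together >= total,
    }

print(recovery_summary([5, 5, 5, None]))
```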
We also tested whether the likelihood of dependent recovery is a function
of the number of addresses in the observed disrupted group. It is possible that
disruptions with fewer addresses in the observed disrupted group tended to
experience correlated recovery more frequently. As the number of addresses in the
observed disrupted group increases, does the number of addresses that recover in
a correlated manner also increase?
Disruptions  Correlated recoveries
             -2          -1          0           2           3           ≤ 10        ≤ 50        ≤ 100       ≤ 1000
2            623 (24%)   286 (11%)   425 (16%)   1259 (49%)  0 (0%)      0 (0%)      0 (0%)      0 (0%)      0 (0%)
3            476 (16%)   283 (10%)   463 (16%)   741 (25%)   1006 (34%)  0 (0%)      0 (0%)      0 (0%)      0 (0%)
≤ 10         869 (8%)    488 (5%)    1329 (13%)  1938 (19%)  1714 (17%)  3937 (38%)  0 (0%)      0 (0%)      0 (0%)
≤ 50         216 (5%)    78 (2%)     154 (4%)    282 (7%)    281 (6%)    1554 (36%)  1760 (41%)  0 (0%)      0 (0%)
≤ 100        15 (3%)     2 (0%)      4 (1%)      7 (1%)      5 (1%)      54 (11%)    218 (44%)   193 (39%)   0 (0%)
≤ 1000       2 (1%)      0 (0%)      1 (1%)      0 (0%)      0 (0%)      10 (6%)     35 (20%)    52 (30%)    71 (42%)
Table 5.3: The number of addresses that recovered (columns) for dependent
disruptions affecting different numbers of addresses (rows). -2 indicates that no addresses that
dropped out were observed to have recovered. -1 indicates that only one address
recovered. The other numbers show how many of the (at least two) addresses that recovered
did so in a correlated manner.
Table 5.3 answers this question. The −2 and −1 columns show events where
there is insufficient data from the Thunderping dataset to determine recovery; −2
shows events where none of the addresses in the observed disrupted group
responded to Thunderping’s pings after the disruption and −1 shows events where
only one of the addresses responded to Thunderping’s pings after the disruption.
The rest of the columns show how many events recovered in a correlated man-
ner. We observe that for the majority of events, irrespective of the number of
addresses in the observed disrupted group, more than 50% recover together.
[Figure 5.7 plots. (a) x-axis: duration of correlated disruption (10m to 1w); y-axis: CDF; legend: all (121648), comcast (24025), centurylink209 (8502), viasat (12744). (b) x-axis: addresses with correlated recovery; y-axis: duration of recovery.]
Figure 5.7: (a) The distribution of durations of dependent dropouts for all addresses
that recovered in a correlated manner. 60% of addresses recovered in less than an hour.
(b) For dependent dropout events where at least two addresses recovered, this shows
the number of addresses that recovered on the x-axis and the corresponding recovery
duration for the event on the y-axis. Dependent dropout events vary in their duration
irrespective of the number of affected addresses.
Recovery times are often shorter than an hour
Next, we turn our attention to the time it takes dependent disruptions to recover.
Figure 5.7(a) shows that 60% of recovered addresses recovered in less than an
hour. Our technique is able to identify this, because we operate at the precision
of the 11-minute time-bins from standard outage detection datasets. Conversely,
recent work that finds disruptions spanning an entire calendar hour [12] would
miss these disruptions.
Next, we examine whether short recovery durations can be attributed to
small disruption events: that is, do the recoveries appear quick because only a
couple of hosts were disrupted? Figure 5.7(b) shows that the answer is no: even
dependent disruptions with hundreds of addresses that recovered together often
last less than an hour.
[Figure 5.8 plot. x-axis: days of the year (Jan to Dec); y-axis: events.]
Figure 5.8: Multi-ISP dependent disruption events over time: several ISPs in the same
state have simultaneous disruption events on 333 occasions. Here, we show how many
events occurred on each day of the year in 2017. Days with many multi-ISP events often
correlate with days with large known power outages.
[Figure 5.9 plot. x-axis: time in UTC (Sep 11 00:00 to Sep 12 00:00); y-axis: ISPs in multi-ISP event; series: FL, GA, SC.]
Figure 5.9: Multi-ISP dependent disruption events during Hurricane Irma in Florida
(FL), Georgia (GA), and South Carolina (SC). Of 111 events during this time, 15 affected
3 ISPs simultaneously and 96 affected 2.
5.4.3 Dependent disruptions can be multi-ISP
Dependent disruption events can also span multiple ISPs within a single state:
these events indicate a fault of infrastructure shared by the ISPs or their customers.
Here, we broaden our analysis to examine whether the dependent disruption
events we detected are correlated across multiple ISPs within the same state.
We observe 333 instances where multiple ISPs in the same state had simul-
taneous dependent disruption events, and we are able to confirm that many oc-
curred on days when the media reported large power outages in those areas.
Figure 5.8 shows days in 2017 when multi-ISP dependent disruption events oc-
curred. Of the 333 instances, 88 (26%) occurred on a single day during Hurri-
cane Irma (Sep 11). Figure 5.9 shows multi-ISP events during Hurricane Irma
by state and by the number of individual ISPs affected during each multi-ISP
event. We observe 20 multi-ISP events in Florida on Sep 10, when Irma made
landfall [105]. As Irma moves northwards, we see multi-ISP events in Georgia
and South Carolina as well. Other days with many such events include Oct 30
with 19 events across six states in the Northeastern U.S. (Maine, New Hampshire,
Vermont, Connecticut, Massachusetts, Rhode Island); there were recorded power
outages during this time as a result of a severe storm [106–108]. On Oct 22, there
were 4 multi-ISP events in Oklahoma and 2 in Arkansas; there are corresponding
reports of power outages during these times as well [109].
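Finding multi-ISP events amounts to grouping detected events by state and time. The sketch below assumes that "simultaneous" means falling in the same 11-minute time bin and that each event is a `(state, time_bin, isp)` tuple; both are illustrative choices, as are the example values.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

def multi_isp_events(events: Iterable[Tuple[str, int, str]]) -> Dict[Tuple[str, int], List[str]]:
    """Return {(state, time_bin): [isps]} for bins where two or more ISPs
    in the same state had a dependent disruption event."""
    isps_by_key = defaultdict(set)
    for state, time_bin, isp in events:
        isps_by_key[(state, time_bin)].add(isp)
    return {key: sorted(isps) for key, isps in isps_by_key.items() if len(isps) >= 2}

# Made-up events for illustration only.
sample = [("FL", 100, "Comcast"), ("FL", 100, "Viasat"),
          ("GA", 100, "Comcast"), ("FL", 101, "Comcast")]
print(multi_isp_events(sample))
```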
5.4.4 Dependent disruptions may not disrupt entire /24s
Here, we examine if the dependent disruption events that we detected disrupt
entire /24 address blocks. If so, they would likely be detected by prior work that
looks for outages at these granularities [11, 12]. If there continue to be responding
addresses within a /24 with a disrupted address, however, prior work may miss
the disruption.
To analyze how dependent disruptions affect /24 address blocks, we find all
addresses in the observed disrupted group for a dependent disruption event and
group them by /24s. As a running example in this section, consider a dependent
disruption event comprising 3 addresses in 1.2.3.0/24, 5 addresses in 2.3.4.0/24,
and 2 addresses in 4.5.6.0/24. We call these the observed disrupted /24s. For each of
these /24s, we also find how many addresses were pinged by Thunderping that
were responding to pings before the dependent disruption and that continued to
respond for at least 30 minutes after the time-bin where the dependent disruption
occurred. We term these addresses the responsive addresses in a /24 since these
addresses were not affected by the disruption.
Our goal is to find how many /24s exist where at least one address was an
actual address in a dependent disruption but there were other addresses which
continued to be responsive.
First, we check how many of the 20,831 disruption events observed at least
one responsive address in all of the observed disrupted /24s. 12,825 (61%) have at
least one responsive address in all of the observed disrupted /24s. For each such
event, even if some of the disrupted /24s have addresses that failed indepen-
dently, since all disrupted /24s continue to have at least one responsive address,
prior work may miss detecting this event.
Next, we investigate the subset of observed disrupted /24s where there
were at least Dmin failures within the /24 itself. Since the entire state-ASN
aggregate only required Dmin failures, when Dmin or more addresses are disrupted
within a single /24, the /24 has at least one actual disrupted address. We obtain
the minimum bound on the number of actual disrupted addresses in a /24 by
subtracting Dmin − 1 from the observed disrupted addresses in that /24. Suppose
the Dmin for the example dependent disruption event above was 3. We would
obtain a minimum bound of at least 1 actual disrupted address in 1.2.3.0/24.
In 2.3.4.0/24, the lower bound is 3. In 4.5.6.0/24, the lower bound is 0 and we
are unable to determine if the addresses in this /24 had a dependent disruption.
Of 92,777 observed disrupted /24s (across all dependent disruption
events), we find that 14,702 (16%) have at least Dmin disrupted addresses. Each
of these is a point in Figure 5.10.
[Figure 5.10 plot. x-axis: minimum out addresses in /24; y-axis: alive addresses in /24.]
Figure 5.10: Minimum actual disrupted addresses in a /24 vs. responsive addresses
in a /24, for all /24s with at least Dmin addresses that were disrupted during a detected
dependent disruption event.
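The per-/24 lower bound from the running example can be reproduced with Python's `ipaddress` module. The function name and the specific host addresses are made up for illustration; the counts per /24 (3, 5, and 2) and Dmin = 3 follow the example in the text.

```python
import ipaddress
from collections import Counter
from typing import Dict, Iterable

def min_actual_per_slash24(disrupted_addrs: Iterable[str], d_min: int) -> Dict[str, int]:
    """Lower bound on dependently disrupted addresses in each /24.

    Up to d_min - 1 of the observed disruptions could have been
    independent, so the bound is observed - (d_min - 1), floored at 0.
    """
    counts = Counter(
        ipaddress.ip_network(f"{addr}/24", strict=False) for addr in disrupted_addrs
    )
    return {str(net): max(0, n - (d_min - 1)) for net, n in counts.items()}

observed = (
    [f"1.2.3.{i}" for i in (10, 20, 30)]             # 3 addresses
    + [f"2.3.4.{i}" for i in (1, 2, 3, 4, 5)]        # 5 addresses
    + [f"4.5.6.{i}" for i in (7, 8)]                 # 2 addresses
)
print(min_actual_per_slash24(observed, d_min=3))
```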
[Figure 5.11 plots. Panels: (a) Comcast, (b) Qwest, (c) Viasat. x-axis: minimum out addresses in /24; y-axis: alive addresses in /24.]
Figure 5.11: For Comcast, Qwest, and Viasat: Minimum actual disrupted addresses in a
/24 vs. responsive addresses in a /24, for all /24s with at least Dmin addresses that were
disrupted during a detected dependent disruption event. All ISPs have /24s with actual
disrupted addresses where there continued to be responsive addresses throughout the
disruption.
We find that many disrupted /24s with actual disrupted addresses have
other addresses that continued to be responsive. 10,164 (69%) /24s had at least
one responsive address, 9,327 (63%) had at least two responsive addresses, and
6,096 (41%) had at least 10 responsive addresses. 1,691 /24s had at least 10 actual
disrupted addresses; of those, 550 (33%) had at least 10 responsive addresses.
Next, we investigated whether the responsiveness of other addresses in /24s with
actual disrupted addresses varies across ISPs. Figure 5.11 shows per-ISP behavior.
We see that all ISPs have /24s with actual disrupted addresses where
there continued to be responsive addresses throughout the disruption.
5.5 Conclusion
In this chapter, I showed how to detect dependent residential disruption events
using individual address disruptions. Using the binomial test, I detected events
where multiple addresses that are related to each other by geography and ISP
fail simultaneously such that the failures are unlikely to have occurred indepen-
dently. The technique is capable of detecting large known disruption events, such
as power outages during times of severe thunderstorms, but importantly, can also
detect much smaller events. By analyzing these events, I demonstrated that prior
techniques which detect dependent disruptions affecting a substantial number of
addresses in BGP prefixes or /24 address blocks can miss observing these events.
These results motivate finding individual address outages for measuring residen-
tial Internet reliability.
Chapter 6: Analyzing weather’s effect on Internet Reliability
One aspect of measuring Internet reliability is to determine if the occurrence of
certain events adversely affects Internet connectivity. Consider the occurrence
of adverse weather conditions for instance: prior work has shown that Internet
outages occur more frequently during times of precipitation [5]. However, this
work was preliminary in nature and was performed over a short duration (three
months).
In this chapter, I discuss a technique to quantify the effect of external factors,
such as the occurrence of various weather conditions, upon Internet connectivity
of residential addresses using measurements from the Thunderping probing sys-
tem [5]. The technique mitigates false outages due to dynamic addressing and
user behavior. First, I verify that weather conditions do not positively correlate
with peak diurnal failure periods, where dynamic addressing or false outages
due to user behavior are common. Next, I quantify the absolute increase in the
number of outages observed during weather, when compared to non-weather
periods—the outage inflation—for several types of weather including times with
precipitation and times with extreme temperatures and high winds. By study-
ing outage inflation of various link types and geographic regions across weather
conditions, I am able to identify networks that are vulnerable to weather.
6.1 Introduction
Weather-related damage to vital infrastructure can lead to significant economic
harm. Yet, little is known about the economic impact of weather-induced out-
ages on the most pervasive infrastructure that people use to access the Internet:
residential last-mile links. For massive last-mile outages, telcos are required by
U.S. policy [110] to report the outage to the FCC. However, the minimum re-
porting threshold is high: the outage must be at least 30 minutes in duration,
and it must have affected tens of thousands of customers [110]. Researchers
have also studied widespread link failures in the Internet, like undersea cable
cuts [111, 112], natural disasters [113], and backbone router failures [114].
In practice, most weather events are much more localized and not severe
enough to generate such a large outage. For decades, this everyday weather
has been known to lead to smaller-scale outages of telecom infrastructure.
For example, early telephone and cable television engineering documents de-
scribe how to avoid moisture in wires because it impedes signal propagation [115,
116]. Also, rain attenuates satellite signals above 10 GHz [117]. Finally, point-to-
point wireless links can experience multipath fading due to objects moving in
the wind [118]. In short, residential links are vulnerable to everyday weather be-
cause residential equipment and wiring are often installed outdoors: wind can
blow trees onto overhead wires, heat can cause equipment to fail, and rain can
seep into underground equipment cabinets.
Surprisingly, for these everyday weather conditions, there are no public
statistics on the frequency or magnitude of the outages they induce (directly and
indirectly). This could be a problem for Internet-based companies because they
do not know how many customers they are losing to nature, and for regulators
because they do not know how significant the problem is, and which conditions
and geographic areas deserve their attention. In this work we resolve this is-
sue: we provide the first comprehensive study that identifies the correlation be-
tween everyday weather and residential Internet last-mile outages. Specifically,
we quantify the absolute increase in the number of outages observed during
weather, when compared to non-weather periods.
Quantifying the relationship between occurrences of weather and an in-
crease in outages cannot be answered with a short term study. The data set
needs to be longitudinal because weather is seasonal—certain weather conditions
only happen at certain times of year—and because some weather events are rare
enough that providers in a specific location may not be adequately prepared.
Targeted probing is needed because weather is localized: at any time only specific
geographic locations are exposed to weather conditions. Broad observation of
outages of several links will capture correlated outages of several hosts, such as
the work by Heidemann et al. [11, 119], but it will not reveal failures of individual
links as may be the case for weather. Although some systems can obtain detailed
measurements at residential gateways [120, 121], the limited deployment of these
measurement systems makes them inadequate for studying the scale needed to ob-
serve many different weather conditions, multiple times, in different geographic
areas. Therefore, we performed a seven-year longitudinal study with targeted
measurements of residential links surviving weather events.
In 2011, we introduced a measurement system for this task called Thunder-
Ping [122]. For the past seven years ThunderPing has been following forecasts
of weather in the U.S. and pinging a sample of 100 hosts from each last-mile
provider in the area for six hours before, during, and six hours after the fore-
casted weather event. The focus of our initial paper on ThunderPing was its
probing methodology, but it also included a preliminary study that looked at 66
days of data. Given how limited the data set was, we were unable to draw statis-
tically significant conclusions and we saw only one season, summer, of one year.
We also did not have enough data to explore variations in effect of weather by ge-
ography, nor could we explore if the likelihood of failure varies with continuous
weather conditions (e.g., wind speed).
In this chapter, the time totaled across all responsive links exposed to different
weather events is in the centuries. For example, we have observed a total of 100
centuries of DSL links exposed to cold weather. This large data set enabled us to
address all of these significant limitations of our prior preliminary study.
There is a challenge with quantifying how weather correlates with outages:
outages are relatively uncommon events, and thus every outage is a significant
event. This is compounded by the fact that we wish to analyze subsets of our data
to focus on, say, particular link types or locations. With so few outages observed
compared to the time that links are responsive, it is difficult to determine if dif-
ferent weather causes a statistically significant increase in outages. To address
this issue, we borrow statistical tools from epidemiology that enable us to reason
about the inflation in dropout probability, and to establish statistical significance
to our results, even though failures happen at relatively low rates. We detail this
approach in Section 6.3.1, as we believe it to be of general use to the community.
Another challenge is that this metric could be artificially inflated by weather
conditions coinciding with daily network state changes such as maintenance or
renumbering [21]. We verify that weather does not appear to be positively corre-
lated with peak diurnal failure periods.
Observations and Contributions We present a dataset spanning seven years, all
weather conditions, and 76 billion responsive pings to 8.7 million hosts through-
out the U.S. We apply techniques from epidemiology to attribute statistically sig-
nificant rates of dropout to individual weather conditions. Our key findings span
three broad areas of analysis:
• Link type variations (§6.4.1): Different link types experience weather in highly
varying ways. For instance, compared to wired link types (cable, DSL, fiber),
wireless link types (WISPs and satellite) experience greater increases in dropout
rates during rainy conditions and high temperatures, but often decreases in
dropout rates in snow and cold temperatures.
• Geographic variations (§6.4.2): Different geographic regions can be affected
to varying degrees. For instance, Midwestern U.S. states are more prone to
failures in thunderstorms and rain than coastal states. Southern states are more
prone to failures in snow than other states.
• Continuous variable analysis (§6.4.3): Most link types have highly nonlinear
dropout rates with respect to changes in temperature, wind speed, and precip-
itation. For temperature, dropout rates are typically non-monotonic; satellite
links drop out more in moderate temperatures than low or high temperatures.
Our findings have ramifications on how network outage detection and anal-
ysis should be performed; limiting measurements to any particular geographic
region, link type, or time of year can introduce statistically significant bias. We
believe our results also have implications for network administrators and policy-
makers; an increased use of satellite links in the Midwestern U.S. has resulted in
those states’ increased dropout rates in rainy weather. We will be making all of
our data and code publicly available.
6.2 Data Set
This section describes the data we collected and its initial processing. We start
with a definition of the partially interpreted data we seek: “dropouts,” where
an address fails to respond in the context of otherwise “responsive hours” of an
address. Dropouts and disruptions from Chapter 5 are synonymous. We next
briefly review the ThunderPing data probing system and present brief statistics
about the raw active probing data. Then, we review the weather data, particu-
larly how and where it is collected and how we handle hurricanes. We conclude
by describing the benefits (and limitations) of this data for our study of weather-
related effects.
6.2.1 Dropouts, Defined
A dropout happens when the address attached to a residential link transitions
from being responsive to pings from multiple vantage points, to being unre-
sponsive from all of the vantage points. Specifically, we define a residential
link “dropout” as an hour when at least three vantage points pinging a host
and receiving replies suddenly experience 11 minutes (an entire probing inter-
val) where they do not receive a reply before a five second timeout. This dropout
occurs within a “responsive address hour,” a continuous observation of an IP ad-
dress in known weather conditions. A responsive hour may or may not include
a dropout, and the ratio of dropouts to responsive hours is a measure of outage
likelihood. Responsive hours are additive: two addresses each observed for the same
hour count the same as one address observed for two hours in the same conditions.
Our selection of three vantage points is based on prior work’s selection of
three vantage points to observe outages [11]. Our selection of a five second time-
out for ping responses is based on our prior work that observed that most ping
replies to residential hosts are received within five seconds [19]. Our selection of
one hour as the time period for a dropout is based on the fact that the weather
data we collected consists of hourly reports. Considering at most one dropout per
probed address per hour will diminish the number of observed dropouts from
individual links, if they should alternate between responsive and unresponsive
states: there can be at most one per hour, not five (due to the 11 minute probing
interval).
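As a minimal sketch of this counting rule (not ThunderPing's actual code), the following Python collapses repeated transitions by the same address within one clock hour into a single dropout hour. The function name and the input layout, a list of (address, Unix timestamp) pairs with one pair per responsive-to-unresponsive transition, are assumptions made for illustration.

```python
def count_dropout_hours(transitions):
    """Count dropout hours from responsive-to-unresponsive transitions.

    `transitions` is an iterable of (address, unix_timestamp) pairs
    (a hypothetical layout). Several transitions by the same address
    within the same clock hour collapse into one dropout hour.
    """
    dropout_hours = {(addr, int(ts) // 3600) for addr, ts in transitions}
    return len(dropout_hours)


# One address flaps three times in the first hour (every 22 minutes),
# then once more in the following hour.
flapping = [("192.0.2.1", 0), ("192.0.2.1", 1320), ("192.0.2.1", 2640),
            ("192.0.2.1", 3700)]
print(count_dropout_hours(flapping))  # 2
```

Keying on the (address, hour) pair is what caps the count at one dropout per address per hour, matching the hourly granularity of the weather reports.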
Observing a dropout is a sign that a residential link may (but may not) have
experienced an outage: Dropouts are a superset of outages. Dropouts can also occur
if the device re-attaches to the network with a new address after only momen-
tary disconnection, typically through re-association of a PPP session for a DSL
modem, but potentially through administrative renumbering of prefixes. For our
purposes, we expect these events to occur independent of weather, such that the
two events can be studied separately. We confirm that dropouts during typical
maintenance intervals are unlikely to correlate with weather in Section 6.3.2.
In short, by observing dropouts, we will be able to observe how residential
links behave during weather, at the scale necessary to make quantitative conclu-
sions about weather’s effect on residential links in the U.S.
6.2.2 Dropouts, collected
We briefly summarize the methodology of “ThunderPing”: our probing system
that has been running for seven years. More details about ThunderPing can be
found in our preliminary work in IMC 2011 [5].
The ThunderPing probing methodology is as follows: For every forecast of
severe weather provided by the US National Weather Service, ThunderPing pings
a sample of 100 residential hosts from each provider in the affected region. The
affected region is specified by FIPS code, which roughly corresponds to counties
in the U.S. The probing starts up to six hours before the forecast event, continues
during the event, and terminates six hours after the event, regardless of whether
the weather materializes.
The residential hosts ThunderPing pings during each weather event are se-
lected from a master list of residential hosts classified by provider (reverse DNS
name) and geographic location (FIPS code). We classify link type by provider,
when the provider implies a well-defined link type; (typically rural) providers
that use a variety of media types to provide connectivity are included under “All”
link types with the rest, but are not classified further. We determine location us-
ing a MaxMind database from the same year for choosing which addresses to
probe, but from the same month for analysis. Although there are errors in both
classifications, a location error would be expected to cause an underestimate of
the effect of weather by placing a host not in the forecast region falsely into the
area of weather effect.
ThunderPing sends pings to each of these hosts from up to 10 geographi-
cally dispersed PlanetLab vantage points every 11 minutes. This interval is due
to [119]. When a PlanetLab node fails, we replace it, but if the number of work-
ing vantage points drops below three, we discard observations at that time as
untrustworthy. When there are at least three, we require that all active vantage
points do not have a response in order to label the event as a dropout.
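The vantage-point rule above can be sketched as follows. This is an illustrative simplification rather than the system's implementation: the function names, the string labels, and the choice to call an interval responsive as soon as any vantage point hears a reply are all assumptions.

```python
MIN_VANTAGE_POINTS = 3  # threshold stated in the text


def classify_interval(replies):
    """Classify one 11-minute probing interval for one address.

    `replies` maps each working vantage point (hypothetical names) to
    True if it received at least one ping reply during the interval.
    """
    if len(replies) < MIN_VANTAGE_POINTS:
        return "discard"        # too few working vantage points to trust
    if any(replies.values()):
        return "responsive"     # assumption: one reply suffices
    return "unresponsive"       # every working vantage point saw silence


def is_dropout(previous, current):
    """A dropout is a responsive interval followed by an unresponsive one."""
    return previous == "responsive" and current == "unresponsive"


before = classify_interval({"vp1": True, "vp2": True, "vp3": True})
after = classify_interval({"vp1": False, "vp2": False, "vp3": False})
print(is_dropout(before, after))  # True
```

Discarded intervals never count as either state, so a vantage-point shortage cannot by itself produce a dropout.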
ThunderPing retransmits failed ICMP requests: when a vantage point sees
a lack of ping response it retries that ping with an exponential backoff up to 10
times within the 11 minute probing interval. Therefore, a dropout will typically
require at least 30 failed ICMP requests.
ThunderPing has been running for seven years, and has collected 76 billion
responsive pings to 8.7 million residential addresses.
6.2.3 Weather, classified
To quantify the effect of weather on dropouts, we needed to determine what
weather residential links were exposed to when a dropout did or did not occur.
The US National Weather Service (NWS) operates a network of 900 auto-
mated “ASOS” weather stations. These weather stations are typically located at
airports. The NWS weather stations record hourly observations of 24 weather
variables in METAR format and make those available [123].
There are two types of weather information: categories that account for
the common precipitation types (e.g., thunderstorm, hail, snow) and continuous
variables (e.g., wind speed, precipitation quantity).
We annotate each responsive address hour for an address with the corre-
sponding weather information associated with the geographically closest weather
station to that address. Doing so allows us to find the number of responsive hours
and dropout address hours in specific weather conditions.
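A minimal sketch of the nearest-station annotation is shown below. The text does not say how distance is computed; great-circle (haversine) distance is an assumption here, as are the function names and the approximate station coordinates used in the example.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_station(lat, lon, stations):
    """Return the ID of the station closest to (lat, lon).

    `stations` maps a station ID to its (lat, lon) pair.
    """
    return min(stations, key=lambda sid: haversine_km(lat, lon, *stations[sid]))


# Approximate, illustrative coordinates for two airport weather stations.
stations = {"KBWI": (39.18, -76.67), "KDCA": (38.85, -77.04)}
print(nearest_station(38.99, -76.94, stations))  # KDCA
```

Each responsive address hour would then inherit the hourly observation reported by the station this lookup returns.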
Hurricanes are special Severe events are among the most important failure events
for us to study how the Internet is affected, as the Internet is increasingly relied
on as the primary mode of communication in an emergency [4]. However, se-
vere events have the potential to overwhelm the typical and obscure interesting
observations.
The following hurricanes made US landfall during our measurement: Irene
(Aug 26–30, 2011), Isaac (Aug 25–31, 2012), Sandy (Oct 28–Nov 1, 2012), Arthur
(Jul 3–5, 2014), Hermine (Sept 1–3, 2016), Matthew (Oct 6–9, 2016), Harvey (Oct
6–9, 2017), Irma (Sept 9–13, 2017), and Nate (Oct 7–10, 2017) [124]. Hurricanes
manifest as a combination of weather features and are so pronounced that their
contribution to thunderstorm or rain outages would be disproportionate.1 We
thus omit them from categorical weather classification (e.g., Figure 6.2). How-
ever, we consider data from Hurricane events when studying continuous vari-
ables (inches of rain and wind speed, for example, where these extremes are
clearly distinguishable). Collectively, these hurricane times account for less than
3% of responsive address hours and 4% of dropout hours.
6.2.4 Data, Summarized
This data set comprises observations from January 2011 to December 2017, though
only 1,467 days included sufficiently many operating vantage points to classify a
responsive address hour.
We show per-ISP highlights in Table 6.1. We observe major providers such
as Comcast, Qwest, and ViaSat in all fifty states (and DC). Of the 1.77 billion
responsive address hours from Table 6.1, 139M (8%) were hours where responsive
addresses experienced rain, 66M (4%) snow, and 19M (1%) thunderstorm.
1It is disappointing to realize the irony that the most significant weather events are also the
least surprising.
ISP                   IPs  Airports  States  Dropout hours  Responsive hours
Cable
  Comcast       2,430,104       476      51        532,493       249,562,477
  Charter         654,270       418      47        279,950        95,627,261
  Suddenlink      195,398       156      26        158,159        34,684,550
  Cox             166,596       270      47         61,318        24,659,573
DSL
  Qwest           592,220       710      51        873,042        99,037,723
  Centurylink     342,556       237      33        445,085        83,101,301
  Verizon DSL     312,344       201      29        169,078        35,133,098
  Megapath        147,860       351      43        206,569        65,436,394
Fiber
  Verizon Fios    415,481       154      23         45,982        48,296,147
  GVTC             17,758         7       1          9,113         2,023,618
  Dickey            5,229         6       3          6,482         1,114,057
WISP
  RISE Bdbd.       57,021        87      22         40,932        12,442,717
  Skyriver          4,187        29       6          5,364         2,032,975
  Watch Comm.       4,738        11       2         14,980         1,411,321
Satellite
  ViaSat          161,592       763      51        815,258        29,364,585
  SageNet           1,352        65      30          3,762           555,288
All             8,674,043       844      51      9,826,096     1,770,774,634
Table 6.1: Summary of data set for large ISPs classified by link type. “All” comprises
data from ISPs not included in this sample. (For this table, we count D.C. as a state.)
Contrasted to our preliminary study [5], this covers nearly 22 times the du-
ration (compared to 66 days), and includes roughly 60 times as many dropout
events (likely because those days were in spring and early summer).2
6.2.5 Why this data?
Others have studied outages and collected broad IP responsiveness data. Here
we describe the benefits of our data, addressing its limitations in Section 6.2.6.
Our data provides a view on outages of individual addresses, including iso-
lated outages of “customer premises” equipment or singly-connected links that
are most exposed. We rely on statistics to identify a significant change in likeli-
hood of failure, rather than rely upon large outages of infrastructure common to
a larger aggregate prefix to signify significance. Every residential link is wired
with its own infrastructure: every residence can have different equipment in-
stalled in different ways and has its own resident network administrator. As a
concrete example, we expect to observe the effect of water infiltration in the net-
work interface device (the demarcation point connecting premise phone wiring
to the provider). (We discuss the flip side of this coin below.)
2In the public reviews of the IMC 2011 paper, all of the reviewers stated that they wished the
dataset was more comprehensive so conclusions could be made about the effects of weather on
residential links.
Our data is of a scale large enough to compare link types, providers, geog-
raphy, and across time. Seven years of data make it feasible to observe multiple
instances of both severe and common weather events. Rare events include a fair
number of tornadoes and virtually unique events such as snow in Louisiana.
Many observations of similar weather increase the confidence in our dropout
probability estimates, making it possible to split the data and identify the differ-
ences between, for example, heavy and light rain on wireless ISPs in Kentucky.
The sampling approach—providing data for each provider in an area—ensures
that even less-used network links and providers are well-represented, permitting
a comparison with satellites and wireless ISPs that might be poorly represented
in end host measurement probes [16, 120, 121] or when using provider-specific
data [10, 125, 126].
Our data includes data from times not subject to interesting weather: the
method probes before and after forecast weather alerts. “Typical” weather occurs
particularly when the forecast does not materialize or the forecast is for a long-
term event (e.g., summer fire warnings). With these measurements, we can estab-
lish a baseline for the rate of dropouts in common weather conditions. Probing
after the weather also permits measuring recovery time as we wait for previously
responsive addresses to return.
Our data is not sensitive to link failures elsewhere in the Internet or to Plan-
etLab vantage point failures. Restated, with multiple vantage points, catastrophic
Internet link outages, such as the fiber cuts during the “Baltimore tunnel fire” in
2001 [127] will only be considered as an outage if all vantage points are unable to
communicate with the host over the residential link. As described above, without
three active vantage points, we make no decision about address responsiveness.
6.2.6 Dataset Limitations
The essence of ThunderPing is to selectively probe only when there is a weather
alert forecast for an area, which biases the data toward time periods where there
is some atypical weather present. Obviously, regions that experience temperate
weather are unlikely to be represented, and we thus do not attempt to quantify
what fraction of all residential network outages are caused by weather. More
subtly, during the interval around forecast severe weather, the weather conditions
may not be ideal: our estimate of the background dropout rate is likely inflated
by proximity to potentially severe weather, thus causing us to underestimate the
quantitative effect of that weather.
Our approach relies upon active probing to gain breadth across hundreds
of providers, but there are limits to this breadth: providers may administratively
filter ICMP requests and home routers may decline to respond. We assume that
providers and end hosts that filter are no more or less vulnerable to weather and
that these features do not affect our conclusions.
Our data set does not identify the cause of an individual dropout. Our
analysis seeks to correlate observations of dropouts with weather events under
the expectation that a change in probability of outage is related to the weather.
Should a user turn off equipment nightly, this is independent of weather and will
not be a factor; should a user unplug equipment when lightning is nearby,
such would contribute to the probability of dropouts in thunderstorm. Residen-
tial Internet infrastructure is also explicitly reliant on residential electrical power,
and we do not isolate power failures. We expect network service outages to be
more common than power outages, for power outages to occur only in the most
severe of weather conditions, and for power outages not to correlate with link
type.
Finally, AT&T, one of the largest DSL and fiber providers in the US, does not
assign reverse DNS names to their residential customers. As such, they were not
included in our master list of residential links that we probe with ThunderPing.
6.3 Quantifying weather dropouts
In this section we describe our methodology for quantifying how much weather
correlates with a change in the probability of residential link dropouts. The goal
is to find a metric that can measure how the likelihood of a dropout increases (or
not) during weather.
The challenge in computing this metric is that dropouts are uncommon. This
makes it difficult to demonstrate that there is a statistically significant increase in
dropouts during weather. Even if there is an increase, it may simply be due to
the fact that certain types of weather tend to occur during time periods that are
commonly used for network maintenance or IP renumbering.
By leveraging statistics developed for epidemiology, we overcome the first
challenge and find statistical significance. By carefully inspecting our data set,
we verify that no type of weather that we study correlates with diurnal variations
in dropout probability.
We will now describe our metric, inflated probability of dropout, then we
will verify that conclusions we draw from this metric will not be skewed due to
time-correlated network state changes.
6.3.1 Metric: Inflated probability of dropout
Challenge: Dropouts are uncommon
Correlating dropouts with weather is challenging to do with statistical signifi-
cance because dropouts are rare events.3 On average, we observed a link dropping
out only once every 2–30 days that we were actively pinging and receiving re-
sponses from the host residential link, depending on the link type. The inverse
of the average dropout rate per link type—including the average across all link
types—is as follows:
Link type:              Fiber  Cable  WISP  DSL  Sat  All
Days between dropouts:     30     16     8    9    2    8
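To relate these figures to an hourly scale, the days between dropouts can be inverted into an approximate number of dropouts per responsive hour. The snippet below is a back-of-the-envelope illustration using only the values in the table above; the variable names are mine.

```python
# Days between dropouts per link type, copied from the table above.
days_between_dropouts = {"Fiber": 30, "Cable": 16, "WISP": 8,
                         "DSL": 9, "Sat": 2, "All": 8}

# One dropout every `days` days is one dropout every `days * 24` hours.
hourly_rate = {link: 1 / (days * 24)
               for link, days in days_between_dropouts.items()}

for link, rate in hourly_rate.items():
    print(f"{link:5s} {rate:.5f} dropouts per responsive hour")
```

Even for satellite links, the least reliable type here, this works out to roughly one dropout per 48 responsive hours, which is why rare-event statistics are needed.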
Given how rarely dropouts occur, we now describe a metric that accounts
for this phenomenon, and even provides a way to determine statistical signifi-
cance so that we can ensure that our analysis does not involve too much slicing
and dicing of the few dropouts that occur.
3...and they should be. If dropouts were common, residential links would be unusable.
Our Approach: Hazard Rate
Fortunately, there is a well-established set of techniques from the field of epi-
demiology that permit statistical significance over rare events. Epidemiology—
the study of the occurrence and determinants of disease—faces similar challenges
when analyzing mortality: deaths (“failures”) are rare, and subjects (“links”) can
be exposed to their disease (“weather condition”) for different amounts of time
until the time of death (“dropout”). Here, we describe the techniques we bor-
row from biostatistics [102] to address these concerns. Throughout our study, we
will consider different groups of “subjects”: link types, geographic regions, and
combinations thereof.
Like in epidemiological studies, we focus on estimating the hazard rate (some-
times referred to as the instantaneous death rate). In essence, what a hazard rate
gives us is the expected number of deaths per unit time. More concretely, for a
given hazard rate λ, the probability of death over a short duration of time t is λ · t.
The first challenge in estimating hazard rates is that different subjects may
be observed over different periods of time: in our study, hosts that remain respon-
sive can naturally be observed for longer periods of time than those that drop out.
Throughout an observation period, we track the amount of time Oi that we ob-
serve each host i = 1, . . . , n, and we also count the total number of dropouts, F .
An unbiased estimate of the hazard rate λ̂ can be obtained as follows [102, Chapter
15.4]:
\[ \hat{\lambda} = \frac{F}{\sum_{i=1}^{n} O_i} \tag{6.1} \]
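Eq. (6.1) translates directly into code. The sketch below is illustrative: the function name is hypothetical, and the example treats the Comcast row of Table 6.1 as a single aggregated observation period measured in responsive hours.

```python
def hazard_rate(observed_hours, dropouts):
    """Eq. (6.1): total dropouts F divided by total observation time sum(O_i)."""
    total = sum(observed_hours)
    if total <= 0:
        raise ValueError("no observation time")
    return dropouts / total


# Comcast in Table 6.1: 532,493 dropout hours over 249,562,477 responsive hours.
comcast = hazard_rate([249_562_477], 532_493)
print(f"{comcast:.5f}")  # 0.00213
```

Because the denominator sums observation time over hosts, a host that drops out early contributes less exposure than one that stays responsive for the whole period.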
We exclude any bin of data if it does not have enough samples to permit
computing confidence intervals. We adhere to the following rule of thumb [102,
Chapter 6]: we accept a bin with n samples and estimated hazard rate λ̂ only if
\[ n \geq 20 \quad \text{and} \quad n\hat{\lambda}(1-\hat{\lambda}) \geq 10 \tag{6.2} \]
When these conditions hold, we can calculate 95% confidence intervals over the
estimated hazard rate as follows [102, Chapter 6.3]:
\[ \hat{\lambda} \pm 1.96 \cdot \sqrt{\frac{\hat{\lambda}(1-\hat{\lambda})}{n}} \tag{6.3} \]
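The acceptance rule of Eq. (6.2) and the interval of Eq. (6.3) can be sketched as below. The function names are hypothetical, and the example assumes that n counts the samples in a bin and that the rate is expressed per sample, so that it lies between 0 and 1.

```python
import math


def bin_is_usable(n, rate):
    """Eq. (6.2): accept a bin only if n >= 20 and n * rate * (1 - rate) >= 10."""
    return n >= 20 and n * rate * (1 - rate) >= 10


def confidence_interval(n, rate, z=1.96):
    """Eq. (6.3): 95% confidence interval, returned as (low, high)."""
    half_width = z * math.sqrt(rate * (1 - rate) / n)
    return rate - half_width, rate + half_width


# Made-up bins: 10,000 samples pass the rule at a 1% rate, 500 samples do not.
print(bin_is_usable(10_000, 0.01))  # True
print(bin_is_usable(500, 0.01))     # False
low, high = confidence_interval(10_000, 0.01)
print(f"{low:.4f} to {high:.4f}")
```

The second condition in Eq. (6.2) is what rejects small bins when the rate is low: at a 1% rate, a bin needs on the order of a thousand samples before the normal approximation behind Eq. (6.3) is considered reasonable.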
The above calculations yield the hazard rate along with its confidence inter-
vals; what remains is to compare two hazard rates, for instance, the overall hazard
rate for a given link type and the hazard rate for that link type specifically in
the presence of snow. Two estimated hazard rates λ̂1 and λ̂2 can be compared
by simply subtracting them [102]. Fortunately, with sufficiently many samples,
the confidence interval over the difference of two hazard rates is given by the
addition of the confidence intervals over the original hazard rates.4
4This follows from the fact that var(λ1−λ2) is approximately var(λ1)+var(λ2) when Eq. (6.2)
holds.
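The comparison of two hazard rates can be sketched as follows, using made-up sample counts and rates. The function names are hypothetical. The code sums the two half-widths, as the text describes; this is never narrower than combining them in quadrature, which is what additive variances alone would give, so the resulting interval errs on the conservative side.

```python
import math


def half_width(n, rate, z=1.96):
    """Half-width of the 95% interval from Eq. (6.3)."""
    return z * math.sqrt(rate * (1 - rate) / n)


def inflation(n_weather, rate_weather, n_base, rate_base):
    """Difference of two hazard rates with a summed-half-width interval."""
    diff = rate_weather - rate_base
    width = half_width(n_weather, rate_weather) + half_width(n_base, rate_base)
    return diff, (diff - width, diff + width)


# Made-up example: 0.4% hourly dropout rate in weather vs. 0.2% baseline.
diff, (low, high) = inflation(100_000, 0.004, 1_000_000, 0.002)
print(low > 0)  # True: the interval excludes zero
```

An interval that excludes zero is the criterion for calling the increase statistically significant in this sketch.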
To summarize: in the results throughout this chapter, we compute hazard
rates using Eq. (6.1), discard any bins that do not satisfy Eq. (6.2), and compute
confidence intervals using Eq. (6.3). When presenting our results, we multiply the
hazard rate by a short time interval, one hour, to estimate the hourly probability
of a dropout.
6.3.2 Verifying the metric will not be skewed by common dropouts
correlating with weather
Challenge: Dropout probability in weather may be inflated due to
diurnal events
Observing an increase in dropout probability during weather periods compared
to non-weather periods may be skewed by common network state changes that
tend to occur during certain types of weather events. This is a significant problem
because it is more likely that a dropout would be caused by everyday changes in
network state such as nightly maintenance periods, IP renumbering, and cus-
tomers powering off their links at night, rather than weather-induced outages.
Our Approach: Verify that weather events do not positively corre-
late with common dropout periods
[Figure 6.1 panels, both plotted against hour of the week (UTC): (a) Hourly P(Dropout) per link type (Sat, WISP, All, DSL, Cable, Fiber), with reference levels at 1 per day, 1 per week, and 1 per month; (b) Response hours per weather condition (cold, rain, hot, snow, thunderstorm, gale).]
(a) Dropout probability has significant diurnal variation. (b) Different weather conditions are prominent at different times.
Figure 6.1: Weather does not occur most often during hours of the week when there is
an inflated number of dropouts.
The first question we must answer is: Are there any hours of the week that have
a significantly higher probability of dropout than other hours of the week? To
answer this question, we evaluate the probability of dropouts in each hour of the
week in the following manner: for each hour of the week, we counted the number
of dropouts (recall that dropouts only occur at most once per hour per link) across
all links observed during that hour, then we divided that by the number of hours
in which those links were responsive. We did this for each link type separately, as some
link types may be more likely to be renumbered. For example, in prior work
we discovered that European DSL links have their IP addresses reassigned every
night at 2:00 AM UTC [21]. Also, some link types may require maintenance more
often than others.
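The per-hour-of-week computation can be sketched as follows. The record layout (link type, Unix timestamp of the responsive hour, dropout flag), the function names, and the convention that hour 0 is Monday 00:00 UTC are assumptions for illustration; the text does not state which hour-zero convention the figure uses.

```python
from collections import defaultdict

HOUR = 3600
WEEK_HOURS = 168


def hour_of_week(unix_ts):
    """Hour of the week in UTC, with hour 0 starting Monday 00:00.

    The Unix epoch fell on a Thursday, hence the three-day shift.
    """
    return (int(unix_ts) // HOUR + 3 * 24) % WEEK_HOURS


def dropout_probability_by_hour(records):
    """Map (link_type, hour_of_week) to dropouts / responsive hours.

    `records` holds one (link_type, unix_ts, had_dropout) tuple per
    responsive address hour (a hypothetical layout).
    """
    responsive = defaultdict(int)
    dropouts = defaultdict(int)
    for link_type, ts, had_dropout in records:
        key = (link_type, hour_of_week(ts))
        responsive[key] += 1
        dropouts[key] += int(had_dropout)
    return {key: dropouts[key] / responsive[key] for key in responsive}


records = [("DSL", 0, True), ("DSL", 0, False),
           ("DSL", 3600, False), ("Cable", 0, False)]
result = dropout_probability_by_hour(records)
print(result[("DSL", 72)])  # 0.5
```

Keeping the link type in the key is what allows renumbering-prone link types to show their own diurnal pattern rather than being averaged away.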
The results are shown in Figure 6.1(a). As expected, the hourly probability
of dropouts significantly varies in a diurnal pattern over the course of each week.
Prior work suggests that ISPs are more likely to perform administrative main-
tenance during weekday night hours [128, 129]; we speculate that the increased
dropout probability during weekday night hours could be due to administrative
maintenance. The probability of dropouts for every link type rises in the
evening and peaks near midnight Eastern Standard Time (indicated with vertical
dotted black lines), after which it drops off significantly until the early hours of
the morning.
Given that we observe a diurnal variation in hourly likelihood of dropouts,
and the fact that weather conditions also have a known diurnal pattern of occur-
rence [130], the next question we must answer is: Does hourly weather occur-
rence positively correlate with dropout probability? To answer this question, we
count the total number of responsive hours that we observed in each hour of the
week for each weather condition.
The results are shown in Figure 6.1(b). As expected, most weather con-
ditions, possibly except for snow, have a diurnal pattern in their occurrence.
Fortunately, none of the weather conditions have a positive correlation with the
hourly probability of dropouts. There is, however, a negative correlation with cold
weather: the coldest point of the night is also when the lowest hourly probability of
dropouts occurred. This negative correlation will not have an effect on the
quantified failure rate, as dropouts are less likely to occur during the hours when it is
cold than during other hours.
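One way to make the comparison between the two hour-of-week series concrete is a correlation coefficient. The sketch below uses the Pearson coefficient, which is an assumption chosen for illustration; the text above does not state which measure of correlation was applied. The inputs would be two equal-length series, for example the 168 hourly dropout probabilities and the 168 hourly response-hour counts for one weather condition.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Illustrative only: the choice of Pearson's r is an assumption.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

A coefficient near or below zero for a weather condition would correspond to the observation above that the condition does not occur preferentially in the hours with inflated dropouts.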
Baseline probability of dropout depends on link type
The investigation into probability of dropout for each link type also provides
additional justification for the selection of a metric that is based on the increase in
failure probability due to weather. The dropout probability is significantly different
for each link type, with Fiber being the lowest and Satellite being the highest
(Figure 6.1(a)). With this metric, the baseline failure rate will be removed from all
link types, including the diurnal variations in dropout probability.
6.3.3 Summary
We selected a hazard rate-based metric that enables us to study the statistically
significant increase in dropout probability during weather events. Then we veri-
fied that this metric will not be skewed by nightly spikes in dropout probability
because they do not correlate with occurrence of weather conditions.
6.4 Weather Analysis
In this section, we use our collected data to understand how weather conditions
affect dropouts.
6.4.1 Relative dropout rates
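[Figure 6.2: grouped bars for the link types All, Cable, DSL, Fiber, WISP, and Sat. Top: response centuries (log scale, 0.001 to 100). Bottom: inflation in hourly P(Dropout), with reference marks at 1 per week and 1 per month. Weather conditions: Tornado, Thunderstorm, Heavy rain, Moderate rain, Light rain, Heavy snow, Moderate snow, Light snow, Hail, Freezing rain, Gale, Hot, Cold.]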
Figure 6.2: The number of response-hours (in centuries) for which we have measured
various link types in various weather conditions (top), and the additional (“inflated”)
probability of dropout experienced in those link- and weather-types (bottom).
First, we analyze the relative rate of dropouts under various link types
and weather conditions, after omitting all hurricane periods. We use categorical
data from weather records (such as “thunderstorm present”), to assign a single
weather condition for each hour. If more than one weather condition occurred in
an hour, then we assign the most severe condition to that hour.
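The single-condition assignment can be sketched as below. The severity ranking is an assumption for illustration only: it simply mirrors the legend order of Figure 6.2, and the ranking actually used is not listed in this section. The condition names are likewise placeholders.

```python
# Hypothetical ranking, most severe first (mirrors the Figure 6.2 legend).
SEVERITY_ORDER = [
    "tornado", "thunderstorm", "heavy rain", "moderate rain", "light rain",
    "heavy snow", "moderate snow", "light snow", "hail", "freezing rain",
    "gale", "hot", "cold",
]
RANK = {condition: i for i, condition in enumerate(SEVERITY_ORDER)}


def most_severe(conditions):
    """Pick the single condition assigned to an hour.

    `conditions` holds every condition reported during that hour;
    unknown labels are ignored, and None means no listed condition.
    """
    known = [c for c in conditions if c in RANK]
    if not known:
        return None
    return min(known, key=RANK.__getitem__)
```

Assigning exactly one label per hour is what makes categories such as "thunderstorm" and "heavy rain" mutually exclusive in the analysis that follows.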
The top of Figure 6.2 shows the number of responsive hours for which we
measured the various link- and weather-types. Although there is a wide range
in their absolute values (note the log-scale of the y-axis), the overall shape of
the histograms remains mostly consistent across the different link types. This
reflects the fact that, in their deployment throughout the US, different link types
are exposed to very similar conditions, with one minor exception: we did not
measure any fiber or satellite links during tornadoes.
The bottom of Figure 6.2 shows the difference in hourly dropout probability
between the presence of a weather condition and its absence. A value of zero
signifies no observed difference with or without a particular weather condition;
positive values indicate increased probability of dropout during that weather
condition; and negative values indicate fewer failures during that weather con-
dition. In the bottom of the figure, we also include confidence intervals on all
bars; they are tight on almost all values, but satellite links are noticeably variable,
as are tornadoes.
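As a rough illustration of how such an inflation value and its confidence interval could be computed from counts, consider the sketch below. It is not the hazard rate-based computation of Section 6.3: the normal-approximation (Wald) interval for a difference of two proportions and all names are assumptions made for the example.

```python
import math


def dropout_inflation(drop_w, hours_w, drop_n, hours_n, z=1.96):
    """Difference in hourly dropout probability, with an approximate CI.

    drop_w / hours_w: dropouts and responsive hours during the condition.
    drop_n / hours_n: the same counts in the absence of the condition.
    z=1.96 gives an approximate 95% interval (Wald approximation,
    assumed here for illustration).
    """
    p_w = drop_w / hours_w
    p_n = drop_n / hours_n
    diff = p_w - p_n
    std_err = math.sqrt(p_w * (1 - p_w) / hours_w + p_n * (1 - p_n) / hours_n)
    return diff, (diff - z * std_err, diff + z * std_err)
```

Under this approximation, an interval that excludes zero corresponds to a statistically significant increase or decrease, and conditions with few responsive hours (such as tornadoes) naturally produce wide intervals.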
We make three key observations from Figure 6.2. First, there are several
weather conditions that exhibit higher dropout probabilities across all of the link
types we measured. Thunderstorms, heavy rain, moderate rain, and (for the link
types that experienced it) tornadoes all yield a statistically significant increase in
dropout probabilities.
Second, for each given link type, heavier rates of precipitation (both rain
and snow) yield higher probabilities of dropout. We analyze dropout rates as a
function of precipitation in Section 6.4.3. Interestingly, the probability of dropouts
is greater during thunderstorms than during heavy rain for all link types. Recall
that we classify “thunderstorm” and “heavy rain” as mutually exclusive. This in-
dicates that the causes of failures during thunderstorms extend beyond the rain-
fall, perhaps to increased wind or power outages.
[Figure 6.3: two panels sharing the x-axis "U.S. state (sorted by longitude of state capital)": HI AK OR WA CA NV ID AZ MT UT NM CO WY ND SD TX OK NE KS IA MN AR MO LA MS IL WI TN AL IN KY MI GA FL OH WV SC NC VA DC PA MD DE NJ NY CT VT NH RI MA ME. Top: inflation in hourly P(Dropout) for Thunderstorm, Rain, and Snow, with a reference mark at 1 per week. Bottom: fraction of link types (Fiber, Satellite, WISP, Cable, DSL).]
Figure 6.3: Top: Inflation in hourly dropout probability by U.S. state for thunderstorm,
rain, and snow (with 95% confidence intervals). Bottom: The fraction of link types by
U.S. state (the remaining fraction are of unknown type).
Finally, the dropout probabilities of wired link types (cable, DSL, and fiber)
are similar to one another, as are the dropout probabilities of wireless link types
(WISP and satellite), but wired and wireless link types are different from one
another. For example, light rain and light snow have almost no discernible
difference in dropout probabilities for wired links, but light rain exhibits higher
dropout probability for wireless links, and light snow sees lower probability of
dropout. Conversely, gale-force winds profoundly increase dropout
probabilities for wired links, but wireless links are less likely to drop out during them.
It is not surprising that strong winds can cause wired links to fail, for instance by
knocking down above-ground cables. Although wireless links are not affected
in the same way, it is surprising that higher failure rates would not be observed,
given that such strong winds could destroy or blow away satellite dishes.
[Figure 6.4: three U.S. maps of inflation in hourly dropout probability: (a) Thunderstorm, (b) Rain, (c) Snow.]
Figure 6.4: Inflation in hourly dropout probability by U.S. state for various weather
conditions. Large geographic regions can exhibit common behavior; northern states are
more prone to failures in thunderstorms, Midwestern states in rain, and southern states
in snow. (Note the different scales for each sub-figure.)
Summary and ramifications The results from Figure 6.2 collectively show that
different link types can experience weather in different ways. It is not surprising
that different link types would differ in the magnitude with which they experience
dropouts; but what we do find surprising is that some weather conditions
(especially snow and cold) can differ in whether they increase or decrease dropout
rates. This has ramifications on network measurement methodology: when per-
forming outage analysis, it is important to account for both link type and weather
condition.
6.4.2 Geographic variation
Next, we investigate the extent to which different geographic regions experi-
ence weather in different ways. Of course, different states experience different
amounts of weather (for instance, we did not observe a statistically significant
amount of snow in Florida). To control for this, we present the inflated proba-
bility of hourly dropouts, comparing hours with a particular weather condition
(e.g., snow) against all hours without that weather condition. This gives us an
apples-to-apples comparison across states, even if they experience weather con-
ditions in varying amounts.
In Figure 6.3, we present the dropout probability inflation across all 50
U.S. states (and DC) for three weather conditions: thunderstorms, rain (excluding
hurricanes), and snow. We make two key observations. First, there is a high varia-
tion of increased dropout probability across states. For example, during thunder-
storms, South Dakota experiences an average increased hourly dropout probabil-
ity of 0.018 (3.1 additional failures per week), while New Jersey increases by only
0.0038 (2.9 additional failures per month). Moreover, as shown by the 95% con-
fidence intervals in the figure, these differences are statistically significant. We
believe this to be an important result because it shows the role that geography
plays in network outages.
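The parenthetical rates above follow from multiplying an hourly inflation value by the number of hours in the period. The sketch below shows this conversion under two assumptions: that the inflated probability applies in every hour of the period, and that a month is 730 hours (one twelfth of a 365-day year). Because the inflation values quoted in the text are rounded, results computed from them match the quoted rates only approximately.

```python
HOURS_PER_DAY = 24
HOURS_PER_WEEK = 24 * 7
HOURS_PER_MONTH = 24 * 365 / 12  # assumed month length: 730 hours


def additional_dropouts(hourly_inflation, hours):
    """Expected additional dropouts over `hours` hours.

    Assumes the same hourly inflation in every hour of the period.
    """
    return hourly_inflation * hours


def hourly_probability(events, hours):
    """Hourly probability equivalent to `events` dropouts per `hours` hours."""
    return events / hours
```

The same conversion underlies the "1 per day", "1 per week", and "1 per month" reference marks in the figures: for instance, `hourly_probability(1, HOURS_PER_WEEK)` is roughly 0.006.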
Second, while the raw dropout inflation varies among states, the relative
impact of weather types is common across most states: the increase in dropouts
during thunderstorms tends to be greater than in rain, which in turn tends to
be greater than in snow. There are a few notable exceptions. Louisiana and
Mississippi have more inflated dropouts in snow than in thunderstorms, and
Florida tends to experience similar amounts of failures in rain as it does in thun-
derstorms. By controlling for geography and the total amount of time spent in
weather, this result shows that some weather conditions have more pronounced
impact on dropouts.
Below Figure 6.3, we present a breakdown of the classified link types in
each state, weighted by responsive hours in probing. The intent of consulting
this graph is to determine whether the outliers in the top graph are a direct func-
tion of the link types that are prevalent in a state. North Dakota has a substantial
and exceptional deployment of Fiber: 50% of the link-type-classified responsive
hours are from Fiber addresses. Although our sampling approach is based on
finding 100 addresses in each provider in a region, and thus is not meant to sam-
ple the distribution of link types used by customers, we note that this is consistent
with published reports that “60 percent of the households, including those on
farms in far-flung areas, have fiber” [131]. Although there are instances where
top and bottom graphs appear related—Vermont (VT) and Maine (ME) show
both a high vulnerability to thunderstorms and a relatively large proportion of
DSL compared to immediate neighbors CT, NH, MA—it appears that geography
is more important than link type at determining the inflation in probability of
dropout in precipitation.
Next, we look beyond individual states to see if there are regional corre-
lations of dropouts. In Figure 6.4, we show maps with the average inflation
in dropout probabilities during thunderstorms, rain (excluding hurricanes), and
snow.
During thunderstorms (Figure 6.4(a)) and rain (Figure 6.4(b)) Midwestern
states tend to experience greater inflation of dropouts than other regions. (Maine
is an outlier; its dropout inflation during thunderstorm and rain is due to an
abnormally powerful series of storms in October 2017.) Recall from Figure 6.2
that WISP and satellite links fail more often in thunderstorms and rain than other
link types. One possible explanation for higher dropout rates in the Midwest
would be that these states have more wireless links. This hypothesis is confirmed
in the bottom of Figure 6.3, which shows that Midwestern states have more satellite links than
other states.
During snow (Figure 6.4(c)), we see more pronounced dropout inflation
in southern states.5 Texas, Louisiana, and Mississippi experienced drastically
higher probability of dropouts in snow than in the absence of snow. Unlike rain
and thunderstorm, this disparity cannot be explained by link type alone, as no
link types experience drastically higher dropout rates than others (in fact, Fig-
ure 6.2 shows that wireless links tend to experience fewer dropouts during snow).
Our insight is that snow seems to affect states where snow is less common.
One possible explanation for the regional effects is therefore that regions
that are less “familiar” with a particular weather condition may be more heav-
ily affected by it. To evaluate this hypothesis, we plot in Figure 6.5 the hourly
dropout rate of each U.S. airport as a function of the number of hours each airport
has spent in snow. The results in this figure confirm our hypothesis for snow: the
less familiar a location is with snow, the more often it tends to experience dropouts.
Areas with very small amounts of snow do not experience large inflation (osten-
sibly because there is not enough time for it to cause damage). Conversely, areas
with snow beyond a threshold are more resilient to snow. A likely reason for this
is that regions that are more used to snow tend to invest more in infrastructure
to prepare for and mitigate it [132]. We also performed this analysis under
thunderstorms and rain (figures not shown), but did not observe as strong an effect.
We hypothesize that this is because all of the airports we measured experienced
enough thunderstorm and rain to grow accustomed to them.
5 We do not include data for Florida or Hawaii, as we did not observe enough responsive hours
of snow to achieve statistical significance.
[Figure 6.5: plot of Hourly P(Dropout) (y-axis, 0.00 to 0.10) against Hours of snow (x-axis, up to 100).]
Figure 6.5: Hourly dropout probability of hosts (all link types) as a function of the
number of hours the hosts’ nearest U.S. airport received snow (truncated to only those with
fewer than 100 hours in snow). The less common snow is in a region, the more impact it
tends to have.
Summary and ramifications We conclude from these results that different geo-
graphic regions can be affected by weather to varying degrees. We attribute this
geographic variation to two leading factors: (1) the predominance of some link
types over others (e.g., wireless links are more common in the Midwest), and
(2) how familiar a region is with a particular weather condition (and thus how
prepared for it the region is). Our results have several interesting ramifications
on outage analysis. First, when performing outage analysis, it is important to con-
sider a representative set of locations and link types; measuring only, say, cable
links would risk overestimating the Midwest’s resilience to dropouts. Second, it
is important to note the time and weather conditions when outage measurements
are taken; collecting measurements only during Spring months6, when thunder-
storms are more common, would risk overestimating dropouts year-round.
6.4.3 Continuous weather variables
Thus far in our analysis, we have considered various binary classifications of
weather—rain (or not), snow (or not), gale (or not), and so on. Although these
classifications are standard (they are included in the weather reports we collect),
they risk masking the precise effect that various weather conditions can have.
Here, we evaluate dropouts as a function of several continuous weather variables:
wind speed, precipitation, and temperature.
Figure 6.6 shows the inflation in the hourly dropout probability of various
link types as a function of wind speed. Note that not all link types share the same
values on the x-axis; we aggregate data in increasing values of x until we reach
either an interval of 10 mph or 20 dropout samples (Eq. (6.2)), whichever comes
last.
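The aggregation rule can be sketched as follows: a bin stays open until it both spans the minimum interval and contains the minimum number of dropout samples. The input layout, the function name, and the treatment of a trailing bin that never satisfies both criteria are assumptions made for the example.

```python
def adaptive_bins(samples, min_width=10.0, min_dropouts=20):
    """Group samples into bins along x.

    `samples` is an iterable of (x, dropouts) pairs (hypothetical layout).
    A bin is closed once it spans at least `min_width` units of x AND
    holds at least `min_dropouts` dropouts ("whichever comes last").
    Returns a list of (x_start, x_end, dropouts) tuples.
    """
    bins = []
    start = None
    count = 0
    last_x = None
    for x, dropouts in sorted(samples):
        if start is None:
            start = x
        count += dropouts
        last_x = x
        if x - start >= min_width and count >= min_dropouts:
            bins.append((start, x, count))
            start, count = None, 0
    if start is not None:
        # Assumption: keep the trailing partial bin rather than drop it.
        bins.append((start, last_x, count))
    return bins
```

Because sparse regions of the x-axis need a wider interval to accumulate enough dropouts, different link types end up with different bin boundaries, which is why they do not share x-axis values.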
For all link types, we see almost no inflation in dropout probability when
wind speed is less than 30 mph. Beyond 30 mph, there is little effect on wire-
less links (WISP and satellite), but significant increases in dropout probability for
wired links: cable, DSL, and fiber. This is reflected in Figure 6.2, which showed
that wireless links were not as affected by gale winds. Figure 6.6 expands on this
by showing that, as wind speed increases, dropout inflation increases at a
super-linear rate: between 40 mph and 55 mph winds, Cable links’ dropout inflation
increases by an order of magnitude.
6 For instance, in the run-up to the IMC deadline.
In Figure 6.7, we show dropout inflation as a function of temperature. As
with wind speed, we bin along the x-axis in intervals of 10◦ F or 20 dropout samples,
whichever comes last, and include 95% confidence intervals. There are several
surprising observations in this figure. First, satellite links are highly sensitive to
temperature; at low temperatures, satellite links are far less likely to experience
dropouts, but dropout inflation increases steadily until, at approximately 70◦ F, satellite
links become more likely to fail. Surprisingly, at approximately 80◦ F, there is an
inflection point at which satellite links again become significantly more reliable.
We hypothesize that there is a confounding factor: satellite links are less reliable
when there is no line-of-sight visibility (e.g., due to fog), and we suspect that
higher temperatures result in less fog.
All of the other link types we measured exhibit similar behavior to one an-
other. They have highly variable dropout probabilities at low temperatures; they
remain mostly steady until 60◦ F, then they increase slightly with higher temper-
atures. Unlike with our other results, WISPs more closely resemble wired links
than satellite links; we hypothesize that this, too, is because satellite links are
affected by line-of-sight while WISPs and wired links are not.
[Figure 6.6: inflation in hourly P(Dropout) (0.00 to 0.10, with reference marks at 1 per day and 1 per week) against Wind Speed (MPH) from 0 to 50, for Sat, WISP, All, DSL, Cable, and Fiber links.]
Figure 6.6: Inflation in hourly dropout probability as a function of wind speed across
multiple link types. All link types experience greater dropout probabilities, but satellite
and WISP links increase the least.
Finally, in Figure 6.8 we measure various link types’ dropout inflation as a
function of precipitation in thunderstorms, rain, and snow. All link types exhibit
increased dropout inflation with increased precipitation, regardless of the
overarching weather condition. However, surprisingly, the magnitude of increase
varies significantly across link types. Again, satellite tends to be the most sensi-
tive to change. Other link types are not as consistent across different types of pre-
cipitation; WISP links exhibit nearly the same increase in dropouts at high thun-
derstorm precipitation as satellite, but far less during non-thunderstorm rain.
Similarly, DSL links experience a (varying but statistically significant) increase
in dropouts during high snow precipitation, but not nearly as much during thun-
derstorm or rain.
There appears to be an inflection point with snow and rain: below 0.1 inches
of precipitation in rain or snow, non-satellite links experience little change in their
dropout probabilities. Above this point, they increase significantly and quickly.
[Figure 6.7: inflation in hourly P(Dropout) (-0.010 to 0.005, with reference marks at 1 per week and 1 per month) against Temperature (F) from 0 to 100, for Sat, WISP, All, DSL, Cable, and Fiber links.]
Figure 6.7: Inflation in hourly dropout probability as a function of temperature across
multiple link types. All link types exhibit non-monotonic effects, typically increasing at
higher and lower temperatures (satellite being a clear exception).
Conversely, most links experience (slight but statistically significant) increases in
dropout rates in all levels of precipitation during thunderstorms.
Summary and ramifications Weather conditions are often described with binary
categories: rain (or not), snow (or not), and so on. These continuous variable
results show that such categories can be overly coarse; the mere presence of rain
or snow does not necessarily affect most link types, unless there is more than
0.1 inches of precipitation. As with our prior results, different link types can
exhibit widely varying behaviors, lending further motivation to incorporate link
types into future outage analyses.
[Figure 6.8: three panels of inflation in hourly P(Dropout), with reference marks at 1 per day and 1 per week, for Sat, WISP, All, DSL, Cable, and Fiber links. x-axes (0 to 1 inches): Hourly thunderstorm precipitation, Hourly rain precipitation, Hourly snow precipitation.]
Figure 6.8: Inflation in hourly dropout probability as a function of precipitation dur-
ing thunderstorm (left), rain (center), and snow (right), across multiple link types. All
link types experience higher dropout probabilities with more precipitation, but to widely
varying magnitudes. (Note the different ranges of the x-axes.)
6.5 Conclusions
Using a seven-year dataset collected by probing residential IP addresses in the
U.S., I showed that a variety of weather conditions can inflate the likelihood of
Internet dropouts. I quantified this inflation and showed that it varies depending
upon the type of weather, link type, and geographic location.
Even ignoring times when hurricanes were active, all link types see more
failures during thunderstorms: fiber addresses, the most resilient to
thunderstorms, still observed an additional dropout every 11 days, while satellite
addresses, the most susceptible, observed an additional dropout every day. High
wind speeds result in a super-linear increase in dropout probability for wired
links while higher precipitation results in particularly pronounced increases in
dropout probability for wireless links.
The extent to which weather conditions can inflate the probability of dropouts
varies considerably with geography. States in the Midwest are susceptible to
dropouts during rain while states in the south experience dropouts much more
often in the snow: addresses in Mississippi, for example, experience an additional
dropout every 4 days.
The reliability analyses in this chapter were performed using dropouts of
individual IP addresses. Although dynamic addressing and user behavior also
constitute dropouts, the key observation that allowed the use of dropouts for reli-
ability comparisons is that the inflation in dropout rates during the occurrence of
severe weather conditions is due to the additional outages that occur. The peak
diurnal dropout periods produced by confounding factors such as dynamic addressing
and user behavior do not positively correlate with the occurrence of weather
conditions; therefore, the increase in dropout rate
during a weather condition is equivalent to the increase in outage rate during
that condition.
Chapter 7: Future Work
Here, I identify directions for future work in measuring residential reliability us-
ing probing-based techniques.
7.1 Tracking devices across IP addresses using IDs on a global
scale
In Chapter 4, I showed how to use IDs from complementary datasets to (i) analyze
dynamic addressing patterns and (ii) confirm outages. However, the
RIPE Atlas dataset consisted of ten thousand probes, a small fraction of residen-
tial users around the world. While the CDN dataset was obtained from an order
of magnitude more devices than RIPE Atlas, it could still only offer confirmation
for 1% of Thunderping’s detected outages.
With more sources of IDs, it may be possible to model the likelihood of
address change. Such models can help prevent false inferences about outages
and their durations. For ISPs that change addresses periodically and/or synchronously,
the model can predict when probe-loss is more likely due to address changes
than outages. For ISPs that change addresses upon most outages, the model can
identify the ISPs in which outage duration detection is particularly error-prone. For
other ISPs which change addresses mostly upon longer outages, the model can
be used to estimate the likelihood that an inferred outage ended falsely.
Orthogonally, if every outage detected by a probing-based technique could
be confirmed through a complementary dataset that provides IDs, false positives
due to dynamic addressing could be eliminated entirely. Additionally, the analysis
of outage recovery durations for all of these outages would become possible.
The key to tracking the IP address(es) assigned to a home router over time is to
associate them with some uniquely identifying feature (the ID) that remains constant across
the home router’s address changes. Chapter 4 showed two examples of IDs: in
the case of RIPE Atlas probes, the probe ID remained unchanged and in the case
of the CDN software, the installation ID remained unchanged.
A challenge, however, is to obtain sources of IDs that can scale across the
Internet. Several potential datasets provide a uniquely identifying feature that
remains constant across a device’s address changes:
7.1.1 Dynamic DNS services
Websites such as dyn.com [133] provide dynamic DNS. Dynamic DNS is a service
that allows users with a dynamic IP address to host web services, by providing
DNS services that can be easily updated to reflect changes in users’ IP addresses.
Users of Dynamic DNS Services run a daemon provided by the dynamic DNS
provider, which is responsible for determining the publicly visible IP address,
and updating the A record(s) for the user’s domain(s).
[Figure 7.1: bars of Number of Domains (log scale, 1 to 1000) by Geographic Location (TH TW MX IT DE BR AU ES PL RU RO AR CA SE FI CH GB US NL NO NZ SI), split into "No address changes" and "Address changes".]
Figure 7.1: IP address renumbering in dynamic DNS domains over a week: Black
represents dynamic DNS domains which experienced at least one address change, while
grey represents domains whose addresses remained the same. Renumbering behavior
appears to be correlated with geographic location.
IP address changes can be tracked using the domain names registered with
dynamic DNS services. Since the domain name of a user maps to her current
IP address, we can use the domain name as a fingerprint, and detect changes in
IP addresses for each domain name over time, by periodically obtaining the ’A’
record associated with each domain name.
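The bookkeeping behind this approach can be sketched as below. To keep the example self-contained, the DNS lookups themselves are omitted: the function consumes already-collected observations, and the tuple layout, the function name, and the choice to ignore failed lookups are assumptions made for illustration.

```python
def count_address_changes(observations):
    """Count address changes per domain from periodic A-record lookups.

    `observations` is an iterable of (timestamp, domain, addresses)
    tuples, where `addresses` is the set of A records returned at that
    time (hypothetical layout). An empty set stands for a failed lookup
    and is treated as no evidence either way.
    """
    last_seen = {}
    changes = {}
    for _timestamp, domain, addresses in sorted(
        observations, key=lambda obs: obs[0]
    ):
        addresses = frozenset(addresses)
        changes.setdefault(domain, 0)
        if not addresses:
            continue
        previous = last_seen.get(domain)
        if previous is not None and addresses != previous:
            changes[domain] += 1
        last_seen[domain] = addresses
    return changes
```

Comparing each observation against the last successful one, rather than against the immediately preceding lookup, avoids counting a transient lookup failure as two address changes.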
Geographic correlation of dynamic behavior As a proof of concept, I report on
a preliminary result from this approach: corroborating the geographic relation-
ships in Figure 4.1 while extending to countries not well represented by RIPE.
I obtained 3000 dynamic DNS domains from three different dynamic DNS ser-
vices: 2000 from afraid.org [134], 600 from dyn [133] and 400 from noip.com [135]
and fetched the ’A’ records from their respective nameservers once every hour.
I collected this data for a week, and then inspected how many of these domains
experienced at least one address change during this time. Figure 7.1 shows the
number of domains that had at least one address change and the domains that
had none. The y-axis is in log-scale. Address changes in Asian and Latin Amer-
ican countries appear more prevalent, with more than a third of all domains in
these countries seeing at least one address-change. On the other hand, Northern
European countries observe fewer than 6% of their domain names experiencing
an address change. Address changes are uncommon in North America: only 3%
of domain names in the US and 6% of domain names in Canada observed an
address change.
7.1.2 Open DNS resolvers
Since 2010, various studies have reported on the existence of more than 15 mil-
lion ’open’ DNS resolvers on the Internet [69, 136–138]. These DNS resolvers are
’open’ because they will resolve a DNS query sent from arbitrary IP addresses
on the Internet. Previous studies have found that more than three-quarters of
open DNS resolvers are likely to be residential [137, 139]. I identify two potential
approaches to fingerprint these open DNS resolvers and track address changes.
DNS caches Open DNS resolvers often cache previous lookups [139]. These
caches can be used to fingerprint open DNS resolvers, allowing us to track when
their IP addresses change.
Anomalous Open DNS Resolvers Of the 30 million Open DNS Resolvers on
the Internet, around 17 million are anomalous [68], i.e., instead of sending DNS
responses with a source port of 53, they respond with a non-standard source
port. Kaizer et al. [68] found that these devices are primarily residential ADSL
modems. Not only do these devices use a non-standard source port, but DNS
requests can also be made to these devices in such a way that the source ports are
assigned sequentially. We can use this sequential assignment of source ports to
fingerprint anomalous open DNS resolvers.
7.2 Classifying IP addresses
Probing-based techniques that seek to detect residential Internet outages need a
list of addresses classified as residential. More broadly, a classification of the IP
address space into residential, enterprise, campus etc., can benefit any system
that uses IP addresses as a proxy for measurement, including IP address based
host-reputation systems [75, 76]. Recent work has also shown that ISPs are in-
creasingly likely to deploy Carrier Grade NATs (CGNs), where tens of residential
Internet connections are multiplexed over a single public IPv4 address [20].
In this dissertation, I relied upon classifications of addresses as residen-
tial using reverse DNS based schemes from prior work [5]. Many ISPs include
hints about an address’s intended use in the reverse DNS entry of that address.
Recent research has further improved address classification with reverse DNS
names [140]. However, it is not mandatory for ISPs to provide meaningful reverse
DNS names. Some large ISPs, such as AT&T, do not provide reverse DNS names
for most of their addresses, resulting in their addresses’ under-representation in
Thunderping data as seen in Chapter 6.
An orthogonal approach to address classification is to use datasets with
some uniquely identifying feature (an ID) that can be used to track IP addresses
over time. By analyzing how many IDs are associated with an IP address simul-
taneously and over time, I show in preliminary work that it is possible to infer
how the ISP is using the address [141, 142]. An address that is observed with
multiple devices over time, though with relatively few devices at any instant, is
likely a dynamic residential address. An address that remains associated with a
single ID over months is either statically assigned or is a residential address with
a link type that uses DHCP. Addresses associated with many IDs simultaneously
could be CGN addresses or university/enterprise proxies.
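These three cases can be written down as a simple decision rule. The sketch below is only an illustration of that rule: the observation layout, the category labels, and in particular the threshold for "many IDs simultaneously" are hypothetical and are not values taken from the preliminary work.

```python
from collections import defaultdict


def classify_address(observations, many_simultaneous=10):
    """Classify one IP address from the IDs observed behind it.

    `observations` is an iterable of (time_bucket, device_id) pairs for
    a single address (hypothetical layout). `many_simultaneous` is an
    assumed threshold, not a measured one.
    """
    ids_per_bucket = defaultdict(set)
    for bucket, device_id in observations:
        ids_per_bucket[bucket].add(device_id)
    if not ids_per_bucket:
        return "unknown"

    all_ids = set().union(*ids_per_bucket.values())
    max_simultaneous = max(len(ids) for ids in ids_per_bucket.values())

    if max_simultaneous >= many_simultaneous:
        return "shared (CGN or proxy)"
    if len(all_ids) == 1:
        return "static or long-lived assignment"
    return "dynamic residential"
```

In practice the observation window would need to span months before the single-ID case is meaningful, since a short window cannot distinguish a stable assignment from a dynamic one that has not yet changed.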
7.3 Identifying outage causes can help orthogonal reliability anal-
yses
In this dissertation, I covered one possible reliability analysis: examining how
challenging conditions like weather affect Internet reliability. Another potential
analysis is the head-to-head comparison of one ISP’s reliability against another.
Such comparisons can aid users in their choice of ISP and can help ISPs gauge
their competition.
When comparing reliability across ISPs, the reliability metric should ide-
ally only consider outages that each ISP was responsible for. If a user voluntarily
chooses to power off their home Internet equipment, the user experiences an Internet
outage, but this outage should not count against the reliability of the user’s ISP. Similarly,
a power outage in an area should contribute towards lowering the reliability of
that area, but should not lower the reliability of the ISPs whose addresses were
affected.
For conducting comparisons of ISPs, we need to classify outages by de-
tected cause. Chapter 5 showed the potential of using simultaneous outages of
related addresses to find addresses that failed due to a common underlying cause.
When addresses from multiple ISPs fail together in the same geographic region,
the cause is potentially a power outage. When addresses from only a single ISP
have been observed to fail, the cause is potentially a network outage. Once out-
ages have been classified by cause, outages in appropriate classes can be used to
determine the outage rate per ISP.
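The classification described above can be sketched as follows. This is a rough illustration under stated assumptions, not an implementation from Chapter 5: the record fields (`region`, `window`, `isp`, `address`), the three cause labels, and the per-ISP count of monitored addresses used as the rate denominator are all hypothetical. The "isolated" class is an added assumption covering a lone address failure, such as a user powering off home equipment.

```python
from collections import Counter, defaultdict


def classify_outages(outages):
    """Label each outage record with a potential cause.

    outages: list of dicts with hypothetical keys 'region', 'window'
    (a shared time-window identifier), 'isp', and 'address'.
    """
    isps = defaultdict(set)
    addresses = defaultdict(set)
    for o in outages:
        key = (o["region"], o["window"])
        isps[key].add(o["isp"])
        addresses[key].add(o["address"])

    labeled = []
    for o in outages:
        key = (o["region"], o["window"])
        if len(addresses[key]) == 1:
            cause = "isolated"   # no simultaneous failure to compare against
        elif len(isps[key]) > 1:
            cause = "power"      # several ISPs failed together in one region
        else:
            cause = "network"    # only one ISP's addresses failed together
        labeled.append({**o, "cause": cause})
    return labeled


def outage_rate_per_isp(labeled, monitored_addresses):
    """Network-caused outages per monitored address, for each ISP."""
    counts = Counter(o["isp"] for o in labeled if o["cause"] == "network")
    return {isp: counts[isp] / n
            for isp, n in monitored_addresses.items() if n > 0}
```

Only outages labeled "network" count against an ISP in this sketch. A real analysis would also need to decide how long a time window counts as simultaneous and how to define a geographic region.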
Chapter 8: Conclusion
In this dissertation, I described how to measure residential Internet reliability
remotely using probing-based techniques. Although these techniques can measure
broadly, their outage inferences can be inaccurate. My contributions
improve their accuracy and allow their detected outages to be
used in metrics for comparing residential Internet reliability of various ISPs, me-
dia types, and geographic regions in different weather conditions. I showed how
to detect Internet outages accurately using probing-based techniques by analyz-
ing and mitigating potential scenarios that can cause these techniques to make
false inferences about detected outages. In Chapter 3, I investigated how fre-
quently probe responses can be delayed beyond commonly used timeouts by
analyzing ping response latencies from IP addresses across the world in a vari-
ety of networks. In Chapter 4, I analyzed dynamic addressing patterns in ISPs
to find networks where addresses are stable. I also showed how to detect out-
ages in networks where dynamic reassignment is common, using complemen-
tary datasets that can provide information on whether a device’s IP address has
changed. Chapter 5 demonstrated the need for detecting individual address out-
ages when measuring residential reliability. In Chapter 6, I compared the relia-
bility of ISPs, media-types, and geographic regions across several weather condi-
tions.
Bibliography
[1] Ideal Life, Inc. Ideal Life - Remote Patient Monitoring for Home Health
Care. http://www.ideallife.com/how-it-works/.
[2] Susanna Spinsante and Ennio Gambi. Remote health monitoring for elderly
through interactive television. BioMedical Engineering Online, 11(1):54, 2012.
[3] Voipfone. VoIP 999 Emergency Services. http://www.voipfone.co.uk/
999_Emergency_Services.php.
[4] Federal Communications Commission. VoIP and 911 Service. https://
www.fcc.gov/consumers/guides/voip-and-911-service.
[5] Aaron Schulman and Neil Spring. Pingin’ in the rain. In IMC, 2011.
[6] Measuring Broadband America. https://www.fcc.gov/general/
measuring-broadband-america.
[7] Broadband Measurement Project, Canada. https://crtc.gc.ca/eng/
internet/proj.htm.
[8] Broadband in the U.K.: data and research. https://www.ofcom.org.uk/
research-and-data/telecoms-research/broadband-research.
[9] Measuring Broadband Australia. https://www.accc.gov.au/consumers/
internet-phone/monitoring-broadband-performance.
[10] Yu Jin, Nick Duffield, Alexandre Gerber, Patrick Haffner, Subhabrata Sen,
and Zhi-Li Zhang. NEVERMIND, the problem is already fixed: Proactively
detecting and troubleshooting customer DSL problems. In CONEXT, 2010.
[11] Lin Quan, John Heidemann, and Yuri Pradkin. Trinocular: Understand-
ing Internet reliability through adaptive probing. In Proceedings of the ACM
SIGCOMM Conference on Applications, Technologies, Architectures, and Proto-
cols for Computer Communication (SIGCOMM), 2013.
[12] Philipp Richter, Ramakrishna Padmanabhan, David Plonka, Arthur Berger,
and David Clark. Advancing the Art of Internet Edge Outage Detection. In
Proceedings of the ACM SIGCOMM Internet Measurement Conference (IMC),
2018.
[13] Ethan Katz-Bassett, Harsha V. Madhyastha, John P. John, Arvind
Krishnamurthy, David Wetherall, and Thomas Anderson. Studying black holes in
the internet with Hubble. In Proceedings of the ACM/USENIX Symposium on
Networked Systems Design and Implementation (NSDI), 2008.
[14] Alberto Dainotti, Claudio Squarcella, Emile Aben, Kimberly C. Claffy,
Marco Chiesa, Michele Russo, and Antonio Pescapè. Analysis of country-
wide Internet outages caused by censorship. In Proceedings of the ACM SIG-
COMM Internet Measurement Conference (IMC), 2011.
[15] RIPE NCC. Atlas. http://atlas.ripe.net.
[16] SamKnows. http://www.samknows.com.
[17] Srikanth Sundaresan, Sam Burnett, Nick Feamster, and Walter de Do-
nato. BISmark: A testbed for deploying measurements and applications
in broadband access networks. In Proceedings of the USENIX Annual Techni-
cal Conference, June 2014.
[18] Yuval Shavitt and Eran Shir. DIMES: Let the Internet Measure Itself. SIG-
COMM Comput. Commun. Rev., 35, October 2005.
[19] Ramakrishna Padmanabhan, Patrick Owen, Aaron Schulman, and Neil
Spring. Timeouts: Beware surprisingly high delay. In Proceedings of the
ACM SIGCOMM Internet Measurement Conference (IMC), 2015.
[20] Philipp Richter, Florian Wohlfart, Narseo Vallina-Rodriguez, Mark Allman,
Randy Bush, Anja Feldmann, Christian Kreibich, Nicholas Weaver, and
Vern Paxson. A multi-perspective analysis of carrier-grade NAT deploy-
ment. In Proceedings of the ACM SIGCOMM Internet Measurement Conference
(IMC), 2016.
[21] Ramakrishna Padmanabhan, Amogh Dhamdhere, Emile Aben, kc claffy,
and Neil Spring. Reasons dynamic addresses change. In Proceedings of the
ACM SIGCOMM Internet Measurement Conference (IMC), 2016.
[22] David D. Clark. The design philosophy of the DARPA internet protocols.
In Proceedings of the ACM SIGCOMM Conference on Applications, Technologies,
Architectures, and Protocols for Computer Communication (SIGCOMM), 1988.
[23] Gianluca Iannaccone, Chen-nee Chuah, Richard Mortier, Supratik Bhat-
tacharyya, and Christophe Diot. Analysis of Link Failures in an IP Back-
bone. In Proceedings of the ACM SIGCOMM Internet Measurement Workshop
(IMW), 2002.
[24] Craig Labovitz, Abha Ahuja, and Farnam Jahanian. Experimental Study of
Internet Stability and Wide-Area Backbone Failures. In FTCS, 1999.
[25] Ratul Mahajan, David Wetherall, and Thomas Anderson. Understanding
BGP misconfiguration. In Proceedings of the ACM SIGCOMM Conference on
Applications, Technologies, Architectures, and Protocols for Computer
Communication (SIGCOMM), 2002.
[26] Alberto Dainotti, Claudio Squarcella, Emile Aben, Kimberly C. Claffy,
Marco Chiesa, Michele Russo, and Antonio Pescapè. Analysis of
country-wide Internet outages caused by censorship. In Proceedings of the
ACM SIGCOMM Internet Measurement Conference (IMC), 2011.
[27] Vern Paxson. End-to-end routing behavior in the Internet. In IEEE/ACM
Transactions on Networking, 1997.
[28] Amogh Dhamdhere, Renata Teixeira, Constantine Dovrolis, and
Christophe Diot. NetDiagnoser: Troubleshooting Network Unreacha-
bilities Using End-to-end Probes and Routing Data. In Proceedings of the
2007 ACM CoNEXT Conference, 2007.
[29] Ethan Katz-Bassett, Colin Scott, David R. Choffnes, Ítalo Cunha, Vytautas
Valancius, Nick Feamster, Harsha V. Madhyastha, Thomas Anderson, and
Arvind Krishnamurthy. LIFEGUARD: Practical repair of persistent route
failures. In Proceedings of the ACM SIGCOMM Conference on Applications,
Technologies, Architectures, and Protocols for Computer Communication (SIG-
COMM), 2012.
[30] Umar Javed, Italo Cunha, David Choffnes, Ethan Katz-Bassett, Thomas An-
derson, and Arvind Krishnamurthy. PoiRoot: Investigating the Root Cause
of Interdomain Path Changes. In Proceedings of the ACM SIGCOMM Con-
ference on Applications, Technologies, Architectures, and Protocols for Computer
Communication (SIGCOMM), 2013.
[31] Ritwik Banerjee, Abbas Razaghpanah, Luis Chiang, Akassh Mishra, Vyas
Sekar, Yejin Choi, and Phillipa Gill. Internet Outages, the Eyewitness Ac-
counts: Analysis of the Outages Mailing List. In Proceedings of Passive &
Active Measurement (PAM), 2015.
[32] Daniel Turner, Kirill Levchenko, Alex C. Snoeren, and Stefan Savage. Cal-
ifornia Fault Lines: Understanding the Causes and Impact of Network
Failures. In Proceedings of the ACM SIGCOMM Conference on Applications,
Technologies, Architectures, and Protocols for Computer Communication (SIG-
COMM), 2010.
[33] Craig Labovitz, Abha Ahuja, Abhijit Bose, and Farnam Jahanian. Delayed
Internet Routing Convergence. In Proceedings of the ACM SIGCOMM Con-
ference on Applications, Technologies, Architectures, and Protocols for Computer
Communication (SIGCOMM), 2000.
[34] John P. John, Ethan Katz-Bassett, Arvind Krishnamurthy, Thomas Ander-
son, and Arun Venkataramani. Consensus Routing: The Internet As a
Distributed System. In Proceedings of the ACM/USENIX Symposium on Net-
worked Systems Design and Implementation (NSDI), 2008.
[35] Feng Wang, Zhuoqing Morley Mao, Jia Wang, Lixin Gao, and Randy Bush.
A Measurement Study on the Impact of Routing Events on End-to-end In-
ternet Path Performance. In Proceedings of the ACM SIGCOMM Conference
on Applications, Technologies, Architectures, and Protocols for Computer Com-
munication (SIGCOMM), 2006.
[36] Nate Kushman, Srikanth Kandula, and Dina Katabi. Can You Hear Me
Now?!: It Must Be BGP. ACM SIGCOMM Computer Communication Review,
37(2):75–84, March 2007.
[37] Internet Outage Detection and Analysis (IODA). https://www.caida.org/
projects/ioda/.
[38] Routeviews Project – University of Oregon. http://www.routeviews.org/.
[39] RIPE Routing Information Service. http://www.ripe.net/ris/.
[40] Sarthak Grover, Mi Seon Park, Srikanth Sundaresan, Sam Burnett, Hyojoon
Kim, Bharath Ravi, and Nick Feamster. Peeking behind the NAT: an empir-
ical study of home networks. In Proceedings of the ACM SIGCOMM Internet
Measurement Conference (IMC), 2013.
[41] Anant Shah, Romain Fontugne, Emile Aben, Cristel Pelsser, and Randy
Bush. Disco: Fast, good, and cheap outage detection. In TMA, 2017.
[42] Z. Bischof, F. Bustamante, and N. Feamster. The Growing Importance of
Being Always On – A First Look at the Reliability of Broadband Internet
Access. In Research Conference on Communications, Information and Internet
Policy (TPRC) 46, 2018.
[43] M. A. Sánchez, J. S. Otto, Z. S. Bischof, D. R. Choffnes, F. E. Bustamante,
B. Krishnamurthy, and W. Willinger. Dasu: Pushing Experiments to the
Internet’s Edge. In Proceedings of the ACM/USENIX Symposium on Networked
Systems Design and Implementation (NSDI), 2013.
[44] Oded Argon, Anat Bremler-Barr, Osnat Mokryn, Dvir Schirman, Yuval
Shavitt, and Udi Weinsberg. On the dynamics of IP address allocation and
availability of end-hosts. arXiv preprint arXiv:1011.2324, 2010.
[45] Zachary S. Bischof, Fabian E. Bustamante, and Rade Stanojevic. Need,
Want, Can Afford: Broadband Markets and the Behavior of Users. In Pro-
ceedings of the ACM SIGCOMM Internet Measurement Conference (IMC), 2014.
[46] Zakir Durumeric, Eric Wustrow, and J Alex Halderman. ZMap: Fast
Internet-wide Scanning and Its Security Applications. In USENIX Security,
pages 605–620, 2013.
[47] John Heidemann, Yuri Pradkin, Ramesh Govindan, Christos Papadopou-
los, Genevieve Bartlett, and Joseph Bannister. Census and survey of the
visible Internet. In Proceedings of the ACM SIGCOMM Internet Measurement
Conference (IMC), 2008.
[48] David Adrian, Zakir Durumeric, Gulshan Singh, and J. Alex Halderman.
Zippier ZMap: Internet-Wide Scanning at 10 Gbps. In WOOT, 2014.
[49] R. Braden, Editor. Requirements for internet hosts – communication layers.
Internet Engineering Task Force Request for Comments RFC-1122, October
1989.
[50] Harsha V. Madhyastha, Tomas Isdal, Michael Piatek, Colin Dixon, Thomas
Anderson, Arvind Krishnamurthy, and Arun Venkataramani. iPlane: An
information plane for distributed services. In Proceedings of the Symposium
on Operating Systems Design and Implementation (OSDI), 2006.
[51] Nick Feamster, David G. Andersen, Hari Balakrishnan, and M. Frans
Kaashoek. Measuring the effects of Internet path faults on reactive routing.
In Proceedings of the ACM SIGMETRICS International Conference on Measure-
ment and Modeling of Computer Systems, 2003.
[52] Ming Zhang, Chi Zhang, Vivek Pai, Larry Peterson, and Randy Wang. Plan-
etSeer: Internet path failure monitoring and characterization in wide-area
services. In Proceedings of the Symposium on Operating Systems Design and
Implementation (OSDI), 2004.
[53] Philip Homburg. [atlas] timeout on ping measurements. http://www.
ripe.net/ripe/mail/archives/ripe-atlas/2013-July/000891.html, July 2013.
Posting to the ripe-atlas mailing list.
[54] ISI ANT Lab. Internet address survey binary format description.
http://www.isi.edu/ant/traces/topology/address_surveys/
binformat_description.html.
[55] Matthew Luckie. Scamper: A Scalable and Extensible Packet Prober for
Active Measurement of the Internet. In Proceedings of the ACM SIGCOMM
Internet Measurement Conference (IMC), pages 239–245, 2010.
[56] Neil Spring, David Wetherall, and Tom Anderson. Scriptroute: A public In-
ternet measurement facility. In USENIX Symposium on Internet Technologies
and Systems (USITS), 2003.
[57] Jeffrey Mogul. Broadcasting Internet datagrams. Internet Engineering Task
Force Request for Comments RFC-919, October 1984.
[58] Landernotes. https://wiki.isi.edu/predict/index.php/
LANDER:internet_address_survey_reprobing_it54c-20130524.
[59] Fred Baker. Requirements for IP version 4 routers. Internet Engineering
Task Force Request for Comments RFC-1812, June 1995.
[60] Matthew J. Luckie, Anthony J. McGregor, and Hans-Werner Braun. Towards
improving packet probing techniques. In Proceedings of the ACM SIGCOMM
Internet Measurement Workshop (IMW), 2001.
[61] Ina Minei and Reuven Cohen. High-speed internet access through unidi-
rectional geostationary satellite channels. In IEEE Journal on Selected Areas
in Communications, 1999.
[62] Chadi Barakat, Nesrine Chaher, Walid Dabbous, and Eitan Altman. Im-
proving TCP/IP over geostationary satellite links. In Global Telecommunica-
tions Conference, 1999. GLOBECOM’99, volume 1, pages 781–785, 1999.
[63] Rajiv Chakravorty, Andrew Clark, and Ian Pratt. GPRSWeb: Optimizing
the web for GPRS links. In Proceedings of the International Conference on Mo-
bile Systems, Applications and Services (MOBISYS), May 2003.
[64] Stefan Saroiu, P. Krishna Gummadi, and Steven D Gribble. Measurement
study of peer-to-peer file sharing systems. In MMCN, 2002.
[65] Jacky C. Chu, Kevin S. Labonte, and Brian N. Levine. Availability and
locality measurements of peer-to-peer file systems. In ITCom: Scalability
and Traffic Control in IP Networks, 2002.
[66] Subhabrata Sen and Jia Wang. Analyzing peer-to-peer traffic across large
networks. IEEE/ACM Transactions on Networking (ToN), 12(2):219–232, 2004.
[67] Vyas Sekar, Yinglian Xie, Michael K Reiter, and Hui Zhang. A multi-
resolution approach for worm detection and containment. In DSN, 2006.
[68] Andrew J. Kaizer and Minaxi Gupta. Open resolvers: Understanding the
origins of anomalous open DNS resolvers. In Proceedings of Passive & Active
Measurement (PAM), 2015.
[69] Marc Kührer, Thomas Hupperich, Jonas Bushart, Christian Rossow, and
Thorsten Holz. Going wild: Large-scale classification of open DNS resolvers.
In Proceedings of the ACM SIGCOMM Internet Measurement Conference (IMC),
2015.
[70] Yinglian Xie, Vyas Sekar, David Maltz, Michael K Reiter, Hui Zhang, et al.
Worm origin identification using random moonwalks. In Proc. of the IEEE
Symposium on Security and Privacy, 2005.
[71] Jaeyeon Jung and Emil Sit. An empirical study of spam traffic and the use of
DNS black lists. In Proceedings of the ACM SIGCOMM Internet Measurement
Conference (IMC), 2004.
[72] Moheeb Abu Rajab, Jay Zarfoss, Fabian Monrose, and Andreas Terzis. My
botnet is bigger than yours (maybe, better than yours): why size estimates
remain challenging.
In Proceedings of the 1st USENIX Workshop on Hot Topics in Understanding
Botnets, Cambridge, USA, 2007.
[73] Brett Stone-Gross, Marco Cova, Lorenzo Cavallaro, Bob Gilbert, Martin
Szydlowski, Richard Kemmerer, Christopher Kruegel, and Giovanni Vigna.
Your botnet is my botnet: analysis of a botnet takeover. In Proceedings of the
16th ACM conference on Computer and communications security, 2009.
[74] Dennis Andriesse, Christian Rossow, and Herbert Bos. Reliable recon in
adversarial peer-to-peer botnets. In Proceedings of the ACM SIGCOMM In-
ternet Measurement Conference (IMC), 2015.
[75] Fail2ban. http://www.fail2ban.org/.
[76] The spamhaus project. http://www.spamhaus.org/.
[77] The cbl. http://www.abuseat.org/.
[78] Sorbs (spam and open-relay blocking system). www.sorbs.net/.
[79] Ralph Droms. RFC 2131: Dynamic host configuration protocol. Internet
Engineering Task Force Request for Comments RFC-2131, March 1997.
[80] William Simpson. The Point-to-Point Protocol. Internet Engineering Task
Force Request for Comments RFC-1661, July 1994.
[81] Glenn McGregor. The PPP Internet Protocol Control Protocol (IPCP). Inter-
net Engineering Task Force Request for Comments RFC-1332, May 1992.
[82] Vladimir Brik, Jesse Stroik, and Suman Banerjee. Debugging DHCP perfor-
mance. In Proceedings of the ACM SIGCOMM Internet Measurement Confer-
ence (IMC), 2004.
[83] Manas Khadilkar, Nick Feamster, Matt Sanders, and Russ Clark. Usage-
based DHCP lease time optimization. In Proceedings of the ACM SIGCOMM
Internet Measurement Conference (IMC), 2007.
[84] Ioannis Papapanagiotou, Erich M. Nahum, and Vasileios Pappas. Config-
uring DHCP leases in the smartphone era. In Proceedings of the ACM SIG-
COMM Internet Measurement Conference (IMC), 2012.
[85] Yinglian Xie, Fang Yu, Kannan Achan, Eliot Gillum, Moises Goldszmidt,
and Ted Wobber. How dynamic are IP addresses? In Proceedings of the
ACM SIGCOMM Conference on Applications, Technologies, Architectures, and
Protocols for Computer Communication (SIGCOMM), 2007.
[86] Giovane CM Moura, Carlos Ganán, Qasim Lone, Payam Poursaied, Hadi
Asghari, and Michel van Eeten. How dynamic is the ISPs address space?
Towards Internet-wide DHCP churn estimation. IFIP, 2015.
[87] Gregor Maier, Anja Feldmann, Vern Paxson, and Mark Allman. On domi-
nant characteristics of residential broadband internet traffic. In Proceedings
of the ACM SIGCOMM Internet Measurement Conference (IMC), 2009.
[88] Martin Casado and Michael J. Freedman. Peering through the shroud: The
effect of edge opacity on IP-based client identification. In Proceedings of the
ACM/USENIX Symposium on Networked Systems Design and Implementation
(NSDI), 2007.
[89] Ramakrishna Padmanabhan, Zhihao Li, Dave Levin, and Neil Spring.
UAv6: Alias resolution in IPv6 using unused addresses. In Proceedings of
Passive & Active Measurement (PAM), 2015.
[90] Thomas Narten, Richard Draves, and Suresh Krishnan. Privacy extensions
for stateless address autoconfiguration in IPv6. Internet Engineering Task
Force Request for Comments RFC-4941, September 2007.
[91] David Plonka and Arthur Berger. Temporal and spatial classification of
active IPv6 addresses. In Proceedings of the ACM SIGCOMM Internet Mea-
surement Conference (IMC), 2015.
[92] About RIPE Atlas: FAQ: How does the probe connect to the Internet? https:
//atlas.ripe.net/about/faq/.
[93] Philip Homburg. NTP measurements with RIPE Atlas. https://labs.
ripe.net/Members/philip_homburg/ntp-measurements-with-ripe-atlas,
February 2015.
[94] RIPE NCC. RIPE Atlas probe archive. https://atlas.ripe.net/api/v1/
probe-archive/.
[95] RIPE NCC. RIPE Atlas connection logs URL format. https://atlas.ripe.net/
probes/〈prb_id〉/connection-history/〈yyyy〉/〈mm〉/.
[96] Routeviews prefix to AS mappings dataset (pfx2as) for IPv4 and IPv6. https:
//www.caida.org/data/routing/routeviews-prefix2as.xml.
[97] RIPE NCC. Built-in measurements. https://atlas.ripe.net/docs/built-in/.
[98] Zwangstrennung (Forced IP address change). https://de.wikipedia.org/
wiki/Zwangstrennung.
[99] RIPE NCC. Become a RIPE Atlas probe host. https://atlas.ripe.net/
get-involved/become-a-host/.
[100] RIPE NCC Staff. RIPE Atlas: A global internet measurement network. In-
ternet Protocol Journal, 18(3), September 2015.
[101] RIPE NCC. Technical updates. https://atlas.ripe.net/resources/
announcements/.
[102] Gerald van Belle, Patrick J. Heagerty, Lloyd D. Fischer, and Thomas S. Lum-
ley. Biostatistics: A Methodology for the Health Sciences (Second Edition). John
Wiley & Sons, 2004.
[103] Hang Guo and John S. Heidemann. Detecting ICMP rate limiting in the
internet. In Proceedings of Passive & Active Measurement (PAM), 2018.
[104] Comcast outage on Sep 13 2017 in the Outages Mailing List. https://puck.
nether.net/pipermail/outages/2017-September/010754.html.
[105] National Hurricane Center Tropical Cyclone Report: Hurricane Irma. https:
//www.nhc.noaa.gov/data/tcr/AL112017_Irma.pdf.
[106] Northeast Storm Undergoes Bombogenesis, Bringing 70 MPH Gusts, Al-
most 350 Reports of Wind Damage, Flooding — The Weather Chan-
nel. https://weather.com/forecast/regional/news/2017-10-30-northeast-
storm-damaging-winds-flooding.
[107] October 29-30, 2017 damaging winds, heavy rainfall & flooding. https://
www.weather.gov/aly/October29-302017.
[108] More than 1 million power outages in the Northeast after blockbuster fall
storm - The Washington Post. https://www.washingtonpost.com/news/
capital-weather-gang/wp/2017/10/30/over-one-million-power-outages-
in-the-northeast-after-blockbuster-fall-storm/.
[109] Line Of Storms Moves Through Oklahoma. http://www.newson6.com/
story/36651816/tornado-watch-in-effect-for-ne-oklahoma.
[110] U.S. Government. CFR part 4 section 4.9: Outage reporting requirements
threshold criteria.
[111] Edmond W. W. Chan, Xiapu Luo, Waiting W. T. Fok, Weichao Li, and
Rocky K. C. Chang. Non-cooperative diagnosis of submarine cable faults.
In Proceedings of Passive & Active Measurement (PAM), 2011.
[112] Tomasz Bilski. Disaster’s impact on Internet performance – case study. In
CCIS, 2009.
[113] John Heidemann, Lin Quan, and Yuri Pradkin. A preliminary analysis of
network outages during hurricane Sandy. Technical report, USC/ISI, 2012.
[114] Gianluca Iannaccone, Chen-nee Chuah, Richard Mortier, Supratik
Bhattacharyya, and Christophe Diot. Analysis of link failures in an IP backbone.
In Proceedings of the ACM SIGCOMM Internet Measurement Workshop (IMW),
2002.
[115] Frank B. Jewett. The modern telephone cable. In Proceedings of 26th annual
convention of the American Institute of Electrical Engineers, 1909.
[116] W. T. Smith and W. L. Roberts. Design and characteristics of coaxial cables
for Community Antenna Television. IEEE Transactions on Communication
Technology, 1966.
[117] D.C. Hogg and Ta-Shing Chu. The role of rain in satellite communications.
PROC-IEEE, 1975.
[118] Helmut Bölcskei, Arogyaswami J. Paulraj, K. V. S. Hari, Rohit U. Nabar, and
Willie W. Lu. Fixed broadband wireless access: State of the art, challenges,
and future directions. IEEE Communications Magazine, 2001.
[119] John Heidemann, Yuri Pradkin, Ramesh Govindan, Christos Papadopou-
los, Genevieve Bartlett, and Joseph Bannister. Census and survey of the
visible Internet. In Proceedings of the ACM SIGCOMM Internet Measurement
Conference (IMC), 2008.
[120] RIPE NCC. RIPE Atlas. http://atlas.ripe.net.
[121] Srikanth Sundaresan, Walter de Donato, Nick Feamster, Renata Teixeira,
Sam Crawford, and Antonio Pescapè. Broadband Internet performance: A
view from the gateway. In Proceedings of the ACM SIGCOMM Conference on
Applications, Technologies, Architectures, and Protocols for Computer Communi-
cation (SIGCOMM), 2011.
[122] Aaron Schulman and Neil Spring. Pingin’ in the rain. In Proceedings of the
ACM SIGCOMM Internet Measurement Conference (IMC), 2011.
[123] NOAA. Automated surface observing system (ASOS). https:
//www.ncdc.noaa.gov/data-access/land-based-station-data/land-
based-datasets/automated-surface-observing-system-asos. ftp:
//ftp.ncdc.noaa.gov/pub/data/noaa/.
[124] NOAA. State of the climate report: Hurricanes and tropical storms. https:
//www.ncdc.noaa.gov/sotc/.
[125] Cisco. GS7000 DOCSIS status monitor transponder installation and op-
eration guide, 2011. https://www.cisco.com/c/dam/en/us/td/docs/
video/access_edge/Nodes/GS7000/4037424_A.pdf.
[126] Alpha Technologies. Installation and technical manual DOCSIS HMS em-
bedded transponder, 2004.
[127] Xiaoliang Zhao, Daniel Massey, Mohit Lad, and Lixia Zhang. On/off
model: A new tool to understand BGP update burst. Technical report,
University of California, Los Angeles, 2004.
[128] R. Beverly, M. Luckie, L. Mosley, and k. claffy. Measuring and Character-
izing IPv6 Router Availability. In Passive and Active Network Measurement
Workshop (PAM), pages 123–135, Mar 2015.
[129] G. Comarela, G. Gürsun, and M. Crovella. Studying interdomain routing
over long timescales. In Proceedings of the ACM SIGCOMM Internet Mea-
surement Conference (IMC), 2013.
[130] J. M. Wallace. Diurnal variations in precipitation and thunderstorm fre-
quency over the conterminous United States. Monthly Weather Review, 1975.
[131] Carolyn Orr. A look at how and why North Dakota became a leader
in deployment of fiber optic Internet. http://www.csgmidwest.org/
policyresearch/0616-fiber-optic-North-Dakota.aspx, Jun 2016.
[132] Chris Hill. 23 state DOTs spent more than $1 billion on snow, ice main-
tenance this winter. https://www.equipmentworld.com/23-state-dots-
spent-more-than-1-billion-on-snow-ice-maintenance-this-winter/, May
2015.
[133] Remote Access (DynDNS). http://dyn.com/remote-access/.
[134] the Dynamic DNS page - FreeDNS. https://freedns.afraid.org/dynamic/.
[135] No-IP: Free Dynamic DNS. www.noip.com.
[136] Open resolver project. http://openresolverproject.org/.
[137] Kyle Schomp, Tom Callahan, Michael Rabinovich, and Mark Allman. On
measuring the client-side DNS infrastructure. In Proceedings of the ACM
SIGCOMM Internet Measurement Conference (IMC), 2013.
[138] Marc Kührer, Thomas Hupperich, Christian Rossow, and Thorsten Holz.
Exit from hell? Reducing the impact of amplification DDoS attacks. In
USENIX Security Symposium, 2014.
[139] Kyle Schomp, Tom Callahan, Michael Rabinovich, and Mark Allman. As-
sessing DNS vulnerability to record injection. In Proceedings of Passive &
Active Measurement (PAM), 2014.
[140] Youndo Lee and Neil Spring. Identifying and analyzing broadband internet
reverse DNS names. In CONEXT, 2017.
[141] Ramakrishna Padmanabhan. We can find shared IP addresses. https://
blog.apnic.net/2018/03/05/can-find-shared-ip-addresses/.
[142] Ramakrishna Padmanabhan. Analyzing static, dynamic, and gateway IPv4
addresses. In AIMS 2017: Workshop on Active Internet Measurements, 2017.