ABSTRACT

Title of Dissertation: EVALUATION OF SELECTED SIDE-CHANNEL ANALYSIS METHODS FOR RANSOMWARE CLASSIFICATION AND DETECTION

Jennie E. Hill
Doctor of Philosophy, 2023

Dissertation Directed by: Professor Martin C. Peckerar
Department of Electrical and Computer Engineering

The physical implementation of computer hardware necessarily produces physical behavior when a computer operates. This behavior has measurable characteristics, many of which become channels of information leakage observable by an unintended receiver, posing a serious threat to computer security. These “side-channels” of computer operation, such as current usage and power consumption, generation of heat and electromagnetic radiation, and events at the micro-architectural level, can be exploited to compromise the confidentiality of a system.

This work considers side-channel analysis techniques for the temperature, power, and micro-architectural side-channels for the purpose of classifying state-of-the-art ransomware on real-world, non-virtualized Windows systems. Over three thousand ransomware and benign trials were collected to generate training and testing data sets. This required development of a process to synchronize collection of on-system (e.g. performance counters) and off-system (e.g. power) measurements, safely transfer trial data from the encrypted system, and restore the system to a “clean” state without the use of virtualization techniques, which negatively impact the validity of side-channel measurements. Side-channels were evaluated on their effectiveness in accurately differentiating, within a given time duration, between ransomware and benign operations such as background operating system activity, 7zip encryption, and SPEC benchmarks, with Matthews’ Correlation Coefficient (MCC) used to measure the overall performance of five machine learning classification algorithms.
The temperature side-channel, accessed through thermal imaging, was found to be unsuitable for the ransomware detection/classification application due to its sensitivity to thermal noise, its significant pre-processing requirements, and its slow response times caused by the loss of signal components above the low-kHz range. These limitations prevented it from identifying ransomware before encryption operations typically begin (within 2 seconds of execution, on average).

The power side-channel, accessed by monitoring the current drawn by a solid state drive, produced a best-case classification accuracy of 96% (0.92 MCC) with 15 seconds of current data, and ≥ 90% accuracy (MCC ≥ 0.8) for all five classifiers tested with at least 5 seconds of data. Tests demonstrated that at least 4 seconds of data were required to attain a best-case classification accuracy greater than 90%; at 2 seconds, the best-performing classifier attained an MCC of just 0.66 with 83.3% accuracy.

The micro-architectural side-channel was accessed through hardware performance counters, which provided the highest MCC and accuracy results in the shortest period of time. Hardware performance counters are registers built into a CPU’s Performance Monitoring Unit that measure events related to processor and memory system operations (e.g. CPU clock cycles, total instructions retired, memory accesses, cache hits/misses, and branches taken). Over 230 hardware events were collected, tested, and ranked by their contribution to overall classifier performance. Each classification algorithm was found to have a distinct performance counter feature ranking, and the selected features could be further optimized for the desired detection window duration.
Examination of the results showed that, despite the quantity of features collected, classifier performance improved only marginally beyond 6 features for windows of ≤ 2 seconds: 3 of the 5 classifiers tested achieved MCC ≥ 0.9 with 1 second of data and just 4-6 performance counter features, with a best-case MCC of 0.98 at 1 second of data and 4 performance counter features. MCC results for the shortest duration event window (0.1 s) were found to be within 0-7% of the best-case MCC result window (1-2 s) for each classifier, indicating that ransomware can be classified with greater than 90% accuracy for four of five classifiers tested using only a tenth of a second of 4-6 performance event measurements, which makes the implementation of this approach in a real-time ransomware detector feasible. With the financial impact of ransomware estimated to exceed $30 billion globally this year, new detection techniques for non-virtualized computer systems have significant real-world implications.

EVALUATION OF SELECTED SIDE-CHANNEL ANALYSIS METHODS FOR RANSOMWARE CLASSIFICATION AND DETECTION

by

Jennie Elizabeth Hill

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy 2023

Advisory Committee:
Professor Martin C. Peckerar, Chair/Advisor
Professor Bruce L. Jacob, Co-Advisor
Professor Ankur Srivastava
Professor T. Owens Walker, III
Professor Donald Yeung
Professor Amitabh Varshney, Dean’s Representative

Acknowledgements

This was absolutely not a solo effort. Many people deserve my sincere gratitude; I could not possibly capture them all... So here are some.

My advisors: Dr. Jacob and Dr. Peckerar, for being available to answer my texts at all hours and for pushing the pace when I was on the verge of procrastinating. Your guidance and mentorship have been both insightful and essential in helping me to get to this point.
My “CECSR” (Computer Engineering and Cyber Security Research) team - James, Justin, Dane, Rob, Owens, Ryan, Hau - all of whom were very generous, extremely patient mentors: For teaching me, letting me borrow equipment that I may or may not have returned, and reminding me that research is a journey rather than a destination. Your bald-faced humor, insistence on the insertion of randomly-chosen words (e.g. “bald”) into papers, and ability to not take yourselves too seriously made the long days in the lab tolerable. Special thanks to ENS Brendan Farmer for far outperforming all expectations as a lab assistant, and to my small-but-mighty UMD crew of Devesh and Ananth for the selfless gifts of their time and guidance.

My Naval Academy colleagues, particularly: Jeff, for his technical expertise, critical lab support, and many years of friendship. Mike and Chris, for numerous hours of EI, well outside the scope of their normal duties. Hatem, Ethan, and Ann, for their unwavering support and encouragement.

The ECE Department and PMP program leadership, for their support in making this opportunity a possibility in the first place, commitment to finding a program that was a realistic fit, and then ensuring I had the resources to finish the job.

Nick, Jon, and Sam: For adapting to living through a global pandemic in a single-parent household where mom turned the dining room into a lab for her PhD research, and for all the additional responsibilities they assumed in the process (and especially to Sam for his many hours of company when we finally had a real lab to go to). Together, they made sure I never got too much sleep or quiet time, and provided me with an endless supply of stories to tell my research group when I should have been working.
My family: My parents for making multiple trips to Annapolis to assist with cooking/driving/childcare, my mom for doubling as a copy-editor, my brothers for their consistent phone support availability, and Amanda for being like-a-sister for longer than either of us cares to admit (unless we met when we were 4 or younger).

My CW crew, QuaranTEAM (Alana, Sara, Chad), and friends: Cindy, Ora, Meredith, Kristin, & Kristen, to name just a few. If you ever stumble across this and think maybe you should be part of this list, you’re right. Thank you all, seriously, for your support. I really couldn’t have done it without you. And Adderall. It was prescribed. But mainly you.

Last and least, to generative AI for producing a solid Acknowledgements section draft that incorporated humor and sarcasm. And for the warning to not use it.

Table of Contents

Acknowledgements  ii
Table of Contents  iii
List of Tables  vii
List of Figures  viii
List of Abbreviations  xviii

Chapter 1: Introduction  1
1.1 A Brief History of Side-channels  1
1.1.1 Considering Side-channels in the framework of Electronic Warfare  3
1.2 Side-channel Analysis to detect Ransomware  5
1.3 Contributions  8
1.4 Organization  9

Chapter 2: Side-channels  10
2.1 Side-channels  10
2.1.1 Power Side-channel  11
2.1.2 Electromagnetic Side-channel  11
2.1.3 Acoustic Side-channel  12
2.1.4 Optical Side-channel  12
2.1.5 Temperature Side-channel  12
2.1.6 Micro-architectural Side-channel  13
2.2 Side-channel Attack, Analysis, and Defense  13
2.3 Side-channel Metrics  15
2.3.1 Statistical Tests  15
2.3.2 Entropy, Conditional Entropy, & Guessing Entropy  16
2.3.3 Mutual Information  17
2.3.4 Success Rate  17
2.3.5 Welch’s T-Test  18
2.3.6 Side-channel Vulnerability Factor (SVF)  18
2.3.7 Spatial Thermal Side-channel Factor (STSF)  18
2.3.8 Cache Side-channel Vulnerability (CSV)  19
2.3.9 Signal Available to the Attacker (SAVAT)  19
2.3.10 Thermal-Security-in-Multi-Processors (TSMP)  19
2.3.11 Maximal Leakage  19
2.3.12 Information Leakage Rate  20
2.3.13 Local Differential Privacy  20
2.3.14 Trust Coverage  20
2.3.15 Holistic Assessment Criterion  21
2.3.16 Machine Learning Metrics  21

Chapter 3: The Temperature Side-channel  22
3.1 Overview  22
3.2 Characteristics of the Temperature Side-channel  22
3.3 Sensing the Thermal Channel  23
3.3.1 Internal Temperature Sensing Methods  24
3.3.2 External Temperature Sensing Methods  25
3.4 Temperature Attacks  28
3.4.1 Temperature Side-channel Attacks  28
3.4.2 Covert Channels  29
3.4.3 Physical Attacks  30
3.5 Temperature Side-channel Defense  30
3.5.1 Design-time Defense  31
3.5.2 Run-time Defense  31

Chapter 4: The Micro-architectural Side-channel  33
4.1 Micro-architectural Side-channel  33
4.1.1 Timing Side-channel  34
4.1.2 Memory (Access) Side-channel  34
4.2 Hardware Performance Counters for the Micro-architectural Side-channel  34
4.2.1 Profiling Tools  35
4.2.2 Intel® VTune™ Profiler Primer  36
4.3 Considerations When Using Performance Counters for Security Applications  39

Chapter 5: Malware and Ransomware  42
5.1 Malware Overview  42
5.1.1 Ransomware  44
5.2 Malware Detection via Side-Channel Analysis  45
5.2.1 Malware Detection with the Power Side-channel  45
5.2.2 Malware Detection with the EM Side-channel  45
5.2.3 Micro-architectural Side-Channel for Malware Detection (via Hardware Performance Counters)  46
5.2.4 HPC-based Machine Learning Classification  48
5.3 Ransomware Detection  49
5.3.1 Ransomware Detection with Hardware Sensors and Performance Counters  50

Chapter 6: Temperature Side-channel Analysis Experiments  54
6.1 TSCA Study 1: Temperature Side-channel Analysis to detect File Operations on an SSD with Thermal Imaging  54
6.1.1 Experimental Set-up  54
6.1.2 Classification Results  56
6.1.3 Conclusions and Future Work  60
6.2 TSCA Study 2: Temperature Side-channel Analysis to detect Simulated Malware on a Single Board Computer  61
6.2.1 TSCA Study 2 Methodology  62
6.2.2 Thermal Signature of Write Operations  66
6.3 TSCA Experiment Conclusions and Future Research Directions  70

Chapter 7: Investigating Vtune HPCs for Ransomware Detection  72
7.1 Proof-of-Concept A: Vtune Data Collection of Ransomware Trials  72
7.1.1 Hardware Setup  72
7.1.2 Data Collection Procedure  75
7.1.3 System Restore Process  76
7.1.4 Initial Observations  76
7.2 Proof-of-Concept B: Simultaneous Current and Vtune Data Collection of Ransomware in Non-virtualized and Virtualized Environments  78
7.2.1 Non-Virtualized Hardware Setup  79
7.2.2 Virtualized System Set-up  81
7.2.3 Data Collection Procedure  83
7.2.4 Vtune Performance Counter Analysis  84
7.2.5 Proof-of-Concept B Observations  84
7.2.6 Drive Restore Procedure Using Image for Linux  85
7.3 Study 1: Automate, Expedite, and Expand Vtune Collection  86
7.3.1 Data Collection Procedure  86
7.3.2 Experiment Results  90
7.4 Study 2: Expanded Vtune HPC Collection with Additional Ransomware Samples  90
7.4.1 Experiment Results  91
7.5 Analysis of Performance Counters for Classification of Ransomware Operations for Studies 1 (23-class) and 2 (31-class)  92
7.5.1 Feature Generation  93
7.5.2 Classification Algorithm Identification  94
7.5.3 HPC Feature Selection  100
7.5.4 Single-HPC F1-based Feature Reduction  118
7.6 Summary  124

Chapter 8: Selecting Vtune HPCs for Ransomware Detection  126
8.1 Study 3: 35-class, 90 HPC Feature Collection with Networked Database Transfer and Parallel Report Generation  126
8.1.1 Training Data Verification  127
8.2 Training Data Validation  130
8.2.1 Matthews’ Correlation Coefficient (MCC)  131
8.2.2 HPC Feature Selection Process  133
8.2.3 Cross-Validation of Training Data  136
8.2.4 Classifier Training  139
8.3 Classifier Evaluation Results  142
8.3.1 Test Data Set  142
8.4 Towards Start-time Ransomware Detection  148
8.4.1 Identification of Operation Start Time  150
8.4.2 Cross Validation and Testing Results  151
8.4.3 Results  153
8.5 Top Performance Counter Feature Comparison  158
8.5.1 25 second Start Time Feature Set  158
8.5.2 Performance Counter Ranking Discussion  159
8.6 MASC Experiments Summary and Future Research Directions  159

Chapter 9: Current-based Power Side-channel Analysis Experiments  167
9.1 Proof of Concept: Power Side-channel Analysis via Current Measurements for Detection of SSD File Operations  167
9.1.1 Method  168
9.1.2 Results  172
9.1.3 Follow-on Power Side-channel Analysis Work  176
9.2 Study: Classification of Read, Write, and Idle Operations with Current-based Power Side-channel Analysis  177
9.2.1 Experimental Setup  177
9.2.2 Analysis  179
9.2.3 Results  183
9.3 Proof of Concept: Accessing the Power Side-channel via Current Draw for Ransomware Identification  188
9.3.1 Method  188
9.3.2 Current Side-Channel for Ransomware Detection  196
9.4 Study: 35-class Current-draw-based Side-channel Analysis for Ransomware Classification  197
9.4.1 Part 1: Power Feature Generation and Classifier Identification  199
9.4.2 Part 2: Current-based Power Side-channel Analysis for Ransomware vs. Benign Operations  201
9.4.3 Study 2 Results and Initial Conclusions  212
9.5 PSCA Experiment Summary and Future Research Directions  223

Chapter 10: Conclusions and Future Work  227
10.1 Future Work  229

Appendix A: Table of Vtune Hardware Performance Counter Names, Descriptions, and associated Feature Numbers for each Study, adapted from [1]  232

Bibliography  258

List of Tables

6.1 Comparison between the thermal cameras used in experiments.  64
8.1 Accuracy, Precision, Recall, F1, and MCC metrics for the Confusion Matrix shown in Figure 8.7.  133
9.1 Solid State Drives Tested  169
9.2 Solid State Drives Tested  178
9.3 Current data from trials were converted to Power Spectral Density using Welch’s Method with varying combinations of data duration, window size, and maximum frequency.
200 9.4 Current data from all training and testing trials were converted to Power Spectral Density using Welch’s Method with varying combinations of data duration, window size, and maximum frequency. All power features were generated in both Watts and dBW. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204 vii List of Figures 1.1 Electromagnetic wave, reprinted from [2]. . . . . . . . . . . . . . . . . . . . . . 3 1.2 Electromagnetic Spectrum, adapted from [3]. . . . . . . . . . . . . . . . . . . . 4 1.3 Relationship between Electronic Warfare terminology and Side-Channel terminology. 6 6.1 Experimental set up to capture thermal images of SSDs with a FLIR A325sc thermal camera. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 6.2 Thermal images of a Samsung 970 EVO Plus SSDs during idle, read, write, operations. All images were captured with a FLIR A325sc thermal camera. . . . 56 6.3 Thermal images of a Samsung 970 EVO Plus SSDs during idle, read, write, operations. All images were captured with a FLIR A325sc thermal camera. . . . 57 6.4 Consolidated Confusion Matrix showing the ability to differentiate activity (read/write) from non-activity (idle) for all SSD, operation, and file size combinations for the 3 classifiers trained and tested. . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 6.5 10-fold Cross Validation Results for 3 classifiers trained with read, write, and idle operations at file sizes of 100, 500, and 1000MB. Each row shows results for a single classifier at increasing read/write sizes. . . . . . . . . . . . . . . . . . . . 59 6.6 Test Data Results for 3 classifiers trained with read, write, and idle operations at file sizes of 100, 500, and 1000MB. Each row shows results for a single classifier at increasing read/write sizes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 6.7 System diagram. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
63 6.8 Measurement system setup. The green (upper) box in the left image indicates the thermal camera, while the white (lower) box highlights the single board computer. The right images show the top and side views of the camera and board set up. . . 63 6.9 Comparison of images captured with (a) FLIR A325sc, (b) Seek Thermal CompactPro, and (c) FLIR T530 cameras. FLIR image temperature scaled to 75-135 °F with FLIR Research Studio software. SeeK CompactPRO scaled to 75-135 °F with SeeK Thermal iPhone app. FLIR T530 scaled from 80-110 °F in FLIR Tools. Blue in SeeK image indicates temperatures < 75°. . . . . . . . . . . . . . . . . . 65 6.10 File operations performed on a BeagleBone Black, as observed with a SeeK CompactPRO thermal camera. . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 6.11 Thermal signature of a 100 MB file write over time. The left image shows a BeagleBone Black with three components highlighted: (1) Power Management Integrated Chip, (2) Embedded Multi Media Card, and (3) Processor, all of which produce distinct heat signatures over the course of the write operation. . . . . . . 66 viii 6.12 Accuracy of SeeK Compact Pro and FLIR A325sc cameras, using both automated and manual detection techniques. . . . . . . . . . . . . . . . . . . . . . . . . . . 68 6.13 Precision of SeeK Compact Pro and FLIR A325sc cameras, using both automated and manual detection techniques. . . . . . . . . . . . . . . . . . . . . . . . . . . 69 6.14 Recall of SeeK Compact Pro and FLIR A325sc cameras, using both automated and manual detection techniques. . . . . . . . . . . . . . . . . . . . . . . . . . . 69 7.1 Data Collection Summary for Proof of Concept A . . . . . . . . . . . . . . . . . 73 7.2 Collection apparatus. Includes the data recorder and current probe/amplifier to measure current supplied to the drive, but those capabilities were not incorporated into the proof-of-concept Vtune collection. . . . . . . . . . . . . . . . . . . . . . 
73 7.3 Data Collection and System Restore Procedure for manually executed 7zip and Sodinokibi randomized trials. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 7.4 Mean Event Rate for a 7zip encryption trial vs. REvil/Sodinokibi ransomware, from analysis of VTune microarchitectural hardware performance counters. . . . 77 7.5 Data Collection Summary for Proof of Concept B . . . . . . . . . . . . . . . . . 78 7.6 Experimental setup for monitoring power supplied to the test OS SSD. Current is measured with a current probe, amplified, and acquired and stored with the data recorder. Vtune microarchitectural hardware performance counters are collected from Vtune Profiler running on-system and then transferred off-system for additional analysis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 7.7 Data Collection and System Restore Procedure for manually executed trials. . . . 83 7.8 Vtune Hardware Performance Counters collected solely in 20 of 20 non-idle trials. 84 7.9 Vtune Hardware Performance Counters collected solely in 15 of 15 non-idle trials. 85 7.10 Data Collection Summary for Study 1 . . . . . . . . . . . . . . . . . . . . . . . 86 7.11 SPECworkstation 3.1 Benchmarks incorporated into data collection. . . . . . . . 88 7.12 Data Collection and System Restore Procedure Study 1. . . . . . . . . . . . . . . 89 7.13 Data Collection Summary for Study 2. . . . . . . . . . . . . . . . . . . . . . . . 90 7.14 Data Collection, System Restore, and Report Generation Procedure for 31-class data set (Study 2). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 7.15 Start Time - Window Size - Stop Time Combinations for Study 1. Each column indicates a time interval for which HPC feature sets were generated (12 total). . . 93 7.16 Start Time - Window Size - Stop Time Combinations for Study 2. Each column indicates a time interval for which HPC feature sets were generated (12 total). . . 
94 7.17 MATLAB Classification Models Evaluated in Classification Learning Application, adapted from [4]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 7.18 MATLAB Classification Learning Application Cross-Validation Accuracy for 23- class data set (Study 1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 7.19 MATLAB Classification Learning Application Cross-Validation Accuracy for 31- class data set (Study 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 7.20 Five Classification Algorithms were selected based on their performance across both data sets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 ix 7.21 Weighted importance score calculation example for the 31-class (220 HPC) data set and the MRMR algorithm. Feature ranking rows 11-210 are hidden for the 31-operation classification ranking results, while feature rows 11-220 are hidden for the binary classification ranking results. On the right hand side, weighted importance scores are summed across each row for each HPC to get the total importance score for each HPC and ranking. Importance scores are then summed by column to get a weighted importance score for each HPC for all rankings and data sets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 7.22 Weighted importance scores by algorithm for the to 10 HPCs in the 31-class experiment, ranked from highest to lowest importance score value. . . . . . . . . 103 7.23 Single-HPC Feature Classification Accuracy for top 20 Features using Bagged Tree Ensemble Classification Algorithm for multi-class and binary versions of Studies 1 and 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 7.24 Single-HPC Feature Classification Accuracy for top 20 Features using Subspace Discriminant Ensemble Classification Algorithm for multi-class and binary versions of Studies 1 and 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . 105 7.25 Cross Validation Accuracy for incremental performance counters in order from highest to lowest individual accuracy values for Bagged Tree Ensemble Classifier. 23-class data sets are shown in the top two plots, with 31-class data sets in the bottom two plots. The multi-class version of each data set is on the left, while the binary version of each data set is on the right. . . . . . . . . . . . . . . . . . . . 106 7.26 Cross Validation Accuracy for incremental performance counters in order from highest to lowest individual accuracy values for Subspace Discriminant Ensemble Classifier. 23-class data sets are shown in the top two plots, with 31-class data sets in the bottom two plots. The multi-class version of each data set is on the left, while the binary version of each data set is on the right. . . . . . . . . . . . . 107 7.27 Comparison of Bagged Tree Classifier Cross Validation Accuracy for incremental performance counters in order from highest to lowest individual accuracy values for windows of 0.1 s, 0.5 s, 1 s, and 2 s before, at, and after operation start time. 23-class multi-class data sets are shown across the top, with 31-class multi-class data sets across the bottom. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 7.28 Comparison of Bagged Tree Classifier Cross Validation Accuracy for incremental performance counters in order from highest to lowest individual accuracy values for windows of 0.1 s, 0.5 s, 1 s, and 2 s before, at, and after operation start time. 23-class BINARY data sets are shown across the top, with 31-class BINARY data sets across the bottom. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 7.29 A Confusion Matrix is a standard way to visualize supervised machine learning outcomes. False Negatives (FN) are placed in Quadrant I, True Positives (TP) are placed in Quadrant II, False Positives (FP) are placed in Quadrant III, and True Negatives (TN) are in Quadrant IV . . . . . . . . . . . 
. . . . . . . . . . . . . . 111 7.30 Example “worst case” Confusion Matrix for single-HPC classification tests. . . . 112 7.31 Best single-HPC Confusion Matrix resulting from training 5 classification algorithms with a single HPC feature at a time. This Confusion Matrix shows the cross validation results of a Fine KNN classifier trained with 2 seconds of data and the single HPC feature BR INST RETIRED.NEAR RETURN PS. . . . . . . . . . . 113 x 7.32 F1 scores for each HPC feature and start/stop time window were calculated and then averaged across all start/stop time windows to obtain an average F1 score for HPC feature ranking purposes. . . . . . . . . . . . . . . . . . . . . . . . . . 114 7.33 Cross validation accuracy, precision, recall, and F1 plots for binary 23-class (Study 1) and 31-class (Study 2) data sets trained with Bagged Tree Ensemble classifier and incrementing the number of HPCs used for training. HPCs were ranked from high to low by F1 score averaged over start/stop time window sizes. . 115 7.34 Cross validation accuracy, precision, recall, and F1 plots for binary 23-class (Study 1) and 31-class (Study 2) data sets trained with Subspace Discriminant Ensemble classifier and incrementing the number of HPCs used for training. HPCs were ranked from high to low by F1 score averaged over start/stop time window sizes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 7.35 Cross validation accuracy, precision, recall, and F1 plots for binary 23-class (Study 1) and 31-class (Study 2) data sets trained with Linear Support Vector Machine classifier and incrementing the number of HPCs used for training. HPCs were ranked from high to low by F1 score averaged over start/stop time window sizes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . 116
7.36 Cross validation accuracy, precision, recall, and F1 plots for binary 23-class (Study 1) and 31-class (Study 2) data sets trained with Fine K-Nearest Neighbor classifier and incrementing the number of HPCs used for training. HPCs were ranked from high to low by F1 score averaged over start/stop time window sizes. . . . 116
7.37 Cross validation accuracy, precision, recall, and F1 plots for binary 23-class (Study 1) and 31-class (Study 2) data sets trained with Narrow Neural Network classifier and incrementing the number of HPCs used for training. HPCs were ranked from high to low by F1 score averaged over start/stop time window sizes. . . . 117
7.38 Top 20 F1-ranked HPCs for each classification algorithm. Study 1 (23-class) HPC features are on the left column and Study 2 (31-class) HPC features are on the right. Corresponding feature numbers from the opposing study are listed in gray next to the HPC number for each study. . . . 117
7.39 Top F1-ranked HPC for each classification algorithm for Study 2 (31-class data set) only. These were used as a starting point to identify the fewest number of HPC features to generate the highest possible F1 score. . . . 119
7.40 HPC Features selected for each classifier to achieve the highest possible F1 score with the fewest number of features, ranked by F1 score. . . . 119
7.41 Cross Validation F1 for each classifier, with HPC features selected to achieve the highest possible F1 score with the fewest number of features. The HPC Features selected for training each classifier are shown in Figure 7.40. . . . 120
7.42 Top 3 Cross Validation F1 scores for each classifier and F1 ranks, with HPC features selected to achieve the highest possible F1 score with the fewest number of features.
. . . 121
7.43 Comparison of HPC features with three highest F1 cross validation scores for 1st 4 ranks, with HPC features selected to achieve the highest possible F1 score with the fewest number of features using Bagged Tree Ensemble Classifier. . . . 122
7.44 Comparison of HPC features with three highest F1 cross validation scores for 1st 4 ranks, with HPC features selected to achieve the highest possible F1 score with the fewest number of features using Subspace Discriminant Ensemble Classifier. . . . 122
7.45 Comparison of HPC features with three highest F1 cross validation scores for 1st 4 ranks, with HPC features selected to achieve the highest possible F1 score with the fewest number of features using Linear Support Vector Machine Classifier. . . . 123
7.46 Comparison of HPC features with three highest F1 cross validation scores for 1st 4 ranks, with HPC features selected to achieve the highest possible F1 score with the fewest number of features using Fine K-Nearest Neighbor Classifier. . . . 123
7.47 Comparison of HPC features with three highest F1 cross validation scores for 1st 4 ranks, with HPC features selected to achieve the highest possible F1 score with the fewest number of features using Narrow Neural Network Classifier. . . . 124
8.1 Study 3 included all ransomware samples used in Studies 1 and 2, plus 4 new ransomware versions. . . . 127
8.2 Data Collection Summary for Study 3. . . . 128
8.3 Data Collection, System Restore, and Networked Report Transfer/Generation Procedure for 35-class data set (Study 3). . . . 128
8.4 Summary of 20 Training Data Trials for each of 35 classes in Study 3, HPC 48 (INST RETIRED.ANY).
. . . 129
8.5 Example Plot of HPC 48 (INST RETIRED.ANY) Individual Sodinokibi Trials for Training Data Verification. Trial 151 was identified as anomalous and repeated. . . . 130
8.6 Start Time - Window Size - Stop Time Combinations for Study 3. Each column indicates a time interval for which HPC feature sets were generated (12 total). . . . 131
8.7 Example: Confusion Matrix for Matthews’ Correlation Coefficient (MCC) vs. F1 Score for imbalanced data. . . . 133
8.8 Top 10 ranked HPC features for each classifier for 5 rounds of cross validation and training. . . . 135
8.9 MCC scores averaged across each start/stop window for each classifier plotted against the number of optimized HPCs used to train the classifier. . . . 136
8.10 Cross validation confusion matrices for Bagged Tree Ensemble Classifier trained with 6 optimized HPCs. Window size increases from left to right. . . . 137
8.11 Cross validation confusion matrices for Subspace Discriminant Ensemble Classifier trained with 6 optimized HPCs. Window size increases from left to right. . . . 137
8.12 Cross validation confusion matrices for Linear SVM Classifier trained with 6 optimized HPCs. Window size increases from left to right. . . . 138
8.13 Cross validation confusion matrices for Fine KNN Classifier trained with 6 optimized HPCs. Window size increases from left to right. . . . 138
8.14 Cross validation confusion matrices for Narrow Neural Network Classifier trained with 6 optimized HPCs. Window size increases from left to right. . . . 138
8.15 Average MCC by window size for classifiers trained with 6 optimized HPCs. . . . 139
8.16 Percentage of Correct Predictions by Classifier and Start/Stop Window for all ransomware trials.
. . . 140
8.17 Percentage of Correct Predictions by Classifier and Start/Stop Window for all benign trials. . . . 141
8.18 Classification Accuracy for classifiers evaluated with separate test data, collected in randomized order on two different systems with identical hardware. . . . 144
8.19 Classification MCC for classifiers evaluated with separate test data, collected in randomized order on two different systems with identical hardware. . . . 144
8.20 Average MCC by window size for classifiers tested with 6 optimized HPC features. . . . 145
8.21 Confusion Matrix for Bagged Tree Ensemble classifier trained with 6 optimized HPC features for window sizes of 0.1s, 0.5s, 1s, and 2s at operation trigger time. . . . 145
8.22 Confusion Matrix for Subspace Discriminant Ensemble classifier trained with 6 optimized HPC features for window sizes of 0.1s, 0.5s, 1s, and 2s at operation trigger time. . . . 146
8.23 Confusion Matrix for Linear SVM classifier trained with 6 optimized HPC features for window sizes of 0.1s, 0.5s, 1s, and 2s at operation trigger time. . . . 146
8.24 Confusion Matrix for Fine KNN classifier trained with 6 optimized HPC features for window sizes of 0.1s, 0.5s, 1s, and 2s at operation trigger time. . . . 147
8.25 Confusion Matrix for Narrow Neural Network classifier trained with 6 optimized HPC features for window sizes of 0.1s, 0.5s, 1s, and 2s at operation trigger time. . . . 147
8.26 Percentage of Correct Predictions for 6 HPC feature Classifiers for all (12) sets of Start/Stop Time Windows for all ransomware trials. . . . 149
8.27 Percentage of Correct Predictions for 6 HPC feature Classifiers for all (12) sets of Start/Stop Time Windows for all benign trials.
. . . 149
8.28 Top 10 ranked HPC features for each classifier for 5 rounds of cross validation and training, with features generated for durations of 0.1, 0.5, 1, and 2s from visually identified operation start time. . . . 151
8.29 Cross Validation Accuracy (top) and MCC (bottom) for classifiers trained with the top 1-9 HPC features by MCC rank generated for durations of 0.1, 0.5, 1, and 2s from visually identified operation start time. . . . 152
8.30 Test Data Accuracy (top) and MCC (bottom) for classifiers trained with the top 1-9 HPC features by MCC rank generated for durations of 0.1, 0.5, 1, and 2s from visually identified operation start time. . . . 152
8.31 Summary of Test Data Accuracy (top) and MCC (bottom) for classifiers trained with the top 1-9 HPC features by MCC rank generated for windows of 0.1, 0.5, 1, and 2s beginning at 15s, adapted from Section 8.3.1 for comparison purposes. . . . 153
8.32 % of Correct Predictions for benign operations, using 0.1, 0.5, 1, and 2s windows at actual operation start time (identified visually) and 9 HPC features. Operations for which 100% of trials were predicted correctly across all feature sets are in bold. Operation/classifier combinations which predicted less than 80% of trials correctly are in orange, while operation/classifier combinations that had fewer than 50% of trials predicted correctly are in red. . . . 155
8.33 % of Correct Predictions for ransomware operations, using 0.1, 0.5, 1, and 2s windows at actual operation start time (identified visually) and 9 HPC features. Operations for which 100% of trials were predicted correctly across all feature sets are in bold. Operation/classifier combinations which predicted less than 80% of trials correctly are in orange, while operation/classifier combinations that had fewer than 50% of trials predicted correctly are in red.
. . . 155
8.34 Comparison of % of Correct Predictions for benign operations, with 15s Start Time Results (from Figure 8.27) on the left, and visually identified actual operation start time shown on the right. Just 6 HPC features were required for the results on the left, while 9 HPC features were required to obtain the results on the right. . . . 156
8.35 Comparison of % of Correct Predictions for ransomware operations, with 15s Start Time Results (from Figure 8.27) on the left, and visually identified actual operation start time shown on the right. Just 6 HPC features were required for the results on the left, while 9 HPC features were required to obtain the results on the right. . . . 157
8.36 Titles and descriptions of top 3 ranked performance counters for all combinations of classifiers at 15 seconds, 25 seconds, and actual operation start time. . . . 160
8.37 Summary of all training trials and all 35 classes of benign and ransomware operations for performance counter 2: BR INST RETIRED.FAR BRANCH PS. The 20 benign operations are displayed in the top 5 rows, while the 15 ransomware operations are displayed in the bottom 4 rows. . . . 161
8.38 Summary of all training trials and all 35 classes of benign and ransomware operations for performance counter 64: L2 RQSTS.CODE RD MISS. The 20 benign operations are displayed in the top 5 rows, while the 15 ransomware operations are displayed in the bottom 4 rows. . . . 162
8.39 Summary of all training trials and all 35 classes of benign and ransomware operations for performance counter 67: L2 RQSTS.PF HIT. The 20 benign operations are displayed in the top 5 rows, while the 15 ransomware operations are displayed in the bottom 4 rows.
. . . 163
9.1 Experimental setup for monitoring power supplied to the test SSD from the host computer. Current is measured with a current probe, amplified, and acquired and stored with the data recorder. . . . 168
9.2 Seven SSDs were tested. Drives were selected to cover multiple form factors (2.5” SSD and M.2 module), interface types (SATA III and PCIe 3.0 x4), and technologies (3D NAND and 3DXP), as indicated above. . . . 170
9.3 Data Collection Flow Chart . . . 171
9.4 Representative Reads of 500 MB files for all Crucial, Optane, and Samsung SSDs tested. Fifteen trials were conducted at each of 5 file sizes. Power is provided through the 12V line to all PCIe drives (all Optane drives, the Crucial P5, and the Samsung 970 EVO Plus), and through the 5V line for the SATA III SSDs (Crucial MX500 and Samsung 850 EVO). Note: Y-axis for each drive is scaled to show detailed signature and should not be used for direct power comparison. . . . 172
9.5 Representative Writes of 500 MB files for all Crucial, Optane, and Samsung SSDs tested. Fifteen trials were conducted at each of 5 file sizes. Power is provided through the 12V line to all PCIe drives (all Optane drives, the Crucial P5, and the Samsung 970 EVO Plus), and through the 5V line for the SATA III SSDs (Crucial MX500 and Samsung 850 EVO). Note: Y-axis for each drive is scaled to show detailed signature and should not be used for direct power comparison. . . . 174
9.6 Longest pulse average power for (a) writes and (b) reads as a function of file size. Values plotted are averages across trials (n = 15). Error bars are +/- 2 standard errors.
. . . 175
9.7 Data Flow for collection of read, write, and idle operations on four SSDs under test. . . . 179
9.8 Representative time domain (left) and spectrogram (right) plots for 1 GB read operations for the Samsung, Western Digital, Crucial, and Optane drives, respectively (top to bottom). (Values smaller than -115 dB in the spectrogram plots are thresholded to -115 dB to facilitate visualization by improving image contrast.) . . . 180
9.9 Representative time domain (left) and spectrogram (right) plots for 1 GB write operations for the Samsung, Western Digital, Crucial, and Optane drives, respectively (top to bottom). (Values smaller than -115 dB in the spectrogram plots are thresholded to -115 dB to facilitate visualization by improving image contrast.) . . . 181
9.10 Representative time domain (left) and spectrogram (right) plots for the idle state for the Samsung, Western Digital, Crucial, and Optane drives, respectively (top to bottom). (Values smaller than -115 dB in the spectrogram plots are thresholded to -115 dB to facilitate visualization by improving image contrast.) . . . 182
9.11 Classification results: Drive/Operation/File size . . . 186
9.12 Full classification results for the individual test data sets: 2022 Drive 1 data and 2022 Drive 2 data. . . . 187
9.13 Classification by operation. . . . 188
9.14 Classification results by operation for the individual test data sets: 2022 Drive 1 data and 2022 Drive 2 data. . . . 189
9.15 Classification by drive. . . . 190
9.16 Classification results by drive model for the individual test data sets: 2022 Drive 1 data and 2022 Drive 2 data.
. . . 191
9.17 Classification results by drive model and operation for both 2022 Drive 1 data and 2022 Drive 2 data sets. . . . 192
9.18 Classification results by drive model and operation for the individual test data sets: 2022 Drive 1 data and 2022 Drive 2 data. . . . 193
9.19 Collection apparatus. . . . 194
9.20 Current Signatures of idle, 7zip encryption, and live ransomware samples (Darkside, Sodinokibi, WannaCry). Data is recorded for 150 seconds, with encryption and Ransomware triggered manually at the 30 second mark. . . . 194
9.21 1 second section of Current Signatures of idle, 7zip encryption, and live ransomware samples (Darkside, Sodinokibi, WannaCry). . . . 195
9.22 Frequency Spectrum of idle, 7zip encryption, and live ransomware (Darkside, Sodinokibi, WannaCry) at sampling rate of 200kHz with amplitude truncated after 1 mA. . . . 195
9.23 1kHz of Frequency Spectrum of idle, 7zip encryption, and live ransomware (Darkside, Sodinokibi, WannaCry) at sampling rate of 200kHz. . . . 196
9.24 Confusion matrix for 5-way classification of idle, 7zip encryption, and live ransomware (Darkside, Sodinokibi, WannaCry). . . . 197
9.25 Current for a representative trial for each operation type from trial initiation for 175 seconds. . . . 198
9.26 Current for a representative trial for each operation type from operation initiation at 60 second mark for 5 seconds.
. . . 198
9.27 Data Collection Summary for simultaneous current and Vtune HPC collection for 36 classes, used to fine-tune the feature generation and select well-performing classifiers for further exploration. One ransomware class (WannaCry 1022) was eventually dropped from the final study because it did not perform encryption operations. . . . 199
9.28 Cross Validation Accuracy for non-exhaustive combinations of trial duration, Welch Window size, maximum feature frequency, and dB units, recorded from MATLAB Classification Learning Application. . . . 202
9.29 Top 20 Cross Validation Accuracy results (based on Coarse Tree validation ranking) for non-exhaustive combinations of trial duration, Welch Window size, maximum feature frequency, and dB units, recorded from MATLAB Classification Learning Application. . . . 203
9.30 Current, Full 100 kHz Frequency Spectrogram, and Cutoff 300 Hz Frequency Spectrogram for benign operations: OS only, 7zip, SPEC01 7zip, SPEC02 octave, SPEC03 Blender. . . . 205
9.31 Current, Full 100 kHz Frequency Spectrogram, and Cutoff 300 Hz Frequency Spectrogram for benign operations: SPEC04 CalculiX, SPEC05 Convolution, SPEC06 FFTW, SPEC07 fsi, SPEC08 handbrake. . . . 206
9.32 Current, Full 100 kHz Frequency Spectrogram, and Cutoff 300 Hz Frequency Spectrogram for benign operations: SPEC09 Kirchhoff, SPEC10 lammps, SPEC11 luxrender, SPEC12 namd, SPEC13 WPCcfd. . . . 207
9.33 Current, Full 100 kHz Frequency Spectrogram, and Cutoff 300 Hz Frequency Spectrogram for benign operations: SPEC14 poisson, SPEC15 python36, SPEC16 rodinaLifeSci, SPEC17 rodiniaCFD, SPEC18 srmp.
. . . 208
9.34 Current, Full 100 kHz Frequency Spectrogram, and Cutoff 300 Hz Frequency Spectrogram for ransomware operations: Babuk 1022, DarkSide Spr21, DarkSide 0521, Gibberish 0422, HiddenTear 0422. . . . 209
9.35 Current, Full 100 kHz Frequency Spectrogram, and Cutoff 300 Hz Frequency Spectrogram for ransomware operations: Phobos 0122, Phobos 0522, Phobos 0922, Snatch 0422, Snatch 0521. . . . 210
9.36 Current, Full 100 kHz Frequency Spectrogram, and Cutoff 300 Hz Frequency Spectrogram for ransomware operations: REvilSodinokibi Spr21, Sodinokibi 0722, Sodinokibi 0222, WannaCry Spr22, WannaCry 0622. . . . 211
9.37 Cross Validation Accuracy for all combinations of trial duration, Welch Window Size, and Cutoff Frequency Power Spectral Density features. . . . 213
9.38 Cross Validation Accuracy for all combinations of trial duration, Welch Window Size, and Cutoff Frequency Power Spectral Density features in dB. . . . 214
9.39 Cross Validation MCC for all combinations of trial duration, Welch Window Size, and Cutoff Frequency Power Spectral Density features. . . . 215
9.40 Cross Validation MCC for all combinations of trial duration, Welch Window Size, and Cutoff Frequency Power Spectral Density features in dB. . . . 216
9.41 Percent of Total Correct Predictions for all combinations of Welch Window and Cutoff Frequency for 1-3s, 5s, 7s, 10s, and 15s for benign operations, with PSD conversion in dB. . . . 217
9.42 Test Data Classification Accuracy for all combinations of trial duration, Welch Window Size, and Cutoff Frequency Power Spectral Density features. . . . 219
9.43 Test Data Classification Accuracy for all combinations of trial duration, Welch Window Size, and Cutoff Frequency Power Spectral Density features in dB.
. . . 220
9.44 Test Data Classification MCC for all combinations of trial duration, Welch Window Size, and Cutoff Frequency Power Spectral Density features. . . . 221
9.45 Test Data Classification MCC for all combinations of trial duration, Welch Window Size, and Cutoff Frequency Power Spectral Density features in dB. . . . 222
9.46 Top 3 results for each trial segment length for eleven different durations (ranging from 1-15 seconds) tested. Results are ranked from low-to-high segment length and then by high-to-low MCC. . . . 224

List of Abbreviations

3DXP  3D Cross Point
AES  Advanced Encryption Standard
BIOS  Basic Input/Output System
BPI  Blind Power Identification
CER  Cross Entropy Ratio
CLI  Command Line Interface
CPU  Central Processing Unit
CSV  Cache Side-channel Vulnerability
DSS  Digital Signature Standard
DVS  Dynamic Voltage & Frequency Scaling
EA  Electronic Attack
EDR  Endpoint Detection & Response
EM  Electromagnetic
eMMC  embedded Multi-Media Card
EMS  Electromagnetic Spectrum
EMSC  Electromagnetic Side-Channel
EP  Electronic Protection
ES  Electronic Warfare Support
EW  Electronic Warfare
FFT  Fast Fourier Transform
FLIR  Forward Looking Infrared
FPGA  Field Programmable Gate Array
GE  Guessing Entropy
GPU  Graphics Processing Unit
HAC  Holistic Assessment Criterion
HMD  Hardware-level Malware Detection
HPC  Hardware Performance Counter
IC  Integrated Circuit
IDQ  Instruction Decode Queue
IFL  Image for Linux
ILR  Information Leakage Rate
IO  Input Output
IP  Intellectual Property
IR  Infrared
KNN  K-Nearest Neighbor
LED  Light-Emitting Diode
LPF  Low-Pass Filter
MASC  Microarchitectural Side-Channel
MCC  Matthews’ Correlation Coefficient
MI  Mutual Information
ML  Machine Learning
MOE  Measure of Effectiveness
NETD  Noise Equivalent Temperature Difference
NVMe  Non-Volatile Memory express
OHM  Open Hardware Monitor
OS  Operating System
PCA  Principal Component Analysis
PCIe  Peripheral Component Interconnect express
PCTA  Progressive Correlation Thermal Analysis
PI  Perceived Information
PSC  Power Side-Channel
PSCA  Power Side-Channel Attack
QLC  Quad Level Cell
RaaS  Ransomware-as-a-Service
RAT  Resource Allocation Table
RMS  Root Mean Square
RO  Ring Oscillator
RSA  Rivest Shamir Adleman
SATA  Serial AT (or Advanced Technology) Attachment
SAVAT  Signal AVailable to the ATtacker
SBC  Single-Board Computer
SCA  Side-Channel Attack
SCP  Secure Copy
SCAN  Side-Channel ANalysis
SCD  Side-Channel Defense
SQ  Super Queue
SR  Success Rate
SSD  Solid State Drive
SSH  Secure SHell
STSF  Spatial Thermal Side-channel Factor
SVF  Side-channel Vulnerability Factor
SVM  Support Vector Machine
TSC  Temperature Side-Channel
TSCA  Temperature Side-Channel Attack
TSMP  Thermal-Security-in-Multi-Processors
TVLA  Test Vector Leakage Assessment
UOPS  Micro-operations
USB  Universal Serial Bus

Chapter 1: Introduction

Computers exude information, by design. They have become fully integrated into our lives - from word processing to gaming, streaming music and movies, big data crunching, and accessing unlimited information through the internet with the click of a button, often in the palm of your hand - because we have come to depend on the myriad services computer systems provide. Much of the information that is obtained from computers is intentional - documents, movies, and games are displayed to monitors, music streams through headphones and speakers, and network packets are transmitted wirelessly to and from routers, all originating from the device and sent through the proper channel to an intended receiver. The physical implementation of computer hardware that provides these services leads necessarily to physical behavior on the part of an operating computer. This physical behavior has physical characteristics, many of which become channels of information leakage that can be observed by an unintended receiver. This can pose a serious threat to computer security.
1.1 A Brief History of Side-channels

These “side-channels” of computer operations, such as current usage and power consumption, generation of heat and electromagnetic radiation, and events at the micro-architectural level, can be exploited to compromise the confidentiality of a system. The identification of side-channels as avenues to gather and exploit valuable information is relatively new - it was 1996 when Paul Kocher used differences in the timing of computer performance optimizations (a function of the chosen micro-architectural implementation) to find the entire secret key of asymmetric encryption algorithms such as Diffie-Hellman, Rivest Shamir Adleman (RSA), and Digital Signature Standard (DSS) [5], followed closely by analysis of the power consumed during cryptographic operations to extract keys from dozens of products [6]. In the 2000s the National Security Agency declassified the TEMPEST program [7], and electromagnetic signal leakage was identified as another viable side-channel [8]. In the mere 20 years that followed, a number of additional side-channels have become commonly accepted - acoustic, optical, temperature, memory/cache, and micro-architectural - and the advantages, disadvantages, and methods of exploiting and securing each are active research areas. The related nomenclature, however, is highly inconsistent. Side-channel attacks are classified in a number of ways: Active vs. Passive, Invasive vs. Non-invasive [9], Simple vs. Differential [10], Profiled vs. Non-Profiled [11], but the terms are often used in overlapping contexts. Some works refer to Side-channel Analysis and Constructive Side-channel Analysis while others refer to the same techniques as passive side-channel attacks and side-channel defense, respectively. Meanwhile, what actually constitutes a side-channel is also ambiguous - with multiple names used interchangeably for the same leakage vector, sometimes compounded by conflating the name of the side-channel with an attack method.
“Mutual information,” generally considered a statistical measure of the amount of information shared between two random variables, is at times considered a metric [12, 13], a side-channel [14], an analysis method [15, 16], and an attack method [17]. There are as many as six different terms used for the similar side-channel methods related to micro-architecture/memory/cache/access/timing/transient execution, with no clear guideline or definition to distinguish between them. The highly publicized Spectre [18] and Meltdown [19] attacks, released in 2018, exemplify this issue. In a fairly new, rapidly evolving field, every novel attack that is identified sends researchers scrambling for a solution. It’s not at all surprising that the field is disorganized and terminology is inconsistent; the focus is always reacting to the next threat. For the purposes of this work, side-channels will be organized and discussed in the framework of Electronic Warfare, as discussed in Section 1.1.1, below.

1.1.1 Considering Side-channels in the framework of Electronic Warfare

Electronic Warfare (EW) is the term given to military operations to control the electromagnetic spectrum (EMS), or the range of frequencies of electromagnetic radiation. Electromagnetic waves consist of perpendicular electric and magnetic fields, as shown in Figure 1.1, and the EMS is organized by frequency (and corresponding wavelength), as seen in Figure 1.2.

Figure 1.1: Electromagnetic wave, reprinted from [2].

EW focuses on maintaining friendly control of the EMS while denying its use to adversaries, and is further divided into 3 categories: Electronic Attack (EA), Electronic Warfare Support (ES), and Electronic Protection (EP) [20]. EW has been used in a military capacity for over a hundred years.
The earliest documented attempt to use EW in a military capacity was during the Russo-Japanese War in 1905, when a Russian captain requested (but was denied) permission to transmit a signal to interfere with the wireless reports of a Japanese ship that had spotted the Russian Fleet in the Tsushima Strait; the ensuing battle ended in a crushing Russian defeat that ultimately decided the war in Japan’s favor [21].

Figure 1.2: Electromagnetic Spectrum, adapted from [3].

EW has been widely used to gain an operational advantage since World War II, when both the Allies and Axis powers employed it against the navigational systems of bomber aircraft, among other uses [22]. In the century since EW was first introduced, a time-tested framework has been developed to define and organize the EW field. As noted above, EW is divided into three categories:

• Electronic Attack (EA) - EA is the use of the EMS or directed energy to attack an adversary, either by preventing their access to the EMS or by preventing the adversary from denying friendly access to the spectrum [23].

• Electronic Warfare Support (ES) - ES is the use of the EMS for information gathering, specifically for the purpose of threat recognition, planning, or supporting future operations [23].

• Electronic Protection (EP) - EP refers to actions taken to protect personnel, facilities, and equipment from any effects of the use of the EMS for EA or ES [23].

Although side-channels are quite recent in comparison, similarities exist. EW is focused on actions involving the range of electromagnetic radiation in the EMS, with EA encompassing actions such as “jamming.” Side-channels, in comparison, center on information leakage from the physical behavior of computer hardware, where attacks primarily constitute obtaining secret encryption keys or privileged system access.
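As a concrete illustration of the key leakage just described (and of the Kocher-style timing channel noted in Section 1.1), consider textbook square-and-multiply modular exponentiation: the extra multiply performed only for 1-bits of a secret exponent makes the operation count, and hence execution time, depend on the key. The following Python sketch is purely illustrative (not an artifact of this work; the helper name `modexp_count_ops` is invented for the example):

```python
def modexp_count_ops(base: int, exponent: int, modulus: int):
    """Left-to-right square-and-multiply; returns (result, multiply_count).

    The extra multiply occurs only for 1-bits of the exponent, so the
    operation count (a proxy for execution time) leaks the exponent's
    Hamming weight -- the data-dependent timing Kocher exploited.
    """
    result = 1
    multiplies = 0
    for bit in bin(exponent)[2:]:               # most-significant bit first
        result = (result * result) % modulus    # square every iteration
        if bit == "1":
            result = (result * base) % modulus  # multiply only on 1-bits
            multiplies += 1
    return result, multiplies

# Two secret exponents of equal bit-length but different Hamming weight
# produce different operation counts, observable as a timing difference.
r1, m1 = modexp_count_ops(7, 0b10000001, 1009)  # Hamming weight 2
r2, m2 = modexp_count_ops(7, 0b11111111, 1009)  # Hamming weight 8
print(m1, m2)  # 2 8
```

Repeated measurements of this timing difference across chosen inputs are what allow an attacker to recover the exponent bit by bit, even though the algorithm's output itself reveals nothing about the key.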
This work will consider side-channels in the framework of EW, where the distinct characteristics of the power, acoustic, electromagnetic, temperature, optical, and micro-architectural side-channels are analogous to the divergent properties and propagation methods exhibited by the different frequency bands in the EMS. To organize inconsistent terminology for side-channel methods, actions to (actively) attack side-channels with the goal of cryptanalysis or privileged system access will still be called Side-Channel Attacks (SCA). Use of side-channels for the purposes of information gathering and threat recognition will be termed Side-Channel ANalysis (SCAN), and actions taken to defend side-channels - either through minimizing side-channel leakage or by reducing the relationship between the leakage and sensitive information - will be categorized as Side-Channel Defense (SCD). For reference purposes, relationships between EW and SC methods are shown in Figure 1.3.

Figure 1.3: Relationship between Electronic Warfare terminology and Side-Channel terminology.

1.2 Side-channel Analysis to detect Ransomware

Side-channel analysis is valuable in determining what kind of unintentional information is leaked from a system, and can be used to detect modifications to a system or validate the expected operation [24–26]. In order to provide scope and a pertinent, real-world application with which to evaluate the effectiveness of disparate side-channel methods, this research focuses on side-channel analysis of selected side-channels (power, temperature, and micro-architectural), with specific applications to the detection of state-of-the-art, real ransomware running on a live, non-virtualized system.
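Because the evaluations in this work summarize classifier performance with Matthews’ Correlation Coefficient (MCC), a brief aside on the metric may be useful: MCC is computed from the four cells of a binary confusion matrix and, unlike raw accuracy, remains informative when classes are imbalanced. The following is a minimal illustrative sketch (my own, not the MATLAB tooling used in this work; the function name `mcc` is arbitrary):

```python
import math

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews' Correlation Coefficient from a binary confusion matrix.

    Ranges from -1 (total disagreement) through 0 (random guessing)
    to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0

# A toy, balanced 100-trial example: 96 correct predictions (96% accuracy)
# yields an MCC of 0.92, the same accuracy/MCC pairing reported for the
# best-case power side-channel results in the abstract.
print(mcc(48, 48, 2, 2))  # 0.92
```

Note how quickly MCC drops relative to accuracy as errors concentrate in one class; this is why it is used alongside accuracy throughout the results chapters.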
The impact of even the perceived threat of ransomware attacks was brought to the front of national consciousness in May 2021, when the largest US refined fuel pipeline operator, Colonial Pipeline, shut down its fuel distribution system for 5 days in response to a ransomware attack on its administrative system by the DarkSide ransomware group, causing panic buying, price surges, and fuel outages [27]. Myriad reports of attacks on critical infrastructure, school systems, hospitals, government systems, and private companies occur almost weekly; the first human death attributed to a ransomware attack occurred when a German woman’s ambulance was re-routed from a nearby Dusseldorf hospital after the hospital was attacked by ransomware in September 2020 [28]. In November 2022, the U.S. Treasury reported that the total cost of ransomware attacks on U.S. financial institutions alone in 2021 increased 200% over the prior year, to $1.2 billion [29], and these numbers are estimated to be well below the true figure, as they include only data that U.S. banks were required to report [30]. The global impact of ransomware is so significant that the White House has hosted two International Counter Ransomware Initiative Summits in the past two years [31].

Typical ransomware detection methods depend on up-to-date anti-virus software signatures created from previously discovered versions of ransomware, and are unlikely to prevent brand-new “zero-day” ransomware attacks. This work considers side-channel analysis techniques for the temperature, power, and micro-architectural side-channels for the purpose of classifying state-of-the-art ransomware on real-world, non-virtualized Windows systems. Over three thousand ransomware and benign trials were collected to generate training and testing data sets, which required development of a process to synchronize collection of on-system (e.g. performance counters) and off-system (e.g.
power) measurements, safely transfer trial data from the encrypted system, and restore the system to a “clean” state without the use of virtualization techniques, which negatively impact the validity of side-channel measurements. Side-channels were evaluated on their effectiveness in accurately differentiating between ransomware and benign operations, such as background operating system activity, 7zip encryption, and SPEC benchmarks, in a given time duration, with Matthews’ Correlation Coefficient (MCC) used to measure overall classifier performance of five machine learning classification algorithms. With the financial impact of ransomware estimated to cost more than $30 billion globally this year [32], the usefulness of side-channel analysis to detect ransomware in a non-virtualized computer system has significant real-world implications.

1.3 Contributions

This work is the first hardware-based ransomware classification and detection exploration to leverage side-channel analysis of multiple (power, temperature, micro-architectural) side-channels on hardware without use of virtualization techniques. Each side-channel is evaluated for its usefulness in classifying ransomware based on the detection time required to achieve at least 90% classification accuracy. Specifically, the following contributions are insights that came from this work:

1. This work developed a process to collect system-wide data from the power and micro-architectural side-channels simultaneously during ransomware execution without the use of virtualization.

2. The micro-architectural side-channel, accessed by collecting hardware performance counters through Intel’s VTune Profiler, produced test accuracy of 99% in ≤ 1 second. Over 200 hardware events were collected and systematically tested for their ability to differentiate ransomware from benign operations. The top 10 events were ranked by classifier, and the best events were found to change based on classifier and time window size.
Accuracy results for a 0.1s window were within 10% of best-case results for each classifier.

3. The power side-channel, accessed by collecting current supplied to a solid state drive, produced test accuracy of 96% in 15 seconds, and required at least 4 seconds to pass 90%.

4. Temperature side-channel-related research was surveyed and organized, but proof-of-concept efforts to utilize the temperature side-channel for ransomware detection indicated it is not a viable path for time-sensitive applications without the use of pre-processing, which is outside the scope of this work.

5. Since the terminology used to refer to side-channel topics is inconsistent at best, a preliminary attempt to organize side-channel concepts in the framework of Electronic Warfare is described and utilized in this work.

1.4 Organization

The remainder of this work is organized as follows: Chapter 2 provides a literature review and background on side-channel types, methods, and metrics. Chapter 3 considers the less-utilized temperature side-channel in depth, and Chapter 4 provides additional background on the micro-architectural side-channel and the use of Hardware Performance Counters (HPCs) to obtain micro-architectural side-channel information. Chapter 5 includes background and related literature on side-channel analysis-based methods to detect malware and ransomware, along with additional analysis of all previous works that have investigated the use of HPCs for ransomware detection. Experiments accessing the temperature side-channel through thermal imaging are included in Chapter 6. Chapters 7 and 8 describe the experiments and results leveraging micro-architectural side-channel hardware performance counter event data to classify ransomware operations. Evaluation of power side-channel-based current-draw analysis and its feasibility for ransomware classification and detection are provided in Chapter 9.
This work concludes with Chapter 10, which summarizes results and discusses future research directions to employ side-channel analysis techniques for detection of ransomware on a non-virtualized system.

Chapter 2: Side-channels

2.1 Side-channels

The physical implementation of computer hardware leads necessarily to physical behavior on the part of an operating computer. This physical behavior has physical characteristics, many of which become channels of information leakage that can be observed by an unintended receiver. Digital computations result in physical effects such as current usage and power consumption, generation of heat and electromagnetic radiation, and events at the micro-architectural level. Although these effects are not intended to convey information about the operations being performed, they create the opportunity to do so through side-channel analysis.

A side-channel is a source of observable information leakage through methods other than the intended communications channel. Whereas the communications channel itself is the medium through which information is passed between a transmitter and receiver, a side-channel is a medium through which information may be leaked or observed due to the physical effects caused by the operation or implementation of a digital process, rather than through direct access to the device hardware or software itself. Typical side-channels include power consumption, electromagnetic emissions, acoustic emissions, optical emissions, heat generated, and micro-architectural effects: execution time required and memory resources consumed. These commonly used side-channels are considered below.

2.1.1 Power Side-channel

The power side-channel is based on the current draw of CMOS devices in a transitory state. As data is processed, millions of these transitions take place, with a direct impact on a device’s power consumption.
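As a point of reference, the data-dependent component of this consumption is conventionally described by the first-order dynamic power model (a standard textbook approximation, not a result of this work):

```latex
P_{\mathrm{dyn}} = \alpha \, C_L \, V_{DD}^{2} \, f
```

where $\alpha$ is the switching activity factor, $C_L$ the switched load capacitance, $V_{DD}$ the supply voltage, and $f$ the clock frequency. Because $\alpha$ varies with the data being processed, the power trace carries information about the computation being performed.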
Power side-channel analysis infers activity or operations based on measuring the power consumed by a device. The power side-channel has been widely studied for over 20 years and was recently summarized in comprehensive survey papers [33, 34]. Multiple types of power and current collection methods allow the ground truth of a device’s power consumption to be determined with high fidelity; however, measurement is typically invasive and limited to a single device or chip.

2.1.2 Electromagnetic Side-channel

The electromagnetic (EM) side-channel is a result of the electromagnetic field generated by the flow of current. It has also been widely studied and recently surveyed [35, 36], is considered the most useful side-channel when power measurements are unavailable, and is particularly helpful when an implementation is resistant to side-channel power analysis [8]. The EM side-channel does not require direct access to a device, and is usually measured with a near-field probe placed in close proximity to the device under test. A newer, related application of the EM side-channel is called “Backscattering,” which is a result of EM signals being reflected and simultaneously combined with effects of the circuit activity at that moment [37, 38].

2.1.3 Acoustic Side-channel

A classic form of eavesdropping, the acoustic side-channel is a result of the sound that is created in multiple computation scenarios [39], including determining keystrokes on a keyboard [40–42], finger taps on the touch-screen of a smartphone [43], inferring what is printed on a traditional [44] or 3D [45, 46] printer, and even cryptanalysis resulting in the extraction of encryption keys [47–49]. The most recent application of the acoustic side-channel is the first active acoustic SCA by Cheng et al. [50], which used inaudible audio signals to enable a smartphone to use SONAR to track human movements and disclose unlock patterns.
2.1.4 Optical Side-channel

Most examples of the optical side-channel consider the analysis of the photons emitted when a transistor changes state, which has been applied to AES [51–53] and RSA [54] encryption. The optical side-channel has also been applied to supplement power analysis when using a charge-coupled device light-sensing camera [55], and by using the status of router LEDs to covertly communicate and exfiltrate data [56].

2.1.5 Temperature Side-channel

The speed at which a microprocessor operates, combined with the movement of charge required to change the state of transistors, gives rise to heat, or a temperature-based side-channel. An in-depth discussion of the temperature side-channel is provided in Chapter 3.

2.1.6 Micro-architectural Side-channel

Microarchitecture is the specific design of a microprocessor, and consists of all the digital logic, arithmetic, and data path and control circuits required to implement an instruction set in a given processor. The micro-architectural side-channel leverages the distinct hardware features inherent in the particular microprocessor design, particularly with regard to the time it takes to execute instructions, the contention inherent in the sharing of hardware resources, and the functionality of the memory subsystem. In contrast to the previous side-channels, this side-channel is software-based and does not necessarily require physical proximity to the target device. The micro-architectural side-channel consists of both timing- and memory-access-related side-channel information and is considered in detail in Chapter 4.

2.2 Side-channel Attack, Analysis, and Defense

Side-channel attacks take advantage of the physically observable characteristics of computer tasks on specific hardware implementations, and were originally considered to be sortable into two orthogonal classifications according to [9]:

1. Active vs. Passive. Active attacks deliberately tamper with the proper functioning of the device in question.
Passive attacks observe the behavior of the device without causing any disruption to the device itself.

2. Invasive vs. Non-Invasive. Invasive attacks require gaining access to the inside components of a chip, usually by depackaging. Non-invasive attacks exploit information which is externally observable.

More recent classification methods consider additional orthogonal axes:

1. Active vs. Passive.

2. Invasive vs. Semi-Invasive vs. Non-Invasive.

3. Simple vs. Differential. This classification is based on the method used to analyze the side-channel data. A simple analysis generally utilizes a single side-channel trace, where information can be extracted directly from the side-channel observations. A differential analysis, on the other hand, uses many side-channel traces and statistical (or ML) approaches to find the correlation between the side-channel information and the secret data [10].

4. Non-Profiled vs. Profiled. A non-profiled attack uses traces measured directly from the target device (e.g. Simple or Differential Power Analysis), while a profiled attack depends on the use of a separate device to construct a profile of the target device, which is then used to attack the target. Template attacks and deep-learning-based attacks are powerful examples of profiled side-channel attacks [11].

Most frequently, active side-channel attacks are used to ascertain the secret key of an encryption algorithm. This is so frequent that the term “side-channel attack” has become synonymous with cryptanalysis. The passive, non-invasive attack described above essentially constitutes information gathering, and is better described as side-channel analysis, allowing for timely insights into the effectiveness of offensive or defensive measures. Side-channel defense is the use of side-channel insights to defend against side-channel attacks.
Defensive measures aim either to eliminate the side-channel leakage itself or to eliminate the relationship between the side-channel leakage and sensitive information.

2.3 Side-channel Metrics

Side-channel attacks pose a considerable threat to the security of a system, so the ability to understand, quantify, and compare the threat vectors posed by side-channel leakage is crucial to understanding a system’s vulnerabilities.

A thorough survey of over 80 technical privacy metrics was released in 2018 [57], which considered all measures that described any level of privacy across six privacy domains: communications systems, databases, location-based services, smart metering, social networks, and genome privacy, and grouped metrics by the output measures of uncertainty, information gain/loss, data similarity, indistinguishability, error, time, accuracy/precision, and the adversary’s probability of success. This survey captured many of the metrics that have been used to quantify privacy, although many did not specifically focus on side-channel leakage. The pertinent metrics from [57], along with a wide variety of other statistical and novel side-channel leakage-based metrics, are summarized in this section.

2.3.1 Statistical Tests

The earliest work on side-channel metrics looked at the ability of statistics to quantify the immunity of a device to leakage, as minimal leakage provides limited opportunity for the creation of side-channels. Leakage tests were designed to detect general vulnerability to timing and power consumption attacks, as well as correlation tests (Hamming weight, external parameters). These tests were based on existing statistical tests for randomness (F-test, R-test) and significance (distance of means, goodness of fit, sum of ranks). Failure of any of the designed leakage tests indicated probable emission of secret information; however, passing any of these tests does NOT imply a device does not leak information [58].
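As an illustration, a distance-of-means leakage test of the kind described above can be sketched in a few lines. The traces below are synthetic placeholders (invented for illustration), with the “fixed-input” set given a small data-dependent offset to stand in for genuine leakage; the statistic computed is the same Welch’s t that underlies the TVLA methodology discussed later in this chapter.

```python
import numpy as np

def welch_t(a, b):
    """Welch's t-statistic: distance of means scaled by the combined
    standard error, without assuming equal variances."""
    va, vb = a.var(ddof=1), b.var(ddof=1)
    return (a.mean() - b.mean()) / np.sqrt(va / len(a) + vb / len(b))

rng = np.random.default_rng(0)

# Synthetic "power traces": one set captured with a fixed input, one
# with random inputs; the fixed set carries a small offset (leakage).
fixed = rng.normal(loc=1.00, scale=0.05, size=5000)
random_in = rng.normal(loc=0.98, scale=0.05, size=5000)

t = welch_t(fixed, random_in)

# TVLA convention: |t| > 4.5 is treated as evidence of leakage.
print(f"t = {t:.1f}, leakage detected: {abs(t) > 4.5}")
```

Note the asymmetry the section describes: a large |t| flags probable leakage, but a small |t| does not prove the device is leak-free.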
2.3.2 Entropy, Conditional Entropy, & Guessing Entropy

Entropy is the average amount of information provided by the outcome of a random event, or the measure of uncertainty of a random variable [59]. A decrease in uncertainty corresponds to an increase in information. Entropy as a side-channel metric originated from Shannon’s description of a communications channel, and was explored as a desirable measure of predictability in the context of atmospheric science and weather forecasts by [60] and for early side-channel attack modeling by [61]. It is used as the basis of many metrics, including Gu’s Spatial Thermal Side-channel Factor (STSF) in Section 2.3.7 [62]. [57] summarized arguments against its use as a privacy metric due to the heavy influence of outlier values, which can make it misleading and difficult to use in comparison, as it gives only a general indication of uncertainty with no insight into how accurate the estimate is.

2.3.2.1 Conditional Entropy

Conditional Entropy is a measure of how much information is needed to describe the outcome of an event X given the known outcome of another event Y. In the side-channel context, Y can be considered the attacker’s observations of the given side-channel of interest. Conditional Entropy directly yields Mutual Information (Section 2.3.3) [12, 57, 61].

2.3.2.2 Guessing Entropy

Guessing entropy is a security metric that gives the average number of questions that must be asked to correctly guess a value. In the context of side-channel attacks, it estimates the average number of key candidates that will need to be tested after the attack is complete, thus quantifying the effectiveness of the attack [12, 61, 63].

2.3.3 Mutual Information

Mutual Information is the measure of the amount of information shared between two random variables.
In the side-channel context, the variables are usually the true distribution of information and the attacker’s observations of that information, hence measuring the amount of information leakage [12, 57]. [14–17, 64–66] all explore Mutual Information Analysis for side-channel analysis in varying contexts. Mutual Information is also used as the basis of the Side-channel Leakage Evaluator and Analysis Kit in [67] and is related to Success Rate (Section 2.3.4) [13].

2.3.4 Success Rate

Generally speaking, Success Rate measures the probability that the adversary is successful by determining the percentage of successes over a large number of attempts [57]. In the side-channel application, Success Rate indicates the efficiency with which the adversary can recover the secret key [12, 13]. [68] explores the application of this metric to deep-learning-based side-channel analysis of imbalanced data and finds that it is difficult to embed in DL algorithms, but closely related to Cross Entropy Ratio (Section 2.3.16.2).

2.3.5 Welch’s T-Test

Welch’s T-test uses hypothesis testing to determine if two separate distributions with unequal variances have equal means. It is the underlying metric used by the Test Vector Leakage Assessment (TVLA) testing methodology initially developed by [69], which focuses on determining the resistance of a cryptographic module to leakage of information through the power side-channel. [70] thoroughly examines and extends the work of [69], with detailed applications of the T-test in higher-order settings. It is identified by [57] as a privacy metric to measure data similarity, and is used by [71] with gate-level power measurements in conjunction with electronic design automation tools. [72], however, cautions against the use of Welch’s T-Test as a standalone pass/fail metric in TVLA to assess whether a cryptographic implementation is safe.
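For small discrete distributions, the entropy-based metrics of Sections 2.3.2–2.3.3 can be computed directly. The following toy sketch uses invented distributions purely for illustration: a uniform 2-bit secret, and an attacker observation perfectly correlated with it.

```python
import math
from collections import Counter

def shannon_entropy(probs):
    """H(X) = -sum p*log2(p): average information of an outcome."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def guessing_entropy(probs):
    """Average number of guesses when candidates are tried in
    decreasing order of probability: G = sum_i i * p_(i)."""
    ordered = sorted(probs, reverse=True)
    return sum(i * p for i, p in enumerate(ordered, start=1))

def mutual_information(joint):
    """I(X;Y) from a joint distribution given as {(x, y): p}."""
    px, py = Counter(), Counter()
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# A uniform 2-bit secret: 2 bits of entropy, 2.5 expected guesses.
uniform = [0.25] * 4
print(shannon_entropy(uniform))     # 2.0
print(guessing_entropy(uniform))    # 2.5

# A perfectly correlated observation leaks the full 2 bits.
perfect = {(x, x): 0.25 for x in range(4)}
print(mutual_information(perfect))  # 2.0
```

An independent observation would yield zero mutual information, matching the intuition that it carries no leakage about the secret.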
2.3.6 Side-channel Vulnerability Factor (SVF)

SVF is a novel metric developed in [73–75] that measures the information leakage through a side-channel by determining the correlation between the actual cache activity trace (“oracle”) and the side-channel observations for memory side-channels.

2.3.7 Spatial Thermal Side-channel Factor (STSF)

The entropy-based (Section 2.3.2) STSF was developed in [62] as a two-dimensional metric to complement SVF (Section 2.3.6). STSF accounts for the temperature of function blocks as they correlate to secret information.

2.3.8 Cache Side-channel Vulnerability (CSV)

CSV constrains SVF (Section 2.3.6) by limiting its application to caches only, and assumes the strongest possible attacker in order to remove the ambiguity of differences between system vulnerabilities and attacker capabilities [76].

2.3.9 Signal Available to the Attacker (SAVAT)

SAVAT is an instruction-level metric that measures the signal made available through the side-channel as a result of a single instruction variation, and is determined by directly analyzing the variation between individual processor instructions [77].

2.3.10 Thermal-Security-in-Multi-Processors (TSMP)

TSMP was introduced by [78] as a metric to quantify the security of multiprocessors against a temperature side-channel attack. TSMP ranges from 0 (not secure) to 1 (more secure).

2.3.11 Maximal Leakage

Maximal Leakage is defined in [79] as the multiplicative increase in the likelihood of correctly guessing a randomized function of X after observing Y, maximized over all such functions. It quantifies the leakage of information from X to Y by measuring the difference between an informed guess (after observing Y) and a blind guess. Applications of maximal leakage are further explored in [80–82], and as a way to measure information gained from databases or social networks in [57].
[83] argues for the use of maximal leakage over mutual information or channel capacity in evaluating (timing) side-channels.

2.3.11.1 Maximal α-Leakage

When refining guesses, maximal α-leakage gives an adversary the ability to fine-tune the level of confidence of additional guesses, allowing for continuous adjustment between mutual information (α = 1, Section 2.3.3) and maximal leakage (α = ∞, Section 2.3.11) [84].

2.3.12 Information Leakage Rate

Information leakage rate is a novel metric introduced in [85] to evaluate the amount of information leaked through the electromagnetic side-channel and compare the quality of EM side-channel measurement systems.

2.3.13 Local Differential Privacy

The local differential privacy metric measures how indistinguishable two items of interest are from each other [57], and is considered a useful leakage metric when a system designer is extremely risk averse [81, 82].

2.3.14 Trust Coverage

Trust coverage is a framework that calls for the application of a variable weighted sum of three different coverage metrics (Functional Coverage, Structural Coverage, Asset Coverage) to quantify the trustworthiness of hardware at the gate level, as a response to the popularity of hardware trojans [86].

2.3.15 Holistic Assessment Criterion

The Holistic Assessment Criterion focuses on assessing the leakage of power side-channels, and is intended to improve on the TVLA’s T-test (Section 2.3.5) by focusing on a null hypothesis built on a well-founded definition of exploitable leakage [87].

2.3.16 Machine Learning Metrics

Recently, several metrics specific to the use of machine learning (ML) in side-channel analysis have been introduced, due to discrepancies between the traditional ML metrics of accuracy, precision, and recall and side-channel metrics. These efforts aim to design a metric that better suits ML, rather than increasing complexity by attempting to embed side-channel metrics in ML algorithms [88].
2.3.16.1 Perceived Information

[89] found that, for balanced data, using the Negative Log Likelihood (a.k.a. Cross Entropy) loss function during the training of deep neural nets is the equivalent of maximizing perceived information, which is the lower bound of mutual information (Section 2.3.3) between the observation and the leakage.

2.3.16.2 Cross Entropy Ratio

Cross entropy ratio, which is closely related to both Guessing Entropy (Section 2.3.2.2) and Success Rate (Section 2.3.4), is a useful metric to evaluate the performance of deep learning models for side-channel analysis [68].

Chapter 3: The Temperature Side-channel

3.1 Overview

There exists an array of literature related to the use of both offensive and defensive temperature methods to address the security of computer systems via the temperature side-channel.

3.2 Characteristics of the Temperature Side-channel

Temperature side-channel (TSC) leakage has a linear relationship to the leakage of the power side-channel (PSC), but with limitations due to the thermal properties of materials. Modeling dynamic temperature changes is accomplished with an RC-equivalent circuit with large (thermal) capacitance [90], which therefore behaves like a low-pass filter (LPF) with a cutoff frequency f_c = 1/(2πRC) in the low kHz range. Since modern processors operate in the GHz range, this low-pass filtering effect attenuates higher-frequency components of system activity, resulting in a delay in temperature relative to power (which changes virtually instantaneously). Temperature changes are slower both to increase and to decrease, which results in previous and current operations being superimposed at the sensor, so the leakage has an integrative effect. Advantages and disadvantages of the TSC are listed below [91–94].

Advantages of the TSC include:

1. Linear relationship to power.

2. Ease of access through the use of on-chip sensors, with some applications for infrared imaging.

3. Leakage can be measured internally or externally through temperature variations.
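The low-pass behavior described above can be illustrated numerically. In this sketch, the thermal resistance and capacitance values are arbitrary placeholders chosen only to put the cutoff in the low kHz range the text describes; they do not model any particular chip.

```python
import math

def cutoff_hz(r_thermal, c_thermal):
    """First-order RC low-pass cutoff: f_c = 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_thermal * c_thermal)

def gain(f_signal, f_c):
    """Magnitude response of a first-order LPF at frequency f_signal."""
    return 1.0 / math.sqrt(1.0 + (f_signal / f_c) ** 2)

# Placeholder thermal resistance (K/W) and capacitance (J/K).
f_c = cutoff_hz(r_thermal=1.0, c_thermal=1e-4)
print(f"cutoff ~ {f_c:.0f} Hz")            # roughly 1.6 kHz

# A 1 GHz component of processor activity is attenuated by about six
# orders of magnitude, which is why fast events are effectively
# invisible in the temperature trace.
print(f"gain at 1 GHz ~ {gain(1e9, f_c):.2e}")
```

This is exactly the integrative effect noted above: slow trends survive the filter while individual high-frequency operations are smeared together.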
The disadvantages of the TSC are not insignificant:

1. Low bandwidth due to LPF behavior, which attenuates the leakage of high-frequency computations, resulting in slow temperature changes.

2. Noise, since generated heat is superimposed both spatially and temporally.

3. Data collection is limited by the response time and resolution of the thermal sensor.

4. Any temperature offset varies over time, as no mechanism exists to control it directly (power, conversely, is regulated by Dynamic Voltage and Frequency Scaling).

3.3 Sensing the Thermal Channel

Temperature side-channel readings can be obtained through both internal and external means. This section provides an overview of commonly used internal and external sensing options, as well as the context in which these have been used in the literature. The majority of works have used internal on-chip sensors to monitor the temperature side-channel, with the ring oscillator being a popular internal implementation on FPGAs. External sensors were much less common, with only a small number of works using an external temperature sensor or infrared (IR) camera to capture temperature traces. Across these categories, limited success was found with direct IR image captures, with pre-processing required to adapt temperature readings for analysis through traditional power methods.

3.3.1 Internal Temperature Sensing Methods

Internal temperature readings are acquired either through the internal temperature sensor of each processor core or, when using re-programmable devices, through the use of a specially designed circuit (a ring oscillator).

3.3.1.1 Internal Core Sensor

Since maintaining the proper operating temperature range is vital to the performance and lifetime of a chip, cores contain software-readable temperature sensors to monitor temperature and adjust the speed of operations if necessary.
These sensors provide an easy source of temperature information if one has access to them, either through physical access to the device or through the use of a malicious program to report sensor readings. Resolution of this side-channel source is limited by the number of sensors on the device and their placement, as well as by the frequency of temperature readings [95]. The vast majority of existing literature utilizes on-chip sensors, much of it leveraging the “HotSpot” modeling software [96] to simulate sensor temperatures for a variety of applications. These applications include covert communications [97, 98], physical temperature attacks [99], thermal modeling [62, 93, 100, 101], defense against TSCAs [102], and detection of hardware trojans [103, 104]. Experimental results using on-chip sensors are limited to applications in covert communications [105, 106] and TSCA defense [78, 107, 108].

3.3.1.2 Ring Oscillator

A ring oscillator is a circuit that consists of an odd number of inverters in a feedback loop. When the device is enabled, the signal through the inverters oscillates and generates heat. Each inverter contributes to the delay of the signal, which decreases the frequency of the oscillating signal. Ring oscillators can also be used to detect temperature changes by correlating the output frequency drift to degrees Celsius or Fahrenheit. A number of works have explored the ring oscillator as a temperature sensor on FPGAs [109–117]. Ring oscillators are useful in that context due to their ease of implementation, their dynamic nature (they can be added, moved, and removed as desired), their ability to measure junction temperature (in lieu of package temperature, like other on-chip sensors), and the fact that they can be placed as desired (e.g. in an array in order to obtain a thermal map of the die). In the context of temperature side-channels they are used almost exclusively for covert communications applications [112, 118–120].
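The frequency and drift relationships described above can be sketched as a back-of-the-envelope calculation. The per-stage delay, temperature coefficient, and reference temperature below are invented placeholder values used only to illustrate the arithmetic.

```python
def ro_frequency(stages, stage_delay_s):
    """Oscillation frequency of an N-stage ring oscillator: the signal
    must traverse the inverter loop twice per period, so
    f = 1 / (2 * N * t_d)."""
    return 1.0 / (2.0 * stages * stage_delay_s)

def temperature_from_drift(f_measured, f_ref, k_hz_per_c, t_ref_c):
    """Invert a simple linear drift model, f = f_ref - k*(T - T_ref),
    to read temperature back from a measured frequency."""
    return t_ref_c + (f_ref - f_measured) / k_hz_per_c

# 5 inverters at a hypothetical 100 ps per stage -> 1 GHz oscillation.
f_ref = ro_frequency(stages=5, stage_delay_s=100e-12)
print(f"{f_ref / 1e9:.1f} GHz")

# If frequency drops 1 MHz per degree C above a 25 C reference, a
# reading 10 MHz low implies the die is at 35 C.
print(temperature_from_drift(f_ref - 10e6, f_ref, 1e6, 25.0))
```

The linear drift model is the simplest possible calibration; real ring-oscillator sensors are characterized empirically against a reference thermometer.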
3.3.2 External Temperature Sensing Methods

External methods to sense temperature include infrared imaging, temperature sensors, and, occasionally, fan speeds. Each method is described below.

3.3.2.1 Infrared Camera

Hot objects radiate electromagnetic infrared (IR) frequencies, and the intensity of the radiation depends on the object’s temperature. A thermal camera uses an array of infrared sensors to measure this electromagnetic radiation and then converts the measurement to an image showing the temperature of different objects as colors. Several factors determine the quality of a thermal image, but the most pertinent to this work are resolution, temperature range, and thermal sensitivity.

Resolution. The resolution of the sensor is the number of sensitive elements (“pixels”) that make up the sensor. Since infrared radiation has longer wavelengths than visible light, the sensitive elements of infrared cameras are larger than those of traditional cameras, so thermal cameras have fewer pixels and lower resolution overall. Thermal cameras with more pixels produce higher-quality images.

Temperature Range. The temperature range of a given camera is the range of temperatures, from lowest to highest, that the camera’s thermal sensor (microbolometer) is capable of measuring. Some cameras have multiple temperature range options, which need to be chosen based on the temperature of the object being measured.

Thermal Sensitivity. Thermal sensitivity, also called Noise Equivalent Temperature Difference (NETD), is the smallest temperature difference a microbolometer can detect in the presence of electronic circuit noise. Lower numbers indicate greater ease in distinguishing subtle temperature variations.

Several works propose that this would be an “ideal” way to sense temperature because it creates an air gap between the monitoring equipment and the system being monitored (thus more secure), and does not require access to the on-chip sensors.
Additionally, IR imaging has the potential to provide better resolution than a single individual on-chip sensor, as well as two-dimensional thermal/spatial correlation information. This method also eliminates monitoring overhead on the system, which is highly advantageous in resource- and power-constrained devices [101, 104, 107].

To date, there has only been limited success in applying infrared imaging to the temperature side-channel, and primarily in processing data for analysis using power methods. Cochran et al. presented a “thermal-to-power-inversion” method to estimate spatial power using a thermal camera collecting emissions from the back of a silicon die, compensating for challenges associated with the spatial LPF effect of heat diffusion [91]. Meanwhile, Reda et al. solved the “Blind Power Identification” (BPI) problem, which uses only the chip’s total power measurements and (internal or external) thermal sensor measurements to find the thermal model and fine-grain power consumption of the chip, without requiring knowledge of the thermal power model to identify sources [100]. Reda’s BPI technique was evaluated using simulation software, real multi-core embedded sensors, and an IR camera, but the camera application was limited to a test chip with a 10x10 grid of microheaters for proof-of-concept validation. Werner et al. created a thermal modeling framework for accelerator-rich architectures, which estimates the power consumption profile of an IC by solving the inverse of the heat transfer equation using information gathered from the thermal side-channel; this required images to be pre-processed to account for the spatial LPF effect due to thermal diffusion [101].

3.3.2.2 External Sensor

Although less frequent, some works placed an external temperature sensor on a depackaged chip [92, 121], while others leveraged sensors embedded in a development board [94, 122].
3.3.2.3 Fan Speeds

Fan speeds, which have a strong correlation to board temperature, were occasionally considered as an acoustic and thermal side-channel. [107] leveraged the CPU fan’s acoustic emissions in combination with CPU core temperature for a more robust embedded system monitoring capability. From a covert communications standpoint, [118] posited that since the fan’s angular speed is software readable, it could be suitable to indicate temperature, and [105] considered the fan speed and its impact on implementing a bi-directional communications channel.

3.4 Temperature Attacks

The temperature side-channel is an ongoing area of investigation and is increasingly being considered as a vector for many types of temperature attacks; however, not all temperature attacks are side-channel attacks. “Temperature