ABSTRACT Title of Dissertation: DESIGN AND OPTIMIZATION OF 5G-&-BEYOND HYBRID COMMUNICATIONS SYSTEMS Nariman Torkzaban Doctor of Philosophy, 2023 Dissertation Directed by: Professor John S. Baras Department of Electrical and Computer Engineering 5G and beyond communication systems are envisaged to fulfill three key promises that enable novel use cases and applications such as telemedicine, augmented reality/virtual reality (AR/VR), smart manufacturing, autonomous vehicles (AVs), etc. These three key promises are i) Enhanced mobile broadband (eMBB), ii) Ultra-reliable low latency Communications (URLLC), and iii) Massive machine-type communications (mMTC). In other words, 5G is required to achieve key performance indicators (KPIs) in terms of low latency, massive device connectivity, consistent quality of service (QoS), and high security. For instance, user bit-rates up to 10 Gbps and round-trip times (RTTs) as small as 1–10 ms are demanded in specific application scenarios in 5G. Toward achieving the 5G key promises, it is essential to utilize the capacity of all sorts of communications networks (terrestrial, space, aerial) and supporting technologies (SDN, NFV, etc.) simultaneously, leading to the so-called hybrid communication networks as opposed to the traditional stand-alone ones. This signifies the importance of a seamless integration and configuration policy tailored to specific use cases and QoS requirements of 5G and beyond services and will spawn several challenging design and optimization problems from the control and management to the physical layer of next-generation systems. In this thesis, we will address such critical problems in the course of 9 chapters. In the second chapter, we study the benefits of incorporating trust into decision-making for resource provisioning in next-generation communications networks. In this regard, we study the trust-aware service chain embedding problem for enhancing the reliability of virtual network function (VNF) placement on the trusted infrastructure. The problem of placing the VNFs on the NFV infrastructure (NFVI) and establishing the routing paths between them, according to the service chain template, is termed SFC embedding. The objectives and constraints for the optimization problem formulation of SFC embedding may vary depending on the corresponding network service. We introduce the notion of trustworthiness as a measure of security in SFC embedding and thus network service deployment. We formulate the resulting trust-aware SFC embedding problem as a Mixed Integer Linear Program (MILP). We relax the integer constraints to reduce the time complexity of the MILP formulation and obtain a Linear Program (LP). We investigate the trade-offs among the two formulations, seeking to strike a balance between results accuracy and time complexity. The space-air-ground integrated network (SAGIN) offers potential benefits that are not possible otherwise, including global coverage, low latency, and high reliability. On the other hand, the heterogeneity of the integrated network with non-unified interfaces, and the diversity of 5G use cases with large-scale applications highlight the need for a unified management structure and a dynamic resource allocation policy that are both scalable and flexible enough to handle the increasing complexity. In the third chapter, on one hand, we optimize the integration of the hybrid network by deployment of satellite gateways on the ground segment of the network to ensure proper connection between the layers with minimum latency, and on the other hand, we aim at providing a seamless management and control scheme for the hybrid network utilizing the capacities of the supportive technologies, software-defined networking (SDN) and network function virtualization (NFV); In particular, we study the problem of SDN controller placement with the goal maximizing the reliability of the hybrid network. In the fourth chapter, we propose trust as a metric to measure the trustworthiness of the FL agents and thereby enhance the security of the FL training. We first elaborate on trust as a security metric by presenting a mathematical framework for trust computation and aggregation within a multi-agent system. We then discuss how this framework can be incorporated within an FL setup introducing the trusted FL algorithm for both centralized and decentralized FL. Next, we propose a framework for decentralized FL in UAV-enabled networks which involves the placement of the UAVs while ensuring the connectivity of the network of deployed UAVs. We dedicate the remaining chapters to studying the novel design problems and the key technologies for the physical layer of next-generation wireless systems with an emphasis on millimeter-wave communications, massive MIMO, and hybrid beamforming. We introduce a novel antenna configuration called twin-ULA (TULA) and its composite configurations to generate sharp beams with maximal and uniform gain. We introduce a novel beam alignment technique to maximize the utility of transmission in the presence of multipath, efficiently utilize reconfigurable intelligent surfaces (RIS) to enhance mmWave coverage in urban environments, and synchronize and calibrate in distributed massive MIMO networks for 6G systems, where the synchronization involves the carrier frequency offset estimation and compensation, and the calibration involves mitigating reciprocity mismatches in digital and analog RF chains of the access points (APs) implementing hybrid beamforming, enabling efficient downlink channel estimation. DESIGN AND OPTIMIZATION OF 5G AND BEYOND HYBRID COMMUNICATIONS SYSTEMS by Nariman Torkzaban Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy 2023 Advisory Committee: Professor John S. Baras, Chair/Advisor Dr. Mohammad A. (Amir) Khojastepour Professor Richard J. La Professor Behtash Babadi Professor Bruce L. Golden © Copyright by Nariman Torkzaban 2023 Dedication To my best and oldest friends, Shahzad and Behzad. ii Acknowledgments First and foremost, I would like to express my gratitude to my advisor, Professor John S. Baras, for granting me the opportunity to delve into some of the most interesting, challenging, and impactful research fields of our time. It has been a pleasure to work with and learn from such an extraordinary individual who possesses a brilliant mind and a big heart. I would also like to extend my gratitude to Professor Richard J. La, Professor Behtash Babadi, Dr. Mohammad A. (Amir) Khojastepour, and Professor Bruce L. Golden for serving on my dissertation committee. Their feedback during various stages has been instrumental in the improvement and completion of this work. I am greatly thankful to Mrs. Kim Edwards, for her generous and timely assistance with the administrative aspects of my work. I wish to emphasize that my dissertation work would not have reached its current level of excellence without the exceptional mentorship of Dr. Khojastepour during my time as an intern at NEC Laboratories, America, Inc., and long beyond that. He tirelessly spent hours with me almost every day for more than 1.5 years in various stages of my work, from initial brainstorming sessions to tackling complex theorems. The technical rigor evident in the final five chapters of this thesis owes a significant debt to his guidance. Furthermore, I extend my gratitude to my mentor, Yubing Jian, and my manager, Arnaud Meylan, during my internship at Qualcomm Inc. They provided invaluable insights and profound knowledge, equipping me with the tools to succeed in a thriving corporate environment. iii My colleagues and lab mates at the university have enriched my graduate life in many ways and deserve a special mention. To begin with, I am immensely thankful to Professor Chrysa Papagianni for her invaluable mentorship. Her extensive expertise, profound wisdom, and instructive feedback have played a crucial role in shaping my journey along this path. I would also like to thank my old officemates, Siddharth Bansal, Omkar Ninawe, Sandeep Damera, Dr. Usman Amin Fiaz, Dr. Fatima Alimardani, Dr. Christos Mavridis, Dr. Nilesh Suriyarachchi, Dr. Faizan M. Tariq, and Dr. Leda Apergi who made the office an enjoyable place to work. Among all, Asim Zoulkarni deserves a special mention. He has been not only a great colleague but also a true friend to me, especially towards the end of my time at the University. I would also like to extend my heartfelt thanks to Hamed S. (Abdollah) and Parisa N., as well as Farshid S. and Sorour S. for providing me with the chance to cherish the wonderful gift of their friendship. Of course, my life-long friends Soroush G. and his wife Mojan J., Soroush O., and Erfan B. should always be acknowledged as part of anything I may accomplish. I owe my deep thanks to my cousin Negar, who was always supportive of me as family and did not let me feel that I was away from home. Words cannot express the gratitude I owe to my lovely family- my beautiful mother, and my kind father- who have always stood by me, and have pulled me through against impossible odds at times even from a far distance. Last but not least, I would like to express my deepest gratitude to my long-time partner, Anousheh, who was also my senior in our research group. Apart from the insightful discussions with her, the results of which will follow in this dissertation, her unwavering support, love, and forgiveness have been my rock throughout this journey. Her encouragement, understanding, and belief in me were my pillars of strength, and I am incredibly fortunate to have her by my side. She has truly been at the heart of my every moment and every endeavor, and I look forward to a iv future filled with more adventures and accomplishments together. This dissertation is based on research supported by the Office of Naval Research (ONR) under grant N00014-17-1-262, Defense Advanced Research Projects Agency (DARPA) under Agreement No. HR00111990027, and grants by Leidos Corporation, and Lockheed Martin Corporation. It is impossible to remember all, and I apologize to those I have inadvertently left out. v Table of Contents Preface ii Acknowledgements iii Table of Contents vi List of Tables x List of Figures xi List of Abbreviations xiii Chapter 1: Introduction 1 1.1 Trust-aware Resource Provisioning . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Space-Air-Ground Integrated Networks . . . . . . . . . . . . . . . . . . . . . . 4 1.3 Cell-free Massive MIMO Systems . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.4 Millimeter-wave Communications . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.5 Reconfigurable Intelligent Surfaces . . . . . . . . . . . . . . . . . . . . . . . . . 9 1.6 Contributions and Organization of the Dissertation . . . . . . . . . . . . . . . . . 10 Chapter 2: Trust-aware Service Chain Embedding 16 2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 2.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 2.3 Models and Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 2.3.1 Trust Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 2.3.2 Network Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.4 Problem Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 2.4.1 Trust-aware SFC Embedding . . . . . . . . . . . . . . . . . . . . . . . . 26 2.4.2 Path-based Trust-aware SFC Embedding . . . . . . . . . . . . . . . . . . 27 2.5 Link-based Trust-aware Service Chain Embedding . . . . . . . . . . . . . . . . . 28 2.5.1 MILP Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 2.5.2 LP Relaxation and Rounding Algorithm . . . . . . . . . . . . . . . . . . 31 2.6 Path-based Trust-aware Service Chain Embedding . . . . . . . . . . . . . . . . . 33 2.6.1 Mixed Integer Linear Programming Formulation . . . . . . . . . . . . . 33 2.6.2 Column Generation Method . . . . . . . . . . . . . . . . . . . . . . . . 36 2.6.3 Column-Generation-based Solution . . . . . . . . . . . . . . . . . . . . 37 2.7 Link-based Model Performance Evaluation . . . . . . . . . . . . . . . . . . . . . 41 vi 2.7.1 Performance Evaluation Setup . . . . . . . . . . . . . . . . . . . . . . . 41 2.7.2 Evaluation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 2.8 Path-based Model Performance Evaluation . . . . . . . . . . . . . . . . . . . . . 48 2.8.1 Evaluation Scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 2.8.2 Evaluation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 2.9 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 Chapter 3: Ground Segment Optimization in Space-Ground Hybrid Networks 55 3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 3.2 Joint Satellite Gateway Placement and Routing for Integrated Satellite-Terrestrial Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 3.2.1 Network Model and Problem Description . . . . . . . . . . . . . . . . . 59 3.2.2 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 3.2.3 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . 66 3.3 Joint Satellite Gateway & SDN Controller Placement in SDN-enabled 5G-Satellite Hybrid Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 3.3.1 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 3.3.2 Optimization Model & Solution Approach . . . . . . . . . . . . . . . . . 75 3.3.3 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . 96 3.4 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 3.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105 Chapter 4: Trust-aware Federated Learning: A Multi-agent UAV-enabled Networks Scenario 107 4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 4.2 Trust Aggregation Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 4.2.1 Trust Aggregation Framework . . . . . . . . . . . . . . . . . . . . . . . 111 4.2.2 Local Trust Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 4.2.3 Global Trust Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 4.3 Trust-aware Federated Learning . . . . . . . . . . . . . . . . . . . . . . . . . . 115 4.3.1 Trust-aware Centralized Federated Learning . . . . . . . . . . . . . . . . 116 4.3.2 Trust-aware Decentralized Federated Learning . . . . . . . . . . . . . . . 120 4.3.3 Trust Evaluation Method . . . . . . . . . . . . . . . . . . . . . . . . . . 120 4.4 UAV Deployment for Decentralized Federated Learning . . . . . . . . . . . . . . 123 4.4.1 Approximate UP-DeFeL Model . . . . . . . . . . . . . . . . . . . . . . 124 4.4.2 Deterministic Annealing . . . . . . . . . . . . . . . . . . . . . . . . . . 125 4.4.3 DA solution for A-UP-DeFeL . . . . . . . . . . . . . . . . . . . . . . . 126 4.5 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 4.5.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 4.5.2 Trusted Decentralized FL: Numerical Results . . . . . . . . . . . . . . . 130 Chapter 5: Channel Reciprocity Calibration for Hybrid Beamforming in Cell-free Massive MIMO Systems 134 5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 5.2 System model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 vii 5.2.1 Beamforming Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 5.2.2 Channel Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 5.3 Reciprocity Calibration between two nodes . . . . . . . . . . . . . . . . . . . . 141 5.3.1 Digital Chain Reciprocity Calibration . . . . . . . . . . . . . . . . . . . 143 5.3.2 Analog Chain Reciprocity Calibration . . . . . . . . . . . . . . . . . . . 145 5.4 Reciprocity-based DL Channel Estimation . . . . . . . . . . . . . . . . . . . . . 149 5.4.1 Downlink Channel Estimation with a Single AP . . . . . . . . . . . . . . 149 5.4.2 Downlink Channel Estimation with Multiple APs . . . . . . . . . . . . . 151 5.5 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154 5.5.1 Simulation Setup and Parameters . . . . . . . . . . . . . . . . . . . . . . 154 5.5.2 Reciprocity Calibration between Two nodes . . . . . . . . . . . . . . . . 156 5.5.3 Co-operative Zero-forcing Beamforming (ZFBF) . . . . . . . . . . . . . 158 Chapter 6: Codebook Design for mm-wave Beamforming in 5G and Beyond 160 6.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160 6.2 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 6.2.1 Channel Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165 6.2.2 Beamforming Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165 6.2.3 Antenna Array Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 6.3 Single-beam Codebook Design Problem Formulation . . . . . . . . . . . . . . . 170 6.4 Proposed Single-beam Codebook Design Method . . . . . . . . . . . . . . . . . 174 6.4.1 Single-beam Codebook Design under ULA Setting . . . . . . . . . . . . 174 6.4.2 Single-beam Codebook Design under TULA Setting . . . . . . . . . . . 177 6.5 Composite Codebook Design Problem Formulation . . . . . . . . . . . . . . . . 179 6.6 Proposed Composite Codebook Design Method . . . . . . . . . . . . . . . . . . 182 6.6.1 Composite Codebook Design under ULA Setting . . . . . . . . . . . . . 183 6.6.2 Composite Codebook Design under TULA Setting . . . . . . . . . . . . 185 6.7 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 6.7.1 Single-beam Codebook Design Problem Numerical Results . . . . . . . . 189 6.7.2 Composite Codebook Design Problem Numerical Results . . . . . . . . . 190 6.8 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195 Chapter 7: Multi-user Beam Alignment for 5G and Beyond 196 7.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196 7.2 System model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198 7.2.1 Channel Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199 7.2.2 Beamforming Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199 7.2.3 Time-slotted System Model . . . . . . . . . . . . . . . . . . . . . . . . . 200 7.2.4 Beam Alignment Model . . . . . . . . . . . . . . . . . . . . . . . . . . 200 7.3 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201 7.3.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201 7.3.2 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204 7.4 Proposed Beam Alignment Scheme . . . . . . . . . . . . . . . . . . . . . . . . . 206 7.5 Trade-off Curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210 7.5.1 Trade-off curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211 viii 7.5.2 Analytical Results for Uniform Distribution . . . . . . . . . . . . . . . . 214 7.6 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216 7.6.1 Trade-off Curve Numerical Analysis . . . . . . . . . . . . . . . . . . . . 221 7.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222 Chapter 8: RIS-aided mm-Wave Beam-forming for Two-way Communications of Multiple Pairs 224 8.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224 8.1.1 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225 8.1.2 Main contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227 8.1.3 Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229 8.2 System model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230 8.2.1 Channel model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230 8.2.2 RIS model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233 8.3 General Properties of RIS as beamformer . . . . . . . . . . . . . . . . . . . . . 235 8.4 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237 8.4.1 Problem description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238 8.4.2 Transformation between MTMR and STMR . . . . . . . . . . . . . . . . 240 8.4.3 Relationship between RIS-UPA and UPA-antenna beam-forming . . . . . 241 8.4.4 Multi-beamforming design problem formulation . . . . . . . . . . . . . . 243 8.5 Proposed Multi-beamforming Design Solution . . . . . . . . . . . . . . . . . . . 247 8.6 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251 8.6.1 Multibeam design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251 8.6.2 Comparison of multibeam and single beam . . . . . . . . . . . . . . . . 253 8.6.3 Two-way multi-link communications . . . . . . . . . . . . . . . . . . . 254 8.6.4 Beams with arbitrary shape (footprint) . . . . . . . . . . . . . . . . . . . 255 8.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256 Chapter 9: Blind Cyclic Prefix-based Carrier Frequency Offset Estimation in MIMO- OFDM Systems 257 9.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257 9.2 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260 9.3 Proposed CFO Estimation Technique . . . . . . . . . . . . . . . . . . . . . . . . 261 9.3.1 Coarse Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263 9.3.2 Fine Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264 9.3.3 Derivation of the Cramer-Rao Bound . . . . . . . . . . . . . . . . . . . . 267 9.4 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269 9.4.1 Simulation Setup and Parameters . . . . . . . . . . . . . . . . . . . . . . 269 9.4.2 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271 9.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272 Appendix A: Proof of Theorem (21) 274 Bibliography 282 ix List of Tables 3.1 System model parameters and variables . . . . . . . . . . . . . . . . . . . . . . 60 3.2 Summary of the studied topologies . . . . . . . . . . . . . . . . . . . . . . . . . 67 3.3 System model parameters and variables . . . . . . . . . . . . . . . . . . . . . . 75 3.4 Network Topology Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 3.5 Failure Probability Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 8.1 Frequently-used parameters and variables . . . . . . . . . . . . . . . . . . . . . 230 x List of Figures 2.1 Trust-aware SFC Embedding Network Models . . . . . . . . . . . . . . . . . . . 23 2.2 Example of trust-based SFC embedding. . . . . . . . . . . . . . . . . . . . . . . 27 2.3 Initial Dummy Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 2.4 Link-based Model Experiment A Numerical Results . . . . . . . . . . . . . . . . 44 2.5 Link-based Model Experiment B Numerical Results . . . . . . . . . . . . . . . . 45 2.6 Path-based Model Experiment A Numerical Results . . . . . . . . . . . . . . . . 50 2.7 Path-based Model Experiment B Numerical Results . . . . . . . . . . . . . . . . 52 3.1 Space-Air-Ground Integrated Network (SAGIN) . . . . . . . . . . . . . . . . . . 57 3.2 Experiment A - Approximation vs. Exact Method . . . . . . . . . . . . . . . . . 69 3.3 Experiment B - The Impact of Load Minimization . . . . . . . . . . . . . . . . . 70 3.4 Trade-off between the synchronization cost and average controller-to-node latency 82 3.5 Exp A. Objective function & average node-to-gateway latency . . . . . . . . . . 99 3.6 Exp A. Impact of α on the gateway deployment policy . . . . . . . . . . . . . . 99 3.7 Exp B: Average node-to-gateway reliability . . . . . . . . . . . . . . . . . . . . 100 3.8 Exp C. Jointly minimizing the synch. cost and the average node-to-controller latency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 3.9 Exp D. Average node-to-controller reliability . . . . . . . . . . . . . . . . . . . 102 4.1 Centralized FL vs. Decentralized FL . . . . . . . . . . . . . . . . . . . . . . . . 108 4.2 Trust aggregation framework in (a) decentralized and (b) centralized regimes . . . 115 4.3 Evaluation results on MNIST. Top: mean training loss; bottom: testing accuracy . 129 4.5 Comparing different ML techniques validation loss for the MNIST task . . . . . 132 4.6 Impact of the number of allowed per-device neighbors in each communication round . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 5.1 Hybrid Beamforming System Model . . . . . . . . . . . . . . . . . . . . . . . . 138 5.2 Reciprocity Calibration Normalized MSE (NMSE) . . . . . . . . . . . . . . . . 154 5.3 Channel Reciprocity Calibration Scheme Performance . . . . . . . . . . . . . . . 155 5.4 Cal. MSE vs. Noise Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . 157 5.5 Sum Rate under varying K . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157 5.6 Sum Rate under varying U . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157 5.7 Sum Rate Performance under varying Mismatch with K = U = 2 . . . . . . . . 157 6.1 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 xi 6.2 Example of the Codebook Design Problem Settings . . . . . . . . . . . . . . . . 167 6.3 Antenna Array Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 6.4 ULA patterns, Mt = 32, Card(C) = 8, (a)(c) Fully-digital, (b)(d) Hybrid. . . . . 187 6.5 TULA-based patterns, (a)(d) Fully-digital,Card(C) = 16, (b) Hybrid,Card(C) = 16, (c) Fully-digital, Card(C) = 12. . . . . . . . . . . . . . . . . . . . . . . . . 188 6.6 Fully-digital, ULA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 6.7 Fully-digital, TULA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 6.8 Beam quality vs. λb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 6.9 Beam quality vs. η . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 6.10 Single-beam shape for varying η . . . . . . . . . . . . . . . . . . . . . . . . . . 192 6.11 Effect of quantization on hybrid beamforming using TULA . . . . . . . . . . . . 193 7.1 Time-slotted System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200 7.2 Example of a Tulip design for b = 5 . . . . . . . . . . . . . . . . . . . . . . . . 210 7.3 p = 2, b = 5, N = 1000, Uniform PDF . . . . . . . . . . . . . . . . . . . . . . . 217 7.4 p = 2, b = 5, N = 1000, Uniform PDF . . . . . . . . . . . . . . . . . . . . . . . 217 7.5 p = 2, b = 5, N = 1000, Cut-Normal PDF . . . . . . . . . . . . . . . . . . . . . 218 7.6 Impact of varying the size of SB set on the BA solution . . . . . . . . . . . . . . 218 7.7 Theoretical curve for Uniform dist. . . . . . . . . . . . . . . . . . . . . . . . . . 221 7.8 Empirical curves for general dist. . . . . . . . . . . . . . . . . . . . . . . . . . . 221 8.1 System model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231 8.2 RIS-enabled two-way communications . . . . . . . . . . . . . . . . . . . . . . . 237 8.3 Transforming two-way MTMR to STMR communications . . . . . . . . . . . . 240 8.4 RIS-UPA beam patterns for multi-beamforming settings . . . . . . . . . . . . . . 250 8.5 RIS-UPA beam patterns for MTMR settings . . . . . . . . . . . . . . . . . . . . 252 8.6 Effect of resolution on the beam quality . . . . . . . . . . . . . . . . . . . . . . 252 9.1 Coarse vs. fine CFO estimation with M antenna elements and K OFDM symbols 269 9.2 Time vs. Antenna Diversity (K,M) = (64, 1), (1, 64) . . . . . . . . . . . . . . . 270 9.3 Impact of K, for M = 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270 9.4 Impact of M , for K = 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270 9.5 Evaluation of combining time and antenna diversity for various K, and M . . . . 271 xii List of Abbreviations 5G Fifth Generation (of wireless communication) 6G Sixth Generation (of wireless communication) ACI Angular Coverage Interval ACK Acknowledgment ADC Analog-to-Digital Converter AGC Automatic Gain Control AI Artificial Intelligence AoA Angle of Arrival AoD Angle of Departure AP Access Point AWGN Additive White Gaussian Noise BBU Baseband Beamforming Unit BA Beam Alignment BER Bit Error Rate BF Beamforming BS Base Station BW Bandwidth CFO Carrier Frequency Offset CIR Channel Impulse Response CDF Cumulative Distribution Function CDNs Content Delivery Networks CQI Channel Quality Indicator CoMP Coordinated Multi-Point Transmission CP Cyclic Prefix CPS Cyber-Physical Systems CRB Cramer-Rao Bound CPU Central Processing Unit CS Central Server CSI Channel State Information DAC/ADC Digital-to-Analog Converter / Analog-to-Digital Converter DA Deterministic Annealing DC Data Center DFL Decentralized Federated Learning DFT Discrete Fourier Transform xiii DL Deep Learning EPC Evolved Packet Core ETSI European Telecommunications Standards Institute EWMA Exponential Weighted Moving Average Fed-Avg Federated Averaging FFT Fast Fourier Transform FL Federated Learning FW Firewall Gbps Gigabits per second GES Generalized Exhaustive Search GEO Geosynchronous Equatorial Orbit GS Gap Subcarriers HBF Hybrid Beamforming Hz Hertz ICI Inter-Carrier Interference IDS Intrusion Detection Systems I-BA Interactive Beam Alignment IID Independent and Identically Distributed IoT Internet of Things InP Infrastructure Provider ISTN Integrated Satellite-Terrestrial Network ITU International Telecommunication Union ISLs Inter-Satellite Links JGCP Joint Gateway and Controller Placement KPB-SCE K-Shortest Path-Based Service Chain Embedding KPIs Key Performance Indicators LCMV Linearly Constrained Minimum Variance LC-QAM Low Complexity Quadrature Amplitude Modulation LEO Low Earth Orbit LMMSE Linear Minimum Mean Squared Error LP Linear Programming LTE Long-Term Evolution MCF Multi-Commodity Flow Allocation MIMO Multiple-Input, Multiple-Output MMSE Minimum Mean Square Error MU Mobile User MU-MIMO Multi-User Multiple Input Multiple Output NACK Negative Acknowledgment NAT Network Address Translation NFs Network Functions xiv NFV Network Function Virtualization NI-BA Non-Interactive Beam Alignment NMSE Normalized Mean-Square Error NP Non-deterministic Polynomial NSF National Science Foundation NS Network Slice NS Null Subcarriers NLOS Non-Line-of-Sight OFDM Orthogonal Frequency-Division Multiplexing OMP Orthogonal Matching Pursuit PA/LNA Power Amplifier / Low-Noise Amplifier PAPR Peak-to-Average Power Ratio PB-SCE Path-Based Service Chain Embedding PB-TASCE Path-Based Trust-Aware Service Chain Embedding PDF Probability Density Function PDP Power Delay Profile PSN Phase Shifter Network QoS Quality of Service RF Radio Frequency RIS Reconfigurable Intelligent Surface SAGINs Satellite-Aided Ground Integrated Networks SA Simulated Annealing SB Scanning Beam SFO Sampling Frequency Offset SER Symbol Error Rate SINR Signal-to-Interference-plus-Noise Ratio SNR Signal-to-Noise Ratio SISO Single-Input Single-Output SVD Singular Value Decomposition TB Transmit Beam TDD Time-Division Duplex TULA Twin ULA UAV Unmanned Aerial Vehicle ULA Uniform Linear Array UPA Uniform Planar Array UP-DeFeL UAV Deployment for Decentralized Federated Learning UR Uncertainty Region UE User Equipment UL-DL Uplink-Downlink ULA Uniform Linear Array xv VNE Virtual Network Embedding VNF Virtual Network Function VNF-FG VNF Forwarding Graph ZFBF Zero-Forcing Beamforming xvi Chapter 1: Introduction In a world where information drives our interconnected society, the ongoing development of communication systems plays a crucial role in technological advancement. This exploration delves into the intricacies of 5G, which is already revolutionizing the way we communicate, and extends into the promising domains of 6G and beyond. The interdisciplinary nature of the problems that arise as communication systems evolve, has attracted researchers from various fields including but not limited to Mathematics, Control and Systems Theory, Computer Science, and Machine Learning to address the multifaceted challenges in the design and optimization of next-generation networks. In this dissertation, we develop a set of tools, theories, and algorithms to model, design, and optimize the next-generation communication systems, that lie in the interse- ction of aforementioned fields. In this chapter, we first introduce some fundamental concepts that are essential to this thesis’ foundation and then proceed with presenting the contributions and organization of the chapters. 1.1 Trust-aware Resource Provisioning Given the escalating intricacy characterizing modern cyber-physical systems (CPS), the imperative to formulate an innovative framework for modeling, analyzing, and predicting their behaviors has become increasingly evident. This endeavor assumes heightened significance in 1 light of recent advancements in the Internet of Things (IoT), coupled with the promises of 5G to facilitate extensive machine-to-machine (M2M) communications. In this evolving landscape, tightly coupled next-generation CPS devices are poised to engage in collaborative efforts, levera- ging sophisticated sensing, computing, and communication capabilities on an amplified scale enabling them to realize a wide range of applications and use cases, involving data collection, processing, and decision-making, from healthcare, vehicular networks, and smart manufacturing, to 5G service provisioning, and content delivery. All these applications heavily rely on the constant exchange of collected raw data and processed information between the collaborating agents, as opposed to the traditional case where data were collected and processed at a centralized entity. Therefore, with the heterogeneity and the large scale of the CPS, as well as the paramount importance of devising a seamless management and control scheme dealing with privacy and security threats becomes a pivotal concern. The fact that the information is crowd-sourced by the CPS agents, to a large extent, elimina- tes the risk of the existence of a single point of failure and contributes to the resilience of the network, but at the same time demonstrates the need to establish trust relationships between the agents that are exchanging information. More specifically, apart from ensuring the security of communications between the network agents, it is essential to answer the following questions: (i) whether an agent refuses to share its information with other agents due to privacy concerns or conflict of interest. (ii) whether an agent manipulates the received data before processing. (iii) whether an agent intentionally or unintentionally, shares incorrect information with the rest of the network? etc.; In other words, it is essential to establish to what extent each agent of the network can be trusted. Clearly, such mechanisms of trust contribute essentially to the resilience of networked cyber-physical systems (Net-CPS). 2 Within the context of Net-CPS, we interpret trust as a relation between network entities that may interact or collaborate in groups toward achieving various goals. These relations are set up and updated based on the evidence generated from the previous collaboration of the agents within a protocol. Suppose the collaboration has been contributive towards the achievement of a specific goal (positive evidence). In that case, the parties accumulate their trust perspective towards one another, and otherwise (negative evidence), trust will decrease between them. Trust estimates have input to decisions such as access control, resource allocation, agent participation, and so on. The method by which trust is computed and aggregated within the network may depend on the specific application, however, we enumerate the central differences in the terminology of how the trust computation and aggregation are employed: Centralized vs. Decentralized: Under the centralized regime, all the network entities rely on a central trusted party that estimates the trustworthiness level of each entity and updates all the network nodes. In this sense, all the nodes are forced to agree on the degree to which each entity is trusted as dictated by the central provider. On the other hand, under the decentralized approach, each user is responsible for calculating its opinion on the level of trustworthiness for each entity it might be interested in. This distinction however is irrelevant to the fashion trust is computed and only relates to the semantics of trust. For instance, under a decentralized regime, a user may utilize a distributed approach for computing the trust of its target. Global vs. Local: Local trust is the opinion that a trustor node has towards a trustee and is generated depending on the first-hand evidence gathered based on local interactions, however, global trust is formed by combining the first-hand evidence and the opinions of other nodes about the specific trustee and is usually more accurate. In fact, the local exchange of the local observations is used towards obtaining global trust. 3 Proactive vs. Reactive: Under a proactive regime, the entities manage to keep the trust estimates updated, while under a reactive regime, the trust estimates are computed only when they are required. The proactive scheme is not communication efficient as a large bandwidth needs to be consumed to keep the trust values updated; therefore a reactive scheme is usually preferred unless the frequency by which trust decisions are made is comparable to the frequency of the local trust updates. Direct vs. Indirect: Directed trust is obtained via interaction through direct communication with another agent. However, indirect trust is a trust relationship between two entities that have not interacted in the past. Establishing an indirect trust relationship heavily relies on the assumption that trust has the transitivity property which is not necessarily the case in any application. 1.2 Space-Air-Ground Integrated Networks Space-Air-Ground Integrated Networks (SAGIN) represent an advanced and integrated communication framework that encompasses various domains: space, aerial (such as drones or aircraft), and terrestrial (ground-based) networks. The core idea behind SAGIN is to seamlessly connect these diverse components to create a robust and versatile communication infrastructure. Space-based assets, typically satellites, form the backbone of the space segment. Satellites in orbit are equipped with communication equipment that can relay data across vast distances. They provide global coverage and are especially useful for connecting remote and inaccessible areas. Space networks are instrumental in supporting applications like global internet access, Earth observation, and satellite-based navigation. The aerial component of SAGIN includes 4 unmanned aerial vehicles (UAVs), drones, aircraft, or high-altitude balloons. These platforms can be deployed quickly to cover specific areas or events. They serve various purposes, such as disaster response, surveillance, or temporary network coverage augmentation. The terrestrial component includes traditional cellular networks, Wi-Fi, and other ground-based infrastructure. Ground networks serve as the last-mile connectivity for end-users and play a crucial role in providing consistent connectivity. By integrating multiple layers, SAGIN is inherently resilient. If one component encounters issues, communication can be seamlessly routed through another layer. This redundancy ensures high availability. SAGIN networks can cover a wide range of areas, from urban environments to remote and under-developed regions. This broad coverage is especially valuable for disaster response, rural connectivity, and global communication. The system can be easily scaled up or down based on requirements. For example, in a crowded event or during a disaster, additional aerial platforms or ground-based equipment can be deployed to meet increased demand. SAGIN networks can adapt to changing conditions. For example, aerial platforms can be re-positioned to address shifting communication needs in real-time. As technology advances, SAGIN networks are becoming more feasible and practical, and they represent a vital part of the evolving landscape of communication systems. 1.3 Cell-free Massive MIMO Systems Cell-free Massive MIMO is an innovative and advanced wireless communication system architecture that departs from the traditional cellular network structure. In a Cell-free Massive MIMO system, the concept of ”cells” is replaced with a more distributed and collaborative app- 5 roach, offering significant benefits in terms of coverage, capacity, and reliability. In traditional cellular networks, base stations (cell towers) are responsible for providing coverage to specific geographic areas. In Cell-free Massive MIMO, the network deploys a large number of distributed access points (APs) equipped with antennas throughout the coverage area. These distributed APs are typically smaller and closer together than traditional cell towers. Unlike traditional cellular systems where base stations operate independently, in Cell-free Massive MIMO, all distributed APs collaborate and coordinate their signals. This cooperation is achieved through advanced signal processing techniques. Each distributed AP is equipped with a large number of antennas, often hundreds or more. This massive antenna array allows for spatial multiplexing, enabling multiple users to be served concurrently over the same frequency resources. The distributed APs are typically connected to a centralized signal processing unit, which manages the coordination and signal processing tasks. The centralized processing enables efficient beamforming and interference management. Cell-free Massive MIMO focuses on serving users directly, regardless of their location within the coverage area. This user-centric approach optimizes signal strength and quality for each user, leading to improved performance. Through cooperation and coordination, Cell-free Massive MIMO minimizes interference between users. This results in a cleaner and more efficient communication environment, which enhances network capacity and reliability. The distributed nature of the system and the cooperation among APs ensure more uniform and extensive coverage, including in areas with challenging propagation conditions. Massive MIMO enables spatial multiplexing, allowing the system to support a large number of users with high data rates simultaneously. By minimizing interference and optimizing beamforming, Cell- free Massive MIMO offers robust and reliable communication, even in crowded and interference- 6 prone environments. The architecture is highly flexible and can adapt to changing user densities and traffic patterns. It can be easily scaled by deploying additional APs. By optimizing signal transmission and reducing interference, Cell-free Massive MIMO systems are more energy- efficient compared to traditional cellular networks. Cell-free Massive MIMO represents a promising evolution in wireless communication systems, particularly suited for dense urban environments, industrial IoT applications, and scena- rios with high user mobility. Its ability to provide consistent and high-quality connectivity, along with improved capacity, positions it as a key technology in the transition to 6G and beyond. 1.4 Millimeter-wave Communications Millimeter wave (mmWave) communication is a wireless communication technology that operates in a high-frequency range, typically in the spectrum between 30 gigahertz (GHz) and 300 GHz. This frequency range is higher than traditional microwave frequencies, which are used in many wireless communication systems. Millimeter waves are characterized by short wavelengths, typically in the millimeter range, which is where the name ”millimeter wave” comes from. The frequency range used in mmWave communication is significantly higher than that of conventional wireless communication, such as 4G and 5G, which typically operate in the sub-6 GHz and 3.5 GHz ranges. The high frequencies of mmWave enable the transmission of large amounts of data due to their broader bandwidth. Millimeter wave frequencies offer wide bandwidths that can support extremely high data rates. This makes mmWave technology suitable for applications that demand ultra-fast data transfer, such as high-definition video streaming, 7 augmented reality, virtual reality, and large-scale data downloads. One limitation of mmWave communication is its relatively short propagation range. Millimeter waves are more susceptible to atmospheric absorption and obstacles like buildings and vegetation, which can attenuate the signals. As a result, mmWave communication is often used for shorter-range, high-capacity applications, including point-to-point links and localized wireless networks. Millimeter waves are sensitive to obstacles, so they often require a clear line of sight between the transmitter and receiver. This characteristic makes mmWave technology suitable for applications like fixed wireless access, where there is a direct line of sight between a rooftop antenna and a user’s premises. To overcome the challenges of obstacles and limited range, mmWave systems often use multi-beam technology, where multiple narrow beams are formed to track the user’s device or move data between different points. Beamforming and beam steering are essential techniques to maintain the connection as the user moves. Millimeter wave technology is often used in small cell deployments, where a dense network of small, low-power base stations is placed throughout an area to provide localized high-capacity coverage. This approach is commonly used in urban environments and venues with high data demand. Millimeter wave communication is a key component of 5G and future wireless communication systems. It plays a significant role in delivering high data rates and low latency for next-generation applications. Beyond consumer applications, mmWave technology is also used for wireless backhaul in telecommunications networks, connecting cellular base stations to the core network infrastructure. This helps support the increasing data traffic demands of mobile networks. Millimeter-wave communication has the potential to revolutionize wireless communication by enabling ultra-fast data transfer rates and low-latency connections. mmWave is a critical component of 5G and is expected to be even more essential in the development of future wireless 8 communication standards, including 6G. 1.5 Reconfigurable Intelligent Surfaces Reconfigurable intelligent surface (RIS), also known as intelligent reflecting surface (IRS) is a novel technology in the field of wireless communications and signal processing. RIS is a two-dimensional surface comprising a large number of small passive reflecting elements, such as antennas or meta-material-based units, that can be electronically controlled to manipulate the electromagnetic waves in a way that optimizes wireless communication links. The key idea behind RIS is to control the phase and amplitude of the reflected signals from these elements to enhance the performance of wireless communication systems. RIS can focus signals in desired directions, which can increase the signal strength at the receiver and improve the signal-to-noise ratio. It can redirect signals around obstacles or towards specific receivers, overcoming obstacles like buildings or other structures that would otherwise block direct line-of-sight communication. RIS can be used to cancel out interference from other wireless sources, leading to cleaner and more reliable communication. By strategically placing RIS units, it’s possible to extend the coverage of wireless networks and fill in coverage gaps. RIS can reduce the power consumption of wireless devices by improving the link quality, allowing for lower transmission power levels. RIS technology is being researched and developed for various applications, including 5G and beyond, the Internet of Things (IoT), and wireless communication in challenging environments. It has the potential to revolutionize wireless communication by providing more efficient and flexible ways to manage the propagation of electromagnetic waves in wireless networks. 9 1.6 Contributions and Organization of the Dissertation This dissertation is organized into 9 chapters covering different interrelated design and optimization problems for next-generation hybrid networks. In Chapter 2, we introduce the notion of trustworthiness as a measure of security in SFC embedding and thus network resource provisioning. Next, we incorporate this notion into the service deployment problem by defining the trustworthiness of the service network nodes, links, substrate network hosts, and the interconnecting paths to form the trust-aware service chain embedding problem. We introduce and formulate two variants of the problem, i.e. the link- based and the path-based trust-aware service chain embedding problems as mixed integer-linear programs (MILP), and then provide approximate models based on linear programming (LP) relaxation and rounding, and column generation with selecting k-shortest candidate substrate paths for hosting each virtual link, to reduce the complexity of the model. We validate the performance of our methods through simulations and conduct a discussion on evaluating the methods and some operation trade-offs. In Chapter 3, we study two challenging optimization problems that arise while considering the deployment of the space-air-ground integrated networks (SAGINs), among which the optimal satellite gateway deployment problem is of significant importance. We propose a joint satellite gateway placement and routing strategy for the terrestrial network to minimize the overall cost of gateway deployment and traffic routing while adhering to the average delay requirement for traffic demands. Although traffic routing and gateway placement can be solved independently, the dependence between the routing decisions for different demands makes it more realistic to solve an aggregated model instead. Moreover, with the increasing interest in the software- 10 defined integration of 5G networks and satellites, the existence of an effective scheme for optimal placement of SDN controllers is essential. We discuss the interrelation between the two problems above and propose suitable methods to solve them under various network design criteria. We first provide a MILP model for solving the joint problem, and then motivate the decomposition of the model into two disjoint MILPs. We then show that the resulting problems can be modeled as the optimization of submodular set functions and solved efficiently with provable optimality gaps. In Chapter 4, we propose trust as a metric to measure the trustworthiness of the FL agents and thereby enhance the security of the FL training. We first elaborate on trust as a security metric by presenting a mathematical framework for trust computation and aggregation within a multi- agent system. We then discuss how this framework can be incorporated within CFL and DFL setups introducing trust-aware FL algorithms. UAVs are envisioned as vital components of next- generation networks, driving diverse applications across various domains. UAV-enabled networks typically encompass diverse edge devices, including UAVs, ground stations, sensors, and other IoT devices, all generating vast amounts of data. Therefore, FL in UAV-enabled networks has attracted much attention over the past several years. However, most of the approaches in the state-of-the-art suffer from a single point of failure as well as massive communication overhead and excessive delay due to the existence of a central aggregator. The rest of the chapter proposes a framework for decentralized FL in UAV-enabled networks. Our approach consists of two parts; i) First, we propose a UAV placement scheme while ensuring the connectivity of the network of deployed UAVs. ii) Next, we use a consensus-based approach for decentralized FL among the UAV agents. We verify the effectiveness of our approach via extensive numerical simulation. We allocate Chapter 5 to the design and optimization of cell-free (distributed) massive MIMO networks that are envisioned to realize cooperative multi-point transmission in next- 11 generation wireless systems. In this context, for efficient cooperative hybrid beamforming, the cluster of access points (APs) needs to obtain precise estimates of the uplink channel to perform reliable downlink precoding. However, due to the radio frequency (RF) impairments between the transceivers at the two en-points of the wireless channel, full channel reciprocity does not hold which results in performance degradation in the cooperative hybrid beamforming (CHBF) unless a suitable reciprocity calibration mechanism is in place. We propose a two-step approach to calibrate any two hybrid nodes in the distributed MIMO system. We then present and utilize the novel concept of reciprocal tandem to propose a low-complexity approach for jointly calibrating the cluster of APs and estimating the downlink channel. Finally, we validate our calibration technique’s effectiveness through numerical simulation. In Chapter 6, we propose a novel scheme for codebook design for hybrid beamforming in mmWave systems. We highlight the shortcomings of uniform linear arrays (ULAs) in generating perfect beams, i.e., beams with maximal uniform gain and sharp edges, and propose a solution based on a novel antenna configuration, namely, twin-ULA (TULA). Consequently, we propose two antenna configurations based on TULA: Delta and Star. We pose the problem of finding the beamforming coefficients as a continuous optimization problem for which we find the analytical closed-form solution by a quantization/aggregation method. Thanks to the derived closed-form solution the beamforming coefficients can be easily obtained with low complexity. We note that to efficiently search for the beam to serve a user and to jointly serve multiple users it is often required to use a composite beam which consists of multiple disjoint lobes. A composite beam covers multiple desired angular coverage intervals (ACIs) and ideally has maximum and uniform gain (smoothness) within each desired ACI, negligible gain (leakage) outside the desired ACIs, and sharp edges. We propose an algorithm for designing such an ideal composite codebook 12 by providing an analytical closed-form solution with low computational complexity. There is a fundamental trade-off between the gain, leakage, and smoothness of the beams. Our design allows us to achieve different values in such trade-offs based on changing the design parameters Through numerical analysis, we illustrate the effectiveness of the proposed antenna structure and beamforming algorithm to reach close-to-perfect beams. In Chapter 7, we introduce an innovative beam alignment (BA) scheme tailored to mmWave communications. The mmWave technology relies on precisely aligning narrow beams with the channel Angle of Departures (AoDs) to maximize beamforming gain. While most existing BA schemes in the literature primarily consider single-path channels, real-world scenarios involve multiple resolvable paths with varying AoDs. As a result, traditional BA methods may not perform optimally in the presence of multipath or may not fully leverage this diversity to enhance robustness. Our proposed BA schemes address this challenge by transmitting probing packets via a set of scanning beams and gathering feedback for all scanning beams at the end of the probing phase from each user. We formulate the BA scheme as an optimization problem aimed at minimizing the expected average transmission beamwidth under different policies, where the policy maps received feedback to the set of transmission beams (TB). To expand the range of feasible feedback sequences, we demonstrate the unique structure of the scanning beams, referred to as the Tulip Design, and accordingly reformulate the minimization problem with a set of linear constraints and a reduced number of variables, efficiently solved using a greedy algorithm. We also explore the trade-off between the gain of SBs and TBs. Higher SB gain enhances SB penetration, while increased TB gain improves communication link performance. However, TB performance is contingent on the set of SBs. We show that by expanding the coverage of each SB, which in turn reduces its penetration, we create opportunities to find sharper TBs and amplify 13 beamforming gain. We quantify this trade-off through a trade-off curve, demonstrating that the Tulip Design attains the optimal curve for single dominant path channels. Additionally, we provide closed-form solutions for special cases of this trade-off curve and introduce an algorithm with performance evaluations to elucidate the need for further optimization of SB sets in state- of-the-art beam search algorithms. In Chapter 8, we study the design of RIS for extending the coverage map of next-generation networks. A mmWave coverage map consists of blind spots due to shadowing and fading es- pecially in dense urban environments. Beam-forming employing massive MIMO is primarily used to address high attenuation in the mmWave channel. Due to their ability to manipulate the impinging electromagnetic waves in an energy-efficient fashion, RISs are considered a great match to complement the massive MIMO systems in realizing the beam-forming task and hence effectively filling in the mmWave coverage gap. We propose a novel RIS architecture, namely RIS-UPA where the RIS cells are arranged in a Uniform Planar Array (UPA). We show how RIS-UPA can be used in an RIS-aided MIMO system to fill the coverage gap in mmWave by forming beams of a custom footprint, with optimized main lobe gain, minimum leakage, and fairly sharp edges. Further, we propose a configuration for RIS-UPA that can simultaneously support multiple two-way communication pairs. We obtain theoretical low-complexity closed-form solutions for our design and validate our findings through extensive numerical experiments. Low-complexity estimation and correction of carrier frequency offset (CFO) are essential in orthogonal frequency division multiplexing (OFDM). Therefore, in Chapter 9, we propose a low-overhead blind CFO estimation technique based on cyclic prefix (CP), in multi-input multi- output (MIMO)-OFDM systems. We propose to use antenna diversity for CFO estimation. Given 14 that the RF chains for all antenna elements at a communication node share the same clock, the carrier frequency offset (CFO) between two points may be estimated by using the combination of the received signal at all antennas. We improve our method by combining the antenna diversity with time diversity by considering the CP for multiple OFDM symbols. We provide a closed- form expression for CFO estimation and present algorithms that can considerably improve the CFO estimation performance at the expense of a linear increase in computational complexity. We validate the effectiveness of our estimation scheme via extensive numerical analysis. 15 Chapter 2: Trust-aware Service Chain Embedding 2.1 Overview With the emergence of NFV and SDN, communication networks are becoming increasingly agile. Supported features such as programmability, IT virtualization, and open interfaces, enable operators and infrastructure providers to launch a network service rapidly and in a more cost and operation-efficient manner, allowing for shorter time-to-market cycles while driving down both operational and capital costs. Essentially, NFV implements NFs through software virtualization techniques (e.g., running on containers or virtual machines) and runs them on commodity hardware [1]. Examples of such VNFs include firewalls, load balancers, IDS, caches, etc. and more recently mobile network functions [2]. These virtual appliances can be instantiated and scaled on demand, using cloud scaling technologies, to match the service demand requirements and utilize the underlying distri- buted computing, storage, and networking resources most efficiently. To support more complex services, a sequence of NFs may be stitched together, creating an SFC, and can be represented by a graph containing VNFs as nodes and the traffic demand between those VNFs, termed VNF-FG [3]. SDN on the other hand decouples the control plane from the underlying data plane, allowing programmability of the network control function using standard, open interfaces. Network in- telligence is logically centralized, using software-based SDN Controllers that maintain a global 16 view of the network. SDN and NFV are complementary technologies, and each one can leverage the other to improve the flexibility and agility of networks and service delivery. Using NFV and SDN, networks can be sliced into multiple dedicated, mutually isolated, end-to-end logical networks composed of several customizable software-defined functions, tai- lored to a given application or service [4] [5] [6]. A network slice is self-contained, ensuring resource isolation, including all necessary functions and capabilities, appropriately chained together, following the NFV service function chaining principle [7], to address the requirements of the corresponding services/applications [5] [8] [9]. VNFs and network elements are configured in each network slice to meet such requirements, enabling at the same time new business opportunities, by providing support for multi-service and/or multi-tenancy. That is primarily the reason why network slicing has evolved from a simple fixed network overlay concept to a fundamental feature of the emerging 5G systems [10] [7] [2]. While the vision is very compelling from an infrastructure, operation, and business per- spective, the implementation of network slices poses multiple challenges, many of which are inherent to the enabling technologies, specific to the shared physical medium, or associated with the application context. To create a network slice [11] [2] and accommodate the needs of a particular service, we must be able to dynamically allocate, install, program, and configure all the service-specific network functions and chain these functions together, as prescribed by the network service graph. Towards the deployment of SFCs, the SFC embedding problem becomes of paramount importance. The problem entails the placement of VNFs on the physical hosts and establishing the routing paths between them, according to the SFC template. Therefore considering the network slice as an SFC or a collection of SFCs, efficient service chain resource allocation 17 (service chain embedding or VNF-FG embedding) becomes of paramount importance. Service chain embedding is an NP-hard problem, as a generalization of the Virtual Network Embed- ding problem [12]. Appropriate approaches are needed for allocating network and computing resources to the requested SFC. Solving the service chain embedding problem requires the identification and adoption of appropriate resource allocation policies, resulting from the identified strategic objective(s) to optimize. Depending on the network service, such objective(s) may be related to QoS, profit maximization, fault tolerance, load balancing, energy saving, security, etc. Many of these can be expressed as hard constraints in the problem formulation e.g., end-to-end delay in [13]. In this chapter, we also consider constraints and objectives motivated by security recom- mendations laid out to protect systems supporting multi-tenancy [14] [15]. In particular, similar to [16], we capture the notion of security by integrating “trust weights” into the service chain embedding problem. These weights indicate the trustworthiness of a host system, based on its interactions with the other network entities, location (e.g., physically secured location), the level of hardening for the host, etc. Thus, by using such weights, we enhance the trustworthiness of the service chain deployment and its resilience to attacks. We assume that such trust weights are developed based on monitoring and configuration/deployment data and are disseminated via appropriate methods so that they are timely available to the entity (e.g., orchestrator, domain controller) that performs the embedding process. Trust weights are also considered for the VNFs in the service chain, expressing their respective requirements in terms of security, e.g., VNFs performing mission-critical operations must be hosted on a trusted infrastructure. Hence, the proposed trust-aware service chain embed- ding approach ensures matching the requested security level of each VNF in the chain, with the security level offered by the servers that will host them. 18 Though the incorporation of trust seems intuitively fit for resource allocation problems spanning multiple administrative domains, due to the uncertainty in dealing with multiple InPs, we believe that it befits also the case of a single InP, due to the distributed heterogeneous nature of the NFVI. For example, edge NFVI PoPs are less reliable/secure than the NFVI in the central office or core DCs; in a single NFVI PoP, the security levels of the hosts may not be homogeneous (e.g., adoption of security zones, as collections of resources that share a common security expo- sure or security risk). The main contributions of the chapter are as follows. • We incorporate trust into the service chain embedding problem as a metric for capturing security. We take into account the trustworthiness requirements for both the NFs and the edges between them. Similarly, we assign trust values to the substrate network paths as well as the substrate hosts to model the trustworthiness of the NFV infrastructure. • We propose a link-based model for the problem when trust-based requirements only impose constraints on the NFs and the physical servers hosting them. • We model the link-based problem as a MILP and present an approximation algorithm based on LP relaxation and rounding to obtain near-optimal solutions for the MILP in polynomial time. • By augmenting the notion of trust to the virtual links and the substrate paths, we present a path-based formulation for the problem. • We model the path-based problem as a MILP and propose a column generation-based method to obtain near-optimal solutions for the MILP in polynomial time. 19 2.2 Related Work In this section, we briefly discuss related work on service chain embedding in general and then we focus on studies on security and trust in VNF-FG embedding. The definition and approaches for the VNF-FG embedding problem vary, depending on several design decisions such as: (i) embedding objectives such as QoS [17], profit maximization [18], fault tolerance [19], load balancing [20], energy efficiency [21] etc.; (ii) solution strategy such as exact [22], heuristic [23] or meta-heuristic [24]; (iii) the technology domain where VNF- FG embedding is applied such as radio access networks [25], LTE/EPC [26], mobile core network [13], cloud networks [27]; (iv) type of resources such as CPU and BW [13], processing time and buffer capacity [28], ternary content-addressable memory [29]; (v) single administrative [30] domain or multi-domain approaches [31]. Authors in [12] provide a taxonomy of SFC embedding approaches. Trust in NFV has been considered relatively recently, with studies such as [32] that discuss the challenges associated with incorporating trust into NFVI, mentioning also the need to apply policies for binding VNFs to certain (trusted) NFV platforms depending on the platforms’ confi- gurations. The requirement for placement of NFVs onto trusted platforms/hosts is more important given the emergence of 5G for supporting vertical markets (e.g., healthcare). The problem of security awareness in resource allocation has only been addressed in the context of VNE. Liu et al. [33] abstract the security requirements of virtual networks to quantifiable security levels and formulate the problem with security constraints related to the security demands of the virtual resources and the minimum security level required for a virtual node to be hosted on a substrate node. Similar to the above we consider trust values of the substrate nodes and trust demands for 20 the VNFs for the placement of VNFs on trusted servers. To the best of our knowledge, this is the first formulation for trust-aware service chain embedding. 2.3 Models and Definitions 2.3.1 Trust Model Trust is a commonly used concept in a variety of application domains, such as e-commerce, peer-to-peer systems, cloud computing, and social networks. Trust management systems are implemented to define and gather trust-related evidence, and evaluate, establish, and manage trust relationships between entities in distributed systems. However, this chapter does not delve into the specifics of trust modeling or trust assessment. We assume that the infrastructure provider has efficient mechanisms for distributing trust evidence related to the underlying infrastructure. There are various ways to quantify trust; in different trust schemes, continuous or discrete numerical values have been assigned to measure the level of trustworthiness of a network entity. For example, in [16] the trustworthiness of nodes in multi-hop wireless networks is defined as a continuous value in [0, 1]. In [34] reputation-based trust for experimental infrastructures in a federated environment is defined in the same fashion. Following the same rationale, we denote that trust takes a continuous numerical value in [0, 1]. The trust estimate is updated periodically; we denote the update period as Tinterval. During the update period, denoted by [τ − Tinterval, τ ], the trust evaluation mechanism provides the trust estimate of node u denoted as t′u. We use an exponential weighted moving average (EWMA), to characterize the node’s time-varying trust- worthiness over time, as it reacts faster to recent updates in trust values. Hence, the new derived 21 trust value for node u at time τ is given by tu(τ) = (1− α)tu(τ − Tinterval) + α t′u (2.1) where α ∈ [0, 1] is a constant weight indicating the preference between the updated and historic samples of trust values. 2.3.2 Network Model The substrate network, is modeled as an undirected graphGs = (Ns, Es), while the request network is modeled as a directed graph Gf = (Nf , Ef ). Each substrate node u ∈ Ns has a residual processing capacity ru, and each substrate link (u, v) ∈ Es has a BW capacity of cuv, while the CPU requirement of request node i ∈ Nf and the BW demand of a request link (i, j) ∈ Ef are represented by gi and dij accordingly. We denote by tu the trustworthiness of the substrate node u ∈ Ns, and by ti the trust requirement of the request node i ∈ Nf , while this metric for a request link (i, j) ∈ Ef is denoted by tij and for a substrate path p by tp, where p is a connected set of edges in the substrate graph. As in [14], trust takes fractional numerical value in [0, 1]. We note that the trustworthiness of a substrate path can be any function (according to a specific use-case or methodology) of the trust values corresponding to the links and nodes belonging to that path. We define the following components of the path-based formulation to facilitate the description of the model: Definition 1 Augmented Graph. For a commodity (virtual link) k = (i, j) where ij ∈ Ef we denote by Gk s = (Nk s , E k s ), the augmented graph corresponding to commodity k, whereby for every node u ∈ Ns that is eligible for hosting request node i, the directed augmented edge (i, u) 22 (a) Substrate Graph Gs = (Ns, Es) (b) Request Graph Gf = (Nf , Ef ) (c) Gij s = (N ij s , Eij s ) Augmented Substrate Graph (d) Trust-Aware SFC Embedding Solution Figure 2.1: Trust-aware SFC Embedding Network Models 23 is added to Es. Similarly, for every node u that is eligible for hosting the request node j, the directed augmented edge (u, j) will be added to Es. Hence, for the augmented graph, we will have: Nk s = Ns ∪ {i, j} Ek s = Es ∪ {iu|u ∈ Ns and tu ≥ ti and ru ≥ gi} ∪{uj|u ∈ Ns and tu ≥ tj and ru ≥ gj} Furthermore, we denote by Ga s = (Na s , E a s ) the augmented graph corresponding to the request graph Gf = (Nf , Ef ), which contains all the nodes and links in all of the augmented graphs for all the commodities. Definition 2 Augmented Path. For a commodity K = (i, j) where (i, j) ∈ Ef we denote by pk from i to j, a generic augmented path corresponding to commodity k, where the initial and the final links are augmented edges corresponding to commodity k. In other words, once we remove the initial and final edge from pk the result will be a path of the original graph Gs. We further denote by Pk, the set of all augmented paths corresponding to commodity k, and by P the set of all augmented paths. For instance, fig. 2.1c shows an augmented graph for commodity (virtual link) (i, j) in the Request Graph depicted in fig. 2.1b, that is going to be placed on the substrate network shown in Fig. 2.1a. Furthermore. each of the directed paths in the augmented graph depicted in Fig. 2.1c, that start with a red edge and end with a blue edge, is an augmented path corresponding to 24 commodity (i, j). 2.4 Problem Description Agile service deployment of the requested network service at the operator’s infrastructure involves placement of its constituent VNFs and in-sequence routing through them as prescribed by the Service Chain. Therefore, decisions must be made on where functions should be placed among the available NFVI and how traffic must be routed among them. NFVI covers all hardware and software resources that comprise the NFV environment; it includes network connectivity between locations, e.g., between data centers and public clouds or NFVI Points of Presence (NFVI-PoPs). Physical resources include computing, storage, and network hardware provid- ing processing, storage, and connectivity for VNFs through the virtualization layer that sits just above the hardware and abstracts the physical resources. The optimization challenge at hand is to efficiently allocate the host and BW resources to the Service Graphs; in other words, to efficiently map the Service Graphs to the operator’s NFV infrastructure adhering to the capacity constraints of the physical resources, as well as performance and resilience objectives and con- straints. Performance and resilience objectives and constraints (e.g., delay budget between NFs, etc.) are dictated by the corresponding network service to be implemented, while additional objectives related to the operators’ effective use of the infrastructure such as load balancing and energy conservation may be included in the problem formulation. SFC embedding bears many similarities to the Virtual Network Embedding (VNE) problem since chained NFs can be seen as directional graphs to be embedded into a substrate network. However, the two problems call for different solutions since i) VNE requests are undirected 25 graphs as opposed to service chains that define an ordered set of VNFs and traffic must flow accordingly through this predefined ordered set. ii) VNFs exhibit ingress/egress bit-rate variations due to specific VNF operations (e.g., compression as coding, transcoding in video optimizers, etc.). iii) With VNE we primarily observe one-level mappings, where virtual network requests are mapped to the substrate network. However, in NFV environments we may have two-level mappings, i.e. service function chaining requests to VNF instances and then VNF instances to physical networks as vNFS may be shared between service chains that constitute the network service. 2.4.1 Trust-aware SFC Embedding Considering trust-aware embedding of service chains, VNFs apart from their computing requirements can have also explicit requirements in terms of security, expressed via the respective trust values in the Request. Both can be expressed as constraints in the corresponding resource allocation problem. Thus, the trust-aware embedding solution in Fig. 2.2 ensures that: (i) the computing and BW resources, allocated to the service chain, do not exceed the residual capacities of servers and links, respectively and, (ii) each VNF in the chain is assigned to a server that matches (at minimum) its requested trust level. For example, considering VNF j, even though servers w and x can both support its computing requirements, only server w can provide (at minimum) the required trust level. In other words, the trustworthiness of the server should be equal to or higher than the one requested. Binding the SFC End Points (EPs) to substrate ingress and egress nodes has been omitted for readability purposes. 26 Figure 2.2: Example of trust-based SFC embedding. 2.4.2 Path-based Trust-aware SFC Embedding We note that the method in [14] is link-based; i.e. i) The flow decision variables are link- to-link, and ii) Provided in the output is the assignment of each network request link to a set of substrate links that are guaranteed to generate valid continuous substrate paths by suitable flow formation and conservation constraints. However, in this chapter, we represent the SFC embedding problem by a path-based model; i.e. i) The flow decision variables are link-to-path; and ii) In the output, the request links are directly assigned to the pre-selected substrate paths. One of the main contributions of this chapter is to propose a path-based model for the SFC embedding problem (PB-SCE) which provides multiple advantages over the traditional link-based formulation used in [14]. Firstly, a path-based formulation allows for the integration of various network and routing policies within the service chain embedding framework with low complexity. For instance, PB-SCE can be simply augmented by a path pre-selection phase to admit requirements such as traffic splitting, guaranteeing maximum delay or cost, or even assuring the existence of (disjoint) backup paths. 27 Moreover, within the path-based framework many of the design metrics that would enforce non-linear constraints to the link-based formulation (e.g. reliability, trust, availability, etc.), can simply be computed along the network paths in an offline fashion and be input to the path-based formulation. For instance, it is not possible in the link-based formulation in [14] to incorporate a linear constraint for capturing the trust requirement of each virtual edge; however, in the path- based model, it is straightforward to compute the trustworthiness of a network path following the corresponding trust aggregation policy and then input it to the model as a linear constraint. This is the main motivation for introducing the path-based trust-aware SFC embedding (PB-TASCE) model. Finally, we note that in the context of trust-aware service chain embedding, a path-based model allows for abstracting out the method by which the trustworthiness of the infrastructure is computed and aggregated. Precisely, considering a path-based approach that only requires the trust values assigned to the paths, disregarding how this value is computed based on the same for the underlying components, allows for the application of our method in different settings, where the trust needs to be modeled differently [35]. For instance, interpreting trust as a multiplicative metric will lead to a different trustworthiness judgment for a path compared to the case where the trustworthiness of a path is computed as the minimum of the trustworthiness of all its edges. 2.5 Link-based Trust-aware Service Chain Embedding 2.5.1 MILP Formulation To formulate the trust-aware service chain embedding problem, we consider: • the set of binary variables x, where xiu expresses the assignment of VNF i to the substrate 28 node u • the set of binary variables f, where f ijuv expresses the amount of BW from substrate link (u, v) allocated to VNF-FG edge (i, j) of the SFC. The problem is formulated as follows: Objective: Min. ∑ i∈NF ∑ u∈NS tug ixiu + ϕ ∑ ij∈EF ∑ uv∈ES f ijuv s.t. (2.2) Placement Constraints: ∑ u∈NS xiu = 1, ∀i ∈ NF (2.3) Trust Constraints: (tu − ti)xiu ≥ 0, ∀i ∈ NF , ∀u ∈ NS (2.4) Flow Constraints: ∑ uv∈ES (f ijuv − f ijvu) = dij(xiu − xju), ∀ij ∈ EF , u ∈ NS (2.5) 29 Capacity Constraints: ∑ i∈NF gixiu ≤ ru, ∀u ∈ NS (2.6) ∑ ij∈EF f ijuv ≤ cuv, ∀uv ∈ ES (2.7) Domain Constraints: f ijuv ≥ 0, ∀ij ∈ EF , uv ∈ ES (2.8) xiu ∈ {0, 1}, ∀i ∈ NF , u ∈ NS (2.9) Constraint (2.21) ensures that each NF is exactly placed on one server. Constraint (2.4) ensures that each VNF is placed on a server that can support its requested trust level. Constraint (2.5) enforces flow conservation; i.e. the sum of all inbound and outbound traffic in switches and servers that do not host VNFs should be zero. Moreover, this condition ensures that for a given pair of assigned nodes i, j (VNFs), there is a path in the network graph where the VNF-FG edge (i, j) has been mapped. Constraints (2.24) and (2.25) guarantee that the allocated computing and BW resources do not exceed the residual capacities of servers and links, respectively. Constraints (2.27), (2.26) express the domain constraints for the variables x and f. The objective function (2.2) attempts to minimize the corresponding embedding cost. The first term of this function represents the amount of CPU resources multiplied by the trust estimate of each assigned server. This term is minimized, if servers with lower security levels are preferred. Coupled with the set of constraints (2.4), it leads to solutions where the trust level of the servers 30 hosting the requested VNFs is close (but over) to the requested one. Assuming that increased trust levels for the substrate resources are used to instantiate the service chain will lead to higher provisioning costs; using the trust estimates at the objective function keeps this cost to the minimum feasible value. The second term of the objective function expresses the accumulated BW assigned to the VNF-FG edges. The term is minimized if all VNF-FG edges are mapped onto single-hop links. The parameter ϕ = 1∑ ij∈EF dij (∑ i∈NF g i ) is used for the normalization of CPU and BW units. 2.5.2 LP Relaxation and Rounding Algorithm We derive the LP model from the original MILP model by relaxing the integrality of the binary x variables. Thus, the domain constraints in MILP are replaced by the following: 0 ≤ xiu ≤ 1 ∀i ∈ NF , u ∈ NS, f ijuv ≥ 0 ∀ij ∈ EF , uc ∈ ES As the non-binary xiu values do not represent mappings between the VNFs and the servers, we use a deterministic rounding technique to obtain integer values for the variables x and embed the SFC requests. The pseudo-code for the LP rounding is shown in Algorithm 1. The LP solver (Solve LP(..)) is called iteratively. At each iteration (Lines 2-12), one dimension xiu of the current LP solution is rounded. The selected dimension is the maximum fractional value among x that, if rounded to 1, still satisfies the capacity constraint of the corresponding substrate node. The rest of the corresponding xiv, ∀v ∈ NS \ {u} fractional solutions for the particular VNF will be zero, due to constraint (2.21). In the case that the capacity constraint can not be satisfied for any of the fractional solutions, the request will be rejected (Sol=false). 31 Given the mapping of the VNFs to the infrastructure (integral x values) we solve the multi- commodity flow allocation (MCF) problem, to determine the routing paths between them, as prescribed by the SFC (Lines 14-17). The algorithm will terminate if there is no solution found for the LP (Sol=false), or if there are no remaining dimensions of the solution to be rounded (Sol=true). The Sol flag determines the rejection/acceptance of the request (Lines 18-23). Algorithm 1 LP Rounding 1: repeat 2: {xiu, f ijuv} ← Solve LP(..) 3: If solution exists Sol:= true, Otherwise Sol:= false 4: X ← {xiu|xiu /∈ {0, 1}} 5: if X ̸= 0 then 6: {i0, u0} ← argmax{i∈Nf ,u∈Ns|gi≤ru}X 7: if {i0, u0} exists then 8: Add LP Constraint xi0u0 = 1 9: else 10: Sol=false 11: end if 12: end if 13: until (X = 0) ∨ (Sol = false) 14: if Sol = true then 15: {xiu, f ijuv} ← Solve MCF(..) 16: If solution exists Sol:= true, Otherwise Sol:= false 17: end if 18: if Sol:= true then 19: return {xiu, f ijuv} {Accept the request} 20: else 21: {∀xiu := 0,∀f ijuv := 0} 22: return {xiu, f ijuv} {Deny the request} 23: end if Contrary to the MILP formulation that is computationally very expensive to solve or even intractable for large problem instances, the relaxed linear program can be solved in polynomial time. The LP solver is invoked at most |NF | + 1 times, as every iteration, up to |NF |, leads (ideally) to the mapping of a VNF, while the MCF algorithm at the end provides us with the corresponding flow allocation. 32 2.6 Path-based Trust-aware Service Chain Embedding In this section, we propose the path-based problem formulation for trust-aware service chain embedding. To this end, we define two sets of variables as follows, • x, denotes the set of binary variables xiu which express the assignment of VNF i to substrate node u. • f, denotes the set of continuous variables fp which express the amount of flow passing through the augmented path p ∈ P in the augmented substrate graph. 2.6.1 Mixed Integer Linear Programming Formulation We start with a MILP formulation as follows which contains all the service requirements as hard constraints. PB-TASCE Objective: Minimize ∑ p∈P cpfp + γ ∑ i∈Nf ∑ u∈Ns tux i u (2.10) Placement Constraints: ∑ u∈NS xiu = 1, ∀i ∈ NF (2.11) ∑ p∈Pij fp = dij, ∀ij ∈ Ef (2.12) ∑ p∈P:iu∈p fp ≤ xiuM, ∀i ∈ Nf , u ∈ Ns (2.13) 33 Trust Constraints: (tu − ti)xiu ≥ 0, ∀i ∈ NF ,∀u ∈ NS (2.14) (tp − tij)fp ≥ 0, ∀k ∈ Ef , p ∈ Pk (2.15) Capacity Constraints: ∑ i∈NF gixiu ≤ ru, ∀u ∈ NS (2.16) ∑ p:uv∈p fp ≤ cuv, ∀uv ∈ Es (2.17) Domain Constraints: xiu ∈ {0, 1}, ∀i ∈ NF , u ∈ NS (2.18) fp ≥ 0, ∀p ∈ P (2.19) The objective function (9.5) is the weighted sum of the flow embedding (BW) and server assignment (processing) costs with γ being the normalization factor to determine the balance between the two terms of the objective function. The processing cost corresponding to each substrate server is proportional to its trust value, i.e. the more trustworthy servers are more expensive. The constraint set (2.11) ensures that each VNF-FG node is placed on exactly one substrate node. Constraints in set (2.12) make sure that the traffic demand of each VNF-FG link will be allocated to this commodity using as many augmented paths as needed, while constraints (2.13) enforce that no flow passes through the paths inclusion of which in the SFC embedding 34 solution is disallowed by the node assignment policy, where M is a large enough constant. The constraint sets (2.14), and (2.15) guarantee that the trust requirements of each virtual link and each virtual node are satisfied, while the sets (2.16), and (2.17) guarantee that the allocated CPU and BW resources by each substrate node and link do not exceed their residual capacity, respectively. The constraints in sets (2.18), and (2.19) are the domain constraints corresponding to variable sets x and f respectively. We note that removing constraints (2.14), and (2.15) from the last model gives the baseline PB-SCE model. We note that the PB-TASCE model cannot be used efficiently in realistic settings with large- scale networks due to not being scalable. First, as a generalization of the VNE problem, the SFC embedding problem is NP-hard. Second, the space complexity of the model is mostly driven by the size of the path set P . Indeed, for a complete substrate graph, the set P may contain as many as (e|Ef |/2)(|Ns|!) paths [36]. Even, for a sparse network graph, the size of the set of augmented paths for each virtual edge may grow exponentially in |Ns|. Therefore, due to constraint (2.15), the size of the constraints set grows exponentially with the scale of the network. Similarly, the number of path variables fp grows exponentially with the scale of the network. Therefore, in the rest of this section, we propose a column-generation-based solution to tackle this issue by systematically including only the set of paths that may be chosen in the final solution to carry non-zero traffic, i.e. fp > 0. We modify the PB-SCE and the PB-TASCE models to contain only the k-shortest augmented paths for each commodity. This will result in lower complexity at the expense of suboptimal results. Opting for different values of k one can adjust the performance of the algorithm and seek for a suitable value of k to seek a balance between complexity and result accuracy. We will explore this trade-off in detail in the evaluation section. We refer to these new models as KPB-SCE and KPB-TASCE in order. 35 2.6.2 Column Generation Method Column generation (CG) is an effective technique for solving large linear programs (LPs), especially when the number of decision variables is substantially higher than the number of con- straints. This is because in the final optimal LP solution only as many variables as the number of constraints may appear with non-zero values. Therefore, it is possible to start by solving the considered LP with only a subset of the decision variables and then gradually generate variables (i.e. columns) that have the potential to improve the value of the objective function. The procedure consists of iterative steps including solving a restricted master problem (RMP) followed by a pricing sub-problem (PS). Initially, the RMP is formulated as the original problem with a limited set of decision variables such that an initial feasible solution to the LP can be obtained efficiently by solving the initial RMP. Once the primal RMP is solved the dual variables corresponding to the RMP constraints will be computed and utilized in formulating the PS. The objective function of the PS for each decision variable represents the reduced cost of that decision variable. The reduced cost of a decision variable in linear programming is the amount by which the objective function coefficient of that variable would have to improve (increase for a maximization problem, decrease for a minimization problem) before that variable would become non-zero in the optimal solution. In other words, it is the amount by which the objective function coefficient of the variable would have to increase (or decrease) to make it beneficial to include the variable in the optimal solution. Therefore, in the case of a minimization problem, the PS amounts to minimizing the PS objective function to find the variable with the most negative reduced cost and then adding that to the RMP variable set. Next, the RMP and the PS are solved again and this procedure continues until no variable can be found with a negative reduced cost. 36 2.6.3 Column-Generation-based Solution Following the discussions in the last subsection, the RMP is formulated as, Minimize ∑ p∈P̄trusted cpfp + γ ∑ i∈NF ∑ u∈NS tux i u (2.20) ∑ u∈NS xiu = 1, ∀i ∈ NF (2.21) ∑ p∈P̄ijtrusted fp = dij, ∀ij ∈ EF (2.22) ∑ p∈P̄trusted:iu∈p fp ≤ xiuM, ∀i ∈ NF , u ∈ NF (2.23) ∑ i∈NF gixiu ≤ ru, ∀u ∈ NS (2.24) ∑ p∈P̄trusted:uv∈p fp ≤ cuv, ∀uv ∈ Es (2.25) xiu ∈ [0, 1], ∀i ∈ NF , u ∈ NS (2.26) fp ≥ 0, ∀p ∈ P̄ trusted (2.27) where P̄ ijtrusted ⊆ P ij trusted i.e. the set of those augmented paths in the augmented graph Gij that satisfy the trust constraints and it holds that ∪ij∈EF P̄ ij trusted = P̄ trusted. Therefore, con- 37 straints (2.14) and (2.15) are removed from the RMP. 2.6.3.1 Initial Solution The CG procedure starts from an initial RMP with an initial set of variables -preferably the smallest possible one- that yield a feasible solution to the RMP. However, in general, it is not trivial to find such an instance of the RMP. In the path-based service chain embedding problem, for a general service request GF = (NF , EF ) and a substrate GS = (NS, ES) it is not possible to deterministically establish a set P̄ trusted,init such that the corresponding RMP is feasible. To tackle this issue, we extend the GS = (NS, ES) graph by adding a copy of the service request graph to form a new substrate graph G′ S = (N ′ S, E ′ S) and then constructing the new augmented graphG′a S = (N ′a S , E ′a S ). Furthermore, we assign large weights to the augmented edges inG′a S . An example of this graph is shown in Fig. 2.3. We propose a dummy initial solution for path-based service chain embedding by placing the request graph on its copy in the G′ S = (N ′ S, E ′ S) graph. The large weights assigned to the dummy augmented edges ensure that if the relaxed service chain embedding problem is feasible, none of the dummy placements will end up in the final solution once the CG procedure terminates. 2.6.3.2 Addition of Columns To formulate the pricing sub-problem to generate new columns (paths) that can improve the objective value of the RMP, we need to utilize the dual problem corresponding to the RMP 38 Figure 2.3: Initial Dummy Solution denoted by DRMP. The DRMP is given as follows, Maximize ∑ i αi + ∑ ij dijσij − ∑ u ruβu − ∑ uv cuvγuv (2.28) σij − ∑ u∈Ns:(i,u)∈p or(u,i)∈p τiu − ∑ uv∈p γuv ≤ cp ∀p ∈ P̃ ij,∀ij ∈ Ef (2.29) αi +Mτiu − giβu ≤ 0 ∀i ∈ Nf , u ∈ Ns (2.30) τiu, βu, γuv ≥ 0, αi, σij unrestricted (2.31) where the variable sets α, β, γ,σ, and τ correspond to constraint sets in the last MILP in respective order. The following lemma is useful in obtaining the pricing sub-problem. Lemma 3 The optimal solution to the relaxed path-based service chain embedding problem is 39 obtained at an RMP instance where there is no p ∈ P , which violates constraint (2.29) from the DRMP problem. Proof. Let RMP ∗ denote the RMP instance at which the optimal solution to the relaxed path-based service chain embedding problem is obtained. Let P∗ be the set of generated paths at this instance and let RMP ∗ denote the corresponding dual problem. Further, let RMP and DRMP be the primal and dual RMP instances corresponding to the set of generated paths P . Let OPT (.) denote the optimal solution to each problem instance. It holds for any P ⊆ P∗ that, OPT (RMP ) ≥ OPT (RMP ∗) = OPT (DRMP ∗) ≤ OPT (DRMP ) (2.32) where the middle equality follows from the strong duality theorem, the LHS inequality follows from the fact that the optimal solution to the minimization problem RMP is a feasible solution of RMP ∗, and the RHS inequality holds since the instance DRMP contains fewer constraints than the DRMP ∗ model. Now, consider the specific instance of RMP , where none of the (2.29) constraints are violated when evaluated for all p ∈ P . Then the corresponding (α, β, γ, σ)P and its associated (x, f)P vectors will be feasible solutions to the DRMP ∗ and RMP ∗ problems respectively, because none of the constraints of the DRMP ∗ model are violated. Then, it is straightforward to write: OPT (RMP ) = OPT (DRMP ) ≤ OPT (DRMP ∗) = OPT (RMP ∗) (2.33) Combining (2.32), and (2.33), the desired result follows. Conversely, it holds that in the optimal solution to a linear program, all the basic and non-basic variables have greater than or 40 equal to zero reduced costs. It is straightforward to show that equation (2.29) represents the reduced cost of each fp variable for each p ∈ P . Therefore, in the optimal solution to the relaxed service chain embedding problem the constraints (2.29) are satisfied for all p ∈ P . 2.7 Link-based Model Performance Evaluation In this section, we evaluate the efficiency of the proposed trust-aware SFC embedding method; we benchmark the LP algorithm against the MILP one. Specifically, we provide an overview of the performance evaluation setup (3.2.3.2), and discuss the evaluation results (3.2.3.3). 2.7.1 Performance Evaluation Setup For the simulations, we use an event-based simulator implemented in Java ( [26, 29]), including an SFC and DC topology generator. We use CPLEX for our MILP models based on the branch-and-cut method. We used the dual simplex method for the LP problem. Our tests are carried out on a server with an Intel i5 CPU at 2.3 GHz and 8 GB of main memory. NFV Infrastructure. We have generated a 3-layer fat-tree network topology for the DC, consisting of 16 pods. In our evaluations, we use only one portion (or zone) of the DC consisting of 4 (out of 16) pods, utilizing 2 switches per layer and the corresponding set of 2 servers per ToR switch. For each server, we consider 8 cores running at 2 GHz. Each server’s initial utilization is uniformly distributed U(0.3, 0.6). The ToR-to-Server link capacity is 8 Gbps while the inter- rack link capacity is 16 Gbps. The initial trustworthiness of the substrate nodes is drawn from a uniform distribution U(0.2, 1). Service Chains. We generate VNF-FGs based on three service chain templates: (i) a chain 41 handling traffic that needs to pass through a particular sequence of VNFs, i.e., a NAT and an FW followed by an IDS; (ii) the second template reflects the case where traffic in a service chain is split by a particular VNF, according to some predefined policy, e.g., load balancing; (iii) the last template corresponds to a bifurcated path with a single endpoint reflecting cases where one part of the traffic needs to be encrypted while another part needs to pass through a firewall [37]. For each chain, the number of requested VNFs, inbound traffic demands, and requested trust level are uniformly distributed, U(5, 9), U(50, 100) Mbps, and U(0.2, 0.8) respectively. The CPU demand of each VNF is derived from the inbound traffic rate and the respective VNF profile (cycles per packet). Resource profiles are available for a wide range of VNFs [38, 39], while profiling techniques can be employed for processing workloads whose computin