ABSTRACT Title of Dissertation: INTRODUCING MEETING LOCATIONS IN TAXI RIDE- SHARING SYSTEMS Sanaz Aliari Kardehdeh Dissertation directed by: Professor Ali Haghani, Department of Civil and Environmental Engineering Taxi-sharing is admittedly a promising solution to reduce travel costs and mitigate congestion. However, sharing a ride with other passengers may impose long detours. Taxi-sharing can be more beneficial if the vehicle miles traveled for the detours can be reduced when serving additional passengers. In this study, we propose incorporating alternative meeting points in taxi-sharing routing to further boost efficiency by eliminating unnecessary detours and improving the chances of passengers being matched. Unlike traditional taxi-sharing systems, passengers are not necessarily picked up at their original location in the proposed system. Instead, they are serviced within a walkable distance of their origin (meeting points). This can be especially beneficial in heavily congested road networks in dense urban areas. There are several challenges in the real-world implementation of a ride-sharing system with alternative pick-up points in a dense high-demand network. The first challenge is to find appropriate passenger matches that may share a ride. The second challenge is to effectively find a set of specific locations as the potential alternative pick-up points that are likely to reduce total travel times. Once possible matches and the corresponding set of candidate pickup points are selected, the last challenge is to obtain optimal/near-optimal routes to satisfy the passengers? demand with a reasonable computational time. In this study, we formulate a mixed integer programming model for the ride-sharing problem with alternative pick-up points and introduce strategies to cope with these challenges in a real-world setting. We use the New York City taxi demand data that may sometimes have hundreds of demands per minute in a relatively small geographical area. We first introduce a decomposition procedure to break down such a large-scale problem into smaller subproblems by pre- matching groups of passengers that are more likely to share a ride. We then create an initial set of candidate locations for all pick-up points in each group. Then, different strategies are proposed to reduce the set of candidate locations into smaller subsets, including alternative locations with higher potentials of constructing better routes. The experimental results show that incorporating alternative pick-up points results in significant savings in total travel times, total wait times, and the number of vehicles used compared to a baseline ride-sharing system that picks up each passenger exactly from their requested location. Finally, given the high computational cost of the optimization problem, we propose a genetic algorithm to find near-optimal solutions for the formulated problem, and we show that the proposed algorithm achieves solutions that are very close to the optimal solutions, with significantly lower computational times. We design specific operations to implement basic components of the genetic algorithm in the context of our algorithm. This includes strategies to diversify and create random sets of feasible solutions (individuals), select the fittest individuals in each generation, and create new generations through mutations and ordered cross-over such that the new individuals can inherit from their parents while still representing a feasible solution. INTRODUCING MEETING LOCATIONS IN TAXI-SHARING SYSTEMS by Sanaz Aliari Kardehdeh Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park, in partial fulfillment of the requirements for the degree of Doctor of Philosophy 2021 Advisory Committee: Professor Ali Haghani, Chair Professor Bruce Golden, Dean?s Representative Professor Gang-Len Chang Professor Paul Schonfeld Doctor Kaveh Farokhi Sadabadi ? Copyright by Sanaz Aliari Kardehdeh 2021 Dedication This dissertation work is dedicated to my parents, Giti and Salar, who have always loved me unconditionally and taught me to follow my dreams and become who I. This work is also dedicated to my husband, Vahid, who has been a source of support and encouragement through all the challenges. I am truly thankful for having you in my life. ii Acknowledgments I owe my gratitude to all the people who have made this thesis possible and because of whom my graduate experience has been one that I will cherish forever. First and foremost, I'd like to thank my advisor, Professor Ali Haghani, for his kindness, guidance, and support throughout the years. It has been a pleasure to work with and learn from such an extraordinary individual. I am also grateful to the members of the advisory committee, Professor Bruce Golden, Professor Gang-Len Chang, Professor Paul Schonfeld, and Dr. Kaveh Farrokhi- Sadabadi, whom I also had the privilege to work and learn from as their students during my graduate studies, for their comments, suggestions and sparing their valuable time reviewing the manuscript. My friends, fellow students, and colleagues at Center for Advance Transportation Technologies (CATT) have enriched my graduate life in many ways and deserve a special mention: Shahrzad Saffari, Kiana Roshan-Zamir, Golnush Masghati, Aref Darzi, Sara Zahedian, Amir Nohehkhan, Sepideh Eshragh, Ali Shafahi, Sepehr Ghader, Romina Jahangiri, Moschoula Pternea, Binya Zhang, Qinglian He, Chenyang Fang, Zhongxiang Wang, Yeming Hao, Somayyeh Mohammadi, Arefeh Nasri, Mohammad Motalleb Nejad, Eric Oden, Sarvenaz Hedayati, and also my mentors and senior members of CATT who taught me a lot and kindly supported my research at CATT, Thomas Jacobs, Zachary Vander-Laan and Przemyslaw Sekula. I owe my deepest thanks to my family, my mother, and my father, who brought me forth and have always stood by me, and my brother, Yashar and his lovely wife Farnaz, iii who were great source of encouragement and inspiration all these years. Words cannot express the gratitude I owe my family. I wish to thank my husband, my friend, Vahid, for his enduring love, for believing in me, and for sharing my wish to reach the goal of completing this task. This work would have never been possible without his endless support. Lastly, thank you UMD community! iv Table of Contents Dedication ..................................................................................................................................... ii Acknowledgments ........................................................................................................................ iii Table of Contents .......................................................................................................................... v List of Tables .............................................................................................................................. viii List of Figures .............................................................................................................................. ix List of Abbreviations ..................................................................................................................... x Chapter 1: Introduction ......................................................................................................... 1 1.1 Motivation ......................................................................................................................... 1 1.2 Problem Statement ............................................................................................................. 4 1.3 Research Contributions ...................................................................................................... 4 1.4 Organization of the Dissertation ........................................................................................ 5 Chapter 2: Literature Review ................................................................................................ 6 2.1 Traveling Salesman Problem ............................................................................................. 6 2.2 Vehicle Routing Problem .................................................................................................. 6 2.3 Dial-A-Ride Problem ......................................................................................................... 7 2.4 Ride-sharing Problem ........................................................................................................ 8 2.5 Taxi-Sharing Problem ...................................................................................................... 11 2.5.1 Metaheuristic Solution Approaches ........................................................................... 13 2.6 Pricing Strategies ............................................................................................................. 15 2.7 Meeting Point .................................................................................................................. 16 2.8 Summary .......................................................................................................................... 20 Chapter 3: Methodology...................................................................................................... 22 3.1 Methodology Overview ................................................................................................... 22 3.2 Passenger Demand Pre-matching .................................................................................... 26 3.3 Finding Alternative Meeting Points ................................................................................. 27 3.3.1 Meeting Point Selection Strategies ............................................................................ 28 3.4 The Optimization Model.................................................................................................. 31 v 3.4.1 Model Assumptions ..................................................................................................... 31 3.4.2 Model Notations .......................................................................................................... 33 3.4.3 Problem Formulation ................................................................................................... 34 3.5 Summary ............................................................................................................................... 38 Chapter 4: NYC experimental case study ........................................................................... 40 4.1 Initial Feasibility Study ................................................................................................... 40 4.1.1 NYC Taxi Dataset ....................................................................................................... 41 4.1.2 The Potential of Alternative Meeting Points In NYC .................................................. 45 4.1.3 Pre-Matching Sensitivity Analysis for NYC Data ....................................................... 46 4.2 Alternative Point Matching: MILP Experimental Results ............................................... 49 4.3 Computational Complexity .............................................................................................. 54 4.4 Meeting Point Selection Strategies .................................................................................. 55 4.4.1 Comparing Meeting Point Selection Strategies .................................................................. 57 4.5 Summary ............................................................................................................................ 59 Chapter 5: A Genetic Algorithm Solution ........................................................................... 62 5.1 Genetic Algorithm Fundamentals .................................................................................... 62 5.2 Chromosome Representation ........................................................................................... 63 5.3 Population Initialization .................................................................................................. 64 5.4 Fitness Function and Parent Selection ............................................................................. 68 5.5 Crossover and Mutation ................................................................................................... 69 5.6 Summary ............................................................................................................................... 71 Chapter 6: Genetic Algorithm Experimental Results .......................................................... 72 6.1 Comparing the performance of Genetic Algorithm and MILP ........................................ 72 6.2 Sensitivity Analysis for Genetic Algorithm Parameters ................................................. 74 6.2.1 Cross-Over Rate ......................................................................................................... 74 6.2.2 Survival rate ................................................................................................................ 75 6.2.3 Population Size ........................................................................................................... 76 6.2.4 Number of Generations.............................................................................................. 78 6.2.5 The impact of Greedy Request Insertion ................................................................... 81 vi 6.3 Sensitivity Analysis for Taxi-Sharing Problem Parameters Using Genetic Algorithm .. 83 6.3.1 The Impact of Vehicle Dispatch Penalty ................................................................... 83 6.3.2 The Impact of Allowable Ride-Time Factor.............................................................. 84 6.3.3 The Impact of Pick-up Time Threshold ..................................................................... 85 6.4 Large-Scale Experimental Results ................................................................................... 86 6.4 Summary ............................................................................................................................ 88 Chapter 7: Pricing Policy Analysis ..................................................................................... 90 7.1 Pricing Policy ..................................................................................................................... 90 7.2 Summary ............................................................................................................................ 95 Chapter 8: Conclusions and Directions for Future Research .............................................. 96 8.1 Conclusion ........................................................................................................................... 96 8.2 Recommendations for Future Research ............................................................................... 98 References ................................................................................................................................. 102 vii List of Tables TABLE 3.1: METHODOLOGY STEPS.............................................................................................................. 23 TABLE 3.2 NOTATION SUMMARY OF PARAMETERS AND VARIABLES ........................................................... 35 TABLE 4.1: STATISTICS OF RESULTING SCHEDULES (ADDITIONAL VEHICLE PENALTY = 1000) .................... 52 TABLE 4.2: STATISTICS OF RESULTING SCHEDULES (ADDITIONAL VEHICLE PENALTY = 4000) .................... 52 TABLE 4.3: SOLUTION RESULTS (352 REQUESTS FROM NYC DATA) ........................................................... 58 TABLE 4.4: SOLUTION RESULTS (1292 REQUESTS FROM NYC DATA) ......................................................... 58 TABLE 6.1: GA AND MILP PERFORMANCE COMPARISON (965 REQUESTS FROM NYC DATA) .................... 73 TABLE 6.2 COMPARISON OF THE OBJECTIVE VALUE FOR DIFFERENT CROSS-OVER RATES ........................... 75 TABLE 6.3 COMPARISON OF THE OBJECTIVE VALUE FOR DIFFERENT SURVIVAL RATES ............................... 75 TABLE 6.4 GREEDY INSERTION AND RANDOM INSERTION METHODS OPTIMALITY GAP COMPARISON .......... 82 TABLE 6.5 GREEDY INSERTION AND RANDOM INSERTION METHODS OPTIMALITY GAP COMPARISON FOR DIFFERENT NUMBERS OF PRE-MATCHED REQUESTS ........................................................................... 83 TABLE 6.6 COST BREAKDOWN OF THE OBTAINED SOLUTIONS FOR 965 REQUESTS FOR DIFFERENT VEHICLE DISPATCH PENALTY VALUES .............................................................................................................. 84 TABLE 6.7 COST BREAKDOWN OF THE OBTAINED SOLUTIONS FOR 965 REQUESTS FOR DIFFERENT ALLOWABLE RIDE TIME FACTOR VALUES ........................................................................................... 85 TABLE 6.8 COST BREAKDOWN OF THE OBTAINED SOLUTIONS FOR 965 REQUESTS FOR DIFFERENT PICK-UP TIME THRESHOLD VALUES ................................................................................................................. 86 TABLE 6.9 COST BREAKDOWN OF THE OBTAINED SOLUTIONS FOR 43,027 REQUESTS, COMPARING THE BASELINE SYSTEM TO THE GA BASED AMP SYSTEM (AMP-GA)...................................................... 87 TABLE 7.1 AVERAGE NUMBER OF PARTICIPANTS PER DISCOUNT RATE ....................................................... 92 viii List of Figures FIGURE 1.1 COMPARISON OF CONSTRUCTED RIDESHARE ROUTES WITH (LEFT) AND WITHOUT (RIGHT) ALTERNATIVE PICKUP LOCATIONS (RONAK, 2017). ............................................................................. 3 FIGURE 3.1 TWO METHODS FOR CREATING A POOL OF REQUESTS: DISCRETE WINDOWS VS. SLIDING WINDOWS .......................................................................................................................................... 25 FIGURE 3.2 MEETING LOCATION SELECTION STRATEGIES ........................................................................... 31 FIGURE 4.1: DAILY TRIPS IN JUNE 2015. ..................................................................................................... 42 FIGURE 4.2: REQUEST COUNTS PER MINUTE ON 6/5/2015 BETWEEN 1 PM AND 2 PM. ................................... 43 FIGURE 4.3: MANHATTAN TAXI-ZONE MAP (TAX, B) .................................................................................. 44 FIGURE 4.4: LEFT: PICK-UPS IN FOUR ZONES IN MANHATTAN DURING 13:00-13:01, RIGHT CORRESPONDING DROP-OFF LOCATIONS. EACH CIRCLE ON THE MAP IS A PICK-UP REQUEST IN THE LEFT PLOT AND A DROP-OFF LOCATION IN THE RIGHT PLOT. CIRCLES WITH SIMILAR COLORS ARE FROM THE SAME PICK- UP ZONES. .......................................................................................................................................... 45 FIGURE 4.5: TRAVEL TIME FOR DRIVING AND WALKING TO A LOCATION IN A 0.1-MILE DISTANCE. ............. 47 FIGURE 4.6: PRE-MATCHING ALGORITHM SENSITIVITY TO PRE-MATCHING RADIUS FOR THE AVERAGE COUNT OF PRE-MATCHED REQUESTS (LEFT) AND THE PERCENTAGE OF NOT-MATCHED REQUESTS (RIGHT) ............................................................................................................................................. 48 FIGURE 4.7: DISTRIBUTION OF THE NUMBER OF MATCHED REQUESTS FOR ONE-MINUTE DATA ................... 50 FIGURE 4.8: ROUTING COMPARISON OF TWO MODELS................................................................................. 54 FIGURE 4.9: PROCESSING TIMES OF THE PRESENTED MODELS FOR THE SELECTED 1-MINUTE DATA ............ 55 FIGURE 4.10 MEETING LOCATION USAGE RATE .......................................................................................... 57 FIGURE 5.1 A CHROMOSOME REPRESENTATION OF A FEASIBLE SOLUTION WITH TWO VEHICLES. ............... 64 FIGURE 5.2 AN EXAMPLE OF ORDER BASED CROSSOVER ........................................................................... 70 FIGURE 6. 1 THE TOTAL OBJECTIVE VALUE CONVERGENCE IN GENERATIONS OF THE GA ........................... 73 FIGURE 6. 2 POPULATION SIZE IMPACT ON MAX COMPUTATION TIME AND AVERAGE OPTIMALITY GAP ...... 76 FIGURE 6.3 POPULATION SIZE IMPACT ON AVERAGE AND 75TH PERCENTILE OPTIMALITY GAP ................... 77 FIGURE 6.4 POPULATION SIZE IMPACT ON AVERAGE OPTIMALITY GAP FOR DIFFERENT PROBLEM SIZES...... 78 FIGURE 6.5 THE IMPACT OF NUMBER OF GENERATIONS ON MAXIMUM COMPUTATION TIME AND AVERAGE OPTIMALITY GAP ............................................................................................................................... 80 FIGURE 6.6 NUMBER OF GENERATIONS IMPACT ON AVERAGE AND 75TH PERCENTILE OPTIMALITY GAP ..... 80 FIGURE 6.7 GREEDY REQUEST INSERTION VS. RANDOM REQUEST INSERTION METHODS? COMPUTATION TIME COMPARISON ..................................................................................................................................... 81 FIGURE 7.1 TOTAL OBJECTIVE FUNCTION VALUE VS. DISCOUNT RATES COMPARISON FOR AMP MODEL AND BASELINE MODEL. ............................................................................................................................. 93 FIGURE 7.2 AVERAGE WAIT-TIME VS. DISCOUNT RATES COMPARISON FOR AMP MODEL AND BASELINE MODEL. ............................................................................................................................................. 93 FIGURE 7.3 AVERAGE EXCEEDED RIDE-TIME FUNCTION VALUE VS. DISCOUNT RATES COMPARISON FOR AMP MODEL AND BASELINE MODEL. ................................................................................................ 94 FIGURE 7.4 TOTAL NUMBER OF SHARED VEHICLES VS. DISCOUNT RATES COMPARISON FOR AMP MODEL AND BASELINE MODEL. ...................................................................................................................... 94 ix List of Abbreviations AMP Alternative Meeting Point CVRP Capacitated Vehicle Routing Problem DCP Department of City Planning GA Genetic Algorithm FHVs For-Hire Vehicles HOV High Occupancy Vehicle MILP Mixed Integer Linear Programming MIP A Mixed Integer Programming NP-hardness Non-deterministic Polynomial-time hardness NTA Neighborhood Tabulation Areas NYC New York City OSRM Open Street Map Routing Machine RSP Ride-sharing Problem SDRT Shared Demand-Responsive Transportation SHL Street Hail Livery TAZ Traffic Analysis Zones TLC Taxi and Limousine Commission TSP Traveling Salesman Problem VMT Vehicle Miles Traveled VRP Vehicle Routing Problem VRPPD Vehicle Routing Problem with Pickup and Delivery VRPTW Vehicle Routing Problem with time windows x Chapter 1: Introduction 1.1 Motivation In the past few decades, overpopulation of urban areas, overdevelopment, inadequate mass transit options, and improvement in living standards have led to a growing demand for mobility. This has caused persistent traffic congestion in road networks and has led to serious health concerns due to air pollution. In fact, the annual cost of congestion is more than nearly $820 for each commuter in the United States (Schrank et al., 2012). Due to specific reasons such as lack of funds, open lands, or residents? support, infrastructure development has limitations and cannot be implemented in many cases. Therefore, more efficient use of the existing transportation infrastructure can be highly beneficial in addressing the challenges of ever-growing mobility issues. According to a Bureau of Transportation Statistics report in 2012, 76.4% of commuters drove to work alone (Sprung et al., 2012). This suggests that sharing a portion of available seats among travelers with similar itineraries and time schedules might be a promising solution to reduce the overall load of the road networks. The benefit of ride-sharing is not limited to congestion mitigation for society. Reduced accidents, fuel conservation, increased accessibility, and reduced emissions are other benefits of ride-sharing (Chan and Shaheen, 2012; Fellows and Pitfield, 2000). On the other hand, individuals can benefit from ride-sharing through travel cost savings resulting from splitting the cost with other travelers, travel time reduction compared to public transportation, and potentially accessing HOV lanes (compared to solo drivers). 1 The advantages and disadvantages of taxi-sharing are similar to private ride-sharing. In (Jacobson and King, 2009), it is shown that assuming no detours are necessary, sharing a ride with only one person for every 100 vehicles can save about 0.8 billion gallons of fuel per year. Technology advancements and the emergence of smartphones have made the idea of real-time ride-sharing more feasible than before (i.e., drivers and riders are matched via cell phone apps in real-time). They have attracted the attention of researchers in academia and industries to take advantage of this capacity to find practical solutions satisfying the mobility demand while reducing the total vehicle miles traveled (VMT). In fact, some online taxi-hailing companies have already launched taxi-sharing services such as UberPOOL and Lyft Line. A vehicle trajectory study in Italy showed that up to 60% of rides could be saved if passengers agree to a 1 km detour (Bicocchi et al., 2015), which obviously results in a significant reduction in traffic congestion. However, it is indeed important to take the extra miles and travel time caused by detours to pick up additional passengers into account since it may eliminate potential savings of fuel consumption. In fact, the current practice of complete door-to-door mobility solutions could impose long unnecessary detours to travel itineraries to pick-up/drop-off passengers at their exact requested locations. Therefore, it is essential to carefully design an efficient taxi-sharing system to benefit from its promised advantages fully. Introducing appropriate meeting points instead of the initially requested pick-up locations has excellent potential to decrease the extra detours. Figure 1.1 depicts an example case where appropriately selected meeting locations (left panel) significantly improve the efficiency of the shared ride compared to 2 the traditional approach of picking up all passengers at their originally requested location (right panel). It can be seen that by passengers agreeing to walk at most two blocks to an alternative pick-up point, the necessary detours can be decreased considerably. Moreover, meeting points can increase the chance of matching passengers to travel in a single vehicle and hence, improving the system?s overall efficiency in responding to the demand. It could even be possible to pick up or drop off a group of passengers at a single meeting point to reduce the number of stops. Meeting points also offer more privacy to passengers as their origin and destination are not disclosed to other travelers. Figure 1.1 Comparison of constructed rideshare routes with (left) and without (right) alternative pickup locations (Ronak, 2017). 3 1.2 Problem Statement The purpose of this study is to develop a framework to increase the efficiency of taxi-sharing systems by introducing alternative pick-up/drop-off points, i.e., meeting points. As discussed, meeting points can increase the chances of matching passengers and increase the occupancy rate; therefore, the traveled miles and the number of vehicles required to satisfy the demand can be lower. Also, it can prevent unnecessary detours and decrease the number of stops needed. Thus, the total driving distance can be further reduced. Obtaining real-world locations of alternative meeting points in an automated setting, finding a strategy to select a practical subset of them (since including all of the possible meeting points in scheduling calculation is computationally expensive), locating appropriate matches to share a ride, and obtaining an optimal route to satisfy the demand are the challenges that this study tries to address taking advantage of new spatial data tools (e.g., OSMnx) and operation research techniques. 1.3 Research Contributions The contributions of this research are as follows: ? Developing a routing framework to find the most effective alternative pick-up locations and obtain an efficient taxi-sharing route with a reasonable occupancy rate. ? Applying the framework to large-scale real-world taxi demand data to address all of the practical challenges. 4 ? Presenting a novel optimization model to incorporate the alternative points in the solution. ? Providing an efficient heuristic solution algorithm to solve for the real-world- scaled simulated environment. 1.4 Organization of the Dissertation In Chapter 1, the motivation of the research and the scope of the problem, and the challenges were presented. In Chapter 2, the previous important studies related to the problem are summarized, and the potential contributions to the current literature are discussed. In Chapter 3, first, the problem is formally defined. Next, the steps of the routing framework are identified, and finally, the proposed methodology and the routing optimization model are presented. Chapter 4 presents the experimental results to test the framework and evaluate the proposed system against the traditional ride-sharing system. In Chapter 5, a genetic algorithm-based solution is proposed to find near-optimal routes while efficiently considering alternative meeting points. In Chapter 6, the proposed genetic algorithm solution's performance is studied compared to the exact method, and its sensitivity to its hyperparameters is extensively studied. Moreover, the sensitivity of the solutions to the parameters of the problem is also analyzed using the proposed GA algorithm. In chapter 7, a pricing policy framework based on a hypothetical model for passengers? willingness to share rides is introduced to demonstrate the advantages of the proposed ride-sharing system from the pricing policy point of view. Finally, in chapter 8, the conclusions are drawn, and recommendations for future studies are provided. 5 Chapter 2: Literature Review In this chapter, we first summarize the literature regarding the fundamentals of ride- sharing systems. We then review the details of relevant studies and compare these works to our problem and the proposed methodology (introduced in the next chapter). 2.1 Traveling Salesman Problem The problem of Taxi-sharing has its roots in the well-known Traveling Salesman Problem (TSP). TSP is one of the most important and studied combinatorial problems. It requires finding the minimum-cost Hamiltonian cycle in a graph. Despite the simplicity of the problem statement, the TSP cannot be solved in polynomial time and is considered NP-hard (non-deterministic polynomial-time) as the number of possible paths to explore a graph with ? vertices grows with ?!. A comprehensive description and overviews of solution methods of the TSP can be found in (Laporte, 1992) and (Gendreau et al., 2010). The TSP can be formulated and solved as an integer linear program (Papadimitriou and Steiglitz,1998) 2.2 Vehicle Routing Problem The vehicle routing problem (VRP) is another combinatorial optimization and integer programming problem which generalizes the TSP. In VRP, the goal is to find optimal routes for a fleet of vehicles to visit a given set of locations. The problem first appeared in 1959 (Dantzig and Ramser, 1959). Since then, different problem variations 6 and an abundance of methods to approach the problem have been introduced. A VRP with only one vehicle is reduced to TSP, which is known to be NP-hard. Therefore, VRP and all of its variations are also NP-hard problems. Common variations of VRP are: ? Capacitated VRP (CVRP): the sum of the visited customers? demand in each route cannot exceed the associated vehicle?s capacity. ? VRP with time windows (VRPTW): each customer must be visited within a time interval. ? VRP with pickup and delivery (VRPPD): each request is defined by a pickup point, a corresponding delivery point, and a demand to be transported between these locations (Desaulniers, 2000). ? Time-dependent VRP (TDVRP): the real-time variations in travel times between the network nodes are estimated for the time of operation (Jung and Haghani, 2001). A study by (Braekers et al., 2016) classifies characteristics and assumptions of VRP literature and analyzes the current trend in this area. 2.3 Dial-A-Ride Problem A combination of these three variants of VRP with considering human factors for transporting people is called the Dial-A-Ride Problem (DARP). DARP requires a minimum-cost route and schedule to pick up and drop off passengers within their 7 requested time windows. Its primary difference from vehicle routing problems is the human perspective, since not only operating costs but also user inconvenience (e.g., customer waiting time, customer riding time, and customer late arrival time) should be minimized (Calvo and Colorni, 2007; Cordeau and Laporte, 2003; Psaraftis, 1983) or constrained (Coslovich et al., 2006). Another important constraint in DARP is the time window. There are two different approaches to model time window constraints: hard time window or soft time window. Hard time window ensures that each pick-up and drop-off point is visited strictly within the time window range (Calvo and Colorni, 2007; Cordeau, 2006; Coslovich et al., 2006; Markovi?c et al., 2015). However, the soft time window allows some early and late arrivals of the vehicles though handled by minimizing total early arrival time and delay penalties (Cordeau and Laporte, 2003; Kim and Haghani, 2011). Initially, the DARP was exclusively designed for door-to-door transportation services for the elderly or people with disabilities. However, later due to emerging new mobility solutions and the shortcomings of conventional public transportation systems, it is also considered for transporting people without disabilities (Kashani et al., 2016; Nelson et al., 2010). 2.4 Ride-sharing Problem The ride-sharing problem (RSP) is similar to DARP, except there is not a depot. Instead, the origin and destination for each driver (also drivers? availability) may be constrained by time windows. The problem is to assign drivers to passengers (matching) 8 and to minimize the drivers' detour length or other objectives (vehicle routing). The matching and the routing problems can be modeled separately or be closely interconnected. The single driver/single passenger problem can be solved using a bipartite graph matching with no need for the routing phase (Agatz et al., 2011). The single-driver multiple-passenger problem can be considered equivalent to the vehicle routing problem with pick-up and delivery, and no matching phase is required. DARP and RSP are defined in two modes: static and dynamic scenarios (see Furuhata et al. (2013) for an overview). In a static scenario, it is assumed that the requests are known in advance (Baldacci et al., 2004; Calvo et al., 2004; Yan and Chen, 2011a; Armant and Brown, 2014; Huang et al., 2014; Chou et al., 2016; Huang et al., 2017, 2018; Fan et al., 2018; Jiau et al., 2018; Hou et al., 2018), while in the dynamic mode requests appear throughout the time, and the routing schedules need to be adjusted to satisfy new demand (Xing et al., 2009, Ma, 2017; Agatz et al., 2011; Ghoseiri et al., 2010; Di Febbraro et al., 2013; Jung et al., 2016; Masoud and Jayakrishnan, 2017a,b; Wang et al., 2017; Bian and Liu, 2017). However, in most real-world settings, we have the fusion of both modes (Cordeau and Laporte, 2003). A thorough review of optimization algorithms for the dynamic ride-sharing problem can be found in (Altshuler et al., 2017) and (Agatz et al., 2012). A dynamic ride-sharing system is introduced in (Xing et al., 2009), where passengers and drivers can be matched en route. Some passenger preferences, such as acceptable response time, gender, and smoking, are considered in the model. Experiments 9 simulated on the actual map of the Bremen metropolitan area showed that the matching rate is increased with the number of available drivers. Geisberger et al. (2009) describe a pruning strategy to group and match the trips where the origin and destination locations of the drivers and ride-sharing requests are close to each other. A fast detour computation algorithm is designed to minimize the additional travel distance imposed by sharing a ride. The road network is modeled as a weighted graph, where travel times between the nodes are considered edge weights. Once a sharing request emerges, all possible combinations with the existing requests are explored to find the shortest detour. They also suggested evenly sharing the travel cost between the participants for the shared portion of the trip based on the fair sharing rule in Algorithmic Game Theory. A dynamic ride-sharing problem with several passenger preferences is proposed in (Ghoseiri, 2012; Ghoseiri et al., 2010). These preferences include age, gender, smoking, and pet restrictions. Moreover, constraints related to vehicle occupancy, waiting time to pick up, the number of connections, detour distance for vehicles, and relocation distance for passengers are considered in the model. The objective is to maximize the number of matches in a given planning horizon and minimize the riders? and drivers? travel time. The participants are willing to share a ride to take advantage of splitting the travel cost and benefit from the High Occupancy Vehicle (HOV) lanes to reduce their travel time. A heuristic three-spherical (spatial, temporal, and hierarchical) decomposition model is developed to solve the problem. 10 Wang et al. (2016) also considered the availability of HOV lanes in a static pick-up and delivery problem with time windows. They explored the changes in the optimal routes, passenger travel times, and toll costs where taking HOV lanes is an option. They concluded that when the time saving on HOV lanes is significant, taking detours to pick up additional passengers is favorable and can be an incentive to share a ride. Lobel and Martin (2020) studied detours and values in shared rides. A detour is the extra driving distance to serve additional passengers, and value is the driving distance saved by serving multiple passengers in a single ride. They defined detour and value ratios as the total detour (value) divided by the total direct travel distance to compare values for different city topologies. The authors showed that the sum of detours and values ratios could not exceed ?. Interestingly, they find that Manhattan looks more like a highway network than a grid from the ride-sharing perspective. 2.5 Taxi-Sharing Problem Taxi-sharing is a more flexible transportation mode than ride-sharing because the taxi drivers do not have detour restrictions and are not obliged to their own itineraries. The problem can be modeled as a dynamic DARP (Carotenuto et al., 2014; Lois and Ziliaskopoulos, 2017). In dynamic taxi-sharing, taxis with available seats are dynamically matched with new requests, and routing schedules are adjusted in a short time. A taxi- sharing system with a spatio-temporal indexing structure is proposed in (Ma et al., 2014), which can match the passengers to en route taxis in a very short time using insertion heuristics. 11 A Dynamic Taxi-Sharing (DTS) system is proposed in (Hao, 2018). The DTS system matches drivers and riders, yields routing schedules, and dynamically calculates riding fees. Taxi fees are adjusted according to the occupancy rate in order to give both taxi drivers and riders monetary incentives to use the DTS system. A spectral clustering pre-selection procedure is applied to shrink the search space for the model. Results show that the proposed approach can perform matching and fare calculation in a short time; however, the model is only tested with the assumption that each trip can be shared with only one additional request. A greedy randomized adaptive search procedure (GRASP) with path re-linking technique and Contraction Hierarchy (CH) algorithm is developed in (Santos and Xavier, 2015) for matching requests to vehicles, which can be taxis or private ride-sharing drivers. The cost-sharing mechanism is to evenly divide the cost of each part of the route between passengers in a ride. The objective is to maximize the number of served requests and minimize the total cost of all served requests. Real data are used to create instances for both dynamic and static problems. The experiments show that the path re-linking technique boosts the GRASP heuristic's performance, and the CH algorithm performs significantly better than the Dijkstra algorithm in solving dynamic problems. A Mixed Integer Programming (MIP) model is presented by Hosni et al. (2014) to maximize the profit and assign passengers to vehicles for the shared-taxi problem. A Lagrangian decomposition and two heuristic approaches (for a dynamic system) are proposed to solve the problem. A simple heuristic and Incremental Cost Heuristic (ICH) 12 algorithms are used to find a lower bound for the optimal solution. The results are compared to CPLEX. Scenario-based methods were adopted in (Manna and Prestwich, 2014) to minimize expected delays or unserved requests for a taxi-sharing problem in which trip requests appear during the execution of the solution. A greedy heuristic is proposed to generate initial solutions for each deterministic scenario. The solutions are improved using a local search algorithm. 2.5.1 Metaheuristic Solution Approaches Metaheuristic algorithms are among the leading solutions in optimizing taxi sharing problems. A genetic algorithm approach is suggested in Ma and Zhang, 2018 for a simplified taxi sharing problem where only the pick-up order of the passengers is determined based on a set detour threshold which minimizes the total travel cost. The drop-off order is assumed to follow the pickup order. In the proposed multi-objective GA, the traditional crossover and mutation operators are replaced with a kind of partheno- genetic operator. A case study is carried out based on a road network with 24 nodes which shows the proposed genetic algorithm can efficiently solve the problem. Moreover, it is shown that the model can increase taxi drivers' income while reducing the cost for passengers. A Hybrid-SA (HSA) is proposed in Jung et al., 2016 for a dynamic shared-taxi dispatch system to improve passenger travel by utilizing vehicle resources more efficiently. 13 Ye et al., 2015 designed the Non-dominated Sorting Genetic Algorithm II (NSGA- II) to solve a multi-objective taxi-sharing problem. The considered objectives are minimizing the total ride-sharing fare of riders, the total ride-sharing traveling time of riders, and the total fuel charge for drivers. Their result showed that the income of drivers could be improved while passengers can benefit from low fares in taxi ride-sharing. A genetic algorithm solution is proposed in Ren et al., 2020 to explore the problem of taxi-sharing in shared electric vehicles. The discussed problem seeks to optimize routes while considering the operational cost, user time cost, user rental cost, and user ride- sharing bonus. The proposed approach can efficiently reduce the system-wide vehicle mile traveled compared to the baseline shared electric vehicles with no sharing. Zhang et al., 2020, proposed Multi-Objective Evolutionary Algorithms (MOEAs) to find solutions for a route optimization and sharing expenses taxi-sharing model in which vehicles loading rate, rider cost, and distance traveled are considered. New elements such as the championship operator are designed to be executed before the crossover and mutation operator to improve the algorithm's performance. A recent study by Ting et al., 2021 classifies characteristics and assumptions of shared mobility literature and analyzes the current trend in this area. They classified the shared mobility problems into six types which include ride-sharing, carpooling, taxi-sharing, vanpooling, bus- pooling, and multi-modal. Out of 103 reviewed papers, only 13 articles were categorized under taxi-sharing. Among these, only two articles considered many drivers and many riders, and multiple origins and multiple destinations, and none of them considered meeting points. 14 Cao et al., 2021 proposed a fair ride-sharing route optimization model that considers both user-fairness and system optimization in a single driver-multiple riders ride-sharing setting. The proposed system allows higher discount rates for the users with a long detour for user-fairness considered in this study. A genetic algorithm (GA) based approach is used to solve this problem. The results are compared to the non-ride-sharing mode and show that the proposed system can reduce the taxi empty-loaded rate as well as the passengers? costs. A brief review of the pricing strategies is presented in the following subsection. 2.6 Pricing Strategies There exist several studies focusing on mechanisms of ride-sharing pricing to increase the chances of cooperation between passengers and drivers. One method is to share the travel cost proportionate to different travel attributes, such as travel distance, detour, and waiting time, and fairly allocate the total cost among participants (Lu, 2019; Bistaffa et al., 2015; Gopalakrishnan et al., 2016; Li et al., 2016; Wang et al., 2018; Chen et al., 2018; Bian and Liu, 2018). Another mechanism is to incorporate the passenger pricing in the matching and routing optimization to obtain specific objectives, such as total profit maximization, total cost minimization, maximizing the total saved travel mileage. (Cheng et al., 2012; Biswas et al., 2017a,b; Santos and Xavier, 2015; Qian et al., 2017). Another method is to adjust the price to obtain a balanced ride-sharing supply and demand. This method which is commonly used in taxi services (Yang et al., 2002; Zhang 15 and Ukkusuri, 2016; Qian and Ukkusuri, 2017), is also modified and applied in the context of the ride-sharing problem (Witt et al., 2015; Banerjee et al., 2015; Fang et al., 2016; Liu and Li, 2017). Ride-hailing companies like Uber and Lyft use this pricing mechanism to incentivize drivers to relocate to undersupplied locations (Hall et al., 2015). Auction-based mechanisms are another framework that aims to maximize society?s overall welfare in which participants are given incentives to report their maximum willingness to pay the price. Several researchers developed Auction-based mechanisms for ride-sharing services (Kamar and Horvitz, 2009; Kleiner et al., 2011; Nguyen, 2013; Cheng et al., 2014; Zhao et al., 2014, 2015; Zhang et al., 2015; Masoud and Lloret-Batlle, 2016; Asghari et al., 2016; Shen et al., 2016; Asghari and Shahabi, 2017; Lloret-Batlle et al., 2017; Masoud et al., 2017b; Hsieh et al., 2019; Zhang et al., 2017; Zhang et al., 2018; Ma et al., 2020). In the taxi-sharing system of (Hao, 2018) taxi fees are adjusted according to the occupancy rate in order to give both taxi drivers and riders monetary incentives to use the DTS system. In (Santos and Xavier, 2015), the cost of each shared part of the route is evenly divided between passengers in a ride. 2.7 Meeting Point In traditional taxi-sharing systems, it is assumed that the passengers are picked up and dropped off at their requested location; however, in recent studies finding optimal meeting and divergence points as alternative origin and destination have become popular (after being neglected for a long time). The introduction of alternative meeting points can 16 improve the resulting solutions by increasing the number of matches and decreasing the length of detours (Aissat and Oulamara, 2014, 2015; Stiglic et al., 2015), compared to the case where passengers are served at their exact origin and destination locations. In the context of ride-sharing, Xing et al. (2009) suggest that boundary nodes can be introduced as alternative pick-up points for the riders located in the pedestrian zones to facilitate pick-ups by the driver to maximize the total number of served riders. However, in the experimental simulation, alternative pick-up points are not considered, and it is assumed that none of the riders are located in the pedestrian zones. If it is assumed that the assignment of the riders and drivers is already known, the problem will be limited to finding meeting points for the travelers. For instance, (Aissat and Oulamara, 2014) proposed an exact enumeration and two heuristic methods to identify meeting and divergence locations to minimize the total travel costs. However, no preference is set on the locations of the meeting/divergence points. Therefore, the walking distances between the original demand points and meeting locations could be arbitrarily long. In another work with a similar setting (Aissat and Oulamara, 2015), a specific passenger is given the flexibility to select the most cost-efficient candidate between the drivers. An exact method and a heuristic are proposed to solve the problem. For a ride-sharing system with fixed driver routes, the authors in (Rudnicki et al., 2008) determine the meeting and divergence points by finding the local minimal distances from the rider?s origin/destination to the time-attributed points in the driver?s route and selecting the nearest feasible point. The time saving of the ride-sharing system is compared to the walking mode. 17 A ride-sharing problem introduced in (Balardino and Santos, 2016) considers walking to meeting points located in the vicinity of the passengers' origins. It is assumed that all the passengers are traveling to a common destination. The problem is formulated as a mixed-integer linear programming (MILP) with constraints on the maximum acceptable walking length of the passengers, the driver's detour length, and the vehicle capacity. The objective is to maximize the passengers served and to minimize the total driver distance. In (Stiglic et al., 2015), the potential benefits of introducing meeting points in a ride-sharing system are investigated. Demand points and meeting points are randomly generated around the centroid of each Traffic Analysis Zone (TAZ). Common pickup and drop-off meeting points are selected for the riders and the driver of each trip. Therefore, there are no additional stops between the selected origin meeting point and destination meeting point, and no routing is needed for the trips. Meeting points are chosen based on their accessibility to the riders (feasible walking distance) and the travel time limitations of the riders and drivers. The travel demand model for the metropolitan Atlanta region is used on an artificial network based on Euclidean distances. It was demonstrated that setting one pick-up point and one drop-off point per shared ride can significantly increase the number of matched passengers (up to 6.8%) and reduce travel distances (up to 2.2%). A greedy heuristic is presented in (Yan et al., 2011b) to find the optimal meeting location for a given set of points (query points) in a road network, such that the meeting point (gathering point) has the minimum sum of network distances to all the points. 18 Bruck et al. (2017) studied the carpooling problem in an Italian service company to organize the daily carpooling crew with the objective of minimizing CO2 emissions. Carpool participants are allowed to drive to intermediate meeting points and then carpool to the common workplace. The employees may have different working shifts; thus, only those with the same shift can ride in the same car. Potential meeting points in this problem are simply assumed to be the set of the residence location of the employees. Experimental results demonstrate a considerable potential in CO2 savings by the use of carpooling. Li et al. (2018) consider the ridesharing problem with private drivers in which they select meeting points from a set that does not contain all of the nodes in the city and introduce preferable time windows instead of only the earliest and latest times for ride requests. They propose a tabu-search-based meta-heuristic algorithm to solve the mixed- integer linear program that formulates the ridesharing problem. By comparing the results to the optimum obtained through CPLEX, they show the proposed heuristic yields near- optimal results. Chen et al. (2019) considers the ride-sharing (carpooling) problem with meeting points and return restrictions. The study is restricted to a closed community of companies that are sharing the calendars of their employees. The problem is a dynamic many-to- many ride-sharing system with guaranteed return as an incentive. The proposed system includes meeting points for sharing a ride and role (driver/rider) flexibility for the participants. The meeting locations for participants are selected from a set of given locations for departure, arrival, or transfer that are also considered the road network. They 19 formulate this problem as a MIP in which the costs incurred from vehicle miles and penalties for arriving too late at meetings, waiting time for transfers, inconvenience, and risk with transfers are considered. They proposed a greedy heuristic that sorts customers based on their potential for being a rider and greedily assigns others to serve them. 2.8 Summary Based on this review, it can be observed that not only just a few papers explored the introduction of meeting points in taxi/ride-sharing context, but also most of the existing works only focused on simple ride-sharing scenarios such as having a single meeting point for a group of people to be picked up or having common destinations. However, a common pickup location may not be practical in a taxi-sharing framework, especially considering the limitation of low demand periods and walking distance. Therefore, instead of considering common meeting points, we propose assigning alternative pickup points to every passenger. The taxi-sharing problem becomes more complex when alternative pick-up/drop- off points for each passenger have to be determined. The delivery sequence and the best pick-up/drop-off locations must be jointly determined to minimize the travel distance. Most of the prior studies used integer linear programming to solve the problem, which is not efficient nor applicable to large-scale scenarios such as metropolitan cities. Moreover, many of the existing models use the Euclidean plane to simulate meeting points candidates. Developing solutions on a real network with real walking/driving distance and travel times, which can validate the assumption and solution, is missing from 20 the literature. This study aims to investigate the benefits of meeting-point-enabled taxi- sharing in a real-world setting and propose efficient algorithms and solutions to deal with the practical challenges of working on real-world demand data on real-world road networks. According to our review and a recent thorough review paper by Ting et al., 2021, there are no prior works considering matching many drivers to many riders, multiple origins, and multiple destinations and meeting points in a taxi-sharing problem. 21 Chapter 3: Methodology In this chapter, we describe the details of the proposed methodology. Unlike the regular taxi-sharing system, we propose a system where passengers are not necessarily served in their exact requested location, but instead, it is possible to pick them up at a close enough location, which is accessible with a short walk. We believe that due to the flexibility of this system, it will result in higher matching rates and smaller detours (millage and travel timewise) compared to conventional taxi-sharing systems. Indeed, the benefits of determining single pick-up and drop-off points for a group of matched riders in an artificial network with real demand are demonstrated (Stiglic et al., 2015). We specifically address the challenges of a real-world high-demand network in our proposed solution, where it is possible to pick up individuals in alternative locations and not necessarily gather them at a mutual pick-up point. 3.1 Methodology Overview To improve the efficiency of a ride-sharing system by introducing meeting points, the first step is to determine the pool of requests, and the next step is to find potential alternative meeting point locations, and finally incorporate these additional alternative meeting points in the inputs of the routing problem to find the optimal taxi routes. However, in a large network with high demand, it is computationally impossible to solve for the whole network and obtain a globally optimal schedule as the solution time of the corresponding NP-hard optimization problem grows exponentially with the size of 22 the network. Therefore, a pre-matching step is performed in advance to break down the problem into smaller problems determining a subset of requests with a higher chance of being matched. Table 3.1 shows the overall scheme of the procedure. A set of requests received in a predetermined time window (one minute in our case) will be called a pool of requests from now on. Table 3.1: Methodology steps Pool of requests {(AP, AD),(BP, BD),?,(ZP, ZD)} {(AP, AD),(CP, CD),(FP, FD)} {(BP, BD)} Pre-Matching ? {(SP, SD),(ZP, ZD)} {(AP, AP?, AD, AD?), (CP, CP?, CD, CD?), (FP, FP?, FD, FD?)} Adding potential {(BP, BP?, BD, BD?)} alternative pickup ? points {(SP, SP?, SD, SD?), (ZP, ZP?, ZD, ZD?)} {(AP?, FP, AD?, FD), (CP?, CD)} {(BP, BD)} Optimized routes ? {(SP, ZP?, ZD, SD?)} A pool of requests (e.g., requests A to Z), which includes their pick-up (subscript P) and drop-off (subscript D) locations, are received and then grouped into smaller subsets of requests through the pre-matching step to form smaller sub-problems that can be solved 23 with lower computational complexities. To form the pool of requests, we need to wait for a predetermined period of time for new demand to arrive to have a reasonable number of passengers that can potentially share a ride. We consider two windowing approaches to form the pool of requests: ? Discrete time windowing: we form non-overlapping time windows, each covering a predetermined non-overlapping period of time (one second, for instance). All passengers? requests in each window are considered the pool of requests for pre-matching, and if some of the requests did not meet the criteria for pre-matching, they would be served alone without sharing the ride. ? Sliding time windowing: we create overlapping windows of time, each spanning a pre-determined period of time. After pre-matching is completed for the current window, we shift the window to the position of the first un- matched request and form a new window of the same length from that point. As such, we give a second chance to the unmatched requests to be pre-matched with the newly received requests. These alternative approaches are depicted in Figure 3.1. In this figure, {?, ?, ?, ? } represent different passenger requests in a simplistic scenario where each window covers four requests. The pre-matched requests have a shaded background in each window, while the un-matched ones have a clear background. It can be seen that in the case of discrete time windowing, the request ? is not matched with any other requests and will be served alone. However, in the case of sliding time windowing, it is given a second chance in the second sliding window and is actually pre-matched with two other requests. 24 Figure 3.1 Two methods for creating a pool of requests: Discrete windows vs. Sliding windows Given that the sliding time windowing provides more opportunities for riders to be pre-matched, this approach is used in our framework to create the sub-problems to be solved for optimal routes. In the next step, potential alternative pick-up/drop-off points for each request are identified and added to the network of each subproblem. The possible strategies for the selection of alternative meeting points are discussed in section 3.3.1. The last step is to find the efficient routes for each subset of the requests. We propose a mixed-integer programming (MIP) model in section 3.4 to find the optimal routes for each group. Given the relatively high computational complexity of solving the MIP problem in real-time, we designed a genetic algorithm approach to obtain high-quality solutions for large-scale problems. The proposed genetic algorithm is presented in chapter 5. 25 3.2 Passenger Demand Pre-matching In a dense urban taxi-sharing system, the demand can exceed hundreds of requests per minute. To handle the routing optimization for such a high-demand network, it is necessary to decompose the problem into smaller ones in a way that minimizes the compromise between the quality of the solution and the corresponding computational complexity. The pre-matching procedure is a decomposition process to find smaller subsets of requests with a higher potential of being matched to achieve this goal. The pre- matching is done based on the physical location (travel time and distance) of the pick-up and drop-off of the requests. As obtaining the exact travel times and distances is computationally demanding, a rough estimate of travel times using the corresponding zones is obtained by implementing the following steps. 1. The demand area is divided into smaller zones (e.g., grids, Taxi and Limousine Commission (TLC ) taxi zoning, traffic analysis zones (TAZ), etc.), 2. The zone centroids are identified, 3. A travel time matrix of zone centroids is created (via OSMnx API), 4. Every request?s pick-up and drop-off locations are mapped to the zones, 5. A reasonable pickup radius threshold is set (i.e., selecting a distance that can be traversed within a tolerable wait time), 6. A preferable travel time extension threshold is set (e.g., ride times should not exceed 1.5 times of the original travel time ), 26 7. For each request (noted as the first rider) in the pool of requests, all other requests are explored and selected as a potential co-rider if the conditions below are satisfied: ? The co-rider?s pickup point is within the predefined radius of the first rider?s pick-up point and, ? Adding the new co-rider to the route does not violate the maximum allowable ride time of the first rider according to the previously set travel time extension threshold. The first two steps are to be performed only once for a selected area of interest. In contrast, the interzonal travel time matrix (step 3) can be updated as needed based on the stability of the traffic condition before being used in the pre-matching calculations. The remaining steps are performed every time the pool of requests is received. The goal of the pre-matching step is to break the large number of requests in the pool into many smaller-sized request clusters that have a high potential to be matched. Note that this is a preliminary matching based on roughly estimated travel times, and there is no guarantee that these riders will be served on the same ride in the final solution. 3.3 Finding Alternative Meeting Points In this step, to reduce detour distance, a set of potential alternative meeting locations is introduced for each request in each of the pre-matched subsets created in the previous steps. We consider all the surrounding intersections within a predefined walkable distance threshold as the initial set of candidate alternatives. Then, intersections that are 27 inaccessible by foot, such as those located in ?motorways,? are excluded. Considering intersections as meeting points has multiple benefits. First, it is easy to address intersections. Second, the intersections are accessible from different directions. And finally, considering intersections as meeting points helps avoid the extra distance to the midpoint of the street. Adding a set of alternate points for every pick-up request increases the routing optimization problem's size significantly. Therefore, a strategy is required to reduce the number of candidates and keep a smaller set of alternatives with higher chances of improving the final solution. In (Lyu et al., 2019), the forward Dijkstra algorithm between each pick-up and associated drop-off point is used to find the neighboring junction along the route within a walking threshold. This method requires a considerable number of shortest path computations. It is only useful when the passengers are directly delivered to their destinations, and no more stops are made to pick up or drop off other passengers (in case of sharing a ride, it is less likely that the new routing includes the candidate junction located in the direct route). We studied a number of meet point selection strategies as follows. 3.3.1 Meeting Point Selection Strategies Given the fact that it is computationally impossible to include all of the candidate alternative pickup points in the problem in a real-world setting with a huge number of requests, different strategies are designed to select a subset of meeting points from all candidate meeting points of a request which has a higher potential of reducing the total 28 travel time. All these strategies are designed intuitively and later evaluated on a sample dataset for comparison. First, a list of all possible alternative pickup points is created by considering all the intersection locations within a predefined driving distance of each request and excluding those that are not accessible by foot. Then, we apply the meeting point selection strategies to reduce the number of alternative candidates. Random Selection Method In this method, k alternatives (if available) are randomly selected from the candidate points for each request. Farthest Method In this method, k candidate alternative points with the largest distance to the original pickup point are selected as alternative points. The intuition behind this approach is that if a passenger is located in a congested area, the alternatives that are farthest from the original pick-up point are more likely to be located in a less congested area. The farthest alternative pickup points are also more likely to introduce alternative routes that are essentially different from the route that includes the original pickup point. Distance Ratio Method In this method, k candidate points with the highest driving to walking distance ratio are selected as alternative pickup points. The intuition behind this method is that the higher the drive time is compared to the walk time, the more logical it is to benefit 29 from walking to the corresponding alternative pickup point. For instance, this method is likely to help evade long detours caused by one-way streets. Time-Distance Ratio Method In this method, k candidate points with the highest driving travel time to driving distance ratios are selected as alternative pickup points. This method is likely to help evade congested detours. Figure 3.2 shows an example of the last two strategies. The right panel shows that the 0.4 miles detour caused by one-way streets could be eliminated if the passenger walks only 0.1 miles. The left panel shows the same route in a traffic condition where the travel time is doubled compared to the no traffic condition. In other words, traversing the same distance requires more travel time. Therefore, a greater time to distance ratio can give a sense of congestion which could be eliminated if the passenger could walk the distance. Heatmap Method In this data-driven method, k candidate points with the highest rate of being used in the historical optimal routes are selected as alternative meeting points. This method calculates the usage rates based on the number of times a point (intersection) is offered as a candidate alternative and the number of times it was used in optimal routes in historical data with a random alternative point selection method. Basically, intersections with higher usage rates are hot spots for future pick-up points. Intersections that are not 30 offered before are assigned a usage rate of 20% to increase the chance of being selected as an alternative point. (a) Distance ratio method (b) Time-distance ratio method Figure 3.2 Meeting location selection strategies 3.4 The Optimization Model A mixed-integer linear programming model is developed to formulate the capacitated vehicle routing problem with pick-up and delivery, representing the taxi- sharing problem with alternative meeting points discussed before. 3.4.1 Model Assumptions The main assumptions of this problem are: 31 ? One organization (e.g., TLC) operates the taxi-sharing system. Its interest is to satisfy all the requests with minimum operational time and wait time using the minimum number of vehicles while not exceeding an allowable ride time for each passenger. This can be transformed into minimizing the total cost when the operational cost per minute and the penalty per minute of passengers? wait time are known. ? Taxi locations are not considered in the routing model, and the driver assignment is performed separately. Each optimal route can be assigned to a vacant taxi by the operator by solving a minimum cost assignment or by drivers accepting the route on the taxi-sharing platform. The taxi is dispatched to the first pick-up location. The first reason for separating the taxi assignment part is to avoid an additional index for vehicles in travel time in the formulation, which makes the problem larger. The second reason is that, since the problem is decomposed by taxi zones, it is not clear that enough taxis are located in each zone, and there might be a need for borrowing taxis from other zones. Therefore, we either need to include all the taxis in the network, which eliminates the advantages of decomposition strategy, or find a heuristic way to reduce the number of taxis needed in each small problem. Instead, we decided to separate the taxi assignment step. ? Travel times between every node in the problem (including all pick-up and drop-off points of the potential matches and all alternative points) are retrieved in real-time. They are considered the costs associated with the edges of the VRP network. 32 ? Taxis start and end their trips in dummy start and end depots. Travel times from and to the depots are zero. This is necessary for tracking the number of unused vehicles in the formulation. Moreover, since the taxi locations are not known at this stage, it is assumed the taxis are located near the first passenger?s pick-up location (street hail), so for each request, the taxi location varies; therefore, we can assume that taxis start from a dummy depot and the travel time to the first passenger?s pick-up location is zero. ? There is a penalty to dispatch a taxi from the starting depot. ? All taxis have the same capacity (homogeneous fleet). ? Each request has given a number of passengers to be picked up. ? Passengers are at the pickup location before the drivers. Although this assumption is limiting, real-world operators such as Uber and Lyft also prohibit waiting for a passenger in their pooling mode. ? The service time at each stop is considered to be negligible/embedded into the travel times. ? There is no predetermined arrival time for the passengers; however, ride times more than a predefined threshold are not allowed. ? Passengers are to be picked up before an allowable wait time. 3.4.2 Model Notations Taxis start their trip from the dummy start depot, designated as Node O, to the set of pick-up points P = ?(P1, P2,...,Pn) where n is the total number of requests in the request 33 pool A, and Pa, a ? A = {1,...,n}, represents set of original and alternative pick-up points for request a. Similarly, there is an associated set of drop-off points D = ?(D1, D2, ..., Dn) where Da represents the set of original and alternative drop-off points for request a. Each node i in the network N = ?P ? D ?{O, E} has a demand qi (passenger count), which is positive for pick-up points and negative for drop-off points, and zero for start (O) and end (E) depots. It is assumed that up to V vehicles are available to satisfy the demand, and the maximum capacity of each is Q. Also, each dispatched vehicle imposes a cost of C on the system. Moreover, there is a ride time limit coefficient for all of the passengers, and each additional passenger needs to be picked up before the maximum wait time of WT. Finally, the travel time between nodes i and j is denoted by tij. The summary of notations is listed in Table 3.2. Four sets of binary and continuous decision variables are used to formulate the model. 3.4.3 Problem Formulation The interest of the organization to satisfy all the requests in minimum operational time with the minimum number of vehicles and wait time is expressed in the objective function formulation below: Minimize: ???? ????? ??? ???. ? ? ?? + ?i?P? ? ? ??? ? + ?1. (? ? ? ? ? ??? ??? ) + ?2. ???? ????? ??? ??? (3.1) 34 The first term of (3.1) minimizes the total travel time (operation time), while the second component minimizes the visit time of each node, which indeed takes care of the pick-up wait time minimization. The third term reflects the penalty of dispatching vehicles to the customers, and the final term reflects the penalty for exceeding the allowable ride time. Table 3.2 Notation summary of parameters and variables Parameter name Description n Number of potential matches A = {1,2,...,n} Set of riders Pa Set of all potential pick-up points for rider a P = ?(P1,P2,...,Pn) Set of all pick-up points Da Set of all potential drop-off points for rider a D = ?(D1,D2,...,DN) Set of all drop-off points K = {1,2,...,V } Set of available vehicles N = ?P ? D ?{O,E} Set of all nodes (demand points and depots) Q Vehicle capacity C1 The time penalty for dispatching an additional vehicle C2 The time penalty for exceeding allowable ride time WT Maximum time to pick up a co-rider RT Maximum allowable ride-time for each passenger qi Number of passengers picked up/dropped off at node i tij Driving travel time from node i to node j W, M Very large numbers Variable name Description (BINARY) 1 if vehicle k visits node i immediately after node j, 0 otherwise Qki (CONTINUOUS) Passenger count of vehicle k after visiting node j B k i (CONTINUOUS) Visiting time of node i by vehicle k Lkaij (CONTINUOUS) a?s ride-time when picked up at i, dropped off at j Ekaij (CONTINUOUS) a?s exceeded ride-time when picked up at i, dropped off at j 35 The constraints of the MILP model are presented below: Every request is served exactly once. This constraint ensures that only a single vehicle visits either one of the alternative pick-up points or the original pick-up point of a request. ??? ???? = 1 ? a ? A (3.2) ???? ??? ??? The same vehicle which services the pick-up of a request must service the associated delivery: ?????? ? ? ?? ? ?? = 0 ? a ? A, ? k ? K (3.3) ???? ??? ???? ??? Every vehicle must leave the start depot: ? ???? = 1 ? k ? K (3.4) ????{?} Every vehicle must enter the end depot: ? ???? = 1 ? k ? K (3.5) i???{?} The same vehicle which enters a node must leave the node: ????? ??? ? ?? = 1 ? i ? N-{O, E}, ? k ? K (3.6) j?? j?? A vehicle?s passenger count at each node is calculated and compared with the vehicle capacity in the two constraints below: 36 ??? ? ? ? ? ? ????? ? ??? ?? ? i, j ? N, ? k ? K ? (3.7) ?? ? ? ? i ? N ?{O, E},? k ? K The visiting time of each node by a vehicle is set, and the pick-up time of co-riders according to the maximum allowable wait time are constrained below: ?? ? ?? ? ?? ????? ? ??? ?? ? i, j ? N, ? k ? K (3.8) ??? ? ?? ? a ? A, ? i ? Pa, ? k ? K (3.9) In 3.9, the visiting time of the first passenger is assumed to be zero since the locations of the taxis are not incorporated in our routing model, and it is assumed that the taxis are located at a dummy depot close to the demand location. Therefore, the wait-time is only defined for the co-riders of the first passenger who are picked up en route. The taxi assignment step is done after calculating the optimal routes, either by solving the assignment problem and obtaining the optimal assignments or by employing a strategy such as assigning the nearest taxi to the first passenger in each optimal route to satisfy the demand and obtain a reasonable solution in a shorter amount of time. The ride-time of each passenger is calculated, and a ride-time limit is enforced for each passenger in 3.10. 3.11 represents the variable domain constraints. ??? ? ??? + ?? ? ?? = 0 ? a ? A, ? i ? Pa , ? j ? Da ,, ? k ??? ? K ?? ? ? ?? ?? ? ??. ??? ? a ? A, ? i ? Pa , ? j ? Da , ? k (3.10) ? K 37 ???? ? {0,1} ? i ? N, ? j ? N, ? k ? K ??? ? 0 ? i ? N, ? k ? K ??? ? 0 ? i ? N, ? k ? K (3.11) ????? ? 0 ? a ? A, ? i ? N, ? j ? N, ? k ? K ????? ? 0 ? a ? A, ? i ? N, ? j ? N, ? k ? K In our experiments, we initially solve the introduced mixed-integer linear programming model to obtain optimal solutions demonstrating the advantage of introducing alternative meetup locations to a taxi sharing system. Since the computational complexity of solving the proposed model is very high, we present a heuristic genetic algorithm to achieve near- optimal solutions with reasonable computational complexity. 3.5 Summary In this chapter, we described the problem of incorporating alternative pickup points in a taxi-sharing system. We introduced the challenges of designing such a system in a high-demand real-world road network and presented the full details of the proposed methodology. In the proposed system, it is possible to pick up passengers at an alternative location, accessible with a short walk, with the objective of achieving a higher passenger matching rate and smaller detours compared to conventional taxi-sharing systems. To break down the large-scale problem into a subset of smaller problems, we introduced a set of criteria to match passengers into groups with a higher chance of sharing a taxi together. We then introduced the procedure of offering alternative pick-up points for each passenger. We explored several strategies to reduce the number of potential alternative 38 pick-up points efficiently. We then introduced a MILP formulation to solve the corresponding optimization problems. 39 Chapter 4: NYC experimental case study This chapter first presents the results of using a publicly available dataset to conduct a feasibility study to demonstrate the potential benefits of incorporating alternative meeting points in a real-world taxi-sharing system. Subsequently, experimental results are presented to illustrate the extent to which the performance of real-world taxi-sharing systems can be improved by incorporating alternative pick-up locations. 4.1 Initial Feasibility Study There is no guarantee that a taxi-sharing system will benefit from incorporating meeting points in every environment. A study of shared demand-responsive transportation (SDRT) from a Swedish city (Ha?ll et al., 2008) with uniformly distributed meeting points showed that the meeting point-based operation is not considerably efficient when compared with door-to-door service. Although this result could be due to a very dense network of meeting points, the characteristics of the road network, traffic condition, and density of the demand are also important factors in evaluating the efficiency of introducing meeting points in the system. Therefore, in this section, the selected network is investigated to find the potential improvements that locating meeting points can bring to the taxi-sharing system. 40 4.1.1 NYC Taxi Dataset New York City Taxi data is made available publicly by Taxi and Limousine Commission (TLC) and contains over a billion trips from January 2009 to June 2019. However, recent years? data does not include the exact pick-up/drop-off coordinates. Instead, the pick-up/drop-off taxi zones are reported. The data includes three types of taxis: ? Yellow Taxicabs: These vehicles provide transportation exclusively through street hails. So, the pickups are not prearranged. ? For-Hire Vehicles (FHVs): These include black car services or luxury limo services. FHV transportation is only accessed by prearrangement. These FHVs are not permitted to pick up passengers via street hails. ? Green borough taxis or Street Hail Livery (SHL): This class is a hybrid between yellow taxicabs and For-Hire Vehicles that are permitted to additionally accept street hails above 110th Street in Manhattan and in the outer boroughs of New York City (TLC). According to this information, 2015 yellow taxicab demand data is chosen to be able to extract the exact locations of non-prearranged requests for taxi-sharing simulations. 2015 Yellow Taxi Trip Data contains 146,087,462 trips. Each trip provides information such as the number of passengers, trip distance, pick-up and drop-off time and location, and fare amount. In June, 12,324,935 trips were recorded with an average 41 of 404,230 trips per day and a standard deviation of 28,077 trips. Figure 4.1 shows the total daily number of pick-ups by yellow cabs in June 2015, which starts on a Monday. The weekly seasonality of the pick-up data is evident in the plot showing that the lowest demand occurs on Sundays and Mondays. Figure 4.1: Daily trips in June 2015. A one-hour sample of data is selected arbitrarily to create a simple visualization of the data. The chosen data contains pick-ups and drop-offs on June 5th, 2015, between 1 pm and 2 pm in Manhattan, New York. After dropping records with empty fields, the data includes 19,316 trips. The average ride request per minute is 321.9, with a standard deviation of 19.3 rides. Figure 4.2 shows the request counts per minute during the one- hour sample data. Such a high demand rate in Manhattan can lead to a high occupancy rate by adopting an effective matching framework. The pick-up and drop-off points are assigned to taxi zones to aggregate the data into neighborhoods. Taxi zones are defined roughly based on the NYC Department of City Planning?s Neighborhood Tabulation Areas 42 (NTAs) and are meant to approximate neighborhoods. Taxi zones in Manhattan are shown in Figure 4.3. Figure 4.2: request counts per minute on 6/5/2015 between 1 pm and 2 pm. The first minute (13:00-13:01) data from the fairly popular taxi-zones of Manhattan is visualized in Figure 4.4 to see a sample of pick-up and drop-off distribution on the map. The points appearing in the left panel show the pick-up location of each request. The pick- up points with the same color are assigned to the same zone (four high-demand zones are selected). The right panel shows the same requests? associated drop-off points. The color of the points in the right panel is still based on their pick-up zone to make it possible to recognize the drop-off requests originating from the same zone. According to the drop- off points in the right plot, there are indeed several close-by drop-off requests originating from the same zones (similar colors) that can be grouped into shared rides. In addition, considering the pick-up point clusters in the left plot, it can be expected that introducing alternative pick-up points within a walking threshold can generate even more shared rides. This is studied in further detail in the following sections. 43 Figure 4.3: Manhattan taxi-zone map (Tax, b) 44 Figure 4.4: Left: Pick-ups in four zones in Manhattan during 13:00-13:01, Right corresponding drop-off locations. Each circle on the map is a pick-up request in the left plot and a drop-off location in the right plot. Circles with similar colors are from the same pick-up zones. 4.1.2 The Potential of Alternative Meeting Points In NYC As mentioned earlier, alternative meeting points may or may not be beneficial depending on the traffic and other conditions of the road network. In this section, 45 we provide some facts and analyze a simple scenario to determine the potential effectiveness of implementing meeting points in the ride-sharing system in NYC. NYC, especially in the Manhattan area, is known to be suffering from the heavily congested road network, such that the average speed of traffic in Midtown Manhattan is 4.7 mph (Agrawal, 2018). As such, walking would be faster than driving sometimes. Figure 4.5 shows an example of such a scenario where the congested network, along with the geometry of one-way streets, has caused 4 minutes difference between driving or walking to a location 0.1 mile further. Assuming the route continues in E 45th street in this scenario, picking up the passenger in the crossing of Lexington Ave and E 45th street instead of the passenger?s original location can save 6 minutes in ride time and 0.4 miles in driving distance. Considering the average demand of 322 ride requests per minute in Manhattan and the potential driving time savings such as the one shown in the example, it can be concluded that incorporating alternative meeting points could make a considerable difference in NYC congestion. 4.1.3 Pre-Matching Sensitivity Analysis for NYC Data The premise of benefiting from taxi-sharing is that there exist travelers with similar itineraries and time schedules. To examine the existence of such travelers in the selected network, we perform a sensitivity analysis. As such, we can confirm that a reasonable percentage of the request pool has the potential to be matched with each other. 46 (a) Driving takes 6 minutes (b) Walking takes 2 minutes Figure 0.5: Travel time for driving and walking to a location in a 0.1-mile distance. In the proposed pre-matching procedure in section 3.1.1, we need to set the pick-up radius parameter for passengers to match the requests and the allowable time window to form the pool of requests. Two points should be considered while selecting these parameters: ? The number of requests which cannot be pre-matched should be minimized, ? The higher the number of pre-matched requests is, the larger the size of the scheduling optimization would be. Hence, limiting the number of pre-matched requests to those with the highest potential of being matched in the final optimization might be preferable. 47 To understand how the number of pre-matched requests varies with these parameters, 1-minute, 2-minute, and three-minute request pools are formed from the demands on June 5th of 2015 between 1 pm and 1:03 pm. Moreover, different matching radius between 55 and 660 meters (equal to 0.0005, 0.001, 0.002, 0.004, 0.006 degrees of latitude) are tested to understand the effect of physical distance on the number of pre- matched requests. The results of this analysis are summarized in Figure 4.6. The right panel in Figure 4.6 shows the percentage of requests that cannot be matched with other requests for each pool of requests. According to this plot, the percentage of no-match requests starts to flatten at a radius of 440 meters. This shows that further increasing of the radius has a smaller impact on matching the requests. Figure 4.6: Pre-matching algorithm sensitivity to pre-matching radius for the average count of pre-matched requests (left) and the percentage of not-matched requests (right) 48 Moreover, the difference between 3-minute and 2-minute curves (green and orange) is not significant, suggesting that waiting another minute for a larger pool of requests to obtain a few more pre-matched requests may not be reasonable. The left graph shows the average number of potential matches per request. The numbers of matched requests for 3-minute and 2-minute curves (green and orange) are greater than three for a radius of more than 440 meters, suggesting that there might be a need for more than one vehicle to serve the pre-matched requests for most of the cases which increases the computation time of the scheduling optimization. Therefore, we conclude that a 1-minute request pool and 440-meter radius are reasonable choices to create a balanced pool of requests with enough number of pre-matched requests. 4.2 Alternative Point Matching: MILP Experimental Results Once the pool of requests is formed and the pre-matching is performed, the final matching can be done by optimizing the schedule for each set of pre-matched requests. In this section, a one-minute window of NYC taxi demand is selected (June 5th of 2015 between 1 pm and 1:01 pm). Several variations of the proposed methodology are tested to assess the impact of introducing alternative pick-up points for each pre-matched request. The one-minute window of data consists of 298 requests to transport 471 passengers in Manhattan. 65% of the requests are matched with at least one other request within the set radius of 440 meters by performing the pre-matching step. The distribution of the number of matched requests for each request is shown in Figure 4.7. According to the 49 graph, 105 requests cannot be pre-matched with any other request within the 1-minute time window. This means that no candidates (pre-matches) are found for these requests, but this does not necessarily mean they have not been selected as candidates for other requests. This can be better understood by considering a request for a very short trip. It is indeed difficult to find matching requests that can be shared with this short trip without exceeding the allowable travel time extension for that request (which is set to 150% of the original direct travel time in this study). However, the same short trip can easily be a pre-match candidate for longer trips that can accommodate sharing their ride with such a short trip without violating the set allowable travel time extension. Figure 4.07: Distribution of the number of matched requests for one-minute data The results are compared with a base model that operates similar to a traditional ride-sharing system (serving the travelers at their exact requested locations) to demonstrate the advantage of adding alternative meeting points in the ride-sharing system. For the baseline system, after obtaining subsets of pre-matched requests, the alternative point selection step is skipped, and the optimization step is performed using 50 only the original pick-up and drop-off points as the set of origin and destination points (without considering any alternatives). The baseline is then compared to the following variations of the proposed meeting-point based system: ? 1-alt: one alternative is randomly selected for each pre-matched request ? 2-alt: two alternatives are randomly selected for each pre-matched request ? 3-alt: three alternatives are randomly selected for each pre-matched request ? 4-alt: four alternatives are randomly selected for each pre-matched request In all of the optimization steps, the vehicle capacity is set to 6 passengers, the maximum time for a co-rider to be picked up is set to 6 minutes, and the maximum allowable ride-time for each passenger is set to 150% of the original direct travel time. Also, to evaluate the impact of different vehicle dispatching penalties, the penalty of dispatching an additional vehicle is set to 1,000 seconds in one scenario and 4,000 in another scenario. The results of these experiments are summarized in Tables 4.1 and 4.2 for additional vehicle penalties of 1000 and 4000, respectively. It can be seen in these two tables that for both scenarios, introducing more alternative points reduced the total travel time significantly compared to the base model (about 180 minutes for 4-alt). Moreover, both solo travel times and shared travel times are also reduced. More importantly, the total wait time is almost reduced to half for the model, with four alternative pickup points for each passenger (4-alt). 51 Table 4.1: Statistics of resulting schedules (additional vehicle penalty = 1000) Metrics Base-Model 1-alt 2-alt 3-alt 4-alt Total travel time (min) 2,620.0 2,580.8 2,624.7 2,603.6 2,549.5 Solo travel time (min) 809.3 792.4 770.1 786.9 778.6 Shared travel time (min) 1,810.7 1,788.4 1,854.6 1,816.8 1,770.9 Total wait time (min) 182.9 143.6 126.7 111.0 91.8 Average wait time (sec) 120.6 90.7 75.3 67.2 56.8 Average requests fulfil time (min) 9.4 9.1 9.2 9.1 8.9 Num. of vehicles used 207 203 197 199 201 Num. of solo vehicles 124 117 107 109 110 Num. of shared vehicles 83 86 90 90 91 Num. of shared requests 174 181 191 189 188 Table 4.2: Statistics of resulting schedules (additional vehicle penalty = 4000) Metrics Base-Model 1-alt 2-alt 3-alt 4-alt Total travel time (min) 2,797.2 2,801.2 2,754.1 2,740.4 2,609.2 Solo travel time (min) 784.8 763.4 756.7 752.5 720.9 Shared travel time (min) 2,012.4 2,037.9 1,997.3 1,988.0 1,888.3 Total wait time (min) 225.4 154.6 148.1 140.4 126.3 Average wait time (sec) 136.6 88.3 82.3 76.6 72.2 Average requests fulfil time (min) 10.1 9.9 9.7 9.7 9.2 Num. of vehicles used 199 193 190 188 193 Num. of solo vehicles 111 100 98 95 99 Num. of shared vehicles 88 93 92 93 94 Num. of shared requests 187 198 200 203 199 When we compare Table 4.1 with Table 4.2, we can observe that when the penalty for additional vehicles is lower, more solo vehicles are used, and better travel times and wait times are achieved. Also, when the penalty is higher, the average time to fulfill a request (wait-time plus travel time for the shared rides) is increased. This means that longer shared rides are selected due to the higher cost of dispatching additional taxis. 52 It is interesting to note that the number of solo vehicles is increased for the best model (4-alt), and fewer vehicles are used for shared riding when we compare the two best models (3-alt and 4-alt). This implies that the introduction of meeting points to the individual optimization problems has led to finding better solutions (in the sense of minimizing the travel times), achieving lower total travel times and much lower shared- ride average waiting times (15.4% lower) while reducing the total number of shared rides. Apart from the shared rides, it may happen that the solo riders are also dispatched from better alternative locations, reducing the total travel time. Overall, fewer requests are served together, but more efficiently (in terms of total travel time and wait time). The routes obtained from the base model and 4-alt model of a pre-matched subset of requests are compared on the map in Figure 4.8. In the base model solution, four pre- matched requests are served in two rides, each serving two requests where total travel time and wait time are 2,888 and 286 seconds, respectively. On the other hand, using an alternative point for node A in the red route of the base model (left panel), and transporting it closer to node A in the blue route, results in a different schedule with a much shorter travel time of 1,763 and wait time of 90 seconds (right panel). This schedule is composed of a solo and a three-rider shared ride. The total number of vehicles is the same, but not only the total travel times are considerably reduced, but also the taxi servicing the solo driver would be available for the next ride much earlier and in the same high demand Manhattan area. This example demonstrates how introducing alternatives in close proximity can significantly improve the overall efficiency (of the shared and solo rides). 53 (a) Base model routing (b) 4-alternative model routing Figure 4.8: Routing comparison of two models 4.3 Computational Complexity The improvement in travel and wait times for the model with a higher number of alternative points comes with the price of significantly higher computation times. The computation times for each model in the previous section are shown in Figure 4.9. It can be seen that for the best model (4-alt), the processing time is more than six times the processing time for the base model. The reason is obviously the larger size of the 54 optimization problem for the 4-alt model. This suggests the need for employing better strategies for selecting alternative points (instead of the current random strategy) to achieve similar performances with smaller-sized optimization problems. 300 242.9 250 218.4 200 150 105 113.4 100 50 37.2 0 Baseline 1-alt 2-alt 3-alt 4-alt Figure 0.9: Processing times of the presented models for the selected 1-minute data 4.4 Meeting Point Selection Strategies In this section, instead of randomly selecting up to four alternative meeting points per request, we tried the meeting point selection strategies defined in Chapter 3. A brief explanation of each method is given below: ? Random Selection Method: k alternatives (if available) are randomly selected from the candidate points for each request. ? Farthest Method: k candidate points with the largest distance to the original pickup point are selected as alternative points. This method assumes that passengers are picked up in a location away from a possibly congested high- demand area. 55 Minute ? Distance Ratio Method: k candidate points with the highest driving to walking distance are selected as alternative pickup points. This method presumably helps evade long detours caused by one-way streets. ? Time-Distance Ratio Method: k candidate points with the highest driving travel time to driving distance are selected as alternative pickup points. This method presumably helps evade congested detours. ? Heatmap Method: In this data-driven method, k candidate points with the highest rate of being used in historical optimal routes are selected as alternative meeting points. This method calculates the usage rates based on the number of times a point (intersection) is offered as a candidate alternative and the number of times it was used in optimal routes in historical data with a random alternative point selection method. Basically, intersections with higher usage rates are hot spots for future pick-up points. Intersections that are not offered before are assigned a usage rate of 20% to increase the chance of being selected as an alternative point. Figure 4.10 shows the usage rate of intersections in Manhattan extracted from optimal routes obtained from solving one hour of data (19,316 requests) when randomly giving each request four alternative pick-up points. The larger-sized circles show the higher usage rate of the intersections. 56 Figure 0.10 Meeting location usage rate 4.4.1 Comparing Meeting Point Selection Strategies Two different sets of data are solved with each strategy and compared to others to evaluate the proposed strategies' effectiveness. These sets correspond to a one-minute and a 4-minute data on June 12th, 2015, between 1 pm and 1:05 pm. The summary of results is presented in Tables 4.3 and 4.4. The maximum computation time for the optimization problem is set to 2 minutes to avoid excessive computation times. It is 57 experimentally observed that the majority of the problems can be solved optimally within this time limit. The rare ones that cannot, take significantly longer (hours) to be solved and hence would not be practical to be solved in a real-world setting. Consequently, we just take the best solution obtained within the allowed two minutes processing time for these very rare problems. Table 0.3: Solution results (352 requests from NYC Data) Optimal Average Total Num. Num. Average obj. process process Num. Model name not shared travel time function time time vehicles solved requests min value min min Heatmap3alts 15,042 0 0.926 90.75 200 260 4.35 Heatmap4alts 15,341 4 1.066 103.379 205 251 4.29 Heatmap2alts 15,545 2 0.652 64.544 208 250 4.24 Dist-ratio3alts 15,572 3 0.943 92.389 209 249 4.14 Random4alts 15,728 9 0.897 85.243 211 240 4.16 Random3alts 15,783 3 0.943 91.487 211 246 4.32 Time-Dist3alts 15,976 7 0.826 80.128 214 238 4.26 Base model 16,020 0 0.047 4.704 213 248 4.46 Table 0.4: Solution results (1292 requests from NYC Data) Optimal Average Total Average Num. Num. obj. process process Num. travel time Model name not shared function time time vehicles per request solved requests value min min min Heatmap3alts 58,024 4 0.835 327.378 767 908 4.83 Heatmap2alts 58,231 9 0.551 218.204 769 903 4.81 Heatmap4alts 58,282 15 0.911 356.992 773 890 4.76 Dist-ratio3alts 59,070 9 0.793 309.984 784 895 4.75 Random3alts 59,341 13 0.807 316.97 785 882 4.83 Random4alts 59,449 31 0.856 332.959 789 878 4.81 Base model 61,412 3 0.059 23.425 811 874 4.99 58 For these experiments, the heatmap information (usage rate of intersections) is extracted from optimal routes obtained from solving one hour of data on June 5th, 2015, between 1 pm and 2 pm when randomly giving each request four alternative pick-up points. This corresponds to 19,316 requests. As such, the heatmap is not extracted from the test data but is extracted from historic data belonging to the same day of the week and around the same hour of the day to account for the seasonality of the demand data. According to these results, the Heatmap method yields the best results for both datasets. The objective function value of the Heatmap with three alternatives is better than the one with four alternatives. This is because the number of problems that cannot be solved within the given time limit (2 minutes) is higher when more alternatives are provided. The Distance-Ratio method yielded better results than the random method in both cases. Since the time-distance model results are worse than the random method results in the first case, it is not tested on the larger case. The Heatmap method with three alternatives yields the best objective value while using fewer vehicles and higher vehicle utilization. Therefore, it is the most effective method in selecting the alternative points with a higher potential of constructing better routes. However, as expected, the processing time is still significantly higher than the baseline system. This confirms the need for a heuristic solution approach to allow real-time application of the proposed methodology. 4.5 Summary In this chapter, we utilized the publicly available NYC yellow cab dataset to evaluate the effectiveness of the proposed framework in a real-world setting. Initially, a 59 feasibility study was conducted to evaluate the possibility of implementing such a ride- sharing system in NYC and select a reasonable radius and time window for creating a reasonable pool of requests for the pre-matching step of the proposed ride-sharing system. We then presented experimental results comparing the proposed system to the baseline ride-sharing system and showed that incorporating alternative meeting points can make a considerable difference in NYC congestion by reducing the number of vehicles and travel times. We further investigated the benefits of the proposed system by comparing the results with different vehicle costs and different alternative pick-up selection strategies. We showed that when the penalty for additional vehicles is lower, more solo vehicles are used, and better travel times and wait times are achieved. Also, when the penalty is higher, longer shared rides are selected due to the high cost of dispatching additional taxis. Trying the meeting point selection strategies introduced in Chapter 3, we found out the Heatmap method finds the best results compared to other methods. However, since giving more alternatives increases the problem size notably and occasionally causes the problem not to be solvable within the preset time limit, the Heatmap with three alternatives yielded better results than the one with four alternatives. Finally, we discussed that although incorporating alternative meeting points reduces the average travel time and the number of vehicles, the computational complexity is also growing significantly, and the proposed methodology took much longer to find the solutions 60 compared to the baseline model and hence a heuristic approach is needed to solve the problems in reasonable amounts of time. 61 Chapter 5: A Genetic Algorithm Solution As the experimental results obtained in chapter 4 show, giving alternative pick-up points to each request significantly increases the computational times. This is due to the significant increase in the problem size, making it impossible to obtain optimal routes in real-time use cases of the proposed framework. To leverage the benefits of using alternative meeting locations in real-life ride- sharing situations, we developed a genetic algorithm approach to obtain near-optimal results in a short amount of time. A Genetic Algorithm (GA) is a metaheuristic approach used in solving combinatorial optimization problems. GA does not require information about the structure of the optimization problem and circumvents the features of the constraints of the problem, such as linearity, continuity, differentiability, and multimodality. 5.1 Genetic Algorithm Fundamentals GA is an evolutionary algorithm imitating the biological reproduction process and natural selection to start from a random initial population (of solution) and evolve over successive generations towards the ?fittest? solution. GA relies on the following components to achieve this goal: ? A population of individuals (feasible solutions), randomly created at first and evolved generation by generation. Individuals (solutions) in the population are represented as chromosomes that are strings of encoded genes. 62 ? A fitness function is used to determine the quality of each chromosome; the fittest individuals are selected to pass genes from one generation to the next. It can be the objective function we want to minimize. ? Cross-over operations to produce next-generation members from the fittest members of the current generation. As such, each new generation inherits from the fittest individuals of the previous generation. ? Random mutation of chromosomes in the new generation to diversify the set of solutions by introducing randomness, to allow exploration of the search space, and to avoid being trapped in local optima The ultimate solution to the optimization problem is obtained via a parallel evolutionary search operation on a population of feasible solutions. This is an effective strategy to avoid being trapped in local optima and makes the approach suitable for large- scale combinational optimization problems (Ma and Zhang, 2018). Moreover, the algorithm is based on probability rules, not deterministic ones, making the search more flexible. We developed an approach based on a genetic algorithm to solve the problem of taxi ride-sharing with alternative pickup locations. In the following, we describe how the components mentioned above are implemented for the problem at hand. 5.2 Chromosome Representation In genetic algorithms, a chromosome is a set of parameters that define a solution to the problem at hand. In the ride-sharing problem with alternative pickup points, the 63 solution would be the constructed route for a set of requests. Since a pre-matched group of requests can be served with multiple vehicles, each solution is represented by strings of the sequence of passengers? origins and destinations. Each string represents the route of a different vehicle. Figure 5.1 depicts an example of such a solution. In this figure, Letters represent vehicles, and each number corresponds to one of the passengers. For each passenger, the ?+? superscript shows the passenger's origin, and the ?-? superscript indicates the destination node. For each solution to be feasible, each passenger?s drop-off location (destination) must be visited after its pickup location (origin) and in the constructed route for the same vehicle. Pickup location can be the initially requested pick- up location or any one of the alternative pick-up locations. The creation and maintaining of such feasible solutions is one of the important aspects of the GA. This is described in more detail in the following section. Figure 5.1 A chromosome representation of a feasible solution with two vehicles. 5.3 Population Initialization The first step of the GA is to create an initial population (generation) of individual solutions (chromosomes). These solutions should be generated in a random way allowing the GA to explore the search space reasonably. Each individual in the generated 64 population must be a feasible solution. A feasible solution to our problem is a set of routes, each consisting of sequences of pre-matched requests? pick-ups and drop-offs, similar to the example shown in Figure 5.1, where the following criteria are met: - Pick-up and drop-off of a request are contained in the same route (vehicle), - Vehicle capacity is not violated in any of the routes, - A request?s drop-off is later in the route after its pick-up, - All the passengers are picked up within a time threshold, which starts from the first passenger?s pick-up time. We designed an iterative procedure to initialize the population to create random feasible solutions serving all the requests. We select a pair of corresponding pick-up and drop-off locations at each step and insert the pair into the partial solution created so far. The insertion should be accomplished randomly while respecting the feasibility criteria. Algorithm 5.1 describes the procedure for inserting a pair of randomly selected pick-up and drop-off points. In addition to the random insertion algorithm, a greedy insertion algorithm is also used to create new solutions. The greedy algorithm is implemented the same way as described in Algorithm 5-1, with only one difference in how the drop-off location is inserted into the solution. Instead of randomly selecting the drop-off index from a set of possible drop-off indices, the greedy algorithm selects the one that results in the shortest visit time for the last node in the route. As such, the greedy algorithm creates slightly superior random solutions in terms of overall travel times while introducing less diversity. 65 Algorithm 5-1: Random Request Insertion Algorithm Input: ???????_????????, ????_??, and ????_??? location pair, travel-time matrix, demand Output: Flag of insertion success, ???????_???????? if length(???????_????????) = 0, then return True, ???????_???????? = [????_??, ????_???] else: # find set of indices in the current solution where inserting the pickup point is feasible ? = {??}, ??? (0, ? , ??????(???????_????????)) # set of possible pick-up indices for ?? in ? do insert ????_?? at the ?? index of ???????_???????? calculate visit time for every node in ???????_???????? calculate vehicle load for every node in ???????_???????? if any visit time > pick-up time threshold or vehicle load > vehicle cap, then remove ? ? from ? if ? = ? then # it?s not possible to create a feasible solution return False, ???????_???????? else: # insert the pick-up point at randomly selected index from P ?? = random sample from ? ?????? ?? ????? ?? ????_?? ? ???????_???????? # inserting the drop off node ? = {??}, ??? (?? + 1,? , ???(???????_????????))# set of possible drop-off indices for ?? in ? do insert ????_??? at the ?? index of ???????_???????? calculate visit time for every node in the Current route calculate vehicle load for every node in the Current route if any visit time > pick-up time threshold or vehicle load > vehicle cap, then remove ?? from ? If D = ? then # it?s not possible to create a feasible solution return False, ???????_???????? else: # insert the drop up point at randomly selected index from D ?? = random sample from ? ?????? ?? ????? ?? ????_??? ? ???????_???????? return True, ???????_???????? The insertion algorithm is then used to create complete feasible solutions in a reasonably random way. We start from a blank chromosome (solution) and pool of pre- matched requests and then, step by step, randomly choose a pair of pick-up and drop-off 66 locations and insert them into the solution created so far. Specifically, the following steps are repeated while unserved pre-matched requests exist: 1. Randomly select a request from a set of pre-matched requests to add to the chromosome (solution). 2. Create a new route (use a new vehicle) with a probability of ? to add the new request to it. Otherwise, add the request to the current route (vehicle). The value of ? is randomly generated each time from the standard uniform distribution. 3. Randomly select a pickup node from the original and alternative pickup nodes of the selected request. 4. Insert the selected pick-up location and the corresponding drop-off location in the current route, using the insertion algorithm. 5. Create a new route (vehicle) if no feasible solution is available (insertion is not successful) and go back to step 4. A route (vehicle) is added to a solution for two reasons. First, to cope with the situation where inserting a new request to the current route makes the solution infeasible, and second, to increase the ability of the algorithm to escape from local optima. A new route is created (a new vehicle is dispatched) to have a more diverse population in the latter case. As we will see in the next section, the exceeded travel time of passengers is heavily penalized in our fitness function. Consequently, a solution comprised of more routes (vehicle) may result in a lower total cost. Consequently, it is important that the 67 solutions containing more routes are reasonably represented in the initial population of solutions. 5.4 Fitness Function and Parent Selection Genetic algorithm uses a fitness function that assesses how ?fit? each candidate solution is with respect to the problem at hand. The fitness function in our problem is considered the same as the MIP objective function (Eq. 3.1), which is equal to the summation of total travel times, cost of vehicle dispatching, and the penalty of exceeding allowable travel times. The selection probability of parents is generally based on the value of the objective function. The roulette wheel selection scheme (Holland, 1975) is used in this study to implement proportionate random selection. Each chromosome has a chance of being chosen equal to the complementary probability of its objective value divided by the sum of the values of all of the generation?s chromosomes. Elitism is implemented to keep the best chromosome so far. Elitism preserves two slots in the next generation for the fittest chromosome of the current generation to be carried over to the next generation without any changes. The additional slot is to prevent the elite chromosome from being mutated in the next generation. 68 5.5 Crossover and Mutation Crossover and mutation operators are applied to reproduce offspring and create a new generation of population. Cross-over combines the genetic information of two of the selected parents to generate new offspring such that the new chromosome inherits and combines the characteristics of the two parents. The new chromosome can then be mutated to introduce diversity into the new population. Mutation helps avoid being trapped in local minima by preventing the population of chromosomes from becoming too similar. Common methods of crossover and mutation are not suitable for our problem since they construct many infeasible solutions. To avoid producing such solutions, these genetic operators are modified to solve the taxi-sharing routing with alternative meeting point problem as follows: - Order-based crossover: the goal is to combine the chromosomes of two parents (solutions) so that the two new chromosomes inherit from the characteristics of the parents while still representing a feasible solution. The characteristic we choose to pass over to the new offspring is the order in which the two randomly selected requests are served in the two parents. We randomly select two of the requests and eliminate these requests from one of the parent chromosomes to form a partial solution from the first parent. We then take the order in which the exact same pair of requests are served in the other parent sequence and insert the ordered pair into the first parent?s partial solution. The procedure is demonstrated in Figure 5.2. This procedure is similar to the order-based crossover proposed by Syswerda (1991). However, instead of directly inserting the 69 requests in the exact inherited positions, they are inserted one by one according to their inherited pickup order using Algorithm 5-1 to produce feasible solutions. Figure 5.2 An Example of Order Based Crossover - Mutation operator: This operator slightly modifies the sequence by changing a randomly selected pickup point with a randomly selected alternative pickup location. Each chromosome in the population has a certain chance of being selected for mutation. The only criterion to produce a feasible solution with this operation is that the passengers? pick-up times are within the allowable threshold. 70 5.6 Summary As discussed in chapter 4, giving alternative pick-up points to each individual request significantly increases the problem size, increasing the computational times significantly and making the proposed framework almost impossible to use in real-time use cases. Therefore, we developed a genetic algorithm approach to obtain near-optimal solutions quickly to benefit from using alternative meeting locations in real-life ride- sharing situations. We first introduced a chromosome representation suitable for our problem and defined the criteria of a feasible chromosome according to the proposed MILP formulation and our problem constraints. We then designed an insertion algorithm to produce new chromosomes incrementally by inserting the rides into an incomplete solution, one after another, in a reasonably random way. The insertion algorithm is used both in population initialization and crossover operations. As the proposed insertion algorithm always generates feasible solutions, our GA algorithm does not need a repair operation. The fitness function is the same as the MILP objective function, and the roulette wheel selection scheme is used for parent selection. The crossover operation is a variation of order-based inheritance, and the mutation operation is to swap a pick-up point with one of its alternatives randomly. 71 Chapter 6: Genetic Algorithm Experimental Results We proposed a genetic algorithm approach in chapter 5 to search for the optimal solutions in a short amount of time. NYC taxi data is used to test the algorithm and compare the results to the optimal results obtained by directly solving the formulated MIP. We next study the sensitivity of the proposed genetic algorithm to variations of its hyperparameters. Finally, we use the generic algorithm to conduct large-scale simulations, comparing the performance of a taxi-sharing system with alternative meeting locations to that of the baseline system with no alternative points. For the experiment in this chapter, we created a test dataset composed of 2246 requests forming 669 pre- matched groups. These requests comprise 231 groups of 2-riders, 161 groups of 3-riders, 84 groups of 4-riders, 193 groups of 5-riders. 6.1 Comparing the performance of Genetic Algorithm and MILP To evaluate the performance of the generic algorithm in more challenging cases, we focused on the part of the dataset that includes 193 pre-matched groups of 5 riders (965 total requests). We need to solve a capacitated vehicle routing problem with pick-up and delivery for each group, comprising 32 nodes (including the alternatives). To evaluate the effectiveness and efficiency of the proposed GA approach, we measure the optimality gap of the obtained solutions and also compare the computational times with those for optimally solved MILP. The results are averaged over three runs of the GA algorithm to obtain statistically meaningful results for the random operations. For this experiment, the size of the initial population was set to 1,250, and the algorithm stopped after ten 72 generations. The convergence of the total objective value over consecutive generations is shown in Figure 6.1, and the results are compared to MILP solutions in Table 6.1. 33600 33500 33400 33300 33200 33100 33000 32900 32800 1 2 3 4 5 6 7 8 9 10 generations Figure 6.1 The total objective value convergence in generations of the GA Table 6.1: GA and MILP performance comparison (965 requests from NYC Data) summary statistics Optimal exec times (sec) GA exec time (sec) GA gap Min 0.98 7.57 0.00 Median 7202.58 8.71 0.00 Average 12044.54 8.83 1.88 95th Percentile 58251.07 10.12 2.02 Maximum 189507.90 11.49 78.29 It can be seen in Table 6.1 that the proposed GA algorithm achieves an average optimality gap of 1.88%, while 95% of the problems are solved with less than 2.02% of the optimality gap. This suggests that the proposed algorithm is capable of consistently yielding near-optimal solutions. In terms of computational complexity, it can be observed 73 objective value (min) that the average processing time for obtaining optimal results through MILP is more than 1000 times the processing time for the GA algorithm. In other words, GA is, on average more than 1000 times faster than the direct solution of the MILP. Looking at the 95th percentiles, GA is more than 5000 times faster. These results confirm the effectiveness and efficiency of the proposed GA in obtaining near-optimal solutions with a reasonable computational time (GA can obtain results with at most a 2% optimal gap in less than 12 seconds in 95% of the problems). 6.2 Sensitivity Analysis for Genetic Algorithm Parameters In this section, we study the sensitivity of the proposed genetic algorithm to variations of its hyperparameters. The default values are 0.5 for cross-over rate, 0.3 for survival rate, 1250 for population, and 10 for the number of generations. 6.2.1 Cross-Over Rate The problems are solved when the cross-over rate is given different values to study the impact of this parameter on the solution. Table 6.2 shows the summary of the results with varying values of the parameter. The problems are solved three times for each parameter value, and the average results are shown in the table. According to the results, the crossover rate of 50% yields the best result. However, it has the largest processing time for 193 problems compared to other values. 74 Table 0.2 Comparison of the objective value for different cross-over rates Cross-over rate Fitness value Total processing time (sec.) 0.1 2851168 1501 0.2 2826904 1742 0.3 2765706 1993 0.4 2784194 2220 0.5 2749104 2410 6.2.2 Survival rate The problems are solved when the survival rate is given different values to study the impact of this parameter on the solution. Table 6.3 shows the summary of the results with different values of the parameter. The problems are solved three times for each parameter value, and the average results are shown in the table. According to the results, the survival rate of 40% yields the best result. However, it has a slightly larger processing time (for 193 problems) than lower survival rate values. Table 0.3 Comparison of the objective value for different survival rates Survival rate Fitness value Total processing time (sec.) 0.1 2855088 2414 0.2 2759869 2460 0.3 2749104 2410 0.4 2743254 2489 0.5 2744993 2518 75 6.2.3 Population Size One of the important hyperparameters affecting the performance of the GA is the population size. This parameter creates a compromise between the optimality gap and computational complexity. To study the impact of this hyperparameter on the obtained solutions, we executed the algorithm multiple times with different population sizes. We used the entire evaluation dataset for this experiment that includes 2246 requests, including all different problems with two to five pre-matched requests in each group. The computation times and optimality gaps for different population sizes are compared in Figures 6.2 and 6.3. For this experiment, the GA search was stopped after ten generations. 12.00 Avg gap max computation time 11.00 10.00 9.00 8.00 7.00 6.00 5.00 4.00 3.00 2.00 1.00 0.00 250 500 750 1000 1250 population size Figure 6.02 Population size impact on max computation time and average optimality gap 76 exec. time (sec) - gap(%) 8.00 Avg gap 75% gap 7.00 6.00 5.00 4.00 3.00 2.00 1.00 0.00 250 500 750 1000 1250 population size Figure 6.3 Population size impact on average and 75th percentile optimality gap It can be observed that while larger populations increase the computation time, it reduces the optimality gap (by expanding the exploration ability of the algorithm). By increasing the population size from 250 to 1250 in five steps, the average and 75th percentile solution gap decreased from 7% to 2% and 2% to 0%, respectively, while the maximum computation time increased from 3 seconds to 12 seconds. Indeed, GA's ability to obtain good solutions given a fixed population size is also highly dependent on the problem size (number of nodes in our case), which in turn, is proportional to the number of pre-matched requests. Figure 6.4 shows the average optimality gap for different problem sizes when initializing the algorithm with various population sizes. It can be seen that better results are achieved for the larger problem sizes when the population size is increased proportionately. This suggests that we may change the population size dynamically based on the problem size such that the population size increases with problem size to allow a sufficient exploration of the solution space. Based 77 gap (%) on the results presented in Figure 6.4, we opted for setting the population size to the number of requests multiplied by 250 in the algorithm. 6.00 250*1 250*2 250*3 250*4 250*5 5.00 4.00 3.00 2.00 1.00 0.00 n_req_2 n_req_3 n_req_4 n_req_5 250*1 0.02 0.11 2.65 5.01 250*2 0.00 0.03 2.46 4.90 250*3 0.00 0.01 1.00 3.04 250*4 0.00 0.00 0.94 2.15 250*5 0.00 0.00 0.91 2.27 Number of prematched requests in each group Figure 6.04 Population size impact on average optimality gap for different problem sizes 6.2.4 Number of Generations Another important hyperparameter in GA is the number of generations the algorithm goes through before stopping the search. The chromosomes are randomly altered within a population generation to create new chromosomes that potentially have better fitness values. Through several generations, the fittest chromosomes will possibly 78 gap(%) converge to the optimal value. Obviously, longer searches result in better solutions but with longer computational times. To better understand the impact of this hyperparameter, we conducted an experiment to compare the computation times and optimality gaps of the solutions obtained through stopping the GA after a different number of generations. The results are presented in Figures 6.5 and 6.6, with a population size of 1250. As expected, it can be seen in Figures 6.5 and 6.6 that the results keep improving with new generations, as each generation improves over the chromosomes of the previous generation through crossover and mutation operations. The improvement is, of course, achieved at the expanse of increased computation time. By increasing the number of generations from 3 to 10, the average and 75th percentile solution gap decreased from 3% to 2% and 1% to 0%, respectively. In contrast, the average computation time increased from 3 seconds to 9 seconds and the maximum computation time increased from 5 seconds to 12 seconds. Since the computation times are not significantly increased for ten generations and the 75th percentile gap is almost zero, we decided to stop the algorithm after ten generations. 79 14.00 Avg gap max computation time 12.00 10.00 8.00 6.00 4.00 2.00 0.00 3 5 7 10 Number of Generations Figure 0.5 The impact of Number of generations on maximum computation time and average optimality gap Avg gap 75% gap 3.00 2.50 2.00 1.50 1.00 0.50 0.00 3 5 7 10 Number of Generations Figure 0.6 Number of generations impact on average and 75th percentile optimality gap 80 gap (%) exec. time (sec) - gap(%) 6.2.5 The impact of Greedy Request Insertion The insertion algorithm (Algorithm 5-1) proposed in Chapter 5 is used in the population initialization and crossover operation of the introduced GA. In addition to random order insertion, a greedy-order insertion was also introduced. In Random Order insertion, we first insert the corresponding pick-up point on a random index of the existing sub tour to insert a new request into an existing sub tour. We then randomly select the corresponding drop-off location index from the set of possible drop-off indices. In the greedy insertion, however, we select the drop-off order that results in the minimum visit time of the last node in the sub tour (the last drop-off). The result of the GA using random and greedy insertion methods are compared in Figures 6.7 and Table 6.4. The GA population size for each group is set to the number of pre-matched riders multiplied by 250. The GA search is stopped after ten generations. Figure 6.7 Greedy request insertion vs. random request insertion methods? computation time comparison 81 Table 6.4 Greedy insertion and random insertion methods optimality gap comparison Summary statistics GA gap: random insertion (%) GA gap: greedy insertion (%) Min 0.00 0.00 Median 0.00 0.00 Average 2.05 0.62 75th Percentile 0.00 0.00 95th Percentile 1.82 0.80 Maximum 89.12 77.38 It can be seen in Figure 6.7 that the mean and median of the GA execution time for both methods are the same except that the greedy insertion execution time range is 3 seconds longer. Table 6.4 reports the statistics of each technique. The average optimality gap for the random insertion is 2.05%, whereas the average optimality gap for the greedy insertion method is 0.62%. Moreover, the table shows that for the greedy insertion method, 95% of the problems are solved with less than 1% gap, while in the random insertion, the 95th percentile gap is about 2%. Table 6.5 compares the optimality gaps obtained for different sizes of the problems. Comparing the 95th percentile of optimality gaps, we can observe that the greedy insertion method especially performs better in larger problem sizes (4-requests and 5-requests). Tables 6.4 and 6.5 show that the greedy method performs better on average and can find high-quality solutions for a larger number of problems, especially for problems with a higher number of pre-matched requests. Therefore, the greedy request insertion approach is better for creating new solutions in the proposed genetic algorithm. 82 Table 6.5 Greedy insertion and random insertion methods optimality gap comparison for different numbers of pre-matched requests Summary 2 requests gap% 3 requests gap% 4 requests gap% 5 requests gap% statistics Random Greedy Random Greedy Random Greedy Random Greedy Min 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Median 0.01 0.00 0.65 0.01 2.85 0.94 5.06 2.28 Average 0.00 0.00 0.00 0.00 0.00 0.00 0.09 0.00 75th Percentile 0.00 0.00 0.00 0.00 0.15 0.00 0.81 0.32 95th Percentile 0.00 0.00 0.49 0.00 1.65 0.46 43.10 4.04 Maximum 1.04 0.00 89.12 0.88 79.69 73.53 78.40 78.26 6.3 Sensitivity Analysis for Taxi-Sharing Problem Parameters Using Genetic Algorithm Now that the GA parameters are thoroughly studied, we can use it to study the sensitivity of the solutions to the parameters of the taxi-sharing problem defined in Equation 3.1. These parameters include vehicle dispatch penalty (C1), allowable ride-time factor for each passenger (C2), and pickup (wait) time threshold (WT). These analyses are conducted on the part of the dataset that includes 193 pre-matched groups of 5 riders (965 total requests) using the proposed GA. The default values for the studied parameters used in the experiments are 4000 for vehicle dispatch penalty, 1.5 for allowable ride-time factor, and 7 for pick-up time threshold. 6.3.1 The Impact of Vehicle Dispatch Penalty To study the impact of the vehicle dispatch penalty on the solutions, we obtained the solutions for different vehicle penalty values, shown as C1 in Equation 3.1. The summary of the results is presented in Table 6.6. 83 We can see from the table that by increasing the vehicle dispatch penalty, the total number of vehicles dispatched and the number of solo vehicles in the solutions decrease significantly. Consequently, the solo travel time decreases while the total travel time and pick-up times increase. This is expected because the higher vehicle cost promotes sharing to prevent dispatching additional vehicles. As a result, the use of solo vehicles is reduced, and the pick-up (wait) times are increased. Comparing the results for 4,000 and 6,000 vehicle penalties, we can observe that the number of vehicles is not further reduced. This can be due to other constraints of the problem resulting in both values yielding very similar results except for a few problems for which the GA cannot find an optimal solution. Table 0.6 Cost breakdown of the obtained solutions for 965 requests for different vehicle dispatch penalty values Vehicle Total travel Total solo travel Total waiting Total Solo dispatch time (sec.) time (sec.) time (sec.) vehicles vehicles penalty (sec) 1000 311546 51001 31113 526 175 2000 324179 31948 62436 460 97 4000 326653 27637 66091 451 84 6000 326296 27546 65911 451 85 6.3.2 The Impact of Allowable Ride-Time Factor The problems are solved for different values of the allowable ride time factor, shown as C2 in Equation 3.1, to study the impact of maximum allowable ride time of passengers on the solutions. Table 6.7 shows the summary of the results with different values of the parameter. 84 The table shows that increasing the maximum allowable ride time factor has a significant impact on the solution. The total number of vehicles dispatched, and the number of solo vehicles decrease significantly when the passengers? ride times are allowed to increase up to two times their direct travel time. Consequently, the solo travel time decreases while the total travel time and pick-up times increase. This is expected because the higher limit of maximum allowable ride time allows more sharing and increases the number of shared rides while decreasing the number of solo rides. Table 6.7 Cost breakdown of the obtained solutions for 965 requests for different allowable ride time factor values Allowable ride Total travel Total solo travel Total waiting Total Solo time factor time (sec.) time (sec.) time (sec.) vehicles vehicles 1.2 345808 105962 53587 598 276 1.5 324749 28566 65931 451 87 1.8 318306 13983 61784 395 35 2 313407 6055 61723 372 15 6.3.3 The Impact of Pick-up Time Threshold The pick-up time threshold (WT) is a constraint in our problem to control passenger waiting time for being picked up so that it does not exceed a reasonable time. This time is computed from the beginning of the drive time. The wait time is indirectly controlled in the pre-matching step by setting the radius of the co-rider search, which is set to a distance that can be reached by driving in a reasonable amount of time. For this reason, as well as the presence of wait time in the objective function of the minimization problem, 85 we do not expect a significant impact on the solution when changing the threshold compared to vehicle dispatch penalty and maximum allowable ride-time factor. To study the impact, we obtained the solutions for different pick-up time threshold values. The summary of the results is presented in Table 6.8. Table 6.8 Cost breakdown of the obtained solutions for 965 requests for different pick- up time threshold values Pick-up time Total travel Total solo travel Total waiting Total Solo threshold (min) time (sec.) time (sec.) time (sec.) vehicles vehicles 6 327175 33415 64359 452 93 7 324749 28566 65931 451 87 8 325378 26719 69139 445 77 Table 6.8 shows that by increasing the pick-up time threshold, the total number of vehicles dispatched and the number of solo vehicles in the solutions decrease. As expected, the range of variation is smaller compared to previously studied parameters. Increasing the allowable wait-time limit can increase the chance of passengers being matched. Therefore, the total number of vehicles and the number of solo vehicles may decrease as well as the total travel time and total solo travel time, as is the case in our results. 6.4 Large-Scale Experimental Results Having developed a heuristic GA approach that is capable of obtaining near- optimal solutions reasonably fast, we can proceed to demonstrate the benefits of the proposed taxi sharing system on a larger sample of the NYC dataset. 86 A sample of 43,027 requests is randomly selected from the dataset that accounts for about 2.5 hours of taxi demand data in Manhattan. The requests are pre-matched into 14,155 groups (problems). The problems are first solved without alternative pick- up points using the MIP model to create a baseline for comparison. Then, up to four alternative pick-up points are given to each request, and the problems are solved using the proposed GA algorithm. Results of the baseline model and the Alternative Meeting Point (AMP) model are reported and compared in Table 6.9. Table 0.9 Cost breakdown of the obtained solutions for 43,027 requests, comparing the baseline system to the GA based AMP system (AMP-GA) Total travel Total solo travel Total waiting Total vehicles Avg. process time time (hrs.) time (hrs.) time (hrs.) (Solo) (per problem) Baseline 3307.4 385.7 933.7 20863 (4561) 1.9 sec AMP-GA 3057.9 232.4 892.6 18967 (2709) 3.7 sec Savings 249.5 153.3 41.0 1896 (1852) It can be seen in Table 6.9 that the AMP model can save up to 249.5 hours in total travel time, in which more than 50% (153.3 hours) of the saved travel time is because of savings in solo travel times. This is also confirmed when comparing the total number of vehicles dispatched. It can be seen that the AMP model saves up to 1,896 vehicles while 1,852 vehicles out of 1,896 vehicles are saved from those dispatched to serve a single request (solo rider vehicles). Therefore, it can be concluded that introducing alternative meeting points for each request can significantly enhance the ability of the system to match requests to share a ride, increase the utilization of the vehicles, and reduce the number of vehicles dispatched to serve a single request. Although the AMP 87 requires fewer vehicles to serve the requests, it does not increase the wait time of passengers to be picked up and reduces up to 41 hours of total waiting time. These improvements only add less than two more seconds to the computation time of the AMP compared to the baseline model. 6.4 Summary In this chapter, we used a publicly available dataset to verify the efficiency and accuracy of the proposed genetic algorithm and perform several sensitivity analyses on the hyperparameters of the proposed framework. Initially, we compared the solutions of the proposed genetic algorithm to the optimally solved MILP results. We showed that in the large size problems (five requests), the proposed GA could obtain results with an average 2% optimal gap in less than 12 seconds per problem for 95% of the 193 problems used for this experiment. We then performed a sensitivity analysis on the population size and the number of generations of the GA and determined the right population size and number of generations for each problem size to obtain high-quality solutions while maintaining a reasonable solution time. Moreover, the two variants of the proposed insertion algorithm were examined to determine the best method for the general experiments. It was revealed that the greedy method performs better on average and can find high-quality solutions for a larger number of problems, especially for larger size problems. Finally, we presented experimental results comparing the proposed system to the baseline ride-sharing on large-scale data and showed that introducing alternative 88 meeting points for each request can significantly enhance the ability of the system to match requests to share a ride, increase the utilization of the vehicles, and reduce the number of vehicles dispatched to serve a single request. Also, it was concluded that although the proposed system requires fewer vehicles to serve the requests, it does not increase the wait time and even reduces the total waiting time in a comparable computation time. 89 Chapter 7: Pricing Policy Analysis This chapter proposes a pricing policy framework based on passengers? willingness to share a ride. The determinants of passenger?s willingness to share can, of course, be obtained through surveys and social studies. In this chapter, we study the proposed framework under hypothetical scenarios to demonstrate the advantages of the proposed ride-sharing system that incorporates alternative meeting points from the pricing policy point of view. 7.1 Pricing Policy One of the most important design parameters for a ride-sharing system is the discount rate offered to passengers to incentivize shared travel. Obviously, the larger the discount rate (?) is, the higher the percentage of riders willing to register for the slight discomfort of the shared travel (compared to the solo riding). Therefore, to simulate the variations of willingness to share with the discount rate (?), we assume that for each optimization problem, the proportion of passengers who volunteer to share a ride is equal to ? ? ?, where ? is a scaling factor obtained through surveys and social studies. Next, to include the impact of the discount rate into the optimization problem, the originally defined objective function is slightly modified to account for the impact of variations in the discounted price for sharing a ride (organization cost) while considering the corresponding variations in the passengers? willingness to share a ride. It is assumed 90 that the company will incentivize ride-sharing by reimbursing a portion of the ride fare that is proportional to the total inconvenience experienced due to sharing a ride (exceeded ride time from the passenger?s direct travel time, plus the wait time if the passenger is not the first request picked up). Therefore, the reimbursement cost is added to the objective function (4.1) as: min z? = ? + (?i?P? ???? ? ? + ???? ????? ? ??? ??? ) . ?. ? (7.1) In the formula, ? is the discount rate, and ? ? ? is the compensation for the unit of time the passenger is waiting for the vehicle (??? ) and the time passenger is delayed (????) compared to when served individually. In this study, we analyzed the impact of discount on the total cost on a set of 355 problems, each having five requests (extracted from the large-scale experiment in chapter 6), and assumed each request contains only one passenger. For each problem, the number of participating individuals in the ride-sharing is determined using the formulas (7.2-7.4) below: ???? = ?????? ?? ???????? . ?. ? + 1 (7.2) ????_? = ???? ? ???(????) (7.3) ???(????) + 1 ?? ?~ ?(0,1) > ????_? ???? ??????? ?? ????? = { (7.4) ???(????) ?. ?. 91 where ?. ? is the willingness to share rate. ? and ? are scaling coefficients, and X is a random sample from the uniform distribution between zero and one. The parameters ? and ? are empirically set to 250 and 2, respectively, for this comparison to achieve reasonable scales on how ? impacts the objective function and the willingness of passengers to share. These are obviously hypothetical values to study how differently the discount rate parameter impacts the proposed system versus the baseline system. The average number of passengers willing to share at each discount rate for this experiment is depicted in Table 7.1. Based on the values selected for ? and ?, at a 40% discount rate, all passengers are willing to participate in the taxi-sharing. Table 7.1 Average number of participants per discount rate Discount Rate (%) 10 15 20 25 30 35 40 Avg. Participation 2 2.5 3 3.5 4 4.5 5 The total sum of the objective functions for different discount rates for the baseline model and the Alternative Meeting Point (AMP) model is compared in Figure 7.1. The AMP model consistently outperforms the baseline model in terms of operating costs with different discount rates. The breakdown of the sum of the objective function, in terms of average wait-time, average exceeded ride time, and the total number of shared vehicles, are shown in Figures 7.2, 7.3, and 7.4, respectively. It can be seen that for the AMP, the average wait-time, the average exceeded ride time, and the total number of shared 92 vehicles are consistently reduced with increasing discount rates, while this is not the case for the baseline system. 8300000 8200000 AMP baseline 8100000 8000000 7900000 7800000 7700000 7600000 7500000 7400000 7300000 0 0.1 0.2 0.3 0.4 0.5 Discount rate (?) Figure 0.1 Total objective function value vs. discount rates comparison for AMP model and baseline model. 14 12 10 8 6 4 2 AMP baseline 0 0 0.1 0.2 0.3 0.4 0.5 Discount rate () Figure 7.2 Average wait-time vs. discount rates comparison for AMP model and baseline model. 93 average wait-time Objective function (sec) 35 30 AMP baseline 25 20 15 10 5 0 0 0.1 0.2 0.3 0.4 0.5 Discount rate () Figure 07.03 Average exceeded ride-time function value vs. discount rates comparison for AMP model and baseline model. 250 200 150 100 50 AMP baseline 0 0 0.1 0.2 0.3 0.4 0.5 Discount rate () Figure 7.4 Total number of shared vehicles vs. discount rates comparison for AMP model and baseline model. The results imply that for the baseline system, although more passengers are willing to share when incentivized with higher discount rates, the efficiency of the system cannot really be improved. The AMP system, however, allows for significantly higher discount rates to incentivize ride-sharing and leads to a higher correlation between organizations? interests with that of the passengers to improve the overall utilization of the ride-sharing system. 94 shared vehicles average exceed times 7.2 Summary In this chapter, we proposed a pricing policy framework based on the assumption that providing discounts on the rides proportionate to the additional wait time and travel time the proposed system imposes encourages more passengers to participate in the ride- sharing system. We then studied the proposed policy under hypothetical scenarios to demonstrate the advantages of the proposed taxi-sharing system that incorporates alternative meeting points over the traditional ride-sharing from the pricing policy point of view. Specifically, we studied the impact of different discount rates on the system's efficiency. We showed that the AMP model offers the flexibility to accommodate the increasing number of passengers who are incentivized to share a ride due to the higher discount rates by constructing new routes with short detours. 95 Chapter 8: Conclusions and Directions for Future Research 8.1 Conclusion In this study, we proposed a taxi sharing system capable of determining convenient pick-up points for individuals to find faster routes and reduce the number of detours to pick up additional passengers. Moreover, the flexibility offered by introducing alternative pick-up points allows for finding better ride matches and itineraries to satisfy the demand and optimize the objectives. We investigated the suggested framework using the existing network, travel times, and taxi demand of the Manhattan area in NYC to deal with the practical challenges of implementing such a system in a real-world setting. To do so, first, the problem was decomposed by dividing the demand into groups of passengers with higher chances of sharing a taxi by defining a set of criteria. Then a procedure was designed to locate potential alternative pick-up points in the network. Different strategies were developed and explored to efficiently reduce the number of potential alternative pick-up points to those with higher potential of being used in the optimal route. A MILP and a genetic algorithm were designed to solve the problem. We performed several analyses on the proposed framework and showed that incorporating alternative meeting points can make a considerable difference in NYC congestion by reducing the number of vehicles and travel times compared to the traditional taxi sharing. We tried different meeting point selection strategies and showed that the Heatmap method is the best way to reduce potential alternative points. 96 Moreover, we evaluated the proposed genetic algorithm by comparing its results to optimal results from MILP and showed that the proposed GA is at least 1000 times faster and can obtain results with at worst 2% optimal gap in less than 12 seconds for 95% of the problems. As such, the proposed GA heuristic allows the implementation of the proposed framework in a real-world setting, where passengers can wait for a very short period of time to find their potential matches, as well as their best pick-up location. These results demonstrate the potential of GA in the efficient exploration of such large solution spaces. Of course, the meticulous design of the basic components of GA is extremely important to allow for a reasonably random and evolutionary search while adhering to the constraints of the problem at hand. Each new generation should inherit from the fittest parents (better sub-tours found so far) so that the randomness of search and the feasibility of the sub tours in the new generation are maintained. The insertion algorithms and the cross-over operations were specifically designed to address these requirements. The developed GA was used to carry out a large-scale experiment to delineate the performance of the proposed framework to the traditional ride-sharing. The results indicated that the alternative meeting points could significantly reduce travel and waiting time and specifically reduce the number of vehicles dispatched to serve a single request. Significant savings were achieved by only considering static requests, from passengers requesting around the same time, within a set radius. If we include the vehicles that are already serving to match them with newly received requests, the savings could be even more considerable. 97 Finally, we studied the impact of different discount rates on the efficiency of the ride-sharing system. We showed that the alternative meeting point model offers the flexibility to accommodate the increasing number of passengers who are incentivized to share a ride due to the higher discount rates by constructing new routes with short detours. To sum up, we tested the idea of introducing alternative pick-up locations to each request in a taxi-sharing framework and developed a heuristic approach to cope with the challenges of the problem when implemented in a real-world situation. The experiments showed that our system is superior in matching the requests to share a ride and the flexibility offered by picking up passengers at alternative meeting points compared to traditional ride-sharing. This results in reducing the number of solo rider vehicles and increasing the occupancy of the vehicles and consequently can reduce the congestion and accidents and have a positive environmental impact. The experiments were conducted on the data obtained from a real-world road network, and an efficient heuristic was developed to obtain results in a few seconds. As such, the framework can be readily implemented for real-world use cases, as long as the inputs to the problem, such as pick-up and drop- off coordinates, are provided and network travel times and alternative pick-up points can be obtained through software APIs. 8.2 Recommendations for Future Research This study could be extended in the following aspects: 1. This study only considers matching passengers for an empty ride and does not consider adding a new passenger to an ongoing ride. With slight modifications to 98 feasibility criteria and acquiring a time-stamped trajectory of the serving vehicles, the pool of requests can be significantly expanded by including moving vehicles with available seats. 2. This study also assumes that all the passengers choose to share a ride with others. A new model could analyze varying passenger travel preferences regarding travel times, pick-up locations, and flexibility in allowable waiting time. 3. The fleet in this study is a single type of taxi with a 6-passenger capacity. Mixed fleets of taxis could be considered in future models as a change in the capacity can significantly impact the number of dispatched vehicles, waiting time, and each vehicle?s driving time. 4. This study assumes that the passengers are always present at the pick-up location. This assumption could be relaxed in future studies. 5. In this study, we assumed a 300-meter distance threshold for the meeting locations from the original pick-up location and did not consider the walking time of the passengers in the model. In a future study, the walking time of the passengers could be considered in the model as a cost or to calculate the arrival time of the passenger to a meeting point to ensure the passengers' presence at the location before the vehicle. 6. In our initial experiments, after solving the routing problem, the requests served individually (solo) were identified and returned to the next pool of requests to be considered for sharing a ride with another pre-matched group. However, due to increased computation time, this step was eliminated from our solution framework. Future studies can evaluate the trade-off between the computation cost and the 99 potential benefit of reconsidering solo riders of the solutions in the next pre- matching window. 7. The proposed model is deterministic. The framework is designed to retrieve network costs (travel times) in real-time, querying map APIs (Google map API provides travel times with current traffic for a fee, however, for cost efficiency, we used travel times extracted through OSRM API). Although these provide reasonably realistic assumptions, it is still possible that the actual travel times on the network be slightly different in reality. In this case, a stochastic model can be considered to address such unpredictable differences. 8. The experimental results are carried out on the NYC road network and demand, which is a grid network, and the taxi demand is very dense. A future study should examine the proposed framework on other network geometries and demand scales. 9. A future study can investigate the meeting locations from a global perspective where the set of meeting points are identified, and the passengers are assigned to one of these locations. 10. In a future study, practical considerations for the real-world implementation of the proposed framework can be investigated. This may include market analysis and surveys to obtain more accurate estimates of passengers? willingness to share and the probability of no-shows for the targeted area. The estimates can be used to conduct a realistic cost analysis to determine the optimal pricing policy of the taxi sharing system for the targeted area. 11. Finally, a more complicated model can be designed to offer smarter meeting locations and adjust parameters such as wait time and allowable ride time factor 100 taking the traffic and weather condition into account. Or predict the willingness to share of passengers based on these variables. 101 References Agatz, N., A. Erera, M. Savelsbergh, and X. Wang, 2012. Optimization for dynamic ride- sharing: A review. European Journal of Operational Research, 223(2):295?303. Agatz, N., A. L. Erera, M. W. Savelsbergh, and X. Wang, 2011. Dynamic ride-sharing: A simulation study in metro Atlanta. Procedia-Social and Behavioral Sciences, 17:532? 550. Agrawal, N., 2018. The average speed of traffic in midtown manhattan is 4.7 mph. new york thinks it?s found a solution. https://www.latimes.com/nation/ la-na-new-york- traffic-manhattan-20180124-story.html. Accessed: 2019-07-30. Aissat, K., and A. Oulamara, 2014. Dynamic ride sharing with intermediate locations. In 2014 IEEE Symposium on Computational Intelligence in Vehicles and Transportation Systems (CIVTS), Pp. 36?42. IEEE. Aissat, K., and A. Oulamara, 2014. Meeting locations in real-time ride sharing problem: A buckets approach. In International Conference on Operations Research and Enterprise Systems, pp. 71?92. Springer. Altshuler, T., Katoshevski, R., & Shiftan, Y. (2017). Ride sharing and dynamic networks analysis. arXiv preprint arXiv:1706.00581. Armant, V. and Brown, K.N., 2014, November. Minimizing the driving distance in ride sharing systems. In 2014 IEEE 26th International Conference on Tools with Artificial Intelligence (pp. 568-575). IEEE. Asghari, M., Deng, D., Shahabi, C., Demiryurek, U. and Li, Y., 2016, October. Price- aware real-time ride-sharing at scale: an auction-based approach. In Proceedings of the 24th ACM SIGSPATIAL international conference on advances in geographic information systems (pp. 1-10). Asghari, M. and Shahabi, C., 2017, November. An on-line truthful and individually rational pricing mechanism for ride-sharing. In Proceedings of the 25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (pp. 1-10). Baldacci, R., Maniezzo, V. and Mingozzi, A., 2004. An exact method for the carpooling problem based on lagrangean column generation. Operations Research, 52(3), pp.422- 439. 102 Banerjee, S., Riquelme, C. and Johari, R., 2015. Pricing in ride-share platforms: A queueing-theoretic approach. Available at SSRN 2568258. Bian, Z. and Liu, X., 2017, April. Planning the ride-sharing route for the first-mile service linking to railway passenger transportation. In ASME/IEEE Joint Rail Conference (Vol. 50718, p. V001T04A002). American Society of Mechanical Engineers. Bian, Z. and Liu, X., 2018, April. A detour-based pricing mechanism for first-mile ride- sharing in connection with rail public transit. In ASME/IEEE Joint Rail Conference (Vol. 50978, p. V001T04A003). American Society of Mechanical Engineers. Bistaffa, F., Farinelli, A., Chalkiadakis, G. and Ramchurn, S.D., 2015, September. Recommending fair payments for large-scale social ride sharing. In Proceedings of the 9th ACM Conference on Recommender Systems (pp. 139-146). Biswas, A., Gopalakrishnan, R., Tulabandhula, T., Metrewar, A., Mukherjee, K. and Thangaraj, R.S., 2017a. Impact of detour-aware policies on maximizing profit in ride sharing. arXiv preprint arXiv:1706.02682. Biswas, A., Gopalakrishnan, R., Tulabandhula, T., Mukherjee, K., Metrewar, A. and Thangaraj, R.S., 2017b, May. Profit optimization in commercial ride sharing. In Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems (pp. 1481-1483). Balardino, A. F., and A. G. Santos, 2014. Heuristic and exact approach for the close enough ride matching problem. In International Conference on Hybrid Intelligent Systems, pp. 281?293. Springer. Bicocchi, N., M. Mamei, A. Sassi, and F. Zambonelli, 2015. Opportunistic ride sharing via whereabouts analysis. In 2015 IEEE 18th International Conference on Intelligent Transportation Systems, pp. 875?881. IEEE. Braekers, K., K. Ramaekers, and I. V. Nieuwenhuyse, 2015. The vehicle routing problem: State of the art classification and review. Computers Industrial Engineering, 99:300 ? 313. Bruck, B. P., V. Incerti, M. Iori, and M. Vignoli, 2015. Minimizing co2 emissions in a practical daily carpooling problem. Computers & Operations Research, 81:40?50. 103 Cao, Y., Wang, S. and Li, J., 2021. The Optimization Model of Ride-Sharing Route for Ride Hailing Considering Both System Optimization and User Fairness. Sustainability, 13(2), p.902. Calvo, R.W., de Luigi, F., Haastrup, P. and Maniezzo, V., 2004. A distributed geographic information system for the daily carpooling problem. Computers & Operations Research, 31(13), pp.2263-2278. Calvo, R. W., and A. Colorni, 2007. An effective and fast heuristic for the dial-a-ride problem. 4OR, 5(1):61?73. Carotenuto, P., L. Paradisi, and G. Storchi, 2014. A flexible transport service for passengers. Transportation Research Procedia, 3:442?451. Chen, L., Zhong, Q., Xiao, X., Gao, Y., Jin, P. and Jensen, C.S., 2018, April. Price-and- time-aware dynamic ride sharing. In 2018 IEEE 34th International Conference on Data Engineering (ICDE) (pp. 1061-1072). IEEE. Chen, W., Mes, M., Schutten, M. and Quint, J., 2019. A ride-sharing problem with meeting points and return restrictions. Transportation Science, 53(2), pp.401-426. Cheng, S.F., Nguyen, D.T. and Lau, H.C., 2012, December. A mechanism for organizing last-mile service using non-dedicated fleet. In 2012 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology (Vol. 2, pp. 85-89). IEEE. Cheng, S.F., Nguyen, D.T. and Lau, H.C., 2014. Mechanisms for arranging ride sharing and fare splitting for last-mile travel demands. AAMAS. Chou, S.K., Jiau, M.K. and Huang, S.C., 2016. Stochastic set-based particle swarm optimization based on local exploration for solving the carpool service problem. IEEE transactions on cybernetics, 46(8), pp.1771-1783. Cordeau, J. F., 2006. A branch-and-cut algorithm for the dial-a-ride problem. Operations Research, 54(3):573?586. Cordeau, J. F., and G. Laporte, 2003. The dial-a-ride problem (darp): Variants, modeling issues and algorithms. Quarterly Journal of the Belgian, French and Italian Operations Research Societies, 1(2):89?101. 104 Coslovich, L., R. Pesenti, and W. Ukovich, 2006. A two-phase insertion technique of unexpected customers for a dynamic dial-a-ride problem. European Journal of Operational Research, 175(3):1605?1615. Dantzig, G. B., and J. H. Ramser, 1959. The truck dispatching problem. Management Science, 6(1):80?91. Desaulniers, G., 2000. The VRP with pickup and delivery. Groupe d??etudes et de recherche en analyse des d?ecisions (Montr?eal, Qu?ebec). Di Febbraro, A., Gattorna, E. and Sacco, N., 2013. Optimization of dynamic ride sharing systems. Transportation research record, 2359(1), pp.44-50. Fan, J., Xu, J., Hou, C., Cao, B., Dong, T., and Cheng, S., 2018, July. Uroad: An efficient algorithm for large-scale dynamic ride sharing service. In 2018 IEEE International Conference on Web Services (ICWS) (pp. 9-16). IEEE. Fang, Z., Huang, L. and Wierman, A., 2019. Prices and subsidies in the sharing economy. Performance Evaluation, 136, p.102037. Federal Enterprise Data Resources. [NYC taxi zones]. https://catalog.data.gov/dataset/nyc-taxi-zones. Accessed: 2019-07-30. Fellows, N., and D. Pitfield, 2000. An economic and operational evaluation of urban car- sharing. Transportation Research Part D: Transport and Environment, 5(1):1?10. Furuhata, M., M. Dessouky, F. Ord?on?ez, M.-E. Brunet, X. Wang, and S. Koenig 2013. Ride sharing: The state-of-the-art and future directions. Transportation Research Part B: Methodological, 57:28 ? 46. Geisberger, R., D. Luxen, S. Neubauer, P. Sanders, and L. Volker, 2009, Fast detour computation for ride sharing. arXiv preprint arXiv:0907.5269. Gendreau, M., J.-Y. Potvin et al., 2009. Handbook of metaheuristics, volume 2. Springer. Ghoseiri, K., 2012. Dynamic rideshare optimized matching problem. Ph.D. dissertation, University of Maryland, College Park. Ghoseiri, K., A. Haghani, and M. Hamedi, 2010. Real-time rideshare matching problem. Technical report, Mid-Atlantic Universities Transportation Center. 105 Gopalakrishnan, R., Mukherjee, K., & Tulabandhula, T. 2016. The costs and benefits of sharing: Sequential individual rationality and sequential fairness. arXiv preprint arXiv:1607.07306. Ha?ll, C. H., J. T. Lundgren, and P. V?arbrand, 2008. Evaluation of an integrated public transport system: a simulation approach. Archives of Transport, 20(1-2):29?46. Hall, J., Kendrick, C. and Nosko, C., 2015. The effects of Uber?s surge pricing: A case study. The University of Chicago Booth School of Business. Hao, Y., 2018. An integer programming model for dynamic taxi sharing considering provider profit. M.S. Thesis, University of Maryland, College Park. Holland, J.H., 1975. Adaptation in natural and artificial systems, Univ. of Mich. press. Ann Arbor. Hosni, H., J. Naoum-Sawaya, and H. Artail, 2014. The shared-taxi problem: Formulation and solution methods. Transportation Research Part B: Methodological, 70:303 ? 318. Hou, L., Li, D. and Zhang, D., 2018. Ride-matching and routing optimisation: Models and a large neighbourhood search heuristic. Transportation Research Part E: Logistics and Transportation Review, 118, pp.143-162. Hsieh, F.S., Zhan, F.M. and Guo, Y.H., 2019. A solution methodology for carpooling systems based on double auctions and cooperative coevolutionary particle swarms. Applied Intelligence, 49(2), pp.741-763. Huang, S.C., Jiau, M.K. and Lin, C.H., 2014. A genetic-algorithm-based approach to solve carpool service problems in cloud computing. IEEE Transactions on intelligent transportation systems, 16(1), pp.352-364. Huang, S.C., Jiau, M.K. and Chong, K.H., 2017. A heuristic multi-objective optimization algorithm for solving the carpool services problem featuring high-occupancy-vehicle itineraries. IEEE Transactions on Intelligent Transportation Systems, 19(8), pp.2663- 2674. Huang, S.C., Jiau, M.K. and Liu, Y.P., 2018. An ant path-oriented carpooling allocation approach to optimize the carpool service problem with time windows. IEEE Systems Journal, 13(1), pp.994-1005. 106 Jacobson, S. H., and D. M. King, 2009. Fuel saving and ride sharing in the US: Motivations, limitations, and opportunities. Transportation Research Part D: Transport and Environment, 14(1):14?21. Jiau, M.K. and Huang, S.C., 2018. Self-organizing neuroevolution for solving carpool service problem with dynamic capacity to alternate matches. IEEE transactions on neural networks and learning systems, 30(4), pp.1048-1060. Jung, S., and A. Haghani, 2001. Genetic algorithm for the time-dependent vehicle routing problem. Transportation Research Record, 1771(1):164?171. Jung, J., Jayakrishnan, R. and Park, J.Y., 2016. Dynamic shared?taxi dispatch algorithm with hybrid?simulated annealing. Computer?Aided Civil and Infrastructure Engineering, 31(4), pp.275-291. Kashani, Z. N., N. Ronald, and S. Winter, 2016. Comparing demand responsive and conventional public transport in a low demand context. In 2016 IEEE International Conference on Pervasive Computing and Communication Workshops (PerCom Workshops), pp. 1?6. IEEE. Kamar, E. and Horvitz, E., 2009, June. Collaboration and shared plans in the open world: Studies of ride sharing. In Twenty-First International Joint Conference on Artificial Intelligence. Kim, T., & Haghani, A., 2011. Model and algorithm considering time-varying travel times to solve static multidepot dial-a-ride problem. Transportation research record, 2218(1), 68-77. Kleiner, A., Nebel, B. and Ziparo, V.A., 2011, June. A mechanism for dynamic ride sharing based on parallel auctions. In Twenty-Second International Joint Conference on Artificial Intelligence. Laporte, G., 1992. The traveling salesman problem: An overview of exact and approximate algorithms. European Journal of Operational Research, 59(2):231?247. Li, S., Fei, F., Ruihan, D., Yu, S., & Dou, W. (2016, December). A dynamic pricing method for carpooling service based on coalitional game analysis. In 2016 IEEE 18th International Conference on High Performance Computing and Communications; IEEE 14th International Conference on Smart City; IEEE 2nd International Conference on Data Science and Systems (HPCC/SmartCity/DSS) (pp. 78-85). IEEE. 107 Li, X., Hu, S., Fan, W. and Deng, K., 2018. Modeling an enhanced ride sharing system with meet points and time windows. PloS one, 13(5), p.e0195927. Liu, Y. and Li, Y., 2017. Pricing scheme design of ride sharing program in morning commute problem. Transportation Research Part C: Emerging Technologies, 79, pp.156-177. Lloret-Batlle, R., Masoud, N. and Nam, D., 2017. P2p ride sharing with ride-back on hov lanes: Towards a practical alternative mode for daily commuting. Lois, A., and A. Ziliaskopoulos, 2017. Online algorithm for dynamic dial a ride problem and its metrics. Transportation Research Procedia, 24:377?384. Lu, W. and Quadrifoglio, L., 2019. Fair cost allocation for ride sharing services? modeling, mathematical programming and an algorithm to find the nucleolus. Transportation Research Part B: Methodological, 121, pp.41-55. Lyu, Y., V. C. Lee, J. K.-Y. Ng, B. Y. Lim, K. Liu, and C. Chen, 2019. Flexi-sharing: A flexible and personalized taxi-sharing system. IEEE Transactions on Vehicular Technology, 68(10):9399?9413. Ma, S., Y. Zheng, and O. Wolfson, 2014. Real-time city-scale taxi ride sharing. IEEE Transactions on Knowledge and Data Engineering, 27(7):1782?1795. Ma, T.Y., 2017, June. On-demand dynamic Bi-/multi-modal ride sharing using optimal passenger-vehicle assignments. In 2017 IEEE International Conference on Environment and Electrical Engineering and 2017 IEEE Industrial and Commercial Power Systems Europe (EEEIC/I&CPS Europe) (pp. 1-5). IEEE. Ma, C., He, R. and Zhang, W., 2018. Path optimization of taxi carpooling. PLoS One, 13(8), p.e0203221. Ma, H., Fang, F. and Parkes, D.C., 2020. Spatio-temporal pricing for ride sharing platforms. ACM SIGecom Exchanges, 18(2), pp.53-57. Manna, C., and S. Prestwich, 2014. Online stochastic planning for taxi and ride-sharing. In 2014 IEEE 26th International Conference on Tools with Artificial Intelligence, pp. 906?913. IEEE. 108 Markovi?c, N., R. Nair, P. Schonfeld, E. Miller-Hooks, and M. Mohebbi, 2014. Optimizing dial-a-ride services in Maryland: benefits of computerized routing and scheduling. Transportation Research Part C: Emerging Technologies, 55:156?165. Masoud, N. and Lloret-Batlle, R., 2016. Increasing ridership and user permanence in ride sharing systems using a novel peer-to-peer exchange mechanism. In 95th Annual Meeting of the Transportation Research Board, Washington, DC. Masoud, N., Nam, D., Yu, J. and Jayakrishnan, R., 2017a. Promoting peer-to-peer ride sharing services as transit system feeders. Transportation Research Record, 2650(1), pp.74-83. Masoud, N., Lloret-Batlle, R. and Jayakrishnan, R., 2017b. Using bilateral trading to increase ridership and user permanence in ride-sharing systems. Transportation Research Part E: Logistics and Transportation Review, 102, pp.60-77. Nguyen, D.T., 2013. Fair cost sharing auction mechanisms in last mile ride-sharing. Singapore Management University (Singapore). Nelson, J. D., S. Wright, B. Masson, G. Ambrosino, and A. Naniopoulos, 2010. Recent developments in flexible transport services. Research in Transportation Economics, 29(1):243?248. Papadimitriou, C.H. and Steiglitz, K., 1998. Combinatorial optimization: algorithms and complexity. Courier Corporation. Psaraftis, H. N., 1983. An exact algorithm for the single vehicle many-to-many dial-a- ride problem with time windows. Transportation Science, 17(3):351?357. Qian, X. and Ukkusuri, S.V., 2017. Taxi market equilibrium with third-party hailing service. Transportation Research Part B: Methodological, 100, pp.43-63. Qian, X., Zhang, W., Ukkusuri, S.V. and Yang, C., 2017. Optimal assignment and incentive design in the taxi group ride problem. Transportation Research Part B: Methodological, 103, pp.208-226. Ronak, T., 2017. A street smart pool experience for Manhattan. https://www.uber.com/ newsroom/manhattanpool/. Accessed: 2019-07-30. Rudnicki, R., K. H. Anders, and M. Sester, 2008. Rendezvous-problem in local shared- ride trip planning. In International Society for Photogrammetry and Remote Sensing Congress. Citeseer. 109 Santos, D. O., and E. C. Xavier, 2015. Taxi and ride sharing: A dynamic dial-a-ride problem with money as an incentive. Expert Systems with Applications, 42(19):6728? 6737. Shen, W., Lopes, C.V. and Crandall, J.W., 2016. An online mechanism for ride sharing in autonomous mobility-on-demand systems. arXiv preprint arXiv:1603.02208. Schrank, D., B. Eisele, and T. Lomax, 2012. Tti?s 2012 urban mobility report. Texas A&M Transportation Institute. The Texas A&M University System, 4. Sprung, M. J., M. Chambers, et al., 2012. Transportation statistics annual report 2012. Stiglic, M., N. Agatz, M. Savelsbergh, and M. Gradisar, 2015. The benefits of meeting points in ride-sharing systems. Transportation Research Part B: Methodological, 82:36? 53. Syswerda, G., 1991. Scheduling optimization using genetic algorithms. Handbook of genetic algorithms. The City of New York. Taxi & Limousine Commission [Green Cabs]. https://www1.nyc.gov/site/tlc/vehicles/get-a-vehicle-license.page. Accessed: 2019-07- 30. The City of New York. Taxi & Limousine Commission [TLC taxi zones Manhattan]. https://www1.nyc.gov/assets/tlc/images/ content/pages/about/taxi_zone_map_manhattan.jpg. Accessed: 2019-07-30. Ting, K.H., Lee, L.S. and Pickl, S., 2021. Shared Mobility Problems: A Systematic Review on Types, Variants, Characteristics, and Solution Approaches. Applied Sciences, 11(17), p.7996. Wang, X., M. Dessouky, and F. Ordonez, 2015. A pickup and delivery problem for ride- sharing considering congestion. Transportation Letters, 8(5):259?269. Wang, X., Agatz, N. and Erera, A., 2017. Stable matching for dynamic ride-sharing systems. Transportation Science, 52(4), pp.850-867. Wang, X., Yang, H. and Zhu, D., 2018. Driver-rider cost-sharing strategies and equilibria in a ride-sharing program. Transportation Science, 52(4), pp.868-881. 110 Witt, A., Suzor, N. and Wikstr?m, P., 2015. Regulating ride-sharing in the peer economy. Communication Research and Practice, 1(2), pp.174-190. Xing, X., T. Warden, T. Nicolai, and O. Herzog, 2009. Smize: a spontaneous ride-sharing system for individual urban transit. In German Conference on Multiagent System Technologies, Pp. 165?176. Springer. Yan, S. and Chen, C.Y., 2011a. An optimization model and a solution algorithm for the many-to-many carpooling problem. Annals of Operations Research, 191(1), pp.37-71. Yan, D., Z. Zhao, and W. Ng, 2011b. Efficient algorithms for finding optimal meeting point on road networks. Proceedings of the VLDB Endowment, 4(11):1?11. Yang, H., Wong, S.C. and Wong, K.I., 2002. Demand?supply equilibrium of taxi services in a network under competition and regulation. Transportation Research Part B: Methodological, 36(9), pp.799-819. Ye, Q., Ma, C., He, R., Xiao, Q. and Zhang, W., 2015. Multi-objective optimisation for taxi ride sharing route based on non-dominated sorting genetic algorithm. International Journal of Wireless and Mobile Computing, 8(3), pp.262-270. Zhang, J., Wen, D. and Zeng, S., 2015. A discounted trade reduction mechanism for dynamic ride sharing pricing. IEEE Transactions on Intelligent Transportation Systems, 17(6), pp.1586-1595. Zhang, W. and Ukkusuri, S.V., 2016. Optimal fleet size and fare setting in emerging taxi markets with stochastic demand. Computer?Aided Civil and Infrastructure Engineering, 31(9), pp.647-660. Zhang, C., Wu, F., Gao, X. and Chen, G., 2017, December. Online auctions with dynamic costs for ride sharing. In 2017 IEEE 23rd International Conference on Parallel and Distributed Systems (ICPADS). pp. 127-134. IEEE. Zhang, C., Wu, F. and Bei, X., 2018, August. An efficient auction with variable reserve prices for ridesourcing. In Pacific Rim International Conference on Artificial Intelligence (pp. 361-374). Springer, Cham. Zhang, X., Zhang, Q., Yuan, Z., Wang, C., and Zhang, L., 2020. The Research on Planning of Taxi Sharing Route and Sharing Expenses. Mathematical Problems in Engineering, 2020. 111 Zhao, D., Zhang, D., Gerding, E.H., Sakurai, Y., and Yokoo, M., 2014. Incentives in ride sharing with deficit control. KITAKYUSHU UNIV (JAPAN). Zhao, D., Ramchurn, S.D. and Jennings, N.R., 2015. Incentive design for ride sharing with uncertainty. arXiv preprint arXiv:1505.01617. 112