ABSTRACT

Title of dissertation: DYNAMIC DISCRETE CHOICE MODELS FOR CAR OWNERSHIP MODELING
Renting Xu, Doctor of Philosophy, 2011
Dissertation directed by: Assistant Professor Cinzia Cirillo, Department of Civil & Environmental Engineering

With the continuous and rapid changes in modern societies, such as the introduction of advanced technologies, aggressive marketing strategies and innovative policies, researchers in disciplines from social science to economics increasingly recognize that choice situations take place in a dynamic environment and that strong interdependencies exist among decisions made at different points in time. Increasing concerns about climate change, the development of high-tech vehicles, and the extensive application of demand models in economics and transportation motivate this research on vehicle ownership based on disaggregate discrete choices. Over the next five to ten years, dramatic changes in the automotive marketplace are expected to occur and new opportunities might arise. Therefore, a methodology to model dynamic vehicle ownership choices is formulated and implemented in this dissertation for short and medium-term planning.

In the proposed dynamic model framework, the car ownership problem is described as a regenerative optimal stopping problem; when a purchase is made, the current vehicle state (vehicle age, mileage driven, etc.) is regenerated. The model allows the estimation of the probability of buying a new vehicle or postponing this decision; if the decision to buy is made, the model further investigates the vehicle type choice. Dynamic models explicitly account for consumers' expectations of future vehicle quality or market evolution, arising endogenously from their purchase decisions.

Both static and dynamic formulations are applied first to simulated data in order to test the ability to recover the true underlying parameters of the synthetic population.
Results obtained attest that the dynamic model outperforms the static MNL in terms of goodness of fit, parameter bias and predictive power. In particular, it is found that MNL captures the general trends in choice probabilities, but fails to recover peaks in demand and behavioral changes due to rapidly evolving external conditions.

The extension to a real case study required a data collection effort. A preliminary pilot survey was designed and executed in the State of Maryland in fall 2010; the survey was self-administered and web-based. Choices were made under the hypothesis that six months passed between one decision and the next, and choices over a hypothetical period of six years were recorded.

Finally, the application of dynamic discrete choice models to vehicle ownership decisions in the context of the introduction of new technology is proposed. Results from the real case study confirm our initial expectations, as the model fit is significantly superior to the fit of the static model.

DYNAMIC DISCRETE CHOICE MODELS FOR CAR OWNERSHIP MODELING

by Renting Xu

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy 2011

Advisory Committee:
Professor Cinzia Cirillo, Chair/Advisor
Professor Anna Alberini
Professor Fabian Bastin
Professor Paul Schonfeld
Professor Lei Zhang

© Copyright by Renting Xu 2011

Acknowledgments

This research has been under way for three years, and finally received approval from the National Science Foundation in the summer of 2011. This precious honor greatly encouraged me to continue with my research on dynamic discrete choice modeling and to believe in myself when facing every challenge.

I would like to express my gratitude to all the people who have contributed towards the completion of this thesis. I am deeply indebted to my advisor Dr.
Cinzia Cirillo, whose help, stimulating suggestions and encouragement supported me throughout the research for and writing of this thesis. At every happy and difficult moment in my past four years, she has always been with me and supportive. She is not only a valuable guide in my research career, but also my good friend.

I would also like to thank Dr. Fabian Bastin from the University of Montreal, who has always offered his guidance and suggestions on my model formulation and simulation process. I also appreciate all the help he provided while I was in Montreal working with him.

Additionally, I would like to thank Michael Maness, who has always helped me with C programming problems. I also thank Jean-Michel Tremblay, who helped to calibrate historical data for the dynamic research.

Finally, I would like to thank my family for their support and company. I especially thank my mother for traveling to the US and helping me through the most difficult time.

CONTENTS

Acknowledgements
List of Tables
List of Figures
List of Abbreviations
1. Introduction
   1.1 Background
       1.1.1 In Economics
       1.1.2 In Transportation
   1.2 Objective of the Research
   1.3 Outline of the Dissertation
2. Car Ownership Forecasting Methodology Review
   2.1 Aggregate Models
   2.2 Disaggregate Static Models
   2.3 Joint Discrete-continuous Models
   2.4 (Pseudo)-panel methods
   2.5 Dynamic car transaction models
   2.6 Summary
3. Dynamic Discrete Choice Models Review
   3.1 Discrete Choice Models and the Dynamics
   3.2 Markov Decision Process and Dynamic Discrete Choice Structure
       3.2.1 Theory of Dynamics
       3.2.2 Dynamic Discrete Choice Models
   3.3 Discussion by Model Type
       3.3.1 Rust Optimal Stopping Problem
       3.3.2 Melnikov Demand Model for Differentiated Durable Products
       3.3.3 Computer Server Choice Model with Persistence Effect
       3.3.4 Dynamic Durable Goods Demand with Consumer Heterogeneity
       3.3.5 Dynamic Durable Goods Demand with Repeat Purchases
   3.4 Summary of Dynamic Demand Models in Economics
4. Dynamic Car Ownership Formulation
   4.1 Car Ownership Formulation
       4.1.1 General Consumer Stopping Problem
       4.1.2 Utility Formulation
       4.1.3 Industry Evolution
       4.1.4 Objective Function and Parameters to Estimate
       4.1.5 Dynamic Estimation Process
   4.2 Conclusions
5. Experiments Using Simulated Data
   5.1 Simulated Data Format and Generation
       5.1.1 Household Characteristics
       5.1.2 Current Vehicle Attributes
       5.1.3 Static Potential Vehicle Attributes
       5.1.4 Dynamic Attributes
       5.1.5 Choice
   5.2 Utility Specification
   5.3 Model Estimation
   5.4 Model Application
   5.5 Conclusion
6. Survey Design and Methodology
   6.1 Survey Design
       6.1.1 Household Characteristics
       6.1.2 Current Vehicle
       6.1.3 Stated Preference
   6.2 Survey Methodology
       6.2.1 Sample design
   6.3 Platform for the Web-based Survey Design
   6.4 Conclusion
7. Descriptive Statistics
   7.1 Socioeconomics Results
   7.2 Current Vehicle Characteristics
   7.3 Stated Preference experiment: vehicle technology game
   7.4 Conclusion
8. Experiments Using Data Collected
   8.1 Static Model Results
   8.2 Dynamic Model Results
   8.3 Model Application
   8.4 Conclusion
9. Conclusions
   9.1 Contributions
   9.2 Future Work
Appendix
A. Simulated Input Data File Format
B. List of Possible Questions for the Survey
C. Sample Scenario Designs
D. Distribution of Households
E. MAJOR C CODE FOR THE FORMULATION AND ESTIMATION
Bibliography

LIST OF TABLES

2.1 Comparison of types of car ownership models
3.1 Comparison of the five dynamic models
5.1 Model Estimation of Experiment One
5.2 Model Estimation of Experiment Two
5.3 Model Validation: Market Shares of Experiment One
5.4 Model Validation: Market Shares of Experiment Two
6.1 Vehicle Technology Game Summary
6.2 Fuel Type Game Summary
6.3 Tolling and Taxing Game Summary
7.1 Socioeconomics Results
7.2 Current Vehicle Characteristics
7.3 Scenarios In Which Respondents Bought a Vehicle
7.4 Scenarios in Which Respondents Bought a New Non-Conventional Gasoline Vehicle
7.5 SP Game 1 Vehicle Type Choice as Percentage
8.1 Static Logit Model Estimation
8.2 Dynamic Model Estimation
8.3 Model Validation: Market Shares

LIST OF FIGURES

4.1 Scenario tree
5.1 Market Trend for Gasoline Car-Experiment 1
5.2 Market Trend for Hybrid Car-Experiment 1
5.3 Market Trend for Electric Car-Experiment 1
5.4 Market Trend for Current Car-Experiment 1
5.5 Market Trend for Gasoline Car-Experiment 2
5.6 Market Trend for Hybrid Car-Experiment 2
5.7 Market Trend for Electric Car-Experiment 2
5.8 Market Trend for Current Car-Experiment 2
8.1 Market Trend for Gasoline Car
8.2 Market Trend for Hybrid Car
8.3 Market Trend for Electric Car
8.4 Market Trend for Current Car
A.1 Household Characteristics
A.2 Current Vehicle Attributes
A.3 Potential Vehicle Attributes
A.4 Dynamic Attributes
A.5 Choice

List of Abbreviations

DDCM     Dynamic Discrete Choice Model
BEV      Battery Electric Vehicles
PHEV     Plug-in Hybrid Electric Vehicles
NTRF     National Road Traffic Forecasts
ALTRANS  ALternative Transport Systems
ORL      Ordered Response Logit
MNL      Multinomial Logit
STM      Sydney Strategic Transport Model
VMM      Vehicle Market Model
DETR     Department of the Environment, Transport and the Regions
RP       Revealed Preference
SP       Stated Preference
MNP      Multinomial Probit
DVTM     Dutch Dynamic Vehicle Transaction Model
GEV      Generalized Extreme Value
OGEV     Ordered Generalized Extreme Value
PCL      Paired Combinatorial Logit
CNL      Cross-nested Logit
MDP      Markov Decision Process
CI       Conditional Independence
RUM      Random Utility Maximization
NTS      National Transportation Survey

1. INTRODUCTION

1.1 Background

The classical economic theory of consumer behavior provides a logically consistent foundation for the empirical analysis of many aspects of individuals' choice decisions. This realm of behavior involves choice among discrete alternatives, taste variation in the population and the individual's choice-making procedure.

The origin of discrete choice models is in economics. These statistical procedures describe choices made by people among a finite set of alternatives. Daniel McFadden won the Nobel Prize in 2000 for his pioneering work in developing the theoretical basis for discrete choice. In marketing research, discrete choice models can be used to study consumer demand, to predict market share, and to solve business-related problems such as pricing and product development. In energy and environmental studies, discrete choice models are used to make forecasts (e.g., households' and firms' choice of heating system) and to examine people's choice of fishing or skiing site.
Some labor economists use discrete choice models to examine occupation choice, retirement choice, and education or training program choice [Aguirregabiria and Mira, forthcoming, 2009]. In the transportation area, planners have been using discrete choice models for decades to predict demand for transportation facilities; travelers' choice of mode, route, destination, and time; and even travelers' one-day activity patterns [Ben-Akiva and Lerman, 1985, Daly, 1982, Hall, 2003].

In recent years, the objectives of transportation planning have evolved from adding road and transit capacity to managing travel demand, connecting modes/trips, and reducing emissions. Car ownership models play a central role in the planning and decision making of various public agencies and private organizations: a) the US Department of Energy, b) State Departments of Transportation, c) the auto industry and d) local transit agencies [Train, 1979]. The Clean Air Act of 1990 strengthens the role of demand-side policies and requires that metropolitan planning organizations (MPOs) with populations over 200,000 have their planning procedures recertified by USDOT every three years. MPOs are now required to greatly improve their capabilities for modeling travel and land development and the effects of the resultant travel and land use patterns on the economy, environment, and social equity. However, some behaviors are missing completely from the MPOs' transportation model systems, so new sub-models must be created; these include car ownership (number and types of cars per household), which strongly affects trip/tour/chain generation and mode choice [Johnston, 2003]. In an ongoing project started in 2003, the federal government (through a program jointly administered by the Federal Highway Administration (FHWA) and the Federal Transit Administration) provides support to MPOs that wish to conduct a peer review of their travel modeling.
Reviewers frequently suggest improving or including a car ownership model in the transportation modeling system.

Rising oil prices and environmental consciousness, in particular about climate change, are major drivers of the global race to develop and promote high-technology cars. So is the concern over energy security, especially at times of turmoil in the Middle East. Technological advances have also brought electric vehicles closer. Energy prices in the twentieth century rose sharply and will rise steadily once the global economy fully recovers and creates a competitive marketplace for alternative energy sources. In addition, state and national governments are interested in adjusting public policy to reduce dependence on foreign oil, decrease air pollution, and combat climate change. Therefore, technology, energy, and policy development create an interesting opportunity for changes in the automotive marketplace over the next five to ten years. Traditional static discrete choice models cannot reliably predict consumer preferences for future vehicles under the expected changes in technology and environmental awareness.

Dynamics in car ownership choice, along both the intertemporal dimension (resistance to change in ownership levels due to uncertainty of financial position) and the intratemporal dimension (acquired taste for a certain lifestyle), have been studied by researchers in the US, but in many cases their analysis is based on panel surveys collected overseas (often the Dutch National Mobility Panel) [Kitamura and Bunch, 1990]. In the majority of these studies the state variable of the current period is influenced by the state in the past. However, the state of each period is represented only by the number of cars owned by a household, not by exogenous attributes. A truly dynamic framework is therefore necessary when modeling consumer demand, one that explicitly accounts for consumers'
expectations of future vehicle quality, the evolving market and consumers' outflow from the car market. The purpose of this research is to present a dynamic discrete choice model of consumers' car ownership and to develop an estimation technique for analyzing the impact of technological changes and market evolution on the dynamics of consumers' demand.

1.1.1 In Economics

A significant portion of the literature on the extension of discrete choice models into a dynamic framework can be found in economics and related fields. In dynamic discrete choice structural models, agents are forward looking and maximize expected intertemporal payoffs; consumers learn about the rapidly evolving nature of product attributes within a given period of time, and different products are supposed to be available on the market. Changing prices and improving technologies have been the most visible phenomena in a large number of important new durable goods markets. As a result, a consumer can either decide to buy the product or to postpone the purchase at each time period. This dynamic choice behavior has been treated in a series of research studies.

In his pioneering work, John Rust [Rust, 1987] formalized the optimal stopping problem and estimated the optimal stopping time to replace a used bus engine. In this first dynamic version of McFadden's logit model, a single agent was considered, and random components were assumed to be additively separable, conditionally independent and extreme value distributed. Berry, Levinsohn and Pakes [Berry et al., 1995] (BLP) had shown the importance of incorporating consumer heterogeneity for obtaining realistic predictions of elasticities and welfare, but their models were static and did not account for the intertemporal incentives of market participants. In 2000, Oleg Melnikov [Melnikov, 2000] expanded the engine replacement model and relaxed the BLP limitations to model the decision of whether to buy a printer or to postpone the purchase based on the expected evolution of product quality and price. The Melnikov formulation was transferred to model the adoption of other durable goods whose quality was rapidly improving over time, such as computers and digital products [Song and Chintagunta, 2003, Gordon, 2006, Nair, 2007]. In Melnikov's framework, the products were heterogeneous while consumers were homogeneous; error terms were assumed to be independently distributed across consumers, products and time periods; furthermore, a purchase was made only once in the consumer's lifetime. In addition, the parameters of the static part of the problem were estimated separately from the dynamic part; the participation probability of a consumer was obtained directly from the observed number of purchases in the total market. The estimation of dynamic discrete choice models was computationally costly because the fixed point problem defined by Rust had to be solved at every point visited by the estimation algorithm. Consequently, a three-step method was used to solve the estimation problem.

Szabolcs Lorincz [Lorincz, 2005] added a persistence effect to the standard optimal stopping model. Persistence means that customers who already own a product may choose to upgrade it (e.g., upgrade the operating system). For this application, the model included not only the likely future quality of the product but also the industry evolution. These dynamic economic models were generally applied to evaluate price elasticities, intertemporal substitution and the welfare gains from industry innovations.

In 2006, Carranza [Carranza, 2006] examined the digital camera market and proposed a logit utility model with one-time purchase; the model incorporated fully heterogeneous consumers and extended standard estimation techniques to account for the dynamics in consumers' characteristics. The model was estimated in a reduced-form specification that was relatively easy to compute. Gowrisankaran and Rysman [Gowrisankaran and Rysman, 2007] also analyzed the importance of dynamics when modeling consumers' preferences over digital camcorder products using a panel data set on prices, sales and characteristics. Their model combined the BLP techniques for modeling consumer heterogeneity in a discrete choice context with the Rust techniques for modeling optimal stopping decisions. The model was based on explicit dynamics of consumer behavior and allowed for unobserved product characteristics, repeated purchases, endogenous prices and multiple differentiated products.

1.1.2 In Transportation

In the transportation field, dynamic models have been widely used for dynamic network equilibrium [Lam et al., 2006]. For transportation demand analysis, a number of dynamic models were proposed and calibrated, but they were not based on dynamic optimization. Landau et al. [Landau et al., 1981] defined and tested empirically a framework for trip-generation models sensitive to temporal constraints; households decided whether or not to perform a trip for a specific purpose during the day, and in which period. Hirsh et al. [Hirsh et al., 1986] estimated a parametric model of the dynamic decision-making process for weekly shopping activity behavior. The individual was assumed to proceed from period to period, and the observed weekly activity pattern was the outcome of successive decisions. Action plans were then modified on the basis of actual behavior and of the additional information acquired in previous periods.
Liu and Mahmassani [Liu and Mahmassani, 1998] calibrated a day-to-day dynamic model of commuters' joint departure time and route switching decisions that took into account commuters' learning from experience. The analysis provided insight into day-to-day effects of real-time traffic information on user decisions.

More recently, Train [Train, 2002b] introduced the concept of dynamic decision making and described a model with two or more periods in his book Qualitative Choice Analysis, which is very well known among demand modelers in transportation. Moshe Ben-Akiva and Maya Abou-Zeid [Ben-Akiva and Abou-Zeid, 2007] proposed a dynamic framework to model the evolution of latent variables and observed choices over time. Their approach involved the integration of discrete choice with hidden Markov chains representing behavioral dynamics such as individuals' plans, well-being states and actions. Shortly after, the hidden Markov chain methodology was used again to model dynamic driving behavior [Choudhury, 2007]. In her MIT PhD thesis, Choudhury studied the effects of unobserved plans in four traffic scenarios: freeway lane changing, freeway merging, urban intersection lane choice and urban arterial lane. These dynamic applications of discrete choice models in transportation focused on the evolution of individuals' previous plans and actions but did not consider changes in external conditions. Possible applications of dynamic discrete choice models in transportation include modeling car ownership for short and medium-term planning; customer choices under dynamic pricing schemes in the airline or rail industry; route choice and lane change behavior; and weekly (or longer-term) activity patterns. In summary, the development of dynamic discrete choice models in transportation has not been as comprehensive as in economics or marketing.
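
To make the buy-or-postpone decision reviewed in Section 1.1.1 concrete, the following is a minimal sketch of a regenerative optimal stopping problem solved by backward induction. It is not the formulation estimated in this dissertation: the single state variable (vehicle age), the finite horizon, the linear utilities and every parameter value are assumptions chosen only for illustration. The extreme value error assumption is what gives the log-sum value function and the logit purchase probability.

```python
import math


def logsum(utilities):
    """Log-sum-exp: expected maximum utility under i.i.d. extreme value errors,
    up to an additive constant (Euler's constant is omitted)."""
    top = max(utilities)
    return top + math.log(sum(math.exp(u - top) for u in utilities))


def solve_buy_or_keep(max_age=10, horizon=12, beta=0.9,
                      age_cost=0.3, new_car_utility=2.0, price_cost=2.5):
    """Backward induction for a toy regenerative optimal stopping problem.

    State: age of the current vehicle, capped at max_age.
    Actions: keep (the vehicle ages by one period) or buy (age resets to 0).
    All default parameter values are illustrative assumptions, not estimates.
    Returns buy_prob, where buy_prob[t][age] is the logit probability of buying.
    """
    value_next = [0.0] * (max_age + 1)  # no continuation value after the horizon
    buy_prob = []
    for _ in range(horizon):  # step backward from the last period to the first
        value_now, probs = [], []
        for age in range(max_age + 1):
            v_keep = -age_cost * age + beta * value_next[min(age + 1, max_age)]
            # Buying regenerates the state: the new car is one period old next period.
            v_buy = new_car_utility - price_cost + beta * value_next[min(1, max_age)]
            value_now.append(logsum([v_keep, v_buy]))
            probs.append(1.0 / (1.0 + math.exp(v_keep - v_buy)))
        buy_prob.insert(0, probs)  # keep rows ordered from first to last period
        value_next = value_now
    return buy_prob


if __name__ == "__main__":
    first_period = solve_buy_or_keep()[0]
    for age in (0, 5, 10):
        print(f"age {age:2d}: P(buy) = {first_period[age]:.3f}")
```

In this sketch the purchase probability rises with vehicle age because keeping an older car yields lower utility while the value of buying does not depend on the current age. A static logit would drop the `beta * value_next[...]` terms and so ignore the expected future consequences of each action.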

1.2 Objective of the Research

In transportation, dynamic discrete choice models have not been studied extensively and applications are rather limited; this dissertation aims at filling this gap from the methodological perspective and proposes an application of dynamic discrete choice models to car ownership for short and medium-term planning. This research has multiple objectives:

• Integrate dynamic behavioral processes into discrete choice models.
• Propose a general dynamic framework for car ownership.
• Develop an efficient algorithm to estimate dynamic discrete choice models.
• Extend the framework to heterogeneous consumer problems and evolving product quality.
• Generate an efficient and simple method for collecting real behavioral data over time.
• Validate the superiority of the dynamic model over the traditional multinomial logit model.

1.3 Outline of the Dissertation

This dissertation is composed of nine chapters. Chapter 2 reviews car ownership forecasting methodology used by transportation researchers; popular static formulations and dynamic models based on panel data are presented. Chapter 3 discusses dynamic discrete choice models (DDCMs) used by econometricians to forecast the demand for durable products. DDCMs are usually specified as an optimal stopping problem, where agents decide the time period in which to make a stopping decision. Chapter 4 formulates DDCMs for car ownership forecasting, and the dynamic nature of the problem is carefully detailed. Chapter 5 presents results obtained from a simulated experiment; dynamic and static models are compared in terms of coefficient bias and predictive power. Chapter 6 presents the methodology adopted for survey design and execution; it reports on the revealed preference experiment and on the three stated choice games corresponding to vehicle technology, fuel choice, and taxation policy. In Chapter 7, the statistical analysis of the sample collected is described.
The application of the DDCM based on real data is presented in Chapter 8; results are then compared with those derived from a static model with equivalent specification. Finally, Chapter 9 summarizes the main findings of this dissertation, presents the main contributions and offers avenues for future research.

2. CAR OWNERSHIP FORECASTING METHODOLOGY REVIEW

Different car ownership models are used for a wide variety of purposes. National governments (notably the Ministries of Finance) make use of car ownership models for forecasting tax revenues and the regulatory impact of changes in the level of taxation. Regional and local governments (particularly traffic and environment departments) use car ownership models to forecast transport demand, energy consumption and emission levels, as well as the likely impact of policy measures on these. Car manufacturers apply models to the consumer valuation of attributes of cars that are not yet on the market. Oil companies want to predict the future demand for their products and might benefit from car ownership models. International organizations, such as the World Bank, use aggregate models of car ownership by country to assist investment decision-making [Fox et al., 2004]. Future car ownership and car users' preferences are estimated with demand models, which take one of two possible forms: aggregate or disaggregate. The literature on car ownership forecasting models is reviewed in this chapter with a focus on the disaggregate models and their framework, variable specifications, and estimation methods.

2.1 Aggregate Models

In 2004, De Jong published a comprehensive review of car ownership models, in which the models were classified into nine types [Fox et al., 2004].
I simplify the classification into: (1) aggregate models, (2) static disaggregate car ownership models (number-of-cars choice models, type choice models), (3) joint discrete-continuous models, (4) pseudo-panel methods, and (5) dynamic car transaction models with vehicle type conditional on transaction.

Aggregate car ownership models are mainly of three types: (1) time series models [Tanner, 1981, Dargay and Gately, 1999], (2) cohort models [Algers et al., 1989] and (3) car market models [Leuven, 1989]. Aggregate models no longer appear in academic journals but are still used in practice; their major limitation is the impossibility of modeling vehicle type and use; they also usually include limited socio-demographic variables.

Aggregate time series models

Aggregate time series models usually contain a sigmoid-shaped function for the development of car ownership over time as a function of income or gross domestic product (GDP). The function increases slowly in the beginning (at low GDP per capita), then rises steeply, and ends up approaching a saturation level. Examples are the work done in the late 1980s by Tanner [Tanner, 1981] and in the early 1990s by Button [Button et al., 1993]. They mainly used a logistic function. In more recent applications, Ingram and Liu [K.Ingram and Liu, 1997] used a double logarithmic specification to explain car ownership; the National Road Traffic Forecasts (NTRF) in the UK [Whelan, 2001, Whelan et al., 2000] applied a logistic curve for saturation, and extended this by including the saturation levels (by household type) in the overall disaggregate tree logit calibration. Dargay and Gately [Dargay and Gately, 1999] used a more flexible function to predict the motorization rate (the number of cars per 1000 persons) on the basis of GDP per capita for a large number of countries. These models had the lowest data requirements, and were attractive for application to developing countries.
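
The sigmoid shape described above can be illustrated with a logistic curve. This is only a sketch: the function name, the saturation level of 0.6 cars per capita, the midpoint and the steepness are hypothetical values, not estimates from any of the cited studies.

```python
import math


def logistic_ownership(gdp_per_capita, saturation=0.6,
                       midpoint=15000.0, steepness=0.0003):
    """Cars per capita as a logistic function of GDP per capita.

    saturation: upper limit that ownership approaches (assumed value).
    midpoint: GDP per capita at which ownership is half of saturation (assumed).
    steepness: how quickly ownership rises around the midpoint (assumed).
    """
    return saturation / (1.0 + math.exp(-steepness * (gdp_per_capita - midpoint)))


if __name__ == "__main__":
    # Low, middle and high income levels: slow start, steep rise, saturation.
    for gdp in (1000, 15000, 40000):
        print(f"GDP per capita {gdp:>6}: {logistic_ownership(gdp):.3f} cars per capita")
```

The curve never exceeds the saturation level, which is why the choice of that level matters so much for long-run forecasts from this class of models.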
Income was generally considered to be the main driving force behind car ownership growth.

Aggregate cohort models

Aggregate cohort models segmented the current population into groups with the same birth year (often five-year cohorts), and then shifted these cohorts into the future, describing how the cohorts acquired, kept and lost cars as they became older. Examples are the models of Van den Broecke [den Broecke/Social Research] for the Netherlands, and cohort-based car ownership models in France (Madre and Pirotte, 1991) and Sweden. The Van den Broecke car ownership model was a combination of a cohort survival model and an econometric model. The econometric component was used to capture the impact of changes in income on car ownership. Aggregate cohort models were best suited to predicting the impact on car ownership of changes in the size and composition of the population. The demographic force behind car ownership growth can be expected to remain important in Western Europe for another couple of decades.

Aggregate car market models

Examples of aggregate car market models are Mogridge, the Cramer car ownership model [Cramer and Vos, 1985], Manski [Manski, 1983], Berry [Berry et al., 1995], the TREMOVE model [Leuven, 1989], the ALTRANS model [Kveiborg, 2001], and the software package TRESIS [Hensher and Ton, 2002]. Mogridge distinguished between the demand for and the supply of cars in the car market, which set his model apart from aggregate time series models. Cramer's model was based on time series data and depended on car prices, income, variation of income and the development over time of the utility of using a car. The second-hand car price was endogenous. Manski's aggregate car demand and supply model also had an endogenous used car price. Berry modeled the market for new cars only, with consumer demand, oligopolistic manufacturers and endogenous prices, which was an innovation relative to most car market models.
TREMOVE was designed to analyze the cost and emission effects of a wide range of measures to reduce emissions from road transport in the European Union. TREMOVE was a simulation model rather than a forecasting model; it was specifically used to analyze changes in behavior as a result of changes in economic conditions. ALTRANS (ALternative TRANSport systems) was a model developed for analyzing the environmental impact of different policy proposals on car and public transport usage in Denmark. The software package TRESIS was developed in 2002 for integrated strategic planning of transport, land use and the environment. It included disaggregate models for household fleet size, vehicle type choice and car use. The aggregate car demand of the households by vintage in each year was compared with aggregate supply. Used vehicle prices were adjusted to reach equilibrium, while new vehicle prices were exogenous.

2.2 Disaggregate Static Models

Disaggregate car ownership and type choice models have been extensively developed and applied in the last two decades in several countries: the Netherlands [AVV., 2000, HCG, 1989], Norway [HCG and TOI, 1990], Australia (Sydney) [Hensher et al., 1992, HCG, 2000], and the US [Manski, 1983, Bhat and Pulugurta, 1998]. Their success is due to their behavioral foundations and the possibility of including a large number of policy variables, as well as car types and use.

Car ownership models

This category consists of discrete choice models that deal with the number of cars owned by a household. The car ownership submodel within the Dutch national model system (LMS) for transport [HCG, 1989] is an early example. The car ownership choices of the household were conditioned on household license holding, i.e. households without licenses had zero cars, households with one license chose between zero cars and one car, and households with two or more licenses chose between one car and two or more cars. The models were binary logit models based on random utility theory.
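A license-conditioned binary logit structure of this kind can be sketched as follows. This is a minimal illustration, not the LMS implementation: the function names, the two explanatory variables used (disposable income and fixed car cost) and every coefficient value are assumptions made for the example.

```python
import math

# Hypothetical coefficients, for illustration only.
B_INCOME = 0.001        # per unit of monthly disposable income
B_FIXED_COST = -0.004   # per unit of monthly fixed car cost
CONST_FIRST_CAR = -1.0  # constant for owning one car rather than none
CONST_SECOND_CAR = -3.0 # constant for owning two or more cars rather than one


def logit_prob(v_yes, v_no=0.0):
    """Binary logit probability of the 'yes' alternative."""
    return 1.0 / (1.0 + math.exp(v_no - v_yes))


def ownership_probabilities(n_licenses, disposable_income, fixed_car_cost):
    """Probabilities of owning 0, 1 or 2+ cars, conditional on licenses held."""
    if n_licenses == 0:
        # Households without licenses are assigned zero cars.
        return {"0": 1.0, "1": 0.0, "2+": 0.0}
    systematic = B_INCOME * disposable_income + B_FIXED_COST * fixed_car_cost
    if n_licenses == 1:
        # One license: binary choice between zero cars and one car.
        p_one = logit_prob(CONST_FIRST_CAR + systematic)
        return {"0": 1.0 - p_one, "1": p_one, "2+": 0.0}
    # Two or more licenses: binary choice between one car and two or more.
    p_two = logit_prob(CONST_SECOND_CAR + systematic)
    return {"0": 0.0, "1": 1.0 - p_two, "2+": p_two}


print(ownership_probabilities(1, disposable_income=2000.0, fixed_car_cost=300.0))
print(ownership_probabilities(2, disposable_income=2000.0, fixed_car_cost=300.0))
```

With a positive income coefficient and a negative cost coefficient, the ownership probability rises with disposable income and falls with fixed car cost.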
An important explanatory variable was the monthly income that a household can freely spend, i.e. income from which the monthly expenditures on food, clothing and housing had already been subtracted. Another important variable was fixed car cost. If monthly incomes rose, the probability of car ownership would rise accordingly; if fixed car costs rose, the car ownership probability would decrease. The remaining explanatory variables were age, gender, household size, number of workers in the household and region-specific variables. These household car ownership models in the LMS, combined with personal and household license holding, then influenced tour frequencies and mode/destination choice in the model system.

In 1998, Bhat and Pulugurta [Bhat and Pulugurta, 1998] proposed an ordered-response choice mechanism to model car ownership choice and compared it with an unordered-response choice mechanism; disaggregate models were applied in both cases. Ordered-response choice mechanisms were not consistent with global utility maximization. They were based on the hypothesis that a single continuous variable represented the latent car-owning propensity of the household. The decision process can be viewed as a series of binary choice decisions. A given household assigned utility values to each car ownership outcome, and then made an independent utility maximization decision for each range. Only one set of M household parameters needed to be estimated in this approach, but the sensitivity to income could not be specified to vary between alternatives. The ordered-response mechanism was represented by the ordered-response logit (ORL) model. Unordered-response mechanisms were consistent with the theory of global utility maximization. The choice was determined by the alternative with the highest utility, and the process was simultaneous across alternatives. This method had more parameters to estimate and allowed the sensitivity to household income to vary with the car ownership alternative.
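The structural difference between the two mechanisms can be made concrete with a small sketch. The ordered-response form uses a single latent propensity with one income coefficient and a set of thresholds, whereas the unordered form assigns each alternative its own utility. The function names, coefficients and thresholds below are hypothetical and are not the estimates of Bhat and Pulugurta.

```python
import math


def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(propensity, thresholds):
    """Ordered-response logit: one latent propensity cut by ordered thresholds."""
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in increasing order")
    cumulative = [logistic_cdf(t - propensity) for t in thresholds]
    bounds = [0.0] + cumulative + [1.0]
    return [bounds[k + 1] - bounds[k] for k in range(len(bounds) - 1)]


def mnl_probs(utilities):
    """Unordered multinomial logit: one systematic utility per alternative."""
    top = max(utilities)
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


income = 3.0  # hypothetical income in arbitrary units

# Ordered response: a single income coefficient shared by all outcomes.
ordered = ordered_logit_probs(propensity=0.8 * income, thresholds=[1.0, 3.5])

# Unordered response: the income coefficient differs by alternative (0, 1, 2+ cars).
unordered = mnl_probs([0.0, -1.5 + 0.9 * income, -4.0 + 1.2 * income])

print("ordered-response logit:", [round(p, 3) for p in ordered])
print("multinomial logit:     ", [round(p, 3) for p in unordered])
```

The extra alternative-specific coefficients in the unordered form are what allow income sensitivity to differ across car ownership levels, at the cost of more parameters to estimate.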
The unordered-response mechanism was represented by the multinomial logit (MNL) model. Both ORL and MNL models were estimated. Three socio-economic variables were significant across the data set: number of working adults, number of non-working adults and household income. The comparison showed that the MNL was superior according to the root mean square error measure; the average probability of correct prediction showed that the MNL was superior as well. Therefore, Bhat and Pulugurta concluded that the appropriate choice mechanism for modeling car ownership was the unordered-response structure, such as MNL or probit models.

Hague Consulting Group developed the Sydney Strategic Transport Model (STM) [HCG, 2000] in 2000, in which company and total car ownership at the household level were estimated. The data came from two sources, one collected during 1991/1992 and the other during 1997/1998. Three approaches were tested in the disaggregate models: modeling private and company car ownership behavior independently, modeling private car ownership conditional on company car ownership, and modeling company car ownership conditional on private car ownership. The results showed that the second approach performed better. The model structure was a two-level MNL model system with company car ownership models as the upper level and total car ownership models as the lower level. Both the company and total car models depended on the logarithm of net household income. The total car model accounted for the impact of car ownership cost on net household income. The number of license holders in the household was a significant factor. Parking cost had a significant negative effect in the zones with lower car ownership, since parking was more expensive in those areas. The head of the household was identified as the member with the highest income, and this variable reflected car ownership differences according to age and gender. The accessibility variable from the home-work mode-destination model was important as well.
It accounted for higher car ownership in certain zones that were accessible to work places.

The UK Department of Transport made a number of improvements to the NRTF forecasts in 1999 [Whelan, 2001, Whelan et al., 2000]. The 1997 NRTF included two binary models for each household: a P1+ model that predicted the probability of the household owning at least one car, and a P2+|1+ model that gave the conditional probability of owning two or more cars given that the household owned at least one car. The improved model was introduced in NRTF-2001 with an additional submodel for the conditional probability of a household owning three or more cars (P3+|2+|1+). To account for the impact of company car ownership on total household car ownership, company car dummies were included in the ownership models. This was consistent with the findings of HCG's work in Sydney described above.

Another example is from Rich and Nielsen [Rich and Nielsen, 2001], who modeled long-term travel demand for households with up to two workers. The model was specified as a nested logit model with two components: a work model (W-model) of the choice of work location and car ownership, and a residential location model (R-model) of the zone and type of residence. Car ownership was treated within this model structure, but not separately estimated. The W-model was at the bottom of the structure; it was therefore assumed that individuals chose their work location depending on their residential location. Car ownership was modeled as a decision conditional on both residential and work location choice, and the alternatives were zero, one or two cars per household. Company cars were not considered in the models.

Car-type choice models

This category deals with the household's choice of car type given car ownership. Hensher et al. [Hensher et al., 1992], Manski and Sherman [Manski and Sherman, 1980] and Train [Train, 1986] have made influential studies. Hensher et al.
and Train not only included detailed vehicle types, but also the number of vehicles in the household and car use. Disaggregate models for the number of cars per household had usually been developed to provide inputs for multimodal transport model systems, while car-type choice models formed part of standard models to forecast the size and composition of the car fleet.

Among recently developed car ownership models, some new vehicle type models are described here. Page et al. [Page et al., 2000] developed a model of new car sales for incorporation within the Vehicle Market Model (VMM) of the UK Department of the Environment, Transport and the Regions (DETR). Both revealed preference (RP) and stated preference (SP) data were used in the model. The RP data contained household socio-economic characteristics and the attributes of the household's vehicle fleet. The SP data were collected from households that were either planning to acquire a new car or had just bought one. The vehicle attributes presented to respondents included purchase price, running costs, resale value, engine size, vehicle emissions, safety measurement, fuel type (petrol, diesel or hybrid petrol-LPG) and fuel economy. The SP and RP data were combined to form two nested models. One model predicted the binary choice between a private and a company car. The other predicted a multinomial choice between different vehicle types. Separate models were used for company and private cars. In the private car model, the variables were population density, log of annual household income, log of purchase price, number of children, running costs, variations in emissions, safety features, resale value, fuel economy, standing charges, hybrid engine type and diesel engine type.
In the company car ownership model, the variables were population density, log of annual household income, log of monthly cost, number of children, fuel cost, engine size, variations in emissions, safety features and hybrid engine type. A scale factor was used to scale the SP data relative to the RP data. An interesting result in both models was that in areas with high population density and scarce parking space, there was a higher probability of acquiring a smaller vehicle.

Brownstone et al. [Brownstone et al., 2000] compared multinomial logit (MNL) and mixed logit models using RP and SP data on vehicle type choice for California households. Before estimating joint SP/RP models, separate SP and RP models were estimated. However, some preferences were identified only in the SP data and others only in the RP data. In the joint SP/RP models, a scale factor was used. In the MNL model the scale factor was less than 1, indicating that the stochastic error term in the SP data had a larger variance than in the RP data. In the mixed logit model the scale factor was greater than 1, and preference heterogeneity was captured by fuel-type error components. The results showed that pure SP models predicted unrealistically high sports car market shares compared with the joint RP/SP model, which demonstrated the superiority of combining RP and SP data. The mixed logit models yielded higher market shares for the alternative fuel vehicles. Because of the Independence from Irrelevant Alternatives (IIA) property of the MNL, the market share of each new vehicle must be drawn proportionately from all other vehicles, whereas the mixed logit models gave the more plausible result that the market share for electric vehicles came disproportionately from other mini and subcompact vehicles. Therefore, mixed logit models were feasible for joint RP/SP data.

Hensher and Greene [Hensher and Greene, 2000] estimated both MNL and mixed logit models with combined RP/SP data for vehicle choice.
In the SP survey, vehicles were described by the following attributes: three size categories based on engine size (within a given engine size, respondents were asked to indicate a preferred body type), price of vehicle, registration fee, fuel cost to travel 500 km (described as the approximate cost of filling a tank, so that respondents understood the levels), fully fuelled range, acceleration and boot size. The SP survey followed a two-stage process. In the first stage, the household member was required to consider three conventionally fuelled vehicles (one from each size class) and choose one. In the second stage, three electric vehicles and three alternative fuel vehicles were added to the choice set, and the respondent was asked to choose one vehicle from the nine options. This process was repeated three times. The RP model was defined by a 10-alternative choice set, using a random sampling procedure within each size class to assign vehicles of each vintage to the 10 alternatives given their size class. One nested logit and three mixed logit models were estimated. In the mixed logit models, random parameters were estimated for the electric and alternative fuel vehicle constants (normally distributed), and for the vehicle price (log-normally distributed to ensure that the parameter is always negative). After comparing the nested MNL and mixed logit formulations, MNL was found to over-allocate market share to the new fuels and therefore to underestimate the shares of the conventionally fuelled classes relative to the mixed logit models.

2.3 Joint Discrete-continuous Models

Part of Train's models [Train, 1986] for California, the models of Hensher et al. [Hensher et al., 1992] for Sydney and the models of De Jong [Jong, 1989a,b, 1991] for the Netherlands belong to this category. These models explain household car ownership and car use in an integrated micro-economic framework. De Jong [Jong, 1989a] developed two disaggregate models in his studies.
Both of his models explained whether or not a household would own a private car, and the number of kilometers driven per year conditional on car ownership. The underlying idea was that household decisions on car ownership and car use were strongly interrelated and should be studied together. Both models were joint discrete-continuous models. The first model was called the "statistical model" and applied to a situation without major policy changes. It assumed that a household had a desired annual kilometrage, which depended on attributes of the household. When this desired kilometrage exceeded a threshold, the household would own a car. The observed kilometrage could deviate from the desired kilometrage through a random disturbance term. Explanatory variables for both models were household income, household size, and the age, gender and occupation of the head of the household.

De Jong's second model was the "indirect utility model" [Jong, 1989b]. Train [Train, 1986] and Hensher et al. [Hensher and Greene, 2000] also used this type of model in their studies. This model was based on micro-economic theory. The basic idea was that households compared combinations of car ownership and car use and chose the combination that gave the highest utility. Fixed and variable car costs were included in addition to the variables of the statistical model. Train [Train, 1986] and Hensher et al. [Hensher and Greene, 2000] developed similar "indirect utility" equations for car ownership and annual kilometrage, but embedded these models in a larger framework which also contained car type choice, conditional on car ownership. The model system of Hensher et al. was developed on panel data for Sydney and contained both static and dynamic vehicle choice and use models.

2.4 (Pseudo)-panel methods

Panel models

Panel data have been used since the 1980s.
As early as 1987, Kitamura [Kitamura, 1987] developed an integrated model simultaneously determining car ownership and the total number of trips in a week. The model contained lagged effects, and all the equations were linear. The data set consisted of the first waves of the Dutch National Mobility Panel (LVO), for which 10 waves were collected between March 1984 and March 1989. Kitamura and Bunch [Kitamura and BUNCH, 1990] used four waves of the same LVO panel data set to develop an ordered-response probit model of car ownership per household. They included lagged variables to account for state dependence and individual-specific error components to account for unobserved heterogeneity across households. Meurs [Meurs, 1991] also estimated car ownership models on the LVO panel data. The models included linear simultaneous equation models of car ownership and use, discrete choice car ownership models, and joint car ownership and mobility models [Meurs, 1993]. Income was used as an explanatory variable, but car cost variables were not included.

More recent panel models include the following. Nobile et al. [Nobile et al., 1996] estimated a random effect multinomial probit (MNP) model of car ownership level with longitudinal data collected in the Netherlands. They noted that panel data enabled the incorporation of both intertemporal and intratemporal dimensions. The data were drawn from the Dutch National Mobility Panel; waves 3, 5, 7 and 9 were analyzed. The estimation approach was Bayesian: a prior distribution of the parameters of the longitudinal MNP model was specified and the posterior was examined using Markov chain Monte Carlo methods. A total of 50,000 draws were used for the Markov chain, with an initial burn-in of 5,000 draws excluded to ensure that the Markov chain had stabilized. The wave dummies were all negative, suggesting generic temporal effects.
In the cross-sectional terms, standard disaggregate household model terms were estimated for the one-car and two-or-more-car alternatives, with no cars as the base. These terms included the level of urbanization, number of licenses in the household, number of full- and part-time workers, number of adults, number of children, and household income.

Hanly and Dargay [Hanly and Dargay, 2000] used 4-year panel data from the British Household Panel Survey. The dependent variable of this panel model was the number of cars owned per household in each year. The dependence on past experience was incorporated by introducing lagged endogenous variables. Three types of models were estimated: a model without a lagged dependent variable, a model with a lagged dependent variable, and a model with dummies for the number of cars in the previous year.

Golounov et al. [Golounov et al., 2001] in the year 2002 first developed a theoretical model for the purchase and consumption of cars, other durable goods and other day-to-day and long-term purchases. They stated that most existing dynamic car ownership models (panel models, cohort models, duration models) did not have a strong theoretical underpinning. Another theoretical foundation for dynamic ownership and replacement models comes from John Rust [Rust, 1987], who combined utility theory from micro-economics with optimal stopping decision rules from dynamic programming. His application concerned the replacement of bus engines by a single agent over time.

Pseudo-panel models

The pseudo-panel method is a relatively new econometric approach to estimating dynamic transport demand models that circumvents some problems of panel data, such as attrition. A pseudo-panel is an artificial panel based on cohort averages of repeated cross-sections. Some restrictions are imposed on pseudo-panel data.
One of the most important is that the cohorts should be based on time-invariant characteristics of the households, such as the birth year of the head of the household. The cohorts should be homogeneous within and heterogeneous between them. Another important feature of pseudo-panel data is that averaging over cohorts transforms disaggregate values of variables into cohort means, losing information about the individuals.

Dargay and Vythoulkas [Dargay and Vythoulkas, 1999a] used a pseudo-panel data set of 5-year cohorts constructed from repeated cross-section data contained in the UK Family Expenditure Survey. Their model was a fixed effect model but resulted in an error-in-variables estimator. A generation effect was added to the model proposed by Deaton, and a lagged dependent variable was included to estimate the dynamics of the model. Three other models were estimated for comparison with the fixed effect model: OLS, a random effect specification, and a random effect specification with a first-order autoregressive scheme. The dependent variable was the number of cars per household, measured as the average number of cars for the particular cohort. The explanatory variables were socio-economic characteristics of the household such as number of adults and children, income, metropolitan and rural areas, and a generation effect for the head of the household. Car purchase costs, car running costs and public transport fares were also included. In a second paper, Dargay and Vythoulkas [Dargay and Vythoulkas, 1999b] extended this work by defining the pseudo-panel observations not only as 5-year cohorts, but also in terms of area type (e.g. rural, urban).

2.5 Dynamic car transaction models

Hocherman et al. [Hocherman et al., 1983], Smith et al. [Smith et al., 1989] and Gilbert [Gilbert, 1992] carried out early studies on vehicle transaction models. Hocherman et al. used a nested logit model for vehicle transactions and the conditional vehicle type choice.
The transaction options for a zero-car household were purchasing a car or doing nothing; for a one-car household, the options were replacing the car or doing nothing. For the purchase and replace options, there were type choice models. Smith et al. studied only the transactions of one-car households. Gilbert used duration models to explain car ownership duration.

Bunch et al. [Bunch et al., 1996] and the Dutch Dynamic Vehicle Transaction Model (DVTM) are the most recent examples. Duration models in these systems determine whether a household will make a purchase, and a vehicle type model is applied if a transaction is made. Bunch et al.'s model for California contained transaction models for adding a car, disposing of a car and replacing a car for single- and multi-vehicle households. The overall dynamic simulation system also included the type choice models from Brownstone et al. [Brownstone et al., 2000] and car use equations.

The DVTM was developed and tested by the Hague Consulting Group. The main objective was to extend the static disaggregate modeling approach for the size and composition of the car market to dynamic models. The DVTM consisted of four submodels. Hazard-based duration models explained the time that elapsed between two household vehicle transactions. They used continuous time and were intrinsically stochastic. The functional forms used were exponential, Weibull and log-normal. Vehicle type choice models were specified for households replacing or extending their fleet. Vehicle types were distinguished by brand, model and vintage. For each brand-model-vintage combination, the engine size, weight, average fuel efficiency, fuel type, type of catalytic converter, and fixed and variable costs were known. An MNL model was used. The model for annual car use was similar to the indirect utility model (discrete-continuous model).
The last submodel, a model for style of driving, determined a possible deviation from the average fuel efficiency.

In a dynamic vehicle transaction model such as the DVTM or Bunch et al.'s model for California, the number of cars per household was predicted from the current car ownership of the household. The duration model predicted the time (e.g. in months) until the next vehicle transaction and the type of transaction (e.g. replacement, disposal, adding a car). Time was discrete in this application: households that did not transact in year t had the same vehicle ownership in year t+1 as in year t. For households whose transactions involved replacing or adding a car, the conditional type choice model was then used to obtain new type choice probabilities. The duration model could then be used to predict transactions in each period based on the car ownership situation of the previous year. Vehicle scrappage transactions could also be integrated in the model. For both a duration model and a panel model of vehicle transactions, short-term predictions (up to 5 years ahead) might be made without updating the population in the sample used; for medium- and long-term forecasts, the population needs to be updated.

The discrete vehicle type choice model was applied conditional on specific vehicle transactions in the DVTM. The choice alternatives were the brand-model-vintage combinations, of which about 1000 were distinguished. In addition, average emission rates and fuel consumption for each brand-model-vintage combination can be used to produce outcomes for these variables.

2.6 Summary

This chapter reviewed car ownership models, classified into five types: aggregate models, static disaggregate models, joint discrete-continuous models, (pseudo-)panel models and dynamic models. Table 2.1 compares the car ownership model types discussed above on the basis of 16 criteria proposed by De Jong in 2004 [Fox et al., 2004].
The aggregate models, which included time series models, cohort models and car market models, could not model vehicle type and use, and they lacked many variables. Therefore, the aggregate models were not the right type for the development of a fully fledged car fleet model. They can only predict the total number of cars in the medium and long term, with the results then used as a starting point in other models. However, when data are very scarce, aggregate time series models might be the only method available for forecasting.

The static car ownership models and discrete car-type choice models were suitable for long-term predictions of the number of cars and their distribution over households and car types. Their advantage over the aggregate models was the possibility of including a large number of policy variables and cost and price variables, using RP and SP data. Car-type choice was predicted given car ownership. In the comparison of nested MNL and mixed logit formulations, mixed logit was found to allocate the car-type market shares more reasonably.

Joint discrete-continuous models explained household car ownership and car use in an integrated micro-economic framework. The idea behind these models was that household decisions on car ownership and car use were strongly interrelated and should be studied together.

Discrete car type choice models can be added to panel models for the transitions between car ownership states of households. The panel models could then be used to give the evolution of the fleet from the present fleet. For medium- and long-term forecasts, panel models can be applied when changes in the size and composition of the population need to be predicted.

Pseudo-panel models provide a convenient way to obtain short- and long-term policy-sensitive forecasts of car ownership based on cohort averages of repeated cross-sections.
However, because information about individuals is lost, pseudo-panel models cannot take over the role of a choice-based model for the number of cars and car type.

Dynamic transaction models included duration models for determining whether a household would make a purchase. These dynamic models had been combined with detailed policy-sensitive type choice models to predict brand-model-vintage combinations. For long-term forecasts, as for panel models, the population needed to be updated. Long-term changes in the supply of car types could be simulated through scenarios.

| Aspect | Aggregate time series models | Cohort models | Aggregate market models | Static disaggregate ownership models | Static disaggregate type choice models | Joint discrete-continuous models | Panel models | Pseudo-panel models | Dynamic transaction models |
|---|---|---|---|---|---|---|---|---|---|
| Level of aggregation | Aggregate | Aggregate | Aggregate | Disaggregate | Disaggregate | Disaggregate | Disaggregate | Aggregate | Disaggregate |
| Dynamic or static | Dynamic | Dynamic | Dynamic | Static | Static | Static | Dynamic | Dynamic | Dynamic |
| Long or short-run forecasts | Short, medium and long | Medium and long | Short, medium and long | Long | Long | Long | Short and long | Short and long | Short and medium |
| Theory | N/A | N/A | Economic market equilibrium | Can be based on random utility theory | Can be based on random utility theory | Micro-economic theory | Can be based on random utility theory or lifetime utility theory | Weak links with random utility theory | Parts can be based on random utility |
| Data requirements | Light | Light | Light | Moderate | Heavy | Heavy | Very heavy | Moderate | Very heavy |
| Car types | No car types | No car types | Limited car types | Very limited | Many car types | Very limited | Very limited but could be combined with a type choice model | Very limited | Very limited number in duration model, but very many in car type choice model |
| Impact of income | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Impact of car cost | Fixed and/or variable cost sometimes included | None | Fixed and variable | Fixed cost often included; log-sum includes variable cost | Purchase cost and fuel efficiency often included | Fixed and variable | No policy runs reported, but might be possible | Fixed and variable | Fixed and variable |
| Impact of license holding | No | Yes | Yes | Possible | No | Possible | No, but possible | No, but possible | No, but possible |
| Socio-demographic impacts | Limited | Many possible | Limited | Many possible | Many possible | Many possible | Many possible | Limited | Many possible |
| Scrappage included | No | No | Can be included | No | No | No | Can be included | No | Can be included |

Tab. 2.1: Comparison of types of car ownership models

3. DYNAMIC DISCRETE CHOICE MODELS REVIEW

A significant portion of the literature focusing on the extension of discrete choice models into a dynamic frame can be found in economics and related fields. In dynamic discrete choice structural models, agents are forward looking and maximize expected inter-temporal payoffs; consumers are aware of the rapidly evolving nature of product attributes within a given period of time, and different products are supposed to be available on the market. Changing prices and improving technologies have been the most visible phenomena in a large number of important new durable goods markets. This chapter provides a review of dynamic theory and its application in economics, with a special focus on the combination of behavioral dynamics and discrete choice. Possible applications in transportation are then discussed. Finally, conclusions and avenues for future research in transportation are presented.
3.1 Discrete Choice Models and the Dynamics

Discrete choice models based on Random Utility Maximization (RUM) theory have been of interest to researchers for many years in a variety of disciplines. These methodologies are used to analyze and predict individual choice behavior. Classical formulations assume that utilities are linear, additive and include both individual characteristics and alternative attributes. The multinomial logit (MNL) model [Ben-Akiva and Lerman, 1985] has been the most widely used structure for modeling discrete choices in travel behavior analysis. The nested logit (NL) model [Daly, 1982] partially relaxes the MNL model assumptions; it is derived from McFadden's generalized extreme value (GEV) model [McFadden, 1978]. Other relaxations of the MNL model, designed to consider similarity between pairs of alternatives, have been derived from McFadden's GEV model as well. These include the ordered generalized extreme value (OGEV) model [Small, 1987], the paired combinatorial logit (PCL) model [Chu, 1981, 1989] and the cross-nested logit (CNL) model [Wen and Koppelman, 2001, Abbe et al., 2007, Papola, 2000]. Non-closed form discrete choice models such as Probit [Daganzo et al., 1977] and Mixed logit [McFadden and Train, 2000] have been adopted by researchers to deal with heterogeneity in consumer preferences, correlation across alternatives and state dependency.

All these models have been mainly developed in a static context. However, the static framework is limited by the assumption that consumers are not affected by past and future states when choosing their preferred alternative in the present. The gap between discrete choice models and the dynamics of individual behavior has spurred various developments that are mainly intended to enrich the basic theory by including in the formulation the changes occurring in the system to be modeled.
A significant portion of the literature focusing on the extension of discrete choice models into a dynamic frame can be found in economics and related fields. In dynamic discrete choice structural models, agents are forward looking and maximize expected inter-temporal payoffs; consumers are aware of the rapidly evolving nature of product attributes within a given period of time, and different products are supposed to be available on the market. Changing prices and improving technologies have been the most visible phenomena in a large number of important new durable goods markets. Although the future effects are sometimes not fully known, or depend on factors that have not yet transpired, the person knows that in the future he/she will maximize utility among the alternatives that are available at that time. This knowledge enables him/her to choose the alternative in the current period that maximizes his/her expected utility over the current and future periods [Train, 2002a]. As a result, a consumer can either decide to buy the product or to postpone the purchase at each time period.

This dynamic choice behavior has been treated in a series of different research studies, and the modeling procedures have been applied in various areas, such as Wolpin's model of women's fertility [Wolpin, 1984], Pakes' model of patent options [Pakes, 1986], and Wolpin's model of job search [Wolpin, 1987]. John Rust's formalization of the optimal stopping problem, and his estimation of the optimal time to replace a used bus engine, has been considered a breakthrough in dynamic modeling in both the transport and economic fields. In this dynamic version of McFadden's logit model, a single agent was considered, and random components were assumed to be additively separable, conditionally independent and extreme value distributed.
Berry, Levinsohn and Pakes [Berry et al., 1995] (BLP) had shown the importance of incorporating consumer heterogeneity for obtaining realistic predictions of elasticities and welfare, but their models were static and did not account for the inter-temporal incentives of market participants. In 2000, Oleg Melnikov expanded the engine replacement model and relaxed the BLP limitations to model the decision of whether to buy a printer or to postpone the purchase, based on the expected evolution of product quality and price. The Melnikov formulation was transferred to model the adoption of other durable goods whose quality was rapidly improving over time, such as computers, digital products, etc. [Song and Chintagunta, 2003, Gordon, 2006, Nair, 2007]. Szabolcs Lorincz [Lorincz, 2005] added a persistence effect to the optimal stopping model, extending the standard optimal stopping problem. This persistence means that customers who already have a product may choose to upgrade it (i.e. upgrade the operating system). For this application, the model included not only the likely future quality of the product but also the industry evolution. These dynamic economic models were generally applied to evaluate prices and elasticities, intertemporal substitution and the welfare gains from industry innovations.

In 2006, Carranza examined the digital camera market and proposed a logit utility model with a one-time purchase [Carranza, 2006]; the model incorporated fully heterogeneous consumers and extended standard estimation techniques to account for the dynamics in consumers' characteristics. The model was estimated in a reduced-form specification that was relatively easy to compute. Gowrisankaran and Rysman also analyzed the importance of dynamics when modeling consumers' preferences over digital camcorder products, using a panel data set on prices, sales and characteristics [Gowrisankaran and Rysman, 2007].
Their model combined the BLP techniques for modeling consumer heterogeneity in a discrete choice context and the Rust techniques for modeling optimal stopping decisions. This model was based on explicit dynamics of consumer behavior and allowed for unobserved product characteristics, repeated purchases, endogenous prices and multiple differentiated products.

In the transportation field, dynamic models have been widely used for dynamic network equilibrium [Lam et al., 2006]. For transportation demand analysis, a number of dynamic models were proposed and calibrated, but they were not based on dynamic optimization. In transportation, the development of dynamic discrete choice models has not been as comprehensive as in economics or marketing.

3.2 Markov Decision Process and Dynamic Discrete Choice Structure

3.2.1 Theory of Dynamics

According to the formulation proposed by John Rust in 1987, any dynamic problem can be formulated as a Markov decision process (MDP) in which two components should be defined at each discrete period and for each individual: (1) a vector of system state variables $s_t$ and (2) an action or decision variable $d_t$. The state and action determine the current utility $u(s_t, d_t)$ and affect the distribution of the next period's state $s_{t+1}$ via the Markov transition probability $p(s_{t+1} \mid s_t, d_t)$. In each period $t$, the individual maximizes the expected utility $V(s) = \max E\left(\sum_{t=0}^{\infty} \beta^{t} u(s_t, d_t) \mid s_0 = s\right)$ and decides the optimal decision rule $d^{*}$. In this equation, $E$ denotes expectation with respect to the controlled stochastic process $\{s_t, d_t\}$ and $\beta \in (0, 1)$ is the discount factor. By applying Bellman's principle of optimality, the value function can be obtained using a recursive procedure:

\[ V(s_t) = \max_{d_t \in D(s_t)} \left[ u(s_t, d_t) + \beta \int V(s_{t+1})\, p(ds_{t+1} \mid s_t, d_t) \right] \tag{3.1} \]

and the optimal decision rule is obtained from $V$ by finding a value $d(s) \in D(s)$ that attains the maximum utility in equation (3.1) for each $s$ [Rust, 1994].
\[ d_t(s_t) = \arg\max_{d_t \in D(s_t)} \left[ u(s_t, d_t) + \beta \int V(s_{t+1})\, p(ds_{t+1} \mid s_t, d_t) \right] \tag{3.2} \]

3.2.2 Dynamic Discrete Choice Models

Dynamic discrete choice models describe the behavior of a forward-looking agent who chooses among some available alternatives repeatedly over time and intends to maximize expected inter-temporal payoffs. The parameters in the dynamic function describe agents' preferences and beliefs about technological and institutional constraints, and the whole utility function contains both the static parameters and the transition probabilities. The ultimate objective is to estimate the structural parameters in preferences, the state transition probabilities and the discount factor $\beta$.

In economic applications, dynamic discrete choice models describe how consumer $i$ decides whether or not to buy a product at time $t$; that is, the consumer chooses one of $J_t$ products in period $t$ or chooses to postpone buying. From these $J_t$ choices, the consumer chooses the alternative which maximizes the sum of the expected discounted value of utilities at time $t+1$ conditional on the information at time $t$. Generally, product $j$ is characterized by observed static characteristics $x_j$, a dynamic characteristic $y_{jt}$ (such as price) and unobserved characteristics $\xi_j$ (e.g. policy, technology innovation). Consumer preferences over $x_j$ and $y_{jt}$ are defined respectively by the coefficients $\alpha^{x}_{i}$ and $\alpha^{y}_{i}$, which need to be estimated together with $\xi_j$. It is assumed that $x_j$ and $\xi_j$ stay constant over the infinite life of the product. In each period, the consumer obtains a utility from the product that has just been purchased or from the product that is already owned. The utility from product $j$ purchased at time $t$ can be generalized as

\[ u_{ijt} = \alpha^{x}_{i} x_j + \alpha^{y}_{i} y_{jt} + \xi_j + \varepsilon_{ijt} \tag{3.3} \]

where $\varepsilon_{ijt}$ is an individual-specific random term depending on the individual $i$, the product $j$ and the time period $t$.
It is usually assumed that $\varepsilon_{ijt}$ is distributed type I extreme value, independent across consumers, products and time.

Consumer $i$ will decide to buy a product at time period $t$ when the maximum utility is greater than a specific utility which depends on the expected evolution of products' quality and prices in the future. Let $v_{it} = \max_j u_{ijt}$ denote the maximum utility consumer $i$ can get from any product purchased at time $t$. The reservation utility is the value of not purchasing anything in the current time period $t$ and postponing until the next period $t+1$, when the individual evaluates the problem again. The reservation utility can be written as:

\[ V(\Omega_{it}) = \beta E\left[ \max\left\{ v_{i,t+1},\, V(\Omega_{i,t+1}) \right\} \mid \Omega_{it} \right] \tag{3.4} \]

where $\Omega_{it}$ is a vector of sufficient statistics for the distribution of $v_{it}$ and its Markov transition probability. The specific settings of $V(\Omega_{it})$ may differ depending on the application considered, while the estimation methods used are mostly based on Rust's nested fixed point maximum likelihood algorithm. Both specification and estimation will be discussed in the following Sections.

3.3 Discussion by Model Type

3.3.1 Rust Optimal Stopping Problem

Modeling framework

An early example of a dynamic framework for agent decisions is the optimal stopping model proposed by John Rust in 1987 and applied to the problem of bus engine replacement. This work is the basis for later dynamic studies [Melnikov, 2000, Lorincz, 2005, Carranza, 2006, Gowrisankaran and Rysman, 2007]. In this specific case, the optimal stopping rule is defined as "whether or not to replace the current bus engine" in each period, based on observed and unobserved variables. The stochastic dynamic problem formalizes the trade-off between the conflicting objectives of minimizing maintenance costs and minimizing unexpected engine failures.
Rust's framework focuses on two ideas: (1) a "bottom-up" approach for modeling the replacement problem and (2) a "nested fixed point" algorithm for estimating dynamic programming models in the presence of discrete choices.

The bottom-up approach generates replacement investments by aggregating single replacement demands for some specific capital goods such as bus engines (in Rust's case, all the models of bus engine are aggregated). The demand is the sum of a large number of stochastic processes, each characterized by a decision variable $d_t$, where $d_t = 1$ if a replacement occurs and 0 otherwise, and by a state variable $s_t$, which is the mileage accumulated by the bus engine at time $t$. At each time period the agent faces the following discrete decisions: (i) perform normal maintenance on the current bus engine and incur operating cost $c(s_t, \theta_1)$¹, or (ii) cannibalize the old bus engine for scrap value $\underline{P}$, install a new bus engine at cost $\overline{P}$ and incur operating cost $c(0, \theta_1)$. It is also assumed that the mileage travelled each month is exponentially distributed with parameter $\theta_2$. In addition, some variables can be observed by the agent but not by the econometrician; a solution is to add an error term $\varepsilon_t$ to the utility function, so that $u(s_t, d_t, \theta) + \varepsilon_t(d)$ is the single period utility realized when alternative $d$ is selected and the state variable is $s_t$, with $\theta = \{\theta_1, \theta_2\}$.

Suppose the vector of state variables obeys a Markov process with transition density given by a parametric function $\pi(s_{t+1}, \varepsilon_{t+1} \mid s_t, \varepsilon_t, d_t, \theta)$. The behavioral hypothesis is that the agent chooses a decision rule to maximize his expected discounted utility over an infinite horizon, where the discount factor $\beta \in [0, 1)$. The solution to this optimal stopping problem is given by the recursive Bellman's equation:

\[ V(s_t, \varepsilon_t) = \max_{d_t \in D(s_t)} \left[ u(s_t, d_t, \theta) + \varepsilon_t(d_t) + \beta\, EV(s_t, \varepsilon_t, d_t) \right] \tag{3.5} \]

where the utility function $u$ is given by:

\[ u(s_t, d_t, \theta_1) = \begin{cases} -c(s_t, \theta_1) + \varepsilon_t(0) & \text{if } d_t = 0 \\ -\left[ \overline{P} - \underline{P} + c(0, \theta_1) \right] + \varepsilon_t(1) & \text{if } d_t = 1 \end{cases} \tag{3.6} \]

In function (3.5), $V(s_t, \varepsilon_t)$ is the maximum expected discounted utility obtained by the agent when the state variable is $(s_t, \varepsilon_t)$. The expected value function $EV$ is defined by

\[ EV(s_t, \varepsilon_t, d_t) = \int V(s_{t+1}, \varepsilon_{t+1})\, \pi(ds_{t+1}, d\varepsilon_{t+1} \mid s_t, \varepsilon_t, d_t, \theta) \tag{3.7} \]

The transition probability defines the regeneration property through the evolution of the mileage variable $s_t$:

\[ p(s_{t+1} \mid s_t, d_t, \theta_2) = \begin{cases} \theta_2 \exp\{-\theta_2 (s_{t+1} - s_t)\} & \text{if } d_t = 0,\ s_{t+1} \ge s_t \\ \theta_2 \exp\{-\theta_2 s_{t+1}\} & \text{if } d_t = 1,\ s_{t+1} \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{3.8} \]

With all the functions defined above, this Section concludes by noting that $\{s_t, d_t\}$ is a realization of a controlled stochastic process whose solution is an optimal decision rule $d_t$ that attains the maximum in Bellman's equation (3.5). The objective is to use the observed data to infer the unknown parameter vector $\theta$.

¹ Costs are in general not directly observable, so they are inferred from observations. In Rust's case study a total cost function is estimated with parameter $\theta_1$.

Estimation

Maximum likelihood is the method used to infer the unknown parameters: the likelihood function $L(s_1, \ldots, s_T, d_1, \ldots, d_T \mid \theta)$ is derived for the observed data, and the estimate $\hat{\theta}$ that maximizes it is computed. Rust introduced the Conditional Independence (CI) assumption, which yields a simple formula for the likelihood function, so that the procedure to compute (3.5) is substantially simplified.

\[ \text{(CI)} \qquad \pi(s_{t+1}, \varepsilon_{t+1} \mid s_t, \varepsilon_t, d_t, \theta) = p(s_{t+1} \mid s_t, d_t, \theta)\, q(\varepsilon_{t+1} \mid s_{t+1}, \theta) \tag{3.9} \]

CI limits the pattern of dependence in $(s_t, \varepsilon_t)$ in two ways. First, $s_{t+1}$ is a sufficient statistic for $\varepsilon_{t+1}$, so that any statistical dependence between $\varepsilon_t$ and $\varepsilon_{t+1}$ is transmitted entirely through the vector $s_{t+1}$. Second, the probability density for $s_{t+1}$ depends only on $s_t$ and not on $\varepsilon_t$.
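The regeneration property in equation (3.8) can be illustrated with a short sketch. The code below is only an illustration under stated assumptions: the function names `transition_density` and `sample_next_mileage`, the rate value `theta2 = 0.5` and the starting mileage are hypothetical and are not taken from Rust's study. It evaluates the density of (3.8) and draws the next-period mileage with and without a replacement.

```python
import numpy as np


def transition_density(s_next, s, d, theta2):
    """Density p(s_next | s, d, theta2) of equation (3.8)."""
    start = 0.0 if d == 1 else s      # a replacement (d = 1) regenerates the state to zero mileage
    increment = s_next - start
    if increment < 0:
        return 0.0                    # mileage cannot fall below its starting point
    return theta2 * np.exp(-theta2 * increment)


def sample_next_mileage(s, d, theta2, rng):
    """Draw s_{t+1}: an exponential mileage increment added to the (possibly reset) state."""
    start = 0.0 if d == 1 else s
    return start + rng.exponential(scale=1.0 / theta2)


rng = np.random.default_rng(0)        # fixed seed, for illustration only
theta2 = 0.5                          # hypothetical rate parameter
keep = sample_next_mileage(10.0, 0, theta2, rng)      # engine kept: mileage keeps accumulating
replace = sample_next_mileage(10.0, 1, theta2, rng)   # engine replaced: mileage restarts from zero
```

When the engine is kept, the draw can never fall below the current mileage; after a replacement, the draw no longer depends on the previous mileage at all, which is what makes the process regenerative.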
If it is assumed that $q$ has some specific functional form, such as the multivariate extreme value distribution, the likelihood function can be written as:

\[ L(s_1, \ldots, s_T, d_1, \ldots, d_T \mid \theta) = \prod_{t=1}^{T} P(d_t \mid s_t, \theta)\, p(s_t \mid s_{t-1}, d_{t-1}, \theta) \tag{3.10} \]

where the conditional choice probability $P(d \mid s, \theta)$ is given by the standard multinomial logit formula:

\[ P(d \mid s, \theta) = \frac{\exp\{u(s, d, \theta) + \beta\, EV(s, d)\}}{\sum_{d' \in D(s)} \exp\{u(s, d', \theta) + \beta\, EV(s, d')\}} \tag{3.11} \]

and $EV$ is the fixed point of the contraction mapping $T(EV)$ computed by:

\[ EV(s, d) = T(EV)(s, d) \equiv \int \log\left[ \sum_{d' \in D(s')} \exp\{u(s', d', \theta) + \beta\, EV(s', d')\} \right] p(ds' \mid s, d, \theta) \tag{3.12} \]

$T$ is a contraction mapping and $EV$ is the unique solution to (3.12). To conclude, the nested fixed point optimization finds a $\theta$ that maximizes the likelihood function (3.10). Further details about the optimization algorithm can be found in Rust, 1988.

3.3.2 Melnikov Demand Model for Differentiated Durable Products

The bus engine replacement problem describes the choice behavior of a single agent only, which limits the application of dynamic discrete choice models. Another example of a dynamic demand framework is Melnikov's model for computer printers. The computer hardware market is similar to many other high-technology product markets; quality rapidly improves over time and product durability impacts the evolution of prices and sales [Melnikov, 2000]. In Melnikov's model, only one purchase is made; this is the same assumption made by Rust in his optimal stopping problem. Furthermore, all consumer heterogeneity is captured by a term that is independently distributed across consumers, products and time. The significant difference between the two approaches is that Melnikov mainly deals with differentiated durable products rather than homogeneous products (i.e. the bus engine in Rust's example). The framework is divided into three parts: the consumer optimal stopping problem, industry evolution, and sales dynamics and aggregation.
Consumer Optimal Stopping Problem

The consumer optimal stopping problem gives a general formulation of this choice decision. In each period $t$, consumer $i$ is in one of two states, $s_{it} \in \{0, 1\}$: $s_{it} = 0$ means that $i$ does not own any product at $t$, and $s_{it} = 1$ otherwise; in the latter case the consumer is out of the market. In each period $t$, consumers who have no product either choose to buy one of the products $j$ or postpone the purchase until the optimal time. If the consumer buys a product, the terminal payoff, which is the utility when the consumer decides to buy, is:

\[ u_{ijt} = f(x_j, y_{jt}, \theta_i) + \varepsilon_{ijt} \tag{3.13} \]

where $x_j$ is a vector of static product attributes for product $j$, $y_{jt}$ is a vector of dynamic characteristics, such as price, for product $j$ at time $t$, and $\theta_i$ is a vector of parameters for consumer preferences over $x$ and $y$; since preferences are homogeneous under the author's assumption, it can be simplified to $\theta$. The random terms $\varepsilon_{ijt}$ are individual-specific random utility components of a $J$-dimensional random vector, which are assumed to be independent and identically distributed amongst individuals and periods. $\varepsilon$ is also required to follow a generalized extreme value (GEV) distribution. Based on the description above, the $u_{ijt}$ are therefore i.i.d. amongst individuals as well. Because of the assumption of homogeneity, the individual index in (3.13) can be dropped and the utility decomposed as $u_{jt} = \delta_{jt} + \varepsilon_{jt}$, where $\delta_{jt}$ is the mean utility $E[u_{ijt}]$.

Generally, the consumer makes the decision in two steps: first he chooses the product $j^{*}_{t}$ that maximizes the utility from set $J$, and then he decides whether to buy or to postpone the purchase until the next period. $j^{*}_{t}$ is the product which contributes the maximum utility, and set $J$ includes all the products $j$ available to the consumer. This optimal stopping problem can be written as the following formula:

\[ D(u_{1t}, \ldots, u_{Jt}) = \max_{\tau} \left\{ \sum_{k=t}^{\tau - 1} \beta^{k-t} c + \beta^{\tau - t} E\left[ \max_{j \in J} u_{j\tau} \right] \right\} \tag{3.14} \]

where $\beta$ is a common discount factor, $c$ is the utility payoff and $E$ denotes a conditional expectation.
Let $v_t = \max_{j \in J} u_{jt}$; $v_t$ has a type I extreme value (Gumbel) distribution according to the assumption made about $\varepsilon_{ijt}$. The distribution of $v_t$ is Gumbel with scale factor 1 (because of the assumption defined in this paper), so

\[ F_v(z; r_t) = \exp(-\exp(-(z - r_t))) \tag{3.15} \]

where $r_t$ in formula (3.15) is the mode of the distribution of $v_t$, given by $r_t = \ln G(e^{\delta_{1t}}, \ldots, e^{\delta_{Jt}}) = \ln R_t$ (for a proof see Appendix A of Melnikov's paper). The consumer's decision can finally be transformed from (3.14) into:

\[ D(v_t, c) = \max\left\{ v_t,\ c + \beta E[D(v_{t+1})] \right\} \tag{3.16} \]

Industry evolution

Melnikov's model contains a very important factor $r_t$ which characterizes the distribution of the maximum utility; it represents the evolution of the industry and it is formulated as the mode of the Gumbel distribution of $v_t$. It is also assumed that the evolution of the mean utility can be characterized by a homogeneous Markov process with transition density $\Phi(r_{t+1} \mid r_t; \theta_r)$. In addition, $r_t$ here follows a diffusion process defined by:

\[ r_{t+1} = \mu(r_t) + \sigma(r_t)\, \nu_{t+1} \tag{3.17} \]

where the $\nu_t$ are assumed to be i.i.d. standard normal $N(0, 1)$. $\mu(r)$ and $\sigma(r)$ are continuous and almost everywhere differentiable, and $\mu(r) > r$. The diffusion process can be expressed by means of different formulations; those formulations are reported in Melnikov's paper but are not implemented in the framework presented. Here $r_t$ follows a homoscedastic random walk with drift, $r_{t+1} = r_t + \mu + \sigma \nu_{t+1}$ (where $\mu \ge 0$). In this case, the Bellman equation (3.16) becomes:

\[ D(v_t, r_t) = \max\left\{ v_t,\ c + \beta E[D(v_{t+1}(r_{t+1})) \mid r_t] \right\} \tag{3.18} \]

where $v_t$ has a Gumbel distribution with mode $r_t$. Meanwhile, the stopping set is $\Gamma(r) = \{ v \mid v \ge c + \beta E[D(\cdot) \mid r] \}$ and it is convenient to define $W(r) = c + \beta E[D(\cdot) \mid r]$ as the reservation utility. $W(r)$ can be written in integral form as:

\[ W(r) = c + \beta \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \max(v, W(z))\, dF(v \mid z)\, d\Phi(z \mid r) \tag{3.19} \]

where, from (3.15), $F(v \mid z) = \exp(-\exp(-(v - z)))$, $d\Phi(z \mid r) = \phi\left( \frac{z - \mu(r)}{\sigma(r)} \right) dz$ and $\phi(\cdot)$ is the standard normal density.
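As an illustration of how the reservation utility in (3.19) might be computed, the sketch below iterates on the fixed point $W(r) = c + \beta E[\max(v, W(z)) \mid r]$ for the random walk with drift. It is a minimal numerical sketch under several assumptions that do not come from Melnikov's paper: the state $r$ is restricted to a bounded grid (next-period values are clamped to the grid ends), the normal shock is integrated by Gauss-Hermite quadrature, the Gumbel expectation is tabulated with the trapezoid rule, and the parameter values and function names (`gumbel_excess`, `reservation_utility`) are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Tabulate A(a) = E[max(eps - a, 0)] = integral_a^inf (1 - exp(-exp(-x))) dx
# for a standard Gumbel eps (mode 0, scale 1), using the trapezoid rule.
_X = np.linspace(-30.0, 40.0, 14001)
_SURV = -np.expm1(-np.exp(-_X))                          # 1 - F(x)
_SEG = 0.5 * (_SURV[:-1] + _SURV[1:]) * (_X[1] - _X[0])
_TAIL = np.concatenate([np.cumsum(_SEG[::-1])[::-1], [0.0]])


def gumbel_excess(a):
    """Interpolated A(a); below the table the integrand is ~1, so A grows linearly."""
    a = np.asarray(a, dtype=float)
    inside = np.interp(np.clip(a, _X[0], _X[-1]), _X, _TAIL)
    return inside + np.maximum(_X[0] - a, 0.0)


def reservation_utility(r_grid, c, beta, mu, sigma, n_nodes=15, tol=1e-10, max_iter=10000):
    """Fixed-point iteration on W(r) = c + beta * E[max(v, W(z)) | r], cf. (3.19)."""
    nodes, weights = hermegauss(n_nodes)
    weights = weights / weights.sum()                    # N(0, 1) quadrature probabilities
    # Next-period states z = r + mu + sigma * nu, clamped to the grid (an assumption).
    z = np.clip(r_grid[:, None] + mu + sigma * nodes[None, :], r_grid[0], r_grid[-1])
    W = np.zeros_like(r_grid)
    for _ in range(max_iter):
        W_z = np.interp(z, r_grid, W)                    # W evaluated at next-period states
        emax = W_z + gumbel_excess(W_z - z)              # E[max(v, W(z))], v ~ Gumbel(mode z)
        W_new = c + beta * (emax @ weights)
        if np.max(np.abs(W_new - W)) < tol:
            return W_new
        W = W_new
    raise RuntimeError("fixed-point iteration did not converge")


r_grid = np.linspace(-5.0, 15.0, 201)                    # hypothetical grid for r
W = reservation_utility(r_grid, c=0.1, beta=0.9, mu=0.1, sigma=0.3)
hazard = 1.0 - np.exp(-np.exp(-(W - r_grid)))            # 1 - F_v(W(r); r), cf. (3.15)
```

Because $\max(v, \cdot)$ is non-expansive and the update is discounted by $\beta$, the iteration contracts at rate $\beta$ on the bounded grid. The last line evaluates the probability that $v$ exceeds the reservation utility, using the Gumbel distribution in (3.15).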
Sales dynamics and aggregation

Demand structure

The dynamics of the demand structure are determined by the probability of postponing the purchase, which the author denotes as:

\[ \pi_{0t}(r_t) = P\{ s_{i,t+1} = 0 \mid s_{it} = 0, r_t \} = F_v(W(r_t); r_t) = \exp(-\exp(-(W(r_t) - r_t))) \tag{3.20} \]

The probability of buying the product is defined as the individual hazard rate of product adoption, $h(r_t) = 1 - \pi_{0t}(r_t)$. Furthermore, the product-specific purchase probability is:

\begin{align}
\pi_{jt}(r_t, \cdot) &= P\{ u_{jt} \ge u_{kt},\ \forall k \ne j;\ u_{jt} \ge W(r_t) \} \tag{3.21} \\
&= P\{ u_{jt} \ge W(r_t) \mid u_{jt} \ge u_{kt},\ \forall k \ne j \}\, P\{ u_{jt} \ge u_{kt},\ \forall k \ne j \} \tag{3.22} \\
&= P\{ u_{jt} \ge W(r_t) \}\, P\{ u_{jt} \ge u_{kt},\ \forall k \ne j \} \tag{3.23} \\
&= h(r_t)\, \frac{\exp(\delta_{jt})\, G_j(e^{\delta_{1t}}, \ldots, e^{\delta_{Jt}})}{G(e^{\delta_{1t}}, \ldots, e^{\delta_{Jt}})} = h(r_t)\, \frac{\exp(\delta_{jt})\, G_j(\cdot)}{R_t} \tag{3.24}
\end{align}

$G_j(\cdot)$ is the partial derivative of $G(\cdot)$ with respect to its $j$th argument.

One important issue in this Section is the calculation of the hazard rate with equation (3.20). By setting $Y(r_t) = W(r_t) - r_t$ and by combining (3.17), (3.18) and (3.19), $Y(r_t)$ can be written as:

\[ Y(r_t) = c + \beta \mu(r) - r + \beta \int_{-\infty}^{\infty} E[\max(\varepsilon, Y(r_{t+1}))]\, \phi\left( \frac{z - \mu(r)}{\sigma(r)} \right) dz \tag{3.25} \]

Using equation (3.17) and the random walk with drift assumed for $r_t$, $Y(r_t)$ is obtained from (3.25). Thus, the hazard rate $h$ can be computed from (3.20).

Aggregation

The transition of the consumer state can be represented by a Markov matrix $H: \{0, 1\} \to \{0, 1\}$:

\[ H_1(r_t) = \begin{bmatrix} \pi_{0t}(r_t) & h(r_t) \\ 0 & 1 \end{bmatrix} \tag{3.26} \]

The model can also accommodate a product "break down", which is given a probability $q$; consumer $i$ in state $s_{it} = 1$ has probability $q$ of returning to the market. Therefore the transition matrix can be expressed by:

\[ H_2(r_t) = \begin{bmatrix} \pi_{0t}(r_t) & h(r_t) \\ q & 1 - q \end{bmatrix} \tag{3.27} \]

The participation rate $\rho_t$ is composed of two components: (1) the market share that does not own a product and (2) the market share whose product has broken down (i.e. $\rho_t = P[s_{it} = 0]$). The participation rate evolves over time according to the Kolmogorov-Chapman equation (3.28).
\[ \rho_{t+1} = \rho_t\, \pi_{0t}(r_t) + q (1 - \rho_t) \tag{3.28} \]

The hazard rate and the product-specific purchase probability of (3.24) are adjusted into:

\[ h'_t = \rho_t h_t \tag{3.29} \]

\[ \pi'_{jt} = \rho_t\, h(r_t)\, \frac{\exp(\delta_{jt})\, G_j(\cdot)}{R_t} \tag{3.30} \]

Rather than using Rust's nested fixed point maximum likelihood algorithm, Melnikov uses an easier three-stage method to estimate the model: (1) identifying the static parameters by OLS, (2) using maximum likelihood to obtain the parameters $\mu$ and $\sigma$ from the transition density $\Phi(r_{t+1} \mid r_t; \theta_r)$, and (3) estimating the remaining parameters $(c, \beta, q, \rho_0)$ by fitting predicted sales to the data with the moment condition. This method is based on the assumptions that sales of product $j$ can be aggregated, that total market size is known and that consumers are homogeneous.

3.3.3 Computer Server Choice Model with Persistence Effect

In the previous examples, dynamic discrete choice models are applied by Rust to describe the optimal stopping time for the bus engine replacement decision, and by Melnikov to model the choice from a set of differentiated durable goods with quality stochastically improving over time. Lorincz's paper incorporates a persistence effect into the Melnikov optimal stopping problem [Lorincz, 2005]. If the consumer already has one product, he can upgrade it without getting rid of the old one. Hence, besides deciding about the optimal time to buy a product, the consumer who already has a product can choose between simply using the original product and upgrading its format. Overall, this model is built on three principles: product differentiation, the optimal stopping problem and the persistence effect.

In this example, the model is applied to low-end server computers, where formats are represented by operating systems (OSs). Since reliability and security are essential characteristics for servers, upgrades of OSs often need to be carried out.
Meanwhile, servers are very important parts of a computer network, which is ever changing, evolving and being upgraded; so the right server choice needs to be based on a more sophisticated forward-looking behavioral model. In particular, a dynamic nested logit model is estimated here, where nests are represented by different operating systems.

General dynamic nested logit model

Lorincz represents the evolution of the state vector by a Markov transition probability and models the problem by using the Bellman equation:

\[ V(s_t) = \max_{j \in J(s)} \left\{ u_j(s_t) + \beta \int V(s_{t+1})\, p(ds_{t+1} \mid s_t, j) \right\} \tag{3.31} \]

The choice set $J(s)$ is partitioned into $G + 1$ mutually exclusive subsets: $J(s) = \bigcup_{g=0}^{G} J_g(s)$. The subset $g = 0$ means that customers are not buying any product. The other $G$ subsets correspond to different OSs, which are nested. The state is composed of three elements: $x$, $y$ and $\varepsilon$. $x$ is a set of product-specific state variables such as characteristics and price; $y$ is the customer-specific state variable observed by the econometrician, with $y \in \{0, 1, \ldots, G\}$. $y = 0$ indicates that the customer does not own anything at the beginning of the current period. $y = g$ indicates that the product currently owned belongs to nest $g$. This latter specification differentiates this approach from Melnikov's formulation, where only the state $y = 0$ (not owning a product) and the break-down probability are considered.

Different utilities need to be specified depending on the conditions $y$ and $g$:

Case 1: $y = 0$ and $g = 0$. The customer owns nothing and does not buy, $u_j = c + \varepsilon_0$. The constant $c$ is a payoff.

Case 2: $y = 0$ and $g \in \{1, \ldots, G\}$, $j \in g$. The customer owns nothing but buys a product from nest $g$. The payoff is then the sum of a product-specific value and random terms, $u_j = x_j \gamma_g + \zeta_g + (1 - \sigma_g) \varepsilon_j$, where $\gamma_g$ is a vector of parameters and $\sigma_g \in (0, 1)$ governs correlation in nest $g$. The terms $\zeta_g$ and $\varepsilon_j$ represent the heterogeneity of nests and of products within nests, respectively.
They are distributed identically and independently across nests and periods with extreme value distributions.

Case 3: $y \in \{1, \ldots, G\}$ and $g = 0$. The customer does not buy anything when he already owns one product, so he gets a format-specific "continuation value" $c_y$: $u_j = c_y + \varepsilon^{u}_{0}$.

Case 4: $y \in \{1, \ldots, G\}$ and $g \in \{1, \ldots, G\}$, $j \in g = y$. The customer already has a product and decides to upgrade it, so the customer chooses an alternative $j$ from the upgrade nest $y$ of the original product: $u_j = x^{u}_{j} \gamma^{u}_{g} + \zeta^{u}_{g} + (1 - \sigma^{u}_{g}) \varepsilon^{u}_{j}$.

Some assumptions are made. $\varepsilon_0$ and $\varepsilon^{u}_{0}$ are i.i.d. across all alternatives and periods with extreme value distributions. $\zeta_g + (1 - \sigma_g) \varepsilon_j$ and $\zeta_g$ are i.i.d. across nests and periods with extreme value distributions, and the same holds for $\zeta^{u}_{g} + (1 - \sigma^{u}_{g}) \varepsilon^{u}_{j}$ and $\zeta^{u}_{g}$. Then, the transition probabilities are specified as follows:

\[ p(x_{t+1}, y_{t+1}, \varepsilon_{t+1} \mid x_t, y_t, \varepsilon_t, j) = h(\varepsilon_{t+1} \mid x_{t+1}, y_{t+1})\, f(x_{t+1} \mid x_t)\, l(y_{t+1} \mid y_t, j) \tag{3.32} \]

Simplified dynamic nested logit model

The customer is supposed to choose between nests; this assumption reduces the dimension of the state and choice space. Since the specific product index $j$ is identified by its nest $g$, the author replaces $j$ by $g$ in the transition probability function of state $y$. So in equation (3.32) $l(y_{t+1} \mid y_t, j)$ can be changed into $l(y_{t+1} \mid y_t, g)$, while equation (3.31) becomes $V(s_t) = \max_{j \in J(s)} \left\{ u_j(s_t) + \beta \int V(s_{t+1})\, p(ds_{t+1} \mid s_t, g) \right\}$. Therefore, it is assumed that the formats of all products belonging to the same nest $g$ are the same and that the customer-specific persistence effect is carried through time by the format but not by the product itself.

$p_j$ is defined as the probability of choosing product $j$ belonging to nest $g$. As in the classical nested logit model, $p_j$ can be obtained by multiplying the conditional probability of choosing $j$ from $g$ and the probability of choosing $g$, that is $p_j = p_{j \mid g}\, p_g$. Let $w_g(s) = \int V(s_{t+1})\, p(ds_{t+1} \mid s_t, g)$.
So $p_j$ of case 2 is represented by the following nested logit structure:

\[ p_j = \frac{\exp\left[ x_j \gamma_g / (1 - \sigma_g) \right]}{R_g} \cdot \frac{\exp\left[ (1 - \sigma_g) \ln R_g + \beta w_g(s) \right]}{\sum_{g'=1}^{G} \exp\left[ (1 - \sigma_{g'}) \ln R_{g'} + \beta w_{g'}(s) \right]} \tag{3.33} \]

where $R_g \equiv \sum_{j \in g} \exp\left[ x_j \gamma_g / (1 - \sigma_g) \right]$. In this formula, the first term and the second term are both standard logit models. The mean utility of the first term is $x_j \gamma_g / (1 - \sigma_g)$. The value $\delta_g \equiv \ln R_g$ is the expected maximum utility of the conditional choice problem. The mean utility of the second term is the weighted sum of the value $\delta_g$ of this nest $g$ and the discounted value of the next period problem. A similar formula can be derived for case 4, where the corresponding inclusive value is $\delta^{u}_{g}$.

After this reduction of the state vector, the state is composed of $(\delta, y, \varepsilon)$ with transition probability $p(\delta_{t+1}, y_{t+1}, \varepsilon_{t+1} \mid \delta_t, y_t, \varepsilon_t, g) = h(\varepsilon_{t+1} \mid \delta_{t+1}, y_{t+1})\, f(\delta_{t+1} \mid \delta_t)\, l(y_{t+1} \mid y_t, g)$. Here $\delta_t$ is the vector of the $\delta_g$'s and $\delta^{u}_{g}$'s. Writing the reduced state as $z_t = (\delta_t, y_t, \varepsilon_t)$, the Bellman equation is updated as

\[ V(z_t) = \max_{g \in \{0, 1, \ldots, G\}} \left\{ u_g(z_t) + \beta \int V(z_{t+1})\, p(dz_{t+1} \mid z_t, g) \right\} \tag{3.34} \]

Estimation

The model is estimated in three steps. First, static conditional logit models of within-nest choices are estimated; second, the transition probabilities for the models' inclusive values are calculated; then a dynamic logit model of the choice between nests, including the results from the first two steps, is calibrated. More technical details can be found in [Lorincz, 2005].

3.3.4 Dynamic Durable Goods Demand with Consumer Heterogeneity

In the previous examples, consumers are assumed to be homogeneous, with random terms that are i.i.d. Under this assumption the parameters of the static problem can be estimated separately from the dynamic one. Homogeneity simplifies the problem formulation, although the computational cost associated with the fixed point algorithm is still high. Furthermore, when extending the original technique to fully heterogeneous consumer problems, the integration of the individual demand function over the distribution of consumers' characteristics is needed.
In this context, Juan Esteban Carranza [Carranza, 2006] models digital camera demand by using models similar to those described in the previous Sections, but incorporates fully heterogeneous consumers into a reduced form of the participation probability. The author is then able to estimate the joint distribution of consumers' preferences and the parameters associated with the participation function, which is based on the observed number of purchases.

A dynamic model of demand

Suppose individual $i$ buys product $j$; the lifetime utility of this purchase is:

\[ u_{ijt} = \xi_{ij} + \alpha^{x}_{i} x_j - \alpha^{p}_{i} p_{jt} + \varepsilon_{ijt} \tag{3.35} \]

As in the framework presented in Section 2, $\xi_{ij}$ is an unobserved product attribute common to all consumers who purchase product $j$ at time $t$; $p_{jt}$ is the price of product $j$ at time $t$ and $x_j$ is the vector of observed static characteristics of product $j$. The preference parameters $(\xi_{ij}, \alpha^{x}_{i}, \alpha^{p}_{i})$ vary across consumers. The author lets $\xi_{ij} = \xi_j + \sigma_{\xi} e_{i\xi}$, $\alpha^{x}_{i} = \alpha^{x} + \sigma_{x} e_{ix}$ and $\alpha^{p}_{i} = \alpha^{p} + \sigma_{p} e_{ip}$, where $e_i$ is drawn from a known i.i.d. distribution $F_e$. So (3.35) can be rewritten as:

\begin{align}
u_{ijt} &= (\xi_j + \alpha^{x} x_j - \alpha^{p} p_{jt}) + (\sigma_{\xi} e_{i\xi} + \sigma_{x} e_{ix} x_j - \sigma_{p} e_{ip} p_{jt}) + \varepsilon_{ijt} \tag{3.36} \\
&= \delta_{jt}(x_j, p_{jt}, \theta_0, \xi_j) + \mu_{ijt}(x_j, p_{jt}, \theta_1, e_i) + \varepsilon_{ijt} \tag{3.37}
\end{align}

In this formula, $\theta_0 = (\alpha^{x}, \alpha^{p})$, $\theta_1 = (\sigma_{\xi}, \sigma_{x}, \sigma_{p})$ and $e_i = (e_{i\xi}, e_{ix}, e_{ip})$. The utility function has a mean $\delta_{jt}$, which is common to all consumers, a component $\mu_{ijt}$, which captures the variability of tastes across consumers, and an idiosyncratic product-consumer random component $\varepsilon_{ijt}$.

The reservation utility of the consumer is the value of not purchasing anything at time $t$ and waiting until the next period to decide. The problem can be formulated as:

\[ W(S_{it}) = 0 + \beta E\left[ \max\left\{ v_{i,t+1},\, W(S_{i,t+1}) \right\} \mid S_{it} \right] \tag{3.38} \]

Let $v_{it} = \max_j \{ u_{ijt}(\cdot) \}$ be the maximum utility consumer $i$ can get from any product purchased at $t$; it is assumed that its distribution $F_{v_{it}}$ is known (recall that in Melnikov's case $F_{v_{it}}$ is a GEV distribution).
The probability that the consumer buys any product at time $t$ can be expressed as a hazard rate (see 3.2.2) and obtained from the known distribution of $v_{it}$:

\[
\Pr(\text{purchase}) \equiv h_{it}(S_{it}) = P\left[v_{it} > W(S_{it})\right] = 1 - F_{v_{it}}(W(S_{it})) \tag{3.39}
\]

It is assumed that the $\varepsilon_{ijt}$ have independent extreme value distributions. According to Melnikov's derivation (Melnikov, 2000, see appendix), $v_{it}$ has an extreme value distribution with mode $r_{it}$:

\[
r_{it}(\cdot) = \log\left[\sum_{k\in J_t}\exp\left(\delta_{kt}(\cdot) + \mu_{ikt}(\cdot)\right)\right] \tag{3.40}
\]

Since $v_{it}$ is assumed to be Markovian, the state $S_{it}$ in formulas (3.39) and (3.40) can be replaced by $r_{it}$. The product-specific purchase probability is then

\[
h_{ijt}(\cdot; e_i) = h_{it}(r_{it}(\cdot; e_i))\,\frac{\exp\left(\delta_{jt}(\cdot) + \mu_{ijt}(\cdot; e_i)\right)}{\exp\left(r_{it}(\cdot; e_i)\right)} \tag{3.41}
\]

Estimation

To obtain the predicted market share for product $j$, (3.41) has to be integrated across consumers, based on the distribution $F_e$:

\[
s_{jt}(\theta_0, \theta_1) = \int h_t(r_t(\theta_0, \theta_1, e))\,\frac{\exp\left(\delta_{jt}(\theta_0) + \mu_{jt}(\theta_0, \theta_1, e)\right)}{\exp\left(r_t(\theta_0, \theta_1, e)\right)}\, dF_e \tag{3.42}
\]

$\theta = (\theta_0, \theta_1)$ can be obtained by equating the predicted and observed demand, provided that the observed demand for product $j$ at $t$ and the market size are known, that is:

\[
M_t\, s_{jt}(\theta_0, \theta_1) = Q_{jt} \tag{3.43}
\]

where $Q_{jt}$ is the observed demand for product $j$ at $t$ and $M_t$ is the market size. The integral in (3.42) can be approximated by simulation, using $N$ draws $\{e_n\}_{n=1,\dots,N}$, as in formula (3.44):

\[
s_{jt}(\theta_0, \theta_1) \approx \frac{1}{N}\sum_{n=1}^{N}\omega_{nt}\, h_{nt}(r_{nt}(\theta_0, \theta_1, e_n))\,\frac{\exp\left(\delta_{jt}(\theta_0) + \mu_{njt}(\theta_0, \theta_1, e_n)\right)}{\exp\left(r_{nt}(\theta_0, \theta_1, e_n)\right)} \tag{3.44}
\]

In (3.44), $\omega_{n,1} = 1$ and $\omega_{n,t>1} = \omega_{n,t-1}(1 - h_{n,t-1})$ is the probability that consumer $n$ is still in the market in period $t$. When computing the mean utility $\delta_j$ from (3.43), a fixed point for each simulated consumer is required. This procedure is computationally costly, and it is not clear whether the underlying mapping is a contraction over the whole relevant parameter space. To circumvent the nested computation, the author imposes a parametric structure on $h_{it}$ and estimates its parameter $\theta_{2i}$ as part of the whole model.
If the transition probability of $r_{it}$ follows a Markov process, the participation probability can be approximated as $h_{it} = \tilde{h}_i(r_{it}(\cdot); \tilde{\theta}_{2i})$, where the function $\tilde{h}(\cdot)$ varies across consumers. (3.44) is then updated as:

\[
s_{jt}(\theta_0, \theta_1, \theta_2) \approx \frac{1}{N}\sum_{n=1}^{N}\omega_{nt}\,\tilde{h}_{nt}(r_{nt}(\cdot); \tilde{\theta}_{2n})\,\frac{\exp\left(\delta_{jt}(\theta_0) + \mu_{njt}(\theta_0, \theta_1, e_n)\right)}{\exp\left(r_{nt}(\theta_0, \theta_1, e_n)\right)} \tag{3.45}
\]

The detailed estimation procedure for $\theta_0$, $\theta_1$ and $\tilde{\theta}_{2n}$ can be found in Carranza (2006).

3.3.5 Dynamic Durable Goods Demand with Repeat Purchases

Carranza's model incorporates consumer heterogeneity into differentiated product demand but does not account for repeat purchases. Gowrisankaran and Rysman [Gowrisankaran and Rysman, 2007] develop a dynamic model of consumer preferences for digital camcorders. It allows for unobserved product characteristics, repeat purchases, endogenous prices and differentiated products. It is assumed that a consumer who purchases product $j$ at $t$ receives a net flow utility $u_{ijt} = \delta^f_{ijt} - \alpha^p_i \ln(p_{jt}) + \varepsilon_{ijt}$, where $\delta^f_{ijt} = \alpha^x_i x_{jt} + \xi_{jt}$. Here $\delta^f_{ijt}$ is the gross flow utility from product $j$ purchased at time $t$; $x_{jt}$ are the observed characteristics and $\xi_{jt}$ the unobserved ones; $p_{jt}$ is the price; $\varepsilon_{ijt}$ is an idiosyncratic unobservable term. Let $\Omega_t$ denote the current product attributes, which evolve according to the Markov process $P(\Omega_{t+1}\mid\Omega_t)$. The authors specify that a consumer who does not purchase a new product at time $t$ also receives a net flow utility: $u_{i0t} = \delta^f_{i0t} + \varepsilon_{i0t}$. The value function can then be written as $V(\varepsilon_{it}, \delta^f_{i0t}, \Omega_t)$ and its expectation is $EV_i(\delta^f_{i0t}, \Omega_t) = \int_{\varepsilon_{it}} V(\varepsilon_{it}, \delta^f_{i0t}, \Omega_t)\, dP_\varepsilon$. $\varepsilon_{it}$ is i.i.d. and satisfies the conditional independence assumption of Rust (1987). The Bellman equation is:

\[
V_i(\varepsilon_{it}, \delta^f_{i0t}, \Omega_t) = \max\Big\{ u_{i0t} + \beta E\left[EV_i(\delta^f_{i0t}, \Omega_{t+1}) \mid \Omega_t\right],\ \max_{j=1,\dots,J_t}\big\{ u_{ijt} + \beta E\left[EV_i(\delta^f_{ijt}, \Omega_{t+1}) \mid \Omega_t\right] \big\} \Big\} \tag{3.46}
\]

The difficulty so far is the large dimensionality of $\Omega_t$, which makes (3.46) very hard to compute.
Therefore, the authors substitute $\Omega_t$ with a scalar variable, the logit inclusive value of purchasing at time $t$:

\[
\delta_{it}(\Omega_t) = \ln\Big(\sum_{j=1,\dots,J_t}\exp\left(\delta_{ijt}(\Omega_t)\right)\Big) \tag{3.47}
\]

In addition, there is one main simplifying assumption, termed Inclusive Value Sufficiency: the future logit inclusive value depends only on the current logit inclusive value. This assumption implies that if two states have the same inclusive value $\delta_{it}$ for consumer $i$ at the current time $t$, they have the same distribution of future inclusive values for this consumer. The resulting simplification is:

\[
EV_i\big(\delta^f_{i0t}, \delta_{it}, E[\delta_{i,t+1}, \delta_{i,t+2}, \dots \mid \Omega_t]\big) = EV_i(\delta^f_{i0t}, \delta_{it}) \tag{3.48}
\]

To specify the density $P(\delta_{i,t+1}\mid\delta_{it})$, a simple linear autoregressive specification with drift is assumed: $\delta_{i,t+1} = \gamma_{1i} + \gamma_{2i}\delta_{it} + u_{it}$, where $u_{it}$ is normally distributed with mean 0 and $\gamma_{1i}$, $\gamma_{2i}$ are parameters.

The estimation algorithm includes three levels of optimization. The inner loop evaluates the predicted market shares as a function of $\delta^f_{jt}$ and the parameters by solving the consumer dynamic programming problem for the simulated consumers and then integrating across consumer types. The middle loop solves a fixed point equation and iterates until the new and old $\delta^f_{jt}$ converge. The outer loop is a search over the parameters. Details can be found in Gowrisankaran and Rysman (2007).

The model allows for consumers' repeat purchases but does not introduce any new parameters over the static model, because it relies on strong assumptions about the product: durable goods do not wear out; there is no resale market for them; and no household holds more than one good of the same type. Therefore, a second purchased good differs from the previously owned one only through new, observed features.
3.4 Summary of Dynamic Demand Models in Economics

Finally, the five dynamic models are compared in Table 3.1, which includes the case description, the main formulation and the estimation method.

Rust's optimal stopping problem provides the basic model framework and the estimation method for the dynamic models developed later in the literature. It is a single agent problem describing the timing of one purchase over a set of products with homogeneous attributes (bus engines of different models). The estimation method is the nested fixed point algorithm, which computes the maximum likelihood estimates and reduces the computational burden of solving the contraction fixed point $EV$.

Melnikov's dynamic demand model of computer printers captures the idea that product quality improves rapidly over time and that product durability affects the evolution of prices and sales. As in Rust's example, only one purchase is made, and all consumer heterogeneity is captured by a term that is independently distributed across consumers, products and time. The difference is that it deals with differentiated durable products rather than homogeneous products. The estimation method is a three-stage procedure that replaces the more complicated nested fixed point maximum likelihood algorithm.

Lorincz's model extends Melnikov's optimal stopping problem with a persistence effect. The consumer can choose to upgrade the product instead of getting rid of it. Different product alternatives and two conditions are considered: without a product (when the alternatives are not to buy and to buy a new product) and with the current product (when the alternatives are not to upgrade and to upgrade the owned product). The decision problem in this case is therefore specified as a dynamic nested logit model. The estimation follows a sequential procedure with three steps.
Juan Esteban Carranza incorporates fully heterogeneous consumers into a reduced form of the participation probability for a digital camera demand problem. He estimates the joint distribution of consumers' preferences and the parameters of the participation function, which is identified from the observed number of purchases. The distribution of preferences is defined as a continuous parametric distribution. The complicated integration across consumers in the estimation requires simulation.

Gowrisankaran and Rysman's dynamic model for digital camcorder demand allows for repeat purchases, unlike the previous studies. The estimation algorithm includes three levels of optimization, and the estimation with repeat purchases can be simplified only under strong assumptions. Table 3.1 summarizes these dynamic models.

Tab. 3.1: Comparison of the five dynamic models

Name: Bus engine replacement
  Author (year): John Rust (1987)
  Data: Ten years of monthly data on bus mileage and engine replacements for a subsample of 104 buses.
  Characteristics: Single agent, one purchase, homogeneous attributes of the products.
  Main formula: Described recursively by Bellman's principle of optimality.
  Estimation method: Nested fixed-point maximum likelihood algorithm that computes theta and the associated value function.

Name: Computer printer demand
  Author (year): Oleg Melnikov (2000)
  Data: Monthly data on sales and average prices of computer printers and multifunction devices; 462 models from 27 manufacturers; 1998-1999.
  Characteristics: Homogeneous consumers with one purchase, differentiated durable products. Potential market size is required.
  Main formula: The timing of consumers' purchases is formulated as an optimal stopping problem, and the solution defines the hazard rate of product adoptions.
  Estimation method: A nested three-step method that allows for sequential parameters with aggregate data from relatively short time series.

Name: Low-end computer server demand with persistence effects
  Author (year): Szabolcs Lorincz (2005)
  Data: Quantities, prices and technical characteristics for all server models in three regions; 1996-2001.
  Characteristics: Homogeneous consumers with one purchase, differentiated servers and upgraded formats.
  Main formula: The utility function in the Bellman equation has four cases. The nested logit assumptions describe the unobserved heterogeneity term.
  Estimation method: Sequential procedure: specify static conditional logit models of within-nest choices, estimate the transition probabilities, then estimate the dynamic logit model of choice between nests.

Name: Digital camera demand
  Author (year): Juan Esteban Carranza (2006)
  Data: A panel of sales, prices and characteristics of digital cameras; 1998-2001.
  Characteristics: Fully heterogeneous consumers and differentiated durable products. Potential market size is required.
  Main formula: The endogenous participation probability has a reduced form. The identification of the participation function is based on the observations over time.
  Estimation method: Integration across consumers by simulation to obtain the market demand for each product; the parameter vector is estimated by equating the predicted and observed demand.

Name: Digital camcorder demand
  Author (year): Gautam Gowrisankaran and Marc Rysman (2009)
  Data: Monthly data for 378 models and 11 brands: number of units sold, price and other characteristics; March 2000 to May 2006.
  Characteristics: Repeat purchases, heterogeneous consumers and differentiated products.
  Main formula: Described recursively by Bellman's principle of optimality, with the logit inclusive value of purchasing in a given period.
  Estimation method: Three levels of non-linear optimization: an outer search over parameters, a middle fixed point calculation of population mean flow utilities, and an inner calculation of predicted market shares.

4. DYNAMIC CAR OWNERSHIP FORMULATION

Discrete choice models based on Random Utility Maximization (RUM) theory have been of interest to researchers for many years in a variety of disciplines. These methodologies are used to analyze and predict individual choice behavior [Ben-Akiva and Lerman, 1985, Daly, 1982, McFadden, 1978, Small, 1987, Chu, 1981, Wen and Koppelman, 2001, Papola, 2000].
However, discrete choice methods are commonly based on a static framework, which is limited by the assumption that consumers are not affected by past and future states when choosing their preferred alternative in the present. The gap between discrete choice models and the dynamics of individual behavior has spurred various developments that are mainly intended to enrich the basic theory by including the changes occurring in the system over time.

This chapter presents a comprehensive modeling framework for car ownership modeling; it includes the consumer utility specification, the definition of the dynamic programming problem, the industry evolution equation and the optimization algorithm.

4.1 Car Ownership Formulation

4.1.1 General Consumer Stopping Problem

We consider a set of consumers $I = \{1, \dots, M\}$, where each consumer $i \in I$ can be in one of two possible states at each time period $t \in \{0, 1, \dots, T\}$. More precisely, we have the state space $S = \{0, 1\}$, $\forall i \in \{1, \dots, M\}$, $t \in \{0, 1, \dots, T\}$. Each state $s_{it} \in S$ can therefore take two values:

\[
s_{it} = \begin{cases} 0 & i \text{ in the market},\\ 1 & i \text{ out of the market}. \end{cases}
\]

"In the market" typically means that the consumer, also referred to as the individual, has the possibility to buy a product, while "out of the market" means that the individual does not consider making a purchase at all. The state evolves from period to period depending on the consumer's decision as well as on some external factors. In other words, in an optimal stopping problem, a consumer in state 0 tries to choose the best transition period in order to attain state 1. The decision process continues even after he/she reaches state 1, because the framework is used for repeated purchases.

In the car ownership case, "in the market" means that the individual considers buying a car, regardless of whether he/she currently owns one.
If the individual does not own a car, it is quite possible that he/she considers buying one; if he/she does own a car but it is in problematic condition (or he/she plans to sell it), he/she can also consider replacing it. "Out of the market" means that the individual does not consider buying a car at all.

The car ownership problem is described by a regenerative optimal stopping problem, i.e. when the individual reaches state 1, this state is replaced by state 0, and some variables of the problem, such as current vehicle age and mileage, are reinitialized. The regeneration can sometimes happen at each period in state 1 with some probability (strictly less than 1).

In each time period $t$, consumer $i$ in state $s_{it} = 0$ has two options:

1. to buy one product $j \in \mathcal{J}_t$ and obtain a terminal period payoff $u_{ijt}$, where $\mathcal{J}_t = \{1, \dots, J_t\}$ is the set of products available at time $t$;

2. to postpone and obtain a one-period payoff $c_{it}$, which is a function of individual $i$'s attributes and the characteristics of the current product owned by $i$, i.e. $c(x_{it}, q_{it}; \theta_i, \alpha_i)$. $x_{it}$ is a vector of attributes for individual $i$ at time $t$, e.g., sex, education, income, age, etc., and $q_{it}$ is the vector of characteristics of the current product owned by this individual. $\theta_i$ and $\alpha_i$ are parameter vectors for $x_{it}$ and $q_{it}$ respectively.

It is here assumed that the choice set $\mathcal{J}_t$ is the same in each time period $t$, so the subscript $t$ can be dropped from $\mathcal{J}_t$ and $J_t$, leaving $\mathcal{J}$ and $J$ respectively. The payoff $u_{ijt}$ is expressed as a random utility function

\[
u_{ijt} = u(x_{it}, d_j, y_{jt}, \theta_i, \gamma_i, \lambda_i, \varepsilon_{ijt}), \tag{4.1}
\]

where $x_{it}, \theta_i \in$ [...]

(ii) $G$ is homogeneous of degree $\mu > 0$; here let $\mu = 1$;

(iii) $\lim_{a_j \to \infty} G(a_1, \dots, a_J) = \infty$, $\forall j = 1, \dots, J$;

(iv) for any distinct sequence $(j_1, \dots, j_k)$, $\partial^k G / \partial a_{j_1} \cdots \partial a_{j_k}$ is greater than 0 if $k$ is odd and less than 0 if $k$ is even.

In addition, $G_j(\cdot)$ denotes the first partial derivative of $G(\cdot)$ with respect to its $j$th argument.

1 This allows us to summarize car consumption and current fuel price into one attribute.
It is further assumed that equation (4.1) can be rewritten with the error acting in an additive way: $u_{ijt} = V_{ijt} + \varepsilon_{ijt}$, where $V_{ijt}$ is the mean utility, i.e. $V_{ijt} = E[u_{ijt}]$. It is also assumed so far that the parameters are the same over individuals, i.e. $\theta_i = \theta$, $\alpha_i = \alpha$, $\gamma_i = \gamma$, and $\lambda_i = \lambda$, $i = 1, \dots, M$ (in other words, there is no heterogeneity between individuals).

Relying on McFadden's seminal paper [McFadden, 1978], $\varepsilon$ follows a multivariate extreme value distribution. An example of a quite general $G$ function is

\[
G(a) = \sum_{n=1}^{N}\Big(\sum_{j\in B_n} a_j^{\frac{1}{1-\sigma_n}}\Big)^{1-\sigma_n} \tag{4.2}
\]

where $B_n \subseteq \{1, \dots, J\}$, $\cup_{n=1}^{N} B_n = \{1, \dots, J\}$, and $0 \le \sigma_n < 1$. Each $B_n$ ($n = 1, \dots, N$) can therefore be seen as a nest, with possible overlaps between the nests. $\sigma_n$ can be interpreted as an index of the similarity of the unobserved attributes in $B_n$. The choice probabilities for the function (4.2) satisfy

\[
P_i = \sum_{n=1}^{N} P[i \mid B_n]\,P[B_n] = \frac{\sum_{n:\, i\in B_n} e^{\frac{V_i}{1-\sigma_n}}\Big(\sum_{j\in B_n} e^{\frac{V_j}{1-\sigma_n}}\Big)^{-\sigma_n}}{\sum_{n=1}^{N}\Big(\sum_{k\in B_n} e^{\frac{V_k}{1-\sigma_n}}\Big)^{1-\sigma_n}}, \tag{4.3}
\]

where

\[
P[i \mid B_n] = \begin{cases} \dfrac{e^{\frac{V_i}{1-\sigma_n}}}{\sum_{j\in B_n} e^{\frac{V_j}{1-\sigma_n}}} & \text{if } i \in B_n,\\[2ex] 0 & \text{if } i \notin B_n. \end{cases}
\]

In the special case $G(a_1, \dots, a_J) = \sum_{j=1}^{J} a_j$, we have $F(\varepsilon_1, \dots, \varepsilon_J) = F(\varepsilon_1)\cdots F(\varepsilon_J)$, so the $\varepsilon_j$ are all independent, and $\sigma_n = 0$ ($n = 1, \dots, N$). When all alternatives are available, the probabilities (4.3) simplify to the usual multinomial logit probabilities.

The decision process has two steps. At each period, the consumer first decides whether to buy or to postpone the purchase until the optimal time period $\tau$, that is, the time when the consumer decides to buy instead of postponing; then, the consumer chooses the product $j^*_\tau$ that maximizes utility (4.1) over $\mathcal{J}$. The decision to buy or postpone is the optimal stopping problem at time $t$:

\[
D(u_{i1t}, \dots, u_{iJt}, c_{it}, t) = \max_{\tau}\Big\{\sum_{k=t}^{\tau-1}\beta^{k-t} c_{ik} + \beta^{\tau-t} E\Big[\max_{j\in\mathcal{J}} u_{ij\tau}\Big]\Big\} \tag{4.4}
\]

where $\beta$ is a discount factor in $[0, 1)$; $c_{it}$ is the payoff function of individual $i$'s attributes and the characteristics of the current product owned by $i$ when choosing to postpone the purchase, as defined before.
Let $v_{it} = \max_{j\in\mathcal{J}} u_{ijt}$. According to the previously described assumption about $\varepsilon_{ijt}$, $v_{it}$ is Gumbel distributed with a scale factor equal to 1, since it is assumed in property (ii) that $G(a)$ is homogeneous of degree one. In other terms,

\[
F_v(z; r_{it}) = e^{-e^{-(z - r_{it})}}, \tag{4.5}
\]
\[
f_v(z; r_{it}) = e^{r_{it}}\, e^{\left(-e^{-(z - r_{it})} - z\right)} = e^{-(z - r_{it})} F_v(z; r_{it}),
\]

where $r_{it}$ is the mode of the distribution of $v_{it}$, that is

\[
r_{it} = \ln G(e^{V_{i1t}}, \dots, e^{V_{iJt}}). \tag{4.6}
\]

Later, $r_{it}(y_{it})$ will be written in place of $r_{it}$ in order to stress the functional relationship between the distribution mode and the dynamic attributes. Based on dynamic programming theory, the consumer's decision can be transformed from (4.4) into:

\[
D(v_{it}, c_{it}) = \max\left\{v_{it},\ c_{it} + \beta E[D(v_{i,t+1})]\right\} \tag{4.7}
\]

4.1.2 Utility Formulation

The Bellman equation (4.7) becomes:

\[
D(v_{it}, c_{it}) = \max\left\{v_{it},\ c_{it} + \beta E[D(v_{i,t+1}(y_{t+1}, c_{i,t+1})) \mid y_t]\right\} \tag{4.8}
\]

where $y_t = (y_{1t}, \dots, y_{Jt})$. This is a standard regenerative optimal stopping problem. The stopping set is given by:

\[
\Gamma(y_t) = \left\{v_{it} \mid v_{it} \ge c_{it} + \beta E[D(\cdot) \mid y_t]\right\} \tag{4.9}
\]

The reservation utility level $W(y_t)$ is defined by:

\[
W(y_t) = c_{it} + \beta E[D(v_{i,t+1}(y_{t+1}, c_{i,t+1})) \mid y_t] \tag{4.10}
\]

and the optimal policy yields the value

\[
\begin{cases} v_{it} & \text{if } v_{it} \ge W(y_t),\\ W(y_t) & \text{otherwise.} \end{cases}
\]

Using (4.10), (4.7) can be simplified as:

\[
D(v_{it}) = \max\{v_{it}, W(y_t)\}.
\]

At this step, the calculation of the expected utility $E$ is complicated and will be discussed later. As presented in (4.9), consumer $i$ will buy some product at time $t$ only when $v_{it} > W(y_t)$.
The probability of postponing the purchase until the next period can therefore be written as:

\[
\pi_{i0t}(y_t) \overset{\text{def}}{=} P[v_{it} \le W(y_t)] = P[\text{postpone} \mid s_{it} = 0, y_t] = F_v(W(y_t); y_t) = e^{-e^{-(W(y_t) - r_{it})}} \tag{4.11}
\]

The probability of product adoption is $h(y_t) = P[\text{buy} \mid s_{it} = 0, y_t] = 1 - \pi_{i0t}(y_t)$, and the product-specific purchase probability is:

\[
\begin{aligned}
\pi_{ijt}(y_t) &\overset{\text{def}}{=} P[U_{ijt} \ge U_{ikt}, \forall k \ne j,\ u_{ijt} \ge W(y_t)]\\
&= P[U_{ijt} \ge W(y_t) \mid U_{ijt} \ge U_{ikt}, \forall k \ne j]\; P[U_{ijt} \ge U_{ikt}, \forall k \ne j]\\
&= P[v_{it} \ge W(y_t)]\; P[U_{ijt} \ge U_{ikt}, \forall k \ne j]\\
&= h(y_t)\,\frac{e^{V_{ijt}}\, G_j(e^{V_{i1t}}, \dots, e^{V_{iJt}})}{G(e^{V_{i1t}}, \dots, e^{V_{iJt}})}.
\end{aligned} \tag{4.12}
\]

As introduced in Section 4.1.1, $G_j(\cdot)$ is the partial derivative of $G(\cdot)$ with respect to its $j$th argument.

4.1.3 Industry Evolution

As expressed in Section 4.1.1, $y_{jt}$ represents the evolution of product $j$'s attributes and of the market environment. $y_{jt}$ is here assumed to follow a normal diffusion process:

\[
y_{j,t+1} = \mu(y_{jt}) + L(y_{jt})\,\nu_{j,t+1}, \tag{4.13}
\]

where

$\nu_{jt}$ ($j = 1, \dots, J$, $t = 1, \dots, T$) are i.i.d. multivariate standard normal random vectors, $N(0, I)$, where $I$ denotes the identity matrix;

$\mu(y_{jt}) : \mathbb{R}^H \to$ [...]

typedef struct {
    // individual ID
    int id;
    // socioeconomic variables for each individual
    double *indiv;
} ind;

typedef struct {
    // variable for current product in each time, for each person
    double **current;
} cur;

// information for each potential product
typedef struct {
    // decision variable, 0 or 1 at each time period to each product type for each person
    double **decision;
    // static attributes for potential choice j in time t
    double **stati;
} poten;

// Define the type of data used for the samplings.
typedef struct {
    // number of individuals
    int indivNum;
    // number of time period
    int time;
    // number of individual variables
    int numINDIVAR;
    // number of current car variables
    int numCURTVAR;
    // number of static vars for potential choice
    int numSTATIC;
    // number of dynamic vars for potential choice
    int numDYNAMIC;
    // number of choice
    int numch;
} glo;

//put all data files together
typedef struct {
    ind* in;
    cur* curr;
    poten* pot;
    glo* glonum;
    double*** prob_matrix; // Probability matrix
    Random *rand_seed;     // Random seed
    double** draw;         //drawn from Random normal distribution
    double ***err1;        //error term for i,j,t
    double *err2;          //error term for i
    double **p;            //the random value (0 <= p <= 1) for calculating normal dynamic variable y.
    double vp;             //the random value (0 <= p <= 1) for calculating MC v
    double ***perror;      //the random value (0 <= p <= 1) for calculating gumbel error_ijt for different i,j,t
    double *perrori;       //the random value (0 <= p <= 1) for calculating gumbel error_ijt for different i
    double *y_int;         //current gas prices (the variable will be seen as dynamic)
} Aldata;

//get number of parameters (dimension of the problem)
int get_dimension(Aldata* d);
//read data from four data structures and allocate memory
Aldata* format_data();
//generating gas prices for scenario tree
double** draw_random_y(Aldata* d, double** draw);
//mode in scenario tree
double*** calculate_mode(Aldata* d, double* x, double** y);
//mode that is correlated to current situation in each time
double** calculate_mode_real(Aldata* d, double* x);
//calculate log sum utility of three car alternatives
double*** calculate_v(Aldata* d, double* x, double** y, double vp);
//recursive process for scenario tree to calculate expectation utility E
double cal_E(Aldata* d, int t, int T, double *v, double current, int n, double* x, int indiv);
//probability of buying certain kind of car
double*** cal_probcar (Aldata* d, double* x);
//probability of not buying, PI0
double** cal_prob (Aldata* d, double* x,
                   double** y);

// functions of reading four .txt data files
void read_new_indiv(glo* paraNB, ind* paraIN);
void read_new_current(glo* paraNB, cur* paraC, ind* paraIN);
void read_new_poten(glo* paraNB, ind* paraIN, poten* paraP);
void read_new_choice(glo* paraNB, ind* paraIN, poten* paraP);
// function of reading coefficient data files
void read_new_para(double *x, glo* paraNB);

// functions of allocating memories to four structures and their elements
glo* getGlo();
ind* getIn(glo *glonum);
cur* getC(glo *glonum);
poten* getP(glo *glonum);

//memory allocation functions for probability matrix
double*** c_malloc_P(glo* paraNB);
//function of allocating memory to error terms
double*** c_malloc_e(glo* paraNB);
// function of allocating memory to the array of utility U[i][j][t]
double*** c_malloc_u(glo* paraNB);
double**** c_malloc_uy(glo* paraNB);
// function of allocating memory to the array of summation of exp(U[i][j]) for each person i
double* c_malloc_w(glo* paraNB);
//function of allocating memory to v_itj (the dimension of j is for 1000 draws)
double*** c_malloc_v(glo* paraNB);

// free functions
void free_ind (ind* in, glo* paraNB);
void free_cur (cur* c, glo* paraNB);
void free_poten (poten* p, glo* paraNB);
void free_uy(double**** u, glo* paraNB);
void free_glo (glo* g);
void free_w(double* W);
void free_v(double*** v, glo* paraNB);
void free_u(double*** u, glo* paraNB);
void free_p(double*** u, glo* paraNB);
void free_err (Aldata *d);

// Function to help in calculating t-stats
double amlet_t_statistics(int n, double *theta, double *hypothetical, double **I, double alpha, double *t);
//function to inverse matrix
void op_matrix_inverse(const enum CBLAS_ORDER Order, const enum CBLAS_UPLO Uplo, double *I, int npar);
//function to print out matrix
void nt_matrix_print(FILE *out, char *name, double **A, int m, int n);

#ifdef __cplusplus
}
#endif
#endif
data.c

#include "data.h"
/* The names of four system headers were lost in this listing; <stdio.h> and
   <stdlib.h> are the two that the calls below (fopen, printf, malloc, exit)
   require. */
#include <stdio.h>
#include <stdlib.h>

//read data from four data structures and allocate memory
Aldata* format_data(){
    Aldata *d;
    int i,j,t,n;

    d= malloc(sizeof(Aldata));
    d->glonum = getGlo();
    d->in = getIn(d->glonum);
    d->pot = getP(d->glonum);
    d->curr = getC(d->glonum);

    read_new_indiv(d->glonum, d->in);
    read_new_current(d->glonum, d->curr, d->in);
    read_new_poten(d->glonum, d->in, d->pot);
    read_new_choice(d->glonum, d->in, d->pot);

    d->prob_matrix = c_malloc_P(d->glonum);
    d->rand_seed = ran_random();

    d->y_int=malloc(d->glonum->time*sizeof(double));
    /* Place the current dynamic variables into the array. They are calculated
       by normal diffusion process with the same factors as function
       draw_random_y, initial y is 3.5 */
    d->y_int[0] = 3.50;
    d->y_int[1] = 3.27;
    d->y_int[2] = 3.20;
    d->y_int[3] = 3.25;
    d->y_int[4] = 3.43;
    d->y_int[5] = 3.42;
    d->y_int[6] = 3.24;
    d->y_int[7] = 3.23;
    d->y_int[8] = 3.41;
    d->y_int[9] = 3.63;
    d->y_int[10] = 3.86;
    d->y_int[11] = 3.95;

    // uniform draws used to generate the scenario-tree prices
    d->p=nt_matrix_new(d->glonum->time+1, 20);
    for (t=0; t<d->glonum->time; t++){
        for (n=0; n<20; n++){
            d->p[t][n] = ran_random_get_val(d->rand_seed);
        }
    }
    d->vp = ran_random_get_val(d->rand_seed);

    // uniform draws used to generate the gumbel error terms
    d->perror=c_malloc_e(d->glonum);
    for(i = 0; i < d->glonum->indivNum; i++) {
        for(t = 0; t < d->glonum->time+2; t++) {
            for(j = 0; j < d->glonum->numch+1; j++) {
                d->perror[i][j][t] = ran_random_get_val(d->rand_seed);
            }
        }
    }
    d->perrori=malloc(d->glonum->indivNum*sizeof(double));
    for(i = 0; i < d->glonum->indivNum; i++) {
        d->perrori[i] = ran_random_get_val(d->rand_seed);
    }

    // normal draws obtained from the uniform draws by inverse CDF
    d->draw=nt_matrix_new(d->glonum->time, 20);
    for (t=0; t<d->glonum->time; t++){
        for (n=0; n<20; n++){
            d->draw[t][n] = st_normal_icdf(d->p[t][n], 0, 16);
        }
    }

    // gumbel error terms obtained from the uniform draws by inverse CDF
    d->err1=c_malloc_e(d->glonum);
    d->err2=malloc(d->glonum->indivNum*sizeof(double));
    for(i = 0; i < d->glonum->indivNum; i++) {
        for(t = 0; t < d->glonum->time+2; t++) {
            d->err2[i]=st_gumbel_icdf(d->perrori[i], 0, 1 );
            for(j = 0; j < d->glonum->numch+1; j++) {
                d->err1[i][j][t]=st_gumbel_icdf(d->perror[i][j][t], 0, 1 );
            }
        }
    }
    free_p(d->perror, d->glonum);
    free(d->perrori);
    return d;
}

//check number of parameters (dimension of the problem)
int get_dimension(Aldata* d){
    int l0= d->glonum->numch-1;    //asc
    int l1= d->glonum->numINDIVAR; //indiv var
    int l2= d->glonum->numSTATIC;  //static var
    int l3= d->glonum->numDYNAMIC; //dynamic var
    int l4= d->glonum->numCURTVAR; //current var
    return l0+l1+l2+l3+l4;
}

/**
 * get a glo variable
 */
glo* getGlo() {
    glo *glonum;
    glonum = (glo *)malloc(sizeof(glo));
    glonum->indivNum = 200;
    glonum->time=12;
    glonum->numINDIVAR=2;
    glonum->numCURTVAR=1;
    glonum->numSTATIC=2;
    glonum->numDYNAMIC=1;
    glonum->numch=3;
    return glonum;
}

/**
 * get a ind variable
 */
ind* getIn(glo *glonum) {
    ind* individual = malloc((glonum->indivNum) * sizeof(ind));
    return individual;
}

/**
 * get a cur variable
 */
cur* getC(glo *glonum) {
    int i;
    cur *curr;
    int ttnumCUR=(glonum->time+2) * glonum->numCURTVAR;
    curr = (cur *)malloc(sizeof(cur));
    curr->current = malloc( glonum->indivNum * sizeof(double *));
    for(i=0;i<glonum->indivNum;i++) {
        curr->current[i]=malloc(ttnumCUR * sizeof(double));
    }
    return curr;
}

/**
 * get a Potential variable
 */
poten* getP(glo *glonum) {
    int i;
    poten *pot;
    int ttnumSTA = (glonum->time+2) * glonum->numch * glonum->numSTATIC;
    int ttnumDEC = glonum->time *( glonum->numch+1);
    pot = (poten *)malloc(sizeof(poten));
    pot->stati = malloc(glonum->indivNum * sizeof(double *));
    for(i = 0;i < glonum->indivNum;i++) {
        pot->stati[i]=malloc(ttnumSTA * sizeof(double));
    }
    // array of row pointers, so the element size is sizeof(double *)
    pot->decision = malloc(glonum->indivNum * sizeof(double *));
    for(i = 0;i < glonum->indivNum;i++) {
        pot->decision[i]=malloc(ttnumDEC * sizeof(double));
    }
    return pot;
}

void read_new_indiv(glo* paraNB, ind* paraIN) {
    FILE *inn;
    FILE *out;
    int i=0, j=0;
    // read the indivX.txt file
    inn=fopen("indivX.txt","r");
    out=fopen("oput1.txt","w");
    if (inn == NULL) {
        printf ("File could not be opened\n");
        exit(-1);
    }
    for(i = 0; i < paraNB->indivNum; i++) {
        paraIN[i].indiv = malloc( (paraNB->numINDIVAR) * sizeof(double));
        // read the individual's index number
        fscanf(inn, "%d", &paraIN[i].id);
        for (j=0;j<(paraNB->numINDIVAR);j++) {
            // read the x variables for this individual
            fscanf(inn, "%lg", &paraIN[i].indiv[j]);
            // printf("%d\n", in[i].x[j]);
        }
        //fscanf(inn,"\n");
    }
    for(i=0;i<paraNB->indivNum;i++) {
        // print out the person's index number
        fprintf(out,"%d", paraIN[i].id);
        for (j=0;j<(paraNB->numINDIVAR);j++) {
            // print out the x variables for each individual
            fprintf(out, "%lg", paraIN[i].indiv[j]);
        }
        fprintf(out,"\n");
    }
    fclose(inn);
    fclose(out);
}

void read_new_current(glo* paraNB, cur* paraC, ind* paraIN) {
    FILE *inn;
    FILE *out;
    int i=0, j=0;
    int ttnumCURT = paraNB->time * paraNB->numCURTVAR;
    inn=fopen("current.txt","r");
    out=fopen("oput2.txt","w");
    if (inn == NULL) {
        printf ("File could not be opened\n");
        exit(-1);
    }
    for(i=0;i<paraNB->indivNum;i++) {
        fscanf(inn, "%d", &paraIN[i].id);
        for (j=0;j<ttnumCURT;j++) {
            fscanf(inn, "%lg", &paraC->current[i][j]);
        }
    }
    for(i=0;i<paraNB->indivNum;i++) {
        fprintf(out,"%d", paraIN[i].id);
        for (j=0;j<ttnumCURT;j++) {
            fprintf(out,"%lg",paraC->current[i][j]);
        }
        fprintf(out,"\n");
    }
    fclose(inn);
    fclose(out);
}

void read_new_poten(glo* paraNB, ind* paraIN, poten* paraP) {
    FILE *inn;
    FILE *out;
    int i=0, j=0;
    int ttnumSTAT = (paraNB->time+2) * paraNB->numch * paraNB->numSTATIC;
    inn=fopen("poten.txt","r");
    out=fopen("oput3.txt","w");
    if (inn == NULL) {
        printf ("File could not be opened\n");
        exit(-1);
    }
    for(i = 0;i < paraNB->indivNum;i++) {
        fscanf(inn, "%d", &paraIN[i].id);
        for (j=0;j<ttnumSTAT;j++) {
            fscanf(inn, "%lg", &paraP->stati[i][j]);
        }
    }
    for(i=0; i<paraNB->indivNum; i++) {
        fprintf(out,"%d", paraIN[i].id);
        for (j=0; j<(ttnumSTAT); j++) {
            fprintf(out,"%lg",paraP->stati[i][j]);
        }
        fprintf(out,"\n");
    }
    fclose(inn);
    fclose(out);
}

void read_new_choice(glo* paraNB, ind* paraIN, poten* paraP) {
    FILE *inn;
    FILE *out;
    int i=0, j=0;
    int ttnumDEC = paraNB->time * (paraNB->numch+1);
    inn=fopen("choice.txt","r");
    out=fopen("oput4.txt","w");
    if (inn == NULL) {
        printf ("File could not be opened\n");
        exit(-1);
    }
    for(i=0; i< paraNB->indivNum; i++) {
        fscanf(inn, "%d",
               &paraIN[i].id);
        for (j = 0;j<(ttnumDEC);j++) {
            fscanf(inn, "%lg", &paraP->decision[i][j]);
        }
    }
    for(i=0;i<paraNB->indivNum;i++) {
        fprintf(out,"%d", paraIN[i].id);
        for (j=0;j<(ttnumDEC);j++) {
            fprintf(out,"%lg",paraP->decision[i][j]);
        }
        fprintf(out,"\n");
    }
    fclose(inn);
    fclose(out);
}

void read_new_para(double *x, glo* paraNB){
    FILE *inn;
    int i;
    int j = 0;
    float f;
    inn=fopen("parady.txt","r");
    if (inn == NULL) {
        printf ("File could not be opened\n");
        exit(-1);
    }
    // Reads the ASCs
    for(i=0; i< paraNB->numch-1; i++) {
        fscanf(inn, "%f", &f);
        x[j++] = f;
    }
    // Reads the individual specific parameters
    for(i=0; i< paraNB->numINDIVAR; i++) {
        fscanf(inn, "%f", &f);
        x[j++] = f;
    }
    fscanf(inn, "\n");
    // Reads the static parameters
    for(i=0; i< paraNB->numSTATIC; i++) {
        fscanf(inn, "%f", &f);
        x[j++] = f;
    }
    // Reads the dynamic parameters
    for(i=0; i< paraNB->numDYNAMIC; i++) {
        fscanf(inn, "%f", &f);
        x[j++] = f;
    }
    // Reads the current product parameters
    for(i=0; i< paraNB->numCURTVAR; i++) {
        fscanf(inn, "%f", &f);
        x[j++] = f;
    }
    // close the parameter file once all coefficients are read
    fclose(inn);
}

double*** c_malloc_P(glo* paraNB){
    double ***P;
    int i,j;
    P = (double***)malloc(paraNB->indivNum*sizeof(double**));
    for(i = 0; i < paraNB->indivNum; i++) {
        P[i] = (double**)malloc(4 *sizeof(double*));
        for(j=0;j<4;j++){
            P[i][j]=(double*)malloc(paraNB->time*sizeof(double));
        }
    }
    return P;
}

double*** c_malloc_e(glo* paraNB){
    double ***e;
    int i,j;
    e = (double***)malloc(paraNB->indivNum*sizeof(double**));
    for(i = 0; i < paraNB->indivNum; i++) {
        e[i] = (double**)malloc(4 *sizeof(double*));
        for(j=0;j<4;j++){
            e[i][j]=(double*)malloc((paraNB->time+2)*sizeof(double));
        }
    }
    return e;
}

double*** c_malloc_u(glo* paraNB){
    double ***U;
    int i,j;
    U = (double***)malloc(paraNB->indivNum*sizeof(double**));
    for(i = 0; i < paraNB->indivNum; i++) {
        U[i] = (double**)malloc(paraNB->numch *sizeof(double*));
        for(j=0;j<paraNB->numch;j++){
            U[i][j]=(double*)malloc(paraNB->time*sizeof(double));
        }
    }
    return U;
}

double**** c_malloc_uy(glo* paraNB){
    double ****U;
    int i,j,t;
    U = (double****)malloc(paraNB
        ->indivNum * sizeof(double***));
    for (i = 0; i < paraNB->indivNum; i++) {
        U[i] = (double***)malloc(paraNB->numch * sizeof(double**));
        for (j = 0; j < paraNB->numch; j++) {
            U[i][j] = (double**)malloc((paraNB->time+1) * sizeof(double*));
            for (t = 0; t < paraNB->time+1; t++) {
                U[i][j][t] = (double*)malloc(20 * sizeof(double));
            }
        }
    }
    return U;
}

double*** c_malloc_v(glo* paraNB) {
    double ***v;
    int i, j;
    v = (double***)malloc(paraNB->indivNum * sizeof(double**));
    for (i = 0; i < paraNB->indivNum; i++) {
        v[i] = (double**)malloc((paraNB->time+1) * sizeof(double*));
        for (j = 0; j < paraNB->time+1; j++) {
            v[i][j] = (double*)malloc(20 * sizeof(double));
        }
    }
    return v;
}

void free_ind(ind* in, glo* paraNB) {
    int i;
    for (i = 0; i < paraNB->indivNum; i++) {
        free(in[i].indiv);
    }
    free(in);
}

void free_cur(cur* curr, glo* paraNB) {
    int i;
    for (i = 0; i < paraNB->indivNum; i++) {
        free(curr->current[i]);
    }
    free(curr->current);
    free(curr);
}

void free_poten(poten* p, glo* paraNB) {
    int i;
    for (i = 0; i < paraNB->indivNum; i++) {
        free(p->stati[i]);
        free(p->decision[i]);
    }
    free(p->stati);
    free(p->decision);
    free(p);
}

void free_err(Aldata *d) {
    free_p(d->err1, d->glonum);
    free(d->err2);
}

void free_glo(glo* g) {
    free(g);
}

void free_v(double*** v, glo* paraNB) {
    int i, t;
    for (i = 0; i < paraNB->indivNum; i++) {
        for (t = 0; t < paraNB->time+1; t++) {
            free(v[i][t]);
        }
        free(v[i]);
    }
    free(v);
}

void free_u(double*** u, glo* paraNB) {
    int i, j;
    for (i = 0; i < paraNB->indivNum; i++) {
        for (j = 0; j < paraNB->numch; j++) {
            free(u[i][j]);
        }
        free(u[i]);
    }
    free(u);
}

void free_uy(double**** u, glo* paraNB) {
    int i, j, t;
    for (i = 0; i < paraNB->indivNum; i++) {
        for (j = 0; j < paraNB->numch; j++) {
            for (t = 0; t < paraNB->time+1; t++) {
                free(u[i][j][t]);
            }
            free(u[i][j]);
        }
        free(u[i]);
    }
    free(u);
}

void free_p(double*** u, glo* paraNB) {
    int i, j;
    for (i = 0; i < paraNB->indivNum; i++) {
        for (j = 0; j < paraNB->numch+1; j++) {
            free(u[i][j]);
        }
        free(u[i]);
    }
    free(u);
}

cal_v.c

#include [...]
#include [...]
#include [...]
#include [...]
#include [...]
#include "data.h"
#include [...]

/* U is the utility of each i, j, t;
   v is the utility of each i, t, Gumbel distributed; v_it is randomly generated;
   r is the mode of v
   j=0,
choice gas vehicle, u=asc+y*beta_y
   j=1, choice hybrid vehicle, u=asc+indiv*beta_indiv+static1*beta_sta1+y*beta_y
   j=2, choice electric vehicle, u=indiv*beta_indiv+static2*beta_sta2 */

/* Make draws for y from a normal distribution; the parameters are from JM's calibration.

   The draws are organized as a scenario tree. From each current gas price, one
   scenario tree is generated with a two-time-period expansion, which means that the
   respondent, standing at the current time, can imagine all the possible situations
   over the next two time periods. y[t][n] is the gas price in the scenario tree,
   which is the dynamic variable.

   From the root of the tree, two levels of gas prices are generated, and every price
   generates four hypothetical prices. The root price is the gas price at the current
   time period, y_int[t]. For example, from the root price y_int[0], four prices at
   time 1 are generated in the scenario tree; from each of the four prices at time 1,
   another four prices at time 2 are generated separately. Therefore, for each current
   time period, a total of 20 gas prices is generated.

   In the function, the 20 prices are stored in one array, which is divided into the
   two levels. From the current price y_int[t] at time t, the four prices
   y[t+1][0], y[t+1][1], y[t+1][2], y[t+1][3] at time t+1 are generated first; then
     from y[t+1][0]: y[t+1][4],  y[t+1][5],  y[t+1][6],  y[t+1][7];
     from y[t+1][1]: y[t+1][8],  y[t+1][9],  y[t+1][10], y[t+1][11];
     from y[t+1][2]: y[t+1][12], y[t+1][13], y[t+1][14], y[t+1][15];
     from y[t+1][3]: y[t+1][16], y[t+1][17], y[t+1][18], y[t+1][19]. */

/* y_int[t] is of the form 3.XX, but the calibration function is adapted only to 3XX,
   so d->y_int[t]*100 is passed to the function; therefore the generated y[t][n] is of
   the form 3XX, while the utility function needs 3.XX, so y[t][n] is divided by 100
   at the end.
*/
double** draw_random_y(Aldata* d, double **draw) {
    int t, n;
    double **y;
    y = nt_matrix_new(d->glonum->time+1, 20);
    for (t = 0; t < d->glonum->time; t++) {
        for (n = 0; n < 4; n++) {
            // first level: four prices generated from the root price y_int[t]
            y[t+1][n] = 0.9757*d->y_int[t]*100+4.49+draw[t][n];
            // second level: four prices generated from first-level node n
            y[t+1][4*n+4+0] = 0.9757*y[t+1][n]+4.49+draw[t][4*n+4+0];
            y[t+1][4*n+4+1] = 0.9757*y[t+1][n]+4.49+draw[t][4*n+4+1];
            y[t+1][4*n+4+2] = 0.9757*y[t+1][n]+4.49+draw[t][4*n+4+2];
            y[t+1][4*n+4+3] = 0.9757*y[t+1][n]+4.49+draw[t][4*n+4+3];
        }
    }
    // convert the prices from 3XX back to 3.XX
    for (t = 1; t < d->glonum->time+1; t++) {
        for (n = 0; n < 20; n++) {
            y[t][n] = y[t][n] / 100.0;
        }
    }
    return y;
}

/* calculate the mode r[i][t] for the current time period with y_int[t] and all other
   variables at that period;
   r_it = log(sum_j(exp(U_ijt)));
   j=0, choice gas vehicle, u=asc+y*beta_y
   j=1, choice hybrid vehicle, u=asc+indiv*beta_indiv+static1*beta_sta1+y*beta_y
   j=2, choice electric vehicle, u=indiv*beta_indiv+static1*beta_sta1 */
double** calculate_mode_real(Aldata* d, double* x) {
    int i = d->glonum->indivNum;
    int j = d->glonum->numch;
    int t;
    int k, l, m;
    double sum = 0;
    int l0 = d->glonum->numch-1;
    int l1 = d->glonum->numINDIVAR;
    int l2 = d->glonum->numch;
    int l3 = d->glonum->numDYNAMIC;
    double **r_real;
    r_real = nt_matrix_new(d->glonum->indivNum, d->glonum->time);
    double ***U = c_malloc_u(d->glonum);

    // calculate utility for three choices, from time 0
    for (i = 0; i < d->glonum->indivNum; i++) {
        for (t = 0; t < d->glonum->time; t++) {
            j = 0;
            sum = 0.0;
            sum += x[j];
            for (m = 0; m < l3; m++) {
                sum += d->y_int[t]*x[l0+l1+l2-1+m];
            }
            U[i][j][t] = sum;

            j = 1;
            sum = 0.0;
            sum += x[j];
            for (k = 0; k < l1; k++) {
                sum += d->in[i].indiv[k]*x[l0+k];
            }
            for (l = 0; l < d->glonum->numSTATIC-1; l++) {
                sum += (d->pot->stati[i][(d->glonum->numSTATIC)*j+l+t*d->glonum->numSTATIC*l2])*x[l0+l1+l];
            }
            for (m = 0; m < l3; m++) {
                sum += d->y_int[t]*x[l0+l1+l2-1+m];
            }
            U[i][j][t] = sum;

            j = 2;
            sum = 0.0;
            for (k = 0; k < l1; k++) {
                sum += d->in[i].indiv[k]*x[l0+k];
            }
            for (l = 0; l < d->glonum->numSTATIC-1; l++) {
                sum += (d->pot->stati[i][(d->glonum->numSTATIC)*j+l+t*d->glonum->numSTATIC*l2])*x[l0+l1+(d->glonum->numSTATIC-1)*(j-1)+l];
            }
            U[i][j][t] = sum;
        }
    }
    for (t = 0; t < d->glonum->time; t++) {
        for (i = 0; i < d->glonum
            ->indivNum; i++) {
            sum = 0;
            for (j = 0; j < d->glonum->numch; j++) {
                sum += exp(U[i][j][t]);
            }
            r_real[i][t] = log(sum);
        }
    }
    free_u(U, d->glonum);
    return r_real;
}

/* calculate the mode for the scenario tree, r[i][t][n], where n is the position in the
   tree; to get the mode, the utilities UY[i][j][t][n] need to be calculated;
   the utilities for each time period have two levels:
   UY[i][j][t][n], n=0,1,2,3 are in the first level, with the variables at time t and
   y[t][n], n=0,1,2,3;
   UY[i][j][t][n], n=4...19 are in the second level, with the variables at time t+1 and
   y[t][n], n=4...19;
   so for the mode r[i][t][n], n=0,1,2,3 are in the first level and n=4...19 are in the
   second level. */
double*** calculate_mode(Aldata* d, double* x, double** y) {
    int i = d->glonum->indivNum;
    int j = d->glonum->numch;
    int t;
    int k, l, m, n;
    double sum = 0;
    double ***r;
    int l0 = d->glonum->numch-1;
    int l1 = d->glonum->numINDIVAR;
    int l2 = d->glonum->numch;
    int l3 = d->glonum->numDYNAMIC;
    r = c_malloc_v(d->glonum);
    double ****UY = c_malloc_uy(d->glonum);

    for (i = 0; i < d->glonum->indivNum; i++) {
        for (t = 1; t < d->glonum->time+1; t++) {
            //calculate utilities for the first level of scenario tree, n=0,1,2,3
            for (n = 0; n < 4; n++) {
                j = 0;
                sum = 0.0;
                sum += x[j];
                for (m = 0; m < l3; m++) {
                    sum += y[t][n]*x[l0+l1+l2-1+m];
                }
                UY[i][j][t][n] = sum;

                j = 1;
                sum = 0.0;
                sum += x[j];
                for (k = 0; k < l1; k++) {
                    sum += d->in[i].indiv[k]*x[l0+k];
                }
                for (l = 0; l < d->glonum->numSTATIC-1; l++) {
                    sum += (d->pot->stati[i][(d->glonum->numSTATIC)*j+l+t*d->glonum->numSTATIC*l2])*x[l0+l1+(d->glonum->numSTATIC-1)*(j-1)+l];
                }
                for (m = 0; m < l3; m++) {
                    sum += y[t][n]*x[l0+l1+l2-1+m];
                }
                UY[i][j][t][n] = sum;

                j = 2;
                sum = 0.0;
                for (k = 0; k < l1; k++) {
                    sum += d->in[i].indiv[k]*x[l0+k];
                }
                for (l = 0; l < d->glonum->numSTATIC-1; l++) {
                    sum += (d->pot->stati[i][(d->glonum->numSTATIC)*j+l+t*d->glonum->numSTATIC*l2])*x[l0+l1+(d->glonum->numSTATIC-1)*(j-1)+l];
                }
                UY[i][j][t][n] = sum;
            }
            //calculate utilities for the second level of scenario tree, n=4,...19
            for (n = 4; n < 20; n++) {
                j = 0;
                sum = 0.0;
                sum += x[j];
                for (m = 0; m < l3; m++) {
                    sum += y[t][n]*x[l0+l1+l2-1+m];
                }
                UY[i][j][t][n] = sum;

                j = 1;
                sum = 0.0;
                sum += x[j];
                for (k = 0; k < l1; k++) {
                    sum += d->in[i].indiv[k]*x[l0+k];
                }
                for (l = 0; l < d->glonum->numSTATIC-1; l++) {
                    sum += (d->pot->stati[i][(d->glonum->numSTATIC)*j+l+(t+1)*d->glonum->numSTATIC*l2])*x[l0+l1+(d->glonum->numSTATIC-1)*(j-1)+l];
                }
                for (m = 0; m < l3; m++) {
                    sum += y[t][n]*x[l0+l1+l2-1+m];
                }
                UY[i][j][t][n] = sum;

                j = 2;
                sum = 0.0;
                for (k = 0; k < l1; k++) {
                    sum += d->in[i].indiv[k]*x[l0+k];
                }
                for (l = 0; l < d->glonum->numSTATIC-1; l++) {
                    sum += (d->pot->stati[i][(d->glonum->numSTATIC)*j+l+(t+1)*d->glonum->numSTATIC*l2])*x[l0+l1+(d->glonum->numSTATIC-1)*(j-1)+l];
                }
                UY[i][j][t][n] = sum;
            }
        }
    }
    for (i = 0; i < d->glonum->indivNum; i++) {
        for (t = 1; t < d->glonum->time+1; t++) {
            for (n = 0; n < 20; n++) {
                sum = 0;
                for (j = 0; j < d->glonum->numch; j++) {
                    sum += exp(UY[i][j][t][n]);
                }
                r[i][t][n] = log(sum);
            }
        }
    }
    free_uy(UY, d->glonum);
    return r;
}

/* v is randomly drawn from a Gumbel distribution with mode r_itn, also in the
   scenario tree; n is the position of v in the tree;
   v[i][t][n], n=0,1,2,3 are in the first level; n=4...19 are in the second level */
double*** calculate_v(Aldata* d, double* x, double** y, double vp) {
    int t, i, n;
    double*** v = c_malloc_v(d->glonum);
    double*** r = calculate_mode(d, x, y);
    for (i = 0; i < d->glonum->indivNum; i++) {
        for (t = 1; t < d->glonum->time+1; t++) {
            for (n = 0; n < 20; n++) {
                v[i][t][n] = st_gumbel_icdf(vp, r[i][t][n], 1);
            }
        }
    }
    free_v(r, d->glonum);
    return v;
}

inline double max(double a, double b) {
    if (a < b)
        return b;
    return a;
}

double*** cal_probcar(Aldata* d, double* x) {
    int i = d->glonum->indivNum;
    int j = d->glonum->numch;
    int t;
    int k, l, m;
    double sum = 0;
    double ***err1, *err2;
    int l0 = d->glonum->numch-1;
    int l1 = d->glonum->numINDIVAR;
    int l2 = d->glonum->numch;
    int l3 = d->glonum->numDYNAMIC;
    int T = d->glonum->time;
    double ***P = c_malloc_u(d->glonum);
    double **w = nt_matrix_new(d->glonum->indivNum, d->glonum->time);
    double ***U = c_malloc_u(d->glonum);
    err1 = d->err1;
    err2 = d->err2;

    for (t = 0; t < T; t++) {
        for (i = 0; i < d->glonum->indivNum; i++) {
            j = 0;
            sum = 0.0;
            sum += x[j];
            for (m = 0; m < d->glonum->numDYNAMIC; m++) {
                sum += d->y_int[t]*x[l0+l1+l2-1+m];
            }
            U[i][j][t] = sum+err1[i][j][t]+err2[i];

            j = 1;
            //choice hybrid vehicle, u=asc+indiv*beta_indiv+static1*beta_sta1+y*beta_y
            sum = 0.0;
            sum += x[j];
            for (k = 0; k < l1; k++) {
                sum += d->in[i].indiv[k]*x[l0+k];
            }
            for (l = 0; l < d->glonum->numSTATIC-1; l++) {
                sum += (d->pot->stati[i][(d->glonum->numSTATIC)*j+l+t*d->glonum->numSTATIC*l2])*x[l0+l1+(d->glonum->numSTATIC-1)*(j-1)+l];
            }
            for (m = 0; m < l3; m++) {
                sum += d->y_int[t]*x[l0+l1+l2-1+m];
            }
            U[i][j][t] = sum+err1[i][j][t]+err2[i];

            j = 2;
            //choice electric vehicle, u=indiv*beta_indiv+static1*beta_sta1
            sum = 0.0;
            for (k = 0; k < l1; k++) {
                sum += d->in[i].indiv[k]*x[l0+k];
            }
            for (l = 0; l < d->glonum->numSTATIC-1; l++) {
                sum += (d->pot->stati[i][(d->glonum->numSTATIC)*j+l+t*d->glonum->numSTATIC*l2])*x[l0+l1+(d->glonum->numSTATIC-1)*(j-1)+l];
            }
            U[i][j][t] = sum+err1[i][j][t]+err2[i];
        }
    }
    // logit probabilities of each vehicle type, conditional on buying
    for (t = 0; t < d->glonum->time; t++) {
        for (i = 0; i < d->glonum->indivNum; i++) {
            sum = 0;
            for (j = 0; j < d->glonum->numch; j++) {
                sum += exp(U[i][j][t]);
            }
            w[i][t] = sum;
            for (j = 0; j < d->glonum->numch; j++) {
                P[i][j][t] = exp(U[i][j][t])/w[i][t];
            }
        }
    }
    nt_matrix_free(w);
    free_u(U, d->glonum);
    return P;
}

/* recursive process for calculating E_AVE; n is the position in the tree */
double cal_E(Aldata* d, int t, int T, double *v, double current, int n, double* x, int indiv) {
    int i;
    double e_ave;
    double ***err1, *err2;
    double c;
    err1 = d->err1;
    err2 = d->err2;
    int l0 = d->glonum->numch-1;
    int l1 = d->glonum->numINDIVAR;
    int l2 = d->glonum->numch;
    int l3 = d->glonum->numDYNAMIC;

    // Base case to cover the last time period
    /* if (T < t) return 0; */
    c = current* x[l0+l1+l2-1+l3]+err1[indiv][3][t]+err2[indiv];
    if (t == T)  // Base Case
        return max(v[n], c);

    // Recursive Step
    e_ave = 0;
    for (i = 0; i < 4; i++) {
        // go further to reach the second level in the tree
        e_ave += cal_E(d, t+1, T, v, current+0.5, (4+4*n)+i, x, indiv);
    }
    e_ave = e_ave/4;
    return max(v[n], c+e_ave);
}

/* PI0, PI1 are the probabilities of not buying (postponing) and of buying;
   C_it is the utility payoff when not buying, =indiv*beta_indiv+mile*beta_mile;
   PI0 = F(v [...] */
double** cal_prob(Aldata* d, double* x, double** y) {
    int i, t, n;
    double sum;
    double **C, **W, **E_AVE, **PI0, **PI1;
    double ***err1, *err2;
    int l0 = d->glonum->numch-1;
    int l1 = d->glonum->numINDIVAR;
    int l2 = d->glonum->numch;
    int l3 = d->glonum->numDYNAMIC;
    int T = d->glonum->time;
    err1 = d->err1;
    err2 = d->err2;
    C = nt_matrix_new(d->glonum->indivNum, d->glonum->time);
    W = nt_matrix_new(d->glonum->indivNum, d->glonum->time);
    E_AVE = nt_matrix_new(d->glonum->indivNum, d->glonum->time+1);
    PI0 = nt_matrix_new(d->glonum->indivNum, d->glonum->time);
    PI1 = nt_matrix_new(d->glonum->indivNum, d->glonum->time);
    double ***v = calculate_v(d, x, y, d->vp);
    double
        **r_real = calculate_mode_real(d, x);

    // utility payoff C of keeping the current vehicle
    for (i = 0; i < d->glonum->indivNum; i++) {
        for (t = 0; t < d->glonum->time; t++) {
            sum = 0.0;
            sum += d->curr->current[i][t]* x[l0+l1+l2-1+l3];
            C[i][t] = sum+err1[i][3][t]+err2[i];
        }
    }
    // calculate expectations E_AVE in the first level of the tree in a recursive way;
    for (i = 0; i < d->glonum->indivNum; i++) {
        for (t=0; t< [...] d->curr->current[i][t]+0.5, n, x, i);
            }
            E_AVE[i][t+1] = E_AVE[i][t+1]/4;
            W[i][t] = C[i][t]+E_AVE[i][t+1];
        }
    }
    //calculate probabilities of postponing PI0 with reservation utility W, mode r
    for (i = 0; i < d->glonum->indivNum; i++) {
        for (t = 0; t < d->glonum->time; t++) {
            PI0[i][t] = st_gumbel_cdf(W[i][t], r_real[i][t], 1);
            PI1[i][t] = 1-PI0[i][t];
        }
    }
    nt_matrix_free(C);
    nt_matrix_free(W);
    nt_matrix_free(E_AVE);
    free_v(v, d->glonum);
    nt_matrix_free(r_real);
    nt_matrix_free(PI1);
    return PI0;
}

LL.c

#include [...]
#include [...]
#include [...]
#include [...]
#include [...]
#include "data.h"
#include [...]
#include [...]
#include [...]
//#include "tstat.h"

/* PB is the product-specific purchase probability (j=0,1,2), PB=PI1*P;
   for j=3, PB is the probability of not buying */
double fLL(double* x, int n, void* data) {
    Aldata* d = (Aldata*) data;
    double ***PB = d->prob_matrix;
    double** y = draw_random_y(d, d->draw);
    double ***P = cal_probcar(d, x);
    double **PI0 = cal_prob(d, x, y);
    int i, j, t;
    double ***ch, LL;
    ch = c_malloc_P(d->glonum);

    // observed choice indicators
    for (i = 0; i < d->glonum->indivNum; i++) {
        for (t = 0; t < d->glonum->time; t++) {
            for (j = 0; j < d->glonum->numch+1; j++) {
                ch[i][j][t] = d->pot->decision[i][t*4+j];
            }
        }
    }
    for (i = 0; i < d->glonum->indivNum; i++) {
        for (t = 0; t < d->glonum->time; t++) {
            for (j = 0; j < d->glonum->numch; j++) {
                PB[i][j][t] = (1-PI0[i][t])*P[i][j][t];
            }
            // j == numch here: the alternative of not buying
            PB[i][j][t] = PI0[i][t];
        }
    }
    LL = 0;
    for (i = 0; i < d->glonum->indivNum; i++) {
        for (t = 0; t < d->glonum->time; t++) {
            for (j = 0; j < d->glonum->numch+1; j++) {
                LL += ch[i][j][t]*log(PB[i][j][t]);
            }
        }
    }
    // printf("Log likelihood:");
    //printf(" %f", LL);
    //printf("\n");
    free_p(ch, d->glonum);
    nt_matrix_free(PI0);
    nt_matrix_free(y);
    free_u(P, d->glonum);
    return -LL/(200*12);
}

/*
   optimization;
   H, preallocated array of size btr->n*btr->n, if the Hessian is needed;
   I, Hessian matrix;
   I1, inverted Hessian matrix; */
int btr_unconstrained_opt(NTLog *log, BTR *b, Aldata* d) {
    double **H;
    double *t, *h;
    int n = get_dimension(d);
    int i;
    double tol = 0;  //sets tolerance and scale for hessian derivation
    double scale[n];
    int s;
    FILE *out;
    out = fopen("matrix.txt", "w");
    t = malloc(n*sizeof(double));
    double work[100*(b->n)];
    //b->retro = 1;
    H = nt_matrix_new(n, n);
    nt_matrix_identity(n, n, *H, n);
    nt_log_subsection(log, "optim of Log likelihood");
    nlp_btr_init(b, n, 0);

    /* Starting point */
    // b->x[0] = -1;
    read_new_para(b->x, d->glonum);
    b->printer = btr_print_iteration;
    nlp_btr(b, (C_GENERIC)fLL, NULL, d, log->f, H, work);

    //derive hessian matrix
    for (s = 0; s < n; ++s)
        scale[s] = 1.0;
    h = malloc(n*sizeof(double));
    nt_derive_hess_cd((C_GENERIC)fLL, b->x, H, h, tol, scale, n, NULL, work, (void*) d);
    op_matrix_inverse(CblasRowMajor, CblasUpper, *H, n);

    double bugfound[n];
    int bugindex = 0;
    for (bugindex=0; bugindex< [...] ->x, bugfound, H, 0.05, t);

    //Print out inverted hessian matrix
    nt_matrix_print(out, "matrix", H, n, n);
    fclose(out);

    // Print out the t-statistics
    printf("t:");
    for (i = 0; i < n; i++) {
        printf(" %f", t[i]);
    }
    return 0;
}

int main(int argc, char **argv) {
    BTR *b = malloc(sizeof(BTR));
    NTLog *log;
    Aldata *d = (Aldata*) format_data();
    int n = get_dimension(d);

    nlp_btr_init(b, n, 0);
    log = nt_log_new(NULL);
    btr_unconstrained_opt(log, b, d);

    nt_log_free(log);
    nlp_btr_free(b);
    free_ind(d->in, d->glonum);
    free_cur(d->curr, d->glonum);
    free_poten(d->pot, d->glonum);
    free_err(d);
    free_glo(d->glonum);
    free(d);
    return 0;
}
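Note on the scenario-tree layout used in cal_v.c. The comments in cal_v.c describe how the 20 gas prices of one scenario tree are stored in a single array: indices 0 to 3 hold the first level, and the four children of first-level node n sit at indices 4+4n to 4+4n+3. The following is a minimal, self-contained sketch of that layout and of the price recursion 0.9757*p + 4.49 + draw taken from draw_random_y. The names tree_child, next_price and build_price_tree are hypothetical helpers introduced only for this illustration; they are not part of the listing, and the shocks would in practice come from the calibrated normal draws.

```c
#include <assert.h>

#define TREE_BRANCH 4                                          /* children per node      */
#define TREE_NODES  (TREE_BRANCH + TREE_BRANCH * TREE_BRANCH)  /* 4 + 16 = 20 per tree   */

/* Index of the k-th child (k = 0..3) of first-level node n (n = 0..3),
   following the 4*n+4+k convention used in draw_random_y. */
static int tree_child(int n, int k)
{
    assert(n >= 0 && n < TREE_BRANCH);
    assert(k >= 0 && k < TREE_BRANCH);
    return TREE_BRANCH + TREE_BRANCH * n + k;
}

/* One step of the price recursion, with the price expressed in cents (3XX). */
static double next_price(double price_cents, double shock)
{
    return 0.9757 * price_cents + 4.49 + shock;
}

/* Fill tree[0..19] with prices in dollars (3.XX), starting from the root price
   root_dollars. shocks[0..19] holds one random draw per node of the tree. */
static void build_price_tree(double root_dollars,
                             const double shocks[TREE_NODES],
                             double tree[TREE_NODES])
{
    int n, k;

    for (n = 0; n < TREE_BRANCH; n++) {
        /* first level: generated from the root price, converted to cents */
        double first = next_price(root_dollars * 100.0, shocks[n]);
        tree[n] = first;

        /* second level: four prices generated from first-level node n */
        for (k = 0; k < TREE_BRANCH; k++) {
            int c = tree_child(n, k);
            tree[c] = next_price(first, shocks[c]);
        }
    }

    /* convert every node back from cents to dollars, as the utilities expect */
    for (n = 0; n < TREE_NODES; n++)
        tree[n] /= 100.0;
}
```

With a root price of 3.00 and all shocks set to zero, every first-level node equals (0.9757*300 + 4.49)/100 = 2.972 and every second-level node equals (0.9757*297.2 + 4.49)/100, roughly 2.9447. Keeping the index arithmetic in one helper makes it easier to check that no child index exceeds 19, which matters because the listing allocates exactly 20 entries per period.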