ABSTRACT

Title of Document: THE PERFORMANCE OF PERFORMANCE-BASED CONTRACTING IN HUMAN SERVICES

Jiahuan Lu, PhD, 2014

Directed By: Dr. Donald F. Kettl, School of Public Policy

Performance-based contracting (PBC) is becoming increasingly attractive to public human service agencies. By attaching contract compensation to contractors’ performance achievement, PBC is expected to encourage quality services, better outcomes, and less administrative monitoring. However, despite its burgeoning popularity, there is not yet sufficient evidence to confirm these promised benefits. In particular, efforts to introduce PBC into human service systems need first to address the effectiveness problem, i.e., whether PBC really produces better results. This problem constitutes the research question of this project.

After building a theoretical framework that incorporates the literature on formal and relational contracting, this project explores the effectiveness question using the Indiana vocational rehabilitation program as a case. In particular, the study evaluates PBC effectiveness from two perspectives: service outcome and participating organizations. From a service-outcome perspective, the research employs a quasi-experimental design to compare the impacts of two contract arrangements, PBC and fee-for-service (FFS), on individual employment outcomes. From a participating-organization perspective, the project conducts semi-structured interviews with service counselors and contractors. Triangulating these findings, this project proposes that PBC seems more promising than FFS in human services. It also suggests that PBC effectiveness might not be well-rounded and should not be exaggerated.

Further, the study addresses the managerial implications of the findings. The research and the practice of PBC tend to ignore the relational face of contracting.
PBC as a formal arrangement is always disturbed by the highly uncertain nature of human services and thus might result in incomplete performance improvement and contractor opportunism. If so, relational contracting, using informal and normative mechanisms, may enable desirable collaborative outcomes. The combination of formal PBC efforts with relational contracting would encourage high-quality results.

In sum, this project represents an attempt to systematically examine PBC effectiveness in human services. It shows the difficulties and dynamics of introducing performance management to human service contracting. It also warns that the launch of PBC systems should be deliberate and careful. More broadly, the project underscores two key components of contracting management: control and trust.

THE PERFORMANCE OF PERFORMANCE-BASED CONTRACTING IN HUMAN SERVICES

By Jiahuan Lu

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park, in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2014

Advisory Committee:
Professor Donald F. Kettl, Chair
Professor Philip Joyce
Professor Steven Rathgeb Smith
Professor Jocelyn Johnston
Professor Ellen Fabian

© Copyright by Jiahuan Lu 2014

Dedication

To those who helped me complete this dissertation.

Acknowledgements

To me, public administration is so fascinating and exciting that I feel lucky to be part of it. This dissertation marks the end of my doctoral study. The moment has prompted a great deal of reflection on my intellectual and personal journey over the last five years.

Over the years, I have accumulated very large intellectual debts. I owe a special debt to my advisor, Dr. Don Kettl. Dr. Kettl and I both came to the University of Maryland in 2009. Since then, I have benefited greatly from numerous discussions with him.
I know it would be very difficult for a Dean to work closely with doctoral students over five years, but Dr. Kettl did that. He is always insightful and supportive, answering my questions on public administration and fundamentally shaping my understanding of the field.

Dr. Steve Smith is always nice and patient. We had a conversation at Georgetown University in 2011, and Dr. Smith offered to serve on my committee. That conversation helped narrow down my research questions on performance-based service contracting and led to the dissertation project presented here. Since then, we have met periodically, and I have been fortunate to share his thoughts on nonprofit management and service contracting.

I am grateful indeed to the Department of Public Administration and Policy at American University for its support of my coursework. Dr. Bob Durant and Dr. Jocelyn Johnston allowed me to take two core doctoral-level courses, Proseminar in Public Administration and Seminar in Public Management, which deeply shaped my knowledge base in public administration and management. Dr. Johnston later became a member of my dissertation committee, which gave me the chance to draw on her expertise in social policy and government contracting.

I would also like to thank Dr. Phil Joyce and Dr. Ellen Fabian. Dr. Joyce provided excellent advice on structuring and editing the dissertation. Dr. Fabian was kind enough to serve on my committee as the Dean’s Representative and provided much guidance on vocational rehabilitation from both theoretical and practical perspectives.

In addition, throughout my dissertation process, so many people extended their help to me that I cannot mention them all in detail here. For example, program managers, area supervisors, and service counselors in both the Oklahoma and Indiana vocational rehabilitation agencies, such as Teri Egner, Theresa Koleszar, and Kristy Cook, were incredibly open to me.
We spoke by phone, and they were willing to share their experience with performance-based contracting without reservation. In addition, scholars from other fields, such as John McGrew, Grant Revell, and Lawrence Martin, kindly provided insights on performance-based contracting from their own perspectives at the very beginning of this project. All this support made this dissertation possible and urges me to continue exploring my fields of interest.

Table of Contents

Dedication
Acknowledgements
List of Tables
List of Figures
Chapter 1. The Rise of Performance-based Contracting
1.1 The Management Imperative under the Contracting Regime
1.2 Performance-based Contracting as a New Experiment
Chapter 2. Burgeoning Popularity, Different Models, and Elusive Effectiveness
2.1 The Evolution of PBC at Federal Level
2.2 Popularity at State and Local Levels
2.3 Important but Missing Links
Chapter 3. Theoretical Framework
3.1 Formal Contract Design: A Principal-agent Perspective
3.2 Formal Contract Design for Human Services
3.3 Informal Contract Design: A Relational Contracting Perspective
Chapter 4. Vocational Rehabilitation as a Policy Field
4.1 Vocational Rehabilitation Programs
4.2 The Purchase of Job-related Services
4.3 PBC Models in VR Services
4.4 The Use of PBC in VR Services
Chapter 5. The Effectiveness of PBC: A Service Outcome Perspective
5.1 Introduction
5.2 Research Design
5.3 Interrupted Time Series with a Control Group Design
5.4 Propensity Score Matching
5.5 Difference-in-difference Regressions
5.6 Conclusion
Chapter 6. The Effectiveness of PBC: Government and Contractor Perspectives
6.1 Introduction
6.2 Street-level Perspective in Policy Analysis
6.3 Vocational Rehabilitation Context and Data Collection
6.4 Findings from VR Agency Perspective
6.5 Findings from Contractor Perspective
6.6 Conclusion
Chapter 7. Two Faces of Contracting, Two Kinds of Control
7.1 The Effectiveness of PBC as a Formal Arrangement
7.2 Managerial Implications from a Relational Contracting Perspective
7.3 Conclusion: Control, Trust, and Contracting Management
Bibliography

List of Tables

Table 1. A Brief History of PBC in the Federal Government
Table 2. PBC in Maine Substance Abuse Treatment Services
Table 3. PBC in Delaware Alcohol and Other Drug Treatment Programs
Table 4. State Use of PBCs in 2009
Table 5. PBC in Selected State Child Welfare Agencies
Table 6. Selected Studies on PBC Effectiveness in Human Services
Table 7. The Determinants of Organizational Control Strategies
Table 8. Contract Type for Human Service Contracting
Table 9. Comparisons between Formal and Relational Contracting
Table 10. Components of VR Services and Contract Type
Table 11. Oklahoma Milestone Payment System
Table 12. New York State Milestone Payment System
Table 13. Indiana Result-based Funding System
Table 14. Description of Matching Variables
Table 15. Covariate Balance Check Before and After Matching
Table 16. Logistic Regression Model Predicting Likelihood of Employment for Service Recipients
Table 17. OLS Regression Models Analyzing Employment Outcomes
Table 18. Distribution of Interview Samples
Table 19. Mode of Trust Production and Implications for PBC

List of Figures

Figure 1. A System Framework of Service Process
Figure 2. Performance Reform Hierarchy
Figure 3. The Determinants of Contract Type
Figure 4. The Contractual Relationship in the Purchase of Job-Related Services
Figure 5. Analytical Framework for Network Effectiveness
Figure 6. Interrupted Time Series with a Nonequivalent Control Group Design

Chapter 1. The Rise of Performance-based Contracting

1.1 The Management Imperative under the Contracting Regime

In Preface to Public Administration, Stillman (1991) delineates the “stateless” origins of American public administration. He argues that a systematic design or theory of public administration was absent at the founding of the United States. “America’s ‘missing state’ at its inception,” Stillman (1991) believes, “fundamentally shapes our way of thinking about, as well as doing, public administration today” (19). Further, to prevent abuse of public power, private power was relied on as much as possible. Accordingly, America has an ingrained tradition of using private power, in addition to public administrative capacity, to solve public problems (Kingdon, 1999). Government contracting, the most common type of privatization (Savas, 1987), is thus so widely and durably used as a government tool that it has become a remarkable feature of the American governance system. At all levels of government, the massive use of contracting to provide public goods and services and achieve policy priorities is common practice. Accordingly, “government by proxy” (Kettl, 1988), “hollow state” (Milward & Provan, 2000), and many other labels have been attached to the public administration narrative.
In recent years, government contracting has become more dynamic, including not only contracting out but also contracting back-in. Decades of contracting-out experience have rationalized governments’ contracting-out decisions. The insufficiency of contracting management and monitoring capacities has given rise to “reverse contracting,” a return from third-party delivery to in-house delivery (Hefetz & Warner, 2004). Even so, there is still no evidence of an ebb in contracting. Virtually every government is dependent on contracts to a varying degree. At the federal level, according to USAspending.gov, more than one third of federal spending on average fell under the titles of “Contracts” and “Grants” from fiscal years 2000 to 2010.¹ At the local level, the scope of contracting is equally prominent. Approximately 45.5% of local government services were delivered through contracting in 2007 (Warner & Hefetz, 2009). A recent survey of U.S. local government managers shows that 93% of municipal officials support government contracting (Girth & Johnston, 2011).

¹ Data come from www.usaspending.gov (accessed on January 30, 2012).

The field of human services is an indispensable component of government contracting. In fact, the use of government contracting in human services began much earlier than contracting for other goods and services in the United States. Its historic roots date back to the colonial period (Smith & Lipsky, 1993). Today, governments at every level do very little directly by themselves in human service provision. Rather, they fund third-party actors through government contracts to provide services (Salamon, 1995). Among them, nonprofits deliver a large share of government-funded human services. Across the United States, for example, 56.3% of homeless shelters, 35.9% of drug and alcohol treatment programs, and 32.8% of day care facilities are run by nonprofits in local communities (Warner & Hefetz, 2009). In 2009, governments at all levels contracted with 33,000 human service nonprofits for approximately 200,000 contracts and grants worth over $100 billion (Boris, de Leon, Roeger, & Nikolova, 2010). Such an extensive government-nonprofit partnership characterizes the U.S. human service delivery system, termed the “contracting regime” by Smith and Lipsky (1993) and “third-party government” by Salamon (1987).

The significant explosion of contracting fundamentally reshapes the features and business of government. In a political sense, contractors constitute an important pillar of American institutions in serving democratic governance and citizenship (Smith & Lipsky, 1993; Cooper, 2003). The policy goals and missions of grand federal and state programs now depend on contractors for their representation and realization. Contractors thus act as a critical buffer between the state and citizens. In a managerial sense, since government programs are dependent on contract operation, the performance of government turns out to be largely contingent on contractors (Frederickson & Frederickson, 2006; Kettl, 2002). In short, sound contracting performance will not only directly improve government performance, but also indirectly improve democratic governance (Behn, 2002).

This further raises the critical issue of contracting management. “It makes no sense to speak of effective public policy or of professional public management, or even informed citizenship, without an awareness of the nature and operation of public contract management” (Cooper, 2003, 12). Indeed, contracting is not a self-enforcing panacea. Government’s retreat from human service delivery and reliance on contract operation, whether aimed at efficiency improvement, “load-shedding,” or both, does not simply eliminate the government’s role.
Although contractors provide various services to citizens as proxies of the state, government continues to bear the responsibility for satisfactory service delivery. The explosion of contracting actually calls for a different government role. Using Osborne and Gaebler’s (1992) metaphor, government now should be “steering not rowing.” The governance-by-contracting mode gives prominence to contracting management capacity, i.e., whether government is able to act as a “smart buyer” throughout the contracting process (Kettl, 1993). “The most fundamental problem with the current system,” as Kelman (2002) suggests, “is that it insufficiently recognizes contract administration as in the first instance a management function” (93). Given the large scope of contracting employed by governments, he further argues that “the ability to manage contracting must be considered a core competency of the organization” (89).

However, managing indirect government tools is very different from managing the production of goods and services within traditional government bureaucracies. A central puzzle for public managers in contracting management, as Kettl (2002) summarizes, is that “[t]hey are responsible for ensuring high-quality results in programs that they do not directly control” (493). The reliance on contracting in public management represents a significant shift away from a vertical, authority-based model to a horizontal, negotiation-driven model (Cooper, 2003). When government directly delivers services, there is a clear chain of command within the government domain, and all managerial behaviors are based on hierarchical authority. However, when various indirect government tools are introduced into the governance system, such an authority relationship is absent. All the relationships underlying indirect government tools are now based on voluntary market exchange.
“The basic administrative problem of indirect government thus is developing effective management mechanisms [such as bargaining and incentive system] to replace command and control” (Kettl, 2002, 491). Therefore, the managerial responsibility turns “to arrange networks rather than to carry out the traditional task of government, which is to manage hierarchies” (Milward & Provan, 2000, 362). Effective contracting management requires public managers’ sensible answers to the questions of “what to buy, who to buy it from, and what it has bought” (Kettl, 1993, 180). Accordingly, it calls for “personnel with contract-management experience, policy expertise, negotiation, bargaining, and mediation skills, oversight and program audit capabilities, and the necessary communication and political skills to manage programs with third parties in a complex political environment” (Van Slyke, 2003).

However, in contrast to the ubiquitous use of contracting and the critical role of contracting management within it stands the finding that contracting management capacity is often insufficient, which “create[s] serious public management and accountability problems for which public administration theory fails to prepare us” (Salamon, 1989, 11). For example, Van Slyke (2003) finds a serious capacity shortage in social service contracting management in New York State, as demonstrated by the loss of contract management expertise and institutional memory and by capacity constraints. Smith and Smyth’s (1996) study of substance abuse service contracting in North Carolina shows that limited administrative resources (personnel and budget) undermine contracting management capacity, and that program evaluations are often difficult. Examining contracting management in local governments from 1997 to 2007, Joaquin and Greitens (2012) observe a significant decline in management capacity in agenda setting, formulation, and implementation.
This decline is even more significant as local governments contract out more complex goods and services. “The poor management of service contracts,” the U.S. Government Accountability Office (GAO) (2001) concludes, “undermines the government’s ability to obtain good value for the money spent” (5). The deficit of management capacity in the use of contracting would create substantial uncertainty in aligning the private market with the public interest (Kettl, 1993). “Any uncertainty surrounding the relation between market means and public ends, any range of discretion or ambiguity,” as Donahue and Nye (2002) argue, “will result, … in effort gravitating toward the focus of intensity (private interest)” (7-8).

In short, contracting management is a demanding and distinct craft. To address this challenge, public management scholarship in the last two decades has been marked by a surge of exploration of various capacity-building mechanisms, such as rationalizing make-or-buy decisions by balancing contracting out and contracting back-in (Brudney, Fernandez, Ryu, & Wright, 2005; Hefetz & Warner, 2004; Johnston & Romzek, 1999), managing thin service markets to stimulate and maintain competition (Brown & Potoski, 2004; Johnston & Girth, 2012; Warner & Hefetz, 2008), relying more on relational contracting to supplement formal contracts (Bertelli & Smith, 2010; Van Slyke, 2007), designing appropriate monitoring tools to tailor contractor incentives and ownership (Amirkhanyan, 2010; Lambright, 2009), improving contract design (Kim & Brown, 2012), and discovering new accountability mechanisms (Romzek & Johnston, 2005; Romzek, LeRoux, & Blackmar, 2012). In this context, performance-based contracting (PBC), which incorporates performance incentives into contract specification and compensation, has come onto the agenda of public managers at all levels of government.
1.2 Performance-based Contracting as a New Experiment

Currently, performance-based contracting (PBC) is enjoying widespread popularity and acclaim as a preferred contracting approach in government acquisition of a variety of goods and services. PBC may also be referred to as result-based contracting, performance-based acquisition, or result-based funding in different contexts. Despite this burgeoning popularity, the meaning of PBC is still quite elusive. Martin (1999) thinks PBC “focuses on the outputs, quality and outcomes of service provision and may tie at least a portion of a contractor’s payment as well as any contract extension or renewal to their achievement” (1). Cooper (2003) considers that PBC should “include incentive and penalty clauses that provide benchmarks to assess performance as well as mechanisms to encourage contractors to exceed those minimum levels and to do so at a lower cost than that absolutely required under the contract” (98). The Federal Acquisition Regulation (FAR) provides a more technical definition: “Performance-based contracts (a) describe the requirements in terms of results required rather than the methods of performance of the work; (b) use measurable performance standards (i.e., terms of quality, timeliness, quantity, etc.) and quality assurance surveillance plans; (c) specify procedures for reductions of fee or for reductions to the price of a fixed-price contract when services are not performed or do not meet contract requirements; and (d) include performance incentives where appropriate” (FAR 37.601). In this way, PBC could “ensure that required performance quality levels are achieved and that total payment is related to the degree that services performed meet contract standards” (FAR 37.601).
In this study, PBC is defined loosely as an “umbrella” term: PBC incorporates performance measures in contract specifications and makes contract compensation (such as payment, extension, and renewal) fully or partially contingent on performance achievements. As suggested by FAR 37, a performance-based contract may include (1) a performance-based work statement, which specifies the work in a quantifiable and measurable way; (2) measurable performance standards in terms of quality, quantity, and timeliness; (3) methods for measuring the contractor’s performance against the performance standards; and (4) performance incentives tied to the performance standards (FAR 37.6). Based on this, public agencies have designed a variety of PBC models. The variance between different models largely centers on five dimensions: (1) payment schedule, (2) the extent to which incentives/disincentives are used, (3) frequency of performance reporting, (4) the extent to which providers are involved in performance indicator development, and (5) level of financial risk assumed by contractors.

Traditionally, services are procured using a fee-for-service (FFS) contracting approach. This contracting method specifies standards for inputs and the delivery process, such as the amount of time and labor required and the detailed procedures to be followed in delivering services. When services are delivered, contractors are reimbursed based on units of service delivered throughout the service process. Compared with FFS, PBC conceptually represents several substantial changes in the landscape of service contracting.

First, PBC changes the contract specification method from a design specification (focusing on input and process) to a performance specification (focusing on output, quality, and outcome) (see Figure 1) (Martin, 2005).
Under PBC, public managers clearly specify the desired end results that service contractors should achieve, while leaving contractors considerable flexibility and freedom to prescribe service methods and use of funds to accomplish those goals. By making contract compensation contingent on performance achievement, public managers contract for service outcomes, no longer for services per se. This change in contract specification presents new challenges for public managers. Under FFS, public managers are responsible for specifying the standards on inputs and the details of the service process, ensuring the delivery of promised services. Under PBC, public managers are expected to specify outcomes, design incentives, and evaluate outcomes, leaving contractors to produce the desired results.

Figure 1. A System Framework of Service Process. Source: Martin (2005).

Second, like other performance management strategies, PBC implies a change in accountability mechanisms, with increasing attention to accountability for results. This, in Al Gore’s (1993) words, represents “a fundamental shift in the system of accountability … from one oriented around accountability for processes and inputs to one that measures performance and is accountable for results actually achieved” (17). Embedded in a web of competing legitimate expectations, PBC represents a switch away from a hierarchical accountability with input and process orientations toward a professional accountability that allows for the exercise of professional discretion and expertise to achieve targeted results (Romzek, 2000).
[Figure 1 elements. Design specifications: Inputs (staff, facilities, equipment, supplies, materials, funding, service recipients); Process (service definition, statements of work). Performance specifications: Outputs (measures of service volume, units of service); Quality (timeliness, reliability, conformity, tangibles, other dimensions); Outcomes (results, impacts, accomplishment).]

Generally, PBC is expected to promote better service outcomes. By attaching contract compensation to performance achievements, PBC draws contractors’ attention toward the results of service delivery and away from service delivery per se. The discussion of performance management is always based on the notion of “what gets measured gets done”: when people are given clearly measured targets, they pay sufficient attention to achieving them. Following this line of reasoning, PBC would encourage service performance improvement. Further, as contractors are given much freedom to prescribe services in the service process, the amount of administrative reporting and paperwork required by public agencies is greatly reduced. As such, contractors are believed to devote more time and energy to designing quality and innovative services to match client needs, which again enhances service outcomes. Taken together, these two effects mean that PBC promises greater government acquisition efficiency, i.e., doing more with less. Under PBC, only contractor efforts that result in desired outcomes are reimbursed, which maximizes the productivity of administrative resources. Less government monitoring also reduces administrative costs substantially.

In essence, PBC stands for a marriage of service contracting with performance management, two prevalent managerial tools in the contemporary public administration narrative. On one hand, as mentioned earlier, service contracting has been a common and desired practice at virtually all levels of government.
Today, governments collaborate heavily with third-party nongovernmental actors to deliver various services through publicly funded contracts and grants. However, along with the widespread use of contracting, contracting management is often found to be problematic. In this vein, PBC, by introducing performance measures into contracting management, can be seen as an endeavor to help address this challenge.

On the other hand, PBC is an extension of government performance management strategy. Although performance measurement in government management appeared as early as the beginning of the twentieth century (Williams, 2003), the popularity of “performance” in public administration discourse is largely due to the Reinventing Government movement in the early 1990s (Kettl, 2005; Radin, 2006). The Government Performance and Results Act (GPRA), drawing government attention on federal programs away from rules and process and toward results, service quality, and customer satisfaction, became the prelude to the nationwide performance movement. Gradually, governments at all levels started to adopt performance measures in resource allocation and program management and to establish a variety of pay-for-performance systems to align budgetary and managerial decisions with performance achievements (e.g., Behn, 2003; Hatry, 2006; Heinrich, 2002; Joyce, 1993; Kravchuk & Schack, 1996).

At the outset, performance management activities were mostly run within the domain of government organizations. However, as public administration evolves, more and more indirect government tools (e.g., contracts, grants) are introduced into the governance system (Salamon, 2002). Public administration today is no longer a tale of government, but more of governance (Kettl, 2002). This implies that government performance depends not only on direct government tools, but also on indirect ones.
As Frederickson and Frederickson (2006) show, the strength of an agency’s performance nowadays is deeply embedded in the characteristics of third-party grantees and contractors. As the nationwide performance movement continues to reshape and redefine the structure and process of public administration, performance elements have inevitably expanded to the management of indirect government tools, forming a relatively comprehensive government performance management system. PBC thus becomes an indispensable part of that system (see Figure 2).

Figure 2. Performance Reform Hierarchy
[Figure labels: Performance Management; Performance-based Budgeting; Performance-based Contracting; Performance (Inputs, Activities, Outputs, Outcomes); Performance Measurement; Improvement in Public Service Performance]
Source: Smith & Grinker (2004).

As said, the extensive use of government contracting as an indirect government tool to deliver products and achieve policy goals has fundamentally redefined the U.S. governance system, in both political and managerial senses. Managing the contracting process to ensure high-quality results has therefore become an imperative challenge. Unfortunately, the public management literature has documented that public managers at all levels of government fail to address this challenge effectively. In response, public management scholarship and practice in recent decades have explored effective contracting management strategies extensively. Inspired by the performance management movement, PBC represents one of the most recent efforts. By attaching contract compensation to contract performance, rather than to the delivery of service per se, PBC promises better outcomes, lower service costs, and less administrative monitoring. Given these potential benefits, PBC is currently very popular in a variety of service areas and is advocated by different levels of government.
However, PBC is not a brand-new managerial tool; its historical roots date back about two decades. Moreover, even with the historical evolution of PBC in mind, the documented evidence on PBC effectiveness is still unclear. These are the topics of the next chapter.

Chapter 2. Burgeoning Popularity, Different Models, and Elusive Effectiveness

2.1 The Evolution of PBC at Federal Level

Federal agencies have used PBC to varying degrees for acquiring a wide range of goods and services. Although PBC has been referred to in government regulations, guidance, and policies for about two decades, the historical roots of PBC in the federal government date back even earlier. For example, in the early 1970s, the Office of Economic Opportunity (OEO) in the Department of Health, Education, and Welfare attempted to introduce PBC in educational services. Some school districts contracted out a portion of their instructional activities to private companies and tied contract payment to the extent to which contractors helped students learn (Gramlich & Koshel, 1975; Mecklenburger, 1972; Levine, American Educational Research, & American Association of School, 1972). The results of the initiative were quite mixed, and problems arose in the implementation process. Participating organizations failed to reach consensus on several important questions, such as the validity of standardized tests as achievement measures and what should be measured. The effort to introduce PBC to educational services in this experiment was soon dropped.

Despite the early trial, federal implementation of PBC was not fully pursued until Congress and the Office of Management and Budget (OMB) expressed enough enthusiasm. Overall, the exploration of PBC in the federal government formally began in the 1990s, marked by the appearance of the Office of Federal Procurement Policy’s (OFPP) Policy Letter 91-2 on Service Contracting.
The policy letter stated that PBC “enhances the Government’s ability to acquire services of the requisite quality and to ensure adequate contractor performance,” and advocated that all federal agencies should “use performance based contracting methods to the maximum extent practicable when acquiring services”. In 1994, OMB initiated a governmentwide pilot project to encourage the use of PBC in federal agencies. In 1997, the Federal Acquisition Circular 97-01 amended the FAR to implement OFPP Policy Letter 91-2 and confirmed the policy that PBC should be used as the preferred service acquisition method (FAR 37.102). The FAR currently establishes a policy that federal agencies use PBC to the maximum extent practicable for service acquisition.

This preference for PBC has persisted in recent years. In fiscal year 2001, federal agencies reported $28.6 billion in PBC use, 21% of the total obligations ($135.8 billion) incurred for services (GAO, 2002). The Services Acquisition Reform Act of 2003 also lent strong support to PBC. In fiscal years 2005-2007, federal agencies were required to apply PBC to 40% of eligible service actions, including contracts, task orders, modifications, and options. In fiscal year 2008, they were encouraged to expand their PBC efforts to 50% of eligible service actions. OFPP also mandated that federal agencies submit performance-based acquisition agency-wide management plans for fiscal years 2007-2011, outlining their progress and plans in applying PBC to eligible service contracts. Table 1 provides a brief roadmap of the historical evolution of PBC use in the federal government.

Table 1. A Brief History of PBC in the Federal Government
Year | Federal Agency/Act | Document
1980 | OFPP | A Guide for Writing and Administering Performance Statements of Work for Service Contracts
1991 | OFPP | Policy Letter 91-2
1993 | Government Performance and Results Act |
1994 | OFPP | Performance-Based Service Contracting Pledge
1997 | OFPP | Memo on “Performance-Based Service Contracting Checklist”
1997 | FAC (Federal Acquisition Circular) | FAC 97-01
1998 | OFPP | Report on Performance-Based Service Contracting Pilot Project
1998 | OFPP | Best Practices for Performance-Based Service Contracting
2000 | National Defense Authorization Act FY 2001 | Statutory Preference for Performance-Based Service Contracting
2001 | FAC | FAC 97-25
2001 | OMB | Memo on “Performance Goals and Management Initiatives for FY 2002 Budget”
2002 | FAC | FAC 2001-07
2002 | GAO | Report on “Guidance Needed for Using Performance Based Service Contracting”
2003 | OFPP | Report on “Performance-Based Service Acquisition: Contracting for the Future”
2004 | OFPP | Memo on “Increasing the Use of Performance-Based Service Acquisition”
2006 | OFPP | Memo on “Use of Performance-Based Acquisitions”
2007 | OFPP | Memo on “Using Performance-Based Acquisition to Meet Program Needs – Performance Goals, Guidance, and Training”
2007 | OFPP | Memo on “Fiscal Year 2008 Performance-Based Acquisition Performance Goal”

2.2 Popularity at State and Local Levels

State and local governments have also shown growing interest in using PBC in the purchase of goods and services. Almost every state has introduced PBC in its acquisition efforts to some extent. Although there is no uniform state and local effort that responds to the federal initiatives, state and local explorations of PBC are much more dynamic and diverse. For example, Washington State issued Executive Order 10-07 on Performance-based Contracting in 2010, advocating the use of PBC.
It directs all state agencies to (1) require that new contracts for products and services meet performance-based contracting standards, (2) review existing contracts prior to renewal and update them as necessary to reflect performance-based contracting standards, and (3) ensure that performance-based contracts are actively managed to meet performance-based standards.

Particularly in human services, the interest in PBC is expanding rapidly. In Maine, state statutes mandate the use of PBC in all human service contracting (Me. Rev. Stat. Ann. tit. 22, § 214). In California, eight of the nine counties in Southern California are using PBC in services such as employment training, aging and adult, and juvenile services, with most tying contract payment to a set of defined service outcome milestones (Daly, Tucker-Tatlow, & Gibson, 2004). New York City is demonstrating a growing commitment to PBC in its human service contracts. Most human service contracts there already include performance indicators and link contract payment or renewal to contractor achievement on these indicators (Krauskopf, 2008).

Although the detailed PBC designs in these states may vary, the motivations behind the injection of PBC into state service acquisition efforts are basically the same: to help align human service systems’ focus on outcomes with how services are financed. By restructuring contract specifications and compensation, human service agencies bind contractors to their service outcomes and maximize service acquisition efficiency.

Overall, PBC is widely used in four human service areas: substance abuse treatment, child welfare, mental health, and employment training. The following discussion provides some documented evidence of PBC use in these fields. Although the survey here cannot be exhaustive, it does give a sense of the current status of PBC in human service provision.

Substance Abuse Treatment

The U.S.
Institute of Medicine has advocated the use of performance measures in payment systems to promote quality improvement in treatment services since the early 1990s (Institute of Medicine, 1990). The institute reiterated this suggestion in a number of its later reports (Institute of Medicine, 2001; 2006). So far, at least two states have formally responded to this call, and their practices have been well documented.

Maine was the first state to include PBC in its purchase of addiction treatment services. In 1992, the Maine Office of Substance Abuse launched a PBC system to finance all publicly funded substance abuse treatment services (Commons et al., 1997). Under the PBC system, all programs were evaluated on post-treatment patient indicators within three categories: effectiveness measures (the minimum percentage of discharged clients who had achieved certain outcomes, such as abstinence and employment), efficiency measures (the units of treatment that providers had to deliver, such as number of clients served and number of services per client), and special populations (the targeted percentage of difficult clients, such as homeless people and youths). Each contract within the system specified a minimum standard on each indicator that a contractor had to satisfy. Contractors who failed to meet the minimum expectations might incur corrective actions and financial penalties.

Table 2 PBC in Maine Substance Abuse Treatment Services
Indicator | Outpatient | Residential Rehabilitation | Detoxification
Efficiency Standards
Minimum service delivery (percent of contracted amount) | 90% | 80% | 70%
Minimum service delivery to primary clients (percent of total units delivered) | 70% | N.A. | N.A.
Number to be met | 2 of 2 | 1 of 1 | 1 of 1
Effectiveness Standards
Abstinence/drug free 30 days prior to termination | 70% | 85% | N.A.
Reduction of use of primary substance abuse problem | 60% | 85% | N.A.
Maintaining employment | 90% | 90% | N.A.
Employment improvement | 30% | 5% | N.A.
Employability | 3% | 3% | N.A.
Reduction in number of problems with employer | 70% | N.A. | N.A.
Reduction in absenteeism | 50% | N.A. | N.A.
Not arrested for OUI offense during treatment | 70% | N.A. | N.A.
Not arrested for any offense | 95% | N.A. | N.A.
Participation in self-help during treatment | 40% | 80% | N.A.
Reduction of problems with spouse/significant other | 65% | 60% | N.A.
Reduction of problems with family members | 65% | 60% | N.A.
Referral in continuum of care | N.A. | 90% | 45%
Referral to self-help | N.A. | N.A. | 20%
Time in treatment | N.A. | N.A. | 4 days
Number to be met | 8 of 12 | 5 of 9 | 2 of 3
Special Populations Standards
Females | 30% | 40% | 14%
Age: 0-19 | 10% | 4% | 1%
Age: 50+ | 6% | 5% | 12%
Corrections | 25% | 10% | 2%
Homeless | 1% | 1% | 20%
Concurrent psychological problems | 8% | 3% | 11%
History of IV drug use | 12% | 15% | 27%
Poly-drug use | 35% | 40% | 28%
Number to be met | 5 of 8 | 5 of 8 | 5 of 8
Note: 1. Percentages are the minimum percent of total clients that must meet the indicator for the program to be deemed to have met that indicator. 2. N.A. means that programs offering the treatment modality are not required to meet the indicator. 3. Number to be met is the number of indicators the program must meet to be deemed to have performed in that category.
Source: Commons et al. (1997).

More recently, the Delaware Division of Substance Abuse and Mental Health changed its contracting method in alcohol and other drug treatment programs from an FFS basis to a PBC basis in 2001. Under PBC, contractors were paid monthly based on three performance measures: utilization of treatment capacity, client participation in treatment, and client treatment completion (McLellan, Kemp, Brooks, & Carise, 2008; Stewart, Horgan, Garnick, Ritter, & McLellan, 2013). Table 3 shows the performance measures and payment schedule for the first two indicators.
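The Delaware payment schedule for these first two indicators can be made concrete with a short sketch. The tier values are copied from Table 3 (Stewart et al., 2013); everything else is an assumption made for illustration only: the function and variable names are hypothetical, the top tier is read as "at or above" the listed rate, fractional rates fall into the tier whose floor they reach, and the table does not say what is paid below the lowest tier, so the sketch returns None there rather than guess. It is not the state's actual payment algorithm.

```python
from typing import Optional

# (minimum utilization %, payment as % of contract amount), from Table 3.
# Reading the top tier ("80%" / "90%") as "at or above" is an assumption.
CAPACITY_TIERS = {
    "2001-2002": [(80, 100), (70, 90), (60, 70), (50, 50)],
    "2003-2007": [(90, 100), (80, 90), (70, 70), (60, 50)],
}


def capacity_payment_pct(utilization_pct: float, period: str) -> Optional[int]:
    """Return the capacity-utilization payment as a percent of the contract amount.

    Returns None below the lowest listed tier, because the source table
    does not state the payment in that range.
    """
    for floor, payment in CAPACITY_TIERS[period]:
        if utilization_pct >= floor:
            return payment
    return None


def participation_payment_pct(phases_met: int, capacity_requirement_met: bool) -> int:
    """Return the participation payment as a percent of the contract amount.

    Table 3 pays 1% for each of the four phase targets met, plus an
    additional 1% when all four are met, and makes these payments
    conditional on the capacity utilization requirement. Whether that
    requirement was met is passed in by the caller (hypothetical flag).
    """
    if not 0 <= phases_met <= 4:
        raise ValueError("phases_met must be between 0 and 4")
    if not capacity_requirement_met:
        return 0
    return phases_met + (1 if phases_met == 4 else 0)
```

Under these assumptions, a program running at 75% utilization would be paid 90% of its contract amount under the 2001-2002 targets but only 70% under the tighter 2003-2007 targets, which shows how raising the target rates alone changes the incentive.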
In addition, providers who help clients complete treatment (i.e., active participation in treatment for a minimum of 60 days, achievement of treatment goals, and a minimum of 4 consecutive weeks free from alcohol and illegal drugs) may receive a $100 bonus per client.

Table 3 PBC in Delaware Alcohol and Other Drug Treatment Programs
Program Capacity Utilization
Target rate 2001-2002 | Target rate 2003-2007 | Payment: % of contract amount
80% | 90% | 100
70%-79% | 80%-89% | 90
60%-69% | 70%-79% | 70
50%-59% | 60%-69% | 50
Treatment Participation Requirements
Client treatment phase | Client treatment participation requirement | % clients required to meet target | Payment: % of contract amount
Phase 1 | 2 visits/week | 50 | 1
Phase 2 | 4 visits/month | 60 | 1
Phase 3 | 4 visits/month | 70 | 1
Phase 4 | 2 visits/month | 80 | 1
Note: Treatment participation payments are conditional on achieving the capacity utilization requirement. An additional 1% payment is made when the program meets all four participation targets.
Source: Stewart et al. (2013).

Child Welfare

Child welfare might be the area where PBC enjoys the most attention and praise. The traditional fee-for-child contracting was found to undermine permanency: once a child welfare issue has been resolved and a child has been discharged, a contractor faces revenue loss unless a new child is referred. Thus, contractors may be inclined to keep children in care rather than move them toward permanency. Since the 1990s, child welfare agencies have experimented with PBC to purchase a variety of services, such as adoption, foster care case management, in-home services, and residential care. In fiscal year 2008-2009, 24 states reported that their lead agencies included in service contracts benchmarks or indicators to measure service accessibility, timeliness, and service delivery efficiency (U.S. Child Care Bureau, 2008).
Within the same time period, the Quality Improvement Center on the Privatization of Child Welfare Services found that 14 states had service contracts that directly connected contract payment to performance and that 11 states would consider contractors’ performance achievement when making future funding decisions. In 2005, the Children’s Bureau at the Department of Health and Human Services funded a project to test the use of PBC in child welfare services in Florida, Illinois, and Missouri.

Table 4 State Use of PBCs in 2009
 | Operational Definition | States | Number
PBCs link contractor payment to performance | States with at least one PBC that links payment to performance, most commonly in the way of service or client outcomes | AZ, FL, IA, ID, IL, MI, MN, MO, NC, ND, NE, NM, TN, WY | 14
PBCs inform contract renewal decisions | States using performance measures in contracts primarily to gauge contract renewal decisions | AK, AR, CA, CO, CT, IN, LA, OH, OR, WA, WI | 11
Source: The Quality Improvement Center on the Privatization of Child Welfare Services, 2009.

However, detailed PBC designs vary significantly across states in terms of performance measures, payment structures, and other dimensions. Table 5 provides a snapshot of the current state of PBC use in some states.
Table 5 PBC in Selected State Child Welfare Agencies
State | Contracted services | Geographic coverage | PBC initiated | Selected performance measures
FL | Foster care | Judicial circuit 5 | 2007 | Earlier and more accurate data entry into state’s administrative system; increased contracts with biological parents; improved rates of maintained permanency of children
IA | Resource family recruitment | Statewide | 2007 | Sufficient pool of foster and adoptive homes; children matched with appropriate foster homes in a timely manner; safety in foster and adoption care
IL | Foster care case management | Statewide | 1998 | Child safety (e.g., # of reports of abuse/neglect); child well-being (e.g., placement of siblings, placement within community); child permanency (e.g., average length of stay in care, placement disruption)
IL | Residential care and treatment | Statewide | 2008 | Sustained favorable discharge rate; treatment opportunity days rate
IL | Independent living and transitional living programs | Statewide | 2009 | Discharge potential rate with indicators of self-sufficiency; transitional living placement stability rate
MO | Foster care and adoption case management | Three regions | 2005 | Reduced reentry into foster care; increased stability; increased permanency
NM | Adoptive and foster home licensing | Statewide | 2008 | Home studies completed in a timely manner
TN | Foster care case management | Statewide | 2007 | Average care days; proportion of placements exiting to permanency
WY | Residential treatment | Statewide | 2006 | Reduced length of stay
Source: Child Care Bureau. (2009).

Mental Health

Mental health is one of the pioneering human service areas that experimented with PBC. As early as the 1970s, state mental health agencies tentatively introduced performance measures into their service acquisition efforts. Wisconsin was the first state to initiate PBC with localities for mental health care in 1973.
In Wisconsin, each local mental health authority (LMHA) received a fixed budget from the state for community treatment and for state hospital treatment. The LMHA was responsible for all costs incurred in the provision of services to its population. Community care costs were borne directly by the LMHA, either through its own provision of services or through the costs of contracts for the provision of services. The LMHA was charged for state hospital use at per unit cost. Since the LMHA received a fixed amount from the state, it received a bonus if its usage fell below this target and a penalty for usage above the target (Chapin & Fetter, 2002; Gaynor, 1990). Michigan also adopted this model later.

In fiscal years 1978-1979, the Division of Mental Health in Colorado introduced PBC into its mental health system (Glover & Berger, 1989; Miller & Wilson, 1981). When contracting with community mental health centers for mental health services, the state agency included several categories of performance indicators (such as number of admissions by age group, regular reporting of the pre- and post-outcome on all clients, number of severely disabled to be served, and contractor’s accomplishment in its Affirmative Action Plan) in its service contracts. At the end of the contract year, contractors had to report their achievement in these categories. A failure to serve 93% of the categorical quotas might result in a 5-7% reduction in contract funding for the next year.

The Philadelphia mental health residential system started its PBC experiment in the late 1990s, aiming to raise low occupancy rates and prioritize access to residential care for persons with the greatest needs. Before that, service contractors were compensated based on the availability of residential beds. This contracting method was found to discourage the efficient use of resources and lead to chronically low occupancy levels.
In 1998, the Occupancy Based Reimbursement system was launched, directly tying occupancy performance to financial incentives and sanctions. Service contractors were required to maintain an annualized occupancy rate of at least 86% to avoid financial sanctions (not exceeding the equivalent of 3% of the program’s yearly costs). Programs that maintained annualized occupancy levels of 93% or higher could receive incentive funds (not exceeding the equivalent of 3% of the program’s yearly costs). Starting in 2004, client outcome measures (e.g., graduation and hospitalization rates) were introduced into the PBC system (Faith et al., 2010).

Employment Training

Employment training programs also have a very long history of using PBC. For example, employment programs funded by the Job Training Partnership Act (JTPA) included client-level performance measures in their contracts and made funding decisions based on performance achievements (Barnow, 2000; Heckman, Heinrich, & Smith, 2003). The Workforce Investment Act (WIA), JTPA’s successor as the primary federal training program, adopted an expanded version of the JTPA performance system. In addition, Wisconsin moved its Wisconsin Works (W-2) contracts from a cost-reimbursement basis to a PBC basis in 1997, tying contract payment to measured performance. In this PBC system, detailed performance measures changed over time. Contractors who failed to meet basic performance standards might lose future contracts, while capable contractors would enjoy profits or bonuses (Heinrich & Choi, 2007).

In particular, employment services for disabled people within state vocational rehabilitation programs are increasingly using PBC in the purchase of various services from contractors. Since Oklahoma designed and used the milestone payment system (one version of PBC) in the 1990s, many other states such as Alabama, Indiana, Massachusetts, and New York have followed its lead (O’Brien & Revell, 2005).
The details of state vocational rehabilitation programs and their PBC models in employment service contracting will be presented in depth in chapter four.

2.3 Important but Missing Links

Despite the burgeoning popularity of PBC in purchasing human services, there is not much documented evidence on the effectiveness of PBC. Specifically, two critical issues related to the use of PBC in human services remain unclear: (1) whether PBC produces better results than fee-for-service contracting (the effectiveness problem), and (2) if so, under what conditions, or how to use or implement PBC (the capacity problem).

Effectiveness Problem

Ironically, to date, there is still little systematic empirical evidence that PBC actually leads to performance improvement in human services. The current prevalence of PBC in the purchase of human services is largely driven by the theoretical reasoning behind PBC and the fashion for PBC in other fields. The theoretical reasoning is tempting: attaching contract compensation to service outcome measures could motivate better outcomes, and empowering contractors could encourage innovative and quality services. The effectiveness of PBC in other fields such as energy further makes PBC attractive to human service agencies. For example, the federal government conducted a performance-based service contracting pilot project in 1998 and found a 15% decrease in contract prices and an 18% improvement in customer satisfaction (OFPP, 2003). However, the current fad for PBC in human services often ignores the distinct characteristics of human services and the special challenges those features bring to PBC. The discussion on this point is relatively brief here; a more detailed theoretical elaboration will be found in the next chapter.

The foremost precondition of PBC is the inclusion of performance measures.
Any performance-based management tool requires a set of performance standards and metrics against which success can be measured. However, developing comprehensive and quantifiable measures that cover the full spectrum of human service performance has long been considered very difficult, if possible at all. First, human service programs frequently pursue values or goals that are multi-dimensional and often competing, which makes it very difficult to design appropriate measures that cover the full range of missions and values (Behn, 2003; Heinrich & Fournier, 2004). Second, human service outcomes cannot easily be attributed to particular interventions, and confounding factors contribute to the ambiguity of outcomes. Third, most human service programs aim to promote long-term stability and positive quality-of-life changes, but performance measures in service contracts have to emphasize short-term effects within a certain contract duration. As a result, public managers have to use intermediate outcomes to account for final outcomes (Martin & Kettner, 1996). In short, all these elements jointly imply that performance measures for human services are often biased.

In addition to the problem of ambiguous performance, human services also feature high provider discretion in the service delivery process. Human service provision is highly labor intensive, making the exercise of discretionary judgment by service providers inevitable or even desirable (Lipsky, 1980; Riccucci, 2005; Sandfort, 2000). The line staff, through direct interactions with clients, can determine the “range of behavioral actions from which clients may choose their responses” (Lipsky, 1980, 61). Such discretion thus constitutes part of service providers’ daily work and is double-edged. On one hand, it can help providers “process” clients in a responsive way, tailoring services to different client situations.
On the other hand, there is a risk that such discretion might be abused without justification.

In sum, the rise of PBC in human services represents the convergence of imperfect performance measures and high provider discretion. In combination, these two features make human services a challenge for PBC and put it at risk of “rewarding A, while hoping for B” (Kerr, 1975). Relying on imperfect surrogate measures leaves service contractors room for “gaming,” while the higher provider discretion granted by PBC helps contractors realize these potential gains (Bevan & Hood, 2006; Bohte & Meier, 2000; Heckman, Heinrich, & Smith, 1997; Moynihan, 2011). For example, in several human service areas, when contract payments are tied to clients’ outcome achievement, contractors are likely to select clients and serve those who can more easily meet performance goals. Thus, PBC creates much potential for service contractors to engage in “gaming” or “creaming” by focusing services on the variables measured while excluding other outcomes that may be equally important but more difficult to measure. As Radin (2006) suggests, “because various players are likely to use the information to meet their varied agendas, it is rational for those who are the subject of the data to find ways to game the system” (207-208).

Indeed, current evidence on the effectiveness of PBC in human services, though limited and unsystematic, is already quite mixed. Table 6 summarizes some of these studies. The introduction of PBC into substance abuse treatment programs has attracted strong scholarly interest in examining various aspects of its effectiveness. Commons, McGuire, and Riordan (1997) compare the client-level changes before and after the use of PBC in Maine and observe positive improvement in service outcomes, such as abstinence, reduction in drug use, reduction in problems with jobs, and no arrests.
However, later studies cast doubt on this finding because it fails to consider the unintended effects of PBC. Shen (2003) finds that, after the implementation of PBC in Maine addiction treatment, the number of most severe clients dropped by 7% and concludes that PBC actually gives contractors financial incentives to treat less severe clients in order to achieve targeted performance. Lu (1999) argues that since the state agency relied on contractors to report client treatment outcomes, contractors had incentives to misreport and cheat on performance information to ensure funding from state government. Brucker and Stewart (2011) reexamine Maine’s experience and conclude that PBC had no positive effect on program performance such as time to treatment, level of client participation, length of stay, and completion of treatment. In Delaware, McLellan et al. (2008) find significant increases in average capacity utilization (from 54% to 95%) and in the average proportion of patients meeting participation requirements (from 53% to 70%) after PBC implementation, with no notable demographic changes in the patient population over time. Building on this finding, Stewart et al. (2013) further trace the effectiveness of PBC on individual clients and observe that waiting time for treatment fell by 13 days and length of stay in treatment rose by 22 days.

In employment services, the effectiveness of PBC in the programs funded by the Job Training Partnership Act has been found to be very controversial. The use of short-term and straightforward measures is only weakly, and sometimes perversely, associated with long-term welfare (Barnow, 2000; Heckman et al., 2003; Heinrich, 1999). Dias and Maynard-Moody (2006) study workers in a for-profit subsidiary of a national marketing research firm that shifted into the business of providing welfare services.
They find that requirements for meeting contract performance (job placement) and profit quotas created considerable tensions between managers and workers over the importance of meeting performance goals versus meeting client needs. The easiest way to meet contract goals and gain profits was to minimize the time and effort devoted to each client. Koning and Heinrich (2013) examine the incentive effects of PBC on program outcomes in the Dutch welfare-to-work program. They find evidence of gaming activities, but these activities had little impact on service outcomes. They conclude that the use of PBC increased job placement, but not job duration.

In other human service areas, the effectiveness puzzle remains. Many evaluations of PBC in child welfare are still underway. In particular, Illinois used PBC to promote permanency outcomes in its foster care contracting and witnessed a significant decrease in the number of children in out-of-home placement (Kearney, McEwen, Bloom-Ellis, & Jordan, 2010). After Philadelphia directly tied financial incentives and sanctions to occupancy performance, the mental health residential system witnessed a significant increase in occupancy, with an average occupancy rate in the mid-90% range. However, there was still a concern that the performance target on occupancy may suppress the flow of residents through the housing system (Faith et al., 2010).

Table 6.
Selected Studies on PBC Effectiveness in Human Services
Author(s) | Study site | Contracted services | Unit of analysis | Findings
Commons, McGuire, and Riordan (1997) | Maine | Substance abuse treatment services | Client | Improvement in service outcomes, such as abstinence, reduction in drug use, reduction in problems with jobs, and no arrests
Lu (1999) | Maine | Substance abuse treatment services | Client | Providers had incentives to report better treatment performance outputs
Heinrich (1999) | Chicago | JTPA programs | Program | Performance measures were not strongly correlated with program goals; cost-per-placement measure had negative implications for service quality
Shen (2003) | Maine | Substance abuse treatment services | Client | Number of most severe clients dropped by 7%
Lu, Albert Ma, and Yuan (2003) | Maine | Substance abuse treatment services | Client | More referrals and better match between illness severity and treatment intensity; a positive but insignificant effect on dumping (a client is sequentially referred from one provider to the next without being treated)
Heckman, Heinrich, and Smith (2003) | US nationwide | JTPA programs | Client | Short-term measures were weakly, even perversely, related to long-term impacts; efficiency gains or losses from gaming were small
Dias and Maynard-Moody (2006) | Porter City | Welfare-to-work program | Program and client | Distorted incentive structures that led to programmatic conflicts between program management and staff; negative program practice and poor client outcome
Heinrich and Choi (2007) | Wisconsin | Wisconsin Works (W-2) program | Program | Contractors responded to performance incentives related to future funding decisions; insufficient contracting management may undermine PBC effectiveness
McLellan, Kemp, Brooks, and Carise (2008) | Delaware | Outpatient alcohol and other drug treatment | Program | Average capacity utilization rates increased from 54% to 95%; average proportion of patients meeting participation requirement increased from 53% to 70%
Faith et al. (2010) | Philadelphia | Mental health residential services | Program | Significant increases in program occupancy; the flow of residents through the housing system might be suppressed
Stewart, Horgan, Garnick, Ritter, and McLellan (2013) | Delaware | Outpatient alcohol and other drug treatment | Client | Waiting time for treatment declined 13 days; length of stay in treatment increased 22 days
Koning and Heinrich (2013) | Netherlands | Welfare-to-work services | Client | Evidence of gaming activities; little impact of gaming on service outcomes

Overall, current research on the effectiveness of PBC in human services mostly suffers from two limitations. First, many studies fail to account for the impact of unintended consequences of PBC on full service performance. As mentioned above, developing a series of performance measures that capture full service performance is very challenging, so a common strategy is to use short-term and easy-to-measure indicators instead. Contractor efforts to achieve measured performance may then affect their behavior related to unmeasured performance. For example, the performance improvement in Maine substance abuse treatment programs was very likely attained through client selection and contractor misreporting. Such performance improvement, though efficient to some extent, should not be considered effective. More broadly, if improvement in measured performance is achieved at the expense of unmeasured performance, such improvement is neither effective nor desirable. A systematic evaluation of PBC effectiveness should include this consideration. Without it, the evaluation is inevitably biased.

Second, methodologically, these evaluation studies often rely on “pre-post” comparisons based on observational data. The most severe threat to internal validity in observational studies is that observations in comparison groups are biased by confounding variables and are therefore not directly comparable.
The “pre-post” comparison, as the most basic quasi-experimental design, is very unlikely to rule out the effect of these counterfactual variables (Shadish, Cook, & Campbell, 2002). Thus, the results from pre-post comparisons generally suffer from low internal validity. In this sense, more robust research designs should be used.

Capacity Problem

Closely related to the effectiveness of PBC is the capacity challenge. As discussed previously, PBC has been tried in and introduced to service contracting as an effort to address the smart-buyer problem, i.e., public managers are sometimes not equipped with sufficient management capacity to use contracting effectively. However, although the potential benefits of PBC are attractive, the launch of a PBC system does not guarantee the achievement of those benefits. Rather, PBC itself creates a series of new challenges for public managers in designing and implementing PBC systems, such as how to set performance milestones and indicators, how to split responsibilities and risks between contracting parties, and how to conduct performance monitoring. After reviewing the use of PBC in federal agencies, GAO (2002) raises the concern of “whether agencies have a good understanding of performance-based contracting and how to take full advantage of it” (2). New York State piloted PBC in its employment services for disabled people in the early 2000s and soon abandoned the effort when the administration found it lacked the capacity to implement PBC and lead organizational change (Gates et al., 2004). Heinrich and Choi (2007) acknowledge that insufficient program administration and contracting management capacity undermined the effectiveness of PBC in the Wisconsin Works program.

Basically, the introduction of PBC requires two managerial capacities: designing appropriate PBC systems and implementing organizational changes. First, the critical role of performance measures cannot be overemphasized.
As shown previously, there are many variations in performance measures among the PBC models currently used in different states, even in the same human service field. Appropriate performance measurement facilitates PBC implementation and reduces the potential for unintended consequences. This further implies several more detailed tasks, such as deciding which part of performance to track and how to link contract reimbursement to client outcomes.

In addition to these technical aspects of PBC design, a more profound capacity is that of leading organizational innovation and change in an inter-organizational setting. Given the difficulty of designing comprehensive measurement systems for human services, this capacity becomes even more critical. Under PBC, only service efforts that successfully achieve desired outcomes are reimbursed. Thus, PBC actually forces contractors to bear substantial fiscal risks. Contractors are exposed to loss when their service efforts do not result in expected outcomes. Such risk shifting complicates contract implementation. Romzek and Johnston’s (2002) study of service contracting in Kansas finds that although accurate performance measures in contracting may facilitate contract implementation, substantial risks on the contractor side would “compromise the capacity of the contractor both to meet performance expectations and to provide required performance information to contract managers” (430). McGrew et al. (2007) observe that contractors do prefer FFS over PBC, although they mostly welcome the freedom in the service process under PBC. In this sense, it is likely that contractors resist the transition from the traditional FFS approach to PBC, or adjust to PBC systems only perversely.

Moreover, the introduction of PBC into human service systems is an evolving process that involves long periods of trial and error. The movement toward PBC takes patient and deliberate effort and needs to address a myriad of challenges.
It is an evolutionary rather than a revolutionary process, which requires years of planning with progressive implementation and is expected to continue evolving over time. For example, it took the Philadelphia mental health system six years to shift from an FFS model to a PBC model. Even after the basic PBC framework was in place, the administration was still modifying and improving the performance measures (Faith et al., 2010). In particular, it takes a great deal of time to establish a meaningful performance measurement system that informs program development and client improvement. Public managers have to confront this evolutionary dynamic. As Heinrich and Marschke (2010) argue, “an incentive designer’s understanding of the nature of a performance measure’s distortions and employees’ means for influencing performance is typically imperfect prior to implementation,” and thus “it is only as performance measures are tried, evaluated, modified, and/or discarded that agents’ responses become known” (203). All these imply that PBC should be treated as a learning process for public managers.

In sum, governments at all levels have shown substantial and continuous enthusiasm for PBC. In human services, particularly, state and local governments have expressed growing interest in using PBC in their service acquisition. Although the designs of detailed PBC systems in different states and different service areas might vary, the basic motivation for introducing PBC is the same: to align human service systems’ focus on outcomes with how services are financed, or more technically, to reshape contractor behaviors by redefining contract incentive structures. However, the burgeoning popularity of PBC lacks sufficient evidence to show that its promised benefits are actually achievable. The evidence available in this regard still fails to provide a consistent and persuasive answer.
To an extent, the introduction of PBC into human service systems, from a managerial perspective, needs to address the effectiveness problem (whether PBC produces better results) and the capacity problem (how to use PBC and lead interorganizational changes). The present research mainly focuses on the effectiveness problem, but briefly discusses the implications for the capacity problem. Before that, the research needs a theoretical framework that could pave the way for further discussion. This is the topic of Chapter 3: a theoretical discussion of contract design and its application to human service contracting.

Chapter 3. Theoretical Framework

3.1 Formal Contract Design: A Principal-agent Perspective

Like much of the previous literature on government contracting (e.g., Donahue, 1989; Johnston & Romzek, 1999; Kettl, 1993; Milward & Provan, 1998, 2000; Romzek & Johnston, 2005), this research first places the discussion of contract design in a principal-agent model (Eisenhardt, 1989; Jensen & Meckling, 1976; Shapiro, 2005), where government (the principal) relies on contractors (the agents) to deliver human services and achieve policy goals. Based on the assumptions of goal conflicts and information asymmetry between the principal and the agent, agency theory warns of the existence of the agency problem, i.e., the principal is subject to the agent’s self-serving opportunistic behaviors. First, because of incomplete information, the principal cannot verify the agent’s capacity and thus may rely on low-quality agents. In this sense, it is the agent that chooses the principal, not the opposite. This is termed adverse selection or hidden information (Arrow, 1984). Moreover, the agent may further exploit the information advantage to shirk his/her responsibility and not put forth the agreed-upon efforts. This hidden action (Arrow, 1984) would generate considerable moral hazard for the principal.
To address the agency problem, the principal might try a variety of monitoring tools to bridge information asymmetries and goal conflicts. However, all these efforts would incur agency costs. Therefore, the managerial implication of agency theory focuses on the design of efficient governance mechanisms to moderate the agency problem, or more precisely, appropriate control mechanisms to guide the distribution of risk and uncertainty between the principal and the agent.

If organizational control is seen as a problem of information flow (Ouchi & Maguire, 1975), the design of control mechanisms and strategies within an organization largely rests upon two dimensions: (1) task programmability, the degree to which the means-ends relationships involved in agent behaviors can be precisely defined, and (2) outcome measurability, the extent to which various aspects of task outcomes can be specified in a comprehensive and quantifiable manner. The focus of control, therefore, can be on either the behavior of employees or the outcomes of those behaviors. Accordingly, the control strategy can be either behavior-based or outcome-based (Eisenhardt, 1985; Ouchi, 1980; Thompson, 1967).

Generally, behavior-based control is appropriate in an environment characterized by high task programmability. When certainty regarding causation is high, control strategies are reflected more in high levels of monitoring and direction of agent activities, with performance evaluation often focusing on job inputs. If outcome measurability is high, organizations would prefer outcome-based control strategies, under which compensation schemes are attached to outcome measures and monitoring of employees becomes relatively less intensive. When a task is neither programmable nor measurable, formal control mechanisms, both behavior-based and outcome-based, seem ineffective in that there is no exact place to host the control.
In this case, social control, or what Ouchi (1980) calls “clan” control, may emerge to play a supplemental role. The social control system, using informal and normative mechanisms (such as shared values and norms of reciprocity) to align the preferences of the principal and the agent, implicitly encourages appropriate behaviors that could lead to desirable organizational outcomes.

Table 7. The Determinants of Organizational Control Strategies

                               Task Programmability: High     Task Programmability: Low
Outcome Measurability: High    Behavior or outcome control    Outcome control
Outcome Measurability: Low     Behavior control               “Clan” control

Source: Ouchi (1980).

Arrow (1964) defines the design of control strategies as the choice of operating rules and the choice of enforcement rules to support the operating rules. If an organization operates “as a nexus for a set of contracting relationships among individuals” (Jensen and Meckling, 1976, 310), then the design of the optimal contract arrangement governing the principal-agent relationship constitutes the enforcement rule to facilitate contract implementation. In accordance with the two types of organizational control, there are two major contract alternatives: behavior-based and outcome-based contracts. The choice of a contract type is thus a function of task programmability and outcome measurability. The key in structuring contractual relationships, writes Eisenhardt (1989), is “the trade-off between (a) the cost of measuring behavior and (b) the cost of measuring outcomes and transferring risk to the agent” (61).

Figure 3 describes four types of goods and services in terms of their certainty in causation and outcome, together with the contract types tailored to fit these characteristics.

Figure 3. The Determinants of Contract Type

For services in Cell 2, the means-ends relationships involved in agent services can be explicitly specified and observed.
As such, information asymmetry between the principal and the agent in terms of task programmability is low and transferring risk to the agent becomes expensive. Therefore, the principal knows what the agent has done and can use behavior-oriented contracts to purchase the agent’s behaviors directly. In Cell 3, agent services are difficult to observe, but their outcomes can be clearly measured with less difficulty. Under these circumstances, the principal would prefer outcome-based contracts to align the agent’s incentives with those of the principal and make risk shifting from the agent to the principal less likely. When both cause/effect relationships and outcomes are highly certain (in Cell 4), it makes no difference to the principal whether service process or outcome is controlled, and thus both contract types work equally well.

[Figure 3 layout: a two-by-two grid with Task Programmability (low to high) on one axis and Outcome Measurability (low to high) on the other. Cell 1: low programmability and low measurability, no contract type indicated. Cell 2: Behavior-based Contracts. Cell 3: Outcome-based Contracts. Cell 4: Behavior-based or Outcome-based Contracts.]

The most problematic situation for contract design comes from the services in Cell 1, where agent services share both low task programmability and low outcome measurability. In health care, for example, the principal lacks the ability to anticipate clearly the treatment process and outcomes. As such, the locus of control for the principal seems obscure, leaving a high degree of incompleteness in contract specification. When the control the principal uses to govern the contractual relationships is incomplete, as incomplete contract theory (Hart, 1988; 1989) predicts, the agent would enjoy “residual rights of control” and be in an advantageous position in ex post bargaining and the division of ex post benefits. In most cases, the agent could exercise discretionary judgment in circumstances that were not specified in the initial contracts. These behaviors are very likely to incur moral hazard.
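The mapping from the two dimensions to the contract types in Figure 3 can be summarized as a small lookup. The following Python sketch is only an illustration: the function name `contract_type` and the two-level (high/low) boolean encoding are assumptions introduced here, not part of Ouchi’s (1980) or Eisenhardt’s (1989) frameworks, and in practice both dimensions are matters of degree.

```python
def contract_type(task_programmability_high: bool,
                  outcome_measurability_high: bool) -> str:
    """Return the contract type suggested for a cell of Figure 3.

    Hypothetical helper: "high"/"low" are encoded as booleans, which
    simplifies what are really continuous dimensions.
    """
    if task_programmability_high and outcome_measurability_high:
        # Cell 4: controlling process or outcome works equally well.
        return "behavior-based or outcome-based"
    if task_programmability_high:
        # Cell 2: means-ends relationships can be specified and observed.
        return "behavior-based"
    if outcome_measurability_high:
        # Cell 3: behavior is hard to observe, outcomes are measurable.
        return "outcome-based"
    # Cell 1: neither dimension gives formal control a place to attach.
    return "no formal contract type fits well"


# Services with low programmability and low measurability fall in Cell 1.
cell_1 = contract_type(False, False)
```

The last branch corresponds to the Cell 1 situation discussed above, where contract specification remains incomplete.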
In short, the incompleteness in task programmability and outcome measurability would make contract design challenging. Unfortunately, this is where human services usually fit in.

3.2 Formal Contract Design for Human Services

Human services generally feature low task programmability and low outcome measurability. Efforts at task programmability in human services are always disturbed by high provider discretion in the service delivery process. Human service provision is highly labor intensive, making the exercise of discretionary judgments by service providers inevitable or even desired (Lipsky, 1980; Riccucci, 2005; Sandfort, 2000). Although there are various operating rules and service manuals throughout the service process, in real situations service providers are always required to apply their judgment and make decisions contingent on detailed contexts. These line staff, through direct interactions with clients, can determine the “range of behavioral actions from which clients may choose their responses” (Lipsky, 1980, 61). Thus, typically, service providers “do not do just what they want or just what they are told to want. They do what they can” (Brodkin, 1997, 24). Such discretion constitutes part of service providers’ daily work and actually plays a double-edged role. On the one hand, it can help providers “process” clients in a responsive way, tailoring services to different clients. On the other hand, providers may abuse such discretion without justification. Sandfort (2000) examines the potential influence of the new public management and traditional public administrative practices on front-line actions in two local welfare offices and two private contractors in Michigan. She finds that neither performance-based management nor traditional bureaucratic directives have an impact on front-line practices in either type of agency.
Instead, the most powerful determinants of street-level behaviors rest upon the collective beliefs of front-line staff, such as norms and shared knowledge among organizational members.

In addition, the outcome of human services is often too uncertain to be defined clearly. Measuring the performance of human service programs has long been considered demanding. First, from the normative perspective, like many other public programs, human service programs frequently pursue values or goals that are multidimensional and often competing, such as efficiency, equity, and representativeness, derived from the various expectations of government cherished by citizens. Wilson (2000) details this multidimensional nature and the dilemma of balancing these values. At the very basic level, public welfare programs are always involved in the efficiency-equity puzzle, recognized by Okun (1975) as “the big tradeoff.” Thus, the answers to the question of “what to measure” are always ambiguous and competing. As such, figuring out appropriate measures that could comprehensively cover the full range of the missions and values can be difficult (Behn, 2003; Heinrich & Fournier, 2004; Heckman, Heinrich, & Smith, 1997).

Second, technically, human services are directed to improving service recipients’ welfare through behavioral interventions. As Hasenfeld (1983) observes, human services aim to “protect, maintain, or enhance the personal well-being of individuals by defining, shaping, or altering their personal attributes” (1). However, beyond such interventions, there might be a number of uncontrollable factors out of service providers’ reach that would lower the certainty of desired outcomes (DeHoog & Salamon, 2002; Martin & Kettner, 1996; Wedel & Conston, 1988). Thus, outcomes cannot easily be attributed to a particular intervention. Also, the standards for what counts as a significant change in welfare conditions before and after services are sometimes controversial.
Third, most human service programs aim to promote long-term stability and welfare, but performance measures have to emphasize short-term effects. Tracking persons over time is a costly activity and does not produce short-term feedback on the success of the program. Most programs use outcomes of participants measured at the time they complete the program, or within a short period thereafter (Martin & Kettner, 1996). Both measures are short-term in nature, which creates another problem: these performance standards may misdirect activities by focusing on criteria that may not be related to long-term goals. Heckman, Heinrich, and Smith (2002) find that in the JTPA system, short-term measures used to monitor performance were only weakly, even perversely, related to long-term impacts. Taken together, these three points help explain why performance measurement in human service programs is so difficult.

With these intrinsic characteristics of human services in mind, we now turn to the discussion of contract design. Traditionally, human service contracts run on a fee-for-service basis, a behavior-based contract, where government directly controls the service process (such as input standards and service methods employed) in order to ensure the delivery of promised services. When a client comes to a human service agency for services, agency staff determine the client’s eligibility and prescribe the amount of services needed. After that, the human service agency buys this amount of services from service contractors. For example, a human service agency may purchase individual counseling services for a domestic violence offender at the rate of $75 per hour, or group counseling in outpatient substance abuse treatment at $20 per 15-minute increment. After the services are delivered and paperwork is approved, service contractors are reimbursed for the amount of services delivered, based on the unit of services (e.g., per hour or per 15 minutes).
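The fee-for-service reimbursement just described is simple unit-rate arithmetic. As a minimal sketch in Python, using the two rates quoted above together with hypothetical service quantities (the function name `ffs_reimbursement` and the quantities of 6 hours and 8 increments are assumptions for illustration only):

```python
from decimal import Decimal


def ffs_reimbursement(units_delivered: int, rate_per_unit: Decimal) -> Decimal:
    """Payment under FFS: units of service delivered times the unit rate.

    Client outcomes do not enter the calculation at all.
    """
    if units_delivered < 0:
        raise ValueError("units_delivered must be non-negative")
    return units_delivered * rate_per_unit


# Individual counseling at $75 per hour; 6 hours is a hypothetical quantity.
individual = ffs_reimbursement(6, Decimal("75"))

# Group counseling at $20 per 15-minute increment; 8 increments
# (two hours) is a hypothetical quantity.
group = ffs_reimbursement(8, Decimal("20"))
```

Under these assumed quantities the contractor would be reimbursed $450 and $160 respectively. The payment depends only on the volume of services delivered, not on whether the client improved.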
However, the task programmability of a human service is always tentative: it is difficult to predict initially exactly which services will lead to desired results, due to ambiguous jobs and uncertain future events. Thus, government effort on task programmability under FFS might be offset by the discretion contractors enjoy in the service delivery process, because they work directly with clients and have (or pretend to have) more information on clients’ service needs. Moreover, the negotiated nature of human service contracting may further justify the existence of discretion. As DeHoog (1990) observes, human service contracting generally follows a special negotiation or cooperation logic. Due to limited market competition, ambiguous performance, and costly contract monitoring (DeHoog, 1984; Schlesinger, Dorwart, & Pulice, 1986; Van Slyke, 2003), human service contracting does not usually rely on the classical competitive bidding model. Rather, human service contracts are mostly specified through negotiations between government buyers and contractors. This would no doubt complicate government effort on task programmability.

Another byproduct of the low programmability of human services is that the link between task and outcome becomes broken: because tasks cannot be clearly specified, the detailed services prescribed by government do not necessarily lead to desired outcomes. In this sense, contract compensation, independent of service outcomes, only encourages service delivery, demonstrating a “triumph of process over results” (Kettner & Martin, 1993, 62). Given this, contractors have no incentive to improve service performance. Further, better performance may even mean economic inefficiency for them (Wulczyn, 2005). For example, improving service quality increases contractor costs for advanced facilities and staff training, which would not be reimbursed by government. In addition, better services reduce client demand for future services.
The new PBC approach, holding an outcome orientation, draws contractors toward service results and leaves them considerable flexibility in serving clients. Theoretically, PBC would encourage innovative services, better outcomes, and less monitoring. However, these benefits are subject to two assumptions: PBC is not vulnerable to (1) measurement problems and (2) gaming by contractors (Behn & Kant, 1999; Bevan & Hood, 2006). Without meeting these two requirements, the effectiveness of PBC cannot be guaranteed.

As mentioned above, the performance of human service programs is very challenging to track. In most cases, performance measures for human services are just approximations of the targeted outcomes, i.e., short-term measures representing long-term effects and easy-to-measure goals representing ambiguous goals. As a result, the measurement assumption becomes problematic in human service contracting. When surrogate measures are used, as mentioned earlier, service contracts become incomplete, leaving room for contractors to engage in gaming and other strategic behaviors (Radin, 2006). Baker (1992; 2002) shows that the efficiency of incentive contracting depends on the extent to which the performance measures used are aligned with the principal’s objective. When the principal’s objective is not contractible, i.e., unclear or immeasurable, alternative measures have to be adopted as proxies. If so, the incentives associated with those performance measures are inaccurate and nonoptimal, leading contractors to engage in unintended activities even if contractors are risk neutral. The greater the distortion in performance measures, the weaker the incentive for desired objectives.

Indeed, such distortion becomes even more severe when gaming enters the picture. As noted above, service contractors enjoy discretion when delivering services, and PBC even enhances such discretion.
Thus, it is very likely that contractors use their information advantage to adjust perversely to performance measures in order to appear to be behaving well (Hood, 2006; Courty & Marschke, 2004; Moynihan, 2008; Radin, 2006). Williamson (1985) terms this phenomenon opportunism, “self-interest seeking with guile” (47), which includes a wide range of behaviors, such as shirking, cheating, and withholding important information. Bevan and Hood (2006) summarize three forms of the gaming problem in the PBC context: ratchet effects (restricting current output to gain an undemanding future performance target), threshold effects (downgrading the output of those performing better than the target to meet the target), and output distortions (achieving targeted performance measures at the expense of unmeasured performance). All would limit the effectiveness of PBC in human services.

Almost two decades ago, Cragg (1997) questioned why PBC was not prevalent in human service programs. After examining the practices of PBC in job training programs under JTPA, he concluded that “unless performance standards are carefully designed, problems of moral hazard may preclude the widespread use of performance incentives in government programs” (147). Although performance measurement techniques have improved greatly since Cragg’s study, the problem he observed may persist as long as the ambiguous performance and high discretion associated with human services continue. By and large, neither behavior-oriented nor outcome-oriented contracts fit seamlessly with human services (Table 8). Both might incur a certain amount of agency costs in structuring and monitoring contractual relationships, which would further undermine the effectiveness of formal contract design.

Table 8. Contract Type for Human Service Contracting

                          Fee-for-service Contracting        Performance-based Contracting
Control strategy          Behavior-based control             Outcome-based control
Implementation problem    Low task programmability           Low outcome measurability
Limitations               Triumph of process over results    Surrogate measures; gaming behaviors

3.3 Informal Contract Design: A Relational Contracting Perspective

Another line of literature that would cast light on the discussion here is relational contracting. Interorganizational relationships (IORs) always embrace two dimensions, structural and relational, and thus suggest two streams of governing mechanisms (Faems, Janssens, Madhok, & Van Looy, 2008; Ring & Van de Ven, 1994). The structural perspective considers a formal structural arrangement and its role in structuring interorganizational behaviors and performance, in order to “create a predictable collaborative environment that mitigates exchange hazards and facilitates coordinated action” (Faems et al., 2008, 1054). The relational perspective emphasizes informal relationship building, trust cultivation, and trustworthy behaviors. Thus, it “promotes a more relational governance strategy in which partners rely on trust to address issues of safeguarding and coordination” (Faems et al., 2008, 1054). This dichotomy actually follows the conventional wisdom about the interaction between formal and informal behaviors in organizational management.

Following this line of reasoning, formal contracting centers on “detailed, binding legal agreements that specify the obligations and roles of both parties in the relationship” (Vandaele et al., 2007, 240). In this most visible part of a contract, the attention would be on the design of comprehensive contract clauses to bind future contingencies.
In this sense, contracting basically means two elements: “(a) rational planning of the transaction with careful provision for as many future contingencies as can be foreseen, and (b) the existence or use of actual or potential legal sanctions to induce performance of the exchange or to compensate for non-performance” (Macaulay, 1963, 56). Thus, the design of formal contracts aims to reduce uncertainties in the contracting process and make contractor behaviors more predictable.

In contrast, the relational contracting literature questions the gap between contract doctrine and the empirical operation of the contract system in the real world. Organizations engaged in contracting often do not need to conduct rational contract planning and negotiation when the transactions are run within a setting of continuing relationships. Potential disputes are settled in ways that keep the relationship going. The contracting process, as Macaulay (1985) argues, is not “a neutral application of abstract rationality,” but “operates at the margins of major systems of private government through institutionalized social structures and less formal social fields” (477). Underlying this observation is the notion of relational contracting, a type of contracting that reflects “the relations among parties to the process of projecting exchange into the future” (Macneil, 1980, 4). It can be seen as a logical extension of the bounded rationality represented by formal contracting. This line of research (Macaulay, 1963; Macneil, 1977) highlights the role of relational sanctions and social interaction in understanding the incentives behind the fulfillment of contractual agreements. A detailed comparison between formal and relational contracting is listed below in Table 9.

Table 9. Comparisons between Formal and Relational Contracting

Perspectives about relations with vendors
Formal contracting:
• Anticipate short-term relationship
• Low risk/low trust
• No expectation for altruistic behavior
Relational contracting:
• Anticipate long-term relationship, seek out trustworthy partners
• High risk/trust
• Expect altruistic behavior in the interest of the whole

Market assumptions
Formal contracting:
• Many vendors available
Relational contracting:
• Few potential vendors

Contract writing
Formal contracting:
• Detailed specification of benefits, burdens, rules, and rights
• Monitoring for compliance
• Reliance on legal remedies
Relational contracting:
• Comparatively ambiguous contracts with anticipation of adapting to changing circumstances
• Social norms serve as principal mechanisms of mediation or control
• Aversion to third-party, legal remedies

Management style
Formal contracting:
• Sanctions imposed as written
• Low levels of contacts and coordination
• Compliance as a key concern
Relational contracting:
• Sanctions and remedies not imposed but rather negotiated and mediated
• Flexibility, solidarity, information sharing
• Maintenance of relationship as a primary concern

Service characteristics
Formal contracting:
• Easy to define service tasks
• Easy to evaluate service quality and vendor performance
• Tasks do not require special investment or customization and involve standardized service production processes
Relational contracting:
• Ambiguity in defining service tasks
• Difficult to assess service quality and vendor performance
• Vendors are required to make special investments to satisfy buyers’ customized needs

Source: Beinecke & DeFillippi (1999), Lamothe & Lamothe (2012), Sclar (2000), Williamson (1985).

This relational exchange perspective in contracting management has actually received growing attention in the public management literature (e.g., Brown, Potoski, & Van Slyke, 2006; Sclar, 2000). For example, scholars in recent years have proposed using stewardship theory to explain public service contracting (e.g., Dicke, 2002; Lambright, 2009; Van Slyke, 2007).
Stewardship theory emphasizes the cooperative, trust-based nature of principal-agent relationships. As Davis, Schoorman, & Donaldson (1997) suggest, it “defines situations in which managers are not motivated by individual goals, but rather are stewards whose motives are aligned with the objectives of their principals” (21). Stewardship theory is especially relevant to government-nonprofit contracting, in which nonprofits are commonly believed to be driven by a social mission and to have weaker incentives to take advantage of asymmetric information in market exchange. Such mission and value alignment with government would moderate goal conflicts between contracting parties and discourage nonprofit contractors from opportunistically maximizing their financial interest and market value.

Indeed, relational contracting has special implications for service contracting. First, human service contracting has been found to follow a negotiation model, rather than the competitive bidding model proposed by the privatization literature, due to limited market competition, ambiguous performance, and costly monitoring (DeHoog, 1991; Johnston & Romzek, 1999; Sclar, 2000; Van Slyke, 2003). In this sense, informal social exchanges between contracting parties would play a significant role. Romzek and Johnston (2002) find that in Kansas social service programs, ongoing “negotiation and collaboration among contracting partners” (423) is necessary for effective contract implementation. Brown and Potoski (2004) show that even in refuse collection, where service attributes are relatively easy to measure and market competition is rich, public managers still engage in a variety of informal network activities (such as hosting informal meetings with contractors and attending professional conferences) to promote competition and reduce information asymmetries.
Second, as mentioned earlier, human service contracting is often troubled by low task programmability and low outcome measurability, which complicates the design of formal contract arrangements. As such, social control, or what Ouchi (1979) calls “clan” control, may emerge to function as a supplement. The existence of informal socialization processes that run counter to formal organizational rationality has been acknowledged since the Hawthorne Studies. The social control system, using informal and normative mechanisms (such as shared values and the norm of reciprocity) to eliminate interest and goal incongruence between the principal and the agent, implicitly encourages appropriate behaviors that could lead to desirable collaborative outcomes. Put together, the arguments here call for the inclusion of the relational aspects of contracting, in addition to formal contracting efforts.

However, the interaction between the formal and relational components of contracting is still under scholarly debate: do formal and relational contracting function as mutually competing or mutually enhancing mechanisms? The mutual exclusion view posits a hostile relationship between formal and informal contracting. From this perspective, legal maneuvers in formal contracts that serve as safeguards against potential breaches would be interpreted as a sign of distrust and would hinder relationship-building between organizations (Bernheim and Whinston, 1998; Ghoshal and Moran 1996; Lyons and Mehta, 1997). However, the complementary-role perspective challenges this view. Poppo and Zenger (2002) and Goo et al. (2009) find that clear contract specification reduces risk in cooperation, which would promote repeated exchanges and further result in mutual dependence and trust. In turn, trust emerging from prior collaborations would substitute for more elaborate formal contract provisions (Gulati, 1995).
Informal relationships and mutual understanding could ease ex post information flow and coordination, reducing the need for clear specifications (Dore, 1983; Zollo et al. 2002). Sclar (2000) even argues that, with relational contracting, “the formal contract or agreement is less important as a reference point for dispute resolution than is the quality of trust between the organizations” (123). Unlike all these studies, Klein Woolthuis, Hillebrand and Nooteboom (2005), through comparative case studies of four inter-firm relationships, find that the relationship between formal and informal contracting is so complex and dynamic that the two can be both complements and substitutes, depending largely on managerial context.

In the public management literature, empirical research on whether formal contracting and relational contracting are substitutes or complements remains scarce. Van Slyke (2006) finds through interviews with public and nonprofit managers that social service contracting management might evolve from a more formal style to a more relational style over time. Lambright (2009) examines the use of government contract monitoring tools from both a principal-agent perspective (formal contracting) and a stewardship perspective (relational contracting). She concludes that neither one can explain the entire story. Lamothe and Lamothe (2012) confirm this argument and find that in local service delivery, there is substantial contact and communication between public managers and their vendors during contract implementation, in addition to clearly written formal contracts. Put together, the evidence so far points to the coexistence of these two mechanisms, to which public managers devote themselves simultaneously.
To some extent, such a combination of formal and informal contracting reflects the nature of contracting management in the public administration context: well-planned and well-written contracts to meet the demand for formal accountability, and negotiation and discretion to satisfy the need for flexibility in service delivery (DeHoog, 1990).

In sum, this chapter provides a theoretical framework of contract design for human services. Holding a principal-agent perspective, this chapter first argues that formal contract design depends on two dimensions, the task programmability and the outcome measurability of the contracted services, which lead to two contract arrangements: behavior-based contracts and outcome-based contracts. However, given that human services exhibit both low task programmability (due to high provider discretion) and low outcome measurability (due to multidimensional, long-term outcomes), neither formal contract arrangement may fit seamlessly with human services. To provide a balanced theoretical framework, this chapter also includes the literature on relational contracting, which implies reliance on relational exchange as an informal contracting management mechanism. Put together, the combination of the formal and informal contracting literature provides a complete framework to study contracting management.

With this theoretical framework in mind, this project turns to the discussion of vocational rehabilitation, a human service area where PBC is becoming increasingly prevalent, as the policy field of inquiry for the present research. In particular, the Indiana vocational rehabilitation program’s transition from FFS to PBC in the purchase of VR employment services provides a good case to answer the PBC effectiveness question raised earlier.

Chapter 4. Vocational Rehabilitation as a Policy Field

4.1 Vocational Rehabilitation Programs

In the United States, 56.7 million Americans had a disability in 2010, representing 18.7 percent of the population.
Among them, about 41 percent of those aged 21 to 64 with a disability were employed [2]. Employment has been found to be fundamental to people’s physical and psychological well-being (Dooley, Fielding, & Levi, 1996; Linn, Sandifer, & Stein, 1985; Paul & Moser, 2009). Employment would help people with disabilities move toward desired quality-of-life changes. However, people with disabilities generally face a number of barriers to entering the workforce and to inclusion in society. This calls for public vocational assistance.

[2] http://www.census.gov/newsroom/releases/archives/miscellaneous/cb12-134.html

The major public vocational assistance service for adults with disabilities in the United States is the federal vocational rehabilitation program. The federal interest in rehabilitation issues started in the 1920s, with the enactment of the Vocational Rehabilitation Act, also known as the Smith-Fess Act. The Act began the federal-state partnership in the rehabilitation of individuals with disabilities. The passage of the Rehabilitation Act in 1973 marked significant progress in the federal rehabilitation program. It provides the statutory authority for programs and activities that help individuals with disabilities in the pursuit of gainful employment, independence, self-sufficiency, and full integration into community life. Under the Act, a wide range of rehabilitation programs were created. The U.S. Department of Education has primary responsibility for administering the Act, particularly the programs under the Act that are funded through the Department of Education. Within the Department of Education, the Rehabilitation Services Administration (RSA) is the principal agency for carrying out most of the programs and activities that provide direct support for vocational rehabilitation (VR), independent living, and individual advocacy and assistance.
By far the largest program administered by RSA is the State Vocational Rehabilitation Services Program, also known as the Vocational Rehabilitation State Grants Program. Title I of the Rehabilitation Act of 1973 authorizes the VR program to “empower individuals with disabilities to maximize employment, economic self-sufficiency, independence, and inclusion and integration into society.” This program funds state VR agencies to provide employment-related services that help individuals with disabilities prepare for, gain, and maintain employment. The value of VR programs has been well recognized (Bolton, Bellini, & Brookings, 2000; Bond, 2004; Dutta, Gervey, Chan, Chou, & Ditchman, 2008; Gamble & Moore, 2003). Typically, the VR program serves more than 1 million people with disabilities nationwide each year. More than 90% of the people who use state VR services have significant physical or mental disabilities that seriously limit one or more functional capacities, such as mobility, communication, and interpersonal skills. The employment rates of people with disabilities after receiving VR services have been consistently found to be around 60% (Kaye, 1998).

The VR program follows a federal-state model. Within the partnership, the federal government substantially funds state programs, and states are required to match federal funds. Generally, the federal government covers 78.7% of the program’s costs through financial assistance to the states for program services and administration. For example, in fiscal year 2010, VR programs received $3,040,323,049 in federal funding and states expended $864,073,243 (U.S. Department of Education, 2010). The federal government also establishes the program and monitors state program operation. For example, RSA conducts periodic on-site reviews and requires state VR agencies to submit annual program reviews, in order to ensure that states follow the program goals and requirements under the Rehabilitation Act.
States enjoy certain latitude in running their VR programs and are responsible for delivering various VR services to clients. This federal-state vocational rehabilitation program constitutes the policy field for the present research project.

4.2 The Purchase of Job-related Services

As mentioned, the importance of employment for people with disabilities has been widely accepted. Thus, job placement and on-the-job support of people with disabilities at the highest level possible have been central to the mission of VR programs (Rubin, Roessler, & Dunkerby, 1983). Through these job-related services, VR programs help clients prepare for, gain, and maintain employment. Specifically, the job-related services in VR include job search assistance, job placement assistance, and on-the-job support. Often, state VR agencies acquire these services from nonprofit community rehabilitation programs through a variety of purchase-of-service contracts. Figure 4 describes the general contractual relationship in the purchase of job-related services.

Figure 4. The Contractual Relationship in the Purchase of Job-Related Services
[Figure: diagram with the elements State VR Program, Community Rehabilitation Program, VR Counselors, Employment Specialists, and Consumers, connected by links labeled Contracting, Monitoring, Tracking Progress, and Delivering Services.]

Three major players are involved in the rehabilitation process:
• Vocational rehabilitation counselor: The VR counselor is a rehabilitation professional, usually holding a master’s degree, who is an employee of the state VR program. The counselor is usually knowledgeable about consumers with disabilities and their vocational needs and thus determines eligibility for VR services. The counselor is also responsible for assisting the consumer to determine and achieve a suitable vocational objective. The counselor works with the consumer to devise an individual employment plan that will lead to the achievement of the vocational objective. The counselor is responsible for authorizing service contractors for service needs, ensuring that the services delivered are appropriate, and issuing payment based on service amount and consumer achievement.
• Contractor: a vendor of services, mostly a nonprofit community rehabilitation program, that has a contract with the VR agency to deliver specific services leading to the employment of consumers in competitive jobs [3]. VR services, such as job placement assistance and on-the-job support, are generally delivered by an employment specialist, who works directly with a consumer. The VR counselor makes authorizations against the contract for specific services.
• Consumer: an individual with a disability who has been determined eligible for VR services by the VR counselor.

[3] Competitive employment means work in the competitive labor market that is performed on a full-time or part-time basis in an integrated setting, and for which an individual is compensated at or above the minimum wage, but not less than the customary wage and level of benefits paid by the employer for the same or similar work performed by individuals who are not disabled. An integrated setting is typically found in the community, in which individuals interact with non-disabled individuals, other than support staff, to the same extent that non-disabled individuals in comparable positions interact with other persons.

Traditionally, these service contracts are process-oriented, making contract compensation contingent upon the provision of services. Most of these contracts have common elements: defined services, a purchasable unit for each service (e.g., day, hour), and a unit cost for each defined service (Revell, West, & Cheng, 1998). The predominant purchasable unit for services is an hour.
For example, a contractor may be paid $30 for each hour of job placement service it provides to an eligible service recipient. The popularity of hour-based contracts rests on several advantages. First, service contractors can customize services based on individual needs, because they are reimbursed for each hour of service provided to individuals. Second, funding agencies have access to individualized information on the specific services provided and the impact of their funds. Through intensive reporting by service providers throughout the delivery process, funding agencies are able to control the services needed for successful employment and the detailed flow of funds. In that way, VR agencies effectively centralize the service delivery process.

However, the weaknesses of this contracting method are evident. Hourly fee-for-service contracts do not readily encourage quality assessment and quality control by service providers, as the services are paid for without considering their results. Moreover, contractors have limited incentives to encourage service recipients to move toward desired employment outcomes. In essence, hour-based contracts emphasize the provision of service per se, i.e., the time spent providing those services, rather than the results of those services. Indeed, this contracting method gives contractors disincentives to pursue valued outcomes (client independence). Basically, hourly billing tends to bear an inverse relationship to client independence: it is in contractors’ fiscal interest to emphasize service provision and hours billed rather than to work toward employment and long-term stability. This demonstrates a “triumph of process over results” (Kettner & Martin, 1993, 62) and further leads to high service costs and poor employment outcomes. Therefore, there was an impetus for a more effective contracting approach that simultaneously considers valued employment outcomes and the costs of achieving those outcomes.
Inspired by the national performance movement, PBC emerged as a new approach in the purchase of VR placement services. Under PBC, contractors are compensated for the outcomes of services rather than the process of service delivery. Thus, the defining feature is payment for the valued accomplishments of service recipients. This transition from FFS to PBC aims to pay for meaningful and measurable employment outcomes at a defined cost. Contractors receive payment only if the service recipients they serve successfully achieve defined employment outcomes, such as assessment, obtaining employment, and job maintenance for a specific time period. For example, the provider may be reimbursed $1,000 when a service recipient finds a job and $1,500 when this client reaches stabilization on the job.

Table 10. Components of VR Services and Contract Type

| Service Component | | | Contract type |
|---|---|---|---|
| Inputs | Resources | Staff, facilities, … | Fee-for-service |
| Process | Program activities | Job assessment, development, coaching | Fee-for-service |
| Outputs | Service delivery | Completion of services | Performance-based |
| Outcome (short-term) | Benefits of services | Job placement, retention | Performance-based |
| Outcome (long-term) | Long-term quality-of-life changes | Self-sufficiency, independence, and inclusion | Performance-based |

Source: Novak et al. (1999).

As Novak, Mank, Revell, and O’Brien (1999) argue, PBC in VR services promises a number of benefits: increased emphasis on valued outcomes and accountability for results, increased cost efficiency and effectiveness due to streamlined service delivery, and increased consumer choice and satisfaction. First, the PBC approach compensates contractors when service recipients attain successful employment outcomes, rather than reimbursing the amount of services delivered and time spent. The success of services lies not in the array or number of services provided but in the extent to which these services achieve desired results.
Along with this innovation, there is a change in the institutional environment of VR programs, from accountability for following rules and regulations to accountability for outcomes, in line with the government-wide performance movement. PBC thus makes contractors more accountable for aligning resources to achieve results.

Second, PBC promotes streamlined service delivery and improves cost efficiency and effectiveness. Under PBC, service providers are granted greater flexibility in service delivery in return for greater accountability for service performance. PBC deemphasizes regulation and micromanagement of contractor operations throughout the service process. Thus, the time spent on reporting and paperwork would be greatly reduced. The time saved on documentation and reporting is supposed to be devoted to carefully serving people with disabilities. This further encourages more cost-efficient and effective service delivery.

Third, with an outcome orientation, contractors are expected to work toward more effective service delivery. This will lead to more timely outcomes for service recipients and thus to increased customer satisfaction.

In short, PBC is expected to generate a triple win for VR programs: people with disabilities receiving quick and high-quality services, contractors enjoying less regulation and greater flexibility, and state VR agencies achieving better results at lower costs with greater accountability (Frumkin, 2001; O’Brien & Revell, 2005).

4.3 PBC Models in VR Services

Oklahoma Milestone Payment System

Oklahoma is a pioneer in the design and use of PBC in the purchase of VR services. The Oklahoma Department of Rehabilitation Services (DRS) began providing employment assessment and training services for people with severe mental and developmental disabilities in 1988, through contracting with community nonprofit service vendors.
After receiving rehabilitation services, eligible individuals were able to achieve placement in local communities. Typically, these services were purchased from nonprofit contractors on a fee-for-service basis that reimbursed them at hourly payment rates for all services provided.

However, the DRS soon found that the program incurred high costs but performed poorly in helping people with disabilities obtain integrated employment in their communities. In 1991, bringing one case to closure cost more than $22,000 and took 438 days on average (Frumkin, 2001). The DRS attributed this to the distorted incentives in the fee-for-service method: it emphasized contractor efforts in delivering services rather than in achieving employment outcomes through those services. This further led to an inverse relationship between contract payment, based on the amount of services provided, and employment outcomes.

To address the problem, the DRS designed the Milestone Payment System, in which contractors were reimbursed when service recipients reached each of a series of milestones leading to employment and long-term stability. The DRS defined each milestone as a predefined checkpoint on the way to a desired outcome, such as case assessment, job placement, and job retention. Each milestone might include quality outcome indicators to be accomplished before payment. For example, consumer and employer satisfaction with job placement and a minimum working wage were used as the quality indicators for the placement milestone. Each milestone was associated with a fixed-rate payment, with higher payments toward the later milestones. The payment rate at each milestone would reflect the average cost of achieving the specific milestone rather than the cost of staff time (as under the FFS model). Payment rates were negotiated for each milestone.
The DRS solicited bids from community rehabilitation programs, allowing vendors to include the average cost per closure from the previous year multiplied by the estimated number of closures for the contract year. The DRS then reviewed the bids primarily based on the per-customer bid price and the average cost per closure, as well as past service history. After that, the DRS negotiated with community vendors to reach agreements (Frumkin, 2001). An example of a milestone payment structure is: (1) determination of consumer needs: 10% of bid; (2) vocational preparation completion: 10% of bid; (3) job placement: 10% of bid; (4) 4 weeks job retention: 20% of bid; (5) job stabilization: 20% of bid; and (6) consumer rehabilitated (stabilization + 90 days): 30% of bid.

The milestone payment system was first piloted in 1992. After several years of piloting, the DRS converted all of its service contracts to the milestone approach in 1997. Effective July 1, 2001, the DRS moved the milestone payment system to statewide fixed rates. Table 11 provides an example of the Oklahoma milestone payment system for the purchase of supported employment services [4].

[4] Supported Employment Services is intended for individuals with the most significant barriers to employment who require: (1) substantial assistance in making a job choice, (2) substantial assistance in getting a job matching that choice, (3) a significant degree of job site support to learn the job tasks, gain work adjustment skills, and stabilize in employment, and (4) long-term support to retain employment.

Table 11. Oklahoma Milestone Payment System

| Milestone | Regular Rate | Highly Challenged Rate |
|---|---|---|
| Assessment and Career Planning | $625 | $625 |
| (Optional) Vocational Preparation | $625 | $625 |
| Job Placement | $1,688 | $3,125 |
| 4 Weeks Job Support | $2,250 | $1,875 |
| 8 Weeks Job Support | $1,688 | $1,875 |
| Job Stabilization | $2,125 | $2,125 |
| Successful Employment | $2,875 | $4,125 |

| Milestone | Outcome Description |
|---|---|
| Assessment and Career Planning | A determination of the individual’s informed job choice has been made, and the specific supports the individual will need to perform the chosen job successfully have been identified. |
| Vocational Preparation | The individual has clarified his/her career/employment objectives which include short-term and long-term vocational goals developed collaboratively with the individual. |
| Job Placement | The individual has been placed in a job of his/her choice meeting the requirements of supported employment and the objective in the IPE. An individual under this contract may not become an employee of the Contractor. Job placement is complete when the individual has completed the fifth day of work. |
| 4 Weeks Job Support | The individual has worked successfully for a minimum of four weeks, beginning with the first day of employment (Note 1). |
| 8 Weeks Job Support | The individual has worked successfully for a minimum of 8 weeks total and has received the appropriate support services (Note 1). |
| Job Stabilization | The individual has worked successfully for the minimum required weeks (a total of 12 weeks for individuals receiving services under the regular rate and 17 weeks for individuals who are highly challenged) and is working the weekly work goal as identified in the IPE (Note 2). |
| Successful Employment | The individual has been employed a minimum of 90 days beyond stabilization and the case is ready for closure (Note 2). |

Note 1: Only weeks in which the work hours exceed 40% of the weekly work goal, and in which on-site and/or off-site supports are provided, will be counted towards the minimum four weeks of this milestone.
Note 2: Only weeks in which hours worked meet the weekly work goal, and where needed supports were provided, will be counted.
Source: Metro employment services contract 2012, Department of Rehabilitation Services, State of Oklahoma

Under the milestone contracting approach, service contractors were reimbursed when the clients they served achieved certain milestones along the way to successful employment. The DRS did not specify the vocational methods to be used; vendors had flexibility in how they achieved the specified outcomes. To encourage contractors to take on more difficult clients, the milestone system used a two-tiered payment rate, with a different rate for serving highly challenged clients. The VR counselor, working with the individual and the contractor, designated the services to be used and whether the individual fit the regular or the highly challenged rate. Services would be purchased on an individual basis as authorized by the counselor. Each milestone would be pre-authorized by the counselor and paid only once per case, per contractor, upon receipt and acceptance of the required documentation for payment by the counselor. Payment of a milestone would constitute payment in full for all services delivered during that phase of the program.

In short, the milestone payment system created different incentives for contractors. Under the hourly payment system, the provider generated more income by delivering more services before placing customers. Under the milestone payment system, providers’ incomes improved when consumers got jobs of their choice as rapidly as possible (O’Brien & Revell, 2005).

The Oklahoma PBC system received extensive recognition and was introduced by other states, including Alabama (Valerie, Howard, Dan, Byron, & Amy, 2000), Colorado (Block, Athens, & Brandenburg, 2002), Indiana (McGrew, Johannesen, Griss, Born, & Katuin, 2005), and New York (Gates et al., 2004), into their purchase of VR services. Although there are small variations in the PBC systems across states, all these systems were modeled after Oklahoma.
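The incentive structure of Table 11 can be illustrated with a short calculation. The following is a minimal sketch, not part of the DRS contract: the rates are copied from Table 11, while the function name `cumulative_payment` and the assumption that every earlier milestone (including the optional vocational preparation milestone) has been authorized and paid exactly once are hypothetical simplifications introduced here for illustration.

```python
# Rates copied from Table 11, in dollars: (milestone, regular, highly challenged).
# The list order follows the sequence in which milestones are reached.
OKLAHOMA_RATES = [
    ("Assessment and Career Planning", 625, 625),
    ("Vocational Preparation", 625, 625),  # optional milestone in Table 11
    ("Job Placement", 1688, 3125),
    ("4 Weeks Job Support", 2250, 1875),
    ("8 Weeks Job Support", 1688, 1875),
    ("Job Stabilization", 2125, 2125),
    ("Successful Employment", 2875, 4125),
]


def cumulative_payment(last_milestone, highly_challenged=False):
    """Hypothetical helper: total paid to a contractor for one case once
    `last_milestone` is reached, assuming all earlier milestones were
    authorized and each was paid exactly once."""
    total = 0
    for name, regular, challenged in OKLAHOMA_RATES:
        total += challenged if highly_challenged else regular
        if name == last_milestone:
            return total
    raise ValueError(f"Unknown milestone: {last_milestone}")


if __name__ == "__main__":
    for name, _, _ in OKLAHOMA_RATES:
        print(name, cumulative_payment(name), cumulative_payment(name, True))
```

Under these assumptions, a contractor whose client stops at job placement receives only a fraction of what a case carried through to successful employment would pay, which is the back-loading that the milestone approach relies on.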
New York PBC Demonstration

The New York State Office of Mental Health implemented a 2-year demonstration of PBC to promote employment outcomes (placement and retention) for people with serious mental health conditions, starting in 2000 (Gates et al., 2004). Before the initiative, a traditional fee-for-service method was used, under which providers were paid quarterly advances for hours spent working with clients regardless of consumer outcomes. The demonstration model included 6 milestone payment points: life skills assessment, vocational planning and initial job placement, job skills acquisition, retention at 3 months, retention at 6 months, and retention at 9 months. Each milestone was associated with a fixed-rate payment, with higher payments toward later milestones. The rate was determined by the government agency, factoring in provider-estimated costs and provider-estimated consumer success rates at each milestone. Additional funding was available for long-term support, encouraging contractors to offer time-unlimited support to consumers once they had completed milestone VI (retention at 9 months). Like other PBC models in VR, New York also developed a two-tiered payment structure to avoid the creaming problem. Providers serving the most difficult clients would receive 20% more in payment than they would for serving standard clients.

Table 12. New York State Milestone Payment System

| Milestone | Standard Rate | Incentive Eligible Rate |
|---|---|---|
| Life Skills Assessment | $750 | $900 |
| Career Planning & Initial Job Placement | $750 | $900 |
| Job Skill Acquisition for 4 Weeks | $1,500 | $1,800 |
| Job Retention at 3 Months | $1,500 | $1,800 |
| Job Retention at 6 Months | $1,875 | $2,250 |
| Job Retention at 9 Months | $1,125 | $1,350 |
| Long-term Job Supports | $1,300/year | $1,300/year |

Source: O’Brien and Revell (2006).

Indiana Result-based Funding System

To date, Indiana is the latest state to have changed from a traditional fee-for-service model to a performance-based model, or what the state calls result-based funding (RBF).
The transition began with a pilot project. In 2002, the Indiana Supported Employment Results-based Funding Pilot Project was launched by the Supported Employment and Consultation Training (SECT), the Office of Vocational Rehabilitation Services (VRS), and the Indiana Division of Mental Health and Addictions (DMHA) in supported employment services for individuals with severe mental health problems. Stakeholders, including government staff representatives, contractors, and consumers, were actively involved in the planning stage to determine the structure of the RBF system. The pilot RBF system included: (1) completion of the person-centered plan: $550 (10% of VRS funding), (2) consumer’s 5th day of employment: $1,100 (20% of VRS funding), (3) 1 month of employment: $1,100 (20% of VRS funding), (4) VRS eligible case closure: $2,750 (50% of VRS funding), and (5) 9 months of continuous employment: $1,000 (DMHA funding) (McGrew, Johannesen, Griss, Born, & Katuin, 2005). The total amount paid for milestones 1-4 reflected the statewide historical average paid by VR per successful case closure under the FFS model, plus an amount equal to the average cost of providing services for individuals who fail to reach case closure. Milestone 5 was used as an extra bonus to incentivize long-term retention.

The Indiana Bureau of Rehabilitation Services changed its statewide contracting approach to RBF in late fiscal year 2006. Table 13 provides an example of its RBF system. The emphasis of RBF was placed upon structuring a service contracting method that would increase the likelihood of both initial job placement and long-term tenure. Under RBF, contractors received reimbursement at a fixed rate once consumers reached predetermined stages in the employment process. VR counselors would make the decisions on milestone authorization and on the tier each individual would enter. Each milestone should be authorized at the completion of the prior milestone.
Providers should not provide services without proper authorization. Substantial progress towards the vocational goals needed to be demonstrated throughout the service process.

Table 13. Indiana Result-based Funding System

| Milestone | Tier I Rate (for people who need ongoing support) | Tier II Rate (for people who do NOT need ongoing support) |
|---|---|---|
| 1. Plan for Employment & Supports | $1,200.00 | $600.00 |
| 2. Job Placement | $1,200.00 | $900.00 |
| 3. Four Week Placement | $1,864.00 | $1,325.00 |
| 4. Eligible for Closure | $4,000.00 | $2,600.00 |
| TOTAL | $8,264.00 | $5,425.00 |

Note: Tier One: for a person who (1) qualifies as most severely disabled as defined in the state policy, (2) requires multiple services over an extended period of time, and (3) is likely to need ongoing, intensive intervention to get and keep a job. Tier Two: for a person who (1) has a disability, severe disability, or most severe disability, and (2) would not require ongoing, intensive intervention to get and keep a job.

| Milestone | Outcome Description |
|---|---|
| 1. Plan for employment & supports | A plan for employment and supports developed by the customer and his/her support team. The team is comprised of the customer, Vocational Rehabilitation Counselor, employment service provider, and any other stakeholder or individual the customer desires to participate in the meeting. |
| 2. Job placement | The customer has worked one week at the hours-per-week work goal (e.g., based upon hours scheduled) in the vocational area identified in the Plan for Employment and Supports. |
| 3. Four week placement | The customer has worked four weeks in which he/she met the hours-per-week work goal (e.g., based upon hours scheduled) and pay rate as stated in the Plan for Employment and Supports. The customer is satisfied with the job. The employer has indicated satisfaction with the employee. |
| 4. Eligible for Closure | The customer has maintained employment for 60 calendar days (for those eligible for Supported Employment services) or 90 calendar days for others. The customer is employed in a job as outlined in his/her Plan for Employment and Supports that is commensurate with his/her skills and abilities. The customer meets VRS closure criteria. Customer and employer are satisfied and this is documented (verbal or written reports). |

Source: Indiana Bureau of Rehabilitation Services (2006).

4.4 The Use of PBC in VR Services

As can be seen from the discussion above, the use of PBC in the purchase of job-related services has become widespread. More interestingly, unlike the PBC systems used in other human service areas mentioned in chapter two, the PBC systems piloted and used in VR services in different states are roughly the same, most of them modeled after Oklahoma. This implies that PBC in VR services has been relatively stable and mature, which provides a good policy field for this study to examine the question proposed in previous chapters: whether PBC is better than FFS.

Generally, the design of a PBC system in VR employment services involves three common components: (1) defining the desired employment outcomes, (2) defining the payment point for each outcome, including criteria for determining achievement of each outcome, and (3) establishing a fee structure for payment points (Novak, Mank, Revell, & O’Brien, 1999). First, VR employment services generally proceed through several stages: establish a job goal, become employed, stabilize in employment, and continue in employment. The design of the desired service outcomes is in line with these stages. Second, the selection of specific benchmarks and criteria to qualify a contractor for reimbursement needs to include consumer and employer satisfaction and the quality and stability of services. Third, the fee structure reflects the average cost of serving an individual within each defined outcome.
Indeed, the fee should include contractor costs associated with serving individuals who reach an employment outcome and costs historically associated with serving individuals who fail to achieve employment outcomes.

Although some studies report promising results after the adoption of PBC, systematic evaluations of PBC effectiveness in VR services are still missing. For example, after the introduction of the milestone payment system in Oklahoma, one study observed that customers’ time on the waiting list fell by 53%, time before placement fell by 18%, time from placement to success fell by 45%, and the number of people assessed but not placed fell by 25%[5]. However, such before-after observation is not free from methodological problems.

Further, two other problems related to PBC effectiveness are not well explored. The first is the customer selection or “creaming” problem. Due to the outcome orientation, contractors under PBC may prefer serving individuals who appear to be easier to place and thus have disincentives to serve people with the most significant disabilities. Contractors could maximize earnings by serving only the most readily employable people at the expense of serving those with more significant support needs. Although the PBC systems used in VR services all include two tiers of payment to account for the cost variance between serving regular customers and serving difficult customers, the effectiveness of such a design is still largely unknown.

Second, the quality of employment outcomes deserves more attention. Under PBC, contractors may be less likely to invest in job matches and job development to ensure quality employment. A good placement requires extensive services in job preparation and matching for job seekers, which extends contractors’ service time and cost. However, PBC fosters the achievement of employment milestones in a timely manner.
In fact, the quality of these milestone achievements and their long-term effects are hard to define and thus are not attached to milestone payments. Although some quality indicators, such as customer satisfaction and working hours and wages at placement, are included, they may not guard against contractors’ gaming behaviors. More broadly, this raises the question of PBC effectiveness on the aspects of employment outcomes that are not specifically measured in the PBC systems: do contractors game the PBC systems? With these questions in mind, we turn to the next two chapters, which examine PBC effectiveness from both quantitative and qualitative perspectives.

[5] www.onenet.net/~home/milestone.

Chapter 5. The Effectiveness of PBC: A Service Outcome Perspective

5.1 Introduction

This chapter starts to evaluate the effectiveness of PBC. The performance of a contractual network, as Provan and Milward (2001) argue, is a multi-dimensional construct spanning the community, the network, and participating organizations, each with different effectiveness criteria (Figure 5). At the community level, networks are evaluated by the contribution they make to the communities and the clients they serve in addressing certain policy problems. In this way, the community perspective of network effectiveness means “first by assessing aggregate outcomes for the population of clients being served by the network, and second, by examining the overall costs of treatment and service for that client group within a given community” (Provan & Milward, 2001, 417). At the network level, effectiveness should consider the operation of the network structure per se. Thus, effectiveness is evaluated based on the growth and function of the network as a whole, such as membership growth, range of services provided, and network maintenance. In addition, network effectiveness needs to recognize the participating organizations involved, their individual survival and success in particular.
This organization perspective would assess the network on client outcomes, agency survival, legitimacy, resource acquisition, and cost.

Figure 5. Analytical Framework for Network Effectiveness
[Figure: community-level, network-level, and organization/participant-level effectiveness, linked to key stakeholders (principals, agents, and clients)]
Source: Provan & Milward (2001).

Given the bilateral nature of the government-contractor structure in PBC, with no network-level arrangement involved, this project assesses PBC effectiveness mainly at the community level and the participating-organization level. This chapter takes a community-outcome perspective, leaving the other perspective to the next chapter. Specifically, it explores whether PBC contributes to the improvement in client well-being.

The research design is to quantitatively compare the employment outcomes under two contracting models, FFS and PBC. The unit of analysis is the individual client receiving placement services in Indiana. As mentioned, Indiana is the latest state in the transition from FFS to PBC, which provides a case that allows using administrative data to examine the policy impact of the PBC intervention. Several proxies for employment outcomes are identified: likelihood of employment, time to placement, job retention, and wage. The first two are directly targeted by performance measurement in the Indiana RBF system. PBC motivates contractors to move through the performance milestones and across the employment process quickly in order to receive reimbursement. Thus, I predict:

H1 After using PBC, clients are more likely to attain employment.
H2 After using PBC, clients are able to achieve employment in less time.
In addition to these two indicators, two further employment quality indicators (job retention and wage), not directly targeted in the RBF system, are also included to examine whether any performance improvement in employment likelihood and time to placement is attained through gaming other, unmeasured dimensions of performance. As we may notice, these hard-to-measure performance areas are often excluded from performance measures in PBC systems. Even in the Indiana RBF system, job retention (in terms of working hours) and wages are measured only by minimum standards such as the state minimum hourly wage. Indeed, by leaving high discretion to contractors during the service process, PBC implicitly assumes nonprofit contractors would work with clients meticulously and innovatively and help them secure high-quality employment. Thus, this research also tests:

H3 After using PBC, clients are able to achieve longer job retention.
H4 After using PBC, clients are able to gain higher wages.

5.2 Research Design

This research uses an experimental-design framework to examine the treatment effect of PBC in Indiana. Experimental designs are widespread in examining the treatment effects of policy interventions. The treatment effect in an experiment, intuitively, is the net difference between the condition of a unit after receiving a treatment and the condition of that unit had it not received that treatment. However, these two conditions cannot be observed at the same time in real life: we can only observe one of these potential conditions, not both. This “missing data problem” (Rubin, 1976) constitutes what Holland (1986) called the “fundamental problem of causal inference” (947). Thus, the core task of a policy or program evaluation study is constructing a counterfactual outcome to estimate the unobserved outcome. This implies that we have to find a control group that approximates the treatment group as closely as possible in various aspects.
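The argument above can be restated compactly in potential-outcomes notation. The symbols below ($Y_i(1)$, $Y_i(0)$, $D_i$, $\tau_i$) are introduced here only for illustration and are an assumption about notation, not the notation used elsewhere in this study.

```latex
% Unit-level treatment effect: only one of the two potential outcomes is observed.
\tau_i = Y_i(1) - Y_i(0), \qquad
Y_i^{\mathrm{obs}} = D_i\,Y_i(1) + (1 - D_i)\,Y_i(0)

% Average treatment effect on the treated; the second term is the
% unobserved counterfactual that a control group must approximate.
\mathrm{ATT} = E[\,Y(1) \mid D = 1\,] - E[\,Y(0) \mid D = 1\,]

% A naive comparison of group means equals the ATT plus a selection-bias term.
E[\,Y^{\mathrm{obs}} \mid D = 1\,] - E[\,Y^{\mathrm{obs}} \mid D = 0\,]
  = \mathrm{ATT} + \bigl( E[\,Y(0) \mid D = 1\,] - E[\,Y(0) \mid D = 0\,] \bigr)
```

The term in parentheses vanishes only when the control group's observed outcome is a valid stand-in for the treated group's counterfactual outcome.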
To put it another way, ideally, there should be no selection bias between the counterfactual outcome of the treatment group and the observed outcome of the control group.

When the unit of analysis is a group and the analytical objective is the average treatment effect, the composition of the samples in the groups, or the assignment mechanism used to assign samples to either the treatment group or the control group, becomes relevant. A key to estimating causal effects is to identify a control group whose units share the characteristics of those in the treated group, so that the distribution of covariates[6] is the same for the treated and control groups. Generally, there are two experimental designs associated with two different assignment mechanisms: randomized experiments, in which units are assigned to different conditions randomly, and quasi-experiments, in which units are assigned to conditions not by chance.

[6] A covariate is a variable that is measured before the treatment and thus is not affected by the treatment, such as many demographic variables.

Admittedly, the randomized experimental design is the most desired design in evaluation studies. In this true experimental setting, the randomized controlled trial offers a robust and straightforward means to assess treatment effects. Random assignment guarantees that there are no systematic preexisting differences between comparison groups before treatment, and that any differences in background covariates between the two groups before the treatment are random, due to chance. Randomization thus removes selection bias, ensuring that all the characteristics of units are equally distributed between groups in expectation. The intervention is therefore the only differentiating factor between groups. The average difference observed in outcomes can be attributed to the impact of the intervention (and possibly to sampling error if the sample is not large enough).
However, for various ethical and practical reasons, this ideal randomized experimentation is not feasible in the current research. Instead, I resort to a quasi-experimental design. In a quasi-experimental setting, samples are “collected through the observation of systems as they operate in normal practice without any intentions implemented by randomized assignment rule” (Rubin, 1997, 757). In this way, assignment of conditions is determined by factors beyond the experimenter’s control, such as self-selection and administrator selection (Shadish, Cook, & Campbell, 2002, 13-14). Thus, it is very likely that “the treated and control groups differ prior to treatment in ways that matter for the outcomes under study” (Rosenbaum, 2002, 71). In this case, systematic differences in characteristics other than the treatment may influence outcomes, and directly comparing outcomes may fail to provide robust answers. Thus, more complicated research designs are needed to address this problem.

5.3 Interrupted Time Series with a Control Group Design

Although quasi-experiments, if well designed, are able to reproduce the same results as randomized experiments, extra attention should be paid to correcting the potential challenges. This study uses an interrupted time series with a nonequivalent control group design, diagrammed in Figure 6, to examine the treatment effect of performance-based contracting. It compares individual employment outcomes in Indiana VR programs before and after the PBC intervention over the period 2004-2009. As mentioned, Indiana, as the treatment group, changed the purchase of VR placement services from FFS to PBC at the end of fiscal year 2006. Michigan, the only state neighboring Indiana that continued to rely fully on FFS throughout the period, is added as a control group. The repeated cross-sectional data for analysis was requested from the Rehabilitation Services Administration (RSA) of the Department of Education.
The RSA 911 database contains records for all individuals whose cases were closed in a given fiscal year, including the personal characteristics, types of services, and employment outcomes of all clients receiving state VR services.

Figure 6. Interrupted Time Series with a Nonequivalent Control Group Design

| | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 |
|---|---|---|---|---|---|---|
| IN | O1 | O2 | O3 | X | O5 | O6 |
| MI | O1 | O2 | O3 | O4 | O5 | O6 |

Here, the research relies on Campbell and Stanley’s (1963) typology of threats to internal validity in quasi-experimental designs. The type of quasi-experimental design used here is robust in removing most of the threats to internal validity, such as maturation, testing, and regression, but is still subject to instrumentation, local history, and selection (Cook & Campbell, 1979; Shadish et al., 2002). Thus, in order to gain a more accurate estimate of the treatment effect, these three potential threats should be minimized as much as possible before comparing treatment and control groups.

Instrumentation may bias causal inference when different administrative procedures and measures are used to record participants’ performance over time. However, this is not a major concern for state VR programs. Under the Rehabilitation Act, all the administrative and service components, procedures, and standards are subject to rigorous federal regulation. For example, RSA conducts annual reviews and periodic on-site monitoring of state VR programs to ensure they comply with program and performance requirements under the Rehabilitation Act. In this way, consistency within and between states can be expected.

A serious threat comes from selection bias, i.e., differences between individuals in the treatment and control groups. To solve this problem, matched sampling is used to correct the observed imbalances between the two states.
Matched sampling is a resampling strategy, “selecting units from a large reservoir of potential controls to produce a control group of modest size that is similar to a treated group with respect to the distribution of observed covariates” (Rosenbaum & Rubin, 1985, 33). After matching, the two comparison groups are balanced on a variety of observed variables, which in effect replicates a randomized experiment where the treatment assignment is unconfounded, at least given the observed covariates (Rosenbaum & Rubin, 1983; Rubin, 1973).

In particular, this study adopts propensity score matching to produce the matched sample. A propensity score, as Rosenbaum and Rubin (1983) define it, is “the conditional probability of assignment to a particular treatment given a vector of observed covariates” (41)[7]. Matching samples based on propensity scores allows a variety of covariates to be considered simultaneously. Rather than requiring exact or close matching on all covariates separately, propensity score matching enables matching on a scalar summary of the covariates. Given the propensity score, each unit has the same chance of being assigned to treatment, as in a randomized experiment. In essence, the propensity score is a balancing score. Given a propensity score, e(x), the distribution of the observed covariates x is the same in both treatment and control groups. Matching treated and control units on the propensity score can create new comparison groups that are identical on a vector of those observed covariates (homogeneous)[8], replicating a randomized experiment based on these covariates. Rosenbaum and Rubin (1983) also proved that treatment assignment and the observed covariates are conditionally independent given the propensity score.

[7] The propensity score for subject i (i = 1, …, N) to be assigned to treatment (Z = 1) versus control (Z = 0), given a vector of observed covariates x_i, is e(x_i) = Pr(Z_i = 1 | x_i).
Thus, exact matching on the propensity score removes bias due to all observed covariates and produces an unbiased estimate of the average treatment effect, measured by the difference in mean outcomes between the treated and control groups.

Local context might also bias causal inference when the individuals in comparison groups reside in different settings. To address this issue, this study chooses Michigan as the control group against Indiana, aiming to maximize the socio-economic similarities between the two. In addition, I use difference-in-differences (DID) regressions after matched sampling to further adjust for unobserved imbalance. Under the DID model, any bias caused by exogenous variables common to Indiana and Michigan is implicitly controlled for, even when these variables are unobserved. There is indeed some evidence to support the common trend assumption of the DID model during 2004-2009. The state-level factors that might affect employment outcomes, including GDP growth, unemployment rate, average weekly earnings, and VR program capacity (measured by the average number of clients served per program staff member), were found to roughly follow the same trend (see Appendix 1). I also reviewed the annual review reports of the Indiana and Michigan VR programs and did not find any major policy changes in the purchase of employment services. Therefore, I have reasonable confidence in assuming that the two states have parallel trends over time.

[8] It is possible that two units with the same propensity score may differ on a certain observed covariate, but those differences are not systematic (Guo & Fraser, 2010).

Indeed, running DID regressions on matched samples offers a number of advantages. First, the combination of the two methods is most robust and efficient in removing the biases due to covariates and estimating the treatment effect on the treated (Abadie & Imbens, 2006; Heckman, Ichimura, & Todd, 1997; Rubin, 1973; 1979).
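The difference-in-differences logic described above can be illustrated with a minimal sketch. This is the simple group-mean version for two groups and two periods, not the regression specification estimated in this study; the function name `did_estimate` and the toy records are hypothetical. In a saturated two-group, two-period regression, the coefficient on the treatment-by-period interaction equals this quantity.

```python
from statistics import mean

def did_estimate(records):
    """Two-group, two-period difference-in-differences on cell means.

    records: iterable of (treated, post, outcome) tuples, where treated
    and post are 0/1 flags. Returns the change in the treated group
    minus the change in the control group.
    """
    cells = {(g, t): [] for g in (0, 1) for t in (0, 1)}
    for treated, post, outcome in records:
        cells[(treated, post)].append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])

# Hypothetical toy data: (treated state?, after the policy change?, outcome)
toy = [
    (1, 0, 10.0), (1, 0, 12.0),   # treated group, before
    (1, 1, 16.0), (1, 1, 18.0),   # treated group, after
    (0, 0, 9.0), (0, 0, 11.0),    # control group, before
    (0, 1, 12.0), (0, 1, 14.0),   # control group, after
]
print(did_estimate(toy))  # 3.0
```

In the toy data the treated group improves by 6 and the control group by 3, so the shared trend of 3 is netted out. The estimate is only credible to the extent that the two groups would have followed parallel trends without the intervention.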
A major problem in the use of matched sampling is inexact matching: it is not always possible to find enough matched treatment and control samples with exactly the same observed characteristics (Rubin, 1979). This is especially the case as the number of matching variables increases. Given imperfect matching, the estimated treatment effect might not be accurate. However, when matched sampling and model-based regression are combined, matched sampling substantially reduces observed covariate differences, and model-based adjustment afterwards can further control for residual differences.

Second, matched sampling relaxes the DID identification restrictions. Model-based regression adjusts for the effect of confounding variables by estimating the relationship between the dependent variable and the confounding variables. The major problem associated with this method is that the model assumptions may be unwarranted in many cases. For example, the assumption of linear relationships between the dependent variable and the matching variables may not be justified (Rubin, 1979). Thus, the combined method makes model-based adjustment less sensitive to model specification. This again allows the estimation of parsimonious parametric approximations of the average treatment effect on the treated (Abadie, 2005; Ho, Imai, King, & Stuart, 2007).

Guo and Fraser (2010), through a data simulation, also show that under ideal conditions such as a randomized experiment, both methods work equally well, leading to accurate estimates of the treatment effect with biases close to zero. However, in quasi-experimental situations, especially when treatment assignment is not ignorable, each method removes bias to some extent, but neither alone produces an unbiased estimate of the treatment effect. Also through a simulation, Rubin (1973; 1979) finds that model-based adjustment could produce smaller standard errors than matched samples when the model is correctly specified.
However, when the model is inaccurate, model-based adjustment is less robust and may increase biases rather than remove them. Given this, the combination of matched sampling and regression adjustment is suggested to be the most robust method for producing the least biased estimate and controlling the biases due to imbalances in observed covariates (Cochran & Rubin, 1973; Rubin, 1979).

In short, as Abadie (2005) suggests, this combined method “allow[s] for the distribution of both observed and unobserved factors to differ between treated and untreated, as long as the effect of unobserved factors on the outcome does not vary with time (or, more generally, if it experiences the same variation, on average, for treated and untreated)” (5).

5.4 Propensity Score Matching

Propensity score matching was first used to produce matched samples. When conducting propensity score matching, I followed the procedures suggested by Caliendo and Kopeinig (2008) and Guo and Fraser (2010).

1. Specification of Conditioning Model

The first step in conducting propensity score analysis is determining which covariates and which conditioning model to use to estimate the propensity score. After all, the accuracy of the specification of covariates and models affects the effectiveness of propensity score analysis and the final estimate of the treatment effect (Heckman, Ichimura, & Todd, 1997; Rubin, 1997). However, no guideline in the current literature on propensity score analysis provides definitive answers.

Theoretically, in order to meet the assumption of ignorable treatment assignment, all covariates that might be related to treatment assignment and the outcome should be included in the conditioning model (Glazerman, Levy, & Myers, 2003; Rubin & Thomas, 1996; Stuart & Rubin, 2007). Omitting important variables would seriously increase bias in the resulting estimates (Dehejia & Wahba, 1999). Shadish et al.
(2008) warn that relying only on a small set of “predictors of convenience,” such as demographic factors, would lead to poor matching performance. However, in most cases there is no explicit, comprehensive list of such conditioning variables. Therefore, to satisfy the assumption of strong ignorability as far as possible, scholars generally suggest including a large set of covariates of theoretical relevance (Greevy, Lu, Silber, & Rosenbaum, 2004; Lunceford & Davidian, 2004). Including variables that are only weakly associated with the outcome might slightly increase variance, but excluding potentially important variables would increase bias. As Rubin and Thomas (1996) argue, “unless a variable can be excluded because there is a consensus that it is unrelated to outcome or is not a proper covariate, it is advisable to include it in the propensity score model even if it is not statistically significant” (253). This study follows this convention.

In view of theoretical relevance and data availability, this study includes three categories of covariates, listed in Table 14: demographic background (age, education, race, gender, veteran status, primary disability, secondary disability), pre-service status (employment status, work disincentives, previous service status, Projects with Industry status), and employment service received (number of placement services received). All are thought to be related to either treatment assignment or outcome.

Table 14. Description of Matching Variables

| Matching variables | Description and Measurement |
|---|---|
| Demographic Background | |
| Age | An individual’s age at service application |
| Education | An individual’s level of education attained at application, with 0=less than high school, 1=special education, 2=high school graduate, 3=post-secondary/associate degree, and 4=college degree or higher |
| Race | An individual’s race and ethnicity, with 0=black or African American, 1=native American (American Indian, Alaska native, native Hawaiian, or other Pacific islander), 2=Asian, 3=white, 4=Hispanic or Latino |
| Gender | An individual’s gender status, with 0=male, 1=female |
| Veteran | An individual’s veteran status, with 0=not a veteran, 1=veteran |
| Primary disability | An individual’s primary physical or mental impairment, with 0=sensory/communication impairments, 1=physical impairments, 2=mental impairments |
| Secondary disability | An individual’s second physical or mental impairment, with 0=no impairment, 1=sensory/communication impairments, 2=physical impairments, 3=mental impairments |
| Pre-service Status | |
| Employment status | An individual’s employment status at application, with 0=not employed, 1=employed |
| Work disincentives | The number of public supports an individual had at application, including supplemental security income (SSI), Temporary Assistance for Needy Families (TANF), general assistance from state or local government, social security disability insurance (SSDI), veterans’ disability benefits, workers’ compensation, Medicaid, Medicare, medical insurance not through employment, and others |
| Previous service status | Whether an individual had received previous employment service, with 0=no previous closure, 1=closed before services, 2=closed after services |
| Participation in Projects with Industry | Whether an individual participates in the Projects with Industry program, with 0=no, 1=yes |
| Employment Services | |
| No. of placement services received | The number of employment services an individual received throughout the service process, including job search assistance, job placement assistance, and on-the-job supports |

The propensity score, in essence, is a balancing score representing a vector of covariates. Unlike in randomized experiments[9], propensity scores in quasi-experiments are unknown and must be estimated. Propensity scores are often estimated using binary logistic regression with the observed covariates X as independent variables and treatment assignment D (D=1 for the treatment condition, D=0 for the control condition) as the dependent variable. The propensity score for unit i (i = 1, 2, …, N) is as follows:

$$e(X_i) = \Pr(D_i = 1 \mid X_i) = \frac{\exp(X_i \beta)}{1 + \exp(X_i \beta)}$$

where $\beta$ is the vector of regression parameters.

[9] In randomized experiments, each unit has a 50% probability of being assigned to either the treatment or the control group. Thus, the propensity score for each unit is 0.5, without considering sampling error.

In using logistic regression to predict the value of the propensity score, the aim of the modeling is not to estimate the parameters, but to balance the covariates between treatment and control groups. Thus, many traditional regression diagnostics, such as collinearity checks and model fit statistics, are no longer helpful in model specification here. Rather, the balancing property of the propensity score is used to justify a model specification (Dehejia & Wahba, 1999; Rosenbaum & Rubin, 1984). Researchers find that treatment effect estimation is not sensitive to the model specification used to predict the propensity score, as long as the balancing property of the propensity score holds (Waernbaum, 2010; Zhao, 2008). Misspecification of the propensity score model under this condition would not bias the treatment effect estimate.
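To make the estimation step concrete, the following is a minimal sketch of fitting a logistic model for Pr(D=1 | X) and turning it into propensity scores. It is an illustration only: it uses plain gradient ascent in pure Python rather than the Stata routine used in this study, and the function names (`fit_logit`, `propensity_scores`), the learning rate, the iteration count, and the one-covariate toy data are all assumptions.

```python
import math

def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logit(X, D, lr=0.1, n_iter=5000):
    """Fit Pr(D=1 | X) by gradient ascent on the mean log-likelihood.

    X: list of covariate rows (lists of floats); D: list of 0/1 flags.
    Returns the coefficient list beta, intercept first.
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for row, d in zip(X, D):
            xi = [1.0] + list(row)
            resid = d - sigmoid(sum(b * x for b, x in zip(beta, xi)))
            for j in range(k + 1):
                grad[j] += resid * xi[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity_scores(X, beta):
    # e(X_i) = exp(X_i beta) / (1 + exp(X_i beta)) for every unit.
    return [sigmoid(sum(b * x for b, x in zip(beta, [1.0] + list(row))))
            for row in X]

# Hypothetical toy data: one covariate, higher on average among the treated.
X = [[0.0], [1.0], [2.0], [3.0], [1.0], [2.0], [3.0], [4.0]]
D = [0, 0, 0, 0, 1, 1, 1, 1]
beta = fit_logit(X, D)
scores = propensity_scores(X, beta)
```

Because the aim is covariate balance rather than parameter interpretation, the fitted scores, not the coefficients, are what feed the matching step.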
The present project adopts the strategy suggested by Dehejia and Wahba (1999) and Rosenbaum and Rubin (1984): correcting the functional form of covariates, sequentially adding higher-order terms and interaction terms of the observed covariates, and checking the balance of the covariates based on the propensity scores.

Specifically, I used the Stata program pscore.ado developed by Becker and Ichino (2002) to estimate propensity scores. This program helps ensure the balancing property of propensity scores, i.e., observations with the same propensity score should have the same distribution of observed characteristics, regardless of treatment status. The program first splits the sample into several equally spaced intervals of the propensity score and tests whether the mean propensity scores of the treatment and control units are statistically different within each interval. If the test fails in one interval, the program splits the interval in half and retests within each finer interval until the mean propensity scores of the treatment and control units are balanced. Then, within each interval, the program tests the means of each covariate to ensure that there is no statistical difference between treatment and control units. If one or more covariates are unbalanced in any interval, the balancing property is not supported by the current model specification, and the specification must be modified by adding more interaction and higher-order terms.

2. Choose Matching Algorithm

After estimating propensity scores, we move on to matching treatment units with control units based on the value of the propensity score. To date, a number of matching algorithms are available, including greedy matching, kernel matching, etc.
Dehejia and Wahba (2002) highlight three major decisions in the choice of matching methods: (1) which matching algorithm to use, (2) the number of control units to match with each treatment unit, and (3) whether to match with or without replacement. Matching with replacement tends to reduce bias, whereas matching without replacement tends to increase precision; the choice also involves a trade-off between inexact matching and incomplete matching. Unfortunately, there is no clear rule for determining which matching algorithm works best under what conditions; there is always a tradeoff between bias and efficiency. The choice depends largely on the data and the research design.

In this study, we use 1-to-1 nearest neighbor matching within a caliper without replacement, one of the most common matching algorithms, also known as greedy matching[10]. This matching algorithm randomly orders the treatment and control units, and selects for each treatment unit the control unit with the smallest distance from it. Once a control unit is matched to a treatment unit, it is removed from the control group without replacement. The most attractive feature of this matching algorithm is that it allows multivariate analysis to be used directly after matched sampling, without extra statistical adjustment (Guo & Fraser, 2010).

[10] The trade-off between bias and variance: matching one nearest neighbor minimizes bias at the cost of larger variance; matching using additional nearest neighbors increases the bias but decreases the variance. Matching with replacement keeps bias low at the cost of larger variance; matching without replacement keeps variance low at the cost of potential bias.

One limitation of this 1-to-1 nearest neighbor matching is that there is no restriction on the distance between two matched units, as long as they are nearest neighbors based on propensity scores. It is possible that these two units are very different in terms of propensity scores, but there is no one that is closer.
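The greedy 1-to-1 procedure just described can be sketched in a few lines. This is an illustrative implementation, not the routine used in this study: the function name `greedy_match`, the dictionary input format, and the toy scores are hypothetical. The optional `caliper` argument anticipates the distance restriction that the limitation above motivates; when it is omitted, every treated unit is paired with its nearest remaining control, however far away.

```python
import random

def greedy_match(treated, controls, caliper=None, seed=0):
    """1-to-1 nearest neighbor matching on propensity scores, without replacement.

    treated, controls: dicts mapping unit id -> propensity score.
    caliper: optional bound; a pair is accepted only if the absolute
             score distance is strictly below it.
    Returns a list of (treated_id, control_id) pairs.
    """
    rng = random.Random(seed)
    order = list(treated)
    rng.shuffle(order)             # random processing order of treated units
    available = dict(controls)     # copy, so matched controls can be removed
    pairs = []
    for t in order:
        if not available:
            break                  # no controls left to match
        c = min(available, key=lambda j: abs(treated[t] - available[j]))
        if caliper is not None and abs(treated[t] - available[c]) >= caliper:
            continue               # nearest control is too far; leave t unmatched
        pairs.append((t, c))
        del available[c]           # without replacement
    return pairs

# Hypothetical propensity scores.
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
controls = {"c1": 0.28, "c2": 0.60, "c3": 0.50, "c4": 0.10}
print(sorted(greedy_match(treated, controls, caliper=0.05)))
```

With the caliper, `t3` stays unmatched because its closest control is 0.35 away, which illustrates the trade-off between inexact matching (no caliper) and incomplete matching (with a caliper).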
Thus, I add a caliper of a quarter of a standard deviation of the propensity scores of the sample (Rosenbaum & Rubin, 1985) to the nearest neighbor matching, so that units are matched only when the absolute distance between their propensity scores falls within the predetermined caliper. The detailed algorithm is as follows: let $P_i$ and $P_j$ be the propensity scores for treatment unit $i$ and control unit $j$, let $C(P_i)$ be the set of control units matched to treatment unit $i$, and let $\varepsilon$ be the caliper. The control unit $j$ nearest to $P_i$ is matched to treated unit $i$, provided its estimated propensity score falls within the caliper of $P_i$. The matched sample sets are:

$$C(P_i) = \left\{\, j : |P_i - P_j| = \min_{k} |P_i - P_k| \ \text{and}\ |P_i - P_j| < \varepsilon \,\right\}$$

where the minimum is taken over the control units $k$ that have not yet been matched.

3. Balancing Tests

After matching, the preexisting statistical differences in covariate means between the two comparison groups are expected to be eliminated, so that the two groups are comparable in the sense that the distributions of observed covariates are the same in the treated and control groups. Before moving forward, we need to check covariate balance before and after matching to ensure that balance has actually been achieved.

I check covariate balance before and after matching using the absolute standardized difference in covariate means (D'Agostino, 1998; Haviland, Nagin, & Rosenbaum, 2007). The absolute standardized difference is the absolute value of the mean difference expressed as a percentage of the average standard deviation. For each covariate $X$, let $\bar{X}_t$ and $\bar{X}_c$ be the means in the treatment and control groups, and let $s_t^2$ and $s_c^2$ be the corresponding variances:

$$d_x = 100 \times \frac{|\bar{X}_t - \bar{X}_c|}{\sqrt{(s_t^2 + s_c^2)/2}}$$

The absolute standardized difference includes two standardized measures: $d_x$ contrasts covariate values for treatment units with covariate values of all the potential controls before matching, and $d_{xm}$ is the analogous measure that contrasts covariate values for treatment units with covariate values of the matched controls after matching (the subscript $m$ denotes matching). Table 15 shows the results of the covariate balance check.
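The matching and balance-checking steps described above can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions, not the Stata routine used in this study: the function names, the toy propensity scores, and the unit labels are all hypothetical, and the standardized difference assumes the denominator is the square root of the average of the two group variances.

```python
import math
import random
import statistics


def caliper_greedy_match(treated, controls, caliper, seed=0):
    """1-to-1 nearest neighbor matching within a caliper, without replacement.

    treated, controls: dicts mapping a unit id to its propensity score
    (hypothetical inputs). Returns a dict: treated id -> matched control id.
    """
    rng = random.Random(seed)
    order = list(treated)
    rng.shuffle(order)                  # random processing order of treated units
    available = dict(controls)          # controls that have not been used yet
    matches = {}
    for i in order:
        if not available:
            break
        # nearest remaining control by absolute propensity score distance
        j = min(available, key=lambda k: abs(treated[i] - available[k]))
        if abs(treated[i] - available[j]) < caliper:
            matches[i] = j
            del available[j]            # without replacement
    return matches


def abs_std_diff(x_treated, x_control):
    """Absolute standardized difference in means, as a percentage."""
    avg_sd = math.sqrt(
        (statistics.variance(x_treated) + statistics.variance(x_control)) / 2
    )
    return 100 * abs(statistics.mean(x_treated) - statistics.mean(x_control)) / avg_sd


# Toy propensity scores (made up for illustration only).
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
controls = {"c1": 0.28, "c2": 0.60, "c3": 0.33, "c4": 0.10}

# Caliper: a quarter of the standard deviation of all propensity scores.
all_scores = list(treated.values()) + list(controls.values())
caliper = 0.25 * statistics.stdev(all_scores)

matches = caliper_greedy_match(treated, controls, caliper)
# t1 -> c1 and t2 -> c2; t3 has no control within the caliper and stays unmatched.
```

In this toy example the unmatched treated unit is simply dropped, which shows the price of the caliper: closer matches at the cost of a smaller matched sample.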
For each year, the absolute standardized difference compares the covariate values of the treatment individuals with those of the control individuals before matching ($d_x$) and with those of the matched control individuals after matching ($d_{xm}$). T-tests examine the equality of covariate means in the treatment and control groups, both before and after matching.

Table 15. Covariate Balance Check Before and After Matching

(For individuals receiving employment services)

| Year | N_IN before matching | N_MI before matching | N_IN = N_MI after matching |
|---|---|---|---|
| 2004 | 2951 | 2148 | 1598 |
| 2005 | 3048 | 1143 | 955 |
| 2006 | 2673 | 1035 | 852 |
| 2007 | 2770 | 1213 | 1026 |
| 2008 | 2762 | 1098 | 970 |
| 2009 | 2569 | 887 | 785 |

| Covariate | 2004 d_x (%) | 2004 d_xm (%) | 2004 t-statistic | 2005 d_x (%) | 2005 d_xm (%) | 2005 t-statistic | 2006 d_x (%) | 2006 d_xm (%) | 2006 t-statistic | 2007 d_x (%) | 2007 d_xm (%) | 2007 t-statistic | 2008 d_x (%) | 2008 d_xm (%) | 2008 t-statistic | 2009 d_x (%) | 2009 d_xm (%) | 2009 t-statistic |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | 22.6 | 12.1** | 3.39 | 7.0 | 14.4** | 3.14 | 11.8 | 17.7** | 3.65 | 15.1 | 5.8 | 1.67 | 22.6 | 7.8 | 1.63 | 30.0 | 11.9** | 5.61 |
| Gender | 4.0 | 6.9 | 1.93 | 7.4 | 5.8 | 1.27 | 5.5 | 10.6** | 2.17 | 11.3 | 12.2** | 2.75 | 7.8 | 13.3** | 2.92 | 2.1 | 0.5 | 0.10 |
| Race | 23.7 | 1.8 | 0.50 | 6.7 | 6.1 | 1.30 | 12.7 | 7.2 | 1.46 | 12.5 | 5.4 | 1.63 | 17.0 | 2.0 | 0.40 | 14.0 | 5.8 | 1.07 |
| Education | 0.1 | 5.0 | 1.39 | 2.8 | 0.6 | 0.13 | 7.1 | 4.5 | 1.22 | 0.5 | 2.9 | 0.63 | 3.4 | 5.3 | 1.11 | 7.4 | 7.6 | 1.50 |
| Veteran status | 22.1 | 15.5** | 4.44 | 18.4 | 20.7** | 4.25 | 11.5 | 18.4** | 3.41 | 6.8 | 7.9 | 1.70 | 20.9 | 7.8** | 3.99 | 14.7 | 8.4 | 2.04 |
| Projects with industry | 2.9 | 2.8 | 0.71 | 7.7 | 1.9 | 0.45 | 4.6 | 0.0 | 0.00 | 8.9 | 1.9 | 0.58 | 2.8 | 2.3 | 0.38 | 9.8 | 2.1 | 0.45 |
| Primary disability | 12.0 | 5.4 | 1.47 | 18.7 | 16.7** | 3.59 | 10.7 | 7.3 | 1.53 | 15.1 | 11.4** | 3.29 | 15 | 7.7** | 2.15 | 15.3 | 3.4 | 0.88 |
| Secondary disability | 36.6 | 6.0 | 1.69 | 37.2 | 16.3* | 3.53 | 22.9 | 13.3** | 2.75 | 21.7 | 16.6 | 3.80 | 28.7 | 13.6** | 2.99 | 25.1 | 9.8 | 1.94 |
| Employment status | 10.8 | 3.5 | 0.94 | 4.4 | 0.0 | 0.00 | 7.8 | 6.5 | 1.33 | 1.0 | 2.8 | 0.63 | 9.2 | 1.7 | 0.48 | 0.8 | 1.2 | 0.23 |
| Work disincentives | 1.6 | 0.1 | 0.02 | 5.9 | 1.4 | 0.30 | 11.0 | 4.4 | 0.90 | 0.2 | 8.9 | 1.96 | 4.4 | 4.8 | 1.02 | 0.5 | 2.7 | 0.54 |
| Previous closure/service | 11.2 | 3.9 | 1.10 | 8.8 | 9.4** | 2.03 | 8.7 | 9.0 | 1.86 | 6.5 | 3.5 | 1.00 | 16.3 | 4.5 | 1.25 | 13.2 | 6.2 | 1.59 |
| No. of placement services received | 14.9 | 8.0** | 2.40 | 69.9 | 6.5 | 1.49 | 69.9 | 3.0 | 0.67 | 60.9 | 1.8 | 0.44 | 55.5 | 4.1 | 0.93 | 57.2 | 2.7 | 0.56 |

**significant at .05; two-tailed tests.

(For individuals with employment)

| Year | N_IN before matching | N_MI before matching | N_IN = N_MI after matching |
|---|---|---|---|
| 2004 | 1185 | 862 | 525 |
| 2005 | 1196 | 611 | 431 |
| 2006 | 1303 | 553 | 376 |
| 2007 | 1429 | 616 | 445 |
| 2008 | 1277 | 550 | 398 |
| 2009 | 1048 | 404 | 295 |

| Covariate | 2004 d_x (%) | 2004 d_xm (%) | 2004 t-statistic | 2005 d_x (%) | 2005 d_xm (%) | 2005 t-statistic | 2006 d_x (%) | 2006 d_xm (%) | 2006 t-statistic | 2007 d_x (%) | 2007 d_xm (%) | 2007 t-statistic | 2008 d_x (%) | 2008 d_xm (%) | 2008 t-statistic | 2009 d_x (%) | 2009 d_xm (%) | 2009 t-statistic |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | 21.1 | 8.4 | 1.38 | 22.5 | 8.7 | 1.76 | 23.1 | 8.6 | 1.69 | 9.7 | 0.9 | 0.19 | 36.4 | 15.3** | 2.97 | 25.0 | 17.8** | 2.98 |
| Gender | 0.5 | 5.5 | 0.90 | 7.5 | 7.6 | 1.14 | 10.1 | 5.0 | 0.68 | 14.7 | 14.9** | 2.20 | 11.0 | 3.6 | 0.71 | 4.0 | 5.6 | 0.68 |
| Race | 15.8 | 5.9 | 0.90 | 11.0 | 8.7 | 1.80 | 6.0 | 1.0 | 0.14 | 16.3 | 3.6 | 0.51 | 13.6 | 10.4 | 1.41 | 8.6 | 2.0 | 0.22 |
| Education | 3.7 | 4.7 | 0.73 | 8.9 | 6.0 | 1.39 | 22.2 | 7.0 | 1.34 | 11.7 | 5.1 | 1.05 | 13.2 | 12.7 | 1.70 | 13.4 | 4.6 | 0.54 |
| Veteran status | 25.2 | 12.5** | 2.43 | 16.3 | 13.6** | 2.96 | 44.1 | 18.2** | 3.25 | 19.2 | 12.0** | 2.33 | 4.9 | 3.9 | 0.45 | 7.0 | 9.1 | 1.08 |
| Projects with industry | 6.0 | 5.8 | 1.00 | 5.7 | 4.0 | 0.58 | 3.9 | 0.0 | -- | 5.7 | 5.1 | 0.58 | 8.6 | 0.9 | 0.17 | 12.4 | 4.0 | 0.58 |
| Primary disability | 17.6 | 14.0** | 2.23 | 23.2 | 18.8** | 2.7 | 15.4 | 7.9 | 1.55 | 10.4 | 10.3 | 1.93 | 20.4 | 15.8** | 3.03 | 8.2 | 1.1 | 0.19 |
| Secondary disability | 35.3 | 13.8** | 2.21 | 31.1 | 12.9 | 1.91 | 23.9 | 18.3** | 3.63 | 21.4 | 23.1** | 3.51 | 25.9 | 7.2 | 1.35 | 33.4 | 15.5 | 1.89 |
| Employment status | 9.6 | 5.5 | 0.88 | 2.9 | 1.2 | 0.16 | 4.8 | 1.1 | 0.22 | 11.0 | 2.0 | 0.04 | 5.4 | 0.2 | 0.04 | 7.3 | 1.4 | 0.24 |
| Work disincentives | 0.5 | 1.7 | 0.27 | 12.0 | 7.4 | 1.08 | 17.7 | 13.8** | 3.10 | 5.9 | 1.0 | 0.15 | 21.1 | 4.2 | 0.82 | 19.5 | 7.1 | 0.85 |
| Previous closure/service | 5.5 | 3.0 | 0.48 | 9.4 | 0.7 | 0.15 | 13.9 | 6.3 | 1.24 | 12.4 | 2.8 | 0.58 | 19.7 | 5.8 | 1.14 | 3.3 | 5.6 | 0.70 |
| No. of placement services received | 21.2 | 1.7 | 0.34 | 89.8 | 3.1 | 0.52 | 98.2 | 0.0 | 0.00 | 89.4 | 1.2 | 0.21 | 88.9 | 1.3 | 0.22 | 90.7 | 6.2 | 0.83 |

**significant at .05; two-tailed tests.

For good matching, $d_{xm}$ should be less than 5% after matching and the t-statistic should not be significant after matching. In this vein, as can be seen from Table 15, the matched sampling in this study is quite effective in removing a substantial part of the preexisting differences between the two comparison states, but not all of them, as expected.

5.5 Difference-in-difference Regressions

After propensity score matching, the matched sample has removed most of the imbalance between comparison groups, at least in observed covariates. With the matched sample, I moved on to DID analyses to estimate the impact of PBC on Indiana clients in terms of four employment outcome indicators. The general DID model is as follows.

For the logistic model on employment probability:

$$\ln\left(\frac{\Pr(Y=1)}{1-\Pr(Y=1)}\right) = \beta_0 + \beta_1\,\text{Indiana} + \beta_2\,\text{Period} + \beta_3\,(\text{Indiana} \times \text{Period}) + \beta_4 X_1 + \beta_5 X_2 + \beta_6 X_3$$

For the OLS models on time to placement, weekly working hours, and weekly earnings:

$$Y = \beta_0 + \beta_1\,\text{Indiana} + \beta_2\,\text{Period} + \beta_3\,(\text{Indiana} \times \text{Period}) + \beta_4 X_1 + \beta_5 X_2 + \beta_6 X_3 + \varepsilon$$

Here Indiana indicates Indiana clients and Period indicates the service period 2007-2009. X1 contains “demographic background” variables, including age, education, race, gender, primary disability, and secondary disability. X2 contains “pre-service status” variables, including employment status, work disincentives, previous service status, and participation in Projects with Industry. X3 contains the “employment services” variable, i.e., number of placement services received. Tables 16 and 17 present the DID regression results. Within each model, the interaction between the Indiana variable and the service period 2007-2009 variable is the difference-in-differences estimator of the treatment effect.

First, logistic regression was employed to predict the differences in the likelihood of attaining an employment outcome for those who received employment services before and after PBC.
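To make the logic of the interaction term concrete, the following Python sketch computes a difference-in-differences estimate from group summaries. It is a simplified illustration under stated assumptions, not the regression used in this study: it ignores the covariates X1 to X3, the function names are hypothetical, and all numbers are made up. Without covariates, the OLS interaction coefficient equals the difference in mean changes between the two groups, and the exponentiated logistic interaction coefficient equals the ratio of the two groups' before-after odds ratios.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def did_means(y_treat_pre, y_treat_post, y_ctrl_pre, y_ctrl_post):
    """Difference-in-differences of group means.

    (change in the treated group) minus (change in the comparison group).
    """
    change_treat = mean(y_treat_post) - mean(y_treat_pre)
    change_ctrl = mean(y_ctrl_post) - mean(y_ctrl_pre)
    return change_treat - change_ctrl


def odds(p):
    """Odds corresponding to a probability p, with 0 < p < 1."""
    return p / (1 - p)


def did_odds_ratio(p_treat_pre, p_treat_post, p_ctrl_pre, p_ctrl_post):
    """Ratio of before-after odds ratios (treated group over comparison group)."""
    or_treat = odds(p_treat_post) / odds(p_treat_pre)
    or_ctrl = odds(p_ctrl_post) / odds(p_ctrl_pre)
    return or_treat / or_ctrl


# Hypothetical days to placement for a treated and a comparison state.
effect_days = did_means(
    y_treat_pre=[300, 320], y_treat_post=[250, 270],
    y_ctrl_pre=[200, 220], y_ctrl_post=[210, 230],
)
# (260 - 310) - (220 - 210) = -60 days

# Hypothetical employment rates before and after the policy change.
effect_or = did_odds_ratio(0.4, 0.5, 0.5, 0.5)
```

A value of the odds-ratio measure above 1 would indicate that the odds of employment rose more in the treated group than in the comparison group over the same period.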
Before discussing the parameters of individual variables, tests of goodness of fit of the regression model were performed. The logistic regression model is statistically significant (likelihood ratio chi-square = 1102.74, p < .0001), meaning that the specified model fits significantly better than the model with only the constant. The Hosmer-Lemeshow test for overall goodness of fit was also performed. The Hosmer-Lemeshow chi-square equals 8.943 (p = .063), implying that the differences between the observed and fitted values are small. Both tests suggest that the logistic model is reliable enough to support meaningful inference. Generally, after the introduction of PBC in 2007, Indiana clients experienced higher employment possibilities (odds ratio = 1.4991, p < .01)11.

Table 16. Logistic Regression Model Predicting Likelihood of Employment for Service Recipients (N = 12,372)

| Variable | Odds Ratio | Standard Error | Z Value |
|---|---|---|---|
| State and Service Year | | | |
| State (Indiana) | 0.7561*** | 0.0397 | -5.33 |
| Service Year 2007-2009 | 0.8287*** | 0.0450 | -3.46 |
| Indiana × Service Year 2007-2009 | 1.4991*** | 0.1144 | 5.30 |
| Demographic Background | | | |
| Age | 0.9975 | 0.0016 | -1.55 |
| Education | | | |
| Special education | 1.3384*** | 0.1207 | 3.23 |
| High school graduate | 1.2281** | 0.1099 | 2.30 |
| Post-secondary/associate degree | 1.2460** | 0.1249 | 2.19 |
| College degree or higher | 1.4369*** | 0.1783 | 2.92 |
| Race | | | |
| Native American | 0.4991** | 0.1455 | -2.38 |
| Asian | 1.4769* | 0.3241 | 1.78 |
| White | 1.3715*** | 0.0707 | 6.13 |
| Hispanic or Latino | 1.4601*** | 0.1847 | 2.99 |
| Gender (Female) | 0.8292*** | 0.0326 | -4.76 |
| Veteran | 0.8319* | 0.0846 | -1.81 |
| Primary disability | | | |
| Physical impairments | 0.7305*** | 0.0810 | -2.83 |
| Mental impairments | 1.0239 | 0.1065 | 0.23 |
| Secondary disability | | | |
| Sensory/communication impairments | 1.0099 | 0.1084 | 0.09 |
| Physical impairments | 0.8492*** | 0.0480 | -2.89 |
| Mental impairments | 0.8156*** | 0.0358 | -4.46 |
| Pre-service Status | | | |
| Currently employed | 2.0271*** | 0.1071 | 13.38 |
| Work disincentives | 0.9278*** | 0.0160 | -4.36 |
| Previous closure/service | | | |
| Closed before services | 0.8738* | 0.0650 | -1.81 |
| Closed after services | 1.2990*** | 0.0640 | 5.31 |
| Participation in Projects with Industry | 3.4439*** | 1.4556 | 2.93 |
| Employment Services | | | |
| No. of placement services received | 2.8868*** | 0.1392 | 21.99 |
| Likelihood ratio chi square | 1102.74*** | | |
| Pseudo R2 | .2653 | | |

*significant at .1; **significant at .05; ***significant at .01; two-tailed tests.

Table 17. OLS Regression Models Analyzing Employment Outcomes (N = 4,940)

Model (1): Time to placement. Model (2): Weekly working hours. Model (3): Weekly earnings.

| Variable | (1) Coefficient | (1) Robust Standard Error | (1) t-value | (2) Coefficient | (2) Robust Standard Error | (2) t-value | (3) Coefficient | (3) Robust Standard Error | (3) t-value |
|---|---|---|---|---|---|---|---|---|---|
| State and Service Year | | | | | | | | | |
| State (Indiana) | 123.7927*** | 9.69203 | 12.77 | -2.7736*** | 0.3925 | -7.07 | -40.4802*** | 5.4046 | -7.49 |
| Service Year 2007-2009 | 28.1780*** | 9.7877 | 2.88 | -1.2952*** | 0.3868 | -3.35 | 2.3857 | 5.4851 | 0.43 |
| Indiana × Service Year 2007-2009 | -72.1985*** | 14.3650 | -5.03 | 1.3291** | 0.5603 | 2.37 | 4.3715 | 7.2722 | 0.60 |
| Demographic Background | | | | | | | | | |
| Age | -2.3190*** | 0.31189 | -7.44 | -0.0200 | 0.0122 | -1.63 | 0.4703*** | 0.1642 | 2.86 |
| Education | | | | | | | | | |
| Special education | -0.5269 | 16.9582 | -0.03 | 0.2597 | 0.6692 | 0.39 | 9.7036 | 6.3365 | 1.53 |
| High school graduate | -24.6122 | 16.6818 | -1.48 | 2.9657*** | 0.6654 | 4.46 | 41.5547*** | 6.4365 | 6.46 |
| Post-secondary/associate degree | -14.8805 | 18.2926 | -0.81 | 5.1137*** | 0.7445 | 6.87 | 83.4660*** | 9.0927 | 9.18 |
| College degree or higher | 34.4609 | 21.8180 | 1.58 | 4.6625*** | 0.8734 | 5.34 | 163.9384*** | 17.9561 | 9.13 |
| Race | | | | | | | | | |
| Native American | -17.4687 | 49.6692 | -0.35 | 0.1325 | 1.9237 | 0.07 | 5.2426 | 23.7536 | 0.22 |
| Asian | 57.3725 | 56.6339 | 1.01 | -3.8100** | 1.7111 | -2.23 | -40.9341** | 17.1552 | -2.39 |
| White | 16.1171* | 9.6736 | 1.67 | -0.5949 | 0.4910 | -1.45 | -6.2492 | 5.1003 | -1.23 |
| Hispanic or Latino | -26.7465 | 26.3873 | -1.01 | 0.2442 | 0.9318 | 0.26 | -10.9634 | 11.0435 | -0.99 |
| Gender (Female) | 8.6879 | 7.7448 | 1.12 | -2.5881*** | 0.2936 | -8.81 | -32.0111*** | 3.7005 | -8.65 |
| Veteran | -18.0695 | 19.4540 | -0.93 | 1.3458* | 0.7870 | 1.71 | 29.1764** | 13.6200 | 2.14 |
| Primary disability | | | | | | | | | |
| Physical impairments | -25.4000 | 20.0359 | -1.27 | 0.1200 | 0.8080 | 0.15 | -2.3717 | 17.4014 | -0.14 |
| Mental impairments | -52.5525*** | 18.0453 | -2.91 | -2.8383*** | 0.7448 | -3.81 | -55.7796*** | 15.5328 | -3.59 |
| Secondary disability | | | | | | | | | |
| Sensory/communication impairments | -14.1650 | 17.5393 | -0.81 | -3.0087*** | 0.7232 | -4.16 | -25.6520*** | 8.5062 | -3.02 |
| Physical impairments | -7.8156 | 10.0790 | -0.78 | -1.2690*** | 0.4236 | -3.00 | -8.4449 | 5.6663 | -1.49 |
| Mental impairments | -15.1077* | 8.2807 | -1.82 | -0.1644 | 0.3292 | -0.50 | -4.0529*** | 1.5932 | -26.29 |
| Pre-service Status | | | | | | | | | |
| Currently employed | 7.905325 | 9.4227 | 0.84 | -0.2275 | 0.3484 | -0.65 | 11.1346** | 5.1178 | 2.18 |
| Work disincentives | 3.0327 | 2.971615 | 1.02 | -3.8039*** | 0.1238 | -30.47 | -41.8839*** | 1.5932 | -26.29 |
| Previous closure/service | | | | | | | | | |
| Closed before services | -13.2639 | 13.1365 | -1.01 | 0.1923 | 0.5492 | 0.35 | -8.0276 | 6.8302 | -1.18 |
| Closed after services | -22.0510*** | 8.3032 | -2.66 | -1.2896*** | 0.3354 | -3.85 | -18.4833*** | 3.8301 | -4.83 |
| Participation in Projects with Industry | -41.4231 | 59.0265 | -0.70 | -1.4386 | 2.4348 | -0.59 | -61.2789** | 25.8489 | -2.37 |
| Employment Services | | | | | | | | | |
| No. of placement services received | 93.0754*** | 12.3244 | 7.55 | -1.9901*** | 0.4011 | -4.96 | -35.1571*** | 5.7619 | -6.10 |
| Constant | 292.321*** | 31.31243 | 9.34 | 37.6258*** | 1.2273 | 30.66 | 336.2284*** | 19.0697 | 17.63 |
| F-test | 15.70*** | | | 81.59*** | | | 56.09*** | | |
| R2 | .2707 | | | .2601 | | | .2805 | | |

*significant at .1; **significant at .05; ***significant at .01; two-tailed tests.

Second, Ordinary Least Squares (OLS) regressions were run to compare three performance indicators of employment outcomes before and after PBC: (1) time to placement, (2) weekly working hours, and (3) weekly earnings (adjusted for inflation). Before the regression analyses, a series of regression diagnostics were conducted to ensure that the basic assumptions of OLS regression are met. Both White's and Breusch-Pagan tests indicate heteroscedasticity of the residuals. Thus, robust standard errors were used in the regression models.
11 The interpretation of interaction effects in nonlinear models is still under much econometric discussion (Ai & Norton, 2003; Athey & Imbens, 2006; Greene, 2010; Karaca-Mandic, Norton, & Dowd, 2012). Ai and Norton (2003) argue that in nonlinear models the marginal effect of the interaction term does not represent the magnitude of the interaction effect. The interaction effect depends on all the covariates, and thus requires computing the cross derivative of the expected value of the dependent variable. The statistical significance of the interaction effect should be based on the estimated cross-partial derivative. Puhani (2012) and Karaca-Mandic, Norton, and Dowd (2012) further demonstrate that in the difference-in-differences context, the incremental effect of the coefficient of the interaction term can approximate the treatment effect on the treated. We follow this suggestion here.

Overall, these three models are significant, with an F value of 15.7 (p < .0001) for the model on time to placement, an F value of 81.59 (p < .0001) for the model on weekly working hours, and an F value of 56.09 (p < .0001) for the model on weekly earnings. The models also explain substantial portions of the variation in the dependent variables: 27.07%, 26.01%, and 28.05%, respectively. The regression model on time to placement shows that individuals in Indiana after the adoption of PBC spent 72 fewer days (p < .01) to achieve employment outcomes, which is consistent with our hypothesis that PBC motivates service contractors to achieve employment outcomes rapidly. The models on employment quality (working hours and wages) demonstrate mixed results. Consistent with our hypothesis on job retention, individuals in Indiana worked 1.33 hours (p < .05) longer per week than their counterparts during 2004-2006. The hypothesis on wages is partially supported.
Weekly wages of Indiana employees increased by $4.37 after the introduction of PBC, but the difference is not statistically significant even at the p < .1 level. Moreover, these small differences in working hours and wages, even where statistically significant, are of little real policy significance.

5.6 Conclusion

The managerial motivation behind performance-based management strategies is captured by the phrase “what gets measured gets done.” The introduction of PBC in human service provision is no exception. By attaching contract compensation to performance achievements, PBC draws contractors toward desired service outcomes in a timely manner. Because of its outcome orientation, PBC gives service providers considerable discretion throughout the service process, aiming to encourage innovative and quality customer services that would in turn result in better service outcomes. This chapter, using a community-outcome perspective, tests these claims by examining employment service outcomes under two contracting approaches. As predicted, PBC encourages the achievement of employment outcomes in shorter time periods. But the differences between the two models in terms of working hours and wages are trivial. It seems that “what gets done is what gets measured.” Combining this quantitative evidence, one can conclude that PBC is better than FFS in that it achieves desired employment outcomes more efficiently, without degrading employment quality.

The study in this chapter may suffer from two categories of limitations. First, as mentioned earlier, local differences between the two states could not be fully controlled. In quasi-experiments, there is always a risk of comparing different people in different contexts. In propensity score matching, samples are matched on observed covariates, assuming that there are no unobserved differences between the treatment and control groups.
This assumption might be too strong to hold in a real context: balance on observed covariates may not rule out the role of unobserved differences. To address this potential bias, this research applied DID to the matched samples to further adjust for unobserved differences. Although the study has some evidence to support the common trend between the two states, it still cannot guarantee the similarity over time of the local communities in which employment outcomes are embedded. Michalopoulos, Bloom, and Hill (2004) caution that comparing groups that are not from the same social context could bias the estimation of the treatment effect, because these groups may be exposed to different local situations and thus to different unobserved variables. This warning is particularly relevant for social program evaluations that use comparison groups across service jurisdictions. Heckman et al. (1997) also warn that the bias might be larger for out-of-state comparison groups than for in-state comparison groups due to geographic mismatch, such as different geographic locations and local labor markets.

Second, the present research could not compare several other indicators of employment services because of limited data availability. One major concern about the effectiveness of PBC is the client selection problem: contractors may have fiscal incentives to decline clients with more severe conditions in order to achieve performance outcomes. Unfortunately, the data used here only record individuals who had been admitted into the service process. In addition, the costs of achieving employment outcomes under the two contracting models were not observed. Generally, such costs include two parts: service costs for purchasing services from contractors and administrative costs for monitoring contractors. Theoretically, PBC is more economical because it shortens service duration and reduces monitoring and reporting. Moreover, the long-term employment effect was not examined.
In this study, short-term indicators (working hours and wages at closure) were used as proxies. However, as previously noted, VR services target the long-term stability and welfare of disabled people.

In short, this chapter evaluates PBC effectiveness from a community-outcome perspective. It employs quantitative quasi-experimental methods to compare individual employment outcomes under two contracting approaches. Methodologically, this research design leaves two things unaddressed. First, as mentioned above, the quantitative methods used here have a number of limitations, which might weaken the robustness of the findings. Second, holding a service-outcome perspective, we might overlook the causes behind the findings, which are derived directly from administrative data, and miss rich details of PBC implementation in the real service setting. With these in mind, we turn to the next chapter, which qualitatively assesses PBC effectiveness from a participating-organization (i.e., government and contractor) perspective.

Appendix 1. Comparisons between Indiana and Michigan, 2004-2009 (figures; sources: Bureau of Economic Analysis and U.S. Bureau of Labor Statistics)

Chapter 6. The Effectiveness of PBC: Government and Contractor Perspectives

6.1 Introduction

This chapter evaluates PBC effectiveness from a participating-organization perspective. As mentioned in the previous chapter, network effectiveness should also consider the organizations involved in the contractual networks. Indeed, their survival and success affect the effectiveness and sustainability of a network. Often, individual organizations have different interests and motivations for participating in service-provision networks. How the network structure (i.e., the PBC arrangement) defines and influences the behaviors and incentives of participating organizations is the central question of this chapter.
Generally, two types of organizations are involved in the provision of VR services: government and service contractors. Thus, the chapter assesses the organization-level effectiveness of PBC from these two perspectives. In particular, the chapter adopts a street-level lens and uses qualitative methods to examine PBC implementation by the two actors.

6.2 Street-level Perspective in Policy Analysis

Policy implementation is complicated. Policy implementation, as Bardach (1977) defines it, is “(1) a process of assembling the elements required to produce a particular programmatic outcome, and (2) the playing out of a number of loosely interrelated games whereby these elements are withheld from or delivered to the program assembly on particular terms” (57-58). Traditional public administration literature has shown “the complexity of joint action” (Pressman & Wildavsky, 1973) and the “great difficulty of organizing cooperative activity on a large scale” (Derthick, 1972) in converting policy intents into policy actions.

Scholarly research on policy implementation has identified two analytical approaches: top-down and bottom-up12. The top-down approach, or what Elmore (1979) calls “forward mapping,” begins at the top of the process, with a clear emphasis on policy designers as the central actors. It traces policy implementation along the hierarchical structure within the traditional bureaucratic system and explores ways to guide and constrain the behavior of civil servants and target groups in realizing the policymaker's intent. In this vein, Mazmanian and Sabatier (1983) consider policy implementation as “the carrying out of a basic policy decision, usually incorporated in a statute but which can also take the form of important executive orders or court decisions … ” (20).
Thus, the prescriptions developed by this school of thought to ensure faithful implementation generally center on formal organizational structures, authority relationships, control and regulations, administrative responsibility, and so on. The implicit assumption behind this approach is that policymakers control the administrative and political resources needed to guarantee policy implementation.

12 There is another so-called “third-generation” approach to policy implementation studies, calling for a synthesis of the top-down and bottom-up frameworks (e.g., Matland, 1995; Sabatier, 1986). But there has been little advancement in this regard. See O’Toole (2000) for more details.

In contrast, the bottom-up approach, or “backward mapping” (Elmore, 1979), emphasizes policy implementation at the local level and considers the role of local actors in the interpretation of grand policy goals. Due to local differences, the remote control exercised by policy makers is inevitably incomplete, and thus lower-level administrators enjoy a certain discretion in turning policy intent into policy action. On this basis, this school of thought argues that it is how lower-level administrators use their discretion to adjust to the local context that determines the real meaning of a policy. This strand of research typically explores the dynamics at the recipient level and analyzes the real causes that influence the mutual adaptation of a policy to its local organizational setting. “The crucial difference of perspective,” as Elmore (1979) writes, “stems from whether one chooses to rely primarily on formal devices of command and control that centralize authority or on informal devices of delegation and discretion that disperse authority” (605). The top-down approach is concerned more with compliance, while the bottom-up approach values bargaining and compromise.
In social policy, Berman (1978) distinguishes “federal macro-implementation” from “local micro-implementation,” reflecting the dual institutional nature of policy implementation: the federal level determines the grand picture, while local organizations adapt the mission to the local setting and deliver the concrete services. In this way, “the net result is that the effective power to determine a policy’s outcome rests with local deliverers, not with federal administrators” (157). Based on the elaboration on control strategy in human service contracting in chapter two and the quantitative findings in chapter four, it is not surprising that this chapter relies on the “backward mapping” logic and the street-level perspective in particular.

Lipsky (1980) defines these local public service organizations (such as schools, hospitals, police offices and welfare agencies) as street-level bureaucracies and the front-line workers within them as street-level bureaucrats (SLBs). He argues that because SLBs work under conditions of huge caseloads, ambiguous agency goals, and inadequate resources, their exercise of discretion is inevitable in the day-to-day implementation of public programs. Given substantial discretionary judgment and the requirement to interpret policy on a case-by-case basis, the gap between policy intent and policy action can be substantial. Therefore, “the decisions of street-level bureaucrats, the routines they establish, and the devices they invent to cope with uncertainties and work pressures, effectively become the public policies they carry out” (Lipsky, 2010, xiii).

The role of SLBs can also be understood from an organizational behavior perspective. Located at the boundary between the public welfare agency and its external constituency, SLBs play a critical boundary-spanning role in information processing and external representation (Thomson, 1967; Prottas, 1978).
“Information from external sources comes into an organization through boundary roles, and boundary roles link organizational structure to environmental elements, whether by buffering, moderating, or influencing the environment” (Aldrich & Herker, 1977, 218). In the information processing function, boundary roles gather, transmit, and interpret information from the external environment for internal organizational components. In the external representation function, boundary roles serve resource acquisition and institutional legitimation. In sum, SLBs typically perform these two functions, acting as mediators between welfare agencies and targeted service recipients, which results in the bureaucracy’s dependence on its SLBs in people processing. Such dependence constitutes a source of power. “To the extent that information access and control is a power resource, boundary spanners are in an excellent structural position to convert this resource into actual power. … Their power is further enhanced to the degree that the nature of the task assigned the boundary role makes routinization of the role difficult, if not impossible” (Aldrich & Herker, 1977, 227).

The massive use of contracting in social service delivery complicates the original connotation of SLBs. Under the contracting regime, contractors have partially taken over the functions previously performed directly by social workers. Now, contractors work directly with clients and provide services to achieve goals set by government agencies, while social workers turn to determining client eligibility and monitoring contractor behavior. Collectively, these two actors become the new SLBs (Smith & Lipsky, 1993). This notion of new street-level bureaucrats is consistent with Hjern and Porter’s (1981) suggestion of using “implementation structures” as a unit of analysis for studying “purposive action within a framework where parts of many public and private organizations cooperate in the implementation of a programme” (214).
Lipsky’s construct of SLB provides a useful perspective for studying social policy implementation. According to Brodkin (2003), “[t]his approach is most valuable when policy implementation involves change in organizational practice, discretion by frontline workers, and complex decisionmaking in a context of formal policy ambiguity and uncertainty” (151). Using the street-level approach, we can explore how street-level actors put policy into practice, more specifically, “what street-level organizations construct as policy through their informal practices, how they do it, and why they produce policy in the ways that they do” (Brodkin, 2011, 199). The robustness of this approach has been shown by numerous studies of welfare reform in a variety of policy areas (e.g., Brodkin, 2011; Keiser, 2010; Meyers, Glaser, & Donald, 1998; Riccucci, 2005).

This study relies on this street-level perspective and explores the mutual adaptation of PBC to its local organizational setting. How is performance-based contracting being implemented? Do PBC and the associated incentive system work as intended to motivate SLBs to improve service outcomes? In particular, this study examines: How do SLBs respond to PBC? How do they use their discretion under PBC? Does PBC motivate them to work in the “right” direction, in both the short term and the long term? Do they use their discretion to game PBC? If so, how negative are the effects, and how do public managers deal with them?

6.3 Vocational Rehabilitation Context and Data Collection

The context for examining how PBC is implemented is state vocational rehabilitation programs. As described in chapter three, in rehabilitation services, VR counselors in the state agencies work with clients on eligibility determination and progress monitoring, while employment specialists at service contractors coach and counsel VR clients to find jobs. Thus, the SLBs or front-line workers in this setting are counselors and employment specialists.
The main research method here is semi-structured interviews (interview questions are attached in Appendix 2). Qualitative data were collected from staff of local VR offices and service contractors in Indiana. The researcher's ability to gain access to local offices and contractors was a criterion in sample selection. Generally, local VR employees were invited to 30-40 minute conversations about their experience with PBC from their own perspectives. Interviews were conducted in spring and summer 2013, mostly by telephone, except for one face-to-face interview with a counselor in a local VR office in Indianapolis. Detailed notes were taken based on interviewees’ input. Table 18 describes the distribution of the interview sample. It is important to note that the goal of the interviews is not to generate statistics about a population. Thus, the study did not follow strict probability sampling; rather, it used a snowball sampling strategy.

Before interviewing SLBs, interviews with state VR program managers and area supervisors were conducted to collect background information and “set the stage” for later research. Also, content analysis of key documents, including service contracts, annual state program reports, and meeting minutes, was conducted. Data from all these sources are used jointly for analysis.

Table 18. Distribution of Interview Samples

| Interviewee group | Number |
|---|---|
| Indiana VR counselors | 5 |
| Program managers (contractors) | 4 |
| Employment specialists (contractors) | 3 |

6.4 Findings from VR Agency Perspective

Views from the government side on PBC effectiveness are mostly positive. Counselors were struck by the high costs and poor placement outcomes under the traditional FFS arrangement. Under FFS, because reimbursement was for services provided rather than for service outcomes, vendors frequently delivered large amounts of job development services, sometimes more than necessary, before client placement.
So counselors saw vendors continuing to provide services to clients without placing them. FFS was also labor intensive for counselors: they had to work closely with clients and vendors. Counselors needed to meet with clients very frequently (e.g., every month) to confirm their situations. Vendors were asked to report to counselors intensively (every 30 minutes) when providing detailed services to clients. There were also a number of administrative procedures for vendors to go through, such as monthly billing and reporting.

In contrast, PBC is easier to manage from the government perspective. Under PBC, counselors enjoy more flexibility and a lighter workload. They stay involved, but do not track vendor behaviors as closely. Counselors only need to authorize milestones at the beginning and then verify clients’ achievement, rather than regulate the detailed services vendors provide.

When talking about clients’ employment outcomes, counselors were mostly impressed with PBC’s ability to deliver better employment outcomes at lower cost. Because of the financial incentives attached to milestone payments, PBC directs contractors toward better employment outcomes. As mentioned earlier, VR services roughly include three steps: job development, job placement, and job retention. PBC is very effective in promoting placement and job retention (at least until case closure). For example, in the Indiana milestone system, to receive the payment for the job retention milestone, vendors have to help regular clients keep their jobs for at least 90 days. Indeed, success in longer-term placements has been a persistent challenge in VR programs. In this way, increased placement likelihood and job tenure are the most evident effects of PBC.

However, counselors did not emphasize changes in wages, benefits, job match, and other aspects of life quality under PBC. These are the vocational characteristics that are not specified explicitly by the PBC milestones.
In these areas, FFS and PBC might perform about equally, with no notable differences. One explanation might be that vendors under either contracting arrangement have the same non-financial incentives to perform in these uncaptured performance areas. In addition, counselors have no data with which to track clients’ long-term employment and stability.

Also, PBC pushes vendors to move through the service process to hit the milestones and receive payment. Because services are no longer directly reimbursed by funding agencies under PBC, vendors have no financial incentive to hold clients and keep providing services to them; instead, they pursue rapid job search and accelerated placement. Given that, counselors all agreed that PBC greatly reduces service waiting time and the time clients spend to achieve placement.

At the VR program level, PBC reduces program costs in two ways. On one hand, because PBC “pays for performance,” funding agencies pay only for the desired service milestones. In this way, funding agencies actually shift financial risks to vendors. Therefore, there is substantial saving on service costs. On the other hand, funding agencies no longer need to track vendors very closely to monitor their behaviors in the service process. Less administrative monitoring means a decrease in administrative costs, which usually account for one third of total program costs. With these savings, VR agencies, which often suffer from an overload of service applicants, could extend their program capacity and serve more clients.

There are certainly a number of managerial concerns and challenges that some counselors saw in the implementation of PBC. When using PBC, government has less control of the service process. Unlike FFS, PBC grants a large amount of discretion to vendors and substantially reduces administrative control in the service process.
Under PBC, counselors only need to provide initial authorization of service milestones and verify milestone achievement, and are otherwise largely out of touch with clients and the service process. Thus, PBC changes the role of the counselor from a director of services to a payer of services. It also shifts the responsibility for vocational assessment and planning from counselors to contractors. Some interviewees felt that PBC discounted the importance of the counselor role and government involvement. Such diluted government control, if not well managed, might lead to problems.

The essence of PBC, according to Wedel and Colston (1988), is to use specified rewards for meeting or exceeding contract objectives and penalties for failure to meet them. This points to the critical role of incentive structure design. However, building consensus on outcome/milestone measurement and payment rates within funding agencies and with diverse contractors is not easy. For example, the milestones currently being used in contracted employment services are mostly easy-to-observe performance measures, such as one-week placement and 90-day employment. Counselors did admit that other milestones should be considered. Some also raised the question of whether to include less visible measures related to positive changes in life quality, in order to counter potential creaming by contractors. Counselors observed that contractors were better at meeting some milestones than others. Also, in the long run, adjusting milestones and payment rates would be challenging. As such, there is a risk that the incentive structure associated with PBC, if not appropriately designed, might induce undesired vendor behaviors.

With regard to creaming/gaming, counselors did raise concerns about this negative potential of PBC. However, most counselors believed creaming did not occur frequently.
They didn’t see strong evidence of creaming or worse outcomes, although these observations came from intuition rather than systematic data analysis. Part of the reason might be that the two-track payment system (paying vendors a higher rate for serving people with more severe disabilities) effectively addresses the creaming problem. However, counselors noticed that vendors generally supported PBC because of the flexibility they have in the service process, but complained about the financial risks they face. This financial pressure had made some vendors drop out due to poor service performance. So, this pressure might make vendors less engaged in the service process, or even lead to service deterioration.

Generally, VR counselors were consistently satisfied with PBC. However, they did admit that PBC might not work in every situation. Hourly payment is still needed, particularly when considerable specific services are required. Thus, even though a system-based conversion from FFS to PBC is necessary, some use of hourly payment should still be allowed.

6.5 Findings from Contractor Perspective

Unlike VR counselors, contractors hold a quite mixed attitude towards PBC. Like counselors, contractors welcome the flexibility associated with PBC in serving clients. Under FFS, contractors were closely monitored by counselors and funding agencies. When working under PBC, they can undertake the activities they think are necessary to achieve milestone outcomes. Because only milestone outcomes are evaluated by VR agencies, vendor behaviors are free from strict administrative control. All contractors interviewed were very much impressed with the decrease in the amount of time they spent on administrative reporting and paperwork. In general, contractors support PBC in that it promises more rapid initial assessments, faster VR authorizations (due to less administrative paperwork), less paperwork and reporting, and less scrutiny by counselors.
Although more discretion does not mean better service quality or more innovative service methods, contractors are more likely to serve clients in the ways that they believe are best practices.

PBC also encourages contractors to work with clients more intensively, at least in the early placement process. Contractors reported that they conducted more in-person contacts and spent more evaluation time with clients under PBC, in order to place clients in good jobs that clients would want to keep. However, such improvements in service quality might be driven by the motivation to achieve milestones and secure funding in the shortest time. The incentive for rapid placement might lead contractors to place clients in temporary jobs to gain reimbursement at earlier stages, at the expense of longer-term jobs. Some clients left jobs because they did not like them or failed to adapt to the workplace. This partly indicates that the employment support services contractors provide before placement might not be sufficient. In short, PBC might detract from the desired outcomes to the extent that contractors feel pressured to place clients in jobs sooner at the expense of job fit. Also, contractors might use enhanced interactions with clients early in the process to terminate unmotivated or highly difficult clients sooner and thus avoid potential risks.

Contractors were very concerned with the financial risks shifted from funding agencies: contractors would not be reimbursed until milestones are achieved. Some contractors complained that PBC undermined the vocational philosophy, shifting from serving clients’ overall vocational needs to a narrow focus on meeting milestones. Although they didn’t admit that they selected clients based on the likelihood of future employment, they did emphasize that their employment programs had budgetary constraints. Due to the high financial pressures under PBC, contractors have to consider the cost and risk of serving a given client.
As for employment outcomes, contractors emphasized that, for financial reasons, they would pay attention to the client milestones that lead to payment. However, they didn’t think PBC would induce them to engage in creaming/gaming behaviors, that is, achieving measured performance at the cost of unmeasured performance. Contractors believed that non-financial incentives did not differ between FFS and PBC. Thus, they didn’t notice significant differences in clients’ employment quality, such as wage and job match.

Generally, contractors understood the rationale behind the change from FFS to PBC. However, contractors were anxious about the high financial risk PBC would pose to the operation of their employment programs, despite the flexibility they enjoyed under PBC. Weighing these two things, contractors still preferred FFS in providing employment services. At the least, some contractors proposed a hybrid contracting model combining FFS and PBC: funding agencies should use FFS for difficult clients and PBC for regular ones.

6.6 Conclusion

Here, let’s summarize the findings from this qualitative piece. For VR counselors and service contractors, the first impression of PBC is the flexibility it brings, in either the administrative or the service sense. On the counselor side, they no longer need to monitor contractors very closely throughout the service process. Rather, they just authorize milestones and verify whether clients have really achieved them, becoming somewhat more distant from clients and vendors. On the contractor side, without such administrative control, they can serve clients in the way they believe to be most professional. Thus, in this regard, both parties expressed strong satisfaction with PBC.

Second, PBC, by restructuring financial incentives, directs contractors away from service delivery per se towards milestones along job development, job placement, and job retention.
Following this logic, contractors, in order to receive service reimbursement, devote much attention to these milestones. So counselors could observe that clients under PBC find jobs sooner and are more likely to keep jobs, at least until reaching the case closure milestone. But PBC’s effects on other areas of employment outcomes (wage, job match, etc.) are not impressive to counselors. Given that PBC does not touch these indicators, contractors explained that the incentives in these areas might be equal under FFS and PBC.

Third, there are some unsubstantiated indications of creaming/gaming. For contractors, the financial incentive for rapid movement across milestones might lead to temporary jobs for clients at an earlier stage, at the cost of job fit and longer-term jobs. However, whether and to what extent PBC differs from FFS in this regard is ambiguous. For counselors, although there are concerns about creaming/gaming, they actually have no evidence to show that these strategic behaviors indeed exist and play a role.

As stated at the beginning of the last chapter, this project aims to evaluate PBC effectiveness from two perspectives using different methods. The previous chapter employs a quantitative quasi-experimental method to compare the impacts of two contracting approaches on individual employment outcomes. The present chapter, through qualitative semi-structured interviews, explores how PBC was implemented by street-level actors, both VR counselors and contractors. How do these two perspectives and methods jointly help us understand PBC effectiveness? This is the topic of the next chapter.

Appendix 2. Interview Questions on the Effectiveness of Indiana RBF

Questions for VR Counselors:
1. What are the major differences between FFS and RBF, from your perspective?
2. What do you like most about RBF and what do you like least?
3. How might RBF change service activities and process?
   (1) Service method (best practices: more job development, search, and match)
   (2) Client selection (two-tier payment useful?)
   (3) Service costs (purchase costs and administrative costs)
4. How might RBF change clients’ employment outcomes?
   (1) Employment results
   (2) Time to placement
   (3) Employment quality (wage, working hours, job match, customer satisfaction)
   (4) Long-term employment and stability
5. How might RBF change your work as a counselor, and relationship with service providers?
   (1) Work with clients
   (2) Oversight of service providers
6. How do service providers view RBF, from your perspective? What are their concerns?
7. All things considered, which funding mechanism do you actually prefer, RBF or FFS?
8. What might be the potential room to improve RBF?

Questions for Contractors:
1. What are the major differences between FFS and RBF, from your perspective?
   (1) RBF: Like most? Like least?
   (2) FFS: Like most? Like least?
2. How might RBF change service process?
   (1) Client selection
   (2) Best practices
3. How might RBF change clients’ employment outcomes?
   (1) Possibility of employment
   (2) Time to placement
   (3) Employment quality (wage, working hours, customer satisfaction)
   (4) Long-term employment and stability
4. How might RBF change your work as a service provider, and relationship with counselors?
   (1) Work with clients (consumer evaluation, job development, job search, documentation, and in-person contacts)
   (2) Oversight from VR agency
5. All things considered, which funding mechanism do you actually prefer, RBF or FFS?
6. What might be the potential room to improve RBF?

Chapter 7. Two Faces of Contracting, Two Kinds of Control

7.1 The Effectiveness of PBC as a Formal Arrangement

No doubt, PBC is mostly a formal contracting endeavor.
Through restructuring formal contract design, or more precisely, changing the financial incentives from “paying for process” to “paying for result,” PBC draws contractors’ attention towards the results of service delivery, rather than service delivery per se. However, as discussed in chapter three, appropriate contract design for human services is very daunting. Human services are characterized by ambiguity in both task programmability and outcome measurability. This puts human services mostly within what Weisbrod (1988) calls the Type II dimension of service attributes.[13] In this way, neither behavior-oriented contracts (FFS) nor outcome-oriented contracts (PBC) fit seamlessly with human services. Given this, the question becomes which contract arrangement is less risky.

[13] Weisbrod (1988) differentiates between the Type I dimension of service attributes (those that are relatively easy to monitor or assess) and the Type II dimension of service attributes (those that are relatively difficult to monitor).

The entire research evaluates the effectiveness of PBC in human services, using the Indiana vocational rehabilitation program as a case. Towards this goal, the previous two chapters employ two perspectives and two research methods. In chapter five, PBC effectiveness was assessed from a service outcome perspective, based on a quantitative quasi-experiment comparing the impacts of PBC and FFS on individual employment outcomes. In chapter six, PBC effectiveness was explored using qualitative semi-structured interviews with VR counselors and contractors about PBC implementation. Clearly, both quantitative and qualitative methods have weaknesses. Quantitative and qualitative methods follow different paradigms and assumptions and have different strengths in information processing (Firestone, 1987).
Quantitative methods, based on a positivist paradigm, “produce factual, reliable outcome data that are usually generalizable to some larger population.” In contrast, qualitative methods, grounded on a phenomenological paradigm, “generate rich, detailed, valid process data that usually leave the study participants’ perspectives intact” (Steckler, McLeroy, Goodman, & Bird, 1992, 2). Accordingly, a research strategy advocated in the methodology literature is triangulation, “the combination of methodologies in the study of the same phenomenon” (Denzin, 1978, 291). It has been suggested to see quantitative and qualitative methods as complementary, used to compensate for the limitations of each other and cross-validate to gain greater accuracy and confidence in judgments (Jick, 1979; Mathison, 1988). The present research follows this “triangulation” logic, by collecting and analyzing different kinds of data bearing on the use of PBC in VR employment services.

Not surprisingly, the findings from the two methods, though illuminating, are not exactly the same, but some general conclusions can be derived. The effectiveness of PBC, from a service outcome perspective, means clients find jobs that are permanent, pay a good (above minimum) wage, and match their interests. From an organizational perspective, effectiveness might be close to administrative efficiency and flexibility in running VR programs and providing employment services.

In the service effectiveness sense, PBC performs better in the areas that are captured by milestones. In other words, compared with FFS, PBC is more likely to achieve the milestones. Because of the financial incentives, contractors push the clients they serve to move across the milestones rapidly. Although PBC has little impact on unmeasured areas, there is no strong evidence of creaming/gaming.
Contractors might be involved in strategic behaviors in some cases, but those behaviors haven’t been found to result in deterioration in service outcomes.

In the organizational effectiveness sense, PBC is well endorsed by funding agencies for its efficiency and flexibility. VR counselors become relatively free from intensive work with clients and contractors and enjoy much flexibility in managing the service process. Funding agencies spend the money and get the results they want, without seeing severe unintended outcomes. For contractors, PBC is a double-edged sword. They support PBC in that it allows more exercise of professional discretion, but complain about the high financial risks they bear. This risk is indeed a big managerial challenge. If not appropriately handled by public managers, it might push contractors to engage in more strategic behaviors.

To an extent, PBC seems more promising in both service effectiveness and organizational effectiveness. However, the evidence also implies that PBC effectiveness might not be well-rounded and should not be exaggerated. After reviewing the use of PBC in federal agencies, GAO (2002) doubts “whether agencies have a good understanding of performance-based contracting and how to take full advantage of it” (2). Accordingly, a policy question arises: how to improve the effectiveness of PBC, or how to take advantage of it?

From the formal contracting perspective, the most direct response would be optimizing the PBC design, such as fixing performance measures, redefining the connection between performance indicators and contract compensation, and changing incentive structures. For example, Hill’s (2006) study of casework task configurations in welfare-to-work programs finds that the separation of measurable and unmeasurable tasks among frontline workers would contribute to program effectiveness. Heinrich and Choi (2007) suggest changing performance measures periodically before contractors learn how to game the measures.
This would cause a “competition of learning.” When a PBC system is launched, both government and contractors start learning the pros and cons of that system. If government learns faster, it can find ways to fix the problems. If contractors learn faster, they might game the system. In any case, both suggestions warn that PBC should be used very carefully.

However, these technical efforts at restructuring PBC systems can hardly free themselves from the puzzle of introducing performance management to human service contracting mentioned previously. More broadly, this illustrates what Van Thiel and Leeuw (2002) call the “performance paradox” in the public sector: “characteristics of the public sector can be counterproductive to developing and using performance indicators” (267). In this way, the use of PBC in public human service programs might always be at risk of “rewarding A, while hoping for B.” A more feasible way to optimize the PBC endeavor might be a relational one.

7.2 Managerial Implications from a Relational Contracting Perspective

The theoretical framework in chapter three presents two faces of contracting, formal and relational. The relational contracting perspective highlights the role of relational sanction and social interaction in contractual fulfillment. It suggests relying on relational exchange as a social control mechanism in contracting management. These two faces of contracting are a reminder that the two mechanisms coexist and that public managers should attend to both simultaneously. To some extent, such a combination of formal and informal contracting reflects the nature of contracting management in the public administration context: well-planned and written contracts to meet the formal accountability demand, and negotiation and discretion to satisfy the flexibility concerns in service delivery (DeHoog, 1990).

However, the efforts on PBC innovation and implementation tend to ignore the relational contracting side.
As we’ve seen, the formal effort of using PBC in vocational rehabilitation services was disturbed by the highly uncertain nature of employment services. Therefore, we see incomplete improvement in employment performance, mostly in targeted performance areas, and the risk of contractor opportunistic behaviors. Rather than awkwardly attempting to improve PBC with other formal devices, this project suggests introducing relational contracting, with a focus on relationship and trust building, as a supplement.

Relational contracting, as Sclar (2000) suggests, “transform[s] the notion of contracting from a market-based arrangement to one rooted in interorganizational trust” (123). This notion is also termed “embeddedness” by sociologists, to recognize the role of socially embedded relationships in economic exchange (e.g., Powell, 1990; Uzzi, 1997). Granovetter (1985) argues that formal exchanges would “become overlaid with social content that carries strong expectations of trust and abstention from opportunism” (490). Exchanges characterized by trust are generally found to be more successful (Dyer, 1997; Klein Woolthuis et al., 2005; Ring & Van de Ven, 1994). Therefore, relational exchanges, based on social components (Macneil, 1980), are always associated with a higher level of trust.

Here the research adopts Rousseau, Sitkin, Burt, and Camerer’s (1998) definition of trust: “a psychological state comprising the intention to accept vulnerability based on positive expectations of the intentions or behavior of another” (395). They also identify two preconditions for trust to arise: risk (or uncertainty) and interdependence. Risks in exchanges create opportunities for trust. Trust would not be needed if exchanges could be conducted with complete certainty. Although high risks might force two parties to seek other alternatives, interdependence between parties would glue them together.
Interdependence means that the goals of one party cannot be achieved without the participation of the other. These two conditions further imply the relevance of the discussion of trust to service contracting. In human services, governments rely heavily on third-party actors to deliver various services to citizens. However, due to the uncertain nature of human services mentioned above, contracting performance is at risk of contractor misconduct.

The role of trust in interorganizational exchanges and collaborations has been discussed extensively by scholars from a variety of disciplines such as sociology, psychology, and economics. From a sociological perspective, trust acts as a functional alternative to rational prediction for the reduction of complexity in social life (Lewis & Weigert, 1985; Luhmann, 1979). From a transaction cost viewpoint, trust reduces transaction costs by reducing both ex ante and ex post opportunism (Williamson, 1993). Ostrom (1998) suggests trust and reputation for trustworthiness as core factors in collective action, potentially reducing uncertainty and transaction costs. Management scholars McEvily, Perrone, and Zaheer (2003) propose trust as an organizing principle, structuring and mobilizing organizational components. In the structuring role, trust affects “the development, maintenance, and modification of a system of relative positions and links among actors situated in a social space” (94). In the mobilizing sense, trust “involves motivating actors to contribute their resources, to combine, coordinate, and use them in joint activities, and to direct them toward the achievement of organizational goals” (97). In short, as Zand (1972) summarizes, trust “conveys appropriate information, permits mutuality of influence, encourages self-control, and avoids abuse of the vulnerability of others” (238).

The efforts on trust conceptualization tend to acknowledge its multi-faceted nature (Williamson, 1993).
Lewis and Weigert (1985) distinguish three dimensions of trust: cognitive, emotional, and behavioral. To them, cognitive familiarity, emotional bond, and behavioral enactment constitute the sociological base of trust. Zucker (1986) also identifies three modes of trust production: (1) characteristic-based, (2) process-based, and (3) institutional-based. Characteristic-based trust can be formed on the basis of individual social characteristics such as ethnicity and background. Exchange partners with similar characteristics find it easier to engage in collective action in that they might believe such exchange would satisfy both parties. Process-based trust results from previous and expected future exchanges, i.e., a record of reputation. In institutional-based trust, exchanges are embedded in social practices and trust is thus tied to broad societal institutions.

This paper builds on Zucker’s classification. Indeed, characteristic-based trust has been well observed. For example, as mentioned in the previous chapters, in human service contracting, public managers tend to trust that nonprofits’ social-mission orientation would prevent opportunistic behaviors by nonprofit contractors. This mission/value alignment produces characteristic-based trust, which makes public agencies inclined to partner with nonprofit contractors. The following paragraphs focus more on process-based and institutional-based trust and discuss implications that public managers might consider when optimizing PBC efforts from a relational contracting perspective. A summary is provided in Table 19.

Table 19. Mode of Trust Production and Implications for PBC

Mode of trust production    Basis                         Implications for PBC efforts
Characteristic-based        Individual attributes         Contractor’s nonprofit status
Process-based               Past or expected exchanges    Collaboration and negotiation; time boundlessness
Institutional-based         Social structures             Professionalism; best practice

Source: Zucker (1986).
Collaboration and Negotiation

Reciprocal obligation should be a key principle in the use of PBC. Due to the complicated and dynamic nature of PBC, the implementation of PBC as a system-based change would not succeed without the commitment of all stakeholders. After all, the central goal of PBC is to meet client needs while addressing the financial realities of both funding agencies and service providers. For example, as found in the last chapter, the high financial risks borne by service contractors under PBC, if not well moderated, might lead to gaming or other strategic behaviors. One way to address this dysfunctional response is to collaborate with stakeholders throughout the contracting process. Such collaboration itself acts as a sign of commitment and a tangible expression of mutual trust.

The collaboration should start in the contract planning and design stage. It requires the participation of three groups of stakeholders: funding agencies, service contractors, and clients (O’Brien & Revell, 2005). Because the process and outcome of human service delivery are relatively uncertain, stakeholders should reach at least some consensus on incentives and disincentives associated with the PBC design, such as essential milestones and fee structure. Also, the formal contract can be seen as a coordination mechanism to specify what goals all parties aim for and how they want to achieve these goals. The emphasis here can be more on the positive (shared mission, goals, etc.) than the negative (legally enforceable provisions and penalties). Overall, this collaborative planning should ensure that the design addresses each party’s concerns and eliminates possible resistance to change. It creates a transparent and participatory process that could enhance the feasibility of PBC design and the likelihood of full implementation.

The development of PBC is also a learning process, for both funding agencies and contractors.
Throughout PBC implementation, tensions and conflicts can be anticipated. Thus, ongoing negotiation and system modifications are necessary. At the early conversion (from FFS to PBC) stage, funding agencies need substantial time and resources to provide technical assistance and training for contractors and develop shared commitment with them. After the shared development phase, the use and availability of multiple communication strategies to disseminate information would enhance implementation. The stakeholders still need to meet periodically to assess the implementation and make recommended changes. Funding agencies might hold annual program feedback meetings or annual on-site visits to collect contractors’ and clients’ input. These regular and stable interactions reduce opportunistic behaviors and support the development of commitment and adaptation. In this way, trust is formed incrementally and enhanced through repeated interactions (Dyer, 1997; Gulati, 1995; Lee et al., 2012; Ring & Van de Ven, 1994).

Time Boundlessness

To an extent, contractor behaviors reflect their expectations for future exchanges. In discrete time-bound transactions, people might respond to calculations of short-term advantage. In contrast, open-ended contracts imply potential benefits from future collaborations and thus provide a safeguard against opportunistic behaviors. Open-ended contracts not only convey a sign of commitment and mutual trust at the beginning of exchanges, but also promote trust formation and enhancement in the long run through the repeated interactions mentioned above. Research finds that parties to exchanges that operate for a pre-specified duration behave differently from those in a setting of continuing relationships and interdependence.
Axelrod (1984) suggests that, compared with open-ended contracts, time-bound contracts are less likely to be self-enforcing because they lack a “shadow of the future.” Reuer and Arino (2007) confirm that time-bound contracts pose a greater threat of opportunistic behavior and contribute to contracting complexity.

Taken together, these arguments suggest the use of open-ended contracts or at least longer-term contracts. Currently, in the vocational rehabilitation programs studied in the present project, for example, service contracts usually run for one year, with an option of a one-year extension contingent on satisfactory performance. Under such a short specified duration, contractors might consider short-term opportunistic behaviors. Of course, continuing relationships are not necessarily benign. Relational sanctions do not always produce cooperation. Rather, they might lock funding agencies into dependent positions (Williams, 1983). The point here, however, is not to exclude formal enforcement mechanisms such as performance assessment and financial auditing, but to suggest combining open-ended contracts with these formal control tools.

Professionalism

Professionalism achieves social legitimacy through specialized expertise and qualifications. Under information asymmetry, professionalism acts as a signal of quality, ensuring that professionalized organizations comply with established social expectations and professional standards. Ouchi (1979) describes such mechanisms as “ritualized, ceremonial forms of control” (844). Professionalism means that only selected organizations and individuals who have gone through professionalization processes are allowed to participate in program operation and service delivery. For human service organizations, accreditation is a kind of quality assurance that an organization meets the quality standards established by the profession.
Accredited organizations are required to follow similar service procedures and occupational norms, which convey an assurance of quality and credibility. For service workers, professional schooling and membership are a channel for internalizing the desired attitudes, values, and beliefs. For nonprofit human service organizations, professional values are also reflected in the professional workforce that an organization uses in service jurisdictions and management. These “organizational professionals” (DiMaggio & Powell, 1983) generally hold occupational norms and standards, implying a higher degree of professionalism in organizational operation.

Documentation of Best Practices

Another approach, somewhat related to professionalism, is the “best practice” approach. Funding agencies might prepare and disseminate periodic reports identifying the contractors with the best service outcomes and their best practices in service delivery. This would create informal professional pressure on poorly performing service contractors. However, the approach builds on the assumption that service contractors have relatively strong self-motivation to provide better services and care about professional recognition (Else et al., 1992). If so, they would wish to learn from leading organizations and improve their own performance.

7.3 Conclusion: Control, Trust, and Contracting Management

In the United States, government contracting is a widely and durably used tool of indirect government in service delivery and policy implementation. This governing-by-contracting model has fundamentally redefined the U.S. governance system, in both political and managerial senses. It also highlights the imperative of contracting management to ensure high-quality results. However, public managers are often frustrated by their insufficient management capacity when working with contractors.
To address this “smart-buyer” challenge, public management scholarship and practice over the past three decades have explored effective contracting management extensively. Inspired by the performance management movement, PBC represents one of the most recent efforts. PBC incorporates performance measures into contract specifications and ties contract compensation to contractors’ performance achievement. Theoretically, PBC promises quality services, better outcomes, and less monitoring. Given the potential benefits, governments at all levels have shown substantial and continuous enthusiasm for PBC. In human services in particular, state and local governments have expressed growing interest in using PBC in their service acquisition.

However, the burgeoning popularity of PBC lacks sufficient evidence that its promised benefits are actually achievable. In particular, the introduction of PBC into human service systems needs to address the effectiveness problem (whether PBC produces better results) and the capacity problem (how to use PBC and lead interorganizational change). The present research focuses mostly on the first problem, although the findings might shed some light on the second. After building a theoretical framework that incorporates the literature on formal and relational contracting, this research explores the effectiveness question using the Indiana vocational rehabilitation program as a case. Inspired by the literature on network effectiveness, this project evaluates PBC effectiveness from two perspectives: service outcome and participating organizations.

Putting all the findings together, this project proposes that PBC seems more promising than FFS in human services. However, PBC effectiveness may not be well-rounded and should not be exaggerated. PBC, as a formal mechanism, adjusts contractor behavior by redefining the incentive structure in formal contract design.
Unfortunately, this formal effort of using PBC in vocational rehabilitation services was disturbed by the highly uncertain nature of employment services. Thus, there is only incomplete improvement in employment performance, mostly in targeted performance areas, along with risks of opportunistic contractor behavior. Indeed, the research and the practice of PBC tend to ignore the relational face of contracting. Relational contracting, as a social control system that uses informal and normative mechanisms (largely represented by interorganizational trust) to eliminate interest and goal incongruence between contracting parties, tends to encourage appropriate behaviors that could lead to desirable collaborative outcomes. Following this line of reasoning, this study proposes managerial implications that public managers might consider when using PBC, such as ongoing collaboration and negotiation in contract planning and implementation, long-term or open-ended contracts, and professionalism.

In sum, this project represents the first attempt to systematically examine PBC effectiveness in human services. It shows the difficulties and dynamics of introducing performance management to human service contracting. For various political and pragmatic reasons, performance management is everywhere (Behn, 2003). However, largely because of ambiguous performance and high provider discretion in human services, PBC in human services is always at risk of “rewarding A, while hoping for B.” Therefore, the project cautions that the launch of PBC should be deliberate and careful.

The efforts to introduce PBC to human service provision are often undermined by imperfect performance measures and high provider discretion. The situation worsens when contractors use discretion to game the performance measures. Generally, in human services, not all aspects of performance can be clearly defined and measured.
Along the full spectrum of a human service’s performance, some portion is straightforward and easy to capture, such as successful placement and time-to-placement in this study. But some portion, especially that related to service quality and long-term effects, is hard to observe and define, such as positive quality-of-life change and long-term stability. In this way, using PBC with surrogate performance measures to cover the entire performance domain inevitably leads to a mismatch, ending in incomplete performance improvement or even gaming (Dixit, 2002). Indeed, the more discretion involved in human service delivery, the smaller the portion of service performance that can be clearly captured and measured (Lipsky, 1980). The broader the ambiguous portion of service performance, the less effective PBC can be as a formal control mechanism, and the more room is left for relational contracting. In conclusion, to take full advantage of PBC, public managers should pay attention to the relational side of contracting and devote administrative resources to building trust with contractors.

More broadly, the project underscores two key components of contracting management: (formal) control and trust. The issue of control is a lingering question in organizational management. Studies of management and organizational behavior have long examined effective ways to exercise control over collective action (e.g., Barnard, 1938; Etzioni, 1964). In contracting management, control, as a power of directing, is reflected in contract provisions, monitoring, and the levying of penalties. It includes monitoring of information flows, design of incentives, and allocation of risks. These formal mechanisms are absolutely necessary given public accountability requirements. However, the effectiveness of such control depends on the measurability of job-related behavior or outcomes.
When formal control systems are disturbed by various sociological and psychological factors, formal mechanisms become less effective and more costly. Indeed, because of information asymmetry and uncertainty, contracting out always features some ambiguity, even in areas other than human services. Government will never have complete access to, or influence over, contractors’ operations and resources. Thus, some flexibility for contractors is unavoidable. There will therefore be areas that formal control mechanisms cannot reach, where social control might emerge to play a role. Social control, based on social and normative influence, targets norms, values, and attitudes that may be relevant to desired collective outcomes (O’Reilly & Chatman, 1996). The center of social control is trust, which could create an environment in which the mutually agreed contract goals become self-enforcing. This hybrid contracting management approach would help public managers address the smart-buyer challenge and promote high-quality results.

Certainly, the arguments here are derived from the case study of the Indiana vocational rehabilitation program. The external validity of a case study, as Yin (2009) suggests, lies in “analytical generalization” through replication rather than “statistical generalization” through inference from a sample to a population. This replication logic in theory testing and development demands that the robustness of a theory be confirmed only by replicating the findings in different contexts. In this sense, the research here represents one of the studies that systematically examine the effectiveness of PBC in human service provision. The findings might be used only for conditional, contingent generalizations (George & Bennett, 2005) to other cases that are similar to the one under study.
This project does not intend to generalize in order to infer causal mechanisms across various contexts, although the findings here to some extent coincide with several recent studies in different human service areas (e.g., Heinrich & Choi, 2007; McGrew et al., 2005).

Bibliography

Abadie, A. (2005). Semiparametric Difference-In-Differences Estimators. Review of Economic Studies, 72(1), 1-19.
Abadie, A., & Imbens, G. W. (2006). Large Sample Properties of Matching Estimators for Average Treatment Effects. Econometrica, 74(1), 235-267.
Ai, C., & Norton, E. C. (2003). Interaction Terms in Logit and Probit Models. Economics Letters, 80(1), 123-129.
Aldrich, H., & Herker, D. (1977). Boundary Spanning Roles and Organization Structure. Academy of Management Review, 2(2), 217-230.
Amirkhanyan, A. A. (2010). Monitoring across Sectors: Examining the Effect of Nonprofit and For-Profit Contractor Ownership on Performance Monitoring in State and Local Contracts. Public Administration Review, 70(5), 742-755.
Arrow, K. J. (1964). Control in Large Organizations. Management Science, 10(3), 397-408.
Axelrod, R. (1984). The Evolution of Cooperation. New York: Basic Books.
Baker, G. P. (1992). Incentive Contracts and Performance Measurement. Journal of Political Economy, 100(3), 598-614.
Baker, G. (2002). Distortion and Risk in Optimal Incentive Contracts. Journal of Human Resources, 37(4), 728-751.
Bardach, E. (1977). The Implementation Game: What Happens after a Bill Becomes a Law. Cambridge, MA: MIT Press.
Barnard, C. (1938). The Functions of the Executive. Cambridge, MA: Harvard University Press.
Barnow, B. S. (2000). Exploring the Relationship between Performance Management and Program Impact: A Case Study of the Job Training Partnership Act. Journal of Policy Analysis and Management, 19(1), 118-141.
Becker, S. O., & Ichino, A. (2002). Estimation of Average Treatment Effects Based on Propensity Scores. Stata Journal, 2(4), 358-377.
Behn, R. D. (2002).
Government Performance and the Conundrum of Public Trust. In J. D. Donahue & J. S. Nye, Jr. (Eds.), Market-based governance: Supply side, demand side, upside, and downside (pp. 323-348). Washington, DC: Brookings Institution Press.
Behn, R. D. (2003). Why Measure Performance? Different Purposes Require Different Measures. Public Administration Review, 63(5), 586-606.
Behn, R. D., & Kant, P. A. (1999). Strategies for Avoiding the Pitfalls of Performance Contracting. Public Productivity & Management Review, 22(4), 470-489.
Beinecke, R. H., & DeFillippi, R. (1999). The Value of the Relationship Model of Contracting in Social Services Reprocurements and Transitions: Lessons from Massachusetts. Public Productivity & Management Review, 22(4), 490-501.
Ben-Ner, A., Ren, T., & Paulson, D. F. (2011). A Sectoral Comparison of Wage Levels and Wage Inequality in Human Services Industries. Nonprofit and Voluntary Sector Quarterly, 40(4), 608-633.
Berman, P. (1978). The study of macro- and micro-implementation. Public Policy, 26(2), 157-184.
Bernheim, B. D., & Whinston, M. D. (1998). Incomplete Contracts and Strategic Ambiguity. American Economic Review, 88(4), 902-932.
Bertelli, A. M., & Smith, C. R. (2010). Relational Contracting and Network Management. Journal of Public Administration Research and Theory, 20(suppl 1), i21-i40.
Bevan, G., & Hood, C. (2006). What’s Measured is What Matters: Targets and Gaming in the English Public Health Care System. Public Administration, 84(3), 517-538.
Block, S. R., Athens, K., & Brandenburg, G. (2002). Using Performance-Based Contracts and Incentive Payments with Managed Care: Increasing Supported Employment Opportunities for People with Developmental Disabilities. Journal of Vocational Rehabilitation, 17(3), 165-174.
Bohte, J., & Meier, K. J. (2000). Goal Displacement: Assessing the Motivation for Organizational Cheating. Public Administration Review, 60(2), 173-182.
Bolton, B. F., Bellini, J. L., & Brookings, J. B. (2000).
Predicting Client Employment Outcomes from Personal History, Functional Limitations, and Rehabilitation Services. Rehabilitation Counseling Bulletin, 44(1), 10-21.
Bond, G. R. (2004). Supported Employment: Evidence for An Evidence-based Practice. Psychiatric Rehabilitation Journal, 27(4), 345-359.
Boris, E. T., de Leon, E., Roeger, K. L., & Nikolova, M. (2010). Human Service Nonprofits and Government Collaboration. Washington, DC: Urban Institute.
Brodkin, E. Z. (1997). Inside the Welfare Contract: Discretion and Accountability in State Welfare Administration. Social Service Review, 71, 1-33.
Brodkin, E. Z. (2011). Policy Work: Street-Level Organizations Under New Managerialism. Journal of Public Administration Research and Theory, 21(suppl 2), i253-i277.
Brooke, V., Green, H., O'Brien, D., White, B., & Armstrong, A. (2000). Supported Employment: It's Working in Alabama. Journal of Vocational Rehabilitation, 14(3), 163-171.
Brown, T. L., & Potoski, M. (2004). Managing the Public Service Market. Public Administration Review, 64(6), 656-668.
Brudney, J. L., Fernandez, S., Ryu, J. E., & Wright, D. S. (2005). Exploring and Explaining Contracting Out: Patterns among the American States. Journal of Public Administration Research and Theory, 15(3), 393-419.
Caliendo, M., & Kopeinig, S. (2008). Some Practical Guidance for the Implementation of Propensity Score Matching. Journal of Economic Surveys, 22(1), 31-72.
Campbell, D. T., Stanley, J. C., & Gage, N. L. (1963). Experimental and Quasi-experimental Designs for Research. Boston: Houghton Mifflin.
Chapin, J., & Fetter, B. (2002). Performance-based Contracting in Wisconsin Public Health: Transforming State-Local Relations. Milbank Quarterly, 80(1), 97-124.
Cochran, W. G., & Rubin, D. B. (1973). Controlling Bias in Observational Studies: A Review. Sankhyā: The Indian Journal of Statistics, 35(4), 417-446.
Commons, M., McGuire, T. G., & Riordan, M. H. (1997). Performance contracting for substance abuse treatment.
Health Services Research, 32(5), 631-650.
Cooper, P. J. (2003). Governing by Contract: Challenges and Opportunities for Public Managers. Washington, DC: CQ Press.
Courty, P., & Marschke, G. (2004). An Empirical Investigation of Gaming Responses to Explicit Performance Incentives. Journal of Labor Economics, 22(1), 23-56.
Cragg, M. (1997). Performance Incentives in the Public Sector: Evidence from the Job Training Partnership Act. Journal of Law, Economics, and Organization, 13(1), 147-168.
D'Agostino, R. B., Jr. (1998). Propensity Score Methods for Bias Reduction in the Comparison of A Treatment to A Non-randomized Control Group. Statistics in Medicine, 17(19), 2265-2281.
Daly, D., Tucker-Tatlow, J., & Gibson, C. (2004). Innovations in Performance-Based Contracting. San Diego, CA: Southern Area Consortium of Human Services.
Davis, J. H., Schoorman, F. D., & Donaldson, L. (1997). Toward a Stewardship Theory of Management. Academy of Management Review, 22(1), 20-47.
De Cooman, R., De Gieter, S., Pepermans, R., & Jegers, M. (2011). A Cross-sector Comparison of Motivation-related Concepts in For-profit and Not-for-profit Service Organizations. Nonprofit and Voluntary Sector Quarterly, 40(2), 296-317.
Dehejia, R. H., & Wahba, S. (1999). Causal Effects in Nonexperimental Studies: Reevaluating the Evaluation of Training Programs. Journal of the American Statistical Association, 94(448), 1053-1062.
Dehejia, R. H., & Wahba, S. (2002). Propensity Score-Matching Methods for Nonexperimental Causal Studies. Review of Economics and Statistics, 84(1), 151-161.
DeHoog, R. H. (1984). Contracting Out for Human Services: Economic, Political, and Organizational Perspectives. Albany, NY: SUNY Press.
DeHoog, R. H. (1990). Competition, Negotiation, or Cooperation: Three Models for Service Contracting. Administration & Society, 22(3), 317-340.
Denzin, N. K. (1978). The Research Act (2nd ed.). New York: McGraw-Hill.
Derthick, M. (1972).
New Towns In-town: Why a Federal Program Failed. Washington, DC: The Urban Institute.
DeVaro, J., & Brookshire, D. (2007). Promotions and Incentives in Nonprofit and For-profit Organizations. Industrial and Labor Relations Review, 311-339.
DiMaggio, P. J., & Powell, W. W. (1983). The Iron Cage Revisited: Institutional Isomorphism and Collective Rationality in Organizational Fields. American Sociological Review, 48(2), 147-160.
Dyer, J. H., & Singh, H. (1998). The Relational View: Cooperative Strategy and Sources of Interorganizational Competitive Advantage. Academy of Management Review, 23, 660-679.
Dias, J. J., & Maynard-Moody, S. (2007). For-profit Welfare: Contracts, Conflicts, and the Performance Paradox. Journal of Public Administration Research and Theory, 17(2), 189-211.
Dicke, L. A. (2002). Ensuring Accountability in Human Services Contracting: Can Stewardship Theory Fill the Bill? American Review of Public Administration, 32(4), 455-470.
Dixit, A. (2002). Incentives and Organizations in the Public Sector: An Interpretative Review. Journal of Human Resources, 37(4), 696-727.
Donahue, J. D., & Nye, J. S. (Eds.). (2002). Market-based Governance: Supply Side, Demand Side, Upside, and Downside. Washington, DC: Brookings Institution Press.
Dooley, D., Fielding, J., & Levi, L. (1996). Health and Unemployment. Annual Review of Public Health, 17(1), 449-465.
Dutta, A., Gervey, R., Chan, F., Chou, C.-C., & Ditchman, N. (2008). Vocational Rehabilitation Services and Employment Outcomes for People with Disabilities: A United States Study. Journal of Occupational Rehabilitation, 18(4), 326-334.
Eisenhardt, K. M. (1989). Agency Theory: An Assessment and Review. Academy of Management Review, 14(1), 57-74.
Elmore, R. F. (1979). Backward Mapping: Implementation Research and Policy Decisions. Political Science Quarterly, 94(4), 601-616.
Ernita Joaquin, M., & Greitens, T. J. (2012). Contract Management Capacity Breakdown? An Analysis of U.S. Local Governments.
Public Administration Review, 72(6), 807-816.
Etzioni, A. (1964). Modern Organizations. Englewood Cliffs, NJ: Prentice-Hall.
Faith, J., Panzarella, C., Spencer, R., Williams, C., Brewer, J., & Covone, M. (2010). Use of Performance-Based Contracting to Improve Effective Use of Resources for Publicly Funded Residential Services. The Journal of Behavioral Health Services & Research, 37(3), 400-408.
Faems, D., Janssens, M., Madhok, A., & Van Looy, B. (2008). Toward an Integrative Perspective on Alliance Governance: Connecting Contract Design, Trust Dynamics, and Contract Application. Academy of Management Journal, 51(6), 1053-1078.
Fawber, H. L., & Wachter, J. F. (1987). Job Placement as a Treatment Component of the Vocational Rehabilitation Process. Journal of Head Trauma Rehabilitation, 2(1), 27-33.
Firestone, W. A. (1987). Meaning in Method: The Rhetoric of Quantitative and Qualitative Research. Educational Researcher, 16(7), 16-21.
Frederickson, D. G., & Frederickson, H. G. (2006). Measuring the Performance of the Hollow State. Washington, DC: Georgetown University Press.
Frumkin, P. (2001). Managing outcomes: Milestone contracting in Oklahoma. Washington, DC: The IBM Center for The Business of Government.
Gamble, D., & Moore, C. L. (2003). The Relation between VR Services and Employment Outcomes of Individuals with Traumatic Brain Injury. Journal of Rehabilitation, 69(3), 31-38.
Gates, L. B., Klein, S. W., Akabas, S. H., Myers, R., Schwager, M., & Kaelin-Kee, J. (2004). Performance-based Contracting: Turning Vocational Policy into Jobs. Administration and Policy in Mental Health, 31(3), 219-240.
Gaynor, M. (1990). Incentive Contracting in Mental Health: State and Local Relations. Administration and Policy in Mental Health, 18(1), 33-42.
George, A. L., & Bennett, A. (2005). Case Studies and Theory Development in the Social Sciences. Cambridge, MA: MIT Press.
Ghoshal, S., & Moran, P. (1996). Bad for Practice: A Critique of the Transaction Cost Theory. Academy of Management Review, 21(1), 13-47.
Giffords, E. D. (2003). An Examination of Organizational and Professional Commitment among Public, Not-For-Profit, and Proprietary Social Service Employees. Administration in Social Work, 27(3), 5-23.
Girth, A. M., & Johnston, J. M. (2011). Local Government Contracting. National League of Cities.
Glazerman, S., Levy, D. M., & Myers, D. (2003). Nonexperimental Versus Experimental Estimates of Earnings Impacts. The Annals of the American Academy of Political and Social Science, 589(1), 63-93.
Glover, R. W., & Berger, B. L. (1989). Performance Contracting: The Colorado Model. The Journal of Mental Health Administration, 16(1), 21-28.
Gramlich, E. M., & Koshel, P. P. (1975). Educational Performance Contracting. Washington, DC: Brookings Institution.
Greene, W. (2010). Testing Hypotheses about Interaction Terms in Nonlinear Models. Economics Letters, 107(2), 291-296.
Greevy, R., Lu, B., Silber, J. H., & Rosenbaum, P. (2004). Optimal Multivariate Matching Before Randomization. Biostatistics, 5(2), 263-275.
Gulati, R. (1995). Does Familiarity Breed Trust? The Implications of Repeated Ties for Contractual Choice in Alliances. Academy of Management Journal, 38(1), 85-112.
Guo, S., & Fraser, M. W. (2010). Propensity Score Analysis: Statistical Methods and Applications. Thousand Oaks, CA: Sage Publications.
Hart, O. (1989). An Economist's Perspective on the Theory of the Firm. Columbia Law Review, 1757-1774.
Hart, O. D. (1988). Incomplete Contracts and the Theory of the Firm. Journal of Law, Economics, & Organization, 4(1), 119-139.
Hasenfeld, Y. (1983). Human Service Organizations. Englewood Cliffs, NJ: Prentice-Hall.
Hatry, H. P. (2006). Performance Measurement: Getting Results.
Washington, DC: The Urban Institute.
Haviland, A., Nagin, D. S., & Rosenbaum, P. R. (2007). Combining Propensity Score Matching and Group-Based Trajectory Analysis in An Observational Study. Psychological Methods, 12(3), 247-267.
Heckman, J., Heinrich, C., & Smith, J. (1997). Assessing the Performance of Performance Standards in Public Bureaucracies. American Economic Review, 87(2), 389-395.
Heckman, J., Heinrich, C., & Smith, J. (2003). Performance of Performance Standards. Journal of Human Resources, 37(4), 778-811.
Heckman, J. J., Ichimura, H., & Todd, P. E. (1997). Matching As An Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Programme. The Review of Economic Studies, 64(4), 605-654.
Hefetz, A., & Warner, M. (2004). Privatization and Its Reverse: Explaining the Dynamics of the Government Contracting Process. Journal of Public Administration Research and Theory, 14(2), 171-190.
Heinrich, C. J. (1999). Do Government Bureaucrats Make Effective Use of Performance Management Information? Journal of Public Administration Research and Theory, 9(3), 363-394.
Heinrich, C. J. (2000). Organizational Form and Performance: An Empirical Investigation of Nonprofit and For-Profit Job-Training Service Providers. Journal of Policy Analysis and Management, 19(2), 233-261.
Heinrich, C. J. (2002). Outcomes-based Performance Management in the Public Sector: Implications for Government Accountability and Effectiveness. Public Administration Review, 62(6), 712-725.
Heinrich, C. J., & Choi, Y. (2007). Performance-Based Contracting in Social Welfare Programs. American Review of Public Administration, 37(4), 409-435.
Heinrich, C. J., & Fournier, E. (2004). Dimensions of Publicness and Performance in Substance Abuse Treatment Organizations. Journal of Policy Analysis and Management, 23(1), 49-70.
Heinrich, C. J., & Marschke, G. (2010). Incentives and their Dynamics in Public Sector Performance Management Systems.
Journal of Policy Analysis and Management, 29(1), 183-208.
Hill, C. J. (2006). Casework Job Design and Client Outcomes in Welfare-To-Work Offices. Journal of Public Administration Research and Theory, 16(2), 263-288.
Hjern, B., & Porter, D. O. (1981). Implementation Structures: A New Unit of Administrative Analysis. Organization Studies, 2(3), 211-227.
Ho, D., Imai, K., King, G., & Stuart, E. (2007). Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference. Political Analysis, 15, 199-236.
Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81(396), 945-960.
Hood, C. (2006). Gaming in Targetworld: The Targets Approach to Managing British Public Services. Public Administration Review, 66(4), 515-521.
Jensen, M. C., & Meckling, W. H. (1976). Theory of the Firm: Managerial Behavior, Agency Costs and Ownership Structure. Journal of Financial Economics, 3(4), 305-360.
Jick, T. D. (1979). Mixing Qualitative and Quantitative Methods: Triangulation in Action. Administrative Science Quarterly, 24(4), 602-611.
Joaquin, M. E., & Greitens, T. J. (2012). Contract Management Capacity Breakdown? An Analysis of US Local Governments. Public Administration Review, 72(6), 807-816.
Johnston, J. M., & Girth, A. M. (2012). Government Contracts and "Managing the Market": Exploring the Costs of Strategic Management Responses to Weak Vendor Competition. Administration and Society, 44(1), 3-29.
Johnston, J. M., & Romzek, B. S. (1999). Contracting and Accountability in State Medicaid Reform: Rhetoric, Theories, and Reality. Public Administration Review, 59(5), 383-399.
Joyce, P. G. (1993). Using Performance Measures for Federal Budgeting: Proposals and Prospects. Public Budgeting & Finance, 13(4), 3-17.
Karaca-Mandic, P., Norton, E. C., & Dowd, B. (2012). Interaction Terms in Nonlinear Models. Health Services Research, 47(1), 255-274.
Kaye, H. S. (1998).
Vocational Rehabilitation in the United States. Washington, DC: National Institute on Disability and Rehabilitation Research (NIDRR).
Kearney, K. A., McEwen, E., Bloom-Ellis, B., & Jordan, N. (2010). Performance-based Contracting in Residential Care and Treatment: Driving Policy and Practice Change through Public-Private Partnership in Illinois. Child Welfare, 89(2), 39-55.
Keiser, L. R. (2010). Understanding Street-level Bureaucrats' Decision Making: Determining Eligibility in the Social Security Disability Program. Public Administration Review, 70(2), 247-257.
Kelman, S. (2002). Strategic Contracting Management. In J. D. Donahue & J. S. Nye, Jr. (Eds.), Market-based governance: Supply side, demand side, upside, and downside (pp. 88-102). Washington, DC: Brookings Institution Press.
Kerr, S. (1975). On the Folly of Rewarding A, While Hoping for B. Academy of Management Journal, 18(4), 769-783.
Kettl, D. F. (1988). Government by Proxy: (Mis?)Managing Federal Programs. Washington, DC: CQ Press.
Kettl, D. F. (1993). Sharing Power: Public Governance and Private Markets. Washington, DC: Brookings Institution Press.
Kettl, D. F. (2002). The Transformation of Governance: Public Administration for Twenty-First Century America. Baltimore, MD: Johns Hopkins University Press.
Kettl, D. F. (2002). Managing Indirect Government. In L. M. Salamon (Ed.), The Tools of Government: A Guide to the New Governance (pp. 490-510). New York: Oxford University Press.
Kettl, D. F. (2005). The Global Public Management Revolution. Washington, DC: Brookings Institution Press.
Kettner, P. M., & Martin, L. L. (1993). Performance, Accountability, and Purchase of Service Contracting. Administration in Social Work, 17(1), 61-79.
Kim, Y. W., & Brown, T. L. (2012). The Importance of Contract Design. Public Administration Review, 72(5), 687-696.
Kingdon, J. W. (1999). America the Unusual. Belmont, CA: Thomson/Wadsworth.
Klein Woolthuis, R., Hillebrand, B., & Nooteboom, B. (2005).
Trust, Contract and Relationship Development. Organization Studies, 26(6), 813-840.
Koning, P., & Heinrich, C. J. (2013). Cream-Skimming, Parking and Other Intended and Unintended Effects of High-Powered, Performance-Based Contracts. Journal of Policy Analysis and Management, 32(3), 461-483.
Krauskopf, J. (2008). Performance Measurement in Human Services Contracts. New York Nonprofit Press, 7(2).
Kravchuk, R. S., & Schack, R. W. (1996). Designing Effective Performance-Measurement Systems under the Government Performance and Results Act of 1993. Public Administration Review, 56(4), 348-358.
Lambright, K. T. (2009). Agency Theory and Beyond: Contracted Providers' Motivations to Properly Use Service Monitoring Tools. Journal of Public Administration Research and Theory, 19(2), 207-227.
Lamothe, S., & Lamothe, M. (2012). Understanding the Differences between Vendor Types in Local Governance. American Review of Public Administration, 43(60), 709-728.
Lamothe, M., & Lamothe, S. (2012). What Determines the Formal Versus Relational Nature of Local Government Contracting? Urban Affairs Review, 48(3), 322-353.
Leete, L. (2000). Wage Equity and Employee Motivation in Nonprofit and For-profit Organizations. Journal of Economic Behavior & Organization, 43(4), 423-446.
Levine, D. M., American Educational Research, & American Association of School. (1972). Performance Contracting in Education--An Appraisal: Toward A Balanced Perspective. Englewood Cliffs, NJ: Educational Technology Publications.
Lewis, J. D., & Weigert, A. (1985). Trust as a Social Reality. Social Forces, 63(4), 967-985.
Linn, M. W., Sandifer, R., & Stein, S. (1985). Effects of Unemployment on Mental and Physical Health. American Journal of Public Health, 75(5), 502-506.
Lipsky, M. (1980). Street-level Bureaucracy: Dilemmas of the Individual in Public Services. New York: Russell Sage Foundation.
Lyons, B., & Mehta, J. (1997). Contracts, Opportunism and Trust: Self-interest and Social Orientation.
Cambridge Journal of Economics, 21(2), 239-257.
Lu, M. (1999). Separating the True Effect from Gaming in Incentive-Based Contracts in Health Care. Journal of Economics and Management Strategy, 8(3), 383-431.
Luhmann, N. (1979). Trust and Power. Chichester: Wiley.
Lunceford, J. K., & Davidian, M. (2004). Stratification and Weighting via the Propensity Score in Estimation of Causal Treatment Effects: A Comparative Study. Statistics in Medicine, 23(19), 2937-2960.
Macaulay, S. (1963). Non-contractual Relations in Business: A Preliminary Study. American Sociological Review, 28(1), 55-67.
Macaulay, S. (1985). An Empirical View of Contract. Wisconsin Law Review, 5, 465-482.
Macneil, I. R. (1977). Contracts: Adjustment of Long-Term Economic Relations under Classical, Neoclassical, and Relational Contract Law. Northwestern University Law Review, 72, 854-902.
Macneil, I. R. (1980). The New Social Contract: An Inquiry into Modern Contractual Relations. New Haven, CT: Yale University Press.
Mathison, S. (1988). Why Triangulate? Educational Researcher, 17(2), 13-17.
Martin, L. L. (1999). Performance Contracting: Extending Performance Measurement to Another Level. Public Administration Times, 22 (January): 1 & 2.
Martin, L. L. (2005). Performance-based Contracting for Human Services: Does it Work? Administration in Social Work, 29(1), 63-77.
Martin, L. L., & Kettner, P. M. (1996). Measuring the Performance of Human Service Programs. Thousand Oaks, CA: Sage.
Marvel, M. K., & Marvel, H. P. (2007). Outsourcing Oversight: A Comparison of Monitoring for In-house and Contracted Services. Public Administration Review, 67(3), 521-530.
Matland, R. E. (1995). Synthesizing the Implementation Literature: The Ambiguity-conflict Model of Policy Implementation. Journal of Public Administration Research and Theory, 5(2), 145-174.
Mazmanian, D. A., & Sabatier, P. A. (1983). Implementation and Public Policy. Glenview, IL: Scott Foresman.
McEvily, B., Perrone, V., & Zaheer, A. (2003).
Trust as an Organizing Principle. Organization Science, 14(1), 91-103.
McGrew, J. H., Johannesen, J. K., Griss, M. E., Born, D. L., & Katuin, C. (2005). Performance-based Funding of Supported Employment: A Multi-site Controlled Trial. Journal of Vocational Rehabilitation, 23(2), 81-99.
McGrew, J., Johannesen, J., Griss, M., Born, D., & Katuin, C. (2007). Performance-based Funding of Supported Employment for Persons with Severe Mental Illness: Vocational Rehabilitation and Employment Staff Perspectives. The Journal of Behavioral Health Services and Research, 34(1), 1-16.
McLellan, A. T., Kemp, J., Brooks, A., & Carise, D. (2008). Improving Public Addiction Treatment through Performance Contracting: The Delaware Experiment. Health Policy, 87(3), 296-308.
Mecklenburger, J. (1972). Performance Contracting. Worthington, OH: C.A. Jones.
Meyers, M. K., Glaser, B., & Donald, K. M. (1998). On the Front Lines of Welfare Delivery: Are Workers Implementing Policy Reforms? Journal of Policy Analysis and Management, 17(1), 1-22.
Michalopoulos, C., Bloom, H. S., & Hill, C. J. (2004). Can Propensity-Score Methods Match the Findings from a Random Assignment Evaluation of Mandatory Welfare-to-Work Programs? Review of Economics and Statistics, 86(1), 156-179.
Milgrom, P., & Roberts, J. (1992). Economics, Organization, and Management. Englewood Cliffs, NJ: Prentice-Hall.
Miller, S., & Wilson, N. (1981). The Case for Performance Contracting. Administration and Policy in Mental Health and Mental Health Services Research, 8(3), 185-193.
Milward, H. B., & Provan, K. G. (2000). Governing the Hollow State. Journal of Public Administration Research and Theory, 10(2), 359-380.
Moynihan, D. P. (2008). The Dynamics of Performance Management: Constructing Information and Reform. Washington, DC: Georgetown University Press.
Novak, J., Mank, D., Revell, G., & Zemaitis, N. (1999). Initiatives Influencing the Emergence of Results-based Funding of Supported Employment Services. In G. Revell, K.
J. Inge, D. Mank, & P. Wehman (Eds.), The Impact of Supported Employment for People with Significant Disabilities (pp. 25-42). Richmond, VA: Virginia Commonwealth University, Rehabilitation Research & Training Center on Workplace Supports.
O'Brien, D., & Revell, G. (2005). The Milestone Payment System: Results-based Funding in Vocational Rehabilitation - 2005. Journal of Vocational Rehabilitation, 23(2), 101-114.
O'Brien, D., & Revell, G. (2006). Current Trends in Funding Employment Outcomes. In Wehman, P., Inge, K. J., Revell, G., & Brooke, V. A. (Eds.), Real Work for Real Pay: Inclusive Employment for People with Disabilities. Baltimore, MD: Paul Brookes Publishing.
Okun, A. M. (1975). Equality and Efficiency: The Big Tradeoff. Washington, DC: Brookings Institution Press.
O'Reilly, C. A., & Chatman, J. A. (1996). Culture as Social Control: Corporations, Cults, and Commitment. Research in Organizational Behavior, 18(18), 157-200.
Osborne, D., & Gaebler, T. (1992). Reinventing Government: How the Entrepreneurial Spirit is Transforming the Public Sector. Reading, MA: Addison-Wesley.
Ostrom, E. (1998). A Behavioral Approach to the Rational Choice Theory of Collective Action. American Political Science Review, 92(1), 1-22.
O'Toole, L. J. (2000). Research on Policy Implementation: Assessment and Prospects. Journal of Public Administration Research and Theory, 10(2), 263-288.
Ouchi, W. G. (1980). Markets, Bureaucracies, and Clans. Administrative Science Quarterly, 25(1), 129-141.
Ouchi, W. G., & Maguire, M. A. (1975). Organizational Control: Two Functions. Administrative Science Quarterly, 20(4), 559-569.
Paul, K. I., & Moser, K. (2009). Unemployment Impairs Mental Health: Meta-analyses. Journal of Vocational Behavior, 74(3), 264-282.
Poppo, L., & Zenger, T. (2002). Do Formal Contracts and Relational Governance Function as Substitutes or Complements? Strategic Management Journal, 23(8), 707-725.
Pressman, J. L., & Wildavsky, A. (1984). Implementation.
University of California Press.
Prottas, J. M. (1978). The Power of the Street-Level Bureaucrat in Public Service Bureaucracies. Urban Affairs Review, 13(3), 285-312.
Provan, K. G., & Milward, H. B. (2001). Do Networks Really Work? A Framework for Evaluating Public-Sector Organizational Networks. Public Administration Review, 61(4), 414-423.
Puhani, P. A. (2012). The Treatment Effect, the Cross Difference, and the Interaction Term in Nonlinear “Difference-In-Differences” Models. Economics Letters, 115(1), 85-87.
Radin, B. (2006). Challenging the Performance Movement: Accountability, Complexity, and Democratic Values. Washington, DC: Georgetown University Press.
Revell, W. G., West, M., & Cheng, Y. (1998). Funding Supported Employment: Are There Better Ways? Journal of Disability Policy Studies, 9(1), 59-79.
Riccucci, N. (2005). How Management Matters: Street-level Bureaucrats and Welfare Reform. Washington, DC: Georgetown University Press.
Ring, P. S., & Van de Ven, A. H. (1994). Developmental Processes of Cooperative Interorganizational Relationships. Academy of Management Review, 19(1), 90-118.
Romzek, B. S. (2000). Dynamics of Public Sector Accountability in an Era of Reform. International Review of Administrative Sciences, 66(1), 21-44.
Romzek, B. S., & Johnston, J. M. (2002). Effective Contract Implementation and Management: A Preliminary Model. Journal of Public Administration Research and Theory, 12(3), 423-453.
Romzek, B. S., & Johnston, J. M. (2005). State Social Services Contracting: Exploring the Determinants of Effective Contract Accountability. Public Administration Review, 65(4), 436-449.
Romzek, B. S., LeRoux, K., & Blackmar, J. M. (2012). A Preliminary Theory of Informal Accountability among Network Organizational Actors. Public Administration Review, 72(3), 442-453.
Rosenbaum, P. R. (2002). Observational Studies (2nd ed.). New York: Springer.
Rosenbaum, P. R., & Rubin, D. B. (1983).
The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika, 70(1), 41-55.
Rosenbaum, P. R., & Rubin, D. B. (1985). Constructing a Control Group Using Multivariate Matched Sampling Methods that Incorporate the Propensity Score. American Statistician, 39(1), 33-38.
Rousseau, D. M., Sitkin, S. B., Burt, R. S., & Camerer, C. (1998). Not So Different After All: A Cross-discipline View of Trust. Academy of Management Review, 23(3), 393-404.
Rubin, D. B. (1973). Matching to Remove Bias in Observational Studies. Biometrics, 29(1), 159-183.
Rubin, D. B. (1976). Inference and Missing Data. Biometrika, 63(3), 581-592.
Rubin, D. B. (1979). Using Multivariate Matched Sampling and Regression Adjustment to Control Bias in Observational Studies. Journal of the American Statistical Association, 74(366), 318-328.
Rubin, D. B. (1997). Estimating Causal Effects from Large Data Sets Using Propensity Scores. Annals of Internal Medicine, 127(8), 757-763.
Rubin, D. B., & Thomas, N. (1996). Matching Using Estimated Propensity Scores: Relating Theory to Practice. Biometrics, 52(1), 249-264.
Rubin, S. E., Roessler, R., & Dunkerby, M. (1983). Foundations of the Vocational Rehabilitation Process. Boston: University Park Press.
Sabatier, P. A. (1986). Top-Down and Bottom-Up Approaches to Implementation Research: A Critical Analysis and Suggested Synthesis. Journal of Public Policy, 6(1), 21-48.
Salamon, L. M. (1987). Of Market Failure, Voluntary Failure, and Third-Party Government: Toward a Theory of Government-Nonprofit Relations in the Modern Welfare State. Nonprofit and Voluntary Sector Quarterly, 16(1-2), 29-49.
Salamon, L. M. (1995). Partners in Public Service: Government-Nonprofit Relations in the Modern Welfare State. Baltimore, MD: Johns Hopkins University Press.
Salamon, L. M. (1989). Beyond Privatization: The Tools of Government Action. Washington, DC: Urban Institute Press.
Sandfort, J. R. (2000).
Moving beyond Discretion and Outcomes: Examining Public Management from the Front Lines of the Welfare System. Journal of Public Administration Research and Theory, 10(4), 729-756.
Savas, E. S. (1987). Privatization: The Key to Better Government. Chatham, NJ: Chatham House.
Schlesinger, M., Dorwart, R. A., & Pulice, R. T. (1986). Competitive Bidding and States’ Purchase of Services: The Case of Mental Health Care in Massachusetts. Journal of Policy Analysis and Management, 5(2), 245-263.
Schlesinger, M., Mitchell, S., & Gray, B. H. (2004). Public Expectations of Nonprofit and For-profit Ownership in American Medicine: Clarifications and Implications. Health Affairs, 23(6), 181-191.
Sclar, E. D. (2001). You Don't Always Get What You Pay for: The Economics of Privatization. Ithaca, NY: Cornell University Press.
Shadish, W. R., Clark, M. H., & Steiner, P. M. (2008). Can Nonrandomized Experiments Yield Accurate Answers? A Randomized Experiment Comparing Random and Nonrandom Assignments. Journal of the American Statistical Association, 103(484), 1334-1343.
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and Quasi-experimental Designs for Generalized Causal Inference. Independence, KY: Wadsworth Cengage Learning.
Shapiro, S. P. (2005). Agency Theory. Annual Review of Sociology, 31, 263-284.
Smith, D. C., & Grinker, W. J. (2004). The Promise and Pitfalls of Performance-Based Contracting. Presentation at the 25th Annual Research Conference of the Association for Public Policy Analysis and Management (APPAM), Washington, DC. November 5-8, 2003.
Smith, S. R., & Smyth, J. (1996). Contracting for Services in a Decentralized System. Journal of Public Administration Research and Theory, 6(2), 277-296.
Steckler, A., McLeroy, K. R., Goodman, R. M., & Bird, S. T. (1992). Toward Integrating Qualitative and Quantitative Methods: An Introduction. Health Education Quarterly, 19(1), 1-8.
Stewart, M. T., Horgan, C. M., Garnick, D. W., Ritter, G., & McLellan, A.
T. (2013). Performance Contracting and Quality Improvement in Outpatient Treatment: Effects on Waiting Time and Length of Stay. Journal of Substance Abuse Treatment, 44(1), 27-33.
Stillman, R. J. (1991). Preface to Public Administration: A Search for Themes and Direction. New York: St. Martin’s Press.
Thompson, J. D. (1967). Organizations in Action. New York: McGraw-Hill.
Reuer, J. J., & Ariño, A. (2007). Strategic Alliance Contracts: Dimensions and Determinants of Contractual Complexity. Strategic Management Journal, 28(3), 313-330.
U.S. Child Care Bureau. (2008). Child Care and Development Fund: Report of state and territory plans FY 2008-2009.
U.S. Child Care Bureau. (2009). Examples of Performance-based Contracts in Child Welfare Services.
U.S. Department of Education. (2010). RSA Annual Report for fiscal year 2010.
U.S. Government Accountability Office (GAO). (2001). Contract Management: Trends and Challenges in Acquiring Services. GAO-01-753T.
U.S. Government Accountability Office (GAO). (2002). Contract Management: Guidance Needed for Using Performance-Based Service Contracting. GAO-02-1049.
U.S. Office of Federal Procurement Policy (OFPP). (2007). Fiscal Year 2008 Performance-Based Acquisition Performance Goal.
U.S. Office of Federal Procurement Policy (OFPP). (2007). Using Performance-Based Acquisition to Meet Program Needs - Performance Goals, Guidance, and Training.
Uzzi, B. (1997). Social Structure and Competition in Interfirm Networks: The Paradox of Embeddedness. Administrative Science Quarterly, 42(1), 35-67.
Van Slyke, D. M. (2003). The Mythology of Privatization in Contracting for Social Services. Public Administration Review, 63(3), 296-315.
Van Slyke, D. M. (2007). Agents or Stewards: Using Theory to Understand the Government-Nonprofit Social Service Contracting Relationship. Journal of Public Administration Research and Theory, 17(2), 157-187.
Van Thiel, S., & Leeuw, F. L. (2002). The Performance Paradox in the Public Sector.
Public Performance & Management Review, 25(3), 267-281.
Vandaele, D., Rangarajan, D., Gemmel, P., & Lievens, A. (2007). How to Govern Business Services Exchanges: Contractual and Relational Issues. International Journal of Management Reviews, 9(3), 237-258.
Waernbaum, I. (2010). Propensity Score Model Specification for Estimation of Average Treatment Effects. Journal of Statistical Planning and Inference, 140(7), 1948-1956.
Warner, M. E., & Hefetz, A. (2008). Managing Markets for Public Service: The Role of Mixed Public–Private Delivery of City Services. Public Administration Review, 68(1), 155-166.
Warner, M. E., & Hefetz, A. (2009). Cooperative Competition: Alternative Service Delivery, 2002-2007. In The Municipal Year Book 2009, ed. ICMA, 11–20. Washington, DC: International City/County Management Association.
Wedel, K. R., & Conston, S. W. (1988). Performance Contracting for Human Services: Issues and Suggestions. Administration in Social Work, 12(1), 73-87.
Williams, D. W. (2003). Measuring Government in the Early Twentieth Century. Public Administration Review, 63(6), 643-659.
Williamson, O. E. (1985). The Economic Institutions of Capitalism: Firms, Markets, Relational Contracting. New York: Free Press.
Wilson, J. Q. (2000). Bureaucracy: What Government Agencies Do and Why They Do It. New York: Basic Books.
Witesman, E. M., & Fernandez, S. (2013). Government Contracts With Private Organizations: Are There Differences Between Nonprofits and For-profits? Nonprofit and Voluntary Sector Quarterly, 42(4), 689-715.
Yin, R. K. (2009). Case Study Research: Design and Methods (4th ed.). Los Angeles, CA: Sage.
Zand, D. E. (1972). Trust and Managerial Problem Solving. Administrative Science Quarterly, 17(2), 229-239.
Zhao, Z. (2008). Sensitivity of Propensity Score Methods to the Specifications. Economics Letters, 98(3), 309-319.
Zollo, M., Reuer, J. J., & Singh, H. (2002). Interorganizational Routines and Performance in Strategic Alliances.
Organization Science, 13(6), 701-713.
Zucker, L. G. (1986). Production of Trust: Institutional Sources of Economic Structure, 1840–1920. Research in Organizational Behavior, 8, 53-111.