ABSTRACT

Title of Dissertation: ALTERNATIVE DIRECTIONS FOR MINIMALIST INQUIRY: EXPANDING AND CONTRACTING PHASES OF DERIVATION

John Edward Drury, Doctor of Philosophy, 2005

Dissertation Directed By: Professor Juan Uriagereka, Linguistics Department

This dissertation develops novel derivational mechanics for characterizing the syntactic component of human language: Tree Contraction Grammar (TCG). TCG falls within a general class of derivationally-oriented minimalist approaches, constituting a version of a Multiple Spell Out (MSO-)system (Chomsky 1999, Uriagereka 1999, 2002). TCG posits a derivational WORKSPACE restricting the size of structures that can be active at a given stage of derivation. As structures are expanded, workspace limitations periodically force contractions of the span of structure visible to operations. These expansion-contraction dynamics are shown to have implications for our understanding of locality of dependencies, specifically regarding successive cyclic movement. The mechanics of TCG rely on non-standard assumptions about the direction of derivation: structure assembly is required to work top-down.

TCG draws a key idea from TAG; that is, recursive structure ought to play a direct role in delimiting the range of possible interactions between syntactic elements in phases of derivation. TAG factors complex structures into non-recursive elementary trees and recursive auxiliary trees that are combinable via TAG's two operations (substitution/adjoining). In TCG the expansion of structure in the workspace is similarly limited to containing only non-recursive stretches of structure. In the course of a derivation, encountering "repeated elements" in the expanding dominance ordering forces contractions of the workspace (understood to happen in potentially different ways depending on the properties of repeated elements).
In certain circumstances, repeated elements are identified, allowing information from earlier stages of derivation to be carried over to later stages, underwriting our (novel) view of successive cyclicity. Recursive structure is retained in the global "output" structure, upon parts of which we understand the workspace to be superimposed.

ALTERNATIVE DIRECTIONS FOR MINIMALIST INQUIRY: EXPANDING AND CONTRACTING PHASES OF DERIVATION

By John Edward Drury

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park, in partial fulfillment of the requirements for the degree of Doctor of Philosophy 2005

Advisory Committee:
Professor Juan Uriagereka, Chair
Professor Norbert Hornstein
Professor Howard Lasnik
Professor Colin Phillips
Professor Amy Weinberg
Professor James Reggia

© Copyright by John Edward Drury 2005

Dedication

To my Father Dr. Thomas Francis Drury for fostering love of wisdom
And to Lisa for patience and understanding

Acknowledgements

There is no way to adequately thank all of the people who made the completion of this dissertation possible. But this is the place where one is to make such an attempt, so here goes. I thank first my friend and committee chair Juan Uriagereka for leaving just enough bread crumbs behind him on that twisting and difficult to follow trail that leads to the Outside of The Box. Thanks as well go to my committee: Colin Phillips, Amy Weinberg, Howard Lasnik, and Norbert Hornstein, and to James Reggia for agreeing to fill the role as Dean's Rep on short notice. To both Howard and Norbert I owe a special additional debt for last minute help above and beyond the call. Standing on the shoulders of others is a job for an acrobat, which I am not.
So, to the following people I must say that I am sorry for all the kicking and if I poked you in the eye once or twice, please know it was not intentional... thank you for your patience and for whatever help you provided me at one point or another in conversations about linguistics in general or about issues in the vicinity of this dissertation: Mark Arnold, Stephen Crain, Norbert Hornstein, Howard Lasnik, David Lebeaux, David Lightfoot, Colin Phillips, Phil Resnik, Amy Weinberg, Juan Carlos Castillo, Kleanthes Grohmann, Cedric Boeckx, Max Guimarães, Jeff Lilly, Matt Kaiser, Tom Cornell, Robert Chametzky, Chris Wilder, Georges Rey, Ricardo Extepare, Itziar San Martin, Aitizber Atutxa, Lucia Quintana, Eli Murgia, Caro Struijke, Acrisio Pires, Peggy Antonisse, Julien Musolino, Viola Miglio, and Laura Benua.

I owe a serious debt to Maryland's Linguistics Department as a whole, for supporting my work and my education, and for providing an excellent environment in which to grow. Within this department, Kathi Faulkingham deserves extra special thanks for helping to ensure that my typically pathological response to administrative deadlines did not cause any unsolvable problems. Thanks also go to Michael Ullman for generous support and encouragement during my stay in the Brain & Language Lab in Georgetown University's Department of Neuroscience, and to B&L Lab members past and present, especially Karsten Steinhauer and Matt Walenski.

I thank also my parents, Thomas and Margaret Drury, for unconditional love and backing in all my endeavors. And last, to my amazing wife Lisa, thank you for putting up with my process and my drama over these years, and for keeping the faith that one day (today!) this would actually come to an end... and thank you little O, for demanding that I complete this work so as to make more time for play and laughter.
Table of Contents

DEDICATION
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
CHAPTER 1: SPELLING OUT THE WORKSPACE
1.1. WS CONSTRAINTS & A SKETCH OF SCM IN TCG
1.2. LOCAL & LINKED LOCAL RELATIONS: SAMPLE TCG DERIVATIONS
1.3. SUMMARY SO FAR... & THE PATH AHEAD
1.4. SO-SYSTEMS, TAG, & GENERALIZING THE WS/O-DISTINCTION
1.4.1. MSO and Linearization
1.4.2. Phases/MSO & Successive Cyclicity
1.4.3. What Goes for Cyclicity in TAG
1.5. IMPLEMENTATION OF TCG
1.5.1. On Relating Positions in Structure
1.5.2. Labels & Structure
1.6. CHAPTER SUMMARY
CHAPTER 2: REGARDING SUCCESSIVE CYCLIC MOVEMENT
2.1. TYPES OF SUCCESSIVE CYCLICITY EFFECTS
2.1.1. Wh-Copying
2.1.2. Q-Stranding
2.1.3. Agreement on the Path
2.1.4. Some Binding Theoretic Effects
2.1.5. Interaction with Variable Binding
2.1.6. Inversion Effects
2.1.7. Control as Raising?
2.2. SOME THINKING ON SUCCESSIVE CYCLICITY
2.2.1. M-Features
2.2.2. Cyclicity without M-features?
2.3. CHAPTER SUMMARY & FURTHER CRITICAL REMARKS
CHAPTER 3: TCG ANALYSIS
3.1. LOCAL RELATIONS: PART ONE
3.2. LINKED-LOCAL RELATIONS I: RAISING TO SUBJECT (RTS)
3.2.1. Some Raising Impossibilities & Expletive/Associate Relations
3.2.2. Binding Interventions & SCM
3.2.3. Variable Binding/Condition C Interaction
3.3. LINKED LOCAL RELATIONS II: WH-MOVEMENT
3.3.1. ?-Identification & Local Movement
3.3.2. Core Cases of A'-SCM (& Some Technical Problems Addressed)
3.4. LOCAL RELATIONS: PART TWO
3.4.1. Raising-to-Object (RtO)
3.4.2. Passives, Local Binding, & Control
3.5. CLAUSAL UNITHOOD & WH-AGAIN...
3.6. CONCLUSIONS: THE TAKE-HOME MESSAGE OF TCG
3.7. CLOSING
REFERENCES

CHAPTER 1: Spelling Out the Workspace

the idea of the form implicitly contains also the history of such a form
Hallé, Oldeman, & Tomlinson (1978)

This dissertation develops novel derivational mechanics for characterizing the syntactic component of human language: Tree Contraction Grammar (TCG). The approach falls into the general class of derivationally oriented systems under development within the Minimalist Program,[1] and more specifically into a category of models that I will call here Multiple Spell Out (MSO-)systems (Chomsky 1999, Uriagereka 1999, 2002). MSO-systems, generally speaking, divide derivations into sub-derivations, the outputs of which may be independently evaluated at the interfaces to extra-grammatical systems and may play a special role in demarcating domains with import for understanding the locality of syntactic relationships.

I propose in this work a general way of thinking about how MSO-systems function, relying on a distinction between a syntactic WORKSPACE and a derived OUTPUT structure. It is within this context that the core theoretical intuition underlying TCG emerges. The approach is informed as well by Tree Adjoining Grammar (TAG).[2] In particular, as in TAG approaches, the fundamental notion of recursive structure is argued here to play a direct role in understanding the locality properties of syntactic dependencies. The general perspective developed in this work sees MSO-systems (and TCG in particular) together with TAG as a family of closely related approaches.

[1] Chomsky (1995, 1998, 1999), Uriagereka (1998, 2002), Lasnik & Uriagereka (2004).

[2] See Frank (1992, 2002), Frank & Kroch (1995) among many others.
TCG distinguishes itself in terms of the mechanisms it makes available for analyses of successive cyclic movement (SCM) phenomena in two ways that I argue to be of broad interest theoretically:

(1) a. The Non-Existence of "EPP-/P-features": if the key ideas are right, special features driving intermediate movements in SCM are not needed
    b. Derivational Directionality: the mechanics of TCG derivations demand that structure assembly work "top-down", and not bottom-up as in Chomsky's (1994, 1995) widely adopted Bare Phrase Structure (BPS)

It turns out that (1)a and (1)b are related. Telling the story of how this might be so is the main mission of this dissertation.

A metaphor helps to get the general intuition behind our Workspace/Output-Distinction (WS/O-distinction): picture an arbitrarily complex syntactic structure upon which we might shine a spotlight beam which can illuminate only small portions of structure, leaving the rest in darkness. Construction of syntactic objects and the licensing of dependencies within such structures is understood to take place within the illuminated span, and not elsewhere (no syntactic work can happen in the dark). Thus, in order to expand structure beyond the maximal span of illumination, the spotlight beam must "move on". As a consequence, some of the previously established structure will necessarily have to be left behind outside of the illuminated zone. If derivationally later expansion of structure requires the spotlight to move on before a required syntactic licensing operation has occurred, there is no backtracking to fix the problem. In such a situation the output structure is stuck with some illegible/unlicensed property which has been left outside the spotlight. (The interface systems, unlike the narrow syntactic combinatorial system defined by the workspace, can see in the dark.)
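The spotlight metaphor can be made concrete with a small sketch. It assumes, purely for illustration, that structure is a simple top-down chain and that the workspace holds at most a fixed number of nodes (the hypothetical parameter `max_span`); neither assumption is a claim of the theory. Nodes pushed out of the window remain in the output but are no longer available to operations.

```python
from collections import deque

def derive_top_down(nodes, max_span):
    """Expand a chain of nodes top-down through a bounded workspace.

    `max_span` is a hypothetical fixed window size, assumed for illustration.
    Returns (output, workspace): all structure established so far, and the
    part of it that is still visible to syntactic operations.
    """
    output = []                          # everything built so far
    workspace = deque(maxlen=max_span)   # the "illuminated" span
    for node in nodes:
        output.append(node)
        workspace.append(node)           # the oldest node drops out when full
    return output, list(workspace)

output, workspace = derive_top_down(["A", "B", "C", "D"], max_span=2)
print(output)     # ['A', 'B', 'C', 'D']
print(workspace)  # ['C', 'D']
```

The output only ever grows, while the workspace slides along it; this is the contiguous case only, with no backtracking to nodes that have left the window.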
The task of developing this metaphor into concrete proposals that can support reasoning about the syntactic component of the human language faculty is thus the task of specifying the principles that could be understood to govern the span of this spotlight beam, how and what operations happen within it, how and why it moves on to further expand structure, what happens to the old structure left outside the workspace boundaries as such later expansions occur, and so on.[3]

Consider a graphical illustration of the intuition informing our workspace/output distinction (WS/O-distinction). The following in (2) and (3) show two different partial derivations. The first of these schemas illustrates a system in which construction within the workspace creates the output structure in a top-down fashion; the second schema represents a similar process working bottom-up. Direction of derivation in this sense will be important in this work. As mentioned above, the present approach will be required to work roughly as pictured in (2) and not as in (3).

(2) [Figure: partial derivation built top-down; an arrow marks the DIRECTION OF DERIVATION]

(3) [Figure: partial derivation built bottom-up; an arrow marks the DIRECTION OF DERIVATION]

[3] Note that any connection to the notion of "working memory" as this notion is deployed in the psychological literature is metaphoric. In particular, the limitations proposed here to govern the maximal spans of the derivational workspace are not expected to vary across individuals. The constraints proposed here are alleged to be "hard" architectural constraints on the combinatorial component. I do think there is room to relate the proposals here to a story about memory systems, but I won't be spending time on this issue in the present work.

The enclosures in these schemas represent the state of the workspace at earlier vs. later stages of derivation. Again, if this "spotlight beam" is limited in its scope then it must at certain points of derivation move on to expand further structure, thus leaving some of the previously established structure outside of its boundaries.
Such abandoned spans are represented by the dotted arcs for the portions of the structure outside the enclosures in (2) and (3). The notion of an OUTPUT then is the entire span of the established structure, of which the workspace only allows us to see parts at any given step.[4]

[4] With one exception: the first "phase" of derivation involving expansion of the structural description will, up until the first contraction of the workspace, correspond one-to-one with the output structure (see also fn6).

I will show below that this way of thinking, our workspace/output distinction (WS/O-distinction), is helpful for reasoning about the workings of MSO-type approaches generally. For example, in addition to having this distinction serve as a platform for the development of TCG, I will be discussing both Chomsky's (1998, 1999) view of spell-out as happening "by phase" in these terms, as well as Uriagereka's (1999) linearization-based view.

To begin to get a better feel for the WS/O-distinction, consider the old idea that there might exist special cyclic nodes defining domains in syntactic complexes within which some inventory of transformational operations ⟨T1, ..., Tn⟩ are applied and then reapplied to (recycled in) the next higher domain, and so on, as in (4).

(4) S4: (RE)APPLY ⟨T1, ..., Tn⟩ (cycle-?)
    S3: (RE)APPLY ⟨T1, ..., Tn⟩ (cycle-?)
    S2: (RE)APPLY ⟨T1, ..., Tn⟩ (cycle-?)
    S1: APPLY ⟨T1, ..., Tn⟩ (cycle-?)

This can be understood in present terms by identifying the maximal span of the workspace with such cyclic domains. As cycles are completed, a new workspace (cyclic domain) begins, leaving the previous cyclic domain stranded outside the borders of the workspace. This is pictured in (5):

(5) [Figure: successive workspaces over the cyclic nodes S1 to S4; as each cycle is completed, the previous cyclic domain is left outside the workspace]

But this is just translating terminology:
ordered nodes marking domains for applications of an inventory of transformations give way to a limited workspace that periodically expands up to a cyclic node and contracts, "clearing the buffer" to make way for the next domain expansion.[5] So what is the interest of the WS/O-distinction?

[5] My thinking of spell-out in terms of an emptying/clearing of a buffer of sorts grows in part from numerous helpful conversations with Max Guimarães. See Guimarães (1999) for related discussion involving alternative derivational directionality and possible applications to thinking about syntax/prosody relationships.

Note that the schema in (5), and those in both (2) and (3) above, obey a general restriction. The structures in the workspaces at different steps of derivation in these examples are always contiguous (sometimes improper) parts of the output structure.[6] I suggest that a principled view of how the contents of the workspace are regulated can result in situations where non-contiguous portions of the output structure are represented in the workspace. This will be key to our development of a basis for analyses of core cases of SCM. Consider a rather fancier illustration which conveys this alternative intuition:

(6) [Figure: three stages a, b, c of a derivation; the two rightmost objects, in c, are marked as workspace-equivalent ("ws =")]

In (6) two types of workspace contractions are represented. Assuming for this example a top-down derivation, the first such contraction occurs between the first two structures in (6)a and (6)b. This is just like the contractions depicted in (2) in which we see expansion of the workspace at the bottom end forcing an abandonment of structure at the top (i.e., the spotlight beam moves on; of course the opposite happens with our schemas in (3) and (5) above where expansions at the top force abandoning structure at the bottom).

[6] That is, for any starting point of a derivation, up to the first contraction of the available span of structure visible in the workspace the workspace structure corresponds directly to the output. I say "corresponds" rather than "is identical to" since I will be viewing the workspace superimposition on outputs as a matter of only syntactic (F) properties being "in" the workspace, which will be understood to be associated with the PF- and LF-relevant properties that will be understood to be what populates the output structure. However, the "structure" (ordering relationships between nodes) will be understood to be the same across the workspace/output division. I will unpack all this below.

However: between the second and third structure in (6)b and (6)c we have a contraction which is unlike the first. In this second contraction it is neither the top nor the bottom of the structure which is voided from the workspace, but rather some intermediate stretch. This results in bringing two nodes (the open/unshaded nodes in (6)) into a more local relationship within the workspace than existed prior to the contraction. It also results in some intervening material being "spliced-out" of the workspace, though we understand this spliced-out material to still be present in the output. Thus, the two rightmost objects (those in (6)c) are equivalent in terms of what is in the workspace (I mark this above as "ws =") though the second abstracts away from the output structure to which it is connected (i.e., via the elements still in the workspace).

Our metaphor of a spotlight beam breaks down at this point of course, so let us kick that ladder away; the formal intuition should now be clear enough to proceed. The key idea is that non-contiguous portions of the output may be maintained in the workspace (WS).

I will suggest below that the best way of viewing the WS/O-distinction is not in terms of two levels of syntax, but rather to understand the connection between the two as a dynamic interface. WS-computations incrementally yield a structured object populated by only LF and PF relevant properties.
But an interface is the meeting point of at least two different systems. If the WS is itself an interface system, then we should ask, "between what and what else?" On one side of the WS I have just suggested that we have a structured PF/LF object; what then is on the other side? Answer: the lexicon. The view of syntactic architecture can be visualized as follows:

(7) [Figure: the LEXICON feeding successive workspaces WS-1, WS-2, ..., WS-n]

A better way of putting things then would be to say that the WS is itself the "interface" between the lexicon, on the one hand, and derived PF/LF output structures on the other. The lexicon feeds the WS, which expands up to its limits (such limits are introduced and developed below), and then moves on or contracts. The dynamics of WS expansion and contraction leave in their wake a structured object, a tree, which is populated by only PF and LF relevant properties.

The interest of the WS/O-distinction within the TCG approach developed here is in the nature of the shrinking/contraction processes that yield a way of treating superficially non-local relationships as potentially reducible to more local domains. So the key question becomes: what drives contractions of the workspace, and how might we understand these to work in a way that can support analysis of more non-local-looking relationships?

I begin development of my answer(s) to this question in §1.1, introducing two constraints on categories and ordering in the workspace, and show how this yields a novel schema for analyses of successive cyclicity phenomena. In §1.2 I develop some sample general derivations for A- and A'-relations, and highlight some features that will be of interest in later discussion. §1.3 sums up the previous two. §1.4 backs up to consider MSO-systems and TAG, focusing mainly on their general outlooks on successive cyclic movement.
The discussion of these neighboring models is framed within our WS/O-distinction, bringing forward its generality and highlighting some conceptual advantages of viewing MSO-systems in particular in this way. These discussions lead us to consider, in §1.5, some technical issues regarding the theory of movement chains, developing a version of an idea offered in Chomsky (1995) where it is suggested that movement chains be understood as sets of contexts/positions.[7] In §1.5 I also consider issues regarding categories/features and formal ordering (that is, the theory of local intra-/inter-phrasal organization) and settle on a "reduced" view adopting some ideas from Brody (2000, 2003). This reduced view is suggested to be a positive step in the direction of restrictiveness of the overall theory, though the central motivation for its adoption is its overall "good fit" with the key intuitions underlying TCG.

[7] To jump ahead a bit: consider Chomsky's particular view, which takes the "context" for an element to be the entire derivation up to the point where that element is introduced/merged (i.e., the "sister" of the element is the context defining this occurrence of it, and the "sister" is itself viewed as the entire structure that this sister element dominates). I suggest in §1.5 that this is both too strong and too weak: it is too strong in that identifying/distinguishing occurrences requires reference to arbitrarily large stretches of previously established structure; it is too weak because we will see that reducing our understanding of contexts to just the label of the context an element relates to will permit us to view some contexts as indistinguishable from others. This will turn out to be what underwrites SCM without the postulation of EPP-properties.

Following this, Chapter 2 discusses empirical and theoretical issues regarding successive cyclic movement in some more detail, and raises some issues regarding so-called EPP-/P-features (what are called "Move/Merge-features" or "M-features" here), targeting them for elimination in the TCG approach. A number of other views regarding cyclicity are discussed as well.

Chapter 3 then turns to develop the TCG ideas in more detailed analytical discussion, focusing initially on Raising-to-Subject (RtS) and wh-movement. The approach is demonstrated to require a top-down implementation, and some contrasts with TAG are discussed. The approach is then explored in possible extensions to other phenomena. I conclude with some open questions and general discussion regarding the architecture of syntactic theory.

1.1. WS Constraints & A Sketch of SCM in TCG

In this section I introduce two possible constraints on the workspace and examine some of their consequences. These notions conspire to yield linked-local relations of the successive cyclic movement (SCM) sort.

(8) WORKSPACE ORDER: The elements in the workspace manifest a weak partial order (DOMINANCE)[8]

(9) WORKSPACE DISTINCTNESS (ANTI-RECURSION): The workspace does not tolerate the presence of multiple tokens of type X

[8] I will consider the possibility of a rather stronger statement regarding ordering in later discussion.

First, as mentioned, the system will be understood to work "top-down". I will return to explain why things must, in fact, work this way in the conclusion. Take the shaded node in (10)a to be a "to-be-moved" element and the open/unshaded node to be its initial structural context (housing the relevant licensing feature(s), e.g., wh or perhaps Case/φ information). Assume for the moment that the branching order represented by the tree structure manifests the traditional notion of dominance (i.e., a transitive, antisymmetric, reflexive relation):

(10) [Figure: five derivational stages a to e, with steps labeled MATCHING RELATION, IDENTIFICATION, and CONTRACTION; stages d and e are marked as workspace-equivalent ("ws =")]

At the point in (10)b an element is introduced which satisfies a matching relation. This relation will be further specified below; for now simply take it to hold of a pair x, y if x and y are non-distinct.
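The two workspace constraints in (8) and (9) can be sketched as simple checks over a set of nodes. This is only an illustration under stated assumptions: the node names (`n1`, `n2`, ...) and category labels (`C`, `T`, `V`) are hypothetical, dominance is represented as a set of ordered pairs, and "type" in (9) is modeled as a plain category label.

```python
from itertools import product

def chain_order(names):
    """Reflexive, transitive dominance pairs for a top-down chain of nodes."""
    return {(names[i], names[j])
            for i in range(len(names)) for j in range(i, len(names))}

def is_weak_partial_order(elements, dominates):
    """(8): dominance must be reflexive, antisymmetric, and transitive."""
    reflexive = all((x, x) in dominates for x in elements)
    antisymmetric = all(x == y or (y, x) not in dominates
                        for (x, y) in dominates)
    transitive = all((x, z) in dominates
                     for (x, y), (w, z) in product(dominates, repeat=2)
                     if y == w)
    return reflexive and antisymmetric and transitive

def is_distinct(category_of):
    """(9): no category may have more than one token in the workspace."""
    categories = list(category_of.values())
    return len(categories) == len(set(categories))

# Hypothetical workspace: three nodes bearing three different categories.
category_of = {"n1": "C", "n2": "T", "n3": "V"}
dominates = chain_order(["n1", "n2", "n3"])
print(is_weak_partial_order(category_of, dominates))  # True
print(is_distinct(category_of))                       # True

# Introducing a second token of category C violates (9).
category_of["n4"] = "C"
print(is_distinct(category_of))                       # False
```

The last call corresponds to the stage at which a non-distinct element enters the workspace: the configuration is not well-formed until the system responds in some way.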
The situation in (10)b thus violates the distinctness condition on workspace contents stated above in (9), so something must happen in response in order for this to be a well-formed workspace. Suppose that the system responds by taking these non-distinct elements to be essentially one and the same thing. If they are identified then whatever the higher element dominates, so does the lower one. This effects a copying/lowering of the shaded node ((10)c). Note that we do not duplicate elements in the workspace: what happens in the workspace is an identification of the open/unshaded nodes, so that subsequent to the contraction step in (10)d there are not two tokens or occurrences of either the "moving" element or its context. There are rather just single nodes for each in the workspace (note the WS equivalence is marked again as "ws =" between (10)d and (10)e above). However, there are now, in virtue of this process, two such pairs in the output structure. Thus, what remains in the workspace subsequent to contraction is best understood in terms of the picture in (10)e, though (10)d captures the workspace/output structure correspondence. Thus, what is one in the workspace can be many in the output.

Observe that while identifying these open nodes in the structure could be understood to effect the equivalent of a lowering operation in terms of preserving the domination relations of the upper elements in the output, we clearly seem to require a way of blocking the similar copying of all such dominated elements. That is, why shouldn't this copying apply to the other dominated material (e.g., the dark nodes on the main path in (11)a, resulting in (11)b and then (11)c following contraction)?

(11) [Figure: stages a to c, showing the unwanted copying of all dominated material]

How is a regress of sorts avoided? What stops this process from copying all the domination relations of the upper instance of the two identified nodes (i.e., ? dominates ? dominates ? dominates ?,..ad infinitum)? How does this terminate?
I suggest that we regard the IDENTIFICATION and CONTRACTION steps pictured above in (10)c and (10)d as essentially a one-step operation governed by the general ordering restriction given in (8). Adding the "moved" element and its context to the bottom of the workspace structure in virtue of the identification of this lower node with the upper one adds new pairs to the dominance relationship in the workspace. Technically this will only be possible if elements in previous pairs in this dominance order that would introduce conflicts violating the antisymmetry of the dominance relation are removed from the workspace (though, importantly, preserved in the output). This is, essentially, the notion of contraction in the TCG framework.

Consider (11) again, with the nodes labeled so that we may refer to them in specifying the relevant formal ordering properties as in (12):

(12) [Figure: the structures of (11) with labeled nodes, in stages a and b]

Prior to the introduction of ?' (i.e. the step prior to (12)a) and the subsequent identification (?, ?'), we have the following dominance order D:

(13) D = [set of ordered pairs ⟨x, y⟩ of the dominance order; node labels not recoverable]

The introduction of ?' then adds the following pairs to D:

(14) D = [the pairs in (13) together with the new pairs involving ?'; node labels not recoverable]

Assuming D is generally a weak partial order (transitive, antisymmetric, and reflexive), if we identify ? and ?' then we have ordering conflicts even if we do not copy all the nodes ? dominates to the local domain of ?'; for example: ⟨?, ?⟩ and ⟨?, ?⟩, ⟨?, ?⟩ and ⟨?, ?⟩, and so on. If everything the upper "occurrence" of ? dominates is copied,[9] then we end up with the situation in (11)b/c and (12)b, and many more ordering conflicts would thus arise in virtue of creating domination symmetries for the elements {?, ?, ?} (so that we have both ⟨?, ?⟩ and ⟨?, ?⟩, etc.).

[9] I will be referring to the notion of copying as a convenience. The idea here is that there is an operation (node identification) which results in the equivalent of copying, but that there is no specific "duplicating" operation which takes a single element ? as an input and produces a pair of identical outputs.

How might the system respond to the possibility of creating such ordering conflicts? Nunes (1995, 1999, 2004) addresses a similar problem as it arises in his development of the copy theory of movement set within the context of Kayne's (1994) Linear Correspondence Axiom (LCA). For Nunes the problem is that when an element ? is copied and (re)merged in a c-commanding position, similar kinds of ordering conflicts emerge since ?, in addition to now c-commanding "itself", both c-commands and is c-commanded by all the intervening nodes along the movement path. A Kaynean view of structure/order correspondence requires there be no such conflicts in order to map hierarchy to precedence.[10] Nunes reconciles the conflicts between the linearization demands imposed by the LCA and the symmetric c-command relations in the structure resulting from movement as copying by positing a mechanism he calls Chain Reduction, stated as follows:[11]

(15) CHAIN REDUCTION: Delete the minimal number of constituents of a nontrivial chain CH that suffices for CH to be mapped to a linear order in accordance with the LCA

[10] See also Chomsky (1995) for some discussion of this point, where the deletion of copies to satisfy the LCA, conceived as a bare output condition on the PF side of the grammar, is proposed.

[11] Nunes additionally proposes a formal feature elimination procedure that is crucial to his analyses. I won't discuss this here.

A similar idea can be employed to fit with the idea of removing elements from the workspace (contraction/spell-out). The outcome we want is for ??? in (12)a to be reintroduced in the output structure so, for example, the elements {?, ?, ?} will all dominate ? in the output (this will be important for our treatment of certain connectivity effects; see Chapter 3). But we want these intervening {?, ?, ?} elements to be spliced-out of the workspace so as not to introduce ordering conflicts. Note that this requires some distinction between the workspace and the output to ensure that what is problematic with respect to ordering conflicts in the workspace is not problematic in the output. The nature of this particular difference relies on later developments, but I will offer a sketch below.

First consider what happens if we assume the following. Take the structure under discussion prior to the addition of the element ?' which will match ? under the matching relation, and let us prune away some of the notation to focus on the relevant elements and their ordering properties, as in (16):

(16) [Figure: the structure prior to the addition of ?']

Now we add ?':

(17) [Figure: the structure with ?' added at the bottom]

(?, ?') satisfies the matching relation, and the nodes are identified in the workspace. Since ? and ?' are now the same element, ? comes to be dominated by the intervening elements:

(18) [Figure: the structure after identification of ? and ?']

This creates no ordering conflict since ? was in no domination relation with the intervening elements prior to the identification. But the intervening elements do create ordering conflicts, and so the workspace must contract (splice-out the interveners) to respect the properties of the dominance order:

(19) [Figure: two workspace-equivalent ("ws =") objects, labeled WORKSPACE AND OUTPUT and JUST THE WORKSPACE][12]

Although we have not yet specified the nature of the matching relation, the mechanics of workspace contraction as just discussed follow from our workspace constraints in (8) and (9) (together with the matching relation).
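The identification-plus-contraction step just described can be approximated in a few lines. The sketch below is a simplification under several assumptions that are not part of the proposal itself: the main path is a flat top-down chain, the matching relation is modeled as plain label equality, and the labels `C`, `T`, `V` and the dependent `wh` are hypothetical placeholders for a context node, interveners, and a "to-be-moved" element.

```python
def expand(workspace, output, label, dependents=()):
    """Add a main-path node top-down; contract if its label is already present.

    workspace: list of (label, dependents) pairs, ordered top-down.
    output:    list of (label, dependents) pairs; it never shrinks.
    Returns the labels spliced out of the workspace (possibly none).
    """
    labels = [lab for lab, _ in workspace]
    if label not in labels:
        # No repeated element: plain expansion of workspace and output.
        workspace.append((label, tuple(dependents)))
        output.append((label, tuple(dependents)))
        return []
    # Repeated element: identify the new node with the matching higher one.
    i = labels.index(label)
    carried = workspace[i][1] + tuple(dependents)
    output.append((label, carried))        # lower occurrence in the output
    spliced = [lab for lab, _ in workspace[i + 1:]]
    del workspace[i + 1:]                  # interveners leave the workspace
    workspace[i] = (label, carried)        # a single node remains
    return spliced

workspace, output = [], []
expand(workspace, output, "C", dependents=("wh",))
expand(workspace, output, "T")
expand(workspace, output, "V")
print(expand(workspace, output, "C"))  # ['T', 'V']
print(workspace)  # [('C', ('wh',))]
print(output)     # [('C', ('wh',)), ('T', ()), ('V', ()), ('C', ('wh',))]
```

After the final step the workspace contains one context node with one dependent, while the output contains two occurrences of each, with the interveners preserved between them.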
The ordering constraint in particular ensures that we will be able to add the new domination relationships for ?, but: (i) we cannot add relations that cause ordering conflicts, and (ii) any elements that would create such problems subsequent to the identification (?, ?') must be spliced out. And the addition of the new domination relationships that effect the "lowering" of ? follows from the proposed response of the system to a potential violation of the distinctness condition.

[12] So the workspace has just one ? and one ?. The output, however, has two ?'s and two ?'s (or, rather, as I will suggest in a moment, the output has two correspondents of ? and two correspondents of ?).

What then of the output structure? Does it obey this (or any) ordering restriction or not? What about distinctness? If we make the standard assumption that the items being combined are minimally triples of semantic (?), phonological (?), and syntactic (F) information, then the following line of thinking is available to us, and will in fact be central to our conception of the WS/O-distinction: the workspace manipulates only F-properties. In fact, we can take this a step further: "being in the workspace" could be identified with "having F-properties".

The general idea is another way of framing the key intuition underlying the TCG approach. That is, categorial F-properties are a limited commodity in the syntactic workspace. A given manifestation of the workspace can contain exactly as many distinct elements as there are categorial distinctions in the system. There is no limitation of this sort "outside the workspace", because being outside the workspace just means that these formal distinctions are no longer connected with the semantic and phonological information. On the general view developed here, the output structure is a syntactic object in the sense of manifesting the formal ordering properties established in the workspace, but it will be populated with only semantic and phonological properties.
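The contraction mechanics described above can be made concrete with a small sketch. This is only an illustration under simplifying assumptions, not the formal system itself: categories are plain strings, two elements count as "like" whenever their labels are identical (the matching relation is not modeled), and the names `expand` and `derive` are hypothetical. It shows a workspace that never holds two tokens of one category, alongside an output that keeps every token.

```python
def expand(workspace, category):
    """Add `category` at the bottom of the top-down dominance order.

    Returns (new_workspace, spliced_out). Assumption: a like category is
    simply one with the same label. If one is already active, the two
    occurrences are identified and everything that intervened between
    them is spliced out of the workspace.
    """
    if category in workspace:
        i = workspace.index(category)
        return workspace[: i + 1], workspace[i + 1:]
    return workspace + [category], []


def derive(spine):
    """Run a whole spine top-down, keeping the output separate."""
    workspace, output, contractions = [], [], []
    for category in spine:
        output.append(category)  # the output retains every token
        workspace, spliced = expand(workspace, category)
        if spliced:
            contractions.append((category, spliced))
    return workspace, output, contractions


# Abstract domination order A -> B -> C -> A -> B
workspace, output, contractions = derive(["A", "B", "C", "A", "B"])
print(workspace)      # active categories, no repeats
print(output)         # all five tokens survive in the output
print(contractions)   # which interveners were spliced out, and when
```

On this toy run the second A is identified with the first, B and C leave the workspace, and the incoming B can then reuse the freed category. The output list is untouched by contraction, which is the workspace/output asymmetry the text relies on.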
The output will in this sense be an object of the interface systems, with the PF-component inspecting only the phonological vocabulary and the LF-component inspecting only the semantic vocabulary, but with both sets of vocabulary constrained by the same structure.[13] The following illustrates the idea abstractly. Our general intuition of the workspace having to "move on" to expand new structure is pictured first for an abstract domination order of elements:

(20) [diagram: three successive snapshots of the domination order A → B → C → A → ..., the last extended to A → B → C → A → B]

[13] Having structure housing both the semantic and phonological types of vocabulary also yields a venue for exploring primitive correspondences between the two over such structures. For example, the well-known connections between prosody/intonation and the semantics of focus would be one such area to explore with these mechanics. These matters are not explored here. In general we will be concentrating mostly on what happens in the workspace, and how this might relate to the output. However, some brief remarks will be made about how we might think about relationships established over output structures; these are suggested to be potentially truly non-local (examples include variable binding by a quantifier, long-distance obviation effects, so-called unselective binding, etc.).

Supposing then that {A, B, C} are the relevant formal properties, as the workspace moves on what will be left in the output structure are the associated semantic and phonological properties of each formal element {A, B, C}:

(21) [diagram: the same snapshots, with the earlier A and then B elements replaced in turn by their associated semantic/phonological pairs]

Now when we say that the workspace "moves on", we understand this to mean that the relevant formal properties to which the semantic/phonological pairs are connected must be "reused" in establishing new expansions of structure in the workspace.
That is, if the anti-recursion/distinctness condition in (9) holds, this means that such formal/syntactic information must be stripped away from earlier introduced elements so that it can be used to structuralize new/incoming ones. We can now illustrate the situation described metaphorically above, where an unlicensed F-property is abandoned from the workspace:

(22) [diagram: A → B → C → A → ..., followed by a snapshot in which the abandoned A element carries an unlicensed formal property, marked *F, alongside its semantic and phonological properties]

Assuming that there is no "backtracking" of the workspace, this will produce an anomaly as the interface systems are confronted with an illegible element. Here we have marked this offending property as "*F:?", though note that above we suggested that an element's "being in the workspace" be identified with "having syntactic/formal properties". Below I will be suggesting specific roles for licensing properties like WH, Case/agreement, and θ, so the way this will actually be understood will be in terms of a failure of a formal relation obtaining in the workspace leading to an illegible PF or LF property (see §1.2 below, and Chapter 3 for some discussion of features, valuation, and interface legibility).

Regarding our concerns about ordering and repeated elements in the output: this is now best viewed as a constitutive difference between the workspace and the output structure. The distinction resides in exactly whether it is possible to represent multiple tokens of a given type or not. In the output, this is possible. In the workspace, it is not. The systems supporting the representation/processing of PF and LF vocabularies, that is, are capable of handling multiple tokens of a given type; the narrow syntactic computation in the workspace, which is stated over formal features/properties, cannot do this. This is one of the central ideas underlying the TCG approach.

An important idea here, discussed in §1.5, is that of thinking of movement chains as sets of contexts/positions, though I will argue that we require a simpler view than the one presented in Chomsky (1995).
There it is suggested that we view contexts as the entire structure derived up to the point where a moved/remerged item is (re)integrated. I argue that returning to a simpler view, where the context is simply the local label and not the entire structure, allows us to view certain sets of contexts as indistinguishable, yielding SCM. In the next section I develop some sample derivations for core cases of A- and A'-movement to get some technical ideas on the table.

1.2. Local & Linked Local Relations: Sample TCG Derivations

Now let us consider a pair of standard cases for which SCM analyses have been deployed, in particular wh-movement and raising-to-subject (RtS). First, some simplifications regarding structure and category will be helpful; I will return to discuss these simplifications further in §1.5. Consider the following case with multiple clausal embedding in (23)a, with the partial description in (23)b:

(23) a. Dave thought Mary believed John liked pizza
     b. [CP C⁰ [TP DP [T' T⁰ [VP V⁰ [CP C⁰ [TP DP [T' T⁰ [VP V⁰ [CP C⁰ [TP DP [VP V⁰ DP]...]

If we look at the "spine" of the clause as structured in (23)b (that is, the dominance ordering running from the root to the most embedded element that manifests the sequence of head-complement selection/projection relationships[14]), we see the following sequence of major categorial distinctions between types of elements as in (24)b (ignoring intra-phrasal projection level distinctions, thus collapsing any XP/X' to just X):

(24) a. [CP C⁰ [TP DP [T' T⁰ [VP V⁰ [CP C⁰ [TP DP [T' T⁰ [VP V⁰ [CP C⁰ [TP DP [VP V⁰ DP]...]
     b. C→T→V→C→T→V→C→T→V→D

This spine branches to include the external arguments in the specifier positions of the T-elements associated with each verb, which we add to this reduced diagram as follows (the branching, directional arc is superimposed here to clearly indicate the assumed dominance ordering):

(25) C→T→V→C→T→V→C→T→V→D, with a D branching from each of the three T elements

[14] On some views, the relation from functional-to-functional elements and functional-to-lexical elements is discussed in terms of selection (e.g., C⁰ selects TP, T⁰ selects VP, etc.), perhaps with a distinction made between "syntactic" and "semantic" selection (see, e.g., Abney 1987, 1991). On other views (Grimshaw 1991, 2002; van Riemsdijk 1991, 1998) functional-to-functional and functional-to-lexical relations are governed by the notion of (extended) projection, while "selection" is reserved for lexical-to-functional and lexical-to-lexical relations.

These reduced structures will be sufficient to make the points of interest here. Later on I will argue that this should be seen not as mere expository convenience but as a view of structure that makes available the "best fit" with our core constraints on the workspace (in (8)/(9)).[15] Now consider the following:

(26) a. John seems to tend to appear to like carrots
     b. What did Dave think that Mary believed that John liked?

(27) a. John [seems [to tend [to appear [ _ to like carrots]]]]
     b. What [did Dave think [that Mary believed [that John liked _ ]]]?

(28) a. John [seems [ _ to tend [ _ to appear [ _ to like carrots]]]]
     b. What [did Dave think [ _ [that Mary believed [ _ [that John liked _ ]]]]]?

There is something approaching a consensus in the literature that the examples in (26)a/b (raising to subject/RtS and wh-movement) are best viewed as involving linked local relationships of the sort pictured in (28)a/b, and not a direct "one-fell-swoop" relation as in (27)a/b.
This is not entirely uncontroversial, though I will canvass a range of facts in Chapter 2 that are drawn from a variety of languages and which, taken together, strongly suggest that something like these so-called successive cyclic movements (SCMs) are real. The TCG vision of these linked local dependencies can be schematized as in (29) for the RtS case (I will return to the wh-movement case below). (29)a gives a bird's-eye view of the entire derivation, with the first stage building the matrix clause structure as in (29)b.

[15] I will be deploying reduced structures in this "horizontal" notation throughout this work. Structures of this reduced type are essentially those argued for in recent work of Brody (1999, 2003), and can be seen as related to more general efforts to downsize the array of label-types that analysis can appeal to. Collins (2001) is another such approach, but one which is incompatible with the central intuitions I will be developing regarding successive cyclicity (and "movement" generally). I will return to these issues below in §1.5.

(29) a. C→T→V→T→V→T→V→T→v→V→D, with a D branching from each of the four T elements (the verbal positions correspond to: seems, to tend, to appear, to like carrots)
     b. C→T[φ:∅, κ:n]→V, with D[φ:f, κ:∅] branching from T

Assume that T and D both enter the derivation with Case (κ) and agreement (φ) properties. T-φ is unvalued, requiring a relationship with another element with valued φ (D-φ); assume the reverse holds for κ-properties (T-κ is valued; take κ to range over {∅, n, a, d}, for "unvalued", nominative, accusative, and dative/oblique, respectively; similarly, take φ to range over {∅, f, g, h}, where f, g, etc. are stand-ins for more complex attributes and values like φ:NUM:plural, etc.).[16] I assume that D and T enter into a reciprocal valuation, essentially swapping values: T retains φ and deletes κ, while D retains both valued properties, as follows (here and throughout, I will mark alterations of feature properties such as valuation and deletion with transitions like "φ:∅→φ:f" = "unvalued feature φ gets value 'f'", or "κ:n→∅" = "valued feature κ:n deletes", as in (29)c):

(29) c. C→T[φ:∅→φ:f, κ:n→∅]→V, with D[φ:f, κ:∅→κ:n] branching from T
     d. C→T[φ:f]→V, with D[φ:f, κ:n] branching from T

[16] This particular formulation of feature relations follows earlier proposals (Castillo, Drury, & Grohmann 1999; Grohmann, Drury, & Castillo 2000; Drury 2000).

At the next step of derivation a "like element" (T) is introduced. I am assuming that raising predicates (i) do not include a specification for an external argument (i.e., no small-v, though one is present to introduce the external argument of the most embedded clause; see (29)a above), and (ii) take defective T-complements (in roughly the sense of Chomsky 1999). Thus, the second T could be viewed as distinct from the first, since they differ in properties (the first/higher T has a φ-property that the second/lower T lacks):

(29) e. C→T[φ:f]→V→T, with D[φ:f, κ:n] branching from the higher T

However, this is exactly the context in which we want the "reverse" of raising to occur. Suppose then that we assume the following as a first pass on our so-far unspecified matching relation ? from above:

(30) MATCHING RELATION ?: Two elements α and β stand in the matching relation if:
     (i) either α dominates β or β dominates α, and
     (ii) either α subsumes β or β subsumes α.

The condition makes reference to the notion of SUBSUMPTION, common in unification-based approaches to grammar deploying feature structures, specifically (from Shieber 1986:15):

(31) SUBSUMPTION: A feature structure D subsumes a feature structure D' (D ⊑ D') if D contains a subset of the information in D'.

For example, given a node labeled X and a node labeled X[F], the former will subsume the latter since X contains a subset of the information in X[F]. Subsumption is thus the "more general than" relation.

What we have above can be seen as a generalization of Chomsky's (1999) notion of AGREE, though (i) it introduces the possibility of both upwards and downwards valuation on the dominance order, and (ii) it extends the relationship to hold amongst categories.
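To make the subsumption-based matching condition concrete, here is a minimal sketch under stated assumptions: feature structures are flat Python dicts (Shieber's feature structures may be nested), dominance is reduced to position in a top-down list so that any two distinct positions stand in a dominance relation, and the key names `cat` and `agr` as well as the function names are hypothetical labels chosen for illustration.

```python
def subsumes(general, specific):
    """True if `general` contains a subset of the information in `specific`."""
    return all(k in specific and specific[k] == v for k, v in general.items())


def matches(order, i, j):
    """Toy matching relation over a top-down dominance order.

    `order` is a list of flat feature structures; earlier positions
    dominate later ones, so the dominance clause reduces to i != j.
    The second clause asks for subsumption in either direction.
    """
    if i == j:
        return False
    a, b = order[i], order[j]
    return subsumes(a, b) or subsumes(b, a)


# The X / X[F] example: X is more general than X[F], not vice versa.
x = {"cat": "X"}
x_f = {"cat": "X", "F": True}
print(subsumes(x, x_f), subsumes(x_f, x))  # True False

# A raising spine: the lower, defective T lacks the agreement value
# that the higher T carries, so the lower T subsumes the higher one.
spine = [{"cat": "C"}, {"cat": "T", "agr": "f"}, {"cat": "V"}, {"cat": "T"}]
print(matches(spine, 1, 3))  # True: the two T nodes can be identified
print(matches(spine, 1, 2))  # False: T and V differ in category
```

Because the check is symmetric in its two arguments, it leaves open in which direction values flow once two nodes match, which is the respect in which the relation is broader than a probe-goal asymmetry.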
On Chomsky's view, in contrast, such relationships are asymmetrical, with unvalued elements ("probes") scanning their subordinate (c-command) domain for matching elements that can provide them with values ("goals").[17] Thus, asymmetry in valuation is taken to track asymmetry in formal ordering (e.g., goals can't typically value probes they c-command). We will see later on some potential troubles with this statement of matching; in particular, when applied to individual features it causes locality problems even for fairly simple examples (e.g., allowing valuation to go in either general direction on the dominance ordering will be seen to make it difficult to understand how "he saw her" can't mean "she saw him"; see §3.1 for discussion). For now, however, this way of thinking will allow us to give a sketch of how things work.

Taking the matching relation to involve categories, and not just features of them, might be taken to require some further comment. However, if there turns out to not be a good reason to have a fundamental division between categories and features, then this follows as a reasonably natural generalization of Chomsky's conception of information flow and dependency-formation. What we will see rather is that features may be divided into classes: some serve to relate elements internal to a domain (e.g., φ-features), and potentially across such domains, while other features/properties (e.g., categorial distinctions like C, T, etc.) serve to separate/distinguish elements within domains and across domains. I will return to unpack these ideas more explicitly below. The key idea to keep in mind is a feature-based view of domain boundaries: some properties are responsible for holding things together within domains, and others are responsible for keeping domains apart (or, as in SCM-type relations, allowing limited overlaps between domains).

[17] However, a similar kind of φ/κ reciprocity as we have deployed here is present in a different form in Chomsky's formulation (roughly: his idea is that κ gets valued as a reflex of Agree with a φ-complete element; I won't comment further on this). However, on Chomsky's view the subject in a RtS construction is not directly associated with its surface/PF-output position. Rather, the traditional line of having this element originate in its θ-position is assumed. This, I believe, holds onto a residue of D-structure. Though not coded in terms of a level of representation characterizing potentially unbounded objects (an infinite base component), it is nonetheless retained in the notion that items must enter the derivation through a θ-position. I see no minimalist motivation for this restriction, which is part of the motivation for exploring an alternative route regarding derivational directionality. However, we will see that the alternative top-down conception is actually demanded by the general view of contraction as applied to SCM phenomena.

The general move that is being entertained here is to wed this generalization of AGREE with a version of Chomsky's (1995) discussion of CHAINS formalized as sets of context positions for an element. I will return to elaborate on this point below, but note that what is being proposed here is essentially a "context" identification view of SCM (see §1.5). Before turning to these and other related matters in detail, let us first complete the example derivation for RtS, and then take a look at one for wh-movement.

Returning to the derivation in (29): since the matching relation holds of the pair (T, T[φ:f]), these nodes are identified. Following the discussion above regarding identification and contraction and maintaining a coherent ordering in the workspace, this results in the following, with the raising predicate itself (V) "splicing out" of the workspace, and D[φ:f, κ:n] being "reintroduced" at the bottom of the dominance order:

(29) f. C→T[φ:f]→V→T[φ:f], with D[φ:f, κ:n] branching from both T positions (shown in two panels: before contraction on the left, and after contraction, with V spliced out of the workspace, on the right)

And recall that this contracted workspace on the right-hand side here is really just:

(29) f'. C→T[φ:f], with D[φ:f, κ:n] branching from T

The addition of the further raising predicates for the derivation of (29) goes exactly the same way, until the most embedded domain is reached. Prior to the introduction of the embedded v-element hosting the external θ, we would again have a workspace like that in (29)f/f'. Introduction of v, I assume, brings with it a θ-feature:[18]

(29) g. C→T[φ:f]→v[θ], with D[φ:f, κ:n] branching from T

[18] Where the enclosure representing the workspace boundaries is not relevant I will simply leave it out.

θ-features, I will assume, correspond/connect to thematic predicates in a neo-Davidsonian sense (see, e.g., Parsons 1990, Schein 1993, Herberger 2000), relating a participant variable to an eventuality/situation. I suggest that the participant variables of such thematic predicates are inherently non-distinct and require valuation by φ/κ properties in order to be rendered locally distinct (not having these local properties around can result in the identification of such variables, as I will suggest is relevant for control and local anaphora, for example).

In the present derivation the step in (29)g involves a local A-relation, to which the superficially non-local relation to the matrix has been reduced via successive contractions of the workspace, in effect "carrying along" the matrix T φ/κ properties. It is this general property of these derivations which will allow us to dispense with reference to so-called "EPP-features" or their like (see Chapters 2 and 3). Intermediate specifier positions can exist, on this view, because (i) matrix ones exist, and (ii) intermediate positions involve an informational superset (more-general-than) relation with the corresponding matrix positions.

I assume that θ in (29)g takes the value of the dominating agreement feature (here φ:f) as in "θ[φ:f]".
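The feature bookkeeping in the raising derivation can be traced with a short sketch. It is a toy rendering under explicit assumptions rather than the dissertation's formalism: agreement and Case are stored under the hypothetical keys `agr` and `case`, `None` stands for "unvalued", deletion is modeled by dropping a key, and both function names are invented for illustration.

```python
def reciprocal_valuation(t, d):
    """Swap values between T and D in a local A-relation.

    Assumes T enters with unvalued agreement and valued Case, and D with
    valued agreement and unvalued Case. T ends up with valued agreement
    and no Case (deleted); D keeps both properties, now valued.
    """
    if t["agr"] is not None or t["case"] is None:
        raise ValueError("T must have unvalued agr and valued case")
    if d["agr"] is None or d["case"] is not None:
        raise ValueError("D must have valued agr and unvalued case")
    new_t = {"agr": d["agr"]}
    new_d = {"agr": d["agr"], "case": t["case"]}
    return new_t, new_d


def index_theta(t):
    """The theta-feature on v takes the dominating agreement value as an index."""
    if t.get("agr") is None:
        raise ValueError("agreement must be valued before it can index theta")
    return {"theta": {"agr": t["agr"]}}


t = {"agr": None, "case": "n"}   # matrix T: agreement unvalued, nominative
d = {"agr": "f", "case": None}   # D: agreement valued, Case unvalued

t, d = reciprocal_valuation(t, d)
print(t)               # {'agr': 'f'}
print(d)               # {'agr': 'f', 'case': 'n'}
print(index_theta(t))  # {'theta': {'agr': 'f'}}
```

The sketch models only the local steps. What it leaves out is exactly what contraction supplies: the valued T is carried down through each identification, so the final indexing step is local even though the nominal was introduced in the matrix domain.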
The suggestion is that θ-role assignment to the D-element is indirect, essentially importing a notion very similar in spirit to Williams's (1994:33) notion of "vertical binding".[19] The outcome of this valuation then is as follows:

(29) h. C→T[φ:f]→v[θ[φ:f]], with D[φ:f, κ:n] branching from T

In general, I will be understanding A-relations and thematic discharge in this way: φ/κ exchange between T and D results in Case-marking of D and valuation of T-φ. The φ-property then associates with θ, which essentially takes this value as an index marking the participant variable (that is, ...θv⟨_, e⟩... → ...θv⟨φ:f, e⟩...). I will return to elaborate on this point of view.

[19] See also Williams (1983). A number of recent proposals of this kind have been entertained in the literature as implemented within a feature system alongside an adoption of an Agree-type relationship of Chomsky's (1999) sort. See Rizac (2004), Adger & Ramchand (2001), Butler (2004b), Koneneman & Neeleman (2003). Very similar notions have had a long tradition in frameworks that work exclusively with feature logics (e.g., HPSG; see Shieber 1985; Pollard & Sag 1994). Williams's proposal technically involves an indexing procedure connecting thematic roles with dominating projections, with predication then occurring between maximal projections as a matter of index sharing, thereby resulting in a connection to the lower (coindexed) θ-role. See Castillo, Drury, & Grohmann (1999) for some earlier discussion of such feature relations and the notion of VP-internal subjects; and see also Drury (2000).

A'-relations will be viewed somewhat differently. However, the fundamental notion of contraction and node/context identification will be understood to work in the same way for (e.g.) SCM involving wh-elements. Whereas the identifications for A-movement involved the T-domain, the relevant relations in A'-movement will be between C-elements. Before turning to illustrate SCM of wh-elements, let us consider the local case of wh-movement:

(32) Who _ likes pizza?
We will illustrate a derivation for (32) down to v (remember: "top-down") to show how WH, φ/κ, and θ information will be understood to relate. As A-relations serve to establish a set of feature-licensing relations resulting in an indirect view of θ-discharge, A'-relationships similarly provide a set of relations resulting in indirect θ-assignment. Assume that wh-elements come with a WH-property which (i) takes κ-features as values, and (ii) matches and deletes WH on D. Assume C has unvalued φ as well, so we have the following in (33):

(33) a. C[φ:∅, WH]→T[φ:∅, κ:n]→v[θ], with D[φ:f, WH, κ:∅] branching from C
     b. C[φ:∅→φ:f, WH]→T[φ:∅, κ:n]→v[θ], with D[φ:f, WH→∅, κ:∅] branching from C

The now valued φ-property of C serves to value T-φ, and WH takes the κ-property of T as a value (WH[κ:n]). In virtue of these relations D-κ can now be valued by C:

(33) c. C[φ:f, WH→WH[κ:n]]→T[φ:∅→φ:f, κ:n]→v[θ], with D[φ:f, κ:∅→κ:n] branching from C

Thus, the presence of the WH-property serves to mediate the φ/κ swap of values. Like the A-relation case, there is some back-and-forth directionality to the flow of information in these feature-relationships. In A-relations, recall from above, we saw φ- and κ-valuation going in opposite "directions". Consider (29)c/d again:

(29) c. C→T[φ:∅→φ:f, κ:n→∅]→V, with D[φ:f, κ:∅→κ:n] branching from T
     d. C→T[φ:f]→V, with D[φ:f, κ:n] branching from T

In the wh-movement case (A'-relation) the same holds, though the WH-property is alleged to be implicated in a mediating role (φ goes from D to C to T; κ goes from T to C to D):

(33) c. C[φ:f, WH→WH[κ:n]]→T[φ:∅→φ:f, κ:n]→v[θ], with D[φ:f, κ:∅→κ:n] branching from C

Now, as with the basic A-relation case discussed above, the φ-property of T will index the thematic position:

(33) d. C[φ:f, WH[κ:n]]→T[φ:f, κ:n]→v[θ→θ[φ:f]], with D[φ:f, κ:n] branching from C

The assumption here then is that indexing the participant variable of the θ-role with φ is to close it off (saturate it); the θ-property can only become connected with this φ-property if it has been valued in a way that has resulted in the assignment of Case. This happens in two possible ways now: (i) as in the A-relation, where φ is connected to an overt nominal marked κ, and so the θ-variable will be connected with the semantic properties of that element, or (ii) it is connected with a "bound κ", associated with the upper WH property; that is, connected with an "individuator" in our terms. There is a version of a traditional view being implemented here. In GB-era terms (e.g., Chomsky 1981) we have the ideas that "κ-marked traces" are "variables" and that in general φ/κ are intimately connected with θ-theory. I will return to these matters in further discussion in Chapter 3.

Let us consider now the picture we have of local A- and A'-relations side by side:

(34) A'-RELATION: C[φ:f, WH[κ:n]]→T[φ:f, κ:n]→v[θ[φ:f]], with D[φ:f, κ:n]
     A-RELATION: C→T[φ:f]→v[θ[φ:f]], with D[φ:f, κ:n]

In the A-relation, we have φ-features which form the connection between elements,[20] and in the A'-relation there is a mix. That is:

(35) [diagram: the structures in (34) repeated, with the connections between elements marked]

This is one reasonable way of specifying the flow of feature-licensing information in local domains. φ-information flows from items that are specified to those that are not, filling in the values along the path; and κ-information does the same, though in the opposite direction on the path. The suggestion is that once we have a valuation mechanism of the AGREE-sort that has recently been appealed to in elaborating minimalist syntactic theories (Chomsky 1999), then it seems there are some fairly straightforward ways to make it do most, if not all, of the work.

[20] I am ignoring here any φ-properties that may be associated with C in the A-relation example. We might assume that C-φ can manifest an open clause, as with relatives, if we attribute a non-WH operator property to C. Later I will explore the idea that φ on non-finite C (without an operator property) is what allows indexing of non-κ-marked θ (control).

We need a mechanism to "build", for example, extended projection sequences in the verbal domain.
On a traditional movement story, we interleave the building of such sequences via merge operations with movement/remerge relationships involving nominal expressions as each of the relevant levels of structure is constructed. So, a θ-assigning element relates to a nominal, discharging its role to that nominal; further operations add higher projections specifying other licensing properties (Case/agreement), which we then need to relate to the nominal element as well (so we have an A-movement); further categories/features are added above that, which may provide yet further licensing properties, and so we relate the nominal expression yet again to the next highest layer (so we might have an A'-relation). But we can view a local A'-relation complex in at least the following two different ways: (i) D_wh-V, D_wh-T, and D_wh-C, or (ii) C-T-V + D_wh. Below (§1.5.1) I will review Chomsky's (1995) discussion of chains as sets of contexts, and suggest that coupling a simplified version of that with an AGREE-type mechanism yields the following picture:

(36) a. In local sequences of categories (which will, in accordance with the anti-recursion provision in (9), not include repeated "like elements"), like features co-value, and
     b. Encountering like categories results in a similar "co-valuation" (like elements are identified, though, as with feature-valuation generally, only so long as one subsumes the other).

Intuitively, categorial differences in local domains prevent collapse of nodes: unlike elements "repel" one another. But this does not stop like features from identifying with each other within such local domains (likes "attract" one another).[21] Across local domains, we have the possibility of interactions because the edges of such domains can become identified by keeping this same attract/repel logic in place (a like category is introduced, and this can result in identifications which allow a kind of domain expansion and, as we saw above, a kind of copying/lowering).
Note also that the feature-relationships in our schematic A'-relation build in a useful property. Consider how a standard "copying" view of SCM of wh-elements works:

(37) a. Which picture of himself did John think Bill liked _ ?
     b. [which...self] did John think [which...self] Bill liked [which...self]

It has been noted that on the copy view of such movements there must be some operation which ensures that the actual wh-operator does not appear in all the lower copies. As Safir (1999: 591) points out, quantifiers cannot bind other quantifiers, and somehow the lower copies must be understood as variable-like elements. Accordingly, one or another variant of the following sort of operation is typically taken to be in effect (Munn 1998 calls this "Make OP"; this particular illustration is taken from Safir's discussion):[22]

(38) a. Whose mother did Bill see _ ?
     b. whose [whose mother] did Bill see [whose mother]    STEP 1: "lift" the operator out
     c. whose [x mother] did Bill see [x mother]    STEP 2: make variable/delete-WH

[21] See van Riemsdijk (1998) for the working out of an intuition which similarly makes use of attract/repel, but in a rather different way from what I am entertaining here (his "Categorial Feature Magnetism").

[22] See Chomsky (1995:203), Munn (1994:39), Fox (1999, 2002), and Safir's (1999:591) discussions. See also Fox (2003) and the notion of Trace Conversion.

This operation is built into the WH-licensing discussed above (see (33)a/b). The implementation is in terms of matching D and C WH-properties, with deletion of this property from the D-element, but the effect is the equivalent of D projecting its WH-properties to C (there are several ways that we could implement the idea; the one given above is simply one such way). If we keep to our assumptions above, including the assumption of a top-down derivation, then we can understand the "operator" properties to be housed in C, leaving the D-wh phrase itself with a "hole" indexed by its κ-property.
The result then is that the local structure provided above (repeated here) will have a logical form of the sort in (39)b:

(39) a. C[φ:f, WH[κ:n]]→T[φ:f, κ:n]→v[θ[κ:n]], with D[φ:f, κ:n]
     b. WH[κ:n] ...[κ:n]... θv⟨κ:n, e⟩
        (i.e., ...wh(x) ... [...x...] ... [...Px...]; whose x ... [...x mother...] ... [...Px...], as in (38)c)

Now the top-down structure of this story makes it possible to understand the equivalent of a Make-OP sort of operation as happening on the first step (when D and C are integrated). However, note that on longer distance wh-movement (the above example is local association with subject φ/κ-θ) the WH won't have a κ-value until it encounters a lower valued κ. According to the logic of category identification and lowering sketched above, we might take this to result in an operator being successively lowered to each new domain edge, along with the residue of the wh-element itself (e.g., roughly [x NP] = "D ?:?" here):

(40) a. C[WH[κ:∅]]→...→C[WH[κ:∅]]→...→C[WH[κ:∅]]→..., with D ?:? branching from each C

This gives us part of what we may want for SCM, which is variable-like elements in all the intermediate positions, but it also gives us something that we don't want, namely the wh-operator at all the intermediate positions as well (i.e., recall the point from above that quantifiers don't bind other quantifiers). Below I will suggest a way, relating to some ideas proposed by Uriagereka (1999) and from previous work of my own (Drury 1998), which appears to have the right properties to naturally yield the result that we do want, which looks more like this (in terms of what we want in the output structure):

(41) C[WH[κ:∅]]→...→C→...→C→..., with D ?:? branching from each C

I return to this briefly below in discussing MSO-systems (§1.4.1), and again when we turn to analysis in Chapter 3.

1.3. Summary So Far, ... & The Path Ahead

We have so far introduced a few key ideas. Let's sum up before proceeding. We have posited a workspace/output-distinction (henceforth: WS/O-distinction).
In the course of elaborating on the key intuition we have suggested that the distinction be understood as a dividing line between the systems that manipulate elements by handling their syntactic properties only, versus those that handle the semantic or the phonological properties. Moreover, some of the formal/syntactic properties (e.g., WH, φ/κ, and θ) have been understood to play a direct role in mapping out logical form distinctions. One way to look at this claim is to view the "workspace" as we have sketched it so far as "the interface" between the sound-meaning systems and whatever system(s) are responsible for the general ordering properties of lexical/functional extended projection sequences. The suggestion above was that the workspace is a dynamic interface between "the lexicon" and the PF/LF systems.

Prior to this, we outlined some consequences of workspace restrictions stated over ordering and category distinctness, and showed how the combination of these ideas yields a schema for analysis that seems to provide for a novel view of successive cyclic type movement. The mechanics were suggested to follow from a natural generalization of an AGREE-type operation/relation of the sort studied in Chomsky (1999), broadened to include categorial identifications in a way that allows cross-domain interactions in virtue of a "carrying over" of higher properties of elements into lower domains (via node-identification under subsumption). Some specific assumptions for A- and A'-relationships were sketched, providing a general (though reasonably detailed) outline of the approach to be further constrained and deployed below (Chapter 3).

The availability of the type of analyses relevant to SCM phenomena will be argued here to be extremely interesting in the minimalist setting.
As we will see, the approach makes available a route for analysis which does away with any appeal to (what I argue are) spurious movement-driving features that have been taken to underwrite SCM in much current minimalist work (so-called EPP/P-features, which I will generally refer to as Move/Merge-features or "M-features").

There are, however, a number of component ideas in play here that require some further background discussion before proceeding. For example: (i) the motivations for the reduced phrase-structure graphs deployed above, (ii) the idea of relegating all "movement" relationships to one or another type of category/feature relationship on the dominance path, and how this could be connected with other existing lines of thought regarding movement/chains, (iii) the generality of the WS/O-distinction, (iv) the conceptual connections to other proposed MSO-type systems as well as to TAG approaches.

Additionally, one particular consequence of the TCG view, mentioned briefly above, is worth bringing up again here before heading into more detailed discussion. The general structure of the account of SCM effects demands that syntactic derivations be viewed as assembling structure roughly top-down, instead of bottom-up as assumed in Chomsky's widely adopted Bare Phrase Structure (BPS; Chomsky 1994, 1995). This move (inverting the direction of derivation) on its own teaches us nothing about successive cyclicity. [23] However, coupled with the right alternative views regarding structure and categories/features and how they might be generated by a derivational system, directionality can be seen to play a crucial role. This somewhat unorthodox outcome converges with the results of a number of other recent investigations which have reached similar conclusions regarding directionality and syntactic derivations (see in particular Phillips 1996, 2003). [24]

[23] Contra a discussion in Terada (1999), who suggests that successive cyclicity effects can be better understood within the incremental/left-to-right view of derivations proposed in Phillips (1996). While I agree that derivational directionality may be important, nothing in Terada's discussion establishes this conclusion. In Terada's proposal intermediate movements are stipulated to be driven by features, as in many other minimalist approaches (positing what I call Merge/Move-features or M-features; see Chapter Two below). Terada appears to think that having the ultimate licensing feature (e.g., +wh for question-formation) checked "first" helps in some way with "look-ahead" issues. But the logic relies on a spurious division between the 'top-most' licensing properties and lower ones (like Case/? and ?-properties). The "look-ahead" problem is symmetric. In a bottom-up approach Case/? and ? are locally licensed, but (e.g.) a wh-element must somehow eventually reach its corresponding licensing context, so where there is multiple embedding there is a look-ahead issue (the wh-element needs to "know" that the right licensing property will eventually show up). But the same goes on a top-down (or left-to-right/incremental) view, just the other way around (a wh-element needs to "know" that Case/? and ? information will eventually show up, though its wh-property may be licensed immediately upon entering the derivation). The mystery/problem/puzzle of SCM effects is rather about why there are ever any movement operations other than those which would connect these basic (wh, Case/?, ?) properties. Why are there intermediate movements (chain-links/traces)? Positing M-features (e.g., EPP/P-features in Chomsky's parlance) to drive intermediate movements (as Terada does) is not a solution; that is the problem. I see nothing in Terada's discussion that adds to our understanding of why derivations ought to have a non-standard "direction" nor why successive cyclic movement ought to exist. On the present approach, in contrast, direction of derivation is demanded in order for things to work. I thank Cedric Boeckx for pointing me to Terada's paper.

[24] Other work along this same line includes previous work of my own, Drury (1997, 1998, 1999a,b), and a number of others including Boeckx (1999), Guimarães (1999, 2004), Richards (1999, 2001), Terada (1999).

In addition to this difference with respect to standard BPS, the TCG approach differs as well from the structure of TAG derivations, which are taken to obey a Markovian condition insisting that it be locally determinable whether a given pairwise combination of tree-structures is licit or not. One effect of this condition in TAG is an ordering freedom which, for cases beyond pairwise combination of trees, allows the possibility of a many-to-one mapping of derivation structures to derived structures. [25]

[25] That is, as in some other approaches (like classical Categorial Grammar (CG) or Steedman's Combinatory Categorial Grammar (CCG)), TAG derivations manifest a kind of so-called "spurious ambiguity". This label is somewhat of a misnomer both in CG and in TAG, as both approaches have suggested that the relevant derivational ordering alternatives are not in fact "spuriously" ambiguous but rather do make linguistically significant distinctions. See Frank (2002) for discussion.

I think a large part of the interest in the mechanics of the TCG system is that it has this fairly abstract general requirement regarding derivational directionality. What I am unsure about at present is what the ultimate significance of these ordering differences might be for the study of grammar qua "system of human knowledge" (i.e., as properties of a competence-level theory). There are, however, some obvious points of interest to be made with respect to connecting grammatical theory and parsing (and perhaps production). The treatment of linked-local relationships here in effect introduces a way that displaced constituents can be in a sense buffered as structure is expanded and then re-accessed as lower domains are constructed. The structure of the TCG account thus does not require the explicit add-on of a memory stack or related storage devices that have been appealed to in the past in discussions of filler-gap dependencies in the context of parsing theories. The functional equivalent of such a device is, as we saw in the sketch offered above, an essential component of the basic mechanics. I will not be concerned here with these issues, though they are worth keeping in mind in the background.

In the next section I back up to consider the general structure of some proposed MSO-systems and TAG, looking at the structure of these approaches in terms of our WS/O-distinction. Following this I turn to some technical discussion further motivating some of the component ideas of the implementation of TCG pursued here.

1.4. MSO-Systems, TAG, & Generalizing the WS/O-Distinction

MSO-systems as they have emerged within the MP are, generally speaking, derivationally oriented models which parcel structure assembly into principled stages in virtue of applications of an operation called Spell Out (SO). Depending on assumptions varying across implementations, different sorts of MSO-systems effect different partitions of syntactic complexes (or stages of derivation) into local domains or chunks. Common across implementations is the general idea of SO as an operation that is periodically applied in the course of derivation, resulting in a reduction or contraction of structural descriptions by shunting or transferring portions of structure to neighboring systems with which the syntax must interface. In this manner, evaluation of certain aspects of the well-formedness of syntactic complexes is divided such that sub-parts of structure are independently inspected by the principles governing the interfaces.

The general idea of multiple spell out has a number of antecedents in earlier literature. [26] Within the context of the development of the Minimalist Program (MP) it arose in consideration of the architecture proposed in Chomsky (1993), which contained a weak residue of Government & Binding (GB) theory's level of S-Structure. Rather than a full-fledged level of representation, this S-structure hold-over was simply taken to be a "point" of derivation, as discussed in Chomsky (1995:229):

at some point in the (uniform) computation to LF, there is an operation Spell-Out that applies to the structure Σ already formed. Spell-Out strips away from Σ those elements relevant only to π, [emphasis mine-JD] leaving the residue Σ_L, which is mapped to λ by operations of the kind used to form Σ. Σ itself is then mapped to π by operations unlike those of the N[umeration] → λ computation. We call the subsystem of C_HL that maps Σ to π the phonological component, and the subsystem that continues the computation from Σ_L to LF the covert component. The pre-Spell-Out computation we call overt.

This passage characterizes the core properties of the minimalist Y-model: [27]

(42) Lexicon → Numeration → MERGE/MOVE → Spell-Out (overt component)
Spell-Out → LF (covert component)
Spell-Out → PF (phonological component)

[26] See, e.g., Jackendoff (1972) and Bresnan (1971, 1972).

[27] I have made no mention so far of the notion of "Numeration" in this model (as an intermediary between the Lexicon and the syntactic derivation). This object will play almost no role here, though see our concluding discussion in Chapter 3.

The development of MSO-systems in more recent work questions the idea of a single "point" of Spell-Out.

1.4.1. MSO and Linearization

Uriagereka (1999) was to my knowledge the first to propose within the setting of the MP that we ought to regard spell-out not as a single point in the syntactic derivation, but rather as a procedure that can apply more than once, perhaps limited by economy principles (e.g., perhaps of the general Last Resort variety, mandating that no operation occurs unless necessary to ensure convergence).
Uriagereka's proposal draws on Kayne's (1994) proposed correspondence relation between linear precedence and c-command. Supposing with Kayne that asymmetric c-command relationships map to linear precedence, and taking Chomsky's view of spell-out as a process of stripping away "those elements relevant only for π" (see above), Uriagereka suggested that we identify domains for spell-out with sub-structures which constitute total/connected c-command orders. He argues that this allows a simplification of Chomsky's (1995) implementation of Kayne's LCA which avoids the need to state an induction step to cover the linearization of parts of complex structures with respect to parts within other such complexes. To illustrate, consider:

(43) [D [A a [B b C]] [D d [E e F]]]

A version of Kayne's general idea would be to claim that where α asymmetrically c-commands β, α precedes β. [28] For the sub-structure dominated by A above this would yield the order ⟨a, b, ..⟩. But while we say that A asymmetrically c-commands the elements dominated by the two-segment category D, the elements dominated by A do not. [29] Therefore we need to add a step to the linearization procedure. That is, there are two separate c-command domains here which each independently constitute a total/connected order. But b and e, for example, are not so ordered. So we need a step to tell us that all the parts of the A-substructure are to precede all the parts of the D-substructure so long as A asymmetrically c-commands D (see Kayne's (1994) discussion for his handling of this issue).

[28] See Kayne (1994) for a conceptual argument that asymmetric c-command ought to be mapped to precedence, as opposed to subsequence. See also Uriagereka (1998) and Chametzky (2000) for important related discussions.

Uriagereka's suggestion is that since independent c-command domains are trivially linearizable (i.e., they do not require appeal to an induction step of the sort informally sketched above), they independently undergo spell-out.
The output of this procedure could be understood as a flattened structure which we regard as still "there" in the computation, but whose internal parts are frozen and therefore unable to undergo further interactions in any later stages of derivation. Alternatively, a separately linearized substructure could be regarded as sent to the PF-component, leaving only a residual place-holder element "@", with some minimal specification of category/feature information relevant to the interaction of the spelled-out unit A with the rest of the structure. [30] These two options are sketched here:

(44) a. [D [A ⟨a, b, ..⟩] [D d [E e F]]]
b. [D @A [D d [E e F]]]

[29] I'm mixing in Kayne's assumptions regarding specifiers as adjuncts; this assumption is replaced in Chomsky's discussion by assumptions regarding intermediate-phrasal-level "invisibility" in order to get the right asymmetries for c-command to hold. This is inessential to the overview I am giving of Uriagereka's proposal in the main text, though I should make it clear that he otherwise follows Chomsky's BPS approach in his specific formulation.

[30] Perhaps simply core licensing properties like Case, agreement, wh, etc.

The intuition in both implementations is that spelled-out structure functions like a derived terminal, allowing a trivial statement of structure/order correspondence (α precedes β ↔ α asymmetrically c-commands β). Precedence between elements which do not themselves enter into an asymmetric c-command relationship falls out from the structure of derivations involving separate linearization of c-command domains.

Uriagereka specifically proposed that non-complements might be understood as the structures that must undergo independent linearization in the sense just sketched, and further argued that Condition on Extraction Domain (CED) effects (Huang 1982) could be understood to follow from this.
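Before turning to the island cases, the "derived terminal" idea can be made concrete with a small sketch. The Python below is not Uriagereka's own formulation: it assumes binary-branching trees encoded as nested tuples, treats every complex left branch as a non-complement that is linearized by itself, and uses hypothetical names (`linearize`, `placeholder_view`). The unexpanded nodes C and F of (43) are simply treated as leaves.

```python
# Toy encoding (an assumption of this sketch): a leaf is a string,
# a branching node is a tuple (label, left_daughter, right_daughter).
TREE_43 = ("D", ("A", "a", ("B", "b", "C")), ("D", "d", ("E", "e", "F")))


def is_leaf(node):
    return isinstance(node, str)


def linearize(node, log):
    """Return the terminal order of `node`.

    Walking down the right-branching spine, each left daughter
    asymmetrically c-commands everything inside its sister, so it
    precedes that material. A complex left daughter is a separate
    c-command domain: it is linearized by its own call first (its
    "spell-out") and afterwards behaves like one derived terminal.
    """
    if is_leaf(node):
        return [node]
    _label, left, right = node
    if is_leaf(left):
        left_order = [left]
    else:
        left_order = linearize(left, log)          # independent spell-out
        log.append((left[0], tuple(left_order)))   # record the frozen unit
    return left_order + linearize(right, log)


def placeholder_view(node):
    """Option (44)b: replace each complex left branch by '@' plus its label."""
    if is_leaf(node):
        return node
    label, left, right = node
    left_view = left if is_leaf(left) else "@" + left[0]
    return (label, left_view, placeholder_view(right))


spelled_out = []
order = linearize(TREE_43, spelled_out)
print(order)                      # ['a', 'b', 'C', 'd', 'e', 'F']
print(spelled_out)                # [('A', ('a', 'b', 'C'))]
print(placeholder_view(TREE_43))  # ('D', '@A', ('D', 'd', ('E', 'e', 'F')))
```

The `spelled_out` log corresponds to the flattened unit in (44)a, and `placeholder_view` to the residual "@" element in (44)b; nothing in the sketch decides between the two options.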
So the cases in (45) would be understood to be ungrammatical because the bracketed sub-structures would have to be independently spelled-out, making their contents "frozen" and thus inaccessible to further merge/move operations (the relative clause in (45)c would be out on the standard assumption that these structures involve adjunction and thus are also non-complements):

(45) a. *What do [explanations of e] bother you?
b. *What was Mary bothered [because Peter explained e]?
c. *What do you know [the girl] [that _ explained e]

Thus, on this linearization-driven view of MSO, we have a potential account of at least these particular so-called strong islands.

However, it's not obvious why it should be that subjects and adjuncts need to be independently spelled-out, as opposed to the structures they associate with; either option would seem to permit the simplification of the linearization procedure. [31] In Drury (1998, 1999) it is proposed that Uriagereka's linearization-based view of MSO be put together with incremental derivations of the sort proposed in Phillips (1996) (see also Phillips 2003). We can frame a version of this proposal within our WS/O-distinction as follows:

(46) Workspace Connectedness (C-Command):
The elements in a given syntactic workspace must manifest a connected c-command order (i.e., for every x, y in the set, either x c-commands y or y c-commands x)

Recall our top-down schema of the WS/O-distinction from (2), repeated here as (47):

(47)

This derivation would have workspaces which all obey (46). The following workspace would not:

(48)

[31] See Drury (1998, 1999), Johnson (2000) for some discussion of this and related points. See also Uriagereka (2002) for criticism of Uriagereka (1999), and a working through of some alternative possibilities that for reasons of time and space I will not consider in the present work.

On a top-down view of structure expansion, the c-command path that was first assembled would have to "spell-out".
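Condition (46) is simple enough to check mechanically. The following is a minimal sketch under stated assumptions: binary branching, unique node labels, and the familiar "mother dominates" definition of c-command. The toy tree and the names `ADDR`, `c_commands`, and `connected` are hypothetical and are not drawn from the text.

```python
from itertools import combinations

# Toy tree (hypothetical): leaf = string, branching node = (label, left, right).
TREE = ("XP", ("YP", "y", "z"), ("X'", "x", ("WP", "w", "v")))


def addresses(node, path=()):
    """Map every node label to its address (a tuple of 0/1 branch choices)."""
    if isinstance(node, str):
        return {node: path}
    label, left, right = node
    table = {label: path}
    table.update(addresses(left, path + (0,)))
    table.update(addresses(right, path + (1,)))
    return table


ADDR = addresses(TREE)


def dominates(x, y):
    """x properly dominates y iff x's address is a proper prefix of y's."""
    px, py = ADDR[x], ADDR[y]
    return len(px) < len(py) and py[:len(px)] == px


def c_commands(x, y):
    """x c-commands y iff neither dominates the other and x's mother dominates y."""
    px, py = ADDR[x], ADDR[y]
    if x == y or not px or dominates(x, y) or dominates(y, x):
        return False
    mother = px[:-1]
    return py[:len(mother)] == mother


def connected(workspace):
    """Condition (46): every pair is related by c-command in some direction."""
    return all(c_commands(x, y) or c_commands(y, x)
               for x, y in combinations(workspace, 2))


print(connected({"YP", "x", "w", "v"}))  # True
print(connected({"y", "x", "w"}))        # False
```

The first workspace contains the left branch YP only as an unanalyzed unit, so every pair is ordered; the second looks inside YP, and `y` is then unordered with respect to `x` and `w`, which is the configuration that forces a spell-out on this view.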
We can envision spelled-out structure as being "ejected" from the workspace as follows, in the spirit of our proposed contractions discussed above (§1.1 & 1.2):

(49)

The shaded node above would still be visible/present in the workspace, but the structure it dominates would be excluded (removed from the workspace = spelled out). There are numerous details here that require elaboration (e.g., with respect to symmetry vs. asymmetry of c-command between sisters; see Kayne 1994, Chomsky 1995), but the basic idea would be that the workspace would be limited to only contain trivially linearizable structure, as in Uriagereka's proposed simplification of a Kayne-type order/structure correspondence. This view, however, doesn't include anything that gets around the objections raised above (e.g., regarding which of two sisters spells out, etc.). I put it on the table now to underscore the generality of the key idea of viewing spell-out of sub-structures as essentially their being "kicked out" of the active workspace (we will see other illustrations below).

Note that the reduced structures introduced in the sample derivations in §1.2 were suggested to involve only a dominance ordering. Suppose that we were to modify the proposed restriction in (46) to refer to dominance in our reduced structures, as in (50):

(50) Workspace Connectedness (DOMINANCE):
The elements in a given syntactic workspace must manifest a connected dominance order (for every x, y in the set, either x dominates y or y dominates x)

At each branching point in the top-down expansion of the domination sequence, something would have to "spell-out" (be voided from the workspace). To illustrate, take the following nodes to be introduced in the order indicated by their number. The initial sequence ??? would satisfy (50), but the addition of ? would add a domination relation between ? and ? but no such relation between ? and ?, so we could take ? to be required to "spell-out" (be voided from the workspace).
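The effect of (50) on a top-down expansion can be simulated in a few lines. This is only a sketch: it assumes a branching pattern in which node 1 immediately dominates 2 and 3, node 3 dominates 4 and 5, and so on, and it assumes that when a newcomer is not dominance-related to an older node it is the older node that spells out. The function names are hypothetical.

```python
def dominates(mother_of, x, y):
    """True if x properly dominates y, following mother links upward from y."""
    node = mother_of.get(y)
    while node is not None:
        if node == x:
            return True
        node = mother_of.get(node)
    return False


def expand_top_down(introductions):
    """Introduce nodes one at a time, keeping the workspace connected.

    `introductions` is a list of (node, mother) pairs in order of
    introduction; the root has mother None. Assumed policy: an older
    node that is not dominance-related to the newcomer spells out.
    """
    mother_of, workspace, spelled_out = {}, [], []
    for node, mother in introductions:
        mother_of[node] = mother
        for old in list(workspace):
            related = (dominates(mother_of, old, node)
                       or dominates(mother_of, node, old))
            if not related:
                workspace.remove(old)      # voided from the workspace
                spelled_out.append(old)
        workspace.append(node)
    return workspace, spelled_out


# Assumed branching pattern: 1 > {2, 3}, 3 > {4, 5}, 5 > {6, 7}.
steps = [(1, None), (2, 1), (3, 1), (4, 3), (5, 3), (6, 5), (7, 5)]
workspace, spelled_out = expand_top_down(steps)
print(workspace)    # [1, 3, 5, 7]
print(spelled_out)  # [2, 4, 6]
```

Under these assumptions the workspace is always a single dominance chain, and one sister is ejected at every branching point.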
Subsequent spell-outs would occur for the same reason (SO1, SO2, etc.). The result of this condition is binary branching.

(51) ? SO1 ? ? SO2 ? ? SO3 ? ? etc,..

Note that these spell-outs would have to work differently than the general shape of Uriagereka's proposal. Since the connectedness requirement would be stated over the dominance relationship, the even-numbered nodes would literally have to be "gone" from the workspace. So this raises the question as to how they might interact with later structure. However, recall from our sketch of A- and A'-relations above that the connection between items is mediated by various kinds of feature-exchanges (valuations, etc.). On this view, for example, ? and ? above might relate in such a way as to leave the appropriate feature-relationships visible on ? alone (and thus still "in" the dominance path). For example, it was suggested that ?-assignment to a "subject" is mediated by the interrelationships between D and T with respect to ? and ? properties, with the ?-properties serving as a link to the thematic variable (?) introduced by the verb. Following these valuation exchanges, D is marked for ? and D and T are connected by co-valued ?.

Given this general picture, we might consider the possibility of an element being spelled-out (e.g., like ? above), and then re-entering the workspace in virtue of a later instance of node-identification. For example, suppose that ? introduces another instance of the ? type:

(52) ? SO1 ? ? SO2 ? ? SO3 ? ? ?

If (?, ?) meets our matching relation ?, then identification would occur. But, as argued above, this would require the splicing-out of all the intervening odd-numbered nodes in (52). But nothing would prevent the copying/lowering of ?, as this would create no ordering conflicts, nor would it violate the connectedness condition:

(53) ? ? ? CONTRACTION ? ? ? ? ? ?
Intuitively, this would have the effect of an element (here: "?") leaving the active workspace (being spelled-out), and then "returning" again to the workspace as its context was copied via node-identification. Note that further additions to the structure from the point in (53) (e.g., associating a new element directly with ?) would result in ? having to spell-out again, in order for the workspace to comply with connectedness. But ? could dominate arbitrarily complex structure, so it's not obvious that we could, given our distinctness condition on the WS, simply reintroduce such complexes at the bottom of the domination order in virtue of the node-identification illustrated above. However, recall that the ?-? relation has been understood to involve some feature-value exchange. This suggests the possibility that we could understand the copying/lowering as involving simply a reintroduction of a simple label, implementing the notion of a derived terminal in Uriagereka's (1999) sense. That is, the node identification (?,) would result in reintroducing a simplex marker for ? above, facilitating the copying/lowering we require but not reintroducing all of the potentially complex structure dominated by ? into the workspace. This would still allow us to see the "left-branch" material of ? being successively reintroduced into the output structure, in virtue of the initial feature-licensing connection established in the matrix position. The "context" element itself (?) would, in contrast to ?, be a constant presence in the workspace (it does not spell-out, get reintroduced, spell-out again, as ? would).

These mechanics are relevant to a discussion at the end of §1.2 regarding cyclic wh-movement and an operation of the "Make-OP" sort. There we referred to the difference between the following two sorts of structures, and suggested that the former introduces a copying of operator-elements that we do not seem to want, whereas the latter seems to have the right properties:

(54) C WH[?:?] ?..?C WH[?:?] ?..?C WH[?:?] ?..? D ?:? D ?:? D ?:?
(55) C WH[?:?] ?..?C?..?C?..? D ?:? D ?:? D ?:?

The node-identification procedure, plus the now strengthened condition on workspace ordering (in terms of connectedness of the dominance order), makes available a distinction between C and D of exactly the sort we want. C is constantly "there" in the workspace, while D must spell-out, be re-entered into the workspace, spell-out again, and so on as the local domains are established in a top-down fashion. I will return to these general ideas in the course of developing some analyses to explore the matter a bit (see Chapter 3, especially §3.2.3 & §3.3.3), but the general suggestion is that we think of connectedness of the dominance ordering and the anti-recursion restriction as working together to factor complex structures into natural local domains, creating major boundaries at both points of branching and points of recursion.

Every syntactic theory of which I am aware needs to say something about (i) a theory of types, and (ii) formal ordering properties. The suggestion here is that it may be possible to get these very basic notions to do quite a bit of work for us if we can seek out the right combination of conceptions of each (as suggested also in the work of Epstein, Groat, Kawashima, & Kitahara (1998), though with rather different conceptions pursued).

For now, observe that the general idea of MSO as proposed by Uriagereka removes a "point" of spell-out from the familiar Y-model in (42) above, in favor of a more dynamic-looking system with multiple such points:

(56) MERGE/MOVE LF Lexicon ? Num? SO 1 ?..? SO 2 ?..? SO n ?PF?

This kind of derivational architecture raises a number of questions about the status of levels of representation. If spell-out is not a unitary point, do we need to amend our conception of PF as a unified object in the sense of "level of representation"?
Uriagereka suggests that his dynamic view of spelling-out is compatible with a level-less conception in which there is no unified object that is subject to a single-step evaluation with respect to grammatical conditions. On his view, separately spelled-out sub-structures could be seen as being sent to the interface systems separately, leaving the grammar architecture with a PF-component but no level of this sort per se.

However, note that there is nothing in the MSO view that requires us to abandon levels of representation. It could simply be that MSO establishes the PF representation in the steps given by the independent instances of linearization, but that it still forms a coherent connected object that can be subjected to further (PF-system) operations. That is, we can simply regard levels as incrementally established. But it matters a bit what we take the "levels" to actually be. I will return to this point, but note here that this is roughly the content of the WS/O-distinction (a limited-span derivational workspace that incrementally builds an output representation). However, the suggestion above was that the object which is incrementally assembled is "syntactic" in the sense of manifesting the formal ordering properties laid down in the workspace, but is an object of the extra-grammatical PF/LF-systems in terms of what sorts of properties/features/categories populate this object (what sort of properties "decorate its nodes", if you will).

This PF-motivated view of MSO raises questions about LF too, in particular: are there reasons for thinking that SO is involved in similar kinds of divisions of derivations on the LF-side of the grammar? Asymmetric c-command, after all, is taken to be relevant for scope and binding and the like; are there thus reasons for thinking that SO sends material to both the PF and LF systems, leaving us with a model like (57)?

(57) ?LF? Lexicon ? Num? SO 1 ?..? SO 2 ?..? SO n ?PF?

An architecture of this general sort, with both PF- and LF-relevant spell-outs (setting aside the staggered vs. uniform views), has in fact been suggested in connection with Chomsky's recent proposals regarding phases (derivation by phase: DbP), which brings us to the DbP/MSO view of derivations and their handling of SCM.

Note that Uriagereka's linearization-based view of MSO on its own does not offer us anything immediately obvious in terms of helping to understand SCM phenomena. C-command domains are themselves potentially unbounded in depth, whereas the key point about SCM is that roughly clausal (or perhaps smaller) units form special domains that "punctuate" otherwise longer-distance-looking relationships into linked-up local ones. [32] However, this view might be interestingly fit together with something like Chomsky's phases, which constitute a subset of the domains picked out by Uriagereka's linearization-based conception. In later discussion I will suggest that Uriagereka's central idea may be best viewed in terms of general formal ordering restrictions on the workspace along the lines sketched above; that is, not specifically tied to linearization demands, but rather to the internal coherence of ordering properties in narrow syntax. [33] I turn now to a phase-based MSO-system and cyclic movement.

[32] I borrow the notion of "punctuated" relations from Abels (2003); see Chapter 2. See also Bošković (2002) for a discussion evoking ideas from Aoun (1986) and others regarding the notion of having certain phrase boundaries serve to "break" chains.

[33] Again, see also Uriagereka (2002) for critical discussion of his own previous proposals, and some alternative suggestions that I won't be considering in the present study. My own view here will involve a formulation akin to the notion of Workspace Connectedness offered in (46). It's worth noting here that Hornstein & Uriagereka (1999) appeal to a similar kind of interface motivation for spelling-out as this linearization-based conception, though on the other (LF) side of the grammar.
Briefly: they examine the possibility that moved DPs may project their label to determine the type of their dominating category, essentially allowing the potential of taking clauses as "external arguments" of DP. They suggest this as the syntax supporting generalized quantifiers, and argue that, like the left-branch-type effects at PF, the projection of D-labels in their moved-to target positions creates an analogous kind of effect at LF. Technically, they argue that DPs do not so project their labels in the overt syntax, but in the covert component a "re-projection" occurs, essentially allowing specifiers to determine the types of their containing phrases.

1.4.2. Phases/MSO & Successive Cyclicity

Illustrated in (58)a-d is a general schema for a fairly standard derivational approach to such linked-local relationships familiar from the MP; below we locate this general line of thinking within Chomsky's (1999) approach.

(58) a. XP b. XP c. XP ..? {*F} .. ? {*F} X' {*F} X 0 ..t ? .. XP ? {*F} X' .t ? .. d. XP ? {?F} X' {?F} X 0 XP t ? X' ..t ? ..

In the context of the MP, the moving element ? is understood to have some property {F} which requires that ? enter into a licensing relationship that cannot be established in its initial position within the subtree marked XP in (58)a (so we mark {F} here as {*F} until licensed, and as {?F} afterwards). [34] For the case of local wh-movement (e.g., who did John see _ ?) we understand the wh-element to be directly displaced (perhaps copied and/or remerged) to the surface position where the licensing/checking of this feature {*F} can occur via a match with a corresponding feature housed in the target position (pictured in (58)d). Of interest here is that the dependency between the top and bottom positions implicated in this sort of movement is alleged to not be "one-fell-swoop" but rather to involve linked local relations. That is, for reasons which vary across specific models, ?
may be required to move to an intermediate or non-target position in which its unlicensed property {*F} is not satisfied (pictured in (58)b) on its way to the final/target position where this licensing can (in fact, must) occur. Some other property may be satisfied by this intermediate movement, perhaps some property of this intermediate position or perhaps in virtue of constraints built into the movement operation itself. Or, it may be that both sorts of motivations are in play.

For example, in Chomsky's DbP formulation, elements must move to intermediate positions in order not to be stranded in an independently spelled-out domain. The idea of the DbP approach is that structures may be evaluated piecemeal, so unlicensed elements must be displaced from such locally evaluated structures in order not to crash the derivation. Chomsky motivates these "escape hatch" type movements from locally evaluated domains by positing features/properties of potential intermediate movement landing-sites to play the role of the local licensor for such operations (I will later on refer to these putative features as "Merge/Move-features" or "M-features").

[34] I don't care for present purposes about any deletion/erasure procedures that may be part of such licensing/checking.

Chomsky suggests that certain syntactic categories are phase-inducing, and that when multiple such heads are introduced into the derivation this results in systematic limitations on what remains "in active memory" versus what material is spelled-out and thus no longer accessible to computation. His Phase Impenetrability Condition (PIC) insists that for a given phase head H_PH-1, when a second such head H_PH-2 is introduced the complement domain of H_PH-1 spells out.
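The contraction pattern that the PIC induces can be sketched as a toy simulation. The code below is an illustration only, not Chomsky's formalism: it flattens the derivation to a single spine of heads merged bottom-up, ignores specifiers and other edge material, and takes v and C to be the phase-inducing heads (an assumption of the sketch; the head labels and the function name `derive_bottom_up` are hypothetical).

```python
def derive_bottom_up(heads, phase_heads):
    """Merge heads bottom-up and contract the workspace at phase heads.

    `heads` lists heads from the most deeply embedded to the highest.
    When a phase head is merged while an earlier phase head is still in
    the workspace, everything merged before that earlier head (its
    complement domain) spells out; the earlier head itself and whatever
    was merged above it remain accessible.
    """
    workspace, spelled_out, snapshots = [], [], []
    previous_phase_head = None
    for head in heads:
        if head in phase_heads and previous_phase_head is not None:
            cut = workspace.index(previous_phase_head)
            spelled_out.extend(workspace[:cut])   # the workspace contracts
            workspace = workspace[cut:]
        workspace.append(head)
        if head in phase_heads:
            previous_phase_head = head
        snapshots.append(list(workspace))
    return snapshots, spelled_out


# Two clauses' worth of heads; v1, C1 and v2 are assumed to be phase heads.
heads = ["V1", "v1", "T1", "C1", "V2", "v2"]
snapshots, spelled_out = derive_bottom_up(heads, {"v1", "C1", "v2"})
print(snapshots[3])   # ['v1', 'T1', 'C1']
print(snapshots[-1])  # ['C1', 'V2', 'v2']
print(spelled_out)    # ['V1', 'v1', 'T1']
```

Note that each phase head survives in the workspace until the next one is introduced, so consecutive workspaces share material at their boundary.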
Abstractly, then, we have a derivation (on Chomsky's assumptions, a bottom-up one) with periodic applications of spell-out that in our workspace formulation we can picture as follows (phase-inducing heads are marked):

(59) ,.., | H_PH-1 INTRODUCED | H_PH-2 INTRODUCED | COMPLEMENT DOMAIN OF H_PH-1 SPELLS-OUT (i.e., THE WORKSPACE CONTRACTS!) | ETC,..

Spelled-out structure under the WS/O-distinction, as outlined above for the linearization-based view of MSO, is simply structure that is no longer in the workspace. Again, the WS/O-distinction as I am understanding it here is extremely general, and it is intended to be so. This gives us a common platform to discuss these different (otherwise technically rather different) proposals. It is, however, more than just another "way of talking". There is a substantive claim implicit here which I am carrying across the discussions of the TCG approach as sketched above, Uriagereka's linearization-based MSO, and now the derivation-by/spell-out-by-phase view as proposed by Chomsky. The central claim revolves around the technical assumption that has been built in here, which is that what is in the workspace is a piece of the output structure itself, matched up with formal properties which allow the establishment of ordering properties and syntactically significant relationships over such structures.

The general outlook avoids some questions we might ask of the informally presented notions (in both Uriagereka's and Chomsky's work) of the syntax "handing-over" or "transferring"/"shunting" structure to other systems in a piecemeal fashion. That is: what ensures pre-/post-spell-out coherence? How are the independently "handed-over" chunks related in these other systems? Do they need to respect the ordering properties established in the workspace? Or not? Note that there are several ways that workspace ordering might be "respected".
For example, we could think of the mapping between individual nodes in the workspace and the output structure as being mirror theoretic for (e.g.) the ?-vocabulary (as in the seminal work of Mark Baker 1985, and as adopted in Brody 1999, 2003). I won't, however, be pursuing this particular point in this work (though I think it's the right one to pursue given the overall architecture); it is the general point about pre-/post-spell-out coherence that I wish to stress here. The WS/O-distinction as we have conceived of it here provides a straightforward model of pre-/post-spell-out coherence: there should be a conservation of ordering properties, so that syntactic ordering continues to constrain post-syntactic relationships. Nothing of course rules out the possibility of post-syntactic operations that would deform structure in ways that result in loss of information (so that relationships across the interface wouldn't be trivially/transparently reversible). The present point is not that any of these things are impossible, but rather just that the WS/O-distinction provides a format within which to frame the issues.

Now asking questions about what spells out (and when) in the derivation is rephrased in terms of asking how/why/when the derivational workspace contracts. Thus, returning now to Chomsky's phase-based conception: Why should the workspace contract to exclude the complement domain of Chomsky's putative phase-inducing heads? Why not the entire sub-structure the phase-inducing head projects? Or, with our WS/O-distinction in play, why not retain the complement structure and simply spell out the edge of the phase (i.e., the head and its external dependents)? Note that on Chomsky's view of phases the domains circumscribed by the borders of the workspace overlap from some steps in the derivation to others.
When the second phase-inducing head (H_PH-2) is introduced, the first such phase head and its external dependents are still in the workspace even though the complement domain is understood to spell out (= be "voided from the workspace"). So one important point illustrated by the discussion so far is that having successive stages of derivation in which some elements survive in the workspace despite further expansions of structure yields a kind of overlap (illustrated below in (61)a). This overlap is crucial to elaborating the notion of "escape hatches" for cross-domain dependencies. That is, escape hatches are possible because some position(s) constitute the top of certain workspaces and survive at the bottom of later workspaces; this allows an element moving to positions residing in such domain overlaps to be visible to elements higher in the structure.

There is, however, nothing inevitable about such overlaps between domains. We could, for example, imagine an architecture which would not permit them. In such a system, when a second phase-inducing head is introduced we could take this to signal the start of a whole new workspace. Consider:

(60) [Figure: workspace snapshots a.-g., labeled: H_PH-1 introduced; H_PH-2 introduced; complement domain of H_PH-2 spells out; etc.]

On such a view the step between (60)e and (60)f would result in the establishment of a brand-new workspace signaled by the introduction of a new phase-inducing head. This would yield a theory with non-overlapping phases. Successive snapshots of the derivational workspace for a longer stretch of derivation can be seen to yield maximal stretches of workspace structure (maximally expanded workspaces prior to any particular contraction steps or "spell-outs") as follows for Chomsky's view ((61)a) versus our hypothetical non-overlapping system ((61)b). We also include here for consideration a third possible state of affairs which imposes no restrictions on the workspace whatsoever ((61)c):

(61) [Figure: panels a., b., c., with successive phases marked]
[Panel labels for (61): a. PHASE OVERLAPS; b. NO PHASE OVERLAP; c. NO PHASES]

On the (61)a-view we expect the possibility of limited kinds of interactions between elements in a structure, if we understand the operations responsible for connecting elements in substantive dependency relationships to be limited to what is present in the workspace at a given stage of derivation. On the (61)b-view we expect no such interactions. On the (61)c-view, if nothing more is said by way of constraining operations, interactions of all sorts are expected. The (61)c-view would simply identify the workspace and output for all steps of derivation and would thus need to appeal to something other than the dynamic sort of domain-demarcation under discussion here to understand locality. For example, operations might be limited to only being able to relate two elements α and β in a structure if there is no intervening element γ that could enter into the same type of relation (e.g., Relativized Minimality; Rizzi 1990).

It's easy to see that having a workspace which limits the reach of dependency-forming operations in the grammar could be redundant with distance restrictions of the minimality sort that are often appealed to in the literature. (It may already be obvious, given the heavy emphasis I have placed on the notion of workspace, which direction I will suggest we go in removing any such redundancy.)

Now let us consider a derivation involving successive cyclic movement in these terms. Take the darkly shaded node in the following to be an element bearing our {*F} property that requires licensing not available in its initial position, and first consider what happens if there is no SCM (as above, the grey-shaded nodes are the phase-inducing heads):

(62) * [Figure: workspace snapshots for the derivation without SCM]
[Labels for (62): H_PH-1 introduced; H_PH-2 introduced; complement domain of H_PH-1 spells out (the workspace contracts)]

On this view, then, if the {*F}-bearing element does not move from the complement domain of the first phase-inducing head (H_PH-1), it will be stranded in the abandoned (spelled-out) portion of (the output) structure. On the assumption that such unlicensed properties are uninterpretable by, or illegible to, the interface systems, this derivation would crash. However, the structure of this account makes available the possibility of the {*F}-bearing element moving to some position outside the complement domain of H_PH-1 prior to the spell-out of that domain, and thus managing to stay within the workspace (remaining active/visible for later steps of derivation). To illustrate:

(63) [Figure: workspace snapshots for the derivation with movement, labeled: H_PH-1 introduced; H_PH-2 introduced; complement domain of H_PH-1 spells out (the workspace contracts)]

As such a derivation continues, introduction of further phase-inducing heads would drive further contractions of the workspace (spell-outs) and would thus require additional local movements of the {*F}-bearing element until it reached a domain within which it could associate with an appropriate licensor.

There are some interconnected technical matters that require attention in such an approach. First, what happens to the problematic {*F} feature of the lower element (trace/copy)? Second, what motivates the movement out of the relevant complement domain? Regarding the first point, if we regard the displacement operation as "copying", then why is the {*F}-feature not problematic for the lower copy when its containing domain spells out (falls outside the workspace)? If it is not a copying operation, and instead involves a literal re-merger resulting in a multi-motherhood structure, then why doesn't the same worry hold (i.e., the {*F} property should still reside in both structural contexts)?
On either the copy or the remerge view it seems that we do not have a way to avoid the outcome that held in the non-movement situation if sub-parts of structure undergo cyclic evaluation of the sort just sketched. We might suggest that the copy left by movement can have its {*F} property freely deleted, but then why couldn't this happen in the non-movement case? The answer to this question might be taken to involve appeal to some later stage of derivation where a matching {F}-property would go unchecked, but it's not clear that this response would be correct (e.g., what about wh-in-situ?).

Regarding the licensing of the intermediate movement, there are two obvious possibilities. First, there might be some property of the intermediate landing site that serves to license the movement but crucially not to license the {*F} property (i.e., the element must "move on" to some other position to license this property). Second, we might motivate the movement as not being driven by the landing-site properties, but rather by some combination of the local context and the {*F}-property itself. Movement out of locally evaluated domains might be possible just in case failure to do so would result in a crashed derivation at that point.35 This relates to the question above regarding the properties of the wh-element itself, and how/why it does not crash the derivation even if it does move (in virtue of leaving behind a copy), and what to say about situations where we do not want the element to move (e.g., in wh-in-situ cases). Again, I will return to these issues in later discussion.

Before we move on to consider how TAG derivations work in comparison, let us sum up some key questions raised in this section. I will borrow from a discussion in Felser (2004) and refer to these as the questions of TRIGGERING ((64)a) and CONVERGENCE ((64)b):

(64) Assuming SCM exists...
a. What motivates it? Properties of the moving element? Properties of the intermediate target? Both? Neither (i.e., something else)?
(TRIGGERING?)
b. What are the mechanics of movement like such that unlicensed features {*F} do not remain to cause convergence problems in spelled-out domains (either on the copy or on the remerge view)? (CONVERGENCE?)

35 This basic line is suggested in Lasnik & Uriagereka (forthcoming).

1.4.3. What Goes for Cyclicity in TAG

The categorial distinctness condition on the syntactic workspace introduced above (see (9)) relates to ideas from work in Tree Adjoining Grammar (TAG), though the insights will be implemented rather differently here. The key TAG idea is that we might understand the fundamental notion of recursive structure to play a central role in understanding the range of possible interactions between phases of derivation (more neutrally: between chunks of structure). As a matter of its basic architecture, TAG factors complex structures into non-recursive elementary trees and recursive auxiliary trees that are combinable via TAG's two main operations (substitution and adjoining). These two operations can be pictured as follows:

(65) [Figure: SUBSTITUTE, with trees labeled X and Y]
(66) [Figure: ADJOIN, with an initial tree (X containing Y), an auxiliary tree (root Y_r, foot Y_f), and the derived tree]

Auxiliary trees in TAG, as pictured above, are special in that they are taken to have related top (root) and bottom (foot) nodes (e.g., Y_r and Y_f in (66)), which enables the complex of relationships which they "sandwich" to be spliced in for some equivalent atomic element within another structured object (e.g., Y in the initial tree in (66)). Thus, trees without this top/bottom characteristic are elementary; those with this characteristic are auxiliary. As Frank & Kroch (1995:113) put it, "the recursive character of auxiliary trees provides [..] a domination-preserving expansion of a single node in a piece of phrase structure into a larger structure."

Now consider what happens in place of successive cyclic movement in TAG. I say "in place of" because in TAG syntactic dependencies like wh-movement are argued to be localized to elementary trees:
movement across such structures of the GB/MP sort illustrated above is ruled out on general architectural grounds. Thus the movement of the element α in (67)a targets what will in fact be its final landing site, crucially within the bounds of the elementary tree.36 This is the only movement operation that there is in this approach. This movement (/chain) relation between the base and final/target position is then stretched as a consequence of the adjoining operation illustrated above, which splices in intervening material as pictured in the step from (67)b to (67)c.

(67) [Figure: a. an elementary tree XP containing α{*F}, X' and the trace t_α; b. the same tree together with an X' auxiliary tree; c. the derived tree, with the X' material spliced in between α{*F} and t_α]

36 I will discuss later on some ideas about limitations regarding the "size" and "shape" of elementary and auxiliary tree structures, following among others the work of Frank (1992, 2002). I am leaving to the side the issue of the licensing of the feature {*F} for this illustration of TAG-mechanics. How this licensing works requires a bit more detailed and subtle discussion. This issue is important, however, as it turns out that the TAG approach I discuss here (i.e., from Frank's 2002 discussion) requires formulating the adjoining operation to allow "checking" across elementary trees. See Chapter 2 for some relevant discussion.

Further adjoining operations can then splice in more auxiliary structures, yielding the effect of a long-distance relationship between α and its base trace position. Note that there are no "intermediate traces/copies" on this view. A key aspect of TAG is the identification of the root/foot nodes of auxiliary structures with a corresponding/matching element in an elementary/initial tree in the adjoining operation. The TCG approach developed here exploits a "matching" relationship of roughly this kind as well, though such matching is understood to effect the opposite of TAG-theoretic adjoining, as we saw in our introductory sketch (contraction).
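The way adjoining stretches a local filler-trace relation without creating intermediate traces can be sketched in a few lines of Python. This is a minimal illustration under assumptions of my choosing, not an implementation of any particular TAG formalism: trees are bare labeled nodes, the adjunction site is given as a path of child indices, and the names `Node`, `find_foot`, `adjoin` and `leaves`, along with the toy lexical labels, are hypothetical.

```python
import copy
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    foot: bool = False  # marks the foot node of an auxiliary tree


def find_foot(node: Node) -> Optional[Node]:
    """Return the (first) foot node below `node`, if any."""
    if node.foot:
        return node
    for child in node.children:
        found = find_foot(child)
        if found is not None:
            return found
    return None


def adjoin(initial: Node, aux: Node, path: List[int]) -> Node:
    """Adjoin `aux` at the node of `initial` reached by `path`.
    Both inputs are copied, so the original trees are left untouched."""
    initial, aux = copy.deepcopy(initial), copy.deepcopy(aux)
    parent, target = None, initial
    for i in path:
        parent, target = target, target.children[i]
    foot = find_foot(aux)
    if foot is None or not (aux.label == foot.label == target.label):
        raise ValueError("root, foot and adjunction site must match")
    # The material below the adjunction site is re-attached under the foot.
    foot.children, foot.foot = target.children, False
    if parent is None:
        return aux
    parent.children[path[-1]] = aux
    return initial


def leaves(node: Node) -> List[str]:
    if not node.children:
        return [node.label]
    return [leaf for child in node.children for leaf in leaves(child)]


# Elementary tree: the wh-element has already moved locally, leaving trace "t".
elementary = Node("XP", [Node("wh"), Node("X'", [Node("liked"), Node("t")])])
aux_believed = Node("X'", [Node("believed"), Node("X'", foot=True)])
aux_think = Node("X'", [Node("think"), Node("X'", foot=True)])

once = adjoin(elementary, aux_believed, [1])
twice = adjoin(once, aux_think, [1])
print(leaves(elementary))
print(leaves(once))
print(leaves(twice))
```

Each application of `adjoin` lengthens the distance between `wh` and `t`, yet the derived trees still contain exactly one trace, which is the point made above about the absence of intermediate traces/copies.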
Consider how we might view TAG derivations in the context of our WS/O-distinction. In the development of one specific TAG approach, that in Frank (2002), it is suggested that syntactic derivations are divided into two major stages. The first stage involves the merge/move mechanics familiar from Chomsky (1995). However, departing from Chomsky's "one-stage" system, Frank suggests that the merge/move portion of derivations is limited to generating structures that meet the following general condition:

(68) CONDITION ON ELEMENTARY TREE MINIMALITY (CETM): The syntactic heads in an elementary tree and their projections must form an extended projection of a single lexical head.

Reference to "extended projections" comes from the work of Grimshaw (1991, 2002) and others. The effect of this condition is that the merge/move portion of syntactic derivation only creates objects that are roughly clause-sized or smaller. These merge/move-derived objects then, in Frank's system, are fed to a second stage of derivation which deploys the TAG-theoretic operations of substitution and adjoining sketched above.

Translated into our WS/O terms, we can understand Frank's CETM as a condition on syntactic workspaces, and then assume that there can be multiple such workspaces corresponding to the basic tree structures that form the input to the second, TAG-theoretic stage of derivation. This TAG portion of derivation is then conceived as a component which effects combination of such workspaces to form derived (output) structures (I take it that this basic picture is clear enough not to require a diagram).

Of course, this amounts to a rather different conception than the views sketched above. I mentioned earlier the possibility of having a system which would impose no restrictions on the syntactic workspace and thus would require that locality be stated in ways other than the dynamic view of local domains we sketched at the outset.
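The CETM in (68), read as a size limit on workspaces, can be given a rough procedural rendering. The Python sketch below rests on strong simplifying assumptions that are not Frank's or Grimshaw's: an elementary tree is reduced to the bottom-up sequence of its head categories, and the table `EXTENDED_PROJECTIONS`, with its particular category inventories, is a hypothetical stand-in for a real theory of extended projection. The function name `satisfies_cetm` is likewise invented for illustration.

```python
# Hypothetical inventory: each lexical category with the ordered functional
# heads assumed (for illustration only) to extend its projection.
EXTENDED_PROJECTIONS = {
    "V": ["V", "v", "T", "C"],
    "N": ["N", "D", "P"],
}


def satisfies_cetm(heads):
    """heads: head categories of a candidate elementary tree, lowest first.
    Returns True if they form (part of) the extended projection of a single
    lexical head, in order and without repetition."""
    if not heads:
        return False
    spine = EXTENDED_PROJECTIONS.get(heads[0])
    if spine is None:  # the lowest head is not a lexical category
        return False
    position = 0
    for head in heads:
        try:
            # Each head must occur later in the spine than the previous one.
            position = spine.index(head, position) + 1
        except ValueError:
            return False
    return True


print(satisfies_cetm(["V", "T", "C"]))                 # one clause
print(satisfies_cetm(["V", "T", "C", "V", "T", "C"]))  # two stacked clauses
```

On this rendering a single clausal spine passes, whereas a structure that stacks one clause on another contains two lexical heads and fails, which is the sense in which the merge/move stage is limited to roughly clause-sized objects.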
The TAG view on this general outlook would be an entirely different approach, but one that would view syntactic workspaces as always coextensive with outputs. However, instead of introducing locality constraints on operations within the workspace, we rather have a limitation on workspace size (i.e., of the CETM sort), plus the major architectural division of derivation into (i) a first stage of local structure creation with multiple workspaces, and (ii) a second stage that handles combination of workspaces into larger complexes. The TCG view developed here maintains a "one-stage" view in the sense of not positing two distinct stages of syntactic operations. I turn now to elaborate further.

1.5. Implementation of TCG

This section backs up to consider some technical issues regarding the notion(s) of movement chains (§1.5.1). This leads us to consider, when coupled with the sketch of contraction offered above, a "reduced" view of categories and structure (§1.5.2).

1.5.1. On Relating Positions in Structure

This section discusses a possible view of movement chains based on ideas from Chomsky (1995). In derivational terms we think of an item α as first associating with some other, independent element to form a structure Σ constituting the initial/base position B (as in (69)a). Later on, derivationally speaking, some operation causes α to enter into a second set of relationships with some target element T to form the structure Σ' constituting the derived/target position, where the T-elements dominate the B-elements (as in (69)b):

(69) [Figure: a. Σ = B^M immediately dominating α and B^S; b. Σ' = T^M immediately dominating α and T^S, with the structure in a. (B^M, α, B^S) below]

Take the superscripted 'M' and 'S' in (69)a/b to stand for the mother and sister elements respectively, which together form a merge-derived structural context for α (e.g., B^S = base position sister, etc.). I will return to the issue of whether one or the other of these may suffice, or whether both are somehow required,
for identifying the contexts for a moved/displaced element α, but note here that Chomsky (1995:252) understands the relevant element for defining the contexts of α as the sister or co-constituent of α (i.e., B^S and T^S in (69)b). Consider:

Suppose that α raises to a target [T] in Σ, so that the result of the operation is Σ' [..]. The element α now appears twice in Σ', in its initial position and in the raised [target] position. We can identify the initial position of α as the pair ⟨α, β⟩ (β the co-constituent of α in Σ [i.e., B^S in (69)a/b; JED]), and the raised position as the pair ⟨α, K⟩ (K the co-constituent of the raised term α in Σ' [i.e., T^S in (69)b; JED]). Actually, β and K would suffice; the pair is simply more perspicuous. Though α and its trace are identical, the two positions are distinct. We can take the chain CH that is the object interpreted by LF to be the pair of positions. [..] C-command relations are determined by the manner of construction of [the object in (69)b above; JED]. Chains are unambiguously determined in this way.

Following through on this view for our example in (69)b, we see that there are two such relevant positions in Σ', POS_1 and POS_2, where POS_1 = ⟨α, T^S⟩ and POS_2 = ⟨α, B^S⟩.37 As Chomsky puts it, these two positions are distinct, but together they constitute the chain CH = ⟨POS_1, POS_2⟩ = ⟨⟨α, T^S⟩, ⟨α, B^S⟩⟩; or, if we adopt the "more austere version", then CH is simply ⟨T^S, B^S⟩, since "α and its trace are identical". Observe, however, that there are situations in which the context positions are themselves identical. In particular, consider again the following standard cases of the putative cyclic A- and A'-movements in (70):

(70) a. John [seems [ _ to be likely [ _ to appear [ _ to like carrots]]]]
b. What [did Dave think [ _ [that Mary believed [ _ [that John liked _ ]]]]]?
Intermediate A- and A'-movements involved in such cases are classically taken to have more-or-less the following general abstract shapes:38

(71) [Figure: a. a spine of TP nodes over VP, with {*F} and α marked; b. a spine of CP nodes over VP, with {*F} and α marked]

37 I am taking a shortcut here for expository purposes. The co-constituent forming the sister context for the "raised" position in (69)b would not be just "T", but rather a more complex set-theoretic object in Chomsky's general Bare Phrase Structure approach. Here we take "T" to "stand in" for this more complex object.
38 There are, as we will discuss later, views which take movement to be "more cyclic" than this, as well as less (i.e., "one-fell-swoop" views). I will discuss these matters in the next Chapter.

So while it may be true that a moving element α is "identical" within each of these contexts, at least some of the relevant context pairs ought to yield identity (or non-distinctness) as well. T-to-T and C-to-C movement could yield a chain CH = {⟨α, C'⟩, ⟨α, C'⟩}. Is this a problem? The suggestion inherent in the TCG view of cyclicity sketched above can be understood as the claim that such a state of affairs is not only "not a problem", it is in fact crucial to understanding linked local relationships. As we saw earlier, this is at the heart of the TAG architecture as well. And as we will see briefly later on, these relationships have been argued to be central in some MP work (e.g., see Bošković 2002, Grohmann 2003, both of whom stipulate that A-movement is T-to-T and A'-movement is C-to-C).

However, it is important to note that the sketch above regarding Chomsky's view of chains and contexts overlooks a key feature of his view. For Chomsky, "contexts" are not simply the local label of the element that a moving α relates to, but rather the entire structure derived up to that point.
So, there would on his view be no issue arising in terms of distinguishing the contexts in CH = {⟨α, C'⟩, ⟨α, C'⟩}, since the contexts would always be unique (they are distinguished by the differences in the structure they dominate). I will turn to this in a moment.

The suggestion here (as sketched in §1.1 & §1.2) is that the natural relationship is not between a moving element and these various intermediate positions which just happen to share the super-category specifications of the sought-after target landing site; rather, the natural relationships are between the contexts themselves. That is, the natural basic relations that the "moving" element can/should be understood to enter into are the substantive "core" licensing properties (e.g., wh, ?/?, ?, etc., what Fukui & Speas 1986 called "Kase" properties). The generalizations about the "movements" other than these are most elegantly and naturally stated by positing direct relationships between the contexts. For this, we need only to specify a notion of like/unlike within an architecture where such differences could matter for derivation and representation. This is the aim of TCG.

Our sketch of the TCG approach to SCM presupposed that the "contexts" which can be identified (resulting in lowering) were understood to relate to the "moving" element in terms of a local dominance relation. Note that Chomsky's suggested view above states things in terms of sisterhood. Below I will show that the sisterhood view can't support what we would require of it within the TCG approach, and that we in fact require the relevant relation to be motherhood/domination. However, before heading down that road it is of some interest to probe Chomsky's discussion of movement chains a bit further to consider two important technical issues.
In particular, the notation used above to mark the SCMs in (70) does not accurately capture one of the versions of chains discussed in Chomsky (1995), though as we will see, it seems to be demanded by the more recent work proposing that derivations work by phase (depending, as we will see, on how we view "spell-out"; our WS/O-distinction turns out to be helpful in this respect, see below).

The two important issues involve (i) how we understand what contexts are, and (ii) how contexts are tracked/connected in the course of the derivation. Specifically, Chomsky (1995:300) makes the following remarks about (72) (= his (88)):39

(72) We are likely [t_3 to be expected [t_2 to [t_1 build airplanes]]]

He writes:

Here the traces are identical in constitution to we, but the four identical elements are distinct terms, positionally distinguished [..]. Some technical questions remain open. Thus, when we raise α (with co-constituent β) to target K, forming the chain CH = (α, t), and then raise α again to target L, forming the chain CH' = (α, t'), do we take t' to be the trace in the position of t or α of CH? In the more precise version, do we take CH' to be (⟨α, L⟩, ⟨α, K⟩) or (⟨α, L⟩, ⟨α, β⟩)? Suppose the latter, which is natural, particularly if successive-cyclic raising is necessary in order to remove all -Interpretable features of α (so that the trace in the initial position will then have all such features deleted). We therefore assume that in [(72)] the element α in t_1 raises to position t_2 to form the chain CH_1 of [(73)], then raises again to form CH_2, then again to form CH_3.

(73) a. CH_1 = (t_2, t_1)
b. CH_2 = (t_3, t_1)
c. CH_3 = (we, t_1)

CHAINS are ordered pairs of contexts, where a particular context for a given element α is understood to be its sister or co-constituent.
If we take this to mean that the "context" is the entire structure dominated by the sister element, then the more complete versions of the relevant objects in (73) for the derivation of (72) are those in (74):

(74) a. CH_1 = (⟨we, [to we [build airplanes]]⟩, ⟨we, [build airplanes]⟩)
b. CH_2 = (⟨we, [to be expected [we [to we [build airplanes]]]]⟩, ⟨we, [build airplanes]⟩)
c. CH_3 = (⟨we, [are likely [we [to be expected [we [to we [build airplanes]]]]]]⟩, ⟨we, [build airplanes]⟩)

39 Chomsky's example in the discussion referred to in the text used the token "we are likely to be asked to build airplanes". I have switched out ask for expect in my discussion here. It seems clear from the context that Chomsky intended to have a passivized ECM verb in this example, as pointed out to me by Howard Lasnik.

Below I will suggest that we adopt this idea regarding contexts, but reject the view of contexts understood as the entire structure up to the relevant step of derivation.

The "technical questions" Chomsky raises in the quoted passage above amount to the choice between the following two options regarding how contexts are connected in the course of derivation. From the derivational stage depicted in (75), we can consider the result of the next movement of we to be (76)a or (76)b:

(75) ..[to be expected [we [to [we [build airplanes]]]]] [Figure: arrow labeled MOVE]
(76) a. [we [to be expected [we [to [we [build airplanes]]]]]] [Figure: arrow labeled MOVE]
b. [we [to be expected [we [to [we [build airplanes]]]]]] [Figure: arrow labeled MOVE]

It is interesting to note that if the technical option Chomsky pursues ((76)a) is correct, then we appear to have cases for which the concepts of MOVE and CHAIN would be dissociable; compare (76)b, where any notation for the resultant chains would transparently recapitulate the derivational history of movement (see also (76)b' below for illustration). These two options turn out not to be equally compatible (at least not equally straightforwardly compatible) with derivation by phase.
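The difference between the two options in (76)a and (76)b comes down to where each newly formed chain is tailed. The short Python sketch below makes that choice explicit; it assumes, for illustration only, that positions can be represented by simple labels listed from the base position upward, and the function name `chains` and its `tail` parameter are hypothetical.

```python
def chains(positions, tail="base"):
    """Build the chains formed by successive movements of one element.

    positions: the element's successive positions, base position first.
    tail="base":     each new chain is tailed at the base position,
                     the option Chomsky adopts (cf. (73) and (76)a).
    tail="previous": each new chain is tailed at the preceding landing
                     site, so chains mirror movement steps (cf. (76)b).
    """
    if tail not in ("base", "previous"):
        raise ValueError("tail must be 'base' or 'previous'")
    base = positions[0]
    return [(landing, base if tail == "base" else previous)
            for previous, landing in zip(positions, positions[1:])]


path = ["t1", "t2", "t3", "we"]  # positions of 'we' in (72), lowest first
print(chains(path, tail="base"))
print(chains(path, tail="previous"))
```

With `tail="base"` the result reproduces the three chains in (73); with `tail="previous"` every chain links two adjacent positions, so the chain notation and the movement history coincide.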
Before turning to this point about chains and phases, note that there is at least one other technical option which Chomsky does not consider. This third option would regard movement as extending the initial (derivationally prior or "older") chain and forming a new one, as in (76)c (in contrast, Chomsky's version creates a new base-position-tailed chain, leaving the older one intact):

(76) c. [we [to be expected [we [to [we [build airplanes]]]]]] [Figure: arrows labeled MOVE, NEW CHAIN CH_2, OLD CHAIN CH_1]

The next step on this alternative view would involve the formation of a third chain (CH_3) and a kind of stretching of the previous two chains, as in (76)c':

(76) c'. we [are likely [we [to be expected [we [to [we [build airplanes]]]]]]] [Figure: arrows labeled MOVE, NEW CHAIN CH_3, OLD CHAIN CH_2, OLDER CHAIN CH_1]

Contrast this with the next step for (76)a (Chomsky's approach) in (76)a':

(76) a'. we [are likely [we [to be expected [we [to [we [build airplanes]]]]]]] [Figure: arrow labeled MOVE]

Note that on both Chomsky's technical view ((76)a/a') and the alternative ((76)c/c') we can understand move and chain as dissociable to some extent, compared to (76)b, where the relevant movements and chains are essentially the same, as mentioned above; consider (76)b' in this regard:

(76) b'. we [are likely [we [to be expected [we [to [we [build airplanes]]]]]]] [Figure: arrows labeled MOVE and CHAIN]

What is the difference between the choices in (76)a-c? One obvious point is that only (76)b/b' seems straightforwardly compatible with the notion of spell-out by phase of Chomsky (1999) (this is the notation used informally above to illustrate our basic successive-cyclic A- and A'-movements).

Phase theory, as introduced above, is a recent version of the general idea of cyclic domains for rule application. In Chomsky's recent work the suggestion is that CP and vP (and possibly others) constitute special domains which, upon derivational completion, require that their complement domains be shunted/transferred to the interpretative systems for evaluation.
Above we suggested a way of viewing these transfer steps of derivation within our workspace/output structure distinction. Consider, however, the impact that viewing spell-out/transfer as a literal "handing-over" or "removal" of sub-parts of structure has on our discussion of chains in Chomsky's terms.

We can make the point with reference to the case of successive cyclic wh-movement. For this example, we will consider only C^0 as constituting a relevant phase-inducing category, as the point regarding the formal shape of chains remains the same even if we consider additional narrower domains for spell-out (like v/VP). The relevant structures and chains would look like (77) for successive cyclic wh-movement on Chomsky's view, and like (78) on the alternative discussed above:

(77) What [did Dave think [ _ [that Mary believed [ _ [that John liked _ ]]]]]? [Figure: chain links]
(78) What [did Dave think [ _ [that Mary believed [ _ [that John liked _ ]]]]]? [Figure: chain links]

But suppose that we understand phases (here: CPs) as spelling out their complement domains upon reaching the next higher phase-inducing head, as Chomsky (1999) proposes. Technically, the proposal embodied in Chomsky's PHASE IMPENETRABILITY CONDITION (PIC) has it that when the next highest phase-inducing head is reached, all of the substructure constituted by the previous phase head's complement domain spells out, leaving a residue (roughly equivalent to the "checking domain" of Chomsky's earlier proposals; see Chomsky 1993). This means that the first movement of the wh-element in our example will have its "head" visible and will therefore be able to be moved to the next CP, as this element will occupy the "edge" of the previous phase. But what happens to the initial chain when the substructure containing its "tail" spells out?

(79) a. [what [that John liked what]] (C-PHASE)
b. [that Mary believed [what [that John liked what]]] (C-PHASE; SPELLOUT???)
c. [that Mary believed [what [that John liked what]]] (BROKEN CHAIN???; C-PHASE)
d.
[that Mary believed [what [ ... ]]] (C-PHASE)
e. what [that Mary believed [what [ ... ]]]
f. etc. (by phase)

The spell-out by phase idea, regardless of the grain or size of structures considered to constitute such phase domains, is not straightforwardly compatible with the view of chains in (77), nor with the variant introduced above in (78).40 The problem with both of these conceptions is that they involve the postulation of a syntactic relationship which is maintained to the base position, but on the derivation by phase view these lower positions are understood to be in some sense "absent" at the relevant later stages of derivation in virtue of the spell-out operation. Even the idea of having a composed or linked chain appears not to make any sense on the derivation by phase view, as there is no stage of derivation over which we could describe such objects. We appear either to need chains to be non-syntactic entities, e.g., objects of the interface system (maybe plausible), or to regard chains as objects somehow superimposed over a dynamic derivational history (i.e., still "syntactic" but "higher order").41

40 It could be that chains are "real", and work as suggested in Chomsky (1995) (as discussed above), but that they are fundamentally not "syntactic objects". Maintaining the reality of "chains" in a derivation-by-phase architecture appears to require the postulation of a kind of cross-dimensional object that exists across sub-stretches of syntactic computation and the interpretative components, or that chains are fundamentally objects of the interpretative system(s) which the syntax in some sense creates but cannot itself handle/manipulate (the system only sees particular elements α which can merge and remerge).
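The "broken chain" worry in (79) can be stated very simply in procedural terms. The Python sketch below is an illustration under assumptions introduced here, not a claim about how any of the cited proposals are implemented: positions are plain labels, a chain is a pair of positions, and the hypothetical helpers `definable` and `visible_part` check a chain against either the currently accessible structure alone or, for comparison, against a retained output structure of the sort the workspace/output distinction mentioned above provides.

```python
def definable(chain, structure):
    """A chain can be stated over a structure only if every one of its
    positions is present in that structure."""
    return all(position in structure for position in chain)


def visible_part(chain, workspace):
    """The positions of a chain that fall inside the current workspace."""
    return tuple(position for position in chain if position in workspace)


# Toy labels for stage (79)d: the complement of the lower C has spelled out.
output = ["that_2", "Mary", "believed", "Spec,CP_1",
          "that_1", "John", "liked", "Obj"]
complement_of_lower_C = {"John", "liked", "Obj"}
workspace = [p for p in output if p not in complement_of_lower_C]

first_chain = ("Spec,CP_1", "Obj")  # head at the phase edge, tail in situ

print(definable(first_chain, workspace))     # literal removal of structure
print(definable(first_chain, output))        # output structure retained
print(visible_part(first_chain, workspace))
```

If spell-out literally removes structure, the first chain cannot be stated at the later stage because its tail is gone; if the full output is retained and the workspace is merely a window on it, the chain remains definable while only its head is currently visible.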
41 Note as wel that it is not entirely clear how to maintain chains as syntactic objects on a phase-based view that adopts as wel the view of contexts as the entire structure of derivation up to the relevant point (i.e., everything dominated by the sister/context for ?). The view of contexts as the entire structure to which ? relates sems to require that we can refer to such structures ? but if portions of such contexts are dynamicaly shunted/transfered by phase, its not obvious how this should work. At best, contexts could be defined down to the previous phase-inducing head, and not below. These are interesting consequences it sems to me. Put another way, suppose we take the conditions we want to hold of chains to be syntactic conditions. If we can't refer to chains themselves (since there are no structural contexts over which we can capture the relevant relationships in the multiple spel out view), this means, for example, that 41 Se Uriagereka (198) for a discusion of such a view. 76 whatever properties are asociated with the wh-element in virtue of having entered into the complement (?) position of the embedded verb like in our example above, these must be somehow maintained as properties of the element itself (e.g., ?-marking of and element ? could be understood as ? receiving or being marked somehow with a ?-feature ? se Hornstein 2000 for an extensive development of this approach). However, note that the workspace/output distinction as we have introduced it sidesteps these isues. I mentioned above that we might deploy our view to avoid isues that might arise regarding pre-/post-spel-out coherence. This is exactly such a situation. Consider again our earlier schema (i.e., the bottom-up version), repeated here in (80): (80) However it is that we might choose to view CHAINS, this schema alows us to straightforwardly maintain the overal coherence of the derivation in virtue of maintaining the output structure in the way pictured above. 
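The contrast between literal transfer and the workspace/output distinction can be sketched computationally. The Python fragment below is a toy illustration under stated assumptions: positions are plain strings, the names `Derivation`, `build`, `spell_out` and `derive` are hypothetical, and only the lowest C-phase of the running example is modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Derivation:
    """Toy bottom-up derivation with an output structure and a workspace."""
    output: list = field(default_factory=list)     # every position built so far
    workspace: list = field(default_factory=list)  # positions visible to syntax

    def build(self, position):
        # New structure enters both the output and the active workspace.
        self.output.append(position)
        self.workspace.append(position)

    def spell_out(self, domain, literal_transfer):
        # Spell-out always shrinks the workspace; under literal transfer
        # (an assumption of this sketch) the domain also leaves the output.
        self.workspace = [p for p in self.workspace if p not in domain]
        if literal_transfer:
            self.output = [p for p in self.output if p not in domain]


def derive(literal_transfer):
    d = Derivation()
    lower_domain = ["liked", "what@object"]  # complement domain of the lowest C
    for p in lower_domain + ["that@C-low", "what@edge"]:
        d.build(p)
    for p in ["believed", "Mary", "that@C-next"]:  # next phase head is reached
        d.build(p)
    d.spell_out(lower_domain, literal_transfer)
    return d


chain = ["what@edge", "what@object"]  # head and tail of the first movement step

for literal in (True, False):
    d = derive(literal)
    visible = [link for link in chain if link in d.workspace]
    retained = [link for link in chain if link in d.output]
    print(f"literal_transfer={literal}: workspace {visible}, output {retained}")
```

On this sketch, literal transfer leaves the tail of the chain in no representation at all, whereas retaining the output keeps both links available even though only the head remains in the workspace.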
Thus the initial and subsequent movement pictured in (81)a/b could be seen to yield any of the objects in (81)c-e, depending on how we sort out Chomsky's two technical options ((81)c and (81)e) or our additional one ((81)d):

(81) a.-e. [Tree diagrams not recoverable; (81)c, (81)d and (81)e are presented as alternatives ("OR").]

So we have a format available for considering all of the possibilities discussed above regarding how contexts are tracked/connected throughout the course of a derivation. Moreover, this view is neutral as it stands on the question of whether we define contexts as just the sister-label, or in terms of the entire structure dominated by the sister/co-constituent. I will return to this issue in a moment.

This underscores again the generality of the workspace/output structure distinction and the new idea it brings to discussions of MSO-systems. It helps us here because it points to a way of conceiving of spell-out which does not involve a literal "handing-over" of structure from the syntax to the interface systems, as spell-out is sometimes characterized informally. Or, rather, the WS/O-distinction offers a concrete formulation of the content of "handing-over"/"transfer" under which the technical questions raised above do not arise. So whatever relations we establish as part of the syntax can still be "there", but simply not within the active stretch of syntactic computation. This makes it possible to conceive of chains in any of the ways pictured above, with potential stages of derivation that might have workspaces in which only parts of a given chain might be visible. This is another instance of the WS/O-distinction providing a way to understand pre-/post-spell-out coherence.⁴²

⁴² McGinnis (2004:64fn18) raises the issue of how c-command is supposed to be understood as holding across phases. For example, if c-command is understood as derivational in the Epstein et al. (1998) sense, it's not obvious that when α and β merge they come to c-command everything each other dominates if some of the derivationally previous domination relations are literally no longer "there" in the narrow syntactic computation. Her exposition presupposes the intuitive notion of "handing-over"/"transfer", which is why her raising this question makes sense. Again, these issues do not arise given our WS/O-distinction.

Let us now put this discussion back together with Chomsky's idea that chains are fundamentally connections between contexts. Take the unshaded nodes below to be the sister or co-constituent elements defining the contexts (the "chain") for the moved element α (occurrences of α are represented by the shaded nodes).

(82) [Diagrams not recoverable.]

Viewing these context elements as independently relating along the dominance sequence (a relationship that is "there" in any event, whether we view it as manifesting a "dependency" relationship or not) yields the following three possibilities in (83), corresponding to those in (82):

(83) [Diagrams not recoverable.]

We can now dispense with the extra-structural arcs, yielding:

(84) [Diagrams not recoverable.]

If the relationships between context elements are in some sense independent of the element(s) that relate to them (i.e., that "move through" the positions they in part define), then we might consider the possibility that a given element α might relate only once to such chain structures, perhaps targeting different parts of such complexes.

We might also entertain a different conception of contexts, in two senses. First, as pointed out by Chomsky (1999) and Lasnik (2000), there are two possible relations on a merge-based view that might be relevant for implementing the context-view of chains. So far we have considered only sisterhood/co-constituency, but there is also motherhood or immediate domination/containment. We will require this latter conception (see below).
Second, we have the following possible difference between two ways of understanding what contexts actually are, which is independent of the sisterhood/motherhood distinction:

(85) CONTEXT=LABEL    CONTEXT=ENTIRE STRUCTURE⁴³
     [Diagrams not recoverable.]

On the righthand side we have a picture of contexts as suggested in Chomsky's example (see (74) above). This is a view cast in hierarchical terms that is similar in spirit to Chomsky's (1955) definitions of contexts in terms of strings (where occurrences of an element α are uniquely defined by the left-to-right content of a string up to a given occurrence, picking up on a notion present in Quine (1960) for formalizing variable occurrences in logic). On this view, we cannot exploit the possibility of having the system be "unable to distinguish" between contexts, since contexts are derivationally unique. On the lefthand side, however, we have a descriptively less powerful view which identifies contexts by just the local label of α's sister (or, perhaps instead, α's mother).

⁴³ Howard Lasnik (p.c.) points out that this distinction is technically between two different ways of conceiving of labels: either as local head/phrase information or as encoding the "entire derivational history" up to the relevant point where α is integrated (whether on "first" or some subsequent re-merge). I will retain the notion of label for the local category/feature information view, using the notion of the "entire structure/derivational-history" to refer to Chomsky's (1995) view.

These two views shake out somewhat differently if derivations work top-down. For example, if α is initially integrated in the top-most position, and particular points of derivation are what is relevant for identifying contexts as in Chomsky's view, then α's initial context will be just the sister or mother node characterizing this initial position. The natural extension to the top-down view of the idea of contexts as the entire derivation up to the relevant point where α
is integrated (or remerged) would then see intermediate positions as identified by all of the structure that dominates them. However, the local-label view remains the same on a top-down view. That is:

(86) CONTEXT=LABEL    CONTEXT=ENTIRE STRUCTURE    "TOP DOWN"
     [Diagrams not recoverable.]

It is the weaker notion of contexts (viewing them as simply the local label, and not the entire structure up to the relevant point of derivation) that our view of SCM requires. This could perhaps be motivated on minimalist grounds appealing to simplicity and locality: the local label view does not require that we keep track of arbitrary stretches of derivation in order to keep track of occurrences of a given element α. However, the suggestion here is actually a bit stronger than this. That is, rejecting the descriptive power inherent in the "entire structure" view of contexts yields a system that is weaker in precisely the way that we require to understand SCM. It is, in fact, another way of stating the key idea being developed here to say that it is because contexts are narrowly/locally defined that situations can arise where they are not unique, and it is this state of affairs that underwrites SCM-type relationships (that is, situations in which contexts in adjacent domains cannot be uniquely identified).

So, to sum up, we adopt a context-based view of movement chains, but limit the defining contexts to just the immediate local relationships (here: dominance relations realizing feature-licensing connections). In addition, we might consider the possibility that our three technical options regarding how contexts are connected to each other in the course of the derivation are actually not technical/theoretical options for characterizing a single sort of relationship, but rather three different species of chains: different constituency structures of chains, if you like (see Uriagereka 1998:399 for a related discussion).
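The difference in descriptive power between the two conceptions of contexts on a top-down derivation can be made concrete with a small sketch. The Python fragment below is illustrative only: the encoding of the dominance path as a list of category labels, the particular C-T-v-V sequence, and the function names are assumptions introduced here, not part of the TCG formalism.

```python
def label_context(path, i):
    """Context as local label: only the label of the node at index i."""
    return path[i]


def structure_context(path, i):
    """Context as entire structure (top-down): all labels down to index i."""
    return tuple(path[: i + 1])


# Dominance path for three stacked clauses, topmost node first (assumed).
path = ["C", "T", "v", "V"] * 3
c_nodes = [i for i, label in enumerate(path) if label == "C"]

label_contexts = [label_context(path, i) for i in c_nodes]
structure_contexts = [structure_context(path, i) for i in c_nodes]

print(len(set(label_contexts)))      # number of distinct label contexts
print(len(set(structure_contexts)))  # number of distinct structure contexts
```

With contexts taken as local labels the three C positions collapse into a single context, while the entire-structure contexts stay pairwise distinct. It is the first, weaker behavior that the discussion above takes to underwrite SCM-type relationships.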
Recall from our discussion of some schematic TCG derivations for A'- and A-relationships in §1.2 that we pointed to a "grouping" defined by stretches of agreeing properties on the dominance ordering, particular stretches of the path with shared feature values. In particular, we pointed to the following:

(87) A'-RELATION    A-RELATION
     [Diagrams not recoverable: dominance paths through C, T, v, with feature valuations marked on WH and D.]

In Chapter Three I will suggest that these ideas about co-valued properties along the dominance path can in fact be helpfully viewed as a kind of "chain" constituency. Note that the verbal projection path in the A'-relation in (87) manifests a grouping of the sort seen in the middle schema in (84), repeated here in (88) with some possible descriptive labels suggesting a typology of relationships that I will return to.

(88) Binding/Control    A'-relations    Some A-relations (connections between A-relations)
     [Diagrams not recoverable.]

It will be beyond the scope of the present work to pursue these divisions (and the possibility of others, perhaps) in great detail, but in the course of developing some analyses I will again return to these schemas to point out some of the patterns which emerge on the specific implementation of the general TCG view being proposed in this work (see our concluding discussion in Chapter Three).

Let us return now to some technical possibilities regarding the node-identification process that I suggested might be useful in understanding SCM. This will lead us to a discussion regarding labels and structure and a particular set of assumptions regarding these concepts that I will be adopting here. Recall the following schema from above (repeated here as (89)), in which we suggested that the structural context for a "moving" element might be understood (under some relevant matching relation) to collapse/become identified with some lower like element. The suggestion was that such identification results in the equivalent of lowering.

(89) a.-e. [Diagrams not recoverable; the steps are labeled MATCHING RELATION, IDENTIFICATION, CONTRACTION.]

Although we did not mention this earlier (as we had not yet discussed the notion of chains and contexts), this view demands that the relevant contexts for α be understood in terms of motherhood. Note what happens if we view the relevant matching to occur with the sister element of the shaded node, as in (90):

(90) a.-e. [Diagrams not recoverable; the steps are labeled MATCHING RELATION, IDENTIFICATION, CONTRACTION.]

The matching of the open nodes would either result in no equivalent of "copying", so that the structure would simply reduce as pictured in (90)d/e, or we would have to entertain the idea that the open/unshaded and shaded nodes in (90) can instantiate a sisterhood relation which is independent of any dominance relationship, allowing us to extend the logic we introduced above regarding dominance to effect a similar "lowering". But it's not clear how this latter view would work. To see what I mean by needing an independent sisterhood relation, consider (91):

(91) a.-b. [Diagrams not recoverable.]

In order for the lower open node to have been introduced, as shown in (91)b, it must already have a sister. So it's not clear how the sisterhood relation could support anything like the "lowering" operation we have been considering as a possible basis for approaching SCM phenomena. The picture in (90) above still remains a possibility that would be of interest (recall this is the same as the picture we initially offered to introduce the notion of contraction; see example (6)), but this would yield nothing like a "movement" relation, as it would not cause the shaded node to enter into any new dominance relationships in the output (or in the workspace). Below we expand on the reduced structural descriptions that were appealed to in our sketch of SCM analyses earlier on; these structures essentially deny the existence of linguistically significant "sisterhood" relationships.
In addition to the technical problems for sisterhood just raised, our independent assumptions about structure will thus be seen to rule out the very possibility of identifying contexts in this manner (only the motherhood/dominance-type relations will be available in principle).

Let us consider some possible ways of viewing the upper context of α which depend on different conceptions of phrase-internal projection distinctions, about which we have so far said nothing. Here are two familiar ones:

(92) a. [XP α{?F} [X' X⁰ .. ]]
     b. [XP α{?F} [XP X⁰ .. ]]

In (92)a we have a traditional view positing three different phrase-internal projection types: the head (X⁰), the intermediate (non-minimal/non-maximal) X', and the maximal XP. In (92)b we have the view that specifiers are in fact adjunction structures (in the formal sense of May 1985, 1991; Chomsky 1986, as proposed, e.g., in Kayne 1994). We can note right away that the specifiers-as-adjunctions view will be difficult to render consistent with the intuition behind the matching relation as we have so far been hinting at it, that is, the general idea of understanding the regulation of the size of the active workspace in terms of recursion (repeats of like elements). The general idea of TCG as we have been developing it is that the syntactic workspace cannot tolerate multiple tokens of a given type X, and that because of this limitation situations arise in which the workspace might either contract to remove one of the offending like elements from the workspace, or it might simply be unable to distinguish the two, resulting in the sort of collapse/identification sketched informally above. On this intuition (that recursion in structure matters for regulating the maximal expansions of the syntactic workspace) it's unclear how there could be categories divided into segments of the adjunction sort.
There may be technical ways of working with such structures to implement the TCG intuition regarding a workspace limited to representing single tokens of given types, but I will not pursue this possibility here.⁴⁴

⁴⁴ For example, we might find technical justification for distinguishing the segments of adjunction structures based on feature values, an idea that I make use of in a different way in later discussion.

Note that the traditional view involving intermediate-level categories in (92)a above is appealed to in TAG-theoretic derivations. Recall from above the general schema for the TAG equivalent of non-local movement relations:

(93) a.-c. [Tree diagrams not recoverable: XP structures in which α{*F} combines with X' above a trace of α, with an additional X'-to-X' stretch of structure adjoined.]

Can we appeal directly to this idea such that TCG would simply involve the inverse of TAG-adjoining to shrink/contract structures? The answer, I will argue, is "no". Viewing the node-identification and contraction of structure as a kind of anti-adjoining will fail to generate the structures that I will argue are needed to understand the interaction between cyclic movement and certain binding-theoretic phenomena. Simply removing or splicing out intervening structure defined by a top- and a bottom-node of the X'-type will not result in the kind of lowering that would result in α being dominated by material below its upper occurrence, though it does succeed in creating new local domains over which α will dominate. This is what I showed above in examples (90) & (91). But the former (getting α to be dominated by its previously 'neighboring' initial dominance domain) is what I will argue to be required to correctly handle the binding facts (discussed below and returned to in more detail in Chapters 2 & 3). To quickly re-illustrate the point, now in specific connection with TAG: having an anti-adjoining procedure (just reversing the "direction" of standard TAG steps of derivation) could yield splicing-out of the sort in (94):

(94) a. [XP α [X' .. Z .. [X' .. ]]]
     b. [XP α [X' .. ]]

But this sort of operation could not result in α being dominated by the intervening element Z in the output structure. Cases where this matters are illustrated with the following examples:

(95) a. John thought pictures of himself/*herself were on sale
     b. Which pictures of himself/*herself did John think were on sale
(96) a. ?John thought Mary sold pictures of himself
     b. Which pictures of himself did John think Mary sold

The self-form within an NP in the embedded subject position (95)a can (and must, as the agreement mismatch shows) be bound by the matrix subject. This self-form can precede the matrix NP if it is within a fronted wh-phrase without loss of acceptability. This could be understood by connecting the analysis for (95)b to that of (95)a by positing a copy of the wh-phrase in the base position (or a trace that can be reconstructed into in some way). However, note that if the phrase containing the self-form is in the object position, the embedded subject must bind it, and nothing higher can (96)a. But given this observation we cannot extend the (96)a analysis to (96)b via positing a trace/copy in the object position of the embedded verb, since we've just seen that binding by the matrix subject is not possible with the self-form in that position. But if there is an intermediate movement, for example to the top edge of the embedded clause, nothing intervenes between the self-form and the matrix subject, thus opening the possibility of keeping the view of binding constant across these examples. That is, (96)b could be seen to involve the following partial representation:

(97) [Which pictures of himself] did John think [CP [wh .. himself] [Mary sold [wh .. himself]]]
     (C-COMMAND: from the matrix subject John to the intermediate copy)

Crucially, in order for this line of analysis to work, the relevant intermediate copy/trace has to be c-commanded by the matrix subject, as pictured above. But we've just seen above that this is not what the TAG derivation provides.
In our schema (94) above what we need is for the XP to somehow end up under intervening elements like Z, but this is not what we get either with TAG's adjoining or with a possible inverse operation that would otherwise be consistent with the views being developed here (e.g., contraction of X' elements).

In addition to these technical/conceptual and empirical concerns there is also the following worry, which is similar to the concern raised above for adjunction-type structures. Do we have reason to think that XP and X' are distinct, such that X' would not interfere with the XP-XP relationship that we seem to need to support context-identification and the consequent "lowering" of elements? If XP and X' are distinct, and the relevant matching/identification could involve XP, then things could work for SCM as I sketched them above. But this would require that we refer directly to the equivalent of "bar-level" specifications to get the technical details of this view of movement off the ground. Another possibility is that X' elements are (i) there in the structure, (ii) non-distinct from XP, but (iii) for some reason "invisible". I return to this issue below, but note here that I will not be pursuing this line of thinking.

I will instead be adopting a view here based on a different conception of categories and structure which does not admit the possibility of an X'/XP distinction in the first place. This view allows us to sidestep a number of these technical issues and problems, and to abstract away from other issues that will not be of central interest. The view I will be working with is drawn from one of a few interesting recent minimalist investigations aiming to reduce the available range of distinctions in the theory of phrase structure that analysis can appeal to. It is arguably more consistent than the salient alternatives with the general intuition about the workspace not tolerating "like elements", as we will see. I turn to these matters directly.

1.5.2. Labels & Structure

Let us consider two rather different ways of simplifying structural descriptions with respect to structure and category that have been suggested in the recent literature. First, Collins (2001) has suggested that we might head towards a theory in which label distinctions are eliminated as marks on derived structure, retaining this information as a designation only for the ultimate parts of structures (i.e., the terminal elements). So instead of the sort of object from Chomsky's (1994) Bare Phrase Structure (BPS) in (98)a, where the underlined occurrence of the symbol 'α' is taken to be the label of the merge-derived complex {α, β}, we have rather (98)b, which encodes this label information only for the ultimate parts of the structure:

(98) a. {α, {α, β}} (the first, underlined, α being the label), or as a tree: a node labeled α immediately dominating α and β
     b. {α, β}, or as a tree: an unlabeled node immediately dominating α and β

Collins' approach is quite interesting, but it will not support the key idea I am aiming to develop here regarding dominance-encoding of chain-information and the node-identification procedure that I suggest as relevant for successive cyclic movement. This is so because Collins' system does away entirely with the relevant label markings on derived structure, so the system does not make available the formal means to express the general idea underlying TCG.⁴⁵

⁴⁵ This may be hasty, but at present I do not see a clear way to begin articulating the TCG system I am developing here within Collins' assumptions.

However, others have pursued somewhat similar attempts at reducing the distinctions available for principles to refer to, in the pursuit of eliminating redundancies and (perhaps) thus increasing restrictiveness. For example, other such label-reducing/-eliminating kinds of moves have been suggested as follows:

(99) a. [XP [ZP [Z' Z⁰ ]] [X' X⁰ [YP [Y' Y⁰ ]]]]
     b. X ("head") immediately dominating Z ("specifier") and Y ("complement")
(100) [< α β] (the pointer '<' marks the dominance line leading to the head)
Brody (2000, 2003) suggests a reduction in the available distinctions for capturing phrase-structure generalizations, eliminating the structural distinctions in (99)a in favor of the sparser (99)b. This is certainly a label-elimination approach, but different from what Collins pursues. Stabler (1999) suggests something along the lines of what Collins proposes, with the minimal difference of including a pointer which indicates the asymmetry of projection (i.e., which dominance line constitutes the link to the head of a given combination, as in (100)). Collins rather offers an inventory of principles which conspire to yield the results that labels are typically meant to encode (which, if correct, would eliminate the need for any such 'pointer' indicating the head of the structure). Stabler's view, as far as I can see, wouldn't support the system I am elaborating here either, for basically the same reason that we cannot deploy Collins' approach.

However, there is a general question about all of these approaches that is worth raising: What is going on here? Eliminating primitive (intrinsic/non-relational) bar-level distinctions is not a new idea in the theory of phrase structure. This was present in the work of Muysken (1983), who adopted a relational conception of these distinctions (specified in terms of coherency conditions on the projection distribution of features like [±maximal] and [±project]), and this general relational conception is modified and adopted in Chomsky's (1994, 1995) BPS. But what is being suggested here is an elimination of the distinctions altogether.⁴⁶ The Collins/Stabler approach differs from Brody's in which direction we understand the elimination (better: reduction?) to work if we consider a mapping/transition from the typically assumed sort of structure to each of these proposed conceptions.
That is, for the standard view of the head-complement unit in (101), we have the following two alternative conceptions, differing on what is retained in the model.

(101) [Diagram not recoverable: the standard head-complement structure mapped onto Brody's representation on the one hand and onto the Collins/Stabler representation, with its optional pointer (<), on the other.]

Of these two ways of thinking, Brody's approach might seem at first blush to be more radical. The Collins/Stabler approach retains the part-whole structure central to the last half century of work in generative grammar (indeed, to most if not all of the entire history of thinking about language structure generally!) by maintaining the head/phrase distinction in structural terms while retaining labels only for heads. Brody clearly intends to stay within this tradition as well, though his view of basic phrasal structures, he notes, is intended to eliminate as well "the apparent conflict between the long tradition of dependency theories" and "phrase structure theories of syntactic representation".⁴⁷ The issue of doubling labels in the projection relation between a head and its dominating phrasal node(s) doesn't arise, as he has simply removed the distinction entirely, allowing only a single node (so, only a single label).

⁴⁶ Or, perhaps more accurately, relocating the conceptual/empirical burden borne by these notions onto the backs of other (hopefully independently required) ones. I refer the reader to both Collins' and Brody's discussions.

⁴⁷ Brody (2003:16). The "conflict" that Brody alludes to is perhaps not immediately obvious, but I think it can be unpacked as follows. It is true that classical dependency theories (e.g., Tesnière 1959) and more recent, conceptually similar approaches (e.g., Hudson's (1984) Word Grammar, among others) deploy somewhat different sorts of notations and differ at least superficially from phrase-structure/constituency-based approaches in their mission statements (and there are approaches which appear to fall in both camps; Steedman's Combinatory Categorial Grammar (CCG) strikes me as one such approach). But looking at current PS-based views of structure/category, it's not obvious that there is any conflict. However, there is a difference that can be detected in the gradual historical shift from the initially deployed rewrite rules of Chomsky's early work (a pure PS-based approach) to his recent BPS. The shift has revolved almost entirely around the increasingly central role of headship/endocentricity. Phrase-structure rules on their own require no particular category matching between their left- and right-hand sides (e.g., X → Y Z). The recognition of generalizations stateable in terms of positing special members of local part-whole structures to play the role of determining the overall type of the local structure (headship) formed the basis of X-bar theory. Among all the notions that were subsequently introduced under this general umbrella (e.g., cross-categorial harmony, uniform bar-level limitations, etc.; see Jackendoff 1977, Emonds 1985), only the key notion of headship appears to have survived in recognizable form within current thinking (see Speas 1990, Chomsky's 1994, 1995 BPS, and Chametzky 1996, 2000 for some related critical discussion).
Brody's reduced structures (and, I think, Collins' as well) can be seen as attempting to remove the last barrier between the approaches, collapsing (almost) entirely the idea of formal ordering properties characterizing structure and substantive "dependency" relationships that can be understood to "live on" these dimensions. The question remains as to whether we need anything more than a single dimension. My suggestion here is that as far as narrow syntax goes we do not. This is essentially the claim that all we need is branching sequences.

However, it seems clear that Brody's view of structure and categories can be understood to retain part-whole/constituency information via the antisymmetry of dominance relationships. Traditional heads can be understood as separate units by referring simply to a single labeled node (though see below regarding heads and PF); traditional phrasal constituents are captured via dominance as in standard approaches. The possible exception to any such straightforward mapping from standard approaches is any "junctures" involving a head with a specifier and complement. Consider (99) again, repeated here:

(102) a. [XP [ZP [Z' Z⁰ ]] [X' X⁰ [YP [Y' Y⁰ ]]]]
      b. X ("head") immediately dominating Z ("specifier") and Y ("complement")

For a given sequence of head-complement relationships, dominance ordering allows us to refer to either individual nodes or to principled subsequences that respect traditional constituency (take '→' to be a dominance link in what follows, with a left-right direction on the page indicating the standard antisymmetry of this relation):

(103) A→B→C→D→E
      a. D→E
      b. C→D→E
      c. B→C→D→E
      d. A→B→C→D→E

But, for example, we might take B→C not to be a constituent, since there is material which both B and C dominate. It is perhaps less clear what to say about branching in this system with respect to constituency, for example (take this to be the same object as (102)b above):

(104) X immediately dominating Y and Z

The straightforward view would say that Y (and all it dominates) is a unit, as with Z (and all it dominates), but X is not an independent unit. If it were, then why not X→Y excluding Z? Or X→Z excluding Y? But I just said above that we might regard each individual node as being a separate unit in the sense of "independent head". If we are collapsing the head/phrase distinction, how are these matters resolved? Two separate lines of discussion are relevant here.
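Before taking up those two lines of discussion, the notion of constituency over a non-branching dominance sequence illustrated in (103) can be made concrete with a minimal sketch. The Python fragment below rests on assumptions introduced here for illustration: the sequence is encoded as a list of uniquely labeled nodes, topmost first, and the hypothetical function `is_unit` treats a stretch as a unit only if it is a single node or runs all the way to the bottom of the sequence. Branching cases like (104) are deliberately left out.

```python
def is_unit(sequence, stretch):
    """True if `stretch` is a single node of `sequence`, or a contiguous
    run of nodes that reaches the bottom of the dominance sequence."""
    n, m = len(sequence), len(stretch)
    if m == 0 or m > n:
        return False
    if m == 1:
        return stretch[0] in sequence
    # A longer stretch must coincide with the final m nodes.
    return sequence[n - m:] == stretch


path = ["A", "B", "C", "D", "E"]  # A dominates B, B dominates C, and so on

for stretch in (["D", "E"], ["C", "D", "E"], ["B", "C"], ["C"]):
    print("->".join(stretch), is_unit(path, stretch))
```

On this encoding D→E and C→D→E count as units, as does the single node C, while B→C does not, since B and C both dominate material that the stretch leaves out.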
First: let me return to the discussion at the end of the previous section regarding chains as contexts and specifiers as X'-sisters versus adjunction structures, connecting it now with the possibility of adopting these reduced structural descriptions in our formulation of TCG. Second: there is the (more recent) idea that head-movement relationships might be a "PF" phenomenon.

Regarding the first: there is a general idea that has been floating in the literature that non-minimal/non-maximal (intermediate-level/X') phrasal structures are invisible in some sense for the operations of the syntax. Sorting out this issue, as Chomsky (1995:382n23) observes, "depends on properties of phrases that are still unclear". For example: Kayne (1994) argues from his assumed Linear Correspondence Axiom (LCA) that all specifiers in fact realize adjunction structures; Starke (2000) argues that we dump the notion of specifier altogether, retaining only the notion of head-complement relations. On Kayne's view we could argue that the system cannot refer to intermediate units, since the equivalent element in his system would always constitute segments of a category, which his approach consistently treats as essentially "one thing" (see, e.g., his definition of c-command). But, on the other hand, segments of a category on this view are labeled identically as XPs, so perhaps they can be referred to as independent units (certainly for the case of similar kinds of structures arising with adjuncts/modifiers we want this to be so).⁴⁸

⁴⁸ Whether we take the modifier case to be the same as the "adjunction" case (meaning adjunction now as the C-adjunction of the sort May (1985, 1991) and Chomsky (1986) discuss) depends on whether we take these to work in exactly the same way. On "adjunct" versus "adjunction" see Chametzky (1996, 2000).
On Starke's view, which has it that the analogues of specifiers are understood to be a special case of "heads" in that they project their properties to determine (part of) the label of the dominating structure, the structure corresponding to intermediate projections in standard X-bar theory would be a "visible" unit since it will always be a maximal projection.[49]

[49] This is Starke's notion of "checking". Instead of having ?P with feature {F} enter into a relation with a ?P with the same feature {F}, Starke suggests that ?P simply projects its {F} upon combination with ?P. See Starke's discussion for details.

Epstein & Seely (1999), on the other hand, argue that intermediate projections are real, but that they behave as "fossils" (see Chomsky 1995:382n24), having initially been maximal but losing this status when they are targeted by a merge operation on Chomsky's BPS-relational view of intra-phrasal projection. But, based on the assumption that these elements are no longer visible to the system, Epstein & Seely go on to argue that such elements cannot possibly be sisterhood contexts defining chain-links as suggested in Chomsky (1995), since the relevant elements are by hypothesis invisible; therefore, they conclude, chains cannot exist.

Both Chomsky (1999) and Lasnik (2000) point out that it's not obvious that intermediate invisibility rules out the merge-context view of syntactic chains, as it seems reasonable to take the motherhood relationship to define the local structure identifying chain links (as I suggested above for independent reasons specific to my technical ambitions here). However, one might respond to this suggestion, on behalf of Epstein & Seely, by noting that this just moves the problem around somewhat. On the motherhood view of merge contexts it is true that intermediate movements will now have visible contexts, as they will typically (always?) be dominated by XPs.
But now the base position of a given chain should have an invisible element as its context, since presumably its mother will always be a non-maximal element. Though, this would depend on whether there is a specifier for the head of the base position: if so, then the context will be maximal and hence visible; if not, then it would be intermediate and thus invisible. Note that, as pointed out above, the specifiers-as-adjunction-structures view of Kayne and others might allow us to sidestep these technical problems if we could motivate the possibility of having segments of a category serve as appropriate contexts for the understanding of chains we've been discussing.

It is not, I think, quite clear what is really at stake here. That is, I agree with Chomsky that debate on this subject turns on presuppositions about "properties of phrases that are still unclear". We can elaborate on this point in another general way. Given the explosion of functional categories that has attended the development of the MP, it's an open question for any given element X whether an element which appears to be its specifier is, in fact, rather the specifier of a functional element Y that takes X as its complement. If there is such a Y, then X will be maximal and hence visible; if not, it will be intermediate and hence invisible. The degrees of freedom that the theory makes available for analysis here make it difficult to sort out these alternatives. Note that it is not impossible; the present point is only that the array of distinctions made available with these various degrees of freedom simply predicts more classes/groups of facts than an approach without such degrees of freedom (and is thus less restrictive). What we might worry about even at this level of generality is what we might take as independent reasons to introduce principled/motivated constraints in the deployment of theories/models with this many alternatives (i.e., to narrow the possibilities/reduce the degrees of freedom).
For approaches which adopt fine-grained functional category inventories and a maximal/intermediate level distinction and the possibility of C-adjunction (with double XP segments), things get even less clear in terms of the restrictiveness of the overall theory.

This whole set of issues ties into a discussion from earlier years, as set out helpfully in the work of Stuurman (1988), regarding projection-level types. Stuurman develops what he refers to as the Single Projection Type Hypothesis (SPTH), which divides syntactic categories into two basic types: (i) recursive and (ii) non-recursive. The latter we can take to be heads (X0s); the former are the equivalent of maximal elements (XPs).[50] It is exactly the concerns regarding issues of restrictiveness raised above that drive Stuurman's theoretical developments in this respect, and it is concerns of this type (as well as his aim to eliminate redundancies) that similarly drive Brody's introduction of the collapsed structures discussed above. Brody's view renders trivial, for example, the general fact that projection lines (e.g., X0→X'→XP) can never be interrupted by some other intervening element of a different type since, in his reduced structures, there are no such internal distinctions, and therefore there is simply no room for any such interveners.[51] The general view, however, raises questions about what does go in place of the distinctions typically understood to underwrite, for example, head-movement versus XP-movement (or the phrase-structure status of modifiers).[52]

[50] Stuurman cites early work of Emonds (1971) where the SPTH is proposed, and Emonds (1973), where the idea is rejected in favor of having two recursive types (the equivalent of modern day X' vs. XP if we take XP to be potentially recursive). Stuurman does not discuss the issues which would arise for head movement that might force the adoption of sub-X0 structure for which one might want to posit recursive X0s.
This brings us to our second relevant line of discussion regarding the Brody-type reduced structures and "constituency" from above: the idea of head-movement as a PF-phenomenon.[53] Chomsky (1999) (see also Boeckx & Stjepanovic 2001, Bobaljik 2001) suggests that head movement might not be part of the syntax proper, but rather is a PF-phenomenon. However, note that on these views it is certainly not the case that syntactic structure simply does not matter for such operations. For example, Bobaljik's (2001) approach takes syntactic structure to yield a weak pairwise ordering over which head-to-head relations are established, so saying that head-movement is a PF operation doesn't imply that it is not constrained in some manner by syntactic structure.

[51] But what of the SPTH of Stuurman? Do we have recursive categories, or not? Strictly speaking, the notion of recursion refers to a function that calls itself. So we say that sentences embedded in sentences manifest recursion, and similarly with noun phrases inside of other noun phrases. But in the more recent era of separating out sequential arrays of functional and lexical types, do we ever have instances of local recursion in the sense of an X taking an XP complement? Work by Hoekstra (1984) suggests not, formulating what he called the Unlike Category Condition (UCC: *{X0, XP}). See van Riemsdijk (198) for critical discussion and an alternative formulation of the key intuition which avoids some potential problems which arise.

[52] I won't be discussing adjuncts/modification in this work.

[53] What follows regarding "head movement" superficially parts ways with Brody's discussion, who argues (following Baker 1985 and others) for a mirror-theoretic understanding of syntax/morphophonology connections. What I am about to suggest however does not strike me as incompatible with Brody's proposals (see my earlier remarks as well on pre-/post-spell-out coherency and conservation of ordering properties).
Let us now tie these two strands of discussion back into our discussion above of constituency in the Brody-type reduced structures. Consider again our abstract dominance sequence and the possible constituency groupings (with '→' again marking a dominance link):

(105) A→B→C→D→E
      a. D→E
      b. C→D→E
      c. B→C→D→E
      d. A→B→C→D→E

We can now tentatively adopt the view of head-movement as a PF-phenomenon by saying that the PF-relevant properties of the individual nodes (A, B, C, etc.) are PF-constituents, which are related by principles that may involve reference to syntactic structure (perhaps along the lines sketched in Bobaljik 2001) but which only actually handle the PF-relevant properties. The issues regarding constituency with respect to individual heads thus fall outside the syntactic system. Now consider branching and constituency again with reference to these reduced structural descriptions:

(106)    X
        / \
       Y   Z

Now we are free to take the line suggested above regarding phrasal constituency in terms of traditional dominance ordering. On that view the object in (106) manifests three constituents: the entire object, Z (and whatever it dominates), and Y (and whatever it dominates).

Note however that our view of constituency can interact with directionality of structure building. For example, on a top-down view, the Brody-type structure in (107) would have a derivation like that in (107)a-d:

(107) A→B→C→D→E
      a. A→B
      b. A→B→C
      c. A→B→C→D
      d. A→B→C→D→E

This kind of alternative is argued for in the work of Phillips (1996, 2003) on the basis that it yields a fundamentally different (derivational) conception of constituency, making available units that he argues we need for analysis.[54] It will be important to see whether the assumptions that lead to the conclusion here regarding our suggested treatment of SCM are roughly consistent with Phillips' solution to various constituency-test puzzles. If so, then the two independent lines of thinking,
one regarding the dynamics of local unithood and one regarding the dynamics of reducing linked-local relations to local ones, can be seen to be pointing in (or rather, "to") the same general direction. (I do not address this issue here, though it seems to me that these reduced structures are consistent with what is needed to implement Phillips' analyses).[55]

[54] Phillips gets a bit more than just what we arguably need. On his view any left-edge grouping is a possible constituent. In virtue of this his analyses need to appeal to other notions to avoid overgeneration (though he argues the required 'other notions' are independently motivated). See Phillips (2003) in particular for discussion.

[55] That is, what is required is to be able to refer to spec-head constituents excluding complements. As far as I can see this distinction is available in a top-down expansion of structure appealing to these reduced Brody-type structures.

However, I am not principally concerned with either of these general sets of issues (i.e., head movement or constituency per se). Therefore, my adoption of assumptions regarding structure and category for the present work can best proceed by seeking out a way to concentrate on the aspects of these concepts that are of interest for me here. I will thus be working with the reduced structures of the type Brody proposes, though the view here will be understood to be derivational, while Brody has extensively argued in favor of a representational view (see, e.g., the papers collected in Brody 2003). Consider the following graph of a typical transitive clause:

(108) [Tree diagram of a typical transitive clause; the node labels recoverable from the original are C, T, D, T, N, v, V, D, N.]

We can extract the Brody-type structure as follows:

(109)    C
         |
         T
        / \
       D   v
       |   |
       N   V
           |
           D
           |
           N

It will be along these dominance spines that all the action of the system developed here will happen.
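The reduced structure in (109) and the sequence-based notions of constituency discussed above can be made concrete with a small sketch. Everything here is an illustrative assumption rather than part of the TCG formalism: the `Node` class, the convention that the `complement` link continues the dominance spine while `specifier` is a side branch, and the helper names are all hypothetical. The sketch reads off the main spine of the transitive clause, lists the multi-node groupings of an abstract sequence that respect traditional constituency, and lists the stages of a top-down expansion of the same sequence.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """A node in a reduced (Brody-type) structure: one label, no X0/X'/XP split."""
    label: str
    complement: Optional["Node"] = None  # assumption: this link continues the spine
    specifier: Optional["Node"] = None   # assumption: a side branch with its own spine


def spine(root: Node) -> List[str]:
    """Follow complement links from the root to collect the dominance spine."""
    labels = []
    node: Optional[Node] = root
    while node is not None:
        labels.append(node.label)
        node = node.complement
    return labels


def suffix_constituents(seq: List[str]) -> List[List[str]]:
    """Multi-node stretches that contain everything their top node dominates."""
    return [seq[i:] for i in range(len(seq) - 2, -1, -1)]


def top_down_steps(seq: List[str]) -> List[List[str]]:
    """Successive stages of a top-down expansion of the same sequence."""
    return [seq[:i] for i in range(2, len(seq) + 1)]


def transitive_clause() -> Node:
    """Hypothetical encoding of the transitive clause: subject D-N branches off T."""
    subject = Node("D", complement=Node("N"))
    obj = Node("D", complement=Node("N"))
    verb = Node("V", complement=obj)
    little_v = Node("v", complement=verb)
    tense = Node("T", complement=little_v, specifier=subject)
    return Node("C", complement=tense)


if __name__ == "__main__":
    clause = transitive_clause()
    print("->".join(spine(clause)))                       # main spine
    print("->".join(spine(clause.complement.specifier)))  # subject branch
    for unit in suffix_constituents(list("ABCDE")):
        print("constituent:", "->".join(unit))
    for step in top_down_steps(list("ABCDE")):
        print("top-down stage:", "->".join(step))
```

On this encoding a grouping such as B, C is never returned as a constituent, since both nodes dominate further material, whereas the top-down stages make exactly such left-edge groupings available as intermediate units.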
To save space in the presentation I will adopt a horizontal notation, so that (109) will look like (110) (the connecting arcs representing dominance):

(110) C→T→v→V→D→N
        D→N

For everything I will be arguing here, it will be sufficient to refer to simple sequences of this kind (though we will augment the labeled nodes with more complex feature descriptions as in §1.2 above). Note that we can take the adoption of this kind of structure as either fully embracing the Brody-type vision of structure and category labels, or we can simply understand this to be a suitable set of working assumptions which abstract away from the issues of intermediate-level categories, whether we treat head movement as "in" the syntax or not, and questions about how non-argument modifiers are integrated. That is, what this sparse representation allows us to concentrate on is the key type of information that I will be taking to be important for the TCG system, namely the nature of category sequences defining the dominance-spine of syntactic objects.

Most if not all of what I will say here is consistent with this weaker view of adopting these ideas as simply a set of working assumptions. However, as noted above, this view collapsing intra-phrasal distinctions seems intuitively more compatible than some other possible approaches with the idea that the workspace cannot tolerate multiple tokenings of a given type X. And given the arguments above that we require motherhood/dominance to underwrite the context-identification view of SCM-type relationships, having a model within which this is the only possibility provides an attractive convergence of independent ideas. These correspondences with our central aims, along with the ability to circumvent the numerous technical difficulties that I mentioned above in connection with some salient alternatives, will be taken as sufficient justification to proceed with these assumptions.

1.6. Chapter Summary

We now have the following ideas in place.
We assume the WS/O-distinction as a basis for our TCG implementation of an MSO-system. The workspace has been suggested to be restricted in two ways:

(111) WORKSPACE ORDER: The elements in the workspace manifest a weak partial order

(112) WORKSPACE DISTINCTNESS (ANTI-RECURSION): The workspace does not tolerate the presence of multiple tokens of type X

For a workspace containing an X-element, I have suggested that the process of introducing any second X-element should be understood as part-and-parcel of the contraction procedure. One way that contraction can occur is in virtue of a particular response of the system when confronted with like elements: they can be identified under what I called a matching relation, which we will see in Chapter 3 requires some further elaboration. The following strengthening of the ordering restriction on workspaces was suggested as well:

(113) WORKSPACE CONNECTEDNESS (DOMINANCE): The elements in a given syntactic workspace must manifest a connected dominance order (for every x, y in the set, either x dominates y or y dominates x)

This effects a fairly radical partition of structures, so that the workspace always only contains essentially a "single line", indexed via feature relationships so that there is coherent maintenance of spelled-out branches of structure. Moreover, given the mechanics of node-identification, it was suggested that spelled-out structure may "re-enter" the workspace in certain principled circumstances, and then be required to "re-spell-out" (and then re-enter again, and so on). Again, I will explore this view in connection with SCM phenomena in Chapter 3.

We have adopted a reduced vision of category/structure, importing ideas from the work of Brody (2003). This view was argued above to make for a clearer, technically less complicated fit with one of the central intuitions of the TCG approach as stated above in (112). We can now note that this point of view can be strengthened a bit.
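As a brief aside before that point is developed, conditions (112) and (113) can be made concrete with a small sketch. This is only an illustration under stated assumptions: workspace elements are modeled as named tokens mapped to category types, immediate dominance is given as a list of (mother, daughter) pairs, and the function names are hypothetical. The sketch checks the two conditions only; it deliberately does not model contraction or node-identification, whose mechanics are left to Chapter 3.

```python
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]  # (mother token, daughter token)


def dominance_closure(immediate: Iterable[Pair]) -> Set[Pair]:
    """Transitive closure of immediate dominance: all (ancestor, descendant) pairs."""
    closure = set(immediate)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def satisfies_distinctness(token_types: Dict[str, str]) -> bool:
    """(112): no two tokens in the workspace share a category type."""
    types = list(token_types.values())
    return len(types) == len(set(types))


def satisfies_connectedness(token_types: Dict[str, str],
                            immediate: Iterable[Pair]) -> bool:
    """(113): every pair of tokens is ordered by dominance, one way or the other."""
    dom = dominance_closure(immediate)
    return all((x, y) in dom or (y, x) in dom
               for x, y in combinations(token_types, 2))


if __name__ == "__main__":
    # A single dominance line C > T > v > V (token names are hypothetical).
    line = {"C1": "C", "T1": "T", "v1": "v", "V1": "V"}
    links = [("C1", "T1"), ("T1", "v1"), ("v1", "V1")]
    print(satisfies_distinctness(line), satisfies_connectedness(line, links))

    # Adding a subject D as a second daughter of T leaves D and v unordered.
    branching = dict(line, D1="D")
    print(satisfies_connectedness(branching, links + [("T1", "D1")]))

    # Expanding an embedded C below V introduces a second token of type C.
    recursive = dict(line, C2="C")
    print(satisfies_distinctness(recursive))
```

On this encoding a branching structure fails connectedness and an embedded clause fails distinctness, which is the sense in which the workspace holds only a non-recursive "single line".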
If something like (112) is correct, then something like the Brody-type reduced structures might be in fact required. The alternative would be to introduce a way of distinguishing between X0, X', and XP. But the entire point of Brody's proposals (and this holds of the Collins/Stabler view mentioned briefly above as well) is that these are distinctions that we can and should learn to live without, as they are redundant with other independently required concepts (we may dispute this, it is ultimately an empirical matter, but that is the claim, and it is the right one to advance on minimalist grounds). Indeed, one can take this general direction of theory-development as a natural continuation of Chomsky's (1994, 1995) BPS-project, which was aimed at (among other things) eliminating primitive bar-level distinctions.

The key idea underlying our view of SCM-type relationships was seen to rely on a weakening of Chomsky's (1995) view of chains as sets of contexts, where "contexts" were understood on his approach as the entire previously established structure that an element α merges with. But this view was suggested to be too strong, as it yields unique contexts for each "link" of any complex chain. The key idea here revolves on a denial of this: it is the fact that contexts are not uniquely identifiable that permits cross-domain movements of the linked-local/SCM sort. I turn next to a more detailed empirical and theoretical discussion of SCM.

CHAPTER 2: Regarding Successive Cyclic Movement

In this chapter I discuss a number of issues regarding syntactic theory and analysis with reference to successive cyclic movement (SCM) and related phenomena. First, I canvass an array of empirical considerations that have been taken in the past to argue in favor of SCM. I then discuss the issue of what motivates the intermediate/non-target movements posited by SCM analyses (the TRIGGERING problem;
see (64) above in §1.4.2) alongside the issues of how we ought to regulate phases and understand localizing evaluation for convergence (the CONVERGENCE problem).

2.1. Types of Successive Cyclicity Effects

I turn now to take stock of the sorts of considerations that have led grammarians to think that something like successive cyclic movement operations are for real. The initial motivation for positing successive-cyclic movements came from discussion and arguments in Chomsky (1973), where it was proposed that wh-movement ought to be viewed as clause-local, with the "edges" of clauses (a COMP node made available under S-bar) serving as escape hatches. For some time then successive cyclic movement was motivated only by theory-internal considerations arising in the proper treatment of movement locality. However, a number of other phenomena have since been brought forward that are of a different sort. Most (if not all; see below) of these phenomena contribute to the body of converging evidence for the idea that movement relations are not generally "one-fell-swoop", but rather manifest a linking of local relations. This idea of successive cyclic or linked-local relations, as we will see here, brings a remarkably diverse range of phenomena into a single abstract class.

It will be helpful in this discussion to make reference to a set of distinctions drawn from Abels (2003). He distinguishes between some logical possibilities regarding ways that movement relationships might affect or interact with intervening material, discriminating between "punctuated" and two different general types of "uniform" conceptions. Consider first this general schema:

(114) FILLER [ ... PATH ... ] GAP

The possible effects that any such filler/gap relationship might have on elements along the path between them can be discussed with reference to the following tri-partition:

(115) a. (Uniform; no effect of the filler/gap relation on the path)
      b. (Uniform; entire path affected by the filler/gap relation)
      c.
(Punctuated; only parts of the path affected by the filler/gap relation)

We will see that the various phenomena that have been argued to favor successive cyclic movement analyses are not homogeneous: all of the possibilities in (115) are instantiated.

2.1.1. Wh-Copying

What strikes me as one of the most intuitively convincing types of evidence is the following: in certain languages we actually see overt copies of moved elements. The following data are drawn from an interesting summary and theoretical discussion in Felser (2004) illustrating this phenomenon for wh-movement in a number of languages:[56]

(116) a. Wen glaubst Du wen sie getroffen hat? (German)
         who think you who she met has
         'Who do you think she has met?'
      b. Wêr tinke jo wêr-t Jan wennet? (Frisian)
         where think you where that-CL J. resides
         'Where do you think that John lives?'
      c. Waarvoor dink julle waarvoor werk ons? (Afrikaans)
         wherefore think you wherefore work we
         'What do you think we are working for?'
      d. Kas o Demìri mislenola kas i Arìfa dikhla? (Romani)
         whom Demir think whom A. saw
         'Who does Demir think Arifa saw?'

[56] Felser (2004) draws the following examples from the following sources: (116)b is from Hiemstra (1986: 9); (116)c from Du Plessis (1977:725); and (116)d is adapted by Felser from data in McDaniel (1989:569n.5).

This phenomenon is punctuated in Abels' sense: this kind of copying is only available at clausal boundaries (i.e., CPs). Below we will see other types of evidence that suggest the possibility that wh-movement might generally be "more successive cyclic" than this, implicating the edges of VP as well (specifically vP). So we will want to ask why this copying phenomenon does not show up anywhere but clause edges.

Interestingly, we also see this kind of copying phenomenon in the L1-acquisition of English, where the target grammar does not ever permit this kind of copying. Consider the following case of so-called 'medial-wh' (De Villiers et al. 1990; McDaniel et al. 1995; Thornton 1990):

(117) a.
What do you think what Mini put on _ ?
      b. Who do you think who's in the box?

The existence of this kind of copying in early child English raises interesting challenges for the Subset Principle (Berwick 1985), as this apparently lies within a superset of the grammar of standard English. It would seem then that learners would require negative evidence to abandon such wh-copying. But regardless of how this is sorted out (perhaps in terms of an indirect sort of negative evidence), the existence of such cases points strongly towards the reality of something like SCM.

2.1.2. Q-Stranding

Other cases show an effect that is intuitively related to the copying phenomena illustrated above, where it is alleged that we can see part of a moved expression "stranded" in positions which it has by hypothesis moved through. Facts of this kind include the patterns of so-called quantifier stranding reported in a dialect of Irish English by McCloskey (2000), specifically in West Ulster English.[57] Consider:

(118) a. What all did you get t for Christmas? (Standard English)
      b. Who all did you meet t when you were in Derry?
      c. Where all did they go t for their holidays?

(119) a. What did you get t for Christmas? (Standard English)
      b. Who did you meet t when you were in Derry?
      c. Where did they go t for their holidays?

(120) a. What did you get all for Christmas? (West Ulster English)
      b. Who did you meet all when you were in Derry?
      c. Where did they go all for their holidays?

[57] McCloskey refers readers to the work of Henry (1995). Apparently the Q-float-type phenomena under discussion vary, like the copying phenomena discussed above, by dialect and, as in "Standard" English, are not present/possible in Belfast English. See McCloskey's paper for further discussion of dialect differences in this regard.

McCloskey notes that in Standard English the cases in (118) versus (119) differ in whether they require "that the answer is a plurality [..]
insisting on an exhaustive (118)), rather than a partial, listing of the members of the answer set". The interesting cases from West Ulster English, in (120), are claimed by McCloskey to pattern in these interpretative properties with the examples in (118), and not those in (119). He notes that this phenomenon is not exclusively tied to matrix clause interrogatives, but appears in embedded environments as well:

(121) a. I don't remember what all I said (West Ulster English)
      b. I don't remember what I said all

McCloskey develops a stranding-type analysis for these phenomena, whereby wh-elements move successive cyclically and may abandon the associated element all at places along the movement path. Following earlier proposals (Postal 1974, Koopman 1999), McCloskey assumes a structure like (122) for the wh-element plus the quantificational element all, analogous to ideas that have been put forward in analyses of similar phenomena involving NP-movement (see below on stranding and raising-to-subject):

(122) [DP [DP_i who] [D' [D0 all] t_i]]
      (in the original diagram the wh-marked nodes are circled)

This structure, McCloskey argues, allows the possibility of either of the two circled nodes to undergo movement (i.e., he makes the not unreasonable assumption that the specifier and the dominating DP node both bear the wh-properties as marked above). If this is correct, then in principle every position to which the wh-element moves could be a position where all is stranded.[58] Important then for the notion of successive cyclic movement are the following cases in (123), which illustrate that all can be stranded in an intermediate position:

(123) a. Where do you think all they'll want to visit t?
      b. Who did Frank tell you all that they were after t?
      c. What do they claim all (that) we did t?

This kind of stranding phenomenon is sometimes referred to as "floating" in virtue of earlier approaches to these matters which involved transformational operations that literally moved (floated) such elements from position to position.[59]
In addition to the logical possibility of true "floating" as a way to possibly analyze such cases, there is also an extensive literature treating phenomena of this type in terms of analyses that base-generate all in the various positions where it occurs, thus suggesting the possibility that the statement of the laws governing distribution of elements of this type might be independent of the putative path of successive cyclic movement operations.[60]

[58] McCloskey acknowledges that this analysis relies on the possibility of left-branch extraction, in order for the wh-element to strand the quantificational element all. Though he notes as well that such an operation falls within the bounds of known cross-linguistic variation.

[59] Analyses positing literal transformational floating were offered, for example, in Kayne (1978). See Bobaljik (2001) for a thorough review of the issues surrounding these elements and what they may (or may not) be able to tell us about the nature of syntactic structures and the properties of movement operations.

[60] Some of these base-generation analyses posit that elements like all have the special property that they can only be generated in positions where they can enter into a relation with a movement trace. On such views the distribution of all is not, in fact, independent of movement; rather the difference is in terms of whether that element ever was "in" any positions other than the one in which it surfaces.

We can push this point a bit with a consideration of similar phenomena as they arise in cases of A-movement (these data from standard English):

(124) a. The men all seemed to appear to be likely to leave
      b. The men seemed all to appear to be likely to leave
      c. The men seemed to all appear to be likely to leave
      d. The men seemed to appear all to be likely to leave
      e. The men seemed to appear to all be likely to leave
      f. The men seemed to appear to be all likely to leave
      g. The men seemed to appear to be likely all to leave
      h.
The men seemed to appear to be likely to all leave

The possible stranding sites appear to be a bit more prolific here than is typically assumed.[61] Although, as with McCloskey's A'-movement cases, there is apparently dialect variation here as well.[62] If the distribution of these elements marks the path of movement, then this suggests that we have the following intermediate positions:

(125) The men seemed t to t appear t to t be t likely t to t leave

A subset of these are motivated under fairly standard assumptions about the existence of a base/θ-position internal to the most embedded VP and the existence of EPP-features marking the "subject" positions of the relevant infinitivals (marked below). But three others are less straightforward (marked with "??" in (126)):

(126) The men seemed t to t appear t to t be t likely t to t leave
      [Annotations in the original diagram label individual traces as EPP, VP-INTERNAL/θ-POSITION, or "??".]

The distribution of all in these cases might then be suggesting that movement is very successive cyclic. In contrast to the wh-movement cases of stranding discussed by McCloskey, which appear punctuated, this A-movement case appears to manifest a uniform "all positions affected" relationship between the surface position of the subject NP and its base/θ-position. (That is, if we have reason to think movement really does target every single projection on the path).[63]

[61] Accounts that insist that movement targets the edge of every intervening XP would do fine in accounting for this pattern, but then it's not clear why we shouldn't see more prolific stranding in A'-movement.

[62] Judgments on a-h vary, though to my ear they are all equally acceptable. (Norbert Hornstein informs me that he finds b, d, and g less acceptable than the rest). See Hornstein (2000) for a possible explanation.

However, as mentioned above, there exist analyses which posit instead base-generation. While the case is by no means closed, it's not crystal clear that these kinds of facts actually bear on the question of successive cyclic movement (see Bobaljik 2002).
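The relation between the trace positions in (125) and the paradigm in (124) can be illustrated with a short sketch. It assumes, purely for illustration, the stranding view on which all may surface next to the subject or in any single trace position; the function name and its parameters are hypothetical, and nothing here decides between stranding and base-generation analyses.

```python
from typing import List


def stranding_variants(traced: str, subject_len: int,
                       floated: str = "all", trace: str = "t") -> List[str]:
    """Spell out one sentence per candidate site for the floated element.

    traced:      a sentence with movement traces written as the bare token `trace`
    subject_len: number of words in the surface subject (assumed sentence-initial)
    """
    words = traced.split()
    plain = [w for w in words if w != trace]

    # Variant with the quantifier adjacent to the surface subject.
    variants = [" ".join(plain[:subject_len] + [floated] + plain[subject_len:])]

    # One variant per trace: realize that trace as the quantifier, drop the others.
    for i, word in enumerate(words):
        if word == trace:
            kept = [floated if j == i else w
                    for j, w in enumerate(words)
                    if w != trace or j == i]
            variants.append(" ".join(kept))
    return variants


if __name__ == "__main__":
    path = "The men seemed t to t appear t to t be t likely t to t leave"
    for label, sentence in zip("abcdefgh", stranding_variants(path, subject_len=2)):
        print(f"{label}. {sentence}")
```

With the seven traces of (125) plus the subject-adjacent position, the sketch yields eight strings, one for each of (124)a-h.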
2.1.3. Agreement on the Path

Assuming agreement relationships are typically local (not trivial; see below), the distribution of φ-properties might tell us something about the path of movement. However, given the possibility of an operation of the AGREE sort proposed in Chomsky (1999), where the relation between two agreeing elements can be potentially non-local, such phenomena might not tell us anything about the path of movement. Nonetheless, consider the following. Kayne (1989) offers cases like the following illustrating participial agreement on the path that the relevant local A-movements would usually be assumed to take:

(127) Les filles sont apparues avoir été reportées disparues
      the girls are.3RD.PL appeared.3RD.PL have been reported.3RD.PL disappeared.3RD.PL
      'The girls appeared to have been said to have disappeared'

However, if these relations can be licensed without movement, then it's not clear what we should make of these agreement facts. On Chomsky's view such series of embedded non-finite clauses manifest no internal phase divisions, so it is not implausible that agreement could be understood to occur long-distance. It would then be an open question regarding the presence/absence of "EPP-features" that would determine whether the movement of the NP (les filles) from its base position would have to involve all, some, or none of the intermediate nodes.

[63] The view I will end up endorsing in Chapter 3 regarding this particular case will be inconclusive.

But such a non-movement (long-distance "agree") approach is not as plausible for similar local agreement phenomena that occur in a variety of languages in the domain of A'-movement.[64] For example, consider the complementizer agreement phenomena in Irish (McCloskey 1979, 2000):

(128) Credim gu-r inis sé bréag
      I-believe go-PAST tell he lie
      "I believe that he told a lie"

(129) an t-ainm a hinnseadh dúinn a bhí
ar an áit
      the name aL was-told to-us aL was on the place
      "the name that we were told was on the place"

Finite clauses in Irish manifest a difference depending on whether or not they contain an A'-movement trace. Clauses without such a trace show the particle go, as in (128); clauses containing one show the particle aL, as in (129). This phenomenon manifests at every clause edge, strongly suggesting something like successive cyclic movement is at work.

[64] It's not plausible that non-local agreement is at work here since these agreement relationships would have to cross phase-boundaries (on Chomsky's view) and clausal boundaries in general on anyone's account, if there are no local movements involved.

Chamorro (Chung 1982, 1994) has been argued to show similar kinds of intermediate agreement effects (though see fn65 below). Consider (from Chung 1994:1):

(130) Hum?lum si Maria [na ha-p?nak si Juan i p?tgun]
      AGR-assume Mary COMP AGR-spank Juan the child
      "Maria assumes that Juan spanked the child"

(131) Hayi hinalom?a si Maria [t pum?nak t i p?tgun]
      who? WH.assume Maria WH.spank the child
      "Who does Maria assume spanked the child?"

Chung notes that, "in simple wh-constructions [..] the presence of a moved wh-phrase is signaled morphologically on some head in the extended projection of [+V]" and that in "long-distance wh-constructions, the special morphology shows up on every such head along the path" (glossed as WH in (131) above). Extraction across multiple boundaries thus shows the agreement effect at every clausal edge:

(132) Hafa sinangani-n Juan as Dolores [t ni minalago'?a [t p?ra un-taitai t]]?
      WHAT? WH[OBJ2].tell Juan OBL Dolores COMP WH[OBL].want-AGR FUT WH[OBJ].AGR-read
      "What did Juan tell Dolores that he wants you to read?"

Similar kinds of local agreement phenomena have been documented in a number of other languages.[65] Setting aside some interesting complications, facts such as these provide fairly strong support for SCM.[66]
6 Both the Irish and Chamorro cases manifest a punctuated efect on the movement path. Moreover, it is not possible to "skip" intermediate positions such that the agrement efects would show up both above and below a position in which the efect would be absent. Where it occurs, it occurs al the way down the structure from the fronted element to the extraction site. 65 In, for example: Kikuyu (Clements 1984, Sabel 200), More (Ha?k 190), Palauan (Georgopolous 1985), Pasamaquody (Bruening 201), Malay/Bahasa Indonesia (Cole & Hermon 200, Sady 191) 6 The "complications" include (i) the local agrement along the movement path evident in Chamoro is not analyzable as agrement betwen the wh-element and the local head of CP and is not strictly speaking agrement with the moving XP, but rather with distinguishable properties of the trace elements (se Chung 194:7-1) and (i) it turns out that there are cases for which such sucesive cyclic agrement is optional. Chung argues that these cases manifest a D-linked versus non-D-linked distinction (se Pesetsky (1987), and se Cinque (191) for a view subsuming D- linking under referentiality) with the D-linked cases manifesting the optionality thus sugesting only optional sucesive cyclic movement. 117 2.1.4. Some Binding Theoretic Efects Consider the binding possibilities of the self-form in (133)a: (133) a. Which pictures of himself did John know Bil wanted? b. [which..himself] did John know [which..himself] Bil wanted [which..himself] Cases such as (133)a are ambiguous ? the self-form can be interpreted as anteceded by either John or Bil. Such cases are not totaly straightforward ? there are a number of confounding factors that must be controlled for (se ?3.2.2 in Chapter 3 on "logophors"). Here we wil simply note that such ambiguities have been tied to succesive cyclicity in A'-movement via the idea that the local movements create local contexts for the licensing of the self-elements. 
Similarly, we see from the following pair of examples that similar phenomena appear to manifest in both A'- and A-movement. Consider (from Barss 2001):

(134) a. The women₁ asked [which pictures of themselves₁/₂/₃] the men₂ had said that the children₃ had brought t_WH to the school fair
b. The women₁ consider [old pictures of themselves₁/₂/₃] to have struck the men₂ as [appearing to the children₃ [t_NP to be amusing]]

These cases, on a successive cyclic movement view, would be related to structures involving local copies (or traces which could be "reconstructed into") as follows:

(135) a. The women₁ asked [wh..selves₁/₂/₃] the men₂ had said [wh..selves₁/₂/₃] that the children₃ had brought [wh..selves₁/₂/₃] to the school fair
b. The women₁ consider [..selves₁/₂/₃] to have struck the men₂ as [..selves₁/₂/₃] [appearing to the children₃ [..selves₁/₂/₃] to be amusing..]

Both A'-movement ((135)a) and A-movement ((135)b) appear to manifest the same sort of "expansion" effect, as Barss notes, in terms of the available antecedents for the relevant self-forms. The full set of available antecedents is interpretable with a fairly tight view of binding in mind if we adopt a successive cyclic view for both types of movement as sketched above.

Other interesting cases involve binding impossibilities as they appear to arise in A-movement. Consider first (136)a and b, and then (i) what seems to be a binding-theoretic violation in (136)d and (ii) the absence of ambiguity in (136)f, as contrasted with (136)c:

(136) a. John₁ seemed to himself₁ to appear to Mary to be getting fat
b. John₁ seemed to appear to himself₁ to be getting fat
c. John₁ seemed to Mary to appear to himself₁ to be getting fat
d. *Mary seemed to John₁ to appear to himself₁ to be getting fat
e. It seemed to John₁ to appear to himself₁ that he was getting fat
f.
John₁ seemed to Bill₂ to appear to himself₁/*₂ to be getting fat

First, note there is no problem with John as the antecedent for the reflexive himself in (136)a. Similarly, we can increase the distance and have the reflexive occupying the experiencer-PP of the second raising verb, and the reflexive binding is still possible, as in (136)b. So the first point is to take note of a three-way disjunction of possibilities: (i) either binding domains can span across levels of embedding, or (ii) the antecedent must somehow be "in" the lower infinitival clause in addition to participating in its overt position, or (iii) the reflexive must somehow be "in" the superordinate matrix clause in addition to participating in the relations of its overt position. All of (i)-(iii) have been advocated at one point or another, so it's worth considering, in terms of what else we say here, which of these options we can remain consistent with.

Now consider (136)c/d. The first experiencer-PP (to Mary) does not serve to block the antecedence relation between John and himself in (136)c, though in (136)d the relation between John and himself is somehow blocked even though no overt element intervenes, despite the fact that antecedence between the two experiencer-PPs is otherwise perfectly legitimate.⁶⁷ This whole array of facts falls out nicely if we assume that the following movement operations have transpired:

(137) a. John₁ seemed to himself₁ (t_John) to appear to Mary (t_John) to be getting fat
b. John₁ seemed (t_John) to appear to himself₁ (t_John) to be getting fat
c. John₁ seemed to Mary (t_John) to appear to himself₁ (t_John) to be getting fat
d. *Mary seemed to John₁ (t_Mary) to appear to himself₁ (t_Mary) to be getting fat
e. It seemed to John₁ to appear to himself₁ that he was getting fat

The case in (136)/(137)a can be straightforwardly understood with a clause-local conception of binding domains, as can (136)/(137)b&c.
In the former the overt position of the subject John licenses the reflexive; in the latter two it is the trace/copy of John in the subject position of the infinitival to appear which provides the local licensing. The interesting case of blocking/intervention now arises in (136)/(137)d, which we understand to be out for essentially the same reason that *Mary appeared to himself to be getting fat is out; that is, there is a mandatory local antecedent for the reflexive, and it disagrees in gender specification. So (136)/(137)d is out via a minimality-type effect, since the trace/copy of Mary constitutes a closer potential antecedent than John does. And we know from (136)/(137)e that John being embedded in a PP structure has nothing to do with the impossibility of binding in (136)/(137)d, since such relations are independently fine. There are some interesting wrinkles here that require addressing (in particular the status of these self-elements in relation to the anaphor/logophor distinction; recall I noted above that the binding between experiencer-PPs does not appear to be clause-local). I will return to these matters in the discussion and analyses in Chapter 3.

67 Some speakers do not find the binding of the self-form in these cases to be acceptable. These speakers appear to prefer the a-case(s) with a simple pronoun over the b-case(s) with a self-form in (i):
(i) a. It seemed to John₁ (to tend) (to be likely) to appear to him₁ that he was getting fat
b. It seemed to John₁ (to tend) (to be likely) to appear to himself₁ that he was getting fat
See Chapter 3, §3.2.2 for some discussion.

2.1.5. Interaction with Variable Binding

Consider also the following (see Bošković 2002; Lebeaux 1991; Nunes 1995):

(138) a. *[His₁ mother's₂ bread] seems to her₂ _ to be known by every man₁ to be _ the best there is
b.
[His₁ mother's₂ bread] seems to every man₁ _ to be known by her₂ to be _ the best there is

This case, like the A'-movement case I will discuss in a moment, shows the necessary availability of intermediate position reconstruction. We understand the bracketed phrase to have moved through the positions marked via the underscores. In order to license the bound variable reading for the a-case, the bracketed structure must be in the scope of every man. But this puts the element in the c-command domain of the pronoun her, which induces obviation (a Condition C effect). What the b-case shows, however, is that it is possible to have the bracketed phrase reconstruct to the intermediate position, where it is within the scope of every man but above the pronoun her. And, as Bošković points out, the ill-formed a-case with the indicated co-indexing is fully acceptable on the bound variable reading so long as we have disjoint reference between her and his mother. The combination of these observations suggests that intermediate reconstruction is possible, but not necessary. It moreover suggests that the output structure handled by the interpretative systems must be coherent in the sense that the moved phrasal complex must be interpreted as "in" one or the other position, but not both (as this would cause conflicts that would presumably correspond to unacceptability). This case, like the case involving the self-elements discussed above, constitutes to my mind quite strong evidence for something like cyclic movement.

Notice as well that the same sort of example can be constructed in the context of A'-movement, but with a twist which suggests that A'-movement is in fact even more local than just COMP-to-COMP. Consider (from Fox 1998):

(139) a. √ [Which of the papers that he₁ gave Mary₂] did every student₁ ask her₂ to read carefully?
b. * [Which of the papers that he₁ gave Mary₂] did she₂ ask every student₁ to revise?
The pronominal element he in (139)a must occur in a position below the quantifier every in every student in order for the bound variable reading to obtain, on the standard assumption that c-command relations determine the scope possibilities relevant to such variable binding. But, in order for the construction to avoid a Condition C violation between her and Mary, the complex wh-expression must be interpreted above the position of the pronoun her. This means that the wh-expression in (139)a must reconstruct in the √-marked position and not the *-marked one, as in (140)a:

(140) a. √ [Which of the papers that he₁ gave Mary₂] did every student₁ [vP √ [ask her₂ to read * carefully?
b. * [Which of the papers that he₁ gave Mary₂] did she₂ [vP * [ask every student₁ to revise * ?

But when we switch the positions of the every-phrase and the relevant pronoun, as in (139)b, we see (in (140)b) that there is no position which could permit the variable binding of the pronoun (he) within the wh-expression without having Mary c-commanded by she, thus resulting in a Condition C violation (this is indicated above by *-marking the relevant possible positions). This kind of case suggests that not only is successive cyclic movement real, but that it involves more than just the typically assumed movements of the COMP-to-COMP sort. Rather, movement must proceed as follows:

(141) [CP [WH] [C' 0 [ .. [vP [WH] [vP .. [CP [WH] [C' 0 [ .. [WH] ..

So the question about what features might be involved in motivating successive cyclic A'-movements appears to require properties that can be present not only in intermediate CP-positions, but also at the edges of vPs.⁶⁸ Some other phenomena bear on this possibility as well, some of which I will discuss in Chapter 3.
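The reasoning behind the contrast in (140) can be made concrete with a small sketch. It is an illustration only, not part of the analysis: it assumes a strictly right-branching spine, so that "higher in the list" can stand in for c-command, and the function name and position labels are hypothetical. A candidate site is licit only if it is below the quantifier (so that he can be bound) and above the coindexed pronoun (so that Mary escapes Condition C).

```python
# Illustrative sketch only: linear order on a right-branching spine is used
# as a stand-in for c-command, and all labels below are hypothetical.

def licit_reconstruction_sites(spine):
    """Return the candidate sites that satisfy both interpretive conditions.

    spine: labels listed top-down. "QP" is the quantifier (every student),
    "PRON" is the pronoun coindexed with the name inside the wh-phrase
    (her/she), and labels starting with "SITE" are positions the wh-phrase
    occupies or has passed through.
    """
    q = spine.index("QP")
    p = spine.index("PRON")
    licit = []
    for i, label in enumerate(spine):
        if not label.startswith("SITE"):
            continue
        bound_by_quantifier = q < i   # site is below QP: 'he' can be bound
        avoids_condition_c = i < p    # site is above PRON: 'Mary' is not c-commanded by it
        if bound_by_quantifier and avoids_condition_c:
            licit.append(label)
    return licit

# (140)a: every student ... [vP SITE [ask her to read SITE carefully
EX_140A = ["SITE_surface", "QP", "SITE_vP", "PRON", "SITE_base"]
# (140)b: she ... [vP SITE [ask every student to revise SITE
EX_140B = ["SITE_surface", "PRON", "SITE_vP", "QP", "SITE_base"]

print(licit_reconstruction_sites(EX_140A))
print(licit_reconstruction_sites(EX_140B))
```

On these assumptions only the vP-edge site survives in the a-case, and no site survives in the b-case, which is the pattern the √/* marking in (140) records.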
It's worth noting here that this kind of phenomenon can be shown to extend to boundaries in structure where there is arguably no vP, as in passives (from Legate 2000):⁶⁹

(142) a. √ [At which of the parties that he₁ invited Mary₂ to] was every man₁ [vP √ [introduced to her₂ *?
b. * [At which of the parties that he₁ invited Mary₂ to] was she₂ [vP * [introduced to every man₁ *?

(143) a. √ [At which charity event that he₁ brought Mary₂ to] was every man₁ [vP √ [sold to her₂ *?
b. * [At which charity event that he₁ brought Mary₂ to] was he₂ [vP * [sold to every woman₁ *?

Such cases are problematic for the view that phases are just vP and CP. I will discuss these matters a bit in Chapter 3.

68 As noted earlier, wh-copying phenomena (as well as McCloskey's stranding) appear never to manifest at positions other than the relevant C-positions. Assuming this is true, whatever the motivations for these two types of cyclic movement (to CP and to vP), we are in need of a story which explains why there should be such differences.

69 In the first examples, Legate indicates the intended interpretation to be that Mary keeps being introduced to her own dates at parties; in the second, there is a charity auction at which dates with bachelors are sold. The argument is extended with an examination of unaccusatives as well. Legate also considers other tests for phase-hood and concludes that they all point towards a wider inventory of phases than just vP and CP.

2.1.6. Inversion Effects

In Spanish we see the following ordering alternation:

(144) a. Contestó la pregunta Juan
answered the question Juan
'Juan answered the question'
b. Juan contestó la pregunta
Juan answered the question
'Juan answered the question'

In certain cases of wh-movement, this inversion ordering is obligatory:

(145) a. ¿Qué querían esos dos?
what wanted those two
'What did those two want?'
b. *¿Qué esos dos querían?
what those two wanted
Of relevance to the existence of successive cyclic movement are cases such as the following, where we see this inversion in every local domain that we would think the wh-element has passed through:

(146) ¿Qué pensaba Juan [que le había dicho Pedro [que había publicado la revista]]?
what thought Juan that him had told Peter that had published the journal
'What did John think that Peter had told him that the journal had published?'

Effects similar to these exist in French (so-called stylistic inversion) and were initially documented and argued to support successive cyclic movement analyses in Kayne & Pollock (1978). However, the relevant phenomenon in French appears to be generally optional, so it is difficult to say whether the relevant cyclic movements themselves are optional or not. The Spanish cases brought forward in Torrego's work, however, were argued to be an obligatory phenomenon. These data have become somewhat controversial in subsequent years, as there appear to be dialect differences regarding which types of wh-element must trigger such effects. These differences may be systematic, however. Baković (1995) compiles some of these differences as they have emerged in the literature and supplements them with an extensive survey eliciting speaker judgments. The following differences emerge in Baković's study:

(147) a. No inversion with any wh-phrases (Suñer 1994)
b. Inversion with argument wh-phrases only (Torrego 1984; Suñer 1994)
c. Inversion with all but reason wh-phrases (por qué/"why") (Goodall 1991a,b)
d. Inversion with all wh-phrases in matrix clauses; all but reason wh-phrases in subordinate clauses (Baković's survey)
e. Inversion with all but reason wh-phrases in matrix clauses; only argument wh-phrases in subordinate clauses (Baković's survey)
f. Inversion with argument wh-phrases in matrix clauses; no inversion in subordinate clauses (Baković's survey)

These findings have not, to my knowledge, been corroborated by any independent investigation.
However, the pattern suggestively converges with the optionality of agreement in Chamorro, which Chung (1994) shows to hold in cases of D-linking. Consider along similar lines the phenomenon in English of matrix subject-auxiliary inversion (SAI):

(148) a. Who _ liked John?
b. Who did John like _ ?

(149) a. Who did Mary think _ liked John?
b. *Who did Mary think did John like _ ?

In many standard dialects of English, SAI is an exclusively matrix phenomenon, as the ill-formed case in (149)b shows. However, there exist dialects in which such aux-inversions happen in embedded contexts as well, and they share some of the interesting properties of other cases that suggest cyclic movement. Consider the following cases from Belfast English:⁷⁰

(150) a. Who did John hope [would he see _ ]?
b. What did Mary claim [did they steal _ ]?
c. I wonder what did John think would he get _ ?
d. Who did John say [did Mary claim [had John feared [would Bill attack _ ]]]?

This last case appears to show the local/SCM-type effects of wh-movement on Subj/Aux inversion in ways analogous to the Subj/V inversion of the Spanish type.

70 These are taken from Pesetsky & Torrego (2001), who themselves cite the extensive work of Henry (1995) on this dialect.

2.1.7. Control as Raising?

On at least some views of control phenomena, sequences of control predicates must manifest successive cyclic movement. Hornstein (2000) and Manzini & Roussou (2000) develop analyses within the MP that attempt to assimilate control and raising. For example, on Hornstein's story (151)a is derived by operations allowing movement of the nominal expression into (and then out of) superordinate θ-positions, finally landing in a Case position ((151)b ignores the matrix θ-position for expect and marks positions as either θ or "EPP"):

(151) a. John expected to want to try to leave on time
CASE EPP θ EPP θ EPP θ
b.
John expected (t) to (t) want (t) to (t) try (t) to (t) leave on time

Such a movement (raising) analysis of control phenomena has been the subject of some recent controversy.⁷¹ In the TCG approach developed here we will encounter theory-internal reasons prohibiting any straightforward identification of control and raising, though I will suggest that the TCG mechanics bear on the relation between the two in an interesting way. In particular, I discuss a way of connecting control with reflexivization (as Hornstein does), but one that remains distinct from raising (though both will involve node-identification/contraction).

71 See in particular Culicover & Jackendoff (2003), Landau (2004), and Boeckx & Hornstein (2003).

2.2. Some Thinking on Successive Cyclicity

Consider again the case of wh-movement with which we began our discussion of SCM (repeated here as the b-cases in (152)-(154)), to which we now add for consideration instances of NP-movement argued to be similarly successive cyclic: so-called Raising-to-Subject (RtS; the a-cases in (152)-(154)):

(152) a. John seems to be likely to appear to like carrots
b. What did Dave think that Mary believed that John liked?

(153) a. John [seems [to be likely [to appear [ _ to like carrots]]]]
b. What [did Dave think [that Mary believed [that John liked _ ]]]?

(154) a. John [seems [ _ to be likely [ _ to appear [ _ to like carrots]]]]
b. What [did Dave think [ _ [that Mary believed [ _ [that John liked _ ]]]]]?

I argue that although there is ample evidence that the picture of these dependencies as decomposed in (154)a/b is largely correct, current theory does not appear to have reached anything like a consensus as to what explains the fact that the system works in this way. The crux of the problem that these phenomena present for theory and analysis centers on the fact that it seems that we cannot approach these linked-local dependencies with a general reduction to the typical simple local cases in mind:⁷²

(155) a.
John appears [ _ to like carrots]
b. What [did John like _ ]?

72 Strictly, the "base" position of the movement illustrated in (155)a would be a derived position, connected to the underlying VP-internal position. My focus at the moment is on the properties of the "target" position, so this example does just fine in this respect. As it happens, I will later on suggest that the relationship depicted by the link in (155)a is not, in fact, a CHAIN, whereas the relation in (155)b is. What will be a CHAIN in the area of A-movement will be the connection between the Case/φ position and the vP θ-position (e.g., [TP John_i [T' T0 [vP t_i [v' v0 [ .. ]]]]]). Local passive structures will similarly constitute CHAINS (e.g., John was kicked _ ).

Any simple reduction of the linked-local A- and A'-movements above to local cases like these is not possible, because the properties which we might take to drive the establishment of these local relations (Case/φ-properties for (155)a and wh-properties for (155)b) are arguably not present in the intermediate positions for the more complicated structures involving embedding. In fact, when positions with the relevant properties are available, movement relationships which would then have to cross such positions to relate to a more distant landing site are impossible, as wh-island and superraising cases show.

(156) a. Dave wondered [what John likes _ ]
b. *What did Dave wonder [whether John likes _ ]

(157) a. It seems John was told [ _ to arrive on time]
b. *John seems it was told [ _ to arrive on time]

Wonder in (156) selects for a [+wh] complementizer, which can be a local/direct landing site for wh-movement. But when this property is present, it cannot be moved across, as (156)b shows. Similar restrictions hold for raising, prohibiting an NP from moving past a Case/φ position ((157)b). It is also impossible for an element which has satisfied the relevant properties to be moved further to satisfy them, as it were, "again".
Consider what are sometimes called "freezing" facts such as the following:⁷³

(158) *John seems that _ liked carrots (vs. It seems that John liked carrots)
(159) *Who do you wonder _ Bill liked _ ? (vs. You wonder [who Bill liked _ ])

So now how do we deal with such intermediate movement cases in the general framework of feature-checking and Last Resort? The idea of Last Resort has been a central notion in minimalist research: syntactic operations are not free, but must be driven or caused.⁷⁴ Such driving causal forces are subsumed under the notion of feature-checking. So what causes the intermediate movements in this sense?

73 There exist, however, some curious apparent exceptions to these generalizations. Consider, for example, so-called 'copy-raising' constructions (Rogers 1967):
(i) John seems [like/as if [he likes carrots]]
These share with the raising-to-subject constructions the property of having a non-thematic subject (as Potsdam & Runner (2003) show in some detail). But this suggests that the overt matrix subject John somehow enters into a movement or chain relationship with the embedded subject pro-form he in (i). I will not be discussing these constructions here, though the general approach to them in the present framework should be clear: these cannot be of the same embedded clausal type as raising predicates.

74 Like many ideas in current work, feature-driven operations have a longish history, present in very early work in the form of so-called triggering morphemes that were tied to various specific kinds of transformations (Klima 1968). In current MP work, the inventory of "operations" is now quite general, so there are not "construction specific" transformations that could be specifically triggered (or, at least, that is the aim for theory construction). Rather, what is triggered in current theory is some response out of a small set of general options (e.g., 'move', 'delete', etc.).

2.2.1. M-Features

Standard minimalist thinking views movement relationships as governed by a condition of LAST RESORT, which generally disallows syntactic operations unless they are required to license/check a property or feature that would otherwise cause the derivation to crash (see Chomsky 1995:280 for one technical version of this idea). This constitutes a rejection of the notion of Move-α developed within Government and Binding (GB) approaches in the 1980s and early 1990s. On this earlier GB view, operations were regarded as essentially free, but with restrictions imposed on either derivations or on levels of representation to rule out impossibilities. On the GB view, then, the work of stating principles governing movement was not typically handled within the statement of the operations themselves. Lasnik & Saito (1992), for example, suggest a general formulation "Affect-α" which simply says "Do anything (move, insert, delete) to anything". Constraints on levels of representation (like the ECP or Subjacency) were then understood to inspect the broad range of possibilities that arise from the general Affect-α formulation, and reject those outputs which did not comply with the conditions on the various levels (e.g., S-Structure, LF).

One way of looking at this would be to say that GB-type systems looked at grammatical constraints governing well-formedness as restricting outputs of otherwise unrestricted operations. The MP, in contrast, can be viewed as imposing conditions on the inputs to such operations, that is, in the statement of laws governing what can count as a legitimate structural description over which an operation can apply. Operations can be, and in fact are, still viewed as general in the MP (e.g., merge, move, insert, delete; as opposed to, e.g., "passivize"), but the leading idea for theory construction is one of economy: nothing can happen unless it is forced. Viewed in this way, the differences between GB and more recent work in the MP take on a bit more subtlety.
We can see the subtlety of the GB/MP difference as it emerges in current work's appeal to what I will call "M-features". Much current work which attempts to formulate analyses of successive movement consistent with Last Resort does so at the expense of progress towards genuine explanation. In effect, what much recent work does to motivate these movements in a way consistent with Last Resort is to adopt a brute force solution; that is, to posit a feature or property of the positions constituting the intermediate links in these composed dependencies. Call this a Merge/Move-feature, or M-feature for short.⁷⁵

This is potentially saying something quite exciting, perhaps despite appearances. We could understand this claim about the forces driving movement relationships as having hit upon a genuine explanation. Thus, movement is primitive: a deep, central, irreducible fact of the matter regarding the workings of the faculty of human language (FL). In minimalist terms, it is neither to be motivated with reference to the nature of the interface between grammar and other cognitive systems (natural interactions) nor in virtue of the internal coherence of the system based on virtually conceptually necessary assumptions about its workings. Movement is, rather, a basic property of the narrow syntactic component of FL.

On the other hand, the M-feature view could also be taken to be saying something perhaps more depressing. That is: we simply do not have a clue why there might be displacement properties in human language syntax. The best we can do at the moment is catalog the facts, and hope that something turns up which might lead us towards a better understanding. Or, slightly less depressing: we may have lots of ideas about why there ought to be such displacement properties, but no terribly strong reasons for thinking any one of them versus any other is right.⁷⁶

75 In certain cases this feature is suggested to be satisfiable by Merge alone, as in cases of expletive elements.
I return to this later on.

76 My sense is that we currently face something more like the latter situation in theoretical linguistics. The only way out is of course to keep pursuing the options. Note that the "depressing" picture needn't be taken to be all that depressing. Sometimes we go through stages in the development of ideas where really the best we can do is to put like things into the same box, and keep looking for good generalizations that have further predictive power. So while I will be writing here with a skeptical tone about what I'm calling "M-features" (e.g., EPP-/P-features), it's important to realize that Chomsky's recent generalization of these properties is a sensible move in this respect, in that it broadens the range of phenomena claimed to go "in the same box", whatever their ultimate explanation. My view, developed here, is that M-features actually pick out a heterogeneous set. Local and successive cyclic "movement" are not the same phenomenon.

M-features have taken on an increasingly important role in recent developments in minimalist syntactic theory. Chomsky (1999) suggests that the M-feature conception of movement be generalized beyond just the kinds of intermediate/non-target movements taken to be involved in successive cyclicity. He proposes alternative mechanisms to handle the licensing relations that in earlier manifestations of minimalist syntax had been understood to be the movement-driving ones (like Case/φ or wh-features). Fukui (1986) refers to such properties as Kase features; I will refer to them here as Core Licensing Properties (CLPs). So, these relationships between CLPs in Chomsky's recent work now fall under his PROBE/GOAL relations and the notions of "matching" and AGREE. And, having relocated the job of handling these CLPs in this potentially non-local (or, rather, less local) way, he suggests that what drives actual displacement is rather M-features, and that this holds for all movement.
This means that abandoning M-features in current approaches comes with the burden of rethinking movement generally: not just the mysterious successive cyclic cases, but the local cases as well. The A-movement example from above can serve to illustrate this recent generalization of the role of M-features, as well as the basic idea behind the AGREE operation. Whereas minimalist accounts previously understood all but the last of the sub-steps of movement to be driven by M-features (as in (160)a), now all such steps are so motivated, as in (160)b', following the independent Case/agreement (κ/φ) licensing which happens "long distance" via the operation AGREE. The agreeing item, dubbed the probe, scans its c-command domain for a matching element or goal (e.g., John in (160)b, in virtue of its matching κ/φ-features). Once AGREE takes place, it is the presence of M-features which drives the various sub-movements in (160)b':

(160) a. [κ/φ] [M] [M] [M] [θ]
John [seems [ _ to be likely [ _ to appear [ _ to [ _ like carrots]]]]]
b. PROBE ............ GOAL (AGREE)
_ [ [κ/φ] seems [ _ to be likely [ _ to appear [ _ to [ John[κ/φ] like carrots]]]]]
b'. [M] [M] [M] [M] [θ]
John [seems [ _ to be likely [ _ to appear [ _ to [ _ like carrots]]]]]

In (160)a note that there are three types of movement chains that we can differentiate in terms of what licenses their upper and lower 'links'.⁷⁷ There are: (i) chains with θ-tails and M-feature heads, (ii) chains with both head and tail characterized/licensed by M-features, and (iii) licensed (final landing site) heads with M-feature tails. Nowhere in this scenario is there a chain that resembles the basic local situation, with both the head and tail in substantive licensing configurations (e.g., a chain CH = ⟨DP_κ/φ, DP_θ⟩).
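The two-step logic just described (AGREE first, then M-driven displacement) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: a flat top-down list stands in for the relevant structure and for c-command, and the class, function, and feature names are hypothetical conveniences, not part of Chomsky's proposal or of the TCG mechanics.

```python
from dataclasses import dataclass

@dataclass
class Position:
    label: str
    features: frozenset = frozenset()  # licensing features borne by an element here
    m_feature: bool = False            # hypothetical flag: position bears an M-feature

def agree(probe_features, domain):
    """Scan the domain top-down and return the closest position whose
    features include everything the probe is looking for (the goal)."""
    for pos in domain:
        if probe_features <= pos.features:
            return pos
    return None

def m_driven_steps(domain, goal):
    """List the M-feature positions above the goal, lowest first: the
    landing sites of the successive sub-movements."""
    above_goal = domain[:domain.index(goal)]
    return [pos.label for pos in reversed(above_goal) if pos.m_feature]

# Toy rendering of (160)b/b': four M-feature positions above the base position of John.
DOMAIN = [
    Position("Spec-seems", m_feature=True),
    Position("Spec-likely", m_feature=True),
    Position("Spec-appear", m_feature=True),
    Position("Spec-to", m_feature=True),
    Position("John", features=frozenset({"case", "phi"})),
]

goal = agree(frozenset({"case", "phi"}), DOMAIN)
steps = m_driven_steps(DOMAIN, goal) if goal is not None else []
print(goal.label, steps)
```

Note that in this toy version nothing constrains which positions carry `m_feature=True`; the flags are simply set by hand wherever a movement step is wanted.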
(However, note that on at least one technical view discussed in Chomsky (1995), discussed above, there would be a chain connecting the base and final target positions.)⁷⁸ In (160)b/b', however, we see only two types of chain relation, one with a θ-tail and an M-feature licensed head, and the rest with M-feature heads and tails. Again, in the latter scenario κ/φ relations are taken to be an independent precondition for the M-feature driven movements: there must be a derivationally prior PROBE/GOAL relationship. M-features have thus, in Chomsky's recent work at least, become synonymous with "movement" itself.

77 I will here follow the common terminology of referring to the "top" of a given chain as its HEAD, and the "bottom" of a given chain as its TAIL. Take CHAIN for now to be a descriptive term of convenience, without intending commitment to a technical notion of CHAIN as opposed to MOVEMENT or a binding-type relation, or what-have-you. Later on I will commit to a specific view in this regard.

78 Chomsky himself appears to have lost interest in the potential differences between various technical ways of handling movement (see in particular his closing remarks in Chomsky 1999 on this topic, where both "chains" and "multiple merge" are labeled "terminological conveniences", with yet another set of terminological distinctions used to pick out the "more stern 'official' theory").

I find the general line problematic. The presence/absence of M-features is not principled. They are stipulated as obligatorily present wherever evidence suggests movement is not optional, and optionally present wherever evidence suggests movement is not obligatory. We might be tempted to view this (the M-feature Generalization) as a kind of reintroduction of the Government & Binding (GB-)era conception of MOVE-α.
What is the status of Last Resort when we can say that it necessarily applies where features are present that require checking, but leave the presence/absence of the driving properties themselves unregulated? But this move is not as clear as the GB-era conception of Move-α, given the lack of supporting machinery of the GB-type that has by-and-large been stripped away under MP assumptions. In GB we understood applications of the general "move anything anywhere" rule as being restricted by other formal and substantive requirements, typically (though not always) stated as conditions on levels of representation so that movements could not result in violations (of Case theory, Binding, the ECP, etc.).[79] So it's not true that optional M-features are reintroducing Move-α in this sense: we can't move anything anywhere. Rather, we can move when (i) the appropriate AGREE relationship holds, and (ii) there is an M-feature.

[79] I say "not always" because accounts typically varied as to whether conditions were built in to rule-applications or stated in terms of output filters (e.g., we can take movement to be restricted to occurring between elements in a c-command relation, or we can take movement to be "free" with conditions on CHAINS, or the like).

What governs the distribution of M-features is something of a mystery. This is, perhaps, appropriate, in the following sense. It amounts to the assertion that the existence of displacement phenomena in human language is an open, unsolved set of problems, perhaps an "imperfection" in Chomsky's (1995, 1998, 1999) sense. Why is there displacement? What is the content of the overt versus covert movement distinction? Were we fooling ourselves in GB-era architectures or early minimalist approaches by thinking that constraints on levels of representation (in GB) or checking of core licensing properties (e.g., "morphology driven movement") in the MP were really offering us an explanation for the relevant phenomena?
The early implementations of the Last Resort logic needed to appeal to "strong" versus "weak" flavors of core licensing properties: doesn't this distinction do what M-features are doing now (i.e., code the overt-covert distinction)? Are we to expect a development of the concept of M-features? Should we expect to find some independent motivation for their existence? It could be, but at the moment our understanding of M-features does not appear to really extend beyond this inter-motivation: where there is movement there is an M-feature, and vice-versa. All we really know is that they are not features of the sort that we used to take as the properties driving movement (e.g., wh/Q, Case, φ, etc.). It has been suggested for the case of successive cyclic wh-movement that the relevant M-features may be of a quasi-interrogative nature, but it is quite unclear why these properties should, for any local syntactic reason, become associated with the outer edges of embedded declaratives to drive the relevant intermediate movements, as in our example above.[80] Alternatively, it might be suggested that these features come along for free whenever there "will be" the relevant core licensing properties present in some later derivational stage. But this means that the presence/absence of these properties requires reference to arbitrarily large stretches of syntactic computation to establish their local legitimacy. Current worries in the minimalist literature about whether this or that view of derivations requires "look-ahead" or not ought to be focused now on the existence of M-features and with how their presence is locally justified. Other suggestions have explored the idea that such M-features are focus related, but this appears to run into similar problems regarding how to motivate them for exactly where they are needed to drive successive movement and not elsewhere (see Felser 2004).
And, in general, these features are required to be present at the edges of otherwise rather different kinds of categorial domains (at least C and v, and perhaps D as well) in order to drive the relevant movements "out of" the domains Chomsky calls phases. The question of why these properties should be associated with exactly the domains that are otherwise suggested to be phases has no principled answer that I am aware of. Why should M-features line up with phase-edges?

[80] See Lasnik & Uriagereka (forthcoming) for some discussion of this point and the notion of "look-ahead" in adding what I'm calling M-features.

For A-movement, the postulation of M-features takes the form of what amounts to a return to "square one" regarding the Extended Projection Principle (EPP) introduced in Chomsky (1982) (see Bošković 2003 for some discussion of this point), essentially stipulating the need for filled specifiers for the relevant intermediate/non-target head elements, and thus artificially creating positions that then cannot be passed over without violating minimality of movement restrictions.[81] The salient alternative approach to the locality of such movements is to introduce an analogously artificial restriction on movement, stipulating that A-movement must go from INFL to INFL (T-to-T) and A'-movement must go from COMP to COMP (C-to-C). I think that this is correct as a description, as will become clear in our development of TCG below. But the important question that needs answering is, "why?". Stipulating T-to-T or C-to-C movement appears to be Bošković's (2002) conclusion regarding what I'm here calling M-feature-driven movement, namely: M-features do not exist, but movement must go through these local positions anyway (see Grohmann 2003 for an extension of this stipulation). Why? Because that's the way movement works: it's local. But this is not, to my mind, solving the problem. To be fair, Bošković
does address half of the problem and this is important (see our analyses of raising and the "EPP" in Chapter 3). What he shows is that where M-features seem to be redundant with core licensing properties, we can do just fine handling facts without the postulation of M-features. However, this still leaves us with the intermediate movement cases as motivations for M-features, at least if we stay within the general borders of approaches consistent with Last Resort.

[81] Frank (2002) has suggested a sort of "doubling" of EPP-feature types, so that there are both A-movement-relevant EPP-features and A'-movement (wh-)EPP-features, that is, two types of M-features (he posits also a doubling of Case properties, one relevant for A-movement and one for A'-movement: his so-called wh-Case).

To sum up, I take current theory to be in the following bind:

(161) a. Empirically, intermediate (successive cyclic) movements seem to be real
      b. If LAST RESORT governs syntactic operations without exception, there must be some feature(s) involved in these intermediate movements, ... BUT, ...
      c. ... the relevant motivating properties can't be in the set of core licensing properties, since these (e.g. Case, WH) are typically satisfiable only once
      d. So, either:
         (i) M-features exist, and are what is responsible for at least intermediate movements,
         OR,
         (ii) M-features do not exist, and something else is responsible for intermediate movements (so either Last Resort is not completely general, or something overrides it, or it can somehow be restated so as to provide non-feature-driven movement)

Chomsky's recent move in the face of this bind is (161)d(i). However, Chomsky (1999:33) notes that what I am calling M-features are simply "selectional" features (linguistic properties) that are "uninterpreted" and which moreover constitute "an apparent imperfection, which we hope to show is not real by appeal to design specifications".
This is the general route the present work aims to take: to show that M-features as such are dispensable, though the general property of the syntactic component they have been (spuriously) introduced to describe is certainly real. In my view, the earlier stages of the MP were closer to being on the right track in the following sense: movement is about what I called Core Licensing Properties (CLPs). Where things have gone astray is in the specific kinds of efforts made to render the intermediate/non-target movements in evidence in cyclicity phenomena consistent with Last Resort. The central direction of early versions of minimalist inquiry regarding CLPs strikes me as being essentially correct, and I think that this is the direction inquiry should continue to go in the characterization of local relations and dependencies. The missing piece then, with such a story about local relationships in place, is a way to understand how to truly reduce the non-local cases to the local ones.

2.2.2. Cyclicity without M-features?

The larger context of the minimalist program includes the general idea that syntactic operations are governed by Last Resort. Technically this manifests as the idea that movement must effect some (local) licensing of properties that would be otherwise ill-formed (i.e., without the movement operation to create a local context for their satisfaction/licensing). A local instance of wh-movement happens in order to check/license the wh-properties specifying the interrogative properties of the matrix clause (as in Who did Bill hit?). But the situation involving embedding, illustrated above, requires that the intermediate C-element not be of the sort for which WH/Q-properties are present. So something else must drive these intermediate movements.
Work in the MP has taken one of two general approaches: either (i) some inventory of special features is introduced to drive the intermediate/non-target movements (i.e., M-features as discussed above), or (ii) some attempt is made to derive the effects of local movements without supposing that there is some special property there in the structure. I turn now to (ii).

This latter ((ii)-type) strategy has manifested in a number of different ways. One route of thinking along these lines simply denies SCM and proposes that non-local movement is "direct" (one-fell-swoop), and thus attempts to account for local phenomena exhibited along the movement path by other means. Zwart (1993) for example supposes that in long-distance wh-movement some dummy items are first inserted in all the intermediate CP-related positions, and that the wh-element itself then moves directly to the target position (thus obeying his view of movement economy: "Fewest Steps"). However this does not appear to be much of an advance over the M-feature view, but rather a different conception. The problem under this approach is then: what drives the insertion of intermediate elements that then allow the moving wh-element to pass over those positions? Radford (2001) argues for a convergence-based understanding of phases which is wed to a one-fell-swoop view of movement as well (see Felser 2004:548 for some critical discussion). Epstein & Seely (1999), as mentioned briefly above, argue on the basis of technical issues regarding chain definitions that movement must be similarly a one-step process. It would take me too far afield to consider all the alternative possibilities involved in contrasting stories positing SCM with those of the one-fell-swoop variety. I will therefore narrow my focus in what follows to examining approaches that in one way or another have appealed to linked-local movements but which have attempted to dispense with M-features.
However, it is worth pointing out that this general class of approaches requires something like Zwart's solution, as discussed above. Somehow the locality of SCM-type effects needs to be accounted for, and it's unclear how one-fell-swoop stories can do this without invoking strongly non-local principles. For example, we noted above regarding Chung's (1983, 1994) facts from Chamorro that the relevant agreements that manifest are tied to each and every relevant local domain on the movement path. It's unclear how to do this without postulating linked local relations of some kind. And note as well that completely disallowing linked-local movements requires a special view for understanding the binding-related connectivity effects discussed above (???). Moreover, as we will see in Chapter 3, there appear to be situations in which it looks like SCM is maybe not a necessary feature of the relationship (i.e., situations where the local effects do not show up uniformly). We saw some instances of this in the form of "dialect variability" in Irish English Q-stranding vs. standard English, variability in the possibility of S-V inversion effects in Spanish, variation in matrix/embedded SAI across different dialects in English, and so on. I will suggest in the next chapter a general schema for analysis which relies on the possibility of SCM enforced in the series of workspaces defining the derivation versus a rather different sort of relationship (unselective binding of the sort discussed in Pesetsky 1987 to handle so-called D-linking) that may hold over output structures. If the general idea is on track, then we will actually need both SCM and some other sort of potentially non-local (or less-local) kind of relationship in order to account for the presence/absence of the sorts of effects canvassed above. So, with an exception or two in the discussion below, I will from here on drop discussion of one-fell-swoop accounts.[82]

[82] This is not to say that no such account is possible, just that there are reasons for suspicion of one-fell-swoop which do not arise for the SCM view. I take the general research agenda of attempting to reduce all apparently non-local phenomena to local domains seriously, which is why the interest in SCM-type accounts. I refer the reader to the references cited in the text for discussions of one-fell-swoop views.

Another instance of the (ii)-type approach (attempting to live without M-features) comes in the form of accounts claiming that movement is supercyclic. Boeckx (2001), following ideas introduced in Takahashi (1994), suggests that intermediate movements are motivated by a direct condition on the "distance" a given movement relation can span, such that movements must involve the edge of every intervening phrase between the base and final/target landing site positions. This view predicts, in Abels' (2003) terminology mentioned above, uniform effects on the movement path, and so clearly would require further assumptions to capture the various phenomena that appear more selective in their manifestation (e.g., wh-copying, the binding/connectivity facts).

Heck & Müller (2000) propose that Last Resort is a violable condition, and posit a system of violable constraint interaction that conspires to drive local movements without direct feature-checking involved at the non-target sites (see their notion of "Phase Balance"). However, as pointed out by Felser (2004), their approach relies on undesirable look-ahead logic (i.e., cast in the framework of Optimality Theory, their approach requires reference not just to phase-sized numerations but rather to arbitrarily large ones).

Castillo & Uriagereka (2000) offer an approach that is interesting for the present investigation in virtue of the relation of their proposals to the TAG-theoretic sort of analyses discussed earlier.
I discuss them here together, which allows us to make some points about the "movements" assumed in TAG-analysis within elementary trees and their need for something like EPP-requirements. Castillo & Uriagereka (C&U) suggest there are initial movements that are local, and which involve the ultimate target position housing core licensing properties. Like TAG approaches, they suggest that such an initially formed dependency can then be subsequently "stretched" by adding intervening elements. However, two aspects of their approach are unlike analyses offered in the TAG framework. First, C&U suggest that individual merge operations do the job of "stretching" the initial local dependency by adding intervening elements one-at-a-time, as opposed to the en bloc insertion (adjoining/splicing-in) of auxiliary structures as in TAG. C&U take this one-at-a-time addition of intervening elements to be a merge-based generalization of so-called "tucking-in" that Richards (1997) argued to be relevant for movement operations (i.e., allowing non-root mergers, which is what C&U need to effect the incremental stretching of local dependencies with step-wise additions of single elements). A second difference between C&U's approach and standard TAG analyses is that they take the local movement relationship to be one that connects the moving item with its actual "target" landing-site. It's worth taking a moment to show why this isn't how things are typically handled in TAG-theoretic elementary trees. We can illustrate this with what corresponds to cyclic A-movement in TAG, though the point carries over directly to A'-movement cases as we will see. Consider:

(162) a.
John seems to tend to appear to like carrots

The TAG derivation for this example might be (but as we will see, typically is not) understood to invoke the following elementary structure and the corresponding auxiliary trees:

(163) ELEMENTARY TREE
        [CP C0 [TP [DP John] [T' T0 [vP [DP t] [v' v0 [VP [V0 like] [DP carrots]]]]]]]
      AUXILIARY TREES
        [T' T0 [VP [V0 appear] T']]
        [T' T0 [VP [V0 tend] T']]
        [T' T0 [VP [V0 seem] T']]

Assume that T in the elementary structure above is finite (i.e., that this structure converges as an independently well-formed clause). This corresponds to Castillo & Uriagereka's idea for the wh-movement case above. This particular sort of division of basic tree structures turns out to be problematic in TAG, but stepping through this derivation will allow us to see more clearly, in comparison, how at least one version of current TAG analysis of such phenomena is taken to work. As discussed earlier in the last chapter, auxiliary structures have matching root and foot nodes, which are T' for raising predicates (the auxiliary trees above). This reference to X'-level categories that are not dominated by a corresponding maximal node (XP) may however be inessential to the account.[83]

[83] This is an important issue connecting with the position one takes regarding primitive "bar-level" distinctions. In a BPS-style account, for example, it is unclear how one would motivate this particular property of root and foot nodes in auxiliary tree structures without abandoning the central idea of relativistic phrase-level status. Frank (2002:?n?) suggests that it may be a straightforward task to render this approach consistent with a purely relational conception of bar-level distinctions. But this is not clear to me. He also correctly notes that having primitive C'-nodes is not at all inconsistent with Muysken's (1982) approach which worked with feature sets (e.g., ±max, ±project).
Muysken's initial view is relational in that projection is understood to be a coherency condition on the values and structural distribution of such features in domination sequences (e.g., an X{+max,-proj} cannot coherently enter into an immediate dominance relation with another X{+max,-proj} node, etc.). Thus a C'-node with no dominating CP would simply be a {-max, +proj} element. It does seem unreasonable to think that verbs might differ, in this feature-based view, as to whether they select a C' or a CP conceived in Muysken's original feature-based terms. However, Chomsky's conception in the BPS approach is different in this regard. On that view projection level is purely relational: there can be no C' without there being a CP.

In any event, these root and foot nodes can be understood to match/correspond to the T'-node in the elementary tree in (163). The movement of the nominal element (John) is understood to take place from the base/thematic position (here: specifier of v) to the derived matrix position directly, consistent with the general idea in play within TAG approaches that all dependencies should be localized to basic tree structures that form the input to TAG adjoining and substitution. In this case, the relevant operation is adjoining, which works as follows for the combination of the elementary structure and the raising auxiliary headed by appear:

(164) [Tree diagram: the elementary tree and auxiliary trees of (163), with the auxiliary tree headed by appear targeting the T' node of the elementary tree.]

(165) ELEMENTARY TREE (with the appear auxiliary adjoined)
        [CP C0 [TP [DP John] [T' T0 [VP [V0 appear] [T' T0 [vP [DP t] [v' v0 [VP [V0 like] [DP carrots]]]]]]]]]
      AUXILIARY TREES
        [T' T0 [VP [V0 appear] T']]   (adjoined above)
        [T' T0 [VP [V0 tend] T']]
        [T' T0 [VP [V0 seem] T']]

Adjoining generally effects a kind of splitting of an atomic node (here: T') in the elementary tree, splicing the auxiliary structure into its position as depicted above.
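The splitting-and-splicing step just described can be pictured procedurally. What follows is only a minimal sketch under stated assumptions, not an implementation drawn from the TAG literature: the `Tree` class, the `adjoin` and `terminals` functions, and the leaf labels `T(elem)` and `T(aux)` (used to keep the two T heads apart) are hypothetical names introduced here for illustration. Adjoining is modeled as an in-place split of the targeted T' node: its upper half takes over the auxiliary tree's material, and the auxiliary's foot node takes over whatever the targeted node used to dominate.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Tree:
    label: str
    children: List["Tree"] = field(default_factory=list)
    is_foot: bool = False  # True only on the foot node of an auxiliary tree


def find_foot(tree: Tree) -> Optional[Tree]:
    """Return the foot node of an auxiliary tree, or None if there is none."""
    if tree.is_foot:
        return tree
    for child in tree.children:
        foot = find_foot(child)
        if foot is not None:
            return foot
    return None


def adjoin(site: Tree, aux: Tree) -> None:
    """Splice `aux` into the host tree at `site`, modifying the host in place."""
    foot = find_foot(aux)
    if foot is None or not (aux.label == site.label == foot.label):
        raise ValueError("auxiliary root and foot must match the adjunction site")
    # Lower half of the split node: the foot inherits what the site dominated.
    foot.children, foot.is_foot = site.children, False
    # Upper half of the split node: the site now dominates the auxiliary material.
    site.children = aux.children


def terminals(tree: Tree) -> List[str]:
    """Left-to-right list of leaf labels."""
    if not tree.children:
        return [tree.label]
    return [leaf for child in tree.children for leaf in terminals(child)]


# Elementary tree, roughly as in (163), with John already in its derived position.
t_bar = Tree("T'", [
    Tree("T(elem)"),
    Tree("vP", [Tree("t"),
                Tree("v'", [Tree("v"),
                            Tree("VP", [Tree("like"), Tree("carrots")])])]),
])
elementary = Tree("CP", [Tree("C"), Tree("TP", [Tree("John"), t_bar])])

# Auxiliary tree for the raising predicate: root and foot are both T'.
appear_aux = Tree("T'", [
    Tree("T(aux)"),
    Tree("VP", [Tree("appear"), Tree("T'", is_foot=True)]),
])

adjoin(t_bar, appear_aux)
result = terminals(elementary)
print(result)
```

In this toy run the dependency between John and its trace is never re-created: it is simply pulled apart as the auxiliary material lands between them, and the leaf order also shows that the elementary tree's own T head ends up below the T head contributed by the auxiliary tree.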
The same operation could then be repeated to adjoin (splice-in) the other two raising predicates. Note that the addition of the auxiliary structures serves to "stretch" the movement dependency created in the initial elementary tree; again, there are no intermediate traces of movement in the TAG architecture.[84]

[84] The absence of intermediate traces is touted as a virtue in these approaches, though I actually think this raises some puzzles for TAG analysis; see §1.5.1, (94)-(97) above, and §3.2 & §3.3 below. However, see Frank & Kroch (1995) for some relevant discussion relying on an alternative way to think of some of the relevant connectivity facts.

This specific derivation has to my knowledge never been advocated for RtS phenomena within TAG, and it's easy to see why. The element which constitutes the matrix T-head (T0) in the elementary tree is no longer the matrix instance of this category in the output. Rather, it is necessarily stranded in the lower clause, with the T0-element in the auxiliary tree becoming the highest instance of this head. If finiteness is encoded on these T-elements directly (i.e., if T comes as either {+fin} or {-fin}, not a trivial assumption), then this sort of derivation strands the T{+fin} element in the lower clause. Also, if we are taking the agreement and Case properties of the T-element in the elementary tree to enter into some kind of checking/licensing relationship with the "raised" nominal, then similarly the relevant agreement properties should be stranded in the downstairs clause as auxiliary structures are adjoined. For reasons of these sorts, researchers working within the TAG framework have developed rather different analyses than the one sketched above. For concreteness I concentrate here on the proposals of Frank (2002).
Frank's approach to Raising-to-Subject (RtS) assumes a rather different inventory of structures constituting the input to the TAG derivation:

(166) ELEMENTARY TREE
        [TP [DP John] [T' [T0 to] [vP [DP t] [v' v0 [VP [V0 like] [DP carrots]]]]]]
      AUXILIARY TREES
        [T' T0 [VP [V0 appear] T']]
        [T' T0 [VP [V0 tend] T']]
        [T' T0 [VP [V0 seems] T']]

The important difference is in the nature of the elementary tree with respect to finiteness, and the nature of one of the auxiliary trees (for this example, the auxiliary structure headed by seems). The relevant elements which differ from the problematic derivation given in (165) above are boxed in the diagram. Frank assumes the auxiliary tree that will become the matrix verb is itself marked as {+finite}, while the top of the elementary tree containing the local NP-movement has the properties that it manifests in the output, namely that it is {-finite}. This property is thus assumed to be a property of the relevant T-nodes. Setting aside for now how the adjoining operations are constrained so that the one finite raising auxiliary ends up "at the top", we can see now that the trouble that arises in the derivation given above in (164)/(165) doesn't arise for (166), since the elementary tree is itself taken to be non-finite (as it appears in the target derived structure). The finite/matrix structure is introduced by an auxiliary structure. This requires that there be some force which is responsible for the initial A-movement within the elementary structure. Frank (2002) formulates a version of the EPP to get this result (the view is interesting but requires too much discussion to get into here; see Frank (2002). The point here is simply that TAG requires something other than Core Licensing Properties to drive the initial movement).

Now observe the somewhat subtler though similar situation that arises in wh-movement. A TAG derivation for the wh-movement case above would go as follows.
We assume the elementary and auxiliary trees underlying the TAG derivation of (167)a as in (167)b, and the first of two adjoining operations works as pictured in (167)c:

(167) a. What did Dave think Mary believed John liked?
      b. ELEMENTARY TREE
           [CP [DP what] [C' C0 [TP [DP John] [T' T0 [VP [V0 like] t(what)]]]]]
         AUXILIARY TREES
           [C' C0 [TP [DP Mary] [T' T0 [VP [V0 believe] C']]]]
           [C' C0 [TP [DP Dave] [T' T0 [VP [V0 think] C']]]]
      c. ELEMENTARY TREE (with the believe auxiliary adjoined)
           [CP [DP what] [C' C0 [TP [DP Mary] [T' T0 [VP [V0 believe] [C' C0 [TP [DP John] [T' T0 [VP [V0 like] t(what)]]]]]]]]]
         AUXILIARY TREES
           [C' C0 [TP [DP Mary] [T' T0 [VP [V0 believe] C']]]]   (adjoined above)
           [C' C0 [TP [DP Dave] [T' T0 [VP [V0 think] C']]]]

Following the adjoining step pictured in (167)c, the second auxiliary tree would be similarly spliced-in (adjoined), yielding the full output structure (or, again, perhaps the two auxiliaries are first combined and then adjoined as one). As adjoined material is spliced-in in this manner, we see again that the movement dependency formed over the input elementary tree (in (167)b) is systematically "stretched". The addition of the second auxiliary tree would further stretch this elementary-tree-defined relationship, yielding in the output the superficial appearance of a "long-distance" syntactic dependency.[85]

The present point however is that the C-head that on MP (and other) views would be taken to house the wh-properties driving the movement to the ultimate target position is the C-head in the elementary structure above. But it is not this head which ends up as the "matrix" C. That head is provided by an auxiliary tree. So, as with the raising derivation given above, there must be some other property in the elementary tree which motivates the initial (and only) "movement". TAG derivations of this sort thus need something like the M-feature (EPP-type) requirements. These matters need not distract us any further. Three points emerge from this digression about TAG versus the C&U proposal which do concern us however.
First: the view which Castillo & Uriagereka adopt involves an initial CLP-driven movement operation, and this is not how the comparable TAG derivations are usually taken to work. Rather, conventions on the adjoining operation in TAG are invoked such that wh-licensing is built into the "splitting" of nodes that accompanies the splicing-in of auxiliary structure. Issues arise in this TAG view however concerning (e.g.) do-support and auxiliary inversion generally, which I will not get into here (see Frank 2002, Rogers 200? for some discussion and some solutions within the TAG approach).

Second: our sketch of how the TAG derivations do work has brought to the surface the fact that the elementary-tree-internal movements in NP-raising and wh-movement need to be driven by an EPP-type requirement. In later discussion (Chapter 4) we will see that it may be possible to import some of the assumptions about structure and category (of the Brody-type discussed at the end of Chapter 1) into the TAG framework to do away with EPP-type requirements on the movements within elementary trees. Sketching this possibility will be important to our Chapter 4 discussion of the differences between the TCG approach developed here and TAG. So C&U's idea of an initial CLP-motivated movement may turn out to be of interest in discussions of EPP-type requirements across the TAG and MP-type frameworks.

[85] I'm assuming here that the auxiliary structures are spliced-in one at a time. Alternatively they could be joined together first and then spliced-in in a single instance of adjoining. See Frank (2002) for discussion.

Third: C&U's discussion turns out to be of interest in the context of the TCG approach developed here for another reason. In particular, they make use of a notion of distinctness in discussing wh-islands that is helpful for us to consider. Take the wh-island violation in (168):

(168) ?What did John wonder whether Mary thought Dave liked _ ?
On C&U's view this begins with the following local movement:

(169) [CP what [C' 0-[+WH] [TP Dave liked t(WH)]]]

They then suggest that items are added in between a moved wh-element and the structure containing its base position. In wh-island cases, as C&U point out, the derivation will inevitably reach a point where a "like" C-element will have to be introduced and projected below the initially moved wh-element and its associated CP structure. Such situations, they suggest, result in pathology (the derivation crashes or perhaps cancels at this point, or is perhaps problematic at the interfaces for this reason). I have already mentioned the role that distinctness will play in enabling SCM in the present account. Later on I will consider a story similar in spirit to C&U's as an account of wh-island and other related phenomena.

Consider now another reasonable line of thinking which begins with an observation about one way we might simply deny SCM and implement the "one-fell-swoop" logic. We might be tempted to say that intermediate movements are impossible for exactly the reason that they cannot be final landing sites for elements (as noted above; see (155)-(159)). Locality of movement would then be understood in terms of closest possible landing site. This would be in place of the somewhat odd counterfactual view of "closest" which is often appealed to, under which the "closest" landing site is simply one that belongs to the right more general super-category of positions which could have otherwise hosted the licensing features to make the position an actually licit target/final landing site. That is, non-finite T is of the type that could have, for example, hosted Case and φ-properties, if it were only of the right sub-type of the general type T; embedded declarative C is of the general type that could have hosted wh-properties, and so on.
But this general outlook about categories being of a more general type which could have housed the relevant core licensing properties raises another possibility. We could maintain the idea that intermediate positions do not properly license elements that move to them, but still somehow make use of the observation that such intermediate landing sites are in fact "of the general type" that could otherwise serve as target sites, if only they housed the relevant features specifying the appropriate sub-type. The idea would be to separate out landing-sites for movement from the motivating forces of movement. We thus might alternatively pursue a kind of "false-advertising" view that could support the SCM view after all. Suppose subordinate elements can see up the c-command path and can detect the general (super-)category of an element as a possible suitable target of movement, and this possibility is enough to locally license movement to that element's domain. The fact that this would turn out to not license the position as an "ultimate" landing site simply presents the situation which we want to obtain, namely that the element is then in a higher position from which it may seek a target in the next higher domain, and it is still "active" in terms of having its properties not yet satisfied (licensed/checked). Note that this perspective would make movement contingent on the needs of the moving element, thus falling under some version of GREED, in contrast to approaches which either partition out the responsibility for movement triggering between the source/target positions and the moved element, or else make the driving force of movement the upper element's responsibility (e.g., so-called ATTRACT-based views). A fully attract-based approach couldn't support the false-advertising view of cyclicity, obviously, since the idea is that the relevant properties that could actually license movement are "not there".
And even if we do not develop an attract-based view, the basic idea seems pretty stipulative. Why should only the category, and not its features (or their absence), be detectable in the local context? Maybe however this view could be serviced as a landing site theory, with some other motivation given to drive the movement operation itself.

There is another way of looking at things that relates to remarks made earlier in this discussion (specifically when introducing Chomsky's DbP view in the context of our WS/O-distinction; see §1.4). That is, we might maintain that it is the local ill-formedness of a to-be-displaced item in the context of its base position that is sufficient to trigger movement. Thus, local contexts with elements in their base position (with no features checked) could be understood to somehow expel their uninterpretable contents. SCM would then be driven or not depending on how the system assesses the well-formedness of sub-stretches of derivation: localizing inspection of the derivation for convergence would establish sub-domains which would have to have certain elements displace in order to be well-formed. This localized non-feature-driven view of movement as LAST RESORT might also be coupled with the "false-advertising" view suggested above (so there still would perhaps have to be a category of the right general type to house the expelled element). In particular, having some notion of localized convergence evaluation (output "size" restrictions) seems promising, since we have a theory involving the idea that an item with uninterpretable properties might crash a derivation if these properties are not checked/licensed. This can perhaps be exploited to drive local movements. So, one interesting possibility for SCM is to consider shrinking the domains over which we evaluate well-formedness, so that an element could in principle violate Last Resort in order to avoid rendering its local environment ill-formed.
And this is what the MSO-type vision of Chomsky's DbP does (even if the specific categories he has in mind as phase-inducing aren't quite right, on which see below, the general idea has the right form). We can regard this intuitively as fitting with the view of uninterpretable features as "viral"; that is, local contexts are forced to expel the "sick" elements in order not to remain "infected". 86 Also, we have the idea that expelled items might have to move to a category of the right general type, which again we take to mean a category of the type that could in principle house the right sort of core licensing properties. Expelled items, needing to license their properties, go to the closest place where it looks like this could happen. That this does not actually happen with intermediate movements simply means that the relevant moving element remains a source of "infection" for the next highest domain, and must be expelled again to the next super-ordinate domain, and so on. (Presto! SCM!) Lasnik & Uriagereka (forthcoming) propose a view of this sort, which relies on the idea of "expelling" elements "up the tree" as each locally evaluated structure is forced to come to terms, as it were, with a contained element housing viral properties.
Let us sum up these two ideas as follows:

(170) EXPULSION OF VIRAL ELEMENTS: elements with uninterpretable features must move when necessary; this means either (i) to check a feature, or (ii) to avoid crashing a derivation when a local structure is evaluated 87

(171) MOVING INTO CATEGORIALLY LIKELY CHECKING CONFIGURATION: elements know where to look to get their uninterpretable features checked/satisfied (but the features aren't always there)

What drives SCM then is the presupposition that there are sequences of categories between a base and a target position for a "to-be-moved" element, and that each category in this sequence just happens to have the following property: it is an element with a subset of the properties of the target landing site. I will take this to be central in the analyses of the next chapter.

86 On the notion of features as "viral", see Uriagereka (1998) and a discussion in Lasnik (1999b).
87 This is more or less a vision of movement economy enshrined in Lasnik's (1999a) notion of Enlightened Self Interest (versus standard Greed).

2.3. Chapter Summary & Further Critical Remarks

We have encountered some potentially interesting preliminary ideas that might support SCM without the postulation of intermediate M-features. Recall from above that we identified the following two general issues (borrowing from the formulation in Felser 2004) pertaining to MSO-type phases of derivation and SCM:

(172) Assuming SCM exists, ...
a. What motivates it? Properties of the moving element? Properties of the intermediate target? Both? Neither (i.e., something else)? (TRIGGERING?)
b. What are the mechanics of movement like such that unlicensed features {*F} do not remain to cause convergence problems in spelled-out domains (either on the copy, or on the remerge view)? (CONVERGENCE?)
The discussion in the previous section has focused on some alternative accounts addressing the TRIGGERING problem, and we have so far not said much about the CONVERGENCE issue as stated above, though, as we just discussed, at least one type of solution to the TRIGGERING issue relies on assumptions regarding local convergence evaluation. First, note that our canvassing of some of the empirical terrain evoked in discussions of SCM raised the following additional problem, related to both (172)a and (172)b; call this the DISTRIBUTION problem:

(172) c. What is the range of potential non-target landing sites for SCM? (DISTRIBUTION?)

This dovetails with the distinctions drawn from Abels (2003) regarding uniform presence/absence of effects along movement paths versus the possibility of "punctuated" effects. Note that on a purely direct-feature-driven view of movement (e.g., of the sort postulating M-features), the triggering and distribution problems are the same. But this is not necessarily so on other views, as we will see in a moment. Specifically, we encountered the following ideas with respect to TRIGGERING:

(173) a. M-features exist, and these motivate intermediate movements (consistent with Last Resort)
b. Intermediate movements occur (consistent with a version of Last Resort) in order to avoid crashing the derivation when local sub-structures are evaluated

Chomsky's (1999) position seems to couple (173)a and (173)b. But if (173)b can be shown to suffice, then one might think that we can dispense with M-features (a desirable outcome, as noted, on both Chomsky's view 88 and following the general argumentation above in §2.2.1). But it's not clear that this localized vision of last resort actually does suffice. That is, on a phase-based view, the DISTRIBUTION question is supposed to get an answer in terms of whatever we motivate as the relevant phase-inducing heads, coupled with localized convergence evaluation.
If non-feature-driven movement can happen on the view of Last Resort sketched above (to avoid a local crash), then we might expect these movements to be tied to the "edges" (external domain) of phase-inducers.

88 As noted above, Chomsky regards these properties as an apparent imperfection that we hope on minimalist grounds to show is "not real".

However, I will now argue that even on these views we may still need something like M-features (or, as I suggest below, a different view of the categories present in verbal extended projection series). Recall that on Chomsky's view the complement domain of a phase-inducing head spells out when the next such phase-inducer is introduced. But on the assumption that C and v are the phase-inducers, this means that the complement domain of, for example, v will not spell out until the next phase-head (C). But then why should there be movement to the edge of vP, and not any other position between the phase-heads? If we can show that movement is to the edge of vP, and not higher, this suggests that the way to have Last Resort consistent movements would be to go with Chomsky's M-feature view. And if this is correct we also need to have movement target positions below the root. This may in fact be independently required, but the present point is that on Chomsky's view of phases the localized convergence motivation doesn't work to drive elements to the edge of vP. That is, it would not be strictly the last resort to move to such edges, since it is only after introducing further material that the spell-out of the complement domain will be required. A strict last resort view would be to have the movement justified only at the point in the derivation where the problem arises, and as we have just seen this is not when a phase-inducing head is introduced. To see the point, consider the following derivation (deploying a version of the reduced structures from Chapter One for convenience):

(174) a. D wh b. V ? D wh c. v V ?
D wh

In (174) we have a derivation with a wh-element as the object of V, up to the point where v, a phase-inducer, is introduced into the derivation. At this point, there is no immediate threat that the wh-element will be stranded in this position. So unless we assume that the operation driving movement has some look-ahead capability, there is no reason under the localized Last Resort logic that would drive movement to the edge of vP. 89 The derivation continues, adding the subject-D to get the external θ-role from v (174)e, adding T (174)f, and moving D to T (174)g:

(174) d. v ? V ? D wh e. v ? V ? D wh D f. T ? v ? V ? D wh D g. T ? v ? V ? D wh D D h. C ? T ? v ? V ? D wh D D

89 One might object to this, pointing out that the 'lookahead' required would still be fairly local (only up to the next phase-head). But on Chomsky's view, if C and v are the only relevant phases this could be an unbounded stretch of structure (e.g., as he argues for successive cyclic raising environments, which are assumed not to introduce phase-distinctions). For the SCM in raising, as mentioned, Chomsky posits EPP-features to drive the local movements, but if there are no phase heads in such sequences then these motivations are decoupled. The point of the argument I am running in the text is that there is reason to think that even where we do have phase-divisions we would need M-features anyway.

The last step pictured here is the addition of the next alleged phase-inducer (C). Do we need spell-out of the complement domain to happen prior to this step? Yes, if there is to be movement to some position below C; otherwise it's not obvious that the wh-element couldn't first move directly to C and have the spell-out operation apply afterwards. So this means that the local movement out of the complement domain of v must happen prior to the addition of C. The natural point in the derivation to apply the Last Resort logic would be between steps (174)g and (174)h. But then, where does the wh-element move?
And why? It seems that moving to adjoin to the T-domain is just as plausible as moving to the edge of vP. (Though note that movement to T would obey extension/root-merge, and movement to v would not; assuming this is not an issue, both possibilities remain.) However, the kinds of facts discussed in §2.1.5 strongly suggest that if there is very cyclic movement (within clauses) then it must involve a position below TP (see, e.g., (139)-(140)). Note as well that we cannot apply the logic described above of moving to a "categorially likely checking configuration", since it's completely unclear that v ever hosts the relevant CLPs. The conclusion is that if we need movement to the edge of vP we need an M-feature, as Chomsky proposes. Or, we need to configure the general outlook such that derivations "know" that there is an upcoming second phase-inducing head. Chomsky has suggested that derivations may begin from limited inventories of elements (sub-arrays), which are understood to be selected from the lexicon in such a way that they can contain only one phase-head. This could potentially solve the problem if we can somehow restrict sub-arrays to containing only information that will end up below phase-inducing heads, that is, material constituting the complement domain. Then we might motivate movement to the phase-edge of v (in the example above) in terms of a condition keyed to the introduction of a second sub-array containing another phase-inducing element. This would be conceptually similar to Uriagereka's idea of having "derived terminals", essentially treating phase-sized arrays as atomic elements with respect to later stages of derivation. It would also be quite close to a "mini-TAG" view, which posits phase-sized elementary tree domains. But it's unclear that this isn't simply a statement of the problem. Why couldn't the initial phase include everything up to, but not including, the C-element? At issue here is how we partition structures into phases/spell-out domains.
Having them restricted to containing only a single phase-head doesn't obviously constrain what else can be in an initial sub-array, so they may or may not include material above v. If so, then the problem of distinguishing phase-edges from structure above such heads (but below the next phase head) still arises. This situation is general. Suppose that the wh-element has somehow reached the vP edge (the double v's here can be taken to be an adjunction structure, e.g., vP ? vP, or a maximal and intermediate level vP ? v'; it doesn't matter for present purposes).

(175) T ? v ? v ? V ? t Dwh D D wh t D

The same situation described above arises here. The next step introduces a C element. The step after that presumably introduces a selecting V, perhaps with a local object, and only after this might we encounter another v, which by assumption forces the complement domain of C to spell out. So, again, we either need the movement not to be, strictly speaking, consistent with last resort, or we need an M-feature associated with C. We have reached then two main conclusions. First, the expelling-of-viral-elements story, and any similar approach seeking to drive cyclic movements out of independently evaluated domains based on convergence needs, requires some additional assumptions to get things to work. In particular, it seems that movement to the edges of phases cannot be motivated by a strict localizing of Last Resort. To the extent that we render the idea with respect to triggering a local/non-target movement consistent with a localized vision of economy (e.g., see Collins 1997), it's not obvious how to discriminate between the phase edge and any other category that may be present below the next phase-inducer. Therefore, it seems that something like M-features is required if we have reason to think that movement is to vP and not to higher elements below the next phase-inducing head.
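The timing issue behind this first conclusion can be made concrete with a small sketch. It assumes, as in the discussion of (174), that C and v are the only phase-inducing heads and that the complement domain of a phase head is spelled out only when the next phase head is introduced. The function name `spell_out_schedule` and the flat list encoding of the derivation are illustrative assumptions, not part of the proposal itself.

```python
# Assumption taken from the discussion of (174): only C and v induce phases.
PHASE_HEADS = {"C", "v"}

def spell_out_schedule(heads):
    """Merge heads bottom-up; record what (if anything) is spelled out at each step.

    The complement domain of a phase head is spelled out only when the NEXT
    phase head is introduced (hypothetical flat encoding of the derivation).
    """
    built = []          # heads merged so far, lowest first
    prev_phase = None   # index in `built` of the most recent phase head
    spelled_upto = 0    # material below this index is already spelled out
    schedule = []
    for head in heads:
        spelled = []
        if head in PHASE_HEADS and prev_phase is not None:
            # Everything below the previous phase head that is still active.
            spelled = built[spelled_upto:prev_phase]
            spelled_upto = prev_phase
        built.append(head)
        if head in PHASE_HEADS:
            prev_phase = len(built) - 1
        schedule.append((head, spelled))
    return schedule

# The spine of (174), with the wh-object at the bottom.
for head, spelled in spell_out_schedule(["D_wh", "V", "v", "T", "C"]):
    print(f"merge {head:5} -> spell out {spelled}")
```

On this encoding nothing is spelled out at the step where v is merged; the wh-object is first threatened only when C arrives, by which point T is already present. This is why a strictly local Last Resort does not single out the vP edge over a T-adjoined position.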
Second, it's not clear how the idea of moving to a "categorially likely checking configuration" could drive wh-movement (or NP-movement) to anything but intermediate C (or T for NP-movement), at least not on standard assumptions about the structure of verbal extended projections (e.g., assuming roughly C-T-v-V). 90 We saw in addition in our discussion in this chapter that TAG approaches (at least on the general approach advocated in Frank 2002) also require reference to an EPP-type of property to motivate initial movements within elementary trees. Problems were seen to arise for the view of adjoining auxiliary structures at X'-nodes for SCM of the wh and NP varieties if it was assumed that the initial movement involves what we have called Core Licensing Properties (CLPs; i.e., if the initial movement is the ultimately licensed one). Instead, EPP-properties are understood to drive local movements, and the licensing of the sort involving CLPs is regulated by cross-tree feature-checking built in to the adjoining mechanism.

90 However, as we will see in the next chapter, it has been argued (see Butler 2004, Belletti 20?, Pesetsky & Torrego 2004) that there may indeed be such elements below v but above V. If this is correct, then we could salvage the combination of expelling viral elements and moving to a likely checking configuration (this requires a version of the "split-VP" hypothesis; Koizumi 1993, 1995, Lasnik 1999a, Johnson 1991, Runner 1995). So, if verbal extended projections involve internal "mini-clauses" (C-T-v-C-T-V, or perhaps some equivalent), then we may have a road in to a uniform story about cyclicity stateable in these terms.

In the next chapter, I turn to the task of further developing the assumptions and mechanics of TCG and the WS/O-distinction on which it is based. The main argument is that TCG makes available an account of SCM that dispenses with the need to appeal to M-features.
We will provide some fairly coarse-grained examinations of some of the SCM phenomena sketched in this chapter, providing enough of a story to show how the mechanics could be deployed in more detailed investigations in each domain.

CHAPTER 3: TCG Analysis

The structure of this chapter is as follows. We here deploy and further develop the TCG-approach with reference to some general sets of facts, concentrating mostly on points bearing on the architecture under discussion, rather than on analyses pursued to any great depth. We begin with a discussion of local relations which brings up some problems that went undiscussed in Chapter 1. This leads then to two discussions regarding SCM: one pertaining to A-movement, and one to A'-movement (concentrating on wh-relations). We then speculate on a general extension of the logic deployed for SCM cases to other cases within local structures, opening the suggestion to pursue so-called "split-VP" or "stacking"-style analyses to a perhaps extreme point of dividing individual thematic elements into their own little "mini-clauses", forming the clause-internal equivalent of traditional clause divisions. This is then shown to be relevant for cases left out by the traditional view of the clause assumed in the SCM discussion, that is, cases suggesting that at least wh-movement involves relations beyond clause edges, including as well some internal phrases. Then we stop, and move to concluding remarks, a discussion of some further architectural issues and open questions/problems.

3.1. Local Relations: Part One

Consider the following simple case with an unaccusative verb in (176):

(176) A man arrived

A relatively standard view of the structure of (176) would look as follows (ignoring the C-domain):

(177) TP DP T' ?/? a D N man T 0 VP arrive V 0 t DP ?

Deploying the reduced structures and derivations discussed in Chapter One we would have (ignoring the internal assembly of the nominal): 91

(178) a. T ?:n ?:? b. T ?:n ?:? c. T ?:n ?:? d.
T ?:n?? ?:??:f e. T ?:f D ?:? ?:f D ?:? ?:f D ?:???:n ?:f D ?:n ?:f f. T ?:f V ? g. T ?:f V ?[?:f] D ?:n ?:f D ?:n ?:f

The D and T elements associate and swap ? and ? as outlined earlier (§1.2), pictured in step (178)c. This process results in the deletion of ? on T: the outcome is a "discharge" of the ?-property of T, and a ?-relation between T and D. The addition of the thematic (V) element (arrive) leaves us with the structure in (178)f. And, as suggested earlier, the ?-property of V takes ?:f as its value, completing the A-relation circuit (closing off the open position of the thematic predicate). In this way D is indirectly connected (via the ?/? feature-complex) to the internal role. I mentioned in Chapter One as well that we will be viewing ?-variables as inherently undistinguished open slots, so that ?/? properties are actually required to individuate them.

91 I assume that D-elements typically come without ?-specifications (i.e., with ?:?), and that this property is valued in virtue of associating with N, which comes with ?:f (valued ?).

We can expand on this point by considering a minimal addition to (176):

(179) A man arrived sad

Following the basic structure of proposals of Williams (1983, 1994) and others, I will take this to be a situation where non-distinct ? results in an identification, in virtue of which the ?-property of the unaccusative-? comes to be borne by the adjoined adjectival, as follows:

(180) T ?:f V ?[?:f] A ???[?:f] T ?:f V ?[?:f] A ?[?:f] D ?:n ?:f D ?:n ?:f

We will now see that on the rather loose view sketched in Chapter One regarding possible configurations for feature-relationships, even fairly small increases in complexity appear to present us with problems. Consider a simple transitive:

(181) He likes her

(182) a. T ?:n ?:? b. T ?:n ?:? c. T ?:n ?:? d. T ?:n?? ?:??:f e. T ?:f D ?:? ?:f D ?:? ?:f D ?:???:n ?:f D ?:n ?:f f. T ?:f ? v ?:a ?:? g. T ?:f ???[?:f] v ?:a ?:? h. T ?:f ?[?:f] v ?:a ?:? V ? D ?:n ?:f D ?:n ?:f D ?:n ?:f i.
T ?:f ?[?:f] v ?:a ?:? V ? D ?:? ?:g j. T ?:f ?[?:f] v ?:a?? ?:??:g V ? D ?:???:a ?:g D ?:n ?:f D ?:n ?:f

Assuming that v introduces both the "external"-? and accusative (?:a), I will mark this element as " ? v ?:a ?:? " to indicate that the θ-role is upwardly directed, and the ?/? properties downwards, as in (182)f. However, this talk of "upward/downward" raises the issue of whether there can be configurations where the θ-role dominates its argument in this approach. So far we have sketched a view under which A-relations, including the external-? assignment, are mediated by ?/? properties. So such mediation is suggested to be possible; but is it necessary? Consider the steps beyond the introduction of v in (182)h-j. In (182)i it is indicated that the ?/?-value swap happens independently of the θ-role taking its value. Is this the right way to think about this? What is at stake? For one thing, since we have stated the mechanisms of feature valuation in a way that allows for probes to dominate goals or vice versa, it's not clear what prevents, e.g., the v-introduced θ-role from associating with its own ?/? properties. Or, for that matter, what prevents the nominative-T's ?-properties from valuing the internal (V-introduced) ?. It would seem, in other words, that we need to introduce some asymmetries (e.g., have v-? only look "upwards", or have V-? unable to look past the v-introduced ?/?-properties). That is, we need some notion of locality here so that we don't get he likes her meaning she likes him, and other impossibilities. Recall that we made a small fuss earlier in this discussion (Chapter One) about the possibility of redundancies between statements of locality built in to rules or imposed on their outputs (e.g., minimality-type restrictions) and approaches which offered some characterization of domains, leaving the operations otherwise unrestricted.
Here we are presented with a situation for which it isn't at all obvious that our conception of workspace restrictions (of the distinctness and ordering sorts) could be relevant. What recursive structure or repeated elements are there in such local domains for the workspace to resist via the contraction mechanics? We need a story about the possible/impossible feature-connections in local domains. Importantly, if we have to introduce local notions regarding relative or absolute "structural distance" for operations to apply, it opens the question as to whether it would be best to treat everything that way (since it is required for the most local cases). 92 I will suggest below, following some other developments, that the right position here may be to bite the bullet, and explore the possibility that there is a bit more structure in these local domains than meets the eye. The strategy here will be to work backwards from the more complicated cases (in particular linked-local relations of the SCM type) to the (superficially) simpler ones. It is in the domains where we see SCM-type effects that our suggestions regarding workspace distinctness find their plausibility. The idea will then be to see whether we can find motivation for extending the ideas to seek out possible divisions within local structures that allow us to carry over the central ideas about workspace-demarcated domains. So let us develop the analyses in more detail in these domains, and then return to the local considerations.

92 See Frank (2002) for some similar discussions regarding the possible need for locality of movement restrictions within elementary trees, and the redundancies such views would pose for the TAG architecture; basically the same issues arise there as here, as one would expect.

3.2. Linked-Local Relations I: Raising to Subject (RtS)

Consider again the following standard case of cyclic A-movement in raising-to-subject (RtS):

(183) John [seems [ _ to tend [ _ to appear [ _ to like carrots]]]]
First, the subject position of raising verbs is standardly thought to be athematic, as the presence of expletive elements and idiom chunks has traditionally been taken to suggest:

(184) a. There seems/is-likely/appears to be a man here
b. The shit seems/is-likely/appears to have hit the fan

I will assume then, as in our Chapter One sketch, that these predicates do not involve a small-v. I also adopt here the assumption that these infinitival complements are T's, and not C's.

3.2.1. Some Raising Impossibilities & Expletive/Associate Relations

Consider raising from a finite clause (185) and the ill-formedness of "superraising" in (186), where the subject moves past a position where it could have landed (were the position not otherwise filled):

(185) *John 1 seems [ CP that _ is here]
(186) a. It seems John was told [ _ to arrive on time]
b. *John seems it was told [ _ to arrive on time]

The properties of these cases might be taken to follow from our notion of workspace distinctness, repeated here:

(187) WORKSPACE DISTINCTNESS (ANTI-RECURSION): The workspace does not tolerate the presence of multiple tokens of type X

We provided in our earlier sketch of SCM a general story in terms of (187) plus the requirement on workspace ordering that highlighted one kind of response that the system might make to potential distinctness violations. The outcome was a process of node-identification plus a contraction of the active workspace (to avoid ordering conflicts). Let us now consider a different situation:

(188) John believes [that the earth is flat]

Here we will have, upon encountering the edge of the embedded clause (remember: derivations go top-down!), a potential distinctness violation between matrix and embedded C. At least two different responses to this situation are possible:

(189) C ? T ? v ? V ? C C ? T ? v ? V ? C D D (190) C ? T ? v ? V ? C C ? T ? v ? V ?
C D D

The response in (189) would be a conservative one in which the workspace would simply shift ("move on") as we sketched in our introductory discussion. This would obey the restriction on distinctness, as the higher instance of C would simply be abandoned to the output. The alternative is a more radical shift, essentially beginning an entirely new domain, as in (190). The more radical response would provide us with an explanation for the impossibility of (185) and (186)b. In both cases we would have derivations which expanded top-down as follows in (191), up to the boundary of the embedded C-domains. If the more radical contraction occurs, then the subject would be stranded outside the workspace without having had its associated ?/? properties connected with ?:

(191) C ? T ? V ? C C ? T ? V ? C D D

Thus both raising from a finite clause and raising past a possible landing site would be subsumed under the same explanation. The result is a non-?-marked subject in both cases:

(192) a. *[ CP John seems [ CP that ...
b. *[ CP John seems [ CP C 0 [ TP it ...

This would thus be a different instance of the sort of schema offered in Chapter 1, repeated here, regarding an unlicensed property being abandoned from the workspace to the output structure:

(193) A ? B ? C ? A ? ?:A ?:A *F:? ? B ? C ? A

However, it is important to note that on the view of ?/?-? connections being explored here, the violation is perhaps best viewed in these broader terms (as a faulty A-relation) rather than just saying that an NP has not received a role. That is, there will be more than one way that an A-relation can be faulty, but the general story about feature relations on the dominance path will be seen to connect them all into the more general class.
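The two responses to a potential distinctness violation can be sketched procedurally. This is a minimal illustration under stated assumptions: categories are encoded as plain strings in top-down order, and the `pending` set (hypothetical) marks categories that host an element whose properties have not yet been connected, such as the T hosting the raised subject in (191). The function name `expand` and the return format are likewise illustrative only.

```python
def expand(categories, response, pending=frozenset()):
    """Expand a workspace top-down under WORKSPACE DISTINCTNESS (187).

    response: "conservative" abandons only the earlier token of the repeated
              type to the output; "radical" abandons the whole workspace and
              begins a new domain.
    pending:  hypothetical set of categories hosting a still-unlicensed element.
    Returns (workspace, output, stranded).
    """
    workspace, output, stranded = [], [], []
    for cat in categories:
        if cat in workspace:                  # potential distinctness violation
            if response == "conservative":
                abandoned = [cat]
                workspace.remove(cat)         # drop the higher token only
            else:
                abandoned = workspace
                workspace = []                # start an entirely new domain
            output.extend(abandoned)
            # Anything abandoned while still unlicensed is stranded.
            stranded.extend(c for c in abandoned if c in pending)
        workspace.append(cat)
    return workspace, output, stranded

# (191): raising predicate V, embedded finite C, subject hosted by T.
print(expand(["C", "T", "V", "C"], "conservative", pending={"T"}))
print(expand(["C", "T", "V", "C"], "radical", pending={"T"}))
```

Only the radical response removes T, and with it the subject, from the workspace before its properties are connected; that is the stranding that the text appeals to for (185) and (186)b.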
That is, while it might be true that the violations above are on a par with (194)

(194) *John seems

we clearly do not want to say the same for the otherwise superficially similar violation in (195) (since there is athematic):

(195) *There seems that a man is in the room

Given the top-down nature of structure expansion in the present system, the fairly strong intuition of the ill-formedness of (195) at the following point in a left-to-right pass can perhaps be taken to be relevant in this connection:

(196) *There seems that ... (vs. ? It seems that ...)

At the point where the finite embedded clause is encountered, the judgments are fairly uniform across (185), (186)b, and (196). But (196) involves athematic there, suggesting that the detection of anomaly at the clause border in (185) and (186)b, which has a rather similar profile, ought not to be conceived of in exclusively θ-theoretic terms. Consider also:

(197) a. *What seemed that Mary liked _ ?
b. *What seemed that Mary liked pizza?

Now, both (197)a and (197)b are clearly out, but they seem to have the same profile despite being rather different sorts of violations on standard views. In particular, (197)a would presumably be a Case-theoretic violation, since what occupies two Case positions, whereas (197)b would involve full satisfaction of ?/?, but the wh-element would receive no θ-role. Recall our schema for a local A-relation (e.g., a ?-marked "subject" indexed to the external/v-introduced ?
by the ?-properties):

(198) C?T ?:f ?v ?[?:f] D ?:f ?:n

What the ill-formed examples involving non-expletives discussed above have in common is the following stage of derivation (just prior to contraction), where V is a raising predicate (an athematic element compared to (198)):

(199) C?T ?:f ?V?C D ?:f ?:n

If (C, C) satisfy the matching relation, and result in a contraction, then it is true that D ?:f ?:n in (199) will not be ?-associated, but that does not explain the otherwise very similar feel of the violation involving there in (195), which on most views of these elements is athematic (though see Moro 1997 for a view in which there, if not "thematic", is at least viewed as an abstract sort of predicate). Bošković (1997, 2002) discusses what he calls (following a suggestion of Howard Lasnik) the Inverse Case Filter. The traditional Case filter was stated in terms like the following:

(200) Extended Case Filter: *[ NP α] if α has no Case and α contains a phonetic matrix or is a variable (Chomsky 1981:175)

Case-theoretic violations on this view are pinned on a failure to meet a requirement of NPs. However, it is not unreasonable to suggest that Case-theoretic violations might be (or might also be) a matter of Case-assigners needing to "discharge" their ?-property. This is the idea of the "Inverse" Case Filter. Bošković discusses examples such as the following:

(201) a. * is likely John will leave
b. *John believes to seem Peter is ill

Both of these kinds of violations have been discussed in terms of the "EPP", currently implemented in feature-licensing terms within minimalism. However, Bošković points out that the cases in (201) can be explained in terms of failure to discharge/assign Case (nominative in (201)a and 'exceptional' accusative in (201)b). Can we make use of this general idea in handling the crucial facts regarding expletive-there in a way that connects them to what otherwise (on standard views) appear to be θ-theoretic violations?
We could assume (with Chomsky and others) that there is minimally specified for agreement, but cannot check case. Then we would understand the ill-formed there...that case above as a matter of the Inverse Case Filter. However, another view is possible, which I believe can play a role in determining the distribution of there-type expletives (at least in English). Suppose that matching of a feature, any feature, is sufficient to license combination, so that X?Y can be established as a dominance link if they bear the same feature F. Suppose that expletive there, unlike regular "thematic" nominal expressions, bears just unvalued ?, and no ?-specification whatsoever. Then we would have the following:

(202) C?T ?:? ?:n ?V?C there D ?:?

Now two options present themselves: either ?-properties can enter into licensing/valuation independently of ?, or (as Chomsky (1998, 1999) suggests) ?-licensing is parasitic in some sense on ?-relationships. Suppose that ?-valuation/licensing can happen independently. Then we have:

(203) C?T ?:? ?:n? ?V?C C?T ?:? ?V?C there D ?:???:n there D ?:n

This view could help us with the distribution of expletive-there as follows. Suppose in the well-formed local A-relation, contrary to what we have suggested so far, that thematic elements (v/V) come to the derivation with an unvalued ?-feature as their argument. I suggested earlier that we might in general view ?/?-properties as individuating the variable positions of thematic predicates, and two ways of thinking about this were offered: (i) the open positions are undifferentiated "slots", and (ii) one open position is indistinguishable from another because they all "start" with a general default value. Suppose then that they start as ?[?:?]. An expletive element in the subject position of a transitive verb in English then will encounter the following stage of derivation if the sketch in (203) is correct:

(204) C?T ?:? ?v ?[?:?] there D ?:n (e.g., *there hit ...)
Assuming that subsumption (see (31), §1.2) is required to hold for valuation, ?:?-?:? will make ?-discharge as we have envisioned it impossible. The prediction is that expletive-constructions must require some other way to get T-? valued. How could that happen? It must be, according to this view, that T-? relates directly to a subordinate nominal element, something that is independently valued for ?. And that nominal must be in a configuration that is somehow appropriately thematic, but without involving ?-assignment. Moreover, there cannot be any intervening ?[?:?], as this could be seen to block the relevant relationship between T-? and some lower nominal "associate". This picture seems roughly correct. Expletives generally appear in ?-positions that are not immediately/locally associated with ?, as in raising, copular constructions, and unaccusatives. They can appear as well in ECM environments in English, so long as the condition on there being no intervening ?-element is met, e.g.:

(205) a. I believe there to be a man in the room
b. I believe there to have arrived a man
c. I believe there to appear to be a man in the room
d. *I believe there to have a man left
e. *I believe there to be an idiot (vs. I believe John to be an idiot)

Let us consider a counterpart to (205) to take a look at how these relations are established. The idea then is that there will serve to license matrix ?, but not ?. The result is that ? must be valued in another way, and it cannot connect directly with a ?-element (because the ?-argument of ? will be ?:? as well, and so the subsumption condition on valuation will not be met). Consider then the classic case:

(206) There seems to be a man in the room

We begin then with a local structure as discussed above:

(207) a. C?T ?:? ?V b. C?T ?:? ?V?T there D ?:n there D ?:n

The defective T complement to the raising predicate V is introduced, resulting in contraction, leaving us with the following workspace (ignoring the output structure here):

(208) a. C?T ?:?
b. C?T ?:? ?V be ?D ?:? ?: there D ?:n there D ?:n ?-agree
c. C?T ?:? ?V be ?D ?:? ?: ?N ?:f ?:?
d. C?T ?:???:f ?V be ?D ?:???:f ?: ?N ?:f ?:? there D ?:n there D ?:n

In this situation we could understand the nominal element introduced (man in a man) to value the ?-properties that are unvalued prior to (208)d, but then what of the ?-properties of the nominal associate? Given our adoption of a single dominance order, there bears no direct relationship to the associate. We can solve this problem by taking the other option suggested above (following Chomsky) and suggesting that ? and ? valuation are linked. We keep the idea that there has a lone unvalued ?-property. This will explain why local ?/?-valuation cannot happen. It will leave us with an initial stage of derivation more like this:

(209) C?T ?:? ?:n ?V there D ?:?

Now a different question arises, namely: when a ?-element is introduced, why can't the valued ?-property of T "fill in" the value of ?, as we suggested for wh-questions? I will return to this in a moment. First, consider how this will implement a familiar transmission-type story regarding expletive-associate relationships. If T and expletive there are able to relate via matching (T?-D?), but with valuation of the properties impossible because there bears no ?-property, then the final stage of derivation above in (208)d would look as follows in (208)d', with ?-valuation then happening parasitically on successful ?-agreement as in (208)e:

? ?
(208) d'. C?T ?:???:f ?:n ?V be ?D ?:???:f ?: ?N ?:f ?:? there D ?:?
e. C?T ?:f ?:n?? ?V be ?D ?:???:f ?: ?:n ?N ?:f ?:???:n ? ? there D ?:???:n

Thus, T-? values both the expletive and the associate, deleting as in the regular case. We have then a mechanism whereby expletive there is a ?-marked element, but so is the associate.
We thus agree with the assessment of Lasnik (1995) that the expletive checks the matrix Case property, but we do not posit an additional partitive case (as in both Lasnik 1995 and Belletti 1988) that is responsible for licensing the associate. Rather, the associate will be able to enter, in virtue of its ?-properties, into 'caseless' types of predication, as in small clause structures for example:[93]

(210) a. I consider [John a genius]
b. I consider [John intelligent]
c. ...[a man] [in the room]

[Footnote 93] I mean by 'caseless' here just that case-assignment is not part and parcel of the local relation, in the sense that Case must be assigned from outside the basic predication.

For the present point regarding expletive/associate relationships, I will assume that the structure of the relationship between [a man] and [in the room] comes out as follows for the continuation of our (208) derivation in (208)f:

(208) f. in P ?[?:?] C?T ?:f ?V be ?D ?:f ?:n ?N ?:f ?:n there D ?:n
g. in P ?[?:?]??[?:f] C?T ?:f ?V be ?D ?:f ?:n ?N ?:f ?:n there D ?:n

The D-element is ?-marked, and so it may enter into a relation with an element which does not itself carry a potentially interfering ?-property. The assumption then is that there is a class of ?-like relationships which are "direct" in one sense (they involve ?-? connections, i.e., valuation, that are not accompanied by ?-valuation) but "indirect" in another sense: in particular, any nominal element entering into such relationships will have to have found its ?-properties licensed in some independent ?/? complex. I will not, in the present work, get into the issues regarding definiteness effects and the like, though the suggestion here would be that direct ?-? relationships that are reliant on some other instance of ?-assignment are at least one kind of DE environment.
The present outline seems compatible with one or another version of Diesing-style mapping, but I won't pursue this here (see, e.g., Diesing (1992); see Hornstein (1995) for some relevant discussion, and Safir (1987) for what strikes me as a related conception involving "transmission" and the notion of "unbalanced" chains). So, let us now consider our ill-formed cases from above together:

(211) *John_1 seems [CP that _ is here]
(212) a. It seems John was told [ _ to arrive on time]
b. *John seems it was told [ _ to arrive on time]
(213) *There seems that a man is in the room

These would thus correspond to the following two scenarios. Both (211) and (212) manifest a reasonable A-relation as in (214), but one which does not connect with ?, thus leaving the matrix subject John unintegrated. In the case of (213), the subject-related T element (along with the expletive) must be spelled out with unlicensed properties:

(214) C?T ?:f ?V?C John D ?:f ?:n
(215) C?T ?:? ?:n ?V?C there D ?:?

In (215), there can enter the structure in virtue of ?-matching, but given the assumption that ?-valuation is parasitic on ?-relationships, nothing further can happen at this point. Assuming that properties which are unvalued are illegible at the interface, (215) will crash as the material intervening between C?..?C is spliced out.

So we diagnose two kinds of A-relation deficiency, and have in hand a reasonable story that looks able to contribute to understanding the distribution of expletive there, based on a particular implementation of the "transmission" logic (Safir 1982, Chomsky 1986). We have also implemented a view of ?/?, with suggestive connections to GB-era views. Consider again the Extended Case Filter mentioned above:

(216) Extended Case Filter: *[NP α] if α has no Case and α contains a phonetic matrix or is a variable (Chomsky 1981:175)

On the view here this is only part of a more general conception of A-relations (and A'-relations, as we'll see below).
Why should Case be hooked up with the notion of "variables"? The idea here is that ?-properties name variables, thereby distinguishing them, and that the interconnections of ?/?-properties are what mediate the connections between thematic predicates and nominal expressions, whether operator-like/quantificational or not (as with ordinary NPs). Vermeulen (1995) (see also Visser & Vermeulen (1996)) points out that we can in general distinguish three things regarding variables: (i) the "variable itself", (ii) the name of the variable, and (iii) the value of the variable (what it ranges over). The "variable itself", I am suggesting, is the open position in ?-elements for the participants that these elements relate to eventualities. ?-properties create a path to a ?-feature, which either is bound from above (as in WH-cases) or is assigned to an overt nominal.

Before moving on, let us return to an issue we left dangling above. Recall that the possibility of having the less radical workspace contraction (189) raises again the issue of locality as potentially independent of these dynamic domains, as discussed above in connection with the derivations for simple transitive structures. I repeat the possible workspace contractions here for convenience:

(217) C ? T ? v ? V ? C C ? T ? v ? V ? C D D
(218) C ? T ? v ? V ? C C ? T ? v ? V ? C D D

It is not clear, without imposing some other locality mechanism, why the conservative response to a distinctness violation (217) wouldn't then simply allow some featural relation with the embedded T-element.[94]

[Footnote 94] Note that the kind of node-identification suggested to underlie SCM presumably couldn't apply between finite T's, since these would not manifest the subsumption relationship that we have posited as being the relevant condition under which node identification takes place.
Note that the conservative response to a potential distinctness violation would always end up with there being (globally) more instances of contraction/spell-out than the radical response. Assuming that transderivational comparison of greater versus smaller numbers of contractions is an undesirable property to have in a minimalist approach, note that there is a local way to ensure that the global number of contractions will in fact be minimized. This will always hold if local contractions are maximal, that is, if the system responds in the "radical" manner depicted above (which yields the A-movement locality facts discussed). Note also that this issue of the difference between the radical and conservative responses to distinctness violations does not arise when the relevant context nodes (e.g., C or T) are identified, as in A- or A'-type SCM (since the relevant nodes are identified, there is no way to remove "just one of them", leaving the other in the workspace; so the only possible response following identification is the radical contraction, in order to keep the ordering properties coherent, as sketched in Chapter One).

However, if we adopt a local economy view for the non-movement case (maximize contractions) to get an account of the A-movement locality facts above, we would then have two separate motivations for what otherwise seem to be rather similar sorts of processes, differing only with respect to whether the relevant nodes are identified or not prior to contraction. My suspicion is that there may be a way to derive these contractions in a unified way, but I have not yet found a satisfactory formulation to this effect. I will leave the matter open here, assuming the following:

(219) Workspace Economy: Contraction/Spell-Out is locally maximal

Note that this is actually consistent with a "least effort" line of thinking, perhaps despite appearances.
The idea would be that maintaining distinctions in the workspace is what takes effort, so whenever this burden can be eased (by spelling out) it is maximally eased.

3.2.2. Binding Interventions & SCM

Consider now some of the SCM-type effects regarding binding, in particular the cases in (136), repeated here as (220):

(220) a. John_1 seemed to himself_1 to appear to Mary to be getting fat
b. John_1 seemed to appear to himself_1 to be getting fat
c. John_1 seemed to Mary to appear to himself_1 to be getting fat
d. *Mary seemed to John_1 to appear to himself_1 to be getting fat
e. It seemed to John_1 to appear to himself_1 that he was getting fat
f. John_1 seemed to Bill_2 to appear to himself_1/*2 to be getting fat

We noted two key points about these cases in our Chapter 2 discussion. First, the impossibility of (220)d was suggested to be traceable to an intervention effect on a cyclic raising story, where we would understand Mary to occupy the embedded non-finite subject position of to appear, thus constituting a closer possible binder for the self-form in the experiencer-P. But this creates a ?-conflict, and so it is out. The second point was that, regarding (220)e, it appears that John is a suitable binder for the self-form, despite the apparent lack of a c-command relationship.

We can now expand on these observations as follows. Recall that we noted as well in our earlier discussion that (220)a and (220)b suggest that either the binding domain for the self-form includes more than one clause (perhaps specified in terms of the presence of a subject or suitably "subject-like" element, as in some approaches to binding), or some relation must be involved to bring the antecedent-dependent pair into a more local relationship. The two salient possibilities for this latter sort of solution are the kind of T-to-T-domain SCM of the subject John in these examples, or perhaps an LF-type movement of the self-form.
Note that a strict clause-mate condition on these relationships is implausible if we assume that the experiencer-Ps are not implicated in any movement between domains (i.e., assuming their positions are fixed where they surface). Consider the following additions to the examples in (220) of one more intervening raising predicate (tend, in (221)) or two more (tend and be likely, in (222)):

(221) a. John_1 seemed to himself_1 to tend to appear to Mary to be getting fat
b. John_1 seemed to tend to appear to himself_1 to be getting fat
c. John_1 seemed to Mary to tend to appear to himself_1 to be getting fat
d. *Mary seemed to John_1 to tend to appear to himself_1 to be getting fat
e. It seemed to John_1 to tend to appear to himself_1 that he was getting fat

(222) a. John_1 seemed to himself_1 to tend to be likely to appear to Mary to be getting fat
b. John_1 seemed to tend to be likely to appear to himself_1 to be getting fat
c. John_1 seemed to Mary to tend to be likely to appear to himself_1 to be getting fat
d. *Mary seemed to John_1 to tend to be likely to appear to himself_1 to be getting fat
e. It seemed to John_1 to tend to be likely to appear to himself_1 that he was getting fat

Importantly, the judgments remain the same as in (220). The crucial cases are those involving binding between two elements situated within these experiencer-Ps, that is, (221)e and (222)e. What these (e)-cases show is that binding is independently possible between the nominals in these Ps, and that the phenomenon is not sensitive to the boundaries introduced by intervening embedded non-finite clauses. If the position of these Ps is "fixed", then binding of these self-forms cannot be required to happen within a single clause. Similarly, if movement from these positions is generally not possible, the idea of having the self-form move into a more local relation with its antecedent is implausible as well.
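The intervention reasoning behind (220)-(222) can be made concrete with a small sketch. This is only a toy under stated assumptions, not the TCG mechanics themselves: the names `Nominal`, `closest_binder` and `himself_ok` are hypothetical, the raised subject is simply assumed (as on the SCM story) to have occupied the subject position of every clause in the raising sequence, and binding between experiencers across clauses is stipulated on the basis of the (e)-cases rather than derived.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Nominal:
    name: str
    gender: Optional[str]  # None marks a non-referential expletive ("it")


def closest_binder(subject: Nominal,
                   experiencers: List[Optional[Nominal]],
                   clause: int) -> Optional[Nominal]:
    """Closest potential antecedent for a self-form in the experiencer-P
    of clause number `clause` (0 = matrix clause)."""
    # Assumption (SCM): a referential raised subject has passed through the
    # subject position of every clause, so it is always the closest binder.
    if subject.gender is not None:
        return subject
    # Assumption (from the e-cases): with an expletive subject, the nearest
    # higher experiencer can bind, however many clauses intervene.
    for higher in reversed(experiencers[:clause]):
        if higher is not None:
            return higher
    return None


def himself_ok(subject, experiencers, clause, intended) -> bool:
    """True if 'himself' can take `intended` as its antecedent."""
    binder = closest_binder(subject, experiencers, clause)
    return binder == intended and binder.gender == "m"


john, bill = Nominal("John", "m"), Nominal("Bill", "m")
mary, it = Nominal("Mary", "f"), Nominal("it", None)

# None in the experiencer list marks a clause with no experiencer, or the
# slot occupied by the self-form itself.
c_case = himself_ok(john, [mary, None], 1, john)   # (220)c
d_case = himself_ok(mary, [john, None], 1, john)   # (220)d
e_case = himself_ok(it, [john, None], 1, john)     # (220)e
```

On these assumptions the (c)- and (e)-cases come out acceptable and the (d)-case does not, because the intermediate copy of Mary is the closest binder. Adding experiencer-less clauses, as in (221) and (222), does not change the outcome.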
This leaves the possibility of defining the binding domains in terms of something like the local presence of a "subject" (so if there are no local A-movements we could still have arbitrarily large binding domains in this sense). The question then is why, on such a view of binding domains, we would have the contrast between the d-/e-cases in (220)-(222). It would seem that John can be a local binder; that is what the e-cases show. Moreover, as the following examples show, many kinds of dependencies that are typically understood to require a c-command relationship appear to be licit between two such P structures. This provides further strength to the assertion that there is no complicating factor of structure in the d-cases in (220)-(222), and that these cases thus stand as a piece of evidence that appears to demand that something like successive-cyclic A-movement occurs. Consider (from Castillo, Drury, & Grohmann 1999:95):

(223) a. It seems to every boy_1 to appear to his_1 mother that the earth is flat
b. It seems to no man to appear to any woman that the earth is flat
c. *It seems to him_1 to appear to John_1 that the earth is flat
d. It seems to his_1 mother to appear to John_1 that the earth is flat
e. ?It seems to John_1 to appear to himself_1 that the earth is flat
f. It seems to John_1 to appear to him_1 that the earth is flat
g. It seems to them_1 to appear to each other_1 that the earth is flat

Thus, variable binding of a pronoun by a quantifier ((223)a), negative polarity licensing ((223)b), and Condition C violations ((223)c), as well as their expected absence in (223)d, all point to the generalization that these experiencer elements can bind (etc.) out of their Ps. Curiously, there is an unexpected absence of strong complementarity between Conditions A and B, as the acceptable judgment for (223)f shows in comparison to (223)e. That is, (223)f is not degraded with the indicated coreference, as is the following case in (224)a with respect to the well-formed (224)b:

(224) a.
* John_1 is believed to seem to him_1 to be a genius
b. ? John_1 is believed to seem to himself_1 to be a genius

Some speakers in fact detect a slight advantage in acceptability for the pronoun case versus the self-form in cases of binding between elements in experiencer-Ps, finding (223)f slightly better than (223)e. But the rest of the judgments are stable, including the possibility of reciprocal binding as in (223)g. The presence of the strong contrast in (224)a/b and its absence in (223)e/f led Castillo, Drury, & Grohmann (1999) to suggest that the self-form in these cases is actually a logophor (Reinhart & Reuland 1993, Sells 1987), and that, if true, this fact would undermine the argument for successive-cyclicity of A-movement based on the intervention effect in (223)d (due initially to Danny Fox, as pointed out by David Pesetsky (p.c.)). The argument is essentially this: since we do not know what governs the distribution of logophors, it is unclear that we need to posit an intermediate copy/trace of A-movement to explain (220)d, repeated here:

(220) d. *Mary seemed to John_1 to appear to himself_1 to be getting fat

What should account for speaker judgments regarding (220)d should thus be some to-be-specified story about logophoricity.

But this argument from Castillo et al. does not go through: the case for successive A-movement made by such observations, I think, still stands. While it is true that the non-complementary distribution of pronouns and reflexives in these experiencer-P environments suggests that something like logophoricity is in play here (e.g., as in "picture-NPs", see below), this observation says nothing about what still appears to be an intervention-type effect in the contrast between (220)c and (220)d above.
That is, whatever logophoricity ultimately amounts to (connected to various matters concerning "point-of-view", "psychological state", and similar notions; see below),[95] it still seems to be sensitive in some manner to structural factors in the determination of impossible antecedence relationships.[96] The interest of logophors stems from their extra distributional freedoms in comparison to ordinary self-anaphors. For example, in comparison with the complementarity of the straightforward local cases of pronouns/reflexives (as in (225)a vs. b), we find the lack of a strong contrast for the a/b cases in (226) and (227) surprising, as we do with the antecedence between the P-embedded experiencer elements in (223)e vs. (223)f (repeated here as (228)a vs. (228)b):

(225) a. * John_1 liked him_1
b. ? John_1 liked himself_1
(226) a. ? John_1 liked the pictures of him_1
b. ? John_1 liked the pictures of himself_1
(227) a. ? John_1 thought pictures of him_1 were on display
b. ? John_1 thought pictures of himself_1 were on display

[Footnote 95] Note that Boeckx (2001) contains an interesting discussion drawing on Rooryck (19?), who argues for a view of certain raising predicates (e.g., seem, appear) connecting to verbs of comparison, and thus indirectly to concerns relating to "point-of-view". For discussions of the notion of logophoricity, see Clements (1975), Sells (1987), Reinhart & Reuland (1993), Williams (1994), Reuland & Everaert (2001). See below for some further discussion of logophoricity and why it probably isn't in play in the present case.

[Footnote 96] Castillo et al. do note various "structural" factors that seem to be involved in restricting logophor interpretation, suggesting a preference hierarchy for c-commanding vs. m-commanding vs. merely "previously established in the discourse" elements as potential antecedents, but shy away from the problematic conclusions that I reach here regarding the argument these cases still present for cyclic A-movement.

(228) a.
?It seems to John_1 to appear to himself_1 that the earth is flat
b. It seems to John_1 to appear to him_1 that the earth is flat

But all this is to notice an extra dimension to the distribution of self-elements: something about these contexts allows additional possibilities where induction from the basis of the strictly local cases suggests they ought to be out. Some speakers, as mentioned above, find there to be a slight advantage in acceptability for the pronoun in (228)b as compared to the self-form in (228)a. However, this difference is rather like the contrast between (229)b and (229)c, where there is usually a slight favoring of the pronoun over the self-form.

(229) a. *Mary sold the pictures of himself
b. ?John_1 thought Mary sold the pictures of himself_1
c. John_1 thought Mary sold the pictures of him_1

But the crucial Fox-cases have an unacceptability status more in line with (229)a. The conclusion is that negative restrictions on these self-elements are clearly in force, whatever their extra distributional freedoms. An explanation of this fact is straightforward on the SCM view of raising. The conclusion can be avoided only if a story about the distribution of logophoric elements could be produced which would rule out, by other independently motivated means, the cases that can otherwise be handled as a straightforward intervention fact under the cyclic movement analysis.

But what is the domain for the binding theory on our restricted workspace view? It seems clear that binding of these self-elements is not something which occurs within the boundaries of the workspace as we have set things out here. Setting aside the picture-NP situation (I won't be discussing the status of recursion in NPs here), however we view the integration of the experiencer-Ps with respect to raising predicates, the mechanics of contraction will always set them off into separate workspaces. Consider:

(230) C ? T ? V ? T ? V ?..
D P?D D P?D

Assuming that these Ps are somehow V-associated canonically, T-T identification and contraction would result in the structures being in separate domains. I do not have anything to say about how it is that elements (e.g., in (230)) can relate to each other from within their containing Ps, either in the binding of the self-form (228)a or in any of the other relations that seem to be possible between these positions (223). Whatever factors underlie the possibility of such relationships, however, what seems clear is that the possibility that the SCM analysis makes available (that of positing an intervener) directly explains the sharp anomaly of (220)d.

It is also clear that the general patterning of the availability of these self-forms includes positions that we will certainly want to say are in distinct phases of derivation (separate workspaces), like the binding of these forms by a matrix element when they are in embedded subject positions as in (227)b. Recall from our discussion in Chapter One the suggestion that we think of the WS/O-distinction as essentially drawing the interface line, such that output would be conceived as a syntactic structure populated only by PF- and LF-relevant properties. On the local view of domains being entertained here, the distribution of logophors must be a matter of relationships established over the output structure. Thus I tentatively suggest here that we regard the logophoric self-forms under discussion as distinct from local reflexives in these terms. But note that our view of the output involves a general maintenance of the ordering properties created by construction in the workspace, so we still have "structural" distinctions that can be made over the output. Let us assume then that the connection between a logophor and an antecedent element is captured over the output structure with Higginbotham's (1985) linking mechanism, though we will take this linking to be contingent on matching ?-properties of the elements.
This will be understood to be different from the matching and valuation that we have so far discussed. I will in fact suggest below that local reflexivization is a process involving workspace-local valuation of ?; logophors, however, will be understood to be independently ?-specified, and either they match up with their independently ?-specified antecedents, or not, under linking. That is:

(231) a. Local Reflexives: ?:f?? .. ?:? ? ? MATCHING/VALUATION
b. Logophoric -self: ?:f?? .. ?:f?? LINKING

This requires that we view ?-properties as "there" in the output structure, and not just in the narrow syntax (workspace). But we have been presupposing this in the general outlook on these properties anyway, in terms of their assumed role in mediating ?-discharge.

The linking relation, following Higginbotham, runs from a referentially dependent element to a referential one. Higginbotham's view assumed that such links get created in two ways: as a reflex of movement, and independent of movement (e.g., a variable bound by a quantifier would enter into such a linking relationship, though no one, as far as I am aware, has ever suggested that quantifier/pronoun relationships are of the movement sort).

So what are the structural conditions on logophoric linking? One general answer that has been offered in the literature is that, essentially, there are no such conditions. This was the basis for Castillo et al.'s (1999) rejection of the alleged binding intervention case in (220)d as an argument for SCM in raising. Rather, conditions on logophoricity are understood to rely on things like the following (see Sells 1987:445, and Williams 1994:86):

(232) Logophors connect with a logophoric center, which is an NP that must be a "thinker" or "perceiver" meeting one of a-c:
a. The referent of the NP is "the source of the report"
b. The referent of the NP is "the person with respect to whose consciousness the report is made"
c.
The referent of the NP is "the person from whose point of view the report is made"

Note that no mention of structure is made. In fact, self-forms of this kind can appear without any structurally present antecedent at all:

(233) As for myself, Paris is great this time of year (i.e., "as for me/my-point-of-view, (I think),..X")

Let us consider (220)c-e again, to be sure that these notions regarding logophoricity might not, after all, be put to work in explaining the central cases that I am taking to provide evidence for SCM in raising:

(220) c. John_1 seemed to Mary to appear to himself_1 to be getting fat
d. *Mary seemed to John_1 to appear to himself_1 to be getting fat
e. It seemed to John_1 to appear to himself_1 that he was getting fat

It is not at all clear that these notions make the right predictions. In (220)d, for example, the seeming and appearing are both to John. So it would seem that if there is a candidate logophoric center for (220)d, it is John and not Mary (i.e., Mary is the one "seeming"/"appearing" to be such-and-such, not the one that such-and-such is "seeming/appearing-to"). The point-of-view criteria should pick out the experiencer, not the matrix subject, as the logophoric center.

But this then suggests that there are, after all, some structural conditions or other on these elements, in the sense suggested above: logophors are indeed keyed to extra-syntactic factors that influence where they may find their antecedents (including implicit arguments, or merely presupposed entities in the discourse), but in the right local environments with a referential NP, their connection to that NP appears to be mandatory. If this is right, then we really do have a good argument for SCM in raising. I will return to these issues below, as they bear on the issue of similar arguments as they arise in wh-movement.
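The contrast drawn in (231) between workspace-local valuation and output-level linking can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `value` and `link` are hypothetical, feature values are plain strings such as "3sg.m", and an unvalued feature is modeled as `None`. The sketch also assumes, in line with the subsumption condition mentioned earlier, that an unvalued feature cannot be valued by another unvalued one.

```python
from typing import Optional

UNVALUED = None  # stands in for an unvalued feature slot


def value(source: Optional[str], target: Optional[str]) -> str:
    """Local reflexives, (231)a: matching plus valuation.

    The target's unvalued feature is filled in from a valued source.
    Two unvalued features cannot value one another.
    """
    if source is UNVALUED or target is not UNVALUED:
        raise ValueError("valuation needs a valued source and an unvalued target")
    return source


def link(antecedent: Optional[str], logophor: Optional[str]) -> bool:
    """Logophoric -self, (231)b: linking over the output.

    Both elements are independently specified; linking succeeds only if
    the two specifications match, and nothing is filled in.
    """
    return antecedent is not UNVALUED and antecedent == logophor


reflexive_feature = value("3sg.m", UNVALUED)   # the reflexive inherits its value
linked_same = link("3sg.m", "3sg.m")           # matching specifications
linked_clash = link("3sg.f", "3sg.m")          # mismatching specifications
```

The design point is that `value` changes the dependent element while `link` only checks compatibility, which is one way of reading the claim that logophors are independently specified and "either match up with their antecedents, or not".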
What we have seen then is that (i) binding domains in terms of an accessible subject or the like don't seem to work properly for explaining this particular range of facts, (ii) movement of the self-form is also somewhat implausible, since the elements within dative-experiencers typically cannot undergo movement, and (iii) a possible explanation in terms of logophoricity doesn't seem to be able to make the right distinctions either. In addition, although we didn't make a big fuss about it above, the following cases involving reciprocals don't obviously fall into the class of possible logophors, but nonetheless show all the same effects as the self-forms:

(234) a. The boys_1 seemed to Mary to appear to each other_1 to be getting fat
d. *Mary seemed to the boys_1 to appear to each other_1 to be getting fat
e. It seemed to boys_1 to appear to each other_1 that they were getting fat

So in the absence of some other story to explain these facts, we need SCM.

The question now is what motivates movement to the intermediate positions. I argued in Chapter 2 that we should be suspicious of an "M-feature" solution (e.g., so-called intermediate EPP-features postulated at the edges of embedded non-finite clauses). Other approaches that can handle these facts do so by brute-force stipulation that A-movement moves T-to-T (e.g., Bošković 2002, Grohmann 2003). However, we have in our development of the SCM mechanics in TCG an account which relies on independently required notions of (i) formal ordering properties and (ii) a system of types. With these ingredients we stated our conditions on the workspace, and these can be understood to drive the intermediate movements without appeal to M-features. Recall from Chapter One the schema in (29) for SCM in raising environments (I repeat the relevant portion of the derivation here):

(29) e. C?T ?:f ?V?T D ?:f ?:n
(29) f.
C?T ?:f ?V?T ?:f C?T ?:f ?V?T ?:f D ?:f ?:n D ?:f ?:n D ?:f ?:n D ?:f ?:n

And recall that this contracted workspace on the right-hand side of (29)f is really just:

(29) f'. C?T ?:f D ?:f ?:n

So without M-features we are able to derive the binding patterns above. However, it also seems that views which posit movement to the edge of every phrase would do just as well with these facts (e.g., see Takahashi 1994, and a recent revival of Takahashi's view in Boeckx 2003). Such super-cyclic views would also seem to do quite well regarding the distribution of "floated" all seen in English RtS:

(235) a. The men all seemed to appear to be likely to leave
b. The men seemed all to appear to be likely to leave
c. The men seemed to all appear to be likely to leave
d. The men seemed to appear all to be likely to leave
e. The men seemed to appear to all be likely to leave
f. The men seemed to appear to be all likely to leave
g. The men seemed to appear to be likely all to leave
h. The men seemed to appear to be likely to all leave

Moreover, these facts would perhaps be puzzling on the TCG view offered here, since SCM (the "lowering" effected by node-identification) is predicted to involve only the equivalent of Spec-TP positions. Therefore (235)c, e, f, and h are all problematic.

I am not going to pursue these matters here. I don't fully understand at present the wider array of facts; a thorough recent review of the relevant theoretical and empirical issues surrounding such "floated" elements (Bobaljik 2002) urges a kind of caution that time and space limitations do not allow me to respect here. I will note only that there is reason to doubt a "stranding" analysis in general, and that at present it seems to me that base-generation analyses have the best empirical coverage, so it is not entirely clear that these facts bear directly on SCM. Consider a few examples that bring up the kind of problems that arise (the following are drawn from Bobaljik 1995).
Note that the stranding analysis seems to presuppose that elements like all make a well-formed unit with the DP that can strand them. But this isn't general. Consider:

(236) a. Some of the students might all have left
b. *All (of) some of the students might have left

Although (236)a seems fine, the subject cannot surface with all as a unit, as the ill-formedness of the b-case shows. Other classic cases that have been brought up as a challenge to the stranding analysis, and that are relevant to our A-movement discussion, are unaccusatives and passives:

(237) a. *The men have arrived all
b. The men have all arrived
(238) a. * The men were kissed all
b. The men were all kissed

It is quite unclear why these positions should be out on the standard A-movement plus stranding idea. Again, I will leave these issues to the side, but note that the matter is an important one: should it turn out that the stranding-style analysis is independently demanded, this would be inconsistent with the general intuition underlying TCG. In any case, I will leave the matter open here for further investigation, noting that these general types of facts could constitute a crucial set of cases that could strongly call into question the basic ideas proposed here.

3.2.3. Variable Binding/Condition C Interaction

Consider another case discussed in Chapter 2 (see the discussion there for references), showing an interaction between variable binding by a quantificational element and Condition C. This will lead us into our discussion of wh-movement below, as well as raising some general issues that I will leave open here.

(239) a. *[His_1 mother's_2 bread] seems to her_2 _ to be known by every man_1 to be _ the best there is
b. [His_1 mother's_2 bread] seems to every man_1 _ to be known by her_2 to be _ the best there is

In the a-case, in order for his in the subject to be bound by every man, it must be interpreted in the more embedded position,
but there it gives rise to a Condition C effect, so the reading on the provided coindexing for the a-case is impossible. However, if we switch around the QP and the pronoun, as in the b-case, the bound reading becomes possible, but this only makes sense if the interpreted position is below every man but above her. The SCM type of analysis that our framework makes available can account for this pattern as well, though note that it requires that the "A-moved" expression [his mother's bread] must reside for interpretation in a non-thematic position. As we noted in Chapter 2 (pointed out in Bošković 2002), the a-case on the bound variable reading is perfectly acceptable so long as we have disjoint reference between her and his mother. The combination of these observations suggests that intermediate reconstruction is possible, but not necessary. It moreover suggests that the output structure handled by the interpretative systems must be coherent (see Bobaljik 2002, Hornstein 2000 on this sense of LF coherence) in the sense that the moved phrasal complex must be interpreted as "in" one or the other position, but not both (as this would cause conflicts that would presumably correspond to unacceptability). However, it seems that how to understand this isn't entirely straightforward on the view we have been entertaining, nor is it straightforward on standard views. The question is why/how/when it should be possible to have a nominal in such raising environments be forced to be "interpreted" in an intermediate position.

Note that both variable binding and Condition C effects cannot, in general, be workspace-mediated relations on the TCG view. Like the cases of logophors discussed above, these are potentially long-distance relationships (actually "long-distance", not superficially so as in linked-local/SCM cases). However, unlike the logophor case, there appear to be structural factors involved:
something in the vicinity of c-command is required, whereas this isn't an absolute condition on logophors. We can now bring up the "re-spell-out" mechanism discussed in Chapter One in connection with a concrete case, in particular the combination of our anti-recursion with workspace connectedness (the latter is repeated here; see §1.4.1):

(240) Workspace Connectedness (DOMINANCE): The elements in a given syntactic workspace must manifest a connected dominance order (for every x, y in the set, either x dominates y or y dominates x)

Insisting that the workspace always maintain a fully connected dominance order yields the need to spell out (void from the workspace) every time branching occurs. For the raising case, this means that the subject, which in our top-down view begins the derivation in its putative surface position, must associate to T and then spell out when V is introduced. However, in virtue of (i) the feature-connection between the subject and matrix-T, and (ii) the introduction and identification of the embedded (defective) non-finite T in multiple raising constructions, the subject can, and in fact must, "re-enter" the workspace. Consider:

(241) C?T ?:f C?T ?:f ?V ~DOM(D,V) & ~DOM(V,D) D ?:f ?:n D ?:f ?:n
(242) C?T ?:f ?V ? ? (CONTRACTION/SPELL-OUT) D ?:f ?:n

(241) & (242) show the workspace both prior to and following the addition of V. By hypothesis the D-T relation has occurred, but the addition of V violates connectedness, since D and V enter into no ordering relationship. So D spells out (the workspace contracts to maintain a connected order). The connection between D and T is understood to be maintained in virtue of their featural (?/?) relationship. Next, when defective T is introduced, we have the following:

(243) C?T ?:f ?V?T C?T ?:f ?V?T ?:f ?D ?:f ?:n D ?:f ?:n D ?:f ?:n

In (243) I have for convenience collapsed some steps of derivation. On the left we have the introduction of the embedded non-finite T.
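The connectedness condition in (240) and the contraction step shown in (241) and (242) can be made concrete with a small sketch. It is only an illustration under stated assumptions: dominance is modelled as a set of ordered pairs, and the names `is_connected` and `add_and_contract` are hypothetical, not part of the formal apparatus of TCG. The policy of spelling out whichever older elements are unordered with respect to the newcomer is likewise an assumption, chosen because it reproduces the spell-out of D once V is added.

```python
from itertools import combinations


def is_connected(nodes, dom):
    """Check (240): every pair of distinct nodes is ordered by dominance.

    `dom` is a set of (dominator, dominated) pairs, assumed transitively closed.
    """
    return all((x, y) in dom or (y, x) in dom for x, y in combinations(nodes, 2))


def add_and_contract(nodes, dom, new, dominators):
    """Add `new` below `dominators`, then spell out whatever is unordered with it.

    Assumption: the newcomer stays and conflicting older elements leave,
    mirroring the step from (241) to (242), where D spells out once V is added.
    """
    dom = dom | {(d, new) for d in dominators}
    spelled_out = [n for n in nodes if (n, new) not in dom and (new, n) not in dom]
    workspace = [n for n in nodes if n not in spelled_out] + [new]
    return workspace, dom, spelled_out


# Workspace before V: C dominates T, and T dominates the subject D.
nodes = ["C", "T", "D"]
dom = {("C", "T"), ("C", "D"), ("T", "D")}

workspace, dom_after, spelled_out = add_and_contract(nodes, dom, "V", ["C", "T"])
print(workspace)    # ['C', 'T', 'V']
print(spelled_out)  # ['D']
```

The sketch deliberately leaves out the featural link between D and T that later allows the subject to re-enter the workspace.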
On our anti-recursion assumptions, given that T subsumes T ?:f , the nodes are identified. In keeping with ordering consistency, this requires that the intervening V be spliced out of the workspace, and the T-T identification effects the reintroduction of the matrix subject (right-hand side of (243); I include this reintroduction on the horizontal line simply for presentational purposes, since left-right and top-down on the page both represent the single dominance relation).

The SCM-raising facts we have canvassed above demand that the entire LF-content of this D-element be in this lowered position. Two sets of questions arise.[97] First, is the LF-content in both the matrix and this new embedded position? Or does the ?-material have to "collapse" to a single position? Second, why are such lowered elements never re-pronounced in either intermediate or base positions?

Taking the second question first, the right generalization appears to be that these D-elements are spelled out in the contexts in which they are initially ?-valued. This accords with the general intuition of there being a "PF"-function to such properties, but note that our story regarding ?-transmission given above for expletive/associate relationships then runs into some trouble. For example, the idea there was that there in raising constructions can relate to the structure by ?-matching even though valuation does not occur. In virtue of T-T contractions in raising, the expletive element was suggested to be lowered along with T, according to our general story about SCM. In fact, it is difficult to see how we could manage to avoid lowering the expletive given how we have treated regular nominal expressions in such contexts.
However, if PF-spell-out is contingent on the context in which ?-properties are valued, then we expect one of two incorrect results for the raising constructions: either (i) expletives should appear in every intermediate position in raising, as in (244)a, or (ii) they should only appear in the lowest such intermediate position, as in (244)b:

(244) a. *There seems there to be likely there to appear there to be a man in the room
      b. *? seems ? to be likely ? to appear there to be a man in the room

[97] Actually at least three sets of questions arise. The third pertains to the structure of the matrix subject for this example (e.g., [his mother's bread]). Recursion in the nominal domain is not something I have discussed at all here, but presumably this will involve two head elements coding possession relating to the nominal and pronominal. I am abstracting away from this important issue to concentrate on how information flows in these derivations along our equivalent of verbal extended projection sequences.

The problem lies in the way we have conceived of node-identification. T-T contractions result in the lower "defective" instances of T taking on the matrix ?/? values. I will postpone a sketch of the solution, as we will need to say something similar in the domain of WH-movement in our discussion in the next section.

Regarding the first issue raised above: what the TCG mechanics provide, I am arguing, is a natural way to understand why there ought to be intermediate-type effects of the SCM sort. I have argued that it is an attractive platform for studying these phenomena, and sketched some preliminary analyses in terms of one possible implementation. But the general account does not tell us everything; further development is required to understand the issues that arise in interpretation in cases like the one above (and others, see below). The principles that govern reconstruction/connectivity type effects in A-movement are not well understood.
Some have denied they exist entirely (e.g., see Lasnik 1999, Chomsky 1995), while others have countered that such effects do sometimes show up (Boeckx 2003, Wurmbrand & Bobaljik 1999) and that evidence to the contrary simply points to a lack of full understanding of the differences between the inventory of potentially movable elements, and does not bear on the general idea of SCM. Some controversy exists over, for example, the status of examples of the following sort (this discussion draws on Wurmbrand & Bobaljik's 1999 presentation; the example is due initially to the work of May 1977):

(245) Some politician is likely t to address John's constituency

The claim about this case is that it manifests a scope ambiguity, with "some politician" taking scope from either the overt/matrix position or the embedded position from which, on standard views, it is taken to raise. The ambiguity is thus with respect to the predicate "is likely", and in particular whether the existential introduced in the subject nominal is under or over this predicate scope-wise. For example:

(246) ∃ > likely = "there is some politician who is likely to make the address"
      likely > ∃ = "it is likely that there is some politician who will make the address"

The ambiguity is clear, and a "copy"-type story, which we have motivated a version of here, can in principle account for this in terms of "interpreting" the nominal in either the upper or lower position (or taking "ambiguity" here to mean that somehow both positions are occupied, so that we may flip back and forth mentally between the two). Lasnik (1998) has argued, however, that this ambiguity can be explained without reference to scopal distinctions, but rather in terms of specificity. Consider:

(247) Some politician addressed John's constituency

This has a specific and a non-specific reading, where we may or may not (respectively) have a certain politician in mind. And this distinction corresponds to the ambiguity present in the raising case above.
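For concreteness, the two readings glossed in (246) can be written out as below. This is a standard first-order-style rendering offered only as an illustration: the predicate names, the constant c (standing for John's constituency), and the treatment of likely as a propositional operator are assumptions, not notation taken from the works cited.

```latex
\begin{align*}
\exists > \textit{likely}: &\quad \exists x\,[\mathrm{politician}(x) \wedge \mathrm{LIKELY}(\mathrm{address}(x, c))] \\
\textit{likely} > \exists: &\quad \mathrm{LIKELY}(\exists x\,[\mathrm{politician}(x) \wedge \mathrm{address}(x, c)])
\end{align*}
```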
Lasnik's point is that scope ambiguities are Q-Q interactions (e.g., of the everybody loves somebody sort), and it's not obvious that there are such relationships at play in the raising case. But nonetheless it is possible to think about specificity differences in cases like the one in (247), where there is no issue regarding high/low positions from which to interpret an element (though perhaps the issue is best understood in terms of v/VP-internal/external positions, which could be involved both in (247)'s ambiguity and in that of the raising case above). Bobaljik & Wurmbrand (1999) have responded a bit to this line of argumentation (as well as to some other challenges raised by Chomsky regarding A-traces/copies), but I won't go into this further here. Relevant here is their general conclusion, which is just that while there is reason to doubt that SCM in A-relations, even if totally general, always leads to reconstruction/connectivity effects, there are nonetheless some cases where such analyses seem to be required to understand the effects that do manifest. Here (above) I have concentrated on one main type of case involving interference effects in binding of self-forms, but there are other cases which bear on these matters that will require further attention, and which will be required to help sort out further details for the TCG-style analysis I am offering.

3.3. Linked Local Relations II: Wh-Movement

We can pick up the thread from the last section regarding variable-binding and obviation interactions by posing the following question: if we keep the raising construction in (239) the same in all other respects, but change the subject element housing the relevant NP and pronoun to a wh-phrase, do we see the same pattern as we saw for the A-movement case?

3.3.1. ?-Identification & Local Movement

Consider:

(248) a. *[Which of his₁ mother's₂ pies] seems to her₂ _ to be known by every man₁ to be _ the best there is
      b.
[Which of his₁ mother's₂ pies] seems to every man₁ _ to be known by her₂ to be _ the best there is

On standard bottom-up derivational views such a wh-element would begin in the base/θ-position, and A-move just like the NP in RtS, but the "last" movement would be to the C-domain to license the wh-properties. This case manifests exactly the same pattern as the ordinary raising case examined in the previous section. In particular, binding of the pronoun his by every man is possible without there having to be obviation between her and mother. This raises some questions on our view (though not on standard approaches). We have suggested A'-movement to be a dominance-encoded feature licensing relationship, so the beginning of the derivation for either of the above cases would look as follows:

(249) C ?:? WH ?T ?:? ?:n ?V?T D ?:f WH ?:?

The problem is that we have understood so far the relevant relationships to go as follows:

(250) C ?:???:f WH ?T ?:???:f ?:n ?V?T D ?:f WH?? ?:?

And then what effects the "raising" ("lowering") in such constructions are T-T identifications. But this does not provide a mechanism for the content of the matrix wh-element to be lowered to the embedded edges of the non-finite complements, as these have by assumption been understood not to involve a C-layer. But the binding/scope interactions above seem to insist that this is what is required.

On standard views this is unproblematic: the wh-element begins in a θ-position, and is raised from A-position to A-position (involving the edges of the non-finite complements), finally landing in the matrix ?/? position (matrix T), and then A'-moving to C. Thus on a copy view within such standard assumptions we have no problem understanding either the raising case offered above or the wh-movement variant, since the latter kind of relationship includes the former.

We are now in a position to further specify the "variable" role that we are suggesting for ?-properties.
Note that the traditional GB-era view that we discussed above viewed ?-marked traces as "syntactic" variables, which on some implementations were understood to map to semantic ones. This is more or less what we have been presupposing in our discussion so far. However, it is possible given the current structure of our account to entertain a different claim: ?-properties are literally syntactic variables, in the following sense. We have so far entertained the idea of ?-features indexing the open positions of thematic predicates. The idea here is that these elements are the syntactic side of relations to semantic variables (i.e., the open positions of ?). The path of ?-agreeing nodes in the structure was suggested to "lead to" a ?-position in regular A-relationships, and "?-marked nominals" are connected in this way to ?. Suppose that the T-D relation resulting in ? being valued on D and deleted from T is a process, as we have been suggesting, that we might call ARGUMENT IDENTIFICATION. In the case of an overt nominal, say a subject, this mark signifies the element that is connected to ? via the sequence of ?-properties on the path.

Now consider the situation above. If ? serves to identify arguments, we might entertain the following possibility, similar to the node-identification discussed earlier:

(251) C ?:f WH?WH[?:n] ?T ?:f ?:n ?V C ?:f WH[?:n] ?T ?:f ?:n ?V D ?:f ?:???:n D ?:f ?:n D ?:f ?:n

In short, it looks like we need something like local movement after all. So far the only thing in our implementation that really resembled movement was the edge-to-edge lowering effected by context/node-identification. However, the suggestion here is that this is tied to the special role of Case as a syntactic variable. In virtue of the WH-feature valuation by the local ?-property, the wh-element will come to be ?-marked. The suggestion is then that in virtue of this identification, the C-related element comes to be dominated by T:
?-properties thus mark the entire unit, and where there is locally co-valued ?, there is essentially the same sort of effect that we see with categorial node identification. That this doesn't happen with ?-properties is thus a constitutive difference between these feature types. ?-properties define a local unithood (a stretch of co-valued nodes in a dominance sequence, a "chain"), and ? marks arguments that are then related by this chain to ?. There may be other technical ways to implement a solution to this issue, but I will assume this for the rest of this work.

So, to sum up: ?-valuation marks arguments, and every occurrence of ? on the path is understood to dominate the ?-marked element. Note that D ends up in a derived relation with T, so that A vs. A'-relationships involving D-T are distinguished by the absence/presence of ? on T (respectively).

With these mechanical assumptions the regular T-T node identification procedure discussed above for the raising cases will now function to lower (the copy of) the wh-element to each non-finite complement edge, thus yielding a structure accounting for the possibility of the variable-binding/Condition C interactions for (248).

3.3.2. Core Cases of A'-SCM (& Some Technical Problems Addressed)

The general structure of the TCG account of SCM carries over to the core cases of wh-movement more or less straightforwardly. However, we are now in a position to re-raise and discuss some possible answers to some technical issues left open earlier. In particular, we considered the possibility in Chapter One of having a "Make-OP" style operation built into our WH-feature licensing mechanism: basically, C-WH deletes D-WH, leaving this property only on C. This suggests then being able to treat the "residue" of such a deletion on D as a (potentially complex) variable-like element.
However, the mechanics of node-identification and contraction suggest that we should understand this WH-property as copied to all the lower C-nodes in our SCM analyses, in virtue of the identification which effects the lowering of the actual Dwh (now a "residual" structure interpreted as a variable). We contrasted these two schemas in §1.4.1 ((54) & (55), repeated here):

(252) C WH[?:?] ?..?C WH[?:?] ?..?C WH[?:?] ?..? D ?:? D ?:? D ?:?
(253) C WH[?:?] ?..?C?..?C?..? D ?:? D ?:? D ?:?

We noted that it is really the latter, and not the former, that we want, though as we have stated things the former is the one that the TCG derivations seem to produce. Let us consider then a somewhat subtler formulation of the process of node-identification. What we want is for the process to yield an identification that will justify the "re-entering" into the workspace of the dominated, "to-be-moved" element. However, we want this to proceed without a copying of all of the information associated with the upper element, but we want the upper-element properties to remain "visible" in the workspace, so that the lowering can result in a local structure where licensing occurs.

Note that this issue concerns both the A'- and A-relations. The issue arose with respect to A-relations above in connection with having ?-properties appear in all embedded positions and our suggestion that ?-licensing could be understood (for A-relations) to indicate the point in the structure where an element is pronounced. But for expletive there we suggested that this element was precisely one that did not allow local ?-valuation, and so it must be carried along to embedded contexts in our version of the A-type of SCM.

The idea then for an alternative view of node-identification would be to say that the features associated with a node X are "fixed" with respect to the output structure. Whatever the nature of the connection between (e.g.)
WH-properties and the ?-vocabulary associated with that node, that relation keeps those properties fixed to that initial position as it is determined when the element enters the derivation. Node identification can then still occur, under the same general conditions of subsumption as we have been assuming. But while this will identify positions in the workspace, it will not "copy" the relevant features to the lower position. To see the idea, conceive of our workspace/output distinction in terms of separate layers or tiers of structure. Consider:

(254) X [F] ?..?X?..?X?.. X ? ? CONTRACTION? CONTRACTION? X [F] ?..?X [F] ?..?X [F] ?.. X ? ?

I mentioned earlier on in this discussion (see Chapter One) that the WS/O-distinction allowed a way of thinking of "many" in the output structure as "one" in the workspace. This is a situation where the mapping is insisted to be one-to-one for any given stage. At the "end" of a derivation, the relevant ?-properties will be connected to lower variables via the mechanisms we have been developing above, but the syntactic information itself is "fast and fleeting". It is available for local domain construction within the workspace, and is, in situations allowing node-identification, permitted to be "carried over" to lower domains, but once the derivation is completed the workspace itself is gone, and so are the formal properties contained within it (that ? and ? information is connected to).

So, how then do we end up with "one" in the workspace corresponding to "many" in the output structure? The idea here relies on the "re-spell-out" mechanism discussed earlier. If the workspace is constrained by the connectedness requirement, insisting essentially that there only be a single dominance sequence at any given stage, then branching requires spelling out (removal from the workspace).
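The tiered picture in (254) can be sketched in code. The following is a minimal illustration, assuming that features are simple attribute-value maps, that subsumption is plain inclusion of attribute-value pairs, and that identification targets the nearest matching workspace position; the names `OutNode`, `Derivation`, `extend`, and `visible_feats` are hypothetical. Each new node is recorded in the output tier with only its own features, while the workspace keeps a single position whose visible features are those fixed on its first occurrence.

```python
from dataclasses import dataclass, field


@dataclass
class OutNode:
    """A node in the output tier; its features never change after creation."""
    cat: str
    feats: dict = field(default_factory=dict)


def subsumes(general, specific):
    """True if every attribute-value pair of `general` also holds in `specific`."""
    return all(specific.get(k) == v for k, v in general.items())


class Derivation:
    def __init__(self):
        self.output = []     # output tier: every node introduced, top-down
        self.workspace = []  # workspace tier: one entry per active position

    def extend(self, cat, feats=None):
        """Add a node; identify it with a workspace position when possible."""
        node = OutNode(cat, dict(feats or {}))
        self.output.append(node)
        idx = len(self.output) - 1
        # Search upward for the nearest same-category position that the new node subsumes.
        for pos in range(len(self.workspace) - 1, -1, -1):
            entry = self.workspace[pos]
            anchor = self.output[entry["anchor"]]
            if entry["cat"] == cat and subsumes(node.feats, anchor.feats):
                # Contraction: splice out everything below the identified position.
                del self.workspace[pos + 1:]
                entry["occurrences"].append(idx)  # no features are copied down
                return entry
        entry = {"cat": cat, "anchor": idx, "occurrences": [idx]}
        self.workspace.append(entry)
        return entry

    def visible_feats(self, pos):
        """Features visible at a workspace position: those of its first occurrence."""
        return self.output[self.workspace[pos]["anchor"]].feats


d = Derivation()
for cat, feats in [("X", {"F": "+"}), ("Y", {}), ("X", {}), ("Z", {}), ("X", {})]:
    d.extend(cat, feats)

print(len(d.workspace))               # 1
print(d.workspace[0]["occurrences"])  # [0, 2, 4]
print([n.feats for n in d.output if n.cat == "X"])  # [{'F': '+'}, {}, {}]
```

The sketch covers only the X-elements on the path. It does not model the re-entry and repeated spell-out of elements associated with X.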
However, we are viewing the node-identification procedure as preserving output structure relationships so long as they do not introduce local ordering conflicts in the workspace. This means that any element Y that may be associated with X in our schema above in (254) will be "re-entered" into the workspace in virtue of identification of X's. And as further structure is added they will have to "re-spell-out". This yields multiples in the output for the associated Y-elements. But notice that no such "leaving" and "re-entering" is required for the X-elements, as these never cause a problem for the connectedness condition (they are always present in the path).

However, we noted also in our Chapter One discussion that any such mechanism that would insist on "reintroducing" spelled-out material in virtue of the node identification process could potentially run afoul of our anti-recursion conditions, as such spelled-out branches could be arbitrarily complex. Suppose instead that the initial feature licensing relationship which holds of the top-most element is sufficient to evoke the "lowering"; that is, what matters to this process is the initially established agreement (?) relationship, and this information is "carried over" to lower domains in virtue of successive node identifications. Then, instead of reassigning syntactic/categorial information to such lowered complexes, we can say that some minimal information is assigned, perhaps just D and the relevant ? information, or perhaps just ?. The result is that the lowering that attends node identification re-introduces only an index of sorts which we take to dominate just LF-relevant vocabulary. This allows us to keep with the idea of "pronouncing" elements in A-relations where the relevant ?-property is, without the problem of expletive-there spell-out raised above.
And it gives us the structures we want for wh-movement, with local copies of variable-like elements (e.g., WH(x)..[x mother]..[x mother]..[x mother], in whose mother did John think Bill knew Sarah met, etc.).

Now consider some of the binding-connectivity effects discussed in Chapter 2. For example:

(255) a. Which pictures of himself did John know Bill wanted?
      b. [which..himself] did John know [which..himself] Bill wanted [which..himself]

The ambiguity present here we can now attribute to the TCG derivations of SCM effects, again, like the raising cases, without the postulation of special features (M-features) driving the individual movements. The self-element ends up in local relationships without interveners with both of the possible antecedents. This helps to understand cases such as those discussed in Chapter One as well, for example:

(256) a. John thought pictures of himself/*herself were on sale
      b. Which pictures of himself/*herself did John think were on sale
(257) a. ?John thought Mary sold pictures of himself
      b. Which pictures of himself did John think Mary sold

And reciprocal elements distribute in basically the same way, as is well known:

(258) a. The boys thought pictures of each other were on sale
      b. Which pictures of each other did the boys think were on sale?
(259) a. ?The boys thought I sold pictures of each other
      b. Which pictures of each other did the boys think I sold?

Impossible bindings across intervening elements suggest that a "direct" relationship between the matrix wh-phrase and the base position is insufficient. The well-formedness of the b-cases in (257) and (259) can be accounted for if there is a local relationship to the phrase containing self/each-other, and this is what the SCM-style analysis provides.

Note as well that where we have suggested that intermediate C-nodes are absent, as with the complements of raising predicates, we do not have a locally licensed "copy" that could enter into the relevant binding relations.
The following examples (Abels 2003:30) illustrate:

(260) a. *Which picture of himself did Mary seem to John (Mary) to like e
      b. Which picture of himself did it seem that John liked e
      c. Which picture of himself did Mary think John wanted (John) to pack e

As Abels observes, the a-case supports the idea that raising infinitives are not CPs, since if they were they would support a potential landing site that would put the wh-phrase within the local environment of the NP (John) that could be a binder. Where we have evidence for CPs, as in the b- and c-cases, we also have the possibility of binding the self-form.[98]

[98] The a-case above also reveals a parallelism with a comparable copy-raising construction, suggesting that these do not involve CPs either.
      a. *Which picture of himself did Mary seem to John like she wanted e
      b. Mary seemed to John like she wanted pictures of herself
      c. *Mary seemed to John like she wanted pictures of himself

Note as well that under the contraction view of SCM we in general expect nested dependencies of the sort predicted by Path Containment approaches, pioneered in the work of Pesetsky (1982), Kayne (1984), May (1985) and others. This is a quite general property of multiple "like" dependencies. Consider the following familiar sorts of cases from Pesetsky (1982):

(261) a. What books do you know [ who [ PRO to persuade e [PRO to read e ]]
      b. *Who do you know [what books [ PRO to persuade e [PRO to read e ]]

We see the same kinds of nesting versus crossing effects across the range of A'-movement relationships, including within the structure of relative clauses, in topicalization, infinitival relatives, tough-movement, too/enough-movement, and comparatives (see Pesetsky 1982:269 for examples).

On the general structure of the account, it is worth pointing out that C-C nodes in the workspace will be unable to be identified in workspace contraction if subsumption does not hold.
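The role of subsumption in licensing node identification can be illustrated with a toy check. This is a sketch under simplifying assumptions that go beyond the text: feature bundles are attribute-value maps, a lower node may be identified with an upper node of the same category only if the lower node's features are all present with the same values on the upper node, and a C that is independently specified for WH is represented as carrying its own WH value. The feature names and values, and the function `can_identify`, are hypothetical.

```python
def subsumes(general, specific):
    """True if every attribute-value pair of `general` also holds in `specific`."""
    return all(specific.get(k) == v for k, v in general.items())


def can_identify(lower, upper):
    """A lower node can be identified with an upper one of the same category
    only if the lower node subsumes the upper node."""
    return lower["cat"] == upper["cat"] and subsumes(lower["feats"], upper["feats"])


# Raising: defective embedded T carries no features, so it subsumes matrix T.
matrix_t = {"cat": "T", "feats": {"phi": "f"}}
defective_t = {"cat": "T", "feats": {}}

# Assumed encoding: matrix C is WH-specified by the fronted wh-phrase.
matrix_c = {"cat": "C", "feats": {"WH": "who"}}
plain_c = {"cat": "C", "feats": {}}                      # declarative complement
interrogative_c = {"cat": "C", "feats": {"WH": "whether"}}  # independently WH-specified

print(can_identify(defective_t, matrix_t))      # True
print(can_identify(plain_c, matrix_c))          # True
print(can_identify(interrogative_c, matrix_c))  # False
```

On this encoding, identification fails exactly where the lower C brings a WH specification of its own.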
So, if we understand interrogative embedded complements as being specified for WH, then this lack of ability to contract/identify can yield for us an account of wh-islands.

(262) ?Who did John wonder whether Mary liked _ ?
      ?Who did John wonder who Mary liked _ ?

Moreover, if we take the identification of C-nodes to be sensitive to a more general category of operator elements, then we can extend the "impossible contraction" story to other classes of so-called non-bridge verbs, like factives for example (see Frank 2002 for some discussion along these lines).

There are, however, cases we mentioned in Chapter Two which suggest that SCM is "more cyclic" than our view predicts, in particular facts that suggest that the "edge" of v/VP is an intermediate landing site. I will return to these cases below, after we have discussed some possible extensions of the general architecture to local domains. The situation we are in with respect to evidence for a v/VP level intermediate movement is fairly straightforward: we are forced to posit more structure within local domains in order for the general approach to yield the facts.

3.4. Local Relations: Part Two

In this section I return to some of the issues raised at the beginning of this chapter regarding local relationships for simple transitives. There we noted that our feature-valuation mechanics seemed to require bi-directional valuation on the dominance ordering (to allow both upward ?-valuation and downward ?-valuation in D-T relations, for example), but that the absence of locality restrictions suggested a chance for chaos when more than one ?-element would be in the same local domain. Here I pursue the possibility that this situation never arises.

3.4.1. Raising-to-Object (RtO)

Consider:

(263) John believes him to be a genius

There are two main lines of thinking regarding these constructions, which differ with respect to how the accusative-marked element (him) is viewed with respect to the matrix/embedded clause boundary.
The choice of analysis typically swings with the claims made about the categorial status of the embedded infinitival. On the one hand, there is the idea that him is in the lower clause, and that there is an "exceptional" process which converts the categorial status of the embedded structure from CP to TP (S' to S in traditional terms; see Chomsky 1981). On the other hand, there is the idea that these cases involve raising to an "object" position (Postal 1974). In modern views that have resurrected this Raising-to-Object (RtO) view, the categorial type of the embedded clause is usually taken to be a TP, and much is made of the similarities of these cases to the RtS sort of NP-movement discussed above.

A number of factors favor the RtO type of analysis;[99] here I will name just a few. First, passivizing believe targets this embedded ECM'd element, strongly suggesting matrix objecthood since, much like the impossibility of raising out of such contexts, subjects of lower CPs clearly cannot undergo this process:

(264) a. He is believed to be a genius
      b. *He is believed that _ is a genius

Second, binding-theoretic conditions apply to this element as if it is a matrix object, and not like a lower subject:

(265) John₁ believes himself₁/*him₁ to be a genius

However, the meaning equivalence of the following two cases insists that we understand the ECM'd nominal to be the thematic subject of the lower clause:[100]

(266) a. John believes Dave is a genius
      b. John believes Dave to be a genius

But, on the other hand, binding conditions differ in their effects for these two kinds of complements, which again suggests that the ECM'd nominal is in the higher clause:

(267) a. John₁ believes he₁/₂ is a genius
      b. John₁ believes him*₁/₂ to be a genius

The combination of these facts (participation in the formal processes of matrix objects but thematic association to the embedded material) strongly suggests a raising-style account.
[99] See Johnson (1991), Koizumi (1993, 1995), Lasnik (1995), Runner (1995), Bobaljik (1995) for recent discussions.
[100] Rosenbaum (1967), among others. For summary discussion and further references see Runner (to appear).

Another standard argument includes reference to other effects of hierarchical position as indexed by binding/scope possibilities, which suggest that the ECM'd nominal is in the higher clause (Lasnik & Saito 1991; Postal 1974):[101]

(268) a. ?The DA proved the defendants₁ to be guilty during each other's₁ trials
      b. *The DA proved that the defendants₁ were guilty during each other's₁ trials

[101] As is well known, similar effects of hierarchy hold for Condition C, negative polarity licensing, etc.

Postal's classic RtO analysis of these constructions, which seems to do pretty well with the facts, was imported into current approaches via the assumption that objective/accusative Case assignment is essentially like that of subjects, and involves movement from a base thematic position to a specifier position. In Chomsky (1991), Lasnik & Saito (1991), and Johnson (1991), among others, this was taken to involve an object-related Agreement head (AgrO). In more recent work (Chomsky 1995 and subsequent) the notion of separate agreement heads has been called into question.[102] However, the general idea of the RtO-analysis can be pictured as follows, where the nominal is understood in the general case to raise to some functional category F below matrix T to license its Case properties, as in (269). So the implementation of Postal-style RtO then looks like (270):

(269) [FP NP [F' F⁰ [VP V⁰ t ]]]
(270) [FP NP [F' F⁰ [VP V⁰ [TP t [T' T⁰ .. ]]]]]

[102] Though see Belletti (2001).

As a last review note on these constructions, an analysis like (270) also fits nicely with the various "subject-like" properties exhibited in ECM, as witnessed by expletives and idiom pieces in these positions, paralleling the properties of athematic positions in raising to subject constructions:

(271) a.
I believe there to be a moron in the White House
      b. I believe it to be the case that Dave left
      c. I believe it that Dave left
      d. I believe the shit to have hit the fan

So suppose then that something like the object-raising story is correct. How can we capture it in our view of contraction? Notice that without some assumption about a higher functional element responsible for accusative Case in ECM, we predict that (272) should be a subject-raising situation if the infinitival structure is a TP.

(272) John believed him to like carrots

This ought to manifest a sequence of elements and a phase structure like (273) if there is no intervening functional element of the right sort to block the contraction:

(273) C?T?v?V?T?v?V ?(MATRIX-T/EMBEDDED NON-FINITE T IDENTIFY/CONTRACT) D D

There are two possibilities. We could consider something like the old CP/S' plus "deletion" (i.e., to TP/S) analysis of Chomsky (1981), in which case the T-T contraction illustrated above would be blocked as desired:

(274) C?T?v?V?C?T?v?V D D

But this would be to lose all the nice properties of the object-raising analysis sketched above. Plus, it's not clear how we could possibly implement the notion of S'/CP-deletion in this system, since that would put us right back in the same situation we started with regarding the undesirable contraction above in (273).

Another alternative, one consistent with the story we have told about RtS, would be to claim that there is a matrix-object-related T node above V, but below the C-T-v subject argument complex. Pesetsky & Torrego (2004) suggest such an account within their general attempt to connect the presence/absence of T with Case theory generally. Their structure for simple transitives is thus:

(275) [CP C0 [TP T0 [vP v0 [TP T0 [VP V0 ..]]]]]

This allows us to consider the possibility that the matrix clause is really hiding a bit more structure.
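The contrast between (273) and (274) can be illustrated with a toy sketch. It assumes, purely for illustration, that the workspace is a flat top-down list of category labels, that items enter one at a time, and that a repeated label is identified with its earlier occurrence while everything between the two is spliced out. The function names and the list encoding are hypothetical conveniences, not part of the system developed here.

```python
def extend(workspace, label):
    """Enter `label` at the bottom of a top-down workspace.

    Illustrative assumption: if `label` already occurs in the workspace,
    the new item is identified with that earlier occurrence and all
    material below the earlier occurrence is spliced out.
    Returns (new_workspace, spliced_out).
    """
    if label in workspace:
        i = workspace.index(label)
        return workspace[: i + 1], workspace[i + 1 :]
    return workspace + [label], []


def contraction_triggers(sequence):
    """Return the labels that forced a contraction, plus the final workspace."""
    workspace, triggers = [], []
    for label in sequence:
        if label in workspace:
            triggers.append(label)
        workspace, _ = extend(workspace, label)
    return triggers, workspace


# (273): no intervening C, so the embedded T meets the matrix T.
no_c = contraction_triggers("C T v V T v V".split())

# (274): the embedded C contracts with the matrix C first.
with_c = contraction_triggers("C T v V C T v V".split())
```

On this encoding the sequence in (273) triggers a T-T contraction, while in (274) the embedded C contracts with the matrix C first, so the matrix T is no longer in the workspace when the embedded T arrives.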
And this story would then allow us to view object-raising as T-T contraction, exactly as we did above for subject-raising, but now with "object-related" T contracting with the embedded infinitival.

(276) C?T?v?T?V?T?v?V D D

Note that our view of structural contraction forces this analysis on us. I find this interesting since this sort of iterated clause-internal "mini-clause" structure has been explicitly argued for by a number of authors under the label of the so-called Split-VP Hypothesis. The general idea behind Split-VP includes the now fairly widely adopted view of separating/dividing the lexical shell of verbal domains into a core verbal element V (the ultimate head) and a small-v element, understood to introduce an external argument.

(277) [vP v0 .. [VP V0 ..]]

Koizumi's (1993, 1995) notion of Split-VP has it that these verbal elements are separated into distinct zones in virtue of the existence of one (or more) intervening functional elements (F-heads):

(278) [ .. [vP v0 .. [FP F0_n .. [FP F0_1 .. [VP V0 ..]] .. ]]]

There is a diverse array of analyses evoking Split-VPs in this sense in the literature, and while the exact nature of these intervening functional elements is by no means settled, there appears to be something of a growing consensus that some such division/separation approach may be correct. Candidate types evoked to label these intervening functional elements between the separated ?-elements (v and V) include Agr(eement), a lower T(ense), Asp(ect), a lower instance of C(omp), among others. 103 Lasnik (1995, 1999) includes arguments based on the properties of pseudogapping that support the idea of Split-VP and overt object and verb movement in English. Runner (1995) contains arguments along these lines as well. I will not review these arguments here, and I will also not be discussing head movement in this thesis. But the conclusion which I wish to extract from this is that the iterated mini-clause analysis that our view of structural contraction appears to force upon us is by no means unprecedented, and in fact has a fair amount of independent empirical support.

103 Many others, actually. Work in the minimalist program has seen no shortage of proposals arguing for functional category distinctions. I will be working with fairly blunt tools in this regard, but as mentioned earlier, the efforts in analysis which this thesis aspires to are mainly in service of developing a clear and plausible picture of the theoretical ideas and the consequences for general architecture.

Let's consider this a bit more. What prevents T-T contraction then within the main clause, so that external and internal arguments might either become confused (as we worried about earlier) or inappropriately identified? Note that the assumption here would be that the two T-elements within a single clause would be distinguished by ?-properties, as follows:

(279) C?T ?:f ? v ?[?:?] ?T ?:f ?:n ?V ?[?:?] D ?:f ?:n

The anti-recursion condition on workspaces will force the "subject" T-v structure to be spliced out when the second T is introduced. This has the welcome outcome that it appears to solve the difficulties we raised at the outset of this chapter regarding the locality of valuation. We have essentially imported the structure of the account now into the local domain of single clauses. So we now view ECM as RtO, parallel with our RtS derivations from earlier discussion. For example:

(280) C?T ?:f ? v ?[?:?] ?T ?:g ?:a?? ?V?T D ?:f ?:n D ?:g ?:???:a

Assuming that the complement of "ECM verbs" (now using the term as a descriptive label) is a "defective" form of T, we would have the same node-identification and contraction for the object-raising in (280) as we did for the subject cases.
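The difference between identifying two like elements and merely splicing material out can be sketched in a toy style. The assumptions here are mine and serve illustration only: each element is a label plus a dictionary of attributes; None marks an unvalued attribute; two like-labeled elements count as non-distinct when no attribute is valued differently on both, which is a crude stand-in for the condition under which repeated elements are identified; and the attribute name `case` with values `nom` and `acc` is a hypothetical placeholder.

```python
def non_distinct(old, new):
    """True if no attribute is valued differently on the two elements."""
    for attr, value in new.items():
        other = old.get(attr)
        if value is not None and other is not None and value != other:
            return False
    return True


def introduce(workspace, label, feats):
    """Enter (label, feats) at the bottom of a top-down workspace.

    Returns (new_workspace, spliced_out, outcome), where outcome is
    "added", "identified" (information carried over), or "replaced"
    (the earlier like element is spliced out along with what follows it).
    """
    for i, (old_label, old_feats) in enumerate(workspace):
        if old_label != label:
            continue
        below = workspace[i + 1:]
        if non_distinct(old_feats, feats):
            merged = dict(old_feats)
            for attr, value in feats.items():
                if value is not None or attr not in merged:
                    merged[attr] = value
            return workspace[:i] + [(label, merged)], below, "identified"
        return (workspace[:i] + [(label, dict(feats))],
                [workspace[i]] + below, "replaced")
    return workspace + [(label, dict(feats))], [], "added"


# Hypothetical encoding of a clause up to v, with a nominative-related T.
ws = [("C", {}), ("T", {"case": "nom"}), ("v", {})]

# A second T that is distinct (accusative): the subject T-v stretch is spliced out.
ws_active, out_active, how_active = introduce(ws, "T", {"case": "acc"})

# A second T with no distinguishing attribute: it is identified with the first T.
ws_bare, out_bare, how_bare = introduce(ws, "T", {})
```

In the first call the two T-elements are distinct, so the earlier T and v leave the workspace without identification; in the second, nothing distinguishes them, so they are identified and only v is spliced out.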
In the next section I suggest that this view of clause-internal structure can be put to work to yield some interesting properties of passives, and make some tentative suggestions regarding local binding and control phenomena.

3.4.2. Passives, Local Binding, & Control

Note that the root-first directionality for the emergence/expansion of these sequences of categories is crucial. On either a representational or bottom-up derivational view, it would be possible to view the two instances of T in these structures as undergoing contraction, resulting in the following structure where we have T-contraction 'over' an intervening C: 104

(281) C?T?v?C?T?V

On the root-first view, the phases in (281) conflict with those demanded by the necessary C-C contraction, which we can see clearly by superimposing the two:

(282) C?T?v?C?T?V

Assuming for the moment that items are entered into the workspace one at a time with the assumed direction/ordering given above, this means that C-C contraction will always block T-T. However, we predict on this one-at-a-time view that in the absence of the intervening C-element we should see instances of T-T identification & contraction if there are no properties of these elements to distinguish them (so subsumption does hold).

(283) C?T?v?T?V

104 Although in a representational implementation we wouldn't view this as literal contraction in the sense I have been entertaining, but rather just some notion of domain that would be defined over "like elements". What I am arguing here is that only on the root-first view do we get the right sort of domains.

This, I propose, is exactly what happens in the case of passive. Absence of ? on object-related T could be seen to drive a kind of "clause-internal" raising on these assumptions. Suppose that objective T can enter the derivation without ?/?-properties.

(284) C?T ?:f ? v ?[?:?]
?TV D ?:f ?:n

The idea of the instance of T-T contraction happening over the element v instantiates the idea that when accusative Case is absent, the external role could be seen as essentially spliced out of the active stretch of derivation without being ?-valued. This is, I submit, the present but unexpressed external θ-role in passives. Such a possibility suggests that θ-roles need not, strictly speaking, be assigned. Left open (not closed off by the introduction of a local satisfied Case property) they function as implicit arguments. This view requires however that in general ? cannot be ?-valued prior to the relevant T-T identification. Suppose then in general that ? differs from the other properties we have discussed in that it is valued when it is removed from the workspace. On this view v ? (or any ?-element) can only be properly ?-valued if it is "spelled-out" together with a superordinate specified ?-property. I assume this in what follows; to be explicit:

(285) LAST RESORT ?-VALUATION: ?[?:?] is valued at Spell-Out

This isn't a completely wild speculation: the general idea is that we regard ?-relationships (the ultimate "integration" of nominal elements into the emerging event structure) as a matter of the interface mapping. Here this is our WS/O-distinction, so it's not unreasonable to locate our version of ?-connections to this particular mapping (WS to the "LF-relevant" properties of the derived output structure).

Note that we had to assume in earlier discussion that ?-properties are parasitic on ?-properties and their valuations. But we discussed a few cases where we wanted ?-relationships to be able to hold independently (e.g., the secondary predication case in the beginning of this chapter; the small clauses where associates of expletive-there may be found, etc.). What happens if there are ?-properties on object-related T, but no ?-properties? In such cases, I suggest, we have the possibility of connecting internal and external (or v and V) ?-properties.
Such a structure is attractive for thinking about the properties of so-called inherently reflexive (286) and reciprocal (287) verbs.

(286) a. John washed/bathed/shaved
      b. John washed/bathed/shaved himself

(287) a. The women met/kissed/hugged
      b. The women met/kissed/hugged each other

This view, though, requires that we posit an additional layer of functional structure. Consider:

(288) C?T ?:f ? v ?[?:?] ?T ?:? ?V D ?:f ?:n

Any ?-relationship between T-T would result in the splicing-out of v situation suggested above for passivization. How then could we manage local connections between arguments of the sort that manifest in the inherent reflexives/reciprocals?

Suppose we take the idea of iterated clause-internal structure around ?-elements a step further, and view the C-T-V type structure as the general way that functional elements cluster around lexical ones, resulting in a full "stacking" view, to borrow terminology from Bobaljik (1995). He contrasts split-VP type architectures with a "leapfrogging" view. Consider an illustration. Take the light nodes to be roughly ?-related, grey to be ?/?-related, and the dark nodes to be operator/A'-related:

(289) a. b.

The a-view is the one which makes for a leapfrogging view of domain relationships. This is a fairly common perspective. Grohmann (2003) enshrines roughly this C-T-V kind of division in his "prolific domains" view of clausal architecture, where each domain potentially decomposes into more fine-grained inventories of categories. Roughly, there is a domain where all thematic relations are computed, and this maps (by movement) to a domain where ? and ? and the like are licensed, and then these map (by movement) to a higher domain involving relationships of the A'-sort, including perhaps discourse-related functions (e.g., topic and focus and the like in a Rizzian "split-CP" view; see Rizzi 1997; see also Platzack 2000 for a view similar to Grohmann's).
In Bobaljik's terms, this results in "leapfrogging" type movement relations to map the elements from the lowest to the highest domains:

(290) a. b.

Movement relations needn't necessarily always be uniform in the fashion pictured above on the leapfrogging view. Note that stranding elements in different domains in these movement relationships yields the degrees of freedom to describe various different word-order, scope/binding relations, and the like (e.g., depending also on the issues of the sort mentioned earlier in our discussion of the WS/O-distinction regarding where one keeps the ?- versus the ?-relevant information).

The present architecture could be seen as embracing the same general vision of functional divisions, but with a different view about how they are organized and come together. The relations indicated in the a-view above correspond to the following ones in the b-view:

(291) a. b.

In addition we have suggested certain limited ways that the b-type domains can interact across their boundaries. That is, there is a limited range of relationships of the following type:

(292)

These correspond to the relations governed by the context/node-identification, with the notion of workspace contraction serving to limit the available "viewing window" to just these domains housing distinct elements.

Bobaljik (1995:ch3) gives a range of arguments in favor of the b-view ("stacking" in his terms) and offers a number of arguments against the a-view ("leapfrogging"). What I will consider here and below is an extension of this general way of thinking that aims to stay consistent with the general notion underlying the TCG approach as we have been developing it. Suppose then that we have a fully iterated view of local transitive structures; this would then look as follows (with the entire structure presented without any licensing of properties indicated):

(293) C ?:? ?T ?:? ?:n ?v ?[?:?] ?C ?:? ?T ?:? ?:a ?V ?[?:?] D ?:f ?:? D ?:f ?:?
I have added ?-properties to the C-elements; let us now consider how this view might function with respect to inherent reflexives/reciprocals. Above we noted that T-T identification would result in splicing-out v, yielding a present but not ?/?-connected element that would be interpreted in the output as the "implicit" external argument present in passives.

Two possibilities suggest themselves on this picture. First, the derivation could proceed essentially as in previous discussion with two new additions. Take the derivation up to the introduction of v in (294) (all the feature valuations are pictured here on a single step for convenience):

(294) C ?:???:f ?T ?:???:f ?:n? ? v ?[?:?] D ?:f ?:???:n

Two differences are now incorporated: (i) ?-properties on C, which are locally valued when D is related to T, and (ii) the ?-property is not valued for ? (this happens in the mapping to the output, whenever v is "spliced-out"). Subsequent addition of our hypothetical "object-related" C-element then yields:

(295) C ?:f ?T ?:f ? v ?[?:?] ?C ?:? D ?:f ?:n

Since this new C-element subsumes the higher one, we would have an instance of contraction, which I will suppose results in the following:

(296) C ?:f ?T ?:f ?v ?[?:f] ?C ?:f D ?:f ?:n

The important parts (highlighted above) are: (i) v-? is valued in the mapping to the output (in being voided from the workspace) and (ii) the lower C-element is now valued for the upper domain's ?-value. Now addition of the lower T-V structure looks as follows:

(297) C ?:f ?T ?:f ? v ?[?:f] ?C ?:f ?T ?:? ?:a ?V ?[?:?] D ?:f ?:n

Presence of T-?:a (accusative) on our assumptions requires/enforces distinct D. Suppose that overt self-anaphors are divided into a pronominal part and the "self" part, and that the function of the "self" part is to absorb accusative ? (see Hornstein 2000 for a similar view along these lines).
I assume that it arrives as a bundle with its "pronominal" part, which is a bare D with unvalued ?, so that we have the element which I will mark as just self D ?:? ?:

(298) C ?:f ?T ?:f ? v ?[?:f] ?C ?:f ?T ?:? ?:a ?V ?[?:?] D ?:f ?:n self D ?:? ?:

?-valuation then works in the expected way to yield:

(299) C ?:f ?T ?:f ? v ?[?:f] ?C ?:f ?T ?:f ?:a ?V ?[?:f] D ?:f ?:n self D ?:f ?:a

This is thus all one ?-chain. The assumption is that self serves to "capture" the ?-property, so it does not serve to individuate the "pronominal" part, which becomes valued by T-?. Thus the two ?-elements have the same index, yielding the ?-properties of local anaphors.

Note that we could alternatively view the self forms as coming to the derivation specified for ?; this would result in the usual D-T exchange of ?/?-values. On the view above there is the oddity of having T-? filled in by C, and then having T value both the ? and ? properties of the anaphor. On the alternative just mentioned, we would get the same result as the representation in (299), except that technically the marked ?-features below will not have entered into a valuation relationship:

(300) C ?:f ?T ?:f ? v ?[?:f] ?C ?:f ?T ?:f ?:a ?V ?[?:f] D ?:f ?:n self D ?:f ?:a

In our discussion of logophoric self-elements above, we suggested the following difference between local reflexives and logophoric-self, repeated here:

(301) a. Local Reflexives: ?:f?? .. ?:? ? ? MATCHING/VALUATION
      b. Logophoric -self: ?:f?? .. ?:f?? LINKING

In the local context the matter is obviously quite subtle, as a distinction is being drawn between one versus two tokens of a valued feature. The intuition behind local valuation, implicit throughout, is the notion known as "reentrancy" in feature-based/unification frameworks: 105 co-valued features are literally sharing a value.
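The one-token versus two-token distinction in (301) can be made concrete with a small sketch of reentrancy. This is an illustration under my own simplifying assumptions, not the formal system: a feature value is modeled as a mutable cell, valuation makes the unvalued attribute point at the very same cell as the valued one, and linking merely relates two separately valued cells. The class `Cell`, the function `value_by_matching`, and the attribute name `F` are hypothetical.

```python
class Cell:
    """A value cell; content None means 'unvalued'."""

    def __init__(self, content=None):
        self.content = content


def value_by_matching(valued, unvalued, attr):
    """Valuation as sharing: the unvalued element adopts the valued cell itself."""
    if unvalued[attr].content is not None:
        raise ValueError("target attribute is already valued")
    unvalued[attr] = valued[attr]


# Local reflexive pattern: a valued feature and an unvalued one are matched.
T = {"F": Cell("f")}
reflexive = {"F": Cell()}
value_by_matching(T, reflexive, "F")
one_token = reflexive["F"] is T["F"]           # a single shared value

# Logophoric pattern: two independently valued features that happen to agree.
logophor = {"F": Cell("f")}
same_value = logophor["F"].content == T["F"].content
two_tokens = logophor["F"] is not T["F"]       # equal content, distinct cells

# Changing the shared cell affects the reflexive but not the logophor.
T["F"].content = "g"
```

The design point is only that sharing and agreement come apart: after the final assignment, the matched element follows the change while the linked one keeps its own value.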
I have been assuming here that this yields a kind of internal unit-hood along the dominance sequence, and what is being explored now is the possibility of such relations extending across local recursive domains. (Relations between valued ?-properties I have suggested be treated as relations on the output structure of the linking sort.)

Note that regular pronouns on the present view would then plausibly be viewed as coming to the derivation with valued ?, and unvalued ?, like regular nominals. This would on our assumptions yield local obviation (*John_1 saw him_1). The idea for inherent reflexives would then be to say that these occur where the lower ?-property would be absent entirely, as follows:

(302) C ?:f ?T ?:f ? v ?[?:f] ?C ?:f ?T ?:f ?V ?[?:f] D ?:f ?:n

105 See Shieber (1985) for discussion and references.

Assume that this is how inherent reflexives work, and that something like this is relevant to inherent reciprocity. There are obviously complications with the latter that do not arise in the former regarding plurality and how members of the denoted set are understood to participate in the relevant relation (e.g., sorting out various flavors of "strength" of the reciprocal relation). Setting aside this significant complication, what I'd like to focus on now is the following property that arises for both under passivization:

(303) a. John washed/bathed/shaved
      b. John washed/bathed/shaved himself
      c. John was washed/bathed/shaved

(304) a. The women met/kissed/hugged
      b. The women met/kissed/hugged each other
      c. The women were met/kissed/hugged

A curious fact about these kinds of predicates is that both the inherent reflexive and reciprocal readings disappear in passivization. 106 The c-cases above cannot have the b-reading which the a-cases with 'missing' direct objects obligatorily have. So the idea here would be that the upper (nominative) ?-properties, in virtue of contraction, would create a second ?-domain, thus serving to index both the external and the internal role.
This general line of thinking could then be understood to support the inherent reflexive/reciprocal readings. Roughly this kind of distinction is developed by Hornstein (2000) though with rather different technical assumptions: presence/absence of accusative Case can result (in Hornstein's terms) in movement of an NP from the object to the subject θ-position, licensing reflexive readings (Hornstein does not discuss inherent reciprocals). 107 The presence of the relevant ?-property results in the two roles being distinguished, a possibility clearly permitted by these verb types (e.g., John washed the baby; The women kissed the baby).

106 These facts were pointed out to me by Ian Roberts (p.c.). See Baker, Johnson, & Roberts (1985) for an account that has a somewhat similar structure despite having little else in common with the present architecture.

The view of passive offered above explains this. The move is to suggest that just as object-related T can lack a ?-property, it also may lack a projected/selected C-element, which otherwise serves to "shield" it from contraction with the higher T. Since these derivations individuate/index the internal role via the upper (nominative) ?, this will necessarily be distinct from v ? , thus yielding the absence of the inherent reflexive/reciprocal readings for these verbs.

This is worth taking a closer look at, as it bears on the issues of what happens where in the TCG approach we've been developing here. The assumption is that "like" features in the workspace simply come to share values. So if two ? features, or two ? features, are in the same workspace, then they share values. Period. (Given that we have now partitioned ?-elements into separate workspace zones.)

Landau (2004) criticizes Hornstein's analysis, with which I share some assumptions (the whole story above simply stipulates absence of accusative Case), as follows. Why are Case features not potentially optional for all transitives? Thus the accusative ? (our "lower" C ?
) could be omitted with the mandatory reading then being an inherently reflexive reading, where (305)a would have to mean what (305)b does.

(305) a. John hit
      b. John hit himself

107 The general idea of having a Case-distinction permit the connection between internal and external arguments is attributed to suggestions made by Howard Lasnik & Alan Munn.

This strikes me as a reasonable question, but one that has a reasonable answer. This is, it seems to me, a bit like asking why an unaccusative verb (e.g., arrive) cannot end up in the syntax with an outer v-shell and thus manifest a structure supporting things like John arrived the man with some corresponding transitive or causative reading (e.g., 'John made the man arrive' or some such).

There are really two possibilities as far as I can see to address this issue. Either there is such a thing as a non-compositional, not-fully-productive sort of "structure-building", or something else accounts for the lack of productivity (or promiscuity) among the decomposed bits in approaches adopting one or another view of the separation hypothesis. It seems fairly clear to me that the difference between verbs licensing inherent reflexivity/reciprocity versus not is a lexical distinction. This only means "arbitrary fact" if we presuppose a Bloomfieldian view. The question is: how does this distinction manifest in the syntax? What are the properties that must be projected such that we can account for the patterns that are exhibited by the relevant elements? What is being claimed here (and, as I understand things, by Hornstein as well) is that Case properties are central to how these different manifestations of a verb are projected into the syntax.
Landau's objection seems to presuppose that we need to regard Case optionality in the intended sense as a matter of the workings of the syntactic system, as opposed to consequences for the syntactic system which could be seen to follow from alternative projection possibilities of Case/?-properties associated with different classes of elements.

What answers the objection for unaccusatives? Clearly it must be part of the specifications of these elements that they cannot enter into a structure with a superordinate v-shell. Once we have made the move of doing the decomposition of the VP in this manner there's no getting the genie back in the bottle; we have to live with the somewhat more exciting ontology of the separation/decomposition view. And this, it seems, means accepting that there are some non-compositional sorts of organization involved here. If I'm right in this line of argumentation, and we connect these sorts of "argument structure" alternations with the theory of Case (or agreement perhaps as well, and perhaps more in the functional hierarchy), then it begins to become plausible that we are looking at (as suggested earlier) paradigmatic organization, which we needn't necessarily expect to be "productive" in the manner that syntagmatic organization is. It's just a different kind of system.

This does of course presuppose that what is captured "in the lexicon" includes specifications regarding the projection possibilities which go beyond a single phrasal projection layer. But this is general, and not specific to the issues as they arise regarding inherently reflexive or reciprocal verbs. This is, in fact, part of the point of work elaborating on the notion of extended projections of the sort discussed in the work of Grimshaw (1991, 2000) and others, or so it seems to me.
Given the advent of the functional projection explosion which has attended the development of minimalist theory, it is no longer conceivable that we can understand "projection" as governing the distribution of elements within a single categorial shell.

Let us ask the same question in a related domain in an analogous way that will make the issues somewhat clearer, as the issue strikes me as one worth spending some time on. Consider the following distinctions in the "lexical syntax" of various different types of elements, as conceived in the widely adopted work of Hale & Keyser (1993, 2002):

(306) a. [XP X0 YP]
      b. [XP ZP [X X0 YP]]
      c. [XP ZP [X X0 Y0]]
      d. X0

They distinguish between elements which (a) take a complement, (b) take a complement and a specifier, (c) take only a specifier, and (d) take neither a complement nor a specifier. They suggest that while these possibilities do not universally align with categories, there may be associations which predominate (e.g., they suggest that in English, the predominant realization of (a) is V, (b) is P, (c) is A, and (d) is N).

So: how do we think about "mismatches", that is, situations in which the wrong element somehow gets associated with projection realizations which clash with its properties? For example, Hale & Keyser treat so-called unergatives (e.g., laugh) as realized by monadic head-complement structures of the (a)-type above, as follows:

(307) [VP V0 [NP laugh]]

They suggest that the impossibility of such elements participating in the following transitivity alternations follows from the analysis above:

(308) a. The children laughed
      b. *The clown laughed the children (i.e., "the children laughed because of the clown")

They note that this property is shared by analytic expressions like make trouble:

(309) a. The cowboys made trouble
      b.
*The beer made the cowboys trouble (i.e., the cowboys made trouble because of the beer)

Thus the transitivity alternation impossibilities are understood to follow because these elements are, in their surface "object-less" form, in a sense already transitive. However, H&K assume that the relevant structures such as the one above for laugh do not strictly speaking exist at any level of syntactic structure. Rather, they assume that there is a process of conflation which happens as a "concomitant" of Chomsky's MERGE. That is, there are not two operations like (310)a, but rather simply MERGE and the consequences of MERGE (conflation) for items of this particular type, as in (310)b:

(310) a. V b. V laugh V ? N laugh MERGE(V,N) + CONFLATE(V,N)

(311) V laugh V ? ??N laugh MERGE&CONFLATE(V,N)

So, why isn't this process totally general? What stops impossible applications involving "nominal" elements that do not fall into the unergative class? Take the nominal element cigar; why shouldn't it manifest the properties that laugh exhibits in virtue of undergoing this kind of MERGE+CONFLATION?

(312) *John cigared (meaning perhaps: 'John had a cigar'; it doesn't matter for the present point)

This would be an instance of the kind of mismatch raised above. That is, generally, once we have made some of these distinctions between predicates a matter of structure, what is it that keeps the relevant elements in their appropriate bins? Alternative derivational routes of the sort explored above regarding passives and reflexive/reciprocal verbs are like the options of entering into a MERGE+CONFLATION derivation versus one with simply MERGE. What these alternative derivations do is to partition the space of structural possibilities given a set of distinctions provided as input. This is to say that "lexicon information is syntactically represented".
The job of syntactic theory is to provide a principled partition of this space that captures the relevant properties of the structural alternatives given the properties projected by given items. I will leave this to the side for now, observing that the suggested analysis raises rather large issues. I have offered a general direction for thinking about their solution, and that's about it.

However, there is one final further point raised above that requires discussion. Nothing in what I have said about these reflexive/reciprocal verbs distinguishes between the two classes. The relevant claims being made here are that (i) absence of ?-properties makes it so we do not have local obviation, so that the external and internal roles can become associated, and (ii) that the properties of passive derivations make it so the external role will be not ?-associated, but that since the internal role will be, these two roles will be distinct. This accounts for the similarities of reflexive/reciprocal verbs under passivization (why these inherent readings go away for both types of verbs).

But I have nothing further to add here about how the semantics works for these cases such that we can discriminate between the effects of this kind of external/internal argument connection for the two classes (i.e., the fact that the women kissed does not mean 'the women kissed themselves'). I do think that the present system makes promising distinctions which can be taken as a good basis upon which to build in this respect. I will leave this as an open question, taking the current account to provide part of the final solution (the part which captures the similarities, leaving the differences up to an unspecified semantic story for now). 108

We can, however, capitalize on the general logic deployed for reflexives to provide a schema for analyses of control phenomena.
Taking note of the collection of properties shared by obligatory control and reflexivization (Hornstein 2000), I suggest here that the same general notion of C-C identification and contraction is at work. That control predicates take C-complements is a widely held assumption in theories of control. One fairly clear source of evidence for this sort of analysis is based in the fact that, as pointed out in Landau (2000), control predicates, but never raising predicates, can be seen in many languages to manifest overt complementizers. Moreover, in languages which manifest this distinction overtly, those predicates which appear to be ambiguous between the control and raising types, when they manifest a complementizer, only manifest the control reading. So assuming these predicates to take C-complements, suppose that we say that these similarly manifest absence of an individuating ?-property that we have argued to be the driving factor in inherent reflexivization.

108 Juan Uriagereka points out cases like "they scratched" which seem to be vague between a reflexive, reciprocal, or an implicit (distinct) object reading. This is one of many cases to address. There is also the possibility of "making" a reciprocal verb by passivizing certain ditransitives. For example, He introduced John to Mary versus John was introduced, which with a plural subject (they were introduced) is ambiguous between an implicit object reading (i.e., they were introduced (to the audience)) or a reciprocal (each other) reading. The present suggestion for heading into these issues is to continue to explore the appeal to sameness/difference in the form of ?/? properties.

Let us begin with a familiar contrast (Rosenbaum 1967) between raising and control:

(313) a. Dave persuaded a doctor to examine Bill
      b. Dave expected a doctor to examine Bill

(314) a. Dave persuaded Bill to be examined by a doctor
      b.
Dave expected Bill to be examined by a doctor

These examples differ in the interpretative properties of the embedded infinitival clause, depending on active/passive voice. The b-cases are ECM/object-raising, and are basically synonymous. The a-cases, involving control, are not; they differ as to who is being persuaded (Bill or a doctor). Another familiar contrast involves idiom chunks, which raising (a-cases) but not control (b-cases) allows (i.e., to the extent the b-cases are ok they must not be idiomatic):

(315) a. The cat seemed to be out of the bag
      b. #The cat tried to be out of the bag

(316) a. John expected the cat to be out of the bag
      b. #John persuaded the cat to be out of the bag

As mentioned earlier, Hornstein (2000) argues for an approach which reduces control to raising/movement. Under this view the salient difference between the two sorts of construction is simply how many θ-roles are hanging around. Whereas in raising an element moves from a θ-position to a Case position, in control elements are understood to move from θ-to-θ, picking up "θ-features" as they go. Thus, the above contrasts are straightforwardly accounted for in terms of whether the idiom chunks, which cannot receive thematic roles without losing their idiomaticity, have to move through a position where getting a θ-feature is unavoidable. Similarly, the active/passive difference between object raising and object control shown above simply turns on whether the passivized NP ends up in a theta-position (control) or not (raising).

Manzini & Roussou (2000) develop an idea similar in spirit to Hornstein's raising/control reduction, though with a rather different technical implementation. They suggest that in both raising and control the "moved" element is rather simply inserted into its surface position, and from there it "attracts" θ-features (conceived as aspectual features).
The raising/control difference turns simply on whether a single θ-feature or more than one such feature is attracted to the "controller".

Let's consider some possible structures for some core cases. Consider first the difference between the object raising and the control manifestations of a verb like expect:

(317) a. John expected him to leave
      b. John expected to leave

The object raising version we expect to evoke the following structure (using a shorthand notation indicating the relevant T-T identification):

(318) C?T ?:f ?v ? ?C?T ?:g ?V?T?v ? D ?:n ?:f D ?:a ?:g

What about the control case? If control complements generally involve a C-layer, then we can view the extension of the ?-properties into these non-finite domains just as we did in the case of local reflexives above, which would look as follows for subject control. We could assume either of the following:

(319) C ?:f ?T ?:f ?v ? ?C ?:f ?T ?:f ?V?C ?:f ?T ?:f ?v ? D ?:n ?:f
(320) C ?:f ?T ?:f ?v ? ?C ?:f ?T ?:f ?v ? D ?:n ?:f

Both of these would result in the extension of the matrix subject domain so that it ends up in a local relation with the lower v ? . The two possibilities would differ in whether we would find reason to maintain the object-related C and T elements in the absence of either object-? or the ?-less ?/? properties argued to be present in object raising cases. I will not pursue this issue, though I cannot at present see a reason to maintain the more complicated structure.

On this view, note that we would be assuming that the subject element is itself not brought into a local relationship with the lower role, as it is with raising. Rather, only its ?-feature is. Consider in this connection the often noted lack of reconstruction effects in control (but not raising).

(321) a. Someone from New York is likely to win the lottery
      b. Someone from New York is eager to win the lottery

As noted in May (1985), the a- and b-cases above differ in whether they admit a reading with someone scoping low.
That is, the raising (a-) case is ambiguous between meaning that some particular person from New York is likely to win, versus a low scope reading paraphrasable as "It is likely that someone from New York will win the lottery", where it is clear that we do not have any particular person in mind. Importantly, the control (b-) case above has no such reading; it only exhibits the higher scope interpretation in which we have a particular person in mind who is eager to win. This raising/control difference follows on the assumption that computing scope requires the actual scope-bearing element to be in the relevant local domain. (However, these facts should be treated with caution as an argument for the present view; see our discussion above regarding Lasnik's arguments about specificity.)

Note that object control (e.g., John persuaded Mary to leave) now gets a parallel derivation to the one offered for object raising, only involving C-contraction as with the subject-control cases above. Consider:

(322) C ?:f ?T ?:f ?v ? ?C ?:g ?T ?:g ?V?T?v ? OBJECT RAISING D ?:a ?:f D ?:b ?:g
(323) C?T ?:f ?v ? ?C ?:g ?T ?:g ?V?C ?:g ?T ?:g ?v ? OBJECT CONTROL D ?:a ?:f D ?:b ?:g

Another possibility for the analysis of control in the present terms is the equivalent of a bare-VP complement:

(324) C?T ?:f ?v ? ?V?v ? ?.. D ?:n ?:f

Obligatory control verbs with gerundival complements might be of this sort.

(325) John tried [vP eating the pie]

Such cases would thus involve direct relationships between v's, somewhat akin to the suggested story at the beginning of this chapter for secondary predication (John arrived sad).

Note that of course what has been offered in this section is just a sketch. Nonetheless, two key points emerge that will be important to pursue further. First, there can be no straightforward direct control/raising assimilation in the present architecture. But, second, the issues are now perhaps a bit more subtle.
Given that we have reconceived movement in general as "agree-type" feature/category relationships, all localizable relations fall into this general bin in one way or another. The general appeal to agreement (?) properties sketched above for a potential account of control relationships is consistent with Landau's (2000) view, but as we do not recognize a separate "movement-type" of relation, it's not clear that raising and control aren't being brought closer together in terms of being subserved by the same general mechanisms (albeit in different ways). These issues need to be more carefully pursued within the TCG framework to see how things turn out, but the general format that the system makes available for analysis suggests that at least a partial control/raising unification may be feasible (so we may have a position intermediate between those advanced by Hornstein and Manzini/Roussou on the one hand, and views of the sort championed by Landau on the other). At any rate, I leave these matters for future investigation.

3.5. Clausal Unithood & Wh-Again,..

The discussion in the previous section is of course quite speculative. However, we noted above that positing an object-related T-element is not without precedent, nor is the general "split-VP"/"stacking" approach. Note as well that in a recent thesis, Butler (2004) argues for a general view of phase-hood with roughly the kind of iterated CP-structure that our view of contraction requires. In addition to the development of his own arguments, he points to a number of other places in the literature where similar kinds of assumptions have been shown to bear fruit in syntactic analysis.[109] Iterated clause-internal sub-structures of the kind I have in mind were also proposed by Demuth & Gruber (1994), who distinguish between Basic Projection Sequences (BPS's) and Lexical Projection Sequences (LPS's).
To avoid confusion with references to Chomsky's Bare Phrase Structure (also "BPS"), and to suggest the connection with the organization of what I earlier referred to as Core Licensing Properties (CLPs), let us refer to the sort of objects that Demuth & Gruber call Basic Projection Sequences instead as CORE PROJECTION SEQUENCES (CPSs), and, to make an explicit connection to Grimshaw's (1991, 2000) proposals, call the analogue of their LPS instead an EXTENDED PROJECTION SEQUENCE (EPS). Demuth & Gruber's proposals differ somewhat in detail from what I have proposed here (or what Butler proposes, for example), but the ideas are all very similar. On D&G's view their BPS's iterate to form LPS's. An LPS is simply a series of BPS's with a kind of ultimate lexical/thematic head at the bottom of the lowest BPS. We can illustrate the idea as it is relevant to what has been suggested here with reference to our proposed iterated CP-TP-v/VP structures as follows, using our new terminology for the relevant units (CPS/EPS):[110]

(326) [CP C0 [TP T0 [vP v0 [CP C0 [TP T0 [vP V0 ..]]]
      [Diagram labels: CPS over each of the two CP-TP-vP stretches; EPS under the whole sequence; "ULTIMATE" HEAD OF THE EPS marking the lowest head]

[109] Butler's articulation of phases is quite detailed, and motivated by connections to a particular view of the syntax-semantics interface relevant to understanding quantification, scope, and the like, building on ideas of Beghelli & Stowell (1997), Belletti (2001, 2003), Jayaseelan (2001) among others. I refer readers to Butler's thesis for further discussion and references.

EPS's then might be seen as a series of CPS's which bottom out in a major lexical category. It may be that the typical case is that an EPS is at most two CPS's (as in (326) above), though further issues not examined here may force us to conclude otherwise (the structure of ditransitives, causatives, and many other matters). Of course, we would like to have some idea of what makes a series of CPS's "hang together" to form an EPS.
There are a couple of things we might say on this score which require further investigation but which seem like the right sort of ideas. First, recall from our discussion of contraction and node-identification in §3.3.2 the general schema that we used to explain how features might stay "fixed" to positions in the output structure, despite being implicated in lower domains. What we require is perhaps some further property which could be seen to run through a series of CPS's, in a sense serving to hold them together. This would be something analogous to the way ?-properties have been suggested to "hold together" a series of distinct elements forming our CPS's. One plausible candidate, which we might or might not wish to view as "part of the syntax" in any direct way, is the set of variables and quantifiers associated with eventualities (the "e" variable casually referred to in earlier discussion). It seems plausible to say that something like Kratzer's (1996) event identification might serve to unify two such CPS structures into a single "EPS" (i.e., some way in which the event variables in the two separate domains are linked/identified).

[110] The reasons for my changing terminology are not just to avoid confusions with references to Chomsky's theory of phrase structure. Demuth & Gruber actually understand their BPS's/LPS's to bottom out in a thematic element, with the higher BPS's understood to be athematic, so there is only one of these "thematic" BPS's in a given LPS on their view. See their paper for discussion and interesting analysis of compound tenses in Bantu languages. Given the fit of this general idea with Split-VP ideas I will not be importing this aspect of their story. Here the lowest elements of both sub-units (now: CPS's), that is v and V, are both thematic elements. And, while v has sometimes been entertained as a member of the "functional" category inventory, I will here regard it as essentially functional (perhaps "semi-lexical").
And, just as there are properties marking the edges of CPS's, we might examine other properties that might serve to group our CPS's into larger units. Finiteness and Force (Rizzi 1997) might be such properties. Realized as categories, these could be elements which serve to mark off our larger stretches of structure equivalent to the traditional clause.

However these matters are pursued, I will close this chapter with reference to one last class of facts that our view of SCM does not predict unless we take the idea of recursive structure into the clause in the way suggested above. In particular I am referring to the Fox examples discussed in Chapter Two. Consider:

(327) a. ? [Which of the papers that he1 gave Mary2] did every student1 [vP ? [ask her2 to read * carefully?
      b. * [Which of the papers that he1 gave Mary2] did she2 [vP * [ask every student1 to revise * ?

Instead of moving to the 'edge' of vP, here we have a uniform approach of C-C contraction which serves to bring the wh-element into the local configurations that are necessary to account for the possible and impossible interpretations in these cases. However, recall as well from Chapter 2 the following cases (Legate 2000):

(328) a. ? [At which of the parties that he1 invited Mary2 to] was every man1 [vP ? [introduced to her2 *?
      b. * [At which of the parties that he1 invited Mary2 to] was she2 [vP * [introduced to every man1 *?
(329) a. ? [At which charity event that he1 brought Mary2 to] was every man1 [vP ? [sold to her2 *?
      b. * [At which charity event that he1 brought Mary2 to] was he2 [vP * [sold to every woman1 *?

These cases, as noted in our earlier discussion, are problematic for Chomsky's (1999) view of phases as just C and v, as they seem to involve passives.[111] These facts are on their face incompatible with the present view as well. Recall that our account of passives denied the presence of an object-related C-element to derive the "suppression" of v-?
(its splicing-out of the workspace in virtue of T-T contraction). However, there is another line of argumentation available to us given the general perspective of domains. Note that the cases above are passives of ditransitives. What we have claimed in terms of the stacking of independent thematic domains with independent functional structure (perhaps the extreme of the "stacking" view) ought perhaps to hold of indirect objects as well. What is required to capture the facts above is some position to which the wh-phrase must move which is below the subject (e.g., so the pronoun within it may be bound by every man as in the a-cases above) but above the indirect object (so as to avoid obviation in the a-cases above). Suppose that this position is the edge of the indirect object's domain, conceived uniformly with the subject and direct object cases. To get the facts, we need an outer C-type layer surrounding the prepositional phrase, i.e.:

(330) C?T?v?C?T?V?C??P?D
      [Domain labels, left to right: subj., direct obj., indirect obj.]

[111] Legate provides similar examples with unaccusatives, though to build the cases she requires a special sort of unaccusative that takes more than one internal argument. I'm uncertain about the classification of the verb she uses (see her paper for the cases), but should the argument turn out to be ok, I think the story I run in the main text for the passive case will carry over (should it turn out to be sustainable!).

However, an extra C-layer may not actually be required. What are prepositional elements anyway? In classical X-bar theoretic feature decompositions of categories they were typically regarded as negatively specified for both "n" and "v" properties. Interestingly, Pesetsky & Torrego (2004), in addition to positing an object-related T-element, also suggest the possibility that (at least some) prepositions may be a "type of T-element".
They hint at a connection between the elements in terms of the functions they play in the semantics of time and space, but the interesting point from the present perspective is the possibility of their belonging to a general type including T. Note as well that there is often discussion of prepositional/complementizer type relations with, for example, worries about whether prepositional-looking elements that introduce clausal structures (e.g., before John ate the pizza) are really of the P or C type (e.g., Lasnik & Saito 1991 argue for a C-type analysis of cases of this sort).

A number of interesting possibilities arise here that deserve more attention than I can devote to them, but let me make another general point. The implementation of the TCG ideas that has been pursued in the present chapter suggests a research strategy aimed at re-evaluating how we partition syntactic classes. The actual "labels" we have deployed in our discussion and analyses are classical ones, and so might naturally evoke some suspicion (which is not unreasonable, e.g., "there aren't any clause internal complementizers!").[112] But in the spirit of pointing a direction for such potential reclassifications, the present point is that it is not entirely crazy to think that C, T, and P might be fruitfully viewed as members of a larger class. At any rate, if the general line of thinking is on the right track, we might then discover classes of the "P-type" which would relate to C, T, or perhaps to both types via the node-identification mechanism developed here. We might find reason to attribute different features to this super-class to discriminate possible/impossible identifications along the lines that have been suggested for the subject/object situation and for cross-clausal relations above (e.g., two C-wh's cannot identify, etc.). For Legate's data above, if this is correct then it might be that the wh-element moves to the edge of the indirect-object domain (C-"P" identification).
This view would be interesting as well to explore with respect to passivization and ditransitives. However, these are topics for another day.

3.6. Conclusions: The Take-Home Message of TCG

Here are the core points to take home. First: the analyses that TCG supports with respect to core "classical" cases of SCM, in particular raising-to-subject and wh-movement of the typical clause-edge-to-edge variety, should be taken to be the central result. If nothing else about this dissertation is correct, the ability of the present system to provide a basic platform for understanding SCM without postulating what we called M-features is something that any further pursuit of this enterprise should attempt to maintain. The effect of this architecture is to reduce a wide-ranging general class of superficially non-local dependencies to local ones. In the local domains themselves, we see only independently motivated "core licensing properties" (CLPs) at work in establishing the key relationships.

[112] Though, again, see Butler (2004), Belletti (2001, 2003), Jayaseelan (2001).

What allows us to dispense with M-features (i.e., "the EPP") is the mechanics of node identification. Generally, the idea is that intermediate positions of the SCM sort exist only because of (i) the existence of core local-type relations holding in the matrix clause and (ii) the general fact that lower "like-elements" constitute informational supersets of matrix contexts. Crucial to executing this intuition is the WS/O-distinction, which allows us to separate out local computation of relationships from the resultant/derived output in a way that allows non-local relations in the output to be maintained within a local workspace.

What I have sketched in the present chapter is one possible implementation of a more general set of ideas.
However, I have suggested that some of the specifics yield an interesting story, both about some particulars and in the general form of the answer that is provided regarding clause-structure and dependency relationships.

We are now in a position to address as well an issue that I have left unaddressed throughout: the assumption that these derivations work "top-down". Numerous technical aspects of the presentation hinge on this assumption, but the general logic of the SCM story is where the distinction is clearest. The necessity of a top-down derivation goes hand-in-hand with the denial of M-features. If there are no such properties, then it is not possible in general to have an element associate directly with intermediate positions, nor is it possible to "move" to them. Note as well that the present approach offers a format for an eliminative agenda regarding EPP-type properties, in favor of a view where local licensing is handled in terms of CLPs; this distinguishes it from the views of TAG that we discussed in Chapter 2, where elementary tree-local movement is not generally understood in this way.

Needless to say this is just the setup for an analytical investigation into the wider range of cases, and cross-linguistic differences, that have been taken to motivate EPP-features and the like. What I have offered here is a start on what strikes me as the most serious challenge: getting rid of non-CLPs as motivators for intermediate movements. We could, presumably, attempt to motivate a "bottom-up" view along one of the lines mentioned in Chapter 2 (e.g., non-feature-driven movement to avoid crashing the derivation), with a wh-element starting in a base position, but this bottom/embedded domain will not itself be a "phase" on the node-identification and contraction view, as the relevant "like element" won't arrive until the top of the next highest clause.
Moreover, other like elements (e.g., the V selecting the embedded clause) will arise first, and in virtue of the anti-recursion restriction on the workspace, will force a splicing out of the intervening material. This could be taken to motivate movement directly to a superordinate VP-adjoined type position, skipping a lower C, but this would seem to be contrary to most of the empirical evidence reviewed in Chapter 2. Also, this would be a mixed-view system of displacement, which would involve both standard movement and the node-identification mechanism.[113] So, it turns out on the present view that top-down structure expansion and eliminating M-features go hand-in-hand.

Returning to TAG approaches, one might ask whether the adjoining/substitution mechanics could be put to work in ways similar to what has been developed here in appealing to the reduced Brody-type structures that collapse intra-phrasal projection-level distinctions. This seems possible, though I have not investigated the matter.

The general outlook here on the relationship between TCG, TAG, and the MSO-type systems discussed in Chapter One is that they constitute a family of closely related approaches. The introduction of the WS/O-distinction at the outset of this work is designed to form a general background context within which various aspects of the different approaches might be mixed/matched and then tested against the facts of human language. What I have offered here is an outline of one such approach.

And there are numerous issues whose surface has not even been scratched. In order to concentrate on the key properties of interest here regarding recursion, the node-identification view of lowering/copying, and the like, issues regarding head-movement and modification have been completely avoided. This is a serious omission, and should be one of the first areas to be developed in any continued thinking on this general approach.
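The interplay between top-down expansion and the anti-recursion restriction on the workspace can be pictured with a deliberately crude sketch. The Python below is an illustration only and not part of the TCG formalism: the function name `derive_top_down`, the use of bare category labels, and the particular contraction rule (identify a repeated label with its earlier occurrence and drop the material that intervened) are all simplifying assumptions. Features, conditions on which like elements may identify, and the different kinds of contraction are ignored.

```python
# Toy model only: the labels, the contraction rule, and every name here are
# illustrative assumptions, not the dissertation's formal definitions.

def derive_top_down(dominance_sequence):
    """Expand a workspace top-down, contracting whenever a label repeats.

    Returns (output, snapshots). `output` keeps every node, so recursive
    structure is retained there; each snapshot is the non-recursive
    workspace after one expansion step.
    """
    output = []      # global output structure: nothing is ever removed
    workspace = []   # active span: never holds the same label twice
    snapshots = []
    for label in dominance_sequence:
        output.append(label)
        if label in workspace:
            # Repeated element: identify the new node with the earlier like
            # node and splice the intervening material out of the workspace.
            cut = workspace.index(label)
            workspace = workspace[:cut]
        workspace.append(label)
        snapshots.append(list(workspace))
    return output, snapshots


if __name__ == "__main__":
    out, snaps = derive_top_down(["C", "T", "v", "C", "T", "V"])
    print("output:", out)
    for step, ws in enumerate(snaps, start=1):
        print(step, ws)
```

On the iterated sequence C T v C T V, the workspace shrinks back to the single node C when the second C is encountered, while the output retains all six nodes. This is the sense in which the workspace is superimposed on only part of the output at any stage.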
[113] Note that we did suggest something close to such a mixed view in our discussion of A' to A-position movement earlier in the chapter, but the details were developed to bring this case within the general logic of identifying properties on the dominance sequence as a way of deriving copying of elements in the output.

3.7. Closing

I wish to close with some general points and a few open questions. First, consider again our earlier discussion regarding chain structures, where we suggested that Chomsky's (1995) "technical options" (including our third previously undiscussed option) regarding chains are not in fact different possible ways of talking about the same thing, but rather simply different things. In our dominance-encoded feature-relations, I suggested that the following three schemas are actually different chain types:

(331) [Three chain schemas, left to right: Binding/Control (connections between A-relations); A'-relations; Some A-relations]

Simple A-relations were understood to involve two or three node connections, linking up a ?-marked nominal with ? by means of ?-properties. These sequences, I suggested, might themselves be linkable, via node-identifications involving the top-most members of each such sequence. This was the basic shape that was suggested for local reflexives (perhaps also inherently reflexive verbs), and was offered as a schema for the structure of control relations as well. This is the picture of chains that we see in the left-most schema in (331). Local A'-relations, it was suggested, are best understood as local WH-? relationships, which are themselves connected to ? via ?-relationships. This is the picture in the middle schema in (331). Finally, the rightmost schema above manifests the structure I have attributed to (e.g.) passives. There a T-T contraction was argued to splice out intervening v, leaving it in the workspace unassociated with any other element (this was suggested to be the present but unexpressed external role). The picture then is of a lower ?/?-?
relationship which is connected to a higher position (e.g., nominative T).

There is a central idea in play here that our discussion has not adequately touched upon. The mechanics we have been working with assume a single formal ordering dimension, one that admits of branching, along which feature-licensing relationships are characterized. The ideas just discussed regarding different "constituency structures" for chain relationships have been key.

This is just a sketch. Putting the system to work in more in-depth and rigorous analysis is what is now required. What the present work has accomplished is to set the stage for a novel type of approach that I have argued has the right general structure to provide principled accounts of SCM phenomena at the least, and perhaps has consequences for other concerns.

In general, I wish to stress again here that I believe it is best to view the present approach along with the others that have been discussed alongside it (TAG and MP/MSO approaches) as a family of closely related ideas. The efforts here have brought out some differences between these approaches, but the hope is that they have also been brought somewhat closer together.

Consider again both the Chamorro agreement facts and the S-V inversion cases from Spanish discussed in Chapter 2:

(332) ¿Qué pensaba Juan [que le había dicho Pedro [que había publicado la revista]]?
      what thought Juan that him had told Peter that had published the journal
      'What did John think that Peter had told him that the journal had published?'

(333) Hafa sinangani-n Juan as Dolores [t ni minalago'ña [t pära un-taitai t]?
      WHAT? WH[OBJ2].tell Juan OBL Dolores COMP WH[OBL].want-AGR FUT WH[OBJ].AGR-read
      'What did Juan tell Dolores that he wants you to read?'

I noted in Chapter Two that the Spanish facts at least have been subject to some controversy. In this connection I mentioned the work of Baković (1995), who documents the following dialect variation:

(334) a. No inversion with any wh-phrases (Suñer 1994) b.
Inversion with argument wh-phrases only (Torrego 1984; Suñer 1994)
      c. Inversion with all but reason wh-phrases (por qué/"why") (Goodall 1991a,b)
      d. Inversion with all wh-phrases in matrix clauses; all but reason wh-phrases in subordinate clauses (Baković's survey)
      e. Inversion with all but reason wh-phrases in matrix clauses; only argument wh-phrases in subordinate clauses (Baković's survey)
      f. Inversion with argument wh-phrases in matrix clauses; no inversion in subordinate clauses (Baković's survey)

The general conclusion of Baković's research into these matters is that there is a scale which to a first approximation tracks a hierarchy of wh-elements, ordered on a more-to-less "referential" (or perhaps "argumental") scale. Dialect variation appears to be systematic if Baković is right. First, it is possible to have a dialect with no inversion at all. However, if there is inversion and if it is allowed with wh-elements at position X in a more-to-less referential/argumental continuum, then it is allowed with all the others higher in the hierarchy. Moreover, there appears to be a "subset" relationship, with respect to the matrix/subordinate distinction, such that embedded clause inversion possibilities with respect to this hierarchy of wh-elements are always a subset of what is possible in the matrix.

Interestingly, Chung (1994) reports that the local agreement facts in Chamorro are optional for referential wh-elements, taking the relevant classes of elements to be those picked out in the work of Cinque (1991) (see also Pesetsky 1987 on "D-linking"). What sort of mechanics should be deployed to account for their syntactic properties? Pesetsky (1987) offers a story which relies on a division between "regular" wh-movement and a kind of unselective binding by a matrix Q-morpheme, of the sort offered in the work of Baker (1979), to provide a grounding for the D-linked/non-D-linked distinction. Suppose something like this is correct.
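Returning briefly to Baković's generalization, its two implicational claims can be stated compactly as a check over the dialects in (334): the wh-classes that allow inversion must form a top segment of the hierarchy, and the subordinate-clause possibilities must be a subset of the matrix ones. The sketch below is illustrative only. The three-way scale, the label "other" for non-argument, non-reason wh-phrases, and the treatment of (334b) and (334c) as behaving alike in matrix and subordinate clauses are assumptions of the sketch, not claims taken from the survey.

```python
# Illustrative encoding; the class labels and dialect table are assumptions.
HIERARCHY = ("argument", "other", "reason")  # more-to-less referential

NONE = set()
ARG = {"argument"}
NON_REASON = {"argument", "other"}
ALL = set(HIERARCHY)

# Wh-classes that allow inversion, per clause type, for the dialects in (334).
DIALECTS = {
    "334a": {"matrix": NONE, "subordinate": NONE},
    "334b": {"matrix": ARG, "subordinate": ARG},
    "334c": {"matrix": NON_REASON, "subordinate": NON_REASON},
    "334d": {"matrix": ALL, "subordinate": NON_REASON},
    "334e": {"matrix": NON_REASON, "subordinate": ARG},
    "334f": {"matrix": ARG, "subordinate": NONE},
}


def is_top_segment(classes):
    """True if `classes` is exactly the first k members of HIERARCHY."""
    return set(classes) == set(HIERARCHY[:len(classes)])


def is_systematic(dialect):
    """Check both generalizations: top segments, and subordinate within matrix."""
    matrix, subordinate = dialect["matrix"], dialect["subordinate"]
    return (is_top_segment(matrix)
            and is_top_segment(subordinate)
            and set(subordinate) <= set(matrix))


if __name__ == "__main__":
    for name, dialect in DIALECTS.items():
        print(name, is_systematic(dialect))
```

A hypothetical dialect with inversion for argument wh-phrases in matrix clauses but for all non-reason wh-phrases in subordinate clauses would fail the check, which is the kind of pattern the subset generalization excludes.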
There are then two types of relationships that in principle can account for the connection between a wh-element and potentially distant (embedded) thematic information: a local SCM mechanism, and a potentially long-distance kind of (semantic?) relation. But why should this be? Why should there be two? If it can happen long-distance, why not always that way? Or why not always linked-local? Note that our WS/O-distinction offers a reasonable place to hang this difference: we could understand unselective binding to be a semantic relation (perhaps of the linking sort we discussed for logophors earlier) holding over output structures, and also have the edge-to-edge linked-local style relation as mediated by the syntactic workspace.

My suspicion is that the right way to approach these issues should involve a close examination of the learnability of the distinctions. In general our view here has been of the narrow syntactic computation as itself constituting the interface between the lexicon and a PF/LF output structure. The learnability of core local relations ought to fall within a correspondingly local view of where learners find the information needed to acquire grammar (something along the lines of Lightfoot's Degree-Zero "plus a little"; Lightfoot 1989). If the ideas here regarding node-identification and contraction are generally on track, learners needn't have access to anything more than roughly (traditional) clause-sized objects to acquire the relevant distinctions that pertain as well to linked-local/SCM-type relationships, as these, I have argued, fall out from the basic mechanics.

But there is still a serious problem to be faced, one with two facets: (i) it's just not true that all dependencies are reducible to local domains (that this is the case is what motivates views like that introduced by Pesetsky (1987), as mentioned above), and (ii) it's just not true that matrix level generalizations carry over to embedded domains.
Regarding (ii), what seems to be the case is something like Ross's (1973) "Penthouse Principle". Ross offered the following metaphor "whose truth is borne out in myriad cases of Real Apartment Life":

(335) The Penthouse Principle: More goes on upstairs than downstairs.

Life, it seems, is always more exciting in the Penthouse; anything happening downstairs is sure to be a trendy copy of things that have already been done upstairs. So that we needn't strain our capacity for decoding metaphors, Ross translates into terms more linguistic:

(336) No syntactic process can apply only in subordinate clauses.

This seems to be borne out by many of the SCM-effects discussed in Chapter Two. Inversions, wh-copying, and local agreement all seem to be required "at the top" if they hold in embedded domains, and where they hold in embedded domains, they must hold "all the way down" (typically no domains may be skipped). These matters strike me as important to investigations concerned with Degree-N learnability and may be a route that might help us to better understand SCM-type effects. Recall that wh-copying shows up as well in L1-acquisition of English (see §2), which is a case where the adult/target grammar does not generally permit such copying. This suggests that learners are not necessarily conservative in the sense of waiting for positive evidence to alter their grammars to allow multiple PF-spell-outs of this kind. And if children do not have access to direct negative evidence that these constructions are not part of the target grammar, then something else must be in play to allow them to converge on the correct target. The following questions then can frame further inquiry into these matters:

(337) a. Is the Penthouse Principle true?
      b. If it is true, why? In virtue of what?
      c. How does whatever underlies this principle affect the presence/absence of SCM-type effects?
Part of the answer to questions a/b, I believe, lies in how we understand the domains that learners have access to in order to find evidence to sort out where their target language lies in the UG-governed array of possibilities. Question c is then framed in our TCG approach (deploying the WS/O-distinction) in terms of how the mechanisms of unselective binding or the like arise, making truly long(er)-distance relationships possible. I will leave these matters here, noting in closing that the general structure of our account makes room for three kinds of dependencies: (i) those that are purely local, (ii) those that are linked-local, and (iii) those that are non-local. Further, I have suggested here a way that natural language grammars might reduce the (ii)-type to the (i)-type. How the (iii)-type fits in to this view is a job for future inquiry.

Last, consider some remarks from Fodor (1977) regarding grammars and derivational directionality, which will allow us to sum up some of the key ideas discussed in this dissertation in a more general way:

    We might suppose that we could isolate the issue of directionality by comparing two (imaginary) grammars, G1 and G2, which are identical except that the rules of G1 are the inverses of the rules of G2 (i.e., the input to each rule of G1 is the output of the corresponding rule of G2, and vice-versa), and the order of application of the rules in G1 is the inverse of the order of application of the corresponding rules in G2. The set of structural representations constituting the derivation of a given sentence would be identical in both grammars, but these structures would be generated in reverse order. But now notice that where G1 has a deletion rule, the corresponding rule in G2 will be an insertion rule; where G1 has a rule moving a constituent to the left, the corresponding rule of G2 will move that constituent to the right.
    G1                                       Derivation    G2
                                             abcd
    step 1: d is moved to the left of bc                   step 2: d is moved to the right of bc
                                             adbc
    step 2: a deletes before d                             step 1: a is inserted before d
                                             dbc

    The difference in direction is inevitably accompanied by a difference in the operations that particular rules perform. [We might] consider the possibility of constraining the rules of a grammar so that they can perform certain types of operations but not others. We might then be able to decide between the grammars G1 and G2 on this basis. But let us temporarily abstract from this issue by supposing that the rules of G1 and G2 all conform to the definition of a possible rule of grammar. Could there, nevertheless, be some reason for preferring either G1 or G2? The consensus of opinion, even among those who agree about almost nothing else, appears to be that there would be no significant difference between the two grammars, as long as the old confusion of grammars with psychological models of speech production and perception is avoided.

The important bit in this stretch of passage is the observation that "difference in direction is inevitably accompanied by a difference in the operations that particular rules perform". It is such a difference between bottom-up and left-to-right/incremental assembly that Phillips (1996, 2003) exploits, in that the incremental system seems to allow us to make reference to "units" that we need for analysis but which are unavailable in a bottom-up characterization. We can excise the following main issues from Fodor's passage:

(338) INDEPENDENCE: considering ordering of operations in grammar and performance-theoretic systems as independent conceptually, what considerations might lead us to consider one or another possible view of the ordering of combinatory operations in the grammar?
(339) CORRESPONDENCE: suppose we were to have strong reasons for thinking one or another global ordering for syntactic structure assembly was correct, how ought we think about correspondence relationships to operations in parsing and production? How ought we think about GRAMMAR as embedded in "time"?

The sub-part of the passage from Fodor (1977) above contrasting two toy grammars G1 and G2 ends with the fairly reasonable assertion that it is difficult to see how we could find reason to think there was any real difference between such grammars. Setting to the side for the moment the present work and the work of Phillips and others, the state of affairs regarding such questions about directionality in more recent theory is even a bit more difficult, if anything, than it was at the time of Fodor's writing. Following the above-quoted discussion, Fodor goes on to contrast two ways of understanding a rule like wh-movement implicated in (340) and (343), one including a rule of "wh-fronting" and another including a rule of "wh-backing". The latter rule would map (342) to (341) and (345) to (344), while the former would do the reverse:

(340) Who do you expect to murder Jemia?
(341) Q You Pres expect [WH+pro murder Jemia]          WH-FRONTING
(342) Q WH+pro you Pres expect [murder Jemia]          WH-BACKING
(343) Who do you expect to murder?
(344) Q You Pres expect [PRO murder WH+pro]            WH-FRONTING
(345) Q WH+pro you Pres expect [PRO murder]            WH-BACKING

If direction of derivation does not fundamentally matter then there should be no non-trivial differences between these derivations. Of course, any present-day comparison between two such visions of transformational operations differs quite a bit from the theoretical situation that obtained in the days before the assumption of structure preservation (Kimball 1972, Emonds 1985).[14] Without structure preservation the rule of wh-backing appears to require a bit more in the way of additional assumptions than wh-fronting does.[15] That is, without traces or some kind of marker or variable or the like designating the internal position to which wh-backing would displace the wh-element, some additional mechanism would be required to filter out misapplications. In contrast, the rule of wh-fronting has a salient target: the "edge" of the sentence.[16] Of course, in Fodor's example things are rigged in favor of wh-fronting given the presence of an abstract Q-morpheme (a "trigger" morpheme[17]), but with no corresponding marker for the base (trace/copy or thematic) position. If we embrace structure preservation, then our contrast between wh-fronting and wh-backing looks like this, where we simply switch the direction of the arrows on our "movement" notation (assuming copies rather than traces for the moment):

(340) Who do you expect to murder Jemia?
(341)' Q WH+pro you Pres expect [WH+pro murder Jemia]   wh-backing
(342)' Q WH+pro you Pres expect [WH+pro murder Jemia]   wh-fronting
(343) Who do you expect to murder?
(344)' Q WH+pro you Pres expect [PRO murder WH+pro]     wh-backing
(345)' Q WH+pro you Pres expect [PRO murder WH+pro]     wh-fronting

[14] Kimball was, to my knowledge, the first to observe the interest of a restriction stated in terms of the kinds of objects producible in principle by phrase-structure rules and the kinds of objects producible by transformational operations. The idea of structure preservation really caught on, however, following the work of Emonds (1970, 1985), who used this as a general restriction which enabled him to differentiate between structure-preserving and non-structure-preserving operations (e.g., his root transformations were of the latter type). The development of this notion with respect to traces/copies put this notion on even more solid ground, though it fell somewhat into the background as a general kind of constraint on operations and the structure of the grammar. See Newmeyer (1986) for a good discussion of this history.

[15] Fodor observes this, but includes no discussion of "structure-preservation" in this context.

[16] That is, even in absence of "trigger morphemes" or a designated landing site for such transformations, wh-fronting appears to require less information to apply correctly. This is all assuming, of course, that the transformational approach to these matters is the right way to go. There exist context-free grammars (e.g., GPSG) which incorporate rules with complex ("slash") symbols which make the kind of asymmetry Fodor is pointing to irrelevant. Such grammars, like the context-free rules we examined above, can operate trivially either from terminals to "S" or from "S" to terminals. Whether or not the kind of asymmetries Fodor is pointing to here still can be found in more recent models of syntax is one way of stating the major theme of this work.

[17] Or, for more recent versions of the "trigger morpheme" idea, consult almost any recent article which has the words "minimalism" or "derivation" or "checking" in the abstract key-words line.

In more modern terms, we have a difference between an operation which raises the wh-element, leaving a null copy or a trace, versus an operation which lowers a null copy/trace. Or, more neutrally, we have some minimal formal way of establishing a dependency between two positions in a structure, and attendant (interpretative?) processes which sort out where to pronounce and interpret what. Given structure preservation, in other words, the issue of derivational directionality becomes quite a bit foggier than it was at the time Fodor's book was published (27 years ago).

Consider some further remarks which Fodor makes on wh-backing, which reveal the present point about structure preservation nicely (despite including no explicit mention of structure preservation as such; bold emphasis mine):[18]

    WH-backing knows which noun phrase to move, but how could it know where to move it?
    What indicates that there is an appropriate gap for the interrogative pronoun to move into at the end of [(343)] but not at the end of [(340)]? A gap, after all, is just a nothing. Two words are adjacent that otherwise would not have been. The information that determines where there is a gap, and which gap has to be filled by the WH-Backing transformation, is information about the deep structures of these sentences, and about other transformations that do and do not apply in their derivations. [...] for the WH-Backing transformation to apply correctly, it would need information about structures in the derivation of a sentence which are 'deeper' than the one on which it operates, i.e. structures which are generated only AFTER WH-Backing itself has applied. By contrast, the standard WH-Fronting rule is self-sufficient; it can apply correctly without 'looking ahead' to later stages of the derivation. The reason for this difference is that WH-Fronting parallels, while WH-Backing opposes, the direction of flow of information between structures like [(341)] and [(342)]. Before WH-Fronting applies, the position of the interrogative pronoun indicates its syntactic and semantic role in the sentence. But this information is lost when all interrogative pronouns are moved into the same position at the front of the sentence. Thus, [(341)] contains more information (in this respect) than [(342)]; [(341)] contains enough information to determine [(342)], but [(342)] does not contain enough to determine [(341)]. Other transformations (e.g., Passive, Particle Movement) apparently involve no loss of information and hence determine a unique output when applied in either direction. An asymmetry in information content between two adjacent structural representations in a derivation thus gives some content to the notion of the direction of the rule (pp. 10-12).

[18] This passage also raises questions about "look-ahead" which have become relevant in much current derivational syntactic theory.
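The contrast Fodor draws in the passages quoted above can be made concrete with a small sketch. The Python fragment below is an illustration only, under simplifying assumptions that are mine rather than Fodor's: structures are flat strings or token lists, "WH" stands in for WH+pro, no traces or copies are present (that is, there is no structure preservation), and all function names (`derive`, `wh_front`, `wh_back_candidates`, and the four rule functions) are hypothetical. The first part replays the G1/G2 toy derivation, where every rule has a determinate inverse; the second shows that fronting has a single fixed target, while backing, given only the fronted structure, has several candidate outputs.

```python
# Toy sketch (illustrative assumptions only): flat strings / token lists,
# no traces or copies, hypothetical function names.

# --- Part 1: the inverse toy grammars G1 and G2 ---
def move_d_left(s):    # G1 step 1: d is moved to the left of bc
    return s.replace("bcd", "dbc", 1)

def delete_a(s):       # G1 step 2: a deletes before d
    return s.replace("ad", "d", 1)

def insert_a(s):       # G2 step 1: a is inserted before d
    return s.replace("d", "ad", 1)

def move_d_right(s):   # G2 step 2: d is moved to the right of bc
    return s.replace("dbc", "bcd", 1)

G1 = [move_d_left, delete_a]
G2 = [insert_a, move_d_right]

def derive(start, rules):
    """Apply the rules in order; return every structure in the derivation."""
    steps = [start]
    for rule in rules:
        steps.append(rule(steps[-1]))
    return steps

# --- Part 2: wh-fronting versus wh-backing without traces ---
BASE_341 = ["Q", "you", "Pres", "expect", "[", "WH", "murder", "Jemia", "]"]
BASE_344 = ["Q", "you", "Pres", "expect", "[", "PRO", "murder", "WH", "]"]

def wh_front(tokens):
    """Move WH to the slot right after Q; the target is always the same."""
    rest = [t for t in tokens if t != "WH"]
    return [rest[0], "WH"] + rest[1:]

def wh_back_candidates(tokens):
    """List every structure obtained by re-placing WH somewhere after Q.

    With no trace marking the gap, the input does not single out one of them.
    """
    rest = [t for t in tokens if t != "WH"]
    return [rest[:i] + ["WH"] + rest[i:] for i in range(1, len(rest) + 1)]

if __name__ == "__main__":
    print(derive("abcd", G1))
    print(derive("dbc", G2))
    fronted = wh_front(BASE_341)
    print(" ".join(fronted))
    print(len(wh_back_candidates(fronted)), "candidate outputs for wh-backing")
```

On this encoding the G2 derivation is the G1 derivation run in reverse, whereas `wh_back_candidates` returns more than one structure, with the base structure (341) only one among them. That is the informational asymmetry in miniature; the sketch says nothing about the structure-preserving variants, in which both positions are already present.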
With the notion of structure preservation in place, the informational asymmetry Fodor points to with respect to wh-fronting versus wh-backing no longer holds, or at least not obviously. But these aspects of Fodor's discussion regarding informational asymmetries serve to bring some issues into the foreground rather clearly for our purposes here, even if they rely on now outdated assumptions. Note the part of the above passage regarding the issue of wh-backing "opposing" the "flow of information" and the matter of information loss in derivations. These issues become relevant, though in a different way, in the context of current derivational minimalist syntactic theory if we consider the notions of "phases" and the like within MSO-systems, which exhibit what we have called 'expand/contract dynamics'.

So we now have encountered a possible motivation, albeit quite general and abstract, for pursuit of one or another global directionality for syntactic derivations: the potential existence of informational asymmetries. In this work we have built an approach to grammar which capitalizes on such asymmetries, suggesting that what underlies SCM-type effects is an informational superset relation (subsumption) which holds between positions hosting "intermediate" positions and matrix positions where core licensing properties are related. This is thus an argument for a particular direction of derivation that focuses on purely competence-theoretic issues.

What of the matters of correspondence then, that is, the relation between grammar and parser? I have not addressed matters of performance in this work at all, but to wind up this closing discussion, at least the following two points are of interest for further pursuit in the present framework. First, as mentioned in Chapter One, the TCG mechanics make available a grammar-based conception of how "displaced" elements may be buffered in a sense (kept in the workspace) so that they may be integrated in some lower domain.
But our above discussion regarding the possibility of other mechanisms that handle such dependencies differently raises the question of how these two different sets of mechanisms (one that is WS-local, and one that functions over the output structure) may or may not interact in on-line processing. Second, the top-down perspective, as it has been wed here with our workspace ordering and distinctness constraints, suggests some ways of beginning to think about how categories/features and ordering information might be mapped to "time". Chomsky's term "phases" may turn out to be particularly apt in the sense that we might investigate ways in which categories/features could be understood to have a duration, that is, a time-course in which these properties are "active" in on-line processing. The general view of categories/features and ordering pursued here suggests that sameness/difference may matter for determining an abstract sort of "chain constituency" that is important to understanding local structures and how they may overlap (or not). This may be potentially translatable onto a time-axis in a way that preserves the informational groupings that same/different properties have been suggested to effect along the dominance ordering. This is a fairly general, somewhat vague suggestion, to be sure, but I believe something along these lines may be the key to understanding the various ways that we might understand grammar as relating to time.

References

Abels, K. 2003. Successive Cyclicity, Anti-Locality, and Adposition Stranding. Doctoral dissertation, University of Connecticut, Storrs.
Abney, S. & M. Johnson. 1991. Memory Requirements and Local Ambiguities for Parsing Strategies. Journal of Psycholinguistic Research 20(3):233-250.
Adger, D. & G. Ramchand. 2003. Merge vs. Move: Wh-Dependencies Revisited. Ms. University of London.
Aoun, J. 1985. The Grammar of Anaphora. Cambridge, MA: MIT Press.
Asudeh, A. 2004. Resumption as Resource Management. Doctoral dissertation. Stanford University.
Bach, E. 1986. The Algebra of Events. Linguistics & Philosophy 9:5-16.
Baker, C. L. 1970. Notes on the Description of English Questions: The Role of an Abstract Question Morpheme. Foundations of Language 6, 197-219.
Baker, M., K. Johnson, & I. Roberts. 1987. Passive Arguments Raised. Linguistic Inquiry 20: 219-251.
Baković, E. 1995. A Markedness Subhierarchy in Syntax: Optimality and Inversion in Spanish. Ms. Rutgers University.
Barss, A. 2001. Syntactic Reconstruction Effects. In C. Collins & M. Baltin (eds.) The Handbook of Contemporary Syntactic Theory, Oxford: Blackwell, 670-696.
Beghelli, F. & T. Stowell. 1997. Distributivity and Negation: the syntax of each and every. In A. Szabolcsi (ed.) Ways of Scope Taking, Dordrecht: Kluwer, 71-107.
Belletti, A. 1988. The Case of Unaccusatives. Linguistic Inquiry 19:1-34.
Belletti, A. 2001a. Agreement Projections. In C. Collins & M. Baltin (eds.) The Handbook of Contemporary Syntactic Theory, Oxford: Blackwell, 483-510.
Belletti, A. 2001b. 'Inversion' as Focalization. In A. Hulk & J.-Y. Pollock (eds.) Subject Inversion in Romance and the Theory of Universal Grammar, Oxford: Oxford University Press, 60-90.
Belletti, A. 2003. Aspects of the low IP area. In Luigi Rizzi (ed.) The Structure of IP and CP: The Cartography of Syntactic Structures, Oxford: Oxford University Press.
Benmamoun, A. 1998. Spec-Head Agreement & Overt Case in Arabic. In D. Adger et al. (eds.) Specifiers: Minimalist Approaches, Oxford: Oxford University Press, 110-125.
Berwick, R. 1985. The Acquisition of Syntactic Knowledge. Cambridge, MA: MIT Press.
Berwick, R. & A. Weinberg. 1985. The Grammatical Basis of Linguistic Performance. Cambridge, MA: MIT Press.
Bittner, M. 1994. Case, Scope, & Binding. Dordrecht: Kluwer.
Bittner, M. & K. Hale. 1996. The Structural Determination of Case and Agreement. Linguistic Inquiry 27(1), 1-68.
Bobaljik, J. 1995. Morphosyntax: The Syntax of Verbal Inflection. Doctoral dissertation. MIT.
Bobaljik, J. 2002. Floated Quantifiers: Handle with Care. In L. Cheng & R. Sybesma (eds.) The Second State of the Article Book. Berlin: Mouton de Gruyter.
Boeckx, C. 2000. EPP Eliminated. Ms. University of Connecticut.
Boeckx, C. 2003. Islands & Chains. Amsterdam: John Benjamins.
Boeckx, C. & N. Hornstein. 2003. Reply to 'Control is not Movement'. Linguistic Inquiry 34/2: 269-280.
Boeckx, C. & N. Hornstein. 2004. Movement under Control. Linguistic Inquiry 35/3: 431-452.
Bošković, Z. 2002. A-Movement and the EPP. Syntax 5: 167-218.
Bresnan, J. 1971. Contraction and the Transformational Cycle. Ms. MIT, Cambridge, MA.
Brody, M. 2000. Mirror Theory: Syntactic Representation in Perfect Syntax. Linguistic Inquiry 31(1) 29-56.
Brody, M. 2003. Towards an Elegant Syntax. London: Routledge.
Butler, J. 2004a. Phase Structure, Phrase Structure, & Quantification. Doctoral dissertation, University of York.
Butler, J. 2004b. On Having Arguments and Agreeing: A Semantic EPP. York Papers in Linguistics 1, 1-27.
Castillo, J. C., J. Drury, & K. K. Grohmann. Localizing Procrastinate. Ms. University of Maryland, College Park.
Castillo, J. C., J. Drury, & K. K. Grohmann. 1999. Merge over Move and the Extended Projection Principle. In S. Aoshima, J. Drury, & T. Neuvonen (eds.) University of Maryland Working Papers in Linguistics 8, 63-103.
Castillo, J. C. & J. Uriagereka. 2000. A Note on Successive Cyclicity. In M. Guimarães, L. Meroni, C. Rodrigues, & I. San Martin (eds.) University of Maryland Working Papers in Linguistics 9, 1-13.
Chametzky, R. 1996. A Theory of Phrase Markers and the Extended Base. Albany: SUNY Press.
Chametzky, R. 2000. Phrase Structure: From GB to Minimalism. Oxford: Blackwell.
Chomsky, N. 1955/1975. The Logical Structure of Linguistic Theory. New York: Plenum.
Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, N. 1973. Conditions on Transformations. In S. Anderson & P. Kiparsky (eds.) A Festschrift for Morris Halle, New York: Holt, Rinehart, & Winston, 232-286.
Chomsky, N. 1980. Rules and Representations. New York: Columbia University Press.
Chomsky, N. 1981. Lectures on Government & Binding. Dordrecht: Foris.
Chomsky, N. 1982. Some Concepts and Consequences of the Theory of Government and Binding. Cambridge, MA: MIT Press.
Chomsky, N. 1986a. Barriers. Cambridge, MA: MIT Press.
Chomsky, N. 1986b. Knowledge of Language: Its Nature, Origin, & Use. New York: Praeger.
Chomsky, N. 1993. A Minimalist Program for Linguistic Theory. In K. Hale & S. J. Keyser (eds.) The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, Cambridge, MA: MIT Press, 1-52.
Chomsky, N. 1994. Bare Phrase Structure. MIT Occasional Papers in Linguistics, 5.
Chomsky, N. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N. 1998. Minimalist Inquiries: The Framework. MIT Working Papers in Linguistics.
Chomsky, N. 1999. Derivation by Phase. Ms., MIT.
Clements, 1975. The Logophoric Pronoun in Ewe: Its Role in Discourse. Journal of West African Languages 10: 141-177.
Collins, C. 1997. Local Economy. Cambridge, MA: MIT Press.
Collins, C. 2000. Eliminating Labels. Ms. Cornell University.
Davidson, D. 1967. The Logical Form of Action Sentences. In N. Rescher (ed.) The Logic of Decision and Action. U. of Pittsburgh Press.
Demuth, K. & J. Gruber. 1994. Constraining XP Sequences. Ms. Brown University and UQAM.
De Villiers, J., T. Roeper, and A. Vainikka. 1990. The acquisition of long-distance rules. In L. Frazier and J. de Villiers (eds.) Language Processing and Language Acquisition, Dordrecht: Kluwer, 257-297.
den Dikken, M. & A. Szabolcsi. 2002. Islands. In L. Cheng & R. Sybesma (eds.) The Second State of the Article Book. Berlin: Mouton de Gruyter.
Di Sciullo, A. & E. Williams. 1987. On the Definition of Word. Cambridge, MA: MIT Press.
Drury, J. 1998a. Root-First Derivations: Atomic Merge & The Coresidence Theory of Movement. Ms. University of Maryland, College Park.
Drury, J. 1998b. The Promise of Derivations: Atomic Merge & Multiple Spell-Out. Groninger Arbeiten zur germanistischen Linguistik 42, 61-108.
Drury, J. 1999. The mechanics of ?-derivations. In S. Aoshima et al. (eds.) UMWPiL 8.
Drury, J. 2000. Some Thoughts on Generalizing Transmission. In M. Guimarães, L. Meroni, C. Rodrigues, & I. San Martin (eds.) University of Maryland Working Papers in Linguistics 9, 1-13.
Drury, J. 2005. The Order of Form. Ms. Georgetown University, Washington D.C.
Du Plessis, H. 1977. Wh-Movement in Afrikaans. Linguistic Inquiry 8, 723-726.
Emonds, J. 1970. Root and Structure-Preserving Transformations. Indiana University Linguistics Club Publication.
Emonds, J. 1976. A Transformational Approach to English Syntax. New York: Academic Press.
Emonds, J. 1985. A Unified Theory of Syntactic Categories. Dordrecht: Foris.
Epstein, S., E. Groat, R. Kawashima, & H. Kitahara. 1998. A Derivational Approach to Syntactic Relations. New York: Oxford University Press.
Felser, C. 2004. Wh-Copying, phases, and successive cyclicity. Lingua 114, 543-574.
Fodor, J. D. 1977. Semantics: Theories of Meaning in Generative Grammar. Hassocks, Eng.: Harvester Press.
Fodor, J. 1983. Modularity of Mind. Cambridge, MA: MIT Press.
Fodor, J. 2000. The Mind Doesn't Work That Way. Cambridge, MA: MIT Press.
Fox, D. 1998. Economy and Semantic Interpretation. Cambridge, MA: MIT Press.
Frampton, J. & S. Gutmann. 2000. Agreement is Feature-Sharing. Ms. Northeastern University.
Frank, R. 1992. Syntactic Locality and Tree Adjoining Grammar: Grammatical, Acquisition, and Processing Perspectives. Doctoral dissertation. University of Pennsylvania.
Frank, R. 2002. Phrase Structure Composition and Syntactic Dependencies. Cambridge, MA: MIT Press.
Frank, R. & A. Kroch. 1995. Generalized Transformations and the Theory of Grammar. Studia Linguistica 49, 103-151.
Fukui, N. 1986. A Theory of Category Projection and its Applications. Doctoral dissertation, MIT.
Fukui, N. & M. Speas. 1986. Specifiers and Projections. In N. Fukui et al. (eds.) MIT Working Papers in Linguistics 8.
Grimshaw, J. 1991. Extended Projections. Ms. Brandeis University.
Grimshaw, J. 1997. Projection, Heads, and Optimality. Linguistic Inquiry 28, 373-422.
Groat, E. & J. O'Neil. 1996. Spell-Out at the LF Interface. In W. Abraham, S. Epstein, H. Thrainsson, & J.-W. Zwart (eds.) Minimal Ideas, Amsterdam: John Benjamins, 305-327.
Grohmann, K. K. 2003. Prolific Domains. Amsterdam: John Benjamins.
Grohmann, K. K., J. Drury, & J. C. Castillo. 2000. No More EPP. In R. Billerey & B. D. Lillehaugen (eds.), WCCFL 19: Proceedings of the 19th West Coast Conference on Formal Linguistics. Cascadilla Press.
Guimarães, M. 1999. Phonological Cascades & Intonational Structure in Dynamic Top-Down Syntax. Ms. UMCP.
Guimarães, M. 2004. Derivation and Representation of Syntactic Amalgams. Doctoral dissertation. UMCP.
Hale, K. & S. J. Keyser. 1993. On Argument Structure and the Lexical Expression of Syntactic Relations. In K. Hale & S. J. Keyser (eds.) The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, Cambridge, MA: MIT Press.
Hale, K. & S. J. Keyser. 2002. Prolegomenon to a Theory of Argument Structure. Cambridge, MA: MIT Press.
Hallé, F., R. A. A. Oldeman, & P. B. Tomlinson. 1978. Tropical trees and forests: An architectural analysis. Berlin: Springer-Verlag.
Heck, F. and G. Müller. 2000. Successive cyclicity, long-distance superiority, and local optimization. Proceedings of WCCFL 19, 101-114.
Henry, A. 1995. Belfast English and Standard English: Dialect Variation and Parameter Setting. Oxford: Oxford University Press.
Herburger, E. 2000. What Counts. Cambridge, MA: MIT Press.
Hiemstra, I. 1986. Some aspects of Wh-Questions in Frisian. North-Western European Language Evolution (NOWELE) 8, 97-110.
Higginbotham, J. 1985. On Semantics. Linguistic Inquiry 16/4: 547-593.
Higginbotham, J. 1988. Contexts, Models, and Meanings: A Note on the Data of Semantics. In R. Kempson (ed.) Mental Representations: The Interface Between Language and Reality, Cambridge: Cambridge University Press, 29-48.
Hornstein, N. 2001. Move! A Minimalist Theory of Construal. Oxford: Blackwell.
Huang, J. 1982. Logical Relations in Chinese and the Theory of Grammar. Doctoral dissertation, MIT.
Jackendoff, R. 1972. Semantic Interpretation in Generative Grammar. Cambridge, MA: MIT Press.
Jackendoff, R. 1977. X-bar Syntax. Cambridge, MA: MIT Press.
Jayaseelan, K. A. 2001. IP-internal Topic and Focus Phrases. Studia Linguistica 55: 39-75.
Jenkins, L. 2000. Biolinguistics. Cambridge: Cambridge University Press.
Johnson, K. 1991. Object Positions. Natural Language & Linguistic Theory 9: 577-636.
Kayne, R. 1975. French Syntax. Cambridge, MA: MIT Press.
Kayne, R. 1984. Connectedness and Binary Branching. Dordrecht: Foris.
Kayne, R. 1989. Facets of Romance Past Participle Agreement. In P. Beninca (ed.) Dialect Variation and the Theory of Grammar, Dordrecht: Foris, 85-103.
Kayne, R. 1994. The Antisymmetry of Syntax. Cambridge, MA: MIT Press.
Kayne, R. & J.-Y. Pollock. 1978. Stylistic Inversion, Successive Cyclicity, and Move NP in French. Linguistic Inquiry 9: 595-621.
Kimball, J. 1972. Cyclic and Linear Grammars. In J. Kimball (ed.), Syntax and Semantics 1, New York: Seminar Press, pp. 63-80.
Klima, E. 1964. Negation in English. In J. Fodor & J. Katz (eds.) The Structure of Language. Englewood Cliffs, NJ: Prentice-Hall: 246-323.
Koizumi, M. 1993. Object Agreement Phrases and the Split-VP Hypothesis. In J. Bobaljik & C. Phillips (eds.) Papers on Case & Agreement I, MITWPL 18.
Koizumi, M. 1995. Phrase Structure in Minimalist Syntax. Doctoral dissertation. MIT.
Kratzer, A. 1996. Severing the External Argument from its Verb. In J. Rooryck & L. Zaring (eds.) Phrase Structure and the Lexicon, Dordrecht: Kluwer.
Kulick, S. 2000. Constraining Non-Local Dependencies in Tree Adjoining Grammar: Computational and Linguistic Perspectives. Doctoral dissertation. UPenn.
Landau, I. 2000. Elements of Control. Dordrecht: Kluwer.
Landau, I. 2004. Movement out of Control. Linguistic Inquiry 34(3), 470-498.
Lasnik, H. 1995. A Note on Pseudogapping. In Papers on Minimalist Syntax, MITWPL 27, 142-163.
Lasnik, H. 1999. Minimalist Analysis. Oxford: Blackwell.
Lasnik, H. & M. Saito. 1991. Move-α. Cambridge, MA: MIT Press.
Lasnik, H. & J. Uriagereka. Forthcoming. A Course in Minimalist Syntax. Oxford: Blackwell.
Lightfoot, D. 1989. The Child's Trigger Experience: Degree-0 Learnability. Behavioral & Brain Sciences 12: 321-334.
Manzini, R. & A. Roussou. 2000. A Minimalist theory of A-movement and Control. Lingua 110, 409-447.
May, R. 1985. Logical Form. Cambridge, MA: MIT Press.
McCloskey, J. 2000. Quantifier Float and Wh-Movement in an Irish English. Linguistic Inquiry 31:1, 57-84.
McDaniel, D. 1989. Partial and Multiple Wh-movement. Natural Language and Linguistic Theory 7, 565-604.
McDaniel, D., B. Chiu, & T. Maxfield. 1995. Parameters for Wh-Movement Types: Evidence from Child English. Natural Language and Linguistic Theory 13, 709-753.
McKinnon, R. & L. Osterhout. 1996. Constraints on Movement Phenomena in Sentence Processing: Evidence from Event-Related Brain Potentials. Language & Cognitive Processes 11, 495-524.
Muysken, P. 1982. Parameterizing the notion "Head". Journal of Linguistic Research 2, 57-75.
Nunes, J. 1995. The Copy Theory of Movement and Linearization of Chains in the Minimalist Program. Doctoral dissertation. University of Maryland.
Parsons, T. 1990. Events in the Semantics of English. Cambridge, MA: MIT Press.
Pesetsky, D. 1982. Paths & Categories. Doctoral dissertation. MIT.
Pesetsky, D. 1987. Wh-in-Situ: Movement and Unselective Binding. In E. Reuland & A. G. B. ter Meulen (eds.), The Representation of (In)definiteness. Cambridge, MA: MIT Press.
Pesetsky, D. & E. Torrego. 2001. T-to-C Movement: Causes and Consequences. In M. Kenstowicz (ed.) Ken Hale: A Life in Language, Cambridge, MA: MIT Press, 355-426.
Pesetsky, D. & E. Torrego. 2002. Tense, Case, and the Nature of Syntactic Categories. Ms. MIT & U. of Massachusetts Boston [in J. Gueron & J. Lecarme (eds.) The Syntax of Time, Cambridge, MA: MIT Press].
Phillips, C. 1996. Order and Structure. Doctoral dissertation. MIT.
Phillips, C. 2003. Linear Order and Constituency. Linguistic Inquiry 34, 37-90.
Postal, P. 1974. On Raising. Cambridge, MA: MIT Press.
Potsdam, E. & J. Runner. 2003. Richard Returns: Copy Raising and Its Implications. Proceedings of CLS 2001.
Quine, W.V.O. 1940. Mathematical Logic. Harvard University Press.
Reinhart, T. 2000. The Theta System: Syntactic Realization of Verbal Concepts. OTS Working Papers in Linguistics.
Reinhart, T. & E. Reuland. 1993. Reflexivity. Linguistic Inquiry 24: 657-720.
Reuland, E. & M. Everaert. 2001. Deconstructing Binding. In C. Collins & M. Baltin (eds.) The Handbook of Contemporary Syntactic Theory, Oxford: Blackwell, 634-669.
Resnik, P. 1992. Left Corner Parsing & Psychological Plausibility. Proceedings of the 14th International Conference on Computational Linguistics (COLING '92). Nantes, France.
Rezac, M. 2004. Elements of Cyclic Syntax: Agree and Merge. Doctoral dissertation. University of Toronto.
Rogers, A. 1971. Three kinds of physical perception verbs. Papers from the Eighth Regional Meeting of the Chicago Linguistics Society, Chicago: CLS, 303-315.
Rosenbaum, P. 1967. The Grammar of English Predicate Complement Constructions. Cambridge, MA: MIT Press.
Ross, J. R. 1973. The Penthouse Principle and the Order of Constituents. Papers from the Comparative Syntax Festival (CLS 9).
Runner, J. 1995. Noun Phrase Licensing and Interpretation. Doctoral dissertation. University of Massachusetts.
Runner, J. To Appear. The Accusative Plus Infinitive Construction in English. In B. Hollebrandse and R. Goedemans (eds.) The Syntax Companion. Oxford: Blackwell.
Sabel, J. 2000. Partial Wh-Movement and the Typology of Wh-Questions. In U. Lutz, G. Müller, A. von Stechow (eds.), Wh-Scope Marking. John Benjamins, Amsterdam and Philadelphia, pp. 409-446.
Saddy, D. 1991. Wh-scope Mechanisms in Bahasa Indonesia. In L. Cheng & H. Demirdache (eds.), MIT Working Papers in Linguistics 15. MITWPL, Cambridge, MA: pp. 183-218.
Sauerland, U. 1995. The Lemmings Theory of Case. Paper presented at the 7th Student Conference in Linguistics (SCIL 7), University of Connecticut.
Schein, B. 1992. Plurals and Events. Cambridge, MA: MIT Press.
Sells, P. 1987. Aspects of Logophoricity. Linguistic Inquiry 18, 445-479.
Shieber, S. 1986. An Introduction to Unification-Based Approaches to Grammar. CSLI: University of Chicago Press.
Takahashi, D. 1994. Minimality of Movement. Doctoral dissertation. University of Connecticut.
Terada, H. 1999. Successive Cyclicity and Incrementality. English Linguistics 16/2: 243-274.
Tesnière, L. 1959. Éléments de Syntaxe Structurale. Klincksieck, Paris.
Thornton, R. 1990. Adventures in Long-Distance Moving: The Acquisition of Complex Wh-Questions. Doctoral dissertation, University of Connecticut.
Torrego, E. 1984. On Inversion in Spanish and Some of its Effects. Linguistic Inquiry 15: 103-129.
Ura, H. 1998. Checking, economy, and copy-raising in Igbo. Linguistic Analysis 28: 67-88.
Uriagereka, J. 1998. Rhyme & Reason: An Introduction to Minimalist Syntax. Cambridge, MA: MIT Press.
Uriagereka, J. 1999. Multiple Spell Out. In S. Epstein & N. Hornstein (eds.) Working Minimalism, Cambridge, MA: MIT Press, 251-282.
Uriagereka, J. 2002a. Derivations: Exploring the Dynamics of Syntax. London: Routledge.
Uriagereka, J. 2002b. Spell Out Consequences. Ms. UMCP.
van Riemsdijk, H. 1998. Categorial Feature Magnetism: The Endocentricity and Distribution of Projections. Journal of Comparative Germanic Linguistics 2: 1-48.
Vermeulen, R. 2002. Multiple Nominative Constructions in Japanese. In M. Cazzoli-Goeta et al. (eds.) Fifth Durham Postgraduate Conference in Theoretical and Applied Linguistics: Conference Proceedings 224-233.
Watanabe, A. 1995. The Conceptual Basis of Cyclicity. In R. Pensalfini & H. Ura (eds.) MITWPL 27, 269-291.
Weinberg, A. 1999. A Minimalist Theory of Human Sentence Processing. In S. Epstein & N. Hornstein (eds.) Working Minimalism, Cambridge, MA: MIT Press, 282-314.
Williams, E. 1994. Thematic Structure in Syntax. Cambridge, MA: MIT Press.
Zwart, C. J.-W. 1993. Dutch Syntax: A Minimalist Approach. Doctoral dissertation. University of Groningen.
Zwart, C. J.-W. 1996. Shortest Move vs. Fewest Steps. In W. Abraham, S. Epstein, H. Thrainsson, & J.-W. Zwart (eds.) Minimal Ideas, Amsterdam: John Benjamins, 305-327.