ABSTRACT 
 
Title of Dissertation: 
ALTERNATIVE DIRECTIONS FOR MINIMALIST INQUIRY: 
EXPANDING AND CONTRACTING PHASES OF DERIVATION  
  
 John Edward Drury, Doctor of Philosophy, 2005 
  
Dissertation Directed By: Professor Juan Uriagereka, Linguistics Department 
 
This dissertation develops novel derivational mechanics for characterizing the syntactic 
component of human language: Tree Contraction Grammar (TCG). TCG falls within a 
general class of derivationally oriented minimalist approaches, constituting a version of a 
Multiple Spell Out (MSO-)system (Chomsky 1999, Uriagereka 1999, 2002). TCG posits 
a derivational WORKSPACE restricting the size of structures that can be active at a given 
stage of derivation. As structures are expanded, workspace limitations periodically force 
contractions of the span of structure visible to operations. These expansion-contraction 
dynamics are shown to have implications for our understanding of locality of 
dependencies, specifically regarding successive cyclic movement. The mechanics of 
TCG rely on non-standard assumptions about the direction of derivation: structure 
assembly is required to work top-down. TCG draws a key idea from Tree Adjoining 
Grammar (TAG), namely that recursive structure ought to play a direct role in delimiting 
the range of possible interactions between syntactic elements in phases of derivation. 
TAG factors complex structures into non-recursive elementary trees and recursive 
auxiliary trees that are combinable via TAG's two operations (substitution/adjoining). 
In TCG the expansion of structure in the workspace is similarly limited to containing 
only non-recursive stretches of structure. In the course of a derivation, encountering 
"repeated elements" in the expanding dominance ordering forces contractions of the 
workspace (understood to happen in potentially different ways depending on the 
properties of repeated elements). In certain circumstances, repeated elements are 
identified, allowing information from earlier stages of derivation to be carried over to 
later stages, underwriting our (novel) view of successive cyclicity. Recursive structure is 
retained in the global "output" structure, upon parts of which we understand the 
workspace to be superimposed. 
 
 
 
 
 
 
 
 
 
 
 
 
ALTERNATIVE DIRECTIONS FOR MINIMALIST INQUIRY: 
EXPANDING AND CONTRACTING PHASES OF DERIVATION 
 
 
 
By 
 
 
John Edward Drury 
 
 
 
 
 
Dissertation submitted to the Faculty of the Graduate School of the 
University of Maryland, College Park, in partial fulfillment 
of the requirements for the degree of 
Doctor of Philosophy 
2005 
 
 
 
 
 
 
 
 
 
Advisory Committee: 
Professor Juan Uriagereka, Chair 
Professor Norbert Hornstein 
Professor Howard Lasnik 
Professor Colin Phillips 
Professor Amy Weinberg 
Professor James Reggia 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
© Copyright by 
John Edward Drury 
2005 
 
Dedication 
 
 
 
To my Father 
Dr. Thomas Francis Drury 
for fostering love of wisdom 
 
 
And to Lisa 
for patience and understanding 
 
 
Acknowledgements 
There is no way to adequately thank all of the people who made the completion of this 
dissertation possible. But this is the place where one is to make such an attempt, so here 
goes. I thank first my friend and committee chair Juan Uriagereka for leaving just 
enough bread crumbs behind him on that twisting and difficult-to-follow trail that leads to 
the Outside of The Box. Thanks as well go to my committee: Colin Phillips, Amy 
Weinberg, Howard Lasnik, and Norbert Hornstein, and to James Reggia for agreeing to 
fill the role of Dean's Rep on short notice. To both Howard and Norbert I owe a special 
additional debt for last-minute help above and beyond the call. 
 Standing on the shoulders of others is a job for an acrobat, which I am not. So, 
to the following people I must say that I am sorry for all the kicking, and if I poked you in 
the eye once or twice, please know it was not intentional. Thank you for your patience 
and for whatever help you provided me at one point or another in conversations about 
linguistics in general or about issues in the vicinity of this dissertation: Mark Arnold, 
Stephen Crain, Norbert Hornstein, Howard Lasnik, David Lebeaux, David Lightfoot, 
Colin Phillips, Phil Resnik, Amy Weinberg, Juan Carlos Castillo, Kleanthes Grohmann, 
Cedric Boeckx, Max Guimarães, Jeff Lilly, Matt Kaiser, Tom Cornell, Robert 
Chametzky, Chris Wilder, Georges Rey, Ricardo Etxepare, Itziar San Martin, Aitziber 
Atutxa, Lucia Quintana, Eli Murgia, Caro Struijke, Acrisio Pires, Peggy Antonisse, Julien 
Musolino, Viola Miglio, and Laura Benua. 
 I owe a serious debt to Maryland's Linguistics Department as a whole, for 
supporting my work and my education, and for providing an excellent environment in 
which to grow. Within this department, Kathi Faulkingham deserves extra special thanks 
for helping to ensure that my typically pathological response to administrative deadlines 
did not cause any unsolvable problems. 
 Thanks also go to Michael Ullman for generous support and encouragement 
during my stay in the Brain & Language Lab in Georgetown University's Department of 
Neuroscience, and to B&L Lab members past and present, especially Karsten Steinhauer 
and Matt Walenski. 
 I thank also my parents, Thomas and Margaret Drury, for unconditional love and 
backing in all my endeavors. And last, to my amazing wife Lisa, thank you for putting up 
with my process and my drama over these years, and for keeping the faith that one day 
(today!) this would actually come to an end... and thank you, little O, for demanding that 
I complete this work so as to make more time for play and laughter. 
 
 
Table of Contents 
DEDICATION
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
CHAPTER 1: SPELLING OUT THE WORKSPACE
1.1. WS CONSTRAINTS & A SKETCH OF SCM IN TCG
1.2. LOCAL & LINKED LOCAL RELATIONS: SAMPLE TCG DERIVATIONS
1.3. SUMMARY SO FAR... & THE PATH AHEAD
1.4. SO-SYSTEMS, TAG, & GENERALIZING THE WS/O-DISTINCTION
1.4.1. MSO and Linearization
1.4.2. Phases/MSO & Successive Cyclicity
1.4.3. What Goes for Cyclicity in TAG
1.5. IMPLEMENTATION OF TCG
1.5.1. On Relating Positions in Structure
1.5.2. Labels & Structure
1.6. CHAPTER SUMMARY
CHAPTER 2: REGARDING SUCCESSIVE CYCLIC MOVEMENT
2.1. TYPES OF SUCCESSIVE CYCLICITY EFFECTS
2.1.1. Wh-Copying
2.1.2. Q-Stranding
2.1.3. Agreement on the Path
2.1.4. Some Binding Theoretic Effects
2.1.5. Interaction with Variable Binding
2.1.6. Inversion Effects
2.1.7. Control as Raising?
2.2. SOME THINKING ON SUCCESSIVE CYCLICITY
2.2.1. M-Features
2.2.2. Cyclicity without M-features?
2.3. CHAPTER SUMMARY & FURTHER CRITICAL REMARKS
CHAPTER 3: TCG ANALYSIS
3.1. LOCAL RELATIONS: PART ONE
3.2. LINKED-LOCAL RELATIONS I: RAISING TO SUBJECT (RTS)
3.2.1. Some Raising Impossibilities & Expletive/Associate Relations
3.2.2. Binding Interventions & SCM
3.2.3. Variable Binding/Condition C Interaction
3.3. LINKED LOCAL RELATIONS II: WH-MOVEMENT
3.3.1. ?-Identification & Local Movement
3.3.2. Core Cases of A'-SCM (& Some Technical Problems Addressed)
3.4. LOCAL RELATIONS: PART TWO
3.4.1. Raising-to-Object (RtO)
3.4.2. Passives, Local Binding, & Control
3.5. CLAUSAL UNITHOOD & WH-AGAIN...
3.6. CONCLUSIONS: THE TAKE-HOME MESSAGE OF TCG
3.7. CLOSING
REFERENCES
 
 
 
CHAPTER 1: Spelling Out the Workspace 
the idea of the form implicitly contains also the history of such a form 
Hallé, Oldeman, & Tomlinson (1978) 
This dissertation develops novel derivational mechanics for characterizing the syntactic 
component of human language: Tree Contraction Grammar (TCG). The approach falls 
into the general class of derivationally oriented systems under development within the 
Minimalist Program,[1] and more specifically into a category of models that I will call here 
Multiple Spell Out (MSO-)systems (Chomsky 1999, Uriagereka 1999, 2002). MSO-systems, 
generally speaking, divide derivations into sub-derivations, the outputs of which 
may be independently evaluated at the interfaces to extra-grammatical systems and may 
play a special role in demarcating domains with import for understanding the locality of 
syntactic relationships. 
 I propose in this work a general way of thinking about how MSO-systems 
function, relying on a distinction between a syntactic WORKSPACE and a derived OUTPUT 
structure. It is within this context that the core theoretical intuition underlying TCG 
emerges. The approach is informed as well by Tree Adjoining Grammar (TAG).[2] In 
particular, as in TAG approaches, the fundamental notion of recursive structure is argued 
here to play a direct role in understanding the locality properties of syntactic 
dependencies. The general perspective developed in this work sees MSO-systems (and 
TCG specifically) together with TAG as a family of closely related approaches. 
                                                
[1] Chomsky (1995, 1998, 1999), Uriagereka (1998, 2002), Lasnik & Uriagereka (2004). 
[2] See Frank (1992, 2002), Frank & Kroch (1995) among many others. 
 TCG distinguishes itself in terms of the mechanisms it makes available for 
analyses of successive cyclic movement (SCM) phenomena in two ways that I argue to 
be of broad theoretical interest: 
 
(1) a. The Non-Existence of "EPP-/P-features": if the key ideas are right, 
special features driving intermediate movements in SCM are not needed 
 
b. Derivational Directionality: the mechanics of TCG derivations demand 
that structure assembly work "top-down", and not bottom-up as in 
Chomsky's (1994, 1995) widely adopted Bare Phrase Structure (BPS) 
 
It turns out that (1)a and (1)b are related. Telling the story of how this might be so is the 
main mission of this dissertation. 
A metaphor helps to get the general intuition behind our Workspace/Output-Distinction 
(WS/O-distinction): picture an arbitrarily complex syntactic structure upon 
which we might shine a spotlight beam which can illuminate only small portions of 
structure, leaving the rest in darkness. Construction of syntactic objects and the licensing 
of dependencies within such structures is understood to take place within the illuminated 
span, and not elsewhere (no syntactic work can happen in the dark). Thus, in order to 
expand structure beyond the maximal span of illumination, the spotlight beam must 
"move on". As a consequence, some of the previously established structure will 
necessarily have to be left behind outside of the illuminated zone. 
If derivationally later expansion of structure requires the spotlight to move on 
before a required syntactic licensing operation has occurred, there is no backtracking to 
fix the problem. In such a situation the output structure is stuck with some 
illegible/unlicensed property which has been left outside the spotlight. (The interface 
systems, unlike the narrow syntactic combinatorial system defined by the workspace, can 
see in the dark.) 
The task of developing this metaphor into concrete proposals that can support 
reasoning about the syntactic component of the human language faculty is thus the task of 
specifying the principles that could be understood to govern the span of this spotlight 
beam, how and what operations happen within it, how and why it moves on to further 
expand structure, what happens to the old structure left outside the workspace boundaries 
as such later expansions occur, and so on.[3] 
Consider a graphical illustration of the intuition informing our workspace/output 
distinction (WS/O-distinction). The following in (2) and (3) show two different partial 
derivations. The first of these schemas illustrates a system in which construction within 
the workspace creates the output structure in a top-down fashion; the second schema 
represents a similar process working bottom-up. Direction of derivation in this sense will 
be important in this work. As mentioned above, the present approach will be required to 
work roughly as pictured in (2) and not as in (3). 
 
(2) [Figure: workspace schema for a top-down derivation, with an arrow marking the DIRECTION OF DERIVATION] 
 
                                                
[3] Note that any connection to the notion of "working memory" as this notion is deployed in the psychological 
literature is metaphoric. In particular, the limitations proposed here to govern the maximal spans of the derivational 
workspace are not expected to vary across individuals. The constraints proposed here are alleged to be "hard" 
architectural constraints on the combinatorial component. I do think there is room to relate the proposals here to a story 
about memory systems, but I won't be spending time on this issue in the present work. 
(3) [Figure: workspace schema for a bottom-up derivation, with an arrow marking the DIRECTION OF DERIVATION] 
 
The enclosures in these schemas represent the state of the workspace at earlier vs. later 
stages of derivation. Again, if this "spotlight beam" is limited in its scope then it must at 
certain points of derivation move on to expand further structure, thus leaving some of the 
previously established structure outside of its boundaries. Such abandoned spans are 
represented by the dotted arcs for the portions of the structure outside the enclosures in 
(2) and (3). The notion of an OUTPUT then is the entire span of the established structure, of 
which the workspace only allows us to see parts at any given step.[4] I will show below 
that this way of thinking, our workspace/output distinction (WS/O-distinction), is 
helpful for reasoning about the workings of MSO-type approaches generally. For 
example, in addition to having this distinction serve as a platform for the development of 
TCG, I will be discussing both Chomsky's (1998, 1999) view of spell-out as happening 
"by phase" in these terms, as well as Uriagereka's (1999) linearization-based view. 
 To begin to get a better feel for the WS/O-distinction, consider the old idea that 
there might exist special cyclic nodes defining domains in syntactic complexes within 
which some inventory of transformational operations ⟨T_1,..,T_n⟩ is applied and then 
reapplied to (recycled in) the next higher domain, and so on, as in (4). 
 
                                                
[4] With one exception: the first "phase" of derivation involving expansion of the structural description will, up 
until the first contraction of the workspace, correspond one-to-one with the output structure (see also fn. 6). 
(4) S_4 
      ? (RE)APPLY ⟨T_1,..,T_n⟩ (cycle-?) 
    S_3 
      ? (RE)APPLY ⟨T_1,..,T_n⟩ (cycle-?) 
    S_2 
      ? (RE)APPLY ⟨T_1,..,T_n⟩ (cycle-?) 
    S_1 
      ? APPLY ⟨T_1,..,T_n⟩ (cycle-?) 
 
This can be understood in present terms by identifying the maximal span of the 
workspace with such cyclic domains. As cycles are completed, a new workspace (cyclic 
domain) begins, leaving the previous cyclic domain stranded outside the borders of the 
workspace. This is pictured in (5): 
 
(5)                            S_4 
                      S_3      S_3 
             S_2      S_2      S_2 
    S_1      S_1      S_1      S_1 
 
But this is just translating terminology: ordered nodes marking domains for 
applications of an inventory of transformations give way to a limited workspace that 
periodically expands up to a cyclic node and contracts, "clearing the buffer" to make way 
for the next domain expansion.[5] So what is the interest of the WS/O-distinction? 
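The "buffer-clearing" reading of the cycle can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the node names, the flat list standing in for structure, and the function names are hypothetical, and nothing here is claimed to be the implementation of any particular cyclic theory. The sketch only shows that, on this reading, each workspace state is a contiguous stretch of the output.

```python
def run_cycles(nodes, cyclic_nodes):
    """Assemble nodes bottom-up, clearing the workspace after each cyclic node.

    Returns the full output, the workspace contents at the end of each cycle,
    and whatever remains in the workspace when the input is exhausted.
    """
    output = []      # everything built so far (never shrinks)
    workspace = []   # the currently active cyclic domain
    cycles = []      # snapshot of the workspace at each cycle boundary
    for node in nodes:
        workspace.append(node)
        output.append(node)
        if node in cyclic_nodes:
            cycles.append(list(workspace))
            workspace = []  # "clear the buffer" for the next domain
    return output, cycles, workspace


def is_contiguous_part(part, whole):
    """True if `part` occurs as an unbroken stretch of `whole`."""
    n = len(part)
    return any(whole[i:i + n] == part for i in range(len(whole) - n + 1))


# Hypothetical bottom-up sequence with three cyclic nodes S1..S3.
nodes = ["V1", "S1", "V2", "S2", "V3", "S3"]
output, cycles, leftover = run_cycles(nodes, {"S1", "S2", "S3"})
```

Under these assumptions the output keeps every node, while each snapshot in `cycles` holds only the material of one cyclic domain.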
                                                
[5] My thinking of spell-out in terms of an emptying/clearing of a buffer of sorts grows in part from numerous 
helpful conversations with Max Guimarães. See Guimarães (1999) for related discussion involving alternative 
derivational directionality and possible applications to thinking about syntax/prosody relationships. 
 Note that the schema in (5), and those in both (2) and (3) above, obey a general 
restriction. The structures in the workspaces at different steps of derivation in these 
examples are always contiguous (sometimes improper) parts of the output structure.[6] 
 I suggest that a principled view of how the contents of the workspace are 
regulated can result in situations where non-contiguous portions of the output structure 
are represented in the workspace. This will be key to our development of a basis for 
analyses of core cases of SCM. Consider a rather fancier illustration which conveys this 
alternative intuition: 
 
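(6) a.  b.  c.  [Figure: three successive workspace states; the state in (c) is drawn twice, with the two renderings marked as workspace-equivalent (ws=)] 
 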
 
In (6) two types of workspace contractions are represented. Assuming for this example a 
top-down derivation, the first such contraction occurs between the first two structures in 
(6)a and (6)b. This is just like the contractions depicted in (2), in which we see expansion 
of the workspace at the bottom end forcing an abandonment of structure at the top (i.e., 
the spotlight beam moves on; of course the opposite happens with our schemas in (3) and 
(5) above, where expansions at the top force abandoning structure at the bottom). 
                                                
[6] That is, for any starting point of a derivation, up to the first contraction of the available span of structure 
visible in the workspace, the workspace structure corresponds directly to the output. I say "corresponds" rather than "is 
identical to" since I will be viewing the workspace superimposition on outputs as a matter of only syntactic (F) 
properties being "in" the workspace, which will be understood to be associated with the PF- (π) and LF- (λ) relevant 
properties that will be understood to be what populates the output structure. However, the "structure" (ordering 
relationships between nodes) will be understood to be the same across the workspace/output division. I will unpack all 
this below. 
However, between the second and third structures in (6)b and (6)c we have a 
contraction which is unlike the first. In this second contraction it is neither the top nor the 
bottom of the structure which is voided from the workspace, but rather some intermediate 
stretch. This results in bringing two nodes (the open/unshaded nodes in (6)) into a more 
local relationship within the workspace than existed prior to the contraction. It also 
results in some intervening material being "spliced out" of the workspace, though we 
understand this spliced-out material to still be present in the output. Thus, the two 
rightmost objects (those in (6)c) are equivalent in terms of what is in the workspace (I 
mark this above as "ws="), though the second abstracts away from the output structure to 
which it is connected (i.e., via the elements still in the workspace). 
Our metaphor of a spotlight beam breaks down at this point of course, so let us 
kick that ladder away; the formal intuition should now be clear enough to proceed. The 
key idea is that non-contiguous portions of the output may be maintained in the 
workspace (WS). I will suggest below that the best way of viewing the WS/O-distinction is 
not in terms of two levels of syntax, but rather to understand the connection between the 
two as a dynamic interface. WS-computations incrementally yield a structured object 
populated by only LF and PF relevant properties. But an interface is the meeting point of 
at least two different systems. If the WS is itself an interface system, then we should ask, 
"between what and what else?" On one side of the WS I have just suggested that we have 
a structured PF/LF object. What then is on the other side? Answer: the lexicon. The 
view of syntactic architecture can be visualized as follows: 
 
 
 
 
 
(7)  L        WS-1 
     E 
     X        WS-2 
     I 
     C 
     O        WS-n 
     N 
 
A better way of putting things then would be to say that the WS is itself the "interface" 
between the lexicon, on the one hand, and derived PF/LF output structures on the other. 
The lexicon feeds the WS, which expands up to its limits (such limits are introduced and 
developed below), and then moves on or contracts. The dynamics of WS expansion and 
contraction leave in their wake a structured object (a tree) which is populated by only 
PF and LF relevant properties. 
The interest of the WS/O-distinction within the TCG approach developed here is 
in the nature of the shrinking/contraction processes that yield a way of treating 
superficially non-local relationships as potentially reducible to more local domains. So 
the key question becomes: what drives contractions of the workspace, and how might we 
understand these to work in a way that can support analysis of more non-local-looking 
relationships? 
I begin development of my answer(s) to this question in §1.1, introducing two 
constraints on categories and ordering in the workspace, and show how this yields a 
novel schema for analyses of successive cyclicity phenomena. In §1.2 I develop some 
sample general derivations for A- and A'-relations, and highlight some features that will 
be of interest in later discussion. §1.3 sums up the previous two. §1.4 backs up to 
consider MSO-systems and TAG, focusing mainly on their general outlooks on successive 
cyclic movement. The discussion of these neighboring models is framed within our 
WS/O-distinction, bringing forward its generality and highlighting some conceptual 
advantages of viewing MSO-systems in particular in this way. These discussions lead us 
to consider, in §1.5, some technical issues regarding the theory of movement chains, 
developing a version of an idea offered in Chomsky (1995) where it is suggested that 
movement chains be understood as sets of contexts/positions.[7] In §1.5 I also consider 
issues regarding categories/features and formal ordering (that is, the theory of local 
intra-/inter-phrasal organization) and settle on a "reduced" view adopting some ideas 
from Brody (2000, 2003). This reduced view is suggested to be a positive step in the 
direction of restrictiveness of the overall theory, though the central motivation for its 
adoption is its overall "good fit" with the key intuitions underlying TCG. 
Following this, Chapter 2 discusses empirical and theoretical issues regarding 
successive cyclic movement in some more detail, and raises some issues regarding 
so-called EPP-/P-features (what are called "Move/Merge-features" or "M-features" here), 
targeting them for elimination in the TCG approach. A number of other views 
regarding cyclicity are discussed as well. 
Chapter 3 then turns to develop the TCG ideas in more detailed analytical 
discussion, focusing initially on Raising-to-Subject (RtS) and wh-movement. The 
approach is demonstrated to require a top-down implementation, and some contrasts with 
TAG are discussed. The approach is then explored in possible extensions to other 
phenomena. I conclude with some open questions and general discussion regarding the 
architecture of syntactic theory. 
 
[7] To jump ahead a bit: Chomsky's particular view takes the "context" for an 
element α to be the entire derivation up to the point where α is introduced/merged (i.e., the "sister" of α is the context 
defining this occurrence of α, and the "sister" is itself viewed as the entire structure that this sister element dominates). 
I suggest in §1.5 that this is both too strong and too weak: it is too strong in that identifying/distinguishing 
occurrences requires reference to arbitrarily large stretches of previously established structure; it is too weak because 
we will see that reducing our understanding of contexts to just the label of the context α relates to will permit us to 
view some contexts as indistinguishable from others. This will turn out to be what underwrites SCM without the 
postulation of EPP-properties. 
 
1.1. WS Constraints & A Sketch of SCM in TCG 
In this section I introduce two possible constraints on the workspace and examine some 
of their consequences. These notions conspire to yield linked-local relations of the 
successive cyclic movement (SCM) sort. 
 
(8) WORKSPACE ORDER: 
The elements in the workspace manifest a weak partial order (DOMINANCE)[8] 
 
(9) WORKSPACE DISTINCTNESS (ANTI-RECURSION): 
 The workspace does not tolerate the presence of multiple tokens of type X 
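
The two constraints can be made concrete with a small sketch. It is only an illustration under stated assumptions: the category labels (C, T, v, V), the chain-shaped structure, and all function names are hypothetical and not part of the proposal. The code simply checks the three properties of a weak partial order behind (8) and the ban on repeated types in (9).

```python
def dominance_from_chain(chain):
    """Reflexive, transitive dominance pairs for a top-down chain of nodes."""
    return {(chain[i], chain[j])
            for i in range(len(chain))
            for j in range(i, len(chain))}


def is_weak_partial_order(elements, pairs):
    """True if `pairs` is reflexive, antisymmetric, and transitive on `elements`."""
    rel = set(pairs)
    reflexive = all((x, x) in rel for x in elements)
    antisymmetric = all(x == y or (y, x) not in rel for (x, y) in rel)
    transitive = all((x, z) in rel
                     for (x, y) in rel
                     for (w, z) in rel if w == y)
    return reflexive and antisymmetric and transitive


def satisfies_distinctness(types):
    """True if no type occurs more than once among the workspace elements."""
    return len(types) == len(set(types))


# Hypothetical workspace: a top-down chain of four distinct categories.
chain = ["C", "T", "v", "V"]
D = dominance_from_chain(chain)

order_ok = is_weak_partial_order(chain, D)
# Adding a pair that makes V dominate C creates a symmetry with (C, V).
order_broken = is_weak_partial_order(chain, D | {("V", "C")})
distinct_ok = satisfies_distinctness(chain)
distinct_broken = satisfies_distinctness(chain + ["C"])
```

On these assumptions a plain chain satisfies both constraints, while a symmetric dominance pair violates (8) and a second token of an existing type violates (9).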
 
First, as mentioned, the system will be understood to work "top-down". I will return to 
explain why things must, in fact, work this way in the conclusion. Take the shaded node 
in (10)a to be a "to-be-moved" element and the open/unshaded node to be its initial 
structural context (housing the relevant licensing feature(s), e.g., wh or perhaps Case/φ 
information). Assume for the moment that the branching order represented by the tree 
structure manifests the traditional notion of dominance (i.e., a transitive, antisymmetric, 
reflexive relation): 
 
 
 
 
 
 
 
 
 
                                                
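(10) a.  b.  c.  d.  e.  [Figure: five successive workspace states; the panels for (b), (c), and (d) are labeled MATCHING RELATION ?, IDENTIFICATION, and CONTRACTION respectively, and (d) and (e) are marked as workspace-equivalent (ws=)] 
 
[8] I will consider the possibility of a rather stronger statement regarding ordering in later discussion. 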
 
At the point in (10)b an element is introduced which satisfies a matching relation "?". 
This relation will be further specified below; for now simply take ?xy to be satisfied if x 
and y are non-distinct. This situation thus violates the distinctness condition on 
workspace contents stated above in (9), so something must happen in response in order 
for this to be a well-formed workspace. 
Suppose that the system responds by taking these non-distinct elements to be 
essentially one and the same thing. If they are identified, then whatever the higher 
element dominates, so does the lower one. This effects a copying/lowering of the shaded 
node (see (10)c). Note that we do not duplicate elements in the workspace: what happens 
in the workspace is an identification of the open/unshaded nodes, so that subsequent to 
the contraction step in (10)d there are not two tokens or occurrences of either the 
"moving" element or its context. There are rather just single nodes for each in the 
workspace (note the WS equivalence is marked again as "ws=" between (10)d and (10)e 
above). 
 However, there are now, in virtue of this process, two such pairs in the output 
structure. Thus, what remains in the workspace subsequent to contraction is best 
understood in terms of the picture in (10)e, though (10)d captures the workspace/output 
structure correspondence. Thus, what is one in the workspace can be many in the output. 
 Observe that while identifying these open nodes in the structure could be 
understood to effect the equivalent of a lowering operation in terms of preserving the 
domination relations of the upper elements in the output, we clearly seem to require a 
way of blocking the similar copying of all such dominated elements. That is, why 
shouldn't this copying apply to the other dominated material (e.g., the dark nodes on the 
main path in (11)a, resulting in (11)b and then (11)c following contraction)? 
 
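(11) a.  b.  c.  [Figure: (a) the structure with dark nodes on the main path; (b) the result of also copying those dominated nodes under the matching relation ?; (c) the result following contraction] 
 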
 
How is a regress of sorts avoided? What stops this process from copying all the 
domination relations of the upper instance of the two identified nodes (i.e., ? dominates ? 
dominates ? dominates ?,..ad infinitum)? How does this terminate? 
I suggest that we regard the IDENTIFICATION and CONTRACTION steps pictured 
above in (10)c and (10)d as essentially a one-step operation governed by the general 
ordering restriction given in (8). Adding the "moved" element and its context to the 
bottom of the workspace structure in virtue of the identification of this lower node with 
the upper one adds new pairs to the dominance relationship in the workspace. Technically 
this will only be possible if elements in previous pairs in this dominance order that would 
introduce conflicts violating the antisymmetry of the dominance relation are removed 
from the workspace (though, importantly, preserved in the output). This is, essentially, 
the notion of contraction in the TCG framework. 
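
A minimal sketch of identification and contraction treated as a single step, under strong simplifying assumptions: the workspace is modeled as a flat top-down list of labels, the "moved" element itself is not modeled, and the labels X, A, B, C and the function name `extend` are hypothetical. It is meant only to show the bookkeeping, namely that interveners leave the workspace while the output keeps every element.

```python
def extend(workspace, output, label):
    """Add `label` at the bottom of a top-down workspace chain.

    If the workspace already holds an element with the same label, the new
    element is identified with it and everything below the upper occurrence
    is spliced out of the workspace. The output keeps every element.
    """
    new_output = output + [label]
    if label in workspace:
        upper = workspace.index(label)
        new_workspace = workspace[:upper + 1]  # one node for both occurrences
    else:
        new_workspace = workspace + [label]
    return new_workspace, new_output


workspace, output = [], []
for label in ["X", "A", "B", "C"]:
    workspace, output = extend(workspace, output, label)
before = list(workspace)

# A second element of type X arrives: identification plus contraction.
workspace, output = extend(workspace, output, "X")
```

After the final step the workspace holds a single X, while the output contains two occurrences of X with A, B, and C between them, which is one way to picture "what is one in the workspace can be many in the output".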
 
Consider (11) again, with the nodes labeled so that we may refer to them in 
specifying the relevant formal ordering properties as in (12): 
 
(12) a.   b. 
      ?     ? 
    ?    ? 
  ?     ?      ?         ? 
   ?       ?  ?' 
     ?'    ? 
          ?   ? 
           ? 
 
Prior to the introduction of ?' (i.e. the step prior to (12)a) and the subsequent 
identification (?, ?'), we have the following dominance order D: 
 
(13) D = ??, ??, ??, ??, ??, ??, ??, ??,.. 
??, ??, ??, ??,.. 
??, ?? 
 
The introduction of ?' then adds the following pairs to D: 
 
(14) D = ??, ??, ??, ??, ??, ??, ??, ??, ??, ?'?,.. 
??, ??, ??, ??, ??, ?'?, ??, ??,.. 
??, ??, ??, ?'?, ??, ??,.. 
??, ?'?, ??, ?? 
 
Assuming D is generally a weak partial order (transitive, antisymmetric, and reflexive), if 
we identify ? and ?' then we have ordering conflicts even if we do not copy all the nodes 
? dominates to the local domain of ?'; for example: ??, ?? and ??, ??, ??, ?? and ??, ??, 
and so on. If everything the upper "occurrence" of ? dominates is copied,[9] then we end up 
with the situation in (11)b/c and (12)b, and many more ordering conflicts would thus 
arise in virtue of creating domination symmetries for the elements {?, ?, ?} (so that we 
have both ??, ?? and ??, ??, etc.). 
 
[9] I will be referring to the notion of copying as a convenience. The idea here is that there is an operation (node 
identification) which results in the equivalent of copying, but that there is no specific "duplicating" operation which 
takes a single element ? as an input and produces a pair of identical outputs. 
 
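The way identification produces antisymmetry conflicts, and the way removing interveners resolves them, can be sketched as follows. The labels (X, A, B, and X2 for the lower occurrence) and the helper names are hypothetical stand-ins chosen for illustration; they are not the node labels of (12), and the chain-shaped structure is a simplifying assumption.

```python
def dominance_from_chain(chain):
    """Reflexive, transitive dominance pairs for a top-down chain of nodes."""
    return {(chain[i], chain[j])
            for i in range(len(chain))
            for j in range(i, len(chain))}


def identify(pairs, lower, upper):
    """Treat `lower` as the same node as `upper` by renaming it in every pair."""
    def rename(node):
        return upper if node == lower else node
    return {(rename(a), rename(b)) for (a, b) in pairs}


def symmetry_conflicts(pairs):
    """Unordered pairs of distinct nodes that dominate each other."""
    return sorted({tuple(sorted((a, b)))
                   for (a, b) in pairs
                   if a != b and (b, a) in pairs})


def remove_nodes(pairs, nodes):
    """Drop every pair that mentions one of `nodes` (contraction)."""
    return {(a, b) for (a, b) in pairs if a not in nodes and b not in nodes}


# X dominates A dominates B dominates X2, where X2 matches X.
D = dominance_from_chain(["X", "A", "B", "X2"])
identified = identify(D, lower="X2", upper="X")
conflicts = symmetry_conflicts(identified)

# Splicing the interveners out of the workspace removes the conflicts.
contracted = remove_nodes(identified, {"A", "B"})
remaining = symmetry_conflicts(contracted)
```

Here each intervener both dominates and is dominated by the identified node until it is removed, which is the situation contraction is meant to avoid in the workspace.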
 How might the system respond to the possibility of creating such ordering 
conflicts? Nunes (1995, 1999, 2004) addresses a similar problem as it arises in his 
development of the copy theory of movement set within the context of Kayne's (1994) 
Linear Correspondence Axiom (LCA). For Nunes the problem is that when an element ? 
is copied and (re)merged in a c-commanding position, similar kinds of ordering conflicts 
emerge since ?, in addition to now c-commanding "itself", both c-commands and is 
c-commanded by all the intervening nodes along the movement path. A Kaynean view of 
structure/order correspondence requires there be no such conflicts in order to map 
hierarchy to precedence.[10] Nunes reconciles the conflicts between the linearization 
demands imposed by the LCA and the symmetric c-command relations in the structure 
resulting from movement as copying by positing a mechanism he calls Chain Reduction, 
stated as follows:[11] 
 
 
(15) CHAIN REDUCTION: 
Delete the minimal number of constituents of a nontrivial chain CH that suffices 
for CH to be mapped to a linear order in accordance with the LCA
 
A similar idea can be employed to fit with the idea of removing elements from the
workspace (contraction/spell-out). The outcome we want is for ??? in (12)a to be
reintroduced in the output structure so, for example, the elements {?, ?, ?} will all
dominate ? in the output (this will be important for our treatment of certain connectivity
effects; see Chapter 3). But we want these intervening {?, ?, ?} elements to be spliced
out of the workspace so as not to introduce ordering conflicts.
Note that this requires some distinction between the workspace and the output to
ensure that what is problematic with respect to ordering conflicts in the workspace is not
problematic in the output. The nature of this particular difference relies on later
developments, but I will offer a sketch below.

[10] See also Chomsky (1995) for some discussion of this point, where the deletion of copies to satisfy the LCA,
conceived as a bare output condition on the PF side of the grammar, is proposed.
[11] Nunes additionally proposes a formal feature elimination procedure that is crucial to his analyses. I won't
discuss this here.

First consider what happens if we assume the following. Take the structure under
discussion prior to the addition of the element ?' which will match under relation ? with
?, and let us prune away some of the notation to focus on the relevant elements and their
ordering properties, as in (16):
 
(16)  
      ? 
    ?   ? ? ? ? ? ? ? 
   ?     ? 
   ?  ? 
 
Now we add ?': 
 
(17) ? ? ? ? ? ? ? ? ?' 
 
? 
 
(?, ?') satisfies ?, and the nodes are identified in the workspace. Since ? and ?' are now 
the same element, ? comes to be dominated by the intervening elements: 
 
(18) ? ? ? ? ? ? ? ? ? 
 
?               ? 
 
 
This creates no ordering conflict since ? was in no domination relation with the 
intervening elements prior to the identification. But the intervening elements do create 
ordering conflicts, and so the workspace must contract (splice-out the interveners) to 
respect the properties of the dominance order: 
 
 
(19) ? ? ? ? ? ? ? ? ?  ? 
     
ws
=
 
?               ?  ? 
 
     WORKSPACE AND OUTPUT JUST THE WORKSPACE
12
 
 
Although we have not yet specified the nature of the matching relation ?, the mechanics
of workspace contraction as just discussed follow from our workspace constraints in (8)
and (9) (together with ?). The ordering constraint in particular ensures that we will be
able to add the new domination relationships for ?, but: (i) we cannot add relations that
cause ordering conflicts and (ii) any elements that would create such problems
subsequent to the identification (?,?') must be spliced-out. And the addition of the new
domination relationships that effect the "lowering" of ? follows from the proposed
response of the system to a potential violation of the distinctness condition.
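
The expansion-contraction dynamics just described can be sketched procedurally. This is a deliberately crude illustration, not the formal system: the workspace is assumed to be a plain top-down list of category labels, identity of labels stands in for the (so far unspecified) matching relation, and specifiers and features are ignored. The function name `expand` and the labels `A`, `B`, `C` are hypothetical.

```python
def expand(workspace, output, category):
    """Add `category` at the bottom of the workspace, contracting if it repeats.

    Returns the list of interveners spliced out of the workspace (possibly empty).
    """
    output.append(category)              # the output keeps every token introduced
    if category in workspace:            # adding it would violate distinctness
        i = workspace.index(category)    # earlier occurrence to identify with
        spliced = workspace[i + 1:]      # interveners between the two occurrences
        del workspace[i + 1:]            # contraction: splice the interveners out
        return spliced
    workspace.append(category)
    return []

workspace, output = [], []
for category in ["A", "B", "C", "A", "B"]:
    spliced = expand(workspace, output, category)
    print(category, workspace, spliced)

print(workspace)  # ['A', 'B']
print(output)     # ['A', 'B', 'C', 'A', 'B']
```

The point of the sketch is only that the workspace never holds two tokens of one category, while the output accumulates all of them.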
What then of the output structure? Does it obey this (or any) ordering restriction 
or not? What about distinctness?
If we make the standard assumption that the items being combined are minimally
triples of semantic (?), phonological (?), and syntactic (F) information, ⟨?, ?, F⟩, then the
following line of thinking is available to us, and will in fact be central to our conception
of the WS/O-distinction: the workspace manipulates only F-properties.
In fact, we can take this a step further: "being in the workspace" could be
identified with "having F-properties". The general idea is another way of framing the key
intuition underlying the TCG approach. That is, categorial F-properties are a limited
commodity in the syntactic workspace. A given manifestation of the workspace can
contain exactly as many distinct elements as there are categorial distinctions in the system.
There is no limitation of this sort "outside the workspace" because being outside the
workspace just means that these formal distinctions are no longer connected with ⟨?, ?⟩
information.
On the general view developed here, the output structure is a syntactic object in
the sense of manifesting the formal ordering properties established in the workspace, but
it will be populated with only ? and ? properties. The output will in this sense be an
object of the interface systems, with the PF-component inspecting only the ?-vocabulary
and the LF-component inspecting only the ?-vocabulary, but with both sets of vocabulary
constrained by the same structure.[13]

[12] So the workspace has just one ? and one ?. The output, however, has two ?'s and two ?'s (or, rather, as I
will suggest in a moment, the output has two correspondents of ? and two correspondents of ?).
[13] Having structure housing both the ? and ? types of vocabulary also yields a venue for exploring primitive
?/? correspondences over such structures. For example, the well-known connections between prosody/intonation and
the semantics of focus would be one such area to explore with these mechanics. These matters are not explored here. In
general we will be concentrating mostly on what happens in the workspace, and how this might relate to the output.
However some brief remarks will be made about how we might think about relationships established over output
structures; these are suggested to be potentially truly non-local (examples include variable binding by a quantifier,
long-distance obviation effects, so-called unselective binding, etc.).

 The following illustrates the idea abstractly. Our general intuition of the
workspace having to "move-on" to expand new structure is pictured first for an abstract
domination order of elements:

 
(20) A ? B ? C ? A  ?   A ? B ? C ? A   ?   A ? B ? C ? A ?B 
 
Supposing then that {A, B, C} are the relevant formal properties, as the workspace moves
on what will be left in the output structure are the associated ? and ? properties of each
formal element {A, B, C}, (e.g., {
?:A
?:
, 
?:B
?:
,..}): 
 
(21) A ? B ? C ? A  ?   
?:A
?:
 ? B ? C ? A   ?   
?:A
?:
 ? 
?:B
?:
 ? C ? A ?B 
 
Now when we say that the workspace "moves-on", we understand this to mean that the
relevant formal properties which ?/?-pairs are connected to must be "reused" in
establishing new expansions of structure in the workspace. That is, if the
anti-recursion/distinctness condition in (9) holds, this means that such formal/syntactic
information must be stripped away from earlier introduced elements so that it can be used
to structuralize new/incoming ones.
 We can now illustrate the situation described metaphorically above, where an
unlicensed F-property is abandoned from the workspace:
 
(22) A ? B ? C ? A  ?   
?:A
?:A
*F:?
 ? B ? C ? A 
 
Assuming that there is no "backtracking" of the workspace, this will produce an anomaly
as the interface systems are confronted with an illegible element. Here we have marked
this offending property as "*F:?", though note that above we suggested that an element's
"being in the workspace" be identified with "having syntactic/formal properties". Below I
will be suggesting specific roles for licensing properties like WH, Case/agreement, and θ,
so the way this will actually be understood will be in terms of a failure of a formal
relation obtaining in the workspace leading to an illegible PF or LF property (see §1.2
below, and Chapter 3 for some discussion of features, valuation, and interface legibility).
 Regarding our concerns about ordering and repeated elements in the output: this is
now best viewed as a constitutive difference between the workspace and the output
structure. The distinction resides in exactly whether it is possible to represent multiple
tokens of a given type or not. In the output, this is possible. In the workspace, it is not.
The systems supporting the representation/processing of PF and LF vocabularies, that is,
are capable of handling multiple tokens of a given type; the narrow syntactic computation
in the workspace, which is stated over formal features/properties, cannot do this. This is
one of the central ideas underlying the TCG approach.
 An important idea here, discussed in §1.5, is that of thinking of movement
chains as sets of contexts/positions, though I will argue that we require a simpler view
than the one presented in Chomsky (1995). There it is suggested that we view contexts as
the entire structure derived up to the point where a moved/remerged item is
(re)integrated. I argue that returning to a simpler view, where the context is simply the
local label, and not the entire structure, allows us to view certain sets of contexts as
indistinguishable, yielding SCM.
 In the next section I develop some sample derivations for core cases of A- and
A'-movement to get some technical ideas on the table.
1.2. Local & Linked Local Relations: Sample TCG Derivations 
Now let us consider a pair of standard cases for which SCM analyses have been
deployed, in particular wh-movement and raising-to-subject (RtS). First, some
simplifications regarding structure and category will be helpful; I will return to discuss
these simplifications further in §1.5.
Consider the following case with multiple clausal embedding in (23)a, with the 
partial description in (23)b: 
 
(23) a. Dave thought Mary believed John liked pizza
b. [CP C0 [TP DP [T' T0 [VP V0 [CP C0 [TP DP [T' T0 [VP V0 [CP C0 [TP DP [VP V0 DP]..]
 
If we look at the "spine" of the clause as structured in (23)b (that is, the dominance
ordering running from the root to the most embedded element that manifests the sequence
of head-complement selection/projection relationships[14]), we see the following
sequence of major categorial distinctions between types of elements as in (24)b (ignoring
intra-phrasal projection level distinctions, thus collapsing any XP/X' to just X):
 
(24) a. [CP C0 [TP DP [T' T0 [VP V0 [CP C0 [TP DP [T' T0 [VP V0 [CP C0 [TP DP [VP V0 DP]..]

b.           C→T→V→C→T→V→C→T→V→D
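
The reduction from (24)a to (24)b can be sketched as a simple procedure. The fragment below is an illustration only, under the assumptions that the spine is given as a list of projection labels read top-down (specifiers already set aside), that the first character of each label names its category, and that adjacent labels of the same category are projection levels of one element. The function name `reduce_spine` is hypothetical, and the arrow is just a stand-in for the dominance ordering.

```python
def reduce_spine(labels):
    """Collapse projection levels (XP, X', X0) to bare categories, merging adjacent repeats."""
    spine = []
    for label in labels:
        category = label[0]      # assumption: first character names the category
        if not spine or spine[-1] != category:
            spine.append(category)
    return spine

# Spine of (24)a, read from the root down to the most embedded object.
clause = ["CP", "TP", "T'", "VP"]
labels = clause + clause + ["CP", "TP", "VP", "DP"]

print("→".join(reduce_spine(labels)))  # C→T→V→C→T→V→C→T→V→D
```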
 
This spine branches to include the external arguments in the specifier positions of the
T-elements associated with each verb, which we add to this reduced diagram as follows (the
branching, directional arc is superimposed here to clearly indicate the assumed
dominance ordering):

(25) C→T→V→C→T→V→C→T→V→D

   D        D        D
 
                                                
[14] On some views, the relation from functional-to-functional elements and functional-to-lexical elements is
discussed in terms of selection (e.g., C0 selects TP, T0 selects VP, etc.), perhaps with a distinction made between
"syntactic" and "semantic" selection (see, e.g., Abney 1987, 1991). On other views (Grimshaw 1991, 2002; van
Riemsdijk 1991, 1998) functional-to-functional and functional-to-lexical relations are governed by the notion of
(extended) projection, while "selection" is reserved for lexical-to-functional and lexical-to-lexical relations.

These reduced structures will be sufficient to make the points of interest here. Later on I
will argue that this should be seen not as mere expository convenience, but rather as a
view of structure that makes available the "best fit" with our core constraints on the
workspace (in (8)/(9)).[15] Now consider the following:
 
(26) a. John seems to tend to appear to like carrots
b. What did Dave think that Mary believed that John liked?

(27) a. John [seems [to tend [to appear [ _ to like carrots]]

b. What [did Dave think [that Mary believed [that John liked _ ]]?

(28) a. John [seems [ _ to tend [ _ to appear [ _ to like carrots]]

b. What [did Dave think [ _ [that Mary believed [ _ [that John liked _ ]]]?
 
There is something approaching a consensus in the literature that the examples in (26)a/b
(raising to subject/RtS and wh-movement) are best viewed as involving linked local
relationships of the sort pictured in (28)a/b, and not a direct "one-fell-swoop" relation as
in (27)a/b. This is not entirely uncontroversial, though I will canvass a range of facts in
Chapter 2 that are drawn from a variety of languages and which, taken together, strongly
suggest that something like these so-called successive cyclic movements (SCMs) are real.
 The TCG vision of these linked local dependencies can be schematized as in (29)
for the RtS case (I will return to the wh-movement case below). (29)a gives a bird's-eye
view of the entire derivation, with the first stage building the matrix clause structure as in
(29)b.
                                                
[15] I will be deploying reduced structures in this "horizontal" notation throughout this work. Structures of this
reduced type are essentially those argued for in recent work of Brody (1999, 2003), and can be seen as related to more
general efforts to downsize the array of label-types that analysis can appeal to. Collins (2001) is another such approach,
but one which is incompatible with the central intuitions I will be developing regarding successive cyclicity (and
"movement" generally). I will return to these issues below in §1.5.

 
(29) a.   seems    to tend   to appear    to like carrots

C→T→V→T→V→T→V→T→v→V→D

   D     D     D      D

b. C→T[φ:?, ?:n]→V

    D[φ:f, ?:?]
 
Assume that T and D both enter the derivation with Case (?) and agreement (φ)
properties. T-φ is unvalued, requiring a relationship with another element with valued-φ
(D-φ); assume the reverse holds for ?-properties (T-? is valued; take ? to range over {?,
n, a, d}, for "unvalued", nominative, accusative, and dative/oblique, respectively;
similarly, take φ to range over {?, f, g, h} where f, g, etc. are stand-ins for more complex
attributes and values like φ:NUM:plural, etc.).[16]

 I assume that D and T enter into a reciprocal valuation, essentially swapping
values: T retains φ and deletes ?, while D retains both valued properties, as follows (here
and throughout, I will mark alterations of feature properties (valuation, deletion, etc.)
with transitions like "φ:?→φ:f" = "unvalued feature φ gets value 'f'", or ?:n→? =
"valued feature ?:n deletes", as in (29)c):
 
(29) c. C→T[φ:?→φ:f, ?:n→?]→V      d. C→T[φ:f]→V

    D[φ:f, ?:?→?:n]                  D[φ:f, ?:n]
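
The reciprocal valuation in (29)c/d can be sketched as follows. This is only a minimal illustration under assumptions of my own: feature bundles are modeled as dictionaries, the keys `agr` and `case` are hypothetical names for the agreement and Case properties, and `None` stands in for the "unvalued" marker.

```python
UNVALUED = None  # stand-in for the "unvalued" marker

def reciprocal_valuation(t, d):
    """T values its agreement from D; D values its Case from T; T then drops Case."""
    t, d = dict(t), dict(d)          # work on copies, leave the inputs untouched
    if t["agr"] is UNVALUED:
        t["agr"] = d["agr"]          # agreement flows from D to T
    if d["case"] is UNVALUED:
        d["case"] = t["case"]        # Case flows from T to D
    del t["case"]                    # T deletes its Case property after the exchange
    return t, d

T = {"agr": UNVALUED, "case": "n"}
D = {"agr": "f", "case": UNVALUED}

print(reciprocal_valuation(T, D))  # ({'agr': 'f'}, {'agr': 'f', 'case': 'n'})
```

The result corresponds to (29)d: T keeps only its valued agreement property, while D carries both valued properties.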
 
 
[16] This particular formulation of feature relations follows earlier proposals (Castillo, Drury, & Grohmann
1999; Grohmann, Drury, & Castillo 2000; Drury 2000).

At the next step of derivation a "like element" (T) is introduced. I am assuming that
raising predicates (i) do not include a specification for an external argument (i.e., no
small-v, though one is present to introduce the external argument of the most embedded
clause, see (29)a above), and (ii) take defective T-complements (in roughly the sense of
Chomsky 1999). Thus, the second T could be viewed as distinct from the first, since they
differ in properties (the first/higher T has a φ-property that the second/lower T lacks):
 
(29) e. C→T[φ:f]→V→T

     D[φ:f, ?:n]
 
 
However, this is exactly the context in which we want the "reverse" of raising to occur.
Suppose then that we assume the following as a first pass on our so-far unspecified
matching relation ? from above:
 
(30) MATCHING RELATION ?: 
For two elements ? and ?, ??? if: 
(i) Either ? dominates ? or ? dominates ? 
 and 
(ii) Either ? subsumes ? or ? subsumes ?
 
The condition makes reference to the notion of SUBSUMPTION, common in unification-based
approaches to grammar deploying feature structures, specifically (from Shieber
1986:15):

(31) SUBSUMPTION:
A feature structure D subsumes a feature structure D' (D ⊑ D') if D contains a
subset of the information in D'.

For example, given a node labeled X and a node labeled X[F], the former will subsume the
latter since X contains a subset of the information in X[F]. Subsumption is thus the "more
general than" relation.
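
The two clauses of the matching relation in (30), together with subsumption as in (31), can be sketched as below. This is a rough illustration rather than a definitive implementation: feature structures are assumed to be flat dictionaries (no nesting or reentrancy), category is treated as just another attribute (`cat`), and the node names `T1`, `V1`, `T2`, the key `agr`, and the dominance pairs are hypothetical.

```python
def subsumes(general, specific):
    """True if every attribute-value pair of `general` also appears in `specific`."""
    return all(k in specific and specific[k] == v for k, v in general.items())

# Hypothetical workspace: a higher T with valued agreement, a raising V, a bare lower T.
nodes = {
    "T1": {"cat": "T", "agr": "f"},
    "V1": {"cat": "V"},
    "T2": {"cat": "T"},
}
dominance = {("T1", "V1"), ("V1", "T2"), ("T1", "T2")}

def matches(x, y):
    """Dominance in either direction plus subsumption in either direction."""
    ordered = (x, y) in dominance or (y, x) in dominance
    fx, fy = nodes[x], nodes[y]
    return ordered and (subsumes(fx, fy) or subsumes(fy, fx))

print(matches("T1", "T2"))  # True: the bare T is more general than T with agreement
print(matches("T1", "V1"))  # False: the categories differ, so neither subsumes the other
```

Treating category as an ordinary attribute is one way of reading the suggestion that the relation holds amongst categories and not just their features.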
 
 What we have above can be seen as a generalization of Chomsky's (1999) notion
of AGREE, though (i) it introduces the possibility of both upwards and downwards
valuation on the dominance order, and (ii) it extends the relationship to hold amongst
categories. On Chomsky's view, in contrast, such relationships are asymmetrical, with
unvalued elements ("probes") scanning their subordinate (c-command) domain for
matching elements that can provide them with values ("goals").[17] Thus, asymmetry in
valuation is taken to track asymmetry in formal ordering (e.g., goals can't typically value
probes they c-command).
We will see later on some potential troubles with this statement of matching; in
particular, when applied to individual features it causes locality problems even for fairly
simple examples (e.g., allowing valuation to go in either general direction on the
dominance ordering will be seen to make it difficult to understand how "he saw her" can't
mean "she saw him"; see §3.1 for discussion). For now however this way of thinking will
allow us to give a sketch of how things work.
Taking the matching relation ? to involve categories, and not just features of
them, might be taken to require some further comment. However, if there turns out to not
be a good reason to have a fundamental division between categories and features, then
this follows as a reasonably natural generalization of Chomsky's conception of
information flow and dependency-formation. What we will see rather is that features may
be divided into classes: some serve to relate elements internal to a domain (e.g.,
?-features) and potentially across such domains, while other features/properties (e.g.,
categorial distinctions like C, T, etc.) will serve to separate/distinguish elements within
domains and across domains. I will return to unpack these ideas more explicitly below.
The key idea to keep in mind is a feature-based view of domain boundaries: some
properties are responsible for holding things together within domains, and others are
responsible for keeping domains apart (or, as in SCM type relations, allowing limited
overlaps between domains).

[17] However, a similar kind of ?/? reciprocity as we have deployed here is present in a different form in
Chomsky's formulation (roughly: his idea is that ? gets valued as a reflex of Agree with a φ-complete element; I won't
comment further on this). However, on Chomsky's view the relation between the subject in a RtS construction is not
directly associated with its surface/PF-output position. Rather, the traditional line of having this element originate in its
θ-position is assumed. This, I believe, holds onto a residue of D-structure. Though not coded in terms of a level of
representation characterizing potentially unbounded objects (an infinite base component), it is nonetheless retained in
the notion that items must enter the derivation through a θ-position. I see no minimalist motivation for this restriction,
which is part of the motivation for exploring an alternative route regarding derivational directionality. However, we
will see that the alternative top-down conception is actually demanded by the general view of contraction as applied to
SCM phenomena.

 The general move that is being entertained here is to wed this generalization of
AGREE with a version of Chomsky's (1995) discussion of CHAINS formalized as sets of
context positions for an element ?. I will return to elaborate on this point below, but note
that what is being proposed here is essentially a "context" identification view of SCM
(see §1.5). Before turning to these and other related matters in detail let us first complete
the example derivation for RtS, and then take a look at one for wh-movement.
 Returning to the derivation in (29): since ? holds of ⟨T, T[φ:f]⟩, these nodes are
identified. Following the discussion above regarding identification and contraction and
maintaining a coherent ordering in the workspace, this results in the following, with the
raising predicate itself (V) "splicing-out" of the workspace, and D[φ:f, ?:n] being
"reintroduced" at the bottom of the dominance order:
 
(29) f. C→T[φ:f]→V→T[φ:f]                 C→T[φ:f]→V→T[φ:f]

     D[φ:f, ?:n]   D[φ:f, ?:n]           D[φ:f, ?:n]   D[φ:f, ?:n]

 
 
And recall that this contracted workspace on the right-hand side here is really just:

(29) f'. C→T[φ:f]

     D[φ:f, ?:n]
 
 
The addition of the further raising predicates for the derivation of (29) goes exactly the
same way, until the most embedded domain is reached. Prior to the introduction of the
embedded v-element hosting external-θ, we would again have a workspace like that in
(29)f/f'. Introduction of v, I assume, brings with it a θ-feature:[18]
 
 
(29) g. C→T[φ:f]→v[θ]

     D[φ:f, ?:n]
 
 
θ-features, I will assume, correspond/connect to thematic predicates in a
neo-Davidsonian sense (see, e.g., Parsons 1990, Schein 1993, Herburger 2000), relating a
participant variable to an eventuality/situation. I suggest that the participant variables of
such thematic predicates are inherently non-distinct and require valuation by ?/?
properties in order to be rendered locally distinct (not having these local properties
around can result in the identification of such variables, as I will suggest is relevant for
control and local anaphora, for example).
 In the present derivation the step in (29)g involves a local A-relation that the
superficially non-local relation to the matrix has been reduced to via successive
contractions of the workspace, in effect "carrying along" the matrix T ?/? properties. It
is this general property of these derivations which will allow us to dispense with
reference to so-called "EPP-features" or their like (see Chapter 2 and 3). Intermediate
specifier positions can exist, on this view, because (i) matrix ones exist, and (ii)
intermediate positions involve an informational superset (more-general-than) relation
with the corresponding matrix positions.

[18] Where the enclosure representing the workspace boundaries is not relevant I will simply leave it out.

I assume that θ in (29)g takes the value of the dominating agreement feature (here
φ:f) as in "θ[φ:f]". The suggestion is that θ-role assignment to the D-element is indirect,
essentially importing a notion very similar in spirit to Williams' (1994:33) notion of
"vertical binding".[19] The outcome of this valuation then is as follows:
 
(29) h. C→T[φ:f]→v[θ[φ:f]]

     D[φ:f, ?:n]
 
 
In general, I will be understanding A-relations and thematic discharge in this way: ?/?
exchange between T and D results in Case-marking of D and valuation of T-φ. The
φ-property then associates with θ, which essentially takes this value as an index marking
the participant variable (that is, ..θv⟨ _ , e⟩.. → ..θv⟨φ:f, e⟩..). I will return to elaborate on
this point of view.

[19] See also Williams (1983). A number of recent proposals of this kind have been entertained in the literature
as implemented within a feature system alongside an adoption of an Agree-type relationship of Chomsky's (1999) sort.
See Rizac (2004), Adger & Ramchand (2001), Butler (2004b), Koneneman & Neeleman (2003). Very similar notions
have had a long tradition in frameworks that work exclusively with feature logics (e.g., HPSG; see Shieber 1985;
Pollard & Sag 1994). Williams' proposal technically involves an indexing procedure connecting thematic roles with
dominating projections, with predication then occurring between maximal projections as a matter of index sharing,
thereby resulting in a connection to the lower (coindexed) θ-role. See Castillo, Drury, & Grohmann (1999) for some
earlier discussion of such feature relations and the notion of VP-internal subjects; and see also Drury (2000).

A'-relations will be viewed somewhat differently. However, the fundamental
notion of contraction and node/context identification will be understood to work in the
same way for (e.g.) SCM involving wh-elements. Whereas the identifications for
A-movement involved the T-domain, the relevant relations in A'-movement will be between
C-elements. Before turning to illustrate SCM of wh-elements, let us consider the local
case of wh-movement:
 
(32) Who _ likes pizza?
 
We will illustrate a derivation for (32) down to v (remember: "top-down") to show how
WH, ?/?, and θ information will be understood to relate.
As A-relations serve to establish a set of feature-licensing relations resulting in an
indirect view of θ-discharge, A'-relationships similarly provide a set of relations resulting
in indirect ?-assignment. Assume that wh-elements come with a WH-property which (i)
takes ?-features as values, and (ii) matches and deletes WH on D. Assume C has
unvalued φ as well, so we have the following in (33):
 
(33) a. C[φ:?, WH]→T[φ:?, ?:n]→v[θ]          b. C[φ:?→φ:f, WH]→T[φ:?, ?:n]→v[θ]

 D[φ:f, WH, ?:?]                             D[φ:f, WH→?, ?:?]

 
The now valued φ-property of C serves to value T-φ, and WH takes the ?-property of T as
a value (WH[?:n]). In virtue of these relations D-? can now be valued by C:
 
(33) c. C[φ:f, WH→WH[?:n]] → T[φ:?→φ:f, ?:n] → v[θ]

 D[φ:f, ?:?→?:n]
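
The sequence of valuations in (33)a-c can be sketched step by step. As before, this is an illustration under assumptions of my own rather than the formal proposal: bundles are dictionaries, `agr`, `case`, and `wh` are hypothetical key names, `None` marks an unvalued property, and the function `wh_valuation` simply applies the steps in the order the text gives them.

```python
def wh_valuation(c, t, d):
    """Apply the valuation steps of the local wh-derivation in order."""
    c, t, d = dict(c), dict(t), dict(d)   # copies, so the inputs stay unchanged
    c["agr"] = d["agr"]      # C's agreement is valued by D
    del d["wh"]              # WH on D matches WH on C and deletes
    t["agr"] = c["agr"]      # the now valued agreement of C values T's agreement
    c["wh"] = t["case"]      # WH takes T's Case property as its value
    d["case"] = c["wh"]      # D's Case is valued via C
    return c, t, d

C = {"agr": None, "wh": None}
T = {"agr": None, "case": "n"}
D = {"agr": "f", "wh": None, "case": None}

c, t, d = wh_valuation(C, T, D)
print(c)  # {'agr': 'f', 'wh': 'n'}
print(t)  # {'agr': 'f', 'case': 'n'}
print(d)  # {'agr': 'f', 'case': 'n'}
```

Note that, unlike in the A-relation sketch, T keeps its Case property here; the exchange with D is routed through C.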
 
 
Thus, the presence of the WH-property serves to mediate the ?/? swap of values. Like the
A-relation case, there is some back-and-forth directionality to the flow of information in
these feature-relationships. In A-relations, recall from above, we saw ?- and ?-valuation
going in opposite "directions". Consider (29)c/d again:
 

(29) c. C→T[φ:?→φ:f, ?:n→?]→V      d. C→T[φ:f]→V
         ?          ?
    D[φ:f, ?:?→?:n]                  D[φ:f, ?:n]
 
 
In the wh-movement case (A'-relation) the same holds, though the WH-property is alleged
to be implicated in a mediating role (φ goes from D to C to T; ? goes from T to C to D):
 
          ?
(33) c. C[φ:f, WH→WH[?:n]] → T[φ:?→φ:f, ?:n] → v[θ]
       ?
 D[?:?→?:n, φ:f]
 
 
Now, as with the basic A-relation case discussed above, the φ-property of T will index
the thematic position:
 
(33) d. C[φ:f, WH[?:n]]→T[φ:f, ?:n]→v[θ→θ[φ:f]]

 D[φ:f, ?:n]
 
 
The assumption here then is that indexing the participant variable of the θ-role with ? is
to close it off (saturate it): the ?-property can only become connected with this
?-property if it has been valued in a way that has resulted in the assignment of Case. This
happens in two possible ways now: (i) as in the A-relation, where ? is connected to an
overt nominal marked ?, and so the ?-variable will be connected with the semantic
properties of that element, or (ii) it is connected with a "bound ?", associated with the
upper WH property; that is, connected with an "individuator" in our terms.
 There is a version of a traditional view being implemented here. In GB-era terms
(e.g., Chomsky 1981) we have the ideas that "?-marked traces" are "variables" and that in
general ?/? are intimately connected with θ-theory. I will return to these matters in
further discussion in Chapter 3.
 Let us consider now the picture we have of local A- and A'-relations side-by-side: 
 
(34) C[φ:f, WH[?:n]]→T[φ:f, ?:n]→v[θ[φ:f]]      C→T[φ:f]→v[θ[φ:f]]

D[φ:f, ?:n]                                    D[φ:f, ?:n]

   A'-RELATION                                 A-RELATION
 
In the A-relation, we have φ-features which form the connection between elements,[20] and
in the A'-relation there is a mix. That is:
 
 
(35) C[φ:f, WH[?:n]]→T[φ:f, ?:n]→v[θ[φ:f]]      C→T[φ:f]→v[θ[φ:f]]

D[φ:f, ?:n]                                    D[φ:f, ?:n]

   A'-RELATION                                 A-RELATION
 
This is one reasonable way of specifying the flow of feature-licensing information in
local domains. ?-information flows from items that are specified to those that are not,
filling in the values along the path; and ?-information does the same, though in the
opposite direction on the path.
 The suggestion is that once we have a valuation mechanism of the AGREE-sort that
has recently been appealed to in elaborating minimalist syntactic theories (Chomsky
1999), then it seems there are some fairly straightforward ways to make it do most, if not
all of the work.

[20] I am ignoring here any ?-properties that may be associated with C in the A-relation example. We might
assume that C-? can manifest an open clause, as with relatives, if we attribute a non-WH operator property to C. Later I
will explore the idea that ? on non-finite C (without an operator property) is what allows indexing of non-?-marked ?
(control).

We need a mechanism to "build", for example, extended projection sequences in
the verbal domain. On a traditional movement story, we interleave the building of such
sequences via merge operations with movement/remerge relationships involving nominal
expressions as each of the relevant levels of structure is constructed. So, a θ-assigning
element relates to a nominal, discharging its role to that nominal; further operations add
higher projections specifying other licensing properties (Case/agreement), which we then
need to relate to the nominal element as well (so we have an A-movement); further
categories/features are added above that, which may provide yet further licensing
properties, and so we relate the nominal expression yet again to the next highest layer (so
we might have an A'-relation).
 But we can view a local A'-relation complex in at least the following two different
ways: (i) Dwh-V, Dwh-T, and Dwh-C, or (ii) C-T-V + Dwh. Below (§1.5.1) I will review
Chomsky's (1995) discussion of chains as sets of contexts, and suggest that coupling a
simplified version of that with an AGREE-type mechanism yields the following picture:
 
(36) a. In local sequences of categories (which will, in accordance with the
anti-recursion provision in (9), not include repeated "like elements") like
features co-value,

  and,

b. Encountering like categories results in a similar "co-valuation" (like
elements are identified, though, as with feature-valuation generally, only
so long as one subsumes the other).
 
Intuitively, categorial differences in local domains prevent collapse of nodes: dislike
elements "repel" one another. But this does not stop like features from identifying with
each other within such local domains (likes "attract" one another).[21] Across local
domains, we have the possibility of interactions because the edges of such domains can
become identified by keeping this same attract/repel logic in place (a like category is
introduced, and this can result in identifications which allow a kind of domain expansion,
as we saw above: a kind of copying/lowering).
 Note also that the feature-relationships in our schematic A'-relation build in a
useful property. Consider how a standard "copying" view of SCM of wh-elements works:
 
(37) a. Which picture of himself did John think Bill liked _ ?
b. [which..self] did John think [which..self] Bill liked [which..self]
 
It has been noted that on the copy view of such movements there must be some operation
which ensures that the actual wh-operator does not appear in all the lower copies. As
Safir (1999: 591) points out, quantifiers cannot bind other quantifiers, and somehow the
lower copies must be understood as variable-like elements. Accordingly, one or another
variant of the following sort of operation is typically taken to be in effect (Munn 1998
calls this "Make OP"; this particular illustration is taken from Safir's discussion):[22]
 
 
(38) a. Whose mother did Bill see _  ?
b. whose [whose mother] did Bill see [whose mother]
 STEP 1: "lift" the operator out
c. whose [x mother] did Bill see [x mother]
  STEP 2: make variable/delete-WH
 
This operation is built into the WH-licensing discussed above (see (33)a/b). The
implementation is in terms of matching D and C WH-properties, with deletion of this
property from the D-element, but the effect is the equivalent of D projecting its WH-
properties to C (there are several ways that we could implement the idea; the one given
above is simply one such way).

21 See van Riemsdijk (1998) for the working out of an intuition which similarly makes use
of attract/repel, but in a rather different way from what I am entertaining here (his
"Categorial Feature Magnetism").
22 See Chomsky (1995:203), Munn (1994:39), Fox (1999, 2002), and Safir's (1999:591)
discussions. See also Fox (2003) and the notion of Trace Conversion.

If we keep with our assumptions above, including the assumption of a top-down
derivation, then we can understand the "operator" properties to be housed in C, leaving
the D-wh phrase itself with a "hole" indexed by its ?-property. The result then is that the
local structure provided above (repeated here) will have a logical form of the sort in (39)b:
 
(39) a. C
?:f
WH[?:n]
?T
?:f
?:n
?v
?[?:n]
 
 
 D
?:f
?:n
 
 
b. WH[?:n] ..[ ?:n ].. ?
v
??:n, e? 
  
(i.e.,..wh(x) .. [..x..] .. [..Px..] 
      whose x .. [..x mother..] .. [..Px..]; as in (38)c) 
 
Now the top-down structure of this story makes it possible to understand the equivalent 
of a Make-OP sort of operation as happening on the first step (when D and C are 
integrated). 
 However, note that on longer distance wh-movement (the above example is local
association with subject ?/?-?) the WH won't have a ?-value until it encounters a lower
valued-?. According to the logic of category identification and lowering sketched above, we
might take this to result in an operator being successively lowered to each new domain
edge, along with the residue of the wh-element itself (e.g., roughly [x NP] = "D_?:?" here):
 
(40) a. C_WH[?:?] ?..? C_WH[?:?] ?..? C_WH[?:?] ?..?

        D_?:?          D_?:?          D_?:?
 
 
This gives us part of what we may want for SCM, which is variable-like elements in all
the intermediate positions, but it also gives us something that we don't want, namely the
wh-operator at all the intermediate positions as well (i.e., recall the point from above that
quantifiers don't bind other quantifiers).
 Below I will suggest a way, relating to some ideas proposed by Uriagereka (1999)
and from previous work of my own (Drury 1998), which appears to have the right
properties to naturally yield the result that we do want, which looks more like this (in
terms of what we want in the output structure):
 
(41) C_WH[?:?] ?..? C ?..? C ?..?

     D_?:?          D_?:?  D_?:?
 
I return to this briefly below in discussing MSO-systems (§1.4.1), and again when we turn
to analysis in Chapter 3.
1.3. Summary So Far,..& The Path Ahead 
We have so far introduced a few key ideas. Let's sum up before proceeding. We have
posited a workspace/output-distinction (henceforth: WS/O-distinction). In the course of
elaborating on the key intuition we have suggested that the distinction be understood as a
dividing line between the systems that manipulate elements by handling their syntactic
properties only, versus those that handle the ?- or the ?-properties. Moreover, some of the
formal/syntactic properties (e.g. WH, ?/?, and ?) have been understood to play a direct
role in mapping out logical form distinctions. One way to look at this claim is to view the
"workspace" as we have sketched it so far as "the interface" between the sound-meaning
systems and whatever system(s) are responsible for the general ordering properties of
lexical/functional extended projection sequences. The suggestion above was that the
workspace is a dynamic interface between "the lexicon" and the PF/LF systems.
 Prior to this, we outlined some consequences of workspace restrictions stated over
ordering and category distinctness, and showed how the combination of these ideas yields
a schema for analysis that seems to provide for a novel view of successive cyclic type
movement. The mechanics were suggested to follow on a natural generalization of an
AGREE-type operation/relation of the sort studied in Chomsky (1999), broadened to
include categorial identifications in a way that allows cross-domain interactions in virtue
of a "carrying-over" of higher properties of elements into lower domains (via node-
identification under subsumption). Some specific assumptions for A- and A'-relationships
were sketched, providing a general (though reasonably detailed) outline of the approach
to be further constrained and deployed below (Chapter 3).
 The availability of the type of analyses relevant to SCM phenomena will be
argued here to be extremely interesting in the minimalist setting. As we will see, the
approach makes available a route for analysis which does away with any appeal to (what
I argue are) spurious movement-driving features that have been taken to underwrite SCM
in much current minimalist work (so-called EPP/P-features, which I will generally
refer to as Move/Merge-features or "M-features").
 There are, however, a number of component ideas in play here that require some
further background discussion before proceeding. For example (i) the motivations for the
reduced phrase-structure graphs deployed above, (ii) the idea of relegating all
"movement" relationships to one or another type of category/feature relationship on the
dominance path, and how this could be connected with other existing lines of thought
regarding movement/chains, (iii) the generality of the WS/O-distinction, (iv) the
conceptual connections to other proposed MSO-type systems as well as to TAG
approaches.
Additionally, one particular consequence of the TCG view, mentioned briefly
above, is worth bringing up again here before heading into more detailed discussion. The
general structure of the account of SCM effects demands that syntactic derivations be
viewed as assembling structure roughly top-down, instead of bottom-up as assumed in
Chomsky's widely adopted Bare Phrase Structure (BPS; Chomsky 1994, 1995). This
move (inverting the direction of derivation) on its own teaches us nothing about
successive cyclicity.23 However, coupled with the right alternative views regarding
structure and categories/features and how they might be generated by a derivational
system, directionality can be seen to play a crucial role. This somewhat unorthodox
outcome converges with the results of a number of other recent investigations which have
reached similar conclusions regarding directionality and syntactic derivations (see in
particular Phillips 1996, 2003).24

23 Contra a discussion in Terada (1999), who suggests that successive cyclicity effects can be
better understood within the incremental/left-to-right view of derivations proposed in Phillips
(1996). While I agree that derivational directionality may be important, nothing in Terada's
discussion establishes this conclusion. In Terada's proposal intermediate movements are
stipulated to be driven by features as in many other minimalist approaches (positing what I call
Merge/Move-features or M-features; see Chapter Two below). Terada appears to think that having
the ultimate licensing feature (e.g., +wh for question-formation) checked "first" helps in some
way with "look-ahead" issues. But the logic relies on a spurious division between the 'top-most'
licensing properties and lower ones (like Case/? and ?-properties). The "look-ahead" problem is
symmetric. In a bottom-up approach Case/? and ? are locally licensed but (e.g.) a wh-element
must somehow eventually reach its corresponding licensing context, so where there is multiple
embedding there is a look-ahead issue (the wh-element needs to "know" that the right licensing
property will eventually show up). But the same goes on a top-down (or left-to-right/incremental)
view, just the other way around (a wh-element needs to "know" that Case/? and ? information will
eventually show up, though its wh-property may be licensed immediately upon entering the
derivation). The mystery/problem/puzzle of SCM effects is rather about why there are ever any
movement operations other than those which would connect these basic (wh, Case/?, ?) properties.
Why are there intermediate movements (chain-links/traces)? Positing M-features (e.g., EPP/P-
features in Chomsky's parlance) to drive intermediate movements (as Terada does) is not a
solution; that is the problem. I see nothing in Terada's discussion that adds to our understanding
of why derivations ought to have a non-standard "direction" nor why successive cyclic movement
ought to exist. On the present approach, in contrast, direction of derivation is demanded in
order for things to work. I thank Cedric Boeckx for pointing me to Terada's paper.

In addition to this difference with respect to standard BPS, the TCG approach
differs as well from the structure of TAG derivations, which are taken to obey a Markovian
condition insisting that it be locally determinable whether a given pairwise combination
of tree-structures is licit or not. One effect of this condition in TAG is an ordering
freedom which, for cases beyond pairwise combination of trees, allows the possibility of a
many-to-one mapping of derivation structures to derived structures.25
I think a large part of the interest in the mechanics of the TCG system is that it has 
this fairly abstract general requirement regarding derivational directionality. What I am 
unsure about at present is what the ultimate significance of these ordering differences
might be for the study of grammar qua "system of human knowledge" (i.e., as properties
of a competence-level theory). 
There are, however, some obvious points of interest to be made with respect to
connecting grammatical theory and parsing (and perhaps production). The treatment of
linked-local relationships here in effect introduces a way that displaced constituents can
be in a sense buffered as structure is expanded and then re-accessed as lower domains are
constructed. The structure of the TCG account thus does not require the explicit add-on
of a memory stack or related storage devices that have been appealed to in the past in
discussions of filler-gap dependencies in the context of parsing theories. The functional
equivalent of such a device is, as we saw in the sketch offered above, an essential component
of the basic mechanics. I will not be concerned here with these issues, though it's worth
keeping them in mind in the background.

24 Other work along this same line includes previous work of my own, Drury (1997, 1998,
1999a,b), and a number of others including Boeckx (1999), Guimarães (1999, 2004), Richards
(1999, 2001), Terada (1999).
25 That is, as in some other approaches (like classical Categorial Grammar (CG) or Steedman's
Combinatory Categorial Grammar (CCG)), TAG derivations manifest a kind of so-called "spurious
ambiguity". This label is somewhat of a misnomer both in CG and in TAG, as both approaches have
suggested that the relevant derivational ordering alternatives are not in fact "spuriously"
ambiguous but rather do make linguistically significant distinctions. See Frank (2002) for
discussion.

In the next section I back up to consider the general structure of some proposed
MSO-systems and TAG, looking at the structure of these approaches in terms of our
WS/O-distinction. Following this I turn to some technical discussion further motivating
some of the component ideas of the implementation of TCG pursued here.
1.4. MSO-Systems, TAG, & Generalizing the WS/O-Distinction 
MSO-systems as they have emerged within the MP, generally speaking, are
derivationally oriented models which parcel structure assembly into principled stages in
virtue of applications of an operation called Spell Out (SO). Depending on assumptions
varying across implementations, different sorts of MSO-systems effect different
partitions of syntactic complexes (or stages of derivation) into local domains or chunks.
Common across implementations is the general idea of SO as an operation that is
periodically applied in the course of derivation resulting in a reduction or contraction of
structural descriptions by shunting or transferring portions of structure to neighboring
systems with which the syntax must interface. In this manner evaluation of certain
aspects of well-formedness of syntactic complexes is suggested to be divided such
that sub-parts of structure are independently inspected by the principles governing the
interfaces.
 
 The general idea of multiple spell out has a number of antecedents in earlier
literature.26 Within the context of the development of the Minimalist Program (MP) it
arose in consideration of the architecture proposed in Chomsky (1993), which contained
a weak residue of Government & Binding (GB) theory's level of S-Structure. Rather than
a full-fledged level of representation, this S-structure hold-over was simply taken to be a
"point" of derivation as discussed in Chomsky (1995:229):
 
at some point in the (uniform) computation to LF, there is an operation Spell-Out that applies to the
structure Σ already formed. Spell-Out strips away from Σ those elements relevant only to π,
[emphasis mine-JD] leaving the residue Σ_L, which is mapped to λ by operations of the kind used to
form Σ. Σ itself is then mapped to π by operations unlike those of the N[umeration]→λ computation.
We call the subsystem of C_HL that maps Σ to π the phonological component, and the subsystem that
continues the computation from Σ_L to LF the covert component. The pre-Spell-Out computation we
call overt.
 
This passage characterizes the core properties of the minimalist Y-model:27

 
(42)                overt component            covert component
                                                        LF
     Lexicon → Numeration → MERGE/MOVE → Spell-Out
                                                        PF
                                         phonological component
 
The development of MSO-systems in more recent work questions the idea of a single
"point" of Spell-Out.
1.4.1. MSO and Linearization 
Uriagereka (1999) was to my knowledge the first to propose within the setting of the MP
that we ought to regard spell-out not as a single point in the syntactic derivation, but
rather as a procedure that can apply more than once, perhaps limited by economy
principles (e.g., perhaps of the general Last Resort variety, mandating that no operation
occurs unless necessary to ensure convergence).

26 See, e.g., Jackendoff (1972) and Bresnan (1971, 1972).
27 I have made no mention so far of the notion of "Numeration" in this model (as an intermediary
between the Lexicon and the syntactic derivation). This object will play almost no role here,
though see our concluding discussion in Chapter 3.

Uriagereka's proposal draws on Kayne's (1994) proposed
correspondence relation between linear precedence and c-command. Supposing with
Kayne that asymmetric c-command relationships map to linear precedence, and taking
Chomsky's view of spell-out as a process of stripping away "those elements relevant only
for π" (see above), Uriagereka suggested that we identify domains for spell-out with sub-
structures which constitute total/connected c-command orders. He argues that this allows
a simplification of Chomsky's (1995) implementation of Kayne's LCA which avoids the
need to state an induction step to cover the linearization of parts of complex structures
with respect to parts within other such complexes. To illustrate, consider:
 
(43)           D 
 
    A            D 
 
a     B      d     E 
 
   b     C        e     F 
 
 
A version of Kayne's general idea would be to claim that where α asymmetrically c-
commands β, α precedes β.28 For the sub-structure dominated by A above this would
yield the order ⟨a, b, ..⟩. But while we say that A asymmetrically c-commands the
elements dominated by the two-segment category D, the elements dominated by A do
not.29 Therefore we need to add a step to the linearization procedure. That is, there are
two separate c-command domains here which each independently constitute a
total/connected order. But b and e, for example, are not so ordered. So we need a step to
tell us that all the parts of the A-substructure are to precede all the parts of the D-
substructure so long as A asymmetrically c-commands D (see Kayne's (1994) discussion
for his handling of this issue).

28 See Kayne (1994) for a conceptual argument that asymmetric c-command ought to be mapped to
precedence, as opposed to subsequence. See also Uriagereka (1998) and Chametzky (2000) for
important related discussions.

 Uriagereka's suggestion is that since independent c-command domains are
trivially linearizable (i.e., they do not require appeal to an induction step of the sort
informally sketched above), they independently undergo spell-out. The output of this
procedure could be understood as a flattened structure which we regard as still "there" in
the computation, but whose internal parts are frozen and therefore unable to undergo
further interactions in any later stages of derivation. Alternatively, separately linearized
substructure could be regarded as sent to the PF-component, leaving only a residual
place-holder element "@", with some minimal specification of category/feature
information relevant to the interaction of the spelled-out unit A with the rest of the
structure.30 These two options are sketched here:

(44) a.        D              b.        D

          A         D              @_A       D

      ⟨a, b,..⟩   d     E                  d     E

                      e     F                  e     F

29 I'm mixing in Kayne's assumptions regarding specifiers as adjuncts; this assumption is
replaced in Chomsky's discussion by assumptions regarding intermediate-phrasal-level
"invisibility" in order to get the right asymmetries for c-command to hold. This is inessential
to the overview I am giving of Uriagereka's proposal in the main text, though I should make it
clear that he otherwise follows Chomsky's BPS approach in his specific formulation.
30 Perhaps simply core licensing properties like Case, agreement, wh, etc.
 
 
The intuition in both implementations is that spelled-out structure functions like a derived
terminal, allowing a trivial statement of structure/order correspondence (α precedes β ↔
α asymmetrically c-commands β). Precedence between elements which do not
themselves enter into an asymmetric c-command relationship falls out from the structure
of derivations involving separate linearization of c-command domains.
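
The derived-terminal idea can be sketched procedurally. The Python fragment below is a minimal illustration under simplifying assumptions that are mine rather than Uriagereka's: trees are binary (label, left, right) triples, the capital-letter leaves C and F of (43) are treated as unanalyzed terminals, every complex left branch is spelled out as its own unit, and the placeholder naming ("@" plus the label) and the function names are hypothetical.

```python
from typing import Dict, List, Tuple, Union

Tree = Union[str, Tuple[str, "Tree", "Tree"]]  # leaf, or (label, left, right)

def spell_out(tree: Tree, units: Dict[str, List[str]]) -> str:
    """Linearize one c-command domain top-down and return its placeholder."""
    if isinstance(tree, str):
        return tree
    label = tree[0]
    order: List[str] = []
    node: Tree = tree
    while not isinstance(node, str):
        _, left, right = node
        # A complex left branch is linearized separately and then counts
        # as a single derived terminal within the current domain.
        order.append(spell_out(left, units))
        node = right
    order.append(node)
    units["@" + label] = order
    return "@" + label

def flatten(symbol: str, units: Dict[str, List[str]]) -> List[str]:
    """Expand placeholders back into the full terminal string."""
    if symbol not in units:
        return [symbol]
    return [t for part in units[symbol] for t in flatten(part, units)]

# The structure in (43): A is a complex left branch of the two-segment D.
tree_43: Tree = ("D", ("A", "a", ("B", "b", "C")), ("D", "d", ("E", "e", "F")))
units: Dict[str, List[str]] = {}
root = spell_out(tree_43, units)
full_order = flatten(root, units)
```

On this encoding the A-substructure is recorded as a unit of its own, and the main domain contains only the placeholder for it, so no separate induction step is needed to order b with respect to e.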
 Uriagereka specifically proposed that non-complements might be understood as
the structures that must undergo independent linearization in the sense just sketched, and
further argued that Condition on Extraction Domain (CED) effects (Huang 1982) could
be understood to follow from this. So the cases in (45) would be understood to be
ungrammatical because the bracketed sub-structures would have to be independently
spelled-out, making their contents "frozen" and thus inaccessible to further merge/move
operations (the relative clause in (45)c would be out on the standard assumption that
these structures involve adjunction and thus are also non-complements):
 
(45) a. *What do [explanations of e] bother you? 
b. *What was Mary bothered [because Peter explained e]? 
c. *What do you know [the girl] [that _ explained e] 
 
Thus, on this linearization-driven view of MSO, we have a potential account of at least
these particular so-called strong islands.
However, it's not obvious why it should be that subjects and adjuncts need to be
independently spelled-out, as opposed to the structures they associate with; either
option would seem to permit the simplification of the linearization procedure.31
In Drury (1998, 1999) it is proposed that Uriagereka's linearization-based view of
MSO be put together with incremental derivations of the sort proposed in Phillips (1996)
(see also Phillips 2003). We can frame a version of this proposal within our WS/O-
distinction as follows:
 
(46) Workspace Connectedness (C-Command):
The elements in a given syntactic workspace must manifest a connected
c-command order (i.e., for every x, y in the set, either x c-commands y or
y c-commands x)
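
The condition in (46) can be checked mechanically. The following Python sketch is only an approximation under assumptions of my own: trees are binary and encoded as a hypothetical node-to-mother map, c-command is computed from the immediately dominating node, sisters are allowed to c-command each other, and the condition is evaluated over terminal elements only.

```python
from itertools import combinations
from typing import Dict, Iterable, Optional

Parent = Dict[str, Optional[str]]  # node -> mother (None for the root)

def dominates(parent: Parent, x: str, y: str) -> bool:
    """True if x properly dominates y."""
    node = parent.get(y)
    while node is not None:
        if node == x:
            return True
        node = parent.get(node)
    return False

def c_commands(parent: Parent, x: str, y: str) -> bool:
    """x c-commands y: neither dominates the other, and x's mother dominates y."""
    mother = parent.get(x)
    if x == y or mother is None:
        return False
    if dominates(parent, x, y) or dominates(parent, y, x):
        return False
    return dominates(parent, mother, y)

def connected(parent: Parent, elements: Iterable[str]) -> bool:
    """(46): every pair of elements is related by c-command in some direction."""
    return all(
        c_commands(parent, x, y) or c_commands(parent, y, x)
        for x, y in combinations(list(elements), 2)
    )

# A simple spine [R a [S b c]] versus a structure with a complex left branch
# [R [X x y] [S b c]]; both encodings are illustrative only.
spine: Parent = {"R": None, "a": "R", "S": "R", "b": "S", "c": "S"}
branching: Parent = {"R": None, "X": "R", "S": "R",
                     "x": "X", "y": "X", "b": "S", "c": "S"}

spine_ok = connected(spine, ["a", "b", "c"])
branching_ok = connected(branching, ["x", "y", "b", "c"])
```

The spine satisfies the condition, whereas the structure with a complex left branch does not, since the terminals inside that branch stand in no c-command relation to those in its sister.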
 
Recall our top-down schema of the WS/O-distinction from (2), repeated here as (47):
 
(47)  
 
 
 
 
This derivation would have workspaces which all obey (46). The following workspace
would not: 
 
(48)  
 
 
 
                                                
31 See Drury (1998, 1999), Johnson (2000) for some discussion of this and related points. See
also Uriagereka (2002) for criticism of Uriagereka (1999), and a working through of some
alternative possibilities that for reasons of time and space I will not consider in the present
work.

On a top-down view of structure expansion, the c-command path that was first assembled
would have to "spell-out". We can envision spelled-out structure as being "ejected" from
the workspace as follows, in the spirit of our proposed contractions discussed above (§1.1
& 1.2):
 
(49)  
 
 
 
The shaded node above would still be visible/present in the workspace, but the structure
it dominates would be excluded (removed from the workspace = spelled out).
 There are numerous details here that require elaboration (e.g., with respect to
symmetry vs. asymmetry of c-command between sisters; see Kayne 1994, Chomsky
1995), but the basic idea would be that the workspace would be limited to only contain
trivially linearizable structure as in Uriagereka's proposed simplification of a Kayne-type
order/structure correspondence. This view, however, doesn't include anything that gets around
the objections raised above (e.g., regarding which of two sisters spells-out, etc).
I put it on the table now to underscore the generality of the key idea of viewing spell-out
of sub-structures as essentially being "kicked out" of the active workspace (we will see
other illustrations below).
 Note that the reduced structures introduced in the sample derivations in §1.2 were
suggested to involve only a dominance ordering. Suppose that we were to modify the
proposed restriction in (46) to refer to dominance in our reduced structures, as in (50):
 
(50) Workspace Connectedness (DOMINANCE):
The elements in a given syntactic workspace must manifest a connected
dominance order (for every x, y in the set, either x dominates y or y dominates x)
 
 
At each branching point in the top-down expansion of the domination sequence,
something would have to "spell-out" (be voided from the workspace).
 To illustrate, take the following nodes to be introduced in the order indicated by
their number. The initial sequence ①→② would satisfy (50), but the addition of ③
would add a domination relation between ① and ③ but no such relation between ② and
③, so we could take ② to be required to "spell-out" (be voided from the workspace).
Subsequent spell-outs would occur for the same reason (SO1, SO2, etc.). The result of
this condition is binary branching.

(51) ①

SO1 ②   ③

SO2      ④   ⑤

SO3          ⑥  ⑦
etc,..
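
The spell-out pattern in (51) can be simulated directly. The following Python sketch is a minimal illustration under my own assumptions: nodes are plain integers introduced in numerical order, each addition names its mother, and a node is voided from the workspace as soon as it stands in no dominance relation to the newly added node. The function name `expand_top_down` is hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

def expand_top_down(
    additions: List[Tuple[int, Optional[int]]]
) -> Tuple[List[int], List[int]]:
    """Add (node, mother) pairs top-down, enforcing the condition in (50).

    Returns the final workspace (a single dominance chain) and the list of
    nodes that were spelled out, in the order in which they were voided.
    """
    mother_of: Dict[int, Optional[int]] = {}
    workspace: List[int] = []
    spelled_out: List[int] = []

    def ancestors(node: int) -> Set[int]:
        found: Set[int] = set()
        current = mother_of[node]
        while current is not None:
            found.add(current)
            current = mother_of[current]
        return found

    for node, mother in additions:
        mother_of[node] = mother
        above = ancestors(node)
        # Any workspace node that does not dominate the new node is
        # unordered with respect to it, so it must be voided.
        for old in list(workspace):
            if old not in above:
                workspace.remove(old)
                spelled_out.append(old)
        workspace.append(node)
    return workspace, spelled_out

# The expansion in (51): 2 and 3 under 1, 4 and 5 under 3, 6 and 7 under 5.
steps = [(1, None), (2, 1), (3, 1), (4, 3), (5, 3), (6, 5), (7, 5)]
workspace, spelled_out = expand_top_down(steps)
```

On this encoding the workspace ends up containing only the odd-numbered dominance chain, with the even-numbered nodes voided one by one at each branching point.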
 
Note that these spell-outs would have to work differently than the general shape of
Uriagereka's proposal. Since the connectedness requirement would be stated over the
dominance relationship, the even-numbered nodes would literally have to be "gone" from
the workspace. So this raises the question as to how they might interact with later
structure.
 However, recall from our sketch of A- and A'-relations above that the connection
between items is mediated by various kinds of feature-exchanges (valuations, etc.). On
this view, for example, ? and ? above might relate in such a way as to leave the
appropriate feature-relationships visible on ? alone (and thus still "in" the dominance
path). For example, it was suggested that θ-assignment to a "subject" is mediated by the
interrelationships between D and T with respect to ? and ? properties, with the ?-
properties serving as a link to the thematic variable (?) introduced by the verb. Following
these valuation exchanges, D is marked for ? and D and T are connected by co-valued ?.
 Given this general picture, we might consider the possibility of an element being
spelled-out (e.g., like ? above), and then re-entering the workspace in virtue of a later
instance of node-identification. For example, suppose that ? introduces another instance
of the ? type:
 
(52) ? 
 
SO1 ?   ? 
 
SO2      ?   ? 
 
SO3          ?  ? 
 
                  ? 
 
If (?, ?) meets our matching relation ?, then identification would occur. But, as argued
above, this would require the splicing-out of all the intervening odd-numbered nodes in
(52). But nothing would prevent the copying/lowering of ?, as this would create no
ordering conflicts, nor would it violate the connectedness condition:
 
 
 
 
 
 
 
 
 
 
 
 
 
(53) ? 
 
 ?   ? 
                   CONTRACTION 
      ?   ? 
 
          ?  ? 
 
                   ? 
 
                   ? 
 
Intuitively, this would have the effect of an element (here: "?") leaving the active
workspace (being spelled-out), and then "returning" again to the workspace as its context
was copied via node-identification. Note that further additions to the structure from the
point in (53) (e.g., associating a new element directly with ?) would result in ? having
to spell-out again, in order for the workspace to comply with connectedness.
 But ? could dominate arbitrarily complex structure, so it's not obvious that we
could, given our distinctness condition on the WS, simply reintroduce such complexes at
the bottom of the domination order in virtue of the node-identification illustrated above.
However, recall that the ?-? relation has been understood to involve some feature-value
exchange. This suggests the possibility that we could understand the copying/lowering as
involving simply a reintroduction of a simple label, implementing the notion of a derived
terminal in Uriagereka's (1999) sense. That is, the node identification (?,) would result
in reintroducing a simplex marker for ? above, facilitating the copying/lowering we
require but not reintroducing all of the potentially complex structure dominated by ? into
the workspace. This would still allow us to see the "left-branch" material of ? being
successively reintroduced into the output structure, in virtue of the initial feature-
licensing connection established in the matrix position.
 The "context" element itself (?) would, in contrast to ?, be a constant presence
in the workspace (it does not spell-out, get reintroduced, spell-out again, as ? would).
These mechanics are relevant to a discussion at the end of §1.2 regarding cyclic-wh
movement and an operation of the "Make-OP" sort. There we referred to the difference
between the following two sorts of structures, and suggested that the former introduces a
copying of operator-elements that we do not seem to want, whereas the latter seems to
have the right properties:
 
(54) C_WH[?:?] ?..? C_WH[?:?] ?..? C_WH[?:?] ?..?

     D_?:?          D_?:?          D_?:?
 
 
(55) C_WH[?:?] ?..? C ?..? C ?..?

     D_?:?          D_?:?  D_?:?
 
 
The node-identification procedure, plus the now strengthened condition on workspace
ordering (in terms of connectedness of the dominance order), makes available a
distinction between C and D of exactly the sort we want. C is constantly "there" in the
workspace, while D must spell-out, be re-entered to the workspace, spell-out again, and
so on as the local domains are established in a top-down fashion.
 I will return to these general ideas in the course of developing some analyses to
explore the matter a bit (see Chapter 3, especially §3.2.3 & §3.3.3), but the general
suggestion is that we think of connectedness of the dominance ordering and the anti-
recursion restriction as working together to factor complex structures into natural local
domains, creating major boundaries at both points of branching and points of recursion.
Every syntactic theory of which I am aware needs to say something about (i) a
theory of types, and (ii) formal ordering properties. The suggestion here is that it may be
possible to get these very basic notions to do quite a bit of work for us if we can seek out
the right combination of conceptions of each (as suggested also in the work of Epstein,
Groat, Kawashima, & Kitahara (1998), though with rather different conceptions pursued).
 For now, observe that the general idea of MSO as proposed by Uriagereka
removes a "point" of spell-out from the familiar Y-model in (42) above, in favor of a
more dynamic looking system with multiple such points:
 
(56)                         MERGE/MOVE           LF

     Lexicon → Num → SO_1 →..→ SO_2 →..→ SO_n

                            PF
 
This kind of derivational architecture raises a number of questions about the status of levels
of representation. If spell-out is not a unitary point, do we need to amend our conception
of PF as a unified object in the sense of "level of representation"? Uriagereka suggests
that his dynamic view of spelling-out is compatible with a level-less conception in which
there is no unified object that is subject to a single-step evaluation with respect to
grammatical conditions. On his view, separately spelled-out sub-structures could be seen
as being sent to the interface systems separately, leaving the grammar architecture with a
PF-component but no level of this sort per se.
 
However, note that there is nothing in the MSO view that requires us to abandon
levels of representation. It could simply be that MSO establishes the PF representation in
the steps given by the independent instances of linearization, but that it still forms a
coherent connected object that can be subjected to further (PF-system) operations. That
is, we can simply regard levels as incrementally established. But it matters a bit what we
take the "levels" to actually be. I will return to this point, but note here that this is roughly
the content of the WS/O-distinction (a limited span derivational workspace that
incrementally builds an output representation). However, the suggestion above was that
the object which is incrementally assembled is "syntactic" in the sense of manifesting the
formal ordering properties laid down in the workspace, but which is an object of the
extra-grammatical PF/LF-systems in terms of what sorts of properties/features/categories
populate this object (what sort of properties "decorate its nodes", if you will).
This PF-motivated view of MSO raises questions about LF too, in particular: are
there reasons for thinking that SO is involved in similar kinds of divisions of derivations
on the LF-side of the grammar? Asymmetric c-command, after all, is taken to be relevant
for scope and binding and the like; are there thus reasons for thinking that SO sends
material to both the PF and LF systems, leaving us with a model like (57)?
 
(57)                        LF

     Lexicon → Num → SO_1 →..→ SO_2 →..→ SO_n

                            PF
 
 
An architecture of this general sort, with both PF- and LF-relevant spell-outs (setting
aside the staggered vs. uniform views), has in fact been suggested in connection with
Chomsky's recent proposals regarding phases (derivation by phase: DbP), which brings us
to the DbP/MSO view of derivations and their handling of SCM.
 Note that Uriagereka's linearization-based view of MSO on its own does not offer 
us anything immediately obvious in terms of helping to understand SCM phenomena. 
C-command domains are themselves potentially unbounded in depth, whereas the key point 
about SCM is that roughly clausal (or perhaps smaller) units form special domains that 
"punctuate" otherwise longer-distance-looking relationships into linked-up local ones.[32] 
 
 However, this view might be interestingly fit together with something like 
Chomsky's phases, which constitute a subset of the domains picked out by Uriagereka's 
linearization-based conception. In later discussion I will suggest that Uriagereka's central 
idea may be best viewed in terms of general formal ordering restrictions on the workspace 
along the lines sketched above; that is, not specifically tied to linearization demands, 
but rather to the internal coherence of ordering properties in narrow syntax.[33] I turn now 
to a phase-based MSO-system and cyclic movement. 
                                                
[32] I borrow the notion of "punctuated" relations from Abels (2003); see Chapter 2. See also Bošković (2002) 
for a discussion evoking ideas from Aoun (1986) and others regarding the notion of having certain phrase boundaries 
serve to "break" chains. 
[33] Again, see also Uriagereka (2002) for critical discussion of his own previous proposals, and some 
alternative suggestions that I won't be considering in the present study. My own view here will involve a formulation 
akin to the notion of Workspace Connectedness offered in (46). It's worth noting here that Hornstein & Uriagereka 
(1999) appeal to a similar kind of interface motivation for spelling-out as this linearization-based conception, though on 
the other (LF) side of the grammar. Briefly: they examine the possibility that moved DPs may project their label to 
determine the type of their dominating category, essentially allowing the potential of taking clauses as "external 
arguments" of DP. They suggest this as the syntax supporting generalized quantifiers, and argue that, like the 
left-branch-type effects at PF, the projection of D-labels in their moved-to target positions creates an analogous kind of 
effect at LF. Technically, they argue that DPs do not so project their labels in the overt syntax, but in the covert 
component a "re-projection" occurs, essentially allowing specifiers to determine the types of their containing phrases. 
 
1.4.2. Phases/MSO & Successive Cyclicity 
Illustrated in (58)a-d is a general schema for a fairly standard derivational approach to 
such linked-local relationships familiar from the MP; below we locate this general line of 
thinking within Chomsky's (1999) approach. 
 
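(58) a.-d. [Tree diagrams. (a) XP containing α{*F} in its initial position. (b) α{*F} 
     displaced to the edge of XP (sister of X'), above its trace t_α. (c) A higher head 
     X0 bearing {*F} takes the XP of (b) as its complement. (d) α{✓F} displaced to the 
     edge of the higher XP, whose head X0 is now marked {✓F}; traces t_α remain in the 
     intermediate and initial positions.] 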
 
In the context of the MP, the moving element α is understood to have some property {F} 
which requires that α enter into a licensing relationship that cannot be established in its 
initial position within the subtree marked XP in (58)a (so we mark {F} here as {*F} until 
licensed, and as {✓F} afterwards).[34] For the case of local wh-movement (e.g., who did 
John see _ ?) we understand the wh-element to be directly displaced (perhaps copied 
and/or remerged) to the surface position where the licensing/checking of this feature {*F} 
can occur via a match with a corresponding feature housed in the target position (pictured 
in (58)d). 
Of interest here is that the dependency between the top and bottom positions 
implicated in this sort of movement is alleged not to be "one-fell-swoop" but rather to 
involve linked local relations. That is, for reasons which vary across specific models, α 
may be required to move to an intermediate or non-target position in which its unlicensed 
property {*F} is not satisfied (pictured in (58)b) on its way to the final/target 
position where this licensing can (in fact, must) occur. Some other property may be 
satisfied by this intermediate movement, perhaps some property of this intermediate 
position or perhaps in virtue of constraints built into the movement operation itself. 
 Or, it may be that both sorts of motivations are in play. For example, in 
Chomsky's DbP formulation, elements must move to intermediate positions in order not 
to be stranded in an independently spelled-out domain. The idea of the DbP approach is 
that structures may be evaluated piecemeal, so unlicensed elements must be displaced 
from such locally evaluated structures in order not to crash the derivation. Chomsky 
motivates these "escape hatch" type movements from locally evaluated domains by 
positing features/properties of potential intermediate movement landing-sites to play the 
role of the local licensor for such operations (I will later on refer to these putative features 
as "Merge/Move-features" or "M-features"). 

[34] I don't care for present purposes about any deletion/erasure procedures that may be part of such 
licensing/checking. 

 Chomsky suggests that certain syntactic categories are phase-inducing, and that 
when multiple such heads are introduced into the derivation this results in systematic 
limitations on what remains "in active memory" versus what material is spelled-out and 
thus no longer accessible to computation. His Phase Impenetrability Condition (PIC) 
insists that for a given phase head H_PH-1, when a second such head H_PH-2 is introduced 
the complement domain of H_PH-1 spells out. 
 Abstractly then we have a derivation (on Chomsky's assumptions, a bottom-up 
one) with periodic applications of spell-out that in our workspace formulation we can 
picture as follows (phase-inducing heads are marked): 
 
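(59) [Diagram: successive snapshots of a bottom-up derivation, with phase-inducing heads 
     marked. Stages labelled: H_PH-1 INTRODUCED; H_PH-2 INTRODUCED; COMPLEMENT DOMAIN 
     OF H_PH-1 SPELLS OUT (i.e., THE WORKSPACE CONTRACTS!); ETC. ...] 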
 
Spelled-out structure under the WS/O-distinction, as outlined above for the 
linearization-based view of MSO, is simply structure that is no longer in the workspace. Again, the 
WS/O-distinction as I am understanding it here is extremely general, and it is intended to 
be so. This gives us a common platform to discuss these (otherwise technically 
rather different) proposals. It is, however, more than just another "way of talking". There 
is a substantive claim implicit here which I am carrying across the discussions of the 
TCG approach as sketched above, Uriagereka's linearization-based MSO, and now the 
derivation-by/spell-out-by-phase view as proposed by Chomsky. The central claim 
revolves around the technical assumption that has been built in here, which is that what is 
in the workspace is a piece of the output structure itself, matched up with formal 
properties which allow the establishment of ordering properties and syntactically 
significant relationships over such structures. 
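
The WS/O-distinction and the PIC-driven contraction just described can be made concrete with a small sketch. It is only an illustration under simplifying assumptions that are mine, not Chomsky's or Uriagereka's: structure is reduced to a single "spine" of heads, edges (specifiers) are not modelled, and the names Derivation, merge, workspace and spelled_out are hypothetical. The point it shows is that the workspace is a slice of the output itself rather than a separate object.

```python
from dataclasses import dataclass, field


@dataclass
class Derivation:
    """Bottom-up derivation over a simplified spine of heads.

    output[0] is the most deeply embedded head. The workspace is always a
    contiguous top slice of the output, never a separate object.
    """
    output: list = field(default_factory=list)       # all heads merged so far
    phase_heads: list = field(default_factory=list)  # indices of phase heads
    ws_start: int = 0                                 # workspace = output[ws_start:]

    @property
    def workspace(self):
        return self.output[self.ws_start:]

    @property
    def spelled_out(self):
        return self.output[:self.ws_start]

    def merge(self, head, phase=False):
        """Add a head on top; contract the workspace if the PIC applies."""
        self.output.append(head)
        if phase:
            self.phase_heads.append(len(self.output) - 1)
            if len(self.phase_heads) >= 2:
                # A second phase head has arrived: everything below the
                # previous phase head (its complement domain) leaves the
                # workspace but stays in the output.
                self.ws_start = self.phase_heads[-2]


d = Derivation()
for head, is_phase in [("V", False), ("v", True), ("T", False), ("C", True)]:
    d.merge(head, is_phase)

print(d.workspace)    # ['v', 'T', 'C']
print(d.spelled_out)  # ['V']
```

Note that the earlier phase head (v) is still in the workspace after the contraction; this is the overlap that matters for escape hatches.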
 The general outlook avoids some questions we might ask of the informally 
presented notions (in both Uriagereka's and Chomsky's work) of the syntax 
"handing-over" or "transferring"/"shunting" structure to other systems in a piecemeal fashion. That 
is: what ensures pre-/post-spell-out coherence? How are the independently "handed-over" 
chunks related in these other systems? Do they need to respect the ordering properties 
established in the workspace? Or not? 
 Note that there are several ways that workspace ordering might be "respected". 
For example, we could think of the mapping from individual nodes in the workspace 
to the output structure as being mirror theoretic for (e.g.) the ?-vocabulary (as in the 
seminal work of Mark Baker 1985, and as adopted in Brody 1999, 2003). I won't, 
however, be pursuing this particular point in this work (though I think it's the right one to 
pursue given the overall architecture); it is the general point about pre-/post-spell-out 
coherence that I wish to stress here. The WS/O-distinction as we have conceived of it 
here provides a straightforward model regarding pre-/post-spell-out (there should be a 
conservation of ordering properties: syntactic ordering should continue to constrain 
post-syntactic relationships). Nothing of course rules out the possibility of post-syntactic 
operations that would deform structure in ways that would result in loss of information 
(so relationships across the interface wouldn't be trivially/transparently reversible). The 
present point is not that any of these things are impossible, but rather just that the 
WS/O-distinction provides a format within which to frame the issues. 
 Now asking questions about what spells out (and when) in the derivation is 
rephrased in terms of asking how/why/when the derivational workspace contracts. 
 Thus, returning now to Chomsky's phase-based conception: Why should the 
workspace contract to exclude the complement domain of Chomsky's putative 
phase-inducing heads? Why not the entire sub-structure the phase-inducing head projects? Or, 
with our WS/O-distinction in play, why not retain the complement structure and simply 
spell out the edge of the phase (i.e., the head and its external dependents)? 
 Note that on Chomsky's view of phases the domains circumscribed by the borders 
of the workspace overlap from some steps in the derivation to others. When the second 
phase-inducing head (H_PH-2) is introduced, the first such phase-head and its external 
dependents are still in the workspace even though the complement domain is understood 
to spell out (= "voided from the workspace"). 
So one important point illustrated by the discussion so far is that having 
successive stages of derivation in which some elements survive in the workspace despite 
further expansions of structure yields a kind of overlap (illustrated below in (61)a). This 
overlap is crucial to elaborating the notion of "escape hatches" for cross-domain 
dependencies. That is, escape hatches are possible because some position(s) constitute the 
top of certain workspaces that survive at the bottom of later workspaces; this allows an 
element moving to positions residing in such domain overlaps to be visible to elements 
higher in the structure. 
 
 There is, however, nothing inevitable about such possible overlaps between 
domains. We could, for example, imagine an architecture which would not permit them. 
In such a system, when a second phase-inducing head is introduced we could take this to 
signal the start of a whole new workspace. Consider: 
 
(60) [Diagram: snapshots a.-g. of a derivation. Stages labelled: H_PH-1 INTRODUCED; 
     H_PH-2 INTRODUCED; COMPLEMENT DOMAIN OF H_PH-2 SPELLS OUT; ETC. ...] 
 
On such a view the step between (60)e and (60)f would result in the establishment of a 
brand-new workspace signaled by the introduction of a new phase-inducing head. This 
would yield a theory with non-overlapping phases. 
Successive snapshots of the derivational workspace for a longer stretch of derivation 
can be seen to yield maximal stretches of workspace structure (maximally expanded 
workspaces prior to any particular contraction steps or "spell-outs") as follows for 
Chomsky's view ((61)a) versus our hypothetical non-overlapping system ((61)b). Also, 
we include here for consideration a third possible state of affairs which imposes no 
restrictions on the workspace whatsoever ((61)c): 
 
 
 
 
 
 
 
 
 
 
 
(61) a. PHASE OVERLAPS     b. NO PHASE OVERLAP     c. NO PHASES 
     [Diagram: the successive phases (maximal workspaces) of a derivation under each view.] 
 
On the (61)a-view we expect the possibility of limited kinds of interactions between 
elements in a structure if we understand the operations responsible for connecting 
elements in substantive dependency relationships to be limited to what is present in the 
workspace at a given stage of derivation. On the (61)b-view we expect no such 
interactions. On the (61)c-view, if nothing more is said by way of constraining 
operations, interactions of all sorts are expected. 
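
The three expectations can be checked mechanically in a toy setting. The sketch below is an illustration only: positions on a spine are numbered 0 to 10, each maximal workspace is given as an inclusive (low, high) span, and the particular spans and the function names share_workspace and linked are hypothetical choices of mine. A direct relation requires both positions to sit in one workspace; a "linked local" relation is a series of such hops.

```python
def share_workspace(i, j, spans):
    """True if positions i and j both fall inside at least one workspace."""
    return any(lo <= i <= hi and lo <= j <= hi for lo, hi in spans)


def linked(i, j, spans):
    """True if i and j can be connected by a series of hops, each hop
    staying inside a single workspace (a linked local dependency)."""
    lowest = min(lo for lo, _ in spans)
    highest = max(hi for _, hi in spans)
    reached, frontier = {i}, [i]
    while frontier:
        p = frontier.pop()
        for q in range(lowest, highest + 1):
            if q not in reached and share_workspace(p, q, spans):
                reached.add(q)
                frontier.append(q)
    return j in reached


# Hypothetical maximal workspaces for the three views in (61)
OVERLAP = [(0, 4), (3, 7), (6, 10)]      # (61)a: consecutive spans share positions
NO_OVERLAP = [(0, 3), (4, 7), (8, 10)]   # (61)b: spans are disjoint
NO_PHASES = [(0, 10)]                    # (61)c: workspace = output throughout

print(share_workspace(0, 10, OVERLAP), linked(0, 10, OVERLAP))        # False True
print(share_workspace(0, 10, NO_OVERLAP), linked(0, 10, NO_OVERLAP))  # False False
print(share_workspace(0, 10, NO_PHASES), linked(0, 10, NO_PHASES))    # True True
```

Only the overlapping regime yields the intermediate outcome: no direct relation between bottom and top, but a chain of local ones through the overlap positions.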
The (61)c-view would simply identify the workspace and output for all steps of 
derivation and would thus need to appeal to something other than the dynamic sort of 
domain-demarcation under discussion here to understand locality. For example, 
operations might be limited to only being able to relate two elements α and β in a 
structure if there is no intervening element γ that could enter into the same type of 
relation (e.g., Relativized Minimality; Rizzi 1990). 
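
A minimality-style restriction of this kind can be sketched as an intervention check. This is a deliberately simplified rendering, not Rizzi's formulation: the structure is flattened to a top-down list in which each position c-commands everything after it, and the type labels ("A", "A-bar") and the function name minimality_allows are assumptions made for illustration.

```python
def minimality_allows(spine, i, j):
    """spine: list of (label, type) pairs ordered top-down, so that each
    position c-commands everything after it. Positions i < j may be related
    only if no position strictly between them has the same type as position i.
    """
    if not i < j:
        raise ValueError("expected i to be the higher position")
    kind = spine[i][1]
    return all(t != kind for _, t in spine[i + 1:j])


spine = [("C", "A-bar"), ("DP1", "A"), ("wh1", "A-bar"), ("wh2", "A-bar")]

print(minimality_allows(spine, 0, 2))  # True: only an A-type element intervenes
print(minimality_allows(spine, 0, 3))  # False: wh1 intervenes between C and wh2
```

Here locality is stated on the operation itself, with the whole structure visible, rather than following from what happens to be in the workspace.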
It's easy to see that having a workspace which limits the reach of 
dependency-forming operations in the grammar could be redundant with distance restrictions of the 
minimality sort that are often appealed to in the literature. (It may perhaps already be 
obvious, given the heavy emphasis I have placed on the notion of workspace, which 
direction I will suggest we go in removing any such redundancy.) 
 
 Now let us consider a derivation involving successive cyclic movement in these 
terms. Take the darkly shaded node in the following to be an element bearing our 
{*F} property that requires licensing not available in its initial position, and first consider 
what happens if there is no SCM (as above, the grey-shaded nodes are the phase-inducing 
heads): 
 
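(62) [Diagram: as in (59), with the {*F}-bearing element remaining inside the complement 
     domain of H_PH-1. Stages labelled: H_PH-1 INTRODUCED; H_PH-2 INTRODUCED; COMPLEMENT 
     DOMAIN OF H_PH-1 SPELLS OUT (THE WORKSPACE CONTRACTS). The final stage is marked *!] 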
 
On this view then, if the {*F}-bearing element does not move from the complement 
domain of the first phase-inducing head (H_PH-1) it will be stranded in the abandoned 
(spelled-out) portion of (the output) structure. On the assumption that such unlicensed 
properties are uninterpretable by or illegible to the interface systems, this derivation 
would crash. 
 However, the structure of this account makes available the possibility of the 
{*F}-bearing element moving to some position outside the complement domain of H_PH-1 prior to the 
spell-out of that domain and thus managing to stay within the workspace (remaining 
active/visible for later steps of derivation). To illustrate: 
 
 
 
 
 
 
 
 
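(63) [Diagram: as in (62), but with the {*F}-bearing element moved out of the complement 
     domain of H_PH-1 before that domain spells out. Stages labelled: H_PH-1 INTRODUCED; 
     H_PH-2 INTRODUCED; COMPLEMENT DOMAIN OF H_PH-1 SPELLS OUT (THE WORKSPACE CONTRACTS).] 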
 
As such a derivation continues, introduction of further phase-inducing heads would 
drive further contractions of the workspace (spell-outs) and would thus require additional 
local movements of the {*F}-bearing element until it reached a domain within which it 
could associate with an appropriate licensor. 
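
The contrast between the stranded and the escaping element can be sketched as a small procedure. As before, this is an illustration under assumptions of my own: the derivation is a bottom-up spine of head labels, the "edge" of a phase head is identified with the index of that head, and the function name derive and its parameters are hypothetical. Which heads count as phase-inducing is passed in as a parameter rather than fixed.

```python
def derive(heads, phase, licensor, move_to_edge):
    """Walk a bottom-up spine of heads; the {*F}-bearing element starts in
    the domain of heads[0]. Returns (outcome, landing_sites)."""
    pos = 0             # index of the head whose domain hosts the element
    last_phase = None   # index of the most recent phase head
    sites = []
    for i, head in enumerate(heads):
        if head in phase:
            if last_phase is not None and pos < last_phase:
                # The complement of the previous phase head spells out with
                # the unlicensed element still inside it.
                return "crash", sites
            last_phase = i
        if head == licensor:
            return "converge", sites + [head]
        if head in phase and move_to_edge:
            pos = i             # escape-hatch movement to the phase edge
            sites.append(head)
    return "crash", sites       # no licensor was ever reached


spine = ["V", "v", "T", "C", "V", "v", "T", "C_wh"]
phases = {"v", "C", "C_wh"}

print(derive(spine, phases, "C_wh", move_to_edge=False))
# ('crash', [])
print(derive(spine, phases, "C_wh", move_to_edge=True))
# ('converge', ['v', 'C', 'v', 'C_wh'])
```

The sketch says nothing about what happens to {*F} on the positions left behind, nor about why the intermediate movements are licensed; those are exactly the questions taken up next.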
 There are some interconnected technical matters that require attention in such an 
approach. First, what happens to the problematic {*F} feature of the lower element 
(trace/copy)? Second, what motivates the movement out of the relevant complement 
domain? 
Regarding the first point, if we regard the displacement operation as "copying" 
then why is the {*F}-feature not problematic for the lower copy when its containing 
domain spells out (falls outside the workspace)? If it is not a copying operation, and 
instead involves a literal re-merger resulting in a multi-motherhood structure, then why 
doesn't the same worry hold (i.e., the {*F} property should still reside in both structural 
contexts)? On either the copy or remerge view it seems that we do not have a way to 
avoid the outcome that held in the non-movement situation if sub-parts of structure 
undergo cyclic evaluation of the sort just sketched. We might suggest that the copy left 
by movement can have its {*F} property freely deleted, but then why couldn't this happen 
in the non-movement case? The answer to this question might be taken to involve appeal 
to some later stage of derivation where a matching {F}-property would go unchecked, but 
it's not clear that this response would be correct (e.g., what about wh-in situ?). 
Regarding the licensing of the intermediate movement, there are two obvious 
possibilities. First, there might be some property of the intermediate landing-site that 
serves to license the movement but crucially not to license the {*F} property (i.e., it must 
"move on" to some other position to license this property). Second, we might motivate 
the movement as not being driven by the landing-site properties, but rather by some 
combination of the local context and the {*F}-property itself. Movement out of locally 
evaluated domains might be possible just in case failure to do so would result in a crashed 
derivation at that point.[35] This relates to the question above regarding the properties of 
the wh-element itself, and how/why it does not crash the derivation even if it does move 
(in virtue of leaving behind a copy) and what to say about situations where we do not 
want the element to move (e.g., in wh-in situ cases). Again, I will return to these issues in 
later discussion. Before we move on to consider how TAG derivations work in 
comparison, let us sum up some key questions raised in this section. I will borrow from a 
discussion in Felser (2004) and refer to these as the questions of TRIGGERING (64)a and 
CONVERGENCE (64)b: 
 
(64) Assuming SCM exists, ... 
 
a. What motivates it? Properties of the moving element? Properties of the 
intermediate target? Both? Neither (i.e., something else)? (TRIGGERING?) 
 
b. What are the mechanics of movement like such that unlicensed features 
{*F} do not remain to cause convergence problems in spelled-out domains 
(either on the copy, or on the remerge view)? (CONVERGENCE?) 
                                                
[35] This basic line is suggested in Lasnik & Uriagereka (forthcoming). 
 
 
1.4.3. What Goes for Cyclicity in TAG 
The categorial distinctness condition on the syntactic workspace introduced above (see 
(9)) relates to ideas from work in Tree Adjoining Grammar (TAG), though the insights 
will be implemented rather differently here. The key TAG idea is that we might 
understand the fundamental notion of recursive structure to play a central role in 
understanding the range of possible interactions between phases of derivation (more 
neutrally: between chunks of structure). As a matter of its basic architecture, TAG factors 
complex structures into non-recursive elementary trees and recursive auxiliary trees that 
are combinable via TAG's two main operations (substitution and adjoining). These two 
operations can be pictured as follows: 
 
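(65) SUBSTITUTE [Diagram: a tree rooted in Y is inserted at a frontier node Y of a tree 
     rooted in X.] 

(66) ADJOIN [Diagram: an initial tree rooted in X and containing a node Y; an auxiliary 
     tree with root Y_r and foot Y_f; and the derived tree, in which the auxiliary tree 
     occupies the position of Y and the material originally dominated by Y hangs from 
     Y_f.] 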
 
Auxiliary trees in TAG, as pictured above, are special in that they are taken to have 
related top (root) and bottom (foot) nodes (e.g., Y_r and Y_f in (66)), which enables the 
complex of relationships which they "sandwich" to be spliced in for some equivalent 
atomic element within another structured object (e.g., Y in the initial tree in (66)). Thus, 
trees without this top/bottom characteristic are elementary; those with this characteristic 
are auxiliary. As Frank & Kroch (1995:113) put it, "the recursive character of auxiliary 
trees provides [...] a domination-preserving expansion of a single node in a piece of 
phrase structure into a larger structure." 
 Now consider what happens in place of successive cyclic movement in TAG. I 
say "in place of" because in TAG syntactic dependencies like wh-movement are argued to 
be localized to elementary trees; movement across such structures of the GB/MP sort 
illustrated above is ruled out on general architectural grounds. Thus the movement of the 
element α in (67)a targets what will in fact be its final landing-site, crucially within the 
bounds of the elementary tree.[36] This is the only movement operation that there is in this 
approach. This movement (/chain) relation between the base and final/target position is 
then stretched as a consequence of the adjoining operation illustrated above, which 
splices in intervening material as pictured in the step from (67)b to (67)c. 
 
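(67) a.-c. [Tree diagrams. (a) An elementary tree XP in which α{*F} has moved to its 
     final landing-site (sister of X'), above its trace t_α. (b) The same tree together 
     with an auxiliary tree whose root and foot are both X'. (c) The derived tree: the 
     auxiliary X' structure is spliced in between α{*F} and t_α.] 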
 
                                                
[36] I will discuss later on some ideas about limitations regarding the "size" and "shape" of elementary and 
auxiliary tree structures, following among others the work of Frank (1992, 2002). I am leaving to the side the issue of 
the licensing of the feature {*F} for this illustration of TAG-mechanics. How this licensing works requires a bit more 
detailed and subtle discussion. This issue is important, however, as it turns out that the TAG approach I discuss here 
(i.e., from Frank's 2002 discussion) requires formulating the adjoining operation to allow "checking" across elementary 
trees. See Chapter 2 for some relevant discussion. 
 
Further adjoining operations can then splice in more auxiliary structures, yielding the 
effect of a long-distance relationship between α and its base trace position. Note that 
there are no "intermediate traces/copies" on this view. 
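
The way adjoining stretches a local dependency can be seen in a minimal sketch of the two TAG operations. The representation is a simplification of my own (a Node class with a foot flag; adjoining only at non-root nodes; the first matching node in pre-order is targeted), and the example trees are toy stand-ins rather than analyses taken from the TAG literature.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    foot: bool = False   # marks the foot node of an auxiliary tree


def _rewrite(tree, matches, build):
    """Copy `tree` and replace its first non-root node (pre-order) that
    satisfies `matches` with build(node)."""
    tree = copy.deepcopy(tree)

    def visit(node):
        for i, child in enumerate(node.children):
            if matches(child):
                node.children[i] = build(child)
                return True
            if visit(child):
                return True
        return False

    if not visit(tree):
        raise ValueError("no suitable target node found")
    return tree


def substitute(tree, initial, target):
    """Insert `initial` at a childless frontier node labelled `target`."""
    return _rewrite(
        tree,
        lambda n: n.label == target and not n.children and not n.foot,
        lambda n: copy.deepcopy(initial),
    )


def _plug_foot(node, subtree):
    for i, child in enumerate(node.children):
        if child.foot:
            node.children[i] = subtree
            return True
        if _plug_foot(child, subtree):
            return True
    return False


def adjoin(tree, aux, target):
    """Splice `aux` in at a node labelled `target`; the material that node
    dominated is re-attached at the foot of `aux`."""
    if aux.label != target:
        raise ValueError("root of the auxiliary tree must match the target")

    def build(old):
        new = copy.deepcopy(aux)
        if not _plug_foot(new, old):
            raise ValueError("auxiliary tree has no foot node")
        return new

    return _rewrite(tree, lambda n: n.label == target, build)


def words(node):
    """Left-to-right terminal yield of a tree."""
    if not node.children:
        return [node.label]
    return [w for child in node.children for w in words(child)]


# Elementary tree: the wh-dependency is established locally.
initial = Node("CP", [Node("what"), Node("C'", [Node("John"), Node("liked"), Node("t")])])
# Auxiliary tree: root and foot are both C'.
aux = Node("C'", [Node("did"), Node("Dave"), Node("think"), Node("C'", foot=True)])

derived = adjoin(initial, aux, "C'")
print(" ".join(words(derived)))  # what did Dave think John liked t
```

The derived tree contains a single trace: the distance between "what" and "t" has grown, but no intermediate position was ever created.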
A key aspect of TAG is the identification of the root/foot nodes of auxiliary 
structures with a corresponding/matching element in an elementary/initial tree in the 
adjoining operation. The TCG approach developed here exploits a "matching" 
relationship of roughly this kind as well, though such matching is understood to effect the 
opposite of TAG-theoretic adjoining, as we saw in our introductory sketch (contraction). 
Consider how we might view TAG derivations in the context of our 
WS/O-distinction. In the development of one specific TAG approach, that in Frank (2002), it is 
suggested that syntactic derivations are divided into two major stages. The first stage 
involves the merge/move mechanics familiar from Chomsky (1995). However, departing 
from Chomsky's "one-stage" system, Frank suggests that the merge/move portion of 
derivations is limited to only being able to generate structures that meet the following 
general condition: 
 
(68) CONDITION ON ELEMENTARY TREE MINIMALITY (CETM): 
The syntactic heads in an elementary tree and their projections must form an 
extended projection of a single lexical head. 
 
Reference to "extended projections" comes from the work of Grimshaw (1991, 2002) and 
others. The effect of this condition is that the merge/move portion of syntactic derivation 
only creates objects that are roughly clause-sized or smaller. These merge/move-derived 
objects then, in Frank's system, are fed to a second stage of derivation which deploys the 
TAG-theoretic operations of substitution and adjoining sketched above. 
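
Read as a filter on workspaces, the CETM amounts to a check on the inventory of heads in a candidate tree. The sketch below assumes a toy table of extended projections (a verbal one and a nominal one); the table, the category labels and the function name satisfies_cetm are illustrative assumptions, not Frank's or Grimshaw's actual inventories.

```python
# ASSUMPTION: a toy inventory of extended projections, keyed by lexical head.
EXTENDED_PROJECTION = {
    "V": {"V", "v", "T", "C"},
    "N": {"N", "D", "P"},
}


def satisfies_cetm(heads):
    """True if `heads` contains exactly one lexical head and every other
    head belongs to that lexical head's extended projection."""
    lexical = [h for h in heads if h in EXTENDED_PROJECTION]
    if len(lexical) != 1:
        return False
    return all(h in EXTENDED_PROJECTION[lexical[0]] for h in heads)


print(satisfies_cetm(["C", "T", "v", "V"]))            # True: one clause
print(satisfies_cetm(["D", "N"]))                      # True: one nominal
print(satisfies_cetm(["C", "T", "V", "C", "T", "V"]))  # False: two lexical heads
print(satisfies_cetm(["T", "V", "D", "N"]))            # False: mixed projections
```

On this rendering a biclausal structure can never be built directly; it has to arise by combining two trees that each pass the check.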
 
 Translated into our WS/O terms, we can understand Frank's CETM as a condition 
on syntactic workspaces, and then assume that there can be multiple such workspaces 
corresponding to the basic tree structures that form the input to the second, 
TAG-theoretic stage of derivation. This TAG portion of derivation is then conceived as a 
component which effects combination of such workspaces to form derived (output) 
structures (I take it that this basic picture is clear enough to not require a diagram). 
 Of course, this amounts to a rather different conception than the views sketched 
above. I mentioned the possibility earlier of having a system which would impose no 
restrictions on the syntactic workspace and thus would require that locality be stated in 
ways other than the dynamic view of local domains we sketched at the outset. The TAG 
view on this general outlook would be an entirely different approach, but one that would 
view syntactic workspaces as always coextensive with outputs. However, instead of 
introducing locality constraints on operations within the workspace, we rather have a 
limitation on workspace size (i.e., of the CETM sort), plus the major architectural 
division of derivation into (i) a first stage of local structure creation with multiple 
workspaces, and (ii) a second stage that handles combination of workspaces into larger 
complexes. 
 The TCG view developed here maintains a "one-stage" view in the sense of not 
positing two distinct stages of syntactic operations. I turn now to elaborate further. 
 
1.5. Implementation of TCG 
This section backs up to consider some technical issues regarding the notion(s) of 
movement chains (§1.5.1). This leads us to consider, when coupled with the sketch of 
contraction offered above, a "reduced" view of categories and structure (§1.5.2). 
1.5.1. On Relating Positions in Structure 
This section discusses a possible view of movement chains based on ideas from Chomsky 
(1995). In derivational terms we think of an item α as first associating with some other, 
independent element to form a structure Σ constituting the initial/base position B (as in 
(69)a). Later on, derivationally speaking, some operation causes α to enter into a second 
set of relationships with some target element T to form the structure Σ' constituting the 
derived/target position, where the T-elements dominate the B-elements (as in (69)b): 
 
(69) a. Σ  = [B^M α B^S] 
     b. Σ' = [T^M α [T^S ... [B^M α B^S]]] 
 
Take the superscripted 'M' and 'S' in (69)a/b to stand for the mother and sister elements 
respectively, which together form a merge-derived structural context for α (e.g., B^S = 
base position sister, etc.). I will return to the issue of whether one or the other of these 
may suffice, or whether both are somehow required, for identifying the contexts for 
a moved/displaced element α, but note here that Chomsky (1995:252) understands the 
relevant element for defining the contexts of α as the sister or co-constituent of α (i.e., B^S 
and T^S in (69)b). Consider: 
 
Suppose that α raises to a target [T] in Σ, so that the result of the operation is Σ' [...]. The element α 
now appears twice in Σ', in its initial position and in the raised [target] position. We can identify the 
initial position of α as the pair ⟨α, β⟩ (β the co-constituent of α in Σ [i.e., B^S in (69)a/b; JED]), and 
the raised position as the pair ⟨α, K⟩ (K the co-constituent of the raised term α in Σ' [i.e., T^S in 
(69)b; JED]). Actually, β and K would suffice; the pair is simply more perspicuous. Though α and 
its trace are identical, the two positions are distinct. We can take the chain CH that is the object 
interpreted by LF to be the pair of positions. [...] C-command relations are determined by the 
manner of construction of [the object in (69)b above; JED]. Chains are unambiguously determined 
in this way. 
 
Following through on this view for our example in (69)b we see that there are two such 
relevant positions in Σ', POS_1 and POS_2, where POS_1 = ⟨α, T^S⟩ and POS_2 = ⟨α, B^S⟩.[37] As 
Chomsky puts it, these two positions are distinct, but together they constitute the chain 
CH = ⟨POS_1, POS_2⟩ = ⟨⟨α, T^S⟩, ⟨α, B^S⟩⟩; or, if we adopt the "more austere version", then 
CH is simply ⟨T^S, B^S⟩ since "α and its trace are identical". 
 Observe, however, that there are situations in which the context positions are 
themselves identical. In particular, consider again the following standard cases of the 
putative cyclic A- and A'-movements in (70): 
 
(70) a. John [seems [ _ to be likely [ _ to appear [ _ to like carrots]]]] 
 
b. What [did Dave think [ _ [that Mary believed [ _ [that John liked _ ]]]]]? 
 
Intermediate A- and A'-movements involved in such cases are classically taken to have 
more-or-less the following general abstract shapes:[38] 
 
 
                                                
[37] I am taking a shortcut here for expository purposes. The co-constituent forming the sister context for the 
"raised" position in (69)b would not be just "T", but rather a more complex set-theoretic object in Chomsky's general 
Bare Phrase Structure approach. Here we take "T" to "stand in" for this more complex object. 
[38] There are, as we will discuss later, views which take movement to be "more cyclic" than this, as well as 
less (i.e., "one-fell-swoop" views). I will discuss these matters in the next Chapter. 
 
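(71) a. TP{*F} ... TP ... TP ... TP ... VP ... α 
     b. CP{*F} ... CP ... CP ... CP ... CP ... VP ... α 
     [Diagram: α moves from within VP through each intermediate TP (a) or CP (b) to the 
     {*F}-marked target.] 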
 
 
 
So while it may be true that a moving element α is "identical" within each of these 
contexts, at least some of the relevant context pairs ought to yield identity (or 
non-distinctness) as well. T-to-T and C-to-C movement could yield a chain CH = {⟨α, C'⟩, ⟨α, 
C'⟩}. Is this a problem? 
 The suggestion inherent in the TCG view of cyclicity sketched above can be 
understood as the claim that such a state of affairs is not only "not a problem", it is in fact 
crucial to understanding linked local relationships. As we saw earlier, this is at the heart 
of the TAG architecture as well. And as we will see briefly later on, these relationships 
have been argued to be central in some MP work (e.g., see Bošković 2002, Grohmann 
2003; both of whom stipulate that A-movement is T-to-T and A'-movement is C-to-C). 
 However, it is important to note that the sketch above regarding Chomsky's view 
of chains and contexts overlooks a key feature of his view. For Chomsky, "contexts" are 
not simply the local label of the element that a moving α relates to, but rather the entire 
structure derived up to that point. So, there would on his view be no issue which could 
arise in terms of distinguishing the contexts in CH = {⟨α, C'⟩, ⟨α, C'⟩}, since the contexts 
would always be unique (they are distinguished by the differences in the structure they 
dominate). I will turn to this in a moment. 
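
The difference between the two ways of individuating contexts can be shown directly. In the sketch below, which is only an illustration, a context is a nested tuple whose first member is its label; the function name chain and the label_only switch are hypothetical, and the toy contexts stand in for the two C' positions of an example like (70)b.

```python
def chain(alpha, contexts, label_only):
    """Return a chain as a set of <alpha, context> pairs.

    Each context is a nested tuple (label, *contents). With label_only=True
    a context is identified by its label alone; otherwise it is identified
    by the entire structure it contains.
    """
    identify = (lambda c: c[0]) if label_only else (lambda c: c)
    return {(alpha, identify(c)) for c in contexts}


lower = ("C'", "that", "John liked _")
upper = ("C'", "that", "Mary believed", lower)

print(len(chain("what", [upper, lower], label_only=True)))   # 1
print(len(chain("what", [upper, lower], label_only=False)))  # 2
```

With label-only contexts the two positions collapse into one; with whole-structure contexts they stay distinct, because the higher context properly contains the lower one.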
 The suggestion here (as sketched in §1.1 & §1.2) is that the natural relationship is 
not between a moving element and these various intermediate positions which just 
happen to share the super-category specifications of the sought-after target landing site; 
rather, the natural relationships are between the contexts themselves. That is, the natural 
basic relations that the "moving" element can/should be understood to enter into are the 
substantive "core" licensing properties (e.g., wh, ?/?, ?, etc., what Fukui & Speas 1986 
called "Kase" properties). The generalizations about the "movements" other than these 
are most elegantly and naturally stated by positing direct relationships between the 
contexts. For this, we need only to specify a notion of like/unlike within an architecture 
where such differences could matter for derivation and representation. This is the aim of 
TCG. 
 Our sketch of the TCG approach to SCM presupposed that the "contexts" which 
can be identified (resulting in lowering) were understood to relate to the "moving" 
element in terms of a local dominance relation. Note that Chomsky's suggested view above 
states things in terms of sisterhood. Below I will show that the sisterhood view can't 
support what we would require of it within the TCG approach, and that we in fact require 
the relevant relation to be motherhood/domination. 
 However, before heading down that road it is of some interest to probe Chomsky's 
discussion of movement chains a bit further to consider two important technical issues. In 
particular, the notation used above to mark the SCMs in (70) does not accurately capture 
one of the versions of chains discussed in Chomsky (1995), though as we will see, it seems to 
be demanded by the more recent work proposing that derivations work by phase 
(depending, as we will see, on how we view "spell-out"; our WS/O-distinction turns 
out to be helpful in this respect, see below). 
 
The two important issues involve (i) how we understand what contexts are, and 
(ii) how contexts are tracked/connected in the course of the derivation. Specifically, 
Chomsky (1995:300) makes the following remarks about (72) (= his (88)):
39
 
 
(72) We are likely [t₃ to be expected [t₂ to [t₁ build airplanes]]] 
 
He writes: 
 
Here the traces are identical in constitution to we, but the four identical elements are distinct terms, 
positionally distinguished [..]. Some technical questions remain open. Thus, when we raise α (with 
co-constituent β) to target K, forming the chain CH = (α, t), and then raise α again to target L, 
forming the chain CH' = (α, t'), do we take t' to be the trace in the position of t or α of CH? In the 
more precise version, do we take CH' to be (⟨α, L⟩, ⟨α, K⟩) or (⟨α, L⟩, ⟨α, β⟩)? Suppose the latter, 
which is natural, particularly if successive-cyclic raising is necessary in order to remove all 
-Interpretable features of α (so that the trace in the initial position will then have all such features 
deleted). We therefore assume that in [(72)] the element α in t₁ raises to position t₂ to form the 
chain CH₁ of [(73)], then raises again to form CH₂, then again to form CH₃. 
 
(73) a. CH₁ = (t₂, t₁) 
b. CH₂ = (t₃, t₁) 
c. CH₃ = (we, t₁) 
 
CHAINS are ordered pairs of contexts, where a particular context for a given element α is 
understood to be its sister or co-constituent. If we take this to mean that the "context" is 
the entire structure dominated by the sister element, then the more complete versions of 
the relevant objects in (73) for the derivation of (72) are those in (74): 
 
(74) a. CH₁ = (⟨we, [to we [build airplanes]]⟩, ⟨we, [build airplanes]⟩) 
b. CH₂ = (⟨we, [to be expected [we [to we [build airplanes]]]]⟩, ⟨we, [build 
airplanes]⟩) 
c. CH₃ = (⟨we, [are likely [we [to be expected [we [to we [build 
airplanes]]]]]]⟩, ⟨we, [build airplanes]⟩) 
 
                                                
39
 Chomsky's example in the discussion referred to in the text used the token "we are likely to be asked to 
build airplanes". I have switched out ask for expect in my discussion here. It seems clear from the context that 
Chomsky intended to have a passivized ECM verb in this example, as pointed out to me by Howard Lasnik. 
 
Below I will suggest that we adopt this idea regarding contexts, but reject the view of 
contexts as the entire structure up to the relevant step of derivation. 
The "technical questions" Chomsky raises in the quoted passage above amount to 
the choice between the following two options regarding how contexts are connected in 
the course of derivation. From the derivational stage depicted in (75), we can consider the 
result of the next movement of we to be (76)a or (76)b: 
 
(75) ..[to be expected [we [to [we [build airplanes]] 
 
          MOVE 
(76) a. [we [to be expected [we [to [we [build airplanes]]] 
 
 
          MOVE 
b. [we [to be expected [we [to [we [build airplanes]]] 
 
 
It is interesting to note that if the technical option Chomsky pursues ((76)a) is correct, 
then we appear to have cases for which the concepts of MOVE and CHAIN would be 
dissociable; compare (76)b, where any notation for the resultant chains would 
transparently recapitulate the derivational history of movement (see also (76)b' below for 
illustration). These two options turn out not to be equally compatible (at least not equally 
straightforwardly compatible) with derivation by phase. 
Before turning to this point about chains and phases, note that there is at least one 
other technical option which Chomsky does not consider. This third option would regard 
movement as extending the initial (derivationally prior or "older") chain and forming a 
new one as in (76)c (in contrast, Chomsky's version creates a new base-position-tailed 
chain, leaving the older one intact): 
 
 
(76) c.         MOVE 
 [we [to be expected [we [to [we [build airplanes]]] 
 NEW CHAIN CH₂ 
 OLD CHAIN CH₁ 
 
The next step on this alternative view would involve the formation of a third chain (CH₃) 
and a kind of stretching of the previous two chains, as in (76)c': 
 
(76) c'.       MOVE 
  we [are likely [we [to be expected [we [to [we [build airplanes]]] 
 NEW CHAIN CH₃ 
  OLD CHAIN CH₂ 
OLDER CHAIN CH₁ 
 
Contrast this with the next step for (76)a (Chomsky's approach) in (76)a': 
 
(76) a'.       MOVE 
  we [are likely [we [to be expected [we [to [we [build airplanes]]] 
 
 
 
Note that on both Chomsky's technical view ((76)a/a') and the alternative 
((76)c/c') we can understand move and chain as dissociable to some extent, compared to 
(76)b, where the relevant movements and chains are essentially the same, as mentioned 
above; consider (76)b' in this regard: 
 
(76) b'.              MOVE 
  we [are likely [we [to be expected [we [to [we [build airplanes]]] 
               CHAIN 
 
What is the difference between the choices in (76)a-c? One obvious point is that only 
(76)b/b' seems straightforwardly compatible with the notion of spell-out by phase of 
Chomsky (1999) (this is the notation used informally above to illustrate our basic 
successive-cyclic A- and A'-movements). 
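
The contrast between the first two options can be sketched computationally. This is only a minimal illustration under simplifying assumptions that are not part of the text: positions are encoded as plain strings, a chain is an ordered (head, tail) pair, and all function names are hypothetical. Only (76)a and (76)b are modelled; the stretching option in (76)c is left aside.

```python
from typing import List, Tuple

Chain = Tuple[str, str]  # (head position, tail position); hypothetical encoding


def movement_steps(positions: List[str]) -> List[Chain]:
    """Each MOVE step as (landing site, launching site), lowest step first."""
    return list(zip(positions[1:], positions[:-1]))


def base_tailed_chains(positions: List[str]) -> List[Chain]:
    """Option (76)a: every step forms a new chain tailed by the base position."""
    base = positions[0]
    return [(landing, base) for landing in positions[1:]]


def link_chains(positions: List[str]) -> List[Chain]:
    """Option (76)b: each chain simply records one movement step."""
    return movement_steps(positions)


# Positions of "we" in (72), from the base position upward.
path = ["t1", "t2", "t3", "we"]

chains_a = base_tailed_chains(path)  # the pairs listed in (73)
chains_b = link_chains(path)

# Chains that do not coincide with the movement step that created them.
dissociated_a = [c for c, m in zip(chains_a, movement_steps(path)) if c != m]
dissociated_b = [c for c, m in zip(chains_b, movement_steps(path)) if c != m]
```

On this encoding `dissociated_b` is empty, while `dissociated_a` contains every chain after the first step, which is one way of stating that MOVE and CHAIN come apart only on option (76)a.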
 
 Phase theory, as introduced above, is a recent version of the general idea of cyclic 
domains for rule application. In Chomsky's recent work the suggestion is that CP and vP 
(and possibly others) constitute special domains which, upon derivational completion, 
require that their complement domains be shunted/transferred to the interpretative 
systems for evaluation. Above we suggested a way of viewing these transfer steps of 
derivation within our workspace/output structure distinction. Consider, however, the 
impact that viewing spell-out/transfer as a literal "handing-over" or "removal" of 
sub-parts of structure has on our discussion of chains in Chomsky's terms. 
We can make the point with reference to the case of successive cyclic wh-movement. 
For this example, we will consider only C⁰ as constituting the relevant phase-inducing 
category, as the point regarding the formal shape of chains remains the same 
even if we consider additional narrower domains for spell-out (like v/VP). 
The relevant structures and chains would look like (77) for successive cyclic 
wh-movement on Chomsky's view, and like (78) on the alternative discussed above: 
 
(77) What [did Dave think [ _ [that Mary believed [ _ [that John liked _ ]]]? 
 
 
 
 
(78) What [did Dave think [ _ [that Mary believed [ _ [that John liked _ ]]]? 
 
 
 
 
But suppose that we understand phases (here: CPs) as spelling out their complement 
domains upon reaching the next higher phase-inducing head, as Chomsky (1999) 
proposes. Technically, the proposal embodied in Chomsky's PHASE IMPENETRABILITY 
CONDITION (PIC) has it that when the next highest phase-inducing head is reached, all of 
the substructure constituted by the previous phase-head's complement domain spells out, 
leaving a residue (roughly equivalent to the "checking domain" of Chomsky's earlier 
proposals, see Chomsky 1993). This means that the first movement of the wh-element in 
our example will have its "head" visible and will therefore be able to be moved to the 
next CP, as this element will occupy the "edge" of the previous phase. But what happens 
to the initial chain (?) when the substructure containing its "tail" spells out? 
 
(79) a.       [what [that John liked what] 
      ?C-PHASE                   
?
 
b.           [that Mary believed [what [that John liked what] 
              
?
 
      ?C-PHASE   SPELLOUT???? 
c.           [that Mary believed [what [that John liked what] 
 
             BROKEN CHAIN??? 
      ?C-PHASE 
d.           [that Mary believed [what [.......?........] 
             ?? 
 
      ?C-PHASE 
e.       what [that Mary believed [what [.......?........] 
   
?
 
 f. etc.,.. (by phase) 
 
The spell-out by phase idea, regardless of the grain or size of structures considered to 
constitute such phase-domains, is not straightforwardly compatible with the kind of 
view of chains in (77), nor with the variant introduced above in (78).
40
 The 
                                                
40
 It could be that chains are "real", and work as suggested in Chomsky (1995) (as discussed above), but that 
they are fundamentally not "syntactic objects". Maintaining the reality of "chains" in a derivation-by-phase architecture 
appears to require the postulation of a kind of cross-dimensional object that exists across sub-stretches of syntactic 
computation and the interpretative components, or that chains are fundamentally objects of the interpretative system(s) 
which the syntax in some sense creates but cannot itself handle/manipulate (the system only sees particular elements 
"α" which can merge and remerge). 
 
problem with both of these conceptions is that they involve the postulation of a syntactic 
relationship which is maintained to the base position, but on the derivation by phase view 
these lower positions are understood to be in some sense "absent" at the relevant later 
stages of derivation in virtue of the spell-out operation. 
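
The problem can be illustrated with a toy model of the stages in (79). The sketch below is not a formalization of the PIC; it assumes, purely for illustration, that the active structure is a set of named positions, that spell-out deletes positions from that set, and that a chain is "intact" only if both its head and its tail are still active. The position names and the function `intact` are hypothetical.

```python
from typing import Set, Tuple

Chain = Tuple[str, str]  # (head, tail); hypothetical encoding


def intact(chain: Chain, active: Set[str]) -> bool:
    """A chain is intact only if both of its positions are still active."""
    head, tail = chain
    return head in active and tail in active


# Stage (79)a: the lowest C-phase is built; "what" has moved to its edge.
active = {"object", "edge_CP1"}
first_chain = ("edge_CP1", "object")
ok_before = intact(first_chain, active)

# Stages (79)c-e: the next phase head is reached, the lower complement
# domain is spelled out (literally removed here), and "what" moves on.
active = (active - {"object"}) | {"edge_CP2"}
ok_after = intact(first_chain, active)

base_tailed = ("edge_CP2", "object")   # chain of the kind in (77)
single_link = ("edge_CP2", "edge_CP1") # chain that only records the last step
```

With literal removal, `first_chain` and `base_tailed` both lose their tail, whereas `single_link` relates two positions that are active at the stage where it is formed.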
Even the idea of having a composed or linked chain appears not to make any 
sense on the derivation by phase view, as there is no stage of derivation over which we 
could describe such objects. We appear either to need chains to be non-syntactic entities 
(e.g., objects of the interface system, which is maybe plausible), or to regard chains 
as objects somehow superimposed over a dynamic derivational history (i.e., still 
"syntactic" but "higher order").
41
 
Note as well that it is not entirely clear how to maintain chains as syntactic 
objects on a phase-based view that also adopts the view of contexts as the entire 
structure of derivation up to the relevant point (i.e., everything dominated by the 
sister/context for α). The view of contexts as the entire structure to which α relates seems 
to require that we can refer to such structures; but if portions of such contexts are 
dynamically shunted/transferred by phase, it is not obvious how this should work. At best, 
contexts could be defined down to the previous phase-inducing head, and not below. 
These are interesting consequences, it seems to me. Put another way, suppose we 
take the conditions we want to hold of chains to be syntactic conditions. If we cannot refer 
to chains themselves (since there are no structural contexts over which we can capture the 
relevant relationships in the multiple spell-out view), this means, for example, that 
                                                
41
 See Uriagereka (1998) for a discussion of such a view. 
 
whatever properties are associated with the wh-element in virtue of having entered into 
the complement (θ) position of the embedded verb, as in our example above, must 
be somehow maintained as properties of the element itself (e.g., θ-marking of an 
element α could be understood as α receiving or being marked somehow with a θ-feature; 
see Hornstein 2000 for an extensive development of this approach). 
However, note that the workspace/output distinction as we have introduced it 
sidesteps these issues. I mentioned above that we might deploy our view to avoid issues 
that might arise regarding pre-/post-spell-out coherence. This is exactly such a situation. 
Consider again our earlier schema (i.e., the bottom-up version), repeated here in (80): 
 
(80)  
 
 
 
 
However it is that we might choose to view CHAINS, this schema allows us to 
straightforwardly maintain the overall coherence of the derivation in virtue of 
maintaining the output structure in the way pictured above. Thus the initial and 
subsequent movement pictured in (81)a/b could be seen to yield any of the objects in 
(81)c-e, depending on how we sort out Chomsky's two technical options ((81)c and 
(81)e) or our additional one ((81)d): 
 
(81) a.  b.  c.  d.  e. 
 
     =    OR          OR 
 
 
So we have a format available for considering all of the possibilities discussed above 
regarding how contexts are tracked/connected throughout the course of a derivation. 
Moreover, this view is neutral as it stands on the question of whether we define contexts 
as just the sister-label, or in terms of the entire structure dominated by the 
sister/co-constituent. I will return to this issue in a moment. 
This underscores again the generality of the workspace/output structure 
distinction and the new idea it brings to discussions of MSO-systems. It helps us here 
because it points to a way of conceiving of spell-out which does not involve a literal 
"handing-over" of structure from the syntax to the interface systems, as spell-out is 
sometimes characterized informally. Or, rather, the WS/O-distinction offers a concrete 
formulation of the content of "handing-over"/"transfer" under which the technical 
questions raised above do not arise. So whatever relations we establish as part of the 
syntax can still be "there", but simply not within the active stretch of syntactic 
computation. This makes it possible to conceive of chains in any of the ways pictured 
above, with potential stages of derivation that might have workspaces in which only parts 
of a given chain might be visible. This is another instance of the WS/O-distinction 
providing a way to understand pre-/post-spell-out coherence.
42
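
The way the WS/O-distinction avoids the problem can be sketched as follows. This is a toy model under my own assumptions, not the formal TCG definition: the output is a list that is only ever extended, the workspace is a set of currently active positions, and the class name `Derivation` and its methods are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Derivation:
    """Toy workspace/output split; all names are illustrative assumptions."""
    output: List[str] = field(default_factory=list)   # everything built so far
    workspace: Set[str] = field(default_factory=set)  # currently active stretch

    def build(self, position: str) -> None:
        # New structure is recorded in the output and is active in the workspace.
        self.output.append(position)
        self.workspace.add(position)

    def contract(self, positions: Set[str]) -> None:
        # Contraction only narrows the workspace; the output is left untouched.
        self.workspace -= positions

    def visible_part(self, chain: Tuple[str, ...]) -> Tuple[str, ...]:
        # The members of a chain that the active computation can still see.
        return tuple(p for p in chain if p in self.workspace)

    def recorded(self, chain: Tuple[str, ...]) -> bool:
        # True if every member of the chain is still present in the output.
        return all(p in self.output for p in chain)


d = Derivation()
d.build("object")
d.build("edge_CP1")
d.contract({"object"})          # the lower domain leaves the workspace
d.build("edge_CP2")

chain = ("edge_CP2", "object")  # a base-tailed chain, as in (77)
```

Here only the head of `chain` is visible in the workspace, yet the whole chain remains recorded in the output, so nothing is "broken" by the contraction.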
 
Let us now put this discussion back together with Chomsky's idea that chains are 
fundamentally connections between contexts. Take the unshaded nodes below to be the 
                                                
42
 McGinnis (2004:64fn18) raises the issue of how c-command is supposed to be understood as holding 
across phases. For example, if c-command is understood as derivational in the Epstein et al. (1998) sense, it is not obvious 
that when α and β merge they come to c-command everything each other dominates if some of the derivationally 
previous domination relations are literally no longer "there" in the narrow syntactic computation. Her exposition 
presupposes the intuitive notion of "handing-over"/"transfer", which is why her raising this question makes sense. 
Again, these issues do not arise given our WS/O-distinction. 
 
sister or co-constituent elements defining the contexts (the "chain") for the moved 
element α (occurrences of α are represented by the shaded nodes). 
 
(82)  
 
 
 
Viewing these context elements as independently relating along the dominance sequence 
(a relationship that is "there" in any event, whether we view it as manifesting a 
"dependency" relationship or not) yields the following three possibilities in (83), 
corresponding to those in (82): 
 
(83)  
 
 
 
We can now dispense with the extra-structural arcs yielding: 
 
(84)  
 
 
 
If the relationships between context elements are in some sense independent of the 
element(s) that relate to them (i.e., that "move through" the positions they in part define), 
then we might consider the possibility that a given element α might relate only once to 
such chain structures, perhaps targeting different parts of such complexes. 
 
We might also entertain a different conception of contexts, in two senses. First, as 
pointed out by Chomsky (1999) and Lasnik (2000), there are two possible relations on a 
merge-based view that might be relevant for implementing the context-view of 
chains. So far we have considered only sisterhood/co-constituency, but there is also 
motherhood or immediate domination/containment. We will require this latter conception 
(see below). 
Second, we have the following possible difference between two ways of 
understanding what contexts actually are, which is independent of the 
sisterhood/motherhood distinction: 
 
(85)  
 
 
 CONTEXT=LABEL CONTEXT=ENTIRE STRUCTURE
43
 
 
On the righthand side we have a picture of contexts as suggested in Chomsky's example 
(see (74) above). This is a view cast in hierarchical terms that is similar in spirit to 
Chomsky's (1955) definitions of contexts in terms of strings (where occurrences of an 
element α are uniquely defined by the left-to-right content of a string up to a given 
occurrence, picking up on a notion present in Quine (1960) for formalizing variable 
occurrences in logic). On this view, we cannot exploit the possibility of having the 
system be "unable to distinguish" between contexts, since contexts are derivationally 
                                                
43
 Howard Lasnik (p.c.) points out that this distinction is technically between two different ways of 
conceiving of labels: either as local head/phrase information or as encoding the "entire derivational history" up to the 
relevant point where α is integrated (whether on "first" or some subsequent re-merge). I will retain the notion of label 
for the local category/feature information view, using the notion of the "entire structure/derivational-history" to refer to 
Chomsky's (195) view. 
 
unique. On the lefthand side, however, we have a descriptively less powerful view which 
identifies contexts by just the local label of α's sister (or, perhaps instead, α's mother). 
 These two views shake out somewhat differently if derivations work top-down. 
For example, if α is initially integrated in the top-most position, and particular points of 
derivation are what is relevant for identifying contexts as in Chomsky's view, then α's 
initial context will be just the sister or mother node characterizing this initial position. 
The natural extension to the top-down view of the idea of contexts as the entire 
derivation up to the relevant point where α is integrated (or remerged) would then see 
intermediate positions as identified by all of the structure that dominates them. 
However, the local-label view remains the same on a top-down view. That is: 
 
(86)  
 
 
 CONTEXT=LABEL CONTEXT=ENTIRE STRUCTURE 
   "TOP DOWN" 
 
It is the weaker notion of contexts (viewing them as simply the local label, and not the 
entire structure up to the relevant point of derivation) that our view of SCM requires. 
This could perhaps be motivated on minimalist grounds appealing to simplicity and 
locality: the local label view does not require that we keep track of arbitrary stretches 
of derivation in order to keep track of occurrences of a given element α. However, the 
suggestion here is actually a bit stronger than this. That is, rejecting the descriptive power 
inherent in the "entire structure" view of contexts yields a system that is weaker in 
precisely the way that we require to understand SCM. It is, in fact, another way of stating 
the key idea being developed here to say that it is because contexts are narrowly/locally 
defined that situations can arise where they are not unique, and it is this state of affairs 
that underwrites SCM-type relationships (that is, situations in which contexts in adjacent 
domains cannot be uniquely identified). 
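
The difference in descriptive power between the two notions of context can be sketched computationally. The following is a minimal illustration, assuming a top-down dominance path encoded as a list of category labels; the example path, the labels, and the function names are hypothetical and are chosen only to show that label-contexts can repeat while whole-structure contexts cannot.

```python
from typing import Callable, List, Tuple


def label_context(path: List[str], i: int) -> str:
    """CONTEXT=LABEL: only the local label at position i."""
    return path[i]


def full_context(path: List[str], i: int) -> Tuple[str, ...]:
    """CONTEXT=ENTIRE STRUCTURE (top-down): everything down to position i."""
    return tuple(path[: i + 1])


def indistinguishable(path: List[str], context: Callable) -> List[Tuple[int, int]]:
    """Pairs of positions whose contexts the system cannot tell apart."""
    return [
        (i, j)
        for i in range(len(path))
        for j in range(i + 1, len(path))
        if context(path, i) == context(path, j)
    ]


# A two-clause dominance path, listed from the top down (illustrative only).
path = ["C", "T", "v", "C", "T", "v"]

repeats_by_label = indistinguishable(path, label_context)
repeats_by_structure = indistinguishable(path, full_context)
```

On the label view the two C positions (and likewise the two T and the two v positions) yield identical contexts; on the entire-structure view no two positions ever do, since each context properly contains the one above it.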
So, to sum up, we adopt a context-based view of movement chains, but limit the 
defining contexts to just the immediate local relationships (here: dominance relations 
realizing feature-licensing connections). 
In addition, we might consider the possibility that our three technical options 
regarding how contexts are connected to each other in the course of the derivation are 
not in fact technical/theoretical options for characterizing a single sort of 
relationship, but rather three different species of chains: different constituency 
structures of chains, if you like (see Uriagereka 1998:399 for a related discussion). Recall 
from our discussion of some schematic TCG derivations for A'- and A-relationships in 
§1.2 that we pointed to a "grouping" defined by stretches of agreeing properties on the 
dominance ordering, particular stretches of the path with shared ? and/or ? values. In 
particular, we pointed to the following: 
 
(87) C
?:f
WH[?:n]
?T
?:f
?:n
?v
?[?:n]
  C?T
?:f
?v
?[?:f]
 
 
D
?:f
?:n
       D
?:f
?:n
 
 
   A'-RELATION   A-RELATION 
 
In Chapter Three I will suggest that these ideas about co-valued properties along the 
dominance path can in fact be helpfully viewed as a kind of "chain" constituency. Note 
that the verbal projection path in the A'-relation in (87) manifests a grouping of the sort 
seen in the middle schema in (84), repeated here in (88) with some possible descriptive 
labels suggesting a typology of relationships that I will return to. 
 
(88)  
 
 
 
   Binding/Control  A'-relations  Some A-relations 
 (connections betwen 
    A-relations) 
 
It will be beyond the scope of the present work to pursue these divisions (and the 
possibility of others perhaps) in great detail, but in the course of developing some 
analyses I will again return to these schemas to point out some of the patterns which 
emerge on the specific implementation of the general TCG view being proposed in this 
work (see our concluding discussion in Chapter Three). 
 Let us return now to some technical possibilities regarding the node-identification 
process that I suggested might be useful in understanding SCM. This will lead us to a 
discussion regarding labels and structure and a particular set of assumptions regarding 
these concepts that I will be adopting here. 
Recall the following schema from above (repeated here as (89)) in which we 
suggested that the structural context for a "moving" element might be understood (under 
some relevant matching relation ?) to collapse/become-identified-with some lower like 
element. The suggestion was that such identification results in the equivalent of lowering. 
 
 
 
 
 
 
 
(89) a.  b.  c.  d.  e. 
 
                             ? 
 
 
          = 
 
 
   MATCHING RELATION ?    IDENTIFICATION    CONTRACTION 
 
Although we did not mention this earlier (as we had not yet discussed the notion of 
chains and contexts), this view demands that the relevant contexts for α be understood in 
terms of motherhood. Note what happens if we view the relevant matching to occur with 
the sister element of the shaded node as in (90): 
 
(90) a.  b.  c.  d.  e. 
 
                             ? 
 
 
          = 
 
 
   MATCHING RELATION ?    IDENTIFICATION    CONTRACTION 
 
The matching of the open nodes would either result in no equivalent of "copying", so that 
the structure would simply reduce as pictured in (90)d/e, or we would have to entertain 
the idea that the open/unshaded and shaded nodes in (90) can instantiate a sisterhood 
relation which is independent of any dominance relationship, allowing us to extend the 
logic we introduced above regarding dominance to effect a similar "lowering". But it is not 
clear how this latter view would work. To see what I mean by needing an independent 
sisterhood relation, consider (91): 
 
 
 
 
(91) a.  b. 
 
                             ? 
           ? 
 
  ?? 
 
In order for the lower open node to have been introduced, as shown in (91)b, it must 
already have a sister. So it is not clear how the sisterhood relation could support anything 
like the "lowering" operation we have been considering as a possible basis for 
approaching SCM phenomena. The picture in (90) above still remains a possibility that 
would be of interest (recall this is the same as the picture we initially offered to introduce 
the notion of contraction; see example (6)), but this would yield nothing like a 
"movement" relation, as it would not cause the shaded node to enter into any new 
dominance relationships in the output (or in the workspace). Below we expand on the 
reduced structural descriptions that were appealed to in our sketch of SCM analyses 
earlier on; these structures essentially deny the existence of linguistically significant 
"sisterhood" relationships. In addition to the technical problems for sisterhood just raised, 
our independent assumptions about structure will thus be seen to rule out the very 
possibility of identifying contexts in this manner (only the motherhood/dominance-type 
relations will be available in principle). 
Let us consider some possible ways of viewing the upper context of α which 
depend on different conceptions of phrase-internal projection distinctions, about which 
we have so far said nothing. Here are two familiar ones: 
 
 
 
 
 
 
(92) a.    XP            b.    XP 

       α{?F}   X'           α{?F}   XP 

              X⁰   ..              X⁰   .. 
 
In (92)a we have a traditional view positing three different phrase-internal projection 
types: the head (X⁰), the intermediate (non-minimal/non-maximal) X', and the maximal 
XP. In (92)b we have the view that specifiers are in fact adjunction structures (in the 
formal sense of May 1985, 1991; Chomsky 1986, as proposed e.g., in Kayne 1994). We 
can note right away that the specifiers-as-adjunctions view will be difficult to render 
consistent with the intuition behind the matching relation ? as we have so far been 
hinting at it, that is, the general idea of understanding the regulation of the size of the 
active workspace in terms of recursion (repeats of like elements). 
The general idea of TCG as we have been developing it is that the syntactic 
workspace cannot tolerate multiple tokens of a given type X, and that because of this 
limitation situations arise in which the workspace might either contract to remove one of 
the offending like elements from the workspace, or it might simply be unable to 
distinguish the two, resulting in the sort of collapse/identification sketched informally 
above. 
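
The contraction half of this idea can be given a rough procedural sketch. The code below is a toy and makes assumptions that go beyond the text: the workspace is a top-down list of category labels, and contraction is implemented in one simple way, by shunting the earlier like token and everything above it to the output. The identification/collapse alternative is not modelled, and the function names are hypothetical.

```python
from typing import List, Tuple


def expand(workspace: List[str], output: List[str], label: str) -> Tuple[List[str], List[str]]:
    """Add `label` at the bottom of the workspace, contracting first if needed."""
    if label in workspace:
        # A like element is already active: shunt it, and all material above
        # it, to the output so that the workspace holds one token per type.
        cut = workspace.index(label) + 1
        output = output + workspace[:cut]
        workspace = workspace[cut:]
    return workspace + [label], output


def derive(labels: List[str]) -> Tuple[List[str], List[str]]:
    """Run a top-down expansion over a sequence of labels."""
    workspace: List[str] = []
    output: List[str] = []
    for label in labels:
        workspace, output = expand(workspace, output, label)
    return workspace, output


workspace, output = derive(["C", "T", "v", "C", "T", "v"])
```

Under these assumptions the workspace never contains two tokens of the same type, and the material removed by contraction accumulates in the output in top-down order.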
 On this intuition (that recursion in structure matters for regulating the maximal 
expansions of the syntactic workspace) it is unclear how there could be categories 
divided into segments of the adjunction sort. There may be technical ways of working 
with such structures to implement the TCG intuition regarding a workspace limited to 
representing single tokens of given types, but I will not pursue this possibility here.
44
 
 Note that the traditional view involving intermediate-level categories in (92)a 
above is appealed to in TAG-theoretic derivations. Recall from above the general schema 
for the TAG equivalent of non-local movement relations: 
 
(93) a. XP  b. XP   c. XP 
            X' 
   α{*F}   X'    α{*F}   X'     α{*F}   X' 
            X' 
         ..tα..         ..tα..           X' 

                ..tα.. 
 
Can we appeal directly to this idea such that TCG would simply involve the inverse of 
TAG-adjoining to shrink/contract structures? 
 The answer, I will argue, is "no". Viewing the node-identification and contraction 
of structure as a kind of anti-adjoining will fail to generate the structures that I will argue 
are needed to understand the interaction between cyclic movement and certain 
binding-theoretic phenomena. Simply removing or splicing-out intervening structure defined by a 
top- and a bottom-node of the X'-type will not result in the kind of lowering that would 
result in α being dominated by material below its upper occurrence, though it does 
succeed in creating new local domains over which α will dominate. This is what I 
showed above in examples (90) & (91). 
                                                
44
 For example, we might find technical justification for distinguishing the segments of adjunction structures 
based on feature values, an idea that I make use of in a different way in later discussion. 
 
But the former (getting α to be dominated by its previously 'neighboring' initial 
dominance domain) is what I will argue to be required to correctly handle the binding 
facts (discussed below and returned to in more detail in Chapters 2 & 3). 
 To quickly re-illustrate the point, now in specific connection with TAG: having an 
anti-adjoining procedure (just reversing the "direction" of standard TAG steps of 
derivation) could yield splicing-out of the sort in (94): 
 
(94) a. XP ? X'    .. ? Z ? ..    X'  
 
? 
 
 b.          XP ? X' 
 
          ? 
 
But this sort of operation could not result in α being dominated by the intervening 
element Z in the output structure. Cases where this matters are illustrated with the 
following examples: 
 
(95) a. John thought pictures of himself/*herself were on sale 
b. Which pictures of himself/*herself did John think were on sale 
 
(96) a. ?John thought Mary sold pictures of himself 
b. Which pictures of himself did John think Mary sold 
 
The self-form within an NP in the embedded subject position (95)a can (and must, as the 
agreement mismatch shows) be bound by the matrix subject. This self-form can precede 
the matrix NP if it is within a fronted wh-phrase without loss of acceptability. This could 
be understood by connecting the analysis for (95)b to that of (95)a by positing a copy of 
the wh-phrase in the base position (or a trace that can be reconstructed into in some way). 
 
 However, note that if the phrase containing the self-form is in the object position, 
the embedded subject must bind it, and nothing higher can (96)a. But given this 
observation we cannot extend the (96)a analysis to (96)b via positing a trace/copy in the 
object position of the embedded verb, since we have just seen that binding by the matrix 
subject is not possible with the self-form in that position. But if there is an intermediate 
movement, for example to the top edge of the embedded clause, nothing intervenes 
between the self-form and the matrix subject, thus opening the possibility of keeping the 
view of binding constant across these examples. That is, (96)b could be seen to involve the 
following partial representation: 
           C-COMMAND 
(97) [Which pictures of himself] did John think [CP [wh .. himself] [Mary sold [wh .. himself]]] 
 
Crucially, in order for this line of analysis to work, the relevant intermediate copy/trace 
has to be c-commanded by the matrix subject, as pictured above. But we have just seen 
above that this is not what the TAG derivation provides. In our schema (94) above what 
we need is for the XP to somehow end up under intervening elements like Z, but this is 
not what we get either with TAG's adjoining or with a possible inverse operation that 
would otherwise be consistent with the views being developed here (e.g., contraction of 
X' elements). 
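
The point about splicing-out can be checked on a toy encoding of (94). The sketch assumes, for illustration only, that the structure is a top-down dominance path given as a list of labels, and that the moving element is attached directly under XP; `splice_out` and `dominators` are hypothetical helper names, not operations defined in the text.

```python
from typing import List


def splice_out(path: List[str], label: str) -> List[str]:
    """Identify the first and last token of `label`, dropping what lies between."""
    first = path.index(label)
    last = len(path) - 1 - path[::-1].index(label)
    return path[:first] + path[last:]


def dominators(path: List[str], mother: int) -> List[str]:
    """Nodes on the path dominating an element whose mother is path[mother]."""
    return path[: mother + 1]


# Top-down dominance path of (94)a; the moving element hangs under XP (index 0).
before = ["XP", "X'", "Z", "X'"]
after = splice_out(before, "X'")

dominators_before = dominators(before, 0)
dominators_after = dominators(after, 0)
```

Splicing reduces the path to XP over X', but Z is among the dominators of the moving element neither before nor after the operation, which is the shortcoming at issue.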
 In addition to these technical/conceptual and empirical concerns there is also the 
following worry, which is similar to the concern raised above for adjunction-type 
structures. Do we have reason to think that XP and X' are distinct such that X' would not 
interfere with the XP-XP relationship that we seem to need to support 
context-identification and the consequent "lowering" of elements? If XP and X' are distinct, and 
the relevant matching/identification could involve XP, then things could work for SCM 
as I sketched them above. But this would require that we refer directly to the equivalent 
of "bar-level" specifications to get the technical details of this view of movement off the 
ground. Another possibility is that X' elements are (i) there in the structure, (ii) 
non-distinct from XP, but (iii) for some reason "invisible". I return to this issue 
below, but note here that I will not be pursuing this line of thinking. I will instead be 
adopting a view here based on a different conception of categories and structure which 
does not admit the possibility of an X'/XP distinction in the first place. This view allows 
us to sidestep a number of these technical issues and problems, and abstract away from 
other issues that will not be of central interest. 
 The view I will be working with is drawn from one of a few interesting recent 
minimalist investigations aiming to reduce the available range of distinctions in the 
theory of phrase-structure that analysis can appeal to. It is arguably more consistent than 
the salient alternatives with the general intuition about the workspace not tolerating "like 
elements", as we will see. I turn to these matters directly. 
1.5.2. Labels & Structure 
Let us consider two rather different ways of simplifying structural descriptions with 
respect to structure and category that have been suggested in the recent literature. First, 
Collins (2001) has suggested that we might head towards a theory in which label 
distinctions are eliminated as marks on derived structure, retaining this information as a 
designation only for the ultimate parts of structures (i.e., the terminal elements). So 
instead of the sort of object from Chomsky's (1994) Bare Phrase Structure (BPS) in (98)a, 
where the underlined occurrence of the symbol 'α' is taken to be the label of the 
merge-derived complex {α, β}, we have rather (98)b, which encodes this label information only 
for the ultimate parts of the structure: 
 
(98) a. {α, {α, β}}   OR,..      α
                                / \
                               α   β

     b. {α, β}        OR,..
                                / \
                               α   β
 
Collins' approach is quite interesting, but it will not support the key idea I am aiming to 
develop here regarding dominance-encoding of chain-information and the node-
identification procedure that I suggest as relevant for successive cyclic movement. This is 
so because Collins' system does away entirely with the relevant label markings on 
derived structure, so the system does not make available the formal means to express the 
general idea underlying TCG.
45
 
However, others have pursued somewhat similar attempts at reducing the 
distinctions available for principles to refer to in the pursuit of eliminating redundancies 
and (perhaps) thus increasing restrictiveness. For example, other such label-reducing/
label-eliminating kinds of moves have been suggested as follows: 
 
 
 
 
 
 
 
                                                
45
 This may be hasty, but at present I do not see a clear way to begin articulating the TCG system I am 
developing here within Collins' assumptions. 
 
 
(99) a.    XP               b.              X   ← "head"
          /  \                             / \
        ZP    X'        "specifier" →     Z   Y   ← "complement"
        |    /  \
        Z'  X⁰   YP
        |        |
        Z⁰       Y'
                 |
                 Y⁰
 
(100)     <
         / \
        α   β
 
Brody (2000, 2003) suggests a reduction in the available distinctions for capturing 
phrase-structure generalizations, eliminating the structural distinctions in (99)a in favor 
of the more sparse (99)b. This is certainly a label-elimination approach, but different 
from what Collins pursues. Stabler (1999) suggests something along the lines of what 
Collins proposes with the minimal difference of including a pointer which indicates the 
asymmetry of projection (i.e., which dominance-line constitutes the link to the head of a 
given combination, as in (100)). Collins rather offers an inventory of principles which 
conspire to yield the results that labels are typically meant to encode (which, if correct, 
would eliminate the need for any such 'pointer' indicating the head of the structure). 
Stabler's view as far as I can see wouldn't support the system I am elaborating here either, 
for basically the same reason that we cannot deploy Collins' approach. 
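
The reduction from the X-bar structure in (99)a to the sparse structure in (99)b can be made concrete with a small sketch. The Python fragment below is only an illustration under assumptions introduced here, not a statement of Brody's own formalism: the class XBarNode, the function collapse, and the encoding of bar-levels as "0", "'" and "P" are all hypothetical. A daughter is taken to continue a projection line if it shares its mother's category and has a lower bar-level; every other daughter is treated as a dependent of the single collapsed node.

```python
from dataclasses import dataclass, field

# Bar levels ranked from head to maximal projection (an assumed encoding).
LEVEL_RANK = {"0": 0, "'": 1, "P": 2}


@dataclass
class XBarNode:
    cat: str                      # category label, e.g. "X"
    level: str                    # "0", "'" or "P"
    children: list = field(default_factory=list)


def collapse(node):
    """Merge a head and all of its projections into one labeled node.

    Returns (category, dependents), where dependents are the collapsed
    non-projection daughters found along the projection line, listed
    top-down and left-to-right.
    """
    dependents = []
    current = node
    while current is not None:
        next_projection = None
        for child in current.children:
            same_line = (child.cat == current.cat
                         and LEVEL_RANK[child.level] < LEVEL_RANK[current.level])
            if same_line:
                next_projection = child
            else:
                dependents.append(collapse(child))
        current = next_projection
    return (node.cat, dependents)


# (99)a: [XP [ZP [Z' Z0]] [X' X0 [YP [Y' Y0]]]]
zp = XBarNode("Z", "P", [XBarNode("Z", "'", [XBarNode("Z", "0")])])
yp = XBarNode("Y", "P", [XBarNode("Y", "'", [XBarNode("Y", "0")])])
xp = XBarNode("X", "P", [zp, XBarNode("X", "'", [XBarNode("X", "0"), yp])])

reduced = collapse(xp)
print(reduced)  # ('X', [('Z', []), ('Y', [])])
```

The result has one node per head, with Z (the former specifier) and Y (the former complement) both immediately dominated by X, which is the shape of (99)b.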
 However, there is a general question about all of these approaches that is worth 
raising: What is going on here? Eliminating primitive (intrinsic/non-relational) bar-level 
distinctions is not a new idea in the theory of phrase structure. This was present in the 
work of Muysken (1983), who adopted a relational conception of these distinctions 
(specified in terms of coherency conditions on the projection distribution of features like 
[±maximal] and [±project]), and this general relational conception is modified and 
adopted in Chomsky's (1994, 1995) BPS. But what is being suggested here is an 
elimination of the distinctions altogether.
46
 
The Collins/Stabler approach differs from Brody's in which direction we 
understand the elimination (better: reduction?) to work if we consider a mapping/ 
transition from the typically assumed sort of structure to each of these proposed 
conceptions. That is, for the standard view of the head-complement unit in (101), we 
have the following two alternative conceptions, differing on what is retained in the 
model. 
 
(101)                                α
                      (Brody)        |
              α          ↗           β
             / \
            α   β        ↘          (<)
                  (Collins/Stabler) / \
                                   α   β
 
Of these two ways of thinking, Brody's approach might seem at first blush to be more 
radical. The Collins/Stabler approach retains the part-whole structure central to the last 
half century of work in generative grammar (indeed: to most if not all of the entire history 
of thinking about language structure generally!) by maintaining the head/phrase distinction 
in structural terms while retaining labels only for heads. Brody clearly intends to stay 
within this tradition as well, though his view of basic phrasal structures, he notes, is 
                                                
46
 Or, perhaps more accurately, relocating the conceptual/empirical burden borne by these notions onto the 
backs of other (hopefully independently required) ones. I refer the reader to both Collins' and Brody's discussions. 
 
intended to eliminate as well "the apparent conflict between the long tradition of 
dependency theories" and "phrase structure theories of syntactic representation".
47
 The 
issue of doubling labels in the projection relation between a head and its dominating 
phrasal node(s) doesn't arise, as he has simply removed the distinction entirely, allowing 
only a single node (so, only a single label). 
 However, it seems clear that Brody's view of structure and categories can be 
understood to retain part-whole/constituency information via the antisymmetry of 
dominance relationships. Traditional heads can be understood as separate units by 
referring simply to a single labeled node (though see below regarding heads and PF); 
traditional phrasal constituents are captured via dominance as in standard approaches. 
The possible exception to any such straightforward mapping from standard approaches is 
any "junctures" involving a head with a specifier and complement. Consider (99) again, 
repeated here: 
 
 
 
                                                
47
 Brody (2003:16). The "conflict" that Brody alludes to is perhaps not immediately obvious, but I think it can 
be unpacked as follows. It is true that classical dependency theories (e.g., Tesnière 1959) and more recent, conceptually 
similar approaches (e.g., Hudson's (1984) Word Grammar, among others) deploy somewhat different sorts of notations 
and differ at least superficially from phrase-structure/constituency based approaches in their mission statements (and 
there are approaches which appear to fall in both camps; Steedman's Combinatory Categorial Grammar (CCG) strikes 
me as one such approach). But looking at current PS-based views of structure/category it is not obvious that there is any 
conflict. However, there is a difference that can be detected in the gradual historical shift from the initially deployed 
rewrite rules of Chomsky's early work (a pure PS-based approach) to his recent BPS. The shift has revolved almost 
entirely around the increasingly central role of headship/endocentricity. Phrase-structure rules on their own require no 
particular category matching between their left- and right-hand sides (e.g., X → Y Z). The recognition of 
generalizations stateable in terms of positing special members of local part-whole structures to play the role of 
determining the overall type of the local structure (headship) formed the basis of X-bar theory. Among all the notions 
that were subsequently introduced under this general umbrella (e.g., cross-categorial harmony, uniform bar-level 
limitations, etc.; see Jackendoff 1977, Emonds 1985), only the key notion of headship appears to have survived in 
recognizable form within current thinking (see Speas 1990, Chomsky's 1994, 1995 BPS, and Chametzky 1996, 2000 
for some related critical discussion). Brody's reduced structures (and, I think, Collins' as well) can be seen as attempting 
to remove the last barrier between the approaches, collapsing (almost) entirely the idea of formal ordering properties 
characterizing structure and substantive "dependency" relationships that can be understood to "live on" these 
dimensions. The question remains as to whether we need anything more than a single dimension. My suggestion here is 
that as far as narrow syntax goes we do not. This is essentially the claim that all we need is branching sequences. 
 
 
(102) a.   XP               b.              X   ← "head"
          /  \                             / \
        ZP    X'        "specifier" →     Z   Y   ← "complement"
        |    /  \
        Z'  X⁰   YP
        |        |
        Z⁰       Y'
                 |
                 Y⁰
 
For a given sequence of head-complement relationships, dominance ordering allows us to 
refer to either individual nodes or to principled subsequences that respect traditional 
constituency (take '→' to be a dominance link in what follows, with a left-right direction 
on the page indicating the standard antisymmetry of this relation): 
 
(103) A→B→C→D→E 
 
a. D→E 
b. C→D→E 
c. B→C→D→E 
d. A→B→C→D→E 
 
But, for example, we might take B→C to not be a constituent, since there is material 
which both B and C dominate. It is perhaps less clear what to say about branching in 
this system with respect to constituency, for example (take this to be the same object as 
(102)b above): 
 
(104) X → Y 
      ↓
      Z 
 
The straightforward view would say that Y (and all it dominates) is a unit, as with Z (and 
all it dominates), but X is not an independent unit. If it was, then why not X→Y 
excluding Z? Or X→Z excluding Y? But I just said above that we might regard each 
individual node as being a separate unit in the sense of "independent head". If we are 
collapsing the head/phrase distinction, how are these matters resolved? 
 Two separate lines of discussion are relevant here. First: let me return to the 
discussion at the end of the previous section regarding chains as contexts and specifiers 
as X'-sisters versus adjunction structures, connecting it now with the possibility of 
adopting these reduced structural descriptions in our formulation of TCG. Second: there 
is the (more recent) idea that head-movement relationships might be a "PF" phenomenon. 
 Regarding the first: there is a general idea that has been floating in the literature 
that non-minimal/non-maximal (intermediate-level/X') phrasal structures are invisible in 
some sense for the operations of the syntax. Sorting out this issue, as Chomsky 
(1995:382n23) observes, "depends on properties of phrases that are still unclear". For 
example: Kayne (1994) argues from his assumed Linear Correspondence Axiom (LCA) 
that all specifiers in fact realize adjunction structures; Starke (2000) argues that we dump 
the notion of specifier altogether, retaining only the notion of head-complement relations. 
On Kayne's view we could argue that the system cannot refer to intermediate units 
since the equivalent element in his system would always constitute segments of a category, 
which his approach consistently treats as essentially "one thing" (see, e.g., his definition 
of c-command). But, on the other hand, segments of a category on this view are labeled 
identically as XPs, so perhaps they can be referred to as independent units (certainly for 
the case of similar kinds of structures arising with adjuncts/modifiers we want this to be 
so).
48
 
                                                
48
 Whether we take the modifier case to be the same as the "adjunction" case (meaning adjunction now as the 
C-adjunction of the sort May (1985, 1991) and Chomsky (1986) discuss) depends on whether we take these to work in 
exactly the same way. On "adjunct" versus "adjunction" see Chametzky (1996, 2000). 
 
On Starke's view, which has it that the analogues of specifiers are understood to be 
a special case of "heads" in that they project their properties to determine (part of) the 
label of the dominating structure, the structure corresponding to intermediate projections 
in standard X-bar theory would be a "visible" unit since it will always be a maximal 
projection.
49
 
Epstein & Seely (1999), on the other hand, argue that intermediate projections are 
real, but that they behave as "fossils" (see Chomsky 1995:382n24) having initially been 
maximal but losing this status when they are targeted by a merge operation on Chomsky's 
BPS-relational view of intra-phrasal projection. But, based on the assumption that these 
elements are no longer visible to the system, Epstein & Seely go on to argue that such 
elements cannot possibly be sisterhood contexts defining chain-links as suggested in 
Chomsky (1995) since the relevant elements are by hypothesis invisible; therefore, 
they conclude, chains cannot exist. 
Both Chomsky (1999) and Lasnik (2000) point out that it is not obvious that 
intermediate invisibility rules out the merge-context view of syntactic chains, as it seems 
reasonable to take the motherhood relationship to define the local structure identifying 
chain links (as I suggested above for independent reasons specific to my technical 
ambitions here). 
However, one might respond to this suggestion (on behalf of Epstein & Seely) 
by noting that this just moves the problem around somewhat. On the motherhood view 
                                                
49
 This is Starke's notion of "checking". Instead of having ?P with feature {F} enter into a relation with a ?P 
with the same feature {F}, Starke suggests that ?P simply projects its {F} upon combination with ?P. See Starke's 
discussion for details. 
 
of merge contexts it is true that intermediate movements will now have visible contexts, 
as they will typically (always?) be dominated by XPs. But now the base position of a 
given chain should have an invisible element as its context, since presumably its mother 
will always be a non-maximal element. Though, this would depend on whether there is a 
specifier for the head of the base position: if so then the context will be maximal and 
hence visible, if not then it would be intermediate and thus invisible. Note that, as pointed 
out above, the specifiers-as-adjunction-structures view of Kayne and others might allow 
us to sidestep these technical problems if we could motivate the possibility of having 
segments of a category serve as appropriate contexts for the understanding of chains 
we've been discussing. 
 It is not, I think, quite clear what is really at stake here. That is, I agree with 
Chomsky that debate on this subject turns on presuppositions about "properties of phrases 
that are still unclear". 
We can elaborate on this point in another general way. Given the explosion of 
functional categories that has attended the development of the MP, it is an open question 
for any given element X whether an element which appears to be its specifier is, in fact, 
rather the specifier of a functional element Y that takes X as its complement. If there is 
such a Y, then X will be maximal and hence visible; if not, it will be intermediate and 
hence invisible. 
The degrees of freedom that theory makes available for analysis here make it 
difficult to sort out these alternatives. Note that it is not impossible; the present point is 
only that the array of distinctions made available with these various degrees of freedom 
simply predicts more classes/groups of facts than an approach without such degrees of 
freedom (and is thus less restrictive). 
What we might worry about even at this level of generality is what we might take 
as independent reasons to introduce principled/motivated constraints in the deployment of 
theories/models with this many alternatives (i.e., to narrow the possibilities/reduce the 
degrees of freedom). For approaches which adopt fine-grained functional category 
inventories and a maximal/intermediate level distinction and the possibility of 
C-adjunction (with double XP segments) things get even less clear in terms of the 
restrictiveness of the overall theory. 
This whole set of issues ties into a discussion from earlier years, as set out 
helpfully in the work of Stuurman (1988), regarding projection-level types. Stuurman 
develops what he refers to as the Single Projection Type Hypothesis (SPTH) which 
divides syntactic categories into two basic types: (i) recursive and (ii) non-recursive. The 
latter we can take to be heads (X⁰s); the former are the equivalent of maximal elements 
(XPs).
50
 
It is exactly the concerns regarding issues of restrictiveness raised above that 
drive Stuurman's theoretical developments in this respect, and it is concerns of this type 
(as well as his aim to eliminate redundancies) that similarly drive Brody's introduction of 
the collapsed structures discussed above. Brody's view renders trivial, for example, the 
general fact that projection lines (e.g., X⁰→X'→XP) can never be interrupted by some 
                                                
50
 Stuurman cites early work of Emonds (1971) where the SPTH is proposed, and Emonds (1973), where the 
idea is rejected in favor of having two recursive types (the equivalent of modern day X' vs. XP if we take XP to be 
potentially recursive). Stuurman does not discuss the issues which would arise for head movement that might force the 
adoption of sub-X⁰ structure for which one might want to posit recursive X⁰s. 
 
other intervening element of a different type since, in his reduced structures, there are no 
such internal distinctions, and therefore there is simply no room for any such 
interveners.
51
 
The general view, however, raises questions about what does go in place of the 
distinctions typically understood to underwrite, for example, head-movement versus 
XP-movement (or the phrase-structure status of modifiers).
52
 This brings us to our second 
relevant line of discussion regarding the Brody-type reduced structures and 
"constituency" from above: the idea of head-movement as a PF-phenomenon.
53
 
Chomsky (1999) (see also Boeckx & Stjepanovic 2001, Bobaljik 2001) suggests 
that head movement might not be part of the syntax proper, but rather is a 
PF-phenomenon. However, note that on these views it is certainly not the case that syntactic 
structure simply does not matter for such operations. For example, Bobaljik's (2001) 
approach takes syntactic structure to yield a weak pairwise ordering over which head-to-head 
relations are established, so saying that head-movement is a PF operation doesn't imply 
that it is not constrained in some manner by syntactic structure. 
                                                
51
 But what of the SPTH of Stuurman? Do we have recursive categories, or not? Strictly speaking, the notion 
of recursion refers to a function that calls itself. So we say that sentences embedded in sentences manifest recursion, 
and similarly with noun phrases inside of other noun phrases. But in the more recent era of separating out sequential 
arrays of functional and lexical types, do we ever have instances of local recursion in the sense of an X taking an XP 
complement? Work by Hoekstra (1984) suggests not, formulating what he called the Unlike Category Condition (UCC: 
*{X⁰ XP}). See van Riemsdijk (1998) for critical discussion and an alternative formulation of the key intuition which 
avoids some potential problems which arise. 
52
 I won't be discussing adjuncts/modification in this work. 
53
 What follows regarding "head movement" superficially parts ways with Brody's discussion, who argues 
(following Baker 1985 and others) for a mirror-theoretic understanding of syntax/morphophonology connections. What 
I am about to suggest however does not strike me as incompatible with Brody's proposals (see my earlier remarks as 
well on pre-/post-spell-out coherency and conservation of ordering properties). 
 
Let us now tie these two strands of discussion back into our discussion above of 
constituency in the Brody-type reduced structures. Consider again our abstract 
dominance sequence and the possible constituency groupings: 
 
(105) A→B→C→D→E 
 
a. D→E 
b. C→D→E 
c. B→C→D→E 
d. A→B→C→D→E 
 
We can now tentatively adopt the view of head-movement as a PF-phenomenon by saying 
that the PF-relevant properties of the individual nodes (A, B, C, etc.) are PF-constituents, 
which are related by principles that may involve reference to syntactic structure (perhaps 
along the lines sketched in Bobaljik 2001) but which only actually handle the PF-relevant 
properties. The issues regarding constituency with respect to individual heads thus fall 
outside the syntactic system. 
 Now consider branching and constituency again with reference to these reduced 
structural descriptions: 
 
(106) X → Y 
      ↓
      Z 
 
Now we are free to take the line suggested above regarding phrasal constituency in terms 
of traditional dominance ordering. On that view the object in (106) manifests three 
constituents: the entire object, Z (and whatever it dominates), and Y (and whatever it 
dominates). 
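
This notion of phrasal constituency (a node together with everything it dominates) can be sketched as follows. The fragment is a hypothetical illustration rather than part of the TCG formalism: the dictionary encoding of immediate dominance and the function name phrasal_units are assumptions made here, and terminal nodes are counted as trivial units of their own.

```python
def phrasal_units(tree):
    """Map each node to the set containing it and everything it dominates.

    `tree` maps a node label to the list of nodes it immediately dominates
    (an assumed encoding of the reduced structures).
    """
    def dominated(node):
        unit = {node}
        for daughter in tree.get(node, []):
            unit |= dominated(daughter)
        return unit

    return {node: frozenset(dominated(node)) for node in tree}


# The branching object in (106): X immediately dominates Z and Y.
branching = {"X": ["Z", "Y"], "Z": [], "Y": []}
units = phrasal_units(branching)
for node, unit in units.items():
    print(node, sorted(unit))

# X together with Y but without Z is not among the units.
print(frozenset({"X", "Y"}) in units.values())
```

On this encoding the object in (106) yields exactly three units (the whole object, Z, and Y), and neither X with Y alone nor X with Z alone is among them.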
 
 Note however that our view of constituency can interact with directionality of 
structure building. For example, on a top-down view, the Brody-type structure in (107) 
would have a derivation like that in (107)a-d: 
 
(107) A→B→C→D→E 
 
a. A→B 
b. A→B→C 
c. A→B→C→D 
d. A→B→C→D→E 
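
The contrast between the top-down derivation in (107) and the constituency groupings in (105) can be sketched as follows. This is a minimal illustration under simplifying assumptions introduced here: a dominance spine is encoded as a Python list, and the function names top_down_stages and bottom_up_units are hypothetical. The top-down stages are left-edge groupings (prefixes of the spine), whereas the traditional constituents are right-edge groupings (suffixes).

```python
def top_down_stages(spine):
    """Successive stages when the spine is expanded from the top, as in (107)."""
    return [spine[:size] for size in range(2, len(spine) + 1)]


def bottom_up_units(spine):
    """Groupings that respect traditional constituency, as in (105)."""
    return [spine[-size:] for size in range(2, len(spine) + 1)]


def show(sequence):
    """Render a sequence of nodes in the horizontal dominance notation."""
    return "→".join(sequence)


spine = ["A", "B", "C", "D", "E"]
stages = [show(s) for s in top_down_stages(spine)]
units = [show(u) for u in bottom_up_units(spine)]
print(stages)  # ['A→B', 'A→B→C', 'A→B→C→D', 'A→B→C→D→E']
print(units)   # ['D→E', 'C→D→E', 'B→C→D→E', 'A→B→C→D→E']
```

Only the final stage coincides with a traditional constituent; the intermediate top-down stages (e.g., A→B) are units that the bottom-up grouping never makes available.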
 
This kind of alternative is argued for in the work of Phillips (1996, 2003) on the basis that 
it yields a fundamentally different (derivational) conception of constituency making 
available units that he argues we need for analysis.
54
 It will be important to see whether 
the assumptions that lead to the conclusion here regarding our suggested treatment of 
SCM are roughly consistent with Phillips' solution to various constituency-test puzzles. If 
so, then the two independent lines of thinking (one regarding the dynamics of local 
unithood and one regarding the dynamics of reducing linked-local relations to local ones) 
can be seen to be pointing in (or rather, "to") the same general direction. (I do not 
address this issue here, though it seems to me that these reduced structures are consistent 
with what is needed to implement Phillips' analyses).
55
 
However, I am not principally concerned with either of these general sets of 
issues (i.e., head movement or constituency per se). Therefore, my adoption of 
                                                
54
 Phillips gets a bit more than just what we arguably need. On his view any left-edge grouping is a possible 
constituent. In virtue of this his analyses need to appeal to other notions to avoid overgeneration (though he argues the 
required 'other notions' are independently motivated). See Phillips (2003) in particular for discussion. 
55
 That is, what is required is to be able to refer to spec-head constituents excluding complements. As far as I 
can see this distinction is available in a top-down expansion of structure appealing to these reduced Brody-type 
structures. 
 
assumptions regarding structure and category for the present work can best proceed by 
seeking out a way to concentrate on the aspects of these concepts that are of interest for 
me here. I will thus be working with the reduced structures of the type Brody proposes, 
though the view here will be understood to be derivational, while Brody has extensively 
argued in favor of a representational view (see, e.g., the papers collected in Brody 2003). 
Consider the following graph of a typical transitive clause: 
 
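(108) [Tree diagram: conventional binary-branching graph of a transitive clause, with the 
heads C, T, v and V along the clausal spine and a D-N nominal in each of the subject and 
object positions] 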
 
We can extract the Brody-type structure as follows: 
 
(109)   C
         \
          T
         / \
        D   v
         \   \
          N   V
               \
                D
                 \
                  N
 
 
It will be along these dominance spines that all the action of the system developed 
here will happen. To save space in the presentation I will adopt a horizontal notation, so 
that (109) will look like (110) (the connecting arcs representing dominance): 
 
(110) C→T→v→V→D→N 
        ↓
        D→N 
 
For everything I will be arguing here, it will be sufficient to refer to simple sequences of 
this kind (though we will augment the labeled nodes with more complex feature 
descriptions as in §1.2 above). 
Note that we can take the adoption of this kind of structure as either fully 
embracing the Brody-type vision of structure and category labels, or we can simply 
understand this to be a suitable set of working assumptions which abstract away from the 
issues of intermediate-level categories, whether we treat head movement as "in" the 
syntax or not, and questions about how non-argument modifiers are integrated. That is, 
what this sparse representation allows us to concentrate on is the key type of information 
that I will be taking to be important for the TCG system, namely the nature of category 
sequences defining the dominance-spine of syntactic objects. Most if not all of what I 
will say here is consistent with this weaker view of adopting these ideas as simply a set of 
working assumptions. However, as noted above, this view collapsing intra-phrasal 
distinctions seems intuitively more compatible than some other possible approaches with 
the idea that the workspace cannot tolerate multiple tokenings of a given type X. And 
given the arguments above that we require motherhood/dominance to underwrite the 
context-identification view of SCM-type relationships, having a model within which this 
is the only possibility provides an attractive convergence of independent ideas. These 
correspondences with our central aims, along with the ability to circumvent the numerous 
technical difficulties that I mentioned above in connection with some salient alternatives, 
will be taken as sufficient justification to proceed with these assumptions. 
1.6. Chapter Summary 
We now have the following ideas in place. We assume the WS/O-distinction as a basis 
for our TCG implementation of an MSO-system. The workspace has been suggested to 
be restricted in two ways: 
 
(111) WORKSPACE ORDER: 
 The elements in the workspace manifest a weak partial order 
 
(112) WORKSPACE DISTINCTNESS (ANTI-RECURSION): 
 The workspace does not tolerate the presence of multiple tokens of type X 
 
For a workspace containing an X-element, I have suggested that the process of 
introducing any second X-element should be understood as part-and-parcel of the 
contraction procedure. One way that contraction can occur is in virtue of a particular 
response of the system when confronted with like elements: they can be identified 
under what I called matching relation ?, which we will see in Chapter 3 requires some 
further elaboration. 
 The following strengthening of the ordering restriction on workspaces was 
suggested as wel: 
 
(113) WORKSPACE CONNECTEDNESS (DOMINANCE): 
The elements in a given syntactic workspace must manifest a connected 
dominance order (for every x, y in the set, either x dominates y or y dominates x) 
 
 
This effects a fairly radical partition of structures, so that the workspace always only 
contains essentially a "single line", indexed via feature relationships so that there is 
coherent maintenance of spelled-out branches of structure. Moreover, given the 
mechanics of node-identification, it was suggested that spelled-out structure may 
"re-enter" the workspace in certain principled circumstances, and then be required to 
"re-spell-out" (and then re-enter again, and so on). Again, I will explore this view in 
connection with SCM phenomena in Chapter 3. 
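
The restrictions in (112) and (113) can be stated as simple checks on a candidate workspace. What follows is a sketch under assumptions introduced purely for illustration, not a definitive implementation of TCG: tokens are encoded as (name, type) pairs, immediate dominance as a set of (mother, daughter) pairs, and the function names are hypothetical. The ordering requirement in (111) is not modeled separately here.

```python
from itertools import combinations


def dominates(links, upper, lower):
    """True if `upper` dominates `lower` via a chain of immediate-dominance links."""
    frontier, seen = [upper], set()
    while frontier:
        node = frontier.pop()
        for mother, daughter in links:
            if mother == node and daughter not in seen:
                if daughter == lower:
                    return True
                seen.add(daughter)
                frontier.append(daughter)
    return False


def is_distinct(tokens):
    """(112): no two tokens in the workspace share a type."""
    types = [token_type for _, token_type in tokens]
    return len(types) == len(set(types))


def is_connected(tokens, links):
    """(113): every pair of tokens is ordered by dominance."""
    names = [name for name, _ in tokens]
    return all(
        dominates(links, x, y) or dominates(links, y, x)
        for x, y in combinations(names, 2)
    )


# A single line C→T→v satisfies both restrictions.
tokens = [("c1", "C"), ("t1", "T"), ("v1", "v")]
links = {("c1", "t1"), ("t1", "v1")}
print(is_distinct(tokens), is_connected(tokens, links))

# Adding a specifier branch under T breaks connectedness (d1 and v1 are unordered).
with_branch = tokens + [("d1", "D")]
print(is_connected(with_branch, links | {("t1", "d1")}))

# Adding a second token of type C violates distinctness.
with_repeat = tokens + [("c2", "C")]
print(is_distinct(with_repeat))
```

The second check illustrates why branches must be spelled out if the workspace is to remain a "single line", and the third shows the configuration that, on the present view, triggers contraction.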
 We have adopted a reduced vision of category/structure, importing ideas from the 
work of Brody (2003). This view was argued above to make for a clearer, technically less 
complicated fit with one of the central intuitions of the TCG approach as stated above in 
(112). We can now note that this point-of-view can be strengthened a bit. If something 
like (112) is correct, then something like the Brody-type reduced structures might be in 
fact required. The alternative would be to introduce a way of distinguishing between X⁰, 
X', and XP. But the entire point of Brody's proposals (and this holds of the 
Collins/Stabler view mentioned briefly above as well) is that these are distinctions that 
we can and should learn to live without, as they are redundant with other independently 
required concepts (we may dispute this, it is ultimately an empirical matter, but that is the 
claim, and it is the right one to advance on minimalist grounds). Indeed, one can take this 
general direction of theory-development as a natural continuation of Chomsky's (1994, 
1995) BPS-project, which was aimed at (among other things) eliminating primitive bar-
level distinctions. 
 The key ideas underlying our view of SCM-type relationships were seen to rely on 
a weakening of Chomsky's (1995) view of chains as sets of contexts, where "contexts" 
were understood on his approach as the entire previously established structure that an 
element α merges with. But this view was suggested to be too strong, as it yields unique 
contexts for each "link" of any complex chain. The key idea here revolves on a denial of 
this: it is the fact that contexts are not uniquely identifiable that permits cross-domain 
movements of the linked-local/SCM sort. 
 I turn next to a more detailed empirical and theoretical discussion of SCM. 
 
 
CHAPTER 2: Regarding Successive Cyclic Movement 
In this chapter I discuss a number of issues regarding syntactic theory and analysis with 
reference to successive cyclic movement (SCM) and related phenomena. First, I canvass 
an array of empirical considerations that have been taken in the past to argue in favor of 
SCM. I then discuss the issue of what motivates the intermediate/non-target movements 
posited by SCM analyses (the TRIGGERING problem; see (64) above in §1.4.2) 
alongside the issues of how we ought to regulate phases and understand localizing 
evaluation for convergence (the CONVERGENCE problem). 
2.1. Types of Successive Cyclicity Effects 
I turn now to take stock of the sorts of considerations that have led grammarians to think 
that something like successive cyclic movement operations are for real. The initial 
motivation for positing successive-cyclic movements came from discussion and 
arguments in Chomsky (1973), where it was proposed that wh-movement ought to be 
viewed as clause-local, with the "edges" of clauses (a COMP node made available under 
S-bar) serving as escape hatches. For some time then successive cyclic movement was 
motivated only by theory-internal considerations arising in the proper treatment of 
movement locality. However, a number of other phenomena have since been brought 
forward that are of a different sort. 
Most (if not all; see below) of these phenomena contribute to the body 
of converging evidence for the idea that movement relations are not generally 
"one-fell-swoop", but rather manifest a linking of local relations. This idea of successive cyclic or 
linked-local relations, as we will see here, brings a remarkably diverse range of 
phenomena into a single abstract class. 
 It will be helpful in this discussion to make reference to a set of distinctions 
drawn from Abels (2003). He distinguishes between some logical possibilities regarding 
ways that movement relationships might affect or interact with intervening material, 
discriminating between "punctuated" and two different general types of "uniform" 
conceptions. Consider first this general schema: 
 
(114) FILLER       GAP 
            PATH 
 
The possible effects that any such filler/gap relationship might have on elements along 
the path between them can be discussed with reference to the following tri-partition: 
 
(115) a. ?   ? (Uniform; no effect of ??? on the path) 
 
b. ?   ? (Uniform; entire path affected by ???) 
 
c. ?   ? (Punctuated; only parts of path affected by ???) 
    ..  ..   ..   ..  .. 
We will see that the various phenomena that have been argued to favor successive cyclic 
movement analyses are not homogeneous: all of the possibilities in (115) are 
instantiated. 
2.1.1. Wh-Copying 
What strikes me as one of the most intuitively convincing types of evidence is the 
following: in certain languages we actually see overt copies of moved elements. The 
following data are drawn from an interesting summary and theoretical discussion in 
Felser (2004) illustrating this phenomenon for wh-movement in a number of languages:
56
 
 
(116) a. Wen glaubst Du wen sie getroffen hat?   German 
who think you who she met has 
'Who do you think she has met?' 
 
b. Wêr tinke jo wêr-t Jan wennet?    Frisian 
where think you where that-CL J. resides 
'Where do you think that John lives?' 
 
c. Waarvoor dink julle waarvoor werk ons?   Afrikaans 
wherefore think you wherefore work we 
'What do you think we are working for?' 
 
d. Kas o Demìri mislenola kas i Arìfa dikhla?   Romani 
whom Demir think whom A. saw 
'Who does Demir think Arifa saw?' 
 
This phenomenon is punctuated in Abels' sense: this kind of copying is only available at 
clausal boundaries (i.e., CPs). Below we will see other types of evidence that suggest the 
possibility that wh-movement might generally be "more successive cyclic" than this, 
implicating the edges of VP as well (specifically vP). So we will want to ask why this 
copying phenomenon does not show up anywhere but clause edges. 
Interestingly, we also see this kind of copying in the L1 acquisition of 
English, where the target grammar never permits it. Consider 
the following case of so-called 'medial-wh' (De Villiers et al. 1990; McDaniel et al. 1995; 
Thornton 1990): 
 
 
 
                                                
56 Felser (2004) draws the following examples from the following sources: (116)b is from 
Hiemstra (1986: 99); (116)c from Du Plessis (1977:725); and (116)d is adapted by Felser 
from data in McDaniel (1989:569n.5). 
 
(117) a. What do you think what Mini put on _ ? 
 b. Who do you think who's in the box? 
 
The existence of this kind of copying in early child English raises interesting challenges 
for the Subset Principle (Berwick 1985), as this apparently lies within a superset of the 
grammar of standard English. It would seem then that learners would require negative 
evidence to abandon such wh-copying. But regardless of how this is sorted out (perhaps 
in terms of an indirect sort of negative evidence), the existence of such cases points 
strongly towards the reality of something like SCM. 
2.1.2. Q-Stranding 
Other cases show an effect that is intuitively related to the copying phenomenon illustrated 
above, where it is alleged that we can see part of a moved expression "stranded" in 
positions which it has by hypothesis moved through. Facts of this kind include the 
patterns of so-called quantifier stranding reported in a dialect of Irish English by 
McCloskey (2000), specifically in West Ulster English.
57
 Consider: 
 
(118) a. What all did you get t for Christmas?  Standard English 
b. Who all did you meet t when you were in Derry? 
c. Where all did they go t for their holidays? 
 
(119) a. What did you get t for Christmas?   Standard English 
b. Who did you meet t when you were in Derry? 
c. Where did they go t for their holidays? 
 
(120) a. What did you get all for Christmas?   West Ulster English 
b. Who did you meet all when you were in Derry? 
c. Where did they go all for their holidays? 
 
                                                
57 McCloskey refers readers to the work of Henry (1995). Apparently the Q-float-type 
phenomenon under discussion varies by dialect, like the copying phenomenon discussed above, 
and, as in "Standard" English, it is not possible in Belfast English. See McCloskey's paper 
for further discussion of dialect differences in this regard. 
 
McCloskey notes that in Standard English the cases in (118) versus (119) differ in 
whether they require "that the answer is a plurality [..] insisting on an exhaustive [(118)], 
rather than a partial, listing of the members of the answer set". The interesting cases from 
West Ulster English, in (120), are claimed by McCloskey to pattern in these interpretative 
properties with the examples in (118), and not those in (119). He notes that this 
phenomenon is not exclusively tied to matrix clause interrogatives, but appears in 
embedded environments as well: 
 
(121) a. I don't remember what all I said   West Ulster English 
b. I don't remember what I said all 
 
McCloskey develops a stranding-type analysis for these phenomena, whereby wh-elements 
move successive cyclically and may abandon the associated element all at 
places along the movement path. Following earlier proposals (Postal 1974, Koopman 
1999), McCloskey assumes a structure like (122) for the wh-element plus the 
quantificational element all, analogous to ideas that have been put forward in analyses of 
similar phenomena involving NP-movement (see below on stranding and 
raising-to-subject): 
 
(122) [DP[wh] [DP[wh] who]_i [D' [D0 all] t_i ]] 
 
 
 
This structure, McCloskey argues, allows either of the two circled nodes (the two 
wh-marked DPs) to undergo movement (i.e., he makes the not unreasonable assumption that 
the specifier and the dominating DP node both bear the wh-properties as marked above). If 
this is correct, then in principle every position to which the wh-phrase moves could be a 
position where all is stranded.
58
 
Important then for the notion of successive cyclic movement are the following 
cases in (123), which illustrate that all can be stranded in an intermediate position: 
 
(123) a. Where do you think all they'll want to visit t? 
b. Who did Frank tell you all that they were after t? 
c. What do they claim all (that) we did t? 
 
This kind of stranding phenomenon is sometimes referred to as "floating" in virtue of 
earlier approaches to these matters which involved transformational operations that 
literally moved (floated) such elements from position to position.
59
 
 In addition to the logical possibility of true "floating" as a way to analyze 
such cases, there is also an extensive literature treating phenomena of this type in terms 
of analyses that base-generate all in the various positions where it occurs, thus suggesting 
the possibility that the statement of the laws governing the distribution of elements of this 
type might be independent of the putative path of successive cyclic movement 
operations.
60
 
 We can push this point a bit by considering a similar phenomenon as it arises 
in cases of A-movement (these data are from standard English): 
 
                                                
58 McCloskey acknowledges that this analysis relies on the possibility of left-branch 
extraction in order for the wh-element to strand the quantificational element all, though he 
notes as well that such an operation falls within the bounds of known cross-linguistic 
variation. 
59 Analyses positing literal transformational floating were offered, for example, in Kayne 
(1978). See Bobaljik (2001) for a thorough review of the issues surrounding these elements 
and what they may (or may not) be able to tell us about the nature of syntactic structures 
and the properties of movement operations. 
60 Some of these base-generation analyses posit that elements like all have the special 
property that they can only be generated in positions where they can enter into a relation 
with a movement trace. On such views the distribution of all is not, in fact, independent of 
movement; rather, the difference is in terms of whether that element ever was "in" any 
positions other than the one in which it surfaces. 
 
(124) a. The men all seemed to appear to be likely to leave 
 b. The men seemed all to appear to be likely to leave 
 c. The men seemed to all appear to be likely to leave 
 d. The men seemed to appear all to be likely to leave 
 e. The men seemed to appear to all be likely to leave 
 f. The men seemed to appear to be all likely to leave 
 g. The men seemed to appear to be likely all to leave 
 h. The men seemed to appear to be likely to all leave 
 
The possible stranding sites appear to be a bit more prolific here than is typically 
assumed.
61
 Although, as with McCloskey's A'-movement cases, there is apparently 
dialect variation here as well.
62
 If the distribution of these elements marks the path of 
movement, then this suggests that we have the following intermediate positions:
 
 
 
(125) The men seemed t to t appear t to t be t likely t to t leave 
 
A subset of these are motivated under fairly standard assumptions about the existence of 
a base/θ-position internal to the most embedded VP and the existence of EPP-features 
marking the "subject" positions of the relevant infinitivals (marked below). But three 
others are less straightforward (marked with "??" in (126)): 
 
(126) The men seemed t1 to t2 appear t3 to t4 be t5 likely t6 to t7 leave 
 
      EPP: t1, t3, t6      VP-INTERNAL/θ-POSITION: t7      ??: t2, t4, t5 
 
The distribution of all in these cases might then be suggesting that movement is very 
successive cyclic. In contrast to the wh-movement cases of stranding discussed by 
McCloskey, which appear punctuated, this A-movement case appears to manifest a 
uniform "all positions affected" relationship between the surface position of the subject 
NP and its base/θ-position (that is, if we have reason to think movement really does 
target every single projection on the path).
63
 
61 Accounts that insist that movement targets the edge of every intervening XP would do 
fine in accounting for this pattern, but then it is not clear why we should not see more 
prolific stranding in A'-movement as well. 
62 Judgments on a-h vary, though to my ear they are all equally acceptable. (Norbert 
Hornstein informs me that he finds b, d, and g less acceptable than the rest.) See Hornstein 
(2000) for a possible explanation. 
 
However, as mentioned above, there exist analyses which posit base-generation 
instead. While the case is by no means closed, it is not crystal clear that these kinds of 
facts actually bear on the question of successive cyclic movement (see Bobaljik 2002). 
2.1.3. Agreement on the Path 
Assuming agreement relationships are typically local (not trivial; see below), the 
distribution of φ-properties might tell us something about the path of movement. 
However, given the possibility of an operation of the AGREE sort proposed in Chomsky 
(1999), where the relation between two agreeing elements can potentially be non-local, 
such phenomena might not tell us anything about the path of movement. Nonetheless, 
consider the following. 
Kayne (1989) offers cases like the following illustrating participial agreement on 
the path that the relevant local A-movements would usually be assumed to take: 
 
(127) Les filles sont apparues avoir été reportées disparues 
the girls are.3RD.PL appeared.3RD.PL have been reported.3RD.PL disappeared.3RD.PL 
'The girls appeared to have been said to have disappeared' 
 
However, if these relations can be licensed without movement, then it is not clear what we 
should make of these agreement facts. On Chomsky's view such series of embedded 
non-finite clauses manifest no internal phase divisions, so it is not implausible that 
agreement could be understood to occur long-distance. It would then be an open question, 
turning on the presence/absence of "EPP-features", whether the movement of 
the NP (les filles) from its base position would have to involve all, some, or none of the 
intermediate nodes. 
 
63 The view I will end up endorsing in Chapter 3 regarding this particular case will be 
inconclusive. 
 
 But such a non-movement (long-distance "agree") approach is not as plausible for 
similar local agreement phenomena that occur in a variety of languages in the domain of 
A'-movement.
64
 For example, consider the complementizer agreement phenomenon in 
Irish (McCloskey 1979, 2000): 
 
(128) Creidim  gu-r      inis sé bréag 
I-believe go-PAST tell he lie 
"I believe that he told a lie" 
 
(129) an  t-ainm a  hinnseadh dúinn a  bhí  ar an áit 
the name  aL was-told   to-us aL was on the place 
"the name that we were told was on the place" 
 
Finite clauses in Irish differ depending on whether or not they contain an A'-movement 
trace. Clauses without such a trace show the particle go (gu-r in (128)); clauses that 
contain one show the particle aL, as in (129). This phenomenon manifests at every clause 
edge, strongly suggesting that something like successive cyclic movement is at work. 
Chamorro (Chung 1982, 1994) has been argued to show similar kinds of 
intermediate agreement effects (though see fn65 below). Consider (from Chung 1994:1): 
 
(130) Hum?lum si Maria [na   ha-p?nak si Juan i   p?tgun] 
AGR-assume Mary  COMP AGR-spank Juan the child 
 "Maria assumes that Juan spanked the child" 
                                                
64 It is not plausible that non-local agreement is at work here, since these agreement 
relationships would have to cross phase boundaries (on Chomsky's view), and clausal 
boundaries in general on anyone's account, if there are no local movements involved. 
 
 
(131) Hayi hinalom?a si Maria [t pum?nak t i  p?tgun] 
who? WH.assume  Maria  WH.spank  the child 
 "Who does Maria assume spanked the child?" 
 
Chung notes that, "in simple wh-constructions [..] the presence of a moved wh-phrase is 
signaled morphologically on some head in the extended projection of [+V]" and that in 
"long-distance wh-constructions, the special morphology shows up on every such head 
along the path" (glossed as WH in (131) above). Extraction across multiple boundaries 
thus shows the agreement effect at every clausal edge: 
 
 
(132) Hafa  sinangani-n Juan as Dolores [t ni  minalago'?a   [t p?ra un-taitai    t]? 
WHAT? WH[OBJ2].tell Juan OBL Dolores  COMP WH[OBL].want-AGR FUT WH[OBJ].AGR-read 
 "What did Juan tell Dolores that he wants you to read?" 
 
Similar kinds of local agreement phenomena have been documented in a number of other 
languages.
65
 Setting aside some interesting complications, facts such as these provide 
fairly strong support for SCM.
66
 
 Both the Irish and Chamorro cases manifest a punctuated effect on the movement 
path. Moreover, it is not possible to "skip" intermediate positions such that the agreement 
effects would show up both above and below a position in which the effect would be 
absent. Where it occurs, it occurs all the way down the structure from the fronted element 
to the extraction site. 
                                                
65 In, for example: Kikuyu (Clements 1984, Sabel 2000), Moore (Haïk 1990), Palauan 
(Georgopoulos 1985), Passamaquoddy (Bruening 2001), Malay/Bahasa Indonesia (Cole & 
Hermon 2000, Saddy 1991). 
66 The "complications" include (i) the local agreement along the movement path evident in 
Chamorro is not analyzable as agreement between the wh-element and the local head of CP 
and is not, strictly speaking, agreement with the moving XP, but rather with distinguishable 
properties of the trace elements (see Chung 1994:7-11) and (ii) it turns out that there are 
cases for which such successive cyclic agreement is optional. Chung argues that these cases 
manifest a D-linked versus non-D-linked distinction (see Pesetsky (1987), and see Cinque 
(1991) for a view subsuming D-linking under referentiality), with the D-linked cases 
manifesting the optionality, thus suggesting only optional successive cyclic movement. 
 
2.1.4. Some Binding Theoretic Effects 
Consider the binding possibilities of the self-form in (133)a: 
 
(133) a. Which pictures of himself did John know Bill wanted? 
b. [which..himself] did John know [which..himself] Bill wanted [which..himself] 
 
Cases such as (133)a are ambiguous: the self-form can be interpreted as anteceded by 
either John or Bill. Such cases are not totally straightforward, since there are a number of 
confounding factors that must be controlled for (see §3.2.2 in Chapter 3 on "logophors"). 
 Here we will simply note that such ambiguities have been tied to successive 
cyclicity in A'-movement via the idea that the local movements create local contexts for 
the licensing of the self-elements. 
 Similarly, we see from the following pair of examples that similar phenomena 
appear to manifest in both A'- and A-movement. Consider (from Barss 2001): 
 
(134) a. The women_1 asked [which pictures of themselves_1/2/3] the men_2 had said 
that the children_3 had brought t_WH to the school fair 
 
b. The women_1 consider [old pictures of themselves_1/2/3] to have struck the 
men_2 as [appearing to the children_3 [t_NP to be amusing]] 
 
These cases, on a successive cyclic movement view, would be related to structures 
involving local copies (or traces which could be "reconstructed into") as follows: 
 
(135) a. The women_1 asked [wh..selves_1/2/3] the men_2 had said [wh..selves_1/2/3] that 
the children_3 had brought [wh..selves_1/2/3] to the school fair 
 
b. The women_1 consider [..selves_1/2/3] to have struck the men_2 as 
[..selves_1/2/3] [appearing to the children_3 [..selves_1/2/3] to be amusing..] 
 
 
Both A'-movement ((135)a) and A-movement ((135)b) appear to manifest the same sort 
of "expansion" effect, as Barss notes, in terms of the available antecedents for the 
relevant self-forms. The full set of available antecedents is interpretable with a fairly tight 
view of binding in mind if we adopt a successive cyclic view for both types of movement 
as sketched above. 
 Other interesting cases involve binding impossibilities as they appear to arise in 
A-movement. Consider first the cases in (136)a&b, and then what seem to be (i) a 
binding-theoretic violation in (136)d and (ii) the absence of ambiguity in (136)f, as 
contrasted with (136)c: 
 
(136) a. John_1 seemed to himself_1 to appear to Mary to be getting fat 
 b. John_1 seemed to appear to himself_1 to be getting fat 
 c. John_1 seemed to Mary to appear to himself_1 to be getting fat 
 d. *Mary seemed to John_1 to appear to himself_1 to be getting fat 
 e. It seemed to John_1 to appear to himself_1 that he was getting fat 
 f. John_1 seemed to Bill_2 to appear to himself_1/*2 to be getting fat 
 
First, note there is no problem with John as the antecedent for the reflexive himself in 
(136)a. Similarly, we can increase the distance and have the reflexive occupying the 
experiencer-PP of the second raising verb, and the reflexive binding is still possible, as in 
(136)b. So the first point is to take note of a three-way disjunction of possibilities: (i) 
either binding domains can span across levels of embedding or (ii) the antecedent must 
somehow be "in" the lower infinitival clause in addition to participating in its overt 
position or (iii) the reflexive must somehow be "in" the superordinate matrix clause in 
addition to participating in the relations of its overt position. 
 All of (i)-(iii) have been advocated at one point or another, so it is worth 
considering, in terms of what else we say here, which of these options we can remain 
consistent with. 
 Now consider (136)c/d. The first experiencer-PP (to Mary) does not serve to 
block the antecedence relation between John and himself in (136)c, though somehow the 
relation is blocked in (136)d, where there is no overt intervening element, despite the fact 
that antecedence between the two experiencer-PPs is otherwise perfectly legitimate (as in 
(136)e).
67
 
 This whole array of facts falls out nicely if we assume that the following 
movement operations have transpired: 
 
(137) a. John_1 seemed to himself_1 (t_John) to appear to Mary (t_John) to be getting fat 
 b. John_1 seemed (t_John) to appear to himself_1 (t_John) to be getting fat 
 c. John_1 seemed to Mary (t_John) to appear to himself_1 (t_John) to be getting fat 
 d. *Mary seemed to John_1 (t_Mary) to appear to himself_1 (t_Mary) to be getting fat 
 e. It seemed to John_1 to appear to himself_1 that he was getting fat 
 
The case in (136)/(137)a can be straightforwardly understood with a clause-local 
conception of binding domains, as can (136)/(137)b&c. In the former the overt position 
of the subject John licenses the reflexive; in the latter two it is the trace/copy of John in 
the subject position of the infinitival to appear which provides the local licensing. 
 
67 Some speakers do not find the binding of the self-form in these cases to be acceptable. 
These speakers appear to prefer the a-case(s) with a simple pronoun over the b-case(s) with 
a self-form in (i): 
(i) a. It seemed to John_1 (to tend) (to be likely) to appear to him_1 that he was getting fat 
 b. It seemed to John_1 (to tend) (to be likely) to appear to himself_1 that he was getting fat 
See Chapter 3, §3.2.2 for some discussion. 
 
 The interesting case of blocking/intervention now arises in (136)/(137)d, which 
we understand to be out for essentially the same reason that *Mary appeared to himself to 
be getting fat is out; that is, there is a mandatory local antecedent for the reflexive, and it 
disagrees in gender specification. So (136)/(137)d is out via a minimality-type effect, 
since the trace/copy of Mary constitutes a closer potential antecedent than John does. 
And we know from (136)/(137)e that John being embedded in a PP structure has nothing 
to do with the impossibility of binding in (136)/(137)d, since such relations are 
independently fine. There are some interesting wrinkles here that require addressing (in 
particular the status of these self-elements in relation to the anaphor/logophor distinction; 
recall I noted above that the binding between experiencer-PPs does not appear to be 
clause-local). I will return to these matters in the discussion and analyses in Chapter 3. 
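
The minimality reasoning applied to (137)c versus (137)d can be made concrete with a small sketch. This is only an illustration under simplifying assumptions of my own: each clause is reduced to its subject position (overt or a trace/copy) plus the gender of a reflexive experiencer, and the names Subject, Clause, and local_binder are hypothetical. The sketch deliberately makes no claim about the experiencer-to-experiencer binding in (136)e, which is set aside in the text as a separate wrinkle.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Subject:
    name: str
    gender: str             # "masc" or "fem"
    is_copy: bool = False   # True for a trace/copy left by raising


@dataclass(frozen=True)
class Clause:
    subject: Optional[Subject]        # occupant of the clause's subject position
    reflexive_gender: Optional[str]   # gender of a reflexive experiencer, if any


def local_binder(clause: Clause) -> Optional[str]:
    """Return the clause-local antecedent of the reflexive, or None if binding fails.

    Assumption: the occupant of the local subject position is the mandatory
    (closest) antecedent, so a gender mismatch there cannot be repaired by a
    more distant nominal.
    """
    if clause.reflexive_gender is None or clause.subject is None:
        return None
    if clause.subject.gender != clause.reflexive_gender:
        return None
    return clause.subject.name


# The infinitival "to appear to himself ..." clause in (137)c and (137)d.
appear_clause_137c = Clause(Subject("John", "masc", is_copy=True), "masc")
appear_clause_137d = Clause(Subject("Mary", "fem", is_copy=True), "masc")

binder_c = local_binder(appear_clause_137c)   # copy of John licenses himself
binder_d = local_binder(appear_clause_137d)   # copy of Mary intervenes; no licit binder
```

The point of the sketch is only that the intervening copy, not any overt element, is what decides the outcome in the embedded clause.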
2.1.5. Interaction with Variable Binding 
Consider also the following (see Bošković 2002; Lebeaux 1991; Nunes 1995): 
 
(138) a. *[His_1 mother's_2 bread] seems to her_2 _ to be known by every man_1 to be _ 
the best there is 
 
b. [His_1 mother's_2 bread] seems to every man_1 _ to be known by her_2 to be _ 
the best there is 
 
This case, like the A'-movement case I will discuss in a moment, shows the necessary 
availability of intermediate position reconstruction. We understand the bracketed phrase 
to have moved through the positions marked via the underscores. In order to license the 
bound variable reading for the a-case, the bracketed structure must be in the scope of 
every man. But this puts the element in the c-command domain of the pronoun her, which 
induces obviation (a Condition C effect). What the b-case shows, however, is that it is 
possible to have the bracketed phrase reconstruct to the intermediate position, where it is 
within the scope of every man but above the pronoun her. And, as Bošković points out, 
the a-case, ill-formed with the indicated co-indexing, is fully acceptable on the bound 
variable reading so long as we have disjoint reference between her and his mother. The 
combination of these observations suggests that intermediate reconstruction is possible, 
but not necessary. It moreover suggests that the output structure handled by the 
interpretative systems must be coherent in the sense that the moved phrasal complex must 
be interpreted as "in" one or the other position, but not both (as this would cause 
conflicts that would presumably correspond to unacceptability). This case, like the case 
involving the self-elements discussed above, constitutes to my mind quite strong evidence 
for something like cyclic movement. 
 Notice as well that the same sort of example can be constructed in the context of 
A'-movement, but with a twist which suggests that A'-movement is in fact even more 
local than just COMP-to-COMP. Consider (from Fox 1998): 
 
(139) a. √ [Which of the papers that he_1 gave Mary_2] did every student_1 ask her_2 to 
read carefully? 
b. * [Which of the papers that he_1 gave Mary_2] did she_2 ask every student_1 to 
revise? 
 
The pronominal element he in (139)a must occur in a position below the quantifier every 
in every student in order for the bound variable reading to obtain, on the standard 
assumption that c-command relations determine scope possibilities relevant to such 
variable binding. But, in order for the construction to avoid a Condition C violation 
between her and Mary, the complex wh-expression must appear above the surface position 
of her. This means that the wh-expression in (139)a must reconstruct in the vP-edge 
position marked as licit in (140)a and not in the *-marked one: 
 
 
 
 
 
(140) a. √ [Which of the papers that he_1 gave Mary_2] did every student_1 [vP √ [ask 
her_2 to read * carefully]]? 
 
b. * [Which of the papers that he_1 gave Mary_2] did she_2 [vP * [ask every 
student_1 to revise * ]]? 
 
But when we switch the positions of the every-phrase and the relevant pronoun, as in 
(139)b, we see (in (140)b) that there is no position which could permit the variable 
binding of the pronoun (he) within the wh-expression without having Mary c-commanded 
by she, thus resulting in a Condition C violation (this is indicated above by *-marking the 
relevant possible positions). 
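
The logic of this argument amounts to a search for a reconstruction site that satisfies two conditions at once. The following Python sketch is an expository simplification and not Fox's formulation: c-command is approximated by precedence on a list ordered from highest to lowest, and the names licit_reconstruction_sites, SURFACE, VP_EDGE, and BASE are labels I am assuming for illustration.

```python
from typing import List


def licit_reconstruction_sites(spine: List[str], sites: List[str],
                               quantifier: str, pronoun: str) -> List[str]:
    """Return the sites where the wh-phrase can be interpreted.

    spine lists elements from highest to lowest; by assumption, an element
    c-commands everything that follows it on the list.
    """
    licit = []
    for site in sites:
        position = spine.index(site)
        # he_1 inside the wh-phrase must be in the scope of the quantifier.
        bound_reading = spine.index(quantifier) < position
        # Mary_2 inside the wh-phrase must not be c-commanded by the
        # coindexed pronoun (Condition C).
        condition_c_ok = position < spine.index(pronoun)
        if bound_reading and condition_c_ok:
            licit.append(site)
    return licit


sites = ["SURFACE", "VP_EDGE", "BASE"]

# (140)a: every student > vP edge > her > base position
spine_140a = ["SURFACE", "every student", "VP_EDGE", "her", "BASE"]
# (140)b: she > vP edge > every student > base position
spine_140b = ["SURFACE", "she", "VP_EDGE", "every student", "BASE"]

sites_a = licit_reconstruction_sites(spine_140a, sites, "every student", "her")
sites_b = licit_reconstruction_sites(spine_140b, sites, "every student", "she")
```

Under these assumptions only the vP-edge site survives in the a-case, and no site survives in the b-case, which is the pattern the *-marking in (140) records.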
 This kind of case suggests that not only is successive cyclic movement real, but 
that it involves more than just the typically assumed movements of the COMP-to-COMP 
sort. Rather, movement must proceed as follows: 
 
(141) [CP [WH] [C' C0 [ .. [vP [WH] [vP .. [CP [WH] [C' C0 [ .. [WH] .. 
 
 
So the question about what features might be involved in motivating successive cyclic 
A'-movements appears to require properties that can be present not only in intermediate 
CP-positions, but also at the edges of vPs.
68
 Some other phenomena bear on this possibility as 
well, some of which I will discuss in Chapter 3. 
 It is worth noting here that this kind of phenomenon can be shown to extend to 
boundaries in structure where there is arguably no vP, as in passives (from Legate 
2000):
69
 
                                                
68 As noted earlier, wh-copying phenomena (as well as McCloskey's stranding) appear never 
to manifest at positions other than the relevant C-positions. Assuming this is true, whatever 
the motivations for these two types of cyclic movement (to CP and to vP), we are in need of 
a story which explains why there should be such differences. 
69 In the first examples, Legate indicates the intended interpretation to be that Mary keeps 
being introduced to her own dates at parties; in the second, there is a charity auction at 
which dates with bachelors are sold. The argument 
 
 
(142) a. √ [At which of the parties that he_1 invited Mary_2 to] was every man_1 [vP √ 
[introduced to her_2 * ]]? 
 
b. * [At which of the parties that he_1 invited Mary_2 to] was she_2 [vP * 
[introduced to every man_1 * ]]? 
 
(143) a. √ [At which charity event that he_1 brought Mary_2 to] was every man_1 [vP √ 
[sold to her_2 * ]]? 
 
b. * [At which charity event that he_1 brought Mary_2 to] was she_2 [vP * [sold to 
every man_1 * ]]? 
 
Such cases are problematic for the view that phases are just vP and CP. I will discuss these 
matters a bit in Chapter 3. 
2.1.6. Inversion Effects 
In Spanish we see the following ordering alternation: 
 
(144) a. Contestó la pregunta Juan 
 answered the question Juan 
 'Juan answered the question' 
 
b. Juan contestó la pregunta 
 Juan answered the question 
 'Juan answered the question' 
 
In certain cases of wh-movement, this inversion ordering is obligatory: 
 
(145) a. ¿Qué querían esos dos? 
  what wanted those two 
 'What did those two want?' 
 
b. *¿Qué esos dos querían? 
    what those two wanted 
 
                                                                                                                                            
is extended with an examination of unaccusatives as well. Legate also considers other tests 
for phase-hood and concludes that they all point towards a wider inventory of phases than 
just vP and CP. 
 
Of relevance to the existence of successive cyclic movement are cases such as the 
following, where we see this inversion in every local domain that we would think the 
wh-element has passed through: 
 
(146) ¿Qué pensaba Juan [que le había dicho Pedro [que había publicado la revista]]? 
what thought Juan that him had told Peter that had published the journal 
'What did John think that Peter had told him that the journal had published?' 
 
Effects similar to these exist in French (so-called stylistic inversion) and were 
initially documented and argued to support successive cyclic movement analyses in 
Kayne & Pollock (1978). However, the relevant phenomenon in French appears to be 
generally optional, so it is difficult to say whether the relevant cyclic movements 
themselves are optional or not. The Spanish cases brought forward in Torrego's work, 
however, were argued to be an obligatory phenomenon. 
However, these data have become somewhat controversial in subsequent years, as 
there appear to be dialect differences regarding which types of wh-element must trigger 
such effects. These differences may be systematic, however. Baković (1995) compiles 
some of these differences as they have emerged in the literature and supplements them 
with an extensive survey eliciting speaker judgments. The following differences emerge in 
Baković's study: 
 
(147) a. No inversion with any wh-phrases (Suñer 1994) 
 b. Inversion with argument wh-phrases only (Torrego 1984; Suñer 1994) 
 c. Inversion with all but reason wh-phrases (por qué/"why") (Goodall 1991a,b) 
d. Inversion with all wh-phrases in matrix clauses; all but reason wh-phrases 
in subordinate clauses (Baković's survey) 
e. Inversion with all but reason wh-phrases in matrix clauses; only argument 
wh-phrases in subordinate clauses (Baković's survey) 
f. Inversion with argument wh-phrases in matrix clauses; no inversion in 
subordinate clauses (Baković's survey) 
 
 
These findings have not, to my knowledge, been corroborated by any independent 
investigation. However, the pattern suggestively converges with the optionality of 
agreement in Chamorro, which Chung (1994) shows to hold in cases of D-linking. 
Consider along similar lines the phenomenon in English of matrix subject-auxiliary 
inversion: 
 
(148) a. Who _ liked John? 
 b. Who did John like _ ? 
 
(149) a. Who did Mary think _ liked John? 
 b. *Who did Mary think did John like _ ? 
 
In many standard dialects of English, SAI is an exclusively matrix phenomenon, as the 
ill-formed case in (149)b shows. However, there exist dialects in which such aux-inversions 
happen in embedded contexts as well, and they share some of the interesting properties of 
other cases that suggest cyclic movement. Consider the following cases from Belfast 
English:
70
 
 
(150) a. Who did John hope [ would he see _ ]? 
 b. What did Mary claim [did they steal _ ]? 
 c. I wonder what did John think would he get _ ? 
 d. Who did John say [did Mary claim [had John feared [would Bill attack _ ]]]? 
 
This last case appears to show the local/SCM-type effects of wh-movement on Subj/Aux 
inversion in ways analogous to the Subj/V inversion of the Spanish type. 
                                                
70 These are taken from Pesetsky & Torrego (2001), who themselves cite the extensive work 
of Henry (1995) on this dialect. 
 
2.1.7. Control as Raising? 
On at least some views of control phenomena, sequences of control predicates must 
manifest successive cyclic movement. Hornstein (2000) and Manzini & Roussou (2000) 
develop analyses within the MP that attempt to assimilate control and raising. For 
example, on Hornstein's story (151)a is derived by operations allowing movement of the 
nominal expression into (and then out of) superordinate θ-positions, finally landing in a 
Case position ((151)b ignores the matrix θ-position for expect and marks positions as 
either θ or "EPP"): 
 
(151) a. John expected to want to try to leave on time 
 
CASE          EPP   θ     EPP   θ   EPP   θ 
 
 b. John expected (t) to (t) want (t) to (t) try (t) to (t) leave on time 
 
Such a movement (raising) analysis of control phenomena has been the subject of some 
recent controversy.
71
 In the TCG approach developed here we will encounter theory-internal 
reasons prohibiting any straightforward identification of control and raising, 
though I will suggest that the TCG mechanics bear on the relation between the two in an 
interesting way. In particular, I discuss a way of connecting control with reflexivization 
(as Hornstein does), but one that remains distinct from raising (though both 
will involve node-identification/contraction).  
                                                
71 See in particular Culicover & Jackendoff (2003), Landau (2004), and Boeckx & Hornstein (2003).
 
2.2. Some Thinking on Successive Cyclicity 
Consider again the case of wh-movement with which we began our discussion of SCM
(repeated here as the b-cases in (152)-(154)), to which we now add for consideration
instances of NP-movement argued to be similarly successive cyclic: so-called Raising-to-
Subject (RtS; the a-cases in (152)-(154)):
 
(152) a. John seems to be likely to appear to like carrots
b. What did Dave think that Mary believed that John liked?

(153) a. John [seems [to be likely [to appear [ _ to like carrots]]]]

b. What [did Dave think [that Mary believed [that John liked _ ]]]?

(154) a. John [seems [ _ to be likely [ _ to appear [ _ to like carrots]]]]

b. What [did Dave think [ _ [that Mary believed [ _ [that John liked _ ]]]]]?
 
I argue that although there is ample evidence that the picture of these dependencies as
decomposed in (154)a/b is largely correct, current theory does not appear to have reached
anything like a consensus as to what explains the fact that the system works in this way.
 The crux of the problem that these phenomena present for theory and analysis
centers on the fact that it seems that we cannot approach these linked-local dependencies
with a general reduction to the typical simple local cases in mind:72
 
 
(155) a. John appears [ _ to like carrots]

b. What [did John like _ ]?
 
                                                
72 Strictly, the "base" position of the movement illustrated in (155)a would be a derived position, connected
to the underlying VP-internal position. My focus at the moment is on the properties of the "target" position, so this
example does just fine in this respect. As it happens, I will later on suggest that the relationship depicted by the link in
(155)a is not, in fact, a CHAIN, whereas the relation in (155)b is. What will be a CHAIN in the area of A-movement will
be the connection between the Case/φ position and the vP θ-position (e.g., [TP John_i [T' T0 [vP t_i [v' v0 [ .. ]]]]]). Local
passive structures will similarly constitute CHAINS (e.g., John was kicked _ ).
 
Any simple reduction of the linked-local A- and A'-movements above to local cases like
these is not possible because the properties which we might take to drive the
establishment of these local relations (Case/φ-properties for (155)a and wh-properties
for (155)b) are arguably not present in the intermediate positions for the more
complicated structures involving embedding.
In fact, when positions with the relevant properties are available, movement
relationships which would then have to cross such positions to relate to a more distant
landing site are impossible, as wh-island and superraising cases show.
 
(156) a. Dave wondered [what John likes _ ]

b. *What did Dave wonder [whether John likes _ ]

(157) a. It seems John was told [ _ to arrive on time]

b. *John seems it was told [ _ to arrive on time]
 
The verb wonder in (156) selects for a [+wh] complementizer, which can be a
local/direct landing site for wh-movement. But when this property is present, it cannot be
moved across, as (156)b shows. Similar restrictions hold for raising, prohibiting an NP
from moving past a Case/φ position ((157)b).
It is also impossible for an element which has satisfied the relevant properties to be
moved further to satisfy them, as it were, "again". Consider what are sometimes called
"freezing" facts such as the following:73
 
                                                
73 There exist, however, some curious apparent exceptions to these generalizations. Consider, for example,
so-called 'copy-raising' constructions (Rogers 1967):
(i) John seems [like/as if [he likes carrots]]
These share with the raising-to-subject constructions the property of having a non-thematic subject (as
Potsdam & Runner (2003) show in some detail). But this suggests that the overt matrix subject John somehow enters
into a movement or chain relationship with the embedded subject pro-form he in (i). I will not be discussing these
constructions here, though the general approach to them in the present framework should be clear: these cannot be of
the same embedded clausal type as raising predicates.

(158) *John seems that _ liked carrots (vs. It seems that John liked carrots)

(159) *Who do you wonder _ Bill liked _ ? (vs. You wonder [who Bill liked _ ])

So now how do we deal with such intermediate movement cases in the general
framework of feature-checking and Last Resort? The idea of Last Resort has been a
central notion in minimalist research: syntactic operations are not free, but must be driven
or caused.74 Such driving causal forces are subsumed under the notion of feature-
checking. So what causes the intermediate movements in this sense?
2.2.1. M-Features
Standard minimalist thinking views movement relationships as governed by a condition
of LAST RESORT which generally disallows syntactic operations unless they are required
to license/check a property or feature that would otherwise cause the derivation to crash
(see Chomsky 1995:280 for one technical version of this idea). This constitutes a
rejection of the notion of Move-α developed within Government and Binding (GB)
approaches in the 1980s and early 1990s. On this earlier GB view, operations were
regarded as essentially free, but with restrictions imposed on either derivations or on
levels of representation to rule out impossibilities.
 On the GB view, then, the work of stating principles governing movement was
not typically handled within the statement of the operations themselves. Lasnik &
Saito (1992), for example, suggest a general formulation "Affect-α" which simply says
"Do anything (move, insert, delete) to anything". Constraints on levels of representation
(like the ECP or Subjacency) were then understood to inspect the broad range of
possibilities that arise from the general Affect-α formulation, and reject those outputs
which did not comply with the conditions on the various levels (e.g., S-Structure, LF).

74 Like many ideas in current work, feature-driven operations have a longish history, present in very early
work in the form of so-called triggering morphemes that were tied to various specific kinds of transformations (Klima
1968). In current MP work, the inventory of "operations" is now quite general, so there are not "construction specific"
transformations that could be specifically triggered (or, at least, that is the aim for theory construction). Rather, what is
triggered in current theory is some response out of a small set of general options (e.g., 'move', 'delete', etc.).

One way of looking at this would be to say that GB-type systems looked at
grammatical constraints governing well-formedness as restricting outputs of otherwise
unrestricted operations. The MP, in contrast, can be viewed as imposing conditions on the
inputs to such operations, that is, in the statement of laws governing what can count as
a legitimate structural description over which an operation can apply. Operations can be,
and in fact are, still viewed as general in the MP (e.g., merge, move, insert, delete; as
opposed to, e.g., "passivize"), but the leading idea for theory construction is one of
economy: nothing can happen unless it is forced. Viewed in this way, the differences
between GB and more recent work in the MP take on a bit more subtlety. We can see the
subtlety of the GB/MP difference as it emerges in current work's appeal to what I will
call "M-features".
Much current work which attempts to formulate analyses of successive movement
consistent with Last Resort does so at the cost of advancing closer towards genuine
explanation. In effect, what much recent work does to motivate these movements in a
way consistent with Last Resort is to adopt a brute force solution; that is, to posit a
feature or property of the positions constituting the intermediate links in these composed
dependencies. Call this a Merge/Move-feature or M-feature for short.75
 
 This is potentially saying something quite exciting, perhaps despite appearances.
We could understand this claim about the forces driving movement relationships as
having hit upon a genuine explanation. Thus, movement is primitive: a deep, central,
irreducible fact of the matter regarding the workings of the faculty of human language
(FL). In minimalist terms, it is neither to be motivated with reference to the nature of the
interface between grammar and other cognitive systems (natural interactions) nor in
virtue of the internal coherence of the system based on virtually conceptually necessary
assumptions about its workings. Movement is, rather, a basic property of the narrow
syntactic component of FL.
 On the other hand, the M-feature view could also be taken to be saying something
perhaps more depressing. That is: we simply do not have a clue why there might be
displacement properties in human language syntax. The best we can do at the moment is
catalog the facts, and hope that something turns up which might lead us towards a better
understanding. Or, slightly less depressing: we may have lots of ideas about why there
ought to be such displacement properties, but no terribly strong reasons for thinking any
one of them versus any other is right.76
 
                                                
75 In certain cases this feature is suggested to be satisfiable by Merge alone, as in cases of expletive elements.
I return to this later on.
76 My sense is that we currently face something more like the latter situation in theoretical linguistics. The
only way out is of course to keep pursuing the options. Note that the "depressing" picture needn't be taken to be all that
depressing. Sometimes we go through stages in the development of ideas where really the best we can do is to put like
things into the same box, and keep looking for good generalizations that have further predictive power. So while I will
be writing here with a skeptical tone about what I'm calling "M-features" (e.g., EPP-/P-features), it's important to realize
that Chomsky's recent generalization of these properties is a sensible move in this respect in that it broadens the range
of phenomena claimed to go "in the same box", whatever their ultimate explanation. My view, developed here, is that
M-features actually pick out a heterogeneous set. Local and successive cyclic "movement" are not the same
phenomenon.
 
 M-features have taken on an increasingly important role in recent developments
in minimalist syntactic theory. Chomsky (1999) suggests that the M-feature conception of
movement be generalized beyond just the kinds of intermediate/non-target movements
taken to be involved in successive cyclicity. He proposes alternative mechanisms to
handle the licensing relations that in earlier manifestations of minimalist syntax had been
understood to be the movement-driving ones (like Case/φ or wh-features). Fukui (1986)
refers to such properties as Kase features; I will refer to them here as Core Licensing
Properties (CLPs).
 So, these relationships between CLPs in Chomsky's recent work now fall under
his PROBE/GOAL relations and the notions of "matching" and AGREE. And, having
relocated the job of handling these CLPs in this potentially non-local (or, rather, less
local) way, he suggests that what drives actual displacement is rather M-features, for
all movement. This means that abandoning M-features in connection with current approaches
comes with the burden of rethinking movement generally: not just the mysterious
successive cyclic cases, but the local cases as well.
The A-movement example from above can serve to illustrate this recent
generalization of the role of M-features, as well as the basic idea
behind the AGREE operation. Whereas minimalist accounts previously understood all but
the last of the sub-steps of movement to be driven by M-features (as in (160)a), now all
such steps are so motivated, as in (160)b', following the independent Case/agreement
(Case/φ) licensing which happens "long distance" via the operation AGREE. The agreeing
item, dubbed the probe, scans its c-command domain for a matching element or goal
(e.g., John in (160)b in virtue of its matching Case/φ-features). Once AGREE takes place, it is
the presence of M-features which drives the various sub-movements in (160)b':
 
(160) a.
[Case/φ]     [M]              [M]           [M]    [θ]
John [seems [ _ to be likely [ _ to appear [ _ to [ _ like carrots]]]]]

 b.
   PROBE ................................................ GOAL
_ [[Case/φ] seems [ _ to be likely [ _ to appear [ _ to [ John[Case/φ] like carrots]]]]]
   ------------------------- AGREE -------------------------

 b'.
[M]          [M]              [M]           [M]    [θ]
John [seems [ _ to be likely [ _ to appear [ _ to [ _ like carrots]]]]]
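The division of labour in (160)b/b' can be sketched in a few lines of code. This is a toy rendering under stated assumptions rather than anyone's official formalization: the clausal spine is flattened to a top-down list, so that "c-command domain" reduces to "everything lower in the list"; M-features are a boolean flag; and the names `Head`, `agree` and `m_driven_steps` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Head:
    """One head on the clausal spine, listed top-down."""
    label: str
    m_feature: bool = False                  # hypothetical flag standing in for an M-feature
    specifier: Optional[str] = None          # phrase currently in this head's specifier
    spec_features: frozenset = frozenset()   # licensing features carried by that phrase


def agree(spine: List[Head], probe: int, wanted: frozenset) -> Optional[int]:
    """Scan downward from the probe and return the index of the closest head whose
    specifier carries every wanted feature (the goal), or None if nothing matches."""
    for i in range(probe + 1, len(spine)):
        if spine[i].specifier is not None and wanted <= spine[i].spec_features:
            return i
    return None


def m_driven_steps(spine: List[Head], probe: int, goal: int) -> List[str]:
    """List, bottom-up, the M-bearing heads between goal and probe (probe included)
    whose specifiers the goal passes through once AGREE has applied."""
    return [spine[i].label for i in range(goal - 1, probe - 1, -1) if spine[i].m_feature]


# The spine of (160), top-down; in (160)b' every landing site bears an M-feature.
spine = [
    Head("T (seems)", m_feature=True),
    Head("T (to be likely)", m_feature=True),
    Head("T (to appear)", m_feature=True),
    Head("T (to)", m_feature=True),
    Head("v (like)", specifier="John", spec_features=frozenset({"Case", "phi"})),
]

goal = agree(spine, probe=0, wanted=frozenset({"Case", "phi"}))
steps = m_driven_steps(spine, probe=0, goal=goal) if goal is not None else []
print("goal index:", goal)
print("landing sites:", steps)
```

Note that in the sketch the licensing relation (`agree`) and the displacement (`m_driven_steps`) are computed independently: removing every `m_feature` flag leaves AGREE intact but yields no movement at all, which is the sense in which M-features alone carry the burden of displacement on this view.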
 
In (160)a note that there are three types of movement chains that we can differentiate in
terms of what licenses their upper and lower 'links'.77 There are: (i) chains with θ-tails
and M-feature heads, (ii) chains with both head and tail characterized/licensed by M-
features, and (iii) licensed (final landing site) heads with M-feature tails. Nowhere in this
scenario is there a chain that resembles the basic local situation, with both the head and
tail in substantive licensing configurations (e.g., a chain CH = ⟨DP[Case/φ], DP[θ]⟩). (However,
note that on at least one technical view in Chomsky (1995), discussed above,
there would be a chain connecting the base and final target positions.)78
 
77 I will here follow the common terminology of referring to the "top" of a given chain as its HEAD, and the
"bottom" of a given chain as its TAIL. Take CHAIN for now to be a descriptive term of convenience, without intending
commitment to a technical notion of CHAIN as opposed to MOVEMENT or a binding-type relation, or what-have-you.
Later on I will commit to a specific view in this regard.
78 Chomsky himself appears to have lost interest in the potential differences between various technical ways
of handling movement (see in particular his closing remarks in Chomsky 1999 on this topic where both "chains" and
"multiple merge" are labeled "terminological conveniences" with yet another set of terminological distinctions used to
pick out the "more stern 'official' theory").

In (160)b/b', however, we see only two types of chain relation, one with a θ-tail
and an M-feature licensed head, and the rest with M-feature heads and tails. Again, in the
latter scenario Case/φ relations are taken to be an independent precondition for the M-feature
driven movements: there must be a derivationally prior PROBE/GOAL relationship. M-
features have thus, in Chomsky's recent work at least, become synonymous with
"movement" itself.
 I find the general line problematic. The presence/absence of M-features is not
principled. They are stipulated as obligatorily present wherever evidence suggests
movement is not optional, and optionally present wherever evidence suggests movement
is not obligatory. We might be tempted to view this (the M-feature Generalization)
as a kind of reintroduction of the Government & Binding (GB-)era conception of MOVE-α.
What is the status of Last Resort when we can say that it necessarily applies where
features are present that require checking, but have the presence/absence of the driving
properties be themselves unregulated?
 But this move is not as clear as the GB-era conception of Move-α given the lack
of supporting machinery of the GB-type that has by-and-large been stripped away under
MP assumptions. In GB we understood applications of the general "move anything
anywhere" rule as being restricted by other formal and substantive requirements,
typically (though not always) stated as conditions on levels of representation so that
movements could not result in violations (of Case theory, Binding, the ECP, etc.).79 So it's
not true that optional M-features are reintroducing Move-α in this sense: we can't
move anything anywhere. Rather, we can move when (i) the appropriate AGREE
relationship holds, and (ii) there is an M-feature.
                                                
79 I say "not always" because accounts typically varied as to whether conditions were built in to rule-
applications or stated in terms of output filters (e.g., we can take movement to be restricted to occurring between
elements in a c-command relation, or we can take movement to be "free" with conditions on CHAINS, or the like).
 
 What governs the distribution of M-features is something of a mystery. This is,
perhaps, appropriate, in the following sense. It amounts to the assertion that the existence
of displacement phenomena in human language is an unsolved open set of problems,
perhaps an "imperfection" in Chomsky's (1995, 1998, 1999) sense. Why is there
displacement? What is the content of the overt versus covert movement distinction? Were
we fooling ourselves in GB-era architectures or early minimalist approaches by thinking
that constraints on levels of representation (in GB) or checking of core licensing
properties (e.g., "morphology driven movement") in the MP were really offering us an
explanation for the relevant phenomena? The early implementations of the Last Resort
logic needed to appeal to "strong" versus "weak" flavors of core licensing properties;
doesn't this distinction do what M-features are doing now (i.e., code the overt-covert
distinction)?
 Are we to expect a development of the concept of M-features? Should we expect
to find some independent motivation for their existence? It could be, but at the moment
our understanding of M-features does not appear to really extend beyond this
inter-motivation: where there is movement there is an M-feature, and vice versa. All we
really know is that they are not features of the sort that we used to take as the properties
driving movement (e.g., wh/Q, Case, φ, etc.).
It has been suggested for the case of successive cyclic wh-movement that the
relevant M-features may be of a quasi-interrogative nature, but it is quite unclear why
these properties should, for any local syntactic reason, become associated with the outer
edges of embedded declaratives to drive the relevant intermediate movements, as in our
example above.80 Alternatively, it might be suggested that these features come along for
free whenever there "will be" the relevant core licensing properties present in some later
derivational stage. But this means that the presence/absence of these properties requires
reference to arbitrarily large stretches of syntactic computation to establish their local
legitimacy. Current worries in the minimalist literature about whether this or that view of
derivations requires "look-ahead" or not ought to be focused now on the existence of
M-features and on how their presence is locally justified.
Other suggestions have explored the idea that such M-features are focus related,
but this appears to run into similar problems regarding how to motivate them for exactly
where they are needed to drive successive movement and not elsewhere (see Felser
2004). And, in general, these features are required to be present at the edges of otherwise
rather different kinds of categorial domains (at least C and v, and perhaps D as well) in
order to drive the relevant movements "out of" the domains Chomsky calls phases. The
question of why these properties should be associated with exactly the domains that are
otherwise suggested to be phases has no principled answer that I am aware of. Why
should M-features line up with phase-edges?
80 See Lasnik & Uriagereka (forthcoming) for some discussion of this point and the notion of "lookahead" in
adding what I'm calling M-features.

For A-movement, the postulation of M-features takes the form of what amounts to
a return to "square one" regarding the Extended Projection Principle (EPP) introduced in
Chomsky (1982) (see Bošković 2003 for some discussion of this point), essentially
stipulating the need for filled specifiers for the relevant intermediate/non-target head
elements, and thus artificially creating positions that then cannot be passed over without
violating minimality of movement restrictions.81
 
The salient alternative approach to the locality of such movements is to introduce
an analogously artificial restriction on movement, stipulating that A-movement must go
from INFL to INFL (T-to-T) and A'-movement must go from COMP-to-COMP (C-to-C).
I think that this is correct as a description, as will become clear in our development of
TCG below. But the important question that needs answering is, "why?". Stipulating
T-to-T or C-to-C movement appears to be Bošković's (2002) conclusion regarding what I'm
here calling M-feature-driven movement, namely: M-features do not exist, but movement
must go through these local positions anyway (see Grohmann 2003 for an extension of
this stipulation). Why? Because that's the way movement works: it's local. But this is
not, to my mind, solving the problem.
To be fair, Bošković does address half of the problem and this is important (see
our analyses of raising and the "EPP" in Chapter 3). What he shows is that where M-
features seem to be redundant with core licensing properties, we can do just fine handling
facts without the postulation of M-features. However, this still leaves us with the
intermediate movement cases as motivations for M-features, at least if we stay within the
general borders of approaches consistent with Last Resort.
 To sum up, I take current theory to be in the following bind: 
 
 
 
                                                
81 Frank (2002) has suggested a sort of "doubling" of EPP-feature types, so that there are both A-movement-
relevant EPP-features and A'-movement (wh-)EPP-features, that is, two types of M-features (he also posits a
doubling of Case properties, one relevant for A-movement and one for A'-movement: his so-called wh-Case).
 
(161) a. Empirically, intermediate (successive cyclic) movements seem to be real

b. If LAST RESORT governs syntactic operations without exception, there
must be some feature(s) involved in these intermediate movements,..

..BUT,..

c. ..the relevant motivating properties can't be in the set of core licensing
properties, since these (e.g. Case, WH) are typically satisfiable only once

d. So, either:

(i) M-features exist, and are what is responsible for at least
intermediate movements,

OR,

(ii) M-features do not exist, and something else is responsible for
intermediate movements (so either Last Resort is not completely
general, or something overrides it, or it can somehow be restated
so as to provide non-feature-driven movement)
 
Chomsky's recent move in the face of this bind is (161)d(i). However, Chomsky
(1999:33) notes that what I am calling M-features are simply "selectional" features
(linguistic properties) that are "uninterpreted" and which moreover constitute "an
apparent imperfection, which we hope to show is not real by appeal to design
specifications". This is the general route the present work aims to take: to show that M-
features as such are dispensable, though the general property of the syntactic component
they have been (spuriously) introduced to describe is certainly real.
 In my view, the earlier stages of the MP were closer to being on the right track in
the following sense: movement is about what I called Core Licensing Properties (CLPs).
Where things have gone astray is in the specific kinds of efforts made to render the
intermediate/non-target movements in evidence in cyclicity phenomena consistent with
Last Resort. The central direction of early versions of minimalist inquiry regarding CLPs
strikes me as being essentially correct and I think that this is the direction inquiry should
continue to go in the characterization of local relations and dependencies. The missing
piece then, with such a story about local relationships in place, is a way to understand
how to truly reduce the non-local cases to the local ones.
2.2.2. Cyclicity without M-features? 
The larger context of the minimalist program includes the general idea that syntactic
operations are governed by Last Resort. Technically this manifests as the idea that
movement must effect some (local) licensing of properties that would be otherwise
ill-formed (i.e., without the movement operation to create a local context for their
satisfaction/licensing). A local instance of wh-movement happens in order to
check/license the wh-properties specifying the interrogative properties of the matrix
clause (as in Who did Bill hit?). But the situation involving embedding, illustrated above,
requires that the intermediate C-element not be of the sort for which WH/Q-properties are
present. So something else must drive these intermediate movements.
Work in the MP has taken one of two general approaches: either (i) some
inventory of special features is introduced to drive the intermediate/non-target
movements (i.e., M-features as discussed above) or (ii) some attempt is made to derive
the effects of local movements without supposing that there is some special property
there in the structure. I turn now to (ii).
This latter ((ii)-type) strategy has manifested in a number of different ways. One
route of thinking along these lines simply denies SCM and proposes that non-local movement
is "direct" (one-fell-swoop), and thus attempts to account for local phenomena exhibited
along the movement path by other means. Zwart (1993), for example, supposes that in
long-distance wh-movement some dummy items are first inserted in all the intermediate
CP-related positions, and that the wh-element itself then moves directly to the target
position (thus obeying his view of movement economy: "Fewest Steps").
However, this does not appear to be much of an advance over the M-feature view,
but rather a different conception. The problem under this approach is then: what drives
the insertion of intermediate elements that then allow the moving wh-element to pass
over those positions? Radford (2001) argues for a convergence-based understanding of
phases which is wedded to a one-fell-swoop view of movement as well (see Felser 2004:548
for some critical discussion). Epstein & Seely (1999), as mentioned briefly above, argue
on the basis of technical issues regarding chain definitions that movement must be
similarly a one-step process.
It would take me too far afield to consider all the alternative possibilities involved
in contrasting stories positing SCM with those of the one-fell-swoop variety. I will
therefore narrow my focus in what follows to examining approaches that in one way or
another have appealed to linked-local movements but which have attempted to dispense
with M-features. However, it is worth pointing out that the general class of one-fell-swoop
approaches requires something like Zwart's solution, as discussed above. Somehow the locality of
SCM-type effects needs to be accounted for, and it's unclear how one-fell-swoop stories
can do this without evoking strongly non-local principles. For example, we noted above
regarding Chung's (1983, 1994) facts from Chamorro that the relevant agreements that
manifest are tied to each and every relevant local domain on the movement path. It's
unclear how to do this without postulating linked local relations of some kind. And note
as well that completely disallowing linked-local movements requires a special view for
understanding the binding-related connectivity effects discussed above (???). Moreover,
as we will see in Chapter 3, there appear to be situations in which it looks like SCM is
maybe not a necessary feature of the relationship (i.e., situations where the local effects
do not show up uniformly). We saw some instances of this in the form of "dialect
variability" in Irish English Q-stranding vs. standard English, variability in the possibility
of S-V inversion effects in Spanish, variation in matrix/embedded SAI across different
dialects in English, and so on. I will suggest in the next chapter a general schema for
analysis which relies on the possibility of SCM enforced in the series of workspaces
defining the derivation versus a rather different sort of relationship (unselective binding
of the sort discussed in Pesetsky 1987 to handle so-called D-linking) that may hold over
output structures. If the general idea is on track, then we will actually need both SCM
and some other sort of potentially non-local (or less-local) kind of relationship in order to
account for the presence/absence of the sorts of effects canvassed above. So, with an
exception or two in the discussion below, I will from here on drop discussion of
one-fell-swoop accounts.82
 
82 This is not to say that no such account is possible, just that there are reasons for suspicion of one-fell-
swoop which do not arise for the SCM view. I take the general research agenda of attempting to reduce all apparently
non-local phenomena to local domains seriously, which is why the interest in SCM-type accounts. I refer the reader to
the references cited in the text for discussions of one-fell-swoop views.

Another instance of the (ii)-type approach (attempting to live without M-features)
comes in the form of accounts claiming that movement is supercyclic. Boeckx (2001),
following ideas introduced in Takahashi (1994), suggests that intermediate movements
are motivated by a direct condition on the "distance" a given movement relation can span,
such that movements must involve the edge of every intervening phrase between the base
and final/target landing site positions. This view predicts, in Abels' (2003) terminology
mentioned above, uniform effects on the movement path, and so clearly would require
further assumptions to capture the various phenomena that appear more selective in their
manifestation (e.g., wh-copying, the binding/connectivity facts).
Heck & Müller (2000) propose that Last Resort is a violable condition, and posit a
system of violable constraint interaction that conspires to drive local movements without
direct feature-checking involved at the non-target sites (see their notion of "Phase
Balance"). However, as pointed out by Felser (2004), their approach relies on undesirable
look-ahead logic (i.e., cast in the framework of Optimality Theory, their approach
requires reference not just to phase-sized numerations but rather to arbitrarily large ones).
Castillo & Uriagereka (2000) offer an approach that is interesting for the present
investigation in virtue of the relation of their proposals to the TAG-theoretic sort of
analyses discussed earlier. I discuss the two together here, which allows us to make some
points about the "movements" assumed in TAG-analysis within elementary trees and
their need for something like EPP-requirements.
Castillo & Uriagereka (C&U) suggest there are initial movements that are local,
and which involve the ultimate target position housing core licensing properties. Like
TAG approaches, they suggest that such an initially formed dependency can then be
subsequently "stretched" by adding intervening elements.
However, two aspects of their approach are unlike analyses offered in the TAG 
framework. First, C&U suggest that individual merge operations do the job of 
"stretching" the initial local dependency by adding intervening elements one at a time, as 
opposed to the en bloc insertion (adjoining/splicing-in) of auxiliary structures as in TAG. 
C&U take this one-at-a-time addition of intervening elements to be a merge-based 
generalization of the so-called "tucking-in" that Richards (1997) argued to be relevant for 
movement operations (i.e., allowing non-root mergers, which is what C&U need to effect 
the incremental stretching of local dependencies with step-wise additions of single 
elements). 
A second difference between C&U's approach and standard TAG analyses is that 
they take the local movement relationship to be one that connects the moving item with 
its actual "target" landing site. It's worth taking a moment to show why this isn't how 
things are typically handled in TAG-theoretic elementary trees. We can illustrate this 
with what corresponds to cyclic A-movement in TAG, though the point carries over 
directly to A'-movement cases, as we will see. Consider: 
 
(162) a. John seems to tend to appear to like carrots 
 
The TAG derivation for this example might be (but, as we will see, typically is not) 
understood to evoke the following elementary structure and the 
corresponding auxiliary trees: 
 
(163) ELEMENTARY TREE: 
      [CP C0 [TP [DP John] [T' T0 [vP t_DP [v' v0 [VP [V0 like] [DP carrots]]]]]]] 
 
      AUXILIARY TREES: 
      [T' T0 [VP [V0 appear] T']] 
      [T' T0 [VP [V0 tend] T']] 
      [T' T0 [VP [V0 seem] T']] 
 
Assume that T in the elementary structure above is finite (i.e., that this structure 
converges as an independently well-formed clause). This corresponds to Castillo & 
Uriagereka's idea for the wh-movement case above. 
This particular sort of division of basic tree structures turns out to be problematic in 
TAG, but stepping through this derivation will allow us to see more clearly, by 
comparison, how at least one version of current TAG analysis of such phenomena is taken 
to work. As discussed in the last chapter, auxiliary structures have matching root 
and foot nodes, which are T' for raising predicates (the auxiliary trees above). This 
reference to X'-level categories that are not dominated by a corresponding maximal node 
(XP) may however be inessential to the account.
83
 
                                                
83
 This is an important issue connecting with the position one takes regarding primitive "bar-level" 
distinctions. In a BPS-style account, for example, it is unclear how one would motivate this particular property of root 
and foot nodes in auxiliary tree structures without abandoning the central idea of relativistic phrase-level status. Frank 
(2002:?n?) suggests that it may be a straightforward task to render this approach consistent with a purely relational 
conception of bar-level distinctions. But this is not clear to me. He also correctly notes that having primitive C'-nodes is 
not at all inconsistent with Muysken's (1982) approach, which worked with feature sets (e.g., ±max, ±project). 
Muysken's initial view is relational in that projection is understood to be a coherency condition on the values and 
structural distribution of such features in domination sequences (e.g., an X{+max,-proj} cannot coherently enter into an 
immediate dominance relation with another X{+max,-proj} node, etc.). Thus a C'-node with no dominating CP would 
simply be a {-max, +proj} element. It does seem unreasonable to think that verbs might differ, in this feature-based 
view, as to whether they select a C' or a CP conceived in Muysken's original feature-based terms. However, Chomsky's 
 
In any event, these root and foot nodes can be understood to match/correspond to the 
T'-node in the elementary tree in (163). The movement of the nominal element 
(John) is understood to take place from the base/thematic position (here: specifier of v) to 
the derived matrix position directly, consistent with the general idea in play within TAG 
approaches that all dependencies should be localized to basic tree structures that form the 
input to TAG adjoining and substitution. In this case, the relevant operation is adjoining, 
which works as follows for the combination of the elementary structure and the raising 
auxiliary headed by appear: 
 
(164) ELEMENTARY TREE: 
      [CP C0 [TP [DP John] [T' T0 [vP t_DP [v' v0 [VP [V0 like] [DP carrots]]]]]]] 
 
      AUXILIARY TREES: 
      [T' T0 [VP [V0 appear] T']] 
      [T' T0 [VP [V0 tend] T']] 
      [T' T0 [VP [V0 seem] T']] 
 
conception in the BPS approach is different in this regard. On that view projection level is purely relational: there 
can be no C' without there being a CP. 
 
(165) ELEMENTARY TREE (after adjoining of the appear auxiliary at T'): 
      [CP C0 [TP [DP John] [T' T0 [VP [V0 appear] [T' T0 [vP t_DP [v' v0 [VP [V0 like] [DP carrots]]]]]]]]] 
 
      AUXILIARY TREES: 
      [T' T0 [VP [V0 appear] T']] 
      [T' T0 [VP [V0 tend] T']] 
      [T' T0 [VP [V0 seem] T']] 
 
Adjoining generally effects a kind of splitting of an atomic node (here: T') in the 
elementary tree, splicing the auxiliary structure into its position as depicted above. The 
same operation could then be repeated to adjoin (splice in) the other two raising 
predicates. Note that the addition of the auxiliary structures serves to "stretch" the 
movement dependency created in the initial elementary tree; again, there are no 
intermediate traces of movement in the TAG architecture.
84
 
This specific derivation has to my knowledge never been advocated for RtS 
phenomena within TAG, and it's easy to see why. The element which constitutes the 
matrix T-head (T0) in the elementary tree is no longer the matrix instance of this category 
                                                
84
 The absence of intermediate traces is touted as a virtue in these approaches, though I actually think this 
raises some puzzles for TAG analysis; see §1.5.1, (94)-(97) above, and §3.2 & §3.3 below. However, see Frank & 
Kroch (1995) for some relevant discussion relying on an alternative way to think of some of the relevant connectivity 
facts. 
 
in the output. Rather, it is necessarily stranded in the lower clause, with the T0-element in 
the auxiliary tree becoming the highest instance of this head. 
If finiteness is encoded on these T-elements directly (i.e., if T comes as either 
{+fin} or {-fin}, which is not a trivial assumption), then this sort of derivation strands the 
T{+fin} element in the lower clause. Also, if we are taking the agreement and Case 
properties of the T-element in the elementary tree to enter into some kind of 
checking/licensing relationship with the "raised" nominal, then similarly the relevant 
agreement properties should be stranded in the downstairs clause as auxiliary structures 
are adjoined. 
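The stranding problem can be made concrete with a minimal Python sketch of adjoining. It assumes a toy encoding of the trees in (163): the `Node` class, the `origin` tag on T0 heads, and all helper names are hypothetical conveniences for illustration, not part of any TAG formalization cited here. After the appear auxiliary is spliced in at T', the highest T0 is the one contributed by the auxiliary tree, and the T0 of the elementary tree sits in the lower clause.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    label: str                                    # category label, e.g. "T'" or "T0"
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None                    # terminal string, if any
    origin: str = ""                              # hypothetical tag: source tree of a T0 head
    foot: bool = False                            # True only on the foot node of an auxiliary tree


def preorder(node: Node) -> List[Node]:
    """List the nodes top-down, left-to-right."""
    out = [node]
    for child in node.children:
        out.extend(preorder(child))
    return out


def find(node: Node, pred: Callable[[Node], bool]) -> Optional[Node]:
    """Return the first node in pre-order that satisfies pred, or None."""
    return next((n for n in preorder(node) if pred(n)), None)


def adjoin(target: Node, aux_root: Node) -> None:
    """Split target: the auxiliary root takes its place, its old subtree hangs from the foot."""
    foot = find(aux_root, lambda n: n.foot)
    if foot is None or not (target.label == aux_root.label == foot.label):
        raise ValueError("adjoining needs matching target, root and foot labels")
    foot.children, foot.foot = target.children, False
    target.children = aux_root.children


def elementary_tree() -> Node:
    """Toy version of the elementary tree in (163), with a finite T0."""
    vp = Node("VP", [Node("V0", word="like"), Node("DP", word="carrots")])
    v_p = Node("vP", [Node("t_DP"), Node("v'", [Node("v0"), vp])])
    t_bar = Node("T'", [Node("T0", origin="elementary"), v_p])
    return Node("CP", [Node("C0"), Node("TP", [Node("DP", word="John"), t_bar])])


def raising_auxiliary(verb: str) -> Node:
    """Toy raising auxiliary with a T' root and a T' foot."""
    foot = Node("T'", foot=True)
    return Node("T'", [Node("T0", origin="auxiliary"), Node("VP", [Node("V0", word=verb), foot])])


tree = elementary_tree()
adjoin(find(tree, lambda n: n.label == "T'"), raising_auxiliary("appear"))

t_heads = [n.origin for n in preorder(tree) if n.label == "T0"]   # highest T0 first
words = [n.word for n in preorder(tree) if n.word]
print(t_heads)
print(words)
```

In this toy run the list of T0 heads begins with the auxiliary's T0, so any finiteness or agreement properties placed on the elementary tree's T0 would end up in the lower clause, which is the difficulty just described.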
For reasons of these sorts, researchers working within the TAG framework have 
developed rather different analyses than the one sketched above. For concreteness I 
concentrate here on the proposals of Frank (2002). Frank's approach to Raising-to-
Subject (RtS) assumes a rather different inventory of structures constituting the input to 
the TAG derivation: 
 
(166) ELEMENTARY TREE: 
      [TP [DP John] [T' [T0 to] [vP t_DP [v' v0 [VP [V0 like] [DP carrots]]]]]] 
 
      AUXILIARY TREES: 
      [T' T0 [VP [V0 appear] T']] 
      [T' T0 [VP [V0 tend] T']] 
      [T' T0 [VP [V0 seems] T']] 
 
 
The important difference is in the nature of the elementary tree with respect to finiteness, 
and the nature of one of the auxiliary trees (for this example, the auxiliary structure 
headed by seems). The elements in (166) which differ from the problematic 
derivation given in (165) above are the T0 of the elementary tree (to) and the verb heading 
that auxiliary tree (seems). Frank assumes the auxiliary tree that will 
contribute the matrix verb is itself marked as {+finite}, while the top of the elementary tree 
containing the local NP-movement has the properties that it manifests in the output, 
namely that it is {-finite}. This property is thus assumed to be a property of the relevant 
T-nodes. 
 Setting aside for now how the adjoining operations are constrained so that the one 
finite raising auxiliary ends up "at the top", we can see now that the trouble that arises in 
the derivation given above in (164)/(165) doesn't arise for (166), since the elementary tree 
is itself taken to be non-finite (as it appears in the target derived structure). The 
finite/matrix structure is introduced by an auxiliary structure. This requires that there be 
some force which is responsible for the initial A-movement within the elementary 
structure. Frank (2002) formulates a version of the EPP to get this result (the view is 
interesting but requires too much discussion to get into here; see Frank 2002). The 
point here is simply that TAG requires something other than Core Licensing Properties to 
drive the initial movement. 
 Now observe the somewhat subtler though similar situation that arises in wh-
movement. A TAG derivation for the wh-movement case above would go as follows. We 
assume the elementary and auxiliary trees underlying the TAG derivation of (167)a as in 
(167)b, and the first of two adjoining operations works as pictured in (167)c: 
 
 
 
(167) a. What did Dave think Mary believed John liked? 
 
b. ELEMENTARY TREE: 
   [CP [DP what] [C' C0 [TP [DP John] [T' T0 [VP [V0 like] t_what]]]]] 
 
   AUXILIARY TREES: 
   [C' C0 [TP [DP Mary] [T' T0 [VP [V0 believe] C']]]] 
   [C' C0 [TP [DP Dave] [T' T0 [VP [V0 think] C']]]] 
 
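c. ELEMENTARY TREE (after adjoining of the believe auxiliary at C'): 
   [CP [DP what] [C' C0 [TP [DP Mary] [T' T0 [VP [V0 believe] [C' C0 [TP [DP John] [T' T0 [VP [V0 like] t_what]]]]]]]]] 
 
   AUXILIARY TREES: 
   [C' C0 [TP [DP Mary] [T' T0 [VP [V0 believe] C']]]] 
   [C' C0 [TP [DP Dave] [T' T0 [VP [V0 think] C']]]] 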
 
Following the adjoining step pictured in (167)c, the second auxiliary tree would be 
similarly spliced in (adjoined), yielding the full output structure (or, again, perhaps the 
two auxiliaries are first combined and then adjoined as one). As adjoined material is spliced 
in in this manner, we see again that the movement dependency formed over the input 
elementary tree (in (167)b) is systematically "stretched". The addition of the second 
auxiliary tree would further stretch this elementary-tree-defined relationship, yielding in 
the output the superficial appearance of a "long-distance" syntactic dependency.
85
 
 The present point however is that the C-head that on MP (and other) views would 
be taken to house the wh-properties driving the movement to the ultimate target position 
is the C-head in the elementary structure above. But it is not this head which ends up as 
the "matrix" C. That head is provided by an auxiliary tree. So, as with the raising derivation 
given above, there must be some other property in the elementary tree which motivates 
the initial (and only) "movement". TAG derivations of this sort thus need something like 
the M-feature (EPP-type) requirements. 
These matters need not distract us any further. Three points emerge from this 
digression about TAG versus the C&U proposal which do concern us, however. 
First: the view which Castillo & Uriagereka adopt involves an initial CLP-driven 
movement operation, and this is not how the comparable TAG derivations are usually 
taken to work. Rather, conventions on the adjoining operation in TAG are evoked such 
that wh-licensing is built into the "splitting" of nodes that accompanies the splicing-in of 
auxiliary structure. Issues arise in this TAG view however concerning (e.g.) do-support 
and auxiliary inversion generally, which I will not get into here (see Frank 2002, Rogers 
200? for some discussion and some solutions within the TAG approach). 
Second: our sketch of how the TAG derivations do work has brought to the 
surface the fact that the elementary-tree-internal movements in NP-raising and wh-
movement need to be driven by an EPP-type requirement. In later discussion (Chapter 4) 
                                                
85
 I'm assuming here that the auxiliary structures are spliced in one at a time. Alternatively they could be 
joined together first and then spliced in in a single instance of adjoining. See Frank (2002) for discussion. 
 
we will see that it may be possible to import some of the assumptions about structure and 
category (of the Brody-type discussed at the end of Chapter 1) into the TAG framework 
to do away with EPP-type requirements on the movements within elementary trees. 
Sketching this possibility will be important to our Chapter 4 discussion of the differences 
between the TCG approach developed here and TAG. So C&U's idea of an initial CLP-
motivated movement may turn out to be of interest in discussions of EPP-type 
requirements across the TAG and MP-type frameworks. 
Third: C&U's discussion turns out to be of interest in the context of the TCG 
approach developed here for another reason. In particular, they make use of a notion of 
distinctness in discussing wh-islands that is helpful for us to consider. Take the wh-island 
violation in (168): 
 
(168) ?What did John wonder whether Mary thought Dave liked _ ? 
 
On C&U's view this begins with the following local movement: 
 
(169) [CP what [C' C0-[+WH] [TP Dave liked t_WH ]]] 
 
They then suggest that items are added in between a moved wh-element and the 
structure containing its base position. In wh-island cases, as C&U point out, the 
derivation will inevitably reach a point where a "like" C-element will have to be 
introduced and projected below the initially moved wh-element and its associated CP 
structure. Such situations, they suggest, result in pathology (the derivation crashes or 
perhaps cancels at this point, or is perhaps problematic at the interfaces for this reason). 
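A toy rendering of this idea in Python may help fix intuitions. It assumes, hypothetically, that "like" means identical category and [WH] value, and that the intervening heads can be listed in some fixed order of addition; neither the encoding nor the function name comes from C&U, and the order chosen here is arbitrary.

```python
from typing import List, Optional, Tuple

Head = Tuple[str, str]   # (category, WH value), e.g. ("C", "+WH")


def first_like_head(host: Head, inserted: List[Head]) -> Optional[int]:
    """Return the 1-based step at which a head 'like' the host is added, or None."""
    for step, head in enumerate(inserted, start=1):
        if head == host:
            return step
    return None


host = ("C", "+WH")   # the C housing the moved wh-element, as in (169)

# heads added below the host for (167)a: two declarative clauses
long_distance = [("T", ""), ("V", ""), ("C", "-WH"), ("T", ""), ("V", ""), ("C", "-WH")]
# heads added below the host for (168): the C of the whether-clause is itself [+WH]
wh_island = [("T", ""), ("V", ""), ("C", "+WH"), ("T", ""), ("V", ""), ("C", "-WH")]

print(first_like_head(host, long_distance))
print(first_like_head(host, wh_island))
```

On these assumptions the long-distance case never introduces a head like the host, while the wh-island case does so as soon as the [+WH] C of the embedded question is added.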
 
I have already mentioned the role that distinctness will play in enabling SCM in 
the present account. Later on I will consider a story similar in spirit to C&U's as an 
account of wh-island and other related phenomena. 
Consider now another reasonable line of thinking which begins with an 
observation about one way we might simply deny SCM and implement the "one-fell-
swoop" logic. We might be tempted to say that intermediate movements are impossible 
for exactly the reason that they cannot be final landing sites for elements (as noted above; 
see (155)-(159)). Locality of movement would then be understood in terms of closest 
possible landing site. 
This would be in place of the somewhat odd counterfactual view of "closest" 
which is often appealed to, under which the "closest" landing site is simply one that 
belongs to the right more general super-category of positions which could have otherwise 
hosted the licensing features to make the position an actually licit target/final landing site. 
That is, non-finite T is of the type that could have, for example, hosted Case and φ-
properties, if it were only of the right sub-type of the general type T; embedded 
declarative C is of the general type that could have hosted wh-properties, and so on. 
But this general outlook about categories being of a more general type which 
could have housed the relevant core licensing properties raises another possibility. We 
could maintain the idea that intermediate positions do not properly license elements that 
move to them, but still somehow make use of the observation that such intermediate 
landing sites are in fact "of the general type" that could otherwise serve as target sites, if 
only they housed the relevant features specifying the appropriate sub-type. The idea 
would be to separate out landing sites for movement from the motivating forces of 
movement. 
We thus might alternatively pursue a kind of "false-advertising" view that could 
support the SCM view after all. Suppose subordinate elements can see up the c-command 
path and can detect the general (super-)category of an element as a possible suitable 
target of movement, and this possibility is enough to locally license movement to that 
element's domain. The fact that this would turn out to not license the position as an 
"ultimate" landing site simply presents the situation which we want to obtain, namely that 
the element is then in a higher position from which it may seek a target in the next higher 
domain, and it is still "active" in terms of having its properties not yet satisfied 
(licensed/checked). 
Note that this perspective would make movement contingent on the needs of the 
moving element, thus falling under some version of GREED, in contrast to approaches 
which either partition out the responsibility for movement triggering between the 
source/target positions and the moved element, or else make the driving force of 
movement the upper element's responsibility (e.g., the so-called ATTRACT-based view). A 
fully attract-based approach couldn't support the false-advertising view of cyclicity, 
obviously, since the idea is that the relevant properties that could actually license 
movement are "not there". And even if we do not develop an attract-based view, the basic 
idea seems pretty stipulative. Why should only the category, and not its features (or their 
absence), be detectable in the local context? Maybe however this view could serve 
as a landing site theory, with some other motivation given to drive the movement 
operation itself. 
 
There is another way of looking at things that relates to remarks made earlier in 
this discussion (specifically when introducing Chomsky's DbP view in the context of our 
WS/O-distinction; see §1.4). That is, we might maintain that it is the local ill-formedness 
of a to-be-displaced item in the context of its base position that is sufficient to trigger 
movement. Thus, local contexts with elements in their base position (with no features 
checked) could be understood to somehow expel their uninterpretable contents. 
SCM would then be driven or not depending on how the system assesses the well-
formedness of sub-stretches of derivation: localizing inspection of the derivation for 
convergence would establish sub-domains which would have to have certain elements 
displace in order to be well-formed. This localized non-feature-driven view of movement 
as LAST RESORT might also be coupled with the "false-advertising" view suggested above 
(so there still would perhaps have to be a category of the right general type to house the 
expelled element). 
In particular, having some notion of localized convergence evaluation (output 
"size" restrictions) seems promising, since we have a theory involving the idea that an 
item with uninterpretable properties might crash a derivation if these properties are not 
checked/licensed. This can perhaps be exploited to drive local movements. So, one 
interesting possibility for SCM is to consider shrinking the domains over which we 
evaluate well-formedness, so that an element could in principle violate Last Resort in 
order to avoid rendering its local environment ill-formed. And this is what the MSO-type 
vision of Chomsky's DbP does (even if the specific categories he has in mind as phase-
inducing aren't quite right (see below), the general idea has the right form). We can 
regard this intuitively as fitting with the view of uninterpretable features 
as "viral"; that is, local contexts are forced to expel the "sick" elements in order to not 
remain "infected".
86
 
Also, we have the idea that expelled items might have to move to a category of 
the right general type, which again we take to mean a category which is of the type that 
could in principle house the right sort of core licensing properties. Expelled items, 
needing to license their properties, go to the closest place where it looks like this could 
happen. That this does not actually happen with intermediate movements simply means 
that the relevant moving element remains a source of "infection" for the next highest 
domain, and must be expelled again to the next super-ordinate domain, and so on. 
(Presto! SCM!). Lasnik & Uriagereka (forthcoming) propose a view of this sort which 
relies on the idea of "expelling" elements "up the tree" as each locally evaluated structure 
is forced to come to terms, as it were, with a contained element housing viral properties. 
Let us sum up these two ideas as follows: 
 
(170) EXPULSION OF VIRAL ELEMENTS: elements with uninterpretable features must 
move when necessary; this means either (i) to check a feature, or (ii) to avoid 
crashing a derivation when a local structure is evaluated
87
 
 
(171) MOVING INTO CATEGORIALLY LIKELY CHECKING CONFIGURATION: elements know 
where to look to get their uninterpretable features checked/satisfied (but the 
features aren't always there) 
 
What drives SCM then is the presupposition that there are sequences of categories 
between a base and a target position for a "to-be-moved" element, and that each category 
in this sequence just happens to have the following property: it is an element with a 
                                                
86
 On the notion of features as "viral", see Uriagereka (1998) and a discussion in Lasnik (1999b). 
87
 This is more-or-less a vision of movement economy enshrined in Lasnik's (1999a) notion of Enlightened 
Self Interest (versus standard Greed). 
 
subset of the properties of the target landing site. I will take this to be central in the 
analyses of the next chapter. 
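The subset property can be stated as a small Python check. This is only a sketch: the property sets and position names below are hypothetical stand-ins, assuming a roughly C-T-v-V sequence of heads in each clause and a [+wh] matrix C as the target.

```python
from typing import FrozenSet, List, Tuple

Position = Tuple[str, FrozenSet[str]]


def likely_landing_sites(path: List[Position], target: FrozenSet[str]) -> List[str]:
    """Return the positions whose properties are a subset of the target's."""
    return [name for name, props in path if props <= target]


# hypothetical property sets; the path runs from the base position upward
target_c = frozenset({"C", "wh"})
path = [
    ("embedded v", frozenset({"v"})),
    ("embedded T", frozenset({"T"})),
    ("embedded C", frozenset({"C"})),    # declarative C: right category, no wh-property
    ("matrix v", frozenset({"v"})),
    ("matrix T", frozenset({"T"})),
    ("matrix C", target_c),
]

sites = likely_landing_sites(path, target_c)
print(sites)
```

With these assumed property sets only the two C positions qualify as landing sites; the v and T positions do not.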
2.3. Chapter Summary & Further Critical Remarks  
We have encountered some potentially interesting preliminary ideas that might support 
SCM without the postulation of intermediate M-features. Recall from above we identified 
the following two general issues (borrowing from the formulation in Felser 2004) 
pertaining to MSO-type phases of derivation and SCM: 
 
(172) Assuming SCM exists, ... 
 
a. What motivates it? Properties of the moving element? Properties of the 
intermediate target? Both? Neither (i.e., something else)? (TRIGGERING?) 
 
b. What are the mechanics of movement like such that unlicensed features 
{*F} do not remain to cause convergence problems in spelled-out domains 
(either on the copy, or on the remerge view)? (CONVERGENCE?) 
 
The discussion in the previous section has focused on some alternative accounts 
addressing the TRIGGERING problem, and we have so far not said much about the 
CONVERGENCE issue as stated above, though as we just discussed at least one type of 
solution to the TRIGGERING issues relies on assumptions regarding local convergence 
evaluation. 
First, note that our canvassing of some of the empirical terrain evoked in 
discussions of SCM raised the following additional problem, related to both (172)a and 
(172)b; call this the DISTRIBUTION problem: 
 
(172) c. What is the range of potential non-target landing sites for SCM? 
(DISTRIBUTION?) 
 
 
This dovetails with the distinctions drawn from Abels (2003) regarding uniform 
presence/absence of effects along the movement paths versus the possibility of 
"punctuated" effects. Note that on a purely direct-feature-driven view of movement (e.g., 
of the sort postulating M-features), the triggering and distribution problems are the same. 
But this is not necessarily so on other views, as we will see in a moment. Specifically, we 
encountered the following ideas with respect to TRIGGERING: 
 
(173) a. M-features exist, and these motivate intermediate movements (consistent 
with Last Resort) 
 
b. Intermediate movements occur (consistent with a version of Last Resort) 
in order to avoid crashing the derivation when local sub-structures are 
evaluated 
 
Chomsky's (1999) position seems to couple (173)a and (173)b. But if (173)b can be 
shown to suffice, then one might think that we can dispense with M-features (a desirable 
outcome, as noted, on both Chomsky's view
88
 and following the general argumentation 
above in §2.2.1). 
 But it's not clear that this localized vision of last resort actually does suffice. That 
is, on a phase-based view, the DISTRIBUTION question is supposed to get an answer in 
terms of whatever we motivate as the relevant phase-inducing heads, coupled with 
localized convergence evaluation. If non-feature-driven movement can happen on the 
view of Last Resort sketched above (to avoid a local crash), then we might expect these 
movements to be tied to the "edges" (external domain) of phase-inducers. 
                                                
88
 As noted above, Chomsky regards these properties as an apparent imperfection that we hope on minimalist 
grounds to show is "not real". 
 
However, I will now argue that even on these views we may still need something 
like M-features (or, as I suggest below, a different view of the categories present in 
verbal extended projection series). 
Recall that on Chomsky's view the complement domain of a phase-inducing head 
spells out when the next such phase-inducer is introduced. But on the assumption that C 
and v are the phase-inducers, this means that the complement domain of, for example, v, 
will not spell out until the next phase-head (C). But then why should there be movement 
to the edge of vP, and not any other position between the phase-heads? 
If we can show that movement is to the edge of vP, and not higher, this suggests that 
the way to have Last Resort consistent movements would be to go with Chomsky's M-
feature view. And if this is correct we also need to have movement target positions below 
the root. This may in fact be independently required, but the present point is that on 
Chomsky's view of phases the localized convergence motivation doesn't work to drive 
elements to the edge of vP. That is, it would not be strictly the last resort to move to such 
edges, since it is only after introducing further material that the spell-out of the 
complement domain will be required. A strict last resort view would be to have the 
movement only justified at the point in derivation where the problem arises, and as we 
have just seen this is not when a phase-inducing head is introduced.
 
 
To see the point, consider the following derivation (deploying a version of the 
reduced structures from Chapter One for convenience): 
 
(174) a. D_wh 
 
      b. V - D_wh 
 
      c. v   V - D_wh 
 
 
In (174) we have a derivation with a wh-element as the object of V, up to the point where v 
(a phase-inducer) is introduced into the derivation. At this point, there is no 
immediate threat that the wh-element will be stranded in this position. So unless we 
assume that the operation driving movement has some look-ahead capability, there is no 
reason under the localized Last Resort logic that would drive movement to the edge of 
vP.
89
 The derivation continues, adding the subject-D to get the external θ-role from v 
(174)e, adding T (174)f, and moving D to T (174)g: 
 
(174) d. v - V - D_wh 
 
      e. v - V - D_wh 
         D 
 
      f. T - v - V - D_wh 
             D 
 
      g. T - v - V - D_wh 
         D   D 
 
      h. C - T - v - V - D_wh 
             D   D 
 
 
                                                
89
 One might object to this, pointing out that the 'lookahead' required would still be fairly local (only up to the 
next phase-head). But on Chomsky's view, if C and v are the only relevant phases this could be an unbounded stretch of 
structure (e.g., as he argues for successive cyclic raising environments, which are assumed not to introduce phase-
distinctions). For the SCM in raising, as mentioned, Chomsky posits EPP-features to drive the local movements, but 
if there are no phase heads in such sequences then these motivations are decoupled. The point of the argument I 
am running in the text is that there is reason to think even where we do have phase-divisions that we would need M-
features anyway.  
 
The last step pictured here is the addition of the next alleged phase-inducer (C). Do we 
need spell-out of the complement domain to happen prior to this step? Yes, if there is to 
be movement to some position below C; otherwise it's not obvious that the wh-element 
couldn't first move directly to C and have the spell-out operation apply afterwards. So 
this means that the local movement out of the complement domain of v must happen prior 
to the addition of C. The natural point in derivation to apply the Last Resort logic would 
be between steps (174)g and (174)h. 
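The timing at issue can be laid out in a short Python sketch. It assumes, as in the text, that C and v are the only phase-inducers and that the complement domain of a phase head is spelled out when the next phase head is introduced. The function name, the flat list encoding of heads (specifiers are left out), and the labels are hypothetical simplifications.

```python
from typing import List, Optional, Tuple

PHASE_HEADS = {"C", "v"}  # assumption from the text: C and v are the phase-inducers


def spell_out_steps(heads: List[str]) -> List[Tuple[int, str, List[str]]]:
    """Given heads in order of introduction (lowest first), return
    (step, triggering phase head, material spelled out at that step)."""
    log: List[Tuple[int, str, List[str]]] = []
    built: List[str] = []               # everything introduced so far, lowest first
    prev_phase: Optional[int] = None    # index in `built` of the most recent phase head
    spelled = 0                         # material below this index is already spelled out
    for step, head in enumerate(heads, start=1):
        if head in PHASE_HEADS:
            if prev_phase is not None:
                # complement domain of the previous phase head: what lies below it
                log.append((step, head, built[spelled:prev_phase]))
                spelled = prev_phase
            prev_phase = len(built)
        built.append(head)
    return log


# (174): D_wh, V, v, T, C (the subject D is left out of this flat encoding)
single_clause = spell_out_steps(["D_wh", "V", "v", "T", "C"])
# continuing with a selecting V and a further v
two_phases_up = spell_out_steps(["D_wh", "V", "v", "T", "C", "V", "v"])
print(single_clause)
print(two_phases_up)
```

Nothing is spelled out at step 3, when v itself is introduced; on these assumptions the object position is first threatened at step 5, with the introduction of C.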
 But then, where does the wh-element move? And why? It seems that moving to 
adjoin to the T-domain is just as plausible as moving to the edge of vP. (Though note 
movement to T would obey extension/root-merge, and movement to v would not; 
assuming this is not an issue, both possibilities remain.) However, the kinds of facts 
discussed in §2.1.5 strongly suggest that if there is very cyclic movement (within clauses) 
then it must involve a position below TP (see, e.g., (139)-(140)). 
 Note as well that we cannot apply the logic described above of moving to a 
"categorially likely checking configuration", since it's completely unclear that v ever hosts 
the relevant CLPs. 
 The conclusion is that if we need movement to the edge of vP we need an M-
feature as Chomsky proposes. Or, we need to configure the general outlook such that 
derivations "know" that there is an upcoming second phase-inducing head. Chomsky has 
suggested that derivations may begin from limited inventories of elements (sub-arrays) 
which are understood to be selected from the lexicon in such a way that they can 
contain only one phase-head. This could potentially solve the problem if we can 
somehow restrict sub-arrays to only containing information that will end up below phase-
inducing heads, that is, material constituting the complement domain. Then we might 
motivate movement to the phase-edge of v (in the example above) in terms of a condition 
keyed to the introduction of a second sub-array containing another phase-inducing 
element. This would be conceptually similar to Uriagereka's idea of having "derived 
terminals", essentially treating phase-sized arrays as atomic elements with respect to 
later stages of derivation. It would also be quite close then to a "mini-TAG" view, which 
posited phase-sized elementary tree domains. 
 But it's unclear that this isn't simply a statement of the problem. Why couldn't the 
initial phase include everything up to, but not including, the C-element? At issue here is 
how we partition structures into phases/spell-out domains. Having them restricted to 
containing only a single phase-head doesn't obviously constrain what else can be in an 
initial sub-array, so they may include material above v or not. If so, then the problem of 
distinguishing phase-edges from structure above such heads (but below the next phase 
head) still arises. 
 This situation is general. Suppose that the wh-element has somehow reached the 
vP edge (the double-v's here can be taken to be an adjunction structure, e.g., vP - vP, or a 
maximal and intermediate level vP - v'; it doesn't matter for present purposes). 
 
(175) T ? v ? v ? V ? t(Dwh)
      [Diagram: Dwh is attached at the higher v; t(Dwh) marks its position below V.]
 
The same situation described above arises here. The next step introduces a C element. 
The step after that is presumably a selecting V, perhaps with a local object, and only after 
this might we encounter another v, which by assumption forces the complement domain 
of C to spell out. So, again, we either need the movement not to be, strictly speaking, 
consistent with Last Resort, or we need an M-feature associated with C. 
 We have reached, then, two main conclusions. First, the expelling-of-viral-elements 
story, and any similar approach seeking to drive cyclic movements out of 
independently evaluated domains based on convergence needs, requires some additional 
assumptions to get things to work. In particular, it seems that movement to the edges of 
phases cannot be motivated by a strict localizing of Last Resort. To the extent that we 
render the idea of triggering a local/non-target movement consistent with a 
localized vision of economy (e.g., see Collins 1997), it's not obvious how to discriminate 
between the phase edge and any other category that may be present below the next 
phase-inducer. Therefore, it seems that something like M-features is required if we have 
reason to think that movement is to vP and not to higher elements below the next 
phase-inducing head. 
 Second, it's not clear how the idea of moving to a "categorially likely checking 
configuration" could drive wh-movement (or NP-movement) to anything but intermediate 
C (or T for NP-movement), at least not on standard assumptions about the structure of 
verbal extended projections (e.g., assuming roughly C-T-v-V).
90
 
90
 However, as we will see in the next chapter, it has been argued (see Butler 2004, Belletti 20?, Pesetsky & 
Torrego 2004) that there may indeed be such elements below v but above V. If this is correct, then we could salvage the 
combination of expelling viral elements and moving to a likely checking configuration (this requires a version of the 
"split-VP" hypothesis (Koizumi 1993, 1995, Lasnik 1999a, Johnson 1991, Runner 1995)). So, if verbal extended 
projections involve internal "mini-clauses" (C-T-v-C-T-V, or perhaps some equivalent), then we may have a road 
in to a uniform story about cyclicity stateable in these terms. 
 
 We saw in addition in our discussion in this chapter that TAG approaches (at 
least on the general approach advocated in Frank 2002) also require reference to an 
EPP-type of property to motivate initial movements within elementary trees. Problems were 
seen to arise for the view of adjoining auxiliary structures at X'-nodes for SCM of the 
wh and NP varieties if it was assumed that the initial movement involves what we have 
called Core Licensing Properties (CLPs; i.e., if the initial movement is the ultimately 
licensed one). Instead, EPP-properties are understood to drive local movements, and the 
licensing of the sort involving CLPs is regulated by cross-tree feature-checking built in to 
the adjoining mechanism. 
 In the next chapter, I turn to the task of further developing the assumptions and 
mechanics of TCG and the WS/O-distinction on which it is based. The main argument is 
that TCG makes available an account of SCM that dispenses with the need to appeal to 
M-features. We will provide some fairly coarse-grained examinations of some of the 
SCM phenomena sketched in this chapter, providing enough of a story to show how the 
mechanics could be deployed in more detailed investigations in each domain. 
 
 
CHAPTER 3: TCG Analysis 
The structure of this chapter is as follows. We here deploy and further develop the 
TCG-approach with reference to some general sets of facts, concentrating mostly on points 
bearing on the architecture under discussion, rather than on analyses pursued to any great 
depth. We begin with a discussion of local relations which brings up some problems that 
went undiscussed in Chapter 1. This leads then to two discussions regarding SCM: one 
pertaining to A-movement, and one to A'-movement (concentrating on wh-relations). 
 We then speculate on a general extension of the logic deployed for SCM cases to 
other cases within local structures, opening the suggestion to pursue so-called "split-VP" 
or "stacking"-style analyses to a perhaps extreme point of dividing individual thematic 
elements into their own little "mini-clauses", forming the clause-internal equivalent of 
traditional clause divisions. This is then shown to be relevant for cases left out by the 
traditional view of the clause assumed in the SCM discussion, that is, cases suggesting 
that at least wh-movement involves relations beyond clause edges, including as well 
some internal phrases. Then we stop, and move to concluding remarks, a discussion of 
some further architectural issues and open questions/problems. 
3.1. Local Relations: Part One 
Consider the following simple case with an unaccusative verb in (176): 
 
(176) A man arrived 
 
A relatively standard view of the structure of (176) would look as follows (ignoring the 
C-domain): 
 
 
(177) [TP [DP(?/?) [D a] [N man]] [T' T0 [VP [V0 arrive] tDP(?)]]]
 
Deploying the reduced structures and derivations discussed in Chapter One, we would 
have (ignoring the internal assembly of the nominal):
91
 
 
(178) [Derivation diagram, steps a-g. In steps a-e, T and D associate and exchange feature 
      values, leaving T valued f and D valued n and f. In step f the thematic V is added 
      with an open θ-position, which in step g takes φ:f as its value.]
 
The D and T elements associate and swap ? and ? as outlined earlier (§1.2), pictured in 
step (178)c. This process results in the deletion of ? on T; the outcome is a "discharge" 
of the ?-property of T, and a φ-relation between T and D. The addition of the thematic (V) 
element (arrive) leaves us with the structure in (178)f. And, as suggested earlier, the 
θ-property of V takes φ:f as its value, completing the A-relation circuit (closing off the 
open position of the thematic predicate). In this way D is indirectly connected (via the 
?/? feature-complex) to the internal role. 
91
 I assume that D-elements typically come without φ-specifications (i.e., with φ:?), and that this property is 
valued in virtue of associating with N, which comes with φ:f (valued φ). 
 
 I mentioned in Chapter One as well that we will be viewing θ-variables as 
inherently undistinguished open slots, so that ?/? properties are actually required to 
individuate them. We can expand on this point by considering a minimal addition to 
(176): 
 
(179) A man arrived sad 
 
Following the basic structure of proposals of Williams (1983, 1994) and others, I will 
take this to be a situation where non-distinct θ results in an identification, in virtue of 
which the φ-property of the unaccusative-θ comes to be borne by the adjoined adjectival, 
as follows: 
 
(180) [Diagram, two stages: T (φ:f), V (θ[φ:f]) and the adjoined A, whose θ comes to bear 
      [φ:f] in the second stage; D (valued n and f) is as in (178).]
 
 
We will now see that on the rather loose view sketched in Chapter One regarding 
possible configurations for feature-relationships, even fairly small increases in 
complexity appear to present us with problems. Consider a simple transitive: 
 
(181) He likes her 
 
(182) [Derivation diagram, steps a-j. Steps a-e repeat (178)a-e for T and the subject D. 
      Step f introduces v, bearing θ, ?:a and an unvalued φ; in step g the θ of v takes 
      φ:f as its value. Step h adds V with its own θ. Step i introduces the object D 
      (φ:g, with its other property unvalued), and in step j v and the object D exchange 
      the values a and g.]
 
 
Assuming that v introduces both the "external" θ and accusative (?:a), I will mark this 
element with θ written above v and the ?:a and φ:? properties written below it, to indicate 
that the θ-role is upwardly directed, and the ?/? properties downwards, as in (182)f. 
However, this talk of "upward/downward" raises the issue of whether there can be 
configurations where the θ-role dominates its argument in this approach. So far we have 
sketched a view under which A-relations, including the external-θ assignment, are 
mediated by ?/? properties. So such mediation is suggested to be possible; but is it 
necessary? 
 Consider the steps beyond the introduction of v in (182)h-j. In (182)i it is 
indicated that the ?/?-value swap happens independently of the θ-role taking its value. Is 
this the right way to think about this? What is at stake? 
 For one thing, since we have stated the mechanisms of feature valuation in a way 
that allows for probes to dominate goals or vice versa, it's not clear what prevents, e.g., 
the v-introduced θ-role from associating with its own ?/? properties. Or, for that matter, 
having the nominative-T's φ-properties value the internal (V-introduced) θ. It would 
seem, in other words, that we need to introduce some asymmetries (e.g., have v-θ only 
look "upwards"; or V-θ not able to look past the v-introduced ?/?-properties). That is, 
we need some notion of locality here so that we don't get he likes her meaning she 
likes him, and other impossibilities. 
 Recall that we made a small fuss earlier in this discussion (Chapter One) about the 
possibility of redundancies between statements of locality built in to rules or imposed on 
their outputs (e.g., minimality-type restrictions) and approaches which offered some 
characterization of domains, leaving the operations otherwise unrestricted. Here we are 
presented with a situation for which it isn't at all obvious that our conception of 
workspace restrictions (of the distinctness and ordering sorts) could be relevant. What 
recursive structure or repeated elements are there in such local domains for the 
workspace to resist via the contraction mechanics? We need a story about the 
possible/impossible feature-connections in local domains. Importantly, if we have to 
introduce local notions regarding relative or absolute "structural distance" for operations 
to apply, it opens the question as to whether it would be best to treat everything that way 
(since it is required for the most local cases).
92
 
 I will suggest below, following some other developments, that the right position 
here may be to bite the bullet, and explore the possibility that there is a bit more structure 
in these local domains than meets the eye. The strategy here will be to work backwards 
from the more complicated cases (in particular linked-local relations of the SCM type) to 
the (superficially) simpler ones. It is in the domains where we see SCM-type effects that 
our suggestions regarding workspace distinctness find their plausibility. The idea will 
then be to see whether we can find motivation for extending the ideas to seek out possible 
divisions within local structures that allow us to carry over the central ideas about 
workspace-demarcated domains. So let us develop the analyses in more detail in these 
domains, and return to the local considerations. 
3.2. Linked-Local Relations I: Raising to Subject (RtS) 
Consider again the following standard case of cyclic A-movement in raising-to-subject 
(RtS): 
 
(183) John [seems [ _ to tend [ _ to appear [ _ to like carrots]]]] 
 
 
                                                
92
 See Frank (2002) for some similar discussions regarding the possible need for locality of movement 
restrictions within elementary trees, and the redundancies such views would pose for the TAG architecture; basically 
the same issues arise there as here, as one would expect. 
First, the subject position of raising verbs is standardly thought to be athematic, as the 
presence of expletive elements and idiom chunks has traditionally been taken to suggest: 
 
(184) a. There seems/is-likely/appears to be a man here 
b. The shit seems/is-likely/appears to have hit the fan 
 
I will assume then, as in our Chapter One sketch, that these predicates do not involve a 
small-v. I also adopt here the assumption that these infinitival complements are T's, and 
not C's. 
3.2.1. Some Raising Impossibilities & Expletive/Associate Relations 
Consider raising from a finite clause (185) and the ill-formedness of "superraising" in 
(186), where the subject moves past a position where it could have landed (were the 
position not otherwise filled): 
 
(185) *John1 seems [CP that _ is here] 
 
(186) a. It seems John was told [ _ to arrive on time] 
 
b. *John seems it was told [ _ to arrive on time] 
 
The properties of these cases might be taken to follow from our notion of workspace 
distinctness, repeated here: 
 
(187) WORKSPACE DISTINCTNESS (ANTI-RECURSION): 
 The workspace does not tolerate the presence of multiple tokens of type X 
 
We provided in our earlier sketch of SCM a general story in terms of (187) plus the 
requirement on workspace ordering that highlighted one kind of response that the system 
might make to potential distinctness violations. The outcome was a process of 
node-identification plus a contraction of the active workspace (to avoid ordering conflicts). 
Let us now consider a different situation: 
 
(188) John believes [that the earth is flat] 
 
Here we will have, upon encountering the edge of the embedded clause (remember: 
derivations go top-down!), a potential distinctness violation between matrix and 
embedded C. At least two different responses to this situation are possible: 
 
(189) C ? T ? v ? V ? C   C ? T ? v ? V ? C 
 
     D         D 
 
(190) C ? T ? v ? V ? C   C ? T ? v ? V ? C 
  
     D         D 
 
The response in (189) would be a conservative one in which the workspace would simply 
shift ("move on") as we sketched in our introductory discussion. This would obey the 
restriction on distinctness as the higher instance of C would simply be abandoned to the 
output. The alternative is a more radical shift, essentially beginning an entirely new 
domain, as in (190). 
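
The two responses can be illustrated with a small Python sketch. It is an illustration only, under simplifying assumptions that are not part of the proposal itself: the workspace is modeled as a flat list of category labels ordered top-down, the function name `expand` is hypothetical, and the conservative response is taken to hand over the span from the top of the workspace down to the older token of the repeated type.

```python
from typing import List, Tuple


def expand(workspace: List[str], new: str, radical: bool) -> Tuple[List[str], List[str]]:
    """Add `new` at the bottom of a top-down dominance path.

    Returns (new_workspace, spelled_out). If a token of the same type is
    already active, workspace distinctness (187) forces a contraction first.
    """
    if new not in workspace:
        # No repeated type: plain expansion, nothing is spelled out.
        return workspace + [new], []
    if radical:
        # Radical response, as in (190): everything active goes to the
        # output and an entirely new domain begins at the new token.
        return [new], list(workspace)
    # Conservative response, as in (189): the workspace shifts past the
    # older token of the repeated type and keeps whatever lies below it.
    i = workspace.index(new)
    return workspace[i + 1:] + [new], workspace[:i + 1]


if __name__ == "__main__":
    matrix = ["C", "T", "v", "V"]          # John believes ...
    print(expand(matrix, "C", radical=False))
    print(expand(matrix, "C", radical=True))
```

On this toy model the conservative response leaves the matrix T (and anything attached to it) active alongside the embedded C, whereas the radical response leaves only the embedded C.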
 The more radical response would provide us with an explanation for the 
impossibility of (185) and (186)b. In both cases we would have derivations which 
expanded top-down as follows in (191), up to the boundary of the embedded C-domains. If 
the more radical contraction occurs, then the subject would be stranded outside the 
workspace without having had its associated ?/? properties connected with θ: 
 
(191) C ? T ? V ? C   C ? T ? V ? C 
 
     D         D 
 
 
Thus both raising from a finite clause and raising past a possible landing site would be 
subsumed under the same explanation. The result is a non-θ-marked subject in both 
cases: 
 
(192) a. *[CP John seems [CP that .. 
 b. *[CP John seems [CP C0 [TP it .. 
 
This would thus be a different instance of the sort of schema offered in Chapter 1, repeated 
here, regarding an unlicensed property being abandoned from the workspace to the output 
structure: 
 
(193) A ? B ? C ? A  ?   
?:A
?:A
*F:?
 ? B ? C ? A 
 
However, it is important to note that on the view of ?/?-θ connections being explored 
here, the violation is perhaps best viewed in these broader terms, as a faulty A-relation, 
rather than just saying that an NP has not received a role. That is, there will be more than 
one way that an A-relation can be faulty, but the general story about feature relations 
on the dominance path will be seen to connect them all into the more general class. 
 That is, while it might be true that the violations above are on a par with (194), 
 
(194) *John seems 
 
we clearly do not want to say the same for the otherwise superficially similar violation in 
(195) (since there is athematic): 
 
(195) *There seems that a man is in the room 
 
 
Given the top-down nature of structure expansion in the present system, the fairly strong 
intuition of the ill-formedness of (195) at the following point in a left-to-right pass can 
perhaps be taken to be relevant in this connection: 
 
(196) *There seems that.. 
 (vs. It seems that ..) 
At the point where the finite embedded clause is encountered, the judgments are fairly 
uniform across (185), (186)b, and (196). But (196) involves athematic there, suggesting 
that the detection of anomaly at the clause border in (185) and (186)b, which has a rather 
similar profile, ought not be conceived of in exclusively θ-theoretic terms. Consider also: 
 
(197) a. *What seemed that Mary liked _ ? 
 b. *What seemed that Mary liked pizza? 
 
Now, both (197)a&b are clearly out, but they seem to have the same profile despite being 
rather different sorts of violations on standard views. In particular, (197)a would 
presumably be a Case-theoretic violation, since what occupies two Case positions, 
whereas (197)b would involve full satisfaction of ?/?, but the wh-element would receive 
no θ-role. Recall our schema for a local A-relation (e.g., a ?-marked "subject" indexed to 
the external/v-introduced θ by the φ-properties): 
 
(198) C ? T[φ:f] ? v[θ[φ:f]] 
      D[φ:f, ?:n] 
 
 
What the ill-formed examples involving non-expletives discussed above have in common 
is the following stage of derivation (just prior to contraction), where V is a raising 
predicate (an athematic element compared to (198)): 
 
 
 
(199) C ? T[φ:f] ? V ? C 
      D[φ:f, ?:n] 
 
If (C, C) satisfy the matching relation, and result in a contraction, then it is true that 
D[φ:f, ?:n] in (199) will not be θ-associated, but that does not explain the otherwise very 
similar feel to the violation involving there in (195), which on most views of these 
elements is athematic (though see Moro 1997 for a view in which there, if not "thematic", 
is at least viewed as an abstract sort of predicate). 
 Bošković (1997, 2002) discusses what he calls (following a suggestion of Howard 
Lasnik) the Inverse Case Filter. The traditional Case filter was stated in terms like the 
following: 
 
(200) Extended Case Filter: 
 *[NP α] if α has no Case and α contains a phonetic matrix or is a variable 
        (Chomsky 1981:175) 
 
Case-theoretic violations on this view are pinned on a failure to meet a requirement of 
NPs. However, it is not unreasonable to suggest that Case-theoretic violations might be 
(or might also be) a matter of Case-assigners needing to "discharge" their ?-property. 
This is the idea of the "Inverse" Case Filter. Bošković discusses examples such as the 
following: 
 
(201) a. * is likely John will leave 
 b. *John believes to seem Peter is ill 
 
Both of these kinds of violations have been discussed in terms of the "EPP", currently 
implemented in feature-licensing terms within minimalism. However, Bošković points out 
that the cases in (201) can be explained in terms of failure to discharge/assign Case 
(nominative in (201)a and 'exceptional' accusative in (201)b). Can we make use of this 
general idea in handling the crucial facts regarding expletive-there in a way that connects 
them to what otherwise (on standard views) appear to be θ-theoretic violations? 
 We could assume (with Chomsky and others) that there is minimally specified for 
agreement, but cannot check case. Then we would understand the ill-formed there..that 
case above as a matter of the Inverse Case Filter. However, another view is possible, 
which I believe can play a role in determining the distribution of there-type expletives (at 
least in English). 
 Suppose that matching of a feature, any feature, is sufficient to license 
combination, so that X?Y can be established as a dominance link if they bear the same 
feature F. Suppose that expletive there, unlike regular "thematic" nominal expressions, 
bears just unvalued ?, and no φ-specification whatsoever. Then we would have the 
following: 
 
(202) C ? T[φ:?, ?:n] ? V ? C 
      there: D[?:?] 
 
Now two options present themselves: either ?-properties can enter into 
licensing/valuation independently of φ, or (as Chomsky 1998, 1999 suggests) ?-licensing 
is parasitic in some sense on φ-relationships. Suppose that ?-valuation/licensing can 
happen independently. Then we have: 
 
(203) C ? T[φ:?, ?:n] ? V ? C         C ? T[φ:?] ? V ? C 
      there: D[?:?→?:n]               there: D[?:n] 
 
 
This view could help us with the distribution of expletive-there as follows. Suppose, 
contrary to what we have suggested so far, that in the well-formed local A-relation thematic 
elements (v/V) come to the derivation with an unvalued φ-feature as their argument. I 
suggested earlier that we might in general view ?/?-properties as individuating the 
variable positions of thematic predicates, and two ways of thinking about this were 
offered: (i) the open positions are undifferentiated "slots" and (ii) the one open position is 
indistinguishable from another because they all "start" with a general default value. 
Suppose then that they start as θ[φ:?]. An expletive element in the subject position of a 
transitive verb in English then will encounter the following stage of derivation if the 
sketch in (203) is correct: 
 
(204) C ? T[φ:?] ? v[θ[φ:?]] 
      there: D[?:n] 
 
(e.g., *there hit ..) 
 
Assuming that subsumption (see (31), §1.2) is required to hold for valuation, φ:?-φ:? 
will make ?-discharge as we have envisioned it impossible. The prediction is that 
expletive-constructions must require some other way to get T-φ valued. How could that 
happen? It must be, according to this view, that T-φ relates directly to a subordinate 
nominal element, something that is independently valued for φ. And that nominal must be 
in a configuration that is somehow appropriately thematic, but without involving 
?-assignment. Moreover, there cannot be any intervening θ[φ:?], as this could be seen to 
block the relevant relationship between T-φ and some lower nominal "associate". 
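
The valuation logic just described can be sketched in Python. This is a rough illustration only: the statement of subsumption in (31) is not repeated in this section, so the sketch assumes the simplest reading, on which valuation requires exactly one valued and one unvalued instance of the same attribute. The dictionary encoding, the attribute name `"phi"`, and the function name `try_value` are hypothetical choices made for the example.

```python
from typing import Dict, Optional

# An element is a bundle of attributes; None stands for "unvalued".
Element = Dict[str, Optional[str]]


def try_value(a: Element, b: Element, attr: str) -> bool:
    """Value `attr` on whichever of `a`, `b` lacks a value, using the other.

    Returns True only if a valuation actually took place.
    """
    if attr not in a or attr not in b:
        return False              # no matching feature, so no relation at all
    va, vb = a[attr], b[attr]
    if (va is None) == (vb is None):
        return False              # both unvalued (or both valued): no valuation
    if va is None:
        a[attr] = vb
    else:
        b[attr] = va
    return True


if __name__ == "__main__":
    T = {"phi": None}             # T with unvalued phi, as in (204)
    theta_slot = {"phi": None}    # the open position of v, theta[phi:?]
    associate = {"phi": "f"}      # a lower nominal independently valued for phi

    print(try_value(T, theta_slot, "phi"))   # two unvalued instances
    print(try_value(T, associate, "phi"))    # unvalued T meets a valued nominal
    print(T)
```

Under these assumptions T cannot be valued against the unvalued open position of a thematic element, but it can be valued by a nominal that already carries a value, which is the configuration the text goes on to identify with expletive/associate structures.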
 
 This picture seems roughly correct. Expletives generally appear in ?-positions 
that are not immediately/locally associated with θ, as in raising, copular constructions, 
and unaccusatives. They can appear as well in ECM environments in English, so long as 
the condition on there being no intervening θ-element is met, e.g.: 
 
(205) a. I believe there to be a man in the room 
b. I believe there to have arrived a man 
c. I believe there to appear to be a man in the room 
d. *I believe there to have a man left 
e. *I believe there to be an idiot (vs. I believe John to be an idiot) 
Let us consider a counterpart to (205) to take a look at how these relations are established. 
The idea then is that there will serve to license matrix ?, but not φ. The result is that φ 
must be valued in another way, and it cannot connect directly with a θ-element (because 
the φ-argument of θ will be φ:? as well, and so the subsumption condition on valuation 
will not be met). Consider then the classic case: 
 
(206) There seems to be a man in the room 
 
We begin then with a local structure as discussed above: 
 
(207) a. C ? T[φ:?] ? V               b. C ? T[φ:?] ? V ? T 
         there: D[?:n]                   there: D[?:n] 
 
The defective T complement to the raising predicate V is introduced, resulting in 
contraction, leaving us with the following workspace (ignoring the output structure here): 
 
(208) a. C ? T[φ:?] 
         there: D[?:n] 
      b. C ? T[φ:?] ? Vbe ? D[φ:?, ?:] 
         there: D[?:n] 
      c. C ? T[φ:?] ? Vbe ? D[φ:?, ?:] ? N[φ:f, ?:?] 
         there: D[?:n] 
      d. C ? T[φ:?→φ:f] ? Vbe ? D[φ:?→φ:f, ?:] ? N[φ:f, ?:?]      (φ-agree) 
         there: D[?:n] 
 
In this situation we could understand the nominal element introduced (man in a man) to 
value the φ-properties that are unvalued prior to (208)d, but then what of the ?-properties 
of the nominal associate? Given our adoption of a single dominance order, there bears no 
direct relationship to the associate. 
 We can solve this problem by taking the other option suggested above (following 
Chomsky) and suggesting that ? and ? valuation are linked. We keep the idea that there has 
a lone unvalued ?-property. This will explain why local ?/?-valuation cannot happen. 
This will leave us with an initial stage of derivation more like this: 
 
(209) C ? T[φ:?, ?:n] ? V 
      there: D[?:?] 
 
Now a different question arises, namely: when a θ-element is introduced, why can't the 
valued ?-property of T "fill in" the value of ? as we suggested for wh-questions? I will 
return to this in a moment. First, consider how this will implement a familiar 
transmission-type story regarding expletive-associate relationships. If T and expletive-there 
are able to relate via matching (T?-D?), but with valuation of the properties impossible 
because there bears no φ-property, then the final stage of derivation above in (208)d would 
look as follows in (208)d', with ?-valuation happening then parasitically on successful 
φ-agreement as in (208)e: 
 
(208) d'. C ? T[φ:?→φ:f, ?:n] ? Vbe ? D[φ:?→φ:f, ?:] ? N[φ:f, ?:?] 
          there: D[?:?] 
      e.  C ? T[φ:f, ?:n (deleted)] ? Vbe ? D[φ:?→φ:f, ?:→?:n] ? N[φ:f, ?:?→?:n] 
          there: D[?:?→?:n] 
 
Thus, T-? values both the expletive and the associate, deleting as in the regular case. We 
have then a mechanism whereby expletive-there is a ?-marked element, but so is the 
associate. We thus agree with the assessment of Lasnik (1995) that the expletive checks 
the matrix Case property, but do not posit an additional partitive case (e.g., as in both 
Lasnik 1995 and Belletti 1988) that is responsible for licensing the associate. Rather, the 
associate will be able to enter, in virtue of its ?-properties, into 'caseless' types of 
predication, as in small clause structures for example:
93
 
 
(210) a. I consider [John a genius] 
b. I consider [John intelligent] 
c. ..[a man] [ in the room ] 
 
For the present point regarding expletive/associate relationships, I will assume that the 
structure of the relationship between [a man] and [in the room] comes out as follows for 
the continuation of our (208) derivation in (208)f: 
 
(208) f. C ? T[φ:f] ? Vbe ? D[φ:f, ?:n] ? N[φ:f, ?:n] 
         in: P[θ[φ:?]] 
         there: D[?:n] 
      g. C ? T[φ:f] ? Vbe ? D[φ:f, ?:n] ? N[φ:f, ?:n] 
         in: P[θ[φ:?]→θ[φ:f]] 
         there: D[?:n] 
 
 
                                                
93
 I mean by 'caseless' here just that case-assignment is not part-and-parcel of the local relation in the sense 
that Case must be assigned from outside the basic predication. 
 
The D-element is ?-marked, and so it may enter into a relation with an element which 
does not itself carry a potentially interfering ?-property. The assumption then is that 
there is a class of ?-like relationships which are "direct" in one sense (they involve ?-? 
connections (valuation) that are not accompanied by ?-valuation) but "indirect" in 
another sense; in particular, any nominal element entering into such relationships will 
have to have found its ?-properties licensed in some independent ?/? complex. I will not, 
in the present work, get into the issues regarding definiteness effects and the like, though 
the suggestion here would be that direct ?-? relationships that are reliant on some other 
instance of ?-assignment are at least one kind of DE environment. The present outline 
seems compatible with one or another version of Diesing-style mapping, but I won't 
pursue this here (e.g., see Diesing (1992); see Hornstein (1995) for some relevant 
discussion, and Safir (1987) for what strikes me as a related conception involving 
"transmission" and the notion of "unbalanced" chains). 
 So, let us now consider our ill-formed cases together from above: 
 
(211) *John1 seems [CP that _ is here] 
 
(212) a. It seems John was told [ _ to arrive on time] 
 
b. *John seems it was told [ _ to arrive on time] 
 
(213) *There seems that a man is in the room 
 
These would thus correspond to the following two scenarios, where both (211) & (212) 
manifest a reasonable A-relation as in (214), but one which does not connect with θ, thus 
leaving the matrix subject John unintegrated. In the case of (213) the subject-related T 
element (along with the expletive) must be spelled out with unlicensed properties: 
 
(214) C ? T[φ:f] ? V ? C 
      John: D[φ:f, ?:n] 
 
(215) C ? T[φ:?, ?:n] ? V ? C 
      there: D[?:?] 
 
In (215), there can enter the structure in virtue of ?-matching, but given the assumption 
that ?-valuation is parasitic on φ-relationships, nothing further can happen at this point. 
Assuming that properties which are unvalued are illegible at the interface, (215) will 
crash as the material intervening between C?..?C is spliced out. 
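
The contrast between (206) and (213) can be made concrete with a short Python sketch. It is offered only as an illustration under stated assumptions: elements are encoded as dictionaries with hypothetical attribute names `"phi"` and `"case"`, `None` marks an unvalued property, Case valuation is allowed only once φ-agreement with some valued nominal has succeeded, and "legibility" is reduced to the absence of unvalued properties. The function name `converges` is likewise hypothetical.

```python
from typing import Dict, List, Optional

Element = Dict[str, Optional[str]]   # attribute -> value, None = unvalued


def converges(T: Element, nominals: List[Element]) -> bool:
    """Check whether T and the nominals it has matched end up fully valued.

    `nominals` lists the elements T can still relate to before the workspace
    contracts (the expletive, and the associate if one is reachable).
    """
    T = dict(T)                               # work on copies
    nominals = [dict(n) for n in nominals]

    # Step 1: phi-agreement needs a nominal that is independently valued.
    donor = next((n for n in nominals if n.get("phi") is not None), None)
    if donor is not None:
        T["phi"] = donor["phi"]
        for n in nominals:
            if "phi" in n and n["phi"] is None:
                n["phi"] = donor["phi"]
            # Step 2: Case valuation is parasitic on successful phi-agreement.
            if "case" in n and n["case"] is None:
                n["case"] = T["case"]
        del T["case"]                         # T's Case property is discharged

    # Interface legibility: no property may remain unvalued.
    return all(v is not None for item in [T] + nominals for v in item.values())


if __name__ == "__main__":
    T = {"phi": None, "case": "n"}
    there = {"case": None}                    # lone unvalued Case, no phi
    a_man = {"phi": "f", "case": None}

    print(converges(T, [there]))              # (213)/(215): no associate reachable
    print(converges(T, [there, a_man]))       # (206): associate reachable
```

With only the expletive available, T's φ stays unvalued and so does the expletive's Case, which corresponds to the crash described for (215); once the associate is reachable, both it and the expletive are valued.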
 So we diagnose two kinds of A-relation deficiency, and have in hand a reasonable 
story that looks to be able to contribute to understanding expletive-there's distribution, 
based on a particular implementation of the "transmission" logic (Safir 1982, Chomsky 
1986). Also we have implemented a view of ?/?, with suggestive connections to GB-era 
views. Consider again the Extended Case Filter mentioned above: 
 
(216) Extended Case Filter: 
 *[NP α] if α has no Case and α contains a phonetic matrix or is a variable 
        (Chomsky 1981:175) 
 
On the view here this is only part of a more general conception of A-relations (and 
A'-relations, as we'll see below). Why should Case be hooked up with the notion of 
"variables"? The idea here is that ?-properties name variables, thereby distinguishing 
them, and that the interconnections of ?/?-properties are what mediate the connections 
between thematic predicates and nominal expressions, whether operator-like/ 
quantificational or not (as with ordinary NPs). Vermeulen (1995) (see also Visser & 
Vermeulen (1996)) points out that we can in general distinguish three things regarding 
variables: (i) the "variable itself", (ii) the name of the variable, and (iii) the value of the 
variable (what it ranges over). The "variable itself", I am suggesting, is the open position 
in θ-elements for the participants that these elements relate to eventualities. ?-properties 
create a path to a ?-feature, which either is bound from above (as in WH-cases) or is 
assigned to an overt nominal. 
 Before moving on, let us return to an issue we left dangling above. Recall that the 
possibility of having the less radical workspace contraction (189) raises again the issue of 
locality as potentially independent of these dynamic domains, as discussed above in 
connection with the derivations for simple transitive structures. I repeat the possible 
workspace contractions here for convenience: 
 
(217) C ? T ? v ? V ? C   C ? T ? v ? V ? C 
 
     D         D 
 
(218) C ? T ? v ? V ? C   C ? T ? v ? V ? C 
  
     D         D 
 
It's not clear, without imposing some other locality mechanism, why the conservative 
response to a distinctness violation (217) wouldn't then simply allow some featural 
relation with the embedded T-element.
94
 
                                                
94
 Note that the kind of node-identification suggested to underlie SCM presumably couldn't apply between 
finite T's, since these would not manifest the subsumption relationship that we have posited as being the relevant 
condition under which node identification takes place. 
 
 Note that the conservative response to a potential distinctness violation would 
always end up with there being (globally) more instances of contraction/spell-out than the 
radical response. Assuming that transderivational comparison of greater vs. smaller numbers 
of contractions is an undesirable property to have in a minimalist approach, note that 
there is a local way to ensure that the global number of contractions will in fact be 
minimized. This will always hold if local contractions are maximal, that is, if the 
system responds in the "radical" manner depicted above (which yields the A-movement 
locality facts discussed). 
 Note also that this issue of the difference between the radical vs. conservative 
response to distinctness violations does not arise when the relevant context nodes (e.g., C 
or T) are identified as in A- or A'-type SCM. Since the relevant nodes are identified, there 
is no way to remove "just one of them", leaving the other in the workspace, so the only 
possible response following identification is the radical contraction, in order to keep the 
ordering properties coherent, as sketched in Chapter One. 
 However, if we adopt a local economy view for the non-movement case 
(maximize contractions) to get an account of the A-movement locality facts above, we 
would then have two separate motivations for what otherwise seem to be rather similar 
sorts of processes, differing only with respect to whether the relevant nodes are identified 
or not prior to contraction. My suspicion is that there may be a way to derive these 
contractions in a unified way, but I have not yet found a satisfactory formulation to this 
effect. I will leave the matter open here, assuming the following: 
 
(219) Workspace Economy: Contraction/Spell-Out is locally maximal 
 
 
Note that this is actually consistent with a "least effort" line of thinking, perhaps despite 
appearances. The idea would be that maintaining distinctions in the workspace is what 
takes effort, so whenever this burden can be eased (by spelling out) it is maximally eased. 
3.2.2. Binding Interventions & SCM 
Consider now some of the SCM-type effects regarding binding, in particular the cases in 
(136), repeated here as (220): 
 
(220) a. John₁ seemed to himself₁ to appear to Mary to be getting fat 
 b. John₁ seemed to appear to himself₁ to be getting fat 
 c. John₁ seemed to Mary to appear to himself₁ to be getting fat 
 d. *Mary seemed to John₁ to appear to himself₁ to be getting fat 
 e. It seemed to John₁ to appear to himself₁ that he was getting fat 
 f. John₁ seemed to Bill₂ to appear to himself₁/*₂ to be getting fat 
 
We noted the following two key points about these cases in our Chapter 2 discussion. 
First, the impossibility of (220)d was suggested to be traceable to an intervention effect 
on a cyclic raising story, where we would understand Mary to occupy the embedded 
non-finite subject position of to appear, thus constituting a closer possible binder for the 
self-form in the experiencer-P. But this creates a φ-conflict, and so it is out. 
 The second point was to note that, regarding (220)e, it appears that John is a 
suitable binder for the self-form, despite the apparent lack of a c-command relationship. 
We can now expand on these observations as follows. 
 Recall that we noted as well in our earlier discussion that (220)a and (220)b suggest 
that either the binding domain for the self-form includes more than one clause (perhaps 
specified in terms of the presence of a subject or suitably "subject-like" element, as in 
some approaches to binding), or some relation must be involved to bring the 
antecedent-dependent pair into a more local relationship. The two salient possibilities for 
this latter sort of solution are the kind of T-to-T-domain SCM of the subject John in these 
examples, or perhaps an LF-type movement of the self-form. 
Note that a strict clause-mate condition on these relationships is implausible if we 
assume that the experiencer-Ps are not implicated in any movement between domains 
(i.e., assuming their positions are fixed where they surface). Consider the following 
additions to the examples in (220) of one more intervening raising predicate (tend, in 
(221)) or two more (tend and be likely, in (222)): 
 
(221) a. John₁ seemed to himself₁ to tend to appear to Mary to be getting fat 
 b. John₁ seemed to tend to appear to himself₁ to be getting fat 
 c. John₁ seemed to Mary to tend to appear to himself₁ to be getting fat 
 d. *Mary seemed to John₁ to tend to appear to himself₁ to be getting fat 
 e. It seemed to John₁ to tend to appear to himself₁ that he was getting fat 
 
(222) a. John₁ seemed to himself₁ to tend to be likely to appear to Mary to be getting fat 
 b. John₁ seemed to tend to be likely to appear to himself₁ to be getting fat 
 c. John₁ seemed to Mary to tend to be likely to appear to himself₁ to be getting fat 
 d. *Mary seemed to John₁ to tend to be likely to appear to himself₁ to be getting fat 
 e. It seemed to John₁ to tend to be likely to appear to himself₁ that he was getting fat 
 
Importantly, the judgments remain the same as in (220). The crucial cases are those 
involving binding between two elements situated within these experiencer-Ps, that is, 
(221)e and (222)e. What these (e)-cases show is that binding is independently possible 
between the nominals in these Ps, and that the phenomenon is not sensitive to the 
boundaries introduced by intervening embedded non-finite clauses. If the position of 
these Ps is "fixed", then binding of these self-forms cannot be required to happen within 
a single clause. Similarly, if movement from these positions is generally not possible, the 
idea of having the self-form move into a more local relation with its antecedent is 
implausible as well. This leaves the possibility of defining the binding domains in terms 
of something like the local presence of a "subject" (so if there are no local A-movements 
we could still have arbitrarily large binding domains in this sense). 
 The question then is why, on such a view of binding domains, we would have the 
contrast between the d-/e-cases in (220)-(222). It would seem that John can be a local 
binder; that's what the e-cases show. 
Moreover, as the following show, many kinds of dependencies that are typically 
understood to require a c-command relationship appear to be licit between two such P 
structures, providing further strength to the assertion that there is no complicating factor 
of structure in the d-cases in (220)-(222), and that it thus stands as a piece of evidence 
that appears to demand that something like successive-cyclic A-movement occurs. 
Consider (from Castillo, Drury, & Grohmann 1999:95): 
 
(223) a. It seems to every boy₁ to appear to his₁ mother that the earth is flat 
b. It seems to no man to appear to any woman that the earth is flat 
c. *It seems to him₁ to appear to John₁ that the earth is flat 
d. It seems to his₁ mother to appear to John₁ that the earth is flat 
e. ?It seems to John₁ to appear to himself₁ that the earth is flat 
f. It seems to John₁ to appear to him₁ that the earth is flat 
g. It seems to them₁ to appear to each other₁ that the earth is flat 
 
Thus, variable binding of a pronoun by a quantifier ((223)a), negative polarity licensing 
((223)b), and Condition C violations ((223)c), as well as their expected absence in (223)d, 
all point to the generalization that these experiencer elements can bind (etc.) out of their 
Ps. 
 Curiously, there is an unexpected absence of strong complementarity between 
Conditions A and B, as the acceptable judgment for (223)f shows in comparison to 
(223)e. That is, (223)f is not degraded with the indicated coreference as is the following 
case in (224)a with respect to the well-formed (224)b: 
 
(224) a. *John₁ is believed to seem to him₁ to be a genius 
 b. ✓John₁ is believed to seem to himself₁ to be a genius 
 
Some speakers in fact detect a slight advantage in acceptability for the pronoun case 
versus the self-form in the case of binding between elements in experiencer-Ps, finding 
(223)f slightly better than (223)e. But the rest of the judgments are stable, including the 
possibility of reciprocal binding as in (223)g. 
 The presence of the strong contrast in (224)a/b and its absence in (223)e/f led 
Castillo, Drury, & Grohmann (1999) to suggest that the self-form in these cases is actually a 
logophor (Reinhart & Reuland 1993, Sells 1987), and that, if true, this fact would 
undermine the argument for successive-cyclicity of A-movement based on the 
intervention effect in (220)d (due initially to Danny Fox, as pointed out by David 
Pesetsky (p.c.)). The argument is essentially this: since we do not know what governs the 
distribution of logophors, it is unclear that we need to posit an intermediate copy/trace of 
A-movement to explain (220)d, repeated here: 
 
(220) d. *Mary seemed to John₁ to appear to himself₁ to be getting fat 
 
What should account for speaker judgments regarding (220)d should thus be some 
to-be-specified story about logophoricity. 
 But this argument from Castillo et al. does not go through; the case for 
successive A-movement made by such observations, I think, still stands. While it is true 
that the non-complementary distribution of pronouns and reflexives in these 
experiencer-P environments suggests that something like logophoricity is in play here (e.g., 
as in "picture-NPs", see below), this observation says nothing about what still appears to be 
an intervention-type effect in the contrast between (220)c and (220)d above. That is, 
whatever logophoricity ultimately amounts to (connected to various matters concerning 
"point-of-view", "psychological state", and similar notions; see below),⁹⁵ it still seems to 
be sensitive in some manner to structural factors in the determination of impossible 
antecedence relationships.⁹⁶ 
 
 The interest of logophors stems from their extra distributional freedoms in 
comparison to ordinary self-anaphors. For example, in comparison with the 
complementarity of the straightforward local cases of pronouns/reflexives (as in (225)a 
vs. b), we find the lack of a strong contrast for the a/b cases in (226) & (227) surprising, 
as we do with the antecedence between the P-embedded experiencer elements in (223)e 
vs. (223)f (repeated here as (228)a vs. (228)b): 
 
(225) a. *John₁ liked him₁ 
 b. ✓John₁ liked himself₁ 
 
(226) a. ✓John₁ liked the pictures of him₁ 
 b. ✓John₁ liked the pictures of himself₁ 
 
(227) a. ✓John₁ thought pictures of him₁ were on display 
 b. ✓John₁ thought pictures of himself₁ were on display 
 
 
                                                
⁹⁵ Note that Boeckx (2001) contains an interesting discussion drawing on Rooryck (19?), who argues for a view 
of certain raising predicates (e.g., seem, appear) connecting to verbs of comparison, and thus indirectly to concerns 
relating to "point-of-view". For discussions of the notion of logophoricity, see Clements (1975), Sells (1987), Reinhart 
& Reuland (1993), Williams (1994), Reuland & Everaert (2001). See below for some further discussion of 
logophoricity and why it probably isn't in play in the present case. 
⁹⁶ Castillo et al. do note various "structural" factors that seem to be involved in restricting logophor 
interpretation, suggesting a preference hierarchy for c-commanding vs. m-commanding vs. merely "previously 
established in the discourse" elements as potential antecedents, but shy away from the problematic conclusions that I 
reach here regarding the argument these cases still present for cyclic A-movement. 
 
(228) a. ?It seems to John₁ to appear to himself₁ that the earth is flat 
b. It seems to John₁ to appear to him₁ that the earth is flat 
 
But all this is to notice an extra dimension to the distribution of self-elements: 
something about these contexts allows additional possibilities where induction 
from the basis of the strictly local cases suggests they ought to be out. 
 Some speakers, as mentioned above, find there to be a slight advantage in 
acceptability for the pronoun in (228)b as compared to the self-form in (228)a. However, 
this difference is rather like the contrast between (229)b and (229)c, where there is usually 
a slight favoring of the pronoun over the self-form. 
 
(229) a. *Mary sold the pictures of himself 
b. ?John₁ thought Mary sold the pictures of himself₁ 
c. John₁ thought Mary sold the pictures of him₁ 
 
 
But the crucial Fox-cases have an unacceptability status more in line with (229)a. The 
conclusion is that negative restrictions on these self-elements are clearly in force, 
whatever their extra distributional freedoms. An explanation of this fact is 
straightforward on the SCM view of raising. 
 The conclusion can be avoided only if a story about the distribution of logophoric 
elements could be produced which would rule out by other independently motivated 
means the cases that can otherwise be handled as a straightforward intervention fact 
under the cyclic movement analysis. 
 But what is the domain for the binding theory on our restricted workspace view? 
It seems clear that binding of these self-elements is not something which occurs within 
the boundaries of the workspace as we have set things out here. Setting aside the 
picture-NP situation (I won't be discussing the status of recursion in NPs here), however we 
view the integration of the experiencer-Ps with respect to raising predicates, the mechanics 
of contraction will always set them off into separate workspaces. Consider: 
 
(230) C  ? T ? V ? T ? V ?.. 
  
      D    P?D D    P?D 
 
Assuming that these Ps are somehow V-associated canonically, T-T identification and 
contraction would result in the structures being in separate domains. 
 I do not have anything to say about how it is that elements (e.g. in (230)) can 
relate to each other from within their containing Ps, either in the binding of the self-form 
(228)a or in any of the other relations that seem to be possible between these positions 
(223). Whatever factors underlie the possibility of such relationships, however, what 
seems clear is that the possibility that the SCM analysis makes available, that of positing 
an intervener, directly explains the sharp anomaly of (220)d. 
 It is also clear that the general patterning of the availability of these self-forms 
includes positions that we will certainly want to say are in distinct phases of derivation 
(separate workspaces), like the binding of these forms by a matrix element when they are 
in embedded subject positions as in (227)b. 
 Recall from our discussion in Chapter One the suggestion that we think of the 
WS/O-distinction as essentially drawing the interface line, such that output would be 
conceived as a syntactic structure populated by only PF and LF relevant properties. On 
the local view of domains being entertained here, the distribution of logophors must be a 
matter of relationships established over the output structure. Thus I tentatively suggest 
here that we regard the logophoric self-forms under discussion as distinct from local 
reflexives in these terms. But note that our view of the output involves a general 
maintenance of the ordering properties created by construction in the workspace, so we 
still have "structural" distinctions that can be made over the output. Let us assume then 
that the connection between a logophor and an antecedent element is captured over the 
output structure with Higginbotham's (1985) linking mechanism, though we will take 
this linking to be contingent on matching φ-properties of the elements. 
 This will be understood to be different from the matching and valuation that we 
have so far discussed. I will in fact suggest below that local reflexivization is a process 
involving workspace-local valuation of φ; logophors, however, will be understood to be 
independently φ-specified, and either they match up with their independently φ-specified 
antecedents, or not, under linking. That is: 
 
(231) a. Local Reflexives: φ:f ... φ:_ (MATCHING/VALUATION) 
 
b. Logophoric -self: φ:f ... φ:f (LINKING) 
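
The contrast drawn in (231) can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: φ-features are modeled as a plain dictionary, an unvalued φ-set is modeled as None, and the names value_locally and link are hypothetical labels of my own, not part of the TCG formalism.

```python
from typing import Optional

# Assumption: a phi-feature bundle is a flat dict such as
# {"person": 3, "number": "sg", "gender": "m"}.
Phi = dict


def value_locally(antecedent_phi: Phi, reflexive_phi: Optional[Phi]) -> Phi:
    """Local reflexive (hypothetical model of MATCHING/VALUATION).

    The reflexive is assumed to enter unvalued (None) and to receive
    a copy of the antecedent's phi-values.
    """
    if reflexive_phi is not None:
        raise ValueError("a local reflexive is assumed to enter unvalued")
    return dict(antecedent_phi)


def link(dependent_phi: Phi, antecedent_phi: Phi) -> bool:
    """Logophoric -self (hypothetical model of LINKING).

    Both elements are independently phi-specified; linking succeeds
    only if the two specifications match.
    """
    return dependent_phi == antecedent_phi


john = {"person": 3, "number": "sg", "gender": "m"}
mary = {"person": 3, "number": "sg", "gender": "f"}
himself = {"person": 3, "number": "sg", "gender": "m"}

valued = value_locally(john, None)   # valuation copies John's phi-values
ok_link = link(himself, john)        # matching specifications: linking succeeds
bad_link = link(himself, mary)       # phi-conflict: linking fails
```

On this toy model, valuation can never produce a mismatch (the values are copied), whereas linking can fail, which is the asymmetry the text relies on.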
 
This requires that we view φ-properties as "there" in the output structure, and not just in the 
narrow syntax (workspace). But we have been presupposing this in the general outlook 
on these properties anyway, in terms of their assumed role in mediating θ-discharge. The 
linking relation, following Higginbotham, runs from a referentially dependent element to 
a referential one. Higginbotham's view assumed that such links get created in two ways: 
as a reflex of movement, and independent of movement (e.g., a variable bound by a 
quantifier would enter into such a linking relationship, though no one, as far as I am 
aware, has ever suggested that quantifier/pronoun relationships are of the movement 
sort). 
 So what are the structural conditions on logophoric linking? One general answer 
that has been offered in the literature is that, essentially, there are no such conditions. 
This was the basis for Castillo et al.'s (1999) rejection of the alleged binding intervention 
case in (220)d as an argument for SCM in raising. Rather, conditions on logophoricity are 
understood to rely on things like the following (see Sells 1987:445, and Williams 
1994:86): 
 
(232) Logophors connect with a logophoric center, which is an NP that must be a 
"thinker", "perceiver" meeting one of a-c: 
 
 a. The referent of the NP is "the source of the report" 
b. The referent of the NP is "the person with respect to whose consciousness 
the report is made" 
c. The referent of the NP is "the person from whose point of view the report 
is made" 
 
Note that no mention of structure is made. In fact, self-forms of this kind can appear 
without any structurally present antecedent at all: 
 
(233) As for myself, Paris is great this time of year 
 (i.e., "as for me/my-point-of-view, (I think),..X") 
 
Let us consider (220)c-e again, to be sure that these notions regarding logophoricity 
might not, after all, be put to work in explaining the central cases that I am taking to 
provide evidence for SCM in raising: 
 
(220) c. John₁ seemed to Mary to appear to himself₁ to be getting fat 
 d. *Mary seemed to John₁ to appear to himself₁ to be getting fat 
 e. It seemed to John₁ to appear to himself₁ that he was getting fat 
 
 
It is not at all clear that these notions make the right predictions. In (220)d, for example, 
the seeming and appearing are both to John. So it would seem that if there is a 
candidate logophoric center for (220)d, it is John and not Mary (i.e., Mary is the one 
"seeming"/"appearing" to be such-and-such, not the one that such-and-such is 
"seeming"/"appearing" to). The point-of-view criteria should pick out the experiencer, not 
the matrix subject, as the logophoric center. 
 But this then suggests that there are, after all, some structural conditions or other 
on these elements, in the sense suggested above: logophors are indeed keyed to 
extra-syntactic factors that influence where they may find their antecedents (including 
implicit arguments, or merely presupposed entities in the discourse), but in the right local 
environments with a referential NP, their connection to that NP appears to be mandatory. 
If this is right, then we really do have a good argument for SCM in raising. I will return 
to these issues below, as they bear on the issue of similar arguments as they arise in 
wh-movement. 
 What we have seen then is that (i) binding domains in terms of an accessible 
subject or the like don't seem to work properly for explaining this particular range of 
facts, (ii) movement of the self-form is also somewhat implausible, since the elements 
within dative-experiencers typically cannot undergo movement, and (iii) a possible 
explanation in terms of logophoricity doesn't seem to be able to make the right 
distinctions either. 
 In addition, although we didn't make a big fuss about it above, the following cases 
involving reciprocals don't obviously fall into the class of possible logophors, but 
nonetheless show all the same effects as the self-forms: 
 
 
(234) a. The boys₁ seemed to Mary to appear to each other₁ to be getting fat 
 d. *Mary seemed to the boys₁ to appear to each other₁ to be getting fat 
 e. It seemed to the boys₁ to appear to each other₁ that they were getting fat 
 
 So in the absence of some other story to explain these facts, we need SCM. The 
question now is what motivates movement to the intermediate positions? 
 I argued in Chapter 2 that we should be suspicious of an "M-feature" solution 
(e.g., so-called intermediate EPP-features postulated at the edges of embedded non-finite 
clauses). Other approaches that can handle these facts do so by brute-force stipulation that 
A-movement moves T-to-T (e.g., Bošković 2002, Grohmann 2003). 
 However, we have in our development of the SCM mechanics in TCG an account 
which relies on independently required notions of (i) formal ordering properties and (ii) a 
system of types. With these ingredients we stated our conditions on our workspace, and 
these can be understood to drive the intermediate movements without appeal to 
M-features. Recall from Chapter One the schema in (29) for SCM in raising environments (I 
repeat the relevant portion of the derivation here): 
 
(29) e. C?T
?:f
?V?T 
 
     D
?:f
?:n
 
 
(29) f. C?T
?:f
?V?T
?:f
   C?T
?:f
?V?T
?:f
 
 
     D
?:f
?:n
      D
?:f
?:n
      D
?:f
?:n
      D
?:f
?:n
 
 
And recal that this contracted workspace on the right-hand side of (29)f is realy just: 
 
(29) f'. C?T
?:f
 
 
     D
?:f
?:n
 
 
 
So without M-features we are able to derive the binding patterns above. However, it also 
seems that views which posit movement to the edge of every phrase would do just as well 
with these facts (e.g., see Takahashi 1994, and a recent revival of Takahashi's view in 
Boeckx 2003). Such super-cyclic views would also seem to do quite well regarding the 
distribution of "floated" all seen in English RtS: 
 
(235) a. The men all seemed to appear to be likely to leave 
 b. The men seemed all to appear to be likely to leave 
 c. The men seemed to all appear to be likely to leave 
 d. The men seemed to appear all to be likely to leave 
 e. The men seemed to appear to all be likely to leave 
 f. The men seemed to appear to be all likely to leave 
 g. The men seemed to appear to be likely all to leave 
 h. The men seemed to appear to be likely to all leave 
 
Moreover, these facts would perhaps be puzzling on the TCG view offered here, since 
SCM (the "lowering" effected by node-identification) is predicted to only involve the 
equivalent of Spec-TP positions. Therefore (235)c, e, f, and h are all problematic. 
 I am not going to pursue these matters here. I don't fully understand at present the 
wider array of facts; a thorough recent review of the relevant theoretical and empirical 
issues surrounding such "floated" elements (Bobaljik 2002) urges a kind of caution that 
time and space limitations do not allow me to respect here. I will note only that there is 
reason to doubt a "stranding" analysis in general and that at present it seems to me that 
base-generation analyses have the best empirical coverage, so it's not entirely clear that 
these facts bear directly on SCM. 
 Consider a few examples that bring up the kind of problems that arise (the 
following are drawn from Bobaljik 1995). Note that the stranding analysis seems to 
presuppose that elements like all make a well-formed unit with the DP that can strand 
them. But this isn't general; consider: 
 
(236) a. Some of the students might all have left 
 b. *All (of) some of the students might have left 
 
Although (236)a seems fine, all cannot surface as a unit with the DP, as the ill-formedness 
of the b-case shows. Another classic case that has been brought up as a challenge to the 
stranding analysis, and that is relevant to our A-movement discussion, is that of 
unaccusatives and passives: 
 
(237) a. *The men have arrived all 
 b. The men have all arrived 
 
(238) a. *The men were kissed all 
 b. The men were all kissed 
 
It is quite unclear why these positions should be out on the standard A-movement plus 
stranding idea. 
 Again, I will leave these issues to the side, but note that the matter is an important 
one: should it turn out that the stranding-style analysis is independently 
demanded, this would be inconsistent with the general intuition underlying TCG. In any 
case, I will leave the matter open here for future investigation, noting that these 
general types of facts could constitute a crucial set of cases that could strongly call into 
question the basic ideas proposed here. 
3.2.3. Variable Binding/Condition C Interaction 
Consider another case discussed in Chapter 2 (see the discussion there for references), 
showing an interaction between variable binding by a quantificational element and 
Condition C. This will lead us into our discussion of wh-movement below, as well as 
raising some general issues that I will leave open here. 
  
(239) a. *[His₁ mother's₂ bread] seems to her₂ _ to be known by every man₁ to be _ 
the best there is 
 
b. [His₁ mother's₂ bread] seems to every man₁ _ to be known by her₂ to be _ 
the best there is 
 
In the a-case, in order for his in the subject to be bound by every man, it must be 
interpreted in the more embedded position; but there it gives rise to a Condition C 
effect, so the reading on the provided coindexing for the a-case is impossible. However, 
if we switch around the QP and the pronoun, as in the b-case, the bound reading becomes 
possible, but this only makes sense if the interpreted position is below every man but 
above her. 
 The SCM type of analysis that our framework makes available can account for 
this pattern as well, though note that it requires that the "A-moved" expression [his 
mother's bread] must reside for interpretation in a non-thematic position. 
 As we noted in Chapter 2 (pointed out in Bošković 2002), the a-case on the bound 
variable reading is perfectly acceptable so long as we have disjoint reference between her 
and his mother. The combination of these observations suggests that intermediate 
reconstruction is possible, but not necessary. It moreover suggests that the output 
structure handled by the interpretative systems must be coherent (see Bobaljik 2002, 
Hornstein 2000 on this sense of LF coherence) in the sense that the moved phrasal 
complex must be interpreted as "in" one or the other position, but not both (as this would 
cause conflicts that would presumably correspond to unacceptability). However, it seems 
that how to understand this isn't entirely straightforward on the view we have been 
entertaining, nor is it straightforward on standard views. The question is 
why/how/when it should be possible to have a nominal in such raising environments be 
forced to be "interpreted" in an intermediate position. 
 Note that both variable binding and Condition C effects cannot, in general, be 
workspace-mediated relations on the TCG view. Like the cases of logophors discussed 
above, these are potentially long-distance relationships (actually "long-distance", not 
superficially so as in linked-local/SCM cases). However, unlike the logophor case, there 
appear to be structural factors involved: something in the vicinity of c-command is 
required, whereas this isn't an absolute condition on logophors. 
 We can now bring up the "re-spell-out" mechanism discussed in Chapter One in 
connection with a concrete case, in particular the combination of our anti-recursion with 
workspace connectedness (the latter is repeated here; see §1.4.1): 
 
(240) Workspace Connectedness (DOMINANCE): 
The elements in a given syntactic workspace must manifest a connected 
dominance order (for every x, y in the set, either x dominates y or y dominates x) 
 
Insisting that the workspace always maintain a fully connected dominance order yields 
the need to spell out (void from the workspace) every time branching occurs. 
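
As a rough illustration of how the condition in (240) forces spell-out under branching, consider the following sketch. It is not part of the TCG formalism: the function names are hypothetical, dominance is modeled as a set of ordered pairs closed under transitivity, and I assume for the example that the subject D is ordered below T.

```python
from itertools import combinations


def transitive_closure(pairs):
    """Close a set of (dominator, dominated) pairs under transitivity."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def is_connected(nodes, dominates):
    """(240): every pair of distinct nodes must be ordered by dominance."""
    return all((x, y) in dominates or (y, x) in dominates
               for x, y in combinations(nodes, 2))


# Workspace before V is added: C > T > D (assumed ordering for the sketch).
dom_before = transitive_closure({("C", "T"), ("T", "D")})

# Adding V below T creates branching: D and V are mutually unordered.
dom_after = transitive_closure({("C", "T"), ("T", "D"), ("T", "V")})

before_ok = is_connected(["C", "T", "D"], dom_before)
after_ok = is_connected(["C", "T", "D", "V"], dom_after)
# Spelling out D (removing it from the workspace) restores a connected order.
contracted_ok = is_connected(["C", "T", "V"], dom_after)
```

The third check models the contraction response: once D is voided from the workspace, the remaining elements again form a single chain.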
 For the raising case, this means that the subject, which in our top-down view 
begins the derivation in its putative surface position, must associate to T and then 
spell out when V is introduced. However, in virtue of (i) the feature-connection between the 
subject and matrix-T, and (ii) the introduction and identification of the embedded 
(defective) non-finite T in multiple raising constructions, the subject can, and in fact must, 
"re-enter" the workspace. Consider: 
 
(241)  C?T
?:f
 C?T
?:f
?V 
       ~DOM(D,V) & ~DOM(V,D) 
     D
?:f
?:n
    D
?:f
?:n
 
 
(242)  C?T
?:f
?V 
         ?          ? (CONTRACTION/SPEL-OUT) 
     D
?:f
?:n
 
 
(241) & (242) show the workspace both prior to and following the addition of V. By 
hypothesis the D-T relation has occurred, but the addition of V violates connectedness, 
since D and V enter into no ordering relationship. So D spells out (the workspace 
contracts to maintain a connected order). The connection between D and T is understood 
to be maintained in virtue of their featural (?/?) relationship. 
 Next, when defective T is introduced, we have the following: 
 
(243) C?T
?:f
?V?T   C?T
?:f
?V?T
?:f
?D
?:f
?:n
 
 
   D
?:f
?:n
       D
?:f
?:n
 
 
In (243) I have for convenience collapsed some steps of derivation. On the left we have 
the introduction of the embedded non-finite T. On our anti-recursion assumptions, given 
that T subsumes T_{φ:f}, the nodes are identified. In keeping with ordering consistency, this 
requires that the intervening V be spliced out of the workspace, and the T-T identification 
effects the reintroduction of the matrix subject (right-hand side of (243); I include this 
reintroduction on the horizontal line simply for presentational purposes, since left-right and 
top-down on the page both represent the single dominance relation). 
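
The identification-plus-contraction step just described can be sketched as follows. This is a deliberately simplified model under several assumptions of my own: the workspace is a top-down list of nodes, "subsumes" is taken to mean "same category and strictly less specified", and the names Node, subsumes, and extend are hypothetical. Contractions without identification and the reintroduction of the subject are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    cat: str                                   # category label, e.g. "T"
    feats: dict = field(default_factory=dict)  # valued features, e.g. {"phi": "f"}


def subsumes(general: Node, specific: Node) -> bool:
    """Assumed notion of subsumption: same category, and `general` carries
    a proper subset of the feature specifications of `specific`."""
    return (general.cat == specific.cat
            and len(general.feats) < len(specific.feats)
            and all(specific.feats.get(k) == v
                    for k, v in general.feats.items()))


def extend(workspace: list, new: Node) -> list:
    """Add `new` at the bottom of the chain. If `new` subsumes an earlier
    node, identify the two and splice out everything in between."""
    for i, old in enumerate(workspace):
        if subsumes(new, old):
            # The identified node keeps the earlier node's values.
            return workspace[:i] + [Node(old.cat, dict(old.feats))]
    return workspace + [new]


ws = [Node("C"), Node("T", {"phi": "f"}), Node("V")]

# Defective (non-finite) T: identified with matrix T, V is spliced out.
contracted = extend(ws, Node("T"))

# A second fully specified T does not stand in the assumed subsumption
# relation, so no identification takes place in this sketch.
not_identified = extend(ws, Node("T", {"phi": "f"}))
```

The result of the first call corresponds to the contracted chain C, T with the matrix values carried over; the second call reflects the idea that identification depends on subsumption rather than on mere category repetition.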
 
 The SCM-raising facts we have canvassed above demand that the entire LF-content 
of this D-element be in this lowered position. Two sets of questions arise.⁹⁷ First, 
is the LF-content in both the matrix and this new embedded position? Or does the ?-
material have to "collapse" to a single position? Second, why are such lowered elements 
never re-pronounced in either intermediate or base positions? 
 Taking the second question first, the right generalization appears to be that these 
D-elements are spelled out in the contexts in which they are initially ?-valued. This 
accords with the general intuition of there being a "PF"-function to such properties, but 
note that our story regarding ?-transmission given above for expletive/associate 
relationships then runs into some trouble. For example, the idea there was that there in 
raising constructions can relate to the structure by ?-matching even though valuation does 
not occur. In virtue of T-T contractions in raising, the expletive element was suggested to 
be lowered along with T, according to our general story about SCM. In fact, it is difficult 
to see how we could manage to avoid lowering the expletive given how we have treated 
regular nominal expressions in such contexts. However, if PF-spell-out is contingent on 
the context in which ?-properties are valued, then we expect one of two incorrect results 
for the raising constructions: either (i) expletives should appear in every intermediate 
position in raising (244)a, or (ii) they should only appear in the lowest such intermediate 
position (244)b: 
 
 
                                                
⁹⁷ Actually at least three sets of questions arise. The third pertains to the structure of the matrix subject for 
this example (e.g., [his mother's bread]). Recursion in the nominal domain is not something I have discussed at all here, 
but presumably this will involve two head elements coding possession relating to the nominal and pronominal. I am 
abstracting away from this important issue to concentrate on how information flows in these derivations along our 
equivalent of verbal extended projection sequences. 
 
(244) a. *There seems there to be likely there to appear there to be a man in the room 
b. *? seems ? to be likely ? to appear there to be a man in the room 
 
The problem lies in the way we have conceived of node-identification. T-T contractions 
result in the lower "defective" instances of T taking on the matrix ?/? values. I will 
postpone a sketch of the solution, as we will need to say something similar in the domain 
of WH-movement in our discussion in the next section. 
 Regarding the first issue raised above: what the TCG mechanics provide, I am 
arguing, is a natural way to understand why there ought to be intermediate-type effects of 
the SCM sort. I have argued that it is an attractive platform for studying these 
phenomena, and sketched some preliminary analyses in terms of one possible 
implementation. But the general account does not tell us everything; further development 
is required to understand the issues that arise in interpretation in cases like the one above 
(and others, see below). 
 The principles that govern reconstruction/connectivity type effects in A-movement 
are not well understood. Some have denied they exist entirely (e.g., see Lasnik 1999, 
Chomsky 1995), while others have countered that such effects do sometimes show up 
(Boeckx 2003, Wurmbrand & Bobaljik 1999) and that evidence to the contrary simply 
points to a lack of full understanding of the differences between the inventory of 
potentially movable elements, and does not bear on the general idea of SCM. 
 Some controversy exists over, for example, the status of examples of the 
following sort (this discussion draws on Wurmbrand & Bobaljik's 1999 presentation; the 
example is due originally to the work of May 1977): 
 
(245) Some politician is likely t to address John's constituency 
 
 
The claim about this case is that it manifests a scope ambiguity, with "some politician" 
taking scope from either the overt/matrix position or the embedded position from which, 
on standard views, it is taken to "raise". The ambiguity is thus with respect to the 
predicate "is likely", and in particular whether the existential introduced in the subject 
nominal is under or over this predicate scope-wise. For example: 
 
(246) ∃ > likely = "there is some politician who is likely to make the address" 
 likely > ∃ = "it is likely that there is some politician who will make the address" 
 
The ambiguity is clear, and a "copy"-type story, a version of which we have motivated 
here, can in principle account for it in terms of "interpreting" the nominal in either the 
upper or the lower position (or taking "ambiguity" here to mean that somehow both positions 
are occupied, so that we may flip back and forth mentally between the two). 
 Lasnik (1998) has argued, however, that this ambiguity can be explained without 
reference to scopal distinctions, but rather in terms of specificity. Consider: 
 
(247) Some politician addressed John's constituency 
 
This has a specific and a non-specific reading, where we may or may not (respectively) 
have a certain politician in mind. And this distinction corresponds to the ambiguity 
present in the raising case above. Lasnik's point is that scope ambiguities are Q-Q 
interactions (e.g., of the everybody loves somebody sort), and it's not obvious that there 
are such relationships at play in the raising case. But nonetheless it is possible to think 
about specificity differences in cases like the one in (247), where there are no issues 
regarding high/low positions from which to interpret an element (though perhaps the 
issue is best understood in terms of v/VP-internal/external positions, which could be 
involved in both (247)'s ambiguity and the raising one above). 
 Bobaljik & Wurmbrand (1999) have responded to this line of argumentation 
(as well as to some other challenges raised by Chomsky regarding A-traces/copies), but I 
won't go into this further here. Relevant here is their general conclusion: even if SCM in 
A-relations is totally general, it does not always lead to reconstruction/connectivity 
effects, but there are nonetheless some cases where such analyses seem to be required to 
understand the effects that do manifest. Here (above) I have concentrated on one main type 
of case involving interference effects in binding of self-forms, but there are other cases 
which bear on these matters that will require further attention, and which will be required 
to help sort out further details for the TCG-style analysis I am offering. 
3.3. Linked Local Relations II: Wh-Movement 
We can pick up the thread from the last section regarding variable-binding and obviation 
interactions by posing the following question: if we keep the raising construction in (239) 
the same in all other respects, but change the subject element housing the relevant NP and 
pronoun to a wh-phrase, do we see the same pattern as we saw for the A-movement case? 
3.3.1. κ-Identification & Local Movement 
Consider: 
 
(248) a. *[Which of his₁ mother's₂ pies] seems to her₂ _ to be known by every 
man₁ to be _ the best there is 
 
b. [Which of his₁ mother's₂ pies] seems to every man₁ _ to be known by her₂ 
to be _ the best there is 
 
 
On standard bottom-up derivational views such a wh-element would begin in the 
base/θ-position and A-move just like the NP in RtS, but the "last" movement would be to 
the C-domain to license the wh-properties. This case manifests exactly the same pattern as 
the ordinary raising case examined in the previous section. In particular, binding of the 
pronoun his by every man is possible without there having to be obviation between her 
and mother. This raises some questions on our view (though not on standard approaches). 
We have suggested A'-movement to be a dominance-encoded feature-licensing 
relationship, so the beginning of the derivation for either of the above cases would look 
as follows: 
 
(249) C
?:?
WH
?T
?:?
?:n
?V?T 
 
D
?:f
WH
?:?
 
The problem is that we have understood so far the relevant relationships to go as follows: 
 
(250) C
?:???:f
WH
?T
?:???:f
?:n
?V?T 
 
D
?:f
WH??
?:?
 
 
And then what effects the "raising" ("lowering") in such constructions is T-T 
identification. But this does not provide a mechanism for the content of the matrix 
wh-element to be lowered to the embedded edges of the non-finite complements, as these 
have by assumption been understood not to involve a C-layer. But the binding/scope 
interactions above seem to insist that this is what is required. 
 
 On standard views this is unproblematic: the wh-element begins in a θ-position 
and is raised from A-position to A-position (involving the edges of the non-finite 
complements), finally landing in the matrix φ/κ position (matrix T), and then A'-moving 
to C. Thus on a copy view within such standard assumptions we have no problem 
understanding either the raising case offered above or the wh-movement variant, since the 
latter kind of relationship includes the former. 
 We are now in a position to further specify the "variable" role that we are 
suggesting for κ-properties. Note that the traditional GB-era view that we discussed 
above treated κ-marked traces as "syntactic" variables, which on some implementations 
were understood to map to semantic ones. This is more or less what we have been 
presupposing in our discussion so far. However, it is possible given the current structure 
of our account to entertain a different claim: κ-properties are literally syntactic variables, 
in the following sense. 
 We have so far entertained the idea of φ-features indexing the open positions of 
thematic predicates. The idea here is that these elements are the syntactic side of relations 
to semantic variables (i.e., the open positions of θ). The path of φ-agreeing nodes in the 
structure was suggested to "lead to" a θ-position in regular A-relationships, and 
"κ-marked nominals" are connected in this way to θ. Suppose that the T-D relation 
resulting in κ being valued on D and deleted from T is a process, as we have been 
suggesting, that we might call ARGUMENT IDENTIFICATION. In the case of an overt nominal, 
say a subject, this mark signifies the element that is connected to θ via the sequence of 
φ-properties on the path. 
 
 Now consider the situation above. If κ serves to identify arguments, we might 
entertain the following possibility, similar to the node-identification discussed earlier: 
 
(251) C
?:f
WH?WH[?:n]
?T
?:f
?:n
?V  C
?:f
WH[?:n]
?T
?:f
?:n
?V 
 
D
?:f
?:???:n
   D
?:f
?:n
     D
?:f
?:n
 
 
In short, it looks like we need something like local movement after all. So far the only 
thing in our implementation that really resembled movement was the edge-to-edge 
lowering effected by context/node-identification. However, the suggestion here is that 
this is tied to the special role of Case as a syntactic variable. In virtue of the WH-feature 
valuation by the local κ-property, the wh-element will come to be κ-marked. The 
suggestion is then that in virtue of this identification, the C-related element comes to be 
dominated by T: κ-properties thus mark the entire unit, and where there is locally 
co-valued φ, there is essentially the same sort of effect that we see with categorial node 
identification. That this doesn't happen with κ-properties is thus a constitutive difference 
between these feature types. φ-properties define a local unithood (a stretch of co-valued 
nodes in a dominance sequence, a "chain"), and κ marks arguments that are then 
related by this chain to θ. 
 There may be other technical ways to implement a solution to this issue, but I will 
assume this one for the rest of this work. So, to sum up: κ-valuation marks arguments, and 
every occurrence of φ on the path is understood to dominate the κ-marked element. Note 
that D ends up in a derived relation with T, so that A- vs. A'-relationships involving D-T 
are distinguished by the absence/presence of κ on T (respectively). 
 
 With these mechanical assumptions the regular T-T node identification procedure 
discussed above for the raising cases will now function to lower (the copy of) the 
wh-element to each non-finite complement edge, thus yielding a structure that accounts for 
the possibility of the variable-binding/Condition C interactions for (248). 
3.3.2. Core Cases of A'-SCM (& Some Technical Problems Addressed) 
The general structure of the TCG account of SCM carries over to the core cases of 
wh-movement more or less straightforwardly. However, we are now in a position to re-raise 
and discuss some possible answers to some technical issues left open earlier. In 
particular, we considered the possibility in Chapter One of having a "Make-OP" style 
operation built into our WH-feature licensing mechanism: basically, C-WH deletes 
D-WH, leaving this property only on C. This suggests being able to treat the "residue" 
of such a deletion on D as a (potentially complex) variable-like element. 
 However, the mechanics of node-identification and contraction suggest that we 
should understand this WH-property as copied to all the lower C-nodes in our SCM 
analyses, in virtue of the identification which makes possible the lowering of the actual 
Dwh (now a "residual" structure interpreted as a variable). We contrasted these two 
schemas in §1.4.1 ((54) & (55), repeated here): 
 
(252) C
WH[?:?]
?..?C
WH[?:?]
?..?C
WH[?:?]
?..? 
 
D
?:?
         D
?:?
         D
?:?
 
 
(253) C
WH[?:?]
?..?C?..?C?..? 
 
D
?:?
         D
?:?
    D
?:?
 
 
 
We noted that it is really the latter, and not the former, that we want, though as we have 
stated things the former is the one that the TCG derivations seem to produce. 
 Let us consider then a somewhat subtler formulation of the process of 
node-identification. What we want is for the process to yield an identification that will 
justify the "re-entering" into the workspace of the dominated, "to-be-moved" element. 
However, we want this to proceed without a copying of all of the information associated 
with the upper element, while the upper-element properties remain "visible" in the 
workspace, so that the lowering can result in a local structure where licensing occurs. 
 Note that this issue concerns both the A'- and A-relations. The issue arose for 
A-relations above with respect to having κ-properties appear in all embedded 
positions and our suggestion that κ-licensing could be understood (for A-relations) to 
indicate the point in the structure where an element is pronounced. But for expletive there 
we suggested that this element was precisely one that did not allow local κ-valuation, and 
so it must be carried along to embedded contexts in our version of the A-type of SCM. 
 The idea then for an alternative view of node-identification would be to say that 
the features associated with a node X are "fixed" with respect to the output structure. 
Whatever the nature of the connection between (e.g.) WH-properties and the ?-vocabulary 
associated with that node, that relation keeps those properties fixed to that initial position 
as it is determined when the element enters the derivation. 
 Node identification can then still occur, under the same general conditions of 
subsumption as we have been assuming. But while this will identify positions in the 
workspace, it will not "copy" the relevant features to the lower position. To see the idea, 
conceive of our workspace/output distinction in terms of separate layers or tiers of 
structure. Consider: 
 
(254) X
[F]
?..?X?..?X?.. 
 
 X
?
?
 
 
              CONTRACTION? CONTRACTION? 
 X
[F]
?..?X
[F]
?..?X
[F]
?.. 
 
 X
?
?
 
 
I mentioned earlier on in this discussion (see Chapter One) that the WS/O-distinction 
allowed a way of thinking of "many" in the output structure as "one" in the workspace. 
This is a situation where the mapping is required to be one-to-one for any given stage. At 
the "end" of a derivation, the relevant ?-properties will be connected to lower variables 
via the mechanisms we have been developing above, but the syntactic information itself 
is "fast and fleeting". It is available for local domain construction within the workspace, 
and is, in situations allowing node-identification, permitted to be "carried over" to lower 
domains, but once the derivation is completed the workspace itself is gone, and so are the 
formal properties contained within it (that ? and ? information is connected to). 
 So, how then do we end up with "one" in the workspace corresponding to "many" 
in the output structure? The idea here relies on the "re-spell-out" mechanism discussed 
earlier. If the workspace is constrained by the connectedness requirement, insisting 
essentially that there only be a single dominance sequence at any given stage, then 
branching requires spelling-out (removal from the workspace). However, we are viewing 
the node-identification procedure as preserving output structure relationships so long as 
they do not introduce local ordering conflicts in the workspace. This means that any 
element Y that may be associated with X in our schema above in (254) will be 
"re-entered" into the workspace in virtue of identification of X's. And as further structure 
is added they will have to "re-spell-out". This yields multiples in the output for the 
associated Y-elements. But notice that no such "leaving" and "re-entering" is required for 
the X-elements, as these never cause a problem for the connectedness condition (they are 
always present in the path). 
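
The "one in the workspace, many in the output" bookkeeping can be made concrete with a small simulation. The following Python sketch is only an illustration under simplifying assumptions that are not part of the proposal itself: the path is a flat list of labels, the function name derive and the labels X, Y, a, b, c are hypothetical, and node-identification is reduced to contracting the workspace back to the earlier X.

```python
def derive(path, carrier="X", associate="Y"):
    """Expand a dominance path top-down, one node at a time.

    Assumption: the workspace holds a single dominance sequence
    (connectedness), so a branching associate must be spelled out
    as soon as the path is extended past its host.
    """
    workspace = []   # the single active dominance sequence
    output = []      # spelled-out structure, as (host position, label) pairs
    pending = None   # associate currently sitting in the workspace, if any

    for position, label in enumerate(path):
        if pending is not None:
            # Adding structure below the host forces the associate out.
            output.append(pending)
            pending = None
        if label == carrier:
            if carrier in workspace:
                # Node identification: contract back to the earlier X.
                workspace = workspace[: workspace.index(carrier)]
            workspace.append(label)
            # Identification (re-)enters the associate next to this X.
            pending = (position, associate)
        else:
            workspace.append(label)

    if pending is not None:
        output.append(pending)
    return workspace, output


workspace, output = derive(["X", "a", "X", "b", "X", "c"])
```

On the path X a X b X c, the sketch leaves a single X in the final workspace, while the output records one occurrence of Y for each X-position.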
 However, we noted also in our Chapter One discussion that any such mechanism 
that would insist on "reintroducing" spelled-out material in virtue of the node 
identification process could potentially run afoul of our anti-recursion conditions, as such 
spelled-out branches could be arbitrarily complex. Suppose instead that the initial feature 
licensing relationship which holds of the top-most element is sufficient to evoke the 
"lowering"; that is, what matters to this process is the initially established agreement 
(φ) relationship, and this information is "carried over" to lower domains in virtue of 
successive node identifications. Then, instead of reassigning syntactic/categorial 
information to such lowered complexes, we can say that some minimal information is 
assigned, perhaps just D and the relevant ? information, or perhaps just ?. The result is 
that the lowering that attends node identification re-introduces only an index of sorts 
which we take to dominate just LF-relevant vocabulary. 
This allows us to keep with the idea of "pronouncing" elements in A-relations 
where the relevant κ-property is, without the problem of expletive-there spell-out raised 
above. And it gives us the structures we want for wh-movement, with local copies of 
variable-like elements (e.g., WH(x)..[x mother]..[x mother]..[x mother], in whose 
mother did John think Bill knew Sarah met, etc.). 
 
 Now consider some of the binding-connectivity effects discussed in Chapter 2. 
For example: 
 
(255) a. Which pictures of himself did John know Bill wanted? 
b. [which..himself] did John know [which..himself] Bill wanted [which..himself] 
 
The ambiguity present here we can now attribute to the TCG derivations of SCM effects, 
again, like the raising cases, without the postulation of special features (M-features) 
driving the individual movements. The self-element ends up in local relationships without 
interveners with both of the possible antecedents. 
 This helps to understand cases such as those discussed in Chapter One as well, for 
example: 
 
(256) a. John thought pictures of himself/*herself were on sale 
b. Which pictures of himself/*herself did John think were on sale 
 
(257) a. ?John thought Mary sold pictures of himself 
b. Which pictures of himself did John think Mary sold 
 
And reciprocal elements distribute in basically the same way, as is well known: 
 
(258) a. The boys thought pictures of each other were on sale 
b. Which pictures of each other did the boys think were on sale? 
 
 
(259) a. ?The boys thought I sold pictures of each other 
b. Which pictures of each other did the boys think I sold? 
 
Impossible bindings across intervening elements suggest that a "direct" relationship 
between the matrix wh-phrase and the base position is insufficient. The well-formedness 
of the b-cases in (257) and (259) can be accounted for if there is a local relationship to the 
phrase containing self/each-other, and this is what the SCM-style analysis provides. 
 
 Note as well that where we have suggested that intermediate C-nodes are absent, 
as with the complements of raising predicates, we do not have a locally licensed "copy" 
that could enter into the relevant binding relations. The following examples (Abels 
2003:30) illustrate: 
 
(260) a. *Which picture of himself did Mary seem to John (Mary) to like e 
 b. Which picture of himself did it seem that John liked e 
 c. Which picture of himself did Mary think John wanted (John) to pack e 
 
As Abels observes, the a-case supports the idea that raising infinitives are not CPs, since if 
they were they would provide a potential landing site that would put the wh-phrase within 
the local environment of the NP (John) that could be a binder. Where we have evidence 
for CPs, as in the b- and c-cases, we also have the possibility of binding the self-form.98 
 Note as well that under the contraction view of SCM we in general expect nested 
dependencies of the sort predicted by Path Containment approaches, pioneered in the 
work of Pesetsky (1982), Kayne (1984), May (1985) and others. 
 This is a quite general property of multiple "like" dependencies. Consider the 
following familiar sorts of cases from Pesetsky (1982): 
 
(261) a. What books do you know [ who [ PRO to persuade e [PRO to read e ]] 
  
 b. *Who do you know [what books [ PRO to persuade e [PRO to read e ]] 
 
                                                
98 This a-case above also reveals a parallelism with a comparable copy-raising construction, 
suggesting that these do not involve CPs either. 
a. *Which picture of himself did Mary seem to John like she wanted e 
b. Mary seemed to John like she wanted pictures of herself 
c. *Mary seemed to John like she wanted pictures of himself 
 
We see the same kinds of nesting versus crossing effects across the range of 
A'-movement relationships, including within the structure of relative clauses, in 
topicalization, infinitival relatives, tough-movement, too/enough-movement, and 
comparatives (see Pesetsky 1982:269 for examples). 
 On the general structure of the account, it is worth pointing out that C-C nodes in 
the workspace will be unable to be identified in workspace contraction if subsumption 
does not hold. So, if we understand interrogative embedded complements as being 
specified for WH, then this inability to contract/identify can yield for us an account 
of wh-islands. 
 
(262) ?Who did John wonder whether Mary liked _ ? 
 ? Who did John wonder who Mary liked _ ? 
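
The dependence of C-C identification on subsumption can be illustrated with a minimal Python sketch. It rests on assumptions that go beyond what is stated here: nodes are modelled as a category plus a dictionary of valued features, subsumption is taken to hold when the lower node carries no valued feature that the upper node lacks, and the feature values ("who", "whether") as well as the names Node, subsumes and can_contract are hypothetical labels chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    category: str
    features: dict = field(default_factory=dict)  # feature name -> value


def subsumes(upper: Node, lower: Node) -> bool:
    """True if `lower` adds nothing that `upper` does not already carry.

    Assumption: identification needs matching categories, and every
    valued feature on the lower node must be present with the same
    value on the upper node.
    """
    if upper.category != lower.category:
        return False
    return all(
        name in upper.features and upper.features[name] == value
        for name, value in lower.features.items()
    )


def can_contract(upper: Node, lower: Node) -> bool:
    """C-C identification (and hence contraction) requires subsumption."""
    return subsumes(upper, lower)


# Hypothetical feature values, chosen only to make the contrast visible.
matrix_c = Node("C", {"WH": "who"})
bridge_c = Node("C")                       # e.g. complement of "think"
whether_c = Node("C", {"WH": "whether"})   # interrogative complement

bridge_ok = can_contract(matrix_c, bridge_c)     # contraction available
island_ok = can_contract(matrix_c, whether_c)    # contraction unavailable
```

Under these assumptions, a featureless intermediate C can be identified with the matrix C, whereas an embedded C with its own WH specification cannot, which is the configuration the text associates with wh-islands.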
 
Moreover, if we take the identification of C-nodes to be sensitive to a more general 
category of operator elements, then we can extend the "impossible contraction" story to 
other classes of so-called non-bridge verbs, like factives for example (see Frank 2002 for 
some discussion along these lines). 
 There are, however, cases we mentioned in Chapter Two which suggest that SCM 
is "more cyclic" than our view predicts, in particular facts that suggest that the "edge" of 
v/VP is an intermediate landing site. I will return to these cases below, after we have 
discussed some possible extensions of the general architecture to local domains. The 
situation we are in with respect to evidence for a v/VP-level intermediate movement is 
fairly straightforward: we are forced to posit more structure within local domains in order 
for the general approach to yield the facts. 
 
3.4. Local Relations: Part Two 
In this section I return to some of the issues raised at the beginning of this chapter 
regarding local relationships for simple transitives. There we noted that our 
feature-valuation mechanics seemed to require bi-directional valuation on the dominance 
ordering (to allow both upward ?-valuation and downward ?-valuation in D-T relations, 
for example), but that the absence of locality restrictions suggested a chance for chaos 
when more than one ?-element would be in the same local domain. Here I pursue the 
possibility that this situation never arises. 
3.4.1. Raising-to-Object (RtO) 
Consider: 
 
(263) John believes him to be a genius 
 
There are two main lines of thinking regarding these constructions, which differ with 
respect to how the accusative-marked element (him) is viewed with respect to the 
matrix/embedded clause boundary. The choice of analysis typically swings with the 
claims made about the categorial status of the embedded infinitival. On the one hand, 
there is the idea that him is in the lower clause, and that there is an "exceptional" process 
which converts the categorial status of the embedded structure from CP to TP (S' to S in 
traditional terms; see Chomsky 1981). On the other hand, there is the idea that these cases 
involve raising to an "object" position (Postal 1974). In modern views that have 
resurrected this Raising-to-Object (RtO) view, the categorial type of the embedded clause 
is usually taken to be TP, and much is made of the similarities of these cases to the RtS 
sort of NP-movement discussed above. 
 
 A number of factors favor the RtO type of analysis;99 here I will name just a few. 
First, passivizing believe targets this embedded ECM'd element, strongly suggesting 
matrix objecthood since, much like the impossibility of raising out of such contexts, 
subjects of lower CPs clearly cannot undergo this process: 
 
(264) a. He is believed to be a genius 
 b. *He is believed that _ is a genius 
 
Second, binding-theoretic conditions apply to this element as if it is a matrix object, and 
not like a lower subject: 
 
(265) John₁ believes himself₁/*him₁ to be a genius 
 
However, the meaning equivalence of the following two cases insists that we understand 
the ECM'd nominal to be the thematic subject of the lower clause:100 
 
(266) a. John believes Dave is a genius 
 b. John believes Dave to be a genius 
 
But, on the other hand, binding conditions differ in their effects for these two kinds of 
complements, which again suggests that the ECM'd nominal is in the higher clause: 
 
(267) a. John₁ believes he₁/₂ is a genius 
 b. John₁ believes him*₁/₂ to be a genius 
 
The combination of these facts (participation in the formal processes of matrix objects 
but thematic association to the embedded material) strongly suggests a raising-style 
account. 
                                                
99 See Johnson (1991), Koizumi (1993, 1995), Lasnik (1995), Runner (1995), Bobaljik (1995) for recent 
discussions. 
100 Rosenbaum (1967), among others. For summary discussion and further references see Runner (to appear). 
 
Another standard argument includes reference to other effects of hierarchical 
position as indexed by binding/scope possibilities, which suggest that the ECM'd nominal 
is in the higher clause (Lasnik & Saito 1991; Postal 1974):101 
 
(268) a. ?The DA proved the defendants₁ to be guilty during each other's₁ trials 
 b. *The DA proved that the defendants₁ were guilty during each other's₁ trials 
 
Postal's classic RtO analysis of these constructions, which seems to do pretty well with 
the facts, was imported into current approaches via the assumption that 
objective/accusative Case assignment is essentially like that of subjects, and involves 
movement from a base thematic position to a specifier position. In Chomsky (1991), 
Lasnik & Saito (1991), and Johnson (1991), among others, this was taken to involve an 
object-related Agreement head (AgrO). In more recent work (Chomsky 1995 and 
subsequent) the notion of separate agreement heads has been called into question.102 
However, the general idea of the RtO-analysis can be pictured as follows, where the 
nominal is understood in the general case to raise to some functional category F below 
matrix T to license its Case properties, as in (269). So the implementation of 
Postal-style RtO then looks like (270): 
 
(269)     FP 
 
      NP      F' 
 
           F⁰     VP 
 
               V⁰     t 
 
 
                                                
101 As is well known, similar effects of hierarchy hold for Condition C, negative polarity licensing, etc. 
102 Though see Belletti (2001). 
 
(270)     FP 
 
      NP      F' 
 
           F⁰     VP 
 
               V⁰     TP 
 
                   t      T' 
 
                       T⁰     .. 
 
As a last review note on these constructions, an analysis like (270) also fits nicely with 
the various "subject-like" properties exhibited in ECM, as witnessed by expletives and 
idiom pieces in these positions, paralleling the properties of athematic positions in 
raising-to-subject constructions: 
 
(271) a. I believe there to be a moron in the White House 
 b. I believe it to be the case that Dave left 
 c. I believe it that Dave left 
 d. I believe the shit to have hit the fan 
 
So suppose then that something like the object-raising story is correct. How can we 
capture it in our view of contraction? Notice that without some assumption about a higher 
functional element responsible for accusative Case in ECM, we predict that (272) should 
be a subject-raising situation if the infinitival structure is a TP. 
 
(272) John believed him to like carrots 
 
This ought to manifest a sequence of elements and a phase structure like (273) if there is 
no intervening functional element of the right sort to block the contraction: 
 
(273) C?T?v?V?T?v?V ?(MATRIX-T/EMBEDDED NON-FINITE T IDENTIFY/CONTRACT) 
 
   D        D 
 
 
There are two possibilities. We could consider something like the old CP/S' plus 
"deletion" (i.e., ?TP/S) analysis of Chomsky (1981), in which case the T-T contraction 
illustrated above would be blocked as desired: 
 
(274) C?T?v?V?C?T?v?V 
 
   D           D 
 
But this would be to lose all the nice properties of the object-raising analysis sketched 
above. Plus, it's not clear how we could possibly implement the notion of S'/CP-deletion 
in this system, since that would put us right back in the same situation we started with 
regarding the undesirable contraction above in (273). 
 Another alternative, one consistent with the story we have told about RtS, would 
be to claim that there is a matrix-object-related T node above V, but below the C-T-v 
subject argument complex. 
 Pesetsky & Torrego (2004) suggest such an account within their general attempt 
to connect the presence/absence of T with Case-theory generally. Their structure for 
simple transitives is thus: 
 
(275) [CP C⁰ [TP T⁰ [vP v⁰ [TP T⁰ [VP V⁰ ..]]] 
 
This allows us to consider the possibility that the matrix clause is really hiding a bit 
more structure. And this story would then allow us to view object-raising as T-T 
contraction, exactly as we did above for subject-raising, but now with "object-related" T 
contracting with the embedded infinitival. 
 
(276) C?T?v?T?V?T?v?V 
 
   D     D 
 
 
Note that our view of structural contraction forces this analysis on us. I find this 
interesting since this sort of iterated clause-internal "mini-clause" structure has been 
explicitly argued for by a number of authors under the label of the so-called Split-VP 
Hypothesis. 
 The general idea behind Split-VP includes the now fairly widely adopted view of 
separating/dividing the lexical shell of verbal domains into a core verbal element V (the 
ultimate head) and a small-v element, understood to introduce an external argument. 
 
(277) [vP v⁰ .. [VP V⁰ ..] 
 
Koizumi's (1993, 1995) notion of Split-VP has it that these verbal elements are separated 
into distinct zones in virtue of the existence of one (or more) intervening functional 
elements (F-heads): 
 
(278) [ .. [vP v⁰ .. [FP Fₙ⁰ .. [FP F₁⁰ .. [VP V⁰ ..]] .. ] 
 
There is a diverse array of analyses evoking Split-VPs in this sense in the literature, and 
while the exact nature of these intervening functional elements is by no means settled, 
there appears to be something of a growing consensus that some such division/separation 
approach may be correct. Candidate types evoked to label these intervening functional 
elements between the separated θ-elements (v and V) include Agr(eement), a lower 
T(ense), Asp(ect), a lower instance of C(omp), among others.103 Lasnik (1995, 1999) 
includes arguments based on the properties of pseudogapping that support the idea of 
Split-VP and overt object and verb movement in English. Runner (1995) contains 
arguments along these lines as well. I will not review these arguments here, and I will 
also not be discussing head movement in this thesis. But the conclusion which I wish to 
extract from this is that the iterated mini-clause analysis that our view of structural 
contraction appears to force upon us is by no means unprecedented, and in fact has a fair 
amount of independent empirical support. 
 
103 Many others, actually. Work in the minimalist program has seen no shortage of proposals arguing for 
functional category distinctions. I will be working with fairly blunt tools in this regard, but as mentioned earlier, the 
efforts in analysis which this thesis aspires to are mainly in service of developing a clear and plausible picture of the 
theoretical ideas and the consequences for general architecture. 
 
 Let's consider this a bit more. What prevents T-T contraction then within the main 
clause, so that external and internal arguments might either become confused (as we 
worried about earlier) or inappropriately identified? Note that the assumption here would 
be that the two T-elements within a single clause would be distinguished by κ-properties, 
as follows: 
 
(279) C?T
?:f
? v
?[?:?]
?T
?:f
?:n
?V
?[?:?]
 
 
    D
?:f
?:n
 
 
The anti-recursion condition on workspaces will force the "subject" T-v structure to be 
spliced out when the second T is introduced. This has the welcome outcome that it 
appears to solve the difficulties we raised at the outset of this chapter regarding the 
locality of valuation. We have essentially imported the structure of the account now into 
the local domain of single clauses. 
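
The effect of the anti-recursion condition on clause-internal sequences can be sketched in Python. This is a toy model under explicit assumptions, not the mechanism itself: nodes are (category, Case) pairs, with "n" and "a" as hypothetical stand-ins for the Case values of subject-related and object-related T and None for a T carrying no Case value; the identification test in can_identify is an assumed simplification of subsumption; and the function names are illustrative.

```python
def can_identify(upper, lower):
    """Assumed test: a lower T with no Case value, or with the same Case
    value as the upper T, can be identified with it."""
    return lower[1] is None or lower[1] == upper[1]


def expand(sequence):
    """Enter (category, case) nodes into the workspace top-down, one at a time.

    A repeated category may not stay in the workspace with its earlier
    occurrence (anti-recursion). Either the two are identified and the
    stretch between them is spliced out, or the earlier occurrence is
    spliced out together with that stretch.
    """
    workspace, log = [], []
    for node in sequence:
        positions = [i for i, old in enumerate(workspace) if old[0] == node[0]]
        if positions:
            i = positions[-1]
            upper = workspace[i]
            between = [old[0] for old in workspace[i + 1:]]
            workspace = workspace[:i]
            if can_identify(upper, node):
                log.append(("identify", between))
                node = upper  # the lower node is identified with the upper one
            else:
                log.append(("splice", [upper[0]] + between))
        workspace.append(node)
    return workspace, log


C, v, V = ("C", None), ("v", None), ("V", None)
T_subject, T_object, T_defective = ("T", "n"), ("T", "a"), ("T", None)

# Simple transitive: C T v T V
transitive = expand([C, T_subject, v, T_object, V])

# Raising to object: C T v T V T v V
rto = expand([C, T_subject, v, T_object, V, T_defective, v, V])
```

In the transitive sequence the two instances of T differ in Case, so the subject-related T-v stretch is spliced out rather than identified. In the raising-to-object sequence the same splice occurs first, and the embedded defective T is then identified with the object-related T, contracting over the intervening V.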
 So we now view ECM as RtO, parallel with our RtS derivations from earlier 
discussion. For example: 
 
(280) C?T
?:f
? v
?[?:?]
?T
?:g
?:a??
?V?T 
 
    D
?:f
?:n
         D
?:g
?:???:a
  
 
 
Assuming that the complement of "ECM verbs" (now using the term as a descriptive 
label) is headed by a "defective" form of T, we would have the same node-identification 
and contraction for the object-raising in (280) as we did for the subject cases. 
 In the next section I suggest that this view of clause-internal structure can be put 
to work to yield some interesting properties of passives, and make some tentative 
suggestions regarding local binding and control phenomena. 
3.4.2. Passives, Local Binding, & Control 
Note that the root-first directionality for the emergence/expansion of these sequences of 
categories is crucial. On either a representational or bottom-up derivational view, it 
would be possible to view the two instances of T in these structures as undergoing 
contraction, resulting in the following structure where we have T-contraction 'over' an 
intervening C:
104
 
 
(281) C?T?v?C?T?V 
 
On the root-first view, (281) presents phase conflicts with those demanded by the 
necessary C-C contraction, which we can see clearly by superimposing the two: 
 
(282) C?T?v?C?T?V 
 
Assuming for the moment that items are entered into the workspace one-at-a-time with 
the assumed direction/ordering given above, this means that C-C contraction will always 
block T-T. However, we predict on this one-at-a-time view that in the absence of the 
intervening C-element we should see instances of T-T identification & contraction if there 
are no properties of these elements to distinguish them (so subsumption does hold). 
                                                
104
 Although in a representational implementation we wouldn't view this as literal contraction in the sense I 
have been entertaining, but rather just some notion of domain that would be defined over "like elements". What I am 
arguing here is that only on the root-first view do we get the right sort of domains. 
 
(283) C?T?v?T?V 
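
The blocking logic just described can be made concrete with a small toy model. This is only a sketch under simplifying assumptions that are not part of the text: categories are bare labels, any repeated label counts as a "like element", and contraction is modeled as removing the span from the earlier like element downward (the names `Workspace`, `push`, and `derive` are hypothetical).

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    """Toy workspace: a root-first dominance sequence of category labels."""
    active: list = field(default_factory=list)  # labels currently visible
    output: list = field(default_factory=list)  # spans already spliced out
    events: list = field(default_factory=list)  # which contractions fired

    def push(self, label: str) -> None:
        # Anti-recursion: a repeated label forces a contraction first.
        if label in self.active:
            i = self.active.index(label)
            # Assumption: the span from the earlier like element downward
            # leaves the workspace; the new element then takes its place.
            self.output.append(self.active[i:])
            self.active = self.active[:i]
            self.events.append(f"{label}-{label}")
        self.active.append(label)


def derive(sequence):
    """Enter items one at a time, root first."""
    ws = Workspace()
    for label in sequence:
        ws.push(label)
    return ws


with_c = derive(["C", "T", "v", "C", "T", "V"])   # cf. (281)/(282)
without_c = derive(["C", "T", "v", "T", "V"])     # cf. (283)
print(with_c.events)     # ['C-C']
print(without_c.events)  # ['T-T']
```

In this toy setting the second C removes the first T before the second T ever arrives, so no T-T event is recorded; without the intervening C the T-T contraction fires and the span containing v is what leaves the workspace.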
 
This, I propose, is exactly what happens in the case of passive. Absence of ? on 
object-related-T could be seen to drive a kind of "clause-internal" raising on these assumptions. 
Suppose that objective T can enter the derivation without ?/?-properties. 
 
(284) C ? T_{?:f} ? v_{?[?:?]} ? TV
          D_{?:f, ?:n}
 
 
The idea of the instance of T-T contraction happening over the element v instantiates the 
idea that when accusative Case is absent, the external role could be seen as essentially 
spliced out of the active stretch of derivation without being φ-valued. This is, I submit, 
the present but unexpressed external θ-role in passives. 
 Such a possibility suggests that θ-roles need not, strictly speaking, be assigned. 
Left open (not closed off by the introduction of a local satisfied Case property) they 
function as implicit arguments. This view requires however that in general θ cannot be 
φ-valued prior to the relevant T-T identification. Suppose then in general that θ differs from 
the other properties we have discussed in that it is valued when it is removed from the 
workspace. On this view v_θ (or any θ-element) can only be properly φ-valued if it is 
"spelled-out" together with a superordinate specified φ-property. I assume this in what 
follows; to be explicit: 
 
(285) LAST RESORT φ-VALUATION: θ[φ:?] is valued at Spell-Out 
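
One way to picture (285) is as a valuation step that runs only when a span leaves the workspace. The following is a minimal sketch on one reading of the condition, namely that the agreement slot of a thematic element is filled by the closest superordinate specified value spelled out in the same span. The `Node` class, the `is_theta` flag, and `spell_out` are illustrative assumptions, not part of the proposal itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str
    phi: Optional[str] = None  # a specified agreement value such as "f", or None
    is_theta: bool = False     # True for thematic elements such as v or V


def spell_out(span):
    """Value thematic elements only as they leave the workspace."""
    result = []
    current_phi = None  # closest superordinate specified value in this span
    for node in span:   # the span is ordered root-first
        if node.is_theta:
            # Last resort: valued only by a value spelled out together with it.
            result.append(Node(node.label, node.phi or current_phi, True))
        else:
            if node.phi is not None:
                current_phi = node.phi
            result.append(node)
    return result


# Spelled out together with a specified T: the role gets valued.
active = spell_out([Node("T", phi="f"), Node("v", is_theta=True)])
# Spliced out on its own (the passive scenario): the role stays unvalued.
passive = spell_out([Node("v", is_theta=True)])
print(active[1].phi)   # f
print(passive[0].phi)  # None
```

The unvalued result in the second case is meant to correspond to the present but unexpressed external role discussed above.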
 
 
This isn't a completely wild speculation: the general idea is that we regard θ-relationships 
(the ultimate "integration" of nominal elements into the emerging event 
structure) as a matter of the interface mapping. Here this is our WS/O-distinction, so it's 
not unreasonable to locate our version of θ-connections to this particular mapping (WS to 
the "LF-relevant" properties of the derived output structure). 
 Note that we had to assume in earlier discussion that ?-properties are parasitic on 
?-properties and their valuations. But we discussed a few cases where we wanted 
?-relationships to be able to hold independently (e.g. the secondary predication case in the 
beginning of this chapter; the small clauses where associates of expletive-there may be 
found, etc.). What happens if there are ?-properties on object-related T, but no 
?-properties? 
 In such cases, I suggest, we have the possibility of connecting internal and external 
(or v and V) θ-properties. Such a structure is attractive for thinking about the properties of 
so-called inherently reflexive (286) and reciprocal (287) verbs. 
 
(286) a. John washed/bathed/shaved 
 b. John washed/bathed/shaved himself 
 
(287) a. The women met/kissed/hugged 
 b. The women met/kissed/hugged each other 
 
Though this view requires that we posit an additional layer of functional structure. 
Consider: 
 
(288) C ? T_{?:f} ? v_{?[?:?]} ? T_{?:?} ? V
          D_{?:f, ?:n}
 
 
 
Any ?-relationship between T-T would result in the splicing-out of v situation suggested 
above for passivization. How then could we manage local connections between 
arguments of the sort that manifest in the inherent reflexives/reciprocals? 
 Suppose we take the idea of iterated clause-internal structure around θ-elements a 
step further, and view the C-T-V type structure as the general way that functional 
elements cluster around lexical ones, resulting in a full "stacking" view, to borrow 
terminology from Bobaljik (1995). He contrasts split-VP type architectures with a 
"leap-frogging" view. Consider an illustration. Take the light nodes to be roughly θ-related, grey 
to be ?/?-related, and the dark nodes to be operator/A'-related: 
 
(289) a. [diagram: "leap-frogging" architecture]   b. [diagram: "stacking" architecture]
 
The a-view is the one which makes for a leapfrogging view of domain relationships. This 
is a fairly common perspective. Grohmann (2003) enshrines roughly this C-T-V kind of 
division in his "prolific domains" view of clausal architecture, where each domain 
potentially decomposes into more fine-grained inventories of categories. Roughly, there 
is a domain where all thematic relations are computed, and this maps (by movement) to a 
domain where ? and ? and the like are licensed, and then these map (by movement) to a 
higher domain involving relationships of the A'-sort, including perhaps discourse related 
functions (e.g., topic and focus and the like in a Rizzian "split-CP" view; see Rizzi 
1997; see also Platzack 2000 for a view similar to Grohmann's). 
 
 In Bobaljik's terms, this results in "leapfrogging" type movement relations to map the 
elements from the lowest to the highest domains: 
 
(290) a. [diagram]   b. [diagram]
 
Movement relations needn't necessarily always be uniform in the fashion pictured above 
on the leapfrogging view. Note that stranding elements in different domains in these 
movement relationships yields the degrees of freedom to describe various different 
word-order, scope/binding relations, and the like (e.g., depending also on the issues of the sort 
mentioned earlier in our discussion of the WS/O-distinction regarding where one keeps 
the ?- versus the ?-relevant information). 
 The present architecture could be seen as embracing the same general vision of 
functional divisions, but with a different view about how they are organized and come 
together. The relations indicated in the a-view above correspond to the following ones in 
the b-view: 
 
(291) a. [diagram]   b. [diagram]
 
In addition we have suggested certain limited ways that the b-type domains can interact 
across their boundaries. That is, there is a limited range of relationships of the following 
type: 
 
(292) [diagram]
 
These correspond to the relations governed by the context/node-identification, with the 
notion of workspace contraction serving to limit the available "viewing window" to just 
these domains housing distinct elements. 
 Bobaljik (1995:ch3) gives a range of arguments in favor of the b-view ("stacking" 
in his terms) and offers a number of arguments against the a-view ("leap-frogging"). 
What I will consider here and below is an extension of this general way of thinking that 
aims to stay consistent with the general notion underlying the TCG approach as we have 
been developing it. 
 Suppose then that we have a fully iterated view of local transitive structures. This 
would then look as follows (with the entire structure presented without any licensing of properties 
indicated): 
 
(293) C_{?:?} ? T_{?:?, ?:n} ? v_{?[?:?]} ? C_{?:?} ? T_{?:?, ?:a} ? V_{?[?:?]}
          D_{?:f, ?:?}                         D_{?:f, ?:?}
 
 
I have added ?-properties to the C-elements. Let us now consider how this view might 
function with respect to inherent reflexives/reciprocals. Above we noted that T-T 
identification would result in splicing-out v, yielding a present but not ?/?-connected 
element that would be interpreted in the output as the "implicit" external argument 
present in passives. Two possibilities suggest themselves on this picture. First, the 
derivation could proceed essentially as in previous discussion with two new additions. 
Take the derivation up to the introduction of v in (294) (all the feature valuations are 
pictured here on a single step for convenience): 
 
(294) C_{?:???:f} ? T_{?:???:f, ?:n?} ? v_{?[?:?]}
          D_{?:f, ?:???:n}
 
 
Two differences are now incorporated: (i) φ-properties on C, which are locally valued 
when D is related to T, and (ii) the θ-property is not valued for φ (this happens in the 
mapping to the output, whenever v is "spliced-out"). Subsequent addition of our 
hypothetical "object-related" C-element then yields: 
 
 
(295) C_{?:f} ? T_{?:f} ? v_{?[?:?]} ? C_{?:?}
          D_{?:f, ?:n}
 
 
Since this new C-element subsumes the higher one, we would have an instance of 
contraction, which I will suppose results in the following: 
 
(296) C_{?:f} ? T_{?:f} ? v_{?[?:f]} ? C_{?:f}
          D_{?:f, ?:n}
 
 
The important parts (highlighted above) are: (i) v-θ is valued in the mapping to the output 
(in being voided from the workspace) and (ii) the lower C-element is now valued for the 
upper domain's φ-value. Now addition of the lower T-V structure looks as follows: 
 
(297) C_{?:f} ? T_{?:f} ? v_{?[?:f]} ? C_{?:f} ? T_{?:?, ?:a} ? V_{?[?:?]}
          D_{?:f, ?:n}
 
 
Presence of T-?:a (accusative) on our assumptions requires/enforces distinct D. Suppose 
that overt self-anaphors are divided into a pronominal part and the "self" part, and that the 
function of the "self" part is to absorb accusative ? (see Hornstein 2000 for a similar view 
along these lines). I assume that it arrives as a bundle with its "pronominal" part, which is a 
bare D with unvalued ?, so that we have the element which I will mark as just 
self-D_{?:?, ?:}
 
 
(298) C_{?:f} ? T_{?:f} ? v_{?[?:f]} ? C_{?:f} ? T_{?:?, ?:a} ? V_{?[?:?]}
          D_{?:f, ?:n}                     self-D_{?:?, ?:}
 
 
?-valuation then works in the expected way to yield: 
 
(299) C_{?:f} ? T_{?:f} ? v_{?[?:f]} ? C_{?:f} ? T_{?:f, ?:a} ? V_{?[?:f]}
          D_{?:f, ?:n}                     self-D_{?:f, ?:a}
 
 
Which is thus all one ?-chain. The assumption is that self serves to "capture" the 
?-property, so it does not serve to individuate the "pronominal" part, which becomes valued 
by T-?. Thus the two ?-elements have the same index, yielding the ?-properties of local 
anaphors. Note that we could alternatively view the self forms as coming to the derivation 
specified for ?; this would result in the usual D-T exchange of ?/?-values. On the view 
above there is the oddity of having T-? filled in by C, and then having T value both the ? 
and ? properties of the anaphor. On the alternative just mentioned, we would get the same 
result as the representation in (299), except that technically the marked ?-features below 
will not have entered into a valuation relationship: 
 
(300) C_{?:f} ? T_{?:f} ? v_{?[?:f]} ? C_{?:f} ? T_{?:f, ?:a} ? V_{?[?:f]}
          D_{?:f, ?:n}                     self-D_{?:f, ?:a}
 
 
In our discussion of logophoric self-elements above, we suggested the following 
difference between local reflexives and logophoric-self, repeated here: 
 
(301) a. Local Reflexives: ?:f?? .. ?:? ? ? 
     MATCHING/VALUATION 
 
b. Logophoric -self: ?:f?? .. ?:f?? 
                        LINKING 
 
In the local context the matter is obviously quite subtle, as a distinction is being drawn 
between one versus two tokens of a valued feature. The intuition behind local valuation, 
implicit throughout and known as "reentrancy" in 
feature-based/unification frameworks,
105
 is that co-valued features are literally sharing a value. I 
have been assuming here that this yields a kind of internal unit-hood along the dominance 
sequence, and what is being explored now is the possibility of such relations extending 
across local recursive domains. (Relations between valued ?-properties I have suggested 
be treated as relations on the output structure of the linking sort). 
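
The contrast between one shared token and two matching tokens has a familiar computational analogue, which may help fix intuitions. The sketch below is only an illustration of reentrancy in the general sense used in unification-based frameworks; the `Cell` class and the variable names are hypothetical and make no claim about how the features themselves are represented.

```python
class Cell:
    """A mutable value holder, standing in for a feature value."""

    def __init__(self, value=None):
        self.value = value


# Local valuation: one token, with two features pointing at it.
shared = Cell()
antecedent_feature = shared
anaphor_feature = shared
shared.value = "f"  # valuing the shared token values both features at once

# Linking: two independently valued tokens that merely match.
pronoun_a = Cell("f")
pronoun_b = Cell("f")

same_token = antecedent_feature is anaphor_feature
merely_equal = (pronoun_a.value == pronoun_b.value) and (pronoun_a is not pronoun_b)
print(same_token, merely_equal)  # True True
```

Identity of tokens (`is`) corresponds to the matching/valuation case, while equality of independently held values corresponds to the linking case.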
 Note that regular pronouns on the present view could then plausibly be viewed as 
coming to the derivation with valued ?, and unvalued ?, like regular nominals. This 
would on our assumptions yield local obviation (*John_1 saw him_1). 
 The idea for inherent reflexives would then be to say that these occur where the 
lower ?-property would be absent entirely, as follows: 
 
                                                
105
 See Shieber (1985) for discussion and references. 
 
(302) C_{?:f} ? T_{?:f} ? v_{?[?:f]} ? C_{?:f} ? T_{?:f} ? V_{?[?:f]}
          D_{?:f, ?:n}
 
Assume that this is how inherent reflexives work, and that something like this is relevant 
to inherent reciprocity. There are obviously complications with the latter that do not arise 
in the former regarding plurality and how members of the denoted set are understood to 
participate in the relevant relation (e.g., sorting out various flavors of "strength" of the 
reciprocal relation). Setting aside this significant complication, what I'd like to focus on 
now is the following property that arises for both in passivization: 
 
(303) a. John washed/bathed/shaved 
 b. John washed/bathed/shaved himself 
 c. John was washed/bathed/shaved 
 
(304) a. The women met/kissed/hugged 
 b. The women met/kissed/hugged each other 
 c. The women were met/kissed/hugged 
 
A curious fact about these kinds of predicates is that both the inherent reflexive and 
reciprocal readings disappear in passivization.
106
 The c-cases above cannot have the b-reading, 
which the a-cases with 'missing' direct objects obligatorily have. 
 So the idea here would be that the upper (nominative) ?-properties, in virtue of 
contraction, would create a second ?-domain, thus serving to index both the external and 
the internal role. This general line of thinking could then be understood to support the 
inherent reflexive/reciprocal readings. 
 Roughly this kind of distinction is developed by Hornstein (2000) though with 
rather different technical assumptions: presence/absence of accusative Case can result, 
in Hornstein's terms, in movement of an NP from the object to the subject θ-position, 
licensing reflexive readings (Hornstein does not discuss inherent reciprocals).
107
 The 
presence of the relevant ?-property results in the two roles being distinguished, a 
possibility clearly permitted by these verb types (e.g., John washed the baby; The 
women kissed the baby). 
                                                
106
 These facts were pointed out to me by Ian Roberts (p.c.). See Baker, Johnson, & Roberts (1985) for an 
account that has a somewhat similar structure despite having little else in common with the present architecture. 
 The view of passive offered above explains this. The move is to suggest that as 
object related T can lack a ?-property, it also may lack a projected/selected C-element, 
which otherwise serves to "shield" it from contraction with the higher T. Since these 
derivations individuate/index the internal role via the upper (nominative) ?, this will 
necessarily be distinct from v_θ, thus yielding the absence of the inherent 
reflexive/reciprocal readings for these verbs. 
 This is worth taking a closer look at, as it bears on the issues of what happens 
where in the TCG approach we've been developing here. The assumption is that "like" 
features in the workspace simply come to share values. So two ? features, or two ? 
features, if these are in the same workspace, then they share values. Period. (Given that 
we have now partitioned ?-elements into separate workspace zones). 
 Landau (2004) criticizes Hornstein's analysis, with which I share some 
assumptions (the whole story above simply stipulates absence of accusative Case) as 
follows. Why are Case features not potentially optional for all transitives? Thus the 
accusative ? (our "lower" C_?) could be omitted with the mandatory reading then being an 
inherently reflexive reading, where (305)a would have to mean what (305)b does. 
                                                
107
 The general idea of having a Case-distinction permit the connection between internal and external 
arguments is attributed to suggestions made by Howard Lasnik & Alan Munn. 
 
(305) a. John hit 
 b. John hit himself 
 
This strikes me as a reasonable question, but one that has a reasonable answer. This is, it 
seems to me, a bit like asking why it cannot be the case that an unaccusative verb (e.g., 
arrive) ends up in the syntax with an outer v-shell and thus manifests a structure 
supporting things like John arrived the man with some corresponding transitive or 
causative reading (e.g., 'John made the man arrive' or some such). 
 There are really two possibilities as far as I can see to address this issue. Either 
there is such a thing as a non-compositional, not-fully-productive sort of 
"structure-building", or something else accounts for the lack of productivity (or promiscuity) among 
the decomposed bits in approaches adopting one or another view of the separation 
hypothesis. 
 It seems fairly clear to me that the difference between verbs licensing inherent 
reflexivity/reciprocity versus not is a lexicon distinction. This only means "arbitrary fact" 
if we presuppose a Bloomfieldian view. The question is: how does this distinction 
manifest in the syntax? What are the properties that must be projected such that we can 
account for the patterns that are exhibited by the relevant elements? What is being 
claimed here (and, as I understand things, by Hornstein as well) is that Case properties 
are central to how these different manifestations of a verb are projected into the syntax. 
Landau's objection seems to presuppose that we need to regard Case optionality in the 
intended sense as a matter of the workings of the syntactic system, as opposed to 
consequences for the syntactic system which could be seen to follow from alternative 
projection possibilities of Case/?-properties associated with different classes of elements. 
 What answers the objection for unaccusatives? Clearly it must be part of the 
specifications of these elements that they cannot enter into a structure with a 
superordinate v-shell. Once we have made the move of doing the decomposition of the 
VP in this manner there's no getting the genie back in the bottle; we have to live with the 
somewhat more exciting ontology of the separation/decomposition view. And this, it 
seems, means accepting that there are some non-compositional sorts of organization 
involved here. 
 If I'm right in this line of argumentation, connecting these sorts of "argument 
structure" alternations with the theory of Case (or agreement perhaps as well, and perhaps 
more in the functional hierarchy), then it begins to become plausible that we are looking 
at (as suggested earlier) paradigmatic organization, which we needn't necessarily expect 
to be "productive" in the manner that syntagmatic organization is. It's just a different kind 
of system. 
 This does of course presuppose that what is captured "in the lexicon" includes 
specifications regarding the projection possibilities which go beyond a single phrasal 
projection layer. But this is general, and not specific to the issues as they arise regarding 
inherently reflexive or reciprocal verbs. This is, in fact, part of the point of work 
elaborating on the notion of extended projections of the sort discussed in the work of 
Grimshaw (1991, 2000) and others, or so it seems to me. Given the advent of the 
functional projection explosion which has attended the development of minimalist theory, 
it is no longer conceivable that we can understand "projection" as governing the 
distribution of elements within a single categorial shell. 
 Let us ask the same question in a related domain in an analogous way that will 
make the issues somewhat clearer, as the issue strikes me as one worth spending some 
time on. Consider the following distinctions in the "lexical syntax" of various different 
types of elements, as conceived in the widely adopted work of Hale & Keyser (1993, 
2002): 
 
(306) a. [XP X0 YP]   b. [XP ZP [X X0 YP]]   c. [XP ZP [X X0 Y0]]   d. X0
 
 
They distinguish between elements which (a) take a complement, (b) take a 
complement and a specifier, (c) take only a specifier, and (d) take neither a 
complement nor a specifier. They suggest that while these possibilities do not universally 
align with categories, there may be associations which predominate (e.g., they suggest 
that in English, the predominant realization of (a) is V, (b) is P, (c) is A, and (d) is N). 
 So: how do we think about "mismatches", that is, situations in which the wrong 
element somehow gets associated with projection realizations which clash with its 
properties? For example, Hale & Keyser treat so-called unergatives (e.g., laugh) as 
realized by monadic head-complement structures of the (a)-type above, as follows: 
 
(307) [VP V0 [NP laugh]]
 
They suggest that the impossibility of such elements participating in the following 
transitivity alternations follows from the analysis above: 
 
 
(308) a. The children laughed 
b. *The clown laughed the children  
(i.e., "the children laughed because of the clown") 
 
They note that this property is shared by analytic expressions like make trouble: 
 
(309) a. The cowboys made trouble 
b. *The beer made the cowboys trouble 
 (i.e., the cowboys made trouble because of the beer) 
 
Thus the transitivity alternation impossibilities are understood to follow because these 
elements are in their surface "object-less" form in a sense already transitive. However, 
H&K assume that the relevant structures such as the one above for laugh do not strictly 
speaking exist at any level of syntactic structure. Rather, they assume that there is a 
process of conflation which happens as a "concomitant" of Chomsky's MERGE. That is, 
there are not two operations like (310)a, but rather simply MERGE and the consequences 
of MERGE (conflation) for items of this particular type, as in (310)b: 
 
(310) a. [V V_? N_laugh]          b. V_laugh
          MERGE(V,N)        +        CONFLATE(V,N)

(311) [V_laugh V_? ?? N_laugh]
      MERGE&CONFLATE(V,N)
 
So, why isn't this process totally general? What stops impossible applications involving 
"nominal" elements that do not fall into the unergative class? Take the nominal element 
cigar; why shouldn't it manifest the properties that laugh exhibits in virtue of undergoing 
this kind of MERGE+CONFLATION? 
 
 
(312) *John cigared 
 (meaning perhaps: 'John had a cigar', it doesn't matter for the present point) 
 
This would be an instance of the kind of mismatch raised above. That is, generally, once 
we have made some of these distinctions between predicates a matter of structure, what is 
it that keeps the relevant elements in their appropriate bins? 
 Alternative derivational routes of the sort explored above regarding passives and 
reflexive/reciprocal verbs are like the options of entering into a MERGE+CONFLATION 
derivation versus one with simply MERGE. What these alternative derivations do is to 
partition the space of structural possibilities given a set of distinctions provided as input. 
This is to say that "lexicon information is syntactically represented". The job of syntactic 
theory is to provide a principled partition of this space that captures the relevant 
properties of the structural alternatives given the properties projected by given items. 
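
The idea that lexical specifications partition the available derivational routes can be illustrated with a deliberately crude sketch. Everything here is a hypothetical stand-in: the `LEXICON` table, the `conflates` flag, and the function names are assumptions made for illustration, and nothing is implied about how Hale & Keyser or the present system actually encode the distinction.

```python
# Hypothetical lexical entries; the "conflates" flag is an illustrative assumption.
LEXICON = {
    "laugh": {"category": "N", "conflates": True},
    "cigar": {"category": "N", "conflates": False},
}


def merge(head, complement):
    """Plain MERGE: build a two-member structure headed by `head`."""
    return (head, complement)


def merge_and_conflate(noun):
    """MERGE an empty V with a noun and conflate, if the entry permits it."""
    entry = LEXICON[noun]
    if not entry["conflates"]:
        return None  # this derivational route is unavailable for the item
    head, complement = merge("V", noun)
    # Conflation: the noun's content surfaces on the verbal head.
    return f"{head}[{complement}]"


print(merge_and_conflate("laugh"))  # V[laugh]
print(merge_and_conflate("cigar"))  # None
```

On this way of putting things, the absence of *John cigared is not a fact about MERGE as such but about which routes a given item's specification makes available.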
 I will leave this to the side for now, observing that the suggested analysis raises rather 
large issues. I have offered a general direction for thinking about their solution, and that's 
about it. However, there is one final further point raised above that requires discussion. 
 Nothing in what I have said about these reflexive/reciprocal verbs distinguishes 
between the two classes. The relevant claims being made here are that (i) absence of 
?-properties makes it so we do not have local obviation, so that the external and internal 
roles can become associated, and (ii) that the properties of passive derivations make it so 
the external role will not be ?-associated, but that since the internal role will be, these two 
roles will be distinct. This accounts for the similarities of reflexive/reciprocal verbs under 
passivization (why these inherent readings go away for both types of verbs). 
 But I have nothing further to add here about how the semantics works for these 
cases such that we can discriminate between the effects of this kind of external/internal 
argument connection for the two classes (i.e., the fact that the women kissed does not 
mean 'the women kissed themselves'). I do think that the present system makes promising 
distinctions which can be taken as a good basis upon which to build in this respect. I will 
leave this as an open question, taking the current account to provide part of the final 
solution (the part which captures the similarities, leaving the differences up to an 
unspecified semantic story for now).
108
 
 We can, however, capitalize on the general logic deployed for reflexives to 
provide a schema for analyses of control phenomena. Taking note of the collection of 
properties shared by obligatory control and reflexivization (Hornstein 2000), I suggest 
here that the same general notion of C-C identification and contraction is at work. 
 That control predicates take C-complements is a widely held assumption in 
theories of control. One fairly clear source of evidence for this sort of analysis is based in 
the fact that, as pointed out in Landau (2000), control predicates (but never raising 
predicates) can be seen in many languages to manifest overt complementizers. 
Moreover, in languages which manifest this distinction overtly, those predicates which 
appear to be ambiguous between the control vs. raising type, when they manifest a 
complementizer, only manifest the control reading. So assuming these predicates to take 
C-complements, suppose that we say that these similarly manifest absence of an 
individuating ?-property that we have argued to be the driving factor in inherent 
reflexivization. 
                                                
108
 Juan Uriagereka points out cases like "they scratched" which seem to be vague between a reflexive, 
reciprocal, or an implicit (distinct) object reading. This is one of many cases to address. There is also the possibility of 
"making" a reciprocal verb by passivizing certain ditransitives. For example, He introduced John to Mary versus John 
was introduced, which with a plural subject (they were introduced) is ambiguous between an implicit object reading 
(i.e. they were introduced (to the audience)) or a reciprocal (each other) reading. The present suggestion for heading 
into these issues is to continue to explore the appeal to sameness/difference in the form of ?/? properties. 
 
 Let us begin with a familiar contrast (Rosenbaum 1967) between raising and 
control: 
 
(313) a. Dave persuaded a doctor to examine Bill 
 b. Dave expected a doctor to examine Bill 
 
(314) a. Dave persuaded Bill to be examined by a doctor 
 b. Dave expected Bill to be examined by a doctor 
  
These examples differ in the interpretative properties of the embedded infinitival clause, 
depending on active/passive voice. The b-cases are ECM/object-raising, and are basically 
synonymous. The a-cases, involving control, are not; they differ as to who is being 
persuaded (Bill or a doctor). 
 Another familiar contrast involves idiom chunks, which raising (a-cases) but not 
control (b-cases) allows (i.e., to the extent the b-cases are ok they must not be idiomatic): 
 
(315) a. The cat seemed to be out of the bag 
 b. #The cat tried to be out of the bag 
 
(316) a. John expected the cat to be out of the bag 
 b. #John persuaded the cat to be out of the bag 
  
 As mentioned earlier, Hornstein (2000) argues for an approach which reduces 
control to raising/movement. Under this view the salient difference between the two sorts 
of construction is simply how many θ-roles are hanging around. Whereas in raising an 
element moves from a θ-position to a Case position, in control elements are understood to 
move from θ-to-θ, picking up "θ-features" as they go. Thus, the above contrasts are 
straightforwardly accounted for in terms of whether the idiom chunks, which cannot 
receive thematic roles without losing their idiomaticity, have to move through a position 
where getting a θ-feature is avoidable. 
 Similarly, the active/passive difference between object raising and object control 
shown above simply turns on whether the passivized NP ends up in a theta-position 
(control) versus not (raising). 
 Manzini & Roussou (2000) develop an idea similar in spirit to Hornstein's 
raising/control reduction, though with a rather different technical implementation. They 
suggest that in both raising and control the "moved" element is rather simply inserted into 
its surface position, and from there it "attracts" θ-features (conceived as aspectual 
features). The raising/control difference turns on simply whether a single θ-feature or 
more than one such feature is attracted to the "controller". 
 Let's consider some possible structures for some core cases. Consider first the 
difference between the object raising and the control manifestations of a verb like expect: 
 
(317) a. John expected him to leave 
 b. John expected to leave 
 
The object raising version we expect to evoke the following structure (using a shorthand 
notation indicating the relevant T-T identification): 
 
(318) C ? T_{?:f} ? v_? ? C ? T_{?:g} ? V ? T ? v_?
          D_{?:n, ?:f}          D_{?:a, ?:g}
 
 
What about the control case? If control complements generally involve a C-layer, then we 
can view the extension of the ?-properties into these non-finite domains 
just as we did in the case of local reflexives above, which would look as follows for 
subject control. We could assume either of the following: 
 
(319) C_{?:f} ? T_{?:f} ? v_? ? C_{?:f} ? T_{?:f} ? V ? C_{?:f} ? T_{?:f} ? v_?
          D_{?:n, ?:f}
 
 
(320) C_{?:f} ? T_{?:f} ? v_? ? C_{?:f} ? T_{?:f} ? v_?
          D_{?:n, ?:f}
 
 
Both of these would result in the extension of the matrix subject domain so that it ends up 
in a local relation with the lower v_θ. The two possibilities would differ in whether we 
would find reason to maintain the object-related C and T elements in absence of either 
object-? or the ?-less ?/? properties argued to be present in object raising cases. I will not 
pursue this issue, though I cannot at present see a reason to maintain the more 
complicated structure. 
 On this view, note that we would be assuming the subject element is itself not 
brought into a local relationship with the lower role, as with raising. Rather, only its 
?-feature is. Consider in this connection the often noted lack of reconstruction effects in 
control (but not raising). 
 
(321) a. Someone from New York is likely to win the lottery 
 b. Someone from New York is eager to win the lottery 
 
As noted in May (1985), the a- and b-cases above differ in whether they admit a reading
with someone scoping low. That is, the raising (a-) case is ambiguous between meaning
that some particular person from New York is likely to win, versus a low scope reading
paraphrasable as "It is likely that someone from New York will win the lottery", where it
is clear that we do not have any particular person in mind. Importantly, the control (b-)
case above has no such reading; it only exhibits the higher scope interpretation in which
we have a particular person in mind who is eager to win. This raising/control difference
follows on the assumption that computing scope requires the actual scope-bearing
element to be in the relevant local domain. (However, these facts as an argument for the
present view should be treated with caution; see our discussion above regarding Lasnik's
arguments about specificity).
 Note that object control (e.g., John persuaded Mary to leave) now gets a parallel
derivation to the one offered for object-raising, only involving C-contraction as with the
subject-control cases above. Consider:
 
(322) C[φ:f] ? T[φ:f] ? v[θ] ? C[φ:g] ? T[φ:g] ? V ? T ? v[θ]    OBJECT RAISING

      D[?:a, φ:f]              D[?:b, φ:g]
 
 
 
(323) C ? T[φ:f] ? v[θ] ? C[φ:g] ? T[φ:g] ? V ? C[φ:g] ? T[φ:g] ? v[θ]    OBJECT CONTROL

      D[?:a, φ:f]         D[?:b, φ:g]
 
 
Another possibility for analysis of control in the present terms is the equivalent of a bare-
VP complement: 
 
(324) C ? T[φ:f] ? v[θ] ? V ? v[θ] ? ..

      D[?:n, φ:f]
 
 
Obligatory control verbs with gerundival complements might be of this sort. 
 
(325) John tried [vP eating the pie]
 
 
Such cases would thus involve direct relationships between v's, somewhat akin to the
story suggested at the beginning of this chapter for secondary predication (John arrived
sad).
 Note that, of course, what has been offered in this section is just a sketch.
Nonetheless, two key points emerge that will be important to pursue further. First, there
can be no straightforward direct control/raising assimilation in the present architecture.
But, second, the issues are now perhaps a bit more subtle. Given that we have
reconceived movement in general as "agree-type" feature/category relationships, all
localizable relations fall into this general bin in one way or another. The general appeal to
agreement (φ) properties sketched above for a potential account of control relationships is
consistent with Landau's (2000) view, but as we do not recognize a separate
"movement-type" of relation, it's not clear that raising and control aren't being brought
closer together in terms of being subserved by the same general mechanisms (albeit in
different ways). These issues need to be more carefully pursued within the TCG framework
to see how things turn out, but the general format that the system makes available for
analysis suggests that at least a partial control/raising unification may be feasible (so we
may have a position intermediate between those advanced by Hornstein and
Manzini/Roussou on the one hand, and views of the sort championed by Landau on the
other). At any rate, I leave these matters for future investigation.
 
3.5. Clausal Unithood & Wh-Again, ...
The discussion in the previous section is of course quite speculative. However, we noted
above that positing an object-related T-element is not without precedent, nor is the
general "split-VP"/"stacking" approach.
 Note as well that in a recent thesis, Butler (2004) argues for a general view of
phase-hood with roughly the kind of iterated CP-structure that our view of contraction
requires. In addition to the development of his own arguments, he points to a number of
other places in the literature where similar kinds of assumptions have been shown to bear
fruit in syntactic analysis.
109
 
 Iterated clause-internal sub-structures of the kind I have in mind were also
proposed by Demuth & Gruber (1994), who distinguish between Basic Projection
Sequences (BPS's) and Lexical Projection Sequences (LPS's). To avoid confusion with
references to Chomsky's Bare Phrase Structure (also "BPS"), and to suggest the
connection with the organization of what I earlier referred to as Core Licensing
Properties (CLPs), let us refer to the sort of objects that Demuth & Gruber call Basic
Projection Sequences as instead CORE PROJECTION SEQUENCES (CPSs), and, to make an
explicit connection to Grimshaw's (1991, 2000) proposals, call the analogue of their LPS
instead an EXTENDED PROJECTION SEQUENCE (EPS).
 Demuth & Gruber's proposals differ somewhat in detail from what I have proposed
here (or what Butler proposes, for example), but the ideas are all very similar. On D&G's
                                                
109
 Butler's articulation of phases is quite detailed, and motivated by connections to a particular view of the
syntax-semantics interface relevant to understanding quantification, scope, and the like, building on ideas of Beghelli &
Stowell (1997), Belletti (2001, 2003), Jayaseelan (2001) among others. I refer readers to Butler's thesis for further
discussion and references.
 
view their BPS's iterate to form LPS's. An LPS is simply a series of BPS's with a kind of 
ultimate lexical/thematic head at the bottom of the lowest BPS. 
 We can illustrate the idea as it is relevant to what has been suggested here with
reference to our proposed iterated CP-TP-v/VP structures as follows, using our new
terminology for the relevant units (CPS/EPS):
110
 
 
       CPS                  CPS
(326) [CP C0 [TP T0 [vP v0 [CP C0 [TP T0 [vP V0 ..]]]
              EPS                          "ULTIMATE" HEAD OF THE EPS
 
EPS's then might be seen as a series of CPS's which bottom out in a major lexical
category. It may be that the typical case is that an EPS is at most two CPS's (as in (326)
above), though further issues not examined here may force us to conclude otherwise (the
structure of ditransitives, causatives, and many other matters).
 Of course, we would like to have some idea of what makes a series of CPS's
"hang together" to form an EPS. There are a couple of things we might say on this score
which require further investigation but which seem like the right sort of ideas. First, recall
from our discussion of contraction and node-identification in §3.3.2 the
general schema that we used to explain how features might stay "fixed" to
positions in the output structure, despite being implicated in lower domains. What we
                                                
110
 The reasons for my changing terminology are not just to avoid confusions with references to Chomsky's
theory of phrase structure. Demuth & Gruber actually understand their BPS's/LPS's to bottom out in a thematic
element, with the higher BPS's understood to be athematic, so there is only one of these "thematic" BPS's in a given
LPS on their view. See their paper for discussion and interesting analysis of compound tenses in Bantu languages.
Given the fit of this general idea with Split-VP ideas I will not be importing this aspect of their story. Here the lowest
elements of both sub-units (now: CPS's), that is v and V, are both thematic elements. And, while v has sometimes been
entertained as a member of the "functional" category inventory, I will here regard it as essentially functional (perhaps
"semi-lexical").
 
require is perhaps some further property which could be seen to run through a series of
CPS's, in a sense serving to hold them together. This would be something analogous to
the way φ-properties have been suggested to "hold together" a series of distinct elements
forming our CPS's.
 One plausible candidate, which we might or might not wish to view as "part of the
syntax" in any direct way, is the variables and quantifiers associated with eventualities (the
"e" variable casually referred to in earlier discussion). It seems plausible to say that
something like Kratzer's (1996) event identification might serve to unify two such CPS
structures into a single "EPS" (i.e., some way in which the event variables in the two
separate domains are linked/identified).
 And, just as there are properties marking the edges of CPS's, we might examine
other properties that might serve to group our CPS's into larger units. Finiteness and
Force (Rizzi 1997) might be such properties. Realized as categories, these could be
elements which serve to mark off our larger stretches of structure equivalent to the
traditional clause.
 However these matters are pursued, I will close this chapter with reference to one
last class of facts that our view of SCM does not predict unless we take the idea of
recursive structure into the clause in the way suggested above. In particular I am referring
to the Fox examples discussed in Chapter Two. Consider:
 
(327) a. √ [Which of the papers that he_1 gave Mary_2] did every student_1 [vP √ [ask
her_2 to read * carefully?

b. * [Which of the papers that he_1 gave Mary_2] did she_2 [vP * [ask every
student_1 to revise * ?
 
 
Instead of moving to the 'edge' of vP, here we have a uniform approach of C-C contraction
which serves to bring the wh-element into the local configurations that are necessary to
account for the possible and impossible interpretations in these cases.
 However, recall as well from Chapter 2 the following cases (Legate 2000):
 
(328) a. √ [At which of the parties that he_1 invited Mary_2 to] was every man_1 [vP √
[introduced to her_2 * ?

b. * [At which of the parties that he_1 invited Mary_2 to] was she_2 [vP *
[introduced to every man_1 * ?

(329) a. √ [At which charity event that he_1 brought Mary_2 to] was every man_1 [vP √
[sold to her_2 * ?

b. * [At which charity event that he_1 brought Mary_2 to] was he_2 [vP * [sold to
every woman_1 * ?
 
These cases, as noted in our earlier discussion, are problematic for Chomsky's (1999)
view of phases as just C and v, as they seem to involve passives.
111
 These facts are
incompatible with the present view as well. Recall that our account of passives denied the
presence of an object-related C-element to derive the "suppression" of v-θ (its
splicing-out of the workspace in virtue of T-T contraction).
 However, there is another line of argumentation available to us given the general
perspective of domains. Note that the cases above are passives of ditransitives. What we
have claimed in terms of the stacking of independent thematic domains with independent
functional structure (perhaps the extreme of the "stacking" view) ought perhaps to hold of
indirect objects as well. What is required to capture the facts above is some position that
                                                
111
 Legate provides similar examples with unaccusatives, though to build the cases she requires a special sort
of unaccusative that takes more than one internal argument. I'm uncertain about the classification of the verb she uses
(see her paper for the cases), but should the argument turn out to be ok, I think the story I run in the main text for the
passive case will carry over (should it turn out to be sustainable!).
 
the wh-phrase must move to, which is below the subject (e.g., so the pronoun within it may
be bound by every man as in the a-cases above) but above the indirect object (so as to
avoid obviation in the a-cases above).
 Suppose that this position is the edge of the indirect object's domain, conceived
uniformly with the subject and direct object cases. To get the facts, we need an outer
C-type layer surrounding the prepositional phrase, i.e.:
 
(330) C?T?v?C?T?V?C??P?D 
    subj.   direct obj.  indirect obj 
 
However, an extra C-layer may not actually be required. What are prepositional elements
anyway? In classical X-bar theoretic feature decompositions of categories they were
typically regarded as negatively specified for both "n" and "v" properties. Interestingly,
Pesetsky & Torrego (2004), in addition to positing an object-related T-element, also
suggest the possibility that (at least some) prepositions may be a "type of T-element".
They hint at a connection between the elements in terms of the
functions they play in the semantics of time and space, but the interesting point from the
present perspective is the possibility of their belonging to a general type including T.
 Note as well that there is often discussion of prepositional/complementizer type
relations with, for example, worries about whether prepositional-looking elements that
introduce clausal structures (e.g., before John ate the pizza) are really of the P or C type
(e.g., Lasnik & Saito 1991 argue for a C-type analysis of cases of this sort).
 A number of interesting possibilities arise here that deserve more attention than I
can devote here, but let me make another general point. The implementation of the TCG
ideas that have been pursued in the present chapter suggests a research strategy aimed at
re-evaluating how we partition syntactic classes. The actual "labels" we have deployed in
our discussion and analyses are classical ones, and so might naturally evoke some
suspicion (which is not unreasonable, e.g., "there aren't any clause internal
complementizers!").
112
 
 But in the spirit of pointing a direction for such potential reclassifications, the
present point is that it is not entirely crazy to think that C, T, and P might be fruitfully
viewed as members of a larger class.
 At any rate, if the general line of thinking is on the right track, we might then
discover classes of the "P-type" which would relate to C, T, or perhaps to both types via
the node-identification mechanism developed here. We might find reason to attribute
different features to this super-class to discriminate possible/impossible identifications
along the lines that have been suggested for the subject/object situation and for
cross-clausal relations above (e.g., two C-wh's cannot identify, etc.).
 For Legate's data above, if this is correct then it might be that the wh-element
moves to the edge of the indirect-object domain (C-"P" identification). This view would
be interesting as well to explore with respect to passivization and ditransitives.
 However, these are topics for another day.
3.6. Conclusions: The Take-Home Message of TCG 
Here are the core points to take home. First: the analyses that TCG supports with respect 
to core "classical" cases of SCM, in particular raising-to-subject and wh-movement of the
typical clause-edge-to-edge variety, should be taken to be the central result. 
                                                
112
 Though, again, see Butler (2004), Belletti (2001, 2003), Jayaseelan (2001).
 
If nothing else about this dissertation is correct, the ability of the present system
to provide a basic platform for understanding SCM without postulating what we called
M-features is something that any further pursuit of this enterprise should attempt to
maintain. The effect of this architecture is to reduce a wide-ranging general class
of superficially non-local dependencies to local ones. In the local domains themselves,
we see only independently motivated "core licensing properties" (CLPs) at work in
establishing the key relationships.
What allows us to dispense with M-features (i.e., "the EPP") is the mechanics of
node identification. Generally, the idea is that intermediate positions of the SCM sort
exist only because of (i) the existence of core local-type relations holding in the matrix
clause and (ii) the general fact that lower "like-elements" constitute informational
supersets of matrix contexts. Crucial to executing this intuition is the WS/O-distinction,
which allows us to separate out local computation of relationships from the
resultant/derived output in a way that allows non-local relations in the output to be
maintained within a local workspace.
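
The division of labor between a bounded workspace and a growing output can be pictured with a toy sketch. The Python below is only an illustration under strong simplifying assumptions and is not the TCG formalism itself: categories are bare labels, "like elements" are simply identical labels, and contraction is modeled as identifying a repeated label with its earlier occurrence and splicing out whatever intervened. The function name `expand_top_down` and the list representation are hypothetical.

```python
def expand_top_down(categories):
    """Toy top-down expansion (illustrative assumption, not the TCG definition).

    The output keeps every category encountered. The workspace is contracted
    whenever a like element (here: an identical label) is encountered, so it
    never contains a repeated label.
    """
    workspace, output, trace = [], [], []
    for cat in categories:
        output.append(cat)
        if cat in workspace:
            # Contraction: drop the earlier like element and everything below
            # it; the new occurrence then takes its place in the workspace.
            workspace = workspace[: workspace.index(cat)]
        workspace.append(cat)
        trace.append(list(workspace))
    return output, trace


# A matrix C-T-v stretch followed by a lower C-T-V stretch.
output, trace = expand_top_down(["C", "T", "v", "C", "T", "V"])
for step in trace:
    print(step)
```

On this sketch the output records all six categories, while the workspace shrinks back to a single C when the second C is encountered, which is the sense in which non-local relations in the output are computed over a local workspace.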
 What I have sketched in the present chapter is one possible implementation of a 
more general set of ideas. However, I have suggested that some of the specifics yield an 
interesting story, both about some particulars and in the general form of the answer that is 
provided regarding clause-structure and dependency relationships. 
 We are now in a position to address as well an issue that I have left unaddressed
throughout: the assumption that these derivations work "top-down". Numerous
technical aspects of the presentation hinge on this assumption, but the general logic of the
SCM story is where the distinction is clearest. The necessity of a top-down derivation
goes hand-in-hand with the denial of M-features. If there are no such properties, then it is
not possible in general to have an element associate directly with intermediate positions,
nor is it possible to "move" to them. Note as well that offering a format within which to
pursue an eliminative agenda regarding EPP-type properties, in favor of a view where local
licensing is handled in terms of CLPs, distinguishes the present approach from the views
of TAG that we discussed in Chapter 2, where elementary tree-local movement is not
generally understood in this way.
 Needless to say, this is just the setup for an analytical investigation into the wider
range of cases, and cross-linguistic differences, that have been taken to motivate
EPP-features and the like. What I have offered here is a start on what strikes me as the most
serious challenge: getting rid of non-CLPs as motivators for intermediate movements.
 We could, presumably, attempt to motivate a "bottom-up" view along one of the
lines mentioned in Chapter 2 (e.g., non-feature-driven movement to avoid crashing the
derivation), with a wh-element starting in a base position, but this bottom/embedded
domain will not itself be a "phase" on the node-identification and contraction view, as
the relevant "like element" won't arrive until the top of the next highest clause. Moreover,
other like elements (e.g., the V selecting the embedded clause) will arise first, and in
virtue of the anti-recursion restriction on the workspace, will force a splicing out of the
intervening material. This could be taken to motivate movement directly to a
superordinate VP-adjoined type position, skipping a lower C, but this would seem to be
contrary to most of the empirical evidence reviewed in Chapter 2. Also, this would be a
mixed-view system of displacement, which would involve both standard movement and
the node-identification mechanism.
113
 So, it turns out on the present view that top-down 
structure expansion and eliminating M-features go hand-in-hand. 
 Returning to TAG approaches, one might ask whether the adjoining/substitution
mechanics could be put to work in ways similar to what has been developed here in
appealing to the reduced Brody-type structures that collapse intra-phrasal projection-level
distinctions. This seems possible, though I have not investigated the matter.
 The general outlook here on the relationship between TCG, TAG, and the MSO-type
systems discussed in Chapter One is that they constitute a family of closely related
approaches. The introduction of the WS/O-distinction at the outset of this work is
designed to form a general background context within which various aspects of the
different approaches might be mixed/matched and then tested against the facts of human
language. What I have offered here is an outline of one such approach.
 And there are numerous issues which have not even been scratched. In order to
concentrate on the key properties of interest here regarding recursion, the
node-identification view of lowering/copying, and the like, issues regarding head-movement
and modification have been completely avoided. This is a serious omission, and should
be one of the first areas to be developed in any continued thinking on this general
approach.
                                                
113
 Note that we did suggest something close to such a mixed view in our discussion of A' to A-position
movement earlier in the chapter, but the details were developed to bring this case within the general logic of
identifying properties on the dominance sequence as a way of deriving copying of elements in the output.
 
3.7. Closing 
I wish to close with some general points and a few open questions. First, consider again 
our earlier discussion regarding chain structures, where we suggested that Chomsky's 
(1995) "technical options" (including our third previously undiscussed option) regarding 
chains are not in fact different possible ways of talking about the same thing, but rather
simply different things. In our dominance encoded feature-relations, we suggested that the
following three schemas are actually different chain types:
 
(331) [Figure: three chain schemas. Left: Binding/Control (connections between
A-relations). Middle: A'-relations. Right: Some A-relations.]
 
Simple A-relations were understood to involve two or three node connections, linking up
a ?-marked nominal with ? by means of ?-properties. These sequences, I suggested, might
be themselves linkable, via node-identifications involving the top-most members of each
such sequence. This was the basic shape that was suggested for local reflexives (perhaps
also inherently reflexive verbs), and was offered as a schema for the structure of control
relations as well. This is the picture of chains that we see in the left-most schema in
(331). Local A'-relations, it was suggested, are best understood as local WH-?
relationships, which are themselves connected to ? via ?-relationships. This is the picture
in the middle schema in (331). Finally, the rightmost schema above manifests the
structure I have attributed to (e.g.) passives. There a T-T contraction was argued to
splice out intervening v, leaving it in the workspace unassociated with any other element
(this was suggested to be the present but unexpressed external role). The picture then is of
a lower ?/?-? relationship which is connected to a higher position (e.g., nominative T).
 There is a central idea in play here that our discussion has not adequately touched
upon. The mechanics we have been working with assume a single formal ordering
dimension, one that admits of branching, along which feature-licensing relationships are
characterized. The ideas just discussed regarding different "constituency structures" for
chain relationships have been key.
 This is just a sketch. Putting the system to work in more in-depth and rigorous
analysis is what is now required. What the present work has accomplished is to set the
stage for a novel type of approach that I have argued has the right general structure to
provide principled accounts of SCM phenomena at the least, and perhaps has
consequences for other concerns.
 In general, I wish to stress again here that I believe it is best to view the present
approach along with the others that have been discussed alongside it (TAG and MP/MSO
approaches) as a family of closely related ideas. The efforts here have brought out some
differences between these approaches, but the hope is that they have also been brought
somewhat closer together.
 Consider again both the Chamorro agreement facts and the S-V inversion cases
from Spanish discussed in Chapter 2:
 
(332) ¿Qué pensaba Juan [que le había dicho Pedro [que había publicado la revista]]?
what thought Juan that him had told Peter that had published the journal
'What did John think that Peter had told him that the journal had published?'
 
 
 
 
(333) Hafa  sinangani-n Juan as Dolores [t ni  minalago'?a   [t p?ra un-taitai    t]? 
WHAT? WH[OBJ2].tell Juan OBL Dolores  COMP WH[OBL].want-AGR FUT WH[OBJ].AGR-read
 "What did Juan tell Dolores that he wants you to read?"
 
I noted in Chapter Two that the Spanish facts at least have been subject to some
controversy. In this connection I mentioned the work of Baković (1995), who documents
the following dialect variation:

(334) a. No inversion with any wh-phrases (Suñer 1994)
 b. Inversion with argument wh-phrases only (Torrego 1984; Suñer 1994)
 c. Inversion with all but reason wh-phrases (por qué/"why") (Goodall 1991a,b)
 d. Inversion with all wh-phrases in matrix clauses; all but reason wh-phrases
in subordinate clauses (Baković's survey)
 e. Inversion with all but reason wh-phrases in matrix clauses; only argument
wh-phrases in subordinate clauses (Baković's survey)
 f. Inversion with argument wh-phrases in matrix clauses; no inversion in
subordinate clauses (Baković's survey)
 
The general conclusion of Baković's research into these matters is that there is a scale
which to a first approximation tracks a hierarchy of wh-elements, ordered on a
more-to-less "referential" (or perhaps "argumental") scale. Dialect variation appears to be
systematic if Baković is right. First, it is possible to have a dialect with no inversion at
all. However, if there is inversion and if it is allowed with wh-elements at position X in a
more-to-less referential/argumental continuum, then it is allowed with all the others
higher in the hierarchy. Moreover, there appears to be a "subset" relationship, with
respect to the matrix/subordinate distinction, such that embedded clause inversion
possibilities with respect to this hierarchy of wh-elements are always a subset of what is
possible in the matrix.
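
The two generalizations just stated can be checked mechanically against the dialects in (334). The following Python sketch does so under simplifying assumptions that are mine rather than Baković's: the hierarchy is collapsed into three ranks, the middle label "adjunct" stands in for all non-argument, non-reason wh-phrases, and dialects (334a-c), which do not mention clause type, are treated as behaving alike in matrix and subordinate clauses. All names in the code are hypothetical.

```python
# Hierarchy from more to less referential/argumental (a three-way
# simplification assumed for illustration only).
HIERARCHY = ["argument", "adjunct", "reason"]

# Dialects (334a-f): which wh-types trigger inversion, by clause type.
DIALECTS = {
    "a": {"matrix": set(), "subordinate": set()},
    "b": {"matrix": {"argument"}, "subordinate": {"argument"}},
    "c": {"matrix": {"argument", "adjunct"},
          "subordinate": {"argument", "adjunct"}},
    "d": {"matrix": {"argument", "adjunct", "reason"},
          "subordinate": {"argument", "adjunct"}},
    "e": {"matrix": {"argument", "adjunct"}, "subordinate": {"argument"}},
    "f": {"matrix": {"argument"}, "subordinate": set()},
}


def is_upward_closed(types):
    """Inversion at some rank implies inversion at every higher rank."""
    return types == set(HIERARCHY[: len(types)])


def is_consistent(dialect):
    """Both clause types respect the hierarchy, and subordinate is a subset of matrix."""
    matrix, sub = dialect["matrix"], dialect["subordinate"]
    return is_upward_closed(matrix) and is_upward_closed(sub) and sub <= matrix


for name, dialect in DIALECTS.items():
    print(name, is_consistent(dialect))
```

A hypothetical dialect with inversion only for reason wh-phrases, or with more inversion in subordinate clauses than in matrix clauses, would fail `is_consistent`; none of the attested patterns in (334) does.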
 
 Interestingly, Chung (1994) reports that the local agreement facts in Chamorro are
optional for referential wh-elements, taking the relevant classes of elements to be those
picked out in the work of Cinque (1991) (see also Pesetsky 1987 on "D-linking").
 What sort of mechanics should be deployed to account for their syntactic
properties? Pesetsky (1987) offers a story which relies on a division between
"regular" wh-movement and a kind of unselective binding by a matrix Q-morpheme, of
the sort offered in the work of Baker (1979), to provide a grounding for the
D-linked/non-D-linked distinction. Suppose something like this is correct. There are then
two types of relationships that in principle can account for the connection between a
wh-element and potentially distant (embedded) thematic information: a local SCM
mechanism, and a potentially long-distance kind of (semantic?) relation.
 But why should this be? Why should there be two? If it can happen long-distance,
why not always that way? Or why not always linked-local?
 Note that our WS/O-distinction offers a reasonable place to hang this difference:
we could understand unselective binding to be a semantic relation (perhaps of the
linking sort we discussed for logophors earlier, realizing an unselective binding relation)
holding over output structures, and also have the edge-to-edge linked-local style relation
as mediated by the syntactic workspace. My suspicion is that the right way to approach
these issues should involve a close examination of the learnability of the distinctions. In
general our view here has been of the narrow syntactic computation as itself constituting
the interface between the lexicon and a PF/LF output structure. The learnability of core
local relations ought to fall within a correspondingly local view of where learners find the
information needed to acquire grammar (something along the lines of Lightfoot's
Degree-Zero "plus a little"; Lightfoot 1989). If the ideas here regarding node-identification and
contraction are generally on track, learners needn't have access to anything more than
roughly (traditional) clause-sized objects to acquire the relevant distinctions that pertain
as well to linked-local/SCM-type relationships, as these, I have argued, fall out from the
basic mechanics. But there is still a serious problem to be faced, one with two facets: (i)
it's just not true that all dependencies are reducible to local domains (that this is the case is
what motivates views like that introduced by Pesetsky (1987), as mentioned above), and
(ii) it's just not true that matrix level generalizations carry over to embedded domains.
Regarding (ii), what seems to be the case is something like Ross's (1973) "Penthouse
Principle". Ross offered the following metaphor "whose truth is borne out in myriad
cases of Real Apartment Life":
 
(335) The Penthouse Principle: More goes on upstairs than downstairs. 
 
Life, it seems, is always more exciting in the Penthouse; anything happening downstairs
is sure to be a trendy copy of things that have already been done upstairs. So we needn't 
strain our capacity for decoding metaphors, Ross translates into terms more linguistic: 
 
(336) No syntactic process can apply only in subordinate clauses.
 
This seems to be borne out by many of the SCM-effects discussed in Chapter Two.
Inversions, wh-copying, and local agreement all seem to be required "at the top" if they
hold in embedded domains, and where they hold in embedded domains, they must hold
"all the way down" (typically no domains may be skipped). These matters strike me as
important to investigations concerned with Degree-N learnability and may be a route that
might help us to better understand SCM-type effects. Recall that wh-copying shows up as
well in L1-acquisition of English (see §2), which is a case where the adult/target grammar
does not generally permit such copying. This suggests that learners are not necessarily
conservative in the sense of waiting for positive evidence to alter their grammars to allow
multiple PF-spell-outs of this kind. And if children do not have access to direct negative
evidence that these constructions are not part of the target grammar, then something else
must be in play to allow them to converge on the correct target. The following questions
then can frame further inquiry into these matters:
 
(337) a. Is the Penthouse Principle true? 
 b. If it is true, why? In virtue of what? 
 c. How does whatever underlies this principle affect the presence/absence of
SCM-type effects?
 
Part of the answer to questions a/b, I believe, lies in how we understand the domains that
learners have access to in order to find evidence to sort out where their target language
lies in the UG-governed array of possibilities.
 Question-c is then framed in our TCG approach (deploying the WS/O-distinction)
in terms of how the mechanisms of unselective binding or the like arise, making truly
long(er)-distance relationships possible.
 I will leave these matters here, noting in closing that the general structure of our
account makes room for three kinds of dependencies: (i) those that are purely local, (ii)
those that are linked-local, and (iii) those that are non-local. Further, I have suggested
here a way that natural language grammars might reduce the (ii)-type to the (i)-type. How
the (iii)-type fits in to this view is a job for future inquiry.
 
 Last, consider some remarks from Fodor (1977) regarding grammars and
derivational directionality, which will allow us to sum up some of the key ideas discussed
in this dissertation in a more general way:
 
We might suppose that we could isolate the issue of directionality by comparing two (imaginary)
grammars, G1 and G2, which are identical except that the rules of G1 are the inverses of the rules of
G2 (i.e., the input to each rule of G1 is the output of the corresponding rule of G2, and vice-versa),
and the order of application of the rules in G1 is the inverse of the order of application of the
corresponding rules in G2. The set of structural representations constituting the derivation of a given
sentence would be identical in both grammars, but these structures would be generated in reverse
order. But now notice that where G1 has a deletion rule, the corresponding rule in G2 will be an
insertion rule; where G1 has a rule moving a constituent to the left, the corresponding rule of G2 will
move that constituent to the right.

  G1                        Derivation   G2

  step 1: d is moved to     abcd         step 2: d is moved to
   the left of bc           adbc          right of bc
  step 2: a deletes         dbc          step 1: a is inserted
   before d                               before d

The difference in direction is inevitably accompanied by a difference in the operations that particular
rules perform. [We might] consider the possibility of constraining the rules of a grammar so that
they can perform certain types of operations but not others. We might then be able to decide
between the grammars G1 and G2 on this basis. But let us temporarily abstract from this issue by
supposing that the rules of G1 and G2 all conform to the definition of a possible rule of grammar.
Could there, nevertheless, be some reason for preferring either G1 or G2? The consensus of opinion,
even among those who agree about almost nothing else, appears to be that there would be no
significant difference between the two grammars, as long as the old confusion of grammars with
psychological models of speech production and perception is avoided.
 
The important point in this passage is the observation that "difference in direction
is inevitably accompanied by a difference in the operations that particular rules perform".
It is such a difference between bottom-up and left-to-right/incremental assembly that
Phillips (1996, 2003) exploits: the incremental system seems to allow us to make
reference to "units" that we need for analysis but which are unavailable in a bottom-up
characterization.
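
Fodor's toy contrast between G1 and G2 can be made concrete with a minimal sketch. The code below is only an illustration under simplifying assumptions: structures are treated as plain strings, and the four function names are hypothetical labels for the rules in Fodor's table rather than anything proposed in the source. It shows that the two grammars pass through the same structures in opposite order, while each rule of G2 performs a different kind of operation (insertion, rightward movement) from its G1 counterpart (deletion, leftward movement).

```python
# Toy illustration of Fodor's G1/G2 contrast. Strings stand in for structures;
# all function names are hypothetical labels for the rules in the table.

def move_d_left_of_bc(s: str) -> str:
    """G1, step 1: move d to the left of bc (abcd -> adbc)."""
    return s.replace("bcd", "dbc", 1)


def delete_a_before_d(s: str) -> str:
    """G1, step 2: delete a before d (adbc -> dbc)."""
    return s.replace("ad", "d", 1)


def insert_a_before_d(s: str) -> str:
    """G2, step 1: insert a before d (dbc -> adbc)."""
    return s.replace("d", "ad", 1)


def move_d_right_of_bc(s: str) -> str:
    """G2, step 2: move d to the right of bc (adbc -> abcd)."""
    return s.replace("dbc", "bcd", 1)


def derive(start, rules):
    """Apply the rules in order and record every intermediate structure."""
    steps = [start]
    for rule in rules:
        steps.append(rule(steps[-1]))
    return steps


g1 = derive("abcd", [move_d_left_of_bc, delete_a_before_d])
g2 = derive("dbc", [insert_a_before_d, move_d_right_of_bc])

print(g1)  # structures visited by G1
print(g2)  # the same structures, visited in reverse order
```

Nothing in this sketch favors one grammar over the other; any preference has to come from independent constraints on what counts as a possible rule.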
 We can extract the following main issues from Fodor's passage:

(338) INDEPENDENCE: considering ordering of operations in grammar and
performance-theoretic systems as independent conceptually, what considerations
might lead us to consider one or another possible view of the ordering of
combinatory operations in the grammar?

(339) CORRESPONDENCE: suppose we were to have strong reasons for thinking one or
another global ordering for syntactic structure assembly was correct, how ought
we think about correspondence relationships to operations in parsing and
production – how ought we think about GRAMMAR as embedded in "time"?
 
The sub-part of the passage from Fodor (1977) above contrasting two toy grammars G1
and G2 ends with the fairly reasonable assertion that it is difficult to see how we could
find reason to think there was any real difference between such grammars.
Setting to the side for the moment the present work and the work of Phillips and
others, the state of affairs regarding such questions about directionality in more recent
theory is even a bit more difficult, if anything, than it was at the time of Fodor's writing.
Following the above-quoted discussion, Fodor goes on to contrast two ways of
understanding a rule like wh-movement implicated in (340) and (343), one including a
rule of "wh-fronting" and another including a rule of "wh-backing". The latter rule would
map (342) to (341) and (345) to (344), while the former would do the reverse:
 
(340) Who do you expect to murder Jemia? 
 
(341) Q You Pres expect [WH+pro murder Jemia]   WH-FRONTING   
(342) Q WH+pro you Pres expect [murder Jemia]         WH-BACKING 
 
(343) Who do you expect to murder? 
 
(344) Q You Pres expect [PRO murder WH+pro ]  WH-FRONTING 
(345) Q WH+pro you Pres expect [PRO murder]       WH-BACKING 
 
If direction of derivation does not fundamentally matter then there should be no
non-trivial differences between these derivations. Of course, any present-day comparison
between two such visions of transformational operations differs quite a bit from the
theoretical situation that obtained in the days before the assumption of structure
preservation (Kimball 1972, Emonds 1985).[14] Without structure preservation the rule of
wh-backing appears to require a bit more in the way of additional assumptions than
wh-fronting does.[15] That is, without traces or some kind of marker or variable or the like
designating the internal position to which wh-backing would displace the wh-element,
some additional mechanism would be required to filter out misapplications.
In contrast, the rule of wh-fronting has a salient target: the "edge" of the
sentence.[16] Of course, in Fodor's example things are rigged in favor of wh-fronting given
the presence of an abstract Q-morpheme (a "trigger" morpheme[17]), but with no
corresponding marker for the base (trace/copy or thematic) position. If we embrace
structure preservation, then our contrast between wh-fronting and wh-backing looks like
this, where we simply switch the direction of the arrows on our "movement" notation
(assuming copies rather than traces for the moment):
 
 
                                                
[14] Kimball was, to my knowledge, the first to observe the interest of a restriction stated in terms of the kinds
of objects producible in principle by phrase-structure rules and the kinds of objects producible by transformational
operations. The idea of structure preservation really caught on, however, following the work of Emonds (1970, 1985),
who used this as a general restriction which enabled him to differentiate between structure-preserving and
non-structure-preserving operations (e.g., his root transformations were of the latter type). The development of this
notion with respect to traces/copies put this notion on even more solid ground, though it fell somewhat into the
background as a general kind of constraint on operations and the structure of the grammar. See Newmeyer (1986) for a
good discussion of this history.
[15] Fodor observes this, but includes no discussion of "structure-preservation" in this context.
[16] That is, even in absence of "trigger morphemes" or a designated landing site for such transformations,
wh-fronting appears to require less information to apply correctly. This is all assuming, of course, that the
transformational approach to these matters is the right way to go. There exist context-free grammars (e.g., GPSG)
which incorporate rules with complex ("slash") symbols which make the kind of asymmetry Fodor is pointing to
irrelevant. Such grammars, like the context-free rules we examined above, can operate trivially either from terminals
to "S" or from "S" to terminals. Whether or not the kind of asymmetries Fodor is pointing to here still can be found
in more recent models of syntax is one way of stating the major theme of this work.
[17] Or, for more recent versions of the "trigger morpheme" idea, consult almost any recent article which has
the words "minimalism" or "derivation" or "checking" in the abstract key-words line.
 
(340) Who do you expect to murder Jemia? 
 
(341)' Q WH+pro you Pres expect [WH+pro murder Jemia] wh-backing 
 
(342)' Q WH+pro you Pres expect [WH+pro murder Jemia] wh-fronting 
 
(343) Who do you expect to murder? 
 
(344)' Q WH+pro you Pres expect [PRO murder WH+pro] wh-backing 
 
(345)' Q WH+pro you Pres expect [PRO murder WH+pro] wh-fronting 
 
In more modern terms, we have a difference between an operation which raises the
wh-element, leaving a null copy or a trace, versus an operation which lowers a null copy/trace.
Or, more neutrally, we have some minimal formal way of establishing a dependency
between two positions in a structure, and attendant (interpretative?) processes which sort
out where to pronounce and interpret what.
Given structure preservation, in other words, the issue of derivational
directionality becomes quite a bit foggier than it was at the time Fodor's book was
published (27 years ago). Consider some further remarks which Fodor makes on
wh-backing, which reveal the present point about structure preservation nicely (despite
including no explicit mention of structure preservation as such; bold emphasis mine):[18]
 
WH-backing knows which noun phrase to move, but how it could know where to move it? What
indicates that there is an appropriate gap for the interrogative pronoun to move into at the end of
[(343)] but not at the end of [(340)]? A gap, after all, is just a nothing. Two words are adjacent
that otherwise would not have been. The information that determines where there is a gap, and
which gap has to be filled by the WH-Backing transformation, is information about the deep
structures of these sentences, and about other transformations that do and do not apply in their
derivations. [...] for the WH-Backing transformation to apply correctly, it would need information
about structures in the derivation of a sentence which are 'deeper' than the one on which it
operates, i.e. structures which are generated only AFTER WH-Backing itself has applied. By contrast,
the standard WH-Fronting rule is self-sufficient; it can apply correctly without 'looking ahead' to
later stages of the derivation. The reason for this difference is that WH-Fronting parallels, while
WH-Backing opposes, the direction of flow of information between structures like [(341)] and [(342)].
Before WH-Fronting applies, the position of the interrogative pronoun indicates its syntactic and
semantic role in the sentence. But this information is lost when all interrogative pronouns are
moved into the same position at the front of the sentence. Thus, [(341)] contains more information
(in this respect) than [(342)]; [(341)] contains enough information to determine [(342)], but
[(342)] does not contain enough to determine [(341)]. Other transformations (e.g., Passive, Particle
Movement) apparently involve no loss of information and hence determine a unique output when applied
in either direction. An asymmetry in information content between two adjacent structural
representations in a derivation thus gives some content to the notion of the direction of the rule
(pp. 10-12).

[18] This passage also raises questions about "look-ahead" which have become relevant in much current
derivational syntactic theory.
 
With the notion of structure preservation in place, the informational asymmetry Fodor
points to with respect to wh-fronting versus wh-backing no longer holds – at least, not
obviously.
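
The asymmetry, and its disappearance under structure preservation, can be illustrated with a small sketch. The following is a toy model under loud assumptions: structures are flat token lists modeled on (344)/(345), the function names and the `<WH+pro>` copy marker are hypothetical, and "any slot inside the bracketed clause" is a crude stand-in for the set of positions wh-backing might target. Without a copy, fronting is many-to-one, so its inverse has several candidate outputs; with a copy left behind, the mapping can be undone uniquely.

```python
# Toy sketch of the informational asymmetry; token lists and names are assumptions.
WH = "WH+pro"
COPY = "<WH+pro>"  # hypothetical marker for an unpronounced lower copy


def wh_fronting(tokens):
    """Front WH to the slot right after Q, leaving nothing behind."""
    rest = [t for t in tokens if t != WH]
    q = rest.index("Q")
    return rest[: q + 1] + [WH] + rest[q + 1:]


def backing_candidates(fronted):
    """Every embedded-clause placement of WH that wh_fronting maps to `fronted`."""
    rest = [t for t in fronted if t != WH]
    start, end = rest.index("["), rest.index("]")
    return [rest[:i] + [WH] + rest[i:] for i in range(start + 1, end + 1)]


def wh_fronting_with_copy(tokens):
    """Structure-preserving variant: the base position keeps a copy marker."""
    i = tokens.index(WH)
    marked = tokens[:i] + [COPY] + tokens[i + 1:]
    q = marked.index("Q")
    return marked[: q + 1] + [WH] + marked[q + 1:]


def wh_backing_with_copy(tokens):
    """Inverse of the variant above: the copy marker says where WH belongs."""
    rest = [t for t in tokens if t != WH]
    i = rest.index(COPY)
    return rest[:i] + [WH] + rest[i + 1:]


base = "Q you Pres expect [ PRO murder WH+pro ]".split()  # cf. (344)
fronted = wh_fronting(base)                                # cf. (345)
candidates = backing_candidates(fronted)

# Several distinct inputs collapse onto the same fronted structure ...
print(len(candidates), all(wh_fronting(c) == fronted for c in candidates))
# ... whereas the copy-leaving variant can be undone without look-ahead.
print(wh_backing_with_copy(wh_fronting_with_copy(base)) == base)
```

The design choice worth noting is that the copy marker does the work Fodor attributes to deep-structure information: once the lower position is recorded in the representation itself, neither direction of rule application needs to consult other stages of the derivation.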
 But these aspects of Fodor's discussion regarding informational asymmetries
serve to bring some issues into the foreground rather clearly for our purposes here, even
if they rely on now outdated assumptions. Note the part of the above passage regarding
the issue of wh-backing "opposing" the "flow of information" and the matter of
information loss in derivations. These issues become relevant – though in a different
way – in the context of current derivational minimalist syntactic theory if we consider
the notions of "phases" and the like within MSO-systems, which exhibit what we have
called 'expand/contract dynamics'. So we now have encountered a possible motivation –
albeit quite general and abstract – for pursuit of one or another global directionality for
syntactic derivations: the potential existence of informational asymmetries.
 In this work we have built an approach to grammar which capitalizes on such
asymmetries, suggesting that what underlies SCM-type effects is an informational
superset relation (subsumption) which holds between positions hosting "intermediate"
positions and matrix positions where core licensing properties are related. This is thus an
argument for a particular direction of derivation that focuses on purely
competence-theoretic issues. What of the matters of correspondence then – that is, the
relation between grammar and parser?
 I have not addressed matters of performance in this work at all, but to wind up
this closing discussion, at least the following two points are of interest for further pursuit
in the present framework. First, as mentioned in Chapter One, the TCG mechanics make
available a grammar-based conception of how "displaced" elements may be buffered in a
sense (kept in the workspace) so that they may be integrated in some lower domain. But
our above discussion regarding the possibility of other mechanisms that handle such
dependencies differently raises the question of how these two different sets of
mechanisms – one that is WS-local, and one that functions over the output structure –
may or may not interact in on-line processing. Second, the top-down perspective, as it has
been wed here with our workspace ordering and distinctness constraints, suggests some
ways of beginning to think about how categories/features and ordering information might
be mapped to "time". Chomsky's term "phases" may turn out to be particularly apt in the
sense that we might investigate ways in which categories/features could be understood to
have a duration – that is, a time-course in which these properties are "active" in on-line
processing. The general view of categories/features and ordering pursued here suggests
that sameness/difference may matter for determining an abstract sort of "chain
constituency" that is important to understanding local structures and how they may
overlap (or not). This may be potentially translatable onto a time-axis in a way that
preserves the informational groupings that same/different properties have been suggested
to effect along the dominance ordering. This is a fairly general, somewhat vague
suggestion, to be sure, but I believe something along these lines may be the key to
understanding the various ways that we might understand grammar as relating to time.
 
 
 
References 
Abels, K. 2003. Successive Cyclicity, Anti-Locality, and Adposition Stranding. Doctoral
dissertation, University of Connecticut, Storrs.
Abney, S. & M. Johnson. 1991. Memory Requirements and Local Ambiguities for
Parsing Strategies. Journal of Psycholinguistic Research 20(3):233-250.
Adger, D. & G. Ramchand. 2003. Merge vs. Move: Wh-Dependencies Revisited. Ms.
University of London.
Aoun, J. 1985. The Grammar of Anaphora. Cambridge, MA: MIT Press.
Asudeh, A. 2004. Resumption as Resource Management. Doctoral dissertation. Stanford
University.
Bach, E. 1986. The Algebra of Events. Linguistics & Philosophy 9:5-16.
Baker, C. L. 1970. Notes on the Description of English Questions: The Role of an
Abstract Question Morpheme. Foundations of Language 6, 197-219.
Baker, M., K. Johnson, & I. Roberts. 1987. Passive Arguments Raised. Linguistic Inquiry
20: 219-251.
Baković, E. 1995. A Markedness Subhierarchy in Syntax: Optimality and Inversion in
Spanish. Ms. Rutgers University.
Barss, A. 2001. Syntactic Reconstruction Effects. In C. Collins & M. Baltin (eds.) The
Handbook of Contemporary Syntactic Theory, Oxford: Blackwell, 670-696.
Beghelli, F. & T. Stowell. 1997. Distributivity and Negation: the syntax of each and
every. In A. Szabolcsi (ed.) Ways of Scope Taking, Dordrecht: Kluwer, 71-107.
Belletti, A. 1988. The Case of Unaccusatives. Linguistic Inquiry 19:1-34.
Belletti, A. 2001a. Agreement Projections. In C. Collins & M. Baltin (eds.) The
Handbook of Contemporary Syntactic Theory, Oxford: Blackwell, 483-510.
Belletti, A. 2001b. 'Inversion' as Focalization. In A. Hulk & J.-Y. Pollock (eds.) Subject
Inversion in Romance and the Theory of Universal Grammar, Oxford: Oxford
University Press, 60-90.
Belletti, A. 2003. Aspects of the low IP area. In Luigi Rizzi (ed.) The Structure of IP and
CP: The Cartography of Syntactic Structures, Oxford: Oxford University Press.
Benmamoun, A. 1998. Spec-Head Agreement & Overt Case in Arabic. In D. Adger et al.
(eds.) Specifiers: Minimalist Approaches, Oxford: Oxford University Press, 110-125.
Berwick, R. 1985. The Acquisition of Syntactic Knowledge. Cambridge, MA: MIT Press.
 
Berwick, R. & A. Weinberg. 1985. The Grammatical Basis of Linguistic Performance.
Cambridge: MIT Press.
Bittner, M. 1994. Case, Scope, & Binding. Dordrecht: Kluwer.
Bittner, M. & K. Hale. 1996. The Structural Determination of Case and Agreement.
Linguistic Inquiry 27(1), 1-68.
Bobaljik, J. 1995. Morphosyntax: The Syntax of Verbal Inflection. Doctoral dissertation.
MIT.
Bobaljik, J. 2002. Floated Quantifiers: Handle with Care. In L. Cheng & R. Sybesma
(eds.) The Second State of the Article Book. Berlin: Mouton de Gruyter.
Boeckx, C. 2000. EPP Eliminated. Ms. University of Connecticut.
Boeckx, C. 2003. Islands & Chains. Amsterdam: John Benjamins.
Boeckx, C. & N. Hornstein. 2003. Reply to 'Control is not Movement'. Linguistic Inquiry
34/2: 269-280.
Boeckx, C. & N. Hornstein. 2004. Movement under Control. Linguistic Inquiry 35/3:
431-452.
Bošković, Z. 2002. A-Movement and the EPP. Syntax 5: 167-218.
Bresnan, J. 1971. Contraction and the Transformational Cycle. Ms. MIT, Cambridge,
MA.
Brody, M. 2000. Mirror Theory: Syntactic Representation in Perfect Syntax. Linguistic
Inquiry 31(1) 29-56.
Brody, M. 2003. Towards an Elegant Syntax. London: Routledge.
Butler, J. 2004a. Phase Structure, Phrase Structure, & Quantification. Doctoral
dissertation, University of York.
Butler, J. 2004b. On Having Arguments and Agreeing: A Semantic EPP. York Papers in
Linguistics 1, 1-27.
Castillo, J.C., J. Drury, & K. K. Grohmann. Localizing Procrastinate. Ms. University of
Maryland, College Park.
Castillo, J.C., J. Drury, & K. K. Grohmann. 1999. Merge over Move and the Extended
Projection Principle. In S. Aoshima, J. Drury, & T. Neuvonen (eds.) University of
Maryland Working Papers in Linguistics 8, 63-103.
Castillo, J. C. & J. Uriagereka. 2000. A Note on Successive Cyclicity. In M. Guimarães,
L. Meroni, C. Rodrigues, & I. San Martin (eds.) University of Maryland Working
Papers in Linguistics 9, 1-13.
Chametzky, R. 1996. A Theory of Phrase Markers and the Extended Base. Albany:
SUNY Press.
 
Chametzky, R. 2000. Phrase Structure: From GB to Minimalism. Oxford: Blackwell.
Chomsky, N. 1955/1975. The Logical Structure of Linguistic Theory. New York: Plenum.
Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, N. 1973. Conditions on Transformations. In S. Anderson & P. Kiparsky (eds.)
A Festschrift for Morris Halle, New York: Holt, Rinehart, & Winston, 232-286.
Chomsky, N. 1980. Rules and Representations. New York: Columbia University Press.
Chomsky, N. 1981. Lectures on Government & Binding. Dordrecht: Foris.
Chomsky, N. 1982. Some Concepts and Consequences of the Theory of Government and
Binding. Cambridge, MA: MIT Press.
Chomsky, N. 1986a. Barriers. Cambridge, MA: MIT Press.
Chomsky, N. 1986b. Knowledge of Language: Its Nature, Origin, & Use. New York:
Praeger.
Chomsky, N. 1993. A Minimalist Program for Linguistic Theory. In K. Hale & S. J.
Keyser (eds.) The View from Building 20: Essays in Linguistics in Honor of
Sylvain Bromberger, Cambridge, MA: MIT Press, 1-52.
Chomsky, N. 1994. Bare Phrase Structure. MIT Occasional Papers in Linguistics, 5.
Chomsky, N. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N. 1998. Minimalist Inquiries: The Framework. MIT Working Papers In
Linguistics.
Chomsky, N. 1999. Derivation by Phase. Ms., MIT.
Clements, 1975. The Logophoric Pronoun in Ewe: Its Role in Discourse. Journal of West
African Languages 10: 141-177.
Collins, C. 1997. Local Economy. Cambridge, MA: MIT Press.
Collins, C. 2000. Eliminating Labels. Ms. Cornell University.
Davidson, D. 1967. The Logical Form of Action Sentences. In N. Rescher (ed.) The
Logic of Decision and Action. U. of Pittsburgh Press.
Demuth, K. & J. Gruber. 1994. Constraining XP Sequences. Ms. Brown University and
UQAM.
De Villiers, J., T. Roeper, and A. Vainikka. 1990. The acquisition of long-distance rules.
In L. Frazier and J. de Villiers (eds.) Language Processing and Language
Acquisition, Dordrecht: Kluwer, 257-297.
den Dikken, M. & A. Szabolcsi. 2002. Islands. In L. Cheng & R. Sybesma (eds.) The
Second State of the Article Book. Berlin: Mouton de Gruyter.
 
Di Sciullo, A. & E. Williams. 1987. On the Definition of Word. Cambridge, MA: MIT
Press.
Drury, J. 1998a. Root-First Derivations: Atomic Merge & The Coresidence Theory of
Movement. Ms. University of Maryland, College Park.
Drury, J. 1998b. The Promise of Derivations: Atomic Merge & Multiple Spell-Out.
Groninger Arbeiten zur germanistischen Linguistik 42, 61-108.
Drury, J. 1999. The mechanics of ?-derivations. In S. Aoshima et al. (eds.) UMWPiL 8.
Drury, J. 2000. Some Thoughts on Generalizing Transmission. In M. Guimarães, L.
Meroni, C. Rodrigues, & I. San Martin (eds.) University of Maryland Working
Papers in Linguistics 9, 1-13.
Drury, J. 2005. The Order of Form. Ms. Georgetown University, Washington D.C.
Du Plessis, H. 1977. Wh-Movement in Afrikaans. Linguistic Inquiry 8, 723-726.
Emonds, J. 1970. Root and Structure-Preserving Transformations. Indiana University
Linguistics Club Publication.
Emonds, J. 1976. A Transformational Approach to English Syntax. New York: Academic
Press.
Emonds, J. 1985. A Uniform Theory of Syntactic Categories. Dordrecht: Foris.
Epstein, S., E. Groat, R. Kawashima, & H. Kitahara. 1998. A Derivational Approach to
Syntactic Relations. New York: Oxford University Press.
Felser, C. 2004. Wh-Copying, phases, and successive cyclicity. Lingua 114, 543-574.
Fodor, J. D. 1977. Semantics: Theories of Meaning in Generative Grammar. Hassocks,
Eng.: Harvester Press.
Fodor, J. 1983. Modularity of Mind. Cambridge, MA: MIT Press.
Fodor, J. 2000. The Mind Doesn't Work That Way. Cambridge, MA: MIT Press.
Fox, D. 1998. Economy and Semantic Interpretation. Cambridge, MA: MIT Press.
Frampton, J. & S. Guttman. 2000. Agreement is Feature-Sharing. Ms. Northwestern
University.
Frank, R. 1992. Syntactic Locality and Tree Adjoining Grammar: Grammatical,
Acquisition, and Processing Perspectives. Doctoral dissertation. University of
Pennsylvania.
Frank, R. 2002. Phrase Structure Composition and Syntactic Dependencies. Cambridge,
MA: MIT Press.
Frank, R. & A. Kroch. 1995. Generalized Transformations and the Theory of Grammar.
Studia Linguistica 49, 103-151.
 
Fukui, N. 1986. A Theory of Category Projection and its Applications. Doctoral
dissertation, MIT.
Fukui, N. & M. Speas. 1986. Specifiers and Projections. In N. Fukui et al. (eds.) MIT
Working Papers in Linguistics 8.
Grimshaw, J. 1991. Extended Projections. Ms. Brandeis University.
Grimshaw, J. 1997. Projection, Heads, and Optimality. Linguistic Inquiry 28, 373-422.
Groat, E. & J. O'Neil. 1996. Spell-Out at the LF Interface. In W. Abraham, S. Epstein, H.
Thrainsson, & J.-W. Zwart (eds.) Minimal Ideas, Amsterdam: John Benjamins,
305-327.
Grohmann, K. K. 2003. Prolific Domains. Amsterdam: John Benjamins.
Grohmann, K. K., J. Drury, & J. C. Castillo. 2000. No More EPP. In R. Billerey & B.D.
Lillehaugen (eds), WCCFL 19: Proceedings of the 19th West Coast Conference
on Formal Linguistics. Cascadilla Press.
Guimarães, M. 1999. Phonological Cascades & Intonational Structure in Dynamic
Top-Down Syntax. Ms. UMCP.
Guimarães, M. 2004. Derivation and Representation of Syntactic Amalgams. Doctoral
dissertation. UMCP.
Hale, K. & S. J. Keyser. 1993. On Argument Structure and the Lexical Expression of
Syntactic Relations. In K. Hale & S. J. Keyser (eds.) The View from Building 20:
Essays in Linguistics in Honor of Sylvain Bromberger, Cambridge, MA: MIT
Press.
Hale, K. & S. J. Keyser. 2002. Prolegomenon to a Theory of Argument Structure.
Cambridge, MA: MIT Press.
Hallé, F., R. A. A. Oldeman, & P. B. Tomlinson. 1978. Tropical trees and forests: An
architectural analysis. Berlin: Springer-Verlag.
Heck, F. and G. Müller. 2000. Successive cyclicity, long-distance superiority, and local
optimization. Proceedings of WCCFL 19, 101-114.
Henry, A. 1995. Belfast English and Standard English: Dialect Variation and Parameter
Setting. Oxford: Oxford University Press.
Herburger, E. 2000. What Counts. Cambridge, MA: MIT Press.
Hiemstra, I. 1986. Some aspects of Wh-Questions in Frisian. North-Western European
Language Evolution (NOWELE) 8, 97-110.
Higginbotham, J. 1985. On Semantics. Linguistic Inquiry 16/4: 547-593.
Higginbotham, J. 1988. Contexts, Models, and Meanings: A Note on the Data of
Semantics. In R. Kempson (ed.) Mental Representations: The Interface Between
Language and Reality, Cambridge: Cambridge University Press, 29-48.
 
Hornstein, N. 2001. Move! A Minimalist Theory of Construal. Oxford: Blackwell.
Huang, J. 1982. Logical Relations in Chinese and the Theory of Grammar. Doctoral
dissertation, MIT.
Jackendoff, R. 1972. Semantic Interpretation in Generative Grammar. Cambridge, MA:
MIT Press.
Jackendoff, R. 1977. X-bar Syntax. Cambridge, MA: MIT Press.
Jayaseelan, K. A. 2001. IP-internal Topic and Focus Phrases. Studia Linguistica 55: 39-75.
Jenkins, L. 2000. Biolinguistics. Cambridge: Cambridge University Press.
Johnson, K. 1991. Object Positions. Natural Language & Linguistic Theory 9: 577-636.
Kayne, R. 1975. French Syntax. Cambridge, MA: MIT Press.
Kayne, R. 1984. Connectedness and Binary Branching. Dordrecht: Foris.
Kayne, R. 1989. Facets of Romance Past Participle Agreement. In P. Beninca (ed.)
Dialect Variation and the Theory of Grammar, Dordrecht: Foris, 85-103.
Kayne, R. 1994. The Antisymmetry of Syntax. Cambridge, MA: MIT Press.
Kayne, R. & J.-Y. Pollock. 1978. Stylistic Inversion, Successive Cyclicity, and Move NP
in French. Linguistic Inquiry 9: 595-621.
Kimball, J. 1972. Cyclic and Linear Grammars. In J. Kimball (ed), Syntax and Semantics
1, New York: Seminar Press, pp 63-80.
Klima, E. 1964. Negation in English. In J. Fodor & J. Katz (eds.) The Structure of
Language. Englewood Cliffs, NJ: Prentice-Hall: 246-323.
Koizumi, M. 1993. Object Agreement Phrases and the Split-VP Hypothesis. In J.
Bobaljik & C. Phillips (eds.) Papers on Case & Agreement I, MITWPL 18.
Koizumi, M. 1995. Phrase Structure in Minimalist Syntax. Doctoral dissertation. MIT.
Kratzer, A. 1996. Severing the External Argument from its Verb. In J. Rooryck & L.
Zaring (eds.) Phrase Structure and the Lexicon, Dordrecht: Kluwer.
Kulick, S. 2000. Constraining Non-Local Dependencies in Tree Adjoining Grammar:
Computational and Linguistic Perspectives. Doctoral dissertation. UPenn.
Landau, I. 2000. Elements of Control. Dordrecht: Kluwer.
Landau, I. 2004. Movement out of Control. Linguistic Inquiry 34(3), 470-498.
Lasnik, H. 1995. A Note on Pseudogapping. In Papers on Minimalist Syntax MITWPL
27, 142-163.
Lasnik, H. 1999. Minimalist Analysis. Oxford: Blackwell.
 
Lasnik, H. & M. Saito. 1991. Move-α. Cambridge, MA: MIT Press.
Lasnik, H. & J. Uriagereka. Forthcoming. A Course in Minimalist Syntax. Oxford:
Blackwell.
Lightfoot, D. 1989. The Child's Trigger Experience: Degree-0 Learnability. Behavioral &
Brain Sciences 12: 321-334.
Manzini, R. & A. Roussou. 2000. A Minimalist theory of A-movement and Control.
Lingua 110, 409-447.
May, R. 1985. Logical Form. Cambridge, MA: MIT Press.
McCloskey, J. 2000. Quantifier Float and Wh-Movement in an Irish English. Linguistic
Inquiry 31:1, 57-84.
McDaniel, D. 1989. Partial and Multiple Wh-movement. Natural Language and
Linguistic Theory 7, 565-604.
McDaniel, D., B. Chui, & T. Maxfield. 1995. Parameters for Wh-Movement Types:
Evidence from Child English. Natural Language and Linguistic Theory 13, 709-753.
McKinnon, R. & L. Osterhout. 1996. Constraints on Movement Phenomena in Sentence
Processing: Evidence from Event-Related Brain Potentials. Language &
Cognitive Processes 11, 495-524.
Muysken, P. 1982. Parameterizing the notion "Head". Journal of Linguistic Research 2,
57-75.
Nunes, J. 1995. The Copy Theory of Movement and Linearization of Chains in the
Minimalist Program. Doctoral dissertation. University of Maryland.
Parsons, T. 1990. Events in the Semantics of English. Cambridge, MA: MIT Press.
Pesetsky, D. 1982. Paths & Categories. Doctoral dissertation. MIT.
Pesetsky, D. 1987. Wh-in-Situ: Movement and Unselective Binding. In E. Reuland & A.
G. B. ter Meulen (eds), The Representation of (In)definiteness. Cambridge, MA:
MIT Press.
Pesetsky, D. & E. Torrego. 2001. T-to-C Movement: Causes and Consequences. In M.
Kenstowicz (ed.) Ken Hale: A Life in Language, Cambridge, MA: MIT Press,
355-426.
Pesetsky, D. & E. Torrego. 2002. Tense, Case, and the Nature of Syntactic Categories.
Ms. MIT & U. of Massachusetts Boston [in J. Gueron & J. Lecarme (eds.) The
Syntax of Time, Cambridge, MIT Press].
Phillips, C. 1996. Order and Structure. Doctoral dissertation. MIT.
Phillips, C. 2003. Linear Order and Constituency. Linguistic Inquiry 34, 37-90.
Postal, P. 1974. On Raising. Cambridge, MA: MIT Press.
 
Potsdam, E. & J. Runner. 2003. Richard Returns: Copy Raising and Its Implications.
Proceedings of CLS 2001.
Quine, W.V.O. 1940. Mathematical Logic. Harvard University Press.
Reinhart, T. 2000. The Theta System: Syntactic Realization of Verbal Concepts. OTS
Working Papers in Linguistics.
Reinhart, T. & E. Reuland. 1993. Reflexivity. Linguistic Inquiry 24: 657-720.
Reuland, E. & M. Everaert. 2001. Deconstructing Binding. In C. Collins & M. Baltin
(eds.) The Handbook of Contemporary Syntactic Theory, Oxford: Blackwell, 634-669.
Resnik, P. 1992. Left Corner Parsing & Psychological Plausibility. Proceedings of the
14th International Conference on Computational Linguistics (COLING '92).
Nantes, France.
Rezac, M. 2004. Elements of Cyclic Syntax: Agree and Merge. Doctoral dissertation.
University of Toronto.
Rogers, A. 1971. Three kinds of physical perception verbs. Papers from the Eighth
Regional Meeting of the Chicago Linguistics Society, Chicago: CLS, 303-315.
Rosenbaum, P. 1967. The Grammar of English Predicate Complement Constructions.
Cambridge, MA: MIT Press.
Ross, J. R. 1973. The Penthouse Principle and the Order of Constituents. Papers from the
Comparative Syntax Festival (CLS 9).
Runner, J. 1995. Noun Phrase Licensing and Interpretation. Doctoral dissertation.
University of Massachusetts.
Runner, J. To Appear. The Accusative Plus Infinitive Construction in English. In B.
Hollebrandse and R. Goedemans (eds.) The Syntax Companion. Oxford:
Blackwell.
Sabel, J. 2000. Partial Wh-Movement and the Typology of Wh-Questions. In U. Lutz, G.
Müller, A. von Stechow (eds), Wh-Scope Marking. John Benjamins, Amsterdam
and Philadelphia, pp 409-446.
Saddy, D. 1991. Wh-scope Mechanisms in Bahasa Indonesia. In L. Cheng & H.
Demirdache (eds), MIT Working Papers in Linguistics 15. MITWPL, Cambridge
MA: pp 183-218.
Sauerland, U. 1995. The Lemmings Theory of Case. Paper presented at the 7th Student
Conference in Linguistics (SCIL 7), University of Connecticut.
Schein, B. 1992. Plurals and Events. Cambridge, MA: MIT Press.
Sells, P. 1987. Aspects of Logophoricity. Linguistic Inquiry 18, 445-479.
 
Shieber, S. 1986. An Introduction to Unification-Based Approaches to Grammar. CSLI:
University of Chicago Press.
Takahashi, D. 1994. Minimality of Movement. Doctoral dissertation. University of
Connecticut.
Terada, H. 1999. Successive Cyclicity and Incrementality. English Linguistics 16/2: 243-274.
Tesnière, L. 1959. Éléments de Syntaxe Structurale. Klincksieck, Paris.
Thornton, R. 1990. Adventures in Long-Distance Moving: The Acquisition of Complex
Wh-Questions. Doctoral dissertation, University of Connecticut.
Torrego, E. 1984. On Inversion in Spanish and Some of its Effects. Linguistic Inquiry 15:
103-129.
Ura, H. 1998. Checking, economy, and copy-raising in Igbo. Linguistic Analysis 28: 67-88.
Uriagereka, J. 1998. Rhyme & Reason: An Introduction to Minimalist Syntax. Cambridge,
MA: MIT Press.
Uriagereka, J. 1999. Multiple Spell Out. In S. Epstein & N. Hornstein (eds.) Working
Minimalism, Cambridge, MA: MIT Press, 251-282.
Uriagereka, J. 2002a. Derivations: Exploring the Dynamics of Syntax. London:
Routledge.
Uriagereka, J. 2002b. Spell Out Consequences. Ms. UMCP.
van Riemsdijk, H. 1998. Categorial Feature Magnetism: The Endocentricity and
Distribution of Projections. Journal of Comparative Germanic Linguistics 2: 1-48.
Vermeulen, R. 2002. Multiple Nominative Constructions in Japanese. In M. Cazloi-
Goeta et al. (eds.) Fifth Durham Postgraduate Conference in Theoretical and
Applied Linguistics: Conference Proceedings 224-233.
Watanabe, A. 1995. The Conceptual Basis of Cyclicity. In R. Pensalfini & H. Ura (eds.)
MITWPL 27, 269-291.
Weinberg, A. 1999. A Minimalist Theory of Human Sentence Processing. In S. Epstein
& N. Hornstein (eds.) Working Minimalism, Cambridge, MA: MIT Press, 282-314.
Williams, E. 1994. Thematic Structure in Syntax. Cambridge, MA: MIT Press.
Zwart, C. J.-W. 1993. Dutch Syntax: A Minimalist Approach. Doctoral dissertation.
University of Groningen.
 
Zwart, C. J.-W. 1996. Shortest Move vs. Fewest Steps. In W. Abraham, S. Epstein, H. 
Thrainsson, & J.-W. Zwart (eds.) Minimal Ideas, Amsterdam: John Benjamins, 
305-327.