ABSTRACT

Title of dissertation: THE STRUCTURE OF MEMORY MEETS MEMORY FOR STRUCTURE IN LINGUISTIC COGNITION

Matthew Webb Wagers, Doctor of Philosophy, 2008

Dissertation directed by: Professor Colin Phillips, Department of Linguistics

This dissertation is concerned with the problem of how structured linguistic representations interact with the architecture of human memory. Much recent work has attempted to unify real-time linguistic memory with a general content-addressable architecture (Lewis & Vasishth, 2005; McElree, 2006). Because grammatical principles and constraints are strongly relational in nature, and linguistic representation is hierarchical, this kind of architecture is not well suited to restricting the search of memory to grammatically-licensed constituents alone. This dissertation investigates the conditions under which real-time language comprehension is grammatically accurate.

Two kinds of grammatical dependencies were examined in reading time and speeded grammaticality experiments: subject-verb agreement licensing in agreement attraction configurations (“The runners who the driver wave to ...”; Kimball & Aissen, 1971; Bock & Miller, 1991), and active completion of wh-dependencies. We develop a simple formal model of agreement attraction in an associative memory that makes accurate predictions across different structures.

We conclude that dependencies that can be licensed only retrospectively, by searching the memory to generate candidate analyses, are the most prone to grammatical infidelity. The exception may be retrospective searches with especially strong contextual restrictions, as in reflexive anaphora. However, dependencies that can be licensed principally by a prospective search, like wh-dependencies or backwards anaphora, are highly grammatically accurate.
THE STRUCTURE OF MEMORY MEETS MEMORY FOR STRUCTURE IN LINGUISTIC COGNITION

by Matthew Webb Wagers

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park, in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2008

Committee:
Professor Colin Phillips, Chair
Professor Norbert Hornstein
Associate Professor Jeffrey Lidz
Professor Amy Weinberg
Associate Professor Michael Dougherty, Dean’s Representative

© Copyright by Matthew Webb Wagers 2008

Preface

Chapter 2 reports research that was jointly conducted with fellow graduate student Ellen Lau. Parts of Chapter 2 were submitted together with Ms. Lau as a co-authored journal article.

Acknowledgments

I would like to express my deepest gratitude to Colin Phillips. I always felt that he took a chance on my graduate career, and I thank him for his willingness, attention, and patience in guiding me. I am grateful for the insights we have been able to share with one another these past years. (Also: despite Colin’s best, if not dogged, efforts over the years, I do not think I have yet been cured of my “parentheticalitis.” I am nonetheless indebted to him for many improvements in the clarity of the presentation of the dissertation.)

I wish also to thank my committee, who were both flexible and generous with their time.

To all my friends, at Maryland and beyond, there is no adequate or comprehensive expression of gratitude for the support and joy you have provided me. In particular, I want to thank Ellen Lau. Some back-of-the-room whispering with Ellen during Mark Baker’s 2005 Blackwell lectures led to the agreement attraction studies reported here. Working with Ellen since then has been a constant pleasure, and I have learned much from being her collaborator and her friend. I also thank Clare Stroud for her friendship. Clare was my officemate for five years, and I have no doubt that our many Noodles excursions were a crucial part of both our graduate careers.
I also thank Brian Dillon and Ming Xiang, who have been puzzling out many of the same issues as I have. This dissertation is improved in no small part because of our many discussions in the lab. Fortunately we were also able to spend much time together outside the lab.

Many of the ideas in this dissertation sprang from Colin Phillips and David Poeppel’s Cognitive Neuroscience of Language seminar; Amy Weinberg and Philip Resnik’s Computational Psycholinguistics seminar; and Colin Phillips and Jeff Lidz’s Psycholinguistics seminar. And I cannot imagine how those ideas would have germinated, were it not for frequent cookies and discussions with Norbert Hornstein.

I am also appreciative of the feedback and encouragement I have received from researchers outside the University of Maryland community, including Bill Badecker, Lyn Frazier, Rick Lewis, Gary Marcus, Brian McElree, Neal Pearlmutter, Julie Van Dyke and the CUNY Psycholinguistics Supper group.

I would also like to acknowledge the Graduate School. During the writing of the dissertation, I was supported by a Wylie Dissertation Fellowship.

Finally, Phil Longo played noun-noun compounding with me (yes, it is a game), taught me to eat carrots, and always made sure I went to the movies. I cannot imagine finishing this research, or doing much else really, without his love and support.
Table of Contents

1 INTRODUCTION
1.1 THE CHALLENGE OF NAVIGATING STRUCTURE IN REAL-TIME
1.2 GRAMMATICAL FIDELITY, GRAMMATICAL FALLIBILITY
1.3 OUTLINE OF THE DISSERTATION
1.3.1 Chapter 2: Agreement attraction and selective fallibility
1.3.2 Chapter 3: The trouble with subjects
1.3.3 Chapter 4: Active dependency formation and mechanisms for the accurate recognition of grammatical dependencies
2 AGREEMENT ATTRACTION AND SELECTIVE FALLIBILITY: BINDING AND ACCESSING FEATURES IN COMPLEX SYNTACTIC OBJECTS, PART I
2.1 INTRODUCTION
2.1.1 Agreement attraction: what’s at stake
2.1.2 Outline of the chapter
2.2 AN OVERVIEW OF AGREEMENT ATTRACTION
2.3 PREVIOUS STUDIES OF AGREEMENT PRODUCTION AND THE HIERARCHICAL NATURE OF ATTRACTION
2.3.1 Hierarchical, not linear distance, matters
2.3.2 The attractor’s “structural domain” matters
2.3.3 Ordering of verb and attractor does not matter
2.4 THE FEATURE PERCOLATION ACCOUNT OF AGREEMENT ATTRACTION
2.5 IMPLICATIONS OF FEATURE PERCOLATION AND OBJECTIONS
2.5.1 Erroneous feature percolation as erroneous rule application
2.5.2 Erroneous feature percolation as uncontrolled spreading activation
2.5.3 Summary of erroneous feature percolation
2.6 THE SIMULTANEITY ACCOUNT OF AGREEMENT ATTRACTION
2.6.1 Simultaneity in planning a complex subject interferes with verb formulation
2.6.2 The relationship between planning order and hierarchy
2.6.3 Disentangling representation and process-based accounts
2.7 ATTRACTION IN COMPREHENSION
2.7.1 The Symmetry Prediction
2.7.2 Pearlmutter, Garnsey, & Bock (1999)
2.7.3 Summary
2.8 TESTING PERCOLATION I: RELATIVE CLAUSE ATTRACTION
2.8.1 Kimball & Aissen (1971), Relative Clause Attraction & Experiment 1-2 Rationale
2.8.2 Experiment 1
2.8.3 Experiment 2
2.8.4 Experiment 3
2.9 TESTING PERCOLATION II: THE GRAMMATICAL-UNGRAMMATICAL ASYMMETRY IN COMPREHENSION
2.9.1 Wagers, Lau, & Phillips (2008) and On-line Comprehension
2.9.2 Controlling for RT correlations among adjacent regions: a mixed-effects models analysis
2.9.3 Experiment 4: Speeded grammaticality tests of complex subject attraction
2.10 CONCLUSIONS
3 THE TROUBLE WITH SUBJECTS: BINDING AND ACCESSING FEATURES IN COMPLEX SYNTACTIC OBJECTS, PART II
3.1 INTRODUCTION
3.2 SEARCHING STRUCTURE WITH UNSTRUCTURED SEARCHES
3.2.1 Content-addressable search
3.2.2 The restricted focus of attention
3.2.3 Implications
3.3 AGREEMENT ATTRACTION IN COMPREHENSION
3.3.1 Intuition
3.3.2 Formalization
3.3.3 Agreement & Case (Experiment 5)
3.3.4 Clause-boundedness
3.3.5 Next to the cabinets
3.3.6 Linking comprehension and production
3.4 INTERFERENCE AND SUBJECTS
3.4.1 Introduction
3.4.2 Van Dyke & Lewis (2003), Van Dyke (2007)
3.4.3 Replicating and extending Van Dyke (Experiment 6)
3.4.4 NPI Licensing v. Reflexive Anaphora
3.5 CONCLUSIONS
4 ACTIVE DEPENDENCY FORMATION AND MECHANISMS FOR THE ACCURATE RECOGNITION OF GRAMMATICAL DEPENDENCIES
4.1 INTRODUCTION
4.2 THE ROLE OF PREDICTABILITY
4.2.1 Forwards v. backward anaphora
4.2.2 Reconsidering agreement attraction
4.3 PROCESSING WH-DEPENDENCY CONSTRUCTIONS
4.3.1 Active dependency formation
4.3.2 Mechanisms of active dependency formation
4.3.3 Similarity-based interference and wh-dependency completion
4.3.4 Three studies
4.4 THE GRAMMAR’S ROLE IN TRIGGERING WH-DEPENDENCY FORMATION
4.4.1 The motivation for active dependency formation and island constraints
4.4.2 The Coordinate Structure Constraint and Active Dependency Formation I (Experiment 7)
4.4.3 Materials and Methods
4.4.4 The Coordinate Structure Constraint and Active Dependency Formation II (Experiment 8)
4.4.5 General discussion of Experiments 7 & 8
4.5 THE FIDELITY OF RETRIEVAL IN WH-DEPENDENCY FORMATION
4.5.1 Introduction
4.5.2 Experiment 9
4.5.3 Experiment 10
4.5.4 Accurately identifying the head of a dependency
4.6 CARRYING INFORMATION FORWARD IN TIME
4.6.1 Lexically-specific features (Experiments 11a, 11b)
4.6.2 Lexical identity (Experiment 12)
4.6.3 FG (Pied-piping) (Experiment 13)
4.6.4 Conclusions
4.7 CONCLUSIONS
5 CONCLUSIONS
5.1 SPECIFIC CONCLUSIONS
5.1.1 Agreement attraction
5.1.2 Wh-dependency formation
5.2 BROADER CONCLUSIONS
6 APPENDICES
7 REFERENCES

List of Tables

Table 2-1 The Symmetry Prediction for Feature Percolation
Table 2-2 Interpretations of reading time patterns in relative clause agreement comprehension
Table 2-3 Sample materials for Experiment 1
Table 2-4 Sample plural subject materials for Experiment 2
Table 3-1 Speeded grammaticality judgments of complex subject attraction, from Experiment 4
Table 3-2 Retrieval structure and outcomes: Singular-headed subjects, grammatical continuations
Table 3-3 Retrieval structure and outcomes: Singular-headed subjects, ungrammatical continuations
Table 3-4 Retrieval structure and outcomes: Plural-headed subjects, grammatical continuations
Table 3-5 Retrieval structure and outcomes: Plural-headed subjects, ungrammatical continuations
Table 3-6 Retrieval structure and outcomes for RC Attraction: Plural RC head, Singular RC subject, Ungrammatical
Table 3-7 Revised retrieval structure and outcomes, RC attraction: Plural RC head, Singular RC subject
Table 3-8 Speeded grammaticality judgments of RC attraction
Table 3-9 Revised retrieval structure and outcomes, RC attraction: Singular RC head, Plural RC subject
Table 3-10 Subject and attractor sampling probabilities for Subj-attached RCs
Table 3-11 Subject and attractor sampling probabilities for Obj-attached RCs
Table 3-12 Sample materials set for Experiment 5
Table 3-13 Comprehension accuracy from Van Dyke (2007) Experiment 3
Table 3-14 Erroneously chosen nouns in Experiment 3 cloze comprehension task
Table 3-15 Comprehension accuracy for Experiment 6
Table 4-1 Predictions for active dependency formation in multiple dependency constructions
Table 4-2 Sample materials set for Experiment 7
Table 4-3 Experiment 7 Acceptability Ratings Summary
Table 4-4 Sample materials set for Experiment 8
Table 4-5 Sample materials set for Experiment 9
Table 4-6 Comprehension question accuracy for Experiment 9
Table 4-7 Sample materials set for Experiment 10
Table 4-8 Comprehension question accuracy for Experiment 10
Table 4-9 Sample materials set for Experiment 12
Table 4-10 Comprehension accuracy for Experiment 12
Table 4-11 Sample materials set for Experiment 13
Table 4-12 Comprehension accuracy for Experiment 13

List of Figures

Figure 2-1 Percolation of number features in a complex subject
Figure 2-2 Valuing subject number for “the key to the cabinets”
Figure 2-3 The effects of subject-verb misagreement in reading
Figure 2-4 Pearlmutter, Garnsey & Bock (1999) Experiment 1 Results
Figure 2-5 Pearlmutter, Garnsey & Bock (1999) Experiment 2 Results: First-pass reading times
Figure 2-6 Experiment 1: Relative Clause Attraction Reading Time Results
Figure 2-7 Experiment 2: RC Attraction Reading Time Results, Singular Subject
Figure 2-8 Experiment 2: RC Attraction Reading Time Results, Plural Subject
Figure 2-9 Experiment 3: Relative clause attraction, Singular subjects. Speeded grammaticality, proportion “yes” responses
Figure 2-10 Experiment 3: Relative clause attraction, Plural subjects. Speeded grammaticality, proportion “yes” responses
Figure 2-11 Wagers, Lau, & Phillips (2008): Complex Subject Attraction
Figure 2-12 Wagers, Lau, & Phillips (2008) Experiment 4, Residual RTs. Two previous Region RTs regressed out
Figure 2-13 Experiment 4a: Complex subject attraction, singular subjects. Speeded grammaticality, proportion “yes” responses
Figure 2-14 Experiment 4b: Complex subject attraction, plural subjects. Speeded grammaticality, proportion “yes” responses
Figure 3-1 Accessible and inaccessible licensers in an abstract tree
Figure 3-2 Hypothetical SAT functions
Figure 3-3 McElree, Foraker, & Dyer (2003), Experiment 2: Average SAT Function
Figure 3-4 Implicit encoding of serial order
Figure 3-5 Comparison of cue convergence rules
Figure 3-6 Phrase structure tree for “the man with the hat”
Figure 3-7 Experiment 5: Relative clause attraction, Subject-attached RCs. Speeded grammaticality, proportion “yes” responses
Figure 3-8 Experiment 5: Relative clause attraction, Object-attached RCs. Speeded grammaticality, proportion “yes” responses
Figure 3-9 Van Dyke & Lewis (2003), Experiment 4
Figure 3-10 Van Dyke & Lewis (2003), Experiments 2 & 3
Figure 3-11 Van Dyke (2007) Experiment 3, Critical region reading times
Figure 3-12 Van Dyke (2007) Experiment 3, Pre-critical region reading times
Figure 3-13 Experiment 6 Self-paced reading results
Figure 4-1 Van Dyke & McElree (2006) Critical verb region
Figure 4-2 Experiment 7 Region-by-Region Reading Times
Figure 4-3 Experiment 8 Region-by-Region Reading Times
Figure 4-4 Experiment 9 Reading time results: Wh-clause Conditions
Figure 4-5 Experiment 9 Reading time results: If-clause Conditions
Figure 4-6 Experiment 10 Reading time results: Relative clause conditions
Figure 4-7 Experiment 10 Reading time results: Coordinated clause conditions
Figure 4-8 Experiment 11a Region-by-region reading times
Figure 4-9 Experiment 11b Region-by-region reading times (Long:clause)
Figure 4-10 Experiment 12 Reading time results
Figure 4-11 Experiment 12 Follow-up results. Speeded grammaticality, proportion “yes” responses
Figure 4-12 Experiment 13 Region-by-region reading times: Short conditions
Figure 4-13 Experiment 13 Region-by-region reading times: Long/P conditions
Figure 4-14 Region-by-region reading times: Long/Clause conditions
Figure 6-1 Sg [ Sg ] Grammatical RT Distribution: estimated RT G
Figure 6-2 Sg [ Sg ] Ungrammatical RT Distribution: estimated RT U
Figure 6-3 Simulation results: RT A/G / RT U/G means shift symmetrically

1 Introduction

1.1 The challenge of navigating structure in real-time

This dissertation is concerned with the problem of how structured linguistic representations interact with the architecture of human memory. In particular it addresses the problem of how constituent encodings of a sentence are accessed by processes of grammatical dependency formation.

Nearly all contemporary theories of grammar share a commitment to richly structured mental representations as necessary components of mature linguistic competence (Bresnan, 2001; Chomsky, 1981, 1995; Pollard & Sag, 1994; Steedman, 1997). Though these theories may deploy different kinds or numbers of representations, they all posit abstract categories that can be combined in regular ways to form hierarchically-ordered, compositional objects. This hierarchical order underlies many important generalizations about grammatical dependencies. For example, in (1), intrasentential reference for the pronoun “her” is restricted to “Laura”:

(1) Laura’s friend embarrassed her at the wedding.

The sentence can only mean something like “Laura’s friend embarrassed Laura,” and not “Laura’s friend embarrassed herself.” In (2), though, “her” can refer either to “Laura” or “Laura’s friend” (but not “Peggy”):

(2) Laura’s friend was afraid that Peggy would embarrass her at the wedding.
A standard formal description of the facts in (1) and (2) is that co-reference between a noun phrase and a pronoun is blocked if the noun phrase is both in the same clause as the pronoun and the noun phrase “c-commands”¹ the pronoun (Principle B; Chomsky, 1981). This description would not be possible if the representation in (2) did not allow reference to notions like hierarchical order and domains of rule applicability. A simplified phrase structure representation of (2) that includes both of these concepts is given in the bracketed sentence:

(3) [S [NP Laura’s friend] [VP was afraid [S′ that [S [NP Peggy] [VP would embarrass her]]]]]

¹ A category A c-commands a category B if A does not dominate B, and the first node dominating A also dominates B (Chomsky, 1981). The whole phrase “Laura’s friend” c-commands the pronoun within the same clause in (1), and thus coreference is blocked, but its subconstituent “Laura” does not. However, in (2), the whole phrase is far enough away from “her” (outside of the same clause) such that the c-command restriction is voided. I am describing here the basic Binding Theory account of these facts (Principle B; Chomsky, 1981), but the point is more general: any account that describes the patterns of acceptability (e.g. Reinhart & Reuland, 1993) will need a representational vocabulary with comparable terms.

The explanatory benefit of abstraction over structured representations comes with computational challenges, however. On the timescale of comprehension, tens and hundreds of milliseconds to seconds, the comprehender must deploy the abstract facts about grammatical categories and relations to recognize and understand actual expressions. At the sentence level, pairings of words to structure must be recognized and encoded as part of the current, novel utterance. The representation in (3) contains a considerable amount of information that was simply not in the input. Because natural language expressions can be of considerable complexity and temporal extent, these novel structures must be encoded semi-durably so that they are accessible to later operations. For example, in sentence (2), the reference of the pronoun “her” must be resolved with respect to the syntactic context provided by the preceding parts of the sentence. So those parts, in their hierarchical order, must be retained for “her” to be interpreted. We must consequently worry not only about how the hierarchical order of a sentence is encoded, but also about what operations are available to the comprehension system for targeting parts of these large, complex representations, well after they have left immediate attention.

A natural way of accessing constituents in a grammatical fashion would be to follow the hierarchical relations as the phrase structure gives them, like links in a chain. Navigating the representation in order of the hierarchical relations provides a means for restricting reference to only grammatically-relevant constituents in the syntactic context. Indeed this is exactly how tree-like data structures are searched in computer science (Knuth, 1965/1997). For example, take another kind of grammatical dependency, one which we will consider in great detail in this dissertation: subject-verb agreement. A verb must agree with the subject in the same clause, as in the following sentence and its associated bracketing:

(4) [S [NP The path [PP to the monuments]] [VP was littered with bottles]].

If the agreement between subject and verb must be verified online, then following the phrase structural relations will lead directly from a verb to the entire phrase that is the subject. The path for verifying agreement (in this simple representation) could be given succinctly with the following chain of dominance statements: start at V (VP dominates V, S dominates VP, S dominates NP); end at NP. There is also a plural noun in this sentence, “monuments,” which is grammatically inaccessible to the agreement relation.
It would remain irrelevant so long as only the dominance pathways were followed to the subject NP and some other search order (i.e., a linear one) were not employed.

Recently there have been a number of arguments that the memory architecture in which language processing is embedded is similar in many ways to that of general episodic memory: it is context-dependent and content-addressable (Van Dyke & Lewis, 2003; Lewis & Vasishth, 2005; McElree, 2006; Martin & McElree, 2008). The many pieces that compose a sentence are still thought to be encoded as linked together in a phrase structure tree, by the dominance relations the grammar generates. But it has been argued that the resolution of grammatical dependencies does not follow a search procedure ordered by those relations. Instead it proceeds in a content-addressable fashion, by probing the memory with features that match the desired constituent. For example, if the constituent in subject position is needed to establish a grammatical relation (like agreement), it would not be accessed by successively following the dominance relations up to that position. Instead it would be retrieved by probing for features characteristic of a subject, like +Nominative case.

Crucially, content-addressable retrieval grants direct access to just those constituents whose information matches features in the probe. On the one hand, this means fast, position-constant access times (McElree, Foraker, & Dyer, 2003). On the other hand, because the search is unordered, it means that multiple candidate constituents could be returned. One of these candidates may be the grammatically-licensed constituent, but others may not. For example, in a biclausal sentence like (5), there are two subjects, and so there are two candidate matches to a simple +Nominative cue.

(5) The park ranger was dismayed that the path to the monuments was littered with bottles.
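The contrast between the two search procedures can be sketched in code for sentence (5). This is an illustration under stated assumptions, not an implementation of any of the cited models: the `Node` class, the feature names (`cat`, `case`, `num`), the feature values assigned to each phrase, and both function names are hypothetical and were introduced only for this example.

```python
class Node:
    """A constituent encoding: a category label, a feature bundle, and tree links."""

    def __init__(self, label, text="", **features):
        self.label = label
        self.text = text
        # Hypothetical feature bundle; "cat" duplicates the label for cue matching.
        self.features = {"cat": label, **features}
        self.parent = None
        self.children = []

    def add(self, *children):
        for child in children:
            child.parent = self
            self.children.append(child)
        return self


def all_nodes(root):
    """Flatten the tree into an unordered memory of encodings (preorder here)."""
    yield root
    for child in root.children:
        yield from all_nodes(child)


def subject_by_dominance(verb):
    """Ordered search: climb V -> VP -> S, then step down to that S's NP daughter."""
    node = verb
    while node is not None and node.label != "S":
        node = node.parent
    if node is None:
        return None
    for child in node.children:
        if child.label == "NP":
            return child
    return None


def retrieve_by_cues(memory, **cues):
    """Content-addressable search: every encoding matching all cues is returned."""
    return [n for n in memory
            if all(n.features.get(k) == v for k, v in cues.items())]


# Sentence (5), with assumed feature values for each noun phrase.
monuments = Node("NP", "the monuments", case="obl", num="pl")
path = Node("NP", "the path to the monuments", case="nom", num="sg").add(
    Node("PP", "to the monuments").add(monuments))
littered = Node("V", "was littered")
embedded = Node("S").add(path, Node("VP").add(littered))
ranger = Node("NP", "the park ranger", case="nom", num="sg")
dismayed = Node("V", "was dismayed")
root = Node("S").add(ranger, Node("VP").add(dismayed, embedded))

memory = list(all_nodes(root))
by_path = subject_by_dominance(littered)
by_cue = retrieve_by_cues(memory, cat="NP", case="nom")

print(by_path.text)
print([n.text for n in by_cue])
```

In this toy encoding the ordered search from the embedded verb reaches a single constituent, whereas the +Nominative probe returns both subjects, so some further decision is needed to pick the grammatically licensed one.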
If the system were to maintain full fidelity to the principles of the grammar, then on-line comprehension processes would have to have a structurally-sensitive decision metric for which of multiple candidates was the right one. This concern over grammatical fidelity has generated two kinds of responses: (1) that online processing does exhibit grammatical infidelity, which is particularly exacerbated when there are many similar constituents in a structure (Van Dyke & Lewis, 2003; Van Dyke, 2007); and (2) that the right combination of cues might be found (at least for a subset of relations) to target the unique, grammatically licensed constituent (Martin & McElree, 2008).

In this dissertation we will address these claims by broadly surveying the scenarios under which grammatical accuracy is reliably observed and those under which it seems hard to achieve. In sentence comprehension experiments on subject-verb agreement and the formation of wh-dependencies we can infer what kinds of analyses the comprehender is entertaining by looking at patterns of error detection, reaction time measures of processing difficulty, and reaction time measures of interpretation. Interestingly, we find that subject-verb agreement, arguably the simpler relation, is prone to grammatical infidelity of exactly the kind predicted by a content-addressable architecture. The formation of wh-dependencies, on the other hand, is highly grammatically accurate, even though it should be liable to interference from grammatically unavailable constituents.

We defend the view that the memory architecture does burden comprehenders with a major limitation on inducing structured analyses over linguistic input. Making decisions about how to structure and interpret new input is highly dependent on having hierarchical order information about what has come before. The content-addressable memory is not generally amenable to accurate structural reference for fundamental reasons.
Important structural relations, like c-command, can be stated over any arbitrary pair of constituents, so there’s no reasonable way to make the property of c-command part of constituent encodings. That is, the property of c-command cannot be the content of an encoding. The fact that relational notions cannot restrict the search of linguistic context introduces inaccuracy into non-local decisions.

This general outlook, however, predicts more fallibility than is generally observed. Many processes and phenomena, like that of wh-dependency formation, simply seem impervious to ungrammatical analyses. Instead of rejecting the architecture outright, however, we propose adaptations that optimize grammatical accuracy. Chief among these are constituent encoding strategies that permit reference to be restricted to major grammatical domains, and the predictive recognition of dependencies, which performs as much grammatical licensing as is possible left-to-right. The online structure building system, we conjecture, is reasonably well-adapted to the memory architecture.

1.2 Grammatical fidelity, grammatical fallibility

In the 1960s and 1970s there emerged a basic consensus that the perceptual or mnemonic representation of a sentence reflects the gross properties of the constituent structure (or thematic structure) assigned to it by the grammar (cf. Fodor, Bever & Garrett, 1974; Levelt, 1974). Most of the studies arriving at this conclusion used techniques that would be considered “off-line”: for example, asking experimental participants to recall the location of a noise burst in a recorded stimulus (Bever, Lackner, & Kirk, 1969), or to assign pairs of words scores based on how related they were felt to be (Levelt, 1974). Nonetheless they are informative about what might be called the “steady-state” encoding, the representation that persists once major comprehension processes have concluded at the sentence level.
With the advent of experimental measures and designs that can probe on-going processing on the time-scale of single word or morpheme processing (self-paced reading, eye-tracking, EEG, MEG, etc.), it has become possible to test not just whether the steady-state encoding reflects grammatical distinctions, but whether the instantaneous, on-going encoding is also grammatically accurate. Over the past 20 years, as finer measures have been applied to a broader collection of relationships, the facts of the matter have turned out, perhaps unsurprisingly, to be mixed. Some kinds of real-time comprehension processes are tightly regulated by the grammar and never show evidence that anything but a grammatical analysis is entertained. Those processes we’ll refer to generally as grammatically faithful. Some processes, however, seem to entertain analyses of the expression which the grammar cannot generate or must exclude. Such processes we’ll refer to generally as grammatically fallible. Let us review two cases here.

The example of subject-verb agreement nicely illustrates the nature of the problem. Subject-verb agreement is a wide-spread phenomenon among the world’s languages and refers to the covariation of verbal morphology with syntactic or semantic properties of the subject phrase (Corbett, 2003). For example, in English the verb form must match the subject phrase in number features:

(6) (a) The path was/*were littered with bottles.
    (b) The paths *was/were littered with bottles.

If we replace the simple subject “the path” with a more complicated one, like “the path to the monument” or “the path that Mary took with her father,” the agreement pattern remains the same.

(7) (a) The path to the monument was/*were littered with bottles.
    (b) The paths to the monument *was/were littered with bottles.

(8) (a) The path that Mary took with her father was/*were littered with bottles.
    (b) The paths that Mary took with her father *was/were littered with bottles.
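The pattern in (6) through (8) can be made concrete with a small sketch. It is an illustration only, not a model proposed in this dissertation: the token representation, the hand-annotated head index, and the function names are all assumptions introduced here. The sketch contrasts choosing the verb form from the number of the subject's head noun with choosing it from the noun nearest the verb.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SubjectPhrase:
    # Each token is (word, number); number is "sg", "pl", or None for non-nouns.
    tokens: List[Tuple[str, Optional[str]]]
    head: int  # index of the head noun, annotated by hand for this example


def number_from_head(subject: SubjectPhrase) -> Optional[str]:
    """Hierarchical rule: the phrase inherits the number of its head noun."""
    return subject.tokens[subject.head][1]


def number_from_adjacent_noun(subject: SubjectPhrase) -> Optional[str]:
    """Linear rule: take the number of the noun closest to the verb."""
    for _word, number in reversed(subject.tokens):
        if number is not None:
            return number
    return None


VERB_FORM = {"sg": "was", "pl": "were"}

# Subject of (7b): "The paths to the monument"
ex_7b = SubjectPhrase(
    tokens=[("the", None), ("paths", "pl"), ("to", None),
            ("the", None), ("monument", "sg")],
    head=1,
)

print(VERB_FORM[number_from_head(ex_7b)])           # form licensed by the head
print(VERB_FORM[number_from_adjacent_noun(ex_7b)])  # form a linear rule would pick
```

For the subject of (7b), the head-based rule selects the form marked acceptable in the example, while the adjacency rule selects the starred one.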
Agreement is determined hierarchically: the verb form must match a property of the entire subject constituent. It does not, for example, simply match the number on an adjacent noun, as (7b) illustrates. The number property of the entire subject depends on a distinguished element contained inside it: its head "path". The notion of "head" is central in many theories of phrase structure (Jackendoff, 1977; Pollard & Sag, 1994; Chomsky, 1995) but more generally reflects the idea that the properties of the whole are determined by an ordered composition of the properties of its parts.

Given this core facet of natural language grammars, we can ask whether real-time comprehension is sensitive to a notion of headedness in the same way. That is, does the comprehender form a representation from the input that projects complex syntactic objects from lexical items in a way that distinguishes a head? Agreement is a useful probe for addressing this question, because we know that speakers make well-defined errors in producing agreement. For example, speakers commonly produce sentences like the following:

(9) The path to the monuments were littered with bottles.

In producing such a sentence, a speaker selects a verb form whose number matches not the head of the subject projection, "path," but a more deeply embedded noun, "monuments." The occurrence of forms like these is widely documented by grammarians and other observers in both written and spoken English (e.g., Trollope, 1883; Jespersen, 1924; Strang, 1966; Quirk et al., 1972; Kimball & Aissen, 1971; Francis, 1986). The phenomenon, called agreement attraction, has drawn the most attention in research on language production (Bock & Miller, 1991, et seq.). Perhaps the most prominent contemporary explanation for agreement attraction, encompassing a sizable body of observations, is that syntactic objects are encoded such that features of individual lexical items can erroneously "percolate"
from different parts of the representation in a manner that would not be grammatically sanctioned (e.g., Eberhard, Cutting & Bock, 2005; Franck, Vigliocco & Nicol, 2002; Vigliocco & Nicol, 1998). A complex subject, like "the path to the monuments," is misassigned plural number, because it contains a plural noun whose plural feature has (stochastically) migrated up from a more deeply embedded node in the structure. Agreement attraction represents a case of grammatical fallibility. Under the feature percolation account, attraction occurs because the binding of features in a structured representation is endogenously error-prone: lexical items are initially correctly ordered in a syntactic frame, but it is impossible to stably maintain their initial feature composition because of the way they are structurally related to one another. (Note that we argue against the feature percolation mechanism in Chapter 2.)

Let us now turn to a case of grammatical fidelity, a case where we do not observe plausible but ungrammatical analyses. Here consider the wh-dependency formed inside a relative clause. A typical adult speaker of English draws a distinction between the acceptability of the relative clauses in (10a) and (10b):

(10) (a) The singers that Laura hoped her fiancé would agree to hire ___ ...
     (b) *The singers that Laura hoped hiring ___ would be agreeable to her fiancé ...

Relativization in English involves the formation of an A'-dependency: the head of the dependency occupies an (overt) syntactic position in the periphery of the clause, while the foot of the dependency is associated with an argument position which can be embedded an unbounded distance inside the clause. The relative clauses in (10) differ with respect to the syntactic position of the foot: in (a), it is a right-branch, complement position; but in (b), it is a complement position within a left-branch subject projection. As a class, domain restrictions on the foot of A'-dependencies are referred to as island constraints (Ross, 1967).
A'-dependency formation has been studied extensively in comprehension. Through a combination of reaction time and electrophysiological measures, we know that comprehenders attempt to link the head of the dependency with potential foot locations as soon as possible. This processing occurs in advance of direct evidence that there is an open foot position, a phenomenon referred to as active dependency completion (see Aoshima et al., 2004, for a review, or Chapter 4). The question arises whether comprehenders make errors in locating potential foot locations. Faced with the first seven words in (10b), does the comprehender ever posit a foot for the dependency where it cannot grammatically occur? The majority of the evidence suggests that comprehenders obey island constraints in online comprehension, and do not construct A'-dependencies that the grammar does not allow (e.g., Stowe, 1986; Traxler & Pickering, 1996; see Phillips, 2006, for a review). We provide evidence that strengthens this position in Chapter 4.

When we look at a large set of phenomena, classified by how grammatically faithful they are in real-time processing, it may be that there are no broader-scale patterns. Grammatical fidelity could be an idiosyncratic property of a given construction or process. However, we will argue that there are broader patterns. In particular, a key determinant of accuracy seems to be predictability. Grammatical relations that announce their presence early on, like wh-dependencies, can be completed in a principally top-down fashion. This makes sense, we argue, in terms of the memory architecture. Licensing dependencies predictively allows the system either to avoid searching through the syntactic context or to do so with highly targeted, restrictive information. In this way, the grammar can recognize and license dependencies in a way that is adapted to the structural imprecision introduced by the memory.
1.3 Outline of the dissertation

1.3.1 Chapter 2: Agreement attraction and selective fallibility

The goal of Chapter 2 is to test the feature percolation account of agreement attraction in comprehension. The feature percolation account holds that agreement attraction stems from a faulty encoding of the subject's number features. We first discuss the basic facts of agreement attraction drawn from the production literature which have motivated the feature percolation account. We then detail several variants of the feature percolation account along with their implications for the encoding of grammatical features in constituent representations. We contrast feature percolation with other accounts, which concern the order in which constituents are accessed in agreement licensing.

In comprehension, feature percolation makes a strong and falsifiable prediction, which we refer to as the Symmetry Prediction. Specifically, it predicts that agreement attraction should lead to illusions of grammaticality for ungrammatical sentences and illusions of ungrammaticality for grammatical sentences, in equal measure. After reviewing previous comprehension research, we present two reading time experiments and two speeded-grammaticality experiments, which fail to uphold the Symmetry Prediction. We present novel evidence from relative clause attraction sentences (Kimball & Aissen, 1971) as well as data from canonical complex subject attraction, of the type discussed above. We find that agreement attraction only improves the perception of ungrammatical sentences in comprehension and does not impact grammatical sentences. Agreement attraction, we conclude, exhibits selective fallibility, which implicates a process-based account instead of a representational one.

The experimental and theoretical work reported in this chapter represents the fruits of a several-year collaboration with Ellen Lau.
Experiments 1-4, and some of the analysis, are reported in Wagers, Lau, & Phillips (submitted to Journal of Memory and Language).

1.3.2 Chapter 3: The trouble with subjects

The goal of Chapter 3 is two-fold: (1) to provide an account of agreement attraction in comprehension that encompasses the selective fallibility result of Chapter 2; (2) in doing so, to introduce in greater detail the properties of a content-addressable memory and the challenges it poses for hierarchically ordered information.

We first discuss the architectural commitments which we accept as the basis for the rest of the dissertation. We introduce the concept of content-addressable memory and the key properties of a direct access search (Clark & Gronlund, 1996; McElree, 2006), including similarity-based interference (Anderson & Neely, 1996). We argue that this architecture presents inherent difficulties for recovering relational properties like c-command, but we also discuss some strategies for overcoming its limitations.

Using assumptions from Shiffrin's Search of Associative Memory (Gillund & Shiffrin, 1984) we develop a formal model of agreement attraction in comprehension which predicts the patterns we observe in our experiments. We argue that agreement checking in comprehension stems from a retrieval operation initiated by the verb. This operation is guided by cues that do not converge on a single constituent representation, but partially match multiple constituents in the representation. It is the interference of the inaccessible constituent that leads to spurious agreement licensing. In a speeded-grammaticality experiment, we test and confirm a prediction of this model. We relate our account to other process-based accounts of production (Solomon & Pearlmutter, 2004) and a formally-similar account developed simultaneously by Badecker & Lewis (2007) for production.
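The idea that retrieval cues can partially match more than one constituent can be illustrated with a short Python sketch. It is not the formal model developed in Chapter 3. It only borrows the general form of a ratio rule in the spirit of Search of Associative Memory, in which an item's cue strengths are multiplied together and then normalized over all items. The feature labels, the strength values 1.0 and 0.1, and the treatment of singular as an explicit cue are all illustrative assumptions.

```python
from math import prod

# Assumed cue-to-item strengths (illustrative values, not from the text).
MATCH, MISMATCH = 1.0, 0.1


def sampling_probabilities(items, cues):
    """Multiply each item's cue strengths, then normalize across items."""
    scores = {
        name: prod(
            MATCH if features.get(cue) == value else MISMATCH
            for cue, value in cues.items()
        )
        for name, features in items.items()
    }
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


# Hypothetical encodings for the subject "The key to the cabinets".
memory = {
    "key": {"role": "subject", "number": "sg"},
    "cabinets": {"role": "oblique", "number": "pl"},
}

# Plural verb ("... are"): each noun matches exactly one of the two cues.
ungrammatical = sampling_probabilities(memory, {"role": "subject", "number": "pl"})
# Singular verb ("... is"): the head matches both cues, the attractor neither.
grammatical = sampling_probabilities(memory, {"role": "subject", "number": "sg"})

print(ungrammatical["cabinets"], grammatical["cabinets"])
```

Under these assumptions the attractor competes with the head only when the verb is plural, since that is the only case in which no single item matches every cue. This is the qualitative asymmetry described above as selective fallibility.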
The influence of grammatically-inaccessible constituents has been argued to stem from partial match in several other domains (Van Dyke & Lewis, 2003; Van Dyke, 2007; Vasishth, Drenhaus, Saddy & Lewis, 2005). We review this evidence and find it largely equivocal. We examine claims that complex subject attachment is liable to interference from inaccessible constituents (Van Dyke & Lewis, 2003; Van Dyke, 2007). We argue that much of the online evidence for interference is confounded with the number of clauses in the experimental conditions. We provide support for this contention in a reading-time experiment which we modeled on Van Dyke & Lewis (2003). We also discuss Negative Polarity Item licensing (Drenhaus et al., 2005) and reflexive anaphora (Sturt, 2003). A lack of partial match effects has been documented in the resolution of reflexive anaphora (Sturt, 2003; Xiang, Dillon, & Phillips, submitted), which is seemingly unexpected. We argue that resolution of reflexive anaphora is guided by cues which do not strongly activate embedded constituents.

1.3.3 Chapter 4: Active dependency formation and mechanisms for the accurate recognition of grammatical dependencies

The goals of Chapter 4 are (1) to review and document evidence for grammatical fidelity observed in active dependency formation; and (2) to provide an account of why active dependency formation is faithful. On the basis of three comprehension studies, we argue that the licensing of wh-dependencies is guided principally top-down. In doing so it is able to largely avoid retrieving information from the syntactic context and consequently avoids the influence of grammatically-inaccessible constituents. When retrieval is ultimately necessary, it can be guided by highly restrictive information.

Experiments 7-8 provide evidence that the decision to complete a wh-dependency and retrieve the information from the head of the dependency is guided principally top-down, and not from information provided by the local environment.
We show that the Coordinate Structure Constraint (Ross, 1967) is respected in online processing. We contrast the processing of wh-dependencies inside coordinate structures, in which active dependency formation is observed, with processing inside potential parasitic gap environments, in which active dependency formation is not observed. We conclude that the incentive to satisfy global well-formedness constraints drives active dependency formation more so than an incentive to satisfy local licensing requirements.

Experiments 9-10 provide evidence that identifying the head of a wh-dependency is grammatically accurate and not liable to interference. We test whether the resolution of a wh-dependency is impacted by other irrelevant dependency heads in the same sentence, both in embedded wh-questions and relative clauses. Because other dependency heads have similar structural and featural properties, it is predicted they should interfere with dependency resolution. Results from Van Dyke & McElree (2006) seem to support this prediction, but we argue that their experiment lacked a crucial control condition. We outline two mechanisms to account for the fidelity we observe in our experiments: one, an encoding scheme that marks dependencies complete or incomplete; two, maintenance of a small amount of unique information about the dependency head that could be used to precisely target the correct head in a retrieval operation.

Finally, Experiments 11-13 provide evidence that most lexically-specific information about a dependency head is lost over increasing dependency lengths, whereas abstract categorial information is not. This finding supports the idea that some information remains available to the comprehender to guide dependency formation. Experiment 11 tests whether the plausibility of a candidate dependency can be evaluated over longer dependencies. Experiment 12 tests whether a verb-PP selectional restriction can be evaluated over longer dependencies.
Finally, Experiment 13 tests whether the identity of the dependency head is retained over longer dependencies. Experiments 7, 8 and 11a are reported in Wagers & Phillips (submitted to Journal of Linguistics).

2 Agreement attraction and selective fallibility
Binding and accessing features in complex syntactic objects, Part I

2.1 Introduction

2.1.1 Agreement attraction: what's at stake

In Chapter 1, we discussed a major challenge for real-time structure building: how to navigate novel encodings of structure, which are potentially large and complex. One strategy for understanding the mechanisms by which this occurs is to ask how grammatically accurate those online processes are. On the timescale of tens to hundreds of milliseconds, how faithfully can the processing system encode a representation with respect to grammatical principles and constraints? And can it direct its attention between different constituent encodings of that representation in a structure-sensitive fashion? In this chapter, we will turn our attention to the phenomenon of agreement attraction, as a case study in how to tease apart facets of the encoding itself from facets of how the encoding is accessed in real-time.

Agreement attraction is an error in the formulation of subject-verb agreement, best introduced by way of example:

(11) The function of the ducts are unknown.
     from J.E. Stevens, "The delicate constitution of sharks," Bioscience, 4, 61-4

The subject of the clause is the singular DP "the function of the ducts," but the verb is in its plural form. It fails to match the grammatical number of the subject projection, as determined by its head, and instead matches one of its subconstituents: "the ducts." In agreement attraction the control of agreement seems to be wrested away by a nearby but grammatically irrelevant constituent.
Accounts of agreement attraction have largely appealed to the notion that multiple nouns (or noun phrases) in a complex subject have independent specifications for number, which compete to value the entire noun phrase. The differences among these accounts lie in why those features compete. There are two major proposals:

• In the first kind of account, the number features bound to the attractor are erroneously transferred to the agreement controller (Eberhard, 1997; Eberhard, Cutting & Bock, 2005; Franck, Vigliocco & Nicol, 2002; Hartsuiker et al., 2001; Nicol, Forster, & Veres, 1997; Vigliocco & Nicol, 1998). This account is fundamentally representational, and assigns blame for agreement attraction to an encoding of structure that is grammatically inconsistent or imprecise.

• In the second kind of account, there is no problem in encoding the representation, and assigning features to specific categories in the structure, but there is fallibility in how categories are accessed in real-time (Solomon & Pearlmutter, 2004; Badecker & Lewis, 2008). This account is fundamentally process-based, and assigns blame for agreement attraction to a (partially) structurally insensitive means of accessing component encodings of syntactic structure.

In this chapter, we examine this question: whether agreement attraction is due to the erroneous encoding of the features in the subject, or rather to errors in how these features are accessed in real-time. We turn to the comprehension analog of the agreement attraction production error to make this case, and ask how comprehenders perceive subject-verb agreement in potential attraction configurations. Based on several real-time reading studies and a complement of speeded grammaticality tests, we argue that it cannot be an unfaithful encoding of features on the agreement controller that is responsible for agreement attraction.[2]

In this chapter we will carefully examine an account of grammatical infidelity in the encoding of the subject. Our empirical argument has two parts.
First we examine an agreement attraction configuration that has received relatively little attention in previous experimental literature, in which attraction occurs inside a plurally-headed relative clause (Kimball & Aissen, 1971):

(12) The ducts_PL [RC that the scientist_SG study_PL ] have an unknown function.

We provide evidence that these RC configurations yield exactly the same error profile as the more canonical complex subject agreement controllers, the configuration attested in example (11). We show that agreement attraction exemplifies what we call "selective fallibility" in comprehension: when there is a fully grammatical analysis available to the comprehension system, the comprehension system pursues this analysis nearly all of the time. For example, subject-verb agreement checking in comprehension is not disrupted when the verb agrees with the head of the subject, even though an attractor is present, as in (13):

(13) The function of the ducts is unknown.

[2] The reading time studies are first reported in Wagers, Lau & Phillips (2008a, submitted) and the grammaticality studies in Wagers, Lau & Phillips (2008b).

Only when a fully grammatical analysis is unavailable is the system tempted into an attraction error. This selective fallibility is incompatible with a system in which binding of number features is endogenously leaky or inaccurate. In Chapter 3, we show how a means of accessing features via cue-based retrieval (as in Lewis & Vasishth, 2005) can encompass our results.

The analysis we offer points to the second kind of account for why features compete: structurally-insensitive access procedures. The problem engendered by subject-verb agreement formulation in production, and subject-verb agreement checking in comprehension, is one of regulating access to number features that are properly bound in a syntactic representation. We will argue these results are germane to other cases of observed fallibility in comprehension.
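The logic of the argument against a leaky encoding can be sketched in a few lines of Python. The sketch makes a strong simplifying assumption: a comprehender's judgment follows deterministically from whatever number the subject was encoded with. The function names and the probability parameters are hypothetical, and the zero rejection rate in the second function is an idealization of "nearly all of the time."

```python
def percolation_predictions(p_migrate: float) -> dict:
    """Expected judgments for Sg [ Pl ] subjects if the attractor's plural
    feature migrates to the subject node with probability p_migrate."""
    return {
        # Plural verb ("The function of the ducts are ..."): accepted
        # whenever the subject has been misencoded as plural.
        "ungrammatical_accepted": p_migrate,
        # Singular verb ("The function of the ducts is ..."): rejected
        # whenever the subject has been misencoded as plural.
        "grammatical_rejected": p_migrate,
    }


def selective_fallibility_pattern(p_misretrieval: float) -> dict:
    """Idealized pattern in which errors arise only when no fully
    grammatical analysis is available."""
    return {
        "ungrammatical_accepted": p_misretrieval,
        "grammatical_rejected": 0.0,
    }


def is_symmetric(pattern: dict, tolerance: float = 1e-9) -> bool:
    """True if the two kinds of illusion occur in equal measure."""
    gap = pattern["ungrammatical_accepted"] - pattern["grammatical_rejected"]
    return abs(gap) <= tolerance


# 0.2 is an arbitrary illustrative probability, not an observed rate.
print(is_symmetric(percolation_predictions(0.2)))
print(is_symmetric(selective_fallibility_pattern(0.2)))
```

Whatever the migration probability, a faulty encoding affects grammatical and ungrammatical sentences alike, so an asymmetric pattern cannot be produced by adjusting that one parameter.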
Complex subjects have concerned syntacticians for some time (Huang, 1982; Kayne, 1984; Uriagereka, 1999, inter alia). More recently psycholinguists have been discussing and documenting cases of fallibility involving processing subjects in real-time (e.g., Badecker & Straub, 2002; Kluender, 2005; Sturt, 2003; Van Dyke & Lewis, 2003; Xiang, Dillon, & Phillips, submitted). We will extend the insights from subject-verb agreement to understand more generally why subjects, and particularly complex subjects, are troublesome for encoding and accessing hierarchical structure in real-time. As we'll see, just as within the phenomenon of agreement attraction, different kinds of subject relationships are fallible to different extents in real-time processing, and this is a function of both the way in which information is preserved over time prospectively and the way in which it is accessed retrospectively.

2.1.2 Outline of the chapter

In section 2.2 we introduce the basic properties of agreement attraction. Then, in section 2.3, we review much of the existing production literature which suggests that attraction is sensitive to structural factors, justifying its interest as a diagnostic of the encoding and navigation of hierarchically ordered information. In section 2.4, we discuss the predominant representational account of agreement attraction, erroneous feature percolation, and, in section 2.5, detail what we believe are the severe implications of this account for grammatical fidelity in structural encoding. In section 2.6, we sketch out the major contemporary alternative to erroneous feature percolation. In section 2.7, we then argue that comprehension data, rather than production data, can decide the question of whether erroneous feature percolation is active in the system. In sections 2.8 and 2.9, we present five comprehension experiments and draw on related data from Wagers, Lau, & Phillips (2008) to argue that erroneous feature percolation is inactive in comprehension.
2.2 An overview of agreement attraction

By far the most prominent example of agreement attraction involves complex singular subjects that contain a PP complement or adjunct with a plural DP subconstituent. The attested examples in (14) further illustrate this pattern, summarized schematically in (15):

(14) (a) [ The order of the tasks ] were counterbalanced ...
         Refel, J.A., Current Psychology, 16, 308-315
     (b) [ Rise in email viruses ] threaten net.
         Headline, The Guardian, 4 Aug 2001, 2
     (c) [ The sheer weight of all these figures ] make them harder to understand.
         Ronald Reagan, 13 Oct 1982, quoted in (Francis, 1986)

(15) [CP ... [DP D_SG [NP N_SG [PP P DP_PL ] ] ]_SG ... V_PL ... ]

This agreement pattern has long been a subject of concern to grammarians (Francis, 1986; Quirk, Greenbaum, Leech, & Svartvik, 1972; Jespersen, 1924, inter alia), where it has often been referred to as "proximity concord." Only occasionally has it featured in research in the generative tradition (den Dikken, 2001; van Gelderen, 1997; Kimball & Aissen, 1971). However, even as early as 1883, we find the prolific Victorian novelist Anthony Trollope describing the phenomenon, as he reflects in his autobiography upon his experience as a writer:

    Rapid writing will no doubt give rise to inaccuracy, - chiefly because the ear, quick and true as may be its operation, will occasionally break down under pressure, and, before a sentence be closed, will forget the nature of the composition with which it was commenced. A singular nominative will be disgraced by a plural verb, because other pluralities have intervened and have tempted the ear into plural tendencies ... Speaking of myself, I am ready to declare that, with much training, I have been unable to avoid them. (Trollope, 1883)[3]

Trollope's explanation that "the ear ... will occasionally break down under pressure ... and forget the nature of the composition with which it was commenced," is essentially the intuition shared by most observers.
For example, Jespersen (1924) referred to a tax on "mental energy" incurred by the production of complex subjects (p. 345). Quirk et al. (1972), in their extensive descriptive grammar, attribute this "tax" to the increased distance between the subject head and the verb, reflecting the view that it is the relative proximity of the intervening (non-head) noun to the verb that is the controlling factor in the production of the error.

[3] I thank Norbert Hornstein for bringing this passage to my attention. To the best of my knowledge, it is the earliest description of (and explanation for) agreement attraction yet discovered. Anthony Trollope was evidently a man of substantial talent, and is also remembered as the inventor of the pillar box (cf. Trollope, 1883).

What can we conclude about the encoding of syntactic structure from observations of agreement attraction? It is important to note that the error is characteristic not only of production, but also of comprehension (Pearlmutter, Garnsey, & Bock, 1999; Wagers, Lau, & Phillips, 2008): comprehenders experience an "illusion" of grammaticality for exactly the forms that are erroneously produced, a phenomenon that we'll expand upon below. Therefore it is likely that agreement attraction reflects deep architectural properties, and not just ones specific to production.

If the agreement attraction errors truly reflect "proximity concord" (that is, the verb sometimes simply agrees with a serially or temporally nearby or adjacent noun phrase, because of its nearness or adjacency), then this phenomenon suggests either a rather weak real-time encoding of structure, or a set of real-time processes which, unlike those postulated for the grammar, are insensitive to hierarchical structure. Such an account would cohere with some contemporary models of structural encoding in which string-adjacent or nearby elements in an expression can compete to form local structures that conflict with global cues (e.g., Stevenson, 1994; Tabor, Galantucci, & Richardson, 2004).
However, the apparent generalization of proximity turns out to be false: it is not the nearness of the intervening noun to the verb that leads to agreement attraction errors. Rather, the hierarchical distance between the head of the subject and the intervening noun overwhelmingly determines the rate of error production. Beginning with Bock & Miller's seminal 1991 paper, "Broken Agreement," the phenomenon of agreement attraction was brought under experimental control in language production studies, and much later, in comprehension studies. We will now step through a number of these studies, which show that agreement attraction is highly sensitive to the structural properties of the subject projection. Consequently we will argue that agreement attraction is not so much indicative of a weakly structured hierarchy as diagnostic of how individual feature tokens are accessed in an otherwise well-formed structure.

2.3 Previous studies of agreement production and the hierarchical nature of attraction

Rates of error production in subject-verb agreement are typically studied using an elicitation paradigm. In Bock & Miller's 1991 study, participants were presented with a recorded sentence preamble, consisting of a complex subject like "The key to the cabinets." Participants then had to repeat the preamble along with a full-sentence completion as quickly as possible.[4] If the preamble was correctly reproduced, the verb form was then scored for whether it was inflected for number, and, if so, whether the correct inflection was used.

In Bock & Miller's experiments, and in subsequent agreement attraction experiments, the experimental design typically manipulates the number of the intervening noun (henceforth, the "attractor") with respect to the number on the head noun. For example, in Experiment 1 of Bock & Miller (1991), preambles all contained singular head nouns, and the number on the attractor was manipulated to either match or mismatch the subject head noun. In this experiment, the "Match"
factor was crossed with the serial length of the preamble. An example set of materials is reproduced below:

(16) (a) MATCH/SHORT: The key to the cabinet
     (b) MISMATCH/SHORT: The key to the cabinets
     (c) MATCH/LONG: The key to the ornate Victorian cabinet
     (d) MISMATCH/LONG: The key to the ornate Victorian cabinets

[4] In Bock & Miller (1991) participants were simply required to supply a completion to the sentence. In other studies, particular sentence types have been encouraged, in order to elicit more agreeing verb forms: for example, passives, which necessarily require an auxiliary. Subsequent studies have also used the visual modality (e.g., Vigliocco & Nicol, 1998). Since these manipulations do not influence patterns of attraction, we omit further discussion of these issues.

The "Match" terminology is standard in the literature on agreement attraction, but it is one which we and many outside consumers of this literature find somewhat confusing. In this review of the literature, and the experiments presented below, we will therefore refer to conditions directly by the number on head and attractor nouns. So, in Bock & Miller's Experiment 1, Conditions (16a) & (16c) are both "singular attractor" conditions, whereas (16b) & (16d) are "plural attractor" conditions. As a shorthand, we will use bracketed labels like the following: Sg [ Pl ], where the outermost symbol indicates the number on the hierarchically superior noun, i.e., the head noun in a complex subject, and the innermost symbol the number on the embedded noun, i.e., the attractor in a complex subject.

The strongest generalization about patterns of agreement with complex subjects is referred to as the "plural markedness" generalization: the attractor is most likely to exert an influence on agreement when it is plural and the head noun is singular, i.e., the Sg [ Pl ] form (Bock & Eberhard, 1993; Eberhard, 1997). When the head noun is plural, error rates in agreement do not depend on the number of the attractor.
In their extensive meta-analysis of English agreement production experiments, Eberhard, Cutting & Bock (2005) found that, on average, plural attractors in singularly-headed complex subjects (Sg [ Pl ]) elicit plural agreement on the verb 13% of the time (N = 16; compared to a baseline of erroneous plural agreement for Sg [ Sg ] preambles of 1%; N = 14). In contrast, singular attractors in plurally-headed complex subjects (Pl [ Sg ]) elicit erroneous singular agreement only 3% of the time (N = 11; compared to a baseline for Pl [ Pl ] preambles of 2%; N = 11). The plural markedness generalization is cross-linguistically robust and has been attested in many other languages, including ones that are morphologically richer than English: Spanish (Vigliocco, Butterworth & Garrett, 1996), Italian (Vigliocco et al., 1995), German (Hartsuiker et al., 2001), Dutch (Hartsuiker et al., 2001), French (Fayol, Largy & Lemaire, 1994) & Slovene (Harrison, 2004).

The plural markedness effect restricts the class of explanations for agreement attraction, and points to a syntactically-sensitive mechanism. Firstly, it undercuts the simplest class of explanations based on adjacency or linear proximity of the attractor to the verb. If only string-locality mattered, then we should expect to see attraction errors for Pl [ Sg ] subjects as well as Sg [ Pl ] subjects. What is striking is not that the occurrence of attraction errors is merely reduced for Pl [ Sg ] subjects, but that it is virtually absent. Previous observers were aware of the plural markedness effect (e.g., Trollope, 1883; Strang, 1966), but it is only with the experimental studies of the past two decades that it has been possible to establish that Sg [ Pl ] subjects are essentially the only context in which the embedded number matters for error rates. Secondly, the plural markedness generalization aligns with observations of morphosyntactic markedness in the nominal system for non-singular forms (Greenberg, 1966).
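The size of the asymmetry in the meta-analysis figures quoted above can be made explicit by subtracting each baseline from the corresponding mismatch rate. The short Python sketch below does only that arithmetic. The rates are the ones cited in the text from Eberhard, Cutting & Bock (2005); the function name is hypothetical, and because the cited means are based on different numbers of experiments, the differences should be read as rough descriptive contrasts rather than as estimates from a matched design.

```python
# Mean rates of erroneous agreement (percent), as quoted in the text
# from Eberhard, Cutting & Bock (2005).
error_rates = {
    "Sg [ Pl ]": 13,  # plural attractor, singular head
    "Sg [ Sg ]": 1,   # baseline for singular heads
    "Pl [ Sg ]": 3,   # singular attractor, plural head
    "Pl [ Pl ]": 2,   # baseline for plural heads
}


def attraction_effect(mismatch: str, baseline: str) -> int:
    """Percentage-point increase in errors over the matching baseline."""
    return error_rates[mismatch] - error_rates[baseline]


plural_attractor_effect = attraction_effect("Sg [ Pl ]", "Sg [ Sg ]")
singular_attractor_effect = attraction_effect("Pl [ Sg ]", "Pl [ Pl ]")

print(plural_attractor_effect, singular_attractor_effect)
```

On these figures a plural attractor adds 12 percentage points of error over its baseline, while a singular attractor adds 1.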
It is consistent with an underspecified or privative feature system for number (e.g., Harley, 1994). In such systems plural features are the "active" grammatical features, in the sense that morphosyntactic rules can refer to them, but not to a singular feature or singular value. Singular nouns simply bear no number feature, or no value for number, being singular "by default." It is therefore not surprising, once feature composition is taken into account, that singular attractors should be ineffective in inducing attraction.

The plural markedness generalization may suggest that agreement attraction is a grammatically sensitive phenomenon. However, it does not entirely preclude non-structural explanations: for example, a more complicated adjacency explanation could hold that verb agreement can sometimes be determined by an adjacent noun if that noun bears a marked feature. More direct tests have established that the attractor influences the verb via a hierarchical syntactic representation, and not a string-linear one. The evidence supporting this claim comes from three kinds of complex subject experiments.

2.3.1 Hierarchical, not linear distance, matters.

The first kind of experiment on the effect of hierarchy shows that the hierarchical distance between the head of the subject and the attractor strongly influences the rate of attraction. Franck, Vigliocco & Nicol (2002) showed that when the subject phrase contained two stacked PP modifiers, as in "The inscription on the door(s) of the toilet(s)", a plural noun inside the medial prepositional phrase led to more attraction errors than the most deeply embedded one. That is to say, participants were more likely to use a plural verb form when the preamble was "The inscription on the doors of the toilet" than when it was "The inscription on the door of the toilets." In this case, a plural noun that is hierarchically closer to the subject head is more likely to induce attraction than one which is linearly more adjacent to the verb.⁵ Compared to this hierarchical manipulation, increasing the linear distance from the head to the attractor has relatively little effect. Bock & Miller's Experiment 1 (1991), using the materials in (16), failed to show a reliable effect of the linear distance between the head and the attractor on error rates, when structure was held constant. These sets of results, taken together, highlight the importance of the hierarchical prominence of the attractor within a complex subject, and minimize the contribution of the relative proximity of the attractor and head to the verb.⁶

5 There are other properties that align with hierarchical order in this case that could be at play. For example, "the inscription on the doors of the toilet" lends itself more easily to a distributive interpretation (in which there are multiple inscriptions) than does "the inscription on the door of the toilets." Intuitively there is a more direct correspondence between inscriptions and doors than between inscriptions and toilets. There is evidence in production that notional number matters in selecting a verb form for collective nouns like "fleet" or "gang", though the evidence is less clear-cut for typically non-collective nouns like "inscription" (cf. Vigliocco, Butterworth & Garrett, 1996; Vigliocco, Hartsuiker, Jarema & Kolk, 1996; Eberhard, 1999; Bock et al., 1999; Eberhard, Cutting, & Bock, 2005). Apart from how number is construed notionally, the agreement controller in the stacked PP cases, "inscription", is restricted by its relation to the "doors," and not directly by its relationship to "toilets". In Solomon & Pearlmutter's (2004) account it is the tightness or directness of the semantic relation that controls attraction (discussed in section 2.6.1).

6 But see Haskell & MacDonald, 2005, who argue that while linear effects are small, they are not absent.

2.3.2 The attractor's "structural domain" matters.

The second kind of experiment demonstrates that the nature of the syntactic domain that contains the attractor affects rates of attraction. Bock & Cutting (1992) showed that when a potential attractor was embedded in a relative clause, as in the preambles in (17), it had a smaller effect than when embedded in a PP (with serial distance between head and attractor controlled).

(17) (a) The boy [RC that liked the snakes]
     (b) The editor [RC who rejected the books]

In Bock & Cutting (1992), Sg [Pl] complex PP subjects induced 6% more attraction errors than did Sg [Pl] relative clauses, leading to plural agreement roughly 17% of the time, compared to 11% of the time for the relative clauses.⁷ In Solomon & Pearlmutter (2004), the disparity was even greater: complex subjects led to plural agreement 20% of the time, compared to only 10% of the time for relative clauses. Both Bock & Cutting (1992) and Solomon & Pearlmutter (2004) tested sentential complements, like "The report that Megan described the traffic accidents." In Solomon & Pearlmutter's study, these showed essentially no effects of attraction (1% plural forms for Sg [Pl], compared to 3% for Sg [Sg]). Bock & Cutting (1992) showed a slightly more modest effect: 2% more errors in the Sg [Pl] condition, which is still smaller than the approximately 6% attraction rate for complex PP subjects observed in that experiment (where, overall, error rates were lower). The fact that attraction errors are relatively more infrequent in relative clauses, and even more so for sentential complements, presents a serious challenge to an account of agreement attraction which does not take the syntactic representation, or at least some structure beyond linear order, into account.

7 These are the rates for responses that could be scored as agreeing or not. Sg [Sg] baselines for PP and RC subjects are 4% and 1%, respectively. The error analysis in Bock & Cutting (1992) is not over Sg [Pl] versus Sg [Sg] comparisons, which we have been reporting, but rather over Sg [Pl] versus Pl [Pl] comparisons.

2.3.3 Ordering of verb and attractor does not matter.

The final kind of experiment directly demonstrates that the relative order (subject head, then attractor, then verb) is not necessary for attraction. Vigliocco & Nicol (1998) compared rates of attraction in simple declaratives with complex subjects, as in (18a), to the polar question version in (18b), in which subject and auxiliary are inverted (the comparison was between experiments). The auxiliary-inverted configuration disrupts the adjacency between the attractor and the verb, but keeps the (relative) hierarchical distance between the auxiliary and the elements inside the subject constant. Polar questions with complex subjects were elicited by instructing subjects to form questions given a visually presented adjective (like "safe") followed by a separately presented complex subject (like "the helicopter for the flights").

(18) (a) The helicopter for the flights are safe.
     (b) Are the helicopter for the flights safe?

Rates of attraction were found to be nearly identical for both kinds of configuration: for declaratives, 13% (Sg [Sg] baseline: 0.3%); for questions, 14% (Sg [Sg] baseline: 2%).

The preponderance of evidence suggests that what matters most in inducing attraction is the relationship between the head of a subject projection and the attracting nouns. The likelihood of attraction occurring is (experimentally, at least) dominated by that relation; and it seems that the terms that matter are not string-linear but hierarchical. Put succinctly, the older label for agreement attraction, "proximity concord," is a misnomer.

2.4 The feature percolation account of agreement attraction

Based on the existence of hierarchical distance effects, it has been proposed that agreement attraction results from feature movement or "percolation" within a syntactic representation (Nicol, Forster & Veres, 1997; Franck, Vigliocco & Nicol, 2002; Eberhard, Cutting & Bock, 2005).
In percolation, features on a daughter syntactic constituent can be inherited by its mother node. In the typical agreement attraction case of a subject with a PP modifier ("the key to the cabinets"), the number features bound to nouns within the PP percolate upward, valuing higher phrasal projections for number. In some proportion of cases, they can percolate up to the highest projection, that of the subject noun phrase. The verb or verb phrase, by hypothesis, is reliably valued by the number on the subject phrase, and so will be inappropriately valued in just that proportion of cases when the PP-object's number percolates far enough to value the subject phrase. Figure 2-1, taken from Vigliocco & Nicol (1998), shows how features on a deeply embedded nominal within the subject can be transferred to the entire subject, via the immediate dominance relations that connect it to the entire subject projection.

Figure 2-1 Percolation of number features in a complex subject. Upward arrows indicate the path of percolation. The [plural] feature on the most deeply embedded noun, N2, is inherited by categories along the dominance path that links it to the subject projection, NP. Taken from Vigliocco & Nicol (1998).

Percolation of an embedded nominal's features is assumed to be erroneous and probabilistic. There is some likelihood, p, that the features on a non-head node will percolate to the projection immediately dominating it. The likelihood of traveling a given hierarchical distance is the product of the likelihoods of percolating along each immediate dominance link. Consequently the overall likelihood of traversing longer syntactic distances decreases geometrically. Because the mechanism of feature percolation has this property, it can account for structural depth effects like those demonstrated by Bock and Cutting (1992) and Franck et al. (2002): the most deeply embedded noun in a stack of PPs (like toilets in "the inscription on the door of the toilets")
or in a relative clause (like books in "the editor who rejected the books") is uncontroversially more distant from the subject head noun than the noun in a single PP modifier (like cabinets in "the key to the cabinets").⁸ Furthermore, feature percolation naturally encompasses the plural markedness effect if the feature system is privative (i.e., singular number is represented by default in the absence of any number feature, whereas plural number is represented by a plural feature). Then in the case of embedded singulars there will be no feature to percolate upwards, and therefore there should be no possibility for attraction to occur.

8 Notice, however, that feature percolation is less successful at explaining a contrast between relative clauses and sentential complements. In both these cases, the attractor is in direct object position and equidistant from the subject's maximal projection (modulo the presence of a D′ in the case of sentential complements). One way to capture the fact that relative clauses induce more attraction, consistent with a feature percolation account, is to suppose that the matrix clause subject head is misassigned number in its gap position within the relative clause, which is nearer the direct object position. This account would predict misagreement within the relative clause (which has not been tested); it would also seem to depend on how the relationship between the gap position in the relative clause and the head of the DP that contains it is formally mediated.

Most recently a variant of the feature percolation account has been quantitatively formalized by Eberhard, Cutting & Bock (2005), in their "Marking and Morphing" model. Their model provides a framework for marking number on the subject, by pooling the contributions of different noun sources into a continuously-valued parameter which they dub "SAP" (for "Singular-and-Plural"). A plural noun has a lexical number specification of 1, whereas a singular noun, with no collective interpretation, has a lexical number specification of 0. SAP is determined by weighting the lexical specification with the contrastive frequency of the plural.⁹ Each head contributes its own SAP to the overall SAP value of the subject projection, weighted by its structural distance to the subject root; see equation (19a). The continuous SAP values are mapped onto the probability of producing plural agreement on the verb via the logistic transformation (19b). Figure 2-2 summarizes this process for the phrase "the key to the cabinets" (showing the values for the free parameters, which were fit from the 17 studies Eberhard, Cutting & Bock surveyed).

9 The contrastive frequency weighting captures the fact that the more frequent the plural form of a given lexical item, the less effective it is for inducing attraction (Bock, Eberhard, & Cutting, 2004). In the limit are words like "suds," which (virtually) have no corresponding singular form.

(19) Marking and Morphing: Verb Model (Eberhard, Cutting & Bock, 2005)

(a) Root SAP, S(r), is the sum of the SAP, S(m_j), of each head, j, in the subject, weighted by its structural distance, w_j, to the root. S(n) is the default marking on the subject, which we can assume here to be 0 (i.e., singular).

$$S(r) = S(n) + \sum_j w_j \times S(m_j)$$

(b) S(r) is mapped onto the probability of producing plural agreement by the logistic transform. The term b is a constant bias, capturing the baseline production of plural agreement, when only singulars are present.

$$\frac{1}{1 + e^{-[S(r) + b]}}$$

Figure 2-2 Valuing subject number for "the key to the cabinets". The head "key" contributes nothing to SAP since it is singular. The attractor "cabinets" contributes its lexical SAP value, 1.15, multiplied by the structural distance weight for nouns in that position (w_L = 1.39). The subject SAP value of 1.60 translates into a probability of 13.9% of producing a plural verb. Taken from Eberhard, Cutting & Bock (2005), Figure 6.

2.5 Implications of feature percolation and objections

The erroneous feature percolation model of agreement attraction has been one of the most influential models of how a non-head noun within the subject can influence number on the verb. The idea of features being transferred via immediate dominance relations seems to elegantly capture the two major generalizations discussed above: plural markedness and hierarchical distance. Eberhard, Cutting & Bock's specific implementation of feature percolation fits, with four free parameters, data from seventeen experiments in which different sets of nouns, with different sets of properties, were used (and, in addition, is extended to the analogous phenomenon of pronoun attraction).

However, the representational commitments of an erroneous feature percolation model have perhaps not been adequately appreciated in the literature. There are severe implications for the grammatical fidelity of the representations, as well as resource consumption concerns for the encoding machinery. We will outline both of these concerns and consider some apparent empirical shortcomings of the percolation model. This discussion will lead to two direct tests of the model, accomplished by studying agreement processing in language comprehension.

First it bears emphasizing that what follows is not a critique of the concept of percolation as a grammatical device. Most major theories need something like percolation to capture the fact that phrases inherit their properties from a distinguished member of the phrase, i.e., the head. There are well-understood types of rules in syntactic theory that permit feature percolation. In some theories, like X′-theory (Jackendoff, 1977), the phrase structure rewrite rules are (schematically) constrained so that the features of the left-hand term match the features of one right-hand term.
Bare Phrase Structure (Chomsky, 1994) guarantees a similar outcome via the labeling convention for the Merge operation. Other grammars use mechanisms like implication (Lambek-style categorial grammars; Bayer & Marcus, 1996) or unification (Shieber, 1986). For example, Head-driven Phrase Structure Grammar (HPSG; Pollard & Sag, 1994) is a prominent unification grammar and it is constrained by the Head Feature Principle, which ensures the agreement features of a phrase are inherited from its head(s).

2.5.1 Erroneous feature percolation as erroneous rule application

What is at stake is the claim that erroneous feature percolation can occur in complex subjects, roughly in the way outlined in Figure 2-1. There are two ways of construing this claim. Under the first construal, the rule that governs feature inheritance by a mother node from its daughters is somehow misapplied, such that during the structure building process the plural feature of the embedded noun is successively inherited by its ancestors, until it reaches the subject node itself. A derivation for an erroneously plural-marked subject phrase is outlined in (20), with erroneous steps starred.

(20) An erroneous derivation for "the key to the cabinets"
(a) Build embedded DP: [DP the cabinets]_PL
(b) Build embedded PP: [PP to DP_PL]
(c) * Erroneously assign the number feature of DP_PL to PP: PP_PL
(d) Build highest NP: [NP key PP_PL]
(e) * Erroneously assign the number feature of PP_PL to NP: NP_PL
(f) Build subject DP: [DP the NP_PL]
(g) Assign plural to the highest DP: [DP the key to the cabinets]_PL

This rule-based construal of the erroneous feature percolation account therefore imputes grammatical ill-formedness to the encoding of the subject in two senses. First, features that are supposed to be inherited by a projection from its head are instead inherited from its complement. Secondly, these features must either be able to pass through the PP or they must be allowed to be assigned to it.
In the former case, percolation is allowed when there is not immediate dominance, as the subject projection inherits the number features of a DP that is not its daughter node. In the latter case, the PP ends up bearing number features, which is otherwise unattested in the grammar of English.¹⁰

10 If percolation must occur via immediate dominance, then there are at least two errors of rule application necessary to get the plural feature out of the most deeply embedded DP. If that were true, then the encoding of the subject is even sloppier than it appears on the basis of error rates. The logic of this conclusion is as follows. Assume that the likelihood of producing plural agreement is a monotone decreasing function of the number of erroneous rule applications. This assumption captures the fact that increasing hierarchical distance between subject head and attractor leads to fewer errors, since the extra distance requires more erroneous rule applications. Assume that plural agreement is produced only when the final erroneous rule application occurs, which assigns the number feature of the least embedded PP to the least embedded NP. If the 10-15% of attraction errors observed for complex subjects reflects the likelihood of both erroneous steps occurring in the same derivation, then it follows that there is a higher likelihood that only the first step occurred. More than 15% of the time, then, the feature composition of the subject is as follows, which would have no visible reflex:
(i) [DP the [NP key [PP to the cabinets]*PL ] ]
If erroneous rule applications are independent and equally likely events, occurring with probability p, then the ultimate error rate observed in complex subjects is p². If p² is the rate of attraction, 15%, then p is approximately 39%, and the likelihood of exactly one erroneous rule application (the first without the second) is p(1 - p), approximately 24%. In such a system, nearly two-fifths of all complex subjects of the form [DP [NP N_SG [PP P DP_PL ] ] ] contain ungrammatical number specifications (24% + 15% = 39%).

Feature percolation may very well not be step-by-step, along the dominance path. Den Dikken (2001) has proposed a movement analog of percolation that satisfies this condition. Den Dikken assumes that the features of an NP are promoted, via a Generalized Quantifier Raising-like mechanism, to adjoin with the determiner the NP merges with. He further assumes that successive cyclic movement is possible. Consequently, the following derivation is assumed to be possible, in which the formal features of the attractor NP, φ2, can QR to adjoin with the highest determiner.¹¹ These features c-command the formal features of the higher NP, φ1, and thus (by hypothesis) value the entire DP.

11 While den Dikken's (2001) justification for why the formal features of the embedded DP can licitly move upwards is somewhat obscure, the motivations for assuming it are clear. I repeat one of the arguments here. Following a claim attributed to Richard Kayne in class lectures (New York University, 1998), the author offers the following judgment: when the verb agrees with a quantified attractor inside a Sg [Pl] subject, only the wide-scope interpretation of the attractor is possible, as in (i).

(21) Formal feature movement: scoping out of embedded DP
(a) [DP the [NP2 cabinets]] → FF movement
(b) [DP [D φ2 [the]] [NP2 cabinets]]
(c) [NP1 key [PP to [DP [D φ2 [the]] [NP2 cabinets]]]] → FF movement
(d) [DP [D φ1 [the]] [NP key [PP to [DP [D φ2 [the]] [NP2 cabinets]]]]] → FF movement
(e) [DP [D φ2 [D φ1 [the]]] [NP key [PP to [DP [D φ2 [the]] [NP2 cabinets]]]]]

Den Dikken assumes that there really is no error here, just an option that the grammar's generative machinery provides. However, the likelihood of formal feature QR (from landing site to landing site), or the likelihood of checking agreement at LF (see fn. 11), must somehow be conditionalized on distance, to account for effects of hierarchy.
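The percolation accounts reviewed so far tie the likelihood of attraction to structural distance, and the arithmetic involved can be sketched in a few lines. The following Python sketch is illustrative only and is not the implementation of any of the cited authors. Its first part assumes, as in footnote 10, that erroneous rule applications are independent and equally likely (probability p) and that two of them are needed for the plural feature to reach the subject; 15% is the illustrative attraction rate used there. Its second part evaluates equations (19a) and (19b) for "the key to the cabinets", taking the lexical SAP value (1.15), the root SAP (1.60), and the plural probability (13.9%) from the caption of Figure 2-2. The attractor weight and the bias b are not taken as given but recovered from those three values, and all function and variable names are hypothetical.

```python
import math

# --- Part 1: step-by-step percolation (section 2.4, footnote 10) ---

def reach_probability(p_link: float, n_links: int) -> float:
    """Chance that a feature crosses n_links dominance links, assuming each
    link is crossed independently with the same probability p_link."""
    return p_link ** n_links

ATTRACTION_RATE = 0.15           # illustrative rate from footnote 10
p = math.sqrt(ATTRACTION_RATE)   # per-step probability if two steps are needed
first_step_only = p * (1 - p)    # first erroneous step occurs, second does not
both_steps = reach_probability(p, 2)

# --- Part 2: Marking and Morphing, equations (19a) and (19b) ---

def root_sap(default_marking: float, weighted_heads) -> float:
    """(19a): S(r) = S(n) + sum over heads j of w_j * S(m_j)."""
    return default_marking + sum(w * s for w, s in weighted_heads)

def plural_probability(s_root: float, bias: float) -> float:
    """(19b): logistic transform of S(r) + b."""
    return 1.0 / (1.0 + math.exp(-(s_root + bias)))

# Values quoted in the caption of Figure 2-2.
S_CABINETS, S_ROOT, P_PLURAL = 1.15, 1.60, 0.139

# Assumption: recover the attractor weight and the bias from the quoted
# values instead of asserting them independently.
w_attractor = S_ROOT / S_CABINETS
bias = math.log(P_PLURAL / (1 - P_PLURAL)) - S_ROOT

# "key" is singular, so S(m) = 0 and its weight (placeholder 1.0) is irrelevant.
s_r = root_sap(0.0, [(1.0, 0.0), (w_attractor, S_CABINETS)])

if __name__ == "__main__":
    print(f"per-step p = {p:.3f}, first step only = {first_step_only:.3f}, "
          f"both steps = {both_steps:.3f}")
    for depth in (1, 2, 3, 4):
        print(f"  {depth} link(s): {reach_probability(p, depth):.3f}")
    print(f"implied w = {w_attractor:.2f}, implied b = {bias:.2f}")
    print(f"P(plural | Sg[Pl]) = {plural_probability(s_r, bias):.3f}")
    print(f"P(plural | Sg[Sg]) = {plural_probability(0.0, bias):.3f}")
```

Under these assumptions the probability of reaching the subject shrinks geometrically with each additional dominance link, whereas in (19) distance enters through the weights w_j before the logistic transform is applied.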
11 (continued)
(i) The key to all the doors are missing. (wide scope for "all the doors": many keys; #narrow scope: one key)
Plural number on the verb and the wide-scope interpretation seem to go hand-in-hand, which might follow if the quantifier and agreement features move to a higher position in the DP, and if agreement checking can take place at LF. However, in both experimental studies and corpora of attested attraction, most examples do not involve an overt quantifier. Furthermore, the judgments here seem delicate to me. To my knowledge, they have not been studied experimentally, though there is a clear experimental prediction. The author points out (and attributes to Anastasia Giannakidou) that quantifiers restricted to narrow scope should not trigger attraction, as in (ii):
(ii) *The key to few doors are missing.
I have no difficulty with this sentence on a distributed reading. But the claim could be tested, by contrasting rates of attraction between sentences like (i) and (ii). Den Dikken (2001) also wishes to account for a lack of attraction when the attractor is pronominal, as in (iii):
(iii) *The identity of them are to remain secret.
In the same paper, it is argued that (weak) pronouns are invisible at LF, and consequently cannot trigger agreement attraction. It has recently been argued that attraction is most pronounced when both higher and lower nominals are case ambiguous (see Badecker & Kuminiak, 2007 for review). A case ambiguity requirement can explain the lack of attraction for English pronouns.

2.5.2 Erroneous feature percolation as uncontrolled spreading activation

There is another construal of erroneous feature percolation, in which valuation of the subject by the attractor noun's number specification does not result from the misapplication of a grammatical rule at the symbolic level. Rather, it stems from lower-level properties of how structured representations are encoded in a particular cognitive architecture. Eberhard, Cutting & Bock (2005) are most clearly proponents of this view. Features erroneously "percolate"
in structures because structures are encoded in spreading activation networks (Dell, 1986), and consequently, number features can passively migrate from node to node along active connections in any direction. In discussing their own model they write:

"When a source of number information is bound to a temporary structural network for an utterance, it transmits its information to the structure. Within the structure, the information moves or spreads according to principles of structural organization, assembly, and dissolution" (543).

The authors further clarify:

"[T]he transmission of SAP [the model's continuously-valued plural feature; MW] was treated as an activation-like process, with the weights of the connections or branches in the structural network modulating the amount of SAP that is transmitted across them (Stevenson, 1994) ... Because SAP may flow unobstructed throughout a structural network, number information bound anywhere within a structure has the potential to influence agreement processes. For this reason, even number information outside a subject or antecedent noun phrase (as in Hartsuiker et al., 2001)¹² can affect agreement, to a degree that is negatively correlated with its structural distance from the locus of agreement control" (544).

12 Hartsuiker, Antón-Méndez, & van Zee (2001) found that in production of Dutch V2 order, Subj[SG]-Obj[PL]-V sequences can induce agreement attraction, though at lower rates than complex subjects. These results fit uncomfortably with the feature percolation model. One option that does seem consistent, however, is to hold fast to the assumption that the number features on the object must interfere with agreement by overwriting the number features on the subject, or at least by percolating up to the S-level node where NP and VP unify. The difference in error rates would therefore reflect the fact that the hierarchical distance between the subject projection and the object is greater than the distance between the subject projection and a DP it contains.

The assumption that number features can be "transmitted" to the structural network is not innocent. In the SAP model, it is not explicitly stated that the "structural network" is isomorphic to the phrase structure representation, but it is strongly implied in the claim that the spread of activation follows "principles of structural organization, assembly, and dissolution" (543). Spreading activation models feature in several connectionist accounts of linguistic structure. They are based on a correspondence between the shape of a tree representation and the connectivity of the network: units in the network represent nodes in a tree, and connections between nodes represent edges between tree nodes (Selman & Hirst, 1985; van der Velde & de Kamps, 2006). Such an architecture is directly relevant to the binding problem laid out in Chapter 1: how can all the familiar primitive pieces of a syntactic representation be combined together in a novel fashion, on the fly? Spreading activation models that can instantiate any arbitrary pairing of words to structure require considerable resources. The reason is simple: since connectivity between nodes is assumed to be a fixed component of the architecture, all connections must be present in the network between any two nodes that may need to be related (van der Velde & de Kamps, 2006; cf. Plate, 1994). Moreover, to encode a significant extent of structure, it is necessary to have multiple nodes for each category type, since multiple tokenings may be necessary. For example, multiple DP and NP nodes are necessary to encode a complex subject, since a complex subject contains two DP category tokens and two NP category tokens.
Any lexical item that projects to a particular category must therefore be connected to multiple tokens of that category, since it cannot be known in advance which tokens of a category will be available for binding. Crucially, the same can be said of the detailed feature structure of a category: any given NP node must be able to bind any given nominal feature token, like a number feature.¹³

13 Since any given structure instantiates only a small subset of possible relations, spreading activation models must have an effective gating system, so that activation only spreads between some nodes. Van der Velde & de Kamps (2006) solve this problem by connecting category or lexical nodes not directly, but through special gating circuits. These circuits must be activated by the parsing module for activity to flow between two nodes.

The construal of erroneous feature percolation as evidence for the passive spread of information in a structural network only adds to these resource demands, in the form of enriched and likely useless connectivity. Because representations are themselves structured, a model in which an activated plural feature is able to share its activation with a nearby constituent in the structure requires the existence of pathways between those constituents specifically for the transmission of number information. If feature percolation occurs strictly via dominance paths, then in order for a plural feature to spread from an embedded DP upwards, two conditions must be met. Firstly, each category along the way must be able to bind a number feature. Consequently, each category node must be connected to number nodes. As before, the number feature may be otherwise grammatically inaccurate or uninterpretable, as in the case of binding a plural feature to PP. Secondly, for number activation to spread and thus cause the percolation of the plural feature, the number node bound to one category must be connected to the number node bound to the next category in the tree. Consequently, there must be rich connectivity between all the number nodes in the architecture. Though it is tempting to think of erroneous feature percolation as the passive spread of activation between connected nodes, it seems to require positing many more nodes and connections than would be necessary otherwise. Even if we put resource consumption worries aside, we see that the network must be structured in a very particular way to permit the occurrence of feature percolation: it is by no means an automatic property of embedding a syntactic structure in a spreading activation network, despite an apparent similarity between the mechanism and the metaphor.

2.5.3 Summary of erroneous feature percolation

In the previous sections we outlined some possible mechanistic pathways for erroneous feature percolation, and argued that these mechanisms lead us to make commitments about the encoding of structure that are suspect. However, even if these arguments leave us unperturbed, the fact remains that the feature percolation model of agreement attraction leads us to posit a sloppy and grammatically unfaithful encoding of the subject constituent. In the production work discussed thus far, evidence for a deficient encoding comes from the sheer fact that errors in agreement production occur. That is, we observe ungrammatical forms at particular rates controlled by structural properties of the subject. It is, however, not a necessary conclusion that the encoding of the subject is to blame for the production of these errors. Production requires the interaction of a complicated set of processes, and is not simply the direct translation of a syntactic tree into a string of words.
While the fact that structural properties of the syntactic representation strongly influence agreement attraction points to an encoding problem, patterns of access involved in sequencing the planning of constituents may also play a role. Indeed, the second, process-based class of explanations for agreement attraction has more explicitly tied attraction errors to the order in which the constituents are planned in production or are made available to determine the verb form (Bock & Cutting, 1992; Solomon & Pearlmutter, 2004; Badecker & Lewis, 2008). In the next section, we'll consider how a properly encoded representation of the subject could nonetheless give rise to agreement attraction.

2.6 The simultaneity account of agreement attraction

2.6.1 Simultaneity in planning a complex subject interferes with verb formulation

The basic idea behind process-based accounts of agreement attraction relies on some notion of simultaneous access to distinct constituents. Recall that Bock & Cutting (1992), and later Solomon & Pearlmutter (2004), observed a disparity between rates of agreement attraction induced by complex subjects containing PPs and those containing RCs. Under a percolation account, the higher rates observed in PP subjects compared to RC subjects are due to the decreased structural distance between the head and attractor in PP subjects. The embedded plural feature has to migrate a shorter distance in the PP case than in the RC case. However, Bock & Cutting originally attributed the difference to a clause-boundedness constraint on planning: features inside a relative clause are less available to interfere with the selection of the verb form, because the target verb is in a different clause. Recall that attraction errors are not totally eliminated in RC cases, so such a clause-boundedness constraint must be weakened somehow. Solomon & Pearlmutter (2004) explain agreement attraction in the same spirit as Bock & Cutting's clause-boundedness model.
They assume that constituents within closely-related semantic or syntactic domains may be simultaneously activated during the mapping of a conceptual structure onto a syntactic frame.¹⁴ If two similar constituents are co-active, then it is possible that features in one constituent will interfere with features of the same type in the other constituent, influencing agreement processes before the subject representation is completed. What is key here is simultaneity of information in the processing system. At the same time that the complex subject is being structured, the verb form is being prepared: interference is possible to the extent that multiple sources of similar information are available as the verb morphology is being computed.

14 One can get the basic intuition that there is simultaneity in planning by considering another type of speech error: exchange errors (e.g., Garrett, 1980). A classic exchange error is illustrated below:
(i) There are [DP memories of theory ] that do not assume storage costs.
(comment by William Badecker, 10 May 2008, Mayfest, University of Maryland)
Clearly the phrase "theories of memory" was intended to be produced in this case, but the speaker exchanged the hierarchically lower (and linearly later) head noun "theory" with the NP head "memory," implying that production planning does not map clearly onto serial or hierarchical order. Note that the plural morphology was not exchanged (and the verb form reflects this as well).

Solomon & Pearlmutter (2004) do not view simultaneity as being determined by a discrete property, like whether or not two constituents belong to the same clause; instead, simultaneity is determined (at least in part) by a concept they call "semantic relatedness." To get a sense of what they mean by semantic relatedness, consider two kinds of DP (their own examples): a phrase like "the ring of silver," consisting of a noun head and its PP modification, and a phrase like "the ketchup and the mustard," which is a conjunction of two NPs. In the first DP, "ring" and "silver" are considered to be more semantically related, because this DP refers to a ring which has an inherent property characterized by silver (namely, its material composition). Consequently, understanding the exact sense of "ring" expressed by this phrase is contingent upon the modification with the noun "silver." On the other hand, in the second DP, while "ketchup" and "mustard" are related by the conjunction and have a real-world relationship in the sense that ketchup and mustard often co-occur in the world, understanding what ketchup means in this phrase is not contingent upon understanding that it co-occurs linguistically with "mustard." "Ketchup" and "mustard" are less semantically related. In the planning of these DPs, Solomon & Pearlmutter propose that more semantically related nouns are more likely to be simultaneously active: therefore, "ring" is more likely to be co-active with "silver" than "ketchup" is with "mustard."

Solomon and Pearlmutter provide evidence for their view by showing a dependence between agreement attraction and semantic relatedness when structural distance is held constant. The more closely related two nouns inside a complex subject are (rated to be), the more likely that subject is to induce attraction. Nouns inside PPs that express accompaniment or locative relations (e.g. "the chauffeur with the actors", "the pizza with the tasty beverages") induce fewer agreement errors than those in a functional or attributive role (e.g. "the chauffeur for the actors", "the pizza with the yummy toppings"). Likewise, direct object nouns in relative clauses induce more agreement attraction than direct object nouns in sentential complements. The explanation of that contrast was somewhat mysterious for the percolation account because the relative distances between the subject head and the attractor are the same (see fn.
8); and it is perhaps no less mysterious in the absence of a theory of semantic relatedness (which the authors do not offer). However, ratings of "felt relatedness" between two nouns in an expression do show that an RC head noun and the direct object of the RC are felt to be more closely related than the direct object of a sentential complement. Perhaps the co-argument relationship between an RC head and the RC direct object is partially to blame.

2.6.2 The relationship between planning order and hierarchy

The major challenge that a simultaneity account faces is how to capture the structural effects outlined in section 2.3. The strategy for addressing this challenge is to posit that the order of planning, and consequently the likelihood of simultaneity between two head nouns, tracks hierarchical prominence, at least approximately. Consider Franck et al.'s (2002) contrast between rates of attraction induced by medial versus most deeply embedded NPs.

(22) (i) the inscription [ on the doors [ of the toilet ] ] > more attraction
     (ii) the inscription [ on the door [ of the toilets ] ]

It seems reasonable in this case to assume a higher likelihood of co-activation between "inscription" and "doors" than between "inscription" and "toilets", since there is a more direct relationship between "inscription" and "doors" than between "inscription" and "toilets" (indeed it is "doors" that mediates the relationship between the two). However, then consider Vigliocco & Nicol's (1998) finding in question formation, that there is roughly equal occurrence of forms like "The helicopter for the flights are ready" and "Are the helicopter for the flights ready?" We can maintain a correspondence between planning order and hierarchical prominence of the head and attractor nouns, but how the planning of those nouns is ordered with respect to the Aux will matter. Solomon & Pearlmutter (2004) are not explicit about how the Aux is selected.
But, intuitively, if there is a constant relationship between the planning of the Aux with respect to the subject head noun, regardless of the Aux's output order, then the same simultaneity mechanism should apply.

2.6.3 Disentangling representation and process-based accounts

The erroneous feature percolation model of agreement attraction attributes agreement attraction errors to a faulty representation of the subject. The simultaneity account attributes agreement attraction errors to a planning process that has difficulty distinguishing between similar constituents that are simultaneously available to the system. The consequence of both of these accounts is measured in terms of the rates of error production: that is, what is observed is that the system produces an error. However, we propose to disentangle the two accounts by looking at comprehension data.

In trying to understand the relation between agreement attraction and syntactic representation, evidence from agreement processing in comprehension provides a useful complement to evidence from production. Comprehension experiments provide greater control over the representations we can force participants to entertain, and crucially allow us to observe reactions to both grammatical and ungrammatical agreement. The feature percolation account makes a clear prediction about how grammatical and ungrammatical strings containing attractors should be processed: crucially, grammatical processing should be made more difficult to the extent that ungrammatical processing is eased. By comparing word-by-word reaction times in sentence reading, we can test this prediction. First, in the next section, we spell out this prediction in greater detail.
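The logic of the prediction just stated can be sketched numerically. What follows is a minimal illustration, not the author's model: it assumes a hypothetical misencoding rate (`p_misencode`) and hypothetical mean reading times (`mean_match`, `mean_mismatch`) for verbs that match or mismatch the number encoded on the subject, and it treats observed reading times as a mixture of the two. Under those assumptions, the cost added to grammatical attractor sentences equals the cost removed from ungrammatical ones.

```python
def predicted_means(p_misencode, mean_match, mean_mismatch):
    """Expected verb-region reading times if the subject's number is
    misencoded on a proportion p_misencode of trials (hypothetical values)."""
    # Grammatical verb: matches the encoded number unless the subject was misencoded.
    grammatical = (1 - p_misencode) * mean_match + p_misencode * mean_mismatch
    # Ungrammatical verb: mismatches the encoded number unless the subject was misencoded.
    ungrammatical = (1 - p_misencode) * mean_mismatch + p_misencode * mean_match
    return grammatical, ungrammatical


# Illustrative numbers only (ms); none come from the experiments discussed.
MATCH, MISMATCH = 400.0, 500.0
gram, ungram = predicted_means(0.15, MATCH, MISMATCH)
slowdown = gram - MATCH        # cost added to grammatical sentences
speedup = MISMATCH - ungram    # cost removed from ungrammatical sentences
print(gram, ungram, slowdown, speedup)
```

Both differences reduce to `p_misencode * (mean_mismatch - mean_match)`, so they are equal whatever values are assumed.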
2.7 Attraction in Comprehension

2.7.1 The Symmetry Prediction

Although the comprehension literature is smaller, its findings are generally convergent with those from production; such studies find that in the same scenarios in which individuals are likely to produce an agreement error, they experience less difficulty in processing an agreement error (Nicol, Forster & Veres, 1997; Pearlmutter, Garnsey & Bock, 1999). The types of experiments that comprise this literature, i.e., speeded judgment, reading time and electrophysiological studies, provide interestingly different measures than are found in production studies. Whereas the key measure in production studies of agreement attraction is the proportion of agreement errors that an individual produces, the key measure in comprehension studies is an index of processing difficulty, as reflected in a localized reading time difference (e.g. Pearlmutter, Garnsey & Bock, 1999), or a signature pattern of evoked potentials (e.g. Kaan, 2002). A number of studies have also used grammaticality judgments under time pressure following rapid serial visual presentation (Haussler & Bader, submitted) or word-by-word grammaticality judgments (Clifton, Frazier & Deevy, 1999). A crucial difference from production studies is that it is possible in a comprehension experiment to compare the language processing response to both grammatical and ungrammatical agreement in the context of agreement attraction environments.

The feature percolation account of agreement attraction makes a very clear prediction for comprehension of grammatical agreement when the subject would typically induce agreement attraction in production. The feature percolation account states that a complex subject like:

(23) [ DP D [ NP N SG [ PP P DP PL ] ] ]
     e.g., The key to the cabinets

is encoded with the plural feature in some predictable percentage of cases; say, 15%. When the input string includes this DP and an ungrammatical verb, i.e.
a plural form, in 15% of cases the verb and subject should appear to agree (and appear to misagree in 85% of cases). But, likewise, when the input includes this DP and a grammatical verb, i.e., a singular form, in 15% of the cases the verb and subject should appear to misagree (and appear to agree in 85% of the cases). This prediction we dub the symmetry prediction: the faulty encoding should lead grammatical input to be perceived as ungrammatical (to induce illusions of ungrammaticality), just as it should lead ungrammatical input to be perceived as grammatical (to induce illusions of grammaticality). The rates at which these "illusions" occur should be significant and comparable. Table 2-1 works out this prediction for a specific example.

Subject: The key to the cabinets

                     NUMBER                                  PERCEIVED GRAMMATICAL?
Grammatical     Encoded     Percentage of cases     Verb: was     Verb: were
Sg              Sg          85%                     Yes           No
Sg              Pl          15%                     No            Yes

Table 2-1 The Symmetry Prediction for Feature Percolation
"The key to the cabinets" is a grammatically singular phrase. If it is erroneously encoded as a plural phrase in some percentage of cases, then it should be perceived to agree with "were" in that percentage of cases, and to misagree with "was" in a similar percentage of cases.

We can thus test the erroneous feature percolation account in comprehension if we can measure perception of grammaticality across attractor and non-attractor containing sentences in which the verb matches the head of the complex subject. As mentioned above there are several methodologies that could potentially serve this purpose. We could ask participants to classify sentences as acceptable or not, and directly measure the distribution of responses, in a speeded grammaticality test. Such a test is relatively off-line, however, so we also turn to self-paced word-by-word reading for a finer time course measure. Reaction time studies in eye-tracking and self-paced reading have shown that reaction times increase sharply at the verb or on the word immediately following it
(Pearlmutter, Garnsey & Bock, 1999). And in electrophysiology Coulson et al. (1998), Hagoort et al. (1993), and Osterhout & Mobley (1995) have all shown that subject-verb agreement failures lead to a P600, the evoked response potential that is often taken to index syntactic processing difficulties, on the verb signaling misagreement.[15] Wagers, Lau & Phillips (2008) confirm that the online behavioral response to agreement violations is substantial. In Experiment 1, participants read sentences containing simple subjects, either singular or plural, and the copular or auxiliary BE, which either matched or mismatched the subject in number:

(24) (a) GRAMMATICAL: SG/PL SUBJECT
         The old key/s unsurprisingly was/were rusty from many years of disuse.
     (b) UNGRAMMATICAL: SG/PL SUBJECT
         The old key/s unsurprisingly were/was rusty from many years of disuse.

Figure 2-3 shows word-by-word reading times of such sentences. When participants read the verb, and immediately thereafter (Regions 5-6), there is a large main effect of grammaticality. Verbs that fail to agree with their subjects appear to substantially disrupt processing.

[15] The pattern of ERPs can shift if a more meta-linguistic task is added. Osterhout & Mobley (1995) found different components when an acceptability judgment task was added: an increased P2, and a Left Anterior Negativity.

Figure 2-3 The effects of subject-verb misagreement in reading
Reading times from Wagers, Lau, & Phillips (2008) Experiment 1. Sample sentence with subscripts to indicate region coding: The_1 old_2 key(s)_3 unsurprisingly_4 was/were_5 rusty_6 from_7 many_8 years_9 of_10 ... The large block arrows indicate the expected effects based on the Symmetry Prediction for feature percolation. Error bars indicate standard error of the mean (in each cell).

Superimposed on Figure 2-3 are the effects that the Symmetry Prediction for feature percolation leads us to expect.
The Sg [ Pl ] sentences are predicted to be liable to both illusions of grammaticality and illusions of ungrammaticality. Therefore, reading times in the verb region should speed up for the ungrammatical sentences (dashed arrow), since more sentences should seem grammatical. But likewise, reading times should increase for the grammatical sentences (solid arrow), since more sentences should seem ungrammatical.[16]

The most complete (and until the present work, nearly unique) online data set for processing agreement attraction sentences in English comes from Pearlmutter, Garnsey & Bock (1999). Do their data support the Symmetry Prediction? We will examine results from their experiments, which superficially seem to support Symmetry (as the authors concluded). However, after pointing out some problems which we believe block drawing this conclusion from their data, we report eight of our experiments (four of which are reported in Wagers, Lau, & Phillips 2008) that systematically fail to confirm the Symmetry Prediction. These results require an explanation that cannot be stated in terms of the faulty encoding of the subject; rather, like Solomon & Pearlmutter's (2004) simultaneity account of production, they require that the verb select the wrong feature from an otherwise properly encoded representation. Structure insensitivity thus arises from how online processes navigate the structured representation.

[16] One might worry that the symmetry prediction does not hold, due to the characteristically skewed distribution of RTs in SPR studies. In Appendix A, we show that the symmetry prediction does hold in mean reaction times. To spell out this logic briefly, suppose that the reading time on an individual trial for a verb which matches the number encoded on the subject projection is drawn from RT distribution M, and that the reading time for a verb which fails to match the encoded number is drawn from distribution M'.
The Symmetry Prediction for reading times can be modeled explicitly if we assume that the distribution of reading times observed experimentally is a linear mixture, sampling from M and M'. The mixing proportions depend on the likelihood of a faulty encoding of the subject DP. In Appendix A, we report the simulations showing that this predicts linear effects on the means of reaction time distributions, even for the highly skewed Ex-Gaussian RT distributions that are typical of reading. The difference in means between an 85/15% M/M' distribution and a 100% M distribution is identical to the difference in means between a 15/85% M/M' distribution and a 100% M' distribution. Therefore the feature percolation account does predict a mirror effect in RTs for attraction conditions.

2.7.2 Pearlmutter, Garnsey, & Bock (1999)

Pearlmutter, Garnsey, & Bock (1999; henceforth, also PGB) have conducted the most extensive experiment on agreement attraction in comprehension. They considered both word-by-word self-paced reading and eye-tracking measures of comprehension for grammatical and ungrammatical sentences with complex subjects containing attractors. In their experiments, they considered only singularly headed complex subjects, and crossed the number of the attractor (Sg or Pl) with grammaticality. An example materials set is given below:

(25) (a) SG [ SG ] / GRAMMATICAL: The key to the cabinet was rusty.
     (b) SG [ SG ] / UNGRAMMATICAL: The key to the cabinet were rusty.
     (c) SG [ PL ] / GRAMMATICAL: The key to the cabinets was rusty.
     (d) SG [ PL ] / UNGRAMMATICAL: The key to the cabinets were rusty.

Reaction time results from their Experiment 1, a word-by-word self-paced reading experiment, are plotted in Figure 2-4 (their Figure 1). Results are given as residual reading times with 95% confidence intervals (for graphical inference).[17] Consider Region 7, the region immediately following the verb (which is outlined). There is a large grammaticality effect: compare the SG [ SG ] / UNGRAMMATICAL sentences to their grammatical counterpart. However, the attraction sentences, SG [ PL ] / UNGRAMMATICAL, were read significantly faster than the SG [ SG ] / UNGRAMMATICAL sentences. SG [ PL ] / UNGRAMMATICAL was still read significantly more slowly than either of the grammatical sentences (SG [ SG ] or SG [ PL ]), but it is a smaller effect exactly for those sentences that are produced in the elicited production tasks. By the Symmetry Prediction, we should find a comparable effect in the opposite direction when we compare SG [ PL ] / GRAMMATICAL to SG [ SG ] / GRAMMATICAL. Indeed we find the prediction upheld in Region 7: SG [ PL ] / GRAMMATICAL is read more slowly than SG [ SG ] / GRAMMATICAL. Consistent with the Symmetry Prediction, RTs in both attraction conditions have moved symmetrically to an intermediate position between the non-attraction conditions: there is a significant interaction between grammaticality and attractor number.

[17] Residual reading times linearly regress out the relationship between reading time and word length. In doing so, the measure also takes out overall differences in means between participants. This measure is used widely in the sentence processing literature, since there is a monotone increasing relationship between word length in characters (or some property that correlates with length) and reading times. See Ferreira & Henderson (1991). However, residual RTs are often used somewhat indiscriminately: a better solution is to counterbalance materials for length, so that the distribution of character lengths for a region of interest is identical across conditions. Almost all researchers, including Pearlmutter, Garnsey, & Bock (1999), do this as a matter of course. In an analysis of a large corpus of our own RT experiments (including experiments not reported here), we found that there was indeed a reliable, linear RT trend in word length, when collapsed across participants and experiments: each character contributed 3-5 ms of extra reading time. However, there is considerable variability among participants (and experiments), and only a subset exemplify the overall pattern. Some participants exhibit non-linear relationships, like a stepwise increasing function; others an opposite pattern, like a decreasing linear function; and many no apparent relationship at all. More worryingly, there are widespread spillover effects in self-paced reading. The benefit of the regression may be greatly diminished if effects of word length are also strong in the spillover region. If the experimenter must take word length into account, it is best accomplished in the context of a mixed-effects model to account for individual subject variability and prior region dependencies (see Vasishth, 2006 and our section 2.9.2).

Figure 2-4 Pearlmutter, Garnsey & Bock (1999) Experiment 1 Results
Grand mean residual reading time by word position. Error bars are 95% confidence intervals for differences between cell means (over participants analysis). The Match factor refers to attractor number: "Mismatch" corresponds to plural attractors. Region 7, showing the attractor effect, is outlined. Figure and text adapted from PGB Figure 1.

The authors conclude:

    This pattern across the four conditions is consistent with the idea that the effect of a head/local NP-mismatch is to increase the probability of an error in computing the number of the subject NP, resulting in more mismatch-induced seeming errors in the grammatical conditions, but fewer mismatch-induced seeming errors in the ungrammatical conditions (436).

They also interpret the data to support Symmetry. What is worrisome about this dataset, however, is that the interaction between grammaticality and attractor number is observed following a large main effect of attractor number on the verb.
The fact that there is no effect of grammaticality on the verb itself is not troublesome, as effects of processing difficulty are often delayed by a region or two in self-paced reading. However, the fact that an effect of attractor number is observed before the interaction between number and grammaticality creates a serious ambiguity for interpreting the data: does the slowdown observed in grammatical attractor conditions really reflect an illusion of ungrammaticality? Or does it reflect a (temporary) shift in the baseline RT for Sg [ Pl ] subjects? Noun number is well-known to affect lexical decision times, although the size of this effect is modulated by root and surface frequency (New, Brysbaert, Segui, Ferrand & Rastle, 2004; Lehtonen, Niska, Wande, Niemi & Laine, 2006). There is now evidence that many of the effects from lexical decision experiments can impact techniques employed in sentence comprehension experiments, like self-paced reading and eye-tracking (Niswander, Pollatsek, & Rayner, 2000; Bertram, Hyönä, & Laine, 2000; Lau, Rozanova, & Phillips, 2007). Indeed the simple subject-verb agreement experiment summarized in Figure 2-3 shows an apparent morphological complexity effect in Region 4, one word beyond the plural DP. The main effect of attractor number observed on the verb may thus be a consequence of increased morphological complexity in that condition. If such effects can spill over to the following region, then the data do not unambiguously support the Symmetry Prediction. Notice that the same concerns do not apply to the ungrammatical attractor condition, because a baseline shift due to morphological complexity would only help to obscure any attractor effect. Since the attractor effect, in the form of a downward shift in reading times, is present, we can still conclude that attractor conditions are easier to process when they are ungrammatical.
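The ambiguity described above can be made concrete with a small numerical sketch. It is an illustration under assumed values, not an analysis of PGB's data: `BASELINE_SHIFT` is a hypothetical spillover cost of plural morphology, `ATTRACTION_GAIN` is a hypothetical easing of ungrammatical attractor sentences, and the cell means are invented. No illusion of ungrammaticality is built in, yet the grammatical attractor condition still slows down, while in the ungrammatical attractor condition the shift can only shrink the observed attraction effect.

```python
# Hypothetical mean reading times (ms) in the post-verbal region.
GRAMMATICAL, UNGRAMMATICAL = 400.0, 500.0
BASELINE_SHIFT = 20.0    # assumed spillover cost of a plural attractor
ATTRACTION_GAIN = 40.0   # assumed easing of ungrammatical attractor sentences


def cell_means(baseline_shift, attraction_gain):
    """Condition means when attraction eases only ungrammatical sentences
    and a plural attractor adds a constant cost to every sentence."""
    return {
        ("Sg[Sg]", "gram"): GRAMMATICAL,
        ("Sg[Sg]", "ungram"): UNGRAMMATICAL,
        ("Sg[Pl]", "gram"): GRAMMATICAL + baseline_shift,
        ("Sg[Pl]", "ungram"): UNGRAMMATICAL - attraction_gain + baseline_shift,
    }


means = cell_means(BASELINE_SHIFT, ATTRACTION_GAIN)
# Apparent "illusion of ungrammaticality": produced by the baseline shift alone.
gram_slowdown = means[("Sg[Pl]", "gram")] - means[("Sg[Sg]", "gram")]
# Observed attraction effect: smaller than the assumed gain because of the shift.
ungram_speedup = means[("Sg[Sg]", "ungram")] - means[("Sg[Pl]", "ungram")]
print(gram_slowdown, ungram_speedup)
```

With these particular values the two differences come out equal, so the pattern would look symmetrical even though no grammatical sentence was ever misperceived.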
Pearlmutter, Garnsey, & Bock (1999) replicated the same symmetrical pattern in eye-tracking (their Experiment 2), where RT effects are typically reported to be more tightly aligned to manipulated regions. In addition to finer temporal resolution, eye-tracking experiments also provide more measurements of processing difficulty, since it is possible to track (separately) initial fixations on a word of interest, re-fixations, and total fixations.[18] The total fixation time measure, which adds up time spent in a region across an entire trial, shows the same interaction between attractor number and grammaticality as in the self-paced reading experiment, on the verb, the post-verbal region, as well as on the attractor itself. Because this measure incorporates re-reading (as the presence of the interaction on the attractor indicates), it is difficult to address the concerns we raised above. First-pass reading times, in which only the initial time spent in a region (that is, before moving out of that region, in any direction) is measured, are more germane. Interestingly, in first-pass reading times, there were no reliable effects of experimental condition, except a marginal by-items interaction in the region following the verb, where the SG [ SG ] / GRAMMATICAL condition was read reliably faster than all other conditions. SG [ SG ] / GRAMMATICAL ("The key to the cabinet is ...") thus appears to be a distinguished experimental condition in very early measures (see Figure 2-5). Whether this has anything to do with agreement is unclear. This condition is also the only one that has no plural morphology whatsoever, nominal or verbal. As the authors stress, though, readers may adopt one of (at least) two strategies to violations in reading: spend more time on the violation region, or regress and re-read. So first-pass measures in a reading-time experiment that contains violations may not be as diagnostic of early processes as they might initially seem (see also fn. 18).

[18] Eye-tracking experiments also putatively involve a different kind of language processing, since experimental participants have access to an external memory aid. In this respect, reading in eye-tracking experiments is more similar to naturalistic reading; one might then debate whether self-paced reading is likewise more like natural language processing, at least mnemonically. It is possible, for example, that the nature of the encoding and navigation problem may change with an external memory. Indeed, an interesting aspect of Pearlmutter, Garnsey, & Bock (1999), which we do not discuss, is where in the text readers are likely to saccade to following a violation, and whether such saccades are under linguistic control (the short answer appears to be no).

Figure 2-5 Pearlmutter, Garnsey & Bock (1999) Experiment 2 Results: First-pass reading times
Grand mean residual reading time by word position. Error bars are 95% confidence intervals for differences between cell means (participants analysis). The Match factor refers to attractor number: "Mismatch" corresponds to plural attractors. Figure and text adapted from PGB Figure 2.

In Experiment 3, another self-paced reading experiment, the authors replicated the grammatical slowdown observed in Experiment 1. In that experiment there were no ungrammatical conditions, only a comparison with plurally-headed complex subjects (in order to test whether the plural markedness generalization holds in comprehension). This slowdown could be liable to our morphological complexity concern, except that no slowdown is observed for Pl [ Pl ] conditions in that region (relative to Pl [ Sg ]). But there does appear to be at least a trend for slower RTs following plural attractors; and the Pl [ Pl ] condition is read reliably slower than Pl [ Sg ] in the following region.

In sum, Pearlmutter, Garnsey, & Bock (1999) provide evidence that is consistent with the Symmetry Prediction of the feature percolation account.
If the interactions of grammaticality and attractor number that they observe in verbal and post-verbal reading times reflect only processes of agreement, then we conclude that agreement attraction configurations can lead both to illusions of ungrammaticality and to illusions of grammaticality in comprehension. However, there are concerns with the timing of the attraction effect with respect to the timing of putative morphological complexity effects. It is difficult for us to make this case conclusively, because there is not complete alignment across their three data sets. The first-pass measures in PGB's Experiment 2 are too non-selective for the manipulations of interest, because they do not segregate the ungrammatical conditions. If morphological complexity can explain the slowdowns observed in PGB's Experiment 3, then it must be part of an account that is sufficiently nuanced to explain why plurals in singularly-headed subjects offset reading times with a delay of one region, whereas plurals in plurally-headed subjects do so only after two regions. It is worth considering, though, that feature percolation is not unambiguously supported either, even if concerns of morphological complexity could be overcome in PGB. For example, if the subject DP's number is simply faultily encoded, then one still wants an explanation for why the attractor effect appears earlier in grammatical sentences than in ungrammatical ones.

2.7.3 Summary

There are good reasons to suppose that a faulty encoding of the subject lies behind the basic agreement attraction phenomena. However, as we have tried to stress, there are also good reasons for skepticism, both theoretical and empirical. We will not dwell any longer on interpreting the record, and instead report the results from several experiments which show that fallibility to agreement attraction configurations in comprehension is selective.
Overwhelmingly, agreement attraction leads to illusions of grammaticality for actually ungrammatical strings, but does not engender (at a rate that we can reliably detect) illusions of ungrammaticality for actually grammatical strings. This kind of pattern cannot be explained on the assumption that the subject encoding is faulty.

2.8 Testing Percolation I: Relative Clause Attraction

2.8.1 Kimball & Aissen (1971), Relative Clause Attraction & Experiment 1-2 Rationale

In the first set of experiments we present testing percolation, we will examine a non-canonical agreement attraction configuration. Thus far the discussion has focused on complex subjects like "the key to the cabinets." Testing the erroneous feature percolation model in comprehension with complex subjects, we have argued, is difficult. Feature percolation predicts that the processing load in Sg [ Pl ] grammatical strings should increase, but the presence of a local plural also predicts a slow-down for exactly the same strings. Fortunately there is another configuration of agreement controller and attractor, one that behaves similarly to complex subjects in inducing attraction, but does not put a plural in the local environment of the verb: agreement inside relative clauses.

Kimball & Aissen (1971) first brought to our attention the contrast between the clearly ill-formed (26a) and the apparently more acceptable (26b):

(26) (a) * The politician who the farmer refuse to vote for ...
     (b) ? The politicians who the farmer refuse to vote for ...

Below, we will show that this pattern is quite similar to the canonical agreement attraction phenomenon that obtains with complex subjects. But there are two properties of this construction that we want to first highlight which make it an interesting experimental test of an account based upon erroneous feature percolation.
First, the local environment of the verb is identical in grammatical sentences where the number of the relative clause head is varied:

(27) (a) RC HEAD SG: The politician who the farmer refuses to vote for ...
     (b) RC HEAD PL: The politicians who the farmer refuses to vote for ...

This construction therefore offers a potentially "purer" test of whether or not grammatical subject-verb agreement is processed more slowly when an attractor is in the structure. Secondly, this configuration inverts the hierarchical relationship between subject head and attractor. We assume that the relative clause examples have an adjunction structure (see Bhatt, 1999, for a discussion of alternatives), where the NP attractor "politicians" c-commands the RC subject:

(28) [ DP the [ NP [ NP politicians ] [ CP who [ TP [ DP the farmer ] refuse to vote for ] ] ] ]
     (attractor: politicians; subject: the farmer)

In contrast, the subject head c-commands the attractor in the canonical complex subject cases of attraction:

(29) [ DP the [ NP key [ PP to [ DP the cabinets ] ] ] ] were rusty ...
     (subject: key; attractor: the cabinets)

Naturally this raises the question of whether there is a real parallel between the two types of attraction. In the case of complex subjects, the dominance path proceeds from attractor to subject so that the attractor's features are always only inherited by a dominating category (attractor DP < PP < NP subject). For relative clauses, the dominance path from attractor to subject goes in the opposite direction (attractor NP > CP > TP > DP subject). This reversal of dominance relationships is only troublesome (perhaps) for the erroneous rule application construal of percolation. Recall that Eberhard, Cutting, & Bock's (2005) more comprehensive theory holds that "number information bound anywhere within a structure has the potential to influence agreement processes ... to a degree that is negatively correlated with its structural distance from the locus of agreement control" (544).
There are arguably more major category boundaries that separate subject and attractor in the relative clause configurations (and, in particular, a clausal boundary), yet the dominance path length distance is comparable. Consequently, Eberhard, Cutting, & Bock (2005) should offer parallel explanations for complex subject and relative clause attraction.

Because the local environment around the verb is identical across attracting and non-attracting configurations, and because the hierarchical relationship between attractor and subject is inverted relative to complex subjects, the comprehension pattern for relative clause attraction is informative across a variety of basic results. Table 2-2 details the interpretation of potential reaction time patterns, if the RC sentences were compared in an experiment similar to Pearlmutter, Garnsey, & Bock (1999).

Attractor effect on RT            Erroneous feature percolation
Ungrammatical    Grammatical      Via rule application / Via spreading activation
?                ?                ? Strongly supports feature percolation, both construals. No "local plural" confound.
?                ~                ? Contradicts feature percolation, both construals. Verifies "attracting" property of RC heads.
~                ~                ? Consistent, upwards-only percolation / ? Contradicts, as currently formulated
~ ? ? ? ? ~                       ? Consistent with no feature percolation theory. RC attraction may be fundamentally different.

Table 2-2 Interpretations of reading time patterns in relative clause agreement comprehension

Relative clause agreement therefore provides a compatible, and even stronger, test of feature percolation models than complex subjects. Before embarking on an experiment, it is important to consider whether there really are appropriate analogies between the two configurations. Informal intuition suggests the answer is yes. The plural markedness distinction holds: there is no similar amelioration of a plural subject / singular verb mismatch by a singular relative clause head:

(30) (a) * The politicians who the farmers refuses to vote for ...
     (b) * The politician who the farmers refuses to vote for ...

There is a suggestion of a hierarchical distance effect reported in Kimball & Aissen. They report that the amelioration appears to go away if the misagreeing subject-verb pair is more deeply embedded [19]:

(31) (a) *The politician who the farmer believes his neighbor refuse to vote for ...
     (b) ?*The politicians who the farmer believes his neighbor refuse to vote for ...

Kimball and Aissen's study was observational, and they attributed this pattern to a Northeast US/Boston dialect, for whose speakers (26a) is unacceptable, but (26b) is fine. To the (non-Northeastern) ear of myself and others, the contrast holds, however. Moreover, examples of the relative clause type can be found in well-edited texts, such as the New York Times, and a University of Arizona honors thesis [20]:

(32) (a) We can live with the [NP errors_PL ] [RC that classification software_SG make_PL ... ] (Nunberg 2003, p. 5)
     (b) These include ... the [NP problems_PL ] [RC that incrementality_SG pose_PL ... ] (Byram 2007, p. 58)

They can also be observed in casual speech:

(33) In what ways do the [NP hypotheses_PL ] [RC one_SG entertain_PL ] influence visual information search?
     (Mike Dougherty, 1 Feb 2008, University of Maryland Psychology Department talk)

[19] An interesting further claim by Kimball & Aissen, to which we cannot devote much attention here, gives a more nuanced view of how distance from verb to relative clause head could impact agreement. One embedding seems to neutralize the amelioration of the subject-verb mismatch, and render the sentence bad again, as in (31b), repeated as (i)-(ii). But if plural agreement is realized on each verb in the relative clause, the contrast is claimed to re-appear, (iii)-(iv).
  (i) *The politician who the farmer believes his neighbor refuse to vote for ...
  (ii) ?*The politicians who the farmer believes his neighbor refuse to vote for ...
  (iii) *The politician who the farmer believe his neighbor refuse to vote for ...
  (iv) The politicians who the farmer believe his neighbor refuse to vote for ...
Kimball & Aissen consider this to be a kind of cyclicity effect in agreement. They also note that the same attraction effect seems to hold for other A′ dependencies, as in (v), Topicalization, and (vi), Wh-questions:
  (v) (a) *This politician, the farmer refuse to vote for
      (b) These politicians, the farmer refuse to vote for
  (vi) (a) *Which politician do the farmer refuse to vote for?
       (b) Which politicians do the farmer refuse to vote for?
To our ear, however, the contrasts also seem to hold in these examples, though the effect inside relative clauses remains the clearest.

[20] RC attraction also proved difficult to avoid in the composition of this dissertation. Here is an error made during the writing of Chapter 4 (subsequently corrected): ... testing [NP sentences_PL ] [RC in which the verb_SG do_PL not participate in a filler-gap dependency ]

Some recent experimental work using cumulative (word-by-word) grammaticality judgments confirms an amelioration for ungrammatical Sg-Pl agreement when it is embedded inside a plurally-headed relative clause (Clifton, Deevy & Frazier 2001). Indeed, in the first significant production study on agreement attraction, Bock and Miller (1991) demonstrated strong attraction effects in production for relative clause constructions in their Experiment 3, and recently Franck and colleagues (2006) observed such effects in a French production experiment (though they did not find a plural markedness asymmetry). Interestingly, neither Bock and Miller nor Franck et al. make much of the structural distinction between complex subjects and the relative clause configuration.

There are clear similarities between RC attraction and complex subject attraction. More importantly, as detailed in Table 2-2, the RC attraction configurations seem capable of resolving the question of whether feature percolation is active, and, potentially, which construal of feature percolation is relevant.
Therefore, Experiments 1 & 2 below report the online effects of agreement attraction in object relative clause constructions.

2.8.2 Experiment 1

In Experiment 1 participants read singular-subject relative clauses in a moving window, word-by-word, self-paced reading task. This experiment crossed the factor GRAMMATICALITY (whether or not the verb inside the relative clause matched the subject in number) with ATTRACTOR NUMBER (whether the relative clause head was singular or plural). This experiment is therefore the relative clause analog of Pearlmutter, Garnsey, & Bock (1999)'s Experiment 1.

2.8.2.1 Materials and Methods

Note: The experiments reported in this chapter are self-paced reading and speeded grammaticality judgment studies. Because these kinds of studies re-occur in subsequent chapters, we will describe the procedure and analysis techniques in close detail below. In subsequent discussions of similar experiments we will summarize the methodological aspects of the experiment more briefly, unless there are crucial differences.

Standard Errors: Unless otherwise noted, error bars in data figures indicate the standard error of the cell means. Confidence intervals for experimental comparisons are reported in the text.

Participants
Participants were 30 native speakers of English from the University of Maryland community with no history of language disorders.

Sample set of experimental items for Experiment 1

  Grammaticality   Attractor number   Sentence
  Grammatical      Sg                 The musician who the reviewer praises so highly will probably win a Grammy.
  Grammatical      Pl                 The musicians who the reviewer praises so highly will ...
  Ungrammatical    Sg                 The musician who the reviewer praise so highly will ...
  Ungrammatical    Pl                 The musicians who the reviewer praise so highly will ...

Table 2-3 Sample materials for Experiment 1

Materials
Experimental materials consisted of 48 sentence sets arranged in a 2 × 2 design with relative clause head number (singular/plural) and grammaticality (grammatical/ungrammatical) as factors.
An example set is presented in Table 2-3. The first six words always contained a noun phrase modified by a relative clause, following the form det-noun-"who"-det-noun-verb. The subject-verb agreement manipulated here was the agreement between the noun and verb contained in the relative clause, and thus the head noun modified by the relative clause was considered to be the "attractor". The subject of the relative clause was always singular. Because in this design the noun immediately adjacent to the verb was always singular, effects of morphological complexity were not a concern. The word following the critical verb was usually a short function word and never carried agreement. The 48 sentence sets were distributed across 4 lists in a Latin Square design, and were combined with 24 items of a prepositional-phrase agreement attraction design (all grammatical; these data are reported in section 2.9.1) and 192 filler sentences of similar length. Experiment-wide, the percentage of ungrammatical sentences was 13.6%.

Procedure
Sentences were presented on a desktop PC using the Linger software (Doug Rohde, MIT) in a self-paced word-by-word moving window paradigm (Just, Carpenter, & Woolley, 1982). Each trial began with a screen presenting a sentence in which the words were masked by dashes, while spaces and punctuation remained intact. Each time the participant pressed the space bar, a word was revealed and the previous word was re-masked. A yes/no comprehension question appeared all at once on the screen after each sentence. The "f" key was used for "yes" and the "j" key was used for "no". Onscreen feedback was provided for incorrect answers. Participants were instructed to read at a natural pace and answer the questions as accurately as possible. Order of item presentation was randomized for each participant. Seven practice items were presented before the beginning of the experiment.

Analysis
Only items for which the comprehension question was answered correctly were included in the analysis.
Reading times that exceeded a threshold of 2.5 standard deviations, by region and condition, were excluded (Ratcliff, 1993). Regions 2-10 were examined; the critical verb appeared in region 6. Data for each of these 9 regions were entered into a 2 × 2 repeated-measures ANOVA with attractor number and grammaticality as factors. Using R (R Development Core Team, Vienna), ANOVAs were computed on participant mean reading times across items (F1) and on item means across participants (F2). Min F′ statistics (Clark, 1973; Raaijmakers, Schrijnemakers, & Gremmen, 1999) were also computed, although because our items were counterbalanced across lists, this test is probably too conservative (see Raaijmakers et al., 1999 for discussion). These statistics are presented in full in Appendix B. Since it has been argued to be problematic to determine confidence intervals from repeated measures ANOVAs in a way that treats participants as random effects (Blouin & Riopelle, 2005), we performed a complementary analysis by fitting linear mixed-effects models to our data. Models were fit using restricted log-likelihood maximization, simultaneously controlling for subject and item as random factors; 95% confidence intervals (CIs) were then derived from these models by Markov Chain Monte Carlo simulation (Baayen, Davidson, & Bates, submitted). These are presented in the text.

2.8.2.2 Results

Comprehension Accuracy
Mean comprehension question accuracy for experimental items across participants and items was 92.3%, and did not differ across conditions (logistic mixed-effects model, ps n.s.). Two participants showed a comprehension question accuracy rate of less than 80% across all items and were thus excluded from further analysis.

Reading Times
Figure 2-6 summarizes the reading time results from Experiment 1. The critical verb region (R6) did not show a main effect of grammaticality (Fs < 1). However, the spillover region (R7) showed main effects of attractor number and grammaticality, and crucially, their interaction.
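The min F′ statistic mentioned in the Analysis can be computed directly from the by-participants (F1) and by-items (F2) ratios. The following is a minimal sketch of the standard formula from Clark (1973), not the analysis code used for these experiments; the function name, the example F values, and the degrees of freedom are hypothetical and chosen only for illustration.

```python
def min_f_prime(f1, f2, df_error_f1, df_error_f2):
    """Return (min F', denominator df) following Clark (1973).

    f1, f2         : by-participants and by-items F ratios
    df_error_f1/f2 : error (denominator) degrees of freedom of F1 and F2
    The numerator df of min F' is the same as that of F1 and F2.
    """
    if f1 <= 0 or f2 <= 0:
        raise ValueError("F ratios must be positive")
    min_f = (f1 * f2) / (f1 + f2)
    # Denominator df: (F1 + F2)^2 / (F1^2 / n2 + F2^2 / n1)
    df_denom = (f1 + f2) ** 2 / (f1 ** 2 / df_error_f2 + f2 ** 2 / df_error_f1)
    return min_f, df_denom


# Hypothetical inputs (not values reported in this chapter)
min_f, df_denom = min_f_prime(f1=4.0, f2=4.0, df_error_f1=27, df_error_f2=47)
print(f"min F' = {min_f:.2f}, denominator df = {df_denom:.1f}")
```

Because min F′ can never exceed the smaller of F1 and F2, it is the more conservative criterion, which is consistent with the caveat above about counterbalanced items.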
Pairwise comparisons showed that there was a significant grammaticality effect only when the relative clause head was singular (grammatical mean = 337 ms; ungrammatical mean = 399 ms; 95% CI = 27.2 ms, p < .05), but not when it was plural (grammatical mean = 331 ms; ungrammatical mean = 341 ms; 95% CI = 18.7 ms, p > .1). In region 8, both main effects and the interaction persisted. No significant effects were found in regions 9 and 10 (Fs < 1), except for a marginal effect of attractor number.

Figure 2-6 Experiment 1: Relative Clause Attraction Reading Time Results

We interpret the specificity of the grammaticality response to singular-headed relative clauses, and in particular the absence of such an effect when the relative clause head is plural, as an online attraction effect. The grammaticality effect, however, reflects the difference between ungrammatical and grammatical conditions when a misagreeing verb is read. The crucial question is whether the grammatical conditions themselves show any sensitivity to attractor number: that is, was there an illusion of ungrammaticality? To investigate this possibility we conducted a pairwise comparison over the two grammatical conditions in the regions following the verb. In R7, the region immediately following the verb, where the grammaticality effect is strongest, no significant differences were found (singular mean = 337 ms; plural mean = 331 ms; 95% CI = 16.3 ms; p > .1). No differences were discovered in subsequent regions.

Also of interest is whether or not there were any effects of morphological complexity in this experiment. Region 2 (the attractor) showed a main effect of attractor number, such that the plural conditions had longer reading times (plural mean = 322 ms; singular mean = 311 ms; 95% CI = 7.68 ms). Since the verb was not encountered until region 6, this effect could not have been driven by agreement, and was therefore plausibly related to the additional length and morphological complexity of the plural nouns relative to the singular nouns.
Longer reading times for the plural-attractor conditions persisted to the critical verb (R6); the main effect of attractor number was reliable at the relative pronoun (R3) and the relative clause subject noun (R5); however, these later effects were relatively small.

2.8.2.3 Discussion

The results of Experiment 1 suggest that the head of the relative clause can act as a strong attractor for agreement. When a singular subject is followed by a plural verb, and the RC head is singular, there is a large reading time disruption, as compared to a singular subject-singular verb sequence. However, when the RC head was plural, the ungrammatical singular subject-plural verb sequences did not differ from the grammatical singular subject-singular verb condition at the spillover region. Thus, as in previous studies of agreement attraction in comprehension using PP-modified subjects, we found that an RC head attractor matching the RC verb's number significantly reduced the reading time disruption normally seen with subject-verb number mismatch. The main effect of attractor number in region 2 also corroborates the finding in Experiment 1 that plurals result in longer reading times than do singulars in self-paced reading.

These results contrast with the predictions of a feature percolation account. Such an account crucially predicts that the reduction in sensitivity to grammatical violation should be driven symmetrically: ungrammatical conditions should improve, and grammatical conditions should worsen. However, we also found that there was no attractor effect in the grammatical cases. The grammatical plural attractor condition did not show significantly longer reading times than the grammatical singular attractor condition. This result is inconsistent with the hypothesis that attraction effects in comprehension are due to a faulty representation of the subject. This finding contrasts with some previous work which did show grammatical attractor effects (e.g. Pearlmutter et al. 1999; see also Nicol et al. 1997).
Consistent with our discussion above, it seems plausible that differences in the local environment were partially responsible for the differences between grammatical conditions seen in those studies. In this experiment, because of the non-intervening nature of the relative clause configuration, the local environment was matched (singular subject). However, another possibility is that the differences we found in the ungrammatical conditions were due to some phenomenon other than what has typically been described as agreement attraction. The next experiment provides further evidence to argue that these differences do represent agreement attraction.

2.8.3 Experiment 2

We undertook Experiment 2 to further test whether what we have called relative clause attraction behaves similarly to complex subject attraction. To do so, we wanted to test whether the plural markedness generalization observed in complex subject attraction is also present in RC attraction. Kimball & Aissen's observations suggest this to be the case. To test this possibility in an online context, we conducted a more complicated version of Experiment 1 in which we added a further cross to the existing 2 × 2 design: all sentences in Experiment 1 had singular subjects, but in Experiment 2 we manipulated subject number as a third factor. If only singular subjects are liable to permit attraction, as is true in complex subjects, then we should see only a main effect of grammaticality when the subject is plural.

2.8.3.1 Materials and Methods

Participants
Participants were 60 native speakers of English from the University of Maryland community with no history of language disorders.

Materials
Experimental materials consisted of the same 48 sentence sets as in Experiment 1, but this time arranged in a 2 × 2 × 2 design with attractor number (singular/plural), subject number (singular/plural), and grammaticality (grammatical/ungrammatical) as factors. Table 2-4 gives a sample set of materials.
The composition of fillers and the experiment-wide proportion of ungrammatical sentences were the same as in Experiment 1.

Sample set of experimental items for Experiment 2

  RC subject number   Grammaticality   Attractor number   Sentence
  SG                  Grammatical      Sg                 The musician who the reviewer praises so highly will probably win a Grammy.
  SG                  Grammatical      Pl                 The musicians who the reviewer praises so highly will ...
  SG                  Ungrammatical    Sg                 The musician who the reviewer praise so highly will ...
  SG                  Ungrammatical    Pl                 The musicians who the reviewer praise so highly will ...
  PL                  Grammatical      Pl                 The musicians who the reviewers praise so highly will probably win a Grammy.
  PL                  Grammatical      Sg                 The musician who the reviewers praise so highly will ...
  PL                  Ungrammatical    Pl                 The musicians who the reviewers praises so highly will ...
  PL                  Ungrammatical    Sg                 The musician who the reviewers praises so highly will ...

Table 2-4 Sample plural subject materials for Experiment 2

Procedure and Analysis
The procedure was the same self-paced reading procedure described in Experiment 1, and the analysis followed similar steps. Only items for which the comprehension question was answered correctly were included in the analysis. Reading times that exceeded a threshold of 2.5 S.D. by region and condition were excluded. Due to experimenter error, the distribution of participants across the 8 lists was unbalanced across the first 56 participants. Four additional participants were recruited to balance the design at n = 56, and these are the results that we discuss. However, the pattern of results did not differ from the analysis in which all 60 participants were included.

The same analysis procedures were followed as in Experiment 1. In this experiment, the comparisons of interest were all within a given level of the subject number factor; we were interested in whether the 4 plural subject conditions would show the same pattern relative to each other as the 4 singular subject conditions. In order to examine this question we split the design into two 2 ×
2 repeated-measures ANOVAs, one for each level of subject number (singular/plural), with attractor number and grammaticality as factors. For completeness, we also computed a 2 × 2 × 2 repeated-measures ANOVA with attractor number, subject number, and grammaticality as factors. The results are also presented in Appendix B.

2.8.3.2 Results

Comprehension Question Accuracy
Mean comprehension question accuracy for experimental items across participants and items was 93.7%, and did not differ across experimental conditions. However, visual inspection of the means suggested that the plural-attractor/plural-subject conditions were answered less accurately than the other conditions. A post-hoc comparison revealed a reliable effect of the presence of two plural nouns (plural attractor and subject), compared to one or none (mean of zero or one plurals = 94.5%; mean of two plurals = 91.2%; p < 0.01).

Self-paced reading
The results of Experiment 2 are plotted in Figure 2-7 (singular subject conditions) and Figure 2-8 (plural subject conditions). At the verb in region 6 there was a clear and consistent effect of RC subject number (plural mean = 365 ms, singular mean = 348 ms, 95% CI = 7.5 ms, p < 0.005), similar to the effect of subject number observed in Wagers, Lau, & Phillips (2008)'s simple subject-verb agreement study (reported in Figure 2-3).

Figure 2-7 Experiment 2: RC Attraction Reading Time Results, Singular Subject

Region 7, the region following the critical verb, showed a main effect of grammaticality (mean ungrammatical = 403 ms; mean grammatical = 355 ms; 95% CI = 13.1 ms, p < 0.005), and a marginal interaction of grammaticality and attractor number. Splitting the design by relative clause subject number revealed a pattern of attraction similar to Experiment 1, but only for singular subjects.
For singular subjects, as in Experiment 1, the plural attractor conditions showed a smaller grammaticality effect (ungrammatical mean = 386 ms, grammatical mean = 356 ms, 95% CI = 18.6 ms, p < 0.05) than the singular attractor conditions (ungrammatical mean = 415 ms, grammatical mean = 348 ms, 95% CI = 21.3 ms, p < 0.005). This interaction was marginally significant in region 7 (p = .09), and significant in region 8 (p < .05).

Figure 2-8 Experiment 2: RC Attraction Reading Time Results, Plural Subject

By comparison, attractor number had no impact upon the grammaticality effect for plural RC subjects (plural attractors: ungrammatical mean = 401 ms, grammatical mean = 358 ms, 95% CI = 17.8 ms, p < 0.001; singular attractors: ungrammatical mean = 410 ms, grammatical mean = 355 ms, 95% CI = 23 ms, p < 0.0005).

As in Experiment 1, we further tested for the existence of attraction effects in the grammatical conditions by conducting pairwise comparisons between singular and plural attractor conditions in grammatical sentences, for both singular and plural subject sets, in the critical verb spillover region (R7). We found no significant effects of attractor in either the grammatical singular-subject conditions (R7: plural mean = 356 ms, singular mean = 348 ms, 95% CI = 15.4 ms, p > .1; R8: plural mean = 361 ms, singular mean = 345 ms, 95% CI = 17.1 ms, p > .1) or the grammatical plural-subject conditions (R7: plural mean = 358 ms, singular mean = 355 ms, 95% CI = 14.6 ms, p > .1; R8: plural mean = 344 ms, singular mean = 353 ms, 95% CI = 15.2 ms, p > .1). This pattern of results shows that attraction is asymmetric: it leads to illusions of grammaticality, corresponding to the reductions in difficulty observed for ungrammatical conditions, but not to illusions of ungrammaticality, which would correspond to an increase in difficulty for grammatical conditions.

We examined these results for effects of morphological complexity as well.
In regions 2 and 3, the relative clause head and relative pronoun, the omnibus ANOVA showed a main effect of attractor number as in Experiment 1, due to slower reading times for the plural head conditions (R2: plural mean = 353 ms, singular mean = 341 ms, 95% CI = 6.5 ms, p < 0.01; R3: plural mean = 339 ms, singular mean = 331 ms, 95% CI = 4.8 ms, p < 0.05). However, the 2 × 2 ANOVAs, split by subject number, reveal that singular subject conditions show this effect more strongly in region 2, and plural subject conditions in region 3. Because the differences were small and subject number was not manipulated until later in the sentence, this variation is presumably random. However, it is consistent with the timing of the effects reported in Pearlmutter, Garnsey, & Bock (1999)'s Experiment 1, in which Sg [ Pl ] subjects showed a slowdown in the region immediately following the plural, whereas Pl [ Pl ] subjects showed this slowdown two regions downstream. Region 5, the relative clause subject region, showed main effects of both attractor number and subject number (RC subject number: plural mean = 337 ms, singular mean = 329 ms, 95% CI = 6.5 ms, p < 0.05; attractor number: plural mean = 339 ms, singular mean = 327 ms, 95% CI = 6.4 ms, p < 0.05). However, the effect of subject number in this region appears to be carried by an exceptional value for the grammatical plural attractor/plural subject condition. Since this difference preceded the grammaticality manipulation, it is likely spurious.

2.8.3.3 Discussion

Experiment 2 replicated the basic attraction effect discovered in Experiment 1: the presence of a plural attractor in the RC-head position reduced the disruption due to ungrammatical subject-verb mismatch when the subject was singular. Furthermore, Experiment 2 confirmed the plural markedness generalization that has repeatedly been observed in studies on attraction in complex subject DPs: the attractor manipulation had no effect when the subject was plural.
This finding supports the idea that the attraction effect shown here has a similar basis, despite the different ordering of attractor and subject.

Taken together, the results of Experiments 1 and 2 argue against a percolation account of agreement attraction in comprehension. In both experiments, we found attractor effects in singular subject, ungrammatical conditions, but we found no "mirror" effect in grammatical conditions, for either singular or plural subject conditions. In other words, the presence of an attractor noun mismatching the verb in number had no effect on reading times as long as the subject and verb did match in number. The presence of a plural attractor was able to create illusions of grammaticality, but not illusions of ungrammaticality. This finding is inconsistent with all models of erroneous feature percolation.

2.8.4 Experiment 3

Experiments 1 and 2 provide evidence that relative clause agreement is subject to attractor effects, and that the explanation cannot be a faulty encoding of number on the subject. The basis for these conclusions is patterns of reading-time difficulty experienced in the verb and post-verbal regions. An interesting question is how readers would report perceiving these violations. Convergence between online and offline measures would provide stronger evidence for the selective fallibility of agreement attraction.

A simple, and usually reliable, means of obtaining acceptability judgments from a large population of informants is a pencil-and-paper rating questionnaire, in which participants rate the acceptability of a sentence on a 5- or 7-point scale. In pilot work we found that this method was insensitive to agreement attraction effects, and generated considerable variability across participants. Informal analysis and debriefings suggested that some participants in this task could be remarkably insensitive to any kind of agreement violation. Others seemed hyper-sensitive (and reported engaging in the task as if it were a proof-reading exercise).
We therefore turned to speeded grammaticality judgments. In this task, participants read a sentence in the Rapid Serial Visual Presentation modality (RSVP; Potter, 1988) and must make a choice, "Acceptable" or "Not acceptable", under a deadline.

2.8.4.1 Materials and Methods

Participants
Participants were 16 native speakers of English from the University of Maryland community with no history of language disorders. They received credit in an introductory linguistics course for their participation.

Materials
Forty of the 48 sentence sets were taken from Experiment 2, and so crossed the factors grammaticality, attractor number (RC head: plural/singular) and subject number (plural/singular). They were distributed across 8 lists by a Latin Square. Each participant therefore saw five items per condition. This experiment was run concurrently with a related experiment on complex subjects (Experiment 4, reported below in 2.9.3.1), which incorporated 24 sentence sets (half of the conditions in which were ungrammatical). Fifty-six further filler sentences were included, 28 of which contained diverse types of violations, including: sequence-of-tense mismatches ("If the careful scientist had tested his data one more time, he finds that his results were wrong all along"), auxiliary selection violations ("Every new intern that the political campaign group hired will doing lots of work"), subcategorization violations in filler-gap dependencies ("The orphan to whom the millionaire inherited his fortune ..."), gender mismatching reflexives ("The businessman who made a record number of sales this year treated herself to a drink."), and event structure violations ("The goofy clown amused the children in 30 minutes."). Overall, each participant saw 60 nominally well-formed sentences and 60 nominally ill-formed sentences.

Procedure
Sentences were presented on a desktop PC using the Linger software (Doug Rohde, MIT) in an RSVP paradigm (Potter, 1988).
Each trial began automatically, and sentences were presented at a rate of 300 ms/word. Immediately after the last word, which was marked with a period, participants were prompted to respond whether or not the sentence was an acceptable sentence of English. The "f" key was used for "yes" and the "j" key for "no." Participants had 2 seconds to respond. If 2 seconds elapsed with no response, participants were informed they had waited too long. The next trial began 1 second after the participant's response or the time-out. Participants were instructed to pay close attention and respond as quickly as possible. Order of item presentation was randomized for each participant. Six practice items were presented before the beginning of the experiment.

Analysis
Data were analyzed by fitting a logistic mixed-effects model (Agresti, 2002) to the fixed experimental factors of grammaticality, attractor number and relative clause subject number, and the random factors of subject and item. Logistic model coefficients reflect the contribution of the predictor variables to the probability of responding "yes" in the judgment task. This probability is expressed as a logit, or log-odds, where logit(p) = log[p / (1 - p)]. Models were fit with R (R Development Core Team, Vienna) and the lme4 package (ver. 0.99875-9; D. Bates, 2007), using restricted log-likelihood maximization. As in Experiment 2, three model fits were performed: a "full model" that included all main effects and interactions of the complete 2 × 2 × 2 design, and two models split by subject number, one 2 × 2 each for singular and plural subjects. Fixed effects were initially nested within the random factors, but were found to do no better at explaining the variance than the non-nested models; therefore only non-nested models are reported. 95% confidence intervals were calculated over the coefficients, as described in Experiment 1.

Out of the 640 experimental responses collected overall, there were only 11 timeouts.
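To make the log-odds scale used in this analysis concrete, the following is a minimal sketch of the logit transform and its inverse, and of how an additive coefficient on the logit scale changes a response probability. It is illustrative only and is not the model-fitting code (the models themselves were fit in R with lme4); the function names, the baseline probability, and the coefficient are hypothetical values chosen for the example.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical values, not estimates from Experiment 3:
baseline_p = 0.90    # assumed probability of a "yes" response at baseline
coefficient = -2.0   # assumed fixed-effect coefficient on the logit scale

# A coefficient is added on the log-odds scale, then mapped back.
shifted_p = inv_logit(logit(baseline_p) + coefficient)
print(f"baseline P(yes) = {baseline_p:.2f}, shifted P(yes) = {shifted_p:.2f}")
```

Because coefficients are additive in log-odds rather than in probability, the same coefficient produces a different change in percentage points depending on the baseline, which is worth keeping in mind when comparing coefficients with the raw proportions of "yes" responses.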
2.8.4.2 Results

Figure 2-9 reports the average proportion of "yes" responses for singular subject conditions, and Figure 2-10 the same values for the plural subject conditions. Before we report the analysis, note that inspection of the figures reveals two patterns: first, the attractor has an effect only in singular subject conditions, consistent with plural markedness; second, the attractor has a large asymmetrical effect in singular subject conditions, inconsistent with feature percolation.

Figure 2-9 Experiment 3: Relative clause attraction, Singular subjects
Speeded grammaticality, proportion "yes" responses. Error bars represent the standard error of the mean proportion across participants.

Full model. We observed a main effect of grammaticality: participants were less likely to respond "yes" when subject and verb failed to agree (fixed effect logit coefficient β: -2.65 ± 0.78; p < 0.001). There was a marginal effect of subject number, such that participants were slightly less likely to respond "yes" when the subject was plural (β: -0.68 ± 0.77; p < 0.10). Two interactions were significant: the two-way grammaticality × attractor number interaction, such that participants were more likely to respond "yes" to ungrammatical sentences when the relative clause head was plural (β: 2.33 ± 1.0; p < 0.001); and the three-way grammaticality × attractor × subject interaction, such that participants were less likely to say "yes" to ungrammatical sentences when both the relative clause head and subject were plural (β: -2.18 ± 1.5; p < 0.005).

Figure 2-10 Experiment 3: Relative clause attraction, Plural subjects
Speeded grammaticality, proportion "yes" responses. Error bars represent the standard error of the mean proportion across participants.

Singular subject model. In the 2 × 2 model that holds subject number constant as singular, there was a significant main effect of grammaticality (β: -2.50 ± 0.76; p < 0.001) and a significant interaction of grammaticality with attractor number (β: 2.21 ±
1.0; p < 0.001). Participants are less likely to say "yes" for ungrammatical sentences, but that likelihood increases when the relative clause head is plural. There is a marginal effect of attractor number (β: -0.67 ± 0.75; p < 0.10), such that participants are slightly less likely to say "yes" for grammatical sentences when the relative clause head is plural.

Plural subject model. In the 2 × 2 model that holds subject number constant as plural, there is only a significant main effect of grammaticality (β: -2.14 ± 0.78; p < 0.001), and no reliable effect of attractor number or of the interaction.

2.8.4.3 Experiment 3: Discussion

The results of the offline judgment task converge with those found in the online task. When the subject of the relative clause is singular, participants are good at detecting subject-verb misagreement, except when a plural attractor is present. The presence of the attractor in such sentences leads to a strong illusion of acceptability: participants go from rejecting sentences at an overall rate of 72%, when the relative clause head is singular, to rejecting only 36% of them, when there is a plural attractor. When the subject of the relative clause is plural, there is only an effect of grammaticality. The number of the relative clause head does not affect judgment behavior when the subject is plural, consistent with the plural markedness generalization.

The attractor does not have a symmetrical effect on judgments of grammatical sentences. There is some deviation from the online results, reflected in a marginally increased tendency to reject grammatical sentences when a plural attractor is present: from a baseline rejection rate of 18% for singular attractors to 30% for plural attractors. In the online task, participants did not slow down when they read grammatical sentences containing a plural attractor. In the 2 ×
2 model, however, this effect was only marginal, so we do not draw strong conclusions in the absence of a replication or an experiment with greater power (but see Chapter 3). On the face of it, this may seem to support the feature percolation model, except that the effect observed is far from equivalent in size to the effect in ungrammatical sentences. The erroneous feature percolation hypothesis predicts that the effects should be symmetrical. Whatever effect the attractor has, it is simply not increasing the proportion of subjects that are erroneously encoded as plural.

2.9 Testing Percolation I: The Grammatical-Ungrammatical Asymmetry in Comprehension

2.9.1 Wagers, Lau, & Phillips (2008) and On-line Comprehension

Despite the apparent similarities in the attraction phenomenon between complex subjects and relative clauses, a reasonable response to the conclusions of Experiments 1-4 would be that the mechanism of attraction in RC agreement is clearly not feature percolation, but that this does not bear on conclusions about complex subjects. This objection is most potent for a construal of feature percolation in which features may only percolate upwards. There is no upwards-only path from relative clause head to relative clause subject (unless the features may percolate from the head's copy or coindexed category in the relative clause object position). However, for Eberhard, Cutting & Bock (2005), any source of nominal number should be able to transmit its "SAP" to the structural network, irrespective of dominance relations, and impact agreement processes. Therefore, the lack of a grammatical contrast in relative clause attraction challenges Eberhard, Cutting & Bock's (2005) model. [21]

As for complex subject attraction in comprehension, Wagers, Lau, & Phillips (2008; WLP) revisited Pearlmutter, Garnsey & Bock's (1999) experiment.
WLP reasoned that if what looked like a grammaticality effect in the previous comprehension experiments was really a confound of morphological complexity, then separating the local plural from the verb should decrease this "impostor" effect. In a self-paced reading study, they had participants read sentences adapted from PGB (Pearlmutter, Garnsey & Bock, 1999), with a sentence-level adverb interposed between the local plural and the verb, as below:

(34) The path to the monument(s) unsurprisingly was/were littered with bottles.

The idea is that the adverb could buffer any of the processing difficulty that might spill over (or be delayed) from the attractor. Their results are shown in Figure 2-11.

[21] One might counter that the clause boundary between the relative clause head and the relative clause subject somehow obstructs the flow of "SAP." However, recall that both Bock & Cutting (1992) and Solomon & Pearlmutter (2004) found that an attractor inside a relative clause could lead to errors in the main clause verb (albeit at lower rates than a PP-contained attractor). Therefore clause boundaries cannot completely block attraction.

Figure 2-11 Wagers, Lau, & Phillips (2008): Complex Subject Attraction. Reading times from Wagers, Lau, & Phillips (2008) Experiment 4. Sample sentence with numerals after each word to indicate region: The 1 path 2 to 3 the 4 monument(s) 5 unsurprisingly 6 was/were 7 littered 8 with 9 bottles 10. Adapted from their Figure 5.

The results show that there is a cost to reading the plural attractor, reflected as a main effect of number both in the attractor region itself and on the adverb (Regions 5-6). In the immediate post-verbal region (Region 8), there is an attraction effect for ungrammatical conditions, but not one for grammatical ones. However, in Region 7, there is still a slow-down observed for grammatical conditions. It is important to note that in that region there is no difference between ungrammatical conditions.
Nonetheless, because Region 7 corresponds to the verb, it raises the possibility that the attractor could incur difficulty in grammatical sentences, consistent with the feature percolation account. The temporal disjunction between when a slow-down is observed for grammatical conditions and when a speed-up is observed for ungrammatical conditions is much stronger in Wagers, Lau, & Phillips's data than in Pearlmutter, Garnsey, & Bock (1999). The post-verbal effect of the attractor on ungrammatical conditions is long-lasting, whereas the only slowdown observed in grammatical conditions does not continue past the verb.

Wagers, Lau, & Phillips's (2008) adverb manipulation did not, on its own, fully resolve the issue of whether agreement processing contributes to the slowdown observed in grammatical conditions. The reading time effects engendered by plurals may last longer than one region, however. RC attraction Experiment 1 also showed plural effects that extended beyond one region (as did Pearlmutter, Garnsey, & Bock's Experiment 3). Based on the raw data from Wagers, Lau, & Phillips, we conducted a further analysis that more precisely localizes the source of the slowdown in grammatical conditions.

2.9.2 Controlling for RT correlations among adjacent regions: a mixed-effects models analysis

The distribution of reading times in a given region is not independent of its neighbors. There is a strong correlation between reading times within a window of 1-2 regions (unpublished observation from a large reading time corpus). We therefore analyzed the RT data from Wagers, Lau, & Phillips's (2008) Experiment 4 by regressing out from the verb region data the contribution of the RT in the two regions that precede the verb (see Vasishth, 2006). Data were analyzed using a linear mixed-effects model, as described above. Before the previous region RT regression was carried out, an unnested model, incorporating fixed and random factors, estimated the slowdown in grammatical conditions to be 18 ms ±
17 ms (p < 0.05). Recall that a grammatical slowdown due to the attractor is the crucial effect predicted by feature percolation. In reading time data sets, it is also what we suspect is contaminated by the morphological complexity of the plural attractor. After the previous region RT covariates were incorporated into the model, the slowdown shrank to 4 ms ± 16 ms (n.s.). Both previous region covariates were significant (R(n-1) β: 0.27 ± 0.07; R(n-2) β: 0.34 ± 0.10; ps < 0.005). These effects can be visualized by regressing out the two previous region RTs for every region and then plotting the residuals. See Figure 2-12.

Figure 2-12 Wagers, Lau, & Phillips (2008) Experiment 4, Residual RTs. Two previous region RTs regressed out. On the attractor itself there is a main effect of attractor number (p < 0.01) but no effect on the adverb or verb. On the verb there is only a main effect of grammaticality (p < 0.01). The main effect of grammaticality persists into the following region (p < 0.005), and is marginal two regions downstream (p < 0.10). On the immediate post-verbal region, there is a strong interaction between grammaticality and attractor number such that plural attractors considerably speed up processing in ungrammatical sentences (p < 0.05). Most importantly, there is no effect of attractor number on the grammatical conditions in any of the verbal or post-verbal regions.

In this analysis we attempted to assess the contribution that each new word made, independently of the preceding RT baseline. One region beyond the verb we found strong evidence that ungrammatical strings were read considerably faster in sentences containing an attractor. However, we found no evidence at or beyond the verb that grammatical strings were read more slowly in attractor sentences. If agreement checking can only commence once the information on the verb is available, the only effect an attractor has on agreement is in the ungrammatical conditions.
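The logic of regressing out the two preceding regions can be sketched in a few lines. The sketch below is a deliberate simplification and not the analysis reported here: it uses ordinary least squares rather than a linear mixed-effects model, it has no random effects for participants or items, and the function names and toy reading times are invented for illustration. Only the two slope values used to build the toy data (0.27 and 0.34) echo the covariate estimates reported above.

```python
def solve_linear_system(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def residualize(rt_n, rt_prev1, rt_prev2):
    """Regress region-n RTs on the RTs of regions n-1 and n-2 (plus an
    intercept) and return (coefficients, residuals)."""
    rows = [[1.0, p1, p2] for p1, p2 in zip(rt_prev1, rt_prev2)]
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, rt_n)) for i in range(3)]
    coefs = solve_linear_system(xtx, xty)
    residuals = [y - sum(c * x for c, x in zip(coefs, r))
                 for r, y in zip(rows, rt_n)]
    return coefs, residuals


# Toy data (hypothetical, in ms): the verb-region RT is built entirely from
# the two preceding regions, so nothing should be left over in the residuals.
rt_prev2 = [290.0, 400.0, 360.0, 330.0, 410.0, 305.0]
rt_prev1 = [300.0, 350.0, 420.0, 380.0, 310.0, 390.0]
rt_verb = [100.0 + 0.27 * p1 + 0.34 * p2 for p1, p2 in zip(rt_prev1, rt_prev2)]

coefs, residuals = residualize(rt_verb, rt_prev1, rt_prev2)
print([round(c, 3) for c in coefs])
print([round(r, 6) for r in residuals])
```

In a real data set the residuals would then be compared across conditions; any condition difference that survives cannot be attributed to carry-over from the two preceding regions.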
We also found an independent contribution of morphological complexity introduced by the attractor. These results strongly contradict the symmetry predictions of the feature percolation model of agreement attraction. They are consistent with our interpretation of previous data: the apparent slow-downs in grammatical attractor conditions observed in Pearlmutter, Garnsey, & Bock (1999) and Wagers, Lau, & Phillips (2008) are not due to the impact of the attractor on agreement checking in those conditions; they reflect the contribution of reading a plural in the attractor position, independently of its relationship to the verb.

2.9.3 Experiment 4: Speeded grammaticality tests of complex subject attraction

As with RC attraction, we can ask what comprehenders report perceiving when they process grammatical and ungrammatical complex subject sentences that contain agreement attractors.

2.9.3.1 Experiment 4a: Singular complex subjects

When we conducted Experiment 3, we also incorporated Wagers, Lau, & Phillips's (2008) complex subject stimuli in a speeded grammaticality task. These stimuli are the canonical complex subject attraction sentences, without adverbs. The design was a 2 × 2 cross of grammaticality and attractor number. An example set of materials is given below:

(35) (a) GRAMMATICAL/SG ATTRACTOR
The path to the monument is littered with bottles.
(b) GRAMMATICAL/PL ATTRACTOR
The path to the monuments is littered with bottles.
(c) UNGRAMMATICAL/SG ATTRACTOR
The path to the monument are littered with bottles.
(d) UNGRAMMATICAL/PL ATTRACTOR
The path to the monuments are littered with bottles.

Participants, Procedure and Analysis details are identical to those of Experiment 3 above. Figure 2-13 reports the proportion of "yes" judgments for the four experimental conditions. The results of the 2 × 2 logistic mixed-effects model confirm what an inspection of the figure suggests: there is a main effect of grammaticality (β: -4.0 ± 1.0, p < 0.001) and an interaction of grammaticality with attractor number (β: 1.7 ±
1.3, p < 0.01). Participants are more likely to accept ungrammatical sentences when an attractor is present. Acceptance rates more than double (Sg [Sg] ungrammatical: 25%, Sg [Pl] ungrammatical: 55%), which is comparable to the same effect observed in RC attraction (Experiment 3). However, participants are not more likely to reject grammatical sentences when there is an attractor (β: -0.26 ± 1.1, n.s.).

Figure 2-13 Experiment 4a: Complex subject attraction, singular subjects. Speeded grammaticality, proportion "yes" responses. Error bars are standard error of the mean proportion across participants.

2.9.3.2 Experiment 4b: Plural complex subjects

We conducted a version of Experiment 4a using plural complex subjects, to test the plural markedness effect for complex subjects in this task. The Procedure and Analysis were identical to Experiment 4a. A sample set of materials is given below:

(36) (a) GRAMMATICAL/PL ATTRACTOR
The paths to the monuments are littered with bottles.
(b) GRAMMATICAL/SG ATTRACTOR
The paths to the monument are littered with bottles.
(c) UNGRAMMATICAL/PL ATTRACTOR
The paths to the monuments is littered with bottles.
(d) UNGRAMMATICAL/SG ATTRACTOR
The paths to the monument is littered with bottles.

In this experiment, there were 24 participants, all members of the University of Maryland community. They were awarded credit in an introductory linguistics course for their participation.

Figure 2-14 reports the proportion of "yes" judgments for the four experimental conditions. The results of the 2 × 2 logistic mixed-effects model confirm that there is only a main effect of grammaticality (β: -3.0 ± 0.9, p < 0.001). Consistent with the plural markedness generalization, the attractor has no impact on judging behavior for plurally headed complex subjects.

Figure 2-14 Experiment 4b: Complex subject attraction, plural subjects. Speeded grammaticality, proportion "yes" responses. Error bars represent the standard error of the mean proportion across participants.
2.10 Conclusions

Agreement attraction has plausibly been argued to result from the faulty encoding of a complex syntactic object: the subject. What is faulty about this encoding is that it incorrectly binds the number feature of a nearby noun phrase as its own. The feature percolation model explains this "mis-binding" as a property of the encoding architecture: features can be mistakenly passed from the attractor noun to the subject projection by means of the dominance paths between categories in the structure. We have argued against erroneous feature percolation based on skepticism of its mechanistic foundations (I) and on a series of comprehension experiments that show agreement attraction is selectively fallible (II):

(I) The candidate mechanisms for passing features by means of structural links imply either an encoding system that is frequently insensitive to the core grammatical notion of headedness (erroneous rule application construal), or a mechanism of encoding that must consume resources to make errors (passive spread of activation construal in a structural network).

(II) The comprehension facts are inconsistent with feature percolation. Because feature percolation determines the grammatical number of the subject projection, it predicts that comprehenders should perceive some ungrammatical strings as grammatical and symmetrically perceive a similar proportion of grammatical strings as ungrammatical. In reading times and in speeded grammaticality judgments, we find clear evidence that comprehenders perceive some ungrammatical strings as well-formed. In our own data, there is scant evidence that grammatical strings are perceived as anything but grammatical.

The comprehension results rule out not only erroneous feature percolation models but, more generally, any model in which the subject encoding and only the subject encoding is to blame for the errors.
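The symmetry argument can be made concrete with a small calculation. The sketch below is our own illustrative formalization, not a model proposed in the percolation literature: it assumes that on some proportion p of plural-attractor trials the singular subject is misencoded as plural, and that on those trials a sentence is judged as its opposite-grammaticality counterpart would be. The function name is hypothetical; the rates are the Experiment 3 singular-subject rejection rates reported in section 2.8.4.3, converted to acceptance rates.

```python
def percolation_prediction(accept_gram, accept_ungram, accept_ungram_attractor):
    """Estimate the misencoding rate p from the ungrammatical conditions and
    return (p, predicted acceptance of grammatical sentences with an attractor).

    Assumed mixture model (illustrative only):
      P(yes | ungram, attractor) = (1 - p) * accept_ungram + p * accept_gram
      P(yes | gram,   attractor) = (1 - p) * accept_gram   + p * accept_ungram
    """
    spread = accept_gram - accept_ungram
    p = (accept_ungram_attractor - accept_ungram) / spread
    predicted_accept_gram_attractor = accept_gram - p * spread
    return p, predicted_accept_gram_attractor


# Acceptance = 1 - rejection, using the rejection rates reported for Experiment 3.
accept_gram = 1 - 0.18             # grammatical, singular attractor
accept_ungram = 1 - 0.72           # ungrammatical, singular attractor
accept_ungram_attractor = 1 - 0.36 # ungrammatical, plural attractor
observed_accept_gram_attractor = 1 - 0.30  # grammatical, plural attractor

p, predicted = percolation_prediction(accept_gram, accept_ungram,
                                      accept_ungram_attractor)
print(f"implied misencoding rate: {p:.2f}")
print(f"predicted acceptance of grammatical sentences: {predicted:.2f}")
print(f"observed acceptance of grammatical sentences:  "
      f"{observed_accept_gram_attractor:.2f}")
```

Under these assumptions, the 36-point rise in acceptance of ungrammatical sentences should be mirrored by a 36-point drop in acceptance of grammatical sentences. The observed drop is 12 points, and only marginal, which is the asymmetry the conclusion relies on.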
What is required to explain the results is sensitivity to the match between the information carried by the verb and the information contained in the subject projection. When the verb and the head of the subject match, the attractor does not intrude into agreement checking. When the verb and the head of the subject fail to match, the attractor can exert an influence. In the terms in which we have framed the problem, it is a "navigation" question: how the comprehender uses the input to control reference to the preceding syntactic context.

In Chapter 3 we will defend an account of agreement attraction in comprehension based on cue-based retrieval in a content-addressable memory. We will argue that, though the logical space for comparison has been undersampled, comprehension and production attraction phenomena stem from the same kind of memory and control architecture. Finally, in an attempt to unify a number of different empirical domains, we compare agreement attraction errors to the other instances of grammatical infidelity that have been documented, as well as instances that are resistant to grammatical infidelity.

3 The trouble with subjects
Binding and accessing features in complex syntactic objects, Part I

3.1 Introduction

In Chapter 2 we introduced the phenomenon of agreement attraction, in which the number feature of the verb matches a non-subject projection. We argued that agreement attraction could not result from a faulty encoding of the subject, in which the subject projection binds the wrong number feature. Our strongest evidence against faulty encoding came from a set of comprehension experiments, reported in Chapter 2 and in Wagers, Lau, & Phillips (2008), in which non-subject projections only impact processing when the verb fails to match the subject head in number features. This asymmetry is unexpected if the subject projection characteristically mis-binds the number feature of the attractor as a part of the encoding process.
In this chapter, we will offer an explanation for how this asymmetry could arise. We posit that subject-verb agreement is licensed in part by a retrospective search operation in a content-addressable memory (McElree, 2006). We substantiate this claim by presenting a simple, formal model of agreement licensing that makes the correct predictions with respect to both complex subjects and relative clause attraction. We show that the model extends to a novel variant of relative clause attraction, confirmed by a speeded grammaticality experiment. Agreement attraction is thus taken to exemplify a processing architecture in which partial matches can arise due to retrieval interference.

The concept of interference, and particularly interference stemming from cue-based memory retrieval, has recently achieved prominence in the sentence processing literature (Gordon, Hendrick, & Johnson, 2001, 2004; Van Dyke & Lewis, 2003; Van Dyke & McElree, 2006). There is naturally interest in identifying the commonalities of sentence processing with other cognitive tasks. The existence of a broadly successful syntactic structure building model (Lewis & Vasishth, 2005), based on John Anderson's general ACT-R model of cognition (Anderson et al., 2004), has strengthened this pursuit. However, many of the conclusions in this literature have seemed to be at odds with assumptions, implicit perhaps, in grammatically-based theories of syntactic recognition and processing. There are two closely related, but distinct, architectural claims that bear evaluating. Before returning to agreement attraction, let us lay out the relevant architectural terrain, though the reader may proceed directly to section 3.3 for the agreement attraction model.

The first claim we will consider is that search occurs via cue-directed retrieval in a content-addressable memory. This mechanism is fast and affords direct access to target information.
Yet it poses a challenge to enforcing structure-sensitive restrictions on grammatical dependencies because it is controlled by item properties and not relational properties. The second claim is that the effective processing workspace is highly restricted and incapable of maintaining much information. As a consequence most discontinuous information has to be retrieved. This property also presents challenges to structure-sensitivity, as it reduces the effective syntactic context that can be consulted to make parsing decisions.

3.2 Searching structure with unstructured searches

3.2.1 Content-addressable search

3.2.1.1 Fundamental properties

The innovative claim in recent retrieval-based accounts of syntactic processing (McElree, Foraker, & Dyer, 2003; Lewis & Vasishth, 2005) is that access to the syntactic context is not granted by means of a structurally-ordered search process, but rather by means of a direct access, content-addressable search, which renders all relevant constituents simultaneously available. The diagram in Figure 3-1 illustrates an abstract phrase structure, in which a head, X, must be licensed by a c-commanding category in the structure with the feature +α. This scenario is a pervasive one in natural language: agreement between subject and verb fits this schema, as does wh-movement (or movement dependencies generally).

Figure 3-1 Accessible and inaccessible licensers in an abstract tree. Category X must be licensed by the feature α. YP is an accessible licenser, as it c-commands X. ZP is inaccessible.

Imagine that we are building structure left-to-right and have reached X, which must now be licensed by a +α category that c-commands it. In order to find a c-commanding +α category it is possible to specify a search process that traverses the dominance paths, starting from X. Here is how a very simple algorithm would work in a binary branching structure.

(37) (a) Begin at the node immediately dominating X: XP.
(b) Move to the node immediately dominating XP.
(c) Inspect its daughters. Do any bear feature +α?
(d) Yes: X is licensed by that daughter. Terminate search.
No: Return to step (b).

The order in which this process visits nodes is illustrated in Figure 3-1 with circled numerals. The search terminates on the circled node at which it finds YP [+α]. This process can straightforwardly be modified to incorporate appropriate grammatical restrictions. Additional termination conditions can be added: for example, the search could terminate at cyclic nodes, in the search for [+wh], or terminate at clause nodes to find the antecedent of a reflexive anaphor. Conditions on the inspection process could restrict the search to particular kinds of daughters, e.g., specifiers, not heads. The key point is that the order in which the process generates and then selects candidate licensing categories is governed by phrase structural relations.

Our simple procedure belongs to a larger class of well-specified algorithms known in computer science as node visitation algorithms (cf. Knuth, 1965/1997). Node visitation algorithms provide a means for accurately and exhaustively searching a tree structure, by traversing its paths and visiting each node once. Crucially for syntactic processing, a node visitation algorithm permits relational restrictions like c-command to be incorporated into the search. This outcome seems desirable, as grammatical generalizations are almost always stated over the relational properties that hold between two categories, in addition to their inherent properties.

How would the same licensing procedure illustrated in Figure 3-1 occur in a content-addressable memory? The key idea in such a memory is that search begins with a (sub)set of information that the desired representation should contain. We will refer to this as the retrieval structure or the set of retrieval cues.
This information is then compared simultaneously with all representations in memory to generate a set of candidate matches, based on the similarity of the encoding in memory with the retrieval structure. In theories of recognition and recall, this property is exhibited by "global matching models," and it can be implemented in diverse architectures (Clark & Gronlund, 1996). Models can differ on how the strength of the match maps onto selection, depending on the task. For purposes of illustration, we will assume that the (normalized) match strength maps onto the probability that a particular representation will be selected (see section 3.3.2.1 below).

In our hypothetical example, it is straightforward to see how to generate candidates based on the inherent features necessary for licensing node X: the retrieval structure would contain the [+α] feature. But this cue alone would return both the c-commanding and non-c-commanding categories, YP and ZP. It is unavoidable that grammatically-illicit constituents should be returned in some cases, if the search initially identifies constituents' encodings on the basis of their contents. Natural language representations repeat many of the same motifs and are potentially recursive; consequently they are likely to be self-similar. This outcome could have one of two effects on language comprehension. The presence of multiple candidates could render licensing or dependency formation more difficult if a subsequent process then had to select the grammatically-licit antecedent. Alternatively it could make the process more error-prone, if there were no subsequent process to select the grammatically-licit antecedent. That is, the system may unknowingly construct illicit representations. Van Dyke (2007) has argued this is exactly what happens when verbs attach to complex subjects (ones that contain other subject projections), a claim that we discuss in great detail in section 3.4.2.
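The contrast between the ordered search in (37) and a content-addressable search can be sketched as follows. This is a minimal illustration under stated assumptions, not an implementation from the retrieval literature: the tree is a hypothetical stand-in for Figure 3-1 (ZP is placed inside YP so that it does not c-command X), the licensing feature is written "+F" in the code, the match strength is simply the number of matching cues, and all class and function names are invented for the example.

```python
class Node:
    """A phrase-structure node with inherent features and a parent pointer."""

    def __init__(self, label, features=()):
        self.label = label
        self.features = set(features)
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def ordered_search(xp, feature):
    """Steps (37a-d): climb the dominance path from XP, inspecting the
    daughters of each dominating node. Returns (licenser, nodes climbed)."""
    current, steps = xp, 0                 # step (a): begin at XP
    while current.parent is not None:
        current = current.parent           # step (b): move one node up
        steps += 1
        for daughter in current.children:  # step (c): inspect the daughters
            if feature in daughter.features:
                return daughter, steps     # step (d): licensed, terminate
    return None, steps


def cue_based_retrieval(memory, cues):
    """Compare the cues with every item at once and return normalized match
    strengths, read here as selection probabilities (an assumption)."""
    strengths = {item.label: len(cues & item.features) for item in memory}
    total = sum(strengths.values())
    if total == 0:
        return {}
    return {label: s / total for label, s in strengths.items() if s > 0}


# Hypothetical tree: [Root [YP(+F) ZP(+F)] [W U [XP X]]]
root = Node("Root")
yp = root.add(Node("YP", {"+F"}))
zp = yp.add(Node("ZP", {"+F"}))   # embedded in YP: does not c-command X
w = root.add(Node("W"))
u = w.add(Node("U"))
xp = w.add(Node("XP"))
x = xp.add(Node("X"))

licenser, steps = ordered_search(xp, "+F")
print(licenser.label, steps)       # the structurally licit licenser only

memory = [root, yp, zp, w, u, xp, x]
print(cue_based_retrieval(memory, {"+F"}))  # licit and illicit candidates
```

The ordered search returns only YP, at a cost that grows with the number of nodes climbed. The cue-based search returns YP and ZP with equal strength in a single step, which is the source of the structure-insensitivity discussed above.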
3.2.1.2 Evidence

McElree (2000) and McElree, Foraker & Dyer (2003) have argued that the generation of candidate constituents in memory search reflects direct access, consistent with a content-addressable search architecture. There are three crucial components they use to make their case: (I) a contrastive prediction between ordered and direct access searches, (II) a measure for evaluating this prediction, and (III) a linguistic construction to test the prediction.

The logic of these studies is as follows: if the search process follows a node-by-node algorithm, generating candidates in a structurally-guided fashion, then the time it takes to license a dependency should be an increasing function of the hierarchical distance between the two elements of a dependency. If, on the other hand, the search process involves direct access, generating candidates by a feature match, then the time it takes to begin to license a dependency should be constant.

The measure of processing time is crucial. McElree and colleagues argue that simply measuring reaction times in a reading task or grammaticality task is misleading, as participants may make speed-accuracy tradeoffs. For language processing, these tradeoffs can differ not only across sentences, but within sentences. For example, participants may engage in detailed, high-accuracy processing to establish the thematic roles of an event expressed in the main clause, but revert to relatively shallow, low-accuracy processing in a non-restrictive relative clause that contributes little to fixing reference. Similarly, depth of processing may vary with clausal embedding or sentence length. McElree advocates the use of a response-signal paradigm, the speed-accuracy tradeoff procedure (SAT), which measures task accuracy as a function of processing time (Wickelgren, 1977). SAT provides separate measures of two key properties: the strength of a representation and its accessibility to cognitive processes.
Accessibility refers to the speed with which on-going processes can access a representation, i.e. retrieval speed. During an SAT experiment, participants are trained to discriminate two classes of stimuli; in the case of sentence processing experiments, the discrimination is typically between acceptable and unacceptable sentences. Participants read sentences word-by-word in RSVP presentation. A response cue follows the sentence, to which participants are trained to respond with a yes/no acceptability judgment within 100-300 ms. The onset of the response cue is varied across the experiment, so that data are collected at time points marking predetermined intervals, from the conclusion of the sentence to several seconds thereafter. [22] For each participant it is possible to construct a response function that shows how sensitivity grows over time. This function yields three measures, illustrated by the schematic SAT functions in Figure 3-2.

[22] A minimum of six to eight time samples are required, so experiments are resource-intensive, requiring up to 180-320 trials per condition, and consequently several experimental sessions (e.g. McElree & Griffith, 1995). There is an alternative procedure, the multi-response SAT (MR-SAT) procedure, which provides a more efficient means of deriving SAT functions. During MR-SAT, participants are trained to respond repeatedly within a trial to a series of response cues, dynamically modulating their responses as their judgment of the sentence changes. MR-SAT reduces the number of trials necessary to 30-40 trials/condition (Brian McElree, p.c.).

Figure 3-2 Hypothetical SAT functions. The SAT time-course functions show that accuracy initially is at chance, steadily increases for a period, and then reaches an asymptote. Time-course functions are usually fit by the exponential approach to a limit, with three parameters: the intercept, rate of rise, and asymptote (Wickelgren, 1977 inter alia; cf. Ratcliff, 1978).
Panel A shows conditions differing by asymptote, while Panel B shows conditions differing by intercept and rate. Figure provided by Brian McElree.

The first measure is overall accuracy, reflected in the asymptote of the function, and taken as a measure of representational strength; two conditions that differ in asymptotic accuracy are illustrated in Figure 3-2, Panel A. The intercept and rate parameters jointly describe the dynamics of processing. SAT dynamics reflect either the underlying accrual of information if processing is continuous or the underlying distribution of finishing times if processing is discrete or quantal (Dosher, 1976; McElree & Dosher, 1989; Ratcliff, 1988). Panel B illustrates two conditions with disproportional dynamics, which reach the same proportion of the asymptote at different times. The intercept and rate parameters are the parameters of interest, since they reflect speed of access to a representation, independent of its strength or ultimate availability.

McElree, Foraker & Dyer (2003; henceforth MFD) applied the SAT procedure to sentences containing clefts. Clefting is a species of A′ movement, and as such creates an unbounded dependency between a clause-initial phrase and a gap site. MFD hypothesized that to comprehend these sentences it would be necessary to retrieve the clefted constituent at the gap site. To test whether increasing distance increased the retrieval time for the clefted constituent, as predicted by a system with ordered search, MFD systematically varied the distance between the clefted phrase and its gap. The gap site was in object position, either in the same immediate clause as the displaced constituent, or one or two clauses distant. As a signal for whether participants retrieved the constituent and interpreted it at the gap site, MFD manipulated the selectional requirements of the verb that hosts the gap.
In unacceptable sentences, binding the clefted phrase to the gap site would lead to an anomalous interpretation, so participants should reject the sentence. A full example set of materials is given in (38). The clefted constituent is underlined and the gap site marked by an underscore. The acceptable and unacceptable verbs for each sentence prefix are given separated by a slash: in this set, the displaced constituent is "the scandal," for which the acceptable verb is "relish" and the unacceptable verb is "panic."

(38) (a) SAME CLAUSE
It was the scandal that the celebrity relished / panicked ___.
(b) ONE CLAUSE INTERPOLATED
It was the scandal that the model believed the celebrity relished / panicked ___.
(c) TWO CLAUSES INTERPOLATED
It was the scandal that the model believed that the journalist reported that the celebrity relished / panicked ___.

The critical region is at the end of the sentence. By hypothesis, forming the A′ dependency is the last sentence processing event to affect discriminability of acceptable versus unacceptable sentences, and as such it determines the shape of the SAT function. If information about the clefted constituent is gained via a direct access method, then there should be no difference in the intercept/rate parameters of the SAT function. However, if it is obtained by an ordered search, governed by the dominance relations in the superficial structure of the sentence, then the intercept/rate parameters should vary with hierarchical distance.

Figure 3-3 shows the average SAT function for the eight participants in this experiment. SAT data are analyzed for each participant separately, by fitting hierarchically nested models to accuracy as expressed by d′. A null model fits the data with one intercept, one rate, and one asymptote parameter for all conditions; a full model fits them with separate parameters for each condition (thus, nine parameters overall).
What turned out to be the best fitting model had three asymptote parameters, one for each condition, but a single intercept and single rate parameter for all conditions. The dynamics of accruing information about the clefted constituents were identical across conditions, and thus did not vary with hierarchical distance. Ultimate accuracy did vary across conditions, as a decreasing function of hierarchical distance. There are two possible kinds of explanation for this effect: (1) as sentence length increases, the probability of misanalysis increases; (2) while the accessibility of the representation in terms of access time may not change, its overall availability, or the availability of the information contained inside it, may. Availability is a joint function of the quality of the encoding and the contents of the retrieval structure, and decrements are to be expected as more constituents compete for cues or offer spurious matches. This observation is consistent with the results from the literature on list memory, in which both recognition and recall decline as more items are added to the list (cf. Dennis & Humphreys, 2001). It is also consistent with other psycholinguistic research, which indicates difficulty in successfully processing A′ dependencies as the number of intervening clauses increases (e.g., Phillips, Kazanina, & Abada, 2005). [23]

[23] Phillips, Kazanina, & Abada (2005) found that longer wh-dependencies led to later P600s on the verb that subcategorized the wh-phrase. The P600 is an evoked response potential that is sensitive to dependency completion (Kaan et al., 2000). Note that such a timing delay is itself neutral about whether the extra length leads to longer search times or simply decreased accuracy.

Figure 3-3 McElree, Foraker, & Dyer (2003), Experiment 2, Average SAT Function. The best fit model for these data posits one intercept and one rate parameter for all conditions, but three separate asymptote parameters. Figure taken from McElree, Foraker, & Dyer (2003).
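The three-parameter description of an SAT function can be sketched as below. The functional form, d′(t) = λ(1 − exp(−β(t − δ))) for t > δ and 0 otherwise, is the exponential approach to a limit as it is commonly written in the SAT literature; the text above names the three parameters but does not spell out the equation, so the form is an assumption here. All parameter values are hypothetical and chosen only to mimic the best-fitting pattern (three asymptotes, one shared rate, one shared intercept); they are not MFD's estimates.

```python
import math


def sat_dprime(t, asymptote, rate, intercept):
    """Accuracy (d') at processing time t (in seconds) under an exponential
    approach to a limit. Accuracy is at chance (0) until the intercept."""
    if t <= intercept:
        return 0.0
    return asymptote * (1.0 - math.exp(-rate * (t - intercept)))


def n_parameters(n_asymptotes, n_rates, n_intercepts):
    """Free parameters in a model of the nested family fit to three conditions."""
    return n_asymptotes + n_rates + n_intercepts


# Hypothetical values: asymptote falls with distance, dynamics are shared.
asymptotes = {"same clause": 3.0, "one clause": 2.5, "two clauses": 2.0}
shared_rate, shared_intercept = 2.0, 0.4

for t in (0.3, 0.6, 1.0, 3.0):
    row = {
        condition: round(sat_dprime(t, lam, shared_rate, shared_intercept), 3)
        for condition, lam in asymptotes.items()
    }
    print(t, row)

print("null model:", n_parameters(1, 1, 1))
print("best-fitting pattern:", n_parameters(3, 1, 1))
print("full model:", n_parameters(3, 3, 3))
```

With a shared rate and intercept, every condition reaches the same proportion of its own asymptote at every time point, which is what identical dynamics across conditions amounts to in this parameterization.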
The key finding for the present discussion is that access speed for a displaced constituent does not vary with the number of clauses that intervene between the constituent and the site of interpretation. This finding is consistent with a content-addressable memory architecture that affords direct access to representations in memory. What is problematic, however, is that some widely held assumptions about the syntactic representation guarantee the same result. MFD assume that an ordered search must sift through a number of representations that is proportional to the hierarchical distance between where the constituent is pronounced and where it is interpreted. If there were no information about the constituent in the interposed clauses, this assumption would be valid. The bracketing in (39) illustrates a phrase structure consistent with this assumption. Syntactic elements that "share" information about the clefted constituent are in bold font.

(39) [DP the scandal_i [CP Op_i that ([TP ... [CP ... [TP ... [CP ...) [TP the celebrity relished t_i ]]]]

However, there is considerable evidence from syntactic research (Chomsky, 1973; Torrego, 1983, 1984; Georgopoulos, 1985; Chung, 1998; McCloskey, 2000, 2001; Bruening, 2004) that dependencies into embedded clauses are not mediated directly between the displaced constituent's surface position at the highest clause edge and the gap site in the embedded clause. Rather, an intermediate dependency element is created at the edge of each embedded clause. This property of displacement is referred to as successive cyclicity and conforms to the restriction that rule application applies to bounded syntactic domains. A cyclic representation of MFD's stimuli is given in (40), which shows that the distance between the gap and the first syntactic element that contains information about the displaced constituent is constant across the clausal interpolation manipulation.

(40) [DP the scandal_i [CP Op_i that ([TP ... [CP t_i [TP ... [CP t_i ) [TP the celebrity relished t_i ]]]]

It is an open question whether cyclic representations are constructed in real time and, if so, whether the information encoded at the intermediate positions is sufficient to judge the selectional fit of the wh-phrase with the verb. Therefore, although the clausal interpolation increases the temporal offset from when the displaced constituent was first encoded, it is possible that it does not increase the structural distance in the sentence representation.

McElree, Foraker, & Dyer (2003) also considered the dynamics of subject-verb processing. They studied sentences in which they varied the distance between the subject head and the verb by PP and RC modification. There they also found constant dynamics (except when subject and verb were adjacent, and except when two object relative clauses intervened). This manipulation is irrelevant to discriminating between a direct access and an ordered search mechanism, because it did not increase the hierarchical distance between the subject projection and the verb. Only serial distance and the complexity of the subject changed across conditions. These results at least do suggest that the order that governs search cannot be linear.

It is difficult to construct the right stimuli that truly modify hierarchical distance between dependent elements without the possibility of intermediate representations. Movement dependencies in general present this problem, since the tendency in syntactic explanation has been to suppose that apparently distant dependencies are really a succession of local ones (cf. Kayne, 1984, Kroch & Joshi, 1985, Pollard & Sag, 1994). We offer some possibilities for future work, however. One is to consider reflexive binding in syntactic alternations:

(41) (a) Scott gave a book about himself/herself to the library.
     (b) Scott gave the library a book about himself/herself.

The dependency of interest is between the reflexive anaphor and the subject of the sentence.
In (41a), this anaphor is contained within the closest argument to the verb, while in (41b) it is in the farthest argument. If we accept the analysis of ditransitive verbs as heading a binary branching projection (e.g., Barss & Lasnik, 1986, Larson, 1988, Pesetsky 1995, Harley, 2002), then the anaphor in (41a) is hierarchically closer to the subject than in (41b) [24], as the structures in (42) illustrate (adapted from Harley, 2002):

(42) (a) [vP Scott [v' give [PP [DP a book about himself ] [P' LOC [PP to the library ]]]]]
     (b) [vP Scott [v' give [PP [DP the library ] [P' HAVE [DP a book about himself ]]]]]

[24] The distance contrast does not change, regardless of whether there is a transformational relationship or not between the ditransitive's alternate projections.

Unfortunately these stimuli are not well suited to the SAT technique, since the region that putatively triggers retrieval, himself, is not in sentence-final position in both members of the pair. Furthermore, it seems unnatural to give the response cue before the sentence is finished, unless participants were trained to judge acceptability-so-far. One possibility would be to shift "a book about himself" in (41a) to the sentence-final position, without changing the phrase structure; this may be possible if the DP is sufficiently heavy:

(43) ? Scott gave to the library a self-aggrandizing book about himself.

Another possibility is to consider wh-island violations, as in (44). Such sentences are interesting because the occurrence of an intervening wh-phrase prevents the occurrence of an intermediate representation of the target wh-phrase (Chomsky, 1977). The [Spec,CP] position in the embedded clause is filled with another wh-phrase, as in (45), potentially forcing truly long-distance movement.

(44) * It was the scandal that the journalist reported how the celebrity relished ___
(45) [DP the scandal_i [CP Op_i that [TP the model wondered [CP how C [ the celebrity relished ]]]]]

Such sentences are also ill-formed but not awful.
It is therefore not unreasonable to ask participants to discriminate the sentences on the basis of the verb's selection properties, as in MFD:

(46) It was the scandal that the model wondered how the celebrity relished/panicked ___

These stimuli do not lend themselves to more than one clausal interposition. Though one wh-island violation can lead to only mild unacceptability, filling multiple [Spec,CP]s seems much worse:

(47) * It was the scandal that the model wondered why the journalist reported how the celebrity relished/panicked.

Most recently, Martin & McElree (2008) have examined VP Ellipsis, as in (48).

(48) The editor admired the author's writing, but the critics did not ___.

In order to interpret this sentence, it is necessary to identify the antecedent VP for the ellipsis site (antecedent underlined, ellipsis site marked with an underscore). Martin & McElree (2008) tested whether or not increasing the hierarchical distance between a candidate VP and the ellipsis site led to different access dynamics. In their experiment, antecedent VPs could be either one or two clauses away, as in (49a) and (49b), respectively. An acceptability contrast was created by modifying the properties of the subject of the elided VP.

(49) (a) Near antecedent
         The editor admired the author's writing, but the critic/*binding did not.
     (b) Distant antecedent
         The editor admired the author's writing, but everyone at the publishing house was shocked to hear that the critic/*binding did not.

Using the SAT technique, Martin & McElree (2008) found that ultimate accuracy was lower for distant antecedent conditions, but that neither the rate nor the intercept parameter of the SAT function varied. They concluded that the accessibility of the antecedent VP representation did not vary as a function of distance. Unlike the MFD case, this phenomenon is less clearly liable to concerns about intermediate representations (but cf. Johnson, 2001), and so constitutes stronger evidence for a direct access mechanism.
However, there are relatively few syntactic constraints that govern where the antecedent VP can be found (Johnson, 2001). Antecedent VPs can be found in non-commanding positions (as in Martin & McElree's stimuli), in c-commanding positions (as in Antecedent Contained Deletion), and extra-sententially. It therefore seems plausible that locating a VP antecedent need not be ordered by dominance pathways in the sentence. On the one hand, a content-addressable mechanism may be well suited for this kind of search. On the other hand, it does not constitute a strong test of whether any ordered searches are used in language comprehension, since VP Ellipsis may not be the right kind of phenomenon to invoke such a search.

In summary, McElree and colleagues have offered the only direct evidence that the search for constituents occurs in parallel, based on the observation of constant response dynamics in discriminating acceptable and unacceptable sentences. The argument in McElree, Foraker, & Dyer (2003) is unfortunately undercut by fairly standard assumptions about the syntactic representation of wh-dependencies. It nonetheless deserves to be taken seriously, since it is a unique and theoretically precise argument. The challenge that remains is practical (albeit not trivial), which is to sample a spectrum of linguistic stimuli that would give the ordered search hypothesis a fair chance of showing its influence in retrieval dynamics. Martin & McElree's (2008) ellipsis test is one such case. We offered some tentative further suggestions.

3.2.1.3 Bringing structure back

In a content-addressable system there are ways to compensate for the structurally insensitive search process. Firstly, the matching process could be used merely as a fast and efficient first-pass process, which is then followed by a more controlled, ordered search process.
The logic of such an architecture might be as follows: instead of ascending the phrase structure tree node-by-node for a constituent (that may or may not exist), perform a fast parallel search to identify whether or not there is a match to begin with. Then, descend along the dominance paths to return to the foot of the tree (on the assumption that constituent encodings point to their mothers and daughters). It should not be overlooked that in many cases of simply structured sentences, the content features are likely sufficient to pinpoint the right candidate, especially if it is unique. For example, consider a subject-seeking head processed inside a simple matrix clause, with no embedded clauses. Probing memory for a constituent bearing nominative case may be more efficient than doing any tree climbing at all.

Secondly, some hierarchical order may be implicitly encoded in analogue properties of the constituent encodings, such as an activation value. This strategy has been pursued in models of simple serial order (e.g., Grossberg, 1978; Page & Norris, 1998). Figure 3-4 illustrates the basic idea. Successive items in a list are encoded with decreasing activation levels. The relative activation levels of any two items retrieved map onto their (relative) order in the list.

Figure 3-4 Implicit encoding of serial order. The cue retrieves two item representations (solid lines with arrowheads). The relative order of the two items can be inferred from their relative activation levels (dotted lines).

Indeed a similar mechanism can be seen at play in the ACT-R model of sentence processing (Lewis & Vasishth, 2005). There each constituent representation has a baseline activation value. This activation value is meant to reflect the likelihood that a constituent representation will be used in future processing, consistent with ACT-R's emphasis on being an "adaptive" architecture (see also Shiffrin & Steyvers, 1997).
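The gradient idea in Figure 3-4 can be sketched in a few lines of Python, assuming a simple linear decrease in activation across list positions. The function names, the starting activation, and the step size are hypothetical choices made for illustration only; models of serial order in the literature use their own activation functions.

```python
def encode_list(items, start=1.0, step=0.1):
    """Give each successive item a lower activation (assumed linear gradient)."""
    return {item: start - step * position for position, item in enumerate(items)}


def infer_order(activations, first, second):
    """Return the two items in their inferred list order.

    The item with the higher activation is taken to be the earlier one.
    """
    if activations[first] > activations[second]:
        return (first, second)
    return (second, first)


if __name__ == "__main__":
    activations = encode_list(["A", "B", "C", "D"])
    # Two items returned by a retrieval cue, in arbitrary order:
    print(infer_order(activations, "C", "A"))
```

Because order is read off a single scalar, anything else that changes an item's activation also changes the order that would be inferred from it.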
The more frequently a representation is used by a process, the more easily retrievable it will be. Now consider a complex subject, like "the old man from New England that my father introduced me to." In this phrase, the head noun "man" is modified by three separate constituents, an AP, a PP, and a CP. Consequently it undergoes relatively more processing and re-encoding than the other heads in the string. ACT-R will therefore incrementally assign it a higher activation value. Suppose that the "man" and "father" constituents are later identified as candidates in a search (for example, triggered by the reflexive anaphor "himself"). If both candidates are equally good matches for the retrieval cues, and both candidates have the same baseline activation (prior to modification), "man" will be more easily retrievable. A heuristic decision metric would choose candidates with higher activation values on the assumption that they are more likely to be hierarchically prominent.

This metric must truly be heuristic, though. Any independent modulation of the activation values will disrupt the hierarchical order. It is not difficult to imagine a potential counterexample to the modification example above; for example, think of a complex subject with a modestly modified head, but a heavily modified embedded subject: "the man that my old cousin from New England who knows many famous people introduced me to."

There is also a more general problem with an analogue encoding of hierarchical order. We can think of the syntactic structure of a sentence as essentially a list of lists. The categories along the main "trunk" of a binary branching phrase structure tree comprise the master list. The relative order of any two categories within the list is sufficient to determine which category dominates the other (or which c-commands which, if it is a list of heads). But each category points to another list, that of the sub-tree it dominates, which may itself contain further lists.
The difficulty, therefore, with implicitly encoding relative order in an analogue fashion is that to interpret the outcome, it is necessary to know whether the candidate representations in a comparison set come from the same "list" or not. In section 3.3.4, we argue for a plausible encoding scheme under which all the constituents contained in a given domain are marked as belonging to that domain, which gives slightly more traction to a more general analogue solution. However, we have not yet been able to find a satisfactory general scheme that would map an activation-like quantity onto global hierarchical order, although some devices may be available in restricted circumstances.

Finally, a third way to compensate for a structure-insensitive search procedure is to enforce retrieval of the correct constituent by predicting the necessity of retrieving it. For some dependencies, the first encountered member of the dependency is distinctive enough to signal the presence of the dependency. Wh-dependencies in English have this character: encountering a wh-phrase in the clause periphery signals that a wh-dependency exists, and that the phrase must be paired with a gap in the subsequent structure. The parser could preserve the wh-phrase by entering it onto a stack, which was one of the earliest suggestions for completing wh-dependencies (Wanner & Maratsos, 1978). It may be too much to concede such a mechanism, however. As we discuss in section 3.2.2, the cue-directed retrieval of immediately retained information is closely related to a second architectural constraint, which is an extremely restricted focus of attention (Cowan 1995, 2001; Nairne, 2002; McElree, 2006; Jonides et al., 2008). A stack, conceived as a distinguished memory space for the maintenance of an encoding, is at odds with this viewpoint.
If space limitations truly are at issue, then a reasonable compromise would be to maintain a highly stripped-down encoding of the wh-phrase, containing hardly any of its content but perhaps some signature property, like a unique I.D., throughout the full course of processing. When the full information in the phrase needs to be retrieved, as it will be at a gap site, the "I.D." of the wh-phrase will be a highly effective retrieval cue for the structurally licit constituent because it is highly distinctive. Suffice it to say, elements that can participate in a dependency do not necessarily signal that they will need to be retrieved in the future. Pronominal anaphora is a good example: the occurrence of a name in a structure does not guarantee that a coreferential pronoun will occur later in the structure [25]. We expand upon this general idea in much greater detail in Chapter 4, and defend the generalization that constituents that predict their own retrieval are more accurately retrieved.

In summary, a content-addressable memory allows representations to be accessed based on the inherent properties of the representation. On the one hand, this permits the rapid retrieval of potentially relevant representations, without the need to consider wholly irrelevant ones (McElree, 2006). On the other hand, relational properties, like c-command, are difficult to recover in the same fashion. Indeed it is a property of the architecture that relational properties are backgrounded to inherent ones, since access is determined by the match between retrieval cues and encodings. But relational properties like c-command are not inherent features of a node, since it must be determinable whether they hold for any arbitrary pair of constituents in a structure.
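To make concrete why c-command is a property of pairs rather than of single nodes, the following Python sketch computes it from mother pointers in a small tree. The Node class, the labels, and the simplified structure for "the key to the cabinets were" are hypothetical, and the definition used (the mother of A dominates B, and neither A nor B dominates the other) assumes that every mother node branches.

```python
class Node:
    """A constituent that points to its daughters and (once attached) its mother."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def dominates(a, b):
    """True if a properly dominates b, i.e. a is an ancestor of b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a, b):
    """True if a c-commands b, assuming every mother node branches."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    return a.parent is not None and dominates(a.parent, b)


# Simplified, hypothetical structure: [TP [DP the key [PP to [DP the cabinets]]] [T' were]]
cabinets = Node("DP the cabinets")
pp = Node("PP", [Node("P to"), cabinets])
subject = Node("DP", [Node("NP the key"), pp])
verb = Node("T' were")
tp = Node("TP", [subject, verb])

if __name__ == "__main__":
    print("subject c-commands verb:", c_commands(subject, verb))
    print("embedded DP c-commands verb:", c_commands(cabinets, verb))
```

Nothing stored on the embedded DP itself distinguishes it from the subject DP in this respect; the answer depends on which second node is supplied and on the path of mothers between them.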
It is possible to imagine an encoding system in which every constituent contains a list of the constituents it c-commands; however, this would require updating a large proportion of the constituents any time nodes are added to the structure. We sketched out some general alternatives for recovering hierarchical information, and we return to this problem in greater detail below and in Chapter 4.

However, we want to consider one further joint architectural claim about syntactic memory before moving on to the agreement model. The claim is that there is a very restricted amount of information that can be simultaneously maintained in the focus of attention, and, consequently, that most information that is discontiguous, from a time or process perspective, must be retrieved (McElree, 2006). It has been argued that much of syntactic processing is skilled memory retrieval (Lewis & Vasishth, 2005). These claims have a broader corollary in an emerging new perspective in memory theory, that there really is no distinguished working memory for recent events, just long-term episodic memory (Nairne, 2002, Jonides et al., 2008).

[25] But see Omaki et al. (2007) for arguments that even names have weak predictability for pronominal coreference.

3.2.2 The restricted focus of attention

Interpreting an expression depends on coordinating information that enters the processing system widely separated in time. This problem is exemplified not only by the unbounded dependencies found in wh-questions, clefts, comparatives, and relative clauses (e.g., "The tune that John was casually whistling ...") but also more locally, as in subject-verb agreement ("The songs from the popular movie were playing") or verb-argument selection ("The orphan inherited a sizable portfolio of securities from the millionaire"). However, it has long been argued that the ability to actively attend to and concurrently process information is very limited (e.g., Broadbent, 1958; Cowan, 1995, 2001, 2005; McElree, 2001, 2006).
As a key architectural feature in many contemporary models of memory (Jonides et al., 2008), a focus of attention instantiates this limitation by partitioning representational space into a sharply bounded nucleus of information immediately accessible to cognitive processes and the representations which must be accessed via a retrieval process. For language processing it is relevant to know how much linguistic information can be processed concurrently, since this determines what kind of information will have to be retrieved, and how often.

Several independent lines of evidence from a variety of cognitive and perceptual tasks support sharp capacity limitations on the focal state (Cowan, 2000, 2005). McElree (1998, 2001, 2006) has argued that measures of the speed of accessing information provide the most direct and unequivocal evidence for a unique representational state associated with focal attention. These measures show a sharply dichotomous pattern: processing dynamics are exceptionally fast for responses based on information actively maintained in awareness, approximately 30-50% faster than responses based on information displaced from focal attention (Dosher, 1981; McElree, 1996, 1998, 2001, 2006; McElree & Dosher, 1989, 1993; McElree et al., 2003; Oberauer, 2002, 2006; Öztekin & McElree, 2007; Verhaeghen et al., 2004; Wickelgren et al., 1980). Responses are argued to be fast because no retrieval operation is needed to access the contents of these representations; hence, that information is immediately available for ongoing operations (McElree, 2006). The observed discontinuity in processing speed provides a way of empirically measuring the span of focal attention. The available evidence on the processing of sequentially presented information using this estimate suggests a very limited span: in most circumstances, only the representation associated with the last event remains in focal attention.
However, there are two crucial qualifications: more than one nominal item can be in focus if the task encourages the encoding of multiple items into a chunk (McElree, 1998), and less recent items may be present if subjects are induced to actively (re-)process those events (McElree, 2001, 2006). These findings set the stage for meaningfully relating focal attention and language comprehension. An adequate system for linguistic understanding would seem to require the ability to entertain at least two-place relations, and thus for focal attention to host at least that many "items" bound as a single representation, which Jonides et al. (2008) refer to as a "functional complex". We have no current estimate of what counts as a chunk for focal attention, and particularly what counts as a linguistic chunk.

McElree, Foraker & Dyer (2003) provided some evidence that a complex subject can quickly occupy all of focal attention. In a SAT experiment identical to the cleft-processing experiment discussed in section 3.2.1.2, MFD successively interposed more material between subject head and verb:

(50) (a) The book ripped/laughed
     (b) The book that the editor admired ripped/laughed
     (c) The book from the prestigious press that the editor admired ripped/laughed
     (d) The book that the editor who quit the journal admired ripped/laughed

SAT dynamics, measured by the intercept and rate parameters of the accuracy function, showed a discontinuous split between very fast dynamics in condition (a), where subject and verb heads are adjacent, and slower dynamics in conditions (b)-(d), where the subject is complex. These results, they argued, implicate a retrieval operation in conditions (b)-(d) that is not present in condition (a) (or is required less often). It is inferred that in conditions (b)-(d), it is no longer possible to maintain the encoding of the subject projection, or of the subject head itself, concurrently with the incoming information.
These results do not place an especially strong bound on what counts as a chunk in focal attention, since the slow conditions all included an extra clause. There are already many reasons to suspect that clause-boundedness plays a strong role in segmenting linguistic encodings (see section 3.3.4 below for further discussion). It would be useful to know whether task dynamics are fast or slow in simpler cases, like when the subject is PP-modified, or there is an adverb:

(51) (a) The book from Susan ripped/laughed
     (b) The book easily ripped/laughed

In the case of PP modification above, we introduced one more closed-class head and one more lexical head, which bears a close (restrictive) relation to the subject head. Two lexical heads related by a functional projection seems like a reasonable candidate for a minimal "functional complex," in Jonides et al.'s terms. Although the present results are only suggestive, it is once again useful to know that effects from the memory literature, with its focus on laboratory tasks such as verbal list learning, show up in more natural, linguistic contexts. And it gives us confidence that drawing architectural parallels is justified.

There is good evidence to think that there is a small focal state and that representations in this state lead to the fastest processing dynamics. It is actually hard to avoid this conclusion, if the focal state is thought of as the information that is undergoing immediate processing. However, what lies outside that state? A traditional division of the memory space, outside of immediate processing, is into long-term memory and working memory. Working memory is conceptualized as consisting of those memory traces which occupy a distinguished memory state, a "workspace", either because they are in a special store (e.g., Baddeley, 1986; Shallice & Vallar, 1990, inter alia) or because they have intrinsic, persistent activation (Anderson, 1983; Cowan 1995, 2001, i.a.).
Working memory has a certain capacity, depending on the size of the buffer or how much activation can be shared (e.g., Usher & Cohen, 1999). Representations in working memory are thought to be more accessible than memory traces stored in the long-term store, though accessibility can fluctuate. Accessibility in the short term depends either on the trace's location in the buffer and a process's ability to cycle through different locations, or on the inherent activation of the memory trace, which fluctuates over time, generally decays, but can be refreshed through rehearsal.

There is an emerging perspective, however, that eschews the distinction between working memory and long-term memory, in favor of unifying the memory architecture (see Jonides et al., 2008, for a review). It has long been known that at least long-term forgetting cannot be attributed to a decay-like process (McGeoch, 1932; see Anderson & Neely, 1996) and that remembering depends both on the properties of the stored representation and the present information used to recall it (see, e.g., Tulving, 1983). As Nairne (2002) puts it, representations do not have "'strength', or special mnemonic properties, outside of particular retrieval environments." The characteristics of working and long-term memory would seem to be theoretically very distinct, therefore. However, mounting evidence indicates that the recognition and recollection of immediately retained information depend on the retrieval environment just as much as long-term information does. Performance depends on the distinctiveness of the information with respect to the retrieval cues (see Nairne, 2002, for review). Moreover, speed of access seems to be constant across studied items, even if they exceed working memory capacity, except for those in immediate awareness (see McElree, 2006, for review).

3.2.3 Implications

There is not a necessary connection between a limited focus of attention and a content-addressable information retrieval mechanism.
However, if the capacity of focal attention for linguistic material is very small and representations outside of the focus of attention are contacted by means which are in principle hierarchically insensitive, then we are forced to rethink many issues in the real-time comprehension of language. There is one way of thinking of syntactic processing as occurring in a "workspace" that makes available a significant amount of structured information over which globally sensitive parsing decisions can be made. However, if all that remains is the focal/non-focal distinction, then the amount of structural information immediately available on which to base parsing decisions is considerably restricted. It seems, furthermore, that recovering syntactic context occurs in an architecture that is inherently not well suited to structure-sensitive processing. The content-addressable nature of retrieval means that shunting information into the focal state is potentially only weakly constrained by hierarchical order. In a real sense, the recent interest in retrieval-based, interference-prone, content-addressable memory requires a radical re-evaluation of the interface between syntax and memory.

There is an extreme view that all specialized structures for language processing, such as the use of stacks or queues, should be eliminated if possible (e.g., McElree, Foraker, & Dyer, 2003). This view seems premature to us, and in Chapter 4 we will defend the use of some specialized information maintenance as necessary, which we believe adapts language processing to the memory architecture. It is important to remember that language comprehension is by most measures effortless and accurate. Nonetheless we accept that the information immediately available to the parser's decision processes is restricted. This is not necessarily a bad thing: making decisions on the basis of a limited amount of information, whose format is known in advance, can make parsing more efficient.
Berwick & Weinberg's account of subjacency (1984) made exactly this point: movement is subjacent because it allows the parser to decide whether or not to posit traces by looking at a fixed-size context representation. The alternative is to conduct an unbounded search over the tree, which is an operation of increased computational complexity.

In the following section, we work out how agreement processing could occur in a content-addressable architecture in which even nearby information has to be retrieved. We discover that a linguistically well-motivated retrieval structure provides considerable flexibility in accomplishing structure-sensitive processing. Nonetheless, agreement attraction, as an error, is a natural consequence of the architecture. More generally, the architecture seems to pose a challenge for processing complex subjects in a structure-sensitive fashion, as we'll see in section 3.4 [26].

[26] This fact, or at least its analogue in derivational syntax, has arguably already been appreciated in the syntactic literature (Uriagereka, 1998).

3.3 Agreement attraction in comprehension

3.3.1 Intuition

First we will spell out the intuition behind the retrieval interference model of agreement attraction before formalizing it. The key pattern to capture is the asymmetry between grammatical and ungrammatical sentences. An agreement attractor leads to an illusion of grammaticality when the sentence is ungrammatical. It does not lead to an illusion of ungrammaticality when the sentence is grammatical. Data from the Experiment 4 speeded grammaticality task are repeated in Table 3-1 to emphasize this point. It should be kept in mind that what distinguishes the grammatical and ungrammatical sentences is whether the number on the verb matches the number on the subject head noun.

Table 3-1 Speeded grammaticality judgments of complex subject attraction (from Experiment 4)

                                             Response (% "Yes": acceptable)
Sentence type                                "cabinet"     "cabinets"
Grammatical     "The key to the ___ was"     93%           91%
Ungrammatical   "The key to the ___ were"    25%           55%

Let us suppose that an (automatic) event in processing the verb is checking for agreement with the subject. This process can be conceptualized as a search of the preceding syntactic context to verify that the right kind of agreement controller exists, in the right configuration, to license the verb's morphology. Therefore the search process will be guided by two kinds of information: the properties of the agreement controller that can license the agreement relationship, and the compatible feature value for the number on the verb. The former, which I shall refer to as "licensing features", could include properties like grammatical function (i.e., SUBJ), structural position (i.e., [Spec,TP] [27]), or Case (i.e., NOM). If the search succeeds in identifying a constituent that bears the licensing features and a compatible number feature, then the agreement is considered "checked" and the process succeeds [28].

Here is the key intuition about agreement attraction in comprehension: it is the outcome of a checking process in which no constituent matches both features, but a constituent that matches the agreement feature is identified by the search process. There is a similar account of negative polarity item licensing that works on this partial match principle as well (Vasishth, Drenhaus, Saddy & Lewis, 2005), which we will discuss in section 3.4.4. In addition, a concurrently developed model of agreement attraction in production relies on the same principle (Badecker & Lewis, 2007), which we discuss in section 3.3.6.

[27] Encoding the structural "coordinates" of an item would be somewhat more non-standard than grammatical role or case, though it is not unimaginable on a local level to mark which categories are the head, which the complements, and which the specifier (as HPSG does using attribute-value matrices; cf. Pollard & Sag, 1994).

[28] We are, just for the moment, intentionally vague about what counts as "identifying" a constituent: it could be the equivalent of returning a recognition signal, i.e. "I remember encountering this constituent", or it could be the actual recall of the constituent, bringing it back into the active processing workspace. Clearly for dependencies that require interpretation, like wh-dependencies, it must be recall. For more formal dependencies, like agreement, it is less obvious whether recall is necessary.

This simple formulation goes far in capturing the grammatical/ungrammatical asymmetry. For ungrammatical sentences with a plural attractor, the subject phrase bears the licensing feature, but not the agreement feature, whereas the attractor at least bears the matching agreement feature. There is a partial match with both the grammatical controller and the attractor. For grammatical sentences with a plural attractor, the subject phrase itself bears both the matching agreement feature and the licensing feature, and the attractor bears neither the agreement feature nor the licensing feature. There is a full match with the grammatical controller, and none with the attractor. The challenge is to come up with a search process that has these properties, which can be consistently specified across grammatical and ungrammatical sentences.

We propose a model implemented in a content-addressable memory that seems to deliver the right outcome. Recapitulating section 3.2: a content-addressable memory is one in which information is retrieved by means of the content of the encoding. The key property of content-addressable memory is that it allows direct access to representations in memory. There is no need to search through irrelevant representations, only those that match, in some respect, the retrieval probe that initiates the search process (Clark & Gronlund, 1996; McElree, 2006). Because the search is driven by the similarity between the retrieval probe and what has previously been encoded in the memory, it allows partially similar representations to impact the process.
This property of the search, called similarity-based interference (Anderson & Neely, 1996), is what permits the attractor to intrude in the agreement-checking process in ungrammatical sentences.

To formalize our intuitions, we will first work out our assumptions in the framework of Shiffrin's Search of Associative Memory (SAM; Raaijmakers & Shiffrin, 1981; Gillund & Shiffrin, 1984). Nothing in particular depends on this choice, except that it allows straightforward modeling [29]. It is only one of several frameworks that capture the major empirical generalization of similarity-based interference: memory retrieval depends on the match between the retrieval probe and representations in memory. For other models that formalize this notion, see, among others, Eich (1982), Hintzman (1984, 1988), Murdock (1982, 1993), Nairne (1990), or Shiffrin & Steyvers (1997).

In the rest of section 3.3 we introduce the retrieval model of agreement attraction, and work through some of its consequences in a speeded-grammaticality experiment. Our goal is to better understand the concept of content-addressable memory in language processing, and some of the issues that it raises, by working through a specific example. In section 3.4 we then turn to a phenomenon that is closely related to our agreement attraction model: the attachment of complex subjects. Recent work by Julie Van Dyke (Van Dyke & Lewis, 2003; Van Dyke, 2007) has argued that complex subject attachment is difficult because it is prone to similarity-based interference. We examine her arguments and try to refine them in a self-paced reading experiment (Experiment 6) in section 3.4.3.

[29] It is important to note that we are not engaging in simulation modeling, in the sense of generating a distribution of outcomes that indicate something about the robustness or dynamics of the retrieval process.
Rather we are interested in how the formal properties of a content-addressable memory interact with the formal properties of the syntax at a more abstract level.

3.3.2 Formalization

The memory model is specified as consisting of three components: (1) the content of stored representations, (2) the retrieval structure, and (3) the task or memory goal. We first give the general outline of these components, and then lay out the assumptions specific to syntactic structure.

3.3.2.1 General Properties

In SAM parlance, the encodings in memory are referred to as images. Images consist of feature sets packaged as a single unit. Images can include information about the item representation itself, the context in which it was encountered, and its associations to other images.

The retrieval structure consists of a set of cues, corresponding to item, context, and category properties. Each cue has a particular strength of association to the images in memory. The set of cues differentially activates the images in memory, according to the weighted product of the cue strengths to each image. Let us assume that the memory consists of a set of n images (I_1, ..., I_n), that the cue set consists of m cues (Q_1, ..., Q_m), and that the strength from the jth cue to the ith image is given by S(Q_j, I_i). The equation in (52) specifies that the activation of image i, A_i, is the product of the strengths from each cue in the cue set to that image. The strengths inside the product are raised to the weighting value associated with each cue, w_j. The weight allows the model to assign different saliencies to the cues in the retrieval structure. That is, some cues can count more than others. If the cue weights are constrained to sum to one, then retrieval effectively becomes a limited-capacity process: adding more cues lowers the expected activation of any given image based on a single matching cue [30].

(52) Activation equation

$$A_i = \prod_{j=1}^{m} S(Q_j, I_i)^{w_j}$$

[30] A useful intuition about the capacity limitation is that, in general, a few highly distinctive cues are better than many mediocre ones at retrieving a target encoding.

The non-linear combination of retrieval strengths endows the model with an important property: sensitivity to conjunctions. Figure 3-5 illustrates this property for a hypothetical retrieval scenario with five cues, in which the degree of match, normalized to unity, is shown as a function of the number of convergent cues. Three combination rules are shown: a sum of cue strengths rule, a cube-of-sums rule (cf. Hintzman, 1988), and a product of cue strengths rule (as in SAM). For the non-linear rules, representations that match all the cues are much more highly favored than partial matches.

Figure 3-5 Comparison of cue convergence rules. Normalized match score is shown as a function of how many of five cues converge on a given representation. Each matching cue has high cue-to-image strength (0.95) while non-matching cues have low strength (0.1). Match scores are normalized to a full match, in which five cues converge.

How the activations map onto retrieval depends on the goal of probing the memory. SAM countenances two kinds of tasks: recognition and recall. In recognition, the goal of probing the memory is to generate a signal indicating whether or not a match exists. In recall, the goal is to bring the image itself back into the active processing workspace.

For recognition, the activations are simply summed to generate a familiarity score, given by the equation in (53):

(53) Recognition familiarity score

$$F(Q_1, \ldots, Q_m) = \sum_{i=1}^{N} A_i$$

This familiarity score then feeds a decision process. One simple decision process is to set a threshold: if F exceeds this threshold, then the system deems the cues to correspond to an existing image.

For recall, the activations correspond to the probability with which the image will be sampled and recovered.
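The contrast among the three combination rules described for Figure 3-5 can be sketched numerically. This is a minimal illustration, not the procedure used to draw the figure: the function name `match_scores` is hypothetical, the product rule is applied without cue weights, and the cube-of-sums rule is taken to be the cubed sum of cue strengths. The strengths 0.95 and 0.1 are those given in the figure caption.

```python
import math

def match_scores(n_cues=5, hit=0.95, miss=0.1):
    """Normalized match score under three cue-combination rules.

    Returns {rule_name: [score for k = 0..n_cues converging cues]},
    where each list is normalized so that a full match scores 1.0.
    Assumption: unweighted rules (no w_j exponents).
    """
    rules = {
        "sum": lambda s: sum(s),
        "cube_of_sum": lambda s: sum(s) ** 3,
        "product": lambda s: math.prod(s),
    }
    scores = {}
    for name, rule in rules.items():
        full = rule([hit] * n_cues)  # score when all cues converge
        scores[name] = [
            rule([hit] * k + [miss] * (n_cues - k)) / full
            for k in range(n_cues + 1)
        ]
    return scores

if __name__ == "__main__":
    for name, values in match_scores().items():
        print(name, [round(v, 3) for v in values])
```

Under these assumptions a representation matching four of five cues keeps most of its score under the sum rule but very little under the product rule, which is the sensitivity to conjunctions that the text attributes to the non-linear rules.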
The probability of sampling image i, out of the N images in memory, is given by the equation in (54). This equation says that this probability is determined by normalizing the activation of a given image by the sum total of activations.

(54) Recall sampling probability

$$P(I_i \mid Q_1, \ldots, Q_m) = \frac{A_i}{\sum_{k=1}^{N} A_k}$$

If an image is sampled, the probability of recovering the information contained therein is given by the equation in (55):

(55) Recall recovery probability

$$P_{\mathrm{RECOVER}}(I_i \mid Q_1, \ldots, Q_m) = 1 - \exp\left(-\sum_{j=1}^{M} w_j \, S(Q_j, I_i)\right)$$

3.3.2.2 Language-specific assumptions

We posit that images correspond to syntactic constituents, and particularly maximal projections. This is an assumption that we share with Lewis & Vasishth (2005). It would not be impossible to package an arbitrary extent of a structure as a unit. But there are good reasons for assuming a unit of storage that is something like an XP. The most obvious is that we are considering within-sentence processing, so there needs to be a way of addressing smaller-than-sentence portions of the structure. With respect to agreement attraction, it needs to be the case that the number features of different DPs can be independently accessed; packaging them in separate images seems a natural way to do this.

This assumption is not at odds with how a tree representation would be encoded in a standard random-access memory. In such a memory, structured representations like trees are encoded by linking a set of discrete memory locations with pointers. For trees, each node in the tree corresponds to a discrete memory location with a certain number of fields. What makes it a tree representation is that certain distinguished fields point to the next node down (or the next node up) (Knuth, 1965/1997).

How a syntactic representation maps onto individual encodings is one (modest) way in which a candidate processing architecture could guarantee structure sensitivity.
By forcing the memory architecture to package its encodings in linguistically-relevant pieces, like maximal projections, retrieval-mediated reference to syntactic encodings is constrained to only return linguistically-relevant pieces. This conclusion is familiar: the earliest work in psycholinguistics quickly came to the conclusion that clauses were a salient perceptual and mnemonic unit (see Fodor, Bever, & Garrett, 1974; cf. Shiffrin, Murnane, Gronlund, & Roth, 1989, for more recent evidence from memory paradigms).

In order to encode an arbitrarily complex syntactic representation as a set of features in a unitary encoding, we need a way of specifying recursive feature values. A complex subject is an excellent example to show that point: it is a DP that contains a DP (Figure 3-6). For the sake of illustration, suppose that an image is a list of attribute-value pairs, where the attributes are adapted from some standard relations in phrase structure. We can imagine two ways of encoding a complex subject: a "single image" encoding, (56), in which feature values can be recursive feature structures; and a "multiple image" encoding, (57), which contains only non-recursive feature values. Instead of recursive feature structures, these images point to other images in the memory (indicated by the → symbol).

Figure 3-6 Phrase structure tree for "the man with the hat"

(56) SINGLE IMAGE ENCODING: one image, DP.1, with HEAD, COMP, and ADJUNCT attributes whose values are nested feature structures.

(57) MULTIPLE IMAGE ENCODING: five images (DP.1, NP.1, P.1, DP.2, NP.2), each with a HEAD attribute and, where applicable, a COMP or ADJUNCT attribute that points to another image (DP.1 COMP: →NP.1; ADJUNCT: →P.1; P.1 COMP: →DP.2; DP.2 COMP: →NP.2).

Linguistic theory does not arbitrate a decision between a single or multiple image encoding. We can certainly specify a feature language that is recursive (consider HPSG, for example; Pollard & Sag, 1994). Recursivity is a fact about language structure, and that is not at issue.
The trouble with recursively specified features in a memory that is content-addressable is that it seems to render opaque the content contained within deeper embeddings. For example, in (57), it is apparent that the head "hat" exists in the structure by inspecting just the HEAD values of the images. The cue { HEAD: hat } would activate NP.2 from the image set in (57) by means of its feature composition. In (56), however, "hat" is not visible as a head until the value of the ADJUNCT feature is unpacked. The cue { HEAD: hat } would not be an efficacious cue strictly by means of (56)'s feature structure; the association between { HEAD: hat } and DP.1 would have to be additionally encoded. If we want to guarantee direct, content-addressable access to the information contained in a structure, therefore, then the multiple encodings seem preferable.

There are clear tradeoffs. Breaking up the encoding into many images renders the pieces more visible to the search process, but it makes recovering relationships more difficult. For example, it raises the expected number of retrieval operations that are necessary to assemble information about a relationship. If we consider the encoding in (56), knowing that "hat" is contained within that encoding translates into knowing that "hat" is dominated by DP.1. In the encodings in (57), this fact must be deduced by sequentially retrieving the images (cf. McElree, Foraker, & Dyer, 2003, and McElree, 2006, which make a similar point regarding the retrieval of order information).

It is an empirical question exactly how much information a given image contains. For example, we might re-encode (57) in a way that does not split DPs and NPs, such that each image encodes something like an extended projection (Grimshaw, 1991). The crucial point is that we assume that multiple images exist in memory for a complex subject. As a consequence, for a complex subject there are two DPs that can be contacted by a retrieval operation.
The retrieval structure is assumed to consist of attribute-value pair cues. If an image contains an attribute-value pair, then that pair will be an effective cue. Although it is conceivable to vary the strength of the cue, based on how confidently that feature has been encoded, we will first assume that cue strength is essentially all-or-none.

For the memory goal we will consider two alternative conceptions of agreement checking: agreement checking as recognition and agreement checking as recall. In the case of checking-as-recognition, whether or not agreement is licensed is determined based on a familiarity score. In the case of checking-as-recall, a candidate match must be recovered into the processing workspace, and the feature values inspected to see whether they match. For simplicity, we assume that the probability of recovering the information in a sampled image is uniformly high; this does not impact the pattern of results.

3.3.2.3 Complex subjects

First we consider complex subject agreement. We assume that the fragment "the N to the N" corresponds to a memory with three images, the two DPs and the P. DP images contain information about a single functional head and a single lexical head (F-HEAD and L-HEAD). The contents of the memory are given for both Sg [ Sg ] and Sg [ Pl ] complex subjects. We adhere to a privative feature system, in which there simply is no NUM feature for singular nouns. DP.1 and DP.2 are distinguished on the basis of case values: Nominative for DP.1 and Oblique for DP.2.

(58) Sg [ Sg ]: Images for "the path to the monument" (head values not shown)
    DP.1    F-Head, L-Head, L-Comp: →P.1, Case: Nom
    PP.1    L-Head, L-Comp: →DP.2
    DP.2    F-Head, L-Head, Case: Obl

(59) Sg [ Pl ]: Images for "the path to the monuments" (head values not shown)
    DP.1    F-Head, L-Head, L-Comp: →P.1, Case: Nom
    PP.1    L-Head, L-Comp: →DP.2
    DP.2    F-Head, L-Head, Num: Pl, Case: Obl

We evaluate four scenarios, corresponding to an experiment that crosses attractor number with grammaticality. The grammatical continuation to the fragment "the path to the monument(s)" is a singular verb. We assume that a singular verb prompts retrieval with the following cue set: { CASE: Nom }; that is, it prompts with only a licensing feature. The ungrammatical continuation to the fragment is a plural verb. A plural verb prompts retrieval with the cue set: { CASE: Nom, NUM: Pl }; that is, both a licensing feature and the agreement feature.

Consistent with our assumptions, the strength of an attribute-value pair cue to images containing that attribute-value pair is set to near 1 (0.99), and to near 0 otherwise (0.01). We set these strengths just short of the extrema as a means of incorporating noise into the system and to avoid perfect performance. The weights, w, are constrained to sum to 1 and are split uniformly among the cues in the cue set.

Table 3-2 demonstrates how these cue sets map onto activations for DP.1 and DP.2, when the head of the complex subject is singular, and the sentence is grammatical. The only cue in the cue set is CASE:Nom, which points unambiguously to the subject projection, DP.1. Consequently, in both singular and plural attractor conditions, the activation of this image always dwarfs that of DP.2, which only receives noise activation (A(DP.1) = 0.99 > A(DP.2) = 0.01). Familiarity scores are identical across conditions, so both conditions should behave identically in the decision process linked to a recognition task. The probability of sampling these images in a recall process mirrors the activation values. Consequently, the correct projection would almost always be recalled.

[DP.1 The path to ...          [DP.2 the monument] is         [DP.2 the monuments] is
Q (Cue)       W(eight)         S: DP.1   S: DP.2              S: DP.1   S: DP.2
CASE:Nom      1                0.99      0.01                 0.99      0.01
                               DP.1      DP.2      F          DP.1      DP.2      F
A(ctivation)                   0.99      0.01      1          0.99      0.01      1
P_SAMPLE                       [0.99]    0.01                 [0.99]    0.01

Table 3-2 Retrieval structure and outcomes: singular-headed subjects, grammatical continuations. The upper cells specify the strength of association between the cues in the cue set and the stored images. The lower cells show the outcome of probing the memory with that cue set, in terms of the activations of the images, their sum (F, the familiarity score in a recognition task), and the normalized activation (the sampling probability in a recall task). The probability of sampling the correct projection, i.e. the subject DP.1, is shown in square brackets.

To see how the model performs for ungrammatical conditions, when the head is singular, we turn to Table 3-3. Here the verb supplies two cues: CASE:Nom and NUM:Pl. When there is a singular attractor, only CASE:Nom is an effective cue, and consequently the subject projection receives nearly all of the activation. As before, the probability of sampling the correct image considerably exceeds the probability of sampling the attractor image (91% > 9%). When there is a plural attractor, one cue in the cue set points to the subject image, DP.1, and one points to the attractor image, DP.2, but neither points to both. Consequently, each image receives equal activation, and each has an equal likelihood of being sampled. Notice that in both cases, overall activation is lowered (with respect to the previous scenario).

[DP.1 The path to ...          [DP.2 the monument] *are       [DP.2 the monuments] *are
Q (Cue)       W(eight)         S: DP.1   S: DP.2              S: DP.1   S: DP.2
CASE:Nom      0.5              0.99      0.01                 0.99      0.01
NUM:Pl        0.5              0.01      0.01                 0.01      0.99
                               DP.1      DP.2      F          DP.1      DP.2      F
Activation                     0.10      0.01      0.11       0.10      0.10      0.20
P_SAMPLE                       [0.91]    0.09                 [0.50]    0.50

Table 3-3 Retrieval structure and outcomes: singular-headed subjects, ungrammatical continuations. The upper cells specify the strength of association between the cues in the cue set and the stored images. The lower cells show the outcome of probing the memory with that cue set, in terms of the activations of the images, their sum (F, the familiarity score in a recognition task), and the normalized activation (the sampling probability in a recall task). The probability of sampling the correct projection, i.e. the subject DP.1, is shown in square brackets.

The limited capacity property of retrieval comes into play here: the presence of an additional cue, even if it is ineffective, lowers the weight of other cues, even if they are effective. As a consequence, for the singular attractor condition, despite the fact that the only effective cue points unambiguously to the correct image, the sampling probability of the correct projection is only 91% (compared to 99% in the previous scenario). However, since neither projection matches the verb in number, this shift in probabilities does not imply a shift in judgment behavior. Finally, the familiarity scores differentiate the conditions in this scenario as well, reflecting the fact that one cue is efficacious in the singular attractor condition, but two are efficacious in the plural attractor condition.

In Table 3-4 and Table 3-5, the model outcomes are specified for grammatical and ungrammatical sentences, when the head of the subject is plural. In the grammatical condition (Table 3-4), the verb is plural, and the cues are assumed to be { CASE:Nom, NUM:Pl }. Regardless of the number on the attractor, both cues will activate the subject.
Because the attractor never satisfies both cues, the correct projection receives most of the activation, and has the highest sampling probability. When the attractor does satisfy the plural cue, the balance of sampling between DP.1 and DP.2 shifts. It is unclear whether this would impact judgment behavior, because the attractor does match the verb in number. In the cases where the attractor is sampled (10% of the time), its number feature would still agree. If only that fact mattered, then both conditions would lead to a successful outcome in agreement checking in virtually all cases. The familiarity score is high in both cases, because both scenarios involve a cue set that is maximally effective for one image.

[DP.1 The paths to ...         [DP.2 the monument] are        [DP.2 the monuments] are
Q (Cue)       W(eight)         S: DP.1   S: DP.2              S: DP.1   S: DP.2
CASE:Nom      0.5              0.99      0.01                 0.99      0.01
NUM:Pl        0.5              0.99      0.01                 0.99      0.99
                               DP.1      DP.2      F          DP.1      DP.2      F
Activation                     0.99      0.01      1.0        0.99      0.10      1.1
P_SAMPLE                       [0.99]    0.01                 [0.91]    0.10

Table 3-4 Retrieval structure and outcomes: plural-headed subjects, grammatical continuations. The upper cells specify the strength of association between the cues in the cue set and the stored images. The lower cells show the outcome of probing the memory with that cue set, in terms of the activations of the images, their sum (F, the familiarity score in a recognition task), and the normalized activation (the sampling probability in a recall task). The probability of sampling the correct projection, i.e. the subject DP.1, is shown in square brackets.

In the ungrammatical condition (Table 3-5), there is no NUM:Pl cue. The only effective cue is CASE:Nom, and it correctly points to the subject projection in both cases. Notice that the familiarity scores are high and equal in both cases: on the one hand, this indexes the efficacy of the single cue in the cue set for identifying a constituent.
On the other hand, it illustrates the point that raw familiarity alone is likely not sufficient to determine a grammaticality response. We raised the question earlier of whether familiarity alone might be sufficient to account for agreement attraction. The intuition was that agreement attraction represents a situation in which the comprehender recognizes the existence of the number features in the syntactic context, but does not pause to examine whether or not they were present in the image corresponding to the structurally-licensed constituent. However, that strategy only seems to gain traction when there are two features at play. In the case of a plural-headed subject, and an ungrammatical sentence, the only cue is for nominative case; it alone drives the distribution of activation among images. Because there is no singular cue, there is no way in which familiarity is informative about agreement checking. It must be the case that information contained in the returned image is examined, in order to determine that these sentences are ungrammatical.

[DP.1 The paths to ...         [DP.2 the monument] *is        [DP.2 the monuments] *is
Q (Cue)       W(eight)         S: DP.1   S: DP.2              S: DP.1   S: DP.2
CASE:Nom      1                0.99      0.01                 0.99      0.01
                               DP.1      DP.2      F          DP.1      DP.2      F
Activation                     0.99      0.01      1.0        0.99      0.01      1.0
P_SAMPLE                       [0.99]    0.01                 [0.99]    0.01

Table 3-5 Retrieval structure and outcomes: plural-headed subjects, ungrammatical continuations. The upper cells specify the strength of association between the cues in the cue set and the stored images. The lower cells show the outcome of probing the memory with that cue set, in terms of the activations of the images, their sum (F, the familiarity score in a recognition task), and the normalized activation (the sampling probability in a recall task). The probability of sampling the correct projection, i.e. the subject DP.1, is shown in square brackets.

In summary, a very simple cue-driven model seems to recapitulate the basic patterns observed in agreement attraction comprehension. Crucially, we were able to capture the asymmetry between grammatical and ungrammatical conditions observed for plural attractors. The plural attractor only intruded in the ungrammatical condition, because the efficacy of the cues was divided between the subject projection and the attractor projection. Using the SAM model allowed us to make our initial intuitions precise.

Before moving on to relative clause attraction, we would like to consider one further issue. Suppose that our choice of cues in the scenarios above was misleadingly judicious. Suppose that, even in Sg [ Pl ] grammatical sentences, the verb supplies a cue that contacts both subject and attractor projections, that is, both images DP.1 and DP.2. It is not implausible to imagine such a case: for example, a syntactic category licensing feature, like CAT:DP. This scenario is similar to how the model treats sentences like "The paths to the monuments are," where two cues point to the subject (CASE:Nom, NUM:Pl), but one cue points to two images (NUM:Pl). Notice that in that case the disparity in activation of the DP.1 and DP.2 images was still large: 0.91 to 0.10. The reason is that cue combination is not linear, but a weighted product. Consequently, convergent cues are much more effective than single cues at retrieval. Therefore, there would remain a qualitative difference between the Sg [ Pl ] ungrammatical sentences and the Sg [ Pl ] grammatical sentences. In the latter case, the verb triggers retrieval with information that is always associated with the subject image (CASE:Nom, CAT:DP, etc.), whereas in the former case, the information is split between the two images (CASE:Nom v. NUM:Pl). Put more succinctly, all of the cues in the grammatical sentences are cooperative, whereas they are in competition in ungrammatical sentences.
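The four complex-subject scenarios, and the additional shared-cue scenario just described, can be recomputed with a few lines of code. This is a sketch under stated assumptions, not the original modeling code: the names `probe` and `subject` are hypothetical, weights are exactly uniform (1/m), only the two DP images are included, and CAT:DP is the hypothetical category cue mentioned in the text. Values agree with the tables up to rounding.

```python
import math

HIT, MISS = 0.99, 0.01  # near-1 / near-0 cue-to-image strengths from the text

def probe(images, cues):
    """Probe memory with a cue set, using uniform weights that sum to 1.

    images: {label: set of "ATTR:value" features}; cues: list of such strings.
    Returns (activations, familiarity, sampling probabilities).
    """
    w = 1.0 / len(cues)
    act = {label: math.prod((HIT if cue in feats else MISS) ** w for cue in cues)
           for label, feats in images.items()}
    familiarity = sum(act.values())
    p_sample = {label: a / familiarity for label, a in act.items()}
    return act, familiarity, p_sample

def subject(head_plural, attractor_plural):
    """DP images for 'the path(s) to the monument(s)'; singular is unmarked."""
    dp1 = {"CASE:Nom", "CAT:DP"} | ({"NUM:Pl"} if head_plural else set())
    dp2 = {"CASE:Obl", "CAT:DP"} | ({"NUM:Pl"} if attractor_plural else set())
    return {"DP.1": dp1, "DP.2": dp2}

SINGULAR_VERB = ["CASE:Nom"]           # licensing cue only
PLURAL_VERB = ["CASE:Nom", "NUM:Pl"]   # licensing cue plus agreement cue

if __name__ == "__main__":
    for head_pl in (False, True):
        for attr_pl in (False, True):
            for verb, cues in (("is", SINGULAR_VERB), ("are", PLURAL_VERB)):
                _, fam, p = probe(subject(head_pl, attr_pl), cues)
                print(head_pl, attr_pl, verb, round(fam, 2), round(p["DP.1"], 2))

    # Shared-cue scenario: CAT:DP contacts both DPs in a Sg [ Pl ] subject.
    _, _, p_gram = probe(subject(False, True), SINGULAR_VERB + ["CAT:DP"])
    _, _, p_ungram = probe(subject(False, True), PLURAL_VERB + ["CAT:DP"])
    print(round(p_gram["DP.1"], 2), round(p_ungram["DP.1"], 2))
```

With the shared cue added, the subject is still sampled far more often than the attractor in the grammatical case, while the two images remain tied in the ungrammatical case, which is the cooperative versus competitive contrast drawn above.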
3.3.2.4 Relative clause attraction

Next we turn our attention to relative clause attraction. Because the experimental results are qualitatively similar for the two constructions, the aim is to see how far the same explanation will extend. An example of ungrammatical attraction is given in the relative clause in (60), with a partial phrase structure bracketing:

(60) [DP [DP The runners ] [CP who [DP the driver ] wave to each morning ] ...

Focusing on just the DPs in the structure, we assign the images below:

(61) [ Pl [ Sg ] ]: Images for "the runners who the driver" (head values not shown)
    DP.1    F-Head, L-Head, Adjunct: →CP.2, Case: Nom, Num: Pl
    DP.2    F-Head, L-Head, Case: Nom

Notice that there is a problem with the images as given. If the (ungrammatical) verb form supplies the same two cues as in the complex subject examples above, { CASE:Nom, NUM:Pl }, then the cues no longer compete, but converge on the attractor. Table 3-6 illustrates the numerical predictions.

[DP.1 The runners ...          [DP.2 the driver] *wave
Q (Cue)       W(eight)         S: DP.1   S: DP.2
CASE:Nom      0.5              0.99      0.99
NUM:Pl        0.5              0.99      0.01
                               DP.1      DP.2      F
Activation                     0.99      0.10      1.1
P_SAMPLE                       0.91      [0.09]

Table 3-6 Retrieval structure and outcomes for RC attraction: plural RC head, singular RC subject, ungrammatical. The probability of sampling the correct projection, i.e. the subject DP.2, is shown in square brackets.

The reason this scenario differs from the complex subject scenario is that the licensing feature we selected, CASE:Nom, differentiates the two DPs in complex subjects: the embedded DP does not bear the same case feature as the subject DP. However, in the relative clause case, both DPs are nominative DPs. This outcome may not be problematic for the attraction case: for example, we see no evidence of difficulty due to ungrammaticality in the first RC attraction reading time experiment (Experiment 1). However, in both Experiment 2 and the judgment study (Experiment 4) there are grammaticality effects.
More troublesome is an example like the following:

(62) The runner who the drivers waves to ...

Both the online and offline results indicate that comprehenders detect the ungrammaticality in this string, and that the singular relative clause head never intrudes to ameliorate it. However, if the cue set of "waves" is, by hypothesis, just { CASE:Nom }, then we expect retrieval to be split between the matrix subject and the RC subject, which would incorrectly predict an attraction effect. Thus our retrieval structure cannot be consistently implemented in the two scenarios and achieve comparable results.

In the specification above, at retrieval the system uses the licensing properties of the subject to identify its constituent image in memory. We chose Case to serve as the relevant licensing feature. However, the problem the RC construction poses is not which inherent property we choose to identify the subject with. Instead of Case, we might have selected grammatical function (i.e., Subject), or structural position (i.e., [Spec,TP]). But in the RC examples, both DPs are grammatical subjects and both occupy the same position in their respective clauses. The problem is that the system needs to be able to identify the subject relationally: it must identify the subject of the same clause as the verb.

Can we re-configure the retrieval structure to account for this aspect of the licensing? Suppose that we add another atomic feature to the system, identifying which clause a constituent belongs to. A revised set of images for "the runners who the driver" is given below:

(63) [ Pl [ Sg ] ]: Images for "the runners who the driver" (head values not shown)
    DP.1    F-Head, L-Head, Adjunct: →CP.2, Case: Nom, Num: Pl, Clause: 1
    DP.2    F-Head, L-Head, Case: Nom, Clause: 2

The clauses are differentiated by a scalar value: call the matrix clause 1, and the RC clause 2. These numbers have no special meaning (like depth of embedding) but merely serve as arbitrary indices (see section 3.3.4 below).
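The difficulty that motivates the Clause feature can be made concrete with a short worked calculation for (62) under the original, Case-only retrieval structure. The sketch assumes the same near-1 and near-0 strengths (0.99 and 0.01) used for complex subjects, a single cue with weight 1, and a memory restricted to the two DP images; here DP.1 is the RC head "the runner" and DP.2 is the RC subject "the drivers".

```latex
% Cue set for "waves": { CASE:Nom }, w = 1. Both DPs are nominative.
\begin{align*}
A_{\mathrm{DP.1}} &= S(\mathrm{CASE{:}Nom}, \mathrm{DP.1})^{1} = 0.99 \\
A_{\mathrm{DP.2}} &= S(\mathrm{CASE{:}Nom}, \mathrm{DP.2})^{1} = 0.99 \\
F &= A_{\mathrm{DP.1}} + A_{\mathrm{DP.2}} = 1.98 \\
P(\mathrm{DP.1}) &= P(\mathrm{DP.2}) = \frac{0.99}{1.98} = 0.50
\end{align*}
```

On these assumptions the singular RC head would be sampled on half of the trials and would match "waves" in number, so the Case-only structure predicts an attraction effect that the experiments do not show.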
Suppose now that the cue set for the RC verb in an ungrammatical sentence is: { CASE:Nom, NUM:Pl, CLAUSE:2 }. We keep our default assumptions that cues are associated with images with near-1 strength for features present in an image and near-0 strength otherwise, and that attention is divided equally among the cues. The outcomes are encouraging. First, we consider the full results for the plural attractor RC fragments. Table 3-7 reports the results for both grammatical and ungrammatical sentences.

[DP.1 The runners who ...      [DP.2 the driver] waves                   [DP.2 the driver] *wave
Q (Cue)                        W(eight)  S: DP.1   S: DP.2               W(eight)  S: DP.1   S: DP.2
CASE:Nom                       0.5       0.99      0.99                  0.33      0.99      0.99
CLAUSE:2                       0.5       0.01      0.99                  0.33      0.01      0.99
NUM:PL                         -         -         -                     0.33      0.99      0.01
                                         DP.1      DP.2      F                     DP.1      DP.2      F
Activation                               0.10      0.99      1.1                   0.22      0.22      0.43
P_SAMPLE                                 0.09      [0.91]                          0.50      [0.50]

Table 3-7 Revised retrieval structure and outcomes, RC attraction: plural RC head, singular RC subject. The probability of sampling the correct projection, i.e. the subject DP.2, is shown in square brackets.

Focusing on the ungrammatical case first (second column set), we see that the cues compete equally for the DP.1 and the DP.2 images, consistent with the pattern achieved for complex subjects. In the grammatical case, the convergence of the Case and Clause cues means that the correct image is sampled overwhelmingly (91% v. 9%). The qualitative pattern is not categorical, however, and opens the possibility that the attractor may intrude in agreement checking for the grammatical sentences, in a very small proportion of cases. A similar prediction was obtained for sentences like "The keys to the cabinets are ..." but this prediction is difficult to verify, given that any noun-verb combination correctly agrees. However, in the RC case, the attractor and the verb misagree, so there could be a consequence: roughly 10% of the time, an illusion of ungrammaticality should obtain. On the one hand, no reliable effect of attractor was found in the real-time studies, when grammatical conditions were compared.
On the other hand, in the offline studies, there was a marginal effect of attractor number in the grammatical conditions. The data from Experiment 3 are repeated in the table below:

                                                  Response (% "Yes": acceptable)
Sentence type                                     "runner"      "runners"
Grammatical     "The ___ who the driver sees"     82%           70%
Ungrammatical   "The ___ who the driver see"      28%           64%

Table 3-8 Speeded grammaticality judgments of RC attraction. From Experiment 3.

It may be that the increased tendency to report grammatical RC sentences as ungrammatical when there is a plural attractor stems from the kind of retrieval structure we have posited. Case and Clause cues converge to select the correct subject, but the Case cue also partially activates the RC head. The RC head mismatches the verb in number feature, resulting in the impression of ungrammaticality. We explore the consequences of the partial Case activation in greater detail in 3.3.3, and accrue some supporting evidence (Experiment 5).

However, there is one more crucial scenario under which we must consider our revised retrieval structure for RC attraction: plural RC subjects. Empirically, the plural subject seems to "insulate" the agreement processes from intrusion by a singular attractor. Table 3-9 reports the model results from such a configuration. The correct image is overwhelmingly sampled in both cases. The decreased number of cues, however, means that the contribution of the partial match of CASE:Nom is greater for ungrammatical conditions. Just as in the grammatical condition above, a small intrusion of the attractor is predicted in ungrammatical sentences here. Experimentally, however, no amelioration of the ungrammatical cases is observed, either in judgments or in reading times.

[DP.1 The runner who ..]
                 [DP.2 the drivers] wave           [DP.2 the drivers] *waves
                            S(trength)                        S(trength)
Q (Cue)       W(eight)    DP.1      DP.2        W(eight)    DP.1      DP.2
CASE:Nom      0.33        0.99      0.99        0.50        0.99      0.99
CLAUSE:2      0.33        0.01      0.99        0.50        0.01      0.99
NUM:PL        0.33        0.01      0.99        -           -         -
Activation                0.05      0.99                    0.09      0.99
  (sum)                   1.0                               1.1
P(sample)                 0.05    ||0.95||                  0.09    ||0.90||

Table 3-9 Revised retrieval structure and outcomes, RC attraction. Singular RC head, plural RC subject.
The probability of sampling the correct projection, i.e. the subject DP.2, is highlighted in the double-bordered cell (marked || ||).

How should we interpret the results in Table 3-7 and Table 3-9? On the one hand, the predictions seem qualitatively correct, and in line with the asymmetrical results obtained in Experiments 1 through 4. The attractor intrudes strongly in ungrammatical sentences. On the other hand, the model does not categorically predict a complete lack of intrusion of the attractor in grammatical sentences (Table 3-7), nor does it categorically predict that a singular attractor should have no impact for ungrammatical singular agreement (Table 3-9). In both of these cases, the CASE:Nom cue points to the attractor, shifting by a few points the baseline activation contributed by the near-0 strength cues. As we mentioned above, there is some experimental indication that the plural attractor can intrude slightly when the sentence is grammatical, which the model captures as the partial contribution of CASE:Nom (as in Table 3-8). However, there is no corresponding experimental effect when the attractor is singular and the sentence is ungrammatical.

Because the judgment effect for grammatical sentences containing plural attractors was small (and marginal), we would like to replicate it. If this effect is real, then either the model is correct, and we must explain why no corresponding effect is observed for ungrammatical sentences containing singular attractors; or we must revise the model.
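The sampling probabilities in Table 3-7 and Table 3-9 can be recomputed with a short sketch. It assumes, as the tabled values suggest, that an image's activation is the product of its cue strengths, each raised to that cue's attentional weight, and that the sampling probability is an image's share of the total activation. The function names, the dictionary encoding of images, and the exact 1/n weights (the tables print 0.33) are illustrative choices, not part of the model specification.

```python
from math import prod

# Assumed near-1 / near-0 cue-to-image strengths, as in the tables.
MATCH, MISMATCH = 0.99, 0.01


def strength(image, cue):
    """Strength of one cue for one image: MATCH if the image bears the feature value."""
    feature, value = cue
    return MATCH if image.get(feature) == value else MISMATCH


def sampling_probabilities(images, cues):
    """Weighted-product activation with uniform weights, normalised over images."""
    weight = 1.0 / len(cues)  # equal division of attention
    activation = {
        name: prod(strength(image, cue) ** weight for cue in cues)
        for name, image in images.items()
    }
    total = sum(activation.values())
    return {name: a / total for name, a in activation.items()}


# Images as feature bundles; singular DPs carry no Num feature, following (63).
plural_head = {
    "DP.1": {"CASE": "Nom", "NUM": "Pl", "CLAUSE": 1},  # the runners
    "DP.2": {"CASE": "Nom", "CLAUSE": 2},               # the driver
}
singular_head = {
    "DP.1": {"CASE": "Nom", "CLAUSE": 1},               # the runner
    "DP.2": {"CASE": "Nom", "NUM": "Pl", "CLAUSE": 2},  # the drivers
}

base_cues = [("CASE", "Nom"), ("CLAUSE", 2)]   # singular verb form
plural_cues = base_cues + [("NUM", "Pl")]      # plural verb form adds NUM:Pl

table_3_7 = {
    "grammatical (waves)": sampling_probabilities(plural_head, base_cues),
    "ungrammatical (*wave)": sampling_probabilities(plural_head, plural_cues),
}
table_3_9 = {
    "grammatical (wave)": sampling_probabilities(singular_head, plural_cues),
    "ungrammatical (*waves)": sampling_probabilities(singular_head, base_cues),
}

for label, probs in {**table_3_7, **table_3_9}.items():
    print(label, {name: round(p, 2) for name, p in probs.items()})
```

Under these assumptions the computed probabilities agree with the tabled ones to within about 0.01. The even split in the ungrammatical plural-head case arises because each DP matches exactly two of the three cues.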
3.3.3 Agreement & Case (Experiment 5)

In this experiment we attempted to replicate part of Experiment 3, in which we observed that a plural RC head could intrude upon the checking of grammatical agreement, leading to the slightly decreased acceptability of (64b) with respect to (64a).

(64) (a) The runner who the driver sees ..
     ~> (b) The runners who the driver sees ..

There was a marked asymmetry between how much better ungrammatical sentences got when an attractor was present (a lot) and how much worse grammatical sentences got when an attractor was present (just a little). These results were clearly inconsistent with an account based upon erroneous feature percolation. In our formal, cue-based retrieval model, attraction arises from partial activation by the number cue on the attractor encoding. In that model, it was necessary for some cue to contact the representation of the subject, which we implemented with the Case property. This decision seems motivated, as case and agreement have been linked in syntactic analysis, associating nominative Case with subject-verb agreement (Chomsky, 1995). However, there is a consequence to using an inherent property, like Case, to access subject representations, which is that irrelevant subjects should be subject to retrieval. In grammatical RCs, like (65), the Case and clausemate requirements of "sees" converge on the correct subject phrase "the driver"; the Case cue partially contacts "The runners" as well.

(65) The runners who the driver sees each morning on the commute ..

Consequently it is expected that in a small proportion of cases, "the runners" will be retrieved for agreement checking. How small a number of cases depends on the precise numerical properties of the model. The small (marginal) decrement in acceptability observed in Experiment 3 is consistent with this system. However, there is a qualitative distinction with configurations in which the RC head matches none of the verb's cues.
Such a configuration obtains when the relative clause is not attached to a subject-like head: for example, in an object-attached relative clause.

(66) Gerard recognized the runners who the driver sees each morning on the commute.

Here the relative clause head is an object in the main clause. It occupies a different structural position, and bears accusative Case. It therefore looks nothing like an agreement licenser, and should not intrude as a partial match. Consequently, unlike the subject-attached RC, where a small illusion of ungrammaticality is possible in grammatical sentences, the object-attached RC should exhibit no such illusion. In Experiment 5 we performed a direct comparison of subject- and object-attached RCs in a speeded grammaticality task, to determine whether or not the matrix clause subject-hood of the attractor impacts acceptability.

3.3.3.1 Model predictions

First we consider the SAM predictions for object-attached v. subject-attached RC attraction. The retrieval structure is identical to the one outlined in Table 3-7 and Table 3-9 above: there is a Case cue, a clause-mate cue, and a number cue. The predictions for a subject-attached RC with a singular RC subject are recapitulated below, in Table 3-10, simply as the sampling probability of the two DP images in memory.

Condition                                                        Sampling probability
Agreement       Number   Example                                 Subject    Attractor
Grammatical     SG       The runner who the driver sees ..       0.91       0.09
                PL       The runners who the driver sees ..      0.91       0.09
Ungrammatical   SG       The runner who the driver see ..        0.82       0.18
                PL       The runners who the driver see ..       0.50       0.50

Table 3-10 Subject and attractor sampling probabilities for Subj-attached RCs.
Three cues are used by the model: a clause-mate cue, a case cue, and a number cue. Cue strength is all-or-none and attentional weighting is distributed uniformly.

The predictions for an object-attached RC are given in Table 3-11.

Condition                                                        Sampling probability
Agreement       Number   Example: "Gerard recognized .."         Subject    Attractor
Grammatical     SG       .. the runner who the driver sees ..    0.99       0.01
                PL       .. the runners who the driver sees ..   0.99       0.01
Ungrammatical   SG       .. the runner who the driver see ..     0.94       0.06
                PL       .. the runners who the driver see ..    0.80       0.20

Table 3-11 Subject and attractor sampling probabilities for Obj-attached RCs.
Three cues are used by the model: a clause-mate cue, a case cue, and a number cue. Cue strength is all-or-none and attentional weighting is distributed uniformly.

There are two important comparisons to make between the two tables. First, as we have been discussing, for grammatical RCs with a plural head there is virtually no likelihood of sampling the attractor when the RC is object attached, whereas there is an increased likelihood of doing so when the RC is subject attached (compare the double-bordered cells across tables, i.e. the Grammatical, PL rows). Secondly, however, the attraction effect is predicted to grow smaller for ungrammatical object-attached RCs (compare the wavy-bordered cells across tables, i.e. the Ungrammatical, PL rows). The CASE:Nom cue is no longer effective in contacting the attractor, so the increased activation it provided is missing: NUM:Pl is the only cue activating the attractor, competing against a near-full match with the subject on CASE:Nom and CLAUSE:2. Note, however, that the attraction effect is not absent: the model simply predicts a more lopsided split in the sampling of subject and attractor, 80/20 instead of 50/50.

In an experimental design that crosses grammaticality, attractor number and attachment site, the model makes the usual, familiar predictions: a decrease in "yes" responses when the subject and verb fail to agree, but an increase in "yes" responses when a plural attractor is present. However, there are two signature predictions: (1) a decrease in "yes" responses to grammatical sentences containing a plural attractor, but only for subject-attached RCs; and (2) a smaller ungrammatical attractor effect for object-attached RCs.
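The second signature prediction can be traced by hand for the ungrammatical, plural-attractor condition. The following is a worked sketch, assuming that activation is the product of cue strengths each raised to a weight of 1/3, that strengths are 0.99 for a match and 0.01 for a mismatch, and that only the RC subject and the RC head compete. The symbols A and P are introduced here for illustration only. Under these assumptions the results land near the tabled values (0.50 and 0.80) rather than exactly on them.

```latex
\begin{align*}
% RC subject "the driver": matches CASE:Nom and CLAUSE:2, mismatches NUM:Pl
A_{\mathrm{subj}} &= (0.99 \cdot 0.99 \cdot 0.01)^{1/3} \approx 0.214 \\
% Subject-attached head "the runners": matches CASE:Nom and NUM:Pl, mismatches CLAUSE:2
A_{\mathrm{attr}}^{\mathrm{Subj}} &= (0.99 \cdot 0.01 \cdot 0.99)^{1/3} \approx 0.214
  &\Rightarrow\quad P(\mathrm{subj}) &= \frac{0.214}{0.214 + 0.214} = 0.50 \\
% Object-attached head "the runners": matches NUM:Pl only (accusative, other clause)
A_{\mathrm{attr}}^{\mathrm{Obj}} &= (0.01 \cdot 0.01 \cdot 0.99)^{1/3} \approx 0.046
  &\Rightarrow\quad P(\mathrm{subj}) &= \frac{0.214}{0.214 + 0.046} \approx 0.82
\end{align*}
```

Losing the CASE:Nom match divides the attractor's activation by a factor of 99^{1/3} (roughly 4.6), which is what turns the even split into a lopsided one.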
3.3.3.2 Materials and Methods

Participants

Participants were 24 native speakers of English from the University of Maryland community with no history of language disorders. They received credit in an introductory linguistics course for their participation.

Materials

40 sentence sets were adapted from Experiment 3. Each set crossed grammaticality, attractor number (RC head: plural/singular), and the site of attachment for the relative clause (Subj/Obj). The attachment site manipulation affects the Case property of the attractor: when subject-attached, the Case of the attractor is nominative; when object-attached, it is accusative. A sample set of materials is given below:

Attachment  Grammaticality  Attractor  Sentence
site                        number
Subj        Grammatical     Sg         The musician who the reviewer praises so highly will probably win a Grammy.
                            Pl         The musicians who the reviewer praises so highly will ...
            Ungrammatical   Sg         The musician who the reviewer praise so highly will ...
                            Pl         The musicians who the reviewer praise so highly will ...
Obj         Grammatical     Sg         Phil met the musician who the reviewer praises so highly ...
                            Pl         Phil met the musicians who the reviewer praises so highly ...
            Ungrammatical   Sg         Phil met the musician who the reviewer praise so highly ...
                            Pl         Phil met the musicians who the reviewer praise so highly ...

Table 3-12 Sample materials set for Experiment 5

The materials were distributed across 8 lists by a Latin Square. Each participant therefore saw five items per condition. This experiment was run concurrently with a related experiment on complex subjects, which incorporated 24 sentence sets (half of the conditions in which were ungrammatical). 56 further filler sentences were included, which were the same fillers run in Experiment 3. Overall, each participant saw 60 well-formed sentences and 60 ill-formed sentences.

Procedure and Analysis

Presentation and analysis details were as described for Experiment 4.

3.3.3.3 Results

Figure 3-7 reports the proportion of "yes"
responses in the Subj-attached conditions, and Figure 3-8, the same values in the Obj-attached conditions.

Figure 3-7 Experiment 5: Relative clause attraction, Subject-attached RCs. Speeded grammaticality, proportion "yes" responses.

Full model: GRAMMATICALITY × ATTRACTOR × ATTACHMENT. We observed a main effect of grammaticality: participants were less likely to respond "yes" when subject and verb failed to agree (fixed effect logit coefficient β: -2.11 ± 0.65; p < 0.001). We also observed a main effect of attachment: participants were more likely to respond "yes" when the relative clause was attached to the subject (β: 1.00 ± 0.66; p < 0.005). Two interactions were significant: the two-way grammaticality × attractor number interaction, such that participants were more likely to respond "yes" to ungrammatical sentences when the relative clause head was plural (β: 1.12 ± 0.80; p < 0.001); and a three-way grammaticality × attractor × attachment interaction, such that participants were more likely to respond "yes" to ungrammatical sentences when both the relative clause head was plural and the attachment site was the subject (β: 1.27 ± 0.60; p < 0.05).

Partial model/Subject Attached: GRAMMATICALITY × ATTRACTOR. Considering only the half of the conditions in which the RC was subject attached, there was a main effect of grammaticality (β: -2.73 ± 0.67; p < 0.001), indicating a decreased likelihood of responding "yes" to ungrammatical sentences. There was also a grammaticality × attractor number interaction (β: 2.37 ± 0.88; p < 0.001), such that participants were more likely to respond "yes" to ungrammatical sentences when the relative clause head was plural. Crucially, participants were also slightly less likely to say "yes" to sentences containing a plural attractor, even if they were grammatical (β: -0.91 ± 0.67; p < 0.01).
This effect corresponds to the marginal effect observed in Experiment 3; it confirms that grammatical agreement within RC sentences is perceived as less acceptable when the RC head is plural, the RC subject is singular, and the RC is attached to a phrase in subject position.

Partial model/Object Attached: GRAMMATICALITY × ATTRACTOR. The pattern of results changes when only the object-attached conditions are considered. The only reliable effects were grammaticality (β: -2.02 ± 0.58; p < 0.001) and the grammaticality × attractor number interaction (β: 1.06 ± 0.40; p < 0.01). Participants detected ungrammaticality, but were much more likely to say "yes" to ungrammatical sentences if the sentence contained a plural RC head. Unlike with subject-attached relative clauses, however, participants did not show a decreased likelihood to say "yes" to grammatical sentences. In other words, grammatical agreement is impervious to the attractor in object-attached RCs, but not in subject-attached RCs.

Figure 3-8 Experiment 5: Relative clause attraction, Object-attached RCs. Speeded grammaticality, proportion "yes" responses.

3.3.3.4 Discussion

This experiment once again replicated the basic attraction effect: comprehenders detect subject-verb misagreement robustly, except when there is a plural attractor. However, it also revealed a tendency to judge grammatical sentences containing a plural attractor as ungrammatical. This small tendency was seen in Experiment 3, but there it was only marginally significant; in this experiment, which included more participants, it was reliable. What this experiment further showed is that this tendency was only present in subject-attached relative clauses. The cue-based model is consistent with the existence of a small effect, but crucially predicts that it should be absent when the attractor is not subject-like. The experiment bore out this prediction.
The experiment also bore out the model's second prediction, which is that the illusion of grammaticality that an attractor induces should shrink when the attractor is in matrix clause object position. In both subject- and object-attached cases, a baseline attraction effect is guaranteed by the NUM:Pl cue; but in the subject case, it is reinforced by the CASE:Nom cue. The discrepancy between the two effects is not great, though it is reliable. In subject cases, the attractor increased acceptability judgments from an ungrammatical baseline of 38% to 67%, a raw difference of 29 percentage points. In object cases, the attractor increased judgments from a baseline of 28% to 47% acceptable, a raw difference of 19 percentage points. In the full logit model, the difference in odds ratios between subject-attached and object-attached attraction was reliable (β: 1.27 ± 0.60; p < 0.05).

The difference in baselines raises the question of why the sentences containing object-attached relative clauses were judged less acceptable overall. Even for singular-attractor, grammatical sentences, the acceptability rate was 71% for object-attached RCs, compared to 86% for subject-attached RCs. The raw rates are misleading, however: indeed, participants were overall more sensitive to agreement mismatches in object-attached RCs. The odds ratio of saying "no" in ungrammatical v. grammatical sentences was 6.3 in object-attached RCs, which is slightly greater than the odds ratio of 6.0 for subject-attached RCs. The differences observed in attractor effects between subject- and object-attached RCs cannot therefore be attributed to scale differences.

3.3.3.5 Timing differences, task demands and an alternative explanation

The difference between subject- and object-attached RCs appears consistent with the partial match property of the content-addressable retrieval model. However, there is another candidate explanation to consider, which concerns the relationship between the act of judgment and the occurrence of the violation (or of processing difficulty).
Suppose that there are two ways to perform the grammaticality judgment task: (1) upon noting violations, set and maintain an internal "no" response until the judgment is signaled; (2) re-inspect the representation when the signal to judge is given. An individual may choose and mix strategies depending upon properties of the input. Suppose then that the "flag and set response" strategy is more effective when the delay between flagging and responding is relatively modest, and becomes less effective the longer the "flag" has to be maintained. The intuition is that "older" flags are less reliable, and are treated with less confidence as time goes by. The earlier the flag is set in the sentence, the less information it is based on, and the more likely it indexes a violation that could have been resolved later.

In the subject-attached RC case, the subject-verb mismatch occurs early in the comprehension of the sentence, and considerable time elapses before the judgment is made. There are on average 10 words that occur between the critical verb and the end of the sentence. Given a presentation rate of 300 ms / word, roughly 3 seconds would have elapsed between the violation and the opportunity to respond. In the object-attached RC case, the judgment occurs relatively soon after the critical verb. There are on average only 5.5 words that are presented in that interval, corresponding to 1650 ms of elapsed time. As a consequence, in object RC cases the judging behavior is controlled largely by the "flag" strategy, since object RC errors are mostly associated with recently set flags. In the subject RC cases, the participant is more prone to re-inspect or re-generate the syntactic representations. In particular, suppose that the comprehender covertly produces the sentence to judge it. In such a scenario, the agreement attraction error could be recapitulated, just as if the individual were producing the sentence for the first time.
Even if the comprehended sentence was grammatical, the presence of the plural attractor could sometimes lead to the (covert) production of an error, which would lower the perception of grammaticality for grammatical sentences. Notice that it must be a property of the stimulus that determines which strategy the comprehender relies on. There would be no "no" flag for grammatical object-attached sentences, yet no decrement in acceptable judgments is observed in the attractor condition. Therefore, it could not be that the absence of a "no" flag triggers regeneration. Rather, we would have to posit that the participant notices that subject RCs generate many low-confidence responses, and therefore has a greater tendency to re-inspect all such sentences.

The regeneration scenario is plausible. Syntactic representations have long been seen as labile entities, with short half-lives. This property of the representation has been inferred from the fact that an individual's ability to discriminate between a recently presented sentence and a semantically equivalent but syntactically distinct version of the sentence declines rapidly after presentation (e.g., Sachs, 1967). It is a prominent property of verbal memory, however, that sentences can be repeated verbatim with high accuracy immediately after presentation (e.g., Jarvella, 1971). Potter & Lombardi (1992) argued that verbatim memory does not reflect a "read-out" operation over a well-preserved syntactic representation. Rather, accurate verbatim performance is the outcome of a new production act, fed by the recently activated lexical items and the recent interpretation. They later extended this explanation to syntactic priming (Potter & Lombardi, 1998). We have raised the specter of precisely this mechanism in suggesting that re-inspection of a syntactic representation invokes production processes. Based on the present data, however, it is difficult to significantly strengthen this perspective or conclusively argue against it.
On the one hand, the notion that more processing is involved in judging the subject-attached RC sentences is potentially consistent with the overall higher accuracy in judgments. On the other hand, it naïvely predicts that judgment times for object-attached RC sentences should be faster than for subject-attached RCs. Our data show this is not the case: the average time to make a judgment of object-attached RC sentences is 727 ms, compared to 613 ms for subject-attached RC sentences (mean difference: 114 ms; 95% C.I.: [45 ms, 116 ms]; p < 0.01). That prediction is probably too simplistic, however: there is potentially more local processing difficulty when judgments are made immediately following an object RC, and that difficulty could spill over into judgment times.

A more direct test will be necessary to better evaluate this alternative explanation. For example, subject-attached RCs presented in a sentence-initial position could be contrasted with those that occur in sentence-final position:

(67) (a) The runners who the driver sees each morning were next to the busy intersection.
     (b) Next to the busy intersection were the runners who the driver sees each morning.

The case properties of the attractor head are putatively identical in both positions, but the difference in timing between the critical verb and the judgment signal is greater in (a) than in (b). Consequently, a decrement in acceptance of grammatical attractor sentences should be more pronounced in (a) than in (b), if the controlling factor is timing and not the inherent properties of the attractor.

3.3.4 Clause-boundedness

The retrieval structure proposed to handle RC agreement attraction incorporates a clause-mate licensing feature: the CLAUSE:2 cue. This cue ensures that in an ungrammatical sentence like "The runner who the drivers sees" the Case cue will target the correct image, and sampling will not be split between the attractor and the subject.
There is no evidence for singular DP attraction in these configurations, which places an important constraint on the model. However, is the idea of a "Clause" cue reasonable? Notice that a clause cue is fundamentally different from a case or number cue. The latter are well-motivated item properties of DPs. Case and number features play a role in structure building and licensing. For example, Case assignment (or checking) is tightly connected with syntactic explanations for the structural positions in which nominals may occur (e.g., Rouveret & Vergnaud, 1980). It is therefore plausible to assume that encodings of DPs include information about their Case properties. Likewise, both the existence of formal covariation between verbal and nominal heads and the semantic import of number imply that each head carries number information. It is a different claim, however, to suppose that the DPs contained within a clause encode information about the clause to which they belong. We will now attempt to substantiate this view.

In our view the Case and number properties of a DP map onto what the memory literature refers to as item features, whereas clause identity aligns with an encoding of context. Item features belong to encodings by virtue of the representational system in which they take part. DPs have Case features because the grammar assigns DPs a Case value. Context features belong to encodings by virtue of the circumstances under which the encodings were instantiated. It has been extensively documented that memory search is affected not only by inherent features of item encodings, but also by the circumstances under which an instance of the item was presented for study. A key paradigm for studying memory search is the free recall paradigm: individuals study a list of words, and are then prompted to recall as many as they can. An interesting aspect of an individual's response is the order in which the words are recalled, which is taken to reflect the organization of the search.
One determinant of order is pre-existing associations between words, or how the intrinsic properties of words relate to one another (Bousfield, 1953; Cofer, Bruce, & Reicher, 1966; and many others). However, response clustering can also be observed for extrinsic factors related to the conditions of study, like modality, gender of a speaker's voice, or typeface (Murdock & Walker, 1969; Hintzman, Block & Inskeep, 1972; Nilsson, 1974). In addition, internal factors can induce clustering, such as the processing task engaged in during encoding (Craik & Lockhart, 1972) or the mood of the participant (Eich, 1980). Finally, temporal contiguity matters: once pre-existing associations are controlled for, it is observed that words in successive list positions tend to be recalled successively (Kahana, 1996). Generally it can be said that pseudo-arbitrary conditions of encoding impact the means of searching items in memory, and these are broadly termed context effects.

Context could affect organization by different mechanisms: memory could be organized into separate kinds of stores that reflect input modality (an auditory store, a visual store, etc.; e.g., Nilsson, 1974), or attributes of the study conditions could be encoded directly in a memory trace as a set of features (Hintzman et al., 1972; Howard & Kahana, 2002). Howard and Kahana (2002) conceive of a context representation as a global state code that slowly evolves in time and is impacted by both the input and internal state variables. It can be thought of as a kind of elaborate time stamp (containing, however, more than just temporal information). Items that bear temporally closer time stamps will be more similar than those with temporally more separated states, and consequently they will be more likely to be sampled together.

We propose that information about clause membership (or membership in a relevant syntactic domain) is included in all constituent representations as a context encoding, a sort of linguistic time stamp.
The global organization of syntactic representations, we conjecture, is reflected not always in explicit encodings, but in the organization of the parsing mechanism. Segmentation of the sentence at the clausal level has been proposed as a perceptual and planning strategy (cf. Fodor, Bever, & Garrett, 1974; Levelt, 1973). This organization seems logical for a parser which must keep similar kinds of relationships distinct across clauses, as each clause, anchored by a verb or finite tense, introduces its own packet of thematic, event, and information structure. If the processing of separate clauses is reflected in separate epochs of processing, then it seems likely that an internal context representation would be impacted in large degree by which clause is currently being processed.

Let us flesh out a few details, in what is essentially a simplification of Howard & Kahana's (2002) temporal context model. Suppose, therefore, that there is an internal context representation, which can be conceived of as simply a large vector. Constituent encodings include not only their inherent features, like Case, number, tense, X′ relations, etc., but also the context vector c. The current value of c is determined by inputs from currently processed items and any number of internal state variables. If we imagine that context is updated in a step-wise fashion, then we can assume that context at step i is simply a linear combination of the current context, $c_{i-1}$, and the new information that is placed in context, $c_{NEW}$. The parameter $\beta$ determines how much strength is assigned to the new information in context. $\rho_i$ weakens the current representation (as a function of how similar new and old information are, in order to keep the magnitude of c constant; see Howard & Kahana, 2002, for details).

(68) $c_i = \rho_i c_{i-1} + \beta c_{NEW}$

Suppose further that when processing shifts from one clausal (or cyclic) domain to another, the context encoding is shifted by a randomly generated $c_{NEW}$.
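The update in (68), together with the clause-shift assumption, can be sketched in a few lines. This is an illustration under stated assumptions, not the model as implemented anywhere in this work: the two coefficients are called rho and beta here after Howard & Kahana (2002), rho is computed with their length-preserving normalisation for unit vectors, and the dimensionality, the two beta values, and the word-by-word event stream are hypothetical choices.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def random_unit(dim, rng):
    """A random direction in context space, scaled to unit length."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def update_context(c_prev, c_new, beta):
    """c_i = rho_i * c_prev + beta * c_new, with rho_i chosen so that |c_i| = 1.

    Assumes c_prev and c_new are unit vectors and 0 <= beta < 1.
    """
    d = dot(c_prev, c_new)
    rho = math.sqrt(1.0 + beta ** 2 * (d ** 2 - 1.0)) - beta * d
    return [rho * p + beta * n for p, n in zip(c_prev, c_new)]


def simulate(seed=0, dim=50, beta_word=0.1, beta_clause=0.8):
    """Encode each constituent with the context current at that step.

    Hypothetical event stream: ordinary items nudge context slightly
    (beta_word); entering the relative clause shifts it strongly (beta_clause).
    """
    rng = random.Random(seed)
    c = random_unit(dim, rng)
    events = [
        ("the runners", beta_word),  # matrix clause
        ("who", beta_clause),        # clause boundary: large random shift
        ("the driver", beta_word),   # relative clause
        ("wave", beta_word),         # RC verb, where retrieval is triggered
    ]
    snapshots = {}
    for label, beta in events:
        c = update_context(c, random_unit(dim, rng), beta)
        snapshots[label] = c
    return snapshots


contexts = simulate()
verb = contexts["wave"]
# All vectors have unit length, so the dot product is the cosine similarity.
sim_attractor = dot(verb, contexts["the runners"])
sim_subject = dot(verb, contexts["the driver"])
print("verb ~ RC head:   ", round(sim_attractor, 2))
print("verb ~ RC subject:", round(sim_subject, 2))
```

With these settings the verb's context stays much closer to the context stored with the RC subject than to the one stored with the RC head, which is the graded counterpart of the CLAUSE:1 / CLAUSE:2 index used above.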
This assumption is a crucial one. It has the consequence that the value of context for images encoded in one clausal processing epoch is more similar than the value of context for images across clausal processing epochs. This similarity translates into different cue strengths in the SAM model. During the agreement checking process, the verb inside the RC will use its own context information as a cue, and this cue will highly activate images encoded in similar contexts. In the formulation of the RC model above, the effect corresponds to the index feature we used, assigning the whole RC DP the feature CLAUSE:1 and the RC subject the feature CLAUSE:2, and putting CLAUSE:2 in the retrieval structure of the verb.

We have sketched one means by which clause information could be encoded in every representation of a constituent contained within the clause. In effect we have stipulated that each constituent image contains a tag with a code for that clause. What we have attempted to argue, however, is that such a stipulation is not unnatural: a clause tag can be conceived of as one (important) component of an evolving context representation. The update of this representation can be keyed to major processing events in a way that leads to gradations and shifts in similarity of encoded units in memory as a function of which processing epoch a constituent was encoded in.

If we reconsider complex subject attraction, however, the incorporation of context in the retrieval structure is not unproblematic. Recall the case of a grammatical sentence including a plural attractor:

(69) The path to the monuments is littered with bottles.

The retrieval model was able to reliably contact the true subject in virtually all cases, because the only active feature in the retrieval structure is the Case cue. A Case cue points unambiguously to the correct image.
However, if now there is both a Case cue and a Clause/context cue, the incorrect image should be subject to sampling in a small number of cases, because the Clause cue points to it as well as to the correct image. One possible way to avoid this situation is to adapt the attention allocated to the context cue on the basis of its power of discriminability. We have operated under the assumption that attention is allocated uniformly to all cues, but that was a simplifying assumption made to illustrate the general properties of the system. Suppose instead that the weighting assigned to the context cue changes depending on how quickly or how often context has changed in some recent time window (and suppose that the system is capable of determining this fact). Context cues will therefore be more potent in embedded RCs than in simple matrix clauses. If the system has this property, then it predicts that it should be possible to modulate the effect of the attractor in grammatical sentences by embedding the complex subject in more complex (multi-clause) environments. In multi-clause environments the Clause cue is more salient and would consequently receive greater weight.

(70) (a) The path to the monuments is littered with bottles. → NO ATTRACTION
     (b) The lawn is tidy but the path to the monuments is littered with bottles. → SOME ATTRACTION

This experiment has not been run. Intuition suggests that there is no difference, but any intrusion would be subtle, so intuition is likely not a reliable guide in this situation.

3.3.5 Next to the cabinets ..

A further prediction of the current model is that any plural can in principle affect subject-verb agreement. Imagine a sentence like the following:

(71) Next to the cabinets the cat were sleeping.

The cues provided by the verb "were" are NUM:Pl and CASE:Nom. No DP satisfies both of these cues, but each should partially activate at least one: NUM:Pl being associated with "the cabinets" and CASE:Nom with "the cat."
The data for such constructions do not exist, both because the comprehension literature is relatively small and because the production literature, though large, has sampled only a small portion of the logical space of head-attractor configurations. Other kinds of complex subject attraction are sometimes reported, often simply observationally, as in:

(72) The participants' identity are to remain a secret. (R. Kayne, reported in den Dikken, 2001)

More work is needed on non-complex subject attraction to better formulate a model. Given the unattractiveness of feature percolation, these configurations are of increased interest for determining what counts in principle as an accessible DP for attraction.

3.3.6 Linking comprehension and production

The content-addressable model of agreement licensing captures the major characteristics of agreement attraction in comprehension. With a set of common assumptions across constructions, it was possible to recapitulate the major patterns and to some extent the size of effects observed in our reading time and speeded grammaticality studies. The partial match property of cue-based retrieval is responsible for allowing the attractor to intervene in ungrammatical sentences. However, because of a non-linear cue combination rule, partial matches can be relatively marginalized when cues fully converge on a representation. The results of Experiment 5 supported this conclusion.

However, what is to be made of the link between comprehension and production? One attractive aspect of the feature percolation model is that it can be easily stated independently of the task because the error derives from the syntactic encoding itself. Consequently the explanation for agreement attraction in comprehension and production is fundamentally the same. It is less clear that the content-addressable model has the same extensibility, but we would like to argue that it does.
First, consider Solomon & Pearlmutter's (2004) model of agreement attraction in production, discussed in Chapter 2. What drives the attraction effect in that model is the simultaneity of the head nouns during the production planning process. The attractor can wrest away control of agreement by being co-active with the grammatical controller when conceptual structure is mapped onto syntactic structure. Because both heads are simultaneously accessible, the process of selecting verb form sometimes spuriously pays attention to the attractor's features. The more accessible the attractor, the more likely it is to be spuriously selected. Accessibility is determined by the tightness of the semantic relation between heads: roughly, the degree to which the attracting head characterizes the subject head. In the comprehension model proposed here, simultaneity again plays a role. Because the search occurs in parallel, there is a potential that multiple constituents will be rendered accessible to comprehension processes. The choice of which set of constituents is made accessible depends on the match between the retrieval cues and constituent features. Therefore, there are formal similarities between the two models, although the mechanisms differ.

More recent work by Bill Badecker and Rick Lewis in the ACT-R framework (2007), conducted simultaneously with our own, has sought to explain production errors of agreement in nearly exactly the same fashion that we are explaining the comprehension errors: attraction arises when cue competition leads to a partial match. The production process is posited to take place in exactly the same kind of architecture as comprehension, one in which most operations must retrieve information from recent memory. Verb marking occurs after the subject has been constructed, and it must retrieve the subject to inspect its number properties. Badecker & Lewis assume a richer set of features than we have assumed.
Both the local and subject DPs share a category feature, as well as a nominative case feature in the planning process. The whole subject is distinguished because its dominating category, IP, is encoded. Therefore a set of cues exists that will converge upon the whole subject, but only partially on the local noun. Consequently the production system retrieves the whole subject in the majority of cases. Embedded plurals lead to attraction because the cue structure includes a variable value cue: NUM:var. The system is thus biased to return explicitly number-marked constituents. Finally, Badecker & Lewis explain the hierarchical effects of agreement attraction (e.g., Franck et al., 2002) in terms of distinctions in inherent activation. Hierarchically more dominant categories have higher base rates of activation, because they have undergone more processing, a phenomenon that we discussed in 3.2.1.3.

We have not attempted to give an account of hierarchical effects because the evidence for such effects is simply lacking in comprehension. Pearlmutter (2002) attempted to test for such effects, but the results were equivocal. Indeed the similarity we have discovered between complex subject and relative clause attraction suggests that hierarchical effects may be limited in comprehension. Subject and verb are adjacent in relative clause comprehension, yet the relative clause head manages to intervene. This fact adds to the motivation in 3.3.5 for more work in comprehension, in order to better sample the logical space of subject head and attractor configurations. Despite the specificity of our account for comprehension phenomena, the potential for unification with production models seems high: the same abstract intuition underlies Solomon & Pearlmutter's model, and the connection with Badecker & Lewis's model is quite explicit.
3.4 Interference and subjects

3.4.1 Introduction

In Section 3.3 we explained agreement attraction by appealing to the properties of retrieval in a content-addressable memory. Using the Search of Associative Memory model, we worked out the assumptions necessary to make such a model feasible, constrained by experimental data and general properties of linguistic representations. In a larger context our discussion of agreement attraction has revolved around the question of where and how to impute fallibility to the real-time linguistic systems. For the phenomenon of agreement attraction the answer seems to be that the difficulty is not in a faulty encoding apparatus, but in the means by which those encodings are accessed in later processing. We would like to broaden the scope of the discussion to other phenomena, and argue that the means by which previously encoded information is accessed is a major determinant of fallibility in real-time systems.

In the remainder of this chapter, we will consider the cases of complex subject attachment discussed by Van Dyke & Lewis (2003) and Van Dyke (2007). Van Dyke claims that in cases like the following, comprehenders sometimes erroneously select a grammatically inaccessible constituent as the subject of a predicate. The erroneous pairing is of the subject "the author" with the predicate "was writing."

(73) The critic who said that the author was busy with his new novel was writing a comprehensive survey of contemporary literature.

The explanation offered should be by now familiar: the grammatically-inaccessible constituent is subject-like enough to be returned in a retrieval operation. Van Dyke draws upon a range of online and offline data to support this point. We take issue with the materials used to obtain these data. In a more closely controlled self-paced reading experiment, we obtain divergent results in the online data, though not the offline data.
In Chapter 4 we will consider other cases in which computing an analysis with a complex subject is fallible and the system appears to entertain illicit relationships. Secondly, however, we examine cases in which illicit analyses never seem to be formed. We argue that the processing system is adapted to the memory architecture in that it aggressively exploits predictive information to restrict the formation of dependencies to grammatically-licensed constituents. This point we will illustrate with reference to the sizable body of literature on anaphora, but will examine it in greater detail for the processing of wh-dependencies. We will present further experimental evidence that supports our generalization.

3.4.2 Van Dyke & Lewis (2003), Van Dyke (2007)

Julie Van Dyke and colleagues have studied complex subject attachment in considerable detail, and have concluded that the parser sometimes identifies grammatically-inaccessible constituents as the subject. Complex subject attachment refers to the pairing of an incoming verb with the immediately preceding subject, when the subject is complex. Consider the following sentence:

(74) [DP.1 The student who knew [DP.2 the exam] was important] was waiting in the hallway.

The subject of the sentence is the entire phrase bracketed by DP.1. However DP.1 itself includes a sentence, which has its own subject, DP.2. Parsing DP.1 requires identifying the entire segment "the student who knew the exam was important" as a DP, as well as assembling an analysis for the RC inside DP.1. When the matrix clause auxiliary "was" is encountered, it must be understood as signaling a predication over DP.1. Van Dyke has argued that if identifying the subject involves cue-based retrieval operations at the verb, identifying DP.1 as the subject will be error-prone if the syntactic left context contains other subject-like constituents. The logic transfers transparently from our previous discussion: imagine "was" providing a CASE:Nom cue.
In (74) both DP.1 and DP.2 look inherently subject-like, both having a CASE:Nom feature.

Van Dyke & Lewis (2003) tested this prediction in a self-paced reading experiment. They considered simple subjects, complex subjects containing a lexical subject (High Interference conditions), and complex subjects with no lexically-filled subject position (Low Interference conditions). They embedded sentences with each of these kinds of subjects under verbs like "forget" that can take either simple DP objects or an entire clause. They also crossed the subject type with the initial ambiguity of the embedding, by varying the presence of the complementizer, to obtain six conditions. An example set of materials is given in (75). The ambiguity cross was performed because in that paper Van Dyke & Lewis were also interested in whether or not cue-based parsing was a general mechanism, or whether it might be restricted to the repairing of misanalyzed sentences (as would be necessary in the reanalysis of the initially ambiguous sentences).

(75) AMBIGUOUS: The executive assistant forgot _____ was standing in the hallway.
     (a) CONTROL: Simple subject
         ... the student ...
     (b) HIGH INTERFERENCE: Complex subject, contains a lexical subject
         ... the student who knew that the exam was important ...
     (c) LOW INTERFERENCE: Complex subject, no lexically-filled subject position
         ... the student who was waiting for the exam ...
     UNAMBIGUOUS: The executive assistant forgot that _____ was standing in the hallway.
     (d) CONTROL: Simple subject
         ... the student ...
     (e) HIGH INTERFERENCE: Complex subject, contains a lexical subject
         ... the student who knew that the exam was important ...
     (f) LOW INTERFERENCE: Complex subject, no lexically-filled subject position
         ... the student who was waiting for the exam ...

Notice that it is not possible to have a relative clause that has no grammatical subject, but if the subject position is the relativized position, as in (75c/f), then the subject is not lexically expressed.
The same phrase that occupies a (further embedded) lexical subject position in the HIGH INTERFERENCE conditions, "the exam," occupies an oblique position in the LOW INTERFERENCE conditions, as the object of a preposition. If the parser's identification of the subject, at the matrix verb, is subject to similarity-based interference, then conditions (b/e), which contain additional overt subject-like phrases, should present greater difficulty than conditions (c/f).

The results from Van Dyke & Lewis's Experiment 4 are presented below in Figure 3-9. We include only data from the critical "Aux+V" region, i.e. "was standing." No differences were observed in preceding regions, except slow-downs that obtained when one region contained a single additional word. Residual reading times are presented, regressed over both word length and ordinal position; raw untrimmed reading times exhibited (approximately) the same patterns. Reading times in the "Aux+V" region show a main effect of subject type, a main effect of ambiguity, and an interaction of the two. Pairwise comparisons reveal what drives these effects: high interference conditions are always read more slowly than low interference conditions (difference: 63 ms). There is a strong slow-down due to ambiguity, although only in the interference conditions (difference: 93 ms). Only in the ambiguous sentences are high interference conditions read reliably more slowly than short sentences; low interference conditions are not read reliably differently than short sentences (though the difference is marginal in unambiguous sentences, and approaching marginal in ambiguous ones).

Figure 3-9 Van Dyke & Lewis (2003), Experiment 4. Residual reading times for Experiment 4, Region 3, the two-word "Aux + Verb" region. These data show a main effect of subject type, a main effect of ambiguity, and an interaction. There were insufficient descriptive or inferential statistics reported to construct error bars or confidence intervals.
However, results of significant (p < 0.05) pairwise comparisons are indicated by lines between the cells compared. N SUBJ = 36. N ITEM = 36.

The contrast between high and low interference conditions, which holds irrespective of ambiguity, is consistent with the idea that the subject is identified at the verb by a general, cue-based process. When a grammatically-inaccessible constituent that is nonetheless subject-like is present in the syntactic context, reading times at the auxiliary and verb increase. It is somewhat troubling, however, that there is no difference between the high interference and short conditions in the unambiguous conditions. On the one hand, it seems plausible that the adjacency of the subject and verb obviates the need for a retrieval operation. On the other hand, we observed attraction effects in RC configurations where subject and verb were adjacent; so there may not be a necessary connection between string adjacency and the need for retrieval.

Reassuringly, however, the online results are mirrored by offline measures as well. In Experiments 2 and 3, participants read the same sentences in a self-paced reading task, after which they had to give a grammaticality judgment (on the sentence they had just read). [31] The results of those experiments, presented in Figure 3-10, show that participants were always less accurate on high interference conditions, regardless of ambiguity. Indeed this pattern of results is more consistent with the notion that the inaccessible subject is interfering than the online results: even in the unambiguous conditions, the high interference condition is always more difficult than either the short or low interference conditions.

[31] The instructions administered to participants are somewhat curious however. From Van Dyke & Lewis (2003, p. 296): The experimenter explained that for a sentence to be ungrammatical, it should either be missing words or have too many words.
This was illustrated with a sentence like "The police gave the citizen who he caught driving too fast on the parkway" or "The student was practicing reviewed his homework." It seems unusual to explain grammaticality in this fashion, though the examples given are indeed ungrammatical. However by emphasizing a connection between grammaticality and a well-formed phrase structure, it may sharpen the expected contrasts. In the present experiment, sentences are putatively difficult because up to three subject-verb pairs must be matched, and that aspect of the materials may become more salient given the instructions.

Figure 3-10 Van Dyke & Lewis (2003), Experiments 2 & 3. Accuracy is expressed as percentage of experimental sentences judged grammatical. N SUBJ = 36. N ITEM = 36.

These materials however contain a serious confound, one which we believe potentially short-circuits the conclusion that it is the presence of a subject-like constituent in the high interference condition that leads to difficulty: the high interference conditions contain one more clause than the low interference conditions. [32] Those two conditions are repeated below for comparison, with the intervening clauses bracketed.

[32] There are other differences in the example materials given, though it is unknown how representative those materials are of the entire set. The information in the High interference RC is more weakly related to the clause it is contained in. For example, in the Low interference condition, the student was standing in the hallway putatively because she was waiting for the exam. However in the High interference conditions the fact that the student knew the exam was important seems to characterize her independently of the other events she participates in. The Low interference condition is also more imageable. Thanks to Colin Phillips for these observations.

(76) The executive assistant forgot that
     (a) HIGH INTERFERENCE
         the student [CP who knew [CP that the exam was important]] was standing ...
     (b) LOW INTERFERENCE: Complex subject, no lexically-filled subject position
         the student [CP who was waiting for the exam] was standing ...

The region that is supposed to be sensitive to interference, "was standing," occurs at the boundary of two clauses in the high interference conditions, but only one clause in the low interference condition. There are several ways in which this might impact the results. A simple possibility is that grammaticality ratings are a monotone decreasing function of the number of clauses contained in a sentence, or particularly the number of clauses that are not right-embedded. One can imagine a non-retrieval based alternative, in which the subject must be actively maintained until a grammatically-licensed verb is encountered; as the number of subject-verb relationships, or the number of clauses, that intervene increases, the maintenance of this subject may be more error-prone (in the spirit of Wanner & Maratsos, 1978). Moreover, for the reading time results, there may be increased processing difficulty incurred by leaving a doubly embedded clause and returning to the main clause, as opposed to leaving a singly embedded clause. This difficulty would be reflected in the Aux + V region, which signals the clause boundaries, rendering comprehension more error-prone in that part of the sentence. Other spill-over effects are imaginable, such as the difference in the complementation structure of the verbs in the high vs. low interference conditions.

Those concerns all center on the high-interference region inducing more difficulty in the course of processing. However even if we put aside those concerns, we can view the results of Van Dyke & Lewis (2003) as reflecting a type of cue-independent interference effect stemming from the presence of an extra clause. Locating any constituent may be more difficult when there are more domains to locate that constituent in.

Van Dyke (2007) goes some way towards addressing the clause number confound.
In one of the experiments, participants read sentences similar to Van Dyke & Lewis's (2003) materials in an eye-tracking experiment, except that in the new experiment an adverbial spill-over region was inserted at the clause boundary. Although this manipulation does not address all of the concerns raised above, it has the potential to relieve some difficulty in the critical region. The double clause boundary problem does still remain, since the adverbial seems to most naturally attach inside the RC. Nonetheless this experiment is worth considering, both to compare how it replicates Van Dyke & Lewis (2003), and because a second kind of interference manipulation was included: a semantic appropriateness manipulation. In the semantic interference manipulation, the grammatically-inaccessible DP inside an intervening relative clause was manipulated for its plausibility as subject of the critical verb. This semantic interference manipulation was crossed with the same kind of syntactic interference manipulation as in Van Dyke & Lewis (2003). A sample set of materials is given below:

(77) The pilot remembered that the lady ____ yesterday afternoon moaned about a refund for the ticket
     (a) LOSYN/LOSEM: Low Syntactic Interference / Low Semantic Interference
         who was sitting in the smelly seat
     (b) LOSYN/HISEM: Low Syntactic Interference / High Semantic Interference
         who was sitting near the smelly man
     (c) HISYN/LOSEM: High Syntactic Interference / Low Semantic Interference
         who said that the seat was smelly
     (d) HISYN/HISEM: High Syntactic Interference / High Semantic Interference
         who said that the man was smelly

HISYN conditions included a gapped subject position and a lexically filled subject position embedded inside the intervening relative clause. LOSYN condition RCs were monoclauses which had a single gapped subject position, locating the only overt DP in an oblique position. In HISEM conditions, the DP was a good subject for the critical verb, whereas in the LOSEM condition, it was not.
With respect to the example given, it is not difficult to conceive of men moaning, but it is highly atypical for seats to moan. On the assumption that the verb supplies a cue relevant to the property of the DP that underlies the sensibility or typicality of the relation, e.g., ANIMACY:ANIM, then the HISEM verbs pointed ambiguously to the grammatically-licensed subject and the inaccessible one. Both HISYN and HISEM conditions provided partial support for the grammatically inaccessible constituent; and in the HISYN/HISEM condition these partial cues converged. Consequently, there should be a cline of erroneous processing. Figure 3-11 reports the results of three eye-tracking measures in the critical Aux+V region: first pass, regression path and total reading time.

Figure 3-11 Van Dyke (2007) Experiment 3, Critical region reading times

In the first-pass and regression path reading measures, there was an effect of syntactic interference, but only in LOSEM conditions. Since first pass measures reflect the total duration of fixations before any left-ward or right-ward movement from the region, those results are encouraging for the retrieval account. When the dependent measure does not incorporate re-reading of the syntactic context, then we can most confidently assert that the RT measure reflects properties of how the comprehender is consulting constituent representations in memory. In total time, syntactic interference showed a marginal effect overall. No effect of semantic interference was observed until several words downstream, in the sentence final region, corresponding to "for the ticket" in the example set above. There the effect was strongest in the HISYN condition.

These results are consistent with the findings of Van Dyke & Lewis (2003) in the sense that a slowdown that may reflect syntactic interference was observed in the conditions that most closely matched the ones in their experiment (where the semantic fit was low).
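For readers less familiar with the three measures just discussed, the sketch below shows one common way of deriving them from a sequence of fixations. It is an illustration only: the function name, the input format (region index paired with fixation duration), and the toy fixation sequence are assumptions made here, and Van Dyke's (2007) exact scoring procedure may differ in its details.

```python
def reading_measures(fixations, region):
    """Compute (first_pass, regression_path, total) times in ms for a region.

    fixations: list of (region_index, duration_ms) in temporal order.
    Simplification: a region first entered only after later material has
    been fixated is still scored here; many analyses treat that as missing.
    """
    first_pass = regression_path = total = 0
    entered = False        # has the region been fixated at all yet?
    in_first_pass = False  # still within the first uninterrupted visit?
    passed = False         # has the eye gone to the right of the region?
    for idx, dur in fixations:
        if idx == region:
            total += dur                 # every fixation in the region
            if not entered:
                entered = True
                in_first_pass = True
        elif entered:
            in_first_pass = False        # any exit ends the first pass
            if idx > region:
                passed = True            # a rightward exit ends the regression path
        if entered and not passed:
            regression_path += dur       # includes re-reading of earlier regions
        if in_first_pass and idx == region:
            first_pass += dur
    return first_pass, regression_path, total


# Toy trial: the reader enters region 3, regresses to region 2, returns,
# moves on to region 4, and later looks back at region 3 once more.
trial = [(1, 200), (2, 250), (3, 220), (2, 180), (3, 240), (4, 210), (3, 150)]
first_pass, regression_path, total = reading_measures(trial, region=3)
print(first_pass, regression_path, total)  # 220 640 610
```

In the toy trial, first-pass time excludes everything after the first exit, regression path time adds the re-reading of region 2 and the return to region 3, and total time counts the late revisit as well. This is why first-pass time is the measure least contaminated by re-reading of the left context.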
What is troublesome is that the adverbial spill-over region showed strong effects of syntactic interference as well. Those data are given in Figure 3-12 below. No significant effects of syntactic interference were observed in first pass measures, but in the regression path measures there was a slowdown in HISYN/HISEM conditions. This effect suggests that any difficulty caused by the HISYN conditions need not necessarily be tied to retrieval processes at the verb or errors of subject-verb binding. This effect may be consistent with the non-specific interference conjecture we offered above for why intervening biclausal RCs cause difficulty. Alternatively, there was also an attachment ambiguity in just the HISYN conditions which may explain this effect, though it is unclear why it only shows up in HISEM conditions. [33]

[33] The difficulty of the attachment could be determined by properties of the predication. Compare "the lady said the man was smelly yesterday afternoon," and "the lady said the seat was smelly yesterday afternoon." In both cases, "yesterday afternoon" could refer to the event time of the bounded "saying" event. Only in the case where "the man" is subject, however, does the downstairs attachment seem felicitous. It seems more typical that a man might be smelly in a time period bounded by "yesterday afternoon" than a seat.

Figure 3-12 Van Dyke (2007) Experiment 3, Pre-critical region reading times

Notably in the online measures there was not a cline of difficulty localized to the verb, with HISYN/HISEM conditions being the most difficult to process. However, in the offline measures, this pattern was apparent. Van Dyke (2007) reports cloze comprehension measures for the reading time experiments. Following each sentence participants were presented with an open frame like "the ______ moaned about a refund for the ticket," and then given a three-alternative forced choice task to fill the blank: {"pilot", "lady", "man"}.
The set of nouns included the grammatically correct subject, the matrix clause subject, and the relative clause DP from the HISEM conditions. The same three nouns were used for each condition since in the LOSEM conditions, the relative clause DP could be rejected based on plausibility at test. This task potentially taps more directly into the interpretive outcome of reading. Results of this task are presented in Table 3-13 below.

                                   Syntactic Interference
                                   Low        High
Semantic Interference    Low       85%        77%
                         High      77%        66%

Table 3-13 Comprehension accuracy from Van Dyke (2007) Experiment 3

The HISYN/HISEM condition was the most error-prone, the LOSYN/LOSEM condition was the least error-prone, and the two HI*/LO* conditions were in the middle. Both main effects of syntactic interference and semantic interference were reliable.

Interestingly, Van Dyke (2007) provided a breakdown of error responses. Since the comprehension task was forced choice, participants were either erroneously choosing the matrix subject or the HISEM DP. The breakdown is given as an experiment-wide tally in Table 3-14. When the distractor noun occurred in the subject position (HISYN), participants chose it in 57% of error responses, compared to 45% in LOSYN position. This shift in proportion was not huge, though it was reliable. It is interesting to note that in many of the HISEM error trials (on average, 48%), the participants nonetheless chose the matrix subject.

Interference condition    Matrix subject    HISEM Distractor
LOSYN/LOSEM               41                10
LOSYN/HISEM               44                36
HISYN/LOSEM               66                15
HISYN/HISEM               52                68

Table 3-14 Erroneously chosen nouns in Experiment 3 cloze comprehension task

Overall the error data suggest that as interference from syntactic and semantic factors increased, the comprehender was more likely to choose a grammatically illicit constituent in the comprehension task. The breakdown of errors shows that the comprehender was more likely to choose a constituent inside the relative clause when it has subject-like features.
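The percentages quoted in the discussion of the error breakdown follow directly from the tallies in Table 3-14. The short computation below makes the arithmetic explicit; the variable and function names are ours, and the 48% figure is reproduced here by pooling counts over the two HISEM conditions, which is an assumption about how that average was formed.

```python
# Error counts from Table 3-14: (matrix subject, HISEM distractor).
errors = {
    "LOSYN/LOSEM": (41, 10),
    "LOSYN/HISEM": (44, 36),
    "HISYN/LOSEM": (66, 15),
    "HISYN/HISEM": (52, 68),
}


def distractor_share(condition):
    """Proportion of error responses that chose the distractor noun."""
    matrix, distractor = errors[condition]
    return distractor / (matrix + distractor)


def pooled_matrix_share(conditions):
    """Proportion of pooled error responses that chose the matrix subject."""
    matrix = sum(errors[c][0] for c in conditions)
    total = sum(sum(errors[c]) for c in conditions)
    return matrix / total


hisyn = distractor_share("HISYN/HISEM")    # 68 / 120
losyn = distractor_share("LOSYN/HISEM")    # 36 / 80
matrix_in_hisem = pooled_matrix_share(["LOSYN/HISEM", "HISYN/HISEM"])  # 96 / 200

print(f"distractor chosen, HISYN/HISEM: {hisyn:.0%}")        # 57%
print(f"distractor chosen, LOSYN/HISEM: {losyn:.0%}")        # 45%
print(f"matrix subject chosen, HISEM: {matrix_in_hisem:.0%}")  # 48%
```

Note that the 57% versus 45% contrast is computed within the HISEM conditions only, where the distractor was a plausible subject of the critical verb.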
3.4.3 Replicating and extending Van Dyke (Experiment 6)

We attempted to obtain a subject interference effect in a set of materials similar to Van Dyke & Lewis (2003) and Van Dyke (2007), but one which did not include the clause confound. The data from both of those previous experiments clearly show that the sentences classified as inducing high syntactic interference were harder to process and derive a correct interpretation from. However the cause of the interference remains doubtful because the high interference conditions not only contained lexical subjects, but also contained two clauses. In a self-paced reading experiment we tested sentences that manipulated the structural subject-hood of a distracting DP, without introducing extra clauses.

3.4.3.1 Materials and methods

Materials. We constructed sentences that contained a sentential-complement-taking noun like "report." The complement of "report" was a copular sentence, either with a lexical projection in [Spec,TP], or the expletive-associate version of that sentence. By hypothesis, [Spec,TP] conditions contained a more subject-like DP constituent than Expletive-Associate conditions. In the [Spec,TP] condition the DP is in the structural position associated with subjects, in contrast with the Expletive-Associate condition. [34] These were compared to two types of controls: one in which there was no sentential complement immediately after the head noun, and one in which the sentential complement was replaced by a PP that expressed the same thematic relations. Consequently interveners were either +/- Clausal Complementation, and +/- Filled Subject Position. A sample set is given below:

(78) The politician was displeased that the report ...
     (a) LEXICAL SUBJECT IN [SPEC, TP]: +CLAUSE, +SUBJPOSITION
         that support was widespread for her opponent was covered on the evening news.
     (b) EXPLETIVE-ASSOCIATE: +CLAUSE, -SUBJPOSITION
         that there was widespread support for her opponent was covered on the evening news.
     (c) PP COMPLEMENT: -CLAUSE, -SUBJPOSITION
         of widespread support for her opponent was covered on the evening news.
     (d) CONTROL/NO INTERVENERS
         was covered on the evening news that support was widespread for her opponent.

[34] Admittedly the case properties of the DP in the Expletive-Associate construction are somewhat murkier. The DP might bear oblique case (cf. Burzio, 1988; Chomsky 1995), though it might also bear nominative case which is checked covertly. Associates do have many subject-position properties; e.g., they control agreement on the verb:
     (i) There was/*were widespread support ...
     (ii) There *was/were widespread rumors ...

The critical regions in each case were the auxiliary and verb: "was covered." In the three intervener conditions, (a)-(c), the critical region was preceded by the same three-word PP. In the control non-intervener condition, the sentential complement was extraposed beyond the critical VP, so that all sentences had (approximately) the same interpretive content.

Twenty-four item sets were created, each with these four conditions. Each item set was associated with a yes/no comprehension question. The content of the comprehension questions was designed to target information derived from different portions of the sentence, to test for selective deficits in comprehension of these sentence types. Eight item sets had comprehension questions that targeted information obtained from the matrix portion of the sentence. For example, with respect to the sample materials set, "Was it a politician who was displeased?" Eight item sets had comprehension questions that targeted information within the sentential complement. For example, "Was there much support for the politician's opponent?" Eight item sets had comprehension questions that targeted information requiring the correct pairing of subject and embedded verb. For example, "Was it a report that was on the evening news?"

Participants, procedure and analysis.
Protocols and analysis methods are identical to Experiments 1 and 2 reported in Chapter 2. Analysis of variance was not performed, and was replaced exclusively by estimation and simulation of linear mixed-effects models. The regions of interest were Regions 8-12, which correspond to the variable intervener phrases; Regions 13-15, which correspond to the PP common to all interveners; Region 16, the critical auxiliary; Region 17, the critical verb; and Regions 18-21, the VP spill-over regions.

Participants were 36 members of the University of Maryland community, who received partial credit in an introductory linguistics course.

3.4.3.2 Comprehension accuracy results

The comprehension results are given in Table 3-15 as percentage correct. Overall accuracy was high, 86%. Control sentences, which had no interveners, had 90% accuracy. Both [Spec,TP] and PP Complement conditions led to significantly lower accuracy (by logistic regression, ps < 0.001 and 0.05, respectively), though not Expletive-Associate conditions (n.s.).

                           Question type
Condition      Overall     MatrixSubj    EmbedSubj    EmbedPred
[Spec,TP]      80%         88%           83%          68%
Exp-Assoc.     88%         99%           83%          81%
PP Comp.       85%         96%           85%          74%
Control        90%         94%           90%          86%
Overall        86%         94%           85%          77%

Table 3-15 Comprehension accuracy for Experiment 6

Between items there was a significant effect of question type. Participants were more accurate on items followed by a question about the matrix subject than either a question about the embedded subject (p < 0.05) or about the embedded predicate (p < 0.001).

We can consider performance on the experiment conditions classified by question type. For questions about the matrix subject, participants did best in the Expletive-Associate and PP Complement conditions, and worst on the control or [Spec,TP] condition. For questions about the embedded subject, there were no reliable differences across conditions.
Performance was most degraded on the embedded predication questions, which were written to require that the embedded subject and verb be correctly paired. Performance was best on the control and Expletive-Associate conditions, but worst on the [Spec,TP] and PP Complement conditions. Keep in mind that these comparisons are not ideal, because they are between items, and so do not counterbalance possible lexical effects. However, they do show consistently degraded performance for [Spec,TP] conditions, particularly in the embedded predicate questions.

3.4.3.3 Self-paced reading results

The results of the self-paced reading task are presented in Figure 3-13. Results are only graphed starting at Region 6 (the determiner of the embedded clause subject). No reliable differences were observed in prior regions (the matrix clause prefix).

Regions 8-12: Variable intervener regions. For analysis, word-by-word reading times from the intervener regions were collapsed into one region of interest, excluding the common PP. On average the [Spec,TP] condition was read most slowly. In pairwise comparisons it was reliably slower than the expletive-associate condition (estimate: 17 ms, 95% C.I.: [0 ms, 31 ms], p < 0.05), though not reliably slower than the PP complement condition (estimate: 12 ms, 95% C.I.: [-3 ms, 29 ms], n.s.).

Regions 13-15: Common PP regions. No reliable differences were observed between conditions in Regions 13, 14, or 15. There was no spill-over of differential difficulty from the previous regions.

Region 16: Critical auxiliary. The control condition was slower than all intervener conditions combined (estimate: 17 ms, 95% C.I.: [3 ms, 30 ms], p < 0.05). Crucially, no differences were observed among the different intervener conditions: a model with only an intercept (d.f.: 3) captured the variation as well as one with condition coefficients (d.f.: 5; χ² = 0.7, n.s.).

Region 17: Critical verb.
The control condition was numerically slower than all intervener conditions, but this effect was not statistically reliable. There was no difference observed among the different intervener conditions, or across all conditions (by model comparison; χ² = 0.1, n.s.).

Regions 18-21. No differences between conditions were observed in Regions 18-19. In both Regions 20 and 21, the intervener conditions were read reliably more slowly than the control condition (Region 20: estimate: 23 ms, 95% C.I.: [5 ms, 44 ms], p < 0.05; Region 21: estimate: 78 ms, 95% C.I.: [57 ms, 100 ms], p < 0.01). No differences were observed among intervener conditions in Region 20. However, in Region 21, the clausal complement conditions ([Spec,TP], Expletive-Associate) were read more slowly than the PP complement condition (estimate: 27 ms, 95% C.I.: [5 ms, 51 ms], p < 0.01). There was no reliable difference between the two clausal complement conditions (estimate: 3 ms, 95% C.I.: [-24 ms, 34 ms], n.s.).

Figure 3-13 Experiment 6 Self-paced reading results

3.4.3.4 Discussion

The results of this experiment confirm that attaching material to a subject head affects the processing difficulty of a following VP. The online and offline results pull in two directions. The offline results are similar to Van Dyke's in the sense that the condition containing the most subject-like DP, the [Spec,TP] condition, led to the greatest decrements in performance. Consistent with Van Dyke's claim that a grammatically-inaccessible subject is retrieved and interpreted as the subject, we found that [Spec,TP] conditions led to the worst performance on comprehension questions that require the embedded subject and predicate to be correctly paired (68%). For questions that required accurate processing of the intervener region itself, there were no differences among conditions. Therefore it seems that there were no differences in success at comprehending the intervener itself, which would correlate with the other deficits.
[Spec,TP] conditions were processed the most slowly, as the online record reveals. In the critical region, the online results showed no differences among the intervener conditions, which were all read more quickly than the non-intervener control. However, differences among the intervener conditions show up in the sentence-final regions, Regions 20 and 21. There the clausal complement interveners were read most slowly, followed by the PP complement condition, and then the non-intervener condition. The difference with the non-intervener control can be attributed to the fact that the non-intervener condition is not sentence-final in those regions: recall that the sentential complement of the embedded subject head was extraposed to control for similar interpretations across materials.

In the online results, the only contrast we can claim is one having to do with whether or not the intervener is dominated by CP. These results support our contention that the differences Van Dyke & Lewis (2003) and Van Dyke (2007) obtained were confounded by the extra clause in the high interference condition. Nonetheless, the robust pattern obtained in the offline measures, both in Van Dyke's experiments and in our own, remains to be explained.

We can offer one explanation consistent with Van Dyke's general viewpoint, which is that representations that interfere during a retrieval operation need not lead to online difficulty, as measured in reaction times. We can suppose that the grammatically-inaccessible subject is a nearly good match for whatever retrieval structure the embedded verb provides. Some proportion of the time the retrieval selects that subject. If no information contained within the inaccessible subject signals that the subject is grammatically illicit, then the processing system proceeds as if it has constructed the correct representation. Consequently, there would be no observed reaction time difference, but only a difference in measures of interpretation.
This explanation may account for why Van Dyke (2007)'s earliest online measures only showed an effect of syntactic interference for subjects that were LOSEM, if semantic fit counts as a signal for a good subject or not.

However, it is important to recognize that while the evidence for difficulty in "High interference" conditions seems clear, the evidence that actual mis-binding is occurring in "High interference" conditions is scant. The results of Van Dyke's experiments, and our own, remain a somewhat mixed bag. The processing cost induced by clausal interveners seems clear enough, though its localization remains imprecise. What is least clear is that similarity-based interference is to blame for decrements in performance.

One proposal for future research is to use a clearer case contrast, to see whether that yields more distinctive results. Given the effect of case in the agreement attraction research, this contrast seems promising. For example, exceptional case marking (ECM) constructions could be used to create interveners that have a DP in a subject position, yet bear inappropriate case for an agreeing subject. In the following paradigm, the intervening DP, in bold font, is increasingly less subject-like based on whether its case is Nominative or not, and whether it is explicitly marked in the input.

(79) (a) Subject position of a finite clause, Nominative, Lexically Case ambiguous
         The man [ who believes John is foolish ] was unsurprised by his behavior.
     (b) Subject position of a non-finite clause, Accusative, Lexically Case ambiguous
         The man [ who believes John to be foolish ] was unsurprised by his behavior.
     (c) Subject position of a non-finite clause, Accusative, Lexically Case unambiguous
         The man [ who believes him to be foolish ] was unsurprised by his behavior.

Similarity-based interference predicts a cline of difficulty, with (a) being the most difficult and (c) the least.

3.4.4 NPI Licensing v. Reflexive Anaphora

One problem with interpreting the data on complex subject attachment is that, unlike in subject-verb agreement, the link between the theory and the measures provided by reading times or comprehension accuracy is not as tight. In the case of subject-verb agreement, the patterns of difficulty could clearly be attributed to a discrete property of the stimulus: whether or not the verb matched the head of the subject. In the case of complex subject attachment, the properties of the interference manipulation are more indirectly related to whatever processes take place at the auxiliary. Moreover, in the case of agreement attraction, a putative interference manipulation could also predictably improve processing complexity or grammaticality ratings. That is, in the agreement attraction experiments we reported, the manipulation did not change performance only by making the comprehension tasks more difficult, but also by making comprehension easier.

Other claims of interference can be found for complex subjects. Most notably, Negative Polarity Item (NPI) licensing has been found to be sensitive to a grammatically-inaccessible licenser found inside a complex subject (Drenhaus et al., 2005; Xiang, Dillon, & Phillips, submitted). NPIs are lexical items or constructions like "any", "ever", or "lift a finger" that can only occur in certain semantic contexts: for example, under negation, in conditionals, or in rhetorical questions. The contrast in (80) illustrates the basic phenomenon with "ever." Only in (80a), when "ever" occurs in the domain of negation, is the sentence acceptable.

(80) (a) No candidate will ever apologize for slander.
     (b) *A candidate will ever apologize for slander.

It is not sufficient for negation to be present in the sentence. In (81), the negation-bearing element is embedded inside a relative clause, where it does not c-command "ever," and the resulting sentence is unacceptable.

(81) *The candidate that no pundit likes will ever apologize for slander.
However, Drenhaus and colleagues (Drenhaus et al., 2005) have shown that the mere presence of an NPI licenser, even if it is not in the right structural relationship with the NPI, can improve perception of acceptability in German sentences. For example, in a speeded acceptability task, they tested sentences with the German NPI "jemals" ("ever"). A negative licenser ("kein") was either present in the c-commanding subject position (82a), present but embedded in a subject-attached relative clause (82b), or absent (82c). In the sentences below, the NPI is in bold font, and the negative element is underlined. The accuracy rates in the acceptability task are given to the right.

(82) (a) Kein Mann, der einen Bart hatte, war jemals glücklich. 85%
         "No man who had a beard was ever happy."
     (b) *Ein Mann, der keinen Bart hatte, war jemals glücklich. 70%
         "*A man who had no beard was ever happy."
     (c) *Ein Mann, der einen Bart hatte, war jemals glücklich. 83%
         "*A man who had a beard was ever happy."

Drenhaus et al. (2005) found that participants were equally accurate in classifying (82a) and (82c), but were slower and less accurate in judging (82b). Crucially, (82b) is the case in which the licenser is present but in a grammatically-inaccessible position. The basic finding seems quite robust, and has been replicated in ERP measures and reading times, both in German and in English (Vasishth et al., in press; Xiang, Dillon & Phillips, 2006; Xiang, Dillon & Phillips, submitted).

Vasishth and colleagues (Vasishth, Drenhaus, Saddy, & Lewis, 2005; Lewis, Vasishth, & Van Dyke, 2006) have argued that the intrusion of the grammatically-inaccessible licenser proceeds under a regime of cue-based retrieval, just as the inaccessible subject is claimed to have had its intrusive effect in Van Dyke's studies. They have hypothesized that encountering the NPI initiates a search controlled by two kinds of cues: a semantic cue for negation and a syntactic cue for a c-commanding position.
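The two-cue search just described can be made concrete with a small sketch. It is an illustration only, not Vasishth and colleagues' implementation: the class name Chunk, the feature names negative and subject, and the equal weighting of the two cues are all assumptions introduced here. In particular, the structural cue is approximated by a subject feature, since c-command itself cannot be stored on an individual item.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """A hypothetical memory encoding of one constituent."""
    label: str
    negative: bool  # semantic cue: does the constituent carry negation?
    subject: bool   # assumed heuristic stand-in for "c-commanding position"


# The two cues assumed to be issued when the NPI "jemals" is encountered.
NPI_CUES = {"negative": True, "subject": True}


def match_score(chunk, cues):
    """Fraction of retrieval cues that the chunk's features satisfy."""
    hits = sum(getattr(chunk, name) == value for name, value in cues.items())
    return hits / len(cues)


def best_negative_match(chunks):
    """Best score among negation-bearing chunks; 0.0 if there is none.

    Only a negation-bearing chunk could act as a licenser, so chunks
    without negation are ignored here.
    """
    return max(
        (match_score(c, NPI_CUES) for c in chunks if c.negative),
        default=0.0,
    )


# Simplified encodings of the two DPs in each sentence of (82).
sentences = {
    "82a": [Chunk("kein Mann", True, True), Chunk("einen Bart", False, False)],
    "82b": [Chunk("ein Mann", False, True), Chunk("keinen Bart", True, False)],
    "82c": [Chunk("ein Mann", False, True), Chunk("einen Bart", False, False)],
}

scores = {name: best_negative_match(chunks) for name, chunks in sentences.items()}
print(scores)
```

Under these assumptions the matrix-subject licenser in (82a) matches both cues, the embedded negative element in (82b) matches only the negation cue, and (82c) offers no candidate licenser at all.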
In grammatical sentences like (82a), there is a full match; in ungrammatical sentences with no negative element, like (82c), no constituents match the cues. However, in sentences like (82b), there is a negative element, just not in a c-commanding position. The cues partially match with a constituent in this case, and so in some portion of cases, the NPI is mistakenly licensed [fn. 35].

[fn. 35] As we discussed in section 3.2.1.3, there is no way to encode the c-command property on an individual node, since c-command is a relational property. Therefore, for Vasishth et al. to include c-command in the retrieval structure, they must resort to a heuristic cue like [+Nom] or [+Subj]. In the German examples, the intrusive negative element occurred in an embedded object position, and so it would not be contacted by this cue, only the negation cue. For NPIs occurring inside a VP, the subject is one prominent position in which a licenser could occur. However, a cue for subjects will not be generally applicable. For example, a c-commanding licenser can occur as a VP-internal argument, as in (i), as VP negation (ii), or as the verb itself in the case of adversative predicates (iii):
(i) It occurred to nobody that Mary would ever write a sonnet.
(ii) Mary did not wish to ever write a haiku.
(iii) Mary outright refused to ever write a villanelle.
The point is moot for the case of NPIs since, as we discuss in the text, NPI licensing does not have a direct c-command requirement. The c-command relationship is a by-product of the requirement that NPIs occur in a downward entailing environment (Ladusaw, 1979). More generally, though, the use of a Subject cue illustrates the problem with addressing constituents in a structure-sensitive manner. It is in principle impossible to enforce a c-command requirement directly. In the case of agreement attraction, this problem is less dire, since the licensing of agreement truly occurs between a verb and the nominative-marked element in subject position, and not any possible c-commander.

The logic of this explanation for NPI licensing illusions is identical to our own for agreement attraction. Two cues are provided by the constituent that needs to be licensed: one points to a grammatically accessible constituent, but one which does not have the licensing feature, and the other points to a grammatically inaccessible constituent, but one which does have the licensing feature. However, there are crucial differences that break the analogy. Xiang, Dillon & Phillips (submitted; henceforth XDP) have argued that the cue-based retrieval account is undermined by the fact that NPI licensing is not an item-to-item dependency. As they point out, there is a consensus that NPI licensing reflects the interaction between specific lexical properties of the NPI and the semantics and pragmatics of propositions (Chierchia, 2006; Fauconnier, 1975; Israel, 1998; Kadmon & Landman, 1993; Krifka, 1995; Ladusaw, 1992, inter alia). It is not a relationship between two items in a c-command configuration. Two sets of their examples illustrate this point nicely. In (83) the NPI "ever" is licensed, despite there being no obvious lexical licenser in the sentence. In (84) "ever" is licensed in a sentence in which there is a negative item, "nobody," but one which does not c-command it.

(83) (a) Has John ever cleaned his own dishes?
     (b) The reason one ever bothers to decant a wine is to leave the sediment [..] behind in the bottle [SouthWest Airlines Spirit August 1994: 47; reported in Israel, 1998].
(84) Nobody's mother has ever complained about his grades.

NPIs are not directly licensed by c-commanding negation. One prominent analysis, Ladusaw (1979), holds that NPIs are instead licensed in downward entailing environments (though this still does not capture licensing in polar questions).
What seems like a c-command requirement for many cases is not itself a licensing condition, but rather a by-product of being in particular downward entailing environments. Xiang and colleagues propose a theory of illusory licensing in which the contrastive function of the restrictive relative clauses (which have been used in all NPI licensing experiments to date) can lead to the generation of a pragmatically-sensible negative inference. It is this inference that comprehenders may use to license the NPI [fn. 36]. What is relevant for our purposes is the comparison they go on to draw between NPI licensing and reflexive anaphora.

[fn. 36] XDP propose that, given a subject like (i), a comprehender infers that the set of individuals denoted by (i) has some property P (not yet expressed). However, because of the use of the restrictive relative, they might also consider a contrast set, expressed by the string in (ii), of which the speaker may have intended to assert that P does not hold, i.e. ¬P. If comprehenders generate the inference ¬P for the contrast set, then that may have been used to license the NPI (cf. Israel, 2004).
(i) The bills that no democratic senators have voted for ...
(ii) The bills that some democratic senators have voted for ...

Reflexive anaphora is more plausibly licensed in comprehension as an item-to-item dependency. In English, reflexive anaphors like "himself" or "herself" must occur in the presence of a c-commanding antecedent, as the comparison between (85a) and (85b) shows, and in the same local domain, as the comparison between (86a) and (86b) shows.

(85) (a) *John_i's mother bought a book for himself_i
     (b) John_i bought a book for himself_i
(86) (a) *John_i wished that Mary would buy a book for himself_i
     (b) John wished that Mary_i would buy a book for herself_i

This restriction can be expressed syntactically (e.g., as Principle A, Chomsky, 1981) or semantically (e.g., Jackendoff, 1992).
Unlike NPI licensing, it truly seems to depend on a structurally-conditioned relation between the two constituents that occur in the sentence [fn. 37]. The question arises whether licensing reflexive anaphora would be subject to intrusion in just the same configurations that NPI licensing is. XDP reasoned that since an anaphoric dependency is more plausibly item-to-item in the case of reflexives, it would constitute a stronger test of cue-based retrieval.

[fn. 37] There is another class of reflexives, the logophors, which is sensitive to discourse relations (e.g., Pollard & Sag, 1992; Reinhart & Reuland, 1993). We do not discuss these.

The resolution of anaphora in comprehension can be tested by gender or number feature matches between an anaphoric element and candidate antecedents. A common way of assessing which antecedents the comprehender is entertaining is to use a stereotypical gender manipulation. Certain nouns, like "soldier," are associated with a stereotypical gender, and this association is strong enough to lead to a processing disruption when they are identified as the antecedent of a reflexive which does not match that gender (Osterhout et al., 1997; Sturt, 2003). For example, in (87b), the anaphor "herself" must be coreferent with "the tough soldier." However, there is a processing disruption in resolving the anaphora, compared to (87a), because the stereotypical gender of "soldier" is male.

(87) (a) The tough soldier introduced himself to all the nurses.
     (b) The tough soldier introduced herself to all the nurses.

Sturt (2003) tested configurations containing two potential antecedents, one of which was grammatically accessible, and one of which was not. In two eye-tracking experiments he crossed grammatical accessibility with gender match. (88) illustrates the design from his Experiment 2, which is most comparable to the NPI experiments.

(88) (a) Grammatically Accessible Match / Gramm. Inaccessible Match
         The surgeon who treated Jonathan had pricked himself with a used needle.
     (b) Grammatically Accessible Match / Gramm. Inaccessible Mismatch
         The surgeon who treated Jennifer had pricked himself with a used needle.
     (c) Grammatically Accessible Mismatch / Gramm. Inaccessible Match
         The surgeon who treated Jennifer had pricked herself with a used needle.
     (d) Grammatically Accessible Mismatch / Gramm. Inaccessible Mismatch
         The surgeon who treated Jonathan had pricked herself with a used needle.

In this design the inaccessible antecedent is the name inside the subject-attached relative clause. If the reflexive anaphor initiates a search for an antecedent with both structural cues (i.e., c-commands) and morphological cues (i.e., masculine or feminine), then the grammatically inaccessible constituent should exhibit a partial match in conditions (a) and (c). Moreover, in Condition (c), in which the grammatically accessible constituent mismatches, the inaccessible constituent should be the only match. Consequently, there would be intrusion, as in the NPI cases. However, Sturt found that in early eye-tracking measures there was only an effect of match in grammatically accessible conditions. The feature match of the inaccessible constituent had no impact. He did, however, find an effect of the inaccessible constituent in second-pass and later region measures. Nonetheless, it would seem that the initial resolution of anaphora is faithful to the structural constraints on binding. This finding is inconsistent with the prediction that an inaccessible constituent is sometimes considered because of a partial match. In a separate experiment he replicated these results for a configuration in which the inaccessible constituent is in a higher clause:

(89) He/she remembered that the surgeon had pricked himself/herself.

The findings thus generalize across both requirements for reflexive antecedence: that the antecedent c-command the reflexive, and that it be in the local domain.

Xiang, Dillon, & Phillips (submitted) conducted a similar study to Sturt's, in which they measured event-related potentials (ERPs).
The interesting innovation in their design is that they also had participants read NPI sentences in the same experiment. They could therefore directly compare the relative response to intrusive NPI conditions and the relative response to intrusive antecedents within the same participants. The experimental design for anaphora conditions is illustrated in (90). In congruent conditions, a gender-matching antecedent occurred in subject position. In intrusive conditions, a gender-matching antecedent occurred in a grammatically-inaccessible embedded position. Finally, in incongruent conditions, there was no gender-matching antecedent.

(90) XDP's reflexive anaphora sentences
     (a) Congruent
         The tough soldier that Fred treated in the military hospital introduced himself to all the nurses.
     (b) Intrusive
         The tough soldier that Katie treated in the military hospital introduced herself to all the nurses.
     (c) Incongruent
         The tough soldier that Fred treated in the military hospital introduced herself to all the nurses.

The design of the NPI sentences mirrored these conditions (and the previous experiments discussed above) and is given in (91).

(91) XDP's NPI sentences
     (a) Grammatical
         No restaurants that the local newspapers have recommended in their dining reviews have ever gone out of business.
     (b) Ungrammatical/Intrusive
         The restaurants that no local newspapers have recommended in their dining reviews have ever gone out of business.
     (c) Ungrammatical/No licensor
         Most restaurants that the local newspapers have recommended in their dining reviews have ever gone out of business.

XDP had participants read these sentences in RSVP presentation while they recorded scalp voltage. They then analyzed the ERPs measured from the onset of the NPI or the reflexive.
For both NPIs and reflexives, the ungrammatical conditions exhibited a posterior positivity resembling the P600, an ERP component characteristically evoked in response to syntactic or morphological violations (Friederici, Pfeifer, & Hahne, 1993; Hagoort, Brown & Groothusen, 1993). In NPI intrusive conditions, the P600 was either absent or greatly attenuated (across electrodes). This is consistent with the idea that intrusive NPI licensers lead the comprehender into an illusion of grammaticality. However, in contrast to the NPIs, the intrusive reflexive conditions were not distinct from the incongruent reflexive ones. In other words, for NPIs the ERP response to intrusive conditions either groups with the good conditions or is in between good and bad conditions. For reflexive anaphora, however, the intrusive condition groups unequivocally with the bad conditions.

It therefore seems that whatever mechanism is responsible for intrusion in the case of NPIs is not likewise operating in the resolution of reflexive anaphora. As we have discussed, there are good formal reasons for suspecting that NPIs and reflexives should be distinct in their licensing procedures. However, reflexives seemed like a stronger candidate for a cue-based retrieval resolution mechanism. We would further contend that the reflexive cases are stronger tests than either ours or Van Dyke's complex subject attachment experiments because the link between the measure and the properties of the materials is clearer. Therefore the real point of comparison seems to be between reflexives and the agreement attraction cases. In agreement attraction cases the verb needs to agree with the subject, yet comprehenders seem unable to reliably target the subject for comparison. In the reflexive cases the anaphor needs to find its antecedent in a same-clause c-commanding position, also corresponding to the subject. Here comprehenders never seemed to be led astray. What accounts for the distinction in grammatical accuracy in the two cases?
One potential distinction between the two cases is that the intruder in agreement attraction sentences came from a constituent in the same immediate clause. In the reflexive anaphora cases the potential intruder was in a different (embedded) clause. We have argued that immediate clausal context may be an important cue in the agreement attraction cases (to account for the pattern of judgments in RC attraction; see section 3.3.4). If clause context is used as a restrictor in the search for reflexive antecedents, then the RC-embedded intruder becomes a less-good partial match, becoming even less good the more heavily that cue is weighted. There are good reasons for supposing the clause cue would be important in resolving reflexive anaphora. One is that there is no good way of identifying a c-commander, except by using a heuristic feature like +Subj (see fn. 35). The clause cue may thus be the only structural licensing cue available to the system. Thus a better comparison with the agreement attraction case would be obtained if a potentially intrusive licenser occurred inside a PP modifier, as in (92).

(92) The surgeon for Mary pricked himself/herself with a needle.

The structure of this example is more directly analogous to the agreement attraction cases. Crucially, the potential licenser belongs to the same immediate clause in both cases.

Another possibility is that a reflexive does not pick up its antecedent by means of searching for a c-commanding constituent. Instead it could acquire it when it is integrated with the verb. In some frameworks, like Combinatory Categorial Grammar (Steedman, 1997), the reflexive is not treated as an independent argument of the verb but as a device that reduces the verb's valency, changing the verb meaning to that of the corresponding reflexive predicate. For example, "The surgeon pricked himself" would essentially mean "The surgeon self-pricked."
The procedure for matching the morphological features of the reflexive to the subject may thus be controlled by information contained in the encoding of the verb phrase, instead of by initiating an independent retrieval. Let us suppose that the verb has successfully integrated the subject as an argument, and that this is reflected by a pointer to the actual encoding of the subject constituent. The reflexive could check its morphological features by retrieving the subject via this unique pointer and not on the basis of abstract properties. If this account is on the right path, then we could construct a strong test for intrusion by using reflexives in non-argument positions of the verb, as only argument anaphors can reflexivize the verb [fn. 38].

[fn. 38] Thanks to Jeff Lidz for this observation.

3.5 Conclusions

In this Chapter we introduced some key properties of a content-addressable memory and the pitfalls of such an architecture for linguistically-structured representation. We illustrated these properties by examining a number of phenomena involving complex subjects.

We proposed a simple model of agreement attraction based on retrieval-based similarity interference. Using a small set of linguistically-motivated features we were able to achieve consistency across PP and RC attraction phenomena and, crucially, to capture the asymmetry between grammatical and ungrammatical cases, which the feature percolation model failed to capture. We tested and confirmed a novel prediction of this model in Experiment 5.

We examined Van Dyke's claims that complex subjects containing subject-like constituents are more difficult to attach. We refined her design in our own Experiment 6. On the whole we found that the online evidence for interference from inside complex subjects was equivocal. In our own data, online measures indicated sensitivity to the number of clauses in a complex subject, but not to interference from a subject-like constituent.
Offline measures consistently provided a clear indication of greater difficulty in the interference conditions, but they were not so clearly indicative that this difficulty stemmed from the wrong subject being integrated at the attachment site.

We then turned to NPI licensing and the resolution of reflexive anaphora. These phenomena have been tested across the same sentence structures as in Van Dyke's experiments. Evaluation of the phenomena and examination of existing experimental results provide little solid evidence for the occurrence of similarity-based interference in licensing these constructions. However, we also argued that these cases did not provide the strongest test for such effects. In the case of NPI licensing, this was because the licensing procedure did not involve an item-to-item dependency. In the case of reflexive anaphora, we speculated that clause-boundedness or a non-retrieval-based mechanism could strongly restrict identification of candidates, and suggested some follow-ups.

In Chapter 4 we turn to the processing of wh-dependencies. While granting similarity-based interference a role in online comprehension, we develop a broader account in which predictability is a major determinant of fallibility. We briefly reconsider the attraction data in terms of predictability. However, we present novel evidence on the processing of unbounded wh-dependencies, which allows us to defend a link between predictability and interference based on how retrieval structures are composed.

4 Active dependency formation and mechanisms for the accurate recognition of grammatical dependencies

4.1 Introduction

The content-addressable memory architecture outlined in Chapter 3 introduces a potentially significant source of fallibility into online structure building. In the process of recognizing and licensing dependencies the comprehender is liable to establish grammatically-inaccurate relationships.
The likelihood of doing so is jointly determined by the extent to which forming a dependency involves cue-directed retrieval of one constituent in the dependency and whether there are other similar constituents in the structure. The reasons that information retrieval cannot be tightly grammatically regulated inhere in the content-addressable memory architecture: search of memory identifies and ranks candidates effectively in parallel on the basis of the similarity between encoded representations and the retrieval structure. Grammatical restrictions that are stated over the relative hierarchical order of constituents cannot be encoded as a feature of item representations, and thus cannot be reflected in the match operation. In two sets of linguistic processes, agreement licensing and complex subject integration, grammatically-irrelevant constituents appear to impact grammatical accuracy, as reflected in online complexity, patterns of judgment, or comprehension accuracy.

However, as we will review in greater detail in this chapter, linguistic comprehension is highly grammatically sensitive and accurate in many domains, notably in the processing of wh-dependencies. The question we wish to address more broadly is why some processes are characteristically accurate and why some are not. We have already articulated a mechanism in which inaccuracy seems to come for free. However, we also discussed several possibilities for introducing relative hierarchical order into the search of memory. In the first case, the generation of candidate matches unavoidably identifies grammatically inaccessible candidates; however, more controlled processing verifies the structural relationships (for example, by carefully and serially following the inter-item associations) and ultimately delivers a licit constituent. This possibility aligns with debates in psycholinguistics over whether grammatical constraints act as early versus late filters on structure building (e.g., Badecker & Straub, 2002; Sturt, 2003).
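The first possibility, overgeneration followed by serial verification, can be sketched as a two-stage procedure. The sketch below is a hedged illustration rather than a model proposed in the text: the Node class, the simplified tree for "The surgeon who treated Jonathan pricked himself", the single "masc" cue (following the stereotypical-gender assumption in Sturt's materials), and the parent-based approximation of c-command are all assumptions made for the example.

```python
class Node:
    """A hypothetical tree node with parent links (the inter-item associations)."""

    def __init__(self, label, features=(), children=()):
        self.label = label
        self.features = set(features)
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def descendants(node):
    """Yield every node below `node`, in pre-order."""
    for child in node.children:
        yield child
        yield from descendants(child)


def dominates(a, b):
    """True if `a` properly dominates `b`; checked by walking b's parent links."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a, b):
    """Simplified c-command: a's parent dominates b, and a does not dominate b."""
    if a is b or a.parent is None or dominates(a, b):
        return False
    return dominates(a.parent, b)


def retrieve_by_cue(root, cue):
    """Stage 1: content-addressed match on item features, blind to structure."""
    return [n for n in [root, *descendants(root)] if cue in n.features]


def verify(candidates, dependent):
    """Stage 2: serial structural check that prunes illicit candidates."""
    return [n for n in candidates if c_commands(n, dependent)]


# Simplified structure: [TP [DP the surgeon [RC ... Jonathan]] [VP pricked himself]]
jonathan = Node("DP:Jonathan", {"masc"})
rc = Node("RC", children=[jonathan])
subject = Node("DP:the surgeon", {"masc"}, [rc])
reflexive = Node("DP:himself")
vp = Node("VP", children=[Node("V:pricked"), reflexive])
root = Node("TP", children=[subject, vp])

candidates = retrieve_by_cue(root, "masc")
licit = verify(candidates, reflexive)
candidate_labels = [n.label for n in candidates]
licit_labels = [n.label for n in licit]
print(candidate_labels, licit_labels)
```

The feature match alone returns both the subject and the RC-internal name; only the second stage, which must traverse the structure, excludes the inaccessible one. That division of labor is what makes this option a "late filter".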
In a second case, candidates are ordered not only by their similarity to the retrieval structure, but by another analogue value, like activation, which may implicitly recapitulate hierarchical order. This possibility is related to accounts of (non-linguistic) serial order in the list learning literature (e.g. Page & Norris, 1998). We argued that this mechanism may have heuristic value in restricted scenarios, but that global hierarchical order cannot be recovered in any simple way. In both of these alternatives, searching a structured representation overgenerates, and a subsequent selection process must prune.

We mentioned a third possibility: if dependency formation is anticipated, then some licensing can occur in the absence of complete information about the head of the dependency, and the contents of the retrieval structure can be controlled to identify just the grammatically licit constituent. It is this possibility we will expand upon in this chapter. There are two aspects to this idea. The first is that, simply by restricting how much retrospective search it performs, the system can avoid the intrusion of grammatically-inaccessible information. Information that can be carried forward in time can license dependencies without having to rely on retrieved information. The second aspect is that, if the system preserves enough distinctive features of the actual constituent-to-be-retrieved, then it can minimize the impact of similarity-based interference when it retrieves the head of the dependency.
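The second aspect can be made concrete with a small sketch. It is only an illustration under simplifying assumptions, not the model developed in this dissertation: the feature names, the three memory items, and the scoring rule (a plain count of matched cues) are all hypothetical. Note that nothing in an item's encoding records its hierarchical position, so the match operation has no way to consult it.

```python
# Toy content-addressable memory. All feature names and items are hypothetical,
# and the score is a simple count of matched cues (an assumed simplification).

def match_score(item, cues):
    """Number of retrieval cues that the item's encoding satisfies."""
    return sum(1 for feature, value in cues.items() if item.get(feature) == value)

def rank(memory, cues):
    """Score every item against the cues and return (label, score) pairs, best first.

    Python's sort is stable, so tied items stay in their encoding order.
    """
    scored = [(item["label"], match_score(item, cues)) for item in memory]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

memory = [
    {"label": "target wh-phrase", "category": "wh", "animate": True, "noun": "editor"},
    {"label": "irrelevant wh-phrase", "category": "wh", "animate": True, "noun": "author"},
    {"label": "subject NP", "category": "NP", "animate": True, "noun": "critic"},
]

# Coarse cues alone cannot tell the two wh-phrases apart: they tie.
coarse = {"category": "wh", "animate": True}
coarse_ranking = rank(memory, coarse)

# Adding one distinctive feature carried forward from the filler breaks the tie.
enriched = {**coarse, "noun": "editor"}
enriched_ranking = rank(memory, enriched)

print(coarse_ranking)
print(enriched_ranking)
```

With coarse cues the target and the irrelevant wh-phrase receive the same score, which is the situation in which similarity-based interference is expected; with the enriched retrieval structure the target alone is the best match.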
Our argument advances in four parts:

(I) first, we glean the basic generalization from the processing of anaphora and comprehension of wh-movement dependencies;
(II) next, we present some experimental evidence from wh-processing that constraints on global well-formedness motivate parsing decisions, independently of the local retrieval environment (Experiments 7-8);
(III) then, we show that the mere presence of formally-similar but irrelevant wh-phrases in a structure does not interfere with the retrieval of a target wh-phrase (Experiments 9-10);
(IV) finally, we provide more detailed experimental evidence from wh-processing that some information about the head of the dependency is preserved across increasing dependency lengths to guide decision making (Experiments 11-13).

4.2 The role of predictability

4.2.1 Forwards v. backwards anaphora

In Chapter 3 we introduced one kind of referential dependency, that between a reflexive anaphor and its antecedent. Reflexive anaphora must be resolved in a tightly-constrained syntactic domain and all measures indicate that initial dependency construction is highly grammatically accurate (Sturt, 2003; Xiang, Dillon, & Phillips, submitted). In the studies we reported the reflexive anaphor was never related to a subject that was not in the same immediate clause. The grammatical accuracy in forming this dependency contrasts with the licensing of NPIs and verbal agreement. Both of these processes seem liable to illusions of grammaticality caused by the presence of a grammatically-inaccessible constituent inside of a complex subject. We defended a retrieval-based account of agreement attraction and rejected such an account for NPIs. The formal properties of reflexive anaphora seem to lend themselves to retrieval-based processing but, based on the experimental results, they do not seem to be liable to the same kind of fallibility as agreement is.
We considered two conjectures: one, that the boundedness of reflexive anaphora is effective at excluding constituents from all but the immediately dominating clause; two, that anaphora is resolved through the verb, perhaps through a subject-pointer contained in the VP's encoding. Both of these required further experimental support. However, we might also consider whether the referential nature of anaphora leads to more careful processing. Putatively nothing goes awry if an agreement violation fails to be noticed 39. However, fixing the reference of anaphors is crucial for interpreting the sentence. Therefore perhaps referential dependencies are resolved with more deliberate care. We therefore turn to pronominal anaphora to see whether it is uniformly grammatically resolved in real time.

39 Lau, Wagers, Stroud & Phillips (2008) have recently provided reading time evidence that agreement attraction violations do not have interpretive consequences for establishing grammatical roles. In an attraction sentence like "The phone by the toilets were what Yolanda used," participants never misidentified "toilets" as the matrix sentence subject (and thus theme of the pseudocleft) despite the fact that it agreed with the verb. One domain in which getting agreement right seems to matter is in the resolution of attachment ambiguities. The strings in (i) and (ii) are disambiguated entirely by the number on the verb in the relative clause.
(i) The actors [ in the play [ that always impresses critics ]] ...
(ii) The actors [ in the play ][ that always impress critics ] ...

When we look at pronouns, like "him" or "her," we see that they have, as an approximation, the inverse licensing requirement of reflexive anaphors. If a pronoun gets its reference from an antecedent in the same sentence, then its antecedent must not c-command it in the same clause, a restriction known as Principle B (Chomsky, 1981). The contrast in (93) illustrates this restriction: "him" in (a) cannot refer to the name "Phil," because "Phil" c-commands it in the same clause; but "her" in (b) can refer to "Laura" because "Laura" is outside of the immediate clause dominating "her."

(93) (a) *Laura remembered that Phil_i liked to buy him_i drinks.
(b) Laura_i remembered that Phil liked to buy her_i drinks.

As with reflexive anaphora we can ask whether the comprehender is immediately sensitive to the restrictions on pronominal anaphora: does the comprehender exclude within-clause, c-commanding constituents as candidate antecedents? The studies to address this question have given mixed answers. Using cross-modal lexical priming, Nicol & Swinney (1989) found that candidates excluded by Principle B were not considered during dependency formation. Clifton, Kennison, & Albrecht (1997) reached a similar conclusion in self-paced reading. However, more recently both Badecker & Straub (2002) and Kennison (2003) have found that Principle B-excluded antecedents are nonetheless identified as candidates in anaphora resolution. There is therefore not the same kind of unanimity about pronouns that there seems to be for reflexive anaphors.

Interestingly, evidence that the resolution of pronominal anaphors is not grammatically accurate holds only in the case of forwards anaphora, when a pronoun is encountered and the syntactic context must be searched for its antecedent. However, pronouns may also occur in backwards anaphora configurations, in which case the antecedent occurs later in the sentence, as example (94) illustrates.

(94) While he_i was at the bar, Phil_i bought a drink for Laura.

In backwards anaphora candidate antecedents can be evaluated left-to-right. Van Gompel & Liversedge (2003) provided reading time evidence that comprehenders seek to resolve the referent of the pronoun quickly. However, there is a structural restriction on antecedents in backwards anaphora, and that is Principle C (Chomsky, 1981). Principle C restricts a noun phrase from co-referring with a pronoun that c-commands it.
The examples in (95) illustrate this restriction both for simple and embedded clauses.

(95) (a) *He_i bought Phil_i a beer.
(b) *He_i said that Phil_i should get a beer.

Given an incentive to resolve the reference of pronouns, do comprehenders ever consider noun phrases banned by Principle C? The two studies to examine this question (Cowart & Cairns, 1987; Kazanina et al., 2007) both show that Principle C is respected in processing backwards anaphora. Kazanina et al. (2007) used a gender match manipulation to make their case. They considered two kinds of sentences. In "Principle C" sentences, e.g. (96), a pronoun (in bold) and a name (underlined) were on the same c-command path. Consequently co-reference between the two was illicit. In "No constraint" sentences, e.g. (97), the pronoun and name were not on the same c-command path. Co-reference was thus allowed.

(96) Principle C condition
Because last semester she_i was taking classes full-time while Kathryn/Russell was working two jobs to pay the bills, Erica_i felt guilty.

(97) No constraint condition
Because last semester while she_i was taking classes full-time Kathryn_i/Russell was working two jobs to pay the bills, Russell/Erica_i never got to see her.

For "No constraint" sentences, Kazanina and colleagues found that when the name mismatched the pronoun in gender, there was a large reading time slowdown on the name. Crucially, the same manipulation had no effect in Principle C sentences, in which co-reference could never be possible. Therefore the process of identifying candidates in backwards anaphora is constrained by structural restrictions provided by the grammar.

An important difference between forwards and backwards anaphora is how the search process for candidate antecedents is triggered. When participants read the sentence-initial pronoun in Kazanina et al.'s (2007) study, they could not assign it reference unless they began searching for a name or description later in the sentence.
In the Principle B studies, however, the pronoun occurred much later in the sentence, when there were already a number of names and descriptions mentioned. There was no advance warning that a pronoun would occur and consequently participants had to search memory for candidate antecedents, a search triggered entirely by a bottom-up signal. If that search occurred in parallel, as would happen in a content-addressable memory, then illicit candidates could intrude upon the process of dependency resolution. In the case of reflexive anaphora, it is possible to constrain potential referents by which immediate clause they belong to (at least, in many cases; ECM constructions provide a counterexample). In the case of pronouns, it would have to be possible to constrain potential referents by excluding a certain clause and structural relation. It seems likely, as we argued in Chapter 3, that constituent representations are encoded with features related to their grammatical domain. However, it seems implausible to encode constituents with information about which domains they do not belong to. If there is only a matching mechanism, and no mechanism for otherwise inhibiting certain kinds of constituents from being returned as candidates, then it is not surprising that the retrospective search required of Principle B is prone to grammatical inaccuracy.

Let us suppose that, in general, achieving grammatical accuracy is harder when a retrospective search is invoked. Retrieval in a content-addressable memory provides one explanation for reduced accuracy in these circumstances: the search cannot be ordered by important grammatical relations and as a consequence grammatically illicit candidates are sometimes contacted. If it were then true that forward, or prospective, searches were better at keeping track of grammatical relations, then we would expect to find greater grammatical accuracy when the need to form a dependency is announced early.
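The asymmetry between reflexives and pronouns under a match-only mechanism can be illustrated with a short sketch based on example (93a). The encodings, the feature names, and the `candidates_with_inhibition` helper are hypothetical; the helper stands in for the additional inhibitory mechanism that, by assumption, the architecture lacks.

```python
# Hypothetical encodings for "Laura remembered that Phil liked to buy him drinks".
# Each constituent stores positive features only, including its clausal domain.
memory = [
    {"label": "Laura", "gender": "f", "clause": "matrix"},
    {"label": "Phil", "gender": "m", "clause": "embedded"},
]

def candidates(memory, cues):
    """Pure matching: return every item whose encoding satisfies all cues."""
    return [item["label"] for item in memory
            if all(item.get(k) == v for k, v in cues.items())]

def candidates_with_inhibition(memory, cues, banned):
    """Matching plus an assumed extra step that suppresses items with banned features."""
    return [item["label"] for item in memory
            if all(item.get(k) == v for k, v in cues.items())
            and not any(item.get(k) == v for k, v in banned.items())]

# A reflexive in the embedded clause: the domain restriction is a positive cue.
reflexive_candidates = candidates(memory, {"gender": "m", "clause": "embedded"})

# A pronoun in the embedded clause: "not a clause-mate" cannot be stated as a
# positive cue, so only gender is usable and the illicit clause-mate is returned.
pronoun_candidates = candidates(memory, {"gender": "m"})

# Only with the extra inhibitory step is the clause-mate excluded.
inhibited = candidates_with_inhibition(memory, {"gender": "m"}, {"clause": "embedded"})

print(reflexive_candidates, pronoun_candidates, inhibited)
```

In this toy case the reflexive search returns only the licit clause-mate, whereas the pronoun search returns that same clause-mate even though Principle B excludes it; excluding it requires something beyond matching.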
When a dependency is announced early in this way, there would be a great premium in exploiting predictive information. In the remainder of this chapter we will argue that wh-dependency formation exemplifies the virtues of prospective searching. We review the standard evidence that the parser recognizes wh-dependencies early and attempts to complete them as soon as possible. We provide new experimental evidence that the recognition process is highly sensitive to global well-formedness constraints, and that the decision to complete dependencies is based principally on top-down information. Indeed we conclude that initial dependency formation of wh-dependencies may be retrieval free.

4.2.2 Reconsidering agreement attraction

Before moving on to wh-dependencies we want to mention that the grammatical-ungrammatical asymmetry in agreement attraction may submit to an explanation in terms of prospective processing. Recall the basic data: grammatical attractor sentences, like (98a), do not induce illusions of ungrammaticality (pace the conclusions of Experiment 5). However, ungrammatical attractor sentences, like (98b), induce illusions of grammaticality.

(98) (a) The path to the monuments is littered with bottles.
(b) *The path to the monuments are littered with bottles.

One way of explaining this contrast is exclusively in terms of the match between cues provided by the verb and the constituent encodings of the subject, as we did in Chapter 3. The other possibility is that the attractor effect in comprehension is specifically a reanalysis effect. On this view, agreement computation is always carried out correctly on the first pass. However, when this computation fails, a reanalysis mechanism can check back to see if an error was made. The initial computation could be instantiated as a predictive process: when the head noun is encountered, a verb marked with the correct number can immediately be predicted.
When the verb is encountered, its number features can be checked against the predicted features, and if they match, nothing more needs to be done. However, if the bottom-up features mismatch the top-down predicted features, a cue-based retrieval is deployed to see if the correct feature was somehow missed in the context the first time. It is in this "rechecking" stage that the attractor NPs might sometimes be mistakenly retrieved. The fact that attractor effects are mainly seen in ungrammatical cases thus naturally falls out from this view, as does the fact that attractor effects can be seen even when the subject and verb were adjacent in the first place. The mismatch in the adjacent case would set in motion the same content-based retrieval process, subject to the same errors. This view also has the advantage that it makes retrieval of the number feature unnecessary in the normal case, although there may be other costs involved in making a prediction 40. Furthermore, given that English agreement paradigms for lexical verbs are largely syncretic, it may be necessary to use top-down information, like the number of the subject head, to identify the number features of the verb in the first place.

40 If there were clear evidence that prediction plays a role in explaining the grammatical/ungrammatical asymmetry in agreement attraction, then the finding in Chapter 3 that there are small decrements in the accuracy of some grammatical sentences (in the subject-attached RC constructions) must be reconsidered. In that case, the decrement might indicate how often prediction occurs; or it would bolster the alternative judgment/regeneration explanation sketched in section 3.3.3.5.

4.3 Processing Wh-dependency constructions

4.3.1 Active dependency formation

In overt movement languages like English, displaced wh-phrases are found in clause-edge positions where they establish scope properties, but they are assigned a thematic role within the clause in grammatically licensed positions.
When the language processor encounters a wh-phrase, it must have a way of deciding when and where to link that phrase with its thematic role assigner. Making this decision is complicated by two aspects of unbounded dependencies. Firstly, the tail of a wh-dependency is (usually) only indirectly signaled by the input, for example, by the absence of a verb's subcategorized constituent. Secondly, there are numerous and diverse island constraints that restrict where wh-dependencies can terminate (Ross 1967; see Szabolcsi & den Dikken 2002 for a review).

In the past two decades, it has been established that the sentence processor attempts to form wh-dependencies before direct evidence in the input signals the position of missing constituents (Crain & Fodor 1985; Stowe 1986; inter alia), a phenomenon commonly referred to as "active filling" (Frazier & Flores d'Arcais 1989). Once the parser identifies a displaced element (henceforth, a filler), gaps are posited at each available position that would allow the dependency to be completed 41. For example, in (99), a wh-dependency exists between the underlined filler phrase and the verb "play":

(99) Which CD does the toddler like her mother to play ___ before naptime?

41 We use the term "gap" and the expression "posit a gap" in a theory-neutral way, as is standard in the psycholinguistics literature. Our discussion and results do not depend on whether or not the tail of the dependency is a trace. This question has been the subject of periodic debate in psycholinguistic circles (McElree & Bever 1989; Nicol, Fodor & Swinney 1989; Pickering & Barry 1991; Gorrell 1993; Gibson & Hickok 1993). Current experimental techniques are most informative about the timing of dependency formation, but the timing facts in this case are orthogonal to the representational hypotheses (Phillips & Wagers 2007).

A wide range of experimental findings indicates that, in the course of comprehending this sentence, speakers posit a direct object gap upon encountering the verb "like." This gap assignment must then be revised upon reaching the pronoun "her", and the ultimate gap assignment is made upon reaching the verb "play". This sequence of events has been established using several experimental paradigms. For example, previous work has compared strings like (99) to nearly identical ones that lack a wh-dependency, for example (100a) versus (100b):

(100) (a) The babysitter forgot which CD the toddler likes her mother to play ___ ...
(b) The babysitter forgot whether the toddler likes her mother to play a CD ...

Word-by-word reading times have shown that a disruption begins at the constituent "her mother" in (100a) (Crain & Fodor 1985; Stowe 1986; Lee 2004). This processing disruption, called the Filled Gap Effect, suggests comprehenders posit a gap in advance of an overt constituent. Convergent evidence comes from measures in which the semantic fit between a filler and potential-gap host (in bold) is manipulated, as in the following pair from Traxler & Pickering (1996):

(101) (a) That's the pistol with which the heartless killer shot the hapless man ...
(b) That's the garage with which the heartless killer shot the hapless man ...

These semantic fit or plausibility manipulations show that readers detect when a filler is an implausible argument of a verb while they are reading it, or very shortly thereafter. A disruption is reflected either by a slowdown in reading times (Traxler & Pickering 1996; Phillips 2006), a deflection of a lexical-semantic ERP component, the N400 (Garnsey, Tanenhaus & Chapman 1989), or a sharply increased tendency to report that the sentence stops making sense (Tanenhaus, Stowe & Carlson 1985; Boland et al. 1995). These results show that comprehenders not only posit a dependency between a filler and potential gap site, but also evaluate the semantic impact of their decisions as soon as possible.
They are strengthened by a wide array of findings using related methodologies and sampling a number of different languages (see footnote 42).

ERP studies provide a different index of long-distance dependency completion. Processing of the verb that allows completion of a wh-dependency elicits a posterior positivity relative to the same verb in a sentence without a wh-dependency (102ab: Kaan, Harris, Gibson, & Holcomb, 2000).

(102) (a) NO WH-DEPENDENCY: Emily wondered whether the performer in the concert had imitated a pop star for the audience's amusement.
(b) WH-DEPENDENCY: Emily wondered which pop star the performer in the concert had imitated for the audience's amusement.

Kaan and colleagues argue that this is the P600 evoked response typically associated with syntactic anomaly detection and reanalysis processes, and use this finding to suggest that the P600 is an index of "syntactic integration difficulty" in general. Although the interpretation of this effect remains uncertain (cf. Fiebach, Schlesewsky, & Friederici, 2002; Phillips, Kazanina, & Abada, 2005), its timing reinforces the generalization that wh-dependencies are formed as soon as a syntactically-appropriate thematic role assigner is encountered and before any direct evidence that a constituent is missing.

42 CROSS-LINGUISTICALLY: Dutch (Frazier, 1987; Frazier & Flores d'Arcais, 1989; Kaan 1997); German (Schlesewsky, Fanselow, Kliegl, & Krems, 2000); Hungarian (Radó, 1999); Italian (de Vincenzi, 1991); Japanese (Aoshima, Phillips, & Weinberg, 2004); Russian (Sekerina, 2003). CROSS-METHODOLOGICALLY: electrophysiology using EEG (Garnsey, Tanenhaus & Chapman 1989; Kaan, Harris, Gibson & Holcomb, 2000; Phillips, Kazanina, & Abada 2005) and MEG (Lau, Yeung, Hashimoto, Braun, & Phillips 2006); the "stops making sense" task (Tanenhaus, Stowe, & Carlson 1985; Boland et al. 1995); eye-tracking (Traxler & Pickering, 1996); cross-modal lexical priming (Nicol & Swinney, 1989; Nicol, Fodor, & Swinney 1994); anticipatory eye movements (Sussman & Sedivy, 2003).

4.3.2 Mechanisms of active dependency formation

The computational problems that processing a filler-gap dependency, actively or otherwise, poses in English are schematized below:

(103) (a) DEPENDENCY RECOGNITION
The parser must recognize long-distance dependencies, at least by: (i) identifying the filler; (ii) identifying the gap.
(b) RETAINING A FILLER IN MEMORY
The parser must enter into memory an appropriate syntactic representation of the filler for later re-integration. In languages where gaps may precede fillers, there is an analogous problem of retaining the integration environment in memory when the filler is ultimately encountered.
(c) FILLER REACTIVATION & INTEGRATION
The filler memory must be contacted, possibly reactivated, and integrated at the foot of the dependency.

For present purposes we will ignore the problems of recognizing the head of a wh-dependency. We assume that the use of morphological and positional cues is enough to start a search in English, though this may not be true cross-linguistically (cf. Yoshida, 2006). However, consider the problems of maintenance and retrieval. Dependency completion is active, in the sense of the Active Filler Strategy (AFS; Frazier 1987; Frazier & Flores d'Arcais 1989), because it is driven by a signal related to the filler, and not the gap (Wanner & Maratsos 1978; Fodor 1978): gaps are postulated by a top-down signal, and not on the basis of bottom-up evidence. What is the nature of this signal?

Wanner & Maratsos (1978), in their formulation of an Augmented Transition Network (ATN; Woods 1973) as a processing model, had proposed that when an NP is identified as the head of a relative clause, the NP is placed on a HOLD list 43.
The HOLD list allows the assignment of grammatical function to be put off until an appropriate subsequent context, and, in this sense, enables the ATN to operationalize a transformational grammar. For example, in an object-extracted relative clause, after the verb has been processed, the ATN attempts to analyze subsequent input as an NP. When it is unsuccessful, it checks to see whether or not the HOLD list is empty; if it isn't, it retrieves its contents, treating them as input. This parsing sequence does not qualify as an active strategy, because a gap is not postulated unless the parser fails to recognize the verb's lexical argument.

43 HOLD doesn't contain the constituent NP, but the constituents of NP. Only when these categories are retrieved at a gap are they assigned to the category NP ("ASSEMBLEd") (Wanner & Maratsos 1978 p. 136).

Frazier (1987) suggested that Wanner & Maratsos's HOLD model could itself be a "decision principle": the identification of a filler (or the non-emptiness of the HOLD cell) could serve as a signal to postulate a gap. Both the identification of a filler and the status of the HOLD list are closely related candidate signals, but they are logically separable cues to active gap postulation. In Frazier & Flores d'Arcais's (1989) formulation of the AFS, it is the identification of a filler that "immediately predisposes" the parser to rank gaps more highly than lexical arguments. Later, though, they suggest that this implicates an active filler (one that is not "inert"): at predicates below the head of a dependency, gap analyses are considered before lexical argument ones because the active filler is effectively always an input, until it is successfully incorporated into the structure. What it means to be not "inert" is open to a few different interpretations, but, functionally, the view seems to be that a filler, while a dependency is incomplete, has the same representational status as bottom-up inputs. Considered this way, dependency completion is active because the filler is in the workspace before subsequent inputs, and it will effectively out-compete incoming categories for attachment 44.

44 A similar logic has been applied to derive a MOVE-over-MERGE preference in the root-first generation of sentences (Richards 1999). Retrieving new items from the lexicon is asserted to be more costly than using what is already merged into the existing phrase structure. If the resource cost in Richards' account is taken to be processing time, then his view is fundamentally the same as Frazier & Flores d'Arcais (1989): what is already in the workspace wins because it's faster.

What is the representation of the unintegrated filler like? One view is that fillers are separately and actively maintained in a distinguished buffer, like the HOLD list, and that the HOLD list is consulted just like an input buffer at each step in the processing of a sentence. Let us focus on the first proposal: Wanner & Maratsos hypothesized that, if memory were like this in sentence processing, we would expect some cost to keep the elements on the HOLD list active, until they were all discharged. This maintenance cost could derive from devoting resources to clamping HOLD list items in an active state. Alternatively (though perhaps equivalently), it could be due to the consumption of a fixed pool of memory resources, or decrementing a fixed number of open buffers. Wanner & Maratsos (1978) provided evidence that memory costs were higher when a dependency was incomplete, by comparing subject and object relatives in a dual-task paradigm: participants had to both comprehend a sentence, and recall a list of names that at some point intruded upon the word-by-word reading of the sentence. Ford (1983) criticized Wanner & Maratsos's result, and claimed that the differential difficulty in subject- and object-relatives could be localized to the verb, where reading times are elevated for object relatives (see also King & Just 1991). Nicol & Swinney (1989; also Nicol, Fodor & Swinney 1994) later showed that, once the filler is introduced, semantic associates are primed in lexical decision, but that this priming does not persist. Only at the verb, when the filler must be contacted and thematically integrated, does the priming effect re-emerge. These results were interpreted in favor of a re-activation model, in which the filler is not fully active, and must be brought back into a state suitable for integration.

However, it is important to recognize that, firstly, the reading time results (Ford, 1983; King & Just, 1991) do not exclude an active filler. Processing complexity, or whatever slows readers down on object-relative verbs, may arise for reasons having to do with memory, ambiguity, particular parsing operations, or some combination of these. Moreover, the cross-modal paradigm used by Swinney and colleagues has been extensively criticized (McKoon & Ratcliff, 1994) for not uniquely admitting the interpretation they draw. It is also not clear how informative a demonstrated lack of priming is in dependency-medial positions, when other constituents are being actively constructed and interpreted in those positions.

Electrophysiological evidence has lately been brought to bear on the question of memory load during an open dependency. Both King & Kutas (1995) and Fiebach, Schlesewsky & Friederici (2002) showed that in object-extracted filler-gap dependencies, the averaged EEG record reveals a sustained anterior negativity. Because the effect is modulated by performance and participants' memory span, and has been implicated in more explicit memory load tasks (e.g. Ruchkin et al., 1990), the presence of a sustained anterior negativity in open filler-gap dependencies has been interpreted as a direct reflection of the memory load consumed by actively maintaining the filler.
Fiebach and colleagues are cautious to point out that the electrophysiological effect does not choose between alternative accounts of what precisely is being maintained. It could be a full semantic or syntactic representation of the filler, or perhaps just a few features. Or, it may not be the content of the filler that is represented, but rather the prediction for a category that allows completion of the dependency, as in Dependency Locality Theory (Gibson 1998). Moreover, recent work by Phillips and colleagues (2005) has provided evidence that the sustained anterior negativity does not reflect a cumulative effect that accrues at each word, but derives mainly from the first few words of the dependency. Such a finding raises doubt that the effect reflects active maintenance of the filler representation.

Taken together, the behavioral and electrophysiological evidence on the representational state of the filler during dependency completion is mixed. The chief idea behind active maintenance is that the filler is in a privileged representational state, which renders it highly accessible while the dependency is open, and prevents it from decaying. This perspective has been argued to conflict with the memory architecture we have been exploring. McElree, Foraker, & Dyer (2003) are explicit in their rejection of postulating distinguished representational states, like stacks or buffers. The resources available for maintaining information in an immediately accessible state are limited, they argue (cf. McElree, 2006; and Chapter 3). Consequently information that does not participate in the computation at hand is effectively shunted out of immediate attention, and must be re-accessed by retrieval. In such a memory architecture the representation of a filler cannot generally be in an active state for very long, since the input that intervenes between filler and gap will displace it from the focus of attention.
However, the evidence is overwhelming that dependency completion proceeds in advance of information in the input that could directly signal a retrieval. Therefore some internal signal must direct dependency completion. The active maintenance account of active filling has the virtue that, by effectively granting fillers and inputs the same status, the signal, though internal, is virtually bottom-up. When a category of the same type as the filler is subcategorized in the structure, fillers have the right of first refusal, so to speak.

On the one hand, clear evidence in favor of the active maintenance hypothesis is lacking and some powerful architectural desiderata militate against it. On the other hand, there is the logical problem of knowing when to attempt dependency completion on the basis of local information alone. One reasonable solution is to more explicitly embed active filling into the parsing routines: the filler representation itself would not compete for attachment, but the system would attempt retrieval of the filler at every grammatically-legal subcategorizer until the filler is integrated. In a series of studies we report below (section 4.6), we show that most lexically-anchored information is lost over increasing dependency lengths in the active portion of dependency completion. Before the gap site, the parser seems only robustly sensitive to coarse-grained categorial information about the filler. The lack of sensitivity to specific lexical features seems consistent with the explicit incorporation of active filling into parsing routines as a rule operating over syntactic categories. Below we argue that licensing syntactic dependencies works in this way, and that information contained in the full filler encoding may not be retrieved until much later in the timecourse of comprehension. We suggest, however, that some item-specific, partial information about the actual filler is truly carried forward in time.
We conjecture that a highly restricted subset of the features of the episodic representation of the current filler encoding is maintained in a state immediately accessible to processing. This set of features can serve as an internal signal to complete the wh-dependency as soon as possible but, more crucially, may also serve as a component of the retrieval structure used to target the grammatical filler.
4.3.3 Similarity-based interference and wh-dependency completion
In recent years several lines of evidence have been collected to support the idea that wh-dependency completion is subject to similarity-based interference. The force of this claim is to further undermine the position that the full filler encoding is preserved in a distinguished state. The first source of evidence comes from a series of studies by Peter Gordon and colleagues (2001, 2002, 2004) on the processing of relative clauses and clefts. A very basic and robust finding in sentence comprehension is the subject/object asymmetry in relative clause processing (Wanner & Maratsos, 1978; King & Just, 1991, inter alia). When the head of the relative clause is relativized from object position (104a), the relative clause is harder to process than when it is relativized from subject position (104b).
(104) (a) The salesman [RC that the accountant contacted ] spoke very quickly
      > (b) The salesman [RC that contacted the accountant ] spoke very quickly
For example, in two sets of baseline data collected by Gordon et al. (2001) in self-paced reading, object relatives raised comprehension error rates 6-9%, and RC verb reading times 50-85%. However, Gordon et al. (2001) also showed that the difficulty of object relatives depends on the relative referential types of the filler DP and the subject DP. In examples like (104) both filler and subject are definite descriptions. However, if the subject DP was a pronoun or a name, the asymmetry in verb processing was essentially eliminated.
The same effect held if the wh-dependency was in a cleft (though the amelioration was not as complete). Gordon et al. (2001) argued that the distinctiveness of the representations was an important factor in determining how easily the constituents could be integrated at the verb. When the two DPs are of the same referential "type" (either both descriptions, or both names), integration is more difficult than if they are of mixed types.
Gordon et al. (2002) used a memory load paradigm to further support the idea that dissimilar constituents are easier to integrate. In this experiment, participants were presented with a list of 3 names or 3 descriptions. They then had to read a sentence, answer a T/F comprehension question about it, and recall the memory list. The design of the experiment is summarized in the example materials set in (105). The clefts were either subject or object clefts, and the memory list either did or did not match the referential type of the DPs in the sentence.
(105) (DESCRIPTION) LOAD: POET-CARTOONIST-VOTER
(a) SUBJECT CLEFTS
    MATCH: It was the dancer that liked the fireman before the argument began.
    MISMATCH: It was Tony that liked Joey before the argument began.
(b) OBJECT CLEFTS
    MATCH: It was the dancer that the fireman liked before the argument began.
    MISMATCH: It was Tony that Joey liked before the argument began.
Comprehension error rates and reading times both showed strong effects of subject v. object extraction, and match v. mismatch. These effects interacted such that a matching memory list made object clefts much harder to understand, compared to mismatching lists; the impact on subject clefts was smaller. In later work, Gordon et al. (2004) showed that when pronouns, names and quantifiers mismatched with definite descriptions, the subject/object asymmetry could be alleviated (but not by contrast with indefinites, generics, or superordinal category nouns).
This effect can be intuited quite strongly if the level of embedding is increased, as Bever (1974) observed:
(106) (a) The reporter the politician the commentator met trusts said the president won't resign.
      (b) The reporter everyone I met trusts said the president won't resign.
(106a), in which the three pre-verbal DPs are all descriptions, is much harder to comprehend than (106b), in which the three pre-verbal DPs are a description, the universal quantifier, and an indexical pronoun (cf. Gibson, 1998).
The interpretation that Gordon and colleagues gave their data is that, before the verb is encountered in object extractions, there are two unintegrated arguments: the filler DP and the subject DP. In their unintegrated state, these arguments can interfere with one another to the extent they are similar [45]. Once integrated with the verb, the relative distinctiveness of the DP arguments does not matter (cf. Gordon et al. 2006 for further data and argument).
[45] In Gordon et al.'s experiments, the relevant dimension of similarity was referential type, or how the expressions can function in a discourse. It is unclear whether there is a plausible feature-based encoding of the different expression types that can distinguish names, definite descriptions, quantifiers and pronouns. It may be that the syntactic category of the expression or the abstract internal structure of the phrase is actually controlling similarity.
In the present context it is worth noting that Gordon et al. are explicitly not committed to a retrieval-based account of their data. A working memory account in which multiple items are actively maintained is, in their view, a permissible account of their data. However, their data can be seen to provide a further argument that the filler encoding is not in a distinguished state insulated from other constituents. If the filler occupied its own buffer or slot, it should not matter how distinctive it is with respect to the incoming subject DP. We find that this argument is undermined somewhat if there is a pressure to structurally analyze the filler (i.e., to discharge the contents of such a buffer). Suppose an object-extracted filler is initially analyzed as the subject but the input forces reanalysis. The relative distinctiveness of the two representations may make the reanalysis process more difficult, and thus be uninformative about how the filler is maintained in its unanalyzed state.
Van Dyke & McElree (2006) provided more direct evidence that the filler must be retrieved at the site of integration, and that this retrieval is subject to interference. They also examined the processing of clefts under a memory load manipulation. In half of the experimental conditions, participants were presented with a list of three nouns at the start of the trial, which would have to be recalled after the sentence comprehension task (Load conditions). For example:
(107) TABLE-SINK-TRUCK
Participants then read sentences like the following:
(108) It was the boat that the guy who lived by the sea sailed / fixed in two sunny days.
Two possible critical verbs, "sailed" and "fixed", are given in this sentence. For half of the conditions, exemplified by "sailed", the critical verb was not a good fit for the nouns in the memory list (Low interference conditions): it is not plausible to sail a table, sink or truck. For the other half of the conditions the critical verb was a good fit (High interference conditions): it is plausible to fix a table, sink or truck [46]. Results for the critical region are reported in Figure 4-1. In the critical verb region they reported an interaction of interference and load conditions. Reading times were identical when there was no memory list; however, under load conditions, high interference conditions were read much more slowly.
[46] It is unclear to us why the subject was made complex; perhaps this induced slower processing, giving effects a greater chance of manifesting themselves.
Figure 4-1. Van Dyke & McElree (2006): critical verb region.
The interpretation of this pattern is that the memory load items interfere with filler-gap completion. Similarity-based interference in this case, Van Dyke & McElree argue, arises during retrieval of the filler, as cues compete in the high-interference conditions. The memory list items (like "table-sink-truck") have some common feature specified in the retrieval structure of the verb "fix" that is not present in that of "sail," so that more memory items are activated in high-interference conditions than in low-interference conditions. If this explanation of the data is correct, then it is further support for a filler-gap processing regime supported principally by retrieval at the verb, and not by a maintained or distinguished representation of the filler.
The key finding in Van Dyke & McElree's (2006) data was the interaction of interference and load at the critical verb. However, some further aspects of the data bear consideration in relation to other experiments. In Van Dyke & Lewis (2003), Van Dyke (2007) and our Experiment 6, reported in Chapter 3, the purported high-interference manipulations most reliably led to decrements in offline measures of comprehension, even when online differences were not reliable. In Van Dyke & McElree's (2006) data, there were main effects in comprehension accuracy: high interference conditions were harder than low interference conditions (83% v. 87% correct), and, interestingly, no-load conditions were harder than load conditions (also 83% v. 87% correct). The authors reported no difference between interference conditions specific to the load condition, however, though such a difference is exactly what other experiments lead us to expect.
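The cue-competition interpretation of the interference-by-load interaction can be made concrete with a toy calculation. The sketch below is an assumption-laden illustration, not Van Dyke & McElree's model: the feature labels, the additive match score, and the ratio rule that converts match scores into a probability of retrieving the filler are all hypothetical choices made for exposition.

```python
from typing import FrozenSet, List

# Hypothetical feature encodings (labels are illustrative only).
FILLER: FrozenSet[str] = frozenset({"nominal", "wh", "fixable", "sailable"})  # "the boat"
LOAD_ITEM: FrozenSet[str] = frozenset({"nominal", "fixable"})  # "table", "sink", "truck"

# Hypothetical retrieval structures assembled at the critical verb.
CUES = {
    "fixed": frozenset({"nominal", "wh", "fixable"}),
    "sailed": frozenset({"nominal", "wh", "sailable"}),
}


def match(item: FrozenSet[str], cues: FrozenSet[str]) -> int:
    """Additive match score: the number of cues the item satisfies."""
    return len(item & cues)


def p_retrieve_filler(verb: str, load_items: List[FrozenSet[str]]) -> float:
    """Probability of retrieving the filler under a simple ratio rule (assumed)."""
    cues = CUES[verb]
    target = match(FILLER, cues)
    competition = sum(match(item, cues) for item in load_items)
    return target / (target + competition)


for verb in ("sailed", "fixed"):
    for label, items in (("no load", []), ("load", [LOAD_ITEM] * 3)):
        print(verb, label, round(p_retrieve_filler(verb, items), 2))
```

Under these assumptions the filler is retrieved with probability 1 for both verbs when there is no memory list, but with probability 1/2 for "sailed" and 1/3 for "fixed" under load, so the verbs differ only when the list is present. A combination rule that weighted full matches more heavily than partial matches would shrink the competitors' influence.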
We might suppose that differences across experiments reflect how comprehenders trade speed and accuracy: where interference leads to errors in comprehension, particularly those that can be attributed to misbinding the constituent affected by the interference manipulation, we may see no difference in the online record. For example, in Van Dyke (2007), the condition with the greatest interference (HISYN/HISEM) led to the worst comprehension accuracy, yet it was the intermediate interference condition (HISYN/LOSEM) that most robustly disrupted online processing. We can imagine two kinds of interfering constituents, corresponding to Van Dyke's (2007) manipulation: just-partial matches, in which there is enough information in the constituent encoding to signal a potential error; and near-total matches, in which the only way to detect an error would be a process of chained retrievals to verify the structural relationship. In just-partial matches, the error signal leads to increased processing times online, since the system is triggered to select another constituent or otherwise attempt repair. In near-total matches, the system is sufficiently fooled by the item information in the interfering constituent and registers no error. In just-partial match scenarios participants more often succeed in binding the correct constituent, so comprehension accuracy is less impacted; whereas in near-total match scenarios participants more often misbind the interfering constituent, which would be reflected as a decrement in comprehension accuracy. Unfortunately, there is no diagnostic measure of which constituent the comprehender took as the embedded object; a cloze task, as in Van Dyke (2007), could be helpful in that regard.
Van Dyke & McElree's (2006) interference manipulation must surely count as a just-partial match, since the interfering elements are extra-syntactic: they are bare lexemes in a list.
The item encodings should therefore carry no grammatical information (perhaps beyond lexical category), and are moreover encoded during a distinct task. Thus if one were misretrieved, it would be clear that it could not be a legal participant in any grammatical dependency. Given just how distinct one would expect the word list encodings to be, and given the presence of a fully matching constituent in the sentence, it is surprising that the online effect is so strong. In our discussion of agreement attraction, one point emphasized was how the retrieval mechanism privileges full matches through non-linear cue combination. The partial matches in this experiment at best match on two dimensions: they are nominal (maybe) and have lexical features that match with the verb [47]. The filler in [Spec,CP] would match on several others: e.g., its grammatical +wh feature, a shared same-sentence/same-clause context encoding, its dominating category. Moreover, one expects that a comprehension system that is even modestly grammatically constrained would assign greater attention to features like +wh, since structural relations ultimately determine the interpretation.
[47] It constitutes a further interesting puzzle whether we can give the right lexical specifications to admit selective semantic cues. In the example discussed, the word list is "table-sink-truck," the grammatically licensed constituent is "the boat" and the verbs are "fix" and "sail." All items are "fixable", but only one item is (prototypically) "sailable." There are therefore two factors relevant to the retrieval structure provided by the two verbs: how unselective "fix" is, and whether "sail" is selective. Is there a feature relevant to "fix" that all representations share (e.g., +concrete; though surely [?] not +reparable)? Is there then a feature that "sail" picks out (e.g., +navigable, +marine)? A further question is whether such features in the retrieval structure arise directly from the lexical specification of the verbs, or whether they are contextually provided by earlier processing.
A priori assumptions about the retrieval structure could be misleading, however. And it is important to keep in mind that the effect of each partial match accrues because there are three, and it is the presence of three that competes with the grammatical constituent. Nonetheless there are sufficient concerns about specifying the retrieval structure to raise doubts about a purely retrieval-based account of these data [48]. An important question is what effect the memory list has on retrieval of the verb itself from the lexicon (prior to any structure building), and whether shared lexical features with items on the memory list could slow the selection of the verb in high interference conditions. It should be straightforward to address whether it is filler-gap dependency completion per se, or another effect of the memory load conditions, that accounts for the interaction, by testing sentences in which the verb does not participate in a filler-gap dependency. For example, in the following sentence, there is no filler-gap dependency to complete at "sail"/"fix" and consequently no reason to retrieve for theme arguments (but see fn. 48 for a discussion of why it is important to match the thematic properties of the test verbs closely).
[48] There is a question about whether the unergative thematic role assignment that "sail" permits (but "fix" does not) affects the retrieval structure. It is unknown whether this is systematic in Van Dyke & McElree's (2006) materials, but it is worth considering in its own right. Independently of its occurrence in a filler-gap dependency, the string "the guy who lived by the sea fixed" has outstanding thematic requirements: the verb "fix" must discharge its THEME role; on the other hand, the string "the guy who lived by the sea sailed" is well-formed without further constituents. Inclusion of PPs like "in two days" forces the transitive interpretation of "sail" ultimately, though this information is not available in the critical region. The thematic structure of the verb is important because there is evidence for argument anticipation (Altmann & Kamide, 1999) and even some indication that it may occur semi-independently of filler-gap dependency completion (but cf. our discussion in section 4.4 below).
(109) Load: TABLE-SINK-TRUCK
      Test: John heard that the guy who lived by the sea sailed / fixed his boat in two sunny days.
Assuming the load items are all equally bad subjects, there should be no interaction of verb class with load if that interaction is caused by cue competition in a retrieval. If, on the other hand, the presence of the memory list interacts in other ways (e.g., through lexical retrieval), the same interaction should be present in (109).
The recent sets of studies by Gordon and colleagues, and Van Dyke and McElree, give us grounds to believe that completing a filler-gap dependency does not depend on a special representation of the filler, at least not one that can always be used easily and reliably. The presence of items other than the filler affects the ease and accuracy with which filler-gap dependencies are processed. These facts do not lead inevitably to the conclusion that the mechanism of interference is exclusively retrieval-based, nor that no information about the filler persists to guide initial dependency completion. In the following three sets of studies we argue that top-down properties of the dependency environment play a major role in dependency construction. We argue that the comprehender uses global context to determine when retrieval is attempted and what information it is based upon. In support of this mechanism we argue that some information is maintained and carried forward in time.
Consistent with our conjecture about this kind of prospective processing, wh-dependency formation is found to be highly grammatically accurate.
4.3.4 Three studies
The first two sets of studies demonstrate that wh-dependency formation is highly grammatically accurate in two respects: first, in deciding where to attempt dependency completion; second, in targeting the head of the dependency to incorporate at a gap site (contra Van Dyke & McElree, 2006). Crucially, in the context of a highly retrieval-dependent architecture, this accuracy must hang on the parser being able to make accurate, grammar-driven predictions about where a dependency could terminate, and thus where to attempt retrieval. When it does retrieve, it must form a retrieval structure that is capable of (nearly uniquely) targeting the head of the dependency. In the third set of studies we provide evidence that category-level information is available to make a syntactic decision in active dependency formation, consistent with our hypothesis that some information is maintained.
In the first set of studies, we focus on island constraints (section 4.4, Experiments 7-8). There has been considerable attention devoted to the question of whether wh-dependency completion respects island constraints (Ross, 1967) online, or whether the parser sometimes attempts to form a dependency that must ultimately be grammatically illicit. We test whether online structure building respects the Coordinate Structure Constraint (CSC; Ross, 1967), a condition on extraction that forbids gaps inside a coordinate structure unless the same subconstituent is gapped in each coordinated phrase. The CSC is unique among island constraints in this regard: it does not ban extraction outright, but requires multiple extractions. Consequently we can test whether the comprehender can use the grammar to predict future retrieval sites.
In the second set of studies, we present a refined version of Van Dyke & McElree (2006).
We conducted two experiments in which competition at retrieval comes from two wh-dependency chains in the same sentence, not from an extra-syntactic list (section 4.5, Experiments 9-10). Our test for interference during filler-gap dependency completion simply concerned the presence or absence of a syntactically similar dependency head, and thus we could set aside potential lexical effects.
Finally, in the third set of studies, we test how well three different kinds of dependency formation probes survive increasing dependency lengths: verb-object plausibility, verb-PP selectional restriction, and a DP/PP filled gap effect. Each of these probes requires different kinds of information about the filler to generate a signal during active dependency completion. The specificity of the information required (minimally) for each probe is gradually decreased. The verb-object plausibility test requires information about the filler's lexical head features; the verb-PP selectional restriction requires just the lexical identity of the filler's functional head; and the DP/PP filled gap test only requires filler category. By varying dependency length, we are able to test what kinds of information are effective long after the filler was first encoded, and consequently what kinds of information may be guiding the parser's initial decision, before it attempts to recover full information about the filler.
4.4 The grammar's role in triggering wh-dependency formation
4.4.1 The motivation for active dependency formation and island constraints
We said earlier that if no information about the filler encoding were maintained throughout processing, then we must be concerned with another mechanism for signaling that a dependency must be completed. The most obvious mechanism is simply a parsing rule: if a filler has been encountered, attempt to retrieve and integrate that filler at every licit subcategorizer. But why does this rule have high priority?
As Fodor (1978) discussed, it is not an inevitable element of the parser; it has to be motivated. Most discussions of why filler-gap dependency formation is so active identify a pressure imposed by the unintegrated filler. One kind of pressure sees the unintegrated filler as imposing a tax on processing resources, as it competes for maintenance resources in working memory. Wanner & Maratsos (1978) were early advocates of this kind of architectural approach to keeping dependencies as short as possible. In a retrieval regime, and one in which representations are not maintained in working memory, that pressure cannot come from the presence of the filler representation itself. If the filler is not consuming limited memory resources, then the pressure to complete the dependency cannot stem from that burden [49].
[49] By virtue of cue competition, retrieval itself is a limited capacity process, so the pressure could arise as an adaptation to maximize retrieval success. On the assumption that interpolating more material leads to a decline in retrieval accuracy of wh-phrases, a parser that retrieves and integrates the wh-phrase sooner rather than later might on average be more successful in arriving at an interpretation. We have advocated this view elsewhere (cf. Wagers, 2006). We do not pursue it further here, though it raises interesting questions concerning how the parser could adapt itself to this pressure.
Here we want to focus on another alternative, which is that the pressure to complete dependencies actively stems from the well-formedness of either the syntactic or the semantic representation. This class of explanation is closely related to how previously encountered information guides parsing, since the well-formedness of the representation can be gauged either top-down or bottom-up. A top-down indicator of well-formedness could be something like a list of outstanding requirements to license a structure. For example, one class of explanations, associated with principle-based parsing (e.g., Pritchett 1992; Weinberg 1992), identifies a pressure to satisfy grammatical licensing requirements as rapidly as possible. In the case of filler-gap dependencies, it has been suggested that there is a pressure to satisfy the Theta Criterion as soon as possible (e.g., Pritchett 1992; Aoshima et al. 2004). Until a structural relation can be established between the filler and a thematic role assigner, the parse is partially unlicensed, which is an undesirable state of affairs.
This explanation can be framed in less grammaticized terms, under the assumption that the parser attempts to interpret as much of the sentence as soon as possible (e.g. Altmann & Kamide 1999; Sedivy et al., 1999). Under this view the active strategy is simply one manifestation of the parser's efforts to derive an interpretation from only partial information. By actively completing a wh-dependency the parser can yield a more informative interpretation from the limited input available. Altmann & Kamide (1999) in particular focus their explanation more on the outstanding properties of the verb. Verbs have licensing requirements as well; for example, they must discharge their theta roles. A single verb thus provides a bottom-up signal that there is an outstanding licensing requirement. Based on its inherent properties a retrieval structure could be assembled to search for arguments. The question arises whether active dependency formation simply reflects the verb casting about for (thematically unmarked) arguments, and finding one in the head of a wh-chain in the case of an incomplete filler-gap dependency [50].
[50] Already we must suspect such an explanation, given the presence of filled-gap effects in subject position (Lee, 2004).
Island constraints provide a natural way of testing whether active dependency formation is merely the result of a verb seeking to saturate its argument structure. Island constraints restrict the kinds of dependencies that can be formed, in ways that are potentially independent of constraints on interpretation or processing. For instance, the example sentence in (110) contains a complex noun phrase in subject position. It is impossible for the filler phrase in the main clause to terminate in the NP-contained clause:
(110) * Which babysitter did [NP the revelation [S that the toddler tormented ___ ] ] frighten her mother?
If the context with which the parser deals is really restricted by architectural constraints on the focus of attention, as has been suggested, then the question arises whether the verb "torment" would search for and identify "which babysitter" as an argument, irrespective of the island boundary that separates them. The notion expressed by linking the filler "which babysitter" with the verb "tormented," as in (110), is plausible and perhaps a likely state of affairs. Furthermore, as the discussion of the filled-gap effect in section 4.3.1 illustrates, in non-island domains the parser seems willing to make some mistakes and revise temporary commitments. However, linking "which babysitter" with the verb "tormented" can never turn out to be the right analysis, because of the constraint on constructing dependencies inside complex NPs. Showing that parsers do not engage in active dependency formation inside island domains would in principle constitute strong evidence that grammatical knowledge guides the parser's decisions about when to recover filler information. In particular, it would indicate that the predicted well-formedness of a candidate analysis influences where the parser decides to recover information about the head of the dependency. A number of studies have shown that measures of active dependency formation are not observed in island domains (Stowe, 1986; Bourdages, 1992; Pickering et al., 1994; Traxler & Pickering, 1996; cf.
Freedman & Forster, 1985; Kurtzman & Crawford, 1991), and many share the consensus that island constraints are respected in incremental processing (Phillips, 2006). However this position is vulnerable, exactly because the typical empirical consequence of respecting an island constraint is the absence of evidence that a dependency was ever entertained in that island [51]. The strength of conclusions that can be drawn from a lack of evidence has raised concerns, particularly because some island domains have been argued to be themselves complex processing environments, whether a dependency is present or not (e.g. Deane, 1991; Kluender, 2005). Therefore null findings in island processing are liable to alternative interpretations that are unrelated to the parser's interaction with the grammar.
[51] Three EEG studies have demonstrated a processing disruption when the search for a gap encounters the boundary of an island domain. This disruption is reflected in a particular evoked response potential (ERP; P600: McKinnon & Osterhout, 1996; LAN: Kluender & Kutas, 1993; Neville et al., 1991). However these ERPs are observed at island boundaries. Therefore they are not informative about whether the parser attempts to construct a dependency, only whether an island domain is noticed. Moreover, the observed ERPs are also sensitive to processing difficulty. Consequently, while these results may reflect calculation of ill-formedness in a formal account of island constraints, they may equally well reflect increased complexity.
In the first series of experiments, we present a new argument that grammatical knowledge plays a definite role in the active formation of wh-dependencies, and one that does not suffer from the null-effect logic of previous studies on islands in language processing. Instead of considering island constraints that absolutely restrict the formation of a wh-dependency inside a certain domain, we consider a related constraint on wh-dependency formation, the Coordinate Structure Constraint, under which extractions are permitted in certain cases. The Coordinate Structure Constraint (CSC; Ross, 1967) rules out gaps within coordinate structures (111a-b), except in the case of across-the-board extraction, when one gap must occur in each coordinated phrase (111c) [52].
[52] There are several well-known classes of exceptions to this generalization (Goldsmith 1985; Lakoff 1986), as, for example, in: (i) What did you go to the store and buy ___? (ii) How much can Josh drink ___ and still stay sober? These exceptions occur in specific circumstances when certain narrative relationships hold between the coordinates. All materials used in our studies were designed so as to avoid these contexts. See Postal (1998) for further discussion of these environments.
(111) Phil generally dislikes the poetry ...
      (a) * that The New Yorker reviews authors or publishes ___
      (b) * that The New Yorker reviews ___ or publishes interviews
      (c) that The New Yorker reviews ___ or publishes ___
If the parser is guided by the Coordinate Structure Constraint, then there should be evidence that a second gap is actively posited in the second coordinate. Since this evidence would be positive, and not a null effect, it could avoid the concerns raised by previous island studies.
The presence of continued active dependency formation in coordinate structures could be explained by the real-time application of the CSC, but it could also be explained by a bottom-up retrieval mechanism in which the verb initiates a search for arguments. If a subsequent verb in a coordinate structure can take the filler as its argument, then it can satisfy its interpretive needs earlier than by waiting for an argument phrase. Therefore we consider a second kind of multiple dependency construction, a parasitic gap inside post-verbal adjunct clauses (Engdahl, 1983). Single extractions from a post-verbal adjunct clause are generally unacceptable (112a). In the presence of an extraction from direct or indirect object position, the post-verbal adjunct clause can support an additional gap (112b), but, crucially, that gap is optional (112c).
(112) Phil generally dislikes the poetry ...
      (a) * that The New Yorker reviews authors without publishing ___
      (b) that The New Yorker reviews ___ without publishing ___
      (c) that The New Yorker reviews ___ without publishing too much detail
A comparison of wh-dependency formation in coordinate structures and parasitic gap environments, like post-verbal adjunct clauses, is therefore informative: if dependencies are actively formed in both domains, then we would fail to isolate the role that grammatical principles play, since the results would not reflect any grammatical distinctions. If, on the other hand, CSC environments exhibit active dependency formation, but parasitic gap environments do not, then it would indicate that information in the syntactic context guides when the dependencies are formed, and when filler information is recovered. If this is the case, then it must be that an important grammatical constraint is reflected in parsing routines.
Table 4-1 outlines the three candidate patterns of active dependency formation that might be observed in multiple dependency constructions, and the conclusion that could be drawn from each. Firstly, it is entirely possible that active dependency formation ceases once a single, verified dependency is constructed. In this case, active dependency formation should not be observed either in second coordinates or in post-verbal adjunct clauses. We call this prediction "ENTIRELY FILLER DRIVEN," since it corresponds to a parser that is driven solely by the requirements of the filler. We assume in this case that once the parser establishes a gap site or grammatical role for the filler, active dependency formation is terminated.
This pattern of results would also contradict an aggressively bottom-up account of wh-dependency formation. Secondly, active dependency formation might be observed in both the context of a second coordinate and a post-verbal adjunct clause. This prediction is called "VERB DRIVEN," since it is expected if active dependency formation reflects the parser's drive to saturate the verb's licensing requirements bottom-up. Finally, active dependency formation might be observed only in second coordinates, and not in post-verbal adjunct clauses. This prediction is called "CONTEXT DRIVEN," since it suggests that knowledge about the distinction between coordinate gaps and parasitic gaps affects parsing decisions.

                                                                  Active dependency formation expected?
                                                                  (by verb position)
Active dependency formation principle                             Second coordinate    Adjunct clause
Entirely filler driven: Interpret or license displaced filler     no                   no
Verb driven: Identify arguments of the verb                       yes                  yes
Context driven: Satisfy grammatical constraints                   yes                  no

Table 4-1 Predictions for active dependency formation in multiple dependency constructions.

A fourth logical possibility is that active dependency formation only persists in parasitic gap environments. This outcome seems unlikely, and it would be puzzling, as it would imply that the parser undertakes an effortful decision to construct an optional dependency, but not an obligatory dependency.

One previous study addresses the real-time status of the Coordinate Structure Constraint. Pickering, Barton, & Shillcock (1994) used a Filled Gap Effect design to compare sentences like:

(113)
(a) I know what you hit the cupboard and broke the mirror with ___
(b) I know that you hit the cupboard and broke the mirror with a ball

In sentence (a), the filler "what" is the argument of the preposition "with," but there are two predicates that intervene between filler and gap: "hit" and "broke."
In self-paced reading, Pickering, Barton, & Shillcock (1994) found that reading times were elevated at the determiner following "hit" in (a), compared to a control sentence (b) with no wh-dependency. But no such effect was observed following "broke." One interpretation of these data is that the parser attempted to form a dependency with "hit," but then had to retract this analysis because there was an overt direct object. However, recognizing that it was inside a coordinate VP, the parser did not then attempt to form a dependency with "broke," since doing so would have violated the CSC. This study thus suggests that the parser can exhibit real-time sensitivity to the CSC, albeit by restricting certain analyses. However, the signal to apply the grammatical constraint in this experiment comes in the form of a parsing failure followed by the coordinator "and". Once the initial object gap site has failed, the reader might not be expected to figure out where it is possible to resolve the wh-dependency. For these reasons, the present study seeks to find evidence for application of the CSC that comes from a positive measure, and to use a design where the signal to apply the constraint follows a successfully constructed representation.

4.4.2 The Coordinate Structure Constraint and Active Dependency Formation I (Experiment 7)

The goal of Experiment 7 was to test whether or not active dependency formation is operative in second coordinate phrases and parasitic gap environments. To do so, we created sentences containing object extractions from an initial VP, where active dependency formation is uncontroversial. Beyond the first gap, sentences had two possible continuations: (i) a coordinate VP, in which case a second gap is obligatory; or (ii) an adjunct clause that could host a parasitic gap, in which case a second gap is optional. As our index of active dependency formation, we manipulated the semantic fit of the filler with the second verb.
4.4.3 Materials and Methods

Participants

The participants were thirty-seven native speakers of American English from the University community, who were paid $10.

Materials

Experimental materials consisted of 24 sets of 4 conditions organized in a 2 × 2 factorial design that independently manipulated the conditions of VP structure and plausibility. An example materials set is given in Table 4-2. The second verb is the critical region, where active dependency formation is tested. It is in bold font; the relative clause head is underlined.

VP Structure: Coordinated VP
  Plausible:    The wines which the gourmets were energetically discussing ___ or slowly sipping ___ during the banquet were rare imports.
  Implausible:  The cheeses which the gourmets were energetically discussing ___ or slowly sipping ___ during the banquet were rare imports.
VP Structure: Adjunct Clause VP
  Plausible:    The wines which the gourmets were energetically discussing ___ before slowly sipping the samples during the banquet were rare imports.
  Implausible:  The cheeses which the gourmets were energetically discussing ___ before slowly sipping the samples during the banquet were rare imports.

Table 4-2 Sample materials set for Experiment 7

The semantic fit of the filler with the first verb was plausible across all conditions, so that processing would not be disrupted before the critical region. In the examples given, one can equally discuss wines or cheeses. The plausibility factor manipulated, on two levels, the semantic fit of the filler with the second verb only. This manipulation provides a measure of dependency formation, as a slowdown is expected for implausible verb-argument combinations (e.g., Traxler & Pickering, 1996). If this slowdown occurs at the verb, before any signal in the input that there is a missing constituent, then we can conclude that dependency formation is active.
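As an illustration of the factorial crossing described above, the following minimal sketch builds the four conditions of one item set from the sample sentence in Table 4-2. The dictionary names and the function build_item_set are hypothetical and introduced only for this illustration; they are not part of the original materials preparation.

```python
from itertools import product

# Sentence pieces taken from the sample item in Table 4-2.
# All identifiers here are hypothetical, for illustration only.
FILLERS = {"plausible": "wines", "implausible": "cheeses"}
SECOND_VP = {
    "coordinate": "or slowly sipping ___",
    "adjunct": "before slowly sipping the samples",
}


def build_item_set():
    """Cross VP structure with plausibility to yield the four conditions."""
    conditions = {}
    for structure, plausibility in product(SECOND_VP, FILLERS):
        sentence = (
            f"The {FILLERS[plausibility]} which the gourmets were "
            f"energetically discussing ___ {SECOND_VP[structure]} "
            "during the banquet were rare imports."
        )
        conditions[(structure, plausibility)] = sentence
    return conditions


item_set = build_item_set()
for key, sentence in item_set.items():
    print(key, "->", sentence)
```

Only the filler noun and the second VP vary across the four sentences, so any reading time difference at the second verb can be attributed to those two factors.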
The VP structure factor manipulated the structure that contained the second verb, on two levels: Coordinated VP or Adjunct Clause VP. This manipulation allowed us to compare evidence for active dependency formation in contexts that require second gaps, and those that merely allow second gaps, as outlined in Table 4-1. Coordinate VP sentences always contained direct object gaps in the second VP, as required by the Coordinate Structure Constraint. The Adjunct Clause sentences provided environments where a gap might be anticipated, but they did not actually include parasitic gaps (p-gaps). Thus when the initial VP was followed by an adverbial clause, the parser could be lured to a p-gap analysis, but such an analysis was never confirmed in our materials. This design permitted the identification of effects due to active dependency formation, rather than bottom-up, gap-driven processing. No p-gaps were present in our experimental target materials, as they might be highly noticeable constructions that participants could use to strategically identify the target conditions.

As discussed in footnote 52, there are several classes of exceptions to the CSC, e.g. "What did you go to the store and buy ___?", all involving expressions of purpose, outcome, and temporal contiguity. Since these environments seem most felicitous under conjunction with "and," three coordinators were used ("and," "but," and "or"), equally balanced across the materials, to mitigate the potential confound that comprehenders might believe that they are in one of the CSC-exempt environments. Additionally, "and" conjuncts were constructed to avoid CSC-exempt construals by coordinating events that seemed equally felicitous in either order of mention. An auxiliary ratings study, reported below, shows that participants were not treating the materials as CSC exempt.
We fully balanced closed-class lexical items in two other ways: in the adjunct conditions, four prepositions were used: while, without, before, and after. Across materials, both the relative pronouns who and which were used. Analyses of the results showed that these various lexical differences had no effect on reading times, and therefore we do not discuss these manipulations further.

There were two additional constraints on the materials. First, both VPs contained adverbial modifiers before the verbs. These were included to provide a strong cue for the upcoming verb and thus allow participants sufficient time to recognize the CSC environment. Secondly, adjunct clause verbs were in the past progressive form, since simple past forms less readily host parasitic gaps. We matched both these features across all conditions, for comparability, and across both VPs, for parallelism.

Seventy-two filler sentences were included. In order to prevent recognition of target structures, the fillers included syntactic features characteristic of the target items, such as progressive morphology, coordinate structures, filler-gap dependencies, and anomalous predicate-argument combinations. Since there were no parasitic gaps in the experimental design, parasitic gaps were included in some fillers. There would therefore be no implicit cue in the experiment to only expect gaps in non-p-gap positions.

Acceptability rating study

In order to verify the generalization that multiple dependencies are necessary in coordinate structures but optional in parasitic gap environments, we conducted an off-line rating study. Thirty-two participants who did not take part in the on-line study completed an acceptability questionnaire using a 5-point scale. These participants received either extra credit in an introductory linguistics class, or payment as part of another set of experiments. Half of the target item sets from the on-line study were included in this study.
The sentences were minimally modified versions of the on-line items from the plausible conditions, crossing the structural role of the second VP, as a Coordinate VP or as an Adjunct Clause VP, with the presence of a gap in the second VP.

(114) The wines which the gourmets were energetically discussing ___ ...
(a) Coordinate VP, Gap: ... or slowly sipping ___ during the banquet were rare imports.
(b) Coordinate VP, No Gap: ... or slowly sipping the samples during the banquet were rare imports.
(c) Adjunct Clause VP, Gap: ... before slowly sipping ___ during the banquet were rare imports.
(d) Adjunct Clause VP, No Gap: ... before slowly sipping the samples during the banquet were rare imports.

The 12 item sets of these conditions were distributed by a Latin Square across four lists and combined with 12 filler items of similar length and complexity. Six fillers were uncontroversially acceptable sentences and six were highly unacceptable sentences. Each list was permuted in two pseudo-randomized versions.

Results for the experimental items are given in Table 4-3. The average rating for uncontroversially acceptable filler items was 4.3 ± 0.09 (standard error), and 1.8 ± 0.07 for unacceptable filler items.

                    GAP IN SECOND VP?
SECOND VP           YES          NO           C.I.
COORDINATE VP       4.2 ± 0.1    2.9 ± 0.1    0.49
ADJUNCT CLAUSE      4.2 ± 0.1    3.7 ± 0.1    0.35
C.I.                0.30         0.80

Table 4-3 Experiment 7 Acceptability Ratings Summary. Average ratings are given for each sentence type with standard error. 95% confidence intervals on mean ratings differences, across participants and items, are reported in margins. N = 32.

A repeated measures ANOVA showed a main effect of structure (F1(1,31) = 9.8; MSE: 12.8; p < 0.01), a main effect of the presence of a second gap (F1(1,31) = 49.5; MSE: 77.0; p < 0.0001), and, crucially, an interaction of the two factors (F1(1,31) = 11.4; MSE: 14.3; p < 0.01). In the coordinate VP condition, ratings were substantially lower if there was no gap in the second coordinate (p < 0.001).
This pattern confirms that participants were sensitive to the CSC. P-gap conditions were highly rated, and there was no difference between the two conditions with multiple gaps. This result mirrors an earlier finding for subject p-gap constructions (Phillips 2006), and neutralizes the potential concern that p-gaps are somehow marginal structures.

Surprisingly, there was a moderate decline in ratings if no gap was present in the adjunct clause (p < 0.05). P-gap and non-p-gap materials were predicted to be equally highly rated, so it was unexpected that the gapless adjunct conditions were rated slightly lower than their p-gap analogues. It is worth emphasizing the small size of this effect relative to the drop in ratings observed for CSC violations. The mean difference, normalized against variance, between p-gap and non-p-gap continuations (Cohen's d) was 0.4, much smaller than the corresponding difference in coordinate structures (d: 1.2). We suspect that this difference may reflect a bias in the construction of materials. P-gaps seem most felicitous when there is a close relation between the events or states expressed in the main and adjunct clauses. Creating a non-p-gap analogue in the adjunct clause meant inserting a theme argument that was necessarily closely related to the displaced theme of the main clause. It may therefore have seemed to experimental participants an awkward way to express an idea more naturally expressed by a p-gap or even a coordination.

Irrespective of the cause of this small difference, this result strengthens the logic of the online study. If speakers fail to actively construct a second gap in the adjunct conditions, despite the high acceptability of a p-gap in this structure, then this would show that active dependency formation is not merely motivated to derive a natural interpretation from the input, but interacts strongly with grammatical principles.
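The list construction described for the acceptability rating study (12 item sets in 4 conditions, distributed by a Latin Square across four lists, combined with 12 fillers, and permuted into two orders per list) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the condition labels, function names, and seeds are hypothetical, and a plain seeded shuffle stands in for whatever pseudo-randomization constraints were applied.

```python
import random

# Hypothetical labels for the four conditions in (114a-d).
CONDITIONS = [
    "coordinate_gap",
    "coordinate_no_gap",
    "adjunct_gap",
    "adjunct_no_gap",
]


def latin_square_lists(n_items, conditions):
    """Rotate conditions over items so that each list contains every item
    exactly once, and each item appears in a different condition per list."""
    n = len(conditions)
    lists = [[] for _ in range(n)]
    for item in range(n_items):
        for list_index in range(n):
            condition = conditions[(item + list_index) % n]
            lists[list_index].append((item, condition))
    return lists


def pseudo_randomize(trials, seed):
    """Return a shuffled copy of the trials (assumption: simple seeded shuffle)."""
    rng = random.Random(seed)
    shuffled = list(trials)
    rng.shuffle(shuffled)
    return shuffled


target_lists = latin_square_lists(12, CONDITIONS)

# Six acceptable and six unacceptable fillers, shared by all lists.
fillers = [
    (f"filler_{i}", "acceptable" if i < 6 else "unacceptable")
    for i in range(12)
]

# Two orders per list; the seeds 0 and 1 are arbitrary placeholders.
questionnaires = [
    pseudo_randomize(target_list + fillers, seed)
    for target_list in target_lists
    for seed in (0, 1)
]
```

Under this rotation each participant sees every item set once and three item sets per condition, which is what allows condition effects to be estimated within participants without repeating any sentence frame.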
Plausibility Rating Study

At the conclusion of the on-line study, 24 of the 37 participants were presented with a questionnaire containing 24 sentences, and asked to rate each on a five-point scale for plausibility, with "5" being the most plausible and "1" the least plausible. These data were collected to confirm the effectiveness of the plausibility manipulation. The sentences were simple SVO clauses derived from the critical VPs from the on-line study (e.g. The gourmets discussed the {wine/cheese}, The gourmets drank the {wine/cheese}). The rating study used 4 conditions in a 2 × 2 design, crossing the factors filler type (Plausible, Implausible) and verb type (first vs. second VP in the target items). The four resulting conditions for the 24 item sets were distributed by a Latin Square across four lists, each of which was then permuted into two pseudo-randomized versions.

On-line materials had been designed such that all fillers should be plausible at the first verb position, but should differ in plausibility at the second verb position. The rating study results confirmed this manipulation, as there was a strong interaction between filler class and verb class (F1(1,23) = 70.6; MSE: 160.4; p < 0.0001). Sentences containing first-position verbs were rated equally highly, regardless of whether the object corresponded to a Plausible or an Implausible-class filler (mean: 3.9, difference n.s.). Whereas the average rating for Plausible-class objects, as objects of the second-position verb, remained high and consistent with the first-verb ratings (mean: 3.9), the average rating for Implausible-class direct objects was much lower (mean: 1.8, p < 0.001). We conclude that the filler class by verb position manipulation met the desired specifications.

Procedure and Analysis

The procedure followed was identical to Experiment 7. Self-paced reading times for experimental sentences were examined region-by-region.
Sentences were aligned word-for-word up to the second verb, such that each ordinal word position corresponded to a separate region. Evaluation of statistical reliability was carried out by repeated measures analysis of variance.53 Both participants and items analyses are presented in appendix tables. In the text, however, only the participants analysis is given (which has been argued to be the correct, sufficient test statistic for counterbalanced designs such as our own: Raaijmakers, et al. 1999).

53 In the rest of the text, we also used linear mixed effects models (LMEM), which are in many respects superior to repeated measures analysis of variance (see Baayen, Davidson, & Bates, submitted). However, RMANOVA reports for Experiments 7, 8 and 11a have been submitted for publication, so we present those analyses here. LMEM, with participant and item random effects, generally give convergent test results with RMANOVA.

4.4.3.1 Results

Comprehension question accuracy for the target sentences was high (average: 88.8%) and did not differ reliably across conditions.

Figure 4-2 presents the region-by-region condition means for regions 7-19. The omnibus repeated measures ANOVA report is given in Appendix C. Region-by-region condition means and test results for Regions 1-6 are not reported, as materials did not differ across the structural manipulation. Materials did differ in the plausibility manipulation, as the filler was introduced in Region 2. However, no effect of filler type was observed in that region, or in any region before Region 8.

Figure 4-2 Experiment 7 Region-by-Region Reading Times
Regions: (The wines/cheeses which the gourmets were) 1-6; energetically 7; discussing 8
Coordinate: or 9; slowly 10; sipping 11; during 13; the 14; banquet 15; were 16; rare 17; imports 18
Adjunct: before 9; slowly 10; sipping 11; (the samples)/(some wine) 12-13; during 14; the 15; banquet 16
Region-by-region reading times from the onset of the second VP to 8 regions beyond the critical verb. Punctuation indicates the result of a pairwise by-participants RMANOVA: ** p < 0.01; * p < 0.05; marginal effects (p < 0.10) are also marked.

Preceding the second verb. Unremarkably, there were no main effects or interactions at Regions 7-8, the adverb and verb of the first VP. Materials were constructed to be structurally identical in these regions, and the plausibility norming survey reported in the Materials section confirmed that both types of fillers were equally plausible as the direct object of the verb in Region 8. Materials diverged structurally in Region 9, which consisted of a coordinator for Coordinated VPs and a preposition for Adjunct Clauses, and there was a reliable increase in reading times for Adjunct Clauses in this region and in Region 10, the second VP adverb.

The second verb. No reliable effect of VP STRUCTURE, PLAUSIBILITY, or their interaction was observed at the critical second-VP verb in the participants analysis. A reliable main effect of PLAUSIBILITY was observed in the items analysis, due to plausible sentences being read slightly more slowly, although this was not reliable in pairwise comparisons.

Second VP post-verbal region. In the Coordinated VP conditions, implausible sentences were read more slowly in all regions subsequent to the critical verb, reaching significance two words downstream of the critical verb. In Adjunct Clause sentences, no consistent effect of plausibility was observed. In the region two words beyond the critical verb, there was a reliable interaction of plausibility and VP structure, due to slower reading times for implausible-filler sentences in coordinate VP conditions, and an opposite tendency in adjunct VP conditions. Planned pairwise comparisons in this region revealed a highly reliable slowdown due to implausibility for Coordinated VPs (F1(1,35) = 7.87; MSE: 49793; p < 0.01). The opposite reading time pattern was observed in Adjunct Clauses, but it was only marginally significant (F1(1,35) = 2.92; MSE: 106682; p < 0.10).
4.4.3.2 Discussion

Two conclusions follow from the reading time results in Experiment 7. Firstly, in Adjunct Clause VPs the lack of a slowdown due to implausibility suggests that dependency completion does not proceed actively in those environments. As no gap occurs in these conditions, there is no bottom-up evidence to prompt dependency completion. Therefore the plausibility comparison within this condition constitutes a fair test of purely active dependency completion both at the verb and in the following regions.

Second, for Coordinated VPs, the presence of a reliable reading time slowdown for implausible fillers in the immediate post-verbal regions shows that filler-gap dependencies are constructed in the second VP. However, this finding does not provide definitive evidence for active dependency formation, because of the timing of the effect. By the time the effect becomes reliable, there is bottom-up evidence for a gap in the form of missing constituents, and therefore dependency completion could have been cued from the input. Nonetheless, since spill-over effects are commonly observed in self-paced reading, this slowdown could reflect anomaly detection fed by active dependency completion.

If the effect in Coordinate VPs were indicative of active dependency completion, then we could conclude that the parser is sensitive to the Coordinate Structure Constraint, such that it recognizes when a filler that has already been successfully integrated with the first verb must participate in subsequent dependencies. This conclusion would be consistent with a grammatical licensing parser, in which active dependency formation is driven by the need to satisfy grammatical requirements. On the other hand, if the observed plausibility effect in Coordinate VPs reflects non-active bottom-up processes, then the difference between the plausibility contrasts in Coordinated VPs and the Adjunct Clause may have a more mundane explanation: there is a gap in one structure, but not the other.
In order to determine whether active dependency completion in fact persists in coordinate VP environments, and thus tease apart the two possible interpretations of Experiment 7, Experiment 8 was designed such that effects of spill-over and of bottom-up gap-detection would be well separated in the time course of reading.

4.4.4 The Coordinate Structure Constraint and Active Dependency Formation II (Experiment 8)

In Experiment 7, the closeness of the critical verb and bottom-up evidence for a gap led to an ambiguous result. In order to separate the verb from the gap, ditransitive verbs were used in Experiment 8, as illustrated in (115).

(115) The adhesive coating that the engineer sprayed the special test surfaces with ___ in his new laboratory ...

In this example, the verb "spray" subcategorizes for two internal arguments. When the second argument is relativized, the regions immediately following the verb do not provide evidence for a gap site. In a semantic fit manipulation, a slowdown due to implausibility could be observed at or beyond the verb but before bottom-up evidence for a missing constituent. It is thus possible to avoid the confound seen in Experiment 7. If the slowdown occurs at the verb or in the direct object regions, we can conclude that the parser actively completed the dependency, since it had to project the gap site before the input unambiguously signaled its location.

In this experiment we used coordinate VPs in which the second verb participates in spray/load-type locative constructions. As there was no implausibility effect in the Experiment 7 Adjunct Clause conditions, there were no such conditions in this experiment. Instead, the coordinate VPs were compared with length-matched conditions containing a single filler-gap dependency. This made it possible to compare patterns of dependency formation in a second coordinate VP with dependency completion in a single dependency, which was expected, uncontroversially, to be active.
4.4.4.1 Materials and methods

Participants

The participants were thirty-two native speakers of American English from the University community, who were paid $10 to participate.

Materials and Procedure

Experimental materials consisted of 24 sets of 4 conditions organized in a 2 × 2 factorial design that independently manipulated the factors VP structure and plausibility. An example materials set is given in Table 4-4.

VP Structure: Coordinated VP
  Plausible:    The adhesive coating that the talented engineer designed ___ for his boss and methodically sprayed the special test surfaces with ___ in his new laboratory could make the company lots of money.
  Implausible:  The computer program that the talented engineer designed ___ for his boss and methodically sprayed the special test surfaces with ___ in his new laboratory could make the company lots of money.
VP Structure: Single VP
  Plausible:    The adhesive coating that the talented engineer from the high-tech aerospace firm methodically sprayed the special test surfaces with ___ in his new laboratory could make the company lots of money.
  Implausible:  The computer program that the talented engineer from the high-tech aerospace firm methodically sprayed the special test surfaces with ___ in his new laboratory could make the company lots of money.

Table 4-4 Sample materials set for Experiment 8

The VP structure factor manipulated the VP structure containing the critical verb. Coordinate VP sentences contained two coordinated VPs, as in Experiment 7. The critical verb was the second verb in the coordinate. Single VP sentences contained only a single verb. A five-word PP modifier was attached to the relative clause subject in this condition, so that the ordinal position of the critical verb matched the position of the second verb in the Coordinate VP conditions.

The plausibility factor manipulated the semantic fit of the filler with respect to the critical verb by creating two classes of fillers.
Plausible fillers were plausible as the direct object of both the first and second verb. Implausible fillers were plausible as the direct object of the first verb, but implausible as the direct object of the second verb. As in Experiment 7, the plausibility manipulation provided a measure of dependency formation.

Syntactically alternating locative verbs in the spray/load class were selected for the critical verb in each item set (Anderson, 1971; Fraser, 1971; Pinker, 1989; Rappaport & Levin, 1986 inter alia). Verbs from this class tend to impose greater semantic restrictions on both of their arguments than do simple datives, like give, or benefactives, like buy. This feature made it feasible to design a large number of items with a semantic fit manipulation that applies to the filler, regardless of the internal position it occupies. It is important that the filler be implausible both as direct and as oblique object. Previous research has suggested that if a verb has multiple syntactic frames or argument positions, then fillers that are solely implausible as a direct object do not elicit a slowdown in filler-gap constructions (Boland et al., 1995; cf. Pickering & Traxler, 2003). Consider the verbs below, with arguments in the specified configuration:

(116) splash/spray/sprinkle/spread NP1 with NP2

For these verbs, NP1, referred to as the ground argument, must typically be a concrete entity, while NP2, referred to as the figure argument, must typically be either a liquid, plastic or particulate substance. There is nothing crucial about what the semantic selectional restrictions are, just that they tend to exist for both arguments in spray/load verbs.

Twelve spray/load verbs were chosen from Levin (1993) as the critical verbs. Each verb was used in two item sets. The first argument of the critical verb was always four words long, providing a large region between the verb position and the first direct evidence for a gap position.
Any slow-down due to implausibility observed within this spill-over region could be attributed to active dependency formation, since direct evidence for the gap does not occur until after the subcategorized preposition "with." The bottom-up cue for the gap was very strong, as two prepositions occurred in sequence ("sprayed the special test surfaces with in his new ...").

Three further design constraints applied. As in Experiment 7, pre-verbal adverbs were used in all VPs. Unlike Experiment 7, all verbs appeared in simple past tense form. The progressive verb forms used in Experiment 7 were required by the Adjunct Clause conditions included in that study, which were not present in Experiment 8. Finally, the complementizer that was used to signal the onset of the relative clause, instead of the pronouns who/which. Both the complementizer and the relative pronoun serve as effective signals to the parser for a relative clause, and plausibility effects are obtained in both environments (First author, unpublished pilot results). In the context of these materials, the complementizer was judged to be more natural.

Seventy-two fillers were adapted from the fillers in Experiment 7, so that the distribution of sentence lengths in fillers matched the distribution of target items.

Procedure and Analysis

Procedure and analysis were identical to Experiment 7. One participant who failed to perform the task as instructed was removed from further analysis.

Plausibility Rating Study

At the conclusion of the on-line study, 16 of the 32 participants were presented with a questionnaire containing 24 sentences, and were asked to rate each on a five-point scale for plausibility. The sentences were simple SVO versions of the VPs from the on-line study, and the design was the same as Experiment 7. Sentences containing first-position verbs were rated equally highly, regardless of whether the object corresponded to Plausible or Implausible fillers (mean for both 4.2; difference n.s.).
The average rating for Plausible-class objects as objects of the second-position verb remained high (mean: 3.7), but the average rating for Implausible-class direct objects was considerably lower (mean: 1.5, p < 0.0001). Thus, the filler class by verb position manipulation met the desired specifications.

4.4.4.2 Results

Question-answering accuracy was uniformly high. For the 24 experimental targets, accuracy was 92.3% overall. There were modest and reliable differences due to the VP STRUCTURE and PLAUSIBILITY manipulations. For coordinate structures accuracy was 95.7% for plausible conditions and 90.3% for implausible conditions; for single VP controls accuracy was 89.2% for plausible conditions and 94.1% for implausible conditions. A logistic mixed-effect model estimated that all factors had a log odds ratio significantly different from zero. In contrast to Experiment 7, this model was significantly better than a null model that attributed all variation to participants, with no condition effects (χ²(12) = 2765.0; p ≈ 0).

Figure 4-3 presents the region-by-region condition means, segregated into two pair-wise comparisons for regions 6 to 20. These regions extend from the lexical offset of the embedded subject head noun to 3 words beyond the preposition heading the critical verb's figure argument. The omnibus repeated measures ANOVA report is given in Appendix C for Regions 6-20. Materials did not differ in Regions 1-5, up to the lexical offset of the subject head noun, apart from the filler manipulation. No RT differences were observed in those regions.

Figure 4-3 Experiment 8 Region-by-Region Reading Times
Regions: (The adhesive coating/computer program that the talented engineer) 1-5
Coordinate: designed 6; for 7; his 8; boss 9; and 10; methodically 11; sprayed 12
Single VP: from 6; the 7; high-tech 8; aerospace 9; firm 10; methodically 11; sprayed 12
Both: the 13; special 14; test 15; surfaces 16; with 17; in 18; his 19; laboratory 20 ...
Region-by-region reading times from the offset of the subject head noun to 8 regions beyond the critical verb, with example text for each region. Punctuation indicates the result of a pairwise by-participants RMANOVA: ** p < 0.01; * p < 0.05; marginal effects (p < 0.10) are also marked.

Preceding the second verb. Materials diverged at the offset of the subject head noun: for Coordinate VP sentences, a verb followed the subject noun, and for Single VP sentences, a PP followed the subject noun. Accordingly, reading times differed as a function of VP structure, beginning three words downstream from the subject head noun, in Region 8, and persisting until three words downstream, in Region 14.

The second verb. At the second verb, in addition to the effect of VP structure described above, there was a clear effect of plausibility, due to slower reading times for Implausible filler conditions. However, Coordinate VP sentences showed this contrast most strongly, with a variance-normalized mean difference of 0.18, compared to 0.04 for Single VP sentences. Moreover, as Figure 4-3 shows, the plausibility contrast was robust in pairwise comparisons for Coordinate VPs (F1(1,30) = 5.63; MSE: 71964; p < 0.05) but not for Single VPs (F1(1,30) < 1). However, the interaction of plausibility and VP structure was not significant in this region (F1(1,30) < 1).

The ground argument region. In the ground argument regions following the critical verb (Regions 13-16), we observed persistent effects of plausibility, especially at the determiner in Region 13 and at the noun in Region 16. The slow-down due to implausibility was of comparable size for both coordinate VPs and long single VPs in Region 13 (in raw ms). The effect was reliable in pairwise comparisons for Coordinate VPs (F1(1,30) = 7.19; MSE: 45776; p < 0.05), but only marginally so for Single VPs (F1(1,30) = 3.68; MSE: 56898; p < 0.10). Region 14 showed no effect of plausibility. Region 15 showed an effect only for Single VPs (F1(1,30) = 7.59; MSE: 59747; p < 0.01).
Region 16 showed a strong effect of implausibility for Coordinate VPs (F1(1,30): 10.36; MSE: 75008; p < 0.01), and a much weaker, and unreliable, effect for Single VPs (F1(1,30): 2.50; MSE: 29611; p = 0.12).

The gap region. An effect of plausibility was present in Regions 17-20, which corresponded to the regions containing the preposition that selects the gap site and subsequent regions. Since Regions 17-18 were two prepositions in sequence (e.g. "with before") they provided clear evidence of a missing constituent. Once again, the size and location of specific effects differed slightly across structural conditions, with Coordinate VP sentences showing contrasts in more regions, and displaying the largest contrast in Region 20.

4.4.4.3 Discussion

The reading time results from Experiment 8 showed that an implausible filler led to a slowdown at the second verb in a Coordinate VP structure and in subsequent regions in its argument field. The timing of this slowdown provides evidence that the second gap in coordinate structures is constructed actively, because it occurs unambiguously before the direct evidence of a gap position in the second coordinate. The use of spray/load verbs made it possible to put sufficient distance between the verb and the gap position, such that we can confidently interpret the slowdown as an effect of active dependency formation.

In Experiment 8, in comparison to Experiment 7, the effect of filler-verb plausibility appeared on the verb itself, and not one or two words downstream. There were differences in the experimental materials that could explain why the effect emerged earlier in Experiment 8. One important difference is that in Experiment 7 the second verb occurred only two words after the first verb, whereas in the Experiment 8 materials a short 3-word PP occurred in the first VP.
To see why this could make a difference, consider that in order to detect implausibility comprehenders must not only posit a gap location, but must also retrieve and integrate the filler syntactically and semantically. Even when there is no disruption in dependency formation, many processes must take place. It is more likely that these processes could have persisted to the second verb in Experiment 7 than in Experiment 8, and thus could have delayed the emergence of an implausibility effect by one or two words.

A surprising finding in this experiment was that implausible fillers had a more disruptive effect on processing in a second coordinate than in the single dependency control conditions. In Experiment 11 (section 4.6.1), we delve into the reasons for this in greater detail.

4.4.5 General discussion of Experiments 7 & 8

Experiments 7 & 8 tested whether the parser persists in actively and incrementally constructing gaps in multiple-gap dependencies, even after the first filler-gap relationship has been successfully constructed. The goal of the study was to assess whether grammatical constraints actively direct the formation of wh-dependencies, controlling potential retrieval events in a top-down fashion. Two kinds of multiple-gap dependencies were compared: across-the-board extraction from coordinate VP structures, and parasitic gaps inside post-verbal adjunct clauses. Crucially, multiple gaps are obligatory in coordinate structures, but parasitic gaps are always optional. In Experiment 7 these generalizations were confirmed in an off-line rating study. The self-paced reading studies in Experiments 7 and 8 showed a strong effect of the semantic fit between the wh-phrase and the verb in the second coordinate of a coordinated VP. The use of the ditransitive spray/load-type verbs in Experiment 8 confirmed that this effect emerged before direct evidence for the gap position.
These results suggest that comprehenders re-engage in active dependency completion when they detect a coordinate structure containing a gap. Comprehenders are sensitive in real time to the grammatical implications of building multiple gap constructions.

The experiments reported here add to previous findings that comprehenders are highly accurate in locating the tail of a wh-dependency. The advantage of testing the Coordinate Structure Constraint was that, because the index of dependency formation was a positive signal, we were able to avoid the specter of null effects stemming from an overload in complexity, to which other island studies are liable. The grammatical generalizations about across-the-board extraction and parasitic gaps actively guide parsing decisions. The contrast between obligatory and optional gap environments strongly suggests that fillers are retrieved at the prompting of a parsing mechanism that keeps track of outstanding requirements in the syntactic context. Encountering a potential parasitic gap environment after completing a filler-gap dependency does not re-engage dependency completion mechanisms, despite the fact that an additional gap is fully acceptable and that doing so would saturate the argument structure of the verb sooner than waiting for the input.

4.5 The fidelity of retrieval in wh-dependency formation

4.5.1 Introduction

Experiments 7 and 8 concerned grammatical accuracy in locating the tail of a dependency. That the comprehender is accurate in locating the tail of a wh-dependency is consistent with the generalization offered in section 4.2 that predictive dependencies are grammatically most faithful. The next question is whether that fidelity extends not only to postulating a gap, but also to recovering the filler itself. Even if the parser anticipates legal gap sites or carries forward a small amount of information to license them, it must still recover the full filler encoding at some point in comprehension.
Van Dyke & McElree (2006) argued that multiple candidates compete for the retrieval cues provided by the verb in filler-gap dependency completion (section 4.3.3). This is a natural consequence of the memory architecture, if there is overlap between the encoding of the filler representation and other constituents in memory. We raised a few concerns over their findings, however. Firstly, we asked whether their memory load manipulation specifically influenced filler-gap dependency formation, or whether the effect arose elsewhere, such as in lexical access. Secondly, we questioned whether an appropriate, linguistically motivated feature structure could be devised in which the memory load items were serious competitors for a full match.

In Experiments 9 and 10 we examine syntactic configurations that offer a tighter test of the hypothesis that similar constituents induce similarity-based interference during filler-gap dependency completion. We conjectured that for a verb retrieving a filler, the strongest competitors for that filler would be constituents that are fillers in other dependencies, or, specifically, other wh-elements in clause-edge positions. For example, consider the sentence in (117). The critical verb is "revealed", which hosts the gap for the extracted wh-phrase "what" in the embedded question.

(117) The biographer asked what the idea [that the professor often defended to his colleagues] potentially revealed ____ about his character.

By hypothesis, at "revealed," the parser initiates a retrieval for the filler. However, a relative clause was recently processed (bracketed), which also contained a filler-gap dependency. The question posed is whether "reveal" would encounter difficulty retrieving "what" because there is also a filler in the edge of the relative clause (like an unpronounced copy of idea or its coindexed operator). This filler, which corresponds to the relative clause head "idea", is a reasonable semantic match for "reveal,"
but more importantly it has most of the syntactic properties that would license it as theme for "reveal." Crucially it occupies the same local syntactic position [Spec,CP] as "what," or putatively has a shared feature like [+wh]. What it lacks is relational: it is not in a c-command relationship with "reveal" (or its gap) or in the edge of the immediate clause that dominates "reveal." It is therefore plausibly a full match in terms of the inherent features it bears. Alternatively, it is a near full match, if the retrieval structure includes a Clause/Context cue (as we proposed for agreement attraction in Chapter 3).

4.5.2 Experiment 9

In Experiment 9 we test for difficulty that can be attributed to similarity-based interference by crossing two experimental factors: whether the embedded question clause contains an extracted wh-phrase (what v. if), and whether the intervening complex subject is a relative clause or a sentential complement. If retrieval identifies multiple candidate filler phrases based on feature overlap, then resolving the filler-gap dependency in an embedded wh-question should be harder when the intervening subject contains a relative clause than when it contains a sentential complement. If difficulty at the verb is truly due to resolving a filler-gap dependency, then this difficulty should be selective to embedded wh-questions, and not observed in embedded if-clauses.

4.5.2.1 Materials and methods

Participants
Participants were 31 native speakers of English from the University of Maryland community with no history of language disorders. Participants were paid $5 for their participation.

Materials
24 item sets were created with sentences containing an embedded clause whose subject was complex. Two experimental factors were crossed: embedded clause type (wh, if) and interference load of the complex subject.
Interference load was defined as whether or not the complex subject contained a filler-gap dependency: either an object relative clause (high interference) or a sentential complement (low interference).

Sample set of experimental items for Experiment 9
All conditions begin: The biographer asked ...
Wh-question, High interference: ... what the idea that the professor often defended ___ to his colleagues potentially revealed about his character.
Wh-question, Low interference: ... what the idea that the professor often deferred to his colleagues potentially revealed about his character.
If-question, High interference: ... if the idea that the professor often defended ___ to his colleagues potentially revealed anything about his character.
If-question, Low interference: ... if the idea that the professor often deferred to his colleagues potentially revealed anything about his character.
Table 4-5 Sample materials set for Experiment 9

The materials were distributed across 4 lists by a Latin Square. Each participant would therefore see six items per condition.

Procedures and Analysis
Presentation and analysis details were as described for Experiment 6.

4.5.2.2 Results

Comprehension accuracy is reported in Table 4-6.

                         Interference
Embedded clause type     High    Low     Overall
Wh-question              73%     87%     80%
If-question              82%     81%     81%
Overall                  77%     84%     81%
Table 4-6 Comprehension question accuracy for Experiment 9

In comprehension accuracy there was a main effect of embedded clause type (β: 0.59 ± 0.51; p < 0.05), such that if-questions were overall answered more accurately; a main effect of interference load (β: 1.1 ± 0.56; p < 0.001), such that low interference conditions were overall answered more accurately; and an interaction of the two factors (β: -1.2 ± 0.78; p < 0.005). This interaction reflects the fact that the interference factor only affected responses in embedded wh-clause conditions.

Reading time data are reported in Figure 4-4 for wh-clause conditions and Figure 4-5 for if-clause conditions. Effects are reported only for reliable differences and for the critical verb.
Region 11: Complex subject clause verb. Reading times at the verb inside the intervener clause showed a main effect of embedded clause type: reading times were slower at the intervener verb inside wh-clauses (difference: 35 ms; 95% C.I. [2 ms, 66 ms], p < 0.05). Although there is a sizeable numerical slowdown associated with the intervener verb in High interference conditions (difference: 21 ms), it is not reliable (95% C.I. [-22, 80], p < 0.3). At the intervener verb in High interference conditions, the intervener clause cannot unambiguously be identified as a relative clause, though the transitive verb is a strong cue (cf. Solomon & Mendelsohn, 2004).

Region 16: Critical verb. At the critical verb, there were no reliable effects. A slowdown was observed for low interference conditions that was nearly marginal (difference: 25 ms; 95% C.I. [-20 ms, 60 ms], p ~ 0.10). Restricting comparison to wh-clauses, i.e. those conditions in which a filler-gap dependency is resolved at the verb, reading times are again numerically slower for the low interference condition, though the difference is also not reliable (difference: 24 ms; 95% C.I. [-28 ms, 58 ms]).

Regions 19-20: Sentence-final regions. Region 19 shows a main effect of clause type, such that if-clauses are read much faster than wh-clauses (difference: 125 ms; 95% C.I. [96 ms, 157 ms], p < 0.001). This difference likely reflects the lexical differences between the two conditions: in the if-clause, the Region 19 lexical item is a determiner or possessive pronoun whereas in wh-clauses it is a noun. Conditions were lexically unmatched following the critical verb, since if-clauses by necessity had an overt argument whereas wh-clauses had a gap. Region 20 also shows a main effect of clause type, with if-clauses being read much faster than wh-clauses (difference: 75 ms; 95% C.I. [24 ms, 162 ms], p < 0.05). As in Region 19, the two levels of clause type are not matched lexically.
Figure 4-4 Experiment 9 Reading time results: Wh-clause Conditions
The(1) biographer(2) asked(3) what(4) the(5) idea(6) [ that(7) the(8) professor(9) often(10) defended/deferred(11) to(12) his(13) colleagues(14) ] potentially(15) revealed(16) about(17) his(18) character(19) ...(20)
(Figure annotations: embedded clause, intervening region, critical verb.)

Figure 4-5 Experiment 9 Reading time results: If-clause Conditions
The(1) biographer(2) asked(3) if(4) the(5) idea(6) [ that(7) the(8) professor(9) often(10) defended/deferred(11) to(12) his(13) colleagues(14) ] potentially(15) revealed(16) anything(17) about(18) his(19) character(20)
(Figure annotations: embedded clause, intervening region, critical verb.)

4.5.2.3 Discussion

In this experiment we tested whether or not the presence of an additional, but irrelevant, filler-gap dependency would make resolving a target filler-gap dependency more difficult. On the hypothesis that the fillers at the edge of both the embedded clause and relative clauses were substantially similar, a retrieval-based account of filler-gap completion predicts an interaction between clause type and interference load. The reading time data did not bear out this prediction at the critical verb. There was no indication that completing an embedded wh-clause filler-gap dependency led to greater processing times when a relative clause had recently been processed. As we noted in Chapter 3, though, participants may trade speed for accuracy, or simply accept the wrong constituent and never verify the c-command relationship. If so, then the interference should be reflected in comprehension accuracy. Indeed, there we saw a selective effect of interference load on wh-clause conditions only, in the expected direction.

Interpreting the discrepancy between on-line and off-line data is somewhat complicated, though. The on-line data are localized to specific sentence regions, whereas the off-line data reflect the outcome of numerous processing events.
This concern is relevant, because we observed selective difficulty in embedded wh-clause/relative clause conditions prior to the critical verb. We observed increased difficulty in the subject-attached clause for relative clauses. Moreover, this difficulty was greatest for the verb inside a relative clause contained in an embedded wh-clause (though this effect was near-marginal). At the point of processing the verb, the relative clause and sentential complement analyses cannot be disambiguated; however, the use of transitive verbs in the relative clause condition could have been a strong cue to differentiate the two analyses, either generally, or particularly if picked up in the experimental context. Pearlmutter & Mendelsohn (2000) have argued that in relative clause/sentential complement ambiguous strings, the relative clause analysis is at least as strong a competitor as the sentential complement. This ambiguity in analysis, combined with the search for licit retrieval sites, could explain why comprehenders experienced selective difficulty at the RC-contained transitive verbs inside embedded wh-clauses. If we grant this much, then the difficulty observed in the relative clause region could account for the interaction in the comprehension accuracy data.

Offline and online data were at odds in this experiment. The online data suggest that the mere presence of other filler-like constituents did not make resolving the filler-gap dependency at the critical verb more difficult. The offline data, however, are consistent with this prediction, though somewhat equivocal because they do not localize the source of the difficulty. In Experiment 10, we attempt the same basic manipulation, though without having nested dependencies.

4.5.3 Experiment 10

In this experiment, we created a configuration in which the interference load region was not nested inside the target filler-gap dependency.
To do so, we embedded the critical filler-gap dependency in a relative clause attached to the sentence object. The interference region was defined as the relative clause attached to the sentence subject. The configuration is illustrated schematically below:

(118) Subj [RC -- Interference region -- ] V Obj [RC ... V_Critical ___ ]

The critical filler-gap dependency was an object extraction from inside a full relative clause. The sentence subjects were made high interference regions by attaching a full relative clause, as in (119a). Consequently in high interference structures there were (putatively) two relative clause operators occupying identical structural positions. To create a low interference region, we used subject infinitival relatives, as in (119b), which have been argued to be reduced relative clauses (Kjellmer, 1975; Bhatt, 1999; cf. Kayne, 1994).54

(119) (a) High Interference / Object Relative Clause
          The brightest student_i [CP Op_i that ___i took the test ] wrote an essay_j [CP Op_j that the instructor praised ____j for its mature style]
      (b) Low Interference / Object Relative Clause
          The brightest student_i [PredP ___i to take the test ] wrote an essay_j [CP Op_j that the instructor praised ____j for its mature style]

54 We thank Alan Munn for suggesting this comparison.

There are clear distributional differences between subject infinitival relatives and subject full relatives that lead us to assume the filler in a full relative clause is more similar to the filler in the target region than the head of the subject infinitival is. Bhatt (1999) argues that subject infinitival relatives do not involve A′ movement to the Spec of a [+wh] C. This analysis is supported by a pattern of observations that subject infinitivals lack crucial properties characteristic of A′ dependencies in full relatives. Comparison with non-subject infinitival relatives reveals this is not a property of the infinitive per se.
In comparison to both full relatives and non-subject infinitival relatives, subject infinitival relatives do not allow a complementizer, a relative pronoun, or long-distance movement. As well, unlike full relatives and non-subject infinitival relatives, there is no way for an [Op, t] chain to receive case in subject infinitivals. Whether Bhatt's analysis is necessary to capture these differences is not crucial: for our purposes what is crucial is that the filler's encoding in subject infinitivals be distinct from the filler's encoding in full relatives. Let us assume for discussion that the relevant distinction is that full relative clause fillers are marked [+wh], that subject infinitival fillers lack this feature, and that this feature is highly weighted in the retrieval structure used in filler-gap dependency completion.

As a control condition we replaced the object-attached relative clause with a coordinate clause continuation containing the same subject and verb. A pronoun was inserted in object position, so that the control sentences would express the same thematic relations:

(120) The brightest student_i [that took the test]/[to take the test] wrote an essay and the instructor praised it for its mature style.

Crucially, "praised" in the control conditions does not require a retrieval operation, as it is not within a filler-gap dependency. Therefore we can test for an interaction between interference load and filler-gap dependency completion. If the system uses a set of generic structural cues to retrieve the filler during active dependency completion, the grammatically unavailable filler in the high interference sentences should compete with the actual filler for the dependency.

4.5.3.1 Materials and methods

Participants
Participants were 32 native speakers of English from the University of Maryland community with no history of language disorders. Each received partial course credit in an introductory linguistics class.
Materials
24 item sets were created with sentences containing an embedded clause whose subject was complex. The experimental factors were critical verb clause (object relative, coordinated clause) and interference load of the complex subject. Interference load was defined as whether or not the complex subject contained a full relative clause dependency: either there was a full relative (high interference) or an infinitival relative (low interference).

Sample set of experimental items for Experiment 10
All conditions begin: The brightest student ...
Object relative, High interference: ... that took the test wrote an essay that the instructor praised ___ for its mature style.
Object relative, Low interference: ... to take the test wrote an essay that the instructor praised ___ for its mature style.
Coordinated clause, High interference: ... that took the test wrote an essay and the instructor praised it for its mature style.
Coordinated clause, Low interference: ... to take the test wrote an essay and the instructor praised it for its mature style.
Table 4-7 Sample materials set for Experiment 10

Superlatives and ordinals like "brightest" or "first" were used as prenominal modifiers of the subject across all conditions. We wanted to make interpretations as similar as possible across conditions. However, subject infinitival relatives permit modal interpretations (i.e., "the man to answer your question is Bill" ≈ "the man who should/can answer your question is Bill"). These modal interpretations can be quashed under ordinals, superlatives and only (i.e. "the first man to answer your question was Bill" ≈ "Bill was the first man who answered your question"). Finally, critical verbs were chosen that were semantically compatible with either the grammatical or interfering filler (e.g., it is fine to praise either students or essays).

The materials were distributed across 4 lists by a Latin Square. Each participant would therefore see six items per condition.

Procedures and Analysis
Presentation and analysis details were as described for Experiment 6.
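The Latin Square distribution described above can be sketched as follows. This is a minimal illustration, not the procedure actually used to build the experimental lists: the condition labels, the function name `latin_square_lists`, and the simple rotation scheme are all assumptions made for the example. It shows why 24 item sets crossed with 4 conditions yield six items per condition on each list, with every item appearing in every condition exactly once across lists.

```python
from collections import Counter

# Hypothetical condition labels for the 2 x 2 design (clause type x interference).
CONDITIONS = ["ORC-high", "ORC-low", "Coord-high", "Coord-low"]


def latin_square_lists(n_items, conditions):
    """Return one dict per list, mapping item number -> condition.

    Assumed scheme: the condition of each item is rotated by one
    position from one list to the next.
    """
    k = len(conditions)
    lists = []
    for list_index in range(k):
        assignment = {}
        for item in range(n_items):
            assignment[item + 1] = conditions[(item + list_index) % k]
        lists.append(assignment)
    return lists


lists = latin_square_lists(24, CONDITIONS)
for number, assignment in enumerate(lists, start=1):
    # Each list contains every item once and six items per condition.
    print(number, dict(Counter(assignment.values())))
```

Under this scheme each participant, assigned to a single list, sees each item set in only one of its four versions.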
4.5.3.2 Results

Comprehension accuracy is reported in Table 4-8. Overall accuracy was 93%. There were no significant differences between conditions. A model of the data with a single fixed coefficient could not be distinguished from the full model (χ²: 2.4, d.f.: 3, n.s.).

                         Interference
Critical verb            High    Low     Overall
Object relative clause   93%     95%     94%
Coordinated clause       91%     92%     92%
Overall                  92%     93%     93%
Table 4-8 Comprehension question accuracy for Experiment 10
Standard error of the cell means is 2% for all conditions.

Reading time data are reported in Figure 4-6 for object relative clause conditions and Figure 4-7 for the coordinated clause control conditions. Effects are reported only for reliable differences and for the critical verb.

Figure 4-6 Experiment 10 Reading time results: Relative clause conditions
The(1) brightest(2) student(3) [ to/that(4) take/took(5) the(6) test(7) ] wrote(8) an(9) essay(10) that(11) the(12) instructor(13) praised(14) for(15) its(16) mature(17) style(18).
(Figure annotations: load region, relative clause region, critical verb.)

Figure 4-7 Experiment 10 Reading time results: Coordinated clause conditions
The(1) brightest(2) student(3) [ to/that(4) take/took(5) the(6) test(7) ] wrote(8) an(9) essay(10) and(11) the(12) instructor(13) praised(14) it(15) for(16) its(17) mature(18) style(19).
(Figure annotations: load region, coordinated clause region, critical verb.)

Regions 1-2: Subject determiner and adjective. In Region 1 there was a reliable interaction between interference load and structural continuation factors (high:conj difference: -25 ms; 95% C.I. [-47, -1], p < 0.05). This interaction reflects a pairwise difference between interference load conditions for coordinate clause sentences, but not relative clause sentences. Because all conditions are exactly matched in the first region, this difference must be spurious. The difference persists into Region 2, where the same kind of interaction was observed (high:conj difference: -35 ms; 95% C.I. [-67, -7], p < 0.05). However, the RTs are not different in Region 3, corresponding to the subject head noun.
Region 5: Subject relative clause verb. In Region 5 there was a reliable main effect of interference load, such that full relative clauses were read more slowly than infinitival relative clauses (difference: 22 ms; 95% C.I. [1, 21], p < 0.05). Note that full relative clause verbs were longer and morphologically more complex in this region (e.g., V+ed v. V).

Region 6: Direct object determiner. In Region 6 there was also a reliable main effect of interference load, such that full relative clauses were read more slowly (difference: 21 ms; 95% C.I. [4, 37], p < 0.05).

Region 11: Conjunction/relative clause complementizer. In Region 11 there was a reliable main effect of clause continuation type, such that coordinated clause conditions were read more slowly (difference: 24 ms; 95% C.I. [4, 49], p < 0.05). This region corresponds to a conjunction, in coordinated clause conditions, or the object-attached relative clause complementizer, in relative clause conditions.

Region 14: Critical verb region. In Region 14, the critical verb region, there were no reliable effects. The crucial comparison is between interference load conditions in an object relative clause (where the filler-gap dependency can first be resolved), and there was no difference between the two load conditions (difference: 5 ms; 95% C.I. [-24, 20], n.s.).

Sentence-final regions. Region 18 shows a main effect of clause continuation type, such that coordinated clause continuations were read more quickly than relative clause continuations (difference: 75 ms; 95% C.I. [-47, 107], p < 0.001). Because this region occurs after the critical verb, the conditions are not lexically aligned. This comparison involves an adjective in coordinate clause continuations and the sentence-final noun in the relative clause continuations, so a difference is not surprising.
If we align the last four regions of the sentence, corresponding to the category sequence P-D-A-N, so that regions are lexically matched, then the only region in which the two clause continuation conditions differ is the P region: Region 15 for the relative clause condition, and Region 16 for the coordinated clause condition. Coordinated clause conditions were read more quickly (difference: 24 ms; 95% C.I. [9, 39], p < 0.005). Note that in relative clause sentences, the preposition occurs immediately after the verb, since there is a gap; in coordinate clause sentences, it follows an overt pronoun.

4.5.3.3 Discussion

In Experiment 10 we found no evidence, online or offline, that a structurally similar filler constituent interferes in dependency construction. Constructing a nearly identical dependency upstream (modulo the gap's case position) does not make filler-gap resolution more difficult, as reflected in reading times or comprehension accuracy. These data suggest that identifying the head of a filler-gap dependency is in fact a grammatically accurate process. We outline two mechanisms below.

4.5.4 Accurately identifying the head of a dependency

We propose two mechanisms below for accurately identifying the head of a filler-gap dependency. The first is specific to explaining our experimental data, and hinges on the idea that filler representations are re-encoded when they are successfully integrated at the gap. The second is more general, and supposes that some distinctive feature of the actual filler representation has been carried forward to target it later.

First let us return to the observation in Experiment 9 that there is increased complexity in nested filler-gap dependency constructions. The pattern of difficulty inside the RC, but no difficulty outside, suggests that whether or not the dependency is complete matters as to whether or not other fillers could interfere. Inside the RC, there are two open filler-gap dependencies, but outside the RC only one remains open.
This observation is not surprising, and is of course familiar as characteristic of the difficulty of too many center self-embeddings (Miller & Chomsky, 1963; Cowper, 1976; De Roeck et al. 1982; Lewis, 1996). One response to this observation has been to assume that the architecture must be so constrained that unintegrated dependency constituents tax processing (Gibson 1998; Gordon et al. 2006). If there is a capacity-limited memory space, then this constraint can be cashed out without much difficulty. The difficult cases run up against the bounds of available memory. However, in the architecture we are considering, where there is virtually no capacity limitation, the constraint must be cashed out in terms of cue competition. Either the cues must be different at the two verbs; or the encodings of the fillers must differ with respect to their status as open or closed; or both.

We asserted in the introduction that when the critical verb was processed, the encodings of the two fillers in the structure highly overlap in the relevant features, and thus the grammatically inaccessible head should compete with the grammatical one. However, our data have raised the possibility that the encodings of the fillers change over time, such that the head of a complete dependency is dissimilar from that of an open one. One linguistically motivated way of accounting for this re-encoding is to suppose that unintegrated fillers have a feature indicating that they lack a thematic role, i.e.: THETAROLE:Unmarked.55 When the dependency is completed, this feature is substituted with a THETAROLE:Theme feature (for example, or THETAROLE:Agent or even THETAROLE:Marked).

55 This encoding scheme would also be consistent with the principle-based parsing accounts of filler-gap processing, and in particular, Pritchett (1992), who argued active dependency formation was driven to license thematic structure.
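The effect of such re-encoding on cue competition can be illustrated with a toy calculation. The sketch below is not the model developed in Chapter 3: the feature names, the cue weights, the mismatch penalty of 0.2, the multiplicative combination of cues, and the ratio rule for turning match scores into retrieval probabilities are all assumptions chosen for illustration. It assumes that the retrieval cues at the critical verb include a highly weighted THETAROLE:Unmarked cue.

```python
def match_score(item, cues, mismatch=0.2):
    """Combine cues multiplicatively (an assumed, non-linear rule).

    A matching cue contributes 1.0; a mismatching cue contributes
    mismatch ** weight, so heavily weighted cues penalize more.
    """
    score = 1.0
    for feature, (value, weight) in cues.items():
        if item.get(feature) != value:
            score *= mismatch ** weight
    return score


def retrieval_probabilities(memory, cues):
    """Normalize match scores into choice probabilities (ratio rule)."""
    scores = {name: match_score(item, cues) for name, item in memory.items()}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


# Hypothetical retrieval cues at the critical verb: feature -> (value, weight).
cues = {"cat": ("DP", 1.0), "wh": (True, 1.0), "theta": ("Unmarked", 2.0)}

target = {"cat": "DP", "wh": True, "theta": "Unmarked"}
# A competing filler whose dependency is still open ...
open_competitor = {"cat": "DP", "wh": True, "theta": "Unmarked"}
# ... and the same filler after integration at its gap has re-encoded it.
closed_competitor = dict(open_competitor, theta="Theme")

p_open = retrieval_probabilities(
    {"target": target, "competitor": open_competitor}, cues)
p_closed = retrieval_probabilities(
    {"target": target, "competitor": closed_competitor}, cues)

print("open competitor:", p_open["competitor"])
print("closed competitor:", p_closed["competitor"])
```

With these assumed values, an open competitor is indistinguishable from the target, whereas a competitor that has already received a thematic role is retrieved only rarely. Any additional distinguishing cue would multiply in a further penalty.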
Suppose, then, that retrieval inside a filler-gap dependency contains a (highly weighted) cue for something like THETAROLE:Unmarked. Under this encoding scheme, fillers that have been successfully integrated will be relatively more dissimilar than unintegrated fillers. It is worth worrying about this kind of single feature distinction, since cue combination is non-linear. As more such distinctions accrue, the likelihood of interference declines very quickly. Methodologically, the more differences we can plausibly impute to what seems like an interfering constituent, the smaller the competition effects that are predicted, and the less confident we can be in null results like those we report. Theoretically, though, it suggests that potential linguistic structure building systems could dampen the impact of the memory architecture by maximizing distinctiveness in encodings. An encoding system that can instantiate licensing requirements as unvalued features could thus minimize the impact of having to resolve multiple such dependencies in succession.

The second possibility for accurately targeting the heads of (predicted) dependencies is to carry forward "just a little information" about the head of the dependency. If this information were not abstract, but rather functioned like a tag or unique marker for the specific filler, then it could be used to target retrieval of just that filler. In effect, this proposal is a hybridized version of Wanner & Maratsos (1978)'s HOLD cell approach. In abstract terms it simply means recording a pointer to the filler's location in memory, instead of preserving all of its contents.

55 (continued) Badecker & Lewis (2007) also use unvalued attribute-feature pairs as retrieval cues. In their system this accounts for the plural markedness effect in agreement attraction. See Chapter 3.

However the debate, as it
277 has ben cast in the psycholinguistics literature, has established a tension betwen maintenance and retrieval that is rather more extreme than the intermediate positions permited by the architecture. It sems a reasonable trade-off to preserve a smal amount of information for gains in the licensing proces. When we consider the implications of the island results, discussed in the previous section, we se that the potential payof of preserving some filer-specific information is great. The islands results suggests that the search for dependency completion sites obeys constraints on extraction: some configurational licensing of the dependency occurs left-to-right. If retrieving the filer then returns multiple candidates, then the system could potentialy integrate a non- licensed filer, even though it has already (putatively) expended the efort to verify the path requirement from filer to gap site. To preserve acuracy in this scenario, the system would have to select the filer in the right configuration with the gap site. If our argument from Chapter 3 is sound, there is no way to make this selection on the basis of inherent features, and thus the selection proces would esentialy have to re-capitulate the proceses that traced the path from filer to potential gap site in the first place. If, however, characteristicaly only one candidate encoding were returned in search, then it would be unnecesary to do any (right-to-left) licensing, and the remainder of verifying the dependency could occur localy (i.e., do features match betwen subcategorizer and filer? is there actualy a gap?). The motivation to ?localize? as much decision making as possible is familiar and strongly echoes Berwick & Weinberg (1984)?s acount of subjacency. They argued that subjacency, the requirement that syntactic movement rules be bounded, reflects an adaptation of the gramar to a deterministic parsing mechanism (Marcus, 1980). 
To make any given parsing decision, the parser is allowed to refer to a limited syntactic context. In the architecture they considered, the syntactic context was represented literally: that is, it could not include variables. As a consequence, only a bounded context can be represented:

(121) (a) A licit context
[ S' what [ S [ NP John]
(b) An illicit context
[ S' what ..X.. [ S [ NP John]
(to represent "what did Bill believe that John ..")

For the decision of whether or not to insert a trace/gap in the parse, it must be known whether there is a wh-phrase in the syntactic context. If the syntactic context were bounded by just the current clause, then only when subjacency holds could the decision about inserting the trace be determined exclusively by consulting context. That is, only if subjacency holds would the absence of a wh-phrase in the context representation be informative. Looking to a bounded context is preferable for licensing wh because just one memory location has to be consulted. A secondary mechanism could be engaged to climb the parse tree and search for a wh-phrase. However, this means consulting not just one location but many. In the present architecture, the extreme boundedness of the context and the content-addressable search mechanism put a premium on determining well-formedness locally. Thus we conclude that composing the retrieval structure to maximize the chance of identifying a unique constituent would be a valuable adaptation.

4.6 Carrying information forward in time

In the three final experiments of this chapter, we provide evidence that some information survives across the length of the dependency that can guide parsing, consistent with our conclusion from section 4.5.4. We test how well three different kinds of dependency formation probes survive increasing dependency lengths: verb-object plausibility, verb-PP selectional restriction, and a DP/PP filled gap effect.
Each of these probes requires different kinds of information about the filler to generate a signal during active dependency completion. The specificity of the information (minimally) required decreases from one probe to the next. The verb-object plausibility test requires information about the filler's lexical head features; the verb-PP selectional restriction requires just the lexical identity of the filler's functional head; and the DP/PP filled gap test requires only the filler's category. By varying dependency length, we are able to test what kinds of information are effective long after the filler was first encoded, and consequently what kinds of information may be guiding the parser's initial decision, before it attempts to recover full information about the filler.

4.6.1 Lexically-specific features (Experiments 11a, 11b)

The first experiment in this series tests how well lexical features of the wh-phrase survive different dependency lengths. The probe for active dependency formation is the plausibility of the filler phrase as an internal argument of the verb. Plausibility was crossed with dependency length, either by modifying the subject with a PP, a serial length manipulation (in Experiment 11a), or by embedding the subcategorizing verb in a further clause, a hierarchical length manipulation (in Experiment 11b). If lexical features of the filler phrase are maintained until the filler is licensed, then plausibility should be an effective probe of active dependency formation regardless of dependency length.

4.6.1.1 Experiment 11a: Plausibility and increased serial length

Participants

Twenty-four native speakers of American English from the University community were paid $10 to participate in an experimental session lasting 50 minutes. All were naive to the purpose of the experiment and gave informed consent.

Materials, Procedure and Analysis

Experimental materials consisted of 24 sets of 4 conditions organized in a 2 × 2 factorial design that independently manipulated the factors dependency length and plausibility. Experimental materials followed the scheme in (122) and (123). As in Experiment 8, the filler was extracted from the oblique object position of an alternating locative verb. This manipulation provides a multi-word region between the subcategorizing verb and the evidence of a moved constituent. Therefore effects can spill over from the verb and still be interpreted as active, if they occur before the gap region. Since we are interested in information that is maintained to help the parser make decisions about wh-dependency formation, the active effects provide the crucial evidence.

(122) Short, Plausible
(a) The adhesive coating that the talented engineer methodically sprayed the special test surfaces with ___ in his new laboratory could make the company lots of money.
Short, Implausible
(b) The computer program that the talented engineer methodically sprayed the special test surfaces with ___ in his new laboratory could make the company lots of money.

(123) Long, Plausible
(a) The adhesive coating that the talented engineer from the high-tech aerospace firm methodically sprayed the special test surfaces with ___ in his new laboratory could make the company lots of money.
Long, Implausible
(b) The computer program that the talented engineer from the high-tech aerospace firm methodically sprayed the special test surfaces with ___ in his new laboratory could make the company lots of money.

The dependency length factor manipulated the serial distance between filler and gap. The long dependency sentences were identical to the corresponding conditions in Experiment 8 (testing the CSC). Short dependency sentences were derived from long dependency sentences by removing the subject-adjoined PP. Dependency length here is operationally defined as the number of words between the introduction of the filler and the verb.
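As a worked illustration of this operational definition, the sketch below counts the words between the filler and the critical verb in the sample sentences (122a) and (123a). It assumes that "between" means strictly between the filler's head noun and the verb, and that a hyphenated word counts as one word; both choices, and the function name, are assumptions made here for illustration.

```python
def dependency_length(sentence, filler_head, verb):
    """Count whitespace-delimited words strictly between the filler's
    head noun and the critical verb (an assumed reading of the definition)."""
    words = sentence.split()
    start = words.index(filler_head)
    end = words.index(verb, start + 1)
    return end - start - 1


# Sample sentences from (122a) and (123a), with the gap marker omitted.
SHORT = ("The adhesive coating that the talented engineer methodically "
         "sprayed the special test surfaces with in his new laboratory "
         "could make the company lots of money.")
LONG = ("The adhesive coating that the talented engineer from the high-tech "
        "aerospace firm methodically sprayed the special test surfaces with "
        "in his new laboratory could make the company lots of money.")

short_len = dependency_length(SHORT, "coating", "sprayed")
long_len = dependency_length(LONG, "coating", "sprayed")
```

Under these assumptions the short and long sample sentences have 5 and 10 intervening words respectively, so the manipulation adds the five words of the subject modifier.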
Seventy-two fillers were adapted from the fillers in Experiment 8 so that the distribution of lengths of fillers matched the distribution of lengths of target items. The experimental procedure was identical to Experiment 8. Data treatment and analysis were identical. No participants were excluded.

Results

Question-answering accuracy was uniformly high. For the 24 experimental targets, accuracy was 89%. There were no reliable differences among conditions; a model with no fixed effect coefficients was indistinguishable from one with all fixed coefficients (χ²(7): 4.40; p = 0.49).

In both long and short dependencies, a slow-down due to implausibility appeared in the first word of the ground argument. As this effect occurred well before direct evidence for the gap, we interpret it as an effect of active dependency formation. In short dependencies, slower reading times persisted for implausible filler sentences throughout the ground argument region and into the gap region, whereas for long dependencies the effect of implausibility was observed only once in the ground argument regions and then not again until the gap region. The results in these conditions were thus similar to the results observed in the Experiment 8 Single VP condition.

Figure 4-8 presents the region-by-region condition means from the beginning of the sentence until region 20, divided into two pair-wise comparisons. The main finding in the reading time data is the large attenuation of the plausibility effect in long conditions compared to the robust and long-lasting sensitivity exhibited by the short conditions. The omnibus repeated measures ANOVA report is given in Appendix C. Region-by-region condition means for the preceding regions did not differ (when the unmatched PP regions of the long condition were excluded).
Figure 4-8 Experiment 11a: Region-by-region reading times
Region key: (The coating/program that the engineer) 1-5
Long Subject: from 6 the 7 high-tech 8 aerospace 9 firm 10 methodically 11 sprayed 12 the 13 special 14 test 14 surfaces 15 with 16 in 17 his 18 laboratory 19 (..) 20-..
Short Control: methodically 6 sprayed 7 the 8 special 9 test 10 surfaces 11 with 12 in 13 his 14 laboratory 15 (..) 16-..
Arrows indicate the critical verb. Punctuation indicates the result of a pairwise by-participants RMANOVA: p: ** < 0.01 < * < 0.05 < ? < 0.10.

At the adverb preceding the verb there were no reliable main effects or interactions, although there was a non-significant tendency for long dependency conditions to be read more slowly. On the critical verb itself, there were also no reliable effects or interactions.

In the ground argument regions following the critical verb, plausibility effects were found in short and long conditions alike. However, the effect of plausibility was more long-lasting in the short dependencies. In the ground argument determiner region, implausible sentences were read more slowly across both levels of the length factor. The size of the effect was consistent across long and short dependency conditions, but it was reliable in pair-wise comparisons only for the short dependency sentences. Neither level of the length factor displayed a reliable implausibility effect in the second and third words of the ground argument (regions 14-15), although short dependency sentences consistently displayed slower reading times for implausible sentences. In the final word of the ground argument region (region 16), corresponding to the head noun, short dependencies reliably showed a large slowdown due to implausibility, whereas long dependencies did not. Short dependency conditions displayed sensitivity to the plausibility of the filler at the figure argument preposition, whereas long dependency conditions did not.
At the two immediately following regions, however, there was a reliable slowdown due to implausibility for both long and short dependencies. In the third region subsequent to the preposition, short dependency conditions again showed a reliable slowdown whereas long dependencies did not.

Interim Discussion

The results of Experiment 11a showed that the slowdown due to implausibility was attenuated when the distance between filler and gap was lengthened. This can be seen in the comparison of short and long conditions in Experiment 11a; it can also be seen in the comparison of coordinate and single VP conditions in Experiment 8. The largest and most sustained responses to an implausible filler were obtained when the filler-gap distance was short, or in a second coordinated VP.

4.6.1.2 Experiment 11b: Plausibility and increased hierarchical length

We performed a follow-up to Experiment 11a to test a different kind of length manipulation. In Experiment 11a, the long conditions were created by adjoining a PP to the subject; thus the increase in length was serial. In Experiment 11b, we also lengthened the dependency by inserting a whole clause between the filler and the gap-containing clause. The experimental design was otherwise identical, crossing filler plausibility and dependency length. In this experiment, dependency length had three levels: short, long:clause, and long:PP. Experimental materials thus consisted of 24 sets of 6 conditions organized in a 2 × 3 factorial design that independently manipulated the factors dependency length and plausibility. A sample set is given below.

(124) Short, Plausible
(a) It pleased the analyst that the coating that the talented engineer methodically sprayed the special test surfaces with ___ in his new laboratory could make the company lots of money.
Short, Implausible
(b) It pleased the analyst that the program that the talented engineer methodically sprayed the special test surfaces with ___ in his new laboratory could make the company lots of money.

(125) Long:Clause, Plausible
(a) The coating which the impressed analyst said that the talented engineer from the high-tech aerospace firm methodically sprayed the special test surfaces with ___ in his new laboratory could make the company lots of money.
Long:Clause, Implausible
(b) The program which the impressed analyst said that the talented engineer from the high-tech aerospace firm methodically sprayed the special test surfaces with ___ in his new laboratory could make the company lots of money.

(126) Long:PP, Plausible
(a) The coating that the talented engineer from the high-tech aerospace firm methodically sprayed the special test surfaces with ___ in his new laboratory could make the company lots of money.
Long:PP, Implausible
(b) The program that the talented engineer from the high-tech aerospace firm methodically sprayed the special test surfaces with ___ in his new laboratory could make the company lots of money.

The short conditions were adapted from Experiment 11a; in this experiment, we matched the ordinal position of the critical verb in the short conditions by embedding those sentences under "psych" predicates. The long:clause conditions were adapted from the short conditions of Experiment 11a as well, by inserting a clause whose verb was heavily biased to subcategorize for a CP. The long:PP level was identical to the long conditions in Experiment 11a, to test whether the results would replicate. In both long conditions, the number of words intervening between the filler and the critical verb was identical. In adapting the Experiment 11a materials, some lexical items were simplified; all fillers were reduced to a single-word phrase. The relative pronoun "which" was used in place of the complementizer in long:clause conditions, to avoid the repetition of "that" in a short span of time (which sounded unnatural to the experimenter and several informants). All procedures were identical. There were 36 paid participants in this experiment, and each was paid $10. Analysis of reading times was carried out via linear mixed-effects models (as for experiments in Chapters 2 and 3; see fn. 53 regarding the RMANOVA analysis for Experiments 7-8, 11a).

Results

Question-answering accuracy was uniformly high. For the 24 experimental targets, accuracy was 90%. There were no reliable differences among conditions; a model with no fixed effect coefficients was indistinguishable from one with all of the fixed effect coefficients (χ²(5): 6.09; p = 0.30).

Figure 4-9 reports the reading times for just the long:clause condition, the new condition in Experiment 11b. The short and long:PP results replicated Experiment 11a; they are not reported in the text (but may be found in Appendix D). The main finding in the reading time data is a lack of sensitivity to filler plausibility in the active dependency completion regions in all but the short conditions.

Figure 4-9 Experiment 11b: Region-by-region reading times (Long:clause)
Region key: The 1 (coating/program) 2 that 3 (the impressed analyst said that) 4-8 the 9 engineer 10 sprayed 12 the 13 special 14 test 14 surfaces 15 with 16 in 17 his 18 laboratory 19 (..) 20-2

Pre-critical regions. We do not report tests for the structure comparison prior to the critical region, since the conditions are lexically unmatched. There were some reliable spurious differences in plausibility comparisons, observed between the following cells: plausible conditions slower in Region 4, short condition (??: 33 ms, 95% C.I. [12 ms, 56 ms], p < 0.1); plausible conditions faster in Region 5, long:clause conditions (??: 22 ms, 95% C.I. [0 ms, 40 ms], p < 0.05); and Region 6, long:clause conditions (??: 29 ms, 95% C.I. [2 ms, 57 ms], p < 0.05).

Adverb region. There were no reliable differences across structure or plausibility in the adverb region.
Critical verb region. There were no reliable differences across structure or plausibility in the critical verb region.

Argument regions. A slow-down due to implausibility appeared in the first and second words of the ground argument region (Regions 13-14) (Region 13: ??: 23 ms, 95% C.I. [-1 ms, 45 ms], p < 0.10; Region 14: ??: 26 ms, 95% C.I. [1 ms, 50 ms], p < 0.05). This effect was significantly reduced in the long:clause conditions (Region 13: ??: -39 ms, 95% C.I. [-4 ms, -69 ms], p < 0.05; Region 14: ??: -27 ms, 95% C.I. [3, -64 ms], p < 0.15), though it was not different in the long:PP conditions, in the full model of the data. There was no reliable slowdown due to implausibility in long:clause conditions.

Focusing on the comparison between short and long:PP conditions, the results are nearly identical to Experiment 11a. Pairwise comparisons over long:PP conditions revealed a reliable effect of plausibility in Region 13 (??: 18 ms, 95% C.I. [1 ms, 38 ms], p < 0.05), which did not persist into Region 14. The same comparisons over short conditions showed effects in both Regions 13 and 14. Despite being numerically larger, the effect of plausibility was, however, only marginal in those regions (Region 13: ??: 22 ms, 95% C.I. [-2 ms, 51 ms], p < 0.10; Region 14: ??: 25 ms, 95% C.I. [-5 ms, 55 ms], p < 0.10). What is notable, however, is that, as in Experiment 11a, the slowdown persists from the verb to the post-gap regions in short conditions, but not in long:PP conditions. If we collapse across the entire first argument region (Regions 13-16), then there is a reliable slowdown for short conditions (??: 19 ms, 95% C.I. [6 ms, 32 ms], p < 0.01), but not for long:PP or long:clause conditions. As the slowdown due to implausibility occurred well before direct evidence for the gap, we interpret it as an effect of active dependency formation.

Interestingly, both long dependencies showed a strong effect of plausibility in the post-gap region.
This effect replicates Experiment 11a as well. What seems to be characteristic of the short dependencies in both experiments is a long-lasting effect, slowing down processing from the verb onwards. This effect is "bi-phasic" in long:PP dependencies, showing a small slowdown in the region immediately following the verb, no difference in the intervening regions, and a larger response in the post-gap region. In long:clause dependencies, we only observed a slowdown in the post-gap regions.

It is worth considering whether baseline processing is more difficult in the long conditions, which might help to mask a plausibility effect. However, comparisons at the adverb, verb, and entire post-verb region, for just the plausible conditions, show no reliable variation in reading times due to structure. The comprehension accuracy results support this conclusion: though participants were overall 2 points lower in accuracy on long:clause conditions than short conditions, this difference is not reliable; moreover, they were numerically better on long:PP conditions (by 3%).

Discussion

The results of Experiment 11b conformed to those of Experiment 11a. In regions preceding the gap site, where the plausibility effect indexes active dependency completion, plausibility effects were longer lasting and more numerous in short conditions than in long:PP conditions, and they were absent in long:clause conditions. Both long conditions showed a strong effect of plausibility after the gap site had been signaled, indicating that participants were ultimately aware of the implausible verb-filler combination. However, only in the short conditions were they consistently aware in the active regions.

What is striking about the present results is that only a modest amount of distance, a 5-word PP, is necessary to substantially disrupt sensitivity to an implausible filler. This disruption is selective: it is only in the pre-gap regions, where dependency completion is active, that sensitivity is lost.
It suggests, however, that when the parser completes the dependency actively it may not have reliable access to the lexical features that allow evaluation of plausibility.

With regard to a retrieval model, the very presence of a plausibility effect indicates that the system is able to complete the dependency in spite of the lexical feature cues the verb might provide. We might suppose that categorial or positional cues are given priority (e.g., +wh) to retrieve the filler at the verb. Why, then, should the plausibility effect attenuate as dependency length increases? This attenuation with length might be explained if the interposed constituents provide sufficient interference, through spurious matches or shared features. On this account, completing the filler-gap dependency is simply less probable as dependency length increases, because the likelihood of retrieving the filler declines. The fact that information is ultimately and robustly recovered in the post-gap region casts doubt on this explanation. By the time the post-gap region is reached, the number of intervening constituents is even greater. There is perhaps more strongly constraining information, such as co-arguments, by the time the gap region has been reached in this experiment, which could help pinpoint the search for the filler. However, it is difficult to see what highly distinctive features of the filler, instantiated at the time of encoding, could enter the retrieval structure only in the post-gap region. One conjecture is that the exact sense of the verb has been selected with more processing, and this yields a more informative retrieval structure. However, notice that we must then interpret the implausibility effect in the post-gap region as a signal that the parser has failed to find the filler constituent. If additional processing allows the retrieval structure to accrue more cues, and in particular more lexically specific cues, then the implausible filler should be harder and harder to retrieve.
It is not possible to rule this explanation out conclusively. However, if the filler constituent ultimately failed to be retrieved, leading to an unlicensed dependency, it is surprising that comprehension accuracy is not impacted.

An alternative explanation finds greater traction with these data. We have conjectured that whatever information can be carried forward in time is used to complete the dependency actively. Active dependency formation, under this view, occurs largely independently of retrieval operations. Only after the dependency is constructed or licensed is the full filler encoding recovered to proceed with interpretation. Highly specific lexical information, by hypothesis, requires more space to maintain and is thus likely to be quickly displaced from focal attention as more relations have to be parsed. Consequently, in long dependencies, the plausibility effect only shows up reliably much later, when the filler representation is actually recovered. The plausibility effect in short dependencies reflects the survival of some lexical information over the relatively short span from filler encoding to the occurrence of the verb. The retrievability of the filler does not vary substantially with the length of the dependency, which is consistent with the relatively small shifts in asymptote McElree, Foraker, & Dyer (2003) observed for 1 and 2 clause interpolations in filler-gap dependencies (see Chapter 3).

Finally, it is worth emphasizing our finding that long:clause dependencies are worse than long:PP dependencies. Under successive cyclic movement, the wh-phrase is actually structurally equidistant from the verb in both short and long:clause dependencies (Chomsky, 1977). Therefore it is not unreasonable to expect long:clause dependencies to show the plausibility effect more robustly than the long:PP dependencies.
The fact that they did not does not count as evidence against successive cyclicity (or any number of similar devices: Kayne, 1984; Gazdar, Pullum, Klein & Sag, 1985, etc.): it merely suggests that full filler information is not recovered successive cyclically. If that were true, it is also consistent with our alternative explanation. Coarse-grained category information could be used to establish the cyclic trace or copy in the structure and no retrieval would be necessary. 56

56 In an earlier presentation of this work (Wagers & Phillips, 2006), we claimed that the filler was retrieved at clause boundaries (which we interpreted as consistent with Gibson & Warren's (1999) observations). This conclusion, however, was premature and based on a partial data set. The addition of more participants has confirmed that there is no implausibility effect in multi-clause dependencies, prior to the gap site.

4.6.2 Lexical identity (Experiment 12)

In Experiment 11, we demonstrated that lexical features appear to be ineffective at longer dependency lengths during active dependency formation. In this experiment, we asked whether slightly more coarsely-grained lexical information is preserved: the lexical identity of a function word. To answer this question, we took advantage of the fact that some verbs select for prepositional phrases with a certain head. For example, in (127a), the verb "entrust" requires a goal argument, which in English is headed by the preposition "to". Correspondingly, in (127b), the verb "inherit" requires a source argument, which is headed by the preposition "from".

(127) (a) The secretary entrusted the correspondence to/*from the courier.
(b) The orphan inherited a fortune from/*to the millionaire.

The head of the PP can be pied-piped in a wh-dependency.

(128) (a) The courier to whom the secretary entrusted the correspondence ..
(b) The millionaire from whom the orphan inherited a fortune ..
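The selectional requirement illustrated in (127) and (128) can be thought of as a lookup from a verb to the preposition that must head its oblique argument. The minimal sketch below makes that check explicit. The dictionary and function names are hypothetical, and only the two verbs discussed above are included; it is an illustration of the match/mismatch contrast, not a claim about how the parser stores this information.

```python
# Hypothetical lexicon: verb form -> preposition required to head its PP argument.
SELECTS = {"entrusted": "to", "inherited": "from"}


def pp_matches(verb, pied_piped_preposition):
    """Return True if the pied-piped preposition is the one the verb selects."""
    return SELECTS[verb] == pied_piped_preposition


# The two levels of the match factor for a verb like "entrusted".
conditions = {
    "match": pp_matches("entrusted", "to"),       # "to whom ... entrusted"
    "mismatch": pp_matches("entrusted", "from"),  # "from whom ... entrusted"
}
```

The check can only succeed at the verb if the identity of the pied-piped preposition is still available at that point, which is the property the pied-piping manipulation is meant to probe.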
In this experiment, we conjectured that comprehenders would be sensitive online to the selectional requirement of the verb for its PP argument. By means of the pied-piping manipulation, we could ask whether comprehenders would be sensitive to this requirement over short and long dependency lengths. We conjectured that recognizing whether or not the PP is of the right type in active dependency formation would require preserving the lexical identity of the head.

4.6.2.1 Materials and Methods

Participants

Participants were 18 native speakers of English from the University of Maryland community with no history of language disorders. Participants received $10.

Materials

A sample item set is given in Table 4-9. The experimental factors were the match between verb and PP, and dependency length between filler and gap host. Gap position is marked with an underscore and the interposed PP is bracketed.

Sentence frame: "The courier .. unfortunately wrecked his bike in traffic."
Short, Match: .. to whom the secretary recently entrusted the confidential business correspondence ___ after some hesitation ..
Short, Mismatch: .. from whom the secretary recently entrusted the confidential business correspondence ___ after some hesitation ..
Long, Match: .. to whom the secretary [ for the high-powered defense attorney ] recently entrusted the confidential business correspondence ___ after some hesitation ..
Long, Mismatch: .. from whom the secretary [ for the high-powered defense attorney ] recently entrusted the confidential business correspondence ___ after some hesitation ..
Table 4-9 Sample materials set for Experiment 12

Items were balanced so that the match preposition was "to" in 12 sets and "from" in 12 sets 57 . Materials were distributed according to a Latin Square across four lists, and accordingly each participant read six sentences from each condition.
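A Latin Square distribution of this kind can be sketched as follows. The rotation rule used here (condition index = (item + list) mod 4) is one common way to build such lists and is an assumption; the text does not specify how the lists were actually constructed, and the condition labels are shorthand introduced for the example.

```python
from collections import Counter

# Shorthand labels for the 2 x 2 design (length x match); hypothetical names.
CONDITIONS = ["short-match", "short-mismatch", "long-match", "long-mismatch"]
N_ITEMS = 24


def latin_square_lists(n_items, conditions):
    """Rotate conditions across lists so that every item appears once per
    list and in a different condition on each list."""
    k = len(conditions)
    return [
        [(item, conditions[(item + lst) % k]) for item in range(n_items)]
        for lst in range(k)
    ]


lists = latin_square_lists(N_ITEMS, CONDITIONS)

# Number of items per condition on each list.
per_condition = [Counter(cond for _, cond in lst) for lst in lists]
```

With 24 item sets and four conditions, each of the four lists contains six items per condition, which matches the six sentences per condition that each participant read.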
72 filler sentences were included, adapted largely from Experiment 11; however, some new distractors were devised that included pied-piping in other contexts besides the head of a subject-relative clause. The purpose of this manipulation was to ensure that participants could not strategically identify the experimental targets.

57 In preliminary analysis, we split the data set into the 12 items for which "from" was the mismatching preposition and the 12 items for which "to" was the mismatching preposition. The results were identical and conform to the patterns reported below.

Procedures and Analysis

Presentation and analysis details were as described for Experiment 11b. Regions were structurally aligned, so that the VP regions common to all item sets were analyzed word-for-word across conditions.

4.6.2.2 Results

Comprehension accuracy is reported in Table 4-10.

                         Dependency length
Verb-PP selection        Short    Long     Mean
Match                    88%      84%      86%
Mismatch                 84%      80%      82%
Mean                     86%      82%      84%
Table 4-10 Comprehension accuracy for Experiment 12
Average percentage correct over participants, with row, column and grand means. Standard error of the cell means is 4% for all conditions, except the mismatch:long condition, in which it is 3%. N = 18.

There was a 4% decrement observed for long dependencies, and a 4% decrement for verbs with mismatching PP arguments. However, neither of these effects reached significance in the mixed-effects logit models.

Reading time results are reported in Figure 4-10. The main finding in the reading time data is that there is sensitivity to the verb-PP match only in short dependency conditions.

Figure 4-10 Experiment 12: Reading time results
Example sentence, with region subscripts: The 1 courier 2 to 3 /from 3 whom 4 the 5 secretary 6 [ for 7 the 8 high-powered 9 defense 10 attorney 11 ] recently 12 entrusted 13 the 14 confidential 15 business 16 correspondence 17 after 18 some 19 hesitation 20 (unfortunately wrecked his bike in traffic). Only significant effects are reported.
Region 1-3: Sentence-initial regions. Despite being identical, there are some baseline differences observed in Regions 1-3, corresponding to the sentence-initial determiner, subject head noun and pied-piped preposition. In Region 1, there is a main effect of match (??: 35 ms; 95% C.I. [10 ms, 60 ms], p < 0.01) and an interaction of match and length (long:mismatch ??: -40 ms; 95% C.I. [-2 ms, -74 ms], p < 0.05). This interaction reflects the fact that a pairwise difference in match is only observed in the two long dependency conditions. In Region 2, there is a reliable effect of length (??: 45 ms; 95% C.I. [9 ms, 83 ms], p < 0.05). In Region 3, there is a reliable interaction of the two experimental factors (long:mismatch ??: -43 ms; 95% C.I. [-4, -84], p < 0.05). Because these differences occurred before any experimental manipulations, they are interpreted as spurious. In Regions 4-6, corresponding to the wh-phrase and the subject of the relative clause, there are no differences among conditions. Since all the early differences involve comparisons among the long conditions, it is important to note that in the PP region, Regions 7-11, there are no reliable differences. Therefore, across all conditions, the baselines are stabilized well in advance of the critical verb.

Region 13: Critical verb. In Region 13, the critical verb region, there is an effect of match (??: 91 ms; 95% C.I. [33, 148], p < 0.01), and an interaction with distance (long:mismatch ??: -125 ms; 95% C.I. [-45, -215], p < 0.01). Pairwise comparisons show that there is only a slowdown for mismatch conditions when the dependency is short (??: 90 ms; 95% C.I. [5, 166], p < 0.05), and no reliable difference between long dependency conditions. Indeed, match conditions are (numerically) slower in long dependencies (??: 34 ms; 95% C.I. [-15, 85], n.s.). No differences are observed in subsequent regions of the sentences.
In an attempt to detect a match difference in long conditions, we pooled Regions 14-19; results were not significant.

4.6.2.3 Speeded grammaticality follow-up
The results of the reading time data suggest that comprehenders are sensitive to the selectional properties of the verb with respect to a pied-piped PP only in short dependencies. Unlike in the plausibility manipulation reported in Experiment 11 above, participants showed no ultimate sensitivity to mismatch in online measures. Comprehension accuracy indicated a decrement due to match, regardless of length, but those results were not reliable. However, even if comprehenders failed to notice the ill-formedness of the sentences, it is likely they could determine the correct interpretation by knowing the verb's semantic properties (i.e., "inherit" needs a source argument; even if the preposition in the pied-piped oblique argument is forgotten, it can be taken as a source). In a follow-up speeded grammaticality experiment, we asked a different group of participants to judge the materials used in the reading-time experiment as acceptable or not. We report these results below.

Identical item sets from the current experiment were included in the speeded grammaticality experiments reported in Chapter 3. In the results reported below, 16 participants came from Experiment 3 and 16 from Experiment 6. Analysis revealed no between-group differences, so we consolidate them below. Results of the speeded grammaticality task are reported in Figure 4-11.

Figure 4-11 Experiment 12 Follow-up results
Speeded grammaticality, proportion "yes" responses. Error bars are standard error of the mean proportion across participants.

There are two patterns to note in these data. Firstly, judgments of acceptability were sensitive both to the selection mismatch (coefficient: -0.71 ± 0.49, p < 0.001) and to dependency length (coefficient: -0.95 ± 0.49, p < 0.001). There was no interaction of the two: participants were equally less likely to say "yes"
to pied-piped verb-PP mismatches in both short and long dependency conditions. Secondly, however, the differences between match and mismatch conditions were small overall: for example, in short dependencies participants accepted match conditions 83% of the time and mismatch conditions 71% of the time. This cannot be attributed to an overall "yes" bias in the experiment, since we see from data in Experiments 3 and 6, run simultaneously, that participants were willing to say "no" 70-80% of the time for some agreement violations.

4.6.2.4 Discussion
The data in Experiment 12, and the speeded grammaticality follow-up, indicate that as dependency length increases, sensitivity to the lexical identity of the pied-piped preposition declines in active dependency formation. In short dependencies, a mismatch between the verb and the PP leads to a strong reading time disruption. This disruption is entirely absent in long dependencies. Thus we conclude that lexical identity cannot be well maintained in an immediately accessible state to guide active dependency formation. The data in the speeded grammaticality follow-up demonstrate that the violation can nonetheless be noticed. Consequently it is possible, regardless of length, to recover a representation over which the selectional restriction between verb and PP can be evaluated. However, it seems that information about the identity of the head of the PP is not reliably available to the parser during initial dependency formation.

4.6.3 FG (Pied-piping) (Experiment 13)
In this experiment we turn to less lexically anchored properties of the filler. It is possible in English to extract both wh-phrases and the PPs that contain them. By using a modified filled gap effect design, we can test whether the categorial identity of the extracted phrase is well maintained over long dependencies. We used comparisons like the following:

(129) (a) The website which the blogger recently designed | the flashy graphics for ..
      (b) The website for which the blogger recently designed | the flashy graphics ..

In (129a), by the time the comprehender encounters the verb (indicated by the "|"), the extracted phrase will be analyzed as the direct object. As the overt direct object is recognized, the comprehender must reanalyze the site of extraction. However, in (129b), the pied-piped extraction does not lead to a direct object analysis. Comparison across the direct object regions in (129) is thus expected to show a slow-down for simple DP extraction, reflecting the cost of reanalysis.

Comparing regular and pied-piped extractions is a modified version of the filled-gap logic (Stowe, 1986) and has been demonstrated to be effective before (Lee, 2004). In an important respect, it is preferable to the standard filled gap experimental design because it compares two sets of conditions that both involve an extraction dependency. In contrast, the standard design (Stowe, 1986) compares the direct object region inside a movement dependency with one inside an if or whether clause.

What would happen in long dependencies? If the categorial identity of the extracted phrase is preserved across longer dependencies, then we expect a filled gap effect in both short and long conditions. If, however, categorial identity is lost, then long conditions should be identical in processing complexity, either because both regular and pied-piped extraction lead to a direct object analysis or because neither does.

4.6.3.1 Materials and methods

Participants
Participants were 42 native speakers of English from the University of Maryland community with no history of language disorders. Participants received $10.

Materials
To achieve the desired contrast, we constructed sentences that involved extraction from a benefactive PP. The experimental design crossed extraction type, either "simple" or "pied-piped," with dependency length. As in Experiment 11b, the length factor had three levels: short, long:PP and long:clause.
Conditions were combined in a 2 × 3 factorial design creating six conditions for each item set. There were 24 item sets. Thus each participant would see four examples of each condition. The ordinal position of the critical verb was matched across conditions by embedding the short dependency under a psych predicate. A sample set of materials is given in Table 4-11. The length manipulation in long:* conditions is bracketed.

Short (frame: The CEO was worried that the website .. would soon be obsolete.)
  Simple:      .. which the blogger recently designed the flashy graphics for ___ after user demand grew ..
  Pied-piped:  .. for which the blogger recently designed the flashy graphics ___ after user demand grew ..
Long:PP (frame: The website .. would soon be obsolete.)
  Simple:      .. which the blogger [ from the local art school ] recently designed the flashy graphics for ___ after user demand grew ..
  Pied-piped:  .. for which the blogger [ from the local art school ] recently designed the flashy graphics ___ after user demand grew ..
Long:Clause (frame: The website .. would soon be obsolete.)
  Simple:      .. which [ the company CEO said that ] the blogger recently designed the flashy graphics for ___ after user demand grew ..
  Pied-piped:  .. for which [ the company CEO said that ] the blogger recently designed the flashy graphics ___ after user demand grew ..

Table 4-11 Sample materials set for Experiment 13

Procedures and Analysis
Presentation and analysis details were as described for Experiment 11b. Regions were structurally aligned, so that the VP regions common to all item sets were analyzed word-for-word across conditions.

4.6.3.2 Results
Comprehension accuracy is reported in Table 4-12.

                     Dependency length
Extraction type      Short        Long:PP      Long:Clause    Mean
Simple               88% ± 2%     89% ± 2%     76% ± 3%       84%
Pied-piped           85% ± 3%     85% ± 3%     74% ± 3%       81%
Mean                 87%          87%          75%            83%

Table 4-12 Comprehension accuracy for Experiment 13
Average percentage correct over participants, with row, column and grand means.
Standard error of the cell means is reported in the table.

There was a significant (negative) effect of long:clause conditions (coefficient: -1.0 ± 0.6, p < 0.005). No other comparisons were reliable.

Reading time data are reported in Figure 4-12 for short conditions, Figure 4-13 for long:PP conditions, and Figure 4-14 for long:clause conditions. The main finding in the reading time data is the appearance of a filled-gap effect across all dependency length conditions, occurring in the critical direct object regions.

Figure 4-12 Experiment 13 Region-by-region reading times: Short conditions
(The CEO was worried that) 1-5 the 6 website 7
simple: which 9 the 10 blogger 11 recently 12 designed 13 the 14 flashy 15 graphics 16 for 17
pied-piped: for 8 which 9 the 10 blogger 11 recently 12 designed 13 the 14 flashy 15 graphics 16
after 18 user 19 demand 20 grew 21 would 22 soon 23 be 24 obsolete 25 (..) 26

Figure 4-13 Experiment 13 Region-by-region reading times: Long/PP conditions
The 1 website 2
simple: which 4 the 5 blogger 6 (from the local art school) 7-11 recently 12 designed 13 the 14 flashy 15 graphics 16 for 17
pied-piped: for 3 which 4 the 5 blogger 6 (from the local art school) 7-11 recently 12 designed 13 the 14 flashy 15 graphics 16
after 18 user 19 demand 20 grew 21 would 22 soon 23 be 24 obsolete 25 (..) 26

Figure 4-14 Region-by-region reading times: Long/Clause conditions
The 1 website 2
simple: which 4 (the company CEO said that) 5-9 the 10 blogger 11 recently 12 designed 13 the 14 flashy 15 graphics 16 for 17
pied-piped: for 3 which 4 (the company CEO said that) 5-9 the 10 blogger 11 recently 12 designed 13 the 14 flashy 15 graphics 16
after 18 user 19 demand 20 grew 21 would 22 soon 23 be 24 obsolete 25 (..) 26

Pre-verbal regions. There were no reliable differences among conditions in Regions 1-12, with the exception of Regions 4-6 in the long conditions. These regions correspond to the sequence "which - Det - N" at the edge of the relative clause.
Simple extractions were read more slowly in these regions (estimate: 23 ms, 95% C.I. [15 ms, 31 ms], p < 0.005). We speculate that the initial ambiguity of parsing "for" in pied-piped extraction may have contributed to the discrepancy between extraction type conditions. This slowdown was restricted to the long conditions, where it occurred sentence-initially, which could magnify such effects. Because this difference occurs so far upstream of the critical regions (Region 14 and subsequently), and there are no differences in Regions 7-12, we are not concerned that this difference would impact tests in the critical regions.

Verb. Reading times at the verb, Region 13, showed no effect of length or extraction type.

Critical direct object region. Regions 14-16 corresponded to the critical direct object regions where a filler-gap effect was expected. All length conditions showed a slowdown for simple extractions in either Region 14 or Region 15. The full mixed effects model of the data revealed a reliable effect of extraction type in Region 15, such that simple conditions are read more slowly across all length conditions. For short conditions it was only in Region 15 that a slowdown was observed. However, for long:PP conditions, a smaller slowdown was observed in Regions 14, 15 and 16. This slowdown was marginally significant in Region 14 alone (estimate: 16 ms, 95% C.I. [-1 ms, 32 ms], p < 0.10), and reliable in Region 15 (estimate: 16 ms, 95% C.I. [3 ms, 26 ms], p < 0.01). When Regions 14-16 were combined in a pooled analysis of long:PP conditions, the result was also reliable (estimate: 16 ms, 95% C.I. [4 ms, 27 ms], p < 0.01). For long:clause conditions, the slowdown was reliable only in the Region 14 pairwise comparison (estimate: 19 ms, 95% C.I. [0 ms, 36 ms], p < 0.05). The mean difference due to extraction type was numerically smaller in both long conditions compared to the short conditions. This difference was not reliable in the full model for the Region 15 data.
However, as a stronger test we combined the Region 14 data for long:PP conditions with the Region 15 data for long:clause conditions to perform a single long v. short comparison, with the Region 15 data for short conditions. The combination of the two regions to create the pooled long condition reflects the fact that the differences were not time-locked across conditions; however, we wanted to compare the largest and most reliable effects for each condition. This analysis returned a significant effect of extraction type (estimate: 35 ms, 95% C.I. [14 ms, 55 ms], p < 0.005), but the interaction with length is not reliable (estimate: 17 ms, 95% C.I. [7 ms, 42 ms], p < 0.20). We therefore cannot conclude that the filler-gap effect was smaller for long conditions. It is possible our design simply does not give us enough statistical power to detect the interaction. The raw mean differences in effect sizes are somewhat misleading: normalizing each extraction-type comparison against the pooled standard deviation in that condition shows that the effect of extraction type is less discrepant across length conditions (Cohen's d: short, 0.24; long:PP, 0.14; long:clause, 0.15) 58. Moreover, while one larger effect is observed for short conditions, several smaller effects are observed for the long conditions.

Gap and post-gap regions. Region 18 is the first post-gap region, and constitutes the signal that a constituent is missing (as in previous experiments, with a sequence of two prepositions). In this region long:CP conditions were read significantly more slowly (estimate: 21 ms, 95% C.I. [3 ms, 42 ms], p < 0.05). In Region 19, long:PP conditions were read significantly more slowly (estimate: 38 ms, 95% C.I. [17 ms, 60 ms], p < 0.001). There was a marginal slow-down for long:CP conditions (estimate: 21 ms, 95% C.I. [-1 ms, 42 ms], p < 0.10); there was also a marginal interaction of long:PP conditions with pied-piped extraction, such that pied-piped conditions were read faster (estimate: 26 ms, 95% C.I. [-6 ms, 53 ms], p < 0.10). In Region 20, pied-piped extractions were read more slowly (estimate: 23 ms, 95% C.I. [5 ms, 42 ms], p < 0.05).

58 Establishing a reliable interaction between extraction type and dependency length would require power to detect an effect size of roughly d = 0.09 (or R² = 0.002; a very small effect). Given our design and the between-condition correlations observed within participants in the actual experiment, we estimate nearly 120 participants would be necessary to achieve power (1 - β) of 0.80 for that effect size (calculations carried out in G*Power 3; Faul et al., 2007).

4.6.3.3 Discussion
The modified filled gap paradigm used in Experiment 13 gave us our first indication that some effects of active dependency formation can robustly survive longer dependency lengths. We detected a filled-gap effect in both short and long conditions. It was numerically larger in the short condition, but restricted to one region only; in long conditions, it was smaller but spread over several regions. The numerical differences were not reliable. We conclude that while the longer dependency lengths may introduce variability in when the reanalysis is triggered, comprehenders robustly detect that they have misparsed the simple extractions. In order to do so, they must have maintained the distinction between the categories extracted: DP or PP. Consequently, basic category information seems to survive longer dependency lengths better than either lexical features or lexical identity. Interestingly, comprehenders are slower at showing the filled-gap effect in short dependencies, but faster in long dependencies. We believe this may be indirectly related to our hypothesis: in the long dependencies, comprehenders only have filler category information to rely upon, and so can quickly assess (and reject) the direct object analysis of the filler in simple extractions.
In short dependencies, comprehenders may be evaluating the direct object analysis using a broader array of information about the filler, since more lexically-anchored information survives at shorter dependencies.

4.6.4 Conclusions
In Experiments 11-13 we examined how different measures of active dependency formation respond to different dependency lengths. In Experiments 11a and 11b, we used a plausibility manipulation, which has been successfully used before in a variety of tasks (Garnsey, Tanenhaus & Chapman 1989; Tanenhaus, Stowe & Carlson 1985; Boland et al. 1995; Traxler & Pickering 1996; Phillips, 2006; Lau, Yeung, Hashimoto, Braun & Phillips 2006) and has thus generally been considered a robust index of dependency formation. In Experiment 12, we used a selectional restriction between a verb and its PP argument, which we devised for the study. In Experiment 13, we used a modified version of the filled gap paradigm (Stowe, 1986; Lee, 2004) comparing simple extractions with pied-piped extractions. Only in the latter case did we observe evidence for active dependency formation in long dependencies. It is worth emphasizing that most of our length manipulations involved adjoining a five-word PP to the subject. Based on word-by-word reading times, this corresponded to a 1-1.5 second increment in the time elapsed from filler to verb. The mismatches in Experiment 11a were especially strong, so it seems counterintuitive that this additional processing time, spent as it was parsing a structurally unambiguous substring, should nearly extinguish sensitivity.

Crucially, in all experiments we observed ultimate sensitivity to the dependency formation manipulation. In Experiments 11a & 11b, sensitivity to an implausible verb-argument pair was manifested immediately following unambiguous evidence for the gap. In Experiment 12, we never observed sensitivity to the verb-PP mismatch in the online record, but an offline follow-up revealed (length-independent) sensitivity.
If we restricted our attention to just the results in Experiments 11-12, then the data would suggest that the impact of length is to switch into a non-active mode of dependency formation. The pattern in Experiments 11a & 11b supports this conclusion most strongly: in short dependencies, the reading time disruption immediately follows the verb, whereas in long dependencies, it immediately follows evidence for the gap site. Such a conclusion would constitute a serious challenge to the generality of active dependency formation, suggesting that only in very short dependencies does the parser posit gaps before direct evidence for a missing constituent. However, in Experiment 13, the filled gap effect survived both PP and clausal extension of the dependency. The active positing of gap sites thus persists across the same dependency lengths that extinguish sensitivity to specific lexical information. These data support the view that dependency formation proceeds independently of access to the detailed contents of the filler.

4.7 Conclusions
In this chapter we considered how the parser could adapt to a content-addressable memory to facilitate the accurate recognition of grammatical dependencies. Based primarily on previous results in the processing of anaphora, we concluded that the predictable dependencies were most likely to be constructed and licensed without considering grammatically inaccurate constituents. Experimental studies on the formation of wh-dependencies elaborated this viewpoint in several ways:

1) Top-down, grammar-driven dependency formation. The Coordinate Structure Constraint experiments (7-8) provide strong evidence that the parser uses its knowledge of island constraints to prompt the construction of filler-gap dependencies. The contrast with potential p-gap environments affirms that the active dependency formation strategy proceeds top-down with reference to outstanding licensing requirements, and not on the basis of the bottom-up compatibility of an analysis.
2) Accurately targeted dependency heads. The Interference experiments (9-10) attempted to follow up and refine Van Dyke & McElree's (2006) demonstration that filler-gap dependency completion is liable to similarity-based interference. On balance they support the idea that (predicted) dependency formation is free from interference when the comparison set is other dependency heads (and not extra-syntactic lexemes). We presented two mechanisms to account for accurate performance: in the first, the encoding system can, in a grammatically motivated way, render an incomplete dependency head featurally distinct from a complete one; in the second, a small amount of filler-specific information is carried forward that is used to retrieve it at licit retrieval sites. The data do not choose between the two accounts. From the standpoint of structurally licensing the dependency, the first strategy is heuristic; the second one requires some maintenance, but allows licensing to occur entirely left-to-right.

3) Robust availability of filler category information. The Dependency Length experiments (11-13) asked whether it was plausible that any information was carried forward in time to guide dependency formation. Lexically-anchored information was lost quickly, even in monoclauses. Categorial information survived across longer dependencies. These results are consistent with the top-down nature of active dependency formation and with a decision-making process that is most robustly supported by coarse-grained information.

We propose that these results taken together are characteristic of a processing system in which the initial licensing of predicted dependencies could be largely retrieval-free. The synthesis of the latter two sets of experiments supports this point most strongly. Experiments 11-12 suggest that most of the information about a filler is not recovered during the active formation phase of long dependencies, but that it is eventually recovered.
Given a direct access retrieval mechanism, one explanation for the near complete insensitivity observed in active regions of Experiments 11-12 is the competition of many similar constituents. Longer dependencies mean more encodings in memory that could be activated by the retrieval structure at the verb. However, Experiments 9-10 pull in the opposite direction. Even when dependencies are short, the most similar constituents, fillers in other dependencies, do not interfere. It seems unlikely, therefore, that the extra length in Experiments 11-12 introduced strongly interfering constituents.

5 Conclusions

5.1 Specific conclusions
The major empirical target of this dissertation is how accurately on-line comprehension reflects grammatical principles and constraints. This dissertation reported the results of several reading time and speeded grammaticality judgment experiments on two kinds of dependency completion processes: subject-verb agreement licensing and wh-dependency formation. The goal of these experiments was to test under which conditions real-time dependency formation is faithful to grammatical principles and constraints. For the same reasons, we also examined the existing literature on processing complex subjects. Assessing those conditions was part of a broader effort to determine how structure-sensitivity could be achieved in a memory architecture that is inherently not well-suited to verifying hierarchical relations between constituents. We first report the specific experimental conclusions in this section. In the next section we report the broader conclusions.

5.1.1 Agreement attraction

5.1.1.1 Experimental results (Experiments 1-4)
Agreement attraction occurs in English when a DP other than the subject matches the verb in number features. The canonical example of agreement attraction, discussed most heavily in the production literature, occurs in complex subjects modified by a PP (Bock & Miller, 1991):

(130) The path to the monuments were littered with bottles.
Here we provided the first online complexity data for a relatively understudied species of agreement attraction: attraction between a relative clause head and a relative clause verb, first reported by Kimball & Aissen (1971):

(131) The runners who the driver wave to ..

We found that comprehenders processed agreement attraction highly similarly in both RC and complex subject attraction. The occurrence of a plural attractor eased the RT disruption normally associated with subject-verb mismatches. Crucially, a plural attractor only reliably eased the subject-verb mismatch disruption; it did not increase difficulty for grammatical sentences. This pattern, which we describe as eliciting illusions of grammaticality, but never illusions of ungrammaticality, was mirrored in the judgment results. We concluded that agreement attraction is selectively fallible in comprehension: it is liable to intrusion of an ungrammatical analysis only when a grammatical analysis is not available. The results of our studies, across sentence types and experimental measures, failed to confirm the Symmetry Prediction of feature percolation accounts of attraction (Vigliocco & Nicol, 1998; Eberhard, Cutting, & Bock, 2005).

5.1.1.2 Modeling results (and Experiment 5)
We presented a cue-based retrieval account of our data which we formalized using Shiffrin and colleagues' Search of Associative Memory (Gillund & Shiffrin, 1984). In this model, attractors intrude in online comprehension when a constituent that fully matches the verb's retrieval cues is not found. In ungrammatical cases, a subject cannot be found with plural number, but a plural non-subject is partially activated. In grammatical cases, there is no plural feature to contact the non-subject. This account was developed initially for complex subject attraction, and then extended to RC attraction. There it was necessary to introduce a clause context cue, since the RC attractor is subject-like.
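The retrieval logic summarized above can be illustrated with a minimal Python sketch of a SAM-style sampling rule, in which each constituent's retrieval strength is the product of its associations to the retrieval cues, and the probability of sampling a constituent is its strength divided by the summed strength of all candidates. This is not the fitted model from the modeling chapter: the feature sets, the two strength values (`match`, `mismatch`), the function name, and the assumption that a singular verb contributes no number cue are hypothetical choices made only for illustration.

```python
def sampling_probabilities(cues, items, match=1.0, mismatch=0.2):
    """Return the probability of sampling each item under a SAM-style ratio rule.

    cues  : set of feature labels issued by the verb (hypothetical labels)
    items : dict mapping an item name to the set of features it carries
    match / mismatch : assumed cue-to-item strengths, not estimates from data
    """
    strengths = {}
    for name, features in items.items():
        strength = 1.0
        for cue in cues:
            # Multiply in one association strength per retrieval cue.
            strength *= match if cue in features else mismatch
        strengths[name] = strength
    total = sum(strengths.values())
    return {name: s / total for name, s in strengths.items()}


# Hypothetical encoding of "The path to the monuments ..."
ITEMS = {
    "head_noun": {"subject", "singular"},  # "the path"
    "attractor": {"plural"},               # "the monuments"
}

# Grammatical ("was"): assumed to issue only a subject cue.
grammatical = sampling_probabilities({"subject"}, ITEMS)
# Ungrammatical ("were"): issues a subject cue and a plural cue.
ungrammatical = sampling_probabilities({"subject", "plural"}, ITEMS)

if __name__ == "__main__":
    print("attractor sampled, grammatical:  ", round(grammatical["attractor"], 3))
    print("attractor sampled, ungrammatical:", round(ungrammatical["attractor"], 3))
```

Under these assumed values the attractor is sampled on about 17% of retrievals when the verb is singular and on 50% when it is plural, which reproduces the qualitative asymmetry described above: intrusion is substantial only when no constituent fully matches the verb's cues.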
Finally, the use of case cues predicted that there should be a small amount of intrusion from RC heads in subject position even in grammatical sentences, but not in object position. This was confirmed in a speeded grammaticality study: attraction was found in ungrammatical sentences when the RC was either subject- or object-attached. However, a slight but reliable effect was found in grammatical sentences as well when the RC was subject-attached.

5.1.2 Wh-dependency formation

5.1.2.1 The Coordinate Structure Constraint (Experiments 7-8)
We tested whether the Coordinate Structure Constraint (CSC) was respected in real-time processing. The CSC forbids extraction out of coordinated phrases, unless the same subconstituent is extracted out of each coordinate (Ross, 1967). We found that when a gap was detected inside a coordinate environment, the parser actively completed a second filler-gap dependency in the second coordinate. Experiment 8 confirmed that the parser did so without any evidence of a missing constituent. In a very similar environment, i.e., a post-verbal adjunct clause which could support a parasitic gap, we found no evidence of active dependency completion. Consequently we concluded that the comprehender was immediately aware of the implications of completing a wh-dependency in a coordinate phrase. Grammatical knowledge was directing dependency completion.

5.1.2.2 Locating the head of a wh-dependency (Experiments 9-10)
We tested whether the parser was as accurate at locating the head of a dependency as previous experimental results, including our own Experiments 7-8, indicate it is for the tail of a dependency. Specifically, we tested whether the presence of additional wh-dependency heads in a sentence, in similar structural positions, would interfere with the dependency formation process. We found no online evidence at the critical verb that an additional candidate wh-phrase interfered in completing a target dependency.
There was evidence in Experiment 9 that completing two wh-dependencies led to decreased accuracy. In that experiment, the distractor wh-phrase was nested inside the target dependency. It is possible that there was interference at the medial verb; but there may have also been an interaction between the gap search and the ambiguity resolution present in those materials. In Experiment 10, which considered same-sentence dependencies that were not nested, there was no observed decrement in accuracy. We presented two candidate mechanisms to explain these results in a content-addressable memory: in the first mechanism, wh-phrases are re-encoded when they have been successfully integrated and thus can be restricted from a verb-triggered search; in the second mechanism, a small amount of idiosyncratic information about the wh-phrase is carried forward that allows targeted retrieval. Our findings contrast somewhat with Van Dyke & McElree (2006), who found that words from an extra-syntactic memory load list could interfere in wh-dependency formation. We argued, however, that they provided no evidence that it was the dependency resolution process which the memory list interfered with, and that our manipulation constitutes a stronger test.

5.1.2.3 Carrying forward information in time (Experiments 11-13)
In the final set of experiments, we tested how well different measures of active dependency formation survive increasing serial and hierarchical dependency length. These measures included a plausibility manipulation (Traxler & Pickering, 1996), a verb-PP selectional restriction, and a pied-piping filled gap effect (Lee, 2004). We found that only the filled gap effect survived the longer dependencies. The increase in dependency length was relatively modest (in the serial length conditions, it only involved interpolating a 5-word PP region, corresponding to roughly 1 second of elapsed processing time),
so it was surprising that some measures of active dependency formation were so effectively attenuated. Both the plausibility and verb-PP selectional restriction manipulations required lexically-anchored information, while the filled-gap effect only required information about the category of the filler. We concluded that active dependency formation is most effectively guided by coarse-grained categorial information.

5.2 Broader Conclusions
Achieving structure sensitivity in a content-addressable memory is inherently difficult. Properties that constituents have by virtue of their hierarchical relation to other constituents, like c-command, cannot be encoded in constituent representations. These kinds of restrictions therefore cannot be enforced in a content-addressable, direct access search for constituents that license or participate in a dependency. Consequently, grammatically inaccessible constituents will be generated as candidates in the search process. This fact about embedding linguistic representations in a content-addressable memory architecture is a major determinant of grammatical inaccuracy in real-time processing.

There are ways of countering this inaccuracy. Our own experiments on wh-dependency processing, and a review of the existing literature, revealed relatively little inaccuracy in active dependency formation. Islands are respected or enforced and the head of the dependency is located without interference. Likewise the processing of backwards anaphora is highly grammatically accurate. In both wh-dependencies and backwards anaphora, the need to construct a dependency is announced by the first (temporally-occurring) element in the relationship; consequently the parser can search left-to-right. A predictive or prospective search for candidate constituents thus appears to be more accurate than a retrospective one. Some information about the syntactic context must be carried forward in these cases to allow local evaluation of the dependency.
We proposed that a small amount of information carried forward could aid in licensing the dependency and in targeting the retrieval of the right dependency head.

The selective fallibility observed in ungrammatical agreement attraction sentences and in the resolution of pronominal anaphora is consistent with a content-addressable memory architecture. Previous researchers have also proposed that complex subject attachment and NPI licensing are liable to partially matching candidates (Vasishth et al., 2005; Van Dyke & Lewis, 2003). Pursuing Xiang, Dillon & Phillips's (submitted) line of argumentation, we dismissed NPI licensing as relevant for triggering a constituent retrieval and thus as reflective of retrieval-based interference. It is an interesting question, though, whether NPI licensing would be more accurate if there were some kind of early signal that an NPI would appear in the sentence. Xiang, Dillon, & Phillips explained the spurious licensing of NPIs by appealing to a pragmatically-supported negative inference generated by contrastive relative clauses. If, however, an NPI was expected, the comprehender may be more careful about verifying whether or not an appropriate downward entailing environment is available.

As for complex subject attachment, we argued that previous experimental work has confounded clause number with the presence of a subject-like constituent (Van Dyke & Lewis, 2003; Van Dyke, 2007). In an experiment that deconfounded the two, we provided new evidence that complex subject attachment is not more difficult when there is a subject-like constituent in the context. We still found offline evidence, however, of increased difficulty in subject-interference conditions. In the previous literature, offline measures were also clearly most affected. On balance, it seems that subjects embedded within subjects do impact the interpretation of sentences, but they do not robustly lead to increases in online complexity.
This pattern of results is compatible with retrieval-based interference of the inaccessible subject, if the presence of a partially-matching constituent does not lead to RT increases in the selection process. However, it is also compatible with the influence of later comprehension processes or task-specific processes, such as sentence regeneration.

The final domain we discussed concerned reflexive anaphora. Even though reflexive anaphora resolution cannot generally be anticipated by the comprehender, it is highly grammatically accurate. Unlike in agreement attraction, feature-matching constituents inside a complex subject do not intrude in processing. One possibility is that the use of clause context cues can be made salient enough to restrict the retrieval of candidate constituents to the immediate clause. This account gains greater traction if there really is no syntactic interference in complex subject attachment: we could apply the same mechanism in both cases. The other possibility is that reflexive anaphora resolution does not rely purely on abstract cues: that is, it does not assemble its retrieval structure based on general grammatical cues, but based on cues specific to the present episode of sentence encoding. By hypothesis the verb or VP encoding contains a specific pointer to the actual subject, which it could pass to the anaphor. This conjecture is similar to one we offered to explain accuracy in locating the head of a wh-dependency: unique information can be used to retrieve specific constituents in a sentence that contains many abstractly similar constituents. If the second line of explanation is on the right track, then we might expect more broadly that certain grammatical devices, like feature-passing, could serve the function of encoding constituents not only with abstract features but also with pointers that may prove useful later.
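The contrast between retrieval by abstract cues and retrieval by an episode-specific pointer can be illustrated with a toy sketch. It is not the model developed in this dissertation: the feature names (`cat`, `num`, `clause`), the scoring rule (proportion of matching cues), and the `subject_pointer` field are all assumptions made purely for illustration. The example sentence fragment is the attraction configuration "The runners who the driver wave to ...".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Chunk:
    """A constituent encoding: abstract features plus an optional episodic pointer."""
    name: str
    features: Dict[str, str] = field(default_factory=dict)
    subject_pointer: Optional[str] = None  # hypothetical link from a verb to its own subject


def cue_match(chunk: Chunk, cues: Dict[str, str]) -> float:
    """Proportion of retrieval cues that the chunk's features satisfy."""
    hits = sum(1 for key, value in cues.items() if chunk.features.get(key) == value)
    return hits / len(cues)


def retrieve_by_cues(memory: List[Chunk], cues: Dict[str, str]) -> List[Chunk]:
    """Content-addressable retrieval: every chunk competes on cue overlap alone."""
    return sorted(memory, key=lambda chunk: cue_match(chunk, cues), reverse=True)


def retrieve_by_pointer(memory: List[Chunk], verb: Chunk) -> Chunk:
    """Episode-specific retrieval: follow the pointer stored on the verb."""
    by_name = {chunk.name: chunk for chunk in memory}
    return by_name[verb.subject_pointer]


# 'runners' is the attractor; 'driver' is the grammatical subject of 'wave'.
attractor = Chunk("runners", {"cat": "NP", "num": "pl", "clause": "main"})
subject = Chunk("driver", {"cat": "NP", "num": "sg", "clause": "rc"})
verb = Chunk("wave", {"cat": "V", "num": "pl", "clause": "rc"}, subject_pointer="driver")
memory = [attractor, subject]

# Cues projected by the plural verb: an NP, plural, in the verb's own clause.
cues = {"cat": "NP", "num": "pl", "clause": "rc"}

for chunk in retrieve_by_cues(memory, cues):
    print(f"{chunk.name}: cue match = {cue_match(chunk, cues):.2f}")
print("pointer retrieval:", retrieve_by_pointer(memory, verb).name)
```

In this toy setting the attractor and the true subject each satisfy two of the three cues, so cue overlap alone cannot separate them, whereas the pointer picks out the subject regardless of how many abstractly similar constituents are in memory.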
The memory architecture we have considered not only restricts the amount of information available to make parsing decisions, but also grants access to syntactic context in a structure-insensitive fashion. While it is undeniable that there is a cost to passing information forward in time, such a cost would often be justified by the benefit of rendering information about non-local constituents effectively local.

6 Appendices

A The Symmetry Prediction of Feature Percolation and RTs

The Symmetry Prediction of feature percolation accounts of agreement attraction states that the proportion of illusions of grammaticality should equal the proportion of illusions of ungrammaticality. If the perception of grammaticality, as a distribution of binary responses, shifts symmetrically in both grammatical and ungrammatical sentences, will reaction times also shift symmetrically? That is, would reading times in grammatical sentences be slowed down as much as reading times in ungrammatical sentences are sped up? Response times have a characteristically right-skewed distribution (Luce, 1986) and this is true of reading times in a self-paced reading task. The distribution of responses can be modeled as an ex-Gaussian (Hohle, 1965), a distribution formed by convolving the normal and exponential distributions. By means of simulation we confirm that mixing two ex-Gaussian distributions, corresponding to "perceived grammatical" and "perceived ungrammatical" internal responses, leads to linear shifts in the mean of the composite distribution. Therefore the Symmetry Prediction leads us to expect a symmetrical interaction in attractor sentence RTs.

We assume a simple model, where the reaction time response at the region of interest in attraction sentences is determined by mixing the reaction time distribution for grammatical responses with the reaction time distribution for ungrammatical responses. The mixing proportion is determined by the rate of percolation.
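The mixing assumption can be sketched in a few lines of Python. This is an illustrative re-implementation under stated assumptions, not the code used for the simulation reported in this appendix: the function names are hypothetical, the parameter values are copied from the captions of Figures 6-1 and 6-2, and the number of simulated trials and the random seed are arbitrary choices. The sketch compares the simulated mean of a mixture with the analytic expectation, which is a weighted average of the two component means.

```python
import random
from statistics import fmean

# (mu, sigma, tau) in ms, copied from the captions of Figures 6-1 and 6-2.
PARAMS_G = (210.0, 32.8, 104.0)  # grammatical baseline, RT_G
PARAMS_U = (200.0, 29.4, 165.0)  # ungrammatical baseline, RT_U


def sample_exgauss(rng, mu, sigma, tau):
    """One ex-Gaussian draw: a normal deviate plus an independent exponential deviate."""
    return rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau)


def sample_mixture(rng, p_from_u, n):
    """n trials, each drawn from RT_U with probability p_from_u and otherwise from RT_G."""
    trials = []
    for _ in range(n):
        params = PARAMS_U if rng.random() < p_from_u else PARAMS_G
        trials.append(sample_exgauss(rng, *params))
    return trials


def expected_mean(p_from_u):
    """Analytic mixture mean: a weighted average of the component means (mu + tau)."""
    mean_g = PARAMS_G[0] + PARAMS_G[2]
    mean_u = PARAMS_U[0] + PARAMS_U[2]
    return (1.0 - p_from_u) * mean_g + p_from_u * mean_u


rng = random.Random(0)  # fixed seed so the sketch is repeatable
for p in (0.0, 0.15, 0.30, 0.50):
    simulated = fmean(sample_mixture(rng, p, 50_000))
    print(f"p = {p:.2f}: simulated mean = {simulated:.1f} ms, expected = {expected_mean(p):.1f} ms")
```

Because the expected mean is linear in the mixing proportion, a percolation rate p shifts the grammatical attractor mean upward and the ungrammatical attractor mean downward by the same amount, p times the difference between the two baseline means.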
Equations (132) and (133) state this mixing assumption formally. In (132), $RT_{A/U}$ is the composite reaction time distribution for an ungrammatical sentence containing an attractor. Reaction times in this distribution are sampled with probability $p$ from the grammatical distribution $RT_G$ and with probability $(1 - p)$ from the ungrammatical distribution $RT_U$. In (133), the composite reaction time distribution for a grammatical sentence containing an attractor, $RT_{A/G}$, is given in the same terms.

(132) Ungrammatical attraction distribution
$RT_{A/U} = p \cdot RT_G + (1 - p) \cdot RT_U, \quad p < 1$

(133) Grammatical attraction distribution
$RT_{A/G} = (1 - p) \cdot RT_G + p \cdot RT_U, \quad p < 1$

It is not difficult to show that, given this model, the means of the composite distributions are linearly related to the mixing proportion. We do so by simulation. The logic of the analysis and its results are given step-by-step.

1. An ex-Gaussian distribution is generated by adding a normally distributed variable (which determines the leading edge of the RT distribution) to an independent exponentially distributed variable (which gives the long tail). Ex-Gaussian distributions have three parameters: $\mu$ - the normal mean; $\sigma$ - the normal standard deviation; $\tau$ - the exponential mean. The mean of the ex-Gaussian is simply $\mu + \tau$, and its variance is $\sigma^2 + \tau^2$. See Hohle (1965) for further details.
2. The ex-Gaussian parameters for $RT_G$ were estimated from the Sg [Sg] grammatical conditions in Wagers, Lau & Phillips (2008) Experiment 4, Region 8. Parameters for $RT_U$ were estimated from the Sg [Sg] ungrammatical conditions.
3. The ex-Gaussian parameters are fit via maximum likelihood. See the following web site for useful discussion on how to do this in the R language: http://users.fmg.uva.nl/rgrasman/rpages/2007/07/ex-gaussian-distribution-for-reaction.html.
4. Figures 6-1 and 6-2 below show the observed Sg [Sg] grammatical and ungrammatical distributions [59]. The continuous ex-Gaussian distribution generated by the estimated parameters is superimposed.
Details of the observed and fit distributions are given in the figure captions.

Figure 6-1 Sg [Sg] Grammatical RT Distribution: estimated $RT_G$
Observed parameters: mean = 313 ms; variance = 12010 ms$^2$
Estimated ex-Gaussian parameters: $\mu$ = 210 ms; $\sigma$ = 32.8 ms; $\tau$ = 104 ms

[59] These distributions represent all RTs in the condition/region collapsed across participants. Please note that we are assuming that the composite distributions are generated at the level of the participant, who samples from the pure distributions from trial to trial. There was not enough data to estimate parameters per participant and average. For the purposes of the simulation, that is irrelevant since we are only interested in how any ex-Gaussian behaves (and not in interpreting the parameters).

Figure 6-2 Sg [Sg] Ungrammatical RT Distribution: estimated $RT_U$
Observed parameters: mean = 365 ms; variance = 25891 ms$^2$
Estimated ex-Gaussian parameters: $\mu$ = 200 ms; $\sigma$ = 29.4 ms; $\tau$ = 165 ms

5. Inspection of the parameters shows that the difference seems to be carried largely in the mean of the exponential component [60].
[60] This is a casual observation. Note that the distribution of the ex-Gaussian parameter estimates is not known analytically, so the only way to do statistical inference on the parameters would be something like bootstrapping.
6. Based on the parameter estimates, it is possible to generate "mixed" populations according to equations (132)-(133). For example, if $p$ = 0.15, then $RT_{A/G}$ can be generated by sampling 15% of its values from $RT_U$ and 85% from $RT_G$.
7. 50 experiments were simulated, in which attractor RT distributions were generated for n = 225 trials. Mean differences between the baseline distribution and the mixed distribution were computed. Figure 6-3 reports the results as follows:
a. The x-axis corresponds to the proportion of trials drawn from the grammatical distribution. The y-axis corresponds to the mean difference between the attractor and non-attractor distributions.
Error bars correspond to the standard deviation of mean differences.
b. Blue symbols correspond to the grammatical attractor condition, and indicate how much one would slow down in the presence of an attractor, assuming percolation.
c. Red symbols correspond to the ungrammatical attractor condition, and indicate how much one would speed up in the presence of an attractor, assuming percolation.
d. To work an example: assume percolation happens 30% of the time, so $p$ = 0.3.
i. Grammatical slow-down is given by $(1 - p)$ on the x-axis: 0.7.
ii. Ungrammatical speed-up is given by $p$: 0.3.
8. Inspection reveals that speed-ups and slow-downs are symmetrical. That is, the relationship between mixing proportion and RT difference is linear.

Figure 6-3 Simulation results: $RT_{A/G}$ / $RT_{A/U}$ means shift symmetrically

B Experiments 1-2 Omnibus RM-ANOVA Tables

Experiment 1
ANOVA tests reliable at $\alpha$ = 0.05 in bold. MSE: MS Effect

Experiment 2: 2 x 2 x 2 ANOVA
ANOVA tests reliable at $\alpha$ = 0.05 in bold. MSE: MS Effect
Column groups: By participants (df, MS effect, F1, p); By items (df, MS effect, F2, p); MinF' (df, minF', p)
Region 2 (RC head)
Grammaticality 1,5 643 0.35 0.56 1,47 2416 2.08 0.16 1,73 0.3 0.59
Attractor number 1,5 2084 8.18 0.01 1,47 14098 5.89 0.02 1,96 3.42 0.07
Subject number 1,5 2643 1.43 0.24 1,47 4792 2.05 0.16 1,101 0.84 0.36
At-Num x gram. 1,5 9320 2.52 0.12 1,47 3402 1.05 0.31 1,82 0.74 0.39
Sub-Num x gram 1,5 71 0.39 0.59 1,47 580 0.26 0.62 1,10 0.14 0.71
A-num x S-num 1,5 9320 2.52 0.12 1,47 6379 2.36 0.13 1,101 1.2 0.27
3-way interaction 1,5 1042 0.53 0.47 1,47 353 0.15 0.70 1,72 0.12 0.73
Region 3 ('who')
Grammaticality 1,5 314 0.29 0.59 1,47 75 < 0.1 0.84 1,102 0.91 0.34
Attractor number 1,5 8909 5.52 0.02 1,47 5052 3.6 0.06 1,94 2.18 0.14
Subject number 1,5 212 1.26 0.27 1,47 798 4.2 0.05 1,84 0.97 0.3
At-Num x gram.
1,5 2 < 0.1 0.97 1,47 18 < 0.1 0.93 1,78 < 0.1 1 Sub-Num x gram 1,5 152 < 0.1 0.76 1,47 570 0.47 0.5 1,75 < 0.1 0.78 A-num x S-num 1,5 1876 1.01 0.32 1,47 1854 1.9 0.18 1,97 0.6 0.42 3-way interaction 1,5 427 4.03 0.05 1,47 3437 1.93 0.17 1,86 1.3 0.26 Region 5 (RC subj) Gramaticality 1,5 703 0.27 0.60 1,47 2749 1 0.32 1,82 0.21 0.65 Attractor number 1,5 15895 7.03 0.01 1,47 10596 4.75 0.03 1,95 2.83 0.1 Subject number 1,5 861 4.34 0.04 1,47 924 5.35 0.03 1,102 2.4 0.12 At-Num x gram. 1,5 828 0.39 0.54 1,47 3723 1.02 0.32 1,90 0.28 0.6 Sub-Num x gram 1,5 1945 0.76 0.39 1,47 2940 1.36 0.25 1,98 0.49 0.49 A-num x S-num 1,5 2941 1.21 0.28 1,47 1948 0.76 0.39 1,93 0.46 0.5 3-way interaction 1,5 3871 1.67 0.20 1,47 319 1.21 0.28 1,96 0.70 0.4 Region 6 (verb) Gramaticality 1,5 942 0.3 0.59 1,47 1831 0.72 0.40 1,92 0.21 0.65 Atractor number 1,5 143 0.61 0.4 1,47 261 < 0.1 0.80 1,57 < 0.1 0.81 Subject number 1,5 3505 1.6 0.01 1,47 3105 1.4 0.01 1,101 5.76 0.02 At-Num x gram. 1,5 540 0.2 0.64 1,47 61 0.02 0.8 1,56 < 0.1 0.89 Sub-Num x gram 1,5 717 0.25 0.62 1,47 147 0.45 0.50 1,98 0.16 0.69 A-num x S-num 1,5 1 < 0.1 0.9 1,47 12 < 0.1 0.96 1,64 < 0.1 0.9 3-way interaction 1,5 7927 2.39 0.13 1,47 6128 1.79 0.19 1,7 1.02 0.32 Region 7 (verb+1) Gramaticality 1,5 36376 28.6 <0.01 1,47 25124 42.1 <0.01 1,101 17 <0.01 Atractor number 1,5 1147 1.81 0.18 1,47 4380 1.12 0.29 1,93 0.69 0.41 Subject number 1,5 4289 0.45 0.50 1,47 1817 0.45 0.51 1,101 0.2 0.64 At-Num x gram. 1,5 19253 3.48 0.07 1,47 13675 1.89 0.18 1,89 1.23 0.27 Sub-Num x gram 1,5 216 0.54 0.47 1,47 141 < 0.1 0.86 1,52 < 0.1 0.86 A-num x S-num 1,5 130 < 0.1 0.89 1,47 1517 0.29 0.59 1,62 < 0.1 0.89 3-way interaction 1,5 104 0.30 0.58 1,47 546 1.37 0.25 1,78 0.25 0.62 Region 8 (verb+2) Gramaticality 1,5 540 1.39 0.24 1,47 156 0.47 0.50 1,7 0.35 0.56 Atractor number 1,5 2723 0.92 0.34 1,47 2785 1.04 0.31 1,102 0.49 0.49 Subject number 1,5 45 < 0.1 0.90 1,47 49 0.02 0.90 1,102 < 0.1 0.9 At-Num x gram. 
1,5 739 2.65 0.1 1,47 9174 2.85 0.10 1,102 1.37 0.24 Sub-Num x gram 1,5 438 1.30 0.26 1,47 2174 0.90 0.35 1,95 0.53 0.47 A-num x S-num 1,5 1501 0.61 0.4 1,47 545 0.25 0.62 1,82 0.18 0.67 3-way interaction 1,5 27327 6.03 0.02 1,47 3186 7.68 0.01 1,102 3.38 0.07 330 Experiment 2: 2 x 2 ANOVAs First number is 2x2 for RC Subject=singular, second number is 2x2 for RC Subject=plural. ANOVA Tests reliable at ? = 0.05 in bold. MSE: MS Effect By participants By items MinF' df MS efect F 1 p df MS efect F 2 p df minF' p Region 2 (RC head) Gramaticality 1,5 1354 | 1 0.65 | < 0.1 0.43 | 0.98 1,47 2682 | 314 1.79 | 0.16 0.19 | 0.69 1,8 | 1,5 0.47 | < 0.1 0.49 | 0.98 Attractor numb. 1,5 28383 | 1021 9.4 | 0.35 < 0.01 | 0.56 1,47 19721 | 75 7.39 | 0.31 0.01 | 0.58 1,98 | 1,10 4.14 | 0.17 0.04 | 0.68 Number x gram. 1,5 7950 | 1894 4.19 | 0.84 0.05 | 0.36 1,47 2974 | 781 0.95 | 0.31 0.3 | 0.58 1,68 | 1,79 0.78 | 0.23 0.38 | 0.63 Region 3 ('who') Gramaticality 1,5 451 | 15 0.35 | < 0.1 .56 | .92 1,47 15 | 530 < 0.1 | 0.34 0.79 | 0.56 1,6 | 1,56 < 0.1 | < 0.1 0.81 | 0.92 Atractor number 1,5 1305 | 9480 0.54 | 9.1 0.47 | < 0.01 1,47 393 | 6514 0.29 | 6.35 0.59 | .02 1,89 | 1,96 0.19 | 3.74 0.6 | 0.06 Number x gram. 1,5 2319 | 210 0.35 | 1.87 0.56 | 0.18 1,47 1478 | 1978 0.84 | 0.86 0.36 | 0.36 1,92 | 1,85 0.25 | 0.59 0.62 | 0.4 Region 5 (RC subj.) Gramaticality 1,5 15 | 2494 < 0.1 | 0.78 0.78 | 0.38 1,47 2 | 5687 < 0.1 | 2.2 0.98 | 0.14 1,48 | 1,85 < 0.1 | 0.57 0.98 | 0.45 Atractor number 1,5 2581 | 1625 0.98 | 7.85 0.3 | 0.01 1,47 1729 | 10815 0.36 | 3.86 0.36 | .06 1,10 | 1,87 0.46 | 2.59 0.5 | 0.1 Number x gram. 
1,5 59 | 4140 0.27 | 1.76 0.61 | 0.19 1,47 6 | 7036 < 0.1 | 1.78 0.96 | 0.19 1,48 | 1,101 < 0.1 | 0.8 0.96 | 0.35 Region 6 (verb) Gramaticality 1,5 8 | 1651 < 0.1 | 0.42 0.95 | 0.52 1,47 10 | 329 < 0.1 | 1.1 0.95 | 0.3 1,101 | 1,87 < 0.1 | 0.31 0.97 | 0.58 Atractor number 1,5 67 | 767 0.27 | 0.2 0.6 | 0.6 1,47 194 | 80 < 0.1 | < 0.1 0.84 | 0.89 1,61 | 1,5 < 0.1 | < 0.1 0.84 | 0.89 Number x gram. 1,5 6302 | 2165 2.14 | 0.74 0.15 | 0.39 1,47 3708 | 2482 1.13 | 0.82 0.29 | 0.37 1,89 | 1,102 0.74 | 0.39 0.39 | 0.53 Region 7 (verb+1) Gramaticality 1,5 154603 | 21389 18.2 | 25.5 <.01 | <.01 1,47 131642 | 19723 17.8 | 38.1 <.01 | <.01 1,101 | 1,98 25.5 | 38.1 <.01 | <.01 Atractor number 1,5 436 | 6841 0.93 | 0.87 0.34 | 0.36 1,47 527 | 371 1.1 | < 0.1 0.3 | 0.7 1,102 | 1,57 0.51 | < 0.1 0.48 | 0.36 Number x gram. 1,5 14631 | 566 3.40 | 1.2 0.07 | 0.27 1,47 18319 | 902 3.84 | 0.14 0.06 | 0.71 1,102 | 1,58 1.8 | 0.12 0.18 | 0.73 Region 8 (verb+2) Gramaticality 1,5 37 | 9841 < 0.1 | 2.12 0.91 | 0.15 1,47 25 | 3715 < 0.1 | 1.37 0.78 | 0.25 1,73 | 1,93 < 0.1 | 0.83 0.92 | 0.36 Atractor number 1,5 4135 | 90 1.4 | < 0.1 0.24 | 0.85 1,47 2896 | 43 0.93 | 0.26 0.34 | 0.62 1,94 | 1,70 0.56 | < 0.1 0.46 | 0.86 Number x gram. 1,5 32075 | 291 8.52 | 0.81 <0.01 | 0.37 1,47 3763 | 3427 8.63 | 1.14 0.01 | 0.29 1,101 | 1,101 4.29 | 0.47 0.04 | 0.49 C Experiments 7, 8, & 11A Omnibus RM-ANOVA Tables NB: Significance tests: p: 0 *** 0.001 ** 0.01 * 0.05 ? 0.10 MSE: MS Effect Experiment 7 VP STRUCTURE FILLER PLAUSIBILITY STRUCTURE ? 
PLAUSIBILITY Region 7 VP 1 adverb F 1 : 0.4; MSE: 817 F 2 : 0.0; SE: 298 F 1 : 0.0; MSE: 169 F 2 : 0.1; SE: 181 F 1 : 0.0; MSE: 0.0 F 2 : 0.0; SE: 13 Region 8 VP 1 verb F 1 : 0.7; MSE: 4182 F 1 : 0.4; SE: 2526 F 1 : 0.3; MSE: 12982 F 1 : 0.5; SE: 19470 F 1 : 0.0; MSE: 17 F 2 : 0.2; SE: 1257 Region 9 Cordinator/ Preposition *F 1 : 1.1; MSE: 295434 *F 2 : 5.2; SE: 231453 F 1 : 1.4; MSE: 29350 F 2 : 1.3; SE: 27215 F 1 : 0.3; MSE: 5091 F 2 : 0.3; SE: 7181 Region 10 VP 2 adverb *F 1 : 8.7; MSE: 250432 *F 2 : 5.0; SE: 182765 F 1 : 0.0; MSE: 493 F 2 : 0.0; SE: 14 F 1 : 2.8; MSE: 83234 F 2 : 1.3; SE: 65292 Region 1 VP 2 verb F 1 : 1.6; MSE: 84783 F 2 : 2.0; SE: 15712 F 1 : 2.4; MSE: 7471 *F 2 : 5.5; SE: 76929 F 1 : 0.0; MSE: 208 F 2 : 0.0; SE: 89 Region 12 VP 2 verb + 1 ?F 1 : 3.0; MSE: 9836 ?F 2 : 3.0; SE: 94310 F 1 : 0.9; MSE: 1506 F 2 : 1.1; SE: 15248 F 1 : 0.6; MSE: 7934 F 2 : 0.4; SE: 8257 Region 13 VP 2 verb + 2 **F 1 : 31.6; MSE:19454 **F 2 : 19.9; SE: 107296 F 1 : 0.20; MSE: 5083 F 2 : 0.07; SE: 219 *F 1 : 7.9; MSE:147273 *F 2 : 4.5; SE: 17297 Region 14 VP 2 verb + 3 F 1 : 0.3; MSE: 391 F 2 : ~0; SE: 54 F 1 : 1.7; MSE: 18320 F 2 : 0.9; SE: 2078 F 1 : 0.8; MSE: 16320 F 2 : 0.8; SE: 21034 Region 15 VP 2 verb + 4 ?F 1: 3.9; MSE: 80104 F 2 : 2.7; SE: 108783 F 1 : 0.1; MSE: 539 F 2 : ~0; SE: 52 F 1 : 0.4; MSE: 431 F 2 : 0.3; SE: 6060 Numerator df in each manipulated factor: 1. Subject n: 36. Item n: 24 332 Experiment 8 VP STRUCTURE FILLER PLAUSIBILITY STRUCTURE ? PLAUSIBILITY Region 6 VP 1 verb or P prep. F 1 : 2.0; MSE: 47908 F 2 : 0.8; SE: 48478 F 1 : ~0; MSE: 65 F 2 : 0.1; SE: 655 F 1 : 2.5; MSE: 48105 F 2 : 1.0; SE: 60876 Region 7 VP 1 verb + 1 P prep. 
+ 1 F 1 : 0.3; MSE: 393 F 2: 0.0; SE: 58 F 1 : 1.0; MSE: 625 F 2 : 0.2; SE: 4768 *F 1 : 4.8; MSE: 3625 F 2 : 1.8; SE: 31967 Region 8 VP 1 verb + 2 P prep + 2 *F 1 : 12.6; MSE:26161 *F 2 : 8.5; SE: 281351 F 1 : 0.3; MSE: 2585 F 2 : 0.0; SE: 630 F 1 : 0.3; MSE: 328 F 2 : 0.4; SE: 154 Region 9 VP 1 verb + 3 P prep + 4 **F 1 : 21.9; MSE:497323 *F 2 : 9.5; SE: 42453 F 1 : 0.1; MSE: 2716 F 2 : 0.0; SE: 138 F 1 : 1.7; MSE: 2016 F 2 : 0.5; SE: 8951 Region 10 P prep + 5 **F 1 : 27.2; MSE:654984 **F 2 : 16.7; SE:720729 F 1 : 0.6; MSE: 9607 F 2 : 0.2; SE: 576 F 1 : 0.1; MSE: 2037 F 2 : 0.1; SE: 3089 Region 1 Adverb **F 1 : 15.7: MSE:290247 F 2 : 2.7; MSE: 27167 F 1 : 0.5; MSE: 17590 F 2 : 0.4; SE: 2401 F 1 : 2.4; MSE: 6569 F 2 : 2.3; SE: 8303 Region 12 VP 2 verb **F 1 : 17.7; MSE:30954 *F 2 : 5.9; MSE: 372924 *F 1 : 4.9; MSE: 76503 F 2 : 1.5; SE: 41584 F 1 : 0.4: MSE: 10272 F 2 : 0.3; SE: 9308 Region 13 Ground Arg 1 *F 1 : 4.8; MSE: 6446 *F 2 : 6.4; SE: 75865 *F 1 : 6.8; MSE: 102345 ?F 2 : 3.0; SE: 8161 F 1 : 0.1; MSE: 386 F 2 : 0.0; SE: 574 Region 14 Ground Arg 2 *F 1 : 8.9; MSE: 7807 ?F 2 : 4.2; SE: 87912 F 1 : ~0; MSE: 60 F 2 : 0.0; SE: 359 F 1 : 0.3; MSE: 1680 F 1 : 0.1: SE: 1640 Region 15 Ground Arg 3 F 1 : 3.8; MSE: 4839 F 2 : 2.3; SE: 4052 ?F 1 : 3.5; MSE: 2626 F 2 : 0.6; SE: 14936 *F 1 : 4.6; MSE: 328 F 2 : 0.6; SE: 1751 Region 16 Ground Arg 4 F 1 : 0.9; MSE: 14312 F 2 : 1.3; SE: 21848 *F 1 : 9.1; MSE: 9796 F 2 : 2.6; SE: 74079 F 1 : 0.5; MSE: 4012 F 2 : 1.3; SE: 2012 Region 17 Figure prep. 
F 1 : 0.9; MSE: 687 F 2 : 1.0; SE: 7083 *F 1 : 7.8; MSE: 3710 F 2 : 1.5; SE: 41409 F 1 : 0.3; MSE: 1497 F 2 : 0.3; SE: 3683 Region 18 AdvP Word 1 *F 1 : 6.7; MSE: 5301 F 2 : 3.5; SE: 5961 *F 1 : 8.0; MSE: 71070 F 2 : 2.2; SE: 59426 F 1 : 0.0; MSE: 308 F 2 : ~0; SE: 50 Region 19 AdvP Word 2 *F 1 : 4.8; MSE: 75894 ?F 2 : 4.1; SE: 120830 *F 1 : 5.1; MSE: 137215 ?F 2: 3.0; SE: 97036 F 1 : 0.1; MSE: 860 F 2 : 0.4; SE: 14768 Region 20 AdvP Word 3 F 1 : 1.9; MSE: 2495 F 2 : 1.0; SE: 2305 **F 1 : 2.3; MSE: 78628 *F 2 : 10.4; SE: 18923 F 1 : 1.1; MSE: 7324 F 2 : 0.4; SE: 7361 Numerator df in each manipulated factor: 1. Subject n: 31. Item n: 24 333 Experiment 11A DEPENDENCY LENGTH FILLER PLAUSIBILITY LENGTH ? PLAUSIBILITY Region 1 Adverb F 1 : 2.4; MSE: 84672 F 2 : 2.6; SE: 67148 F 1 : 0.5; MSE: 10728 F 2 : 0.0; SE: 1078 F 1 : 0.7; MSE: 19646 F 2 : 1.1; SE: 34595 Region 12 Verb F 1 : 0.8; MSE: 29538 F 2 : 0.6; SE: 26439 F 1 : 0.2; MSE: 7270 F 2 : 0.0; SE: 1085 F 1 : 2.2; MSE: 9546 F 2 : 1.8; SE: 69534 Region 13 Ground Arg 1 F 1 : 1.2; MSE: 38526 F 2 : 1.7; SE: 4509 *F 1 : 13.3; MSE: 201845 *F 2 : 9.1; SE: 27367 F 1 : 0.2; MSE: 6148 F 2 : 0.1; SE: 2435 Region 14 Ground Arg 2 F 1 : 0.3; MSE: 7987 F 2 : 0.3; SE: 6818 F 1 : 0.3; MSE: 5894 F 2 : 0.1; SE: 2768 ?F 1 : 3.5; MSE: 6349 F 2 : 1.9; SE: 46906 Region 15 Ground Arg 3 F 1 : 0.7; MSE: 20853 F 2 : 1.1; SE: 25423 F 1 : 1.6; MSE: 27875 F 2 : 1.3; SE: 43107 F 1 : 0.3; MSE: 568 F 2 : 0.0; SE: 10 Region 16 Ground Arg 4 F 1 : 1.4; MSE: 13097 F 2 : 0.5; SE: 1494 ?F 1 : 3.4; MSE: 49719 F 2 : 2.3; SE: 7152 F 1 : 1.8; MSE: 6898 F 2 : 1.5; SE: 34298 Region 17 Figure Prep *F 1 : 6.3; MSE: 10905 *F 2 : 9.4; SE: 131393 F 1 : 0.6: MSE: 89 F 2 : 1.1; SE: 1860 F 1 : ~0; MSE: 68 F 2 : 0.1; SE: 104 Region 18 AdvP Word 1 F 1 : 0.7; MSE: 14124 F 2 : 0.7; SE: 20491 *F 1 : 7.4; MSE: 2750 *F 2 : 7.2: SE: 205372 F 1 : 0.5; MSE: 15231 F 2 : 0.1; SE: 2471F Region 19 AdvP Word 2 F 1 : 0.3; MSE: 3636 F 2 : 0.1; SE: 2368 *F 1 : 13.1; MSE: 19729 *F 2 : 
8.9; SE: 243801 F 1 : ~0; MSE: 182 F 2 : 0.0; SE: 171 Region 20 AdvP Word 3 F 1 : 1.5; MSE: 17357 F 2 : 1.9; SE: 15909 *F 1 : 5.4; MSE: 52683 *F 2 : 9.4; SE: 7846 ?F 1 : 4.1; MSE: 34742 F 2 : 1.1; SE: 217
Numerator df in each manipulated factor: 1. Subject n: 24. Item n: 24
Significance tests: p: 0 ** 0.01 * 0.01 * 0.05 ? 0.10.
Notes: In short dependencies, the regions above correspond to ordinal word positions 6-15, but these have been renamed to facilitate comparison with long dependencies.

D Experiment 11B Reading times for Short and Long:P Conditions

2.5 S.D. Trimmed Reading Times
Mean and standard error reported in ms
Columns: Region; Short, Plausible (mean, s.e.); Short, Implausible (mean, s.e.); P, Plausible (mean, s.e.); P, Implausible (mean, s.e.)
1 347 9 353 1 39 9 31 9
2 345 10 356 1 349 1 39 10
3 327 9 36 10 379 14 345 12
4 317 9 353 12 321 8 32 9
5 37 1 39 1 348 1 30 10
6 30 7 319 8 32 9 36 9
7 326 9 37 10 32 8 313 7
8 326 10 340 10 317 8 318 9
9 36 12 32 9 352 12 30 9
10 329 10 33 10 352 1 30 9
11 341 1 354 12 348 12 350 1
12 353 13 36 1 359 1 362 1
13 342 10 360 12 343 9 357 10
14 348 12 372 12 348 10 359 10
15 347 12 364 12 359 12 359 1
16 343 1 352 10 340 9 341 9
17 325 8 341 9 37 8 342 8
18 329 10 353 10 37 9 36 8
19 36 10 358 9 345 10 371 13
20 341 10 367 10 357 12 363 12
21 349 1 369 10 351 9 345 9
22 324 9 347 10 36 9 34 8

7 References

Altmann, G.T.M., & Kamide, Y. (1999). Incremental interpretation at verbs: restricting the domain of subsequent reference. Cognition, 73, 247-64.
Anderson, J.R. (1983). Retrieval of information from long-term memory. Science, 220(4592), 25-30.
Anderson, J.R., Bothell, D., Byrne, M.D., Douglass, S., Lebiere, C., & Qin, Y. (2004). An integrated theory of mind. Psychological Review, 111(4), 1036-60.
Anderson, M.C. & Neely, J.H. (1996). Interference and inhibition in memory retrieval. In E.L. Bjork & R.A. Bjork (Eds.), Memory. Handbook of Cognition and Perception (2nd ed.), pp. 237-313. San Diego: Academic Press.
Anderson, S.R. (1971).
On the role of deep structure in semantic interpretation. Foundations of Language, 7, 387-396.
Aoshima, S., Phillips, C., & Weinberg, A.S. (2004). Processing filler-gap dependencies in a head-final language. Journal of Memory and Language, 51, 23-54.
Baddeley, A.D. (1986). Working Memory. Oxford: Clarendon Press.
Badecker, W., & Kuminiak, F. (2007). Morphology, agreement and working memory retrieval in sentence production: Evidence from gender and case in Slovak. Journal of Memory and Language, 56, 65-85.
Badecker, W., & Lewis, R. (2007). A new theory and computational model of working memory in sentence production: agreement errors as failures of cue-based retrieval. Proceedings of the 20th Annual CUNY Conference on Human Sentence Processing. La Jolla, CA: University of California, San Diego.
Badecker, W., & Straub, K. (2002). The processing role of structural constraints on the interpretation of pronouns and anaphors. Journal of Experimental Psychology: Learning, Memory and Cognition, 28, 748-769.
Baayen, R. H., Davidson, D. J., & Bates, D. M. (submitted). Mixed-effects modeling with crossed random effects for subjects and items.
Barss, A., & Lasnik, H. (1986). A note on anaphora and double objects. Linguistic Inquiry, 17, 347-54.
Bayer, S., & Johnson, M. (1996). Features and Agreement. In Proceedings of the 33rd Annual Meeting on Association for Computational Linguistics. Morristown, NJ: Association for Computational Linguistics, 70-76.
Bertram, R., Hyona, J. & Laine, M. (2000). Word morphology in context: Effects of base and surface frequency. Language and Cognitive Processes, 15, 367-388.
Berwick, R. & Weinberg, A. (1984). The grammatical basis of linguistic performance. Cambridge, MA: MIT Press.
Bever, T. (1974). The Ascent of the Specious or There's a lot we Don't Know about Mirrors. In Cohen, D. (ed.), Explaining Linguistic Phenomena. Washington, DC: Hemisphere Publishing Company.
Bhatt, R. (1999). Covert Modality in Non-finite Contexts, Ph.D.
Dissertation, University of Pennsylvania.
Blouin, D. C., & Riopelle, A. J. (2005). On confidence intervals for within-subjects designs. Psychological Methods, 10, 397-412.
Bock, J. K. & Cutting, J. C. (1992). Regulating mental energy: Performance units in language production. Journal of Memory and Language, 31, 99-127.
Bock, J. K. & Eberhard, K. M. (1993). Meaning, sound, and syntax in English number agreement. Language and Cognitive Processes, 8, 57-99.
Bock, J. K., Eberhard, K. M., & Cutting, J. C. (2004). Producing number agreement: How pronouns equal verbs. Journal of Memory and Language, 51, 251-278.
Bock, J. K., & Miller, C. A. (1991). Broken agreement. Cognitive Psychology, 23, 45-93.
Bock, K., Nicol, J., & Cutting, J.C. (1999). The ties that bind: Creating number agreement in speech. Journal of Memory and Language, 40, 330-346.
Boland, J.E., Tanenhaus, M.K., Garnsey, S.M., & Carlson, G.N. (1995). Verb argument structure in parsing and interpretation: evidence from wh-questions. Journal of Memory and Language, 34, 774-806.
Bourdages, J.S. (1992). Parsing complex NPs in French. In Goodluck, H., & Rochemont, M.S. (eds.), Island constraints: Theory, acquisition and processing. Dordrecht: Kluwer.
Bousfield, W.A. (1953). The occurrence of clustering in the recall of randomly arranged associates. Journal of General Psychology, 49, 229-40.
Bresnan, J. (2001). Lexical-Functional Syntax. Malden, MA: Blackwell.
Broadbent, D. E. (1958). Perception and communication. New York: Oxford University Press.
Chomsky, N. (1977). On Wh-Movement. In Culicover, P.W., Wasow, T., & A. Akmajian, eds., Formal syntax. New York: Academic Press. 71-132.
Chomsky, N. (1981). Lectures on government and binding. Dordrecht: Foris.
Chomsky, N. (1994). Bare phrase structure. MIT Occasional Papers in Linguistics 5. Cambridge, MA: MIT Working Papers in Linguistics.
Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: MIT Press.
Clark, H. (1973).
The language-as-a-fixed-effect fallacy: critique of language statistics in psychological research. Journal of Verbal Learning and Verbal Behavior, 12, 335-359.
Clark, S. E., & Gronlund, S. D. (1996). Global matching models of recognition memory: How the models match the data. Psychonomic Bulletin and Review, 3, 37-60.
Clifton, C., Kennison, S.M., & Albrecht, J.E. (1997). Reading the words him and her: implications for parsing principles based on frequency and on structure. Journal of Memory and Language, 36, 276-292.
Clifton, C., Jr., Frazier, L., & Deevy, P. (1999). Feature manipulation in sentence comprehension. Rivista di Linguistica, 11, 11-39.
Cofer, C.N., Bruce, D.R., & Reicher, G.M. (1966). Clustering in free recall as a function of certain methodological variations. Journal of Experimental Psychology, 71, 858-866.
Coulson, S., King, J.W., & Kutas, M. (1998). Expect the unexpected: Event-related brain response to morphosyntactic violations. Language and Cognitive Processes, 13, 21-58.
Cowan, N. (1995). Attention and Memory: An Integrated Framework. New York: Oxford UP.
Cowan, N. (2001). The magical number 4 in short-term memory: a reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24, 87-185.
Cowart, W., & Cairns, H.S. (1987). Evidence for an anaphoric mechanism within syntactic processing: some reference relations defy semantic and pragmatic constraints. Memory & Cognition, 15, 318-331.
Cowper, E.A. (1976). Constraints on sentence complexity: a model for syntactic processing. Ph.D. Dissertation, Brown University.
Craik, F.I.M., & Lockhart, R.S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11, 671-684.
Crain, S., & Fodor, J.D. (1985). How can grammars help parsers? In Dowty, D., Karttunen, L., & Zwicky, A. (eds.), Natural language parsing. Cambridge: Cambridge University Press.
Deane, P. (1991). Limits to attention: A cognitive theory of island constraints.
Cognitive Linguistics, 2, 1-63.
Dell, G.S. (1986). A spreading-activation theory of retrieval in sentence production. Psychological Review, 93, 283-321.
den Dikken, M. (2001). "Pluringulars", pronouns and quirky agreement. The Linguistic Review, 18, 19-41.
Dosher, B.A. (1976). The retrieval of sentences from memory: A speed-accuracy study. Cognitive Psychology, 8, 291-310.
Dosher, B.A. (1981). The effects of delay and interference: A speed-accuracy study. Cognitive Psychology, 13, 551-582.
Drenhaus, H., Saddy, D. & Frisch, St. (2005). Processing negative polarity items: When negation comes through the backdoor. In Kepser, S. & Reis, M. (Eds.), Linguistic Evidence - Empirical, Theoretical, and Computational Perspectives. Berlin: Mouton de Gruyter, 145-165.
Eberhard, K. (1997). The marked effect of number on subject-verb agreement. Journal of Memory and Language, 36, 147-164.
Eberhard, K. (1999). The accessibility of conceptual number to the processes of subject-verb agreement in English. Journal of Memory and Language, 36, 147-164.
Eberhard, K. M., Cutting, J. C., & Bock, J. K. (2005). Making syntax of sense: Number agreement in sentence production. Psychological Review, 112, 531-559.
Eich, J.E. (1980). The cue-dependent nature of state-dependent retrieval. Memory & Cognition, 8, 157-173.
Eich, J.M. (1982). A composite holographic associative recall model. Psychological Review, 89, 627-61.
Engdahl, E. (1983). Parasitic gaps. Linguistics and Philosophy, 6, 5-34.
Fayol, M., Largy, P., & Lemaire, P. (1994). Cognitive overload and orthographic errors: When cognitive overload enhances subject-verb agreement errors. A study in French written language. Quarterly Journal of Experimental Psychology A, 47, 437-464.
Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39(2), 175-91.
Ferreira, F. & Henderson, J.M. (1991).
Recovery from misanalyses of garden-path sentences. Journal of Memory and Language, 30, 725-745.
Fodor, J.A., Bever, T.G. & Garrett, M.F. (1974). The psychology of language: an introduction to psycholinguistics and generative grammar. New York: McGraw-Hill.
Fiebach, C.J., Schlesewsky, M., & Friederici, A.D. (2002). Separating syntactic memory costs and syntactic integration costs during parsing: The processing of German WH-questions. Journal of Memory and Language, 47, 250-272.
Fodor, J. D. (1978). Parsing strategies and constraints on transformations. Linguistic Inquiry, 9, 427-473.
Ford, M. (1983). A method of obtaining measures of local parsing complexity throughout sentences. Journal of Verbal Learning & Verbal Behavior, 22, 203-18.
Franck, J., Vigliocco, G., & Nicol, J. (2002). Attraction in sentence production: The role of syntactic structure. Language and Cognitive Processes, 17(4), 371-404.
Francis, W. N. (1986). Proximity concord in English. Journal of English Linguistics, 19, 309-17.
Fraser, B. (1971). A note on the spray-paint cases. Linguistic Inquiry, 2, 603-7.
Frazier, L. (1987). Syntactic processing: Evidence from Dutch. Natural Language and Linguistic Theory, 5, 519-560.
Frazier, L., & Flores D'Arcais, G.B. (1989). Filler-driven parsing: a study of gap filling in Dutch. Journal of Memory and Language, 28, 331-44.
Frazier, L., Munn, A., & Clifton, C. (2000). Processing coordinate structures. Journal of Psycholinguistic Research, 29, 343-370.
Freedman, S.E., & Forster, K.I. (1985). The psychological status of overgenerated sentences. Cognition, 19, 101-31.
van Gelderen, E. (1997). Verbal Agreement and the Grammar behind its Breakdown. Tübingen: Max Niemeyer Verlag.
Garnsey, S.M., Tanenhaus, M.K. & Chapman, R.M. (1989). Evoked potentials and the study of sentence comprehension. Journal of Psycholinguistic Research, 18, 51-60.
Garrett, M. (1980). Levels of processing in speech production. In B. Butterworth (Ed.), Language production, Vol. 1: Speech and talk (pp.
177-220). New York, NY: Academic Press.
Goldsmith, J. (1985). A principled exception to the coordinate structure constraint. In Papers from the Twenty-First Annual Regional Meeting of the Chicago Linguistic Society. Chicago: Chicago Linguistic Society.
Gordon, P.C., Hendrick, R., & Johnson, M. (2001). Memory interference during language processing. Journal of Experimental Psychology: Learning, Memory and Cognition, 27, 1411-1423.
Gordon, P.C., Hendrick, R., & Levine, W.H. (2002). Memory-load interference in syntactic processing. Psychological Science, 13, 425-430.
Gordon, P.C., Hendrick, R., & Johnson, M. (2004). Effects of noun phrase type on sentence complexity. Journal of Memory and Language, 51, 97-114.
Gordon, P.C., Hendrick, R., Johnson, M., & Lee, Y. (2006). Similarity-based interference during language comprehension: Evidence from eye tracking during reading. Journal of Experimental Psychology: Learning, Memory and Cognition, 32, 1304-1321.
Gorrell, P. (1993). Evaluating the direct association hypothesis: a reply to Pickering and Barry (1991). Language and Cognitive Processes, 8, 129-46.
Gibson, E. (1998). Syntactic complexity: Locality of syntactic dependencies. Cognition, 68, 1-76.
Gibson, E., & Hickok, G. (1993). Sentence processing with empty categories. Language and Cognitive Processes, 8, 147-161.
Gibson, E., & Warren, T. (2004). Reading time evidence for intermediate linguistic structure in long-distance dependencies. Syntax, 7, 55-78.
Greenberg, J.H. (1966). Universals of Language. Cambridge, MA: MIT Press.
Grossberg, S. (1978). Behavioral contrast in short term memory: Serial binary memory models or parallel continuous memory models? Journal of Mathematical Psychology, 17, 199-219.
Hagoort, P., Brown, C., & Groothusen, J. (1993). The syntactic positive shift (SPS) as an ERP measure of syntactic processing. Language and Cognitive Processes, 8, 439-483.
Harley, H. (1994). Hug a tree: deriving the morphosyntactic feature hierarchy.
MIT Working Papers in Linguistics, 21, 289-320.
Harley, H. (2002). Possession and the double object construction. Linguistic Variation Yearbook, 2, 29-68.
Harrison, A.J. (2004). Three way attraction effects in Slovene. In Proceedings of the Second CamLing Postgraduate Conference on Language Research. Cambridge: University of Cambridge.
Hartsuiker, R. J., Antón-Méndez, I., & Van Zee, M. (2001). Object attraction in subject-verb agreement construction. Journal of Memory and Language, 45, 546-572.
Haussler, J. & Bader, M. (submitted). Agreement checking and number attraction in sentence comprehension: Insights from German relative clauses.
Hintzman, D.L. (1984). MINERVA 2: A simulation model of human memory. Behavior Research Methods, Instruments & Computers, 16, 96-101.
Hintzman, D.L. (1988). Judgments of frequency and recognition memory in a multiple-trace model. Psychological Review, 95, 528-551.
Hintzman, D. L., Block, R. A., & Inskeep, N. R. (1972). Memory for mode of input. Journal of Verbal Learning and Verbal Behavior, 11, 741-749.
Howard, M.W. & Kahana, M.J. (2002). A distributed representation of temporal context. Journal of Mathematical Psychology, 46, 269-299.
Huang, C.T.J. (1982). Move wh in a language without wh movement. The Linguistic Review, 1, 369-416.
Jackendoff, R. S. (1977). X-bar Syntax: A Study of Phrase Structure. Cambridge, MA: MIT Press.
Jespersen, O. (1924). The philosophy of grammar. London: Allen & Unwin.
Kaan, E. (1997). Processing subject-object ambiguities in Dutch. Ph.D. Dissertation, University of Groningen.
Kaan, E. (2002). Investigating the effects of distance and number interference in processing subject-verb dependencies: an ERP study. Journal of Psycholinguistic Research, 31(2), 165-193.
King, J., & Just, M.A. (1991). Individual differences in syntactic processing: the role of working memory. Journal of Memory and Language, 30, 580-602.
Kjellmer, G. (1975). Are Relative Infinitives Modal? Studia Neophilologica, 47(2), 323-32.
Jarvella, R.J.
(1971). Syntactic Processing of Connected Speech. Journal of Verbal Learning and Verbal Behavior, 10, 409-16.
Just, M. A., Carpenter, P. A., & Woolley, J. D. (1982). Paradigms and processes in reading comprehension. Journal of Experimental Psychology: General, 3, 228-238.
Johnson, K. (2001). What VP ellipsis can do, what it can't, but not why. In Baltin, M. & Collins, C. (eds.), The handbook of contemporary syntactic theory, Oxford: Blackwell Publishers, pp. 439-479.
Jonides, J., Lewis, R.L., Nee, D.E., Lustig, C.A., Berman, M.G., & Moore, K.S. (2008). The mind and brain of short-term memory. Annual Review of Psychology, 59, 15.1-15.32.
Kaan, E., Harris, A., Gibson, E., & Holcomb, P. (2000). The P600 as an index of syntactic integration difficulty. Language and Cognitive Processes, 15, 159-201.
Kayne, R. (1984). Connectedness and Binary Branching. Dordrecht, Netherlands: Foris.
Kayne, R. (1994). The antisymmetry of syntax. Cambridge, MA: MIT Press.
Kazanina, N., Lau, E.F., Lieberman, M., Yoshida, M., & Phillips, C. (2007). The effect of syntactic constraints on the processing of backwards anaphora. Journal of Memory and Language, 56, 384-409.
Kennison, S.M. (2003). Comprehending the pronouns her, him, and his: Implications for theories of referential processing. Journal of Memory and Language, 49, 335-352.
Kluender, R. (2005). Are subject islands subject to a processing account? In Chand, V., Kelleher, A., Rodriguez, A.J., & Schmeiser, B. (eds.), Proceedings of the 23rd West Coast Conference on Formal Linguistics. Somerville, MA: Cascadilla Press.
Kluender, R., & Kutas, M. (1993). Subjacency as a processing phenomenon. Language and Cognitive Processes, 8, 573-633.
Kimball, J., & Aissen, J. (1971). I think, you think, he think. Linguistic Inquiry, 2, 241-246.
King, J.W., & Kutas, M. (1995). Who did what and when? Using word- and clause-level ERPs to monitor working memory usage in reading. Journal of Cognitive Neuroscience, 7, 376-95.
Knuth, D. (1965/1997). The Art of Computer Programming.
Volume 1: Fundamental Algorithms. 3rd ed. Reading, MA: Addison-Wesley.
Kurtzman, H.S., & Crawford, L.E. (1991). Processing parasitic gaps. In Sherer, T. (ed.), Proceedings of the 21st Annual Meeting of the North Eastern Linguistics Society. Amherst, MA: GLSA Publications.
Ladusaw, W. (1979). Polarity sensitivity as inherent scope relations. PhD thesis, University of Texas, Austin.
Lakoff, G. (1986). Frame semantic control of the coordinate structure constraint. In Papers from the Twenty-Second Annual Regional Meeting of the Chicago Linguistic Society. Chicago: Chicago Linguistic Society.
Larson, R.K. (1988). On the double object construction. Linguistic Inquiry, 19, 335-391.
Lau, E., Rozanova, K., & Phillips, C. (2007). Syntactic prediction and lexical surface frequency effects in sentence processing. University of Maryland Working Papers in Linguistics, 16, 163-200.
Lau, E., Wagers, M., Stroud, C., & Phillips, C. (2008). Agreement and the subject of confusion. Proceedings of the 21st Annual CUNY Conference on Human Sentence Processing. Chapel Hill, NC: University of North Carolina, Chapel Hill.
Lee, M.-W. (2004). Another look at the role of empty categories in sentence processing (and grammar). Journal of Psycholinguistic Research, 3, 51-73.
Lehtonen, M., Wande, E., Niska, H., Niemi, J. & Laine, M. (2006). Recognition of inflected words in a morphologically limited language: frequency effects in monolinguals and bilinguals. Journal of Psycholinguistic Research, 35(2), 121-146.
Levin, B. (1993). English verb classes and alternations: a preliminary investigation. Chicago: University of Chicago Press.
Lewis, R.L. (1996). Interference in short-term memory: the magical number two (or three) in sentence processing. Journal of Psycholinguistic Research, 25, 193-215.
Lewis, R., & Vasishth, S. (2005). An activation-based model of sentence processing as skilled memory retrieval. Cognitive Science, 29, 375-419.
Marcus, M.P. (1980). Theory of Syntactic Recognition for Natural Languages.
Cambridge, MA: MIT Press.
Martin, A.E., & McElree, B. (2008). A content-addressable pointer mechanism underlies comprehension of verb phrase ellipsis. Journal of Memory and Language, 58, 879-906.
McElree, B. (1996). Accessing short-term memory with semantic and phonological information: A time-course analysis. Memory & Cognition, 24, 173-187.
McElree, B. (1998). Attended and non-attended states in working memory: Accessing categorized structures. Journal of Memory and Language, 38, 225-252.
McElree, B. (2000). Sentence comprehension is mediated by content-addressable memory. Journal of Psycholinguistic Research, 29(2), 111-123.
McElree, B. (2001). Working memory and focal attention. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 817-35.
McElree, B. (2006). Accessing recent events. In B. H. Ross (Ed.), The psychology of learning and motivation, vol. 46. San Diego: Academic Press.
McElree, B., & Bever, T. (1989). The psychological reality of linguistically defined gaps. Journal of Psycholinguistic Research, 18, 21-35.
McElree, B., & Dosher, B. A. (1989). Serial position and set size in short-term memory: Time course of recognition. Journal of Experimental Psychology: General, 118, 346-373.
McElree, B., Foraker, S., & Dyer, L. (2003). Memory structures that subserve sentence comprehension. Journal of Memory and Language, 48(1), 67-91.
McElree, B., & Griffith, T. (1995). Syntactic and thematic processing in sentence comprehension: evidence for a temporal dissociation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 134-157.
McGeoch, J.A. (1932). Forgetting and the law of disuse. Psychological Review, 39, 352-70.
McKinnon, R., & Osterhout, L. (1996). Constraints on movement phenomena in sentence processing: Evidence from event-related potentials. Language and Cognition, 11, 495-523.
McKoon, G., & Ratcliff, R. (1994). Sentential context and on-line lexical decision tasks. Journal of Experimental Psychology: Learning, Memory and Cognition, 20, 1239-43.
Miller, G.A., & Chomsky, N.
(1963). Finitary models of language users. In Luce, R.D., Bush, R.R., & Galanter, E. (eds.), Handbook of Mathematical Psychology, Volume II. New York: John Wiley.
Murdock, B.B. (1982). A theory for the storage and retrieval of item and associative information. Psychological Review, 89, 609-626.
Murdock, B.B. (1983). A distributed memory model for serial-order information. Psychological Review, 90, 316-338.
Murdock, B.B., & Walker, K.D. (1969). Modality effects in free recall. Journal of Verbal Learning and Verbal Behavior, 8, 665-676.
Nairne, J. S. (1990). A feature model of immediate memory. Memory & Cognition, 18, 251-269.
Nairne, J. S. (2002). Remembering over the short-term: The case against the standard model. Annual Review of Psychology, 53, 53-81.
Neville, H.J., Nicol, J., Barss, A., Forster, K., & Garrett, M. (1991). Syntactically-based sentence processing classes: evidence from event-related potentials. Journal of Cognitive Neuroscience, 3, 151-165.
New, B., Brysbaert, M., Segui, J., Ferrand, L., & Rastle, K. (2004). The processing of singular and plural nouns in French and English. Journal of Memory and Language, 51, 568-585.
Nicol, J.L., Fodor, J.D., & Swinney, D. (1994). Using cross-modal lexical decision tasks to investigate sentence processing. Journal of Experimental Psychology: Learning, Memory & Cognition, 20, 1229-38.
Nicol, J., Forster, K., & Veres, C. (1997). Subject-verb agreement processes in comprehension. Journal of Memory and Language, 36, 569-587.
Nicol, J., & Swinney, D. (1989). The role of structure in coreference assignment during sentence comprehension. Journal of Psycholinguistic Research, 18, 5-19.
Nilsson, L.-G. (1974). Further evidence for organization by modality in free recall. Journal of Experimental Psychology, 103, 948-957.
Niswander, E., Pollatsek, A., & Rayner, K. (2000). The processing of derived and inflected suffixed words during reading. Language and Cognitive Processes, 15(4/5), 389-420.
Osterhout, L., & Mobley, L.A. (1995).
Event-related brain potentials elicited by failure to agree. Journal of Memory and Language, 31, 785-806.
Osterhout, L., Bersick, M., & McLaughlin, J. (1997). Brain potentials reflect violations of gender stereotypes. Memory and Cognition, 25, 273-285.
Öztekin, I., & McElree, B. (2007). Proactive interference slows recognition by eliminating fast assessments of familiarity. Journal of Memory and Language, 57, 126-149.
Page, M.P.A., & Norris, D. (1998). The primacy model: A new model of immediate serial recall. Psychological Review, 105(4), 761-781.
Pearlmutter, N. (2000). Linear versus hierarchical agreement feature processing in comprehension. Journal of Psycholinguistic Research, 29, 89-98.
Pearlmutter, N. J., Garnsey, S. M., & Bock, K. (1999). Agreement processes in sentence comprehension. Journal of Memory and Language, 41, 427-456.
Pearlmutter, N.J., & Mendelsohn, A.A. (2000). Serial versus parallel sentence comprehension. Ms.
Pesetsky, D.M. (1995). Zero Syntax: Experiencers and Cascades. Cambridge, MA: MIT Press.
Phillips, C. (2006). The real-time status of island phenomena. Language, 82, 795-823.
Phillips, C., Kazanina, N., & Abada, S.H. (2005). ERP effects of the processing of syntactic long-distance dependencies. Cognitive Brain Research, 22, 407-428.
Phillips, C. & Wagers, M.W. (2007). Relating Structure and Time in Linguistics and Psycholinguistics. In Gaskell, G. (ed.), Oxford Handbook of Psycholinguistics, Oxford: Oxford University Press.
Pickering, M.J., Barton, S., & Shillcock, R. (1994). Unbounded dependencies, island constraints and processing complexity. In Clifton, C., Frazier, L., & Rayner, K. (eds.), Perspectives on Sentence Processing. London: Lawrence Erlbaum.
Pinker, S. (1989). Learnability and cognition. Cambridge, MA: MIT Press.
Postal, P. (1998). Three Investigations of Extraction. Cambridge, MA: MIT Press.
Potter, M.C. (1988). Rapid serial visual presentation (RSVP): a method for studying language processing. In Kieras, D.E. & Just, M.A. (eds.),
New methods in reading comprehension research. Hillsdale, NJ: Erlbaum Press.
Pickering, M.J., & Barry, G.D. (1991). Sentence processing without empty categories. Language and Cognitive Processes, 6, 229-259.
Plate, T.A. (1994). Distributed representations and nested compositional structure. PhD Thesis, Department of Computer Science, University of Toronto.
Pollard, C. & Sag, I. (1994). Head-driven Phrase Structure Grammar. Chicago: CSLI Publications.
Potter, M.C., & Lombardi, L. (1992). The regeneration of syntax in short-term memory. Journal of Memory and Language, 31, 713-33.
Potter, M.C., & Lombardi, L. (1998). Syntactic priming in immediate recall of sentences. Journal of Memory and Language, 38, 265-82.
Pritchett, B.L. (1992). Grammatical Competence and Parsing Performance. Chicago: University of Chicago Press.
Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1972). A comprehensive grammar of the English language. London: Oxford UP.
Raaijmakers, J. G. W., & Shiffrin, R. M. (1981). Search of associative memory. Psychological Review, 88, 93-134.
Raaijmakers, J. G. W., Schrijnemakers, J. M. C., & Gremmen, F. (1999). How to deal with "The language-as-fixed-effect fallacy": Common misconceptions and alternative solutions. Journal of Memory and Language, 41, 416-426.
Rappaport, M., & Levin, B. (1986). What to do with Theta Roles. Lexicon Project Working Papers 11, Cambridge, MA: Center for Cognitive Science, MIT.
Ratcliff, R. (1993). Methods for dealing with reaction time outliers. Psychological Bulletin, 114, 510-532.
R Development Core Team (2007). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org.
Radó, J. (1999). Discourse effects in gap-filling. Proceedings of the Architecture and Mechanisms for Language Processing Conference 1999. Edinburgh: University of Edinburgh.
De Roeck, A.N., Johnson, R.L., King, M., Rosner, M., Sampson, G., & Varile, N. (1982). A myth about centre-embedding.
Lingua, 58, 327-40.
Rouveret, A., & Vergnaud, J.-R. (1980). Specifying Reference to the Subject: French Causatives and Conditions on Representations. Linguistic Inquiry, 11, 97-202.
Ross, J.R. (1967). Constraints on variables in syntax. Ph.D. Dissertation, MIT.
Ruchkin, D.S., Johnson, R., Jr., Canoune, H., & Ritter, W. (1990). Short-term memory storage and retention: An event-related brain potential study. Electroencephalography and Clinical Neurophysiology, 76, 419-39.
Sedivy, J.C., Tanenhaus, M.K., Chambers, C.G., & Carlson, G.N. (1999). Achieving incremental semantic interpretation through contextual representation. Cognition, 71, 109-47.
Sekerina, I.A. (2003). Scrambling and processing: dependencies, complexity, and constraints. In Karimi, S. (ed.), Word order and scrambling. Malden, MA: Blackwell.
Selman, B., & Hirst, G. (1985). A rule-based connectionist parsing system. In Proceedings of the 7th Annual Conference of the Cognitive Science Society, Hillsdale, NJ: Erlbaum. pp. 212-221.
Schlesewsky, M., Fanselow, G., Kliegl, R., & Krems, J. (2000). The subject preference in the processing of locally ambiguous wh-questions in German. In Hemforth, B., & Konieczny, L. (eds.), German sentence processing. Dordrecht: Kluwer Academic Publishers.
Shallice, T., & Vallar, G. (1990). Neuropsychological impairments of short-term memory. New York: Cambridge University Press.
Shieber, S. M. (1986). An introduction to unification-based approaches to grammar. CSLI Lecture Notes No. 4. Stanford: CSLI.
Shiffrin, R. M., Murnane, K., Gronlund, S., & Roth, M. (1989). On units of storage and retrieval. In Izawa, C. (Ed.), Current Issues in Cognitive Processes: The Tulane Flowerree Symposium on Cognition. Hillsdale, NJ: Erlbaum.
Shiffrin, R. M., & Steyvers, M. (1997). A model for recognition memory: REM: Retrieving effectively from memory. Psychonomic Bulletin and Review, 4(2), 145-166.
Solomon, E., & Pearlmutter, N. (2004). Semantic integration and syntactic planning in language production.
Cognitive Psychology, 49, 1-46.
Stevenson, S. (1994). Competition and Recency in a Hybrid Network Model of Syntactic Disambiguation. Journal of Psycholinguistic Research, 23, 295-322.
Strang, B. (1966). Some features of S-V concord in present day English. In I. Cellini & G. Melchiori (Eds.), English Studies Today: Fourth Series (pp. 73-87). Rome: Edizioni di Storia e Letteratura.
Stowe, L. (1986). Parsing wh-constructions: evidence for on-line gap location. Language and Cognitive Processes, 1, 227-46.
Sturt, P. (2003). The time-course of the application of binding constraints in reference resolution. Journal of Memory and Language, 48(3), 542-562.
Sussman, R., & Sedivy, J. (2003). The time course of processing syntactic dependencies. Language and Cognitive Processes, 18, 143-63.
Szabolcsi, A., & den Dikken, M. (2002). Islands. In Cheng, L., & Sybesma, R. (eds.), The Second State-of-the-Article Book. Mouton de Gruyter.
Tabor, W., Galantucci, T., & Richardson, D. (2004). Effects of merely local syntactic coherence on sentence processing. Journal of Memory and Language, 50, 355-370.
Tanenhaus, M.K., Stowe, L.A., & Carlson, G. (1985). The interaction of lexical expectation and pragmatics in parsing filler-gap constructions. Proceedings of the Seventh Annual Cognitive Science Society Conference, Irvine, CA.
Traxler, M.J., & Pickering, M.J. (1996). Plausibility and the processing of unbounded dependencies. Journal of Memory and Language, 35, 454-475.
Trollope, A. (1883/1999). An Autobiography. New York: Oxford University Press.
Tulving, E. (1983). Elements of episodic memory. Oxford: Clarendon Press.
Uriagereka, J. (1999). Multiple Spell-out. In Epstein, S.D. & Hornstein, N. (eds.), Working Minimalism. Cambridge, MA: MIT Press.
Usher, M., & Cohen, J.D. (1999). Short Term Memory and Selection Processes in a Frontal-Lobe Model. In Connectionist Models in Cognitive Neuroscience, London: Springer-Verlag.
Van Dyke, J. (2007).
Interference Effects From Grammatically Unavailable Constituents During Sentence Processing. Journal of Experimental Psychology: Learning, Memory and Cognition, 33, 407-430.
Van Dyke, J. A., & Lewis, R. L. (2003). Distinguishing effects of structure and decay on attachment and repair: A retrieval interference theory of recovery from misanalyzed ambiguities. Journal of Memory and Language, 49(3), 285-316.
Van Dyke, J.A., & McElree, B. (2006). Retrieval interference in sentence comprehension. Journal of Memory and Language, 55, 157-166.
Van Gompel, R.P.G., & Liversedge, S.P. (2003). The influence of morphological information on cataphoric pronoun assignment. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 128-139.
Vasishth, S. (2006). On the proper treatment of spillover in real-time reading studies: Consequences for psycholinguistic theories. Talk presented at the International Conference on Linguistic Evidence, Tübingen, Germany.
Vasishth, S., Drenhaus, H., Saddy, D., & Lewis, R. (2005). Processing negative polarity. Talk presented at the CUNY Sentence Processing Conference, Arizona, USA.
van der Velde, F., & de Kamps, M. (2006). Neural blackboard architectures of combinatorial structures in cognition. Behavioral and Brain Sciences, 29, 37-70.
Verhaeghen, P., Cerella, J., & Basak, C. (2004). A working memory workout: how to expand the focus of serial attention from one to four items in 10 hours or less. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 1322-37.
Vigliocco, G., Butterworth, B., & Semenza, C. (1995). Constructing subject-verb agreement in speech: The role of semantic and morphological factors. Journal of Memory and Language, 34, 186-215.
Vigliocco, G., Butterworth, B., & Garrett, M. (1996). Subject-verb agreement in Spanish and English: Differences in the role of conceptual constraints. Cognition, 61, 261-298.
Vigliocco, G., Hartsuiker, R., Jarema, G., & Kolk, H. (1996). One or more labels on the bottles? Notional concord in Dutch and French.
Language and Cognitive Processes, 11, 407-442.
Vigliocco, G., & Nicol, J. (1998). Separating hierarchical relations and word order in language production. Is proximity concord syntactic or linear? Cognition, 68, 13-29.
de Vincenzi, M. (1991). Syntactic Parsing Strategies in Italian. Dordrecht: Kluwer Academic Publishers.
Wagers, M.W. (2006). (Re)active dependency formation. College Park, MD: University of Maryland Department of Linguistics. Ms.
Wagers, M.W., Lau, E., & Phillips, C. (2008; submitted). Agreement attraction in comprehension: representations and processes. Ms.
Wagers, M.W., & Phillips, C. (2006). Re-active filling. Proceedings of the 19th Annual CUNY Conference on Human Sentence Processing. New York: City University of New York Graduate Center.
Wagers, M.W., & Phillips, C. (2008; submitted). Multiple dependencies and the role of the grammar in real-time comprehension. Ms.
Wanner, E., & Maratsos, M. (1978). An ATN Approach to Comprehension. In Halle, M., Bresnan, J., & Miller, G.A. (eds.), Linguistic Theory and Psychological Reality, Cambridge, MA: MIT Press.
Weinberg, A. (1992). Parameters in the theory of sentence processing: minimal commitment theory goes east. Journal of Psycholinguistic Research, 3, 339-64.
Wickelgren, W. A., & Corbett, A. T. (1977). Associative interference and retrieval dynamics in yes-no recall and recognition. Journal of Experimental Psychology: Human Learning & Memory, 3(2), 189-202.
Wickelgren, W. A., Corbett, A. T., & Dosher, B. A. (1980). Priming and retrieval from short-term memory: A speed accuracy trade-off analysis. Journal of Verbal Learning and Verbal Behavior, 19(4), 387-404.
Woods, W. (1973). An experimental parsing system for transition network grammars. In Rustin, R. (ed.), Natural Language Processing, pp. 111-154. New York: Algorithmics Press.
Xiang, M., Dillon, B. & Phillips, C. (2008; submitted). Illusory licensing across dependency types: ERP evidence.