ABSTRACT

Title of Document: AGE DIFFERENCES AND COGNITIVE APTITUDES FOR IMPLICIT AND EXPLICIT LEARNING IN ULTIMATE SECOND LANGUAGE ATTAINMENT

Gisela Granena, Ph.D., 2012

Directed By: Dr. Michael H. Long, Second Language Acquisition

Very high-level, functional ability in foreign languages is increasingly important in many walks of life. It is also very rare, and likely requires an early start and/or a special aptitude. This study investigated the extent to which aptitude for explicit learning, defined as "analytic ability," and aptitude for implicit learning, defined as "sequence learning ability," are differentially important for long-term L2 achievement in an immersion setting. A group of 20 native speaker (NS) controls and 100 Chinese-Spanish bilinguals with ages of onset 3-6 (n = 50) and > 16 (n = 50) participated in the study. Early L2 learners use the same language learning mechanisms as NSs (but still differ in ultimate success), whereas late L2 learners have been claimed to be fundamentally different from NSs in terms of learning mechanisms (and also differ in ultimate success). A set of six L2 attainment measures reflecting a continuum from automatic to controlled use of language knowledge was administered, as well as a battery of six cognitive tests (four language aptitude subtests, a general intelligence test, and a probabilistic serial reaction time task). Results confirmed the predicted distribution of cognitive abilities into two main types of aptitudes, interpreted as implicit and explicit. Participants could be high in one, high in both, or low in both. Results further revealed that early and late L2 learners with high aptitude for explicit learning outperformed individuals with low aptitude on tasks that allow controlled use of language knowledge. On these tasks, aptitude for implicit learning also had an effect, but among early L2 learners only.
In addition, early and late L2 learners with high aptitude for implicit learning showed greater sensitivity towards agreement violations on the language task at the most implicit end of the continuum. Finally, general intelligence only played a role in late L2 learners' attainment on tasks that allow controlled use of knowledge. The study concluded that 1) cognitive aptitudes play a role in both early and late L2 learners, 2) different types of cognitive aptitudes have differential effects on L2 outcomes, and 3) individual differences in implicit learning ability are related to L2 attainment in adults.

AGE DIFFERENCES AND COGNITIVE APTITUDES FOR IMPLICIT AND EXPLICIT LEARNING IN ULTIMATE SECOND LANGUAGE ATTAINMENT

By Gisela Granena

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park, in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2012

Advisory Committee:
Professor Michael H. Long, Chair
Professor Robert M. DeKeyser
Professor Catherine J. Doughty
Professor Steven J. Ross
Professor Jeff MacSwan

© Copyright by Gisela Granena 2012

Acknowledgements

Many people made this dissertation possible, as well as a very enjoyable experience. My circumstances have been truly privileged and I would like to thank all of those who provided me with the necessary, and far more than sufficient, support along the way.

I thank the Spanish Fulbright Commission for bringing me to the U.S. in the first place. They started paving the way for this dissertation by awarding me a grant to study a Master's abroad. I also thank the National Science Foundation for a Doctoral Dissertation Improvement Grant (BCS-1124126) and Language Learning for a dissertation grant. They made it possible that this dissertation could eventually come to fruition by providing me with the necessary financial support.
I would like to express my most sincere thanks to Mike Long, my advisor, for bringing me to Maryland, for his investment in me over the last four years, and for his generosity, fairness, guidance, sense of humor, and support throughout. Without him, this dissertation would not have been produced so smoothly. Thank you for teaching me more than you can ever imagine.

My heartfelt thanks also go to my mentors at the University of Barcelona, Mª Luz Celaya, Carmen Muñoz, and Elsa Tragant, for teaching me how to conduct good research and for all their encouragement and understanding throughout these years.

I am very grateful to my doctoral committee for their invaluable contribution: Robert DeKeyser, whose work inspired most of this dissertation, Cathy Doughty, for her suggestion to include a general intelligence measure and for helping me structure my ideas, and Steve Ross and Jeff MacSwan, for their feedback and willingness to work with me.

I would especially like to thank Manne Bylund for all the good times and for his timely support during data collection, Niclas Abrahamsson for supervising my work, Jared Linck for his suggestion of including a probabilistic version of the serial reaction time task, and Luis Jiménez (Universidade de Santiago de Compostela) for his generous help showing me how to create probabilistic sequences.

There are many people who helped me during the intense data-collection period in Madrid. A special thanks to Mª Luisa García Bermejo for making my time in Madrid so enjoyable, to the Education Department at the Universidad Complutense for letting me use their facilities, and to the Chinese community in Madrid for making my participant recruitment and data collection so easy and such fun.
Thank you to all my colleagues and friends at the University of Barcelona, and very especially, to Imma Miralpeix and Natalia Fullana, for all the time we shared as research assistants on the BAF Project, for your friendship, and for always providing me with help when I needed it.

The biggest thanks of all go to my family, to Santi, my aunt Paz, and, very especially, to my dad, for always encouraging me to study abroad, see the world, and think critically. And finally, to Yucel. Thanks for always being there, despite the distance, and for all the inspiring discussions on SLA that we have had over the years and that started one day in Philadelphia. This dissertation is yours as much as mine.

Thank you all.

Table of Contents

Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1: Review of the Literature
1.1 The Concept of Language Aptitude
1.2 Language Aptitude and Instructed SLA
1.3 Language Aptitude and Naturalistic SLA
1.4 Language Aptitude and Controlled vs. Automatic Use of L2 Knowledge
1.5 Cognitive Aptitudes and Language Learning
1.6 Conclusion
Chapter 2: Purpose of the Study
Chapter 3: Research Questions and Hypotheses
Chapter 4: Methodology
4.1 Participants
4.2 Design of the Study
4.3 Instruments
4.3.1 Language Tests that Require Automatic Use of L2 Knowledge
4.3.2 Language Tests that Allow Controlled Use of L2 Knowledge
4.3.3 Explicit Language Aptitude Tests
4.3.4 Implicit Language Aptitude Tests
4.3.5 General Intelligence Test
4.4 Procedure
4.5 Target Structures
Chapter 5: Results
5.1 Cognitive Aptitudes
5.1.1 The LLAMA Test
5.1.2 The GAMA Test
5.1.3 Probabilistic Serial Reaction Time (SRT) Task
5.1.4 Cognitive Aptitudes for Implicit and Explicit Learning
5.2 Language Attainment
5.2.1 Grammaticality Judgment Tests
5.2.2 Metalinguistic Knowledge Test
5.2.3 Word monitoring Task
5.2.4 Summary of Language Attainment
5.3 Cognitive Aptitudes and Language Attainment
5.3.1 Aptitude for Explicit Learning and Language Attainment
5.3.2 General Intelligence and Language Attainment
5.3.3 Aptitude for Implicit Learning and Language Attainment
5.3.4 Summary of Results: Cognitive Aptitudes and Language Attainment
Chapter 6: Discussion and Conclusions
6.1 Cognitive Aptitudes
6.2 Language Attainment
6.3 Cognitive Aptitudes and Language Attainment
6.4 Summary of Research Findings
6.5 Conclusions and Directions for Further Research
Appendix A
Appendix B
Bibliography

List of Tables

Table 1. Predictions Concerning the Relationship between Cognitive Aptitudes, General Intelligence, and Ultimate L2 Attainment
Table 2. Participants' Information
Table 3. Balanced Latin Square Design
Table 4. Probabilities of Probable and Non-probable Trials in SRT Task
Table 5. Target structures
Table 6. Descriptives of the LLAMA Language Aptitude Test
Table 7. Descriptives of the GAMA General Intelligence Test
Table 8. Descriptives of the Probabilistic SRT Task
Table 9. Mean Confidence Ratings for Old and New Triads
Table 10. Mean Reaction Times for Old and New Triads
Table 11. Cognitive Aptitudes
Table 12. High-, Mid-, and Low-explicit Language Aptitude Groups (z-scores)
Table 13. High-, Mid-, and Low-Implicit Language Aptitude Groups (z-scores)
Table 14. Group Mean Percentage Scores on Timed and Untimed Visual GJTs
Table 15. Group Mean Percentage Scores on Timed and Untimed Auditory GJTs
Table 16. Group Mean Percentage Scores on Timed and Untimed Visual GJTs (Ungrammatical Items)
Table 17. Group Mean Percentage Scores on Timed and Untimed Auditory GJTs (Ungrammatical Items)
Table 18. Group Mean Percentage Scores on the Metalinguistic Knowledge Test
Table 19. Group Mean Percentage Scores on the Metalinguistic Test (Ungrammatical Items)
Table 20. Word Monitoring Mean Latencies
Table 21. Grammatical Sensitivity Index (GSI)
Table 22. Word monitoring Mean Latencies (Agreement Structures)
Table 23. Grammatical Sensitivity Index Agreement Structures
Table 24. Word monitoring Mean Latencies (Non-Agreement Structures)
Table 25. Grammatical Sensitivity Index (Non-Agreement Structures)
Table 26. Correlation Matrix for the Six Language Measures (L2 Learners)
Table 27. Summary of Overall Test Scores by Participants with High and Low Aptitude for Explicit Learning
Table 28. Summary of Overall Test Scores on the Untimed Visual GJT and Metalinguistic Test by High- and Low-Intelligence Participants
Table 29. Summary of Test Scores on the Untimed Visual GJT (Non-agreement Items) by High- and Low-Intelligence Participants
Table 30. Summary of GSIs for Agreement Structures on the Word monitoring Task by High and Low Implicit Aptitude Participants
Table 31. Average Percentage Scores on Agreement Items in the Late AO Group
Table 32. Summary of Relationships between Types of Aptitude, General Intelligence, and L2 Attainment
Table 33. Summary of the Study Predictions and Findings

List of Figures

Figure 1. Representation of visual cues and required key-presses in SRT task
Figure 2. Representation of the two sequences used to generate training trials (A) and control (B) trials
Figure 3. Sample matching item: Which answer is the same as the first picture?
Figure 4. Sample analogies item: Which answer goes on the question mark?
Figure 5. Sample sequences item: Which answer goes on the question mark to complete the pattern?
Figure 6. Sample construction item: Which answer can be made with the shapes in the top box?
Figure 7. SRT learning performance
Figure 8. Group mean percentage GJT scores
Figure 9. Group mean percentage GJT scores (ungrammatical items)
Figure 10. Modality x Group interaction
Figure 11. Time x Group interaction
Figure 12. Group mean percentage scores on the metalinguistic knowledge test
Figure 13. Group mean percentage scores on the metalinguistic knowledge test (correction of ungrammatical items)
Figure 14. Group mean percentage scores on the metalinguistic knowledge test (explanation of ungrammatical items)
Figure 15. Group mean percentage scores on the four GJTs and the metalinguistic knowledge test
Figure 16. Distribution of overall word monitoring latencies in the early AO group
Figure 17. Distribution of overall word monitoring latencies in the late AO group
Figure 18. Group word monitoring latencies for grammatical and ungrammatical items
Figure 19. Distribution of word monitoring latencies for agreement items in the early AO group
Figure 20. Distribution of word monitoring latencies for non-agreement items in the early AO group
Figure 21. Distribution of word monitoring latencies for agreement items in the late AO group
Figure 22. Distribution of word monitoring latencies for non-agreement items in the late AO group
Figure 23. Group word monitoring latencies for grammatical and ungrammatical items testing agreement structures (gender, person, and number agreement)
Figure 24. Group word monitoring latencies for grammatical and ungrammatical items testing non-agreement structures (aspect, the subjunctive, and the passive)
Figure 25. Metalinguistic knowledge test scores as a function of AO with the explicit language aptitude dimension added
Figure 26. Untimed visual GJT test scores as a function of AO with the explicit language aptitude dimension added
Figure 27. Untimed auditory GJT test scores as a function of AO with the explicit language aptitude dimension added
Figure 28. Regression of untimed visual GJT scores on aptitude for explicit learning composite scores at each group level
Figure 29. Regression of metalinguistic test scores on aptitude for explicit learning composite scores at each group level
Figure 30. Regression of untimed auditory GJT scores on aptitude for explicit learning composite scores at each group level
Figure 31. Regression of metalinguistic test scores on ungrammatical items on aptitude for explicit learning composite scores at each group level
Figure 32. Timed visual GJT scores as a function of AO with the explicit language aptitude dimension added
Figure 33. Timed auditory GJT scores as a function of AO with the explicit language aptitude dimension added
Figure 34. Word monitoring task scores (GSI) as a function of AO with the explicit language aptitude dimension added
Figure 35. Metalinguistic knowledge test scores as a function of AO with the general intelligence dimension added
Figure 36. Untimed visual GJT scores as a function of AO with the general intelligence dimension added
Figure 37. Untimed auditory GJT scores as a function of AO with the general intelligence dimension added
Figure 38. Regression of untimed visual GJT scores on general intelligence scores at each group level
Figure 39. Regression of metalinguistic test scores on general intelligence scores at each group level
Figure 40. Regression of untimed auditory GJT scores on general intelligence scores at each group level
Figure 41. Regression of untimed visual GJT scores for non-agreement items on general intelligence scores at each group level
Figure 42. Timed visual GJT scores as a function of AO with the general intelligence dimension added
Figure 43. Timed auditory GJT scores as a function of AO with the general intelligence dimension added
Figure 44. Word monitoring task scores (GSI) as a function of AO with the general intelligence dimension added
Figure 45. Metalinguistic knowledge test scores as a function of AO with the implicit language aptitude dimension added
Figure 46. Untimed visual GJT scores as a function of AO with the implicit language aptitude dimension added
Figure 47. Untimed auditory GJT scores as a function of AO with the implicit language aptitude dimension added
Figure 48. Two-way interaction between group and aptitude for implicit learning in the untimed auditory GJT (agreement structures)
Figure 49. Two-way interaction between group and aptitude for implicit learning in the metalinguistic knowledge test (agreement structures)
Figure 50. Timed visual GJT scores as a function of AO with the implicit language aptitude dimension added
Figure 51. Timed auditory GJT scores as a function of AO with the implicit language aptitude dimension added
Figure 52. Word monitoring task scores (GSI) as a function of AO with the implicit language aptitude dimension added
Figure 53. Regression of the grammatical sensitivity index for agreement items on aptitude for implicit learning at each group level

Chapter 1: Review of the Literature

1.1 The Concept of Language Aptitude

Language aptitude is conceptualized as a combination of cognitive and perceptual abilities that are advantageous in second language acquisition (SLA) (Carroll, 1981; Doughty et al., 2007). Carroll (1993) referred to this combination of abilities as "aptitudes" (p. 675) and claimed that they were partly innate, fairly stable and relatively enduring traits. Although experts and laypeople alike would agree on the generic notion of aptitude as a special talent for language, the theoretical construct behind this popular notion has remained somewhat elusive in the SLA field. While there is agreement that language aptitude involves different cognitive abilities, it has been conceptualized in a variety of ways in SLA, each of them with different implications at the measurement level.

One line of research has linked the ability to learn a second language (L2) to first-language (L1) learning skills (Sparks, 1995; Sparks & Ganschow, 1991; Sparks et al., 1995). According to this theory, successful L2 learners have significantly stronger L1 literacy skills (e.g., phonological/orthographic processing, word recognition/decoding). Carroll (1973) also speculated that aptitude could be a residue of L1 learning ability and that rate of L1 acquisition was related to aptitude for L2 learning.
Skehan (1990) provided some evidence in this respect in a follow-up study to the Bristol Language Project (Wells, 1985), where he found significant correlations between aptitude measures and L1 indices based on spontaneous speech samples (syntax and complexity of language use).

Other conceptualizations of aptitude consider working memory capacity, responsible for the simultaneous processing and storage of information, as a central component of the construct (Miyake & Friedman, 1998; Sawyer & Ranta, 2001). Although one of the earliest aptitude test batteries, the Modern Language Aptitude Test (MLAT) (Carroll & Sapon, 1959), included memory measures, these were based on theories of memory that preceded these later developments in cognitive psychology, which operationalized working memory as attentional capacity and control. Evidence for this position comes from studies such as Harrington and Sawyer (1992) and Robinson (2002), which have reported correlations between working memory and L2 performance.

Finally, more recent conceptions view language aptitude in a situated manner and distinguish different clusters of aptitude subcomponents according to relevant factors, such as type of learning task and acquisition stage (Robinson, 2001, 2002; Skehan, 1998, 2002). This conceptualization of aptitude results in individually unique language aptitude profiles. L2 learners may have high ability in one aptitude component or complex, but low ability in others. This characterization of aptitude is in line with recent research on aptitude-treatment interaction (ATI), since different ability profiles (e.g., strong memory but weak analytic skills) can be investigated in relation to a given type of instructional treatment or level of L2 attainment.

1.2 Language Aptitude and Instructed SLA

In instructed SLA contexts, aptitude is considered a good predictor of rate, or speed, of L2 learning under intensive conditions (Carroll, 1973).
All other things being equal, a high-aptitude L2 learner will be faster to learn and enjoy higher overall foreign language achievement under a variety of instructional approaches. Evidence for this claim can be found in both non-experimental and experimental research. In a survey study, Ehrman and Oxford (1995) reported a strong correlation between aptitude measures and overall learning success, despite the communicative changes in teaching methodology taking place at the time. Harley and Hart (1997) also found that aptitude was related to performance on a variety of L2 measures in an immersion learning program.

Research in the laboratory has yielded similar findings, suggesting that aptitude positively affects language learning under a variety of conditions of exposure. De Graaff (1997), Robinson (1997), and Williams (1999) all showed that differences in aptitude, as measured by subtests of the MLAT, resulted in learning differences in implicit and explicit learning conditions. Only learning under the incidental, meaning-focused exposure condition in Robinson (1997) was found to be independent of language aptitude, a result that was replicated by Robinson (2002).

In addition to research showing that aptitude predicts L2 learning in general, and in line with Skehan's (1998) and Robinson's (2002) notion of aptitude profiles, there is some evidence of ATI revealing that different aptitude components may play different roles, depending on instructional treatment (e.g., Erlam, 2005; Sheen, 2007; Wesche, 1981). Sheen (2007), for example, showed that aptitude, operationalized as analytic ability, was more strongly related to achievement with metalinguistic than direct written feedback.
An important implication of ATI findings such as Sheen's (2007) is that the predictive power of aptitude in SLA may have to be qualified and investigated in relation to a variety of factors (e.g., instructional treatments, acquisition stages, aspects of language, and L2 learning environments) in order to fully understand under what type of conditions aptitude is a relevant construct.

1.3 Language Aptitude and Naturalistic SLA

While aptitude determines learning rate in instructed settings (as reflected in short-term differences in language use or performance¹), the general claim in naturalistic SLA has been that aptitude is related to variation in ultimate level of attainment (i.e., long-term differences in acquisition). Skehan (1989), in fact, argued that aptitude could be even more relevant in naturalistic than in instructed learning contexts because of the greater amount of input that the learner has to process and the pressure to discover regularities and make generalizations merely from L2 exposure. However, to date, research on language aptitude has focused primarily on instructed SLA and learning rate and very rarely on naturalistic SLA and ultimate attainment. Given that there is no empirical evidence concerning the extent to which findings can be generalized from one context to the other, both types of research are needed in order to assess fully the predictive power of aptitude in SLA.

¹ Long (2005: 291) makes a distinction between short-term differences in performance and long-term differences in capacity for acquisition. He argues that these concepts are often confused in studies of rate and ultimate attainment, and that long-term differences in capacity for acquisition are more important for a theory of SLA.

To date, the few studies that have investigated aptitude in a naturalistic environment have been conducted in the context of tests of the existence of a critical period for language acquisition (Abrahamsson & Hyltenstam, 2008; DeKeyser, 2000; DeKeyser et al., 2010; Granena & Long, 2010; Harley & Hart, 2002²). According to the Critical Period Hypothesis (CPH) (Lenneberg, 1967), there are biological/maturational constraints on L2 learning such that post-critical-period L2 learners cannot become nativelike L2 speakers. Although no adult L2 learner has yet been shown to be entirely nativelike across language domains and tasks in a methodologically robust study (see Long, 2005 for a review), there is evidence that some adult learners can become "near-nativelike" (Hyltenstam & Abrahamsson, 2003) on a variety of linguistic phenomena (e.g., Abrahamsson & Hyltenstam, 2009; Ioup et al., 1994; Novoa et al., 1988).

² A study that is often cited as evidence of the relationship between aptitude and L2 learning in an informal setting is Reves (1983) (see Skehan, 1998; Abrahamsson & Hyltenstam, 2008). This study is an unpublished Ph.D. dissertation from the Hebrew University of Jerusalem in Israel. The study is not available online and has never been published in refereed or non-refereed journals. In addition, Sawyer and Ranta (2001) point out a methodological problem with the study, mainly that the learners were not acquiring Hebrew purely through naturalistic exposure, but had been simultaneously receiving instruction for 6-7 years.

One of the factors considered capable of compensating for maturational constraints and explaining variability in ultimate L2 attainment is language aptitude. DeKeyser (2000) hypothesized that a high degree of language aptitude is a necessary condition for adult L2 learners to reach a level of ultimate attainment in morphosyntax comparable to that of child L2 learners, who attain nativelike command regardless of their language aptitude. DeKeyser (2000) operationalized language aptitude as verbal analytic ability, an ability that is gained by linguistic experience in one's native language, in foreign languages, or in linguistics. The rationale behind DeKeyser's (2000) position is Bley-Vroman's (1988, 1990) Fundamental Difference Hypothesis, according to which there is a qualitative difference between the learning mechanisms of child and adult L2 learners. DeKeyser (2000) argued that, while younger learners learn mostly implicitly, using domain-specific mechanisms, older learners learn mostly explicitly, using problem-solving or domain-general mechanisms, and, therefore, have to rely more on language aptitude. Since individual differences in language aptitude account for variation in explicit language learning ability, and explicit language learning accounts for variation in adult learners' ultimate level of attainment, it follows that language aptitude should explain variation in adult learners' ultimate attainment.

While the numerous studies of language aptitude in instructed SLA have, in general, provided converging evidence of the predictive power of aptitude in L2 learning, the few that have investigated the role of aptitude in naturalistic contexts (Abrahamsson & Hyltenstam, 2008; DeKeyser, 2000; DeKeyser et al., 2010; Granena & Long, 2010; Harley & Hart, 2002) have yielded mixed findings.
Harley and Hart's (2002) study is usually cited as evidence for the predictive validity of aptitude in naturalistic contexts, even though, due to the extremely short period of naturalistic exposure involved (i.e., three months), the results of the study can only speak to the role of aptitude in rate of L2 learning, not eventual success.³ The study was based on Harley and Hart's (1997) work on aptitude as a predictor of L2 achievement in French immersion classrooms, which revealed significant positive correlations between memory (defined by the authors as memory for text) and L2 outcomes in early immersion learners and significant positive correlations between analytic ability and L2 outcomes in late immersion learners. Given that the different types of instruction early and late immersion learners were exposed to (i.e., holistic memory-based vs. language analysis) could have affected the results, Harley and Hart (2002) set out to look at the relationship between aptitude and L2 outcomes among adolescent learners after a three-month stay abroad. In addition to two cognitive measures (a measure of aptitude, operationalized as language analytic ability, and a measure of memory, operationalized as memory for text), they administered a battery of language tests, of which, after removing the effects of an outlier, only a sentence-repetition task was found to be related to aptitude. Harley and Hart concluded that aptitude was a factor related to success in a naturalistic context, although not consistently so. In fact, the only two measures in the battery that were administered as pretests, and, therefore, the only measures that could have provided reliable evidence of the benefits of study abroad, turned out to be unrelated to aptitude as posttests. Given this limitation of the study, and given that participants had been learning French in a classroom context for approximately seven years before their period of study abroad, the authors cannot discard the possibility that the significant correlations they found for the sentence-repetition task already existed before the overseas stay.

³ In order to assess level of ultimate attainment, a minimum length of residence in the L2-speaking country should be established. In some studies (e.g., DeKeyser, 2000; Abrahamsson & Hyltenstam, 2009), length of residence was at least 10 years. In others (e.g., Sorace, 1993; DeKeyser et al., 2010), it was 5 and 8 years. Oyama (1978) found no differences between a group of Italian L2 learners who had been in the U.S. for 5-11 years, and a group that had been there for 12-18 years.

DeKeyser (2000) administered an auditory grammaticality judgment test (GJT) incorporating various elements of morphosyntax to 57 Hungarian speakers of L2 English. He found a significant correlation between GJT scores and language aptitude, operationalized as verbal ability, among late arrivals (r = .33, p < .05), but a non-significant correlation among early arrivals (r = .07, ns). Those participants that were late arrivals and scored within the range of child arrivals, or came close, were all high-aptitude participants. High aptitude was operationalized as being half a standard deviation above the group's average score, which was 4.7 out of 20. The exception was a participant who did not have high aptitude (i.e., he was below .46 standard deviations from the average), but who, nevertheless, scored within the range of child arrivals. This participant scored 3 out of 20 on the aptitude test and 186 out of 200 on the GJT. According to DeKeyser, the fact that he was a postdoctoral student in the natural sciences would indirectly suggest that he was of above-average analytic ability and that his aptitude score was not indicative of his true skills.
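The half-standard-deviation criterion described above can be made concrete with a small sketch. The scores below are invented for illustration and are not DeKeyser's data; the function name `label_by_z` and the "mid"/"low" bands are assumptions added here, since the study itself only separated high-aptitude participants from the rest.

```python
from statistics import mean, stdev


def label_by_z(scores, cutoff=0.5):
    """Label each score by its z-score relative to the group mean.

    cutoff=0.5 mirrors the half-standard-deviation criterion; the
    'mid' and 'low' bands are an illustrative extension only.
    """
    m, sd = mean(scores), stdev(scores)  # sample SD (n - 1 denominator)
    labels = []
    for score in scores:
        z = (score - m) / sd
        if z > cutoff:
            labels.append("high")
        elif z < -cutoff:
            labels.append("low")
        else:
            labels.append("mid")
    return labels


# Hypothetical aptitude scores out of 20 (NOT data from the study).
hypothetical_scores = [1, 3, 4, 5, 5, 7, 9]
labels = label_by_z(hypothetical_scores)
print(list(zip(hypothetical_scores, labels)))
```

Because the cutoff is defined relative to the group's own mean and standard deviation, the same raw score can fall into different bands in different samples, which is one reason a restricted range of scores in a subgroup matters for the interpretation of such classifications.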
On the basis of these results, DeKeyser concluded that above-average analytic abilities are required to reach near-native levels in the L2.

Long (2007) argued that the language test in DeKeyser (2000), a GJT, could have allowed use of metalinguistic abilities and, therefore, might be measuring the same ability as the aptitude test. Long (2007) further interpreted the lack of a correlation between aptitude and GJT scores among early arrivals as the result of the lack of variance in scores within that group. DeKeyser's GJT was administered by having participants listen to each sentence stimulus twice with a three-second interval between the two repetitions. There was also a six-second interval between sentence pairs. As a result, it was a test that did "not require participants to perform under time pressure" (DeKeyser, 2000: 515). A similar administration format, with slightly shorter intervals, was used by Johnson and Newport (1989) and Birdsong and Molis (2001), i.e., a one- to two-second pause between the first and second readings and a similar interval between sentence pairs. This testing format could have maximized reflection, conscious monitoring, and opportunities for the late arrivals to rely on explicit knowledge. Participants with higher aptitude might have been better able to analyze each stimulus by drawing on their explicit L2 knowledge.

The lack of a correlation among early arrivals could be explained by the narrow range of aptitude/GJT scores (a possible floor effect in the aptitude test, as a result of the test not being language-independent, and a ceiling effect in the GJT). The aptitude test administered in the study was a Hungarian version of the Words-in-Sentences subtest in the MLAT. The test was, therefore, measuring L1 verbal analytic ability. However, when asked about their proficiency in Hungarian compared to English, only 22 of the 57 participants reported feeling more comfortable in Hungarian.
One of these participants was in the younger group, and 21 were in the older group. This means that the younger group was more homogeneous in terms of language dominance, while the older group was more heterogeneous (half of the late acquirers felt more comfortable in the L1 and half either in the L2 or equally in the L1 and L2). Such a distribution is not surprising, given that early acquirers' schooling took place in the L2 and that degree of L2 acquisition tends to correlate with degree of L1 attrition (Yukawa, 1997; Montrul, 2004; Hyltenstam et al., 2009). This could have biased the distribution of aptitude scores in the study by restricting the range of scores in the group of early acquirers. The distribution of high- and low-aptitude participants seems to support this interpretation. There were 15 participants who had an aptitude score half a standard deviation above the mean and who were, therefore, identified as high-aptitude individuals. Only two of these participants were early acquirers. Given that DeKeyser's (2000) study did not include any procedures to screen potential participants as near-native L2 speakers (unlike, for example, Abrahamsson & Hyltenstam, 2008), the fact that almost all high-aptitude participants were in the group of late acquirers seems to be an artifact of the aptitude instrument used.

An investigation by DeKeyser, Alfi-Shabtay, and Ravid (2010) provided cross-linguistic evidence for the nature of age effects in two parallel studies that looked at the acquisition of English in the U.S. and the acquisition of Hebrew in Israel by native speakers (NSs) of Russian (n = 76 and n = 64, respectively). The findings for aptitude (operationalized as L1 verbal aptitude and measured by a test comparable to the verbal SAT) showed a significant correlation between ultimate attainment and aptitude for the adult learners, but not for the early learners, replicating the findings in DeKeyser (2000).
Specifically, the significant correlation in the two parallel studies in DeKeyser et al. (2010) was found for the 18-40 age of acquisition range (r = .44, p < .05, and r = .45, p < .01), but not for the age of acquisition < 18 group (r = .11, ns, and r = -.37, ns), or the > 40 group (r = .33, ns, and r = .14, ns). The language measure used in the study followed the same format as that in DeKeyser (2000). The test was untimed and sentences were presented auditorily, twice, with a three-second interval between them. Therefore, as in DeKeyser (2000), the testing format used could have influenced the results obtained regarding the relationship between aptitude and GJT scores among adult learners.

To address previous methodological gaps in critical-period studies, Abrahamsson and Hyltenstam (2009) used a multiple-task design covering various language subdomains, L2 knowledge and L2 processing, and perception, as well as production. Their study with L2 speakers of Swedish began with detailed linguistic scrutiny of apparent linguistic nativelikeness. The formal procedure to screen participants into the study was as stringent as the instrumentation they used, and half of the study was devoted to selecting participants who identified themselves as nativelike, and who were also perceived to be nativelike by NS judges. Participants in the final sample were 31 childhood learners with ages of onset ≤ 11 and 10 adult learners with ages of onset ≥ 12. All the late learners were able to score within the NS range on some of the tasks, but, unlike some early learners, not across the whole range of tests employed. When scores on the aptitude test were considered, Abrahamsson and Hyltenstam (2008) found that the four late learners who were able to score within NS range on the GJT were all above average in terms of aptitude, as measured by the Swansea LAT (Meara et al., 2003).
The correlation between GJT scores and aptitude among late learners was moderately positive (r = .53), but not significant (p = .094), probably due to the small size of the group. The authors further observed that 72% of the early learners who performed within the NS range also had high aptitude. In fact, there was a significant positive correlation between GJT scores and aptitude in the early-learner group (r = .70, p < .001). On the basis of these results, Abrahamsson and Hyltenstam (2008) concluded that language aptitude can play a role not only in adult near-native SLA, but also in child SLA. This is a finding that runs contrary to DeKeyser's (2000) hypothesis that aptitude will not be a significant predictor among early learners. It also differs from the results of DeKeyser (2000) and DeKeyser et al. (2010), which showed no relation between proficiency and aptitude among early learners.

In a similar study, Granena and Long (2010) investigated the relationship between language aptitude, as measured by the LLAMA (Meara, 2005), a revised version of the LAT aptitude test used by Abrahamsson and Hyltenstam (2008), and ultimate morphosyntactic attainment, as measured by an auditory GJT. The results showed no relationship between aptitude and GJT performance. Participants in this study were 65 Chinese speakers of L2 Spanish divided into three groups according to age of onset of L2 learning (≤ 6, 7-15, and ≥ 16). A univariate analysis of variance with the three age groups as a fixed factor and language aptitude as a covariate revealed that, while age group, keeping aptitude constant, was significantly related to GJT scores (F(2,61) = 21.010, p < .001), aptitude, keeping group constant, was not (F(1,61) = .816, p = .370). The same analysis also showed no group-by-covariate interaction (F(2,59) = 1.225, p = .301). Therefore, the effect of aptitude was shown to be comparable in the three groups.
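The remark that a correlation of .53 fell short of significance "probably due to the small size of the group" can be illustrated with a rough calculation. The sketch below uses the Fisher z approximation, not the exact t test the authors presumably ran, to estimate the smallest |r| that reaches two-tailed significance at alpha = .05 for a given group size. The group sizes of 10 and 31 are borrowed from the sample description above purely for illustration; the aptitude analysis itself may have been based on slightly different numbers, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def approx_critical_r(n, alpha=0.05):
    """Approximate smallest |r| significant at two-tailed alpha (Fisher z).

    This is an approximation; exact critical values come from the
    t distribution with n - 2 degrees of freedom.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return math.tanh(z_crit / math.sqrt(n - 3))


# Illustrative group sizes (assumed here, taken from the sample description).
for n in (10, 31):
    print(n, round(approx_critical_r(n), 2))
```

Under this approximation, r = .53 falls below the threshold for a group of about 10 but well above it for a group of about 31, which is consistent with the interpretation that the non-significant result among late learners reflects low statistical power rather than a weak association.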
To summarize, research to date regarding the relationship between aptitude and eventual L2 success at a group level in naturalistic SLA has produced mixed findings. While aptitude was not related to ultimate morphosyntactic attainment among early acquirers in studies by DeKeyser (2000) and DeKeyser et al. (2010), Abrahamsson and Hyltenstam (2008) found a relationship between aptitude and GJT performance among those participants that were first exposed to the L2 before age 12. Also, while DeKeyser (2000) and DeKeyser et al. (2010) found a relationship between aptitude and GJT performance among late acquirers, Granena and Long (2010) did not. Finally, research findings have been inconsistent when a battery of tests has been employed, as by Harley and Hart (2002).

While DeKeyser (2000), DeKeyser et al. (2010), Abrahamsson and Hyltenstam (2008), and Granena and Long (2010) all investigated the same language domain, morphosyntax, and used the same type of L2 measure, a GJT, the conditions of test administration differed across the studies. Participants in the studies by DeKeyser (2000) and DeKeyser et al. (2010) listened to each sentence stimulus in an auditory GJT twice. Abrahamsson and Hyltenstam (2008) combined the scores of two different GJT modalities, one auditory (online) and one written (offline), with no time pressure. In the only study that did not find a relationship between aptitude and ultimate attainment, the test was auditory (online), and participants had to press a key as soon as they detected an error (Granena & Long, 2010). Sentences were only played once and, when participants pressed a key, the computer automatically moved on to the next sentence without a pause. Therefore, the three studies that reported significant correlations between language aptitude and morphosyntactic L2 attainment had in common the use of language measures with offline features, whereas the only study that did not find a relationship relied on a test with online features.
1.4 Language Aptitude and Controlled vs. Automatic Use of L2 Knowledge

The studies that have reported positive correlations between language aptitude scores and ultimate morphosyntactic attainment (Abrahamsson & Hyltenstam, 2008; DeKeyser, 2000; DeKeyser et al., 2010) have in common the operationalization of language aptitude as verbal or language analytic ability, as well as the use of language tests or conditions of test administration that allow monitoring and controlled use of L2 knowledge. Therefore, the measures of language aptitude and ultimate attainment employed in these studies could have been measuring the same underlying abilities. This is, in fact, how Long (2007) and Paradis (2009) interpreted DeKeyser's (2000) findings. Both suggested the possible role of participants' metalinguistic abilities in affecting the results of the study. Long (2007) noted that aptitude tests and GJTs have in common the fact that they allow use of metalinguistic abilities. Since, in part, they measure the same underlying abilities, "some positive association between the two sets of scores is to be expected" (p. 73). Similarly, Paradis (2009) pointed out that: "Very few of the 57 adult Hungarian-speaking immigrants in DeKeyser's (2000) study scored within the range of child immigrants on a grammaticality judgment task, and the few who did had high levels of analytic skills (suggesting that they probably used their metalinguistic knowledge)" (p. 124).

In an instructed language context, Roehr and Gánem (2009) showed that use of metalinguistic L2 knowledge was related to performance on the Words-in-Sentences MLAT subtest, a test that measures grammatical sensitivity, one of the aspects of language analytic ability (Skehan, 1989).
The metalinguistic test administered by Roehr and Gánem (2009) included a section with ungrammatical sentences that participants had to correct and explain and a section modeled on the MLAT Words-in-Sentences subtest, which required participants to identify the grammatical role of highlighted parts of sentences. While the study found moderate and significant correlations between MLAT 4 (Words-in-Sentences) and the two sections of the metalinguistic test (r = .45 and r = .41, respectively), correlations between L1/L2 working memory (as measured by reading span tests) and the metalinguistic test were weak and non-significant (r = .19 and r = .13), as were the correlations between metalinguistic test scores and the other MLAT subtests (r = .15, MLAT 1; r = .11, MLAT 2; r = .29, MLAT 3; and r = .10, MLAT 5). Roehr and Gánem (2009) concluded that only the more analytic component of aptitude was related to metalinguistic knowledge, not the more memory-based components.

In the study by Roehr and Gánem (2009), metalinguistic L2 knowledge was positively related to aptitude scores on the same aptitude subtest that correlated with GJT scores in DeKeyser (2000). Using a different language aptitude test and a fully crossed, within-subjects design, Granena (2011a, to appear) found that language aptitude moderated scores on an untimed written GJT, but not on an auditory GJT. Specifically, the study examined two features of GJTs that could have interacted with language aptitude test scores in previous naturalistic SLA studies: test modality (auditory vs. written) and item (sentence) complexity (syntactically simple vs. complex). Granena argued that aptitude, understood as analytic ability, could be related to L2 performance to the extent that the L2 measure includes features that allow L2 learners to make controlled use of L2 knowledge and monitor their performance.
If high-aptitude learners approach language as a puzzle-solving task (Skehan, 1998), off-line tests with no time constraints, or metalinguistic tasks with an error correction component, could allow them additional opportunities to rely on their problem-solving and analytic abilities. Granena (2011a, to appear) also argued that language aptitude could interact with test performance to the extent that test items make L2 processing more demanding, given that high-aptitude individuals are claimed to have strengths in the cognitive abilities that information processing draws on, such as processing speed and working memory.

Participants in the study by Granena (2011a, to appear) were 30 L1 English-L2 Spanish bilinguals with an average length of residence in Spain of 22 years and a group of 15 NS controls. Participants completed an auditory GJT, an untimed written GJT with a correction component, and the LLAMA aptitude test (Meara, 2005). The two speaker groups were comparable in terms of their language aptitude scores (t(43) = .003, p = .998). The average score was 50.67 (SD = 11.55) in the NS control group, and 50.65 (SD = 15.14) in the L2-speaker group, out of a maximum possible score of 100. Regarding sentence complexity, although both speaker groups scored significantly higher on simple test items, aptitude did not moderate scores in either group. Regarding modality, both groups scored significantly higher on the written than on the auditory GJT, but aptitude only moderated difference scores in the L2-speaker group. In other words, there was an interaction between language aptitude as a covariate and test modality (auditory GJT vs. untimed written GJT) among L2 speakers, keeping all other factors constant (i.e., target structure, number of items, sentence complexity) (Λ(1,28) = .751, p = .005, ηp² = .249), but not among NSs (Λ(1,13) = .952, p = .433, ηp² = .048).
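The effect sizes reported for the aptitude-by-modality interaction can be cross-checked, assuming the multivariate statistic reported is Wilks' lambda (an inference from the values themselves, not something the text states). For an effect with a single hypothesis degree of freedom, partial eta squared equals 1 minus lambda, and the equivalent F ratio is ((1 - lambda) / lambda) multiplied by the error degrees of freedom. The function names below are hypothetical.

```python
def partial_eta_squared(wilks_lambda):
    """Partial eta squared for a single-df effect: 1 - Wilks' lambda."""
    return 1.0 - wilks_lambda


def equivalent_f(wilks_lambda, df_error):
    """F ratio equivalent to Wilks' lambda when the hypothesis df is 1."""
    return (1.0 - wilks_lambda) / wilks_lambda * df_error


# (lambda, error df) as reported in the text for each speaker group.
reported = {"L2 speakers": (0.751, 28), "NS controls": (0.952, 13)}

results = {
    group: (round(partial_eta_squared(lam), 3), round(equivalent_f(lam, df), 2))
    for group, (lam, df) in reported.items()
}
for group, (eta_sq, f_ratio) in results.items():
    print(group, eta_sq, f_ratio)
```

The recomputed effect sizes match the reported values of .249 and .048, which supports reading the statistic as Wilks' lambda. The much larger equivalent F ratio in the L2-speaker group is in line with the reported pattern of a significant interaction among L2 speakers only.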
Follow-up analyses with language aptitude as a group variable (high-aptitude = z-scores > .5, mid-aptitude = -.5 < z-scores < .5, and low-aptitude = z-scores < -.5) and Bonferroni-adjusted multiple comparisons further showed group differences in the written GJT (F(2,27) = 5.694, p = .009, ηp² = .297), between high- and mid-aptitude L2 speakers and low-aptitude L2 speakers (p = .013 and p = .029, respectively), but not in the auditory GJT (F(2,27) = 1.143, p = .334, ηp² = .078). The correlations between L2 speakers' aptitude scores and written GJT scores were .438 (p = .016) and .447 (p = .013) for simple and complex test items, respectively, while the corresponding correlations between aptitude scores and auditory GJT scores were .162 (p = .392) and .173 (p = .360). No correlation in the NS group had a magnitude greater than .10. When only ungrammatical items were considered, for which the built-in error may be the most likely reason for rejection, the correlation between language aptitude and written GJT scores in the L2-speaker group increased in magnitude (r = .495, p = .005), while the correlation between aptitude and auditory GJT scores decreased (r = .002, p = .990).

These results were interpreted as indicating a positive association between language aptitude, understood as analytic ability, and language tests that involve predominantly controlled use of L2 knowledge by allowing participants time to reflect on language correctness and language structure. The written GJT gave L2 speakers unlimited time to analyze test sentences and monitor their performance. The need to provide a correct version of the ungrammatical sentences further encouraged them to reflect consciously on linguistic structure and sentence correctness. Since high-aptitude L2 speakers'
performance improved significantly on the written test (with respect to the auditory test), a possible explanation is that they were able to use the same analytic, metalinguistic abilities that the aptitude test measured. The auditory GJT, on the other hand, required online processing, which may minimize monitoring. On this test, high-aptitude L2 speakers did not outperform low-aptitude participants. In fact, the highest L2-speaker scorer on the auditory GJT was a low-aptitude individual, an adult acquirer with a length of residence of 30 years who had arrived in Spain when she was 19 and who reported having started learning Spanish at university at the age of 18. This L2 speaker obtained an auditory score of 110 out of 128, and she was very close to the lowest scorer in the NS group, a participant with a score of 113. This result could suggest that the type of aptitude that the LLAMA test is hypothesized to predominantly measure, a type of aptitude that relies on explicit cognitive processes such as analytic ability, is not necessary for near-nativelike attainment in the type of naturalistic context investigated. This finding does not negate the possibility that other cognitive aptitudes could be relevant and account for such high levels of attainment.

The results of Granena (2011a, to appear) run contrary to those studies that have reported a relationship between aptitude and adult L2 learners' ultimate level of attainment in a naturalistic setting (Abrahamsson & Hyltenstam, 2008; DeKeyser, 2000; DeKeyser et al., 2010). They also run contrary to DeKeyser's (2000) claim that only late acquirers with a high level of language aptitude will be able to score within the range of early acquirers.
Results are, however, consistent with Granena and Long's (2010) findings and, together, provide converging evidence for the lack of a relationship between aptitude and ultimate level of attainment, as measured by an auditory GJT, from two different L1 populations (English and Chinese) learning the same L2 in a naturalistic context.

The fact that the results reported by Granena (2011a, to appear) and Granena and Long (2010) run contrary to those reported by Abrahamsson and Hyltenstam (2008), DeKeyser (2000), and DeKeyser et al. (2010) suggests that the relationship between language aptitude and adult L2 learners' ultimate level of attainment is more complex than claimed so far on the basis of studies where aptitude has been investigated in single-task designs. A possible explanation for the lack of converging findings is that, even though Abrahamsson and Hyltenstam (2008), DeKeyser (2000), and DeKeyser et al. (2010) all measured ultimate attainment with a GJT, they relied on formats or conditions of test administration that allowed learners time to reflect on language correctness and make use of metalinguistic abilities.

A question that is relevant to this discussion is what type of GJT, if any, provides a more valid measure of ultimate level of attainment. The answer to this question can vary depending on how linguistic competence is defined. If defined as knowledge that is available in spontaneous language use (Ellis, 2005), online test conditions where performance takes place in real time should provide a more valid measure of language acquisition. From this perspective, an auditory GJT, preferably under timed conditions and involving a single presentation of test items, should be a more valid measure of linguistic competence than a written GJT under untimed conditions, since unpressured tasks encourage a high degree of awareness and maximize the opportunities to access metalinguistic knowledge.
Still, both types of GJT require a focus on language forms, since judging the correctness of sentences necessarily entails this. Other language measures involving the ability to handle language within real-time constraints, but with a clearer primary focus on meaning than GJTs, would probably be better indicators of L2 learners' linguistic competence.

1.5 Cognitive Aptitudes and Language Learning

Language aptitude was originally understood as a unidimensional construct (e.g., Carroll, 1973) and measured accordingly through a composite of different abilities. Several studies have followed this conceptualization of aptitude, which involves across-the-board haves and have-nots, by relying on single measures of aptitude conceived as a unitary trait (e.g., Abrahamsson & Hyltenstam, 2008; Bylund et al., 2010; Granena & Long, 2010; Granena, 2011a, to appear). More recent theorizing on aptitude has called for a multifaceted view of the construct that can result in individually unique L2 aptitude profiles (e.g., Skehan, 1998, 2002, 2012) or "aptitude complexes" (Robinson, 2002). L2 learners may have high ability in one aptitude component or complex, but low ability in others. Different aptitudes, in turn, may moderate L2 learning differently depending on factors such as instructional treatment, L2 learning environment, and type of linguistic feature.

With this multifaceted view in mind, Granena (2011b, to appear) ran a Principal Components Analysis (PCA) on the scores of 73 participants on the four subtests of the LLAMA aptitude test (Meara, 2005): vocabulary learning, sound recognition, sound-symbol association, and grammar inferencing. The main research question was whether the LLAMA subtests measured a unitary trait, conceived as language aptitude, or multiple aptitude components. A first analysis was conducted with 63 L1 Chinese-L2 Spanish bilinguals and 10 L1 speakers of Spanish.
PCA is an exploratory factor analytic technique that summarizes the interrelationships among a set of original variables in terms of a smaller set of orthogonal (i.e., uncorrelated, in an unrotated solution or a solution with Varimax rotation) or non-orthogonal (i.e., correlated, in a solution with Direct Oblimin rotation) principal components. Its purpose is to reduce data by creating natural sets of composite variables. An unrotated PCA of the LLAMA resulted in a two-factor solution, that is, two components with eigenvalues greater than 1.0. Together, they accounted for 68.497% of the total variance. Three of the LLAMA subtests loaded on a first component (λ = 1.711) and one subtest on a second component (λ = 1.029). The three subtests that loaded on the first component with values greater than .3 were vocabulary learning (λ = .711), sound-symbol association (λ = .770), and grammatical inferencing (λ = .759). The correlations among them were .310 (p = .003) between vocabulary learning and sound-symbol association, .317 (p = .003) between vocabulary learning and grammatical inferencing, and .414 (p < .001) between sound-symbol association and grammatical inferencing. The only subtest that loaded on the second component with a value greater than .3 was the sound recognition test (λ = .944). The correlations between sound recognition and the other three subtests (vocabulary learning, sound-symbol correspondence, and grammatical inferencing) were close to zero: .141 (p = .113), .069 (p = .279), and -.026 (p = .412), respectively.

PCA has several drawbacks. Extracted components tend to overestimate the patterns of relationships among sets of variables because the analysis does not separate out errors of measurement from shared variance. An alternative exploratory technique is Principal Axis Factoring (PAF). PAF extracts factors only from the variance that variables share in common (i.e., common variance) and not from total variance.
As in PCA, the default option in PAF is to extract orthogonal factors such that the first factor accounts for the maximum amount of common variance in the data, while the second factor accounts for residual variance after having factored out the influence of the first factor, and so on. In PAF, the total amount of variance accounted for and the variance in the variables explained by each of the factors are lower than in PCA. The PAF analysis run on the LLAMA also converged on a two-factor solution with vocabulary learning, sound-symbol correspondence, and grammar inferencing loading on a first factor, and sound recognition loading on a second factor. As expected, the overall variance accounted for decreased to 35.115%. The amount of variance in the variables explained by each of the factors was also lower, but loadings remained greater than .4. Vocabulary learning, sound-symbol correspondence, and grammar inferencing loaded on the first factor with loadings of .516, .598, and .697, respectively. Sound recognition loaded on the second factor with a value of .443. Removing the 10 L1 speakers of Spanish and running the analysis with orthogonal (Varimax) or non-orthogonal (Direct Oblimin) rotations did not change the reported two-factor structure.

To further confirm the results obtained, sample size was increased to 117 participants. The sample combined L1 Chinese-L2 Spanish bilinguals (n = 63), L1 English-L2 Spanish bilinguals (n = 29), and NSs of Spanish (n = 25). The two-factor structure was maintained, but with a second factor that did not quite reach an eigenvalue of 1.0 (it ranged between .909 and .995, depending on the analysis). If only one factor was retained, the three subtests already identified (vocabulary learning, sound-symbol correspondence, and grammatical inferencing) contributed to it with loadings greater than .3, while the contribution of the sound recognition test was negligible and did not reach the recommended threshold of .3.
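The first step of the unrotated PCA reported above can be approximated from the published correlations alone, since the eigenvalues of the correlation matrix are the component eigenvalues. The sketch below is not the author's analysis, which would have been run on raw scores in a statistics package. It enters the six correlations as reported, uses a plain Jacobi rotation routine (a hypothetical helper, `symmetric_eigenvalues`, written here to avoid external dependencies), and applies the eigenvalue-greater-than-1 retention rule. Because the inputs are rounded to three decimals, the output should only be expected to land close to the reported eigenvalues and percentage of variance.

```python
import math

# Correlations as reported in the text. Row/column order:
# vocabulary learning, sound-symbol association,
# grammatical inferencing, sound recognition.
R = [
    [1.000, 0.310, 0.317, 0.141],
    [0.310, 1.000, 0.414, 0.069],
    [0.317, 0.414, 1.000, -0.026],
    [0.141, 0.069, -0.026, 1.000],
]


def symmetric_eigenvalues(matrix, max_sweeps=100, tol=1e-12):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Angle chosen so the rotated a[p][q] becomes zero.
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


eigenvalues = symmetric_eigenvalues(R)
retained = [ev for ev in eigenvalues if ev > 1.0]  # eigenvalue > 1 rule
explained = sum(retained) / len(R)  # share of total variance

print([round(ev, 3) for ev in eigenvalues])
print(len(retained), round(100 * explained, 1))
```

The design choice to work from the correlation matrix rather than raw scores means that only the eigenvalues and the proportion of variance can be checked this way; component loadings and rotated solutions would require the eigenvectors and, for PAF, iterated communality estimates.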
Regarding the interpretation of the two factors underlying the LLAMA aptitude test, they could be labeled as language analytic ability and phonological sequence learning ability. The analyses suggest that the vocabulary learning subtest is related to the sound-symbol correspondence and grammatical inferencing subtests, but not to sound recognition scores. Therefore, individuals with good phonological sequence learning abilities may not necessarily have good analytic skills, and vice versa. The three subtests loading on the component that was interpreted as analytic ability have in common the fact that they include a study phase in which participants are given time to work out relations in a dataset. Unlike the sound recognition subtest, therefore, the three subtests loading on analytic ability allow for strategy use and problem-solving techniques. Granena (2011b, to appear) concluded that the LLAMA is mostly a test measuring the analytic component of aptitude and, therefore, aptitude for explicit language learning. Further support for this interpretation comes from the fact that the largest loading on the component interpreted as "analytic ability" corresponded to grammatical inferencing (LLAMA F). In this subtest, test takers are given time to infer or induce the rules governing a set of language materials presented visually. Therefore, it could be argued that the test is primarily measuring (explicit) inductive language learning ability, an ability that, according to Skehan (1989), should not be separated from grammatical sensitivity. In fact, he reconceptualized inductive language learning ability and grammatical sensitivity as language analytic ability (Skehan, 1998).

While LLAMA F requires test takers to work out the grammar of an unknown language by means of pictures and short written sentences, LLAMA D measures the ability to discriminate short stretches of spoken language by analogy.
As pointed out by Meara (2005), LLAMA D "owes something to Speciale (Speciale, N. Ellis, & Bywater, 2004)" who "suggest that a key skill in language ability is your ability to recognize patterns, particularly patterns in spoken language" (p. 8). Speciale et al.'s (2004) work is based on a strand of cognitive psychology that investigates the implicit induction of phonological sequences (Saffran et al., 1996; Saffran, Johnson, Aslin, & Newport, 1999; Saffran, Newport, Aslin, Tunick, & Barrueco, 1997). LLAMA D can thus be seen as an attempt to measure implicit inductive learning ability.

Construct validity in psychological tests involves building a network where tests are related to constructs, constructs are related to other constructs, and finally constructs are related to observables (Cronbach & Meehl, 1955). Empirical evidence for the relationship between explicit language learning and the three LLAMA subtests loading on the factor that was interpreted as analytic ability can be found in work by Yilmaz (2010) and in a re-analysis of Granena (to appear). In his study investigating corrective feedback, Yilmaz (2010) reported moderate-to-strong and significant correlations between the LLAMA grammatical inferencing subtest and posttest scores in a group that received explicit correction, whereas the same correlation was weak and non-significant in a group that received recasts. The re-analysis of Granena (to appear) showed a stronger relationship between untimed written GJT scores and aptitude operationalized as analytic ability than between untimed written GJT scores and aptitude operationalized as the average of the four LLAMA subtests (r = .58 vs. r = .46). Analytic ability moderated the difference score between auditory and written GJT scores (p = .008) in the L2-learner group only.
In addition, while the correlation between analytic ability and written GJT scores was .58 (p = .001), the corresponding correlation with sound recognition approached zero (r = .034, p = .857).

The two cognitive aptitudes that the LLAMA test is hypothesized to measure (analytic ability and sequence learning ability) are very different in nature. While language analytic ability is gained through linguistic experience in one's native language, in foreign languages, or in linguistics, sequence learning ability can be regarded as a core component of human intellectual skill. In fact, according to N. Ellis (1996), much of language acquisition is sequence learning. This type of learning involves the discovery of language structure by way of statistical properties of the input. In the L1, infants have been shown to be sensitive to sequential dependencies in language. Saffran, Newport, and Aslin (1996) demonstrated that 8-month-old infants exposed to strings of nonsense syllables were able to detect the difference between three-syllable sequences that appeared as a unit in their learning set and sequences that appeared in random order.

Statistical learning involves implicit learning processes (Perruchet & Pacton, 2006). Little is known about individual differences in this type of learning. According to Reber's earliest work (e.g., Reber, 1989), individual differences in implicit learning should be minimal because implicit forms of learning reflect primitive cognitive systems. Reber (1993) predicted that implicit learning is intelligence quotient (IQ)-independent and age-invariant, a claim that was supported by Reber et al.'s (1991) study, in which no association was found between artificial grammar learning and IQ scores. On the other hand, N. Ellis (1996) argued that individuals differ in their sequencing ability.
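The kind of statistical property a learner could exploit in a syllable stream, as in the infant study mentioned above, can be illustrated with a minimal sketch. It assumes that the relevant statistic is the transitional probability between adjacent syllables, which is an assumption of this illustration and not a detail stated in the text. The three "words", the stream length, and the function name are hypothetical and do not reproduce the original stimuli.

```python
import random
from collections import Counter

random.seed(0)

# Hypothetical three-syllable "words"; not the original stimuli.
words = [("tu", "pi", "ro"), ("go", "la", "bu"), ("bi", "da", "ku")]

# Build a continuous syllable stream with no immediate word repetition.
stream = []
previous = None
for _ in range(300):
    word = random.choice([w for w in words if w != previous])
    stream.extend(word)
    previous = word

# Count adjacent syllable pairs and how often each syllable starts a pair.
pair_counts = Counter(zip(stream, stream[1:]))
first_counts = Counter(stream[:-1])

def transitional_probability(a, b):
    """Estimate P(next syllable is b | current syllable is a)."""
    return pair_counts[(a, b)] / first_counts[a]

# Within a word the next syllable is fully predictable; across a word
# boundary it is not, which is the cue a learner could use to find units.
print("within word  tu -> pi:", transitional_probability("tu", "pi"))
print("across words ro -> go:", round(transitional_probability("ro", "go"), 2))
```

In this toy stream, within-word transitions are fully predictable while transitions across word boundaries are not, so three-syllable units stand out from the statistics of the input alone.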
More recently, Woltz (2003) also claimed that individuals can be expected to differ in implicit cognitive processes, just as they differ on most cognitive measures. Evidence from recent studies seems to support this prediction. Misyak and Christiansen (2012) documented variability in statistical learning performance among adults, as well as correlations between individual differences in statistical learning and L1 abilities. Specifically, statistical learning scores predicted comprehension accuracy in a self-paced reading task. The authors concluded that individual differences in statistical learning skills, largely overlooked in previous research, may be able to account for more language variance than other measures typically used in individual differences research. Woltz (2003) also provided evidence of individual differences in repetition and semantic priming. He suggested that the general domain of implicit cognitive processes (implicit memory, implicit learning, and procedural knowledge) could be a fruitful area in which to investigate new aptitude constructs. He argued that exploring individual differences in implicit learning could result in aptitude constructs that have minimal overlap with existing ones.

Additional evidence suggesting systematic variation in statistical learning comes from studies that have investigated Reber's predictions regarding the existence of a dissociation between implicit/explicit learning and IQ. In a replication and extension of Reber et al. (1991), Robinson (2002) looked at the relationship between implicit artificial grammar learning and IQ, as measured by a short form of the Wechsler Adult Intelligence Scale (WAIS-R), which included three subtests: block design, vocabulary, and arithmetic. Findings showed a significant negative correlation between implicit learning and IQ (r = -.34, p < .05), somewhat contrary to Reber et al.'s (1991) results.
Participants with lower scores on the WAIS-R outperformed those with higher IQ scores. However, the incidental learning condition in the same study, which involved meaning-based processing, showed no relationship to IQ. This result would not be contrary to Reber's predictions, since incidental learning shares with implicit learning the fact that it is "unintentional and uncontrolled" (Reber & Allen, 2000, p. 238). Conversely, other studies have shown that Culture-Fair (non-verbal) IQ test scores were positively related to learning on miniature L2 learning tasks (Brooks, Kempe, & Sionov, 2006; Kempe & Brooks, 2008; Kempe, Brooks, & Kharkhurin, 2010).

A possible explanation for these mixed findings could be the actual type of learning that took place in the studies. For example, learning might not have been as implicit as expected in those studies that reported a relationship between IQ or working memory and learning outcomes. As pointed out by Misyak and Christiansen (2012), fluid intelligence has been shown to correlate with artificial grammar learning when participants have been instructed to search intentionally for patterns (e.g., Gebauer & Mackintosh, 2007), but not under more incidental learning conditions. Similarly, Robinson (2002) explained the negative correlation between IQ and implicit learning as indicating participants' adoption of an explicit code-breaking set towards the implicit learning task, a strategy that had more negative effects for those participants with higher IQ scores, who are better at explicit learning. Robinson's explanation was further supported by the correlation between IQ and the explicit learning condition in the study, which was positive and significant.

A clear rationale for the prediction of a lack of a relationship between implicit learning and general measures of intelligence or other cognitive measures that tap explicit processes (i.e., Reber's position) is offered by Woltz (2003).
According to Woltz, implicit learning should be unaffected by differences in IQ as measured by currently available intelligence tests, since these measures have been biased towards explicit processes that are attention-driven (e.g., working memory). In support of his argument, Woltz cites studies such as Engle, Tuholski, Laughlin, and Conway (1999), which found correlations between IQ and working memory measures (operation span with words, reading span, and counting span), as well as his own (Woltz, 1990, 1999), which showed low correlations between IQ and implicit memory measures of priming. Further support for this claim can be found in Tagarelli, Borges-Mota, and Rebuschat's (forthcoming) study, in which working memory predicted learning under the rule-search condition, but not under the incidental condition. Woltz (2003) further argues that explicit, attention-driven memory processes are, to a certain extent, functionally and structurally independent from implicit cognitive processes, even if all can operate in concert in complex learning tasks. Explicit processes require effort and some level of awareness, while implicit processes are revealed in performance facilitation, often with lack of awareness of the original learning event.

With respect to the relationships among aptitude components and language acquisition, the following conclusions can be drawn from the studies reviewed above:

• Language aptitude measures, including those in studies of ultimate L2 attainment to date, have been weighted heavily in favor of explicit cognitive processes (i.e., language analysis) and have overlooked implicit cognitive processes
• There is emerging recognition that individual differences in implicit learning do exist (as even conceded in Reber & Allen, 2000)
• Different cognitive aptitudes measure underlying abilities that are not necessarily correlated with one another
• Different aptitudes relate differently to different measures/aspects of ultimate L2 attainment and language learning
• Research has looked at individual differences in implicit learning, but has overlooked any direct links between differences in implicit learning abilities and variation in language (L1/L2) abilities

1.6 Conclusion

Previous research on language aptitude and long-term L2 achievement has been biased towards measures of language aptitude and language attainment that rely on explicit and attention-driven memory processes. Aptitude measures have focused on analytic ability and explicit induction, while language measures of morphosyntactic attainment have engaged participants in explicit processing by focusing their attention on linguistic structure and language correctness and by relying on testing formats that allow time to think. A positive association between the two sets of measures, therefore, could be partly due to the fact that they measure the same abilities.

Implicit learning and processing and their potential relationship with L2 learning outcomes have been overlooked as an aptitude construct, despite increasing agreement that explicit and implicit learning are two relatively independent learning systems, each with its own sources of individual differences (Kaufman, DeYoung, Gray, Jiménez, Brown, & Mackintosh, 2010). Similarly, previous studies have failed to rely on language measures with a primary focus on meaning that avoid raising the participants' awareness of the phenomenon investigated, despite agreement that language knowledge can be used in more automatic or more controlled processing conditions.

An optimal learning context in which to start investigating potential relationships between aptitude for implicit learning and L2 outcomes is the naturalistic (i.e., immersion) context. Naturalistic L2 learning environments maximize the opportunities for implicit learning by providing massive input exposure in communicative situations.
In this context, cognitive aptitudes for implicit learning could help L2 learners detect complex and noisy regularities.

Chapter 2: Purpose of the Study

This dissertation research was motivated by three main research gaps in the body of literature that has investigated the role of language aptitude in ultimate morphosyntactic attainment. First, the few studies that have looked at the relationship between language aptitude and morphosyntactic L2 attainment have relied on single-task designs, which provide no indication of variation between tasks and, as a result, may have limited generalizability. Second, they have relied on a single type of L2 measure, the GJT, which is a measure that focuses on accuracy of grammaticality judgment and language structure, and have failed to look at more meaning-based online tasks. Finally, they have relied on language aptitude measures heavily biased in favor of explicit cognitive processes, which could have resulted in positive associations between aptitude and the type of L2 measure employed.

In order to address these gaps, the present study set out to investigate the relationship between different cognitive aptitudes for L2 learning, including general intelligence, and ultimate level of morphosyntactic attainment in early and late L2 learners as measured by tests hypothesized to allow controlled use of L2 knowledge and tests hypothesized to require automatic use of L2 knowledge. A major goal of this dissertation work was to investigate new aptitude constructs in the domain of implicit cognitive processes and their potential role in determining L2 attainment in immersion settings, a relationship that, to my knowledge, no study has addressed before.

It has been argued that analytic ability, memory, and phonetic sensitivity are the three most important components of language aptitude (Carroll, 1964; DeKeyser & Koeth, forthcoming; Skehan, 1989, 1998).
However, studies of ultimate morphosyntactic attainment have mostly focused on language analytic ability, using either measures of analytic ability, such as the Words in Sentences MLAT subtest (e.g., DeKeyser, 2000), or a composite of cognitive abilities that has been heavily weighted in favor of analytic ability (e.g., Abrahamsson & Hyltenstam, 2008; Bylund et al., 2010; Granena & Long, 2010). Analytic ability, which, according to DeKeyser and Koeth (2011), is closely related to verbal aptitude and even general intelligence, was the cognitive aptitude that DeKeyser (2000) anticipated as a necessary condition for adult learners to reach high levels of ultimate attainment, due to the explicit learning mechanisms they are hypothesized by some to rely on.

No study of ultimate L2 attainment to date has investigated multiple cognitive aptitudes following Skehan's (1989, 1998) and Robinson's (2002) calls for aptitude complexes and Sternberg's (1985, 1990) notion of "multiple intelligences". In addition to explicit language learning, individuals differ in other domains of cognitive ability. Although little research exists, recent studies have suggested that some of these domains may be relevant for implicit language learning (e.g., Woltz, 2003). Individual differences in implicit processes may be especially relevant in accounting for variation in the learning outcomes of child L2 learners who have started learning early enough to acquire language without awareness, but late enough to be affected by age of onset. For example, studies such as Granfeldt et al. (2007) have shown that Swedish child L2 learners of French with ages of onset between 3;5 and 6;7 resembled adult learners of French in their use of features such as gender agreement.
Individual differences in implicit learning may also be relevant to account for variation among adult learners, if, contrary to what has been claimed by some, they are still able to learn an L2 incidentally via unconscious associative mechanisms. Studying aptitudes for implicit language learning in immersion contexts where the L2 is the language of the environment should be more revealing than studying them in instructed contexts where input exposure is limited, since language proficiency develops over a long period of constant exposure to the L2, thereby maximizing the potential involvement of implicit learning processes.

Reconceptualizing existing cognitive-ability constructs as explicit and implicit language-learning aptitudes is a worthwhile endeavor, in order to investigate qualitative differences in learning processes between child and adult learners, as well as aptitude-treatment interactions (i.e., the relative contribution of different aptitudes in specific contexts). One difference between children and adults is that adults can learn aspects of the L2 through explicit reflection on linguistic structure. It is a matter of debate, however, whether this is how adults predominantly learn L2 grammar or whether they are still able to learn implicitly, even if this capacity is constrained by age.

Following DeKeyser's (2000) argument, any relationships between cognitive aptitudes and learning outcomes can be potential evidence for learning processes. As DeKeyser (2000) predicted, cognitive aptitudes that are more likely to play a role in explicit language learning should be necessary for L2 learners to reach a high ultimate level of attainment, if they learn through predominantly explicit mechanisms that draw on domain-general problem-solving abilities.
Conversely, even if DeKeyser (2000) did not make such a prediction, cognitive aptitudes that are more likely to play a role in implicit language learning should be necessary for L2 learners to reach a high ultimate level of attainment, if they learn through predominantly implicit mechanisms.

Specifically, this dissertation research examined the extent to which aptitude for explicit language learning (operationalized as language analytic ability) and aptitude for implicit language learning (operationalized as sequence learning ability) moderate ultimate attainment, as measured by language tasks that involve predominantly controlled and predominantly automatic use of language knowledge. Granena (2011b, to appear) found that analytic ability (as measured by a composite of LLAMA B, E, and F) and phonological sequence learning ability (LLAMA D) were two relatively independent abilities. Granena (2011a, to appear) further showed that analytic ability moderated ultimate attainment only when measured with a task where language performance could be monitored. That study, however, had several limitations. The method included a single online measure that involved predominantly automatic use of L2 knowledge and a single offline measure that involved predominantly controlled use of L2 knowledge. In addition, the two measures were administered in two different modalities (aural and visual). Finally, participants in the study were all late L2 learners with ages of arrival ranging between 17 and 43. The present study addressed these limitations by including a multiple-task design and both child and adult L2 learners, as well as NS controls.

In order to sample early and late L2 learners, an operationalization of each of the groups was necessary. Unfortunately, there is no consensus in the literature regarding a cutoff point between the two.
For morphosyntax, the claimed cutoff age ranges from as young as six years of age (e.g., Paradis, 2009) up to the mid teens (DeKeyser, 2000). There is more agreement, however, regarding the expected shape of the age of onset-ultimate attainment function that the CP predicts, which resembles a stretched "Z" (Johnson & Newport, 1989:79): a peak of enhanced sensitivity, followed by a decline in learning ability, and then by a leveling off marking the end of the offset phase of the CP. Age of onset is expected to predict ultimate attainment in the phase of decline. Before and after the decline, individual differences such as cognitive aptitudes become more relevant as potential factors that can account for the spread in proficiency (Dörnyei, 2005; Dörnyei & Skehan, 2003; R. Ellis, 2004).

Given that the main predictor of ultimate attainment in the phase of decline is age of onset (e.g., Johnson & Newport, 1989), and that the decline becomes clearly visible from around age 6 and extends over a period of roughly 10 to 15 years (DeKeyser et al., 2010), the present study focused on two distinct groups of L2 learners: early child learners who started learning the L2 between ages 3 and 6, and adult learners who started learning the L2 after age 16. By looking at the extremes of the age of acquisition range, an arbitrarily established cutoff point was avoided.

Both groups of learners are considered sequential bilinguals and both differ from NSs and simultaneous bilinguals regarding the degree of variability in their linguistic attainment. While there is relative uniformity of learning rate and ultimate success in L1 acquisition, L2 learners show a high degree of variability, across individuals and within learners, even when they start learning an L2 as early as age 3. Such inter- and intra-individual variation could be moderated by cognitive aptitudes.
However, learners who acquire the L2 between ages 3 and 6 are not fundamentally different from NSs or simultaneous bilinguals in terms of learning mechanisms, since it has been claimed that, before age 6, SLA relies on implicit learning (and the younger, the better), while "after age 6 or 7, second language appropriation relies more and more on conscious learning, thus involving declarative memory" (Paradis, 2009:110). In addition, children first manifest metalinguistic behavior, evident when the child has conscious awareness of why a sentence is ungrammatical and can demonstrate this understanding, around age 5 or later (Karmiloff-Smith, 1979). Unlike early childhood learners, teenagers and young adults may additionally rely on explicit, analytic problem-solving capacities to learn the L2.

If the brain of a child L2 learner acquires an L2 much like it acquires the L1, but the brain of an adult learner relies on predominantly explicit mechanisms that draw on domain-general problem-solving abilities, there should be qualitative differences between the two populations, such that different cognitive aptitudes moderate the level of proficiency child and adult L2 learners attain. If, alternatively, fundamental differences in learning mechanisms emerge in early childhood and the FDH (Bley-Vroman, 1988, 1990) applies to child as well as to adult SLA, as argued by Meisel (2009, 2011), the same cognitive aptitudes should moderate ultimate attainment in both child and adult L2 learners.

Chapter 3: Research Questions and Hypotheses

Following DeKeyser's (2000) claim that relationships between individual differences in language aptitude and eventual learning outcomes potentially constitute evidence for differences in underlying learning processes, this study investigated the relationship between different cognitive aptitudes for L2 learning, including general intelligence, and long-term L2 achievement in early and late L2 learners.
Six research questions, each of them including three measurable hypotheses, were addressed in this dissertation research. The research questions and hypotheses were the following (see Table 1 for a summary of predictions):

• Research Question 1: To what extent will early L2 learners' ultimate attainment on tasks that allow controlled use of L2 knowledge be moderated by their cognitive aptitudes?
Hypothesis 1a. Aptitude for explicit language learning will not moderate early L2 learners' attainment on tasks that allow controlled use of L2 knowledge.
Hypothesis 1b. Aptitude for implicit language learning will moderate early L2 learners' attainment on tasks that allow controlled use of L2 knowledge.
Hypothesis 1c. General intelligence will not moderate early L2 learners' attainment on tasks that allow controlled use of L2 knowledge.

• Research Question 2: To what extent will late L2 learners' ultimate attainment on tasks that allow controlled use of L2 knowledge be moderated by their cognitive aptitudes?
Hypothesis 2a. Aptitude for explicit language learning will moderate late L2 learners' attainment on tasks that allow controlled use of L2 knowledge.
Hypothesis 2b. Aptitude for implicit language learning will not moderate late L2 learners' attainment on tasks that allow controlled use of L2 knowledge.
Hypothesis 2c. General intelligence will moderate late L2 learners' attainment on tasks that allow controlled use of L2 knowledge.

• Research Question 3: To what extent will NSs' ultimate attainment on tasks that allow controlled use of L2 knowledge be moderated by their cognitive aptitudes?
Hypothesis 3a. Aptitude for explicit language learning will not moderate NS controls' attainment on tasks that allow controlled use of L2 knowledge.
Hypothesis 3b. Aptitude for implicit language learning will not moderate NS controls' attainment on tasks that allow controlled use of L2 knowledge.
Hypothesis 3c. General intelligence will not moderate NS controls' attainment on tasks that allow controlled use of L2 knowledge.

• Research Question 4: To what extent will early L2 learners' ultimate attainment on tasks that require automatic use of L2 knowledge be moderated by their cognitive aptitudes?
Hypothesis 4a. Aptitude for explicit language learning will not moderate early L2 learners' attainment on tasks that require automatic use of L2 knowledge.
Hypothesis 4b. Aptitude for implicit language learning will moderate early L2 learners' attainment on tasks that require automatic use of L2 knowledge.
Hypothesis 4c. General intelligence will not moderate early L2 learners' attainment on tasks that require automatic use of L2 knowledge.

• Research Question 5: To what extent will late L2 learners' ultimate attainment on tasks that require automatic use of L2 knowledge be moderated by their cognitive aptitudes?
Hypothesis 5a. Aptitude for explicit language learning will not moderate late L2 learners' attainment on tasks that require automatic use of L2 knowledge.
Hypothesis 5b. Aptitude for implicit language learning will moderate late L2 learners' attainment on tasks that require automatic use of L2 knowledge.
Hypothesis 5c. General intelligence will not moderate late L2 learners' attainment on tasks that require automatic use of L2 knowledge.

• Research Question 6: To what extent will NSs' ultimate attainment on tasks that require automatic use of L2 knowledge be moderated by their cognitive aptitudes?
Hypothesis 6a. Aptitude for explicit language learning will not moderate NS controls' attainment on tasks that require automatic use of L2 knowledge.
Hypothesis 6b. Aptitude for implicit language learning will not moderate NS controls' attainment on tasks that require automatic use of L2 knowledge.
Hypothesis 6c. General intelligence will not moderate NS controls' attainment on tasks that require automatic use of L2 knowledge.
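The eighteen hypotheses listed above follow a compact pattern that can be written down as a lookup structure. The following is a minimal sketch for checking individual predictions; the dictionary, key labels, and function name are illustrative choices and are not part of the study's materials.

```python
# For each (task type, group) pair, the set of cognitive abilities that the
# hypotheses predict WILL moderate attainment. Labels are illustrative:
# "controlled"/"automatic" = task type; "early"/"late"/"ns" = participant group.
PREDICTED_MODERATORS = {
    ("controlled", "early"): {"implicit"},                 # H1a-c
    ("controlled", "late"): {"explicit", "intelligence"},  # H2a-c
    ("controlled", "ns"): set(),                           # H3a-c
    ("automatic", "early"): {"implicit"},                  # H4a-c
    ("automatic", "late"): {"implicit"},                   # H5a-c
    ("automatic", "ns"): set(),                            # H6a-c
}

def predicts_moderation(task, group, ability):
    """Return True if the hypotheses predict a moderating effect."""
    return ability in PREDICTED_MODERATORS[(task, group)]

# Hypothesis 2a: explicit aptitude moderates late learners on controlled tasks.
print(predicts_moderation("controlled", "late", "explicit"))
# Hypothesis 1a: explicit aptitude does not moderate early learners there.
print(predicts_moderation("controlled", "early", "explicit"))

# Number of cells, out of 18, in which an effect is predicted.
total_yes = sum(len(abilities) for abilities in PREDICTED_MODERATORS.values())
print(total_yes)
```

Written this way, the asymmetry in the predictions is easy to see: aptitude for implicit learning is expected to matter for early learners on both task types, whereas explicit aptitude and general intelligence are expected to matter only for late learners on controlled tasks.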
It was predicted that aptitudes that are more relevant for implicit language learning and processing would moderate L2 learners' attainment on tasks that require more automatic use of L2 knowledge. This prediction was made both for early L2 learners who are sequential bilinguals and for adult learners, since adults were still expected to be able to learn implicitly, but not for NSs, whose ultimate attainment is characterized by inter-individual homogeneity and, therefore, predicted to be independent of aptitude. Aptitude for implicit language learning should also moderate early L2 learners' attainment on tasks that allow controlled use of L2 knowledge, since early L2 learners were expected to use the same type of knowledge regardless of language task. The nature of this knowledge is hypothesized to be implicit, like NSs' knowledge. Early L2 learners' ultimate attainment, however, is characterized by greater inter-individual variability than NSs' and, therefore, was expected to be moderated by language aptitude.

On the other hand, aptitudes that are more relevant for explicit language learning were only expected to moderate adult L2 learners' attainment on tasks that allow controlled use of L2 knowledge. These tasks increase available test time and decrease processing demands; therefore, they provide an opportunity to utilize problem-solving and analytic skills. On these tasks, adult learners can rely on explicit L2 knowledge and compensate for their limited implicit competence. Adult learners with a higher aptitude for explicit language learning should do better as a result of their greater analytic, metalinguistic abilities.

Regarding general intelligence, a debated issue is whether it is related to or independent from language aptitude. Carroll (1981, 1993) argued that aptitude was a specialized ability beyond general intelligence, whereas Pimsleur (1966) and Oller and Perkins (1978) considered intelligence a central component of aptitude.
Evidence in support of an independent contribution of language aptitude is the fact that aptitude correlates more strongly with L2 outcomes than intelligence (Skehan, 1998). However, aptitude and intelligence ?indeed have a significant degree of overlap? (Skehan, 1998: 208). In this dissertation research, which made a distinction between cognitive aptitudes for implicit and explicit learning, the same pattern of results was predicted for general (i.e., fluid) intelligence as for aptitude for explicit language learning. Therefore, intelligence was considered more relevant for explicit learning, since it is closely related to analytic ability (DeKeyser & Koeth, forthcoming), and largely unrelated, or at most weakly related, to performance on implicit learning tasks (e.g., Gebauer & Mackintosh, 2007; Kaufman et al., 2010; Reber et al., 1991). It has also been argued that, even though general fluid reasoning ability measures tap both explicit (attention-driven) and implicit (procedural) cognitive processes, conventional IQ measures are weighted in favor of explicit processes that require central executive functioning (Woltz, 2003). This would explain that measures of general intelligence are highly correlated with working memory measures, but have low correlations with priming measures (Woltz, 1990, 1999) and with procedural skill performance beyond the initial stages (Ackerman, 1987, 1988). Further, research on artificial grammar learning has revealed that fluid intelligence correlates with learning when participants are instructed to intentionally look for patterns in the training materials (Gebauer & Mackintosh, 2007), but not under more incidental learning conditions (Misyak & Christiansen, 2012). Finally, Robinson (2002) reported a 42 significant, but negative, correlation between IQ and implicit learning of an artificial grammar, and no significant correlation between IQ and incidental learning involving meaning-based processing. Table 1. 
Predictions Concerning the Relationship between Cognitive Aptitudes, General Intelligence, and Ultimate L2 Attainment

                     Automatic L2 Use                  Controlled L2 Use
                     Early AO   Late AO   Control      Early AO   Late AO   Control
Intelligence         No         No        No           No         Yes       No
Explicit Aptitude    No         No        No           No         Yes       No
Implicit Aptitude    Yes        Yes       No           Yes        No        No

Chapter 4: Methodology

4.1 Participants

Participants were 100 Chinese-Spanish bilinguals in Madrid (Spain) and 20 NSs of Spanish (N = 120), all of whom were at least 18 years of age at time of testing. The Chinese-Spanish bilingual participants had either immigrated to the country or been born in the country to immigrant parents. Half of them (n = 50) were early L2 learners (42% males and 58% females) with ages of onset ranging from 3 to 6. The other half (n = 50) were late L2 learners (34% males and 66% females) with ages of onset of 16 and older.

Age of onset was operationalized as the beginning of a serious and sustained process of language acquisition as the result of migration or the commencement of a formal Spanish language program. Age of onset, therefore, could differ from age of physical arrival in the country. In this study, when age of onset and age of arrival did not overlap, formal instruction took place in adulthood, after age 16. Therefore, age of first exposure as a result of immersion in the L2-speaking country and age of first instruction still overlapped for the purposes of the current study, where adult L2 learners are defined as those with ages of onset of 16 and older.

Age of onset could also differ from age of physical arrival in the country in the case of early L2 learners, albeit for a different reason. The early L2 learners in the present study arrived in the country at an early age (i.e., from ages 3 to 6) or were born in Spain. In either case, these early L2 learners had been born to Chinese-speaking parents who had immigrated to the country as adults. They had not been born to parents who had themselves been born in Spain.
As a result, even those early L2 learners who had been born in Spain had not been immersed in the L2 until a later age, usually at age 3, in pre-school. Until that age, they were primarily exposed to Chinese and, therefore, can be considered sequential, not simultaneous, bilinguals.

Participants were recruited by advertising in Chinese-Spanish newspapers, by distributing fliers in cultural centers, embassies, and language schools, and by word of mouth in the community. To qualify for the study, participants had to: 1) have Chinese as mother tongue, 2) have lived in Spain for at least 5 years,⁴ and 3) have an educational level of no less than high school. Participants were informally screened into the study via a telephone interview.⁵ A group of 20 NSs of Spanish (50% males and 50% females), born in Madrid and with no less than a high school diploma, served as controls. All participants completed a detailed biographical questionnaire (see Appendix A). Table 2 summarizes the information regarding age at testing, age of onset, and length of residence for the participants.

⁴ According to DeKeyser et al. (2010), length of residence "turns out to be unrelated to most dependent measures, provided that it is more than 5 years, and that the dependent measures index basic grammatical proficiency (not purisms, collocations, etc.)" (p. 416).

⁵ The inclusion criterion was a score of at least four on a five-point scale that rated participants' degree of native-like pronunciation:
5 Native or near-native pronunciation. No foreign accent.
4 Generally good pronunciation but with occasional non-native sounds. Slight foreign accent. Pronunciation does not interfere with comprehensibility.
3 Frequent use of non-native sounds. Noticeable foreign accent. Pronunciation occasionally impedes comprehensibility.
2 Generally poor use of native-like sounds. Strong foreign accent. Pronunciation frequently impedes comprehensibility.
1 Very strong foreign accent. Definitely non-native.
Participants rated with a three on pronunciation were also included in the study if their grammar use was native-like.

Table 2. Participants' Information

Group                Age at Testing            Age of Onset              Length of Residence
                     M              Range      M              Range      M              Range
Control (n = 20)     27.35 (5.18)   20-36
Early AO (n = 50)    22.38 (4.45)   18-33      4.14 (1.23)    3-6        17.88 (4.49)   11-28
Late AO (n = 50)     29.46 (6.38)   21-50      20.84 (4.14)   16-30      8.42 (3.14)    5-20

Note. Standard deviations appear between parentheses.

Early and late L2 learners were significantly different in terms of age of onset (t(98) = -27.331, p < .001) and length of residence (t(98) = 12.207, p < .001). Early L2 learners' age of onset was 4 years on average, whereas late L2 learners' age of onset was 20 years on average. Regarding length of residence, the average was 17 years in the early AO group and 8 years in the late AO group. In the late L2 learner group, 12 participants had a length of residence lower than 10 years (between 5 and 10), whereas 38 participants had a length of residence higher than 10.⁶

⁶ Having a length of residence between 5 and 10 years or higher than 10 years in the late L2 learner group did not make any difference on any of the morphosyntactic measures in the study. All the comparisons yielded non-significant results with p values ranging between .147 and .937. This provides some support for DeKeyser et al.'s (2010) claim that length of residence, provided that it is more than 5 years, is unrelated to measures of grammatical proficiency.

Regarding chronological age at time of testing, late L2 learners were in their late 20s (29 years on average), whereas early L2 learners were in their early 20s (22 years on average). The average age at testing in the NS group was 27. According to Scheffé post hoc tests, early L2 learners were significantly younger than both NSs (p = .003) and late L2 learners (p < .001), but NSs and late L2 learners were not significantly different (p = .346).
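As an illustration of how the two group comparisons for age of onset and length of residence relate to the descriptive statistics in Table 2, the following minimal sketch recomputes an independent-samples t statistic from means, standard deviations, and group sizes. It assumes a pooled-variance (equal variances) test, which the text does not state explicitly; the function name `pooled_t` is hypothetical. Because the inputs are rounded to two decimals, the results only approximate the reported values.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic (pooled variance) from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

# Early AO vs. late AO, values taken from Table 2
t_ao, df_ao = pooled_t(4.14, 1.23, 50, 20.84, 4.14, 50)    # age of onset
t_lor, df_lor = pooled_t(17.88, 4.49, 50, 8.42, 3.14, 50)  # length of residence

print(f"Age of onset:        t({df_ao}) = {t_ao:.2f}")
print(f"Length of residence: t({df_lor}) = {t_lor:.2f}")
```

Under these assumptions, both statistics fall within a few hundredths of the reported values, the small difference being attributable to rounding in the table.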
Although the range of ages at testing in the late L2 learner group was 21-50, there were only two participants older than 40 (48 and 50 years old, respectively). The rest of the late L2 learners (n = 48) were younger than 40.

Early and late L2 learners also differed in terms of degree of identification with Spanish culture, Chinese literacy skills, Chinese proficiency level, and years of instruction in Spanish as a foreign language. Regarding their identification with Spanish culture, early L2 learners had an average of 3.72 (SD = 0.70) on a five-point Likert scale ranging from 1 (i.e., no identification – you do not feel Spanish) to 5 (i.e., total identification – you feel Spanish), whereas late L2 learners had an average of 3.14 (SD = 0.64) (t(98) = 4.323, p < .001). Early L2 learners had significantly lower Chinese literacy skills (M = 1.46, SD = 0.50) than late L2 learners (M = 1.98, SD = 0.14) on a two-point scale where 1 indicated oral skills and 2 indicated oral and written skills (t(97) = -7.031, p < .001). Early L2 learners' proficiency in Chinese on a five-point scale ranging between 1 (basic) and 5 (native-like) was also lower (M = 3.16, SD = 1.45) than late L2 learners' Chinese proficiency (M = 4.90, SD = 0.51) (t(97) = -7.998, p < .001).

Finally, regarding years of instruction, late L2 learners had studied Spanish formally for an average of two-and-a-half years (M = 2.45, SD = 1.82), whereas early L2 learners had not taken any Spanish language courses (M = 0.0, SD = 0.0). The number of years of instruction ranged between zero and seven in the late AO group, and instruction had usually taken place in the learners' country of origin (China) before arrival in Spain. A total of 19 late L2 learners had received instruction for one year or less (n = 12), always upon arrival in Spain, or no instruction at all (n = 7), whereas six late L2 learners had taken between five (n = 3) and seven years (n = 1) of Spanish.
The remaining 25 participants had received instruction for either two (n = 8), three (n = 5), or four (n = 12) years.

Early and late L2 learners did not differ regarding percentage of daily Chinese use (t(96) = -1.500, p = .137) or percentage of daily Spanish use (t(96) = 1.713, p = .090). Early L2 learners used 28.5% Chinese (SD = 15.43) and 69.80% Spanish (SD = 15.52) daily on average, and late L2 learners 34.76% Chinese (SD = 24.96) and 62.73% Spanish (SD = 24.52) daily on average.

4.2 Design of the Study

The study combined an ex-post-facto design with a repeated-measures experimental design. Groups were compared in four experimentally-manipulated test conditions: 1) a time-pressured visual GJT, 2) a time-pressured auditory GJT, 3) an unpressured auditory GJT, and 4) an unpressured visual GJT. The four tests were administered following a 4x4 balanced Latin square to control for order and carryover effects (see Table 3). In a balanced Latin square, each condition appears only once in a given ordinal position and no two conditions are juxtaposed in the same order more than once.

Table 3. Balanced Latin Square Design

Order 1    1  2  4  3
Order 2    2  3  1  4
Order 3    3  4  2  1
Order 4    4  1  3  2

Overall, the same number of participants (n = 30) was randomly assigned to each of the four test orders. Within every group, the same number of participants was also assigned to each test order, as long as the group's sample size allowed that. Thus, in the control group, the same number of participants (n = 5) could be assigned to each test order. However, in the early and late AO groups, two test orders had 12 participants each, and the other two had 13 participants each.

In order to discount test ordering effects, there should be no interaction between the four order groups (between-subjects factor) and scores on the four test formats (within-subjects factor).
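The two defining properties of the balanced Latin square in Table 3 can be checked mechanically. The sketch below uses a standard construction for an even number of conditions (first row 1, 2, n, 3, ..., each later row shifted by one); this is offered as an illustration and is not necessarily how the orders were generated for the study. The function names are hypothetical.

```python
def balanced_latin_square(n):
    """Return n presentation orders of conditions 1..n (n must be even)."""
    if n % 2 != 0:
        raise ValueError("this construction requires an even number of conditions")
    # Zero-based first row: 0, 1, n-1, 2, n-2, ...
    offsets = [0]
    low, high = 1, n - 1
    while len(offsets) < n:
        offsets.append(low)
        low += 1
        if len(offsets) < n:
            offsets.append(high)
            high -= 1
    # Each subsequent row shifts every condition by one (cyclically)
    return [[(o + row) % n + 1 for o in offsets] for row in range(n)]

def is_balanced(square):
    """Check the position and adjacency properties described in the text."""
    n = len(square)
    conditions = set(range(1, n + 1))
    # Each condition appears exactly once in each order and each ordinal position
    rows_ok = all(set(row) == conditions for row in square)
    cols_ok = all({row[i] for row in square} == conditions for i in range(n))
    # No ordered pair of adjacent conditions occurs more than once
    pairs = [(row[i], row[i + 1]) for row in square for i in range(n - 1)]
    return rows_ok and cols_ok and len(pairs) == len(set(pairs))

square = balanced_latin_square(4)
for i, order in enumerate(square, start=1):
    print(f"Order {i}: {order}")
print("Balanced:", is_balanced(square))
```

For four conditions, this construction yields the same four orders as Table 3.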
A repeated-measures ANOVA with the four GJTs as the repeated factor and Test Order as the group factor indicated a significant multivariate effect for GJT (F(3, 114) = 14.639, p < .001,⁷ ηp² = .076,⁸ Λ = .720), but no significant two-way interaction between the order in which tests were administered and GJT scores for the sample as a whole (F(9, 360) = 1.271, p = .253, ηp² = .033, Λ = .906). The interaction was non-significant in each of the three participant groups, as well: controls (F(9, 57) = 1.064, p = .413, ηp² = .181, Λ = .548), early L2 learners (F(9, 147) = 1.880, p = .063, ηp² = .114, Λ = .695), and late L2 learners (F(9, 147) = .674, p = .732, ηp² = .044, Λ = .875).

⁷ Alpha was set at 0.05 for all inferential tests in this study.

⁸ For partial eta squared (ηp²), a small effect size is .01 ≤ ηp² < .06, medium is .06 ≤ ηp² < .14, and large is ηp² ≥ .14.

This experimental design allows testing the variables of interest while keeping all other factors constant. However, it suffers from two limitations. First, it involves variants of just one method (GJTs) and, second, it involves a method that requires focusing participants' attention on language correctness. Therefore, two additional tasks at the extremes of the controlled/automatic use of language knowledge continuum were included in the design: a metalinguistic knowledge test and a word monitoring task. These two tasks are hypothesized to tap directly into controlled and automatic use of L2 knowledge, respectively. In a metalinguistic test, participants' attention is directly focused on linguistic structure, correctness and grammatical rules (i.e., explicit declarative facts about language). It requires language analysis rather than intuition about correctness. In a word monitoring task, participants' attention is not focused on the linguistic relationship of interest to the researcher.
Participants monitor for a target word in a sentence and focus their attention on meaning comprehension, while the researcher measures sensitivity to grammatical violations.

4.3 Instruments

A battery of 12 tests was administered as part of the study. Six of the tests were language measures hypothesized to lie along a continuum of controlled to automatic use of L2 knowledge: four GJTs (timed visual, timed auditory, untimed visual, and untimed auditory), a metalinguistic knowledge test (at the controlled end of the L2 knowledge use continuum), and a word monitoring task (at the automatic end of the L2 knowledge use continuum). There were also six cognitive measures hypothesized to be aptitudes relevant for either implicit or explicit learning: four verbal language-independent aptitude subtests (the LLAMA aptitude test battery), a non-verbal measure of general intelligence (the GAMA general ability measure for adults), and a non-verbal measure of sequence learning (a probabilistic serial reaction time task).

4.3.1 Language Tests that Require Automatic Use of L2 Knowledge

Timed Auditory GJT (k = 60). The timed auditory GJT was a computer-delivered test with sentences presented aurally. Participants indicated whether each sentence was grammatical or ungrammatical by pressing a response button within a fixed time limit. They were asked to press a key as soon as an error was detected in the sentence. Once participants pressed a key, the computer automatically moved on to the next sentence without a pause. Following R. Ellis (2005), the time limit for each item was established on the basis of NSs' average response time in a pilot study (n = 10). Also following R. Ellis, an additional 20% of the time taken for each sentence was added to allow for the slower processing speed of L2 learners. The time allowed for judging each sentence in the timed auditory GJT ranged from 3408.72 milliseconds (3.4 seconds) to 10045.92 milliseconds (10 seconds) (M = 5807.98, SD = 1000.76).
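The time-limit rule described above (NSs' average response time for an item plus an additional 20%) amounts to a one-line computation per item. The sketch below illustrates it with invented pilot data; the response times and the function name `item_time_limit` are hypothetical and do not come from the study.

```python
from statistics import mean

def item_time_limit(ns_response_times_ms, allowance=0.20):
    """Time limit for one item: mean NS response time plus a proportional allowance."""
    return mean(ns_response_times_ms) * (1 + allowance)

# Hypothetical pilot response times (ms) from 10 NSs for a single sentence
pilot_rts = [4100, 3900, 4300, 4000, 4200, 3800, 4400, 4100, 3950, 4250]

limit = item_time_limit(pilot_rts)
print(f"Mean NS response time: {mean(pilot_rts)} ms")
print(f"Time limit with 20% allowance: {limit:.2f} ms")
```

Because the limit is computed item by item, longer or more demanding sentences automatically receive a longer response window.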
In terms of target structure, NSs' longest response times were on aspectual contrasts (M = 5365.09, SD = 1156.64), followed by gender agreement (M = 5102.60, SD = 471.69), the passive (M = 4988.20, SD = 432.40), person agreement (M = 4892.22, SD = 608.58), number agreement (M = 4691.73, SD = 844.26), and the subjunctive (M = 4000.05, SD = 714.31).

Each item was scored dichotomously as correct/incorrect, and percentage accuracy scores were calculated for grammatical and ungrammatical items overall, as well as for grammatical and ungrammatical items separately. Percentage scores out of total number of attempts were used due to the relatively high proportion of missing data as a result of the speeded nature of the test (10.61% of total items).

The internal consistency of the test, according to Cronbach's alpha, which measures the rank-order stability of individuals' scores on different items of the test, was .92.

Timed Visual GJT (k = 60). The timed visual GJT was a computer-delivered test with sentences presented visually. Participants indicated whether each sentence was grammatical or ungrammatical by pressing a response button within a fixed time limit. Once participants pressed a key, the computer automatically moved on to the next sentence without a pause. The time limit for each item was also established by adding 20% to NSs' average response time. The time allowed for judging each sentence in the timed visual GJT ranged from 3590.23 milliseconds (3.5 seconds) to 8587.20 milliseconds (8.5 seconds) (M = 5804.37, SD = 993.40). In terms of target structure, NSs' longest response times were again on aspectual contrasts (M = 5289.30, SD = 931.76), followed by gender agreement (M = 4942.46, SD = 844.04), the passive (M = 4930.43, SD = 720.62), number agreement (M = 4742.05, SD = 1122.58), the subjunctive (M = 4595.84, SD = 625.48), and person agreement (M = 4521.77, SD = 553.95).
Each item was scored dichotomously as correct/incorrect, and percentage accuracy scores were calculated for grammatical and ungrammatical items overall, as well as for grammatical and ungrammatical items separately. Percentage scores out of total number of attempts were used due to the relatively high proportion of missing data as a result of the speeded nature of the test (15.67% of total items).

The internal consistency of the test, according to Cronbach's alpha, was .89.

Word Monitoring Task (k = 120). The word monitoring task was a computer-delivered test with sentences presented aurally, and words to monitor presented visually (i.e., a cross-modal design). Word monitoring is considered an implicit task in the sense that participants' attention is not directed towards the linguistic variable of interest. Participants monitor ongoing auditory language input for a prespecified target word that is presented visually on their computer screen, and press a button when they hear the target word. Target words occur immediately after the relevant target structure in each sentence. The onset of each target word triggers a timing device that is stopped when the participant presses one of the response keys. The reaction time is the duration between the onset of the target word and the time when a response is provided. The test also includes comprehension questions, in order to focus participants' attention on meaning. This dual-task paradigm, which involves simultaneously engaging participants in a second unrelated task while performing the experimental task (i.e., word monitoring), minimizes the application of explicit language knowledge and strategy use (Kilborn & Moss, 1996). Participants' word monitoring latencies for grammatical and ungrammatical sentences are compared, and delays in monitoring target words in ungrammatical sentences are interpreted as suggesting automatic and involuntary activation of integrated L2 knowledge (Marslen-Wilson & Tyler, 1980).
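The latency comparison just described can be expressed as a per-participant difference score: the mean latency on ungrammatical items minus the mean latency on grammatical items. The sketch below is illustrative only. It restricts the computation to correctly detected target words and trims response times more than 3 SDs from the participant's own mean, two steps the chapter reports for this task; computing the trimming criterion over all of a participant's hits (rather than per condition) is an assumption, and the trial data and the function name `latency_delay` are hypothetical.

```python
from statistics import mean, stdev

def latency_delay(trials, n_sd=3.0):
    """Mean ungrammatical latency minus mean grammatical latency for one participant.

    trials: list of (is_grammatical, is_hit, rt_ms) tuples.
    """
    # Keep only trials on which the target word was successfully monitored
    hits = [(gram, rt) for gram, hit, rt in trials if hit]
    rts = [rt for _, rt in hits]
    # Assumption: outliers are defined relative to the mean of all retained hits
    m, s = mean(rts), stdev(rts)
    kept = [(gram, rt) for gram, rt in hits if abs(rt - m) <= n_sd * s]
    grammatical = [rt for gram, rt in kept if gram]
    ungrammatical = [rt for gram, rt in kept if not gram]
    return mean(ungrammatical) - mean(grammatical)

# Hypothetical trials for one participant
trials = [
    (True, True, 400), (True, True, 420), (True, True, 410),
    (False, True, 470), (False, True, 480), (False, True, 460),
    (False, False, 900),  # missed target word: excluded from the analysis
]

delay = latency_delay(trials)
print(f"Delay on ungrammatical items: {delay} ms")
```

A positive value indicates slower monitoring after a grammatical violation, which is the pattern interpreted as sensitivity in this paradigm.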
Results in this experimental paradigm are typically analyzed within a repeated-measures design at a group level. In this study, however, a Grammatical Sensitivity Index (GSI) was created by subtracting the response latencies of grammatical items from the latencies of ungrammatical items. This index was a measure of degree of sensitivity for each individual participant. By providing a continuous numerical value, this index permitted investigation of any relationships between degrees of sensitivity to grammatical violations and cognitive abilities in correlational and factorial statistical analyses. It also permitted computing correlations with other language measures and comparisons in between-subjects analyses.

Two presentation lists (A and B), counterbalanced for grammaticality, were used, with half of the participants in each group randomly assigned to each list. No sentence appeared twice in the same list. A grammatical sentence in one list appeared as ungrammatical in the other, and vice versa. Each presentation list included 60 target items (10 per target structure), half grammatical and half ungrammatical, and 60 grammatical distracters. The word to monitor appeared in target sentences (i.e., critical items), so that latencies could provide a measure of sensitivity, but it did not appear in distracter sentences. This way, the probability of a word appearing in a sentence was .5. The position of the target word in the distracters varied randomly to prevent participants from anticipating when to respond. In critical items, the target word was located immediately after the target structure.

In order to assess word monitoring latencies as accurately as possible, split recordings of the target sentences were used. The target structure appeared at the very end of the first half of the sentence, and the timer started at the onset of the second half with the target word (i.e., the word to monitor).
This way, the onset of the target word and the timer could be synchronized. This allowed use of the same second half for both the grammatical and ungrammatical version of each item, and control for possible confounding factors, such as the speed with which different versions of an item were read and recorded.

Half of the test items (k = 60) were followed by a comprehension question. All comprehension questions were yes/no questions that participants answered by pressing "A" for yes and "L" for no. Half of the questions required a positive response and half a negative response.

Participants were instructed to monitor ongoing auditory language input for a pre-designated target word that would be presented visually on their computer screen. They were asked to maintain their hands on the keyboard with their index fingers on the yes key ("A") and the no key ("L"). These two keys were the left-most and right-most letter keys on the keyboard's home row and allowed participants to rest their wrists on the keyboard table. Participants were instructed to press yes as soon as they heard the word that was displayed on the screen, or to press no if the sentence finished playing and they had not heard the word displayed on the screen. They were also instructed to pay attention to the meaning of the sentences, since they would be randomly asked comprehension questions. To respond to comprehension questions, the same yes/no keys were used. Pressing the yes key in the word monitoring portion of the task did not stop the sentence from playing, so that participants would have all the necessary information to answer the comprehension questions.

A total comprehension score was computed on the basis of correct responses to comprehension questions. A cutoff was adopted in order to exclude any participants who were not listening for comprehension.
Previous studies carried out in the same framework (Jiang, 2004; 2007) included only participants who had an error rate lower than 37% (i.e., 63% accuracy level) (Jiang, 2004) or lower than 20% (i.e., 80% accuracy level) (Jiang, 2007). In the present study, in order to ensure that participants had been focusing their attention on meaning while performing the task, a minimum of 75% response accuracy (i.e., an error rate lower than 25%) was required for each participant to be included in the analysis. This cutoff is similar to Jiang's (2007) and well above chance-level performance (i.e., 50%).

Before comparing monitoring latencies, data were checked for outliers, defined as +/- 3 SDs from each individual's mean. Only response times for correctly accepted target words (i.e., hits) were included in the analysis, since failure to monitor the word successfully implied that the task had not been performed correctly.

The reliability of the task, using the split-halves method, was .98.

4.3.2 Language Tests that Allow Controlled Use of L2 Knowledge

Untimed Auditory GJT (k = 60). The untimed auditory GJT was a computer-delivered test with sentences presented aurally. Participants were required to indicate whether each sentence was grammatical or ungrammatical by pressing a response button. Unlike its time-pressured counterpart, this test presented each sentence twice before participants were allowed to provide a response. Following DeKeyser (2000) and DeKeyser et al. (2010), each sentence was played twice, with a three-second interval between the repetitions and a six-second interval between sentence pairs. Each item was scored dichotomously as correct/incorrect, and percentage accuracy scores were calculated for grammatical and ungrammatical items overall, as well as for grammatical and ungrammatical items separately.

The internal consistency of the test, according to Cronbach's alpha, was .89.
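For readers less familiar with the internal-consistency coefficient reported for each test, the following minimal sketch computes Cronbach's alpha from a participants-by-items matrix of dichotomous scores, using the standard formula alpha = k/(k - 1) x (1 - sum of item variances / variance of total scores). The score matrix is invented for illustration and the function name `cronbach_alpha` is hypothetical; it is not the software used in the study.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of participants, each a list of k item scores."""
    k = len(scores[0])
    # Variance of each item across participants
    item_variances = [pvariance([person[i] for person in scores]) for i in range(k)]
    # Variance of participants' total scores
    total_variance = pvariance([sum(person) for person in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical data: 5 participants x 4 items, scored 1 (correct) or 0 (incorrect)
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(scores)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Population variances are used for both the items and the totals; using sample variances throughout would give the same coefficient, since only the ratio enters the formula.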
Untimed Visual GJT (k = 60). The untimed visual GJT was a computer-delivered, self-paced test with sentences presented visually. Participants were required to indicate whether each sentence was grammatical or ungrammatical by pressing a response button. Each item was scored dichotomously as correct/incorrect, and percentage accuracy scores were calculated for grammatical and ungrammatical items overall, as well as for grammatical and ungrammatical items separately.

The internal consistency of the test, according to Cronbach's alpha, was .85.

Metalinguistic Knowledge Test (k = 60). The metalinguistic knowledge test was a computer-delivered, self-paced test with sentences presented visually. This test followed the same format as the untimed visual GJT, but it included an error correction component and a metalinguistic knowledge component in order to encourage use of metalinguistic abilities. Participants were required to indicate whether each sentence was grammatical or ungrammatical and, if ungrammatical, to correct the error and state the grammar rule. Unlike the word monitoring task, the metalinguistic knowledge test was a correction task that focused participants' attention on language forms and analyzed representations (Bialystok, 1986). Grammatical items were dichotomously scored as correct/incorrect, and ungrammatical items were scored following a system of partial credit (0-3), yielding a maximum of 120 points on the test. One point was given for identifying the sentence as ungrammatical, one point for correcting the error, and one point for providing a statement of the grammar rule.

The internal consistency of the test, according to Cronbach's alpha, was .89.

4.3.3 Explicit Language Aptitude Tests

In general, the use of L1- or L2-based cognitive tests can result in confounds between participants' proficiency level and their cognitive capacity.
In the case of studies that include both child and adult L2 learners, language-based cognitive measures can be particularly problematic, since degree of L2 acquisition tends to correlate with degree of L1 attrition. A commonly used test of analytic ability is the Words-in-Sentences MLAT subtest. However, this is a test that participants need to take either in their L1 or L2. DeKeyser (2000) administered the test in the participants' L1 (Hungarian). According to the descriptive data, the highest score on the test belonged to the latest arrival (age of arrival = 38). The next highest aptitude scorers were also late arrivals. Conversely, early arrivals, probably with poorer L1 literacy skills, were not able to score as high as late arrivals. The use of L1-based cognitive tests, therefore, can artificially reduce the range of aptitude scores and affect the magnitude of the correlation for early arrivals.

The LLAMA aptitude test (Meara, 2005), on the other hand, is to a large extent independent of test takers' L1 and L2, since it relies on picture stimuli and verbal stimuli based on languages that differ from any languages that test takers are likely to know in practice (i.e., a dialect of a language in Northern Canada and a Central American language). The test includes no instructions for test takers, only for test administrators, who provide them orally to test takers. Granena (2011b, to appear) showed that three of the LLAMA subtests measured the same underlying aptitude and that this aptitude could be interpreted as analytic ability. In the present study, explicit language learning aptitude, operationalized as analytic ability, was measured as a composite of three LLAMA subtests: Vocabulary Learning (LLAMA B), Sound-symbol Correspondence (LLAMA E), and Grammatical Inferencing (LLAMA F).
The reliability of the LLAMA test (k = 90) in terms of internal consistency according to Cronbach's alpha was .77 (an acceptable research standard is considered to be .70, according to Nunnally & Bernstein, 1994). A total of 74 participants aged 19-47 were sampled. Stability over time (i.e., test-retest reliability) according to a Pearson product-moment correlation was .64 (p = .002), based on a subsample of 20 participants from the present study who were tested twice with a two-year period between test and retest (years 2009 and 2011) in order to minimize carryover effects. The internal consistency of the composite score of the three LLAMA subtests (k = 60) that loaded on the same factor (LLAMA B, E, and F) was .79 and their average test-retest reliability was .63 (p = .003).

Vocabulary Learning -LLAMA B- (Meara, 2005). LLAMA B is a test that measures the ability to learn new words. The words to be learned are presented visually and are real words taken from a Central American language. Each of them is assigned to a target image. Participants have to learn as many words as possible by relating each of them to a target image. There is a timed study phase in which participants click on the different images displayed on the screen. The name of each object is shown in the centre of the panel. Then, the program displays the name of an object and participants have to identify the correct image on the screen. The internal consistency of LLAMA B (k = 20) was .76, according to Cronbach's alpha, and its test-retest reliability was .53 (p = .016).

Sound-symbol Correspondence -LLAMA E- (Meara, 2005). LLAMA E is a test that measures the ability to form sound-symbol associations. Participants have to work out the relationship between the sounds they hear (i.e., recorded syllables) and a transliteration of these sounds in an unfamiliar alphabet.
There is a timed study phase in which participants click on the different transliterations displayed and try to learn the corresponding sound association. Then, they hear a syllable and have to decide its symbol correspondence by clicking on the right transliteration. The internal consistency of LLAMA E (k = 20) was .64, according to Cronbach's alpha, and its test-retest reliability was .60 (p = .005).

Grammatical Inferencing -LLAMA F- (Meara, 2005). LLAMA F is a test that measures the ability to induce the rules of an unknown language. Participants have to relate a sentence presented visually on the screen with its picture. There is a timed study phase in which participants click on a series of small buttons displayed on the screen. For each button, a picture and a sentence describing the scene are displayed. In the testing phase, the program shows a picture and two sentences, a grammatical and an ungrammatical one. Participants choose the correct sentence. The internal consistency of LLAMA F (k = 20) was .60, according to Cronbach's alpha, and its test-retest reliability was .56 (p = .010).

4.3.4 Implicit Language Aptitude Tests

It has been argued that much of language acquisition is sequence learning and that individual differences in the ability to remember verbal strings determine the acquisition of grammar (N. Ellis, 1996). In the present study, implicit language learning aptitude, operationalized as phonological and visual sequence learning ability, was measured by means of two tests: a sound recognition test (LLAMA D) and a probabilistic serial reaction time task.

Sound Recognition -LLAMA D- (Meara, 2005). LLAMA D is a test that measures participants' ability to recognize patterns in spoken language. According to Meara (2005), this ability should help learners recognize the small variations in endings that languages use to signal grammatical features. The test is based on Speciale, N.
Ellis, and Bywater (2004) and on research on implicit induction of phonological sequences (e.g., Saffran et al., 1996). Participants listen to a string of words based on the names of objects in a British Columbian Indian language. They then complete a recognition test and indicate whether they have heard each stimulus previously. Participants who rapidly acquire the phonological sequences of the target items are able to discriminate better between old and new items. The internal consistency of this subtest (k = 30), according to Cronbach's alpha, was .63. This coefficient falls .07 below the acceptable standard of .70, a modest shortfall. The test can be considered to have marginal reliability and relatively uniform test items. Test-retest reliability was .61 (p = .004).

Serial Reaction Time (SRT) Task. The SRT task is a test that measures participants' implicit sequence learning ability. Unlike other paradigms for studying implicit learning (e.g., Artificial Grammar learning tasks), learning in the SRT is measured online (i.e., during the training phase), which, according to Destrebecqz and Cleeremans (2001), makes it a better measure of implicit learning.

Originally developed by Nissen and Bullemer (1987), the SRT task used in the present study was a probabilistic version created using the same stimuli as Kaufman et al. (2010). The probabilistic SRT task measures participants' sensitivity to high- and low-frequency events. Participants see a visual cue (an asterisk) appear at one of four prescribed locations on a computer screen. The four locations are separated by 1.2 inches and indicated by means of a placeholder. Participants are required to press a key corresponding to the location of the asterisk as fast and accurately as possible by placing their middle and index fingers of each hand on the keys marked "z", "x", ".", and "/", respectively (see Figure 1). Keys "z" and "x" and "." and "/"
were adjacent and allowed participants to place their wrists comfortably on the laptop table. No instructions to memorize the series or look for underlying rules are provided.

Figure 1. Representation of visual cues and required key-presses in SRT task

The asterisks play out a repeating sequence of positions. This sequence, unlike in the deterministic version of the task, follows a probabilistic order. In every task block, sequence trials are interspersed with control trials. Control trials are incongruent with sequence trials and make it more difficult for participants to explicitly discover the target sequence. As a result, the task has greater ecological validity, since implicit learning in the real world takes place under conditions of uncertainty (i.e., noise) that make learning probabilistic, rather than deterministic (Jiménez & Vázquez, 2005). Following Schvaneveldt and Gómez (1998), stimuli were congruent with the target sequence 85% of the time and intermixed with an alternate sequence 15% of the time. The two sequences used to generate either training (A) or control (B) trials had 12 elements each and were balanced for simple location and transition frequency (Reed & Johnson, 1994). They differed only in the second-order conditional information they conveyed. Reed and Johnson called sequences of three locations second-order conditionals (SOCs), as opposed to first-order probabilities, where the location of an item is unambiguously predicted by the preceding item with a probability of 1.0. In second-order conditionals, at least two previous locations are needed to predict the next location in the sequence. The target sequence chosen (Sequence A) was 1-2-1-4-3-2-4-1-3-4-2-3, while the alternate sequence (Sequence B) was 3-2-3-4-1-2-4-3-1-4-2-1. The starting point was randomly chosen for each block. Figure 2 shows how the two sequences are related to each other.
Transitions in one sequence respect the second-order conditionals of the other sequence, but lead to different predictions. If a participant is trained in Sequence A, the most likely successor after locations 4-3 would be 2, but on some trials it could be 1, which is the successor of the series 4-3 according to Sequence B. Trials following the alternate control sequence could appear isolated or in small groups (e.g., 3-2-4-1-[2-4-3]-2-[3]-1-2-1-[3-2]-4-1, with control trials shown in brackets).

Figure 2. Representation of the two sequences used to generate training trials (A) and control (B) trials

The SRT task started with a practice block that included 14 trials where the likelihood of probable and improbable transitions was the same (.5 probability). After the practice block, participants completed eight training blocks of 120 trials each (960 in total). The task did not include response-stimulus intervals, since there is evidence that explicit learning can take place when people are given 250 or 500 msec to think (Destrebecqz & Cleeremans, 2001). Out of 960 trials, 149 (15.52%) were control trials and 811 were training trials (84.48%). In order to increase the probabilistic nature of the task, the probability of transitions generated from Sequence A and Sequence B also differed from block to block (see Table 4).

Table 4. Probabilities of Probable and Non-probable Trials in SRT Task

           Sequence A (Probable Trials)    Sequence B (Improbable Trials)
           n        %                      n        %
Block 1    102      85                     18       15
Block 2    105      87.5                   15       12.5
Block 3    98       81.67                  22       18.33
Block 4    108      90                     12       10
Block 5    101      84.17                  19       15.83
Block 6    94       78.33                  26       21.67
Block 7    101      84.17                  19       15.83
Block 8    102      85                     18       15

All trials were initially randomized within each block and then presented in the same fixed order for each participant. According to Kaufman et al. (2010), this procedure maximizes "the extent to which individual differences reflect trait differences rather than differences in item order" (p. 326).
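The trial-generation scheme described above can be sketched in a few lines. This is a minimal illustration, not the software used in the study: the helper names (soc_map, generate_block) and the random seed are hypothetical, and the sketch assumes that each block is seeded with two consecutive locations of Sequence A that are not themselves counted as trials, a detail the text does not specify. The two sequences and the per-block counts of improbable trials are taken from the text and Table 4.

```python
import random

SEQ_A = [1, 2, 1, 4, 3, 2, 4, 1, 3, 4, 2, 3]  # target sequence (from the text)
SEQ_B = [3, 2, 3, 4, 1, 2, 4, 3, 1, 4, 2, 1]  # alternate sequence (from the text)


def soc_map(seq):
    """Map each pair of consecutive locations to its successor (sequence treated as a cycle)."""
    n = len(seq)
    return {(seq[i], seq[(i + 1) % n]): seq[(i + 2) % n] for i in range(n)}


SOC_A = soc_map(SEQ_A)
SOC_B = soc_map(SEQ_B)


def generate_block(n_trials, n_improbable, rng):
    """Hypothetical helper: return (locations, improbable_flags) for one block.

    Assumption: the two starting locations only provide context and are not trials.
    """
    flags = [True] * n_improbable + [False] * (n_trials - n_improbable)
    rng.shuffle(flags)  # scatter improbable trials through the block
    start = rng.randrange(len(SEQ_A))  # random starting point within Sequence A
    context = (SEQ_A[start], SEQ_A[(start + 1) % len(SEQ_A)])
    locations = []
    for improbable in flags:
        # The successor of the last two locations follows B on improbable trials, A otherwise.
        nxt = (SOC_B if improbable else SOC_A)[context]
        locations.append(nxt)
        context = (context[1], nxt)
    return locations, flags


IMPROBABLE_PER_BLOCK = [18, 15, 22, 12, 19, 26, 19, 18]  # Sequence B counts from Table 4
rng = random.Random(2012)  # arbitrary fixed seed, so the order is identical for every participant
blocks = [generate_block(120, k, rng) for k in IMPROBABLE_PER_BLOCK]
print(sum(sum(flags) for _, flags in blocks))  # 149 improbable trials out of 960
```

Fixing the seed mirrors the fixed presentation order described above. Because both sequences contain every ordered pair of distinct locations exactly once, any two consecutive locations always have a defined successor under either sequence.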
Participants were allowed to take a short rest between blocks. Accuracy and reaction time in milliseconds were recorded on each trial. Degree of learning was quantified as the average difference in reaction time between correct responses to congruent and incongruent trials (incongruent RT - congruent RT). The larger the difference, the more learning occurred.

At the end of the SRT task, participants were administered a recognition test adapted from Shanks and Johnstone (1999). They were told that they would be presented with short sequences of three elements. They were asked to respond to the asterisks as quickly as possible, and then to provide a rating of how confident they were that the sequence was part of the test they had just taken. The recognition test included 24 three-element sequences (triads) presented in a randomized order for each participant. There were 12 old sequences, constructed following second order conditionals in Sequence A (3-4-2, 3-1-2, 1-4-3, 2-4-1, 4-2-3, 1-2-1, 4-3-2, 4-1-3, 2-3-1, 2-1-4, 3-2-4, 1-3-4), and 12 novel sequences, constructed following second order conditionals in Sequence B (3-4-1, 3-1-4, 1-4-2, 2-4-3, 4-2-1, 1-2-4, 4-3-1, 4-1-2, 2-3-4, 2-1-3, 3-2-3, 1-3-2). It should be noted that, in fact, all sequences had been seen before, but with different probabilities (.85 vs. .15), so the terms "old" and "new" are relative and actually mean "familiar" and "less familiar". Each location and each first-order transition appeared with the same likelihood. The only difference between old and new sequences was second-order conditional information (e.g., transition 3-4 was followed by location 2 in Sequence A and by location 1 in Sequence B). There were also four practice trials containing novel random sequences (1-1-1, 4-4-4, 1-2-3, 3-2-1).
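The relationship between the two sequences and the 24 recognition triads can be checked mechanically. The sketch below is only an illustration (the helper names triads and parse are hypothetical): it derives every three-element window from each sequence, treated as a cycle, and compares the result with the triads listed above. It also checks the balancing properties stated in the text.

```python
from collections import Counter

SEQ_A = [1, 2, 1, 4, 3, 2, 4, 1, 3, 4, 2, 3]  # Sequence A (from the text)
SEQ_B = [3, 2, 3, 4, 1, 2, 4, 3, 1, 4, 2, 1]  # Sequence B (from the text)


def triads(seq):
    """All three-element windows of a sequence treated as a cycle."""
    n = len(seq)
    return {tuple(seq[(i + k) % n] for k in range(3)) for i in range(n)}


def parse(text):
    """Turn '3-4-2, 3-1-2' into {(3, 4, 2), (3, 1, 2)}."""
    return {tuple(int(x) for x in item.split("-")) for item in text.split(", ")}


# Triads exactly as listed in the description of the recognition test.
OLD_LISTED = parse("3-4-2, 3-1-2, 1-4-3, 2-4-1, 4-2-3, 1-2-1, "
                   "4-3-2, 4-1-3, 2-3-1, 2-1-4, 3-2-4, 1-3-4")
NEW_LISTED = parse("3-4-1, 3-1-4, 1-4-2, 2-4-3, 4-2-1, 1-2-4, "
                   "4-3-1, 4-1-2, 2-3-4, 2-1-3, 3-2-3, 1-3-2")

old, new = triads(SEQ_A), triads(SEQ_B)

print(old == OLD_LISTED, new == NEW_LISTED)          # True True
print(Counter(SEQ_A) == Counter(SEQ_B))              # True: same location frequencies
print({t[:2] for t in old} == {t[:2] for t in new})  # True: same first-order transitions
print(old & new)                                     # set(): no triad belongs to both
```

Old and new triads therefore share their first two locations and differ only in the third, which is what restricts the old/new contrast to second-order conditional information.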
Immediately after participants selected the response button for the third element, they were asked to give a confidence rating on a six-point scale, where 1 = I'm sure that this sequence was part of the test; 2 = I'm pretty sure that this sequence was part of the test; 3 = I think that this sequence was part of the test; 4 = I think that this sequence was not part of the test; 5 = I'm pretty sure that this sequence was not part of the test, and 6 = I'm sure that this sequence was not part of the test.

The consensus in the sequence learning literature appears to be that if participants are able to discriminate old from new sequences, they have acquired explicit sequence knowledge (Perruchet & Amorim, 1992; Shanks & Perruchet, 2002; Shanks, Wilkinson, & Channon, 2003; Willingham, Salidis, & Gabrieli, 2002). In addition to a measure of (explicit) recognition, the test used in the present study also yielded a concurrent measure of (implicit) priming, based on the speed of responding to old versus new sequences. Recognition scores were computed as the mean judgment for old sequences minus the mean judgment for new sequences. Priming scores were computed as the mean reaction time elicited by the third element of old sequences minus the mean reaction time elicited by the third element of new sequences. Evidence of a dissociation between explicit recognition and implicit priming (i.e., poor recognition, but faster reaction times, for segments of the old sequences) was considered as supporting evidence of implicit learning during the training task.

The reliability of the probabilistic SRT task in the present study, using split-halves with the Spearman-Brown correction, was .44. This is a low reliability index when compared to reliability indices of measures of explicit learning, which are usually greater than .70, but it is similar to the indices reported in other studies of implicit learning. Kaufman et al.
(2010), from whom the SRT task was adapted, also reported a reliability of .44 and considered it standard for probabilistic SRT tasks, on the basis of the reliability of implicit learning previously reported in the literature (Reber et al., 1991; Dienes, 1992). Reber et al. (1991) and Robinson's (1996) replication study reported split-half reliabilities of .51 and .52, respectively, also using the Spearman-Brown correction. According to Reber et al., "a Cronbach above .4 or .5 is taken as reasonable support for the internal reliability of a test" (p. 893). The less reliable a measure is, the lower its maximum observable correlation with another variable, regardless of the true correlation, because the noise in an unreliable measure attenuates correlation coefficients. However, Kaufman et al. (2010) reported significant correlations between implicit learning on their probabilistic SRT task and processing speed. These correlations were in the middle third of effect sizes reported in psychology (r = .2 to .3; Hemphill, 2003). Other studies have also shown correlations between implicit learning and complex cognition (i.e., school grades in Math and English) (Gebauer & Mackintosh, 2012; Pretz, Totz, & Kaufman, 2010).

4.3.5 General Intelligence Test

A Spanish version of the General Ability Measure for Adults (GAMA)[9] test was used as a measure of general intellectual ability. GAMA is a commercially available non-verbal test of intelligence published by Pearson that uses abstract designs, shapes, and colors (i.e., non-verbal stimuli) to minimize the effects of confounding variables such as language knowledge, verbal expression, and verbal comprehension on test scores.
It is a self-administered (booklets were used), 25-minute, timed test with four subtests (66 items with response sets of five options) that require the application of reasoning and logic to solve problems: Matching, Analogies, Sequences, and Construction. The type of non-verbal reasoning measured corresponds roughly to fluid intelligence.

[9] GAMA is considered a culture-fair test because its non-verbal nature allows evaluation of intellectual ability without substantial influence from linguistic, educational, and cultural factors. Raven's Progressive Matrices (Raven, 1938) is also a non-verbal test of fluid intelligence, but it has been criticized for its use of matrix structures, considered a cultural construct (Greenfield, 1998).

The Matching subtest requires examinees to determine which one of the six options is identical to the stimulus in color, shape, and configuration (see Figure 3).

Figure 3. Sample matching item: Which answer is the same as the first picture?

The Analogies subtest requires examinees to recognize the relationship between two abstract figures and then identify the option that has a different pair of figures with the same conceptual relationship (see Figure 4).

Figure 4. Sample analogies item: Which answer goes on the question mark?

The Sequences subtest requires examinees to recognize the pattern of change in a geometric design and choose the option that fits the pattern (see Figure 5).

Figure 5. Sample sequences item: Which answer goes on the question mark to complete the pattern?

The Construction subtest requires examinees to determine how several shapes can be combined to produce one of the designs provided as options. The items require the examinee to analyze and synthesize the spatial characteristics of the shapes to mentally construct designs (see Figure 6).

Figure 6. Sample construction item: Which answer can be made with the shapes in the top box?
The GAMA was normed on a sample of 2,360 individuals aged 18 to 96 (Naglieri & Bardos, 1997). Internal consistency using split-halves with Spearman-Brown correction ranged from 0.79 to 0.94 across normative age groups, with an average of 0.90 (reliability based on a linear composite). Average reliability coefficients for each of the four subtests across normative groups were .66, .81, .79, and .65 for the Matching, Analogies, Sequences, and Construction subtests, respectively. The test-retest reliability was 0.67 over a two- to six-week interval for a sample of 86 people. In terms of validity (concurrent validity), the correlations between GAMA and other general ability tests, the WAIS-R (Wechsler, 1981) and the K-BIT (Kaufman Brief Intelligence Test, Kaufman & Kaufman, 1990) were .75 (p < .001) and .70 (p < .001), respectively, for a sample of 194 participants. Skinner et al. (1996) looked at the relationship between the GAMA and reading achievement among college students. The results suggested that the GAMA non-verbal scores are significantly related to reading achievement (r = .39, p < .01).

4.4 Procedure

Participants were tested individually by the researcher and all the tasks were administered on a laptop computer. Data were collected in a seminar room of the Education Department at the Universidad Complutense in Madrid. Upon their arrival, participants were provided with a Spanish translation of the consent form[10] and had the opportunity to ask questions before signing it (5 minutes). Two of the language tests, the word monitoring task and the metalinguistic knowledge test, were administered in a fixed order. The word monitoring task was always administered first (20 minutes), and the metalinguistic knowledge test was administered last (20 minutes).
The rationale behind this order was to reserve the tests that allow controlled use of L2 knowledge and that encourage the highest degree of awareness for the end, and to keep the most implicit L2 measure as pure as possible (for a similar order of administration, see Ellis, 2005). The four remaining language tests were administered between the word monitoring task and the metalinguistic knowledge test following a balanced Latin square design. The two timed GJTs took approximately 10 minutes each, whereas the untimed auditory and untimed visual GJT took approximately 20 minutes each. Every language test was followed by a cognitive test selected at random. Each cognitive test (the LLAMA, GAMA, and probabilistic SRT task) took 25 minutes. Overall testing time ranged from three to four hours. The six language tests included a 5-minute break halfway. In addition, participants were allowed to take rests between tests and/or as needed. Participants earned 50 euros (approximately $65) for their participation in the study and were provided with drinks and snacks.

[10] IRB (Institutional Review Board) Protocol #08-0138, approval date February 22, 2011.

4.5 Target Structures

In order to maximize variability among early L2 learners, the study included a combination of six grammatical structures that are early and late acquired items in L1 Spanish. Grammatical agreement, in languages that mark for agreement, is acquired by age 3 and with few errors (Slobin, 1985; Meisel, 1990), but it is among the most difficult grammatical structures for L2 learners. Later acquired items for L1 speakers include the conditional and subjunctive moods, subordinate clauses, passives, and tense and aspect. These are structures that are not mastered with 100% accuracy until age 7 and later, whereas grammatical agreement is acquired with almost 100% accuracy by age 3.
Late acquisitions in Spanish such as the subjunctive and the passive depend on cognitive development in children and are more influenced by explicit instruction at school and by literacy skills.

The six target structures included in the present study were: (1) Noun-adjective gender agreement, (2) Subject-verb number agreement, (3) Noun-adjective number agreement, (4) Subjunctive mood, (5) Perfective/imperfective aspect contrasts, and (6) Passives with ser/estar. These structures are known to be difficult for NSs of a non-Romance language (Bruhn de Garavito & Valenzuela, 2008; Collentine, 1995; Jiang, Novokshanova, Masuda, & Wang, 2011; Johnston, 1995; Montrul, 2004; Smith, 1980; Terrell, Baycroft, & Perrone, 1987). Gender agreement, subject-verb agreement, and number agreement (agreement structures) are acquired by age 3 by L1 speakers of Spanish, while the subjunctive, aspectual contrasts, and passives (non-agreement structures) are acquired close to age 7 (Montrul, 2004). Grammatical structures acquired early in the L1, such as gender agreement, show greater variability in L2 acquisition than structures that are acquired late. For example, in the study by Granena and Long (2010), the variability (as indicated by standard deviations) among child L2 learners with ages of onset between 3 and 6 (n = 20) was greater for grammatical structures that are typically acquired before age 3 by L1 speakers (e.g., gender agreement, M = 10.45, SD = 2.63) than for structures that are acquired later (e.g., the subjunctive, M = 21.00, SD = 1.45). An example of each target structure is displayed in Table 5.

Table 5. Target structures

Early L1 acquisitions (agreement structures)

  Noun-adjective gender agreement
    *Cualquier corriente de aire puede resultar molesto (molesta) para la practica del esquí
    "Any airstream can become annoying for skiing"

  Subject-verb number agreement
    *En el periódico se publicó (publicaron) todos los artículos escritos por Miguel Delibes
    "All the articles written by Miguel Delibes were published in the paper"

  Noun-adjective number agreement
    *Los votos de los que dispone el candidato son mucho (muchos) más de los que tiene la oposición
    "The votes the candidate has are many more than the opposition has"

Late L1 acquisitions (non-agreement structures)

  Subjunctive mood
    *Es importante que los estudiantes de español practican (practiquen) el idioma todos los días
    "It is important for Spanish learners to practice the language every day"

  Perfective/imperfective aspect
    *En la edad de piedra, los seres humanos aprendían (aprendieron) a utilizar la rueda
    "In the Stone Age, human beings learned how to use the wheel"

  Passives with ser/estar
    *El jefe informó de que el trabajo que fuese (estuviese) acabado para el viernes se pagaría doble
    "The boss announced that the work that was finished by Friday would be double-paid"

A pool of 360 target items was created and items were randomly assigned to each of the six language measures. Each language test had a total of 60 items (10 per target structure), an equal number of which were grammatical and ungrammatical (see Appendix B for item pool).

Chapter 5: Results

This dissertation predicted that aptitudes that are more relevant for implicit language learning and processing would moderate L2 learners' attainment on tasks that require more automatic use of L2 knowledge. This prediction was made both for early L2 learners who are sequential bilinguals and for adult learners, since adults were still expected to be able to learn implicitly, but not for NSs, whose ultimate attainment is characterized by inter-individual homogeneity and, therefore, predicted to be independent of aptitude. Aptitude for implicit language learning should also moderate early L2 learners'
attainment on tasks that require controlled use of L2 knowledge, since early L2 learners were expected to use the same type of knowledge regardless of language task. The nature of this knowledge is hypothesized to be implicit, like NSs' knowledge. Early L2 learners' ultimate attainment, however, is characterized by greater inter-individual variability than that of NSs and, therefore, was expected to be moderated by language aptitude.

On the other hand, aptitudes that are more relevant for explicit language learning were only expected to moderate adult L2 learners' attainment on tasks that allow controlled use of L2 knowledge. These tasks increase available test time and decrease processing demands; therefore, they provide an opportunity to utilize problem-solving and analytic skills. On these tasks, adult learners can rely on explicit L2 knowledge and compensate for their limited implicit competence. Adult learners with a higher aptitude for explicit language learning were expected to do better as a result of their greater analytic, metalinguistic abilities.

Finally, regarding general intelligence, it was hypothesized that, in ultimate L2 attainment, relationships between explicit aptitude and general intelligence and learning outcomes would pattern in the same way and would be different from effects of implicit aptitude on outcomes. This hypothesis was based on studies of artificial grammar learning, in which fluid intelligence correlates with learning when participants are instructed to look for patterns in the training materials, but not under more incidental learning conditions.

This chapter presents the results of the study. The results for each of the cognitive and linguistic measures are presented first. Next, the results of the role of cognitive variables on language outcomes are reported.
Overall performance on grammatical and ungrammatical items is reported first, and, then, follow-up detailed analyses are reported for ungrammatical items and for type of target structure: agreement structures (early L1 acquisitions) and non-agreement structures (late L1 acquisitions).

5.1 Cognitive Aptitudes

In this section, overall performance for each speaker group on the six cognitive tests (LLAMA B, D, E, and F, GAMA, and the probabilistic SRT task) is presented. Next, the results of a PCA, an exploratory factor analytic technique, are reported. PCA was conducted to reduce the dimensionality of the dataset by determining whether cognitive variables could be combined into different aptitude components as equally weighted composite scores.

5.1.1 The LLAMA Test

Table 6 shows the descriptive statistics for each of the four LLAMA subtests: LLAMA B (vocabulary learning), LLAMA D (sound recognition), LLAMA E (sound-symbol correspondence), and LLAMA F (grammatical inferencing). The maximum possible test score for each test was 100.

Table 6. Descriptives of the LLAMA Language Aptitude Test

Group                  LLAMA B                 LLAMA D                 LLAMA E                 LLAMA F
                       M (SD)          Range   M (SD)          Range   M (SD)          Range   M (SD)          Range
NS Controls (n = 20)   56.50 (18.36)   30-95   33.50 (16.39)   0-60    79.00 (19.17)   30-100  61.50 (22.31)   10-100
Early AO (n = 50)      63.40 (18.03)   30-100  37.40 (13.71)   10-65   84.00 (16.78)   40-100  64.60 (20.52)   10-90
Late AO (n = 50)       50.20 (17.35)   5-80    28.60 (12.90)   0-55    70.40 (25.55)   0-100   55.80 (24.00)   0-90

Note. Standard deviations appear between parentheses.

The four subtests were normally distributed, according to one-sample Kolmogorov-Smirnov (K-S) tests, in each of the groups: NS controls (p = .629, p = .284, p = .734, and p = .850), early AO (p = .354, p = .347, p = .556, and p = .932), and late AO (p = .376, p = .266, p = .436, and p = .434). There were no extreme outliers (i.e., scores more than 3 SDs from each group's mean).
Early L2 learners scored the highest on each of the four subtests: LLAMA B (63.40%), LLAMA E (84%), LLAMA F (64.60%), and LLAMA D (37.40%), whereas late L2 learners scored the lowest: LLAMA B (50.20%), LLAMA E (70.40%), LLAMA F (55.80%), and LLAMA D (28.60%). Finally, the scores in the NS group were 56.50% (LLAMA B), 79% (LLAMA E), 61.50% (LLAMA F), and 33.50% (LLAMA D). NSs were not significantly different from either early or late L2 learners on any of the LLAMA subtests, according to Scheffé post hoc tests: LLAMA B (p = .345 and p = .412), LLAMA E (p = .674 and p = .314), LLAMA F (p = .871 and p = .629), and LLAMA D (p = .569 and p = .412). Early and late L2 learners, however, were significantly different on all the tests, except on LLAMA F (p = .148): LLAMA B (p = .002), LLAMA E (p = .007), and LLAMA D (p = .008). These differences could be due to the positive cognitive consequences that early bilingualism is claimed to have on executive processes (e.g., Bialystok, 1999), age at time of testing (since late L2 learners were significantly older than early L2 learners), sampling bias (i.e., a biased representation of early bilinguals who succeeded with L2 Spanish), or, perhaps, a combination of two or more of these factors.

5.1.2 The GAMA Test

Table 7 shows the descriptive statistics for the GAMA general intelligence test.

Table 7. Descriptives of the GAMA General Intelligence Test

Group                GAMA
                     M (SD)          Range
Control (n = 20)     44.30 (7.01)    30-56
Early AO (n = 50)    45.88 (5.45)    28-56
Late AO (n = 50)     39.88 (7.44)    18-53

Note. Standard deviations appear between parentheses. The maximum possible test score was 66.

Test scores were normally distributed, according to one-sample Kolmogorov-Smirnov (K-S) tests, in each of the groups: NS controls (p = .882), early L2 learners (p = .452), and late L2 learners (p = .465). Late L2 learners scored significantly lower than NS controls (p = .044) and early L2 learners (p < .001).
There was no significant difference between NS controls and early L2 learners (p = .665). The fact that late L2 learners had a significantly lower IQ than NS controls was due to the effects of an outlier with a score of 18 out of 66 in the late L2-learner group (this was the participant with the oldest age at testing in the sample). When this participant was removed, late L2 learners were no longer significantly different from NSs (p = .064) and they only differed from early L2 learners (p < .001). The difference in intelligence between early and late learners could be due to late L2 learners' significantly older age at testing with respect to early L2 learners, or to other factors that could have contributed to early L2 learners' higher IQ scores (e.g., cognitive advantages of early bilingualism, sampling bias).

5.1.3 Probabilistic Serial Reaction Time (SRT) Task

To assess learning on the SRT task, the average response time on probable trials was subtracted from the average response time on improbable trials. The resulting difference was used as an index of sequence learning; greater differences indicated a greater degree of sequence learning. Error responses were discarded (0.90% of trials), as well as outlier responses that were more than 3 standard deviations from the mean (1.68% of trials), computed individually for each block and participant. A repeated-measures analysis of variance (ANOVA) with block (blocks 1 to 8) and type of trial (training vs. control) as within-subjects factors was conducted on the reaction time measures. The results showed a significant effect for block (F(7, 112) = 12.552, p < .001, ηp² = .440, Λ = .560), and type of trial (F(1, 118) = 108.842, p < .001, ηp² = .480, Λ = .520), as well as a significant interaction for block x type of trial (F(7, 112) = 9.982, p < .001, ηp² = .384, Λ = .616), suggesting that learning of the training sequence occurred.
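The scoring steps just described (discard errors, trim outliers within each block, then subtract the mean probable RT from the mean improbable RT) can be sketched for a single participant as follows. This is a hedged illustration rather than the analysis script used in the study: the function name, the trial record layout (keys block, probable, correct, rt), and the toy data are hypothetical. The sketch also assumes that the trimming mean and standard deviation pool probable and improbable trials within a block and that the index is computed over all retained trials rather than averaged block by block; the text does not settle either point.

```python
from collections import defaultdict
from statistics import mean, stdev


def learning_index(trials, k=3.0):
    """Sequence-learning index for one participant: mean improbable RT - mean probable RT.

    `trials` is a list of dicts with hypothetical keys:
    block (int), probable (bool), correct (bool), rt (milliseconds).
    """
    by_block = defaultdict(list)
    for t in trials:
        if t["correct"]:  # error responses are discarded
            by_block[t["block"]].append(t)

    kept = {True: [], False: []}  # retained RTs, keyed by trial type
    for block_trials in by_block.values():
        rts = [t["rt"] for t in block_trials]
        m = mean(rts)
        s = stdev(rts) if len(rts) > 1 else 0.0
        for t in block_trials:
            # Assumption: trimming uses the pooled mean and SD of the block.
            if s == 0.0 or abs(t["rt"] - m) <= k * s:
                kept[t["probable"]].append(t["rt"])

    return mean(kept[False]) - mean(kept[True])


# Toy data: 20 probable trials at 400 ms, 5 improbable trials at 450 ms,
# one error trial, and one extreme probable trial at 2000 ms.
toy = (
    [{"block": 1, "probable": True, "correct": True, "rt": 400} for _ in range(20)]
    + [{"block": 1, "probable": False, "correct": True, "rt": 450} for _ in range(5)]
    + [{"block": 1, "probable": True, "correct": False, "rt": 100}]
    + [{"block": 1, "probable": True, "correct": True, "rt": 2000}]
)
print(learning_index(toy))  # 50
```

In the toy data the error trial and the 2000 ms response are both excluded, so the index reduces to 450 - 400. A positive value indicates faster responses to trials that follow the target sequence.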
Figure 7 shows the SRT learning performance for probable trials (i.e., congruent with the target sequence) and non-probable trials (i.e., incongruent with the target sequence) for the entire set of participants.

Figure 7. SRT learning performance
[Line plot titled "Serial Reaction Time Task". X-axis: Block (1 to 8). Y-axis: Reaction Time (430 to 520). Series: Probable (SOC-85) and Non-probable (SOC-15).]

Reaction times for probable trials (SOC-85) were always faster than reaction times for non-probable trials in each of the blocks, except for block 4, where responses to non-probable trials, surprisingly, were faster (t(119) = 2.193, p = .030). The explanation for this pattern of results, which was observed in the sample as a whole, as well as in each of the groups separately, seems to be the percentage of non-probable trials in block 4. This block had the lowest percentage of non-probable trials (10%) and this seems to have decreased the amount of interference effects. Reaction times in block 6 support this interpretation, since this was the block with the largest percentage of non-probable trials (21.67%) and, perhaps for that reason, also the block with the largest difference between the average time to respond to probable trials and the average time to respond to improbable trials. Given that participants' responses showed high sensitivity to changes in probability levels, keeping the .85/.15 probabilities for probable and improbable trials constant in each of the blocks throughout the task could have yielded more similar results across blocks.

As shown in Figure 7, blocks did not follow a linear trend. Reaction times for probable trials were faster at the beginning of the task and, then, became increasingly slower.
This is a common effect in probabilistic versions of the SRT task (see, for example, Kaufman et al., 2010, p. 331) and can be interpreted as an effect of the increasing control that takes place when participants learn the target sequence, but also realize that it does not always follow the same order (Jiménez, p.c., 01/13/2012).

The average reaction times for probable and improbable trials were 476.83 (SD = 75.39) and 489.08 (SD = 71.78), respectively. The resulting reaction time difference (i.e., index of sequence learning) was 12.25 (t(119) = -10.564, p < .001). This difference was statistically significant in each of the groups: NS controls (t(19) = -4.046, p = .001), early L2 learners (t(49) = -8.352, p < .001), and late L2 learners (t(49) = -5.645, p < .001), indicating a significant amount of learning in all the groups. Table 8 shows the index of sequence learning in each of the groups. Sequence learning was normally distributed in every group, according to one-sample Kolmogorov-Smirnov (K-S) tests: NS controls (p = .697), early L2 learners (p = .932), and late L2 learners (p = .983). The three groups exhibited comparable amounts of sequence learning and did not differ significantly from one another, according to Scheffé post hoc tests, between late L2 learners and controls (p = .495), late and early L2 learners (p = .120), or controls and early L2 learners (p = .930).

Table 8. Descriptives of the Probabilistic SRT Task

Group                Probabilistic SRT Task
                     M (SD)           Range
Control (n = 20)     13.37 (14.78)    -17.41 to 44.91
Early AO (n = 50)    14.63 (12.39)    -10.51 to 49.26
Late AO (n = 50)     9.42 (11.79)     -13.00 to 38.64

Note. Standard deviations appear between parentheses.

The possible influence of explicit knowledge on participants' learning performance (i.e., conscious access to sequence knowledge) was assessed via a recognition test with an objective and a subjective component. Participants'
confidence ratings given to old and new sequences (i.e., triads) on a six-point scale were compared. Low ratings indicated greater confidence in the sequence being old. If participants are unable to discriminate old from new, this may be evidence of implicit learning.[11] In addition, the reaction times elicited by the third element of the same old and novel triads were also compared. Response speed provides a direct index of the possible influence of unconsciously applied perceptual-motor programs. Tables 9 and 10 show the descriptive statistics for each of the groups. Table 9 displays confidence ratings, while Table 10 presents reaction times on new and old sequences.

[11] Shanks and Johnstone (1999) point out that, if discrimination of old and new sequences is possible, this could be due to the unconscious misattribution of fluency to oldness. In other words, participants may become aware of the fact that some of their responses are faster and judge those sequences as more familiar. That is why, if discrimination is possible, Shanks and Johnstone suggest comparing reaction times of sequences judged old and new, independently of actual old-new status, to test whether there is a contribution of explicit sequence memory to recognition performance over and above the fluency factor.

Table 9. Mean Confidence Ratings for Old and New Triads

Group                Old Sequences (n = 12)       New Sequences (n = 12)
                     M (SD)         Range         M (SD)         Range
Control (n = 20)     2.65 (0.50)    1.67-3.42     2.70 (0.53)    1.36-3.42
Early AO (n = 50)    2.41 (0.60)    1.08-3.33     2.53 (0.69)    1.08-3.83
Late AO (n = 50)     2.36 (0.64)    1.00-3.67     2.45 (0.69)    1.00-3.75

Note. Standard deviations appear between parentheses.

Table 10. Mean Reaction Times for Old and New Triads

Group                Old Sequences (n = 12)              New Sequences (n = 12)
                     M (SD)             Range            M (SD)             Range
Control (n = 20)     560.98 (109.12)    390.36-810.92    581.94 (112.79)    421.92-763.50
Early AO (n = 50)    479.44 (92.49)     316.82-706.92    501.95 (82.39)     351.73-757.25
Late AO (n = 50)     554.29 (102.78)    369.75-822.42    570.70 (109.51)    413.42-845.42

Note. Standard deviations appear between parentheses.

As noted in Chapter 4, confidence ratings were indicated on a six-point scale, from 1 ("I am sure that this sequence was part of the test") to 6 ("I am sure that this sequence was not part of the test"). As can be seen in Table 9, slightly lower mean ratings were assigned to old sequences (low ratings indicate greater confidence in the sequence being old), but there was considerable overlap between the two distributions of ratings in all the groups. Participants' ratings were mostly located on the first half of the scale (points 1 to 3), from "I am sure that this sequence was part of the test" to "I think that this sequence was part of the test", regardless of old-new status. The mean score on new sequences exceeded the mean score on old sequences by only 0.05 of a scale unit in the NS control group, 0.12 in the early L2 learner group, and 0.09 in the late L2 learner group. This suggests that participants judged old and new sequences as being equally familiar and that they were unable to discriminate between them.

A repeated-measures ANOVA was conducted with rating (old vs. new) as a within-subjects factor and group as a between-subjects factor. Box's test was not significant (p = .580), suggesting that participants maintained their relative standing in the two treatment conditions. The equality of variances assumption was met, according to Levene's test, for both ratings to old sequences (p = .424) and new sequences (p = .326), indicating equal variability across groups.
The repeated-measures ANOVA yielded a non-significant main effect for rating (F(1, 117) = 3.340, p = .070, ηp² = .028, Λ = .972) and a non-significant interaction between rating and group (F(2, 117) = .311, p = .733, ηp² = .005, Λ = .995), suggesting no significant differences between ratings to old and new sequences and a comparable effect in the three groups of participants. This suggested that participants did not have explicit knowledge of familiar sequences.

Regarding reaction times, the third element of old sequences elicited faster responses in the three groups of participants. Given that the first two locations in the triads were the same in old and new sequences and that they only differed in their second-order conditional information (e.g., 1-2 was followed by 1 as an old sequence, but by 4 as a new sequence), the increased fluency observed for old sequences can be attributed to the oldness of the sequence and automatic retrieval of sequence knowledge. A repeated-measures ANOVA was conducted with reaction times (old vs. new) as a within-subjects factor and group as a between-subjects factor. The assumption of equality of covariance matrices according to Box's test was met (p = .203). The equality of variances assumption was met, according to Levene's test, for reaction times on old sequences (p = .512), but not for reaction times on new sequences (p = .046),12 indicating unequal variability across groups. However, the largest standard deviation was less than three times the smallest standard deviation and Levene's test was approaching the .05 value. Therefore, ANOVA was considered robust. In addition, reaction times on old and new triads were normally distributed in the control group (p = .845 and p = .895), early AO group (p = .553 and p = .282), and late AO group (p = .890 and p = .549), according to K-S tests.
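The effect sizes reported for the confidence-rating ANOVA above can be cross-checked from the F ratios and their degrees of freedom. The following is a minimal sketch, assuming the standard identity partial eta squared = (F x df_effect) / (F x df_effect + df_error) and that, for a within-subjects factor with two levels, Wilks' lambda equals one minus partial eta squared. The function name is hypothetical and is not part of the study's analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values reported above for the confidence-rating analysis.
rating_effect = partial_eta_squared(3.340, 1, 117)       # main effect of rating
interaction_effect = partial_eta_squared(0.311, 2, 117)  # rating x group

for label, eta in [("rating", rating_effect), ("rating x group", interaction_effect)]:
    # With a two-level within-subjects factor, Wilks' lambda is 1 - partial eta squared.
    print(f"{label}: partial eta squared = {eta:.3f}, Wilks' lambda = {1 - eta:.3f}")
```

Under these assumptions the sketch returns .028 and .972 for the main effect of rating and .005 and .995 for the interaction, matching the reported values.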
The repeated-measures ANOVA yielded a significant main effect for reaction time (F(1, 117) = 9.784, p = .002, ηp² = .084, Λ = .916) and a non-significant interaction between reaction time and group (F(2, 117) = .125, p = .883, ηp² = .002, Λ = .998). Reaction times were significantly faster on the third element of old (i.e., more familiar) sequences, and this effect was comparable in the three groups of participants.

12 ANOVA is considered reasonably robust to moderate departures from the homogeneity assumption, if sample size is larger than 20, but the departure needs to stay smaller when the sample sizes are very different (largest to smallest > 1.5) (Keppel & Wickens, 2004). In addition, Levene's test is sensitive to Type I errors and, with a large sample size, it will tend to indicate a significant difference between variances when the real difference may not be that large. As a rule of thumb, if the largest standard deviation is three or four times larger than the smallest standard deviation, it is likely that the assumption has been violated (Houser, 2008). An alternative to transforming the data is to test at a more stringent alpha level, such as .01.

5.1.4 Cognitive Aptitudes for Implicit and Explicit Learning

It was hypothesized that the LLAMA subtests B, E, and F were measures of explicit cognitive processes relevant for explicit language learning, whereas LLAMA D and the probabilistic SRT task were measures of implicit cognitive processes relevant for implicit language learning. These claims were based on the results of an exploratory factor analysis (Granena, 2011b, to appear), which showed that LLAMA subtests B, E, and F loaded together on one component, interpreted as analytic ability, whereas LLAMA D loaded on a separate component, interpreted as sequence learning ability.
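Component solutions of this kind are usually rotated before the loadings are interpreted. As a rough illustration only, the sketch below applies a raw Varimax rotation to a two-component loading matrix using the closed-form rotation angle available in the two-component case. The loadings are hypothetical placeholders, not data from this study, and the function names are assumptions introduced for the example. It does not reproduce the software or settings used in the analyses reported here.

```python
import math


def varimax_two_components(loadings):
    """Rotate a two-column loading matrix to maximize the raw varimax criterion."""
    p = len(loadings)
    u = [x * x - y * y for x, y in loadings]
    v = [2 * x * y for x, y in loadings]
    a, b = sum(u), sum(v)
    c = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
    d = 2 * sum(ui * vi for ui, vi in zip(u, v))
    # Closed-form angle for two components; atan2 selects the maximizing solution.
    phi = math.atan2(d - 2 * a * b / p, c - (a * a - b * b) / p) / 4
    cos_phi, sin_phi = math.cos(phi), math.sin(phi)
    return [(x * cos_phi + y * sin_phi, -x * sin_phi + y * cos_phi)
            for x, y in loadings]


def varimax_criterion(loadings):
    """Sum, over the two components, of the variance of the squared loadings."""
    p = len(loadings)
    total = 0.0
    for j in range(2):
        squares = [row[j] ** 2 for row in loadings]
        mean = sum(squares) / p
        total += sum((s - mean) ** 2 for s in squares) / p
    return total


# Hypothetical unrotated loadings for five tests on two components (not study data).
unrotated = [(0.70, -0.35), (0.68, -0.30), (0.66, -0.10), (0.50, 0.45), (0.40, 0.65)]
rotated = varimax_two_components(unrotated)

for before, after in zip(unrotated, rotated):
    print(before, "->", tuple(round(value, 3) for value in after))
```

Because the rotation is orthogonal, each test keeps the same communality (the sum of its squared loadings) before and after rotation. Only the distribution of the loadings across the two components changes.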
LLAMA B, E, and F have in common that all include a study phase prior to testing, allow time to think and use problem-solving strategies, and involve working out relations in a data set. LLAMA D, on the other hand, includes no study phase, does not allow time to rehearse, and involves recognition of phonological sequences.

To further validate the hypothesized distribution of cognitive aptitudes, a PCA was conducted (n = 120) on the scores of the four LLAMA subtests and on the amount of sequence learning in the probabilistic SRT task, as measured by the difference in reaction time between probable and improbable trials. An orthogonal rotation method (Varimax13) was used. The analysis yielded two principal components with eigenvalues greater than 1.0 that explained 59.23% of the total variance. The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy was greater than .600 (.712), and Bartlett's test of sphericity was significant (p < .05), indicating that the correlation matrix differed significantly from an identity matrix. The first component had an eigenvalue of 1.861 and accounted for 37.22% of the variance. The second component had an eigenvalue of 1.101 and accounted for an additional 22.02% of the variance. The rotated component matrix showed that three tests loaded on the first component with loadings greater than .4: LLAMA B, λ = .791, LLAMA F, λ = .790, and LLAMA E, λ = .639. LLAMA D and the SRT task had loadings smaller than .4: LLAMA D, λ = .351, and SRT task, λ = .219. On the other hand, two tests loaded strongly on the second component: SRT task, λ = .787, and LLAMA D, λ = .647. LLAMA B, E, and F had loadings smaller than .4: LLAMA E, λ = .295, LLAMA B, λ = -.030, and LLAMA F, λ = -.086. The same pattern of results was obtained after applying a non-orthogonal rotation method (Direct Oblimin). This method, which allows factors to be correlated, showed that the correlation between the two components was -.093, indicating that there was no significant association between the two components and, therefore, that participants could have high ability in one component, but low ability in the other, and vice versa.

13 Orthogonal rotation methods (e.g., Varimax, Equamax, Quartimax) result in uncorrelated factors, whereas oblique rotation methods (e.g., Direct Oblimin, Promax) allow factors to be correlated.

In addition to the four LLAMA subtests and the probabilistic SRT task, the study also included a general intelligence measure (GAMA) as part of the battery of tests. Although conventional general ability measures probably tap both explicit and implicit cognitive processes, several studies have found that they are highly correlated with attention-driven working memory measures (e.g., Engle et al., 1999; Kaufman et al., 2010; Kyllonen, 1996; Kyllonen & Christal, 1990) and uncorrelated with measures such as probabilistic SRT tasks, claimed to tap implicit cognitive processes (e.g., Kaufman et al., 2010), and implicit memory measures of priming (e.g., Woltz, 1990, 1999). In order to uncover the underlying structure of all the cognitive measures in the present study and to see whether GAMA scores could be included as part of a composite measuring aptitudes for explicit or implicit learning, another principal components analysis was performed, including all cognitive measures (LLAMA B, LLAMA E, LLAMA F, LLAMA D, GAMA, and the SRT task). The analysis, conducted with Varimax rotation, yielded two principal components with eigenvalues greater than 1.0 that explained 57.34% of the total variance. The first component had an eigenvalue of 2.304 and accounted for 38.41% of the variance. The second component had an eigenvalue of 1.136 and accounted for an additional 18.94% of the variance. The rotated component matrix showed that the GAMA test loaded more strongly on the first component (λ = .788), together with LLAMA F (λ = .749), LLAMA B (λ = .720), and LLAMA E (λ
= .655). Its loading on the second component, where LLAMA D and the SRT task kept loading more strongly (λ = .660 and λ = .784), was -.117, suggesting a negative association between intelligence and the second component. The same pattern of results was obtained via a non-orthogonal rotation, which further showed that the two components correlated at -.072.

On the basis of these results, and since GAMA scores correlated more strongly with LLAMA B, E, and F (r = .56, p < .001) than with LLAMA D and the SRT task (r = .19, p = .04), two equally weighted composite scores were created, one with GAMA, LLAMA B, LLAMA E, and LLAMA F scores, and one with LLAMA D scores and sequence learning in the probabilistic SRT task (see Table 11 for the proposed underlying structure of cognitive aptitudes in this study). The two composite scores were created by converting each of the individual variables to z-scores in each of the groups. These scores were added and divided by the number of variables in the composite. The decision to create composite scores for each group separately was motivated by the fact that early and late L2 learners did not have comparable cognitive abilities. Early L2 learners performed significantly better in most cognitive measures (see Sections 5.1.1 and 5.1.2), which would have made the distribution of scores unbalanced across groups (i.e., a large number of participants would have been high-aptitude in the early AO group, but low-aptitude in the late AO group). The resulting composite scores were normally distributed, according to K-S tests: NS controls (p = .886 and p = .833), early L2 learners (p = .629 and p = .956), and late L2 learners (p = .092 and p = .974). The correlation between the two composite scores was .06 (p = .722) in the control group, .17 (p = .233) in the early AO group, and .22 (p = .119) in the late AO group.

Table 11.
Cognitive Aptitudes

Measures of Explicit Cognitive Processes     Measures of Implicit Cognitive Processes
LLAMA B (Vocabulary Learning)                LLAMA D (Sound Recognition)
LLAMA E (Sound-symbol Correspondence)        Probabilistic SRT Task (Implicit Learning)
LLAMA F (Grammatical Inferencing)
GAMA (General Intelligence Test)

Z-scores, which indicate the number of standard deviations above or below the mean, were used to divide participants in each of the groups into high-, mid-, and low-aptitude: High = z-scores > .5, mid = -.5 < z-scores < .5, and low = z-scores < -.5. Tables 12 and 13 display the average scores for each aptitude group. Participants in each of the aptitude levels were not the same across the two aptitude types. In the early AO group, six learners were high both in implicit and explicit language aptitude, eight were high only in implicit aptitude, and 12 were high only in explicit aptitude. In the late AO group, six learners were high both in implicit and explicit language aptitude, 12 were high only in implicit aptitude, and seven were high only in explicit aptitude. Therefore, there were more L2 learners who were high in implicit aptitude in the late AO group, perhaps indicating a sampling selection bias affecting proficient adult L2 learners who succeed in an immersion language learning context.

Table 12. High-, Mid-, and Low-explicit Language Aptitude Groups (z-scores)

Group       Level    n     Explicit Language Aptitude
Control     High     7     .93 (.20)
            Mid      8     .01 (.32)
            Low      5     -1.28 (.97)
Early AO    High     18    .99 (.38)
            Mid      18    .04 (.31)
            Low      14    -1.32 (.46)
Late AO     High     13    1.11 (.36)
            Mid      24    .15 (.26)
            Low      13    -1.40 (.56)

Note. Standard deviations appear between parentheses.

Table 13.
High-, Mid-, and Low-Implicit Language Aptitude Groups (z-scores)

Group       Level    n     Implicit Language Aptitude
Control     High     6     1.57 (.63)
            Mid      6     -.09 (.27)
            Low      8     -1.11 (.76)
Early AO    High     14    1.17 (.55)
            Mid      20    .09 (.32)
            Low      16    -1.14 (.45)
Late AO     High     18    1.47 (.85)
            Mid      16    -.05 (.26)
            Low      16    -1.48 (.75)

Note. Standard deviations appear between parentheses.

5.2 Language Attainment

Overall performance for each speaker group on the six language tests (timed visual GJT, untimed visual GJT, timed auditory GJT, untimed auditory GJT, metalinguistic knowledge test, and word monitoring task) is presented in this section. Results are reported for grammatical and ungrammatical items first and, then, for ungrammatical items only, except in the case of the word monitoring task, which does not provide an interpretable measure for ungrammatical items.

5.2.1 Grammaticality Judgment Tests

Tables 14 and 15 display the groups' scores on the timed and untimed visual (Table 14) and auditory GJTs (Table 15). All dependent variables were normally distributed in each group, according to K-S tests (p > .05). There were no extreme outliers with values more than 3 standard deviations above or below each group's mean. The three groups were significantly different from one another on the four GJTs, according to Bonferroni-adjusted comparisons. NS controls scored significantly higher than early and late L2 learners on each of the GJTs (p < .001 and p < .001, respectively) and early L2 learners scored significantly higher than late L2 learners (p < .001).

Table 14. Group Mean Percentage Scores on Timed and Untimed Visual GJTs

Group               Timed Visual GJT                Untimed Visual GJT
                    M        SD       Range         M        SD      Range
Control (n = 20)    84.21    7.65     67.31-98.21   90.25    4.30    81.67-98.33
Early AO (n = 50)   72.03    10.18    50.00-90.00   76.77    9.13    53.33-96.67
Late AO (n = 50)    57.88    8.39     41.51-71.74   61.27    9.05    46.67-91.67

Table 15.
Group Mean Percentage Scores on Timed and Untimed Auditory GJTs

Group               Timed Auditory GJT              Untimed Auditory GJT
                    M        SD       Range         M        SD      Range
Control (n = 20)    92.04    5.13     78.57-98.33   93.25    6.34    76.67-100.00
Early AO (n = 50)   76.24    10.07    58.33-94.92   79.39    9.73    56.67-98.33
Late AO (n = 50)    57.63    7.90     38.33-78.38   60.47    9.28    38.33-85.00

In terms of test modality (auditory/visual), GJT scores in the group of NS controls were higher in the auditory than in the visual modality, and the same pattern was observed in the early AO group, whereas, in the late AO group, scores were almost the same in the two modalities. In terms of time pressure (timed/untimed), the pattern of results was the same in the three groups: scores were higher on untimed than timed GJTs, with a larger difference in the visual than in the auditory modality. Figure 8 provides a visual comparison of the three groups of participants.

Figure 8. Group mean percentage GJT scores
[Bar chart "Grammaticality Judgments: Overall": mean % score (y-axis) by group (x-axis: Controls, Early L2 Learners, Late L2 Learners) for the timed visual, timed auditory, untimed visual, and untimed auditory GJTs.]

A repeated-measures ANOVA with Modality and Time as within-subjects factors and Group as a between-subjects factor confirmed the descriptive results. Box's test did not reach the .050 level (p = .046), indicating a violation of the equality of covariances assumption, a common violation in behavioral studies that compare groups with very different variances (e.g., NSs vs. L2 learners). This suggested that participants did not maintain their relative standing in the different treatment conditions. Equality of variances, according to Levene's test, was met for the timed visual GJT (p = .093) and the untimed auditory GJT (p = .077), but not for the timed auditory (p = .032) and untimed visual GJTs (p = .006), indicating that scores did not show equal variability across groups.
However, the largest standard deviation was less than three times the smallest standard deviation and, therefore, ANOVA was considered robust (see Footnote 12). The analysis yielded a significant main effect for Time (F(1, 117) = 34.130, p < .001, ηp² = .227, Λ = .773) and Modality (F(1, 117) = 19.181, p < .001, ηp² = .142, Λ = .858), as well as a significant interaction between Group and Modality (F(2, 117) = 8.509, p < .001, ηp² = .128, Λ = .872), but not between Group and Time (F(2, 117) = .360, p = .698, ηp² = .006, Λ = .994). These results indicated that the entire set of participants scored significantly higher on untimed tests, keeping modality constant, and higher on auditory tests, keeping time pressure constant. However, modality was further qualified by an interaction with group. Early L2 learners and NSs scored higher on auditory tests than on visual tests, whereas late L2 learners scored higher on visual than auditory tests.

When only correct responses to ungrammatical items14 were considered (see Tables 16 and 17, as well as Figure 9 for a visual depiction of the results), the three groups also scored significantly differently from one another (p < .001). Score differences according to GJT modality and time increased, especially in the late AO group.

14 The error in an ungrammatical item is the most likely reason for rejection of an item (DeKeyser, 2000), whereas acceptance of a grammatical item can respond to a variety of reasons. Although the margin of error becomes smaller when only ungrammatical items are considered, there is also a loss of power, resulting from the smaller sample of items considered.

While NS controls and early L2 learners scored higher in the auditory modality, late L2 learners scored higher in the visual modality. Also, the difference between timed and untimed test scores became larger in the late AO group. Late L2 learners'
scores on the ungrammatical items of the untimed visual and untimed auditory GJTs were around 20% higher than on the timed versions of the tests. Therefore, time pressure had detrimental effects on late L2 learners' performance, regardless of the modality of the measure.

Table 16. Group Mean Percentage Scores on Timed and Untimed Visual GJTs (Ungrammatical Items)

Group               Timed Visual GJT                 Untimed Visual GJT
                    M        SD       Range          M        SD       Range
Control (n = 20)    73.07    15.63    38.46-100.00   86.33    7.79     70.00-96.67
Early AO (n = 50)   52.24    18.28    4.55-83.33     64.07    17.28    10.00-93.33
Late AO (n = 50)    35.31    16.85    10.00-88.89    55.40    19.12    16.67-90.00

Table 17. Group Mean Percentage Scores on Timed and Untimed Auditory GJTs (Ungrammatical Items)

Group               Timed Auditory GJT               Untimed Auditory GJT
                    M        SD       Range          M        SD       Range
Control (n = 20)    86.10    9.70     62.96-100.00   91.00    10.60    63.33-100.00
Early AO (n = 50)   57.10    19.12    20.00-89.66    65.99    17.70    20.00-96.67
Late AO (n = 50)    28.25    15.43    00.00-69.23    45.80    14.46    10.00-83.33

Figure 9. Group mean percentage GJT scores (ungrammatical items)
[Bar chart "Grammaticality Judgments: Ungrammatical Items": mean % score (y-axis) by group (x-axis: Controls, Early L2 Learners, Late L2 Learners) for the timed visual, timed auditory, untimed visual, and untimed auditory GJTs.]

The repeated-measures ANOVA for ungrammatical items also violated the assumption of equality of covariance matrices (p = .002). Error variances were equal, according to Levene's tests, in the untimed auditory GJT (p = .055), the timed visual GJT (p = .406), and the timed auditory GJT (p = .050), but unequal in the untimed visual GJT (p = .001). However, the largest standard deviation was less than three times the smallest standard deviation and, therefore, ANOVA was considered robust. The analysis yielded a significant two-way interaction between Modality and Group (F(2, 117) = 25.803, p < .001, ηp² = .308, Λ = .692) and also between Time and Group (F(2, 117) = 6.684, p = .002, ηp² = .103, Λ = .897).
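The robustness rule of thumb applied above (largest standard deviation less than three times the smallest) can be checked directly against the group standard deviations in Tables 16 and 17. The sketch below is only an illustration of that arithmetic. The function and variable names are hypothetical, and the threshold of 3 is the heuristic cited in Footnote 12, not a formal test.

```python
def sd_ratio(standard_deviations):
    """Ratio of the largest to the smallest group standard deviation."""
    return max(standard_deviations) / min(standard_deviations)


# Group SDs (Control, Early AO, Late AO) on ungrammatical items, from Tables 16 and 17.
ungrammatical_sds = {
    "timed visual GJT": [15.63, 18.28, 16.85],
    "untimed visual GJT": [7.79, 17.28, 19.12],
    "timed auditory GJT": [9.70, 19.12, 15.43],
    "untimed auditory GJT": [10.60, 17.70, 14.46],
}

ratios = {test: round(sd_ratio(sds), 2) for test, sds in ungrammatical_sds.items()}

for test, ratio in ratios.items():
    # A ratio below 3 is treated here as a tolerable departure from homogeneity.
    print(f"{test}: largest/smallest SD = {ratio} (below 3: {ratio < 3})")
```

The largest ratio is found for the untimed visual GJT, which is also the only measure for which Levene's test indicated unequal error variances.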
Modality and Time, however, did not interact, either overall (F(2, 117) = 3.096, p = .081, ηp² = .026, Λ = .974) or in any of the groups (i.e., there was no three-way interaction, F(2, 117) = .174, p = .840, ηp² = .003, Λ = .997), indicating that the two modalities were similarly affected by time pressure. Figures 10 and 11 illustrate the two two-way interactions between Modality and Group (Figure 10) and Time and Group (Figure 11).

These results showed that, unlike NSs and early L2 learners, late L2 learners obtained higher scores on the visual GJT modalities. They also scored proportionally higher on untimed GJTs, when their performance is compared against NSs and early L2 learners, for whom the difference between scores on timed and untimed GJTs was not that large. Time pressure and modality had an effect on performance as separate factors, but not as combined factors, as the lack of an interaction between modality and time indicated. The effect of time pressure on performance was comparable across test modalities, and vice versa, in all the groups. Therefore, performance on auditory and visual formats was similarly affected by whether the test was timed or untimed, and performance on timed and untimed formats was similarly affected by whether the test was auditory or visual.

Figure 10. Modality x Group interaction
[Chart "Modality": mean % (y-axis) by group (x-axis: Controls, Early L2 Learners, Late L2 Learners) for visual and auditory GJTs.]

Figure 11. Time x Group interaction
[Chart "Time": mean % (y-axis) by group (x-axis: Controls, Early L2 Learners, Late L2 Learners) for untimed and timed GJTs.]

5.2.2 Metalinguistic Knowledge Test

Table 18 shows the descriptive statistics for all the items in the metalinguistic test and Table 19 for ungrammatical items only (i.e., items that were successfully corrected and errors that were correctly explained). All dependent variables in each of the groups were normally distributed, according to K-S tests (p > .05).
There were no extreme outliers with values more than 3 standard deviations above or below each group's mean. The three groups were significantly different from one another, according to Bonferroni-adjusted pairwise comparisons. NS controls scored significantly higher than early and late L2 learners overall and on ungrammatical items only (p < .001) and early L2 learners scored significantly higher than late L2 learners (p < .001).

Table 18. Group Mean Percentage Scores on the Metalinguistic Knowledge Test

Group               Metalinguistic Test
                    M                Range
Control (n = 20)    89.75 (5.75)     76.67-96.67
Early AO (n = 50)   77.20 (9.84)     56.67-91.67
Late AO (n = 50)    63.50 (10.65)    43.33-88.33

Note. Standard deviations appear between parentheses.

Table 19. Group Mean Percentage Scores on the Metalinguistic Knowledge Test (Ungrammatical Items)

Group               Error Correction                Error Explanation
                    M                Range          M                Range
Control (n = 20)    80.67 (10.63)    53.33-93.33    75.35 (20.23)    31.25-100.00
Early AO (n = 50)   57.20 (18.85)    13.33-83.33    64.60 (21.32)    0.00-100.00
Late AO (n = 50)    36.33 (20.15)    3.33-80.00     73.46 (26.11)    0.00-100.00

Note. Standard deviations appear between parentheses.

Metalinguistic test scores (overall and on ungrammatical items) are displayed in Figures 12 and 13. Figure 14 further shows the average proportion of explained errors. As can be seen, the three groups were able to correct more errors than they could explain. NSs and late L2 learners were the groups that, proportionally, could explain a larger number of errors, although group differences were not significant, according to Scheffé post hoc tests, between late L2 learners and controls (p = .954), late and early L2 learners (p = .168), or controls and early L2 learners (p = .223). Since group differences were not significant, the fact that NSs had the largest percentage of correct grammatical explanations was interpreted as being due to chance.
There were no features in the profile of the NSs that could account for their higher metalinguistic test scores. They were all studying or had studied university degrees in a variety of disciplines (science, education, business, etc.), but were not linguistically trained.

Figure 12. Group mean percentage scores on the metalinguistic knowledge test
[Bar chart "Metalinguistic Knowledge Test": mean % score (y-axis) by group (x-axis: Controls, Early L2 Learners, Late L2 Learners); series: Metalinguistic Test.]

Figure 13. Group mean percentage scores on the metalinguistic knowledge test (correction of ungrammatical items)
[Bar chart "Metalinguistic Knowledge Test": mean % score (y-axis) by group (x-axis: Controls, Early L2 Learners, Late L2 Learners); series: Error Correction.]

Figure 14. Group mean percentage scores on the metalinguistic knowledge test (explanation of ungrammatical items)
[Bar chart "Metalinguistic Knowledge Test": mean % score (y-axis) by group (x-axis: Controls, Early L2 Learners, Late L2 Learners); series: Error Explanation.]

Finally, Figure 15 compares participants' performance on the metalinguistic knowledge test and the four GJTs. As reported in Section 5.2.1, the three speaker groups were significantly different from one another on the four GJTs, and they were also significantly different on the metalinguistic knowledge test, according to a MANCOVA analysis (F(10, 225) = 21.726, p < .001, ηp² = .492, Λ = .258). The NS control group scored significantly higher than the early and late AO groups, and the early AO group scored significantly higher than the late AO group. All pairwise Bonferroni-adjusted comparisons were p < .001.

Figure 15. Group mean percentage scores on the four GJTs and the metalinguistic knowledge test.

Although, quantitatively speaking, NS controls and early L2 learners were significantly different (NSs scored between 12% and 14% higher on average on each of the tests), they shared the same pattern of scores, qualitatively speaking.
Thus, both groups scored the highest on the untimed auditory GJT and the lowest on the timed visual GJT. Scores on the timed visual GJT were significantly lower than scores on the other tests within each of the two groups (p < .05), according to repeated-measures ANOVAs. In addition, NSs scored higher on the untimed auditory GJT than on the metalinguistic test (p = .007) and this comparison was marginally significant within the early AO group (p = .073). Unlike NSs and early L2 learners, late L2 learners scored the highest on the metalinguistic knowledge test and the lowest on the two timed GJTs (visual and auditory). Their scores on the metalinguistic test were significantly higher than on the timed visual (p = .014) and auditory (p = .001) GJTs.

5.2.3 Word Monitoring Task

Table 20 shows the descriptive statistics for word monitoring latencies on grammatical and ungrammatical critical items. A grammatical sensitivity index was created by subtracting latencies on grammatical items from latencies on ungrammatical items, so that positive values indicate slower responses to ungrammatical items (see Table 21). Latencies were normally distributed in the NS control group (p = .189 and p = .186) according to K-S tests, but not in the early L2-learner group (p = .003 and p = .001) or late L2-learner group (p = .011 and p = .029). Visual inspection of boxplots showing the distribution of latencies in each group indicated the presence of five outliers in the early AO group and three outliers in the late AO group (see Figures 16 and 17).

Figure 16. Distribution of overall word monitoring latencies in the early AO group

Figure 17.
Distribution of overall word monitoring latencies in the late AO group
[Boxplots for Figures 16 and 17; plotted variable: SensitivityIndex.]

After removing these cases, normality was met for grammatical and ungrammatical items in the early AO group (p = .103 and p = .237), as well as for ungrammatical items in the late AO group (p = .146). Grammatical items in the late AO group approached normality (p = .047). The index computed as a measure of grammatical sensitivity was normally distributed in each of the groups: NS controls (p = .998), early L2 learners (p = .195) and late L2 learners (p = .787).

Table 20. Word Monitoring Mean Latencies

Group               Ungrammatical Items                  Grammatical Items
                    M                   Range            M                   Range
Control (n = 20)    1105.66 (418.29)    677.41-2629.14   1043.47 (451.27)    646.67-2782.33
Early AO (n = 45)   920.92 (164.47)     743.07-1712.83   875.09 (177.24)     693.41-1799.60
Late AO (n = 47)    1704.41 (755.58)    796.72-3202.62   1711.95 (766.98)    827.70-3341.86

Note. Standard deviations appear between parentheses.

Table 21. Grammatical Sensitivity Index (GSI)

Group               GSI
                    M                 Range
Control (n = 20)    62.19 (96.21)     -153.20 to 234.55
Early AO (n = 45)   45.83 (69.11)     -91.26 to 218.93
Late AO (n = 47)    -7.54 (101.08)    -237.03 to 216.46

Note. Standard deviations appear between parentheses.

A total comprehension score was computed on the basis of correct responses to the set of randomly distributed yes/no questions included in the word monitoring task. In order to ensure that participants had been focusing their attention on meaning while performing the task, a minimum of 75% response accuracy was required from each participant to be included in the analysis (see Section 4.3.1 for rationale). No participant had an error rate higher than 25%.
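The group GSI means in Table 21 can be recovered from the mean latencies in Table 20, since the mean of the individual differences equals the difference between the two means. The short sketch below illustrates this, with the index computed as latency on ungrammatical items minus latency on grammatical items. The function and variable names are hypothetical and introduced only for the example.

```python
def grammatical_sensitivity_index(ungrammatical_ms, grammatical_ms):
    """Latency on ungrammatical items minus latency on grammatical items (ms)."""
    return ungrammatical_ms - grammatical_ms


# Group mean latencies from Table 20: (ungrammatical items, grammatical items).
table_20_means = {
    "Control": (1105.66, 1043.47),
    "Early AO": (920.92, 875.09),
    "Late AO": (1704.41, 1711.95),
}

gsi = {
    group: round(grammatical_sensitivity_index(ungrammatical, grammatical), 2)
    for group, (ungrammatical, grammatical) in table_20_means.items()
}

for group, value in gsi.items():
    # Positive values mean slower monitoring after a grammatical violation.
    print(f"{group}: GSI = {value} ms")
```

A positive index reflects a delay in word monitoring when the sentence contains an error, whereas a value near zero or below reflects no such delay.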
In the NS control group, the mean percentage response accuracy was 95.42% (SD = 3.0) (4.58% error rate), in the early L2-learner group it was 92.23% (SD = 4.2) (7.77% error rate), and in the late L2-learner group it was 85.76% (SD = 7.06) (14.24% error rate).

Word monitoring latencies in the NS and early L2 learner groups were higher for ungrammatical items, indicating a delay in participants' word monitoring when the sentence included a grammatical error. In the late L2 learner group, mean latencies for grammatical and ungrammatical items were very similar, and even slightly higher for grammatical items, which yielded a negative GSI in this group. There was, however, considerable individual variation in GSIs among late L2 learners, as shown by the maximum and minimum GSI values.

Group monitoring latencies were compared in a 2x3 mixed factorial ANOVA. The model included a repeated factor with two levels (grammatical and ungrammatical) and a between-subjects factor with three levels (controls, early L2 learners, and late L2 learners). The assumptions of equality of covariance matrices and error variances were not met (p < .001). As a remedial measure (see Footnote 12), and given that the largest standard deviation was more than four times the smallest standard deviation, a more stringent .01 alpha was adopted. Results15 revealed that grammaticality was a significant factor (F(1,110) = 13.777, p < .001, ηp² = .111, Λ = .889), suggesting overall differential sensitivity according to the grammaticality of the item. The average reaction time difference between grammatical and ungrammatical items was 33.49 milliseconds. The interaction between grammaticality and group was also statistically significant (F(2,110) = 6.216, p = .003, ηp² = .102, Λ = .898): NSs' and early L2 learners' reaction times were higher on ungrammatical items (p = .009 and p = .005, respectively), indicating group sensitivity to grammatical violations, whereas late L2 learners'
reaction times were higher on grammatical items (almost overlapping with ungrammatical items), suggesting that, as a group, they were equally sensitive to both types of items (p = .677) (see Figure 18).

15 The results including outliers also showed that grammaticality was a significant main factor (F(1,117) = 9.149, p = .003, ηp² = .073, Λ = .927) and that the two-way interaction with group was significant (F(2,117) = 4.048, p = .020, ηp² = .065, Λ = .935).

Figure 18. Group word monitoring latencies for grammatical and ungrammatical items
[Chart "Word Monitoring Task": mean reaction time (y-axis) by group (x-axis: Controls, Early L2 Learners, Late L2 Learners) for grammatical and ungrammatical items.]

A similar pattern of results was found when the data were separated into agreement structures (gender agreement, number agreement, and person agreement) and non-agreement structures (aspect contrasts, the passive, and the subjunctive). Latencies for grammatical and ungrammatical items were again normally distributed in the NS control group (p = .051 and p = .477 for agreement structures, and p = .071 and p = .230 for non-agreement structures), but not normally distributed in the early L2 learner group (p = .001 and p < .001 for agreement structures, and p = .001 and p = .016 for non-agreement structures) or late L2-learner group (p = .025 and p = .032 for agreement structures, and p = .018 and p = .020 for non-agreement structures). Since non-normality may be caused by the presence of one or more outliers, the distribution of the data was visually inspected. Visual inspection of boxplots showing the distribution of latencies for grammatical and ungrammatical items in each group indicated the presence of five outliers for agreement structures and four outliers for non-agreement structures in the early AO group (see Figures 19 and 20).

Figure 19.
Distribution of word monitoring latencies for agreement items in the early AO group
[Boxplot; axis: Sensitivity_Agreement]

Figure 20. Distribution of word monitoring latencies for non-agreement items in the early AO group
[Boxplot; axis: Sensitivity_NonAgreement]

After removing these cases, normality was met for grammatical and ungrammatical agreement items (p = .981 and p = .826) and non-agreement items (p = .263 and p = .065). In the late AO group, there were also five outliers for agreement structures and three for non-agreement structures (see Figures 21 and 22).

Figure 21. Distribution of word monitoring latencies for agreement items in the late AO group
[Boxplot; axis: Sensitivity_Agreement]

Figure 22. Distribution of word monitoring latencies for non-agreement items in the late AO group
[Boxplot; axis: Sensitivity_NonAgreement]

When these outliers were removed, normality was met for ungrammatical items, both agreement and non-agreement (p = .068 and p = .053). Normality could only be approached for grammatical agreement and non-agreement items (p = .043 and p = .042), but ANOVA is considered robust to mild violations of the normality assumption. The resulting GSIs for agreement and non-agreement items were all normally distributed in each of the groups: NS controls (p = .798 and p = .801), early L2 learners (p = .103 and p = .280), and late L2 learners (p = .394 and p = .842). Table 22 shows the descriptive statistics for word monitoring latencies on grammatical and ungrammatical agreement items, and Table 23 the resulting GSIs.

Table 22.
Word Monitoring Mean Latencies (Agreement Structures)

Group               Ungrammatical Items                   Grammatical Items
                    M                  Range              M                  Range
Control (n = 20)    1113.23 (256.65)   685.17-1625.73     1032.02 (247.44)   718.33-1653.20
Early AO (n = 45)   989.45 (304.68)    776.07-2708.47     958.81 (326.44)    695.90-2779.20
Late AO (n = 45)    1679.06 (747.28)   836.72-3315.47     1673.33 (745.25)   876.50-3208.52

Note. Standard deviations appear between parentheses.

Table 23. Grammatical Sensitivity Index (Agreement Structures)

Group               GSI Agreement
                    M                  Range
Control (n = 20)    81.21 (114.40)     -108.13 to 267.35
Early AO (n = 45)   30.64 (84.41)      -178.73 to 243.67
Late AO (n = 45)    5.73 (104.07)      -201.93 to 224.35

Note. Standard deviations appear between parentheses.

A 2x3 mixed factorial ANOVA showed that grammaticality was a significant factor (F(1,107) = 14.442, p < .001, ηp² = .120, Λ = .880). The average reaction time difference between grammatical and ungrammatical agreement items was 39.19 milliseconds. The interaction between grammaticality and group was also significant (F(2,107) = 3.822, p = .025, ηp² = .067, Λ = .933) (see Figure 23). Differences in word monitoring latencies between grammatical and ungrammatical items were statistically significant in the NS group (t(19) = 2.327, p = .032) and early L2-learner group (t(44) = 2.462, p = .018), but not in the late L2 learner group (t(44) = .370, p = .713). This indicated that NSs and early L2 learners experienced involuntary delays in their responses when sentences included errors of gender, person, or number agreement. On the other hand, late L2 learners did not show the same grammatical sensitivity as a group. Their word monitoring latencies for grammatical items overlapped with those for ungrammatical items, indicating that grammatical violations involving agreement relations did not affect their reaction times.

Figure 23.
Group word monitoring latencies for grammatical and ungrammatical items testing agreement structures (gender, person, and number agreement)
[Bar chart, "Word Monitoring Task": mean reaction time (0-1800) by group (Controls, Early L2 Learners, Late L2 Learners); series: Grammatical Agreement Items, Ungrammatical Agreement Items]

Table 24 shows the descriptive statistics for word monitoring latencies on grammatical and ungrammatical non-agreement items, and Table 25 the resulting GSIs.

Table 24. Word Monitoring Mean Latencies (Non-Agreement Structures)

Group               Ungrammatical Items                   Grammatical Items
                    M                  Range              M                  Range
Control (n = 20)    956.21 (198.13)    664.80-1404.47     874.33 (169.56)    575.00-1203.97
Early AO (n = 46)   908.43 (201.39)    705.20-1690.52     848.99 (172.30)    653.22-1533.07
Late AO (n = 47)    1664.31 (749.89)   759.13-3131.60     1687.93 (768.56)   755.47-3514.55

Note. Standard deviations appear between parentheses.

Table 25. Grammatical Sensitivity Index (Non-Agreement Structures)

Group               GSI Non-agreement
                    M                  Range
Control (n = 20)    81.88 (96.26)      -109.77 to 256.15
Early AO (n = 46)   59.44 (86.78)      -146.00 to 245.07
Late AO (n = 47)    -23.62 (153.66)    -382.95 to 296.00

Note. Standard deviations appear between parentheses.

A 2x3 mixed factorial ANOVA showed that grammaticality was a significant factor (F(1,110) = 9.909, p = .002, ηp² = .083, Λ = .917). The average reaction time difference between grammatical and ungrammatical non-agreement items was 39.23 milliseconds. The interaction between grammaticality and group was also significant (F(2,110) = 7.781, p = .001, ηp² = .124, Λ = .876): NSs' and early L2 learners' word monitoring latencies were higher on ungrammatical items (t(19) = 2.789, p = .012 and t(45) = 4.645, p < .001, respectively), whereas late L2 learners' latencies on grammatical and ungrammatical items were not significantly different (t(46) = -1.065, p = .292) (see Figure 24).
This indicated sensitivity to errors involving the subjunctive, the passive, and aspect contrasts in the NS control and early AO groups, but lack of sensitivity in the late AO group.

Figure 24. Group word monitoring latencies for grammatical and ungrammatical items testing non-agreement structures (aspect, the subjunctive, and the passive)
[Bar chart, "Word Monitoring Task": mean reaction time (0-1800) by group (Controls, Early L2 Learners, Late L2 Learners); series: Grammatical Non-Agreement Items, Ungrammatical Non-Agreement Items]

Sensitivity to agreement and non-agreement structures was comparable in each of the groups, as indicated by non-significant differences between the two GSIs in the control group (t(19) = -.339, p = .739), early AO group (t(40) = -.896, p = .376), and late AO group (t(41) = .580, p = .565). A between-subjects analysis (ANOVA) further revealed that group was a significant factor in both the GSI for agreement structures (F(2,107) = 4.697, p = .011, ηp² = .077) and non-agreement structures (F(2,110) = 7.781, p = .001, ηp² = .124). Bonferroni-adjusted comparisons indicated that NS controls were not significantly different from early L2 learners on either GSI for agreement (p = .195) or non-agreement (p = .795), but significantly different from late L2 learners on both (p = .022 and p = .007, respectively). Early and late L2 learners' sensitivity to non-agreement structures was also significantly different (p = .005), but their sensitivity to agreement structures was comparable (p = .706). Sensitivity to agreement structures could not, therefore, discriminate between early and late L2 learners, indicating that it is a feature that early and late acquisition may have in common, even though early L2 learners did not differ from NSs, either.¹⁶ Finally, group was also a significant factor in overall grammatical sensitivity (F(2,110) = 6.216, p = .003, ηp² = .102).
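The group-level GSIs reported in Tables 23 and 25 are consistent with taking the mean latency on ungrammatical items minus the mean latency on grammatical items. The short Python sketch below reproduces them from the means in Tables 22 and 24. The function and variable names are illustrative assumptions, and the subtraction rule is inferred from the reported values rather than quoted from the study's own definition of the index.

```python
# Group mean latencies in milliseconds, copied from Tables 22 and 24:
# (mean on ungrammatical items, mean on grammatical items).
MEAN_LATENCIES = {
    "agreement": {
        "Control": (1113.23, 1032.02),
        "Early AO": (989.45, 958.81),
        "Late AO": (1679.06, 1673.33),
    },
    "non-agreement": {
        "Control": (956.21, 874.33),
        "Early AO": (908.43, 848.99),
        "Late AO": (1664.31, 1687.93),
    },
}


def gsi(ungrammatical_ms: float, grammatical_ms: float) -> float:
    """Assumed GSI: ungrammatical minus grammatical latency, in ms."""
    return round(ungrammatical_ms - grammatical_ms, 2)


def group_gsis(structure: str) -> dict:
    """Return the GSI of each group for one structure type."""
    return {
        group: gsi(ungrammatical, grammatical)
        for group, (ungrammatical, grammatical) in MEAN_LATENCIES[structure].items()
    }


if __name__ == "__main__":
    for structure in MEAN_LATENCIES:
        print(structure, group_gsis(structure))
```

A positive value means slower word monitoring after a grammatical violation; the negative value for the late AO group on non-agreement items corresponds to the slightly higher latencies on grammatical items described above.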
Multiple comparisons showed that NS controls were not significantly different from early L2 learners (p = .791), but significantly different from late L2 learners (p = .012). Early and late L2 learners were also significantly different (p = .014). These results at the between-subjects level confirmed the patterns observed at a within-subjects level. NSs and early L2 learners were highly sensitive to grammatical errors while monitoring words in a comprehension task. They also displayed comparable amounts of sensitivity. On the other hand, late L2 learners did not show sensitivity to errors as a group and their sensitivity was significantly lower than NSs' and early L2 learners'.

16 Interestingly, early L2 learners' performance on agreement structures also resembled late L2 learners' performance in the untimed visual, timed visual, and metalinguistic tests, according to Scheffé post hoc tests. In the untimed visual, the two groups of learners scored comparably on gender agreement (p = .526) and subject-verb (person) agreement (p = .431). In the metalinguistic test, they scored comparably on gender (p = .110) and subject-verb agreement (p = .853), and approached non-significance on number agreement (p = .040). Finally, in the timed visual, they scored comparably on gender agreement (p = .683). All these analyses yielded significant differences between NSs and early L2 learners (p < .05). The structures that did not yield any significant differences between NSs and early L2 learners were the subjunctive, in most tests, aspect, and the passive (all late L1 acquisitions).

5.2.4 Summary of Language Attainment

Table 26 shows the correlation matrix for the L2 learners' scores (n = 100) on the six language measures.
This study hypothesized that the timed auditory GJT, the timed visual GJT, and the word monitoring task are language measures that require automatic use of L2 knowledge, whereas the untimed auditory GJT, the untimed visual GJT, and the metalinguistic test allow controlled use of L2 knowledge. The hypothesis was motivated by previous research, such as R. Ellis's (2005) psychometric study (recently replicated by Bowles, 2011), which showed that time pressure was a distinguishing factor between tasks that tap implicit and explicit L2 knowledge. As can be observed in the matrix, the six measures were positively correlated. The strongest relationships were between the metalinguistic test, the untimed visual GJT, and the untimed auditory GJT, suggesting that these three tests were measuring the same underlying construct, as hypothesized. However, the correlations between the language measures hypothesized to require automatic use of language knowledge were not so strong. Specifically, the correlations between the GSI, which was computed as an index of sensitivity to grammatical violations in the word monitoring task, and the other two measures hypothesized to require automatic use of language knowledge, the timed visual and timed auditory GJTs, were only moderately weak. In fact, the two timed GJTs were more strongly correlated with the untimed GJTs and the metalinguistic test than with the GSI. This could be due to the nature of the tests, since the GJTs and the metalinguistic test shared the same format and scoring procedure, whereas the GSI was a reaction time measure in milliseconds. The observed pattern of correlations could also suggest that the GSI is an index of a qualitatively different type of linguistic competence, the type of integrated language knowledge that word monitoring tasks have been hypothesized to measure, or, perhaps, an index of L2 processing capacity.
Like the two timed GJTs, the word monitoring task involves performance in real time and minimizes controlled use of L2 knowledge. However, unlike the two timed GJTs and the other tests used in this study, the word monitoring task is carried out in a dual-task framework that focuses participants' attention on sentence meaning and on word monitoring, while all the other measures focus participants' attention on sentence correctness (i.e., language forms) and accuracy of grammaticality judgment.

Table 26. Correlation Matrix for the Six Language Measures (L2 Learners)

           GSI      TA GJT   TV GJT   UA GJT   UV GJT   MKT
GSI        __       .28**    .27**    .27**    .26**    .33**
TA GJT              __       .70**    .80**    .79**    .76**
TV GJT                       __       .70**    .66**    .66**
UA GJT                                __       .84**    .82**
UV GJT                                         __       .86**
MKT                                                     __

Note. GSI = Word Monitoring Task (GSI); TA GJT = Timed Auditory GJT; TV GJT = Timed Visual GJT; UA GJT = Untimed Auditory GJT; UV GJT = Untimed Visual GJT; MKT = Metalinguistic Knowledge Test.
*p < .05 **p < .01

5.3 Cognitive Aptitudes and Language Attainment

In this section, the results of the role of cognitive variables on language outcomes are reported. The section is structured according to type of aptitude: aptitude for explicit learning, aptitude for implicit learning, and general intelligence. Each section is further subdivided into the two types of language outcome measure. These sections present the results of the effects of each type of aptitude on automatic and controlled outcome measures.

5.3.1 Aptitude for Explicit Learning and Language Attainment

This section presents the results of the role of aptitude for explicit learning (i.e., an equally weighted composite score combining LLAMA subtests B, E, and F and GAMA general intelligence scores) on language attainment as measured by tasks that allow controlled use of language knowledge (section 5.3.1.1) and measures that require automatic use of language knowledge (section 5.3.1.2).
First, descriptive data are presented visually on scatterplots that show attainment scores as a function of age of onset with the aptitude for explicit learning dimension added. This visual display makes it possible to determine the extent to which a high level of explicit aptitude is a necessary condition at an individual level in order to score within NS range. Next, multivariate analyses of covariance (MANCOVAs) are reported in order to determine the extent to which aptitude for explicit learning moderates language attainment in each of the groups. A MANCOVA was first conducted on overall test scores, grammatical and ungrammatical, and then re-run on ungrammatical items, agreement items, and non-agreement items in follow-up analyses.

5.3.1.1 Tasks that Allow Controlled Use of Language Knowledge

Figures 25, 26, and 27 display individual scores on the metalinguistic test, untimed visual GJT, and untimed auditory GJT, respectively, as a function of AO with the aptitude for explicit learning dimension added. The NS range is marked with a dotted line. The explicit aptitude groups (high, mid, and low) were created by establishing the following cutoffs on the aptitude for explicit learning composite score in every speaker group: high = z-scores > .5, mid = -.5 < z-scores < .5, and low = z-scores < -.5. The highest scorers on the metalinguistic test in the early AO group were two learners with high explicit language aptitude (represented with black diamond markers). In the late AO group, six learners obtained scores as high as NSs. Three of them had high explicit aptitude (among them the highest scorer), two mid explicit aptitude (represented with dark gray circles), and one low explicit aptitude (represented with a light gray circle), suggesting that explicit language aptitude is advantageous, but not a necessary condition, to score within the NS range on the metalinguistic knowledge test.
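The grouping rule described above (high = z > .5, mid = between -.5 and .5, low = z < -.5, computed within each speaker group) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the composite scores are hypothetical, the sample standard deviation is used for standardization, and a z-score falling exactly on a cutoff is assigned to the mid group, since the text does not specify either choice.

```python
from statistics import mean, stdev


def z_scores(scores):
    """Standardize scores within one speaker group.

    Assumption: the sample standard deviation is used.
    """
    m, s = mean(scores), stdev(scores)
    return [(x - m) / s for x in scores]


def aptitude_group(z, cutoff=0.5):
    """Map a z-score to 'high', 'mid', or 'low' using the +/- .5 cutoffs."""
    if z > cutoff:
        return "high"
    if z < -cutoff:
        return "low"
    return "mid"  # assumption: a score exactly on a cutoff counts as mid


def classify(scores):
    """Classify every member of one speaker group."""
    return [aptitude_group(z) for z in z_scores(scores)]


if __name__ == "__main__":
    # Hypothetical composite aptitude scores for one speaker group.
    hypothetical_composites = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(classify(hypothetical_composites))
```

Because standardization is done separately in each speaker group, "high" and "low" are relative to a participant's own group rather than to the pooled sample.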
On the untimed visual GJT, the highest scorers in the early AO group were also two high explicit aptitude L2 learners, whereas in the late AO group, only one learner with mid explicit aptitude scored as high as NSs. Finally, on the untimed auditory GJT, the highest scorer in the early AO group was a high explicit aptitude L2 learner, while in the late AO group, two mid explicit aptitude L2 learners, but also one low explicit aptitude learner, scored within the NS range.

Figure 25. Metalinguistic knowledge test scores as a function of AO with the explicit language aptitude dimension added
[Scatterplot, "Metalinguistic Knowledge Test": mean % score (0-100) by age of onset (0-32); markers: High, Mid, and Low Explicit Aptitude]

Figure 26. Untimed visual GJT test scores as a function of AO with the explicit language aptitude dimension added
[Scatterplot, "Untimed Visual GJT": mean % score (0-100) by age of onset (0-32); markers: High, Mid, and Low Explicit Aptitude]

Figure 27. Untimed auditory GJT test scores as a function of AO with the explicit language aptitude dimension added

In order to investigate the role of aptitude for explicit learning in participants' language attainment as measured by tasks that allow controlled use of language knowledge, a MANCOVA was conducted with overall test scores on the untimed visual GJT, untimed auditory GJT, and metalinguistic test as dependent variables, group (NS controls, early L2 learners, and late L2 learners) as a fixed factor, and the composite aptitude score combining LLAMA B, E, F, and GAMA (i.e., aptitude for explicit learning) as a covariate. An interaction term was added, in addition to the group and covariate terms, to test for possible interactions between covariate and group as an independent factor.
This is a necessary step to test for any aptitude-treatment interactions (ATIs).¹⁷

17 Cronbach (1957) created ATI as a joint application of experimental and correlational methods, the two main approaches to psychological research at the time. This joint application examined interactions between individual characteristics and treatment variables. The first term, "aptitude", refers to "any measurable person characteristic hypothesized to be propaedeutic to successful goal achievement in the treatment studied" (Snow, 1991:205). "Treatment" has a broad meaning of any experimental variable. "Interaction" is "the degree to which results for two or more treatments differ for persons who also differ on one or more aptitude variables" (Snow, 1991:206). ANCOVA models and covariate-adjusted means should be used when there is no significant interaction with the covariate. In an ATI model, the covariate shows a different relationship to an outcome variable in one treatment from the relationship it shows in another treatment. Factorial analyses are typically used to follow up ATI results. In this dissertation, where an ex-post-facto design was used, no treatment was delivered. ATI in the context of the present study means an interaction between a non-experimental independent variable (speaker group) and aptitude for a given dependent variable (L2 measure).

[Figure 27 scatterplot, "Untimed Auditory GJT": mean % score (0-100) by age of onset (0-32); markers: High, Mid, and Low Explicit Aptitude]

The assumption of equality of error variances (Levene's test) was met for the untimed auditory GJT (p = .114), but not for the untimed visual GJT (p = .035) or metalinguistic test (p = .013). The equality of covariances assumption (Box's test) was not met either (p = .001). Since the largest standard deviation was 9.1 in the untimed visual GJT data and 10.65 in the metalinguistic knowledge test data, and both were less than three times the smallest standard deviation (4.30 and 5.75, respectively), the MANCOVA was considered robust to the violation of the homogeneity assumption. The analysis revealed a non-significant interaction between group and aptitude for explicit learning at the multivariate level (F(6,224) = .950, p = .460, ηp² = .025, Λ = .951), indicating that the effect of aptitude was comparable across the groups and did not differ for any linear combination of test scores. The analysis also showed a significant multivariate effect of aptitude for explicit learning as a covariate (F(3,112) = 3.581, p = .016, ηp² = .088, Λ = .912) on a linear combination of the three dependent measures (the metalinguistic test, the untimed visual GJT, and the untimed auditory GJT). The magnitude of this effect was medium. At the univariate level, the covariate was also significant for each of the three language measures separately, the untimed visual GJT (F(1,114) = 5.308, p = .023, ηp² = .045), the untimed auditory GJT (F(1,114) = 4.639, p = .033, ηp² = .039), both with a small effect size, and the metalinguistic test (F(1,114) = 10.814, p = .001, ηp² = .087), with a medium effect size. Like the interaction at the multivariate level, the interactions between group and covariate at the univariate level were not significant (F(2,114) = 1.705, p = .186, ηp² = .029, F(2,114) = 1.300, p = .277, ηp² = .022, and F(2,114) = .760, p = .470, ηp² = .013, respectively). These results indicated that aptitude for explicit learning moderated language attainment on the three measures of controlled use of knowledge and that moderation was robust at the multivariate level, for a combination of the three controlled measures, as well as at the univariate level, for each of the three measures separately. These effects were comparable among NS controls, early L2 learners, and late L2 learners.
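For a univariate F test, partial eta squared can be recovered from the F ratio and its degrees of freedom as ηp² = (F × df_effect) / (F × df_effect + df_error). The Python sketch below applies this standard relation to two of the univariate covariate effects reported above; the function name is an illustrative assumption, and the relation is offered as a reader's check on the reported effect sizes, not as a description of how the study's software computed them. It does not apply directly to the multivariate tests.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from a univariate F ratio."""
    effect = f_value * df_effect
    return effect / (effect + df_error)


if __name__ == "__main__":
    # Univariate covariate effects reported in the text, F(1,114).
    for label, f_value in [("metalinguistic test", 10.814),
                           ("untimed auditory GJT", 4.639)]:
        print(label, round(partial_eta_squared(f_value, 1, 114), 3))
```

Small discrepancies in the third decimal can arise for other values in the text because the reported F ratios are themselves rounded.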
The non-significant aptitude-treatment interaction suggested that the covariate played a similar role in the two groups of L2 learners. As Figures 28, 29, and 30, respectively, show, when untimed visual GJT, metalinguistic test, and untimed auditory GJT scores were regressed on aptitude for explicit learning composite scores, the slopes of the two L2-learner groups were similar, with the single exception of the untimed auditory GJT. The slopes of the early and late AO groups were both statistically significant for the untimed visual GJT (p = .004 and p = .048, respectively) and metalinguistic test (p = .002 and p = .004, respectively), whereas the slopes of the control group were not (p = .927 and p = .241). In the case of the untimed auditory GJT, only the slope of the early AO group reached significance (p = .011), but not the slope of the late AO or control group (p = .397 and p = .565, respectively).

Figure 28. Regression of untimed visual GJT scores on aptitude for explicit learning composite scores at each group level

Figure 29. Regression of metalinguistic test scores on aptitude for explicit learning composite scores at each group level

Figure 30. Regression of untimed auditory GJT scores on aptitude for explicit learning composite scores at each group level

Given that the non-significant aptitude-treatment interaction suggested that aptitude played a similar role in the two groups of L2 learners, but not in the NS group, a second MANCOVA model was run with NSs (n = 20) and L2 learners (n = 100). The results at the multivariate level remained robust. There was no significant interaction between group and aptitude for explicit learning (F(3,114) = 1.631, p = .186, ηp² = .042, Λ = .958), but a significant multivariate effect of aptitude as a covariate (F(3,114) = 2.741, p = .047, ηp² = .068, Λ = .932).
At the univariate level, the effects of the covariate remained significant for each of the language tests, the untimed visual GJT (F(1,116) = 4.150, p = .044, ηp² = .035), the untimed auditory GJT (F(1,116) = 4.978, p = .028, ηp² = .041), and the metalinguistic test (F(1,116) = 8.137, p = .005, ηp² = .066), and they were further qualified by a significant two-way interaction with group in the case of the untimed visual GJT (F(1,116) = 4.752, p = .031, ηp² = .040) and the metalinguistic test (F(1,116) = 3.948, p = .049, ηp² = .034). The two-way interaction between group and aptitude for the untimed auditory GJT approached significance (F(1,116) = 3.722, p = .056, ηp² = .032). To further determine the effect of explicit aptitude on test scores using a factorial design, follow-up analyses were conducted in each of the groups (NS controls, early L2 learners, and late L2 learners) by comparing high and low explicit aptitude individuals in each group, according to a z-score distribution where high = z-scores > .5, mid = -.5 < z-scores < .5, and low = z-scores < -.5 (see Table 27 for a summary of descriptive statistics).

Table 27. Summary of Overall Test Scores by Participants with High and Low Aptitude for Explicit Learning

                       Control                         Early AO                        Late AO
                       High (n = 7)    Low (n = 5)     High (n = 18)   Low (n = 14)    High (n = 13)   Low (n = 13)
Untimed Visual GJT     88.81 (5.91)    91.00 (4.50)    81.67 (8.30)    73.57 (8.44)    63.08 (7.26)    58.46 (9.14)
Untimed Auditory GJT   94.52 (5.42)    89.33 (10.38)   84.80 (7.70)    76.07 (10.14)   58.33 (8.11)    59.62 (10.83)
Metalinguistic Test    90.95 (5.08)    87.33 (7.96)    82.31 (7.90)    73.10 (9.58)    68.97 (9.04)    59.74 (10.27)

Note. Standard deviations appear between parentheses.

High and low explicit aptitude controls (n = 7 and n = 5, respectively) were not significantly different on any of the three language tests: untimed visual GJT (t(10) = -.694, p = .503), untimed auditory GJT (t(10) = 1.138, p = .282), and metalinguistic test (t(10) = .968, p = .356).
For this group (n = 20), the strongest correlation with aptitude for explicit learning corresponded to the metalinguistic test, and it had the same magnitude as the correlation in the late AO group, r = .35 (p = .135) (the disattenuated correlation¹⁸ was .47). In the early AO group, high and low explicit aptitude L2 learners (n = 18 and n = 14, respectively) differed significantly from each other on the three tests: untimed visual GJT (t(30) = 2.716, p = .011), untimed auditory GJT (t(30) = 2.653, p = .014), and metalinguistic test (t(30) = 2.984, p = .006). Correlations in this group (n = 50) were .38 (p = .007), .34 (p = .016) and .47 (p = .001) for the untimed visual GJT, untimed auditory GJT, and metalinguistic test, respectively (disattenuated correlations were .52, .46, and .63). In the late AO group, high and low explicit aptitude L2 learners (n = 13 and n = 13, respectively) only differed on the metalinguistic knowledge test (t(24) = 2.432, p = .023), but not on the untimed visual or untimed auditory GJT (t(24) = 1.426, p = .167 and t(24) = -.342, p = .736, respectively). In this group, while the correlation between aptitude for explicit learning and performance on the metalinguistic test was significant (r = .36, p = .010), it did not reach significance for the untimed visual GJT (r = .26, p = .066) and untimed auditory GJT (r = .11, p = .472) (disattenuated correlations were .48, .36, and .15).

18 Correlation coefficients disattenuated of measurement error were computed using the formula Rxy = rxy / sqrt(rxx ryy) (i.e., correlation coefficient divided by the square root of the product of the reliabilities of the two tests involved). The disattenuated coefficients suggest the upper bound of possible validity between the measures used.
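The disattenuation formula in footnote 18, Rxy = rxy / sqrt(rxx ryy), can be illustrated with a small Python sketch. The reliability values used below (.80 and .70) are hypothetical placeholders chosen only to show the arithmetic; the reliabilities actually used in the study are not given in this section, and the function name is likewise an assumption.

```python
import math


def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Correct an observed correlation for measurement error.

    r_xy  : observed correlation between the two measures
    rel_x : reliability of the first measure (0 < rel_x <= 1)
    rel_y : reliability of the second measure (0 < rel_y <= 1)
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


if __name__ == "__main__":
    # Hypothetical reliabilities; only the observed r = .35 comes from the text.
    print(round(disattenuate(0.35, 0.80, 0.70), 2))
```

Because the denominator can never exceed 1, the corrected coefficient is never smaller in magnitude than the observed one, which is why the disattenuated values can be read as an upper bound.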
In order to further validate the results obtained for overall test scores (i.e., scores on grammatical and ungrammatical items), MANCOVA analyses were re-run including only scores on ungrammatical items, half of the items on each test (k = 30). Multivariate and univariate analyses with group (NS controls, early L2 learners, and late L2 learners) as a fixed factor yielded no significant interactions (p > .05). At the multivariate level, aptitude for explicit learning remained a significant covariate for a linear combination of the ungrammatical items on the three tests that allow use of controlled language knowledge (F(3,112) = 3.459, p = .019, ηp² = .085, Λ = .915), and, at the univariate level, it remained significant for the metalinguistic test (F(1,114) = 8.876, p = .004, ηp² = .073). The effect size of these associations was medium. Figure 31 displays metalinguistic test scores on ungrammatical items as regressed on aptitude for explicit learning scores in each of the groups. Simple slopes were significant in the early and late L2 learner groups (p = .004 and p = .014, respectively), but not in the control group (p = .256).

Figure 31. Regression of metalinguistic test scores on ungrammatical items on aptitude for explicit learning composite scores at each group level

There were no significant multivariate or univariate interactions, either, when early and late L2 learners were combined as a single group (p > .05), but aptitude for explicit learning remained a significant covariate with a medium effect size at the multivariate level (F(3,114) = 3.054, p = .031, ηp² = .075, Λ = .925) and, at the univariate level, for the metalinguistic test (F(1,116) = 5.911, p = .017, ηp² = .049), although the interaction with group did not reach significance (F(1,116) = 2.108, p = .149, ηp² = .018).
Simple correlations between aptitude and metalinguistic test scores on ungrammatical items in each of the groups showed a significant relationship in the early and late AO groups (r = .46, p = .001 and r = .32, p = .026, respectively) (disattenuated correlations were .62 and .43). In the control group, the correlation had practically the same magnitude as in the late AO group, but it did not reach significance, probably due to the smaller sample size (r = .34, p = .147) (the disattenuated correlation was .46). Differences between high and low explicit aptitude individuals were significant in the early AO group (M = 67.22, SD = 13.92 and M = 49.76, SD = 17.76, respectively) (t(30) = 3.121, p = .004) and approached significance in the late AO group (M = 44.62, SD = 17.98 and M = 17.99, SD = 4.99, respectively) (t(24) = 1.970, p = .060), but were non-significant in the control group (M = 82.38, SD = 9.17 and M = 76.67, SD = 15.28, respectively) (t(10) = .814, p = .435).

A last set of follow-up MANCOVA analyses was run distinguishing between items testing agreement structures (k = 30) (gender, person, and number agreement) and non-agreement structures (k = 30) (aspect contrasts, the subjunctive, and the passive) on every language test. As for agreement items, multivariate and univariate analyses with group (NS controls, early L2 learners, and late L2 learners) as a fixed factor yielded no significant interactions (p > .05). Aptitude for explicit learning, however, was a significant covariate with a medium effect size at the multivariate level (F(3,112) = 2.772, p = .045, ηp² = .071, Λ = .929) and, at the univariate level, for agreement items on the metalinguistic test (F(1,114) = 5.302, p = .023, ηp² = .046). It also approached significance for agreement items on the untimed auditory GJT (F(1,114) = 3.882, p = .051, ηp² = .034), but it was not significant for the untimed visual GJT (F(1,114) = .163, p = .687, ηp² = .001).
When L2 learners were combined as one group, interactions at the multivariate and univariate level remained non-significant (p > .05). Aptitude for explicit learning was not a significant covariate at the multivariate level, either, although it had a p value of .087 (F(3,114) = 2.241, ηp² = .056, Λ = .944). At the univariate level, aptitude was significant for the untimed auditory GJT (F(1,116) = 3.990, p = .048, ηp² = .034) and approached significance for the metalinguistic test (F(1,116) = 3.776, p = .054, ηp² = .032), in both cases with a small effect size, but it was not significant for the untimed visual GJT (F(1,116) = .268, p = .606, ηp² = .002). Follow-up simple correlations revealed that the relationship between explicit aptitude and test performance for agreement items on the untimed auditory GJT and the metalinguistic test was not significant in the control group (r = .18, p = .453 and r = .16, p = .513, respectively) (disattenuated correlations were .24 and .22), but significant in the early AO group (r = .33, p = .023 and r = .35, p = .014) (disattenuated correlations were .44 and .47). In the late AO group, only the correlation with test scores for agreement items on the metalinguistic test was significant (r = .39, p = .006), not the correlation with test scores for agreement items on the untimed auditory GJT (r = .17, p = .236) (disattenuated correlations were .53 and .23).¹⁹ Differences between high and low explicit aptitude individuals in the early AO group were significant for the untimed auditory GJT (t(30) = 2.775, p = .010, mean difference of 12.34) and the metalinguistic test (t(30) = 2.098, p = .044, mean difference of 9.47).
In the late AO group, only differences between high and low explicit aptitude individuals on the metalinguistic test were significant (t(24) = 2.599, p = .016, mean difference of 12.31), not on the untimed auditory GJT (t(24) = .360, p = .722, mean difference of 1.54). In the control group, there were no differences between high and low explicit aptitude individuals for either the untimed auditory GJT (t(10) = 1.033, p = .326, mean difference of 4.86) or the metalinguistic test (t(10) = .778, p = .454, mean difference of 3.43).

19 This stronger relationship between aptitude for explicit learning and performance on the untimed auditory GJT in the early AO group did not result in a significant group x covariate interaction, either with a three-level group factor of controls, early L2 learners, and late L2 learners (F(6,224) = 1.437, p = .202, ηp² = .037, Λ = .927) or with a two-level group factor of early and late L2 learners (F(1,96) = 2.392, p = .125, ηp² = .025).

Regarding non-agreement items, multivariate and univariate analyses with group (NS controls, early L2 learners, and late L2 learners) as a fixed factor and aptitude for explicit learning composite scores as a covariate yielded no significant interactions (p > .05). However, aptitude was a significant covariate with medium effect sizes at the multivariate level (F(3,112) = 3.333, p = .022, ηp² = .083, Λ = .917), as well as for non-agreement items on the untimed visual GJT (F(1,114) = 7.744, p = .006, ηp² = .064) and the metalinguistic test (F(1,114) = 7.473, p = .007, ηp² = .062), but not for non-agreement items on the untimed auditory GJT (F(1,114) = 1.387, p = .241, ηp² = .012). When L2 learners were combined as a single group, aptitude for explicit learning remained a significant covariate at the multivariate level (F(3,114) = 2.725, p = .048, ηp² = .067, Λ
= .933), as well as at the univariate level for non-agreement items on the untimed visual GJT (F(1,116) = 8.717, p = .004, ?p 2 = .071) and metalinguistic test (F(1,116) = 7.128, p = .009, ?p 2 = .058), but not for non-agreement items on the untimed auditory GJT (F(1,116) = 1.691, p = .196, ?p 2 = .015). In order to examine the effect of aptitude for explicit learning in each of the speaker groups separately, follow-up correlational and factorial analyses, were performed. Follow-up correlations showed a significant relationship between aptitude and performance on non-agreement structures on the untimed visual GJT in the early and late AO groups (r = .40, p = .004 and r = .35, p = .013, respectively) 137 (disattenuated correlations were .55 and .48), but not in the control group (r = .17, p = .475) (the disattenuated correlation was .23). The correlation with performance on the metalinguistic test was only significant in the early group (r = .38, p = .007) (the disattenuated correlation was .51). In the late AO and control groups, the relationship was moderately weak20 and positive, but non-significant (r = .21, p = .151 and r = .28, p = .224, respectively) (disattenuated correlations were .28 and .38). The results of the factorial analyses confirmed the correlational patterns. Differences between high and low explicit aptitude individuals were significant in the early AO group on the untimed visual GJT (t(30) = 2.940, p = .006, mean difference of 8.36) and metalinguistic test (t(30) = 3.179, p = .003, mean difference of 8.97). In the late AO group, only differences on the untimed visual GJT were significant (t(24) = 2.551, p = .018, mean difference of 8.21), but they did not reach significance on the metalinguistic test (t(24) = 1.475, p = .153, mean difference of 6.15). 
In the control group, there were no differences between high and low explicit aptitude individuals on either the untimed visual GJT (t(10) = -.177, p = .863, mean difference of -0.76) or the metalinguistic test (t(10) = .865, p = .407, mean difference of 3.81).

20 Following Cohen (1988), the strength of a linear relationship can be weak (0 < r < .20), moderately weak (.20 < r < .40), moderate (.41 < r < .60), moderately strong (.61 < r < .80), and strong (.81 < r < 1.0).

To summarize, aptitude for explicit learning moderated both early and late L2 learners' language attainment, as measured by untimed tests that focus participants' attention on language correctness and that allow controlled use of L2 knowledge, but it did not moderate the performance of NS controls. The effect of aptitude for explicit learning was observed at the multivariate level, in a combination of the three untimed measures, as well as at the univariate level, in each measure separately. Two of these measures were visual (the untimed visual GJT and the metalinguistic knowledge test) and one was auditory (the untimed auditory GJT). In the late AO group, only performance on visual tests was moderated by level of aptitude for explicit learning, while in the early AO group, performance on both untimed modalities, visual and auditory, showed a relationship with aptitude for explicit learning, although this difference did not yield any significant interactions between L2-learner group and covariate for the untimed auditory GJT. When only ungrammatical items were considered, aptitude effects remained robust for the metalinguistic test in the two L2 learner groups. This was the test that encouraged the greatest attention to language forms and the one for which the effect size of aptitude for explicit learning as a covariate was the largest.

The fact that aptitude for explicit learning moderated early, but not late, L2 learners' performance on the untimed auditory GJT seems to suggest that the auditory modality could have placed processing constraints on L2 learners that prevented them from making use of controlled L2 knowledge, even if the test was performed under untimed testing conditions. If this was the case, those late L2 learners with higher aptitude for explicit learning as measured by an auditory test should have been able to score higher on the untimed auditory GJT. The LLAMA aptitude subtest E, which was used as part of the composite of aptitude for explicit learning, requires test takers to work out relationships between sounds they hear and a writing system. The test gives participants time to freely navigate and work out those relationships by listening to the target sounds as many times as they wish within the established testing time.

As a follow-up test to the results obtained for the untimed auditory GJT in the late AO group, the relationship between late L2 learners' aptitude for explicit learning and performance on the untimed auditory GJT was further investigated by examining LLAMA E test scores. Regarding overall test performance, while the correlation with aptitude for explicit learning was .11 (p = .421), it was .27 (p = .061) with LLAMA E scores (disattenuated correlations were .15 and .37). Similarly, when only ungrammatical items on the untimed auditory test were considered, the correlation increased from .01 (p = .997) to .28 (p = .051) (disattenuated correlations were .01 and .38). For agreement items testing gender, person, and number agreement, the increase was from .17 (p = .236) to .29 (p = .051) (disattenuated correlations were .23 and .40), and only for non-agreement items testing aspect contrasts, the subjunctive, and the passive, did the correlation not approach significance (from r = .04, p = .780 to r = .15, p = .316) (disattenuated correlations were .05 and .21).
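The disattenuated correlations reported next to each observed r correct for measurement error in the two instruments. A minimal sketch of the standard correction for attenuation (observed r divided by the square root of the product of the two reliabilities) is given below. The reliability values are hypothetical placeholders chosen for illustration, not the reliabilities estimated in this study, and the function name is likewise an assumption.

```python
import math


def disattenuate(r_observed: float, rel_x: float, rel_y: float) -> float:
    """Correct an observed correlation for unreliability in both measures.

    Standard correction for attenuation: r_xy / sqrt(r_xx * r_yy).
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_observed / math.sqrt(rel_x * rel_y)


# Hypothetical reliabilities, NOT taken from the study.
REL_APTITUDE = 0.75
REL_GJT = 0.72

corrected = disattenuate(0.27, REL_APTITUDE, REL_GJT)
print(f"observed r = .27, disattenuated r = {corrected:.2f}")
```

Because both reliabilities are at most 1, the corrected value is never smaller in magnitude than the observed one, which is the pattern seen in every pair of observed and disattenuated correlations above.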
5.3.1.2 Tasks that Require Automatic Use of Language Knowledge

Figures 32, 33, and 34 display individual scores on the timed visual GJT, timed auditory GJT, and word monitoring task as a function of AO with the aptitude for explicit learning dimension added. The NS range is marked with a dotted line. The explicit aptitude groups (high, mid, and low) were created by establishing the following cutoffs on the aptitude for explicit learning composite score in every speaker group: high = z-scores > .5, mid = -.5 < z-scores < .5, and low = z-scores < -.5.

The highest scorer on the timed visual GJT in the early AO group was a high explicit aptitude L2 learner, whereas, in the late AO group, a combination of high, mid, and low explicit aptitude L2 learners scored within NS range. On the timed auditory GJT, a high explicit aptitude L2 learner obtained the highest score in the early AO group, while, in the late AO group, mid and low explicit aptitude learners overlapped within NS range. Finally, the highest grammatical sensitivity indices on the word monitoring task corresponded to high explicit aptitude L2 learners in the two learner groups. The scatterplot for the word monitoring task further shows that practically all the L2 learners' sensitivity scores were within NS range. Due to the nature of reaction-time data, however, it is not possible to talk about ceiling effects. In addition, as the analyses of group GSIs in section 5.2.3 indicated, the performance of the three speaker groups on the task was not comparable. Late L2 learners' sensitivity scores were significantly lower than NSs' and early L2 learners' scores. Also, the difference between word monitoring latencies for grammatical and ungrammatical items on the task, which was used to compute GSIs, was non-significant in the late AO group, but significant in the NS control and early AO groups.

Figure 32. Timed visual GJT scores as a function of AO with the explicit language aptitude dimension added
[Scatterplot: mean % score by age of onset; series: high, mid, and low explicit aptitude.]

Figure 33. Timed auditory GJT scores as a function of AO with the explicit language aptitude dimension added
[Scatterplot: mean % score by age of onset; series: high, mid, and low explicit aptitude.]

Figure 34. Word monitoring task scores (GSI) as a function of AO with the explicit language aptitude dimension added
[Scatterplot: mean reaction time difference (msec) by age of onset; series: high, mid, and low explicit aptitude.]

In order to investigate the role of aptitude for explicit learning on participants' language attainment as measured by tasks hypothesized to require automatic use of language knowledge, a MANCOVA was conducted with overall test scores on the timed visual GJT, timed auditory GJT, and word monitoring task (i.e., GSI) as dependent variables, group (NS controls, early L2 learners, and late L2 learners) as fixed factor, and the composite aptitude score combining LLAMA B, E, F, and GAMA (i.e., aptitude for explicit learning) as a covariate. An interaction term was added, in addition to the group and covariate terms, to test for possible interactions between covariate and group as an independent factor. The assumption of equality of error variances was met for the timed visual GJT (p = .224), the timed auditory GJT (p = .062), and word monitoring task (p = .637). The equality of covariances assumption was also met (p = .462).
At the multivariate level, the interaction between aptitude for explicit learning and group was not significant (F(6,210) = .381, p = .891, ηp² = .011, Λ = .979). Explicit aptitude as a multivariate covariate did not reach significance, either (F(3,105) = 2.460, p = .067, ηp² = .065, Λ = .935). At the univariate level, aptitude was significant for the word monitoring task (F(1,107) = 5.191, p = .025, ηp² = .046), with a small effect size, and it had a p value of .087 for the timed auditory GJT (F(1,107) = 2.981, p = .087, ηp² = .027). It was not significant for the timed visual GJT (F(1,107) = 1.067, p = .304, ηp² = .010). Two-way interactions between group and covariate were all non-significant: timed visual GJT (F(2,107) = .546, p = .581, ηp² = .010), timed auditory GJT (F(2,107) = .376, p = .687, ηp² = .007), and word monitoring task (F(2,107) = .481, p = .620, ηp² = .009).21

When L2 learners were combined as a single group and compared against NS controls, interactions at the multivariate and univariate level remained non-significant (p > .05). Aptitude for explicit learning remained a non-significant covariate at the multivariate level, although it had a p value of .063 (F(3,107) = 2.502, p = .063, ηp² = .065, Λ = .935). At the univariate level, it also remained non-significant for the timed visual GJT (F(1,109) = 1.987, p = .162, ηp² = .018), whereas it was significant for the word monitoring task (F(1,109) = 4.282, p = .041, ηp² = .037) and for the timed auditory GJT (F(1,109) = 4.575, p = .035, ηp² = .040). Effect sizes were all small. The interactions between group and covariate were all non-significant, at the multivariate level (F(3,107) = .521, p = .669, ηp² = .014, Λ = .986), and, at the univariate level, for the timed visual GJT (F(1,109) = 1.379, p = .243, ηp² = .012), timed auditory GJT (F(1,109) = 1.223, p = .271, ηp² = .011), and word monitoring task (F(1,109) = .095, p = .759, ηp² = .001).22

21 The analysis including outliers in the word monitoring task yielded similar results. At the multivariate level, there was no interaction between group and covariate (F(6,224) = .341, p = .915, ηp² = .009, Λ = .982) and aptitude for explicit learning approached significance as a covariate (F(3,112) = 2.653, p = .052, ηp² = .066, Λ = .934). At the univariate level, aptitude was a significant covariate for the timed auditory GJT (F(1,114) = 4.122, p = .045, ηp² = .035) and the word monitoring task (F(1,114) = 4.665, p = .033, ηp² = .039), but not for the timed visual GJT (F(1,114) = 1.327, p = .252, ηp² = .012). There were no significant interactions with group (p > .05).

22 The analysis including outliers in the word monitoring task yielded similar results. At the multivariate level, there was no interaction between group and covariate (F(3,112) = .510, p = .676, ηp² = .013, Λ = .987) and aptitude for explicit learning approached significance as a covariate (F(3,112) = 2.478, p = .065, ηp² = .061, Λ = .939). At the univariate level, aptitude was a significant covariate for the timed auditory GJT (F(1,116) = 4.924, p = .028, ηp² = .041) and approached significance for the word monitoring task (F(1,116) = 3.726, p = .056, ηp² = .031), but it was not significant for the timed visual GJT (F(1,116) = 1.951, p = .165, ηp² = .017). There were no significant interactions with group (p > .05).

In order to examine the effect of aptitude for explicit learning in each of the speaker groups separately, follow-up correlational and factorial analyses were performed on the two tests that yielded significant results for the entire set of participants (the word monitoring task and the timed auditory GJT). Follow-up simple correlations between aptitude for explicit learning and performance on the word monitoring task and the timed auditory GJT were computed for each of the groups. Correlations had similar magnitudes across groups, but did not reach significance in any case. For the word monitoring task, the correlations in the control, early AO, and late AO groups were .29 (p = .217), .28 (p = .062), and .20 (p = .182), respectively (disattenuated correlations were .37, .36, and .26). For the timed auditory GJT, correlations were .21 (p = .384), .27 (p = .056), and .26 (p = .075), respectively (disattenuated correlations were .28, .36, and .34).

Factorial analyses with high and low explicit aptitude individuals further showed no significant differences on the word monitoring task among NS controls (t(10) = 1.308, p = .220, mean difference of 78.71), early L2 learners (t(30) = .798, p = .431, mean difference of 41.24), or late L2 learners (t(24) = 1.255, p = .221, mean difference of 52.01). Score differences on the timed auditory GJT were not significant among controls (t(10) = 1.285, p = .228, mean difference of 3.16) or late L2 learners (t(24) = .483, p = .634, mean difference of 1.47), either, but high explicit aptitude early L2 learners outperformed their low-aptitude counterparts (t(30) = 2.385, p = .024, mean difference of 7.60). Although a significant effect of explicit aptitude was observed only in the early AO group, a MANCOVA conducted on the two L2-learner groups did not show a significant interaction between aptitude for explicit learning and L2-learner group (early vs. late) for scores on the timed auditory GJT (F(1,96) = 1.594, p = .210, ηp² = .016). This indicated that the significant effect of explicit aptitude on early L2 learners' timed auditory GJT scores was not significantly different from the effect on late L2 learners' scores (as reported above, in the late AO group, the correlation between explicit aptitude and timed auditory GJT scores was .26, p = .075).
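The partial eta squared values that accompany each F test can be recovered from the reported statistics. The sketch below assumes the two conventional identities: for a univariate test, ηp² = (F × df_effect) / (F × df_effect + df_error); for a multivariate test, ηp² = 1 − Λ^(1/s), where s is taken to be the smaller of the number of dependent variables and the effect degrees of freedom. The source does not state which software or formulas were used, so both identities and the function names are assumptions.

```python
def partial_eta_sq_from_f(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared for a univariate F test."""
    ss_ratio = f_value * df_effect
    return ss_ratio / (ss_ratio + df_error)


def partial_eta_sq_from_wilks(wilks_lambda: float, s: int) -> float:
    """Partial eta squared for a multivariate test based on Wilks' lambda.

    s is assumed to be min(number of dependent variables, effect df).
    """
    if s < 1:
        raise ValueError("s must be a positive integer")
    return 1.0 - wilks_lambda ** (1.0 / s)


# Values taken from the analyses reported above.
interaction_two_groups = partial_eta_sq_from_f(1.594, 1, 96)  # F(1,96) = 1.594
covariate_mv = partial_eta_sq_from_wilks(0.935, s=1)          # covariate, Lambda = .935
interaction_mv = partial_eta_sq_from_wilks(0.979, s=2)        # 3 groups x 3 DVs, Lambda = .979

for label, value in [
    ("two-group interaction", interaction_two_groups),
    ("multivariate covariate", covariate_mv),
    ("multivariate interaction", interaction_mv),
]:
    print(f"{label}: {value:.3f}")
```

Under these assumptions the three computed values round to the effect sizes reported in the text for the same tests (.016, .065, and .011).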
When only ungrammatical items on the timed visual and timed auditory GJTs were considered (the word monitoring task does not provide an interpretable measure for ungrammatical items only), multivariate and univariate analyses with group (NS controls, early L2 learners, and late L2 learners) as a fixed factor yielded no significant interactions (p > .05). Aptitude for explicit learning was not a significant covariate at the multivariate level, either (p > .05), but, at the univariate level, aptitude approached significance for the timed auditory GJT (F(1,114) = 3.787, p = .054, ηp² = .032). When L2 learners were combined as a single group, explicit aptitude remained a non-significant covariate at the multivariate level (F(2,115) = 2.496, p = .087, ηp² = .042, Λ = .958) and the interaction with group was not significant, either (F(2,115) = .344, p = .709, ηp² = .006, Λ = .994). At the univariate level, interactions were not significant (p > .05) and explicit aptitude was not a significant covariate for the timed visual GJT (F(1,116) = 1.225, p = .271, ηp² = .010). The effect of the covariate on the timed auditory GJT, however, reached significance (F(1,116) = 4.782, p = .031, ηp² = .040).

In order to examine the effect of aptitude for explicit learning in each of the speaker groups separately, follow-up correlational and factorial analyses were performed on the test that yielded significant results for the entire set of participants (the timed auditory GJT). Simple correlations in each of the groups revealed that the relationship between aptitude scores and scores on ungrammatical items on the timed auditory GJT was not significant in the control group (r = .18, p = .467) or late AO group (r = -.03, p = .819), but significant in the early AO group (r = .30, p = .035) (disattenuated correlations were .24, -.04, and .40).
High explicit aptitude early L2 learners (n = 18) (M = 67.18, SD = 15.94) scored significantly higher than low explicit aptitude early L2 learners (n = 14) (M = 52.40, SD = 16.93) (t(30) = 2.533, p = .017, mean difference of 14.78). High and low explicit aptitude participants in the control and late AO groups were not significantly different (p = .210 and p = .947, respectively). Although the effect of explicit aptitude on the timed auditory GJT was only observed in the early AO group, a MANCOVA conducted on the two L2-learner groups did not yield a significant interaction (F(1,96) = .804, p = .372, ηp² = .008). Therefore, explicit aptitude had a comparable effect in the two groups, even if it only reached significance in the early AO group.

A last set of follow-up analyses was run distinguishing between agreement items (k = 30) (gender, person, and number agreement) and non-agreement items (k = 30) (aspect contrasts, the subjunctive, and the passive) on every language test. As for agreement items, multivariate and univariate analyses with group (NS controls, early L2 learners, and late L2 learners) as a fixed factor yielded no significant interactions (p > .05). Aptitude for explicit learning was not a significant covariate at the multivariate level (F(3,102) = 1.888, p = .136, ηp² = .051, Λ = .949) or, at the univariate level, for the timed auditory GJT (F(1,104) = 1.438, p = .233, ηp² = .013), or word monitoring task (i.e., GSI) (F(1,104) = 1.534, p = .218, ηp² = .014), but it was significant for the timed visual GJT (F(1,104) = 4.239, p = .042, ηp² = .038).23 When L2 learners were combined as one group, interactions at the multivariate and univariate level were all non-significant, as well (p > .05). As a covariate, aptitude for explicit learning was not significant for agreement items at the multivariate level (F(3,104) = 1.897, p = .135, ηp² = .050, Λ = .950) or, at the univariate level, for the timed auditory GJT (F(1,106) = 2.103, p = .150, ηp² = .018) or word monitoring task (F(1,106) = 2.246, p = .137, ηp² = .019), but it approached significance for the timed visual GJT (F(1,106) = 3.729, p = .056, ηp² = .031).24

23 The analysis including outliers in the word monitoring task also yielded non-significant results at the multivariate level. There was no interaction between group and covariate (F(6,224) = .446, p = .847, ηp² = .012, Λ = .977) and aptitude for explicit learning was not a significant covariate (F(3,112) = 2.274, p = .084, ηp² = .057, Λ = .943). At the univariate level, aptitude was not significant for the timed auditory GJT (F(1,114) = .758, p = .386, ηp² = .007) or word monitoring task (F(1,114) = 1.518, p = .220, ηp² = .013), but it was significant for the timed visual GJT (F(1,114) = 5.526, p = .020, ηp² = .046). There were no significant interactions with group (p > .05).

24 The analysis including outliers in the word monitoring task also yielded non-significant results at the multivariate level. There was no interaction between group and covariate (F(3,114) = .509, p = .677, ηp² = .013, Λ = .987) and aptitude for explicit learning was not a significant covariate (F(3,114) = 2.079, p = .107, ηp² = .052, Λ = .948). At the univariate level, aptitude was not significant for the timed auditory GJT (F(1,116) = 2.075, p = .152, ηp² = .018) or word monitoring task (F(1,116) = 2.291, p = .133, ηp² = .019) but it was significant for the timed visual GJT (F(1,116) = 4.340, p = .039, ηp² = .036). There were no significant interactions with group (p > .05).

In order to examine the effect of aptitude for explicit learning in each of the speaker groups separately, follow-up correlational and factorial analyses were performed on the test that yielded significant results for the entire set of participants (the timed visual GJT). Follow-up correlations in each of the groups revealed a significant correlation between aptitude and agreement items on the timed visual GJT only in the early AO group (r = .36, p = .010), but not in the NS control or late AO groups (r = .13, p = .576, and r = .17, p = .242, respectively) (disattenuated correlations were .48, .18, and .23). High explicit aptitude early L2 learners (n = 18) (M = 75.10, SD = 8.08) scored significantly higher on agreement items than their low-aptitude counterparts (n = 14) (M = 65.20, SD = 8.79) (t(30) = 3.311, p = .002, mean difference of 9.90).25 High and low explicit aptitude participants in the control and late AO groups, on the other hand, were not significantly different (p = .509 and p = .485, respectively). The interaction between aptitude and L2-learner group was not significant (F(1,96) = 2.165, p = .144, ηp² = .022). Therefore, explicit aptitude had a comparable effect on agreement scores on the timed visual GJT in the two learner groups, even if it only reached significance in the early AO group.

25 The subtest in the aptitude composite that contributed the most to the significant relationship with timed visual GJT scores in the early AO group was LLAMA F, the grammatical inferencing subtest, with a correlation of .37 (p = .009).

Regarding non-agreement items, multivariate and univariate analyses with group (NS controls, early L2 learners, and late L2 learners) as a fixed factor and aptitude for explicit learning as a covariate yielded no significant interactions (p > .05). Aptitude was not a significant covariate at the multivariate level, either (F(3,105) = 2.207, p = .092, ηp² = .059, Λ = .941).
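The comparisons between high and low explicit aptitude learners reported above can be checked from the published group sizes, means, and standard deviations. The following is a minimal sketch that assumes an equal-variance (pooled) independent-samples t test, which is consistent with the reported degrees of freedom (n1 + n2 − 2 = 30) but is not stated explicitly in the text; the function name is hypothetical.

```python
import math


def pooled_t(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> tuple[float, int]:
    """Independent-samples t statistic from summary statistics (pooled variance)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


# Early AO group, agreement items on the timed visual GJT:
# high aptitude (n = 18, M = 75.10, SD = 8.08) vs. low aptitude (n = 14, M = 65.20, SD = 8.79).
t_value, df = pooled_t(75.10, 8.08, 18, 65.20, 8.79, 14)
print(f"t({df}) = {t_value:.2f}")
```

Under the pooled-variance assumption the statistic comes out close to the reported t(30) = 3.311; small discrepancies in the third decimal are expected because the published means and standard deviations are rounded.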
At the univariate level, aptitude for explicit learning was not a significant covariate for the timed visual GJT (F(1,107) = .013, p = .909, ηp² = .000) or word monitoring task (i.e., GSI) (F(1,107) = .218, p = .641, ηp² = .002), but it was significant for the timed auditory GJT (F(1,107) = 6.809, p = .010, ηp² = .057).26 This relationship remained significant when L2 learners were combined as one group (F(1,109) = 5.468, p = .021, ηp² = .046). Interactions were all non-significant (p > .05) and aptitude for explicit learning remained a non-significant covariate for the timed visual GJT (F(1,109) = .776, p = .380, ηp² = .007) and word monitoring task (i.e., GSI) (F(1,109) = .754, p = .387, ηp² = .007).27

26 The analysis including outliers in the word monitoring task also yielded a non-significant interaction between group and covariate at the multivariate level (F(6,224) = .227, p = .967, ηp² = .006, Λ = .988), but a significant covariate effect (F(3,112) = 2.732, p = .047, ηp² = .068, Λ = .932). At the univariate level, aptitude was a significant covariate for non-agreement items on the timed auditory GJT (F(1,114) = 5.057, p = .026, ηp² = .042) and it approached significance for the word monitoring task (F(3,114) = 3.366, p = .069, ηp² = .029), but it was non-significant for the timed visual GJT (F(1,114) = .020, p = .888, ηp² = .000). There were no significant interactions with group (p > .05).

27 The analysis including outliers in the word monitoring task also yielded a significant effect of the covariate on non-agreement items on the timed auditory GJT (F(1,116) = 7.247, p = .008, ηp² = .059) and a non-significant effect on the timed visual GJT (F(1,116) = .704, p = .403, ηp² = .006) and word monitoring task (F(1,116) = 2.370, p = .126, ηp² = .020).

In order to examine the effect of aptitude for explicit learning in each of the speaker groups separately, follow-up correlational and factorial analyses were performed on the test that yielded significant results for the entire set of participants (the timed auditory GJT). Follow-up correlations in each of the groups revealed a significant correlation between explicit aptitude and non-agreement items on the timed auditory GJT in the early AO group only (r = .31, p = .032) (the disattenuated correlation was .41). High explicit aptitude early L2 learners (n = 18) (M = 84.44, SD = 8.50) scored significantly higher than their low-aptitude counterparts (n = 14) (M = 74.93, SD = 8.67) (t(30) = 3.114, p = .004, mean difference of 9.52).28 In the control group, the correlation between aptitude and non-agreement items on the timed auditory GJT had a slightly larger magnitude than in the early AO group, but did not reach significance (r = .32, p = .163), and, in the late AO group, the relationship was weak and non-significant (r = .19, p = .211) (disattenuated correlations were .42 and .25). High and low explicit aptitude NSs and high and low explicit aptitude late L2 learners were not significantly different (p = .168 and p = .692, respectively). The interaction between aptitude and L2-learner group was not significant (F(1,96) = 1.614, p = .207, ηp² = .017), indicating a comparable effect of aptitude after all.

To summarize, as expected, aptitude for explicit learning was not a significant covariate at the multivariate level for the language measures hypothesized to require automatic use of language knowledge (timed auditory GJT, timed visual GJT, and word monitoring task).
There were no significant interactions with group, either, suggesting a comparable relationship between aptitude and language attainment in the three groups (NS controls, early L2 learners, and late L2 learners). Although there were no significant interactions between group and covariate, univariate tests and follow-up analyses showed an unexpected relationship in the early AO group between aptitude for explicit learning and performance on the timed auditory and timed visual GJTs that was not present in the late AO group. Early L2 learners with high explicit aptitude performed significantly better than their low-aptitude counterparts on the timed auditory GJT, both overall and when only ungrammatical items were considered. Specifically, there was a relationship between early L2 learners' aptitude for explicit learning and scores on items testing non-agreement structures (aspect contrasts, the subjunctive, and the passive), but no effect on items testing agreement structures (gender, person, and number agreement) when the GJT was timed and auditory.29 When the GJT was visual, however, early L2 learners with high explicit aptitude scored higher on agreement items than their low-aptitude counterparts. The relationship between early L2 learners' explicit aptitude composite score and performance on the timed auditory GJT was mostly due to a significant correlation with LLAMA E, the aptitude subtest measuring sound-symbol correspondence ability, which was also found to be related to early and late L2 learners' performance on the untimed auditory GJT. On the other hand, the relationship between early L2 learners' explicit aptitude and performance on the timed visual GJT was due to a significant correlation with LLAMA F, the aptitude subtest measuring grammatical inferencing via pictures and written stimuli. Finally, as predicted, aptitude for explicit learning did not moderate participants' grammatical sensitivity as measured by the word monitoring task, the task hypothesized to be at the extreme of the continuum of tasks requiring automatic use of L2 knowledge.

28 The subtest in the aptitude composite that contributed the most to the significant relationship with timed auditory GJT scores in the early AO group was LLAMA E, the sound-symbol correspondence subtest, with a correlation of .30 (p = .036).

29 A trend towards dissociation was observed in the data between agreement and non-agreement items on the timed auditory GJT in the early AO group. Early L2 learners with high aptitude for explicit learning scored higher than their low-aptitude counterparts on non-agreement structures, while early L2 learners with high aptitude for implicit learning scored higher than their low-aptitude counterparts on agreement structures. The difference on timed auditory GJT scores for non-agreement items between early L2 learners with high and low aptitude for explicit learning was significant (t(30) = 3.114, p = .004, mean difference of 9.52), but not the difference on scores for agreement items (t(30) = 1.546, p = .133, mean difference of 6.38). On the other hand, the difference on timed auditory GJT scores for agreement items between early L2 learners with high and low aptitude for implicit learning approached significance (t(28) = 1.779, p = .086, mean difference of 7.30), but the difference on scores for non-agreement items was not significant (t(28) = .122, p = .904, mean difference of 0.48).

5.3.2 General Intelligence and Language Attainment

This section presents results on the role of general intelligence in language attainment as measured by tasks that allow controlled use of language knowledge (section 5.3.2.1) and measures that require automatic use of language knowledge (section 5.3.2.2).
First, descriptive data is presented visually on scatterplots that show attainment scores as a function of age of onset with the general intelligence dimension added. This visual display makes it possible to determine the extent to which high intelligence is a necessary condition at an individual level in order to score within NS range. Next, multivariate analyses of covariance (MANCOVAs) are conducted in order to determine the extent to which intelligence moderates language attainment in each of the groups. A MANCOVA was first conducted on overall test scores, grammatical and ungrammatical, and, then, re-run on ungrammatical items, agreement items, and non-agreement items as follow-up analyses.

5.3.2.1 Tasks that Allow Controlled Use of Language Knowledge

Figures 35, 36, and 37 display individual scores on the metalinguistic knowledge test, untimed visual GJT, and untimed auditory GJT as a function of AO with the general intelligence dimension added. The NS range is marked with a dotted line. Like the aptitude groups, the general intelligence groups were created by converting GAMA raw scores into z-scores within each of the three speaker groups30 and by establishing the following cutoffs reflecting distance from the mean in standard deviations: high = z-scores > .5, mid = -.5 < z-scores < .5, and low = z-scores < -.5. The highest scorers on the three tests were either high- or mid-intelligence L2 learners in both the early and late AO groups, with the exception of the untimed auditory GJT where a low-intelligence late L2 learner also scored within NS range.

30 The decision to compute z-scores separately for each group was motivated by the fact that the three groups did not have comparable cognitive abilities (i.e., early L2 learners had significantly higher intelligence than late L2 learners).

Figure 35. Metalinguistic knowledge test scores as a function of AO with the general intelligence dimension added
[Scatterplot: mean % score by age of onset; series: high, mid, and low intelligence.]

Figure 36. Untimed visual GJT scores as a function of AO with the general intelligence dimension added
[Scatterplot: mean % score by age of onset; series: high, mid, and low intelligence.]

Figure 37. Untimed auditory GJT scores as a function of AO with the general intelligence dimension added
[Scatterplot: mean % score by age of onset; series: high, mid, and low intelligence.]

A MANCOVA was conducted with overall test scores on the three tasks hypothesized to allow controlled use of language knowledge (i.e., the untimed visual GJT, untimed auditory GJT, and metalinguistic test), group as a fixed factor (NS controls, early L2 learners, and late L2 learners), and general intelligence scores as a covariate. An interaction term was added in a custom model to test for possible interactions between covariate and group as an independent factor. The analysis showed a non-significant interaction between group and intelligence at the multivariate level (F(6,224) = .720, p = .636, ηp² = .019, Λ = .962). Interactions between group and the covariate for each of the tests at the univariate level were also non-significant: untimed visual (F(2,114) = 2.107, p = .149, ηp² = .018), untimed auditory (F(2,114) = .326, p = .569, ηp² = .003), and metalinguistic test (F(2,114) = 3.140, p = .079, ηp² = .027).
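The high, mid, and low intelligence groups plotted in Figures 35 to 37 rest on z-scores computed within each speaker group, with cutoffs at ±.5. A minimal sketch of that grouping step follows. The raw scores are invented for illustration, the function and variable names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics


def classify_by_z(scores: list[float], cutoff: float = 0.5) -> list[str]:
    """Label each score high/mid/low from its z-score within the given group."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample SD: an assumption, not stated in the text
    labels = []
    for score in scores:
        z = (score - mean) / sd
        if z > cutoff:
            labels.append("high")
        elif z < -cutoff:
            labels.append("low")
        else:
            labels.append("mid")
    return labels


# Invented raw scores per speaker group, NOT data from the study.
raw_scores = {
    "NS controls": [98.0, 105.0, 110.0, 121.0, 93.0],
    "early AO": [112.0, 118.0, 101.0, 125.0, 109.0],
    "late AO": [95.0, 102.0, 88.0, 110.0, 99.0],
}

# Standardizing inside each group keeps group-level differences in ability
# from determining who counts as high or low.
groups = {name: classify_by_z(values) for name, values in raw_scores.items()}
for name, labels in groups.items():
    print(name, labels)
```

Standardizing within groups rather than across the whole sample means that a "high" participant is high relative to peers with the same age of onset, which matches the rationale given in footnote 30.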
Since a non-significant aptitude-treatment interaction could suggest a comparable effect of the covariate in the two groups of L2 learners, a second MANCOVA was performed with only two groups: NS controls (n = 20) and L2 learners (n = 100). The analysis revealed that there was no significant interaction between group and general intelligence at the multivariate level (F(3,114) = 1.703, p = .170, ηp² = .043, Λ = .957). General intelligence was not a significant covariate at the multivariate level, either (F(3,114) = 1.402, p = .246, ηp² = .036, Λ = .964). However, univariate analyses showed that intelligence was a significant covariate with a small effect size for the untimed visual GJT (F(1,116) = 3.972, p = .049, ηp² = .034) and the metalinguistic test (F(1,116) = 4.200, p = .043, ηp² = .035). These relationships were further qualified by a group-by-covariate interaction, which was significant for the untimed visual GJT (F(1,116) = 5.253, p = .024, ηp² = .044) and had a p value of .072 for the metalinguistic test (F(1,116) = 3.301, p = .072, ηp² = .028). General intelligence was not a significant covariate for the untimed auditory GJT (F(1,116) = 2.649, p = .106, ηp² = .023) and the interaction with group was not significant, either (F(1,116) = 1.498, p = .223, ηp² = .013). As shown in Figures 38, 39, and 40, with the exception of the untimed auditory GJT, where none of the slopes were significant (p > .05), general intelligence was more related to L2 learners' performance than NS controls' performance on the untimed visual GJT and metalinguistic test.

Figure 38. Regression of untimed visual GJT scores on general intelligence scores at each group level

Figure 39. Regression of metalinguistic test scores on general intelligence scores at each group level

Figure 40. Regression of untimed auditory GJT scores on general intelligence scores at each group level

There was also a difference between the slopes of the two groups of L2 learners.
While the slope of the early AO group was not significant for either the untimed visual GJT or metalinguistic test (p = .349 and p = .445, respectively), the slope of the late AO group was significant in both cases (p = .032 and p = .016). Slope differences between early and late L2 learners, however, did not yield a significant interaction in either case (F(1,96) = .007, p = .933, ηp² = .000, and F(1,96) = .168, p = .683, ηp² = .002).

In order to further examine the effect of intelligence, follow-up factorial analyses were performed on the two tests that yielded significant results for the entire set of participants (the untimed visual GJT and the metalinguistic test) by comparing high- and low-intelligence individuals (i.e., z > .5 and z < -.5, respectively) within each group (NS controls, early L2 learners, and late L2 learners) (see Table 28 for a summary of descriptive statistics). High- and low-intelligence controls (n = 6 and n = 7, respectively) were not significantly different on either the untimed visual GJT or the metalinguistic test (t(11) = -.169, p = .869 and t(11) = .567, p = .582, respectively). There were no significant differences in the early AO group, either (t(31) = .336, p = .739 and t(31) = .541, p = .593, respectively). In the late AO group, differences in performance between high- and low-intelligence individuals were significant for the metalinguistic test (t(30) = 2.894, p = .007, mean difference of 8.07) and had a p value of .075 for the untimed visual GJT (t(30) = 1.846, p = .075, mean difference of 5.01).

Table 28.
Summary of Overall Test Scores on the Untimed Visual GJT and Metalinguistic Test by High- and Low-Intelligence Participants

                      Control                       Early AO                      Late AO
                      High (n = 6)   Low (n = 7)    High (n = 18)  Low (n = 15)   High (n = 18)  Low (n = 14)
Untimed Visual GJT    88.89 (3.90)   89.29 (4.50)   75.74 (10.34)  74.67 (7.46)   63.70 (7.20)   58.69 (8.14)
Metalinguistic Test   90.83 (5.65)   88.81 (6.99)   77.41 (10.70)  75.44 (9.99)   68.43 (7.17)   60.36 (8.60)

Note. Standard deviations appear between parentheses.

MANCOVA analyses were re-run including only scores on the ungrammatical items in each language test (k = 30). Multivariate and univariate analyses with group (NS controls, early L2 learners, and late L2 learners) as a fixed factor yielded no significant interactions (p > .05). When early and late L2 learners were combined as a single L2-learner group, there was no interaction between group and covariate at the multivariate level, either (F(3,114) = 1.338, p = .265, ηp² = .034, Λ = .966). The interactions at the univariate level were also non-significant for all the tests, the untimed visual GJT (F(1,116) = 3.446, p = .066, ηp² = .029), the untimed auditory GJT (F(1,116) = .769, p = .382, ηp² = .007), and the metalinguistic test (F(1,116) = 2.598, p = .110, ηp² = .022). As a covariate, the effects of intelligence approached significance for the metalinguistic test (F(1,116) = 3.657, p = .058, ηp² = .031), but they were not significant for the untimed visual GJT (F(1,116) = .363, p = .548, ηp² = .003), or untimed auditory GJT (F(1,116) = 1.820, p = .180, ηp² = .016). In order to examine the effect of intelligence in each of the speaker groups separately, follow-up correlational and factorial analyses were performed on the test that yielded significant results for the entire set of participants (the metalinguistic knowledge test).
Simple correlations between intelligence and metalinguistic test scores on ungrammatical items in each of the groups showed a significant relationship in the late AO group (r = .32, p = .022) (the disattenuated correlation was .41). High-intelligence late L2 learners scored significantly higher (M = 45.74, SD = 18.85) than low-intelligence late L2 learners (M = 29.05, SD = 15.49) on ungrammatical items (t(30) = 3.173, p = .003). Correlations in the control and early AO group were weak and non-significant (r = .06, p = .792 and r = .14, p = .343, respectively) (disattenuated correlations were .03 and .18), thus replicating the results found for overall metalinguistic test scores. Unlike late L2 learners, high- and low-intelligence early L2 learners did not differ on their metalinguistic test scores for ungrammatical items (M = 57.22, SD = 20.36 and M = 54.89, SD = 18.85) (t(31) = .339, p = .737).

A last set of follow-up analyses was run, with general intelligence as a covariate, distinguishing between agreement items (k = 30) (gender, person, and number) and non-agreement items (k = 30) (aspect, the subjunctive, and the passive) on every test. For agreement items, multivariate and univariate analyses with group (NS controls, early L2 learners, and late L2 learners) as a fixed factor yielded no significant interactions (p > .05). As a covariate, general intelligence was significant at the univariate level for the metalinguistic test (F(1,114) = 4.435, p = .037, ηp² = .038). The size of the effect was small. When L2 learners were combined and compared with controls, interactions at the multivariate and univariate level remained non-significant, and general intelligence had a p value of .064 as a covariate for the metalinguistic test (F(1,116) = 3.496, p = .064, ηp² = .031).
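The high- versus low-intelligence comparisons in this section are reported as t statistics with n1 + n2 - 2 degrees of freedom, which is consistent with an independent-samples t-test using a pooled variance. The sketch below is a rough check under that assumption (the text does not name the exact procedure); it recomputes the statistic from summary values only, using the Late AO metalinguistic test row of Table 28. The function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic from summary statistics.

    Assumes equal variances (pooled-variance t-test).
    Returns the t statistic and its degrees of freedom.
    """
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means.
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, df


# Late AO group, metalinguistic test (values taken from Table 28):
# high intelligence: M = 68.43, SD = 7.17, n = 18
# low intelligence:  M = 60.36, SD = 8.60, n = 14
t_value, dof = pooled_t(68.43, 7.17, 18, 60.36, 8.60, 14)
print(f"t({dof}) = {t_value:.3f}")
```

With these inputs the result falls within rounding distance of the reported t(30) = 2.894. Small discrepancies are expected because the means and standard deviations in the table are rounded to two decimals.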
In order to examine the effect of intelligence in each of the speaker groups separately, follow-up correlational and factorial analyses were performed on the test that yielded significant results for the entire set of participants (the metalinguistic knowledge test). Simple correlations in each group showed that the relationship between intelligence and performance on agreement items on the metalinguistic test was not significant among early L2 learners (r = .16, p = .257) or controls (r = .12, p = .607), but significant among late L2 learners (r = .38, p = .007) (disattenuated correlations were .21, .16, and .49). High-intelligence late L2 learners (n = 18) scored significantly higher (M = 74.26, SD = 11.07) than low-intelligence late L2 learners (n = 14) (M = 61.67, SD = 12.52) on agreement test items (t(30) = 3.014, p = .005).

Regarding non-agreement items, multivariate and univariate analyses with group (NS controls, early L2 learners, and late L2 learners) as a fixed factor, and general intelligence as a covariate, yielded no significant interactions (p > .05). When L2 learners were combined as a group and compared with controls, results showed that intelligence was only a significant covariate at the univariate level for non-agreement items on the untimed visual GJT (F(1,116) = 6.044, p = .015, ηp² = .051). This relationship was further qualified by an interaction with group (F(1,116) = 4.363, p = .039, ηp² = .037). In both cases, the effect size was small. As can be seen in Figure 41, the significant interaction was mostly due to the late AO group, which had a steeper slope than the early AO group (p = .002 and p = .542, respectively). Slope differences between early and late L2 learners did not yield a significant interaction (F(1,96) = 1.107, p = .295, ηp² = .011).

Figure 41.
Regression of untimed visual GJT scores for non-agreement items on general intelligence scores at each group level

In order to further examine the effect of intelligence, follow-up factorial analyses were performed on the test that yielded significant results for the entire set of participants (the untimed visual GJT) by comparing high- and low-intelligence individuals (i.e., z > .5 and z < -.5, respectively) within each speaker group (NS controls, early L2 learners, and late L2 learners) (see Table 29 for a summary of descriptive statistics). High- and low-intelligence controls' performance on non-agreement items on the untimed visual GJT was not significantly different (t(11) = .081, p = .937). There were no significant differences in the early AO group, either (t(31) = -.053, p = .958). In the late AO group, high-intelligence L2 learners scored significantly higher than low-intelligence L2 learners (t(30) = 2.400, p = .023, mean difference of 7.49).

Table 29. Summary of Test Scores on the Untimed Visual GJT (Non-agreement Items) by High- and Low-Intelligence Participants

                      Control                       Early AO                      Late AO
                      High (n = 6)   Low (n = 7)    High (n = 18)  Low (n = 15)   High (n = 18)  Low (n = 14)
Untimed Visual GJT    92.22 (8.07)   91.90 (6.04)   80.93 (11.01)  81.11 (8.79)   62.96 (8.55)   55.48 (9.02)

Note. Standard deviations appear between parentheses.

To summarize, general intelligence did not moderate either NS controls' or early L2 learners' language attainment on tasks that allow controlled use of L2 knowledge. It did not moderate performance at the multivariate level for a combination of those measures, either. Intelligence, however, moderated late L2 learners' attainment as measured by the metalinguistic test and the untimed visual GJT. This effect remained robust for ungrammatical items on the metalinguistic test. Late L2 learners'
performance on agreement structures (gender, person, and number agreement) and non-agreement structures (aspect contrasts, the subjunctive, and the passive) was equally related to general intelligence, albeit on different tests. The only test where intelligence did not show an effect in the late AO group was the untimed auditory GJT.

The fact that general intelligence did not moderate early L2 learners' attainment on measures that allow controlled use of language knowledge suggests that the intelligence factor did not contribute to the significant results reported for aptitude for explicit learning in the early AO group (despite a significant correlation of .30, p = .035, between general intelligence and LLAMA aptitude subtests B, E, and F in this group). On the other hand, the fact that intelligence moderated late L2 learners' language attainment suggests that both general intelligence and LLAMA aptitude subtests B, E, and F contributed to the significant results reported for aptitude for explicit learning among late L2 learners. In this group, the correlation between intelligence and LLAMA B, E, and F was stronger than in the early AO group (r = .62, p < .001).

In order to tease apart the relationship between general intelligence (GAMA), explicit language aptitude (LLAMA B, E, and F), and ultimate L2 attainment in the two L2-learner groups, an analysis was conducted with the LLAMA B, E, and F composite as a covariate, group (early and late L2 learners) as a fixed factor, and scores on the untimed visual GJT, untimed auditory GJT, and metalinguistic test as dependent variables. The assumptions of equality of covariances and error variances, Box's and Levene's tests, respectively, were all met (p > .05). At the multivariate level, there was no interaction between group and covariate, indicating a comparable effect of LLAMA B, E, and F on the two groups of learners (F(3,94) = .954, p = .418, ηp² = .030, Λ = .970).
The LLAMA composite was, however, a significant covariate with a large effect size (F(3,94) = 5.064, p = .003, ηp² = .140, Λ = .860). At the univariate level, interactions with group were also non-significant for the untimed visual GJT (F(1,96) = 1.318, p = .254, ηp² = .014), untimed auditory GJT (F(1,96) = 2.467, p = .120, ηp² = .025), and metalinguistic test (F(1,96) = .561, p = .456, ηp² = .006). The LLAMA composite was a significant covariate for each of the measures separately, the untimed visual GJT (F(1,96) = 9.861, p = .002, ηp² = .094), the untimed auditory GJT (F(1,96) = 7.941, p = .006, ηp² = .077), and the metalinguistic test (F(1,96) = 15.494, p < .001, ηp² = .140), for which the largest effect size was found (ηp² = .140).

Simple correlations between LLAMA B, E, and F and scores on the untimed visual GJT, untimed auditory GJT, and metalinguistic test were all significant in the early AO group: .37 (p = .007), .38 (p = .007), and .33 (p = .018) (disattenuated correlations were .54, .54, and .47). Early L2 learners with high and low LLAMA B, E, and F composite scores (i.e., z > .5 and z < -.5, respectively) were significantly different on the three language tests: Untimed visual GJT (t(24) = 2.939, p = .007, mean difference of 9.38), untimed auditory GJT (t(24) = 2.876, p = .008, mean difference of 10.30), and metalinguistic test (t(24) = 3.313, p = .003, mean difference of 11.06). In the late AO group, only the correlation between LLAMA B, E, and F and performance on the metalinguistic test was significant (r = .43, p = .002) (the disattenuated correlation was .61). Late L2 learners with a high LLAMA B, E, and F composite score performed significantly higher than late L2 learners with a low composite score (t(26) = 2.289, p = .030, mean difference of 9.28). The correlations between the LLAMA B, E, and F composite and late L2 learners'
scores on the untimed visual GJT and untimed auditory GJT were .22 (p = .117) and .14 (p = .320), respectively³¹ (disattenuated correlations were .32 and .20) (but see section 5.3.1.1 for follow-up analyses on the relationship between LLAMA E and the untimed auditory GJT in the late AO group).

³¹ The correlations between general intelligence and late L2 learners' scores on the untimed visual and untimed auditory GJTs did not reach significance, either (r = .27, p = .062 and r = .04, p = .808).

Similar results were obtained when only ungrammatical items were considered. The MANCOVA analysis yielded no significant interactions with group at any level (p > .05). As a covariate, the LLAMA B, E, and F composite was significant at the multivariate level with an effect size that was medium large (F(3,94) = 4.454, p = .006, ηp² = .126, Λ = .874) and, at the univariate level, for the untimed visual GJT (F(1,96) = 4.160, p = .044, ηp² = .042) and the metalinguistic test (F(1,96) = 12.507, p = .001, ηp² = .116) with a small and a medium large effect size, respectively. The results for the untimed auditory GJT did not reach significance (F(1,96) = 3.121, p = .080, ηp² = .032).

Simple correlations between LLAMA B, E, and F and early L2 learners' performance on ungrammatical items on the untimed visual GJT, untimed auditory GJT, and metalinguistic test were .34 (p = .018), .28 (p = .054), and .42 (p = .003), respectively (disattenuated correlations were .49, .40, and .59). Early L2 learners with high and low LLAMA B, E, and F composite scores were significantly different on the untimed visual GJT (t(24) = 2.156, p = .041, mean difference of 12.78) and metalinguistic test (t(24) = 3.116, p = .005, mean difference of 19.60). The difference approached significance for the untimed auditory GJT (t(24) = 2.055, p = .051, mean difference of 12.34). In the late AO group, only the correlation between LLAMA B, E, and F and performance on ungrammatical items on the metalinguistic test approached significance (r = .27, p = .059) (the disattenuated correlation was .38). The correlations with the untimed visual and auditory GJTs were .08 (p = .572) and .02 (p = .883)³².

³² The correlations between general intelligence and late L2 learners' scores on the untimed visual and untimed auditory GJTs were .25 (p = .079) and -.01 (p = .942), respectively.

As for agreement (k = 30) (gender, person, and number) and non-agreement items (k = 30) (aspect contrasts, the subjunctive, and the passive), the MANCOVA analyses yielded no significant interactions at the multivariate or univariate level (p > .05). The LLAMA B, E, and F composite was a significant covariate for both agreement and non-agreement items at the multivariate level (F(3,94) = 4.393, p = .006, ηp² = .124, Λ = .874, and F(3,94) = 4.778, p = .004, ηp² = .134, Λ = .866, respectively), with medium large effect sizes in both cases. At the univariate level, it was also a significant covariate for agreement items on the untimed auditory GJT (F(1,96) = 8.496, p = .004, ηp² = .082) and metalinguistic test (F(1,96) = 12.263, p = .001, ηp² = .114), with a medium and a medium large effect size, respectively, and it approached significance for the untimed visual GJT (F(1,96) = 3.949, p = .050, ηp² = .040). In the case of non-agreement items, the LLAMA B, E, and F composite was significant for the untimed visual GJT (F(1,96) = 12.144, p = .001, ηp² = .113) and the metalinguistic test (F(1,96) = 10.442, p = .002, ηp² = .099), with a medium large effect size, but non-significant for the untimed auditory GJT (F(1,96) = 2.816, p = .097, ηp² = .029).

Simple correlations between the LLAMA B, E, and F composite and early L2 learners' performance on agreement items on the untimed visual GJT, untimed auditory GJT, and metalinguistic test were .29 (p = .039), .35 (p = .015), and .35 (p = .012). Early L2 learners with high and low LLAMA composite scores were significantly different on the three tests: untimed visual GJT (t(24) = 2.709, p = .012, mean difference of 10.36), untimed auditory GJT (t(24) = 3.042, p = .006, mean difference of 14.68), and metalinguistic test (t(24) = 2.729, p = .012, mean difference of 12.62). Correlations for non-agreement items were .39 (p = .006), .30 (p = .034), and .40 (p = .004), on the untimed visual GJT, untimed auditory GJT, and metalinguistic test, respectively (disattenuated correlations were .57, .42, and .57). Early L2 learners with high and low LLAMA B, E, and F composite scores were significantly different on the untimed visual GJT (t(24) = 2.593, p = .016, mean difference of 8.41) and the metalinguistic test (t(24) = 2.781, p = .010, mean difference of 9.48), but not on the untimed auditory GJT (t(24) = 3.042, p = .006, mean difference of 14.68).

In the late AO group, the only significant correlation was between the LLAMA composite and agreement items on the metalinguistic test (r = .34, p = .016) (the disattenuated correlation was .48). This was the only test that yielded a significant difference between learners with high and low scores on the composite (t(26) = 2.103, p = .045, mean difference of 10.85). The correlations between the LLAMA composite and agreement items on the untimed visual and untimed auditory GJTs were .09 (p = .534) and .18 (p = .207) (but see the follow-up analyses reported for the untimed auditory GJT and LLAMA E in section 5.3.1.1). As for non-agreement items, the only significant correlation in the late AO group was with the untimed visual GJT (r = .29, p = .039) (the disattenuated correlation was .42), although this test did not yield a significant difference between learners with high and low scores on the composite (t(26) = 1.704, p = .100, mean difference of 6.78).
The correlations between the LLAMA composite and non-agreement items on the untimed auditory GJT and metalinguistic test were .07 (p = .622) and .23 (p = .110) (disattenuated correlations were .10 and .33). Overall, in the late AO group, the correlations between language performance and the LLAMA B, E, and F composite score mirrored the correlations between language performance and intelligence scores. Like the LLAMA composite score, intelligence scores were significantly correlated with agreement items on the metalinguistic test (r = .38, p = .007) and with non-agreement items on the untimed visual GJT (r = .36, p = .011) (disattenuated correlations were .54 and .52).

To summarize, explicit language aptitude (LLAMA B, E, and F) moderated both early and late L2 learners' attainment on measures hypothesized to allow controlled use of L2 knowledge. General intelligence moderated attainment on the same language measures, but only among late L2 learners.

5.3.2.2 Tasks that Require Automatic Use of Language Knowledge

Figures 42, 43, and 44 display individual scores on the timed visual GJT, timed auditory GJT, and word monitoring task (GSI) as a function of AO, with the general intelligence dimension added. The NS range is marked with a dotted line. The general intelligence groups were created by converting GAMA raw scores into z-scores within each of the three speaker groups and by establishing the following cutoffs reflecting distance from the mean in standard deviations: high = z-scores > .5, mid = -.5 < z-scores < .5, and low = z-scores < -.5. The highest scorer on the timed visual GJT in the early AO group was a high-intelligence L2 learner, whereas, in the late AO group, a combination of high-, mid-, and low-intelligence L2 learners scored within the NS range. On the timed auditory GJT, a mid-intelligence L2 learner obtained the highest score in the early AO group, while, in the late AO group, two low-intelligence learners overlapped within the NS range.
Finally, the highest grammatical sensitivity indices on the word monitoring task corresponded to high-intelligence L2 learners in both the early and late AO groups.

Figure 42. Timed visual GJT scores as a function of AO with the general intelligence dimension added
[Scatterplot: Timed Visual GJT. x-axis: Age of Onset (0-32); y-axis: Mean % Score (0-100); series: High Intelligence, Mid Intelligence, Low Intelligence.]

Figure 43. Timed auditory GJT scores as a function of AO with the general intelligence dimension added
[Scatterplot: Timed Auditory GJT. x-axis: Age of Onset (0-32); y-axis: Mean % Score (0-100); series: High Intelligence, Mid Intelligence, Low Intelligence.]

Figure 44. Word monitoring task scores (GSI) as a function of AO with the general intelligence dimension added
[Scatterplot: Word Monitoring Task (GSI). x-axis: Age of Onset (0-32); y-axis: Mean Reaction Time Difference in msec (-300 to 300); series: High Intelligence, Mid Intelligence, Low Intelligence.]

In order to investigate the role of general intelligence on participants' language attainment as measured by tasks that require automatic use of language knowledge, a MANCOVA was conducted with overall test scores on the timed visual GJT, timed auditory GJT, and word monitoring task (i.e., GSI) as dependent variables, group (NS controls, early L2 learners, and late L2 learners) as fixed factor, and GAMA scores as covariate. An interaction term was added, in addition to the group and covariate terms, to test for possible interactions between covariate and group as an independent factor. The results revealed no significant interactions at the multivariate or univariate level (p > .05). Intelligence was not a significant covariate either, whether at the multivariate level (F(3,105) = .682, p = .664, ηp² = .018, Λ
= .964), or univariate level for the timed visual GJT (F(1,107) = .582, p = .447, ηp² = .005), timed auditory GJT (F(1,107) = 2.378, p = .126, ηp² = .020), or word monitoring task (F(1,107) = 1.011, p = .317, ηp² = .009).³³ Results remained non-significant when L2 learners were combined as one group. Intelligence was not a significant covariate at the multivariate level (F(3,107) = 1.960, p = .124, ηp² = .049, Λ = .951) or, at the univariate level, for the timed visual GJT (F(1,109) = 1.663, p = .200, ηp² = .014), timed auditory GJT (F(1,109) = 2.486, p = .118, ηp² = .021), or word monitoring task (F(1,109) = 2.741, p = .101, ηp² = .023). Interactions were all non-significant, as well (p > .05).³⁴

³³ The analysis including outliers in the word monitoring task also yielded non-significant results: Intelligence was a non-significant covariate at the multivariate level (F(3,112) = 1.376, p = .254, ηp² = .038, Λ = .962) and at the univariate level for the timed visual GJT (F(1,114) = .392, p = .533, ηp² = .004), timed auditory GJT (F(1,114) = 1.434, p = .234, ηp² = .013), and word monitoring task (F(1,114) = 3.295, p = .072, ηp² = .030). Interactions were all non-significant, as well (p > .05).

³⁴ The analysis including outliers also yielded non-significant results. Intelligence as a covariate was not significant at the multivariate level (F(3,114) = 1.627, p = .187, ηp² = .041, Λ = .959) and, at the univariate level, for the timed visual GJT (F(1,116) = 1.663, p = .200, ηp² = .014), timed auditory GJT (F(1,116) = 3.295, p = .072, ηp² = .028), and word monitoring task (F(1,116) = 2.741, p = .101, ηp² = .023). Interactions were all non-significant, as well (p > .05).

The same results were found for ungrammatical items.
The analysis with three groups (NS controls, early L2 learners, and late L2 learners) and the timed visual and timed auditory GJTs (the word monitoring task does not provide an interpretable measure for ungrammatical items only) yielded no significant interactions (p > .05). Intelligence was not a significant covariate at the multivariate level (F(2,113) = 1.193, p = .307, ηp² = .021, Λ = .979), or, univariate level, for the timed visual (F(1,114) = .101, p = .752, ηp² = .001) or timed auditory GJT (F(1,114) = 1.941, p = .395, ηp² = .016). Combining L2 learners into one group made no difference to the results. Interactions remained non-significant (p > .05) and intelligence remained a non-significant covariate at the multivariate level (F(2,115) = 1.322, p = .271, ηp² = .023, Λ = .977) and at the univariate level for the timed visual (F(1,116) = .254, p = .615, ηp² = .002) and timed auditory GJT (F(1,116) = 2.182, p = .142, ηp² = .019). Finally, the results for agreement and non-agreement target structures were all non-significant, as well. Intelligence was not a significant covariate and did not interact with group at any level (p > .05).

To summarize, general intelligence did not moderate any of the groups' language attainment on the three measures hypothesized to require automatic use of L2 knowledge.

5.3.3 Aptitude for Implicit Learning and Language Attainment

This section presents the results of the role of aptitude for implicit learning on language attainment as measured by tasks that allow controlled use of language knowledge (section 5.3.3.1) and measures that require automatic use of language knowledge (section 5.3.3.2). First, descriptive data is presented visually on scatterplots that show attainment scores as a function of age of onset with the aptitude for implicit learning dimension added.
This visual display makes it possible to determine to what extent a high level of implicit aptitude is a necessary condition at an individual level in order to score within NS range. Next, multivariate analyses of covariance (MANCOVAs) are conducted in order to determine the extent to which aptitude for implicit learning moderates language attainment in each of the groups. A MANCOVA was first conducted on overall test scores, grammatical and ungrammatical, and then re-run on ungrammatical items, agreement items, and non-agreement items in follow-up analyses.

5.3.3.1 Tasks that Allow Controlled Use of Language Knowledge

Figures 45, 46, and 47 display individual scores on the metalinguistic knowledge test, untimed visual GJT, and untimed auditory GJT as a function of AO with the aptitude for implicit learning dimension added. The NS range is marked with a dotted line. The implicit aptitude groups (high, mid, and low) were created by establishing the following cutoffs on the aptitude for implicit learning composite score in every speaker group: high = z-scores > .5, mid = -.5 < z-scores < .5, and low = z-scores < -.5. The highest scorers on the three tests in the two learner groups had either mid or low implicit language aptitude. These are the same L2 learners that had either high or mid aptitude for explicit language learning, except for a learner in the late AO group who scored within the NS range on the metalinguistic test and who was high in both types of aptitude.

Figure 45. Metalinguistic knowledge test scores as a function of AO with the implicit language aptitude dimension added

Figure 46.
Untimed visual GJT scores as a function of AO with the implicit language aptitude dimension added
[Scatterplots for Figures 45 and 46: Metalinguistic Knowledge Test; Untimed Visual GJT. x-axis: Age of Onset (0-32); y-axis: Mean % Score (0-100); series: High Implicit Aptitude, Mid Implicit Aptitude, Low Implicit Aptitude.]

Figure 47. Untimed auditory GJT scores as a function of AO with the implicit language aptitude dimension added
[Scatterplot: Untimed Auditory GJT. x-axis: Age of Onset (0-32); y-axis: Mean % Score (0-100); series: High Implicit Aptitude, Mid Implicit Aptitude, Low Implicit Aptitude.]

In order to investigate the role of aptitude for implicit learning in participants' language attainment as measured by tasks that allow controlled use of language knowledge, a MANCOVA was conducted with overall test scores on the untimed visual GJT, untimed auditory GJT, and metalinguistic test as dependent variables, group (NS controls, early L2 learners, and late L2 learners) as fixed factor, and the composite aptitude score combining LLAMA D and the learning score on the probabilistic SRT task (i.e., aptitude for implicit learning) as covariate. An interaction term was added, in addition to the group and covariate terms, to test for possible interactions between covariate and group as an independent factor. The results revealed no significant interactions at the multivariate or univariate level (p > .05). Aptitude for implicit learning was not a significant covariate either, whether at the multivariate level (F(3,112) = .723, p = .540, ηp² = .019, Λ
= .981), or univariate level for the untimed visual GJT (F(1,114) = .136, p = .713, ηp² = .001), untimed auditory GJT (F(1,114) = .113, p = .737, ηp² = .001) or metalinguistic test (F(1,114) = .379, p = .539, ηp² = .003). Results remained non-significant when L2 learners were combined as a single group and compared with NS controls (p > .05). Implicit aptitude was not a significant covariate at the multivariate level (F(3,114) = .281, p = .839, ηp² = .007, Λ = .993), or, at the univariate level, for the untimed visual GJT (F(1,116) = .121, p = .729, ηp² = .001), untimed auditory GJT (F(1,116) = .592, p = .443, ηp² = .005), or metalinguistic test (F(1,116) = .151, p = .698, ηp² = .001). Interactions were all non-significant, as well (p > .05).

The same results were found for ungrammatical items. The analysis with three groups (NS controls, early L2 learners, and late L2 learners) yielded no significant interactions (p > .05). Aptitude for implicit learning was not a significant covariate at the multivariate level (F(6,112) = .345, p = .793, ηp² = .009, Λ = .991), or, univariate level, for the untimed visual GJT (F(1,114) = .873, p = .352, ηp² = .000), untimed auditory GJT (F(1,114) = .124, p = .726, ηp² = .001), or metalinguistic test (F(1,114) = .602, p = .440, ηp² = .005). Combining L2 learners as a single group made no difference to the results (p > .05).

Regarding agreement and non-agreement structures, the results for non-agreement items were all non-significant (p > .05). However, the MANCOVA performed on agreement items yielded a significant two-way interaction between group and aptitude for implicit learning at the univariate level in two of the tests hypothesized to allow controlled use of language knowledge: the untimed auditory GJT (F(2,114) = 4.627, p = .012, ηp² = .076) and the metalinguistic test (F(2,114) = 4.254, p = .017, ηp² = .070), both associated with a medium effect size.
The interaction in the case of the untimed visual GJT did not reach significance (F(2,114) = 2.121, p = .125, ηp² = .036). At the multivariate level, the interaction between group and covariate had a p value of .085 (F(6,224) = 1.881, p = .085, ηp² = .048, Λ = .906). Finally, as a covariate at the multivariate and univariate level, aptitude for implicit learning was not significant (p > .05). The existence of a significant two-way interaction between group and aptitude for implicit learning on agreement items indicated that the effects of aptitude for implicit learning were not comparable in the three groups of participants, as can be observed in Figures 48 and 49.

Figure 48. Two-way interaction between group and aptitude for implicit learning in the untimed auditory GJT (agreement structures)

Figure 49. Two-way interaction between group and aptitude for implicit learning in the metalinguistic knowledge test (agreement structures)

Follow-up correlations in each of the groups showed a significant positive relationship between aptitude for implicit learning and agreement items on the untimed auditory GJT and metalinguistic test in the early L2 group only: .29 (p = .045) and .39 (p = .005), respectively (disattenuated correlations were .42 and .57). The correlation for agreement items on the untimed visual GJT was not significant (r = .19, p = .193). In the NS control and late AO groups, correlations did not reach significance and, unlike in the early AO group, they were all negative: -.37 (p = .110) and -.18 (p = .215) (untimed visual GJT), -.05 (p = .846) and -.27 (p = .056) (untimed auditory GJT), and -.06 (p = .792) and -.12 (p = .394) (metalinguistic test) (disattenuated correlations were -.55, -.27, -.07, -.40, -.09, and -.18).

Differences between high and low implicit aptitude individuals were only significant in the early AO group.
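The disattenuated correlations reported alongside the observed ones correct for measurement error in both variables. A minimal sketch follows, assuming the standard correction for attenuation (observed r divided by the square root of the product of the two reliabilities). The function name is hypothetical, and the reliability values in the example are placeholders, not the reliabilities of the tests used in this study.

```python
import math


def disattenuate(r_observed, reliability_x, reliability_y):
    """Correct an observed correlation for unreliability in both measures."""
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)


# Observed r = .29; the two reliabilities of .70 are HYPOTHETICAL placeholders.
print(round(disattenuate(0.29, 0.70, 0.70), 2))
```

Because the denominator is at most 1, the corrected value is never smaller in magnitude than the observed one, which is why every disattenuated correlation in the text exceeds its observed counterpart in absolute value.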
High implicit aptitude early L2 learners (n = 14) scored significantly higher than their low-aptitude counterparts (n = 16) on the untimed auditory GJT (M = 84.29, SD = 10.08 and M = 71.46, SD = 11.80) (t(28) = 3.177, p = .004) and on the metalinguistic test (M = 78.10, SD = 13.12 and M = 66.46, SD = 10.64) (t(28) = 2.681, p = .012), but not on the untimed visual GJT (p = .130), even though high implicit aptitude early learners also scored higher on this test.

To summarize, individual differences in aptitude for implicit learning did not moderate language attainment in any of the groups on measures hypothesized to allow controlled use of L2 knowledge, when overall scores or scores on ungrammatical items were considered. Follow-up analyses on agreement structures, however, showed a positive effect of aptitude for implicit learning in the group of early L2 learners. This effect was present at the univariate level for agreement items on the untimed auditory GJT and the metalinguistic test. Early L2 learners with high implicit language aptitude scored higher on agreement items than early L2 learners with low implicit aptitude.

5.3.3.2 Tasks that Require Automatic Use of Language Knowledge

Figures 50, 51, and 52 display individual scores on the timed visual GJT, timed auditory GJT, and word monitoring task (GSI) as a function of AO with the implicit language aptitude dimension added. The NS range is marked with a dotted line. The implicit aptitude groups (high, mid, and low) were created by establishing the following cutoffs on the aptitude for implicit learning composite score in every speaker group: high = z-scores > .5, mid = -.5 < z-scores < .5, and low = z-scores < -.5.

The highest scorer on the timed visual GJT in the early AO group, a learner with high explicit language aptitude, was also high in terms of implicit language aptitude.
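The grouping rule and the high versus low comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually run for the study: the function names are hypothetical, scores falling exactly on a cutoff are assigned to the mid group (the text does not specify this case), and the t test assumes equal variances, which is consistent with the reported degrees of freedom (14 + 16 - 2 = 28).

```python
import math


def aptitude_group(z):
    """Label a composite z-score as 'high', 'mid', or 'low'."""
    if z > 0.5:
        return "high"
    if z < -0.5:
        return "low"
    return "mid"  # assumption: scores exactly at +/-.5 are treated as mid


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t (equal variances assumed) from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Early AO group, untimed auditory GJT (agreement items), high vs. low aptitude
t, df = pooled_t(84.29, 10.08, 14, 71.46, 11.80, 16)
print(f"t({df}) = {t:.3f}")  # -> t(28) = 3.177
```

Working from the rounded means and standard deviations reproduces the reported t value for the untimed auditory GJT; for other comparisons the result may differ slightly in the last decimal because of rounding in the summary statistics.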
In the late AO group, those learners who scored within the NS range and who had either low, mid, or high explicit language aptitude, had mostly low, or mid, implicit language aptitude. On the timed auditory GJT, the L2 learner who had high explicit language aptitude and who obtained the highest score in the early AO group also had high implicit language aptitude. In the late AO group, the two learners who overlapped within the NS range, and who had either low or mid explicit language aptitude, had both low implicit language aptitude. Finally, the highest grammatical sensitivity indices on the word monitoring task corresponded to L2 learners with high implicit language aptitude. These learners also had high explicit language aptitude.

Figure 50. Timed visual GJT scores as a function of AO with the implicit language aptitude dimension added

[Plot: Timed Visual GJT. Mean % score (0-100) by age of onset (0-32), for high, mid, and low implicit aptitude.]

Figure 51. Timed auditory GJT scores as a function of AO with the implicit language aptitude dimension added

[Plot: Timed Auditory GJT. Mean % score (0-100) by age of onset (0-32), for high, mid, and low implicit aptitude.]

Figure 52. Word monitoring task scores (GSI) as a function of AO with the implicit language aptitude dimension added

[Plot: Word Monitoring Task (GSI). Mean reaction time difference in msec (-300 to 300) by age of onset (0-32), for high, mid, and low implicit aptitude.]

In order to investigate the role of aptitude for implicit learning in participants'
language attainment as measured by tasks hypothesized to require automatic use of language knowledge, a MANCOVA was conducted with overall test scores on the timed visual GJT, timed auditory GJT, and word monitoring task (i.e., GSI) as dependent variables, group (NS controls, early L2 learners, and late L2 learners) as fixed factor, and the composite aptitude score combining LLAMA D and the learning score on the SRT task (i.e., aptitude for implicit learning) as a covariate. An interaction term was added, in addition to the group and covariate terms, to test for possible interactions between covariate and group as an independent factor. The results revealed no significant interactions at the multivariate or univariate level (p > .05). Aptitude for implicit learning did not reach significance, either, as a covariate at the multivariate level (F(3,105) = 2.216, p = .090, ηp² = .057, Λ = .943), or at the univariate level for the timed visual GJT (F(1,107) = 1.803, p = .182, ηp² = .016), timed auditory GJT (F(1,107) = .450, p = .504, ηp² = .004), or word monitoring task, which had a p value of .082 (F(1,107) = 3.079, p = .082, ηp² = .027).³⁵ Results remained non-significant when L2 learners were combined as one group and compared with NS controls (p > .05). Aptitude was not a significant covariate at the multivariate level (F(3,107) = 1.948, p = .126, ηp² = .049, Λ = .951), or, at the univariate level, for the timed visual GJT (F(1,109) = .548, p = .461, ηp² = .005), timed auditory GJT (F(1,109) = .666, p = .416, ηp² = .006), or word monitoring task (F(1,109) = 1.926, p = .168, ηp² = .016). Interactions were all non-significant, as well (p > .05).³⁶

³⁵ The analysis including outliers in the word monitoring task yielded similar results: aptitude for implicit learning was not a significant covariate at the multivariate level (F(3,112) = 1.805, p = .150, ηp² = .046, Λ = .954), or, at the univariate level, for the timed visual GJT (F(1,114) = 2.413, p = .123, ηp² = .021), timed auditory GJT (F(1,114) = .201, p = .655, ηp² = .002), or word monitoring task (F(1,114) = 1.671, p = .199, ηp² = .014). Interactions were all non-significant (p > .05).

The same results were found for ungrammatical items. The analysis with three groups (NS controls, early L2 learners, and late L2 learners) and the timed visual and timed auditory GJTs (the word monitoring task does not provide an interpretable measure for ungrammatical items only) yielded no significant interactions (p > .05). Aptitude for implicit learning was not a significant covariate at the multivariate level (F(2,113) = 1.819, p = .167, ηp² = .031, Λ = .969), or, at the univariate level, for the timed visual (F(1,114) = 1.742, p = .190, ηp² = .015) or timed auditory GJT (F(1,114) = .069, p = .793, ηp² = .001). Combining L2 learners as a single group made no difference to the results. Interactions remained non-significant (p > .05) and aptitude for implicit learning remained a non-significant covariate at the multivariate level (F(2,115) = 2.075, p = .130, ηp² = .035, Λ = .965) and at the univariate level for the timed visual (F(1,116) = .998, p = .320, ηp² = .009) and timed auditory GJT (F(1,116) = .311, p = .578, ηp² = .003).

As for agreement (k = 30) (gender, person, and number) and non-agreement items (k = 30) (aspect contrasts, the subjunctive, and the passive), the MANCOVA analyses yielded no significant interactions at the multivariate or univariate level (p > .05).
However, aptitude for implicit learning was a significant covariate for agreement structures at the multivariate level with the three speaker groups as a fixed factor (F(3,102) = 3.217, p = .026, ηp² = .086, Λ = .914). The size of the effect was medium. Univariate analyses further showed that this was mostly due to the significant effect of the covariate on the word monitoring task³⁷ (i.e., GSI for agreement structures) (F(1,104) = 6.653, p = .011, ηp² = .060), since aptitude for implicit learning did not moderate scores on the timed visual or auditory GJTs (F(1,104) = 1.226, p = .271, ηp² = .012 and F(1,104) = .185, p = .668, ηp² = .002, respectively).³⁸ The multivariate effect of aptitude approached significance when L2 learners were combined and compared with controls (F(3,104) = 2.513, p = .063, ηp² = .068, Λ = .932), as well as, at the univariate level, for the word monitoring task (F(1,104) = 3.884, p = .051, ηp² = .035), but not for the timed visual or auditory GJTs (F(1,104) = .384, p = .537, ηp² = .004 and F(1,104) = .750, p = .388, ηp² = .007, respectively).³⁹ The interactions between group and covariate were not significant for any of the language measures: timed visual GJT (F(2,104) = .382, p = .002, ηp² = .961), timed auditory GJT (F(2,104) = 1.115, p = .293, ηp² = .010), or word monitoring task (F(2,104) = .268, p = .606, ηp² = .003).

³⁶ The analysis including outliers in the word monitoring task yielded similar results: aptitude for implicit learning was not a significant covariate at the multivariate level (F(3,114) = 1.505, p = .217, ηp² = .038, Λ = .962), or, at the univariate level, for the timed visual GJT (F(1,116) = .904, p = .344, ηp² = .008), timed auditory GJT (F(1,116) = .401, p = .528, ηp² = .003), or word monitoring task (F(1,116) = .679, p = .412, ηp² = .006). Interactions were all non-significant (p > .05).
The fact that aptitude for implicit learning was a significant covariate for agreement structures in the word monitoring task and that there was no significant group-by-covariate interaction suggested a comparable effect of aptitude for implicit learning on grammatical sensitivity towards agreement structures in all the groups. As can be seen in Figure 53, the slopes of the three groups of participants were similar, although steeper in the case of the two L2-learner groups.

³⁷ In addition to the word monitoring task, aptitude for implicit learning showed a trend towards significance in the early L2 group for the timed visual and timed auditory GJTs. Correlations in this group were .26 (p = .074) and .23 (p = .120), respectively. In addition, the difference between high- and low-aptitude early L2 learners had a p value of .086 (t(28) = 1.779, p = .086) in the case of the timed auditory GJT.

³⁸ The analysis including outliers in the word monitoring task also yielded a significant multivariate effect of the covariate (F(3,112) = 3.090, p = .030, ηp² = .076, Λ = .924). Univariate tests were also significant for the word monitoring task (F(1,114) = 6.010, p = .016, ηp² = .050), but not for the timed visual or auditory GJT (F(1,114) = 2.153, p = .145, ηp² = .019 and F(1,114) = .177, p = .675, ηp² = .002, respectively). Interactions were all non-significant (p > .05).

³⁹ The analysis including outliers in the word monitoring task yielded non-significant results for the covariate at the multivariate level (F(3,114) = 2.102, p = .104, ηp² = .052, Λ = .948) and univariate level, for the timed visual GJT (F(1,116) = .714, p = .400, ηp² = .006), timed auditory GJT (F(1,116) = .777, p = .380, ηp² = .007), and word monitoring task (F(1,116) = 2.593, p = .110, ηp² = .022). The interactions between group and covariate were all non-significant (p > .05).

Figure 53.
Regression of the grammatical sensitivity index for agreement items on aptitude for implicit learning at each group level

Follow-up simple correlations showed that there was a significant positive relationship between aptitude for implicit learning and sensitivity to agreement structures in the early and late AO groups (r = .34, p = .021, and r = .31, p = .038, respectively⁴⁰), but not in the NS control group (r = .15, p = .551) (disattenuated correlations were .47, .43, and .21). A comparison of the sensitivity indices displayed by high and low implicit aptitude participants in each group (see Table 30) further showed no significant differences in the NS control group (t(12) = 1.009, p = .333, mean difference of 63.42), but significant differences in both the early and late AO groups, where L2 learners with high aptitude for implicit learning showed higher sensitivity than L2 learners with low implicit aptitude (t(27) = 2.364, p = .026, mean difference of 69.86, and t(29) = 3.048, p = .005, mean difference of 97.16, respectively).

⁴⁰ The correlations including outliers were .28 (p = .053) and .40 (p = .004), in the early and late AO groups, respectively. The fact that the correlation in the late AO group increased in magnitude from .31 to .40 suggests that outliers are not always the result of task-irrelevant factors. They could be individuals with high aptitude that perform as outliers as a result of their cognitive ability. If so, perhaps they should not be eliminated, since they provide valuable information about the relationship between aptitude and test scores (Doughty, p.c., 4/10/2012).

Table 30. Summary of GSIs for Agreement Structures on the Word Monitoring Task by High and Low Implicit Aptitude Participants

                 Control                           Early AO                          Late AO
                 High (n = 6)     Low (n = 8)      High (n = 13)    Low (n = 16)     High (n = 15)    Low (n = 16)
GSI Agreement    86.92 (157.44)   23.50 (74.18)    90.32 (87.63)    20.47 (71.62)    55.68 (96.31)    -41.48 (80.98)

Note. Standard deviations appear between parentheses.

Given the differential contribution of the LLAMA aptitude subtests B, E, and F, and the GAMA intelligence scores on early and late L2 learners' language attainment, a last follow-up analysis was conducted separating the effects of LLAMA D and SRT learning scores on L2 learners' grammatical sensitivity for agreement structures. With LLAMA D as a covariate and three groups as a fixed factor (NSs, early L2 learners, and late L2 learners), multivariate effects were not significant. LLAMA D was not a significant covariate (F(3,102) = 1.452, p = .232, ηp² = .039, Λ = .961) and the interaction with group was not significant, either (F(6,204) = 1.130, p = .346, ηp² = .031, Λ = .940). At the univariate level, the effects of LLAMA D on grammatical sensitivity had a p value of .084 (F(1,104) = 3.035, p = .084, ηp² = .027), but there was no interaction between group and covariate (F(2,104) = 1.401, p = .251, ηp² = .025). When learners were combined as a single group and compared with controls, results remained non-significant at the multivariate level (p > .05), but the interaction between group and grammatical sensitivity at the univariate level approached significance (F(1,106) = 3.269, p = .073, ηp² = .029). Simple correlations showed that the relationship between LLAMA D and sensitivity to agreement structures was significant in the early AO group (r = .34, p = .021) and approached significance in the late AO group (r = .26, p = .079), but was not significant in the control group (r = -.07, p = .757) (disattenuated correlations were .44, .34, and -.09).

The effects of the SRT learning score as a covariate on sensitivity to agreement structures with three groups of participants as a fixed factor were not significant. The SRT score was not a significant covariate (F(3,102) = .342, p = .795, ηp² = .009, Λ = .991) and it did not interact with group (F(6,204) = 1.217, p = .307, ηp² = .031, Λ
= .969) at the multivariate level. The results at the univariate level were also non-significant (p > .05). Combining L2 learners into a single group did not make any difference, and the SRT score remained a non-significant covariate (p > .05). When simple correlations were computed, they were weak and non-significant in the control and early AO groups (r = .14, p = .553 and r = .16, p = .263, respectively) (disattenuated correlations were .21 and .24). In the late AO group, the correlation had a slightly higher magnitude but remained non-significant (r = .21, p = .141) (the disattenuated correlation was .32).

These results suggested that the significant relationship found between aptitude for implicit learning and early and late L2 learners' sensitivity to agreement structures was mostly due to the LLAMA D subtest in the early group and to a combination of LLAMA D and SRT scores in the late group.

Finally, the results for non-agreement items were all non-significant. Aptitude for implicit learning was not a significant covariate at any level (p > .05), and it did not interact with group either (p > .05). This indicated that individual differences in aptitude for implicit learning did not moderate L2 learners' grammatical sensitivity to non-agreement structures (aspect contrasts, the subjunctive, and the passive).

Given the significant relationship between aptitude for implicit learning and sensitivity to agreement violations in the late AO group, late L2 learners' performance on agreement items was examined across all the L2 measures in order to determine the extent to which late L2 learners displayed knowledge of grammatical agreement. Table 31 presents a breakdown of the average percentage scores obtained by the late L2 learners (n = 50) on agreement items.

Table 31.
Average Percentage Scores on Agreement Items in the Late AO Group

Agreement Structures             M
Untimed Visual GJT               63.20 (11.39)
Untimed Auditory GJT             62.40 (10.41)
Metalinguistic Knowledge Test    67.47 (13.78)
Timed Visual GJT                 57.62 (10.81)
Timed Auditory GJT               57.78 (10.08)

Note. Standard deviations appear between parentheses.

As can be seen, late L2 learners' scores on agreement items were higher on measures hypothesized to allow controlled use of L2 knowledge than on measures hypothesized to require automatic use of knowledge. The average score on the metalinguistic test, a measure hypothesized to allow controlled use of L2 knowledge, was significantly higher than the scores on all the other tests, according to Bonferroni-adjusted comparisons: untimed auditory GJT (p = .018), untimed visual GJT (p = .044), timed visual GJT (p < .001) and timed auditory GJT (p < .001). The average score on the timed visual GJT, a measure hypothesized to require automatic use of L2 knowledge, did not differ from the average score on the timed auditory GJT (p = 1.000), or from the scores on two of the measures hypothesized to allow controlled use of knowledge, the untimed visual GJT (p = .067) and the untimed auditory GJT (p = .084).

To summarize, aptitude for implicit learning moderated early and late L2 learners' scores on agreement structures. This effect was significant at the multivariate level for a combination of agreement scores on the three measures hypothesized to require automatic use of L2 knowledge (timed visual GJT, timed auditory GJT, and word monitoring task), and, at the univariate level, for the word monitoring task, at the automatic end of the L2 knowledge use continuum. In this task, individual differences in implicit aptitude were related to early and late L2 learners' degree of grammatical sensitivity to agreement violations.
5.3.4 Summary of Results: Cognitive Aptitudes and Language Attainment

The main findings regarding the relationship between cognitive aptitudes and language attainment among early L2 learners, late L2 learners, and NSs were the following:

• No significant interactions between L2-learner group and language aptitude in any of the analyses when overall test scores or scores on ungrammatical items were considered, suggesting comparable effects of cognitive aptitudes on language attainment among early and late L2 learners
  o Follow-up tests on agreement structures (gender, person, and number), however, yielded a significant interaction between group and aptitude for implicit learning: early L2 learners' scores on agreement items were moderated by aptitude for implicit learning in measures that allow controlled use of L2 knowledge and approached significance in measures that require automatic use of L2 knowledge

• Significant multivariate effects of aptitude for explicit learning as a covariate on measures that allow controlled use of L2 knowledge, but not on measures that require automatic use of L2 knowledge
  o In the early AO group, the effect of aptitude for explicit learning was due to a combination of the three LLAMA aptitude subtests (B, F, and E), whereas in the late AO group it was due to a combination of the three LLAMA aptitude subtests plus general intelligence

• Significant univariate effects of aptitude for explicit learning as a covariate on measures that allow controlled use of L2 knowledge in both the early AO and late AO groups
  o L2 learners with high aptitude for explicit learning outperformed L2 learners with low aptitude for explicit learning on 1) the metalinguistic knowledge test (in the early AO and late AO groups, but not in the control group), 2) the untimed visual GJT (in the early AO group and approaching significance in the late AO group, but not in the control group), and 3) the untimed auditory GJT (in the early AO group, but not in the late AO group or control group, though a trend was found in the late AO group for the LLAMA E subtest in the aptitude composite)
  o For ungrammatical items (half of the items on each language test), results remained robust for the metalinguistic test: high- and low-aptitude early L2 learners' scores were significantly different, and high- and low-aptitude late L2 learners' scores approached significance
  o Early and late L2 learners' performance on agreement and non-agreement structures was equally moderated by aptitude for explicit learning on untimed visual tests (untimed visual GJT and metalinguistic test), but not on the untimed auditory GJT, which only showed a significant relationship with aptitude for explicit learning in the early AO group (this difference in modality between the two L2-learner groups did not yield a significant interaction)

• Significant univariate effects of aptitude for explicit learning as a covariate on measures that require automatic use of L2 knowledge in the early AO group only
  o Early L2 learners with high aptitude for explicit learning outperformed their low-aptitude counterparts on the timed auditory GJT and this difference remained significant for ungrammatical items
  o Early L2 learners with high aptitude for explicit learning also outperformed their low-aptitude counterparts on agreement items on the timed visual GJT
  o No significant effects of aptitude for explicit learning on the word monitoring task (i.e., grammatical sensitivity) in any of the groups

• Significant multivariate effects of aptitude for implicit learning on measures that require automatic use of L2 knowledge and significant univariate effects as a covariate on agreement items in the word monitoring task
  o L2 learners with high aptitude for implicit learning showed significantly greater sensitivity towards agreement violations than L2 learners with low aptitude for implicit learning, in both the early and late AO groups, but not in the control group
  o In the early AO group, the effect of aptitude for implicit learning was mostly due to the LLAMA D aptitude subtest, whereas in the late AO group it was due to a combination of LLAMA D plus learning in the probabilistic SRT task

• No significant multivariate effects of general intelligence on either set of language measures, but significant interaction between L2-learner group and intelligence in the untimed visual GJT and metalinguistic test
  o 1) High- and low-intelligence late L2 learners were significantly different on overall metalinguistic test scores and approached significance on the untimed visual GJT, 2) Differences between high- and low-intelligence late L2 learners on the metalinguistic test remained significant for ungrammatical items, 3) Late L2 learners' performance on agreement and non-agreement structures was equally moderated by general intelligence on untimed visual tests (untimed visual GJT and metalinguistic test), 4) The only test where intelligence did not show an effect in the late AO group was the untimed auditory GJT

As a visual summary of the results, Table 32 shows the relationships observed in the data between aptitude for explicit learning, aptitude for implicit learning, and general intelligence on language attainment as measured by tests hypothesized to allow controlled use of knowledge or require automatic use of knowledge. A check indicates that at least one of the analyses conducted on either overall test scores, ungrammatical items, agreement structures, or non-agreement structures was significant for the speaker group in question (early L2 learners and late L2 learners). As can be observed, the two types of aptitude identified in the study played a role in early L2 learners' attainment, regardless of type of outcome measure. However, general intelligence did not play any role. The role of aptitude in late L2 learners' attainment was more specific and could only be observed in certain outcome measures. Like early L2 learners, late L2 learners' attainment on measures of controlled use of knowledge was moderated by aptitude for explicit learning. Like early L2 learners as well, late L2 learners' attainment on measures of automatic use of knowledge was moderated by aptitude for implicit learning. Unlike early L2 learners, however, late L2 learners' attainment on measures of controlled use of knowledge was also moderated by general intelligence. Therefore, general intelligence and aptitude for explicit learning had a similar effect among late L2 learners. No effects of cognitive aptitudes were observed in the NS control group.

Table 32.
Summary of Relationships between Types of Aptitude, General Intelligence, and L2 Attainment

L2 Attainment                     Explicit Aptitude    Implicit Aptitude    General Intelligence
                                  Early     Late       Early     Late       Early     Late
Controlled Use of L2 Knowledge    ✓         ✓          ✓         ✗          ✗         ✓
Automatic Use of L2 Knowledge     ✓         ✗          ✓         ✓          ✗         ✗

Chapter 6: Discussion and Conclusions

This study set out to investigate the relationship between different cognitive aptitudes for L2 learning, including general intelligence, and ultimate level of language attainment by early (AOs 3-6) and late (AOs ≥ 16) L2 learners. Early bilinguals who start acquiring the L2 in an immersion context before age 6 were hypothesized not to be fundamentally different from NSs in terms of learning mechanisms (although they may still differ in ultimate success), whereas late bilinguals who start acquiring the L2 as adults (after age 16) should be fundamentally different from NSs in terms of learning mechanisms (and also different in ultimate success). Following DeKeyser's (2000) claim that any relationships between individual differences in language aptitude and learning outcomes constitute potential evidence for differences in learning processes, the present study examined whether individual differences in cognitive aptitudes hypothesized to play a role in either implicit or explicit learning relate to variation in L2 attainment in early and late L2 learners, as measured by tasks that allow controlled use of knowledge or that require more automatic use of knowledge.

A total of 120 participants took part in the study: 50 early L2 learners and 50 late L2 learners, all of them L1 Chinese-L2 Spanish bilinguals, and 20 NS controls.
A set of six L2 attainment measures reflecting a continuum from automatic to controlled use of language knowledge was administered: four GJTs (timed visual, timed auditory, untimed visual, and untimed auditory), a metalinguistic knowledge test (at the controlled end of the L2 knowledge use continuum), and a word monitoring task (at the automatic end of the L2 knowledge use continuum). A battery of six cognitive tests was also administered: four language aptitude subtests, a general intelligence test, and a probabilistic serial reaction time task.

6.1 Cognitive Aptitudes

Regarding cognitive aptitudes, this dissertation pointed out the heavy bias towards explicit cognitive processes that has characterized language aptitude constructs and measures in SLA. It also brought to attention the fact that SLA studies have neglected implicit cognitive processes as a source of potential aptitudes for language learning. One of the goals of this dissertation was to address this gap by including a cognitive task (a probabilistic SRT task) that could tap implicit learning. Implicit learning was further conceptualized to be an ability with meaningful individual differences that could be related to variation in L2 outcomes, in line with Kaufman et al. (2010) and Woltz (2003).

The results of an exploratory factor analysis, conducted using principal components analysis as the method of extraction and Varimax as the method of rotation, showed that the six cognitive tests administered (four language aptitude subtests, a general intelligence test, and a probabilistic SRT task) loaded on two different components. Four of the six tests (LLAMA B, E, F, and GAMA) loaded strongly on the first component, which accounted for the largest amount of variation and which was interpreted as "aptitude for explicit learning".
On the other hand, the remaining two tests (LLAMA D and the probabilistic SRT task) loaded strongly on a second component that was interpreted as "aptitude for implicit learning". The interpretation of the underlying constructs was informed by the characteristics of the tests themselves, as well as by previous research findings.

The tests loading together on the "explicit aptitude" component had in common the fact that they involved explicit cognitive processes (i.e., attention-driven, conscious, and intentional processes). All involved working out relationships in either verbal or non-verbal datasets and allowed time to think and use problem-solving strategies. These skills can be broadly understood as analytic ability or explicit inductive learning ability, and they play a role in discovering patterns and rules (i.e., creating and testing hypotheses) on the basis of input data. This is one of the meanings that "inductive learning ability" has had in the SLA literature as a component of language aptitude. Moreover, this was one of the constructs that Carroll (1962) proposed as part of his four-factor model (phonetic coding, rote learning, grammatical sensitivity, and inductive learning), but that, nevertheless, was not represented in the MLAT battery, and that Skehan (1998) reconceptualized as language analytic ability together with grammatical sensitivity. More recently, in a review of aptitude research, Skehan (2012) pointed out that the LLAMA aptitude test differed from the MLAT in that it "adds a receptive interpretation of inductive language ability" (p. 390).

The results of this dissertation support Skehan's interpretation, but further qualify it by making a distinction between explicit and implicit inductive learning ability in the context of the LLAMA test battery and as broader aptitude components. Implicit inductive learning involves learning from input by analogy, not analysis (N. Ellis & Laporte, 1997).
DeKeyser (1995) further hypothesized that implicit inductive learning is good for prototypicality (probabilistic) patterns (i.e., linguistic prototypes, such as number, case, or gender markings that are subject to allomorphy). In the context of the LLAMA aptitude test battery, while LLAMA F (grammar inferencing) was the strongest loading on the component interpreted as "aptitude for explicit learning" and can be defined as a test measuring explicit inductive learning, LLAMA D (sound recognition) loaded on the component interpreted as "aptitude for implicit learning" and can be defined as a test measuring implicit inductive learning ability. LLAMA F requires test-takers to work out the grammar of an unknown language by means of pictures and short written sentences. LLAMA D, on the other hand, measures the ability to discriminate short stretches of spoken language by analogy. As pointed out by Meara (2005), LLAMA D "owes something to Speciale (Speciale, N. Ellis, & Bywater, 2004)" who suggest "that a key skill in language ability is your ability to recognize patterns, particularly patterns in spoken language" (p. 8). Speciale et al.'s (2004) study included two cognitive factors as predictors of L2 vocabulary acquisition: a task of phonological sequence learning, measuring the ability to learn phonological regularities, and a nonword repetition task, measuring phonological short-term memory capacity. One of their findings was that phonological sequence learning ability constitutes a source of individual differences that can be dissociated from short-term store capacity. This line of research is based on a strand of cognitive psychology that investigates the implicit induction of phonological sequences (Saffran et al., 1996; Saffran, Johnson, Aslin, & Newport, 1999; Saffran, Newport, Aslin, Tunick, & Barrueco, 1997). LLAMA D can, thus, be seen as an attempt to measure implicit inductive learning ability.
Previous research has also shown a distinction between LLAMA B, E, and F, on the one hand, and LLAMA D, on the other, as well as the existence of aptitude profiles based on the LLAMA test (i.e., individuals with high scores on LLAMA F, but low scores on LLAMA D, and vice-versa, resulting in close-to-zero or weak correlations between both) (Granena, to appear). In addition, general intelligence, which, in this dissertation, loaded on the component interpreted as aptitude for explicit learning, has been consistently related to attention-driven working memory measures (e.g., Engle et al., 1999; Kyllonen, 1996; Kyllonen & Christal, 1990) and artificial grammar learning when participants are instructed to look for patterns in the training materials (e.g., Gebauer & Mackintosh, 2007; Reber et al., 1991; Robinson, 2002). However, it has exhibited low correlations with procedural skill performance beyond the early stages (e.g., Ackerman, 1987, 1988), indicating that, at least as represented by conventional tests, intelligence involves explicit cognitive processes similar to those that characterize some language aptitude components. Skehan (1998) further argued that the relationship between aptitude and intelligence was likely to be strongest for components of aptitude such as language analytic ability, but not for others such as phonetic coding ability. Although language aptitude and intelligence overlap to some extent, as shown by either low to moderate or moderate to strong correlations (Gardner & Lambert, 1972; Sasaki, 1996; Skehan, 1982; Wesche, Edwards, & Wells, 1982), they still exhibit different correlations with L2 outcomes (e.g., Skehan, 1982). The results of this dissertation provided further support for the specificity of language aptitude.
Intelligence loaded on the same component as three of the LLAMA subtests, and, in combination with these subtests, it moderated L2 attainment as measured by tests that allow controlled use of knowledge in both early and late L2 learners. However, as an independent factor, it only showed a relationship among late L2 learners. This relationship only held when tests allowed controlled use of L2 knowledge. Therefore, L2 attainment seems to be moderated by several factors (Carroll, 1983), specific factors when L2 learning starts at an early age, and both general and specific factors when L2 learning starts in adulthood. However, in this study, general factors did not moderate late L2 learners' attainment on tasks that required automatic use of L2 knowledge. In these tasks, only the type of language aptitude interpreted as being advantageous for implicit language learning, and which was unrelated to general intelligence, played a role. These findings suggest that there are abilities specific to language learning in post-critical period learning that do not overlap with general intellectual functioning.

6.2 Language Attainment

Regarding language attainment, this dissertation adopted a multiple-task design to measure participants' morphosyntactic attainment, following studies such as Abrahamsson and Hyltenstam (2009), where multiple tasks were used to sample different language domains. Notwithstanding the complexity of such designs, multiple assessment tasks are desirable in SLA research in order to provide a more comprehensive picture of learners' actual proficiency level (Chaudron, 2003). The multiple-task design used in this dissertation further aimed at addressing a gap in previous studies that have investigated language aptitude in single-task designs relying on L2 proficiency measures that have been biased towards explicit cognitive processes (e.g., DeKeyser, 2000; DeKeyser et al., 2010; Abrahamsson & Hyltenstam, 2008).
A distinction was made between L2 measures that allow controlled use of L2 knowledge and measures that require automatic use of L2 knowledge. These measures were hypothesized to lie along a continuum of use of knowledge. The two control tasks at the extreme ends of the continuum were a metalinguistic knowledge test and a word monitoring task. In the metalinguistic test, participants' attention was directly focused on linguistic structure, correctness and grammatical rules (i.e., explicit declarative facts about language). It required language analysis rather than intuition about correctness. In the word monitoring task, participants' attention was focused on meaning. It required monitoring for a target word in a sentence and paying attention to sentence meaning, while the researcher measured sensitivity to grammatical violations. Four more tasks were administered that lay along the continuum: two timed GJTs (visual and auditory), hypothesized to require more automatic use of L2 knowledge, and two untimed GJTs (visual and auditory), hypothesized to allow controlled use of L2 knowledge. The results showed that, as hypothesized, the metalinguistic knowledge test and the two untimed GJTs were strongly correlated with one another (r > .80). These results confirmed the findings of previous psychometric studies (Ellis, 2005; Bowles, 2011), where metalinguistic knowledge tests loaded on the same factor as untimed GJTs. Contrary to Ellis' (2005) and Bowles' (2011) results, however, the correlations between language measures hypothesized to require automatic use of language knowledge were not strong. Specifically, the correlations between the GSI, which was computed as an index of sensitivity to grammatical violations in the word monitoring task, and the other two measures hypothesized to require automatic use of language knowledge, the two timed GJTs, were only moderately weak (r = .28).
In fact, the two timed GJTs correlated more strongly with the measures hypothesized to allow controlled use of language knowledge (magnitudes ranging between .66 and .80). This could be due to the nature of the data, accuracy scores in the case of the GJTs and the metalinguistic test, but reaction times in the case of the word monitoring task. Alternatively, the word monitoring task could be measuring a qualitatively different type of linguistic competence. This may be the type of integrated language knowledge that the test has been claimed to measure (Kilborn & Moss, 1996) and that several studies, mostly neurolinguistic studies investigating language disorders, have provided evidence for (Karmiloff-Smith, Tyler, Voice, Sims, Udwin, Howlin, & Davies, 1998; Kuperberg, McGuire, & David, 1998, 2000; Marslen-Wilson & Tyler, 1980; Peelle, Cooke, Moore, Vesely, & Grossman, 2007). GJTs, on the other hand, could be measuring controlled use of L2 knowledge to a certain extent, regardless of the time pressure factor (Jiang, 2007). Regarding early and late L2 learners' attainment, this dissertation found significant differences between the two groups' overall scores in all the L2 measures administered, in line with previous studies (Abrahamsson & Hyltenstam, 2009; DeKeyser, 2000; DeKeyser et al., 2010; Granena & Long, 2010; Johnson & Newport, 1989). However, early L2 learners were also significantly different from NSs in all the measures, except in the word monitoring task, where they showed the same sensitivity to grammatical violations as NSs. When individual structures were compared, early L2 learners did not differ from late L2 learners in two of the agreement structures, gender agreement and subject-verb agreement. These similarities were observed in the timed visual GJT, untimed visual GJT, and metalinguistic test.
In addition, early and late L2 learners did not differ regarding their sensitivity to agreement structures in the word monitoring task, even though, in this case, early learners did not differ from NSs, either (only NSs and late learners did). These results indicate that the acquisition of certain grammatical properties may be affected even when the L2 is acquired as early as age 3 or 4. These findings are partly similar to findings reported by Meisel (2009), who claimed that inflectional morphology is the domain in which child L2 acquisition can resemble adult L2 acquisition, and differ from L1 acquisition. He proposed a modified version of the Critical Period Hypothesis for certain domains of grammar. In this dissertation, an area that was especially affected was gender and subject-verb agreement, and the language pairing investigated was Chinese-Spanish, two languages with very different inflectional paradigms (uniform vs. complex). Still, over half of the early L2 learners were able to score within NS-control range, several across the entire set of measures, whereas only a few late learners did, and none across the entire set, which suggests that native-like attainment remains possible for early L2 learners, but impossible for late L2 learners. Meisel (2009) further hypothesized that language-specific learning mechanisms (processing and discovery mechanisms) may also be affected early and proposed applying the Fundamental Difference Hypothesis (Bley-Vroman, 1990) to child, as well as adult L2 acquisition. According to Meisel, success in L2 acquisition depends on "a person's ability to inhibit the competing non-domain-specific cognitive resources" (p. 18), an explanation that seems compatible with the results reported in this dissertation, which are discussed in the next section, regarding the role of cognitive aptitudes not only in late L2 learners, but also in early L2 learners.
6.3 Cognitive Aptitudes and Language Attainment

This dissertation hypothesized that cognitive aptitudes that are more relevant for explicit language learning and processing would predict late, but not early, L2 learners' attainment on tasks that allow controlled use of L2 knowledge (Hypotheses 2a and 1a, respectively). These tasks increase available test time and decrease processing demands and, therefore, give L2 speakers an opportunity to rely on problem-solving and analytic skills. In these tasks, adult learners can rely on explicit L2 knowledge and compensate for their limited implicit competence. Adult learners with higher aptitude for explicit language learning were expected to do better as a result of their greater analytic, metalinguistic abilities. Contrary to expectations, the results of the MANOVA analyses did not provide support for a differential role of aptitude for explicit learning in the two L2-learner groups. There was no evidence of a significant interaction between group and covariate and, therefore, the relationship between aptitude and attainment, as measured by tests that allow controlled use of knowledge, was comparable in the two groups of L2 learners. Moreover, the relationship between aptitude for explicit learning and tasks that allow controlled use of L2 knowledge was significant at the multivariate level for a linear combination of the three measures (i.e., untimed visual GJT, untimed auditory GJT, and metalinguistic test), as well as, at the univariate level, for the untimed visual GJT and metalinguistic test in both the early and late AO groups, and for the untimed auditory GJT in the early group only. These results confirmed Hypothesis 2a and refuted Hypothesis 1a, since aptitude for explicit learning, unexpectedly, also played a role in the early AO group.
The fact that early L2 learners with high aptitude for explicit learning outperformed those with low aptitude on the untimed visual GJT, untimed auditory GJT, and metalinguistic test contradicts the findings in DeKeyser (2000) and DeKeyser et al. (2010), which showed a relationship between verbal analytic ability and scores on an untimed auditory GJT, like the one designed for this study, only in the late AO group. The relationship found between aptitude for explicit learning and morphosyntactic L2 attainment is, however, in line with the findings of Abrahamsson and Hyltenstam (2008), who concluded that aptitude played "not only a crucial role for adult learners but also a certain role for child learners" (p. 499). Common to both studies (Abrahamsson & Hyltenstam, 2008, and the present study) was a larger sample of early L2 learners than in DeKeyser (2000) or DeKeyser et al. (2010). Sample sizes in Abrahamsson and Hyltenstam (2008) and in the current study were 31 and 50, respectively, whereas DeKeyser's (2000) early AO group included 15 participants and DeKeyser et al.'s (2010) included 20. As a result, the range of early L2 learners' test scores in the latter two studies was more restricted. In fact, in DeKeyser (2000), all the early L2 learners scored above 90% on the GJT. In addition, the language aptitude test employed in DeKeyser (2000) and DeKeyser et al. (2010) was administered in the participants' L1. This could have further restricted the range of scores, given that the test was measuring verbal analytic ability in the language that the early L2 learners might have felt less comfortable with and in which their literacy skills might have been the poorest. Therefore, the lack of a significant positive correlation between early L2 learners' language aptitude and GJT scores in DeKeyser (2000) and DeKeyser et al. (2010) could have been an artifact of the small variance (Long, 2007).
DeKeyser's (2000) explanation of the significant relationship between verbal analytic ability and morphosyntactic attainment only in the late, but not in the early, AO group was that adult learners relied on explicit learning mechanisms to compensate for increasingly inefficient implicit learning mechanisms. According to this explanation, the results in the present study would suggest that, not only late L2 learners, but also early L2 learners, rely on explicit, analytic, problem-solving capacities to reach higher levels of proficiency in morphosyntactic L2 attainment, as measured by certain tests. This claim would be in line with Paradis' (2009) position, according to which only children exposed to the L2 "before the age of 4 or 5 (and the younger the better) acquire the second language implicitly" (p. 110). According to him, the reason why some early L2 learners can still perform or be perceived as native-like is because of speeded-up controlled use of metalinguistic knowledge (i.e., conscious knowledge about form). An alternative explanation could be that untimed L2 measures with a focus on language correctness allow L2 learners (both early and late) to control their performance consciously, inducing them to process language explicitly. As a result, untimed L2 measures would be partly measuring the same abilities as tests of aptitude for explicit learning. This would explain the fact that the largest effect size observed in the data corresponded to the test at the most explicit end of the continuum from automatic to controlled use of L2 knowledge, the metalinguistic knowledge test. This test encouraged the highest degree of awareness and the greatest amount of attention to language forms by asking participants to correct grammatical errors and provide grammatical rules.
Similar results were reported by Granena (2011), who found a positive effect for aptitude, measured by an average of the four LLAMA subtests, on an untimed visual GJT with a correction component in a group of 30 NSs of English, all of them adult L2 learners of Spanish and very advanced speakers. The question remains whether early L2 learners who started learning the L2 as early as age 3, and who were hypothesized to have used the same (implicit only) language learning mechanisms as NSs, would rely on conscious knowledge about language form on untimed L2 measures that focus on language correctness. Like NSs, one would expect them, predominantly, to make use of feel judgments when responding to any language task, unless, as already argued, untimed L2 measures that focus on language correctness induce learners to approach the task analytically by placing a great deal of conscious (i.e., controlled) attention on sentence structure. L2 learners with higher analytic ability as measured by tests of aptitude for explicit learning could be more successful at detecting grammatical errors when a task requires focusing on language forms. In Abrahamsson and Hyltenstam's (2008) study, early L2 learners' aptitude level was strongly related to scores on a GJT (r = .70, p < .001). This might be explained by the highly complex nature of the stimuli (i.e., very long, semantically complex sentences) and/or by the fact that the GJT combined the results of an online and an offline version of the test. In other words, L2 learners with high aptitude in the domain of explicit, attention-driven memory processes could be more successful at processing and parsing sentence stimuli to identify grammatical errors.
If the relationship between aptitude for explicit learning and untimed L2 measures that focus on language correctness is due to the nature of the language test (i.e., test effects), rather than to reliance on explicit language knowledge, one would also expect a relationship between aptitude and performance on untimed L2 measures among NSs. However, this study did not find any significant relationships between NSs' language attainment and cognitive aptitudes on any of the attainment measures. One possibility is that the high inter-individual homogeneity that characterizes NSs' performance on language measures, in combination with the smaller sample size that typically characterizes NS control groups, precludes finding any significant results. In fact, Abrahamsson and Hyltenstam (2008) reported a correlation of .47 in their NS group, which did not reach significance (p = .077), probably due to the small size of the group (n = 15).41 This would suggest that a relationship between NSs' cognitive aptitudes and their performance on certain types of GJTs cannot be discounted. The question is what the results from tasks that call for the use of analytic abilities and attention-driven memory processes can reveal about language competence in general, and to what extent they are similar to results from spontaneous language production tasks.

41 In another study, Abrahamsson (p.c., 8/21/2011) did find a significant relationship between aptitude (as measured by the LLAMA aptitude test) and language performance on a GJT among native speakers, as well.

It was also predicted that cognitive aptitudes more relevant for explicit language learning and processing would not moderate either early or late L2 attainment on tasks that require automatic use of L2 knowledge (Hypotheses 4a and 5a, respectively), since these are online tasks that minimize the opportunities to plan responses or rely on problem-solving and analytic skills.
The results confirmed Hypothesis 5a, but refuted Hypothesis 4a. As predicted, explicit language aptitude did not moderate late L2 learners' attainment on a timed visual GJT, a timed auditory GJT, and a word monitoring task. However, it did moderate early L2 learners' attainment on the two GJTs, the timed visual and the timed auditory. Therefore, explicit language aptitude moderated early L2 learners' performance on all the L2 measures administered, except for the word monitoring task, at the extreme end of the continuum of automatic use of L2 knowledge. A feature that these measures have in common is the fact that they all focus test-takers' attention on language correctness and accuracy of grammaticality judgment. The word monitoring task, on the other hand, is carried out under a dual-task framework (e.g., Fodor, Ni, Crain, & Shankweiler, 1996; Furst & Hitch, 2000; Mullennix, Sawusch, & Garrison, 1992; Ransdell, Arecco, & Levy, 2001; Waters & Caplan, 1997; Wurm & Samuel, 1997) that focuses participants' attention on sentence meaning and word monitoring, while the researcher measures participants' sensitivity to linguistic violations (participants are never told about the presence of ungrammatical stimuli). One could argue that tests with a focus on language forms and language correctness call for, or may benefit from, test-takers' cognitive aptitudes for explicit language learning (i.e., analytic, metalinguistic abilities). According to Jiang (2007), "a learner's performance in a GJT task (even a timed GJ task) can be a result of applying explicit knowledge rather than automatic competence" (p. 6, emphasis added). He further argued that psycholinguistic research paradigms, such as the one followed by the word monitoring task, are more likely to be informative about automatic activation of integrated linguistic knowledge, since participants react to grammatical errors without intending to do so.
If language tests that focus on language correctness call for analytic and/or metalinguistic abilities, the question remains as to why this study found a relationship between explicit language aptitude and attainment on the timed GJTs in the early AO group only, and not in the late AO group. The reason could be the time constraints imposed on the test and their effect on late L2 learners' performance. On timed language tests, test-takers are pressured to perform a task online under additional time constraints, in order to minimize controlled use of L2 knowledge. Performance typically declines when compared to untimed tasks (e.g., Bialystok, 1979; Bialystok & Miller, 1999; Murphy, 1997; Loewen, 2009), even among NSs. In the present study, declines were significant regardless of test modality (visual or auditory), suggesting that time pressure creates a confounding factor at the level of processing above and beyond the possible confounding factor of phonological decoding typically associated with the aural presentation of stimuli. Time pressure made all participants' scores decline significantly, including NSs' scores. In the case of the late L2 learners, overall average raw scores (including missed items) on the timed visual and timed auditory GJTs were 26.68 (SD = 5.41) and 29.00 (SD = 5.25), out of a maximum of 60. Both raw averages were, therefore, below chance level. The proportion of missed items in this group, with the corresponding loss of power it entails, was also considerable. The percentage scores taking into account total number of attempts were close to chance level, 57.88% and 57.63% for the timed visual and timed auditory GJTs, respectively. Participants in the study by R. Ellis (2005) with 91 adult foreign language learners of mixed proficiency levels also scored close to chance level (54%) on the timed GJT, but well above chance on the untimed GJT (82%) and the oral narrative task (72%).
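As a minimal arithmetic sketch of the chance-level comparison above, the late learners' reported raw means can be expressed as proportions of the 60-item maximum. The sketch assumes that chance on a binary grammaticality judgment corresponds to 50% (i.e., 30 of 60 items); the names MAX_ITEMS, CHANCE, and proportion_of_maximum are illustrative and are not part of the study's materials.

```python
MAX_ITEMS = 60   # maximum score on each timed GJT, as reported in the text
CHANCE = 0.5     # assumption: binary grammatical/ungrammatical judgment

# Late L2 learners' mean raw scores (missed items counted as incorrect).
raw_means = {"timed visual GJT": 26.68, "timed auditory GJT": 29.00}


def proportion_of_maximum(raw_mean, max_items=MAX_ITEMS):
    """Express a mean raw score as a proportion of the maximum score."""
    return raw_mean / max_items


for task, mean in raw_means.items():
    prop = proportion_of_maximum(mean)
    print(f"{task}: {prop:.1%} of maximum; below chance: {prop < CHANCE}")
```

Under this assumption, both raw means fall under 30 of 60, whereas the percentages computed over attempted items only (57.88% and 57.63%) lie slightly above 50%, which is why missed items matter for the interpretation.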
These results indicate that the trade-off between processing demands and reliable use of (any type of) L2 knowledge may not be positive among adult L2 learners, when processing demands are considerable. Contrary to R. Ellis' (2005) suggestion to use timed testing formats, it seems that online tasks performed in real time, but not under time pressure, would make more reliable measures of automatic use of L2, since they would lie more comfortably within L2 learners' processing capacity (e.g., self-paced visual formats, auditory GJTs where sentences are played only once, and spontaneous language production tasks). Regarding general intelligence, it was hypothesized that, in ultimate L2 attainment, relationships between explicit aptitude and general intelligence and learning outcomes would pattern in the same way and would be different from effects of implicit aptitude on outcomes. This hypothesis was based on studies of artificial grammar learning, in which fluid intelligence correlates with learning when participants are instructed to look for patterns in the training materials, but not under more incidental learning conditions. It was also based on studies in cognitive psychology that have shown psychometric intelligence to be more related to explicit associative learning than to implicit learning. Hypotheses 1c and 4c predicted that intelligence would not moderate early L2 learners' language attainment, as measured by tasks that allow controlled use of knowledge (1c), or tasks that require automatic use of knowledge (4c). Hypotheses 2c and 5c posited that intelligence would moderate late L2 learners' attainment on tasks that allow controlled use of knowledge (2c), but not on tasks that require automatic use of knowledge (5c). All these hypotheses were supported by the findings.
High-intelligence late L2 learners outperformed their low-intelligence counterparts on two measures of controlled L2 use (the metalinguistic test and the untimed visual GJT), but not on any other L2 outcome measures. Moreover, there were no effects of intelligence for early starters on any of the ultimate L2 attainment measures. Follow-up analyses revealed that the intelligence factor did not contribute to the significant results reported for the composite of aptitudes for explicit learning in the early AO group (despite a significant correlation between general intelligence and LLAMA subtests B, E, and F, r = .30, p = .035). In the late AO group, on the other hand, general intelligence moderated L2 attainment on the same language measures that yielded a significant relationship with LLAMA B, E, and F (the metalinguistic test and the untimed visual GJT). Therefore, both general intelligence and language aptitude measures were relevant in late L2 learners' attainment, at least on tests that allow controlled use of L2 knowledge. The main difference between the intelligence test used (the GAMA) and the LLAMA language aptitude subtests is the fact that the GAMA is a non-verbal (visual) test, whereas the LLAMA is a verbal (albeit language-independent) measure. This may suggest that general learning mechanisms play a role in adult SLA (in combination with language-specific mechanisms), but no role in child SLA, in support of skill acquisition theory (Anderson, 1983, 1993), as defended by DeKeyser (2001, 2003, 2007). However, the effect of general intelligence in the late AO group was only observed on tests that allow controlled use of L2 knowledge.
Similarly, studies comparing learning conditions have also found general intelligence to be more highly correlated with conditions where participants are explicitly instructed to look for underlying patterns than with incidental conditions in artificial grammar learning (e.g., Gebauer & Mackintosh, 2007; Reber et al., 1991; Robinson, 2002). Therefore, one cannot discount the possibility that the positive association between the two is due to the fact that they are measuring the same abilities. While being high- or low-intelligence did not make a difference for early L2 learners, late L2 learners needed the additional contribution of their general intellectual ability to perform on tasks that emphasize grammatical correctness and metalinguistic abilities. It seems that these tasks, then, would create a situation where late L2 learners may need to, and would be allowed to, resort to other cognitive resources, bringing into play all their verbal and non-verbal problem-solving capacities. A factor that could have also contributed to the relationship between general intelligence and attainment on tasks of controlled use of L2 knowledge in the late AO group is formal instruction, since only 19 of the 50 late L2 learners had received either no instruction or instruction for a period less than one year (see Section 4.1). A comparison of late L2 learners with one year of instruction, or less, (n = 19) and late L2 learners with more than two years of instruction (n = 31) revealed a significant correlation between general intelligence and metalinguistic knowledge test scores in the group with more than two years of instruction (r = .39, p = .032), but a close-to-zero negative correlation in the other group (r = -.03, p = .900).
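The two subgroup correlations above can be checked approximately with the standard t test for a Pearson correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom. The following sketch assumes two-tailed tests and works from the rounded r values reported here, so it should only be expected to come close to the reported p values; the function names are illustrative and are not taken from the study's analyses.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(r, n, steps=2000):
    """Approximate two-tailed p value for a Pearson r computed from n pairs."""
    df = n - 2
    t = abs(r) * math.sqrt(df) / math.sqrt(1 - r * r)
    # Simpson's rule (steps must be even) for the area under the
    # t density between 0 and |t|.
    h = t / steps
    area = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    # The two tails together hold whatever lies outside [-|t|, |t|].
    return 1 - 2 * area


# Subgroups of late L2 learners described in the text.
print(round(two_tailed_p(0.39, 31), 3))   # more than two years of instruction
print(round(two_tailed_p(-0.03, 19), 3))  # one year of instruction, or less
```

With the same correlation magnitude, a smaller group yields a larger p value, which is the sample-size point made elsewhere in this chapter about small NS control groups.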
If late L2 learners who have received formal instruction potentially have more explicit language knowledge, these results would suggest a relationship between intelligence and stored, or use of stored, explicit language knowledge in adult learners.42

42 The subgroup of late L2 learners with more than two years of formal instruction obtained significantly higher scores on the metalinguistic knowledge test (p = .001) and results approached significance for the untimed visual GJT (p = .064). However, they were not significantly different from late L2 learners with one year of instruction, or less, on the rest of L2 measures (p > .05).

Regarding cognitive aptitudes that are more relevant for implicit language learning and processing, it was predicted that these would moderate L2 learners' attainment on tasks that require more automatic use of L2 knowledge. This prediction was made for both early and late L2 learners (Hypotheses 4b and 5b), with the expectation that adult L2 learners can still learn implicitly, but not for NSs, whose ultimate attainment, characterized by inter-individual homogeneity, and mostly performance at ceiling, was considered independent of cognitive aptitudes. In addition, individual differences in aptitude for implicit language learning were predicted to be related to early, but not late, L2 learners' attainment on tasks that allow controlled use of L2 knowledge (Hypotheses 1b and 2b), since early L2 learners were expected to rely on the same type of knowledge, regardless of language task. Like NSs, this knowledge was hypothesized to be implicit. Unlike NSs, however, early L2 learners' ultimate attainment is characterized by greater inter-individual variability and was, therefore, expected to be moderated by cognitive aptitudes. Results showed that, as predicted by Hypothesis 2b, implicit language aptitude did not moderate late L2 learners' attainment on tasks that allow controlled use of language knowledge.
Contrary to expectations, implicit language aptitude did not moderate early L2 learners' attainment on such tasks either, at least when overall test scores or scores on ungrammatical items were considered. However, it moderated early L2 learners' attainment on agreement structures, thus, partially confirming Hypothesis 1b. Therefore, while only explicit language aptitude was a significant covariate for tasks that allow controlled use of L2 knowledge in the late AO group, both implicit and explicit language aptitude were significant covariates in the early AO group. The difference between the two was that aptitude for implicit learning only moderated early L2 learners' performance on agreement structures, but not on non-agreement ones, where only aptitude for explicit learning played a role. An aptitude effect only occurring in early L2 learners could suggest a qualitative difference in the learning mechanisms of early and late L2 learners. However, because the effect belonged to a type of aptitude hypothesized to be relevant for implicit learning, and it was also present in late learners' scores on the word monitoring task, one could argue that it is indicative of early learners' advantage in implicit learning, or in the particular value of implicit learning for such features as [- interpretable] word-ending morphology. Early learners seem to have relied to some extent on this implicit knowledge, even when the L2 measure allowed controlled use of language knowledge, whereas late learners largely relied on their analytic and/or metalinguistic abilities (aptitude for explicit learning and intelligence). The relationship between aptitude for implicit learning and agreement structures in the early group was present to a greater or lesser extent in all the L2 measures.
It was a significant relationship in measures that allow controlled use of L2 knowledge and on the word monitoring task, at the extreme end of the automatic use of knowledge continuum, and it showed a trend towards significance in the two timed GJTs. In the late AO group, on the other hand, aptitude for implicit learning only had an effect on agreement structures in the measure that drew late learners' attention away from language correctness, the word monitoring task. That was the task where late learners seem to have largely relied on implicit knowledge of grammatical agreement.

Hypotheses 4b and 5b, which predicted a relationship between implicit language aptitude and ultimate attainment on tasks that require automatic use of L2 knowledge, were partially confirmed. A significant relationship was found in the two groups of L2 learners, early and late, for the word monitoring task (at the extreme of the continuum of automatic use of L2 knowledge), but only for target structures involving grammatical agreement relations (gender, number, and person agreement). Both early and late L2 learners with high aptitude for implicit learning showed greater grammatical sensitivity towards agreement violations than L2 learners with low aptitude for implicit learning.

It is worth pointing out that the two aptitude composites patterned in the same way in the two groups of learners, as far as type of grammatical structure is concerned. The aptitude composite hypothesized to be relevant for explicit learning moderated the two types of structures investigated (agreement and non-agreement). The aptitude composite hypothesized to be relevant for implicit learning, however, did not moderate participants' attainment on non-agreement structures, and it only played a role in structures involving grammatical agreement.
This result may be relevant from the point of view of developmental patterns in acquisition and cognitive aptitudes, given the rationale behind the selection of structures in this dissertation, to which this discussion turns next.

The underlying rationale for the distinction between agreement and non-agreement structures was that L1 Spanish children acquire structures such as gender, number, and subject-verb agreement early (i.e., by age 3), whereas structures such as the subjunctive, the passive, and aspect contrasts are acquired later (i.e., at age 7 or later) (Montrul, 2004). The late acquisition of the subjunctive, the passive, and aspect contrasts has to do with their linguistic complexity and children's cognitive developmental readiness. For example, in the case of the subjunctive (mood selection), children lack mental representations of "events that are independent or even incompatible with the reality of physical events" (Pérez-Leroux, 1998, p. 589). They are also structures at the syntax-semantics interface that make essential contributions to meaning and are considered [+interpretable] features (Tsimpli & Mastropavlou, 2007). However, their use is constrained to specific contexts, for example, past actions (aspect contrasts), topicalization (the passive), and negative commands (the subjunctive), among others. Finally, the passive and the subjunctive are more frequently used in written language and formal registers and their acquisition is likely to be influenced by factors such as education and literacy level.

Agreement structures, on the other hand, are formal [-interpretable] non-salient features with a very high frequency of occurrence.
Grammatical agreement is also characterized by the type of conditional (or transitional) probabilities that govern statistical learning, since it involves co-occurrence patterns within utterances and transitional probability, i.e., the probability of one event given the occurrence of another event (statistical regularity). For example, in the case of Spanish gender agreement, there are forward conditional probabilities of word-final phonemes -a and -o, given the feminine and masculine determiners la and el (.77 for word-final -a given la, and .56 for word-final -o given el) (Lindsey & Gerken, 2011). Infants and young children are extremely sensitive and finely tuned to such distributional patterns in the input and learn them implicitly, as evidenced by the fact that Spanish children have acquired agreement structures with almost 100% accuracy by age 3.

However, there is no consensus as to whether these learning mechanisms are still available to adults and, if so, under which circumstances they operate, and for what type of language features they can do so efficiently. The Fundamental Difference Hypothesis (Bley-Vroman, 1990) states that the implicit learning mechanisms that operate in child language learning are no longer efficient in adult language learning and that domain-general problem-solving mechanisms are used instead, a position supported by DeKeyser (2000), who also predicted that adults would need high verbal analytic ability to succeed in L2 learning. Meisel (2009) further claimed that the fundamental differences in learning mechanisms between child and adult acquisition may already emerge in early childhood, earlier than the critical age range hypothesized by Bley-Vroman (1990) or DeKeyser (2000) (i.e., end of teens), and even Lenneberg (1967) (i.e., at puberty), and only for certain grammatical properties.
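As an illustration of the forward conditional probabilities mentioned above for Spanish determiner-noun gender agreement, the following minimal Python sketch estimates P(word-final segment | determiner) from a list of determiner-noun pairs. The toy phrase list and the function name forward_transitional_probability are hypothetical and chosen only for illustration. They are not the corpus or the procedure used by Lindsey and Gerken (2011), and the resulting values are not expected to match the .77 and .56 figures reported there. As a simplifying assumption, the final orthographic letter stands in for the word-final phoneme.

```python
def forward_transitional_probability(pairs, cue, outcome):
    """Estimate P(outcome | cue) from (cue, outcome) pairs.

    The estimate is the number of pairs showing both the cue and the
    outcome, divided by the number of pairs showing the cue.
    """
    cue_count = sum(1 for c, _ in pairs if c == cue)
    if cue_count == 0:
        raise ValueError(f"cue {cue!r} never occurs in the data")
    joint_count = sum(1 for c, o in pairs if c == cue and o == outcome)
    return joint_count / cue_count


# Hypothetical toy input: determiner + noun phrases (not real corpus data).
toy_phrases = [
    ("la", "casa"), ("la", "mesa"), ("la", "mano"), ("la", "flor"),
    ("el", "libro"), ("el", "perro"), ("el", "mapa"), ("el", "reloj"),
]

# Reduce each noun to its final letter (assumed proxy for the final phoneme).
pairs = [(determiner, noun[-1]) for determiner, noun in toy_phrases]

p_a_given_la = forward_transitional_probability(pairs, "la", "a")
p_o_given_el = forward_transitional_probability(pairs, "el", "o")

print(f"P(-a | la) = {p_a_given_la:.2f}")
print(f"P(-o | el) = {p_o_given_el:.2f}")
```

In this toy list, exceptions such as la mano and el mapa are what keep the probabilities below 1, which is the sense in which agreement cues are probabilistic rather than categorical.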
On the other hand, there is evidence from experimental settings that adults are sensitive to distributional patterns in non-linguistic input and that they can learn tone, noise, and visual sequences implicitly (Creel, Newport, & Aslin, 2004; Gebhart, Newport, & Aslin, 2009; Hunt & Aslin, 2010; Saffran et al., 1999). The results of the probabilistic serial reaction time task used in the present study lend support to this body of findings, as well. Such findings have led some researchers (Kaufman et al., 2010; Woltz, 2003) to conceptualize implicit learning as an ability, "the ability to automatically and implicitly detect complex and noisy regularities in the environment" (Kaufman et al., 2010, p. 321). This ability is characterized by automatic, associative, nonconscious, and unintentional learning processes. Contrary to Reber (1993), who views individual differences in implicit cognition as minimal relative to individual differences in explicit cognition (because implicit learning is evolutionarily older than explicit cognition), these researchers claim that implicit learning is a cognitive ability with meaningful individual differences. This implies that implicit learning can be significantly related to other cognitive abilities and/or language acquisition outcomes.

Adults are also sensitive to probabilities in linguistic input, as evidenced by the fact that they can compute how consistently sounds co-occur, and how frequently words occur online, and use this probabilistic information to acquire simple syntactic structure in miniature languages (Aslin & Newport, in press). The same learning mechanisms could be at work in the acquisition of inflectional morphology (e.g., noun-adjective gender agreement or subject-verb agreement).
This area of grammar is, in fact, a good candidate for implicit language learning, known to work through the slow accumulation of instances of input data (DeKeyser, 2003), especially in the type of immersion setting investigated, which is characterized by long-term exposure to large quantities of input.

If implicit learning mechanisms are affected in very early developmental phases, as suggested by Meisel (2009), and, as a result, become only partially available, an effect of language aptitude that could compensate for maturational changes should be observed not only in late, but also in early L2 learners, as found in this dissertation research. Some areas of grammar would be especially affected by this reduced capacity for implicit language learning, and it seems that inflectional morphology could be one of them, at least for language pairings with very different inflectional paradigms (e.g., Chinese-Spanish). These are non-salient, [-interpretable] features, which are highly frequent in the input, but known to cause persistent difficulty in adult L2 acquisition. In the present study, even early L2 learners with AOs 3-6 performed significantly worse than NSs on gender and subject-verb agreement (whereas they did not differ from NSs on the subjunctive, which is a non-agreement structure).

The fact that there may be a type of, apparently highly selective, cognitive aptitude that is advantageous for the acquisition of such non-salient features and that can compensate for partial loss of the implicit language learning capacity could explain why it is possible for some early and late L2 learners to attain higher levels of L2 competence than others (i.e., inter-individual variation at a within-subjects level).
Perhaps this variation is a reflection of those L2 learners who were more able to keep relying on implicit learning mechanisms, despite other available mechanisms, in which case aptitude for implicit learning would mean the same as degree of implicit learning capacity.

The fact that aptitude for implicit learning predicted sensitivity towards grammatical agreement violations in both early and late L2 learners suggests some degree of similarity in early and late learners' language learning mechanisms. Following DeKeyser's (2000) hypothesis that relationships between individual differences in language aptitude and eventual learning outcomes potentially constitute evidence for differences in underlying learning processes, one could argue that those adults learning an L2 in a naturalistic (i.e., immersion) environment can also acquire certain features of the L2 implicitly, as indirectly indicated by the fact that those adults with higher implicit language aptitude showed greater sensitivity towards grammatical agreement violations. It should be noted that, despite any potential similarities between early and late L2 learners' learning mechanisms, success rate (i.e., the ability to perform in near-native-like fashion) was still greater in the case of early L2 learners, suggesting that, if implicit language learning mechanisms remain partially available, they are less available to adult L2 learners or, alternatively, cognitive aptitudes cannot compensate for maturational effects as effectively in adulthood as they can in early childhood.

The potential role of implicit learning in eventual L2 outcomes by adult learners in an immersion setting is consistent with the findings of experimental studies that have focused on adult learners' implicit learning of semi-artificial grammars (Rebuschat, 2008; Rebuschat & Williams, 2006, 2009; Williams, 1999, 2005).
These studies typically show 65% accuracy in implicit learning groups, versus chance performance in control groups. A challenge that any study claiming implicit learning processes has to face, however, is the fact that evidence is based on learning outcomes (i.e., acquired knowledge) and such outcomes can be the result of implicit learning, explicit learning, or a combination of both. Evidence of implicit learning can only be indirectly established by measuring the extent to which participants are aware of the acquired knowledge, in semi-artificial grammar learning studies, or by establishing a relationship between learning outcomes and cognitive aptitudes that are more relevant for either explicit or implicit learning, as suggested by DeKeyser (2000) and as investigated in the present study. Even the existence of verbalizable knowledge would not necessarily imply that learning did not happen implicitly, since implicitly acquired language knowledge (e.g., one's native language) can become verbalizable to a lesser or greater extent.

The last set of hypotheses in this study predicted no relationships between cognitive aptitudes and NSs' attainment on tasks that allow controlled use of language knowledge (Hypotheses 3a, 3b, and 3c) and tasks that require automatic use of language knowledge (Hypotheses 6a, 6b, and 6c). All these hypotheses were borne out by the data. Therefore, according to these results, NSs' linguistic competence can be considered independent of NSs' cognitive abilities. However, this study noted two factors that typically preclude finding such significant relationships in NS control groups: an N size smaller than that of the target groups, and inter-individual homogeneity, with performance usually close to ceiling. One cannot discount the possibility that larger N sizes could show that cognitive aptitudes play a role in NSs' attainment on certain L2 measures, as some findings by Abrahamsson (p.c.) indicate.
The type of language ability measured by the tests in question would have to be established. The prediction would be that no relationships would be observed in L2 measures that tap automatic use of language knowledge in tasks that do not carry any additional processing load.

The results of the present study can only speak to NSs' ultimate attainment. Cognitive aptitudes may still be a factor in rate of L1 acquisition, where inter-individual variation is probably greater than in ultimate attainment, as suggested by Skehan's (1990) findings in the Bristol Language Project (Wells, 1981, 1985). Skehan reported a number of significant correlations between language aptitude at age 13 and measures of acquisition derived from the children's speech when they were 42 months, and he argued that aptitude was a factor in the development of language competence in NSs. However, the significant relationships between aptitude and the biographical variables in the study make the role of environmental factors difficult to disentangle. Specifically, factors such as family background, parents' level of education, and parents' interest in literacy were significantly related to scores on aptitude measures, such as a verbal intelligence test and a grammatical sensitivity test, which, in turn, were related to linguistic indices, such as mean length of utterance and range of adjectives and determiners. Perhaps not surprisingly, only one of the aptitude measures, a sound discrimination test, was unrelated to biographical factors. This subcomponent of aptitude correlated with two of the comprehension indices in the study and with one of the vocabulary indices, suggesting a distinct dimension of aptitude in L1 acquisition.

6.4 Summary of Research Findings

As a summary of research findings, Table 33 displays the relationships that were predicted between aptitudes, general intelligence, and ultimate L2 attainment, as well as those relationships that were supported (✓) or unsupported (✗)
by the data. Sixteen of the 18 expected relationships were either confirmed or partially confirmed. Partial confirmation should be understood as indicating that the predicted relationship held in at least one of the analyses (either main or follow-up analyses). Thus, it could be a significant relationship for overall scores (grammatical and ungrammatical), for scores on ungrammatical items only, for scores on agreement structures, or on non-agreement structures.

Table 33. Summary of the Study Predictions and Findings

                        Automatic L2 Use                 Controlled L2 Use
                        Early AO   Late AO   Control     Early AO   Late AO   Control
General Intelligence    No?        No?       No?         No?        Yes?      No?
Explicit Aptitude       No?        No?       No?         No?        Yes?      No?
Implicit Aptitude       Yes?       Yes?      No?         Yes?       No?       No?

Note. A check mark (✓) stands for confirmed (or partially confirmed) and a cross mark (✗) stands for refuted.

6.5 Conclusions and Directions for Further Research

The present study adds to the body of literature suggesting that different types of cognitive aptitudes have differential effects on long-term L2 outcomes. A broad distinction was made between explicit and implicit language aptitudes in an attempt to address the main limitation of conventional language aptitude measures, which have been heavily weighted in favor of explicit processes.

Explicit language aptitude had an effect on L2 outcome measures that were untimed and that focused on language forms and language correctness. There was no evidence of any advantageous effects of this type of aptitude on language attainment, if the word monitoring task is taken as the most representative measure of implicit linguistic knowledge used in this study. The word monitoring task, however, is a psycholinguistic task that relies on reaction-time data, and this can be regarded as a limitation, since claims about integrated L2 knowledge are only indirectly established.
Future research should investigate other L2 outcome measures to validate these findings, especially outcome measures that do not call for the use of the same analytic and/or metalinguistic abilities that also characterize explicit language aptitude measures.

Whereas explicit language aptitude had an effect on L2 outcome measures that were untimed and that focused on language forms and language correctness, implicit language aptitude had an effect on L2 learners' sensitivity to violations of grammatical agreement in the word monitoring task, which is online and has a meaning focus. The most relevant finding in this regard was the fact that implicit language aptitude moderated not only early L2 learners', but also adult learners', sensitivity to gender, person, and number agreement. This finding is convergent with claims that implicit learning is crucial to language acquisition (e.g., N. Ellis, 1994) and with findings showing positive associations between measures of implicit learning and language acquisition (e.g., Gebauer & Mackintosh, 2012). Further research should investigate other implicit language aptitudes, such as priming, in order to evaluate the possible effects of implicit induction (i.e., acquisition of patterns without awareness) on L2 outcomes.

A limitation of the probabilistic SRT task used to measure implicit learning was its low reliability. Although the low reliability index was considered standard compared to previous studies in the literature (Dienes, 1992; Kaufman et al., 2010; Reber et al., 1991), it means that the assessment of implicit learning was less than optimal. Despite the noise in the data, there were significant relationships with L2 attainment. Previous studies have also shown significant correlations between implicit learning and complex cognition (Pretz et al., 2010). However, more reliable measures could show an even more prominent role of aptitude for implicit learning in acquisition.
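The reasoning that low reliability can mask a stronger underlying relationship can be made concrete with the classical (Spearman) correction for attenuation, in which an observed correlation is divided by the square root of the product of the two measures' reliabilities. The sketch below is illustrative only: the function name and all numerical values are hypothetical and are not taken from the present study's data or analyses.

```python
import math


def disattenuate(r_observed, reliability_x, reliability_y):
    """Classical correction for attenuation.

    Returns r_observed / sqrt(reliability_x * reliability_y), an estimate
    of the correlation that would be observed with perfectly reliable
    measures.
    """
    for reliability in (reliability_x, reliability_y):
        if not 0 < reliability <= 1:
            raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)


# Hypothetical values, for illustration only (not the study's data).
observed_r = 0.27            # observed aptitude-attainment correlation
reliability_aptitude = 0.36  # a low-reliability aptitude measure
reliability_outcome = 0.81   # a more reliable L2 outcome measure

corrected_r = disattenuate(observed_r, reliability_aptitude, reliability_outcome)
print(f"observed r = {observed_r:.2f}, disattenuated r = {corrected_r:.2f}")
```

With these made-up inputs the corrected estimate is noticeably larger than the observed correlation, which is the pattern the paragraph above anticipates for a more reliable measure of implicit learning. Corrected estimates of this kind are usually treated with caution, since they depend heavily on the reliability values supplied.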
Studies should also explore the extent to which aptitude for implicit learning is efficient and/or effective in instructed contexts that typically lack the massive input exposure that characterizes immersion settings, as well as the effects of aptitude for implicit learning on spontaneous language production tasks. As anecdotal evidence for this undertaking, five of the twelve adult learners classified as having high aptitude for implicit learning only (and either mid or low aptitude for explicit language learning) were also among the highest scorers on the oral interview used as an informal screening procedure for the study.

Finally, it would be very informative to investigate aptitude profiles in aptitude-treatment interaction studies. The following four aptitude profiles were observed in the present study: [high implicit, high explicit], [high implicit, low explicit], [low implicit, high explicit], and [low implicit, low explicit]. In the adult learner group (n = 50), 24% of the participants (n = 12) were high in implicit aptitude only. This percentage increased to 36% if adults high in implicit aptitude and high in explicit aptitude were considered (n = 18). On the other hand, only 14% were high in explicit aptitude only (n = 7) and 10% were low in both types of aptitude (n = 5). It would be interesting to investigate these different profiles in other populations of very advanced adult L2 learners in either immersion or instructed language contexts.

Appendix A

Biographical Questionnaire

CUESTIONARIO DE DATOS PERSONALES

1. Nombre y apellido: ______________________________________________
2. Sexo: Hombre ___________________ Mujer_________________
3. Edad: ____________
4. Correo electrónico: _________________@____________
5. Teléfono de contacto: _____________________________
6. Estudios realizados: _______________________________
7. Profesión actual: _________________________________
8. ¿Tienes alguna disminución o problema de tipo visual y/o auditivo?
Por favor, especifica. ________________________________________________________________
9. Lengua dominante (lengua en la que te sientes más cómodo hablando): _____________
10. Otras lenguas por orden de dominio (de más a menos dominio):
    + dominio                                        - dominio
    ____________ ___________ ____________ ___________
11. ¿Es el español la lengua materna de tu padre o madre? Por favor, especifica. _________________________________________
12. ¿Sabrías decir qué lengua se hablaba en tu casa cuando eras pequeño/a (antes de que fueses a la guardería)? __________________
    Si se hablaban varias lenguas, indica por favor un porcentaje:
    Lengua 1: _________________ ____%
    Lengua 2: _________________ ____%
    Lengua 3: _________________ ____%
13. ¿Qué lengua hablas actualmente en tu casa? _____________________
    Si se hablan varias lenguas, indica por favor un porcentaje:
    Lengua 1: _________________ ____%
    Lengua 2: _________________ ____%
    Lengua 3: _________________ ____%
14. ¿Puedes hacer un ránking de las lenguas que utilizas en un día normal de la que más utilizas a la que menos utilizas? Por favor, especifica un porcentaje aproximado de uso diario:
    Lengua 1: ____________________% aproximado de uso diario
    Lengua 2: ____________________% aproximado de uso diario
    Lengua 3: ____________________% aproximado de uso diario
15. ¿Cuál es tu nivel aproximado de Chino Mandarin (o lengua china hablada en tu casa)?
    Básico
    Intermedio
    Avanzado
    Casi nativo
    Nativo
    Sólo hablado
    Hablado y escrito
    Por favor, especifica en caso necesario: ____________________________________________________________
16. Elige los contextos en los que normalmente siempre utilizas el español:
    Contextos Formales:
    En el trabajo
    En la universidad
    Para hacer gestiones
    Otros: _______________
    Contextos Informales:
    En casa
    Con los amigos y conocidos
    Con familiares
    En Internet
    Para ver la televisión
    Para escuchar la radio
    Para leer el periódico
    Otros: ________________
17.
¿Cuántas horas por semana utilizas el español para…?:

                              1-2hrs   2-5hrs   6-10hrs   Más de 10hrs
    Trabajar                  _____    _____    ______    _____
    Hablar con familiares     _____    _____    ______    _____
    Hablar con amigos         _____    _____    ______    _____
    Leer libros, periódicos   _____    _____    ______    _____
    Ver TV, películas         _____    _____    ______    _____
    Internet                  _____    _____    ______    _____

18. ¿Hasta qué punto te identificas con la cultura española? Por favor, haz un círculo sobre el número correspondiente: 5 significa que SÍ te identificas totalmente (te sientes español) y 1 significa que NO te identificas con la cultura española en absoluto (no te sientes español):
    + Identificación   5   4   3   2   1   - Identificación
19. ¿Convives o has convivido con hablantes nativos de español? ¿Durante cuánto tiempo? ______________________________________________________________
20. ¿A qué edad llegaste a España por primera vez? (si has nacido en España escribe "nacido en España") ______________________________________________________________
21. ¿Aprendiste español antes de llegar a España? (si has nacido en España, por favor ignora la pregunta) _________________________
22. ¿A qué edad comenzaste a aprender español? _______________________________
23. ¿Dónde comenzaste a aprender el español?
    En un contexto de clase (curso de español) en país de origen
    En un contexto de clase (curso de español) en España
    De manera espontánea en España, hablando con los que me rodean
24. ¿Cuántos años de cursos de idiomas de español has hecho? ____________________
25. ¿Has recibido educación en España (guardería, primaria, secundaria, estudios universitarios)? Por favor, especifica. ______________________________________________________________
26. ¿Cuántos años llevas viviendo en España? _________________________________
27. ¿Han sido años seguidos o has pasado temporadas en el extranjero? _____________
28. ¿En qué poblaciones de España has vivido y cuánto tiempo en cada una de ellas?
______________________________________________________________
29. ¿Has estado en algún otro país de habla española?
    País(es): ___________________________
    Tiempo de estancia: ___________________________
    Edad de llegada: ___________________________
30. ¿Crees que cuando hablas en español pareces nativo?
    Totalmente de acuerdo
    Bastante de acuerdo
    De vez en cuando pero no siempre
    Totalmente en desacuerdo
31. ¿Estás satisfecho con tu pronunciación del español?
    Muy satisfecho
    Bastante satisfecho
    No muy satisfecho
    Totalmente insatisfecho
32. ¿Es importante para ti pasar por hablante nativo de español?
    Totalmente de acuerdo
    Es importante pero no esencial para mi
    No es muy importante
    No me importa

¡GRACIAS!

Appendix B

Item Pool

1. Noun-Adjective Gender Agreement

1. *La actriz que ganó el premio fue aplaudido calurosamente por el público
2. *Finalmente, la película no fue tan aburrido como pensábamos
3. *Los sistemas de iluminación en Europa son más innovadoras que en España
4. *Los terrenos que son demasiado húmedas tienen muchos inconvenientes
5. *Este libro resulta muy apropiada para lectores de todas las edades
6. *Esta criatura anda siempre despitado por culpa de sus compañeros
7. *Dicen que las fotos de mariposas son muy complicados de conseguir
8. *El reloj de la pared va atrasada siete minutos exactos
9. *En Méjico, la cerveza se ha de servir bien frío y con limón
10. *Dicen que las hijas de Miguel son muy trabajadores y serviciales
11. *Estoy de acuerdo que el piano de los abuelos es demasiado antigua para nosotros
12. *Según los expertos, la miel más saludable es oscuro de color y suave de textura
13. *La identidad del acusado permaneció oculto hasta el final del juicio
14. *Las manos de dedos largos son delicados y elegantes
15. *Mi compañera de piso está muy nervioso por los exámenes de mañana
16. *Mi madre se enfadó porque mi habitación estaba sucio y sin barrer
17. *La calle que lleva al centro estaba abarrotado de gente y coches por todos lados
18.
*En este restaurante, el menú del día sale bastante cara pero vale la pena
19. *El cultivo del maíz es apta en cualquier superficie para la agricultura del planeta
20. *La torre de Pisa está cada vez más inclinado y con menos columnas
21. *Los edificios de la universidad están todos muy bien equipadas con tecnología punta
22. *El vuelo a Madrid fue muy larga pero agradable gracias a las atenciones de las azafatas
23. *Me gusta el suelo porque está bien acabada con materiales de alta calidad
24. *La llegada a Madrid fue mucho mas agotador de lo previsto por la organización
25. *Tus mensajes están guardadas en un archivo en el escritorio del ordenador
26. *La víctima del accidente fue atendido inmediatamente por los servicios de urgencias
27. *Las empleadas de esta empresa son más hábiles y trabajadores que en mi antigua empresa
28. *La habilidad y maestría del pintor son asombrosos e inigualables sin duda alguna
29. *La guardería infantil está atendido eficazmente por su propietaria
30. Cada una de las empresas de nuestro sector está dispuesta a compartir información
31. Hay muchas personas que se levantan cansadas diariamente porque no pueden dormir bien
32. Las letras del alfabeto del castellano son veintiuna sin contar la ll y la ñ
33. La ropa sólo está medio seca porque hoy no ha hecho nada de sol
34. Cualquier chiste puede ser aburrido si no se cuenta con gracia y estilo
35. Marte es el planeta visible más próximo a la tierra
36. En un futuro, especies como el águila estarán más protegidas para evitar su extinción
37. La tensión entre los países implicados es demasiado alta para conseguir un acuerdo
38. Cualquier región del sur del Canadá es más cálida que Suecia
39. Cualquier volcán puede resultar peligroso cuando entra en erupción
40. Cualquier corriente de aire puede resultar molesta cuando se practica el esquí
41. En algunos países, los rostros de las mujeres quedan ocultos detrás de un velo
42.
El crecimiento de la población española ha sido contínuo desde 1975
43. Las editoriales de libros antiguos se mantienen ajenas a la tecnología
44. Cualquier peaje de autopista debe ser aprobado unánimemente por el congreso
45. La superfície disponible para construir estará regulada este año por el gobierno
46. El paisaje del sur de España es mucho más árido que el del norte
47. La provisión de energía está garantizada mundialmente bajo cualquier ciscunstancia
48. En los últimos días el precio de la carne está más caro que el precio del pescado
49. La red de transporte público no es satisfactorio para los ciudadanos de Madrid
50. En este restaurante, la relación calidad-precio es de las mejores de la ciudad
51. La lección de piano de ayer no fue tan buena como otros días
52. Las cajas y bolsas que están vacías servirán para la mudanza del viernes
53. Los armarios y sillas del abuelo están muy nuevos para tener tantos años
54. La moto esta recién pintada de azul metalizado con toques de color dorado
55. Los atardeceres en Grecia son muy luminosos y alegres
56. El mapa de pared parecía demasiado pequeño para nuestra habitación
57. Proporcionaremos toda la información que sea necesaria gradualmente y sin prisas
58. La soledad es más llamativa en los ancianos que viven solos
59. El motivo de la queja tiene que estar relacionado concretamente con el consumo de gas
60. Flores que sean así de perfumadas durante tanto tiempo no se encuentran fácilmente

1. Subject-Verb Number Agreement

1. *Ayer por la noche dos ladrones le intentó robar el bolso a mi abuela
2. *El estudiante pidió a los profesores que le dejara salir antes para ir al medico
3. *Tu opinión y tu actitud convenció finalmente al director de la escuela
4. *El fallo de esa empresa es que las decisiones las toman mucha gente
5. *En la próxima reunión se ampliará con más detalles las causas de la crisis
6. *Los actores se dirigió rápidamente al escenario para recoger su premio
7.
*El color de las flores cambian según la estación del año y el tiempo
8. *El derecho de los trabajadores al descanso no lo respeta los empresarios
9. *Los vendedores del mercado de mi pueblo prefiere recibir dinero en efectivo
10. *Los jóvenes en los colegios de este país sabe muy poca geografía
11. *Los árboles del parque pierde completamente sus hojas cuando llega el otoño
12. *El chico en mi clase de matemáticas interrumpen constantemente a la profesora
13. *Los bares cerca del campus sirve cervezas mejicanas y europeas
14. *Los tíos de mi amiga insistió en pagarle el alquiler este mes
15. *Los jugadores de fútbol americano juega los domingos y los lunes
16. *A Manuel se le cayó todas las tarjetas de crédito al suelo
17. *Los padres de Ramón le hizo soplar las velas el día de su cumpleaños
18. *Los policías fue a buscar la pelota de fútbol que cayó a la calle
19. *Los guardaespaldas no deja pasar a nadie que no lleve zapatos de vestir
20. *Este año los Reyes Magos le trajo carbón a Óscar por su mal comportamiento
21. *Los padres de Isabel la puso a dormir a las ocho de la noche como cada día
22. *A los estudiantes que desafinan los profesores les da clases extras de canto
23. *A Emilio siempre se le acaba las palomitas antes de empezar la película
24. *A los asistentes se les cayó las lágrimas al oír el discurso del rey
25. *Los ciudadanos se queja de las largas listas de espera en los hospitales públicos
26. *Él y tú conoces los inconvenientes de viajar en avión con mascotas
27. *Comer y correr a la vez tienen consecuencias negativas para el organismo
28. *Al final los problemas de Miguel se resolvió a través de la justicia
29. *Los entrenadores le dió la enhorabuena al equipo campeón de la final
30. *Chema y tú bailas siempre hasta el amanecer cada fin de semana
31. *Las patatas junto con la cebolla y el ajo picados se ha de freír durante una hora a fuego lento
32. El oxígeno y el hidrógeno los proporciona el medio ambiente en cantidades iguales
33.
La niña y tú cobraréis mil euros de indemnización por el accidente
34. Se permite la entrada de camiones en horas de poco tráfico
35. El día de la inauguración, vinieron el alcalde y el regidor para celebrarlo
36. Mis viejos amigos me reconocieron inmediatamente nada más salir por la puerta
37. Finalmente, se unieron a la expedición alpinistas alemanes con muy poca experiencia
38. En mi clase, bastantes alumnos ya saben inglés y alemán de negocios
39. Mis padres me han comprado unos pantalones vaqueros y una bufanda negra
40. Muchas organizaciones se especializan en ayudar a las víctimas del terrorismo
41. Los jubilados de la plaza observaban atentamente las obras de restauración del ayuntamiento
42. En muchos transportes públicos, se admiten niños menores de tres años de forma gratuita
43. A los chicos no les gustaron los dibujos animados que daban por la tele
44. Los pasajeros del transatlántico desembarcaron ayer en el puerto principal de Atenas
45. Mis colegas del trabajo se creen más inteligentes que yo
46. Los vecinos del cuarto dejaron de saludarse después de la disputa por las obras
47. Los manifestantes han vuelto a ocupar las calles para reclamar justicia
48. Paco se marchó pero los demás prefirieron quedarse hasta el final del concierto
49. Los regalos te los traerá el cartero mañana por la mañana sin falta
50. A Juan le gastaron la broma sus nuevos compañeros de oficina
51. Los alumnos de cuarto curso escriben redacciones sobre temas de actualidad
52. El pastel se lo comieron con gusto los invitados a la cena de gala
53. Los ríos de España se desbordan continuamente por la excesiva cantidad de agua que reciben
54. Los autores de esta obra merecen gratitud y reconocimiento por parte de todos
55. Muchas personas se ven afectadas por la gripe cada año por no querer vacunarse
56. Este jersey me lo regalaron compañeros de la facultad por mi cumpleaños
57. El equipo de búsqueda se dispersó por toda la zona del incendio para buscar a los desaparecidos
58.
A mí siempre me llamaron la atención esos niños tan espabilados
59. Por esta razón no son recomendable baños de sauna para perder peso
60. *En el periódico se publicaron todos los artículos escritos por Miguel Delibes

2. Noun-Adjective Number Agreement

1. *Mis amigos pidieron botas prestada a sus vecinos para ir a esquiar
2. *Hay mucho más libros antiguos en esta biblioteca que en el museo
3. *Los niños que son así de espabilado siempre obtienen muy buenos resultados
4. *El vino se mete luego en unos barriles similar a los que se utilizan para el ron
5. *Todos los métodos son válido para atravesar el río y llegar al otro lado
6. *Los problemas de suministro están muy unido a la falta de recursos
7. *Debemos aprender a ser mejor cada día para poder alcanzar nuestros objetivos
8. *Un viento y una lluvia nunca visto antes afectaron toda la zona del norte
9. *Mi hermano les esperó entusiasmados para darles la bienvenida
10. *Los gatos de Pablo son todavía demasiado pequeño para ir de viaje
11. *La huelga general mantiene paralizado a bastantes transportes públicos
12. *Se ha presentado un número de pruebas bastante elevados en contra del acusado
13. *Siempre se viste con trajes oscuro perfectamente cortados a su medida
14. *Hay bastante más latinos en países como Estados Unidos que en España
15. *África tiene poco recursos naturales y mucha población necesitada
16. *Todo el mundo sabe que hay gatos que son mejor cazadores que otros
17. *La tormenta ha dejado incomunicado a varios pueblos del sur de España
18. *Las gaitas son instrumentos de viento parecido a las flautas y a las trompetas
19. *Le gusta hacer las cosas de una manera distintas al resto de los mortales
20. *En este momento ignoramos cuál van a ser las consecuencias de una crisis nuclear
21. *Tenemos suficiente candidatos en Europa para garantizar la continuidad
22. *El centro de mesa estaba hecho con manzanas recién cogida del árbol
23.
*Pablo siempre lleva los pelos del bigote enredado y sin cuidar
24. *Ayer conocí a tus futura esposa y suegra casualmente en la calle
25. *Cada año ciento de pájaros emigran hacia el sur en busca del calor
26. *Determinados usuarios se pasan de listo intentando conseguir servicios gratis
27. *Los cambios en materia de educación han sido mínimo este año por culpa de la crisis
28. *Los coches procedente de Europa tendrán asientos más amplios y cómodos
29. *Los servicios de transporte serán más económico respecto al año pasado
30. *En la subasta se vendieron artículos por importes muy superior a los mil euros
31. Juan y María estaban muy felices bebiendo champagne y brindando por su relación
32. Los cuatro gatos pasaban mucho tiempo juntos jugando y cazando ratones
33. Todos los invitados iban vestidos para la ocasión con chaqueta y corbata
34. La diferencia entre su estilo musical y cualquier otro estilo es el ritmo y la melodía
35. Las autoridades soviéticas no avisaron a los países europeos del peligro
36. Todo el festival costó seis millones de euros más de lo previsto por la organización
37. Martín explica que a los tres días de matrimonio Ángela lo dejó por otro
38. Los estudiantes de hoy en día están llenos de deudas a largo plazo
39. Andalucía cobra los precios de alquiler más bajos de toda España
40. Los aficionados volvieron a sus hogares decepcionados tras la victoria del equipo contrario
41. Los productos que son originarios de la India siempre tienen más demanda
42. Cada vez estamos más influidos políticamente por los medios de comunicación
43. Necesitamos una ley que regule la exportación de determinados artículos de consumo al extranjero
44. La normativa de la universidad fue redactada por los anteriores consejeros hace más de diez años
45. Por regla general, los climas del sur son más suaves que los del norte
46. Han sido liberados todos los periodistas, incluidos los dos de nuestra agencia
47. Me compré
unos pantalones vaqueros muy bonitos y una bufanda negra
48. El vecino tiene una sobrina y un sobrino cariñosos que le quieren mucho
49. La mayoría de los animales domésticos son muy lentos cuando se ven en peligro
50. España tiene el índice de accidentes más elevado de toda Europa
51. Las relaciones hispano-alemanas se han deteriorado mucho últimamente
52. Cristian tenía siempre las mejillas rosadas porque lo alimentaban muy bien
53. Después de los tres primeros días, Ana demostró que estaba en plena forma física
54. El nombre de Roma nos trae a la mente imágenes de antiguas civilizaciones y ruinas
55. El consumo de cigarrillos es de veinte millones anuales en países como Colombia
56. Mis amigos no son capaces de ocultarme la verdad sobre lo sucedido
57. Cualesquiera que sean las causas del siniestro, la compañía de seguros está obligada a pagar
58. Tengo muchos amigos que estarían dispuestos a colaborar en el proyecto
59. Los votos de los que dispone el candidato son muchos más de los que tiene la oposición
60. Los excursionistas que fueron a escalar llevan desaparecidos más de tres días

3. Subjunctive Mood

1. *Jorge se irá a trabajar en cuanto lo avisan de la oficina
2. *Mi profesor de instituto siempre nos pedía que lleguemos pronto a clase
3. *La escuela exigirá que los alumnos de primer curso hablan el inglés
4. *Tu madre te pide que te portas bien durante la cena de nochevieja
5. *En el pasado, era imprescindible para los agricultores que llueva durante el verano
6. *A los niños siempre les prohibimos que salen solos a la calle
7. *Los expertos sugieren que los ancianos toman calcio y vitaminas cada día
8. *Nos gustó que todo salga bien el día de la boda de Carmen y Roberto
9. *Después de mirar toda la tienda no había nada que le guste a Pilar
10. *Nos impresionó que Silvia apruebe todos los exámenes de primer año
11. *Serviremos los aperitivos cuando vienen los invitados
12.
*Estoy muy contento de que Miguel sigue trabajando para la compañía de seguros
13. *La vecina del quinto siempre nos invita a que entramos para tomar café
14. *No hay que dejar que los niños comen caramelos todos los días
15. *El muro impide que los prisioneros pueden escapar de forma fácil
16. *Nos quedaremos en la oficina hasta que el informe está listo para ser enviado
17. *Marcos renunciará a su puesto de trabajo cuando consigue algo mejor en otra empresa
18. *No me pareció que Antonio tiene buena pronunciación de los idiomas que habla
19. *El conserje siempre nos prohibía que fumamos cigarrillos en los pasillos del edificio
20. *Marta está harta de que el jefe la hace trabajar días festivos y fines de semana
21. *Todos nos sorprendimos mucho de que no estabas presente en la fiesta de Carlos
22. *Ayer pudimos llegar a la estación antes de que salga el tren a Madrid
23. *No saldré de casa mientras no tengo noticias de Rubén y sus amigos
24. *Los organizadores insisten que los asistentes vuelven mañana para devolverles el dinero
25. *Su ambición es que su hijo se convierte en presidente del país
26. *El equipo celebrará la victoria cuando gana la final de esta noche
27. *Ana irá de vacaciones cuando aprueba todas las asignaturas pendientes
28. *El nuevo jugador organizará una fiesta cuando firma oficialmente su contrato con el equipo
29. *Los periodistas darán la noticia cuando lo permite el gobierno
30. *Juan no cree que puede llegar a tiempo de ver el principio de la película
31. Mario no sube a un avión ni aunque le paguen una fortuna
32. Nos fuimos para casa antes de que empezase a llover
33. Ojalá que los bebés durmiesen así de bien durante toda la noche
34. Todo el mundo piensa que es bueno que te cases finalmente con el hijo del alcalde
35. Los pacientes querían que el doctor los atiendese por orden alfabético
36. Mis hermanas dudan que yo recuerde sus cumpleaños
37. No tengo ningún amigo que vaya de vacaciones a Toledo este verano
38. Está
muy bien que le den el premio a Sara por su papel en la obra de teatro
39. Nos podemos quitar los zapatos cuando estemos más cerca del detector de metales
40. Es fácil que Sergio se olvide hoy de llamarme por teléfono
41. Puede ser que tengamos que escalar la roca para cruzar el río
42. Es poco probable que mis padres encuentren hoy un sitio para aparcar
43. Damián no conoce a nadie que haya nacido en un país nórdico
44. Tan pronto como te ajustes el cinturón pondré el coche en marcha
45. Ustedes no pueden salir hasta que alguien pague la cuenta
46. Con tal que no te hagas daño puedes jugar en el parque con tus amigos
47. Los políticos siempre hablan como si lo supiesen todo sobre economía
48. El jurado duda que el acusado diga toda la verdad sobre lo ocurrido
49. La agencia publicitaria busca a una chica que tenga aptitud para las lenguas extranjeras
50. Es indignante que la electricidad sea más cara en España que en cualquier otro país europeo
51. Es increíble que tantos españoles perdiesen familiares en la Guerra Civil
52. Es mejor que no pidamos pollo a la brasa en el nuevo restaurante
53. Es muy extraño que a Gloria le guste gastar tanto en zapatos de vestir
54. Es probable que Ángela viaje pronto a Méjico para conocer a su familia
55. Buscamos un apartamento que esté orientado al Este para aprovechar más el sol
56. Marcelo no va a recibir más ayuda del banco a no ser que pague la deuda
57. Los chicos no van a ver la televisión hasta que no acaben sus deberes
58. Cuando termines de limpiar tu cuarto, iremos al mercado a comprar fruta
59. Es importante que los estudiantes de español practiquen el idioma todos los días
60. Espero que puedan encontrar trabajo mejor pagado en otro sitio

4. Perfective/Imperfective Aspect

1. *En un momento el técnico solucionaba los problemas de conexión a internet
2. *Todos coincidían en que el recién nacido tuvo un cierto parecido con su padre
3.
*Aquella mañana Alfonso compraba el periódico como cada mañana antes de ir a trabajar
4. *Nada más empezar a leer la carta ayer me daba cuenta de la gravedad del asunto
5. *Muy pocos daban con la solución al enigma de la semana pasada en el periódico
6. *De repente, me acordaba del regalo de cumpleaños para Adrián
7. *Por un instante, todos pensábamos que los dos coches iban a chocar
8. *Justo ahora hicimos palomitas para ver la película en la tele
9. *Hacía varios meses que no comí marisco de Galicia de esta calidad
10. *Por un segundo Sonia creía ilusionada que había ganado la lotería de Navidad
11. *Juan pedía tres días de permiso al encargado para ir a visitar a su familia
12. *Mi padre conocía a tu padre aquel día en la fiesta de cumpleaños de Rebeca
13. *Durante mi infancia, iba dos años a una escuela privada de monjas en Valladolid
14. *Conozco a una mujer que estaba mucho tiempo en Argentina antes de volver a España
15. *En las últimas vacaciones de verano podía descansar más de lo habitual en mi
16. *Durante varios años estudiaba inglés a distancia para mejorar mi currículum
17. *En mi antiguo trabajo salí puntualmente de mi oficina en el centro de Madrid a las cinco de la tarde
18. *Mis abuelos no fueron felices hasta que vivían cerca de nuestra casa
19. *A Pedro le dolió la cabeza hasta que se tomaba una aspirina
20. *Rodrigo llevó ocho días intentándolo antes de abandonar la competición
21. *Aquella tarde María bailaba rumbas con sus amigos durante horas
22. *Durante el fin de semana Jaime estaba más de cinco horas estudiando para el examen
23. *Aquel día Carlos tuvo pensado jugar durante dos horas en el patio
24. *Apenas el presidente acababa el discurso, alguien le disparó desde una terraza
25. *Durante toda esa mañana, el doctor López visitaba decenas de pacientes con gripe
26. *Los invitados jugaban a cartas hasta que dieron las doce de la noche
27. *A medida que Juan habló de sus problemas, Maite se ponía más nerviosa
28.
*Javier y yo nos conocimos de haber estudiado juntos en la universidad
29. *Por lo menos ayer el abuelo estaba tranquilo por un rato
30. *Cada día Óscar pensó en su novia Carla y su familia
31. Me entusiasmé al conocer la noticia sobre el embarazo de Irene
32. Hasta los veinte años viví siempre en Cádiz con mis padres y mis abuelos
33. Contra todo pronóstico la lluvia cayó toda la tarde sin parar
34. Por aquel entonces siempre cantabas al ducharte por las mañanas
35. Carla fue mucho a la playa hasta que tuvo problemas en la piel
36. De niña fui a Andalucía tres veranos para visitar a mis abuelos maternos
37. He traído bombones para Ester porque la última vez le gustaron mucho
38. Durante las vacaciones, cada mañana Víctor compraba pan y leche para desayunar
39. En Julio del 2000 pasamos dos semanas en el Caribe sin niños ni familiares
40. Pilar conocía a Nacho y a su familia desde hacía más dos años
41. Durante años Ramón estuvo estudiando la carrera equivocada en el extranjero
42. Mi hermano corrió dos veces esta mañana para prepararse para la maratón del viernes
43. Los viernes por la noche Juan siempre miraba películas de detectives
44. Todo el invierno hizo mucho frío en la zona de Cataluña
45. Ayer el tren directo de Barcelona a Bilbao llegó tarde por culpa de la nieve
46. De camino al trabajo, se me ocurrió cómo solucionar el problema
47. Por suerte para nuestros invitados, el bebé se durmió anoche antes de lo esperado
48. Durante ese rato acabé de preparar la cena de bienvenida
49. Los dinosaurios de hace 150 millones de años comían cualquier tipo de planta
50. Este profesor es el que enseñaba matemáticas los jueves en mi instituto
51. A los cinco años, Silvia se quedaba dormida en todas partes
52. Santiago me dijo que salía dentro de poco de su casa
53. En la foto de la entrada Maite tenía quince años recién cumplidos
54. Carlos cumplió catorce años el mismo día que Sonia
55. La reunión no pudo acabar a las dos como estaba previsto y se alargó
más de una hora
56. En la edad de piedra, los seres humanos aprendieron a utilizar la rueda
57. Javier se rompió el brazo de niño a causa de un golpe durante un partido
58. Muchos años después Miguel tuvo noticias de su antigua novia
59. La familia llegó a la iglesia una hora antes de lo previsto
60. Por suerte cada día el bebé se dormía cinco minutos más pronto

5. Passives with Ser/Estar

1. *El nuevo museo de arte estuvo inaugurado oficialmente esta semana
2. *Madrid es situado estratégicamente en el centro de España
3. *En el siglo XV las iglesias estuvieron destruidas completamente en toda Europa
4. *El nuevo empleado es sobradamente cualificado para llevar la contabilidad
5. *Las duras condiciones en la mina están bien sabidas por todos
6. *Últimamente María es muy encariñada con mi madre
7. *El río de la zona afectada por el terremoto es contaminado indefinidamente
8. *Hasta el día de su inauguración, el museo podrá estar visitado gratuitamente
9. *Cazar linces es terminantemente prohibido en países como España y Portugal
10. *La biblioteca estará restaurada gracias a las donaciones de los ciudadanos
11. *El acusado estuvo declarado inocente de todos los delitos cometidos
12. *El niño desaparecido estuvo encontrado caminando tranquilamente cerca de un río
13. *Cada día cientos de delfines están rescatados de entre las redes de los pescadores
14. *La piscina del hotel estará vaciada temporalmente por motivos de limpieza
15. *Shakira está conocida mundialmente por sus ritmos y canciones de amor
16. *Los dos hermanos eran muy unidos hasta que discutieron por la herencia
17. *Después de varias horas de espera el vuelo estuvo cancelado definitivamente hasta nuevo aviso
18. *El proyecto de investigación ha estado aprobado finalmente por el Ministerio
19. *El perro de Sandra estuvo visto por última vez en una zona de bosque
20. *El cuadro de las Meninas estuvo pintado magistralmente por Diego de Velázquez
21.
*El concierto de rock estuvo aplazado al próximo 10 de Junio por culpa de la lluvia
22. *Los ingredientes del pastel nupcial estuvieron seleccionados cuidadosamente por los mejores chefs del mundo
23. *Los terroristas han estado capturados huyendo en un coche robado de la policía
24. *Los ladrones han estado sorprendidos intentando abrir la caja fuerte de una joyería
25. *La constitución española estuvo aprobada unánimemente en el 1978
26. *Con motivo de la boda real, las tiendas han estado cerradas a las 4 de la tarde
27. *La catedral vieja estuvo construida en el siglo XIII por el rey Jaime I el Conquistador
28. *Actualmente no está legal llevar animales domésticos a bordo de los aviones
29. *Muchas de las obras que estuvieron escritas por Cervantes se destruyeron en el siglo XVI
30. *Antes de ganar el premio, la novela ganadora ya había estado leída por miles de españoles
31. El equipo del Valencia dejó de ser invencible en la penúltima jornada de liga
32. La oposición se quejó por los temas que estuvieron ausentes en el discurso del presidente
33. La celebración de Carnaval de este año no va a ser olvidada fácilmente por los madrileños
34. La nueva ley será aprobada por el gobierno a pesar de los muchos votos en contra
35. Al salir al campo de juego, el jugador fue recibido cariñosamente por toda la afición
36. Los paquetes fueron entregados ayer a media tarde por el conserje del edificio
37. La península ibérica está bañada por el Atlántico y el Mediterráneo
38. Tras la amenaza de bomba, los pacientes fueron trasladados inmediatamente a hospitales cercanos
39. Mi último libro ha sido número uno en ventas en varios países europeos
40. Estas cestas están hechas a mano a base de material reciclado
41. El niño estuvo castigado sin salir de casa durante todo el fin de semana
42. La cantante Rosario es muy querida en todos los países de Sudamérica
43. El gol de Messi fue muy celebrado por la afición del estadio
44.
Los periodistas con más experiencia fueron destinados a zonas de conflicto
45. La canción está dedicada a las víctimas de atentados terroristas
46. La ensalada de espárragos de hoy está aliñada con aceite, limón y sal
47. Los delincuentes fueron condenados a dos meses de prisión incondicional
48. Más de mil personas estuvieron afectadas por los cortes de luz durante la tormenta
49. Varias personas fueron heridas a causa del atropello en el centro
50. Las pruebas fueron destruidas antes de que llegara la policía
51. El actor fue nombrado embajador de buena voluntad de las Naciones Unidas
52. El concierto de música clásica de ayer fue suspendido por la lluvia
53. La última película de Almodóvar fue premiada como la mejor película del festival
54. Las normas de juego han de ser cumplidas por todos los jugadores
55. Los montañeros desaparecidos fueron rescatados ayer por la noche
56. El mundo está actualmente gobernado por las grandes corporaciones
57. Cientos de árboles del Amazonas son talados cada año
58. Cada día cientos de animales abandonados son adoptados por familias españolas
59. Cuando hay un accidente, los coches son habitualmente desviados por rutas alternativas
60. El jefe informó de que el trabajo que estuviese acabado para el viernes se pagaría doble

Bibliography

Abrahamsson, N., & Hyltenstam, K. (2008). The robustness of aptitude effects in near-native second language acquisition. Studies in Second Language Acquisition, 30, 481–509.
Abrahamsson, N., & Hyltenstam, K. (2009). Age of onset and nativelikeness in a second language: Listener perception versus linguistic scrutiny. Language Learning, 59, 249–306.
Ackerman, P. L. (1987). Individual differences in skill learning: An integration of psychometric and information processing perspectives. Psychological Bulletin, 102, 3–27.
Ackerman, P. L. (1988). Determinants of individual differences during skill acquisition: Cognitive abilities and information processing.
Journal of Experimental Psychology: General, 117, 288–318.
Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.
Anderson, J. R. (1993). Problem solving and learning. American Psychologist, 48, 35–44.
Aslin, R. N., & Newport, E. L. (in press). Statistical learning: From learning items to generalizing rules. Current Directions in Psychological Science.
Bialystok, E. (1979). Explicit and implicit judgements of L2 grammaticality. Language Learning, 29, 81–103.
Bialystok, E. (1986). Factors in the growth of linguistic awareness. Child Development, 57, 498–510.
Bialystok, E. (1999). Cognitive complexity and attentional control in the bilingual mind. Child Development, 70, 636–644.
Bialystok, E., & Miller, B. (1999). The problem of age in second-language acquisition: Influences from language, structure, and task. Bilingualism: Language and Cognition, 2, 127–145.
Birdsong, D., & Molis, M. (2001). On the evidence for maturational constraints in second-language acquisition. Journal of Memory and Language, 44, 235–249.
Bley-Vroman, R. (1988). The fundamental character of foreign language learning. In W. Rutherford & M. Sharwood Smith (Eds.), Grammar and second language teaching: A book of readings (pp. 133–159). Rowley, MA: Newbury House.
Bley-Vroman, R. (1990). The logical problem of foreign language learning. Linguistic Analysis, 20, 3–49.
Bowles, M. (2011). Measuring implicit and explicit linguistic knowledge. Studies in Second Language Acquisition, 33, 247–271.
Brooks, P. J., Kempe, V., & Sionov, A. (2006). The role of learner and input variables in learning inflectional morphology. Applied Psycholinguistics, 27, 185–209.
Bruhn de Garavito, C., & Valenzuela, E. (2008). Eventive and stative passives in Spanish L2 acquisition: A matter of aspect. Bilingualism: Language and Cognition, 11, 323–336.
Bylund, E., Abrahamsson, N., & Hyltenstam, K. (2010). The role of language aptitude in first language attrition: The case of pre-pubescent attriters. Applied Linguistics, 31, 443–464.
Carroll, J. B. (1962). The prediction of success in intensive foreign language training. In R. Glaser (Ed.), Training, research and education (pp. 87–136). Pittsburgh, PA: University of Pittsburgh Press.
Carroll, J. B. (1964). Language and thought. Englewood Cliffs, NJ: Prentice Hall.
Carroll, J. B. (1973). Implications of aptitude test research and psycholinguistic theory for foreign language teaching. International Journal of Psycholinguistics, 2, 5–14.
Carroll, J. B. (1981). Twenty-five years of research in foreign language aptitude. In K. Diller (Ed.), Individual differences and universals in language learning aptitude (pp. 83–118). Rowley, MA: Newbury House.
Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. Cambridge: Cambridge University Press.
Carroll, J. B., & Sapon, S. (1959). Modern Language Aptitude Test: Form A. New York: Psychological Corporation.
Chaudron, C. (2003). Data collection in SLA research. In C. Doughty & M. Long (Eds.), The handbook of second language acquisition (pp. 762–828). Oxford: Blackwell.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
Collentine, J. (1995). The development of complex syntax and mood-selection abilities by intermediate-level learners of Spanish. Hispania, 78, 122–135.
Creel, S. C., Newport, E. L., & Aslin, R. N. (2004). Distant melodies: Statistical learning of nonadjacent dependencies in tone sequences. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 1119–1130.
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302.
De Graaff, R. (1997). The Esperanto experiment: Effects of explicit instruction on second language acquisition.
Studies in Second Language Acquisition, 19, 249–276.
DeKeyser, R. M. (1995). Learning second language grammar rules: An experiment with a miniature linguistic system. Studies in Second Language Acquisition, 17, 379–410.
DeKeyser, R. M. (2000). The robustness of critical period effects in second language acquisition. Studies in Second Language Acquisition, 22, 499–533.
DeKeyser, R. M. (2001). Automaticity and automatization. In P. Robinson (Ed.), Cognition and second language instruction (pp. 125–151). New York: Cambridge University Press.
DeKeyser, R. M. (2003). Implicit and explicit learning. In C. Doughty & M. Long (Eds.), The handbook of second language acquisition (pp. 313–348). Oxford: Blackwell.
DeKeyser, R. M. (2007). The future of practice. In R. M. DeKeyser (Ed.), Practicing in a second language: Perspectives from applied linguistics and cognitive psychology (pp. 287–304). New York: Cambridge University Press.
DeKeyser, R. M., Alfi-Shabtay, I., & Ravid, D. (2010). Cross-linguistic evidence for the nature of age-effects in second language acquisition. Applied Psycholinguistics, 31, 413–438.
DeKeyser, R. M., & Koeth, J. (2011). Cognitive aptitudes for second language learning. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (Vol. 2, pp. 395–406). London: Routledge.
Destrebecqz, A., & Cleeremans, A. (2001). Can sequence learning be implicit? New evidence with the process dissociation procedure. Psychonomic Bulletin & Review, 8, 343–350.
Dienes, Z. (1992). Connectionist and memory-array models of artificial grammar learning. Cognitive Science, 16, 41–79.
Dörnyei, Z. (2005). The psychology of the language learner: Individual differences in second language acquisition. Mahwah: Lawrence Erlbaum.
Dörnyei, Z., & Skehan, P. (2003). Individual differences in second language learning. In C. J. Doughty & M. H. Long (Eds.), The handbook of second language acquisition (pp. 589–630). Oxford: Blackwell.
Doughty, C., Bunting, M., Campbell, S., Bowles, A., & Haarmann, H. (2007). Development of the High-level Language Aptitude Battery. Technical Report: Center for Advanced Study of Language. University of Maryland, College Park.
Ehrman, M. E., & Oxford, R. L. (1995). Cognition plus: Correlates of language learning success. Modern Language Journal, 79, 67–89.
Ellis, N. C. (Ed.). (1994). Implicit and explicit learning of languages. New York, NY: Academic Press.
Ellis, N. C. (1996). Sequencing in SLA: Phonological memory, chunking, and points of order. Studies in Second Language Acquisition, 18, 91–126.
Ellis, N. C. (2006). Language acquisition as rational contingency learning. Applied Linguistics, 27, 1–24.
Ellis, N. C., & Laporte, N. (1997). Contexts of acquisition: Effects of formal instruction and naturalistic exposure on second language acquisition. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 53–83). Mahwah, NJ: Lawrence Erlbaum.
Ellis, R. (2004). Individual differences in second language learning. In A. Davies & C. Elder (Eds.), The handbook of applied linguistics (pp. 525–551). Oxford: Blackwell.
Ellis, R. (2005). Measuring implicit and explicit knowledge of a second language: A psychometric study. Studies in Second Language Acquisition, 27, 141–172.
Engle, R. W., Tuholski, S. W., Laughlin, J. E., & Conway, A. R. A. (1999). Working memory, short-term memory, and general fluid intelligence: A latent-variable approach. Journal of Experimental Psychology: General, 128, 309–331.
Erlam, R. (2005). Language aptitude and its relationship to instructional effectiveness in second language acquisition. Language Teaching Research, 9, 147–171.
Fodor, J. D., Ni, W., Crain, S., & Shankweiler, D. (1996). Tasks and timing in the perception of linguistic anomaly. Journal of Psycholinguistic Research, 25, 25–57.
Furst, A. J., & Hitch, G. J. (2000).
Separate roles for executive and phonological components of working memory in mental arithmetic. Memory and Cognition, 28, 774–782.
Gardner, R., & Lambert, W. E. (1972). Attitudes and motivation in second language learning. Rowley: Newbury House Publishers.
Gebauer, G. F., & Mackintosh, N. J. (2007). Psychometric intelligence dissociates implicit and explicit learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33, 34–54.
Gebhart, A. L., Newport, E. L., & Aslin, R. N. (2009). Statistical learning of adjacent and non-adjacent dependencies among non-linguistic sounds. Psychonomic Bulletin & Review, 16, 486–490.
Granena, G. (To appear). Reexamining the robustness of language aptitude in SLA. In G. Granena & M. H. Long (Eds.), Sensitive periods, language aptitude, and ultimate L2 attainment. To be published by John Benjamins in 2013.
Granena, G. (To appear). Cognitive aptitudes for L2 learning and the LLAMA aptitude test: What aptitude does LLAMA measure? In G. Granena & M. H. Long (Eds.), Sensitive periods, language aptitude, and ultimate L2 attainment. To be published by John Benjamins in 2013.
Granena, G. (2011a). Reexamining the robustness of aptitude in naturalistic SLA. Paper presented at the American Association for Applied Linguistics, Chicago, IL.
Granena, G. (2011b). Cognitive aptitudes for L2 learning and the LLAMA aptitude test: What aptitude does LLAMA measure? Paper presented at the EUROSLA Annual Conference, Stockholm University, Sweden.
Granena, G., & Long, M. H. (2010, October). Age of onset, length of residence, aptitude and ultimate attainment in two linguistic domains. Paper presented at the Second Language Research Forum Annual Conference, University of Maryland, College Park, MD.
Granfeldt, J., Schlyter, S., & Kihlstedt, M. (2007). French as cL2, 2L1 and L1 in pre-school children. Petites Études Romanes de Lund, 24, 5–42.
Greenfield, P. M. (1998). The cultural evolution of IQ. In U.
Neisser (Ed.), The rising curve (pp. 81–124). Washington, DC: American Psychological Association.
Harley, B., & Hart, D. (1997). Language aptitude and second language proficiency in classroom learners of different starting ages. Studies in Second Language Acquisition, 19, 379–400.
Harley, B., & Hart, D. (2002). Age, aptitude, and second language learning on a bilingual exchange. In P. Robinson (Ed.), Individual differences and instructed language learning (pp. 302–330). Amsterdam: Benjamins.
Hemphill, J. F. (2003). Interpreting the magnitudes of correlation coefficients. American Psychologist, 58, 78–79.
Houser, R. (2008). Counseling and educational research: Evaluation and application. Thousand Oaks, CA: Sage.
Hunt, R. H., & Aslin, R. N. (2010). Category induction via distributional analysis: Evidence from a serial reaction time task. Journal of Memory and Language, 62, 98–112.
Hyltenstam, K., & Abrahamsson, N. (2003). Maturational constraints in SLA. In C. J. Doughty & M. H. Long (Eds.), The handbook of second language acquisition (pp. 539–588). Oxford: Blackwell.
Hyltenstam, K., Bylund, E., Abrahamsson, N., & Park, H.-S. (2009). Dominant language replacement: The case of international adoptees. Bilingualism: Language and Cognition, 12, 121–140.
Ioup, G., Boustagui, E., El Tigi, M., & Moselle, M. (1994). Reexamining the critical period hypothesis: A case study of successful adult SLA in a naturalistic environment. Studies in Second Language Acquisition, 16, 73–98.
Jiang, N. (2004). Morphological insensitivity in second language processing. Applied Psycholinguistics, 25, 603–634.
Jiang, N. (2007). Selective integration of linguistic knowledge in adult second language learning. Language Learning, 57, 1–33.
Jiang, N., Novokshanova, E., Masuda, K., & Wang, X. (2011). Morphological congruency and the acquisition of L2 morphemes. Language Learning, 61, 940–967.
Jiménez, L., & Vázquez, G. (2005).
Sequence learning under dual-task conditions: Alternatives to a resource-based account. Psychological Research, 69, 352–368.
Johnson, J. S., & Newport, E. L. (1989). Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cognitive Psychology, 21, 60–99.
Johnston, M. (1995). Stages of acquisition of Spanish as a second language. Australian Studies in Language Acquisition, 4, 1–28.
Karmiloff-Smith, A. (1979). Micro- and macro-developmental changes in language acquisition and other representation systems. Cognitive Science, 3, 91–118.
Karmiloff-Smith, A., Tyler, L. K., Voice, K., Sims, K., Udwin, O., Howlin, P., & Davies, M. (1998). Linguistic dissociations in Williams syndrome: Evaluating receptive syntax in on-line and off-line tasks. Neuropsychologia, 36, 343–351.
Kaufman, S. B., DeYoung, C. G., Gray, J. R., Jimenez, L., Brown, J., & Mackintosh, N. (2010). Implicit learning as an ability. Cognition, 116, 321–340.
Kaufman, A. S., & Kaufman, N. L. (1990). K-BIT (Kaufman Brief Intelligence Test) manual. Circle Pines, MN: American Guidance Service.
Kempe, V., & Brooks, P. J. (2008). Second language learning of complex inflectional systems. Language Learning, 58, 703–746.
Kempe, V., Brooks, P. J., & Kharkhurin, A. V. (2010). Cognitive predictors of generalization of Russian grammatical gender categories. Language Learning, 60, 127–153.
Keppel, G., & Wickens, T. D. (2004). Design and analysis: A researcher's handbook. Upper Saddle River, NJ: Pearson Prentice Hall.
Kilborn, K., & Moss, H. (1996). Word monitoring. Language and Cognitive Processes, 11, 689–694.
Köpke, B., & Schmid, M. S. (2004). Language attrition: The next phase. In M. S. Schmid, B. Köpke, M. Keijzer, & L. Weilemar (Eds.), First language attrition: Interdisciplinary perspectives on methodological issues (pp. 1–43). Amsterdam: John Benjamins.
Kuperberg, G. R., McGuire, P. K., & David, A. S. (1998).
Reduced sensitivity to linguistic context in schizophrenic thought disorder: Evidence from online monitoring for words in linguistically anomalous sentences. Journal of Abnormal Psychology, 107, 423–434.
Kuperberg, G. R., McGuire, P. K., & David, A. S. (2000). Sensitivity to linguistic anomalies in spoken sentences: A case study approach to understanding thought disorder in schizophrenia. Psychological Medicine, 30, 345–357.
Kyllonen, P. C. (1996). Is working memory capacity Spearman's g? In I. Dennis & P. Tapsfield (Eds.), Human abilities: Their nature and measurement (pp. 49–75). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Kyllonen, P. C., & Christal, R. E. (1990). Reasoning ability is (little more than) working-memory capacity? Intelligence, 14, 389–433.
Lenneberg, E. (1967). Biological foundations of language. New York: Wiley.
Lindsey, B. A., & Gerken, L. (2011). The role of morphophonological regularity in young Spanish-speaking children's production of gendered noun phrases. Journal of Child Language, 1–24.
Loewen, S. (2009). Grammaticality judgment tests and the measurement of implicit and explicit L2 knowledge. In R. Ellis, S. Loewen, C. Elder, R. Erlam, J. Philp, & H. Reinders (Eds.), Implicit and explicit knowledge in second language learning, testing and teaching (pp. 94–112). Bristol, UK: Multilingual Matters.
Long, M. H. (2005). Problems with supposed counter-evidence to the critical period hypothesis. IRAL, 43, 287–317.
Long, M. H. (2007). Problems in SLA. Mahwah, NJ: Erlbaum.
Marslen-Wilson, W. D., & Tyler, L. K. (1980). The temporal structure of spoken language processing. Cognition, 8, 1–71.
Meara, P. (2005). LLAMA language aptitude tests. Swansea, UK: Lognostics.
Meara, P., Milton, J., & Lorenzo-Dus, N. (2003). Swansea language aptitude tests (LAT) v.2.0. Swansea, UK: Lognostics.
Meisel, J. M. (1990). Inflection: Subjects and subject-verb agreement. In J. M.
Meisel (Ed.), Two first languages: Early grammatical development in bilingual children (pp. 237–298). Dordrecht: Foris.
Meisel, J. M. (2009). Second language acquisition in early childhood. Zeitschrift für Sprachwissenschaft, 28, 5–34.
Meisel, J. M. (2011). Bilingual language acquisition and theories of diachronic change: Bilingualism as cause and effect of grammatical change. Bilingualism: Language and Cognition, 14, 121–145.
Misyak, J. B., & Christiansen, M. H. (2012). Statistical learning and language: An individual differences study. Language Learning, 62, 302–331.
Miyake, A., & Friedman, N. (1998). Individual differences in second language proficiency: Working memory as language aptitude. In A. Healy & L. Bourne Jr. (Eds.), Foreign language learning: Psycholinguistic studies on training and retention (pp. 339–364). Mahwah, NJ: Erlbaum.
Montrul, S. (2004). Subject and object expression in Spanish heritage speakers. Bilingualism: Language and Cognition, 7, 125–142.
Montrul, S. (2004). The acquisition of Spanish: Morphosyntactic development in monolingual and bilingual L1 acquisition and adult L2 acquisition. Amsterdam: John Benjamins.
Mullennix, J. W., Sawusch, J. R., & Garrison, L. F. (1992). Automaticity and the detection of speech. Memory and Cognition, 20, 40–50.
Murphy, V. A. (1997). The effect of modality on a grammaticality judgement task. Second Language Research, 13, 34–65.
Naglieri, J. A., & Bardos, A. N. (1997). GAMA Manual. Minneapolis, MN: Pearson.
Nissen, M. J., & Bullemer, P. (1987). Attentional requirements of learning: Evidence from performance measures. Cognitive Psychology, 19, 1–32.
Novoa, L. K., Fein, D., & Obler, L. (1988). Talent in foreign languages: A case study. In L. K. Obler & D. Fein (Eds.), The exceptional brain: Neuropsychology of talent and special abilities (pp. 294–302). New York: Guilford Press.
Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill, Inc.
Oller, J., & Perkins, K.
(1978). A further comment on language proficiency as a source of variance in certain affective measures. Language Learning, 28, 417–423.
Oyama, S. (1978). The sensitive period and comprehension of speech. Working Papers on Bilingualism, 16, 1–17.
Paradis, M. (2009). Declarative and procedural determinants of second languages. Amsterdam: John Benjamins.
Peelle, J. E., Cooke, A., Moore, P., Vesely, L., & Grossman, M. (2007). Syntactic and thematic components of sentence processing in progressive nonfluent aphasia and nonaphasic frontotemporal dementia. Journal of Neurolinguistics, 20, 482–494.
Pérez-Leroux, A. T. (1998). The acquisition of mood selection in Spanish relative clauses. Journal of Child Language, 25, 585–604.
Perruchet, P., & Amorim, M. A. (1992). Conscious knowledge and changes in performance in sequence learning: Evidence against dissociation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 785–800.
Perruchet, P., & Pacton, S. (2006). Implicit learning and statistical learning: One phenomenon, two approaches. Trends in Cognitive Sciences, 10, 233–238.
Pimsleur, P. (1966). Pimsleur Language Aptitude Battery (PLAB). New York: The Psychological Corporation.
Pretz, J. E., Totz, K. S., & Kaufman, S. B. (2010). The effects of mood, cognitive style, and cognitive ability on implicit learning. Learning and Individual Differences, 20, 215–219.
Ransdell, S., Arecco, M. R., & Levy, C. M. (2001). Bilingual long-term working memory: The effects of working memory loads on writing quality and fluency. Applied Psycholinguistics, 22, 113–128.
Raven, J. C. (1938). Progressive Matrices. London: H. K. Lewis & Co., Ltd.
Reber, A. S. (1989). Implicit learning and tacit knowledge. Journal of Experimental Psychology: General, 118, 219–235.
Reber, A. S. (1993). Implicit learning and tacit knowledge: An essay on the cognitive unconscious. New York: Oxford University Press.
Reber, A. S., Walkenfeld, F. F., & Hernstadt, R. (1991).
Implicit and explicit learning: Individual differences and IQ. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 888–896.
Reber, A. S., & Allen, R. (2000). Individual differences in implicit learning: Implications for the evolution of consciousness. In R. G. Kunzendorf & B. Wallace (Eds.), Individual differences in conscious experience (pp. 227–247). Amsterdam: Benjamins.
Rebuschat, P. (2008). Implicit learning of natural language syntax. Unpublished Ph.D. dissertation. University of Cambridge, Cambridge, UK.
Rebuschat, P., & Williams, J. N. (2006). Dissociating implicit and explicit learning of natural language syntax. In R. Sun & N. Miyake (Eds.), Proceedings of the Annual Meeting of the Cognitive Science Society (p. 2594). Mahwah, NJ: Lawrence Erlbaum.
Rebuschat, P., & Williams, J. (2009). Implicit learning of word order. In N. A. Taatgen & H. van Rijn (Eds.), Proceedings of the 31st Annual Conference of the Cognitive Science Society (p. 1031). Austin, TX: Cognitive Science Society.
Reves, T. (1983). What makes a good language learner? Unpublished Ph.D. dissertation, Hebrew University of Jerusalem, Israel.
Reed, J., & Johnson, P. (1994). Assessing implicit learning with indirect tests: Determining what is learned about sequence structure. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 585–594.
Robinson, P. (1996). Learning simple and complex second language rules under implicit, incidental, rule-search, and instructed conditions. Studies in Second Language Acquisition, 18, 27–67.
Robinson, P. (1997). Individual differences and the fundamental similarity of implicit and explicit adult second language learning. Language Learning, 47, 45–99.
Robinson, P. (2001). Individual differences, cognitive abilities, aptitude complexes, and learning conditions in SLA. Second Language Research, 17, 368–392.
Robinson, P. (2002).
Individual differences in intelligence, aptitude and working memory during adult incidental second language learning: A replication and extension of Reber, Walkenfeld, and Hernstadt (1991). In P. Robinson (Ed.), Individual differences and instructed language learning (pp. 211–266). Amsterdam: Benjamins.
Roehr, K., & Gánem, A. (2009). The status of metalinguistic knowledge in instructed adult L2 learning. Language Awareness, 18, 165–181.
Saffran, J. R., Newport, E. L., & Aslin, R. N. (1996). Word segmentation: The role of distributional cues. Journal of Memory and Language, 35, 606–621.
Saffran, J. R., Newport, E. L., Aslin, R. N., Tunick, R. A., & Barrueco, S. (1997). Incidental language learning: Listening (and learning) out of the corner of your ear. Psychological Science, 8, 101–105.
Saffran, J. R., Johnson, E. K., Aslin, R. N., & Newport, E. L. (1999). Statistical learning of tone sequences by human infants and adults. Cognition, 70, 27–52.
Sasaki, M. (1996). Second language proficiency, foreign language aptitude, and intelligence: Quantitative and qualitative analyses. New York: Peter Lang.
Sawyer, M., & Ranta, L. (2001). Aptitude, individual differences and instructional design. In P. Robinson (Ed.), Cognition and second language instruction (pp. 319–353). Cambridge: Cambridge University Press.
Schmid, M. S. (2006). Second language attrition. In K. Brown (Ed.), Encyclopedia of language and linguistics (Vol. 11, pp. 74–81). Oxford: Elsevier.
Shanks, D. R., & Perruchet, P. (2002). Dissociation between priming and recognition in the expression of sequential knowledge. Psychonomic Bulletin & Review, 9, 362–367.
Shanks, D. R., Wilkinson, L., & Channon, S. (2003). Relationship between priming and recognition in deterministic and probabilistic sequence learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 248–261.
Sheen, Y. (2007). The effect of focused written corrective feedback and language aptitude on ESL learners'
acquisition of articles. TESOL Quarterly, 41, 255–283.
Skehan, P. (1982). Memory and motivation in language aptitude testing. Unpublished Ph.D. dissertation. University of London.
Skehan, P. (1989). Individual differences in second language learning. London: Arnold.
Skehan, P. (1990). The relationship between native and foreign language learning ability: Educational and linguistic factors. In H. Dechert (Ed.), Current trends in European second language acquisition research (pp. 83–106). Clevedon: Multilingual Matters.
Skehan, P. (1998). A cognitive approach to learning language. Oxford: Oxford University Press.
Skehan, P. (2002). Theorizing and updating aptitude. In P. Robinson (Ed.), Individual differences and instructed language learning (pp. 69–93). Amsterdam: Benjamins.
Skehan, P. (2012). Language aptitude. In S. M. Gass & A. Mackey (Eds.), The Routledge handbook of second language acquisition (pp. 381–395). New York: Routledge.
Skinner, C., Johnson, J., Bardos, A. N., & Rhee, S. (1996). Brief measures of cognitive ability and their relationships with achievement. Paper presented at the annual conference of the Colorado Society of School Psychologists, Vail, CO.
Slobin, D. (1985). The crosslinguistic study of language acquisition. Hillsdale, NJ: Lawrence Erlbaum Associates.
Smith, K. L. (1980). Common errors in the compositions of students of Spanish as a second language. Unpublished doctoral dissertation. University of Texas at Austin.
Sorace, A. (1993). Incomplete vs. divergent representations of unaccusativity in non-native grammars of Italian. Second Language Research, 9, 22–47.
Sparks, R. (1995). Examining the linguistic coding differences hypothesis to explain individual differences in foreign language learning. Annals of Dyslexia, 45, 187–214.
Sparks, R., & Ganschow, L. (1991). Foreign language learning difficulties: Affective or native language aptitude differences? Modern Language Journal, 75, 3–16.
Sparks, R., Ganschow, L., & Patton, J. (1995).
Prediction of performance in first-year foreign language courses: Connections between native and foreign language learning. Journal of Educational Psychology, 87, 638–655.
Speciale, G., Ellis, N. C., & Bywater, T. (2004). Phonological sequence learning and short-term store capacity determine second language vocabulary acquisition. Applied Psycholinguistics, 25, 293–321.
Sternberg, R. J. (1985). Beyond IQ: A triarchic theory of human intelligence. New York: Cambridge University Press.
Sternberg, R. J. (1990). Metaphors of mind: Conceptions of the nature of intelligence. New York: Cambridge University Press.
Tagarelli, K. M., Borges-Mota, M., & Rebuschat, P. (forthcoming). The role of working memory in implicit and explicit language learning.
Terrell, T. D., Baycroft, B., & Perrone, C. (1987). The subjunctive in Spanish interlanguage: Accuracy and comprehensibility. In B. VanPatten, T. Dvorak, & J. Lee (Eds.), Foreign language learning: A research perspective (pp. 19–32). New York: Newbury House Publishers.
Tsimpli, I. M., & Mastropavlou, M. (2007). Feature interpretability in L2 acquisition and SLI: Greek clitics and determiners. In H. Goodluck, J. Liceras, & H. Zobl (Eds.), The role of formal features in second language acquisition (pp. 143–183). London: Routledge.
Waters, G. S., & Caplan, D. (1997). Working memory and on-line sentence comprehension in patients with Alzheimer's disease. Journal of Psycholinguistic Research, 26, 377–400.
Wells, C. G. (1985). Language development in the pre-school years. Cambridge: Cambridge University Press.
Wechsler, D. (1981). WAIS-R (Wechsler Adult Intelligence Scale-Revised) manual. San Antonio, TX: The Psychological Corporation.
Wesche, M. B. (1981). Language aptitude measures in streaming, matching students with methods, and diagnosis of learning problems. In K. C. Diller (Ed.), Individual differences and universals in language learning aptitude (pp. 119–154). Rowley, MA: Newbury House.
Wesche, M.
B., Edwards, H., & Wells, W. (1982). Foreign language aptitude and intelligence. Applied Psycholinguistics, 3, 127–140.
Williams, J. N. (1999). Memory, attention, and inductive learning. Studies in Second Language Acquisition, 21, 1–48.
Williams, J. N. (2005). Learning without awareness. Studies in Second Language Acquisition, 27, 269–304.
Willingham, D. B., Salidis, J., & Gabrieli, J. D. E. (2002). Direct comparison of neural systems mediating conscious and unconscious skill learning. Journal of Neurophysiology, 88, 1451–1460.
Woltz, D. J. (1990). Repetition of semantic comparisons: Temporary and persistent priming effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 392–403.
Woltz, D. J. (1999). Individual differences in priming: The roles of implicit facilitation from prior processing. In P. L. Ackerman, P. C. Kyllonen, & R. D. Roberts (Eds.), Learning and individual differences: Process, trait, and content determinants (pp. 135–156). Washington, DC: American Psychological Association.
Woltz, D. J. (2003). Implicit cognitive processes as aptitudes for learning. Educational Psychologist, 38, 95–104.
Wurm, L. H., & Samuel, A. G. (1997). Lexical inhibition and attentional allocation during speech perception: Evidence from phoneme monitoring. Journal of Memory and Language, 36, 165–187.
Yilmaz, Y. (2010, October). Relative effects of explicit correction and recasts: The role of working memory capacity and language analytic ability. Paper presented at the Second Language Research Forum Annual Conference, University of Maryland, College Park, MD.
Yukawa, E. (1997). L1 Japanese attrition and regaining: Three case studies of two early bilingual children. Unpublished Ph.D. dissertation, Centre for Research on Bilingualism, Stockholm University.