The role of internal constraints and stylistic
congruence on a variant’s social impact

Charlotte Vaughn

University of Oregon & University of Maryland, USA
Corresponding author. E-mail: cvaughn@uoregon.edu

Abstract
In natural conversation, multiple factors likely impact the social force of a sociolinguistic
variant, yet researchers have tended to examine individual factors in isolation. This paper
considers two underexamined factors together—the role of a variable’s internal constraints
and the role of stylistically congruent surrounding speech—to understand their combined
influence on how a single variable’s realization is socially interpreted. Focusing on English
variable (ING), two accent rating experiments used stimuli varying the grammatical cat-
egory of (ING) words and varying the stylistic congruence (natural sentences versus
spliced stimuli) between (ING) realization and sentence frames. Results indicate that lis-
teners showed sensitivity to (ING)’s internal constraints but only when the congruence
between (ING)’s realization and other cues was not disrupted by using spliced stimuli.
These findings suggest that internal constraints and stylistic congruence play a role in
social signaling, and have methodological implications for the use of splicing.

Keywords: social evaluation; internal constraints; stylistic congruence; expectations; splicing

A linguistic form’s social meaning is not uniform across situations but rather is
co-constructed among the speaker, the listener, and the context. Listeners’ experi-
ences with linguistic forms, speakers, contexts, and surrounding language ideologies
accumulate in their representations, and a given form’s social impact results from the
interplay between these factors (e.g., Eckert, 2012; Ochs, 1992; Podesva, 2008; Zhang,
2005). In the past several decades, experimental studies have examined this interplay
directly, finding, for example, that how frequently a variant is used (e.g., Labov, Ash,
Ravindranath, Weldon, Baranowski, & Nagy, 2011), and how likely the variant is in a
particular social (e.g., Campbell-Kibler, 2009) and linguistic (e.g., Bender, 2005) con-
text can affect how a speaker will be socially evaluated for their use of that variant.
Moreover, clusters of variants of other variables occurring alongside a particular var-
iant also influence the impact of that variant (e.g., Montgomery & Moore, 2018;
Watson & Clark, 2013, 2015). From these factors, the field has begun to assemble
a picture of what predicts how a particular utterance will be socially construed by
a listener, but there is still much to learn about these factors and how they interact.

© The Author(s), 2023. Published by Cambridge University Press. This is an Open Access article, distributed under the
terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unre-
stricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.

Language Variation and Change (2022), 34, 331–354
doi:10.1017/S0954394522000175

https://doi.org/10.1017/S0954394522000175 Published online by Cambridge University Press



This paper examines the contributions of, and interaction between, two factors: a var-
iable’s internal constraints and the stylistic (in)congruence between a variant and
other covarying cues.

Some work has indicated that a variable’s internal constraints, or the linguistic fac-
tors that probabilistically predict a particular variant, can affect a variant’s social
influence (Austen, 2020; Bender, 2005; Drager, 2010; Freitag, 2020; Labov, 2003;
Podesva, Reynolds, Callier, & Baptiste, 2015). Other work suggests that listeners’
social inferences are affected by the presence of multiple variants together (Austen
& Campbell-Kibler, 2022; Campbell-Kibler, 2009; Levon, 2014; Montgomery &
Moore, 2018; Pharao, Maegaard, Møller, & Kristiansen, 2014; Watson & Clark,
2013, 2015). In both cases, these effects are likely due to expectations that listeners
have built up about the likelihood of encountering a variant in a particular context.
Considering English variable (ING), as in thinking versus thinkin’, from prior expe-
rience listeners may have a sense of the relative likelihood of hearing an -in form in a
noun (mornin’) compared to a verb (runnin’), or the likelihood of hearing an -in form
in the presence of surrounding speech containing reduced compared to released /t/s,
and calibrate their social expectations accordingly.

Using English variable (ING) as a case study, this paper assesses the relative
importance to listeners of both variable-internal and multivariable patterns in mak-
ing social judgments. Two accent rating experiments assess whether listeners’ ratings
of stimuli are affected not only by the (ING) variant realized but also by (1) the gram-
matical category of the (ING) word, an example of its internal constraints, and (2) the
extent to which the (ING) realization is stylistically congruent with other cues in
the stimuli. (2) is assessed by using naturally produced stimuli and spliced versions
of the same stimuli, half of the spliced stimuli showing stylistic incongruence between
the (ING) realization and the surrounding speech (e.g., an -in spliced in from an
utterance originally produced with -ing), and half showing stylistic congruence
between the (ING) realization and other cues in the signal (e.g., the -in variant spliced
in from an utterance originally produced with -in). The results of these experiments
provide a window into the relative weight of within-variable and across-variable pat-
terns in social signaling. They also provide an opportunity to consider the method-
ological implications of using naturally produced versus spliced stimuli.

The social impact of within-variable patterns: Sensitivity to internal constraints

Both specific instances of sociolinguistic variants and patterns of variant use can carry
social meaning (e.g., Bender, 2007; Levon, 2007). That is, an individual realization
of a variant can affect listeners’ judgments about the speaker, as in the finding that
hearing (ING) realized as -in raises the likelihood of perceived Southernness of the
speaker as compared with hearing -ing (Campbell-Kibler, 2007). And, individual
instances are interpreted in the context of their larger patterning, such that an indi-
vidual token of -in affects intelligence judgments less when the speaker is already
assumed to be Southern (Campbell-Kibler, 2009). Further, the social force of
instances and patterns of variant use interact, since the social meaning ascribed to
a specific instance of a variant in fact arises from listeners’ awareness of its social
and linguistic patterning in the first place: variants gain their social meaning in
part because of their presence within socially situated clusters or lects (Bender, 2007;
Eckert, 2002; Johnstone, 2016; Ochs, 1992).

A speaker’s rate of use of a variant affects listeners’ social judgments (e.g., Freitag,
2020; Labov et al., 2011; Levon & Buchstaller, 2015; Levon & Fox, 2014; Wagner &
Hesson, 2014), as shown by experiments where listeners hear the same speech sample
multiple times with differing rates of a nonstandard variant in each repetition
(typically, these differing rates being produced by a splicing procedure) and make a
social evaluation of the speaker—often their degree of professionalism—after each
sample. Findings indicate that, for variables of sufficient salience like (ING) in
American English (Labov et al., 2011; Wagner & Hesson, 2014), hearing higher pro-
portions of the nonstandard variant lowers the social ratings given to the speaker.
Labov et al. (2011) explain these findings through the construct of the sociolinguistic
monitor, proposed to be a cognitive mechanism responsible for tracking sociolinguis-
tic variation for purposes of social evaluation.

Social evaluations can be sensitive to how expected a variant of a single variable is
in a given context. That is, in addition to monitoring its frequency overall, listeners
can also track where a form tends to occur most often, such as a variant’s usage
with respect to its internal constraints. For example, Podesva et al. (2015) found
stronger social evaluations of /t/-releases in contexts where released /t/ occurs less fre-
quently (word-medially) than in contexts where it is more likely to occur (word-
finally). And, Labov (2003; described in Preston, 2011), examined listeners’ sensitivity
to the grammatical conditioning of variable (ING), of direct relevance to the present
study. His sample of ten listeners reported that hearing a passage where -in over-
whelmingly occurred in nouns (a grammatical environment more marked for -in)
sounded more unnatural than the same passage where -in overwhelmingly occurred
in verbs (following typical grammatical constraints) (see also Vaughn, 2022).
Bender’s (2005) study, perhaps the most developed exploration of this idea to date,
showed that the grammatical environment of an instance of copula absence in
African American English (AAE) affected the strength of social evaluation, such
that copula absence in less attested (more marked) environments increased listeners’
negative evaluations of speakers, heightening the social effect of deploying the variant.
This effect was confined to those listeners who were most familiar with the variety
(see also Austen, 2020). Listeners have also been shown to track the internal con-
straints of a variable in a linguistic processing task (Vaughn & Kendall, 2018).
When listening to sentences containing (ING) words, listeners were faster to classify
the variant as -in when it occurred in an (ING) word whose grammatical category
strongly favors -ing (noun-like categories) than in those that less strongly favor
-ing (verb-like categories). In other words, listeners classified variants faster in gram-
matical categories for which they have strong expectations about realization when
those expectations were violated (i.e., -in realizations in noun-like categories).

Still, a better understanding of the importance of internal constraints to listeners is
needed. Internal constraints are central in production studies on the transmission,
acquisition, and diffusion of variation. For example, comparing the hierarchy of inter-
nal constraints across communities’ patterns of production is a foundation of com-
parative sociolinguistics (e.g., Tagliamonte, 2013). Given the central importance of
internal constraints to variationist theory (MacKenzie, 2019; Meyerhoff & Walker,
2007), it is surprising that there has been relatively little attention paid to the role of
these constraints in perception, social and otherwise.

The social impact of across-variable patterns: Sensitivity to stylistic congruence

In natural speech settings, listeners encounter streams of linguistic forms, rather than
hearing only one at a time. It is in the manipulation of multiple variables together that
speakers enact social identities, stances, styles, and personae (e.g., Eckert, 2002;
Johnstone, 2016; Levon, 2007, 2014; Podesva, 2008; Zhang, 2005). Speakers’ stylistic
use of clusters of features means it is likely that listeners have experience with such
clusters, and use cues across multiple variables when construing social information.
First, listeners make social judgments based on multiple features together, as
shown in studies that ask listeners to make social judgments as they are listening
to a speech sample, updating their responses over time, allowing analysts to infer
the aspects of the speech that affected listeners’ inferences about the speaker (e.g.,
Austen & Campbell-Kibler, 2022; Montgomery & Moore, 2018; Watson & Clark,
2013, 2015). Second, stylistically congruent variants are linked in mental representa-
tion (Campbell-Kibler, 2012; Levon, 2007; Vaughn & Kendall, 2019; Wade, 2022).
For example, in a novel word game paradigm, American participants produced
more Southern-like productions of /aɪ/ (i.e., more monophthongal) after hearing a
model talker who never produced tokens of /aɪ/ but whose vowels were otherwise
Southern-shifted (Wade, 2022). And, Levon found that pitch range and sibilant dura-
tion operated jointly to affect the degree of masculinity British listeners ascribed to
speakers, supporting “a gestalt-like understanding of indexicality…whereby linguistic
features are not only salient on their own but can also work in clusters to achieve
social-indexical significance” (2007:546). In addition, listeners typically encounter
stylistically congruent cues together in the signal and expect stylistic congruence
among features, being surprised when the covariation among features is mismatched
(Vaughn & Kendall, 2018, 2019).

The present study: Within- and across-variable patterns

The present study examines the social force of a form based on both internal con-
straints and stylistic congruence. The accent rating task is used as a holistic way to
elicit listeners’ impressions, following work by Campbell-Kibler (2007, 2021) for var-
iable (ING) as well as work in second language speech perception (e.g., Munro &
Derwing, 1995). The underspecified nature of “accent” affords listeners a general
dimension along which to rate speakers. Although of course the degree of perceived
“accent” is always relative to the language variety and experience of the listener, and
in fact there are no “unaccented” speakers, the term is adopted here because it is read-
ily used by naive listeners. The crucial measure of interest is the relative accent rating
of a stimulus produced with -in as compared to -ing, and whether that difference is
conditioned by (1) internal constraints and (2) stylistic (in)congruence.

To assess question (1), listeners’ ratings of -in versus -ing stimuli are compared
across (ING) words of different grammatical categories. It is expected that accent ratings
for -ing forms will be low across grammatical categories, as it is the canonical
form. The measure of interest, then, is whether accent ratings of -in realizations for
(ING) words in grammatical categories more marked for -in production (noun-like
forms) are greater than -in ratings for grammatical categories where -in is less marked
(verb-like forms). In other words, is the social effect of a form’s realization modulated
by its internal constraints?

To assess (2), Experiment 1 uses naturally produced stimuli, and Experiment 2
uses spliced versions of the same stimuli. In the natural stimuli, speakers produced
the stimuli once with (ING) words realized as -ing, and once as -in, and were
instructed not to change anything beyond the (ING) realization. Following much
prior work (Campbell-Kibler, 2007, inter alia; Labov et al., 2011; Levon &
Buchstaller, 2015; Levon & Fox, 2014; Podesva et al., 2015), spliced stimuli were cre-
ated using a cutting-and-pasting variation of the matched guise methodology, where
the -ing realization from the natural -ing “frame” is replaced by an -in realization
spliced in from the natural -in production of the sentence (the -in “frame”). In
another version, the -ing realization is replaced by the same -ing pasted back into
the frame, to maintain any artifacts created by the splicing process. In this study,
the same process is also done for the other frame, the natural -in frame sentences,
creating four versions of each stimulus.
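The four-way design described above can be sketched in a few lines of Python. This is only an illustration of the crossing of frame and realization; the labels `frame`, `realization`, and `congruent`, and the function name `spliced_versions`, are hypothetical and do not come from the original materials.

```python
from itertools import product

# Illustrative labels only; the paper describes the design in prose.
FRAMES = ("ing", "in")        # variant the carrier sentence was originally produced with
REALIZATIONS = ("ing", "in")  # (ING) token pasted into that frame


def spliced_versions(sentence_id):
    """Enumerate the four spliced versions of one sentence."""
    versions = []
    for frame, realization in product(FRAMES, REALIZATIONS):
        versions.append({
            "sentence": sentence_id,
            "frame": frame,
            "realization": realization,
            # Congruent when the token sits in a frame produced with the same variant.
            "congruent": frame == realization,
        })
    return versions


for version in spliced_versions(1):
    print(version)
```

Under this sketch, two of the four versions of every sentence are stylistically congruent and two are incongruent, which is the property the spliced experiment relies on.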

In general, the splicing method is used to ensure that the (ING) realization itself is
the only aspect of the signal that changes across guises. For the purposes of this study,
the spliced stimuli offer an additional useful property: half of the stimuli show stylistic
congruence between the realization and the frame in which it was produced, and half
do not. Thus, comparing participants’ behavior when listening to spliced versus nat-
ural stimuli and congruent versus incongruent spliced stimuli allows an exploration
of whether any reliance on (ING)’s internal constraints in Experiment 1 (when the
signal contains only cues stylistically congruent with (ING)’s realization) will still
be evident for spliced stimuli (where half of the stimuli are stylistically incongruent
between the (ING) variant and the surrounding signal).

Since stretches of natural speech tend to cohere stylistically, there is little chance
to test whether listeners are sensitive to the stylistic context of speech patterns.
Intentionally introducing incongruence through spliced stimuli here allows for
this test; it is not done because of an inherent interest in how listeners respond
to artificially created stimuli but because of what those situations can say about
what listeners must be doing during the course of regular speech processing. If listeners
behave differently when the (ING) variant and the style of the surrounding speech are
mismatched, this suggests that
listeners have expectations about stylistic congruence. Alternatively, if listeners’
behavior does not change when faced with mismatched stimuli, this suggests that
the stylistic congruence that occurs in natural speech is not critical to their accent
ratings.

In addition to shedding light on these theoretical issues, directly comparing natu-
rally produced versus spliced stimuli in this study has important methodological impli-
cations for the field. Splicing has become a dominant approach in matched guise
studies, but the consequences of its use have not been fully explored. In an early use
of this approach, Campbell-Kibler suggested a need to consider the ramifications of
splicing: “Although [the splicing procedure] did not make [the stimuli] strange or
unnatural, it does raise interesting questions regarding exactly what we consider to be
‘matched’ in matched guise work” (2009:139). Further, most such studies splice reali-
zations of a variable into only one frame, and, in doing so, have conflated the effect
of the realization of a variable with the effect of its stylistic (in)congruence with the sur-
rounding speech. Splicing a variant into a frame originally produced in the context of
another variant necessarily introduces mismatches between the variant’s realization and
the style of the surrounding speech. The present design, which creates both stylistically
congruent and incongruent versions of stimuli with both realizations of (ING), can dis-
entangle the two and can more generally speak to the consequences of the now com-
mon methodological decision to use spliced stimuli.

Experiment 1: Natural stimuli

Methods

Stimuli. The stimuli used in this study were ninety-six sentences (plus four practice
sentences), each containing one (ING) word, with a total of forty-six distinct word
types, originally used in Vaughn and Kendall (2018, Experiment 3; see Appendix
A of that paper for full list of stimuli and details). (ING) words from the following
grammatical categories were included: progressive verbs (n = 48), gerunds (n = 11),
adjectives (n = 16), nouns (n = 9), and two types of pronouns: the two-syllable pro-
nouns something and nothing (pronoun-2s, n = 6), and the three-syllable pronouns
anything and everything (pronoun-3s, n = 6); stimulus distribution across grammati-
cal categories was guided by Hazen (2008). Following Vaughn and Kendall (2018),
pronoun-3s, adjectives, and nouns are coded here as noun-like categories, which
tend to be produced with a lower rate of -in than verb-like categories (here, the cat-
egory progressive verbs), and gerunds tend to be intermediate. Pronouns are noun-
like in their grammar; pronoun-3s are rarely realized with -in, but pronoun-2s are
much more commonly realized as -in, and thus pronoun-2s are not classed together
with noun- or verb-like categories.

Four female native English speakers produced the stimuli. Three identified as
white and one identified as mixed-race; two were from Southern California (ages
eighteen and twenty-three), and two were from Oregon (both aged eighteen).
Stimuli were recorded in a sound attenuated booth using a Shure SM93 microphone
into a Marantz PMD-661 recorder. All speakers were aware of, and could produce,
both -ing and -in forms. Speakers were asked to produce the sentences as naturally
as possible. Stimuli were amplitude normalized to yield an approximately equal vol-
ume across sentences.

Participants. Participants in Experiment 1 were thirty-nine undergraduates from the
University of Oregon’s Psychology and Linguistics Subject Pool who received partial
course credit for their time. Three additional participants were run but excluded
because of experimenter or software errors (n = 2) or uncorrected hearing loss (n =
1). All participants were highly familiar with American English, having learned
English at age six or younger. The average age of participants was nineteen years old
(min = 18, max = 25). Eighteen participants self-identified as male, twenty as female,
and one as non-binary. Twenty-six participants self-identified as white, one as Black,
three as Hispanic, four as Asian, four as mixed-race, and one as other (without iden-
tifying more specifically).

Procedure. Each participant heard all ninety-six sentences, divided equally across the
four talkers (twenty-four sentences per talker), with sentences randomized per partici-
pant. For each sentence, half of participants heard the -ing and half heard the -in version.
This counterbalancing was done within talker, such that listeners heard twelve -ing and
twelve -in sentences from each talker. Thus, for every listener, each talker’s overall rate of
(ING) realization, and the rate of (ING) realization overall, was 50% -ing, 50% -in.
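One way to obtain this kind of counterbalancing is sketched below. The paper does not describe how lists were built, so everything here is an assumption: two complementary lists assigned by participant parity, a fixed sentence-to-talker mapping, and the hypothetical function name `build_trials`.

```python
import random

N_TALKERS = 4
SENTENCES_PER_TALKER = 24


def build_trials(participant_index, seed=0):
    """Return a randomized trial list of (sentence, talker, realization) tuples.

    Assumed scheme, not taken from the paper: even-numbered participants get
    one list and odd-numbered participants get its mirror image, so every
    sentence is heard as -ing on one list and as -in on the other.
    """
    flip = participant_index % 2
    trials = []
    for talker in range(N_TALKERS):
        for k in range(SENTENCES_PER_TALKER):
            # Simplification: each sentence is tied to a single talker.
            sentence = talker * SENTENCES_PER_TALKER + k
            realization = "ing" if (k + flip) % 2 == 0 else "in"
            trials.append((sentence, talker, realization))
    # Presentation order is shuffled separately for each participant.
    random.Random(seed + participant_index).shuffle(trials)
    return trials
```

Alternating realizations within each talker's block of twenty-four sentences yields twelve -ing and twelve -in items per talker, and therefore a 50% rate overall for every listener.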

Participants completed the task individually seated in a sound-attenuated room in
front of a Mac Mini computer, wearing Sennheiser HD-202 headphones. The task
was presented using PsychoPy (Peirce, 2007). Participants were told they would be listening
to English speech and were asked to rate the level of accent in each stimulus on
a scale from 1 to 9, where 1 = no accent and 9 = very strong accent, by pressing the
numbers one through nine on the keyboard. Participants were told that all speakers
were native English speakers and that they should rate the level of accent of each sen-
tence in relation to the other sentences they hear. Participants could hear each sen-
tence only once. Following the accent rating task, all participants completed a
demographic and language background questionnaire.

Results

A total of 3,744 responses were collected (96 items × 39 participants). Before analysis,
potentially spurious responses were trimmed by first transforming response reaction
times to natural log values and then removing from the dataset responses more than 2.5 standard
deviations from each participant’s mean log RT (n = 63), leaving 3,681 responses for analysis.
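The trimming step can be illustrated with a short Python sketch. It assumes a list of response records with hypothetical keys `participant` and `rt` (reaction time in seconds) and uses the sample standard deviation; the original analysis was run in R and its exact implementation is not given.

```python
import math
from statistics import mean, stdev


def trim_by_log_rt(responses, criterion=2.5):
    """Keep responses whose log RT lies within `criterion` SDs of the
    participant's own mean log RT.

    `responses` is a list of dicts with hypothetical keys "participant" and
    "rt" (positive number). Each participant needs at least two responses,
    because the sample standard deviation is undefined otherwise.
    """
    logs_by_participant = {}
    for r in responses:
        logs_by_participant.setdefault(r["participant"], []).append(math.log(r["rt"]))
    stats = {p: (mean(v), stdev(v)) for p, v in logs_by_participant.items()}

    kept = []
    for r in responses:
        m, sd = stats[r["participant"]]
        if abs(math.log(r["rt"]) - m) <= criterion * sd:
            kept.append(r)
    return kept
```

Because the criterion is computed per participant, a habitually slow responder is not penalized relative to a fast one; only responses that are unusual for that individual are dropped.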

Figure 1 presents listeners’ ratings of each sentence according to realization of the
(ING) word. Consistent with prior work, stimuli with -in realizations
were rated as sounding more accented (M = 4.19) than those with -ing realizations
(M = 1.92); Table 1 provides numbers of observations and means for
factors of interest. Figure 2 plots accent ratings by realization in interaction with
grammatical category. Grammatical categories are ordered along the x-axis by how
likely each is to be realized as -in in production (following Vaughn & Kendall, 2018),
with categories on the left being
less likely to be produced as -in and those on the right being more likely.
Examining Figure 2, ratings of stimuli produced with -ing did not vary much accord-
ing to grammatical category, while stimuli realized as -in appear to show higher
accent ratings toward the left side of the plot, in grammatical categories where -in
is less common in production.

Turning to statistical analysis to assess these observations, accent ratings were mod-
eled using mixed-effect linear regression (with the lmerTest package in R; Kuznetsova,
Brockhoff, & Christensen, 2014), with individual accent rating responses, centered and
scaled, as the dependent variable. Models considered random intercepts for partici-
pant, stimulus sentence, speaker, and (ING) word, but (ING) word was determined
not to improve model fit, and the final model included only random effects for
participant, sentence, and speaker. The model included fixed effects of realization of
(ING) word (dummy coded, -in as reference level), grammatical category of (ING)
word (dummy coded, pronoun-3 as reference level), as well as their interaction.1
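For readers who prefer a formula, the final model described above can be written out as follows. The notation is introduced here for illustration and is not the author's: $z_{psk}$ is the centered and scaled rating given by participant $p$ to sentence $s$ produced by speaker $k$; $\mathrm{ING}$ equals 1 for an -ing realization and 0 for the reference level -in; and $\mathrm{Cat}_{c}$ is an indicator for each non-reference grammatical category $c$ (adjective, noun, gerund, progressive, pronoun-2), with pronoun-3 as the reference level.

```latex
\begin{align*}
z_{psk} ={}& \beta_0 + \beta_1\,\mathrm{ING}_{psk}
  + \sum_{c} \gamma_c\,\mathrm{Cat}_{c,s}
  + \sum_{c} \delta_c\,\mathrm{ING}_{psk}\,\mathrm{Cat}_{c,s} \\
 & + u_p + v_s + w_k + \varepsilon_{psk},
\qquad u_p \sim \mathcal{N}(0,\sigma_u^2),\;
       v_s \sim \mathcal{N}(0,\sigma_v^2),\;
       w_k \sim \mathcal{N}(0,\sigma_w^2),\;
       \varepsilon_{psk} \sim \mathcal{N}(0,\sigma^2)
\end{align*}
```

In this parameterization $\beta_1$ is the effect of -ing relative to -in for pronoun-3s, and each $\delta_c$ expresses how much that effect changes for category $c$.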

An analysis of deviance table (using the car package in R; Fox & Weisberg, 2011) is
shown in Table 2, and the fixed effects are summarized in Table 3.

Statistical modeling confirmed that -in realizations were given significantly higher
accent ratings than -ing versions of the same sentences (χ2 = 2341.54, p < .001),2 as
shown in Figure 1, and that this was moderated by a significant interaction with
grammatical category (χ2 = 22.67, p < .001), as shown in Figure 2. Grammatical cate-
gories more commonly realized with -in in production (progressive verbs and
pronoun-2s) showed a smaller difference between ratings of -in and -ing as compared
to the reference level pronoun-3s, a category rarely realized as -in. The effect of realization
for the remaining noun-like categories (adjectives and nouns) and for gerunds (the other
grammatical categories where -in is less common in production) did not significantly
differ from the effect of realization for pronoun-3s.
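Because both predictors are dummy coded, the size of the realization effect within each grammatical category can be recovered by adding the relevant interaction estimate to the main effect of realization. The sketch below does this arithmetic with the point estimates reported in Table 3; the function name `in_minus_ing` is hypothetical, and the calculation ignores the uncertainty around each estimate.

```python
# Fixed-effect estimates copied from Table 3. Ratings were centered and
# scaled, so all values are in standard-deviation units.
REALIZATION_ING = -1.207  # -ing relative to -in for the reference category (pronoun-3)
INTERACTION = {
    "pronoun-3": 0.0,  # reference level, no adjustment
    "adjective": 0.006,
    "noun": 0.104,
    "gerund": 0.083,
    "progressive": 0.221,
    "pronoun-2": 0.371,
}


def in_minus_ing(category):
    """Model-implied difference in scaled rating (-in minus -ing) for a category."""
    return round(-(REALIZATION_ING + INTERACTION[category]), 3)


for category in INTERACTION:
    print(f"{category:12s} {in_minus_ing(category):.3f}")
```

On these point estimates, the gap between -in and -ing is largest for pronoun-3s (1.207) and smallest for pronoun-2s (0.836), with progressive verbs at 0.986, which matches the direction of the interaction reported above.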

Figure 1. Accent rating for Experiment 1 by realization.

Table 1. Experiment 1 (Natural stimuli) raw observations and mean ratings per factor

Factor                Level               N     Mean
Grammatical Category  Pronoun-3 (ref)    231    3.273
                      Adjective          607    3.213
                      Noun               343    3.122
                      Gerund             423    3.277
                      Progressive       1845    2.962
                      Pronoun-2          232    2.625
Realization           IN (ref)          1838    4.187
                      ING               1843    1.921



Discussion

Experiment 1, using natural stimuli, found that listeners rated sentences with (ING)
words realized as -in as more accented than when realized as -ing, as expected from
prior findings (e.g., Campbell-Kibler, 2007). Further, accent ratings of -in were
affected by the grammatical category of the (ING) word, in line with prior work sug-
gesting that internal patterns of variant use can affect social evaluations (e.g., Bender,
2005). When the (ING) word was realized as -ing, accent ratings did not differ across
grammatical categories, in keeping with -ing’s status as the unmarked form. When
realized as -in, however, accent ratings were higher when the -in occurred in gram-
matical categories where it was less expected from production norms (e.g., noun-like
forms), as compared with categories where -in was less marked (e.g., verbs and
pronoun-2s). The markedness of the marked variant conditioned its social impact.
These results bolster evidence that listeners track the internal constraints of variables
and can make use of them when giving accent ratings.

Figure 2. Accent rating for Experiment 1 by realization and grammatical category.

Table 2. Experiment 1 (Natural stimuli) analysis of deviance

Term                        χ2    DF   Pr(>χ2)
Realization            2341.54     1   < 2.2e-16  ***
Grammatical Category     10.70     5   0.058      .
Realization:GramCat      22.67     5   0.0004     ***

Signif. codes: *** p < 0.001, ** p < 0.01, * p < 0.05, . p < 0.1



Experiment 2: Spliced stimuli

Experiment 1 confirmed that listeners can use variable-internal patterns when mak-
ing social judgments about speech, finding an interaction between grammatical cat-
egory and (ING) realization when the stimuli were naturally produced and, therefore,
all covarying cues were congruent with the realization of (ING). Vaughn and Kendall
(2018), with the same stimuli, found different behavior regarding grammatical cate-
gory information in natural versus spliced stimuli: when classifying whether they
heard -in or -ing, listeners used grammatical category expectations when listening to
spliced stimuli (Experiment 2), but not naturally produced stimuli (Experiment 3). In
different tasks, however, listeners may weight grammatical expectations and stylistic
congruence differently.

The effect of stylistic congruence on listeners’ use of (ING)’s realization and its internal
constraints might surface in several ways in the accent rating task. It may be that the
lack of congruence in some spliced stimuli is the most salient factor to listeners, such that
sentences with incongruent cues (where frames and realizations mismatch) are rated as
more accented than those with congruent cues (where frames and realizations match),
irrespective of the realization of (ING) or grammatical category. Or it may be that stylistic
congruence interacts with the realization of (ING), where (ING) realization predicts
accent rating but the realization’s ability to signal social meaning may be mitigated in
cases when stylistic cues do not match the realization. Finally, stylistic congruence
may interact with the grammatical category effect: Given that many cues covary with
the realization of (ING), the grammatical category effect may get swamped by the infor-
mation carried across other cues in the spliced stimuli. In this case, the grammatical
category pattern evident in -in realizations in Experiment 1 may not be present at all
in Experiment 2, or may only be present for stylistically congruent (matching) stimuli.

Table 3. Experiment 1 (Natural stimuli) fixed effects

Term                                  Estimate   Std. Error   Pr(>|t|)
(Intercept)                             .649       .192       0.009    **
Realization ING                       −1.207       .090       < 2e-16  ***
Grammatical Category (Adjective)        .056       .107       0.602
Grammatical Category (Noun)            −.079       .118       0.505
Grammatical Category (Gerund)          −.017       .113       0.882
Grammatical Category (Progressive)     −.196       .097       0.046    *
Grammatical Category (Pronoun-2)       −.351       .131       0.008    **
GramCatAdjective:RealizING              .006       .104       0.953
GramCatNoun:RealizING                   .104       .116       0.371
GramCatGerund:RealizING                 .083       .111       0.456
GramCatProgressive:RealizING            .221       .095       0.020    *
GramCatPronoun-2:RealizING              .371       .128       0.004    **

Signif. codes: *** p < 0.001, ** p < 0.01, * p < 0.05, . p < 0.1



Types of stylistic covariation present in these stimuli are illustrated in Vaughn and
Kendall (2019). In that paper, acoustic analyses were conducted on the natural stimuli
produced by these four talkers; the ninety-six stimuli used in Experiment 1 are a sub-
set of the stimuli analyzed there. Five phonetic features were selected for analysis
because they also index aspects of -in’s indexical field, namely Southernness and/or
casualness: /aɪ/-glide length, spectral proximity of the mid-front vowels /e/ and /ε/,
duration of the lax vowels /ε/ and /ɪ/, speaking rate, and prevalence of release and
reduction of intervocalic /t/. Statistical models determined whether the production
of these features differed significantly across sentences produced with -ing versus
-in realizations. Findings indicated that, compared to the -ing versions of the same
sentences, the -in versions had significantly shorter /aɪ/-glides, more proximate mid-
front vowels /e/ and /ε/, longer lax vowels /ε/ and /ɪ/, and more reduced /t/ produc-
tions. Speaking rate did not pattern according to (ING) realization, instead showing
different patterns by speaker across -ing and -in. Thus for four of the five features,
results indicated that speakers’ productions covaried with (ING) realization in ways
that were stylistically congruent with -in’s social meanings of casual (e.g., more
reduced /t/ productions) or Southern (e.g., shorter /aɪ/-glides, despite the four speakers
being from the Western US). It is thus evident that the realization of (ING) is not the
only factor that varies across the naturally produced -ing and -in frames (and therefore
was not the only factor driving the accent ratings in Experiment 1). Because of these
patterns of covariation in the natural stimuli, splicing (ING) realizations across frames
for Experiment 2 will necessarily introduce stylistic mismatches between (ING) realiza-
tion and other features. For example, when splicing -ing into an -in frame, cues in -in
frames that index casualness will be incongruent with -ing’s more formal status.

Methods

Stimuli
The stimuli used in this study are spliced versions of those in Experiment 1 (from
Vaughn and Kendall, 2018, Experiment 2). For the -ing version and the -in version
(or “frame”) of each sentence, splicing was done by identifying boundaries of each
(ING) realization using auditory and spectrographic cues in Praat (Boersma &
Weenink, 2019) and selecting the nearest zero-crossing in the waveform. Then, the
realizations from the opposite version of each sentence (e.g., -in) were pasted into
the other version (e.g., -ing), replacing the original realization for each frame, to create
mismatching or stylistically incongruent stimuli (e.g., -in realization in -ing frame). To
create matching or stylistically congruent stimuli (e.g., -ing realization in -ing frame),
the realization was copied and pasted back into the same sentence frame opened in a
different window, making it unlikely that identical zero-crossings were selected. Thus,
any artifacts of the splicing procedure itself should occur in both matched and mis-
matched stimuli. All stimuli were RMS amplitude normalized to yield an approxi-
mately equal volume across stimuli.
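
The splicing and normalization steps described above can be sketched in code. The following is a minimal illustration only, not the procedure actually used (which was carried out by hand in Praat); the function names are hypothetical, and the waveform is assumed to be a plain list of sample values with boundary indices already chosen by ear and eye.

```python
import math


def nearest_zero_crossing(samples, index):
    """Return the index nearest to `index` at which the waveform is zero or changes sign."""
    def is_crossing(i):
        if samples[i] == 0:
            return True
        return i + 1 < len(samples) and samples[i] * samples[i + 1] < 0

    for offset in range(len(samples)):
        for candidate in (index - offset, index + offset):
            if 0 <= candidate < len(samples) and is_crossing(candidate):
                return candidate
    return index  # no crossing anywhere: fall back to the requested index


def splice(frame, frame_start, frame_end, donor, donor_start, donor_end):
    """Replace the (ING) span in `frame` with the (ING) span from `donor`.

    All four hand-picked boundaries are snapped to the nearest zero crossing
    so that the joins do not introduce clicks.
    """
    fs = nearest_zero_crossing(frame, frame_start)
    fe = nearest_zero_crossing(frame, frame_end)
    ds = nearest_zero_crossing(donor, donor_start)
    de = nearest_zero_crossing(donor, donor_end)
    return frame[:fs] + donor[ds:de] + frame[fe:]


def rms(samples):
    """Root-mean-square amplitude of a waveform."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_rms(samples, target_rms):
    """Scale a waveform so that its RMS amplitude equals `target_rms`."""
    current = rms(samples)
    if current == 0:
        return list(samples)  # silence cannot be rescaled
    gain = target_rms / current
    return [s * gain for s in samples]
```

Under this sketch, a matched stimulus would be produced by passing two separately annotated copies of the same recording as `frame` and `donor`, so that matched and mismatched stimuli undergo the same cut-and-paste operation.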

Participants
Due to COVID-19, the 117 participants in this study were recruited on
Amazon’s Mechanical Turk rather than in the lab. Because of the design of the
spliced stimuli, more participants were required than in Experiment 1 to ensure that
an adequate number of participants heard each of the four versions of each stimulus.
Mechanical Turk participants were restricted to IP addresses in the United States and
were paid the equivalent of $10/hour for their participation. Twenty-seven additional
participants were run but were excluded because they learned English after age 6 (n =
2), or because they failed attention checks, indicating that they were not adequately
completing the task (n = 25). The average age of participants was thirty-eight (min
= 20, max = 71). Seventy-six participants self-identified as male, thirty-nine as female,
and one as nonbinary; one preferred not to report. Ninety-one participants self-
identified as white, seven as Black, five as Hispanic, four as Asian, eight as
mixed-race, and one as other; one preferred not to report.

Procedure
The procedure and instructions to participants were identical to those in Experiment
1, with only the presentation software and experimental stimuli differing. The task
was presented in a web browser using FindingFive (FindingFive Team, 2019), and
participants were asked to complete the task using headphones. As in Experiment 1,
each participant heard all ninety-six sentences, divided equally across the four
talkers, with sentences randomized per participant. For each sentence, half of the
participants heard an -ing and half heard an -in realization, and within each of
those realizations, half heard the version where frame and realization matched, and
half heard the mismatched version. In this way, an equal number of participants
heard all four versions of each sentence. This counterbalancing was done within
talker, such that listeners heard twelve -ing and twelve -in sentences from each talker.
Thus, as in Experiment 1, for every listener each talker’s overall rate of (ING) reali-
zation, and the rate of (ING) realization overall, was 50% -ing, 50% -in.
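
One way to achieve this kind of counterbalancing is a Latin-square rotation over four presentation lists. The sketch below is an assumption for illustration: the paper does not specify the assignment scheme, and the talker labels, list structure, and the additional balancing of match and mismatch within talker are all hypothetical.

```python
from collections import Counter
from itertools import product

TALKERS = ["T1", "T2", "T3", "T4"]   # hypothetical talker labels
SENTENCES_PER_TALKER = 24            # 96 sentences divided equally across 4 talkers
# The four versions of each sentence: realization crossed with frame congruence.
VERSIONS = list(product(["ing", "in"], ["match", "mismatch"]))


def build_list(list_id):
    """Return {(talker, sentence): (realization, congruence)} for one presentation list."""
    assignment = {}
    for talker in TALKERS:
        for sentence in range(SENTENCES_PER_TALKER):
            # Rotating by list_id gives each sentence a different version on each list.
            version = VERSIONS[(sentence + list_id) % len(VERSIONS)]
            assignment[(talker, sentence)] = version
    return assignment


presentation_lists = [build_list(k) for k in range(len(VERSIONS))]


def realization_counts(assignment, talker):
    """Count -ing and -in sentences heard from one talker on one list."""
    return Counter(real for (t, _), (real, _) in assignment.items() if t == talker)
```

With participants distributed evenly over the four lists, every version of every sentence is heard by the same number of listeners, and each list contains twelve -ing and twelve -in sentences per talker.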

Results

In total, 11,232 responses were collected (96 items × 117 participants). Following
Experiment 1, trials with log RTs more than 2.5 standard deviations above or below each
participant’s mean log RT were trimmed (n = 204). Table 4 provides numbers of observations and
means for factors of interest. Looking first at the raw data plotted in Figure 3 to con-
sider the effect of realization on accent ratings, the effect appears as expected from
prior work and Experiment 1: accent ratings were higher for stimuli with -in realiza-
tions (M = 4.02) than with -ing realizations (M = 2.76). Further, this pattern appears
to hold for stimuli where the frame and realization matched and also where
they mismatched. However, stylistic congruence between frame and realization
seems to amplify the effect of realization, with -in realizations from -in frames having
the highest accent ratings (M = 4.41), -ing realizations from -ing frames having the
lowest (M = 2.35), and mismatched stimuli falling in between (-ing realization/-in
frame M = 3.16, -in realization/-ing frame M = 3.63).
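
The trimming criterion described above can be sketched as follows. This is an illustration under stated assumptions rather than the analysis script itself: the trial format and key names are hypothetical, and the sample standard deviation is assumed (the text does not say which estimator was used).

```python
import math
import statistics
from collections import defaultdict


def trim_by_log_rt(trials, cutoff=2.5):
    """Keep trials whose log RT lies within `cutoff` SDs of that participant's mean log RT.

    `trials` is a list of dicts with the hypothetical keys "participant" and "rt" (rt > 0).
    """
    logs_by_participant = defaultdict(list)
    for trial in trials:
        logs_by_participant[trial["participant"]].append(math.log(trial["rt"]))

    kept = []
    for trial in trials:
        logs = logs_by_participant[trial["participant"]]
        if len(logs) < 2:
            kept.append(trial)  # an SD cannot be estimated from a single trial
            continue
        mean = statistics.mean(logs)
        sd = statistics.stdev(logs)  # sample SD; an assumption
        if sd == 0 or abs(math.log(trial["rt"]) - mean) <= cutoff * sd:
            kept.append(trial)
    return kept
```

Removing the 204 trimmed trials from 11,232 responses leaves 11,028 observations, which matches the totals within each factor in Table 4.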

Figure 4 displays grammatical category in interaction with realization. In stimuli
where frame and realization matched (left panel of Figure 4), the pattern appears sim-
ilar to Experiment 1: accent ratings of -in realizations (with -in frames) seem to vary
systematically by grammatical category (with -in realizations for noun-like categories
receiving higher accent ratings than verb-like categories), while accent ratings of -ing
realizations (with -ing frames) are less affected by grammatical category. However, in
stimuli where frames and realizations mismatched (right panel in Figure 4), the same
is not true; here, ratings of both -ing and -in realizations vary by grammatical category,
and systematicity is less apparent.

Turning to statistical modeling, the same model as in Experiment 1 was used, add-
ing in the new factor of match or mismatch between frame and realization (dummy
coded with match as the reference level). Because of the a priori interest in the effect
of frame and realization match on (ING) realization, grammatical category, and their
interaction, the three-way interaction and all two-way interactions of these factors
were included in the model. Results are presented in an analysis of deviance table
(Table 5) and the fixed effects model output (Table 6).

The most relevant findings are highlighted here, with full model results given in
the tables. First, as in Experiment 1, the main effect of realization affected accent
ratings (χ2 = 1710.12, p < .0001), with -in realizations given higher accent ratings than
-ing. Importantly, this main effect was moderated by realization’s significant interac-
tion with frame-realization match (χ2 = 689.16, p < .0001), grammatical category
(χ2 = 17.22, p = .004),3 and the three-way interaction among these factors (χ2 =
45.96, p < .0001). The significant interaction between realization and
frame-realization match confirms the pattern seen in Figure 3, showing that the
difference in accent ratings between -in and -ing realizations was larger for stimuli
with frame-realization match than for those with mismatch. Interestingly, there was not a sig-
nificant main effect of frame-realization match:4 it was not the case that mismatches
were rated as more accented than matches, regardless of realization. Rather,
frame-realization match moderated the effect of realization.

Table 4. Experiment 2 (Spliced stimuli) raw observations and mean ratings per factor

Factor                Level                 N   Mean
Frame.Realization     Match (ref)        5511   3.381
                      Mismatch           5517   3.396
Grammatical Category  Pronoun-3 (ref)     686   3.512
                      Adjective          1846   3.608
                      Noun               1033   3.421
                      Gerund             1259   3.524
                      Progressive        5516   3.321
                      Pronoun-2           688   2.917
Realization           IN (ref)           5520   4.018
                      ING                5508   2.757

Figure 3. Accent rating for Experiment 2 by realization and whether stimulus realization and frame
matched or mismatched.

The significant three-way interaction between frame-realization match, grammat-
ical category, and realization confirms the pattern evident in Figure 4: the difference
in accent ratings between realizations across grammatical categories was larger when
frames and realizations matched than when they mismatched. Listeners appeared to
use grammatical category information in assigning accent ratings more when frames
and realizations were stylistically congruent. The significant two-way interaction
between grammatical category and realization confirms this pattern for matched trials
(displayed in Table 6), though the only comparison that reached significance there
was the difference in realization between pronoun-3 and pronoun-2, the grammatical
categories with more extreme expectations from production. When frames and real-
izations mismatched, the realization by grammatical category interaction was
mitigated.

Figure 4. Accent ratings for Experiment 2 by grammatical category, realization, and match/mismatch
between frame and realization.

Table 5. Experiment 2 (Spliced stimuli) analysis of deviance

                                                            χ2  Df  Pr(>χ2)
Frame.Realization                                        0.041   1  .840
Grammatical Category                                     8.700   5  .122
Realization                                           1706.080   1  < 2e-16 ***
Frame.Realization:Grammatical Category                   7.721   5  .263
Frame.Realization:Realization                          689.157   1  < 2e-16 ***
Grammatical Category:Realization                        17.220   5  .005 **
Frame.Realization:Grammatical Category:Realization      45.956   5  9.27e-09 ***

Signif. codes: *** p < 0.001, ** p < 0.01, * p < 0.05, . p < 0.1

Table 6. Experiment 2 (Spliced stimuli) fixed effects

                                                    Estimate  Std. Error  Pr(>|t|)
(Intercept)                                             .462        .192  0.041 *
Frame.Realization Mismatch                             −.575        .076  3.38e-14 ***
Grammatical Category (Adjective)                        .093        .115  0.418
Grammatical Category (Noun)                            −.037        .126  0.768
Grammatical Category (Gerund)                           .002        .122  0.987
Grammatical Category (Progressive)                     −.048        .104  0.643
Grammatical Category (Pronoun-2)                       −.181        .140  0.199
Realization ING                                        −.962        .075  < 2e-16 ***
Frame.RealizMismatch:GramCatAdjective                   .257        .088  0.004 **
Frame.RealizMismatch:GramCatNoun                        .394        .097  4.80e-05 ***
Frame.RealizMismatch:GramCatGerund                      .195        .094  0.038 *
Frame.RealizMismatch:GramCatProgressive                 .229        .080  0.004 **
Frame.RealizMismatch:GramCatPronoun-2                   .271        .106  0.010 *
Frame.RealizMismatch:RealizING                         1.190        .106  < 2e-16 ***
GramCatAdjective:RealizING                             .0328        .087  0.705
GramCatNoun:RealizING                                   .125        .097  0.196
GramCatGerund:RealizING                                 .081        .094  0.387
GramCatProgressive:RealizING                            .081        .080  0.307
GramCatPronoun-2:RealizING                              .211        .107  0.045 *
Frame.RealizMismatch:GramCatAdjective:RealizING        −.427        .123  0.0005 ***
Frame.RealizMismatch:GramCatNoun:RealizING             −.822        .136  1.51e-09 ***
Frame.RealizMismatch:GramCatGerund:RealizING           −.452        .135  0.0008 ***
Frame.RealizMismatch:GramCatProgressive:RealizING      −.507        .113  6.54e-06 ***
Frame.RealizMismatch:GramCatPronoun-2:RealizING        −.745        .150  7.49e-07 ***

Signif. codes: *** p < 0.001, ** p < 0.01, * p < 0.05, . p < 0.1
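
Readers may wonder why the Frame.Realization Mismatch coefficient in Table 6 is significant while the analysis of deviance shows no main effect of frame-realization match. Under dummy coding, that coefficient is the simple effect of mismatch at the reference level of realization (-in), not an effect averaged over realizations. The sketch below illustrates the difference using only the four raw cell means reported in the Results text. It is an illustration, not a reanalysis: it works on raw rating means rather than the model's response scale, ignores grammatical category and random effects, assumes equal cell sizes, and uses hypothetical function names.

```python
# Mean accent ratings reported in the Results text, keyed by (congruence, realization).
cell_means = {
    ("match", "in"): 4.41,
    ("match", "ing"): 2.35,
    ("mismatch", "in"): 3.63,   # -in realization spliced into an -ing frame
    ("mismatch", "ing"): 3.16,  # -ing realization spliced into an -in frame
}


def dummy_coefficients(m):
    """Saturated 2x2 model with dummy coding; reference levels are match and -in."""
    intercept = m[("match", "in")]
    mismatch = m[("mismatch", "in")] - intercept  # simple effect at -in only
    ing = m[("match", "ing")] - intercept         # simple effect at match only
    interaction = (m[("mismatch", "ing")] - m[("mismatch", "in")]
                   - m[("match", "ing")] + intercept)
    return {"intercept": intercept, "mismatch": mismatch,
            "ing": ing, "interaction": interaction}


def sum_coefficients(m):
    """Same model with sum coding (mismatch = +1, match = -1; -ing = +1, -in = -1)."""
    grand = sum(m.values()) / 4
    mismatch = (m[("mismatch", "in")] + m[("mismatch", "ing")]) / 2 - grand
    ing = (m[("match", "ing")] + m[("mismatch", "ing")]) / 2 - grand
    interaction = (m[("mismatch", "ing")] - m[("mismatch", "in")]
                   - m[("match", "ing")] + m[("match", "in")]) / 4
    return {"intercept": grand, "mismatch": mismatch,
            "ing": ing, "interaction": interaction}


dummy = dummy_coefficients(cell_means)
summed = sum_coefficients(cell_means)
```

On these raw means the dummy-coded mismatch term is about −0.78, whereas the sum-coded term is about 0.0075, essentially zero. This mirrors the contrast between Table 6 and Table 5 and is the reason the notes confirm main effects in a sum-coded model.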

Discussion

In Experiment 2, with spliced stimuli, the expected effect of realization on accent
ratings was replicated, with -in realizations assigned higher accent ratings than -ing
realizations. The stylistic congruence between frame and realization moderated the
effect of realization, where congruent stimuli showed more extreme accent ratings
based on realization than did incongruent stimuli. In other words, incongruence did
not completely swamp the role of realization, but instead both (ING) realization and
the stylistic context in which the (ING) realization occurred affected listeners’ ratings.

The grammatical category by realization interaction was also conditioned by the
congruence between frame and realization. For stylistically incongruent stimuli,
there was a disruption of the role of grammatical category in affecting the difference
between accent ratings of -in and -ing realizations; in mismatched stimuli, listeners’
expectations about -in’s degree of markedness played less of a systematic role in their
accent ratings compared to the naturally produced stimuli in Experiment 1 and the
stylistically congruent (matched) stimuli in Experiment 2. This pattern suggests
that the mismatching cues in those stimuli had more of an impact on accent ratings
than did (ING)’s grammatical category information. Stylistic incongruence appears to
have overshadowed any potential usefulness of listeners’ knowledge of internal con-
ditioning information, though notably it did not overshadow the usefulness of (ING)
realization.

General discussion

The findings reported here support a role for both variable-internal and across-
variable patterns, and their interaction, in social signaling. Results confirm that listen-
ers have knowledge of conditioning constraints, supporting prior work on (ING) and
other variables (e.g., Bender, 2005; Vaughn & Kendall, 2018). Further, the grammat-
ical category conditioning of (ING), a constraint also used in linguistic processing (in
Vaughn & Kendall, 2018), is available for use in accent rating, a task that is more tied
to social evaluation. And, the stylistic frame in which the realization of (ING)
occurred also affected accent ratings: although listeners made use of (ING)’s realiza-
tion when assigning accent ratings in both naturally produced and spliced stimuli, the
effect of realization was diminished for stylistically incongruent stimuli. Finally, the
grammatical constraints governing (ING) were not used by listeners in assigning rat-
ings when the stimuli were stylistically incongruent. The following sections discuss
some implications of these findings.

Role of markedness/surprisal in social signaling

One interpretation of the finding that listeners use probabilistic information about
(ING)’s internal constraints in the accent rating task is that a form’s markedness,
or surprisal, in a given context is a part of what predicts the strength of its social sig-
naling (Bender, 2007; Jaeger & Weatherholtz, 2016; Rácz, 2013). That is, the deploy-
ment of a marked variant in an unexpected context amplifies its social effect.

Speakers’ grasp of the constraints on sociolinguistic variation is likely part of what
allows them to construct styles in inventive ways in the first place; being able to antic-
ipate their interlocutor’s degree of surprise at the deployment of a particular form—in
concert with other forms—is part of the process of identity construction in language.
Speakers’ knowledge about the linguistic and social factors that condition a form’s use
in a particular context lets them predict whether their use of a form may amplify, or
even block or subvert, an expected social meaning for a listener. It may be that the
same expectation-based mechanism that produces the surprisal effect in this study
also results in “blocking” or “indexical bulletproofing” in cases where a form occurs
in a rare context (see, for example, Campbell-Kibler, 2011; Levon, 2007, 2014; Pharao
et al., 2014). Going forward, the separate literatures documenting markedness effects
on social signaling from social conditioning (e.g., Campbell-Kibler, 2011; Levon,
2007, 2014; Pharao et al., 2014; Stecker, 2020) and from linguistic conditioning (e.g.,
Bender, 2005; Podesva et al., 2015) would benefit from being considered together, con-
tributing to a larger account of the ways that markedness can drive social indexing.

Converging evidence from other areas points to a role of markedness in social pro-
cessing. Results from recent artificial language learning experiments suggest that soci-
olinguistic variants that are more unexpected given particular social contexts are more
learnable (Lai, Rácz, & Roberts, 2020), as are variants that are associated with more
salient social cues (Rácz, Hay, & Pierrehumbert, 2017; Sneller & Roberts, 2018). The
impact of a form’s typicality in context is also predicted by theories of spoken word
processing that assign different attentional weights to forms with different degrees of
social significance (Sumner, Kim, King, & McGowan, 2014).5

There are many more open questions regarding how markedness/surprisal affects
social perception; two examples are discussed here. First, the extent to which a par-
ticular variable-internal pattern is made use of by listeners likely depends on the var-
iable and the patterning of that variable within and across communities. For example,
in this study of variable (ING), listeners from across demographic categories and
dialect boundaries participated, since the grammatical conditioning of (ING) has
been shown to be stable across a variety of Englishes and does not often appear to
interact with external constraints. Thus, it was expected that a variety of listeners
would be aware of and able to use the markedness of (ING)’s internal constraints
for social judgments. However, understanding how listeners build up representations
about the markedness of internal constraints upon encountering changing or unfa-
miliar variants or speech communities is an important open question (see Bender,
2005; Levon & Fox, 2014). For example, the extent to which an internal constraint
is informative about social factors is likely to modulate this effect (e.g., Villareal,
Clark, Hay, & Watson, 2021; Wolfram, Childs, & Torbert, 2000). Second, there are
open questions regarding the level at which within-variable patterns are represented
in the mind. For example, does the grammatical category in which the variant is used
best explain how markedness patterns are stored, or is there word-specific informa-
tion that could account for this knowledge as well (see Vaughn, 2022)? There is
ample room for future work to explore these issues.

Processing across tasks and types of stimuli

Although this study demonstrates that listeners can use grammatical category knowl-
edge about the likelihood of (ING) realizations in their accent ratings of a sentence, it
also demonstrates that listeners do not always use such knowledge. Judgments based
on spliced stimuli did not reflect this grammatical knowledge (taking together match-
ing and mismatching stimuli). Interestingly, this pattern is the opposite of what
Vaughn and Kendall (2018) found with the same stimuli in a different task, where
listeners’ judgments about which realization of (ING) they heard reflected (ING)’s
grammatical constraints only for spliced stimuli. Vaughn and Kendall reasoned that
the incongruence of the mismatches in the spliced experiment put listeners in a
mode of processing where they could rely less on the bottom-up information in the
signal to accurately cue the realization of (ING). With spliced stimuli, listeners relied
more on top-down expectations about grammatical category to accomplish that task.

These seemingly opposing findings across studies are readily explained by consid-
ering the nature of the tasks. The variant classification task asks listeners to focus spe-
cifically on the realization of the (ING) variable itself. In that case, mismatches
between frame and realization impede listeners’ ability to do this task, leading
them to use other information at their disposal: their knowledge about (ING)’s gram-
matical category conditioning. In contrast, the accent rating task involves holistic
judgments of the stimulus, where the realization of (ING) is only one component
among the many linguistic features across the sentence. In this task, given that
other features (e.g., /aɪ/-glide length, /t/-release, and position and duration of certain
vowels) covary in stylistically congruent ways with the realization of (ING) in these
stimuli (Vaughn & Kendall, 2019), it is not surprising that the rest of the stimulus
contains information that affects accent ratings. Here, the congruence between con-
stellations of cues matters more than the constraints governing just one variable.

The pattern of results across these two studies indicates that the type of stimuli
heard affects listeners’ use of variable-internal probabilistic conditioning expectations
in ways appropriate to their attentional goals. Mental representations of probabilistic
conditioning information are apparently available to listeners for use in both linguis-
tic and social processing, and the processing system calls on those representations as
needed for the task at hand. This suggests that modular, separate constructs like the
sociolinguistic monitor, thought to track and store information about sociolinguistic
variation for use in social evaluation (Labov et al., 2011), may not be necessary (see
also Campbell-Kibler, 2021). Listeners show sensitivity to probabilistic information
about what conditions (ING) variation in both social and linguistic tasks, albeit in
the manner most appropriate to the task.

Splicing methodology and the importance of stylistic congruence

The results of this study have significant methodological implications. Much recent
work has employed the splicing procedure because of a well-motivated interest in bet-
ter understanding the impact of individual variables. However, these findings present
mounting evidence that this methodological decision has consequences: listeners’
behavior changes across stylistically congruent versus incongruent stimuli.

Since naturally produced stimuli are, of course, the sort that listeners encounter in
actual speech, the current findings suggest that the grammatical category of -in
may be readily available for listeners to use in making judgments about speakers in
everyday settings, a noteworthy finding. However, given that the splicing procedure
is standard practice in the field, it is worth better understanding why the same findings
did not hold for spliced stimuli. It may be that the artifacts introduced in the process of
splicing were overly disruptive. Or it may be that the presentation of the stylistically
incongruent stimuli (randomized together with the stylistically congruent stimuli)
meant that listeners could not consistently rely on the congruence between all the
bottom-up information in the stimuli and thus allocated processing resources differ-
ently than had all realizations been matched to their frame. Prior work lends support
to the latter explanation. For example, listeners have been shown to put less weight on a
particular acoustic cue, even one that is relevant for the current stimulus, when they are
in a context in which the cue is less reliable (McQueen & Huettig, 2012).

Listeners’ sensitivity to subtle phonetic incongruence has been well-documented
in other domains (see Sumner et al., 2014 for a review). Sumner (2013) found that
two variants of words with medial /nt/ sequences (the more careful splin[t]er and
more casual splin[_]er for the word “splinter”) could prime a semantically related
word (e.g., “wood”), but only when the variants were congruent with the overall
style of the word, not when the more casually articulated n[_] variant was housed
in a more carefully articulated frame.

Uncovering the precise features that contributed to the effects of stylistic incongru-
ence in these stimuli, and accounting for the exact pattern of results in the mis-
matched stimuli, are clear next steps. There are many possible sources of these
patterns, from speakers’ productions of particular items, to commonalities in produc-
tions across grammatical categories or speakers, to listeners’ expectations about these
productions, to intersections of all of these factors. This paper was not designed to
test these possibilities, and as such there are simply not enough stimuli to tease
apart patterns in a systematic way. Although Vaughn and Kendall (2019) established
that there are systematic acoustic differences between -in and -ing sentences for the
linguistic features measured there, it is not possible to test differences on a
by-grammatical category basis: those features do not occur equally, or sometimes
even occur at all, across all grammatical categories. Future work explicitly designed
to systematically vary potential covarying cues could make inroads toward under-
standing the dynamics of ratings made to stylistically mismatched stimuli. For
now, this paper establishes that there is reason to expect that listeners are indeed sen-
sitive to the stylistic incongruence induced by splicing.

Going forward, what might this documented role of stylistic (in)congruence mean
for the use of spliced stimuli in matched guise tasks? First, the impact of realization of
(ING) appears robust to the stylistic congruence or incongruence between frame and
realization. In fact, the present findings (see Figure 3 in particular) suggest that the
impact of realization may be underestimated in prior studies using only one frame
(as often is the case in sociolinguistic studies using stimuli excised from sociolinguis-
tic interviews, where only one version of the sentences is available), since the cues
from the original frame appear to counteract the effect of realization. Of course,
the potential for congruent cues to amplify the social impact of the realization,
which is what was observed in Experiment 2, is part of the reason why the splicing
procedure is conducted in the first place. But as we observe here, the other side of that
coin is that splicing in cues from an incongruent frame dampens the effect of reali-
zation. This finding underscores the idea that listeners are using more than just the
realization of a single variant to make social evaluations.

Although this paper’s goal was to understand the effect of congruence between
frame and realization (as shown in Figure 4), the same data can be examined accord-
ing to the stimuli’s original frame, as shown below in Figure 5. Doing so can provide
more concrete guidance for future work, as it emulates a methodological decision
often made when using spliced stimuli: splicing realizations into only one frame.
What is immediately evident from Figure 5 is that if only one frame had been
used, which frame was selected would have made a difference. First, the -in frame
produced higher accent ratings than the -ing frame across realizations.
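
The frame effect can be checked informally by regrouping the four cell means reported in the Results by original frame rather than by congruence. The sketch below does only that arithmetic; it collapses over grammatical category, assumes roughly equal cell sizes, and the dictionary layout and function name are hypothetical.

```python
# Mean accent ratings from the Results text, keyed by (frame, realization).
# A matched stimulus has the same frame and realization; a mismatched one does not.
means_by_frame = {
    ("in", "in"): 4.41,    # -in realization in its own -in frame (match)
    ("in", "ing"): 3.16,   # -ing realization spliced into an -in frame (mismatch)
    ("ing", "ing"): 2.35,  # -ing realization in its own -ing frame (match)
    ("ing", "in"): 3.63,   # -in realization spliced into an -ing frame (mismatch)
}


def frame_advantage(means, realization):
    """Rating in the -in frame minus rating in the -ing frame, for one realization."""
    return means[("in", realization)] - means[("ing", realization)]


advantage_for_in = frame_advantage(means_by_frame, "in")    # 4.41 - 3.63
advantage_for_ing = frame_advantage(means_by_frame, "ing")  # 3.16 - 2.35
```

Both differences are positive and of similar size (roughly 0.8 rating points), consistent with the -in frame raising accent ratings for either realization.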

Further, there are differences across frames in terms of how grammatical category
was reflected in accent ratings. For example, stimuli created by splicing both realiza-
tions into the -in frame (Figure 5, left panel) appear to show traces of grammatical
category effects even for -ing realizations. This is surprising given that -ing is licensed
across grammatical categories. However, rather than undermining the current results,
this pattern lends further support to the power of the covarying cues in the frame.
That is, speakers’ original productions of covarying cues may have been affected by
(ING)’s grammatical category. Speculatively, it may be that in aiming to produce
-in in a context where it is less licensed grammatically (noun-like forms), speakers
heightened their use of other stylistically covarying features in an effort to make
the production of -in feel less unnatural. In contrast, the pattern of results for stimuli
created by splicing both realizations into the -ing frame (Figure 5, right panel)
appears more similar to the findings from the natural or matched stimuli, where listeners’
expectations about the markedness of -in across grammatical categories drive
the pattern. This general pattern suggests that in interpreting listeners’ social evaluations
of spliced stimuli, at least for (ING) in read speech, it is important to acknowledge
the role of internal constraints not just for listeners, but also for stimulus speakers.

One concrete methodological takeaway, then, is that if choosing to splice into only one
frame, it is more advisable to use the same frame consistently across all stimuli rather
than to select the frame on a stimulus-by-stimulus basis (i.e., for read sentences, selecting
which frame sounds “more natural” for each sentence, or, for stimuli excised from
interviews, selecting the frame that happens to be present). With only one frame, the
potential effects of that frame can at least be taken into account when interpreting results.

Figure 5. Accent ratings for Experiment 2 by grammatical category, realization, and frame.

In sum, these observations suggest that sociolinguistic experimentalists carefully
consider several factors when determining the best methodology for a study. The
research question and dependent and independent variables of interest should
shape decisions about the type of stimuli to use. For example, in a given study, is
it more important to allow other variables to naturally covary with the variable of
interest, or to artificially create instances of stylistic incongruence via splicing?
And, if splicing is the preferred method, is it more justified to use naturally produced
stimuli (i.e., from a sociolinguistic interview), or to produce new stimuli (see also
Tamminga, 2017)? Moreover, if using splicing, should both frames be used in
order to observe the effects of realization across both stylistically congruent and
incongruent stimuli, or can one frame be selected consistently and factored into
how the results are interpreted? Triangulating between several carefully designed
tasks and types of stimuli can also be useful.

Conclusion

This paper investigated how two patterns of variation, internal constraints and stylistic
congruence, affect how the realization of variable (ING) is socially interpreted by
listeners. Results indicate that listeners were sensitive to (ING)’s internal constraints
in an accent rating task, but those constraints were only used when the natural sty-
listic congruence between (ING)’s realization and other cues was not disrupted by
using spliced stimuli. Congruence between the (ING) realization and surrounding
forms likely enabled listeners to attend to (ING)’s internal constraints. These findings
provide converging evidence for two central concepts in variationist sociolinguistic
work—a variable’s internal constraints and the stylistic covariation among variables
—from a methodology other than production. Here, (ING)’s internal grammatical
constraints are an example of a pattern that occurs within a variable, and stylistic con-
gruence was operationalized to examine contributions of patterns across variables.
Future work could usefully examine how much listeners use other within-variable
and across-variable patterns, and their interactions, in making social evaluations.

Notes
1. Dummy coding was used here for maximum comparability to Experiment 2, where it is most useful for
ease of interpretation.
2. Dummy coding, though desirable for interpreting the interactions of interest, does complicate the
interpretation of main effects. To ensure proper interpretation of all main effects, confirmation is presented
from a separate model, identical except that all fixed effects involved in interactions were sum coded
rather than dummy coded. Confirming the results with sum coded factors ensures that the main effects are
generalizable to contexts beyond the reference levels of the factors involved in the interactions. The main
effect of realization here was also significant in a sum coded statistical model of the same data.
3. The main effect of realization, and the two significant two-way interactions, were also significant in a
model where the three fixed effects were sum coded rather than dummy coded.
4. The main effect of frame-realization match was also not significant in the sum coded model.
5. Still other areas, such as recent work in sociopragmatics, also explore markedness’ effect on social
signaling (e.g., Acton & Potts, 2014; Beltrama & Staum Casasanto, 2021). Because in that literature
markedness tends not to be derived from frequency-based expectations but rather from semantic properties or
lexical characteristics, it is not discussed here, yet a complete account of how markedness affects social
processing must incorporate such data.
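
The difference between the two coding schemes described in Note 2 can be illustrated with a
minimal sketch. It assumes a balanced, saturated 2 x 2 design (realization x frame) with no
random effects, in which case the regression coefficients can be read directly off the cell
means. The models reported in this paper are mixed-effects models, so the sketch illustrates
the coding logic only. The cell means are invented for illustration and are not the study's
data; the function names and the -1/+1 sign convention for sum coding are likewise assumptions.

```python
# Hypothetical cell means for a 2 x 2 design (invented numbers, NOT the paper's data).
# Keys are (realization, frame); 0 = reference level, 1 = the other level.
cell_means = {(0, 0): 3.0, (1, 0): 4.0, (0, 1): 3.5, (1, 1): 3.5}


def dummy_coded(m):
    """Coefficients of the saturated model under dummy (treatment) coding."""
    return {
        "intercept": m[(0, 0)],                    # mean of the reference cell
        "realization": m[(1, 0)] - m[(0, 0)],      # effect at the reference frame only
        "frame": m[(0, 1)] - m[(0, 0)],            # effect at the reference realization only
        "interaction": m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)],
    }


def sum_coded(m):
    """Coefficients under sum coding, assuming level 0 = -1 and level 1 = +1."""
    return {
        "intercept": sum(m.values()) / 4,          # grand mean of the four cells
        "realization": (m[(1, 0)] + m[(1, 1)] - m[(0, 0)] - m[(0, 1)]) / 4,
        "frame": (m[(0, 1)] + m[(1, 1)] - m[(0, 0)] - m[(1, 0)]) / 4,
        "interaction": (m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]) / 4,
    }


print("dummy:", dummy_coded(cell_means))
print("sum:  ", sum_coded(cell_means))
```

With these invented means, the dummy-coded realization coefficient (1.0) is the effect of
realization in the reference frame only, whereas the sum-coded coefficient (0.25, half of the
0.5 difference averaged over both frames) reflects the effect across frames. This is why a
sum-coded model serves as a check on main effects when interactions are present.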

Acknowledgments. This project grew out of my earlier collaboration with Tyler Kendall on the
perception of (ING) variation, and I appreciate his continued insights and conversations on this topic.
Thanks also to Abby Walker and Meredith Tamminga for useful discussions.

Competing interests. The author declares none.

References
Acton, Eric K., & Potts, Christopher. (2014). That straight talk: Sarah Palin and the sociolinguistics of
demonstratives. Journal of Sociolinguistics 18(1):3–31.

Austen, Martha. (2020). The Role of Listener Experience in Perception of Conditioned Dialect Variation.
Unpublished doctoral dissertation. The Ohio State University.

Austen, Martha, & Campbell-Kibler, Kathryn. (2022). Real-time speaker evaluation: How useful is it, and
what does it measure? Language 98(2):e108-130.

Beltrama, Andrea, & Casasanto, Laura S. (2021). The social meaning of semantic properties. In L. Hall-Lew,
E. Moore, & R. Podesva (Eds.), Social meaning and linguistic variation: Theorizing the third wave.
Cambridge: Cambridge University Press.

Bender, Emily M. (2005). On the boundaries of linguistic competence: Matched-guise experiments as
evidence of knowledge of grammar. Lingua 115(11):1579-98.

Bender, Emily M. (2007). Socially meaningful syntactic variation in sign-based grammar. English Language
and Linguistics 11(2):347-81.

Boersma, Paul, & Weenink, David. (2019). Praat: Doing phonetics by computer [Computer program].
Version 6.0.56. http://www.praat.org/.

Campbell-Kibler, Kathryn. (2007). Accent, (ING), and the social logic of listener perceptions. American
Speech 82:32-64.

Campbell-Kibler, Kathryn. (2009). The nature of sociolinguistic perception. Language Variation and
Change 21:135-56.

Campbell-Kibler, Kathryn. (2011). Intersecting variables and perceived sexual orientation in men.
American Speech 86:52-68.

Campbell-Kibler, Kathryn. (2012). The implicit association test and sociolinguistic meaning. Lingua
122:753-63.

Campbell-Kibler, Kathryn. (2021). Deliberative control in audiovisual sociolinguistic perception. Journal of
Sociolinguistics 25(2):253-71.

Drager, Katie. (2010). Sensitivity to grammatical and sociophonetic variability in perception. Laboratory
Phonology 1(1):93-120.

Eckert, Penelope. (2002). Constructing meaning in sociolinguistic variation. Paper presented at the Annual
Meeting of the American Anthropological Association, November 20-24, New Orleans, Louisiana.

Eckert, Penelope. (2012). Three waves of variation study: the emergence of meaning in the study of
sociolinguistic variation. Annual Review of Anthropology 41:87-100.

FindingFive Team. (2019). FindingFive: A web platform for creating, running, and managing your studies in
one place. FindingFive Corporation (nonprofit), New Jersey, USA. https://www.findingfive.com

Fox, John, & Weisberg, Sanford. (2011). An R companion to applied regression (2nd ed.). Thousand Oaks,
CA: Sage.

Freitag, Raquel Meister Ko. (2020). Effects of the linguistics processing: palatals in Brazilian Portuguese and
the sociolinguistic monitor. University of Pennsylvania Working Papers in Linguistics 25(2):4.

Hazen, Kirk. (2008). (ING): A vernacular baseline for English in Appalachia. American Speech 83:116-40.

Jaeger, T. Florian, & Weatherholtz, Kodi. (2016). What the heck is salience? How predictive language
processing contributes to sociolinguistic perception. Frontiers in Psychology 7:1115.

Johnstone, Barbara. (2016). Enregisterment: How linguistic items become linked with ways of speaking.
Language and Linguistics Compass 10(11):632-643.

Kuznetsova, Alexandra, Brockhoff, Per B., & Christensen, Rune H. B. (2014). LmerTest: Tests for random
and fixed effects for linear mixed effect models. R package, version 2.0-3.


Labov, William. (2003). “Re-defining ‘the same’ as a mechanism of linguistic change.” Forum Lecture,
Linguistic Society of America Institute. East Lansing: Michigan State University, August 7.

Labov, William, Ash, Sharon, Ravindranath, Maya, Weldon, Tracey, Baranowski, Maciej, & Nagy, Naomi.
(2011). Properties of the sociolinguistic monitor. Journal of Sociolinguistics 15(4):431-63.

Lai, Wei, Rácz, Péter, & Roberts, Gareth. (2020). Experience with a linguistic variant affects the acquisition
of its sociolinguistic meaning: An alien-language-learning experiment. Cognitive Science 44(4):e12832.

Levon, Erez. (2007). Sexuality in Context: Variation and the Sociolinguistic Perception of Identity.
Language in Society 36(4):533-54.

Levon, Erez. (2014). Categories, stereotypes and the linguistic perception of sexuality. Language in Society
43(5):539-66.

Levon, Erez, & Buchstaller, Isabelle. (2015). Perception, cognition, and linguistic structure: The effect of
linguistic modularity and cognitive style on sociolinguistic processing. Language Variation and
Change 27:319-48.

Levon, Erez, & Fox, Sue. (2014). Social salience and the sociolinguistic monitor: A case study of ing and
th-fronting in Britain. Journal of English Linguistics 42(3):185-217.

MacKenzie, Laurel. (2019). Perturbing the community grammar: Individual differences and
community-level constraints on sociolinguistic variation. Glossa: A journal of general linguistics 4(1).

McQueen, James M., & Huettig, Falk. (2012). Changing only the probability that spoken words will be
distorted changes how they are recognized. The Journal of the Acoustical Society of America 131(1):509–17.

Meyerhoff, Miriam, & Walker, J. (2007). The persistence of variation in individual grammars: Copula
absence in urban sojourners and their stay-at-home peers, Bequia (St. Vincent and the Grenadines).
Journal of Sociolinguistics 11(3):346-66.

Montgomery, Chris, & Moore, Emma. (2018). Evaluating S(c)illy voices: The effects of salience, stereotypes,
and co-present language variables on real-time reactions to regional speech. Language 94(3):629-61.

Munro, Murray J., & Derwing, Tracey M. (1995). Foreign accent, comprehensibility, and intelligibility in the
speech of second language learners. Language Learning 45(1):73-97.

Ochs, Elinor. (1992). Indexing gender. In A. Duranti and C. Goodwin (Eds.), Rethinking context: language
as an interactive phenomenon. Cambridge: Cambridge University Press. 335-58.

Peirce, Jonathan W. (2007). PsychoPy—Psychophysics software in Python. Journal of Neuroscience Methods
162(1-2):8-13.

Pharao, Nicolai, Maegaard, Marie, Møller, Janus Spindler, & Kristiansen, Tore. (2014). Indexical Meanings
of [s+] Among Copenhagen Youth: Social Perception of a Phonetic Variant in Different Prosodic
Contexts. Language in Society 43(1):1-31.

Podesva, Robert J. (2008). Three sources of stylistic meaning. Texas Linguistic Forum 51:134-43.

Podesva, Robert J., Reynolds, Jermay, Callier, Patrick, & Baptiste, Jessica. (2015). Constraints on the social
meaning of released /t/: A production and perception study of U.S. politicians. Language Variation and
Change 27:59-87.

Preston, Dennis R. (2011). The power of language regard-discrimination, classification, comprehension,
and production. Dialectologia: Revista electrònica, 9-33.

Rácz, Péter. (2013). Salience in sociolinguistics: A quantitative approach (Vol. 84). Berlin: Walter de Gruyter
Mouton.

Rácz, Péter, Hay, Jennifer B., & Pierrehumbert, Janet B. (2017). Social salience discriminates learnability of
contextual cues in an artificial language. Frontiers in Psychology 8(51).

Sneller, Betsy, & Roberts, Gareth. (2018). Why some behaviors spread while others don’t: A laboratory
simulation of dialect contact. Cognition 170:298-311.

Stecker, Amelia. (2020). Investigations of the sociolinguistic monitor and perceived gender identity.
University of Pennsylvania Working Papers in Linguistics 26(2):14.

Sumner, Meghan. (2013). A phonetic explanation of pronunciation variant effects. The Journal of the
Acoustical Society of America 134(1):EL26-EL32.

Sumner, Meghan, Kim, Seung Kyung, King, Ed, & McGowan, Kevin B. (2014). The socially weighted
encoding of spoken words: A dual-route approach to speech perception. Frontiers in Psychology 4:1-13.

Tagliamonte, Sali. (2013). Comparative sociolinguistics. In J. Chambers and N. Schilling (Eds.), The
Handbook of Language Variation and Change, 128–56.

Tamminga, Meredith. (2017). Matched guise effects can be robust to speech style. The Journal of the
Acoustical Society of America 142(1):EL18-EL23.


Vaughn, Charlotte. (2022). What carries greater social weight, a linguistic variant’s local use or its typical
use? Paper in the Proceedings of the 44th Annual Meeting of the Cognitive Science Society.

Vaughn, Charlotte, & Kendall, Tyler. (2018). Listener sensitivity to probabilistic conditioning of
sociolinguistic variables: The case of (ING). Journal of Memory and Language 103:58-73.

Vaughn, Charlotte, & Kendall, Tyler. (2019). Stylistically coherent variants: Cognitive representation of
social meaning. Revista de Estudos da Linguagem 27(4):1787-830.

Villarreal, Dan, Clark, Lynn, Hay, Jennifer, & Watson, Kevin. (2021). Gender separation and the speech
community: Rhoticity in early 20th century Southland New Zealand English. Language Variation and
Change 33:1-22.

Wade, Lacey. (2022). Experimental evidence for expectation-driven linguistic convergence.
Language 98(1):63-97.

Wagner, Suzanne Evans, & Hesson, Ashley. (2014). Individual sensitivity to the frequency of socially
meaningful linguistic cues affects language attitudes. Journal of Language and Social Psychology 33(6):651-666.

Watson, Kevin, & Clark, Lynn. (2013). How salient is the NURSE∼SQUARE merger? English Language
and Linguistics 17(2):297.

Watson, Kevin, & Clark, Lynn. (2015). Exploring listeners’ real-time reactions to regional accents. Language
Awareness 24(1):38-59.

Wolfram, Walt, Childs, Becky, & Torbert, Benjamin. (2000). Tracing English dialect history through
consonant cluster reduction: Comparative evidence from isolated dialects. Journal of Southern Linguistics
24:17-40.

Zhang, Qing. (2005). A Chinese Yuppie in Beijing: Phonological Variation and the Construction of a New
Professional Identity. Language in Society 34:431-66.

Cite this article: Vaughn C (2022). The role of internal constraints and stylistic congruence on a variant’s
social impact. Language Variation and Change 34, 331–354. https://doi.org/10.1017/S0954394522000175
