This chapter includes a review of the literature on which the research questions are based.
First, it discusses the sources of foreign accents and the problematic nature of pursuing
nativeness standards in L2 pronunciation pedagogy. This is then followed by a review of relevant
literature that establishes the partial independence of the constructs of intelligibility,
comprehensibility and accentedness, and shows that comprehensibility is the better predictor of
intelligibility compared to foreign-accentedness. Then, the segmental factors influencing speech
comprehensibility and accentedness ratings are discussed. While there is a significant body of
literature investigating the multitude of factors affecting speech comprehensibility and
accentedness, the review is restricted to the empirical studies that a) investigate the contribution
of segments (consonants and vowels); b) establish the predictive power of Functional Load in
segmental importance; c) uncover the correlation between Functional Load and Consonant Age
of Acquisition in Arabic; and d) provide data on Consonant Age of Acquisition in Arabic.
2.1. L2 Speech Comprehensibility and Accentedness
2.1.1. Accentedness
Accent is one of the most salient aspects of human beings, besides physical appearance.
Listeners are able to recognize a dissimilar or unfamiliar accent within a matter of milliseconds
(Flege, 1984). What is more, listeners can sometimes detect foreign accents in languages they are
not proficient in (Major, 2007). Despite the common colloquial use of the word, everyone has an
accent: the term refers to the systematic patterns of sound that someone’s speech exhibits.
However, accents have not been treated equally in the social realm: some have been historically
afforded higher status and this has been reflected in foreign language teaching, too (e.g., through
the privileging of inner circle English varieties for teaching English). While foreign accents can
sometimes elicit negative reactions from listeners, it is now widely accepted that accent
reduction and elimination are not the right approach in L2 pedagogy, which favors the
intelligibility principle in pronunciation instruction (Levis, 2005, 2020). The following
paragraphs lay out the rationale for this shift in priorities and principles from the standpoint of
foreign accent.
Historical approaches in linguistics and L2 pedagogy tended to treat foreign accents as
something to eventually get rid of (Murphy & Baker, 2015). These approaches were imbued with
native speakerism that holds the purported native speaker as the norm and the goal to strive for in
language learning at every level of linguistic structure (Holliday, 2006). This ideology was
exemplified to an extreme level by the Audiolingual Method, which set out to eradicate
deviations from native phonological systems through incessant and repetitive drilling of speech
patterns (Baker, 2017). While calls occasionally sprang up to invoke intelligibility as a principle
for pronunciation, these did not gain enough currency to become dominant (Murphy & Baker,
2015). With the advent of communicative language teaching, the explicit treatment of
pronunciation took a backseat, as evidence for the futility of pursuing nativelike accents emerged
(Levis & Sonsaat, 2017). It was also thought that pronunciation was merely a function of
proficiency and did not need to be addressed in a directed manner. This neglect is evidenced in
the dearth of empirical research during what Murphy and Baker (2015) termed the third wave of
pronunciation pedagogy. This shift from a nativeness standard to no explicit treatment of
pronunciation is described by Levis and Sonsaat (2017) as a move from accuracy to fluency in
instructional priorities.
The existence of foreign accents could be considered a manifestation of the exceptional
difficulty learners face in L2 phonological production. The sources and causes of difficulty have
been conceptualized differently by various theoretical approaches (Archibald, 2021). Contrastive
Analysis predicted positive transfer of L1 phonological features that are similar to the L2 ones
and negative transfer of dissimilar features, making dissimilar phonological features of L2 more
difficult to acquire (Archibald, 2017). Another approach has posited markedness as a source of
difficulty in L2 phonological production (Eckman, 2008), in that more marked features are
harder to acquire (e.g., consonant clusters). What is largely shared between these approaches is
the assumption of fundamental difference in adult L2 phonological acquisition, which posits the
existence of a critical period for native-like acquisition of accent. In addition, this fundamental
difference is amplified in the case of phonology, compared to the acquisition of lexicon and
morphosyntax, making nativelike attainment of L2 phonology an unrealistic standard (Caldwell-Harris & MacWhinney, 2023).
Not only is sounding nativelike an unrealistic goal, but it is also an unfair and unjust
standard. Accents are powerful social markers of identity, and irrespective of difficulty, speakers
might not wish to sound nativelike but want to retain their non-native accents in an attempt to
signal their belonging to a certain group and to assert their identities. Adults are thought to have
an established sense of identity that is tied to their native languages. Moyer (2013) lists some
examples of reasons for L2 speakers wanting to project a non-native identity: maintaining an
interesting personality, wanting to showcase that acquiring the L2 took hard work, and wanting
to fit in with other L2 learner peers. These examples show that at times, regardless of ability,
conscious choice plays a role in the manifestation of foreign accent and that this choice can be
driven by social factors and personal preferences that need to be respected.
Foreign accents are but one example of non-standard language use and as such, are
subject to dominant language ideologies (i.e., native speakerism) that stigmatize non-standard
varieties, whether native or non-native (Gluszek & Dovidio, 2010; Moyer, 2013). This has a host
of ramifications including linguistic prejudice and even discrimination based on speech
characteristics (Baugh, 2017). For example, callers judged as sounding black could be denied
housing opportunities in the United States (Purnell et al., 1999). In a similar vein, Americans
have been found to judge foreign accents lower on dimensions of status and solidarity compared
to native ones (Dragojevic & Goatley-Soan, 2022), with an additional hierarchy between the
non-native accents. Similar hierarchical attitudes to perceived non-standard speech
characteristics have been observed in the case of Arabic (Gwasmeh, 2021). Nevertheless, as
Munro and Derwing (2020) point out, L2 learners should not bear the burden of mitigating
negative listener attitudes: the onus is on listeners to adjust to foreign-accented speech and
training seems to be effective in this regard (Derwing et al., 2002).
Despite the overwhelming body of evidence and arguments against imposing native-like
accents on L2 learners, it is still beneficial to measure the degree of perceived foreign
accentedness in speech perception studies involving L2 speakers. Such measures can potentially
tap into language attitudes and give us information about the features of speech that native
listeners associate with sounding foreign. This information could be used to train native speakers
to listen to non-native speech, similarly to what Derwing et al. (2002) carried out. Still, there is
little use in measuring foreign accentedness in isolation, and it is best measured in connection
with comprehensibility (and intelligibility), which will be discussed in the following section.
2.1.2. Comprehensibility
Comprehensibility can be thought of as the lowest common denominator in intelligible
speech: highly comprehensible speech is likely to also be highly intelligible, making
comprehensibility a practical and convenient measure in research even in the absence of direct
measurements of intelligibility. However, in certain cases, intelligible speech can still receive
low comprehensibility ratings, which has ramifications for the success of the interaction. For this
reason, the intelligibility principle in L2 pronunciation teaching states that students should target
comfortably intelligible pronunciation, which entails high comprehensibility and high
intelligibility (Levis, 2005, 2020). The overwhelming majority of studies that have investigated
the correlation between foreign-accentedness and comprehensibility in numerous languages
found only a moderate level of correlation, and this partial separation between the two constructs
has been confirmed even at the level of individual words (Uchihara, 2022). The superiority of
comprehensibility over foreign-accentedness in predicting intelligibility has been replicated in
L2 Arabic, as well (Ali, 2023). The following sections discuss the historical, theoretical and
methodological underpinnings of comprehensibility as a research construct in L2 pronunciation.
The study of speech intelligibility originates from the field of telecommunications, where
researchers were originally interested in sound clarity over telephone calls (Weismer, 2008). In
addition, speech pathology research also has a long-established history of studying the
intelligibility of disordered speech. A strand of speech pathology research also investigated the
articulatory features associated with the loss of speech intelligibility, a similar, but not identical
approach to the one taken in SLA (which has focused on individual phonemes and
suprasegmental features). As for the field of SLA, intelligibility and comprehensibility tended to
be used interchangeably before Munro and Derwing’s (1995a) seminal study laid down the
theoretical foundation for the separation of accentedness, intelligibility and comprehensibility.
Since then, pronunciation research and teaching have entered what Murphy and Baker (2015)
termed the fourth wave.
The field’s current understanding of the constructs of intelligibility, comprehensibility,
and foreign-accentedness stems from Munro and Derwing’s (1995a) seminal study, in which they
elicited extemporaneous speech from Mandarin-accented L2 English speakers through a picture
description task, and presented sections of the recordings to native English listeners who
transcribed the sections and assigned Likert-scale ratings of perceived ease of understanding
(comprehensibility) and perceived degree of foreign-accentedness. They found that transcription
accuracy (intelligibility) was most strongly correlated with perceived ease of understanding
(comprehensibility), while it was only weakly correlated with the perceived degree of foreign-accentedness. These results have since been replicated in L2 English using read-aloud sentences
(Jułkowska & Cebrian, 2015), and picture-elicited words (Uchihara, 2022). Furthermore, the
same separation of the three constructs has been observed in L2 Spanish (Nagle & Huensch,
2020), L2 Mandarin (Neal, 2022), and L2 Arabic (Ali, 2023). In most cases, the strongest
correlation was found between comprehensibility and intelligibility, followed by a moderate-to-strong correlation between comprehensibility and accentedness, and either low or no correlation
between intelligibility and accentedness, pointing to comprehensibility as the superior predictor
of listener understanding of L2 utterances.
In psycholinguistic terms, comprehensibility taps into processing fluency, that is, the
speed of the online processing of speech (Trofimovich et al., 2022). This theoretical
interpretation is exemplified in the empirical results on the connection between
comprehensibility and processing time. Munro and Derwing (1995b) presented true and false
statements spoken by native and non-native English speakers to native listeners, who were
required to assign a truth value to each statement. Response latencies were calculated based on
the time listeners took to decide whether the statement was true or false. The response latency to
foreign-accented statements was longer, but this difference stemmed not from higher foreign-accentedness ratings, but from speech that was rated low on comprehensibility. Uchihara (2022)
investigated the relationship between processing time and comprehensibility at the word level. In
his study, processing time was operationalized as the time elapsed between listening to the word
and the first keystroke in transcription. He found that reduced comprehensibility predicted longer
processing time more strongly than higher accentedness did.
Reduced speech intelligibility and comprehensibility have implications for the success of
target language interactions. Speech that puts a high processing burden on the interlocutor can
hinder not only listening comprehension but also the ability of the interlocutor to successfully
participate in and contribute to the interaction. In a series of psycholinguistic experiments, Lev-Ari et al. (2018) discovered that native speakers of English who listened to non-native speech
performed more slowly on a lexical recall task and were also less accurate in recalling their own
responses to interview questions read by a non-native (Mandarin-accented) researcher. The
researchers explained the results as a lower level of detail in general linguistic processing as a
result of being exposed to non-native speech. These results could explain the findings of Varonis
and Gass (1982), and Gass and Varonis (1985), who uncovered that the reduced
comprehensibility of non-native speech causes native speakers to simplify their own speech and
sometimes leads them to cut the interaction short altogether. Such reactions from interlocutors
could seriously hurt language learners’ opportunities for interaction and practice, potentially
holding them back from reaching high levels of L2 attainment.
2.2. Segmental Error Gravity in L2 Speech Comprehensibility and Accentedness
L2 pronunciation research has since investigated a multitude of factors involved in
speech comprehensibility and accentedness, such as speaker- and listener-based variables, and
linguistic (phonological, lexicogrammatical, pragmatic, and fluency-related) correlates
(Crowther et al., 2022). Out of these, the ones that concern L2 pronunciation pedagogy the most
are phonological correlates, which could be divided into segmental and suprasegmental ones.
Whether it is segmentals or suprasegmentals that are more important and that should take
precedence in pronunciation teaching has been the object of a debate (Wang, 2022; Zielinski,
2015) that remains unsettled. The relative importance of these two aspects of pronunciation
likely depends on the language under study. Considering the relative scarcity of research on the
phonological factors influencing speech comprehensibility in languages other than English, such
a debate is unlikely to be fruitful. What has been more successful is finding an organizing
principle for an error gravity hierarchy between segments (consonants and vowels) in the form of
the Functional Load (FL) principle. The following sections delve into the specifics of the effects
of segmental errors on L2 speech comprehensibility and accentedness and chart the development
of the studies investigating the predictive power of phoneme FL in segmental error gravity.
2.2.1. Segmental Errors
Segmental errors can be categorized into four types, based on the departure they represent
from the native syllable structure of the target word: substitution, distortion, insertion, and
deletion (Derwing & Munro, 2015). Segmental substitutions refer to the replacement of the
target phoneme with another phoneme of the target language (e.g., pronouncing “think” as “sink”
in English, or pronouncing حرام [ħaraːm] as خرام [xaraːm] in Arabic). Distortions are similar to
substitutions in that they result in non-target-like production of the phoneme in question, but
such production does not involve a recognizable target language phoneme (such as pronouncing
the approximant [ɹ] in English as a trill [r] in the case of Arabic-accented English). Insertions and
deletions, on the other hand, change the syllable structure of the produced word. While insertion
involves the addition of a phoneme that was not part of the original structure of the word,
deletion results in the removal of a phoneme originally present. Most studies on L2
pronunciation have investigated the effects of segmental substitutions, and in the present review,
the term “errors” is used to refer to substitutions.
Some studies have compared the differential effect of consonant vs. vowel
mispronunciations on L2 speech comprehensibility and accentedness. These have yielded
conflicting results. Bent et al. (2007) found that in the case of Mandarin-accented read-aloud
sentences in L2 English, vowel mispronunciations harmed intelligibility more than consonantal
errors. This is in opposition to the results of Suzukida and Saito (2021) showing that consonant
errors were more impactful in the intelligibility ratings of Japanese-accented L2 English
extemporaneous speech. A similar result was reported by Na (2021) for Korean-accented read-aloud English words. In all likelihood, a direct comparison between consonants and vowels
might not be useful, since they occupy different positions within the syllable, and possess
different FL values, which have language-specific distributions. At the same time, as will be
discussed later, separate hierarchies between different consonants and between different vowels
can be determined based on the FL principle.
The position of consonantal errors seems to have an influence on the intelligibility scores
of speech segments. Bent et al. (2007) found that Mandarin-accented read-aloud L2 English
sentences containing word-initial consonant errors received the lowest intelligibility scores. In
their study, this was the only position in which consonant errors were significantly associated
with reduced intelligibility. In terms of lexical competition, it makes sense that word-initial
mispronunciations would have a more severe impact, since upon hearing the first sound of the
word, the listener is sent down the wrong path and it becomes difficult for them to successfully
identify the word after the activation of unrelated competitor words (Mattys et al., 2012). This
means that the comparison of consonantal errors needs to take into account the position of said
consonants within the word.
Generally, the longer the utterance is, the easier it is to understand it and the same is true
for individual words. In Uchihara’s (2022) investigation of word-level intelligibility,
comprehensibility and accentedness in L2 English, the number of syllables was a significant
predictor of better comprehensibility. This means that words containing fewer syllables were
harder to understand. This makes sense when considering that longer words contain more
information, and especially in the case of foreign-accented speech, differences from the mental
representation of lexical items in L1 listeners can make it particularly difficult to identify the
target as the amount of phonological information available decreases. The easier understanding
of longer utterances has also been approached from the standpoint of perceptual learning and
adaptation. Given enough information about a speaker’s phonetic variability, native listeners are
able to adapt and learn non-native pronunciation patterns, which has an effect on
comprehensibility ratings. However, it is important to point out that this adaptation is subject to
individual variation. Nevertheless, when comparing the effects of different consonantal
mispronunciations, word length is an important factor to control for.
Errors also seem to have a cumulative effect on comprehensibility and accentedness
ratings, although this effect differs based on the FL value of the erroneously produced segment.
In addition, the accumulation of errors affects comprehensibility and accentedness ratings
differently. Munro and Derwing (2006) found a cumulative effect only for high FL consonant
errors on accentedness ratings. This means that the number of consonantal mispronunciations did
not affect comprehensibility ratings, nor did the number of low FL consonant errors affect
accentedness. As a replication and extension of the latter study, Alnafisah et al. (2022) included
sentences with as many as four consonantal errors, and found that the effect of high FL
consonant errors was only magnified when the number of them reached four within a sentence.
Low FL consonant errors started showing a cumulative effect earlier (although the effect still
remained weaker compared to high FL errors). As for accentedness, high FL errors had a more
linear cumulative effect, compared to low FL errors that showed a cutoff after two errors. While
these results suggest that the frequency of mispronunciations could be more confusing for
listeners (especially in the case of the more important consonants), it is important to point out the
myriad of potentially confounding factors that could interfere with the cumulative effects of
mispronunciations, including word length, word position within the sentence, error position
within the word, as well as the unequal distribution of errors between content words and function
words (which Munro and Derwing (2006) highlighted as a limitation in their study).
2.2.2. The Predictive Power of Functional Load in Segmental Error Gravity
By far the most robust predictive framework for segmental error gravity has been the
Functional Load (FL) principle. Originally developed within the functionalist circles of the
Prague school to explain and predict historical sound change, the classical conceptualization of
FL refers to the amount of contrastive work performed by a phonemic opposition (King, 1967):
phoneme pairs that differentiate between more minimal pairs in a language have higher FL.
Since its introduction into foreign language teaching, the predictive value of the framework has
been confirmed by a number of empirical studies. The following paragraphs present a discussion
of the historical and theoretical background of the concept, a summary of the empirical evidence,
as well as a highlight of gaps and factors that have not been controlled for previously.
The contrastive work of phoneme pairs was traditionally operationalized as the frequency
of said phonemic opposition, meaning that phoneme pairs that differentiate more minimal pairs
in the lexicon were considered to have higher FL. The original utility of the concept lay in
explaining diachronic sound change in language systems: phonemic pairs with higher FL were
hypothesized to be more resistant to mergers, since they perform a lot of contrastive work to
keep the meanings of lexical items apart (King, 1967). The loss of such a phonemic contrast
would potentially hurt communication more than the loss of a low FL contrast. In terms of
information theory, the loss of a high FL phonemic contrast would lead to a high level of
information loss (entropy) from a particular linguistic system.
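The minimal-pair operationalization described above can be illustrated with a short sketch. It is only a toy example: the miniature lexicon, the representation of words as tuples of phoneme symbols, and the function name `minimal_pair_count` are assumptions made for illustration and are not taken from any of the studies reviewed here.

```python
def minimal_pair_count(lexicon, a, b):
    """Count word pairs in the lexicon that differ only by the a/b contrast.

    Each word is a tuple of phoneme symbols. For every occurrence of
    phoneme `a`, we swap in `b` and check whether the result is also a word.
    """
    words = set(lexicon)
    count = 0
    for word in words:
        for i, segment in enumerate(word):
            if segment == a:
                swapped = word[:i] + (b,) + word[i + 1:]
                if swapped in words:
                    count += 1
    return count


# Hypothetical toy lexicon (broad transcriptions of English words).
LEXICON = [
    ("p", "ɪ", "t"), ("b", "ɪ", "t"),   # pit, bit
    ("p", "æ", "t"), ("b", "æ", "t"),   # pat, bat
    ("p", "ɪ", "n"), ("b", "ɪ", "n"),   # pin, bin
    ("θ", "ɪ", "n"), ("s", "ɪ", "n"),   # thin, sin
    ("t", "ɪ", "n"),                    # tin
]

print(minimal_pair_count(LEXICON, "p", "b"))  # 3
print(minimal_pair_count(LEXICON, "θ", "s"))  # 1
```

In this toy lexicon the /p/-/b/ contrast separates three minimal pairs and /θ/-/s/ only one, so the former would count as the higher FL contrast under the type-frequency operationalization.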
Brown (1988) was the first to introduce the concept of FL into foreign language teaching.
His formulation of FL could be considered an expansion of King’s (1967) definition: it includes
12 considerations, including the part of speech of the minimal pairs, the phonetic similarity of
the phoneme pair, and the probability of occurrence of each member of the pair, among others.
Arguing for a relative weighting of these 12 factors, he developed a 1–10 ranking of vowel and
consonant contrasts in British English. A similar hierarchy was created by Catford (1987) on a
scale of 1–100. The main takeaway from Brown’s discussion of FL is that FL is more than just
the raw cumulative type frequency of the phonemic pair in question. Most importantly, from the
standpoint of L2 pronunciation, he argues that only contrasts that are frequently conflated by
language learners should be examined.
The hypothetical error gravity hierarchies built by Brown (1988) and Catford (1987)
have gained empirical confirmation in L2 pronunciation studies. Munro and Derwing (2006)
found that in the case of Cantonese-accented English read-aloud sentences, high FL consonant
errors affected both comprehensibility and accentedness ratings more negatively than did low FL
errors. Expanding on these results, Suzukida and Saito (2021) used recordings of
extemporaneous speech produced by Japanese-accented L2 English speakers, replicating the
negative effect of high FL consonant mispronunciations on comprehensibility ratings. In their
study, vowel errors and low FL consonant errors did not have a significant effect on
comprehensibility. In another replication of previous findings, Alnafisah et al. (2022) included
speakers from multiple language backgrounds in their study. The read-aloud English sentences
from these participants containing high FL segmental mispronunciations were judged less
comprehensible and more accented than their low FL counterparts. When it comes to vowels,
Thir’s (2020) study provides the first empirical evidence suggesting that high FL vowel
mispronunciations might cause more problems for listeners when compared to low FL ones. The
first study conducted on a language other than English found the same negative effect of high FL
segmental errors on comprehensibility compared to low FL ones in the case of L2 Chinese
speech (Bao et al., 2022). While the robustness of FL in segmental error gravity is increasingly
evident from the emergence of methodologically innovative studies, some factors remain to be
controlled for, such as segmental error position within the word, grammatical category, and even
word length.
2.3. The Relationship Between Functional Load and Consonant Age of Acquisition
Owing to FL’s robustness as a predictive framework for segmental error gravity, Munro
and Derwing (2015) call for the exploration of FL hierarchies and their effects on speech
comprehensibility and accentedness in languages other than English. This call has been echoed by
researchers in the field of AFL (Hellmuth, 2014; Rifaat, 2017; Wahba, 2021). Hellmuth (2014)
points out the lack of a clearly identified FL hierarchy in Arabic. The difficulty of identifying
such a hierarchy is exacerbated by a dearth of representative, phonetically annotated spoken
corpora (Ahmed et al., 2022). Relying on early evidence on the effect of FL on L1 acquisition
(Stokes & Surendran, 2005) in certain languages, Hellmuth proposes an error gravity hierarchy
based on the order of acquisition of consonants as described by Amayreh and Dyson (1998) for
Jordanian children. However, such proposals need to gain empirical confirmation before
being implemented pedagogically. Therefore, the aim of this section is to outline the empirical and
theoretical underpinnings that establish the relationship between FL and consonant age of
acquisition (CAoA) and to discuss the hypothesized relationship between FL and CAoA in
Arabic. The section then concludes with a review of L1 phonological acquisition studies in
Arabic and provides a categorization of the most common consonantal errors L2 speakers of Arabic
make based on the CAoA data available in Egyptian Arabic.
The study of first language acquisition, similarly to that of SLA, has traditionally been
dominated by formalist explanations for acquisitional patterns. These employ linguistic
universals such as markedness and articulatory complexity to explain why certain phonemes are
acquired earlier than others. According to this explanation, phonemes that are more marked are
acquired later than unmarked ones: e.g., voicing is a marked phonological feature and as such,
voiced consonants (e.g., /d/) are acquired later than their unmarked counterparts (e.g., /t/). In
terms of articulatory complexity, fricatives are more complex to produce articulatorily, and
therefore are acquired later than stop consonants. In addition, the traditional generativist view
(itself being a formalist approach) views the acquisition of language as top-down, with an innate
language acquisition device that predisposes children to follow similar paths of acquisition
across languages. The dominance of linguistic universals, however, has come under question by
proponents of the functionalist usage-based (emergentist) approach (Diessel, 2017). Usage-based
or emergentist approaches to language acquisition seek to locate the emergence of linguistic
forms in children’s usage within the ambient language that the child is exposed to. According to
the usage-based framework, no top-down innate device is needed, but rather the child makes
generalizations from bottom-up observations (Behrens, 2009). Based on this framework, the
language-specific acquisitional patterns that arise can be explained by characteristics of the
linguistic input that the child comes in contact with.
In the case of phonology, the two most common language-specific predictors of
acquisition order are phoneme frequency and phoneme FL (Ingram, 2008; Tribushinina & Gillis,
2017). While the traditional calculation of FL involves determining values for specific phoneme
pairs, the computational, information-theoretical approach allows for the computation of FL
values for individual phonemes by pairing them with articulatorily similar counterparts and
arriving at an approximate value. Sewell (2017) terms this formulation the broad sense of FL,
and it has been found to be more robust than Brown’s (1988) hierarchy in predicting CAoA
(Severen et al., 2013). In addition, the computational formula of FL takes into consideration
token frequency, since it is used to calculate segmental FL within representative spoken corpora.
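One way to make the information-theoretic approach concrete is to express the FL of a contrast as the proportion of entropy a system loses when the two phonemes are merged. The sketch below is a simplified illustration under stated assumptions: it uses word-level (unigram) entropy over token frequencies, the toy frequencies are invented, and the names `entropy` and `entropy_fl` are hypothetical. Published calculations may differ in their details, and a value for an individual phoneme would additionally require aggregating over its articulatorily similar counterparts.

```python
import math
from collections import Counter


def entropy(freqs):
    """Shannon entropy (in bits) of a token-frequency distribution."""
    total = sum(freqs.values())
    return -sum((n / total) * math.log2(n / total)
                for n in freqs.values() if n > 0)


def entropy_fl(token_freqs, a, b):
    """Relative entropy loss when phonemes a and b are merged.

    token_freqs maps each word (a tuple of phoneme symbols) to its
    token frequency in a corpus. Merging rewrites every b as a, so
    words that differed only by the a/b contrast become homophones.
    """
    h_before = entropy(token_freqs)
    merged = Counter()
    for word, n in token_freqs.items():
        merged[tuple(a if seg == b else seg for seg in word)] += n
    h_after = entropy(merged)
    return (h_before - h_after) / h_before


# Hypothetical token frequencies for three words: pit, bit, sit.
FREQS = {
    ("p", "ɪ", "t"): 10,
    ("b", "ɪ", "t"): 10,
    ("s", "ɪ", "t"): 20,
}

print(round(entropy_fl(FREQS, "p", "b"), 3))  # 0.333
print(entropy_fl(FREQS, "s", "θ"))            # 0.0 (no /θ/ words to merge)
```

Because the calculation runs over token frequencies, a contrast that separates frequent words contributes more than one that separates rare words, which is the sense in which this formulation takes token frequency into account.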
FL has been found to be as or more predictive than phoneme frequency for CAoA. Stokes
and Surendran (2005) found that FL was a unique predictor of CAoA in English and Dutch,
where they did not find additional explanatory power for frequency. In the case of Cantonese,
only frequency seemed to be a predictive factor in CAoA. In order to make more valid
assumptions about children’s ambient language influences, Severen et al. (2013) used child-directed speech corpora and found that the FL calculations based on them better predicted CAoA
for Dutch word-initial consonants compared to FL calculations based on adult-directed speech
corpora. They also found that token frequency was a better predictor than type frequency, which
does not take into account non-standard pronunciations. In a larger-scale comparison of five
languages (English, Japanese, Mandarin, Turkish, Spanish), Cychosz (2017) found that FL
(calculated based on child-directed speech) was a better predictor of CAoA than frequency in
four of them, with only Mandarin showing a reverse pattern. Overall, these results point to likely
effects of language typology, as tonal languages do not seem to show high FL effects on CAoA.
In all likelihood, the more a language relies on vowels and suprasegmentals to contrast meaning,
the less role its consonant FL distribution plays in consonant acquisition order.
2.3.1. The Hypothesized Relationship Between Functional Load and Consonant Age of
Acquisition in Arabic
Froud and Khamis-Dakwar’s (2021) critical review of Arabic L1 acquisition studies
highlights that most of the published literature on Arabic L1 phonological acquisition has
approached the subject from the standpoint of universal processes. The discussion of FL as a
possible explanation for Arabic-specific acquisitional patterns appears in Amayreh and Dyson
(1998), Amayreh and Dyson (2000), and Amayreh (2003). However, they do not quantify FL in
any of their studies, merely suggesting that those consonants that defy cross-linguistic patterns of
acquisition based on their marked or articulatorily complex features could have higher FL values
in Arabic (e.g., the voiceless pharyngeal fricative /ħ/). They are also overly cautious in
highlighting that many late-acquisition consonants also seem to be marked or articulatorily
complex (e.g., the emphatic consonants with pharyngeal secondary articulations). This
interpretation seems to privilege linguistic universals over language-specific explanations
unnecessarily. As will be discussed, FL could still correlate with CAoA even in cases
of overlap between predictions made by linguistic universals.
By looking at the phonological structure of the different languages in Cychosz’s (2017)
study, we could hypothesize the likely magnitude of FL effects on Arabic CAoA. In her
investigation, Spanish showed the strongest correlation between consonant FL and CAoA, a
relationship that is stronger than the one observed in English. There are indications that this
relationship in the case of Arabic could be stronger than the one in English, possibly approaching
the one found in Spanish. Firstly, unlike English, Arabic does not have contrastive stress. English
listeners seem to be sensitive to stress errors, which affect speech comprehensibility negatively.
Stress errors in English also introduce vowel errors, since unstressed vowels undergo reduction,
a phenomenon that is similarly absent from Arabic. In terms of its vowel inventory, Arabic has
three to five distinct vowel qualities (as opposed to 25–28 consonant phonemes, depending on the
dialect), which is no more than the five vowels found in Spanish and far fewer than the 10+
vowels that exist across varieties of English. What could put Arabic behind Spanish is its two
contrastive suprasegmental features: vowel length and consonant gemination. In comparison,
Spanish does not have contrastive suprasegmental features, relying only on consonants and a
small set of vowels to differentiate meaning. In this sense, Arabic could potentially exhibit an
FL-CAoA relationship that is between Spanish and English, which would mean a fairly high
correlation.
These typological differences between the languages under study are
reflected in the language-specific patterns of early spoken word recognition by children.
While infants from multiple language backgrounds exhibit a vowel bias in spoken word
processing in the first year of life, this bias shifts in favor of consonants in languages that make
greater use of consonantal contrasts (Nazzi & Cutler, 2019). For example, this shift takes place
by the 12th month in Spanish-speaking children (Bouchon et al., 2022). As presented before,
Spanish displays a very strong correlation between consonant FL and CAoA. In comparison,
English-learning children only develop a consonant bias within their 3rd year of life (Ratnage et
al., 2023), which can explain why English shows a weaker correlation between FL and CAoA
than Spanish. Lastly, Cantonese- and Mandarin-speaking children retain the vowel advantage
even in their 3rd year of life (Chen et al., 2021; Ma et al., 2017). In addition, speakers of the latter
two languages do not display a consonant bias in speech processing even in adulthood. This
again can explain the weak correlation between consonant FL and CAoA in both Cantonese and
Mandarin.
When it comes to the consonant bias in Arabic spoken word processing, psycholinguistic
evidence favors a strong consonant advantage. Aldholmi and Pycha (2023) conducted two
experiments to investigate the differential effects of the removal of vowels vs. consonants from
MSA stimuli and found that sentences that had their vowels masked were more accurately
identified by listeners than sentences where the consonants were masked and only the vowels
could be heard. This finding concurs with similar results found in other Semitic languages like
Hebrew (Lador-Weizman & Deutsch, 2022). In comparison, sentence-level word recognition by
English-speaking adults shows a vowel bias. The finding also makes sense in light of the
typological distribution of segments in Arabic, where the balance of the scale is tipped towards
consonants. Moreover, despite claims to the contrary, this consonantal bias seems to be further
enhanced by the root-and-pattern-based morphology of Semitic languages. Lador-Weizman and
Deutsch (2022) compared the consonant bias for morphologically complex and morphologically
simple Hebrew words and found that it was stronger for the complex words that had clearly
identifiable Hebrew roots and patterns.
Computational evidence also points to the advantage that consonants offer to children
learning a root-and-pattern-based language like Arabic. Kastner and Adriaans (2018) used a
Bayesian computational model that approximates the statistical learning that children are thought
to engage in when processing the ambient language, in accordance with usage-based approaches
to language acquisition. They compared the performance of this model on Arabic and English
phoneme segmentation by feeding it both consonant-only data and consonant-and-vowel (full)
representations. In the case of Arabic, the model was more accurate at correctly identifying word
and morpheme boundaries when fed consonant-only data than when it was given both
consonants and vowels. In English, this consonant advantage was not present, suggesting that a
young learner of Arabic benefits from ignoring vowels and focusing on consonants to acquire the
language.
Overall, the phonological structure that favors consonants over vowels and
suprasegmentals, the unique morphological utilization of consonants in the form of roots, and
psycholinguistic evidence for a consonant bias in speech processing all point to the possibility
that CAoA is strongly associated with FL in Arabic. That is, early-acquisition consonants likely
represent high FL values and late-acquisition consonants likely represent low FL in the language.
Thus, the hypothesis put forth by Hellmuth (2014) seems plausible. What is needed now is a
more thorough examination of CAoA in Arabic, with the recognition that dialects probably differ
in the order of acquisition of their consonants, likely pointing to differing FL distributions. This
does not come as a surprise, considering that FL is a usage-based concept that is supposed to
reflect the changing nature of language and social and regional variation in patterns of language
use.
2.3.2. Arabic L1 Consonant Acquisition Order
While many studies on Arabic phonological acquisition suffer from restrictiveness in
scope (small sample sizes, limited age ranges, non-standard elicitation methods, different criteria
for acquisition; Froud & Khamis-Dakwar, 2021), there have recently been promising, large-scale,
cross-sectional investigations of CAoA in Syrian (160 children between the ages of 2;6 and
6;5; Owaida, 2015) and Egyptian Arabic (360 children between the ages of 1;6 and 7;4; Elrefaie et
al., 2021). The results of these can be compared with Amayreh and Dyson’s (1998) large-scale
investigation of CAoA in Jordanian Arabic (180 children between the ages of 2;0 and 6;4).
According to Froud and Khamis-Dakwar (2021), differences observed could be indicative of
dialect-specific acquisitional patterns. For this reason, a speech perception study relying on
CAoA data needs to include listeners from a single dialectal background, to reliably infer
underlying FL values. Listeners also bring their unique dialectal experiences that could affect
their comprehensibility ratings (e.g., the lexicon of each dialect is different). This suggestion is
also in accordance with Sewell’s (2017) note on FL being a context-dependent, dynamic
phenomenon, as opposed to being a fixed property of a larger, more abstract linguistic structure.
L1 consonant acquisition has traditionally been categorized into three stages: early,
middle, and late acquisition. However, these are merely relative stages, since studies have
differed in the age ranges of the children they included as participants across languages. For
example, McLeod and Crowe (2018) reviewed studies describing English consonant acquisition
and categorized early as comprising the age range 2;0–3;11, middle as 4;0–4;11, and late as 5;0–
6;11. However, for Korean, the early stage spans 2;0–2;11, the middle stage spans 3;0–3;11, and
the late stage spans 4;0–4;11. In the case of Arabic, Amayreh and Dyson (1998) categorized their
data as early (2;0–3;10), intermediate (4;0–6;4), and late (after 6;4). From the standpoint of
segmental error gravity, it is more practical to divide the stages of acquisition into early- and
late-acquisition consonants, in keeping with the studies comparing segmental errors with high
and low FL. The boundary between them can be drawn around age 4, which seems to be the
midpoint of acquisition across the three large-scale studies conducted on Arabic. It is also the age
by which the majority of consonants have been shown to be acquired in the 27 languages reviewed
by McLeod and Crowe (2018).
Importantly, we need to remember Brown’s (1988) point about FL: we are only interested
in the error gravity hierarchies of phonemes that are likely to be mispronounced by L2 learners
of the language. Fortunately, we can rely on Al Tubuly’s (2018) descriptive investigation of the
production accuracy of Arabic consonants by L2 learners. This study involved 50 L2 learners of
Arabic from five language backgrounds: English, German, Greek, Turkish, and Chinese. The
advantage of this investigation is that mispronunciations originated from native speakers of
phonologically diverse languages, giving a more complete account of the likely difficulties L2
learners of Arabic face when pronouncing segments. The consonants that exhibited lower than
90 percent production accuracy in the study were the following: /h/ (ه), /x/ (خ), /ð/ (ذ), /ɣ/ (غ),
/sˤ/ (ص), /dˤ/ (ض), /ðˤ/ (ظ), /tˤ/ (ط), /ʕ/ (ع), /q/ (ق), /ħ/ (ح). Out of these, the voiced interdental
fricative /ð/ (ذ) is absent from most colloquial dialects, including Egyptian and Syrian Arabic,
and native speakers of these dialects usually substitute it with the voiced alveolar fricative /z/ (ز).
Additionally, while the glottal stop /ʔ/ (ء) showed 100 percent accuracy in Al Tubuly’s study, its
substitution with the voiced pharyngeal fricative /ʕ/ (ع) seems to be common among L2 learners and
can oftentimes be observed in word-initial position (e.g., pronouncing أَمْر /ʔamr/ as عَمْر [ʕamr]).
This leaves us with 11 consonants most likely to be mispronounced by L2 learners of Arabic for
which establishing an error gravity hierarchy would be the most useful: /h/ (ه), /x/ (خ), /ɣ/ (غ),
/sˤ/ (ص), /dˤ/ (ض), /ðˤ/ (ظ), /tˤ/ (ط), /ʕ/ (ع), /q/ (ق), /ħ/ (ح), /ʔ/ (ء).
There tend to be terminological discrepancies between studies with regard to the usage
of acquisitional criteria. In the case of the three large-scale Arabic studies, acquisition was used
to mean 75 percent of children in an age group correctly producing the consonant in each
position tested in the case of Jordanian Arabic (Amayreh & Dyson, 1998), while it referred to a
90-percent criterion in the case of Syrian (Owaida, 2015) and Egyptian (Elrefaie et al., 2021).
The 90-percent criterion was called mastery in the Jordanian study, while mastery was used to
denote a 100-percent criterion in the Egyptian study. Given these discrepancies, the 90-percent
criterion, which all three studies report, can be used to categorize the potential mispronunciations into
early- and late-acquisition groups. Interestingly, using this criterion, all 11 consonants would fall
under late acquisition in the case of Jordanian Arabic. In fact, another study by Amayreh (2003)
found that by age 8;4, there were still consonants unacquired by Jordanian children. These
unusually late acquisitional results go against both the more recent evidence on Syrian and
Egyptian Arabic, where children were found to acquire most consonants by age 6 (except /q/, /ʒ/,
and /ðˤ/ in Syrian), and the crosslinguistic evidence showing that almost all consonants are
acquired by age 6 (McLeod & Crowe, 2018).
Following the 90% criterion and the 4-year boundary between early and late acquisition,
both Syrian and Egyptian data point to the same categorization of the 11 consonants of interest
(Elrefaie et al., 2021; Owaida, 2015). Early-acquisition consonants are the following: /h/ (ه),
/ʕ/ (ع), /ħ/ (ح), /ʔ/ (ء); while late-acquisition consonants include the following: /x/ (خ), /ɣ/ (غ),
/tˤ/ (ط), /dˤ/ (ض), /sˤ/ (ص), /ðˤ/ (ظ), /q/ (ق). Of course, data from other dialects would be useful to
enrich this categorization or to potentially develop other categorizations based on dialect-specific
data. However, this is not currently possible because the available studies were conducted on
specific, restricted age groups and thus are not directly comparable with each other.
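
The categorization rule used in this section (a 90-percent acquisitional criterion combined with
an early/late boundary around age 4) can be illustrated with a minimal Python sketch. It is an
illustration under stated assumptions, not the procedure of any cited study: the function names,
the reading of "around age 4" as "before 4;0", the choice of the youngest age group meeting the
criterion as the age of acquisition, and the toy accuracy values for the placeholder consonants
C1 to C3 are all hypothetical.

```python
from typing import Dict, Optional

# Assumptions (not taken from the reviewed studies): ages are in months,
# and "around age 4" is read as "acquired before 4;0" for the early group.
CRITERION = 0.90        # proportion of children producing the consonant correctly
BOUNDARY_MONTHS = 48    # 4;0


def age_of_acquisition(accuracy_by_age: Dict[int, float],
                       criterion: float = CRITERION) -> Optional[int]:
    """Return the youngest tested age group (in months) whose accuracy meets
    the criterion, or None if no tested age group reaches it."""
    for age in sorted(accuracy_by_age):
        if accuracy_by_age[age] >= criterion:
            return age
    return None


def classify(accuracy_by_age: Dict[int, float],
             criterion: float = CRITERION,
             boundary: int = BOUNDARY_MONTHS) -> str:
    """Label a consonant 'early' if it is acquired before the boundary age,
    and 'late' otherwise (including when it is never acquired in the sample)."""
    age = age_of_acquisition(accuracy_by_age, criterion)
    return "early" if age is not None and age < boundary else "late"


# Hypothetical placeholder values, NOT data from any cited study.
toy_data = {
    "C1": {24: 0.60, 36: 0.92, 48: 0.97},
    "C2": {24: 0.20, 36: 0.55, 48: 0.80, 60: 0.93},
    "C3": {24: 0.10, 36: 0.30},
}

groups = {consonant: classify(acc) for consonant, acc in toy_data.items()}
print(groups)
```

With these placeholder values, C1 is labeled early (criterion first met at 36 months), while C2
(60 months) and C3 (criterion never met in the tested age groups) are labeled late. Changing the
criterion argument to 0.75 would reproduce the more lenient definition of acquisition used in the
Jordanian study.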
2.4. Conclusion
Overall, CAoA seems to be a promising predictor of consonantal error gravity in
languages where the contrastive distribution of segments and suprasegmentals favors consonants.
In such languages, CAoA shows a significant and strong correlation with FL. Drawing on
languages with similar phonological paradigms, it can be hypothesized that CAoA is strongly
associated with FL in Arabic. Thus, the expectation is that early-acquisition consonants represent
high FL values, while late-acquisition consonants represent low FL values. Consequently,
early-acquisition consonant errors should reduce speech comprehensibility more than late-acquisition
errors. Likewise, early-acquisition consonant mispronunciations should increase the perceived
degree of foreign-accentedness more than late-acquisition ones. Overall, indirect inferences
about underlying FL distributions can be of great use in the case of other under-researched
languages with few available and representative spoken corpora.
