Actually, yesterday I spent 12 hours straight on it. Now, I think that no one is interested in it, but what the heck, I will post it anyway...
One of my favourite errors occurred in an American war film, subtitled in French. One of the soldiers peers into the distance, and another says, “Tanks?” The subtitle reads, “Merci.” ~ Edwards, 1994
Over the last half-century, as the discipline of linguistics has exploded into numerous and often interrelated subfields, the study of second language acquisition has attracted a great deal of attention. Within this field, there are further divisions, including second language (L2) phonology, which will be treated here. It is well known that when learning a second language, unless one is very young, it is almost guaranteed that L2 speech will be accented. Some L2 sound structures pose problems for learners, while others are acquired with ease. Throughout the last fifty years many theories have been formulated in attempts to explain these phenomena, to predict them, and to improve the teaching of second languages. While innumerable studies fill vast bibliographies, this essay can only deal with some of the more salient ones, hoping to sketch out a general view of the advancements, improvements, and even failures in the field of second language phonology.
Early L2 acquisition studies were heavily influenced by the behaviorist philosophy that was popular at the time. Language was believed to be a set of habits, and the more different the L2 “habit,” the more difficult it would be to learn. The Contrastive Analysis Hypothesis (CAH) approach was spearheaded by Lado (1957), who sums up the basic tenets of the model:
We assume that the student who comes into contact with a foreign language will find some features of it quite easy and others extremely difficult. Those elements that are similar to his native language will be simple for him, and those elements that are different will be difficult. (Lado, 1957, p.2)
We have ample evidence that when learning a foreign language we tend to transfer our entire language system in the process. (ibid, p. 11)
According to Lado, the CAH involves a hierarchy of difficulty for the contrasts between two languages. The lowest level is “no difference,” and the highest is a situation in which two L1 allophones exist as two separate phonemes in the L2.
Later work by Stockwell and Bowen (1965) introduced more behaviorist terminology, with speech sounds in a given language described as either “optional,” “obligatory,” or “null.” They differed from Lado in that, to them, the most difficult phonological acquisition in the L2 would be acquiring an L2 allophone that did not exist in the L1. At this point in time, the theories were not based on any empirical evidence. Later research, such as Suter (1976), suggested that the most difficult area of pronunciation was suppressing L1 allophones that did not exist in the L2, followed by the problems thought to be most difficult by Lado and by Stockwell & Bowen, respectively.
At first, the CAH claimed that all L2 errors could be predicted by contrasting the phonetics and phonologies of the two languages. When this strong position was tested and failed, a weaker version emerged, which claimed to explain all errors without necessarily being able to predict them.
In time, it was noticed that the CAH models not only made some wrong predictions, but even in their weakest forms could not explain certain L2 phenomena. Furthermore, CAH was not capable of predicting rate of acquisition, nor could it always distinguish, within an L2, exactly which sounds were more difficult to learn. For example, the French dental /t/ and uvular /R/ are different from the English alveolar /t/ and /r/. For a French speaker learning English (or vice versa), which sound would be easier to learn? The CAH could only say that they would both be difficult.
With the rise of Chomskian linguistics and the shift towards Universal Grammar (UG), the behaviorist-based CAH was seen by many as insufficient, and fell in prominence to new models of second language acquisition. These new models were principally based on the concepts of UG, typological universals, and the notion of markedness.
Greenberg (1957) collected data from many languages of the world, identifying tendencies, commonalities and impossibilities. Phonological features which are most common among the world’s languages were described as unmarked, and the rarer sounds and patterns were described as marked. Following this idea, unmarked sounds seemed to be more natural to language, and were therefore assumed to be easier to learn than marked sounds. These patterns were called typological universals.
Items were rated in terms of markedness, on an implicational scale from 1 to 5, with 1 being completely unmarked and 5 being most marked. If a speaker’s native language contains a phoneme at level 3, it was then assumed that he or she can easily pronounce, or learn to produce, phones at levels 1 and 2, but may have difficulty with sounds at levels 4 and 5.
An example of this would be nasal and oral vowels. A language may have oral vowels and no nasal vowels, or it may have both. But no natural language has nasal vowels without oral vowels. In this instance, the oral vowels are unmarked and the nasal vowels are marked.
These relations form an accessibility hierarchy. If a language has nasal vowels, then it must also have oral vowels. If it has only oral vowels, then it may be difficult for a speaker to learn an L2 with nasal vowels. For a speaker coming from an L1 that makes use of both types of vowels, it would be expected that he or she can learn the oral and nasal vowels of the L2 with little difficulty.
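To make the logic of the implicational scale concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the feature labels, and the numeric levels are invented, and none of it comes from Greenberg's data.

```python
# Hypothetical illustration of an implicational markedness scale.
# Feature names and level numbers are invented for this example.

# Each pair reads: "if a language has the first item, it must have the second".
IMPLICATIONS = [("nasal_vowels", "oral_vowels")]


def obeys_implications(inventory, implications=IMPLICATIONS):
    """True if every marked item in the inventory comes with its unmarked partner."""
    return all(
        unmarked in inventory
        for marked, unmarked in implications
        if marked in inventory
    )


def predicted_ease(l1_levels, target_level):
    """Compare a target sound's level (1-5) with the most marked level in the L1."""
    ceiling = max(l1_levels)
    return "easy" if target_level <= ceiling else "potentially difficult"


print(obeys_implications({"oral_vowels"}))                  # oral only: allowed
print(obeys_implications({"oral_vowels", "nasal_vowels"}))  # both: allowed
print(obeys_implications({"nasal_vowels"}))                 # nasal only: ruled out
print(predicted_ease({1, 2, 3}, 2))
print(predicted_ease({1, 2, 3}, 5))
```

The first function encodes the universal itself (nasal vowels imply oral vowels); the second encodes the learning assumption that anything at or below the L1's most marked level should come easily.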
Eckman’s Markedness Differential Hypothesis (1977) incorporated, as the name suggests, universal markedness into measuring the level of difficulty of L2 phonological acquisition. MDH is an extension of CAH, adding that typological markedness must be incorporated into the theory as a measure of ‘difficulty,’ rather than positing L1/L2 contrast as the sole basis for difficulty. Additionally, the MDH claimed to explain the order of sound acquisition, and different levels of proficiency.
Essentially, the MDH states that less marked sounds are acquired before more marked ones. The model made three basic claims regarding areas of difficulty a learner will have in the L2:
1. Those areas of the target language that differ from the native language and are more marked than the native language will be difficult.
2. The relative degree of difficulty in areas of difference of target language that are more marked than the native language will correspond to the relative degree of markedness.
3. Those areas of the target language which are different from the native language but are not more marked than the native language will not be difficult. (Eckman, 1977, p. 321)
For example, voiced obstruents in final position are universally more marked than voiced obstruents in either initial or medial position. Korean has only voiceless obstruents. The MDH predicts that a native Korean speaker learning English (which employs voiced obstruents in all positions) would acquire the initial and medial obstruents first, because they are less marked universally. This prediction was tested and found to be nearly 100% accurate by Major and Faudree (1996).
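The three claims and the Korean example can be read as a small decision procedure. The Python sketch below is only one simplified reading of them: the function name and the numeric markedness values are hypothetical, chosen so that final position is more marked than initial or medial position, as the text states.

```python
# A simplified, hypothetical encoding of the three MDH claims.
# Markedness numbers are invented; only their relative order matters.

def mdh_prediction(differs_from_l1, target_markedness, l1_markedness):
    """Return (is_difficult, relative_degree) for one target-language structure."""
    if not differs_from_l1:
        return (False, 0)          # no difference, so no difficulty predicted
    gap = target_markedness - l1_markedness
    if gap > 0:
        return (True, gap)         # claims 1 and 2: difficult, scaled by markedness
    return (False, 0)              # claim 3: different but not more marked


# Voiced obstruents in English, by position (assumed values).
POSITION_MARKEDNESS = {"initial": 1, "medial": 1, "final": 2}
L1_BASELINE = 0  # assumed value for an L1 with no voiced obstruents

# Sort positions from least to most predicted difficulty.
order = sorted(
    POSITION_MARKEDNESS,
    key=lambda pos: mdh_prediction(True, POSITION_MARKEDNESS[pos], L1_BASELINE)[1],
)
print(order)  # final position comes last
```

Under these assumed values, the predicted acquisition order places initial and medial voiced obstruents before final ones, which is the pattern described above.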
However, there is a problem with relying on markedness as the determiner of difficulty of acquisition: many examples exist in which exactly the opposite is true. Oftentimes, because of perceptual similarity, an unmarked L2 sound which is similar to an unmarked L1 sound will be difficult to perceive and produce for the learner. The same learner may also be able to perceive and produce an L2 sound which is very dissimilar from anything in the L1, and more universally marked.
Therefore, other theoretical bases were introduced that sought to explain the difficulty in acquiring a second language without relying on markedness as the determining factor. Many have been proposed, and the common thread throughout is the focus on the similarity/dissimilarity dichotomy of L1/L2 speech sounds.
One example is the Crucial Similarity Measure (CSM) developed by Wode (1976, 1978, 1983a). The CSM predicts that the closer the L2 sound is to an L1 ‘equivalent,’ the more difficult it will be to master. For Wode, transfer from the L1 can only take place in cases where a phoneme in the L2 meets a “specifiable similarity requirement.” In that case, the L1 sound transfers to the L2 inventory. Dissimilar sounds in the L2 are thought to be acquired through processes similar to L1 acquisition.
Empirical research has shown that perceptual similarity can be a serious detriment to the acquisition of L2 sounds. Kuhl et al. (1992) investigated the nature of first language acquisition and its effects on phonetic perception of infants at the age of 6 months. They hypothesized that within 6 months, "prototypes," or ideal representations of phones, are formed in the mind. Swedish and American infants were tested on their recognition of both native and foreign vowels /i/ and /y/. The same phonetic sounds, at equal distances from the hypothesized prototype centers, were perceived differently, either as a variation of the ideal /i/ or /y/ phone in the respective languages. This suggests that early on, prototypes "perceptually assimilate similar sounds," which already means that there will be a disadvantage in perceiving non-native speech sounds.
Sebastián-Gallés et al. (1999) tested highly proficient Catalan-Spanish bilinguals in their perception of Catalan phonemic contrasts not found in Spanish. They found that the Spanish-dominant bilinguals performed worse than their Catalan-dominant counterparts, and hypothesized that L2 learners use their L1-tuned phonemic categories to "sieve" the acoustic input when listening to the L2. Although the Spanish-dominant subjects of this study had intensive exposure to Catalan from age 4, and demonstrated a high pretrial ability to perceive Catalan phonemes, the results suggest that their L1 phonemic categories are so strong that categorial malleability is severely limited. Like Kuhl et al., they found that the acquisition and organization of L2 phonemic categories are strongly affected by first language phonological constraints.
The Similarity Differential Rate Hypothesis (SDRH), as described by Major and Kim (1996), sought to correct problems of the MDH and SCH. One problem with the MDH was that it sought to predict everything based on markedness universals. Similarity was not considered, and the studies described above show that it cannot be ignored. For instance, in the Sebastián-Gallés et al. task, Spanish-dominant bilinguals could not consistently and accurately perceive even less marked Catalan sounds. The MDH, according to postulate (3) listed above, would predict the opposite result.
While not denying that markedness does indeed have some influence, Major and Kim explained that markedness will only affect the rates at which two L2 features identical in similarity are learned. Another difference is that the SDRH measures rate of acquisition, rather than difficulty of acquisition. A problem with theories that base themselves on “difficulty” is that “difficulty” is almost impossible to define.
Major and Kim give the example of phonemes x and y. Assuming that x is more difficult to acquire than y, does this mean that y is acquired faster than x? Or does this mean that at any given time, x will be more difficult to pronounce than y? What if, ultimately, pronunciation of x is accented, and y has become native-like, yet x was acquired much faster? The MDH is silent on this issue: it cannot explain why x would be considered more difficult if it is actually acquired faster. Even if competence in using x is ultimately below competence in y, there may have been a time when y was so dissimilar to anything in the L1 that it was extremely difficult even to attempt to use. (An example of this could be the trilled R of Spanish, if an L1 English speaker takes quite some time to learn how to pronounce it properly.)
The problem with these two competing hypotheses is plain: the notions of similarity and markedness are working against each other. Each has been shown by the various theories to be indicative of phonological problems that will arise in L2 acquisition. However, the theories are restricted in what they posit as the cause, and they exclude each other; even Major and Kim (1996) admit this. What was needed was a theory that could unify these attested linguistic phenomena: the effects of L1 similarity, markedness universals, language transfer, UG, and similar ideas in their various nomenclatures.
Optimality Theory (Prince and Smolensky, 1993) is generally recognized as the first theory of language in which markedness is “explicitly and intrinsically” incorporated in the explanation of phonological phenomena. To appreciate the shift this represents in theories of L2 phonology, consider that Chomsky and Halle only mentioned markedness in the Epilogue of The Sound Pattern of English (1968)!
In Optimality Theory (OT), each language, or grammar, is comprised of unique constraint rankings. The two types of constraints are Faithfulness and Markedness.
Faithfulness refers to the relationship between the input and the output; it demands that the output neither delete material from the input nor epenthesize material into it. Markedness concerns the output only; it penalizes potential output candidates which contain marked structures. For example, in an English word set like /dark/ -> /dark.ness/, the affixation would seem to have Faithfulness as the highest ranked constraint. An output candidate which violated the faithfulness constraint would not be optimal, hence it is not part of the spoken or written language. With a word like /able/ -> /ability/, a markedness constraint would be the highest ranked, and the optimal output would be that which does not violate markedness. In turn, a faithfulness constraint is violated and the output is different from the input (“ablety” is not a word). With “darkness,” an epenthesis or deletion would seem to be more damaging than the marked /rkn/ consonant cluster. Thus, the rankings of these constraints determine the final output of a word or sound.
OT marks a stark divergence from rule-based phonology, in that it allows constraints to be violated; in fact, it is common for constraints to be violated. The optimal output is the form which violates only the lowest ranked constraints. Rule-based phonology, on the other hand, would regard an output which violates a rule as ill-formed. This has led to seemingly impossible questions as to how a particular phonological pattern can exist in a language if it violates that language's rules. Either a highly complex series of rules and exceptions must be posed, or the pattern is left unexplained. OT offers a much simpler explanation, in that constraint (or “rule”) violation poses no problem; it only means that the constraint has a lower ranking in the particular language.
With regard to language learning, an interlanguage (IL) grammar is described by OT as a dynamic combination of L1 constraints and L2 constraints. As the learner progresses, the grammar (or set of constraint rankings) becomes more like the L2 and less like the L1 (obviously this characterization applies only in cases of L1/L2 difference). OT is a valuable framework in which L2 errors can be analyzed, because in the process from input to output, it “processes” every candidate, so every potential output can be seen and the optimal one identified. It logically follows that any output is potentially optimal if the constraints are re-ranked. In the case of L2 pronunciation errors, or even in accent shifts within the same language, OT can provide a clear explanation of what re-rankings must have occurred to produce this new, or peculiar, output result.
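The idea that re-ranking the same constraints selects a different optimal output can be sketched in a few lines of Python. This is a toy evaluator and not a full OT implementation: the input /bad/, the four candidates, and the hand-assigned violation counts are all assumptions made for illustration. The constraint names (Max for "no deletion," Dep for "no epenthesis," Ident(voice) for "keep voicing," *VoicedCoda for "no voiced coda") follow common OT labels but do not appear in the sources discussed here.

```python
# Toy OT evaluation: strict constraint domination as lexicographic comparison.
# Candidates and violation counts are hand-assigned assumptions for input /bad/.

CANDIDATES = {
    "bad":  {"*VoicedCoda": 1},     # fully faithful, but has a voiced coda
    "bat":  {"Ident(voice)": 1},    # coda devoiced
    "ba":   {"Max": 1},             # coda deleted
    "badə": {"Dep": 1},             # vowel epenthesized after the coda
}


def evaluate(candidates, ranking):
    """Return the optimal candidate for a ranking (highest constraint first)."""
    def profile(name):
        # Tuple of violation counts, ordered from highest to lowest constraint.
        return tuple(candidates[name].get(c, 0) for c in ranking)
    # Tuples compare left to right, which mirrors strict domination.
    return min(candidates, key=profile)


faithful = ["Max", "Dep", "Ident(voice)", "*VoicedCoda"]
devoicing = ["*VoicedCoda", "Max", "Dep", "Ident(voice)"]
deleting = ["*VoicedCoda", "Dep", "Ident(voice)", "Max"]

print(evaluate(CANDIDATES, faithful))   # faithful candidate wins
print(evaluate(CANDIDATES, devoicing))  # devoiced candidate wins
print(evaluate(CANDIDATES, deleting))   # deleted-coda candidate wins
```

Each winner still violates some constraint; it simply violates a lower ranked one than its competitors do. On this view an interlanguage stage is just another ordering of the same list.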
Hancin-Bhatt describes the sets of constraint rankings at play in a language learner’s IL:
1. The native language ranking, which account for ‘erroneous’ productions due to full transfer;
2. A hypothesized target language ranking, which accounts for accurate productions; as well as
3. Re-rankings between native and target rankings, which account for ‘erroneous’ productions that do not have an obvious link to the native or target grammar.
By assuming that these rankings compete in the developing grammar, we can account for the range of productions we see in L2 phonological acquisition. (Hancin-Bhatt, 2000, pp. 205-206)
Hancin-Bhatt (2000) studied the ways in which Thai learners of ESL dealt with codas. English allows nearly any consonant to appear as a coda, while Thai only allows voiceless stops, nasals, and glides to appear in coda position. Therefore, a novice Thai learner of English would have no difficulty with saying “kit,” “kick,” or “Kim,” but “kid” would pose a problem. The result is usually a substitution with an allowable Thai coda phoneme which shares place of articulation (“kid” pronounced as “kit,” “wig” pronounced as “wik”). Additionally, Thai does not allow complex codas, so for a word like “script,” either the /p/ or /t/ is deleted. In her study, Hancin-Bhatt analyzes coda restrictions within the framework of OT, and postulates the constraint rankings that Thai employs in dealing with codas.
From these rankings, it is possible to predict the pronunciation patterns of an L1 Thai ESL learner. This includes which sounds and structures will be difficult to produce, which strategies the speaker will prefer to use in speaking these words (deletion, epenthesis, substitution, etc.), which substitutions are likely to vary, and in the case of words like "script," which sound is likely to be deleted. After tasks were performed and data analyzed, the results matched the predictions generated within the framework of OT. In fact, Hancin-Bhatt was able to identify patterns of pronunciation improvement in the learners' L2 English, and to postulate different IL stages which clearly explain the data. These IL stages differ in that, as they become more advanced, the sets of constraint rankings are rearranged to conform to a more English-like L2 grammar. Essentially, these interlanguage stages are a function of the learners re-ranking coda constraints in their IL. This sort of analysis holds enormous potential for pedagogical application; one can imagine that a detailed account of constraint rankings of IL stages between L1 Spanish and L2 English would be a tremendous asset to many ESL teachers.
Another study which highlights the value of OT is the investigation of interdental substitution in ESL presented by Lombardi (2003). The English [θ] is a highly marked sound, and does not exist in many languages. Because of its difficulty, L2 English learners often substitute another sound for it. It is well attested that speakers of German and Japanese replace it with [s], while speakers of Russian and Thai replace it with [t]. The interesting question is why there is variation here, especially when these diverse L1s contain both [t] and [s]. Why would speakers of some languages choose [s] as a substitute, and others [t]?
In a rule-based phonology, according to Lombardi, this variation is "disturbing." Because [θ] does not exist in these other L1s, there could be no L1 rule that drives an interdental to change to anything else. An approach based solely on the concept of marked and unmarked universals would expect the interdental to be replaced with the stop [t], as stops are unmarked (they occur in 100% of known languages) and fricatives such as [s] are relatively more marked. In a rule-based framework, universals cannot be violated, yet we do see [s] as the substitute for L1 German and Japanese speakers. Furthermore, an approach based on similarity/dissimilarity would run into the same problems, as it would predict that a highly marked feature like [θ] may (1) be easier to learn, or (2) if substituted, be replaced consistently by speakers of all L1s that possess the stop/fricative distinction. Again the explanation fails. Other explanations which have been proposed involve very complex arguments working within controversial theories, such as L2-specific pruning within the framework of an underspecification theory. OT seems to be the only phonological framework in which this difference can be clearly and simply explained.
As stated, OT offers a simple solution: speakers who substitute [s] operate within ILs that rank faithfulness to manner as the highest constraint, while those who use [t] do so because their IL grammars place more priority on unmarked segments. Lombardi shows that languages which rank faithfulness to manner highly will view [s] as the optimal substitute. For speakers of these languages, a marked structure (fricative) is part of their L1 grammar. Phrased differently, this is a transfer of L1 rankings to the IL. Speakers from languages which possess no such explicit ranking will rely on the constraint rankings retained from the initial UG state, which would make the unmarked [t] the optimal choice.
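The [s] versus [t] split can be mimicked with a toy model in which violations are computed from segment features rather than listed by hand. The sketch below is a loose illustration of the reasoning, not Lombardi's actual analysis: the two-feature segment table and the constraint labels *Interdental, Ident(manner), and *Fricative are simplifying assumptions.

```python
# Toy model of interdental substitution. Features and constraint labels are
# simplified assumptions, not the constraint set used in Lombardi (2003).

SEGMENTS = {
    "θ": {"manner": "fricative", "interdental": True},
    "s": {"manner": "fricative", "interdental": False},
    "t": {"manner": "stop", "interdental": False},
}


def violations(inp, out, constraint):
    """Count violations of one constraint for the mapping inp -> out."""
    if constraint == "*Interdental":      # markedness: no interdentals in output
        return int(SEGMENTS[out]["interdental"])
    if constraint == "Ident(manner)":     # faithfulness: keep the input's manner
        return int(SEGMENTS[inp]["manner"] != SEGMENTS[out]["manner"])
    if constraint == "*Fricative":        # markedness: fricatives are marked
        return int(SEGMENTS[out]["manner"] == "fricative")
    raise ValueError(f"unknown constraint: {constraint}")


def substitute(inp, ranking):
    """Pick the output segment whose violation profile is best for this ranking."""
    return min(
        SEGMENTS,
        key=lambda out: tuple(violations(inp, out, c) for c in ranking),
    )


# Faithfulness to manner outranks the markedness of fricatives.
manner_faithful = ["*Interdental", "Ident(manner)", "*Fricative"]
# Markedness of fricatives outranks faithfulness to manner.
markedness_first = ["*Interdental", "*Fricative", "Ident(manner)"]

print(substitute("θ", manner_faithful))
print(substitute("θ", markedness_first))
```

With manner faithfulness ranked above the fricative markedness constraint, the fricative [s] is chosen; with the order reversed, the unmarked stop [t] is chosen. The two substitution patterns thus differ only in the relative order of two constraints.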
Optimality Theory is not restricted to phonology; it is useful and quite successful in the areas of morphology and syntax as well. It seems that earlier theories of language were too narrow to be comprehensive. They employed inviolable rules, rested on somewhat exclusive philosophical bases, and seemed to treat languages as frozen in time. Optimality Theory is different in that it does not take an “all or nothing” approach, but rather a “both/and” approach when describing constraint hierarchies, violations, and optimal language output. In the specific area of L2 phonology, OT works with the notions of UG, L1 transfer, interlanguage progressions, markedness scales and universals, and appears to be the most comprehensive and capable theory of L2 phonological acquisition at this time.
That's all folks. If you have read this far, and are still awake, I am truly impressed!