Auditory Perception of English Minimal Pairs
Hui-Li Lin1, Hsing-Wu Chang2 and Hintat Cheung3

1Department of Psychology, National Taiwan University

2Department of Psychology, National Taiwan University

3Graduate Institute of Linguistics, National Taiwan University

Auditory perception of English minimal pairs was tested with and without background noise. Each subject was interviewed after the test to collect information about their early experience of learning English as a foreign language. Results showed that the age effect was salient only when background noise was present. Without the interference of noise, most subjects performed well enough to mask any potential differences.


One of the most controversial issues in second language acquisition theory concerns the critical period hypothesis. For first language acquisition, a great deal of evidence supports the critical period concept, but the relevance of this hypothesis to second language acquisition remains inconclusive (Harley & Wang, 1997). Empirical studies of the hypothesis differ in their focus. A recent meta-study by Marinova-Todd, Marshall and Snow (2000) pointed out misconceptions about age effects in second language acquisition. However, a very recent empirical study (DeKeyser, 2000) again found robust age effects in second language acquisition. By testing how adult learners’ grammatical abilities correlated with their problem-solving capabilities, instead of with their starting age of second language learning, this study replicated the findings of Johnson and Newport (1989).

Studies of infant speech perception, however, have revealed that infants’ ability to discriminate non-native phonemic contrasts decreases at around the end of their first half-year of life (Werker & Tees, 1984). In addition, most adults were found to have difficulty differentiating non-native phonetic contrasts. The developmental change in speech perception thus requires more intensive investigation in order to account for facts such as the better initial gains of older second/foreign language learners, which seem to contradict the results of infant studies.

This exploratory study had two purposes. First, at the theoretical level, it was designed to weigh the differential effects of learning English from three starting-age points and over two learning durations. Second, at the educational and practical level, it was designed to examine how childhood experience of learning English has affected Taiwanese university students’ auditory competence in distinguishing English minimal pairs.



Sixty-six Introductory Psychology students (33 females; 33 males) at National Taiwan University (NTU) volunteered for the experiment. Subjects’ ages ranged from 17 to 24 years, with a mean of 20.30 years and a standard deviation of 1.24 years. All subjects were native speakers of Mandarin, Taiwanese or Hakka; none was a native speaker of English.

Forty English minimal pairs (80 words in total) were selected as stimuli for the study. All words were among the 3,000 most frequently used (spoken or written) English words. A male English speaker with an American Midwest accent was recruited to record the words. Sound intensity of the stimuli ranged from 68 dB to 77 dB, with an average of 72.25 dB. The subjects’ task was to identify the odd member of a trio of words. Each pair appeared six times: for a pair containing A and B, the six combinations were AAB, ABA, BAA, ABB, BAB and BBA. The total number of trials was therefore 240, and the trials were presented in a different random order for each subject. Within each trial, the three words were played one by one. At the end of the third word, the computer began timing the reaction, in seconds, until the subject pressed “1”, “2” or “3” on the keyboard. One point was added to the total score if the answer matched the position of the odd word (e.g. in AAB, the correct answer was 3). The between-trial interval was 1 second.
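The trial structure described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the software used in the experiment: the function names and the two example word pairs are hypothetical, since the actual stimulus list is not given in the text.

```python
import random

def build_trials(pairs, seed=None):
    """Return a shuffled list of (trio, odd_position) covering every pair."""
    trials = []
    for a, b in pairs:
        # Each trio holds two copies of one word and one copy of the other.
        for same, odd in ((a, b), (b, a)):
            for odd_index in range(3):
                trio = [same] * 3
                trio[odd_index] = odd
                trials.append((tuple(trio), odd_index + 1))  # positions are 1-based
    random.Random(seed).shuffle(trials)  # a fresh random order per subject
    return trials

def score(trials, responses):
    """One point for each response that matches the position of the odd word."""
    return sum(1 for (_, odd_pos), r in zip(trials, responses) if r == odd_pos)

# Hypothetical minimal pairs, for illustration only.
example_pairs = [("ship", "sheep"), ("bat", "pat")]
example_trials = build_trials(example_pairs, seed=0)
print(len(example_trials))  # 6 trials per pair
```

With 40 pairs the same function yields the 240 trials used in the test, and a subject who always picks the odd position scores 240.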
Every subject was given verbal instructions before taking the test. Examples were also given to ensure that subjects fully understood the testing requirements. After the 240-trial testing procedure, which usually took around 15 minutes, an interview was conducted to collect information on subjects’ English learning experiences and other related matters. During the interview, subjects were asked to recall (1) at what age they were first exposed to the 26 letters of the English alphabet, formally or informally (EXPOSURE); (2) at what age they were first taught English formally (STRTFRML); (3) how many years of formal classroom English instruction they had received (FORMAL); (4) at what age they were first taught English in an English-language immersion setting (INVIS; the number was inverted because some subjects had never had any immersion experience, and their INVIS scores were therefore zero); (5) how many years in total they had been taught English in immersion settings before college (IMMYR); and (6) whether they had been taught by native English speakers (NS).


Auditory test scores (i.e. correct identifications) of the 66 subjects ranged from 199 to 238 (out of 240), with a mean of 221.50 and a standard deviation of 9.40. Correlation analysis shows that subjects’ age is significantly correlated with EXPOSURE (r=0.33, N=66, p<0.01), indicating that in Taiwan English is being learned as a foreign language not only at an increasing rate but also at a younger age. Auditory test scores are significantly correlated with STRTFRML (r=-0.32, N=66; p<0.01) and fairly correlated with EXPOSURE (0.01<p<0.05), IMMYR, INVIS and NS. IMMYR is barely correlated with test scores (r=0.22, N=66; p=0.08). Reaction time (RT) and reaction time on correctly answered trials (CRRT) were not correlated with scores.
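The significance levels quoted for these correlations can be checked from r and N alone. The sketch below assumes the standard two-tailed t-test for a Pearson correlation, t = r·sqrt((N−2)/(1−r²)) with N−2 degrees of freedom; the text does not state which test was actually used, and the function names are illustrative. The t distribution is integrated numerically so that only the standard library is needed.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def pearson_p_value(r, n, steps=2000):
    """Two-tailed p-value for H0: rho = 0 (requires |r| < 1 and n > 2)."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule (steps must be even) for the area between 0 and |t|.
    h = t / steps
    area = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    return max(0.0, 2 * (0.5 - area))

print(pearson_p_value(0.33, 66))   # age vs. EXPOSURE
print(pearson_p_value(0.22, 66))   # scores vs. IMMYR
```

Under this assumption, r=0.33 with N=66 falls below the 0.01 level, while r=0.22 falls between 0.05 and 0.10, in line with the values reported above.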

The results also show that the predictors involving early English learning experiences are highly correlated among themselves. Therefore, a forced-entry regression was conducted, entering the predictors by block. The whole model is not statistically significant (F(9,56)=1.57, p=0.15); only around 20% of the variance can be accounted for by the nine variables together, and the R-square change values indicate that only 7% of the variance is uniquely contributed by the three starting points combined. These seemingly low correlations might have been caused by the narrow distribution of scores, which was also skewed to the left. Therefore, in Experiment 2, a noise background was superimposed to increase the overall difficulty of the task.
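The link between the overall F statistic and the proportion of variance explained can be made explicit. The sketch below assumes the usual overall F-test for an ordinary least-squares model with k predictors and n subjects, F = (R²/k) / ((1−R²)/(n−k−1)); the function names are illustrative.

```python
def f_from_r2(r2, k, n):
    """Overall F statistic for a model with k predictors fitted to n cases."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

def r2_from_f(f, k, n):
    """Invert the relation above to recover R-square from a reported F."""
    df2 = n - k - 1
    return f * k / (f * k + df2)

# F(9,56)=1.57 with 66 subjects and nine predictors.
print(round(r2_from_f(1.57, 9, 66), 3))
```

Under that assumption, the reported F(9,56)=1.57 corresponds to an R-square of about 0.20, which agrees with the "around 20%" stated above.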



Sixty-six Introductory Psychology students (35 females; 31 males) at NTU volunteered for the experiment. Subjects’ ages ranged from 18 to 24 years, with a mean of 20.45 years and a standard deviation of 1.33 years. All subjects were native speakers of Mandarin, Taiwanese or Hakka; none was a native speaker of English.

All materials and procedures were identical to those of Experiment 1, except that white noise with frequencies from around 40 to 220 Hz, generated with Cool Edit 2000 software, was tape-recorded and superimposed through the same two stereo speakers. The noise was played by an AIWA stereo Walkman connected to the speakers. The amplitude of the noise was tested during a pilot study and set at 83 dB so that the mean score would fall in an appropriate range (i.e. not too close to 80, the score expected from pure guessing, and lower than the mean score in the no-noise condition).
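The chance level of 80 follows directly from the design: 240 trials with three response options each. A minimal sketch, assuming independent trials and uniformly random guessing (the standard deviation is an added illustration and is not reported in the text):

```python
import math

def chance_score(n_trials, n_options):
    """Mean and standard deviation of the score under pure guessing."""
    p = 1 / n_options
    mean = n_trials * p
    sd = math.sqrt(n_trials * p * (1 - p))  # binomial standard deviation
    return mean, sd

mean, sd = chance_score(240, 3)
print(mean, round(sd, 2))
```

Under these assumptions a guessing subject would be expected to score about 80, with a standard deviation of roughly 7 points.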

The overall correlation pattern differs from that of Experiment 1. Although scores are still significantly correlated with EXPOSURE (r=-0.36, n=66; p<0.01), FORMAL (r=0.26, n=66; p<0.05) and STRTFRML (r=-0.38, n=66; p<0.01), the correlations between correct responses and IMMYR (r=0.08, n=66; p>0.05), INVIS (r=0.07, n=66; p>0.05) and NS (r=0.16, n=66; p>0.05) are no longer statistically significant. RT and CRRT were not correlated with scores in Experiment 1, but their correlations with scores are marginally significant in this experiment (RT: r=-0.24, p=.05; CRRT: r=-0.23, p=.07).
To better understand the relationships between EXPOSURE and scores and between STRTFRML and scores, two partial correlation coefficients were calculated with FORMAL controlled. FORMAL is usually highly correlated with the two starting points, EXPOSURE and STRTFRML: people who were exposed to English, or received formal English instruction, at an earlier age have also spent more years learning English in formal settings. In Taiwan, teaching English to children before junior high school is not mandatory in schools, but English classes for children are available at private institutes. The data collected in our interviews confirm this pattern. Both partial correlation coefficients are significant: r between scores and EXPOSURE with FORMAL controlled is -0.28, p=0.026; r between scores and STRTFRML with FORMAL controlled is -0.29, p=0.021. The results indicate that the two starting points are, by themselves, statistically valid predictors of auditory perception scores for English minimal pairs.
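For readers unfamiliar with partial correlation, the first-order coefficient can be computed from three pairwise correlations. The sketch below uses the standard formula; the correlation between EXPOSURE and FORMAL is not reported in the text, so the value used here is a hypothetical placeholder and the printed result should not be read as a reproduction of the reported coefficient.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with the linear effect of z removed from both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

r_scores_exposure = -0.36    # reported in the text
r_scores_formal = 0.26       # reported in the text
r_exposure_formal = -0.40    # HYPOTHETICAL placeholder, not reported

print(partial_r(r_scores_exposure, r_scores_formal, r_exposure_formal))
```

The significance of a first-order partial correlation is conventionally tested with n−3 degrees of freedom, one fewer than for a simple correlation.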

The forced-entry regression results converge with the partial correlation findings. We entered the variables in five blocks. Two sets of regressions were performed separately, with the first three blocks (containing Native Language, Sex, Age, NS, IMMYR and INVIS) entered in the same order and the order of the last two blocks, (1) STRTFRML and EXPOSURE and (2) FORMAL, switched. Although the overall F is not significant (F(9,56)=1.549, p=0.139), probably because the sample is small relative to the number of predictors and the number of levels of some predictors, this analysis still provides important information. The first three blocks together accounted for only 3% of the response variation, and FORMAL alone accounted for 1% of the variation when entered last. In contrast, EXPOSURE and STRTFRML as a block, when entered last, independently accounted for more than 10% of the variation. Both the correlation and the forced-entry regression analyses reveal that immersion experiences (both duration and starting point) are no longer statistically valid predictors under the noisy testing condition.
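The contribution of a block entered last is usually assessed with an F-test on the change in R-square. The sketch below shows that standard formula only; it is not taken from the authors' analysis, the function name is illustrative, and the input values in the usage line are placeholders (the text gives the block's contribution only as "above 10%").

```python
def f_change(delta_r2, m, r2_full, k, n):
    """F statistic for the R-square gained by adding m predictors last.

    r2_full is the R-square of the full model with k predictors and n cases;
    the statistic has (m, n - k - 1) degrees of freedom.
    """
    return (delta_r2 / m) / ((1 - r2_full) / (n - k - 1))

# Placeholder inputs: a two-predictor block adding 0.10 to a full-model
# R-square of 0.20, with nine predictors and 66 subjects.
print(f_change(0.10, 2, 0.20, 9, 66))
```

Because the block's increment is measured after all other predictors are already in the model, it reflects only the variance that the block explains uniquely, which is why the order of the last two blocks was switched between the two regression sets.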

To confirm that the two samples tested under the noise and no-noise conditions were homogeneous, all the independent variables were compared across the two groups. No significant differences were found, so the response patterns under the two testing modes can be compared safely. The most obvious contrast between the two conditions is that early learning experience became an effective predictor of the correct response rate only under the noise condition. This finding broadly agrees with the SPIN experiment of Mayo and colleagues (1997) in terms of how speech perception in noise relates to late and early learning. In our study, under the quiet testing mode, almost all the predictors concerning early learning experience were significantly correlated with test scores, even after age-related confounding variables were controlled. Under the noise condition, however, only the two starting points (EXPOSURE and STRTFRML) remained significantly, and even more strongly, correlated with the scores. Apart from FORMAL, all the other independent variables lost their predictive power.

As infants grow older and gain more experience of their ambient languages, their tendency to react to non-phonemic contrasts declines (Wode, 1994). Theoretically, the reorganization of their phonological system can proceed either in an all-or-none or in a gradient fashion. What we have found in this study is that speech perception may not be an all-or-none ability. Under the noisy testing condition, the earlier people were exposed to English or began learning English in formal settings, the more likely they were to perform better than late starters. That is, early starters’ ability to access their non-native speech perception system is better than late starters’, but this pure effect of age cannot be measured with some tasks, for example our no-noise testing condition. Under the noise condition, acoustic information was not as salient as it was without the noise. Without the noise, subjects might be able to use all kinds of acoustic information provided by the speech input, even if they had no immersion experience or were late starters. At the practical level, the performance of NTU students on our task reflects an age effect, especially under the noise condition. However, starting early does not guarantee a better ultimate gain for every individual.

Our findings suggest that further research is needed in the following directions: (1) examining the relationship between our perceptual test scores and other English language tasks, such as Johnson and Newport’s (1989) syntactic test; (2) designing interviews tailored to our current data in order to collect more precise information on the early English learning experiences that might cause performance differences on the task; (3) examining the relationship between different instruction methods and the corresponding learning patterns.


DeKeyser, R. M. (2000). The robustness of critical period effects in second language acquisition. Studies in Second Language Acquisition, 22, 499-533.

Harley, B., & Wang, W. (1997). The critical period hypothesis: Where are we now? In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 19-51). Mahwah, New Jersey: Lawrence Erlbaum Associates, Publishers.

Johnson, J. S., & Newport, E. L. (1989). Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cognitive Psychology, 21, 60-99.

Marinova-Todd, S. H., Marshall, D. B., & Snow, C. E. (2000). Three misconceptions about age and L2 learning. TESOL Quarterly, 34, 9-34.

Mayo, L. H., Florentine, M., & Buus, S. (1997). Age of second-language acquisition and perception of speech in noise. Journal of Speech, Language, and Hearing Research, 40, 686-693.

Werker, J. F., & Tees, R. C. (1984). Cross-language speech perception: Evidence for perceptual reorganization during the first year of life. Infant Behavior and Development, 7, 49-63.

Wode, H. (1994). Nature, nurture, and age in language acquisition: The case of speech perception. Studies in Second Language Acquisition, 16, 325-345.