Using Words to Examine the McGurk Effect
Annie Rose H. Nicholson
Stephen F. Austin State University
In 1976, Harry McGurk and John MacDonald published "Hearing Lips and Seeing Voices," a paper that became a landmark study in the field of sensory integration. While studying infant perception at the University of Surrey in England, McGurk and MacDonald accidentally discovered an illusion that combines auditory and visual stimuli (American Scientist, 1998). This illusion has become known as the McGurk effect. McGurk and MacDonald found that when discrepant auditory and visual information was presented to a human subject, the stimuli were combined to produce a completely new response (McGurk & MacDonald, 1976). For example, when the visual stimulus /ga/ is presented with the auditory stimulus /ba/, the subject will perceive /da/. This illusion suggests that visual information is integrated into our perception of speech unconsciously and automatically (University of California at Riverside, 2001). It also suggests that neither auditory nor visual information is inherently more important than the other, but that the dominance of one over the other can be manipulated (Easton & Basala, 1982).
The limitations of the McGurk effect have been studied extensively since 1976. These studies have examined a wide range of circumstances under which the McGurk effect does and does not occur. Green and coworkers found that the McGurk effect occurred even when the auditory and visual stimuli were produced by speakers of different genders (Green, Kuhl, Meltzoff, & Stevens, 1991). A study done in Canada found that the McGurk effect was still apparent when the auditory stimuli lagged behind the visual stimuli by as much as 180 milliseconds (Munhall, Gribble, Sacco, & Ward, 1996). Any lag of more than 180 milliseconds between the auditory and visual stimuli caused a combination of the stimuli but not a completely new response. For example, the visual /ba/ and the auditory /da/ did not combine to form a new response like /ga/ but only partly combined to form /bda/ (Munhall, Gribble, Sacco, & Ward, 1996). Other studies have examined the auditory and visual stimuli themselves. Green and Gerdeman (1995) found that when the auditory and visual stimuli contained different vowels, the effect of the McGurk illusion decreased significantly. MacDonald and McGurk (1978) found that certain consonant combinations exhibited a greater effect than others. Consonants that are formed with different mouth shapes when spoken seem to have a greater influence on the McGurk effect than those that share the same mouth shape. Examples of consonant combinations that elicit a strong McGurk effect are /ba/ and /ga/ or /ba/ and /da/ (University of California Perceptual Science Lab, 1998). A study by Saldana and Rosenblum (1993) concluded that even when the facial images of the visual stimuli were blurred, the McGurk illusion was unaffected. It has even been found that prelinguistic infants exhibit the McGurk effect. Infants were habituated to an audiovisual presentation of /va/. The infants were then presented with two different dishabituation stimuli (audio /ba/ - visual /va/ and audio /da/ - visual /va/) that exhibit the McGurk effect in adults. The results suggested that the infants were drawn to the stimulus that exhibited the habituated /va/ (Rosenblum, Schmuckler, & Johnson, 1997).
The McGurk effect has been shown to occur under many different circumstances, but does it occur with words as opposed to sounds? Several studies have used words as stimuli, but they have reported conflicting results. A study done at Boston College concluded that the McGurk effect was not present when words were used (Easton & Basala, 1982). On the other hand, a study done at Dartmouth College reported that words do exhibit the McGurk effect (Dekle, Fowler, & Funnell, 1992). To follow up on the results of these previous studies, this experiment will investigate the effectiveness of words in the McGurk effect. Green and Gerdeman (1995) found that matching vowels in the auditory and visual stimuli caused a stronger McGurk effect than nonmatching vowel combinations. Based on these findings, the hypothesis is that by using the correct vowel combinations in the auditory and visual stimuli, the McGurk effect will occur with words as well as monosyllabic sounds. The subjects will be tested in four different treatment conditions that cross words versus monosyllabic sounds with matching versus nonmatching vowel stimuli. The dependent variable will be the accuracy with which the subjects identify the auditory stimuli and will be measured using an identification test. The subjects will be asked to record what they hear, not what they see (Easton & Basala, 1982). This is an attempt to decrease the influence of the subject's ability to lipread on their reporting accuracy.
Participants
At least 60 undergraduate students from Stephen F. Austin State University will participate for course credit in a psychology course. Subjects will be required to have no speech, language, or hearing problems and to have normal or corrected-to-normal vision.
Materials
The auditory and visual stimuli will be presented using a Sony 24-inch color television and videocassette recorder combination. Headphones will be available so that the subjects can listen to the auditory stimuli without interference. No other auditory devices will be used to amplify the sound. The stimuli will feature a female speaker with no previous experience who will be filmed (prior to the experiment) against a blank white background. The speaker will first be filmed reading an introductory set of instructions, which gives the subjects time to become comfortable and arrange themselves at their stations. The speaker will then present the set of stimuli for each level of the independent variables, with a small pause between each one. The speaker will be filmed using a personal camcorder and tripod. The same speaker will be used to record the discrepant auditory stimuli, which will then be dubbed onto the videocassette in synchrony with the visual stimuli.
Procedure
The subjects will enter the testing room and be asked to select one of the ten TV/VCR stations at which to complete the experiment. The consent form will be read aloud by the experimenter and then signed by the subjects. Copies of the consent form will be available for the subjects at the front of the room after the experiment. Subjects will then be read a set of instructions by the experimenter and asked if they have any questions before beginning. Subjects will be instructed to put on their headphones and begin the videotape by pushing play on the TV/VCR display. During the introductory instructions on the videotape, the subjects will be instructed to adjust their seats so that they are comfortable. Each subject will then complete each of the four treatment conditions using the identification test and writing utensil at his or her station. Each treatment condition will contain five auditory/visual combinations. Each combination will be repeated three times by the female speaker before she moves on to the next one. The combinations for each treatment condition are listed in Table 1. The monosyllabic sound combinations were taken from Green and Gerdeman's (1995) study on vowel discrepancies in audio-visual stimuli, in addition to those provided by the researcher. The word combinations were taken from Dekle, Fowler, and Funnell's (1992) study using words to examine the McGurk effect, again in addition to those provided by the researcher. The identification test will simply be a numbered sheet of paper with a space for the subjects to write the sound or word they heard for each combination. Upon completing the experiment, the participants will be read a debriefing form aloud. Copies of the debriefing form will also be available following the experiment. The test forms will be collected, the blue cards handed out, and the subjects excused.
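The structure of one testing session can be laid out as a short Python sketch. It only restates the counts given above (four treatment conditions, five combinations per condition, three repetitions per combination). The order in which the conditions appear on the tape is not specified in the procedure, so the order used here is an assumption, and all variable and function names are illustrative.

```python
from itertools import product

STIMULUS_TYPES = ("sound", "word")            # monosyllabic sound vs. word
VOWEL_MATCH = ("matching", "nonmatching")     # same vs. different vowels
COMBINATIONS_PER_CONDITION = 5
REPETITIONS_PER_COMBINATION = 3


def build_session():
    """Return one answer-sheet row per auditory/visual combination.

    The condition order is an assumption made for illustration only.
    """
    rows = []
    item = 1
    for stimulus_type, vowels in product(STIMULUS_TYPES, VOWEL_MATCH):
        for combination in range(1, COMBINATIONS_PER_CONDITION + 1):
            rows.append({
                "item": item,                      # line number on the answer sheet
                "stimulus_type": stimulus_type,
                "vowels": vowels,
                "combination": combination,
                "repetitions": REPETITIONS_PER_COMBINATION,
            })
            item += 1
    return rows


session = build_session()
total_items = len(session)                                  # written responses per subject
total_presentations = sum(r["repetitions"] for r in session)  # spoken presentations per subject
print(total_items, total_presentations)
```

Under these counts each subject writes one response for each of 20 combinations and sees 60 spoken presentations in total.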
Design
The experiment will use a 2 x 2 within-subjects design. The independent variables will be whether the stimulus is a monosyllabic sound or a word, and whether the vowels in the auditory and visual stimuli are the same. The dependent variable will be the accuracy with which the subjects are able to identify the discrepant stimuli and will be measured using an identification test.
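One way the identification test could be scored is sketched below in Python. It assumes that accuracy in a treatment condition is the proportion of written responses that match the auditory token, and that differences in letter case, surrounding spaces, and slashes are ignored. Both scoring rules are assumptions, and the tokens shown are placeholders rather than the actual Table 1 stimuli.

```python
def normalize(response):
    """Lower-case a written response and drop surrounding spaces and slashes."""
    return response.strip().strip("/").lower()


def accuracy(responses, auditory_targets):
    """Proportion of responses that match the auditory stimulus."""
    if len(responses) != len(auditory_targets):
        raise ValueError("one response is needed for every auditory target")
    correct = sum(
        normalize(r) == normalize(t)
        for r, t in zip(responses, auditory_targets)
    )
    return correct / len(auditory_targets)


# Placeholder tokens for one condition (five combinations), not real stimuli.
targets = ["ba", "ba", "ba", "ba", "ba"]
written = ["ba", "da", "Ba ", "ga", "da"]
score = accuracy(written, targets)
print(score)  # 0.4
```

Under this scoring rule, a response that departs from the auditory token (as a McGurk percept would) lowers the accuracy score for that condition.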
Before any conclusions can be drawn from the results of this experiment, certain analyses must be carried out to establish their significance. An ANOVA summary table for a completely within-subjects factorial design should be completed. Depending on the results of the ANOVA, which are predicted to be significant, further analysis of the data should follow. Main effects should be calculated, and these are also predicted to be significant. Main comparisons, simple comparisons, or simple effects will be calculated if the ANOVA yields significant results. Comparisons should be made on several levels to analyze the effectiveness of the matching vowel combinations. The first comparison should be between the matching-vowel sounds and the nonmatching-vowel sounds. The same comparison should also be made between the matching-vowel words and the nonmatching-vowel words. These two comparisons will allow a conclusion about the effect of matching vowel combinations on the McGurk effect. If the hypothesis is correct, there should be a significant difference between the scores for the matching and nonmatching vowel combinations in both the monosyllabic sounds and the words. Several explanations could be offered if the results of the comparison are not significant. One is that the vowels have very little to do with the McGurk effect. Another is that vowels as well as consonants should be taken into account, as was suggested by MacDonald and McGurk (1978). A further comparison should be made between the scores for the nonmatching sounds and the nonmatching words. Again, this same comparison should be made between the matching sounds and the matching words. This comparison would help to determine how strongly the words elicited the McGurk effect compared to the monosyllabic sounds in both matching and nonmatching vowel combinations. A significant result would suggest that words are able to elicit the McGurk effect effectively as compared with the already successful monosyllabic sounds. A nonsignificant result would suggest that, compared to sounds, words have no impact on the degree to which the McGurk effect is exhibited.
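The F ratios for a completely within-subjects 2 x 2 design can be sketched in Python using only the standard library. Because every effect in a 2 x 2 design has one degree of freedom, each F ratio can be computed from a per-subject contrast score as F(1, n - 1) = n * mean^2 / variance, which equals the square of a paired t statistic. The scores below are hypothetical counts of correct responses for four imaginary subjects, not data from this study, and the function names are illustrative. The sketch does not compute p values; those would come from an F table or a statistics package.

```python
from statistics import mean, variance


def contrast_f(scores):
    """F(1, n - 1) for a one-degree-of-freedom within-subjects contrast."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two subjects are needed")
    spread = variance(scores)  # sample variance of the contrast scores
    if spread == 0:
        raise ValueError("contrast scores have zero variance; F is undefined")
    return n * mean(scores) ** 2 / spread


def anova_2x2_within(subjects):
    """Main effects and interaction for a 2 x 2 within-subjects design.

    Each subject is a dict keyed by (stimulus_type, vowels).
    """
    stim, vowel, inter = [], [], []
    for s in subjects:
        sm = s[("sound", "matching")]
        sn = s[("sound", "nonmatching")]
        wm = s[("word", "matching")]
        wn = s[("word", "nonmatching")]
        stim.append((sm + sn) / 2 - (wm + wn) / 2)    # sounds vs. words
        vowel.append((sm + wm) / 2 - (sn + wn) / 2)   # matching vs. nonmatching
        inter.append((sm - sn) - (wm - wn))           # difference of differences
    return {
        "stimulus_type": contrast_f(stim),
        "vowel_match": contrast_f(vowel),
        "interaction": contrast_f(inter),
        "df": (1, len(subjects) - 1),
    }


# Hypothetical number-correct scores (out of 5), one row per subject.
keys = [("sound", "matching"), ("sound", "nonmatching"),
        ("word", "matching"), ("word", "nonmatching")]
raw = [(1, 3, 2, 4), (2, 3, 2, 5), (0, 4, 3, 4), (1, 2, 1, 3)]
subjects = [dict(zip(keys, row)) for row in raw]
result = anova_2x2_within(subjects)
print(result)
```

The simple comparisons described above (for example, matching versus nonmatching vowels within the monosyllabic sounds only) can be obtained the same way by passing the per-subject difference between the two relevant cells to contrast_f.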
Although many aspects of the McGurk effect have been studied, further research is needed in several areas. One of the least understood aspects of the McGurk effect is the actual process in the brain that leads to the combined perception of the auditory and visual stimuli. A clearer understanding of the sensory integration system could be obtained by investigating where and how the brain combines this information. Perhaps by using equipment such as an MRI or PET scanner, researchers may be able to localize brain activity while subjects are experiencing the McGurk effect. Using this information, we may begin to manipulate the responses. Another area that should be studied further is which consonant combinations elicit the most pronounced McGurk effect. By combining information from studies of vowel combinations with information from studies of consonant combinations, researchers could elicit the McGurk effect using the combinations that produce the strongest results.
The McGurk effect [Electronic version] (1998). American Scientist: The Magazine of Sigma Xi. Retrieved March 2, 2002 from www.sigmaxi.org
Dekle, D.J., Fowler, C.A., & Funnell, M.G. (1992). Audiovisual integration in perception of real words. Perception & Psychophysics. Vol 51(4), 355-362.
Easton, R.D. & Basala, M. (1982). Perceptual dominance during lip-reading. Perception & Psychophysics. Vol 32(6), 562-570.
Green, K.P. & Gerdeman, A. (1995). Cross-modal discrepancies in coarticulation and the integration of speech information: The McGurk effect with mismatched vowels. Journal of Experimental Psychology: Human Perception and Performance. Vol 21(6), 1409-1426.
Green, K.P., Kuhl, P.K., Meltzoff, A.N., & Stevens, E.B. (1991). Integrating speech information across talkers, gender, and sensory modality: Female faces and male voices in the McGurk effect. Perception & Psychophysics. Vol 50(6), 524-536.
MacDonald, J. & McGurk, H. (1978). Visual influences on speech perception processes. Perception & Psychophysics. Vol 24, 253-257.
McGurk, H. & MacDonald, J. (1976). Hearing lips and seeing voices: A new illusion. Nature. Vol 264, 746-748.
Munhall, K.G., Gribble, P., Sacco, L., & Ward, M. (1996). Temporal constraints on the McGurk effect. Perception & Psychophysics. Vol 58(3), 351-362.
Rosenblum, L.D., Schmuckler, M.A., & Johnson, J.A. (1997). The McGurk effect in infants. Perception & Psychophysics. Vol 59(3), 347-357.
Saldana, H.M. & Rosenblum, L.D. (1993). Visual influences on auditory pluck and bow judgments. Perception & Psychophysics. Vol 54(3), 406-416.
University of California Perceptual Science Lab (1998). BA + GA = DA. Retrieved March 2, 2002 from http://mambo.ucsc.edu/psl
University of California at Riverside Audiovisual Speech Web-Lab (2001). The McGurk effect. Retrieved March 2, 2002 from www.psych.ucr.edu/
Appendix
Table 1
Auditory/visual combinations for each treatment condition [table contents not recoverable from the source]