The Why Files

Music: The universal scale
7 JUNE 2007

Why are the notes of so many types of music found on the 12-tone "chromatic" scale? Researchers at Duke University believe they have found the answer, and it's rooted in something we do every day: Blab. Yak. Chat. Converse. Utter. Blurt. Spout. Rant. Declaim.


The frets on a guitar neck are placed to produce the smallest interval on the chromatic scale: a half-step.
Photo: The Why Files

Dale Purves, a professor of cognitive neurosciences at Duke, asked native speakers of English and Mandarin Chinese to pronounce vowels, and put the recordings through a sound spectrum analyzer -- a device that extracts the multiple frequencies of pressure waves that compose the sound of each vowel.
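To get a feel for what a spectrum analyzer does, here is a minimal Python sketch that pulls the strongest frequencies out of a sound. It is a toy, not the Duke team's method: the signal is a made-up mix of two pure tones (100 and 300 hertz) rather than a recorded vowel, the function names are our own, and real speech analysis uses faster and more refined techniques than this plain discrete Fourier transform.

```python
import cmath
import math


def magnitude_spectrum(samples, sample_rate):
    """Naive discrete Fourier transform: list of (frequency_hz, magnitude)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2):  # keep only frequencies below half the sample rate
        total = sum(
            s * cmath.exp(-2j * math.pi * k * t / n)
            for t, s in enumerate(samples)
        )
        spectrum.append((k * sample_rate / n, abs(total) / n))
    return spectrum


def strongest_frequencies(samples, sample_rate, count):
    """Return the `count` frequencies with the largest magnitude, low to high."""
    spectrum = magnitude_spectrum(samples, sample_rate)
    top = sorted(spectrum, key=lambda pair: pair[1], reverse=True)[:count]
    return sorted(freq for freq, _ in top)


# Hypothetical test signal: a 100 Hz tone plus a quieter 300 Hz tone,
# sampled 1,000 times per second for 0.2 seconds.
sample_rate = 1000
samples = [
    math.sin(2 * math.pi * 100 * t / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 300 * t / sample_rate)
    for t in range(200)
]

print(strongest_frequencies(samples, sample_rate, 2))  # [100.0, 300.0]
```

The analyzer recovers the two tones that went into the mix. A spoken vowel is far messier, but the principle is the same: one complicated pressure wave is broken into the separate frequencies that compose it.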

When the researchers looked at the fundamental (lowest) tone, they saw no relationship to the chromatic scale.

It was a different story when they focused on the interval between the most powerful pulses of speech sound, called the formants. In "about 70 percent of the vowel sounds, these ratios were bang-on to musical intervals," Purves said.
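What does it mean for a ratio to be "bang-on" to a musical interval? One way to picture it is the short Python sketch below, which takes two frequencies and finds the nearest interval on the chromatic scale. Several things here are our assumptions, not details from the study: the scale is the equal-tempered one, where each half-step multiplies frequency by the twelfth root of two; the 25-cent tolerance (a quarter of a half-step) is arbitrary; and the 500 and 1,500 hertz figures are invented for illustration, not measured formants.

```python
import math

# Names of the 12 chromatic intervals within one octave (index = half-steps).
INTERVAL_NAMES = [
    "unison", "minor second", "major second", "minor third", "major third",
    "perfect fourth", "tritone", "perfect fifth", "minor sixth",
    "major sixth", "minor seventh", "major seventh",
]


def nearest_chromatic_interval(f_low, f_high):
    """Return (half_steps, cents_off) for the ratio f_high / f_low.

    Assumes equal temperament: one half-step is a ratio of 2 ** (1 / 12),
    and one half-step is divided into 100 cents.
    """
    half_steps_exact = 12 * math.log2(f_high / f_low)
    half_steps = round(half_steps_exact)
    cents_off = 100 * (half_steps_exact - half_steps)
    return half_steps, cents_off


def describe(f_low, f_high, tolerance_cents=25):
    """Return (interval name, whole octaves above it, within tolerance?)."""
    half_steps, cents_off = nearest_chromatic_interval(f_low, f_high)
    octaves, step = divmod(half_steps, 12)
    return INTERVAL_NAMES[step], octaves, abs(cents_off) <= tolerance_cents


# Hypothetical pair of peaks at 500 Hz and 1,500 Hz (a ratio of 3 to 1).
print(describe(500, 1500))  # ('perfect fifth', 1, True)
```

A 3-to-1 ratio lands within about two cents of an octave plus a perfect fifth, so this made-up pair counts as a musical interval. A pair that falls well between two half-steps would come back with False in the last slot.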

Formants exist because most sounds are complex mixes of a fundamental -- the lowest tone -- and various higher-pitched tones, which add richness and complexity to the sound. Formants are natural resonances of the vocal tract, and understanding vowels depends on these changing resonances of the tract as we speak.

Our ears detect these resonances, but the brain does not consciously register them, Purves says. Nevertheless, he says there is a relationship between our familiarity with the formants of speech and the preference for musical scales built around similar spacings. "This predominance of musical intervals hidden in speech suggests that the chromatic scale notes in music sound right to our ears because they match the formant ratios we are exposed to all the time in speech, even though we are quite unaware of this exposure."

The recent study, Purves stresses, focuses on the intervals, not the starting tone, or fundamental, which is produced by the vocal cords. "The vocal cords are being modulated all the time, to produce whatever fundamental we want. The musical relationships are not in the fundamental but in the intervals."

Human speech may be the model for music -- or vice versa. You can count on this: Most speakers are not as brassy as the tenor saxophone!

Sarah Vaughan called her voice her "instrument"
And maybe she knew what she was talking (or singing) about. Purves compares the vocal tract to a musical instrument, which also produces and modifies sound. The larynx, or vocal cords, "produce a stream of sound energy that is like what you would get from the reed of a clarinet, a specified, back-forth pulse of sound energy. The rest of the wind instrument is taking that sound and modifying it; that's exactly what happens in the vocal tract."

Formants, he stresses, arise not from the vocal cords, but from modifications that the rest of the vocal tract imposes on the sound. Purves and colleagues Deborah Ross and Jonathan Choi studied vowels, because they -- but not consonants -- contain the periodic impulses that define tones in either a voice or a musical instrument.

Curiously, however, we do not actually hear the formants, even though they are necessary for recognizing vowels. "They are there, but you do not perceive them," says Purves. "You hear A, E, I, O or U." Neuroscientists have long known that the brain does not register exactly what the ear or eye detects, he adds. "You perceive something that is useful biologically, you do not perceive the physical information as such."

While the paper makes a tight link between the spacing between the sonic peaks in speech and the chromatic scale, it does not solve the problem of which came first, the fried-chicken-or-the-fried-egg: the music of Justin Timberlake or human gossip about the neighborhood scandal.

Peaks and valleys on a graph show similar jumps in speech and music.
Most of the intervals found in human speech (red) corresponded to those on the 12-tone ("chromatic") musical scale. Black bars represent intervals that were not on the chromatic scale. Graph represents summary of recent experiments. Courtesy Dale Purves.

"The paper is not directed at answering that question," says Purves. "It's directed toward understanding the relationship of speech and music." Because nobody was running a tape recorder when cave-people first picked up the electric guitar and began crooning about lost love, "you could argue both ways," he says. "You could say these intervals were in pre-lingual sounds, the cries, grunts, expression of affection, and so music came first. Or you could argue that the intervals depend pretty much on the range of modern vowels, and that music came second. There is no way to answer that."

-- David Tenenbaum

• Musical intervals in speech, Deborah Ross, Jonathan Choi, and Dale Purves, doi:10.1073/pnas.0703140104, PNAS published online May 24, 2007.

©2021, University of Wisconsin, Board of Regents.