Some years ago, I was invited by my then boss, Jann Wenner, the owner of Rolling Stone, to be the lead singer in a band he was putting together from the magazine’s staff. I had just turned 41, and I jumped at the opportunity to sustain the delusion that I was not getting old. “Sign me up!” I said.
My chief attributes as a singer included impressive volume and an ability to stay more or less in tune, but I was strictly a self-taught amateur. I had, for instance, never done a proper voice warmup, and had certainly never been informed that the delicate layers of vibratory tissue, muscle and mucous membrane that make up the vocal cords are as prone to injury as a middle-aged knee joint. So, on practice days, I simply rose from my desk (I was finishing a book on deadline and spent eight hours a day writing, in complete silence), rode the subway to our rehearsal space in downtown Manhattan, took my place behind the microphone and started wailing over my bandmates’ cranked-up guitars and drums.
The folly of this approach became clear to me a few weeks into rehearsals when J Geils Band frontman Peter Wolf, whom Jann had enlisted to perform a song, pulled me aside. “You don’t have to sing full out in rehearsal, man,” he said. “Save something for the show.” I followed his advice, but by then my voice had taken on a pronounced rasp. I wasn’t concerned. I had suffered hoarseness in the past and it had cleared up. Plus, a little vocal raggedness is never out of place in rock’n’roll. Also, and perhaps most importantly, I felt no discomfort – so how could I have hurt my throat?
I continued attending twice-weekly rehearsals and soon reverted to my old ways – actually singing harder, trying to put some of the old volume back into my voice, which was sounding weirdly dampened. I was also finding it difficult suddenly to hit high notes, like the F above middle C in the Stones’ song Miss You (“Ohhhhhh, why’d you have to wait so long?”). Reaching for it, my voice would break up into a toneless rattle, or vanish altogether. This began to concern me as the days ticked down to our gig – a holiday party at a downtown dance club, to which Jann had invited 2,000 of his closest friends, including a constellation of celebrities.
Singing is as psychological as it is physical. Stress attacks the vocal apparatus, tightening muscles that should remain loose and pliable, restricting breathing, closing off the throat, paralysing the tongue and lips. I was experiencing all of these symptoms as I took my place, centre stage, in the glare of the lights, and began our opening number, the Beatles’ song I’ll Cry Instead, originally sung by John Lennon. It would seem a little on the nose to suggest that Yoko, along with her and John’s son, Sean, were looking up at me from the front row, except they were.
Today, I can barely bring myself to listen to the CD of that concert, which Jann later presented to each band member as a memento. I wince at the tentative way I sing that “Ohhhhh” in Miss You, sneaking up on the note from below, sliding into it gingerly. I get there, sort of. But at what cost? By the end of the night, I was growling the lyrics to White Room like it was a Tom Waits number.
A three-day bout of laryngitis followed. Then I began speaking in a parched whisper. This eventually “improved” to a torn-sounding rumble. Three months after the gig, I was still speaking as if my words were being stirred through gravel. But I was determined to believe the problem would clear up – until an alarming encounter in the building into which I had just moved with my wife and infant son. Holding open the elevator door for one of my new neighbours, a smiling blond woman, I pointed at the buttons and asked, “What floor?” Her smile vanished.
“You’ve got a serious voice injury,” she said. I demurred, but she cut me off, saying that she was a voice coach who worked with Broadway singers and actors. And she said that she could see, in my neck, the compensatory muscle movements I was making as I spoke. I was, she told me, straining the tendons, pressing them in against my voice box (or larynx), in a bid to compress my vocal cords and help them create sound. “I bet your neck gets pretty sore,” she said.
In fact, for weeks I’d been enduring a peculiar sensation in my neck, as if I had scalded the skin. “You’re no doubt straining other muscles, too,” she went on. “We use our whole body to sing, and also to talk. Abdominals. Hip flexors. Shoulders. Back. With an injury like yours, you’re working harder with all of them. You must be pretty tired by the end of the day.” I had been attributing the strange, bone-deep exhaustion that afflicted me every evening to the stresses of new parenthood and finishing my book. Not the muscular effort of speaking.
She invited me to drop by her apartment, anytime. She could show me some simple relaxation exercises that would help with the immediate symptoms. I hate presuming on neighbours, and knew that I would never avail myself of this kind offer. My wife shrugged and said: “At the very least, you ought to see a laryngologist, just in case it’s … something else.”
This caught my attention. I grew up in a medical family and was familiar with the euphemism “something else”. She meant a growth. A malignancy. This had never occurred to me. My rasp was so clearly the result of singing with Jann’s band – or was it?
The next day, I arrived at Mount Sinai hospital. I had an appointment with Dr Peak Woo, chief of laryngology, a subspecialty of ear, nose and throat medicine that focuses on the vocal cords. Woo was a soft-spoken man in his late 40s with a kindly bedside manner. He guided down my throat a laryngoscope, a tool that looked like the curved spray attachment on a garden hose, with a small light affixed to the end. On a nearby computer screen, the live image of my throat was broadcast, a wet red tunnel at the bottom of which sat my vocal cords: two symmetrical, fleshy, pearly-pink membranes stretched like a pair of lips across the opening of my windpipe.
Woo pointed to the screen, which held a photo of my vocal cords in the open position. It was not, he said, a malignancy. The edge of the left cord was ruler-straight. On the margin of the right cord was a small bump. A tumour would be lumpy, asymmetrical. My vocal mass was smooth and regular, as if a tiny pea had been inserted under the semitransparent mucous membrane: a textbook polyp, wholly consistent with my history of over-singing. I had broken a blood vessel in that vocal cord, and the unchecked bleeding had created the bump of scar tissue that was interfering with the vocal cord’s normal, fluid, rippling action. Sweet, pure singing voices are partly the result of vocal cords with clean straight edges that meet flush across the opening of the windpipe as they vibrate. Mine did not, and this is what produced the rasps and rattles and rumbles in my voice.
I asked if he might just snip the offending polyp off in a quick outpatient procedure. Hardly. To have the thing removed, I would need to check into the hospital for several days to undergo surgery, which would require not only a general anaesthetic but a special paralytic agent to render me completely immobile – a crucial consideration given the extreme fragility of the vocal cords and the permanent injury to the voice that can result from removing even a micrometer too much healthy tissue.
I left his office with a prescription for a medication to take in the days before the operation. Scheduling the procedure was up to me. He told me to call when I was ready. I never called.
Why? The usual excuses – no time, too expensive, too risky and six weeks of strict postoperative vocal silence. Who could afford to stop talking for six weeks? Like most people, I took for granted the sounds that emerged from between my lips, thinking, as long as I’m getting the words out and being understood, my voice is fine. Which is not to say that I wasn’t self-conscious about my rasp.
Speaking on the phone, which always heightened my awareness of my damaged voice, I often worried that I was conjuring in the brain of my invisible interlocutor the image of a thuggish underworld heavy – a particular concern if I was trying to get a potentially delicate journalistic source to trust me. There was also the inconvenience of disabusing friends who mistook my rattle as a symptom of the flu. But for all these annoyances and discomforts, I was not (I told myself) disabled. I could converse. I could work. By these lights, the surgery was not necessary.
I did, however, take certain measures to preserve what remained of my voice. I concentrated on relaxing my neck, stopped pushing my voice out with an extra effort of my abdominals. This tended to reduce my volume – or “projection” – but it also eliminated the scalding neck pain and overall exhaustion. I also learned, by unconscious trial and error, to lower my pitch, which seemed to smooth my tone a little.
Over time, I was even able to convince myself that the problem had cleared up – a state of denial I sustained for over a decade, until one day in late 2012, when I started working on a new article for the New Yorker magazine.
It was about a vocal surgeon, Steven Zeitels, who worked at Massachusetts general hospital in Boston. Since the mid-90s, Zeitels had ministered to an array of popular singers – Steven Tyler, Cher, James Taylor – as well as famous TV and radio broadcasters, opera stars, Broadway belters and actors. A few months earlier, he had successfully operated on the British singer-songwriter Adele, removing a vocal polyp that had threatened to end her career. She had thanked him from the stage when collecting several Grammy awards.
When I called Zeitels to ask if he might be willing to cooperate with a story, I hadn’t even finished my pitch before he interrupted me, saying: “It sounds like you’re dealing with a pretty significant vocal issue yourself.”
Brought up short, I stammered something about having experienced “a little vocal strain” some time ago, and changed the subject. But I could not staunch his clinical curiosity. When I visited Zeitels for our first set of interviews, he insisted on “looking at” my throat.
Like Woo, Zeitels peered into my throat with a laryngoscope; he, too, left an image of my vocal cords up on his computer screen. Even to my untrained eye, the mass looked far bigger than in the photo taken more than a decade earlier by Dr Woo. Zeitels was certainly impressed. “You couldn’t possibly sing with something this big,” he said. “It’s mechanically impossible.” He was right about that.
The few times I’d tried, my voice shut down, went off-pitch – and the extra exertion of driving air past my burdened vocal cord would force me to reload my lungs at an abnormally fast rate, making my phrasing choppy (good singers time their intakes of breath around natural pauses in a song’s lyrics), causing me to hyperventilate and grow light-headed. Little wonder that I had not sung publicly since Jann’s party, and no longer sang even in private, around the house. Too exhausting. Too depressing.
Zeitels let me know, however, that my singing was not the primary issue. There was also the question of my speaking voice. Yes, I could still talk, he said, but my altered voice was affecting my life in ways that I was not acknowledging. “Here’s the way to understand your speaking voice,” he said. “You’re grossly hoarse. People might say, ‘Well, his voice isn’t that bad.’ No. Your voice is actually pretty bad. Your right vocal cord – the one with the polyp – has a severely impaired elastic dynamic capability. You’re working at 3 or 4% of normal.”
Consequently, he said, I had done what many people with my injury do: I had developed strategies for, as he put it, “speaking around the problem” – retraining my recurrent laryngeal nerve (the nerve that, among other things, controls the tension on the vocal cords) to drop the pitch of my voice, slackening my freighted vocal membrane so that the 3 or 4% that was still pliable would vibrate. This reduced the rattle in my voice, but at a cost. It was robbing me of the natural variation in pitch and volume that people use to give colour, animation, expression and personality to their utterances – what linguists call prosody, the melody of everyday speech.
Through prosody, we express tenderness, or anger, or enthusiasm, or any number of other nuanced emotional states that give the human voice its peculiar power to woo, persuade, threaten, cajole and mollify. Prosody makes the difference between the affectless utterances of HAL, the computer in 2001, and the rich and expressive instrument of Morgan Freeman or Meryl Streep – or even just the lilting, songlike way you say “Hello” when you answer the phone, so your caller doesn’t think you’re a machine. The term comes from the ancient Greek: pros, meaning “toward”, and ody, meaning “song”. We speak toward song. Except I didn’t any more, according to Zeitels.
“You’re behaving through a veil of monotone,” he went on. “When you talk, you can’t express emotion properly. You can’t change pitch, can’t get loud, can’t do the normal things that a voice does to express how you feel.”
This hit me hard. I had not been consciously aware of these changes; but now that he pointed them out, I had to acknowledge that my range of expression had indeed diminished. Though I could still drive my voice through the basic melodic shifts necessary to make my emotional state more or less known, it had become burdensome to do so – too much expressive talking still left me pretty wiped out at the end of a day – and my voice was by no means the precision instrument it had once been. More cudgel than scalpel, it would, when imbuing a word or syllable with special emphasis (“He said what?”), often break up, or cut out, altogether.
But that wasn’t the worst of it. For Zeitels now added: “You are not being transmitted by your voice.”
That the voice is a vital clue to character and personality – to fundamental identity – was not news to me. I had always known that the voice is a kind of aural fingerprint, something unique to every individual and from which listeners draw strong inferences. But in “speaking around” that injury, I was apparently projecting a new personality into the world: a more monotone, less enthusiastic, less engaged personality.
But my polyp wasn’t just changing how others perceived me; it was actually changing my behaviour. “People with your type of injury withdraw from scenarios intuitively,” Zeitels said.
“You might have been brewing this polyp for decades before you sang in Jann’s band,” he added. “Wallflowers and introverts don’t get this injury.”
Feeling shaken, I said: “So – this changes my life, in a way?”
“Totally,” he said.
The voice is a deceptively simple-seeming subject (you sing, you talk – big deal) that actually touches on some of the deepest mysteries in the natural world: namely, how we communicate thoughts, emotions, personality, upbringing and a lot of other personal data, on tiny ripples of air that we beam into other people’s brains by moving our lips and tongue while exhaling. An alien species watching us perform this bio-lingual-psycho-acoustical feat would no doubt think: “This is unreal!”
And it is. But how to get your hands around so big and diffuse a subject? There’s a difficulty in even saying what the voice is. Is the voice singing? Talking? Is a cough voice? A laugh? “Indeed, it seems we know exactly what we mean by the word voice as long as we don’t try to define it!” as Johan Sundberg, the world’s foremost authority on the physiology of singing, put it in the introduction to his classic textbook The Science of the Singing Voice.
Aristotle, who defined the voice as “the sound produced by a creature possessing a soul”, explicitly ruled out coughing as voice because a cough does not call up a “mental image” – that is, words. Unfortunately, that definition also rules out the high, clear sustained note that an opera tenor hits, and which sends shivers through us, despite the isolated vowel’s calling up no specific “mental image” (especially if we don’t understand Italian). To say nothing of the fact that, in the 50s, a branch of speech science called paralinguistics emerged, which convincingly showed that all manner of vocal noises (coughs, sighs, gasps, ums and ers) can be highly revealing of a person’s inner state of mind and heart – and as such have a communicative salience that, even by Aristotle’s definition, qualifies them as voice.
Add to these confusions the epistemological conundrum that the voice is, conceptually, impossible to “locate”. It is “in” the speaker’s body as an act of breathing and articulation, but doesn’t exist until it is manifest in the air as a sound wave. Arguably, the voice comes into existence, as voice, only when someone is around to process that sound wave in the brain’s auditory cortex. (In voice science, the answer to the philosophical riddle: “Does a tree that falls in a forest make a sound if there’s no one to hear it?” is “No!”) A final complication arises from the fact that what science calls the voice – everything from the buzzing sound source in our throats, to the way we sculpt that buzz into speech sounds with movements of our mouths, to the rhythm and melody of spoken language or song – results from the synchronised actions of many distinct body parts (lungs, vocal cords, tongue, lips, soft palate), all of them originally designed (by natural selection) for quite different tasks. Which of these is the voice – some, all, none?
The key, I realised, was to think about what makes the human voice different from that of every other creature. All mammals and birds use vocal noises to communicate vital needs, through an array of oinks and squawks, chirps, barks and baahs. Parrots can even expertly mimic human speech – but without any idea what they’re saying. We are the only animal that can perform the miraculous feat of making the link between a specific vocal sound and an object that exists in the world.
I call it a miraculous feat, but that understates the case considerably. It is the reason that we, as a species, rocketed to the top of the food chain. If you’ve read Yuval Noah Harari’s Sapiens, you know that scientists usually attribute our ascent to language, a faculty that allows us to refer to events in the past or future, to allude to people and things not immediately present, to elucidate abstract philosophical concepts, and to make complicated plans and goals that we share with others of our species. No other animal can come close to doing this. Birds, dogs, chimps, dolphins – you name it – use their voices to make in-the-now proclamations about immediate survival and reproductive concerns, including expressions of fear, anger, hunger and mating urges. Our unique ability for language has thus been described as the great dividing line, the “unbridgeable Rubicon”, between us and every other living creature.
More than that, Harari says, it is the key to how we came to rule the Earth, since it enabled early humans – a relatively slow-running, physically weak, easily preyed-upon animal – to plan and cooperate and strategise with each other to outsmart bigger, faster, more lethal predators, to organise into groups (or tribes) of a greater size than any other animal (chimpanzees, our closest animal relation and the next closest in terms of cooperation, can manage about 100 members per group), and eventually to build the villages, towns, cities and nations that have given us primacy over the planet and everything on it. Written language eventually speeded this process up, but that only came along about 5,000 years ago, a blink of the eye in terms of human history. Up until then all verbal communication in our species was achieved via speech.
So, I’m not disputing the grand claims for language made by Harari and others. I just think we need to refine the concept, to emphasise that we owe our planetary dominion not to language alone, but to our special talent for turning that awesome attribute into sound. The voice.
Our career and romantic prospects, social status and reproductive success depend to an amazing degree on how we sound. This is a question not only of our vocal timbre, which is partly passed down by our parents (in the size, density and viscosity of our vocal cords, and the internal geometry of the resonance chambers of our neck and head), or our accent, but also our volume, pace and vocal attack. These elements of our speech betray dispositions toward extroversion or introversion, confidence or shyness, aggression or passivity – aspects of temperament that are, science tells us, partly innate, but also a result of how we respond to life’s challenges, in the innumerable environmental influences that mould personality and character and, consequently, our voice.
In listeners’ ears, our voice is us, as instantly “identifying” as our face. Indeed, researchers in 2018 discovered that voices are processed in a part of the auditory cortex cabled directly to the brain region that recognises facial features. Together, these linked brain areas make up a person-differentiating system highly valuable for ascertaining, in an instant, who we know and who’s a stranger.
The voice recognition region can hold hundreds if not thousands of voices in long-term memory, which is why you can tell, within a syllable (“Hi … ”), that it is your sister on the phone and not a telemarketer, and that an impressionist is attempting to “do” Bill Clinton and not Ronald Reagan (both of whose voices you can conjure in your auditory cortex as readily as you can call up their faces in your mind’s eye).
That we do, sometimes, mistake family members for one another over the phone shows that not only are immutable anatomical attributes of voice (vocal cords and resonance chambers) as heritable as the facial features that make parent and child (or siblings) resemble each other, but that families often share a style of speaking, in terms of prosody, pace and pronunciation. But the voice of every person is sufficiently unique, in its tiniest details, that such misidentifications are usually caught within seconds.
Indeed, it is a philosophical irony of cosmic proportions that the only voice on Earth that we do not know is our own. This is because it reaches us not solely through the air, but in vibrations that pass through the hard and soft tissues of our head and neck, and which create, in our auditory cortex, a sound completely different to what everyone else hears when we talk. The stark difference is clear the first time we listen to a recording of our own voice. (“Is that really what I sound like? Turn it off!”) The distaste with which so many of us greet the sound of our actual voice is not purely a matter of acoustics, I suspect. A recording disembodies the voice, holds it at a distance from us, so that we can hear with pitiless objectivity all aspects of how we speak, including the unconscious ways we manipulate prosody, pace and pronunciation to create the voice we wish we had.
When I mentioned this to a friend, he grimaced at the memory of hearing his recorded voice for the first time. “God!” he cried. “The insincerity!” He was reacting to the mismatch between who he knows himself (privately and inwardly) to be, and the person that he seeks to project into the world. All of us do this, quite unconsciously, and until we hear ourselves on tape, we remain mercifully deaf to how we perform this ideal self, in a bid to “put ourselves across”, to make an impression.
The enterprise of being human is to carve out a congenial place to occupy in the world, an achievement that we know intuitively depends to a frightening extent on how our voices sound in the ears of others. To alter your voice in ways that conform better to the person you feel yourself to be, or that you wish you were, means changing, fundamentally, who you are.
• This is an edited extract from This Is the Voice by John Colapinto, published by Simon & Schuster and available at guardianbookshop.com