How to create perfect vocals

Vocals make tracks memorable. Whether you fancy yourself as a beatboxer or take pride in screaming along with guitar solos and synth hooks, it’s difficult to deny that it’s the combination of melody and lyrics that provides the most natural and irresistible pull towards the world’s favourite tracks.

In this feature, we’re looking at vocal production, and offering advice on how to capture sensational lead vocals, as well as how to create the kind of one-off tricks and harmonies that dominate the pop- and dance-music landscapes.

Let’s hear that again

Vocal recordings can only be as good as the sum of their parts, and there are many considerations and potential obstacles that lie between you, your performer and an A-grade, beautiful-sounding vocal. Here, we’re going to follow the chain of events from start to finish (leaving out the most important consideration, to which we’ll return shortly), beginning with one of the most significant decisions any engineer has to make: microphone choice.

Broadly speaking, studio-based vocal recordings will benefit from the use of a large-diaphragm condenser microphone. If you head to a professional studio to record vocals, you’ll likely find a microphone cupboard replete with stunning mics from which you’re free to choose.

The message should be clear: there’s no one-size-fits-all solution to vocal recording. Different voices and performance styles lend themselves to different microphones. Some mics sound astonishing on softer, warmer, more intimate vocals, whereas others respond better to singers who like to open up and belt out their words and phrases. Some mics are better attuned to capturing lavish top end, others the richer, bolder midrange. All mics have their own sonic character.

After selecting your mic, you’ll need to decide whether or not to add a dedicated recording channel before your audio interface. Most audio interfaces allow you to enable phantom power (which you’ll need if you’re using a condenser microphone), as well as to set recording levels. It’s possible, then, to record vocals directly into your DAW without adding a channel-strip stage before the interface.

However, just as your microphone choice will have a significant bearing on the vocal sound, so will any channel strip through which you choose to record. This is due to the components used within channel strips, which are designed to enrich the colour of sounds and, in some cases, control dynamics too. Manley’s Voxbox and Avalon Design’s Avalon VT-737SP are two high-end examples of vocal-recording channels. Both include EQ and compressor modules and, crucially, employ a tube circuit to add pleasing harmonic distortion, richness and texture as recording levels increase.

Tracking vocals into DAWs has encouraged us to be a little twitchy about how much we want to commit to EQ and compression settings on the way into our computers, because we know that we have the option to tweak these effects (and many others) later on in the production and mix process. But securing a strong and characterful sound from the outset will save lots of time and laborious polishing later down the line. Channel strips can help you shape your desired sound from the get-go.

Of course, your DAW and/or audio interface might allow for a happy middle ground. Here, you could set up a recording pathway via software plug-ins and track a vocal through them without having to commit their processing to the recording. Audio-interface and plug-in manufacturer Universal Audio has led this approach, with high-quality emulations of some of the world’s most-coveted recording channel strips, EQs and compressors. The UA interfaces also let users decide whether or not to print a strip of effects at the point of recording.

Recording basics

However you proceed, there are some fundamentals that you’ll need to apply no matter your equipment and environment. You’re looking for a healthy signal level that at no point threatens to peak or tip the input stage of your DAW’s recording channel into the red. There’s no failsafe way to remove digital distortion from a recording and, unlike some forms of analogue distortion, there is little that’s musically pleasant about its digital equivalent.
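As a rough illustration of what a healthy level looks like in numbers, this Python sketch measures the peak of a recorded take in dBFS, where 0 dBFS is the point at which digital clipping begins. It assumes samples stored as floats between -1.0 and 1.0. The function names and the -6 dBFS ceiling are our own illustrative choices, not a standard.

```python
import math


def peak_dbfs(samples):
    """Return the peak level of float samples (full scale = 1.0) in dBFS."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 20.0 * math.log10(peak)


def has_headroom(samples, ceiling_dbfs=-6.0):
    """True if the loudest sample stays at or below the chosen ceiling.

    The -6 dBFS default is an assumed, illustrative safety margin.
    """
    return peak_dbfs(samples) <= ceiling_dbfs


take = [0.0, 0.25, -0.5, 0.1]
print(round(peak_dbfs(take), 2))  # -6.02
print(has_headroom(take))         # True
```

A take whose loudest sample reaches 1.0 measures 0 dBFS and leaves no margin at all, which is the situation to avoid at the input stage.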

It’s also important to remember that, in your DAW, the fader that controls the channel you’re recording into isn’t responsible for input level. It controls the volume of the sound as it plays back from your computer once it’s been recorded but not as it goes in. Use your audio interface and channel strip (if you’re using one) to get levels right at the recording stage.

Make sure your vocal microphones are mounted on stands rather than handheld, both to ensure a clear signal and to give the performer less to worry about. You should invest in a pop shield too. These small, usually round shields screw onto microphone stands and act as a screen between your singer and the microphone, helping to tame plosives: the P and B sounds that send a rush of air towards the microphone grille and can leave distorted booming and popping in your recordings.

Experiment too with the distance between your singer and the microphone. Pop shields help here, as they act as a divider between vocalist and mic. Place the pop shield at the distance you want and then ask your singer to sing directly into it. The distance will depend on the volume of your singer and the performance of your microphone. Some prefer closer recordings than others but a distance of three to six inches is a sensible starting point for most singers and microphones.

Talent pool

The past few paragraphs should help with the technicalities and, as long as you remember to tie your vocal-recording setup together with the correct cables – you’ll need an XLR cable for condenser microphones using phantom power – you should be good to go. But even the world’s greatest gear and an encyclopedic knowledge of vocal-recording techniques can’t salvage a rotten take. If you want superb-sounding vocals, you’ll need a superb singer.

What makes a great singer? It’s subjective, sure, but you’re looking for someone who can convey the essence of a song, someone you want to listen to over and over again, and someone whose every word you anticipate with bated breath. But this is an all-too-rare combination. There are many self-styled singers who can sing broadly in tune and in time. But those who can capture the magic of a song, those who live and breathe the sentiment, aren’t so easily found.

Maybe you already know someone who fits the bill. Maybe you fit the bill. If not, you’ll want to spend some time trawling through singers’ portfolios online to determine whose voice moves you and whose voice makes you want to move into another industry. Ask around, too. Do any other producers you know have any singers they can recommend? If you doubt your own abilities to properly record and produce a heavyweight vocalist, just remember that a fantastic vocal recorded badly will carry more weight, power and emotion than an unremarkable performance recorded through the best chain. Prioritise talent from the outset.

Tracking and comping

While recording, we advise that you record several takes of each vocal. This means recording any parts that you want to capture – lead vocal, backing vocals, harmonies, ad libs, etc – multiple times. There are many reasons for this. Vocalists, like all musicians, tend to warm up, getting stronger through the first few takes and rarely producing their very best work the first time around. But don’t delete those early takes just yet. In the days of abundant hard-drive space and Cloud storage, it’s worth holding on to these lesser recordings, even if they don’t seem all that promising at the time of recording. It’s always better to be safe than sorry, so keep backups. Plus, it’s not uncommon for that first take to yield some sort of magic, particularly if a vocalist is really into the track and keen to get going.

Vocal Takes — Finding the best elements of a take requires time and a careful ear

Once you’ve captured your vocal, you should have several takes and, depending on which DAW you’re using and how you’ve gone about the recording process, these will either be all lined up one above the next or embedded into a folder of takes on a single audio track. Either way, your job is now to select the best bits in a process known as ‘comping’, a contraction of ‘compositing’. This isn’t simply a case of choosing the bits that are most in tune or in time though. As before, sometimes it comes down to vibe, feel, and how much you believe in the vocal.

When comping, focus on one short part of the song at a time. Begin with the first half of verse one, for instance. Set up a loop and listen to each take in turn, with the others muted, for comparison. Don’t be surprised if you prefer one take’s delivery of just the second half of a word; as you listen through on repeat, your ears will become attuned to the parts you like and those you don’t.

Comped Vox — It’s common practice to dissect vocal takes and craft something sublime from their raw materials

To comp, select these sections, either by promoting them to the top of the stack of takes or simply by cutting up the bits you like across several tracks and colour-coding them so you can remember your favourite parts. Then, if necessary, drag all of your selected sections down to a new comped track and use crossfades to smooth the joins between adjacent or overlapping regions. When editing, be careful about how closely you crop to the start and end of notes. It’s easy to inadvertently pinch a quiet S sound off the end of a word as it decays, and easier still to crop so closely that the vocalist’s intakes of breath are entirely eradicated from the performance. Your singer is a person, not a robot. Removing their breaths can make the resulting audio feel strangely claustrophobic for listeners.
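For the curious, here is a minimal Python sketch of what a crossfade does at the join between two comped regions. It assumes audio held as plain lists of float samples and uses a simple linear fade. DAWs typically offer several fade shapes, and the function names here are hypothetical.

```python
def crossfade(tail, head):
    """Blend two equal-length overlapping regions with a linear fade."""
    if len(tail) != len(head):
        raise ValueError("overlap regions must be the same length")
    n = len(tail)
    blended = []
    for i in range(n):
        t = (i + 0.5) / n  # position through the overlap, between 0 and 1
        blended.append(tail[i] * (1.0 - t) + head[i] * t)
    return blended


def join_takes(take_a, take_b, overlap):
    """Join take_a to take_b, crossfading across `overlap` samples."""
    if overlap < 0 or overlap > min(len(take_a), len(take_b)):
        raise ValueError("overlap must fit inside both takes")
    if overlap == 0:
        return take_a + take_b  # hard cut, no fade
    blended = crossfade(take_a[-overlap:], take_b[:overlap])
    return take_a[:-overlap] + blended + take_b[overlap:]


print(join_takes([1.0, 1.0], [0.0, 0.0], 2))  # [0.75, 0.25]
```

The outgoing take ramps down while the incoming take ramps up, so there is no abrupt jump in level at the edit point.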

Tuning and timing

Once you’ve got a lead vocal take you like, there’s a good chance that you’ll want to adjust individual moments of tuning and timing. You may be able to do this with technology native to your DAW. Alternatively, you could employ third-party options such as Antares’ Auto-Tune or Celemony’s Melodyne. If you need to do some pitch correction, there are two avenues you’ll want to consider: transparent and gentle tuning to subtly finesse performances but retain their original musical intent, and more extreme tuning correction for which the auto-tuning becomes a characterful effect in and of itself. The latter, perhaps used most famously on Cher’s 1998 banger Believe, is an out-there method that has bounced in and out of fashion ever since. It won’t be right for every project but deployed cleverly it can lift a song to unreal heights.

AutoTune — Antares’ Auto-Tune is a widely used industry-standard vocal processor introduced in 1997

If you need to nudge individual notes around to alter their timing, just move the audio files you’ve comped. Both of the aforementioned plug-ins let you adjust timing and tuning too. To avoid having to run said plug-ins live all the way through to the mix stage of your project, producers tend to render their tuned, retimed vocals as new audio files (without any other effects) to capture them as master vocal takes.

If you favour this approach, once you’ve rendered the tune-tweaked version, mute the original track featuring the tuning plug-in but leave it as an available track in your project as you continue working on your production. If, as the production builds, you feel that a line or two needs changing or that you need to go back to adjust the tuning or select a different take, you’ll be glad you kept the pre-tuning vocal track available.

Pitch Shifting — Soundtoys’ Little AlterBoy can be used to modify formants and pitches of a vocal, and even the perceived sex of the vocalist

Mixing your vocal

Now you have to balance the vocal part within the developing mix of your track, ensuring that it sounds like an active part of your project and has the preferential status it needs to run the show of your song. To start with, it needs to be given the same considerations as any other individual part of your track, specifically with regards to tone, dynamics and space. There are no specific settings or parameters that we can recommend here, as every piece of music is unique.

Which microphone did you use? How dry is your recording space? How many tracks are there in your project? Are the instruments acoustic or electronic? Is it house music? Folk? Every genre, every track, every density of mix weight will require a different balance of these key tools, so not only should you take any specific settings you read about purely as a guideline, you should also resist the temptation to use any quick presets suggested by your plug-ins.

That said, there are some fundamentals that you need to understand if you want your vocal to sound as professional as possible. Most vocal performances benefit from the same kinds of tools, irrespective of genre. You’re likely to require some EQ to help shape the tone, compression to help you sculpt dynamics, and reverb to take the ideally dry sound of your room and give it a sense of space to match the other tracks in your production.

EQ tends to serve two purposes. Use it to make surgical, usually narrow adjustments to problem frequencies, which are often caused by the combination of a dominant frequency in your singer’s voice and an amplification of that through your chosen microphone, preamp and the room. Then, finesse the tone musically by enhancing particular frequency groups to add sheen, colour, clarity and intelligibility. Use your ears to find the cuts and boosts that make the most pleasant musical differences to your mix.

Compression is used to narrow dynamic range and often proves essential when mixing vocals. Taming the loudest peaks of a performance gives you a chance to boost the levels of quieter moments, and helps to ensure that a vocal will sound rich and powerful. If you’re looking for a particularly strong vocal, either for your entire song or for a section of it (the chorus, for example), parallel compression can help hugely.
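To make the idea of narrowing dynamic range concrete, here is a simplified Python sketch of a compressor’s static gain curve. It works sample by sample, with no attack or release smoothing, so it is an illustration of the arithmetic rather than a usable compressor. The threshold, ratio and function names are assumed values chosen for the example.

```python
import math


def compress_db(level_db, threshold_db=-18.0, ratio=4.0):
    """Static compressor curve: output level in dB for an input level in dB."""
    if level_db <= threshold_db:
        return level_db  # below the threshold, nothing changes
    return threshold_db + (level_db - threshold_db) / ratio


def compress(samples, threshold_db=-18.0, ratio=4.0, makeup_db=0.0):
    """Apply the static curve to float samples (full scale = 1.0)."""
    out = []
    for s in samples:
        if s == 0.0:
            out.append(0.0)
            continue
        in_db = 20.0 * math.log10(abs(s))
        out_db = compress_db(in_db, threshold_db, ratio) + makeup_db
        out.append(s * 10.0 ** ((out_db - in_db) / 20.0))
    return out


print(compress_db(-6.0))   # -15.0
print(compress_db(-30.0))  # -30.0
```

With these settings, a loud phrase at -6 dB and a quiet one at -30 dB start 24 dB apart and end up 15 dB apart. Makeup gain can then lift the whole part, which is how the quieter moments get their boost.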

The Big Apple – New York compression

Parallel compression, also known as New York compression, is a common studio technique used in the Five Boroughs and beyond. Imagine you’ve got a drum part programmed for separate kick, snare and hi-hat parts and, on each of these individual channels, EQ and compression settings appropriate to each sound. Suppose you then group all three sounds together by sending them to a separate auxiliary bus treatment where you add another compressor. This one will affect all three sounds and effectively works as a bus compressor, using the transient response and volume of each contributing sound to affect how the compressor works across all of the drums.

If you compress this auxiliary treatment hard, with a low threshold setting and high ratio, it results in a powerful, smashed, energised drum sound. Used alone, it would be too extreme for most mixes but, blended in beneath the original drum parts, it adds weight and robustness. This technique can be applied to vocal parts too. Parallel-compression treatments are a neat trick with which to reinforce lead vocals throughout a song if the production is tough and the vocal needs to match that mood. Equally, the method can be introduced only on the biggest sections of a song – use automation to turn up the send level only for the parts that need the extra power.
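The blend described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the compressor is a sample-by-sample curve with no attack or release, and the threshold, ratio, makeup gain and function names are illustrative inventions, not settings from any particular mix.

```python
import math


def smash(sample, threshold_db=-30.0, ratio=10.0):
    """Hard compression with a low threshold and a high ratio."""
    if sample == 0.0:
        return 0.0
    in_db = 20.0 * math.log10(abs(sample))
    if in_db <= threshold_db:
        return sample
    out_db = threshold_db + (in_db - threshold_db) / ratio
    return sample * 10.0 ** ((out_db - in_db) / 20.0)


def parallel_compress(dry, send_levels, makeup_db=12.0):
    """Mix the dry vocal with a smashed copy fed from an automated send.

    send_levels holds one send value (0.0 to 1.0) per sample, standing in
    for send automation drawn in a DAW.
    """
    makeup = 10.0 ** (makeup_db / 20.0)
    return [d + makeup * smash(d * send) for d, send in zip(dry, send_levels)]


verse = parallel_compress([0.5, 0.01], [0.0, 0.0])   # send closed
chorus = parallel_compress([0.5, 0.01], [1.0, 1.0])  # send open
print(verse)  # [0.5, 0.01]
print([round(v, 3) for v in chorus])
```

With the send open, the quiet sample is lifted proportionally far more than the loud one, which is where the extra weight comes from. With the send closed, the vocal passes through untouched.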

Parallel Vox Compression — The New York compression trick isn’t exclusive to drums. Using an additional, hard-compression treatment underneath your main vocal channel can add power

In terms of processing techniques used purely for vocals, we recommend you explore de-essing. Again, depending on your choice of singer, vocal microphone and recording space, it’s possible that S sounds will pop out of the mix in a way that’s distracting for listeners. De-essers are frequency-specific compressors that reduce levels within a user-defined frequency band. A multiband compressor will often let you perform de-essing but plug-ins dedicated specifically to this task are often better.

Spatial effects

Next, we move onto effects and begin adding space to the mix – and whipping up some real magic. If you were able to record your vocals in a dry environment, this will prove especially fun. Though all that EQ and compression you added will prove essential, your vocal probably sounds a little high and dry right now. But don’t slather it in reverb just yet. Instead, wait a minute – and while you’re waiting, listen to Break My Heart by Dua Lipa.

You’ll notice here that the opening lines are incredibly dry. There’s no audible reverb and no vocal effects on those first few lines either, until halfway through the first verse. When Dua Lipa sings “I like…”, there’s a reverberated, delayed vocal effect that hangs tantalisingly in the air and fills the gap until the next line. The same dry-wet approach appears in the second half of the first verse before the pre-chorus provides a bolder, more space-heavy vocal treatment. And after that build? That’s where the fun really starts.

What this pop song demonstrates (and we could have picked many alternative but similarly gymnastic examples) is that spatial treatments aren’t static. Instead, they ebb and flow with the arrangement of their track, adding spatial depth where required and pulling back to something softer, drier and more intimate where necessary.

Vocal Volume Automation — Automation of volume is useful for reining in those more naturally explosive moments and bringing the quiet parts to the fore

So how is this achieved? Effects can be added to a project in two ways: as insert effects on a channel or as auxiliary bus channels. Auxiliary literally means extra, so think of auxiliary busses as an additional element, something more than the original vocal channel itself. When you add an auxiliary channel to a track, you’ll see a send dial, which controls how much of the channel’s signal is sent to that auxiliary channel. Suppose that the effect you’ve added to that auxiliary is a reverb. For as long as the send-level dial is down, no signal will be sent to the auxiliary reverb and the vocal will remain dry. Turn up that dial and you’ll hear the reverb being added as an extra channel. In Break My Heart, the introduction of a reverb or delay like that heard in the first verse is achieved by automating the send level from the vocal channel to the auxiliary channel carrying those effects.
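The routing can be modelled in a short Python sketch. It assumes a basic feedback delay as the auxiliary effect and a per-sample list of send values standing in for automation. The function names and settings are hypothetical, and we make no claim that this is how the effect on the record was built.

```python
def delay_return(aux_input, delay_samples, feedback=0.5):
    """Wet-only output of a simple feedback delay on the auxiliary channel."""
    if delay_samples <= 0:
        raise ValueError("delay_samples must be positive")
    wet = [0.0] * len(aux_input)
    for i in range(delay_samples, len(aux_input)):
        wet[i] = aux_input[i - delay_samples] + feedback * wet[i - delay_samples]
    return wet


def mix_with_send(vocal, send_automation, delay_samples, feedback=0.5):
    """Dry vocal plus the aux return, with the send level automated."""
    aux_input = [v * s for v, s in zip(vocal, send_automation)]
    wet = delay_return(aux_input, delay_samples, feedback)
    return [d + w for d, w in zip(vocal, wet)]


vocal = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
send_closed = [0.0] * 6
send_opened = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # send open for one sample only

print(mix_with_send(vocal, send_closed, 2))  # [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(mix_with_send(vocal, send_opened, 2))  # [1.0, 0.0, 1.0, 0.0, 0.5, 0.0]
```

Note that the echoes keep ringing after the send has closed again. Only the word that was sent to the auxiliary gets the tail, while everything around it stays dry.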

Adventure time

We’re getting quite advanced here, and there are simpler ways to add effects to vocals. You might prefer to set up an auxiliary reverb or echo effect and leave the send dial set at a static level for the whole track. This can be appropriate, depending on the production you’re working on. Perhaps you’re making a progressive-house tune and you’ve recorded a single-line vocal that you want to spin in and out of your track every once in a while.

Maybe you’ve written a simple song for acoustic guitar and voice and you don’t want it to sound overproduced. In both cases (and many more), a simple, static treatment will likely prove effective. However, if you’re producing music with a more dynamic structure – with rises and falls, drops and builds, verses, pre-choruses and choruses – managing your effects so that they disappear before rearing up again to surprise and delight listeners can be extremely effective. Of course, since effects can be added both as inserts and on auxiliary busses, the same routing options that serve reverb and delay open up the possibility of wilder and more out-there effects too.

If you want to enhance your vocals with artificial harmonies by using a lead vocal to trigger supporting notes at different pitches – the kind of thing you might hear in a James Blake or Jacob Collier track, for example – you could use a vocoder. This provides a blend of acoustic and artificial, synthesised sound sources, and lets you use a keyboard input to choose the notes those new harmonised voices will play.

You could instead set up a duplicate channel of your lead vocal, scan it into Auto-Tune or Melodyne, and change the notes to fit the vocal part you had in mind. You could also run it through a plug-in such as Waves’ Ovox, which lets you tackle pitch and timbral vocal effects from a number of creative angles. When it comes to effects processing, there’s no end to the ways you can subtly enhance or radically reframe vocal parts. You might make music that can benefit from a parallel-distortion treatment, which adds bite and edge. It needn’t be a treatment that decimates the mix. A little saturation or guitar-amp power mixed in below a lead vocal can work wonders.

Sample and resample

Beyond effects processing, vocal sampling remains a hugely powerful way to explore unusual sonic territories. Stutter effects, whereby a vocal is teased into a mix via triggering or repetition of its first phoneme, are common in dance music. With these, you can chop up a lead vocal audio file and scatter vocal chops across a mix, sampling a word so that it can be triggered with variations to both pitch and rhythm as often as you like.
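The pitch variations mentioned above come from a classic sampler trick: playing the sample back faster or slower. This Python sketch shows the arithmetic, assuming equal temperament (each semitone multiplies playback speed by the twelfth root of two) and crude linear interpolation between samples. The function names are our own, and modern samplers often use more refined methods that preserve length.

```python
def playback_rate(semitones):
    """Playback-speed ratio for a pitch shift in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


def repitch(samples, semitones):
    """Repitch by resampling; the result gets shorter as pitch goes up."""
    rate = playback_rate(semitones)
    out = []
    pos = 0.0
    while pos <= len(samples) - 1:
        i = int(pos)
        frac = pos - i
        nxt = samples[i + 1] if i + 1 < len(samples) else samples[i]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        pos += rate
    return out


chop = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(playback_rate(12))       # 2.0
print(repitch(chop, 12))       # [0.0, 2.0, 4.0, 6.0, 8.0]
print(len(repitch(chop, -12))) # 17
```

An octave up halves the length of the chop and an octave down roughly doubles it, which is why vocal chops triggered across a keyboard change duration as well as pitch unless the sampler compensates.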

Samplers also let you find loop points, turning single one-shot notes into sustainable textures that can play chords or create unexpected ambiences around a lead vocal. Once you’ve sampled a vocal, there’s nothing to stop you playing notes for it, adding new layers of effects – or resampling it and starting all over again, delving deeper and deeper into your own sound and vocal tricks as you go.

The ways vocals can be recorded, treated and mixed are limitless. Ideally, you want the initial recording to be as good as it can possibly be, so that your first steps aren’t all taken in vain and you’re not left trying to clean up something substandard afterwards. Thereafter, whether you’re looking for a subtle treatment to blend the tone and dynamics of a performance to your mix or a radical computer-voiced extreme reworking, the world’s your oyster.

Granular effects

Time-stretching has been a popular technique in vocal processing for as long as samplers have made it possible. Stretching out a vocal is – in most samplers – achieved by copying tiny shards of a sample to elongate its length, which is why extreme time-stretching sounds grainy and lo-fi. If you like that grainy quality then, in addition to experimenting with time-stretched samples, you should try using granular synthesis to process vocals. This involves breaking a source signal down into a number of tiny pieces called grains, and then putting them back together, either to stretch or interrupt a sound in a number of interesting ways.
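The grain-copying idea can be reduced to a deliberately crude Python sketch. It assumes audio as a plain list of samples and simply repeats each grain in place, with none of the windowing or overlapping that real samplers and granular plug-ins typically use to smooth the joins. The function name and parameters are illustrative.

```python
def granular_stretch(samples, grain_size, repeats):
    """Lengthen audio by cutting it into grains and repeating each one."""
    if grain_size <= 0 or repeats <= 0:
        raise ValueError("grain_size and repeats must be positive")
    out = []
    for start in range(0, len(samples), grain_size):
        grain = samples[start:start + grain_size]
        out.extend(grain * repeats)  # play the same grain back-to-back
    return out


print(granular_stretch([1, 2, 3, 4], grain_size=2, repeats=2))
# [1, 2, 1, 2, 3, 4, 3, 4]
```

The output is twice as long, and the pitch within each grain is untouched, but the repeated joins are exactly what produce the grainy character at extreme settings.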

Pushed to more extreme settings, granular-synthesis tricks can turn your vocals into cloud clusters, disquieting, pitched warbles or computer-like atmospherics. Remember that if an insert-effected granular-synthesis treatment is too much, you can always use one in parallel, either on an auxiliary track or by using the dry/wet balance dial, if your plug-in features one.
