Author Archives: James Gasson

About James Gasson

My name is James Gasson. I am a musician, sound engineer, artist and chief operator of Third Circle Recordings. I journey through life trying to work out what exactly is going on whilst doing my best to avoid tripping over. Some days are more successful than others.

DI Boxes: A Distinction Without A Difference..?

A good friend of mine recently purchased a new DI box; an Orchid Electronics Direct Inject box – an active DI that retails for about 40 quid. It looks like this:

1.jpg

He excitedly proclaimed that for the money the sound was fantastic, enjoying in particular how it interfaced with his bass guitar. I was intrigued.

At this point I was forced to confess a guilty little secret; I never really got DI boxes. I mean… they balance the signal, right? And they bump up the level… great for giving a leg up to weak signals over long distances… and… that’s… basically it. Isn’t it? Some have bells, others have whistles – pad switches, ground lift toggles, active, passive, etc, etc, but the idea that this little thing can be expected to do anything magical to the sound was never something that made sense to me when faced with that kind of claim. In my studio I had always used a Behringer DI800 unit, or whatever other small, inexpensive DI box I happened to have lying around, and that always seemed to accomplish the task that was being asked of it; get a signal from here to there without adding unwanted noise. But I had never directly compared any two, so what do I know.

While absently Googling different DI boxes and their specs I found this page on gearank.com (dangerously close to gearwank.com..??), which contains sentences such as:

Countryman Type 85 – Everything from acoustic guitars, to Telecasters, to bass guitars, and even old electric pianos have been plugged into this DI with satisfactory results. Most agree that they can hear the nuances of their playing style and instrument better when using this humble DI Box.

Behringer Ultra-DI DI600P – Understandably, those who have more expensive DI Boxes will find the sound of the DI600P lacking, but for the average band performance and home recording enthusiast, it gets the job done efficiently.

Radial ProDI – it is reassuring that professionals like Terry Lawless (plays keyboards for U2!) endorse this unit… From keyboards to laptops, and even instruments, the ProDI will get you sounding good without much hassle.

 

Blimey. There’s more to this DI lark than I thought. Either that or these sentences were written by someone simply scouring the internet for any old bullshit to say in an attempt to differentiate superficially different devices about which they know fuck all.

Maybe both.

A more pleasingly technical look at the DI box comes from the direction of Sound On Sound’s Hugh Robjohns, who, in this article, writes:

“The main reason most DI boxes employ an output transformer is to provide electrical isolation between the source and destination equipment. As well as preventing ground loops, this offers protection against faults in equipment on one side of the DI box damaging that connected to the other, though whether all DI boxes employ transformers with a sufficient safety rating to guarantee that isolation is another matter! Cheap transformers usually introduce significant amounts of harmonic distortion and struggle with low frequencies and high levels, but a good quality, low-distortion, electrically safe transformer is inherently expensive. If a DI box is being used with a fuzz guitar these things are perhaps less important… but they can stand out like sore thumbs when used with a good acoustic guitar, for example…

…In situations where the equipment connected to these DI boxes is known to be well maintained I’d have no qualms at all about the absent output transformer, but there are a couple of situations where the transformerless boxes may prove inferior to more conventional transformer-coupled designs such as the Radial. The first is when connecting the output of a laptop or computer interface to a mixing console or recorder, as computer ground-noise is more likely to become audible as nasty buzzes and whines. The other is when connecting equipment of unknown condition or connecting to equipment powered separately in an OB truck or separate building. In these cases the galvanic isolation of a proper isolating transformer can, quite literally, be a life-saver.”

 

This is a bit more like it – information that is actually useful. Less conjectural bullshit about “sound character”, and more technical justification of the kind of thing we should really be looking out for in a good DI box.

Anyway, for my own part, I decided to use this circumstance as an opportunity to do some comparisons of my own. Far from being motivated (yet) to run tests comprehensive enough to discuss the differences between a range of different DI boxes under different conditions with any degree of certainty, I at least wanted to see if there was any truth to my friend’s claim that his Orchid DI “sounded great”, particularly when compared to a Behringer Ultra-DI that was to hand. Indeed he seemed to think that his box had just a little extra something that stepped up his bass tone. I, on the other hand, remained skeptical.

2.jpg

 

And so I took both devices and got to work. I wanted to see what all the fuss was about, so I plugged in an electric guitar to each DI box, which was in turn connected to my M-Audio home interface, powered with phantom power, and quite literally recorded the results.

The first thing to notice was the differing amount of gain applied to each signal. Larger than I had expected. I put down my guitar and used a 440Hz test tone to measure this:

3.JPG

The results were as follows:

  • No DI box (signal patched from output straight to input):  -16.4dB
  • Behringer Ultra-DI:  -3.7dB
  • Orchid Direct Inject:  -12.4dB

 

Straight off the bat we see that the Behringer provides 12.7dB of gain, while the Orchid provides only 4dB. Combined with the Behringer’s two -20dB pad switches, vs Orchid’s single -20dB alt input, the Behringer is the winner in terms of providing a higher output with less faff yet more control.
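For anyone who wants to check the sums, the gain of each box is just its meter reading minus the no-DI reference. Here is a minimal Python sketch using the three readings listed above; the function and variable names are made up for illustration, and the amplitude ratio assumes the usual 20·log10 convention for voltage levels.

```python
# Meter readings for the 440 Hz test tone, in dB, taken from the list above.
REFERENCE_DB = -16.4  # no DI box, output patched straight to input
READINGS_DB = {
    "Behringer Ultra-DI": -3.7,
    "Orchid Direct Inject": -12.4,
}


def gain_db(reading_db, reference_db=REFERENCE_DB):
    """Gain added by a device: its reading minus the no-DI reference."""
    return round(reading_db - reference_db, 1)


def linear_ratio(db):
    """Convert a dB gain into a voltage (amplitude) ratio."""
    return 10 ** (db / 20)


for name, reading in READINGS_DB.items():
    g = gain_db(reading)
    print(f"{name}: {g:+.1f} dB (about {linear_ratio(g):.1f}x amplitude)")
```

The same subtraction gives the amount of trim needed to gain match two recordings after the fact.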

Armed with this information, I recorded a straightforward guitar line, badly played and slightly out of tune through both devices, one after the other, and then gain matched as appropriate after the fact. Here are the results (I recommend downloading the files and comparing in a DAW):

 

On review, I find it (as usual) impossible to glean much useful information from a test such as this because of its lack of scientific accuracy. Testing a different performance through two different devices has all sorts of flaws, not least that the performances are by definition, different. I can easily convince myself that I hear important differences in one over the other, but given that it’s very possible that any number of variables could be affecting my performance in either case (strum velocity, playing position, temperature, unexpected bowel movement, etc, etc), it’s impossible to make any objective claim one way or the other. All I can say is that they sound very similar indeed, and the extent to which they don’t is both tiresomely subtle and impossible to account for.

I decided instead to try something more scientifically agreeable, so I played a pre-recorded guitar part out an output of my interface and then recorded it back in, first with nothing else in the chain, then with the Behringer DI, and finally with the Orchid DI. This would also have the advantage of providing a more harmonically dense source signal as the basis for my comparison. Here are the results, subsequently gain matched:

 

Now, I don’t know about you, but I’m not so very sure I can tell these recordings apart.

 

Still though, perhaps a more accurate approach to take here is to perform a null test, whereby two of these files are superimposed on top of each other, the polarity of one inverted, and the resultant audio inspected. The more they cancel, the more identical they are. However, such a test is littered with pitfalls, given that it invokes more variables than the DI box itself, such as the mic pre, interface and recording algorithms with their quantisation errors, plus the complications arising from the difference in output level of each device. If we do have audio to inspect after a null test, it’s impossible to know what to blame it on. Still, ever the curious fellow, I tried it anyway:

 

So it’s firstly worth pointing out that performing a null test with no DI box at all does not lead to complete cancellation, hence we have some grounds to be uncertain of our findings when a DI box is inserted in the chain. However we do find that there appears to be a difference between the two DIs when compared to the original recording. Crudely summarised, the “Original vs Behringer” test produces a waveform that looks like this:

Null-1.jpg

…and the “Original vs Orchid” result produces a waveform that looks like this:

Null-2.jpg

Similar, but with notable differences, not least that the Behringer seems to have done a much better job of preserving everything above 7 kHz (ish), and is also more truthful around the 1-2 kHz region.

However, as previously acknowledged, this test is crude and fails to take into consideration the behaviour of the mic preamp and interface when met with such a discrepancy in input signal. It wouldn’t be a bad working hypothesis that the Behringer, with its output transformer and lower output impedance, offers a “truer” representation of its source signal, but more extensive work would be required to actually verify this.
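For the curious, the null test described above can be roughed out in a few lines of Python. This is only a sketch under stated assumptions, not the procedure used to make the images in this post: it assumes NumPy, uses synthetic sine waves as stand-ins for the real recordings, assumes the two files are already sample-aligned, and uses a least-squares gain match of my own choosing so that a plain level difference between devices does not show up as residual.

```python
import numpy as np


def rms_db(x):
    """RMS level of a signal in dB relative to full scale (1.0)."""
    rms = np.sqrt(np.mean(np.square(x)))
    return 20 * np.log10(max(rms, 1e-12))


def null_test(reference, candidate):
    """Gain-match candidate to reference, invert its polarity, and sum.

    Returns the residual signal and its level relative to the reference in dB.
    The more negative the figure, the more completely the two cancel.
    """
    n = min(len(reference), len(candidate))
    ref, cand = reference[:n], candidate[:n]
    # Least-squares gain match so a plain level difference does not
    # show up as residual.
    scale = np.dot(ref, cand) / np.dot(cand, cand)
    residual = ref + (-scale * cand)  # polarity-inverted sum
    return residual, rms_db(residual) - rms_db(ref)


# Synthetic stand-ins for the recordings (assumption: real files would be
# loaded from disk and sample-aligned first).
rate = 48000
t = np.arange(rate) / rate
original = 0.5 * np.sin(2 * np.pi * 440 * t)
louder_copy = 4.0 * original  # same signal, more gain
rng = np.random.default_rng(0)
noisy_copy = louder_copy + 0.001 * rng.standard_normal(rate)

_, depth_clean = null_test(original, louder_copy)
_, depth_noisy = null_test(original, noisy_copy)
print(f"identical apart from gain: {depth_clean:.1f} dB")
print(f"with added noise:          {depth_noisy:.1f} dB")
```

A copy that differs only in gain cancels almost completely, while anything the chain adds (noise, in this toy case) is left behind as residual. As with the real test, the residual alone cannot tell you which part of the chain put it there.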

In any case, such hypothetical beard stroking is to some degree academic, because, in reality, if the difference can’t be reliably heard by the human ear then it doesn’t fucking matter anyway. And that’s the crux – can we actually hear the difference? It’s very hard to say. Psychological tricks and cognitive biases abound when comparing audio signals in this fashion. When comparing these two DI boxes, my good friend was utterly convinced that he could hear a special something in his one. As in, the one that he paid for. It’s not difficult to speculate why. And so to me this kind of thing isn’t really a story about the fidelity of audio devices, academically ponderous though that is, but one of human psychology and the innate peculiarities of our reasoning, which push us away from truth and unwittingly in the direction of confirmation bias, expectation bias, belief bias, the bandwagon effect, the backfire effect, the ostrich effect, Semmelweis reflex or subjective validation. We are flawed, fallible and fragile, and it is these tendencies that hold greater sway over our decision making than the imperceptible micro-differences between similar objects. There may be a small difference between these two DI boxes when examined under a microscope, but ultimately, who actually gives a shit? They both behave perfectly reasonably, and the moment that is not the case the vital difference will become obvious. If the difference we’re encouraged to be worried about is significantly less than the difference incurred by the smallest tweak of an instrument’s tone dial, I’m not sure it’s a difference I can be bothered to care about.

 

The only other point of interest with regards to these two DI boxes that up until just now I had forgotten to mention was the noise floor, so let me just shine a dim light on that with these images, which show the noise floor in each instance, with nothing plugged in to the DI boxes, and with the level increased by 48dB after recording within my DAW:

  • No DI Box – Noise Floor:

Noise-1.jpg

  • Behringer Ultra-DI – Noise Floor:

Noise-2.jpg

  • Orchid Direct Inject – Noise Floor:

Noise-3.jpg

 

Essentially we find here that there is no noteworthy increase in aggregate noise imparted by either of the two DI boxes on trial, yet with the curious addition of a small spike around the 11.8kHz mark on the Behringer, and 17kHz on the Orchid. In both cases this is due to phantom power, demonstrable by the fact that, when phantom power was switched off and the box run on battery power, the spike disappeared. Perhaps an electronics engineer can tell me why this might be the case, but for now I’m not sure I care much about a -50dB spike at an almost inaudible frequency.
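If you would rather not squint at a spectrum analyser, a spike like this can be located numerically. The following is a minimal sketch, assuming NumPy and a made-up noise floor (low-level hiss plus a faint 11.8 kHz tone at -50 dB); it is illustrative stand-in data rather than the recordings pictured above, and `strongest_peak` is a hypothetical helper name.

```python
import numpy as np


def strongest_peak(signal, sample_rate):
    """Return (frequency_hz, level_db) of the largest bin in the spectrum."""
    window = np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(signal * window))
    # Normalise so a full-scale sine reads roughly 0 dB.
    spectrum /= np.sum(window) / 2
    freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)
    peak = np.argmax(spectrum[1:]) + 1  # skip the DC bin
    return freqs[peak], 20 * np.log10(spectrum[peak])


# Synthetic noise floor: low-level hiss plus a faint 11.8 kHz tone
# (assumption: stand-in data, not the recordings from this post).
rate = 48000
t = np.arange(rate) / rate
rng = np.random.default_rng(1)
hiss = 1e-5 * rng.standard_normal(rate)
spur = 10 ** (-50 / 20) * np.sin(2 * np.pi * 11800 * t)

freq, level = strongest_peak(hiss + spur, rate)
print(f"largest component: {freq:.0f} Hz at {level:.1f} dB")
```

Running the same function on a capture made with phantom power on and another with it off would show whether the spike really does come and go with the power source.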

 

I think the last thing I’ll say before closing down my computer and going to bed (given how profoundly boring I am now finding this topic to be), is that, given my conviction that the only thing that matters is what we can actually hear, a very good way to discern any difference between two sources is to loop a tiny fragment of audio and allow the harmonic peculiarities of that small repetition to firmly embed themselves in your senses. Apart from being an almost hypnotic experience, it is useful to see if we can actually detect any real difference from a single moment of audio. And so that is what I have done. I invite you, readers, to listen to these repetitions. Hidden in there, somewhere, are spontaneous flips between source signal, no DI, Behringer and Orchid recordings. I won’t say where they occur – they could be all bunched up at the end, evenly distributed throughout, or none of the above, but they are all there in a non-specific order. Your job is to listen out for the changes. See if you can spot them, and if you can, relay which is which. When are you listening to the original file and when are you listening to it recorded through the kind of Behringer DI box that “experienced” audiophiles might look down their noses at? Can you tell? Let me know.

I’m all ears.
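Should you fancy building a listening test of this sort yourself, one way to keep yourself honest is to let the computer decide the running order and hide the answer key until afterwards. A small sketch, using only the Python standard library; the function names and the choice of eight repetitions per source are arbitrary assumptions, and a plain shuffle is just one possible way of distributing the switches.

```python
import random


def blind_sequence(sources, repeats_per_source, seed=None):
    """Return a shuffled playback order in which every source appears
    repeats_per_source times. Keep the result hidden from listeners."""
    order = [name for name in sources for _ in range(repeats_per_source)]
    random.Random(seed).shuffle(order)
    return order


def switch_points(order):
    """Indices of loop repetitions where the source changes."""
    return [i for i in range(1, len(order)) if order[i] != order[i - 1]]


sources = ["source", "no DI", "Behringer", "Orchid"]
key = blind_sequence(sources, repeats_per_source=8, seed=42)
print(len(key), "repetitions,", len(switch_points(key)), "hidden switches")
```

The list returned by `blind_sequence` is the answer key; the indices from `switch_points` are what a listener would be trying to spot.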

 


How To Make A Guitar Sound Like… A Guitar!

It may be an unfamiliar premise to some, but I am fervently of the opinion that the goal of recording audio is to accurately capture and faithfully reproduce a particular sound. Not to fuck around with it, contorting it into something that only dimly represents what was originally going on, but to trust in the natural character of an instrument and the preference of its performer. After all, they picked it, they like it, and it would be too presumptuous of me to think that an arsenal of boring plugins can make it sound “better”. If the musician has competently selected their desired tone, there’s no reason for me to come along and masturbate all over it.

Standard practice for most has always been to shove an SM57 in front of the cabinet, but this has always struck me as less than optimal because I’ve never been satisfied that the recorded sound is anything like that which is coming from the amplifier in the live room; simply a tinny, flat, weird imitation of such. I guess this is where most would plump for their mouth-watering collection of candy-store plugins – and boy, don’t they look sweet, with vintage-effect GUIs that totally make it look like you’re doing something all technical and impressive. Careful though, because more often than not all they do is rot your teeth and give you a stomach ache.

No, for me the solution is not to use a weird sounding mic and then crack out a virtual rack full of plugins in an attempt to de-weird it, but instead to find the correct microphone choice and placement in the first place, such that I am confident that I am doing what I am actually being paid to do; make a faithful recording with minimal post-recording fuckery. My approach to recording a guitar amp has always been to use a combination of microphones – one dark and one bright – phase aligned and blended such that as true a representation of the amplifier sound as possible is produced. However, I must confess that I’ve never actually put this to the test in any evidential sense. Is this really best? Or am I missing a trick with the favoured SM57 microphone that so many seem to opt for?

So, with this in mind, I recently performed an experiment that I have been wanting to carry out for some time; that is, what is the best way to make a guitar sound like a guitar? Or more specifically, how can one best reproduce the sound of a guitar + amplifier through the control room speakers? With that question burning in my mind, I set to work.

 

First off, I used an amp that I know to be very characterful, very harmonically rich – a Vox AC30. I positioned it right next to the studio speakers and recorded a DI’d guitar into the computer. This signal could then be fed back into the amp so as to provide a faithfully replicable guitar performance to serve as the basis for my subsequent microphone comparisons. Next, I set up two mic stands in front of each speaker of the amplifier, in order to position microphones with their diaphragm aligned 15-16cm perpendicular to the centre of each cone, which is standard practice for me. I should note at this point that I did try some preliminary placement tests (which I didn’t actually record) — I tried the mics further back, but they sounded too distant, I tried them closer but they sounded too boomy, I tried them off axis, at the side and at the back of the amplifier, but ultimately concluded that such positioning is not particularly helpful in reproducing anything like a natural tone, and meandering further down this avenue ultimately isn’t very constructive. It all just sounded increasingly alien. Besides, sticking to a normal and fairly predictable methodology would be useful enough to ascertain which microphones were ultimately producing a more convincing sound.

 

And so, with everything in place, I set a good tone and level for the amp, recorded a 2 minute guitar sequence consisting primarily of Killing Joke’s awesome “Requiem” riff, and set about comparing 19 different microphones, each run through a 7CA N72 preamp, and gain matched as closely as possible. This, as I would quickly discover, was an extraordinarily tedious process. Nevertheless, some time later I ended up with 19 separate recordings of the same guitar part. After phase-aligning them all, I was finally ready to compare, one by one, each recorded signal played back through a single ADAM A7X studio speaker against the raw sound of the amplifier perched next to it.
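Phase alignment is usually done by eye in a DAW, nudging one waveform until it lines up with another, but the same thing can be estimated numerically. Below is a minimal sketch, assuming NumPy and a brute-force cross-correlation; the signals are random noise standing in for two mic recordings of the same take, the 7-sample delay is invented, and `find_lag` and `align` are hypothetical helper names rather than anything used for the recordings here.

```python
import numpy as np


def find_lag(reference, other, max_lag):
    """Samples by which `other` lags `reference` (negative if it leads),
    found by brute-force cross-correlation over +/- max_lag samples."""
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = reference[: len(reference) - lag], other[lag:]
        else:
            a, b = reference[-lag:], other[: len(other) + lag]
        n = min(len(a), len(b))
        score = np.dot(a[:n], b[:n])
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def align(other, lag):
    """Shift `other` earlier by `lag` samples, padding with silence."""
    if lag >= 0:
        return np.concatenate([other[lag:], np.zeros(lag)])
    return np.concatenate([np.zeros(-lag), other[:lag]])


# Stand-in signals (assumption): the "bright" mic hears the same thing
# as the "dark" mic, 7 samples later.
rate = 48000
rng = np.random.default_rng(2)
dark_mic = rng.standard_normal(rate // 10)
delay = 7
bright_mic = np.concatenate([np.zeros(delay), dark_mic[:-delay]])

lag = find_lag(dark_mic, bright_mic, max_lag=50)
blend = 0.5 * dark_mic + 0.5 * align(bright_mic, lag)
print("estimated lag:", lag, "samples")
```

Once the lag is removed, two microphones can be summed without the comb filtering that a small timing offset would otherwise cause. Real mics also differ in frequency and phase response, so the correlation peak will be less clear-cut than in this toy case.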

Now, it’s worth pointing out the degree of fallibility for such an experiment, because while this certainly provides interesting insight into the degree to which different microphones affect the sound, ultimately we have to concede that there are elements of the playback signal path that themselves influence the resultant sound, most notably of course the playback speakers. While studio monitors are broadly considered to be “flat response”, in truth each set is subtly different, and therefore this difference is exerted on the sound we hear. This is something for which I am unable to account in my experiment, because to do so would involve trialling each recording through multiple pairs of speakers in multiple different environments (or in an anechoic chamber), and since such a procedure would be grossly impractical, I am happy to take it on faith that my A7Xs are truthful enough for me to have confidence in what I’m hearing. I’ve used them for long enough to feel happy with that.

The other stipulation is that, whilst I am able to show my findings by presenting each recording here for you all to hear (as I will shortly do), what I am unable to do is provide you with the same experience I had at the time in directly comparing the real sound of the amplifier to the recording, flipping back and forth between them. Because… well… how would I do that? I would have to load the AC30 into my car, drive to your house, set it up next to you and repeat the experiment in real time with you in witness. This presents all kinds of problems… Is there space in your house to do that? Do you live overseas? How would I get my car there? Could I get the day off work? Who would look after the cat?

You see… not easy at all. Anticipating this problem, the best I could do was to put a single microphone further back in the room and record the sound of the amplifier being compared with my recordings being played back. This is of course somewhat paradoxical, because I still have to record through a microphone in order to do that and thus incur all the peculiarities of that mic in the process, as well as all sorts of other problems such as room influence given the position of the playback speakers relative to the microphone, but at least it provides something approximating a basis for comparison. We have some context, albeit not the real-life, “you-really-had-to-be-there”, bona fide sort. I used a Telefunken CU-29 for this.

 

Okay. Enough.

 

Enough rambling on.

 

Like a twat.

 

Let’s hear them results!

 

So here is just a straight up mic comparison. For good measure I also included a version of the DI’d guitar recording run through Native Instruments’ Guitar Rig software, emulating an AC30 with settings approximating those on my amp, however I had to do a bit of fiddling to get close to what I was hearing through the amp, and so the settings in the software were nothing like the settings on the amp itself. Still though, it sounded… sort of close… sort of. I would recommend downloading these files and dropping them sequentially into consecutive DAW channels so that you can directly compare them in a more revealing way:

  1. Native Instruments’ Guitar Rig
  2. Shure SM57
  3. Beyerdynamic M201
  4. Electro-Voice RE20
  5. AKG D112
  6. AKG C414
  7. Audio Technica AT4050
  8. Neumann U87
  9. CAD Trion 8000
  10. Telefunken CU-29
  11. AKG C451
  12. Audio Technica ATM450
  13. Gefell M300
  14. Rode NT5
  15. Beyerdynamic TG-D57C
  16. Shure Beta 98
  17. Panasonic WM61A
  18. Beyerdynamic M160
  19. Royer R121
  20. MXL R144

 

So, having completed these comparisons and sat in my studio chair flipping back and forth between the amplifier and each recorded signal, I was able to draw some conclusions. These can be summarised as follows:

  1. No one single mic can reproduce the sound of a harmonically rich guitar amplifier.
  2. The SM57 in particular is not a good choice for this application. It sounds thin, lacking in bottom end and definition.
  3. Large diaphragm condenser mics and ribbon mics are a more appropriate choice.
  4. The most convincing single microphone, if I had to pick one, is the Telefunken CU-29.
  5. I struggle to hear significant difference between a Royer R121 (£1300) and an MXL R144 (£120).
  6. The most convincing replication was produced by pairing microphones, typically a very dark mic and a very bright mic. My favourite combo was the MXL R144 + Panasonic WM61A or Audio Technica ATM450. Here’s what that sounded like:

 

  1. MXL R144 + Panasonic WM61A
  2. MXL R144 + Audio Technica ATM450

 

The MXL R144 is a dark sounding ribbon mic capable of handling high enough SPL to sit quite comfortably in front of a guitar amplifier, and its proximity boost also serves to provide quite an Earth-shaking bottom end. The WM61A and ATM450 are both small diaphragm condenser microphones, which sound questionable on their own but combine nicely with a dark microphone to add detail and “fizz” to the signal. The WM61A actually refers to a cheap Panasonic capsule that can be easily made into a very basic omni-directional microphone. This means that the ideal guitar mic combo is actually comprised of two of the cheapest mics in the trial.

Let’s now take a listen to the amp/playback comparison, as recorded further back in the room by a single condenser microphone. Once again, I recommend downloading all files and comparing them within a DAW:

  1. Vox AC30
  2. Shure SM57
  3. MXL R144 + Panasonic WM61A
  4. Telefunken CU-29
  5. Neumann U87

 

What I conclude from this is that, although there does not appear to be any perfect solution that offers a true reproduction of the source, the closest approximation is achieved by combining tonally distinct microphones and then blending them to taste, perhaps even making subtle adjustments to each of the mics (like rolling off a little top end from the ribbon mic). This then means that only small adjustments need be made, rather than relying on plugins to artificially butcher the sound later on.

So, to wrap up – turns out I was right all along! Well done me.

 

Oh yeah, and fuck the SM57.

 


MOTU??! Urrgh… *Real* Sound Engineers Only Use Prism Converters, Dontcha Know? Hiffle, piffle, pibble and fwaf.

Earlier this year I visited Steve Albini’s Chicago studio, Electrical Audio, with the goal of not only recording some drums with the man himself, but also scrutinising his mic techniques in order to learn more about how such an incredible drum sound is achieved. The results were as fantastic as you would expect, and I publicly documented my experiences via a video presentation, thanks in no small part to the assistance of an impressively bearded cameraman by the name of Kevin Clarke. You can see the resultant video here.

 

In order to record the session as flexibly as possible, and preserve all naked, ungrouped signals for my scrutiny later on, I knew that I had to split the signals coming from each microphone, with one batch going to Steve for recording to tape, and another identical but independent batch leading to my digital recording system. To make this work, Steve carried out an impressive feat of ad-hoc patching in order that we could split the signals after the desk preamp. This way I would still maintain whatever Neotek goodness was being imparted on each signal. The only differing variables between the two simultaneous recordings were the recording mediums themselves; Steve recorded to RTM900 2” tape, 16 track, 15 IPS. I recorded digitally at 48 kHz, 24 bit via a pair of chained MOTU 828 mk2 interfaces.

 

Now, I didn’t particularly give much thought to what interface I would use for my end of the recording, given that I consider all interfaces to be much of a muchness. They all do the same thing, and they all sound pretty much as transparent as each other. Quibbling about spec sheets aside, the fact of the matter is that the analogue-to-digital converters inside all interfaces across the price spectrum these days are perfectly capable of capturing and reproducing music transparently, and any talk about the correlation between “sound quality” and price tag is, in my opinion, grounded in a whole host of psychological biases which influence our perception of “quality” to an impressive degree, even more so when you’ve actually paid over the odds for what is, at best, an imperceptibly subtle improvement. “Yeah, this shit sounds fucking sweet. Now get an awesome photo of it for the website.”

 

The association of price with quality is a well-documented phenomenon, and companies like Apple and Neumann are masters of its manipulation. Of course you’d pay £65 for a MacBook charger! That’s the price you pay for quality (or a flimsy piece of shit that breaks after a year). And of course you would pay £250 for a U87 cradle! Why, you’d be an unprofessional fool not to (despite the fact that any old £10 cradle would do an identical job, they just lack the correct microphone attachment). Companies have been selling us overpriced shit for years, and the justification rests largely on the kudos that a particular brand name has within its particular market.

 

Ah, but hang on… I’m not taking into account that, when it comes to audio devices, engineers the world over can really, really hear the difference! No, really! They really can! And that’s why they’ve got Prism converters in their studios! Because their ears are, like, super mega awesome. Better than your ears, and definitely better than mine! They sleep every night in an anechoic chamber in order to recalibrate their hearing for the day ahead, and the really top guys have bionic aural implants that allow them to hear all the way up to 40 kHz! They’re like dogs! They’re literally just like fucking dogs.

 

The fact that so many people claim to perceive an improvement in sound quality with respect to the price of their ADCs is, to say the least, unconvincing. There’s an interesting Caltech study that actually explores this phenomenon in more depth, which you can find on their website here. I’m going to lift the gist of the study from this website, which does a fine job of summarising it:

 

Researchers from CalTech and Stanford told subjects that they were drinking five different varieties of wine and informed them of the prices for each as they drank. But in reality, they only tasted three types, because two were offered twice: a $5 wine described as costing $5 and $45, and a $90 bottle presented as $90 and $10. (There was also a $35 wine with the accurate price given.) Not only did the subjects rate identical wines as tasting better when they were told they were pricier, but brain scans showed greater activity in a part of the brain known to be related to the experience of pleasure. In other words, the experiment may be evidence that we genuinely experience greater pleasure from an identical object when we think it costs more.

 

Though admittedly not conclusive, these findings nonetheless point to a phenomenon that our deepest intuitions surely corroborate; when we think something costs more, or when we have a significant vested interest, we are biased towards defending its greatness.

 

So anyway, back to the Albini session. Based on what I considered to be an obvious truth about the perceived sound quality of over-priced ADCs, I was happy to facilitate my recording with the closest, most convenient solution to hand; a couple of MOTU 828s that I could easily chuck in a suitcase and get to Chicago without much hassle. However, in doing that, and knowing that these devices would appear in the video, I absolutely knew that some twat would appear from the woodwork at some point and feel the need to start the tedious “MOTU converters aren’t good enough” debate, and sure enough, one plucky commenter on my YouTube channel decided to do just that.

 

And so the criticism raged about how “MOTU converters sound cloaked and muddy” (whatever that means), and no engineer worth his salt would use anything less than Prism Orpheus converters at £ several-hundred per channel to capture anything like the kind of magic that Steve was laying down on his Studer A820, and he should know, because he’s worked in, like, loads of studios n’ shit, and, like, all his engineer mates agree… and this one time, right, he transferred an album using MOTU converters, and then had to re-record the whole thing because MOTU converters are, like, just so shitty sounding and all cloaked and muddy and stuff.

 

Anecdotal accounts of this kind are unimpressive for reasons too numerous to mention. The only way to claim real knowledge of this sort, I would argue, is to have performed some pretty rigorous blind trials, ensuring that only the single variable under scrutiny is the thing in the signal chain to be altered. Or indeed, if you feel confident enough to make the claim that something sounds “cloaked and muddy”, that you have subjected this to some close scrutiny such that you can describe in less ambiguous terms the perceived character of signal degradation, and under what conditions it manifests.

 

So, as much as I recoil from the implication that I’m writing an entire blog post just to prove one person on YouTube wrong, I do think it presents an interesting opportunity to run a few tests and explore the topic further, not least so that my own perceptions may be less clouded by mere rhetoric of this kind.

 

Right then, let’s get to it. The first obvious comparison to run is between Steve’s tape recording and my simultaneous digital recording. This would essentially be a comparison between RTM900 2” tape and 48 kHz, 24 bit MOTU digital. Now, there are a few issues with this comparison which are worth laying bare at the outset, the most notable of which being that the multi-track transfer from tape to computer was itself made using the MOTU devices. This will obviously upset the angry YouTubers of this world, as the claim then becomes that the very act of running the tape recordings through MOTU converters has itself imparted intractable MOTU ugliness upon the signal, and therefore the comparison is of limited utility. And yes, I would agree that ultimate scientific rigour is sadly absent from such a comparison. However, that being said, if we are to assume that the 2” tape itself is doing something uniquely magical to the signal (warm, punchy, creamy, soupy… whatever bullshit adjective you want to throw in there), then we should expect to hear some difference between that recording and a purely digital recording. If the MOTU devices have done something nasty to the signal, then it should have incurred such nastiness identically on both the original digital recording and on the tape transfers, and as such we should still be able to identify which one maintains a semblance of the analogue magic. At the very least we should be able to tell them apart. Of course, tape is identifiable by its hiss, so in order to make this comparison fair, I’ve artificially added some hiss to the digital recording.

 

Below this paragraph there are two files; one tape recording (digitally transferred), and one simultaneous digital recording. Both are multitrack mix-downs of a solo drum performance with matched processing – the analogue tape utilising Steve’s outboard (as described in the aforementioned video), and the digital using comparable in-the-box plugins. You are invited to download them and compare them. See if you can spot the difference. Which one has the analogue “magic”? If you email me at james [at] jamesmakesmusic.com I will happily tell you which one is which, although be warned that I don’t consider this a bullet-proof perception test, given that you still have a 50% chance of guessing it correctly. This is simply a little comparison to get us started:

 

So let’s now move on to the real test, and that is a test of the claim that MOTU converters sound “cloaked and muddy”, and that I am clearly a fool to be using them. I’m going to take my cue here from the brilliant Ethan Winer, acoustician and life-long debunker of audiophile claptrap, who has himself conducted an identical test to the one I am about to perform, but with Focusrite and Soundblaster converters in his sights, rather than MOTU. You can find his tests on his website here. Essentially, the logic works like this:

 

The claim is that ADC X sounds crappy in some way (“cloaked and muddy” in this case). Therefore by performing a loopback recording using the ADC in question, over a number of generations this crappiness should become more and more pronounced. A loopback recording simply means playing an audio file out of a stereo output of the device, and physically patching that output back into two inputs and recording it in whatever recording software you use. The resultant recording is then used as the source for the next loopback recording. And that one for the next. And that one for the next. And so on. After several generations the claim should be very easy to verify, as the signal degradation, or “cloaked muddiness” should be cumulatively imparted on the signal, such that the discrepancy between, say generation 10 and the original file is absolutely obvious. Certainly, if the ADCs under scrutiny are as crappy as has been claimed, then even after a single generation we should hear an obvious difference between the result and the original file. However, if it transpires that it is a struggle to hear any difference, and actually in a blind trial it is not even clear which is the original file and which are the subsequent generations, then it’s pretty safe to say that people like our YouTube friend may well just be subject to the aforementioned psychoacoustical biases, and therefore, simply talking shit.

Below this paragraph you will find several batches of files. I used four different pieces of music as my test subjects, ranging from grunge, through trip hop, to classical and jazz. For each genre I have posted four files; the original, generation 1, generation 5 and generation 10. I have aligned all recordings and gain matched as closely as possible. All recordings were carried out at 44.1 kHz, 16 bit, in order that any degradation manifests as obviously as possible. The goal for anyone who wants to partake in the challenge is to correctly identify each file. Once again, you can obtain the correct answers by sending me an email to james [at] jamesmakesmusic.com. If you are confident about MOTU converters being unsuitable for use because of their defects in “sound quality”, then you should have no trouble correctly identifying each file. And to RasTatum – the man who made the claim about MOTU converters sounding “cloaked and muddy” in the first place (but who has since rather curiously deleted all his comments) – I eagerly await your contribution to this experiment.
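For anyone curious what "gain matched" can look like in practice, here is a minimal sketch, not the exact procedure used for the files below. It assumes RMS level is the thing being matched, and the test tone, the 1.5 dB drop and the function names are all made up for illustration.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def gain_match(reference, candidate):
    """Scale `candidate` so that its RMS level equals that of `reference`."""
    gain = rms(reference) / rms(candidate)
    return [s * gain for s in candidate], gain

# Hypothetical data: a 1 kHz tone, and a copy that came back 1.5 dB quieter.
sample_rate = 44100
original = [0.5 * math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(4410)]
generation = [s * 10 ** (-1.5 / 20) for s in original]

matched, gain = gain_match(original, generation)
gain_db = 20 * math.log10(gain)  # the make-up gain that was applied, in dB
print(f"applied {gain_db:.2f} dB of make-up gain")
```

Matching levels matters because even a small loudness difference tends to be heard as a difference in quality, which would muddy a comparison like this one.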

 

Good luck!

 

Nirvana – Smells Like Teen Spirit (1991)

 

Sneaker Pimps – 6 Underground (1996)

 

Royal Liverpool Philharmonic Orchestra – Beethoven’s Symphony #2 (1998)

 

Thelonious Monk & Gerry Mulligan – ‘Round Midnight (1957)

 

————–

Finally, the last brief experiment I wished to perform was a test of the self-noise of MOTU converters, because whilst the sound quality may not be as affected as we thought, that still leaves room for the claim that the devices themselves are noisy. So I performed another loopback test using a file of total silence as the source. I won’t bother actually posting the resultant audio files here, but I will tell you that after 20 generations I was seeing a noise floor of -72.2 dB, as you can see from the image below. I hope you’d agree that this result renders concerns of this type absolutely negligible.
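To put a figure like that in context, here is a rough sketch of how a peak level in dB relative to full scale can be worked out from raw sample values. It assumes signed 16-bit samples, and the noise values are invented purely for illustration, not taken from the actual recordings.

```python
import math

def peak_dbfs(samples, full_scale=32768):
    """Peak level of integer PCM samples in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # true digital silence has no finite level
    return 20 * math.log10(peak / full_scale)

# Hypothetical 16-bit noise that never strays more than 8 steps from zero.
noise = [3, -8, 5, 0, -2, 7, -6, 1]
level = peak_dbfs(noise)
print(f"{level:.1f} dB")  # about -72.2 dB
```

In other words, a peak of just 8 steps out of 32,768 on a 16-bit scale already sits at roughly -72 dB.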

So there you have it. Turns out you don’t actually need to spend thousands of moneys on ADCs just because someone tells you to.

 

Hooray!


Crushed To Hell: My Thoughts About Mastering

For the second time in my life I have realised that reaching out to a mastering studio to put the “finishing touches” on my music is completely pointless.

Allow me to explain…

Mastering recorded audio became its own discipline after the Second World War, when a “dubbing engineer”, secondary to the recording/mix engineer, was tasked with transferring the recorded audio from tape to a master disc, which served as the template from which all following vinyl discs would be pressed. This was a purely technical procedure, whereby the dubbing engineer’s job was to ensure that the final recording, which had been signed off by those creatively involved with the production of the music, was faithfully duplicated onto its designated medium.

Early vinyl records tended to be dogged by various inefficiencies in the tape-to-disc transfer process, not least that the dynamic range of the recorded material could be too large, resulting in the cutting of unplayable waveforms where the needle would actually pop out of the grooves, or even burning out the disc cutting head. The use of compressors and limiters in the mastering process became widespread in the 1960s, to cap the dynamic range at a particular threshold and thus ensure that such problems could be avoided. However, because this process was automated, often the dynamics processing employed was not sympathetic to the fidelity of the original material, and so over-compression would sometimes squeeze the life out of it, making everything sound consistently loud in a way that dishonoured the integrity of the original tape master. Some records ended up sounding particularly nasty due to this pitfall at the mastering stage.

And so, the solution to this problem?

Enter the mastering engineer.

By the 1970s, dedicated mastering studios had been established, staffed by sound engineers using high-end equipment. These “mastering engineers” were incredibly adept at finalising tape masters in an artistically satisfactory way, establishing mastering as a new artistic discipline that could actually make the final result sound “better” than the original recording.

Throughout the 80s and 90s, music production was revolutionised by digital technology, and CDs became the darling format of the music industry. As a result, mastering for vinyl became less prominent, as the problems incurred by analogue playback were no longer an issue in the digital domain. Mastering engineers, however, did not disappear. Instead their role evolved into that of an audio specialist who serves as the last step in the production process – the guy or gal who collates all the final mixes for a particular release, and applies their technical wizardry to ensure that program volumes and tonal balancing are consistent throughout the entirety of the album. This is arguably of particular importance given the infinitely flexible DIY audio production world in which we now live, where track one may have been recorded and mixed in your bedroom, and track ten is a live recording from that gig you played last year – a far cry from the rigidly calibrated standards of professional audio recording of the 60s and 70s. Here the mastering engineer can be an invaluable specialist who coalesces all of these final mixes, “topping and tailing” each song to run seamlessly from one to the other, and thereby creating a pleasingly consistent album.

So, what’s my beef with mastering then? Why the need for such cynicism over a specialist process that seems so necessary?

Well, as we have seen, the discipline of mastering has migrated away from being a technical necessity, and has reinvented itself as an artistic process that seeks to “correct” and “improve” audio recordings. It seems to me that underpinning this is an assumption that all recordings require “correction” and “improvement”, such that it has now become an almost unquestioned assumption that recordings must undergo such processes before they are properly finished, regardless of the fact that 99% of all recordings these days end up uploaded onto Soundcloud or YouTube, and as such have absolutely no technical requirement for any fiddling at the final stage. I have had this demonstrated to me twice in my life, and both times I reached the conclusion that mastering is really only necessary if identifiable problems are present with the final mixes. In short, if your final mixes sound great to you, and you are satisfied that they translate well across systems, then you really have to ask yourself what the point of having it mastered actually is.

Case in point, I recently finished working on two songs of my own, and rather than do my normal thing of using some light compression, adding a little sweetening EQ and then normalising the result, I decided that it is high time I found myself a decent, trustworthy mastering engineer to whom I could reliably outsource any material recorded at my studio – for both myself and my clients – to put the “finishing touches” on the mixes. The icing on the cake. The cherry on top. The sachet in the pot noodle. The mayonnaise on your kebab. Whatever your favourite culinary analogy, that’s what I thought. And so I touched base with several mastering facilities, both home and abroad, each of whom did a test master for me of one of my songs.

The result?

In each instance I found their work to be a terrible detriment to my original mix; crushed with compression in a way that seemed to me to be horribly distasteful, and accompanied by notes claiming things like “I tried to make it a tad warmer and kill some spikiness in the guitar”. This seemed to me to be slightly presumptuous – perhaps I like the spikiness in the guitar (I do). But of course, how was he to know otherwise? He is not familiar with my style, my artistic preferences, or what I consider important about my mixes, and so he was just trying to rectify the problems in the mix, as he perceived them. Attempts to articulate my preferences via email just lead to a cumbersome back and forth whereby words prove to be an inefficient medium in which to convey the subjective pleasure of ambiguous terms such as “guitar spikiness”, let alone any other of the myriad things that I neglected to mention. I actually work hard to capture a wide, natural ambience in my music, especially in the drums, and I feel that, in this current age of “loudness war” style over-compression, excessive limiting of transients in order to push up the aggregate volume of the music actually works against this kind of production style, and forces a kind of “breathlessness” in the music, where everything becomes squashed into a mulch of muddy sounding loudness.

Let’s take a closer look…

1

The image above depicts a stereo waveform representation of my original mix (red), followed by two subsequent masters from two different studios. In both cases we see that the audio peaks have been truncated in order that the aggregate level can be further maximised. The blue-backed waveform represents an attempt by the first mastering engineer – this waveform is a real sausage! Obviously hugely compressed (oddly more so on the right hand side than the left), which manifests as very noticeable “gain pumping” (sharp volume rises and falls) when listening. Detailing this concern to the second mastering studio, they returned their master, which is the yellow-backed waveform above. Noting that I was not a fan of excessive compression, they opted to still squash the mix, but just not as much. The result was a slightly less severe but still noticeable and ugly compression.

It actually seems to have become second nature to mastering engineers to simply make everything as loud as possible, because, hey, louder = better, right? We can see this trend towards excessive loudness by comparing two more waveforms, this time from Nirvana’s song “Smells Like Teen Spirit”, recorded in 1991. The image below depicts the original 1991 master (blue), and the 20th anniversary remastered “Special Edition” from 2011 (green):

2

It is interesting to note that the blue-backed waveform clearly shows Nirvana’s signature loud-quiet-loud song structure represented as an actual change in peak volume between the verses and the choruses. Cut to 2011 and this natural dynamic has been crushed in order to raise the aggregate level of the song, arguably sacrificing one of the very trademarks that made Nirvana such a dynamically versatile and intense band in the first place. So no, louder is not always better.

But here’s another reason to be wary of excessive compression. Look what happens when we truncate peak waveforms in this way:

3

The above image shows a close-up of my original mix (red) side by side with the first master (blue). What we see is that, by truncating audio transients we are actually sacrificing audio content that would otherwise have been present. The detail displayed in the red wave has been totally lopped off and replaced with something resembling a large square wave. Square waves actually introduce odd-ordered harmonics into the signal, which manifests to our ears as rather ugly distortion.
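The harmonic claim is easy to check numerically. The following is a minimal sketch rather than an analysis of the actual masters: it assumes a pure sine wave and perfectly symmetric hard clipping at half amplitude, and the window size, clip level and function name are arbitrary choices for the demonstration. Asymmetric clipping would add even-order harmonics as well.

```python
import cmath
import math

N = 1024      # samples in the analysis window
CYCLES = 8    # whole cycles of the fundamental inside the window

def harmonic_level(samples, harmonic):
    """Amplitude of one harmonic of the fundamental, via a single-bin DFT."""
    k = CYCLES * harmonic
    acc = sum(s * cmath.exp(-2j * math.pi * k * n / N) for n, s in enumerate(samples))
    return 2 * abs(acc) / N

sine = [math.sin(2 * math.pi * CYCLES * n / N) for n in range(N)]
clipped = [max(-0.5, min(0.5, s)) for s in sine]  # lop the peaks off at half height

levels = {h: harmonic_level(clipped, h) for h in range(1, 8)}
for h, level in levels.items():
    print(f"harmonic {h}: {level:.4f}")
```

The unclipped sine contains only harmonic 1. After clipping, energy appears at harmonics 3, 5 and 7 while 2, 4 and 6 stay at effectively zero.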

So it seems to me that we are somewhere close to the old days of ramming final mixes through limiters at the mastering stage simply as a matter of course rather than because the music actually warrants it. Indeed, when I suggested to subsequent mastering engineers that I don’t wish to overdo the compression, they still felt inclined to push it somewhat, rather than to err on the side of subtlety. It’s curious why this has become the norm, and of course the much discussed “Loudness War” of the 2000s has impacted significantly upon the industry, such that it seems as though a mastering engineer doesn’t feel he is creating value for money unless he is seen to be mastering for “competition volume”, or else tampering with the mix to some significant and obviously noticeable degree. But for me, this is not actually the job of a mastering engineer. It seems to me that a principled mastering engineer should not be afraid to listen to a mix and decide that nothing needed to be done to it. And to that end, their job is done, and they are still every bit as entitled to be paid as if they had actually decided that there were real tonal balance problems that needed to be rectified. The mastering engineer is your last line of defence against actual technical problems, not a dude who can make your mixes sound “shit hot”. Working under that preconception actually encourages sloppy mixing, because it’s okay – the mastering guy will fix it!

So, where does this leave me?

Well, just to be clear – I am not a mastering engineer, and I do not claim that I can adequately do the complicated job of fixing the technical problems of someone else’s mixes. This task is for dedicated mastering engineers who are good at what they do and conduct themselves in a principled and agreeable manner. But I would urge you, if you’re happy with your mixes and you love the way they sound, please ask yourself – what exactly is the problem that you’re trying to solve? Personally I can only conclude the same point that I reached several years ago when I went through a similar experience: I seem to be trying hard to locate a mastering engineer to whom I can pay money in order to fix unidentifiable problems. All they seem to do – inevitably – is fail to align with my artistic vision and return results that I actually think make my mixes sound worse, not better. And so, being that I do not wish to employ someone to make further creative decisions on mixes that I am already satisfied with, it seems to me that I should take my cue from my previous decision on this matter, and that is that the person best placed to put any “finishing touches” on my music is me.

Hopefully, in a few years from now, when I have again forgotten why I don’t use mastering engineers and I find myself once again looking for that special someone who can put the awesome “finishing touches” on my music, this blog post will serve as a reminder of just how pointless that pursuit is.


The Bullshittery Of Audio Jargon

The topic of audio recording is vast and open-ended, and discussion about associated equipment in particular often gives rise to much heated debate with respect to perceived differences in the sonic performance between devices. It is not uncommon for hostile discussions to be waged pitting the minute characteristics of this device against that, with all parties using increasingly elaborate language to define their subjective auditory experience, yet in the process obfuscating any real scientific analysis in favour of regurgitating “buzz” words that, when examined, actually fail to reveal anything helpful about the nature of the device in question. “Warmth”, “openness”, “air”, “punch”, “creaminess”, “sheen”, “silkiness”, “purpleness”, “dogturdidness”; fluffy terminology of this nature can often be observed in industry magazines (as some notable culprits are particularly guilty of), where vast word salads are served up in an attempt to suitably bewilder the reader into believing some imposed perception about a given piece of equipment. Whether it is an industry effort to create brand association with generic “good sounding” Barnum statements, or simply sloppy journalism in which authority comes from using words that everyone is too confused to question, the amount of bullshit I witness people talking on a regular basis goes to show how successful this method is.

I find language of that nature problematic for several reasons, not least because it denies us, as students of audio recording practices, access to scientific truths with regard to our field, where discussion of imparted harmonic content via signal distortion is much more helpful than fogging the issue under a linguistic cloud of subjective terminology and thereby propagating marketing myths about the necessity of over-priced equipment. It is no doubt a valuable weapon across all levels of the audio equipment industry, each brand justifying the apparent necessity of its newest model by using words that no one really understands. It’s interesting how readily we accept this lack of clarity in the discussion of audio, and how encourageable everyone seems to be to jump on the bullshit bandwagon. Note how we don’t accept this terminology in discussion of equipment where the scientific validity of its specifications really matters – I’m sure no fMRI scanner was sold on the basis of the “punch” of the scan or the “warmth” of the images produced. We can more readily accept fuzzy jargon in that context as obviously ridiculous and unhelpful.

“Brilliance”, anyone?

One of my audio recording heroes is Ethan Winer – musician, acoustician, and owner of the acoustic treatment company RealTraps. Ethan is somewhat notorious for his efforts to debunk tenacious myths prevalent among recording enthusiasts, whilst grounding his discussions in empirical scientific analyses, thereby abstaining from and often criticising the use of ambiguous subjective terms. I highly recommend his book “The Audio Expert” in which he talks about this very topic:

“Some of the worst examples of nonsensical audio terms I’ve seen arose from a discussion in a hi-fi audio forum. A fellow claimed that digital audio misses capturing certain aspects of music compared to analog tape and LP records. So I asked him to state some specific properties of sound that digital audio is unable to record. Among his list were tonal texture, transparency in the midrange, bloom and openness, substance, and the organic signature of instruments. I explained that these are not legitimate audio properties, but he remained convinced of his beliefs anyway. Perhaps my next book will be titled Scientists Are from Mars, Audiophiles Are from Venus.”

With this in mind then, allow me to demonstrate the principle of audio bullshit in action. As I came to undertake an investigation into the sonic differences between several different microphone preamps (post on that soon), I encountered a 2007 article from Sound On Sound in review of the Neve Portico 5012 Dual Microphone Preamp. As my curiosity led me to probe how such a device can justify a £1,400 price tag, one sentence in particular proved to be such an excellent demonstration of the ambiguity of industry terminology that I was inspired to finally write this blog post, hailing my discovery as a gold standard of audio bullshittery. Let’s have a look:

“The 5012 […] has a full bodied, solid sound that gives that slightly larger-than-life character that is the trademark of a really top-class preamp. It sounds clean and detailed in normal use, without that edgy crispness that can detract in some designs…

When the Silk mode is switched in, the sound becomes a little smoother, rounder, and sweeter still in the upper mids. The high end gains a little more air, and the bottom end becomes a tad richer and thicker.”

Terms like “larger-than-life” and “edgy crispness” are rampant when describing microphone preamps, analogue-to-digital converters and other studio essentials, yet they say nothing useful whatsoever about the actual, verifiable sonic characteristics of the device, instead simply propagating the usage of these vague terms and using them as flimsy justification for impressionable enthusiasts to feel anxious about the “below-par” consumer grade equipment they are currently using, and therefore encouraging them to unnecessarily part with not insignificant sums of money, thereby continuing the trend. That’s not to say of course that there is no value in “high-end” gear such as this, however I would prefer that its usage could be justified in more certain terms than these floppy, nothing words that we all have to keep grappling with. In my experience it’s always worth pushing for clarification via language that is arrived at through scientific consensus so that we can all be on the same page in terms of our expectations. This is the best prophylactic available against the tech-heads who claim authority by asserting that their £X,000 device sounds “sweet”. Chances are, they’re talking bullshit.


Stereo Recording Techniques On Test

Often in recording scenarios it is necessary to implement a stereo miking technique. Usually this is employed to capture room ambience at a distance from the originating sound source, by which I mean the reverberant field of an acoustic environment – anywhere where the late reflections are of greater intensity than the direct sound. Whether it’s for drum kit ambience, concert halls or choirs, ambient stereo miking provides a way of adding depth, width and general realism to the recording that is not possible through close-miking alone.

However, there are numerous stereo mic techniques and it struck me recently that I had never undertaken a direct comparison of them. This realisation in fact struck me with such vigour that I felt moved to instantly rectify the situation, spontaneously leaping up from my seat, screaming “STEREO MIC COMPARISON!!”, and bolting, arms flailing and screeching like a girl, towards the door. The other cinema-goers were somewhat bemused.

And with that I decided at once to trial four different stereo mic techniques over a few different scenarios. These are techniques that any good engineer should be aware of, but perhaps not all have actually directly compared. Well, in the name of science I hereby rise to the challenge.

Yes, that’s right… science.
 
 
So the four techniques on the menu today are the following:

  • #1: XY
  • #2: ORTF
  • #3: Blumlein
  • #4: Mid/Side

stereo-mics

I won’t go into detail about the configuration of these techniques here, largely because it’s late and I can’t be bothered, but if you’d like to know more about their implementation, please follow this link.
 
 
The purpose of my trials would be to answer the following questions:

  • Which technique captures a more effective and balanced stereo image?
  • How well does each technique collapse to mono?
  • How rich is the tonal balance?
  • Which one do I like best?

 
 
I chose to make these comparisons under 3 different scenarios: a large, reverberant concert hall, the drum recording environment in my studio, and with a moving sound source in a small room, which in this case was me wandering around and talking. The tests employed two sets of microphones – two AKG C414s for Blumlein and Mid/Side, and two AKG C451s for XY and ORTF. This selection was imposed due to equipment restrictions, otherwise identical microphones would have been used for all applications, thereby eliminating the variable of the sonic performance of the different mics. Nevertheless the comparisons should allow us to draw some reasonably solid conclusions.
 
Listed below are the recordings. Click each one to listen for yourself and see if you agree with my analysis:

 
stereo-mics-2
 
stereo-mics-3
 
 
So, based upon these recordings, along with a whole load of other tests I carried out which are not listed above, here are my answers to the aforementioned questions:
 
 

  • Q: Which technique captures a more effective and balanced stereo image?
  • A: Mid/Side.

The weakest stereo image seemed, across the board, to be XY. It has a strong centre but very little width. This is unsurprising, since the capsules are so close together that it seems illogical to expect anything more. This is as I had always suspected, and why I never really felt tempted by this technique. The lack of movement in the voice recording is particularly noteworthy. My next preference is ORTF due to its much wider stereo image and strong centre point, followed jointly by Blumlein and M/S, both of which clearly exhibit a wide, detailed image. If I had to pick a winner, I’d go with M/S. The movement may not be quite as authentic as Blumlein, perhaps due to the trickery involved in the M/S configuration vs. the fairly organic method of Blumlein, however for the capture of room ambience for a static source, M/S just seems to have a special kind of something about it – a width and depth that to my ears is incredibly realistic.
 
 

  • Q: How well does each technique collapse to mono?
  • A: ORTF & Blumlein win.

The mono drum recordings reveal that no technique has any particular issue or phase weirdness occurring when collapsed to mono, however in terms of preserving the fidelity of the ambient field that we are attempting to capture, Blumlein and ORTF seem to have it over M/S and XY. With M/S this is due to the cancellation of the side mic so that we are left with only one microphone pointing at the source, and XY had the weakest stereo width anyway, so this result is unsurprising.
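The side-mic cancellation falls straight out of the arithmetic. Here is a minimal sketch, assuming the usual sum-and-difference decode (left = mid + side, right = mid - side) and a mono fold-down that simply averages left and right. The sample values and function names are made up for illustration.

```python
def ms_decode(mid, side):
    """Turn mid and side signals into left and right channels."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

def to_mono(left, right):
    """Collapse a stereo pair to mono by averaging the two channels."""
    return [(l + r) / 2 for l, r in zip(left, right)]

# Hypothetical sample values for the two microphones.
mid = [0.5, 0.25, -0.125, 0.0]
side = [0.25, -0.5, 0.125, 0.0625]

left, right = ms_decode(mid, side)
mono = to_mono(left, right)
print(mono == mid)  # the side signal has vanished entirely
```

Whatever the side mic picked up is added to one channel and subtracted from the other, so summing the two leaves only the mid mic.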
 
 

  • Q: How rich is the tonal balance?
  • A: Mid/Side wins.

We have to be a little careful here when we start using ambiguous terms like “richness”, “warmth”, “creaminess”, “silkiness”, “moistness”, “purpleness”, etc, etc, because these aren’t exactly scientific words. However, what I intend it to mean in this instance is how well expressed are the bass, middle and treble parts of the frequency spectrum, subjectively speaking. In my view, M/S clearly trumps all others in terms of its pleasing bottom end yet detailed high frequency reproduction. This was deduced by looping small parts of the drum and concert recordings and directly comparing each technique. ORTF is also very good in this regard, followed by Blumlein and finally XY.
 
 

  • Q: Which one do I like best?
  • A: Mid/Side!

Yep, it would appear that M/S is awesome. Science says so. Well, to my ears at least. My science ears. It adds something magical to the recording and is extremely pleasing to experience, especially on drum recordings when combined with the close miked signals. Here is a demonstration of that:

 
 

Blumlein and ORTF are still excellent techniques though, offering a nice, solid centre and plenty of detailed width, which is certainly bad news for the XY technique, which has since been strapped to a rocket and jettisoned into the centre of the sun.
 
It deserved it, too.
 
 
That is all.

 


Comb Filtering In Drum Overhead Microphones

Recording drums in a small room is a problem that any engineer not blessed with an infinite budget must deal with at some point. Among the difficulties inherent in this scenario is the problem of comb filtering in the audio signal due to the microphone’s proximity to a boundary, i.e. the ceiling or a nearby wall. For example, if a singer sang into an omni-directional microphone placed 1 metre from a reflective wall or surface, the sound of their voice would hit the mic but also carry on past it, hit the wall, rebounding back and re-entering the mic about 6 milliseconds after the direct signal.

boundary

 

This is exactly the right amount of time for the frequency components around 85-86Hz to come back close to 180° out of phase with the direct signal. There will not be total cancellation, since the rebounded signal will be weaker and because the sonic characteristics of the singer’s voice are constantly changing, but the effect may still be significant.

frequencies

 

Rounding down to 85Hz, at 170Hz the reflection will come back in phase and reinforce the 170Hz components within the direct signal. At 255Hz it will be out of phase again, and at 425Hz and 595Hz, and at intervals of 170Hz all the way up the frequency spectrum. This is known as “comb filtering”, due to the regular series of peaks and notches across the spectrum. It sounds phasey and generally undesirable.
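The numbers above can be reproduced with a few lines of arithmetic. This is a rough sketch, assuming a speed of sound of 343 m/s and a reflection that travels exactly twice the mic-to-wall distance further than the direct sound. The function name is just for illustration.

```python
SPEED_OF_SOUND = 343.0  # metres per second, an assumed room-temperature figure

def comb_frequencies(distance_to_wall_m, count=4):
    """Return the reflection delay plus the first few notch and peak frequencies."""
    extra_path = 2 * distance_to_wall_m   # out to the wall and back again
    delay = extra_path / SPEED_OF_SOUND   # seconds
    notches = [(2 * k + 1) / (2 * delay) for k in range(count)]  # odd multiples
    peaks = [(k + 1) / delay for k in range(count)]              # whole multiples
    return delay, notches, peaks

delay, notches, peaks = comb_frequencies(1.0)
print(f"delay: {delay * 1000:.2f} ms")
print("notches (Hz):", [round(f, 2) for f in notches])
print("peaks (Hz):", [round(f, 2) for f in peaks])
```

With these assumptions the delay comes out at about 5.83 ms, the first notch at 85.75 Hz and the first peak at 171.5 Hz, which is where the rounded 85Hz and 170Hz figures come from.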

This effect is demonstrated in this video, where a drum overhead microphone is moved towards a nearby boundary and back again. The comb filtering artefacts are clearly audible in the recorded signal. The first microphone – a Royer R121 ribbon mic – clearly suffers from this effect with great prominence given its bi-directional polar pattern, and thus greater susceptibility to rear reflections. The second mic – an Audio Technica ATM450 – reveals itself to be less harshly affected due to its cardioid polar pattern. This then demonstrates the importance of microphone selection with regard to its placement within a recording environment, as well as the importance of placing the mic as far from boundaries as possible, or, when this is not feasible, treating nearby surfaces with good quality acoustic absorption in order to eliminate as many reflections as possible. A combination of absorption and diffusion is most effective.

 

Many thanks to my beautiful assistant, Bebe Bentley, for helping me with these tests. Check out her excellent work in film and moving image on her Vimeo page.


Impulse Responses & Convolution Reverb: How To Sample An Acoustic Space

Those familiar with audio production probably know that there are two types of digitally synthesised reverb effect. The first, and generally most popular given its byte-sized (heh) use of computer resources is known as “algorithmic” reverb, where the incoming signal is, sample by sample, multiplied by a factor dictated algorithmically by various twiddly knob-like parameters. Such reverb types, although efficient on resources, can be less than convincing upon application, especially when applied to particularly exposed instruments or voices; unsurprising since they are merely a mathematical approximation of the kind of thing reverb should probably sound like.

 

1

A twiddly knob algorithmic reverb unit.

 

The second, more sophisticated type of reverb is known as “convolution” reverb, and it is this type that is the focus of today’s brain spillings.

 

You see, convolution reverb more precisely replicates the acoustical properties of an actual real-life environment by manipulating the original recording via a method similar to the algorithmic reverb technique, but crucially different in a very specific way. This time instead of relying on a mathematically produced set of rules to determine the multiplication of the incoming signal, an “impulse response” – an actual recording of an actual real-life actual environment – is used.

 

Actually.

 

2

A convolution reverb unit.

 

“Impulse response” is just a fancy way of saying a “recording of a very short, sharp signal processed via some system”. In the case of convolution reverb, the system is the reverberation of a physical space. Stick a microphone in the middle of the Sydney Opera House, record the sound of a balloon popping plus the ensuing room acoustics, and, hey presto, you have yourself an impulse response. An accurate impulse response must feature all frequencies within the audible spectrum (20 Hz – 20 kHz) in order to be effective for use as convolution reverb, which is why a balloon pop makes for a fairly commonly used source, especially given the ease with which such a scenario can be set up. Generally speaking, a sudden burst of noise of this kind contains spectral content of sufficient bandwidth to be practically useful.

 

There is however, as impulse response elitists never tire of pointing out, a problem with this method, since no two balloon pops are exactly alike, and the intensity of frequencies across the spectrum may vary wildly. A higher intensity at 500Hz than 2kHz will bias the response of the room in favour of 500Hz. Therefore, for the sake of accuracy, a frequency sweep played back through a flat response studio speaker is considered the definitive method of sampling an acoustic space. This way it is ensured that all frequencies are played at equal intensity. Of course there is then some deliberation about the quality of speaker, microphone and preamp used; however, it is claimed that a fairly modest system can at least produce a very reasonable approximation.

 

I should point out at this juncture, ladies and gentlemen, that I did, in order to look clevererer than I necessarily am, try to find a more detailed explanation of the process of multiplying the input signal by the impulse response at a sample level, but my efforts merely resulted in my being sick on my desk. So forgive me if I quietly refrain from that, and instead offer that I think it might have something to do with very, very tiny squirrels.

 

I think.

 
 

Isn’t it?

 
 

Well, okay – here, I drew a picture. I think it’s something like this:

3

Make of that what you will.
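
Squirrels aside, the textbook description of convolution is, as far as I can tell, less frightening than it looks: every sample of the input launches its own scaled copy of the impulse response, and all the overlapping copies are added together. Here is a minimal sketch in plain Python, with made-up numbers standing in for real audio. Real plugins use much faster FFT-based methods, and I make no claim that this is how REVerence goes about it.

```python
def convolve(signal, impulse_response):
    """Direct (time-domain) convolution of two lists of samples."""
    output = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, sample in enumerate(signal):
        # Each input sample launches a copy of the impulse response,
        # scaled by that sample and delayed to start at position n.
        for k, tap in enumerate(impulse_response):
            output[n + k] += sample * tap
    return output


# Hypothetical toy data: a two-hit "dry" signal and a three-tap "room"
# (direct sound followed by two progressively quieter echoes).
dry = [1.0, 0.0, 0.5]
room = [1.0, 0.5, 0.25]
wet = convolve(dry, room)
print(wet)  # [1.0, 0.5, 0.75, 0.25, 0.125]
```

The third output sample is where the tail of the first hit (0.25) overlaps the start of the second (0.5), which is exactly what a reverb tail does to the notes that follow it.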

 
 

Anyway, so far, so good – find a sonically interesting environment, record a frequency sweep, run it through some jiggery pokery to create a usable impulse response, load it into your favourite convolution reverb plugin – I will be using Cubase’s inbuilt REVerence – and marvel at your digital recreation of your original environment. Sounds like fun to me.

 

So, choosing two sonically interesting spaces at the University of Sussex, I put it to the test. The Meeting House is a large, circular chapel with a domed roof, apparently designed in the midst of a fairly potent acid trip, whilst the drama studio is a much smaller, enclosed room, yet of equivalent acoustic interest.

4

The Meeting House.

5

The Drama Studio.

 
 

Within these environments I set up a Genelec 8030A for playback of the frequency sweep, a pair of AKG C451Bs in ORTF configuration on the far side of the room and then, to the bemusement of any passing visitors, recorded the signal. Then, in the same positions, I recorded a short segment of acoustic guitar for reference when assessing the authenticity of my resultant convolution reverb. For completeness, I also recorded balloons popping, so that a true comparison could be conducted.

6

7

8

 
 

So then, once all the data had been collected I could return to the lab to analyse the results. Some subtle treatment of the recorded signals was required to eliminate ambient noise where possible, always taking care never to damage the fidelity of the actual recordings. Once the material had been analysed, treated and turned into usable impulse response files via Voxengo Deconvolver, I could load them into the REVerence convolution reverb unit, apply the plugin to a source signal (in this case a dry recording of the guitar riff I had played in each environment), and cross my fingers that it had worked.
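
I can't say what Voxengo Deconvolver does under the bonnet, but the principle of deconvolution can be sketched as follows: if the recording is the test signal convolved with the room, then knowing the test signal lets you work backwards to the room. The toy below uses time-domain long division, a technique swapped in purely for brevity (real tools generally work in the frequency domain). All signals are hypothetical, and it assumes a noise-free recording and a test signal whose first sample is non-zero.

```python
import math


def convolve(signal, impulse_response):
    """Direct convolution: sum a scaled, delayed copy of the response per sample."""
    output = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, sample in enumerate(signal):
        for k, tap in enumerate(impulse_response):
            output[n + k] += sample * tap
    return output


def deconvolve(recording, test_signal, response_length):
    """Recover an impulse response by time-domain long division.

    Assumes recording == convolve(test_signal, response) with no noise,
    and test_signal[0] != 0.
    """
    response = []
    for n in range(response_length):
        # Subtract what the already-recovered taps contribute to sample n.
        known = sum(
            test_signal[k] * response[n - k]
            for k in range(1, min(n, len(test_signal) - 1) + 1)
        )
        response.append((recording[n] - known) / test_signal[0])
    return response


# Hypothetical stand-ins for the real thing: a 16-sample cosine sweep
# (cosine so that the first sample is 1.0) and a five-tap "room".
sweep = [math.cos(0.05 * i * i) for i in range(16)]
room = [1.0, 0.0, 0.6, 0.0, 0.3]

recording = convolve(sweep, room)  # what the microphones would capture
recovered = deconvolve(recording, sweep, len(room))
print([round(tap, 6) for tap in recovered])
```

With clean toy data the recovered taps match the "room" to within rounding error. Real recordings contain noise, which is one reason the treatment step matters.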

 
 

Here are the results. Click to listen.

 

#1: Meeting House
— Guitar in Room / Frequency Sweep Reverb / Balloon Pop Reverb.

The results here are actually pretty good. Firstly it’s noteworthy that the frequency sweep does indeed produce better results than the balloon pop. The balloon pop has a deficit of high end information and a swelled, rather ugly middle. The frequency sweep on the other hand does a reasonably good job of replicating the guitar test recording. Both however produce a reverb tail that is fairly authentic. I am pleased with these results.

 

#2: Drama Studio
— Guitar in Room / Frequency Sweep Reverb / Balloon Pop Reverb.

A generally brighter reverb this time but the results are consistent with the previous environment. Again, the frequency sweep method has produced a far more convincing result.

 

So, all in all, that seems to have proven very successful. It is worth noting that, as well as actual physical environments, the frequency sweep method can also be used to sample hardware or software reverb FX units. By processing a frequency sweep with an interesting reverb unit and then generating an impulse response from the resulting file, a startlingly accurate clone of the original effect can be made. Here are two examples of such a practice, whereby the previous dry guitar recording has been processed first by a dedicated reverb unit, and then by an impulse response clone of that unit:

 

#3: Spring Reverb
— Original Reverb / Cloned Reverb

 

#4: FX Reverb
— Original Reverb / Cloned Reverb

 

So that’s it. Impulse response generated convolution reverb.

Lawks a lordy.

 

I’m going to eat some chicken.

Ta-ra.

 


Binaural Recording

Have you ever wondered how it is possible for the human brain to so accurately detect the location of a perceived sound? We only have two ears, yet somehow we are able to discern the differences between sounds originating from any direction within our 3-dimensional environment – in front, behind, above, below, left or right. How is this possible? And can we therefore simulate this effect in order to artificially reproduce the experience of perceived 3-dimensional sounds, as opposed to the normal left/right experience we are accustomed to in traditional stereophonic speaker set-ups, without simply adding extra speakers?

The answer is yes, we can. Directional perception of sound arises from our brain’s ability to decode the subtle differences in information received by our in-built stereo receivers – our left and right ears. Binaural recording is a recording technique that uses two microphones to mimic the human auditory system, utilising the exact same conditions that create the phenomenon of binaural localisation in humans. And so, with the acquisition of a pair of binaural microphones, a portable Tascam field recorder and a dummy head named John, film maker Bebe Bentley and I spent one evening carrying out some binaural recording tests at the University of Sussex. Here are the results (please note that headphones must be worn in order to perceive the effect):

#1: Binaural recording in a dead room.

#2: Binaural recording in a live room.

#3: Binaural recording of James with a guitar.


In the directional perception of sound there are two phenomena at work: binaural and monaural localisation.

Binaural Localisation

Binaural Localisation refers to the discrepancies in the characteristics of a sound wave arriving at the closest ear, and then the farthest. Your brain is sensitive to the tiny time difference between a sound hitting the nearest ear and the farthest ear – referred to as the Inter-aural Time Difference (ITD) – as well as the slight change in volume between the two ears – the Inter-aural Intensity Difference (IID). If sound originates to your left, your head acts as a barrier or filter and reduces the level of sound heard in the right ear.
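
To get a feel for just how small the ITD is, here is a rough back-of-an-envelope sketch. It uses the simplest possible model, the straight-line path difference between the two ears, and both constants are assumptions of mine rather than measurements (John was not consulted). It ignores the way sound bends around the head, so treat the figures as ballpark only.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees C
EAR_SPACING_M = 0.18        # assumed: a rough figure, not measured from John


def interaural_time_difference(azimuth_degrees):
    """Seconds by which sound reaches the near ear before the far ear.

    0 degrees is straight ahead, 90 degrees is directly to one side.
    Uses the straight-line path difference d * sin(angle), ignoring the
    way sound bends around the head.
    """
    extra_path_m = EAR_SPACING_M * math.sin(math.radians(azimuth_degrees))
    return extra_path_m / SPEED_OF_SOUND_M_S


for angle in (0, 30, 90):
    microseconds = interaural_time_difference(angle) * 1e6
    print(f"{angle:>2} degrees: {microseconds:.0f} microseconds")
```

Even for a sound directly to one side, this model gives a difference of only around half a millisecond, which gives some idea of the precision the brain is working with.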

Monaural Localisation

Monaural localisation mostly depends on the filtering effects of physical structures. In the human auditory system, these external filters include the head, shoulders, torso, and outer ear or “pinna”, and their combined effect is summarised by the head-related transfer function (HRTF). Sounds are frequency filtered specifically depending on the angle at which they strike the various external filters.

binaural
Binaural recording of the kind Bebe and I carried out works by the use of two omni-directional microphones fitted to a dummy head, thereby simulating as realistically as possible the actual physical location of the human ears, combined with the filtering incurred by the human head. The same effect would be achieved by placing the microphones in your own ears, which would make for an interesting audio experience were you to then simply walk around an urban environment or visit a concert. It would then be possible to record exactly what you heard, complete with directional perception of the ambient noise, in order to later recreate that exact sensation through a pair of headphones. This, however, is perhaps a test for another day. Here we simply fitted the microphones into John’s ears and proceeded to move objects around and make various noises such that the illusion of directional perception is created.

It is however important, for the effect to be fully realised, that headphones are worn. This is because, on replay, the left ear must receive only the signal recorded by the left microphone, and the right ear only the signal from the right microphone. Playback through speakers destroys this effect because each ear hears both speakers, so the left and right signals bleed into one another.

What strikes me as odd about the experience of listening to this recording is the realism it evokes. When hearing Bebe and me running around the room it is as if ghost figures are appearing in front of you. With your eyes closed you can almost “see” the people. This demonstrates just how unaware we are of the subtleties of our sensory information in building our picture of the world. The next time someone supposes some supernatural bullshit to describe how they “felt a presence in the room”, remind them how easily our senses can be fooled.

So there we are. Artificial directional perception by binaural recording. Now, if only I could find a practical application…


Eliminating High-Hat Spill

When recording a drum kit one of the perennial problems encountered is high-hat spill on the snare microphone. Some engineers claim to have made peace with this issue by utilising the signal as simply “part of the drum sound”. This doesn’t do it for me since, among other problems, it ruins my stereo image of the kit, placing the hats immovably in the centre. Others aim their microphones such that the null in the cardioid pattern (i.e. the rear of the mic) is directed at the hats. Others even suggest using a figure-8 mic such as a ribbon, which has deeper nulls in its off-axis response, placed so that the side of the capsule looks at the hats.

None of these solutions provide suitable buoyancy to float my little boat. For a start, dynamic mics – especially the SM57 – do not, in my opinion, sufficiently capture the snap and sizzle of a snare drum, and besides, positioning one so that its rear is pointing towards the hats without disturbing the drummer is a tactical nightmare. Ribbon mics are scarcely much better, since there is no one location where the rear of the microphone is not detecting an unworkable amount of the tom behind it. And I don’t even want to think about the consequences of the inevitable battering it is going to take from the drummer. In any case, microphone positioning of this nature when in such close proximity to other undesirable sound sources is purely a hypothetical exercise. In the real world the results achieved by nit-picking in this manner are more or less negligible. The harsh spill from a close set of loud high-hats is simply not going to be significantly reduced by inching a microphone on its axis one way or another.

When I record snare drums I generally like to use the very tiny Shure Beta 98 microphone. It sounds absolutely excellent, gives great top end crack, has very fast transient response, and is so physically small that it can be positioned anywhere around the drum without getting in the drummer’s way (it also has great mounting hardware so as to clamp rigidly onto the side of the drum, thus eliminating the requirement of yet another mic stand). Then when I mix the snare I like to take a good, transparent EQ and make it extremely bright. That’s how to achieve a good crack that pierces like a razor blade through the mix. However, in order for this to work the snare must be as isolated as possible from the rest of the kit, and the high-hat above all must be eliminated as much as possible from the signal, or at the very least its high frequencies significantly reduced.

So. We have a conundrum on our hands. If we can’t budge on mic choice and we can’t solve the problem through placement, the only alternative is baffling. With this, I set to work.

Now, I have read several times on forums and in textbooks such as Bobby Owsinski’s “The Recording Engineer’s Handbook” that a good method of baffling ambient sound from a drum mic is to cut a hole in a polystyrene cup, poke your microphone through the middle and then tape the contraption together. Dubious, I gave it a try, suspecting that polystyrene does not present a suitably absorbent or reflective material to deflect close proximity, high intensity sound. As it transpires, I was right. Not only this, but I couldn’t imagine actually putting this into practice in a recording session without feeling like the dickiest of amateur dicks: “We’re all miked up lads… now, get me a paper cup and some gaffer tape!”. However, somewhat inspired by this idea I thought that perhaps I could build a contraption out of a more rigid material, take some steps to furnish it with some proper isolation material and then affix it retractably to the microphone, thus making for a more professional, more effective baffle and thereby solving our problem.

The idea? Tennis balls! One tennis ball, in fact. Cut in half, a hole cut in the middle, the outside covered in tin foil and the inside stuffed with acoustic foam. As I sat in one sunny Saturday, craft materials sprawled everywhere and glitter all over my face, my train of thought pulled in for a long stay at Genius Junction. This, I knew, was the solution to all my high-hat woes. I was indeed a genius. The result looked like this:

Tennis Ball

I thought it looked pretty smart. But did it work? Well, let me tell you…

No. It was shit. Not only was it absolutely ineffective, it also turned the source, i.e. the snare, into a tonally retarded shadow of its former self. And this makes perfect sense too – if you place a microphone within the confines of a cavity, then the acoustical properties of that immediate boundary are going to wreak havoc on the direct source you are trying to capture. The resonant frequency of that cavity combined with the filtering artefacts incurred by the boundary (the boundary effect) are going to dick with your source sound in a totally undesirable way. To see for yourself, just cut a hole in the bottom of a paper cup and put it up to your ear while listening to some music. Sounds awful, doesn’t it? If more proof were needed, here are the results of my tests:

Snare Test 1: Shure Beta 98, close, no baffle
Snare Test 2: Shure Beta 98, close, tennis ball baffle

So I think we can safely say that forming any kind of cavity immediately around a microphone is definitely not a good idea. This means that we have to find some other non-intrusive way of baffling the high-hats. Since the tennis ball idea not only sounded bad but also did very little to reduce the harsh frequencies of the hats, it seemed to me that we needed to think bigger to think better. I know from experience that an extremely good source of acoustic insulation is Rockwool, due to its high absorption coefficient, especially in the high frequencies – exactly where the harshness of the hats resides. So if we could somehow fashion a non-intrusive baffle out of four inches of Rockwool, then maybe we would be on to something. I immediately got to work on some leftover sound insulation with a Stanley knife. After many hours chopping, changing and inhaling an ever increasing quantity of microfibres, I discovered a solution that created no cavity around the microphone and significantly reduced the harsh top end of the hats in the snare mic. That solution was to raise the hats such that a four inch thick slab of high-hat shaped Rockwool could be installed beneath them, with the snare mic tucked underneath. It wasn’t pretty but it worked a treat:

Rockwool

For those of you with anxieties about raising high-hats, I should point out that this approach really is the first port of call when attempting to reduce high-hat spill. The further away you can move a source from the microphone, the less intrusive it will be. With the hats this carries the added bonus that it moves the drummer’s point of contact to the less clangy side of the hats, as opposed to the harsher top.
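
To put a rough number on the distance point, here is a small sketch using the free-field inverse square law. It is an idealisation (a drum kit in a room is neither a point source nor a free field), and the distances are hypothetical, so take the result as a rule of thumb rather than a prediction.

```python
import math


def level_change_db(old_distance, new_distance):
    """Change in level (dB) when a source moves from old to new distance.

    Free-field inverse square law; negative means quieter. Distances can be
    in any unit as long as both use the same one.
    """
    return 20.0 * math.log10(old_distance / new_distance)


# Hypothetical example: hats raised from 15 cm to 30 cm above the snare mic.
print(f"{level_change_db(15, 30):.1f} dB")
```

Under these assumptions, doubling the distance knocks about 6 dB off the spill before any baffling is even involved.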

Finally we’re getting somewhere. For good measure, and simply because it seemed like it was something I should do, I added a chunk of acoustic foam underneath the Rockwool, just to see if I could knock off that spill a little more:

Rockwool + foam

The results were excellent. The high-hat spill was reduced to a much more manageable level:

Snare Test 3: Shure Beta 98, close, Rockwool + foam baffle

The only remaining problems now were a) how to make this monstrosity more aesthetically pleasing, and b) how to not disrupt the drummer by its presence. Both of these concerns were addressed by cutting the Rockwool down to exactly the size of the high-hat (generally 14″) and taking one extraordinarily tedious afternoon to assemble a small pair of trousers in which to house it all:

Trousers

The Rockwool was inserted into the black cotton trousers, with the foam glued to the underside. By clipping this to the stand immediately beneath the hats, the microphone can be tucked discreetly underneath, which also protects the mic from an accidental battering from the drummer.

And there it is! This is how to eliminate high-hat spill without ruining your snare sound. And it just goes to show – don’t just believe what the textbooks tell you. Try it yourself, and if it doesn’t work, get creative.

hats