
MOTU??! Urrgh… *Real* Sound Engineers Only Use Prism Converters, Dontcha Know? Hiffle, piffle, plip and wank.

Earlier this year I visited Steve Albini’s Chicago studio, Electrical Audio, with the goal of not only recording some drums with the man himself, but also scrutinising his mic techniques in order to learn more about how such an incredible drum sound is achieved. The results were as fantastic as you would expect, and I publicly documented my experiences via a video presentation, thanks in no small part to the assistance of an impressively bearded cameraman by the name of Kevin Clarke. You can see the resultant video here.

 

In order to record the session as flexibly as possible, and preserve all naked, ungrouped signals for my scrutiny later on, I knew that I had to split the signals coming from each microphone, with one batch going to Steve for recording to tape, and another identical but independent batch leading to my digital recording system. To make this work, Steve carried out an impressive feat of ad-hoc patching in order that we could split the signals after the desk preamp. This way I would still maintain whatever Neotek goodness was being imparted to each signal. The only variable that differed between the two simultaneous recordings was the recording medium itself: Steve recorded to RTM900 2” tape, 16 track, 15 IPS. I recorded digitally at 48 kHz, 24 bit via a pair of chained MOTU 828 mk2 interfaces.

 

Now, I didn’t particularly give much thought to what interface I would use for my end of the recording, given that I consider all interfaces to be much of a muchness. They all do the same thing, and they all sound pretty much as transparent as each other. Quibbling about spec sheets aside, the fact of the matter is that the analogue-to-digital converters inside all interfaces across the price spectrum these days are perfectly capable of capturing and reproducing music transparently, and any talk about the correlation between “sound quality” and price tag is, in my opinion, grounded in a whole host of psychological biases which influence our perception of “quality” to an impressive degree, even more so when you’ve actually paid over the odds for what is, at best, an imperceptibly subtle improvement. “Yeah, this shit sounds fucking sweet. Now get an awesome photo of it for the website.”

 

The association of price with quality is a well-documented phenomenon, and companies like Apple and Neumann are masters of its manipulation. Of course you’d pay £65 for a MacBook charger! That’s the price you pay for quality (or a flimsy piece of shit that breaks after a year). And of course you would pay £250 for a U87 cradle! Why, you’d be an unprofessional fool not to (despite the fact that any old £10 cradle would do an identical job, they just lack the correct microphone attachment). Companies have been selling us overpriced shit for years, and the justification rests largely on the credibility that a particular brand name has within its particular market.

 

Ah, but hang on… I’m not taking into account that, when it comes to audio devices, engineers the world over can really, really hear the difference! No, really! They really can! And that’s why they’ve got Prism converters in their studios! Because their ears are, like, super mega awesome. Better than your ears, and definitely better than mine! They sleep every night in an anechoic chamber in order to recalibrate their hearing for the day ahead, and the really top guys have bionic aural implants that allow them to hear all the way up to 40 kHz! They’re like dogs! They’re literally just like fucking dogs.

 

The fact that so many people claim to perceive an improvement in sound quality with respect to the price of their ADCs is, to say the least, unconvincing. There’s an interesting Caltech study that actually explores this phenomenon in more depth, which you can find on their website here. I’m going to lift the gist of the study from this website, which does a fine job of summarising it:

 

Researchers from CalTech and Stanford told subjects that they were drinking five different varieties of wine and informed them of the prices for each as they drank. But in reality, they only tasted three types, because two were offered twice: a $5 wine described as costing $5 and $45, and a $90 bottle presented as $90 and $10. (There was also a $35 wine with the accurate price given.) Not only did the subjects rate identical wines as tasting better when they were told they were pricier, but brain scans showed greater activity in a part of the brain known to be related to the experience of pleasure. In other words, the experiment may be evidence that we genuinely experience greater pleasure from an identical object when we think it costs more.

 

Admittedly these findings are not conclusive, but they nonetheless point to a phenomenon that our deepest intuitions surely corroborate: when we think something costs more, or when we have a significant vested interest, we are biased towards defending its greatness.

 

So anyway, back to the Albini session. Based on what I considered to be an obvious truth about the perceived sound quality of over-priced ADCs, I was happy to facilitate my recording with the closest, most convenient solution to hand: a couple of MOTU 828s that I could easily chuck in a suitcase and get to Chicago without much hassle. However, in doing that, and knowing that these devices would appear in the video, I absolutely knew that some twat would crawl out of the woodwork at some point and feel the need to start the tedious “MOTU converters aren’t good enough” debate, and sure enough, one plucky commenter on my YouTube channel decided to do just that.

 

And so the criticism raged about how “MOTU converters sound cloaked and muddy” (whatever that means), and no engineer worth his salt would use anything less than Prism Orpheus converters at £ several-hundred per channel to capture anything like the kind of magic that Steve was laying down on his Studer A820, and he should know, because he’s worked in, like, loads of studios n’ shit, and, like, all his engineer mates agree… and this one time, right, he transferred an album using MOTU converters, and then had to re-record the whole thing because MOTU converters are, like, just so shitty sounding and all cloaked and muddy and stuff.

 

Anecdotal accounts of this kind are unimpressive for reasons too numerous to mention. The only way to claim real knowledge of this sort, I would argue, is to have performed some pretty rigorous blind trials, ensuring that the variable under scrutiny is the only thing in the signal chain that changes. And if you feel confident enough to claim that something sounds “cloaked and muddy”, you should have scrutinised it closely enough to describe, in less ambiguous terms, the perceived character of the signal degradation and the conditions under which it manifests.

 

So, as much as I recoil from the implication that I’m writing an entire blog post just to prove one person on YouTube wrong, I do think it presents an interesting opportunity to run a few tests and explore the topic further, not least so that my own perceptions may be less clouded by mere rhetoric of this kind.

 

Right then, let’s get to it. The first obvious comparison to run is between Steve’s tape recording and my simultaneous digital recording. This would essentially be a comparison between RTM900 2” tape and 48 kHz, 24 bit MOTU digital. Now, there are a few issues with this comparison which are worth laying bare at the outset, the most notable of which is that the multi-track transfer from tape to computer was itself made using the MOTU devices. This will obviously upset the angry YouTubers of this world, as the claim then becomes that the very act of running the tape recordings through MOTU converters has itself imparted intractable MOTU ugliness upon the signal, and therefore the comparison is of limited utility. And yes, I would agree that ultimate scientific rigour is sadly absent from such a comparison. However, that being said, if we are to assume that the 2” tape itself is doing something uniquely magical to the signal (warm, punchy, creamy, soupy… whatever bullshit adjective you want to throw in there), then we should expect to hear some difference between that recording and a purely digital recording. If the MOTU devices have done something nasty to the signal, then they should have inflicted that nastiness identically on both the original digital recording and the tape transfers, and as such we should still be able to identify which one maintains a semblance of the analogue magic. At the very least we should be able to tell them apart. Of course, tape is identifiable by its hiss, so in order to make this comparison fair, I’ve artificially added some hiss to the digital recording.

 

Below this paragraph there are two files: one tape recording (digitally transferred), and one simultaneous digital recording. Both are multitrack mix-downs of a solo drum performance with matched processing – the analogue tape utilising Steve’s outboard (as described in the aforementioned video), and the digital using comparable in-the-box plugins. You are invited to download them and compare them. See if you can spot the difference. Which one has the analogue “magic”? If you email me at james [at] jamesmakesmusic.com I will happily tell you which one is which, although be warned that I don’t consider this a bullet-proof perception test, given that you still have a 50% chance of guessing it correctly. This is simply a little comparison to get us started:

 

So let’s now move on to the real test, and that is a test of the claim that MOTU converters sound “cloaked and muddy”, and that I am clearly a fool to be using them. I’m going to take my cue here from the brilliant Ethan Winer, acoustician and life-long debunker of audiophile claptrap, who has himself conducted an identical test to the one I am about to perform, but with Focusrite and Soundblaster converters in his sights, rather than MOTU. You can find his tests on his website here. Essentially, the logic works like this:

 

The claim is that ADC X sounds crappy in some way (“cloaked and muddy” in this case). Therefore by performing a loopback recording using the ADC in question, over a number of generations this crappiness should become more and more pronounced. A loopback recording simply means playing an audio file out of a stereo output of the device, and physically patching that output back into two inputs and recording it in whatever recording software you use. The resultant recording is then used as the source for the next loopback recording. And that one for the next. And that one for the next. And so on. After several generations the claim should be very easy to verify, as the signal degradation, or “cloaked muddiness”, should be cumulatively imparted to the signal, such that the discrepancy between, say, generation 10 and the original file is absolutely obvious. Certainly, if the ADCs under scrutiny are as crappy as has been claimed, then even after a single generation we should hear an obvious difference between the result and the original file. However, if it transpires that it is a struggle to hear any difference, and actually in a blind trial it is not even clear which is the original file and which are the subsequent generations, then it’s pretty safe to say that people like our YouTube friend may well just be subject to the aforementioned psychological biases, and are therefore simply talking shit.
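
If you fancy getting a feel for this logic without any hardware at all, here is a minimal sketch in Python. To be clear, it is a toy model and not a measurement of any MOTU device: the function names, the 1 kHz test tone and the noise figure are all made up for illustration. Each “generation” simply adds a little random noise and rounds to 16 bit, and the residual is whatever is left when you subtract the original from the copy.

```python
import math
import random

def one_generation(samples, noise_rms=1e-5, bits=16, rng=random):
    """Toy converter pass: add a little noise, then round to `bits`-bit steps."""
    scale = 2 ** (bits - 1)
    out = []
    for s in samples:
        s = s + rng.gauss(0.0, noise_rms)      # hypothetical analogue noise
        s = round(s * scale) / scale           # quantise to the bit depth
        out.append(max(-1.0, min(1.0, s)))     # clip to full scale
    return out

def residual_db(original, copy):
    """Level of (copy - original) relative to the original, in dB."""
    n = len(original)
    diff = math.sqrt(sum((c - o) ** 2 for o, c in zip(original, copy)) / n)
    ref = math.sqrt(sum(o ** 2 for o in original) / n)
    return float("-inf") if diff == 0 else 20 * math.log10(diff / ref)

rng = random.Random(0)                         # fixed seed so runs are repeatable
rate = 44100
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / rate) for n in range(rate)]

results = {}
signal = tone
for gen in range(1, 11):
    signal = one_generation(signal, rng=rng)   # each pass feeds the next
    if gen in (1, 5, 10):
        results[gen] = residual_db(tone, signal)
        print(f"generation {gen}: residual {results[gen]:.1f} dB")
```

In this toy model the residual creeps up a little with each generation but stays a long way below the signal. Whether a real device behaves as politely is exactly what the listening files are for.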

Below this paragraph you will find several batches of files. I used four different pieces of music as my test subjects, ranging from grunge, through trip hop, to classical and jazz. For each genre I have posted four files: the original, generation 1, generation 5 and generation 10. I have aligned all recordings and gain matched as closely as possible. All recordings were carried out at 44.1 kHz, 16 bit, in order that any degradation manifests as obviously as possible. The goal for anyone who wants to partake in the challenge is to correctly identify each file. Once again, you can obtain the correct answers by emailing me at james [at] jamesmakesmusic.com. If you are confident about MOTU converters being unsuitable for use because of their defects in “sound quality”, then you should have no trouble correctly identifying each file. And to RasTatum – the man who made the claim about MOTU converters sounding “cloaked and muddy” in the first place (but who has since rather curiously deleted all his comments) – I eagerly await your contribution to this experiment.
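
For the curious, aligning and gain matching can be sketched roughly as follows. This is not necessarily how the files here were prepared. It is an illustrative sketch in plain Python with hypothetical function names, and it assumes the recording simply arrives a few samples late and at a different level from the reference.

```python
import math
import random

def rms(x):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def best_lag(reference, recording, max_lag):
    """Delay of `recording` behind `reference`, in samples, found by
    brute-force cross-correlation over lags 0..max_lag."""
    span = len(reference) - max_lag
    def score(lag):
        return sum(reference[i] * recording[i + lag] for i in range(span))
    return max(range(max_lag + 1), key=score)

def align_and_match(reference, recording, max_lag=64):
    """Trim the delay off `recording` and scale it to the reference's RMS."""
    lag = best_lag(reference, recording, max_lag)
    aligned = recording[lag:lag + len(reference)]
    gain = rms(reference[:len(aligned)]) / rms(aligned)
    return lag, gain, [v * gain for v in aligned]

# Made-up demonstration: noise delayed by 37 samples and turned down to 80%.
rng = random.Random(1)
reference = [rng.gauss(0.0, 0.1) for _ in range(2000)]
recording = [0.0] * 37 + [0.8 * v for v in reference]

lag, gain, matched = align_and_match(reference, recording)
print(lag, round(gain, 3))
```

Once two files are aligned and level matched like this, any difference you hear (or fail to hear) is down to the signal path rather than to one file being a touch louder or later than the other.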

 

Good luck!

 

Nirvana – Smells Like Teen Spirit (1991)

 

Sneaker Pimps – 6 Underground (1996)

 

Royal Liverpool Philharmonic Orchestra – Beethoven’s Symphony #2 (1998)

 

Thelonious Monk & Gerry Mulligan – ‘Round Midnight (1957)
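
As a rough guide to what luck alone would achieve in the challenge above, here is a small worked calculation in Python. It assumes a listener who knows each batch contains exactly one of each file and who simply assigns the labels at random. The function name is hypothetical.

```python
from math import factorial

def chance_all_correct(files_per_batch: int, batches: int = 1) -> float:
    """Probability of labelling every file correctly by random guessing,
    assuming each batch is a random ordering of distinct labels."""
    return (1 / factorial(files_per_batch)) ** batches

print(chance_all_correct(2))      # tape vs digital: 1 in 2
print(chance_all_correct(4))      # one genre, four files: 1 in 24
print(chance_all_correct(4, 4))   # all four genres: 1 in 331,776
```

So a correct answer on the tape comparison proves very little, whereas correctly labelling all sixteen loopback files would be hard to put down to chance.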

 

————–

Finally, the last brief experiment I wished to perform was a test of the self-noise of the MOTU converters, because even if the sound quality is not as affected as claimed, that still leaves room for the claim that the devices themselves are noisy. So I performed another loopback test using a file of total silence as the source. I won’t bother actually posting the resultant audio files here, but I will tell you that after 20 generations I was seeing a noise floor of -72.2 dB, as you can see from the image below. I hope you’d agree that this demonstration shows concerns of this type to be negligible.
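
To put that figure in context, here is a back-of-envelope calculation in Python. It rests on an assumption that is not tested here: that each generation adds the same amount of uncorrelated noise, in which case the noise power adds up and the floor rises by 10·log10(N) dB over N generations. The function name is hypothetical.

```python
import math

def per_pass_noise_db(total_db: float, generations: int) -> float:
    """Estimate the noise added by one pass, assuming equal, uncorrelated
    noise per generation so that powers (not voltages) sum."""
    return total_db - 10 * math.log10(generations)

print(round(per_pass_noise_db(-72.2, 20), 1))   # -85.2
```

Under that assumption a single trip through the converters would contribute noise at roughly -85 dB, about 13 dB quieter than the 20-generation figure.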

So there you have it. Turns out you don’t actually need to spend thousands of moneys on ADCs just because someone tells you to.

 

Hooray!


Binaural Recording

Have you ever wondered how it is possible for the human brain to so accurately detect the location of a perceived sound? We only have two ears, yet somehow we are able to discern the differences between sounds originating from any direction within our 3-dimensional environment – in front, behind, above, below, left or right. How is this possible? And can we therefore simulate this effect in order to artificially reproduce the experience of perceived 3-dimensional sounds, as opposed to the normal left/right experience we are accustomed to in traditional stereophonic speaker set-ups, without simply adding extra speakers?

The answer is yes, we can. Directional perception of sound arises from our brain’s ability to decode the subtle differences in information received by our in-built stereo receivers – our left and right ears. Binaural recording is a recording technique that uses two microphones to mimic the human auditory system, utilising the exact same conditions that create the phenomenon of binaural localisation in humans. And so, with the acquisition of a pair of binaural microphones, a portable Tascam field recorder and a dummy head named John, film maker Bebe Bentley and I spent one evening carrying out some binaural recording tests at the University of Sussex. Here are the results (please note that headphones must be worn in order to perceive the effect):

#1: Binaural recording in a dead room.

#2: Binaural recording in a live room.

#3: Binaural recording of James with a guitar.


In the directional perception of sound there are two phenomena at work: Binaural and monaural localisation:

Binaural Localisation

Binaural localisation refers to the discrepancies in the characteristics of a sound wave arriving first at the closest ear, and then at the farthest. Your brain is sensitive to the tiny time difference between a sound hitting the nearest ear and the farthest ear – referred to as the Inter-aural Time Difference (ITD) – as well as the slight change in volume between the two ears – the Inter-aural Intensity Difference (IID). If sound originates to your left, your head acts as a barrier or filter and reduces the level of sound heard in the right ear.
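
To give a sense of just how small these time differences are, here is a rough sketch in Python using Woodworth’s spherical-head approximation, ITD ≈ (r/c)(θ + sin θ), a common textbook model rather than anything measured in these recordings. The head radius of 8.75 cm and speed of sound of 343 m/s are assumed typical values, and the function name is hypothetical.

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate inter-aural time difference in seconds for a distant
    source at the given azimuth (0 = straight ahead, 90 = directly to
    one side), treating the head as a rigid sphere."""
    theta = math.radians(azimuth_deg)
    return head_radius_m / speed_of_sound * (theta + math.sin(theta))

for angle in (0, 30, 60, 90):
    print(f"{angle:>2} degrees: {woodworth_itd(angle) * 1e6:.0f} microseconds")
```

Even for a sound arriving from directly to one side, this model gives a difference of only about two thirds of a millisecond, which is the scale of cue your brain is working with.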

Monaural Localisation

Monaural localisation mostly depends on the filtering effects of physical structures. In the human auditory system, these external filters include the head, shoulders, torso, and outer ear or “pinna”, and their combined effect can be summarised as the head-related transfer function. Sounds are filtered in a frequency-dependent way according to the angle at which they strike these various external filters.

Binaural recording of the kind Bebe and I carried out works by the use of two omni-directional microphones fitted to a dummy head, thereby simulating as realistically as possible the actual physical location of the human ears, combined with the filtering incurred by the human head. The same effect would be achieved by placing the microphones in your own ears, which would make for an interesting audio experience were you to then simply walk around an urban environment or visit a concert. In these instances it would be possible to accurately record exactly what you heard in these situations, complete with directional perception of the ambient noise, in order to later recreate that exact sensation through a pair of headphones. This, however, is perhaps a test for another day. Here we simply affixed the microphones into John’s ears and proceeded to move objects around and make various noises such that the illusion of directional perception is created.

It is however important, for the effect to be fully realised, that headphones are worn. This is because, on replay, the left ear must receive only the signal recorded by the left microphone, and the right ear only the signal from the right microphone. Playback through speakers destroys this effect because each ear hears both speakers, so the left and right signals bleed into one another before they reach you.

What strikes me as odd about the experience of listening to this recording is the realism it evokes. When hearing Bebe and me running around the room it is as if ghost figures are appearing in front of you. With your eyes closed you can almost “see” the people. This demonstrates just how unaware we are of the subtleties of the sensory information we use to build our picture of the world. The next time someone invokes some supernatural bullshit to describe how they “felt a presence in the room”, remind them how easily our senses can be fooled.

So there we are. Artificial directional perception by binaural recording. Now, if only I could find a practical application…