The High Resolution Audibility Test

February 11, 2015, 04:00
As the owner of a standard yellow PonoPlayer, I was catching up on comments over at PonoMusic, and one in particular caught my attention: a note from Allen Farmelo at Farmelo Recording linking to a Tape Op blog entry. Farmelo’s dispatch, “The Problem With A-B'ing And Why Neil Young Is Right About Sound Quality,” makes several valid points, one of which is that A–B tests are not necessarily that meaningful…
 
“If you want to do a real test of the differences (as an example, between lossy compressed audio and HRA), give people a music collection that’s all MP3s for a month, then give them that same collection as 24bit WAVs for a month, and then ask which one’s which, and I bet you will start to get some (meaningful or statistically significant) answers.” I couldn’t agree more, so I began a comment on his thread on the PonoMusic blog. As I started to write, I realized there were so many overlooked aspects of judging HRA, or High Resolution Audio, against non-HRA content that I had to dig into this publicly.
 
I agree that, in and of themselves, A–B or A–B–X audibility tests are not valuable unless they are taken in a larger context. Psychoacoustics is not a purely objective phenomenon, so no matter how much one expounds or stamps one’s feet, A–B tests will not provide meaningful results without considering the main component of the test: our plastic and highly adaptable brain.
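That said, part of the “larger context” is simply scoring the trials honestly. A forced-choice A–B–X run can at least be checked for statistical significance: under pure guessing, each trial is a coin flip. The sketch below is a hypothetical scoring helper (my own illustration, not anything Farmelo or PonoMusic publishes) that computes the exact one-sided binomial probability of a listener’s score arising by chance:

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """One-sided exact binomial p-value: the probability of scoring
    at least `correct` out of `trials` by pure guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# 12 correct out of 16 trials is unlikely to be guessing (p ≈ 0.038),
# while 9 of 16 is entirely consistent with chance.
print(abx_p_value(12, 16))
print(abx_p_value(9, 16))
```

Note that a non-significant score only means the test failed to show a difference under those conditions; it does not, by itself, prove the two sources are indistinguishable.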
 
Let’s start with a hypothesis: I state that carefully recorded higher-than-CD-resolution music files, when played back on a highly resolving system, will sound “better” than a lower-rez, CD-quality equivalent played on that same system. Within the scope of this rant, let’s not worry about why, only that it does. Let us also assume that there is nothing wrong or invalid about observing a phenomenon, then later trying to explain it. Without that fundamental attitude, observation first and explanation second, science and technology would not have progressed to their present state.
 
This brings up a fundamental difference between engineers who are truly scientists and crusty engineers who follow their own internal dogma to the exclusion of evidence to the contrary… A scientist is, by definition, a skeptic in the sense that, if her experimental data don’t fit her theory, then her thesis is flawed and requires more thought. Many electrical engineers I know who design audio gear will dogmatically opine that high-resolution files are theoretically worthless, since any content above the CD’s Nyquist frequency is a waste of resources. “We cannot hear above 20 kHz” is the justification. If you sit a true skeptic down and play them two versions, one low rez and one high rez, of the same well-recorded content on a highly resolving playback rig, either they will hear a difference or they will not. If not, then their hearing is somehow impaired or they are not trained to discern subtle audible differences.
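For reference, the arithmetic behind that Nyquist argument is simple: a system sampling at rate f_s can represent content only up to f_s/2. A quick illustrative sketch of the ceilings for the sample rates discussed in this piece:

```python
# Nyquist frequency: the highest frequency a given sample rate can represent.
def nyquist_hz(sample_rate_hz: int) -> float:
    return sample_rate_hz / 2

# CD (44.1 kHz) versus common "high resolution" rates.
for rate in (44_100, 96_000, 176_400, 192_000):
    print(f"{rate:>7} Hz sampling -> content up to {nyquist_hz(rate) / 1000:g} kHz")
```

So CD audio tops out at 22.05 kHz, while a 176.4 kHz recording can carry content up to 88.2 kHz; the dogmatic position is that everything above roughly 20 kHz is inaudible and therefore wasted, which is exactly the claim the listening test is meant to probe.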
 
This is to be expected, and there is nothing out of the ordinary about it. I admit that, at times, I cannot discern differences when my colleagues can. At that point, I ask for more information, if possible, so I can better focus on what to listen for, or I allow that I need improvement. If we lack the ability to admit that we are not perfection personified, then we need some serious regrooving.
 
O.K., back to our hypothetical listener and a high rez audibility test: one additional outcome is that the listener may hear a difference but choose not to acknowledge it. Since they cannot explain the outcome, in their view the test is invalid. A recording/production audio engineer, as opposed to a design engineer, would not usually fall into this trap. Though recording folks are usually slightly hearing impaired (being audio-obsessed as a youth often brings some hearing damage along as baggage), their ability to tease out slight differences in two versions of the same material is usually highly refined. We are not afraid to “trust our ears,” just as a winemaker or chef must trust a refined palate, or a colorist their spectral discrimination. This is a fundamental difference between a craftsperson, someone who relies on their body, senses and abilities to perform a job or service, and a hidebound individual who does not question received wisdom or their own experiences. Craftspeople constantly refine their craft, making them better practitioners and, at some point in their lives, Master Craftspeople.
 
Speaking of refinement, Farmelo makes another point about the need to train your hearing. I hope everyone agrees that one needs to train one’s “ears.” This is pivotal and often overlooked. He mentions coffee; I use wine as my quotidian sensory metric. Whether you’re drinking chocolate, Tokaji or bourbon, there’s a lot of “information” flooding your sensorium, and it takes a discerning taster to disentangle it all.
 
Also, although many high rez naysayers will dismiss the fact, perceiving the improvements afforded by the highest resolutions requires a highly resolving playback rig. Most of the A/B test “results” I’ve read were obtained using MI gear or funky playback chains and questionable methodologies. Let’s admit that most pro gear, never mind commodity “Musical Instrument” or MI products, is not built with high resolution as its main design objective. Low cost, ruggedness and the ability to play loud usually trump all other design factors. Low cost “pro” transducers, such as speakers and converters, are particular offenders.
 
Also, not all HRA files are created equal. Some purveyors of so-called “high resolution” content are selling either SRC’d (sample-rate-converted) lower rez material or accepting transfers from limited and EQ’d production copies, not the actual master recordings… caveat emptor. PonoMusic is an example of an HRA vendor that is supposed to be sweating the details when it comes to provenance, quality of the “master” recording, and the transfer process if one is needed. Unfortunately, I’ve seen no evidence to support that stance.
 
Finally, the test material itself: I review gear all the time, and many “HRA” recordings may be the best they can be, but they’re still compromised by the very way they were created. Compare two songs. Toto’s “Rosanna” comes to mind, as I was listening to it the other day as part of a test… a fun construct but very “small” in many respects: soundstage width and depth, dynamics, lack of definition, no air to speak of. A typical close-mic’d, multitracked and overdubbed pop recording. Compare that to an all-live, 176.4 kHz Reference Recordings title from Dr. Keith Johnson; I admit to using his Crown Imperial, an old audiophile chestnut, as test material a fair bit. It may not be perfect, but it does possess a wide and deep soundstage, broad dynamics, and acoustic instruments mic’d to capture all their supersonic energy plus the subtle spatial cues from the surrounding venue. The gear he uses is top-notch mastering quality, not fidelity-challenged gear. In short: the opposite of ’80s pop, resolution-wise.
 
All these factors contribute to discerning differences: exposure time, training level and abilities of the listener, quality of the playback equipment and environment, and quality of the content itself. More ephemeral factors, such as the listener’s mindset, physical state (an incipient illness, say), self-confidence, and familiarity with the test material, all come together to create a complex sensory package on which an individual is passing judgment.
 
I, for one, am not afraid to admit that sometimes I cannot hear differences during critical listening tests. My skills are modest and I know it, though I continue to hone my abilities. I also admit that I do not “know it all”; I am always learning. I hope you, gentle reader, will be open-minded about “unscientific” proposals, even when you cannot conceive of a causal mechanism. Sometimes, you just have to trust your own experiences, and only later discover the underlying mechanism.

Read also: Perceived Meaningful Resolution