Just finished reading a new paper from Smart, Blake, Staines and Doody (Clinical Journal of Pain Oct 2011:27(8)) intended to validate a classification system for pain mechanisms. The three categories explored in this study were pain of primarily nociceptive origin, peripheral neuropathic pain, and centrally sensitized pain. In a nutshell, the purpose of the study was to evaluate the discriminative ability of a checklist of symptoms and signs intended to identify the different categories. Over 400 subjects formed the final sample, each assessed by 1 of 14 different physiotherapists using a standardized evaluation based on the checklist, itself developed from an earlier Delphi study. So far so good, but here's where things get a little wonky. The 'gold standard' in this case was the opinion of the assessing therapist about which of the categories each patient belonged to. The authors quite rightly recognize that there is no better gold standard against which to make this comparison for this particular study, although one might wonder why they didn't simultaneously capture results on one of the many neuropathic/nociceptive discriminatory questionnaires currently available, if for nothing more than concurrent validation. Anyhoo, I won't fault them for that, but I do have to take issue with the results and the way they're presented. At first glance, the results of the classification system appear astounding, with sensitivities and specificities in the 90% or higher range, positive likelihood ratios ranging from 10 to 40, and diagnostic odds ratios of 100 or higher (incidentally, these are all different ways of saying largely the same thing, so I can't help but feel that reporting all of them, plus predictive values, was a bit superfluous). These values are astounding for clinical tests, considerably better than most of what you're going to find in the published literature.
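To make the "different ways of saying largely the same thing" point concrete: both likelihood ratios and the diagnostic odds ratio can be computed directly from sensitivity and specificity, so once Sn and Sp are reported, the others add no new information. A quick sketch in Python, using hypothetical counts in the same ballpark as the paper's results (these are NOT the study's actual data):

```python
# Diagnostic accuracy indicators from a hypothetical 2x2 table
# (illustrative counts only -- not taken from Smart et al.)
tp, fn = 90, 10   # checklist positive / negative among 'true' positives
tn, fp = 95, 5    # checklist negative / positive among 'true' negatives

sens = tp / (tp + fn)        # sensitivity = 0.90
spec = tn / (tn + fp)        # specificity = 0.95
lr_pos = sens / (1 - spec)   # positive likelihood ratio, approx. 18
lr_neg = (1 - sens) / spec   # negative likelihood ratio, approx. 0.105
dor = lr_pos / lr_neg        # diagnostic odds ratio, approx. 171

# The DOR is fully determined by the same 2x2 table -- it restates Sn/Sp:
assert abs(dor - (tp * tn) / (fp * fn)) < 1e-6
```

With Sn and Sp both at or above 90%, the LR+ and DOR are mathematically guaranteed to look enormous, which is worth keeping in mind when the text reports all of them side by side.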
But this is where I'm a bit bothered - remember that the gold standard against which the results of the therapists' assessments were compared was...the opinions of the same therapists. Hence, the astoundingly high values for diagnostic/discriminative validity are not surprising. The obvious question we have to ask when critically appraising a paper like this is 'to what degree do we trust that the gold standard is correct?', and in this case, that question becomes 'to what degree do we trust that the opinions of these 14 therapists are correct?' Looking at the items retained for each of the categories, all in all I would probably agree with the majority of them; the most conspicuously questionable, to my mind, is the inclusion of 'presence of psychosocial symptoms' exclusively in the centrally sensitized pain category. I get the idea here, but I think many (including me) would argue that so-called 'psychosocial' symptoms probably influence the experience of each of the categories, and should maybe not be taken as inherent proof of centrally sensitized pain. I also take issue with the way the results are reported in the text - under nociceptive pain, Sn and Sp are reported along with the diagnostic odds ratio; under peripheral neuropathic pain, positive and negative predictive values are presented; and under central sensitization, positive and negative likelihood ratios are presented. Granted, all indicators for all clusters are included in Table 7, but I'm confused by the choice to report different indicators in the text. I wonder how much of this was actually an editor's or reviewer's request rather than a conscious decision of the authors, but either way it will introduce some extra noise, especially for those clinicians who struggle with even the concept of Sn and Sp.
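One reason the mixed reporting matters: unlike Sn, Sp, LRs, and the DOR, predictive values depend on the prevalence of the condition in the sample, so they don't transfer directly to a clinic with a different case mix. A minimal sketch of this, using Bayes' rule with hypothetical Sn/Sp values (again, not the paper's data):

```python
def ppv(sens, spec, prevalence):
    """Positive predictive value from Sn, Sp and prevalence (Bayes' rule)."""
    true_pos = sens * prevalence          # P(disease and test positive)
    false_pos = (1 - spec) * (1 - prevalence)  # P(no disease and test positive)
    return true_pos / (true_pos + false_pos)

# Identical test characteristics, different prevalence, very different PPV:
print(round(ppv(0.90, 0.95, 0.50), 3))  # 0.947
print(round(ppv(0.90, 0.95, 0.10), 3))  # 0.667
```

So a predictive value quoted for one cluster and a likelihood ratio quoted for another are not directly comparable numbers, which is extra noise for exactly the readers the authors presumably want to reach.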
Look, I'm all for working towards pain mechanisms and classifications - I've even done similar things with a group of advanced practice PTs in one of our programs here at Western (the subject of a previous blog), but I haven't published it as strong evidence. Lord knows I'm also well aware that a project can look very different at its conclusion than how it was conceptualized at the outset, and I don't want to come across as overly critical here. I applaud the effort, and think this kind of work does move the field forward, no doubt, but I urge caution in your interpretation of the remarkable results. Someone's going to read this, see the mammoth indicators of diagnostic validity, and believe it's definitive proof of the value of this checklist, and it will work its way into practice or policy - and unfortunately I think that's premature. I do wish the authors had done a better job of reporting the potential limitations in interpretation, but I also realize that there are other influences (i.e., word counts) that limit the detail authors can discuss.
In summary, I'm not meaning to be hypercritical (though it might sound like it). There are logistic challenges to carrying out clinical research, I get that (trust me). I'm also happy to see it published - the point of this post is to highlight some of the areas that you, as reader and potential knowledge consumer, should scrutinize a bit harder before you decide what to do with the information. I'd love to hear your comments.