Measuring Outcomes, Outcome Measures, and Treatment Effects

Measuring outcomes, treatment efficacy, and treatment effectiveness are separate yet interacting constructs. And, it’s more than semantics. Clinically, measuring outcomes masquerades as simple, while interpreting those outcomes appropriately can be quite complex. Outcome bias, or results-oriented analysis, presents a significant challenge to the practicing clinician. Outcome measures measure outcomes, not effects of intervention:

Perhaps it is unfortunate that the physiotherapy profession has responded to the perception that physiotherapists must justify what they do by routinely measuring clinical outcomes. The implication is that measures of outcome can provide justification for intervention. Arguably that is not the case. Outcome measures measure outcomes. They do not measure the effects of intervention. Outcomes of interventions and effects of interventions are very different things. Clinical outcomes are influenced by many factors other than intervention, including the natural course of the condition, statistical regression, placebo effects, and so on. (Tuttle (2005) makes this point clearly in his article, in this issue, on the predictive value of clinical outcome measures.)

The implication is that a good outcome does not necessarily indicate that intervention was effective; the good outcome may have occurred even without intervention. And a poor outcome does not necessarily indicate that intervention was ineffective; the outcome may have been worse still without intervention. This is why proponents of evidence-based physiotherapy, including ourselves (Herbert et al 2005), argue it is necessary to look to randomised trials to determine, with any degree of certainty, the effects of intervention. It is illogical, on the one hand, to look to randomized controlled trials for evidence of effects of interventions while, on the other hand, seeking justification for the effectiveness of clinical practice with uncontrolled measurement of clinical outcomes.
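The point in the passage above can be illustrated with a minimal simulation sketch. It is not taken from the quoted authors. Everything in it is an assumption chosen for illustration: the 0 to 100 symptom scale (higher is worse), the idea that people seek care when a noisy score is at or above 60, the true treatment effect of 5 points, and the hypothetical `simulate` function itself.

```python
import random
import statistics


def simulate(n_patients=20000, true_effect=5.0, seed=0):
    """Return mean improvement for randomly treated vs. untreated patients.

    All parameters are illustrative assumptions, not clinical data.
    """
    rng = random.Random(seed)
    treated_change, control_change = [], []
    while len(treated_change) + len(control_change) < n_patients:
        stable_score = rng.gauss(50, 10)            # long-run symptom level
        baseline = stable_score + rng.gauss(0, 10)  # noisy score on intake day
        if baseline < 60:
            continue  # assumed: people only seek care during a flare
        follow_up = stable_score + rng.gauss(0, 10)  # noisy score at discharge
        treated = rng.random() < 0.5                 # random allocation
        if treated:
            follow_up -= true_effect                 # treatment lowers symptoms
        change = baseline - follow_up                # positive = improvement
        (treated_change if treated else control_change).append(change)
    return statistics.mean(treated_change), statistics.mean(control_change)


treated_mean, control_mean = simulate()
estimated_effect = treated_mean - control_mean

print(f"Mean improvement, treated:   {treated_mean:.1f}")
print(f"Mean improvement, untreated: {control_mean:.1f}")
print(f"Between-group difference:    {estimated_effect:.1f}")
```

In this sketch the untreated group improves by several points through regression to the mean alone, so the treated group's raw improvement (the clinical outcome) overstates what treatment did. Only the difference between the randomly allocated groups lands near the assumed 5-point effect.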

Principles of Outcomes Measurement

1. Objective and Measurable
2. Decrease Bias and Improve Accuracy
3. Reliable and Reproducible
4. Valid: Are we measuring what we think?
5. Sensitive to Change: Does the measure detect changes in construct?
6. Patient Report vs. Patient Performance

In addition, measurement of outcomes requires understanding the various constructs and categories that are measurable. This includes, but is not limited to:

Patient Report

Patient Performance

  • Functional Test (5 x Sit to Stand, 6 Minute Walk Test)
  • Functional Task/Activity (squat, stairs)
  • Exercise or Activity Testing

International Classification of Functioning, Disability and Health (ICF) Framework

  • Impairments of Body Structure and/or Function
  • Activity Limitations
  • Participation Restrictions

Body Systems Level

  • Cognitive
  • Neuromuscular
  • Musculoskeletal
  • Cardiopulmonary
  • Integumentary
  • Psycho-social

Health Services

  • Duration of Care
  • Frequency of Care
  • Number of Visits
  • Future Care Needs
  • Cost
  • Cost Savings
  • Morbidity

These are only a few select constructs and measurements. Another, arguably more complex, area of assessment is the narrative and experiential outcome as described by the patient: the patient’s illness narrative, interpretations, and journey through potential suffering.

The disconnect between progression of physical function as measured by patient performance and as measured by patient report has been characterized in total hip arthroplasty. “The influence of pain on self-reported physical functioning serves as an explanation for the poor relationship between self-reported and performance-based physical functioning. When using a self-report measure such as the WOMAC, one should realize that it does not seem to assess the separate constructs—physical functioning and pain—that are claimed to be measured.” Both patient report and performance are important. Each can guide further intervention or provide insight into current deficits.

For example, a patient with improvement in performance, but no change in report, may be struggling to recognize or understand improvements in certain domains (symptoms, performance, function). Or, perhaps education has not addressed a patient’s main concern or perception. Mistaking outcome measures and measured clinical outcomes for the actual effect of treatment may result in improper (or even pseudo-random) intervention selection and/or patient care approaches. I postulate that this mistake is the prime reason physical therapy as a profession is quick to integrate new, “innovative” treatment “tools” that lack true prior plausibility, and to continue utilizing interventions in the face of evidence suggesting lack of treatment effect. Mistaking observed and measured clinical outcomes for treatment effectiveness likely results from the post hoc ergo propter hoc logical fallacy.

When we mistake outcomes for effectiveness, we risk assuming causation and subsequently treatment mechanism. Care must be taken to avoid leaps in logic regarding effectiveness and mechanism of action. A review of the evolution of understanding of manual therapy mechanisms illustrates how continued observation of positive clinical outcomes likely reinforced inaccurate interpretations based upon hypothetical anatomy and biomechanics devoid of true physiology and actual tissue mechanics. We now know much more.

To be fair, though, constructing care processes, intervention approaches, and treatment paradigms in the absence of a (potential) theoretical mechanism of action is quite challenging. Further, human brains seek explanation for observed clinical events, even within research. So, when treatment X is routinely associated with observed patient report or outcome Y, brains will automatically begin assigning reason Z as the “why.”

Measure everything!

No. Quite the contrary. Clinicians should aim to properly select measures that are relevant to the patient: main complaint, goals, condition, and/or diagnosis (if one exists). In addition, the measures chosen should be sufficiently responsive to change, encompass multiple constructs, and cross domains. While important, relying solely on patient report is an incomplete, flawed approach to measuring outcomes and assessing treatment in the clinical setting.

Two differing scenarios may occur when utilizing outcomes observed or measured in clinic as the primary reasoning for decision making regarding interventions/treatment:

A. Effective interventions may be abandoned when outcome(s) are not improving on the assumption of lack of effect.
B. Ineffective interventions or approaches may be continued when outcomes are improving on the assumption of effect.

In scenario A, the patient may in fact worsen without the treatment. Perhaps progress is predicted to be slower without effective treatment, or natural history suggests a worse trajectory. An effective intervention or process may be ceased prematurely. In scenario B, perhaps improvement is measured. Placebo, non-specific effects, incentives, and/or bias in measuring and patient reporting contribute to the observation of a positive outcome in the clinical environment. “It works!” Or, appears to. But, a multitude of other factors affect the presence of a measured outcome (positive or negative).
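The two scenarios can be written out as simple arithmetic. This is only a sketch with made-up numbers: the `Episode` class, its field names, and the point values are hypothetical, and the "change without treatment" column is a counterfactual that can never actually be observed for an individual patient in clinic.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    label: str
    observed_change: float           # measured in clinic (positive = improvement)
    change_without_treatment: float  # counterfactual, unobservable in practice

    @property
    def treatment_effect(self) -> float:
        # Effect = what happened minus what would have happened anyway.
        return self.observed_change - self.change_without_treatment


# Scenario A: no measured improvement, yet the patient would have worsened.
scenario_a = Episode("A: effective, outcome flat", 0.0, -10.0)
# Scenario B: clear measured improvement that would have occurred regardless.
scenario_b = Episode("B: ineffective, outcome improving", 15.0, 15.0)

for episode in (scenario_a, scenario_b):
    print(
        f"{episode.label}: observed {episode.observed_change:+.0f}, "
        f"effect {episode.treatment_effect:+.0f}"
    )
```

The clinic only ever sees the first number. In this toy example the intervention with the flat outcome has the larger effect, and the one with the improving outcome has none.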

The multi-factorial nature of treatment mechanisms complicates the ability to clinically observe effectiveness. The myriad reasons why individuals may report and/or exhibit improvements in symptoms, function, and other constructs make “outcomes” a dynamic and complicated subject. Perhaps the condition has a favorable natural history or regression to the mean is present. And, perhaps the patient would have progressed more quickly with a more effective treatment approach. It’s complicated. Don’t take all the credit, and don’t take all the blame. So, what should we do?

Measure nothing, clinical outcomes are meaningless!

No. Quite the contrary. In addition to selecting appropriate outcome measurements, clinicians must integrate and understand appropriate current clinical, mechanistic, and basic science research. As science-based practitioners, physical therapists are charged to select effective, plausible, safe, and efficient approaches to care that are focused on the individual patient. This is not an argument for the utilization of only specific outcome measurements and interventions with strong randomized controlled trial level evidence. Plausibility matters. The individual person matters. It’s complicated. And, it’s easy to fool ourselves. Richard Feynman suggests:

The first principle is that you must not fool yourself — and you are the easiest person to fool.

So, measure clinical outcomes. They are important. But, ensure measurements cross constructs and domains. Don’t solely rely on patient reports. And, don’t claim effectiveness based on observation. We must acknowledge the complexity. No one is saying clinical outcomes measurement is not important, or is not illustrative of important concepts. Clinical data and outcomes are vital to self-reflection, integration of evidence, health services, and overall care processes. But, the plural of anecdote is not data, and outcome measures cannot illustrate effectiveness. That’s not an argument to not measure outcomes. It’s an argument to improve measurement, and more importantly, understanding.


1. Evidence Based Physiotherapy: A Crisis In Movement
2. Causation and Evidence Based Practice: An Ontological Review
3. Causal Knowledge in Evidence Based Practice
4. Mechanisms: What are they evidence for in evidence based medicine?
5. Placebo use in pain management: The role of medical context, treatment efficacy, and deception in determining placebo acceptability
6. Placebo Response to Manual Therapy: Something out of nothing?
7. The Mechanisms of Manual Therapy
8. The influence of expectation on spinal manipulation induced hypoalgesia
9. Evidence for a direct relationship between cognitive and physical change during an education intervention in people with chronic low back pain
10. The contributing factors of change in therapeutic process

5 Replies to “Measuring Outcomes, Outcome Measures, and Treatment Effects”

  1. Shortened version of a response due to some glitch.

    Most tools function at an n=1 level. They should be used by clinicians to inform progress. If interventions had no effect on outcomes, then all outcomes would be the same. We know from studies comparing the outcomes of like patients that better outcomes are achieved when evidence based interventions are included. To say interventions have little to no role in the outcome means that literature was wrong. Clinicians don’t have the capability to clone a patient and have the clone not receive treatment. It is reasonable to believe that during a clinical visit something in that visit contributes to the outcome. When a patient receives physical therapy in the acute stage of an injury, there is a greater likelihood of natural course playing a role in the outcome. When a patient receives physical therapy 6-8 weeks after the onset of an injury, the natural course has probably already been exhausted. Care should be used when mentioning natural course as the reason for the outcome.

    Some outcome management systems do allow for determining effectiveness. If the system has enough data in it and the data is risk adjusted, the system can predict the outcome based on all the data in the system for similar patients. The prediction can be used to determine effectiveness – the patient either reaches the predicted outcome or does not. If the patient reaches the outcome, the episode of care was effective.

    In contract negotiations it would be completely ludicrous to state, “treatment interventions have no effect on outcomes.” Health science research and efficacy studies include outcomes. The reason is to figure out the bang for the buck.

    If you take a look at the STaRT back screening tool… the score changes. When using that tool, the outcome I want is a very low score. The reason that score changes is due to the interactions between the patient and clinician. If that score doesn’t budge from a high risk, the clinician might as well toss in the towel with physical therapy services and recommend a referral to a specialist who can align the biopsychosocial components to lead to a positive outcome.

    Reality… the majority do not care what happens during treatment sessions. What is desired is positive change in the shortest amount of time. It is assumed that people who receive physical therapy services need the services to improve their function. It is assumed that clinicians deliver improved function. Researchers are analyzing claims – at the moment, they know number of visits, CPT codes and cost. What is missing is the functional outcome portion. Do we really want to go down a path stating that outcome tools and outcome systems do not determine effectiveness? What do you propose instead?

    1. Hi Selena,

      Thanks for your comment and contribution. I’d agree, outcome measures function at the individual (n=1) level, and absolutely should be utilized to track outcomes and inform progress clinically. We likely can’t manage what we don’t measure. And, obviously outcomes are improved when the best available interventions and care processes are implemented. We should be striving to implement the most evidence based interventions and approaches with the best researched efficacy and subsequent effectiveness. I’m in no way suggesting that interventions lack effect. As you mention, that’s quite a ludicrous conclusion.

      A big enough data set of clinical outcomes likely can provide robust predictive utility. Prediction of a positive outcome obviously depends on identifying and tracking the potential factors that exert effects (separate from effectiveness of a treatment) on the attainment of a specific outcome. One of these factors is the treatment intervention or approach the therapist utilizes. But, again, this is a prediction of a measured clinical outcome, which, regardless of mechanism or effect, is separate from the effectiveness of the treatment intervention(s). I think much of our disagreement, well, discussion really, is founded upon differing definitions of terms.

      From what I can gather, you assert the following:
      1. Treatments have effects
      2. Clinical outcomes, especially if predicted via larger data sets, can illustrate effectiveness of an intervention
      3. Measured change is, at least partially if not mostly, attributable to treatment/clinical encounter
      4. Because people care about attainment of specified outcomes in shortest time, cheapest way, we must identify effective treatments via those clinical outcomes

      The presence, interaction, and interventions that occur during clinical encounters affect observed outcomes. I’m not arguing against this, but rather, suggesting that the REASONS for observed outcomes are likely quite complex. Measured outcomes on a clinical level, from a scientific standpoint, cannot illustrate effectiveness or efficacy even if they have an EFFECT. Of course interventions have EFFECTS (positive, negative, neutral). But, the effect of an intervention (or visit or therapist’s treatment) is completely separate from the treatment’s effectiveness or efficacy.

      When I read your comment it appears that effect, positive or predicted outcome, and effectiveness are utilized nearly interchangeably. The definitions of those terms are quite important within the current discussion. Effectiveness has specific meaning within healthcare and research. Although the word itself is taken to mean “the ability to produce a specific result or to exert a specific measurable influence,” the definition within medical science, healthcare, and research is more concrete. Kenny Venere’s post nicely defines and discusses the difference between effectiveness and efficacy.

      Reaching a predicted clinical outcome, while positive, is not effectiveness. The quote, and manuscript it’s from, at the beginning of my post address this issue outright. The clinical outcome is a completely separate construct from the effectiveness (or efficacy) of the treatment. A clinical outcome in isolation, or even the routine attainment of a predicted outcome, cannot illustrate effectiveness of an intervention itself. Usually, we assume that better clinical outcomes will be obtained by utilizing the most efficacious and effective interventions. But, the reverse is not necessarily true. Better clinical outcomes do not equate to effectiveness. And, of course, utilizing effective interventions does not guarantee a specific clinical outcome given the multitude of factors that may exert an effect. I think it’s much more complicated than it appears.

      The ability to predict an outcome, and the subsequent attainment of a similar outcome by other patients, is NOT “effectiveness.” Effectiveness and efficacy have operational definitions within the context of medical treatment and clinical research. So, a treatment can have very real effects and be predictive of positive outcome attainment without effectiveness or efficacy. We just cannot state that predicted clinical outcomes prove effectiveness.

      I remain appropriately concerned when we as a profession assert, as you have, “The reason that score changes is due to the interactions between the patient and clinician.” I don’t think we can ever assert such a premise with such certainty. There are likely many separate, and interacting, reasons. The effects and their mechanisms are variable and complicated.

      We’re dealing with humans and complex perceptions (i.e. pain) and behaviors (function/movement). Unfortunately, some people will worsen with the utilization of effective treatments. Others will improve with ineffective treatments. Sometimes secondary to factors we can identify and have researched. Sometimes for absolutely unknown reasons. Please do not equate this with an argument that treatments do not have effects, nothing is specific, everything is non-specific/placebo, and clinical outcomes are meaningless.

      What I propose is that as clinicians we become far more disciplined in our use of terms. I’m not being nitpicky here when I say there is a significant and definable difference between efficacy, effectiveness, effect, outcomes, and prediction. Clinicians and researchers alike need to understand and recognize this difference. So, yes, I absolutely want to go down the road of claiming outcomes do not determine effectiveness outright.

      Researchers need to continue to tackle the complex questions of what, how, when, who, and for how long with a variety of study designs. We need basic science research. We need clinical trials investigating effectiveness and efficacy. We need retrospective studies of large data sets, including outcomes databases. We need to perform regression analyses of big data. All of these inquiries can inform the others. Large outcomes databases can help inform researchers on potential factors associated with desired clinical outcomes: timing of care, clinician qualities, treatment intervention types, etc. Subsequently, researchers can study these factors in more controlled settings to elucidate where potential effects reside. Quality improvement projects utilizing care processes based upon the best understanding of the current literature, and containing the most effective interventions, can be implemented. Measured outcomes can be analyzed. A feedback loop of inquiry and improvement is possible.

      As I state in my post, I’m in full support of measuring outcomes at the clinical level. This data can be vital. In fact, we need to do more of it and do it better. But, again, I think we, as a profession, routinely conflate measured clinical outcomes with other concepts such as effectiveness. We also assume that a single measure can illustrate change in multiple domains, when in fact it may not (see example from post). So, I think, we need to temper our claims on what outcome measures and clinically generated data can tell us. It’s complicated, at times convoluted, and even quite confusing. It does us, our patients, and the healthcare system no good to ignore the intricacies and complexity of this topic. Oversimplification is not helpful, but neither is a surrender to the seemingly endless amount of conceptual and practical challenges.
