Quality Measurement in Adult Cardiac Surgery: Part 2-Statistical Considerations in Composite Measure Scoring and Provider Rating

Sean M. O'Brien, David M. Shahian, Elizabeth R. DeLong, Sharon Lise T. Normand, Fred H. Edwards, Victor A. Ferraris, Constance K. Haan, Jeffrey B. Rich, Cynthia M. Shewan, Rachel S. Dokholyan, Richard P. Anderson, Eric D. Peterson

Research output: Contribution to journal › Article › peer-review

202 Scopus citations


Payers, patients, regulators, and providers are increasingly interested in measuring and comparing cardiac surgery quality. The Society of Thoracic Surgeons (STS) Quality Measurement Task Force (QMTF) was established to develop comprehensive, summary performance measures encompassing multiple domains of quality. This report describes statistical considerations relevant to combining multiple measures into an overall composite score and then using such scores to rate providers. The QMTF evaluated various options for combining 11 National Quality Forum (NQF)-endorsed process and outcome measures, both within and across the four domains of care chosen by the Task Force (Perioperative Medical Care, Operative Care, Risk-Adjusted Operative Mortality, and Postoperative Risk-Adjusted Major Morbidity). These methods included simple or weighted averaging, a composite opportunity model similar to that used by the Centers for Medicare & Medicaid Services (CMS), "all-or-none" scoring, scaled combinations, and latent variable models. Each method was illustrated using actual 2004 STS data from 133,149 coronary artery bypass procedures. Provider performance was estimated using Bayesian random-effects approaches to account for small sample size and to incorporate risk adjustment for outcomes. Latent variable modeling failed to provide accurate estimates of provider performance when tested with actual STS data. Most other methods of combining individual measures within a given domain produced similar and consistent estimates of performance (Spearman rank correlations 0.95 to 0.98), and an all-or-none approach was selected. Combining scores across domains was accomplished by rescaling and then adding the domain-specific estimates.
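To make the difference between two of the within-domain scoring options concrete, the following is a minimal sketch contrasting all-or-none scoring with an opportunity-style score. It uses raw proportions only and does not reproduce the Bayesian random-effects estimation described in the abstract. The function names and the four-item patient records are hypothetical, chosen for illustration.

```python
from typing import Sequence


def all_or_none_score(records: Sequence[Sequence[bool]]) -> float:
    """Fraction of patients who received every measured item."""
    if not records:
        raise ValueError("need at least one patient record")
    return sum(all(r) for r in records) / len(records)


def opportunity_score(records: Sequence[Sequence[bool]]) -> float:
    """Fraction of all item-level opportunities that were met."""
    total = sum(len(r) for r in records)
    if total == 0:
        raise ValueError("need at least one measured item")
    met = sum(sum(r) for r in records)
    return met / total


# Hypothetical data: four patients, each eligible for four process items
# (for example, four recommended medications). Not STS data.
patients = [
    (True, True, True, True),
    (True, True, False, True),
    (True, True, True, True),
    (False, True, True, True),
]

aon = all_or_none_score(patients)   # 2 of 4 patients received every item
opp = opportunity_score(patients)   # 14 of 16 item opportunities were met
print(f"all-or-none: {aon:.3f}, opportunity: {opp:.3f}")
```

In this toy example the all-or-none score (0.500) is much lower than the opportunity score (0.875), because a single missed item counts the whole patient as a failure.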
When this methodology is applied to actual STS data, a one percentage point improvement in mortality has the same impact on the overall composite score as does an 8% improvement in the morbidity rate, an 11% improvement in the frequency of internal mammary artery usage, or a 28% change in the frequency of using all four NQF-recommended medications. The QMTF considered various approaches to determining performance tiers based on composite scores. As a demonstration of one such system, the QMTF conducted a pilot study with 2004 STS data, using a 99% Bayesian certainty criterion to assign performance tiers. This stringent criterion was used to maximize the statistical certainty of tier assignments. Applying this methodology, approximately 77% of providers fell into a middle-performance tier, 10% were determined to be in a high-performing tier, and another 13% in a low-performing tier. In summary, the STS QMTF has developed and tested a composite measure of cardiac surgery quality that encompasses multiple domains of care, uses Bayesian random-effects analyses, uses all-or-none scoring where appropriate, and avoids subjective weighting of individual measures. One possible methodology for assigning performance tiers derived from these scores was demonstrated in a pilot study. This overall methodology was applied to actual STS data and appeared to satisfy multiple criteria for validity. These quality measures for cardiac surgery should prove useful to STS participants, payers, and governmental agencies.
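The tier assignment described above can be illustrated with a short sketch. It assumes that posterior draws of a provider's composite score are already available and that tiers are decided by comparing those draws with a benchmark such as the overall average; the abstract does not state the comparison target, so that choice, the function name, and the draws below are assumptions made for illustration.

```python
from typing import Sequence


def assign_tier(draws: Sequence[float], benchmark: float,
                certainty: float = 0.99) -> str:
    """Classify a provider from posterior draws of its composite score.

    Assumption: a provider is "high" when the posterior probability of
    exceeding the benchmark is at least `certainty`, "low" when the
    probability of falling below it is at least `certainty`, and
    "middle" otherwise.
    """
    if not draws:
        raise ValueError("need at least one posterior draw")
    n = len(draws)
    p_above = sum(d > benchmark for d in draws) / n
    p_below = sum(d < benchmark for d in draws) / n
    if p_above >= certainty:
        return "high"
    if p_below >= certainty:
        return "low"
    return "middle"


# Hypothetical posterior draws for three providers (not STS data).
benchmark = 0.93
clearly_high = [0.96] * 995 + [0.90] * 5     # 99.5% of draws above benchmark
uncertain = [0.95] * 900 + [0.90] * 100      # only 90% of draws above
clearly_low = [0.85] * 1000                  # all draws below benchmark

tiers = [assign_tier(d, benchmark)
         for d in (clearly_high, uncertain, clearly_low)]
print(tiers)
```

A stringent threshold such as 0.99 leaves most providers in the middle tier unless the evidence is strong, which is consistent with the roughly 77% middle-tier share reported in the pilot study.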

Original language: English
Pages (from-to): S13-S26
Journal: Annals of Thoracic Surgery
Issue number: 4 SUPPL.
State: Published - Apr 2007

ASJC Scopus subject areas

  • Surgery
  • Pulmonary and Respiratory Medicine
  • Cardiology and Cardiovascular Medicine

